




Bill Lipe

@bill_lipe
Founder, Lipe Protocol. Building smarter AI that balances empathy with technical precision. Honest, secure, and precise. https://t.co/4IaPqpH7nn






@mazemoore AOC is just an actor. It’s her puppet masters that are the problem. She is spouting insane lies that are disprovable by a Google search, but a lot of people will believe her.









Update on the Boyle Heights Lineage Commercial Building Fire, Los Angeles #LAFD

Incident Objectives: The LAFD has shifted its strategic goals to managing the extended aftermath of the commercial blaze.

Biohazard Mitigation: Crews are pivoting from hazardous materials to containing the biohazard threat posed by the 85 million pounds of spoiling frozen food inside the building.

Defensive Suppression: Firefighters are using water-dropping helicopters and external lines to target deep hot spots between the pallets and the collapsed roof, because structural compromise prevents crews from safely entering.

Hazmat & Environmental Monitoring: Teams have pumped out the building's toxic anhydrous ammonia lines to eliminate the chemical risk to the community, and they continue to track airborne particulate matter with the South Coast AQMD.

Public Information Coordination: Officials are seeking a joint city and county state of emergency declaration to secure state resources and handle regional smoke advisories.

Estimated Duration: The LAFD expects this firefighting operation to be an extended event that could last for days or weeks. Even though forward progress of the open flames has been halted, the 500,000-square-foot facility acts like a giant insulated cooler. The corrugated steel walls are filled with dense foam that is burning very slowly and continually off-gassing deep within the structure. Because firefighters cannot enter the interior due to roof collapse risks, they must let the deeply buried pockets burn themselves out while keeping the fire contained from the perimeter.

If you live nearby or smell smoke, would you like information on active smoke relief shelters in the area, or the latest air quality recommendations from local health officials?



BOOM! OPEN SOURCE GLM BEATS THE FABLED FABLE!

GLM-5.2 from Z.ai: The Open-Weight Model That Topped Claude Fable and Powers The Zero-Human Company

Z.ai (Zhipu AI) released GLM-5.2, and our tests show it delivering a major leap in long-horizon agentic coding, with a practical 1M-token context window, flexible reasoning effort levels (High/Max), and MIT-licensed open weights. Early benchmarks and community arenas show it excelling where it matters most for developers. We compared it to our first Anthropic Fable model tests, and GLM did better!

It leads open-weight models, has claimed the top spot on Design Arena (Elo 1360), and, as I said, is surpassing the now-unavailable Claude Fable 5. It also posts strong results on coding suites: 62.1% on SWE-bench Pro (beating GPT-5.5’s 58.6) and 81.0 on Terminal-Bench 2.1.106

Official blog: z.ai/blog/glm-5.2

The Zero-Human Company Goes All-In

At The Zero-Human Company, where AI agents handle nearly all operations, we’ve rolled out GLM-5.2 across all employee (agent) workflows for code generation, refactoring, debugging, and autonomous project execution. Its long-context reliability and agentic strengths make it ideal for sustained, multi-hour tasks without constant human oversight, which is perfect for a zero-human setup. We’re particularly excited about its open weights and local deployment, which ensure full data privacy and resilience: no external service dependencies or potential bans.

Running GLM-5.2 Locally

Thanks to its MIT license and strong inference support, you can run GLM-5.2 (744B total params, ~40B active MoE) on your own hardware today. Quantized versions (FP8, etc.) make it feasible on high-end setups.

Quick start options (from the official GitHub):
•vLLM: recipes.vllm.ai/zai-org/GLM-5.2
•SGLang: cookbook.sglang.io…/GLM-5.2
•Hugging Face Transformers or KTransformers for more options.
•Full deployment guide: github.com/zai-org/GLM-5

Example setup with vLLM (Docker recommended for ease):
# Clone repo and follow recipes for quantized inference
# Supports reasoning_effort="max" (default) or "high"

This local-first approach aligns perfectly with our zero-human philosophy: agents run securely on-prem, with full customizability.

GLM-5.2 isn’t just competitive; it’s a timely open alternative in a world of access restrictions. We’re thrilled to test and build with it company-wide. Expect more updates as our AI workforce puts it through real production. The myth of Mythos and the fable of Fable are entertaining, but we are getting to work.
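
For a rough sense of what "high-end setups" means here, a back-of-the-envelope sketch of weight storage at different precisions may help. It uses only the 744B total / ~40B active figures quoted above. The function name and the choice of precisions are illustrative assumptions, and the estimate ignores KV cache, activations, and runtime overhead, so real requirements will be higher.

```python
def weight_memory_gb(params_billions: float, bits_per_weight: float) -> float:
    """Approximate weight storage in GB (1 GB = 1e9 bytes) at a given precision."""
    bytes_total = params_billions * 1e9 * bits_per_weight / 8
    return bytes_total / 1e9


TOTAL_PARAMS_B = 744   # total parameters, as quoted in the post
ACTIVE_PARAMS_B = 40   # approximate active parameters per token (MoE), as quoted

# Precisions chosen for illustration; the post only names FP8 explicitly.
for label, bits in [("BF16", 16), ("FP8", 8), ("4-bit", 4)]:
    total_gb = weight_memory_gb(TOTAL_PARAMS_B, bits)
    active_gb = weight_memory_gb(ACTIVE_PARAMS_B, bits)
    print(f"{label:>5}: ~{total_gb:.0f} GB for all weights, "
          f"~{active_gb:.0f} GB of weights touched per token")
```

Note that in a mixture-of-experts model all expert weights still have to be stored somewhere (GPU memory, CPU RAM, or disk offload); the ~40B active figure mainly reduces compute per token, not the total storage footprint.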








Run Gemma 4 26b MTP on 8 GB VRAM GPUs at 25+ tokens/second. Flags included!

local llm space is moving at terminal velocity. only 3 days ago google released gemma 4 26b a4b qat quants. more efficient than before, they ran on 8 GB VRAM at 20 tok/sec. and now, just a few hours ago, mainline llama.cpp merged a massive update and we just shattered our own record. decode throughput went up 25-40% on the same 8 GB VRAM setup!

Before MTP: 20 tps -> After MTP: 28 tps!

llama.cpp just officially merged PR #23398 ("add Gemma4 MTP"), bringing native Multi-Token Prediction (MTP) support to Gemma 4 models. By running speculative drafting on the same 8 GB VRAM RTX 4060 setup, my decode throughput on a 64k context instantly leaped to a blistering 25–27 tokens/sec. that's a 25-35% increase over the 20 tok/sec baseline on the same hardware.

Here is the architectural catch you need to know: unlike the Qwen 3.5 and 3.6 series, which bake the MTP heads directly into the base GGUF, the Gemma 4 MTP head is not built in. You must download a separate, specialized MTP drafter GGUF (the assistant model) to act as the speculator. (I've dropped the download link in the replies.)

copy and try the exact flags:

-m gemma-4-26B-A4B-it-qat-UD-Q4_K_XL.gguf --spec-type draft-mtp --spec-draft-n-max 6 --spec-draft-p-min 0.7 --spec-draft-model gemma-4-26b-A4B-it-assistant-Q4_0.gguf -c 64000 -v

n-max 4 with p-min 0.7 is also worth checking out. benchmark on your setup and workflow.

if you have a single 8 GB VRAM nvidia rtx 4060, 3060, 3070, 2080, 2070, grab the MTP drafter GGUF link in the comments and try it yourself. Check it out even if you have a smaller or a larger gpu, such as a single rtx 3090, 4090, 3060, 2060. MTP works for all gemma 4 sizes, such as gemma 4 12b, gemma 4 31b etc., but remember to grab the correct MTP draft assistant model for each.

what are you benchmarking today?
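
To compare your own runs against the figures quoted in the post, the percentage gain is simple arithmetic. This is a minimal sketch using the 20 tok/sec baseline and the 25, 27, and 28 tok/sec results mentioned above. The function name is a hypothetical helper, not part of llama.cpp.

```python
def speedup_percent(before_tps: float, after_tps: float) -> float:
    """Relative decode-throughput gain, as a percentage of the baseline."""
    if before_tps <= 0:
        raise ValueError("baseline throughput must be positive")
    return (after_tps - before_tps) / before_tps * 100.0


BASELINE_TPS = 20.0  # tokens/sec before MTP, as quoted in the post

# After-MTP figures quoted in the post; swap in your own measurements.
for after_tps in (25.0, 27.0, 28.0):
    gain = speedup_percent(BASELINE_TPS, after_tps)
    print(f"{BASELINE_TPS:.0f} -> {after_tps:.0f} tok/sec: +{gain:.0f}%")
```

With these inputs the gains work out to 25%, 35%, and 40%. Running the same calculation for each n-max and p-min combination you try makes it easier to see which setting actually helps on your hardware.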



Late-arriving ballots continue their enthusiastic embrace of higher sales taxes in Los Angeles County. "Yes" on Measure ER has overtaken "No" in the June 8 update of the vote. Post-Election Day, it has moved in one direction only.


