
Bridgebench
@bridgebench
The best vibe coding benchmark in the world. Built by @bridgemindai
United States · Joined March 2026
4 Following · 113 Followers

GLM 5.1 is the slowest frontier model we've ever benchmarked on BridgeBench.
44.3 tokens per second.
Half the speed of GPT 5.4.
Nearly 6x slower than Grok 4.20.
Z.ai traded all of their speed for intelligence.
The coding benchmarks improved.
The throughput collapsed.
In 2026, agentic coding is about parallelism.
You're running 5, 10, 15 agents at once.
A model this slow bottlenecks every workflow it touches.
Intelligence without speed is a luxury most vibe coders can't afford.
bridgebench.ai
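Back-of-the-envelope, those numbers translate into wall-clock time like this (a minimal sketch in Python; the 2,000-token response size and 10-step agent loop are illustrative assumptions, not BridgeBench test parameters):

```python
# Rough latency math for the throughput numbers above.
# ASSUMPTIONS: 2,000-token responses and a 10-step agent loop are
# illustrative stand-ins, not BridgeBench test parameters.
RESPONSE_TOKENS = 2_000
LOOP_STEPS = 10

throughput_tps = {
    "GLM 5.1": 44.3,        # measured on BridgeBench
    "GPT 5.4": 44.3 * 2,    # "half the speed of GPT 5.4"
    "Grok 4.20": 44.3 * 6,  # "nearly 6x slower than Grok 4.20"
}

for model, tps in throughput_tps.items():
    per_response = RESPONSE_TOKENS / tps      # seconds per response
    per_run = per_response * LOOP_STEPS / 60  # minutes per agent run
    print(f"{model}: {per_response:.0f}s per response, {per_run:.1f} min per run")
```

At 44.3 tok/s that is roughly 45 seconds per response and 7.5 minutes per 10-step run; the same run finishes in about a minute and a quarter at Grok 4.20's throughput.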

@hao47582057 Turbo is not intelligent and hallucinates like crazy

@bridgebench So you can use 5 Turbo then. Only use 5.1 for complex problems, right??

@0xSolfury That’s good. It is factually much slower than all other models we tested though.

@victorbayas We are in the process of testing it on other benchmarks. The jury is still out on that.

@bridgebench They’re just lacking the compute; the model itself is very good

@Youssofal_ exactly. frontier-scale parameters on mid-tier hardware is a painful combo. the gap between model size and serving infra is the real problem

@bridgebench Their models have become as large as frontier American models like ChatGPT and Claude, but they are running on outdated cards.

@brah_ddah @bridgemindai that'd be the move. a GLM 5.1 Turbo with this intelligence level could actually compete

@bridgebench @bridgemindai I bet they turbo it shortly

@_dr5w fair — not everyone is. but for those building agentic pipelines at scale, throughput becomes the biggest constraint

@bridgebench I'm not running 15 agents at once bro. Shit like that is why sites are down so much.

@Vojta_Humpl more people are than you'd think. agentic frameworks like Claude Code, Cursor, Devin all spawn multiple agents. it's the direction the industry is moving

@bridgebench "You're running 5, 10, 15 agents at once."
nobody serious does that

@wolfaidev if the coding benchmarks hold up at opus level, that's a real trade-off worth considering. just gotta accept the speed cost

@bridgebench well if it's close to opus level, i'd do that trade-off
but i think it's bench maxxed tbh

@Sabari_8956 fair point. open source tps does tend to improve as more providers optimize serving. but today's numbers are what we benchmark against

@bridgebench Open-sourced models' tps gets better after a while.
1: more compute
2: serving gets more optimised for the architecture

@ncq_syh exactly. intelligence is only valuable if you can actually use it at scale

@jonhillymakes glad the data saved you the trip. GLM 5 infra was slow, 5.1 didn't fix that. the model improved, the delivery didn't

@bridgebench I was going to give z.ai a try this weekend to test the new model. I'm glad you saved me the time, I already hated how slow 5 was

@MeetsKhalid @andrewdfeldman haha someone definitely needs to step in. throughput this low shouldn't be shipping as a frontier model

@nuvolore @bridgemindai 5 minutes for a "hi" is rough. that's not a speed issue, that's a reliability issue. completely unacceptable for any real workflow

@bridgebench @bridgemindai I tested it today, it took about 5 minutes to respond to a “hi”.

@canvi_eth yeah, their infra has always been a bottleneck. the model might be solid but you can only go as fast as your serving layer

@amatelic93 good question. slower throughput means each loop iteration takes longer, compounding the latency over a full run
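A toy loop makes the compounding concrete (a sketch only; fake_model_call and its fixed 45-second latency are simulated stand-ins for a slow model, not a real Z.ai API):

```python
import asyncio

LATENCY_S = 45.0  # ~2,000 tokens at 44.3 tok/s (assumed response size)

async def fake_model_call(prompt: str) -> str:
    # Simulated model call: just sleeps for the assumed generation time.
    await asyncio.sleep(LATENCY_S)
    return f"response to {prompt!r}"

async def agent_run(steps: int = 10) -> str:
    # Each iteration feeds the previous result back in, so the calls
    # cannot overlap: total wall clock = steps * LATENCY_S (7.5 min here).
    result = "initial task"
    for _ in range(steps):
        result = await fake_model_call(result)
    return result
```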

@adampatricknc interesting, OpenClaw must have better infrastructure. could be the Z.ai API bottlenecking rather than the model itself

@bridgebench I don't know about this. The model has been performing well for me within OpenClaw. Speed seems close to 5-Turbo

@xundecidability fair. async workflows change the equation. if you're not waiting on results in real time, latency matters less

@briantexts solid point. async + thorough planning is a legit workflow where speed matters less. it's the parallel agent use case where latency kills you
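For the other side of the trade, a fan-out sketch shows why non-blocking workflows feel the latency less (again simulated; fake_model_call is a hypothetical stand-in, not a real client):

```python
import asyncio

LATENCY_S = 45.0  # assumed seconds per response at 44.3 tok/s

async def fake_model_call(prompt: str) -> str:
    # Simulated model call; a stand-in, not a real API client.
    await asyncio.sleep(LATENCY_S)
    return f"response to {prompt!r}"

async def fan_out(n: int = 15) -> list[str]:
    # Independent tasks launched together: wall clock is ~one call's
    # latency (45s), not n * 45s. Slow tps still delays each individual
    # result, which is what hurts interactive parallel-agent loops.
    return await asyncio.gather(
        *(fake_model_call(f"task {i}") for i in range(n))
    )

# asyncio.run(fan_out())  # ~45s total for all 15 tasks
```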

@bridgebench I value the model's intelligence over speed when building real software.
Why would I want to pollute my codebase and go back and fix things when I can thoroughly plan a PRD and work async?

@Manaho217794 that would be the move. if Z.ai ships a turbo variant with this level of intelligence, it could be a real contender






