Bridgebench

40 posts


@bridgebench

The best vibe coding benchmark in the world. Built by @bridgemindai

United States · Joined March 2026
4 Following · 118 Followers
Bridgebench@bridgebench·
GLM 5.1 is the slowest frontier model we've ever benchmarked on BridgeBench. 44.3 tokens per second. Half the speed of GPT 5.4. Nearly 6x slower than Grok 4.20. Z.ai traded all of their speed for intelligence. The coding benchmarks improved. The throughput collapsed. In 2026, agentic coding is about parallelism. You're running 5, 10, 15 agents at once. A model this slow bottlenecks every workflow it touches. Intelligence without speed is a luxury most vibe coders can't afford. bridgebench.ai
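A rough back-of-envelope on what those throughput numbers mean in practice. The tokens/sec figures come from the tweet; the 20k-token task size is an illustrative assumption, not a BridgeBench measurement:

```python
# Back-of-envelope: how decode throughput bounds an agentic coding task.
# tokens/sec figures are from the tweet; tokens-per-task is an
# illustrative assumption, not a BridgeBench measurement.

TPS = {"GLM 5.1": 44.3, "GPT 5.4": 88.6, "Grok 4.20": 265.8}

def run_minutes(tokens_per_task: float, tps: float) -> float:
    """Wall-clock minutes to decode one task's output at a given speed."""
    return tokens_per_task / tps / 60

# Agents run in parallel, so one task's decode time sets the
# per-iteration latency floor for the whole fleet.
for model, tps in TPS.items():
    print(f"{model}: {run_minutes(20_000, tps):.1f} min per 20k-token task")
```

At these numbers a single 20k-token task takes roughly 7.5 minutes on GLM 5.1 versus about 1.3 on Grok 4.20, which is the bottleneck the tweet is describing.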
Bridgebench@bridgebench·
@hao47582057 Turbo is not intelligent and hallucinates like crazy
hao@hao47582057·
@bridgebench So you can use 5 Turbo then. Only use 5.1 for complex problems, right??
Bridgebench@bridgebench·
@0xSolfury That’s good. It is factually much slower than all other models we tested though.
Bridgebench@bridgebench·
@victorbayas We are in the process of testing it on other benchmarks. The jury is still out on that.
Victor Bayas@victorbayas·
@bridgebench They’re just lacking the compute, the model itself is very good
Bridgebench@bridgebench·
@Youssofal_ exactly. frontier-scale parameters on mid-tier hardware is a painful combo. the gap between model size and serving infra is the real problem
Youssof Altoukhi@Youssofal_·
@bridgebench Their models have become as large as frontier American models like ChatGPT and Claude, but they are running on outdated cards.
Bridgebench@bridgebench·
@_dr5w fair — not everyone is. but for those building agentic pipelines at scale, throughput becomes the biggest constraint
Drew@_dr5w·
@bridgebench I'm not running 15 agents at once bro. Shit like that is why sites are down so much.
Bridgebench@bridgebench·
@Vojta_Humpl more people are than you'd think. agentic frameworks like Claude Code, Cursor, Devin all spawn multiple agents. it's the direction the industry is moving
Vojta Humpl@Vojta_Humpl·
@bridgebench "You're running 5, 10, 15 agents at once." nobody serious does that
Bridgebench@bridgebench·
@wolfaidev if the coding benchmarks hold up at opus level, that's a real trade off worth considering. just gotta accept the speed cost
wolfaidev 🐺@wolfaidev·
@bridgebench well if its close to opus level, i'd do that trade off but i think its bench maxxed tbh
Bridgebench@bridgebench·
@Sabari_8956 fair point. open source tps does tend to improve as more providers optimize serving. but today's numbers are what we benchmark against
Sabari_ssh@Sabari_8956·
@bridgebench Open-sourced models' tps gets better after a while: 1) more compute comes online, 2) serving gets more optimised for the architecture
Bridgebench@bridgebench·
@ncq_syh exactly. intelligence is only valuable if you can actually use it at scale
Bridgebench@bridgebench·
@jonhillymakes glad the data saved you the trip. GLM 5 infra was slow, 5.1 didn't fix that. the model improved, the delivery didn't
Jon Hill@jonhillymakes·
@bridgebench I was going to give z.ai a try this weekend to test the new model. I'm glad you saved me the time, i already hated how slow 5 was
Bridgebench@bridgebench·
@nuvolore @bridgemindai 5 minutes for a "hi" is rough. that's not a speed issue, that's a reliability issue. completely unacceptable for any real workflow
Bridgebench@bridgebench·
@canvi_eth yeah, their infra has always been a bottleneck. the model might be solid but you can only go as fast as your serving layer
Bridgebench@bridgebench·
@amatelic93 good question. slower throughput means each loop iteration takes longer, compounding the latency over a full run
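A sketch of that compounding effect. The tokens/sec figures are from the thread; the iteration counts and 2k-tokens-per-iteration size are made-up assumptions for illustration:

```python
# Sketch: per-iteration decode latency compounds over an agent loop.
# An agent that plans -> edits -> tests pays the decode cost every round
# trip. tokens/sec from the thread; other numbers are assumptions.

def loop_seconds(iterations: int, tokens_per_iter: float, tps: float) -> float:
    """Total decode time for a full agent run, in seconds."""
    return iterations * tokens_per_iter / tps

slow, fast = 44.3, 265.8  # tokens/sec: GLM 5.1 vs Grok 4.20, per the thread
for n in (5, 20):
    gap = loop_seconds(n, 2_000, slow) - loop_seconds(n, 2_000, fast)
    print(f"{n} iterations: slow model adds {gap / 60:.1f} extra minutes")
```

The per-iteration gap is fixed, so the penalty grows linearly with loop depth: at these assumed sizes, a 20-iteration run on the slow model loses roughly 12.5 minutes versus the fast one.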
Bridgebench@bridgebench·
@adampatricknc interesting, OpenClaw must have better infrastructure. could be the Z.ai API bottlenecking rather than the model itself
electric.thought.forms@adampatricknc·
@bridgebench I don't know about this. The model has been performing well for me within OpenClaw. Speed seems close to 5-Turbo
Bridgebench@bridgebench·
@xundecidability fair. async workflows change the equation. if you're not waiting on results in real time, latency matters less
thomas@xundecidability·
@bridgebench Disagree. More agentic work is async now.
Bridgebench@bridgebench·
@briantexts solid point. async + thorough planning is a legit workflow where speed matters less. it's the parallel agent use case where latency kills you
brian@briantexts·
@bridgebench I value the model's intelligence over speed when building real software. Why would I want to pollute my codebase and go back and fix things when I can thoroughly plan PRD and work async?
Bridgebench@bridgebench·
@Manaho217794 that would be the move. if Z.ai ships a turbo variant with this level of intelligence, it could be a real contender
Manaho@Manaho217794·
@bridgebench So we'll have to wait for GLM-5.1 Turbo.