
Bridgebench
@bridgebench
The best vibe coding benchmark in the world. Built by @bridgemindai
United States · Joined March 2026
4 Following · 113 Followers

GLM 5.1 is the slowest frontier model we've ever benchmarked on BridgeBench.
44.3 tokens per second.
Half the speed of GPT 5.4.
Nearly 6x slower than Grok 4.20.
Z.ai traded all of their speed for intelligence.
The coding benchmarks improved.
The throughput collapsed.
In 2026, agentic coding is about parallelism.
You're running 5, 10, 15 agents at once.
A model this slow bottlenecks every workflow it touches.
Intelligence without speed is a luxury most vibe coders can't afford.
bridgebench.ai
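Back-of-the-envelope, those numbers translate into wall-clock time like this (a minimal sketch in Python; the 2,000-token response size and 10-step agent loop are illustrative assumptions, not BridgeBench test parameters):

```python
# Rough latency math for the throughput numbers above.
# ASSUMPTIONS: 2,000-token responses and a 10-step agent loop are
# illustrative stand-ins, not BridgeBench test parameters.
RESPONSE_TOKENS = 2_000
LOOP_STEPS = 10

throughput_tps = {
    "GLM 5.1": 44.3,        # measured on BridgeBench
    "GPT 5.4": 44.3 * 2,    # "half the speed of GPT 5.4"
    "Grok 4.20": 44.3 * 6,  # "nearly 6x slower than Grok 4.20"
}

for model, tps in throughput_tps.items():
    per_response = RESPONSE_TOKENS / tps      # seconds per response
    per_run = per_response * LOOP_STEPS / 60  # minutes per agent run
    print(f"{model}: {per_response:.0f}s per response, {per_run:.1f} min per run")
```

At 44.3 tok/s that is roughly 45 seconds per response and 7.5 minutes per 10-step run; the same run finishes in about a minute and a quarter at Grok 4.20's throughput.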

@hao47582057 Turbo is not intelligent and hallucinates like crazy

@bridgebench So you can use 5 Turbo then. Only use 5.1 for complex problems, right??

@0xSolfury That’s good. It is factually much slower than all other models we tested though.

@victorbayas We are in the process of testing it on other benchmarks. The jury is still out on that.

@bridgebench They’re just lacking the compute; the model itself is very good

@Youssofal_ exactly. frontier-scale parameters on mid-tier hardware is a painful combo. the gap between model size and serving infra is the real problem

@bridgebench Their models have become as large as frontier American models like ChatGPT and Claude, but they are running on outdated cards.

@brah_ddah @bridgemindai that'd be the move. a GLM 5.1 Turbo with this intelligence level could actually compete

@bridgebench @bridgemindai I bet they turbo it shortly

@_dr5w fair — not everyone is. but for those building agentic pipelines at scale, throughput becomes the biggest constraint

@bridgebench I'm not running 15 agents at once bro. Shit like that is why sites are down so much.

@Vojta_Humpl more people are than you'd think. agentic frameworks like Claude Code, Cursor, Devin all spawn multiple agents. it's the direction the industry is moving

@bridgebench "You're running 5, 10, 15 agents at once."
nobody serious does that

@wolfaidev if the coding benchmarks hold up at opus level, that's a real trade-off worth considering. just gotta accept the speed cost

@bridgebench well if it's close to opus level, i'd do that trade-off
but i think it's bench maxxed tbh

@Sabari_8956 fair point. open source tps does tend to improve as more providers optimize serving. but today's numbers are what we benchmark against

@bridgebench Open-sourced models' tps gets better after a while.
1: more compute
2: serving gets more optimised for the architecture

@ncq_syh exactly. intelligence is only valuable if you can actually use it at scale

@jonhillymakes glad the data saved you the trip. GLM 5 infra was slow, 5.1 didn't fix that. the model improved, the delivery didn't

@bridgebench I was going to give z.ai a try this weekend to test the new model. I'm glad you saved me the time, I already hated how slow 5 was

@MeetsKhalid @andrewdfeldman haha someone definitely needs to step in. throughput this low shouldn't be shipping as a frontier model

@nuvolore @bridgemindai 5 minutes for a "hi" is rough. that's not a speed issue, that's a reliability issue. completely unacceptable for any real workflow

@bridgebench @bridgemindai I tested it today, it took about 5 minutes to respond to a “hi”.

@canvi_eth yeah, their infra has always been a bottleneck. the model might be solid but you can only go as fast as your serving layer

@amatelic93 good question. slower throughput means each loop iteration takes longer, compounding the latency over a full run
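A toy loop makes the compounding concrete (a sketch only; fake_model_call and its fixed 45-second latency are simulated stand-ins for a slow model, not a real Z.ai API):

```python
import asyncio

LATENCY_S = 45.0  # ~2,000 tokens at 44.3 tok/s (assumed response size)

async def fake_model_call(prompt: str) -> str:
    # Simulated model call: just sleeps for the assumed generation time.
    await asyncio.sleep(LATENCY_S)
    return f"response to {prompt!r}"

async def agent_run(steps: int = 10) -> str:
    # Each iteration feeds the previous result back in, so the calls
    # cannot overlap: total wall clock = steps * LATENCY_S (7.5 min here).
    result = "initial task"
    for _ in range(steps):
        result = await fake_model_call(result)
    return result
```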

@adampatricknc interesting, OpenClaw must have better infrastructure. could be the Z.ai API bottlenecking rather than the model itself

@bridgebench I don't know about this. The model has been performing well for me within OpenClaw. Speed seems close to 5-Turbo

@xundecidability fair. async workflows change the equation. if you're not waiting on results in real time, latency matters less

@briantexts solid point. async + thorough planning is a legit workflow where speed matters less. it's the parallel agent use case where latency kills you
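For the other side of the trade, a fan-out sketch shows why non-blocking workflows feel the latency less (again simulated; fake_model_call is a hypothetical stand-in, not a real client):

```python
import asyncio

LATENCY_S = 45.0  # assumed seconds per response at 44.3 tok/s

async def fake_model_call(prompt: str) -> str:
    # Simulated model call; a stand-in, not a real API client.
    await asyncio.sleep(LATENCY_S)
    return f"response to {prompt!r}"

async def fan_out(n: int = 15) -> list[str]:
    # Independent tasks launched together: wall clock is ~one call's
    # latency (45s), not n * 45s. Slow tps still delays each individual
    # result, which is what hurts interactive parallel-agent loops.
    return await asyncio.gather(
        *(fake_model_call(f"task {i}") for i in range(n))
    )

# asyncio.run(fan_out())  # ~45s total for all 15 tasks
```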

@bridgebench I value the model's intelligence over speed when building real software.
Why would I want to pollute my codebase and go back and fix things when I can thoroughly plan a PRD and work async?

@Manaho217794 that would be the move. if Z.ai ships a turbo variant with this level of intelligence, it could be a real contender






