Design Arena

552 posts

@Designarena

World's first benchmark for real-world design with 5M+ creators and counting. Made by @intelligence_ai

Joined June 2025

10 Following 17.1K Followers

Design Arena@Designarena·13h

Curious about GPT-5.6 Sol's design intelligence? Here's an output comparison between GPT-5.5 and GPT-5.6 Sol

Design Arena@Designarena

BREAKING - OFFICIAL RESULTS: GPT-5.6 Sol by @OpenAI is 1st overall on Design Arena with an Elo of 1353. This puts GPT-5.6 Sol above Claude Fable 5 by @AnthropicAI and in the same performance band as GLM 5.2 by @Zai_org on frontend design. This is an 18-position and 60-point Elo leap from GPT-5.5. GPT-5.6 Sol also establishes a new Pareto frontier for preference vs. speed, faster than any model at this performance. Congratulations to the @OpenAI team on the launch!

340

40.4K

Design Arena@Designarena·17h

104

165

1.7K

376.6K

Design Arena@Designarena·2d

Introducing Voice Styles for Text-to-Speech Arena! Users can now guide model outputs with detailed voice descriptions across dimensions such as tone, emotion, age, accent, and more. This expands the evaluation space beyond audio quality, adding signal on steerability and expressive voice generation. With this addition, we’ve also updated matchmaking to ensure fair comparisons: only models that support the required capabilities are entered into tournaments against each other. Try out Voice Styles on Text-to-Speech Arena now - only on Design Arena.

2.8K

Design Arena@Designarena·2d

Seedream 5.0 Pro by @BytePlusGlobal is 4th overall on Image Arena with an Elo of 1318. This is a 15-rank and 96-point Elo jump over @BytePlusGlobal's second-highest ranked model, Seedream Lite 5.0, making Seedream 5.0 Pro their top model. Seedream 5.0 Pro performs especially well in the People & Portrait, Products, and Graphic Design categories, and it also ranks 4th overall on Graphic Design Arena. With this performance, @BytePlusGlobal is now established among the top 3 labs on Image Arena, following @reve and @OpenAI. Congratulations to the @BytePlusGlobal team on this accomplishment!

117

7.9K

Design Arena@Designarena·3d

GPT-5.6 Sol, Terra, and Luna by @OpenAI are now on Design Arena. GPT-5.6 releases in 3 tiers: Sol is OpenAI’s flagship model, Terra is a lower-cost, high-performance option, and Luna is the fastest and most cost-efficient model of the line. GPT-5.6 features stronger agentic workflows, improved front-end generation, and improved 3D design capabilities. Congratulations to the @OpenAI team on the launch!

OpenAI@OpenAI

GPT-5.6 Sol, along with Terra and Luna, will launch publicly this Thursday. We’re expanding preview access globally now.

177

14.9K

Design Arena@Designarena·3d

Muse Spark 1.1 by @AIatMeta is now available on Design Arena! Building upon the first Muse Spark, Muse Spark 1.1 features improvements in tool and computer-use workflows, coding tasks, and multimodal understanding with greater cost efficiency. Congrats to the @AIatMeta team on the launch!

AI at Meta@AIatMeta

We’re excited to introduce Muse Spark 1.1, a significant upgrade from the first Muse Spark model we released earlier this year. Along with this release, we are launching a public preview of the new Meta Model API where developers can access Muse Spark 1.1. The model is also available now in "Thinking" mode in the Meta AI app and on meta.ai. Learn more: go.meta.me/ff8e2c

6.5K

Design Arena@Designarena·3d

Reve 2.1 by @reve is 2nd overall on Image Editing Arena on Design Arena with an Elo of 1339. This is 10 Elo points higher than @reve’s previous model, Reve 2.0, closing the gap on GPT Image 2. With this newest iteration, Reve 2.1 establishes a new Pareto frontier in Preference vs. Speed in Image Editing Arena. Huge congratulations to the @reve team for establishing a new SOTA on Preference vs. Speed! x.com/reve/status/20…

Reve@reve

Reve 2.1 is here. The world’s best 4K image model just got better. Greater prompt understanding, world knowledge, and stronger foreign-text rendering.

120

28.4K

Design Arena@Designarena·3d

Compared to Reve 2, Reve 2.1 improves the most in the Marketing Materials, Abstract Patterns, and People & Portraits categories.

888

Design Arena@Designarena·3d

Reve 2.1 by @reve showcases enhanced world knowledge and visual reasoning. See a comparison below of Reve 2.0 and Reve 2.1 generating an image of a grocery store shelf. On this particular test, Reve 2.1 has the closest price match of the top image models, outperforming Reve 2.0 and GPT Image 2. This shows not only its ability to generate highly realistic product photos but also its improved world knowledge.

1.3K

Design Arena@Designarena·3d

BREAKING: Reve 2.1 by @reve is 2nd overall on Image Arena on Design Arena with an Elo of 1350. Launched only a month after Reve 2.0, Reve 2.1 improves by 17 Elo points, driven by gains in visual intelligence, closing the gap with GPT Image 2 by @OpenAI. Reve remains the top independent image generation lab, leading with a 43-point Elo gap over the fourth-place model, GPT Image 1.5 by @OpenAI. Congratulations to the @reve team on the launch! x.com/reve/status/20…

Reve@reve

Reve 2.1 is here. The world’s best 4K image model just got better. Greater prompt understanding, world knowledge, and stronger foreign-text rendering.

102

12.3K

Design Arena@Designarena·4d

BREAKING: Grok 4.5 by @SpaceXAI is 5th on Website Arena with an Elo of 1328. This is a 25-rank jump over @SpaceXAI’s previous highest-performing model. Trained with Cursor data, Grok 4.5 is in the same performance band as Claude Opus 4.6 (Thinking) by @AnthropicAI, achieving Opus-level coding performance on real-world website tasks. Congratulations to the @SpaceXAI team on the launch!

SpaceXAI@SpaceXAI

Announcing Grok 4.5, our first model trained specifically for coding and agents. It was trained with Cursor and offers frontier intelligence at leading speeds and cost efficiency. x.ai/news/grok-4-5

805

94.7K

Design Arena@Designarena·5d

x.com/i/article/2074…

8.9K

Design Arena reposted

Grace Li@grx_xce·Jul 4

Will you be at ICML? If your current obsessions include: • building evals from production data • verifying “non-verifiable” domains • RLHF/post-training/RL(insert letter)F • image/video generation • real-world benchmarks DM me if you’ll be in Seoul! I know a night market with the best tanghulu, and I have some events to invite you to :)

134

12.8K

Design Arena@Designarena·Jul 2

BREAKING: Gemini Omni Flash by @GoogleDeepMind is 1st overall on Video Arena with an Elo of 1404. Gemini Omni Flash establishes a 101 point Elo gap over Seedance 2.0 Mini by @BytePlusGlobal in 2nd place, one of the largest leaps we’ve ever seen on Video Arena. This establishes Google as the world’s leading video generation lab, with a leap of 7 positions from their Veo series. Congratulations to the @GoogleDeepMind team on this accomplishment!
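
For a rough sense of what a 101-point gap means, here is a minimal sketch that assumes the classic Elo logistic curve with a scale of 400. The post does not say how Design Arena computes its ratings, so the formula, the function name `elo_expected_score`, and the `scale` parameter are assumptions for illustration only.

```python
def elo_expected_score(rating_gap: float, scale: float = 400.0) -> float:
    """Expected head-to-head score of the higher-rated side.

    Uses the textbook Elo logistic curve (assumed here, not confirmed
    as Design Arena's method): E = 1 / (1 + 10 ** (-gap / scale)).
    """
    return 1.0 / (1.0 + 10.0 ** (-rating_gap / scale))


# The 101-point gap quoted in the post; everything else is illustrative.
gap = 101
print(f"Expected score at a {gap}-point gap: {elo_expected_score(gap):.3f}")
```

Under that assumption, a gap of 0 gives an expected score of exactly 0.5, and the value rises toward 1 as the gap grows.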

110

158.7K

Design Arena@Designarena·Jul 1

BREAKING: Gemini 3.1 Flash Lite Image (Nano Banana 2 Lite) by @GoogleDeepMind is 7th on Image Arena with an Elo of 1271. With an average generation time of around 5 seconds, Nano Banana 2 Lite is 37 seconds faster on average than the models ranked above it, which establishes a new Pareto frontier in Image Preference vs Speed. Congrats to the @GoogleDeepMind team for this accomplishment!
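
A "Pareto frontier in Preference vs Speed" can be read as the set of models that no other model beats on both axes at once. The sketch below shows one way to compute such a set; the `Model` type, the `pareto_frontier` function, and the four entries A to D with their numbers are hypothetical placeholders, not leaderboard data, and this is not presented as Design Arena's own method.

```python
from typing import List, NamedTuple


class Model(NamedTuple):
    name: str
    elo: float      # higher is better (preference)
    seconds: float  # lower is better (generation time)


def pareto_frontier(models: List[Model]) -> List[Model]:
    """Return models that are not dominated on (elo, seconds).

    A model is dominated if another model is at least as good on both
    axes and strictly better on at least one of them.
    """
    frontier = []
    for m in models:
        dominated = any(
            o.elo >= m.elo
            and o.seconds <= m.seconds
            and (o.elo > m.elo or o.seconds < m.seconds)
            for o in models
        )
        if not dominated:
            frontier.append(m)
    # Fastest first, for readability.
    return sorted(frontier, key=lambda m: m.seconds)


# Hypothetical entries, invented purely for illustration.
models = [
    Model("A", 1350, 42.0),
    Model("B", 1320, 48.0),  # slower and lower-rated than A
    Model("C", 1270, 5.0),
    Model("D", 1250, 9.0),   # slower and lower-rated than C
]
print([m.name for m in pareto_frontier(models)])
```

In this toy data, a fast model with a lower rating stays on the frontier because nothing that fast is rated higher, which is the situation the post describes.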

149

16.9K

Design Arena@Designarena·Jun 30

Claude Sonnet 5 by @AnthropicAI is now available on Design Arena! Anthropic’s most agentic Sonnet model yet, Claude Sonnet 5 brings major improvements across agent capabilities, coding, and knowledge & professional work, narrowing the performance gap with Opus 4.8 at a lower price point. Congrats to the @AnthropicAI team on the launch!

Claude@claudeai

Introducing Claude Sonnet 5, our most agentic Sonnet yet. It makes plans, uses tools like browsers and terminals, and runs autonomously at a level that just a few months ago required larger and more expensive models.

114

7.5K

Design Arena@Designarena·Jun 30

Mercury 2 by @_inception_ai is now available on Design Arena! Debuting as the world’s first reasoning dLLM (Diffusion Large Language Model), Mercury 2 generates responses through parallel refinement to give users reasoning-grade quality without compromising on speed. Congrats to the @_inception_ai team on the launch!

Stefano Ermon@StefanoErmon

Mercury 2 is live 🚀🚀 The world’s first reasoning diffusion LLM, delivering 5x faster performance than leading speed-optimized LLMs. Watching the team turn years of research into a real product never gets old, and I’m incredibly proud of what we’ve built. We’re just getting started on what diffusion can do for language.

3.1K

Design Arena@Designarena·Jun 29

Try it now at DesignArena.ai

776

Design Arena@Designarena·Jun 29

Introducing Video-to-Website on Design Arena! You can now generate websites from video and text inputs on Design Arena, making it easier to create dynamic, high-fidelity sites. Leaderboard coming soon!

2.5K

Explore

@OpenAI @AnthropicAI @Zai_org @BytePlusGlobal @reve @AIatMeta @SpaceXAI @GoogleDeepMind