( ✦﹏✦）

4.5K posts

@SynthSquid

idea collector

Joined September 2024

421 Following · 88 Followers

( ✦﹏✦） retweeted

Viv@Vtrivedy10·4h

GEPA is <1 year old 😮 incredible the impact that the ideas here have had on hill climbing + improving agents
does anyone know of cool work on looping/GEPA/Optimize_Anything + RL?
main ideas:
- eventually harness opt hits the wall of model intelligence
- we can break through that wall by RLing on good evals that increase model ability in the eval domains
- new weights shape intelligence where an updated harness can better use these new weights
- loop
Model-Harness codesign is really interesting, we’re pushing here much more with using traces to create datasets for self-improvement and there’s some interesting work to do in marrying Harness Eng and RL recipes here 👀

Lakshya A Agrawal@LakshyAAAgrawal

How does prompt optimization compare to RL algos like GRPO? GRPO needs 1000s of rollouts, but humans can learn from a few trials—by reflecting on what worked & what didn't. Meet GEPA: a reflective prompt optimizer that can outperform GRPO by up to 20% with 35x fewer rollouts!🧵

5.5K

( ✦﹏✦） retweeted

ρ:ɡeσn@pigeon__s·4h

if i see a single person say "erm they didnt compare against opus 4.7 or gpt-5.5" i will actually kill you these are insane numbers

2.1K

( ✦﹏✦） retweeted

◥◣ N I C O L E T T E ◥◣@nicoletteduclar·6h

you aren't ready for what CODEX 5.5 + GPT 2 just unlocked
you can vibe code a complete pokemon game, with all CC0 assets (mine is @moggmon)
> Codex 5.5 coded all UI + battle logic
> GPT 2 generated all sprites, animation, SFX
ask me anything about the process #vibejam

378

26.8K

( ✦﹏✦） retweeted

Vals AI@ValsAI·5h

DeepSeek v4 is now the #1 open-weight model on our Vibe Code Benchmark, and it’s not close. It leaves the #2 (Kimi K2.6) in the dust, and even beats out frontier closed source models like Gemini 3.1 Pro.

112

82.8K

( ✦﹏✦） retweeted

Lotto@LottoLabs·4h

Full benchmarks for DS4 vs DS3.2
Flash seems very solid

1.4K

( ✦﹏✦） retweeted

DeepSeek@deepseek_ai·4h

🚀 DeepSeek-V4 Preview is officially live & open-sourced! Welcome to the era of cost-effective 1M context length.
🔹 DeepSeek-V4-Pro: 1.6T total / 49B active params. Performance rivaling the world's top closed-source models.
🔹 DeepSeek-V4-Flash: 284B total / 13B active params. Your fast, efficient, and economical choice.
Try it now at chat.deepseek.com via Expert Mode / Instant Mode. API is updated & available today!
📄 Tech Report: huggingface.co/deepseek-ai/De…
🤗 Open Weights: huggingface.co/collections/de…
1/n

878

3.8K

21K

1.9M

( ✦﹏✦） retweeted

Paweł Huryn@PawelHuryn·1d

Anthropic has quietly shipped third-party inference for Cowork and Code in Claude Desktop. This should work with local models or OpenRouter via LiteLLM proxy. Is it just me?

1.1K

673.4K

( ✦﹏✦） retweeted

OpenRouter@OpenRouter·13h

@PawelHuryn Here's a guide for setting it up. No proxy needed. openrouter.ai/docs/guides/co…

1.2K

( ✦﹏✦） retweeted

Charly Wargnier@DataChaz·22h

Google’s level of disrespect is OFF THE CHARTS right now. Anthropic really thought they had us locked down with Claude Design’s ridiculous rate limits… …and now Google has literally countered it straight away by open-sourcing DESIGN.md 🤯

Stitch by Google@stitchbygoogle

Today, we’re open-sourcing the draft specification for DESIGN.md, so it can be used across any tool or platform. We’re also adding new capabilities. DESIGN.md lets you easily export and import your design rules from project to project. Instead of guessing intent, agents know exactly what a color is for and can even validate their choices against WCAG accessibility rules. Watch David East break down this shared visual language in action👇. New capabilities and links in 🧵

118

421

7.1K

1.5M

( ✦﹏✦） retweeted

Lisan al Gaib@scaling01·5h

DeepSeek-V4 Pricing

Lisan al Gaib@scaling01

DEEPSEEK-V4 FLASH AND PRO ITS HAPPENING api-docs.deepseek.com

363

49.8K

( ✦﹏✦） retweeted

Parallel Web Systems@p0·12h

The best web search for agents is now free. Upgrade to Parallel's web search tools in any MCP-supported tool or agent, for free, in under 60 seconds. No account. No API keys. Zero cost. docs.parallel.ai/integrations/m…

217

101.3K

( ✦﹏✦）@SynthSquid·5h

@ProperPrompter based

184

proper@ProperPrompter·9h

i don't like gpt 5.5

719

17.4K

( ✦﹏✦）@SynthSquid·6h

@scaling01 x.com/SynthSquid/sta…

( ✦﹏✦）@SynthSquid

@robinebers 🤫

QME

Lisan al Gaib@scaling01·12h

im obviously fine with this if it consistently uses only half the tokens

Lisan al Gaib@scaling01

GPT-5.5 is 2x more expensive than GPT-5.4

120

6.4K

( ✦﹏✦）@SynthSquid·7h

@dhtikna Dont x.com/SynthSquid/sta…

( ✦﹏✦）@SynthSquid

@robinebers 🤫

212

Ankith 🐋/acc@dhtikna·11h

The 2x price increase basically made me lose all interest in gpt 5.5

1.8K

( ✦﹏✦）@SynthSquid·7h

@robinebers 🤫

QME

366

Robin Ebers | AI Coach for Founders@robinebers·8h

GPT 5.5 is now more expensive than Opus 4.7
remember when I posted this and everyone disagreed?
GPT-5: $1.25 / $10
GPT-5.1: $1.60 / $12
GPT-5.2: $2.00 / $14
GPT-5.3: $2.30 / $16
GPT-5.4: $2.80 / $18
GPT-5.5: $5.00 / $30
GPT-5 got more expensive with EVERY MODEL RELEASE and is now 3x (!!!) the price that it was when it first launched
we’re fucked

Robin Ebers | AI Coach for Founders@robinebers

everyone says AI models are getting cheaper
they're not
- gpt-5.2 is 40% more expensive than 5.1
- gemini 3 pro is 60% more expensive than 2.5 pro
- sonnet with 1M context: 2x the price
the world is slowly waking up to the fact that compute is expensive
... and you get what you pay for

17.1K
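
A quick sanity check on the price list quoted in this post, as a minimal sketch. It assumes each pair is (input, output) in USD per 1M tokens, which the post does not state. The names `quoted_prices`, `price_ratio`, and `job_cost` are hypothetical helpers made up for illustration, and the figures are copied from the post, not independently verified.

```python
# Prices exactly as quoted in the post above.
# ASSUMPTION: each pair is (input, output) in USD per 1M tokens.
quoted_prices = {
    "GPT-5": (1.25, 10.00),
    "GPT-5.1": (1.60, 12.00),
    "GPT-5.2": (2.00, 14.00),
    "GPT-5.3": (2.30, 16.00),
    "GPT-5.4": (2.80, 18.00),
    "GPT-5.5": (5.00, 30.00),
}


def price_ratio(newer, older, prices=quoted_prices):
    """Return (input_ratio, output_ratio) of the newer model's price to the older one's."""
    new_in, new_out = prices[newer]
    old_in, old_out = prices[older]
    return new_in / old_in, new_out / old_out


def job_cost(model, input_tokens, output_tokens, prices=quoted_prices):
    """Return the USD cost of one job under the per-1M-token assumption."""
    price_in, price_out = prices[model]
    return (input_tokens * price_in + output_tokens * price_out) / 1_000_000


if __name__ == "__main__":
    # Ratio of the newest quoted price to the launch price and to the previous release.
    print(price_ratio("GPT-5.5", "GPT-5"))
    print(price_ratio("GPT-5.5", "GPT-5.4"))
    # Hypothetical workload: the same job on GPT-5.5 using half the tokens.
    print(job_cost("GPT-5.4", 1_000_000, 100_000))
    print(job_cost("GPT-5.5", 500_000, 50_000))
```

Taking the quoted figures at face value, GPT-5.5 comes out at 4x the GPT-5 input price and 3x its output price, and roughly 1.79x / 1.67x the GPT-5.4 prices. The last two calls use a made-up workload to show how a job that needs half the tokens can still end up cheaper overall.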

( ✦﹏✦）@SynthSquid·7h

@songjunkr every comment about dsv4 pushes the release date back by 1 minute

송준 Jun Song@songjunkr·8h

What happened to DeepSeek V4? Now is the perfect time for them to release it. Has anyone heard any news?

8.7K

( ✦﹏✦）@SynthSquid·7h

I'm ready for spud-mini, probably releasing in 12 days (per gpt5.4 release vs mini)

( ✦﹏✦）@SynthSquid·8h

@theo dude GPT-5.5 (low) is 51 !?

121

Theo - t3.gg@theo·13h

GPT-5.5 (medium) is tied for SOTA on Artificial Analysis. GPT-5.5 (high) and GPT-5.5 (xhigh) are meaningfully ahead. xhigh is the first model to break the 50's

1.4K

68K

( ✦﹏✦）@SynthSquid·8h

@theo i wish they tested low 🔅

( ✦﹏✦）@SynthSquid·9h

@altryne @sharifshameem it should be smart enough, it probably needs super specific prompting and a validation loop

Alex Volkov@altryne·11h

@sharifshameem We actually tried this exact flow live on stream and the results were very meh. Image generated a crazy over-the-top design and 5.5 didn't even try to reimplement it :/

2.7K

Sharif Shameem@sharifshameem·11h

GPT 5.5 is the greatest frontend model in the world when combined with imagegen 2. it can build working products that feel like they're from sci-fi movies. it can take my blog's URL and one shot a 32 page magazine. a world-class designer and coder for $20 a month is surreal

389

20.8K

Discover

@moggmon @PawelHuryn @ProperPrompter @scaling01 @dhtikna @robinebers @songjunkr @elonmusk