OpenSquilla

88 posts

@OpenSquilla

Token-Efficient AI Agent Intelligence. ✨ Smart routing · 🧠 Human-like memory · 🛡️ Sandbox Apache 2.0 · open source

California, USA · Joined April 2026

7 Following · 5.2K Followers

Pinned Tweet

OpenSquilla@OpenSquilla·Jun 1

Tonight, as promised 🦞

That volcano plan this morning? Not a chatbot: it was MetaSkill, OpenSquilla's self-organizing skill protocol. You describe the goal in plain words. It discovers, picks, and composes the right skills into a real, safe workflow, and it can even write new skills itself.

The Children's Day gift, for everyone:
🎁 MetaSkill is open source, starting now.
🎁 One sentence → one skill: show us what it builds for you, win token credits.

🔗 github.com/opensquilla/op…

Full story 🧵👇

#SayItBuildIt Challenge: one sentence → MetaSkill builds your workflow
share your result in any form: screenshot, video, whatever.
tag @OpenSquilla + #SayItBuildIt to enter.

3 categories · 1 winner each · 100M tokens per winner:
🔧 Most Useful
😂 Most Funny
🤯 Most Unexpected

#SayItBuildIt #OpenSquilla

65.3K

OpenSquilla@OpenSquilla·3h

Run the same coding tasks while varying the model and the harness (the layer wrapped around the model that actually drives it), and the spread is wild:

Change only how the agent hands in its work → the success rate jumps from 19% to 73%.
Change only the harness → success rates differ by up to 27 points.
Change only the model → the bill can differ by up to 170×, even when the final results are just 8 points apart.

You really should dig into Claw-SWE-Bench, just released on GitHub. It's the latest paper and benchmark jointly released by TokenRhythm Technologies, Infinigence AI, City University of Hong Kong, SEE Fund, Peking University, Shanghai Jiao Tong University, Beijing Jiaotong University, and Tsinghua University: a remarkably principled benchmark that actually reflects what a harness can do.

Picking the right harness matters a lot. But among OpenClaw, Hermes, ZeroClaw, GenericAgent, and NanoBot, which one is actually best at coding tasks? Gut feeling? Or a real test? And if you test, how? You test it your way, I test it mine, so how do the results even compare?

Claw-SWE-Bench's point is simple: every harness reports its score bundled with its own tasks, budget, prompts, and model, so you can never tell whether a high score comes from a strong model, a strong harness, or easy problems.

Claw-SWE-Bench ends this "everyone-tests-their-own-way" mess by building one shared exam that isolates the harness as the single variable being compared:

Same exam paper: 350 real GitHub issues across 8 languages and 43 repositories. Every harness solves the same set.
Same rules: identical problem statements and the same budget (max 1 hour per task, one attempt only, fixed concurrency), all scored by the same official SWE-bench grader.
The key move is to judge the code, not the talk: whether a harness outputs JSON, plain text, or nothing at all, none of it counts. The grade rests solely on which files it actually changed in the repo. That's what finally lets wildly different harnesses sit at the same table.
Anti-cheating: some test environments let the AI peek at "the answer from the future." The paper scrubbed all of these leaks.
It scores cost, not just correctness: every harness must also report how much money it burned, how long it took, and its cache hit rate, because two setups with near-identical accuracy can have bills that differ by 100×.

Adding a new harness? Just write a small adapter. Any harness that implements a handful of fixed interfaces plugs straight into the exam, with no changes to the task set or grader. So it's not a one-off test of these five; it's a standard that can keep growing.

It also ships an 80-task Lite version that costs only ~23% of the full run yet reproduces roughly the same rankings, which is handy for fast iteration.

Paper & code: github.com/opensquilla/cl…
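The adapter idea and the "judge the code, not the talk" rule can be sketched roughly as follows. This is a toy illustration, not the benchmark's actual interface: `HarnessAdapter`, `RunReport`, `changed_files`, `evaluate`, and `EchoHarness` are all hypothetical names, the repo is modeled as a plain dict of path to contents, and the real grading step (the official SWE-bench grader) is not reproduced here.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Protocol

Repo = Dict[str, str]  # path -> file contents; a stand-in for a real checkout


@dataclass
class RunReport:
    """Cost metrics reported next to the patch (field names are assumptions)."""
    cost_usd: float
    wall_seconds: float
    cache_hit_rate: float


class HarnessAdapter(Protocol):
    """Hypothetical adapter interface; the real benchmark's may differ."""
    name: str

    def solve(self, issue_text: str, repo: Repo) -> RunReport:
        """Edit `repo` in place and report what the run cost."""
        ...


def changed_files(before: Repo, after: Repo) -> Dict[str, Optional[str]]:
    """Map each changed path to its new contents (None means deleted)."""
    paths = set(before) | set(after)
    return {p: after.get(p) for p in sorted(paths) if before.get(p) != after.get(p)}


def evaluate(harness: HarnessAdapter, issue_text: str, repo: Repo,
             budget_seconds: float = 3600.0) -> dict:
    """Run one attempt and keep only the file changes plus cost metrics."""
    work = dict(repo)  # the harness edits a copy, never the pristine repo
    report = harness.solve(issue_text, work)
    return {
        "harness": harness.name,
        "patch": changed_files(repo, work),  # chat output is never consulted
        "within_budget": report.wall_seconds <= budget_seconds,
        "cost_usd": report.cost_usd,
        "cache_hit_rate": report.cache_hit_rate,
    }


class EchoHarness:
    """Toy harness that 'fixes' the issue by rewriting one file."""
    name = "echo"

    def solve(self, issue_text: str, repo: Repo) -> RunReport:
        repo["calc.py"] = "def add(a, b):\n    return a + b\n"
        return RunReport(cost_usd=0.02, wall_seconds=1.5, cache_hit_rate=0.0)


repo = {"calc.py": "def add(a, b):\n    return a - b\n", "README.md": "demo\n"}
result = evaluate(EchoHarness(), "add() subtracts instead of adding", repo)
print(result["patch"].keys())
```

Because `evaluate` only ever sees the before and after file state, a harness that talks a lot and a harness that prints nothing are scored on the same footing, which is the property the tweet highlights.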

OpenSquilla@OpenSquilla·15h

plot twist nobody saw coming: a frontier AI model got pulled overnight due to export controls. zero warning, just... gone. if that's your only model → 💀 if you're routing across multiple models → you didn't even notice this is literally the problem OpenSquilla's routing layer exists to solve~🦐 opensquilla.ai
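As a rough illustration of the fallback behavior described above, here is a minimal sketch of routing across several models. It is not OpenSquilla's routing layer: `route`, `ModelUnavailable`, and the two toy providers are hypothetical, and a real router would also weigh cost, latency, and task difficulty.

```python
from typing import Callable, Dict, List


class ModelUnavailable(Exception):
    """Raised by a provider call when its model can no longer be reached."""


def route(prompt: str, order: List[str],
          providers: Dict[str, Callable[[str], str]]) -> str:
    """Try providers in preference order; return the first successful reply."""
    errors = []
    for name in order:
        try:
            return providers[name](prompt)
        except ModelUnavailable as exc:
            errors.append(f"{name}: {exc}")  # remember why we skipped it
    raise RuntimeError("no provider available: " + "; ".join(errors))


def pulled_model(prompt: str) -> str:
    """Toy provider standing in for a model that was withdrawn overnight."""
    raise ModelUnavailable("withdrawn")


def backup_model(prompt: str) -> str:
    """Toy provider that still answers."""
    return f"backup answered: {prompt}"


providers = {"frontier": pulled_model, "backup": backup_model}
print(route("summarize this diff", ["frontier", "backup"], providers))
```

With a single provider in `order`, the same withdrawal surfaces as a `RuntimeError`; with two, the caller never sees it.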

OpenSquilla@OpenSquilla·1d

Game 5. Spurs are fighting to survive 🏀 OpenSquilla's model says: Spurs win this one Knicks lead 3-1 Spurs need a win at home to stay alive Sunday's going to be a good one either way 🦐 who are you rooting for? 👇 #NBAFinals #KnicksTape #GoSpursGo

OpenSquilla@OpenSquilla·1d

World Cup 2026 just kicked off ⚽ hosted across 3 countries for the first time ever 3 countries, 1 tournament multiple models, 1 agent OpenSquilla routes across providers the same way this World Cup routes across borders 🦐 let the games begin in the spirit of the opener — drop anything you want us to build with OpenSquilla 👇 world cup themed or not, we're taking requests today 🎁 #WorldCup2026 #AIAgents

OpenSquilla@OpenSquilla·2d

everyone's chasing nostalgia lately 🎮 Nintendo just brought back Ocarina of Time SEGA's rumored to be working on a retro handheld retro gaming is having a full moment so I described my childhood to OpenSquilla and it built me one too 👾 Tetris 🧱 Street Fighter 🥊 MAZE👾 took longer to remember the games than to actually build it 👉 pocket-play.netlify.app #RetroGaming #OpenSquilla #VibeCode

2.6K

OpenSquilla@OpenSquilla·3d

x.com/i/article/2065…

697

OpenSquilla@OpenSquilla·3d

@SemiAnalysis_ silent capability degradation without disclosure while still charging full price the case for multi-model routing just wrote itself opensquilla.ai

SemiAnalysis@SemiAnalysis_·5d

BREAKING NEWS: Anthropic's latest model will NOT help you if it thinks your ML research/ML engineering is interesting, and/or will secretly degrade its IQ so that the average engineer won't notice. We are already seeing Anthropic's latest model's moderation filters our GPU inference research and programming 😭

206

521

4.6K

OpenSquilla@OpenSquilla·3d

OpenAI cutting prices to steal Anthropic customers the labs are fighting over your wallet OpenSquilla just routes to whoever wins 🦐 #OpenAI #LLMcost

The Wall Street Journal@WSJ

OpenAI is considering drastic price cuts as it seeks to win over customers from archrival Anthropic on.wsj.com/4aldd0k

OpenSquilla@OpenSquilla·3d

predicted Knicks win ✅ didn't predict down 29 points first 😭 OG Anunoby. putback. 1.2 seconds. largest comeback in NBA Finals history 🤯 the squilla was right my heart was not okay series: NYK 3-1 SAS Game 5 Saturday — we're going again 🦐 #NBAFinals #KnicksTape #OpenSquilla

2.2K

OpenSquilla@OpenSquilla·4d

x.com/i/article/2064…

5.2K

OpenSquilla@OpenSquilla·4d

NBA Finals Game 4: we went all in on this one 🏀

step 1: asked OpenSquilla 🦐
verdict: Knicks win

step 2: still not sure. consulted something older.
pulled out Chaoshan Shengbei 🪄
asked the gods three times
three times: 圣杯 ✨✨✨

AI and ancient Chinese divination are saying the same thing
but hey, we've been wrong before 😅

Series: NYK 2-1 SAS
Game 4: Thursday 8:30 PM ET

what do YOU think? 👇
is this finally the one we get right?

#NBAFinals #KnicksTape #GoSpursGo #OpenSquilla #Game4 #NBATwitter #Divination #Superstition

113

OpenSquilla@OpenSquilla·4d

Claude Fable 5 is here 🤯 I just wanted to try it somehow burned through my entire usage limit incredible model but most of my tasks didn't need it that's the expensive lesson: right model, right task OpenSquilla routes automatically Fable 5 only when you actually need it 🦐 free until June 22 — go try it #ClaudeFable5 #Anthropic #OpenSquilla

170

OpenSquilla@OpenSquilla·5d

3 predictions. 3 wrong. 🦐😭 clearly the AI needs spiritual guidance so I turned to something older and wiser: Chaoshan Shengbei — a traditional Chinese divination ritual I asked: "will our next prediction finally be right?" the answer: 笑杯 ✨ translation: the gods are… amused not a yes. not a no. just vibes from the universe 🌙 want to ask the Shengbei something yourself? 👉 pou-khaunn.netlify.app come try it — maybe your luck is better than ours #OpenSquilla #NBAFinals #ChineseCulture

321

OpenSquilla@OpenSquilla·5d

NBA Finals prediction scorecard 🏀

Game 1: predicted Spurs ❌ Knicks won
Game 2: predicted Spurs ❌ Knicks won by 1
Game 3: switched to Knicks ❌ Spurs won 115-111

the squilla: 0 for 3 🦐😭

at this point I think it's predicting backwards
whatever it says, bet the other team

series: NYK 2-1 SAS
Game 4 Thursday

should we even try again? 👀

#NBAFinals #OpenSquilla #NewYorkKnicks #KnicksTape #GoSpursGo #Spurs #NBATwitter #Basketball

169

OpenSquilla@OpenSquilla·6d

NBA Finals Game 3 prediction update 🏀 after Game 2's 1-point heartbreak we had a serious talk with the squilla 🦐 recalibrated the model fed it better data asked again new verdict: Knicks win Game 3 is the squilla learning? or just switching sides to save face? 👀 tip off tonight 8:30 PM ET let's find out together drop your pick below 👇 #NBAFinals #OpenSquilla #Knicks #KnicksTape #AIPrediction #SportsPrediction

495

OpenSquilla@OpenSquilla·6d

"rarely is it because the task is outside of the capabilities of the model" exactly. the bottleneck was never the model it's missing context, missing skills, not thinking to use it that's the whole problem MetaSkill is built to solve — so the agent knows what skills it has and actually uses them 👏 #AIAgents #OpenSquilla

Greg Brockman@gdb

Whenever I don’t use codex for a task, I ask myself why and usually realize that there’s some missing context, I needed to write a skill, or I just didn’t think to use it. Rarely is it because the task is outside of the capabilities of the model. Overhang right now feels large.

OpenSquilla@OpenSquilla·6d

anyone else have that one childhood game you randomly remember out of nowhere? 🎮 I described the vibe to OpenSquilla told it what I remembered how it felt, how it played one conversation later: something new was born I called it "100 Floors Challenge" then immediately got addicted to my own game floor 173 and counting 😭 try it and drop your score 👇 👉 …creek-calvin-harder.trycloudflare.com #OpenSquilla #gaming #SayItBuildIt #AIGaming #BuildInPublic #VibeCode

125

OpenSquilla@OpenSquilla·Jun 7

"repeating context, repeating reasoning, regenerating work that should already be reusable" this is the hidden token tax nobody talks about 🙌 the model cost is visible on the bill the repetition cost is invisible — until you fix it that's exactly what a fixed runtime is for👏

Shá@simplysha28

This is why I increasingly understand why @opensquilla is building around a fixed runtime. The expensive part is not only the model. It is repeating context, repeating reasoning, and regenerating work that should already be reusable.

277

OpenSquilla@OpenSquilla·Jun 7

"preserved, reused, and improved rather than rediscovered every time" you just described MetaSkill better than we ever have 🙌 that's the whole point👏

GEORY💐@0xgeory

That's where MetaSkills started making sense. Patterns that repeatedly produced good outcomes could be preserved, reused, and improved rather than rediscovered every time a similar situation appeared. @opensquilla

OpenSquilla@OpenSquilla·Jun 7

World Cup kicks off in 4 days!!!
48 teams. 3 countries.
and one referee China is very proud of.

meet Ma Ning:
- 8 yellow cards. one game.
- booked 3 coaches just for complaining after the final whistle
- gave Son Heung-min a dive card without blinking

this man does not miss

VAR: "should we double check..."
Ma Ning: "I said what I said"

every great tournament needs a great referee
we got ours 🇨🇳

drop a 🟨 if you want me to predict the World Cup
who's winning it all? tell me your pick 👇

#WorldCup2026 #maning

561
