Bella Casa

474 posts

@mxworxtonic

Joined April 2026

268 Following · 9 Followers

Bella Casa retweeted

Liquid AI@liquidai·15h

Building a model is just the start. Post-training makes it useful. Our CTO Mathias Lechner (@mlech26l) sits down for a conversation with Maxime Labonne (@maximelabonne), our head of post-training, on the pipeline that takes a base model from autocomplete to something that can reason and follow instructions.

Bella Casa retweeted

Greg Kamradt@GregKamradt·16h

"Code and math are taking off because they are easy to verify, the next frontier is domains that are hard to verify"

This got me thinking - what does the spectrum of "easy to verify" look like? This is loosely aligned w/ @DarioAmodei's "intelligence bottlenecked" domains.

My take of easy > hard:

- Level 1: Instant, objective verification
  Math, code, formal proofs, chess tactics, parsing
  AI improvement is easiest here because the loop is tight

- Level 2: Fast but incomplete verification
  Software engineering, UI implementation, data analysis, security bug finding
  You can test a lot, but not everything. “It passes tests” is not the same as “it is good”

- Level 3: Human-evaluable creative work
  Copywriting, design, video thumbnails, sales emails, landing pages
  Verification is possible through humans or markets, but noisy. AI can improve by predicting human reaction, but taste shifts and metrics can be gamed
  There is no "right" answer, only feedback from humans

- Level 4: Market-verifiable work
  Startups, investing, product strategy, hiring, pricing, distribution
  Reality gives feedback, but slowly and with tons of confounders

- Level 5: Experimentally verifiable science
  Materials, biology, chemistry, medicine, robotics
  There is ground truth (physics), but experiments cost time and money. AI helps most when it can propose better candidates and reduce search space

- Level 6: Institutionally verifiable systems
  Education systems (Alpha school), legal systems, city planning, corporate management systems
  You can measure outcomes, but the feedback cycle is long, and the counterfactual is hard

- Level 7: Civilization-scale verification
  Democracy variants, alternative governance, monetary systems, cultural norms, geopolitical strategy
  Verification is slow, morally loaded, noisy, and often impossible to isolate. You may never get a clean answer, only accumulated historical evidence

Bella Casa retweeted

GitHub@github·2d

1/ We are sharing additional details regarding our investigation into unauthorized access to GitHub's internal repositories. Yesterday we detected and contained a compromise of an employee device involving a poisoned VS Code extension. We removed the malicious extension version, isolated the endpoint, and began incident response immediately.

Bella Casa retweeted

Teknium 🪽@Teknium·2d

4 big upgrades to Hermes Agent's speed today. Run `hermes update` to get moving faster now.

Bella Casa retweeted

Vaishnavi@_vmlops·1d

OPENAI DROPPED A PDF ON HOW THEY USE CODEX INTERNALLY and it's actually useful

their engineers across security, infra, frontend, and api teams use it daily for:

▫️ understanding unfamiliar codebases fast (especially during incidents)
▫️ refactoring changes that span dozens of files
▫️ generating tests for edge cases devs usually miss
▫️ scaffolding boilerplate so you ship faster
▫️ staying in flow when your calendar is a disaster

the one that hit different: one engineer said "i was in meetings all day and still merged 4 PRs because codex was working in the background"

cdn.openai.com/pdf/6a2631dc-7…

Bella Casa retweeted

will brown@willccbb·1d

one of the biggest misconceptions about RL is that it's super expensive

sure, training a 2T param model at 1M context on 100K environments for several weeks straight is expensive

but specializing small-to-medium models for SOTA in-domain perf really isn't

Prime Intellect@PrimeIntellect

These experiments were done on Lab with Llama-3.2-1B, with most training runs completing in <30min, and using <$1 in Lab credits. Reward hacking and model behavior are excellent targets for crowdsourced research, where scaling patterns can be studied for many parallel methods.

Bella Casa retweeted

Nicolas Bustamante@nicbstme·1d

Welcome to the "G" of Artificial GENERAL intelligence (AGI)

Noam Brown@polynoamial

This is a general-purpose LLM. It wasn’t targeted at this problem or even at mathematics. Also, it’s not a scaffold. We have not pushed this model to the limit on open problems. Our focus is to get it out quickly so that everyone can use it for themselves.

Bella Casa retweeted

Emad@EMostaque·1d

a16z@a16z

AI repeals the Mythical Man Month: "Rather than requiring large teams across multiple subsystems that need to coordinate, AI models are developed by smaller teams whose output increases in quality as a function of the data and compute thrown at them." "To wit, now you can throw money at software engineering in order to get more output." @martin_casado and @abhishekn in @FortuneMagazine: fortune.com/2026/05/20/ai-…

Bella Casa retweeted

Nous Research@NousResearch·1d

Hermes Agent now has access to hundreds of browser skills through @browserbase’s new Browse.sh hub, so agents can more reliably perform any task on the internet. You can try a skill from their catalog or contribute your own.

Bella Casa retweeted

Teknium 🪽@Teknium·1d

Access skills made directly for the websites your agents use with Browserbase's domain skills hub integration in Hermes Agent! Get early access with `hermes update`.

Nous Research@NousResearch

Bella Casa retweeted

Pau Labarta Bajo@paulabartabajo_·1d

Advice for AI engineers 💡 Web automation agents don't need a huge proprietary model. Fine-tune a small model with GRPO on BrowserGym tasks and you get a reliable browser controller. End-to-end tutorial ↓ docs.liquid.ai/examples/lapto…

Bella Casa retweeted

Z.ai@Zai_org·1d

x.com/i/article/2057…

Bella Casa retweeted

Browser Use@browser_use·1d

Use /goal wisely...

shawn@shawn_pana

x.com/i/article/2057…

Bella Casa retweeted

Google AI Studio@GoogleAIStudio·1d

x.com/i/article/2056…

Bella Casa retweeted

Neil Borate@ActusDei·1d

PM forgot to mention this in his speech. Buy crypto. No TDS, No LRS. Sell abroad or hold in USD stablecoin. This will drain forex big time even while traditional remittances go through hoops. Good story by @sugataghoshET

Bella Casa retweeted

Normal Guy@Normal_2610·19h

India taxes crypto at 30% on gains and 1% TDS on every single trade. But Binance said there are no requirements in any law… specifying withdrawal limits on virtual digital assets. They allow free withdrawals to private wallets. But they did not say it inside Parliament hall. Binance was invited to the Parliamentary Standing Committee on Finance meeting on 20 May 2026.

Traditional remittances go through LRS with a $250K annual cap and 20% TCS above 10 lakh rupees. Crypto has none of these gates. Over $42 billion in trading volume shifted offshore since 2022 because the tax made staying more expensive than leaving. Toll booth on an empty highway.

India has 100m+ crypto users and no crypto law. You can buy bitcoin on a registered exchange, move it to your own wallet, convert it to USDT, and sit on dollar value with zero LRS paperwork. Meanwhile if you want to send $1000 to your kid studying abroad, you fill forms, pay 20% TCS above 7 lakh, and wait for the bank to approve it. One channel is regulated to death. The other has no gate at all.

The deeper problem is not tax evasion. It is dollarization through the back door. Every Indian who converts rupees to USDT and holds it is choosing the dollar over the rupee without the RBI knowing or being able to stop it. The DRI already found gold smuggling rings using USDT to move money to China. Global stablecoin supply crossed $316 billion in April 2026. India has no way to measure how much of that sits in Indian wallets. Cannot defend a currency you cannot track.

Neil Borate@ActusDei

Bella Casa retweeted

Philipp Schmid@_philschmid·1d

Give Gemini its own isolated Linux sandbox. Let it reason, run code, browse the web, and manage files. In one API call. Want custom behavior? Define agents in markdown, add skills, mount repos, provide credentials. Early preview, sandbox compute is free. 👇

Google AI Studio@GoogleAIStudio

x.com/i/article/2056…

Bella Casa retweeted

Aaron Levie@levie·1d

Great post on FDEs. Everyone should read it if you’re interested in this job category. This is a job that is going to be around as long as AI keeps changing rapidly, which it inevitably will.

People often wonder why this isn’t just like deploying other forms of technology in the past, like cloud. Because something like cloud adoption affected a fairly concentrated set of users (developers and IT), and generally didn’t require a fundamental change to the workflows of employees to get the benefits of the new service being delivered on the cloud. At best you went to one training session and you were done.

With agents, the work to implement them is not only highly technical, but they directly impact the underlying workflows that people participate in. This means there’s a ton of technical work and change management that comes with it.

Further, the pace of change of cloud wasn’t nearly as quick, so there was a lot more time for best practices to propagate. Now, every model change means either something new can be done that wasn’t possible before, or some piece of scaffolding is now redundant or holding you back. This is why it’s commonly easier for a vendor or partner that’s seen the implementation hundreds or thousands of times to help do the work, even with internal support from the customer.

So, this job isn’t going away any time soon, and will be a great path for a lot of technical talent, especially early career.

vas@vasuman

x.com/i/article/2057…

Bella Casa retweeted

Ahmad@TheAhmadOsman·1d

DROP EVERYTHING

The bible for running LLMs locally is now available online to read for free

Covers what to use on
- Laptop / edge / odd hardware
- Mac-first workflows
- Single RTX GPUs
- 2-4+ NVIDIA / CUDA GPUs
- General production serving
- Long-context / MoE / routing
- NVIDIA max performance
- Cluster orchestration

Software
- llama.cpp
- MLX / MLX-LM
- ExLlamaV2
- ExLlamaV3
- vLLM
- SGLang
- TensorRT-LLM
- NVIDIA Dynamo

You should read this, and if you cannot now then you most definitely wanna bookmark it for later

Local AI FTW

Ahmad@TheAhmadOsman

x.com/i/article/2057…

Bella Casa retweeted

Google DeepMind@GoogleDeepMind·2d

We want to help scientists discover their next breakthrough with AI. Gemini for Science is our new suite of experimental tools to help them explore more hypotheses, validate work at scale, unpack literature with ease, and more 🧵
