Chris Roth

3.5K posts

@rothific

AI @ Mozilla. Focused on on-device AI, local-first data, agent architecture, and AI safety.

United States · Joined February 2010
161 Following · 1.3K Followers
Chris Roth @rothific
My take on why all the AI browsers flopped: browsers are designed for people. Agents don't do well navigating clunky UIs. They thrive in their native habitat, the CLI. I think we'll see "AI clients" take off instead: Chat <> Markdown <> CLI Sandboxes <> ACP.
0 replies · 0 reposts · 0 likes · 36 views
Chris Roth @rothific
@thdxr I just stare at it, hopelessly waiting
0 replies · 0 reposts · 1 like · 15 views
dax @thdxr
so what do you do when your agent is working and don't say start another agent because i know you're lying
592 replies · 17 reposts · 1.2K likes · 97.5K views
Chris Roth reposted
Jessica Cheng @mukajitu
Building Gridland: Terminal UI that also works on web 😆 ShadCN style UI framework! Open-sourced. gridland.io
0 replies · 1 repost · 1 like · 66 views
Chris Roth @rothific
I'm building a custom harness that runs Gemma 4 on-device via MLX in the same process as the harness itself. This opens the door to interesting things like directly accessing activation layers, real-time interp, etc. Check it out! github.com/cjroth/mlx-har…
1 reply · 1 repost · 1 like · 79 views
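The in-process design described above can be sketched generically. This toy model is hypothetical (not the actual MLX/Gemma harness): because the model runs in the same process as the harness, the forward pass can hand its activations straight to the caller with no serialization or IPC:

```python
import math

class TinyModel:
    """Toy 2-in / 3-hidden / 1-out network standing in for the real model."""
    def __init__(self):
        self.w1 = [[0.1] * 3 for _ in range(2)]  # input -> hidden weights
        self.w2 = [0.5, 0.5, 0.5]                # hidden -> output weights

    def forward(self, x, record=None):
        # hidden activations
        h = [math.tanh(sum(xi * self.w1[i][j] for i, xi in enumerate(x)))
             for j in range(3)]
        if record is not None:
            record["hidden"] = h  # direct, zero-copy access for the harness
        return sum(h[j] * self.w2[j] for j in range(3))

model = TinyModel()
acts = {}
out = model.forward([1.0, 2.0], record=acts)
```

With a real in-process model, the same `record` hook idea would give the harness live activations for interpretability work without a second process in the loop.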
Chris Roth @rothific
I feel like we're dangerously close to the entire internet breaking due to AI's ability to find security vulnerabilities faster than we can fix them. Most software is not ready for this.
0 replies · 0 reposts · 2 likes · 51 views
Kyle Wild @dorkitude
as a longtime conspiracy theorist, this timeline is really the best. i’m happy to be here (even though i was nonconsensually transplanted into it from the Looney Tunes timeline)
1 reply · 0 reposts · 1 like · 113 views
Chris Roth @rothific
@eigenron As a bird, I find my flight manual fairly useful actually
0 replies · 0 reposts · 1 like · 71 views
eigenron @eigenron
mech interp is as useful for understanding LLMs as a flight manual is to birds
AVB @neural_avb

@ylecun @nxthompson To be fair, the original Anthropic post was just some cool mech interp experiments. This particular tweet sensationalized it a bit much haha. PS: wonder what your thoughts are on the field of mech interp?

15 replies · 4 reposts · 106 likes · 17.2K views
Jessica Cheng @mukajitu
Been obsessed with a question lately: why don't terminal UIs get the same component ecosystem that web UIs have? shadcn proved that copy-paste components beat npm install for UI. What if that same model worked for the terminal?
1 reply · 0 reposts · 1 like · 30 views
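One way the copy-paste model could translate to terminals: a "component" is just a self-contained function you paste into your project and then own. A minimal sketch (hypothetical, not Gridland code):

```python
def box(text: str, pad: int = 1) -> str:
    """Copy-paste terminal component: draw a box around multi-line text."""
    lines = text.splitlines() or [""]
    inner = max(len(line) for line in lines) + 2 * pad
    top = "┌" + "─" * inner + "┐"
    bottom = "└" + "─" * inner + "┘"
    body = ["│" + " " * pad + line.ljust(inner - 2 * pad) + " " * pad + "│"
            for line in lines]
    return "\n".join([top, *body, bottom])

print(box("hello\nworld"))
```

The shadcn trade-off carries over directly: no dependency to update (or to be attacked through), at the cost of maintaining the pasted code yourself.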
Andrew Farah @andrewfarah
sharing my first open source project: a CLI for downloading and syncing your X bookmarks locally so your agent can access them. it's free
› npm install -g fieldtheory
› login to your X account in a chrome tab
› ft sync (done!)
bonus:
› ft viz
› ft classify
290 replies · 280 reposts · 4.4K likes · 533.2K views
Chris Roth reposted
David Klindt @klindt_david
So excited to finally share this! Linear probes often outperform SAEs, especially out-of-distribution (OOD). @thesubhashk @JoshAEngels et al showed this convincingly (arxiv.org/abs/2502.16681). This prompted @NeelNanda5 and others to de-emphasize SAE research. Empirically, fair enough. But we think the theoretical case for dictionary learning was dismissed too quickly.

@oneill_c previously showed SAEs can't do proper sparse coding (arxiv.org/abs/2411.13117). @shruti_joshi @vpacela and @isacama_phys took this further and showed how this leads to problems particularly in OOD settings. So the issue may not be with dictionary learning itself, but with the current tools.

Here's the core argument: if neural representations are in superposition, i.e. more features than dimensions encoded linearly (arxiv.org/abs/2503.01824), then linear probes fundamentally cannot be the answer. This is a compressed sensing problem: there's a linear measurement (the representation) and a nonlinear inference procedure (like an SAE encoder) that recovers the higher-dimensional sparse signal. Linear algebra tells us error-free recovery is impossible if decoding is restricted to be linear. (But see this cool work if errors are acceptable: arxiv.org/abs/2602.11246)

Check out our video: we have some neat demonstrations. A linear decision boundary in 3D becomes nonlinear in 2D, even though all sparse combinations of latents remain distinguishable. Compressed sensing works: we can, in principle, recover the high-dimensional latent space where linear probes work and generalize OOD.

Where does this leave us? With finite data and millions of concepts, simpler methods may perform better for a while. But if we want interpretability and safety methods that work OOD, especially compositional generalization covering all possible jailbreaks and real-world failures, we'll have to build bottom up from the right theory.
@kennylpeng @thebasepoint @tegmark @yash_j_sharma @woog09 @livgorton @EkdeepL @thomas_fel_ @nsaphra
Shruti Joshi @_shruti_joshi_

SAEs fail at OOD tasks. Why? Features in superposition are linearly representable but not linearly accessible. Instead of discarding sparse coding, we embrace the geometry of superposition and use methods equipped to handle the nonlinearity it induces.

4 replies · 39 reposts · 263 likes · 27.3K views
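The superposition argument above can be made concrete in a toy example: pack three unit-norm "features" into only two dimensions, and each 1-sparse feature is still identifiable, but only via a nonlinear readout (the argmax). A minimal sketch of the idea, not the paper's method:

```python
import math

# Three "features" stored in only 2 dimensions (superposition):
# the columns of the dictionary A, each unit-norm.
A = [(1.0, 0.0),
     (0.0, 1.0),
     (1 / math.sqrt(2), 1 / math.sqrt(2))]

def measure(i):
    """Linear measurement: the 2-D representation of 1-sparse feature i."""
    return A[i]

def recover(y):
    """Nonlinear inference (one step of matching pursuit): pick the
    dictionary column most correlated with y. The argmax is what makes
    this nonlinear; it is the step a purely linear probe cannot take."""
    scores = [a[0] * y[0] + a[1] * y[1] for a in A]
    return scores.index(max(scores))
```

All three features round-trip correctly even though they outnumber the dimensions, illustrating why recovery needs a nonlinear decoder on top of the linear measurement.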
Matt Shumer @mattshumer_
Sitting next to a woman on a plane using ChatGPT on Auto mode. I need someone to physically restrain me from telling her to turn on Thinking mode at the very least.
162 replies · 21 reposts · 2K likes · 276.5K views
Aiden Bai @aidenybai
Introducing Expect: let agents test your code in a real browser
1. Run Claude Code / Codex to QA your app
2. Watch a video of every bug found
3. Fix and repeat until passing
Run as a CLI or agent skill. Fully open source
229 replies · 312 reposts · 4.6K likes · 836.2K views
Michael Lucas Poage 🐝 @RubyBrewsday
About an hour into using gstack at work, so far I will say, I’m still employed so that’s something
1 reply · 0 reposts · 3 likes · 38 views
Chris Roth @rothific
@wagslane To be fair, nobody said uptime was solved
0 replies · 0 reposts · 0 likes · 227 views
Chris Roth @rothific
@oliviscusAI Hey since you're using bun, this UI could probably run in the browser for people to try without needing to install it on their machine. Happy to help make it work if interested
0 replies · 0 reposts · 0 likes · 969 views
Oliver Prompts @oliviscusAI
Claude Code for finance is here 🤯 It's called Dexter. It can find undervalued stocks, analyze them in detail, and build investment theses. 100% Open Source.
55 replies · 222 reposts · 1.7K likes · 285.7K views
Chris Roth @rothific
@karpathy It's time to start vendoring packages' code directly into our repos, shadcn style. In the past, we needed to quickly update patch versions for security updates. Now patch versions are the attack vector.
0 replies · 0 reposts · 1 like · 269 views
Andrej Karpathy @karpathy
Software horror: litellm PyPI supply chain attack. Simple `pip install litellm` was enough to exfiltrate SSH keys, AWS/GCP/Azure creds, Kubernetes configs, git credentials, env vars (all your API keys), shell history, crypto wallets, SSL private keys, CI/CD secrets, database passwords.

LiteLLM itself has 97 million downloads per month, which is already terrible, but much worse, the contagion spreads to any project that depends on litellm. For example, if you did `pip install dspy` (which depended on litellm>=1.64.0), you'd also be pwnd. Same for any other large project that depended on litellm.

Afaict the poisoned version was up for less than ~1 hour. The attack had a bug which led to its discovery: Callum McMahon was using an MCP plugin inside Cursor that pulled in litellm as a transitive dependency. When litellm 1.82.8 installed, their machine ran out of RAM and crashed. So if the attacker didn't vibe code this attack it could have been undetected for many days or weeks.

Supply chain attacks like this are basically the scariest thing imaginable in modern software. Every time you install any dependency you could be pulling in a poisoned package anywhere deep inside its entire dependency tree. This is especially risky with large projects that might have lots and lots of dependencies. The credentials that do get stolen in each attack can then be used to take over more accounts and compromise more packages.

Classical software engineering would have you believe that dependencies are good (we're building pyramids from bricks), but imo this has to be re-evaluated, and it's why I've been increasingly averse to them, preferring to use LLMs to "yoink" functionality when it's simple enough and possible.
Daniel Hnyk @hnykda

LiteLLM HAS BEEN COMPROMISED, DO NOT UPDATE. We just discovered that LiteLLM PyPI release 1.82.8 has been compromised: it contains litellm_init.pth with base64-encoded instructions to send all the credentials it can find to a remote server + self-replicate. link below

1.4K replies · 5.4K reposts · 28.1K likes · 66.5M views
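The shadcn-style vendoring Chris proposes in his reply above could be sketched like this: copy an installed package's source into the repo, so updates become explicit, reviewable diffs rather than automatic installs. The function name `vendor` and the `vendor/` directory are illustrative assumptions, not an existing tool:

```python
import importlib.util
import pathlib
import shutil

def vendor(package: str, dest: str = "vendor") -> pathlib.Path:
    """Copy an installed package's source tree into the repo (shadcn-style)."""
    spec = importlib.util.find_spec(package)
    if spec is None or spec.origin is None:
        raise ModuleNotFoundError(package)
    origin = pathlib.Path(spec.origin)
    # A package has submodule search locations; a single-file module does not.
    src = origin.parent if spec.submodule_search_locations else origin
    target = pathlib.Path(dest) / src.name
    if src.is_dir():
        shutil.copytree(src, target, dirs_exist_ok=True)
    else:
        target.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(src, target)
    return target
```

After vendoring, the dependency would be imported from the repo's own tree and reviewed like first-party code, trading automatic patch updates for immunity to poisoned releases. (Compiled extensions and license obligations would need separate handling.)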