Andrej Arpas

293 posts

Andrej Arpas

@ArpasAndrej

Quantitative Methods @Columbia. Member of British Mensa. ML, Bayesian inference @Meta.

เข้าร่วม Ağustos 2015

61 กำลังติดตาม38 ผู้ติดตาม

Andrej Arpas รีทวีตแล้ว

Fleetwood@fleetwood___·26 Mar

ZXX

126

954

60.1K

Andrej Arpas รีทวีตแล้ว

Andrej Karpathy@karpathy·7 Mar

I packaged up the "autoresearch" project into a new self-contained minimal repo if people would like to play over the weekend. It's basically nanochat LLM training core stripped down to a single-GPU, one file version of ~630 lines of code, then: - the human iterates on the prompt (.md) - the AI agent iterates on the training code (.py) The goal is to engineer your agents to make the fastest research progress indefinitely and without any of your own involvement. In the image, every dot is a complete LLM training run that lasts exactly 5 minutes. The agent works in an autonomous loop on a git feature branch and accumulates git commits to the training script as it finds better settings (of lower validation loss by the end) of the neural network architecture, the optimizer, all the hyperparameters, etc. You can imagine comparing the research progress of different prompts, different agents, etc. github.com/karpathy/autor… Part code, part sci-fi, and a pinch of psychosis :)

English

1.1K

3.6K

28.3K

11M

Andrej Arpas รีทวีตแล้ว

Karan@karankendre·2 Mar

Me realising I can run these models locally on my M1 MacBook Air for free

Qwen@Alibaba_Qwen

🚀 Introducing the Qwen 3.5 Small Model Series Qwen3.5-0.8B · Qwen3.5-2B · Qwen3.5-4B · Qwen3.5-9B ✨ More intelligence, less compute. These small models are built on the same Qwen3.5 foundation — native multimodal, improved architecture, scaled RL: • 0.8B / 2B → tiny, fast, great for edge device • 4B → a surprisingly strong multimodal base for lightweight agents • 9B → compact, but already closing the gap with much larger models And yes — we’re also releasing the Base models as well. We hope this better supports research, experimentation, and real-world industrial innovation. Hugging Face: huggingface.co/collections/Qw… ModelScope: modelscope.cn/collections/Qw…

English

341

7.3K

529.9K

Andrej Arpas รีทวีตแล้ว

NIK@ns123abc·23 Şub

🚨 META’s head of AI safety and alignment gets her emails nuked by OpenClaw >be director of AI Safety and Alignment at Meta >install OpenClaw >give it unrestricted access to personal emails >it starts nuking emails >“Do not do that” >*keeps going* >“Stop don’t do anything” >*gets all remaining old stuff and nukes it aswell* >“STOP OPENCLAW” >“I asked you to not do that” >“do you remember that?” >“Yes I remember. And I violated it.” >“You’re right to be upset” LMAOOOOOOOO

English

1.1K

2.5K

29.6K

2.9M

Andrej Arpas รีทวีตแล้ว

𝐑.𝐎.𝐊 👑@r0ktech·19 Şub

Two developers coding in the same branch.

English

105

1.1K

12.5K

738.4K

Andrej Arpas รีทวีตแล้ว

Andrej Karpathy@karpathy·16 Şub

I think it must be a very interesting time to be in programming languages and formal methods because LLMs change the whole constraints landscape of software completely. Hints of this can already be seen, e.g. in the rising momentum behind porting C to Rust or the growing interest in upgrading legacy code bases in COBOL or etc. In particular, LLMs are *especially* good at translation compared to de-novo generation because 1) the original code base acts as a kind of highly detailed prompt, and 2) as a reference to write concrete tests with respect to. That said, even Rust is nowhere near optimal for LLMs as a target language. What kind of language is optimal? What concessions (if any) are still carved out for humans? Incredibly interesting new questions and opportunities. It feels likely that we'll end up re-writing large fractions of all software ever written many times over.

Thomas Wolf@Thom_Wolf

Shifting structures in a software world dominated by AI. Some first-order reflections (TL;DR at the end): Reducing software supply chains, the return of software monoliths – When rewriting code and understanding large foreign codebases becomes cheap, the incentive to rely on deep dependency trees collapses. Writing from scratch ¹ or extracting the relevant parts from another library is far easier when you can simply ask a code agent to handle it, rather than spending countless nights diving into an unfamiliar codebase. The reasons to reduce dependencies are compelling: a smaller attack surface for supply chain threats, smaller packaged software, improved performance, and faster boot times. By leveraging the tireless stamina of LLMs, the dream of coding an entire app from bare-metal considerations all the way up is becoming realistic. End of the Lindy effect – The Lindy effect holds that things which have been around for a long time are there for good reason and will likely continue to persist. It's related to Chesterton's fence: before removing something, you should first understand why it exists, which means removal always carries a cost. But in a world where software can be developed from first principles and understood by a tireless agent, this logic weakens. Older codebases can be explored at will; long-standing software can be replaced with far less friction. A codebase can be fully rewritten in a new language. ² Legacy software can be carefully studied and updated in situations where humans would have given up long ago. The catch: unknown unknowns remain unknown. The true extent of AI's impact will hinge on whether complete coverage of testing, edge cases, and formal verification is achievable. In an AI-dominated world, formal verification isn't optional—it's essential. The case for strongly typed languages – Historically, programming language adoption has been driven largely by human psychology and social dynamics. A language's success depended on a mix of factors: individual considerations like being easy to learn and simple to write correctly; community effects like how active and welcoming a community was, which in turn shaped how fast its ecosystem would grow; and fundamental properties like provable correctness, formal verification, and striking the right balance between dynamic and static checks—between the freedom to write anything and the discipline of guarding against edge cases and attacks. As the human factor diminishes, these dynamics will shift. Less dependence on human psychology will favor strongly typed, formally verifiable and/or high performance languages.³ These are often harder for humans to learn, but they're far better suited to LLMs, which thrive on formal verification and reinforcement learning environments. Expect this to reshape which languages dominate. Economic restructuring of open source – For decades, open-source communities have been built around humans finding connection through writing, learning, and using code together. In a world where most code is written—and perhaps more importantly, read—by machines, these incentives will start to break down.⁴ Communities of AIs building libraries and codebases together will likely emerge as a replacement, but such communities will lack the fundamentally human motivations that have driven open source until now. If the future of open-source development becomes largely devoid of humans, alignment of AI models won't just matter—it will be decisive. The future of new languages – Will AI agents face the same tradeoffs we do when developing or adopting new programming languages? Expressiveness vs. simplicity, safety vs. control, performance vs. abstraction, compile time vs. runtime, explicitness vs. conciseness. It's unclear that they will. In the long term, the reasons to create a new programming language will likely diverge significantly from the human-driven motivations of the past. There may well be an optimal programming language for LLMs—and there's no reason to assume it will resemble the ones humans have converged on. TL; DR: - Monoliths return – cheap rewriting kills dependency trees; smaller attack surface, better performance, bare-metal becomes realistic - Lindy effect weakens – legacy code loses its moat, but unknown unknowns persist; formal verification becomes essential - Strongly typed languages rise – human psychology mattered for adoption; now formal verification and RL environments favor types over ergonomics - Open source restructures – human connection drove the community; AI-written/read code breaks those incentives; alignment becomes decisive - New languages diverge – AI may not share our tradeoffs; optimal LLM programming languages may look nothing like what humans converged on ¹ x.com/mntruell/statu… ² x.com/anthropicai/st… ³ wesmckinney.com/blog/agent-erg… ⁴ #issuecomment-3717222957" target="_blank" rel="nofollow noopener">github.com/tailwindlabs/t…

English

699

650

1.2M

Andrej Arpas รีทวีตแล้ว

Christian Szegedy@ChrSzegedy·15 Şub

Super cool! It was a bit chatty, but it focused on getting across the main idea, its motivation, and the thinking behind the results, failed attempts, and fixes. I'd love all ML/AI results presented in such an intuitive way.

Mikhail Parakhin@MParakhin

x.com/i/article/2022…

English

135

35.9K

Andrej Arpas รีทวีตแล้ว

Andrej Karpathy@karpathy·12 Şub

New art project. Train and inference GPT in 243 lines of pure, dependency-free Python. This is the *full* algorithmic content of what is needed. Everything else is just for efficiency. I cannot simplify this any further. gist.github.com/karpathy/8627f…

English

653

3.1K

25.1K

5.2M

Andrej Arpas รีทวีตแล้ว

tetsuo@tetsuoai·12 Şub

From the xAI all-hands (just released publicly, uncut): Guodong Zhang: "I can already feel the AGI, at least for coding." He told kernel and compiler engineers to seriously ask whether it's still worth it, or whether it's time to join the effort and automate yourself. "What a year to be alive." Elon Musk: by end of year, AI won't even write code. It'll generate optimized binaries directly. Bypassing compilers. Bypassing coding entirely. "Just say 'create optimized binary for this particular outcome.' That intermediate step will not be needed." Grok Code expected to be state-of-the-art in 2 to 3 months. This is an internal all-hands they published raw.

tetsuo@tetsuoai

xAI just released their full 45-minute Elon all-hands to the public. Uncut. No edits. Name another company that does this.

English

848

153.7K

Andrej Arpas รีทวีตแล้ว

Google Students@googlestudents·10 Şub

If you're a fan of optimization puzzles or coding, be sure to check out the MLSys 2026 Graph Scheduling Competition! 🧩 The challenge: You have a complex AI workload and a tiny high-speed memory scratchpad. Your goal is to design a schedule (just a JSON file) that moves data in and out efficiently to minimize latency. It’s like a high-stakes game of inventory management. Open to students, researchers, hobbyists and puzzle-solvers worldwide. Register now and start coding → goo.gle/4ktgyyb #MLSys2026 #CodingChallenge

English

1.2K

97.7K

Andrej Arpas@ArpasAndrej·22 Oca

The timeframe you seem to refer to is from the distinct, though related, Belgian-French occupation of the Ruhr Valley. The Rhineland occupation did not fully end until 1930. This picture is widely dated to 1929. I stand corrected on the troops’ origin, however; most colonial troops were allegedly withdrawn by 1925, so these should indeed be metropolitan French. (Not that one can tell for sure from the angle.)

English

113

gascobourguignon🇫🇷@jus2poulet·22 Oca

@ArpasAndrej @TheNameofWar Tu racontes n’importe quoi? C’est en 1923 pendant l’occupation de la Rhur. Quand on ne sait pas on se tait ?

Français

225

The Name of War@TheNameofWar·21 Oca

French soldiers at Koblenz, WW1

English

707

15.3K

353K

Andrej Arpas@ArpasAndrej·21 Oca

Love this. Weighted sum (`Final Score = Σ (weight_i × P(action_i))`) ignores variance, though. Could swap weighted scores for GP over action probs like $u(c) \sim \mathcal{GP}(\mu(\mathbf{p}), k(\mathbf{p}, \mathbf{p}'))$, sample posterior for exploration, using JAX: e.g. gp_scorer(action_probs, weights) → Upper Confidence Bound ranks. Reduces echo chambers. (Where $\mathbf{p}$ is vector of action probs for candidate $c$, $\mu$ linear mean (weighted sum baseline), $k$ RBF kernel for covariance. Posterior sampling: Draw u \sim \mathcal{N}(\bar{\mu}, \bar{\Sigma}) post-observations, rank by samples (e.g., Thompson) or UCB.)

English

Christopher Stanley@cstanley·20 Oca

Checkout the new recommendation algorithm built by @xAI 🚀 No Scala, only Rust and Python 🔥 github.com/xai-org/x-algo…

English

435

30.7K

Andrej Arpas รีทวีตแล้ว

gaut@0xgaut·20 Oca

engineers watching everybody else vibe code apps

English

256

1.1K

14K

794.3K

Andrej Arpas@ArpasAndrej·11 Oca

@Object_Zero_ This is a bunch of hooey. East Greenland is sparsely populated for a reason. Yes, the waters surrounding it are full of ice floes, nearly impenetrable, dense pack ice, but it also features katabatic winds, way less fauna, highly inhabitable.

English

226

Object Zero@Object_Zero_·10 Oca

Scoresby Sund, Greenland This is the San Francisco Bay of the Arctic Ocean. Scoresby is 500-600 meter water depth, it is East facing and provides near perfect shelter from storms. It is perfect geography for a large naval base from which to control not only the Greenland, Iceland, UK (GIUK) Gap, but also the entire Arctic Ocean and also the Northern Atlantic. The only comparable geography in the world is Stavanger Fjord in Norway. The only limitation for Scoresby and why it remains undeveloped is the ice and the need for icebreakers. But submarines do not care. Subs can transit under the ice and many submarine bases are constructed as artificial sea caves. Scoresby Sund represents an incredible piece of undeveloped infrastructure as a submarine base, that would be a crown jewel in a world where the Northeast Passage and Northwest Passage open up new shipping lanes than dramatically shorten the transport distance between all the major economies. Doing this at the necessary scale would require something like 30-40,000 people, maybe more. If you map out the development is it large and expensive and could dwarf the existing GDP and population of Greenland. If someone committed to build a naval base here, they would need to ensure they could never lose it. It would provide a major security boost for Europe and the Eastern seaboard of US against Russian and Chinese subs, who would have to cross the Indian Ocean and Cape of Good Hope, or the Pacific Ocean and Cape Horn before entering the Atlantic. Scoresby Sund is a game changer for naval dominance in both the North Atlantic and the Arctic.

English

151

1.2K

84.2K

Andrej Arpas รีทวีตแล้ว

Green Ginger | 🟢🫚📌@greenisginger·3 Oca

𝘣𝘢𝘤𝘬 𝘧𝘳𝘰𝘮 𝘵𝘩𝘦 𝘵𝘳𝘪𝘱. #pixelart #ドット絵

GIF

English

8.3K

161K

Andrej Arpas รีทวีตแล้ว

Yuchen Jin@Yuchenj_UW·29 Ara

DeepSeek was a side project at High-Flyer Quant. Qwen was a side project at Alibaba. Twitter was a side project at Odeo. Mac was a side project at Apple. Meanwhile: Windows Phone was a core project at Microsoft. Metaverse was a core project at Facebook. Google Glass was a core project at Google. Apple Intelligence is a core project at Apple. taste, passion, agency > roadmaps.

Yuchen Jin@Yuchenj_UW

Claude Code was a side project at Anthropic. ChatGPT was a side project at OpenAI. PyTorch was a side project at Meta. Gmail was a side project at Google. Side projects are the only place where taste, curiosity, and agency fully compound.

English

132

377

3.7K

332.5K

Andrej Arpas รีทวีตแล้ว

TheBlackWolf@thewolvenhour·25 Ara

Same Odyssey trailer.. but with Historically accurate costumes.. wasn’t so hard, was it? And it’s still badass! Super cool work by DF Fox! Kudos to the dude for the splendid AI adaptation!

TheBlackWolf@thewolvenhour

First trailer is out.. we have more footage from the Ancient Greek spine-helmet robocop.. Boy, this is going to suck HARD.

English

1.3K

436

Andrej Arpas รีทวีตแล้ว

Yun-Ta Tsai@yunta_tsai·12 Ara

One of the most common engineering mistakes is that people often believe picking the easiest path would get to the end goal faster. In reality, you would face the same hard problem at 80% of way there. Worse, you did not spend any valuable time to study. The problem would still be alien to the teams. Instead, it is often better to face the hardest problem head on so that you do not need to see it as a stranger again.

English

639

22.2K

Andrej Arpas@ArpasAndrej·10 Ara

@yunta_tsai Space servers - Landauer’s principle. Lower energy bound of ~3K on how much energy it takes to store a bit. Most optimal.

English

129

Yun-Ta Tsai@yunta_tsai·10 Ara

Solving AGI is solving the compression of photons.

English

442

29.6K

ค้นพบ

@TheNameofWar @xai @Object_Zero_ @elonmusk @BarackObama @taylorswift13 @cristiano @BillGates