Jonas

319 posts

Jonas

@LoosJonas

ai/ml, mechinterp, diffusion, llms, compilers, robotics, space

Joined December 2018

249 Following · 71 Followers
Jonas @LoosJonas ·
@hxiao Nice project! I think PCA may not be the best basis to judge this. Fitting the circle directly shows that some embeddings seem to encode it much better than PCA suggests. link ↓
Replies 1 · Reposts 0 · Likes 1 · Views 152
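Fitting the circle directly, as suggested, can be done with the classic Kåsa least-squares circle fit on 2D-projected embedding points. A minimal pure-Python sketch, assuming the points are (x, y) pairs; the function names are mine, not from the thread:

```python
import math

def solve3(A, b):
    """Solve a 3x3 linear system by Gaussian elimination with partial pivoting."""
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]
    for col in range(3):
        piv = max(range(col, 3), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(col + 1, 3):
            f = M[r][col] / M[col][col]
            for c in range(col, 4):
                M[r][c] -= f * M[col][c]
    x = [0.0] * 3
    for r in range(2, -1, -1):
        x[r] = (M[r][3] - sum(M[r][c] * x[c] for c in range(r + 1, 3))) / M[r][r]
    return x

def fit_circle(points):
    """Kasa algebraic circle fit: least-squares solution of
    x^2 + y^2 = a*x + b*y + c, giving center (a/2, b/2)."""
    # Build the 3x3 normal equations A^T A w = A^T z for rows [x, y, 1],
    # target z = x^2 + y^2.
    Sxx = Sxy = Syy = Sx = Sy = 0.0
    Sxz = Syz = Sz = 0.0
    n = len(points)
    for x, y in points:
        z = x * x + y * y
        Sxx += x * x; Sxy += x * y; Syy += y * y
        Sx += x; Sy += y
        Sxz += x * z; Syz += y * z; Sz += z
    A = [[Sxx, Sxy, Sx], [Sxy, Syy, Sy], [Sx, Sy, n]]
    a, b, c = solve3(A, [Sxz, Syz, Sz])
    cx, cy = a / 2, b / 2
    r = math.sqrt(c + cx * cx + cy * cy)
    return cx, cy, r
```

The RMS of the residuals |dist(point, center) − r| then gives a direct circularity score per embedding model, independent of what PCA happens to keep.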
Han Xiao @hxiao ·
Take a 3D Pikachu model, render 36 frames rotating 360°, encode each frame with different vision embedding models (incl. GeminiEmbedding 2), then project to 3D with PCA. If a model truly "understands" that these images are the same object rotating, the embeddings should form a circle/ring - smooth, continuous, preserving the angular order.
Replies 19 · Reposts 56 · Likes 504 · Views 43K
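A cheap complement to the PCA projection: if the 36 frames really trace a smooth closed ring in embedding space, consecutive embedding distances should be nearly uniform, including the wrap-around step from frame 36 back to frame 1. A toy check, assuming plain Euclidean distance on the raw embedding vectors; the metric and function name are illustrative, not from the project:

```python
import math

def ring_score(embeddings):
    """Crude ring test for a cyclic sequence of embedding vectors.
    Returns (mean_step, coefficient_of_variation) over consecutive
    Euclidean distances, wrapping the last frame back to the first.
    A low coefficient of variation suggests the frames trace a smooth
    closed loop; a high one suggests the angular structure is broken."""
    n = len(embeddings)
    steps = []
    for i in range(n):
        a, b = embeddings[i], embeddings[(i + 1) % n]
        steps.append(math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b))))
    mean = sum(steps) / n
    var = sum((s - mean) ** 2 for s in steps) / n
    return mean, math.sqrt(var) / mean
```

On 36 points sampled evenly from a true circle the coefficient of variation is essentially zero; jumbled or collapsed embeddings push it up.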
Jonas @LoosJonas ·
@fchollet Invention seems like just another task for the automation machine
Replies 0 · Reposts 0 · Likes 1 · Views 106
François Chollet @fchollet ·
If you build an automation machine, the way to monetize it is to sell it to as many people as possible -- anyone who has tasks to automate. But if what you build is an invention machine, then the best way to monetize it is to use it yourself.
Replies 100 · Reposts 107 · Likes 1.5K · Views 76.1K
Jonas @LoosJonas ·
@dogecahedron I could imagine more, but one would have to test. But maybe there are also cases where LLMs would profit from more redundancy/explicitness
Replies 0 · Reposts 0 · Likes 0 · Views 9
dogecahedron @dogecahedron ·
@LoosJonas Modern languages introduce some amount of redundancy in order to make it possible to localize typos with static analysis. Since LLMs are way less prone to errors, you could probably make better tradeoffs here. However, this would probably not save more than 10% of tokens
Replies 1 · Reposts 0 · Likes 0 · Views 18
Jonas @LoosJonas ·
we need a better programming language optimized for coding agents
Replies 1 · Reposts 0 · Likes 1 · Views 138
dogecahedron @dogecahedron ·
@LoosJonas Token efficiency, after accounting for a good tokenizer: sounds like we are looking for a language that assigns short encodings to useful programs 🤔
Replies 1 · Reposts 0 · Likes 0 · Views 19
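The "short encodings" idea is easy to eyeball with a crude tokenizer. The sketch below uses a regex word/punctuation split as a stand-in for a real BPE tokenizer (such as tiktoken, which would merge common subwords and change the absolute numbers); the two snippets are illustrative, not from the thread:

```python
import re

def rough_tokens(code: str) -> list[str]:
    """Very crude token count: words and individual punctuation marks.
    Only a proxy for a real BPE tokenizer."""
    return re.findall(r"\w+|[^\w\s]", code)

# The same two-argument addition, verbose vs. terse surface syntax:
verbose = "public static int add(int a, int b) { return a + b; }"
terse = "add a b = a + b"
```

Under this proxy the terse form uses well under half the tokens; a real comparison would also have to weigh how reliably models produce each style.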
Jonas @LoosJonas ·
@dogecahedron Token efficiency, adjustments for the autoregressive nature of LLMs, fewer long-range dependencies, ... But yes, the bias towards common languages is a disadvantage for new languages
Replies 2 · Reposts 0 · Likes 0 · Views 16
dogecahedron @dogecahedron ·
@LoosJonas LLMs seem somewhat syntax-agnostic? How could new syntax outperform common languages that have massive training sets?
Replies 1 · Reposts 0 · Likes 0 · Views 23
Joseph Viviano @josephdviviano ·
me: "can you use whatever resources you like, and python, to generate a short 'youtube poop' video and render it using ffmpeg? can you put more of a personal spin on it? it should express what it's like to be an LLM" claude opus 4.6:
Replies 550 · Reposts 1.2K · Likes 12.5K · Views 1.4M
Jonas @LoosJonas ·
@dogecahedron Better verification, security, syntax, agent-tooling, ... The good thing is that we don't have to guess; we could just evaluate how well coding agents can use it and optimize
Replies 2 · Reposts 0 · Likes 0 · Views 20
Jonas @LoosJonas ·
@JesseFarebro ChatGPT/Claude/Gemini/... make it really easy to find prior related work
Replies 0 · Reposts 0 · Likes 1 · Views 1.7K
Jesse Farebrother @JesseFarebro ·
It is infuriating how many ICML submissions could have been entirely prevented if authors just took 5 minutes to do a literature review. Ignoring ~10 years of established work on the exact idea you are proposing is just lazy.
Replies 4 · Reposts 12 · Likes 337 · Views 32.4K
sway @SwayStar123 ·
Now that I'm working in a frontier lab I can finally vague-post about research. Just tested out a random idea I had and it's significantly better than baseline
Replies 2 · Reposts 1 · Likes 24 · Views 2.2K
Jonas @LoosJonas ·
@Dorialexander @amiribtc Wouldn't emergence then just be the point at which the signal starts to be capturable?
Replies 0 · Reposts 0 · Likes 1 · Views 36
Alexander Doria @Dorialexander ·
@amiribtc Not completely, but rather the by-product of very noisy training data: the signal is there but weak, and you need a large net to catch it.
Replies 1 · Reposts 0 · Likes 5 · Views 169
Jonas @LoosJonas ·
@N8Programs I remember this working in the past, but recently had some cases where the model just refused to think more than 5s, no matter what I tried
Replies 0 · Reposts 0 · Likes 0 · Views 11
N8 Programs @N8Programs ·
@LoosJonas Set to extended thinking (or heavy if you're on Pro) and say 'please think as long as you can / think hard'. If you don't get a 'thinking' popup, retry.
Replies 1 · Reposts 0 · Likes 0 · Views 30
N8 Programs @N8Programs ·
One thing that's mildly infuriating with GPT-5.2 (when it doesn't think long enough) is how it responds to autoregressively trapping itself. It often says 'you got this wrong / this isn't fully correct', works out the math, realizes you are right, and just restates your conclusion without saying 'whoops, I got this wrong'. This ragebaits you (or at least ragebaits me) because it goes 'you are wrong [does work] and that's how you fix it [thing you originally did]'.
Replies 4 · Reposts 0 · Likes 33 · Views 2.6K
Jonas @LoosJonas ·
@VictorTaelin Can you recommend TS for programming language implementations?
Replies 0 · Reposts 0 · Likes 0 · Views 12
Taelin @VictorTaelin ·
Another day, another mindset. Dismissing Codex CLI and Claude Code, back to the "just put everything in context, build yourself from ground up, use AI just to fill gaps", and my love for Opus 4.6 is suddenly back. Fast mode is incredible! Just implemented a toy programming language in 30 minutes. All I had to do was write this 60-LOC prompt, and Opus gave me a fully working implementation in ~30 seconds, including parser, stringifier, main. Everything works. (Just don't ask why I need this.)
Replies 25 · Reposts 0 · Likes 221 · Views 15.6K
Jonas @LoosJonas ·
@SwayStar123 And still, they're all over the timeline
Replies 0 · Reposts 0 · Likes 2 · Views 16
sway @SwayStar123 ·
Criminal Twitter management:
1. No image, even though they have an amazing chart in the blog
2. Links in the tweet
3. No information about the product, even though they are a no-name company releasing something awesome

Taalas Inc. @taalas_inc
24 dedicated people. $30M spent on development. Extreme specialization, speed, and power efficiency. Today we launch Taalas’ first product. Check it out: Details: taalas.com/the-path-to-ub… Demo chatbot: chatjimmy.ai API: taalas.com/api-request-fo…

Replies 1 · Reposts 0 · Likes 3 · Views 911
Jonas @LoosJonas ·
@seconds_0 Interactive sub-1-second app/website generation; chat models doing preliminary deep research while you're still typing/speaking; ...
Replies 0 · Reposts 0 · Likes 2 · Views 48
0.005 Seconds (3/694) @seconds_0 ·
The most important part of the Taalas demo is the revelation that superhuman OOM token output is simply possible. There's now a direct path through time to 15k tok/sec GPT-5.3 Codex xhigh. The world that has that is very confusing indeed
Replies 32 · Reposts 23 · Likes 754 · Views 34K
Jonas @LoosJonas ·
@dravidan @seconds_0 Image tokens can be generated in parallel, so it's already possible; see e.g. Genie 3
Replies 0 · Reposts 0 · Likes 0 · Views 29
dravidan @dravidan ·
@seconds_0 At what tokens/sec are we in the realm of live video generation? TP math: ~2000 tokens per image, so that's like 7 FPS; is the math mathing?
Replies 1 · Reposts 0 · Likes 4 · Views 683
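The napkin math above checks out: at ~2,000 tokens per frame, the 15k tok/sec figure from upthread works out to 7.5 frames per second under purely sequential decoding. As a one-liner sanity check (the constants are the thread's estimates, not measurements):

```python
TOKENS_PER_FRAME = 2000  # dravidan's per-image estimate above

def frames_per_second(tokens_per_sec: float) -> float:
    """Video frame rate under purely sequential token decoding."""
    return tokens_per_sec / TOKENS_PER_FRAME

# frames_per_second(15_000) -> 7.5
```

Parallel decoding of image tokens, as Jonas notes below, would lift this ceiling entirely.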
David @DavidSHolz ·
this remains one of my favorite papers of all time
Replies 17 · Reposts 42 · Likes 609 · Views 22.4K
Jonas @LoosJonas ·
@karpathy We need a benchmark evaluating programming languages for coding agents. And then use it to create new ones
Replies 0 · Reposts 0 · Likes 1 · Views 47
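A benchmark like the one proposed above could score each candidate language by agent pass rate and token cost per solved task. A minimal sketch of the scoring side; the `Attempt` record and both metrics are hypothetical, not an existing benchmark:

```python
from dataclasses import dataclass

@dataclass
class Attempt:
    """One agent attempt at one benchmark task in a given language."""
    passed: bool      # did the generated program pass the task's tests?
    tokens_used: int  # total tokens the agent spent on the attempt

def score_language(attempts: list[Attempt]) -> tuple[float, float]:
    """Return (pass rate, mean tokens per successful attempt).
    A real benchmark would also track edit loops, tool calls,
    and verification cost."""
    passes = [a for a in attempts if a.passed]
    pass_rate = len(passes) / len(attempts)
    mean_tokens = (sum(a.tokens_used for a in passes) / len(passes)
                   if passes else float("inf"))
    return pass_rate, mean_tokens
```

Running the same task suite over several languages, then mutating the best-scoring language's syntax and re-running, would be one concrete way to "use it to create new ones".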
Andrej Karpathy @karpathy ·
I think it must be a very interesting time to be in programming languages and formal methods because LLMs change the whole constraints landscape of software completely. Hints of this can already be seen, e.g. in the rising momentum behind porting C to Rust or the growing interest in upgrading legacy code bases in COBOL or etc. In particular, LLMs are *especially* good at translation compared to de-novo generation because 1) the original code base acts as a kind of highly detailed prompt, and 2) as a reference to write concrete tests with respect to. That said, even Rust is nowhere near optimal for LLMs as a target language. What kind of language is optimal? What concessions (if any) are still carved out for humans? Incredibly interesting new questions and opportunities. It feels likely that we'll end up re-writing large fractions of all software ever written many times over.
Thomas Wolf @Thom_Wolf

Shifting structures in a software world dominated by AI. Some first-order reflections (TL;DR at the end):

Reducing software supply chains, the return of software monoliths – When rewriting code and understanding large foreign codebases becomes cheap, the incentive to rely on deep dependency trees collapses. Writing from scratch ¹ or extracting the relevant parts from another library is far easier when you can simply ask a code agent to handle it, rather than spending countless nights diving into an unfamiliar codebase. The reasons to reduce dependencies are compelling: a smaller attack surface for supply chain threats, smaller packaged software, improved performance, and faster boot times. By leveraging the tireless stamina of LLMs, the dream of coding an entire app from bare-metal considerations all the way up is becoming realistic.

End of the Lindy effect – The Lindy effect holds that things which have been around for a long time are there for good reason and will likely continue to persist. It's related to Chesterton's fence: before removing something, you should first understand why it exists, which means removal always carries a cost. But in a world where software can be developed from first principles and understood by a tireless agent, this logic weakens. Older codebases can be explored at will; long-standing software can be replaced with far less friction. A codebase can be fully rewritten in a new language. ² Legacy software can be carefully studied and updated in situations where humans would have given up long ago. The catch: unknown unknowns remain unknown. The true extent of AI's impact will hinge on whether complete coverage of testing, edge cases, and formal verification is achievable. In an AI-dominated world, formal verification isn't optional—it's essential.

The case for strongly typed languages – Historically, programming language adoption has been driven largely by human psychology and social dynamics. A language's success depended on a mix of factors: individual considerations like being easy to learn and simple to write correctly; community effects like how active and welcoming a community was, which in turn shaped how fast its ecosystem would grow; and fundamental properties like provable correctness, formal verification, and striking the right balance between dynamic and static checks—between the freedom to write anything and the discipline of guarding against edge cases and attacks. As the human factor diminishes, these dynamics will shift. Less dependence on human psychology will favor strongly typed, formally verifiable and/or high performance languages.³ These are often harder for humans to learn, but they're far better suited to LLMs, which thrive on formal verification and reinforcement learning environments. Expect this to reshape which languages dominate.

Economic restructuring of open source – For decades, open-source communities have been built around humans finding connection through writing, learning, and using code together. In a world where most code is written—and perhaps more importantly, read—by machines, these incentives will start to break down.⁴ Communities of AIs building libraries and codebases together will likely emerge as a replacement, but such communities will lack the fundamentally human motivations that have driven open source until now. If the future of open-source development becomes largely devoid of humans, alignment of AI models won't just matter—it will be decisive.

The future of new languages – Will AI agents face the same tradeoffs we do when developing or adopting new programming languages? Expressiveness vs. simplicity, safety vs. control, performance vs. abstraction, compile time vs. runtime, explicitness vs. conciseness. It's unclear that they will. In the long term, the reasons to create a new programming language will likely diverge significantly from the human-driven motivations of the past. There may well be an optimal programming language for LLMs—and there's no reason to assume it will resemble the ones humans have converged on.

TL;DR:
- Monoliths return – cheap rewriting kills dependency trees; smaller attack surface, better performance, bare-metal becomes realistic
- Lindy effect weakens – legacy code loses its moat, but unknown unknowns persist; formal verification becomes essential
- Strongly typed languages rise – human psychology mattered for adoption; now formal verification and RL environments favor types over ergonomics
- Open source restructures – human connection drove the community; AI-written/read code breaks those incentives; alignment becomes decisive
- New languages diverge – AI may not share our tradeoffs; optimal LLM programming languages may look nothing like what humans converged on

¹ x.com/mntruell/statu…
² x.com/anthropicai/st…
³ wesmckinney.com/blog/agent-erg…
⁴ github.com/tailwindlabs/t…

Replies 701 · Reposts 655 · Likes 8.1K · Views 1.2M