Alexander Doria

46.1K posts

@Dorialexander

building open ai infrastructure @pleiasfr

Joined April 2011

4K Following · 23.2K Followers

Pinned Tweet

Alexander Doria@Dorialexander·11h

After months of delay, the successor post to "The model is the product": the AI decoupling. vintagedata.org/blog/posts/the…

Alexander Doria@Dorialexander·6h

@BetterCallMedhi I admit it was after seeing your post, but I couldn't help myself this time…

Mehdi (e/λ)@BetterCallMedhi·6h

@Dorialexander lol I see you and I both decided to take this one on 😂😂

Alexander Doria@Dorialexander·6h

Hello BFM. Do you have no one to read the model reports? The answer is on p. 9.

BFM Business@bfmbusiness

AI: Deepseek cuts its prices by 75%. "It's a dumping strategy: they distribute their models massively, almost for free, and on the side they make sure everyone goes to Alibaba because the token price is very low" 💬@mikiane 🎙️@simottel

Alexander Doria@Dorialexander·7h

Great overview of world model research. Particularly liked this part, about the lead example in the open, Cosmos: "what makes it a world model is the training data".

Julia Turc@juliarturc

"World models" is one of the buzziest yet ambiguous terms in AI right now. I started this video with many questions:
- How are they different from video generation?
- Can they do more than AI slop?
- Can LeCun be trusted given that he wears knee-high white socks?
Many thanks to @tjgalda and @NVIDIAAI for helping me answer (most) of these questions!

Alexander Doria@Dorialexander·10h

i guess that’s one way to outcompete anthropic.

Alexander Doria@Dorialexander

well i'm literally running jobs on this right now.

Alexander Doria@Dorialexander·10h

@ObscureLocal I do have a slight interest but I believe the critical thing will be the availability of synthetic pipelines + post-training methods (especially for smaller MoE with super cheap inference). Compute isn’t the main blocker.

Obscure Local Historian@ObscureLocal·10h

Very interesting. I think the economics are also starting to make sense for U.S. companies to take the China path. I myself am contemplating whether now is the time to begin doing this as an individual, which is maybe telling of some other future event. I was reading a few days ago that it might cost very little right now, less than a hundred dollars, to train something like a 9B model on Colab. I haven't verified that yet, but if it were the case, Training As A Service is not far away. I'd very much like to organize my training data, upload it, pick my arch and hyperparameters in a dashboard, and put in coins. In the long run, the shape of things may even come to resemble that. Like software was created to automate simple verifiable tasks, it will not necessarily be efficient to do everything with a 100 trillion parameter super-AI. It may be better to use that AI to build an edge model that can solve your repeatable process problem well. So the shape of the AI economy comes at some point to somewhat resemble the former shape of the software economy, perhaps. Another thought is what happens when diffuser economics do become well optimized. It was maybe last year or earlier this year that there was a lot of speculation about generative user interfaces, and I see people making purely generative web servers and such things. Maybe then "the model IS the product". 😅

Alexander Doria@Dorialexander·11h

@advait_jayant Yes. To some extent they had surprisingly dated priors (also on scaling: highly sparse MoEs were not what they had in mind)

Advait@advait_jayant·11h

@Dorialexander funnily enough ai-2027 got itself backwards on china. it had the ccp nationalizing everything into a single megalab next to a nuclear plant. the opposite happened. deepseek, qwen, kimi, minimax, and as you pointed out even meituan and xiaomi are all shipping their own models!

Alexander Doria@Dorialexander·11h

@52dsl Ah yes, good catch, I'll fix it.

52dsl@52dsl·11h

@Dorialexander Very interesting. Thanks! (though a piece of this sentence seems to be missing: "By 2023, OpenAI and Anthropic had an early lead in model development but nothing that would prevent...")

Alexander Doria@Dorialexander·12h

@ed_brz9 @jackson_stokes Thought briefly about that but disagree now. Memory is actually needed for many agentic processes (basically model needs to understand what is being asked, and how to look for things)

Ed Brz9@ed_brz9·12h

@Dorialexander @jackson_stokes For now I just do synth data and fine tune so I’m not at all qualified to talk about model architecture, but do LLM need to know much to be useful? I feel like more weight dedicated to attention could benefit agentic capabilities. Near zero knowledge models that can use tools

Jackson Stokes@jackson_stokes·2d

There seems to be a ~3B lower limit for useful LLMs. below that, instruction following and ICL drop off a cliff? Is there some fundamental reason for this?

Alexander Doria@Dorialexander·19h

and here is anthropic soft power: blessed be circuit transformers (and the data that feed it).

Alexander Doria@Dorialexander·19h

great seeing the pope supporting tokenizer research.

Alexander Doria@Dorialexander·1d

@Noahpinion Just creating conditions for proper open-ended research (the kind OpenAI has just started to emulate) and selecting for people willing to do that.

Noah Smith 🐇🇺🇸🇺🇦🇹🇼@Noahpinion·1d

I've wondered about this. If demand for intelligence were so infinite, why do we have so many of our smartest people sitting around doing esoteric fun stuff for smallish salaries in academia?

Chris Anderson@chr1sa

In all of human history, has there ever been a commodity with infinite demand, as there appears to be for intelligence? I can't think of one. Even compute, energy or just silicon/sand are just downstream of intelligence, which is the main demand driver. In economics, rather than modeling the usual price/demand curve to reach an equilibrium, perhaps you'd have to model price/*rate of demand growth* (ie, the derivative of demand, or some other indicator of velocity) Interestingly, ChatGPT (below) prefers the framework of "recursive expansion of demand" as increasing intelligence opens new applications/markets. But the end result is the same -- the demand curve keeps moving to the right, maybe forever. Which I think is unprecedented.

Alexander Doria@Dorialexander·1d

@JulienBlanchon @bastiengares (truly corporate documents in some domains are a real pain to locate in a web crawl, and the volumes just aren't there: I understand the logic of buying company data)

Alexander Doria@Dorialexander·1d

@JulienBlanchon @bastiengares That's roughly what I've been hearing from the Anthropic side (and from some RL providers): a migration from constrained environments with a fair amount of domain knowledge toward more open data workflows, basically less structured.

Bastien Gares@bastiengares·1d

Why is Anthropic so bad at math and research topics compared to OpenAI and Google

Dr Singularity@Dr_Singularity

this is pretty amazing news. Google DeepMind's AI agent autonomously solved 9 of 353 open Erdős problems. Cost - only a few hundred dollars per problem. This is a major signal that AI is moving from helpful math assistant to actual autonomous research engine. LLM agents connected to Lean are now solving real open mathematical problems, not just textbook exercises. The strongest system resolved 9 open Erdős problems and proved 44 OEIS conjectures, with some breakthroughs costing only a few hundred dollars each. Mathematics is one of the hardest domains because you cannot bluff your way through it. A proof either checks or it does not. And now AI systems are starting to generate proofs that survive formal verification. 1000x acceleration of progress is coming.

Alexander Doria@Dorialexander·1d

@JulienBlanchon @bastiengares I think they mostly buy the seed data, but after that it's multi-purpose…

Julien Blanchon 🇺🇦@JulienBlanchon·1d

@Dorialexander @bastiengares I'd say it's more about buying RL envs (in code/law/corporate, indeed)

Alexander Doria@Dorialexander·1d

@neev_parikh normative

Neev Parikh@neev_parikh·1d

@Dorialexander Is this a normative or descriptive claim?

Alexander Doria@Dorialexander·1d

Exactly why the frontier should be distributed.

will depue@willdepue

academics are unprepared for the coming world where much scientific progress is majorly a function of inference compute. whether OpenAI points the Eye of Stargate at your particular field will decide its acceleration. talent will leach away into the labs. it's already begun

Alexander Doria@Dorialexander·1d

@synquid Yes.