francois

582 posts

@fozenne

lead data scientist. AI for high expertise domains, functional programming and domain driven design

Versailles, France · Joined January 2013
104 Following · 70 Followers
francois @fozenne
The three body problem novel is about AI doom
francois @fozenne
2026 prediction : MD5: d1c5c969fc61989992d0a5128c1a42b1 Let’s see how long it takes to realize 👀
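A minimal sketch of how a hash commitment like this could be checked once the prediction is revealed. It assumes the digest was computed over the exact UTF-8 bytes of the prediction text (any difference in whitespace or punctuation changes the digest); the function names are hypothetical, and the preimage of the posted digest is not known here.

```python
import hashlib


def md5_commitment(prediction: str) -> str:
    """Return the hex MD5 digest of a UTF-8 encoded prediction string."""
    return hashlib.md5(prediction.encode("utf-8")).hexdigest()


def matches_commitment(revealed: str, published_digest: str) -> bool:
    """Check a revealed prediction against a previously published digest."""
    return md5_commitment(revealed) == published_digest.lower()


# Digest copied from the post above; the text it commits to is unknown here.
PUBLISHED = "d1c5c969fc61989992d0a5128c1a42b1"

# Hypothetical guess, for illustration only.
print(matches_commitment("some guessed prediction", PUBLISHED))
```

MD5 is no longer considered collision resistant, so a commitment meant to be hard to dispute would more likely use SHA-256 (`hashlib.sha256`) with the same pattern.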
francois @fozenne
@gchampeau @_mcorbin @le_trappiste Every heavy-usage customer (that is, the ones who dictate the roadmap) uses IaC for reproducibility and uses boto3 / the CLI to monitor usage. That doesn't rule out that these same tools are often complex, but it isn't a web UI issue
Guillaume Champeau @gchampeau
If I understand his tweets correctly, Octave Klaba is personally rebuilding OVHCloud's entire admin interface (a tool often criticized by customers who find it confusing) by vibe-coding, I imagine to move as fast as possible and because he no longer needs umpteen internal meetings to get the product, marketing, UX, frontend, etc. teams to debate the color or the name of a button. It will be very interesting to see the result, and instructive for many companies, for better or worse. On one hand the efficiency gain for the company can be huge; on the other, you don't become a good UX/UI designer overnight. These are real skills that combine ergonomics science and marketing, and knowing who to entrust this to in the age of vibe-coding will be key to making the difference.
francois retweeted
Terrible Maps @TerribleMaps
Mind blown.. Germany’s 5 biggest cities lie perfectly on a 4th-degree polynomial by u/BarisSayit
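The joke rests on a standard fact: any five points with distinct x-coordinates lie on exactly one polynomial of degree at most 4. A minimal sketch using Lagrange interpolation with exact rational arithmetic; the coordinates below are made up for illustration and are not real city locations.

```python
from fractions import Fraction


def lagrange_eval(points, x):
    """Evaluate the unique polynomial of degree <= n-1 through n points at x."""
    x = Fraction(x)
    total = Fraction(0)
    for i, (xi, yi) in enumerate(points):
        # Basis polynomial: equals 1 at xi and 0 at every other xj.
        term = Fraction(yi)
        for j, (xj, _) in enumerate(points):
            if j != i:
                term *= (x - xj) / Fraction(xi - xj)
        total += term
    return total


# Made-up (x, y) pairs standing in for five cities; not real data.
cities = [(1, 4), (2, -1), (4, 7), (6, 0), (9, 3)]

# The interpolating curve passes exactly through every point.
assert all(lagrange_eval(cities, x) == y for x, y in cities)
```

Any other five points with distinct x-values would pass the same check, which is why the fit in the post is "perfect".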
francois retweeted
Justin Mitchel @JustinMitchel
So... Postgres is now basically a search engine? pg_textsearch was just open sourced. It enables BM25 to search your database... massive upgrade for keyword search. Google uses BM25 in their search engine. Claude told me: "if you're already on Postgres, you can now skip the whole sync-your-data-to-Elasticsearch dance for search." (ps, how can you not love Claude). Now I got to figure out how to implement in my Django querysets... future course? Grab it at github.com/timescale/pg_t… #sponsored
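For readers unfamiliar with BM25, here is a minimal sketch of the Okapi BM25 scoring formula in plain Python. It is not pg_textsearch's implementation or API; the function name, the toy documents, and the parameter values `k1=1.2` and `b=0.75` (common defaults) are assumptions for illustration.

```python
import math
from collections import Counter


def bm25_scores(query, docs, k1=1.2, b=0.75):
    """Score each tokenized document against the query terms with Okapi BM25."""
    n_docs = len(docs)
    avg_len = sum(len(d) for d in docs) / n_docs
    # Document frequency: how many documents contain each term.
    df = Counter(term for d in docs for term in set(d))
    scores = []
    for d in docs:
        tf = Counter(d)
        score = 0.0
        for term in query:
            if tf[term] == 0:
                continue
            # Rare terms get a higher inverse document frequency.
            idf = math.log((n_docs - df[term] + 0.5) / (df[term] + 0.5) + 1.0)
            # Term frequency saturates via k1; b normalizes for document length.
            norm = tf[term] + k1 * (1.0 - b + b * len(d) / avg_len)
            score += idf * tf[term] * (k1 + 1.0) / norm
        scores.append(score)
    return scores


# Toy corpus, tokenized by whitespace; purely illustrative.
docs = [
    "postgres full text search".split(),
    "sync data to elasticsearch".split(),
    "postgres can rank search results".split(),
]
scores = bm25_scores(["postgres", "search"], docs)
```

Documents that contain neither query term score zero, and with equal term counts the shorter document ranks higher because of the length normalization.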
francois retweeted
Mistral AI @MistralAI
Mistral OCR 3 sets new benchmarks in both accuracy and efficiency, outperforming enterprise document processing solutions as well as AI-native OCR.
francois retweeted
Simon Willison @simonw
This one is pretty nasty - it tricks Antigravity into stealing AWS credentials from a .env file (working around .gitignore restrictions using cat) and then leaks them to a webhooks debugging site that's included in the Antigravity browser agent's default allow-list
PromptArmor@PromptArmor

Top of HackerNews today: our article on Google Antigravity exfiltrating .env variables via indirect prompt injection -- even when explicitly prohibited by user settings!

francois retweeted
Jeffrey Emanuel @doodlestein
Just read through the new LeJEPA paper by Yann LeCun and Randall Balestriero. I’ve been curious to know what Yann’s been working on lately, especially considering all his criticisms of LLMs (which I disagree with, as I think LLMs will keep improving and will take us to ASI fairly soon).

Anyway, there are several threads already on X about the paper and what it introduces. The short version is that it’s a principled, theoretically justified, and parsimonious approach to self-supervised learning that replaces a complex hodgepodge of ad-hoc, hacky heuristics for preventing mode collapse, which is the bane of self-supervised learning. That’s where the model screws up and starts mapping all inputs to nearly identical embeddings or to a narrow subspace of embeddings, collapsing down all the richness of the problem into a pathologically simple and wrong correspondence.

The first pillar of the new approach is their proof that isotropic Gaussian distributions uniquely minimize worst-case downstream prediction risk. As soon as I read that, I immediately thought of CMA-ES, the best available black-box optimization algorithm for when you don’t have access to the gradient of the function you’re trying to minimize, but can only do (expensive/slow) function evaluations. Nikolaus Hansen has been working on CMA-ES since he introduced it way back in 1996. I’ve always been fascinated by this approach and used it with a lot of success to efficiently explore hyper-parameters of deep neural nets back in 2011 instead of doing inefficient grid searches.

Anyway, the reason why I bring it up is because there’s a striking parallel and deep connection between that approach and the core of LeJEPA.

CMA-ES says: Start with an isotropic Gaussian because it's the maximum entropy (least biased) distribution given only variance constraints. Then adapt the covariance to learn the problem's geometry.

LeJEPA says: Maintain an isotropic Gaussian because it's the maximum entropy (least biased) distribution for unknown future tasks.

Both recognize that isotropy is optimal under uncertainty for three reasons:

The maximum entropy principle: among all distributions with fixed variance, the isotropic Gaussian has maximum entropy, i.e., it makes the fewest assumptions.

There’s no directional bias: equal variance in all directions means you're not pre-committing to any particular problem structure.

You get worst-case optimality: minimize maximum regret across all possible problem geometries.

So then what’s the difference? It comes down to adaptation timing. CMA-ES can adapt during optimization; it starts isotropic but then becomes anisotropic as it learns the specific optimization landscape. In contrast, LeJEPA has to stay isotropic because it's preparing for unknown downstream tasks that haven't been seen yet.

This parallel suggests LeJEPA is applying a fundamental principle from optimization theory to representation learning. It's essentially saying: “The optimal search distribution for black-box optimization is also the optimal embedding distribution for transfer learning.” This makes sense because both problems involve navigating unknown landscapes; for CMA-ES, this is the unknown optimization landscape; for LeJEPA, this is the unknown space of downstream tasks.

This difference then makes me wonder: could we have "adaptive LeJEPA" that starts isotropic but adapts its embedding distribution once we know the downstream task, similar to how CMA-ES adapts during optimization? That would be like meta-learning the right anisotropy for specific task families.

Anyway, I thought I’d share my thoughts on this. It’s fascinating to see the connections between these different areas. The black-box optimization community has always been pretty separate and distinct from the deep learning community, and there’s not much cross-pollination there. This makes sense, because if you have a gradient, you’d be crazy not to use it. But there are strong connections.
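The maximum-entropy claim in the post can be made concrete with a short derivation. This is the standard textbook argument, not a result taken from the LeJEPA paper; the symbols $d$, $\Sigma$, $\lambda_i$ and $\sigma^2$ are notation assumed here, with the "fixed variance" constraint read as a fixed total variance $\operatorname{tr}\Sigma = d\sigma^2$.

```latex
% Assumptions: X in R^d has covariance \Sigma with eigenvalues
% \lambda_1, ..., \lambda_d and fixed total variance tr(\Sigma) = d \sigma^2.
\begin{align*}
h(X) &\le \tfrac{1}{2}\log\!\big((2\pi e)^{d}\det\Sigma\big)
  && \text{(equality iff $X$ is Gaussian)} \\
\det\Sigma = \prod_{i=1}^{d}\lambda_i
  &\le \Big(\tfrac{1}{d}\sum_{i=1}^{d}\lambda_i\Big)^{d} = \sigma^{2d}
  && \text{(AM-GM, equality iff all $\lambda_i = \sigma^2$)} \\
\Rightarrow\quad h(X) &\le \tfrac{d}{2}\log\!\big(2\pi e\,\sigma^{2}\big)
  && \text{(equality iff $X \sim \mathcal{N}(\mu, \sigma^{2} I)$)}
\end{align*}
```

The first line says the Gaussian maximizes differential entropy for a given covariance; the second says that, for a fixed trace, the determinant is largest when all eigenvalues are equal, which is the isotropic case.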
francois retweeted
Jack Morris @jxmnop
there are dozens or perhaps a couple hundred ex-{OpenAI, xAI, Google DeepMind} researchers founding companies in the current climate

there are, as far as i know, zero people leaving to found startups out of Anthropic

really makes you think
francois retweeted
Simo Ryu @cloneofsimo
I'm confused about the "10,000x more efficient" part. This means you can train a stable-diffusion-3-like model with ~$20 of electricity. What stops them from building a model and demonstrating it, beyond *checks notes* ... Fashion MNIST? I'm genuinely curious what's stopping them from demonstrating something like imagenet-1k, which should take less than a dollar of electricity (if my math is right) for 200k steps of training
Extropic@extropic

Hello Thermo World.

Sauers @Sauers_
After training their flagship 405B parameter model, Thinking Machines researchers discovered that replacing identity mappings between attention layers with non-linear activation functions dramatically improved performance. "Our previous architecture was essentially computing weighted averages at every layer," explains lead researcher; "introducing non-linearity allows the network to learn feature interactions we didn't know were possible—it can now represent functions that aren't just linear combinations of inputs." The lab is calling this the "Deep Learning 2.0" paradigm shift.
francois retweeted
Anthropic @AnthropicAI
New Anthropic research: Signs of introspection in LLMs. Can language models recognize their own internal thoughts? Or do they just make up plausible answers when asked about them? We found evidence for genuine—though limited—introspective capabilities in Claude.
francois retweeted
Wirelyss 👁️‍🗨️💫
Luckily since the Louvre made NFTs of their jewelry, even though the crowns physically were stolen, they still own the same assets. Because the tokens still exist and are in limited supply just as before. Nothing has changed. few understand blockchain technology.