Alexandre Momeni

476 posts

@AlexandreMomeni

_investor (@GeneralCatalyst) _alumni(@NablaTech, @GoldmanSachs, @Stanford, @Polytechnique, @HECParis, @LSEEcon). Health & Bio, Machine Intelligence, Infra

London, England · Joined September 2018
900 Following · 400 Followers
Alexandre Momeni reposted
General Catalyst (@generalcatalyst)
“The digital world lacks physical information. The physical world lacks dense digital information. Games perfectly merge these two together, and we believe that’s just the next phase of pretraining.”

Every few weeks @gen_intuition ships new emergent capabilities that are orthogonal to anything you see in the LLM world.

Watch GC’s @max_rimpel in conversation with @PimDeWitte.

Chapters
00:00 - Introduction
00:10 - The World's Biggest Private RuneScape Server
03:09 - From RuneScape to Ebola
07:41 - Mapping the Unmappable
09:47 - Why LLMs Can't See the World
13:45 - The Accidental Foundation of General Intuition
19:10 - Turning Down a Life-Changing Acquisition Offer
21:12 - One Foot in Front of the Other
24:04 - Atoms to Atoms
27:33 - The Talent Flywheel
30:20 - Protecting the Last Weird Corner of the Internet
3 replies · 16 reposts · 82 likes · 18.6K views
Alexandre Momeni reposted
Patrick Collison (@patrickc)
I'm lucky enough to have a great doctor and access to excellent Bay Area medical care. I've taken lots of standard screening tests over the years and have tried lots of "health tech" devices and tools.

With all this said, by far the most useful preventative medical advice that I've ever received has come from unleashing coding agents on my genome, having them investigate my specific mutations, and having them recommend specific follow-on tests and treatments. Population averages are population averages, but we ourselves are not averages.

For example, it turns out that I probably have a 30x(!) higher-than-average predisposition to melanoma. Fortunately, there are both specific supplements that help counteract the particular mutations I have, and of course I can significantly dial up my screening frequency. So, this is very useful to know.

I don't know exactly how much the analysis cost, but probably less than $100. Sequencing my genome cost a few hundred dollars.

(One often sees papers and articles claiming that models aren't very good at medical reasoning. These analyses are usually based on employing several-year-old models, which is a kind of ludicrous malpractice. It is true that you still have to carefully monitor the agents' reasoning, and they do on occasion jump to conclusions or skip steps, requiring some nudging and re-steering. But, overall, they are almost literally infinitely better for this kind of work than what one can otherwise obtain today.)

There are still lots of questions about how this will diffuse and get adopted, but it seems very clear that medical practice is about to improve enormously. Exciting times!
488 replies · 642 reposts · 9.6K likes · 4.1M views
Alexandre Momeni reposted
Atila (@atiorh)
This is what context does to your speech-to-text system! Our new paper studies the impact of contextual information on the accuracy of leading open-source and proprietary systems.
1 reply · 4 reposts · 21 likes · 2.3K views
Alexandre Momeni reposted
Atila (@atiorh)
localhost Ep. 2
Bryan Catanzaro (@ctnzr) on @NVIDIAAI's open models and risky bets
(00:20) Who is Bryan?
(07:38) Getting Nvidia to care about Deep Learning
(14:13) Why did Bryan leave Nvidia right when Deep Learning was taking off
(18:02) Leadership: Aligning a village of researchers
(24:12) Will the frontier flip back to open?
(32:16) Nvidia's models: Side project or core business?
(38:19) Efficiency leads to edge inference: Does Apple capture inference?
(42:43) Nvidia’s risky bets: Fewer and fewer bits
(47:19) Nvidia's misstep with Volta
(52:30) Every model is already obsolete as soon as you stop training it
2 replies · 3 reposts · 14 likes · 1.5K views
Alex Elkrief (@AElkrief)
AI agents in digital asset management are moving from concept to production. Continuous market monitoring, cross-protocol rebalancing, automated strategy execution: the operational advantages are clear.

The underexplored risk is hallucination. LLM-powered agents generate outputs probabilistically. They can misinterpret oracle data, fabricate a protocol address, or route funds into a pool with no liquidity, all with high confidence. On-chain, where transactions are irreversible, this class of error is uniquely costly.

Traditional finance mitigates this through pre-trade checks, approval chains, and segregation of duties. Most on-chain vault infrastructure has no equivalent: the agent gets signing authority, and that authority is unconstrained.

Upshift vaults, with their on-chain policy engine, provide an elegant solution here. The architecture enforces deterministic constraints at the smart contract level, independent of the agent's reasoning:

Pre-execution:
- Role-based access control scopes each agent to a defined set of operations
- Granular permissions validate every external call down to the function selector
- Asset and protocol whitelists bound the universe of allowable interactions

During execution:
- Balance checks before and after every call verify expected outcomes; mismatches trigger a revert
- NAV growth rate caps limit portfolio-level impact per unit of time
- A module discovery pattern separates intent declaration from execution: the agent proposes, the contract validates before committing

Post-execution:
- Timelocked governance enforces delays on critical changes
- Emergency pause allows security providers to halt operations immediately

Smart contracts are deterministic: they enforce exactly the boundaries they were programmed with, regardless of the calling agent's confidence level. As autonomous agents take on a larger role in on-chain asset management, the policy layer underneath them becomes the critical infrastructure.

That's what we're building.
1 reply · 1 repost · 7 likes · 184 views
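The pre-execution and during-execution checks listed in Alex Elkrief's post above can be sketched off-chain in a few lines of Python. This is an illustrative toy, not Upshift's contract code: the names `Call`, `VaultPolicy`, `guarded_execute`, and `PolicyViolation` are hypothetical, NAV is assumed to be the plain sum of balances, and the NAV growth cap is applied per call rather than per unit of time.

```python
from dataclasses import dataclass


class PolicyViolation(Exception):
    """Stands in for an on-chain revert in this toy model."""


@dataclass(frozen=True)
class Call:
    agent: str           # identity of the calling agent
    target: str          # protocol address the agent wants to call
    selector: str        # function selector on that target
    asset: str           # asset the call is expected to move
    expected_delta: int  # balance change the agent declares up front


@dataclass
class VaultPolicy:
    roles: dict            # agent -> set of allowed (target, selector) pairs
    asset_whitelist: set   # assets the vault may touch
    max_nav_growth: float  # assumed cap on NAV growth per call, as a fraction

    def pre_check(self, call):
        # Role-based access control, scoped down to the function selector.
        if (call.target, call.selector) not in self.roles.get(call.agent, set()):
            raise PolicyViolation("call not permitted for this agent")
        if call.asset not in self.asset_whitelist:
            raise PolicyViolation("asset not whitelisted")

    def post_check(self, call, bal_before, bal_after, nav_before, nav_after):
        # Balance check: the observed change must match the declared one.
        if bal_after - bal_before != call.expected_delta:
            raise PolicyViolation("balance mismatch")
        # NAV growth cap (toy version: per call, NAV = sum of balances).
        if nav_before > 0 and (nav_after - nav_before) / nav_before > self.max_nav_growth:
            raise PolicyViolation("NAV growth cap exceeded")


def guarded_execute(policy, balances, call, effect):
    """Agent proposes, policy validates: apply `effect` to a staged copy of
    the balances and return it only if every check passes. On a violation
    the original `balances` dict is left untouched."""
    policy.pre_check(call)
    staged = dict(balances)
    effect(staged)
    policy.post_check(
        call,
        balances.get(call.asset, 0), staged.get(call.asset, 0),
        sum(balances.values()), sum(staged.values()),
    )
    return staged
```

The point of the design is that the checks never look at the agent's reasoning. They only compare the declared intent with the observed state change, so a confidently wrong call fails the same way as any other invalid call.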
Alexandre Momeni reposted
Atila (@atiorh)
Why is the 100 ms barrier for Qwen3-TTS (1.7b) this important?👇

Nvidia GPUs scale up amazingly, but they don't scale down well to serving a single user with sub-3b Transformers. They are throughput-maximizers, not latency-minimizers.

@Alibaba_Qwen's Qwen3-TTS paper showed that an optimized vLLM implementation on Nvidia GPUs achieved 101 ms time-to-first-byte latency under idealized conditions: no concurrency and no network round-trip latency.

Argmax TTSKit achieves as low as 70 ms on Apple Silicon Macs in the post below, but the takeaway is not 70 vs 101 ms here. The takeaway is that, when we move from idealized conditions to the real world:
- Mac will actually serve a single user without an internet round-trip, and the user will experience sub-100ms latency as-is
- Nvidia GPUs will serve many users concurrently in the cloud, resulting in at least 3-5x higher latency. Most importantly, latency will have high variance.

Real-time streaming inference for sub-3b Transformers is where on-device inference is differentiated from cloud, and companies pay the premium for this today. This is the only commercially relevant market segment where the broadly repeated but rarely substantiated claim of "on-device is faster" actually holds, not running 1T LLMs on 2 Mac Studios.
argmax (@argmax)

TTSKit now achieves sub-100ms time-to-first-byte for Qwen3-TTS 1.7b on Apple Silicon! Link to the code repo and details in comments.

3 replies · 13 reposts · 139 likes · 23.2K views
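The arithmetic behind Atila's latency comparison above can be written out as a small sketch. It assumes the "3-5x higher" multiplier applies to the 101 ms idealized figure from the Qwen3-TTS paper; the function name and the `network_rtt_ms` parameter are hypothetical, and no network round-trip time is given in the post, so it defaults to zero here.

```python
def cloud_latency_range_ms(idealized_ms, low_factor=3.0, high_factor=5.0,
                           network_rtt_ms=0.0):
    """Return a (low, high) estimate for cloud time-to-first-byte in ms.

    Assumption: the multiplier from concurrent serving applies to the
    idealized single-user figure, and any network round trip is added on top.
    """
    low = idealized_ms * low_factor + network_rtt_ms
    high = idealized_ms * high_factor + network_rtt_ms
    return low, high


on_device_ms = 70         # best case reported for TTSKit on Apple Silicon
idealized_cloud_ms = 101  # vLLM on Nvidia GPUs, no concurrency, no network

low, high = cloud_latency_range_ms(idealized_cloud_ms)
print(f"on-device: {on_device_ms} ms, cloud under load: {low:.0f}-{high:.0f} ms")
```

Under these assumptions the cloud estimate is roughly 303 to 505 ms before any network round trip is added, which the post treats as a lower bound ("at least").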
Alexandre Momeni (@AlexandreMomeni)
Beyond @GoogleDeepMind and @IsomorphicLabs, @demishassabis’s legacy may be the generation of founders he’s inspired: @MistralAI, @orbitalmaterials, @latentlabs, and many more.
James Dacombe (@jamesdacombe)

Two observations:
1. @demishassabis has done more for the UK by demanding DeepMind remain headquartered in London than arguably any Briton in recent decades (never mind all of his other achievements for the world). His actions will single-handedly account for the majority of the UK’s future growth, if the politicians can manage to stay out of the way. What a legend.
2. Sequoia appear to be back and playing aggressively again.

0 replies · 0 reposts · 1 like · 245 views
Alexandre Momeni reposted
argmax (@argmax)
We are open-sourcing TTSKit! Run state-of-the-art text-to-speech models on your Mac and iPhone. The launch version supports @Alibaba_Qwen Qwen3-TTS and generates audio faster than real-time playback with sub-200 ms time-to-first-byte. Voice cloning and advanced speed optimizations will be in the next version. Link to the GitHub repo and models on @huggingface in comments.
19 replies · 66 reposts · 388 likes · 62K views
Alexandre Momeni reposted
Atila (@atiorh)
Pro tip: When using @superwhisper for AI meeting notes, select Parakeet (voice to text) + Sonnet 4.5 (text to summary) and put all of your company jargon in Vocabulary. Thank me later.
1 reply · 2 reposts · 5 likes · 438 views
Alexandre Momeni reposted
Mistral AI (@MistralAI)
We’ve raised €1.7B to accelerate technological progress with AI! This Series C funding round, led by @ASMLcompany, fuels Mistral AI scientific research to keep pushing the frontier of AI to tackle the most critical technological challenges faced by strategic industries.
213 replies · 419 reposts · 3.8K likes · 561.2K views
Alexandre Momeni reposted
Sahaj Garg (@SahajGarg6)
Latency is the most underrated product feature. 500ms feels instant. 1s feels broken. 2s and you’ve lost the user completely. At Wispr Flow we’ve had to rethink infra from the ground up just to hit sub-500ms LLM inference worldwide. If you like sweating the milliseconds, we’re hiring ML + infra engineers @WisprFlow 👉 wisprflow.ai/jobs
8 replies · 2 reposts · 30 likes · 4.5K views
Alexandre Momeni reposted