Jouni Helminen

924 posts

@dharmaone

design, open source, music

London · Joined March 2009
7K Following · 1.4K Followers
Jouni Helminen reposted
Bookmark Bro@bookmarkbroski·
Bookmarked something fire on X… then spent 15 minutes scrolling trying to find it again? 😩 We got you fam. Meet BookmarkBro — a beautiful native Mac app for browsing, super-fast search, tagging, and chatting with AI about your X bookmarks. All locally on your Mac. No privacy leaks. No new sign-ups. No cloud nonsense. Free in beta. More + download in the replies 👇
3 replies · 6 reposts · 13 likes · 13.7K views
Jouni Helminen reposted
Aman@Amank1412·
USING Claude Opus 4.7 TO CENTER A DIV
351 replies · 2.3K reposts · 28.9K likes · 1.8M views
hinata@HinataMotivates·
Jensen Huang gets into a heated argument over selling chips to China.
208 replies · 220 reposts · 4.5K likes · 1.2M views
Aakash Gupta@aakashgupta·
Props to Dwarkesh for going toe to toe with the CEO of the world’s largest company like this
88 replies · 66 reposts · 1.6K likes · 153.4K views
Jouni Helminen@dharmaone·
Great interview. Only one Codex model runs on Cerebras afaik, 5.3-spark. I've been testing it: very fast, but the quality isn't great. Tiny context window and not as good overall as 5.4; I think this is because the chip only has 44 GB of SRAM. @MatXComputing will have an interesting blend of SRAM (weights) and HBM (KV cache), and Nvidia will no doubt do more with Groq over time for fast inference of some workloads.
Huang is right that Nvidia GPUs/CUDA are more general and more future-proof for architecture changes than TPUs optimised for current workloads/architectures. He also said the main reason Anthropic is using TPUs is that Google/Amazon are large investors in them and Nvidia wasn't able to invest early on; not sure how true that is, but it was interesting.
China doesn't have access to the latest lithography for competitive power efficiency, but will build EUV (or whatever comes after) capabilities eventually, likely in the next decade. They are moving pretty fast elsewhere (models obviously, but also fast 3D DDR5 from CXMT, Huawei etc for processors). I think the chip ban is probably bad long term; it might have been better to keep them on Nvidia instead of accelerating home-grown alternatives.
0 replies · 0 reposts · 0 likes · 399 views
Ejaaz@cryptopunk7213·
ridiculous amount of alpha in this post, gavin knows this shit better than anyone. tldr:
- the switching cost to train your model on a different type of GPU is very high now
- translation: AI labs are becoming increasingly reliant on their GPU maker (which gives Nvidia a lot of power)
- labs are now literally designing their models to work with specific GPUs: Google's Gemini needs TPUs, OpenAI needs Cerebras / Nvidia
- Anthropic is the ONLY ONE that can afford to switch. why? because they train Claude across TPUs, Trainium and Nvidia
- but inference is now way more important than pre-training, aka the TYPE of GPU matters more
- Chinese models are trained on chips VERY different to America's = their models won't run on our hardware.
Gavin Baker@GavinSBaker

Much of Dwarkesh's argument hinges on this statement, which *was* accurate but will be increasingly inaccurate on a go-forward basis imo: “American labs port across accelerators constantly. Anthropic's models are run on GPUs, they're run on Trainium, they're run on TPUs. There are so many things you can do, from distilling to a model that's well fit for your chips.”
As system-level architectures diverge (torus vs. switched scale-up topologies, memory hierarchies, networking primitives), true portability is eroding. The Mi300 and Mi325 had roughly the same scale-up domain size as Hopper, while Blackwell's scale-up domain is 9x larger than the Mi355 scale-up domain, etc. Many frontier models are now being explicitly co-designed for inference on specific hardware like GB300 racks. Codex on Cerebras is another example. Those models run less efficiently on other systems and the performance differentials will only widen. A model that runs well on Google's torus topology will run less efficiently on Nvidia's switched scale-up topology and vice versa - the data traffic is fundamentally different as a byproduct of the models being parallelized across the different topologies.
Google's internal teams - and increasingly the Anthropic teams as they become the most important customer of almost every cloud - have the luxury of operating across the stack (models, chips, networking), but that is not the case for the rest of the market and other prospective users. Anthropic is the exception, not the rule. To wit, Anthropic and Google allegedly have a mutual understanding where Anthropic can hire the TPU engineers they need every year to ensure that they can continue to get the most out of the TPU.
Given the overwhelming importance of cost per token to the economics of the labs, models will be run where they run best. Most extremely large MoE models will run best on GB300s given the importance of having a switched scale-up network like NVLink for MoE inference.
When training was the dominant cost for labs and power was broadly available, labs were optimizing to minimize capex dollars. Model portability was a way to create leverage over suppliers. I think that drove a lot of the focus on portability. Today, inference costs as measured by tokens per watt per dollar are everything. Inference is way more important than training costs (inference is effectively now part of training via RL). Labs are therefore now optimizing for inference. This means increasing co-design and higher go-forward switching costs for individual models between systems. I do think this explains why Anthropic and Nvidia came together: Anthropic needed Blackwells and Rubins to inference at least *some* of their models economically. And Mythos might just end up being released coincident with the availability of Rubins for inference. TLDR: as labs shift their focus from training to inference, the costs of portability and the upside of co-design to maximize tokens per watt per dollar both rise. Portability is likely to begin decreasing as a result.   I think what I might have respectfully added to Jensen’s answer is that systems evolve under local selective pressures. The evolutionary pressure in America is a shortage of watts so it makes sense for Nvidia to optimize, as an American company, for power efficiency and tokens per watt and stay on copper as long as possible. China has a surfeit of watts. Chinese AI systems are already taking advantage of this with the Huawei Cloudmatrix 384 and Atlas SuperPoD having an optical scale-up domain that is much larger than anything offered by Nvidia today at the cost of *much* higher power consumption and much lower tokens per watt. The networking primitives for this Huawei system are very different than those for Nvidia’s systems and a model that runs well on Nvidia will not run well on that system and vice versa. 
This means that if a Chinese ecosystem gets momentum, Chinese models might stop running well on American hardware. And when Chinese models run best on American hardware, America is in a better position as this gives America a degree of leverage and control over Chinese AI that it risks losing to an all-Chinese alternative ecosystem.
This architectural fork makes porting and distillation less effective and strengthens the pro-American national security case for selling China deprecated GPUs imo. Also I will attest that I did not wake up a loser this morning.
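The thread's recurring metric, tokens per watt per dollar, can be made concrete with a toy calculation. All numbers and both systems below are hypothetical, as is the function name; the point is only that under a fixed power budget (the American constraint in the thread), deliverable throughput scales with tokens per watt regardless of purchase price.

```python
# Toy illustration of "tokens per watt per dollar" (all numbers hypothetical).
# System B is cheaper per unit of throughput but far less power-efficient.

def tokens_per_watt_per_dollar(tokens_per_sec, watts, dollars):
    """Throughput normalized by both power draw and capital cost."""
    return tokens_per_sec / watts / dollars

sys_a = tokens_per_watt_per_dollar(tokens_per_sec=1_000_000, watts=120_000, dollars=3_000_000)
sys_b = tokens_per_watt_per_dollar(tokens_per_sec=1_200_000, watts=400_000, dollars=2_000_000)

# Under a fixed power budget, total throughput is set by tokens per watt alone.
power_budget_watts = 10_000_000
throughput_a = 1_000_000 / 120_000 * power_budget_watts  # tokens/sec at budget
throughput_b = 1_200_000 / 400_000 * power_budget_watts

print(f"A: {sys_a:.2e} tok/s/W/$, {throughput_a:,.0f} tok/s at power budget")
print(f"B: {sys_b:.2e} tok/s/W/$, {throughput_b:,.0f} tok/s at power budget")
```

With these invented numbers, the power-constrained operator picks A even though B is cheaper per token of raw throughput, which is the selective pressure the thread attributes to the American market.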

16 replies · 15 reposts · 400 likes · 84.8K views
Jouni Helminen@dharmaone·
@claudeai this is the way. the executor could be a local model also, or a realtime voice model that does tool calling for complex tasks when needed but doesn't stop the voice conversation
0 replies · 0 reposts · 1 like · 1.1K views
Claude@claudeai·
We're bringing the advisor strategy to the Claude Platform. Pair Opus as an advisor with Sonnet or Haiku as an executor, and get near Opus-level intelligence in your agents at a fraction of the cost.
Claude tweet media
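The advisor/executor split in the tweet can be sketched as a simple control loop: one expensive call produces a plan, then a cheap model executes each step. Everything below is illustrative, not the Claude Platform API: `call_model` is a deterministic stub and the model names are placeholders.

```python
# Minimal sketch of the advisor/executor pattern: a strong "advisor" model
# plans once, a cheap "executor" model carries out each step, so most tokens
# flow through the low-cost model. `call_model` is a stub, NOT a real API.

def call_model(model: str, prompt: str) -> str:
    """Stand-in for an LLM API call, with canned responses for illustration."""
    canned = {
        "advisor-model": "1. Parse the input\n2. Transform it\n3. Format the output",
        "executor-model": f"done: {prompt.splitlines()[-1]}",
    }
    return canned[model]

def run_with_advisor(task: str, advisor="advisor-model", executor="executor-model"):
    # One expensive advisor call produces a step-by-step plan...
    plan = call_model(advisor, f"Plan how to: {task}")
    # ...then the cheap executor handles each step of that plan.
    results = []
    for step in plan.splitlines():
        results.append(call_model(executor, f"Task: {task}\nStep: {step}"))
    return plan, results

plan, results = run_with_advisor("summarize a CSV of sales data")
```

A real implementation would replace `call_model` with actual API calls and could also loop the advisor back in when the executor gets stuck, which is roughly the voice-assistant variant Jouni describes in his reply.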
1K replies · 2.8K reposts · 38.5K likes · 4.7M views
Jouni Helminen@dharmaone·
@elonmusk Ramanujan is a good example also. Speedrunning in representation space of compressed insights + intuition vs reasoning in language
0 replies · 0 reposts · 1 like · 500 views
Elon Musk@elonmusk·
Hadamard thought in image space
3.1K replies · 3.4K reposts · 53.6K likes · 67.4M views
Mustafa Suleyman@mustafasuleyman·
Three models. Three top-tier results. All shipped within just a few months by the @MicrosoftAI team.
- MAI-Transcribe-1 dropped today, the most accurate transcription model in the world across 25 languages according to the FLEURS WER benchmark.
- MAI-Voice-1 sets a new standard for natural speech.
- MAI-Image-2 lands as a top 3 model family on @arena.
We've been building with them - now you can too. All 3 available now on Microsoft Foundry.
51 replies · 97 reposts · 544 likes · 76.6K views
Aryan@justbyte_·
In which programming language did you write your first "Hello World"?
Aryan tweet media
484 replies · 11 reposts · 543 likes · 32.8K views
Jouni Helminen@dharmaone·
@amix3k @soumo_dg it's a really great model and the optional reasoning and tool calling are great too. but i wonder how this will scale to every user on popular apps unless metered. some day a model like this will run on device
0 replies · 0 reposts · 0 likes · 12 views
Amir Salihefendić@amix3k·
@soumo_dg I’m just exploring, no idea if we’ll add this to Todoist at some point 👍😊
2 replies · 0 reposts · 1 like · 421 views
Amir Salihefendić@amix3k·
Gemini 3.1 Flash Live launched a few days ago, and it's a pretty incredible real-time model. We're getting very close to everyone having their own JARVIS assistant. A small demo of a Todoist voice assistant built with the new model.
18 replies · 25 reposts · 368 likes · 35.4K views
Brad Neuberg@bradneuberg·
@BoWang87 Any good introductions explaining how SIGReg Gaussian regularization works?
1 reply · 0 reposts · 2 likes · 396 views
Bo Wang@BoWang87·
This is essentially LeCun's JEPA dream made practical: a clean, efficient, collapse-free world model that learns entirely from pixels with minimal engineering tricks. The key insight (SIGReg Gaussian regularization) is surprisingly simple.
Lucas Maes@lucasmaes_

JEPAs are finally easy to train end-to-end without any tricks! Excited to introduce LeWorldModel: a stable, end-to-end JEPA that learns world models directly from pixels, no heuristics. 15M params, 1 GPU, and full planning <1 second. 📑: le-wm.github.io

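For readers asking (as Brad does below) how the regularizer works: a toy NumPy sketch of the idea behind a SIGReg-style objective is to score embeddings by how closely their random 1D projections match a standard Gaussian. This is my simplification with an invented function name; the actual LeJEPA objective applies a proper goodness-of-fit statistic to the sketched projections, not this crude moment-matching penalty.

```python
import numpy as np

def sigreg_style_penalty(z: np.ndarray, num_directions: int = 64, seed: int = 0) -> float:
    """Toy stand-in for a SIGReg-style regularizer: under an isotropic
    Gaussian, every 1D projection of the embeddings should itself be
    standard normal. Here we only match the first two moments of each
    projection (the real method uses a statistical test)."""
    rng = np.random.default_rng(seed)
    n, d = z.shape
    # Random unit directions to project onto.
    dirs = rng.normal(size=(num_directions, d))
    dirs /= np.linalg.norm(dirs, axis=1, keepdims=True)
    proj = z @ dirs.T  # shape (n, num_directions)
    mean_pen = np.mean(proj.mean(axis=0) ** 2)          # means should be 0
    var_pen = np.mean((proj.var(axis=0) - 1.0) ** 2)    # variances should be 1
    return float(mean_pen + var_pen)

# Healthy Gaussian embeddings score near zero; collapsed (constant)
# embeddings are heavily penalized, which is what prevents collapse.
rng = np.random.default_rng(1)
good = sigreg_style_penalty(rng.normal(size=(2048, 32)))
collapsed = sigreg_style_penalty(np.ones((2048, 32)))
```

The collapse-prevention intuition is visible here: a collapsed representation has zero-variance projections, so the variance term alone keeps the penalty large.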
13 replies · 61 reposts · 667 likes · 75.9K views
dataStrategies@_DataStrategies·
@BoWang87 SIGReg gaussian regularization was already in LeJEPA, wasn't it?
1 reply · 0 reposts · 1 like · 130 views
Chen Cheng@cherry_cc12·
Yes, we just dropped it: Flash, 35B-A3B, 122B-A10B & 27B 🚀 Quick take from me:
- Flash is basically 35B-A3B but with longer context out of the box; great for production.
- 27B Dense is honestly my favorite for indie devs. Runs on a single GPU, multimodal, and I've been playing with it myself; the Code Agent and reasoning are legit good.
- 122B-A10B finally shows what this architecture is really capable of.
- And 35B-A3B is such insane value. Smart, fast to run, highly efficient. Hard to beat at this size.
Hope you like it, and seriously, try our model. Your feedback is how we get better 🙏
Try them now: chat.qwen.ai
Qwen@Alibaba_Qwen

🚀 Introducing the Qwen 3.5 Medium Model Series
Qwen3.5-Flash · Qwen3.5-35B-A3B · Qwen3.5-122B-A10B · Qwen3.5-27B
✨ More intelligence, less compute.
• Qwen3.5-35B-A3B now surpasses Qwen3-235B-A22B-2507 and Qwen3-VL-235B-A22B, a reminder that better architecture, data quality, and RL can move intelligence forward, not just bigger parameter counts.
• Qwen3.5-122B-A10B and 27B continue narrowing the gap between medium-sized and frontier models, especially in more complex agent scenarios.
• Qwen3.5-Flash is the hosted production version aligned with 35B-A3B, featuring:
– 1M context length by default
– Official built-in tools
🔗 Hugging Face: huggingface.co/collections/Qw…
🔗 ModelScope: modelscope.cn/collections/Qw…
🔗 Qwen3.5-Flash API: modelstudio.console.alibabacloud.com/ap-southeast-1…
Try in Qwen Chat 👇
Flash: chat.qwen.ai/?models=qwen3.…
27B: chat.qwen.ai/?models=qwen3.…
35B-A3B: chat.qwen.ai/?models=qwen3.…
122B-A10B: chat.qwen.ai/?models=qwen3.…
Would love to hear what you build with it.

39 replies · 34 reposts · 672 likes · 49K views
Jouni Helminen@dharmaone·
@elonmusk v cool. will AI4/5 be sold separately? And will you be able to use the AI4/5 chip in your car for other inference tasks (like Digital Optimus) while not driving?
0 replies · 0 reposts · 0 likes · 39 views
Elon Musk@elonmusk·
Macrohard or Digital Optimus is a joint xAI-Tesla project, coming as part of Tesla's investment agreement with xAI.
Grok is the master conductor/navigator with deep understanding of the world to direct Digital Optimus, which is processing and actioning the past 5 secs of real-time computer screen video and keyboard/mouse actions. Grok is like a much more advanced and sophisticated version of turn-by-turn navigation software. You can think of Digital Optimus AI as System 1 (the instinctive part of the mind) and Grok as System 2 (the thinking part of the mind).
This will run very competitively on the super low cost Tesla AI4 ($650) paired with relatively frugal use of the much more expensive xAI Nvidia hardware. And it will be the only real-time smart AI system.
This is a big deal. In principle, it is capable of emulating the function of entire companies. That is why the program is called MACROHARD, a funny reference to Microsoft. No other company can yet do this.
8.2K replies · 11.1K reposts · 78.9K likes · 47.7M views
Jouni Helminen@dharmaone·
This was a great recent interview - youtu.be/ukpCHo5v-Gc?si…
Good fit for millions of low-complexity problems that are still unsolved and are verifiable.
Coding is a bit special in the sense that there is potential for RSI; starting to see that with Karpathy's new autoresearcher, AI-optimised CUDA kernels etc.
YouTube video
1 reply · 0 reposts · 0 likes · 263 views
Ye Zhang@yezhang1998·
I think RL with verifiable rewards will become increasingly important in pushing LLMs toward their own “AlphaZero moment.” It will likely begin with coding, then extend to math, physics, and other domains where models can self-explore, discover out-of-distribution solutions humans might never imagine, and verify them using an absolute reward signal (0/1). This also reminds me of @elonmusk talking about a future where programs could be generated directly as binaries, without going through the traditional compilation process. That may actually be possible if LLMs can generate binary code and then execute it directly against a verifiable reward.
SAIR@SAIRfoundation

Terence Tao: Formal Verification Breaks the Trust Barrier in Mathematics
Formal verification is transforming mathematical collaborations, enabling anonymous contributions, machine-checked proofs, and radically more precise scientific discussion.

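The generate-verify-reward loop Ye Zhang describes can be sketched in a few lines: sample candidate programs, execute them against a checker, and assign the absolute 0/1 reward. The `verifier`, the task, and the candidates below are invented for illustration; a real pipeline would sample candidates from an LLM, sandbox all execution, and feed the rewards into an RL update.

```python
# Minimal sketch of RL with verifiable rewards: a binary 0/1 signal from
# executing candidate programs against a spec, with no human grading.
# Task (invented for illustration): solve(n) must equal n*(n+1)//2.

def verifier(candidate: str) -> bool:
    """Absolute 0/1 check: run the candidate and test it against the spec."""
    try:
        scope = {}
        exec(candidate, scope)  # in practice, run untrusted code in a sandbox
        return all(scope["solve"](n) == n * (n + 1) // 2 for n in range(50))
    except Exception:
        return False

# Stand-ins for samples drawn from a policy / LLM:
candidates = [
    "def solve(n): return n * n",              # wrong
    "def solve(n): return sum(range(n + 1))",  # correct, brute force
    "def solve(n): return n * (n + 1) // 2",   # correct, closed form
]

# Binary rewards, which an RL step would then use to update the policy.
rewards = [int(verifier(c)) for c in candidates]
```

Note the reward cannot distinguish the two correct solutions; ranking the closed form above the brute-force sum would need an additional signal such as runtime, which is one way these setups push toward out-of-distribution solutions.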
2 replies · 6 reposts · 76 likes · 17.5K views
Jouni Helminen@dharmaone·
@tomjohndesign This plugin has worked very well for the same task, but it's great to see Figma embrace Claude Code more. More excited to see design-systems integration and two-way flows in the future: figma.com/community/plug…
0 replies · 0 reposts · 2 likes · 306 views
Tom Johnson@tomjohndesign·
This is incredible. I'm seeing people bashing this but I'm pretty sure they've never had to go through the pain of working and trying to recreate complex web apps in Figma to tweak layouts and try new variants -- remove cards, update copy, try different information density, etc. I'm now able to work directly from the Vercel dashboard as the source of design truth and then explore layout changes in the canvas. This is such a big unlock for me I can't even begin to explain it. I just recreated basically all of the data-heavy core UI of Vercel (something that was nearly impossible to do before) with ACTUAL data in less than 5 minutes. Charts, lists, tables, filters, all one-shotted. This is crazy crazy crazy.
Dylan Field@zoink

x.com/i/article/2023…

38 replies · 12 reposts · 438 likes · 97K views
Jouni Helminen@dharmaone·
@bencera @ryancarson macOS/iOS STT APIs do the heavy lifting; the weights are bundled with the OS. Still, looks great and very useful
0 replies · 0 reposts · 0 likes · 38 views
Ben Cera@Bencera·
@ryancarson 913kb teleprompter in a world of 2GB apps that show you a loading screen. we went wrong somewhere
1 reply · 0 reposts · 2 likes · 736 views
Ben South@bnj·
Introducing @variantui
Enter an idea and get endless (beautiful) designs as you scroll
No canvas, no skills or MCP, no constant prompting
Reply if you'd like 200 free designs to give it a try
2.2K replies · 271 reposts · 4.2K likes · 1.1M views
Jouni Helminen reposted
Volodymyr Zelenskyy / Володимир Зеленський
There was so much talk about the protests in Iran, but they drowned in blood. The world has not helped the Iranian people enough; it has stood aside. What will Iran become after this bloodshed? If the regime survives, it sends a clear signal to every bully: kill enough people, and you stay in power.
9.8K replies · 23.7K reposts · 71.8K likes · 1.6M views