Alec Flowers

49 posts


@flowpow123

Inference @ Nvidia

Joined January 2012
121 Following · 41 Followers
Alec Flowers reposted
NVIDIA AI @NVIDIAAI
Traditional inference wasn’t built for agentic coding. Agentic tools make hundreds of API calls per coding session, often with recomputed context, creating bottlenecks that drive up cost per token.

NVIDIA Dynamo rebuilds the stack for agents with:
→ KV-aware routing
→ Agent-aware scheduling
→ Multi-tier caching
→ Unified orchestration

The result: higher cache hit rates, lower latency, and up to 7× more throughput: nvda.ws/3P1tO1N
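The "KV-aware routing" idea in the tweet can be illustrated with a toy sketch. Everything here is hypothetical (the `Worker`/`Router` classes and the longest-prefix-match policy are illustrative assumptions, not Dynamo's actual implementation): the router sends each request to the worker that already holds the longest cached token prefix, so prefill work for the shared context can be skipped.

```python
# Toy sketch of KV-cache-aware routing (hypothetical; not NVIDIA Dynamo's API).
# Each worker remembers the token prefixes it has KV cache for; the router
# prefers the worker with the longest matching cached prefix, breaking ties
# by current load, so repeated agent context lands where it is already cached.

class Worker:
    def __init__(self, name):
        self.name = name
        self.cached_prefixes = []  # token tuples this worker has KV cache for
        self.load = 0

    def cached_len(self, tokens):
        # Length of the longest cached prefix matching this request's tokens.
        best = 0
        for prefix in self.cached_prefixes:
            n = 0
            for a, b in zip(prefix, tokens):
                if a != b:
                    break
                n += 1
            best = max(best, n)
        return best

class Router:
    def __init__(self, workers):
        self.workers = workers

    def route(self, tokens):
        # Prefer cache hits; among equal hits, prefer the least-loaded worker.
        chosen = max(self.workers,
                     key=lambda w: (w.cached_len(tokens), -w.load))
        chosen.load += 1
        chosen.cached_prefixes.append(tuple(tokens))
        return chosen

workers = [Worker("w0"), Worker("w1")]
router = Router(workers)
ctx = (1, 2, 3, 4)
first = router.route(ctx)            # cold start: picks a least-loaded worker
second = router.route(ctx + (5, 6))  # shares the prefix -> same worker again
```

With this policy, the follow-up request lands on the same worker as the first one, which is the cache-hit-rate effect the tweet describes.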
NVIDIA AI tweet media
46 replies · 104 reposts · 886 likes · 71.4K views
Alec Flowers reposted
ishan @0xishand
Thanks to an amazing partnership with @inferact and @lmsysorg/@radixark, Dynamo had day-0 support for DeepSeek-V4, with features including large-scale P/D disaggregation on B/GB200 and 300 and KV-cache-aware routing. Containers and some fun PRs linked in the next thread!
13 replies · 7 reposts · 64 likes · 16.1K views
Alec Flowers reposted
Sam Altman @sama
Really excellent work by the inference team to serve this model so efficiently! To a significant degree, we have to become an AI inference company now.
269 replies · 154 reposts · 5.7K likes · 322.3K views
Alec Flowers @flowpow123
@sama Can attest! The model is awesome.
0 replies · 0 reposts · 0 likes · 1.4K views
Sam Altman @sama
We tried a new thing with NVIDIA to roll out Codex across a whole company and it was awesome to see it work. Let us know if you'd like to do it at your company!
Sam Altman tweet media
481 replies · 423 reposts · 8.2K likes · 1M views
Alec Flowers reposted
Andrej Karpathy @karpathy
Judging by my tl there is a growing gap in understanding of AI capability.

The first issue I think is around recency and tier of use. I think a lot of people tried the free tier of ChatGPT somewhere last year and allowed it to inform their views on AI a little too much. This is a group of reactions laughing at various quirks of the models, hallucinations, etc. Yes I also saw the viral videos of OpenAI's Advanced Voice mode fumbling simple queries like "should I drive or walk to the carwash". The thing is that these free and old/deprecated models don't reflect the capability in the latest round of state of the art agentic models of this year, especially OpenAI Codex and Claude Code.

But that brings me to the second issue. Even if people paid $200/month to use the state of the art models, a lot of the capabilities are relatively "peaky" in highly technical areas. Typical queries around search, writing, advice, etc. are *not* the domain that has made the most noticeable and dramatic strides in capability. Partly, this is due to the technical details of reinforcement learning and its use of verifiable rewards. But partly, it's also because these use cases are not sufficiently prioritized by the companies in their hillclimbing because they don't lead to as much $$$ value. The goldmines are elsewhere, and the focus comes along.

So that brings me to the second group of people, who *both* 1) pay for and use the state of the art frontier agentic models (OpenAI Codex / Claude Code) and 2) do so professionally in technical domains like programming, math and research. This group of people is subject to the highest amount of "AI Psychosis" because the recent improvements in these domains as of this year have been nothing short of staggering. When you hand a computer terminal to one of these models, you can now watch them melt programming problems that you'd normally expect to take days/weeks of work.

It's this second group of people that assigns a much greater gravity to the capabilities, their slope, and various cyber-related repercussions. TLDR the people in these two groups are speaking past each other.

It really is simultaneously the case that OpenAI's free and I think slightly orphaned (?) "Advanced Voice Mode" will fumble the dumbest questions in your Instagram's reels and *at the same time*, OpenAI's highest-tier and paid Codex model will go off for 1 hour to coherently restructure an entire code base, or find and exploit vulnerabilities in computer systems. This part really works and has made dramatic strides because of 2 properties: 1) these domains offer explicit reward functions that are verifiable, meaning they are easily amenable to reinforcement learning training (e.g. unit tests passed yes or no, in contrast to writing, which is much harder to explicitly judge), but also 2) they are a lot more valuable in b2b settings, meaning that the biggest fraction of the team is focused on improving them. So here we are.
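The "verifiable rewards" point above can be made concrete with a minimal sketch (all names here are hypothetical, not any lab's actual training code): a coding task's reward is simply whether its unit tests pass, a binary, machine-checkable signal that RL can optimize directly, unlike writing quality.

```python
# Minimal sketch of a verifiable reward for RL on coding tasks (hypothetical
# names, not any specific lab's code). The candidate function is checked
# against unit tests; reward is 1.0 iff every test passes -- exactly the
# "unit tests passed yes or no" signal, unlike judging prose quality.

def verifiable_reward(candidate_fn, tests):
    """tests: list of (args, expected) pairs acting as unit tests."""
    try:
        passed = all(candidate_fn(*args) == expected
                     for args, expected in tests)
        return 1.0 if passed else 0.0
    except Exception:
        return 0.0  # a crashing candidate earns no reward

# Hypothetical task: implement addition.
tests = [((2, 3), 5), ((0, 0), 0), ((-1, 1), 0)]
good = lambda a, b: a + b    # passes all tests -> reward 1.0
buggy = lambda a, b: a - b   # fails a test    -> reward 0.0
```

The binary, automatically checkable nature of this signal is why coding and math have climbed so fast relative to open-ended writing, per the tweet's argument.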
staysaasy @staysaasy
The degree to which you are awed by AI is perfectly correlated with how much you use AI to code.
1.2K replies · 2.5K reposts · 20.6K likes · 4.3M views
Alec Flowers reposted
Noah @NoahKingJr
Claude watching me write code manually after I hit the daily limit
401 replies · 5.9K reposts · 80.8K likes · 3.7M views
Alec Flowers reposted
Andrej Karpathy @karpathy
@nummanali tmux grids are awesome, but i feel a need to have a proper "agent command center" IDE for teams of them, which I could maximize per monitor. E.g. I want to see/hide toggle them, see if any are idle, pop open related tools (e.g. terminal), stats (usage), etc.
303 replies · 117 reposts · 3.1K likes · 1.4M views
Alec Flowers reposted
Andrej Karpathy @karpathy
Thank you Jensen and NVIDIA! She’s a real beauty! I was told I’d be getting a secret gift, with a hint that it requires 20 amps. (So I knew it had to be good). She’ll make for a beautiful, spacious home for my Dobby the House Elf claw, among lots of other tinkering, thank you!!
NVIDIA AI Developer @NVIDIAAIDev
🙌 Andrej Karpathy’s lab has received the first DGX Station GB300 -- a Dell Pro Max with GB300. 💚 We can't wait to see what you’ll create @karpathy! 🔗 blogs.nvidia.com/blog/gtc-2026-… @DellTech
529 replies · 832 reposts · 19.2K likes · 1.1M views
Alec Flowers reposted
WarrenBuffering @WarrenInTheBuff
my agents when I hit build after a 40 minute plan mode
25 replies · 116 reposts · 1.9K likes · 184.8K views
Alec Flowers reposted
beginbot 🃏 @beginbot
Mark Zuckerberg acquiring AI companies
270 replies · 559 reposts · 10.2K likes · 1.4M views
Alec Flowers reposted
matt rothenberg @mattrothenberg
just picked up this bad boy. can't wait to write some software with it
matt rothenberg tweet media
222 replies · 876 reposts · 15K likes · 606.6K views
Alec Flowers reposted
Alexander Long @AlexanderLong
insane sequence of statements buried in an Alibaba tech report
Alexander Long tweet media
230 replies · 934 reposts · 6.9K likes · 2.9M views
Alec Flowers reposted
Rohan Pandey @khoomeik
unproductive shitty feeling day in 2024: i wrote 10 lines of code & answered slack
unproductive shitty feeling day in 2026: claude wrote 100 lines of code, cleaned up my s3 bucket, debugged a cuda error, launched an RL ablation, solved an easy-in-retrospect erdos problem
8 replies · 9 reposts · 412 likes · 15.4K views
Alec Flowers reposted
Rhys @RhysSullivan
i gave Claude access to my financial data and asked for suggestions and it told me to leave California 💀
Rhys tweet media
279 replies · 415 reposts · 13.7K likes · 726K views
Alec Flowers reposted
Andrej Karpathy @karpathy
It is hard to communicate how much programming has changed due to AI in the last 2 months: not gradually and over time in the "progress as usual" way, but specifically this last December. There are a number of asterisks but imo coding agents basically didn’t work before December and basically work since - the models have significantly higher quality, long-term coherence and tenacity and they can power through large and long tasks, well past enough that it is extremely disruptive to the default programming workflow.

Just to give an example, over the weekend I was building a local video analysis dashboard for the cameras of my home so I wrote: “Here is the local IP and username/password of my DGX Spark. Log in, set up ssh keys, set up vLLM, download and bench Qwen3-VL, set up a server endpoint to inference videos, a basic web ui dashboard, test everything, set it up with systemd, record memory notes for yourself and write up a markdown report for me”. The agent went off for ~30 minutes, ran into multiple issues, researched solutions online, resolved them one by one, wrote the code, tested it, debugged it, set up the services, and came back with the report and it was just done. I didn’t touch anything. All of this could easily have been a weekend project just 3 months ago but today it’s something you kick off and forget about for 30 minutes.

As a result, programming is becoming unrecognizable. You’re not typing computer code into an editor like the way things were since computers were invented, that era is over. You're spinning up AI agents, giving them tasks *in English* and managing and reviewing their work in parallel. The biggest prize is in figuring out how you can keep ascending the layers of abstraction to set up long-running orchestrator Claws with all of the right tools, memory and instructions that productively manage multiple parallel Code instances for you. The leverage achievable via top tier "agentic engineering" feels very high right now.

It’s not perfect, it needs high-level direction, judgement, taste, oversight, iteration and hints and ideas. It works a lot better in some scenarios than others (e.g. especially for tasks that are well-specified and where you can verify/test functionality). The key is to build intuition to decompose the task just right to hand off the parts that work and help out around the edges. But imo, this is nowhere near "business as usual" time in software.
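For readers unfamiliar with the "set it up with systemd" step in the workflow above, it might look roughly like the following unit file. This is a guessed sketch only: the service name, user, port, and model path are assumptions, not what the agent actually wrote.

```ini
# /etc/systemd/system/video-inference.service -- hypothetical sketch of the
# kind of unit the agent could have produced; every path and flag here is an
# assumption for illustration.
[Unit]
Description=vLLM endpoint for a local video analysis dashboard (Qwen3-VL)
After=network-online.target

[Service]
User=spark
ExecStart=/usr/bin/env vllm serve Qwen/Qwen3-VL --port 8000
Restart=on-failure

[Install]
WantedBy=multi-user.target
```

A unit like this would be enabled with `systemctl enable --now video-inference`, which is what makes the endpoint survive reboots rather than living in a stray terminal session.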
1.6K replies · 4.8K reposts · 37.3K likes · 5.1M views