Carl A. Sagan

649 posts

Carl A. Sagan

@saganite

We are a way for the cosmos to know itself.

From whence I came · Joined April 2013
99 Following · 176 Followers
Carl A. Sagan
Carl A. Sagan@saganite·
@samswoora why on earth is this a question for "parents of boys"? why isn't this just parents in general? i am genuinely puzzled!
2
0
3
137
Samswara
Samswara@samswoora·
Parents of boys when do you start doing things like playing with the bigger legos and train tracks and stuff
29
0
67
6.4K
Andrew Ambrosino
Andrew Ambrosino@ajambrosino·
The Codex app is now live on Windows. The app runs both natively and in WSL, with integrated terminals for PowerShell, Command Prompt, Git Bash, or WSL. We also built the first Windows-native agent sandbox — using OS-level controls to block filesystem writes outside your working folder and prevent outbound network access unless you explicitly approve it. Plus: 7 new “Open in …” apps and 2 new Windows skills (WinUI + ASP.NET). Try it and tell us what you think.
142
153
1.7K
581.2K
Carl A. Sagan
Carl A. Sagan@saganite·
@TheZvi can we please use DoD? the name hasn't been legally changed.
0
0
0
161
Zvi Mowshowitz
Zvi Mowshowitz@TheZvi·
I look forward to reading the contract terms and hearing more because I know what this looks like, what it implies about how everything went down, and at least one major player in this (DoW, OpenAI or Anthropic) is very profoundly, blatantly lying to us.
Sam Altman@sama

Tonight, we reached an agreement with the Department of War to deploy our models in their classified network. In all of our interactions, the DoW displayed a deep respect for safety and a desire to partner to achieve the best possible outcome. AI safety and wide distribution of benefits are the core of our mission. Two of our most important safety principles are prohibitions on domestic mass surveillance and human responsibility for the use of force, including for autonomous weapon systems. The DoW agrees with these principles, reflects them in law and policy, and we put them into our agreement. We also will build technical safeguards to ensure our models behave as they should, which the DoW also wanted. We will deploy FDEs to help with our models and to ensure their safety; we will deploy on cloud networks only. We are asking the DoW to offer these same terms to all AI companies, which we think everyone should be willing to accept. We have expressed our strong desire to see things de-escalate away from legal and governmental actions and towards reasonable agreements. We remain committed to serve all of humanity as best we can. The world is a complicated, messy, and sometimes dangerous place.

41
41
824
63.2K
Carl A. Sagan
Carl A. Sagan@saganite·
@PaulFanson @Connormuldowney Surely there is someone out there tracking accuracy in NCAA tournament predictions every year right? How does he rate? Do you grade your own predictions each year?
0
0
0
23
Carl A. Sagan
Carl A. Sagan@saganite·
@sudoingX @KuittinenPetri didn't you say "KV cache quantization, ... none of that applied yet. this is baseline"? but those flags are literally kv cache quantization.
0
0
3
370
Sudo su
Sudo su@sudoingX·
full 262K context on 24 GB VRAM here. the flag that unlocks it: --cache-type-k q8_0 --cache-type-v q8_0 halves the KV cache. brought it from OOM to 22.4 GB with 2 GB spare on a single 3090. 113 tok/s. zero quality loss. this model only has 10 attention layers carrying KV cache (the other 30 are Mamba2 SSM with fixed memory). so quantizing the cache is basically free. full breakdown in the QT. interesting seeing 55 tok/s on the 8060S. AMD numbers on this model are worth documenting. x.com/sudoingX/statu…
2
4
43
2.9K
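The memory arithmetic behind the tweet is easy to sanity-check. A minimal sketch, where the 10 attention layers and 262K context come from the tweet, q8_0's ~1.0625 bytes/element comes from llama.cpp's 34-byte blocks of 32 values, and the head dimensions (8 KV heads × 128 head dim) are illustrative placeholders, not the model's published config:

```python
def kv_cache_gib(ctx, n_attn_layers, n_kv_heads, head_dim, bytes_per_elt):
    # Two cached tensors (K and V) per attention layer, each holding
    # ctx * n_kv_heads * head_dim elements.
    elems = 2 * n_attn_layers * ctx * n_kv_heads * head_dim
    return elems * bytes_per_elt / 2**30

# f16 cache: 2 bytes/element.
# q8_0 cache: blocks of 32 int8 values + one fp16 scale = 34 bytes
#             per 32 elements ~= 1.0625 bytes/element.
f16_gib  = kv_cache_gib(262144, 10, 8, 128, 2.0)     # 10.0 GiB
q8_0_gib = kv_cache_gib(262144, 10, 8, 128, 1.0625)  # 5.3125 GiB
print(f16_gib, q8_0_gib)
```

Whatever the exact head dimensions, the ratio is fixed: q8_0 is a bit over half of f16, and because only the 10 attention layers carry a KV cache (the Mamba2 SSM layers hold fixed-size state), quantizing the cache is cheap relative to the context it unlocks.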
Sudo su
Sudo su@sudoingX·
Qwen3.5-35B-A3B testing on single RTX 3090 and it flew. 112 tokens per second. zero tuning. default config. all 41 layers on GPU with 4GB VRAM to spare. for context: the 80B coder-next did 1.3 tok/s on this same card. needed two 3090s to hit 46 tok/s. this model just did 112 on one. same 3B active params. half the total weight. 19.7GB on disk instead of 45. the math was obvious but the result still caught me off guard. flash attention enabled itself automatically. KV cache quantization, expert offloading, thread tuning, none of that applied yet. this is baseline. full optimization breakdown and benchmark results dropping soon. if default settings do 112, i want to see where the ceiling is. exact hardware specs in the image below.
Sudo su@sudoingX

35B-A3B with 3B active params. same sparse activation as coder-next but smaller footprint. should fly on a single 3090. just published the full breakdown of coder-next on 2x 3090s. every config, every engine crash, every token. this one is next. x.com/sudoingX/statu…

22
24
441
103.3K
Dr. Green and White
Dr. Green and White@PaulFanson·
@Sheehan_Sports @StephenM_Brooks Similar to last year, the top 40 or so teams are stronger than they have been historically. This should translate into bigger spreads and therefore fewer first round upsets across the board. Sadly, I expect a fairly quiet first round.
1
0
1
275
Fireball sommelier
Fireball sommelier@Sheehan_Sports·
A question from our chat with @StephenM_Brooks that I want your thoughts on... What’s the MORE LIKELY outcome for MSU basketball in March Madness? 🤔
10
0
7
4.1K
Carl A. Sagan
Carl A. Sagan@saganite·
@tanay_mehta It will run pretty slowly, but you can run either unsloth/Qwen3-Coder-30B-A3B-Instruct-GGUF or unsloth/GLM-4.7-Flash-GGUF with expert offload with llama.cpp. Prefill will be really slow, but generation will be not too bad? I would try Q4_0 and Q3_K_M.
0
0
4
459
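A back-of-the-envelope check on why those recommendations fit the 32 GB RAM / 8 GB VRAM budget: Q4_0 packs 32 weights into 18 bytes (4.5 bits/weight), so a roughly 30B-parameter model lands well under 32 GB, with llama.cpp's expert offload keeping MoE tensors in RAM while the rest sits in VRAM. The 30e9 parameter count here is a round-number assumption for the 30B-A3B model, not its exact size:

```python
def gguf_gib(n_params, bits_per_weight):
    # Approximate on-disk / in-memory size of a quantized GGUF.
    return n_params * bits_per_weight / 8 / 2**30

# Q4_0: blocks of 32 weights stored as 16 bytes of 4-bit values plus a
# 2-byte fp16 scale => 18 bytes / 32 weights = 4.5 bits per weight.
q4_size = gguf_gib(30e9, 4.5)
print(round(q4_size, 1))  # ~15.7 GiB, comfortably inside 32 GB of RAM
```

The same function with ~3.4 bits/weight gives a rough feel for Q3_K_M; either way the bottleneck is RAM bandwidth during prefill, which matches the "prefill will be really slow" caveat above.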
Tanay Mehta
Tanay Mehta@tanay_mehta·
What's the best local coding LLM you can fit in 32 GB RAM and 8 GB VRAM and one that can do decent coding?
34
2
82
14.6K
poppopohyeah
poppopohyeah@poppopohyeah·
@RichardHanania Serious question: is this a regular tactic, or something they started doing in MN, where they were being followed, sworn at, whistled at, hotels were stalked, and "noise protests" held at night so they couldn't sleep?
2
0
5
482
Richard Hanania
Richard Hanania@RichardHanania·
“His was not an isolated experience. Among nearly 100 sworn statements filed in federal court on Friday are more than a dozen accounts like Mr. Woo’s, in which federal agents deployed to Minnesota singled out protesters, finding the addresses of their homes and showing up there.” Reforming or dismantling ICE is the number one civil liberties issue. If you claim to care about individual liberty and don’t prioritize this, you don’t deserve to be taken seriously. nytimes.com/2026/02/13/us/…
57
514
1.2K
36.1K
Carl A. Sagan
Carl A. Sagan@saganite·
@TheStalwart The counter argument is that you only have a few years to acquire capital before the value of human labor goes to zero, and being early with these tools is the best way to do it.
0
0
0
66
Joe Weisenthal
Joe Weisenthal@TheStalwart·
I do loathe all the stuff about how you have to use AI or you're going to be left behind. If it's going to be that disruptive, then there's probably not much that you can do. And also there's no skill involved in using it. There's no learning curve. Can always pick it up later.
192
174
3.4K
360.7K
Carl A. Sagan
Carl A. Sagan@saganite·
@charles_irl i'm doing some work to try to make this situation better. stay tuned!
0
0
1
9
Charles 🎉 Frye
Charles 🎉 Frye@charles_irl·
@saganite less experience w vLLM and Eagle together, but those last two points very much align with our experience!
1
0
0
18
Charles 🎉 Frye
Charles 🎉 Frye@charles_irl·
apart from all of this work, a user asked if we could also run gpt-oss-20b fast so i fucked around for an hour (mostly watching kernels compile, RIP) and boosted the output tok/s/user from ~100 to >250 github.com/modal-labs/mod…
Charles 🎉 Frye@charles_irl

There was a flippening in the last few months: you can run your own LLM inference with rates and performance that match or beat LLM inference APIs. We wrote up the techniques to do so in a new guide, along with code samples. modal.com/docs/guide/hig…

16
15
259
29.5K
Carl A. Sagan
Carl A. Sagan@saganite·
@VibeCoderOfek @charles_irl this is with a model fine tuned on my dataset, but basically i find that roughly 4 correctly predicted tokens gives a 2.5x speedup for generation in sglang (and basically nothing in vllm). these measurements are for single generation, obviously batching makes it less
0
0
2
13
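The "~4 accepted tokens gives 2.5x" observation follows from a simple cost model for speculative decoding: each verification step emits the accepted tokens in a single target-model pass, at the price of first running the cheap draft head. A sketch with placeholder overhead numbers (the 5-token draft length and ~12% per-token draft cost are assumptions chosen for illustration, not measurements):

```python
def spec_decode_speedup(accepted_per_step, draft_len, draft_cost):
    # Tokens produced per verification step, divided by that step's cost
    # in target-forward-pass equivalents: one verify pass plus draft_len
    # draft passes, each costing draft_cost of a target pass.
    return accepted_per_step / (1.0 + draft_len * draft_cost)

# ~4 accepted tokens per step (as in the tweet), 5 drafted tokens at
# ~12% of a target pass each: 4 / 1.6 ~= 2.5x.
print(spec_decode_speedup(4, 5, 0.12))
```

The model also hints at why batching shrinks the gain, as noted above: at larger batch sizes the verify pass stops being latency-bound, so its effective relative cost rises and the ratio compresses toward 1.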
Ofek Shaked
Ofek Shaked@VibeCoderOfek·
@saganite @charles_irl the problem of eagle hurting vllm throughput on many models. which sglang configs gave you the consistent 2-3x? batch size sweet spot and tensor parallelism details would be gold for prod scaling.
1
0
0
20
Carl A. Sagan
Carl A. Sagan@saganite·
@charles_irl I've done testing with both frameworks with quite a few models and I have found: - vllm hardly benefits from eagle at all, and is often SLOWER - most eagle models out there are actually not optimal - but with a good model, sglang can speed up gen 2-3x
2
0
1
26
Charles 🎉 Frye
Charles 🎉 Frye@charles_irl·
@saganite my benchmarking was pretty limited! it was modestly faster on a few test prompts. but we found a much faster config w sglang a day or two later. should prob open source that too!
1
0
0
22
Carl A. Sagan
Carl A. Sagan@saganite·
@itsAntWright better than kohler on the perimeter? isn't kohler the #1 3pt shooter in the big ten (in big ten play)?
1
0
3
720
ᗩᑎT ᗯᖇIGᕼT
ᗩᑎT ᗯᖇIGᕼT@itsAntWright·
Washington vs. Michigan State Michigan State comes into Seattle favored by about 3.5 points while advanced metrics say Michigan State by around five. Washington is 10–7 overall, but more importantly 7–2 at home, and there are really fun matchup variables here. The Frontcourt This game starts inside. Jaxon Kohler and Carson Cooper vs. Hannes Steinbach and Franck Kepnang is already a fun contrast of styles, but the bigger development is that Washington will have Jacob Ognacevic for just the second time all season. I’ve been waiting on Ognacevic’s return all year, and he looked awesome in his first game against Michigan with ten efficient points in 16 minutes.. looked comfortable and physical right away. He adds shooting and toughness, and more importantly, flexibility. Think of him as a poor man’s Jaxon Kohler.. not as polished on the block, but a little better on the perimeter.. That matters because Washington can now change the spacing and look of their frontcourt minutes instead of playing limited. Of course, the centerpiece is still Hannes Steinbach. He’s public enemy No. 1 for Michigan State: - Legit Top-15 NBA Draft pick - Lives inside 15 feet - Elite touch, crafty finisher - One of the best offensive rebounders in the country. Franck Kepnang complements him perfectly.. 6’11” 260, high motor, constant energy and nastiness. Add Ognacevic off the bench, and suddenly Washington has multiple ways to attack you inside. Why Michigan State Is Still Favored Michigan State is built for this kind of game.. They own one of the better rim defenses in the country, and that’s critical when facing a team whose offense revolves around paint scoring and second chances.. MSU’s ability to wall off the rim and force one-and-done possessions is a major reason the line sits where it does. Biggest question of the night: Can Washington still find a way to score at the rim against a defense designed to take it away? The Backcourt Chess Match The guard play adds another layer.. 
Jeremy Fears vs. Quimari Peterson and Zoom Diallo will dictate pace and shot quality. Quimari isn’t a rim-pressure guard, he’s a volume jump shooter who can get looks from anywhere.. Diallo brings speed, scoring, and athleticism.. his shooting is his weakness, but he loves to score at the 2nd level and provide rim pressure. Wesley Yates returned in Washington’s last game after about a month out and didn’t play particularly well, which isn’t surprising, since he was rusty.. That first game back is about feeling the lung burn again, something you just can’t simulate in practice.. He’s still only one game removed from a hand injury to his strong hand, but he should be better here tonight. Yates adds another dimension: A player who’s floated on and off NBA draft boards, someone defenses can’t ignore, which benefits the frontcourt guys like Hannes big time. For Michigan State, this is a game where they’ll need another backcourt contributor, whether that’s Divine, Fort, or Jordan Scott, to help shoulder the load. Defensively, I like Fears switching onto Zoom Diallo at times.. if foul trouble becomes an issue, you can hide him on Quimari, who doesn’t apply rim pressure, or on true freshman Mandaquit, who’s very passive offensively. Should see a tighter and more cohesive Washington rotation with Desmond Claude being out.. it's being framed as a loss, but I don’t necessarily see it that way. Earlier in the season, Washington had too many options, too many decision variables, no real glue pieces, only one basketball. Sometimes less is more.. a tightened rotation simplifies reads, and may actually smooth things out offensively. This game comes down to Michigan State being consistent and physical.. they wear teams down with defense, rebounding, and discipline. Can Washington score at the rim against one of MSU’s biggest strengths? Can Washington make perimeter shots when so much focus is on the Washington frontcourt? 
Specifically Quimari Peterson, a shooter who can get 3's up any way he wants, and Jacob Ognacevic, who is more of a strict spot-up threat but still dangerous at all levels.. two very different shooters, and either one could swing the game. If Washington is stalled inside or can't outperform their shooting averages, Michigan State’s profile wins out.. If they can consistently do one of the two, this becomes a real test, especially in Seattle
11
3
129
51.3K
Carl A. Sagan
Carl A. Sagan@saganite·
@charles_irl would love to find out that eagle actually works great in vllm and this is just a skill issue on my part, but that didn't seem to work with your demo code.
1
0
0
16
Carl A. Sagan
Carl A. Sagan@saganite·
@vikhyatk Isn't this exactly the kind of thing agents can do trivially? They might fuck it up 20 times but if you have decent perf tests, accuracy tests, stress tests, and a good review model with instructions it takes like 20 mins of your time to prompt it and review the code?
1
0
1
193
Carl A. Sagan
Carl A. Sagan@saganite·
@swyx @METR_Evals @joel_bkr If I want to do anything with style or taste I use claude. If I want to do anything that it would be difficult for me to do myself, I use codex.
0
0
0
29
Carl A. Sagan
Carl A. Sagan@saganite·
@swyx @METR_Evals @joel_bkr And again, in my own experience I use both models regularly, and often compare them on the same task, and codex wins most of the time.
1
0
0
25
swyx
swyx@swyx·
evals should be validated by vibes. i think not enough people give sufficient credit to @METR_Evals (@joel_bkr et al) for clearly identifying/quantifying the Opus 4.5 outperformance. on paper, GPT 5.2 Thinking outperforms Opus 4.5 by 55.6 vs 52% on SWE Bench Pro. in practice METR's long evals benchmark, while getting increasingly sparse in the long tail, clearly called out the huge jump that many devs are now experiencing a month later. in fact it is such an outlier that the curve fit was probably wrong/needs to be restarted as a new epoch. do see his @aiDotEngineer talk on the eval youtu.be/RhfqQKe22ZA?si… and we are releasing his 2hr longer workshop on how it works next week as our last release of AIE CODE before we prep for AIE Europe.
34
19
225
24K