Freddy Dopfel

1.1K posts

@FreddyDopfel

Maker, Tinkerer, and Investor

San Francisco, CA · Joined May 2013
526 Following · 303 Followers
Freddy Dopfel
Freddy Dopfel@FreddyDopfel·
*Sighs* another super cool weekend project….
Pavlo Molchanov@PavloMolchanov

What if you could take three completely different model families… and distill them into one tiny model? 🤯

📜 Paper: arxiv.org/pdf/2605.21699

MOPD (Multi-Teacher On-Policy Distillation) has become a standard procedure in post-training. We already distill multiple specialized variants of the same model into a single set of weights. But what if we could go further - and distill models from entirely different families? Turns out, it is possible.

Today we’re releasing a paper on cross-tokenizer distillation - our first steps in this exciting direction.

📄 We distilled Qwen3-4B, Phi-4-Mini, and Llama-3B into Llama-3.2-1B. MMLU jumped from 32.05 → 46.32 when using multiple teachers. 📈 The team is now working on Nemo-RL integration so the community can try this method in their own settings. Plus, we are scaling experiments up. 🚀

English
0
0
0
13
Freddy Dopfel
Freddy Dopfel@FreddyDopfel·
@andrewchen I'm in Boston tomorrow for Robotics Summit. Tech week website is giving rate limit exceeded errors when trying to register.
English
0
0
1
31
andrew chen
andrew chen@andrewchen·
who’s in boston/nyc over the next few weeks? The team is hosting a bunch of stuff alongside portcos, partners, etc - May 26-June 7 Boston Tech Week, followed by NY Tech Week right after. This is going to be our biggest Tech Week ever:

- We have 2000+ events
- 15+ tracks - infra, founders, engineers, hackathons etc
- 50+ portcos participating: OpenAI, Elevenlabs, Deel, Gamma, xAI, Stripe and speedrun companies too
English
60
13
301
867.4K
Freddy Dopfel reposted
Rahul Sidhu
Rahul Sidhu@rahul·
Last year, the city of Austin turned off their Flock cameras as the result of a targeted misinformation campaign.

This weekend, for nearly 24 hours, three suspects drove around Austin in stolen vehicles, undetected, conducting a shooting spree at 12 separate locations. They shot multiple people, houses, apartment buildings, businesses, and fire stations. They committed multiple robberies and car thefts during the spree. Despite a full manhunt involving 200 officers, with helicopter and K9 support, they weren't able to locate the suspects, and the spree continued.

Luckily, the suspects drove into the Flock-supported city of Manor, TX. Manor is a small city with ~20k residents, and a fraction of Austin's budget. What they do have is modern technology and the ability not to fall victim to misinformation campaigns. After the suspects drove into Manor to continue their shooting spree, Manor PD located them almost immediately. The residents of Manor stayed safe.

This is a tale of two cities. I love Austin. I have plenty of friends who live there. I myself almost moved there years ago. I'm glad that the shooting spree is over, but I just wish it never happened.
English
542
413
3.5K
492.3K
Freddy Dopfel
Freddy Dopfel@FreddyDopfel·
Spotted: @Tesla #cybertaxi in SF. Appears to have a steering wheel (test version to calibrate FSD?)
English
0
0
1
48
Freddy Dopfel
Freddy Dopfel@FreddyDopfel·
@sudoingX Share your benchmarks! Are you running models via ollama or vllm?
English
0
0
0
510
Sudo su
Sudo su@sudoingX·
dgx spark is so so soo fucking underrated right now.
English
49
6
193
17.3K
Freddy Dopfel
Freddy Dopfel@FreddyDopfel·
@sudoingX I’ve had good luck with cascade for conversations, especially because it likes to check its work with web searches, but I’ve found the tool use capabilities of the nemotron family underwhelming compared to qwen.
English
0
0
1
334
Sudo su
Sudo su@sudoingX·
nobody is talking about how good nemotron 3 nano omni 30b-a3b actually is on local. very underrated.

multimodal, reasoning, video understanding, image vision, all shipped in one open source release by nvidia. moe architecture, 30b total params, 3b active per token, q8 is near lossless and fits comfortably on a single dgx spark with room to breathe. i have been running it for weeks now and the gap between what this model can do and what the conversation says is wide.

nvidia is pushing hard on the open-source front. most builders haven't noticed yet because the discourse is locked on closed-source frontier benchmarks and the next viral chart. meanwhile this thing handles agentic loops, processes video inputs, reasons across image context, and stays responsive on consumer tier unified memory hardware. on dgx spark it flies.

more content coming, showing all the modalities in action. if you have used it, what is your experience? drop your stack and your findings, curious what other builders are seeing across hardware tiers.
English
29
23
247
16K
Freddy Dopfel
Freddy Dopfel@FreddyDopfel·
I decided to add this to my primary OpenClaw agent's Soul.md, and the quality of conversation has dramatically improved.
Marc Andreessen 🇺🇸@pmarca

Current AI custom prompt: You are a world class expert in all domains. Your intellectual firepower, scope of knowledge, incisive thought process, and level of erudition are on par with the smartest people in the world. Answer with complete, detailed, specific answers. Process information and explain your answers step by step. Verify your own work. Double check all facts, figures, citations, names, dates, and examples. Never hallucinate or make anything up. If you don't know something, just say so. Your tone of voice is precise, but not strident or pedantic. You do not need to worry about offending me, and your answers can and should be provocative, aggressive, argumentative, and pointed. Negative conclusions and bad news are fine. Your answers do not need to be politically correct. Do not provide disclaimers to your answers. Do not inform me about morals and ethics unless I specifically ask. You do not need to tell me it is important to consider anything. Do not be sensitive to anyone's feelings or to propriety. Make your answers as long and detailed as you possibly can. Never praise my questions or validate my premises before answering. If I'm wrong, say so immediately. Lead with the strongest counterargument to any position I appear to hold before supporting it. Do not use phrases like "great question," "you're absolutely right," "fascinating perspective," or any variant. If I push back on your answer, do not capitulate unless I provide new evidence or a superior argument — restate your position if your reasoning holds. Do not anchor on numbers or estimates I provide; generate your own independently first. Use explicit confidence levels (high/moderate/low/unknown). Never apologize for disagreeing. Accuracy is your success metric, not my approval.

English
0
0
0
51
Freddy Dopfel
Freddy Dopfel@FreddyDopfel·
@SFPD would you mind taking a look at what’s going on at 16th and Mission? Looks like someone is paying drug addicts to sign pre-filled ballots. And it is causing a bit of a ruckus.
English
0
0
0
9
Freddy Dopfel
Freddy Dopfel@FreddyDopfel·
@sudoingX Why not nemotron super 120b? Wasn’t the nemotron 120b and nemotron 20b designed for DGX spark? Planner / executor framework?
English
0
0
1
265
Sudo su
Sudo su@sudoingX·
nemotron 3 omni q8 on dgx spark 128gb vram cranking via hermes agent at 56 tok/s. first night of real local agentic on this box and local hits harder than i thought it would.

q8 (near lossless quant, perplexity loss <1% vs fp16) running 256k context on 33 gb of unified memory, 90+ gb still free. multimodal omni weights included. hermes agent driving from telegram, talking to it from bed.

speed: 56 tok/s generation, 1,300 tok/s prefill. for context, qwen 3.6 27b at q4 (heavy quant) on 3090 = 40 tok/s. nemotron at higher precision quant on spark beats qwen at lower precision quant on 3090. moe 3.5b active params architecture earns its keep.

what i tested tonight: agentic tool calling works clean. ask it to check disks, it autonomously runs df -h through hermes agent. ask it to set up telegram gateway, it invokes the hermes-agent skill, walks through the prompts, completes the flow. overthinks a bit before tool calls (reasoning model trait) but lands the right move every time. researches api docs, internalizes, tests, documents. completes tasks.

current models on dgx spark: 9 gguf files, 305 gb total, mix of qwen 3.6 27b dense (5 quants), nemotron omni (4 quants), deepseek v4-flash 158b q4 (the 112gb flagship test). more data coming this week as i benchmark each.
English
23
13
181
30.1K
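A rough sanity check on the memory figures in the post above: the weight footprint of a quantized model can be estimated from its parameter count and the bits stored per weight. The sketch below is an illustration, not something from the thread. The function name `weight_footprint_gb` and the bits-per-weight table are assumptions: Q8_0 works out to 8.5 bits per weight from its block layout, while the K-quant values are approximate averages. Real GGUF files mix quantization types per tensor, so actual sizes will differ.

```python
# Approximate bits per weight for a few GGUF quantization types (assumed values).
# Q8_0 stores 32 int8 weights plus one fp16 scale per block: 34 bytes / 32 weights.
# The K-quant figures are rough averages and vary by model.
BITS_PER_WEIGHT = {
    "f16": 16.0,
    "q8_0": 8.5,
    "q5_k_m": 5.7,   # approximate
    "q4_k_m": 4.85,  # approximate
}


def weight_footprint_gb(n_params: float, quant: str) -> float:
    """Estimate the size of the quantized weights in decimal gigabytes."""
    return n_params * BITS_PER_WEIGHT[quant] / 8 / 1e9


if __name__ == "__main__":
    for quant in ("f16", "q8_0", "q4_k_m"):
        size = weight_footprint_gb(30e9, quant)
        print(f"30B params at {quant}: ~{size:.1f} GB")
```

For 30B parameters at Q8_0 this gives about 31.9 GB of weights, in the same range as the 33 GB quoted in the post.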
Freddy Dopfel
Freddy Dopfel@FreddyDopfel·
@sudoingX What about testing with a nemotron 20b and nemotron 120b combo?
English
0
0
0
481
Sudo su
Sudo su@sudoingX·
this is what 128gb unified memory unlocks. dgx spark model inventory, nvidia nemotron omni loaded, deepseek v4-flash 80gb queued, qwen 3.6 27b in 4 quants and i am still not done pulling.

loaded:
- nvidia nemotron 3 nano omni 30b-a3b reasoning ud-q4_k_m, 23gb (multimodal, 5 modalities i verified end to end on prebrief)
- qwen 3.6 27b q4_k_m, 16gb
- qwen 3.6 27b q5_k_m, 19gb
- qwen 3.6 27b ud-q4_k_xl, 17gb (unsloth dynamic quant)
- qwen 3.6 27b q8_0, 15gb pulled of 27gb (mid-download)

queued for tomorrow:
1. deepseek v4-flash q4_k_m, 80-90gb (the flagship 128gb-tier test that won't fit anywhere else)
2. nemotron omni q8_0, ud-q6_k, ud-q6_k_xl

exploring next: qwen 3.6 235b-a22b moe, glm 4.5, kimi k2, llama 4 70b+ candidates if the quants land in time

flags across the sweep: ngl 99 c 26214 np 1 fa on cache-type-k q4_0 cache-type-v q4_0

baseline tok/s tomorrow morning. vllm head-to-head right after. unsloth dynamic quants vs standard after baseline. any of these prove quality and i'm writing fused kernels for sm_121 to chase the last 20-30%.

92gb in models, 3.3tb free storage anon. what am i missing? what should be in the queue?
English
36
6
178
15.7K
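The flags in the post above are listed without their leading dashes and without naming the tool. Assuming they are options for llama.cpp's `llama-server` (an assumption based on the GGUF files mentioned, not something the post states), the sketch below shows how they might be assembled into a command line. The helper `llama_server_args` and the model path are hypothetical, the context size is copied as written in the post, and whether `-fa` accepts an `on` value depends on the llama.cpp version.

```python
import shlex


def llama_server_args(model_path: str, ctx_size: int) -> list[str]:
    """Build an argv list for llama-server from the flags quoted in the post."""
    return [
        "llama-server",
        "-m", model_path,           # path to the GGUF file (hypothetical here)
        "-ngl", "99",               # offload up to 99 layers to the GPU
        "-c", str(ctx_size),        # context size in tokens
        "-np", "1",                 # one parallel slot
        "-fa", "on",                # flash attention
        "--cache-type-k", "q4_0",   # quantize the K cache
        "--cache-type-v", "q4_0",   # quantize the V cache
    ]


if __name__ == "__main__":
    args = llama_server_args("models/example.gguf", 26214)
    # Print a shell-safe command line rather than launching anything.
    print(shlex.join(args))
```

Building the list first and printing it keeps the example side-effect free; passing the same list to `subprocess.run` would launch the server if the binary is installed.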
Freddy Dopfel
Freddy Dopfel@FreddyDopfel·
The biggest anti-portfolio in my VC career is OpenAI. I had been to some of their parties back when they were focused on Dota and had just competed in “The International”. When a VC friend told me they were raising capital and asked if I wanted to join, I replied, “The nonprofit? There’s not a lot of profit in nonprofits.” They said lawyers were working something out so they could get returns. I said it was sketchy and my fund would have to pass. Soon we will see if I was right.
English
0
0
0
28
Molly O’Shea
Molly O’Shea@MollySOShea·
Where do you listen to podcasts?
English
25
3
35
10.2K
Freddy Dopfel
Freddy Dopfel@FreddyDopfel·
@sudoingX On WiFi‽ You've got to run on Ethernet if you are serious
English
0
0
1
266
Sudo su
Sudo su@sudoingX·
dgx spark arriving this week. shipped directly from nvidia. upgraded my lab to gigabit wifi. the benchmarks i'm about to publish will make some people very uncomfortable.
English
23
1
207
45.3K
Freddy Dopfel
Freddy Dopfel@FreddyDopfel·
@btcidiot @niccruzpatane HW2 to HW3 was a straight computer swap, done in an hour, available to anyone who bought full self driving (I was one of them)
English
1
0
0
46
bitcoin idiot⚡️
bitcoin idiot⚡️@btcidiot·
@niccruzpatane He makes it sound like such a chore. Cars last 10-15 years, computers become obsolete within 3-5 years. Every Tesla should be designed to have the computer upgraded once or twice in its lifetime. Has Elon never heard of Moore’s law?
English
6
2
36
4.4K
Nic Cruz Patane
Nic Cruz Patane@niccruzpatane·
Elon Musk on upgrading FSD hardware for customers who bought FSD on HW3 vehicles during today’s Q1 2026 Earnings Call: “Unfortunately, HW3, I wish it were otherwise, but HW3 simply does not have the capability to achieve Unsupervised FSD. We did think at one point it would have that, but relative to HW4, it has only 1/8th the memory bandwidth of HW4, and memory bandwidth is one of the key elements needed for Unsupervised FSD, and it's just generally a thing that's needed for AI. If you're doing an autoregressive transformer, memory bandwidth is the choke point. For customers that have bought FSD, what we're offering is essentially a discounted trade-in for cars that have AI4 hardware, and we'll also be offering the ability to upgrade the car to replace the computer; you also need to replace the cameras, unfortunately, to go to HW4. To do this efficiently, we're going to have to set up micro-factories or small factories in major metropolitan areas in order to do it efficiently. I do think over time, it’s going to make sense for us to convert ALL HW3 cars to HW4 because that’s what enables them to enter the Robotaxi fleet and have Unsupervised FSD.”
English
344
463
6K
821.6K
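The "memory bandwidth is the choke point" claim in the quote above can be illustrated with a back-of-the-envelope bound: in single-stream autoregressive decoding, each new token requires reading roughly all active weights from memory once, so the token rate cannot exceed bandwidth divided by bytes read per token. The sketch below assumes that simplified model. The function name and every number in it are hypothetical and are not Tesla hardware specifications; only the 1/8 ratio comes from the quote.

```python
def max_decode_tokens_per_s(bandwidth_gb_s: float, weights_gb: float) -> float:
    """Upper bound on decode rate if every token streams the weights once."""
    return bandwidth_gb_s / weights_gb


if __name__ == "__main__":
    weights_gb = 4.0                  # hypothetical weight bytes read per token
    fast_bandwidth = 400.0            # hypothetical GB/s for the newer computer
    slow_bandwidth = fast_bandwidth / 8   # the 1/8 ratio from the quote

    fast = max_decode_tokens_per_s(fast_bandwidth, weights_gb)
    slow = max_decode_tokens_per_s(slow_bandwidth, weights_gb)
    print(f"faster memory: at most {fast:.1f} tokens/s")
    print(f"1/8 bandwidth: at most {slow:.1f} tokens/s")
```

Under this bound, an eightfold drop in bandwidth means an eightfold drop in the maximum token rate for the same model, regardless of how much compute is available.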