Jordan Dotzel

263 posts

Jordan Dotzel

@AmishAcorns

Cornell PhD '25; CS BA '17; Neural Architect at Google; Ex?-Halo Pro; Eternal Halo Combine Kid

Ithaca, NY Katılım Ocak 2011

482 Takip Edilen1.1K Takipçiler

Jordan Dotzel@AmishAcorns·7 Eyl

@henryperschk @paulg Alcohol is the last to go. This is a refocusing on health and not dying

English

624

Henry Perschk@henryperschk·7 Eyl

@paulg As the inflation/recession belt gets tightened, things like drinks are the first things to go.

English

7.4K

Paul Graham@paulg·7 Eyl

Apparently people in the SF Bay Area have stopped drinking. I heard about one big social event that made half as much from drinks in 2025 as in 2024.

English

532

134

4.9K

843K

Jordan Dotzel@AmishAcorns·23 Ağu

@algobaker @dl_insider @jiawzhao Yeah I agree this seems strange. Token entropy and reasoning certainty don't seem like they're too related

English

191

algobaker@algobaker·23 Ağu

Thanks Jose. I think that's consistent with my question then. It doesn't seem like we separate word-choice uncertainty (doesn't affect correctness), and uncertainty in the actual correctness of the answer (should this digit be a 1 or a 2). Feels like it would steer one moreso down paths of using words that have fewer natural synonyms, because you will get higher confidence. Would be an interesting experiment to see

English

509

Jiawei Zhao@jiawzhao·22 Ağu

Introducing DeepConf: Deep Think with Confidence 🚀 First method to achieve 99.9% on AIME 2025 with open-source models! Using GPT-OSS-120B even without tools, we reached this almost-perfect accuracy while saving up to 85% generated tokens. It also delivers many strong advantages for parallel thinking: 🔥 Performance boost: ~10% accuracy across models & datasets ⚡ Ultra-efficient: Up to 85% fewer tokens generated 🔧 Plug & play: Works with ANY existing model - zero training needed (no hyperparameter tuning as well!) ⭐ Easy to deploy: Just ~50 lines of code in vLLM (see PR below) 📚 Paper: arxiv.org/pdf/2508.15260 🌐 Project: jiaweizzhao.github.io/deepconf joint work with: @FuYichao123 , xuewei_wang, @tydsh (see details in the comments below)

English

328

2.3K

463.6K

Jordan Dotzel@AmishAcorns·21 Ağu

@Dan_Jeffries1 It won't make people 10x productive because it will always be bottlenecked by those people, it will be 10x more productive than people though, and then 100x, and then 1000x

English

155

Daniel Jeffries@Dan_Jeffries1·21 Ağu

People thinking AI will end all the jobs are hallucinating worse than Max Tegmark on an acid trip. And one of the reasons is this: AI does not make people 10x more productive and it is not a magical fix to anything. It is simply another kind of intelligence that shifts the bottlenecks to other parts of workflow, namely problem composition/refinement; iteration; and verification. You may have a 10X speed up in writing code but you now also have a 10X slowdown in verification because you have to read all that code and troubleshoot it and bug fix it or you may be shipping with dozens of hidden security vulnerabilities and untested bugs. Again, speeding up in one area just means a slow down in another area. The bottleneck shifts. If AI produces an email you can verify it quickly. If it produces a novel you've got a lot of reading and an exponential increase in problems to find in that long text. With that idiotic book by Yud on the way, we are about at peak stupidity/doom about AI and all that remains is this generation's The Population Bomb, aka If We Build It Everyone Dies.

Rohan Paul@rohanpaul_ai

AI is not exacly creating “10x worker” yet. LLM adoption rose significantly 45.9% among US workers. But, macro productivity has not surged. U.S. Nonfarm business labor productivity rose 2.4% in Q2 2025 after a 1.8% drop in Q1 2025, a rebound rather than a regime change. The OECD’s July 2025 compendium similarly notes that generative AI’s impact is “not yet evident” in cross‑country productivity statistics. If LLMs had made typical workers “10x” faster, you would expect a much clearer macro signal by mid‑2025. Two mechanisms possibly reconcile high AI adoption with modest macro gains. 1st, usage is broad but shallow. Much of today’s use targets drafting, summarizing, and coding assistance rather than core transaction flows, and many teams have not redesigned processes or roles to capitalize on AI. Microsoft’s cross‑industry Randomized Controlled Trial shows behavior moving most where individuals can act unilaterally, like email, while coordination‑heavy activities stay fixed, which limits throughput gains. 2nd, there is a mismatch between what workers want automated and what current systems do well. A July-25 Stanford study mapping worker preferences to current technical capability finds large zones where deployments are either unwanted or not yet capable, which blunts realized ROI. Overall, Generative AI till now looks like a general‑purpose tech, which means the big payoff depends on complementary investments, workflow redesign, data plumbing, and trustworthy autonomy, not on chat windows alone.

English

105

150

161.2K

Jordan Dotzel@AmishAcorns·20 Ağu

@sjgadler @aidan_mclau @jxmnop Cerebras WSE3 is around 4T "lbs" but give it a year

English

270

Steven Adler@sjgadler·19 Ağu

@aidan_mclau @jxmnop Yup! and Moore's Law's infant now weighs 7.5 trillion pounds

English

2.6K

dr. jack morris@jxmnop·19 Ağu

*taps the sign*

Aidan McLaughlin@aidan_mclau

the task length an ai can reliably finish (conservatively) doubles every 7 months when i'm the age my mom was when she watched me graduate, ai will be able to do tasks that would take someone ~1000 millennia

English

161

7.7K

202.9K

5.4M

Jordan Dotzel@AmishAcorns·16 Ağu

@chiefbuidl @eigenron higher temp, more bit flips, more stochastic i never knew that's how it worked

English

289

Scott Dykstra@chiefbuidl·16 Ağu

@eigenron most people don’t realize temp is just adjusting the fan speed on the GPU cluster serving you

English

608

20.5K

eigenron@eigenron·15 Ağu

i was literally talking to this "LLM researcher" about setting temperature in LLMs and i asked you know why lowering or raising the temperature results in more deterministic or random outputs, right? and he said yeah it changes the way tokens are represented. boy wtf, people IN the fucking field have no idea about botzmann stats or even softmax. i'm gonna cry.

English

3.1K

392.6K

Jordan Dotzel@AmishAcorns·11 Ağu

@nicdunz @sama governments should probably start subsidizing AI access for broad benefits across society, investment may save money if done right

English

1.7K

nic@nicdunz·11 Ağu

@sama can we have all of it for free and unlimited please

English

394

27.9K

Sam Altman@sama·11 Ağu

we are considering giving a (very) small number of GPT-5 pro queries each month to plus subscribers so they can try it out! i like it too. but yeah if you wanna pay us $1k a month for 2x the input tokens feels like we should find a way to make that happen...

Mckay Wrigley@mckaywrigley

@sama fwiw gpt-5 pro is ridiculously good and i’d pay $1k/mo for it if you 2x’d the limit on input tokens for it

English

1.4K

274

1.3M

Jordan Dotzel@AmishAcorns·10 Ağu

@xwang_lk I feel we shouldn't expect LLMs to do math without reasoning, especially with current tokenizers Humans would make the same instant guess too but correct it through internal reasoning But if it makes this mistake with reasoning, that's sad at this point :/

English

132

Xin Eric Wang@xwang_lk·8 Ağu

Really? Still?

English

21.2K

Jordan Dotzel@AmishAcorns·10 Ağu

@jasonbotterill if you compare Grok 3 to Grok 4, I'm guessing there will be similar improvements soon for Imagine but Elon seems to be hyping it too much too quick

English

JB@JasonBotterill·10 Ağu

Not only is Grok Video really mid and doesn’t flow well, but the audio is always out of context and low quality. Why do I hear a train and some guy whistling?

Elon Musk@elonmusk

Made by @grok Imagine

English

2.6K

Jordan Dotzel@AmishAcorns·10 Ağu

@SamuelSurfboard where does it claim Muon optimizer btw?

English

Samuel@SamuelSurfboard·28 Tem

GLM-4.5 looks like an amazing model It's a Mixture of Experts model with native tool use by Z-AI. It's a hybrid reasoning model with 355B with 32B active at a time. Surprisingly, it uses the Muon Optimizer like Kimi K2 👀 And it performs very well in benchmarks too.

English

Jordan Dotzel@AmishAcorns·10 Ağu

@khoomeik it seems most people believe the experts have some human-semantic boundaries. they're experts but with no expertise we can (easily) recognize

English

865

Rohan Pandey@khoomeik·9 Ağu

the name “mixture of experts” and its consequences have been a disaster for research communication

tomie@tomieinlove

(Me): It's MoE model, two experts. One is aligned, and the other is misaligned. (VC investor): go on (Me): The kicker: it's a wearable. The aligned model sits on your right shoulder and has a halo. The misaligned model is on your left and has horns. (VC investor): oh my GOD

English

436

49.4K

Jordan Dotzel@AmishAcorns·7 Ağu

@suchenzang Polymarket resolves on lmarena without style control, and Google is winning there

English

507

Susan Zhang@suchenzang·7 Ağu

mixture of sycophantic experts vs mixture of gambling experts

English

300

22.7K

Jordan Dotzel@AmishAcorns·7 Ağu

@scaling01 This isn't about sentiment. GPT-5 falls under Gemini without style control, which is what resolves the market Gemini: 1471 GPT: 1462

English

2.9K

Lisan al Gaib@scaling01·7 Ağu

Markets disappointed by GPT-5 OpenAI getting crushed on Polymarket

English

638

73.9K

Jordan Dotzel retweetledi

Google DeepMind@GoogleDeepMind·5 Ağu

What if you could not only watch a generated video, but explore it too? 🌐 Genie 3 is our groundbreaking world model that creates interactive, playable environments from a single text prompt. From photorealistic landscapes to fantasy realms, the possibilities are endless. 🧵

English

814

2.6K

13.5K

3.7M

Jordan Dotzel@AmishAcorns·29 Tem

I thought the same originally but I'm not sure I'd consider residuals, norms, init, activations, optimizers, etc part of CV research. All the DL research at the time happened to be evaluated on CV, given the popularity of ImageNet / CIFAR, but none of that seems fundamentally CV focused

English

181

Aidan Clark@_aidan_clark_·28 Tem

@jxmnop Ignoring all the actual contributions like resnets and thinking hard about initialization and activations and all the actual counter examples one could give, the last phrase seems like the classic anti-MM error. Good luck getting a text model to tell me whether my shirt fits!

English

8.2K

dr. jack morris@jxmnop·28 Tem

very surprising that fifteen years of hardcore computer vision research contributed ~nothing toward AGI except better optimizers we still don't have models that get smarter when we give them eyes

English

1.2K

176.4K

Jordan Dotzel@AmishAcorns·21 Tem

there are scaling laws for everything, with respect to everything you can have a L(N,T) that gives loss with params and tokens, but you can scale either for the scaling law to be useful if you want want to keep compute-optimal though, you should scale both and keep a fixed token-param ratio

English

173

Charchit Sharma@charchits7·20 Tem

@cloneofsimo @navvye for scaling law to work, you need to scale the data as well, right?

English

642

Simo Ryu@cloneofsimo·20 Tem

What happens if you Q,K,V = mlp(x).split(3) instead of linear(x).split(3) ? Anyone tried this?

English

241

89.8K

Jordan Dotzel@AmishAcorns·14 Tem

@niccruzpatane they were out today in Mountain View too

English

172

Nic Cruz Patane@niccruzpatane·14 Tem

Tesla spotted validating near Monterey, California! Robotaxi is going to scale quicker than you think.

Nathan Garvey@NSgarv

Spotted near Monterey, CA on July 13th

English

102

1.3K

70.4K

Jordan Dotzel retweetledi

Taelin@VictorTaelin·5 Tem

Not consulting AI models in 2025 is medical malpractice. No other way to put it. I follow the field closely to know how to use it, and that saved my life. Not every patient does. People are dying today for misdiagnoses that o3 would get right 10 out of 10 times

Rohan Paul@rohanpaul_ai

this story is going wildy viral on reddit. ChatGPT flagged a hidden gene defect that doctors missed for a decade. ChatGPT ingested the patient’s MRI, CT, broad lab panels and years of unexplained symptoms. It noticed that normal serum B12 clashed with nerve pain and fatigue, hinting at a methylation block. Within months tingling eased and brain fog cleared. The primary physician reviewed the genetics report and agreed the variant unified the entire case. IMO, time has already come, taking a 2nd opinion from the best healthcare-AI model should be made part of medical code of practice. ------ reddit. com/r/ChatGPT/comments/1lrmom4/chatgpt_solved_a_10_year_problem_no_doctors_could/

English

484

44K

Jordan Dotzel@AmishAcorns·8 Haz

@OpenAI yeah this needs to be reverted or made optional better for ai girlfriends and language learning but worse for almost every other use case. the intonations and cadence have fallen into the uncanny valley so it actually sounds less natural

English

842

OpenAI@OpenAI·7 Haz

The updated Advanced Voice is also more effective at translation, persistently translating for multiple turns until being told to stop. help.openai.com/en/articles/68…

English

1.1K

307.6K

OpenAI@OpenAI·7 Haz

We launched an update to Advanced Voice to make it way more natural and effortless to talk to. Now available to all paid users in ChatGPT.

English

580

602

9.3K

1.5M

Jordan Dotzel@AmishAcorns·3 Haz

@growing_daniel @djcows james was recruiting for tesla at cvpr 2019 and was super nice organized a fun event where elon spoke