
Afroz Mohiuddin
@afrozenator
@OpenAI, ex @Google, @AIAtMeta. Interested in Science, Psychology, Investing and generally everything. Good Thoughts, Good Words, Good Deeds.




Today, we are launching NextToken - a single place to build production-grade agents, apps, and analytics. Code-forward. Cloud-hosted. Zero setup. Extremely affordable. @nexttoken_co



hmm, I sort of disagree and I am bullish on TML. I think they really have the top talent I admire in the field, e.g. Jeremy and Sam for optimization, Songlin for attention, Lia for MoE, Andrew for FSDPv2, and a bunch more folks. It's just natural that it takes a while to publish good models:
- dpsk started publishing papers in 2023, even published dpskv2 (which I think was already amazing) in mid 2024, and nobody cared until dpskv3 and r1
- msh took 10+ months to deliver its first decent long-ctx model in 2023, was silent for the whole of 2024, and started catching up gradually in 2025
- qwen didn't become a much better model than llama until qwen2.5, in mid or late 2024, even though the lab had been there forever
It takes time to get infra and data done, but as long as you have good folks and principled ways of doing science and experiments, sooner or later scaling laws will pay back.


Charlie Munger: "Politicians are never so bad that you don't live to want them back." "You laugh, you young people, but you're going to live to wish that Nancy Pelosi and Donald Trump were immortal."








While at Meta, I worked on this optimizer wrapper (outer-step lookahead momentum) we're calling Snoo (arxiv.org/abs/2510.15830). You can use it with AdamW or Muon and see really strong scaling. Here's a plot where we ran it against (tuned) AdamW up to 1e23 training-FLOP scales. The "x"s in the plot are compute factors, i.e. the baseline needs "x" times more FLOPs to reach the same loss (instead of simply measuring in steps). We further established a medium-track WR on modded-nanogpt (github.com/KellerJordan/m…). With amazing co-authors (Dominik, Vishal, Michael).
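To make the idea concrete, here is a minimal PyTorch sketch of a generic outer-step lookahead-momentum wrapper around an inner optimizer such as AdamW or Muon. The class name, hyperparameters (k, outer_lr, outer_momentum), and the exact update rule are illustrative assumptions, not the paper's algorithm; see the arXiv link above for the actual method.

```python
import torch


class OuterLookaheadMomentum:
    """Illustrative sketch: wrap any inner optimizer with an outer
    lookahead-momentum step. Every k inner steps, the displacement of
    the fast weights from the slow weights feeds an outer momentum
    buffer, the slow weights step along that buffer, and the fast
    weights are reset to the slow weights. Hyperparameter names and
    the update rule here are assumptions for illustration only."""

    def __init__(self, inner_opt, k=10, outer_lr=0.7, outer_momentum=0.9):
        self.inner = inner_opt
        self.k = k
        self.outer_lr = outer_lr
        self.outer_momentum = outer_momentum
        self.step_count = 0
        # Slow weights and outer momentum buffers, one per parameter.
        self.slow = {}
        self.buf = {}
        for group in self.inner.param_groups:
            for p in group["params"]:
                self.slow[p] = p.detach().clone()
                self.buf[p] = torch.zeros_like(p)

    def zero_grad(self, set_to_none=True):
        self.inner.zero_grad(set_to_none=set_to_none)

    @torch.no_grad()
    def step(self):
        self.inner.step()  # regular inner (fast) update, e.g. AdamW or Muon
        self.step_count += 1
        if self.step_count % self.k != 0:
            return
        # Outer step: momentum over the fast-minus-slow displacement.
        for group in self.inner.param_groups:
            for p in group["params"]:
                delta = p.detach() - self.slow[p]
                self.buf[p].mul_(self.outer_momentum).add_(delta)
                self.slow[p].add_(self.buf[p], alpha=self.outer_lr)
                p.copy_(self.slow[p])  # reset fast weights to slow weights


# Usage sketch: wrap a tuned inner optimizer and call step() as usual.
model = torch.nn.Linear(512, 512)
inner = torch.optim.AdamW(model.parameters(), lr=3e-4, weight_decay=0.1)
opt = OuterLookaheadMomentum(inner, k=10, outer_lr=0.7, outer_momentum=0.9)
```

Because the outer step only touches parameters every k inner steps, a wrapper like this adds little per-step overhead and can sit on top of whatever tuned baseline (AdamW, Muon, etc.) is already in use.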

Google has reached a remarkable milestone not seen since the heyday of Bell Labs: 5 of its current/former employees are science Nobel laureates. This concentration of talent signals a major shift: fundamental discoveries are no longer confined to the halls of academia.

I've spent years pushing the boundaries of pretraining—first as lead author on PaLM, then as a lead contributor on Gemini pre-training. Now I'm at Reflection, building open-weight agentic models at the frontier from the ground up. Today we're announcing our Series B to accelerate this mission. What excites me most is the team of world-class researchers who are deeply bought into this mission and the opportunity to build a frontier lab from scratch. Pre-training at scale. RL at scale. Agentic reasoning. The full stack. It's rare to get the resources, the talent, and the mission aligned like this. If you're passionate about this mission and pre-training/RL at scale to advance the open frontier, join us on our ambitious journey! DM me. We’re hiring in SF, New York and London: reflection.ai/careers




