cOfDirac

47 posts

@cOfDirac

exploring synthetic a priori

Poincaré disk Joined January 2026

57 Following 1 Follower

cOfDirac@cOfDirac·48m

@cloneofsimo AGI doesn't need to do what Ramanujan did, but it should solve problems as he did.

Simo Ryu@cloneofsimo·14h

Ofc Demis is absolute goat no doubt but its kinda funny to think we pushed the goalpost to a degree that it takes Ramanujan to be considered AGI

NIK@ns123abc

🚨 Google DeepMind CEO Sir Demis Hassabis: “Today’s systems, are nowhere near [AGI]. Doesn’t matter how many Erdős problems you solve… I think it’s far, far from what a true invention or someone like a Ramanujan would have been able to do” it’s over for the Erdős hype

451

39.4K

cOfDirac@cOfDirac·18h

@ns123abc Finally some real talk. I've doubted Demis' authenticity previously so I'm glad to see something like this. We are not building towards AGI and we're far from it. x.com/i/status/20567…

Intology@intology

Can coding agents do research? We release NanoGPT-Bench, an internal eval we’ve used to test agents on an AI R&D problem with months of human progress Codex, Claude Code, Autoresearch recover only 9.3% of human progress, mostly tuning hyperparams & ignoring algorithmic research NanoGPT-Bench is built on the NanoGPT Speedrun, a popular LLM pretraining competition to minimize the training time of a GPT-2 style model. Existing human submissions constitute nearly 2 years of work. To control for dependencies and contamination in frontier models, we standardize evaluation to a 5-month window of world records. Evaluation is fully autonomous and end-to-end, with no human intervention or internet access. 🧵

1.2K

NIK@ns123abc·1d

147

208

2.8K

425K

cOfDirac@cOfDirac·20h

@Yuchenj_UW I agree that a title shouldn't decide the perception of your capability, but I do think it helps other people understand your role and responsibilities quicker. What would be best is a hierarchy where progression doesn't have to be sequential.

628

Yuchen Jin@Yuchenj_UW·1d

Tech industry spent decades building a title and leveling system. Greg brought the “Member of Technical Staff”, originally invented at Bell Labs, to OpenAI. It has been adopted by Anthropic, xAI, Thinky, and many AI startups. Young MTS can have huge impact. Alec created GPT for example. In a traditional system, he was just an “L4 software engineer”. Databricks AI recently started using MTS as well. I think this is a very positive change in Silicon Valley.

Yuchen Jin@Yuchenj_UW

Whoever invented “Member of Technical Staff” was a genius. It filters out Staff/Principal title-maxxers, protects engineering and research from corporate ladder brain, and leaves recruiters staring at LinkedIn like: “Is this person L4 or L7?” MTS is the best title. Happy to be MTS.

1.1K

286.1K

cOfDirac@cOfDirac·3d

@zackabrams @emollick I did. It is very impressive, but not overly surprising, and it doesn't detract from my previous point. In fact, I would say it is further evidence that current models are just getting better at pretending to be intelligent rather than actually moving towards AGI.

Zack Abrams@zackabrams·3d

@cOfDirac @emollick did you see the news from this week openai.com/index/model-di…

cOfDirac@cOfDirac·3d

@emollick not true. models still are unable to do very basic problems that reveal that their complex problem solving is a result of good pattern recognition as opposed to any sort of intelligence. we will not reach the latter by scaling the same architectures in the same training loops.

477

cOfDirac@cOfDirac·12 May

@Google @Android have you considered removing the backdoor that lets governments circumvent vpns and spy on people?

Google@Google·12 May

We’re rolling out new updates to make your everyday @Android experience even better, including: 🤳 Screen Reactions, so you can record yourself and your screen at the same time — without switching apps or setting up a green screen 📸 An improved Instagram experience in partnership with Meta, including ultra HDR video, Night Mode integrations, brand new tools in the Edits app and more 📴 New digital wellbeing tools, like Pause Point, to help you reclaim your time and use apps more mindfully 😀 Nearly 4,000 redesigned emoji 🤝 New features to make it even easier to switch to Android from another phone, so your passwords, photos, messages, favorite apps, contacts and even your homescreen travel with you 🛜 Expanded Quick Share compatibility, so you can easily share files with more types of devices #TheAndroidShow

141

189

2.3K

216.4K

cOfDirac@cOfDirac·6 May

@alex_whedon From my understanding, the attention itself is O(m^2), with m being the number of selected sparse tokens, m <= sqrt(n), and n the total token count, as opposed to actual linear attention, yes? Is the attention algorithm novel as well? If not, which one did you choose?
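
The arithmetic behind that question can be sketched as follows. This is only the reading stated in the question, not a confirmed description of SubQ: it assumes attention runs among m selected tokens with m = floor(sqrt(n)), and it ignores whatever the selection step itself costs. The function names are hypothetical. Under that assumption m^2 <= n, so the pairwise work grows at most linearly in n even though the kernel is quadratic in m.

```python
import math


def dense_attention_pairs(n: int) -> int:
    """Token pairs scored by standard full attention over n tokens."""
    return n * n


def sparse_attention_pairs(n: int) -> int:
    """Pairs scored if attention runs only among m selected tokens.

    Assumption for illustration: m = floor(sqrt(n)), the largest m
    that satisfies m <= sqrt(n).
    """
    m = math.isqrt(n)
    return m * m


for n in (1_000_000, 12_000_000):
    dense = dense_attention_pairs(n)
    sparse = sparse_attention_pairs(n)
    # sparse never exceeds n, so the ratio is at least n.
    print(f"n={n:,} dense={dense:.2e} sparse={sparse:,} ratio={dense / sparse:,.0f}x")
```

The ratio printed here is a property of the assumed m = floor(sqrt(n)) bound only; it says nothing about the speedups quoted in the announcement.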

Alexander Whedon@alex_whedon·5 May

Introducing SubQ - a major breakthrough in LLM intelligence. It is the first model built on a fully sub-quadratic sparse-attention architecture (SSA), And the first frontier model with a 12 million token context window which is: - 52x faster than FlashAttention at 1MM tokens - Less than 5% the cost of Opus Transformer-based LLMs waste compute by processing every possible relationship between words (standard attention). Only a small fraction actually matter. @subquadratic finds and focuses only on the ones that do. That's nearly 1,000x less compute and a new way for LLMs to scale.

1.5K

2.9K

23K

12.7M

cOfDirac@cOfDirac·5 May

@vikhyatk optimal transport is important though

170

vik@vikhyatk·4 May

signs that an AI researcher has llm psychosis: - random matrix theory - optimal transport - went to ayahuasca retreat - "I've been thinking a lot about Yoneda lately" - wife left him

460

56.2K

cOfDirac@cOfDirac·4 May

@HowToAI_ Reminds me of the universal weight subspace hypothesis arxiv.org/abs/2512.05117 and of neural thickets alphaxiv.org/abs/2603.12228

How To AI@HowToAI_·27 Apr

MIT proved every major AI model is secretly converging on the same "brain." It’s called the “platonic representation hypothesis,” and it’s one of the most mind-blowing papers you’ll ever read. You train a vision model purely on images. You train a language model purely on text. They use completely different architectures. They process completely different data. They should have completely different "brains." But as these models scale up, something impossible is happening. When researchers measure how they organize information, the mathematical geometry is identical. A model that only "sees" images and a model that only "reads" text are measuring the distance between concepts in the exact same way. The models are converging. The researchers named this after Plato’s Allegory of the Cave. Plato believed that everything we experience is just a shadow of a deeper, hidden, perfect reality. The paper argues that AI models are doing the exact same thing. They are looking at the different "shadows" of human data, text, images, audio. And they are independently discovering the exact same underlying structure of the universe to make sense of it. It doesn't matter what company built the AI. It doesn't matter what data it was trained on. As models get larger, they stop memorizing their specific tasks. They are forced to build a statistical model of reality itself. And there is only one reality to map. 2024, Arxiv

243

825

3.9K

295.9K

cOfDirac@cOfDirac·2 May

@fhuszar see I don't mind genuine grounding in my ideas but sometimes it's just picking at air for the sake of it

927

Ferenc Huszár@fhuszar·2 May

I noticed Claude has started to very methodically push back on every idea I discuss. In every response there are always caveats and the "the one thing I'd push back on" section. Oh my god, is this what it feels like to talk to me? Sorry, everyone.

620

29.4K

cOfDirac@cOfDirac·29 Apr

@CrumbsSpace @eliebakouch I'm not saying that our current approaches are useless; they're super useful. They're just not steps towards AGI.

spaceCrumbs@CrumbsSpace·29 Apr

@cOfDirac @eliebakouch True but "dumb" intelligence is also pretty useful enough to start capitalizing on AI. Similar to how you don't need AlphaZero to beat most humans at chess - stockfish running on a potato cpu will do. Agency > intelligence in the real world.

elie@eliebakouch·27 Apr

i might be very wrong here, but i don't think "no human data, no pre-training" is the right approach to get frontier models or scientific breakthroughs any time soon

Ineffable Intelligence@IneffableLabs

Introducing Ineffable Intelligence. Led by David Silver, we're assembling the best engineers and researchers in the world to make first contact with superintelligence. We’ll be solving the hardest problems in AI on the way. Come join us. ineffable.ai

299

72.5K

cOfDirac@cOfDirac·29 Apr

@dviolettchan Honestly, I should be. I have been a bit lazy about this, but there are a lot of avenues for it. I saw a pretty cool resource for some grants: nightingal3.github.io/blog/2026/04/1…

紫云@dviolettchan·28 Apr

@cOfDirac Maybe you can try applying for some micro‑funding programs from companies or government agencies. These programs don’t require a lengthy proposal and are far simpler than applying for something like an NSF.

紫云@dviolettchan·28 Apr

The trickiest cost of being an independent researcher may be conference registration and publication fees. This is especially painful for researchers in low-income countries. Nowadays, even remote registration for some top conferences can still cost $500-$1,000. If you are doing unpaid remote research without institutional support, you may end up paying a lot to publish your own work.

7.1K

cOfDirac@cOfDirac·28 Apr

@dviolettchan I would love to do some more classical ML research some day, but I've been entrenched in LLM research for so long that it's all I think about. And no, I've never had a good idea for finetuning; every idea I have involves training from scratch.

紫云@dviolettchan·28 Apr

@cOfDirac Maybe you should work on topics that are less expensive. Even API calls alone can be a huge cost in some cases, let alone LLM training.

cOfDirac@cOfDirac·28 Apr

@jino_rohit I would say that learning to read PTX is the only absolute requirement though if you hope to be able to do any serious profiling

cOfDirac@cOfDirac·28 Apr

@jino_rohit you can skip CUDA (but learn to read PTX), Triton is only good if you're willing to learn Gluon/TLX, TileLang and CuTe seem pretty nice if you wanna max numbers, helion seems great for cross platform and easy code. either way, they all do the job so pick your poison.

314

Jino Rohit@jino_rohit·28 Apr

cuda, triton, cutlass, cute, tilelang, thunderkittens, mojo, helion. so which one do you even learn at this point?

255

17.7K

cOfDirac@cOfDirac·28 Apr

@eliebakouch It's undeniable that LLMs currently produce impressive results, but there are tiny cracks on the surface that reveal they're the furthest thing from any sort of general intelligence. I think this is a fault of how we train them and the data we use.

cOfDirac@cOfDirac·28 Apr

@eliebakouch I can't speak for their approach but personally I have a strong feeling that we'll never get to AGI without moving past our current regime of shoveling data into models. It will never be enough, and it's debatable if it works at all.

213

cOfDirac@cOfDirac·26 Apr

@Lunexalith @pigeon__s yeah but high quality data is not easy to get

Lunexa@Lunexalith·25 Apr

@pigeon__s Isn't data set quality far more important than data set size?

1.7K

ρ:ɡeσn@pigeon__s·25 Apr

deepseek v4 pro which is 1.6T parameters was trained on 3T tokens less than FUCKING QWEN3-***0.6B*** its EXACTLY chinchilla but these days we go like 3000x chinchilla bro they need to up the dataset size badly (not saying qwen is gold standard though or anything but still) holy

Zephyr@zephyr_z9

Flash at 47, Max at 52 They encountered some serious issues while training V4 Max

38.5K

cOfDirac@cOfDirac·26 Apr

@sun_hanchi not everyone wants to go the SSI route

388

Hanchi Sun@sun_hanchi·26 Apr

If ur goal is AGI: 1. Is mHC’s sinkhorn AGI? 2. Is sqrt(softplus(•)) AGI? 3. Is HashMoE AGI? 4. Is using two sets of coefficients for muon AGI? How do they serve the purpose of AGI?

DeepSeek@deepseek_ai

🚀 DeepSeek-V4 Preview is officially live & open-sourced! Welcome to the era of cost-effective 1M context length. 🔹 DeepSeek-V4-Pro: 1.6T total / 49B active params. Performance rivaling the world's top closed-source models. 🔹 DeepSeek-V4-Flash: 284B total / 13B active params. Your fast, efficient, and economical choice. Try it now at chat.deepseek.com via Expert Mode / Instant Mode. API is updated & available today! 📄 Tech Report: huggingface.co/deepseek-ai/De… 🤗 Open Weights: huggingface.co/collections/de… 1/n

22K
