Santiago M.

1.5K posts

@sanmking

Independent researcher. Focused on Artificial Reasoning | X-AWS

Joined June 2024

232 Following · 141 Followers

Pinned Tweet

Santiago M.@sanmking·14 Feb

After 18 months, I’m happy to share my research, where I empirically demonstrate the limitations of language as a representation and mathematically formulate its inefficiencies compared to traditional learning algorithms. I was deeply inspired by the work of @ev_fedorenko

Santiago M.@sanmking·1d

x.com/i/article/2058…

Santiago M. retweeted

Rosmine@rosmine·6d

The launch was amazing, thank you so much everyone ❤️
- multiple companies reached out to request DFT training
- successful author said the model was incredible
- at least one donation offer that was not a scam
Now I'm getting ready to train the open weights model. I've figured out several tricks that are going to make the next model even better.
Huge shoutout to @brendanh0gan @sanmking @HrishbhDalal for providing feedback on early versions, and to @Algomancer for sponsoring this and other work.
They are all awesome and you should follow them immediately

Rosmine@rosmine

I fixed why LLMs write so poorly, and I have a demo to prove it.
Announcing Distribution Fine Tuning (DFT): A post training step that fixes LLM writing.
Model outputs fooled pangram on 100% of test cases

Santiago M.@sanmking·15 May

@DimitrisPapail Not necessarily, because somehow there’s a chance that Shakespeare was actually a monkey 🙈

Dimitris Papailiopoulos@DimitrisPapail·14 May

If a submission contains incontrovertible evidence that one author did not check the work of their co-authors, we can't trust anything in the paper. Right?

Thomas G. Dietterich@tdietterich

We have recently clarified our penalties for this. If a submission contains incontrovertible evidence that the authors did not check the results of LLM generation, this means we can't trust anything in the paper. 3/

Santiago M.@sanmking·14 May

@NCouriel I would swap GPT-3 for GPT-2. I disagree with LSTM. I like the inclusion of Word2Vec. Good list!

Naomi@NCouriel·12 May

The ultimate list of (in my opinion) the most important artificial intelligence papers in history:
1. Perceptron, Rosenblatt (1958)
2. Backpropagation, Rumelhart, Hinton & Williams (1986)
3. LSTM, Hochreiter & Schmidhuber (1997)
4. LeNet-5, LeCun (1998)
5. AlexNet, Krizhevsky, Sutskever & Hinton (2012)
6. Word2Vec, Mikolov (2013)
7. GANs, Goodfellow (2014)
8. Seq2Seq, Sutskever (2014)
9. Adam, Kingma & Ba (2014)
10. VGG, Simonyan & Zisserman (2014)
11. Batch Norm, Ioffe & Szegedy (2015)
12. ResNet, He et al. (2015)
13. AlphaGo, Silver et al., DeepMind (2016)
14. Attention is All You Need, Vaswani (2017) 👑
15. BERT, Devlin (2018)
16. GPT-3, Brown et al., OpenAI (2020)
17. DDPM (Diffusion), Ho et al. (2020)
18. DALL-E, Ramesh et al. (2021)
19. CLIP, Radford et al. (2021)
20. Chinchilla, Hoffmann et al., DeepMind (2022)
21. InstructGPT / RLHF, Ouyang et al. (2022)
22. LLaMA, Touvron et al., Meta (2023)
23. DeepSeek-R1, DeepSeek (2025), the first LLM published in Nature, pure reasoning with RL
Any of these will change the way you think about the field. If you want to understand AI from the academic side and not just from the hype, save this post 📚

Santiago M.@sanmking·12 May

@NegarEmpr @wenjie_ma @sewon__min @matei_zaharia How did you check for contamination? Did it improve on AIME? Cool result, I’ve seen the pattern applied for reasoning reusability.

Negar Arabzadeh@NegarEmpr·12 May

1/ Thrilled to introduce T³: a corpus for RAG over reasoning tasks, built from thinking traces. We show that, surprisingly, RAG can improve reasoning with the right corpus. RAG with Transformed Thinking Traces (T³) gains up to 43.9% on AIME 2025-2026. 🔗 arxiv.org/abs/2605.03344 🧵

Santiago M.@sanmking·12 May

@jasonlk What do you look for in a great hire? Could you share a resource, please?

Jason ✨👾SaaStr.Ai✨ Lemkin@jasonlk·12 May

Almost every important mistake I've made in the past 15+ years has been due to lowering the hire bar.
Directly or indirectly, it leads to chaos, slowdown, doubt, and confusing inputs

Santiago M.@sanmking·12 May

To learn more, you can visit the GitHub repo. The tentative topic for a research paper would be: "What Makes a Good Description? Measuring Geometric Faithfulness in Hierarchical Semantic Representations" github.com/sanmquin/AI/tr… Your thoughts are encouraged!

Santiago M.@sanmking·12 May

It all started when I was trying to assess the descriptions of PCA dimensions. The chart displays two of the most significant dimensions that correlate with @20vcFund performance. Optimal engagement clearly sits at provocative commentary about AI impact.

Santiago M.@sanmking·12 May

Can we evaluate the accuracy of a description? Many know about “reversion to the mean”: the undesirable property of LLMs that makes them sound generic. Perhaps geometry can provide a solution 🧵

Santiago M.@sanmking·12 May

And again I would push back: the point is not that the math is perfect, but to have a better layer of communication for the empirical phenomenon observed. My only point is that, hopefully, just as in software, mathematical proofs will stop being an obtrusive barrier and instead gradually become a more accessible formal language. I do hope that, at least in AI, more discoveries will start to emerge from theory, and not only from data. Math is powerful enough to simplify complex phenomena.

Alexander Terenin@avt_im·12 May

@sanmking I agree and believe that it is _completely_ mechanized, but I think this alone is not enough to hope for an automatic answer to many theoretical questions. The problem is that reality itself can be far too complex. Many of its mysteries will be far beyond both humans and AI.

Alexander Terenin@avt_im·11 May

It is one thing to be able to express your idea in mathematical language, and another thing completely to be able to prove it correct. It is easy to write down backprop for training a neural network. To this day, no-one I know of can prove it generically achieves low test loss.

Santiago M.@sanmking·12 May

@avt_im There are always exceptions, but my guess is that we will be surprised by how “mechanized” reality truly is! An example involving planning and creativity:

Santiago M.@sanmking

@decisionneurop The fact that creativity reuses many of the same mechanisms as planning gives a potentially verifiable representation of creativity!

Alexander Terenin@avt_im·11 May

@sanmking I agree in principle. But there are certain classes of behavior which are easy to see empirically, but extraordinarily hard to pin down theoretically. Even ASI may not be enough for that - the exact degree of S will likely matter.

Santiago M.@sanmking·11 May

@avt_im That is changing fast, and I expect it to be one of the first consequences of LLM expertise in math verification: the ability to formalize intuitions and communicate them in a standard language.

Alexander Terenin@avt_im·11 May

Mathematical proof is an extraordinarily high standard by which to judge success. So high, that almost no original ideas in machine learning are developed to that standard first. The only exception I can think of off the top of my head is Greg Yang's muP work.

Santiago M.@sanmking·11 May

@rolibosch Not without proactive interference.

Roli Bosch@rolibosch·11 May

Wouldn’t an intelligent system get smarter with more context instead of becoming incoherent?

Santiago M.@sanmking·11 May

@barrowjoseph That’s what makes it valuable. Art becomes manufacturing! Good luck.

Joe Barrow@barrowjoseph·11 May

@sanmking This survey is organic, free-range human effort. Just a labor of love, tbh.

Joe Barrow@barrowjoseph·11 May

Working on a survey of VLM-based OCR models, pretty notable uptick in releases in 2025, largely thanks to Qwen.

Santiago M.@sanmking·11 May

@BetaTomorrow @che_shr_cat

deep Manifold@BetaTomorrow·11 May

@sanmking @che_shr_cat Thanks. I have no idea how many articles there will be in this series, probably over 10. I try to write every week :) Meanwhile, please check out deepmanifo.ai. Looking forward to a good discussion.

Grigory Sapunov@che_shr_cat·11 May

Another beautiful work on geometry! 1/ Stop steering LLMs in straight lines. The Linear Representation Hypothesis is a useful lie, but it breaks down fast. Pushing activations across flat Euclidean space causes "teleportation" and diversity collapse. The real geometry is curved. 🧵

Santiago M.@sanmking·11 May

@Underfox3 Is that the most devastating blow to NVIDIA’s scientific supremacy you’ve seen? It could be the beginning of a confirmed superior architecture.

Underfox@Underfox3·11 May

This paper proposes CStencil, an iterative 2D stencil solver based on the Jacobi method on the Cerebras WSE-3, supporting both Star and Box stencil patterns of various orders. arxiv.org/pdf/2605.07954
