Paulo Salem

1.6K posts

@paulosalem

Doctor of Computer Science, Principal Data & Applied Scientist. I enjoy science, business, art, philosophy, and more. TinyTroupe's creator. Opinions are my own.

São Paulo · Joined April 2010

181 Following · 263 Followers

Pinned Tweet

Paulo Salem@paulosalem·29 Mar

🚀TinyTroupe v0.7.0 just released. Major new feature: vision support! 👀 Repo: github.com/microsoft/Tiny… Vision example: github.com/microsoft/Tiny…

Sao Paulo, Brazil 🇧🇷 English

165

Paulo Salem@paulosalem·29 Mar

@goktug_eth One thing that really helps is to build examples for us to share in the repo. There is only one using the new vision capability, so that would be particularly welcome as a PR.

English

goktug@goktug_eth·29 Mar

@paulosalem I really wanna contribute to TinyTroupe, is there a way? Can you send a DM?

English

Paulo Salem@paulosalem·25 Mar

@BlackHC This is very insightful, thanks for articulating it so well.

English

Paulo Salem retweeted

Andreas Kirsch 🇺🇦@BlackHC·18 Mar

A while back, Andrej Karpathy said "the app store will be replaced by generated, disposable software," and Amjad Masad predicted that the value of all application software will go to zero. I think this "ephemeral software hypothesis" is wrong, though, and I want to explain why:

English

391

34.1K

Paulo Salem retweeted

Rohan Paul@rohanpaul_ai·6 Mar

Citadel Securities published this graph showing a strange phenomenon. Job postings for software engineers are actually seeing a massive spike. Classic example of the Jevons paradox. When AI makes coding cheaper, companies actually may need a lot more software engineers, not fewer. When software is cheaper to build, companies naturally want to build a lot more of it. Businesses are now putting software into industries and tools where it was simply too expensive before. --- Chart from citadelsecurities .com/news-and-insights/2026-global-intelligence-crisis/

English

428

1.3K

9.9K

Paulo Salem retweeted

Bo Wang@BoWang87·3 Mar

Prof. Donald Knuth opened his new paper with "Shock! Shock!" Claude Opus 4.6 had just solved an open problem he'd been working on for weeks — a graph decomposition conjecture from The Art of Computer Programming. He named the paper "Claude's Cycles." 31 explorations. ~1 hour. Knuth read the output, wrote the formal proof, and closed with: "It seems I'll have to revise my opinions about generative AI one of these days." The man who wrote the bible of computer science just said that. In a paper named after an AI. Paper: cs.stanford.edu/~knuth/papers/…

English

155

1.9K

9.1K

1.4M

Paulo Salem@paulosalem·28 Feb

@fchollet Wise words.

English

François Chollet@fchollet·27 Feb

If you ever feel like you're late to the game, consider that in the 1890s many scientists thought physics as a field was completely solved (quote below is from Albert Michelson in 1894). On the front of intelligence science, it feels more like the 1870s. For the first time we have something that is starting to really work (however primitive it may be), which we can use as a springboard for the next few decades of discoveries.

English

101

1.1K

52.6K

Paulo Salem@paulosalem·7 Feb

A well-posed problem contains the seeds of its own solution?

Yann LeCun@ylecun

Hugo Duminil-Copin, French mathematician and 2022 Fields Medalist, told me he never participated in math competitions and was very bad at them. Innovative mathematics requires creativity, intuition, intense concentration, and long reflections, sometimes spread over several years. Good performance at a math olympiad merely tests fast problem-solving abilities. AI can do that nowadays. One of the big activities of a researcher, in mathematics and elsewhere, is not to answer questions but to ask the right questions.

Sao Paulo, Brazil 🇧🇷 English

154

Paulo Salem@paulosalem·2 Feb

🚀Just released TinyTroupe v0.6.0, now supporting GPT-5 (took more time and effort than I expected 🥵). It also includes some new examples, comparing simulations to real-world human behavior! Have fun! github.com/microsoft/Tiny…

Sao Paulo, Brazil 🇧🇷 English

160

Paulo Salem retweeted

Claude@claudeai·12 Jan

Introducing Cowork: Claude Code for the rest of your work. Cowork lets you complete non-technical tasks much like how developers use Claude Code.

English

2.6K

8.4K

86.9K

49.6M

Paulo Salem@paulosalem·27 Dec

@aakashgupta Admitting one's ignorance is often the best sign of wisdom. Socrates comes to mind. It's always refreshing to hear such comments by competent people, since most folks are just too scared to say "I don't know", sometimes even to themselves.

English

Aakash Gupta@aakashgupta·27 Dec

Andrej Karpathy literally built the neural networks running inside coding assistants. He taught the world deep learning at Stanford. He ran AI at Tesla. If he feels “dramatically behind” as a programmer… that tells you everything about where we are. The confession here is that raw intelligence and deep technical knowledge no longer guarantee mastery. The new stack isn’t about understanding transformers or writing elegant algorithms. It’s about orchestrating a zoo of stochastic systems that nobody fully controls. Karpathy’s list is revealing: agents, subagents, prompts, contexts, memory, modes, permissions, tools, plugins, skills, hooks, MCP, LSP, slash commands, workflows, IDE integrations. That’s 15+ new primitives that didn’t exist 18 months ago. Each one evolving weekly. The mental model problem is real. Traditional engineering gives you deterministic systems. You write code, it does exactly what you wrote. Now you’re managing entities that are “fundamentally stochastic, fallible, unintelligible and changing.” His “alien tool with no manual” framing is exactly right. We’re all reverse-engineering capabilities in real-time. The documentation is always out of date. The best practices from 3 months ago are already wrong. The magnitude 9 earthquake isn’t coming. It already hit. The aftershocks are the new normal.

Andrej Karpathy@karpathy

I've never felt this much behind as a programmer. The profession is being dramatically refactored as the bits contributed by the programmer are increasingly sparse and between. I have a sense that I could be 10X more powerful if I just properly string together what has become available over the last ~year and a failure to claim the boost feels decidedly like skill issue. There's a new programmable layer of abstraction to master (in addition to the usual layers below) involving agents, subagents, their prompts, contexts, memory, modes, permissions, tools, plugins, skills, hooks, MCP, LSP, slash commands, workflows, IDE integrations, and a need to build an all-encompassing mental model for strengths and pitfalls of fundamentally stochastic, fallible, unintelligible and changing entities suddenly intermingled with what used to be good old fashioned engineering. Clearly some powerful alien tool was handed around except it comes with no manual and everyone has to figure out how to hold it and operate it, while the resulting magnitude 9 earthquake is rocking the profession. Roll up your sleeves to not fall behind.

English

196

784

7.5K

826.9K

Paulo Salem@paulosalem·27 Dec

@coproduto It's an intriguing possibility. The main question that remains unclear to me is to what extent the corresponding formal specifications themselves would be more understandable than the underlying programs, beyond the classic FM domains (e.g., transition systems).

Portuguese

el hombre pulpo@coproduto·26 Dec

If you want just one thing to study in 2026: study formalization. Plenty of people have already figured out that, with ever larger volumes of code being produced ever faster, the only way to keep it under control is formalization. Study resources at the end of the thread.

Portuguese

1.8K

112K

Paulo Salem@paulosalem·27 Dec

@karpathy It's never been so exciting to program, at least since I was a kid - an entirely new ground to explore, who knows what wonders will be found, and so many chances to leave one's mark.

English

Andrej Karpathy@karpathy·26 Dec

English

2.6K

7.5K

55.9K

16.8M

Paulo Salem@paulosalem·21 Dec

As 2025 draws to a close, here's my (generated) comic strip summarizing the LLM revolution so far.

Sao Paulo, Brazil 🇧🇷 English

17K

Paulo Salem retweeted

Sujay@sujay_kapadnis·5 Dec

Ever wondered how Q, K, V matrices are constructed exactly, which leads to the pattern recognition, or how the actual "attention" is paid? Then you'll definitely love this post by @arpit_bhayani arpitbhayani.me/blogs/qkv-matr…

English

2.1K

Paulo Salem@paulosalem·17 Nov

@tuzhaopeng @UtopicDev @qingxuan_jiang @Mibonap I'd be super happy if anyone tries this on TinyTroupe (I'm the lead author)! Please let me know if you do so.

English

Zhaopeng Tu@tuzhaopeng·11 Nov

Thank you very much, and TinyTroupe sounds fascinating! Integrating the Moral RolePlay personas into your framework could offer a rich testing ground for simulating nuanced social dynamics and character fidelity. Looking forward to seeing what insights emerge from that adaptation!

English

Zhaopeng Tu@tuzhaopeng·10 Nov

Are safety-aligned LLMs too good to truly play villains? 🤖🎭😈 Introducing Moral RolePlay, a balanced dataset with 800 characters across 4 moral levels (Paragons → Flawed → Egoists → Villains), featuring 77 personality traits and rigorous scene contexts. This enables the first large-scale, systematic evaluation of moral persona fidelity in LLMs. 🔍 Key findings: 📉 Role-playing fidelity drops as character morality decreases — especially for egoists and villains. 🚫 Models fail most on traits like "Deceitful" and "Manipulative", due to safety alignment conflicts. ⚠️ General chatbot skills ≠ good villain acting. Top Arena models fall short on moral ambiguity. 🧠 Explicit reasoning doesn't help much — models still sanitize complex antagonism. ✨ This work reveals a critical limitation in current alignment approaches — models trained to be "too good" cannot authentically simulate the full spectrum of human psychology, limiting their utility in creative, educational, and social science applications. 📏 Benchmark: github.com/Tencent/Digita… 📃 Paper: arxiv.org/abs/2511.04962

English

179

31.2K

Paulo Salem@paulosalem·17 Nov

@tuzhaopeng @qingxuan_jiang @Mibonap Your work goes much deeper, but we found similar results on our TinyTroupe (a multiagent persona simulation toolkit) paper: arxiv.org/abs/2507.09788 Interestingly, we found it possible to partly correct some misalignments on-the-fly, particularly the more egregious ones.

English

Paulo Salem@paulosalem·24 Aug

Easily one of the most, if not the most, interesting places anywhere. The largest dig in a lifetime is under way in Pompeii economist.com/interactive/cu… From The Economist

Sao Paulo, Brazil 🇧🇷 English
