Marco Trombetti

944 posts


@marcotrombetti

Incurable optimist always in love with big ideas. Entrepreneur and investor. @translation @picampusrome

Italy · Joined March 2009
250 Following · 755 Followers
Marco Trombetti @marcotrombetti
@paulg Adding innovation to a job title, division or event is a natural response to a lack of innovation. Startups exist because non-innovative companies can’t be fixed.
English
0
0
0
3
Paul Graham @paulg
A rule of thumb that has served me well: Beware of anything with "innovation" in the name.
English
218
116
2K
92.7K
Marco Trombetti retweeted
Andrej Karpathy @karpathy
Three days ago I left autoresearch tuning nanochat for ~2 days on depth=12 model. It found ~20 changes that improved the validation loss. I tested these changes yesterday and all of them were additive and transferred to larger (depth=24) models. Stacking up all of these changes, today I measured that the leaderboard's "Time to GPT-2" drops from 2.02 hours to 1.80 hours (~11% improvement), this will be the new leaderboard entry. So yes, these are real improvements and they make an actual difference. I am mildly surprised that my very first naive attempt already worked this well on top of what I thought was already a fairly manually well-tuned project.

This is a first for me because I am very used to doing the iterative optimization of neural network training manually. You come up with ideas, you implement them, you check if they work (better validation loss), you come up with new ideas based on that, you read some papers for inspiration, etc etc. This is the bread and butter of what I do daily for 2 decades. Seeing the agent do this entire workflow end-to-end and all by itself as it worked through approx. 700 changes autonomously is wild. It really looked at the sequence of results of experiments and used that to plan the next ones. It's not novel, ground-breaking "research" (yet), but all the adjustments are "real", I didn't find them manually previously, and they stack up and actually improved nanochat. Among the bigger things e.g.:
- It noticed an oversight that my parameterless QKnorm didn't have a scalar multiplier attached, so my attention was too diffuse. The agent found multipliers to sharpen it, pointing to future work.
- It found that the Value Embeddings really like regularization and I wasn't applying any (oops).
- It found that my banded attention was too conservative (I forgot to tune it).
- It found that AdamW betas were all messed up.
- It tuned the weight decay schedule.
- It tuned the network initialization.

This is on top of all the tuning I've already done over a good amount of time. The exact commit is here, from this "round 1" of autoresearch. I am going to kick off "round 2", and in parallel I am looking at how multiple agents can collaborate to unlock parallelism. github.com/karpathy/nanoc…

All LLM frontier labs will do this. It's the final boss battle. It's a lot more complex at scale of course - you don't just have a single train.py file to tune. But doing it is "just engineering" and it's going to work. You spin up a swarm of agents, you have them collaborate to tune smaller models, you promote the most promising ideas to increasingly larger scales, and humans (optionally) contribute on the edges. And more generally, *any* metric you care about that is reasonably efficient to evaluate (or that has more efficient proxy metrics such as training a smaller network) can be autoresearched by an agent swarm. It's worth thinking about whether your problem falls into this bucket too.
Andrej Karpathy tweet media
English
974
2.1K
19.4K
3.6M
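The workflow described in the post above (propose a change, measure validation loss, keep the change only if it helps, then build on the result) can be illustrated with a toy greedy search. This is a minimal sketch and not the autoresearch tool itself: `autoresearch`, `toy_loss`, `perturb`, and the two hyperparameter names are hypothetical, and the quadratic "loss" stands in for a real training run. `relative_improvement` only reproduces the arithmetic behind the quoted ~11% figure.

```python
import random
from typing import Callable, Dict, Tuple

Config = Dict[str, float]


def autoresearch(
    baseline: Config,
    evaluate: Callable[[Config], float],
    propose: Callable[[Config, random.Random], Config],
    budget: int,
    seed: int = 0,
) -> Tuple[Config, float, int]:
    """Greedy loop: accept a proposed change only if it lowers the loss."""
    rng = random.Random(seed)
    best, best_loss = dict(baseline), evaluate(baseline)
    accepted = 0
    for _ in range(budget):
        candidate = propose(best, rng)   # next experiment builds on the best so far
        loss = evaluate(candidate)       # stand-in for a validation-loss measurement
        if loss < best_loss:             # keep only changes that actually help
            best, best_loss = candidate, loss
            accepted += 1
    return best, best_loss, accepted


def toy_loss(cfg: Config) -> float:
    """Hypothetical loss surface with its minimum at weight_decay=0.1, beta2=0.95."""
    return (cfg["weight_decay"] - 0.1) ** 2 + (cfg["beta2"] - 0.95) ** 2


def perturb(cfg: Config, rng: random.Random) -> Config:
    """Hypothetical proposal step: nudge one randomly chosen hyperparameter."""
    new = dict(cfg)
    key = rng.choice(sorted(new))
    new[key] += rng.gauss(0.0, 0.05)
    return new


def relative_improvement(before: float, after: float) -> float:
    """Fractional reduction, e.g. for a time-to-target metric."""
    return (before - after) / before


BASELINE: Config = {"weight_decay": 0.0, "beta2": 0.999}

if __name__ == "__main__":
    best, loss, accepted = autoresearch(BASELINE, toy_loss, perturb, budget=700)
    print(f"accepted {accepted} of 700 proposals, loss {toy_loss(BASELINE):.4f} -> {loss:.6f}")
    print(f"2.02 h -> 1.80 h is a {relative_improvement(2.02, 1.80):.1%} reduction")
```

A real setup would replace `toy_loss` with an actual training-and-evaluation run and `perturb` with an agent that reads earlier results before proposing the next change; the accept-if-better structure is the only part this sketch tries to show.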
Marco Trombetti @marcotrombetti
@paulg It’s time to write the modern version of “A Plan for Spam”
English
0
0
0
9
Paul Graham @paulg
I got a pointless email from someone. When I asked why he'd emailed me, he apologized and said that OpenClaw had sent it. That's a first. Wish it was the last, but it will presumably only become more common. Who knows how many other pointless emails I've already gotten this way?
English
231
60
2K
248.2K
Massimo @Rainmaker1973
You're on a 10 hour flight. Which seat would you choose and why?
Massimo tweet media
English
15.2K
512
8.5K
9M
Marco Trombetti retweeted
Peter W. Kruger @pwk
Did you already try Gemini 3 Flash? ⚡️ 🚨AUTOBENCH UPDATE! 🚨 Well, the new @GoogleDeepMind model dropped less than 48h ago and you can already find it ranked on #AutoBench, alongside 2 other very interesting models: @NVIDIAAI #Nemotron 3 Nano 30B and @allen_ai Olmo 3.1 31B Think. This is one of the really powerful features of our new 2.0 version of AutoBench. As new models are released, we can update existing benchmark runs with the newcomers (new models will be submitted with the same questions and evaluated by the same rankers as per the original run). 1/5
Peter W. Kruger tweet media
English
1
1
5
891
Marco Trombetti retweeted
Peter W. Kruger @pwk
🚨 AutoBench 1.0 – Run 4 is LIVE 📷
- 33 frontier models ranked (including GPT-5.1, Gemini 3 Pro, Grok 4.1, Kimi K2 Thinking, etc.)
- 21 ranking models
- 300+ fresh questions generated
- 220,000+ individual rankings
This is the most manipulation-resistant evaluation we’ve ever run. And yes… the winner is NOT who most people expected. 1/13
Peter W. Kruger tweet media
English
1
4
20
1.9K
Massimo @Rainmaker1973
Post below the first word you see.
Massimo tweet media
English
8.8K
535
5.2K
1.4M
Marco Trombetti @marcotrombetti
Ohh, @paulg's website is no longer available over plain HTTP; it has switched to HTTPS-only. I used to rely on the PG website in airplanes and airports: because it was HTTP, captive portals could intercept the request and show their login page without causing TLS errors. Now I have to fall back on captive.apple.com, which is one of the few endpoints still served over HTTP for exactly this purpose. The PG website, being static and without forms, was safe to serve over HTTP only. In my mind it was Paul declaring his love for speed and simplicity. I loved it. Today, from a performance standpoint, TLS 1.3 has removed the old latency penalty. With a single round-trip handshake (and session resumption or 0-RTT on repeat connections), plus optimizations that are HTTPS-exclusive (HTTP/2 multiplexing, QUIC/HTTP/3), HTTPS today is as fast as, or often faster than, HTTP.
English
0
0
3
82
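The captive-portal trick in the post above can be sketched in a few lines: request a known plain-HTTP page and check whether the expected body came back or something else (typically a login page) was substituted. This is a minimal sketch under stated assumptions: the path `/hotspot-detect.html` and the `Success` marker reflect how Apple's probe page is commonly reported to behave and are not taken from the post, and `classify_probe` and `probe` are hypothetical helper names.

```python
import urllib.error
import urllib.request

# Assumed probe endpoint and marker; only the host comes from the post above.
PROBE_URL = "http://captive.apple.com/hotspot-detect.html"
EXPECTED_MARKER = "Success"


def classify_probe(status: int, body: str) -> str:
    """Return 'online' if the probe response looks untouched, else 'captive'."""
    if status == 200 and EXPECTED_MARKER in body:
        return "online"
    # Any other status or an unexpected body suggests the request was intercepted.
    return "captive"


def probe(url: str = PROBE_URL, timeout: float = 5.0) -> str:
    """Fetch the plain-HTTP probe page and classify the result."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            body = resp.read(4096).decode("utf-8", errors="replace")
            return classify_probe(resp.status, body)
    except urllib.error.HTTPError:
        # The network answered with an error status instead of the probe page.
        return "captive"
    except (urllib.error.URLError, OSError):
        # No route, DNS failure, or timeout.
        return "offline"


if __name__ == "__main__":
    print(probe())
```

The probe has to use plain HTTP for the reason given in the post: a portal can rewrite or redirect an unencrypted request, whereas intercepting an HTTPS request produces a TLS error instead of a login page.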
Marco Trombetti @marcotrombetti
@paulg It is very painful. It is very sad. Very. But it is even more painful to know that this modern war has caused the death of more than 20,000 kids.
English
0
0
0
9
Marco Trombetti retweeted
Out of Context Human Race @NoContextHumans
I know that was the most embarrassing moment in his life
English
229
2.2K
50.8K
5.7M
Andrej Karpathy @karpathy
The hottest new programming language is English
English
1.9K
8K
63.3K
11.2M
Massimo @Rainmaker1973
If you had him, what would you name him?
Massimo tweet media
English
2.7K
183
3.1K
517K
Marco Trombetti @marcotrombetti
@Rainmaker1973 I often go there. There is a great restaurant nearby that I love: Pagnanelli. Castel Gandolfo is 4-5 degrees Celsius cooler than Rome. Natural AC. Air conditioning can save you a lot of money, but will make the world uglier.
Marco Trombetti tweet media
English
0
0
1
17
Massimo @Rainmaker1973
Castel Gandolfo is the historic summer retreat that has housed Popes for centuries but remained unused since Pope Francis chose not to reside there.
English
22
11
117
36.6K
Marco Trombetti retweeted
DVPS @dvps_ai
A special evening in Rome to talk about Physical AI and Europe’s role in shaping this new frontier. Partners from across Europe came together to present the DVPS project, and connect with key people from public institutions, embassies, industries, national & international media.
DVPS tweet media (4 images)
English
1
6
8
389
Massimo @Rainmaker1973
What do you call this?
Massimo tweet media
English
2.6K
133
1.6K
573.5K
Massimo @Rainmaker1973
What’s the right answer?
Massimo tweet media
English
1.7K
83
1.3K
809.2K
Marco Trombetti @marcotrombetti
@ItakGol Sinner is missing from this old photo because he is a self-built AGI
English
0
0
0
18
Massimo @Rainmaker1973
A considerable number of people get it wrong. Can you solve it?
Massimo tweet media
English
2.2K
96
1.2K
632.2K