Cristiano Calcagno

1.3K posts

@ccrisccris

@fbinfer @rescriptlang Hanging out with @magnesCH

Joined June 2011
115 Following · 1.2K Followers
Cristiano Calcagno retweeted
Cheng Lou @_chenglou
My dear front-end developers (and anyone who’s interested in the future of interfaces): I have crawled through depths of hell to bring you, for the foreseeable years, one of the more important foundational pieces of UI engineering (if not in implementation then certainly at least in concept): Fast, accurate and comprehensive userland text measurement algorithm in pure TypeScript, usable for laying out entire web pages without CSS, bypassing DOM measurements and reflow
1.3K
8.3K
65.1K
23.6M
Cristiano Calcagno retweeted
Cheng Lou @_chenglou
I’m very happy to present my toy research project: Sotaku! It's a neural net that automatically discovered the rules of sudoku and learned to solve them, achieving a new state-of-the-art score of 98.9% on one of the hardest sudoku datasets, while being agnostic to the game, and beating all other sudoku-optimized neural net architectures* Read more for fun motivations, plus some extremely unconventional discoveries, e.g. reverse curriculum consistently beating curriculum (!), emergent reasoning-like capabilities, and the future of traditional programming
32
123
1.8K
198.8K
Cristiano Calcagno retweeted
Leonardo de Moura @Leonard41111588
AI is writing a growing share of the world's software. No one is formally verifying any of it. New essay: "When AI Writes the World's Software, Who Verifies It?" leodemoura.github.io/blog/2026/02/2…
41
246
1.6K
421.1K
Cristiano Calcagno @ccrisccris
Now that adding code is easy, removing code is the new superpower.
1
5
5
479
Cristiano Calcagno @ccrisccris
The @rescriptlang static analyzer is going incremental with the @skiplabs reactive combinators. Soon ReScript static analysis that updates in real time in the editor.
1
4
18
2.2K
Cristiano Calcagno retweeted
Anil Madhavapeddy @avsm
"Package Managers à la Carte, A Formal Model of Dependency Resolution" preprint out today: a new package calculus to describe the cambrian explosion of systems that exist today arxiv.org/pdf/2602.18602
0
15
54
3.1K
Cristiano Calcagno retweeted
Mark Kretschmann @mark_k
This is awesome to watch: @agenticasdk have solved all publicly available ARC-AGI 3 tasks (mini-games)! @arcprize It seems to work by generating bespoke program code for each puzzle. You can see it generate and progress in this video:
5
12
101
10.5K
Cristiano Calcagno retweeted
Lean @leanprover
The CSLib steering committee recently announced the official launch of CSLib — an open-source effort to formalize computer science in Lean, inspired by the impact of Mathlib in mathematics.
CS researchers, practitioners, and enthusiasts are invited to get involved to support formalizing essential computer science concepts, and building infrastructure for reasoning about real-world code with Lean.
Learn more at:
🌐 cslib.io
📄 White paper: arxiv.org/abs/2602.04846
🤝 Contribute: github.com/leanprover/csl…
#LeanLang #LeanProver #CSLib #OpenSource #FormalVerification
8
84
427
29.5K
Cristiano Calcagno retweeted
Neil Houlsby @neilhoulsby
🚨 New roles at Anthropic Zurich 🇨🇭 In addition to pre-training (where we've been hiring so far), post-training and security are joining and have open roles! It's a remarkable time in AI, the company, and on the site. job-boards.greenhouse.io/anthropic?offi…
22
35
668
73.3K
Cristiano Calcagno retweeted
Vinod Khosla @vkhosla
Well well… ARC-AGI-2 (François Chollet’s “hardest” benchmark) is starting to smell like toast. 🍞🔥 @agenticasdk just set a new SOTA: 85.28% with an Agentica agent (~350 lines) that writes & runs code. Best part: it’s not ARC-specialized—it's a general system that’s strong across other benchmarks too. Details at symbolica.ai/blog/arcgentica What benchmark should we throw at it next?
19
31
292
54.1K
Cristiano Calcagno retweeted
Sang Hyun Kim @kimshmath
arxiv.org/abs/2601.22401 The concluding remark from the introduction (I didn't write this part, but cannot agree more with this): "... we caution against overexcitement about its mathematical significance. (1/3)
3
6
33
5.4K
Cristiano Calcagno retweeted
Peter O'Hearn @PeterOHearn12
LLMs vs the Halting Problem. (Why, what, where going.) We recently released a paper on this; link to follow. A few comments here for context.
Why? With LLM "reasoning" excitement, we thought: why not try LLMs on the first ever code reasoning task, the halting problem. Turing's proof of undecidability established fundamental limits. Fun bit: no matter how "superintelligent" AI becomes, this is a problem it can never perfectly solve.
Where to get data to measure? SVCOMP. Verification researchers have, through their insight and hard work, curated several thousand example C programs. They run dedicated tools over this dataset in an annual competition. This is in a sense the home turf of symbolic. We didn't know how LLMs would do, and in particular were aware of results of @rao2z, @RishiHazra95 and others showing that LLMs trail symbolic on "easier" decidable problems (SAT, propositional planning).
The surprise: LLMs are competitive on halting, where they often trail on "easier" problems. Why? Hypothesis: LLMs are heuristic approximators; in undecidability, heuristic approximation isn't just a workaround, it's often the only way forward.
Broader context: Penrose claimed undecidability proved AI is impossible (but didn't show humans can solve the undecidable). Turning the tables: undecidability is an ideal target for heuristic LLMs. Instead of using "already crushed" logic problems to show LLM limits, let's look at uncrushed problems where LLMs might actually help.
4
12
55
5.4K
Cristiano Calcagno retweeted
Oren Sultan @oren_sultan
Can LLMs reliably predict program termination? We evaluate frontier LLMs in the International Competition on Software Verification (SV-COMP) 2025, directly competing with state-of-the-art verification systems. @AIatMeta @HebrewU @Bloomberg @imperialcollege @ucl @jordiae @pascalkesseli @jvanegue @HyadataLab @adiyossLC @PeterOHearn12 Paper: arxiv.org/pdf/2601.18987 Website: orensultan.com/llms_halting_p… 🧵👇 1/n
9
42
116
43.8K
Cristiano Calcagno retweeted
Patrick Ecker @ryyppy
@ccrisccris @n2parko Yeah, understanding how the code changed is one part. What we're trying to solve at @KetryxHQ is even more ambitious by establishing full traceability across the req / spec architecture that goes beyond git boundaries.
1
1
2
160
n2parko @n2parko
today we introduced Cursor Blame
it turns out it's useful to persist the "why" behind your code so your teammates can understand code lineage
...and future agents can use this context to make better decisions
57
44
1.2K
139K
Cristiano Calcagno retweeted
Brando Miranda @BrandoHablando
Papers! - VeriBench: End-to-End Formal Verification Benchmark for AI Code Generation in Lean 4: openreview.net/forum?id=rWkGF… 2/n
1
1
3
302