Cristiano Calcagno

1.3K posts

@ccrisccris

@fbinfer @rescriptlang Hanging out with @magnesCH

Joined June 2011

115 Following · 1.2K Followers

Cristiano Calcagno retweeted

Singular Prism@singular_prism·29 Mar

Let's go! Play pretext breaker 🎮 pretext-breaker.netlify.app

Cheng Lou@_chenglou

My dear front-end developers (and anyone who’s interested in the future of interfaces): I have crawled through depths of hell to bring you, for the foreseeable years, one of the more important foundational pieces of UI engineering (if not in implementation then certainly at least in concept): Fast, accurate and comprehensive userland text measurement algorithm in pure TypeScript, usable for laying out entire web pages without CSS, bypassing DOM measurements and reflow

430

4.4K

420.2K

Cristiano Calcagno retweeted

Cheng Lou@_chenglou·28 Mar

1.3K

8.3K

65.1K

23.6M

Cristiano Calcagno retweeted

Cheng Lou@_chenglou·14 Mar

I’m very happy to present my toy research project: Sotaku! It's a neural net that automatically discovered the rules of sudoku and learned to solve them, achieving a new state-of-the-art score of 98.9% on one of the hardest sudoku datasets, while being agnostic to the game, and beating all other sudoku-optimized neural net architectures* Read more for fun motivations, plus some extremely unconventional discoveries, e.g. reverse curriculum consistently beating curriculum (!), emergent reasoning-like capabilities, and the future of traditional programming

123

1.8K

198.8K

Cristiano Calcagno retweeted

Leonardo de Moura@Leonard41111588·3 Mar

AI is writing a growing share of the world's software. No one is formally verifying any of it. New essay: "When AI Writes the World's Software, Who Verifies It?" leodemoura.github.io/blog/2026/02/2…

246

1.6K

421.1K

Cristiano Calcagno@ccrisccris·26 Feb

Real-Time project-wide analysis now in ReScript rescript-lang.org/blog/reactive-…

2.1K

Cristiano Calcagno@ccrisccris·26 Feb

Now that adding code is easy, removing code is the new superpower.

479

Cristiano Calcagno@ccrisccris·25 Feb

@StatisticsFTW @rescriptlang @skiplabs rescript-lang.org/blog/reactive-…

Robert Balicki (👀 @IsographLabs)@StatisticsFTW·18 Dec

@ccrisccris @rescriptlang @skiplabs Very cool! Is there more written about this somewhere?

136

Cristiano Calcagno@ccrisccris·18 Dec

The @rescriptlang static analyzer is going incremental with the @skiplabs reactive combinators. Soon ReScript static analysis that updates in real time in the editor.

2.2K

Cristiano Calcagno retweeted

Anil Madhavapeddy@avsm·24 Feb

"Package Managers à la Carte, A Formal Model of Dependency Resolution" preprint out today: a new package calculus to describe the cambrian explosion of systems that exist today arxiv.org/pdf/2602.18602

3.1K

Cristiano Calcagno retweeted

Mark Kretschmann@mark_k·23 Feb

This is awesome to watch: @agenticasdk have solved all publicly available ARC-AGI 3 tasks (mini-games)! @arcprize It seems to work by generating bespoke program code for each puzzle. You can see it generate and progress in this video:

101

10.5K

Cristiano Calcagno retweeted

Lean@leanprover·22 Feb

The CSLib steering committee recently announced the official launch of CSLib — an open-source effort to formalize computer science in Lean, inspired by the impact of Mathlib in mathematics. CS researchers, practitioners, and enthusiasts are invited to get involved to support formalizing essential computer science concepts, and building infrastructure for reasoning about real-world code with Lean. Learn more at: 🌐 cslib.io 📄 White paper: arxiv.org/abs/2602.04846 🤝 Contribute: github.com/leanprover/csl… #LeanLang #LeanProver #CSLib #OpenSource #FormalVerification

427

29.5K

Cristiano Calcagno retweeted

Dimitris Papailiopoulos@DimitrisPapail·19 Feb

x.com/i/article/2024…

192

1.6K

492.5K

Cristiano Calcagno retweeted

Neil Houlsby@neilhoulsby·12 Feb

🚨 New roles at Anthropic Zurich 🇨🇭 In addition to pre-training (where we've been hiring so far), post-training and security are joining and have open roles! It's a remarkable time in AI, the company, and on the site. job-boards.greenhouse.io/anthropic?offi…

668

73.3K

Cristiano Calcagno retweeted

Vinod Khosla@vkhosla·12 Feb

Well well… ARC-AGI-2 (François Chollet’s “hardest” benchmark) is starting to smell like toast. 🍞🔥 @agenticasdk just set a new SOTA: 85.28% with an Agentica agent (~350 lines) that writes & runs code. Best part: it’s not ARC-specialized—it's a general system that’s strong across other benchmarks too. Details at symbolica.ai/blog/arcgentica What benchmark should we throw at it next?

292

54.1K

Cristiano Calcagno retweeted

Sang Hyun Kim@kimshmath·4 Feb

arxiv.org/abs/2601.22401 The concluding remark from the introduction (I didn't write this part, but cannot agree more with this): "... we caution against overexcitement about its mathematical significance. (1/3)

5.4K

Cristiano Calcagno retweeted

Peter O'Hearn@PeterOHearn12·2 Feb

LLMs vs the Halting Problem. (Why, what, where going.)

We recently released a paper on this; link to follow. A few comments here for context.

Why? With LLM "reasoning" excitement, we thought: why not try LLMs on the first ever code reasoning task, the halting problem. Turing's proof of undecidability established fundamental limits. Fun bit: no matter how "superintelligent" AI becomes, this is a problem it can never perfectly solve.

Where to get data to measure? SVCOMP. Verification researchers have, through their insight and hard work, curated several thousand example C programs. They run dedicated tools over this dataset in an annual competition. This is in a sense the home turf of symbolic. We didn't know how LLMs would do, and in particular were aware of results of @rao2z, @RishiHazra95 and others showing that LLMs trail symbolic on "easier" decidable problems (SAT, propositional planning).

The surprise: LLMs are competitive on halting, where they often trail on "easier" problems. Why? Hypothesis: LLMs are heuristic approximators; in undecidability, heuristic approximation isn't just a workaround, it's often the only way forward.

Broader context: Penrose claimed undecidability proved AI is impossible (but didn't show humans can solve the undecidable). Turning the tables: undecidability is an ideal target for heuristic LLMs. Instead of using "already crushed" logic problems to show LLM limits, let's look at uncrushed problems where LLMs might actually help.

5.4K

Cristiano Calcagno retweeted

Oren Sultan@oren_sultan·1 Feb

Can LLMs reliably predict program termination? We evaluate frontier LLMs in the International Competition on Software Verification (SV-COMP) 2025, directly competing with state-of-the-art verification systems. @AIatMeta @HebrewU @Bloomberg @imperialcollege @ucl @jordiae @pascalkesseli @jvanegue @HyadataLab @adiyossLC @PeterOHearn12 Paper: arxiv.org/pdf/2601.18987 Website: orensultan.com/llms_halting_p… 🧵👇 1/n

116

43.8K

Cristiano Calcagno@ccrisccris·29 Jan

@ryyppy @n2parko @KetryxHQ x.com/cursor_ai/stat…

Cursor@cursor_ai

We're proposing an open standard for tracing agent conversations to the code they generate. It's interoperable with any coding agent or interface. agent-trace.dev

Cristiano Calcagno retweeted

Patrick Ecker@ryyppy·24 Jan

@ccrisccris @n2parko Yeah, understanding how the code changed is one part. What we're trying to solve at @KetryxHQ is even more ambitious by establishing full traceability across the req / spec architecture that goes beyond git boundaries.

160

n2parko@n2parko·23 Jan

today we introduced Cursor Blame

it turns out it's useful to persist the "why" behind your code so your teammates can understand code lineage

...and future agents can use this context to make better decisions

1.2K

139K

Cristiano Calcagno retweeted

Brando Miranda@BrandoHablando·20 Jan

Papers! - VeriBench: End-to-End Formal Verification Benchmark for AI Code Generation in Lean 4: openreview.net/forum?id=rWkGF… 2/n

302
