Diego Kingston🟩

Diego Kingston🟩 retweeted

LambdaClass@class_lambda·2d

Last week Rome was three conferences in one: zkSummit 14, zkProof, and Eurocrypt 2026. We were there for the latest in cryptography and zero-knowledge, and to get the community's eyes on our new VM.

Diego Kingston🟩@diego_aligned·10 May

Are algebraic hash functions screwed?

Diego Kingston🟩 retweeted

Giacomo Fenzi@GiacomoFenzi·9 May

We are back on! Alessandro Chiesa on Close Enough: Proximity Tests from Linear codes. Livestream: youtube.com/live/Kla_3rFN-…

Diego Kingston🟩@diego_aligned·9 May

How close can you get? IOP fest has the answer

Diego Kingston🟩 retweeted

Mauro Toscano 🟩@mauro_aligned·9 May

But the good news is LambdaVM doesn’t care about quantum computers. And neither does @leanEthereum.

Diego Kingston🟩@diego_aligned·9 May

Now on zkproofs, quantum cryptanalysis

Diego Kingston🟩@diego_aligned·9 May

Proximity prize news at IOPFest

Diego Kingston🟩@diego_aligned·7 May

The final panel in zksummit now

Diego Kingston🟩@diego_aligned·7 May

Closing panel for zksummit!

Diego Kingston🟩@diego_aligned·7 May

Starting another zksummit

Diego Kingston🟩@diego_aligned·13 Apr

This is going to be huge

There is a high chance I managed to lower the bandwidth usage considerably, and now we can check all the tokens. @diego_aligned pick up the phone!

Diego Kingston🟩@diego_aligned·10 Apr

The overhead is not 100%, see the paper. You can audit as many tokens as you want, and the provider cannot change responses because it is cryptographically bound by the commitment. The protocol lets you verify locally without re-executing, in significantly less time than running the inference yourself would take. This is much better than having other parties randomly redo the computation, since you cannot be certain they are not colluding. Thus, the protocol is more efficient in terms of compute, and it lets you verify with certainty any opening you request.
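To make "bound by the commitment" concrete, here is a minimal sketch of a Merkle commitment with an opening check. It is only an illustration, not CommitLLM's actual trace format: the helper names (`build_levels`, `open_leaf`, `verify_opening`), the SHA-256 hash, the 0x00/0x01 domain-separation prefixes, and the duplicate-last-node padding are all assumptions made for this example.

```python
import hashlib


def h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def build_levels(leaves):
    """Return all tree levels, from hashed leaves up to the single root."""
    if not leaves:
        raise ValueError("need at least one leaf")
    level = [h(b"\x00" + leaf) for leaf in leaves]  # 0x00 prefix marks leaves
    levels = []
    while True:
        if len(level) > 1 and len(level) % 2:
            level = level + [level[-1]]  # pad odd levels (toy choice)
        levels.append(level)
        if len(level) == 1:
            return levels
        # 0x01 prefix marks interior nodes
        level = [h(b"\x01" + level[i] + level[i + 1])
                 for i in range(0, len(level), 2)]


def open_leaf(levels, index):
    """Collect the sibling hashes needed to recompute the root for one leaf."""
    path = []
    for level in levels[:-1]:
        path.append(level[index ^ 1])
        index //= 2
    return path


def verify_opening(root, leaf, index, path):
    """Recompute the root from an opened leaf; any changed leaf breaks it."""
    node = h(b"\x00" + leaf)
    for sibling in path:
        if index % 2 == 0:
            node = h(b"\x01" + node + sibling)
        else:
            node = h(b"\x01" + sibling + node)
        index //= 2
    return node == root


# The provider commits to the whole response, then opens token 2 on request.
tokens = [b"tok0", b"tok1", b"tok2"]
levels = build_levels(tokens)
root = levels[-1][0]
path = open_leaf(levels, 2)
```

Once the root is published, the provider can only open leaves that hash back to it, so a response altered after the fact fails `verify_opening` unless a SHA-256 collision is found.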

Alex Mizrahi@killerstorm·9 Apr

@ercwl This is a very interesting work, but I don't think it unlocks many use cases, at least as is. I'm not convinced it's actually better than redundant execution. Let's compare: "The full response is always committed, but only a random fraction of responses are opened for audit."

Eric Wall@ercwl·9 Apr

This is pretty brilliant. A significant improvement vs. any known scheme for how to cryptographically *prove* LLM inference at a low overhead cost. => Imagine you can get a cryptographic proof that a specific LLM generated a certain response. It unlocks thousands of use cases.

Paper and code: github.com/lambdaclass/Co…

Diego Kingston🟩@diego_aligned·8 Apr

This is actually becoming a habit lol

RJ 🟩@rj_aligned

another sprint of nights without sleep incoming

Diego Kingston🟩 retweeted

Fede’s intern 🥊@fede_intern·8 Apr

Attention correctly bounded in multiple models. Working on an FP8 implementation now too. Let's make open-weight models the default in AI by making inference verifiable!

LLMs now make critical decisions in hospitals, defense, banks, and governments. Yet nobody can verify which model actually ran, or whether the output was tampered with. A provider or middleman can swap weights, silently requantize the model, alter decoding, inject hidden prompts, mount supply chain attacks, or change the deployment surface without the user knowing. This problem is already serious. It will become critical.

We think this needs a practical solution, not just a theoretically clean one. CommitLLM is designed to be deployable on existing serving stacks now: the provider keeps the normal GPU serving path, does not need a proving circuit, does not need a kernel rewrite, and does not generate a heavy proof for every response.

In practice, two families of approaches dominated the conversation before this work: fingerprinting, which can be gamed, and proof-based systems, which are theoretically strong but too expensive for production inference. We built CommitLLM to target the middle ground. The core idea is to keep the verification discipline of proof systems, but specialize it to open-weight LLM inference.

The cryptographic core is simple: Freivalds-style randomized checks for the large linear layers, plus Merkle commitments for the traced execution. Then a lot of engineering work is needed to make that line up with real GPU inference.

The key trick is this. A provider claims `z = W × x` for a massive weight matrix. Normally you would verify that by redoing the multiply. Instead, the verifier samples a secret random vector `r`, precomputes `v = rᵀ × W`, and later checks whether `v · x = rᵀ · z`. Two dot products instead of a full matrix multiply. In the current implementation, a wrong result passes with probability at most `1 / (2^32 - 5)` per check. A full matrix multiply, audited with two dot products.

Most of the transformer can then be checked exactly or canonically from committed openings. Nonlinear operations such as activations and layer norms are canonically re-executed by the CPU verifier.

The one honest caveat is attention: native FP16/BF16 attention is not bit-reproducible across hardware. CommitLLM verifies the shell around attention exactly, then independently replays attention and checks that the committed post-attention output stays within a measured INT8 corridor. So attention is bounded and audited, not proved exactly.

That means the protocol already gives very strong exact guarantees on the parts that matter most operationally. If an audited response used the wrong model, the wrong quantization/configuration, or a tampered input/deployment surface, the audit catches that exactly. That includes things like model swaps, silent requantization, and provider-side prompt or system prompt injection.

Today the implementation and measurements are strongest on Qwen and Llama. But the protocol itself is not meant to be Qwen or Llama specific: we expect it to generalize across open-weight decoder-only families. What still has to be done is the engineering work to integrate and validate more families explicitly, and we are already working on that.

On the measured path, online generation overhead is about 12 to 14% with the provider staying on the normal GPU serving path. The heavier receipt finalization cost is separate and can be deferred off the user-facing path. The main systems costs are RAM and bandwidth, not proof generation. The full response is always committed, but only a random fraction of responses are opened for audit. Individual audits are much larger, roughly 4 MB to 100 MB depending on audit depth. The important number is the amortized one: under a reasonable audit policy, the added bandwidth averages to roughly 300 KB per response.

After too many weeks without sleep, I’m proud to show what I built with @diego_aligned: CommitLLM. Thanks Diego for your patience. I've been calling you at random hours.

The code and paper still need some cleaning and formalization. We’re already in talks with multiple providers and teams that have cryptography-related ideas on how to improve it even more. We’re really excited about this and we will continue doubling down on building products in AI, cryptography and security with my company @class_lambda. If governments, hospitals, defense and financial systems are going to run on LLMs, verifiable inference is not optional. It is infrastructure. I will be explaining this in more detail in the days to come and I will show how to test it and run it.
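The Freivalds-style check described in the post can be sketched in a few lines. This is a toy illustration under stated assumptions, not the CommitLLM implementation: it works on small integer matrices reduced modulo the prime `2^32 - 5` quoted in the post, and the function names (`sample_r`, `precompute_v`, `freivalds_check`) are made up for this example.

```python
import secrets

P = 2**32 - 5  # prime modulus quoted in the post


def dot(a, b):
    return sum(ai * bi for ai, bi in zip(a, b)) % P


def matvec(W, x):
    """The provider's expensive step: z = W x (mod P)."""
    return [dot(row, x) for row in W]


def sample_r(n_rows):
    """The verifier's secret random vector, uniform over the field."""
    return [secrets.randbelow(P) for _ in range(n_rows)]


def precompute_v(W, r):
    """One-time verifier work per weight matrix: v = r^T W (mod P)."""
    n_cols = len(W[0])
    return [sum(r[i] * W[i][j] for i in range(len(W))) % P
            for j in range(n_cols)]


def freivalds_check(v, r, x, z):
    """Accept iff v . x == r . z, i.e. r^T (W x) == r^T z (mod P)."""
    return dot(v, x) == dot(r, z)


# Fixed small values so the example is reproducible; a real verifier
# would call sample_r() and keep r secret from the provider.
W = [[1, 2, 3], [4, 5, 6]]
x = [7, 8, 9]
r = [3, 7]
v = precompute_v(W, r)
z = matvec(W, x)           # honest result
z_bad = [z[0] + 1, z[1]]   # tampered result
```

With these values, `freivalds_check(v, r, x, z)` accepts and `freivalds_check(v, r, x, z_bad)` rejects. The check costs two dot products per audited product, and a wrong `z` slips through only if its error vector happens to be orthogonal to the secret `r`, which for a uniform `r` occurs with probability at most `1 / P`.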

Diego Kingston🟩@diego_aligned·7 Apr

Best place for sushi in Buenos Aires, amazing experience. Ergodic also focuses on craft

Diego Kingston🟩 retweeted

Fede’s intern 🥊@fede_intern·5 Apr

Practically solved the attention gap with good engineering and policy. Verifiable open-weight inference is here!

I'm working on the new version of the CommitLLM paper and codebase. We have many ideas for making this even better, but right now it already solves the verification of LLMs! I highly recommend every researcher take a look. We will be deploying this in production.

Diego Kingston🟩 retweeted

abdel@AbdelStark·1 Apr

This is amazing work! Turns out there might be more ways than ZK to help solve the problem of verifiable AI! I really like the simplicity and pragmatism of the scheme. It seems to be a very interesting set of tradeoffs that could become suitable for multiple production use cases. I implemented a version of CommitLLM in Zig, fully compatible with the Rust reference implementation. In the demo video you can see cross-implementation checks, including a tamper attempt: the CommitLLM Rust prover trying to fool the CommitLLM Zig verifier and failing.