ivangavran

91 posts

@ivan00gavran

Joined February 2022

273 Following, 80 Followers

ivangavran retweeted

Igor Igor@ohdearlordylord·30 Mar

I ported Dungeons & Dragons (5th edition 2024) rules to @quint_lang and XState. Went really well, small thread and an article in the message below:


ivangavran retweeted

Murat Demirbas (Distributolog)@muratdemirbas·18 Mar

Using Claude Code, I deployed another distributed systems algorithm visualization. Interactive single-page Paxos consensus tutorial game. Players try (and fail) to break the safety invariant by killing and delaying nodes. What is your high score?


ivangavran retweeted

Arjun Narayan@narayanarjun·17 Feb

I'm optimistic that formal verification is the solution to our current situation where LLMs are writing our code and nobody's reading it. Formal methods can give us a world where we write succinct specs and agent-generated code is proven to comply. But we have a long way to go. There are several open challenges that stand between our situation today and that future, but none appear insurmountable.

I’ve written a brief overview of what I consider to be the big open problems, and some of the directions that researchers are taking today to address them: from verifying mathematics to building standard libraries of verified code that can be built upon. Here are a few highlights:

1) A Brief History of Formal Verification

Verification is fundamentally about understanding what your program can or can’t do, and verifying it with a proof. In order to verify, you must first have a specification that you are verifying your program against. Most of you leverage some formal verification day to day: namely, some of the compiler errors in statically-typed languages like C++ and Java are verification errors. Static type checking is the version of formal verification programmers are most familiar with. Type systems (and related formal verification tools) have gotten quite impressive, and they are becoming a lot more relevant in constraining the behavior of AI coding models.

2) Rust

Type checking represents a middle ground for verification. The hard part is choosing the right balance: reject too many good programs and it becomes hard to program in this language as the programmer has to “guess what the type checker will permit”. Recently the language that has brought the most interesting advances from type systems to the real world is Rust. Its ownership type language and associated type checker is known as the “borrow checker”. The borrow checker is conservative, and “fighting with the borrow checker” is part and parcel of everyone’s Rust experience.

This gives us the following lesson: we can prove more interesting things, but at a larger burden to the developer. Finding elegant middle points is hard, and Rust represents a real design breakthrough in navigating that tradeoff.

3) Mechanically verified math

Recently, groups of mathematical researchers have been writing mathematical proofs in a specialized programming language called a proof assistant. This language, Lean, comes with a powerful type checker capable of certifying complex mathematical proofs. Lean is exciting, but working in Lean can be frustrating - because of the nontermination properties of the type checker’s search, such languages rely heavily on programmer annotation. And this is why more complex type systems have stayed relatively academic: the Rust borrow checker sits at a genuinely elegant point in the design space, complex enough to reason about a complex property like memory references, yet simple enough to not need too much extra annotation.

But this is a critically important point: mathematical proofs and type checking aren’t just analogous, they are the literal same task. They are different only in the degree of complexity along two axes: the complexity of the underlying objects, and the complexity of the properties we are proving.

4) There is still a long way to go for proof assistants

While the world I describe is exciting, bluntly, we’re not anywhere close to that world yet. Proofs break easily when programs are modified, the standard library of proofs is too small, and specifications seldom capture everything about the program’s behavior. Overall there’s a long way to go before these techniques reach a mainstream programming language with broad adoption.

But AI is a huge accelerant to proof assistants. Much of the energy towards AI-assisted mathematics is coming from AI researchers who see it as a very promising domain for building better reasoning models. Verified math is a domain rich in endless lemmas, statements, and proofs, all of which can be used as “ground truth” - which means we can use them as strong reward signals in our post-training workflows. There are several startups being built by seasoned foundation model researchers - Harmonic, Math Inc - that are based on this premise.

I’m no expert here, but it sure seems to me that formally verified code would lead to a clear domain of tasks that have strong verifiable rewards ripe for use in reinforcement learning to build better agents, period. I’m excited about the efforts to use verified mathematics in reinforcement learning. But I’d love to see even more experiments in bringing verification to the agentic coding world.

This is an exciting time in programming languages and formal methods research. There’s only one way out of the increasingly unwieldy mountain of LLM generated code: We must prove. We will prove.
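
The point in section 3, that a mathematical proof and a type-checked program are the same kind of object, can be illustrated with a minimal Lean 4 sketch. The names `swapAnd` and `swapPair` are made up for this illustration and do not come from the thread or from any library.

```lean
-- A proposition is a type, and a proof is a term of that type.
-- Here the type checker certifies that `p ∧ q` implies `q ∧ p`.
theorem swapAnd (p q : Prop) (h : p ∧ q) : q ∧ p :=
  ⟨h.2, h.1⟩

-- An ordinary program with the same shape: swapping the components of a pair.
-- The same checker accepts it for the same structural reason.
def swapPair {α β : Type} (x : α × β) : β × α :=
  (x.2, x.1)
```

Both definitions are checked by the same mechanism; only the objects (propositions versus data) and the property being established differ.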


ivangavran retweeted

Ilya Sergey@ilyasergey·9 Feb

New post on "Proofs and Intuitions": Verifying Distributed Protocols in Veil. We take a tour of Veil, a Lean-based verification framework that combines TLA+-style model checking with formal proofs and enables AI-powered invariant inference. proofsandintuitions.net/2026/02/09/dis…


ivangavran@ivan00gavran·3 Feb

Different people have different methods. Here is what we do:
- create a list of invariants that must hold
- find how they can be broken or argue for why they hold
- don't fool yourself, have somebody check your reasoning
- check with Quint (Apalache, TLC, simulator)
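
The checklist above can be sketched in miniature. What follows is a toy Python stand-in for the simulation step only; it is not Quint, Apalache, or TLC, and the model (two accounts, a `transfer` action), the invariant names, and all numbers are hypothetical, chosen just to show invariants being listed first and then attacked with random runs.

```python
import random

# Hypothetical toy model: two accounts and a single transfer action.
def init():
    return {"alice": 10, "bob": 10}

def transfer(state, src, dst, amount, guarded=True):
    """Return the next state, or None if the action is not enabled."""
    if src == dst or amount <= 0:
        return None
    if guarded and state[src] < amount:
        return None  # the guard that keeps balances non-negative
    nxt = dict(state)
    nxt[src] -= amount
    nxt[dst] += amount
    return nxt

# Step 1 of the checklist: the invariants that must hold in every state.
INVARIANTS = {
    "no_negative_balance": lambda s: all(v >= 0 for v in s.values()),
    "total_preserved": lambda s: sum(s.values()) == 20,
}

def simulate(runs=200, steps=30, seed=0, guarded=True):
    """Random simulation: return (invariant_name, trace) for the first
    violation found, or None if no run breaks an invariant."""
    rng = random.Random(seed)
    for _ in range(runs):
        state = init()
        trace = [state]
        for _ in range(steps):
            src, dst = rng.sample(sorted(state), 2)
            nxt = transfer(state, src, dst, rng.randint(1, 15), guarded)
            if nxt is None:
                continue  # action not enabled, try another step
            state = nxt
            trace.append(state)
            for name, holds in INVARIANTS.items():
                if not holds(state):
                    return name, trace
    return None
```

Calling `simulate()` exercises the guarded model, while `simulate(guarded=False)` drops the balance check so the simulator has something to find and returns the violated invariant together with the trace that reached it. Random simulation only samples behaviours, so a clean run is evidence rather than proof; exhaustive or symbolic checking is what the model checkers named in the tweet are for.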

ivangavran@ivan00gavran·3 Feb

Security auditors know that the most difficult question in an audit is "how do we know we are done?": how do we know there aren't any problems left in the protocol or the implementation?

Informal Systems@informalinc

Most audits depend on one person checking every line of code. Our approach starts differently: define what the system should do, then systematically verify it does. @ivan00gavran from our security team explains.

ivangavran@ivan00gavran·29 Jan

@archeologistdev @DominikTornow Though it is becoming increasingly easy to write and verify specs. (Our effort in this direction is Quint, a TLA+-adjacent language, quint-lang.org, and there are other attempts as well.)

software archæologist@archeologistdev·29 Jan

@DominikTornow That’s all very well and I applaud it. The point of my somewhat tongue-in-cheek tweet was to call out that (alas!) most software doesn’t have a specification, at least not one that can be proven correct. And even when it does, the internal quality of the code still matters…

Dominik Tornow@DominikTornow·28 Jan

Yes, the spec is right in front of us.

The paper is accompanied by a formal specification in TLA+. Most notably, the paper specifies Raft's safety invariants, which an agent can use to debug its own implementation.

State machine safety implies linearizability, closing the loop.

software archæologist@archeologistdev

@DominikTornow Is this “specification” in the room with us?

ivangavran@ivan00gavran·20 Jan

@aesmonty Very interesting, thanks for writing!

Andres Monty | range.org@aesmonty·18 Jan

x.com/i/article/2012…

ivangavran@ivan00gavran·5 Jan

@sgrove @bugarela @informalinc @geoffreylitt @MarcJBrooker That post definitely gave me a different angle to view the claims on NL as a spec

ivangavran@ivan00gavran·5 Jan

@sgrove @bugarela @informalinc @geoffreylitt @MarcJBrooker It seems that both camps are converging towards similar solutions, but with different emphasis. The common denominator is understanding built through communication, and helping and grounding agents when needed with formal specs.

ivangavran@ivan00gavran·5 Jan

Everyone likes spec-driven development because it is a term that everyone understands differently and uses as they like.

ivangavran retweeted

Gabriela Moreira@bugarela·3 Dec

Excited to join some amazing folks in Floripa this weekend! I'll share all of our newest ideas for Quint 😎 PS: I'd say co-creator is more precise 😅 although there are many different words to describe my relationship to Quint 💜

CryptoLar@CryptoLarBrasil

Happy to announce @bugarela, creator of Quint. Sharing how formal specifications can meaningfully guide LLM workflows. #TYPED #AI #FormalMethods

ivangavran retweeted

Informal Systems@informalinc·2 Dec

Public blockchains lack the performance and control enterprises need. Private chains sacrifice transparency and trust. Today we are introducing Emerald: An open-source framework for institutional networks of trust. Launch your own high-performance, EVM compatible Emerald network in minutes.

ivangavran@ivan00gavran·24 Nov

@philbugcatcher @0xDestinyae I definitely recommend trying it out (of course, I may be biased, but I believe it is a great fit for your method). I'll be glad to assist along the way, or you can join the Telegram group (t.me/quint_lang) for discussions and questions.

phil@philbugcatcher·24 Nov

@0xDestinyae @ivan00gavran Haven’t checked how good they are, but there are some lessons here: quint-lang.org/docs/lessons

ivangavran@ivan00gavran·24 Nov

@philbugcatcher, that was a very good and interesting talk. Given how clearly you introduced the value of modelling when doing audits, have you tried using Quint (quint-lang.org)?

Bernhard Mueller@muellerberndt

The @summit_defi was incredibly inspiring, especially @philbugcatcher's lightning talk. There's an insane new AI auditing idea that's been on my mind for weeks and today all the pieces finally clicked.

ivangavran@ivan00gavran·24 Nov

@philbugcatcher There is for sure a lot of value in being able to start quickly in Excel, but I believe Quint may give you the same velocity (indeed, I once adapted a devs' spreadsheet model into a Quint one and it was a huge value add)
