ivangavran

91 posts

@ivan00gavran

Joined February 2022
273 Following · 80 Followers

ivangavran retweeted
Igor Igor @ohdearlordylord
I ported Dungeons & Dragons (5th edition 2024) rules to @quint_lang and XState. Went really well, small thread and an article in the message below:

ivangavran retweeted
Murat Demirbas (Distributolog) @muratdemirbas
Using Claude Code, I deployed another distributed systems algorithm visualization. Interactive single-page Paxos consensus tutorial game. Players try (and fail) to break the safety invariant by killing and delaying nodes. What is your high score?

ivangavran retweeted
Arjun Narayan @narayanarjun
I'm optimistic that formal verification is the solution to our current situation where LLMs are writing our code and nobody's reading it. Formal methods can give us a world where we write succinct specs and agent-generated code is proven to comply. But we have a long way to go. There are several open challenges that stand between our situation today and that future, but none appear insurmountable.

I’ve written a brief overview of what I consider to be the big open problems, and some of the directions that researchers are taking today to address them: from verifying mathematics to building standard libraries of verified code that can be built upon. Here are a few highlights:

1) A Brief History of Formal Verification

Verification is fundamentally about understanding what your program can or can’t do, and verifying it with a proof. In order to verify, you must first have a specification that you are verifying your program against. Most of you leverage some formal verification day to day: namely, some of the compiler errors in statically-typed languages like C++ and Java are verification errors. Static type checking is the version of formal verification programmers are most familiar with. Type systems (and related formal verification tools) have gotten quite impressive, and they are becoming a lot more relevant in constraining the behavior of AI coding models.

2) Rust

Type checking represents a middle ground for verification. The hard part is choosing the right balance: reject too many good programs and it becomes hard to program in this language as the programmer has to “guess what the type checker will permit”. Recently the language that has brought the most interesting advances from type systems to the real world is Rust. Its ownership type language and associated type checker is known as the “borrow checker”. The borrow checker is conservative, and “fighting with the borrow checker” is part and parcel of everyone’s Rust experience.

This gives us the following lesson: we can prove more interesting things, but at a larger burden to the developer. Finding elegant middle points is hard, and Rust represents a real design breakthrough in navigating that tradeoff.

3) Mechanically verified math

Recently, groups of mathematical researchers have been writing mathematical proofs in a specialized programming language called a proof assistant. This language, LEAN, comes with a powerful type checker capable of certifying complex mathematical proofs. LEAN is exciting, but working in LEAN can be frustrating - because of the nontermination properties of the type checker’s search, such languages rely heavily on programmer annotation. And this is why more complex type systems have stayed relatively academic. The Rust borrow checker sits at a genuinely elegant point in the design space: complex enough to reason about a complex property like memory references, yet simple enough to not need too much extra annotation.

But this is a critically important point: Mathematical proofs and type checking aren’t just analogous: they are the literal same task. They are different only in the degree of complexity along two axes: the complexity of the underlying objects, and the complexity of the properties we are proving.

4) There is still a long way to go for proof assistants

While the world I describe is exciting, bluntly, we’re not anywhere close to that world yet. Proofs break easily when programs are modified, the standard library of proofs is too small, and specifications seldom capture everything about the program’s behavior. Overall there’s a long way to go before these techniques reach a mainstream programming language with broad adoption.

But AI is a huge accelerant to proof assistants. Much of the energy towards AI-assisted mathematics is coming from AI researchers who see it as a very promising domain for building better reasoning models. Verified math is a domain rich in endless lemmas, statements, and proofs, all of which can be used as “ground truth” - which means we can use them as strong reward signals in our post-training workflows. There are several startups being built by seasoned foundation model researchers - Harmonic, Math Inc - that are based on this premise.

I’m no expert here, but it sure seems to me that formally verified code would lead to a clear domain of tasks that have strong verifiable rewards ripe for use in reinforcement learning to build better agents, period. I’m excited about the efforts to use verified mathematics in reinforcement learning. But I’d love to see even more experiments in bringing verification to the agentic coding world.

This is an exciting time in programming languages and formal methods research. There’s only one way out of the increasingly unwieldy mountain of LLM-generated code: We must prove. We will prove.

ivangavran retweeted
Ilya Sergey @ilyasergey
New post on "Proofs and Intuitions": Verifying Distributed Protocols in Veil. We take a tour of Veil, a Lean-based verification framework that combines TLA+-style model checking with formal proofs and enables AI-powered invariant inference. proofsandintuitions.net/2026/02/09/dis…

ivangavran @ivan00gavran
Different people have different methods; here is what we do:
- create a list of invariants that must hold
- find how they can be broken or argue for why they hold
- don't fool yourself, have somebody check your reasoning
- check with Quint (Apalache, TLC, simulator)
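As a rough illustration of the invariant-first workflow above (this is plain Python, not Quint, and not the author's actual tooling), the sketch below lists two invariants for a hypothetical two-account transfer model and uses random simulation to look for a trace that breaks them. The model, the names `step`, `unguarded_step`, and `simulate`, and all numeric bounds are assumptions made for the example.

```python
import random

def init():
    # Hypothetical initial state: two accounts holding 10 units each.
    return {"alice": 10, "bob": 10}

def step(state, rng):
    """One randomly chosen transfer, guarded against overdrafts."""
    src, dst = rng.sample(sorted(state), 2)
    amount = rng.randint(1, 15)
    if state[src] < amount:
        return state  # guard: the sender cannot cover it, so nothing happens
    new = dict(state)
    new[src] -= amount
    new[dst] += amount
    return new

def unguarded_step(state, rng):
    """Same transfer without the guard: a deliberately buggy variant."""
    src, dst = rng.sample(sorted(state), 2)
    amount = rng.randint(1, 15)
    new = dict(state)
    new[src] -= amount
    new[dst] += amount
    return new

# Step 1 of the workflow: write down the invariants that must hold.
def inv_no_negative(state):
    return all(balance >= 0 for balance in state.values())

def inv_total_conserved(state):
    return sum(state.values()) == 20

INVARIANTS = [inv_no_negative, inv_total_conserved]

def simulate(step_fn, runs=200, max_steps=50, seed=0):
    """Random simulation: return (invariant name, trace) on a violation, else None."""
    rng = random.Random(seed)
    for _ in range(runs):
        state = init()
        trace = [state]
        for _ in range(max_steps):
            state = step_fn(state, rng)
            trace.append(state)
            for inv in INVARIANTS:
                if not inv(state):
                    return inv.__name__, trace
    return None
```

Calling `simulate(step)` finds no violation in the sampled runs, which is evidence rather than proof. Calling `simulate(unguarded_step)` returns the name of the broken invariant together with the trace that led to it. A model checker explores the state space systematically instead of sampling it, which is why the last item in the list still matters.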

software archæologist @archeologistdev
@DominikTornow That’s all very well and I applaud it. The point of my somewhat tongue-in-cheek tweet was to call out that (alas!) most software doesn’t have a specification, at least not one that can be proven correct. And even when it does, the internal quality of the code still matters…

Dominik Tornow @DominikTornow
Yes, the spec is right in front of us.
The paper is accompanied by a formal specification in TLA+. Most notably, the paper specifies Raft's safety invariants that an agent can use to debug its own implementation.
State machine safety implies linearizability, closing the loop.
software archæologist @archeologistdev

@DominikTornow Is this “specification” in the room with us?
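To make the idea of debugging an implementation against a safety invariant concrete, here is a minimal Python sketch of a check for Raft's State Machine Safety property: if one server has applied an entry at a given log index, no other server applies a different entry at that index. The data layout (a dict from server name to its list of applied `(term, command)` entries) and the function name are assumptions for illustration; they are not taken from the Raft paper's TLA+ specification.

```python
from typing import Dict, List, Optional, Tuple

Entry = Tuple[int, str]  # (term, command); hypothetical representation

def state_machine_safety_violation(
    applied: Dict[str, List[Entry]]
) -> Optional[Tuple[int, str, str]]:
    """Return (index, server_a, server_b) for the first disagreement, else None.

    `applied[s]` is the sequence of entries server `s` has applied so far,
    with log indices starting at 1.
    """
    first_seen: Dict[int, Tuple[str, Entry]] = {}
    for server in sorted(applied):
        for index, entry in enumerate(applied[server], start=1):
            if index in first_seen and first_seen[index][1] != entry:
                return index, first_seen[index][0], server
            first_seen.setdefault(index, (server, entry))
    return None
```

A test harness or an agent could call this after every simulated step and stop at the first non-`None` result, which pinpoints the index and the two servers that disagree.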

ivangavran @ivan00gavran
@sgrove @bugarela @informalinc @geoffreylitt @MarcJBrooker It seems that both camps are converging towards similar solutions, but with different emphasis. The common denominator is understanding built through communication, and helping and grounding agents when needed with formal specs.

ivangavran @ivan00gavran
Everyone likes spec-driven development because it is a term that everyone understands differently and uses as they like.

ivangavran retweeted
Gabriela Moreira @bugarela
Excited to join some amazing folks in Floripa this weekend! I'll share all of our newest ideas for Quint 😎 PS: I'd say co-creator is more precise 😅 although there are many different words to describe my relationship to Quint 💜
CryptoLar @CryptoLarBrasil

Happy to announce @bugarela, creator of Quint. Sharing how formal specifications can meaningfully guide LLM workflows. #TYPED #AI #FormalMethods

ivangavran retweeted
Informal Systems @informalinc
Public blockchains lack the performance and control enterprises need. Private chains sacrifice transparency and trust. Today we are introducing Emerald: An open-source framework for institutional networks of trust. Launch your own high-performance, EVM compatible Emerald network in minutes.

ivangavran @ivan00gavran
@philbugcatcher @0xDestinyae I definitely recommend trying it out (of course, I may be biased, but I believe it is a great fit for your method). I'll be glad to assist along the way, or you can join the tg group (t.me/quint_lang) for discussions/questions

ivangavran @ivan00gavran
@philbugcatcher There is for sure a lot of value in being able to start quickly in Excel, but I believe Quint may give you the same velocity (indeed, I once adapted a devs' spreadsheet model into a Quint one and it was a huge value add)