Eve Bodnia

567 posts

@evelovesolive

EBM reasoning, AI formal verification. CEO and founder of @logic_int

Joined July 2025
493 Following · 3.3K Followers
Eve Bodnia
Eve Bodnia@evelovesolive·
@beffjezos Until I see 1M+ qubits in the lab with less than e-21 noise level I won't believe it :)
English
1
0
7
484
Beff (e/acc)
Beff (e/acc)@beffjezos·
Quantum getting pumped right now feels really comical. Saying this as someone who really loved the field and worked in it 8 years. People are truly clueless about the real quantum timelines.
English
55
16
457
31.9K
JJ
JJ@JosephJacks_·
Marin is very lush today.
JJ tweet media
English
4
0
18
3K
Eve Bodnia
Eve Bodnia@evelovesolive·
@francescofaenzi @IceSolst @BorisHanin I also suspect that lots of bugs in NSA came from poor legacy code conversions. This can also be solved by FV: you just prove logical equivalence between the old and new code
English
0
0
1
27
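The equivalence idea in the reply above (old code vs. converted code) can be illustrated with a small sketch. This is not a symbolic proof of the kind an FV tool would produce; it is a bounded exhaustive check, which only establishes equivalence on the finite domain it enumerates. The two popcount functions and the 16-bit domain are hypothetical stand-ins for a "legacy" routine and its rewrite.

```python
def legacy_popcount(x: int) -> int:
    """Hypothetical legacy version: inspect one bit per loop iteration."""
    count = 0
    while x:
        count += x & 1
        x >>= 1
    return count


def new_popcount(x: int) -> int:
    """Hypothetical rewrite: clear the lowest set bit on each iteration."""
    count = 0
    while x:
        x &= x - 1
        count += 1
    return count


def find_counterexample(old, new, domain):
    """Return the first input where old and new disagree, or None if they never do."""
    for x in domain:
        if old(x) != new(x):
            return x
    return None


# Assumed domain: every unsigned 16-bit input. Equivalence is only shown for this range.
DOMAIN = range(2 ** 16)
counterexample = find_counterexample(legacy_popcount, new_popcount, DOMAIN)
print("equivalent on domain" if counterexample is None else f"differs at {counterexample}")
```

A proof assistant or SMT-based tool would aim to establish the same statement for all inputs rather than for an enumerated range.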
francescofaenzi
francescofaenzi@francescofaenzi·
So… FV works best on localized systems with clear boundaries (finance, autopilots, hardware) while complex ones use math to extend spec classes and will need “vibe-code-specs” research. Got it, thank you. How could this actually stop emergent hacks in something as interconnected as the systems Mythos breached? Silly question?
English
2
0
1
94
Eve Bodnia
Eve Bodnia@evelovesolive·
Based on my understanding of NSA, Mythos was mainly targeting networks. Networks are well studied from a mathematical perspective (I did my PhD on this myself), so you can define properties of these networks and constrain the behavior with specs. I argue that pretty much all kinds of networks can be formalized, even quantum ones. If you think there are some bugs in networks which cannot be captured, please challenge me! It's Sunday morning and I am dying to use my brain for something mathematical and have fun discussions! 🤩
English
0
0
2
50
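To make "define properties of these networks and constrain the behavior with specs" concrete, here is a minimal sketch of one such property. The topology, node names, and the `must_pass_through` helper are all hypothetical; the spec checked is "every path from the source to the target goes through a guard node", tested by removing the guard and asking whether the target is still reachable.

```python
from collections import deque

# Hypothetical topology: an edge A -> B means "A can open a connection to B".
NETWORK = {
    "internet": ["firewall"],
    "firewall": ["web"],
    "web": ["app"],
    "app": ["db"],
    "db": [],
}

# Same topology with an assumed misconfiguration: a direct route that skips the firewall.
MISCONFIGURED = {**NETWORK, "internet": ["firewall", "app"]}


def reachable(graph, start, blocked=frozenset()):
    """Return all nodes reachable from start without entering any blocked node."""
    if start in blocked:
        return set()
    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for nxt in graph.get(node, []):
            if nxt not in seen and nxt not in blocked:
                seen.add(nxt)
                queue.append(nxt)
    return seen


def must_pass_through(graph, source, target, guard):
    """Spec: target is unreachable from source once the guard node is removed."""
    return target not in reachable(graph, source, blocked={guard})


print(must_pass_through(NETWORK, "internet", "db", "firewall"))
print(must_pass_through(MISCONFIGURED, "internet", "db", "firewall"))
```

The check is exhaustive for the graph it is given, so its value depends entirely on the graph being an accurate model of the real network.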
Eve Bodnia
Eve Bodnia@evelovesolive·
Maybe we can write a blog post about it with @BorisHanin on how exactly it works. I propose another view to grasp the big picture: consider some arbitrary code base which does certain things. The code base can have many dependencies and so on, and hence it is not localized, since the boundaries are not there. FV works best for localized code where the boundaries are known. Examples include some finance, some autopilot, and some hardware systems. It is literally the problem of a finite bounded space, and you can probe your system's behavior against these boundaries. Some systems are less localized and their boundaries can only be weakly defined: for example, when you know most of the specs but some are still missing. In this case you can take what is known and use math to prove that similar classes of known specs are applicable to your system, and try to test that behavior as well. Eventually, as you work with more people and businesses on the same problem, you learn more about their specs and more about the real boundaries. I predict that vibe-code-specs will be a research effort in this industry in the future, where we will use the same FV methods and math to extrapolate the behavior of certain specs. My brilliant friend @jdlichtman, who has tons of expertise in this, may also weigh in on the big picture here.
English
1
1
9
626
francescofaenzi
francescofaenzi@francescofaenzi·
Hey @IceSolst (and @evelovesolive), Not an expert here at all, but just someone who’s been reading this thread and now my brain is doing loops trying to picture how this actually works in real life. Formal verification sounds powerful from what Eve said (like, it can actually prove stuff instead of just hoping tests catch everything), but your point about specs and those “emergent properties” makes total sense to me as a potential weak spot. I’m trying to get a clearer picture with some simple examples. Like, if I had the tiniest possible program (say, one that just adds two numbers together and returns the result) what exactly could formal verification prove about it? That it never gives a wrong answer for any inputs? That it never crashes? Or something more specific like “it will always stay within certain memory limits”? And on the emergent properties side, what’s a real-world-ish example of one that might sneak through even if the basic specs are solid? Something that only shows up when parts of a bigger system start talking to each other? In the context of the original NSA/Mythos thing, I’m wondering if formal verification is more like “we can lock down the important pieces really tightly” or if it’s closer to “the whole thing becomes basically unhackable if done right.” Because if the latter, that’d be wild. Also curious what “class-equivalence for the spec properties” looks like in practice: is that like proving “this whole family of behaviors will always be safe” instead of checking every single case one by one?
English
1
0
0
123
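For the "tiniest possible program" question above, a rough sketch may help show what kinds of statements can be established about an adder. It assumes an unsigned 8-bit adder with wraparound (the `add_u8` function is hypothetical) and checks each property on every input pair. Because the input space is finite, the exhaustive check settles each property for this function; a verification tool would reach the same conclusions symbolically.

```python
def add_u8(a: int, b: int) -> int:
    """Hypothetical adder: unsigned 8-bit addition with wraparound."""
    return (a + b) & 0xFF


# All 65,536 input pairs. Evaluating every one without an exception also shows
# the function never crashes on this domain.
INPUTS = [(a, b) for a in range(256) for b in range(256)]


def holds_for_all(prop) -> bool:
    """True if the property holds on every input pair."""
    return all(prop(a, b) for a, b in INPUTS)


in_range = holds_for_all(lambda a, b: 0 <= add_u8(a, b) <= 255)
commutative = holds_for_all(lambda a, b: add_u8(a, b) == add_u8(b, a))
matches_mod_256 = holds_for_all(lambda a, b: add_u8(a, b) == (a + b) % 256)
# This spec is false for the wrapping adder: 255 + 1 wraps to 0.
equals_true_sum = holds_for_all(lambda a, b: add_u8(a, b) == a + b)

print(in_range, commutative, matches_mod_256, equals_true_sum)
```

The last property is the interesting one: the code is "correct" or "incorrect" only relative to the spec chosen, which is the point raised elsewhere in the thread about specs being the weak spot.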
Eve Bodnia
Eve Bodnia@evelovesolive·
To your point, it is true: we cannot cover the full space, since it's an np incompleteness problem. Instead, you can learn all the specs about your code and create a knowledge base of what people do in your area in terms of specs and issues. If a human does it, the gaps will be huge. If AI does it, the chances of filling all the gaps are a lot higher. As a result, LI is working with a few mission-critical industries, because we understand everything about their specs and invariants. The financial sector is one of them, and the ways people break it from a logical standpoint can also be fully covered (since we know all the invariants in theory). Another well-covered sector is hardware. So, to your point again: if we talk about *any code*, then obviously we cannot cover it all, since specs are too ambiguous. If we talk about automated systems, as in the original post, we can define a full spec of invariants for those.
English
0
0
2
40
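As an illustration of what a financial invariant might look like, here is a small sketch. The ledger model, the `transfer` rule, and the chosen invariant (total money is conserved and no balance goes negative) are assumptions for the example, not a description of any real system. The check explores every state reachable from a starting ledger and reports the first state that breaks the invariant.

```python
from collections import deque


def transfer(balances, src, dst, amount):
    """Hypothetical transfer rule: reject self-transfers, non-positive amounts, overdrafts."""
    if src == dst or amount <= 0 or balances[src] < amount:
        return balances
    updated = list(balances)
    updated[src] -= amount
    updated[dst] += amount
    return tuple(updated)


def invariant(balances, total):
    """Assumed spec: money is conserved and no account is negative."""
    return sum(balances) == total and all(b >= 0 for b in balances)


def check_all_reachable(initial, step=transfer, amounts=(1, 2, 3)):
    """Breadth-first search over reachable ledgers; return a violating state or None."""
    total = sum(initial)
    seen = {initial}
    queue = deque([initial])
    accounts = range(len(initial))
    while queue:
        state = queue.popleft()
        if not invariant(state, total):
            return state
        for src in accounts:
            for dst in accounts:
                for amount in amounts:
                    nxt = step(state, src, dst, amount)
                    if nxt not in seen:
                        seen.add(nxt)
                        queue.append(nxt)
    return None


violation = check_all_reachable((5, 3, 0))
print("invariant holds" if violation is None else f"violated at {violation}")
```

The search terminates here because the overdraft check keeps the state space finite; passing a different `step` function lets the same harness test a variant of the transfer rule.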
solst/ICE of Astarte
@evelovesolive Interesting, I always imagined this would be extremely difficult and inevitably leave gaps, esp for more complex systems
English
2
0
1
124
Eve Bodnia
Eve Bodnia@evelovesolive·
@IceSolst One can get complete coverage of all *useful* emergent properties. Sometimes directly, sometimes you can prove class-equivalence for the spec properties as well
English
1
0
3
155
solst/ICE of Astarte
@evelovesolive My understanding is formal verif. proves certain properties of the code behave as expected. But preventing all possible hacks would depend on correct specs and complete coverage of all emergent properties of the product (impossible).
English
5
1
21
1.3K
mihalyich🇺🇸
mihalyich🇺🇸@borodapomorska·
By the way, what about girls' names? Ones that work well in both English and Russian))))
Russian
147
1
33
13.4K
Eve Bodnia reposted
Rohan Paul
Rohan Paul@rohanpaul_ai·
Yann LeCun (@ylecun) explains why LLMs are limited in terms of real-world intelligence during a Bloomberg interview. "Language is a very approximate, reduced, quantized, and simplified description of the world, and LLMs can only deal with discrete sequences of symbols. The world is much more complicated than language. The biggest LLMs are pre-trained on the totality of all the publicly available text on the internet. That’s about 20 trillion words, or 30 trillion tokens. A token is about 3 bytes. So total 10¹⁴ bytes of text. This is the amount of data a four-year-old has seen through vision during four years. Now, the text, though, would take 400,000 years to read? So, there is enormously more data from sensory input, like vision, touch, and everything else, than there could ever be through language." A child does not need 400,000 years of reading to understand cups, doors, balance, faces, falls, or heat, because the body is already collecting dense feedback from vision, touch, motion, and consequence. Text strips most of that away. It turns a living scene into symbols, then asks the model to infer the missing world from traces left by people describing it. That is why an LLM can sound fluent about physics and still have no native sense of how fragile glass feels in a hand. Moravec’s paradox names this reversal: the things humans find intellectual can be easier for machines than the things toddlers do without applause. The hard part is not producing an answer, but building a model of the world that survives contact with weight, friction, surprise, and failure. ---- Link to the full video on Bloomberg's site. Link in comment.
English
48
118
453
47.1K
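The arithmetic in the quote above can be checked with a short sketch. The token count, bytes per token, and word count come from the quote; the reading speed and hours per day are assumptions added here, since the quote does not say how the 400,000-year figure was derived.

```python
# Figures taken from the quote.
TOKENS = 30e12          # about 30 trillion tokens
BYTES_PER_TOKEN = 3     # "a token is about 3 bytes"
WORDS = 20e12           # about 20 trillion words

# Assumptions for the reading-time estimate (not stated in the quote).
WORDS_PER_MINUTE = 250
HOURS_PER_DAY = 8
DAYS_PER_YEAR = 365

text_bytes = TOKENS * BYTES_PER_TOKEN  # 9e13, i.e. on the order of 10^14 bytes

reading_minutes = WORDS / WORDS_PER_MINUTE
reading_years = reading_minutes / 60 / HOURS_PER_DAY / DAYS_PER_YEAR

print(f"text size: {text_bytes:.1e} bytes")
print(f"reading time: {reading_years:,.0f} years")
```

Under these assumptions the estimate comes out in the mid hundreds of thousands of years, the same order of magnitude as the quoted 400,000; a faster reader or longer reading day would bring it closer.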
ranjan_raj
ranjan_raj@ranjan_vittal·
Today, I'm thrilled to announce Pramaana's $27M seed, led by @khoslaventures. The foundational domains that hold the world together (tax, law, finance, healthcare) all run on certainty. Probabilistic AI can't give them that. We’ve been asked to accept wrong answers with AI as ‘hallucinations’, while in traditional software terms, it’s just a bug. And a wrong answer in such mission-critical domains is more than just a bug; it's a liability that could have catastrophic impact. We built Pramaana to deliver a 100% trustable experience to the domains that run on certainty: AI that is provably correct, not probabilistically correct. We turn statute and regulation into machine-verifiable code, so every output ships with mathematical proof of correctness. Our mission is to make AI take ownership of its work. Pramaana in Sanskrit stands for “means of valid knowledge”, and we’re going to achieve that by formalizing the world’s knowledge.
English
99
152
1.3K
317.1K
Eve Bodnia reposted
duve
duve@jevonduve·
Wrote a blog post on formalizing quantum algorithms in Lean. Implements Deutsch-Jozsa and the GHZ state, with arguments of their correctness and complexity. tannerduve.github.io/blog/2026/quan…
English
1
3
25
947
Eve Bodnia
Eve Bodnia@evelovesolive·
Today, having an AI model alone is no longer enough. A harness alone is not enough. Unique datasets alone are not enough either. Only the combination of all three enables self-improving AI today. Modern AI businesses rely on a tightly integrated loop: • A smart harness helps improve a model’s reasoning. • Better models enable the harness to reason and respond more effectively. • Together, they generate unique and novel knowledge, which is fed back into the model–harness loop. Alternative architectures beyond LLMs are essential for scaling this process. At Logical Intelligence (@logic_int), we build EBRMs (Energy-Based Reasoning Models), latent-variable reasoning models that are compatible with LLMs and designed to support the broader AI ecosystem. We also built and benchmarked our formal harness Aleph, which improves the formal reasoning capabilities of AI models. The future of self-improving AI is not a model, a harness, or data alone; it is the continuous interaction between all three!! I am very happy that the world finally recognizes the value of formal and self-improving AI reasoning systems.. :)
Satya Nadella@satyanadella

x.com/i/article/2065…

English
0
1
21
2.4K
Eve Bodnia reposted
Sanjay Ganapathy
Sanjay Ganapathy@sanjaygsub·
Last week I joined a panel on the frontiers of AI × Verification at the @fv_summit, with @evelovesolive, @diagram_chaser, @DjDvij and @sathyanellore . From researching the jagged frontier of Gemini's capabilities, one thing was clear: Verifiability is the bottleneck to superhuman AI on any class of tasks. And verification needs rigorous specification. The catch: rigorous specification doesn't come naturally to us. We run on intuition — and intuition can generate an answer, but it can't certify one. Yet certainty is exactly what high-stakes work needs to stay reliable and compound. And frontier models won't learn that rigor from human data. At @PramaanaLabs, we're teaching it directly: pairing AI with a formal specification language and training on machine-checkable rewards.
English
0
3
11
27.6K
Santiago M.
Santiago M.@sanmking·
@evelovesolive With all due respect: I think the name of your company would be better if it was Logical Intuition.
English
1
0
0
83