Eve Bodnia

567 posts

Eve Bodnia

@evelovesolive

EBM reasoning, AI formal verification. CEO and founder of @logic_int

Joined July 2025

493 Following 3.3K Followers

Pinned post

Eve Bodnia@evelovesolive·21 Jan

We are the first company commercializing energy-based reasoning models: reasoning beyond LLMs. Come to our website to play Sudoku with Kona!

Logical Intelligence@logic_int

x.com/i/article/2014…

989

130.1K

Eve Bodnia@evelovesolive·9h

@beffjezos Until I see 1M+ qubits in the lab with a noise level below e-21, I won't believe it :)

484

Beff (e/acc)@beffjezos·11h

Quantum getting pumped right now feels really comical. Saying this as someone who really loved the field and worked in it 8 years. People are truly clueless about the real quantum timelines.

457

31.9K

Eve Bodnia@evelovesolive·2d

@JosephJacks_ Woodside was sunny ☀️ hiked 11 miles …

306

JJ@JosephJacks_·2d

Marin is very lush today.

Eve Bodnia@evelovesolive·2d

@francescofaenzi @IceSolst @BorisHanin I also suspect that lots of bugs in NSA came from poor legacy code conversions. This can also be solved by FV: you just prove logical equivalence between the old and new code
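As a rough illustration of "prove equivalence between the old and new code", here is a minimal Python sketch. Everything in it is hypothetical: `legacy_clamp_add`, `new_clamp_add`, and `buggy_clamp_add` are made-up routines, and exhaustive enumeration over a small finite domain stands in for the SMT solvers or proof assistants a real FV workflow would use. Enumeration only counts as a proof because the input domain here is finite (8-bit values).

```python
from itertools import product


def legacy_clamp_add(a: int, b: int) -> int:
    """Hypothetical legacy routine: saturating 8-bit add written with a branch."""
    total = a + b
    if total > 255:
        return 255
    return total


def new_clamp_add(a: int, b: int) -> int:
    """Hypothetical rewrite that should behave identically."""
    return min(a + b, 255)


def buggy_clamp_add(a: int, b: int) -> int:
    """Hypothetical bad conversion: wraps around instead of saturating."""
    return (a + b) & 0xFF


def find_counterexample(old, new, domain):
    """Return the first input pair where old and new disagree, or None."""
    for a, b in product(domain, repeat=2):
        if old(a, b) != new(a, b):
            return (a, b)
    return None


BYTE = range(256)  # assumed input domain: unsigned 8-bit values

print(find_counterexample(legacy_clamp_add, new_clamp_add, BYTE))    # None
print(find_counterexample(legacy_clamp_add, buggy_clamp_add, BYTE))  # (1, 255)
```

A `None` result means the two versions agree on every input in the domain; a returned pair is a concrete input on which the conversion changed behavior.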

francescofaenzi@francescofaenzi·2d

So … FV works best on localized systems with clear boundaries (finance, autopilots, hardware) while complex ones use math to extend spec classes and will need “vibe-code-specs” research. Got it, thank you. How could this actually stop emergent hacks in something as interconnected as the systems Mythos breached? Silly question?

Eve Bodnia@evelovesolive·3d

Formal verification protects code and hardware from all possible hacks!

Pedro Domingos@pmddomingos

Mythos broke into almost all of the NSA’s classified systems in hours, per its director. It would have been irresponsible to not impose export controls on it. (And on Fable, with its pathetically inadequate guardrails.)

5.2K

Eve Bodnia@evelovesolive·2d

Based on my understanding of NSA, Mythos was mainly targeting networks. Networks are well studied from a mathematical perspective (I did my PhD on this myself), so you can define properties of these networks and constrain their behavior with specs. I argue that pretty much all kinds of networks can be formalized, even quantum ones. If you think there are some bugs in networks which cannot be captured, please challenge me! It's Sunday morning and I am dying to use my brain for something mathematical and have fun discussions! 🤩

Eve Bodnia@evelovesolive·2d

Maybe we can write a blog post with @BorisHanin on how exactly it works. I propose another way to grasp the big picture: consider an arbitrary code base that does certain things. It can have many dependencies, so it is not localized, because its boundaries are not defined. FV works best for localized code whose boundaries are known; examples include parts of finance, some autopilots, and some hardware industries. There it is literally a problem over a finite, bounded space, and you can probe your system's behavior against those boundaries. Other systems are less localized and their boundaries can only be weakly defined, for example when you know most of the specs but some are still missing. In that case you can take what is known, use math to prove that similar classes of known specs apply to your system, and test that behavior as well. Eventually, as you work with more people and businesses on the same problem, you learn more about their specs and about the real boundaries. I predict that vibe-code-specs will become a research effort in this industry, where we use the same FV methods and math to extrapolate the behavior of certain specs. My brilliant friend @jdlichtman, who has tons of expertise in this, may also weigh in on the big picture here.

626

francescofaenzi@francescofaenzi·2d

Hey @IceSolst (and @evelovesolive), Not an expert here at all, but just someone who’s been reading this thread and now my brain is doing loops trying to picture how this actually works in real life. Formal verification sounds powerful from what Eve said (like, it can actually prove stuff instead of just hoping tests catch everything), but your point about specs and those “emergent properties” makes total sense to me as a potential weak spot. I’m trying to get a clearer picture with some simple examples. Like, if I had the tiniest possible program (say, one that just adds two numbers together and returns the result) what exactly could formal verification prove about it? That it never gives a wrong answer for any inputs? That it never crashes? Or something more specific like “it will always stay within certain memory limits”? And on the emergent properties side, what’s a real-world-ish example of one that might sneak through even if the basic specs are solid? Something that only shows up when parts of a bigger system start talking to each other? In the context of the original NSA/Mythos thing, I’m wondering if formal verification is more like “we can lock down the important pieces really tightly” or if it’s closer to “the whole thing becomes basically unhackable if done right.” Because if the latter, that’d be wild. Also curious what “class-equivalence for the spec properties” looks like in practice: is that like proving “this whole family of behaviors will always be safe” instead of checking every single case one by one?
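To make the "tiniest possible program" question concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not how any FV tool in this thread works: `add_i8` is a hypothetical 8-bit signed adder with wraparound, and checking every input pair stands in for a real proof, which is only valid because the domain is finite.

```python
def add_i8(a: int, b: int) -> int:
    """Hypothetical adder: 8-bit signed addition with two's-complement wraparound."""
    return ((a + b + 128) % 256) - 128


I8 = range(-128, 128)  # assumed input domain: every signed 8-bit value


def holds_for_all(prop) -> bool:
    """Check a two-argument property on every input pair in the domain."""
    return all(prop(a, b) for a in I8 for b in I8)


# Property 1: the result always fits in 8 bits (a "stays within limits" spec).
in_range = holds_for_all(lambda a, b: -128 <= add_i8(a, b) <= 127)

# Property 2: argument order never matters.
commutative = holds_for_all(lambda a, b: add_i8(a, b) == add_i8(b, a))

# Property 3: the answer is exact whenever the true sum fits in 8 bits.
exact_if_fits = holds_for_all(
    lambda a, b: not (-128 <= a + b <= 127) or add_i8(a, b) == a + b
)

# Property 4: the naive spec "always equals a + b" does NOT hold (127 + 1 wraps).
always_exact = holds_for_all(lambda a, b: add_i8(a, b) == a + b)

print(in_range, commutative, exact_if_fits, always_exact)  # True True True False
```

The fourth property is the interesting one: the code is unchanged, but whether it "never gives a wrong answer" depends entirely on which spec you wrote down, which is the point raised about specs being the weak spot.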

123

Eve Bodnia@evelovesolive·2d

To your point, it is true: we cannot cover the full space, since it's an NP-completeness problem. Instead, you can learn all the specs for your code and build a knowledge base of what people in your area do in terms of specs and issues. If a human does it, the gaps will be huge. If AI does it, the chances of filling all the gaps are a lot higher. LI is working with a few mission-critical industries as a result, because we understand everything about their specs and invariants. The financial sector is one of them, and the ways people break it from a logical standpoint can also be fully covered (since we know all the invariants in theory). Another well-covered sector is hardware. So, to your point again, if we talk about *any code*, then obviously we cannot cover it all, since the specs are too ambiguous. If we talk about automated systems, as in the original post, we can define a full spec of invariants for those.

solst/ICE of Astarte@IceSolst·3d

@evelovesolive Interesting, I always imagined this would be extremely difficult and inevitably leave gaps, esp for more complex systems

124

Eve Bodnia@evelovesolive·3d

@IceSolst One can get complete coverage of all *useful* emergent properties. Sometimes directly, sometimes you can prove class-equivalence for the spec properties as well

155

solst/ICE of Astarte@IceSolst·3d

@evelovesolive My understanding is formal verif. proves certain properties of the code behave as expected. But preventing all possible hacks would depend on correct specs and complete coverage of all emergent properties of the product (impossible).

1.3K

Eve Bodnia@evelovesolive·4d

@ChiefScientist @borodapomorska Sure, let's do it! Where, geographically?

Alexy 🤍💙🤍@ChiefScientist·4d

@evelovesolive @borodapomorska we should have a beer to formal methods!

mihalyich🇺🇸@borodapomorska·4d

By the way, what about girls' names? Ones that work well in both English and Russian))))

147

13.4K

Eve Bodnia@evelovesolive·4d

@ChiefScientist @borodapomorska @ChiefScientist are you our dear one or what!? 😁

Alexy 🤍💙🤍@ChiefScientist·4d

@borodapomorska Diana

727

Eve Bodnia reposted

Rohan Paul@rohanpaul_ai·5d

Yann LeCun (@ylecun) explains why LLMs are limited in terms of real-world intelligence during a Bloomberg interview. "Language is a very approximate, reduced, quantized, and simplified description of the world, and LLMs can only deal with discrete sequences of symbols. The world is much more complicated than language. The biggest LLMs are pre-trained on the totality of all the publicly available text on the internet. That’s about 20 trillion words, or 30 trillion tokens. A token is about 3 bytes. So total 10¹⁴ bytes of text. This is the amount of data a four-year-old has seen through vision during four years. Now, the text, though, would take 400,000 years to read? So, there is enormously more data from sensory input, like vision, touch, and everything else, than there could ever be through language." A child does not need 400,000 years of reading to understand cups, doors, balance, faces, falls, or heat, because the body is already collecting dense feedback from vision, touch, motion, and consequence. Text strips most of that away. It turns a living scene into symbols, then asks the model to infer the missing world from traces left by people describing it. That is why an LLM can sound fluent about physics and still have no native sense of how fragile glass feels in a hand. Moravec’s paradox names this reversal: the things humans find intellectual can be easier for machines than the things toddlers do without applause. The hard part is not producing an answer, but building a model of the world that survives contact with weight, friction, surprise, and failure. ---- Link to the full video on Bloomberg's site. Link in comment.
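The arithmetic in the quote can be sanity-checked with a short Python sketch. The token, byte, and word counts come from the post; the reading speed, reading hours per day, and waking hours per day are assumptions chosen only for illustration and are not stated in the post.

```python
# Figures quoted in the post.
TOKENS = 30 * 10**12      # "30 trillion tokens"
BYTES_PER_TOKEN = 3       # "a token is about 3 bytes"
WORDS = 20 * 10**12       # "about 20 trillion words"
QUOTED_TEXT_BYTES = 10**14

# Assumptions for illustration only (not from the post).
WORDS_PER_MINUTE = 250
READING_HOURS_PER_DAY = 8
WAKING_HOURS_PER_DAY = 11

# Size of the text corpus in bytes.
text_bytes = TOKENS * BYTES_PER_TOKEN

# Years needed to read every word at the assumed pace.
reading_years = WORDS / WORDS_PER_MINUTE / 60 / READING_HOURS_PER_DAY / 365

# Visual data rate implied if a four-year-old takes in the same byte count.
waking_seconds = 4 * 365 * WAKING_HOURS_PER_DAY * 3600
implied_visual_bytes_per_second = QUOTED_TEXT_BYTES / waking_seconds

print(f"{text_bytes:.1e} bytes of text")
print(f"{reading_years:,.0f} years of reading")
print(f"{implied_visual_bytes_per_second:,.0f} bytes per waking second")
```

Under these assumptions the corpus comes to 9 × 10¹³ bytes, which rounds to the quoted 10¹⁴, the reading time lands at roughly 460,000 years (the same order as the quoted 400,000), and the implied visual intake is a little under 2 MB per waking second.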

118

453

47.1K

Eve Bodnia@evelovesolive·5d

@nasqret We were the ones to verify the proof of the unit distance problem :) @logic_int

268

Bartosz Naskręcki@nasqret·6d

And we made it with the leidendeclaration.ai into the Nature editorial. What a day!

225

13.8K

Eve Bodnia@evelovesolive·5d

Vinod is never wrong

Vinod Khosla@vkhosla

Auto formalization will be an important new area.

3.5K

Eve Bodnia@evelovesolive·5d

@ranjan_vittal @khoslaventures Congratulations 🎈🎉🍾🎊

140

ranjan_raj@ranjan_vittal·6d

Today, I'm thrilled to announce Pramaana's $27M seed, led by @khoslaventures. The foundational domains that hold the world together (tax, law, finance, healthcare) all run on certainty. Probabilistic AI can't give them that. We’ve been asked to accept wrong answers with AI as ‘hallucinations’, while in traditional software terms, it’s just a bug. And a wrong answer in such mission-critical domains is more than just a bug, it's a liability that could have catastrophic impact. We built Pramaana to deliver a 100% trustable experience to the domains that run on certainty: AI that is provably correct, not probabilistically correct. We turn statute and regulation into machine-verifiable code, so every output ships with mathematical proof of correctness. Our mission is to make AI take ownership of its work. Pramaana in Sanskrit stands for “means of valid knowledge”, and we’re going to achieve that by formalizing the world’s knowledge.

152

1.3K

317.1K

Eve Bodnia@evelovesolive·5d

We wrote a blog post with @BorisHanin about EBRMs (latent reasoning models). Let us know what you think!

Logical Intelligence@logic_int

Adaptive planning only works if you can evaluate progress while you’re still in the middle. If you only get feedback at the end (the plan “works” or “doesn’t”) you’re forced into guess-and-check. logicalintelligence.com/blog/energy-ba…

1.6K

Eve Bodnia reposted

duve@jevonduve·17 Jun

Wrote a blog post on formalizing quantum algorithms in Lean. Implements Deutsch-Jozsa and the GHZ state, with arguments of their correctness and complexity. tannerduve.github.io/blog/2026/quan…

947

Eve Bodnia@evelovesolive·15 Jun

Today, having an AI model alone is no longer enough. A harness alone is not enough. Unique datasets alone are not enough either. Only the combination of all three enables self-improving AI today. Modern AI businesses rely on a tightly integrated loop:
• A smart harness helps improve a model’s reasoning.
• Better models enable the harness to reason and respond more effectively.
• Together, they generate unique and novel knowledge, which is fed back into the model–harness loop.
Alternative architectures beyond LLMs are essential for scaling this process. At Logical Intelligence (@logic_int) we build EBRMs (Energy-Based Reasoning Models), latent-variable reasoning models that are compatible with LLMs and designed to support the broader AI ecosystem. We also built and benchmarked our formal harness Aleph, which improves the formal reasoning capabilities of AI models. The future of self-improving AI is not a model, a harness, or data alone: it is the continuous interaction between all three! I am very happy that the world finally recognizes the value of formal and self-improving AI reasoning systems :)

Satya Nadella@satyanadella

x.com/i/article/2065…

2.4K

Eve Bodnia reposted

Sanjay Ganapathy@sanjaygsub·14 Jun

Last week I joined a panel on the frontiers of AI × Verification at the @fv_summit, with @evelovesolive, @diagram_chaser, @DjDvij and @sathyanellore. From researching the jagged frontier of Gemini's capabilities, one thing was clear: verifiability is the bottleneck to superhuman AI on any class of tasks. And verification needs rigorous specification. The catch: rigorous specification doesn't come naturally to us. We run on intuition, and intuition can generate an answer, but it can't certify one. Yet certainty is exactly what high-stakes work needs to stay reliable and compound. And frontier models won't learn that rigor from human data. At @PramaanaLabs, we're teaching it directly: pairing AI with a formal specification language and training on machine-checkable rewards.

27.6K

Eve Bodnia@evelovesolive·13 Jun

@sanmking Nah, @lulumeservey came up with the best name for us :)

Santiago M.@sanmking·13 Jun

@evelovesolive With all due respect: I think the name of your company would be better if it was Logical Intuition.

Eve Bodnia@evelovesolive·13 Jun

Formal verification can prevent these issues! Nobody will break your code if it's formally verified 🪄 Latent reasoning models (EBRMs) can help to scale it! Let's push FV to be the government standard for mission-critical industries

TechCrunch@TechCrunch

Anthropic’s safety warnings may have just backfired — the government has pulled the plug on its most powerful AI techcrunch.com/2026/06/12/ant…

4.1K
