Fran Algaba

854 posts

@franalgaba_

making agents safely use money onchain @grimoirexyz / prev. co-founder & cto @gizatechxyz / prev. scaling prod ml infra @adidas @bbva / tech+optimism

Madrid · Joined August 2011
1.5K Following · 2.3K Followers

Pinned Tweet
Fran Algaba@franalgaba_·
After 3 years building @gizatechxyz from scratch, I’m stepping down from my full-time role as co-founder & CTO. I’ll stay close through a short transition, keep cheering the team on, and support where my input is needed.

We started at the frontier of ZKML (Zero-Knowledge Machine Learning), making model outputs provable. The market was early: protocols weren’t ready to embed models in their smart contracts. So we focused where we could help most: DeFi AI agents that automate the manual work and improve crypto UX.

That’s how Arma was born: an autonomous yield-optimization agent for stablecoins. It continuously scans venues, prices risk, and reallocates within explicit guardrails (liquidity, TVL, protocol risk, etc.). Decisions are transparent, actions are auditable, and the system is built to self-recover.

Along the way, we were privileged to be backed by top-tier investors like @coinfund and @cbventures. We grew a hands-on engineering team to 13, set a high delivery/quality bar, and shipped the best foundational infrastructure for agentic finance in production today. Today, Arma manages $30M+ in assets with 5k+ daily active users.

None of this happens without the people who built it; thank you to every teammate who shipped through the many long nights.

Now Giza enters a new chapter. I’m biased, but I’m very confident in the team behind it, and I’ll be around to help where useful. To our investors, partners, and the community that believed from day zero: thank you for the ride.
23 replies · 3 reposts · 100 likes · 15.4K views

Federico Carrone@federicocarrone·
LLMs now make critical decisions in hospitals, defense, banks, and governments. Yet nobody can verify which model actually ran, or whether the output was tampered with. A provider or middleman can swap weights, silently requantize the model, alter decoding, inject hidden prompts, mount supply-chain attacks, or change the deployment surface without the user knowing. This problem is already serious. It will become critical.

We think this needs a practical solution, not just a theoretically clean one. CommitLLM is designed to be deployable on existing serving stacks now: the provider keeps the normal GPU serving path, does not need a proving circuit, does not need a kernel rewrite, and does not generate a heavy proof for every response.

In practice, two families of approaches dominated the conversation before this work: fingerprinting, which can be gamed, and proof-based systems, which are theoretically strong but too expensive for production inference. We built CommitLLM to target the middle ground. The core idea is to keep the verification discipline of proof systems, but specialize it to open-weight LLM inference. The cryptographic core is simple: Freivalds-style randomized checks for the large linear layers, plus Merkle commitments for the traced execution. Then a lot of engineering work is needed to make that line up with real GPU inference.

The key trick is this. A provider claims `z = W × x` for a massive weight matrix. Normally you would verify that by redoing the multiply. Instead, the verifier samples a secret random vector `r`, precomputes `v = rᵀ × W`, and later checks whether `v · x = rᵀ · z`. Two dot products instead of a full matrix multiply. In the current implementation, a wrong result passes with probability at most `1 / (2^32 - 5)` per check. A full matrix multiply, audited with two dot products. Most of the transformer can then be checked exactly or canonically from committed openings.
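The two-dot-product trick is easy to sketch. Below is a minimal numpy illustration over the reals; the actual protocol works over a finite field (hence the stated `1 / (2^32 - 5)` soundness bound), and all sizes and names here are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
d = 1024

# Provider side: claims z = W @ x for a large weight matrix W.
W = rng.standard_normal((d, d))
x = rng.standard_normal(d)
z = W @ x  # an honest provider computes the real product

# Verifier side: sample a secret random vector r and precompute
# v = r^T W once; v can be reused to audit many claims against W.
r = rng.standard_normal(d)
v = r @ W

# Audit: two dot products instead of redoing the full multiply.
assert np.isclose(v @ x, r @ z)

# A tampered output fails the same check with high probability,
# because r is secret and the perturbation projects onto it.
z_bad = z.copy()
z_bad[0] += 1.0
assert not np.isclose(v @ x, r @ z_bad)
```

The asymmetry is the point: the provider pays O(d²) for the multiply as usual, while the verifier pays only O(d) per audited claim once `v` is precomputed.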
Nonlinear operations such as activations and layer norms are canonically re-executed by the CPU verifier. The one honest caveat is attention: native FP16/BF16 attention is not bit-reproducible across hardware. CommitLLM verifies the shell around attention exactly, then independently replays attention and checks that the committed post-attention output stays within a measured INT8 corridor. So attention is bounded and audited, not proved exactly.

That means the protocol already gives very strong exact guarantees on the parts that matter most operationally. If an audited response used the wrong model, the wrong quantization/configuration, or a tampered input/deployment surface, the audit catches that exactly. That includes model swaps, silent requantization, and provider-side prompt or system-prompt injection.

Today the implementation and measurements are strongest on Qwen and Llama, but the protocol itself is not meant to be Qwen- or Llama-specific: we expect it to generalize across open-weight decoder-only families. What remains is the engineering work to integrate and validate more families explicitly, and we are already working on that.

On the measured path, online generation overhead is about 12–14%, with the provider staying on the normal GPU serving path. The heavier receipt-finalization cost is separate and can be deferred off the user-facing path. The main systems costs are RAM and bandwidth, not proof generation. The full response is always committed, but only a random fraction of responses are opened for audit. Individual audits are much larger, roughly 4 MB to 100 MB depending on audit depth. The important number is the amortized one: under a reasonable audit policy, the added bandwidth averages to roughly 300 KB per response.

After too many weeks without sleep, I’m proud to show what I built with @diego_aligned: CommitLLM. The code and paper still need some cleaning and formalization.
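The ~300 KB amortized figure can be sanity-checked with back-of-envelope arithmetic. In the sketch below, only the 4–100 MB per-audit range comes from the thread; the audit rate and the always-sent commitment size are assumptions picked purely for illustration.

```python
# From the thread: individual audits weigh roughly 4–100 MB each.
audit_min_mb, audit_max_mb = 4.0, 100.0
avg_audit_kb = (audit_min_mb + audit_max_mb) / 2 * 1024  # midpoint, in KB

# Assumed policy knobs (illustrative, not from the paper):
audit_rate = 0.005  # open ~1 in 200 responses for a full audit
commit_kb = 40.0    # assumed always-sent per-response commitment

# Expected bandwidth per response = fixed commitment + expected audit cost.
amortized_kb = commit_kb + audit_rate * avg_audit_kb
print(f"amortized ≈ {amortized_kb:.0f} KB per response")  # ≈ 306 KB
```

Under these made-up parameters the expected cost lands near the quoted ~300 KB, which illustrates the mechanism: committing is cheap and constant, and the heavy audit payload is paid only on the randomly sampled fraction of responses.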
We're already in talks with multiple providers and teams that have cryptography-related ideas on how to improve it even more. We're really excited about this and will continue doubling down on building products in AI, cryptography, and security at my company @class_lambda. If governments, hospitals, defense, and financial systems are going to run on LLMs, verifiable inference is not optional. It is infrastructure.
6 replies · 13 reposts · 37 likes · 4.7K views

Charlie Holtz@charlieholtz·
Big news for @conductor_build! We've raised a $22m Series A from Spark and Matrix. We raised this round from @ilyasu at Matrix, who also led our seed round and is joining our board, @nabeel at Spark, @ycombinator, and founders of Notion and Linear. We're grateful to be working with investors we trust and admire. Here’s how we got here and where we’re going:
266 replies · 52 reposts · 1.7K likes · 246.2K views

Fran Algaba@franalgaba_·
using @conductor_build has been a real productivity boost lately. the ux also incentivizes slop reduction and offers a great human-in-the-loop review experience. highly recommended.
1 reply · 0 reposts · 1 like · 117 views

Fran Algaba@franalgaba_·
this is awesome. until now, wallet security for agents was very fragmented, with custom implementations everywhere. now there is a standard for providing secure wallet access to agents, fully open source.
MoonPay 🟣@moonpay

x.com/i/article/2036…

11 replies · 8 reposts · 71 likes · 5.8K views

Fran Algaba@franalgaba_·
our product hasn't even been announced and we're already getting thousands of organic downloads in less than a month. still so early
1 reply · 1 repost · 6 likes · 339 views

Fran Algaba@franalgaba_·
@exk200 we are building @grimoirexyz for this. the agent translates your intent into a deterministic program. human in the loop for execution. lending, borrowing, trading, rebalancing, bridging, perps, yield optimization. all supported.
1 reply · 0 reposts · 2 likes · 261 views

Eric Kang@exk200·
I’m def out of the loop on this - what are some of the best intents/ai projects for defi automation rn? Things like trading, borrow/lend, bridging, yield optimizing etc
15 replies · 1 repost · 37 likes · 5.9K views

Fran Algaba@franalgaba_·
“time turns some idea or plan into a commitment and a commitment into something that can shelter and grow other people.” some things just require time, even when ai is accelerating software creation.
Armin Ronacher ⇌@mitsuhiko

“If someone 50 years ago planted a row of oaks or a chestnut tree on your plot of land, you have something that no amount of money or effort can replicate. The only way is to wait.” lucumr.pocoo.org/2026/3/20/some…

0 replies · 0 reposts · 0 likes · 218 views

Fran Algaba@franalgaba_·
privy is building one of the best foundations for agent authorization, but authorization and intent enforcement are different layers. one validates who can sign, the other validates what should be signed. agents managing real capital need both.
Privy@privy_io

1/ Privy + @0xProject let you build agents that can trade, rebalance, and pay for services autonomously. With this stack, you give agents a wallet, access to liquidity, and guardrails to operate onchain. Here’s how.

0 replies · 0 reposts · 0 likes · 178 views

Fran Algaba@franalgaba_·
the more you work on agentic systems, the more you realize it's not about the model. we are heading towards custom harnesses curated for different areas. it's also important to know what to optimize your harness for.
Rohit@rohit4verse

x.com/i/article/2028…

0 replies · 0 reposts · 2 likes · 309 views

DefiLlama.com@DefiLlama·
LlamaAI can now do onchain digging
12 replies · 5 reposts · 60 likes · 6.3K views

Fran Algaba@franalgaba_·
@geoffreywoo fde will evolve towards a custom harness for an agentic system, exposed as a service via api. fine-tuned models on that harness for the specific enterprise problem will be next. owning the model post-training, along with the harness, will be huge leverage / a moat
0 replies · 0 reposts · 0 likes · 129 views

GEOFF WOO@geoffreywoo·
every software startup will be a variation of fde for some enterprise problem, and once the fde's are in, it’s a race to see who can rewrite and own ai-native systems of record and collect prop data to build moats. problem & market taste is everything
15 replies · 3 reposts · 82 likes · 8.9K views

Fran Algaba@franalgaba_·
crypto tax companies are cooked: very unreliable, with crazy pricing. shared my personal situation (docs, contracts, etc.) with my local llm, plus the @nansen_ai cli for all the onchain activity. sharing with my accountant now for review. all this done for a fraction of the cost, super easy. thanks @ASvanevik.
1 reply · 0 reposts · 2 likes · 784 views

Fran Algaba@franalgaba_·
quite obsessed with evals for harnesses / models on custom tasks. a deep rabbit hole to go down, and it really makes a difference in the quality of an agentic system.
1 reply · 0 reposts · 2 likes · 205 views

Fran Algaba@franalgaba_·
@gusgonzalezs i think everyone will try to have their own models as well to own the vertical
1 reply · 0 reposts · 0 likes · 17 views