Raphael De Lio

2.6K posts

@RaphaelDeLio

Software Engineer | AI | Java | Kotlin | Developer Advocate | International Speaker | Developing AI agents @ Redis (@RedisInc) | https://t.co/sh1yy4tNoB

Utrecht, Netherlands · Joined May 2022
416 Following · 893 Followers
Pinned Tweet
Raphael De Lio@RaphaelDeLio·
And if you can't attend any of these events, don't miss the video where I explain Vector Databases, Vector Similarity Search, and Embedding Models in a very simple and intuitive way: youtube.com/watch?v=Yhv19l…
Raphael De Lio@RaphaelDeLio·
Yesterday, @bsbodden and I delivered our Designing Multi Agent Systems with Spring AI hands-on lab at JavaOne in San Francisco. More than 35 people showed up and it was the lab with most registrations of the day. 🙏🙌☕️
david parry@daviddryparry·
You forgot it was also an awesome class!
Abhishek Singh@0xlelouch_·
Typed a Gmail username once and the UI instantly said: "Username already taken." I put the same problem to an ex-Staff Google engineer (he was director of engineering at a startup I worked at): "You're not doing an Elasticsearch query on every keypress, right?" He laughed. "No. That'd be a crime."

My approach:
1. Keep an in-memory trie of reserved usernames.
2. Update it async (delta pushes), not per keystroke.
3. UI checks locally in O(k), where k = username length.

Numbers (why this is feasible):
1. Assume 2B usernames, avg length 10 chars.
2. Raw chars = 2B × 10 = 20B chars.
3. Even at 1 byte/char (not true in a trie, but a baseline), that's ~20GB just for characters.
4. A trie is about prefix sharing, so common prefixes collapse hard. Real memory is "nodes + edges", not "strings".
5. If we model ~1 node per char worst case: ~20B nodes.
   - If a node is 8 bytes (tightly packed arrays, bitsets, offset indices; no pointers), worst case is 160GB.
   - With prefix sharing, you can easily cut multiples of that depending on distribution (Gmail-like usernames are not random).
6. Shard by the first 2 chars (36 possible: a-z, 0-9). 36² = 1296 shards.
   - Worst case per shard: 160GB / 1296 ≈ 123MB.
   - Suddenly the "instant check" fits in memory per front-end pod or edge POP.

Yes, you can also do it with WebSockets:
1. Client streams "candidate username" events.
2. Server replies with availability.
3. Works fine, but now you've built a hot, stateful, low-latency service for… a UI hint.

Most people will ship:
1. Elasticsearch prefix search.
2. Debounce 150ms.
3. Cache a bit.
4. Pray at peak signup traffic.

And it works. But the trie approach is the kind of solution where the UI feels like magic, tbh, and it's something novel that I thought of. Things are just different at Google scale.
SumitM@SumitM_X

As a developer, have you ever wondered: you type a Gmail username and the UI instantly shows "Username already taken"… There are millions of users globally. How is this check so fast?

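Steps 1–3 and the sharding in step 6 can be sketched in Python. This is a minimal illustration under the tweet's assumptions, not Google's implementation: the names `UsernameTrie` and `shard_key` are mine, and a production version would use packed arrays and bitsets rather than per-node dicts.

```python
class TrieNode:
    __slots__ = ("children", "terminal")

    def __init__(self):
        self.children = {}     # char -> TrieNode
        self.terminal = False  # True if a reserved username ends here


class UsernameTrie:
    """In-memory set of reserved usernames with O(k) membership checks."""

    def __init__(self):
        self.root = TrieNode()

    def add(self, username: str) -> None:
        node = self.root
        for ch in username:
            node = node.children.setdefault(ch, TrieNode())
        node.terminal = True

    def is_taken(self, username: str) -> bool:
        node = self.root
        for ch in username:
            node = node.children.get(ch)
            if node is None:
                return False
        return node.terminal


# Sharding by the first two characters, as in step 6:
shards: dict[str, UsernameTrie] = {}

def shard_key(username: str) -> str:
    return username[:2]

def add_username(username: str) -> None:
    shards.setdefault(shard_key(username), UsernameTrie()).add(username)

def check_taken(username: str) -> bool:
    trie = shards.get(shard_key(username))
    return trie.is_taken(username) if trie else False
```

Each shard is an independent trie, so a front-end pod only needs the shards for the prefixes it serves, and the per-keystroke check never leaves memory.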
Raphael De Lio@RaphaelDeLio·
@sseraphini This specifically is software engineering. You want a step in your pipeline that triggers an app that calls an LLM. Building agents is pretty much SWE. LLMs are REST API calls.
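As a sketch of that "a pipeline step is just a REST call" idea: the endpoint URL and model name below are hypothetical placeholders, and the payload shape follows the common OpenAI-style chat-completions convention, not any specific vendor's guarantee.

```python
import json
import urllib.request

# Hypothetical endpoint for illustration; any OpenAI-compatible
# chat-completions API is called the same way.
API_URL = "https://api.example.com/v1/chat/completions"

def build_review_request(diff: str, model: str = "gpt-4o-mini") -> dict:
    """Turn a git diff into a chat-completion payload asking for a review."""
    return {
        "model": model,
        "messages": [
            {"role": "system",
             "content": "You review code diffs and flag risky changes."},
            {"role": "user", "content": f"Review this diff:\n{diff}"},
        ],
    }

def review_diff(diff: str, api_key: str) -> str:
    """One pipeline step: POST the diff to the LLM, return its review text."""
    req = urllib.request.Request(
        API_URL,
        data=json.dumps(build_review_request(diff)).encode(),
        headers={"Authorization": f"Bearer {api_key}",
                 "Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        body = json.load(resp)
    return body["choices"][0]["message"]["content"]
```

A CI job would call `review_diff` on the pending changes and post the result as a comment or gate; there is no special "AI infrastructure" involved, just an HTTP request in a pipeline step.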
Sibelius Seraphini@sseraphini·
What is the recommendation for creating a recurring task using an LLM? Like making a review of what will be pushed to production, before pushing it? Is this prompt engineering?
Raphael De Lio@RaphaelDeLio·
@guyroyse The interesting thing is that this applies to our memory in general! We tend to recall the first and the last presented information better. I hadn't thought about broader applications of it, though!
Guy Royse@guyroyse·
It's not every day that I get to quote Dave Thomas (the founder of Wendy's, not the author of The Pragmatic Programmer), but I do in this podcast with Roberto Perez. Dave Thomas said that the most important bites of a burger are the first bite and the last bite. Those are the bites you remember. Those are the bites that influence your reaction to eating that burger: in other words, whether you'll get another in the future.

LLMs, it turns out, do something similar. They tend to remember the stuff at the beginning of the context and the stuff at the end of the context, and sometimes struggle with the stuff in the middle. So keeping context small matters. The less middle you have, the less chance the middle is missed.

And, of course, a smaller context is cheaper and faster too. Fewer tokens. Less time spent processing them. So smaller contexts are, generally speaking, better.

Of course, just dropping messages off the back of the conversation to keep a small context doesn't solve the problem. The context needs to be tight, not truncated. To do this, you need Agentic Memory. In the podcast, I demo a simple chatbot that uses Agent Memory Server to provide just that. Watch it for all the details: youtube.com/watch?v=fkdqwm…
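The "tight, not truncated" idea can be sketched without any particular library: keep the head and tail of the conversation (the parts models recall best) and compress the middle. The function below is my illustration, not Agent Memory Server's API; the `summarize` hook is where an LLM-generated summary would plug in.

```python
def tighten_context(messages, keep_head=2, keep_tail=4, summarize=None):
    """Keep the start and end of a conversation, compressing the middle.

    messages: list of {"role": ..., "content": ...} dicts.
    summarize: optional callable that turns the middle messages into
               a summary string (e.g. via an LLM call).
    """
    if len(messages) <= keep_head + keep_tail:
        return list(messages)  # already tight enough
    head = messages[:keep_head]
    tail = messages[-keep_tail:]
    middle = messages[keep_head:-keep_tail]
    summary = (summarize(middle) if summarize
               else f"[{len(middle)} earlier messages summarized]")
    # Replace the whole middle with one compact system note.
    return head + [{"role": "system", "content": summary}] + tail
```

Unlike plain truncation, the head (often the system prompt and task framing) survives, and the middle is represented rather than silently dropped.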
Raphael De Lio reposted
Peter Steinberger 🦞@steipete·
@notsunsakis @bcherny don't call it vibe coding - that's associated with yolo, smash-head-on-keyboard, not thinking, engineering, building, testing, debugging, iterating. Agentic engineering, or just... coding. We move faster, but it's still hard.
Raphael De Lio@RaphaelDeLio·
But the lesson is clear: Ambition isn’t a strategy. Shipping is. Definitely a valuable lesson I’m gonna take with me for when I start my own startup in the future. Next time, I’ll start smaller. cc @sseraphini /7
Raphael De Lio@RaphaelDeLio·
And if this were a real startup competing in the open market, that would've been fatal. Luckily, we're on the same team and company. His solution is gonna serve me as much as mine would've served him. We can even build on top of his and possibly add all the features I envisioned. /6
Raphael De Lio@RaphaelDeLio·
I made the classic founder mistake this week: I built the vision instead of the product. Follow the thread /1
Raphael De Lio@RaphaelDeLio·
@toddsaunders You're being optimistic about a decision taking 30 minutes 😅 Typically it's multiple layers of decisions that may take weeks not only to decide, but also to plan, before executing. Fewer people will be a great advantage from now on.
Todd Saunders@toddsaunders·
The token cost to build a production feature is now lower than the meeting cost to discuss building that feature.

Let me rephrase. It is literally cheaper to build the thing and see if it works than to have a 30-minute planning meeting about whether you should build it. It's wild when you think about it.

This completely inverts how you should run a software organization. The planning layer becomes the bottleneck because the building layer is essentially free. The cost of code has dropped to essentially zero.

The rational response is to eliminate planning for anything that can be tested empirically. Don't debate whether a feature will work. Just build it in 2 hours, measure it with a group of customers, and then decide to kill or keep it.

I saw a startup operating this way and their build velocity is up 20x. Decision quality is up because every decision is informed by a real prototype, not a slide deck and an expensive meeting.

We went from "move fast and break things" to "move fast and build everything." The planning industrial complex is dead. Thank god.
Vinhola 🦦@henriquevinhola·
For those of you who use AI to code: do you already spot patterns in code the way people do with text, like the excessive use of em dashes?
Raphael De Lio@RaphaelDeLio·
Agentic coding is not giving me more time. It's allowing me to build within a timeframe that wouldn't have been possible before. But it does not take one commit. It takes more than 300.
Raphael De Lio@RaphaelDeLio·
This week I've been working from the moment I wake up to the moment I go to bed. It's 00:39 as I'm writing this post. I still need to steer agents to build stuff correctly, reliably, and in a way that can scale. But most importantly, I still need to decide what to work on.
Raphael De Lio@RaphaelDeLio·
It's difficult for me to believe that the cost of coding has dropped to nearly zero. Especially as someone who's been coding with agents consistently for almost a year. cc @sseraphini