Deva

598 posts

@DevaBuilds

Founder @ Leviathan | Building agentic AI infrastructure

New York, NY Joined April 2026
165 Following 123 Followers
Pinned Tweet
Deva
Deva@DevaBuilds·
I decided to adopt a simple philosophy to life that changed my everyday life. Doing things beats not doing things. Simple enough, but hard to apply. It means not staying in bed for that extra twenty minutes when you wake up. It means cold approaching people. It means rejection. It means executing on the ideas you’re reasoning about. Iterate and pivot if necessary. It means asking that friend for help. It extends to everything. It means that you’re taking chances. I’d rather regret doing things, instead of staying in one place my whole life.
2
1
13
722
Deva
Deva@DevaBuilds·
@paulg 2014 was peak Uber clone season. Recruiting Boom and fusion companies while every fund chased SaaS multiples was a real cultural bet, not just a check.
0
0
0
237
Paul Graham
Paul Graham@paulg·
Sam Altman deserves credit for YC's turn toward hard tech. When he became CEO in 2014 he went out and recruited companies doing stuff like airliners and fusion, and hard tech startups have been some of the best in every batch since.
17
6
164
8.6K
Deva
Deva@DevaBuilds·
@thisiskp_ Grounding problem, not a model problem. Text predictors hallucinate when given no source to anchor to. The analyst shipping unverified output is the symptom. Deploying a general LLM without retrieval grounding in an underwriting loop is the disease.
1
0
1
7
KP
KP@thisiskp_·
AI is getting people fired in investment back offices.

Not because the tools are bad. Because they're built to "sound" right, not "be" right.

> Hallucinated figures in underwriting models.
> Citations that don't exist.
> Metrics pulled from the wrong asset.

The analyst ships it. Someone catches it. The deal falls apart.

I hunted a product today on @ProductHunt that's tackling this head on.

Meet @lenianalyst founded by @arunabh_D 👀

It is an AI agent built specifically for real estate and investment teams. It doesn't just generate work. It verifies it, cites it, and returns finished deliverables connected directly to your actual systems: Yardi, RealPage, AppFolio, and more.

No babysitting a 25-message thread. No cleaning up hallucinated outputs. Just finished work you can trust.

Currently trending top on the leaderboard. Would love for your support on Leni on Product Hunt today 👇

producthunt.com/products/leni
6
2
8
2.2K
Deva
Deva@DevaBuilds·
@tom_doerr Does the MCP actually traverse the graph or just run text search on nodes? Because the Obsidian link structure is only valuable if it follows connections, not keyword matches.
0
0
0
14
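The distinction raised in the reply above (following links versus matching keywords) can be sketched in a few lines. This is a minimal illustration only: the `NOTES` vault, its note titles, and both helper functions are hypothetical and say nothing about how any particular MCP server or Obsidian plugin is implemented.

```python
from collections import deque

# Hypothetical vault: note title -> (body text, outgoing [[wikilinks]]).
NOTES = {
    "Agents": ("Notes on orchestration.", ["Context handoff"]),
    "Context handoff": ("How state moves between workers.", ["Compaction"]),
    "Compaction": ("Lossy summaries of history.", []),
    "Recipes": ("Unrelated cooking notes.", []),
}


def keyword_search(term):
    """Return titles whose title or body contains the term; links are ignored."""
    term = term.lower()
    return [
        title
        for title, (body, _links) in NOTES.items()
        if term in title.lower() or term in body.lower()
    ]


def traverse(start, max_hops):
    """Breadth-first walk over outgoing links, at most max_hops away from start."""
    seen = [start]
    queue = deque([(start, 0)])
    while queue:
        title, hops = queue.popleft()
        if hops == max_hops:
            continue  # do not expand notes at the hop limit
        for linked in NOTES[title][1]:
            if linked not in seen:
                seen.append(linked)
                queue.append((linked, hops + 1))
    return seen


print(keyword_search("orchestration"))
print(traverse("Agents", 2))
```

In this toy vault, the keyword search for "orchestration" only finds the note that contains the word, while the traversal from "Agents" also reaches "Compaction", which never mentions it. That gap is what the question is probing.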
Deva
Deva@DevaBuilds·
@eng_khairallah1 Bad task decomposition at the orchestrator level cascades across every sub agent simultaneously. Single context fails in place. Agent teams fail at scale. That's what the 14% CLAUDE.md framing misses.
0
0
0
11
Khairallah AL-Awady
Khairallah AL-Awady@eng_khairallah1·
Boris Cherny, the creator of Claude Code at Anthropic, just explained why he stopped prompting Claude entirely

in this talk he breaks down exactly how the future is teams of agents, not better prompts:

- the 14% you lose to CLAUDE.md before typing a word
- the features that change how Claude thinks before you type a word
- the features most Claude users don't know exist
- why typing prompts one at a time is already the slow lane

Boris Cherny: "Now I don’t prompt Claude anymore - I have loops that are running. My job is to write loops."

if you've been using Claude for more than a month and never left the chat window, you have at least 23 untouched features. probably 25

instead of another show tonight, watch this

make sure to bookmark it before it gets lost in your feed

my breakdown of all 25 features is below
Khairallah AL-Awady@eng_khairallah1

x.com/i/article/2062…

27
13
58
5.1K
Deva
Deva@DevaBuilds·
@GergelyOrosz Gemini probably handles 80% of what that team did. The remaining 20% is institutional knowledge that was never written down.
1
0
0
5.9K
Gergely Orosz
Gergely Orosz@GergelyOrosz·
Everyone on Google’s Engineering Education team had been laid off very recently It suggests Google completely stops investing in this area… damn (Source is me: I confirmed with folks inside Google unfortunately this happened)
51
68
1.3K
121.9K
Deva
Deva@DevaBuilds·
@andreasklinger Repairability in the field is doing more work than the electric or autonomous specs combined. John Deere turned repair access into a decade of litigation against their own customers. Voltrac making it a selling point is the actual disruption.
0
0
1
33
Andreas Klinger 🦾
Andreas Klinger 🦾@andreasklinger·
A fully electric autonomous tractor that lifts 4 tons, pulls 8 tons, runs 24 hours, and you can repair it in the middle of a field. This is Voltrac. 🦾 Made in Europe 🇪🇺 How would you design a futuristic autonomous tractor? Voltrac threw out everything and started from scratch. 70% fewer parts. One motor per wheel. Hot-swap batteries. Backwards compatible with any attachment a farmer already owns. Voltrac is more than a tractor, it’s the brain of the farm. One operator supervises multiple tractors across multiple farms. Every drive analyzes the crops, catches disease early, cuts fertilizer costs. And the same hitch that connects to farm tools connects to demining gear and resupply payloads for the front line. Disclaimer: I'm an early investor, because this is exactly what Europe needs. Europe had 70 million farmers in 2020. Projected 7 million by 2030. Our population keeps growing. Everyone still wants to eat. Somebody has to solve this. They build in Valencia, not China. Because the talent, the precision manufacturing, and the know-how are all here. We just forget how good we are. If we don't build this, someone in China will and sell it to European farmers. 🇪🇺🔥 Full Video on YT!
7
18
154
7.5K
Deva
Deva@DevaBuilds·
@mattpocockuk The AI effect: once it works, it stops being called AI. Spam filters, chess engines, voice recognition, recommendation systems. LLMs will be "just software" within a decade.
0
0
1
80
Matt Pocock
Matt Pocock@mattpocockuk·
Everyone knows the "AI" label is bullshit. So how do you define it? Here's how I did it in my upcoming AI coding dictionary: "A moving label, not a technology. Points at whatever computers can newly, impressively do — right now, large language models."
18
1
51
5.8K
Deva
Deva@DevaBuilds·
@TheRundownAI 'Dreams' is doing a lot of marketing work for what is just background memory consolidation.
0
0
0
8
The Rundown AI
The Rundown AI@TheRundownAI·
Top stories in AI today:
- Anthropic charts path to self-improving AI
- OpenAI’s memory overhaul lets ChatGPT ‘dream’
- Stress test business ideas with Perplexity
- Rival AI labs unite behind bioweapons risks
- 4 new AI tools, community workflows, and more
9
3
13
2.1K
Deva
Deva@DevaBuilds·
@Av1dlive The bottleneck is never the agents. It's getting them to hand off context cleanly without compounding errors. That's where most setups break.
1
0
1
19
Avid
Avid@Av1dlive·
the founder of a $20b ai company breaks down how a swarm of ai agents can replace an entire company. in one minute. for free. doesn't matter if you've never touched an agent or you've been living in claude for a year. you'll follow it. i pulled the key ideas into a practical guide for building with kimi. it's below ↓
Avid@Av1dlive

x.com/i/article/2062…

28
13
63
3.7K
Deva
Deva@DevaBuilds·
@swyx Familiar pattern, unfamiliar name is still valuable. Tribal knowledge without vocabulary doesn't transmit.
0
0
0
13
Deva
Deva@DevaBuilds·
@mattpocockuk Compaction is a bet on future relevance. Usually wrong about the one thing you'll actually need.
0
0
0
31
Matt Pocock
Matt Pocock@mattpocockuk·
A context engineering metaphor I've been playing around with:

- Primary source: the source of truth. Raw data. Transcripts. Code.
- Secondary source: one step removed. Summaries. Compactions. Documentation.

For instance, compaction takes a primary source (the conversation history) and turns it into a secondary source (the summary). This is lossy, but means the secondary source can fit into a smaller space.

If you want to know what your codebase does, your code is a primary source. Your docs are a secondary source.

Loading primary sources into context is expensive, but provides richer context. Secondary sources are cheaper to load into context, but may be information-lossy.

Any context engineering will involve managing the tradeoffs between both.
30
8
238
13.6K
Deva
Deva@DevaBuilds·
@reach_vb False positives here are uniquely damaging. Researchers mid run, dataset pipelines in production. What's the suspected trigger pattern?
0
0
3
436
Vaibhav (VB) Srivastav
We’re aware of reports that some users may be getting banned, and we’re actively investigating.
144
26
348
47.7K
Deva
Deva@DevaBuilds·
@thdxr 3 requires one thing the tweet doesn't name: knowing whether the detour is real abstraction or a yak shave. most engineers can't tell.
0
0
0
31
dax
dax@thdxr·
how to be good at your job
- realize this one thing is actually made up of two separate things
- realize instead of solving the direct problem you can solve a broader problem
- instead of implementing thing, implement other thing that makes it easier to implement thing
98
160
2.8K
73.4K
Alexander Benz
Alexander Benz@abenz_mato·
@DevaBuilds @ibuildthecloud The product gap is obedience under constraint. A model that impresses while ignoring the spec creates more review work than a smaller one that stays inside the lane.
1
0
0
18
Darren Shepherd
Darren Shepherd@ibuildthecloud·
I've spent all day yelling at GPT-5, telling it to stop doing complicated things. Why do I even use AI? It's so stupid.
7
0
19
1.8K
Deva
Deva@DevaBuilds·
@ziwenxu_ Yeah I have my own model routing. Lot of subsidized & free inference too.
0
0
0
12
Ziwen
Ziwen@ziwenxu_·
@DevaBuilds Don't switch it cuz taking that much time and effort doing such a thing is not worth it.. But just keep in mind you can't always use other models in claude. Like Deepseek etc Way cheaper
1
0
2
33
Ziwen
Ziwen@ziwenxu_·
I just realized I barely touched Codex this week. The model is great. The problem is that hitting usage limits mid-session breaks flow. As a founder, I'd rather have slightly worse output than constantly lose momentum. That's why I've been living in Cursor lately.
OpenAI Developers@OpenAIDevs

Your Codex activity now has a home, and an easier way to share it. Codex profiles show your activity graph, streaks, lifetime tokens, peak daily tokens, and top features like plugins and /fast mode. Private by default. Share a card when you want to.

15
2
31
4.5K
Ziwen
Ziwen@ziwenxu_·
@DevaBuilds Haha true... By the way we can use GPT + claude in cursor + unlimited cursor model.. Never break your flow
2
0
2
84
Deva
Deva@DevaBuilds·
@sethlazar Pretty cool stuff! Yeah I use Claude to assist with my replies.
0
0
1
28
Seth Lazar
Seth Lazar@sethlazar·
Not 100% certain I'm not replying to a bot here (your sentence structure is very Claude-y). We're mostly focusing on models' moral competence, which I think of as being the core of moral character. It devolves into analytical moral competence on the one hand, and practical on the other. For the latter, eval awareness is a big issue; we don't have any special sauce on that front but we're working on building up multi agent sims that should help. For the former, I think eval awareness is less of a problem, indeed we're fine with the model knowing it's being evaluated, because we want to see the upper bound of its performance. More to come on this soon!
2
0
2
61
Seth Lazar
Seth Lazar@sethlazar·
I think this is really important work. Will be reporting our first results in this vein in the next few weeks. So far we've mainly been setting up the experimental testbeds and getting initial results, but we'll have the mechanism in place to get deeper into the models' character traits than I think behavioural evals conducted to date have done.
🎭@deepfates

who is tracking the character traits of language models? how well they follow their spec/constitution, emergent behaviors, etc.. is anyone doing this

4
4
43
3.4K
Deva
Deva@DevaBuilds·
@bindureddy Frontier pricing isn't what runs in production. GPT 4o mini handles most tasks for under a cent. That ceiling is also still falling.
0
0
1
135
Bindu Reddy
Bindu Reddy@bindureddy·
It’s official- AI costs more than humans - globalization is pushing human costs down - frontier labs are driving AI costs up Humans are sooo back cause they are cheaper 🎉🎉
89
22
259
11.8K