Aidan

82 posts

@AidanGior

Always building. Usually somewhere between accounting tech, automation, AI agents, market intel, and business software.

Joined November 2016

1.1K Following · 99 Followers

Aidan@AidanGior·3d

@adxtyahq Honestly useless answer without mentioning chunking/indexing methods

English

593

aditya@adxtyahq·3d

“design a RAG pipeline for 10M docs with zero hallucination” apparently this was asked in a Google L5 interview round. came across it somewhere on the internet and honestly it’s a way more interesting system design problem than most classic distributed systems questions

1. ingest + normalize docs
- remove duplicates, standardize formats, extract metadata, maintain version history

2. hybrid retrieval (BM25 + embeddings)
- BM25 handles exact keyword matching while embeddings capture semantic meaning
- semantic search alone usually struggles with precision at massive scale

3. ANN retrieval + reranking
- ANN (approximate nearest neighbor) quickly pulls top candidate chunks from millions of docs
- then a reranker rescoring step improves relevance by deeply comparing query vs retrieved chunks

4. source confidence scoring
- every retrieved chunk gets scored based on freshness, trust level, overlap and retrieval consistency
- low-confidence context should never heavily influence generation

5. constrained generation
- the model is only allowed to answer using retrieved context (nothing new to be invented outside of the retrieved context)

6. citation-backed responses
- every major claim links back to exact chunks, documents or timestamps

7. hallucination fallback layer
- if retrieval confidence drops below a threshold: “insufficient evidence found”

8. continuous evals
- run adversarial queries, retrieval recall benchmarks and hallucination tests continuously

9. caching + memory layer
- cache high-frequency enterprise queries and retrieval paths (improves latency and output)

10. observability everywhere
- trace retrieval paths, chunk rankings, token attribution and failure points

Also at 10M docs, retrieval quality matters more than the frontier model itself.
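Steps 2 and 7 of the pipeline above (hybrid retrieval and the fallback layer) can be sketched in a few lines. This is a minimal illustration, not the method from the post: reciprocal rank fusion is swapped in here as one common way to merge a BM25 ranking with an embedding ranking, and the function names, the constant `k=60`, the document IDs, and the score thresholds are all assumptions chosen for the example.

```python
from typing import Dict, List, Tuple

FALLBACK_MESSAGE = "insufficient evidence found"


def reciprocal_rank_fusion(
    rankings: List[List[str]], k: int = 60
) -> List[Tuple[str, float]]:
    """Merge several ranked lists of doc IDs into one scored list.

    Each document earns 1 / (k + rank) from every ranking it appears in,
    so documents that rank well in both BM25 and embedding search rise.
    """
    scores: Dict[str, float] = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


def select_context(
    fused: List[Tuple[str, float]], min_score: float
) -> List[str]:
    """Keep only chunks whose fused score clears the confidence threshold."""
    return [doc_id for doc_id, score in fused if score >= min_score]


def respond(context: List[str]) -> str:
    """Refuse to answer when no chunk is confident enough (fallback layer)."""
    if not context:
        return FALLBACK_MESSAGE
    return "answer grounded in: " + ", ".join(context)


if __name__ == "__main__":
    bm25_ranking = ["d1", "d2", "d3"]       # hypothetical keyword results
    embedding_ranking = ["d2", "d4", "d1"]  # hypothetical semantic results
    fused = reciprocal_rank_fusion([bm25_ranking, embedding_ranking])
    print(respond(select_context(fused, min_score=0.03)))
    print(respond(select_context(fused, min_score=0.05)))
```

In a real system the threshold would be tuned against an eval set rather than picked by hand, and a reranker (step 3) would usually sit between fusion and the confidence check.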

English

320

2.7K

189.1K

Aidan reposted

Blake Burge@blakeaburge·6d

Underrated life advice: Make yourself easy to root for. Be kind. Be reliable. Celebrate other people’s wins. Work hard without complaining. Carry good energy into rooms. You'll be shocked by how many doors open for you by making life better for others.

English

172

3.6K

23.6K

491.8K

Aidan@AidanGior·6d

@_vmlops Pretty sure these are distilling models. Claude costs more because they actively fight distillation

English

Vaishnavi@_vmlops·16 May

CHINESE DEVS ARE BURNING 100M+ GPT-5.4 TOKENS FOR ~$1/DAY
▫️ they buy api access from resellers who exploit cheap regional subscriptions at massive scale
▫️ gpt costs them 3% of official price. claude costs more because anthropic made it harder to crack
▫️ when pirates can undercut you by 97%, your pricing model is the real problem

English

456

214K

Aidan@AidanGior·13 May

@jonaswillett1 CAFE

English

Jonas@jonaasw1·12 May

I created a list of my favorite cafes to work at in NYC. Coffee shops > offices. I started my company inside a coffee shop. Beautiful spaces. Good energy. Surrounded by people locking in. And you randomly meet the most interesting people.
My list includes the cafe name, neighborhood, and a proprietary, confidential scoring system based on:
- Work space (tables, outlets, WiFi, design)
- Food (quality, options, price)
- Music (playlist, can you take calls)
- People (do interesting people go here)
- Coffee (does it hit)
- Vibes (overall energy)
I'd love to share this list with you + add new spots. Comment "CAFE" and I'll DM you the list.

English

381

752

207.9K

Aidan@AidanGior·9 May

@swang_co moshava is one of the best. not open on weekends though

English

1.3K

Serena Wang@swang_co·9 May

best spots to lock in on a drizzly Saturday in NY 🌧️
- conwell coffee hall
- Georgie’s cafe
- haraz coffee house
- stone street
- cafe jalu
- mori coffee
- plantshed
- the lost draft
- toby’s estate in tribeca
- kings street coffee
- the blue bottle on broadway
- moshava
only including the few that are both laptop friendly over the weekend and with ample seating for founders building/raising, come co-work with me next Sat!

English

1.2K

111.2K

Aidan@AidanGior·2 May

@hiarun02 Often using Claude/Codex CLIs in Cursor and only using Cursor AI for basic stuff and quick edits. Problem is Cursor is starting to feel very slow and bloated

English

419

Arun@hiarun02·1 May

Is anyone still using VSCode instead of switching fully to Claude Code, Cursor IDE, or OpenAI Codex? Or am I the only one?

English

352

419

84.3K

Aidan@AidanGior·28 Apr

@victorcardenas @ellisgouller_ @kevinbai0 🐐

QME

535

Victor Cardenas Codriansky@victorcardenas·28 Apr

@ellisgouller_ Yep - @kevinbai0 you wanna rip one?

English

Victor Cardenas Codriansky@victorcardenas·28 Apr

We built this internally at Slash and it has genuinely changed the trajectory of our company. Insane what perfect, real-time context for anyone at your org does for productivity.

Y Combinator@ycombinator

Company Brain @t_blom Every company has critical know-how scattered across people's heads, old Slack threads, support tickets, and databases, and AI agents can't operate like that. We think every company in the world is going to need a new primitive: a living map of how the company works that turns its own artifacts into an executable skills file for AI.

English

556

168.6K

Aidan@AidanGior·27 Apr

@garrytan This comes to mind and has been top of mind for me x.com/ashwingop/stat…

Ashwin Gopinath@ashwingop

x.com/i/article/2042…

English

Aidan@AidanGior·27 Apr

@garrytan Wow zero LLM entity extraction? Hybrid GraphRAG is nice if done well but will decay quickly at scale compared to the 2MB eval harness

English

647

Garry Tan@garrytan·26 Apr

For GBrain I built a proper eval harness. 145 queries, Opus-generated corpus. The retrieval stack uses graph based, vector based and Grep based strategies in combination.

The graph layer is worth +31 points on precision. Vector-only misses 170/261 correct answers that the full system finds. Keyword + vector + graph are three separable wins, each load-bearing.

Standard information retrieval metrics: the same ones Google uses to measure search quality.

Precision at 5: You ask a question, the system returns 5 results. How many of those 5 are actually useful? If 3 out of 5 are relevant, P@5 = 60%. It measures: am I wasting your time with junk results?

Recall at 5: For a given question, there might be 3 pages in the entire brain that are genuinely relevant. If the system finds all 3 in its top 5, R@5 = 100%. If it only finds 1, R@5 = 33%. It measures: am I missing things you need?

High precision = low noise. High recall = nothing slips through.

GBrain's 97.9% R@5 means it almost never misses the right answer. The 49.1% P@5 means about half the results are relevant, which is good when you realize that for most queries there are only 1-2 right answers out of 17,888 pages, so 2.5 hits out of 5 is strong signal.

Entity resolution is zero-LLM-call: regex extracts typed links (works_at, invested_in, founded) on every write. Re-embed on write not on a timer, so decay = stale pages, and stale pages get rewritten when new info lands.

Scorecards: github.com/garrytan/gbrai…
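The precision and recall definitions in the post above translate directly into code. The sketch below is only an illustration of those two metrics, not GBrain's actual harness: the function names and the page IDs are made up for the example, and it assumes each retrieved list contains no duplicate IDs.

```python
from typing import List, Set


def precision_at_k(retrieved: List[str], relevant: Set[str], k: int = 5) -> float:
    """Fraction of the top-k results that are relevant (noise check)."""
    hits = sum(1 for doc_id in retrieved[:k] if doc_id in relevant)
    return hits / k


def recall_at_k(retrieved: List[str], relevant: Set[str], k: int = 5) -> float:
    """Fraction of all relevant pages that show up in the top k (miss check)."""
    if not relevant:
        return 0.0  # nothing to find, so report zero rather than divide by zero
    hits = sum(1 for doc_id in retrieved[:k] if doc_id in relevant)
    return hits / len(relevant)


if __name__ == "__main__":
    relevant_pages = {"p1", "p3", "p5"}  # hypothetical ground truth

    # 3 of the 5 results are relevant, and all 3 relevant pages were found.
    good_run = ["p1", "p2", "p3", "p4", "p5"]
    print(precision_at_k(good_run, relevant_pages))  # 3 of 5 relevant
    print(recall_at_k(good_run, relevant_pages))     # 3 of 3 found

    # Only 1 of the 3 relevant pages makes the top 5.
    weak_run = ["p1", "x1", "x2", "x3", "x4"]
    print(recall_at_k(weak_run, relevant_pages))     # 1 of 3 found
```

Averaging these per-query values over the whole query set gives headline numbers of the kind quoted in the post.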

English

466

211.7K

Aidan@AidanGior·24 Apr

@tom_doerr Feels like the next level of Karpathy’s knowledge bases

English

Tom Dörr@tom_doerr·24 Apr

Structured knowledge graph for AI agent memory github.com/iwe-org/iwe

English

173

8.2K

Aidan@AidanGior·24 Apr

@mytechceoo @albertadevs @getaxal Very interested to see where this goes. Token maxxing makes sense while driving adoption but token efficiency will soon become the priority as costs grow with adoption. Even personally I’m starting to want to see where I’m spending and what value I’m getting

English

150

Jason@mytechceoo·21 Apr

CEO obsessed with token maxxing

English

282

13K

1.9M

Aidan reposted

Gideon Shalwick@GideonShalwick·23 Apr

Hot take: Vibe coding doesn’t (fully) replace thinking. It replaces hard core, coalface coding.
If you want it to actually work, you still need:
- Deep understanding of what the market wants
- A clear user experience (not just “it works”)
- Real UI design with proper affordances
- Backend + infrastructure that can scale
- Security that won’t bite you later
- A plan for distribution
Vibe coding replaces the old dev bottleneck. It doesn’t replace product, strategy, or execution.
What do you think @garrytan?

English

440

113.8K

Aidan@AidanGior·21 Apr

Wow I had a veryyy similar idea just last week. Started building a variation of it that I believe is more powerful. I agree though, the only thing stopping agents from being infinitely more helpful is them not having the full context on our work, goals, problems, procedures, etc. Luckily that data is readily available

English

1.1K

Tibo@thsottiaux·20 Apr

We are releasing a *research preview* of Chronicle in Codex. It allows codex to build up memories based on your day to day work on your computer and then refer to these memories to be a lot more helpful. Available for PRO subscriptions and on Mac to start. This is early and consumes quite a bit of tokens, but it has changed how I and many folks at OpenAI use Codex.

OpenAI Developers@OpenAIDevs

Last week, we released a preview of memories in Codex. Today, we’re expanding the experiment with Chronicle, which improves memories using recent screen context. Now, Codex can help with what you’ve been working on without you restating context.

English

239

150

2.6K

938.9K

Aidan@AidanGior·12 Apr

@TalkinYanks This is the same team as always. Just run and put the ball in play and we’ll beat ourselves

English

1.6K

Talkin' Yanks@TalkinYanks·12 Apr

Yankees have lost four straight

English

229

1.1K

64.1K

Aidan@AidanGior·8 Apr

@RyanFieldABC They can’t say they didn’t expect this. Tickets are only a few $ for a lot of games anyway, $0 is not a game changer. Sad reality of a sport with 81 home games, especially on a cold weeknight

English

10.3K

Ryan Field@RyanFieldABC·8 Apr

The Mets owner has a legitimate gripe.

English

149

220

20K

1.9M

Aidan@AidanGior·8 Apr

@danielmerja Haraz is excellent

English

378

Daniel Merja ( gotogether.ai )@danielmerja·8 Apr

This went viral so here is another NYC hidden gem: Haraz Coffee House Open until 12am! Free WiFi, tons of space and outlets everywhere. On Spring street right by the exit of E, C subway line.

Daniel Merja ( gotogether.ai )@danielmerja

By popular demand, another NYC hidden gem: free lockers and changing room at NYRR on 57th and 8th ave. Going for a run in central park 🏃

English

783

75.5K

Aidan@AidanGior·7 Apr

@a16z One of my favorite speeches ever. The spirit we need, now growing again with AI. Choose the hard things. Bring out the best of our energy and skill.

English

681

a16z@a16z·6 Apr

We're going back to the moon.

English

438

215.5K

Aidan@AidanGior·3 Apr

@Vtrivedy10 @hwchase17 Frontier models shine for general apps. Open models with specialized harnesses crush targeted tasks. Running MiniMax 2.7 in DeepAgents for much lower costs

English

113

Viv@Vtrivedy10·2 Apr

we’re leaning incredibly hard into Open Models + Open Harnesses
evals show that current open models get near frontier (or better) intelligence on many tasks, they’re way cheaper, and usually faster
real world tasks need to take perf, cost, latency into account
many tasks don’t need the bleeding edge frontier intelligence, they need specialized intelligence funneled into a problem
with open models, production traces becomes even more of a moat with harness engineering and finetuning
we’re so early, a big part of the future will be open everything 🚀

Mason Daugherty@masondrxy

x.com/i/article/2039…

English

14K

Aidan@AidanGior·26 Mar

Probably user error but the first issue I faced was the main agent continuously checking on the async subagent, so it’s not really async. I don’t fully see a reason to let the main agent check at all for my use case, so it would be nice to have more granular control over tools like that by default

English

Harrison Chase@hwchase17·25 Mar

@AidanGior Try it out - let us know any feedback!

English

214

Harrison Chase@hwchase17·25 Mar

⛳️async subagents in deepagents==0.5.0a1
i think in future we will "chat" with a single agent, and it will manage multiple longer running agents in the background
you can now do this with deepagents
we released an alpha release (0.5.0a1) with this functionality - try it out and let us know what you think!
docs.langchain.com/oss/python/rel…

English

155

21.8K

Aidan@AidanGior·26 Mar

@shiri_shh Is it still dogfooding if it’s SOTA?

English

221

shirish@shiri_shh·25 Mar

The Anthropic team is dogfooding Claude Code at insane levels.
In the last 52 days, the Claude team dropped 50+ major UPDATES.
One employee alone hit $150,000 in a single month on Claude Code.
80% of employees use it daily, with power users racking up six-figure bills.

Claude@claudeai

Your work tools in Claude are now available on mobile. Explore Figma designs, create Canva slides, check Amplitude dashboards, all from your phone. Give it a try: claude.com/download

English

938

431.9K
