Avsek Jha

489 posts

@avsekza

engineer, product obsessive, and creator of Auralix. Building https://t.co/lPZeJJ3SxH | https://t.co/BWtF6uJnpZ

United States · Joined August 2024

601 Following · 217 Followers

Pinned Tweet

Avsek Jha@avsekza·18 Apr

▎ 218k tokens saved. $3.27 saved vs $0.07 spent. 47.2× ROI. One session. ULTRA active. ▎ Lifetime across all sessions: ~1.9M tokens saved. $63+ in API costs.

1.1K

Avsek Jha retweeted

Danielle Morrill@DanielleMorrill·1d

I’m freaking out!!!

Josef Chen@josefchen

Launching our new paper on arXiv: we trained the largest multilingual food model ever built. 4.1M recipes. 7 languages. 1,790 ingredients. 300 dimensions. All of human cooking compressed into 2 megabytes.

1.7K

411.1K

Avsek Jha retweeted

Remotion@Remotion·20 Jan

Remotion now has Agent Skills - make videos just with Claude Code!

$ npx skills add remotion-dev/skills

This animation was created just by prompting 👇

787

1.6K

21.2K

17.9M

Avsek Jha@avsekza·15 May

Share your website. Let the work be appreciated. I am sharing mine here! auralix.ai

Avsek Jha@avsekza·10 May

I am a full-time AI engineer. I build products that solve core supply chain problems. This involves spending an average of 5 hours in calls with customers to understand the problems, which reduces my building time. But recently, I devoted all my time to building agents for myself instead of my company. Everything I do is self-reported, and I prepare for anything I am responsible for. This increased my efficiency unimaginably. I am a full-time developer and love it, but any distractions absolutely drop my efficiency and energy. This changed this week. What are you doing for your 9-5?

Avsek Jha@avsekza·10 May

My 9-5 is building for someone else, but my 5-9 is building for myself! What is your 5-9?

Avsek Jha@avsekza·10 May

If you are a founder, let's connect!

Avsek Jha@avsekza·9 May

What are you building today? I am ALL in with auralix

Avsek Jha@avsekza·8 May

@askOkara auralix.ai

Okara@askOkara·8 May

drop your website and i'll ask our ai cmo how to grow it

795

479

75.2K

Avsek Jha@avsekza·7 May

I just finished setting up NanoBrain on my machine, and the results are incredible. It’s a centralized hub for priorities inspired by the Andrej Karpathy "LLM Wiki" blueprint. The real "magic" is how it connects to every data source to keep you synchronized with your schedule in real-time. Finally, a way to solve the digital context-switching problem. 🛠️ Check out the open-source repo here: nanobrain.app #AI #ContextEngineering #LLM #PersonalKnowledgeManagement

Avsek Jha retweeted

self.dll@seelffff·4 May

ex-Googlers published a map of every internal tool Google uses and its open-source equivalent.

15,200 stars. 1,100 forks. 99 contributors.

→ Borg = Kubernetes
→ Spanner = CockroachDB
→ Colossus = HDFS
→ Dremel = DuckDB / Presto
→ Chubby = Zookeeper
→ Stubby = gRPC
→ Zanzibar = SpiceDB
→ Blaze = Bazel
→ MapReduce = Spark

everything Google engineers use every day. all of it has an open-source equivalent. none of it requires working at Google.

like+bookmark

self.dll@seelffff

x.com/i/article/2049…

131

1.5K

232.6K

Avsek Jha@avsekza·2 May

@GoogleAIStudio auralix.ai

Google AI Studio@GoogleAIStudio·2 May

What are you vibe coding this weekend?

411

884

79.5K

Avsek Jha@avsekza·2 May

Did not realize until I used CODEX, 🤯! Next I will add that plugin here!

Avsek Jha@avsekza

It’s incredible what happens when you stop wasting tokens. I’m seeing a massive spike in session endurance using Pith. Even with heavy Claude Code usage, I’m barely hitting the halfway mark on context. If it's this good now, Opus 4.7 is going to be a game-changer. GitHub: github.com/abhisekjha/pith #OpenSource #AI #Claude

Avsek Jha retweeted

Avsek Jha@avsekza·17 Apr

877

Avsek Jha retweeted

elvis@omarsar0·29 Apr

// Agentic Harness Engineering //

Pay attention to this one, AI devs. (bookmark it)

Most coding-agent harnesses are still tuned by hand or brittle trial-and-error self-evolution. This new work introduces Agentic Harness Engineering, a framework that makes harness evolution observable.

They do this through three layers: components as revertible files, experience as condensed evidence from millions of trajectory tokens, and decisions as falsifiable predictions checked against task outcomes. Each edit becomes a contract you can verify or revert.

Results: pass@1 on Terminal-Bench 2 climbs from 69.7% to 77.0% in ten iterations, beating human-designed Codex-CLI (71.9%) and self-evolving baselines like ACE and TF-GRPO. The evolved harness also transfers across model families with +5.1 to +10.1 point gains, while using 12% fewer tokens than the seed on SWE-bench-verified.

Harness work is the biggest hidden cost in most agent systems. This is the first credible recipe for letting the harness improve itself without drifting into noise.

Paper: arxiv.org/abs/2604.25850

Learn to build effective AI agents in our academy: academy.dair.ai

234

1.6K

139.4K

Avsek Jha retweeted

Akshay 🚀@akshay_pachaar·29 Apr

Vector DBs can't reason. Top-k similarity ranks chunks one at a time against a query. That's fine for single-hop fact lookups, and it breaks the moment a question needs information stitched across multiple chunks. That's what the FalkorDB GraphRAG-Bench results expose. The gap is widest on Complex Reasoning (83.61) and Contextual Summarization (85.08), the exact query types where retrieval needs to traverse relations between entities, not score chunks in isolation. Worth a closer look if your workload leans long-form. GraphRAG SDK is 100% open-source: github.com/FalkorDB/Graph…

FalkorDB@falkordb

Token costs spiking. Responses too slow. Users don't trust the answers. All three are retrieval problems. GraphRAG SDK 1.0 is out. Ranked #1 on GraphRAG-Bench against 8 systems. Fewer LLM calls, grounded answers, predictable cost. Open-source: github.com/FalkorDB/Graph…
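
The top-k similarity ranking that Akshay's post describes can be sketched as below. This is a minimal illustration, not FalkorDB's or any vector database's implementation: the function names `cosine` and `top_k`, the toy two-dimensional vectors, and the choice of cosine similarity as the scoring function are all assumptions made for the example. It shows the point the post makes: each chunk is scored against the query on its own, with no step that relates one chunk to another.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def top_k(query, chunks, k):
    """Score every (text, vector) chunk independently against the query; keep the k best."""
    scored = [(cosine(query, vec), text) for text, vec in chunks]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [text for _, text in scored[:k]]

# Hypothetical toy embeddings, chosen only to make the ranking easy to follow.
chunks = [("a", [1.0, 0.0]), ("b", [0.0, 1.0]), ("c", [1.0, 1.0])]
result = top_k([1.0, 0.0], chunks, 2)
print(result)
```

Nothing in `top_k` looks at how chunks relate to each other, which is why a question that needs facts from several chunks combined is outside what this ranking alone can do.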

319

35.3K

Avsek Jha retweeted

Maziyar PANAHI@MaziyarPanahi·27 Apr

Same text. Two privacy filters. OpenAI's model catches 8 categories. OpenMed catches 55+: medical record numbers, blood type, API keys, financial codes, demographics. Trained on Nemotron data by Nvidia. All on-device. All open-source. Coming soon! What's missing?

921

121.3K

Avsek Jha retweeted

Kye Gomez (swarms)@KyeGomezB·26 Apr

Introducing NetWatch 🧑‍💻👀 NetWatch is a CLI security monitor with real-time visibility into all network connections, featuring risk scoring, GeoIP, VPN detection, process validation, and alerts. I built this in an airport on my way to San Francisco. My friends kept warning me about public Wi-Fi, so I made a tool to monitor all traffic to and from my computer lol. Learn more ⬇️🧵

4.7K

Avsek Jha@avsekza·26 Apr

@heyblake auralix.ai 💥

Blake Emal@heyblake·25 Apr

Drop your project URL Let’s drive some traffic

846

392

64.7K

Avsek Jha retweeted

Varun@varun_mathur·16 Apr

Introducing Pods

Hyperspace Pods lets a small group of people (a family, a startup, a few friends) pool their laptops and desktops into one AI cluster. Everyone installs the CLI, someone creates a pod, shares an invite link, and the machines form a mesh. Models like Qwen 3.5 32B or GLM-5 Turbo that need more memory than any single laptop has get automatically sharded across the group's devices: layers split proportionally, inference pipelined through the ring. From the outside it looks like one OpenAI-compatible API endpoint with a pk_* key that drops straight into your AI tools and products. No configuration beyond pasting the key and changing the base URL.

A team of five paying for cloud AI burns $500–2,000 a month on API calls. The same team's existing machines can serve Qwen 3.5 (competitive on SWE-bench) and GLM-5 Turbo (#1 on BrowseComp for tool-calling and web research) for free, since the hardware is already on their desks. When a query genuinely needs a frontier model nobody has locally, the pod falls back to cloud at wholesale rates from a shared treasury. But for the daily work (code reviews, refactors, research, drafting), local models handle it and nobody gets billed. And when it is idle, you can rent out your pod on the compute marketplace, with fine-grained permissions for access management.

There's no central server involved in inference. Prompts go from your machine to your pod members' machines and back: all of this enabled by the fully peer-to-peer Hyperspace network. Pod state (who's a member, which API keys are valid, how much treasury is left) is replicated across members with consensus, so the whole thing works on a local network. Members behind home routers don't need port forwarding either.

The practical setup for most pods is three models covering different jobs: Qwen 3.5 32B for code and reasoning, GLM-5 Turbo for browsing and research, Gemma 4 for fast lightweight tasks. All running on hardware you already own.

Pods ship today in Hyperspace v5.19. Model sharding, API keys, treasury, and Raft coordinator are all live.

What Makes This Different

- No middleman. Your prompts travel from your IDE to your pod members' hardware and back. There is no server in between reading your data.
- No vendor lock-in. Pod membership, API keys, and treasury are replicated across your own machines using Raft consensus. If the internet goes down, your local network keeps working. There is no database in someone else's cloud that your pod depends on.
- Automatic sharding. You don't configure layer ranges or calculate VRAM budgets. Tell the pod which model you want. It figures out how to split it across whatever hardware is online.
- Real NAT traversal. Your friend behind a home router with a dynamic IP? Works. No VPN, no Tailscale, no port forwarding. The nodes handle it.
- Free when local. This is the part that matters most. Cloud AI bills scale with usage. Pod inference on local hardware scales with nothing. The marginal cost of your 10,000th prompt is the electricity your laptop was already using.

Coming soon:

- Pod federation: pods form alliances with other pods.
- Marketplace: pods with spare capacity can sell inference to other pods.
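
The "layers split proportionally" idea in the Pods post can be sketched as below. This is a minimal illustration under stated assumptions, not Hyperspace's actual sharding algorithm, which the post does not describe: the function name `split_layers`, the device names, the use of memory in GB as the weight, and the largest-remainder rounding are all hypothetical choices made for the example.

```python
def split_layers(num_layers, memory_gb):
    """Assign each device a contiguous layer range sized in proportion to its memory.

    memory_gb maps a (hypothetical) device name to its available memory.
    Returns {device: (start, end)} with end exclusive, in the dict's order.
    """
    total = sum(memory_gb.values())
    # Ideal (fractional) number of layers per device.
    shares = {name: num_layers * mem / total for name, mem in memory_gb.items()}
    counts = {name: int(share) for name, share in shares.items()}
    # Rounding down leaves some layers unassigned; give them to the devices
    # with the largest fractional remainders.
    leftover = num_layers - sum(counts.values())
    by_remainder = sorted(shares, key=lambda n: shares[n] - counts[n], reverse=True)
    for name in by_remainder[:leftover]:
        counts[name] += 1
    # Turn per-device counts into contiguous ranges so inference can be pipelined.
    ranges, start = {}, 0
    for name in memory_gb:
        ranges[name] = (start, start + counts[name])
        start += counts[name]
    return ranges

# Hypothetical pod: three machines with 16, 32 and 16 GB, and a 64-layer model.
plan = split_layers(64, {"laptop_a": 16, "desktop_b": 32, "laptop_c": 16})
print(plan)
```

Contiguous ranges are used here because a pipelined ring passes activations from one device to the next in layer order; a real system would also need to account for per-layer size, KV cache, and devices joining or leaving.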

186

297

3.1K

302.8K

Avsek Jha@avsekza·25 Apr

@omarsar0 Use Pith, save cost, save tokens, lots of tokens github.com/abhisekjha/pith

267

elvis@omarsar0·24 Apr

Really impressed by how smooth switching most of my coding tasks to Codex (GPT-5.5) from Claude Code (Opus 4.7) has been. I thought it was going to be more difficult and that I would be "fighting" with the model a lot. If there was ever a good time to try Codex, it would be now. I feel like Codex (GPT-5.5) just gets it and has a "warmer and more welcoming personality" compared to previous iterations. I appreciate the sharp responses and how straight to the point it is. I really don't care about these benchmark numbers anymore. They don't say anything about how good an agent harness performs in real work tasks. GPT-5.5 tries hard to get as much done, given the scope of the instruction. It doesn't try too hard (doing things that I haven't asked) or too little. The effort feels just right. I think that part is hard to get right, but there is something very different about how this model was trained and how aligned it is to be helpful. Skills have also helped a lot in this switch. Both Claude Code and Codex now make proper use of all my skills. MCP tools work right out of the box. I was a heavy user of the Claude-in-Chrome tools, but GPT-5.5 + chrome-cdp skill is actually amazing! Finally, having a blast with coding agents again. Opus 4.7 has superb planning, and I will continue to use it for lots of my research and automated tasks. Will keep testing both Claude Code and Codex and a few other coding agents and share more along the way.

240

36.1K
