Eeshan

83 posts

@notesundrground

Local AI & agentic coding. Projects - https://t.co/kbpniD7cTC Writing - https://t.co/Oa0owDzo0v

Seattle · Joined July 2023

422 Following · 38 Followers

Eeshan@notesundrground·17h

@codevsdev they didn't .. coding was invented by LLMs in 2023 right?

142

Tom ☕@codevsdev·1d

how did people even learn to code when there was no docs, no YouTube... nothing?

1.6K

599

223K

Eeshan@notesundrground·3d

@ThePrimeagen I built something like this for myself 6 months ago, which doesn't require you to upload your code to a 3rd party site. Everything is analyzed locally and I'm not selling anything. Just a cool view of my coding activity + persona + wrapped-style page: howiprompt.eeshans.com

779

ThePrimeagen@ThePrimeagen·4d

Is this this taste thing I keep hearing about?

521

82.8K

Eeshan@notesundrground·3d

@atmoio Just like all LLM outputs, its writing is a crude "approximation" of human writing.

311

Mo@atmoio·3d

The funny thing about the AI “intelligence” revolution is that AI is basically a writer. Writing what? Anything. Prose. Poetry. Code. Math. It can write. And the question is, if you automated writing, how disruptive is that? It’s not nothing. But also, it’s a pretty whimsical premise right? Like ha wow writing is solved, THEREFORE EVERYONE AND EVERYTHING IS TRULY FUCKED. Like what? How did you get there? IT CAN WRITE A LOT. REALLY FAST. oooo, spooky Clearly the non-delusional take is every human gains an expensive writing/research assistant. And this is clearly, demonstrably the full extent of the revolution.

119

537

37K

Eeshan@notesundrground·6d

@orcdev Spotify wrapped style analytics dashboard to analyze your AI conversations for activity, politeness, vibe-coder index etc. Source: github.com/eeshansrivasta… Demo: howiprompt.eeshans.com

OrcDev@orcdev·6d

Give me your open source projects ⚔️ I'm collecting projects for a new YouTube series. Drop a link below and tell me what you're building 👇

120

12.6K

Eeshan@notesundrground·6d

@hthieblot Stumbleupon

Hubert Thieblot@hthieblot·3 Jun

Anyone who surfed the early web between 1995-2010. What’s the one website/app you still think about?

17.6K

545

11.3K

4.8M

Eeshan@notesundrground·31 May

@helloiamleonie This is with Hermes btw.

Eeshan@notesundrground·31 May

@helloiamleonie Managing some Obsidian workflows. Daily gratitude & wins log, extracting the wins from my vault activity.

2.5K

Leonie@helloiamleonie·31 May

just curious: what’s the most useful thing your OpenClaw, Hermes Agent, etc. is doing for you?

368

1.4K

323.2K

Eeshan@notesundrground·30 May

One of the best @MKBHD videos! Truly fascinating to learn about all this NBA tech and the super talented people behind it.

Marques Brownlee@MKBHD

NEW VIDEO - Shoutout to the Spurs and NBC for letting me go behind the scenes of all the tech behind an NBA broadcast! Watch this before you watch the game tonight 👀 youtu.be/mk_wdHePbtQ

Eeshan@notesundrground·29 May

Of course credit to pi.dev @badlogicgames for the minimal harness that works great with local models. My workflow is also just 3 skills and no subagents to fit my uni-tasking mindset.

Eeshan@notesundrground·29 May

New post in my local AI series where I continue testing and evaluating local LLMs that don't need any data centers to run.

This time I gave four local models a live A/B test to analyze: make API calls to a real database, pull experiment data, run the correct statistical tests, and make a product decision.

Caveat: I don't really NEED an LLM to automate experiment analysis, nor do I think it's a good real-world LLM use case, but this was a very interesting test of complex multi-step tool calling and hallucination resistance over a long procedural task.

In short, these tiny <35B parameter models are capable enough for such narrow agentic tasks. Qwen 3.6 35B A3B is still my M4 Mac Pro laptop champion over multiple benchmarks. @Alibaba_Qwen

I used my live A/B test memory game as the source, and wrote about the methodology and results on my substack: theasymptotic.substack.com/p/local-ai-ser…

Live A/B test at absim.eeshans.com
Workbench + gallery at localai.eeshans.com

Read if you're interested in this domain as well. I would love to hear your thoughts.
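
Editor's sketch: the post does not say which statistical tests the models were expected to run, so the following is only an illustration of the "run the test, then make a product decision" step, assuming the experiment compares conversion rates between two variants. The function names, the pooled two-proportion z-test, and the 0.05 threshold are all assumptions, not the author's method.

```python
import math


def two_proportion_z_test(conv_a, n_a, conv_b, n_b):
    """Two-sided z-test for a difference in conversion rates (pooled variance).

    Hypothetical helper: the original post does not name the test it used.
    """
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    if se == 0:
        raise ValueError("standard error is zero; rates are all 0 or all 1")
    z = (conv_b / n_b - conv_a / n_a) / se
    # Two-sided p-value under the standard normal: 2 * (1 - Phi(|z|)).
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


def decide(conv_a, n_a, conv_b, n_b, alpha=0.05):
    """Turn the test result into a simple product decision (assumed rule)."""
    z, p_value = two_proportion_z_test(conv_a, n_a, conv_b, n_b)
    if p_value >= alpha:
        return "no significant difference"
    return "ship B" if z > 0 else "keep A"


if __name__ == "__main__":
    # Made-up counts, purely for illustration.
    print(decide(conv_a=100, n_a=1000, conv_b=150, n_b=1000))
```

An agent doing this through tool calls would still have to fetch the real counts first; the arithmetic above is the easy part, which is presumably why the post treats the multi-step procedure as the interesting test.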

Eeshan@notesundrground·29 May

@badlogicgames Same. The ones that don't need massive data centers to run.

Mario Zechner@badlogicgames·29 May

except for open weights models i can run locally :D

117

3.9K

Mario Zechner@badlogicgames·29 May

when i worked on game dev tools (including mobile) i was always excited when a new WWDC or Google I/O dropped. Usually meant new stuff. when i started working with LLMs sometime in 2023, i was always excited about new model drops. I do not have excitement anymore.

408

25.9K

Eeshan@notesundrground·28 May

@F2aldi My daily driver. And I use it minimally as well. Only added a couple of extensions for web search and that’s it. No subagents or plan mode. Fits my unitasking workflow perfectly.

289

λL-D1 | AI for Buzzer 🍉@F2aldi·28 May

Has anyone tried Pi harness seriously? I’m curious how it compares to OpenCode for open-weight model workflows. On paper Pi feels more minimal/customizable, while OpenCode feels more ready out of the box. Which one feels better in real coding work?

18.7K

Eeshan@notesundrground·24 May

@buildwithsid Super clean! Loved the reddit roast app :D

siddharth@buildwithsid·24 May

updated my portfolio website my cleanest design yet 👀

774

74.3K

Eeshan@notesundrground·24 May

@levie Just realized that you're a CEO too .. it's refreshing to see this level of self-awareness and critical thinking. Thank you!

1.5K

Aaron Levie@levie·24 May

CEOs are uniquely prone to AI psychosis because they’re sufficiently distant from the last mile of work that still has to happen to generate most value with AI. So when they play with AI, they see the happy path results, often not considering the next 10 or 20 things that have to happen to get sustainable results from agents. “Look I made this awesome product prototype”. Yes but you didn’t have to review the code before it went into production and fix a bunch of issues. “Look I generated a contract”. Yes but you didn’t verify all the terms before it goes out to the counterparty and didn’t have to wire up all the past contracts to work with. The best thing you can do as a CEO is to use AI a *ton* to figure out the real implications of agents in the enterprise, and come out the other side with an appreciation for both the upside and the real work that goes into them.

Michal Malewicz@michalmalewicz

CEOs are the most delusional about AI. Detached from reality.

311

793

7.2K

1.2M

Eeshan@notesundrground·23 May

@badlogicgames Completely agree! Time to slow down, focus on use cases, and establish best practices to make these models into a 'reliable' superautocomplete.

284

Mario Zechner@badlogicgames·23 May

this has been my experience as well. there definitely were improvements, specifically wrt shell based computer use, but also regressions, especially in the last 3 version bumps of flicker and gerpertee corp. the last step change was 3.x to 4.x in flicker land, probably mostly due to them getting all coding sessions from april to october 2025 via CC. similar timeline with GPT and Codex. at least in my line of work, no big jumps after that. the benchmark increases mean literally nothing in the real world. i suppose we have a data problem now. only so much you can RL into those damn things. and with ralph loops/swarms/agents reviewing agents/whatever, you get less and less human signal to improve RL, would be my uneducated guess. also very hard to capture design/system thinking in RL would be my guess. all that said: if we are at the top of the S curve now, then i'll take what we got. plenty useful, even if it won't replace me fully nor partially anytime soon.

David Cramer@zeeg

Whenever someone talks about how much models have improved over the last 3, 6, or 12 months I’m like “sure, but not enough to matter”. I’m still solving the same problems I was solving a year ago and there’s no plausible path forward. If you honestly ask yourself how much has genuinely changed I think it will make you a lot more grounded about how much might change in the future. It’s not to say there’s not been visible improvement, but there have been no exponential leaps in capabilities of the technology. Only exponential micro benchmarks.

309

49K

Eeshan@notesundrground·19 May

@julien_c I wrote a gist about this with all the flags that I've been using. 35B works super smooth on my Mac M4 Pro 48 GB, and 27B is usable as well. The MTP update definitely helped. gist.github.com/eeshansrivasta…

456

Julien Chaumond@julien_c·19 May

I've seen some confusion online on how to run llama.cpp with MTP (Multi-token prediction) in the simplest way possible.

ICYMI, MTP is a new flavor of speculative decoding built-in to the model itself, that ~2x your tokens per sec for most use cases. 2x generation speed = Truly a game changer. 🔥

How to run it?

brew upgrade llama.cpp
# or you might need to install from source until build 9200 is in your package manager:
brew install llama.cpp --HEAD

Then pick either the Dense 27B or the 35B A3B MoE.

Personally I tend to stick to the Dense model where I achieve ~30 tok/sec on my machine. The MoE is of course way faster at an impressive ~100 tok/sec on my machine. Truly rapid. ⚡️

In both cases you probably want 48GB or better 64GB RAM or VRAM, though 36GB might work with more strongly-quantized versions.

# Dense:
llama-server -hf ggml-org/Qwen3.6-27B-MTP-GGUF --spec-type draft-mtp --spec-draft-n-max 2

# MoE:
llama-server -hf ggml-org/Qwen3.6-35B-A3B-MTP-GGUF --spec-type draft-mtp --spec-draft-n-max 3

Enjoy!

425

20.6K
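
Editor's sketch: to give a feel for where a figure like "~2x" in the MTP tweet above can come from, here is the usual idealized model of speculative decoding, in which each drafted token is accepted independently with a fixed probability. The function name and the acceptance rates are illustrative assumptions; nothing here describes how llama.cpp implements MTP, and real speedups also depend on the cost of drafting.

```python
def expected_tokens_per_step(accept_prob: float, draft_len: int) -> float:
    """Expected tokens emitted per verification pass of the main model.

    Idealized model (an assumption): each of the `draft_len` drafted tokens is
    accepted independently with probability `accept_prob`, and the pass always
    yields one extra token from the main model. This gives the geometric sum
    1 + p + p^2 + ... + p^draft_len.
    """
    if not 0.0 <= accept_prob <= 1.0:
        raise ValueError("accept_prob must be in [0, 1]")
    if draft_len < 0:
        raise ValueError("draft_len must be non-negative")
    if accept_prob == 1.0:
        return float(draft_len + 1)
    return (1 - accept_prob ** (draft_len + 1)) / (1 - accept_prob)


if __name__ == "__main__":
    # Draft lengths 2 and 3 mirror the values passed in the tweet's commands;
    # the acceptance rates are made up.
    for draft_len in (2, 3):
        for p in (0.6, 0.8):
            tokens = expected_tokens_per_step(p, draft_len)
            print(f"draft_len={draft_len} accept_prob={p}: {tokens:.2f} tokens/pass")
```

Under this model, a draft length of 2 with an 80% acceptance rate gives about 2.44 tokens per pass, which is the right order of magnitude for the quoted speedup once drafting overhead is subtracted.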

Eeshan@notesundrground·19 May

Oh I didn't know that. That tracks with my very subjective visual benchmarks that show Gemma models producing very low quality outputs compared to Qwen models. Those benchmarks also helped me find a cheaper Claude / Codex replacement (GLM 5.1) while I wait for the local models to get better. localai.eeshans.com

120

maybeyonas@maybeYonas·19 May

@notesundrground @Michaelzsguo will try it out today. the only reason i was staying away from gemma models was cause the unsloth quants kld numbers were higher. so, i thought the deterioration would be higher, compared to similarly quantized qwen.

Michael Guo@Michaelzsguo·19 May

I’ve tried driving Qwen 3.6 on my MacBook Pro with a few different agent harnesses: Claude Code: too slow. It moves like a turtle. Codex: Qwen stopped in the middle of tasks, probably due to tool-call mismatches between them. Qwen Code: so far the best experience. Very smooth. Qwen Code sometimes even tells a joke while you’re waiting. This video is played at original speed so you can get a real feel for how Qwen works.

127

20.4K

Eeshan@notesundrground·19 May

@maybeYonas @Michaelzsguo I think you can with <80K context. Gemma 4 26B with IQ3 might fit better, and it's a good model too.

maybeyonas@maybeYonas·19 May

@notesundrground @Michaelzsguo ah, thanks for that clarification. I thought i was insane to try and run q3 quant on my 24GB model.

Eeshan@notesundrground·19 May

@maybeYonas @Michaelzsguo No I think it's about 20-25GB, but I have other applications running on my laptop. Honestly, that's the real-world use case I'm testing out. Definitely not buying any separate GPUs.

137

maybeyonas@maybeYonas·19 May

@notesundrground @Michaelzsguo damn, so model + kv-cache + overhead is close to 40GB ??

118
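
Editor's sketch: the "model + kv-cache + overhead" question in this thread can be estimated with back-of-the-envelope arithmetic. The helper names and every number below are hypothetical; the formula assumes a plain transformer where every layer keeps a full key and value cache, so models with sliding-window or non-attention layers, or with a quantized cache, would need less.

```python
GIB = 2 ** 30


def kv_cache_bytes(n_layers, n_kv_heads, head_dim, context_len, bytes_per_elem=2):
    """Size of a full KV cache in bytes.

    The leading factor of 2 counts one key tensor and one value tensor per
    layer; bytes_per_elem=2 assumes an fp16 cache.
    """
    return 2 * n_layers * n_kv_heads * head_dim * context_len * bytes_per_elem


def total_memory_gib(weights_gib, kv_bytes, overhead_gib=1.0):
    """Rough total: quantized weights + KV cache + a flat runtime overhead."""
    return weights_gib + kv_bytes / GIB + overhead_gib


if __name__ == "__main__":
    # Hypothetical architecture, not taken from any model card.
    kv = kv_cache_bytes(n_layers=32, n_kv_heads=8, head_dim=128, context_len=4096)
    print(kv / GIB)  # 0.5 (GiB); grows linearly with context_len
    print(total_memory_gib(weights_gib=20.0, kv_bytes=kv))  # 21.5
```

Because the cache grows linearly with context length, the same hypothetical model at 80K context would need roughly twenty times the 4K figure, which is why context size matters as much as the quantization level when fitting a model into laptop memory.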
