Aswin Manohar

5.4K posts


@Aswin_polymath

ex-astrophysicist | data scientist & MLE | seeking meaning through creative exploration | I write about humans, ai, tech, films, art & philosophy

Germany · Joined May 2010
223 Following · 288 Followers
Pinned Tweet
Aswin Manohar@Aswin_polymath·
To do: 1. Write a screenplay 2. Launch an app 3. Make a film
1 reply · 0 retweets · 4 likes · 1.6K views
Aswin Manohar@Aswin_polymath·
@helloiamleonie Super cool! I built some apps to integrate them with openclaw. Happy to know my thought process is not that irrelevant
0 replies · 0 retweets · 1 like · 57 views
Leonie@helloiamleonie·
If you're not designing your software product to be used by AI agents, you're going to get left behind. User Experience. Developer Experience. Now Agent Experience.
I mapped out the full Agent Journey from a DX lens: Discover → Evaluate → Onboard → Integrate → Advocate
Blog: leoniemonigatti.com/blog/agent-exp…
[image]
12 replies · 13 retweets · 77 likes · 3.1K views
Alexey Grigorev@Al_Grigor·
I collected 100+ GitHub repos with real AI engineering take-home assignments (plus hiring challenges and candidate submissions). Then I analyzed them to see what companies were asking for in Q4 2025 and Q1 2026. The result is one repo that makes the patterns easy to study in one place.
It includes:
- Company-issued assignments
- Candidate submissions
- Hiring challenges and competitions
- Interview prep repos and templates
Link: github.com/alexeygrigorev…
[4 images]
7 replies · 72 retweets · 494 likes · 28.2K views
Aswin Manohar@Aswin_polymath·
Looking for marketing interns for an education consulting start-up. Retweet for more attention and comment if you want to know more.
0 replies · 0 retweets · 0 likes · 14 views
Aswin Manohar@Aswin_polymath·
@atmoio To avoid this bias, I use a skill called /evil. It becomes very critical of me and shoots down all my ideas until I have really made up my mind that I want to build.
0 replies · 0 retweets · 1 like · 38 views
Mo@atmoio·
AI is making CEOs delusional
1K replies · 2.6K retweets · 19.1K likes · 2.8M views
Aswin Manohar@Aswin_polymath·
@archiexzzz They are taking a jab at the AI-native Microsoft 365 suite that Microsoft recently released.
0 replies · 0 retweets · 0 likes · 477 views
Archie Sengupta@archiexzzz·
i have a hunch that anthropic actually has a ‘nuke’ model that is way too capable, and these launches every few days for every vertical are a way to capture the market as much as possible before they publicly launch the nuke model.
13 replies · 7 retweets · 321 likes · 17.8K views
Aswin Manohar retweeted
Muratcan Koylan@koylanai·
I think the reason devs still don't use Skills is that they're generated by LLMs. 99% of AI-generated Skills are just word salad, slop.
The best way to write them is to interview yourself: your repetitive tasks, the tools you use to handle them, the order and context, protocols, and connections... You need to turn your entire life into markdown files, then identify the tasks that can be packaged as Skills, write them manually, and optimize them with LLMs.
Thariq@trq212

somehow skills are still underrated

30 replies · 16 retweets · 306 likes · 45.7K views
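For a concrete picture of what "write them manually" means above: a skill is typically a small directory with a SKILL.md whose YAML front matter tells the agent when to load it. The structure below follows Anthropic's Agent Skills convention; the task itself is a made-up example, not one of the author's skills:

```markdown
---
name: weekly-report
description: Compile my weekly status report from merged PRs and meeting notes. Use when I ask for a weekly report or status update.
---

# Weekly report

1. List the PRs I merged this week (e.g. `git log --since="1 week ago" --author=me`).
2. Pull open action items from my meeting-notes folder.
3. Write one short paragraph per project: shipped, in progress, blocked.
4. Keep the whole report under 300 words; no filler adjectives.
```

The point of the tweet is exactly this: the numbered protocol comes from interviewing yourself about how you actually do the task, not from asking an LLM to invent it.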
Aswin Manohar@Aswin_polymath·
@ThePrimeagen Coding agents are better if you are patient and don't aggressively prompt or iterate just to make something work. You must code like you are learning and verify each implementation and possible edge cases.
0 replies · 0 retweets · 0 likes · 118 views
ThePrimeagen@ThePrimeagen·
i am using supermaven again and i have something to say about this whole AI thing. I think as a group (swe) we rushed so fast into Agents when inline autocomplete + actual skills is crazy.
A good autocomplete that is fast like supermaven actually makes marked proficiency gains, while saving me from the cognitive debt that comes from agents. With agents you reach a point where you must fully rely on their output and your grip on the codebase slips. It's insane how good Cursor Tab is.
Seriously, I think we had something that genuinely improves one's coding ability (if you have it). Truly acts as a multiplier, and we left it in the dust because it is not sexy. hurts me on the inside.
218 replies · 133 retweets · 3.7K likes · 182.7K views
Aswin Manohar retweeted
Sydney Runkle@sydneyrunkle·
more power to the model! we often see that the more information we give the model, the better it performs. we're experimenting with giving the model the power to compact its own conversation based on context! try it out in the latest deepagents!
Mason Daugherty@masondrxy

x.com/i/article/2031…

2 replies · 10 retweets · 32 likes · 5.2K views
Ramya Chinnadurai 🚀@code_rams·
Running multiple projects locally is painful. localhost:3000, localhost:3001, localhost:8080... which one is which? One port conflict and your whole setup breaks.
Portless by Vercel Labs fixes this cleanly. Instead of port numbers, you get stable named URLs:
http://myapp.localhost:1355
http://api.myapp.localhost:1355
http://docs.myapp.localhost:1355
What it solves:
• Port conflicts across projects
• Cookie and storage bleeding between apps on different ports
• "Wait, which tab is which?" confusion in monorepos
• Git worktrees: each branch gets its own subdomain automatically
Works with Next.js, Vite, Express, Nuxt, React Router, Angular, Expo.
There's also an AI angle. Coding agents were hardcoding ports and getting them wrong. Named URLs mean your agent always knows exactly where to go.
3.8k stars. v0.5.2. Actively maintained by Vercel Labs.
npm install -g portless
portless run next dev
That's it. github.com/vercel-labs/po…
[image]
65 replies · 77 retweets · 1.1K likes · 111.8K views
Machine Learning Street Talk@MLStreetTalk·
Given Grady didn't take the bait - I will answer! LLMs *break* this pattern (on average). You're not moving to a higher level of understanding imo -- you're (potentially) moving to NO understanding of the implementation. This doesn't always happen, but it seems to happen more to people who go in without domain understanding/expertise.
As Jeremy alludes to in the interview, there is a strategy to mitigate this and stay mentally "tuned in". Roughly speaking, it is about adopting a development (and usage) workflow where you have iterative, interactive, stateful feedback exposing and reinforcing the abstractions/interfaces of the application.
Just so it doesn't sound like I'm chatting random bullshit, here's an actual example that I used to create the very video you watched above. It's an agentic application, so I built this timeline by just chatting with my agent. But every single iteration I had rich visual stateful feedback using the interface of the application and its abstractions. I could iterate many times, and it was all reproducible / immutable history.
So rather than doing a bunch of half-baked stuff with LLMs and having no idea what the current state is, I'm always completely tuned in and everything is validated etc. This is an AI literacy thing as we discussed in the interview; it's possible to use AI in this way and build understanding (as I do).
[2 images]
8 replies · 4 retweets · 61 likes · 6.1K views
Machine Learning Street Talk@MLStreetTalk·
A masterclass from @jeremyphoward on why AI coding tools can be a trap -- and what 45 years of programming taught him that most vibe coders will never learn.
- AI coding tools exploit gambling psychology
- The difference between typing code and software engineering
- Enterprise coding AND prompt-only vibe coding are "inhumane", i.e. disconnecting humans from understanding-building
- AI tools remove the "desirable difficulty" you need to build deep mental models.
Out on MLST now!
36 replies · 77 retweets · 614 likes · 127.5K views
Aswin Manohar@Aswin_polymath·
@neural_avb Pretty cool! I am working on a RAG agent over a financial database. I'd like to set up something similar for the project.
0 replies · 0 retweets · 1 like · 113 views
AVB@neural_avb·
I am SOOOO glad I ran this experiment! I have so many actionable insights it is crazy. Highly recommend y'all to set up similar evals for your projects/SaaS.
Context: I have been evaluating different models on the current Paper Breakdown retrieval subagents. Goal is to find cheaper models that get the job done quicker.
Dataset: huggingface.co/datasets/paper…
I have been comparing smaller model outputs against Sonnet-4.6 (results shown below) and gpt-5-mini (current subagent model running in prod). Some insights:
- gemini-3-flash thinks a lot, it returns too many chunks, and explores the paper way too much.
- gemini-3-flash-lite is actually better than 3-flash at this; it even caches additional queries for fast "future retrieval". Very cool!
- grok-fast-non-reasoning outperforms grok-fast-reasoning. And is the CLOSEST to sonnet-4.6 <- this was my biggest surprise.
- gpt-5-mini is very fast, it thinks less, fetches quickly. I have empirically felt it's pretty good and reliable.
- gpt-5-nano is pretty bad at this.
- minimax-m2.5 has high precision (it returns more info than needed), but the problem is the vercel ai gateway provider has been slow :(
- for some reason glm-5 and glm-4.7 have a high failure rate on my task; I am yet to understand why.
Next steps:
- My goal now is to pick some of the best models here, and run either a larger experiment with more test cases, or use an LLM-as-a-judge.
- In the near future, I may go into harness optimizations (i.e. better prompts, better tool descriptions).
I am seeing a ton of free users using the website lately; if I am able to switch to grok-fast-non-reasoning and minimax-m2.5 it will save me actual money.
[4 images]
AVB@neural_avb

I really like the Prime RL school of thinking - "environments & evals are two sides of the same coin" So today I'll convert Paper Breakdown into an RL env. I'll run evals with smaller models to check if I can cut my inference bill without sacrificing rewards.

3 replies · 4 retweets · 91 likes · 7.7K views
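A harness for the kind of comparison described here can be as simple as scoring each cheaper model's retrieved chunks against the strong baseline's. A minimal sketch, with hypothetical chunk ids and model names rather than the author's actual pipeline:

```python
def f1_overlap(predicted_ids, reference_ids):
    """F1 between the chunk-id set a candidate model retrieved
    and the set retrieved by the strong baseline (the reference)."""
    pred, ref = set(predicted_ids), set(reference_ids)
    if not pred or not ref:
        return 0.0
    tp = len(pred & ref)  # chunks both models agreed on
    if tp == 0:
        return 0.0
    precision = tp / len(pred)
    recall = tp / len(ref)
    return 2 * precision * recall / (precision + recall)

# Reference: chunks the baseline model returned on one test case.
reference = ["c1", "c4", "c7"]
candidates = {
    "model-a": ["c1", "c4", "c7", "c9", "c12"],  # over-retrieves (recall 1.0, low precision)
    "model-b": ["c1", "c4"],                     # misses one reference chunk
}
for name, chunks in candidates.items():
    print(name, round(f1_overlap(chunks, reference), 3))
```

Averaging this score over a dataset of test cases gives one number per model, which is enough to rank candidates before moving on to a larger experiment or an LLM-as-a-judge.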
Aswin Manohar@Aswin_polymath·
It is not as simple as you phrase it. There will be a period of 1-2 years where you are still employed and use Claude Code to solve problems, until problem identification and user understanding are themselves automated under one unified platform. There is still some way to go before AI-driven unemployment becomes a thing.
1 reply · 0 retweets · 1 like · 548 views
Paras Chopra@paraschopra·
Been wondering lately how many jobs in the digital economy are hanging by their supervisor’s ignorance of what Claude Code is capable of.
125 replies · 96 retweets · 1.7K likes · 104.4K views
Aswin Manohar retweeted
Jamin Ball@jaminball·
Awesome job by the @databricks team. My summary:
They trained a model called KARL that beats Claude 4.6 and GPT 5.2 on enterprise knowledge tasks (searching docs, cross-referencing info, answering questions over internal data), at ~33% lower cost and ~47% lower latency.
The key insight: instead of throwing expensive frontier models at enterprise search, you can use reinforcement learning on synthetic data to train a smaller model that's faster, cheaper, AND better at the specific task. RL went beyond making the model more accurate. It learned to search more efficiently (fewer wasted queries, better knowing when to stop searching and commit to an answer).
They're opening this RL pipeline to Databricks customers so they can build their own custom RL-optimized agents for high-volume workloads.
I think we'll continue to see data platforms become agent platforms. Databricks' KARL paper is really an agent platform play. The pitch: you already store your enterprise data in the Lakehouse, now Databricks will train a custom RL agent that searches and reasons over it, tuned specifically for your highest-volume workloads (workloads = apps = agents).
The business move is closing the loop: data storage → retrieval → custom agent training → serving, all on Databricks. They're turning "your data lives here" into "your agents live here too."
Kudos @alighodsi @matei_zaharia @rxin
Databricks AI Research@DbrxMosaicAI

Meet KARL: a faster agent for enterprise knowledge, powered by custom reinforcement learning (now in preview). Enterprise knowledge work isn’t just Q&A. Agents need to search for documents, find facts, cross-reference information, and reason over dozens or hundreds of steps. KARL (Knowledge Agent via Reinforcement Learning) was built to handle this full spectrum of grounded reasoning tasks. The result: frontier-level performance on complex knowledge workloads at a fraction of the cost and latency of leading proprietary models. These advances are already making their way into Agent Bricks, improving how knowledge agents reason over enterprise data. And Databricks customers can apply the same reinforcement learning techniques used to train KARL to build custom agents for their own enterprise use cases. Read the research → databricks.com/sites/default/… Blog: databricks.com/blog/meet-karl…

33 replies · 93 retweets · 1.2K likes · 370.4K views
Brian Krassenstein@krassenstein·
BREAKING: Trump has completely lost it.
FACT CHECK: Iran was not at war with anybody but America and Israel
FACT CHECK: Iran has not surrendered to anybody
FACT CHECK: Trump is likely looking for an exit strategy and the only thing he can think of is to claim that Iran lost so he can pull back and paint this as a win rather than a defeat.
[image]
2.5K replies · 5.2K retweets · 17.9K likes · 878.4K views
Aswin Manohar retweeted
Xing Han Lu@xhluca·
You can now install a 1-line BM25 search engine by just calling `pip install bm25`. Powered by bm25s, you can query millions of documents in <10 ms with just a search() call. pypi.org/project/BM25
[2 images]
14 replies · 140 retweets · 1.2K likes · 87.5K views
Aswin Manohar retweeted
Paras Chopra@paraschopra·
I've been building myself a psychophysics lab using Claude Code. Today I added a subliminal perception experiment to it, to check how I perform on tasks where I can't see things (because the stimulus is flashed only briefly, for <50 ms). Turns out my accuracy in the "unseen" condition is 63%, well above 50% random guessing! This shows the brain is indeed able to act on information that I deny having conscious access to!
[image]
19 replies · 5 retweets · 158 likes · 11.7K views