Aditya

198 posts

@adityab29_

Joined June 2022

636 Following · 108 Followers

Aditya@adityab29_·15h

@srikosuri banger

Sri Kosuri@srikosuri·1d

Why did Erdos have so many problems?

127

174

2.7K

233.3K

Aditya@adityab29_·1d

best ai meme ever.

Siddhartha Saxena@siddsax

Anthropic onboarding day: Michael Scott introducing Karpathy like he just signed Wemby in free agency.

Aditya retweeted

Abhishek Eswaran@AbhishekEswaran·1d

this is a naive understanding of where models are headed. cursor is a prime example of this. they built one of the best coding harnesses, but opus 4.5 absorbed the entire harness into the model, making years of cursor’s work redundant. yes, you need a harness, but it should be minimal, just enough to build an interface between the api and the user, nothing else. pi coding agent is the classic example here.

dharmesh@dharmesh

The harness matters more than the model. Models have gotten really good. Great reasoning, large context windows, better instruction following. But, what makes *use* of those capabilities is actually the harness. It's what provides tools, memory, skills and context to the model. ChatGPT is a harness. Claude Cowork is a harness. Without the harness, the model is just an engine with no car. You don't get anywhere.

212

Aditya retweeted

Jeremy Howard@jeremyphoward·4d

We desperately need better ways of evaluating models. Something that shows how helpful they are at working hand-in-hand with humans to help them get stuff done in a cooperative/iterative way. The Claude models have consistently been better at this, and the market rewards that.

194

15.2K

Aditya@adityab29_·5d

agi - a giant ipo

Aditya@adityab29_·17 May

frederickvanbrabant.com/blog/2026-05-1…

Aditya@adityab29_·17 May

@prastran_inc same, it's the only microsoft product i like.

Prasad Chatur desale@prastran_inc·2 May

Gotta say Microsoft Clipchamp is so intuitive. Never edited a video, but managed to do what had to be done at the last minute.

Aditya@adityab29_·17 May

@prastran_inc i always wanted something like this during college!! something that goes through the college email and whatsapp and just notifies me of what's due for submission.

Prasad Chatur desale@prastran_inc·16 May

well i just made something to help my adhd: a script that goes through my whatsapp and my emails to remind me of my obligations. it presents this stuff in a go tui which opens on a workspace on my laptop as soon as it starts up. really helpful

Aditya@adityab29_·15 May

built a pi extension to enable fast mode for openai models on pi, in case you want to try it out - github.com/aditya-borse/p…

197

Aditya@adityab29_·15 May

@himanshustwts true. also i think there is no such thing as a non-verifiable task. it is all about whether you are creative enough to come up with a feasible and fair verifier.

107

himanshu@himanshustwts·15 May

there are many many domains for training data / envs which are under explored right now and this can be realized from the fact that so far it followed a single principle - how easily you can verify the output. well coding evolved cuz it offered free programmatic verification and building the equivalent "compiler" for other domains is one of the interesting (and intensive) problems to work on. example: bio is the frontier of the verification problem right now.

2.7K

Aditya@adityab29_·13 May

@paraschopra true, we need more benchmarks like arc agi 3 which measure skill acquisition efficiency.

Paras Chopra@paraschopra·13 May

Been thinking a lot about continual learning and I feel we probably have it backwards. Most formulations care about reducing “catastrophic forgetting” on previously learned tasks when you learn new tasks, but what matters in the real world is speed of adaptation to new tasks. It’s irrelevant if, as adults, we can solve grade 10 math exams; what matters is if we have learned good representations that are composable such that we can adapt to new tasks with minimal training. You’ve trained well if you can re-learn grade 10 math quickly as an adult, not that you can solve it out of the box. So we should be measuring performance of AI systems on future expected distributions of tasks, not the distribution encountered in the past.

221

12.5K

Aditya@adityab29_·12 May

@leerob

Lee Robinson@leerob·12 May

Code is actually the right abstraction.
Too often I see the future of software engineering diminished down to, effectively, writing and reviewing markdown files. Yes, it will be hard to review thousands of lines of agent code. But maybe the takeaway is that you want less code?
Rather than just giving up ("well I guess we won't read the code, or we'll read this lossy markdown summary") this should be a signal forcing you to think about better systems.
- How can we make our codebase more verifiable? For example, fast/robust/stable tests, or moving to a typed language.
- How can we deslop or improve the architecture/abstractions of the code generated by agents? For example, spending more time up front on the codebase architecture/types before yolo generating all of the code.
- How are we going to maintain and evolve this codebase over time? The slop compounds. One great solution here is... you guessed it, learning from the past decades of software engineering! For example, you might just have the wrong abstraction entirely, leading to a ton of duplicated code.
I think the markdown folks *are* right in some ways. If you are using skills every day, for many different prompts and workflows, isn't that effectively "coding with markdown"? Kinda. There's been plenty of ink spilled on the merits and benefits of skills. To me, skills make your style of working legible for agents. They don't replace code and that's not really the point.
In reality, there's this messy and constantly re-evolving future in which both of these things are true:
1. Skills (and markdown) are important for how you give input to the agents and ensure high-quality code & systems are created
2. Looking at the actual code will not be replaced by markdown summaries or a collection of spec documents that ignore the lower level details of the code
In summary: reality has a surprising amount of detail (and nuance)!

110

1.3K

113.1K

Aditya@adityab29_·10 May

@AbhishekEswaran @sidsimharaju will try this today!

Abhishek Eswaran@AbhishekEswaran·9 May

@sidsimharaju @adityab29_

110

Siddharth@sidsimharaju·9 May

Tip: if you are returning from BLR airport, stand in the uber line and share the cab with the person in front. half the fare, one new friend.

1.8K

87.6K

Aditya@adityab29_·10 May

@KushBang1 in claude we believe :)

Kush Bang@KushBang1·10 May

Stop expecting an AI to cure all diseases or solve all problems just because it can read all the scholarship and “think” for a very long time. No matter how much an AI “knows,” it is always too little.

Aditya retweeted

François Chollet@fchollet·19 Feb

Sufficiently advanced agentic coding is essentially machine learning: the engineer sets up the optimization goal as well as some constraints on the search space (the spec and its tests), then an optimization process (coding agents) iterates until the goal is reached. The result is a blackbox model (the generated codebase): an artifact that performs the task, that you deploy without ever inspecting its internal logic, just as we ignore individual weights in a neural network. This implies that all classic issues encountered in ML will soon become problems for agentic coding: overfitting to the spec, Clever Hans shortcuts that don't generalize outside the tests, data leakage, concept drift, etc. I would also ask: what will be the Keras of agentic coding? What will be the optimal set of high-level abstractions that allow humans to steer codebase 'training' with minimal cognitive overhead?

174

412

3.5K

426.2K

Aditya retweeted

Arushi Gandhi@arushi_ressl·7 May

It's easy to claim the end of white collar work. But very hard to get LLMs to reliably edit PPTX. A lot of the workflows our agents address have LLMs taking PPTX, DOCX, XLSX and editing them to get real world outcomes out. A lot of the white collar work is just this. We've tested this extensively, and performance is consistently poor unless you write painstaking, slide-by-slide prompts defining what "acceptable" looks like. At which point, it’s almost not worth it because of how much these templates and expected outcomes change in the real world. My co-founder has penned down some of his thoughts on how different models perform at simple PPTX editing tasks in the URL below. Please check it out and hit us up with any feedback :)

7.6K

Aditya@adityab29_·28 Apr

@fortelabs github.com/MarkEdit-app/M…

Tiago Forte@fortelabs·28 Apr

Can anyone recommend a dead simple, easy to use notetaking app based on markdown files? Obsidian is far too complicated for me. I'm looking for Apple Notes, but with markdown storage

751

978

316.9K

Aditya@adityab29_·23 Apr

great writeup nrehiew.github.io/blog/minimal_e…

172

Aditya retweeted

kshitij@Kshitijjkapoor·21 Apr

real benchmarks are the users you made happy along the way

866

Aditya@adityab29_·18 Apr

@thdxr someone with ocd might

dax@thdxr·18 Apr

are there people out there who just want to refactor every day? just wake up and find the worst code and just chip away at it and clean it up wake up the next day do it again, infinitely improving things with zero external impact?

769

3.4K

321.8K
