Talha Sarı

46 posts

@talhasarit

Agentic Engineer

Joined November 2025

119 Following, 11 Followers

Talha Sarı@talhasarit·3h

@Dimillian who's gonna maintain documentation afterwards? i don't think it's sustainable..

English

202

Thomas Ricouard@Dimillian·22h

/goal document the whole project

English

305

23.7K

Talha Sarı@talhasarit·3h

@DeryaTR_ tokenmaxxing ftw!

Indonesia

Derya Unutmaz, MD@DeryaTR_·14h

This is the situation I am in with Codex: Pro 20x account, 2 days left until the weekly reset, and I am down to 0%! 😅 Now I’m burning through my API credits! Though, I think I’ll defer making another Codex pet until May 5th! Burn, tokens, burn!🤣

English

148

11.9K

Talha Sarı@talhasarit·1d

@mttcer Agreed. AI agents still don’t have “taste”. They will almost always generate slop whenever you let them work autonomously with no steering.

English

Mattia Cerutti@mttcer·2d

not a criticism of Max, but i still don’t get how people can let agents run this long. for me, i use agents only for planning / coding individual features, but i still need control over the overall codebase and decisions. i feel like if i let an agent run for hours i’ll come back to a messy codebase that would take days to understand and review

Max Weinbach@mweinbach

Codex goal mode is kinda crazy

English

1.6K

Talha Sarı reposted

Sam Altman@sama·Apr 26

"post-AGI, no one is going to work and the economy is going to collapse" "i am switching to polyphasic sleep because GPT-5.5 in codex is so good that i can't afford to be sleeping for such long stretches and miss out on working"

English

1.2K

606

11.2K

1.6M

Talha Sarı@talhasarit·Apr 23

@DeryaTR_ elaborate please

English

433

Derya Unutmaz, MD@DeryaTR_·Apr 23

I’d been part of the OpenAI early tester group for GPT-5.5. I believe with GPT-5.5 Pro we reached another inflection point, comparable to what I had felt with the original release of o1-preview and then with 5.0 Pro. It’s that feeling of crossing a milestone threshold that pushes us into a new era🔥

English

129.4K

Talha Sarı@talhasarit·Apr 22

@kcosr Maybe it's better to use subagents for research tasks and not have them inherit the context from the parent?

English

Kevin@kcosr·Apr 20

How do folks handle sub-agents inheriting context from the parent and executing instructions meant for the parent? Disable the context preservation? Tell the parent to tell the sub-agent to ignore previous instructions? Codex, BTW.

English

742

Talha Sarı@talhasarit·Apr 17

@CtrlAltDwayne @Liinad_De_Varge never.

English

106

Dwayne@CtrlAltDwayne·Apr 17

@Liinad_De_Varge The point is he asks the smartest guests the stupidest questions you'll ever hear. And he's usually having to read them off a list. The guests he gets are wasted.

English

804

45.6K

Dwayne@CtrlAltDwayne·Apr 17

Lex Fridman podcast, every single episode:
Lex: My guest today invented modern computing. Before we start. What is a computer?
Guest: Well it's a machine that
Lex: But what IS a computer. Is a rock a computer.
Guest: No.
Lex: The atoms inside the rock are computing.
Guest: That's not how any of this
Lex: Do you love your work.
Guest: Sure?
Lex: On a scale of 1 to 10 how much do you think Stalin loved his work.
Guest: What
Lex: We're 4 hours in. I want to ask you about consciousness.
Guest: You haven't let me finish a sentence yet
Lex: Beautiful. Beautiful question my friend.

English

312

239

7.8K

677.6K

Talha Sarı@talhasarit·Apr 17

@AlicanKiraz0 Also, every post reeks of AI, and that's starting to get really annoying..

Turkish

Alican Kiraz@AlicanKiraz0·Apr 16

From what I see on LinkedIn, companies and employees are not trying to use “AI” to gain efficiency, quality, or savings, or to transform their profession by learning the technical side, or to build projects. Almost every post is meant as news or engagement bait… The whole timeline has turned into a news feed.

Turkish

181

11.4K

Talha Sarı@talhasarit·Apr 17

@dexhorthy now that i watched your talk at AI Engineer, I know now. beautifully put youtu.be/rmvDxxNubIg?t=…

YouTube

English

Talha Sarı@talhasarit·Apr 14

@dexhorthy I really want to understand what you actually mean by “you cannot outsource thinking”. Do you have an earlier post where you explained this topic in more depth?

English

dex@dexhorthy·Apr 13

You cannot outsource the thinking

Big Brain AI@realBigBrainAI

Peter Steinberger, creator of OpenClaw, on why AI agents still produce "slop" without human taste in the loop: "You can create code and run all night and then you have like the ultimate slop because what those agents don't really do yet is have taste." Peter is direct: raw capability without direction still produces mediocre output. "They are spiky smart and they're really good at things, but if you don't navigate them well, if you don't have a vision of what you're going to build, it's still going to be slop. If you don't ask the right questions, it's still going to be slop." Great AI-assisted work is defined by the human guiding it. @steipete describes his own creative process when starting a new project: "When I start a project, I have like this very rough idea what it could be. And as I play with it and feel it, my vision gets more clear. I try out things, some things don't work, and I evolve my idea into what it will become." Most people skip this part entirely, front-loading everything into a single prompt and wondering why the result feels hollow. "My next prompt depends on what I see and feel and think about the current state of the project." Each step informs the next. The work itself is the feedback loop. "But if you try to put everything into a spec up front, you miss this kind of human-machine loop. And then I don't know how something good can come out without having feelings in the loop — almost like taste." The agentic trap is what happens when you remove yourself from the process too early.

English

8.3K

Talha Sarı@talhasarit·Apr 16

@gaytheguy @SelmanKahyaX You can learn to use AI better and overtake the seniors.

Turkish

368

aptalca@gaytheguy·Apr 16

@SelmanKahyaX boss, could you also say what a new graduate should do? If we learn to code, AI does it all. If we don't, we can't check whether what the AI produced is correct. We're stuck in a really absurd loop. While seniors crank out work like beasts, juniors and new grads are stuck between learning and relying on AI.

Turkish

1.6K

Selman Kahya@SelmanKahyaX·Apr 16

The strongest argument for 'the need for software developers will continue': work is expected to be finished in a very short time, and the 'let's take this on too, we'll handle it' attitude has become very common. The starting point is ten times the scope you would normally take on. It's not something a vibecoder can solve.

Cursor@cursor_ai

We partnered with University of Chicago economist @SuproteemSarkar to study how more capable models have changed the way people use Cursor. Across 500 teams, we find that developers are tackling more ambitious work with AI, with a 68% increase in high-complexity tasks this year.

Turkish

246

43.9K

Talha Sarı@talhasarit·Apr 8

@LLMJunky Now we have a new cheap alternative via cloud usage don’t we? $1.4/$4.4 input/output prices per 1M tokens!

English

am.will@LLMJunky·Apr 7

GLM 5.1 is here, and it's the strongest open weights coding model ever created. Not sure if I'll be able to run this locally. Need moar GPU. It's never enough 🥲

Markets & Mayhem@Mayhem4Markets

GLM-5.1 weights just dropped. 🎉 This is a strong model. > Excels at coding, just under Claude Opus 4.6 and above Gemini 3.1 Pro. > Built to work across longer multi-step agentic workflows. > At 754B it's quite a bit smaller than the frontier models it's competing against

English

4.3K

Talha Sarı@talhasarit·Apr 7

@kimmonismus can we access the "simple harness" that they have built?

English

111

Chubby♨️@kimmonismus·Apr 7

This is a very big deal: GLM-5.1 model can autonomously evaluate and improve its own work over long periods without explicit metrics, shifting from one-shot outputs to sustained, self-directed problem solving. Let's go

Chubby♨️@kimmonismus

Another big release: GLM-5.1! China is on fire! significant increase in evals compared to GLM-5.0 tl;dr GLM-5.1 is the new open-source agentic coding model that significantly outperforms its predecessor by sustaining long-horizon problem-solving over hundreds of iterations, continuously improving results instead of plateauing, achieving state-of-the-art performance on complex software engineering benchmarks.

English

401

18.4K

Talha Sarı@talhasarit·Apr 5

@furkanbytekin skill issue: youtu.be/kwSVtQ7dziU?si…

YouTube

English

818

Furkan Baytekin@furkanbytekin·Apr 4

Claude Code planned for 30 minutes. It built a project. I'm making the fixes now. It added a /users/me endpoint; I told it the endpoint returns 200 but the app redirects to /login. IT WROTE THE TOKEN TO LOCAL STORAGE

Turkish

165

60.5K

Talha Sarı reposted

Andrej Karpathy@karpathy·Apr 4

Wow, this tweet went very viral! I wanted to share a possibly slightly improved version of the tweet in an "idea file". The idea of the idea file is that in this era of LLM agents, there is less of a point/need of sharing the specific code/app, you just share the idea, then the other person's agent customizes & builds it for your specific needs. So here's the idea in a gist format: gist.github.com/karpathy/442a6… You can give this to your agent and it can build you your own LLM wiki and guide you on how to use it etc. It's intentionally kept a little bit abstract/vague because there are so many directions to take this in. And ofc, people can adjust the idea or contribute their own in the Discussion which is cool.

Andrej Karpathy@karpathy

LLM Knowledge Bases

Something I'm finding very useful recently: using LLMs to build personal knowledge bases for various topics of research interest. In this way, a large fraction of my recent token throughput is going less into manipulating code, and more into manipulating knowledge (stored as markdown and images). The latest LLMs are quite good at it. So:

Data ingest: I index source documents (articles, papers, repos, datasets, images, etc.) into a raw/ directory, then I use an LLM to incrementally "compile" a wiki, which is just a collection of .md files in a directory structure. The wiki includes summaries of all the data in raw/, backlinks, and then it categorizes data into concepts, writes articles for them, and links them all. To convert web articles into .md files I like to use the Obsidian Web Clipper extension, and then I also use a hotkey to download all the related images to local so that my LLM can easily reference them.

IDE: I use Obsidian as the IDE "frontend" where I can view the raw data, the compiled wiki, and the derived visualizations. Important to note that the LLM writes and maintains all of the data of the wiki, I rarely touch it directly. I've played with a few Obsidian plugins to render and view data in other ways (e.g. Marp for slides).

Q&A: Where things get interesting is that once your wiki is big enough (e.g. mine on some recent research is ~100 articles and ~400K words), you can ask your LLM agent all kinds of complex questions against the wiki, and it will go off, research the answers, etc. I thought I had to reach for fancy RAG, but the LLM has been pretty good about auto-maintaining index files and brief summaries of all the documents and it reads all the important related data fairly easily at this ~small scale.

Output: Instead of getting answers in text/terminal, I like to have it render markdown files for me, or slide shows (Marp format), or matplotlib images, all of which I then view again in Obsidian. You can imagine many other visual output formats depending on the query. Often, I end up "filing" the outputs back into the wiki to enhance it for further queries. So my own explorations and queries always "add up" in the knowledge base.

Linting: I've run some LLM "health checks" over the wiki to e.g. find inconsistent data, impute missing data (with web searches), find interesting connections for new article candidates, etc., to incrementally clean up the wiki and enhance its overall data integrity. The LLMs are quite good at suggesting further questions to ask and look into.

Extra tools: I find myself developing additional tools to process the data, e.g. I vibe coded a small and naive search engine over the wiki, which I both use directly (in a web ui), but more often I want to hand it off to an LLM via CLI as a tool for larger queries.

Further explorations: As the repo grows, the natural desire is to also think about synthetic data generation + finetuning to have your LLM "know" the data in its weights instead of just context windows.

TLDR: raw data from a given number of sources is collected, then compiled by an LLM into a .md wiki, then operated on by various CLIs by the LLM to do Q&A and to incrementally enhance the wiki, and all of it viewable in Obsidian. You rarely ever write or edit the wiki manually, it's the domain of the LLM. I think there is room here for an incredible new product instead of a hacky collection of scripts.

English

1.1K

2.8K

26.6K

6.9M
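
The quoted post above mentions a "small and naive search engine over the wiki" without showing it. A minimal sketch of what such a tool might look like follows, assuming the wiki is a directory tree of .md files and that plain term-frequency scoring is good enough at this scale. The function names `tokenize` and `search_wiki` and the scoring rule are illustrative assumptions, not the author's actual tool.

```python
import re
from collections import Counter
from pathlib import Path


def tokenize(text: str) -> list[str]:
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def search_wiki(wiki_dir: str, query: str, top_k: int = 5) -> list[tuple[str, int]]:
    """Rank .md files under wiki_dir by how often they contain the query terms.

    Hypothetical helper: the score is the summed count of each distinct
    query term in the file. Files with a score of zero are dropped.
    """
    terms = set(tokenize(query))
    results = []
    for path in sorted(Path(wiki_dir).rglob("*.md")):
        counts = Counter(tokenize(path.read_text(encoding="utf-8")))
        score = sum(counts[term] for term in terms)
        if score > 0:
            results.append((str(path.relative_to(wiki_dir)), score))
    # Highest score first; ties are broken alphabetically by path.
    results.sort(key=lambda item: (-item[1], item[0]))
    return results[:top_k]
```

Wrapped in a small command-line entry point, something like `search_wiki("wiki", "compaction context")` could be handed to an agent as a tool, which matches the CLI usage the post describes.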

Talha Sarı@talhasarit·Apr 3

@oguzergin @hi_rullah What sets the Nemotron models apart here may be that they are both open source and open weight. Since everything is open source, from the dataset the model was trained on to how it was trained, I think they could still have been a candidate for the pioneer category. What do you think?

Turkish

Oğuz Ergin@oguzergin·Apr 2

@hi_rullah I didn't include it here because it wasn't very successful. We discussed it with Claude and decided not to put it in :)

Turkish

365

Oğuz Ergin@oguzergin·Apr 2

The AI race isn't slowing down! Sharing our up-to-date timeline. Models released in the last 40 days: Mar 3: GPT-5.3 Instant, Gemini 3.1 Flash-Lite; Mar 5: GPT-5.4; Mar 16: Mistral Small 4; Mar 17: GPT-5.4 mini; Mar 18: MiniMax-M2.7; Mar 27: GLM-5.1; Mar 31: Qwen 3.6; Apr 2: Gemma 4

Turkish

6.5K

Talha Sarı@talhasarit·Apr 2

@paulbohnenkamp @JasonBotterill Did you put the whole article into an agents.md kind of file? What was your approach like? How does each fresh session still obey the harness engineering best practices?

English

Paul Bohnenkamp@paulbohnenkamp·Apr 2

@JasonBotterill +1 Tried the harness eng setup from the OpenAI article, fed Codex their docs/ MD system and guided it to keep that context updated as it worked. At that point I was mostly just narrating features into existence.

English

402

JB@JasonBotterill·Apr 2

I’ve become a god of fucking MD file context management. I can make Codex comprehend insanely large codebases by having it dynamically write a living record. Honestly it makes 200k of context plenty. You don’t need more if you’re proactive

English

372

30.3K

Talha Sarı@talhasarit·Apr 2

@JasonBotterill Do you have any recommendations or guides to share the best practices of MD file context management with us?

English

106

Talha Sarı@talhasarit·Apr 2

@ashleybchae Does it work on codex app too?

English

294

Ashley Ha@ashleybchae·Apr 2

oh-my-codex is lowkey the best thing to happen to codex cli. you can even throw a vague idea at $ autopilot & it plans, codes, & tests everything. my friends built this & it's cracked. you can even use $ ralph to clone Resend .com in ~7 hr with no human input (link soon!) @Yun_HDY @bellman_ych github.com/Yeachan-Heo/oh…

English

377

39.4K

Talha Sarı@talhasarit·Apr 2

@LLMJunky @clairernovotny do you think you should update the strategies in swarms now that codex can talk to the subagents and steer them better, whereas spawning subagents used to be a one-pass, get-the-result-back thing?

English

am.will@LLMJunky·Apr 1

@clairernovotny my strategies are evolving but i do generally have some guidance for you on my github at github.com/am-will/swarms i need to update the skills but the idea is the same. i've just added TDD to my workflow

English

334

am.will@LLMJunky·Apr 1

Codex compaction is truly the #1 killer feature for me, and has been ever since 5.2. I used to also have context window anxiety, and built an entire orchestration system to complete tasks within ~40% of the context window. It was a huge amount of manual work, and very time consuming. That is until @steipete made me aware of their new compaction endpoint. It was a difficult habit to break, especially coming from Claude models - where you absolutely have to stay out of the "dumb zone." But I started to trust it more and more. Now, I literally don't worry about compaction at all. It doesn't matter if it compacts 7-8 times. That is why, aside from very large docs or codebases, I don't really care that much about large context windows. Why should I? If you're still obsessing about resetting your context window, I would encourage you to try GPT 5.4 and just let it ride. I do recommend you write your spec to a markdown file so that the agents can manage state and track progress through compactions. This helps keep it on track. My strategy is to have the orchestrator update the spec after every task is complete with a concise log of its work. To me, this is the biggest difference maker between Codex and literally every other product.

dominik kundel@dkundel

I haven't had context window anxiety since GPT-5.1-Codex-Max when the model got natively trained on compaction. I let a thread go on until the feature is done and rely on auto compaction! You can even bring that same compaction into your own apps 👇

English

201

19.8K
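
The post above recommends writing the spec to a markdown file and having the orchestrator append a concise log after each completed task, so that state survives compactions. One way this could be sketched is shown below. The file name `SPEC.md`, the "## Progress log" heading, and the `append_progress` helper are assumptions made for illustration; nothing here is a Codex API or the author's actual setup.

```python
from datetime import datetime, timezone
from pathlib import Path

# Assumed heading for the log section; it is expected to be the last
# section of the spec so that new entries can simply be appended.
LOG_HEADING = "## Progress log"


def append_progress(spec_path: str, task: str, summary: str) -> None:
    """Append one timestamped log line for a finished task to the spec file."""
    path = Path(spec_path)
    text = path.read_text(encoding="utf-8") if path.exists() else ""
    if LOG_HEADING not in text:
        # Create the log section once, separated from any existing spec text.
        separator = "\n\n" if text.strip() else ""
        text = text.rstrip("\n") + separator + LOG_HEADING + "\n"
    stamp = datetime.now(timezone.utc).strftime("%Y-%m-%d %H:%M UTC")
    text = text.rstrip("\n") + f"\n- [{stamp}] {task}: {summary}\n"
    path.write_text(text, encoding="utf-8")
```

An orchestrating agent could be instructed to call something like `append_progress("SPEC.md", "task-3", "added login form and tests")` after each task, then reread the file after a compaction to recover where it left off.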
