Talha Sarı
@talhasarit
Agentic Engineer
46 posts
Joined November 2025
119 Following · 11 Followers
Talha Sarı @talhasarit·
@Dimillian who's gonna maintain the documentation afterwards? I don't think it's sustainable.
Thomas Ricouard @Dimillian·
/goal document the whole project
Derya Unutmaz, MD @DeryaTR_·
This is the situation I am in with Codex: Pro 20x account, 2 days left until the weekly reset, and I am down to 0%! 😅 Now I'm burning through my API credits! Though, I think I'll defer making another Codex pet until May 5th! Burn, tokens, burn! 🤣
[tweet media]
Talha Sarı @talhasarit·
@mttcer Agreed. AI agents still don't have "taste". They will almost always generate slop whenever you let them work autonomously with no steering.
Mattia Cerutti @mttcer·
Not a criticism of Max, but I still don't get how people can let agents run this long. For me, I use agents only for planning / coding individual features, but I still need control over the overall codebase and decisions. I feel like if I let an agent run for hours, I'll come back to a messy codebase that would take days to understand and review.

Max Weinbach @mweinbach
Codex goal mode is kinda crazy
Talha Sarı reposted
Sam Altman @sama·
"post-AGI, no one is going to work and the economy is going to collapse" "i am switching to polyphasic sleep because GPT-5.5 in codex is so good that i can't afford to be sleeping for such long stretches and miss out on working"
Derya Unutmaz, MD @DeryaTR_·
I'd been part of the OpenAI early tester group for GPT-5.5. I believe with GPT-5.5 Pro we've reached another inflection point, comparable to what I felt with the original release of o1-preview and then with 5.0 Pro. It's that feeling of crossing a milestone threshold that pushes us into a new era 🔥
Talha Sarı @talhasarit·
@kcosr It's better to use subagents for research tasks, and maybe not inheriting the context from the parent is better?
Kevin @kcosr·
How do folks handle sub-agents inheriting context from the parent and executing instructions meant for the parent? Disable the context preservation? Tell the parent to tell the sub-agent to ignore previous instructions? Codex, BTW.
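The fresh-context approach suggested in the thread above (spawn research subagents with only a task brief, not the parent's transcript) can be sketched roughly like this. The message shapes and function names are hypothetical illustrations, not Codex's actual subagent API:

```python
# Sketch: spawning a subagent WITHOUT inheriting the parent transcript, so
# instructions meant for the parent never leak into the subagent's context.

def spawn_subagent(parent_messages, task_brief):
    """Fresh context: a subagent-specific system prompt plus the task brief only."""
    return [
        {"role": "system", "content": "You are a research subagent. "
                                      "Follow only the task brief below."},
        {"role": "user", "content": task_brief},
    ]

def spawn_inheriting_subagent(parent_messages, task_brief):
    """Anti-pattern: the subagent sees every instruction meant for the parent."""
    return parent_messages + [{"role": "user", "content": task_brief}]

parent = [
    {"role": "system", "content": "You are the orchestrator. Never edit files."},
    {"role": "user", "content": "Plan the refactor."},
]

fresh = spawn_subagent(parent, "Research: how does the auth module issue tokens?")
# The orchestrator-only instruction is absent from the fresh context.
assert all("orchestrator" not in m["content"] for m in fresh)
```

The trade-off is that anything the subagent genuinely needs (file paths, constraints) must be written explicitly into the task brief, which is arguably the point: the brief becomes the whole contract.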
Dwayne @CtrlAltDwayne·
@Liinad_De_Varge The point is he asks the smartest guests the stupidest questions you'll ever hear. And he's usually having to read them off a list. The guests he gets are wasted.
Dwayne @CtrlAltDwayne·
Lex Fridman podcast, every single episode:
Lex: My guest today invented modern computing. Before we start. What is a computer?
Guest: Well it's a machine that
Lex: But what IS a computer. Is a rock a computer.
Guest: No.
Lex: The atoms inside the rock are computing.
Guest: That's not how any of this
Lex: Do you love your work.
Guest: Sure?
Lex: On a scale of 1 to 10 how much do you think Stalin loved his work.
Guest: What
Lex: We're 4 hours in. I want to ask you about consciousness.
Guest: You haven't let me finish a sentence yet
Lex: Beautiful. Beautiful question my friend.
[tweet media]
Talha Sarı @talhasarit·
@AlicanKiraz0 And every post reeks of AI, which has started to get really annoying.
Alican Kiraz @AlicanKiraz0·
From what I see on LinkedIn, companies and employees aren't chasing efficiency/quality/savings gains with "AI", or transforming their profession by learning the technical side, or building projects. Almost every post is meant as news or engagement bait… The whole timeline has turned into a newswire.
Talha Sarı @talhasarit·
@dexhorthy I really want to understand what you actually mean by "you cannot outsource thinking". Do you have an earlier post where you explained this topic in more depth?
dex @dexhorthy·
You cannot outsource the thinking

Big Brain AI @realBigBrainAI
Peter Steinberger, creator of OpenClaw, on why AI agents still produce "slop" without human taste in the loop: "You can create code and run all night and then you have like the ultimate slop because what those agents don't really do yet is have taste."

Peter is direct: raw capability without direction still produces mediocre output. "They are spiky smart and they're really good at things, but if you don't navigate them well, if you don't have a vision of what you're going to build, it's still going to be slop. If you don't ask the right questions, it's still going to be slop."

Great AI-assisted work is defined by the human guiding it. @steipete describes his own creative process when starting a new project: "When I start a project, I have like this very rough idea what it could be. And as I play with it and feel it, my vision gets more clear. I try out things, some things don't work, and I evolve my idea into what it will become."

Most people skip this part entirely, front-loading everything into a single prompt and wondering why the result feels hollow. "My next prompt depends on what I see and feel and think about the current state of the project." Each step informs the next. The work itself is the feedback loop. "But if you try to put everything into a spec up front, you miss this kind of human-machine loop. And then I don't know how something good can come out without having feelings in the loop — almost like taste."

The agentic trap is what happens when you remove yourself from the process too early.
aptalca @gaytheguy·
@SelmanKahyaX boss, can you tell us what a new graduate is supposed to do? If we learn to code, AI does all of it; if we don't learn, we can't check what the AI produces. Is that right? We're stuck in a really absurd loop. While seniors churn out work like crazy, juniors and new grads are caught between learning and leaning on AI.
Selman Kahya @SelmanKahyaX·
The strongest argument for "the need for software developers will continue": work is expected to be finished in very short timeframes, and the "let's take this on too, we'll handle it" attitude has become very common. The starting point is ten times the scope you would normally take on. Not something a vibecoder can solve.

Cursor @cursor_ai
We partnered with University of Chicago economist @SuproteemSarkar to study how more capable models have changed the way people use Cursor. Across 500 teams, we find that developers are tackling more ambitious work with AI, with a 68% increase in high-complexity tasks this year.
Talha Sarı @talhasarit·
@LLMJunky Now we have a new cheap alternative via cloud usage, don't we? $1.4/$4.4 per 1M input/output tokens!
Furkan Baytekin @furkanbytekin·
claude code planned for 30 minutes. built a project. I'm doing cleanup on it now. It added a /users/me endpoint; I told it that it returns 200 but throws me to /login. IT HAD WRITTEN THE TOKEN TO LOCAL STORAGE
Talha Sarı reposted
Andrej Karpathy @karpathy·
Wow, this tweet went very viral! I wanted to share a possibly slightly improved version of the tweet in an "idea file". The idea of the idea file is that in this era of LLM agents, there is less of a point/need in sharing the specific code/app; you just share the idea, then the other person's agent customizes & builds it for your specific needs. So here's the idea in a gist format: gist.github.com/karpathy/442a6… You can give this to your agent and it can build you your own LLM wiki and guide you on how to use it, etc. It's intentionally kept a little bit abstract/vague because there are so many directions to take this in. And of course, people can adjust the idea or contribute their own in the Discussion, which is cool.

Andrej Karpathy @karpathy
LLM Knowledge Bases

Something I'm finding very useful recently: using LLMs to build personal knowledge bases for various topics of research interest. In this way, a large fraction of my recent token throughput is going less into manipulating code, and more into manipulating knowledge (stored as markdown and images). The latest LLMs are quite good at it. So:

Data ingest: I index source documents (articles, papers, repos, datasets, images, etc.) into a raw/ directory, then I use an LLM to incrementally "compile" a wiki, which is just a collection of .md files in a directory structure. The wiki includes summaries of all the data in raw/, backlinks, and then it categorizes data into concepts, writes articles for them, and links them all. To convert web articles into .md files I like to use the Obsidian Web Clipper extension, and then I also use a hotkey to download all the related images to local so that my LLM can easily reference them.

IDE: I use Obsidian as the IDE "frontend" where I can view the raw data, the compiled wiki, and the derived visualizations. Important to note that the LLM writes and maintains all of the data of the wiki; I rarely touch it directly. I've played with a few Obsidian plugins to render and view data in other ways (e.g. Marp for slides).

Q&A: Where things get interesting is that once your wiki is big enough (e.g. mine on some recent research is ~100 articles and ~400K words), you can ask your LLM agent all kinds of complex questions against the wiki, and it will go off, research the answers, etc. I thought I had to reach for fancy RAG, but the LLM has been pretty good about auto-maintaining index files and brief summaries of all the documents, and it reads all the important related data fairly easily at this ~small scale.

Output: Instead of getting answers in text/terminal, I like to have it render markdown files for me, or slide shows (Marp format), or matplotlib images, all of which I then view again in Obsidian. You can imagine many other visual output formats depending on the query. Often, I end up "filing" the outputs back into the wiki to enhance it for further queries. So my own explorations and queries always "add up" in the knowledge base.

Linting: I've run some LLM "health checks" over the wiki to e.g. find inconsistent data, impute missing data (with web searches), find interesting connections for new article candidates, etc., to incrementally clean up the wiki and enhance its overall data integrity. The LLMs are quite good at suggesting further questions to ask and look into.

Extra tools: I find myself developing additional tools to process the data, e.g. I vibe coded a small and naive search engine over the wiki, which I both use directly (in a web ui), but more often I want to hand off to an LLM via CLI as a tool for larger queries.

Further explorations: As the repo grows, the natural desire is to also think about synthetic data generation + finetuning to have your LLM "know" the data in its weights instead of just context windows.

TLDR: raw data from a given number of sources is collected, then compiled by an LLM into a .md wiki, then operated on by various CLIs by the LLM to do Q&A and to incrementally enhance the wiki, and all of it viewable in Obsidian. You rarely ever write or edit the wiki manually; it's the domain of the LLM. I think there is room here for an incredible new product instead of a hacky collection of scripts.
Talha Sarı @talhasarit·
@oguzergin @hi_rullah The difference with the Nemotron models here may be that they are both open source and open weight. Since everything, down to the dataset they trained the model on and how they trained it, is open source, I think it could still have been a candidate for the pioneer category; what do you think?
Oğuz Ergin @oguzergin·
@hi_rullah I didn't include it here because it wasn't very successful. We debated it with Claude and decided not to put it in :)
Oğuz Ergin @oguzergin·
The AI race isn't slowing down! I'm sharing our up-to-date timeline. Models released in the last 40 days:
Mar 3: GPT-5.3 Instant, Gemini 3.1 Flash-Lite
Mar 5: GPT-5.4
Mar 16: Mistral Small 4
Mar 17: GPT-5.4 mini
Mar 18: MiniMax-M2.7
Mar 27: GLM-5.1
Mar 31: Qwen 3.6
Apr 2: Gemma 4
[tweet media]
Talha Sarı @talhasarit·
@paulbohnenkamp @JasonBotterill Did you put the whole article into an agents.md kind of thing? What was your approach like? How does each fresh session still obey the harness-engineering best practices?
Paul Bohnenkamp @paulbohnenkamp·
@JasonBotterill +1 Tried the harness eng setup from the OpenAI article, fed Codex their docs/ MD system and guided it to keep that context updated as it worked. At that point I was mostly just narrating features into existence.
JB @JasonBotterill·
I've become a god of fucking MD file context management. I can make Codex comprehend insanely large codebases by having it dynamically write a living record. Honestly it makes 200k of context plenty. You don't need more if you're proactive.
Talha Sarı @talhasarit·
@JasonBotterill Do you have any recommendations or guides on MD file context management best practices to share with us?
Ashley Ha @ashleybchae·
oh-my-codex is lowkey the best thing to happen to codex cli. you can even throw a vague idea at $ autopilot & it plans, codes, & tests everything. my friends built this & it's cracked. you can even use $ ralph to clone Resend.com in ~7 hr with no human input (link soon!) @Yun_HDY @bellman_ych github.com/Yeachan-Heo/oh…
Talha Sarı @talhasarit·
@LLMJunky @clairernovotny Do you think you should update the strategies in swarms, now that Codex can talk to subagents and steer them, instead of spawning a subagent being a one-pass, get-the-result-back thing like it used to be?
am.will @LLMJunky·
@clairernovotny My strategies are evolving, but I do generally have some guidance for you on my GitHub at github.com/am-will/swarms. I need to update the skills, but the idea is the same. I've just added TDD to my workflow.
am.will @LLMJunky·
Codex compaction is truly the #1 killer feature for me, and has been ever since 5.2. I used to also have context window anxiety, and built an entire orchestration system to complete tasks within ~40% of the context window. It was a huge amount of manual work, and very time consuming. That is, until @steipete made me aware of their new compaction endpoint.

It was a difficult habit to break, especially coming from Claude models, where you absolutely have to stay out of the "dumb zone." But I started to trust it more and more. Now, I literally don't worry about compaction at all. It doesn't matter if it compacts 7-8 times. That is why, aside from very large docs or codebases, I don't really care that much about large context windows. Why should I?

If you're still obsessing about resetting your context window, I would encourage you to try GPT 5.4 and just let it ride. I do recommend you write your spec to a markdown file so that the agents can manage state and track progress through compactions. This helps keep it on track. My strategy is to have the orchestrator update the spec after every task is complete with a concise log of its work. To me, this is the biggest difference maker between Codex and literally every other product.

dominik kundel @dkundel
I haven't had context window anxiety since GPT-5.1-Codex-Max when the model got natively trained on compaction. I let a thread go on until the feature is done and rely on auto compaction! You can even bring that same compaction into your own apps 👇