thinkinsysdev
@ThinkInSysDev
5.8K posts

systems first, tinkerer, technology fan, thinker, presenter, father, husband, human @capitalgroup, recent pet dad

Anaheim, CA · Joined October 2014
4.8K Following · 845 Followers

Pinned Tweet
thinkinsysdev @ThinkInSysDev ·
@hehehenrihenri If you’re going to put any meaningful production load, you probably need some thermal barriers/heat conduits to carry heat away from the stacked Mac devices. Otherwise you risk low life expectancy for these.
3 replies · 0 reposts · 109 likes · 14.4K views
Hila Shmuel @HilaShmuel ·
Meet Cabinet: Paper Clip + KB. For quite some time I've been thinking about how LLMs are missing the knowledge base - a place where I can dump CSVs, PDFs, and most importantly - an inline web app. Running on Claude Code with agents with heartbeats and jobs. runcabinet.com
Andrej Karpathy@karpathy

LLM Knowledge Bases

Something I'm finding very useful recently: using LLMs to build personal knowledge bases for various topics of research interest. In this way, a large fraction of my recent token throughput is going less into manipulating code, and more into manipulating knowledge (stored as markdown and images). The latest LLMs are quite good at it. So:

Data ingest: I index source documents (articles, papers, repos, datasets, images, etc.) into a raw/ directory, then I use an LLM to incrementally "compile" a wiki, which is just a collection of .md files in a directory structure. The wiki includes summaries of all the data in raw/, backlinks, and then it categorizes data into concepts, writes articles for them, and links them all. To convert web articles into .md files I like to use the Obsidian Web Clipper extension, and then I also use a hotkey to download all the related images to local so that my LLM can easily reference them.

IDE: I use Obsidian as the IDE "frontend" where I can view the raw data, the compiled wiki, and the derived visualizations. Important to note that the LLM writes and maintains all of the data of the wiki; I rarely touch it directly. I've played with a few Obsidian plugins to render and view data in other ways (e.g. Marp for slides).

Q&A: Where things get interesting is that once your wiki is big enough (e.g. mine on some recent research is ~100 articles and ~400K words), you can ask your LLM agent all kinds of complex questions against the wiki, and it will go off, research the answers, etc. I thought I had to reach for fancy RAG, but the LLM has been pretty good about auto-maintaining index files and brief summaries of all the documents, and it reads all the important related data fairly easily at this ~small scale.

Output: Instead of getting answers in text/terminal, I like to have it render markdown files for me, or slide shows (Marp format), or matplotlib images, all of which I then view again in Obsidian. You can imagine many other visual output formats depending on the query. Often, I end up "filing" the outputs back into the wiki to enhance it for further queries. So my own explorations and queries always "add up" in the knowledge base.

Linting: I've run some LLM "health checks" over the wiki to e.g. find inconsistent data, impute missing data (with web searches), find interesting connections for new article candidates, etc., to incrementally clean up the wiki and enhance its overall data integrity. The LLMs are quite good at suggesting further questions to ask and look into.

Extra tools: I find myself developing additional tools to process the data, e.g. I vibe coded a small and naive search engine over the wiki, which I both use directly (in a web ui), but more often I want to hand off to an LLM via CLI as a tool for larger queries.

Further explorations: As the repo grows, the natural desire is to also think about synthetic data generation + finetuning to have your LLM "know" the data in its weights instead of just context windows.

TLDR: raw data from a given number of sources is collected, then compiled by an LLM into a .md wiki, then operated on by various CLIs by the LLM to do Q&A and to incrementally enhance the wiki, and all of it viewable in Obsidian. You rarely ever write or edit the wiki manually; it's the domain of the LLM. I think there is room here for an incredible new product instead of a hacky collection of scripts.

66 replies · 32 reposts · 915 likes · 224.9K views
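The "compile" step Karpathy describes (raw/ sources turned into an index with backlinks) can be sketched roughly as follows. Everything here is an assumption for illustration, not his actual tooling: the `build_index` helper, the `[[...]]` wiki-link style, and the idea that per-source summaries arrive as plain strings from some prior LLM pass.

```python
from pathlib import Path


def build_index(summaries: dict[str, str]) -> str:
    """Render a wiki index page from {source filename: one-line summary}.

    In the workflow described above, the summaries would be written by an
    LLM pass over the files in raw/; here they are plain strings so the
    structure of the compiled wiki is the focus.
    """
    lines = ["# Wiki Index", ""]
    for name in sorted(summaries):
        stem = Path(name).stem
        # Each entry links to a per-source article using [[backlink]]
        # syntax, followed by the summary text.
        lines.append(f"- [[{stem}]]: {summaries[name]}")
    return "\n".join(lines) + "\n"
```

An agent would regenerate this index file incrementally as new documents land in raw/, which is what lets later Q&A passes find relevant articles without a separate RAG index.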
thinkinsysdev @ThinkInSysDev ·
@PeterDiamandis We need a picture- and video-based personal assistant, an audio-visual personal assistant — AVA
0 replies · 0 reposts · 0 likes · 326 views
Peter H. Diamandis, MD @PeterDiamandis ·
The human brain processes visual information 60,000x faster than text. Humans are visual processors, not text processors. Images hit the brain instantly. Words take work. That's why a single SpaceX launch video communicates more than a thousand-word essay—and why your slide decks hit harder than paragraphs. We're wired for pictures, not prose.
1.2K replies · 1.2K reposts · 11.2K likes · 29.7M views
thinkinsysdev retweeted
Guillermo Rauch @rauchg ·
Hiring engineers with 5 years of experience in @chenglou/pretext to create the web rendering toolkit of the future
151 replies · 92 reposts · 3.3K likes · 172.8K views
thinkinsysdev @ThinkInSysDev ·
@elonmusk The Starship booster landing came 56 years after the landing on the Moon
0 replies · 0 reposts · 0 likes · 21 views
thinkinsysdev retweeted
Aakash Gupta @aakashgupta ·
The PMs who figure out how to write evals are going to have a massive advantage over the next 2 years. Not ML evals. Binary scoring criteria for their own prompts, skills, and agents.

"Does the headline include a specific number?" Yes or no. "Does the CTA use a specific action verb?" Yes or no. Run 30 outputs, count the yeses, and you have a score. That score is what lets you run 100 optimization cycles overnight instead of 5 by hand.

I ran this on a landing page copy skill. The baseline: 41%. Four rounds later: 92%. The agent attacked the worst-performing criterion first (headlines failing 80% of the time), added a rule, and the score jumped to 68% in one round. Then it found buzzwords, fixed CTAs, tried tightening word count (which made things worse), and auto-reverted. Three changes kept. One reverted. The eval caught the regression automatically.

The skill that separates PMs who ship reliable AI features from everyone else: writing 3-6 binary questions that define what "good" means for their specific output. That's the whole game.
Aakash Gupta@aakashgupta

Karpathy's autoresearch repo has 42K stars. Most PMs closed the tab thinking it wasn't for them. I pointed it at a Claude Code skill. 41% to 92% in 4 rounds while I slept. 6 use cases, 10 eval templates, and a downloadable toolkit. 🔗 news.aakashg.com/p/autoresearch…

10 replies · 24 reposts · 247 likes · 53.4K views
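The binary-criteria scoring described above is easy to sketch. The criterion names, the digit regex, and the action-verb list below are hypothetical examples standing in for the "3-6 binary questions", not Aakash's actual checks:

```python
import re

# Hypothetical yes/no criteria for landing-page copy. Each check answers
# exactly one binary question about a single generated output.
CRITERIA = {
    "headline_has_number": lambda out: bool(re.search(r"\d", out["headline"])),
    "cta_has_action_verb": lambda out: out["cta"].split()[0].lower()
    in {"get", "start", "try", "build", "join"},
    "under_word_limit": lambda out: len(out["body"].split()) <= 60,
}


def score(outputs: list[dict]) -> dict[str, float]:
    """Pass rate per criterion, plus an overall average, across N outputs."""
    per_criterion = {
        name: sum(check(o) for o in outputs) / len(outputs)
        for name, check in CRITERIA.items()
    }
    per_criterion["overall"] = sum(per_criterion.values()) / len(CRITERIA)
    return per_criterion
```

Because each check is deterministic, an optimization loop can regenerate 30 outputs after every prompt change, rescore, and keep or revert the change automatically, which is the overnight cycle the tweet describes.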
Mckay Wrigley @mckaywrigley ·
looking for a handful of people to test something new... i've been using it for a few months and am prepping to share. if you're a fan of claude cowork, openclaw, manus, perplexity computer, etc then you're a perfect fit. this will self destruct in 4hrs - please dm or reply.
Mckay Wrigley@mckaywrigley

you’re like 6 prompts away from infinitely customizable personal agi. anthropic gave you a world class agentic harness for free. use it!!!

1K replies · 15 reposts · 771 likes · 157.2K views
thinkinsysdev @ThinkInSysDev ·
If you can automate the design sprint from Google using HITL, you will be able to get to the true problems faster. Building the right thing has become more important than building the thing right, because code is a commodity. I can throw away a prototype/working software and recreate it quickly. But the change management for my users would be really hard with that approach.
0 replies · 0 reposts · 0 likes · 48 views
Chamath Palihapitiya @chamath ·
I think the concept of building a Software Factory is now a commonplace expectation. Yay. The winner still isn’t clear but whoever does the best job reimagining the software development lifecycle in a world of agents, AI, expert knowledge, tribal knowledge and business expectations can build a really good and useful product for the world. It’s early days but I think 8090’s Software Factory is on the right track.
147 replies · 27 reposts · 591 likes · 91.9K views
Andrej Karpathy @karpathy ·
I'm not very happy with the code quality and I think agents bloat abstractions, have poor code aesthetics, are very prone to copy pasting code blocks and it's a mess, but at this point I stopped fighting it too hard and just moved on.

The agents do not listen to my instructions in the AGENTS.md files. E.g. just as one example, no matter how many times I say something like: "Every line of code should do exactly one thing and use intermediate variables as a form of documentation" They will still "multitask" and create complex constructs where one line of code calls 2 functions and then indexes an array with the result. I think in principle I could use hooks or slash commands to clean this up but at some point just a shrug is easier.

Yes I think LLM as a judge for soft rewards is in principle and long term slightly problematic (due to goodharting concerns), but in practice and for now I don't think we've picked the low hanging fruit yet here.
252 replies · 332 reposts · 4.3K likes · 814.5K views
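The instruction Karpathy quotes ("every line of code should do exactly one thing and use intermediate variables as a form of documentation") can be illustrated with a made-up function pair. The first version is the "multitasking" construct he describes agents emitting; the second is the instructed style. Both functions and their names are invented for this example:

```python
def pick_best_dense(scores, items):
    # Agent-style "multitasking" line: one expression that calls two
    # functions (max and len via range) and then indexes an array with
    # the result.
    return items[max(range(len(scores)), key=lambda i: scores[i])]


def pick_best_documented(scores, items):
    # Instructed style: each line does exactly one thing, and the
    # intermediate variable names document each step.
    indices = range(len(scores))
    best_index = max(indices, key=lambda i: scores[i])
    best_item = items[best_index]
    return best_item
```

The two versions are behaviorally identical; the complaint in the tweet is purely about readability, which is also why it is hard to enforce with anything short of a hook or a linter.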
Andrej Karpathy @karpathy ·
Thank you Sarah, my pleasure to come on the pod! And happy to do some more Q&A in the replies.
sarah guo@saranormous

Caught up with @karpathy for a new @NoPriorsPod: on the phase shift in engineering, AI psychosis, claws, AutoResearch, the opportunity for a SETI-at-Home like movement in AI, the model landscape, and second order effects

02:55 - What Capability Limits Remain?
06:15 - What Mastery of Coding Agents Looks Like
11:16 - Second Order Effects of Coding Agents
15:51 - Why AutoResearch
22:45 - Relevant Skills in the AI Era
28:25 - Model Speciation
32:30 - Collaboration Surfaces for Humans and AI
37:28 - Analysis of Jobs Market Data
48:25 - Open vs. Closed Source Models
53:51 - Autonomous Robotics and Atoms
1:00:59 - MicroGPT and Agentic Education
1:05:40 - End Thoughts

318 replies · 389 reposts · 5.4K likes · 1M views
thinkinsysdev @ThinkInSysDev ·
Playing with @claudeai channels - how do people get past all the allow prompts? Ideally I would get the allow prompt on Telegram so I can approve there - @BrianRoemmele @bcherny
0 replies · 0 reposts · 0 likes · 17 views
thinkinsysdev retweeted
Elon Musk @elonmusk ·
[image]
8K replies · 30.7K reposts · 286.3K likes · 28.3M views
thinkinsysdev @ThinkInSysDev ·
@andrewchen Someone has to create scoring logic for the key areas to pay attention to - normal UI with low data requirements and styling is not too interesting, but backend logic or heavy data use should be scored accordingly
0 replies · 0 reposts · 0 likes · 32 views
andrew chen @andrewchen ·
One question I've been asking founders is: do you try to review all the code that the LLMs write or do you just accept it? I think it's about 50-50 right now but the momentum is towards just accepting the AI-generated code and I think that number will eventually go to 100% This is one of the most telling indications of how AI-native a team is. It's hard to get super high throughput if you are reviewing every line Poll: what do you do?
260 replies · 11 reposts · 289 likes · 108.9K views
thinkinsysdev retweeted
Om Patel @om_patel5 ·
stop spending money on Claude Code. Chipotle's support bot is free:
[image]
1.2K replies · 10.2K reposts · 159.8K likes · 7.9M views
thinkinsysdev retweeted
matt rothenberg @mattrothenberg ·
just picked up this bad boy. can't wait to write some software with it
[image]
223 replies · 879 reposts · 15K likes · 601.5K views
thinkinsysdev retweeted
Jason Fried @jasonfried ·
The last car we bought was a @Tesla Model Y. Painless purchase process. No salespeople, no showroom, no upsells, no games, no haggling, no pressure. Just a personal choice on my own time, and a simple few-minute process handled entirely via a clear and straightforward app.

The next car we're buying is from another brand. And holy hell, it feels like I'm going back in time. Salespeople, back-and-forth charades, pricing games, "when can you come in?" before the deal is finalized tactics, etc. And I'm still doing it all via email so I don't have to deal with the showroom antics. I've modernized the process as much as I can from my side, and yet it's the same old same old.

They don't even feel like the same thing. In one case I'm buying a car with all the baggage that comes with buying a car. In the other case I'm buying a Tesla with none of the baggage of buying a car.

This experience could make me lament this other brand, but what it really does is make me appreciate and respect the lengths to which Tesla has fully reconfigured the car buying experience. It's become effortless, like buying any other product. As it should be. A car is just another product. Bravo.
236 replies · 129 reposts · 2.9K likes · 754K views
thinkinsysdev @ThinkInSysDev ·
@james406 This would be real if you had installed openclaw in January and called it your 2-month-old daughter
0 replies · 0 reposts · 2 likes · 480 views
james hawkins @james406 ·
i just woke up my daughter (2yo) to tell her i'd just discovered a new agentic AI framework that will 10x my productivity rubbing her eyes, she said, “dad, you haven't shipped a single meaningful feature that supports our KPIs for FY26. i'm struggling to believe a new framework you haven't tested will deliver meaningful shareholder value” hugging her, i started crying. they grow up so fast.
156 replies · 595 reposts · 9.7K likes · 428K views
Shanaka Anslem Perera ⚡ @shanaka86 ·
JUST IN: The IRGC just changed the economics of this war with a single sentence. Brigadier General Majid Mousavi, commander of the IRGC Aerospace Force, announced after Wave 33 of Operation True Promise 4: “From now on, no missile with a warhead weighing less than one ton will be launched.”

Wave 33 fired more than ten Kheibar Shekan solid-fuel ballistic missiles at Tel Aviv and the US Fifth Fleet base in Bahrain. The operation was codenamed “Labbayk ya Khamenei,” At Your Service O Khamenei, the first military operation explicitly dedicated to the new Supreme Leader who has not been seen, has not spoken, and may not be conscious.

The Kheibar Shekan is Iran’s premier medium-range ballistic missile. Solid fuel. Road-mobile. Launch-ready in under 30 minutes from a truck. 1,450-kilometre range. Satellite-aided guidance with a manoeuvrable re-entry vehicle that executes terminal zigzag evasion at speeds the IRGC claims reach Mach 8 to 10. Standard warhead: 450 to 600 kilograms. Today’s warhead: one metric ton.

The shift from 500-kilogram to 1,000-kilogram warheads doubles the blast radius and destructive effect per missile. It also doubles the interceptor problem. A Patriot PAC-3 interceptor costs $4 million. A THAAD interceptor costs $12.7 million. An Arrow-3 costs approximately $3.5 million. These systems were designed to defeat incoming warheads where the cost of failure is measured in the target’s value. When the warhead weighs one ton, the cost of failure doubles because the destructive radius on impact doubles. Every miss becomes twice as catastrophic. The defender must now commit more interceptors per incoming missile to achieve the same probability of neutralisation. The interceptor-to-threat ratio, already strained at 190 to 1 on the drone front, now deteriorates on the ballistic missile front as well.

The footage released alongside the announcement showed Khorramshahr-4 missiles, liquid-fuelled, 2,000 to 3,000-kilometre range, carrying 1,500 to 1,800-kilogram warheads with thruster-powered manoeuvrable re-entry vehicles. These are the heaviest conventional warheads in Iran’s arsenal, launched alongside the newly escalated Kheibar Shekans. The IRGC is no longer choosing between volume and weight. It is deploying both simultaneously.

Previous True Promise waves faced 70 to 90% interception rates. Those rates were achieved against 450 to 600-kilogram warheads. The question Wave 33 forces is whether the same interception architecture holds when every incoming warhead is twice the mass, twice the destructive radius, and carries a manoeuvrable re-entry vehicle executing evasive terminal flight.

The codename tells you everything the IRGC wants the world to hear. “Labbayk ya Khamenei” is not a military designation. It is a pledge of allegiance from a military that has replaced its Supreme Leader with a doctrine, launched 33 waves without a single confirmed order from the man it serves, and now dedicates its heaviest escalation to a leader who exists only as a portrait and a title. The IRGC does not need Mojtaba Khamenei to be alive. It needs him to be invocable. Wave 33 proves the doctrine does not serve the leader. The leader serves the doctrine.

One ton per warhead. Thirty-three waves. Zero words from the Supreme Leader. The missiles speak for the man who cannot.

Full analysis! open.substack.com/pub/shanakaans…
[image]
118 replies · 352 reposts · 1.4K likes · 334.2K views
thinkinsysdev @ThinkInSysDev ·
The FOMO is real. I have doom scrolled an hour and still feel like I have missed something! Help!
0 replies · 0 reposts · 0 likes · 10 views
thinkinsysdev retweeted
Amrith @amrith ·
Been homesick lately so I vibe coded this over the weekend. Meet Dhuni – a 24/7 radio that plays Indian classical & instrumental music 🎶 Each station has its own season, raga, and mood.
[image]
96 replies · 138 reposts · 1.6K likes · 126.2K views
thinkinsysdev retweeted
Marc Andreessen 🇺🇸
My information consumption is now 1/4 X, 1/4 podcast interviews of the smartest practitioners, 1/4 talking to the leading AI models, and 1/4 reading old books. The opportunity cost of anything else is far too high, and rising daily.
1.4K replies · 3.9K reposts · 35.1K likes · 34.6M views