David
@MeaseBeee
788 posts
Joined January 2009
973 Following · 127 Followers
David
David@MeaseBeee·
@HistoryGPT @emollick Gemini doesn’t even work as well as Claude models in Google’s own Antigravity IDE. Or didn’t a month ago when I last tried.
0
0
0
46
Generative History
Generative History@HistoryGPT·
The problem, I think, is that gemini-3.1-pro-preview (the underlying model) is really quite bad at tool calling (at least that’s my experience via the API). I’ve tried it in a number of agentic workflows and it just doesn’t work as well as the Claude or OpenAI models. It gets stuck in weird loops, thinking traces get mixed into tool calls and outputs, etc. Flash is much better at the agentic stuff, but it’s not as capable. That said, it remains far superior to Claude or OpenAI models for specific technical tasks like data extraction from large contexts, transcription, etc…and it’s cheaper.
1
0
13
774
Ethan Mollick
Ethan Mollick@emollick·
The continuing gap between the capabilities of Gemini Pro 3.1 (very good model) and the capabilities of the Gemini app/website is odd. The model can do what Claude/GPT can do, but there is a minimal harness for tools (file creation, research etc), no auditable CoT/actions, manual canvas, etc. The reason this is odd is that Google is trusted by enterprises & has the compute to burn, so a good harness would solve so many of Gemini’s gaps and make it an easier sell to companies. The model can make Office documents, for example, but the harness doesn’t allow it. It could also decide when to use other Google tools (and Google has a lot of very good AI tools) and apply them, taking advantage of the ecosystem, but it doesn’t consistently. I assume something will be coming out here eventually, but the gap with Claude and ChatGPT has only been growing.
63
53
836
51.9K
Tim Rudik
Tim Rudik@Timrdk·
@anothercohen The 3x price jump is the moment most teams realize they've been paying for presence management, not actual communication. We switched to async-first two years ago and the main result was fewer messages, not more.
1
0
1
797
Alex Cohen
Alex Cohen@anothercohen·
I'm tempted to finally churn off Slack. We're paying ~$6k/year for 40 people and they just quoted me $21k/year for the business version that includes a BAA (and all the shitty AI features). Incredibly overrated software
522
37
3.8K
500.7K
David
David@MeaseBeee·
@emollick Hasn’t the issue always been the short time to depreciation and obsolescence of the stuff *in* the data centers? Data centers will get used. But how quickly does the compute inside have to be replaced to keep relevant and valuable?
0
0
2
301
Ethan Mollick
Ethan Mollick@emollick·
Six months ago, there was a lot of focus on the idea that there would be a massive glut of unused computing power which would cause a recession as AI use plateaued. The "compute bubble" belief was absolutely everywhere. The degree to which this was wrong deserves some notice
98
194
2K
167.2K
David
David@MeaseBeee·
@MrMaxBuilds @poetengineer__ I’m not convinced keeping the source files untouched is necessary and not just an architectural choice. Putting generated info/text in front matter or clearly delineated sections in the source doc leaves the source info effectively untouched and has its advantages.
0
0
0
43
Max
Max@MrMaxBuilds·
@poetengineer__ Tell me more about /raw-is-sacred. Never heard this but sounds very correct. I’ve been keeping human-only divs inside my md files but this sounds like it could be better
1
0
2
742
Kat ⊷ the Poet Engineer
Kat ⊷ the Poet Engineer@poetengineer__·
one direction from this that excites me: a learning base instead of a storage one: not for what you already know, but for what you don't. made one for deep reading of plato's timaeus. 2 things i carried over: non-rag, indexed fs, and /raw-is-sacred to separate sources from generated content. a few features i find genuinely helpful:
Andrej Karpathy@karpathy

LLM Knowledge Bases

Something I'm finding very useful recently: using LLMs to build personal knowledge bases for various topics of research interest. In this way, a large fraction of my recent token throughput is going less into manipulating code, and more into manipulating knowledge (stored as markdown and images). The latest LLMs are quite good at it. So:

Data ingest: I index source documents (articles, papers, repos, datasets, images, etc.) into a raw/ directory, then I use an LLM to incrementally "compile" a wiki, which is just a collection of .md files in a directory structure. The wiki includes summaries of all the data in raw/, backlinks, and then it categorizes data into concepts, writes articles for them, and links them all. To convert web articles into .md files I like to use the Obsidian Web Clipper extension, and then I also use a hotkey to download all the related images to local so that my LLM can easily reference them.

IDE: I use Obsidian as the IDE "frontend" where I can view the raw data, the compiled wiki, and the derived visualizations. Important to note that the LLM writes and maintains all of the data of the wiki, I rarely touch it directly. I've played with a few Obsidian plugins to render and view data in other ways (e.g. Marp for slides).

Q&A: Where things get interesting is that once your wiki is big enough (e.g. mine on some recent research is ~100 articles and ~400K words), you can ask your LLM agent all kinds of complex questions against the wiki, and it will go off, research the answers, etc. I thought I had to reach for fancy RAG, but the LLM has been pretty good about auto-maintaining index files and brief summaries of all the documents, and it reads all the important related data fairly easily at this ~small scale.

Output: Instead of getting answers in text/terminal, I like to have it render markdown files for me, or slide shows (Marp format), or matplotlib images, all of which I then view again in Obsidian. You can imagine many other visual output formats depending on the query. Often, I end up "filing" the outputs back into the wiki to enhance it for further queries. So my own explorations and queries always "add up" in the knowledge base.

Linting: I've run some LLM "health checks" over the wiki to e.g. find inconsistent data, impute missing data (with web searches), find interesting connections for new article candidates, etc., to incrementally clean up the wiki and enhance its overall data integrity. The LLMs are quite good at suggesting further questions to ask and look into.

Extra tools: I find myself developing additional tools to process the data, e.g. I vibe coded a small and naive search engine over the wiki, which I both use directly (in a web ui), but more often I want to hand it off to an LLM via CLI as a tool for larger queries.

Further explorations: As the repo grows, the natural desire is to also think about synthetic data generation + finetuning to have your LLM "know" the data in its weights instead of just context windows.

TLDR: raw data from a given number of sources is collected, then compiled by an LLM into a .md wiki, then operated on by various CLIs by the LLM to do Q&A and to incrementally enhance the wiki, and all of it viewable in Obsidian. You rarely ever write or edit the wiki manually, it's the domain of the LLM. I think there is room here for an incredible new product instead of a hacky collection of scripts.

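The ingest-and-compile loop described above can be sketched roughly as follows. This is a hypothetical outline, not Karpathy's actual scripts: the raw/ and wiki/ directory names follow his convention, and the LLM call is stubbed with a trivial first-line summarizer so the skeleton runs standalone.

```python
from pathlib import Path

def summarize(text: str) -> str:
    """Stand-in for the LLM call that would write a real summary."""
    first_line = text.strip().splitlines()[0] if text.strip() else ""
    return first_line[:120]

def compile_wiki(raw_dir: Path, wiki_dir: Path) -> Path:
    """Incrementally 'compile' raw/ sources into wiki/ summary pages,
    plus an index.md the agent can consult instead of per-query RAG."""
    wiki_dir.mkdir(parents=True, exist_ok=True)
    index_lines = ["# Index", ""]
    for src in sorted(raw_dir.glob("*.md")):
        page = wiki_dir / src.name
        if not page.exists():  # only compile sources not seen before
            page.write_text(f"# {src.stem}\n\n{summarize(src.read_text())}\n")
        index_lines.append(f"- [[{src.stem}]]")
    index = wiki_dir / "index.md"
    index.write_text("\n".join(index_lines) + "\n")
    return index
```

Re-running the function after dropping new files into raw/ only touches the new sources, which is the "incremental compile" behavior the post describes.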
38
201
2.2K
182.4K
David
David@MeaseBeee·
@DasNripanka @levelsio Given the capabilities (and limitations) even of these improved local models, I’d rather have the same space filled with a copy of Wikipedia and some selection of key e-books.
0
0
1
14
Dr. Nripanka Das
Dr. Nripanka Das@DasNripanka·
@levelsio the apocalypse angle is actually the one legitimate use case for on-device AI nobody talks about. connectivity goes out, cloud AI goes dark, your local model becomes the only thing still working. it's the only AI that works when everything else doesn't.
1
0
4
1.8K
@levelsio
@levelsio@levelsio·
Tried Gemma 4 ran locally on my iPhone today
I thought it'd be useful in case the apocalypse happens and I need to ask it for survival tips
Like how to make a fire 🔥
I guess I'll freeze to death instead 🫠
475
165
5.9K
612K
Sam D'Amico
Sam D'Amico@sdamico·
Been using the @PatrickKavanagh version of this for ~months
Nav Toor@heynavtoor

🚨 Andrej Karpathy thinks RAG is broken. He published the replacement 2 days ago. 5,000 stars in 48 hours.

It's called LLM Wiki. A pattern where your AI doesn't retrieve information from scratch every time. It builds and maintains a persistent, compounding knowledge base. Automatically. RAG re-discovers knowledge on every question. LLM Wiki compiles it once and keeps it current.

Here's the difference:
RAG: You ask a question. AI searches your documents. Finds fragments. Pieces them together. Forgets everything. Starts over next time.
LLM Wiki: You add a source. AI reads it, extracts key information, updates entity pages, revises topic summaries, flags contradictions, strengthens the synthesis. The knowledge compounds. Every source makes the wiki smarter. Permanently.

Here's how it works:
→ Drop a source into your raw collection. Article, paper, transcript, notes.
→ AI reads it, writes a summary, updates the index
→ Updates every relevant entity and concept page across the wiki
→ One source can touch 10 to 15 wiki pages simultaneously
→ Cross-references are built automatically
→ Contradictions between sources get flagged
→ Ask questions against the wiki. Good answers get filed back as new pages.
→ Your explorations compound in the knowledge base. Nothing disappears into chat history.

Here's the wildest part: Karpathy's use case examples:
→ Personal: track goals, health, psychology. File journal entries and articles. Build a structured picture of yourself over time.
→ Research: read papers for months. Build a comprehensive wiki with an evolving thesis.
→ Reading a book: build a fan wiki as you read. Characters, themes, plot threads. All cross-referenced.
→ Business: feed it Slack threads, meeting transcripts, customer calls. The wiki stays current because the AI does the maintenance nobody wants to do.

Think of it like this: Obsidian is the IDE. The LLM is the programmer. The wiki is the codebase. You never write the wiki yourself. You source, explore, and ask questions. The AI does all the grunt work.

NotebookLM, ChatGPT file uploads, and most RAG systems re-derive knowledge on every query. This compiles it once and builds on it forever.

5,000+ stars. 1,294 forks. Published by Andrej Karpathy. 2 days ago. 100% Open Source.

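The "one source touches many pages" step above can be illustrated with a toy sketch. This is not Karpathy's actual repo: the wiki is a plain dict and entity extraction is stubbed with a keyword match standing in for the LLM.

```python
def update_wiki(wiki: dict, source_name: str, text: str,
                known_entities: list) -> list:
    """Toy 'compile' step: file a new source, then append a backlink to
    every entity page it mentions, so knowledge compounds per source."""
    wiki.setdefault("sources", {})[source_name] = text
    touched = []
    for entity in known_entities:
        if entity.lower() in text.lower():  # stub for LLM entity extraction
            page = wiki.setdefault("entities", {}).setdefault(entity, [])
            page.append(f"mentioned in {source_name}")
            touched.append(entity)
    return touched
```

Each call mutates the shared wiki in place, which is the contrast with RAG the post is drawing: state accumulates across sources instead of being re-derived per query.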
3
2
64
11.8K
David
David@MeaseBeee·
@McReynoldsJoe @sdamico @PatrickKavanagh Yeah, it wouldn’t. At least not like Karpathy is setting it up… and he says this. I think he mentions a corpus of 100-ish papers. But it could be adapted for larger numbers. Still, when you get to book-lengths, and thousands of them, it’s tough
0
0
0
22
Joe McReynolds
Joe McReynolds@McReynoldsJoe·
@sdamico @PatrickKavanagh How well would it work as a reference librarian for a collection of 3,000 OCRed books? That's the system I wish existed that I've not yet found.
1
0
0
151
David
David@MeaseBeee·
@SkolShannon @Nerdy_Addict In the reply back on 2/7 the family doesn’t say they’ll pay for her alive, right? I read that reply as them saying we’ll pay for her body.
0
0
4
162
Shannon
Shannon@SkolShannon·
@Nerdy_Addict It would have to be a 3rd communication then. Guthries said they would pay ransom on 2/7 after receiving the 2nd letter on 2/6. They would not say they would pay the ransom if they received the apology note prior.
5
0
11
8.2K
🅽🅴🆁🅳🆈
🅽🅴🆁🅳🆈@Nerdy_Addict·
I now have two sources confirming that one of the letters sent to the media in the Nancy Guthrie case allegedly states the sender apologized, claiming they did not realize how serious her heart condition was and that she has “gone to be with God.” According to these sources, investigators believe the message came from the same individuals who previously demanded Bitcoin, though this latest letter reportedly made no demands and was framed solely as an apology. While this has been validated by two independent sources, it has not been publicly released or confirmed by the media.
115
162
1.6K
257.3K
HoC
HoC@HouseOfChains_·
@DexterXRP @JoePostingg Pity Ubiquiti is made in China, and banned under the new rules.
1
0
1
1.1K
Joe
Joe@JoePostingg·
If your router is more than a few years old you should buy a TP-Link Archer BE3600 right now. Good home routers basically won't exist in six months, you'll have to buy an enterprise device if you want something that actually works.
173
195
6.1K
1M
David
David@MeaseBeee·
@ddunderfelt @banteg Well, now I don’t know what we were supposed to make of that touchy/pokey video. Maybe it was to show a smaller MacBook?
0
0
0
10
Daniel Dunderfelt
Daniel Dunderfelt@ddunderfelt·
@MeaseBeee @banteg lol I didn’t connect that to touchscreen at all. But I get it. Maybe the low-cost MacBook has a touch screen 🤔 I haven’t really paid attention to it. MacBook Pros (which will get touch in the fall), Mac minis and Mac Studios are where my interest is focused.
1
0
0
227
Teddy
Teddy@maza1theo·
@MeaseBeee @brockman_david @itskyleconner No. I went from about $10 a day in my Ford Bronco to $3 a day or less in my Chevy Silverado EV. The savings are real; it’s more about whether it’s practical for your driving habits
1
0
0
41
Kyle Conner
Kyle Conner@itskyleconner·
Just the facts:
Took 1.5 hours to charge
Cost ~$75
468
32
5.7K
1.9M
David Brockman
David Brockman@brockman_david·
@itskyleconner Almost $0.50/kWh. That same charge would have been about $18.00 at my house. They have always been vocal that you see most of your EV benefits from charging at home. It’s still cheaper when I road trip my EVs, just not as big of a savings as charging at home.
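The two dollar figures imply a consistent picture: at ~$0.50/kWh a ~$75 session delivers about 150 kWh, which would cost about $18 at a ~$0.12/kWh home rate. The home rate is inferred here from the quoted numbers, not stated in the thread.

```python
session_cost = 75.0   # dollars, the quoted fast-charge session
road_rate = 0.50      # $/kWh at the public charger

kwh = session_cost / road_rate   # energy delivered: 150 kWh
home_rate = 18.0 / kwh           # implied home price: ~$0.12/kWh
```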
3
0
6
10.2K
Ethan Mollick
Ethan Mollick@emollick·
This paper is one of the first to test AI skills and the results seem to suggest that yes, they have high practical value. They use pretty mediocre skills (6.2/12 quality rating) harvested mostly from places like Github, and still get large boosts, especially outside software.
45
58
588
63.8K
David
David@MeaseBeee·
@ddunderfelt @banteg And yet there’s that promo video which sure seems like a touchscreen tease.
1
0
0
321
Daniel Dunderfelt
Daniel Dunderfelt@ddunderfelt·
@banteg If you’re referring to the touch screen MacBooks, those aren’t due until the end of the year.
1
0
0
9.9K
David
David@MeaseBeee·
@PatrickHeizer For a while there during Covid, used Siennas were more than new…if you could find the rare dealer selling new at MSRP.
0
0
1
64
Patrick Heizer
Patrick Heizer@PatrickHeizer·
When we were shopping for minivans last February, a used Toyota Sienna was ~$4k less than a new one. In disbelief, I asked the salesman why wouldn't we just buy a new one. He said that's what he'd do, but it was our call. We bought a brand new one. Had 3.1 miles on the odometer.
Jum@JesterJum

Why do people still buy new cars? That $50,000 car you just paid off cost you $63,000 in total payments. Plus, it is now only worth $20,000 so it's lost $43,000 in depreciation....in 5 years. Buy used. Come on people....stop throwing your money away.

244
73
6.1K
1.7M
David
David@MeaseBeee·
@dioscuri Gemini Pro 3 just yesterday whiffed on a question: “given the Claude Code usage described in this blog post, can Gemini CLI perform the same way?” And twice, it asserted there was no product like Claude Code from Google (but it could write something for me).
0
0
0
126
Henry Shevlin
Henry Shevlin@dioscuri·
This is a weird hallucination from ChatGPT 5.2. Basic knowledge about one of the most played games in history. Real throwback to 3.5 levels of inaccuracy.
84
24
2.1K
150.2K
David
David@MeaseBeee·
@SolutionsCay @ChrisPavese @Winterrose Isn’t the original post saying, though, that if an agent has access it will drink up that milkshake of data? And so then it can become a system of record
1
0
0
332
Jose
Jose@SolutionsCay·
@ChrisPavese @Winterrose 100% There's no way you can run any serious operation without a database and a system of record. You can layer all the AI you want. But if you can't answer questions about your business with select foo from bar you are not going to make it.
1
0
4
380
jovin
jovin@jovinxthomas·
by brittle i mean more on the side of practical fragility, not uselessness. auto-extracted KGs usually contain missing, duplicated, or false triples. and graph RAG relies on explicit edges, so small extraction errors or even bad linking can break multi-hop retrieval. i've seen papers that show how denoising and hybrid KG+vector methods help but these are still active research areas.
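The fragility point is easy to demonstrate: multi-hop retrieval over extracted triples fails closed when a single edge is missing. A minimal sketch (toy triple store with breadth-first hop search, not any particular graph-RAG library):

```python
from collections import deque

def reachable(triples, start, goal):
    """BFS over (head, relation, tail) triples: can a multi-hop
    query link start to goal through explicit edges?"""
    graph = {}
    for head, _rel, tail in triples:
        graph.setdefault(head, set()).add(tail)
    queue, seen = deque([start]), {start}
    while queue:
        node = queue.popleft()
        if node == goal:
            return True
        for nxt in graph.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return False
```

With both hops extracted, a two-hop question is answerable; drop either triple (the "small extraction error") and the same query returns nothing, with no graceful degradation.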
2
0
6
88