Jason Green

2.8K posts

Jason Green

@jgreen_us

Building unified memory for AI agents at https://t.co/R4uPJL5kw2 and automation at https://t.co/skla9S3Bgd.

Pittsburgh, PA Katılım Mart 2011

1.1K Takip Edilen380 Takipçiler

Jason Green@jgreen_us·12h

@AlexFinn @farzyness Disagree. I've been using OpenAI gpt-5.3 codex in OpenClaw since February, and now 5.4 and it's been great. I was firmly in camp Claude last year, but the new gpt models have been much better than opus IMHO. Plus I haven't had to worry about getting banned.

English

Alex Finn@AlexFinn·14h

@farzyness We desperately need OpenAI to compete on the next wave of models. Opus currently is the only remotely usable model for Openclaw

English

125

5.5K

Farzad 🇺🇸 🇮🇷@farzyness·14h

I tried really hard to get GPT 5.4 to work with OpenClaw but it’s just not there. Opus 4.6 is just so much better. I wish there was better competition at the frontier. IMO it’s Anthropic, then everyone else at the moment. Grok is better at research and fact checking. Gemini and grok are both far better at image and video, mainly because Anthropic doesn’t do it. GPT 5.4 is very good for an overall good model for a good price that’s a generalist. But when it comes to specialization, OpenAI doesn’t have anything IMO. I think this is going to bite them really hard. However once players start leveraging their new compute clusters I think it’ll flip very quickly. Anthropic has made some pretty big errors as it pertains to securing compute in the long term it seems. If I were a betting man, I think Anthropic secure its lead by middle of this year, but then Grok and Gemini start making a huge comeback for 1st. Only path for OpenAI long term IMO is to ultra-specialize around openclaw and becoming the best bang-for-your-buck brain for agentic tools and harnesses.

Yashu Sharma 🍊@heyitsyashu

Okay I lied the magic of Openclaw without Opus is just not there. Someone help me I’m struggling lol

English

203

31.2K

Jason Green@jgreen_us·14h

@iffishX @BetterScotus @physicsgeek Sorry if I butchered the framing. Here's the original article and Marilyn's response.

English

iffish@iffishX·15h

@jgreen_us @BetterScotus @physicsgeek You're not choosing 'sides' initially but 'side-pairs' and the maths does start after the initial choice, as the article shows. *I mean the framing 2 tweets above

English

Physics Geek@physicsgeek·1d

The discussion about the Monty Hall problem yesterday reminded me of another time when Marilyn caught grief from a lot of people. She said that for families with only two children, the odds of both children being the same sex was 2/3. People yelled at her, called her stupid, etc., so Marilyn asked her large readership to respond if they were either one of a two sibling family or had only two children themselves, reporting the sexes of the pairs. Many thousands of responses latter and the numbers were roughly 65%-66% (I can't remember exactly because it's been a long time). Some people were stunned by the numbers, still thinking that it should be a 50-50 split. It should not and here's why: Thing of the possible combinations of two children: 1) Two boys 2) Two girls 3) Boy and girl* 4) Girl and a boy* You will note that groups 3 and 4 are the same combination: two children of the opposite sex. So there are really only three combinations: 1)Two boys 2) Two girls 3) One boy and one girl Two out of the three possible combinations are same sex siblings.

English

251

258K

Jason Green@jgreen_us·15h

@iffishX @BetterScotus @physicsgeek Further explanation. *Note: She chose the golden (unburned) pancakes in her original question, but the results remain the same. An Explanation for Those Puzzling Pancakes - Parade share.google/kBnHRWXHUcT7Wc…

English

iffish@iffishX·15h

@jgreen_us @BetterScotus @physicsgeek That's not counterintuitive, it's straightforwardly wrong. There is only 1 out of the 3 pancakes that meets that condition. How are you managing to choose that one 2 out of 3 times? It is not possible to choose 'sides' randomly in this example because they are pair bound.

English

Jason Green@jgreen_us·15h

@iffishX @BetterScotus @physicsgeek That's the counterintuitive part of the story. 2/3 of the time you'll choose a brown side that is brown on the other side. The math starts at the initial choice, not after you've chosen.

English

iffish@iffishX·15h

@BetterScotus @jgreen_us @physicsgeek But you're not randomly choosing a side. Because the sides are bound together in pairs it eliminates that form of randomness.

English

Jason Green@jgreen_us·1d

@thekitze I liked it, but at this point I forget most of it. Hopefully season 2 comes out soon.

English

kitze 🛠️ tinkerer.club@thekitze·1d

2 episodes into three body problem and eehhhhhh idk man should we continue?

English

13.5K

Jason Green@jgreen_us·1d

The 3 pancakes problem was just a different Marilyn Vos Savant problem that I was mentioning. However, it actually is the same type of scenario. The odds are 2/3. There are 6 total *sides* that you might look at first. If the side is golden, you throw the pancake back and pick again. If you pick one of the 3 brown sides, 2 out of 3 times the other side will also be brown.

English

105

Andrew the Millwright@JdubAndrew·1d

@jgreen_us @physicsgeek Ok. Anyone has their first kid. There is virtually a 100% chance that kid will be either a boy or a girl…now the couple has their second kid, the odds are 50/50 that the gender of the second child will either match or be different than the first child. It isn’t three pancakes.

English

111

Jason Green@jgreen_us·1d

@jtbooth1021 @physicsgeek You might like this other one from her book. x.com/i/status/20412…

Jason Green@jgreen_us

@BrianRoemmele Here's another one that I always liked. You drive around a 1 mile track at 30mph. How fast would you have to drive for the second lap to average 60mph?

English

288

JT Booth@jtbooth1021·1d

@jgreen_us @physicsgeek Say in stead that you picked a face from the six faces and it was brown. Of the three brown faces, two are opposite another brown face and one is opposite a gold face. So, 2/3. Fun one.

English

278

Jason Green@jgreen_us·1d

@JdubAndrew @physicsgeek This is another thought problem from Marilyn Vos Savant's book/articles. Definitely 3 pancakes.

English

338

Andrew the Millwright@JdubAndrew·1d

@jgreen_us @physicsgeek No No No….There are TWO pancakes in a top hat. One golden and one brown. You blindly choose one and then put it back in the hat and then you choose again. That’s how it works. It’s 50% all day long.

English

379

Jason Green@jgreen_us·2d

@BrianRoemmele Here's another one that I always liked. You drive around a 1 mile track at 30mph. How fast would you have to drive for the second lap to average 60mph?

English

622

Brian Roemmele@BrianRoemmele·3d

In 1990 I wrote a letter to Marilyn vos Savant, Parade Magazine in support of her proof on the Monty Hall Problem. I ran an AI (expert system) test on it and she was right and just about the entire academic community was wrong. They could not accept it. Now they do.

English

129

283

2.5K

208K

Jason Green@jgreen_us·2d

@vxunderground This list is BS. I don't care that much about keyboards.

English

vx-underground@vxunderground·2d

I'm tired of people stereotyping us computer nerds. It is PREJUDICE. Here are some stereotypes non-nerds push on us. They're all FALSE. According to non-nerds, us nerds do the following: - Excessive caffeine or nicotine intake - Unusual or unhealthy sleep schedule, specifically around 3am and 5am - Apparently have tons of tabs open, or something, in terminal or web browser - Desk messy, covered in cables - Hardware nerds apparently do "experiments" just to see if something works - Notes on paper or whiteboard look like serial killer manifesto - Web cam taped, mic disabled, because of "paranoia" - Strong distrust in tech companies, especially social media - Nerd so intense forget to eat or shower - Spend 8 hours debugging instead of reading something which would take 20 minutes because ??? - Apparently we "don't know an answer" but know how to find it? - Some nerds become irrationally angry about GUIs? - Weird obsession with mechanical keyboards I'm so tired of these stereotypes. Literally none of these are true.

English

208

969

48.4K

Jason Green@jgreen_us·2d

The solution is to reduce the limits, not ban the tools. Let supply and demand work itself out. If you want to use agentic tools, you'll hit the limit faster. 🤷 Subscription-subsidized usage costs the same to the inference provider regardless of the harness.

Yuchen Jin@Yuchenj_UW

I’m pretty sure the $20/$200 subscription pricing was vibe-coded by OpenAI, then copied by Anthropic. That pricing works for chatbots, not agents. A 24/7 agent can burn through orders of magnitude more tokens than a user chatting with a chatbot. Now they’re stuck. Neither Anthropic nor OpenAI wants to be the first to change pricing and risk user churn, so the options are: keep subsidizing, get more GPUs, tighter rate limits, and enforce rules like limiting 3rd-party apps. I wouldn’t be surprised if intelligence gets more expensive, not cheaper.

English

Jason Green@jgreen_us·2d

@dharmesh @karpathy I built this at allbase.ai , including the MCP, API, CLI bit as well as treating each agent as its own user. Permissions and cross account sharing is a bigger component than most assume.

English

135

dharmesh@dharmesh·3d

Here's what I'm currently pondering: This idea, but implemented totally in the cloud for normies. One could imagine building a virtual file system on top of the cloud-hosted data (so it looked to the LLM like a navigable directory tree of files). That could be done with a super simple SKILL implementing the base file system primitives. Next step would be to make it multi-player so businesses/teams could use it. Content is default private (each user has their own directory). Any individual directory/file could be tagged/shared to a team, to the company or to the world (public). Make it available via API, MCP, CLI etc. so you could unless something like OpenClaw against it if you wanted. I even have the domain for it: secondbrain .com What do folks think?

English

9.5K

Andrej Karpathy@karpathy·6d

LLM Knowledge Bases Something I'm finding very useful recently: using LLMs to build personal knowledge bases for various topics of research interest. In this way, a large fraction of my recent token throughput is going less into manipulating code, and more into manipulating knowledge (stored as markdown and images). The latest LLMs are quite good at it. So: Data ingest: I index source documents (articles, papers, repos, datasets, images, etc.) into a raw/ directory, then I use an LLM to incrementally "compile" a wiki, which is just a collection of .md files in a directory structure. The wiki includes summaries of all the data in raw/, backlinks, and then it categorizes data into concepts, writes articles for them, and links them all. To convert web articles into .md files I like to use the Obsidian Web Clipper extension, and then I also use a hotkey to download all the related images to local so that my LLM can easily reference them. IDE: I use Obsidian as the IDE "frontend" where I can view the raw data, the the compiled wiki, and the derived visualizations. Important to note that the LLM writes and maintains all of the data of the wiki, I rarely touch it directly. I've played with a few Obsidian plugins to render and view data in other ways (e.g. Marp for slides). Q&A: Where things get interesting is that once your wiki is big enough (e.g. mine on some recent research is ~100 articles and ~400K words), you can ask your LLM agent all kinds of complex questions against the wiki, and it will go off, research the answers, etc. I thought I had to reach for fancy RAG, but the LLM has been pretty good about auto-maintaining index files and brief summaries of all the documents and it reads all the important related data fairly easily at this ~small scale. Output: Instead of getting answers in text/terminal, I like to have it render markdown files for me, or slide shows (Marp format), or matplotlib images, all of which I then view again in Obsidian. You can imagine many other visual output formats depending on the query. Often, I end up "filing" the outputs back into the wiki to enhance it for further queries. So my own explorations and queries always "add up" in the knowledge base. Linting: I've run some LLM "health checks" over the wiki to e.g. find inconsistent data, impute missing data (with web searchers), find interesting connections for new article candidates, etc., to incrementally clean up the wiki and enhance its overall data integrity. The LLMs are quite good at suggesting further questions to ask and look into. Extra tools: I find myself developing additional tools to process the data, e.g. I vibe coded a small and naive search engine over the wiki, which I both use directly (in a web ui), but more often I want to hand it off to an LLM via CLI as a tool for larger queries. Further explorations: As the repo grows, the natural desire is to also think about synthetic data generation + finetuning to have your LLM "know" the data in its weights instead of just context windows. TLDR: raw data from a given number of sources is collected, then compiled by an LLM into a .md wiki, then operated on by various CLIs by the LLM to do Q&A and to incrementally enhance the wiki, and all of it viewable in Obsidian. You rarely ever write or edit the wiki manually, it's the domain of the LLM. I think there is room here for an incredible new product instead of a hacky collection of scripts.

English

2.6K

6.3K

53.8K

18.7M

Jason Green@jgreen_us·2d

I was at the DMV recently and overheard a woman getting her license renewed... DMV: Are you a veteran? Woman: Confused. DMV: You have the veteran indicator on your license. W: I don't know what that means. DMV: Are you a veteran? Did you serve in the military? W: confused... DMV: You know it's illegal to check that box unless you've served in the military? W: confused... DMV: Here's your new license.

English

829

Colin Dunlap@colin_dunlap·2d

The real question no one wants to talk about is this: How did so many illegal aliens get a CDL to begin with in Pennsylvania? All the while honest Pennsylvanians had to bring like 106 forms of ID and get meticulously checked up and down to get a REAL ID? cbsnews.com/pittsburgh/new…

English

131

325

1.4K

34.6K

Jason Green@jgreen_us·3d

@chamath I built albase.ai for exactly this scenario. Centralized memory for all of my agents. Let me know if you'd like a walkthrough or have any feedback.

English

161

Chamath Palihapitiya@chamath·3d

This may be a dumb question but I’ll ask it here anyways: I can’t find a good way for my various AI chats to automatically sync its conversation history into a structured knowledge base. So that as I update various chats from time to time and refine context, my knowledge base automatically grows with this new info.

English

1.1K

2.4K

792.8K

Jason Green@jgreen_us·5d

@colin_dunlap Do we have any information about the actual people behind this? It could literally be a trafficking operation. A thing like this needs complete transparency to work, not anonymity.

English