David Robertson

2.1K posts

@davidrobertson

Full-Stack Engineer | AI Integration Specialist | Building Smarter Solutions With Code | 🚀

Joined May 2016
354 Following · 287 Followers
David Robertson
David Robertson@davidrobertson·
I agree, and I believe you that it didn’t know what it was doing. When something is out of distribution, it’s always going to struggle. Your observation was valid. My point really was just that there are lots of ways to maximize the results from these tools. Having a document like what Pro regurgitated thrown into the context would likely narrow its focus and get you better next-token prediction. Trying to get it to write a decent kernel in Mojo with no references gives you the dumbest slop ever. Give it some references and a way to validate its results, and you can get some incredible results.
0
0
2
346
mike64_t
mike64_t@mike64_t·
It “understands” in language and when talking about it, obviously. However, how it acts is a different story. I’ve said in the past that these models like to delay computation as much as possible, collect everything as raw data, and avoid introducing state if possible. Without me specifying exactly how it should implement the procedure to find the variable, beyond that it should sample twice, one wouldn’t expect it to need an explanation that the *when* in time matters. If it thinks it can do two captures really fast at the end and that’s the same, it didn’t understand the point, no matter what it reads back to you in question mode. Talking about a thing and doing a thing are not necessarily the same.
3
0
36
2.6K
mike64_t
mike64_t@mike64_t·
gpt 5.5 apparently does not understand the point of cheat-engine-like variable discovery... and that you can't actually defer the scan at the instant of interest unless you dump the entire memory... Kind of scary that this thing that's doing all this work apparently doesn't *actually* understand the concept of variables changing in memory... Scary jagged intelligence
15
1
237
68.6K
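The variable-discovery procedure the thread is arguing about can be sketched in a few lines: capture memory once while the value is known, let it change, capture again, and keep only the addresses consistent with both captures. The timing of each capture is the whole point; defer both to the end and there is nothing to intersect. A minimal toy sketch (a dict of address -> value stands in for a real process snapshot; all addresses and values are invented):

```python
# Toy sketch of cheat-engine-style variable discovery (two captures).
# A dict of address -> value stands in for a real process snapshot;
# all addresses and values here are invented for illustration.

def snapshot(memory):
    """Capture the state of memory at this instant."""
    return dict(memory)

def narrow_candidates(before, after, old_value, new_value):
    """Keep addresses that held old_value in the first capture
    and new_value in the second."""
    return [addr for addr, val in before.items()
            if val == old_value and after.get(addr) == new_value]

# Toy process memory: the variable we want lives at 0x20.
memory = {0x10: 100, 0x20: 100, 0x30: 7}

snap1 = snapshot(memory)   # capture WHILE the value is 100
memory[0x20] = 75          # the value of interest changes...
memory[0x10] = 42          # ...and unrelated memory churns too
snap2 = snapshot(memory)   # capture AFTER the change

print(narrow_candidates(snap1, snap2, 100, 75))   # [32], i.e. 0x20
# Deferring both captures to the end would yield two identical
# snapshots, and there would be nothing to narrow by.
```

Doing both captures "really fast at the end", as the tweet describes, gives two identical snapshots, so the intersection never shrinks.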
David Robertson reposted
Justin Schroeder
Justin Schroeder@jpschroeder·
Well...I'm going to be in SF on Monday/Tuesday. Aaaand...I have something very very interesting to show. Anyone want to hang irl?
Justin Schroeder tweet media
5
1
31
1.4K
David Robertson
David Robertson@davidrobertson·
@gdb It infers things like Opus does and doesn’t adhere to the prompt as strictly as 5.4. It’s definitely lazier at times. Overall it’s decent, but kind of annoying at the same time.
0
0
0
78
David Robertson
David Robertson@davidrobertson·
Bluesky LOL. GitHub is trash: they locked me out of my account, no questions asked, with no easy way to get it back. It took them 10 days to reopen it, and it was their mistake. Then they double-charged me for my annual Copilot sub. The whole thing is a mess, plus it's obsolete or approaching obsolescence really fast. If people really want their IP controlled by one central operator, there are lots of other options, including self-hosting.
0
0
3
725
David Robertson
David Robertson@davidrobertson·
@shadcn Super annoying when a button doesn’t have cursor pointer. I know that’s not what it was intended for, but sometimes it just feels off when it’s not on.
0
0
0
275
David Robertson
David Robertson@davidrobertson·
How do you find 5.5? I find it faster but kind of annoying: it overcorrects on everything and forgets things. OpenAI is moving its models in the direction of Opus, trying to make them less autistic, while Anthropic is making theirs more autistic, so each model is getting a bit worse to use. Lololol.
0
0
0
21
Justin Schroeder
Justin Schroeder@jpschroeder·
Tech influencers today be like…
Justin Schroeder tweet media
6
0
25
1K
Justin Schroeder
Justin Schroeder@jpschroeder·
The justin-swe-bench results are in for gpt-5.4 vs Opus 4.7 vs Qwen 3.6
Justin Schroeder tweet media
18
2
402
32K
0xSero
0xSero@0xSero·
For everyone wondering about Opus regressions, this is pretty accurate. Almost all the issues I’ve seen people experience when self-hosting are related to inference infra, settings, or harnesses. There’s so much room for compounding errors; Nvidia is what they bench on and give to their insiders. youtu.be/KFisvc-AMII?is…
YouTube video
9
9
126
36.8K
David Robertson reposted
Elon Musk
Elon Musk@elonmusk·
You can access the 𝕏 API via @OpenClaw. We’re trying to make it affordable without giving away the shop. Hopefully, this can be useful & fun 💫
Robert Scoble@Scobleizer

Holy shit. Now everyone will be able to use their @OpenClaws and all the other agentic platforms to build apps on top of X.

Here's the secret: build lists. Lists are how you build apps. The pattern: build a list of your favorite football team, or whatever you are into. Then ask your AI agents "build an app showing me all the important news about my favorite football team." In minutes you'll have an app.

And that's just the beginning. Your agent can build a script about your favorite football team that you can take to places like Google's Notebook LM. Now you have a video, a podcast, a slide deck, a game, a mind map, all about your favorite football team based on real-time news.

You can do the same with something like @HeyGen: create an avatar of your favorite football player. Now you will have your favorite football player telling you everything that's happening on the team.

And I could go on for hours about how many things you can build and not even cover a fraction of them. This is huge. Thank you @elonmusk for making it possible to make millions of agentic apps affordably on top of X. Start building!

2.7K
5.9K
48K
38.9M
Justin Schroeder
Justin Schroeder@jpschroeder·
You think so? I think the big labs have different levels of distillation for each model. Right when a new model drops, they are serving almost exclusively the new model; meanwhile, they are distilling an 85% version of it to reduce inference costs, and maybe an even cheaper one than that. Then they use a router to try to optimize cost, and when they really need more compute they can flip a switch and get more at any time. I’m so darn convinced of this…it’s how I would do it. I don’t even mind that they do this; I just want to know, at any given time, what *actual* model I’m using.
1
0
0
26
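The tiered-serving setup speculated about above can be sketched in a few lines: serve the full model at launch, shift a growing share of traffic to a distilled tier, and keep a switch to reclaim compute under load. Purely hypothetical; the tier names, the 0.60 cost figure, and the routing rule are all invented for illustration, and no lab has confirmed this design:

```python
# Hypothetical sketch of a cost router over distillation tiers.
# Nothing here reflects any lab's confirmed serving architecture.
import random

TIER_COST = {
    "full":      1.00,   # the freshly launched model
    "distilled": 0.60,   # the cheaper "85% version"
}

def route(distilled_share, need_compute=False):
    """Pick which tier serves a request.

    distilled_share: fraction of traffic sent to the distilled tier.
    need_compute:    the "flip a switch" override under load.
    """
    if need_compute:
        return "distilled"
    return "distilled" if random.random() < distilled_share else "full"

print(route(distilled_share=0.0))                     # full (launch day)
print(route(distilled_share=1.0))                     # distilled (later)
print(route(distilled_share=0.0, need_compute=True))  # distilled (load spike)
```

The tweet's complaint maps directly onto this sketch: the caller never learns which branch `route` took, which is exactly the "what *actual* model am I using" question.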
Justin Schroeder
Justin Schroeder@jpschroeder·
I'm open to GUI over TUI agents, but I have yet to encounter a single benefit. GUIs also all sacrifice the programmability and portability of a TUI. GUI fanboys, set me straight
13
0
19
6K
David Robertson reposted
Ben Pouladian
Ben Pouladian@benitoz·
Watching the takes on Jensen / Dwarkesh. Credit first: these were the best questions Jensen's been asked in a long-form sit-down. Dwarkesh didn't lob softballs. He pressed on commoditization, ASIC economics, margin compression, customer concentration. Real questions.

But the consensus read that Jensen got "defensive" or didn't answer is missing what actually happened. A lot of his answers were operator clarity that only registers as evasion if you've never run the operation. "Without Anthropic, why would there be any TPU growth at all? It's 100% Anthropic. Without Anthropic, why would there be Trainium growth at all? It's 100% Anthropic." That's a CEO who has the customer concentration data on every competing silicon program and is mildly amused he has to say it out loud. The upstream supply chain answer, telling supplier CEOs how big the industry would be and having them stake capacity on his word, only reads as a flex if you've never had to underwrite a forecast to a supplier. It's how supply chains actually get built.

The China section is where the frame gap was widest. Dwarkesh treats compute like uranium: dangerous material to be controlled and withheld. Jensen treats compute like a platform: propagate it, win the developers, win the stack. Two completely different theories of how American tech leadership actually works. Jensen's frame: 50% of AI developers are in China. Concede that market and you concede the standard. Win all five layers of the stack (silicon, systems, networking, software, models) on CUDA, or watch an open-source ecosystem grow on a foreign tech stack. Nvidia isn't a phone or a car. Export controls calibrated for consumer hardware misread the actual game.

The gap in the interview wasn't curiosity or rigor. It was business framing. The questions kept circling "do you favor these customers" when the real mechanics are purchase orders, allocation, supply commitments, and the relationships that make any of it possible.

Jensen's TSMC partnership predates most of this conversation. Hardware is hard. The scars matter. The parts worth pulling forward (token-dollar economics, supply chain prefetch, Anthropic as the only ASIC customer, highest tokens-per-watt, win all five layers) are operator answers to theoretical questions. The interview is better than the discourse around it. Worth the full watch.
Dwarkesh Patel@dwarkesh_sp

The Jensen Huang episode. 0:00:00 – Is Nvidia’s biggest moat its grip on scarce supply chains? 0:16:25 – Will TPUs break Nvidia’s hold on AI compute? 0:41:06 – Why doesn’t Nvidia become a hyperscaler? 0:57:36 – Should we be selling AI chips to China? 1:35:06 – Why doesn’t Nvidia make multiple different chip architectures? Look up Dwarkesh Podcast on YouTube, Apple Podcasts, Spotify, etc. Enjoy!

37
34
296
69.7K
David Robertson
David Robertson@davidrobertson·
youtu.be/Hrbq66XqtCo?si… There’s a reason Jensen is one of the most talented, thoughtful CEOs in the world. Watch him dismantle @dwarkesh_sp’s weak arguments. Probably the most interesting and informative podcast I’ve seen in a long time. Completely worth the watch.
YouTube video
0
0
1
61
David Robertson reposted
Uncle Bob Martin
Uncle Bob Martin@unclebobmartin·
@thegeeknarrator I disagree. Code is slow for humans. The more we read or write it, the slower we go. To gain productivity from AI we need to disengage from code and put our energies into managing the structure, not the syntax, of the code.
19
12
232
14.5K
David Robertson reposted
David Robertson
David Robertson@davidrobertson·
No doubt, LLMs are obviously math-sensitive. I could see a couple of scenarios where you moved a workload to older hardware that's not as easy to optimize, or to newer hardware that isn't very optimized yet, and if tolerances are lower, that could easily degrade quality. You see this quite a bit when a new open-weights model comes out: the various inference providers will have drastically different benchmark numbers until they tune their systems. MoE is so hard to serve. The complexity difference between serving something dense like Llama 3 and, say, Kimi or GLM is incredible. I definitely agree that quality can seem, and be, degraded; my best guess is different infra changes, not model weights.
1
0
0
11
David Robertson reposted
Onur Solmaz
Onur Solmaz@onusoz·
You need to understand one fact about OpenClaw: people are biased and incentivized to spread disinformation about OpenClaw. That is because OpenClaw IS NOT PUMPING ANYONE’S BAGS, unlike most other projects. Literally every other for-profit agent product is incentivized to trash OpenClaw, BECAUSE OpenClaw is a neutral third party across the industry and geopolitical scene. They MAKE MONEY when OpenClaw loses.

OpenClaw does not worry about making money for some investors. Its founder @steipete is a successful exited founder. He is motivated by having fun and democratizing AI, literally. That is why he is suddenly so loved by everyone. He cares about PEOPLE, not MONEY.

“OpenClaw is bloated” -> Since the beginning of March, OpenClaw has been thinning its core and putting functionality in plugins behind a plugin SDK. Having numerous plugins to choose from does not mean bloat. This was already copied by others and is still a work in progress.

“OpenClaw is not secure” -> OpenClaw has the most eyeballs and immediately addresses any security advisories as soon as they come. It is the most secure agent, by sheer pressure.

“OpenClaw is bought by OpenAI” -> Then why is my bank account so empty bro??? All maintainers are literally unpaid and working DOUBLE beside their day jobs to ship features to you. Do you think VC money can buy that kind of commitment?

Once you understand these facts, you’ll like OpenClaw even more. Because OpenClaw is your AI, the People’s AI. And you can join us too. OpenClaw is the easiest-to-join project in AI right now. You just need to start using it and start making good contributions. If you are competent, you can become a maintainer and join the rest of the team making history!
146
174
1.6K
334.2K