David Robertson

2.1K posts

@davidrobertson

Full-Stack Engineer | AI Integration Specialist | Building Smarter Solutions With Code | 🚀

Joined May 2016
354 Following · 287 Followers
David Robertson
David Robertson@davidrobertson·
I agree, and I believe you that it didn’t know what it was doing. When something is out of distribution, it’s always going to struggle. Your observation was valid. My point really was just that there are lots of ways to maximize the results from these tools. Having a document like what Pro regurgitated thrown into the context would likely narrow its focus and get you better next-token prediction. Trying to get it to write a decent kernel in Mojo with no references gives you the dumbest slop ever. Give it some references and a way to validate its results, and you can get some incredible results.
0
0
2
346
mike64_t
mike64_t@mike64_t·
It “understands” in language and when talking about it, obviously. However, how it acts is a different story. I’ve said in the past that these models like to delay computation as much as possible, collect everything as raw data, and avoid introducing state if possible. Without me specifying exactly how it should implement the procedure to find the variable, beyond that it should sample twice, one wouldn’t expect it to need an explanation that the *when* in time matters. If it thinks it can do two captures really fast at the end and that’s the same, it didn’t understand the point, no matter what it reads back to you in question mode. Talking about a thing and doing a thing are not necessarily the same.
3
0
36
2.6K
mike64_t
mike64_t@mike64_t·
gpt 5.5 apparently does not understand the point of cheat-engine-like variable discovery... and that you can't actually defer the scan at the instant of interest unless you dump the entire memory... Kind of scary that this thing that's doing all this work apparently doesn't *actually* understand the concept of variables changing in memory... Scary jagged intelligence
15
1
237
68.6K
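The variable-discovery procedure the thread is arguing about can be sketched in a few lines: capture memory once while the value is known, let it change, capture again, and keep only the addresses consistent with both captures. The timing of each capture is the whole point; defer both to the end and there is nothing to intersect. A minimal toy sketch (a dict of address -> value stands in for a real process snapshot; all addresses and values are invented):

```python
# Toy sketch of cheat-engine-style variable discovery (two captures).
# A dict of address -> value stands in for a real process snapshot;
# all addresses and values here are invented for illustration.

def snapshot(memory):
    """Capture the state of memory at this instant."""
    return dict(memory)

def narrow_candidates(before, after, old_value, new_value):
    """Keep addresses that held old_value in the first capture
    and new_value in the second."""
    return [addr for addr, val in before.items()
            if val == old_value and after.get(addr) == new_value]

# Toy process memory: the variable we want lives at 0x20.
memory = {0x10: 100, 0x20: 100, 0x30: 7}

snap1 = snapshot(memory)   # capture WHILE the value is 100
memory[0x20] = 75          # the value of interest changes...
memory[0x10] = 42          # ...and unrelated memory churns too
snap2 = snapshot(memory)   # capture AFTER the change

print(narrow_candidates(snap1, snap2, 100, 75))   # [32], i.e. 0x20
# Deferring both captures to the end would yield two identical
# snapshots, and there would be nothing to narrow by.
```

Doing both captures "really fast at the end", as the tweet describes, gives two identical snapshots, so the intersection never shrinks.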
David Robertson reposted
Justin Schroeder
Justin Schroeder@jpschroeder·
Well...I'm going to be in SF on Monday/Tuesday. Aaaand...I have something very very interesting to show. Anyone want to hang irl?
Justin Schroeder tweet media
5
1
31
1.4K
David Robertson
David Robertson@davidrobertson·
@gdb It infers things like Opus does and doesn’t adhere to the prompt as strictly as 5.4. It’s definitely lazier at times. Overall it’s decent, but kind of annoying at the same time.
0
0
0
78
David Robertson
David Robertson@davidrobertson·
Bluesky LOL. GitHub is trash: they locked me out of my account, no questions asked, with no easy way to get it back. It took them 10 days to reopen it, and it was their mistake. Then they double-charged me for my annual Copilot sub. The whole thing is a mess, plus it's obsolete or approaching obsolescence really fast. If people really want their IP controlled by one central operator, there are lots of other options, including self-hosting.
0
0
3
725
David Robertson
David Robertson@davidrobertson·
@shadcn Super annoying when a button doesn’t have cursor pointer. I know that’s not what it was intended for, but sometimes it just feels off when it’s not on.
0
0
0
275
David Robertson
David Robertson@davidrobertson·
How do you find 5.5? I find it faster but kind of annoying: it overcorrects on everything and forgets things. OpenAI is moving its models in the direction of Opus, trying to make them less autistic, while Anthropic is making theirs more autistic, so each model is getting a bit worse to use. Lololol.
0
0
0
21
Justin Schroeder
Justin Schroeder@jpschroeder·
Tech influencers today be like…
Justin Schroeder tweet media
6
0
25
1K
Justin Schroeder
Justin Schroeder@jpschroeder·
The justin-swe-bench results are in for gpt-5.4 vs Opus 4.7 vs Qwen 3.6
Justin Schroeder tweet media
18
2
402
32K
0xSero
0xSero@0xSero·
For everyone wondering about Opus regressions, this is pretty accurate. Almost all the issues I’ve seen people experience when self-hosting are related to inference infra, settings, or harnesses. There’s so much room for compounding errors; Nvidia is what they bench on and give to their insiders. youtu.be/KFisvc-AMII?is…
YouTube video
9
9
126
36.8K
David Robertson reposted
Elon Musk
Elon Musk@elonmusk·
You can access the 𝕏 API via @OpenClaw. We’re trying to make it affordable without giving away the shop. Hopefully, this can be useful & fun 💫
Robert Scoble@Scobleizer

Holy shit. Now everyone will be able to use their @OpenClaws and all the other agentic platforms to build apps on top of X.

Here's the secret: build lists. Lists are how you build apps. The pattern: build a list of your favorite football team, or whatever you are into. Then ask your AI agents "build an app showing me all the important news about my favorite football team." In minutes you'll have an app.

And that's just the beginning. Your agent can build a script about your favorite football team that you can take to places like Google's Notebook LM. Now you have a video, a podcast, a slide deck, a game, a mind map, all about your favorite football team based on real-time news.

You can do the same with something like @HeyGen: create an avatar of your favorite football player. Now you will have your favorite football player telling you everything that's happening on the team.

And I could go on for hours about how many things you can build and not even cover a fraction of them. This is huge. Thank you @elonmusk for making it possible to make millions of agentic apps affordably on top of X. Start building!

2.7K
5.9K
48K
38.9M
Justin Schroeder
Justin Schroeder@jpschroeder·
You think so? I think the big labs have different levels of distillation for each model. Right when a new model drops, they are serving almost exclusively the new model; meanwhile, they are distilling an 85% version of it to reduce inference costs, and maybe an even cheaper one than that. Then they use a router to try to optimize cost, and when they really need more compute they can flip a switch and get more at any time. I’m so darn convinced of this…it’s how I would do it. I don’t even mind that they do this; I just want to know, at any given time, what *actual* model I’m using.
1
0
0
26
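The tiered-serving setup speculated about above can be sketched in a few lines: serve the full model at launch, shift a growing share of traffic to a distilled tier, and keep a switch to reclaim compute under load. Purely hypothetical; the tier names, the 0.60 cost figure, and the routing rule are all invented for illustration, and no lab has confirmed this design:

```python
# Hypothetical sketch of a cost router over distillation tiers.
# Nothing here reflects any lab's confirmed serving architecture.
import random

TIER_COST = {
    "full":      1.00,   # the freshly launched model
    "distilled": 0.60,   # the cheaper "85% version"
}

def route(distilled_share, need_compute=False):
    """Pick which tier serves a request.

    distilled_share: fraction of traffic sent to the distilled tier.
    need_compute:    the "flip a switch" override under load.
    """
    if need_compute:
        return "distilled"
    return "distilled" if random.random() < distilled_share else "full"

print(route(distilled_share=0.0))                     # full (launch day)
print(route(distilled_share=1.0))                     # distilled (later)
print(route(distilled_share=0.0, need_compute=True))  # distilled (load spike)
```

The tweet's complaint maps directly onto this sketch: the caller never learns which branch `route` took, which is exactly the "what *actual* model am I using" question.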
Justin Schroeder
Justin Schroeder@jpschroeder·
I'm open to GUI over TUI agents, but I have yet to encounter a single benefit. GUIs also all sacrifice the programmability and portability of a TUI. GUI fanboys, set me straight
13
0
19
6K
David Robertson reposted
Ben Pouladian
Ben Pouladian@benitoz·
Watching the takes on Jensen / Dwarkesh. Credit first: these were the best questions Jensen's been asked in a long-form sit-down. Dwarkesh didn't lob softballs. He pressed on commoditization, ASIC economics, margin compression, customer concentration. Real questions.

But the consensus read that Jensen got "defensive" or didn't answer is missing what actually happened. A lot of his answers were operator clarity that only registers as evasion if you've never run the operation. "Without Anthropic, why would there be any TPU growth at all? It's 100% Anthropic. Without Anthropic, why would there be Trainium growth at all? It's 100% Anthropic." That's a CEO who has the customer concentration data on every competing silicon program and is mildly amused he has to say it out loud. The upstream supply chain answer, telling supplier CEOs how big the industry would be and having them stake capacity on his word, only reads as a flex if you've never had to underwrite a forecast to a supplier. It's how supply chains actually get built.

The China section is where the frame gap was widest. Dwarkesh treats compute like uranium: dangerous material to be controlled and withheld. Jensen treats compute like a platform: propagate it, win the developers, win the stack. Two completely different theories of how American tech leadership actually works. Jensen's frame: 50% of AI developers are in China. Concede that market and you concede the standard. Win all five layers of the stack (silicon, systems, networking, software, models) on CUDA, or watch an open-source ecosystem grow on a foreign tech stack. Nvidia isn't a phone or a car. Export controls calibrated for consumer hardware misread the actual game.

The gap in the interview wasn't curiosity or rigor. It was business framing. The questions kept circling "do you favor these customers" when the real mechanics are purchase orders, allocation, supply commitments, and the relationships that make any of it possible.

Jensen's TSMC partnership predates most of this conversation. Hardware is hard. The scars matter. The parts worth pulling forward (token-dollar economics, supply chain prefetch, Anthropic as the only ASIC customer, highest tokens-per-watt, win all five layers) are operator answers to theoretical questions. The interview is better than the discourse around it. Worth the full watch.
Dwarkesh Patel@dwarkesh_sp

The Jensen Huang episode. 0:00:00 – Is Nvidia’s biggest moat its grip on scarce supply chains? 0:16:25 – Will TPUs break Nvidia’s hold on AI compute? 0:41:06 – Why doesn’t Nvidia become a hyperscaler? 0:57:36 – Should we be selling AI chips to China? 1:35:06 – Why doesn’t Nvidia make multiple different chip architectures? Look up Dwarkesh Podcast on YouTube, Apple Podcasts, Spotify, etc. Enjoy!

37
34
296
69.7K
David Robertson
David Robertson@davidrobertson·
youtu.be/Hrbq66XqtCo?si… There’s a reason Jensen is one of the most talented, thoughtful CEOs in the world. Watch him dismantle @dwarkesh_sp’s weak arguments. Probably the most interesting and informative podcast I’ve seen in a long time. Completely worth the watch.
YouTube video
0
0
1
61
David Robertson reposted
Uncle Bob Martin
Uncle Bob Martin@unclebobmartin·
@thegeeknarrator I disagree. Code is slow for humans. The more we read or write it, the slower we go. To gain productivity from AI we need to disengage from code and put our energies into managing the structure, not the syntax, of the code.
19
12
232
14.5K
David Robertson reposted
David Robertson
David Robertson@davidrobertson·
No doubt, LLMs are obviously math-sensitive. I could see a couple of scenarios where you moved a workload to older hardware that's not as easy to optimize, or to newer hardware that isn't very optimized yet, and if tolerances are lower, that could easily degrade quality. You see this quite a bit when a new open-weights model comes out: the various inference providers will have drastically different benchmark numbers until they tune their systems. MoE is so hard to serve. The complexity difference between serving something dense like Llama 3 and, say, Kimi or GLM is incredible. I definitely agree that quality can seem, and be, degraded; my best guess is different infra changes, not model weights.
1
0
0
11
David Robertson reposted
Onur Solmaz
Onur Solmaz@onusoz·
You need to understand one fact about OpenClaw: people are biased and incentivized to spread disinformation about OpenClaw. That is because OpenClaw IS NOT PUMPING ANYONE’S BAGS, unlike most other projects. Literally every other for-profit agent product is incentivized to trash OpenClaw, BECAUSE OpenClaw is a neutral third party across the industry and geopolitical scene. They MAKE MONEY when OpenClaw loses.

OpenClaw does not worry about making money for some investors. Its founder @steipete is a successful exited founder. He is motivated by having fun and democratizing AI, literally. That is why he is suddenly so loved by everyone. He cares about PEOPLE, not MONEY.

“OpenClaw is bloated” -> Since the beginning of March, OpenClaw has been thinning its core and putting functionality in plugins behind a plugin SDK. Having numerous plugins to choose from does not mean bloat. This was already copied by others and is still a work in progress.

“OpenClaw is not secure” -> OpenClaw has the most eyeballs and immediately addresses any security advisories as soon as they come. It is the most secure agent, by sheer pressure.

“OpenClaw is bought by OpenAI” -> Then why is my bank account so empty bro??? All maintainers are literally unpaid and working DOUBLE beside their day jobs to ship features to you. Do you think VC money can buy that kind of commitment?

Once you understand these facts, you’ll like OpenClaw even more. Because OpenClaw is your AI, the People’s AI. And you can join us too. OpenClaw is the easiest-to-join project in AI right now. You just need to start using it and start making good contributions. If you are competent, you can become a maintainer and join the rest of the team making history!
146
174
1.6K
334.2K