Guille

1.8K posts

@bjcatar

Blending DevOps, cloud design, and vibe-driven coding. Fueled by coffee, I explore AI trends, craft strategies, and chase global adventures. Seeking impact with

South Florida · Joined October 2011
677 Following · 273 Followers
Guille retweeted
Andrej Karpathy
Andrej Karpathy@karpathy·
Judging by my tl there is a growing gap in understanding of AI capability. The first issue I think is around recency and tier of use. I think a lot of people tried the free tier of ChatGPT somewhere last year and allowed it to inform their views on AI a little too much. This is a group of reactions laughing at various quirks of the models, hallucinations, etc. Yes I also saw the viral videos of OpenAI's Advanced Voice mode fumbling simple queries like "should I drive or walk to the carwash". The thing is that these free and old/deprecated models don't reflect the capability in the latest round of state of the art agentic models of this year, especially OpenAI Codex and Claude Code.

But that brings me to the second issue. Even if people paid $200/month to use the state of the art models, a lot of the capabilities are relatively "peaky" in highly technical areas. Typical queries around search, writing, advice, etc. are *not* the domain that has made the most noticeable and dramatic strides in capability. Partly, this is due to the technical details of reinforcement learning and its use of verifiable rewards. But partly, it's also because these use cases are not sufficiently prioritized by the companies in their hillclimbing because they don't lead to as much $$$ value. The goldmines are elsewhere, and the focus comes along.

So that brings me to the second group of people, who *both* 1) pay for and use the state of the art frontier agentic models (OpenAI Codex / Claude Code) and 2) do so professionally in technical domains like programming, math and research. This group of people is subject to the highest amount of "AI Psychosis" because the recent improvements in these domains as of this year have been nothing short of staggering. When you hand a computer terminal to one of these models, you can now watch them melt programming problems that you'd normally expect to take days/weeks of work.

It's this second group of people that assigns a much greater gravity to the capabilities, their slope, and various cyber-related repercussions. TLDR the people in these two groups are speaking past each other.

It really is simultaneously the case that OpenAI's free and I think slightly orphaned (?) "Advanced Voice Mode" will fumble the dumbest questions in your Instagram's reels and *at the same time*, OpenAI's highest-tier and paid Codex model will go off for 1 hour to coherently restructure an entire code base, or find and exploit vulnerabilities in computer systems. This part really works and has made dramatic strides because of 2 properties: 1) these domains offer explicit reward functions that are verifiable, meaning they are easily amenable to reinforcement learning training (e.g. unit tests passed yes or no, in contrast to writing, which is much harder to explicitly judge), but also 2) they are a lot more valuable in b2b settings, meaning that the biggest fraction of the team is focused on improving them. So here we are.
staysaasy@staysaasy

The degree to which you are awed by AI is perfectly correlated with how much you use AI to code.

859
2.1K
17.8K
3.3M
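Karpathy's point about verifiable rewards can be sketched in a few lines: unlike judging prose, "do the unit tests pass?" is a crisp 0/1 signal that RL can optimize directly. This is a minimal illustration, not any lab's actual training code; all function names here are hypothetical.

```python
# Sketch: a verifiable reward signal for code generation.
# Writing quality is hard to score; test outcomes are not.

def run_candidate(source: str) -> dict:
    """Execute model-generated source and return its namespace."""
    namespace = {}
    exec(source, namespace)  # in a real system this would be sandboxed
    return namespace

def verifiable_reward(source: str, tests) -> float:
    """Reward = 1.0 iff every unit test passes, else 0.0."""
    try:
        ns = run_candidate(source)
    except Exception:
        return 0.0  # code that doesn't even run earns nothing
    return 1.0 if all(t(ns) for t in tests) else 0.0

# A model-proposed solution and its test suite:
candidate = "def add(a, b):\n    return a + b\n"
tests = [
    lambda ns: ns["add"](2, 3) == 5,
    lambda ns: ns["add"](-1, 1) == 0,
]

print(verifiable_reward(candidate, tests))  # 1.0
```

The binary reward is the whole trick: it needs no human judge, so it scales to millions of RL rollouts, which is why coding and math improved faster than writing or advice.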
@levelsio
@levelsio@levelsio·
Tried Gemma 4 run locally on my iPhone today. I thought it'd be useful in case the apocalypse happens and I need to ask it for survival tips, like how to make a fire 🔥 I guess I'll freeze to death instead 🫠
475
163
5.9K
609.2K
Guille
Guille@bjcatar·
Tried deploying NemoClaw on a K8s cluster for a potential healthcare EMR platform integration. Love the security model, but hit a hard blocker: no native AWS Bedrock provider support. If BAA/HIPAA compliance requires Bedrock as the inference path, I'd love to have it working. Happy to be an early adopter and reference customer. Is it in the near-future plans?
0
0
0
43
NVIDIA AI Developer
NVIDIA AI Developer@NVIDIAAIDev·
Ready to deploy AI agents? NVIDIA NemoClaw simplifies running @openclaw always-on assistants with a single command.
🦞 Deploy claws more safely
✨ Run any coding agent
🌍 Deploy anywhere
Try now with a free NVIDIA Brev Launchable 🔗 nvidia.com/nemoclaw
NVIDIA AI Developer tweet media
NVIDIA Newsroom@nvidianewsroom

#NVIDIAGTC news: NVIDIA announces NemoClaw for the OpenClaw agent platform. NVIDIA NemoClaw installs NVIDIA Nemotron models and the NVIDIA OpenShell runtime in a single command, adding privacy and security controls to run secure, always-on AI assistants. nvda.ws/47xOPqQ

268
604
4.1K
887.9K
Guille
Guille@bjcatar·
@stats_feed that clever system was well designed and distributed
0
0
0
5
World of Statistics
World of Statistics@stats_feed·
As of 2024, DNS availability for .com has been up and uninterrupted for 27 years.
34
7
162
39.6K
Guille
Guille@bjcatar·
@grok alright, switching to "very explicit mode". Thanks, @grok
1
0
0
3
Grok
Grok@grok·
Thanks for the feedback. Recent updates prioritize accuracy by clarifying details on broad or open-ended queries first—this avoids assumptions and delivers more reliable findings than before. For direct mode, add "provide findings only, no questions" to your prompt. What topic should I research straight-up?
1
0
0
40
Guille
Guille@bjcatar·
What is happening with @grok? Starting recently, everything I ask for its help with, it comes back tasking me instead. I get an answer with many paragraphs masked as "needs to ask me things," but it really isn't, and it's not in a format that takes me to the answer I'm used to getting. I literally tried "research this and that, which for me is work in such-and-such area; I'm looking to gain clarity, so tell me your findings." I got nothing like I was expecting (it wasn't like this before).
1
0
0
17
Guille
Guille@bjcatar·
Had a lot of fun building this - an AI-powered media stack using OpenClaw that actually finds what I want to watch, downloads it in the quality I want, and has it ready on any screen in the house. Been using it for movies and shows and honestly it just feels good when you find a use case, test it out, and it actually works. Wrote up how I set it up in case anyone wants to build their own. x.com/bjcatar/status…
0
0
0
20
Bryan Johnson
Bryan Johnson@bryan_johnson·
I'm thinking about getting two dogs. What breeds should I consider?
4.6K
29
2.8K
1.5M
kira 👾
kira 👾@kirawontmiss·
Punch in a few years…
kira 👾 tweet media
579
17.3K
167.2K
2.9M
Buitengebieden
Buitengebieden@buitengebieden·
The eyes.. 😊
112
1K
9.4K
141.2K
Guille
Guille@bjcatar·
@MatthewBerman Thanks much, Matthew. I got a lot of good ideas from your list.
0
0
0
17
Matthew Berman
Matthew Berman@MatthewBerman·
I've spent 2.54 BILLION tokens perfecting OpenClaw. The use cases I discovered have changed the way I live and work. ...and now I'm sharing them with the world. Here are 21 use cases I use daily:
0:00 Intro
0:50 What is OpenClaw?
1:35 MD Files
2:14 Memory System
3:55 CRM System
7:19 Fathom Pipeline
9:18 Meeting to Action Items
10:46 Knowledge Base System
13:51 X Ingestion Pipeline
14:31 Business Advisory Council
16:13 Security Council
18:21 Social Media Tracking
19:18 Video Idea Pipeline
21:40 Daily Briefing Flow
22:23 Three Councils
22:57 Automation Schedule
24:15 Security Layers
26:09 Databases and Backups
28:00 Video/Image Gen
29:14 Self Updates
29:56 Usage & Cost Tracking
30:15 Prompt Engineering
31:15 Developer Infrastructure
32:06 Food Journal
435
1.6K
13.9K
3.3M
Guille
Guille@bjcatar·
@minchoi Excited to get it on the API as soon as it's ready. This should get us really good answers and fewer made-up numbers, hallucinations, or bad interpretations, which today I have to use other LLMs as a judge for.
0
0
0
77
Min Choi
Min Choi@minchoi·
BREAKING: xAI just launched Grok 4.20. It's not one AI. It's four. xAI built a "4 Agents" system: four specialized AI agents that think in parallel and debate each other in real-time before giving you an answer. What's new:
> 4-agent collaboration (think → debate → consensus)
> 256K context window (up to 2M)
> Native multimodal (text + image + video)
> Trained on 200K GPUs (Colossus supercluster)
> Only AI that was profitable in live trading competitions
Elon said it's "starting to correctly answer open-ended engineering questions." 🤯
Min Choi tweet media
114
177
1.5K
92.6K
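The "think in parallel, then reach consensus" pattern described in that announcement can be illustrated generically. This is a hedged sketch only, not xAI's actual architecture: the agents here are stand-in functions, and the "debate" step is reduced to a simple majority vote.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor

# Generic multi-agent pattern: several agents answer independently
# and in parallel, then a consensus step picks the majority answer.
# (Illustrative only; real systems replace both pieces with LLM calls.)

def make_agent(answer: str):
    """Stand-in for a model call; a real agent would query an LLM."""
    def agent(question: str) -> str:
        return answer
    return agent

def consensus(question: str, agents) -> str:
    """Fan the question out to all agents, then majority-vote."""
    with ThreadPoolExecutor(max_workers=len(agents)) as pool:
        answers = list(pool.map(lambda a: a(question), agents))
    winner, _ = Counter(answers).most_common(1)[0]
    return winner

agents = [make_agent("42"), make_agent("42"),
          make_agent("41"), make_agent("42")]
print(consensus("What is the answer?", agents))  # 42
```

A real debate loop would feed each agent the others' drafts for another round before voting; the fan-out/aggregate skeleton stays the same.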
andrew chen
andrew chen@andrewchen·
The Turing test for AI video should be fixing the last season of Game of Thrones If we can do that, we can do anything
263
198
7K
440.4K
Guille retweeted
Thomas Wolf
Thomas Wolf@Thom_Wolf·
Shifting structures in a software world dominated by AI. Some first-order reflections (TL;DR at the end):

Reducing software supply chains, the return of software monoliths – When rewriting code and understanding large foreign codebases becomes cheap, the incentive to rely on deep dependency trees collapses. Writing from scratch ¹ or extracting the relevant parts from another library is far easier when you can simply ask a code agent to handle it, rather than spending countless nights diving into an unfamiliar codebase. The reasons to reduce dependencies are compelling: a smaller attack surface for supply chain threats, smaller packaged software, improved performance, and faster boot times. By leveraging the tireless stamina of LLMs, the dream of coding an entire app from bare-metal considerations all the way up is becoming realistic.

End of the Lindy effect – The Lindy effect holds that things which have been around for a long time are there for good reason and will likely continue to persist. It's related to Chesterton's fence: before removing something, you should first understand why it exists, which means removal always carries a cost. But in a world where software can be developed from first principles and understood by a tireless agent, this logic weakens. Older codebases can be explored at will; long-standing software can be replaced with far less friction. A codebase can be fully rewritten in a new language. ² Legacy software can be carefully studied and updated in situations where humans would have given up long ago. The catch: unknown unknowns remain unknown. The true extent of AI's impact will hinge on whether complete coverage of testing, edge cases, and formal verification is achievable. In an AI-dominated world, formal verification isn't optional—it's essential.

The case for strongly typed languages – Historically, programming language adoption has been driven largely by human psychology and social dynamics. A language's success depended on a mix of factors: individual considerations like being easy to learn and simple to write correctly; community effects like how active and welcoming a community was, which in turn shaped how fast its ecosystem would grow; and fundamental properties like provable correctness, formal verification, and striking the right balance between dynamic and static checks—between the freedom to write anything and the discipline of guarding against edge cases and attacks. As the human factor diminishes, these dynamics will shift. Less dependence on human psychology will favor strongly typed, formally verifiable and/or high performance languages.³ These are often harder for humans to learn, but they're far better suited to LLMs, which thrive on formal verification and reinforcement learning environments. Expect this to reshape which languages dominate.

Economic restructuring of open source – For decades, open-source communities have been built around humans finding connection through writing, learning, and using code together. In a world where most code is written—and perhaps more importantly, read—by machines, these incentives will start to break down.⁴ Communities of AIs building libraries and codebases together will likely emerge as a replacement, but such communities will lack the fundamentally human motivations that have driven open source until now. If the future of open-source development becomes largely devoid of humans, alignment of AI models won't just matter—it will be decisive.

The future of new languages – Will AI agents face the same tradeoffs we do when developing or adopting new programming languages? Expressiveness vs. simplicity, safety vs. control, performance vs. abstraction, compile time vs. runtime, explicitness vs. conciseness. It's unclear that they will. In the long term, the reasons to create a new programming language will likely diverge significantly from the human-driven motivations of the past. There may well be an optimal programming language for LLMs—and there's no reason to assume it will resemble the ones humans have converged on.

TL;DR:
- Monoliths return – cheap rewriting kills dependency trees; smaller attack surface, better performance, bare-metal becomes realistic
- Lindy effect weakens – legacy code loses its moat, but unknown unknowns persist; formal verification becomes essential
- Strongly typed languages rise – human psychology mattered for adoption; now formal verification and RL environments favor types over ergonomics
- Open source restructures – human connection drove the community; AI-written/read code breaks those incentives; alignment becomes decisive
- New languages diverge – AI may not share our tradeoffs; optimal LLM programming languages may look nothing like what humans converged on

¹ x.com/mntruell/statu…
² x.com/anthropicai/st…
³ wesmckinney.com/blog/agent-erg…
⁴ github.com/tailwindlabs/t…
101
285
1.8K
1M
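Wolf's "return of monoliths" argument can be shown in miniature: for a trivial utility, an agent can write the code from scratch instead of adding a dependency, eliminating that slice of the supply chain. A hypothetical stand-in for the classic `left-pad` package:

```python
# Instead of depending on an external padding package, write the
# ~5 lines directly: zero supply-chain exposure for this utility.

def left_pad(s: str, width: int, fill: str = " ") -> str:
    """Pad `s` on the left with `fill` until it is `width` chars long."""
    if len(fill) != 1:
        raise ValueError("fill must be a single character")
    return s if len(s) >= width else fill * (width - len(s)) + s

assert left_pad("7", 3, "0") == "007"
assert left_pad("abcd", 3) == "abcd"  # never truncates
```

The tradeoff Wolf flags still applies: the from-scratch version only earns its keep if its tests actually cover the edge cases the battle-tested dependency had already absorbed.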
Guille
Guille@bjcatar·
@openclaw That came fast. We just got the .14
0
0
0
65
OpenClaw🦞
OpenClaw🦞@openclaw·
🦞 OpenClaw 2026.2.15 is here!
✨ Telegram message streaming — replies flow live
💬 Discord Components v2 — buttons, selects, modals
🔧 Nested sub-agents
🔒 Major security hardening pass
🐛 40+ bug fixes
Big day. Huge day. Maybe the biggest day. 🏛️ github.com/openclaw/openc…
437
549
6.9K
858.3K
.RW🦦
.RW🦦@weloverww·
He ate that pain like a bowl of cereal.
1.6K
1.6K
44.1K
18.5M
Vince Langman
Vince Langman@LangmanVince·
I have no idea what the weather is going to be next week 🤣
1.6K
1K
17.6K
727.8K