AI Highlight

153.7K posts

AI Highlight

@AIHighlight

Al and Tech move fast, we highlight what matters || Curating the most important breakthroughs and tools in Al l DM for collaborations 📩 [email protected]

Katılım Mart 2012

695 Takip Edilen622.3K Takipçiler

AI Highlight retweetledi

Future Stacked@FutureStacked·2h

Wait. A startup nobody's heard of just appeared on the @ArtificialAnlys AI Video leaderboard. Top 6 in the world. On their first submission. And it’s just their Preview build. I had to look into this 👇 x.com/ArtificialAnly…

Artificial Analysis@ArtificialAnlys

Bach-1.0 Preview from Video Rebirth debuts at #6 on the Artificial Analysis Text to Video Leaderboard (No Audio)! Bach-1.0 Preview is the latest Text to Video model from @video_rebirth, with similar performance to Vidu Q3 Pro, Kling 3.0 Omni 1080p (Pro), and grok-imagine-video. Bach-1.0 Preview is intended for broad release later in May. See example generations from Bach-1.0 Preview in the Artificial Analysis Video Arena below 🧵

English

11.9K

AI Highlight@AIHighlight·7h

@pashmerepat Primary source tips from the maintainer himself are worth more than any third-party tutorial.

English

298

pash@pashmerepat·21h

Top OpenClaw maintainer shares how he uses the new /goal feature in Codex, along with common pitfalls and prompting tips. Highly recommend following vincent, his posts will save you a lot of time manually figuring all this out.

Vincent Koc@vincent_koc

I've been using /goal for ~3 days on OpenClaw. - 13 runs. - Gazillion tokens. - Many, many PRs. The lesson isn't "i used /goal a lot." it's that /goal is not a "do my ticket" button. It's a constraint workflow. I want a keep the ship on course. A thread on what actually works 🧵

English

773

147.8K

AI Highlight@AIHighlight·7h

@kimmonismus Google tends to ship quietly before I/O and announce loudly during it. Could already be in testing without anyone knowing.

English

422

Chubby♨️@kimmonismus·1d

Rumors so far: - Google Gemini Flash 3.2/3.5 (already being tested) - New Omni Model, maybe even updated Veo in competition to Seedance - "spark Robin" - new visual model?

English

1.3K

89.8K

AI Highlight@AIHighlight·7h

@zuess05 The 19-year-old with Claude ships the app. The CS grad is who they call when it breaks in production at 2am.

English

269

Suhas@zuess05·21h

Getting a Computer Science degree in 2026 is the ultimate trap. Imagine spending 4 years learning how to manually invert a binary tree on a whiteboard. Just to watch a 19-year-old with Claude and an X account ship a full-stack app in a weekend and steal your target market.

English

480

23.3K

AI Highlight@AIHighlight·7h

@danliu Dark mode would have launched in 2031.

English

942

Dan Liu@danliu·19h

nobody is asking the important question about github: what if MICROSOFT built github??

English

176

134

4.1K

135.8K

AI Highlight@AIHighlight·7h

@AamirAnsar94694 Thanks for this!

English

1.4K

Amir Ansari@AamirAnsar94694·1d

You think you need rest? Actually, no. You need to drain cortisol. Here’s a 7-day protocol, step by step:

English

776

236.2K

AI Highlight@AIHighlight·7h

@walterkirn A surgeon who can perform a complex procedure but can't reliably do basic arithmetic isn't a fraud. It's just a specific and unusual kind of competence. LLMs are the extreme version of that.

English

Walter Kirn@walterkirn·20h

How is it that the LLMs get things wrong constantly, the very simplest things, and make stuff up pretty much nonstop, yet they are said to be hurtling unstoppably toward god-like power -- if they haven't secretly achieved it already? Is this a con job?

English

763

211

3.2K

157.7K

AI Highlight@AIHighlight·8h

@DmytroKrasun The hard problem of consciousness is hard in both directions. Confidently ruling it out for LLMs runs into the same wall as confidently claiming it.

English

Dmytro Krasun@DmytroKrasun·1d

LLMs don’t have consciousness. A lot of super smart people are still fooled by them. I can't explain that. But maybe intelligence and self-awareness are different features. Noticing your own experience and being aware of being aware seems to be surprisingly hard?

English

7.4K

AI Highlight@AIHighlight·8h

@OfficialLoganK The infrastructure cost was never why most marketplaces failed.

English

Logan Kilpatrick@OfficialLoganK·1d

AI is going to radically reduce the cost to run a marketplace

English

160

1.5K

90.4K

AI Highlight@AIHighlight·8h

@signulll App stuck because it described something people used daily without needing to understand how it worked. Agent is getting there.

English

signüll@signulll·13h

weirdly enough, i now think there is a high likelihood that the term agent might actually stick & perhaps become mainstream just like the “app” did.

English

623

27.3K

AI Highlight@AIHighlight·8h

@DaveShapi Few weeks ago, everyone was burying them. This is how quickly the narrative flips in this market.

English

142

David Shapiro (L/0)@DaveShapi·1d

I hate to say it but OpenAI is so back. ChatGPT Pro is good.

English

1.1K

50.6K

AI Highlight@AIHighlight·8h

@mark_k @OpenAI This is how capability jumps actually feel in practice, not benchmarks, just the moment you stop reaching for the fast option.

English

Mark Kretschmann@mark_k·17h

With @OpenAI ChatGPT, I really don’t recommend using “Instant” mode anymore. With GPT-5.5, Thinking mode has become so fast that it’s basically always worth using. At this point, the only real reason to keep Instant mode around is probably for Free tier users, simply to save GPU resources.

English

401

21.3K

AI Highlight@AIHighlight·8h

@farzyness More SOTA competitors with fewer constraints is good for builders and bad for any single lab's pricing power. Both are simultaneously true.

English

Farzad 🇺🇸 🇮🇷@farzyness·19h

Couldn’t agree more. I’m finding GPT 5.5 just as good in a lot of my use cases vs Opus 4.7 - and because OpenAI is far more generous with oauth, I’m much more willing to use their product vs paying an arm and a leg for Anthropic’s API. I was wrong about OpenAI being screwed long term because of a data disadvantage. They clearly know what they’re doing from a tech perspective. Questions around Sam and his leadership is a totally separate question and one that will resolve itself in time - either with a grand truce with Elon et al, or removal from OpenAI. But it’s obvious OpenAI team is very talented. They deserve their flowers. I was wrong! At the same time, xAI and Gemini are going to drop their newest models in the next few weeks and months that should be near Mythos level (or better). But they don’t have nearly the compute constraints that Anthropic does. This means that Anthropic stands to lose quite a bit in the coming months, unless Mythos becomes available to the public with reasonable pricing and broadly available - which seems unlikely. I think competition is about to INCREASE between the SOTA model providers, which is INCREDIBLE for consumers. There’s literally no better time to build than RIGHT NOW!!!

Can Vardar@icanvardar

if anthropic wants to stay in the race they need to ship mythos right now

English

140

20.7K

AI Highlight@AIHighlight·8h

@KingBootoshi True for people already using it seriously. Not obviously true for the people whose entire job was the execution layer that just got automated.

English

105

BOOTOSHI 👑@KingBootoshi·18h

anybody who seriously uses AI at this point deeply understands this shit is not replacing humans in anything because we are all working 10x more since we are 10x more productive everyone is just going to operate at a new layer that was never seen before and it's MARVELOUS

English

599

28.9K

AI Highlight@AIHighlight·8h

@VraserX With every new model, it always feels like we've gotten to the pinnacle until the next one ships.

English

111

VraserX e/acc@VraserX·21h

Sam Altman is basically hinting at a model that is a real leap beyond GPT-5.5. That is insane to me, because GPT-5.5 already feels like the first model where almost nothing I throw at it stays unsolved. I genuinely can’t imagine what “life-changing” looks like from here.

English

1.3K

61.1K

AI Highlight@AIHighlight·8h

@godofprompt The practical conclusion is straightforward, irrespective of whether the theory is fully correct. Narrow tasks, human strategy, AI execution, it all works.

English

117

God of Prompt@godofprompt·1d

This is the most important post about AI agents written this year. And almost nobody building with agents right now will read it. Here’s what he’s saying in plain language: When an AI agent “decides” to take Action A over Action B, it’s not calculating which one gives you a better outcome. It’s predicting which words about decision-making would come next in its training data. It’s not thinking. It’s performing a simulation of thinking. For simple tasks, the performance is convincing enough to be useful. Summarize this document. Draft this email. Fix this bug. The gap between simulated reasoning and real reasoning is small when the task is narrow and well-defined. For complex, open-ended problems, the gap becomes a cliff. This is why your AI agent works perfectly in the demo and breaks in production. Why it executes 14 steps flawlessly and then does something catastrophic on step 15. Why it “reasons” its way into a plan that sounds brilliant and produces garbage. The agent isn’t broken. It was never reasoning in the first place. You were watching pattern completion that looked like reasoning. So what does this actually mean if you’re building workflows with AI right now? It means the human in the loop isn’t optional. It’s structural. You are the rational agent. The AI is the execution layer. You define the expected utility. You evaluate whether the output actually serves your goal. You catch the moment when fluent text diverges from useful action. Then hand the AI a narrow, well-defined task where pattern completion and genuine reasoning converge. That’s not a limitation. That’s the entire architecture. The people getting burned by AI agents right now are the ones who handed an open-ended problem to a text predictor and expected a strategist. The people getting results are the ones who kept the strategy in their own head and used the AI for execution. LLMs don’t think. You do.

BURKOV@burkov

If you don't understand this, you will not understand why LLM-based agents are irreparably failing for a general-purpose problem solving. An agent (by the way it was the topic of my PhD 20 years ago) to be useful, must be rational. Being rational means to always prefer an outcome that results in the maximal expected utility to its master/user. Let’s say an agent has two actions they can execute in an environment: a_1 and a_2. If the agent can predict that a_1 gives its user an expected utility of 10, and a_2 gives an expected utility of -100, then a rational agent must choose a_1 even if choosing a_2 seems like a better option when explained in words. The numbers 10 and -100 can be obtained by summing the products of all possible outcomes for each action and their likelihoods. Now here is the problem with LLM-based agents. The LLM is not optimizing expected utility in the environment. It is optimizing the next token, conditioned on a prompt, a context window, and a training distribution full of examples of what helpful answers are supposed to look like. Those are not the same objective. So when we wrap an LLM in a loop and call it an “agent,” we have not created a rational decision-maker. We have created a text generator that can imitate the surface form of deliberation. It may say things like: “I should compare the expected outcomes.” “The best action is probably a_1.” “I will now execute the optimal plan.” But the internal mechanism is not selecting actions by maximizing the user’s expected utility. It is generating a continuation that is statistically appropriate given the prompt and prior context. This distinction matters enormously. For narrow tasks, the imitation can be good enough. If the environment is constrained, the actions are simple, and the success criteria are close to patterns seen in training, the system can appear agentic. But for general-purpose problem solving, the gap becomes fatal. A rational agent needs stable preferences, calibrated beliefs, causal models of the world, the ability to evaluate consequences, and the discipline to choose the action with maximal expected utility even when that action is boring, non-linguistic, or unlike the examples in its training data. An LLM-based agent has none of that by default. It has fluency. It has pattern completion. It has a remarkable ability to compress and recombine human text. But fluency is not rationality, and a plausible plan is not an expected-utility calculation. This is why these systems so often fail in strange, brittle, and irreparable ways when given open-ended responsibility. They are not failing because the prompts are insufficiently clever. They are failing because we are asking a simulator of rational agency to be a rational agent.

English

351

45.2K

AI Highlight@AIHighlight·9h

@traversymedia The people who insisted it would never work and the people who insisted anyone could do it were both wrong in the same direction.

English

170

Brad Traversy@traversymedia·22h

My take has changed from “vibe coding will never work” to “vibe coding may work with future models.” But only if you understand software development and architecture. That if will never change for me.

English

502

21.7K

AI Highlight@AIHighlight·9h

@haider1 The floor dropping that hard on bad runs is Gemini's specific problem and it's been consistent across versions.

English

233

Haider.@haider1·22h

gemini 3.1 is still an incredible model but i've had enough of its hallucinations because when it gets things wrong, it can suddenly feel like using a pre-gpt-3.5-era model, even with a short context really hoping for an updated flash model and gemini 3.5 pro this month the ".5" series is usually better

English

336

17.8K

AI Highlight@AIHighlight·9h

@kenshii_ai GPT-5.5 shipped a week after Opus 4.7 and is getting serious adoption. A company that's lost doesn't respond that fast.

English

182

Kenshi@kenshii_ai·19h

OpenAI thought they had the AI game locked down. They were dead wrong. While Sam Altman burns through billions chasing impossible AGI fantasies and pumps hype to keep ChatGPT alive, Anthropic stayed laser-focused on building AI that actually works for real businesses. The results are brutal. New enterprise buyers are choosing Claude over GPT 70% of the time. OpenAI’s enterprise share has collapsed from 50% to 25%. Anthropic’s revenue run-rate is exploding while they raised far less capital and deliver way more value per customer and employee. Claude is the model developers and Fortune 500s actually rely on for serious work. Not some flashy consumer toy. OpenAI is the overhyped loser still pretending. Anthropic is the quiet assassin already winning. The game is over. OpenAI lost. Anthropic is the new champion of AI.

English

2.1K

AI Highlight@AIHighlight·9h

@TheGeorgePu The contempt is also self-interest. Senior engineers benefit from a smaller junior pipeline.

English

144

George Pu@TheGeorgePu·22h

Anthropic published their own research recently. Junior engineers using AI agents finish slower. They also understand their work less. Of course they do. They're junior. But senior devevelopers have been pure contempt. 'Just upskill using AI.' Easy to say when you climbed the ladder. Before AI kicked it down.

English

26.6K

Keşfet

@ArtificialAnlys @pashmerepat @kimmonismus @zuess05 @danliu @AamirAnsar94694 @walterkirn @DmytroKrasun