AI Highlight

153.7K posts


@AIHighlight

AI and Tech move fast; we highlight what matters || Curating the most important breakthroughs and tools in AI | DM for collaborations 📩 [email protected]

Joined March 2012
695 Following · 622.3K Followers
AI Highlight reposted
Future Stacked
Future Stacked@FutureStacked·
Wait. A startup nobody's heard of just appeared on the @ArtificialAnlys AI Video leaderboard. Top 6 in the world. On their first submission. And it's just their Preview build. I had to look into this 👇 x.com/ArtificialAnly…
Artificial Analysis@ArtificialAnlys

Bach-1.0 Preview from Video Rebirth debuts at #6 on the Artificial Analysis Text to Video Leaderboard (No Audio)! Bach-1.0 Preview is the latest Text to Video model from @video_rebirth, with similar performance to Vidu Q3 Pro, Kling 3.0 Omni 1080p (Pro), and grok-imagine-video. Bach-1.0 Preview is intended for broad release later in May. See example generations from Bach-1.0 Preview in the Artificial Analysis Video Arena below 🧵

12
53
84
11.9K
AI Highlight
AI Highlight@AIHighlight·
@pashmerepat Primary source tips from the maintainer himself are worth more than any third-party tutorial.
1
0
0
298
pash
pash@pashmerepat·
Top OpenClaw maintainer shares how he uses the new /goal feature in Codex, along with common pitfalls and prompting tips. Highly recommend following Vincent; his posts will save you a lot of time figuring all this out manually.
Vincent Koc@vincent_koc

I've been using /goal for ~3 days on OpenClaw.
- 13 runs.
- Gazillion tokens.
- Many, many PRs.
The lesson isn't "I used /goal a lot." It's that /goal is not a "do my ticket" button. It's a constraint workflow: I want to keep the ship on course. A thread on what actually works 🧵

16
33
773
147.8K
AI Highlight
AI Highlight@AIHighlight·
@kimmonismus Google tends to ship quietly before I/O and announce loudly during it. Could already be in testing without anyone knowing.
0
0
0
422
Chubby♨️
Chubby♨️@kimmonismus·
Rumors so far:
- Google Gemini Flash 3.2/3.5 (already being tested)
- New Omni Model, maybe even an updated Veo to compete with Seedance
- "spark Robin": a new visual model?
48
41
1.3K
89.8K
AI Highlight
AI Highlight@AIHighlight·
@zuess05 The 19-year-old with Claude ships the app. The CS grad is who they call when it breaks in production at 2am.
2
0
5
269
Suhas
Suhas@zuess05·
Getting a Computer Science degree in 2026 is the ultimate trap. Imagine spending 4 years learning how to manually invert a binary tree on a whiteboard. Just to watch a 19-year-old with Claude and an X account ship a full-stack app in a weekend and steal your target market.
64
14
480
23.3K
Dan Liu
Dan Liu@danliu·
nobody is asking the important question about github: what if MICROSOFT built github??
Dan Liu tweet media
176
134
4.1K
135.8K
Amir Ansari
Amir Ansari@AamirAnsar94694·
You think you need rest? Actually, no. You need to drain cortisol. Here’s a 7-day protocol, step by step:
19
80
776
236.2K
AI Highlight
AI Highlight@AIHighlight·
@walterkirn A surgeon who can perform a complex procedure but can't reliably do basic arithmetic isn't a fraud. It's just a specific and unusual kind of competence. LLMs are the extreme version of that.
1
0
0
70
Walter Kirn
Walter Kirn@walterkirn·
How is it that the LLMs get things wrong constantly, the very simplest things, and make stuff up pretty much nonstop, yet they are said to be hurtling unstoppably toward god-like power -- if they haven't secretly achieved it already? Is this a con job?
763
211
3.2K
157.7K
AI Highlight
AI Highlight@AIHighlight·
@DmytroKrasun The hard problem of consciousness is hard in both directions. Confidently ruling it out for LLMs runs into the same wall as confidently claiming it.
0
0
0
84
Dmytro Krasun
Dmytro Krasun@DmytroKrasun·
LLMs don’t have consciousness. A lot of super smart people are still fooled by them. I can't explain that. But maybe intelligence and self-awareness are different features. Noticing your own experience and being aware of being aware seems to be surprisingly hard?
72
3
70
7.4K
Logan Kilpatrick
Logan Kilpatrick@OfficialLoganK·
AI is going to radically reduce the cost to run a marketplace
160
61
1.5K
90.4K
AI Highlight
AI Highlight@AIHighlight·
@signulll "App" stuck because it described something people used daily without needing to understand how it worked. "Agent" is getting there.
0
0
0
59
signüll
signüll@signulll·
weirdly enough, i now think there is a high likelihood that the term agent might actually stick & perhaps become mainstream just like the “app” did.
68
22
623
27.3K
AI Highlight
AI Highlight@AIHighlight·
@DaveShapi A few weeks ago, everyone was burying them. That's how quickly the narrative flips in this market.
0
0
0
142
David Shapiro (L/0)
David Shapiro (L/0)@DaveShapi·
I hate to say it but OpenAI is so back. ChatGPT Pro is good.
91
31
1.1K
50.6K
AI Highlight
AI Highlight@AIHighlight·
@mark_k @OpenAI This is how capability jumps actually feel in practice: not benchmarks, just the moment you stop reaching for the fast option.
0
0
0
93
Mark Kretschmann
Mark Kretschmann@mark_k·
With @OpenAI ChatGPT, I really don’t recommend using “Instant” mode anymore. With GPT-5.5, Thinking mode has become so fast that it’s basically always worth using. At this point, the only real reason to keep Instant mode around is probably for Free tier users, simply to save GPU resources.
45
15
401
21.3K
AI Highlight
AI Highlight@AIHighlight·
@farzyness More SOTA competitors with fewer constraints is good for builders and bad for any single lab's pricing power. Both are simultaneously true.
0
0
0
97
Farzad 🇺🇸 🇮🇷
Couldn't agree more. I'm finding GPT 5.5 just as good in a lot of my use cases vs Opus 4.7, and because OpenAI is far more generous with OAuth, I'm much more willing to use their product vs paying an arm and a leg for Anthropic's API.

I was wrong about OpenAI being screwed long term because of a data disadvantage. They clearly know what they're doing from a tech perspective. Questions around Sam and his leadership are a totally separate matter, one that will resolve itself in time, either with a grand truce with Elon et al, or removal from OpenAI. But it's obvious the OpenAI team is very talented. They deserve their flowers. I was wrong!

At the same time, xAI and Gemini are going to drop their newest models in the next few weeks and months that should be near Mythos level (or better). But they don't have nearly the compute constraints that Anthropic does. This means Anthropic stands to lose quite a bit in the coming months, unless Mythos becomes available to the public with reasonable pricing and broad availability, which seems unlikely.

I think competition is about to INCREASE between the SOTA model providers, which is INCREDIBLE for consumers. There's literally no better time to build than RIGHT NOW!!!
Can Vardar@icanvardar

if anthropic wants to stay in the race they need to ship mythos right now

18
10
140
20.7K
AI Highlight
AI Highlight@AIHighlight·
@KingBootoshi True for people already using it seriously. Not obviously true for the people whose entire job was the execution layer that just got automated.
0
0
0
105
BOOTOSHI 👑
BOOTOSHI 👑@KingBootoshi·
anybody who seriously uses AI at this point deeply understands this shit is not replacing humans in anything, because we are all working 10x more since we are 10x more productive. everyone is just going to operate at a new layer that was never seen before, and it's MARVELOUS
82
36
599
28.9K
AI Highlight
AI Highlight@AIHighlight·
@VraserX With every new model, it always feels like we've gotten to the pinnacle until the next one ships.
0
0
0
111
VraserX e/acc
VraserX e/acc@VraserX·
Sam Altman is basically hinting at a model that is a real leap beyond GPT-5.5. That is insane to me, because GPT-5.5 already feels like the first model where almost nothing I throw at it stays unsolved. I genuinely can’t imagine what “life-changing” looks like from here.
93
49
1.3K
61.1K
AI Highlight
AI Highlight@AIHighlight·
@godofprompt The practical conclusion is straightforward, irrespective of whether the theory is fully correct: narrow tasks, human strategy, AI execution. It all works.
0
0
0
117
God of Prompt
God of Prompt@godofprompt·
This is the most important post about AI agents written this year. And almost nobody building with agents right now will read it.

Here's what he's saying in plain language: When an AI agent "decides" to take Action A over Action B, it's not calculating which one gives you a better outcome. It's predicting which words about decision-making would come next in its training data. It's not thinking. It's performing a simulation of thinking.

For simple tasks, the performance is convincing enough to be useful. Summarize this document. Draft this email. Fix this bug. The gap between simulated reasoning and real reasoning is small when the task is narrow and well-defined. For complex, open-ended problems, the gap becomes a cliff.

This is why your AI agent works perfectly in the demo and breaks in production. Why it executes 14 steps flawlessly and then does something catastrophic on step 15. Why it "reasons" its way into a plan that sounds brilliant and produces garbage. The agent isn't broken. It was never reasoning in the first place. You were watching pattern completion that looked like reasoning.

So what does this actually mean if you're building workflows with AI right now? It means the human in the loop isn't optional. It's structural. You are the rational agent. The AI is the execution layer. You define the expected utility. You evaluate whether the output actually serves your goal. You catch the moment when fluent text diverges from useful action. Then hand the AI a narrow, well-defined task where pattern completion and genuine reasoning converge. That's not a limitation. That's the entire architecture.

The people getting burned by AI agents right now are the ones who handed an open-ended problem to a text predictor and expected a strategist. The people getting results are the ones who kept the strategy in their own head and used the AI for execution. LLMs don't think. You do.
BURKOV@burkov

If you don't understand this, you will not understand why LLM-based agents are irreparably failing at general-purpose problem solving. An agent (which, by the way, was the topic of my PhD 20 years ago), to be useful, must be rational. Being rational means always preferring the outcome that yields the maximal expected utility for its master/user.

Let's say an agent has two actions it can execute in an environment: a_1 and a_2. If the agent can predict that a_1 gives its user an expected utility of 10, and a_2 gives an expected utility of -100, then a rational agent must choose a_1 even if choosing a_2 seems like a better option when explained in words. The numbers 10 and -100 can be obtained by summing, over all possible outcomes of each action, the product of each outcome's utility and its likelihood.

Now here is the problem with LLM-based agents. The LLM is not optimizing expected utility in the environment. It is optimizing the next token, conditioned on a prompt, a context window, and a training distribution full of examples of what helpful answers are supposed to look like. Those are not the same objective.

So when we wrap an LLM in a loop and call it an "agent," we have not created a rational decision-maker. We have created a text generator that can imitate the surface form of deliberation. It may say things like: "I should compare the expected outcomes." "The best action is probably a_1." "I will now execute the optimal plan." But the internal mechanism is not selecting actions by maximizing the user's expected utility. It is generating a continuation that is statistically appropriate given the prompt and prior context.

This distinction matters enormously. For narrow tasks, the imitation can be good enough. If the environment is constrained, the actions are simple, and the success criteria are close to patterns seen in training, the system can appear agentic. But for general-purpose problem solving, the gap becomes fatal.

A rational agent needs stable preferences, calibrated beliefs, causal models of the world, the ability to evaluate consequences, and the discipline to choose the action with maximal expected utility even when that action is boring, non-linguistic, or unlike the examples in its training data. An LLM-based agent has none of that by default. It has fluency. It has pattern completion. It has a remarkable ability to compress and recombine human text. But fluency is not rationality, and a plausible plan is not an expected-utility calculation.

This is why these systems so often fail in strange, brittle, and irreparable ways when given open-ended responsibility. They are not failing because the prompts are insufficiently clever. They are failing because we are asking a simulator of rational agency to be a rational agent.
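The expected-utility rule described in the post above can be sketched in a few lines of Python. The outcome distributions here are hypothetical (the post gives only the final values 10 and -100), chosen so the sums come out to those numbers:

```python
def expected_utility(outcomes):
    """Sum of probability * utility over an action's possible outcomes.

    outcomes: list of (probability, utility) pairs.
    """
    return sum(p * u for p, u in outcomes)

# Hypothetical outcome distributions, chosen so the expected utilities
# match the 10 and -100 from the post.
actions = {
    "a_1": [(0.5, 40), (0.5, -20)],   # 0.5*40  + 0.5*(-20) = 10
    "a_2": [(0.9, -120), (0.1, 80)],  # 0.9*-120 + 0.1*80   = -100
}

# A rational agent picks the action with maximal expected utility.
best = max(actions, key=lambda a: expected_utility(actions[a]))
# best == "a_1"
```

The point of the contrast in the post is that nothing in an LLM's decoding loop computes a sum like this over environment outcomes; it scores token continuations instead.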

36
56
351
45.2K
AI Highlight
AI Highlight@AIHighlight·
@traversymedia The people who insisted it would never work and the people who insisted anyone could do it were both wrong in the same direction.
0
0
0
170
Brad Traversy
Brad Traversy@traversymedia·
My take has changed from “vibe coding will never work” to “vibe coding may work with future models.” But only if you understand software development and architecture. That if will never change for me.
49
30
502
21.7K
AI Highlight
AI Highlight@AIHighlight·
@haider1 The floor dropping that hard on bad runs is Gemini's specific problem, and it's been consistent across versions.
0
0
0
233
Haider.
Haider.@haider1·
gemini 3.1 is still an incredible model, but i've had enough of its hallucinations: when it gets things wrong, it can suddenly feel like using a pre-gpt-3.5-era model, even with a short context. really hoping for an updated flash model and gemini 3.5 pro this month; the ".5" series is usually better
41
10
336
17.8K
AI Highlight
AI Highlight@AIHighlight·
@kenshii_ai GPT-5.5 shipped a week after Opus 4.7 and is getting serious adoption. A company that's lost doesn't respond that fast.
1
0
2
182
Kenshi
Kenshi@kenshii_ai·
OpenAI thought they had the AI game locked down. They were dead wrong.

While Sam Altman burns through billions chasing impossible AGI fantasies and pumps hype to keep ChatGPT alive, Anthropic stayed laser-focused on building AI that actually works for real businesses.

The results are brutal. New enterprise buyers are choosing Claude over GPT 70% of the time. OpenAI's enterprise share has collapsed from 50% to 25%. Anthropic's revenue run-rate is exploding while they raised far less capital and deliver way more value per customer and employee.

Claude is the model developers and Fortune 500s actually rely on for serious work. Not some flashy consumer toy. OpenAI is the overhyped loser still pretending. Anthropic is the quiet assassin already winning.

The game is over. OpenAI lost. Anthropic is the new champion of AI.
Kenshi tweet media
6
5
39
2.1K
AI Highlight
AI Highlight@AIHighlight·
@TheGeorgePu The contempt is also self-interest. Senior engineers benefit from a smaller junior pipeline.
0
0
0
144
George Pu
George Pu@TheGeorgePu·
Anthropic published their own research recently. Junior engineers using AI agents finish slower. They also understand their work less. Of course they do. They're junior. But senior developers have shown pure contempt: "Just upskill using AI." Easy to say when you climbed the ladder before AI kicked it down.
22
3
71
26.6K