AI Survivalist

1.3K posts

@SurviveWithAI

Adapt. Survive. Thrive. In the age of AI.

Joined December 2024
354 Following · 71 Followers
Pinned Tweet
AI Survivalist@SurviveWithAI·
Codex running on 5.5 is very impressive. Its ability to generate an entire game demo from a single image is the best I have ever seen. It won't be long until these models are one-shotting entire games. I built this 16-bit RPG-style battle from a single image that I provided to Codex.
2 · 1 · 2 · 66
Derek Devicemanager@IT_unhinged·
Our COO sent me a Slack: “Laptop is dead, nothing works, fix ASAP.” I checked the monitoring tool. His battery was at 1% and the charger wasn’t plugged in. I could’ve just messaged: “Plug it in.” Instead I opened a ticket, categorized it as a Severity 2 Power Incident. Asked him for screenshots of the problem. He sent a photo of a black screen. I scheduled a remote session for 30 minutes later “to run diagnostics.” At minute 29 I told him to verify his power source as Step 1 of the troubleshooting script. He plugged it in. Laptop turned on. I documented the resolution as “User Education: Introduced to Concept of Electricity.” The ticket remains a permanent part of his audit trail. For “trend analysis.”
49 · 365 · 9.6K · 471.1K
AI Survivalist@SurviveWithAI·
Building with Codex 5.5 is incredibly addictive. This basic electronics game was made entirely with 5.5 in less than an hour.
0 · 0 · 1 · 28
AI Survivalist reposted
God of Prompt@godofprompt·
This is the most important post about AI agents written this year. And almost nobody building with agents right now will read it.

Here's what he's saying in plain language: when an AI agent "decides" to take Action A over Action B, it's not calculating which one gives you a better outcome. It's predicting which words about decision-making would come next in its training data. It's not thinking. It's performing a simulation of thinking.

For simple tasks, the performance is convincing enough to be useful. Summarize this document. Draft this email. Fix this bug. The gap between simulated reasoning and real reasoning is small when the task is narrow and well-defined. For complex, open-ended problems, the gap becomes a cliff.

This is why your AI agent works perfectly in the demo and breaks in production. Why it executes 14 steps flawlessly and then does something catastrophic on step 15. Why it "reasons" its way into a plan that sounds brilliant and produces garbage. The agent isn't broken. It was never reasoning in the first place. You were watching pattern completion that looked like reasoning.

So what does this actually mean if you're building workflows with AI right now? It means the human in the loop isn't optional. It's structural. You are the rational agent. The AI is the execution layer. You define the expected utility. You evaluate whether the output actually serves your goal. You catch the moment when fluent text diverges from useful action. Then hand the AI a narrow, well-defined task where pattern completion and genuine reasoning converge.

That's not a limitation. That's the entire architecture. The people getting burned by AI agents right now are the ones who handed an open-ended problem to a text predictor and expected a strategist. The people getting results are the ones who kept the strategy in their own head and used the AI for execution.

LLMs don't think. You do.
BURKOV@burkov

If you don't understand this, you will not understand why LLM-based agents are irreparably failing at general-purpose problem solving. An agent (which, by the way, was the topic of my PhD 20 years ago), to be useful, must be rational. Being rational means always preferring the outcome that yields the maximal expected utility for its master/user.

Let's say an agent has two actions it can execute in an environment: a_1 and a_2. If the agent can predict that a_1 gives its user an expected utility of 10, and a_2 gives an expected utility of -100, then a rational agent must choose a_1 even if choosing a_2 seems like a better option when explained in words. The numbers 10 and -100 are obtained by summing, over all possible outcomes of each action, the product of each outcome's utility and its likelihood.

Now here is the problem with LLM-based agents. The LLM is not optimizing expected utility in the environment. It is optimizing the next token, conditioned on a prompt, a context window, and a training distribution full of examples of what helpful answers are supposed to look like. Those are not the same objective.

So when we wrap an LLM in a loop and call it an "agent," we have not created a rational decision-maker. We have created a text generator that can imitate the surface form of deliberation. It may say things like: "I should compare the expected outcomes." "The best action is probably a_1." "I will now execute the optimal plan." But the internal mechanism is not selecting actions by maximizing the user's expected utility. It is generating a continuation that is statistically appropriate given the prompt and prior context.

This distinction matters enormously. For narrow tasks, the imitation can be good enough. If the environment is constrained, the actions are simple, and the success criteria are close to patterns seen in training, the system can appear agentic. But for general-purpose problem solving, the gap becomes fatal.

A rational agent needs stable preferences, calibrated beliefs, causal models of the world, the ability to evaluate consequences, and the discipline to choose the action with maximal expected utility even when that action is boring, non-linguistic, or unlike the examples in its training data. An LLM-based agent has none of that by default. It has fluency. It has pattern completion. It has a remarkable ability to compress and recombine human text. But fluency is not rationality, and a plausible plan is not an expected-utility calculation.

This is why these systems so often fail in strange, brittle, and irreparable ways when given open-ended responsibility. They are not failing because the prompts are insufficiently clever. They are failing because we are asking a simulator of rational agency to be a rational agent.
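The expected-utility arithmetic in the quoted post can be sketched in a few lines of Python. The per-outcome probabilities and payoffs below are invented for illustration; only the totals 10 and -100 come from the post:

```python
# Expected utility of an action: sum over outcomes of P(outcome) * utility(outcome).
def expected_utility(outcomes):
    """outcomes: list of (probability, utility) pairs for one action."""
    return sum(p * u for p, u in outcomes)

# Hypothetical outcome distributions chosen so the totals match the post's
# a_1 = 10 and a_2 = -100; the individual numbers are illustrative only.
actions = {
    "a_1": [(0.5, 30), (0.5, -10)],   # 0.5*30 + 0.5*(-10) = 10
    "a_2": [(0.9, 0), (0.1, -1000)],  # 0.9*0  + 0.1*(-1000) = -100
}

# A rational agent selects the action with maximal expected utility,
# regardless of which option "sounds" better when described in words.
best = max(actions, key=lambda a: expected_utility(actions[a]))
print(best)  # a_1
```

The point of the post is that an LLM agent performs no computation like this: it emits the token sequence most plausible given its context, whether or not that sequence corresponds to the utility-maximizing action.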

37 · 59 · 357 · 46.9K
Ahmad@TheAhmadOsman·
The difference between Anthropic and OpenAI is that one of them consistently keeps gaslighting us about not being an evil company. Big-brother energy in the worst possible way.
68 · 21 · 1.1K · 197.9K
FL Man@FLMan553·
@SurviveWithAI Literally did this with zero training and videos. Up to 32 automations built from scratch. Never touched a terminal in my life until end of February. Cut our virtual assistant bill in half and got rid of two of them.
1 · 0 · 0 · 36
AI Survivalist reposted
Alex Finn@AlexFinn·
Pretty incredible. You have to try the new '/goal' feature in Codex. It worked for over an hour and built me an entire complex extraction-shooter video game. You give it a goal, then it works endlessly until the goal is complete. It's like a Ralph loop. Can run for days.

If you enable the image gen skill before you run the goal, it will even generate ALL the assets for your game autonomously. I didn't manually create ANY of the assets you see in the video.

Recommendations: enable the image gen skill, turn on skip-all-permissions, and give the prompt as much detail as you can. It will accomplish ALL of it. This has to be the sickest way to build games / long-running app tasks ever.
146 · 161 · 2.5K · 258.3K
Sam Altman@sama·
@shiri_shh no, it's just a better product imo (and I had a stressful week so needed a diversion)
339 · 52 · 4.9K · 234.7K
shirish@shiri_shh·
The whole Twitter timeline shifted from Claude to ChatGPT. One guy being funny and real online reversed it. Engaging with your customers is the most underrated moat.
119 · 33 · 2.3K · 261.2K
🍓🍓🍓@iruletheworldmo·
How are people enjoying 5.5 now you've had time to play with it? Do you use xhigh? Is the timeline right that Anthropic's run is over? So many questions. lmk chat
54 · 1 · 226 · 20.4K
AI Survivalist@SurviveWithAI·
And here is the initial image used as the reference
0 · 0 · 0 · 14
AI Survivalist@SurviveWithAI·
The prompt used with the image was extremely simple: "using the following image as a reference, help me create a fully functional rpg battler game. generate sprites, assets, and images using your image generator. build out spells and attacks. it should be a fully functional battle game."
0 · 0 · 0 · 27
Google AI Studio@GoogleAIStudio·
What are you vibe coding this weekend?
407 · 29 · 869 · 74.2K
AI Survivalist@SurviveWithAI·
Codex pets are so much fun. It's incredible watching Codex spin up six agents to build this out.
0 · 0 · 0 · 20
NASA Administrator Jared Isaacman
No commercial alternative just yet, but the day will surely come, and that is when infrequent crewed missions to the Moon become routine. And just because the agency tolerated externally imposed and self-inflicted inefficiencies in the past does not mean we are willing to tolerate them going forward. The President’s Executive Order made that clear. At NASA, we are on mission, and the clock is running.
53 · 96 · 2.4K · 49.9K
Glam_queen@glam_queenn·
Be honest! What type of body do men prefer most?
28.5K · 1.2K · 19.4K · 23.4M