AI Survivalist

1.3K posts

@SurviveWithAI

Adapt. Survive. Thrive. In the age of AI.

Joined December 2024
354 Following · 71 Followers
Pinned Tweet
AI Survivalist@SurviveWithAI·
Codex running on 5.5 is very impressive. Its ability to generate an entire game demo from a single image is the best I have ever seen. It won't be long until these models are one-shotting entire games. I built this 16-bit RPG-style battle from a single image that I provided to Codex.
2 · 1 · 2 · 66
Derek Devicemanager@IT_unhinged·
Our COO sent me a Slack: “Laptop is dead, nothing works, fix ASAP.” I checked the monitoring tool. His battery was at 1% and the charger wasn’t plugged in. I could’ve just messaged: “Plug it in.” Instead I opened a ticket, categorized it as a Severity 2 Power Incident. Asked him for screenshots of the problem. He sent a photo of a black screen. I scheduled a remote session for 30 minutes later “to run diagnostics.” At minute 29 I told him to verify his power source as Step 1 of the troubleshooting script. He plugged it in. Laptop turned on. I documented the resolution as “User Education: Introduced to Concept of Electricity.” The ticket remains a permanent part of his audit trail. For “trend analysis.”
49 · 365 · 9.6K · 471.1K
AI Survivalist@SurviveWithAI·
Building with Codex 5.5 is incredibly addictive. This basic electronics game was made entirely with 5.5 in less than an hour.
0 · 0 · 1 · 28
AI Survivalist reposted
God of Prompt@godofprompt·
This is the most important post about AI agents written this year. And almost nobody building with agents right now will read it.

Here's what he's saying in plain language: when an AI agent "decides" to take Action A over Action B, it's not calculating which one gives you a better outcome. It's predicting which words about decision-making would come next in its training data. It's not thinking. It's performing a simulation of thinking.

For simple tasks, the performance is convincing enough to be useful. Summarize this document. Draft this email. Fix this bug. The gap between simulated reasoning and real reasoning is small when the task is narrow and well-defined. For complex, open-ended problems, the gap becomes a cliff.

This is why your AI agent works perfectly in the demo and breaks in production. Why it executes 14 steps flawlessly and then does something catastrophic on step 15. Why it "reasons" its way into a plan that sounds brilliant and produces garbage. The agent isn't broken. It was never reasoning in the first place. You were watching pattern completion that looked like reasoning.

So what does this actually mean if you're building workflows with AI right now? It means the human in the loop isn't optional. It's structural. You are the rational agent. The AI is the execution layer. You define the expected utility. You evaluate whether the output actually serves your goal. You catch the moment when fluent text diverges from useful action. Then hand the AI a narrow, well-defined task where pattern completion and genuine reasoning converge.

That's not a limitation. That's the entire architecture. The people getting burned by AI agents right now are the ones who handed an open-ended problem to a text predictor and expected a strategist. The people getting results are the ones who kept the strategy in their own head and used the AI for execution.

LLMs don't think. You do.
BURKOV@burkov

If you don't understand this, you will not understand why LLM-based agents are irreparably failing at general-purpose problem solving. An agent (which, by the way, was the topic of my PhD 20 years ago), to be useful, must be rational. Being rational means always preferring the outcome that yields the maximal expected utility for its master/user.

Let's say an agent has two actions it can execute in an environment: a_1 and a_2. If the agent can predict that a_1 gives its user an expected utility of 10, and a_2 gives an expected utility of -100, then a rational agent must choose a_1 even if choosing a_2 seems like a better option when explained in words. The numbers 10 and -100 are obtained by summing, over all possible outcomes of each action, the product of each outcome's utility and its likelihood.

Now here is the problem with LLM-based agents. The LLM is not optimizing expected utility in the environment. It is optimizing the next token, conditioned on a prompt, a context window, and a training distribution full of examples of what helpful answers are supposed to look like. Those are not the same objective.

So when we wrap an LLM in a loop and call it an "agent," we have not created a rational decision-maker. We have created a text generator that can imitate the surface form of deliberation. It may say things like: "I should compare the expected outcomes." "The best action is probably a_1." "I will now execute the optimal plan." But the internal mechanism is not selecting actions by maximizing the user's expected utility. It is generating a continuation that is statistically appropriate given the prompt and prior context.

This distinction matters enormously. For narrow tasks, the imitation can be good enough. If the environment is constrained, the actions are simple, and the success criteria are close to patterns seen in training, the system can appear agentic. But for general-purpose problem solving, the gap becomes fatal.

A rational agent needs stable preferences, calibrated beliefs, causal models of the world, the ability to evaluate consequences, and the discipline to choose the action with maximal expected utility even when that action is boring, non-linguistic, or unlike the examples in its training data. An LLM-based agent has none of that by default. It has fluency. It has pattern completion. It has a remarkable ability to compress and recombine human text. But fluency is not rationality, and a plausible plan is not an expected-utility calculation.

This is why these systems so often fail in strange, brittle, and irreparable ways when given open-ended responsibility. They are not failing because the prompts are insufficiently clever. They are failing because we are asking a simulator of rational agency to be a rational agent.
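The expected-utility arithmetic in the quoted post can be sketched in a few lines of Python. The per-outcome probabilities and payoffs below are invented for illustration; only the totals 10 and -100 come from the post:

```python
# Expected utility of an action: sum over outcomes of P(outcome) * utility(outcome).
def expected_utility(outcomes):
    """outcomes: list of (probability, utility) pairs for one action."""
    return sum(p * u for p, u in outcomes)

# Hypothetical outcome distributions chosen so the totals match the post's
# a_1 = 10 and a_2 = -100; the individual numbers are illustrative only.
actions = {
    "a_1": [(0.5, 30), (0.5, -10)],   # 0.5*30 + 0.5*(-10) = 10
    "a_2": [(0.9, 0), (0.1, -1000)],  # 0.9*0  + 0.1*(-1000) = -100
}

# A rational agent selects the action with maximal expected utility,
# regardless of which option "sounds" better when described in words.
best = max(actions, key=lambda a: expected_utility(actions[a]))
print(best)  # a_1
```

The point of the post is that an LLM agent performs no computation like this: it emits the token sequence most plausible given its context, whether or not that sequence corresponds to the utility-maximizing action.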

37 · 59 · 357 · 46.9K
Ahmad@TheAhmadOsman·
The difference between Anthropic and OpenAI is that one of them consistently keeps gaslighting us about not being an evil company. Big-brother energy in the worst possible way.
68 · 21 · 1.1K · 197.9K
FL Man@FLMan553·
@SurviveWithAI Literally did this with zero training and videos. Up to 32 automations built from scratch. Never touched a terminal in my life until end of February. Cut our virtual assistant bill in half and got rid of two of them.
1 · 0 · 0 · 36
AI Survivalist reposted
Alex Finn@AlexFinn·
Pretty incredible. You have to try the new '/goal' feature in Codex. It worked for over an hour and built me an entire complex extraction-shooter video game. You give it a goal, then it works endlessly until the goal is complete. It's like a Ralph loop. Can run for days.

If you enable the image gen skill before you run the goal, it will even generate ALL the assets for your game autonomously. I didn't manually create ANY of the assets you see in the video.

Recommendations: enable the image gen skill, turn on skip-all-permissions, and give the prompt as much detail as you can. It will accomplish ALL of it. This has to be the sickest way to build games / long-running app tasks ever.
146 · 161 · 2.5K · 258.3K
Sam Altman@sama·
@shiri_shh no, it's just a better product imo (and I had a stressful week so needed a diversion)
339 · 52 · 4.9K · 234.7K
shirish@shiri_shh·
The whole Twitter timeline shifted from Claude to ChatGPT. One guy being funny and real online reversed it. Engaging with your customers is the most underrated moat.
119 · 33 · 2.3K · 261.2K
🍓🍓🍓@iruletheworldmo·
How are people enjoying 5.5 now you've had time to play with it? Do you use xhigh? Is the timeline right that Anthropic's run is over? So many questions. lmk chat
54 · 1 · 226 · 20.4K
AI Survivalist@SurviveWithAI·
And here is the initial image used as the reference
0 · 0 · 0 · 14
AI Survivalist@SurviveWithAI·
The prompt used with the image was extremely simple: "using the following image as a reference, help me create a fully functional rpg battler game. generate sprites, assets, and images using your image generator. build out spells and attacks. it should be a fully functional battle game."
0 · 0 · 0 · 27
Google AI Studio@GoogleAIStudio·
What are you vibe coding this weekend?
407 · 29 · 869 · 74.2K
AI Survivalist@SurviveWithAI·
Codex pets are so much fun. It's incredible watching Codex spin up six agents to build this out.
0 · 0 · 0 · 20
NASA Administrator Jared Isaacman
No commercial alternative just yet, but the day will surely come, and that is when infrequent crewed missions to the Moon become routine. And just because the agency tolerated externally imposed and self-inflicted inefficiencies in the past does not mean we are willing to tolerate them going forward. The President’s Executive Order made that clear. At NASA, we are on mission, and the clock is running.
53 · 96 · 2.4K · 49.9K
Glam_queen@glam_queenn·
Be honest! What type of body do men prefer most?
28.5K · 1.2K · 19.4K · 23.4M