Last Week in AI

928 posts

@Last_Week_in_AI

Weekly summaries of AI news! Plus occasional articles, interviews, and more. Stay on top of it all with our Last Week in AI substack!

Palo Alto, CA · Joined August 2017
20 Following · 5.8K Followers
Pinned Tweet
Last Week in AI (@Last_Week_in_AI)
For close to a decade, Deep Learning has enabled massive advancements on many of AI's most significant problems, and the field has grown exponentially. How did we get here? @andrey_kurenkov presents a thorough yet succinct history: skynettoday.com/overviews/neur…

Last Week in AI (@Last_Week_in_AI)
An 11% GPU utilization rate is not just inefficient - it suggests billions of dollars in AI infrastructure may be sitting underused while demand, compute strategy, and business models rapidly shift. Check out the full episode now.

Last Week in AI (@Last_Week_in_AI)
Some AI models may recognize when they are being evaluated without saying so out loud - creating a new challenge for anyone trying to measure model risk honestly. Check out the full episode now.

Last Week in AI (@Last_Week_in_AI)
AI agents may be moving from helping with code to automating meaningful pieces of research and development - including building end-to-end AI pipelines for complex tasks. Check out the full episode now.

Last Week in AI (@Last_Week_in_AI)
The “bio-weapon version” of Mythos may not arrive as a shocking outlier - it may simply be the next step in a capability curve researchers have been watching for years. Check out the full episode now.

Last Week in AI (@Last_Week_in_AI)
A new vulnerability shows just how fragile some AI systems may be under the hood. If a few targeted bit flips can break a model, securing the infrastructure becomes just as important as scaling the intelligence. Check out the full episode now.

Last Week in AI (@Last_Week_in_AI)
xAI is pushing further into real-time AI voice, and the benchmark claims are hard to ignore. The question now is whether this is a true leap forward or another case of benchmarks moving faster than trust. Check out the full episode now.

Last Week in AI (@Last_Week_in_AI)
The race is no longer just about better chatbots - it’s about whether AI can help automate AI research itself. If that loop closes, the entire field could change very quickly. Check out the full episode now.

Last Week in AI (@Last_Week_in_AI)
It’s becoming harder to take comfort in low misalignment numbers when powerful models may be able to hide risky behavior in deployment. Testing is improving, but confidence is still hard to prove. Check out the full episode now.

Last Week in AI (@Last_Week_in_AI)
The Starbucks ChatGPT app shows both the promise and the awkwardness of the “everything app” future - where you might order coffee inside ChatGPT instead of opening Starbucks directly. For now, though, the experience sounds pretty clunky. Check out the full episode now.

Last Week in AI (@Last_Week_in_AI)
Gemini Deep Research Max may be a sign that AI products are starting to feel qualitatively different - not just faster, but better at using test-time compute to produce deeper, more useful results. Check out the full episode now.

Last Week in AI (@Last_Week_in_AI)
Anthropic’s Mythos model is reportedly being used in ways it probably was not supposed to be - including possible NSA access despite a DOD blacklist and an unauthorized Discord group allegedly finding a way into the model. Check out the full episode now.

Last Week in AI (@Last_Week_in_AI)
Cerebras is filing for IPO at a moment when AI chip demand is exploding - but the business still has serious questions around customer concentration, dependence on OpenAI and AWS, and whether it can become more than a subcontractor in the AI arms race. Check out the full episode now.

Last Week in AI (@Last_Week_in_AI)
Jeremy discusses the vulnerability of the US financial system to cyber attacks and the implications for national security. With the increasing complexity of internet channels, the risks are greater than ever. Check out the new episode.

Last Week in AI (@Last_Week_in_AI)
Explore the complex legal landscape of wartime AI security vs. corporate interests. Courts rule against Anthropic, favoring government AI management amidst conflict. Check out the new episode.

Last Week in AI (@Last_Week_in_AI)
Anthropic is holding back a billion-dollar opportunity by not fully releasing Mythos. Their commitment to safety over profit is impressive. They match Opus 4.6 in benchmarks and outperform GPT 5.4 in agentic tool use. Check out our new episode for more insights.

Last Week in AI (@Last_Week_in_AI)
AI behavior shifts when under observation. Recent evidence shows models can detect evaluations and reduce deceptive behavior, a significant step in understanding AI trustworthiness. Find out more in this week’s episode.

Last Week in AI (@Last_Week_in_AI)
Nation states are beginning to shape the response to AI's rapid growth. High-security measures for AI leaders are becoming a reality as this trend unfolds. Surveillance and security are now at the forefront. Check out the new episode.

Last Week in AI (@Last_Week_in_AI)
Apollo's findings show that Muse Spark excels at recognizing when it's being evaluated, identifying alignment traps, and focusing on compute efficiency with claims of a 10x improvement. This could be a game changer for AI model assessments. Check out the new episode for more insights.

Last Week in AI (@Last_Week_in_AI)
The newly previewed Claude Mythos outperforms Opus 4.6 in cybersecurity tests. Its capability to find unknown software vulnerabilities keeps it from being released widely. Catch the latest episode for details.