
𝕸𝖆𝖙𝖙𝖍𝖊𝖜
198 posts

@Postulix96
🎓 Computer Science; 🎧📷 Shoegaze & Aesthetics
Europe · Joined June 2024
29 Following · 9 Followers

@chetaslua Those are the moments when I doubt that AI will replace developers

@keysmashbandit anyone with substantial text on the internet will probably have their iq predicted fairly well by llms in a year

IQ, especially one's personal IQ score, is one of the few things I consider a genuine infohazard, and I believe one should do whatever they can to avoid ever being assessed at any point in their life. Every single possible n carries huge potential to fuck up your self-perception, self-esteem, or your relationship to the common man, and probably it's going to do all three of those things. Just a complete and total net negative any way you slice it.

@CitizenSigma @AntiWokeMemes Are you serious? This person is more dangerous than your Christian Brainwashing Children school?

@AntiWokeMemes This is a clear and present danger to a Christian Children's school. He should be bagged and deposited in a mental facility for extended treatment.
Hurry, before he shows up at the school with firearms and a manifesto.

Anthropic has released Claude Managed Agents, a suite of APIs designed to build and deploy cloud-hosted AI agents up to 10x faster.
TLDR
- Provides secure sandboxing and automated tool execution
- Features long-running sessions that persist through disconnections
- Includes built-in orchestration for state management and error recovery
- Supports multi-agent coordination for complex parallel tasks
- Offers session tracing and analytics via the Claude Console

Claude@claudeai
Introducing Claude Managed Agents: everything you need to build and deploy agents at scale. It pairs an agent harness tuned for performance with production infrastructure, so you can go from prototype to launch in days. Now in public beta on the Claude Platform.

New on the Engineering Blog:
Building Managed Agents—our hosted service for long-running agents—meant solving an old problem in computing: how to design a system for “programs as yet unthought of.”
Read more: anthropic.com/engineering/ma…

OpenAI is hinting at or releasing a model comparable to Mythos

adi@adonis_singh
it’ll probably be months before we use a model of this level of capability

@kimmonismus These models will replace developers in the real world?


@chatgpt21 Why not a nerfed version without cybersecurity features?

🚨 ANTHROPIC JUST BROKE SWE-BENCH PRO WITH CLAUDE MYTHOS 🚨
Anthropic just dropped the numbers for their unreleased "Claude Mythos Preview" and the coding leap is almost incomprehensible.
This model is so powerful at finding exploits that they are keeping it strictly locked down for critical infrastructure partners. Anthropic explicitly stated: "We’ve used Claude Mythos to demonstrate thousands of zero day vulnerabilities."
Look at the absolute destruction of these benchmarks compared to Opus 4.6:
• SWE-Bench Pro: 77.8% (Destroying Opus 4.6 at 53.4%)
• Terminal-Bench 2.0: 82.0% (Up from 65.4%)
• SWE-Bench Verified: 93.9%
• SWE-Bench Multimodal: 59.0% (More than double Opus 4.6's 27.1%)
• Humanity's Last Exam (with tools): 64.7% (Up from 53.1%)
• GPQA Diamond: 94.6%
A nearly 25-point jump in SWE-Bench Pro in a single generation. And we’re in *checks notes* April..









