
๐ธ๐๐๐๐๐๐
198 posts

๐ธ๐๐๐๐๐๐
@Postulix96
๐ Computer Science; ๐ง๐ท Shoegaze & Aesthetics
Europe ุงูุถู
Haziran 2024
29 ูุชุจุน9 ุงูู
ุชุงุจุนูู

@chetaslua Those are the moments when I doubt that ai will replace developers
English

@keysmashbandit anyone with substantial text on the internet will probably have their iq predicted fairly well by llms in a year
English

IQ, especially one's personal IQ score, is one of the few things I consider a genuine infohazard, and I believe one should do whatever they can to avoid ever being assessed at any point in their life. Every single possible n carries huge potential to fuck up your self-perception, self-esteem, or your relationship to the common man, and probably it's going to do all three of those things. Just a complete and total net negative any way you slice it.
English

@synthwavedd I'm waiting for the nerfed version of Mythos
English

@ai_for_success Image model? Bah I'm waiting for spud
English

Please @AnthropicAI release something like Mythos
English

@CitizenSigma @AntiWokeMemes Are you serious? This person is more dangerous then your Christian Brainwashing Children school?
English

@AntiWokeMemes This is a clear and present danger to a Christian Children's school. He should be bagged and deposited in a mental facility for extended treatment.
Hurry, before he shows up at the school with firearms and a manifesto.
English

Anthropic has released Claude Managed Agents, a suite of APIs designed to build and deploy cloud-hosted AI agents up to 10x faster.
TLDR
- Provides secure sandboxing and automated tool execution
- Features long-running sessions that persist through disconnections
- Includes built-in orchestration for state management and error recovery
- Supports multi-agent coordination for complex parallel tasks
- Offers session tracing and analytics via the Claude Console

Claude@claudeai
Introducing Claude Managed Agents: everything you need to build and deploy agents at scale. It pairs an agent harness tuned for performance with production infrastructure, so you can go from prototype to launch in days. Now in public beta on the Claude Platform.
English

New on the Engineering Blog:
Building Managed Agentsโour hosted service for long-running agentsโmeant solving an old problem in computing: how to design a system for โprograms as yet unthought of.โ
Read more: anthropic.com/engineering/maโฆ
English

OpenAI is hinting or releasing a model comparable to Mythos

adi@adonis_singh
itโll probably be months before we use a model of this level of capability
English

@kimmonismus These models will replace developers in the real world?
English


@chatgpt21 Why not a nerfed version without cyber security feautures?
English

๐จ ANTHROPIC JUST BROKE SWE-BENCH PRO WITH CLAUDE MYTHOS ๐จ
Anthropic just dropped the numbers for their unreleased "Claude Mythos Preview" and the coding leap is almost incomprehensible.
This model is so powerful at finding exploits that they are keeping it strictly locked down for critical infrastructure partners. Anthropic explicitly stated: "Weโve used Claude Mythos to demonstrate thousands of zero day vulnerabilities."
Look at the absolute destruction of these benchmarks compared to Opus 4.6:
โข SWE-Bench Pro: 77.8% (Destroying Opus 4.6 at 53.4%)
โข Terminal-Bench 2.0: 82.0% (Up from 65.4%)
โข SWE-Bench Verified: 93.9%
โข SWE-Bench Multimodal: 59.0% (More than double Opus 4.6's 27.1%)
โข Humanity's Last Exam (with tools): 64.7% (Up from 53.1%)
โข GPQA Diamond: 94.6%
A nearly 25-point jump in SWE-Bench Pro in a single generation. And weโre in *checks notes* April..


English







