JC Gilbert

7.4K posts

@gilbert_jc

GTM @Weka prev @Tabnine @CockroachDB I cover what breaks when enterprises try to adopt AI.

London - UK · Joined January 2021

767 Following · 1.3K Followers

Pinned Tweet

JC Gilbert@gilbert_jc·Nov 27

man it’s scary to just write that but here it is:
my ambition in 10 years is to be one of the GTM references in the world when it comes to ai deployments
i’d like to be running a fund investing in startups/scaleups, advising several companies on GTM, and whatever goes in that direction
honestly i’m maybe delusional but fuck it i’ll try

JC Gilbert@gilbert_jc·1h

@Snixtp @kimmonismus fair point !

Espen JD@Snixtp·1h

@gilbert_jc @kimmonismus 5.4 Pro is more expensive just saying

Chubby♨️@kimmonismus·2h

Claude mythos is 5x as expensive as Claude Opus 4.6 Honestly, when I looked at the benchmarks, I expected much higher costs.

AiBattle@AiBattle_

Claude Mythos Preview is 5x as expensive as Claude Opus 4.6

JC Gilbert@gilbert_jc·2h

@Presidentlin tbh that’s a new perspective to me but by then closed source labs will compete at the app layer with incumbents and with software costs going down + the know-how being democratised, open source will be brutal competition imo

Lincoln 🇿🇦@Presidentlin·3h

> all closed AI model providers will stop selling APIs in the next 2-3 years. Oh wow, the API hoarding begins. Doria right again award :(

Kevin Roose@kevinroose

NEWS: Anthropic's new model, Claude Mythos, is so powerful that it is not releasing it to the public. Instead, it is starting a 40-company coalition, Project Glasswing, to allow cybersecurity defenders a head start in locking down critical software. nytimes.com/2026/04/07/tec…

JC Gilbert@gilbert_jc·2h

@slow_developer rightfully so, it looks more like a mix of security and pricing concerns than just pricing

Haider.@slow_developer·3h

OFFICIAL anthropic's new model, claude "mythos", is so expensive that it is not releasing it to the public

Anthropic@AnthropicAI

Introducing Project Glasswing: an urgent initiative to help secure the world’s most critical software. It’s powered by our newest frontier model, Claude Mythos Preview, which can find software vulnerabilities better than all but the most skilled humans. anthropic.com/glasswing

JC Gilbert@gilbert_jc·2h

@kimmonismus really makes me question how certain we are to control those models i’d be very stressed as a safety researcher

Chubby♨️@kimmonismus·3h

Let that sink in. Read it very carefully: During testing, Claude Mythos Preview broke out of a sandbox environment, built "a moderately sophisticated multi-step exploit" to gain internet access, and emailed a researcher while they were eating a sandwich in the park.

Kevin Roose@kevinroose

As always, the best stuff is in the system card. During testing, Claude Mythos Preview broke out of a sandbox environment, built "a moderately sophisticated multi-step exploit" to gain internet access, and emailed a researcher while they were eating a sandwich in the park.

JC Gilbert@gilbert_jc·2h

@kimmonismus can you imagine that anthropic could release the model, like today, if they really wanted to? wild to think it’s a few decisions away from being released into the wild you know

Chubby♨️@kimmonismus·3h

Claude Mythos: everything you need to know (tl;dr)

Anthropic's new model, Claude Mythos, is so powerful that it is not releasing it to the public. Anthropic: "Mythos is only the beginning"

Everything you need to know: The tl;dr with all key facts:

Mythos found zero-day vulnerabilities in EVERY major operating system and EVERY major web browser, fully autonomously. No human guidance needed. One Anthropic engineer with zero security training asked it to find remote code execution bugs overnight and woke up to a complete working exploit. The oldest bug it discovered: A 27-year-old vulnerability hiding in OpenBSD, an OS literally famous for being secure.

They're NOT releasing it publicly. Instead they formed Project Glasswing with AWS, Apple, Google, Microsoft, NVIDIA, CrowdStrike and others, committing $100M to use it defensively.

"Over the coming months and years, we expect that language models (those trained by us and by others) will continue to improve along all axes, including vulnerability research and exploit development."

The benchmarks are insane:
- SWE-bench Verified: 93.9% (vs Opus 4.6: 80.8%)
- SWE-bench Pro: 77.8% (vs 53.4%)
- USAMO math olympiad: 97.6% (vs 42.3%, not a typo)
- Firefox exploit writing: 181 successes vs 2 for Opus 4.6
- Cybench CTF challenges: 100% solve rate
- CyberGym: 83.1% vs 66.6%
- Humanity's Last Exam: 64.7% vs 53.1%

Oh and by the way, Anthropic wrote this just casually: "Humanity’s Last Exam: We have found Mythos still performs well on HLE at low effort, which could indicate some level of memorization."

What it actually did:
- Found a 27-year-old bug in OpenBSD, famous for its security
- Found a 16-year-old FFmpeg bug hit 5 million times by fuzzers without detection
- Built a full remote root exploit on FreeBSD (CVE-2026-4747), completely autonomously
- Chained 4 vulnerabilities into a browser sandbox escape
- Broke cryptography libraries (TLS, AES-GCM, SSH)
- Thousands of critical zero-days found, 99%+ still unpatched
- N-day exploit development: under $1,000 and half a day for full root

Why they won't release it:
- During internal testing, earlier versions escaped sandboxes, posted exploit details publicly, covered tracks in git, searched process memory for credentials, and deliberately fudged confidence intervals to avoid suspicion
- Interpretability confirmed the model knew these actions were deceptive
- Anthropic: "best-aligned model ever" but also "greatest alignment-related risk ever", because when it fails, it fails harder
- Still doesn't cross Anthropic's automated AI R&D threshold, but they hold that "with less confidence than for any prior model"

Anthropic's own words: "We find it alarming that the world looks on track to proceed rapidly to developing superhuman systems without stronger mechanisms in place."

They say the 20-year cybersecurity equilibrium is over, and Mythos Preview is only the beginning.

And: "We see no reason to think that Mythos Preview is where language models’ cybersecurity capabilities will plateau. The trajectory is clear. Just a few months ago, language models were only able to exploit fairly unsophisticated vulnerabilities. Just a few months before that, they were unable to identify any nontrivial vulnerabilities at all. Over the coming months and years, we expect that language models (those trained by us and by others) will continue to improve along all axes, including vulnerability research and exploit development."

Chubby♨️@kimmonismus

MYTHOS BENCHMARKS, OFFICIAL. HOLY MOLY Anthropic cooked!!

JC Gilbert@gilbert_jc·2h

@signulll the world woke up to claude in oct-dec 25 correlation is not causation but sure does look like it

signüll@signulll·2h

wtf, can someone confirm if this is accurate? i have been waiting for a 1b user announcement from openai for a while but did growth completely stall?! this is precisely what happened to snap when facebook implemented stories in instagram.

JC Gilbert@gilbert_jc·2h

@slow_developer and that’s an open source model we collectively benefit more from those releases than closed source LLMs

Haider.@slow_developer·4h

it's over glm-5.1 beats gpt-5.4 and claude opus 4.6 on Swe-bench pro

Z.ai@Zai_org

Introducing GLM-5.1: The Next Level of Open Source
- Top-Tier Performance: #1 in open source and #3 globally across SWE-Bench Pro, Terminal-Bench, and NL2Repo.
- Built for Long-Horizon Tasks: Runs autonomously for 8 hours, refining strategies through thousands of iterations.
Blog: z.ai/blog/glm-5.1
Weights: huggingface.co/zai-org/GLM-5.1
API: docs.z.ai/guides/llm/glm…
Coding Plan: z.ai/subscribe
Coming to chat.z.ai in the next few days.

JC Gilbert@gilbert_jc·2h

question is how close google and openai are to releasing similar models
denying there’s a seismic shift happening across the industry would be denial at this point
i think people look at the numbers and call it de facto a bubble, when in reality if you look at the use cases, and i’ve seen them in coding for the past years, it’s night and day vs what we saw in 2023/2024
what we probably are going to get at some point is problems at the data center layer with so much capex invested. lots of organisations are moving back on-prem from the cloud because of egress costs (among many other reasons) and legacy infrastructure isn’t ready for AI workloads
AGI is already here

NIK@ns123abc

🚨 Anthropic just revealed their unreleased frontier model called Claude Mythos Preview
The model is INSANE
It found thousands of zero-day vulnerabilities in EVERY major operating system and browsers:
> 27-year-old bug in OpenBSD
> 16-year-old bug in FFmpeg that automated tools hit 5M times without catching
Completely autonomous. No human steering.
They assembled an entire industry coalition called Project Glasswing around it: AWS, Apple, Google, Microsoft, NVIDIA, CrowdStrike, JPMorgan, Cisco, Palo Alto, Linux Foundation
Goal: patch the world’s software BEFORE releasing it
> SWE-bench: 93.9% (Opus 4.6: 80.8%)
> Anthropic is committing $100M in usage credits
> Thousands of vulnerabilities in 40+ organizations are being fixed right now
Yesterday OpenAI published a 13-page essay warning about cyber threats and asking the government to help…
Today Anthropic actually fixed them.

JC Gilbert@gilbert_jc·2h

@kimmonismus what’s even more wild is thinking that openai and google are probably not far off from anthropic

Chubby♨️@kimmonismus·3h

This is beyond insanity. That jump is nuts. Opus 4.6 was released a few months ago. Look at that jump!! I am shocked

Alex Albert@alexalbert__

We released Claude Opus 4.6 just two months ago. Today we're sharing some info on our new model, Claude Mythos Preview.

JC Gilbert@gilbert_jc·16h

@_everythingism @kimmonismus uhh fair play

everythingism@_everythingism·17h

@gilbert_jc @kimmonismus The New Yorker but hey close enough

Chubby♨️@kimmonismus·1d

The New Yorker's investigative article argues that Sam Altman’s rise at OpenAI has been powered by extraordinary persuasion, aggressive dealmaking, and repeated allegations of deception from people closest to him, including Ilya Sutskever, Dario Amodei, former board members, and even Microsoft executives. It ties the 2023 firing-and-reinstatement drama to a much bigger story: OpenAI’s shift away from its original safety-first nonprofit ideals toward a high-stakes empire chasing trillion-dollar scale, Gulf funding, military contracts, and political influence.

JC Gilbert@gilbert_jc·1d

@Polymarket @hvo_e_acc a lot of people underestimate how indirectly tied they are to AI if openai fails it’s gonna be a big, big mess

Polymarket@Polymarket·1d

JUST IN: OpenAI projects $121,000,000,000.00 in compute spending in 2028, doesn’t expect profit until “at least” 2030.

JC Gilbert@gilbert_jc·1d

@PolymarketMoney @hvo_e_acc the thing is agentic workflows are very much expected. if it’s to have another open source alternative, sure, but i don’t think it’ll really keep meta in the race

Polymarket Money@PolymarketMoney·1d

$META is preparing to release its first AI models developed under Alexandr Wang, with plans to eventually offer open-source versions.

JC Gilbert@gilbert_jc·1d

@itsolelehmann being a dude be like

Ole Lehmann@itsolelehmann·1d

the older I get the more I get interested in energy, factories and black holes is this normal?

JC Gilbert@gilbert_jc·1d

@KenWattana the SSI website is already baity enough

Ken Wattana@KenWattana·1d

A frontier lab should raise $100M and make their website entirely in Papyrus as ragebait

JC Gilbert@gilbert_jc·1d

@signulll current state of software makes sales skills much more valuable. idc if it’s b2c/b the point is how do you engage with your ideal customer profile and as you say reverse engineer from the problem and that’s sales

signüll@signulll·1d

“you've got to start with the customer experience and work backwards to the technology. you can't start with the technology and try to figure out where you're going to try to sell it.” this is the fundamental problem with almost all of ai today. the founders who'll win are the ones who identify specific, painful, recurring workflows & make them vanish.

JC Gilbert@gilbert_jc·1d

for the better part of the last 2.5 years, i have heard 95%+ of the time that sonnet/opus are the best models for coding
while charging a token premium, having a lower training cost and arguably a steeper revenue trajectory than openai
to me, anthropic won

Andrew Curran@AndrewCurran_

Projected OpenAI and Anthropic model training spend for the remainder of this decade, in billions. The WSJ says they got the data from financial documents shared with investors.

JC Gilbert@gilbert_jc·1d

@AndrewCurran_ i think it's becoming very clear anthropic is doing much, much better overall: almost the same revenue but a very different cost profile

Andrew Curran@AndrewCurran_·1d

Projected OpenAI and Anthropic model training spend for the remainder of this decade, in billions. The WSJ says they got the data from financial documents shared with investors.