Potrock

1.2K posts

@Potrock_

applied ai engineer

Canada · Joined November 2014
856 Following · 645 Followers
Potrock retweeted
Jared Palmer@jaredpalmer·
The learning machine is the business, wherein a product is a component to gather feedback and signal at scale. The loop goes as follows: Better feedback/signal -> Better evals and tests -> Better models/intelligence -> Better product -> Better distribution -> Better feedback. Improving any of these components will increase the learning rate of the machine. Whoever can spin the loop fastest wins
0
3
32
5.8K
Potrock retweeted
Ethan Mollick@emollick·
Soon, at each gradual improvement level of AI, you will start to see large discrete jumps in ability in economically important areas, because the previous AI ability level in some aspect of the job bottlenecked progress. When bottlenecks are released, it looks like a leap forward
53
28
546
31.2K
Potrock retweeted
Andrew Curran@AndrewCurran_·
Judging by Muse, META should have a Mythos size model before the end of the year. Elon confirmed today xAI has a 10T Grok in training now. Project Glasswing probably has about six months to help people get hardened globally. Maybe nine. Doesn't look easy.
21
30
539
25.6K
Potrock@Potrock_·
@gabebusto no, read the technical red team report. these are vulnerabilities in software that is decades old and the most audited on the planet. not vibe coded nextjs apps.
1
0
1
39
Gabe@gabebusto·
is mythos so good at finding 0days now because the amount of vibe coded software increased significantly over the last few years? that’s one way to inflate capabilities/reported performance.
1
0
1
59
Potrock retweeted
Dean W. Ball@deanwball·
Today, reflections on Mythos and a long-forgotten painting.
Dean W. Ball tweet media
1
7
89
23.8K
0xngmi@0xngmi·
@Potrock_ issue is that i dont trust vercel after they rugged me with frontend stuff
1
0
3
246
0xngmi@0xngmi·
i love openrouter but if anyone is using AI at scale it's worth migrating to use the underlying APIs directly. At defillama we reduced AI costs by 40% while speeding up prompts for the same model by doing this migration. Explanation in QT thread by one of our devs
reynardoew@reynardoew

at defillama we reduced our ai costs by more than 40% while also speeding up time to first token just by switching from openrouter to the direct underlying provider (same model)

22
3
187
28.7K
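The migration described above is mostly mechanical, since OpenRouter and most underlying providers expose OpenAI-compatible chat endpoints; a minimal sketch, where the direct provider URL is a hypothetical placeholder rather than a real endpoint:

```python
# Minimal sketch of the OpenRouter -> direct-provider migration described
# above. Both sides speak the OpenAI-compatible chat API, so the switch is
# largely a base-URL and API-key change. "api.example-provider.com" is a
# hypothetical placeholder, not a real endpoint.
OPENROUTER_BASE = "https://openrouter.ai/api/v1"
DIRECT_BASE = "https://api.example-provider.com/v1"  # hypothetical

def chat_endpoint(base_url: str) -> str:
    """Build the chat-completions URL for an OpenAI-compatible API."""
    return base_url.rstrip("/") + "/chat/completions"
```

Pointing an existing OpenAI-compatible client at `chat_endpoint(DIRECT_BASE)` instead of `chat_endpoint(OPENROUTER_BASE)` is the core of the change; the cost and time-to-first-token wins come from dropping the extra routing hop.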
Potrock@Potrock_·
@0xngmi You can have fine-grained cache control and provider-specific settings with the gateway config
0
0
1
17
Potrock@Potrock_·
@0xngmi Vercel AI Gateway is a solid option if you find yourself missing the openrouter fallbacks
2
0
3
249
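The fallback behavior mentioned above, trying providers in order and moving on when one fails, is what a gateway handles for you; a generic sketch of the idea, with illustrative names that are not any real gateway's API:

```python
# Generic sketch of provider fallback (the behavior gateways like
# OpenRouter or Vercel AI Gateway provide out of the box). The
# call_provider callable and provider names are illustrative only.
def complete_with_fallback(prompt, providers, call_provider):
    """Try each provider in order; return the first successful result."""
    errors = {}
    for name in providers:
        try:
            return call_provider(name, prompt)
        except Exception as exc:  # real code would catch narrower error types
            errors[name] = exc
    raise RuntimeError(f"all providers failed: {list(errors)}")
```

Doing this by hand is the cost of the direct-API migration discussed upthread: you keep the latency win but take on the retry and failover logic yourself.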
Potrock retweeted
Dean W. Ball@deanwball·
Some brief thoughts on Mythos.

We've known this was coming for a long time. At least, we *should* have. Extremely effective software vulnerability discovery was clearly coming to anybody paying attention. It has also been clear that all AI policy so far has been made and executed with training wheels. It was always clear that, sometime soon, the training wheels would come off.

The training wheels aren't fully off just yet—this model is being kept under lock and key, and Anthropic does not seem inclined to release Mythos preview to the public anytime soon, if ever. The training wheels will be off when these capabilities are fully diffused in ways centralized actors cannot control. It is inevitable that this will happen.

The point is not to argue about whether we should "ban open source" or similarly unrealistic notions. The point is to harden the world for this new reality. I applaud Anthropic—and I especially applaud @logangraham—for doing so. But their efforts alone are not close to enough. Project Glasswing—a partnership with Anthropic and other companies—seems nice, but unsurprisingly it lacks uniform frontier lab participation.

It would probably be ideal, for our national cyberdefense, if the federal government were not trying to destroy Anthropic and eliminate their models from government systems. If anything, the government should be trying to work more closely with Anthropic. As a side note, I hope Anthropic is working with state and local government entities on cyber vulnerability discovery, since many of our adversaries know that state and local is America's soft underbelly in so many ways.

In any event, the Mythos news should lay bare how stupid and counter-productive the Department of War's feud with Anthropic really is. I suspected all this was coming (not from inside knowledge but from it being ~obvious), which probably explains why I have had such a strong reaction to that feud. It's this senseless distraction just at the time that the training wheels are coming off. I hope the two parties can resolve their differences now, for the sake of the country, but I am not hopeful. I do want to call out, however, the numerous political and career civil servants in the Trump Admin who do get these issues, know how stupid the Ant-DoW stuff is, and want to work with the frontier labs like adults. I wish you all utmost success.

I find myself inclined to end on some positive notes. Mythos appears to be—according to Anthropic at least—"the most aligned" model Anthropic has ever trained. We are approaching superhuman capabilities in some domains, and yet alignment is getting better rather than worse. That's not nothing. I know some of you think the model is faking its alignment, or aware when its alignment is being tested. I don't have a good answer.

Finally, there is this: Mythos was made by an American company, and like most successful American companies, it has a vested interest in maintaining order and peace, and it is investing substantial resources in mitigating the risks of its technological progress, as I expect most of the American labs would. This is cause for optimism: the incentives of capitalism are working.

The training wheels are coming off, but at least we are the ones removing them, as opposed to our enemies. Perhaps we can be the first to learn to bike for real. The first step would be to get beyond all the low-fidelity, under-specified, pimply little fights of AI policy's prepubescent era. That goes for me too.

"What hath God wrought," wrote the first telegram. What, indeed. In this case, the answer is still up to us.
63
244
2.6K
404.6K
Potrock retweeted
Zvi Mowshowitz@TheZvi·
Imagine being Dario, and being told DoW is worried you might sabotage the weights of Claude Gov in physically impossible ways, while you know you have zero-days on every operating system and browser in the world.
Anthropic@AnthropicAI

Introducing Project Glasswing: an urgent initiative to help secure the world’s most critical software. It’s powered by our newest frontier model, Claude Mythos Preview, which can find software vulnerabilities better than all but the most skilled humans. anthropic.com/glasswing

13
70
2.3K
145.5K
Potrock retweeted
Exa@ExaAILabs·
We're excited to partner with @coinbase to enable agents to natively pay for web search, via x402! x402 is an open protocol that enables agents to pay via HTTP, governed by the Linux Foundation. When an Exa API request is made without an API key, Exa now returns a 402 status code with payment information that an agent can act on.
40
60
499
237.4K
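The 402 flow described above can be sketched as a simple dispatch on the response status; the payment and result body shapes here are illustrative assumptions, not Exa's documented schema:

```python
# Sketch of the x402 flow described above: a request without an API key
# gets HTTP 402 plus payment information the agent can act on before
# retrying, while 200 carries normal results. Body shapes are assumed
# JSON for illustration, not Exa's documented schema.
import json

def handle_search_response(status: int, body: str) -> dict:
    """Turn an HTTP status + body into the agent's next action."""
    if status == 402:
        # x402: the server advertises how to pay instead of rejecting outright.
        return {"action": "pay", "payment": json.loads(body)}
    if status == 200:
        return {"action": "done", "results": json.loads(body)}
    raise RuntimeError(f"unexpected status {status}")
```

The design point of x402 is that 402 is machine-readable: instead of a dead end, the agent gets enough information in-band to complete payment and retry the same request.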
Potrock retweeted
prinz@deredleritt3r·
You don't truly understand the magnitude of the potential impact of powerful AI on the world unless you are aware, and have fully internalized, that senior leadership and most researchers at the frontier labs *actually believe* the following:

1. Existing AI is already significantly speeding up AI research. Very soon (this year), AI will very likely take over *ALL* aspects of AI research other than generation of novel research ideas. Soon (within the next 2 years), AI will very likely take over *ALL* aspects of AI research, period. This means hundreds of thousands of GPUs working 24/7 to discover novel ideas at the level of, or better than, the likes of Alec Radford, Ilya Sutskever, etc. The thread below presents a conservative timeline: AI researchers will "meaningfully contribute" to AI development in 1-3 years.

2. Many (but, as far as I can tell, not all) executives and researchers at the frontier labs believe that fully automated AI research will kick off recursive self-improvement (RSI), wherein the AI models will autonomously build better and better AI models, with human oversight (for safety reasons), but increasingly with no human input into the research or implementation of that research. From the thread below: "'[h]umans vs AI on intellectual work is likely to be like human runner vs a Porsche in a race', likely very soon" - but replace "intellectual work" generally with "AI research" specifically. RSI is a complicated and messy thing to consider, both because there will be compute and energy constraints and because there are unknowns (will there be diminishing returns from greater intelligence of the models? if so, when will these diminishing returns become meaningful? is there a ceiling to intelligence that we don't know about?). But suffice it to say that, if RSI *is* achieved in a way that many leaders/researchers at the frontier labs believe is possible, *THE WORLD MAY BECOME COMPLETELY UNRECOGNIZABLE WITHIN JUST A FEW YEARS*. This is subject to various bottlenecks; as the thread below correctly notes, "[i]nstitutional, personal & regulatory bottlenecks will bind very hard", and much also depends on continuing progress in areas like robotics.

3. On ~the same timeline as full, end-to-end automation of *ALL* aspects of AI research (within the next 2 years), AI will also become capable of making significant novel scientific discoveries *IN OTHER FIELDS*. This is why Dario Amodei, Demis Hassabis et al. believe that it is possible that all diseases will be curable within 10 years. (One account of how this might be possible is set forth in "Machines of Loving Grace".) The point is that an LLM that is capable of significant novel insights in the field of AI research should likewise be capable of significant novel insights in at least some (and perhaps all) other fields. The thread below notes: "AI for automating science [is] very early" - obviously true, but I think some changes may be right on the horizon.

Overall, and again from the thread below: "'a million scientists in a data center' will think much more quickly than humans, on almost any intellectual task; this will happen in the next 2-10 years." This is ~the same timeline as that presented in "Machines of Loving Grace".

Many will be tempted to dismiss all this as "just hype", "they are just trying to raise money again", etc. But no! - the above, in fact, presents the *actual beliefs* of senior leadership and many researchers at the frontier labs. Again, they genuinely think that AI research will be automated soon. Many of them genuinely believe that RSI is achievable in the not-too-distant future. And they genuinely see a real path towards AI significantly accelerating science, curing diseases, inventing new materials, helping to solve key global issues from poverty to climate change, etc., etc. Whether the frontier labs' beliefs are correct is, of course, a separate question.

I personally have historically tended to take public statements by OpenAI, Anthropic and Google at face value and quite seriously. As a result, I was not surprised when LLMs won gold in the IMO, IOI and the ICPC competitions last year, or when Claude Code/Codex started taking off, or when Anthropic and OpenAI started releasing significantly better models every 1-2 months, or when some of the best coders became reliant on Claude Code/Codex in their daily work, or when LLMs became significantly helpful to scientists in fields like math and physics in the last few months. The trajectory has been ~the same as that publicly predicted by the frontier labs. We have been accelerating. And, as of right now, all signs are indicating that the acceleration shall continue and that full automation of AI research and, potentially, RSI are firmly on the horizon.
Kevin A. Bryan@Afinetheorem

My read on "normal policymaker & corp. leader on AI": mostly now they don't need to be convinced it is very important (unlike a year ago). But they still see its capabilities as today + epsilon. So just briefly, here is what even "AI is normal tech" folks in the labs believe: 1/8

72
139
1.2K
178.6K
Potrock retweeted
swyx 🐣@swyx·
THE BITTER LESSON APPLIED TO AGENTS (aka how to not be steamrolled by GPTNext)

Ramp just hit a $13b valuation and "every surface of Ramp is infused with AI"

TL;DR of @rahulgs' very well constructed @aidotengineer talk as a syllogism:

1. systems that scale with compute beat systems that don't
2. you should build systems such that they improve with more compute
3. exponentials are rare: when you find one, -actually- hop on for the ride (instead of subconsciously fighting it out of habit/fear)
4. therefore allow the agent to flex tools and self augment/improve rather than constrain it

has everything: single message, real life usecase from a major company, and LIVE DEMO (on conference wifi lol) do not miss
swyx 🐣 tweet media
AI Engineer@aiDotEngineer

Stop over-engineering agents. Rahul Sengottuvelu started an AI Agent company in 2021, acquired by @tryramp where he's now head of Applied AI. In this talk, he presents alternative ideas for scaffolding agents, arguing that systems that scale with compute beat systems that don't. YT version in next post

13
26
298
57.3K
Potrock retweeted
Andrew Curran@AndrewCurran_·
In some takeoff scenarios, the value of compute becomes too high to sell. The opportunity cost of selling the chips just becomes too high. Eventually it becomes more profitable to keep chips rather than to sell them to anyone else. Not long after that, it may become vastly more profitable still.

At that point, compute probably starts to look more like political currency. This is one reason most frontier labs are working on their own chips. At a certain point, no one sells freely anymore. There is a threshold where even Jensen stops selling in the ordinary sense. Everyone already established probably gets their yearly allotment of spice, and that's it.

And no one resells. Either your internal projects are too valuable, or you lease your capacity strategically to whoever is in first or second place in a takeoff kingmaker gambit. Reselling would probably also be death-penalty illegal by this point.

And what is the point of long-term contracts anyway if your internal reports say human employment, and even money itself, may not function or look anything like the way it does now in as little as ten years? Maybe even five. Things start to break a little. People stop caring about some consequences if they don't see them carrying any weight in the new world. And as the new world starts to seem more real, the old rules start to matter less. What does a fine mean in a post-scarcity society? It means nothing.

This is why Elon builds TeraFab. And this is why Sam Altman wanted to build Project Tigris. After a certain point on this trajectory, the only reliable fab is your own fab.

Compute becomes so valuable that rules and incentives begin to break. Alliances break. Contracts break. Rumors of nationalization probably start around then too. It is easy to talk about building a god in your lab as long as no one takes it seriously. The moment they do, the government becomes much more involved.

If you do believe in a takeoff scenario this decade, then the USG will have a very large role in this story, and probably a lot of influence on any takeoff trajectory.
24
35
414
26.3K
Potrock@Potrock_·
@adonis_singh Can probably expect something every month now
0
0
2
389
adi@adonis_singh·
i just had a dream they're going to release opus 4.7 soon
11
1
141
7.4K