Yuriy Dybskiy

10.3K posts

@html5cat

building https://t.co/GonoWc90V9 (@Puma_ai + Puma Browser + PumaClaw) prev. dev rel at Meta (Parse), @Meteorjs (YC S11), Cloudant (S08) 🇺🇦↠🇯🇵↠🇨🇦↠🇺🇸 🌁🎾📷

San Francisco · Joined July 2010
4.9K Following · 5.8K Followers

Pinned Tweet
Yuriy Dybskiy @html5cat
Was looking for a designer to make a new website for @puma_ai and decided to try the latest vibecoding options, so I played around with @antigravity and Codex from @OpenAI. Guess which one is which:
Yuriy Dybskiy @html5cat
@elwatto i guess i need to go to Japan to pick them up too
Miguel Carranza @elwatto
@html5cat currently novablast 5. thinking of getting megablasts and superblast 3 in a couple of weeks in japan
Miguel Carranza @elwatto
One year of consistent running, starting as a fully sedentary dad. First time ever hitting 'high' VO2 max. It works.
Hubert Thieblot @hthieblot
Only incredible founders can reply to this tweet
Kyle Samani @KyleSamani
Supposedly 100M x402 transactions for $30M in payments volume. Where can I try this myself?
Nina Bambysheva @ninabambysheva

Crypto’s perfect customer has finally arrived. I spoke with @matthuang, @hosseeb, @jessepollak, @programmer, @_rishinsharma, @joechalom, @OnchainLu and a few other teams and payments experts to unpack how crypto is repositioning for the agentic age, what it will take to win agentic commerce and why this matters beyond payments. forbes.com/sites/ninabamb…

Wes Bos @wesbos
Only cool people can reply to this
Mike Knoop @mikeknoop
ARC-AGI-3 and ARC Prize 2026 are now live with $2,000,000 in prizes! As of today, version 3 is the world's only unsaturated agentic intelligence benchmark. Humans score 100% and frontier AI scores ~0%. Play here: arcprize.org/arc-agi/3

While no single version of ARC is definitionally AGI, our aim with the ARC-AGI Series is to continually produce useful scientific benchmarks which identify large remaining gaps between Humans and Frontier AI. At some point, we'll be unable to, and then we'll have AGI.

Our new benchmark consists of over 100 novel game environments encompassing nearly 1,000 levels. Notably, test takers are given no explicit goals (other than to win) and must explore the environments to acquire goals, understand rules, develop strategy, and ultimately execute a plan to win.

ARC-AGI-3 is a test of agentic intelligence. Beating this benchmark requires on-the-fly world modeling and continual learning to adapt to evolving environments. To score 100%, AI must beat all of the games as efficiently as the human baseline (e.g., the number of actions taken to win). An ARC first, this gives us a formal comparison of AI reasoning efficiency vs humans.

Version 3 carries classic ARC design principles: core knowledge priors only, private test sets to measure generalization, and it's fun!

Every benchmark we release is an experiment, and I believe this new version will provide strong signal towards increasingly autonomous AI agents. Prior versions of ARC held strong predictive power for important AI moments. Version 1 only saw progress with the release of AI reasoning models in late 2024, and Version 2 only began seeing progress with the advent of agentic coding models in late 2025. Version 3 is expected to signal when AI agents can become economically useful in more open-ended domains (beyond highly measurable domains like coding and math).

There are a few other important design changes for ARC-AGI-3. The public set is now a "demonstration" set, not a training set. And unlike prior versions, the private set is now explicitly designed to be out of distribution (non-IID) from the public demo set. This is to mitigate targeting and because LLMs can now generalize over IID splits using AI reasoning.

Frontier models have made great progress over the past year. So much that several industry leaders have suggested we may already have AGI. Part of the ARC Prize Foundation mission is to provide accurate public sense finding, and we strive to reduce false-positive claims. To this end, we've updated our testing policy. Going forward, we will only verify scores outside of the official Kaggle competitions from AI systems that have high commercial usage or are 100% open source. We're also adopting a stateless client scoring philosophy to ensure humans and AI are tested under identical conditions. The goal of these changes is to reduce the amount of developer-aware targeting (whether incidental or intentional) and provide clear signal if actual AGI progress has occurred.

The Foundation also has a goal to inspire AI innovation, which is most likely to come from the community. We've seen dozens of startups using ARC as a tool for showcasing their ideas - a few have fundraised serious capital based on their ARC results. To support this we're launching a new Community leaderboard. While scores for this leaderboard can't be Verified, and you should explicitly not trust these scores as an accurate measure of AGI progress, we will curate the best ideas and promote them.

This year I expect we will see rapid progress on the ARC-AGI-3 Community leaderboard, and the best ideas will eventually migrate into frontier models and onto the Verified leaderboard.

Finally, we've partnered again with Kaggle to run two competition tracks for ARC-AGI-2 and ARC-AGI-3. This will be the last year for Version 2. When we launched the first ARC Prize back in 2024, I committed to running the Grand Prize until it was beaten. So for the ARC-AGI-2 track we will be paying out the Grand Prize to the best team, no matter what, in order to honor this commitment. In accordance with the Foundation mission, to win any prize money you must open-source a reproducible solution. We raised the standard for open source to include training. I'm excited to produce a truly open solution as a final send-off for the ARC-AGI-1 and 2 format.

Focus is now on ARC-AGI-3 (we've even started work on Versions 4 and 5).

As always, I'm honored to have the opportunity to steward attention towards AGI progress. I'm also super grateful to the incredible ARC Prize team - including our core engineers, game designers, and human testers - led by @GregKamradt, without whom we would not have this incredibly useful benchmark. See you on the leaderboard!
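The efficiency comparison described in the announcement (winning each game in no more actions than the human baseline) can be illustrated with a small sketch. This is a hypothetical illustration under assumed rules, not the official ARC-AGI-3 scoring formula: the per-game cap, the proportional penalty, and the simple averaging below are all assumptions made only to show the idea of action-efficiency relative to a human baseline.

```python
# Hypothetical sketch of an efficiency-weighted benchmark score.
# NOT the official ARC-AGI-3 formula; the weighting below is an assumption
# used only to illustrate "as efficiently as the human baseline".

def game_score(solved: bool, agent_actions: int, human_actions: int) -> float:
    """Score one game: 0.0 if unsolved; 1.0 if the agent wins in no more
    actions than the human baseline; scaled down proportionally otherwise."""
    if not solved:
        return 0.0
    return min(1.0, human_actions / agent_actions)

def benchmark_score(results: list[tuple[bool, int, int]]) -> float:
    """Average the per-game efficiency scores; 1.0 means every game was
    won at least as efficiently as the human baseline."""
    return sum(game_score(*r) for r in results) / len(results)

# Example: one game solved at human efficiency, one solved with twice as
# many actions, one unsolved.
print(benchmark_score([(True, 40, 40), (True, 80, 40), (False, 0, 40)]))  # 0.5
```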
Yuriy Dybskiy @html5cat
# of Series As you helped raise? Don't think Bridgewater would allow a graph with a non-0 starting point for the Y axis tho
Aaron Harris @harris
Here's a fun game we used to play at Bridgewater. Guess the chart. Winner gets...satisfaction.
Feross @feross
When the CEO of a $600M+ ARR public company calls out your startup directly, your team and customers deserve a response.

@JFrog's CEO published a post today calling @SocketSecurity a "fragile, commercialized illusion of security" that "wraps open source scanners." This isn't the kind of discourse that makes our industry better. But since he named us, here are the facts.

The attack he references -- the Trivy/Aqua supply chain compromise -- is one Socket helped expose. Our threat research team independently identified the OpenVSX extension attack on March 2, the 75+ compromised GitHub Actions tags on March 19, and the poisoned Docker Hub images on March 22. He's citing our work to make his case against us.

On the core question -- who's actually finding supply chain threats -- the public record is clear. JFrog's research page lists ~5,000 findings across their entire 18-year history as a company. Socket discovers ~10,000 malicious packages *per week*. We've identified ~250,000 unique supply chain attacks. These numbers are all public.

We publish our research, our detections, and our threat data publicly. Anyone can evaluate the work. We report our findings to the registries, where they end up protecting JFrog's own customers through OpenSSF.

Scanners find known CVEs. Socket finds attackers. Those are different problems, and conflating them is either a mistake or a choice.

JFrog's SEC filings show security is 7% of their FY2025 revenue. 93% of their customers aren't buying their storytelling either.

Back to building.
Yuriy Dybskiy @html5cat
“who are the best ppl in SF to talk about AGI?”
[GIF]