Piotr Gabryś 🇺🇦

@GabrysPiotrek
Senior Research Scientist @ Snowflake ❄️ | Kaggle Competitions Master | #StandWithUkraine

I wonder where all the people are who claimed to have beaten our score by far on ARC AGI2 when we won ARC Prize. I don't see them on the ARC AGI2 leaderboard. We are still on its Pareto frontier (the orange dot at $0.2 per task), see screenshot. Moreover, people are submitting our code in the current Kaggle competition. They are moving the needle a bit higher, up to 31%.

"But here is what we found when we tested: We took the specific vulnerabilities Anthropic showcases in their announcement, isolated the relevant code, and ran them through small, cheap, open-weights models. Those models recovered much of the same analysis. Eight out of eight models detected Mythos's flagship FreeBSD exploit, including one with only 3.6 billion active parameters costing $0.11 per million tokens. A 5.1B-active open model recovered the core chain of the 27-year-old OpenBSD bug." aisle.com/blog/ai-cybers…

Deep Past Challenge was my first Kaggle competition without coding, and with @raja_biswas we came 3rd 🥉 It was fun to use agentic tools to learn about life and settling disputes 4000 years ago. Thanks @kaggle for hosting 👇
10 minas of silver (~5 kg), enough to buy a house.

Dubai Offsite, Ep. 4: We're building a data center in the desert