Ivan Provilkov

135 posts

@provilkov

Research & Engineering @togethercompute; Ex @YandexResearch; Building Products; Sapere aude!

Dublin · Joined June 2020

387 Following · 182 Followers

Pinned Tweet

Ivan Provilkov@provilkov·11 Dec

New research from @couplefire12 and me on training LLMs to reason from expert demonstrations — with no verifiers and no preference labels. We do a GAN-like training via Inverse Reinforcement Learning. Promising results. Take a look!

Locke Cai@couplefire12

RL for reasoning often relies on verifiers — great for math, but tricky for creative writing or open-ended research. Meet RARO: a new paradigm that teaches LLMs to reason via adversarial games instead of verification. No verifiers. No environments. Just demonstrations. 🧵👇

748

Ivan Provilkov retweeted

eu/acc@euacc·25 Mar

🇪🇺eu/acc
After 2 years in existence, the first 4 points of the @euacc manifesto, which was crowdsourced by all of you, are now passed as laws.
That means 1/3rd of eu/acc's points is complete:
✅ 1. Reduce regulatory burden for startups
✅ 2. Make skilled immigration easier, unskilled harder
✅ 3. Repeal the cookie law
✅ 4. European Inc: a single pan-EU business entity
Now for the next 8 objectives:
🔲 6. Tax discount during startup phase
🔲 7. Tax stock options when sold, not when exercised
🔲 8. Embrace AI and technology, don't fight it
🔲 9. Champion free speech, don't censor it
🔲 10. Reform bankruptcy laws to empower entrepreneurs
🔲 11. Make English the primary language of the European Union
🔲 12. Teach AI and tech in European schools and universities

465

37.7K

Ivan Provilkov retweeted

Together AI@togethercompute·19 Mar

Together Fine-tuning now supports tool calling, reasoning, and vision-language model fine-tuning. Train models up to 1T parameters with up to 6x higher throughput on MoE architectures.

2.4K

Ivan Provilkov@provilkov·21 Feb

@daniel_dhawan Congrats! That's really cool!

Daniel Dhawan@daniel_dhawan·19 Feb

Introducing Rork Max – the first website that builds Swift iOS apps. Powered by Claude Code & Opus 4.6, it's the most powerful AI for mobile apps. Rork Max makes beautiful, real mobile apps that don't feel vibecoded.

Rork@rork

Introducing Rork Max. AI that one-shots almost any app for iPhone, Apple Watch, iPad, Apple TV & Apple Vision Pro. Even Pokémon Go with AR & 3D. Max is a website that replaces Xcode. Install on device in 1 click. Publish to App Store in 2 clicks. Powered by Swift, Claude Code & Opus 4.6.

289

58.7K

Ivan Provilkov@provilkov·3 Feb

A really nice deep dive showing how fine-tuning using Together AI can produce a model that outperforms GPT-5.2 on a given task, while also being 10× cheaper and 15× faster.

Zain@ZainHasan6

How to fine-tune OS LLM judges to outperform GPT-5.2! 🔥
We trained GPT-OSS 120B on 5,400 preference pairs to beat GPT-5.2's accuracy
> superior performance
> 15x lower cost
> 14x faster speeds
Code + deepdive below👇

465

Ivan Provilkov retweeted

Together AI@togethercompute·3 Feb

2/ Together Evaluations is a unified framework for assessing LLM quality. Teams can:
• compare open models to OpenAI/Anthropic/Google
• make better decisions on prompting vs. fine-tuning
• track quality improvements over time
Read more: together.ai/blog/together-…

1.3K

Ivan Provilkov@provilkov·24 Jan

A cool project for video generation and editing from my friend

Alex Varga /✦@vargastartup

Introducing vargai/sdk - JSX for AI Video. Declarative programming language for Claude Code. AI Agent writes JSX, you get videos ✦ 🧵

Ivan Provilkov retweeted

Anders K.@Falliblemusings·20 Jan

I used to think Sapiens was a great book. Sweeping, provocative, the kind of book that makes you feel like you finally understand the big picture of human history. It's on every CEO's bookshelf, assigned in universities, praised as a masterwork of synthesis. Yuval Noah Harari is treated as one of the serious thinkers of our time.
But something nagged at me. Some passages felt off. Claims that human rights are just figments of our collective imagination, not real things, just stories we tell ourselves. That nations, laws, money, justice, don't exist outside our heads. That meaning itself is a delusion we've invented to cope. That we're far more powerful than ever before but not happier. That hunter-gatherers had it better because they had no dishes to wash, no carpets to vacuum, no nappies to change, no bills to pay.
That sounded depressing to me, but was perhaps just the realistic scientific worldview? What it meant to see the world clearly, without comforting illusions.
Then I read The Beginning of Infinity by @DavidDeutschOxf. Deutsch has a concept he calls 'bad philosophy.' Not philosophy that's merely false, but philosophy that actively prevents the growth of knowledge. Ideas that close doors rather than open them. That makes problems seem unsolvable by design.
After soaking in Deutsch's framework (it's dense, a bit like digesting a delicious whale), it becomes clear: Harari's books are riddled with bad philosophy. They're smuggling nihilism in under the guise of scientific objectivity.
Some examples:
On meaning: "Human life has absolutely no meaning. Humans are the outcome of blind evolutionary processes that operate without goal or purpose... any meaning that people ascribe to their lives is just a delusion."
On human rights: "There are no gods in the universe, no nations, no money, no human rights, no laws, and no justice outside the common imagination of human beings."
On free will: "Humans are now hackable animals. The idea that humans have this soul or spirit and they have free will, that's over."
On progress: "We thought we were saving time; instead we revved up the treadmill of life to ten times its former speed." The Agricultural Revolution? "History's biggest fraud." We didn't domesticate wheat, "it domesticated us."
On our cosmic significance: "If planet Earth were to blow up tomorrow morning, the universe would probably keep going about its business as usual. Human subjectivity would not be missed."
On the future: "Those who fail in the struggle against irrelevance would constitute a new 'useless class.'" Homo sapiens will likely "disappear in a century or two."
This is bad philosophy. It tells us our problems are cosmically insignificant, our solutions are illusions, and that progress is neither desirable nor within our control.
It's also perfect nonsense. No one would ever go back to being hunter-gatherers. Would you rather worry about your kid spending too much time on Roblox, or face the 50% chance she won't reach puberty?
And our so-called "fictions"? They ended slavery. They gave women equal rights. They solved hunger. They eradicated smallpox. They turned sand into computer chips. They got us to the moon, and hopefully soon, to Mars and beyond. These "fictions" are already reshaping the universe, and over time they may become the most potent force in it.
Now compare Deutsch:
"Humans, people and knowledge are not only objectively significant: they are by far the most significant phenomena in nature."
"Feeling insignificant because the universe is large has exactly the same logic as feeling inadequate for not being a cow."
"Problems are soluble, and each particular evil is a problem that can be solved."
"We are only just scratching the surface, and shall never be doing anything else. If unlimited progress really is going to happen, not only are we now at almost the very beginning of it, we always shall be."
Where Harari sees a species of deluded apes stumbling toward obsolescence, Deutsch sees universal explainers, the only entities we know of capable of creating explanatory knowledge, solving problems, and potentially seeding the universe with intelligence.
The difference isn't academic. Ideas shape action. If you believe life is meaningless, progress is a trap, and humans are hackable animals with no free will, how does that affect what you build? What you fight for? What you teach your children?
Harari's books sell because they flatter a fashionable pessimism. They let readers feel sophisticated for seeing through the "delusions" everyone else lives by. That smug cynicism is corrosive. And it's everywhere: in schools, in media, in bestselling books.
More than half of young adults now say they feel little to no purpose or meaning in life. This is what happens when you teach an entire generation bad philosophy. Less progress, less health, less wealth. Less flourishing. And ultimately, a higher chance that civilization and consciousness go extinct.
Fortunately, there's another equally well-written, but much truer, account of Homo sapiens, appropriately titled 'The Beginning of Infinity'. And this one smuggles no despair in by the backdoor.
But let's give Harari credit where it's due. He is right about one thing: if planet Earth blew up tomorrow, we wouldn't be missed. Because there'd be no one left to miss us, just a careless universe, blindly obeying physical laws. We are the only ones who can miss, but we're not going to. We're going to aim, hit, and keep going.
Full credit for the amazing meme to @Ben__Jeff

867

1.5K

9.2K

897.4K

Ivan Provilkov@provilkov·18 Jan

I think there are 2 possibilities:
1. People retain control of AI-Capital. In that case, they need other people who understand and verify AI plans and work. These people become the “labor bottleneck” that maintains the labor–capital ratio.
2. AI-Capital operates on its own. In this case, there is no human control: AI-Capital is autonomous. The whole question of inequality is then different and is determined by this self-improving AI-Capital.

Dwarkesh Patel@dwarkesh_sp·29 Dec

New blog post w @pawtrammell: Capital in the 22nd Century Where we argue that while Piketty was wrong about the past, he’s probably right about the future. Piketty argued that without strong redistribution of wealth, inequality will indefinitely increase. Historically, however, income inequality from capital accumulation has actually been self-correcting. Labor and capital are complements, so if you build up lots of capital, you’ll lower its returns and raise wages (since labor now becomes the bottleneck). But once AI/robotics fully substitute for labor, this correction mechanism breaks. For centuries, the share of GDP that goes to paying wages has been 2/3, and the share of GDP that’s been income from owning stuff has been 1/3. With full automation, capital’s share of GDP goes to 100% (since datacenters and solar panels and the robot factories that build all the above plus more robot factories are all “capital”). And inequality among capital holders will also skyrocket - in favor of larger and more sophisticated investors. A lot of AI wealth is being generated in private markets. You can’t get direct exposure to xAI from your 401k, but the Sultan of Oman can. A cheap house (the main form of wealth for many Americans) is a form of capital almost uniquely ill-suited to taking advantage of a leap in automation: it plays no part in the production, operation, or transportation of computers, robots, data, or energy. Also, international catch-up growth may end. Poor countries historically grew faster by combining their cheap labor with imported capital/know-how. Without labor as a bottleneck, their main value-add disappears. Inequality seems especially hard to justify in this world. So if we don’t want inequality to just keep increasing forever - with the descendants of the most patient and sophisticated of today’s AI investors controlling all the galaxies - what can we do? 
The obvious place to start is with Piketty’s headline recommendation: highly and progressively tax wealth. This might discourage saving, but it would no longer penalize those who have earned a lot by their hard work and creativity. The wealth - even the investment decisions - will be made by the robots, and they will work just as hard and smart however much we tax their owners. But taxing capital is pointless if people can just shift their future investment to lower tax countries. And since capital stocks could grow really fast (robots building robots and all that), pretty soon tax havens go from marginal outposts to the majority of global GDP. But how do you get global coordination on taxing capital, when the benefits to defecting are so high and so accessible? Full automation will probably lead to ever-increasing inequality. We don’t see an obvious solution to this problem. And we think it’s weird how little thought has gone into what to do about it. Many more thoughts from re-reading Piketty with our AGI hats on at the post in the link below.

209

235

2.3K

1.5M

Ivan Provilkov@provilkov·16 Dec

@Douglas_Schon @couplefire12 We can try it on Whoop’s data, if you have an idea of where it would be beneficial.

Douglas Schonholtz@Douglas_Schon·13 Dec

@couplefire12 The more I think about this, the more excited I am. Really great work

344

Locke Cai@couplefire12·11 Dec

611

177K

Ivan Provilkov retweeted

Together AI@togethercompute·15 Dec

No verifiers? No problem. 🤝 The Together Research team is excited to introduce RARO — a new paradigm that unlocks scalable reasoning. By teaching LLMs to reason through adversarial games, we're seeing promising results where standard RL fails. Check it out now and let us know if you're interested in trying RARO to train reasoning models: forms.gle/Rrrs52MZHJZVuH…

Locke Cai@couplefire12

2.8K

Ivan Provilkov@provilkov·13 Dec

@beffjezos Thank you! Anything you can demo, you can explain, reason about, and then hillclimb — that’s the high-level idea behind this research direction. However, it still requires a lot of tuning and scaling.

Beff (e/acc)@beffjezos·13 Dec

Anything you can demo you can hillclimb now. It's kind of over.

Locke Cai@couplefire12

111

12.8K

Ivan Provilkov@provilkov·13 Dec

Thank you! Anything you can demo, you can explain, reason about, and then hillclimb — that’s the high-level idea behind this research direction. However, it still requires a lot of tuning and scaling.

Beff (e/acc)@beffjezos

Anything you can demo you can hillclimb now. It's kind of over.

Ivan Provilkov@provilkov·13 Dec

If you have a good dataset/task in mind that you’re interested in, and it (a) requires reasoning and (b) is hard to build a fast experiment/verification system for, please share a link!

Locke Cai@couplefire12

Woke up to some amazing feedback, thanks everyone!! @provilkov and I are working hard to release a plug-and-play RARO repo soon — what domains do you want to see supported? If you have specific model/dataset requests, let us know in the announcement thread! 👇

Ivan Provilkov@provilkov·13 Dec

@iScienceLuvr Thank you! I also think the medical domain could benefit from our method. Do you have a good dataset or benchmark in mind?

Tanishq Mathew Abraham, Ph.D.@iScienceLuvr·12 Dec

This is super cool, basically GANs for LLM post-training: policy tries to mimic expert answers, critic tries to identify expert answer vs policy answer. I'm curious to try this out on some medical tasks... I had similar ideas about 2 years ago, which I was discussing in EleutherAI, but I was trying to apply it to RLHF and didn't pursue any further... skill issue on my part I guess lol

Locke Cai@couplefire12

273

26.4K
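
The policy-versus-critic game described in the tweet above can be illustrated with a toy sketch. This is not the RARO implementation, and nothing in it comes from the paper or its code: the single prompt, the four candidate answers, `EXPERT_INDEX`, the learning rates, and the function `train_toy_game` are all hypothetical. It assumes a tabular policy (one logit per answer), a tabular critic (one score per answer, squashed by a sigmoid into "probability this answer came from the expert"), and a plain REINFORCE update that uses the critic's belief as the reward.

```python
import math
import random

# Hypothetical toy setup: one prompt, four candidate answers, and an "expert"
# who always gives the answer at EXPERT_INDEX. None of this comes from RARO.
ANSWERS = ["3", "5", "4", "22"]
EXPERT_INDEX = 2


def softmax(logits):
    """Turn policy logits into a probability distribution over answers."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def train_toy_game(steps=5000, policy_lr=0.2, critic_lr=0.5, seed=0):
    """Alternate critic and policy updates; return the final policy probabilities."""
    rng = random.Random(seed)
    policy_logits = [0.0] * len(ANSWERS)  # policy: which answer to give
    critic_scores = [0.0] * len(ANSWERS)  # critic: sigmoid(score) = P(answer is the expert's)

    for _ in range(steps):
        probs = softmax(policy_logits)
        d = [sigmoid(s) for s in critic_scores]

        # One expert demonstration and one policy sample for this round.
        expert_answer = EXPERT_INDEX
        policy_answer = rng.choices(range(len(ANSWERS)), weights=probs)[0]

        # Critic step: gradient ascent on log D(expert) + log(1 - D(policy)).
        critic_scores[expert_answer] += critic_lr * (1.0 - d[expert_answer])
        critic_scores[policy_answer] -= critic_lr * d[policy_answer]

        # Policy step: REINFORCE, with the critic's belief as the reward and
        # the expected reward under the current policy as a baseline.
        baseline = sum(p * dk for p, dk in zip(probs, d))
        advantage = d[policy_answer] - baseline
        for k in range(len(ANSWERS)):
            indicator = 1.0 if k == policy_answer else 0.0
            policy_logits[k] += policy_lr * advantage * (indicator - probs[k])

    return softmax(policy_logits)


if __name__ == "__main__":
    for answer, p in zip(ANSWERS, train_toy_game()):
        print(f"{answer!r}: {p:.3f}")
```

No verifier appears anywhere in the loop: the only training signal is the critic's guess about whether an answer looks like a demonstration. Answers the policy proposes that the expert never gives are pushed toward a low critic score, so the policy drifts toward the demonstrated answer.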

Ivan Provilkov retweeted

Nazneen Rajani@nazneenrajani·31 Oct

excited to be partnering with amazing folks @togethercompute, @ZainHasan6 and @provilkov to bring dynamic agent simulations to together evals.

Together AI@togethercompute

Together AI 🤝@CollinearAI Introducing TraitMix, Collinear’s simulation product empowering teams to generate persona-driven AI agent interactions. 🔌Plug these interactions into your workflows and evaluate their effectiveness with Together Evals. Details: bit.ly/43GHJhR

2.8K

Ivan Provilkov retweeted

Zain@ZainHasan6·16 Sep

🚀Now you can fine-tune LLMs from the @huggingface hub using @togethercompute!🔥
• Public + private repos
• CausalLMs <100B params
• Push tuned models back to the Hub
Smaller, open models + smart fine-tuning > bigger closed ones. Link below👇

424

Ivan Provilkov retweeted

Together AI@togethercompute·28 Jul

🚨 Stop shipping LLMs blind. Together Evaluations is here — fast, flexible, LLM-as-a-judge-based benchmarking to:
✅ Compare model outputs
✅ Score responses against your own criteria
✅ Classify outputs into custom labels — from safety to sentiment
Run our early preview today with any serverless model — more support coming soon. Learn more (links below!)

6.3K
