Ivan Provilkov

135 posts

Ivan Provilkov

@provilkov

Research & Engineering @togethercompute; Ex @YandexResearch; Building Products; Sapere aude!

Dublin · Joined June 2020
387 Following · 182 Followers
Pinned Tweet
Ivan Provilkov @provilkov ·
New research from @couplefire12 and me on training LLMs to reason from expert demonstrations — with no verifiers and no preference labels. We do GAN-like training via Inverse Reinforcement Learning. Promising results. Take a look!
Locke Cai@couplefire12

RL for reasoning often relies on verifiers — great for math, but tricky for creative writing or open-ended research. Meet RARO: a new paradigm that teaches LLMs to reason via adversarial games instead of verification. No verifiers. No environments. Just demonstrations. 🧵👇

0
0
8
748
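The pinned thread describes a GAN-like game trained with inverse RL: a policy imitates expert demonstrations while a critic learns to tell expert answers from policy answers, and the critic's score serves as a learned reward. Below is a minimal toy sketch of that loop; the candidate answers, tabular policy/critic, and update rules are all invented for illustration and are not the paper's actual algorithm.

```python
import random

random.seed(0)

# Toy sketch of the adversarial setup (illustrative, not the paper's code).
# Expert demonstrations vs. a policy over a fixed set of candidate answers.
experts = ["careful step-by-step proof", "cited literature review"]
candidates = experts + ["one-word guess", "confident hallucination"]

policy = {c: 1.0 for c in candidates}   # unnormalized sampling weights
critic = {c: 0.0 for c in candidates}   # score > 0 means "looks expert"

def sample(weights):
    """Draw a candidate proportionally to its weight."""
    total = sum(weights.values())
    r = random.uniform(0, total)
    for c, w in weights.items():
        r -= w
        if r <= 0:
            return c
    return c

for step in range(500):
    # Critic step (discriminator): push expert answers up, policy samples down.
    critic[random.choice(experts)] += 0.1
    critic[sample(policy)] -= 0.1
    # Policy step: the critic's score acts as a learned, IRL-style reward.
    a = sample(policy)
    policy[a] = max(policy[a] * (1.0 + 0.05 * critic[a]), 1e-6)

best = max(policy, key=policy.get)
print(best)  # the policy mass concentrates on expert-like answers
```

In the real setting both players are LLMs and the updates are policy-gradient steps, but the equilibrium logic is the same: the policy only stops being penalized once its outputs are indistinguishable from the demonstrations.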
Ivan Provilkov retweeted
eu/acc @euacc ·
🇪🇺 eu/acc After 2 years in existence, the first 4 points of the @euacc manifesto, which was crowdsourced by all of you, are now passed as laws. That means 1/3rd of eu/acc's points is complete:
✅ 1. Reduce regulatory burden for startups
✅ 2. Make skilled immigration easier, unskilled harder
✅ 3. Repeal the cookie law
✅ 4. European Inc: a single pan-EU business entity
Now for the next 8 objectives:
🔲 6. Tax discount during startup phase
🔲 7. Tax stock options when sold, not when exercised
🔲 8. Embrace AI and technology, don't fight it
🔲 9. Champion free speech, don't censor it
🔲 10. Reform bankruptcy laws to empower entrepreneurs
🔲 11. Make English the primary language of the European Union
🔲 12. Teach AI and tech in European schools and universities
37
48
465
37.7K
Ivan Provilkov retweeted
Together AI @togethercompute ·
Together Fine-tuning now supports tool calling, reasoning, and vision-language model fine-tuning. Train models up to 1T parameters with up to 6x higher throughput on MoE architectures.
2
6
14
2.4K
Daniel Dhawan @daniel_dhawan ·
Introducing Rork Max – the first website that builds Swift iOS apps. Powered by Claude Code & Opus 4.6, it's the most powerful AI for mobile apps. Rork Max makes beautiful, real mobile apps that don't feel vibecoded.
Rork@rork

Introducing Rork Max: AI that one-shots almost any app for iPhone, Watch, iPad, TV & Vision Pro. Even Pokémon Go with AR & 3D. Max is a website that replaces Xcode. Install on device in 1 click. Publish to App Store in 2 clicks. Powered by Swift, Claude Code & Opus 4.6.

23
12
289
58.7K
Ivan Provilkov retweeted
Together AI @togethercompute ·
2/ Together Evaluations is a unified framework for assessing LLM quality. Teams can:
• compare open models to OpenAI/Anthropic/Google
• make better decisions on prompting vs. fine-tuning
• track quality improvements over time
Read more: together.ai/blog/together-…
1
1
4
1.3K
Ivan Provilkov retweeted
Anders K. @Falliblemusings ·
I used to think Sapiens was a great book. Sweeping, provocative, the kind of book that makes you feel like you finally understand the big picture of human history. It's on every CEO's bookshelf, assigned in universities, praised as a masterwork of synthesis. Yuval Noah Harari is treated as one of the serious thinkers of our time.

But something nagged at me. Some passages felt off. Claims that human rights are just figments of our collective imagination, not real things, just stories we tell ourselves. That nations, laws, money, justice don't exist outside our heads. That meaning itself is a delusion we've invented to cope. That we're far more powerful than ever before but not happier. That hunter-gatherers had it better because they had no dishes to wash, no carpets to vacuum, no nappies to change, no bills to pay. That sounded depressing to me, but perhaps it was just the realistic scientific worldview? What it meant to see the world clearly, without comforting illusions.

Then I read The Beginning of Infinity by @DavidDeutschOxf. Deutsch has a concept he calls 'bad philosophy.' Not philosophy that's merely false, but philosophy that actively prevents the growth of knowledge. Ideas that close doors rather than open them. That make problems seem unsolvable by design.

After soaking in Deutsch's framework (it's dense, a bit like digesting a delicious whale), it becomes clear: Harari's books are riddled with bad philosophy. They're smuggling nihilism in under the guise of scientific objectivity. Some examples:

On meaning: "Human life has absolutely no meaning. Humans are the outcome of blind evolutionary processes that operate without goal or purpose... any meaning that people inscribe to their lives is just a delusion."

On human rights: "There are no gods in the universe, no nations, no money, no human rights, no laws, and no justice outside the common imagination of human beings."

On free will: "Humans are now hackable animals. The idea that humans have this soul or spirit and they have free will, that's over."

On progress: "We thought we were saving time; instead we revved up the treadmill of life to ten times its former speed." The Agricultural Revolution? "History's biggest fraud." We didn't domesticate wheat, "it domesticated us."

On our cosmic significance: "If planet Earth were to blow up tomorrow morning, the universe would probably keep going about its business as usual. Human subjectivity would not be missed."

On the future: "Those who fail in the struggle against irrelevance would constitute a new 'useless class.'" Homo sapiens will likely "disappear in a century or two."

This is bad philosophy. It tells us our problems are cosmically insignificant, our solutions are illusions, and that progress is neither desirable nor within our control.

It's also perfect nonsense. No one would ever go back to being hunter-gatherers. Would you rather worry about your kid spending too much time on Roblox, or face the 50% chance she won't reach puberty? And our so-called "fictions"? They ended slavery. They gave women equal rights. They solved hunger. They eradicated smallpox. They turned sand into computer chips. They got us to the moon, and hopefully soon, to Mars and beyond. These "fictions" are already reshaping the universe, and over time they may become the most potent force in it.

Now compare Deutsch: "Humans, people and knowledge are not only objectively significant: they are by far the most significant phenomena in nature." "Feeling insignificant because the universe is large has exactly the same logic as feeling inadequate for not being a cow." "Problems are soluble, and each particular evil is a problem that can be solved." "We are only just scratching the surface, and shall never be doing anything else. If unlimited progress really is going to happen, not only are we now at almost the very beginning of it, we always shall be."

Where Harari sees a species of deluded apes stumbling toward obsolescence, Deutsch sees universal explainers, the only entities we know of capable of creating explanatory knowledge, solving problems, and potentially seeding the universe with intelligence.

The difference isn't academic. Ideas shape action. If you believe life is meaningless, progress is a trap, and humans are hackable animals with no free will, how does that affect what you build? What you fight for? What you teach your children?

Harari's books sell because they flatter a fashionable pessimism. They let readers feel sophisticated for seeing through the "delusions" everyone else lives by. That smug cynicism is corrosive. And it's everywhere: in schools, in media, in bestselling books. More than half of young adults now say they feel little to no purpose or meaning in life. This is what happens when you teach an entire generation bad philosophy. Less progress, less health, less wealth. Less flourishing. And ultimately, a higher chance that civilization and consciousness go extinct.

Fortunately, there's another equally well-written, but much truer, account of Homo sapiens, appropriately titled 'The Beginning of Infinity'. And this one smuggles no despair in by the back door.

But let's give Harari credit where it's due. He is right about one thing: if planet Earth blew up tomorrow, we wouldn't be missed. Because there'd be no one left to miss us, just a careless universe, blindly obeying physical laws. We are the only ones who can miss, but we're not going to. We're going to aim, hit, and keep going.

Full credit for the amazing meme to @Ben__Jeff
867
1.5K
9.2K
897.4K
Ivan Provilkov @provilkov ·
I think there are 2 possibilities:
1. People retain control of AI-Capital. In that case, they need other people who understand and verify AI plans and work. These people become the “labor bottleneck” that maintains the labor–capital ratio.
2. AI-Capital operates on its own. In this case, there is no human control: AI-Capital is autonomous. The whole question of inequality is then different and is determined by this self-improving AI capital.
0
0
0
5
Dwarkesh Patel @dwarkesh_sp ·
New blog post w @pawtrammell: Capital in the 22nd Century, where we argue that while Piketty was wrong about the past, he’s probably right about the future.

Piketty argued that without strong redistribution of wealth, inequality will indefinitely increase. Historically, however, income inequality from capital accumulation has actually been self-correcting. Labor and capital are complements, so if you build up lots of capital, you’ll lower its returns and raise wages (since labor now becomes the bottleneck).

But once AI/robotics fully substitute for labor, this correction mechanism breaks. For centuries, the share of GDP that goes to paying wages has been 2/3, and the share of GDP that’s been income from owning stuff has been 1/3. With full automation, capital’s share of GDP goes to 100% (since datacenters and solar panels and the robot factories that build all the above plus more robot factories are all “capital”).

And inequality among capital holders will also skyrocket - in favor of larger and more sophisticated investors. A lot of AI wealth is being generated in private markets. You can’t get direct exposure to xAI from your 401k, but the Sultan of Oman can. A cheap house (the main form of wealth for many Americans) is a form of capital almost uniquely ill-suited to taking advantage of a leap in automation: it plays no part in the production, operation, or transportation of computers, robots, data, or energy.

Also, international catch-up growth may end. Poor countries historically grew faster by combining their cheap labor with imported capital/know-how. Without labor as a bottleneck, their main value-add disappears.

Inequality seems especially hard to justify in this world. So if we don’t want inequality to just keep increasing forever - with the descendants of the most patient and sophisticated of today’s AI investors controlling all the galaxies - what can we do?

The obvious place to start is with Piketty’s headline recommendation: highly and progressively tax wealth. This might discourage saving, but it would no longer penalize those who have earned a lot by their hard work and creativity. The wealth - even the investment decisions - will be made by the robots, and they will work just as hard and smart however much we tax their owners.

But taxing capital is pointless if people can just shift their future investment to lower-tax countries. And since capital stocks could grow really fast (robots building robots and all that), pretty soon tax havens go from marginal outposts to the majority of global GDP. But how do you get global coordination on taxing capital, when the benefits to defecting are so high and so accessible?

Full automation will probably lead to ever-increasing inequality. We don’t see an obvious solution to this problem. And we think it’s weird how little thought has gone into what to do about it. Many more thoughts from re-reading Piketty with our AGI hats on at the post in the link below.
209
235
2.3K
1.5M
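The self-correction argument in the thread above, and how full automation breaks it, can be checked with a small worked example. It assumes a standard Cobb-Douglas production function with the 1/3 capital share the post cites; the functional form and the other numbers are textbook illustrations, not figures from the post.

```python
# Worked example: why capital accumulation is self-correcting when labor
# and capital are complements, and why full automation breaks the correction.
# Cobb-Douglas form and parameter values are textbook assumptions.

alpha = 1 / 3   # capital's share of income (~1/3, as the post notes)
L = 100.0       # fixed labor supply

def cobb_douglas(K):
    """Y = K^alpha * L^(1-alpha): labor and capital are complements."""
    Y = K**alpha * L**(1 - alpha)
    wage = (1 - alpha) * Y / L      # marginal product of labor
    labor_share = wage * L / Y      # stays at 1 - alpha = 2/3
    return Y, wage, labor_share

_, w1, s1 = cobb_douglas(100.0)
_, w2, s2 = cobb_douglas(1000.0)    # 10x the capital stock
print(w2 / w1, s1, s2)              # wages rise; labor share unchanged

def full_automation(K):
    """AK production: output no longer needs labor at all."""
    Y = 0.05 * K
    wage, labor_share = 0.0, 0.0    # capital's share of GDP goes to 100%
    return Y, wage, labor_share

_, _, s3 = full_automation(1000.0)
print(s3)
```

With complementary labor, a 10x larger capital stock raises wages (here by about 2.15x) while labor's share stays pinned at 2/3; once output is produced by capital alone, that share goes to zero and nothing pushes back.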
Locke Cai @couplefire12 ·
RL for reasoning often relies on verifiers — great for math, but tricky for creative writing or open-ended research. Meet RARO: a new paradigm that teaches LLMs to reason via adversarial games instead of verification. No verifiers. No environments. Just demonstrations. 🧵👇
Locke Cai tweet media
24
78
611
177K
Ivan Provilkov retweeted
Together AI @togethercompute ·
No verifiers? No problem. 🤝 The Together Research team is excited to introduce RARO — a new paradigm that unlocks scalable reasoning. By teaching LLMs to reason through adversarial games, we're seeing promising results where standard RL fails. Check it out now and let us know if you're interested in trying RARO to train reasoning models: forms.gle/Rrrs52MZHJZVuH…
Locke Cai@couplefire12

RL for reasoning often relies on verifiers — great for math, but tricky for creative writing or open-ended research. Meet RARO: a new paradigm that teaches LLMs to reason via adversarial games instead of verification. No verifiers. No environments. Just demonstrations. 🧵👇

0
2
9
2.8K
Ivan Provilkov @provilkov ·
@beffjezos Thank you! Anything you can demo, you can explain, reason about, and then hillclimb — that’s the high-level idea behind this research direction. However, it still requires a lot of tuning and scaling.
0
0
0
61
Ivan Provilkov @provilkov ·
If you have a good dataset/task in mind that you’re interested in, and it (a) requires reasoning and (b) is hard to build a fast experiment/verification system for, please share a link!
Locke Cai@couplefire12

Woke up to some amazing feedback, thanks everyone!! @provilkov and I are working hard to release a plug-and-play RARO repo soon — what domains do you want to see supported? If you have specific model/dataset requests, let us know in the announcement thread! 👇

0
0
1
38
Ivan Provilkov @provilkov ·
@iScienceLuvr Thank you! I also think the medical domain could benefit from our method. Do you have a good dataset or benchmark in mind?
0
0
0
17
Tanishq Mathew Abraham, Ph.D. @iScienceLuvr ·
This is super cool: basically GANs for LLM post-training. The policy tries to mimic expert answers; the critic tries to identify the expert answer vs. the policy answer. I'm curious to try this out on some medical tasks... I had similar ideas about 2 years ago which I was discussing in EleutherAI, but I was trying to apply it to RLHF and didn't pursue it any further... skill issue on my part I guess lol
Locke Cai@couplefire12

RL for reasoning often relies on verifiers — great for math, but tricky for creative writing or open-ended research. Meet RARO: a new paradigm that teaches LLMs to reason via adversarial games instead of verification. No verifiers. No environments. Just demonstrations. 🧵👇

12
16
273
26.4K
Ivan Provilkov retweeted
Nazneen Rajani @nazneenrajani ·
excited to be partnering with amazing folks @togethercompute, @ZainHasan6 and @provilkov to bring dynamic agent simulations to together evals.
Together AI@togethercompute

Together AI 🤝@CollinearAI Introducing TraitMix, Collinear’s simulation product empowering teams to generate persona-driven AI agent interactions. 🔌Plug these interactions into your workflows and evaluate their effectiveness with Together Evals. Details: bit.ly/43GHJhR

2
2
22
2.8K
Ivan Provilkov retweeted
Zain @ZainHasan6 ·
🚀 Now you can fine-tune LLMs from the @huggingface hub using @togethercompute! 🔥
• Public + private repos
• CausalLMs <100B params
• Push tuned models back to the Hub
Smaller, open models + smart fine-tuning > bigger closed ones. Link below 👇
1
2
7
424
Ivan Provilkov retweeted
Together AI @togethercompute ·
🚨 Stop shipping LLMs blind. Together Evaluations is here: fast, flexible, LLM-as-a-judge-based benchmarking to:
✅ Compare model outputs
✅ Score responses against your own criteria
✅ Classify outputs into custom labels, from safety to sentiment
Run our early preview today with any serverless model; more support coming soon. Learn more (links below!)
4
4
26
6.3K
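The LLM-as-a-judge workflow in the announcement above can be sketched generically: a judge compares candidate outputs per prompt, and win rates are aggregated. The snippet below is a hypothetical, self-contained harness; the stubbed judge (which just prefers longer answers) stands in for a real judge-model call and is not Together's actual API.

```python
from collections import Counter

# Generic LLM-as-a-judge harness (hypothetical; not Together's actual API).
# A judge compares two candidate answers per prompt; we aggregate win rates.

def stub_judge(prompt, answer_a, answer_b):
    """Stand-in for a judge-model call: naively prefer the longer answer."""
    if len(answer_a) > len(answer_b):
        return "A"
    if len(answer_b) > len(answer_a):
        return "B"
    return "tie"

def pairwise_eval(dataset, judge):
    """Return the fraction of A wins, B wins, and ties over the dataset."""
    tally = Counter(judge(p, a, b) for p, a, b in dataset)
    n = len(dataset)
    return {k: tally[k] / n for k in ("A", "B", "tie")}

dataset = [
    ("capital of France?", "Paris.", "Paris, the capital of France."),
    ("2+2?", "4", "The answer is 4."),
    ("color of sky?", "blue", "blue"),
]
print(pairwise_eval(dataset, stub_judge))
# → {'A': 0.0, 'B': 0.6666666666666666, 'tie': 0.3333333333333333}
```

In practice `stub_judge` would be replaced by a call to a judge model scoring against your own criteria; randomizing the A/B position per comparison also helps control for judge position bias.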