Sergey Kolesnikov

589 posts

@Scitator

Head of AI Research in a fintech company. Decision Making in the Wild. Opinions are my own.

Joined May 2011

235 Following · 1.2K Followers

Pinned Tweet

Sergey Kolesnikov @Scitator · Dec 30

This year we managed to do the impossible. We launched the AI research department, and we launched it noticeably – with recognition from the world's leading AI conferences: ICML and NeurIPS (both spotlights). But I want to believe that this is just the beginning...

1.2K

Sergey Kolesnikov retweeted

Nikita Balagansky @nlp_ceo · Feb 19

1/7 In-context learning (ICL) is poised to revolutionise NLP, but its success hinges on our ability to process long sequences. Recently, @simran_s_arora et al. showcased advancements in Linear Transformers and proposed Based. But what if we could push its boundaries further?

14.2K

Sergey Kolesnikov retweeted

Alexander Nikulin @how_uhh · Feb 13

Our first stable release and full paper preprint for XLand-MiniGrid is done, check it out! Compared to the workshop version, we have significantly redesigned the library, multi-GPU baselines and standardized benchmarks with millions of unique tasks. github.com/corl-team/xlan…

2.6K

Sergey Kolesnikov retweeted

Vladislav Kurenkov @vladkurenkov · Feb 12

In-Context RL for Variable Action Spaces. ICRL is a promising direction to build Foundational Decision-Making Models. But adaptation to new action spaces is a problem. We propose Headless Algorithm Distillation (@MishaLaskin) to address it. arxiv.org/abs/2312.13327

104

9.2K

Sergey Kolesnikov retweeted

Vladislav Kurenkov @vladkurenkov · Feb 9

Which data-collection strategies enable In-Context Reinforcement Learning? You need either RL training trajectories or supervision with optimal actions. But what if we had a demonstrator policy? Could we use it to enable ICRL? We show the answer is yes: arxiv.org/abs/2312.12275

5.3K

Sergey Kolesnikov retweeted

Vladislav Kurenkov @vladkurenkov · Dec 4

🔥 Imagine if you could train Meta-RL agents for 1 TRILLION transitions under 40 hours? We present XLand-MiniGrid — JAX-accelerated meta-reinforcement learning environments inspired by XLand (@FeryalMP) and MiniGrid (@Love2Code). code: github.com/corl-team/xlan…

181

30.9K

Sergey Kolesnikov @Scitator · Jun 24

@kchonyc Curious about hyperparameters' impact in large-scale benchmarks? Check out our paper "Clean Offline Reinforcement Learning"! Fully open source, waiting for your ⭐️. pdf: arxiv.org/abs/2210.07105 src: github.com/tinkoff-ai/CORL

Sergey Kolesnikov @Scitator · Jun 24

@kchonyc Discover more nuances in our paper "Showing Your Offline Reinforcement Learning Work: Online Evaluation Budget Matters" - ICML 2022, spotlight. pdf: arxiv.org/abs/2110.04156 src: github.com/tinkoff-ai/eop

Kyunghyun Cho @kchonyc · Jun 23

how do people tune hyperparameters in offline reinforcement learning???

112

105.4K

Sergey Kolesnikov retweeted

Vladislav Kurenkov @vladkurenkov · Jun 17

NetHack is arguably one of the most challenging games for humans and even more for RL algorithms. Maybe, offline RL could help? Time will reveal. To bootstrap the practitioners, we release Katakomba — Tools and Benchmarks for Data-Driven NetHack. github.com/tinkoff-ai/kat…

30.8K

Sergey Kolesnikov retweeted

Vladislav Kurenkov @vladkurenkov · Jun 15

Interested in offline and offline-to-online RL 🫶? Check out the new major release of the Clean Offline Reinforcement Learning library:
🤖 Offline: 10 algorithms, 30 datasets benchmarked
🦾 Offline-to-Online: 5 algorithms, 10 datasets benchmarked
github.com/tinkoff-ai/CORL

9.3K

Sergey Kolesnikov @Scitator · May 18

📢 Exciting research from our team! We explored the power of seemingly minor design choices in offline RL by applying them to an established minimalistic baseline developed by @shaneguML. The outcome? Just follow this 🧵

Vladislav Kurenkov @vladkurenkov

There were a lot of algorithmic innovations in offline RL recently, along with a silent evolution of minor design choices. What if we applied these seemingly minor modifications to an established minimalistic baseline by @shaneguML? Turns out, gains are enormous.

360

Sergey Kolesnikov @Scitator · Apr 28

Exciting news: Our paper has been accepted at ICML! Our work focuses on improving the reliability of offline RL algorithms and tackling overfitting through an anti-exploration bonus. And the best part? SAC-RND challenges SOTA results with a single network, no ensembles required!

1.2K

Sergey Kolesnikov @Scitator · Dec 30

For a full list of our publications, check my unofficial records 😅 notion.so/scitator/TRS-P…

281

Sergey Kolesnikov @Scitator · Jul 21

Optimizing accuracy is not a problem if you are EXACT 🤘

Ivan Karpukhin @IvanKarpukhin

In our new paper 🚀 we optimize accuracy via gradient descent! The work, called "EXACT: How to Train Your Accuracy", will be presented at the TAG-ML workshop during #ICML2022 🙃 Paper: arxiv.org/pdf/2205.09615… Poster: drive.google.com/file/d/1ZBOBQv… Enjoy!)

Sergey Kolesnikov @Scitator · Jul 20

You don't need a TPU cluster to count a budget 🤯 Join our EOP talk (@vladkurenkov) in room 307 in 2 hours! @icmlconf spotlight, #ICML2022

Sergey Kolesnikov @Scitator · Jun 7

Are you ready for the upcoming ICML spotlight? 🤯 kudos to @vladkurenkov

Vladislav Kurenkov @vladkurenkov

Extremely pleased to announce that our paper “Showing Your Offline Reinforcement Learning Work: Online Evaluation Budget Matters” was accepted to ICML 2022 (Spotlight)! tinkoff-ai.github.io/eop (1/N)

Sergey Kolesnikov @Scitator · Apr 28

If you're at ICLR now, check out the "Generalizable Policy Learning in the Physical World" workshop tomorrow. We will present "Prompts and Pre-Trained Language Models for Offline Reinforcement Learning" and will be happy to share its current improvements. We have some 😉

Sergey Kolesnikov @Scitator · Apr 28

🤯 30-under-30.forbes.ru/2022/463877-se…

Sergey Kolesnikov retweeted

Sergey Plis @PlisSergey · Apr 27

Check out our blog post for a thorough explanation of how brainchop.org was made and the principles behind its work. With @MMasoud2021 @FarfallaHu @Entodi @Kevin_C_Wang @Scitator #neuroimaging #brainresearch #medicalresearch #MRI #MadeWithTFJS trendscenter.org/in-browser-3d-…

@simran_s_arora @MishaLaskin @FeryalMP @Love2Code @kchonyc @shaneguML @vladkurenkov @icmlconf