
Long Tran-Thanh
@ltt_hal
Professor. Does exist at Warwick Uni. In theory: working on AI/ML - bandits, game theory, and AI4SG. In practice: procrastinates & drinks craft beers.

Personal Update: I've moved to the University of Warwick, UK 🇬🇧. It was a tremendous 5 years in Singapore @NUSComputing, and my move was bittersweet. But I'm so very excited to be part of Warwick (and UK academia more broadly). Stay tuned for more updates!



The original RL algorithms, inspired by natural learning, were online and incremental—they were streaming in the sense that they learned from each increment of experience as it happened, then discarded it, never to be processed again. The streaming algorithms were simple and elegant, but the first big successes of RL in deep learning were not with streaming algorithms. Instead, methods such as DQN chopped the stream of experience into individual transitions, then stored and sampled them in arbitrary batches. Subsequent work followed, extended, and refined the batch approach into asynchronous and offline RL, while the streaming approach languished, unable to produce good results in popular deep learning domains. Until now. Now researchers at the University of Alberta have shown that streaming RL algorithms can work just as well as DQN on Atari and MuJoCo tasks (arxiv.org/pdf/2410.14606). How did they do it? Mostly just by getting signal normalization and step-size bounding right for the streaming case—otherwise they use standard streaming algorithms like TD(lambda) and Q(lambda). To me it looks like they were simply the first researchers with deep knowledge of streaming RL algorithms to seriously address deep RL without being over-influenced by batch-oriented software and batch-oriented supervised-learning ways of thinking.
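For readers unfamiliar with the streaming style, here is a minimal sketch of one incremental TD(lambda) update with a linear value function. This is illustrative only — it is not the paper's code, and it omits the paper's actual contributions (signal normalization and step-size bounding); all names and the toy task are my own:

```python
import numpy as np

def td_lambda_step(w, z, x, r, x_next, alpha=0.1, gamma=0.99, lam=0.9, done=False):
    """One incremental TD(lambda) update with accumulating eligibility traces.

    w: weight vector of a linear value function v(s) = w @ x
    z: eligibility trace vector
    The transition (x, r, x_next) is used once and can then be discarded —
    no replay buffer, no batch.
    """
    v = w @ x
    v_next = 0.0 if done else w @ x_next
    delta = r + gamma * v_next - v      # TD error for this one sample
    z = gamma * lam * z + x             # decay and accumulate the trace
    w = w + alpha * delta * z           # incremental weight update
    return w, z

# Toy two-state chain with one-hot features; reward 1 on termination.
w = np.zeros(2)
for episode in range(200):
    z = np.zeros(2)
    x0, x1 = np.array([1.0, 0.0]), np.array([0.0, 1.0])
    w, z = td_lambda_step(w, z, x0, 0.0, x1)             # s0 -> s1, r = 0
    w, z = td_lambda_step(w, z, x1, 1.0, x1, done=True)  # s1 -> terminal, r = 1
```

After training, the learned values approach v(s1) ≈ 1 and v(s0) ≈ gamma ≈ 0.99 — each sample updated the weights once and was thrown away.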

Would you believe that deep RL can work without replay buffers, target networks, or batch updates? Our recent work gets deep RL agents to learn from a continuous stream of data one sample at a time, without storing any samples. Joint work with @Gautham529 and @rupammahmood.

Today, we’re excited to host Ana-Andreea Stoica from @MPI_IS at our Foundations of AI Seminar Series, @FAIS_Warwick! 🗓️ Nov 11th 🕐 2pm (GMT) 📍 Zeeman Building, MS.03 @warwickdcs @uniofwarwick #WarwickAI



Due to a high demand for registrations, NeurIPS will be moving towards a randomized lottery system, effective immediately. Authors of accepted conference and workshop papers are still guaranteed registration, but this may change as we release spots to the lottery, so we urge authors to register ASAP. Read more: blog.neurips.cc/2024/10/29/neu…



Researchers from the Department of Computer Science at the University of Warwick have had eight papers accepted for publication at NeurIPS 2024! Congratulations to all involved 👏 Read more: warwick.ac.uk/fac/sci/dcs/ne…



This is insane! My PhD student, who has 2 accepted first-authored papers at #NeurIPS2024, cannot attend the conference because the visa processing time in Canada is 10 months (sic!). What’s happening in Canada? It’s time to look for more accessible places to host NeurIPS!!
