Guoqing Liu

@fiberleif

Senior Researcher at @MSFTResearch AI for Science, working on reinforcement learning, large language models, and AI for Science.

Cambridge, England · Joined April 2022
562 Following · 131 Followers
Guoqing Liu retweeted
Robert Pinsler @rpinsler
We are significantly expanding to accelerate our ambitious plans for AI-driven materials discovery at @MSFTResearch AI for Science. Looking for a Data Engineer, ML Engineer and Applied Scientist (UK/NL/DE). ⬇️See job postings below ⬇️
Guoqing Liu retweeted
Jerry Tworek @MillionInt
I don't do podcasts very often - in reality this is my first one ever, but if anyone wants to listen to someone talk about RL for an hour, this is it
Matt Turck @mattturck

How GPT-5 thinks, with @OpenAI VP of Research @MillionInt

00:00 - Intro
01:01 - What Reasoning Actually Means in AI
02:32 - Chain of Thought: Models Thinking in Words
05:25 - How Models Decide How Long to Think
07:24 - Evolution from o1 to o3 to GPT-5
11:00 - The Road to OpenAI: Growing up in Poland, Dropping out of School, Trading
20:32 - Working on Robotics and Rubik's Cube Solving
23:02 - A Day in the Life: Talking to Researchers
24:06 - How Research Priorities Are Determined
26:53 - OpenAI's Culture of Transparency
29:32 - Balancing Research with Shipping Fast
31:52 - Using OpenAI's Own Tools Daily
32:43 - Pre-Training Plus RL: The Modern AI Stack
35:10 - Reinforcement Learning 101: Training Dogs
40:17 - The Evolution of Deep Reinforcement Learning
42:09 - When GPT-4 Seemed Underwhelming at First
45:39 - How RLHF Made GPT-4 Actually Useful
48:02 - Unsupervised vs Supervised Learning
49:59 - GRPO and How DeepSeek Accelerated US Research
53:05 - What It Takes to Scale Reinforcement Learning
55:36 - Agentic AI and Long-Horizon Thinking
59:19 - Alignment as an RL Problem
1:01:11 - Winning ICPC World Finals Without Specific Training
1:05:53 - Applying RL Beyond Math and Coding
1:09:15 - The Path from Here to AGI
1:12:23 - Pure RL vs Language Models

Guoqing Liu retweeted
Tsinghua University @Tsinghua_Uni
Prof. Chen Ning Yang, a world-renowned physicist, Nobel Laureate in Physics, Academician of the Chinese Academy of Sciences, Professor at Tsinghua University, and Honorary Director of the Institute for Advanced Study at Tsinghua University, passed away in Beijing due to illness at the age of 103. His life stands as a timeless chapter in human history—one that shines not only for China but for the global community of thinkers and innovators. His legacy will live on forever.
Guoqing Liu retweeted
Gregor Simm @gncsimm
MLFFs 🤝 Polymers — SimPoly works! Our team at @MSFTResearch AI for Science is proud to present SimPoly (SIM-puh-lee) — a deep learning solution for polymer simulation.

Polymeric materials are foundational to modern life—found in everything from the clothes we wear and the food we consume to high-performance materials in aerospace, electronics, and medicine. Today, we introduce a new way to simulate them.

We built a machine learning force field (MLFF) to predict macroscopic properties across a broad range of polymers—trained only on quantum-chemical data, with no experimental fitting. Specifically, we accurately compute polymer densities via large-scale MD simulations, achieving higher accuracy than classical force fields. We also capture second-order phase transitions, enabling prediction of glass transition temperatures. These two properties are fundamental to processing and application design.

Finally, we created a benchmark based on experimental data for 130 polymers plus an accompanying quantum-chemical dataset—laying the foundation for a fully in silico design pipeline for next-generation polymeric materials.

The incredible team: Jean Helie, @temporaer, Yicheng Chen, Guillem Simeon, @a_kzna, @ErnestoCheco, @erunzzz, Gabriele Tocci, @chc273, @yatao_li, @SherryLixueC, @zunwang_msr, Bichlien H. Nguyen, Jake A. Smith, and Lixin Sun.

📄 Preprint: arxiv.org/abs/2510.13696
⚙️ Data and code release: in progress⏳

#MLFFs #Polymers #AIforScience #DeepLearning #SimPoly #ScientificML #Microsoft #MicrosoftResearch #MicrosoftQuantum
Guoqing Liu retweeted
Da Yu @DaYu85201802
✨ Internship Opportunity @ Google Research ✨ We are seeking a self-motivated student researcher to join our team at Google Research starting around January 2026. 🚀 In this role, you will contribute to research projects advancing agentic LLMs through tool use and RL, with the goal of enabling breakthrough applications. We are particularly interested in PhD students with a strong background in these areas. If interested, please send a brief self-introduction and your CV to yuda3.edu@gmail.com. Looking forward to connecting with talented researchers in this exciting space!
Guoqing Liu retweeted
Microsoft Research @MSFTResearch
RetroChimera, now available on Azure AI Foundry, marks a new milestone for predicting synthesis routes to drug-like molecules, opening new possibilities for AI in drug discovery. Learn more: msft.it/6011sutzJ
Guoqing Liu retweeted
Dylan Foster 🐢 @canondetortugas
Announcing the first workshop on Foundations of Language Model Reasoning (FoRLM) at NeurIPS 2025! 📝Soliciting abstracts that advance foundational understanding of reasoning in language models, from theoretical analyses to rigorous empirical studies. 📆 Deadline: Sept 3, 2025
Guoqing Liu retweeted
Tian Xie @xie_tian
Want to join our efforts @MSFTResearch AI for Science to push the frontier of AI for materials? We are the team behind MatterGen & MatterSim and we have 2 job openings! Each can be in Amsterdam, NL, Berlin, DE, or Cambridge, UK. It is a rare opportunity to join a highly talented, collaborative team and build the next frontier model for materials design. Senior Researcher: jobs.careers.microsoft.com/global/en/job/… Senior Research Engineer: jobs.careers.microsoft.com/global/en/job/…
Guoqing Liu retweeted
Roberta Raileanu @robertarail
I’m building a new team at @GoogleDeepMind to work on Open-Ended Discovery! We’re looking for strong Research Scientists and Research Engineers to help us push the frontier of autonomously discovering novel artifacts such as new knowledge, capabilities, or algorithms, in an open-ended self-improving loop. We aim to work on ambitious research projects in a fast-paced manner. If this sounds appealing to you, apply using the link below by Friday, August 1st EOD: job-boards.greenhouse.io/deepmind/jobs/…
Guoqing Liu retweeted
Andrej Karpathy @karpathy
Scaling up RL is all the rage right now, I had a chat with a friend about it yesterday. I'm fairly certain RL will continue to yield more intermediate gains, but I also don't expect it to be the full story.

RL is basically "hey this happened to go well (/poorly), let me slightly increase (/decrease) the probability of every action I took for the future". You get a lot more leverage from verifier functions than explicit supervision, this is great.

But first, it looks suspicious asymptotically - once the tasks grow to be minutes/hours of interaction long, you're really going to do all that work just to learn a single scalar outcome at the very end, to directly weight the gradient?

Beyond asymptotics and second, this doesn't feel like the human mechanism of improvement for majority of intelligence tasks. There's significantly more bits of supervision we extract per rollout via a review/reflect stage along the lines of "what went well? what didn't go so well? what should I try next time?" etc. and the lessons from this stage feel explicit, like a new string to be added to the system prompt for the future, optionally to be distilled into weights (/intuition) later a bit like sleep. In English, we say something becomes "second nature" via this process, and we're missing learning paradigms like this. The new Memory feature is maybe a primordial version of this in ChatGPT, though it is only used for customization not problem solving. Notice that there is no equivalent of this for e.g. Atari RL because there are no LLMs and no in-context learning in those domains.

Example algorithm: given a task, do a few rollouts, stuff them all into one context window (along with the reward in each case), use a meta-prompt to review/reflect on what went well or not to obtain string "lesson", to be added to system prompt (or more generally modify the current lessons database). Many blanks to fill in, many tweaks possible, not obvious.

Example of lesson: we know LLMs can't super easily see letters due to tokenization and can't super easily count inside the residual stream, hence 'r' in 'strawberry' being famously difficult. Claude system prompt had a "quick fix" patch - a string was added along the lines of "If the user asks you to count letters, first separate them by commas and increment an explicit counter each time and do the task like that". This string is the "lesson", explicitly instructing the model how to complete the counting task, except the question is how this might fall out from agentic practice, instead of it being hard-coded by an engineer, how can this be generalized, and how lessons can be distilled over time to not bloat context windows indefinitely.

TLDR: RL will lead to more gains because when done well, it is a lot more leveraged, bitter-lesson-pilled, and superior to SFT. It doesn't feel like the full story, especially as rollout lengths continue to expand. There are more S curves to find beyond, possibly specific to LLMs and without analogues in game/robotics-like environments, which is exciting.
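The example algorithm in the tweet above can be sketched in a few lines of Python. This is only an illustrative sketch of the rollout → reflect → lesson loop: `call_llm` and `run_rollout` are hypothetical stubs standing in for a real chat-completion API and a real agent environment, not any actual library.

```python
def call_llm(prompt: str) -> str:
    # Hypothetical stub: a real system would query a language model here.
    return "Lesson: verify intermediate results before committing to an answer."

def run_rollout(task: str, lessons: list[str]) -> tuple[str, float]:
    # Hypothetical stub: a real agent would act for many steps and
    # receive only a single scalar reward at the very end.
    transcript = f"attempted: {task} (lessons in context: {len(lessons)})"
    reward = 1.0 if lessons else 0.0
    return transcript, reward

def reflect(task: str, lessons: list[str], n_rollouts: int = 3) -> str:
    # 1. Do a few rollouts, recording (transcript, reward) pairs.
    rollouts = [run_rollout(task, lessons) for _ in range(n_rollouts)]
    # 2. Stuff them all into one context window along with each reward.
    context = "\n".join(f"[reward={r}] {t}" for t, r in rollouts)
    # 3. Use a meta-prompt to review/reflect and extract a "lesson" string.
    meta_prompt = (
        f"Task: {task}\nRollouts:\n{context}\n"
        "What went well? What didn't? State one lesson for next time."
    )
    return call_llm(meta_prompt)

# The evolving "lessons database", prepended to future system prompts.
lessons_db: list[str] = []
lesson = reflect("count the r's in strawberry", lessons_db)
lessons_db.append(lesson)
```

The design point is that each rollout yields a reusable string of explicit supervision rather than only a scalar gradient weight; the open questions the tweet raises (generalizing, distilling, and pruning lessons) are left as blanks here too.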
Guoqing Liu retweeted
Ryan Yuan @RainbowYuhui
I’m building a fundamental research team focused on developing the world’s best graphic-design generation and understanding models. We have access to vast amounts of high-quality data and ample GPU resources. If you’re interested in joining us, please email me your résumé.🤗🤗🤗
Guoqing Liu retweeted
Adam Foster @AdamEFoster
I am very happy to share Orbformer, a foundation model for wavefunctions using deep QMC that offers a route to tackle strongly correlated quantum states! arxiv.org/abs/2506.19960
Guoqing Liu retweeted
Microsoft Research @MSFTResearch
Microsoft researchers achieved a breakthrough in the accuracy of DFT, a method for predicting the properties of molecules and materials, by using deep learning. This work can lead to better batteries, green fertilizers, precision drug discovery, and more. msft.it/6011SQwKX
Guoqing Liu retweeted
Rianne van den Berg @vdbergrianne
🚀 After two+ years of intense research, we’re thrilled to introduce Skala — a scalable deep learning density functional that hits chemical accuracy on atomization energies and matches hybrid-level accuracy on main group chemistry — all at the cost of semi-local DFT. ⚛️🔥🧪🧬
Guoqing Liu retweeted
Younggyo Seo @younggyoseo
Excited to present FastTD3: a simple, fast, and capable off-policy RL algorithm for humanoid control -- with an open-source code to run your own humanoid RL experiments in no time! Thread below 🧵
Guoqing Liu retweeted
Tom Zahavy @TZahavy
I am looking to hire a student researcher to work with AlphaProof on a project at the intersection of AI, math, computation, and creativity. Background in AI for math, and/or Lean is desired. If interested, please get in touch. The position will be based in London.
Guoqing Liu retweeted
Ryan Yuan @RainbowYuhui
Thrilled to share our latest research on fundamental variable multi-layer transparent image generation, inspired by Schema Theory! ✨ ART enables precise control and scalable layer generation—pioneering a new paradigm for interactive content creation. 🚀 art-msra.github.io
Guoqing Liu retweeted
Mingqian Ma @mishamamq
🚀 Research Update! Excited to share HybriDNA: A Hybrid Transformer-Mamba2 Long-Range DNA Language Model, now on arXiv!

DNA is the language of life, but modeling it is challenging: ultra-long sequences, single-nucleotide precision, and the need for both understanding & generation. HybriDNA tackles this with a decoder-only Hybrid Transformer-Mamba2 model, scaling to 131kb context & achieving SOTA across 33 DNA tasks! 🧬

🔹 Hybrid Transformer-Mamba2 for long-range genomic modeling
🔹 SOTA on BEND, GUE & LRB benchmarks
🔹 Scales up to 7B params, improving with size
🔹 Efficient long-context DNA modeling

📄 Read here: arxiv.org/abs/2502.10807
🌐 Project page: hybridna-project.github.io/HybriDNA-Proje…

Huge thanks to my co-authors @_albertgu, @tri_dao from Princeton and CMU, and all the researchers at MSR AI for Science. Special gratitude to my mentors @fiberleif and @TaoQin for their continuous support!

#MachineLearning #Genomics #AI4Science #DNA #FoundationModels #HybriDNA