ML@CMU

124 posts

@mlcmublog

Official Twitter account for the ML@CMU blog. @mldcmu @SCSatCMU

Pittsburgh, PA · Joined February 2020
20 Following · 2.3K Followers
blog.ml.cmu.edu/2026/03/17/lum… The method used to segment text into "chunks" in RAG pipelines can significantly affect dense retrieval quality. Read more about LumberChunker, a method for dynamically segmenting long-form narrative documents, in our new post!
We asked LLMs: Is Santa real? 🎅 GPT-4o says Yes at any age. Claude tells 5-year-olds the truth. What does this reveal about invisible assumptions in AI? Do LLMs believe in the tooth fairy or the Illuminati? New holiday post here: blog.ml.cmu.edu/2025/12/23/is-…
LLM-as-a-judge is used everywhere, but breaks under standard forced-choice ratings. Modeling rating indeterminacy can lead to more reliable judges. Read more in our latest blog post. blog.ml.cmu.edu/2025/12/09/val…
Why does LLM training plateau and how can we fix it? In a new blogpost, we discuss how we can improve LLM exploration using ideas from offline RL. blog.ml.cmu.edu/2025/11/26/how…
blog.ml.cmu.edu/2025/10/27/lea… The hardest problems have near-zero success rates and no positive examples during learning. BaNEL (Bayesian Negative Evidence Learning) post-trains using failed attempts only while minimizing the number of reward evaluations. Read more in our latest post!
blog.ml.cmu.edu/2025/09/22/dif… Check out our new blog post on "Diffusion beats Autoregressive in Data-Constrained settings". The era of infinite internet data is ending. This research paper asks: What is the right generative modeling objective when data, not compute, is the bottleneck?
blog.ml.cmu.edu/2025/09/15/ver… Check out our latest blog post on Verlog, a multi-turn reinforcement learning framework built for long-horizon LLM-agentic tasks with highly variable episode lengths.
blog.ml.cmu.edu/2025/04/21/all… Check out our new blog post on ALLIE, a new chess AI that actually plays like a human! Unlike Stockfish or AlphaZero, which focus on winning at all costs, ALLIE uses a transformer model trained on human chess games to make moves, ponder, and resign like humans. With time-adaptive MCTS at inference time (allocating more search budget to positions where humans spend more time), ALLIE can match the skill level of players up to grandmaster level (2500 Elo) in online games, while learning exclusively from humans. Written by @yimingz0, @apjacob03, Vivian Lai, @dan_fried, @daphneipp
blog.ml.cmu.edu/2025/04/09/cop… How do real-world developer preferences compare to existing evaluations? A CMU and UC Berkeley team led by @iamwaynechi and @valeriechen_ created @CopilotArena to collect user preferences on in-the-wild workflows. This blog post gives an overview of the design and deployment of Copilot Arena, plus new insights into developer code preferences.