Barlas Oğuz

9 posts

Barlas Oğuz

@barlas_berkeley

Research scientist, Meta FAIR. Ex-MSFT, Berkeley alumnus.

Oakland, CA, USA · Joined March 2014
24 Following · 51 Followers
Barlas Oğuz retweeted
Jessy Lin @realJessyLin
🧠 How can we equip LLMs with memory that allows them to continually learn new things? In our new paper with @AIatMeta, we show how sparsely finetuning memory layers enables targeted updates for continual learning, with minimal interference with existing knowledge.
While full finetuning and LoRA see drastic drops in held-out task performance (📉 -89% FT, -71% LoRA on fact-learning tasks), memory layers learn the same amount with far less forgetting (-11%). 🧵:
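A minimal sketch of the setup the thread describes, assuming a product-key-style memory table with top-k lookup; the class and function names here are illustrative, not from the paper's code:

```python
import torch
import torch.nn as nn

class MemoryLayer(nn.Module):
    """Toy memory layer: a large table of learnable values addressed by a
    top-k key lookup. Only the k retrieved slots receive gradient on any
    given step, so finetuning updates are naturally sparse."""
    def __init__(self, d_model: int, n_slots: int = 4096, k: int = 8):
        super().__init__()
        self.keys = nn.Parameter(torch.randn(n_slots, d_model) / d_model**0.5)
        self.values = nn.Embedding(n_slots, d_model)  # gradients touch only looked-up rows
        self.k = k

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, seq, d_model)
        scores = x @ self.keys.T                       # (batch, seq, n_slots)
        top_scores, top_idx = scores.topk(self.k, dim=-1)
        weights = torch.softmax(top_scores, dim=-1)    # (batch, seq, k)
        vals = self.values(top_idx)                    # (batch, seq, k, d_model)
        return (weights.unsqueeze(-1) * vals).sum(dim=-2)

def freeze_all_but_memory(model: nn.Module) -> list[nn.Parameter]:
    """Freeze the backbone; leave only memory-layer parameters trainable,
    so a continual-learning update writes only into the memory table."""
    for p in model.parameters():
        p.requires_grad = False
    trainable = []
    for m in model.modules():
        if isinstance(m, MemoryLayer):
            for p in m.parameters():
                p.requires_grad = True
                trainable.append(p)
    return trainable
```

Finetuning then optimizes only the returned parameters, e.g. torch.optim.AdamW(freeze_all_but_memory(model), lr=1e-4), so each gradient step rewrites a handful of memory slots instead of the whole network.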
Barlas Oğuz retweeted
Yen-Ju Lu @Yen_Ju_Lu
🚀 Introducing the Latent Speech-Text Transformer (LST) — a speech-text model that organizes speech tokens into latent patches for better text→speech transfer, enabling steeper scaling laws and more efficient multimodal training ⚡️
Paper 📄 arxiv.org/pdf/2510.06195
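One plausible reading of "organizing speech tokens into latent patches", sketched below: fold runs of consecutive speech-token embeddings into single latent patches so the transformer sees a shorter, more text-like sequence. The patch size, pooling scheme, and names are assumptions, not the paper's architecture:

```python
import torch
import torch.nn as nn

class SpeechPatcher(nn.Module):
    """Illustrative patcher: concatenate `patch_size` consecutive speech-token
    embeddings and project them back to model width, shortening the speech
    sequence by patch_size x. See the paper for the real LST scheme."""
    def __init__(self, d_model: int, patch_size: int = 4):
        super().__init__()
        self.patch_size = patch_size
        self.proj = nn.Linear(patch_size * d_model, d_model)

    def forward(self, speech_emb: torch.Tensor) -> torch.Tensor:
        # speech_emb: (batch, seq, d_model); seq assumed divisible by patch_size
        b, s, d = speech_emb.shape
        patches = speech_emb.reshape(b, s // self.patch_size, self.patch_size * d)
        return self.proj(patches)  # (batch, seq // patch_size, d_model)
```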
Barlas Oğuz retweeted
Jessy Lin @realJessyLin
🔍 How do we teach an LLM to 𝘮𝘢𝘴𝘵𝘦𝘳 a body of knowledge? In new work with @AIatMeta, we propose Active Reading 📙: a way for models to teach themselves new things by self-studying their training data. Results:
* 𝟔𝟔% on SimpleQA with an 8B model by studying the Wikipedia docs (+𝟑𝟏𝟑% vs. plain finetuning)
* a domain-specific expert model: 𝟏𝟔𝟎% vs. FT on FinanceBench knowledge
* an 8B Wikipedia expert competitive with 405B on factuality (💥 open-sourced!)
🧵 [1/n]
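A toy sketch of the self-study recipe as the tweet describes it: prompt the model to study each document in several ways, then finetune on the generated material (the finetuning step is omitted). The strategy prompts and function names are invented for illustration, not taken from the paper:

```python
from typing import Callable

# Hypothetical study strategies; the paper's actual strategies may differ.
STUDY_STRATEGIES = [
    "Rewrite the passage in your own words.",
    "Write three question-answer pairs that test the key facts.",
    "Explain how the facts in this passage relate to what you already know.",
]

def active_reading_corpus(docs: list[str], llm: Callable[[str], str]) -> list[str]:
    """Have the model 'self-study' each document with varied strategies;
    the outputs become synthetic finetuning data for the same model."""
    study_data = []
    for doc in docs:
        for strategy in STUDY_STRATEGIES:
            prompt = f"Passage:\n{doc}\n\nStudy task: {strategy}"
            study_data.append(llm(prompt))
    return study_data
```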
Barlas Oğuz retweeted
Jessy Lin @realJessyLin
And understanding how to teach models new things is increasingly important – not just for training capable specialized models (e.g. Active Reading as a practical technique for training personalized/expert models), but looking towards a continual learning paradigm where models keep acquiring new skills/knowledge through interaction with the world.
Retrieval / text-based memory works fine today, but to have models that keep getting smarter over time we need to figure out parametric methods for memory. When an agent encounters a new piece of experience, how can it update on it effectively to build up its knowledge/skills over time, like humans do? (Maybe the first step: active reading!)
🧵 [9/n]
Barlas Oğuz retweeted
Jason Weston @jaseweston
...is today a good day for new paper posts?
🤖 Learning to Reason for Factuality 🤖
📝: arxiv.org/abs/2508.05618
- New reward function for GRPO training of long CoTs for *factuality*
- Design stops reward hacking by favoring precision, detail AND quality
- Improves the base model across all axes
🧵 1/3
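A hedged sketch of the idea in the bullets: a composite reward over precision, detail, and quality, plugged into GRPO's group-relative advantage normalization. The linear form, weights, and scorer inputs are assumptions, not the paper's reward function:

```python
import statistics

def factuality_reward(precision: float, detail: float, quality: float,
                      w_p: float = 1.0, w_d: float = 0.5, w_q: float = 0.5) -> float:
    """Hypothetical composite reward: score a long CoT answer on factual
    precision, level of detail, and overall quality together, so the policy
    can't reward-hack by going terse (precise but empty) or verbose
    (detailed but wrong)."""
    return w_p * precision + w_d * detail + w_q * quality

def grpo_advantages(rewards: list[float]) -> list[float]:
    """GRPO-style group-relative advantages: normalize rewards across a
    group of responses sampled for the same prompt."""
    mu = statistics.mean(rewards)
    sigma = statistics.pstdev(rewards) or 1.0  # guard against zero spread
    return [(r - mu) / sigma for r in rewards]
```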
Barlas Oğuz retweeted
Guan Wang @makingAGI
🚀 Introducing the Hierarchical Reasoning Model 🧠🤖
Inspired by the brain's hierarchical processing, HRM delivers unprecedented reasoning power on complex tasks like ARC-AGI and expert-level Sudoku using just 1k examples, with no pretraining or CoT!
Unlock the next AI breakthrough with neuroscience. 🌟
📄 Paper: arxiv.org/abs/2506.21734
💻 Code: github.com/sapientinc/HRM
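A rough sketch of the two-timescale recurrence the tweet alludes to: a slow high-level module sets context for a fast low-level module that iterates several inner steps. Cell types, dimensions, and step counts here are assumptions; see the linked code for the real architecture:

```python
import torch
import torch.nn as nn

class TwoTimescaleReasoner(nn.Module):
    """Illustrative HRM-style recurrence: the low-level module refines its
    state for several fast steps under a fixed high-level context, then the
    high-level module takes one slow, abstract update from the result."""
    def __init__(self, d: int = 128, inner_steps: int = 4):
        super().__init__()
        self.high = nn.GRUCell(d, d)      # slow, abstract planning
        self.low = nn.GRUCell(2 * d, d)   # fast, detailed computation
        self.inner_steps = inner_steps

    def forward(self, x: torch.Tensor, outer_steps: int = 8) -> torch.Tensor:
        # x: (batch, d) encoded puzzle input
        b, d = x.shape
        h_high = x.new_zeros(b, d)
        h_low = x.new_zeros(b, d)
        for _ in range(outer_steps):
            for _ in range(self.inner_steps):
                h_low = self.low(torch.cat([x, h_high], dim=-1), h_low)
            h_high = self.high(h_low, h_high)  # one slow update per outer step
        return h_high
```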
Barlas Oğuz retweeted
Gargi Ghosh @gargighosh
Last one of the year - EWE: arxiv.org/pdf/2412.18069
Ewe (Explicit Working Memory) enhances factuality in long-form text generation by integrating a working memory that receives real-time feedback from external resources.
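A minimal sketch of the loop as described, assuming chunked decoding with an external fact-checker whose notes accumulate in a working memory; the callables and loop structure are illustrative, not the paper's implementation:

```python
from typing import Callable

def generate_with_working_memory(
    prompt: str,
    generate_chunk: Callable[[str, list[str]], str],  # model step: context + memory -> next passage
    check_facts: Callable[[str], str],                # external verifier: passage -> feedback note
    n_chunks: int = 4,
) -> str:
    """Generate long-form text passage by passage, conditioning each new
    passage on an explicit working memory of verifier feedback so factual
    errors can be corrected mid-generation."""
    memory: list[str] = []
    output: list[str] = []
    for _ in range(n_chunks):
        passage = generate_chunk(prompt + "".join(output), memory)
        memory.append(check_facts(passage))  # real-time feedback goes into memory
        output.append(passage)
    return "".join(output)
```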
Barlas Oğuz retweeted
AI at Meta @AIatMeta
New research from Meta FAIR — Memory Layers at Scale. This work takes memory layers beyond proof-of-concept, proving their utility at contemporary scale ➡️ go.fb.me/3lbt4m