Michael Bendersky

518 posts

@bemikelive

@ Databricks / ex-Google DeepMind / ex-Google Research Interested in research at the intersection of IR & AI.

Joined May 2010
389 Following, 1.8K Followers
Michael Bendersky @bemikelive
We just published OfficeQA Pro - a set of 133 challenging questions from the original OfficeQA benchmark. Even the best frontier agents still struggle on OfficeQA Pro with common issues stemming from errors in parsing, retrieval, and visual reasoning.
Michael Bendersky @bemikelive
I thought about posting a thread on KARL, a new Pareto-optimal model for retrieval and grounded reasoning tasks. But @jefrankle did a much better job than I ever could. If you have any interest in information retrieval and/or RL, check it out! Full report: databricks.com/sites/default/…
Jonathan Frankle @jefrankle

Meet KARL, an RL'd model for document-centric tasks at frontier quality and open source cost/speed. Great for @databricks customers and scientists (77-page tech report!) As usual, this isn't just one model - it's an RL assembly line to churn out models for us and our customers 🧵

Michael Bendersky reposted
Matei Zaharia @matei_zaharia
Agent memory is a simple and powerful way to do continual learning! With the new MemAlign method from Databricks Research, we can build better LLM judges from examples of human ratings, and they scale with more data. Now in Databricks and @MLflow. databricks.com/blog/memalign-…
Michael Bendersky @bemikelive
Instructed retriever is not just better than RAG, but it is also a much more effective tool in a multi-step agentic setting, where it not only delivers better results, but also does it faster and in fewer steps.
Michael Bendersky @bemikelive
@mrdrozdov @jeffreyhuber "Some people, when confronted with a problem, think 'I know, I’ll use 𝚛̶𝚎̶𝚐̶𝚞̶𝚕̶𝚊̶𝚛̶ ̶𝚎̶𝚡̶𝚙̶𝚛̶𝚎̶𝚜̶𝚜̶𝚒̶𝚘̶𝚗̶ search.' Now they have two problems."
Jeff Huber @jeffreyhuber
compaction? search problem
memory? search problem
tool selection? search problem
observability? believe it or not, search problem
Michael Bendersky @bemikelive
If you are excited about the intersection of reinforcement learning and highly complex, economically valuable tasks, I can't think of a better place to spend the summer of 2026!
Jonathan Frankle @jefrankle

I'm hiring interns for next summer at @databricks! Specifically on (1) empirical RL at scale on non-verifiable tasks and (2) enabling real people to specify the behaviors they want out of AI (e.g., through evals) on highly complex tasks. 🧵

Michael Bendersky @bemikelive
Big thanks to the entire @databricks AI Research team, and our partners SuperAnnotate, Turing and USAFacts!