Michael Bendersky

518 posts

@bemikelive

@ Databricks / ex-Google DeepMind / ex-Google Research. Interested in research at the intersection of IR & AI.

Joined May 2010

389 Following · 1.8K Followers

Pinned Tweet

Michael Bendersky@bemikelive·7 Jan

Really excited to share our latest research on the Instructed Retriever - a novel retrieval architecture that reimagines search for the agentic era. databricks.com/blog/instructe… Amazing work by @cindyxinyiwang and @mrdrozdov who co-led this effort!

2.8K

Michael Bendersky@bemikelive·10 Mar

Congratulations to @kristahopsalong @arnav_thebigman @jazco @ivanzhouyq Erich Elsen @matei_zaharia and everyone at @DbrxMosaicAI who made this work possible! Special thanks to our partners @USAFacts @superannotate @turingcom and to all GitHub contributors!

188

Michael Bendersky@bemikelive·10 Mar

All of these are realistic problems that @databricks customers face in their daily work, and we hope that OfficeQA Pro will contribute to advancing SoTA on grounded reasoning tasks. Technical Report: arxiv.org/pdf/2603.08655 Github: github.com/databricks/off…

222

Michael Bendersky@bemikelive·10 Mar

We just published OfficeQA Pro - a set of 133 challenging questions from the original OfficeQA benchmark. Even the best frontier agents still struggle on OfficeQA Pro with common issues stemming from errors in parsing, retrieval, and visual reasoning.

2.2K

Michael Bendersky@bemikelive·5 Mar

This was an incredibly fun collaboration with @j_nadan_chang @mrdrozdov @ShubhamToshniw6 @owenoertell @alexrtrott @WenSun1 @jefrankle and many others here at Databricks AI Research.

197

Michael Bendersky@bemikelive·5 Mar

I thought about posting a thread on KARL, a new Pareto-optimal model for retrieval and grounded reasoning tasks. But @jefrankle did a much better job than I ever could. If you have any interest in information retrieval and/or RL, check it out! Full report: databricks.com/sites/default/…

Jonathan Frankle@jefrankle

Meet KARL, an RL'd model for document-centric tasks at frontier quality and open source cost/speed. Great for @databricks customers and scientists (77-page tech report!) As usual, this isn't just one model - it's an RL assembly line to churn out models for us and our customers 🧵

2.3K

Michael Bendersky retweeted

Matei Zaharia@matei_zaharia·4 Feb

Agent memory is a simple and powerful way to do continual learning! With the new MemAlign method from Databricks Research, we can build better LLM judges from examples of human ratings, and they scale with more data. Now in Databricks and @MLflow. databricks.com/blog/memalign-…

233

18.3K

Michael Bendersky@bemikelive·7 Jan

Instructed retriever is now available for all of our Agent Bricks Knowledge Assistant customers. Consider trying it out for your next retrieval agent project. docs.databricks.com/aws/en/generat…

150

Michael Bendersky@bemikelive·7 Jan

Instructed retriever is not just better than RAG, but it is also a much more effective tool in a multi-step agentic setting, where it not only delivers better results, but also does it faster and in fewer steps.

183

Michael Bendersky@bemikelive·19 Dec

@mrdrozdov @jeffreyhuber "Some people, when confronted with a problem, think 'I know, I’ll use 𝚛̶𝚎̶𝚐̶𝚞̶𝚕̶𝚊̶𝚛̶ ̶𝚎̶𝚡̶𝚙̶𝚛̶𝚎̶𝚜̶𝚜̶𝚒̶𝚘̶𝚗̶ search.' Now they have two problems."

Andrew Drozdov@mrdrozdov·18 Dec

@jeffreyhuber generation? search problem!

949

Jeff Huber@jeffreyhuber·18 Dec

compaction? search problem
memory? search problem
tool selection? search problem
observability? believe it or not, search problem

325

30.4K

Michael Bendersky@bemikelive·19 Dec

If you are excited about the intersection of reinforcement learning and highly complex, economically valuable tasks -- I can't think of a better place to spend the summer of 2026!

Jonathan Frankle@jefrankle

I'm hiring interns for next summer at @databricks! Specifically on (1) empirical RL at scale on non-verifiable tasks and (2) enabling real people to specify the behaviors they want out of AI (e.g., through evals) on highly complex tasks. 🧵

240

Michael Bendersky@bemikelive·9 Dec

Big thanks to the entire @databricks AI Research team, and our partners SuperAnnotate, Turing and USAFacts!

148

Michael Bendersky@bemikelive·9 Dec

Huge congratulations to @kristahopsalong and @arnav_thebigman who spearheaded this work, and all our co-authors @jazco, @ivanzhouyq, @cindyxinyiwang, @abaheti95, @JacobianNeuro, @sam_havens, Erich Elsen, @matei_zaharia and Xing Chen!

181

Michael Bendersky@bemikelive·9 Dec

We released OfficeQA today -- a hard benchmark for evaluating agents on grounded reasoning tasks. More details in our blog databricks.com/blog/introduci… and in the thread below.

2.4K
