Thomas Dhome-Casanova

26 posts

@swayingoak

Foundational model training @MicrosoftAI, previously @MSFTResearch @Princeton

Joined January 2020
244 Following · 171 Followers
Thomas Dhome-Casanova retweeted
Corbin Rosset (@corby_rosset)
How do you tell if a computer use agent actually succeeded? It’s really two questions: did it execute well (process), and did the user actually get what they asked for (outcome)? Introducing the Universal Verifier 🧵
Thomas Dhome-Casanova retweeted
Mustafa Suleyman (@mustafasuleyman)
Our new image generator MAI-Image-2 is out! Available now on MAI Playground for everything from lifelike realism to detailed infographics. Our team has been pushing immensely hard for this release, and we are now among the top models out there: #3 family on @arena. Check out the details in our blog: microsoft.ai/news/introduci… It's shipping soon in Copilot and Bing Image Creator, as well as Microsoft Foundry. Really proud of our progress on models and products - stay tuned for new releases and come join us on our Superintelligence mission!
Thomas Dhome-Casanova (@swayingoak)
Off to NeurIPS with the Microsoft AI Superintelligence team. DM me for a coffee or come to the Microsoft booth to meet the team!
Thomas Dhome-Casanova retweeted
Sky (@skybysoftware)
Introducing Sky for Mac. Here's a 90-second sneak peek:
Thomas Dhome-Casanova retweeted
Sherjil Ozair (@sherjilozair)
Today I'm launching my new company @GeneralAgentsCo and our first product. Introducing Ace: The First Realtime Computer Autopilot. Ace is not a chatbot. Ace performs tasks for you. On your computer. Using your mouse and keyboard. At superhuman speeds!
Thomas Dhome-Casanova (@swayingoak)
OmniParser v2 is cooking! #1 on GitHub, #3 on Product Hunt. Go try it out
Thomas Dhome-Casanova retweeted
Gradio (@Gradio)
Microsoft's OmniParser V2 just dropped. Now AI can understand every button, identify all interactive elements, and structure the entire GUI. Pure vision-based parsing. GUI automation just got a lot easier 🎯
Thomas Dhome-Casanova retweeted
Chubby♨️ (@kimmonismus)
x.com/_akhaliq/statu… OmniParser V2 can turn any LLM into an agent capable of using a computer! You can enable GPT-4o, DeepSeek R1, Sonnet 3.5, Qwen to understand what's on your screen and take actions. 100% free & open source. Microsoft cooked!
Thomas Dhome-Casanova retweeted
AK (@_akhaliq)
Microsoft just dropped OmniParser V2, looks incredible. Turning Any LLM into a Computer Use Agent
Thomas Dhome-Casanova (@swayingoak)
🤖 Your LLM agent only needs 1 tool – an operating system. Introducing OmniTool from Microsoft Research. Use any app in Windows by pairing OmniParser V2 with your favourite LLM (GPT4o, O1, DeepSeek R1 or Qwen 2.5VL).
Logan Kilpatrick (@OfficialLoganK)
We just shipped the ability to use fine-tuned Gemini models via API key and share them with additional projects in Google AI Studio and the Gemini API! 🚢