Tyler LaBonte

940 posts

@tmlabonte

ML PhD student @GeorgiaTech, Math BS @USC. Deep learning theory, generalization, robustness.

Atlanta, GA · Joined December 2019
702 Following · 894 Followers
Pinned Tweet
Tyler LaBonte@tmlabonte·
Excited to present at the first #AISTATS2025 poster session on May 3! Ever wondered how LLMs can generalize to new tasks in-context despite only training on token completion? We formalize this phenomenon as "task shift" and investigate a linear version: arxiv.org/abs/2502.13285
Tyler LaBonte@tmlabonte·
@iamwaynechi Can't wait for more games in various shades of red! ("rougelikes"... ok I'll see myself out)
Tyler LaBonte retweeted
Microsoft Research@MSFTResearch·
Multimodal reasoning with Phi-4-reasoning-vision, new work on scaling LLM inference, benchmarking AI agents in network operations, cinematic video generation, adaptive evaluation for LLMs, and using AI to improve individual and population health. msft.it/6013QiQgx
Tyler LaBonte@tmlabonte·
It's been the privilege of my career to help build the newest Phi series model from @MSFTResearch! Phi-4-reasoning-vision-15B is open-weight & competitive on perf with 10X less compute/tokens. Read the blog for math and CUA case studies, hybrid reasoning, data insights, & more!
Microsoft Research@MSFTResearch

Vision-language models improve multimodal systems, but can make them slower, costlier, and harder to deploy. Learn how Phi-4-reasoning-vision-15B, a compact and fast multimodal reasoning model, blends strengths of different methods while reducing their limits: msft.it/6014Q5X0u

Behnam Neyshabur@bneyshabur·
I've left Anthropic to start something new. 🧵
Tyler LaBonte@tmlabonte·
Finally, thanks to @Kangwook_Lee's "Tenure Track Simulator" post for inspiring me to make the game public and write this up!
Tyler LaBonte@tmlabonte·
Misc takeaways:
• Copilot + GitHub was far more useful than I expected
• Keeping code style consistent across humans + agents is painful
• Overall: Claude was best for agentic coding; Gemini best for interactive pair-programming
Tyler LaBonte@tmlabonte·
Over the holidays, I stress-tested the AI coding hype by doing something concrete: I built a college football simulator game from scratch to see if agents actually deliver. Here’s what I learned 👇
Tyler LaBonte@tmlabonte·
@liyzhen2 What are the other workshops? I couldn't find them on the AISTATS website
Tyler LaBonte@tmlabonte·
@marikgoldstein Yes definitely, though it's better than 6mo-1yr ago. I find rewording in a more objective way helps, e.g., "Prove whether f(x) is O(n)" instead of "Prove that f(x) is O(n)". Also for anything important (i.e., research) I ask both GPT and Gemini and compare, then verify myself.
Mark Goldstein@marikgoldstein·
Suppose that you are trying to prove XYZ with GPT. If in the chat you show some desire for the claim to be true, have you noticed GPT making mistakes frequently? so to speak, maybe prioritizing affirmation over correctness? If so, beyond inserting "I might be wrong", what helps?
Tyler LaBonte@tmlabonte·
Fara has been one of the most exciting projects to watch evolve @MSFTResearch over the last few months. From my perspective, Fara is a real advance towards natively multimodal computer-use agents (e.g., no accessibility trees). Congrats to Corby and the team on the release!
Corbin Rosset@corby_rosset

Microsoft just dropped Fara-7B, its first on-device AI agent that can use your computer just like you would: it clicks, types, fills out forms, and completes tasks just by “seeing” the screen. It’s best-in-class in terms of accuracy and cost, from yours truly at Microsoft AI Frontiers, and you can use it today.

Tim Davidson@im_td·
what is the em-dash equivalent for AI generated code?