Bryon Kucharski

67 posts

@bryonkuchML

Search & LLMs at @Gartner_inc | MS from @manningcics | BS from @WITEngineering

Connecticut, USA · Joined November 2021

987 Following · 111 Followers

Bryon Kucharski @bryonkuchML · 18 Feb

@katelyn_lesse Any detail you can go into about this code-generated “post-processing” of search results? I'm having a hard time understanding what code the model actually generates that can be used as a relevance signal.

Katelyn Lesse @katelyn_lesse · 17 Feb

sonnet 4.6 is available today. we continue to be bullish on the power of code execution, so we leaned in with our new programmatic web search & fetch tools. sonnet 4.6 saw 13% higher accuracy on BrowseComp while using 32% fewer input tokens. claude.com/blog/improved-…

Bryon Kucharski @bryonkuchML · 25 Jan

@willccbb love it! Let me know if you'd ever want to hear more about the pain points I've had with RLTrainer; that could potentially help articulate why prime-rl is so great!

will brown @willccbb · 25 Jan

@bryonkuchML yeah! been wanting to make a video + accompanying blog walking through this stuff for a while (one day). for starters, will probably just look like fleshing out the verifiers / prime-rl docs with more conceptual walkthroughs

will brown @willccbb · 25 Jan

prob gonna deprecate vf.RLTrainer soon and move it to a demo folder in the repo. there’s no good reason to use it over prime-rl. it’s purely for educational purposes as a 1000-LOC example

Bryon Kucharski @bryonkuchML · 23 Dec

@johnowhitaker Would also ditto your comments about the value prop of Tinker 😁 The majority of my time here was spent on GPU- and infra-related setup. Definitely spent way more than 30 cents too! But verifiers/prime-rl is solving a different problem.

Jonathan Whitaker @johnowhitaker · 23 Dec

@bryonkuchML Very cool!

Jonathan Whitaker @johnowhitaker · 17 Dec

OK I had to record a quick video and share a dialog showing my first few tests: youtube.com/watch?v=yId2PE… Dialog: share.solve.it.com/d/e52b8889b9d3… In the video, I show how easy it can be to train a model on a custom task with your own reward function. LMK what I should try next :)

Jonathan Whitaker @johnowhitaker

First impressions of Tinker: I can tinker with LLMs again! Really liking it so far - you can focus on the data and *what* you want to DO, not the stress of distributed training, model loading, arcane incantations, implementation differences, library bugs... Amazing work @thinkymachines team <3

Bryon Kucharski reposted

Bryon Kucharski @bryonkuchML · 23 Dec

@johnowhitaker I really wanted to tinker with verifiers so I made this into an environment! app.primeintellect.ai/dashboard/envi… Was able to train Qwen/Qwen3-0.6B on various difficulty levels (easy=2 numbers, medium=4 numbers, hard=6 numbers)

Bryon Kucharski @bryonkuchML · 23 Dec

@johnowhitaker here's my exact training TOML. Used RLTrainer() for the training run on an ml.g5.24xlarge. Obligatory thanks to @willccbb and Prime Intellect

Bryon Kucharski @bryonkuchML · 23 Dec

@johnowhitaker Worked pretty well - very fun learning environment. Thanks for the idea @johnowhitaker

Bryon Kucharski reposted

Gartner @Gartner_inc · 19 Nov

With AskGartner, Gartner clients get fast answers, tailored outputs and the confidence to take action in seconds. Learn more about what AskGartner can do for you: gtnr.it/4hZ6H22 #AskGartner #TrustedAI #Business #Technology #Insights

Bryon Kucharski @bryonkuchML · 14 Nov

@allisontam_ I miss when I got to learn about these new tricks from a paper ☹️

Allison Tam @allisontam_ · 14 Nov

for the first time 5.1 instant uses adaptive reasoning when responding. shipping this led the team down some pretty fun ML rabbit holes! if you’re an RL nerd, please reach out 🙂 I’ll be at neurips looking for folks to join the Science of Posttraining team

OpenAI @OpenAI

GPT-5.1 in ChatGPT is rolling out to all users this week. It’s smarter, more reliable, and a lot more conversational. openai.com/index/gpt-5-1

Bryon Kucharski @bryonkuchML · 26 Aug

@jxnlco @Anthropic @ivanleomk I love your package and the procedural API. I'm having some issues setting up the UI. It seems I'm not able to load my JSONL checkpoints properly. Is this a known issue?

jason liu @jxnlco · 2 Jun

kura 0.5.0 is out, it encapsulates a lot of what @anthropic's clio does as well as what we teach in improvingrag.com usekura.xyz/blog/2025/05/2…

Bryon Kucharski @bryonkuchML · 21 Aug

@willccbb That’d be sweet. I’m trying to figure out what people normally do for deep research type systems. Any idea?

will brown @willccbb · 20 Aug

they should make an App Store for Verifiable Rewards

Bryon Kucharski reposted

ClearSight AI @ClearSightAI · 5 Aug

$IT Q2 FY25 Earnings 📊 ✅ Revenue: $1.686B (beat) ✅ Adj. EPS: $3.53 (beat) ✅ GAAP EPS: $3.11 (beat) 📈 EPS up 9.6% YoY 🤖 Rolled out new AI tool “AskGartner” 🔁 $700M boost to share repurchase plan 📊 Guidance updated for FY25 #Earnings #Tech #AI

Bryon Kucharski @bryonkuchML · 15 Jul

@cthorrez @lmarena_ai Happy for you! Best of luck

Clayton Thorrez @cthorrez · 15 Jul

Extremely excited to announce that I've joined @lmarena_ai! For years I've been working in LLMs for my job, and hacking on rankings and ratings for fun, beyond thrilled to be able to join this project at the intersection!

Bryon Kucharski @bryonkuchML · 9 Apr

@yoavgo I have been playing with this and love it, thank you so much for all the hard work and the public demo! Do you have any more information regarding an ETA for the code? I know the blog mentions some copyright challenges

(((ل()(ل() 'yoav))))👾 @yoavgo · 26 Mar

we've been working on this for a while, and i find it very useful. try it out. (consider this to be v0, more to come!)

Ai2 @allen_ai

Meet Ai2 Paper Finder, an LLM-powered literature search system. Searching for relevant work is a multi-step process that requires iteration. Paper Finder mimics this workflow — and helps researchers find more papers than ever 🔍

Bryon Kucharski @bryonkuchML · 2 Apr

@thomazvu I love this paper and your idea, how can I help?

thomas @thomazvu · 13 Mar

this paper was so cool but they didn’t go far enough. introduce money, equip each agent with an inventory, let them trade and scam each other. let them fall in love, form alliances and fight each other, put them in the hunger games. who’s working on this? DM me

Bryon Kucharski @bryonkuchML · 22 Dec

@antoine_chaffin @jobergum Any reason why you left out fine-tuned (in-domain) ColBERT models on MLDR?

Antoine Chaffin @antoine_chaffin · 19 Dec

@jobergum I am mostly speaking about the results on BEIR. On MLDR, even if the training on MS MARCO is different, ColBERT (with long context models) outperforms even fine-tuned (in domain) dense models, so this is not even a fight here

Jo Kristian Bergum @jobergum · 19 Dec

Thoughts on ModernBERT and retrieval! What stands out to me here is the difference in effectiveness between a single-dense representation (DPR) versus ColBERT. Especially on long-context (MLDR).

Bryon Kucharski @bryonkuchML · 23 Nov

@jsuarez @emollick I’m interested, say more!

Joseph Suarez 🐡 @jsuarez · 23 Nov

@emollick We have NetHack running 130k steps/second in PufferLib. I love this env and wish we had more contributors interested in working on it. We can probably get farther than current LLMs with a 2M param model + RL

Ethan Mollick @emollick · 23 Nov

This may sound odd, but game-based benchmarks are some of the most useful for AI, since we have human scores and they require reasoning, planning & vision. The hardest of all is NetHack. No AI is close, and I suspect that an AI that can fairly win/ascend would need to be AGI-ish.

Bryon Kucharski @bryonkuchML · 18 Oct

@jobergum I get something similar when I try so seems reasonable 😀

Bryon Kucharski @bryonkuchML · 18 Oct

@jobergum If you're referring to the chat system prompt, this tweet might be helpful x.com/JauniusKadunas…

Jaunius Kadunas @jaunius

NotebookLM system prompt is easier to get than most think!

Jo Kristian Bergum @jobergum · 18 Oct

NotebookLM is a great LLM product with a dorky name. Would love to see the system prompt as I have tried their aistudio for similar workflows without being impressed.
