Jay Shim

41 posts

@jayjshim

Undergrad RL Researcher @ UT Austin | Sharing what I learn about RL/ML

Austin · Joined December 2025
36 Following · 7 Followers
Jay Shim @jayjshim
@kylelostat @allen_ai When you came to UT Austin and gave a talk about Olmo data distributions, I was really inspired by the work you've done! Thank you for the talk and I look forward to speaking with you in the future! Best of luck!
0 replies · 0 reposts · 0 likes · 91 views
Kyle Lo @kylelostat
Today I'm saying farewell to @allen_ai. I'm so proud of our team & grateful to have shared fully-open Olmo, Dolma, olmOCR, Molmo, etc with the world. I know the team is more committed than ever to advancing open-source & open-science. Forever rooting for my dear friends 🫶
50 replies · 12 reposts · 430 likes · 20.8K views
Jay Shim reposted
Jiaheng Hu @JiahengHu1
VLA models are capable generalists. But can they continually self-improve? Such Continual Reinforcement Learning (CRL) problems are traditionally considered very challenging. Surprisingly, we found that with the right setup, the simplest CRL recipe can work really well! arxiv.org/abs/2603.11653
7 replies · 50 reposts · 268 likes · 43.5K views
Jay Shim @jayjshim
@trq212 Have you seen Claude "game the system" by pushing lots of useless code/comments to boost its own metrics? Or is that specifically de-emphasized?
0 replies · 0 reposts · 0 likes · 185 views
Thariq @trq212
We've launched Claude Code contribution metrics to help you track PRs and lines of code contributed with the help of Claude Code.
68 replies · 49 reposts · 908 likes · 406.9K views
Jay Shim @jayjshim
@bcherny Super exciting! Has there been any major noticeable differences in persona or output quality when using the fast mode?
0 replies · 0 reposts · 1 like · 214 views
Jay Shim @jayjshim
The real bottleneck for AI in medicine might be human trust, not technical capabilities. Even if an AI hospital had higher survival rates, many of us would still hesitate. What would it actually take for people to trust AI in high-stakes settings?
0 replies · 0 reposts · 0 likes · 34 views
Jay Shim @jayjshim
Realizing that building AI for healthcare means first working on safety reshaped how I think about my path. @DarioAmodei's essay on powerful AI left me with both awe at what's coming and urgency about getting it right, so I wrote down my thoughts. shimboi.hashnode.dev/the-reach-of-ai
1 reply · 0 reposts · 0 likes · 50 views
Jay Shim @jayjshim
@bcherny Super cool tips! Using diction for prompts was a surprising tip that I hadn't even considered before. On a similar note, what specific keywords/phrases have you found boost performance significantly when written in the Claude.md, even more than you expected?
0 replies · 0 reposts · 0 likes · 618 views
Boris Cherny @bcherny
I'm Boris and I created Claude Code. I wanted to quickly share a few tips for using Claude Code, sourced directly from the Claude Code team. The way the team uses Claude is different than how I use it. Remember: there is no one right way to use Claude Code -- everyone's setup is different. You should experiment to see what works for you!
925 replies · 5.9K reposts · 50.9K likes · 9.1M views
Jay Shim @jayjshim
@bcherny Congrats on the successful usage! In hindsight, were there any specific areas you thought needed more safety-proofing? If Claude provided incorrect simulation/planning it could've been disastrous right?
0 replies · 0 reposts · 0 likes · 175 views
Jay Shim @jayjshim
@karpathy The sentiment that you can't compete with big names isn't a de-motivator to me; rather, it feels like a challenge to outcompete them even without the same resources or connections
0 replies · 0 reposts · 0 likes · 155 views
Andrej Karpathy @karpathy
A conventional narrative you might come across is that AI is too far along for a new, research-focused startup to outcompete and outexecute the incumbents of AI. This is exactly the sentiment I listened to often when OpenAI started ("how could the few of you possibly compete with Google?") and 1) it was very wrong, and then 2) it was very wrong again with a whole other round of startups who are now challenging OpenAI in turn, and imo it still continues to be wrong today.

Scaling and locally improving what works will continue to create incredible advances, but with so much progress unlocked so quickly, with so much dust thrown up in the air in the process, and with still a large gap between frontier LLMs and the example proof of the magic of a mind running on 20 watts, the probability of research breakthroughs that yield closer to 10X improvements (instead of 10%) imo still feels very high - plenty high to continue to bet on and look for.

The tricky part ofc is creating the conditions where such breakthroughs may be discovered. I think such an environment comes together rarely, but @bfspector & @amspector100 are brilliant, with (rare) full-stack understanding of LLMs top (math/algorithms) to bottom (megakernels/related), they have a great eye for talent, and I think they will be able to build something very special. Congrats on the launch and I look forward to what you come up with!
Flapping Airplanes @flappyairplanes

Announcing Flapping Airplanes! We’ve raised $180M from GV, Sequoia, and Index to assemble a new guard in AI: one that imagines a world where models can think at human level without ingesting half the internet.

251 replies · 502 reposts · 8.1K likes · 1.2M views
Jay Shim @jayjshim
On my TODO list is figuring out how to get multi-node pytorch training working with ray, FSDP, Huggingface, etc. Will keep you updated on my progress
0 replies · 0 reposts · 0 likes · 39 views
Jay Shim @jayjshim
@DanielXieee @yukez What was the overall cost of developing something like this? It seems like a super cool project I'd like to try, but maybe there's a more cost-effective option
0 replies · 0 reposts · 0 likes · 40 views
Quanting Xie @DanielXieee
A few days ago we got into YC W26, and here is what we are working on. Building hardware is hard, but I really like a quote from @yukez: “People who are really serious about robot learning should make their own robot hardware.”
94 replies · 139 reposts · 1.3K likes · 121.2K views
Jay Shim @jayjshim
Anyone have some comprehensive resources for learning to use Claude Code? I've been seeing it everywhere on my feed and I'm excited to dive in
0 replies · 0 reposts · 0 likes · 39 views
Jay Shim @jayjshim
TIL: Forcing the model to output two tokens at opposite extremes for the gripper dimension doesn't destabilize training or make it harder for the model to learn, even though its bin size is >> 2. Feel free to let me know if you've seen instances that disagree
0 replies · 0 reposts · 0 likes · 37 views
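The two-extreme-token setup described above can be sketched as follows; the bin count, value range, and open/closed threshold here are illustrative assumptions, not the actual training configuration:

```python
def tokenize_gripper(g: float, n_bins: int = 256) -> int:
    """Map a continuous gripper command in [-1, 1] to one of two
    extreme bins, even though the action tokenizer has n_bins >> 2.

    Assumed convention: g < 0 means open -> bin 0,
    g >= 0 means closed -> bin n_bins - 1.
    """
    return 0 if g < 0.0 else n_bins - 1
```

For example, `tokenize_gripper(-0.3)` lands in bin 0 and `tokenize_gripper(0.7)` in bin 255, so the model only ever sees the two extreme tokens for this dimension.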
Jay Shim @jayjshim
Training run is looking a lot better now. Before, the loss decreased and accuracy somewhat increased, but the policy still somehow got close to 0% success on held-out tasks. The gripper fix seems to resolve most of the issue; there's still some shakiness, but I want to say that's from sampling.
0 replies · 0 reposts · 0 likes · 23 views
Jay Shim @jayjshim
Spent a few days debugging a policy and found that the dataset I'm training on requires the gripper dim to be negated and spread out to {-1,1}. Hopefully this helps at least one other person since it was hard for me to find the 3 lines of transformations in a giant codebase
1 reply · 0 reposts · 0 likes · 30 views
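A minimal sketch of the transformation described above, assuming the dataset stores the gripper value in [0, 1] and the policy expects a negated value spread over [-1, 1]; the exact convention will vary by dataset:

```python
def remap_gripper(g: float) -> float:
    """Negate and spread a gripper value from [0, 1] to [-1, 1].

    0.0 -> 1.0 and 1.0 -> -1.0; the sign flip is the 'negation'
    that the policy's action convention is assumed to expect.
    """
    return -(2.0 * g - 1.0)
```

Applied per-frame before tokenization, this is the kind of tiny transformation that is easy to miss inside a large codebase.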
Jay Shim @jayjshim
TIL: the LIBERO dataset suites have image observations that are upside down (vertically flipped). I guess LIBERO didn't run into this issue since they trained all their transformers from scratch?
0 replies · 0 reposts · 0 likes · 22 views
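The fix can be sketched as a flip along the height axis, assuming HWC image arrays; `fix_libero_obs` is a hypothetical helper name, not part of LIBERO:

```python
import numpy as np

def fix_libero_obs(img: np.ndarray) -> np.ndarray:
    """Undo an upside-down rendering by flipping an H x W x C image
    along its height axis (axis 0); the flip is its own inverse."""
    return img[::-1].copy()
```

Applying this once in the dataloader keeps the observations consistent with what most pretrained vision encoders expect.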
Jay Shim @jayjshim
So far, this issue still persists and it seems like Claude and GPT have issues pinpointing the problem as well, since the code "appears" correct. For now, I am going to try to reduce the amount of memory and see when exactly, if at all, the model gets offloaded
0 replies · 0 reposts · 0 likes · 20 views
Jay Shim @jayjshim
I've tried manually getting rid of potentially remaining gradients and blocking all threads until garbage collection executes, yet the issue still persists.
1 reply · 0 reposts · 0 likes · 23 views
Jay Shim @jayjshim
I'm currently debugging an issue with FSDP and offloading sharded model weights. For some reason, even after a CUDA synchronize plus a torch cache clear and Python garbage collection, on specific clusters the memory stays on the GPU. Let me know if anyone else has experienced this!
1 reply · 0 reposts · 0 likes · 27 views
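The cleanup sequence described in this thread can be sketched as below. This is a best-effort attempt: it assumes the offloaded shards are otherwise unreferenced, and (as the thread notes) it is not guaranteed to release memory on every cluster:

```python
import gc
import torch

def try_release_cuda_memory() -> int:
    """Best-effort GPU memory release after offloading FSDP shards.
    Returns bytes still allocated afterwards (0 on CPU-only hosts)."""
    gc.collect()                      # drop unreachable tensors first
    if not torch.cuda.is_available():
        return 0
    torch.cuda.synchronize()          # wait for in-flight kernels
    torch.cuda.empty_cache()          # return cached blocks to the driver
    return torch.cuda.memory_allocated()
```

Note that `empty_cache()` only releases PyTorch's *cached* blocks; tensors still referenced anywhere (optimizer state, autograd graph, a stray Python variable) will keep the allocation alive, which is one reason the memory can appear "stuck".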
Jay Shim @jayjshim
@neelsomani Where do you think this solving capability is coming from? Is it producing creative proofs humans wouldn't think of, or simply applying known techniques in a way no one had tried?
0 replies · 0 reposts · 0 likes · 75 views
Neel Somani @neelsomani
Weekend win: The proof I submitted for Erdos Problem #397 was accepted by Terence Tao. The proof was generated by GPT 5.2 Pro and formalized with Harmonic. Many open problems are sitting there, waiting for someone to prompt ChatGPT to solve them:
338 replies · 1.2K reposts · 8.7K likes · 3.6M views