Jay Shim

41 posts

@jayjshim

Undergrad RL Researcher @ UT Austin | Sharing what I learn about RL/ML

Austin · Joined December 2025
36 Following · 7 Followers
Jay Shim @jayjshim
@kylelostat @allen_ai When you came to UT Austin and gave a talk about Olmo data distributions, I was really inspired by the work you've done! Thank you for the talk and I look forward to speaking with you in the future! Best of luck!
0 replies · 0 reposts · 0 likes · 91 views
Kyle Lo @kylelostat
Today I'm saying farewell to @allen_ai. I'm so proud of our team & grateful to have shared fully-open Olmo, Dolma, olmOCR, Molmo, etc with the world. I know the team is more committed than ever to advancing open-source & open-science. Forever rooting for my dear friends 🫶
50 replies · 12 reposts · 430 likes · 20.8K views
Jay Shim reposted
Jiaheng Hu @JiahengHu1
VLA models are capable generalists. But can they continually self-improve? Such Continual Reinforcement Learning (CRL) problems are traditionally considered very challenging. Surprisingly, we found that with the right setup, the simplest CRL recipe can work really well! arxiv.org/abs/2603.11653
7 replies · 50 reposts · 268 likes · 43.5K views
Jay Shim @jayjshim
@trq212 Have you seen Claude "game the system" by pushing lots of useless code/comments to boost its own metrics? Or is that specifically de-emphasized?
0 replies · 0 reposts · 0 likes · 185 views
Thariq @trq212
We've launched Claude Code contribution metrics to help you track PRs and lines of code contributed with the help of Claude Code.
68 replies · 49 reposts · 908 likes · 406.9K views
Jay Shim @jayjshim
@bcherny Super exciting! Has there been any major noticeable differences in persona or output quality when using the fast mode?
0 replies · 0 reposts · 1 like · 214 views
Jay Shim @jayjshim
The real bottleneck for AI in medicine might be human trust, not technical capabilities. Even if an AI hospital had higher survival rates, many of us would still hesitate. What would it actually take for people to trust AI in high-stakes settings?
0 replies · 0 reposts · 0 likes · 34 views
Jay Shim @jayjshim
Realizing that building AI for healthcare means first working on safety reshaped how I think about my path. @DarioAmodei's essay on powerful AI left me with both awe at what's coming and urgency about getting it right, so I wrote down my thoughts. shimboi.hashnode.dev/the-reach-of-ai
1 reply · 0 reposts · 0 likes · 50 views
Jay Shim @jayjshim
@bcherny Super cool tips! Using diction for prompts was a surprising tip that I hadn't even considered before. On a similar note, what specific keywords/phrases have you found boost performance significantly when written in the Claude.md, even more than you expected?
0 replies · 0 reposts · 0 likes · 618 views
Boris Cherny @bcherny
I'm Boris and I created Claude Code. I wanted to quickly share a few tips for using Claude Code, sourced directly from the Claude Code team. The way the team uses Claude is different than how I use it. Remember: there is no one right way to use Claude Code -- everyone's setup is different. You should experiment to see what works for you!
925 replies · 5.9K reposts · 50.9K likes · 9.1M views
Jay Shim @jayjshim
@bcherny Congrats on the successful usage! In hindsight, were there any specific areas you thought needed more safety-proofing? If Claude provided incorrect simulation/planning it could've been disastrous right?
0 replies · 0 reposts · 0 likes · 175 views
Jay Shim @jayjshim
@karpathy The sentiment that you can't compete with big names isn't a de-motivator to me; rather, it feels like a challenge to outcompete them even without the same resources or connections
0 replies · 0 reposts · 0 likes · 155 views
Andrej Karpathy @karpathy
A conventional narrative you might come across is that AI is too far along for a new, research-focused startup to outcompete and outexecute the incumbents of AI. This is exactly the sentiment I listened to often when OpenAI started ("how could the few of you possibly compete with Google?") and 1) it was very wrong, and then 2) it was very wrong again with a whole other round of startups who are now challenging OpenAI in turn, and imo it still continues to be wrong today.

Scaling and locally improving what works will continue to create incredible advances, but with so much progress unlocked so quickly, with so much dust thrown up in the air in the process, and with still a large gap between frontier LLMs and the example proof of the magic of a mind running on 20 watts, the probability of research breakthroughs that yield closer to 10X improvements (instead of 10%) imo still feels very high - plenty high to continue to bet on and look for.

The tricky part ofc is creating the conditions where such breakthroughs may be discovered. I think such an environment comes together rarely, but @bfspector & @amspector100 are brilliant, with (rare) full-stack understanding of LLMs top (math/algorithms) to bottom (megakernels/related), they have a great eye for talent, and I think they will be able to build something very special. Congrats on the launch and I look forward to what you come up with!
Flapping Airplanes @flappyairplanes

Announcing Flapping Airplanes! We’ve raised $180M from GV, Sequoia, and Index to assemble a new guard in AI: one that imagines a world where models can think at human level without ingesting half the internet.

251 replies · 502 reposts · 8.1K likes · 1.2M views
Jay Shim @jayjshim
On my TODO list is figuring out how to get multi-node pytorch training working with ray, FSDP, Huggingface, etc. Will keep you updated on my progress
0 replies · 0 reposts · 0 likes · 39 views
Jay Shim @jayjshim
@DanielXieee @yukez What was the overall cost of developing something like this? It seems like a super cool project I'd like to try, but maybe there's a more cost-effective option
0 replies · 0 reposts · 0 likes · 40 views
Quanting Xie @DanielXieee
A few days ago we got into YC W26, and here is what we are working on. Building hardware is hard, but I really like a quote from @yukez: “People who are really serious about robot learning should make their own robot hardware.”
94 replies · 139 reposts · 1.3K likes · 121.2K views
Jay Shim @jayjshim
Anyone have some comprehensive resources for learning to use Claude Code? I've been seeing it everywhere on my feed and I'm excited to dive in
0 replies · 0 reposts · 0 likes · 39 views
Jay Shim @jayjshim
TIL: Forcing the model to output two tokens at opposite extremes for the gripper dimension doesn't destabilize training or make it harder for the model to learn, even though its bin size is >> 2. Feel free to let me know if you've seen instances that disagree
0 replies · 0 reposts · 0 likes · 37 views
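The two-extreme-token setup described above can be sketched as follows; the bin count, value range, and open/closed threshold here are illustrative assumptions, not the actual training configuration:

```python
def tokenize_gripper(g: float, n_bins: int = 256) -> int:
    """Map a continuous gripper command in [-1, 1] to one of two
    extreme bins, even though the action tokenizer has n_bins >> 2.

    Assumed convention: g < 0 means open -> bin 0,
    g >= 0 means closed -> bin n_bins - 1.
    """
    return 0 if g < 0.0 else n_bins - 1
```

For example, `tokenize_gripper(-0.3)` lands in bin 0 and `tokenize_gripper(0.7)` in bin 255, so the model only ever sees the two extreme tokens for this dimension.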
Jay Shim @jayjshim
Training run is looking a lot better now. Before, the loss decreased and accuracy somewhat increased, but the policy still somehow got close to 0% success on held-out tasks. The gripper fix seems to resolve most of the issue; there's still some shakiness, but I want to say that's from sampling.
0 replies · 0 reposts · 0 likes · 23 views
Jay Shim @jayjshim
Spent a few days debugging a policy and found that the dataset I'm training on requires the gripper dim to be negated and spread out to {-1,1}. Hopefully this helps at least one other person since it was hard for me to find the 3 lines of transformations in a giant codebase
1 reply · 0 reposts · 0 likes · 30 views
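A minimal sketch of the transformation described above, assuming the dataset stores the gripper value in [0, 1] and the policy expects a negated value spread over [-1, 1]; the exact convention will vary by dataset:

```python
def remap_gripper(g: float) -> float:
    """Negate and spread a gripper value from [0, 1] to [-1, 1].

    0.0 -> 1.0 and 1.0 -> -1.0; the sign flip is the 'negation'
    that the policy's action convention is assumed to expect.
    """
    return -(2.0 * g - 1.0)
```

Applied per-frame before tokenization, this is the kind of tiny transformation that is easy to miss inside a large codebase.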
Jay Shim @jayjshim
TIL: the LIBERO dataset suites have image observations that are upside down (vertically flipped). I guess LIBERO didn't run into this issue since they trained all their transformers from scratch?
0 replies · 0 reposts · 0 likes · 22 views
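The fix can be sketched as a flip along the height axis, assuming HWC image arrays; `fix_libero_obs` is a hypothetical helper name, not part of LIBERO:

```python
import numpy as np

def fix_libero_obs(img: np.ndarray) -> np.ndarray:
    """Undo an upside-down rendering by flipping an H x W x C image
    along its height axis (axis 0); the flip is its own inverse."""
    return img[::-1].copy()
```

Applying this once in the dataloader keeps the observations consistent with what most pretrained vision encoders expect.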
Jay Shim @jayjshim
So far, this issue still persists and it seems like Claude and GPT have issues pinpointing the problem as well, since the code "appears" correct. For now, I am going to try to reduce the amount of memory and see when exactly, if at all, the model gets offloaded
0 replies · 0 reposts · 0 likes · 20 views
Jay Shim @jayjshim
I've tried manually getting rid of potentially remaining gradients and blocking all threads until garbage collection executes, yet the issue still persists.
1 reply · 0 reposts · 0 likes · 23 views
Jay Shim @jayjshim
I'm currently debugging an issue with FSDP and offloading sharded model weights. For some reason, even after a CUDA synchronize plus a torch cache clear and Python garbage collection, on specific clusters the memory stays on the GPU. Let me know if anyone else has experienced this!
1 reply · 0 reposts · 0 likes · 27 views
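The cleanup sequence described in this thread can be sketched as below. This is a best-effort attempt: it assumes the offloaded shards are otherwise unreferenced, and (as the thread notes) it is not guaranteed to release memory on every cluster:

```python
import gc
import torch

def try_release_cuda_memory() -> int:
    """Best-effort GPU memory release after offloading FSDP shards.
    Returns bytes still allocated afterwards (0 on CPU-only hosts)."""
    gc.collect()                      # drop unreachable tensors first
    if not torch.cuda.is_available():
        return 0
    torch.cuda.synchronize()          # wait for in-flight kernels
    torch.cuda.empty_cache()          # return cached blocks to the driver
    return torch.cuda.memory_allocated()
```

Note that `empty_cache()` only releases PyTorch's *cached* blocks; tensors still referenced anywhere (optimizer state, autograd graph, a stray Python variable) will keep the allocation alive, which is one reason the memory can appear "stuck".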
Jay Shim @jayjshim
@neelsomani Where do you think this solving capability is coming from? Is it producing creative proofs humans wouldn't think of, or simply applying known techniques in a way no one had tried?
0 replies · 0 reposts · 0 likes · 75 views
Neel Somani @neelsomani
Weekend win: The proof I submitted for Erdos Problem #397 was accepted by Terence Tao. The proof was generated by GPT 5.2 Pro and formalized with Harmonic. Many open problems are sitting there, waiting for someone to prompt ChatGPT to solve them:
338 replies · 1.2K reposts · 8.7K likes · 3.6M views