Joshua Yang (@RealJoshuaYang)
111 posts
CS @CarnegieMellon | @CMU_Robotics
Joined October 2023
395 Following · 52 Followers

Joshua Yang @RealJoshuaYang:
imo it's a very good indicator of intelligence - it's no surprise to me that LLM math benchmarks use such competitions and that quant companies specifically recruit their top performers. In my experience, the best performers in class always come from a competition background. That said, there's a difference between doing well in these comps and being at the top: doing well means you're partly seeking validation from the organizations that value them; being at the top means you're doing these competitions purely for the true love of the game.
0 replies · 0 reposts · 1 like · 136 views

gum @gum1h0x:
in my experience, competitive programming and math olympiads don't really matter that much if you don't have that sweet ivy league edu. i was elite at both, and at other similar and dissimilar subjects. tbf maybe i wasn't as skilled at applying back then, and some bad luck in recruiter filtering was involved. always felt like people, especially in germany, don't really care about these things - there's barely any status or job advantage attached, compared to eastern countries like romania or poland, which per capita produce way more elite ctf/cp/mo contestants. it's totally different in the states or in those countries. a lot of the top people in germany, whom you can basically count on three hands, have roots in those eastern countries too - e.g. i'm half polish. there's barely any dedicated support here, so fewer and fewer of these people isn't surprising. just sad that a country of this size and wealth can't do better
Quoting OpenAI @OpenAI: Are you up for a challenge? openai.com/parameter-golf
6 replies · 2 reposts · 51 likes · 7.4K views

Joshua Yang @RealJoshuaYang:
@danveloper I'm a bit confused - what do you mean by "approximate a graph" here? Forgive me, MoE is still a new topic to me, and I don't understand your analogy.
1 reply · 0 reposts · 0 likes · 323 views

Dan Woods @danveloper:
I actually hate MoEs now. Not just because they're difficult to hardwaremaxx - it's actually a really dumb architecture (no offense to anyone). They naively approximate a graph without any of the benefits of graph traversal. We're sending a blind person down a path, and we've trained something to nudge them onto a different path to get to the end, but it doesn't know the next part of the map until the person has walked down the street. I hate this.
17 replies · 2 reposts · 111 likes · 15.4K views
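
To make the routing criticism concrete, here is a minimal sketch of top-k expert routing in the spirit of a standard MoE layer (toy sizes; TinyMoE and all of its names are illustrative, not any particular model's code):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class TinyMoE(nn.Module):
    """Minimal top-k mixture-of-experts layer (illustrative sketch)."""
    def __init__(self, d_model=64, n_experts=8, k=2):
        super().__init__()
        self.k = k
        self.gate = nn.Linear(d_model, n_experts)  # the router
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(d_model, 4 * d_model),
                          nn.GELU(),
                          nn.Linear(4 * d_model, d_model))
            for _ in range(n_experts)
        )

    def forward(self, x):                           # x: (tokens, d_model)
        logits = self.gate(x)                       # (tokens, n_experts)
        top_w, top_i = logits.topk(self.k, dim=-1)  # pick k experts per token
        top_w = F.softmax(top_w, dim=-1)            # renormalize over chosen experts
        out = torch.zeros_like(x)
        for slot in range(self.k):                  # dense loops for clarity, not speed
            for e, expert in enumerate(self.experts):
                mask = top_i[:, slot] == e
                if mask.any():
                    w = top_w[mask, slot].unsqueeze(-1)
                    out[mask] += w * expert(x[mask])
        return out

x = torch.randn(16, 64)
print(TinyMoE()(x).shape)  # torch.Size([16, 64])
```

The gate scores each token using only that token's current hidden state, which is the "blind walker" in the analogy: the route is chosen one layer at a time, with no lookahead over the rest of the path.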

Arth Singh @iarthsingh:
I recently applied to a lot of places for compute grants related to my research proposal and heard back from only one of them, which was happy to fund but doesn't have H- or B-series GPUs. So I wanted to ask: does anyone know of any compute grants that at least reply with an acceptance or rejection?
4 replies · 1 repost · 10 likes · 2K views

Joshua Yang @RealJoshuaYang:
@LeeMcClymont @BrandonLuuMD I'm talking about revisiting notes a decade later, not 2 months later. You cannot physically index a decade of handwritten notes better than Command-F can digitally.
0 replies · 0 reposts · 0 likes · 16 views

Brandon Luu, MD @BrandonLuuMD:
Students who took notes by hand scored ~28% higher on conceptual questions than laptop note-takers. Writing forces your brain to process and compress ideas instead of copying them.
[image]
447 replies · 5.2K reposts · 24.5K likes · 1.6M views

Joshua Yang @RealJoshuaYang:
@a_weers Can you also add an RSS feed to the overall blog?
1 reply · 0 reposts · 1 like · 13 views
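
For a static blog, a feed is just one more generated XML file. Here is a minimal sketch that emits RSS 2.0 using only Python's standard library (the titles, links, and dates are placeholder values):

```python
import xml.etree.ElementTree as ET

# Placeholder entries; in practice, pull these from the blog's post metadata.
posts = [
    {"title": "Example post", "link": "https://example.com/blog/example",
     "date": "Mon, 05 Jan 2026 00:00:00 GMT"},
]

rss = ET.Element("rss", version="2.0")
channel = ET.SubElement(rss, "channel")
ET.SubElement(channel, "title").text = "My Blog"
ET.SubElement(channel, "link").text = "https://example.com/blog"
ET.SubElement(channel, "description").text = "All posts"

for p in posts:
    item = ET.SubElement(channel, "item")
    ET.SubElement(item, "title").text = p["title"]
    ET.SubElement(item, "link").text = p["link"]
    ET.SubElement(item, "pubDate").text = p["date"]

ET.ElementTree(rss).write("rss.xml", encoding="utf-8", xml_declaration=True)
```

Regenerate rss.xml on every build and advertise it with a <link rel="alternate" type="application/rss+xml" ...> tag in the page head so readers and aggregators can discover it.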

Alex Weers @a_weers:
would love some feedback
5 replies · 0 reposts · 9 likes · 6.2K views

Alex Weers @a_weers:
Finally finished! If you're interested in an overview of recent methods in reinforcement learning for reasoning LLMs, check out this blog post: aweers.de/blog/2026/rl-f… It summarizes ten methods, tries to highlight differences and trends, and has a collection of open problems
[image]
19 replies · 240 reposts · 1.8K likes · 305.1K views
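
As a flavor of the method family such overviews cover, here is a sketch of the group-relative advantage at the heart of GRPO-style training: sample several completions per prompt, score them, and standardize rewards within each prompt's group so no learned value baseline is needed (a simplified illustration, not the blog's code):

```python
import numpy as np

def group_relative_advantages(rewards, eps=1e-8):
    """rewards: (n_prompts, n_samples) scores for sampled completions.
    Each completion's advantage is its reward standardized within its
    own prompt's group of samples."""
    mean = rewards.mean(axis=1, keepdims=True)
    std = rewards.std(axis=1, keepdims=True)
    return (rewards - mean) / (std + eps)

rewards = np.array([[1.0, 0.0, 0.0, 1.0],   # prompt 1: two of four samples correct
                    [0.0, 0.0, 0.0, 1.0]])  # prompt 2: one of four samples correct
print(group_relative_advantages(rewards))   # correct samples get positive advantage
```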

Joshua Yang @RealJoshuaYang:
@_xjdr Can you please add an RSS feed? Need this for my aggregated RSS reader!
0 replies · 0 reposts · 2 likes · 85 views

xjdr @_xjdr:
we are releasing a series of research journal posts at noumena.com/research/. these are some of the things we are working on at noumena, including 2 (very) preprint paper drafts. happy to answer any questions you have. i will push the code and logs to reproduce most of this today in the nmoe repo
14 replies · 38 reposts · 447 likes · 35.1K views

Joshua Yang @RealJoshuaYang:
@a_weers Are evolutionary search methods any good for post-training? Saw some people post about them recently; not sure if they're relevant.
0 replies · 0 reposts · 1 like · 281 views
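
For context on the question: evolutionary search in this setting usually means black-box, perturb-and-evaluate optimization rather than gradient-based RL. A toy sketch of an OpenAI-ES-style update on a stand-in objective (not an actual post-training setup; the quadratic fitness is a placeholder for an expensive, non-differentiable eval such as a pass rate):

```python
import numpy as np

rng = np.random.default_rng(0)

def fitness(theta):
    # Placeholder for a non-differentiable evaluation of parameters theta.
    return -np.sum((theta - 3.0) ** 2)

theta = np.zeros(16)
sigma, lr, pop = 0.1, 0.05, 32
for step in range(200):
    noise = rng.standard_normal((pop, theta.size))      # random perturbations
    scores = np.array([fitness(theta + sigma * n) for n in noise])
    scores = (scores - scores.mean()) / (scores.std() + 1e-8)
    theta += lr / (pop * sigma) * noise.T @ scores      # move toward good perturbations

print(round(fitness(theta), 4))  # fitness improves toward 0 as theta approaches 3
```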

Joshua Yang @RealJoshuaYang:
@KyleVedder But what tasks can you actually train on the SO-101? The many hackathon and personal demos I've seen so far don't accomplish anything novel or very impressive - arm length and payload are very limiting factors.
3 replies · 0 reposts · 15 likes · 3.5K views

Kyle Vedder @KyleVedder:
i'm often asked "how do i break into robot learning without a phd?" buy an SO-101 ($300) and do something interesting: understand the stack, train models, implement a paper, do serious on-robot evals, document your work. the pool of talent with real experience is v small
[image]
35 replies · 64 reposts · 1.1K likes · 129.4K views
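
On the "serious on-robot evals" point, what matters is scripted, repeatable trials with logged outcomes rather than one-off demo clips. A minimal sketch of that habit; every function below is a hypothetical placeholder for your own stack, not the SO-101's actual API:

```python
import json

def reset_scene():
    """Hypothetical placeholder: scripted or human reset between trials."""
    input("Reset the scene, then press Enter...")

def run_policy_episode(max_seconds=30):
    """Hypothetical placeholder for rolling out your trained policy on the arm."""
    return {"success": False, "seconds": max_seconds}

results = []
for trial in range(20):          # fix the trial count before running anything
    reset_scene()
    outcome = run_policy_episode()
    outcome["trial"] = trial
    results.append(outcome)

successes = sum(r["success"] for r in results)
print(f"success rate: {successes}/{len(results)}")
with open("eval_log.json", "w") as f:
    json.dump(results, f, indent=2)  # keep raw logs so results are auditable
```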

Lucky Iyinbor @Luckyballa:
My 2026 reading list is stacking up way too fast
[image]
3 replies · 0 reposts · 22 likes · 1.9K views

Joshua Yang @RealJoshuaYang:
@puttasync are you following a Dreamer-style approach for mining diamonds?
1 reply · 0 reposts · 1 like · 70 views
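
For readers unfamiliar with the reference: "Dreamer-style" means learning a world model from real experience and training the actor on imagined rollouts in latent space. A toy sketch of the imagination loop, with untrained stand-in networks (not Dreamer's actual architecture):

```python
import torch
import torch.nn as nn

latent_dim, action_dim, horizon = 32, 4, 15

# Untrained stand-ins for Dreamer's learned components.
dynamics = nn.GRUCell(action_dim, latent_dim)   # predicts the next latent state
reward_head = nn.Linear(latent_dim, 1)          # predicts reward from a latent
actor = nn.Sequential(nn.Linear(latent_dim, 64), nn.Tanh(),
                      nn.Linear(64, action_dim))

z = torch.zeros(1, latent_dim)                  # latent from a real observation
rewards = []
for t in range(horizon):                        # imagine a trajectory: no env steps
    a = torch.tanh(actor(z))
    z = dynamics(a, z)
    rewards.append(reward_head(z))

gamma = 0.99
imagined_return = sum(gamma ** t * r for t, r in enumerate(rewards))
imagined_return.mean().backward()               # actor grads flow through the model
```

The appeal for long-horizon tasks like diamond mining is that credit assignment happens in imagination, where rollouts cost no environment interaction.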

putt @puttasync:
my agent just punched its first tree in minecraft
5 replies · 0 reposts · 21 likes · 1.8K views

Hysteria.Eve is cool @GawrGuraRawr:
@var_epsilon you're wrong, you actively need the flashiest graphics on the face of the planet + that one video program every startup uses
1 reply · 0 reposts · 5 likes · 630 views

varepsilon @var_epsilon:
you can take a cool project, wrap it in a for loop, share it on twitter, and get an offer within a week
7 replies · 3 reposts · 176 likes · 11.1K views

Joshua Yang @RealJoshuaYang:
@k7agar I don't understand. Can you explain why?
0 replies · 0 reposts · 0 likes · 23 views

atharva ☆ @k7agar:
i think the golden era of *published* research is behind us. i think it's going to be more preprints, then more blogs, followed by just dropping github repos or starting companies
1 reply · 1 repost · 43 likes · 3.8K views

Joshua Yang @RealJoshuaYang:
@arnie_hacker Anything involving the core world model has to be done by hand. Everything else - data processing, visualization/plots, training scripts for cloud GPU/cluster/notebook usage - can be done with AI.
0 replies · 0 reposts · 0 likes · 140 views

Arnie Ramesh @arnie_hacker:
I'm building a CS:GO world model. Right now I write all the code by hand and question/learn with LLMs. The depth of understanding is much better - but should I just scrap it all and get an automated research lab to do it all? What do I learn in that case?
Quoting Andrej Karpathy @karpathy:
Three days ago I left autoresearch tuning nanochat for ~2 days on a depth=12 model. It found ~20 changes that improved the validation loss. I tested these changes yesterday and all of them were additive and transferred to larger (depth=24) models. Stacking up all of these changes, today I measured that the leaderboard's "Time to GPT-2" drops from 2.02 hours to 1.80 hours (~11% improvement); this will be the new leaderboard entry. So yes, these are real improvements and they make an actual difference.

I am mildly surprised that my very first naive attempt already worked this well on top of what I thought was already a fairly well-tuned project. This is a first for me because I am very used to doing the iterative optimization of neural network training manually: you come up with ideas, you implement them, you check if they work (better validation loss), you come up with new ideas based on that, you read some papers for inspiration, etc. This has been the bread and butter of what I do daily for two decades. Seeing the agent do this entire workflow end-to-end, all by itself, as it worked through approx. 700 changes autonomously, is wild. It really looked at the sequence of experimental results and used them to plan the next experiments. It's not novel, ground-breaking "research" (yet), but all the adjustments are "real": I didn't find them manually before, and they stack up and actually improved nanochat. Among the bigger things:

- It noticed an oversight that my parameterless QK-norm didn't have a scaler multiplier attached, so my attention was too diffuse. The agent found multipliers to sharpen it, pointing to future work.
- It found that the value embeddings really like regularization and I wasn't applying any (oops).
- It found that my banded attention was too conservative (I forgot to tune it).
- It found that the AdamW betas were all messed up.
- It tuned the weight decay schedule.
- It tuned the network initialization.

This is on top of all the tuning I've already done over a good amount of time. The exact commit is here, from this "round 1" of autoresearch. I am going to kick off "round 2", and in parallel I am looking at how multiple agents can collaborate to unlock parallelism. github.com/karpathy/nanoc…

All LLM frontier labs will do this. It's the final boss battle. It's a lot more complex at scale of course - you don't just have a single train.py file to tune. But doing it is "just engineering" and it's going to work. You spin up a swarm of agents, you have them collaborate to tune smaller models, you promote the most promising ideas to increasingly larger scales, and humans (optionally) contribute on the edges. And more generally, *any* metric you care about that is reasonably efficient to evaluate (or that has a more efficient proxy, such as training a smaller network) can be autoresearched by an agent swarm. It's worth thinking about whether your problem falls into this bucket too.

15 replies · 0 reposts · 57 likes · 13.8K views
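
The QK-norm item above is easy to see concretely: if queries and keys are RMS-normalized with no learnable multiplier, the attention logits are bounded and the softmax stays diffuse. A minimal sketch of the missing scaler (illustrative, not nanochat's code):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def rms_norm(x, eps=1e-6):
    # Normalize so each vector has RMS 1 (i.e., norm sqrt(d)).
    return x / x.norm(dim=-1, keepdim=True).clamp_min(eps) * x.shape[-1] ** 0.5

d = 64
q, k = torch.randn(8, d), torch.randn(8, d)
scale = nn.Parameter(torch.tensor(1.0))  # the multiplier that was missing

qn, kn = rms_norm(q), rms_norm(k)
logits = (scale * qn) @ kn.T / d ** 0.5  # scale sharpens or flattens the logits
attn = F.softmax(logits, dim=-1)
print(attn.max().item())                 # grows as scale grows: peakier attention
```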

Julia Turc @juliarturc:
When reading diffusion papers, my most common reaction is "mkay the math works out but WHY WOULD A SANE PERSON CHOOSE TO DO THIS". This included the (foundational) DDPM formulas. The saving grace is that you can visualize it as 2D particle motion and get a solid intuition.
11 replies · 15 reposts · 183 likes · 19.8K views
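
The 2D-particle intuition is cheap to reproduce. The DDPM forward process has the closed form x_t = sqrt(abar_t) * x_0 + sqrt(1 - abar_t) * eps, so you can watch a point cloud dissolve into a Gaussian in a few lines (standard linear beta schedule, toy ring data):

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy 2D data: 1000 points on the unit circle.
angle = rng.uniform(0, 2 * np.pi, 1000)
x0 = np.stack([np.cos(angle), np.sin(angle)], axis=1)

# Linear beta schedule as in the DDPM paper, with abar_t = prod(1 - beta_s).
T = 1000
betas = np.linspace(1e-4, 0.02, T)
alpha_bar = np.cumprod(1.0 - betas)

def q_sample(x0, t):
    """Sample x_t | x_0 in closed form: no need to iterate t noising steps."""
    eps = rng.standard_normal(x0.shape)
    return np.sqrt(alpha_bar[t]) * x0 + np.sqrt(1.0 - alpha_bar[t]) * eps

for t in [0, 100, 500, 999]:
    xt = q_sample(x0, t)
    print(t, xt.std(axis=0))  # per-axis std climbs toward 1: an isotropic Gaussian
```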

Joshua Yang @RealJoshuaYang:
@juliarturc papers don't really explain intuition - third-party blogs and videos are really the best way to truly understand concepts.
0 replies · 0 reposts · 1 like · 166 views

Joshua Yang @RealJoshuaYang:
@miniapeur It's hard, but there are people who've managed to become independent researchers - typically those who didn't study CS in undergrad but adjacent subjects (math, physics, etc.).
1 reply · 0 reposts · 7 likes · 5.5K views

Mathieu @miniapeur:
Academia is like high school. There are the cool kids doing research that everyone talks about, and it's pretty hard to get into those circles.
37 replies · 125 reposts · 1.8K likes · 434.3K views

Ethan Clark @ethanmclark1:
For sim-to-real manipulation you need three things: (1) GPU parallelism, (2) accurate and intuitive contact modeling, and (3) easy scene randomization. Isaac Sim gives you 1 and 3 but not 2. MuJoCo gives you 2 but not 1 or 3. MuJoCo Warp gets you closer to 1 and 2, but still no 3. Nobody has achieved the trifecta yet. This is why locomotion is exploding and manipulation isn't: locomotion needs minimal scene randomization, so MuJoCo Warp is good enough; manipulation needs all three, and nobody has the tooling to make that easy yet. The sim gap IS the manipulation gap.
14 replies · 8 reposts · 179 likes · 19.8K views
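
To make "easy scene randomization" concrete: the missing tooling is per-episode variation expressed in a few declarative lines. A generic sketch of the sampling involved (plain Python; the keys and ranges are illustrative, not any simulator's actual API):

```python
import random

def sample_scene(rng=random):
    """Draw one randomized scene configuration for a manipulation episode."""
    return {
        "object": rng.choice(["cube", "mug", "spatula"]),
        "object_xy": (rng.uniform(-0.15, 0.15), rng.uniform(-0.15, 0.15)),
        "friction": rng.uniform(0.4, 1.2),
        "mass_scale": rng.uniform(0.8, 1.2),
        "table_height": rng.uniform(0.70, 0.78),
        "light_intensity": rng.uniform(0.5, 1.5),
    }

for episode in range(3):
    cfg = sample_scene()
    print(cfg)  # hand this to the simulator's scene-construction step each episode
```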

Joshua Yang @RealJoshuaYang:
@kaiwynd is this with manually defined trajectories, or directly running inference in sim?
1 reply · 0 reposts · 1 like · 535 views

Kaifeng Zhang @kaiwynd:
Cloth simulation using NVIDIA Newton. Not perfect, but looking good!
16 replies · 28 reposts · 459 likes · 34.3K views