Michael Zhou

5 posts

@michaelhzhou

Studying CS, ML @SCSatCMU. AI Research @CMU_Robotics

Joined October 2023
50 Following · 11 Followers
Michael Zhou retweeted
Murtaza Dalal @mihdalal
LLMs are capable of high-level planning, but they require pre-trained skills! Our #ICLR2024 paper instead uses LLM guidance to train RL agents from scratch to solve 25+ long-horizon robotics tasks across four benchmarks w/ >85% success rates Paper & code: mihdalal.github.io/planseqlearn
Michael Zhou retweeted
Andrej Karpathy @karpathy
# on shortification of "learning"

There are a lot of videos on YouTube/TikTok etc. that give the appearance of education, but if you look closely they are really just entertainment. This is very convenient for everyone involved: the people watching enjoy thinking they are learning (but actually they are just having fun). The people creating this content also enjoy it because fun has a much larger audience, fame and revenue. But as far as learning goes, this is a trap. This content is an epsilon away from watching the Bachelorette. It's like snacking on those "Garden Veggie Straws", which feel like you're eating healthy vegetables until you look at the ingredients.

Learning is not supposed to be fun. It doesn't have to be actively not fun either, but the primary feeling should be that of effort. It should look a lot less like that "10 minute full body" workout from your local digital media creator and a lot more like a serious session at the gym. You want the mental equivalent of sweating. It's not that the quickie doesn't do anything, it's just that it is wildly suboptimal if you actually care to learn.

I find it helpful to explicitly declare your intent up front as a sharp, binary variable in your mind. If you are consuming content: are you trying to be entertained or are you trying to learn? And if you are creating content: are you trying to entertain or are you trying to teach? You'll go down a different path in each case. Attempts to seek the stuff in between actually clamp to zero.

So for those who actually want to learn: unless you are trying to learn something narrow and specific, close those tabs with quick blog posts. Close those tabs of "Learn XYZ in 10 minutes". Consider the opportunity cost of snacking and seek the meal - the textbooks, docs, papers, manuals, longform. Allocate a 4 hour window. Don't just read; take notes, re-read, re-phrase, process, manipulate, learn.

And for those actually trying to educate, please consider writing/recording longform, designed for someone to get "sweaty", especially in today's era of quantity over quality. Give someone a real workout. This is what I aspire to in my own educational work too. My audience will decrease. The ones that remain might not even like it. But at least we'll learn something.
Michael Zhou retweeted
Google DeepMind @GoogleDeepMind
Introducing AlphaGeometry: an AI system that solves Olympiad geometry problems at a level approaching a human gold-medalist. 📐 It was trained solely on synthetic data and marks a breakthrough for AI in mathematical reasoning. 🧵 dpmd.ai/alphageometry
Michael Zhou retweeted
Anuda Weerasinghe @anuda_w
If you found this interesting/want to dig deeper, @vasumvikram, @michaelhzhou and I built a lecture+project for a SWE class at @SCSatCMU.
[Lecture] A software engineer's guide to LLMs - cmu-313.github.io/_old/F23/asset…
[Project] Answer our syllabus q's with an LLM - colab.research.google.com/drive/18ppvk_X…
Sarah Chieng @MilksandMatcha

I compiled a ~5 min read of @andrewyng and @isafulf's 2.5 hour "Application Development using Large Language Models" talk at NeurIPS! It was super cool to attend with @metaphorsystems and I managed to get a spot by showing up very very early! 😛

Some things it covers:
> LLM basics // steps to create an LLM
> Supercharging with RAG
> Prompt Engineering Best Practices + Tricks
> Tips from the field for developers

Also, shoutout to @swyx, @jerryjliu0, @brianryhuang, and @scikud for their contributions. 🙌 As per usual, love and appreciate any feedback or video/folk recommendations. This is all a learning process for me as well :) ❤️

full doc: mphr.notion.site/Application-De…

Michael Zhou retweeted
Jim Fan @DrJimFan
I've been asked what's the biggest thing in 2024 other than LLMs. It's Robotics. Period. We are ~3 years away from the ChatGPT moment for physical AI agents.

We've been cursed by Moravec's paradox for too long, which is the counter-intuitive phenomenon that "tasks that humans find easy are extremely hard for AI, and vice versa". 2024 will be remembered as the first year that the AI community fights back big time against the curse. We will not win immediately, but we will be on the path of winning.

In 2023, we've caught a glimpse of the future foundation models and platforms for robots:
- Multimodal LLMs with robot arms as a physical I/O device: VIMA, PerAct, RvT (NVIDIA), RT-1, RT-2, PaLM-E (Google), RoboCat (DeepMind), Octo (Berkeley, Stanford, CMU), etc.
- Algorithms that bridge the gap between System 1 high-level reasoning (LLMs) and System 2 low-level control: Eureka (NVIDIA), Code as Policies (Google), etc.
- Insane amounts of progress on robust hardware: Tesla Optimus @elonmusk, Figure @adcock_brett, 1X @ericjang11, Apptronik, Sanctuary, Agility+Amazon, Unitree, etc.
- Data has always been the Achilles' heel of robotics. The research community is coming together to curate the next ImageNet, such as the Open X-Embodiment (RT-X) dataset. It's still not diverse enough, but a baby step is a major step.
- Simulation and synthetic data will play a critical role in solving robot dexterity and even computer vision in general. (1) NVIDIA Isaac can simulate reality at 1000x faster than real-time. The incoming data stream scales as compute scales. (2) Photorealism can be enabled by hardware-accelerated raytracing. The realistic renderings also come with groundtruth annotations for free, such as segmentation, depth, 3D pose, etc. (3) Simulators can even multiply real-world data to create much larger datasets, greatly reducing the expensive human demonstration efforts. MimicGen (NVIDIA) is a representative example.

I'm all in, personally. The best is yet to come.