Peter Schafhalter
@pschafhalter
118 posts
AI-Sys PhD student @ucbrise @ucberkeley focusing on systems for self-driving cars
Joined May 2019
248 Following · 368 Followers
Peter Schafhalter @pschafhalter
I’m incredibly grateful for my collaborators at @GoogleDeepMind: Shun Liao, @zhouyanqi30, Chih-Kuan Yeh, Arun Kandoor, and James Laudon. Also, thank you to @ICGog and the Pathways Team for their support and feedback.
Peter Schafhalter @pschafhalter
We exploit parallelism in MoDE’s architecture to apply flexible sharding configurations that place expert weights on devices separate from the pre-trained model’s weights. Such configurations can reduce communication overheads and improve training speeds by up to 38%.
Peter Schafhalter @pschafhalter
Introducing Modular Domain Experts (MoDE), a new multi-domain adaptation technique for LLMs. MoDE independently trains experts on different domains and composes them to boost LLM performance on complex, multi-domain tasks. Paper: arxiv.org/abs/2410.10181
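The composition idea can be sketched in a few lines. This is a hypothetical toy, not the paper's actual architecture: each "expert" stands in for a module trained independently on one domain, and a fixed gate mixes their outputs.

```python
# Toy sketch of composing independently trained per-domain experts
# (hypothetical illustration; see the MoDE paper for the real design).

def make_expert(scale):
    # Stand-in for a module fine-tuned on a single domain.
    return lambda x: [scale * xi for xi in x]

# Two experts, each trained on its own domain in isolation.
experts = {"code": make_expert(2.0), "legal": make_expert(0.5)}

def compose(x, weights):
    # Weighted mix of expert outputs, as in a mixture-of-experts layer.
    out = [0.0] * len(x)
    for name, w in weights.items():
        y = experts[name](x)
        out = [o + w * yi for o, yi in zip(out, y)]
    return out

mixed = compose([1.0, 2.0], {"code": 0.75, "legal": 0.25})
```

Because each expert is trained in isolation, new domains can be added or dropped without retraining the others; only the gate weights decide how much each domain contributes.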
Vikram Sreekanti @vsreekanti
Feels like the discussion around long context has disappeared recently — are there any big applications of it that are popular right now?
Sam Kumar @samkumar_cs
It’s an honor to have been recognized as a Runner-Up for the @TheOfficialACM SIGSAC Doctoral Dissertation Award at CCS 2024 (@acm_ccs)! I’d like to thank my nominator and letter writers for this award, and my advisors, mentors, and collaborators during my PhD.
Sam Kumar @samkumar_cs
I've wrapped up my postdoc and just moved down to Los Angeles. I'm super excited to start at UCLA CS (@CS_UCLA)! Today is my first day working in person at the @UCLA campus.
Peter Schafhalter @pschafhalter
@profjoeyg Usage-based pricing makes sense for AI infra (fine-tuning, generating tokens). AI infra is like AWS: it’s used to build applications (ChatGPT, Copilot). Fundamentally, infra and apps target different use cases and could benefit from different pricing models.
Peter Schafhalter @pschafhalter
@profjoeyg Interesting post. I think monthly fees for AI products make a lot of sense because, as you mentioned, prices are predictable. Usage-based pricing increases complexity and could harm adoption despite being more cost-effective.
Joey Gonzalez @profjoeyg
What is the right pricing model for AI? Should it be a monthly fee or a flat rate per token? Do you pay extra for more knowledge? Three years ago, I was focused on serverless computing for AI and how to allocate inference engines. At the time, consumption-based pricing was the future. Maybe it still is? @vsreekanti and I have been thinking about pricing models and we just published our latest thoughts: frontierai.substack.com/p/the-future-o…
Tianjun Zhang @tianjun_zhang
It has really been a rewarding journey since I joined the #LLaMA3 team @AIatMeta a little more than 2 months ago, and today we are releasing one of the world's best models! 🔥 With the new license, we allow synthetic data generation from Llama to enhance your own model! Check out our research paper on how we built this: ai.meta.com/research/publi…. Excited to see what we can build on top! 🫡
Peter Schafhalter @pschafhalter
@robertnishihara @mrry Hi Robert, considering all the lessons learned building Ray, are there any changes you would have made back when you first started the project? Personally, I always wondered whether the flexibility of dynamic task graphs would eventually lead to performance bottlenecks.
Robert Nishihara @robertnishihara
Ray originally started with just the "task" API for executing Python functions asynchronously (with some resemblance to systems like Dask, Celery, PySpark, etc.). Actually, the system most closely resembling Ray's task API is CIEL (built by @mrry). usenix.org/legacy/events/…

That said, a lot of AI is stateful, and the task API was just too limited, so pretty early on we ended up needing to add the actor API (essentially the ability to spin up a Python class as a little actor / microservice).

The actor API was what really opened the floodgates and enabled Ray to support training workloads, online serving workloads, reinforcement learning workloads, and so on. Even Ray Data is built on actors, despite data processing workloads being traditionally stateless.
ray @raydistributed

Ray operates at two levels: Ray Core, which scales Python functions and classes with tasks and actors, and its libraries, offering easy-to-use abstractions tailored for ML workloads. #Ray #ML #DistributedComputing
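The task-vs-actor distinction described above can be illustrated with only the standard library. This is a conceptual sketch, not Ray itself: a thread pool plays the role of the cluster, a plain function plays the role of a stateless task, and a class instance whose state survives across calls plays the role of an actor.

```python
# Illustrative sketch (not Ray's API): stateless "tasks" vs. a stateful
# "actor", using only the Python standard library.
from concurrent.futures import ThreadPoolExecutor

pool = ThreadPoolExecutor(max_workers=4)

# "Task": a stateless function run asynchronously; every call is independent,
# so results depend only on the arguments.
def square(x):
    return x * x

futures = [pool.submit(square, i) for i in range(4)]
results = [f.result() for f in futures]  # [0, 1, 4, 9]

# "Actor": a class instance that owns state persisting across method calls --
# the pattern needed for training loops, serving replicas, RL workers, etc.
class Counter:
    def __init__(self):
        self.total = 0

    def add(self, x):
        self.total += x
        return self.total

counter = Counter()
pool.submit(counter.add, 5).result()
total = pool.submit(counter.add, 3).result()  # 8: state survived between calls
```

In Ray, both patterns get a `@ray.remote` decorator and run on a cluster rather than a local thread pool, but the stateless/stateful split is the same.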

Zhanghao Wu @Michaelvll1
I am honored to share that our recent paper won the Outstanding Paper Award at NSDI ’24! The paper explores the policy design of our SkyPilot managed spot for @skypilot_org: “Can’t Be Late: Optimizing Spot Instance Savings under Deadlines.” It would not have been possible without the fantastic folks and advisors @infwinston @ziming_mao @zongheng_yang, Eric Friedman, Scott Shenker, and Ion Stoica, and the whole SkyPilot team @skypilot_org.
Peter Schafhalter retweeted
Simon Guo @simonguozirui
Spent the last few weeks working on this blog! When I first read the Ring Attention paper, I kind of got the concept, but not really. Diving into the details, from the math to the compute, was incredibly rewarding for our understanding, and I hope it’s a fun read for you too!
Kilian Haefeli @khshind

How do state-of-the-art LLMs like Gemini 1.5 and Claude 3 scale to context windows beyond 1M tokens? Well, Ring Attention by @haoliuhl presents a way to split the attention computation across GPUs while hiding the communication overhead in a ring, enabling zero-overhead scaling
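The core trick can be sketched without any GPU: each device holds one shard of keys and values, the shards rotate around a ring, and an online-softmax accumulator lets each query finish its attention output without ever materializing the full score vector. A minimal pure-Python sketch with toy vectors and no actual communication:

```python
# Toy sketch of ring/streaming attention via online softmax
# (illustration only; the real system overlaps compute with ring transfers).
import math

def full_attention(q, keys, values):
    # Reference: softmax-weighted sum computed over all scores at once.
    scores = [sum(qi * ki for qi, ki in zip(q, k)) for k in keys]
    m = max(scores)
    ws = [math.exp(s - m) for s in scores]
    z = sum(ws)
    dim = len(values[0])
    return [sum(w * v[j] for w, v in zip(ws, values)) / z for j in range(dim)]

def ring_attention(q, key_shards, value_shards):
    # Each "device" holds one (K, V) shard; shards stream past the query
    # one ring step at a time. Running online-softmax statistics keep the
    # result exact without ever holding the full score vector.
    m = -math.inf          # running max score
    z = 0.0                # running softmax denominator
    acc = [0.0] * len(q)   # running weighted sum of values
    for keys, values in zip(key_shards, value_shards):  # one ring step each
        for k, v in zip(keys, values):
            s = sum(qi * ki for qi, ki in zip(q, k))
            new_m = max(m, s)
            scale = math.exp(m - new_m)  # rescale old stats to the new max
            w = math.exp(s - new_m)
            z = z * scale + w
            acc = [a * scale + w * vj for a, vj in zip(acc, v)]
            m = new_m
    return [a / z for a in acc]
```

Because the accumulators are rescaled whenever a larger score appears, the streamed result matches the all-at-once softmax, which is what lets context length scale with the number of devices in the ring.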
