Andy Tang

26 posts

@tangerinecoder

Robotics and AI @ SAIL fear less; imagine more

Joined April 2023

247 Following · 81 Followers

Andy Tang retweeted

Will Chen@verityw_·16 Feb

How can robot policies be trained to best leverage VLMs' CoT reasoning and in-context learning for generalization? The key is Steerable Policies: vision-language-action models that can be flexibly controlled in many ways! steerable-policies.github.io 1/9

142

22.3K

Andy Tang retweeted

Jubayer Ibn Hamid@jubayer_hamid·1 Oct

Exploration is fundamental to RL. Yet policy gradient methods often collapse: during training they fail to explore broadly, and converge into narrow, easily exploitable behaviors. The result is poor generalization, limited gains from test-time scaling, and brittleness on tasks where strategic exploration is necessary. We introduce a framework for training a policy over sets of generations and use it to induce exploration. Work with @ifdita_hasan (co-lead), @ellenjxu_ , @chelseabfinn and @DorsaSadigh at Stanford 🧵

143

196K

Andy Tang retweeted

Marcel Torné@marceltornev·26 Jun

Very happy to share that our work on learning long-history policies received the Best Paper Award from the Workshop on Learned Robot Representations @RoboticsSciSys ! 🤖🥳 Check out our paper if you haven't already! long-context-dp.github.io Thank you to all the organizers and the amazing collaborators @tangerinecoder, @liu_yuejiang and @chelseabfinn!

Marcel Torné@marceltornev

Giving history to our robot policies is crucial to solve a variety of daily tasks. However, diffusion policies get worse when adding history. 🤖 In our recent work we learn how adding an auxiliary loss that we name Past-Token Prediction (PTP) together with cached embeddings enables us to reliably add longer history context to our robot policies! 🧠 We also show how PTP enables some test-time scaling techniques for robotics! 🚀

11.4K

Andy Tang retweeted

Shirley Wu@ShirleyYXWu·16 Jun

Even the smartest LLMs can fail at basic multiturn communication
Ask for grocery help → without asking where you live 🤦‍♀️
Ask to write articles → assumes your preferences 🤷🏻‍♀️
⭐️CollabLLM (top 1%; oral @icmlconf) transforms LLMs from passive responders into active collaborators.
Website: aka.ms/CollabLLM
Github: github.com/Wuyxin/collabl…
Blog: wuyxin.github.io/collabllm/#blog
Paper: arxiv.org/pdf/2502.00640
🎯 Key insight: Rewards responses not by immediate helpfulness, but by their long-term impact on the conversation trajectory.
@MSFTResearch @StanfordAILab @stanfordnlp

210

72.1K

Andy Tang retweeted

Annie Chen@_anniechen_·19 May

How can robots autonomously handle ambiguous situations that require commonsense reasoning? *VLM-PC* provides adaptive high-level planning, so robots can get unstuck by exploring multiple strategies. Paper: anniesch.github.io/vlm-pc/

24K

Andy Tang retweeted

Chelsea Finn@chelseabfinn·17 May

How do we make a scalable RL recipe for robots? We study batch online RL w/ demos.
Key findings:
- iterative filtered imitation is insufficient
- need diverse policy data, eg using diffusion policy
- policy extraction can hinder data diversity
Paper: pd-perry.github.io/batch-online-r…

Perry Dong@perryadong

Robotic models are advancing rapidly—but how do we scale their improvement? 🤖 We propose a recipe for batch online RL (train offline with online rollouts) that enables policies to self-improve without complications of online RL More: pd-perry.github.io/batch-online-rl (1/8)

171

22.8K

Andy Tang retweeted

Yuejiang Liu@liu_yuejiang·18 May

🧠Memory is crucial for robots — to handle occlusions, track progress, stay coherent, etc. Yet, most VLA truncate context. 🤔Why is long-context hard for robot policies? And how can we fix it? 📄Our new paper: Learning Long-Context Diffusion Policies via Past-Token Prediction

Marcel Torné@marceltornev

3.6K

Andy Tang@tangerinecoder·15 May

Was super fun exploring this! Most modern policies don't use history -- Diffusion Policy in particular gets a lot worse. We identify a simple ingredient for history improvement, and use it to improve efficiency and performance of long-context policies.

Marcel Torné@marceltornev

664

Andy Tang@tangerinecoder·30 Dec

@sanjehorah i have an ad-hoc version in LaTeX annotated with date added/finished, stage (to read, annotate, done), priority, and stream (from Twitter, robotics, neuro, etc.) -- lmk if ppl get together to build as i have opinions

177

sanje horah@sanjehorah·29 Dec

i am BEGGING

254

2.1K

27.6K

8.5M

Andy Tang retweeted

Karan Dalal@karansdalal·8 Jul

I’m excited to share a project I’ve been working on for over a year, which I believe will fundamentally change our approach to language models. We’ve designed a new architecture, which replaces the hidden state of an RNN with a machine learning model. This model compresses context through actual gradient descent on input tokens. We call our method “Test-Time-Training layers.” TTT layers directly replace attention, and unlock linear complexity architectures with expressive memory, allowing us to train LLMs with millions (someday billions) of tokens in context. Our instantiations, TTT-Linear and TTT-MLP, both match or beat the strongest Transformers and Mamba. Arxiv: arxiv.org/abs/2407.04620

281

1.8K

428.6K

Andy Tang@tangerinecoder·17 May

@adriana0nline CONGRATSSS!!!

Adriana@adriana0nline·15 May

Graduated btw :-)

212

18.6K

Andy Tang@tangerinecoder·14 May

@TheBookie0 seems like they might be in need of a designer ;)

Clément@TheBookie0·13 May

for some reason my teacher didn't stick to the default templates and added these random background colors to the slides?!

511

Andy Tang@tangerinecoder·8 Apr

@giansegato do this as well! don't have a voice recorder but turn off wifi/cellular, open voice memos, go pace around a basement or run

250

gian@giansegato·8 Apr

a couple of weeks ago i bought an old-school portable voice recorder, like the ones we used to use to record classes before the iphone
been using it while going on smartphones-less walks. i talk to it with no structure, just following my train of thoughts as they come, and when i come back home i transcribe my rambles with whisper, summarize/synthesize with an llm, and work on the very few good ideas in a more focused mental state
it's well understood that movement improves thinking, and that the unfocused, bored state of mind can foster high quality ideas. the issue is that every time i go on a walk, my smartphone puts me in a focused, cognitively engaged state. no matter what i try. i'm addicted to mental stimulation, and constantly having the internet in my pocket simply hurts my ability to think crisply and *widely*
i don't think it's a coincidence that @paulg always preaches the benefits of walks, but doesn't bring the internet with him. anecdotal, but telling
on the other hand though, issue with smartphone-less walks is that i just forget stuff. i feel the constant need to take notes. probably a byproduct of my internet addiction. and so back to square 1
until today there was not really a good solution to my problem. luckily, not anymore
whisper + any decently sized llm is already good enough to bring order to my incoherent rambling, making the transition from unfocused to focused thinking *seamless*. can't overstate how huge this is for me
this *feels* artificial intelligence. it's an intelligent tool, making sense of my voice memos. it's kinda wild
there may be a hardware side project somewhere in there (could be fun to put together with a raspberry), but with a modest $30 investment and open source tools i can run locally i already solved an issue that has been bothering me for years
i think we just began to explore the impact of deep learning on tools for thought

Amjad Masad@amasad

With long-context LLMs, the ROI on documenting your life has gone massively up. You can load up your diary, photos, and even emails and texts and write all sorts of useful software to find patterns, do reflections, ask the LLM for advice, or just have an "ask my life" app.

135

42.2K

Andy Tang@tangerinecoder·7 Apr

@TheBookie0 lfggggggggggggg

Clément@TheBookie0·7 Apr

i'm going to Cornell!!!

112

8.2K

Andy Tang retweeted

Sergiy Nesterenko@sergiynest·1 Mar

We've had over a thousand new engineers try Quilter in the last few weeks submitting some really interesting designs. We really want to see some of these come to life, so we're subsidizing board builds! If you want to build a Quilter design in real life, we'll cover the cost of the PCB! More about this in the link. Open hardware community: this one is especially for you ;)

Quilter@quilterai

Build free prototypes of AI-generated circuit boards with our new "Fab for Free" program! Learn more at blog.quilter.ai/fab-for-free/

5.8K

Andy Tang retweeted

Patrick@_patrickhult·27 Feb

Here is Playground v2.5, our latest model.

Suhail@Suhail

1/ We are releasing Playground v2.5, our latest foundation model to create images. We tested our model across 20K+ users in a rigorous benchmark that went beyond anything we've seen to date. This model is open weights. More information in the tweets below. 👇

1.2K

Andy Tang retweeted

Annie Chen@_anniechen_·3 Nov

Very excited to introduce ROAM, our new work that allows a robot to *adapt on-the-go* as it faces OOD situations during deployment, drawing on pre-trained behaviors. See as ROAM enables our Go1 to roller skate zero-shot 🤖🐕🛼 (without any lessons!) 🧵(1/9)

140

66.3K

Andy Tang retweeted

Replit ⠕@Replit·20 Sep

We’ve had a flurry of product launches over the past week. Unless you’ve been on X every day, you likely missed a couple. Here’s a recap of every launch so you can get up to speed👇

19.5K

Andy Tang@tangerinecoder·14 Sep

@itsarnavb @0xDak my fav is the triple back tap on iPhone, I feel like no one knows about it and it's so nice to switch my phone to grayscale and back

Arnav Bansal ⠕@itsarnavb·13 Sep

@0xDak there’s actually a bunch of these available buried somewhere under accessibility! i guess “double pinch” was the most reliable one so they shipped that

117

Andy Tang@tangerinecoder·13 Sep

@JimmyAustin you dropped this👑

135

James Austin ⠕@JimmyAustin·13 Sep

Replit ⠕@Replit

Introducing Replit ModelFarm, the fastest and safest way to build your next Generative AI app. Available for free on Hacker and Pro plans till October 15th. It requires zero setup, zero configuration, and zero API keys. With Replit ModelFarm, you can build a working Gen AI app in as little as 3 lines of code. Get started by installing the Replit AI library in any Python, JavaScript, or TypeScript Repl. The library implements an API for text completion, chat completion, and text embeddings. It supports streaming so your users can see model responses in real-time rather than waiting on a single output. All Hacker and Pro builders will have free access to a selection of Gen AI models offered by @googlecloud Vertex AI through Replit ModelFarm. All models are accessible from the development environment and any deployed app.

16.9K