Minae Kwon

117 posts

@MinaeKwon

Joined December 2018

553 Following · 1.1K Followers

Minae Kwon retweeted

Anthropic@AnthropicAI·23 Mar

Introducing the Anthropic Science Blog. Increasing the pace of scientific progress is a core part of Anthropic’s mission. The Science Blog will feature new research and stories of how scientists are using AI to accelerate their work. Read the intro: anthropic.com/research/intro…

208

638

5.1K

423.9K

Minae Kwon retweeted

Anthropic@AnthropicAI·21 Jan

We’re publishing a new constitution for Claude. The constitution is a detailed description of our vision for Claude’s behavior and values. It’s written primarily for Claude, and used directly in our training process. anthropic.com/news/claude-ne…

523

970

7.7K

3.3M

Minae Kwon@MinaeKwon·18 Jun

@siddkaramcheti @ICatGT @GTrobotics @mlatgt Congrats Sidd <3

317

Siddharth Karamcheti@siddkaramcheti·18 Jun

Thrilled to share that I'll be starting as an Assistant Professor at Georgia Tech (@ICatGT / @GTrobotics / @mlatgt) in Fall 2026. My lab will tackle problems in robot learning, multimodal ML, and interaction. I'm recruiting PhD students this next cycle – please apply/reach out!

563

60.9K

Minae Kwon retweeted

Anthropic@AnthropicAI·22 Oct

Introducing an upgraded Claude 3.5 Sonnet, and a new model, Claude 3.5 Haiku. We’re also introducing a new capability in beta: computer use. Developers can now direct Claude to use computers the way people do—by looking at a screen, moving a cursor, clicking, and typing text.

472

1.8K

10K

3.7M

Minae Kwon retweeted

Ethan Perez@EthanJPerez·18 Sep

I’m taking applications for collaborators via @MATSprogram! It’s a great way for new or experienced researchers outside AI safety research labs to work with me/others in these groups: @NeelNanda5, @EvanHub, @MrinankSharma, @NinaPanickssery, @FabienDRoger, @RylanSchaeffer, ...🧵

149

135.8K

Minae Kwon retweeted

Anthropic@AnthropicAI·20 Jun

Introducing Claude 3.5 Sonnet—our most intelligent model yet. This is the first release in our 3.5 model family. Sonnet now outperforms competitor models on key evaluations, at twice the speed of Claude 3 Opus and one-fifth the cost. Try it for free: claude.ai

421

1.5K

7.1K

2.5M

Minae Kwon@MinaeKwon·28 May

@SerenaLBooth @BrownUniversity @BrownCSDept Congratulations ♥️

188

Minae Kwon retweeted

Dorsa Sadigh@DorsaSadigh·11 May

At #ICRA24 we've a few papers on 𝗴𝗿𝗼𝘂𝗻𝗱𝗲𝗱 𝗿𝗲𝗮𝘀𝗼𝗻𝗶𝗻𝗴 of LLMs/VLMs.
• Grounded common-sense reasoning via active perception - @MinaeKwon's 🧵👇
• Physically grounding VLMs - @jensen_gao's 🧵👇
• Learning from online language corrections - @lihanzha's 🧵👇

18.9K

Minae Kwon retweeted

Alex Tamkin@AlexTamkin·5 Apr

Made a short video exploring tool use and subagents! (w/ @aaron_begg and @typochondriac) Goal: Find the “quickest quicksort” implementation on GitHub by having a larger model orchestrate 100 subagent models Here’s how it works: 1/ twitter.com/AnthropicAI/st…

Anthropic@AnthropicAI

Tool use is now available in beta to all customers in the Anthropic Messages API, enabling Claude to interact with external tools using structured outputs.

18.4K

Minae Kwon retweeted

Jesse Mu@jayelmnop·20 Mar

We’re hiring for the adversarial robustness team @AnthropicAI! As an Alignment subteam, we're making a big effort on red-teaming, test-time monitoring, and adversarial training. If you’re interested in these areas, let us know! (emails in 🧵)

446

72.6K

Minae Kwon retweeted

Jesse Mu@jayelmnop·24 Feb

Achievement unlocked ✅, thanks for the shout-out @karpathy!

Andrej Karpathy@karpathy

New (2h13m 😅) lecture: "Let's build the GPT Tokenizer" Tokenizers are a completely separate stage of the LLM pipeline: they have their own training set, training algorithm (Byte Pair Encoding), and after training implement two functions: encode() from strings to tokens, and decode() back from tokens to strings. In this lecture we build from scratch the Tokenizer used in the GPT series from OpenAI.

246

29.5K

Minae Kwon retweeted

Hugh Zhang@hughbzhang·26 Nov

I also have no idea what Q* is, but given speculation that it’s a method of self-learning and Monte-Carlo Tree Search (MCTS) in language models, I thought I’d share some recent work on an adjacent idea.

400

87.2K

Minae Kwon retweeted

Alex Tamkin@AlexTamkin·19 Oct

Eliciting Human Preferences with Language Models Currently, people write detailed prompts to describe what they want a language model to do We explore *generative elicitation*—where models interactively ask for this information through open-ended conversation 1/

447

82.9K

Minae Kwon retweeted

Andy Shih@andyshih_·17 Oct

Excited about recent improvements of our NeurIPS Spotlight paper, now even faster with ⚡️⚡️multiprocessing⚡️⚡️! We now get 2x speedup on as low as 50-step DDIM, and 4x speedup on 200-step DDIM!
The first version of our paper showed good results, but we wanted even better.
- torch.DataParallel showed big gains for DDPM, but not for few-step DDIM (due to python GIL and DP's parameter copying)
- torch.DDP does not work out of the box for our algorithm
To address this, we implemented custom multiprocessing logic with a producer/consumer design. This resolves python GIL, and keeps parameters persistent on GPUs to avoid repeated copying.
Our multiprocessing implementation gave 5x improvement over torch.DataParallel for 50-step sampling 🥳🥳
Excited to present more about this work at NeurIPS! Try it out now at: github.com/AndyShih12/par…

Andy Shih@andyshih_

Diffusion models are slow to sample from. Many methods propose to sample using *fewer* steps, but this can hurt sample quality. We introduce ParaDiGMS, a new method to speed up diffusion models by 2-4x while using the *same* number of steps! arxiv.org/abs/2305.16317

18.4K

Minae Kwon retweeted

Foundation Models, LLMs, and Game Theory Workshop@fm_llms_gt·20 Sep

We also call on researchers with ongoing research to submit posters to our workshop. The workshop will provide financial support to a limited number of researchers. Applications for financial support will remain open until September 23, 2023. docs.google.com/forms/d/e/1FAI…

568

Minae Kwon retweeted

Foundation Models, LLMs, and Game Theory Workshop@fm_llms_gt·20 Sep

We are excited to announce the first workshop on Foundation Models, Large Language Models (LLMs), and Game Theory! The workshop will take place at the Center for Discrete Mathematics and Theoretical Computer Science (DIMACS) on October 19-20, 2023. dimacs.rutgers.edu/events/details…

5.1K

Minae Kwon retweeted

shreya rajpal@ShreyaR·18 Sep

It's an absolute honor to be a guest on the @twimlai podcast! @samcharrington and I cover everything under the sun in LLMOps, from hallucinations, RAG to LLM safety. Check out the podcast on the link below!

The TWIML AI Podcast@twimlai

Today we’re joined by @ShreyaR to discuss LLM safety for production applications! We explore hallucinations, RAG, evaluation & tooling for LLMs as well as Guardrails, an open-source project enforcing correctness in LMs. 🎧/🎥 Check out the episode at twimlai.com/go/647

15K

Minae Kwon retweeted

Sang Michael Xie@sangmichaelxie·14 Sep

Releasing an open-source PyTorch implementation of DoReMi! github.com/sangmichaelxie… The pretraining data mixture is a secret sauce of LLM training. Optimizing your data mixture for robust learning with DoReMi can reduce training time by 2-3x. Train smarter, not longer!

Sang Michael Xie@sangmichaelxie

Should LMs train on more books, news, or web data? Introducing DoReMi🎶, which optimizes the data mixture with a small 280M model. Our data mixture makes 8B Pile models train 2.6x faster, get +6.5% few-shot acc, and get lower pplx on *all* domains! 🧵⬇️ arxiv.org/abs/2305.10429

262

68.2K

Minae Kwon retweeted

Priya Sundaresan@priyasun_·12 Sep

Hungry? Let our robot twirl your spaghetti for you! 🍝🤖 Introducing VAPORS: Visual Action Planning OveR Sequences, a framework for long-horizon food acquisition. Project Page: sites.google.com/view/vaporsbot Paper: arxiv.org/abs/2309.05197 To appear at @corl_conf 1/11🧵

136

50.6K

Minae Kwon retweeted

Yuchen Cui@YuchenCui1·8 Sep

We use gestures all the time for specifying targets! How can robots make sense of “gimme that one”? We propose GIRAF, a framework for interpreting human gesture instructions using LLMs. Paper to appear in @corl_conf: arxiv.org/abs/2309.02721 Website: tinyurl.com/giraf23

35.7K
