Sungmin Cha (@_sungmin_cha) - Twitter Profile

Pinned Tweet

Sungmin Cha@_sungmin_cha·4 Feb

I’m happy to share that I’m starting a new position as Research Scientist at Meta!

English

155

93

3.9K

91.1K

Sungmin Cha@_sungmin_cha·1d

@jmin__cho @UNC Congrats!!!!!!!🎉

English

1

0

2

35

Jaemin Cho@jmin__cho·1d

🥳 I am incredibly honored and grateful to receive the 2026 @UNC Distinguished Dissertation Award! This award recognizes four recipients across the whole university, and I’m humbled to represent the Mathematics, Physical Sciences, and Engineering category this year. Many thanks to my advisor @mohitban47, our MURGe-Lab family, and the @unccs @unc_ai_group for their constant support! 🙏 This is a great reminder of all the good memories from my PhD journey before I start my faculty career at The Johns Hopkins University 😊

English

9

16

78

5.6K

Sungmin Cha retweeted

Yoonho@youknow04·2d

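For the Monty Hall problem, instead of one goat behind one of 3 doors, I find it works well to scale it way up: say a goat is behind one of 100 doors, the contestant picks one door, the host opens 98 of the remaining empty doors, and then asks "Want to switch?" Explaining it this way helps people who are uncomfortable with math arrive at an intuitive decision that is also the rational one.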

셋@sepiroot

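People still don't believe the answer to the 'Monty Hall problem' even when mathematicians demonstrate it with explanations and simulations. How about we call that problem the "'Monty Hall problem' problem"?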
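The 100-door variant described above can be checked with a short simulation. This is a minimal sketch, not anything from the tweets themselves: the function names `play` and `win_rate`, the trial count, and the seed are all illustrative assumptions. The host is assumed to open every door except the contestant's pick and one other, and never to reveal the prize.

```python
import random

def play(num_doors: int, switch: bool, rng: random.Random) -> bool:
    """Play one round; return True if the contestant ends on the prize door."""
    prize = rng.randrange(num_doors)
    pick = rng.randrange(num_doors)
    # The host leaves exactly one other door closed and never opens the prize.
    if pick == prize:
        # Any other door can stay closed; all the opened ones are empty.
        other = rng.choice([d for d in range(num_doors) if d != pick])
    else:
        # The only door the host may leave closed is the prize door.
        other = prize
    final = other if switch else pick
    return final == prize

def win_rate(num_doors: int, switch: bool, trials: int = 20000, seed: int = 0) -> float:
    """Fraction of rounds won over `trials` simulated games."""
    rng = random.Random(seed)
    return sum(play(num_doors, switch, rng) for _ in range(trials)) / trials

if __name__ == "__main__":
    for doors in (3, 100):
        print(doors, "doors | stay:", win_rate(doors, False), "| switch:", win_rate(doors, True))
```

With 100 doors, switching only loses when the first pick was already the prize, which happens 1 time in 100, so the simulated switch rate should sit near 0.99 and the stay rate near 0.01.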

Korean

4

152

445

302.6K

Sungmin Cha retweeted

Mengye Ren@mengyer·2d

A joint postdoc position available with Prof. Sungjin Ahn at KAIST. Consider applying!

Sungjin Ahn@SungjinAhn_

We are seeking a highly motivated postdoctoral researcher to work on fundamental challenges toward AGI, particularly in reasoning, abstraction, and world modeling. The position also offers potential opportunities for co-advising with Yoshua Bengio (Mila) and/or Mengye Ren (NYU). Research areas include: • World Model Learning & Planning • Compositional Generalization & Neuro-Symbolic World Learning • Causal Discovery, Reasoning, and Abstraction This position is supported by the InnoCORE Fellowship Program 2026, with: • Competitive salary of KRW 90M+ (~USD 60K+) • Renewable yearly contract For more information and recent publications: mlml.kaist.ac.kr If you are interested, please send me your CV by email.

English

1

5

30

5.3K

Sungmin Cha retweeted

Sarath Chandar@apsarathchandar·3d

Continual learning is the future of AI and @CoLLAs_Conf is the best venue to publish your state-of-the-art research in designing adaptive machine learning systems! Abstract deadline in 10 days and the conference is in Romania this year!

CoLLAs 2026@CoLLAs_Conf

⏰ The CoLLAs abstract deadline is only 10 days away! We invite researchers to explore all facets of ML adaptation, from incorporating new capabilities during continuous training to efficiently removing outdated or harmful data. - 𝗔𝗯𝘀𝘁𝗿𝗮𝗰𝘁 𝗗𝗲𝗮𝗱𝗹𝗶𝗻𝗲: April 10, 2026 - 𝗦𝘂𝗯𝗺𝗶𝘀𝘀𝗶𝗼𝗻 𝗗𝗲𝗮𝗱𝗹𝗶𝗻𝗲: April 15, 2026 - 𝗖𝗼𝗻𝗳𝗲𝗿𝗲𝗻𝗰𝗲 𝗗𝗮𝘁𝗲𝘀: Sep 14–17, 2026 📚 Accepted papers will be published in the Proceedings of Machine Learning Research (PMLR). 🔗 𝗙𝗼𝗿 𝗳𝘂𝗹𝗹 𝗱𝗲𝘁𝗮𝗶𝗹𝘀 𝗼𝗻 𝘁𝗵𝗲 𝗖𝗮𝗹𝗹 𝗳𝗼𝗿 𝗣𝗮𝗽𝗲𝗿𝘀: lifelong-ml.cc/Conferences/20…

English

0

7

33

5.9K

Sungmin Cha retweeted

Meta Newsroom@MetaNewsroom·3d

Our AI glasses are constantly getting more useful, helpful, and intuitive 👓 Today we’re launching our first prescription-optimized AI glasses and a range of software updates including nutrition tracking, @WhatsApp summaries and recall by Meta AI, Neural Handwriting, and more. about.fb.com/news/2026/03/m…

English

28

66

373

66.5K

Sungmin Cha retweeted

Alexandr Wang@alexandr_wang·28 Mar

great new update from MSL—SAM 3.1!

AI at Meta@AIatMeta

We’re releasing SAM 3.1: a drop-in update to SAM 3 that introduces object multiplexing to significantly improve video processing efficiency without sacrificing accuracy. We’re sharing this update with the community to help make high-performance applications feasible on smaller, more accessible hardware. 🔗 Model Checkpoint: go.meta.me/8dd321 🔗 Codebase: go.meta.me/b0a9fb

English

21

25

399

56.2K

Sungmin Cha retweeted

Jung-Woo Ha@JungWooHa2·27 Mar

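Third in the world in #AI연구역량 (AI research capability) too, right after China and the US! We've climbed to 3rd place by first authors of #NeurIPS papers! We're steadily making our way toward AI G3. economist.com/interactive/sc…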

Seongnam-si, Republic of Korea 🇰🇷 Korean

26

265

819

23.5K

Sungmin Cha retweeted

AI at Meta@AIatMeta·27 Mar

We’re releasing SAM 3.1: a drop-in update to SAM 3 that introduces object multiplexing to significantly improve video processing efficiency without sacrificing accuracy. We’re sharing this update with the community to help make high-performance applications feasible on smaller, more accessible hardware. 🔗 Model Checkpoint: go.meta.me/8dd321 🔗 Codebase: go.meta.me/b0a9fb

English

102

276

2.2K

318.7K

Sungmin Cha retweeted

Miran Heo@miran_heo·27 Mar

SAM 3.1 is here 🚀 7x faster with 128 objects, without sacrificing any quality. Glad to have contributed as part of the SAM team. Special kudos to our intern @hkchengrex for the amazing contribution! 🙌

AI at Meta@AIatMeta

We’re releasing SAM 3.1: a drop-in update to SAM 3 that introduces object multiplexing to significantly improve video processing efficiency without sacrificing accuracy. We’re sharing this update with the community to help make high-performance applications feasible on smaller, more accessible hardware. 🔗 Model Checkpoint: go.meta.me/8dd321 🔗 Codebase: go.meta.me/b0a9fb

English

2

12

44

3.6K

Sungmin Cha retweeted

Alexandr Wang@alexandr_wang·27 Mar

incredible new research from our teams at FAIR

AI at Meta@AIatMeta

Today we're introducing TRIBE v2 (Trimodal Brain Encoder), a foundation model trained to predict how the human brain responds to almost any sight or sound. Building on our Algonauts 2025 award-winning architecture, TRIBE v2 draws on 500+ hours of fMRI recordings from 700+ people to create a digital twin of neural activity and enable zero-shot predictions for new subjects, languages, and tasks. Try the demo and learn more here: go.meta.me/tribe2

English

60

93

1.4K

163.2K

Sungmin Cha retweeted

Mooni Insight 💫@Semicon_player·26 Mar

An excellent piece. I recommend that memory investors give it a read.

Alis volat propriis@Alisvolatprop12

x.com/i/article/2036…

Korean

8

94

632

95.8K

Sungmin Cha retweeted

Emergence@PcIOvebbCbTdSTb·26 Mar

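For a while I believed my moat as a data scientist was fairly clear: the ability, when facing a new problem, to grasp the structure and constraints of the data, quickly skim the relevant papers and ideas to form hypotheses, and push performance up by modifying the models myself. I produced results and had some confidence too. But as LLMs advanced, an anxiety that used to be vague has become real, because the very area I was most confident in can be automated faster than I expected. (Funnily enough, I confirmed the shock of autoresearch through my own work around the end of last year, earlier than karpathy did.) So before it was too late, I moved my position over to GenAI: from training and improving models myself to building systems and products on top of powerful models that already exist... I thought it was clearly a choice in line with where the times are heading, and I still don't regret that decision itself. But now that I'm here, there is an emptiness of a kind I didn't expect. Before, there was a clear sense of reward in digging deep into the data and the problem and pushing model performance up with my own ideas; now I sometimes feel less like a data scientist and more like an SWE who happens to be good at handling AI (though of course even that is just button-clicking). Anyway, my worry is partly that my old moat has disappeared, but mostly the question of what my weapon will be from here on. Though with the giant wave about to crash over us, I wonder whether the worrying itself even means anything...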

Korean

6

38

180

20K

Sungmin Cha retweeted

NeurIPS Conference@NeurIPSConf·26 Mar

NeurIPS is aware of the community's concerns regarding the list of sanctions. NeurIPS is an inclusive community focused on free scientific discourse. We deeply value the research that comes from everyone in our community. The present concerns are not about science or academic freedom. They are about legal requirements that apply to the NeurIPS Foundation, which is responsible for complying with sanctions. We are actively consulting legal counsel to fully understand the legal constraints and we will update the NeurIPS community as soon as we have reliable guidance from our lawyers.

English

128

31

238

212K

Sungmin Cha retweeted

말러팔삼@mahler83·25 Mar

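The TurboQuant #논문 (paper) that has been hot all day today, as I understand it: it compresses high-dimensional vectors by quantizing them to reduce the amount of data, but the more you compress, the less accurate it gets. In particular, if, say, the values of the 100th coordinate are all clustered around 0.5, that part of the information is effectively lost during quantization. So what does it do? It rotates the vectors in a random direction.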

말러팔삼@mahler83

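Jogging, day 3. I fed in the TurboQuant arXiv paper, which has shown up on my timeline about ten times, made an AO from it, and listened to it across my commute home, laundry and dishes, and jogging. It was both difficult and fascinating. The compression rate approaches the theoretical lower bound? No wonder it's causing such a stir.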
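The "rotate randomly, then quantize" idea in the tweet above can be sketched in a few lines. This is not the TurboQuant algorithm: it swaps in a plain uniform scalar quantizer and a random orthogonal matrix built by Gram-Schmidt, purely to illustrate how the energy of a lopsided vector gets spread across coordinates before rounding. All names (`random_orthogonal`, `quantize`, and so on), the test vector, and the 3-bit setting are assumptions chosen for illustration.

```python
import math
import random

def random_orthogonal(dim, rng):
    """Random orthogonal matrix via Gram-Schmidt on Gaussian rows (illustrative only)."""
    rows = []
    while len(rows) < dim:
        v = [rng.gauss(0.0, 1.0) for _ in range(dim)]
        for r in rows:
            # Remove the component of v along each row already accepted.
            dot = sum(a * b for a, b in zip(v, r))
            v = [a - dot * b for a, b in zip(v, r)]
        norm = math.sqrt(sum(a * a for a in v))
        if norm > 1e-9:
            rows.append([a / norm for a in v])
    return rows

def apply(matrix, vec):
    """Matrix-vector product."""
    return [sum(m * x for m, x in zip(row, vec)) for row in matrix]

def transpose(matrix):
    return [list(col) for col in zip(*matrix)]

def quantize(vec, bits):
    """Uniform scalar quantizer with 2**bits levels on [-max|v|, +max|v|]."""
    scale = max(abs(x) for x in vec) or 1.0
    levels = 2 ** bits - 1
    out = []
    for x in vec:
        idx = round((x + scale) / (2 * scale) * levels)
        out.append(idx / levels * 2 * scale - scale)
    return out

def l2_error(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

if __name__ == "__main__":
    rng = random.Random(0)
    # One dominant coordinate plus many small ones: a hard case for a shared scale.
    x = [10.0] + [0.1 * (-1) ** i for i in range(15)]
    R = random_orthogonal(len(x), rng)
    direct = quantize(x, bits=3)
    # Rotate, quantize in the rotated basis, then rotate back with the transpose.
    rotated = apply(transpose(R), quantize(apply(R, x), bits=3))
    print("error without rotation:", l2_error(x, direct))
    print("error with rotation:   ", l2_error(x, rotated))
```

Because the matrix is orthogonal, rotating changes neither the vector's length nor the distances between vectors, and the transpose undoes it exactly. Whether the rotated version actually quantizes with lower error depends on the vector and the random draw, so the sketch prints both errors rather than asserting a winner.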

Korean

4

34

127

22.6K

Sungmin Cha retweeted

Rohan Paul@rohanpaul_ai·25 Mar

Google’s TurboQuant research blog, published yesterday, is rattling memory stocks in financial markets. Shares of major memory and storage suppliers had declined during early market action on Wednesday. Micron Technology (MU) was down 4%, Western Digital had slid 4.4%, Seagate Technology (STX) had declined 5.6%, and Sandisk (SNDK) had sunk 6.5%. KV cache is the running memory an LLM keeps so it does not recompute every past token, and that memory grows fast as context windows get longer. TurboQuant says Google can shrink that cache to 3 bits per value with no retraining, which means roughly 6x less KV memory while keeping quality close to intact. That hits the part of the AI hardware story that made MU, WDC, SNDK, and STX attractive, because fewer bits per model session can mean fewer high-end memory chips per server. As to the mechanism of TurboQuant, Google takes a long list of numbers that represent the model’s memory and turns that list a little, like rotating a pile of objects so they line up better in a box. That makes the numbers easier to store in a very low-precision form, so each one uses far fewer bits while still keeping most of the useful pattern. The second step is a cleanup pass that fixes part of the distortion caused by that heavy compression, so the model can still find the right past information instead of getting confused by the rougher stored version. Google also claims up to 8x faster performance on H100 for some key operations, so this is not only about saving memory but also about moving data with less friction. The selloff makes sense as a first reaction, but it may be too aggressive because lab wins do not automatically become industry-wide deployment and AI demand is still hitting hard supply limits. --- seekingalpha.com/news/4568538-google-reveals-algorithms-to-address-ai-memory-challenges-memory-stocks-drop

Rohan Paul@rohanpaul_ai

This is massive. Google released TurboQuant, advanced theoretically grounded quantization algorithms - massive compression for LLMs. Tackles one of the nastiest costs in long-context LLMs: the KV cache, which stores small memory vectors for every past token and keeps growing as the prompt gets longer. The usual fix is quantization, where each number is stored with far fewer bits, but most methods quietly add bookkeeping data, so the real memory savings are smaller than they seem. Google’s idea is a 2-stage compressor that keeps the useful geometry of those vectors while stripping out most of that hidden overhead. PolarQuant first randomly rotates the vector, then rewrites pairs of coordinates as a length and an angle, which makes the data easier to pack tightly without storing extra per-block constants. That captures most of the signal, and then QJL uses just 1-bit signs on the tiny leftover error so the final attention score stays accurate instead of drifting. A simple way to picture it is this: PolarQuant stores the main shape of the memory, and QJL stores a tiny correction note almost for free. The other smart part is that this works without retraining or fine-tuning, so it can sit under an existing model rather than forcing the whole system to learn a new format. In Google’s tests, TurboQuant cut KV cache memory by at least 6x, reached 3-bit storage with no accuracy drop on long-context benchmarks, and showed up to 8x faster attention scoring at 4-bit on H100 GPUs. That is a big deal because long prompts are often bottlenecked not by raw compute, but by the cost of moving huge amounts of memory around. Overall, the real advance is not just better compression, but compression that attacks hidden overhead directly, which is why the speed gains look unusually strong for something this lightweight.
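To make the KV cache sizes discussed above concrete, here is a back-of-the-envelope calculation. It is a sketch under stated assumptions: the model shape (32 layers, 8 KV heads, head dimension 128, 128K tokens) is hypothetical and not taken from any source quoted here, and the formula counts only the stored key and value entries, ignoring any per-block metadata a real quantization scheme might add.

```python
def kv_cache_bytes(layers, kv_heads, head_dim, tokens, bits_per_value):
    """Bytes needed to hold keys and values for one sequence."""
    # Factor 2: one key vector and one value vector per token, head, and layer.
    values = 2 * layers * kv_heads * head_dim * tokens
    return values * bits_per_value // 8

if __name__ == "__main__":
    # Hypothetical model shape, chosen only for illustration.
    shape = dict(layers=32, kv_heads=8, head_dim=128, tokens=128 * 1024)
    full = kv_cache_bytes(bits_per_value=16, **shape)
    small = kv_cache_bytes(bits_per_value=3, **shape)
    print(f"16-bit: {full / 2**30:.1f} GiB, 3-bit: {small / 2**30:.1f} GiB, "
          f"ratio: {full / small:.2f}x")
```

For this assumed shape the cache takes 16 GiB at 16 bits per value and 3 GiB at 3 bits, a raw ratio of about 5.3x. The "at least 6x" figure quoted above is the source's own claim; this sketch does not model the baseline or overheads that figure is measured against.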

English

16

18

75

13.1K

Sungmin Cha retweeted

Prince Canuma@Prince_Canuma·25 Mar

Just implemented Google’s TurboQuant in MLX and the results are wild! Needle-in-a-haystack using Qwen3.5-35B-A3B across 8.5K, 32.7K, and 64.2K context lengths: → 6/6 exact match at every quant level → TurboQuant 2.5-bit: 4.9x smaller KV cache → TurboQuant 3.5-bit: 3.8x smaller KV cache The best part: Zero accuracy loss compared to full KV cache.

Google Research@GoogleResearch

Introducing TurboQuant: Our new compression algorithm that reduces LLM key-value cache memory by at least 6x and delivers up to 8x speedup, all with zero accuracy loss, redefining AI efficiency. Read the blog to learn how it achieves these results: goo.gle/4bsq2qI

English

148

412

5.2K

723.9K

Sungmin Cha retweeted

Alex Finn@AlexFinn·25 Mar

This is potentially the biggest news of the year. Google just released TurboQuant, an algorithm that makes LLMs smaller and faster without losing quality. Meaning that 16GB Mac Mini now can run INCREDIBLE AI models. Completely locally, free, and secure. This also means: • Much larger context windows possible with way less slowdown and degradation • You’ll be able to run high quality AI on your phone • Speed and quality up. Prices down. The people who made fun of you for buying a Mac Mini now have major egg on their face. This pushes all of AI forward in such a MASSIVE way. It can’t be stated enough: props to Google for releasing this for all. They could have gatekept it for themselves like I imagine a lot of other big AI labs would have. They didn’t. They decided to advance humanity. 2026 is going to be the biggest year in human history.

Google Research@GoogleResearch

Introducing TurboQuant: Our new compression algorithm that reduces LLM key-value cache memory by at least 6x and delivers up to 8x speedup, all with zero accuracy loss, redefining AI efficiency. Read the blog to learn how it achieves these results: goo.gle/4bsq2qI

English

332

879

9.7K

1.5M

Sungmin Cha retweeted

Rohan Paul@rohanpaul_ai·25 Mar

This is massive. Google released TurboQuant, advanced theoretically grounded quantization algorithms - massive compression for LLMs. Tackles one of the nastiest costs in long-context LLMs: the KV cache, which stores small memory vectors for every past token and keeps growing as the prompt gets longer. The usual fix is quantization, where each number is stored with far fewer bits, but most methods quietly add bookkeeping data, so the real memory savings are smaller than they seem. Google’s idea is a 2-stage compressor that keeps the useful geometry of those vectors while stripping out most of that hidden overhead. PolarQuant first randomly rotates the vector, then rewrites pairs of coordinates as a length and an angle, which makes the data easier to pack tightly without storing extra per-block constants. That captures most of the signal, and then QJL uses just 1-bit signs on the tiny leftover error so the final attention score stays accurate instead of drifting. A simple way to picture it is this: PolarQuant stores the main shape of the memory, and QJL stores a tiny correction note almost for free. The other smart part is that this works without retraining or fine-tuning, so it can sit under an existing model rather than forcing the whole system to learn a new format. In Google’s tests, TurboQuant cut KV cache memory by at least 6x, reached 3-bit storage with no accuracy drop on long-context benchmarks, and showed up to 8x faster attention scoring at 4-bit on H100 GPUs. That is a big deal because long prompts are often bottlenecked not by raw compute, but by the cost of moving huge amounts of memory around. Overall, the real advance is not just better compression, but compression that attacks hidden overhead directly, which is why the speed gains look unusually strong for something this lightweight.

Google Research@GoogleResearch

Introducing TurboQuant: Our new compression algorithm that reduces LLM key-value cache memory by at least 6x and delivers up to 8x speedup, all with zero accuracy loss, redefining AI efficiency. Read the blog to learn how it achieves these results: goo.gle/4bsq2qI

English

18

25

195

52.4K

Sungmin Cha retweeted

Sukh Sroay@sukh_saroy·23 Mar

🚨 BREAKING: You asked AI to improve your writing. It changed what you were actually saying. New research just proved it. In a controlled study, heavy AI writing assistance led to a 70% increase in essays that gave no clear answer to the question being asked. Not unclear writing. Neutral writing. The kind that sounds polished but commits to nothing. Here's what makes this worse: Researchers took essays written in 2021 — before ChatGPT existed — and asked an LLM to revise them based on real expert feedback. The instruction was simple: fix the grammar. The model changed the meaning anyway. Every time. It can't help it. The training pushes toward inoffensive, agreeable, averaged-out text. That's not a bug they can patch. It's the objective function. And then there's the peer review finding. 21% of reviews at a recent top AI conference were AI-generated. Those reviews scored papers a full point higher on average. They also placed significantly less weight on clarity and significance — the two things peer review is supposed to evaluate. So we're not just talking about your email sounding a little corporate. We're talking about AI quietly flattening scientific discourse. Laundering opinions into non-answers. Replacing your voice with the mean of everyone's voice. The industry keeps asking: is AI-written content detectable? Wrong question. The right question is: what are we losing when a billion people let the same model edit their thinking?

English

58

291

772

46.8K

Sungmin Cha retweeted

NeurIPS Conference@NeurIPSConf·23 Mar

Following the success of the EurIPS and NeurIPS-Mexico City pilots in 2025, we are thrilled to announce two official NeurIPS 2026 satellite events for this year! These will be held in Paris, France and Atlanta, USA, respectively, running alongside the main venue in Sydney, Australia. Both satellite events will feature keynotes, oral and poster presentations of accepted NeurIPS 2026 papers, as well as workshops. We are planning tutorials, affinity events, and other elements for the satellite sites and we'll share more information as planning advances. Wherever you choose to join us, the entire NeurIPS organizing committee is working hard to deliver an outstanding experience for the whole community! neurips.cc

English

13

62

515

123.7K
