J. Lyle Kim

192 posts

@jlylekim

Quantum computing research scientist @jpmorgan | PhD @RiceCompSci | BA @UChicago | Optimization, quantum computing, machine learning | 🇰🇷

New York, USA Joined November 2020

333 Following 246 Followers

J. Lyle Kim@jlylekim·15h

@AlexShtf :) proceedings.neurips.cc/paper_files/pa… this is from last year, and it's about good old convex optimization and how quantum can help.

Alex Shtoff@AlexShtf·20h

@jlylekim Is there any optimizer work that is not about "optimizing for language models"? :)

199

J. Lyle Kim@jlylekim·1d

🚨New paper: Anytime Training with Schedule-Free Spectral Optimization🚨 We introduce SF-NorMuon, a schedule-free spectral method that outperforms or matches heavily tuned AdamW across 125M and 772M parameter language models.

126

14.1K

J. Lyle Kim@jlylekim·16h

@tmpethick Very interesting, will def check out!

120

Thomas Pethick@tmpethick·23h

@jlylekim Looks interesting! We’re using schedule-free to get rid of weight decay tuning in arxiv.org/pdf/2605.11172 - I’m curious if you can somehow combine to get rid of both

490

J. Lyle Kim@jlylekim·1d

Joint work with @anujsapte, Pranav Deshpande, Niraj Kumar, and Shouvanik Chakrabarti at Global Technology Applied Research, JPMorganChase.

453

J. Lyle Kim@jlylekim·1d

The takeaway? High-quality checkpoints at any point during the run without committing to a fixed training horizon upfront. Read the full paper here: arxiv.org/abs/2605.23061

613

J. Lyle Kim reposted

Aaron Defazio@aaron_defazio·5d

🚨 New Paper 🚨 ScheduleFree+: Scaling Learning-Rate-Free & Schedule-Free Learning to Large Language Models A few modifications to Schedule-Free Learning make it completely LR tuning free, and allow it to greatly outperform schedules for long duration training! arxiv.org/abs/2605.19095…

416

82K

J. Lyle Kim reposted

OptimaLab@optimalab1·Apr 27

During neural network training, the loss landscape gets sharper until it hits a ceiling. GD pins right at the ceiling. SGD settles below it — and the gap grows as you shrink the batch. Why? We now have the answer. arxiv.org/abs/2604.21016 🧵 Blog: akyrillidis.github.io/aiowls/stochas…

414

36.5K

J. Lyle Kim@jlylekim·Apr 3

@bremen79 Explain why the one you solved is useful

563

Francesco Orabona@bremen79·Apr 2

Us: We solve 1 open problem in FamousPaper'18 Reviewer: IrrelevantPaper'21 solved it! Us: Nope, IrrelevantPaper'21 is a different problem Reviewer: But other open problems in FamousPaper'18 were solved, this means the one you solved is useless Us: ... What would you do? #ICML26

15.9K

J. Lyle Kim reposted

The Big Karuna@Nondual8·Jan 23

Hi Madison, I understand this well. I’m 66 years old. My first computer was a DEC PDP-8 with an ASR 33 teletype as the input output device. For most of my career I’ve written embedded software for medical devices – clinical ultrasound systems, and defibrillators. It’s taking me about a year to adjust to the double whammy of being somewhat, but not entirely replaceable by AI and also generally being judged as too old and used up. In fact, neither of those characteristics or attributes defined who I am. I will say that it’s going to be interesting to see how this unfolds. You and I just happened to be members of the field where this effect is felt first. Yet I think it will sweep all knowledge work. Well, it’s taken some time. I’m at peace. Life just unfolds. And I would urge you to not identify too strongly with what you think of as you. You are more than a software engineer and software engineering does not define you at all. Easier said than done, perhaps. But I would urge you to just sit with the uncomfortable feelings if you have them. And when you feel like it invest a little bit of thought in the question, how can I be useful to others around me? That is where I am spending my time. Wishing you peace of mind and ease in the world.

552

53.6K

J. Lyle Kim reposted

Elon Musk@elonmusk·Jan 23

For quality of life, it is better to err on the side of being an optimist and wrong, rather than a pessimist and right

12.2K

30K

251.5K

48.8M

J. Lyle Kim@jlylekim·Jan 17

How does SuperGrok perform for math-ish research? I see a lot of comparison between other models but rarely see any tweets about SuperGrok for research

J. Lyle Kim@jlylekim·Jan 14

@rather_deep @deviparikh Well put imho

Deep@rather_deep·Jan 14

@deviparikh No, math is deterministic, writing is art. Only if you knew how thoughts shape language & language shapes thoughts you'd understand it. Each choice you make while writing is a decision that shows your view, sometimes accurately, disregarding it means you let go of fine control

117

Devi Parikh@deviparikh·Jan 14

Just because an AI wrote it, doesn’t mean I don’t mean it. Just because I used AI, doesn’t mean I don’t care. It’s like saying, “If you really cared about this contract, you’d do the budget math by hand.” Me being efficient with my time doesn’t mean I respect yours less. All that matters is — was it good? The problem with AI slop isn’t the AI. It’s the slop. If it took you 15 minutes to read and it was bad, I wasted your time. Regardless of whether it took me 5 minutes or 3 hours to write, with or without AI.

175

16.7K

J. Lyle Kim@jlylekim·Jan 14

@deviparikh Isn't the analogy more like "If you really cared about this contract, you'd actually read it instead of asking AI to summarize"? I presume you don't write everything with AI. Why not?

207

J. Lyle Kim@jlylekim·Dec 27

@humphilomath @npparikh @irl_danB Lol, tensors?

bilinear form@humphilomath·Dec 27

@npparikh @irl_danB Lol, what is not a matrix?

dan@irl_danB·Dec 26

Anthropic is a billion-dollar company????? a billion???????? dollars??????????????? for literally... matrix. multiplication.

2.7K

183.7K

J. Lyle Kim@jlylekim·Dec 18

@alz_zyd_ I do regret taking media aesthetics

162

J. Lyle Kim reposted

alz@alz_zyd_·Dec 17

as a UChicago alum, one of my great life regrets is I took a real SOSC (power) but a fake HUM (language and the human). UChi kids: take real SOSC and HUM classes! You're at one of the few places in America where you can still get a good classical education, don't miss your chance

219

11.7K
