vxnuaj

18 posts

@vxnuaj

grokking.

sf · Joined September 2022

522 Following · 4.7K Followers

vxnuaj@vxnuaj·2h

@danielcberk hi. been running for 4 years. currently on a build up to 100 miles a week before racing a half marathon (hopefully @ mid 1:10’s)

439

Daniel Berk 🐝@danielcberk·8h

I want to start a group chat for runners. 32 people max. Two reqs: 1. You need to have an iPhone (sorry green bubbles) 2. You need to be a runner. You don’t need to be elite. You just need to be committed. We’ll make each other stronger, faster, better. Who wants in?

11.5K

vxnuaj@vxnuaj·3h

@nikitabier @JeebsTX nikita wtf is this

Nikita Bier@nikitabier·3h

@JeebsTX It would look better if we made the decision to change the Like button: 👍 422 👎 Not sure if I’m ready for the mob that it would create on this peaceful Sunday.

466

1.3K

124.4K

JeebsTX 🇺🇸@JeebsTX·4h

Hey @nikitabier, that dislike button looks so out of place in replies!! What's the end goal here? I can select “Spam” On Elon’s reply? lol

217

62K

vxnuaj@vxnuaj·15h

@readswithravi this mf marcus is a genius.

Reads with Ravi@readswithravi·1d

Marcus Aurelius wrote this over 1800 years ago: “Think of yourself as dead. You have lived your life. Now take what's left and live it properly.”

2.5K

21.2K

605.7K

vxnuaj@vxnuaj·1d

@nikitabier @jack one of your engineers made a typo in their css

2.2K

Nikita Bier@nikitabier·1d

Happy anniversary to this app. Thank you for bringing us all together @jack

jack@jack

five words. 20 years. unfinished.

230

5.4K

314.6K

vxnuaj@vxnuaj·1d

nothing is inevitable except problems but all problems are soluble

127

vxnuaj@vxnuaj·1d

if you think AI will eventually "solve" an entire field in the limit, you're implicitly asserting that the growth of knowledge is fundamentally bounded. claims about a field eventually being fully solved in the limit quietly assume the set of meaningful questions and problems is exhaustible and non-generative. which is simply not true.

249

vxnuaj@vxnuaj·3 Mar

paper: arxiv.org/pdf/2602.21545 paper repo: github.com/K1seki221/Muon… modded-nanogpt fork: github.com/vxnuaj/modded-…

455

vxnuaj@vxnuaj·3 Mar

if anyone wants to try to finish up a speedrun attempt for the NanoGPT speedrun wr, i've implemented Muon+ (improves on NorMuon by focusing on post-orthogonalization normalization w/o param-wise lr scaling), but haven't quite hit the record (quite close, at ~3.29 loss). won't be working on this as I didn't initially intend to attempt to beat the record, but given that it's quite close, figured someone might find it worth a shot. putting links in comment below.
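For readers unfamiliar with the idea, here is a minimal sketch of what "post-orthogonalization normalization" could look like. It is not the Muon+ implementation from the linked paper or fork. The Newton-Schulz iteration follows the publicly known Muon recipe, while the row-wise RMS normalization, the function names, and the learning rate are assumptions chosen only for illustration.

```python
import math

def matmul(A, B):
    # Plain-Python matrix product for small illustrative matrices.
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def transpose(A):
    return [list(col) for col in zip(*A)]

def newton_schulz(G, steps=5, eps=1e-7):
    # Quintic Newton-Schulz iteration as used in Muon to approximately
    # orthogonalize G. Assumes rows <= cols (no transpose handling here).
    a, b, c = 3.4445, -4.7750, 2.0315
    fro = math.sqrt(sum(x * x for row in G for x in row))
    X = [[x / (fro + eps) for x in row] for row in G]
    for _ in range(steps):
        A = matmul(X, transpose(X))
        AA = matmul(A, A)
        n = len(A)
        B = [[b * A[i][j] + c * AA[i][j] for j in range(n)] for i in range(n)]
        BX = matmul(B, X)
        X = [[a * x + bx for x, bx in zip(xr, bxr)] for xr, bxr in zip(X, BX)]
    return X

def rms_normalize_rows(X, eps=1e-7):
    # ASSUMPTION: normalize each row to unit RMS after orthogonalization.
    # The normalization actually used by Muon+ is not stated in the post.
    out = []
    for row in X:
        rms = math.sqrt(sum(x * x for x in row) / len(row))
        out.append([x / (rms + eps) for x in row])
    return out

def post_orth_normalized_step(W, G, lr=0.02):
    # Hypothetical update: orthogonalize, normalize, then apply one global
    # learning rate (no per-parameter lr scaling).
    O = rms_normalize_rows(newton_schulz(G))
    return [[w - lr * o for w, o in zip(wr, orow)] for wr, orow in zip(W, O)]
```

A real optimizer would also carry momentum and handle tall matrices by transposing; both are omitted to keep the sketch short.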

839

vxnuaj@vxnuaj·28 Feb

@harvie_z_z_w @unakar666 i wasn’t referring to the residual connections themselves, but rather to the identity transformation being better than gates for the residual connections.
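As a rough illustration of the distinction drawn in this reply, the sketch below contrasts an identity skip connection with a highway-style gated one. It is a generic toy example, not the mHC "hres" construction discussed in the thread. The function names and the per-element sigmoid gate are assumptions made for illustration.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def identity_residual(x, f):
    # Identity skip: the input passes through unchanged and f(x) is added.
    return [xi + fi for xi, fi in zip(x, f(x))]

def gated_residual(x, f, gate_logits):
    # Highway-style gate: t * f(x) + (1 - t) * x, with t = sigmoid(logit).
    # The skip path is scaled by (1 - t), so it is no longer an identity map.
    fx = f(x)
    out = []
    for xi, fi, g in zip(x, fx, gate_logits):
        t = sigmoid(g)
        out.append(t * fi + (1.0 - t) * xi)
    return out
```

With a block that outputs zeros, the identity version returns `x` untouched, while the gated version attenuates it by `1 - t`.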

168

harvie@harvie_z_z_w·28 Feb

@vxnuaj @unakar666 It’s gated connections / highway network and the plagiarism-hc didn’t cite it, even the first gradient vanishing paper by Sepp

106

Tian Xie (Unakar)@unakar666·28 Feb

DeepSeek mHC: dropping the "m" improves performance. Empirical finding: identity hres outperforms the original design. Concurrent discoveries by multiple teams confirm this zhuanlan.zhihu.com/p/201085238967…