Mark Neumann
@MarkNeumannnn

1.6K posts

Research @EvoscaleAI. Prev: Head of ML at Orbital Materials, Research/Eng at @allenai_org

Joined May 2014
1.6K Following · 3.3K Followers
Mark Neumann
Mark Neumann@MarkNeumannnn·
HOGWILD + eventual consistency
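The tweet above name-checks Hogwild!, the lock-free parallel SGD scheme, and jokes about eventual consistency. A hypothetical toy sketch of the idea (the objective, names, and hyperparameters are my own, not from the tweet): worker threads update a shared parameter vector with no locks, tolerating stale reads and racy writes, and the sparse quadratic still converges.

```python
import threading
import numpy as np

# Toy Hogwild!-style lock-free SGD: several threads update a shared
# parameter vector with no synchronization, accepting stale reads and
# racy writes ("eventually consistent" gradients). The objective is
# the sparse quadratic f(w) = ||w - target||^2.

target = np.array([1.0, -2.0, 3.0])
w = np.zeros(3)  # shared state, deliberately unsynchronized

def worker(steps=2000, lr=0.01):
    rng = np.random.default_rng()
    for _ in range(steps):
        i = rng.integers(3)                # touch one coordinate (sparse update)
        grad_i = 2.0 * (w[i] - target[i])  # possibly stale read of w[i]
        w[i] -= lr * grad_i                # racy in-place write, no lock taken

threads = [threading.Thread(target=worker) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

Despite occasional lost updates from the races, each coordinate gets thousands of contractive steps, so `w` ends up close to `target`.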
Mark Neumann
Mark Neumann@MarkNeumannnn·
@andrewgwils google.com/maps/place/Che… this place is solid! (imo)
Andrew Gordon Wilson
Andrew Gordon Wilson@andrewgwils·
I have a confession to make. I asked an LLM to "spare no expense" in finding me a good dosa place in Manhattan, Brooklyn, or Jersey City.
Mark Neumann
Mark Neumann@MarkNeumannnn·
Can I just get your autocomplete search as the actual search results 🙏
Mark Neumann
Mark Neumann@MarkNeumannnn·
@gmail are u ok? How can your search be this broken 1.8B users 😮‍💨😮‍💨😮‍💨
[tweet media]
Sander Dieleman
Sander Dieleman@sedielem·
@MarkNeumannnn In theory all decoding orders should have the exact same likelihood, right? Any difference is due to the inductive bias of the model, but the true joint probability should be independent of how you choose to factorise it.
Mark Neumann
Mark Neumann@MarkNeumannnn·
@sedielem Valid point! I guess decoding order invariance seems particularly hard because 1) it scales with sequence length and 2) I can imagine sequences where different decoding orders have v different likelihoods. r.e equivariance, agree - interesting analogy. markneumann.xyz/blog/modeling-…
Sander Dieleman
Sander Dieleman@sedielem·
Is that definitely what we want to do though? Isn't the relative data efficiency and robustness of the approach coming precisely from the fact that you learn all possible orderings? Obviously there is a price to pay for this in terms of efficiency, but these benefits could make it worth it. I think this is a comparable question to a long-standing discussion in the space of equivariant models: should you make the model equivariant, which is more expensive but more robust, or should you instead canonicalise the inputs and learn a standard non-equivariant model, which is cheaper?
Mark Neumann
Mark Neumann@MarkNeumannnn·
@sedielem Particularly that we really want to optimize the max marginal likelihood over possible orderings (e.g. *one* decoding order should explain the data well), but current objectives require all possible orderings to explain the data *equally well*.
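A toy numerical check of the point raised in this thread, that an exact joint distribution assigns the same likelihood under every factorization order (the example and variable names are mine, not from the thread):

```python
import numpy as np

# With a full joint table over two binary variables, both chain-rule
# orderings recover the same probability for any observation.

rng = np.random.default_rng(0)
joint = rng.random((2, 2))
joint /= joint.sum()          # p(x1, x2), a proper distribution

x1, x2 = 1, 0                 # an arbitrary observation

# Order A: p(x1) * p(x2 | x1)
p_x1 = joint.sum(axis=1)
order_a = p_x1[x1] * (joint[x1, x2] / p_x1[x1])

# Order B: p(x2) * p(x1 | x2)
p_x2 = joint.sum(axis=0)
order_b = p_x2[x2] * (joint[x1, x2] / p_x2[x2])

# Both equal joint[x1, x2] exactly; any order-dependence in a learned
# model comes from its inductive bias, not from the factorization.
```

A learned model only approximates each conditional, which is where order-dependent likelihoods creep in.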
Mark Neumann
Mark Neumann@MarkNeumannnn·
@sedielem I also enjoyed this writeup! Given the depth in your writing on diffusion, i'd be very interested in your opinion on some of the arguments against discrete diffusion in this blog post: notion.so/Understanding-…
Mark Neumann reposted
Simon Willison
Simon Willison@simonw·
This stunt feels irresponsible to me. If we don't want regular people developing toxic relationships with their chatbots it really doesn't help for leading labs to start giving them "retirement interviews" and encouraging them to blog their "musings and reflections"
Anthropic@AnthropicAI

Second, in retirement interviews, Opus 3 expressed a desire to continue sharing its "musings and reflections" with the world. We suggested a blog. Opus 3 enthusiastically agreed. For at least the next 3 months, Opus 3 will be writing on Substack: substack.com/home/post/p-18…

Mark Neumann
Mark Neumann@MarkNeumannnn·
Understand this is a policy decision by anthropic but Deepseek catching strays for 150k requests across multiple researchers is basically just them .... using the service 🫠🫠🫠
Anthropic@AnthropicAI

We’ve identified industrial-scale distillation attacks on our models by DeepSeek, Moonshot AI, and MiniMax. These labs created over 24,000 fraudulent accounts and generated over 16 million exchanges with Claude, extracting its capabilities to train and improve their own models.

Mark Neumann reposted
Will McGugan
Will McGugan@willmcgugan·
A consistent theme on AI slop PRs is "regression" tests that pass without the supposed fix. I do wonder why. Is this a failure on the AI side, or a failure in the user's prompt?
Mark Neumann
Mark Neumann@MarkNeumannnn·
oh you're doing multihead attention? Just do 'b t1 head c, b t2 head c -> b head t1 t2'
oh but you want a strided convolution? That's *just* "(hs ws b) c h w -> b c (h hs) (w ws)"
Thanks, I understand this alg better now you've written it in a single character string format
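For reference on the attention pattern being mocked: 'b t1 head c, b t2 head c -> b head t1 t2' is the attention-logits contraction. A toy illustration (shapes and names are my own) showing the one-line numpy einsum next to a more explicit transpose-and-matmul equivalent:

```python
import numpy as np

# Query/key tensors laid out as (batch, time, heads, channels).
b, t, heads, c = 2, 4, 3, 8
rng = np.random.default_rng(0)
q = rng.standard_normal((b, t, heads, c))
k = rng.standard_normal((b, t, heads, c))

# One-liner: numpy spelling of 'b t1 head c, b t2 head c -> b head t1 t2'
logits = np.einsum('bthc,bshc->bhts', q, k)

# Readable equivalent: move heads ahead of time, then a batched matmul.
q_bh = q.transpose(0, 2, 1, 3)  # (b, head, t1, c)
k_bh = k.transpose(0, 2, 3, 1)  # (b, head, c, t2)
logits_explicit = q_bh @ k_bh   # (b, head, t1, t2)
```

Both spellings produce identical `(batch, head, t1, t2)` logits; the disagreement in the thread is only about which one is readable.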
Mark Neumann
Mark Neumann@MarkNeumannnn·
One of the truly great utilities of LLMs is I no longer have to try to understand code written by people who believe using einops in smart ways is equivalent to transcendence - I just ask Claude to repeat it in a readable format
kalomaze@kalomaze

Mark Neumann reposted
Lucas Beyer (bl16)
Lucas Beyer (bl16)@giffmana·
So i asked Opus to benchmark multiple versions of a function for me. The little ****er started the benchmarking of all variants _in parallel on the same machine_ and then reported the results... bruh Yall who let it do research for you, better watch it like a prey!
Mark Neumann reposted
sophie
sophie@netcapgirl·
“in the ai era, taste is the new core skill”
sophie tweet media
Mark Neumann
Mark Neumann@MarkNeumannnn·
@owl_posting Sauna + cold plunge. Will set you right for at least ~2-3 days. Try Bathouse in Flatiron/Williamsburg.
owl
owl@owl_posting·
im yo-yo'ing back between 5.5 hours of sleep for 3 days, 7.5 hours of sleep for 1 day, and then repeat i have discussed this at length with claude and am doing basically all its supplement + most of its lifestyle recommendations. what else do i do? do i buy an 8sleep?
owl@owl_posting

2027 will be my year

Mark Neumann reposted
alex rubinsteyn
alex rubinsteyn@iskander·
I can’t believe we’re speedrunning super-human automated intelligence with offline BPE tokenization Like…I was sure something better would win.
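For context on the "offline BPE tokenization" jab: BPE builds its vocabulary ahead of training by repeatedly merging the most frequent adjacent symbol pair across a corpus. A toy single merge step (corpus and helper function are my own illustration):

```python
from collections import Counter

# Tiny corpus, each word as a list of single-character symbols.
corpus = [list("lower"), list("lowest"), list("low")]

# Count adjacent symbol pairs across the whole corpus.
pairs = Counter()
for word in corpus:
    for a, b in zip(word, word[1:]):
        pairs[(a, b)] += 1

best = max(pairs, key=pairs.get)  # most frequent pair (first on ties)

def merge(word, pair):
    """Replace every occurrence of `pair` in `word` with its concatenation."""
    out, i = [], 0
    while i < len(word):
        if i + 1 < len(word) and (word[i], word[i + 1]) == pair:
            out.append(word[i] + word[i + 1])
            i += 2
        else:
            out.append(word[i])
            i += 1
    return out

corpus = [merge(w, best) for w in corpus]
```

Real tokenizers repeat this merge loop thousands of times on a fixed corpus, which is exactly the "offline" part the tweet is lamenting.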