lineardiff

1.2K posts

lineardiff

@lineardiff

isolation, perception, and communication

Joined June 2023
409 Following · 164 Followers
lineardiff @lineardiff
@she_llac probably not “curable” if it underlies continual/effective learning
shellac @she_llac
i hope that we'll cure sleep in the next 20 years
lineardiff @lineardiff
@kalomaze the only thing stopping you getting pwned these days is being unimportant enough or not worth the time
kalomaze @kalomaze
[image]
Feross @feross

🚨 CRITICAL: Active supply chain attack on axios -- one of npm's most depended-on packages. The latest axios@1.14.1 now pulls in plain-crypto-js@4.2.1, a package that did not exist before today. This is a live compromise. This is textbook supply chain installer malware.

axios has 100M+ weekly downloads. Every npm install pulling the latest version is potentially compromised right now. Socket AI analysis confirms this is malware.

plain-crypto-js is an obfuscated dropper/loader that:
• Deobfuscates embedded payloads and operational strings at runtime
• Dynamically loads fs, os, and execSync to evade static analysis
• Executes decoded shell commands
• Stages and copies payload files into OS temp and Windows ProgramData directories
• Deletes and renames artifacts post-execution to destroy forensic evidence

If you use axios, pin your version immediately and audit your lockfiles. Do not upgrade.

lineardiff @lineardiff
@leothecurious imo it’s likely to be one of the most critical aspects of human learning, although i think the augmentation probably happens internally in a semi-abstract way. i think this mechanism might be one of evolution’s largest contributions.
davinci @leothecurious
how much data augmentation does a human really need when learning?
lineardiff @lineardiff
@beffjezos obv they’ll just fork it and old vulnerable coins will become worthless
François Fleuret @francoisfleuret
It's very simple tbh, I want the model to:
[image]
Lazarz @Laz4rz
[image]
Samip @industriaalist

here's @JeffDean talking about how labs will do multi-epoch pretraining with heavy regularization to keep scaling even with limited data. no wonder slowrun gets so much attention from pretraining teams at big labs. pretraining is about to look very very different.

lineardiff @lineardiff

@apples_jimmy sometimes i wonder if this is an intentional marketing strategy, it just seems too easy to whip up everybody into mania repeatedly with any hint of the next big thing. on the other hand, capabilities continue to improve

lineardiff @lineardiff
@RyanPGreenblatt been a fan of ARC for many years now, since Icecuber. think the guys behind it are great, but worried it’s starting to push a bit into bad faith territory now.
Ryan Greenblatt @RyanPGreenblatt
I wish they published the performance for each human baseliner rather than just the performance of the second-best human run on each task. My current guess is that the median human baseliner would score around 15% on the metric, but we can't check because the data isn't public!
ARC Prize @arcprize

Announcing ARC-AGI-3: the only unsaturated agentic intelligence benchmark in the world. Humans score 100%, AI <1%. This human-AI gap demonstrates we do not yet have AGI. Most benchmarks test what models already know; ARC-AGI-3 tests how they learn.

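To make the gap Greenblatt is pointing at concrete, here is a toy calculation with invented numbers (not ARC data): a per-task summary based on the second-best human run can sit well above what the median individual baseliner scores across tasks.

```python
import statistics

# rows = individual human baseliners, columns = tasks; 1 = solved, 0 = not (invented data)
runs = [
    [1, 1, 0, 0, 0],
    [0, 0, 1, 1, 0],
    [1, 0, 1, 0, 0],
    [0, 1, 0, 1, 1],
]

# pooled per-task summary: score of the second-best run on each task
second_best = [sorted(task_col, reverse=True)[1] for task_col in zip(*runs)]
pooled_score = sum(second_best) / len(second_best)

# per-person summary: fraction of tasks solved by the median individual baseliner
individual_scores = [sum(r) / len(r) for r in runs]
median_score = statistics.median(individual_scores)

print(pooled_score)   # 0.8 -- "humans solve 80% of tasks" under the pooled summary
print(median_score)   # 0.4 -- the median individual solves half as many
```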
lineardiff @lineardiff
@AcerFur they plausibly have the compute, but probably not. they do not have the data, but they could buy it.
Jimmy Apples 🍎/acc @apples_jimmy
“A draft blog post that was available in an unsecured and publicly-searchable data store prior to Thursday evening said the new model is called “Claude Mythos” and that the company believes it poses unprecedented cybersecurity risks.”
[image]
Samip @industriaalist
here's @JeffDean talking about how labs will do multi-epoch pretraining with heavy regularization to keep scaling even with limited data. no wonder slowrun gets so much attention from pretraining teams at big labs. pretraining is about to look very very different.
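A toy sketch of what multi-epoch pretraining with heavy regularization could look like. This is my own illustration of the claim, not Jeff Dean's recipe or any lab's setup; the tiny model, the synthetic "corpus", and the hyperparameters are all placeholders.

```python
import torch
import torch.nn as nn

# toy stand-ins: a small MLP instead of a language model, random tensors instead of text
model = nn.Sequential(
    nn.Linear(64, 256), nn.GELU(), nn.Dropout(p=0.2),  # dropout as one regularizer
    nn.Linear(256, 64),
)
# heavier-than-usual weight decay is the other regularization knob here
optimizer = torch.optim.AdamW(model.parameters(), lr=3e-4, weight_decay=0.1)
corpus = [torch.randn(16, 64) for _ in range(100)]  # fixed, limited dataset

for epoch in range(4):  # several passes over the same data instead of one
    for batch in corpus:
        loss = (model(batch) - batch).pow(2).mean()  # placeholder reconstruction objective
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
    print(epoch, loss.item())
```

The knob being turned is data reuse plus regularization strength, rather than an ever-larger unique corpus.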
lineardiff @lineardiff
noticing an obvious LLM inflection while reading a passage is like accidentally biting your tongue while eating
kalomaze @kalomaze
@lineardiff i am saying you could
- pretend a sequentially deep linear network is parallel during fwd (get true loss fast)
- do backward over the sequential structure (do sequential chain rule matrix products)
- get better inductive bias from backprop sequentiality, maintain fwd parallelism
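A minimal sketch of my reading of this idea, not kalomaze's code: parameterize one linear map as a product of square factors, let the batch multiply only the collapsed product (the "parallel" forward, same loss), and let autograd backprop through the chain of factor products (the sequential chain rule). The FactoredLinear class, dimensions, and init scale below are made up for illustration.

```python
from functools import reduce
import torch
import torch.nn as nn

class FactoredLinear(nn.Module):
    """One linear map parameterized as a product of `depth` square factors."""

    def __init__(self, dim: int, depth: int):
        super().__init__()
        self.factors = nn.ParameterList(
            [nn.Parameter(torch.randn(dim, dim) / dim**0.5) for _ in range(depth)]
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Collapse the factors once per step; this cost is independent of batch size,
        # so the per-example forward is a single matmul ("fwd parallelism").
        w_eff = reduce(torch.matmul, self.factors)
        # Gradients still flow back through each factor via sequential
        # chain-rule matrix products ("backprop sequentiality").
        return x @ w_eff.T

# usage: every factor receives a gradient even though the batch only saw w_eff
layer = FactoredLinear(dim=8, depth=4)
x = torch.randn(32, 8)
layer(x).pow(2).mean().backward()
print([f.grad.norm().item() for f in layer.factors])
```

The function computed is identical to a single linear layer; only the parameterization, and therefore the gradient structure, is deeper.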
kalomaze @kalomaze
most practitioners in ML know that you can't express deeper functions with pure linear projections that lack nonlinearities, but don't let this distract you from the fact that chain rule optimization over deeper linear networks imposes a factorized prior over optimization itself
lineardiff @lineardiff
@kalomaze i don’t understand this, can you go further?
kalomaze @kalomaze
@lineardiff in principle you could... optimize for a chain that the forward pass structure never computes, and get cleaner/better gradients, no inference cost, just bwd cost... (ofc you would need to do activation checkpointing but that's fairly typical anyways)
kalomaze @kalomaze
@lineardiff by my intuition, nonlinearities are doing selection/implicit branching, while the decomposition of linear projections in sequence is the actual primary thing causing backprop to build representations hierarchically (in a typical fixed depth fwd pass)