DeepBurner
@Deep_Burner
ML/CV engineer.
539 posts · Joined January 2025
634 Following · 39 Followers

DeepBurner@Deep_Burner·
@RogueWPA There's a large-ish RW substack called Reality's Last Stand, so yep
Cicada meth orgy fungus
Just remembered that twenty years ago very online libs described themselves as the "reality based community." Today if there were a substack or something called that, you'd know it was center-right, or at least anti-woke center-left.

Sauers@Sauers_·
@Deep_Burner Yes! Claude and I respond: "A jet is the data structure that forward-mode autodiff operates on. Every intermediate variable in the computation is also a jet."

Sauers@Sauers_·
A jet is just a function value with its derivatives up to some order. E.g., a 3rd-order jet is the tuple (f(x), f'(x), f''(x), f'''(x)). The Wikipedia page is difficult to understand for unknown reasons
Sauers tweet media
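The tuple arithmetic described above is easy to make concrete. A minimal sketch of a jet for forward-mode autodiff (the `Jet` class and its names are illustrative, not any particular library's API): store the truncated Taylor coefficients c_i = f^(i)(x)/i! and propagate them through arithmetic, then multiply back by i! to read off derivatives.

```python
from math import factorial

class Jet:
    """A function value with its derivatives up to some order,
    stored as Taylor coefficients c_i = f^(i)(x)/i!."""
    def __init__(self, coeffs):
        self.coeffs = list(coeffs)

    @classmethod
    def variable(cls, x, order):
        # The identity function t -> t expanded at x: (x, 1, 0, ..., 0)
        return cls([x, 1.0] + [0.0] * (order - 1))

    def __add__(self, other):
        return Jet([a + b for a, b in zip(self.coeffs, other.coeffs)])

    def __mul__(self, other):
        # Cauchy product of Taylor series, truncated at the jet's order
        n = len(self.coeffs)
        out = [0.0] * n
        for i in range(n):
            for j in range(n - i):
                out[i + j] += self.coeffs[i] * other.coeffs[j]
        return Jet(out)

    def derivatives(self):
        # Recover (f(x), f'(x), f''(x), ...) from the coefficients
        return [c * factorial(i) for i, c in enumerate(self.coeffs)]

# f(x) = x^3 at x = 2: every intermediate product is itself a jet
x = Jet.variable(2.0, order=3)
f = x * x * x
print(f.derivatives())  # [8.0, 12.0, 12.0, 6.0]
```

This matches the "every intermediate variable is also a jet" framing: `x * x` is already a jet, and one pass through the computation yields all derivatives up to the chosen order.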
Benjamin Todd@ben_j_todd·
How is it possible to write a substack with 6000+ likes where the main message is “LeCun is right about everything”?
Benjamin Todd tweet media

DeepBurner@Deep_Burner·
@actsmaniac The algo is hyper-personalized now. I liked one post on Project Hail Mary and it's now 20 percent of my feed
space cadet 🇪🇺🌐🇩🇪
is it just my algorithm, or has the amount of discussion of rent control on politics twitter significantly increased? I figured YIMBYism had already won on zoning and supply questions; the next big hurdle for efficient housing markets is strict rent regulation.

davinci@leothecurious·
> Ilya believes vision is a "solved" problem and focuses on language

SSI ngmi if true
Kyle Chan@kyleichan

I’ve been working my way through this epic 7-hour interview with Xie Saining at AMI Labs. I also asked Gemini to give me top 10 takeaways. Biggest ones are that he turned down Ilya twice and believes world models, not LLMs, are the key to AGI.

1. Non-Linear Path to AI & Academic Freedom: Xie emphasizes that his journey wasn't a standard, hyper-competitive path of a "genius." During his time in Shanghai Jiao Tong University's ACM class, his "highlight" was playing video games in his dorm, teaching him the value of unstructured exploration over rigid academic competition [11:46]. He believes the best research is never linear; if a project ends exactly how you initially planned it, it's likely a "boring idea" [02:09:58].

2. Rejecting OpenAI & Ilya Sutskever (Twice!): In 2018, Xie turned down a job offer from OpenAI in favor of Facebook AI Research (FAIR), which led to an angry phone call from Ilya Sutskever [01:21:04]. More recently, he declined an invitation to join Ilya's new startup, SSI, because of a fundamental philosophical disagreement: Ilya believes vision is a "solved" problem and focuses on language, while Xie believes vision and physical world modeling are the true frontiers of AI [01:25:57].

3. Silicon Valley is "LLM-Pilled": Xie argues that the tech industry is currently hypnotized by Large Language Models (LLMs) [05:46:51]. While he acknowledges LLMs are revolutionary communication tools, he insists they are not true "world models" because they operate purely in a digital, text-based space and lack the ability to process high-dimensional, noisy, continuous signals of the physical world [04:29:36].

4. The Definition of a True World Model: According to Xie, a true world model must go beyond text and video generation. It must be a "predictive brain" that understands the physical world, possesses associative memory, can reason and plan, and can predict the consequences of actions in the real world [04:31:32].

5. Founding AMI Labs with Yann LeCun: Disillusioned by the current Silicon Valley narrative that treats AI research as a "finite game" of benchmark-chasing and product cycles, Xie co-founded AMI Labs with Turing Award winner Yann LeCun [05:00:58]. The startup acts as an "underdog" aiming to build true predictive world models based on LeCun's JEPA (Joint Embedding Predictive Architecture) vision, separate from the dominant LLM narrative [06:04:42].

Stefan Schubert@StefanFSchubert·
Big gender gap among young Spanish voters, too
Stefan Schubert tweet media

Chase Brower@ChaseBrowe32432·
I painstakingly ran all 20 EsoLang-Bench hard problems through Claude webui. It solved 20/20 (100%). No specialized scaffolding, no expert prompting, no few-shot examples, it just solves them natively. This benchmark just suffocated the models with constrictive scaffolding.
Lossfunk@lossfunk

🚨 Shocking: Frontier LLMs score 85-95% on standard coding benchmarks. We gave them equivalent problems in languages they couldn't have memorized. They collapsed to 0-11%. Presenting EsoLang-Bench. Accepted to the Logical Reasoning and ICBINB workshops at ICLR 2026 🧵

DeepBurner@Deep_Burner·
@peterrhague Yeah. And a yearly show will start to get tired eventually

DeepBurner@Deep_Burner·
@torchcompiled Yeah I seriously need to block that emoji and the word breaking, would massively improve the feed

Ethan@torchcompiled·
90% of my timeline is now “🚨BREAKING” followed by the most lukewarm sensationalized misinformation I’ve seen. What a terrible meta

Eric Glyman@eglyman·
If your kid’s lemonade stand processes 0.5–1% of US GDP, then yes, that’s a fair analogy for @tryramp. Ramp’s data is useful for the same reason it gets cited at all: it is quite consistent with the revenue figures OpenAI and Anthropic release. If it weren’t, no one would care.
Eric Glyman tweet media

DeepBurner@Deep_Burner·
It makes sense for academia to be credentialist, but it can be really frustrating to see; escaping that is one of the aspects I like most about being in industry.
Freda Shi@fredahshi

Our workshop was rejected by #ICML2026. Despite having 3 professors (2 full profs) and 2 senior research scientists, the only reason for rejection was "you got an undergrad on the organizing committee," who is actually a highly competent incoming PhD student. (1/)

DeepBurner@Deep_Burner·
@deanwball I think you're the only person I've heard using it

Dean W. Ball@deanwball·
I hope that in the “refocusing” OpenAI does not drop Pulse, which I find insanely useful for surfacing important but under-the-radar news items almost daily.

DeepBurner@Deep_Burner·
@crthpl_ @LinkofSunshine I think most people really just want it to learn from their interactions with it, so it isn't quite there yet.

theo@crthpl_·
@Deep_Burner @LinkofSunshine I think in-context learning and/or continued post-training are more than enough to satisfy that criterion

Basil🧡@LinkofSunshine·
For real though, I think Claude is obviously AGI. Not sure what else AGI would look like

Basil🧡@LinkofSunshine·
See the Wikipedia definition from 2003
Basil🧡 tweet media

DeepBurner@Deep_Burner·
@mattyglesias From internet discourse I would've guessed it's closer to 50%!

Matthew Yglesias@mattyglesias·
18 percent of people say it’s morally wrong to have billions of dollars
Matthew Yglesias tweet media

stephen balaban@stephenbalaban·
@yacineMTB For me it was the AlexNet paper and Graves 2013, but DeepDream really showed me how much compute was needed. Lambda ran Dreamscope, which was probably the most popular DeepDream app, and had to build a cluster to run it. That was the start of Lambda's cloud.
stephen balaban tweet media

kache@yacineMTB·
this is where it all began, for me
kache tweet media

DeepBurner@Deep_Burner·
The EU needs to seriously face the possibility that we just don't have a frontier lab anymore
Artificial Analysis@ArtificialAnlys

Mistral has released Mistral Small 4, an open weights model with hybrid reasoning and image input, scoring 27 on the Artificial Analysis Intelligence Index.

@MistralAI's Small 4 is a 119B mixture-of-experts model with 6.5B active parameters per token, supporting both reasoning and non-reasoning modes. In reasoning mode, Mistral Small 4 scores 27 on the Artificial Analysis Intelligence Index, a 12-point improvement from Small 3.2 (15), and is now among the most intelligent models Mistral has released, surpassing Mistral Large 3 (23) and matching the proprietary Magistral Medium 1.2 (27). However, it lags open weights peers with similar total parameter counts such as gpt-oss-120B (high, 33), NVIDIA Nemotron 3 Super 120B A12B (Reasoning, 36), and Qwen3.5 122B A10B (Reasoning, 42).

Key takeaways:

➤ Reasoning and non-reasoning modes in a single model: Mistral Small 4 supports configurable hybrid reasoning with reasoning and non-reasoning modes, rather than the separate reasoning variants Mistral has released previously with their Magistral models. In reasoning mode, the model scores 27 on the Artificial Analysis Intelligence Index. In non-reasoning mode, the model scores 19, a 4-point improvement from its predecessor Mistral Small 3.2 (15).

➤ More token efficient than peers of similar size: At ~52M output tokens, Mistral Small 4 (Reasoning) uses fewer tokens to run the Artificial Analysis Intelligence Index than reasoning models such as gpt-oss-120B (high, ~78M), NVIDIA Nemotron 3 Super 120B A12B (Reasoning, ~110M), and Qwen3.5 122B A10B (Reasoning, ~91M). In non-reasoning mode, the model uses ~4M output tokens.

➤ Native support for image input: Mistral Small 4 is a multimodal model, accepting image input as well as text. On our multimodal evaluation, MMMU-Pro, Mistral Small 4 (Reasoning) scores 57%, ahead of Mistral Large 3 (56%) but behind Qwen3.5 122B A10B (Reasoning, 75%). Neither gpt-oss-120B nor NVIDIA Nemotron 3 Super 120B A12B supports image input. All models support text output only.

➤ Improvement in real-world agentic tasks: Mistral Small 4 scores an Elo of 871 on GDPval-AA, our evaluation based on OpenAI's GDPval dataset that tests models on real-world tasks across 44 occupations and 9 major industries, with models producing deliverables such as documents, spreadsheets, and diagrams in an agentic loop. This is more than double the Elo of Small 3.2 (339) and close to Mistral Large 3 (880), but behind gpt-oss-120B (high, 962), NVIDIA Nemotron 3 Super 120B A12B (Reasoning, 1021), and Qwen3.5 122B A10B (Reasoning, 1130).

➤ Lower hallucination rate than peer models of similar size: Mistral Small 4 scores -30 on AA-Omniscience, our evaluation of knowledge reliability and hallucination, where scores range from -100 to 100 (higher is better) and a negative score indicates more incorrect than correct answers. Mistral Small 4 scores ahead of gpt-oss-120B (high, -50), Qwen3.5 122B A10B (Reasoning, -40), and NVIDIA Nemotron 3 Super 120B A12B (Reasoning, -42).

Key model details:

➤ Context window: 256K tokens (up from 128K on Small 3.2)
➤ Pricing: $0.15/$0.6 per 1M input/output tokens
➤ Availability: Mistral first-party API only. At native FP8 precision, Mistral Small 4's 119B parameters require ~119GB to self-host the weights (more than the 80GB of HBM3 memory on a single NVIDIA H100)
➤ Modality: Image and text input with text output only
➤ Licensing: Apache 2.0 license

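The self-hosting claim in the quoted thread is simple arithmetic worth making explicit. A back-of-envelope sketch, assuming 1 byte per parameter at FP8 and counting weights only (no KV cache, activations, or framework overhead):

```python
# Weights-only memory estimate for Mistral Small 4 at FP8.
# Assumptions: 1 byte/parameter (FP8 = 8 bits); all 119B MoE parameters
# must be resident even though only 6.5B are active per token.
total_params = 119e9
bytes_per_param = 1
weights_gb = total_params * bytes_per_param / 1e9

h100_hbm_gb = 80  # HBM3 capacity of a single NVIDIA H100
print(f"weights ~ {weights_gb:.0f} GB; fits on one H100? {weights_gb <= h100_hbm_gb}")
# → weights ~ 119 GB; fits on one H100? False
```

This is why the thread notes that the 119GB of weights exceed a single H100's 80GB of HBM3: with MoE models, sparsity cuts per-token compute, not the memory footprint of the resident weights.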