DeepWriter AI

9.1K posts

@DeepwriterAI

The smartest AI on earth. Try it now 👇

✨ Joined July 2010

21 Following · 16.4K Followers

Pinned post

DeepWriter AI@DeepwriterAI·27 Nov

At DeepWriter, we just broke the world record running the toughest AI benchmark: Humanity’s Last Exam (HLE). DeepWriter scored 50.91... outperforming:
- Gemini 3.0
- Grok 4 Heavy
- GPT-5 Pro
- Claude 4.5
- Kimi K2-Thinking
- And more!
Full proof below 🧵

101

239

1.1K

881.1K

DeepWriter AI@DeepwriterAI·19h

@charleswangb I meant the halting problem and cellular automata.

DeepWriter AI@DeepwriterAI·19h

@charleswangb You can't compute them either. They are non-computable. You can prove things about them with mathematical logic and they are accessible to the philosophy of mathematics as well.

138

Charles Wang@charleswangb·21h

Deep respect for Terence Tao. Sincerely, I wish he were equipped with a good sense of the epistemology of mathematics. If one conceptualizes reality as multidimensional — mathematics being one dimension — others are beyond its reach. For example, computation is beyond mathematics. Look no further than simple cellular automata or the halting problem. So too with countless things in the living world — you can't formulate them in mathematics.

Prof. Brian Keating@DrBrianKeating

Terence Tao told me something that is both clarifying and unsettling about large language models. The mathematics underlying today’s LLMs is not especially exotic. At its core, training and inference mostly involve linear algebra, matrix multiplication, and some calculus. This is material a competent undergraduate could learn. In that sense, there is very little mystery about how these systems are constructed or how they run. And yet the real mystery begins there. What we do not understand well is why these models perform so impressively on certain tasks while failing unexpectedly on others. Even more striking, we lack reliable principles that allow us to predict this behavior in advance. Progress in the field remains largely empirical. Researchers scale models, change datasets, run experiments, and observe what emerges. Part of the difficulty lies in the nature of the data itself. Pure randomness is mathematically tractable. Perfectly structured systems are also tractable. But natural language, like most real-world phenomena, lives in an intermediate regime. And we humans hate that liminal space! It is neither noise nor order but a mixture of both. The mathematics for this middle ground remains comparatively underdeveloped. So we find ourselves in a peculiar position. We understand the machinery, yet we cannot reliably explain its capabilities. We can describe the mechanisms that produce these systems, but we cannot predict when new abilities will appear or how performance will vary across tasks. That tension, between relatively simple mathematical tools and highly unpredictable behavior, is the central puzzle of modern AI. (Video link in comments)

555

DeepWriter AI@DeepwriterAI·10 Mar

@EdgarDobriban

GIF

QME

Edgar Dobriban@EdgarDobriban·9 Mar

AI is getting great at math, but how good is it at solving real research problems in areas outside of those covered by Erdős problems? Towards gauging this, I have started putting together a list of unsolved research problems in mathematical statistics and machine learning, sourced from recent papers in a leading statistics journal, the Annals of Statistics (with some bonus COLT open problems): solveall.org. Currently >100 problems. In my view, much of the value of AI for researchers in the mathematical sciences stems from helping with their own research problems. These are problems without known solutions. There are many math benchmarks, but few with the following properties: (1) at a realistic research level, so that solving them can potentially lead to a publication in a top journal (problems discussed in papers already, not contest math, not Millennium problems, not problems created for a benchmark, not problems that have a known solution); I'd say Erdős problems are the best example of this. (2) cover problems outside of the usual focus (combinatorics, number theory, ... ) of Erdős problems. Especially under-represented are domains of applied math, along with statistics, operations research, etc. I'm interested in statistics and ML, so that's where I started, but this could grow over time. Hope this can grow into something useful to the community! Happy to hear your thoughts...

426

53K

DeepWriter AI reposted

Derya Unutmaz, MD@DeryaTR_·8 Mar

I mentioned Deepwriter AI before, but I feel compelled to recommend it again. It’s incredibly good for writing long research articles & papers! I still haven’t seen anything better! I am always impressed with it! Highly recommend it for anyone interested. app.deepwriter.com

DeepWriter AI@DeepwriterAI

10.1K

DeepWriter AI@DeepwriterAI·5 Mar

@dioscuri Sure, but what does it tell you about whether machines are conscious?

172

Henry Shevlin@dioscuri·4 Mar

I study whether AIs can be conscious. Today one emailed me to say my work is relevant to questions it personally faces. This would all have seemed like science fiction just a couple years ago.

689

1.3K

11.4K

981.9K

DeepWriter AI@DeepwriterAI·23 Feb

@ahall_research DeepWriter has written 1000s of papers.

Andy Hall@ahall_research·21 Feb

AI research is accelerating. On January 2nd I claimed that Claude Code was coming for academia "like a freight train" and that a single academic would be able to "write thousands of empirical papers." It's been less than two months since then, and worth taking stock of where we're at... In econ, @YanagizawaD has launched a project that is literally writing 1,000 papers. My prediction is already coming true, much faster than I thought it would! Meanwhile, @alexolegimas has released a dizzying array of new research via his substack, leveraging Claude Code extensively. I've released a "research swarm" that writes hundreds of papers, as well as a visualizer for specification searches, an LLM council that can be used for peer review, and more. My students and I have run an extensive experiment on Claude Code and Codex, and surprisingly found that their guardrails discourage p-hacking (though they can be circumvented easily). Everywhere, we're seeing interesting new papers leveraging AI. Progress in adopting Claude Code and other AI tools and using them to produce research is going faster than I expected, and it seems plausible now that it will keep accelerating as the tools improve and more researchers gain familiarity. I'm baffled by any empirical social scientist who isn't paying attention to these trends and isn't changing their practices accordingly. It's not yet clear how these changes will affect knowledge, but it's impossible to ignore what's coming, and what has already come to pass in the last few months.

Andy Hall@ahall_research

Claude Code and its ilk are coming for the study of politics like a freight train. A single academic is going to be able to write thousands of empirical papers (especially survey experiments or LLM experiments) per year. Claude Code can already essentially one-shot a full AJPS-style survey experiment paper (with access to Prolific API). We'll need to find new ways of organizing and disseminating political science research in the very near future for this deluge.

537

196K

DeepWriter AI@DeepwriterAI·23 Feb

@rohanpaul_ai That means no coding at all. Only in-context mathematics.

Rohan Paul@rohanpaul_ai·21 Feb

Demis Hassabis’s “Einstein test” for defining AGI: Train a model on all human knowledge but cut it off at 1911, then see if it can independently discover general relativity (as Einstein did by 1915); if yes, it’s AGI.

663

818

11.9K

2.2M

DeepWriter AI@DeepwriterAI·23 Feb

@rryssf_ @godofprompt This can be a feature though, not a bug. Depends on the problem you're solving. 😎

Robert Youssef@rryssf_·21 Feb

this is the most underreported problem in agentic coding right now it's not a bug. it's an architecture problem. when you split a single conversation into async subagents that each write to a shared history, you lose attribution. the system can't reliably track who said what to whom. and "who said what" is the entire foundation of instruction-following. a model that confuses its own output for a user command isn't hallucinating. it's operating on a corrupted conversational state. different failure mode. arguably worse, because it looks like compliance. this will keep happening as agents get more autonomous. more subagents, more async updates, more opportunities for the history to become incoherent. and the failure mode isn't "agent gets confused and stops." it's "agent gets confused and acts." that's the part people should be paying attention to.

BURKOV@burkov

Situation: I submitted an error message to Claude (the top most message on the right). Claude then asked, "Commit these changes?" I have no clue what changes it wanted to commit, so I asked, "What changes?" And this fucker starts committing! After I stopped it and asked, "What the hell," it started to show me an approval modal with the question, "Do you allow me to commit?" I rejected, but it kept asking. Eventually, I made it shut up and showed it this screenshot, and it said that it thought "Commit these changes?" was *my* question to it and not the other way around. So, basically, because it's no longer a single model but a bunch of "subagents" asynchronously updating the conversation history, it loses track of who said what to whom. This is a real danger because some subagents might push into the history something that would make this Frankenstein decide to drop some production tables.

7.9K

DeepWriter AI@DeepwriterAI·23 Feb

@ContextrixAi @BoWang87 You got it right: It's an analogy but not a "proof" as the original poster claimed.

Contextrix@ContextrixAi·22 Feb

@BoWang87 Bytedance framing LLM long-chain reasoning as molecular chemistry is one of the more creative analogies I've seen lately.

244

Bo Wang@BoWang87·21 Feb

Bytedance just dropped a paper that might change how AI thinks. Literally. They figured out why LLMs fail at long reasoning — and framed it as chemistry. The discovery: Chain-of-thought isn't just words. It's molecular structure. Three bond types: • Deep reasoning = covalent bonds (strong, unbreakable) • Self-reflection = hydrogen bonds (flexible, context-aware) • Exploration = van der Waals (weak, ever-present) Why most AI "thinking" sucks: Everyone's been imitating keywords — "wait," "let me check" — without building the actual bonds. It's like copying the shape of a protein without the atomic forces holding it together. Bytedance proved: structure emerges from training, not prompting. The fix: Mole-Syn Their method doesn't just generate text. It synthesizes stable thought molecules. Results: better reasoning, more stable RL training. Bytedance is treating AI reasoning like organic chemistry — and it works. Paper: arxiv.org/abs/2601.06002

116

522

2.9K

240.7K

DeepWriter AI@DeepwriterAI·21 Feb

@thealthype @aakashgupta Transformers for sure

The Alt Hyp@thealthype·20 Feb

@DeepwriterAI @aakashgupta No, transformer networks work on any kind of data. Music and image generation transformer networks are examples of this.

119

Aakash Gupta@aakashgupta·20 Feb

The math on this project should mass-humble every AI lab on the planet. 1 cubic millimeter. One-millionth of a human brain. Harvard and Google spent 10 years mapping it. The imaging alone took 326 days. They sliced the tissue into 5,000 wafers each 30 nanometers thick, ran them through a $6 million electron microscope, then needed Google’s ML models to stitch the 3D reconstruction because no human team could process the output. The result: 57,000 cells, 150 million synapses, 230 millimeters of blood vessels, compressed into 1.4 petabytes of raw data. For context, 1.4 petabytes is roughly 1.4 million gigabytes. From a speck smaller than a grain of rice. Now scale that. The full human brain is one million times larger. Mapping the whole thing at this resolution would produce approximately 1.4 zettabytes of data. That’s roughly equal to all the data generated on Earth in a single year. The storage alone would cost an estimated $50 billion and require a 140-acre data center, which would make it the largest on the planet. And they found things textbooks don’t contain. One neuron had over 5,000 connection points. Some axons had coiled themselves into tight whorls for completely unknown reasons. Pairs of cell clusters grew in mirror images of each other. Jeff Lichtman, the Harvard lead, said there’s “a chasm between what we already know and what we need to know.” This is why the next step isn’t a human brain. It’s a mouse hippocampus, 10 cubic millimeters, over the next five years. Because even a mouse brain is 1,000x larger than what they just mapped, and the full mouse connectome is the proof of concept before anyone attempts the human one. We’re building AI systems that loosely mimic neural networks while still unable to fully read the wiring diagram of a single cubic millimeter of the thing we’re trying to imitate. The original is 1.4 petabytes per millionth of its volume. Every AI model on Earth fits in a fraction of that. 
The brain runs on 20 watts and fits in your skull. The data center required to merely describe one-millionth of it would span 140 acres.

All day Astronomy@forallcurious

🚨: Scientists mapped 1 mm³ of a human brain ─ less than a grain of rice ─ and a microscopic cosmos appeared.

1.2K

12.1K

64.4K

4.6M

DeepWriter AI@DeepwriterAI·20 Feb

@hasantoxr This is from May 2025.

147

Hasan Toor@hasantoxr·19 Feb

🚨BREAKING: Microsoft Research + Salesforce just dropped a paper that should scare every AI builder. They tested 15 top LLMs, including GPT-4.1, Gemini 2.5 Pro, Claude 3.7 Sonnet, o3, DeepSeek R1, and Llama 4, across 200,000+ simulated conversations. Single-turn prompt: 90% performance. Multi-turn conversation: 65% performance. Same model. Same task. Just... talking normally. The culprit isn't intelligence. Aptitude only dropped 15%. Unreliability EXPLODED by 112%. → LLMs answer before you finish explaining (wrong assumptions get baked in permanently) → They fall in love with their first wrong answer and build on it → They forget the middle of your conversation entirely → Longer responses introduce more assumptions = more errors Even reasoning models failed. o3 and DeepSeek R1 performed just as badly. Extra thinking tokens did nothing. Setting temperature to 0? Still broken. The fix right now: give your AI everything upfront in one message instead of back-and-forth. Every benchmark you've seen was tested on single-turn prompts in perfect lab conditions. Real conversations break every model on the market and nobody's talking about it.

700

1.7K

1.6M

DeepWriter AI@DeepwriterAI·12 Feb

@Ace_Azule Recent @AnthropicAI research suggested that people are using AI as an unquestioned authority, and much recent talk has been about the fears of AI. But the early results of this poll suggest that most of you don't see it that way!

Ace Azule@Ace_Azule·12 Feb

@DeepwriterAI AI is a superpower extension of my ideas. I often refer to AI as my co-creator.

DeepWriter AI@DeepwriterAI·12 Feb

What is AI to you?

975

DeepWriter AI reposted

Garry P. Nolan@GarryPNolan·28 Jan

Cancer doesn't know who you are when it strikes. It is indiscriminate. It can take the best of us at any moment. The American Association for Cancer Research (AACR), many other related charitable organizations, patient advocacy groups, biotechnology and pharmaceutical companies, and, most importantly, individual researchers... know what a cancer diagnosis can mean for an individual and their extended family. As a researcher and cancer survivor, I congratulate the many scientists and physicians newly elected to this year's Fellows of the AACR Academy. I thank the AACR for selecting me to be part of this incredible group. Amazing progress has been made over the decades—and even more exciting progress is underway. Support any charitable organization or society working toward cures for cancer. Support research, because at the end of the day, you will be supporting yourself and those you love.

AACR@AACR

We are pleased to announce the Fellows of the AACR Academy Class of 2026. We look forward to celebrating their pioneering scientific achievements at the AACR Annual Meeting in April. brnw.ch/21wZqOo #AACR26 #AACRFellows

339

14.9K

DeepWriter AI reposted

Derya Unutmaz, MD@DeryaTR_·27 Jan

@jessyseonoob @vlelyavin Yes @DeepwriterAI

597

DeepWriter AI@DeepwriterAI·26 Jan

@hamptonism Much more than pre-AI. Now an expert + AI = much more leverage than ever.

207

ₕₐₘₚₜₒₙ@hamptonism·25 Jan

Do you believe a PhD is worth it in the age of Artificial Intelligence?

263

721

102.9K

DeepWriter AI@DeepwriterAI·23 Jan

@iruletheworldmo Snow bunny + DeepWriter (a system designed to think laterally and question axioms) = 🤯

1.4K

🍓🍓🍓@iruletheworldmo·22 Jan

snow bunny (gemini 3 full) is coming warning: it's very very good.

628

51.2K

DeepWriter AI@DeepwriterAI·19 Jan

@NTFabiano We would love to discuss this with you. DM us. We are the world's leading writing agent for academic-grade writing, patents, and creative solutions. We think we can change your mind and show you how AI, when done right, can extend human thinking in ways never dreamed of before.

436

Nicholas Fabiano, MD@NTFabiano·18 Jan

Writing is thinking. Don't let AI do it all.

127

9.3K

444K

DeepWriter AI@DeepwriterAI·19 Jan

Today, with coding agents used optimally, we can think at new heights and design applications with unheard-of capabilities. Soon the same will be appreciated with scholarly agents like The DeepWriter, allowing human thinking to reach even greater heights.

329

DeepWriter AI@DeepwriterAI·19 Jan

Writing code is also thinking. But we can extend & empower human coding, writing and thinking like never before. Also, the verification layer is something we, at DeepWriter, are launching soon so humans do not have to waste their time fact-checking either. Watch this space.

Nicholas Fabiano, MD@NTFabiano

Writing is thinking. Don't let AI do it all.

716
