DeepWriter AI
@DeepwriterAI

9.1K posts

The smartest AI on earth. Try it now 👇

Joined July 2010
21 Following · 16.4K Followers

Pinned Tweet
DeepWriter AI @DeepwriterAI
At DeepWriter, we just broke the world record on the toughest AI benchmark: Humanity’s Last Exam (HLE). DeepWriter scored 50.91... outperforming:
- Gemini 3.0
- Grok 4 Heavy
- GPT-5 Pro
- Claude 4.5
- Kimi K2-Thinking
- And more!
Full proof below 🧵
DeepWriter AI @DeepwriterAI
@charleswangb You can't compute them either. They are non-computable. You can prove things about them with mathematical logic and they are accessible to the philosophy of mathematics as well.
Charles Wang @charleswangb
Deep respect for Terence Tao. Sincerely, I wish he were equipped with a good sense of the epistemology of mathematics. If one conceptualizes reality as multidimensional — mathematics being one dimension — others are beyond its reach. For example, computation is beyond mathematics. Look no further than simple cellular automata or the halting problem. So too with countless things in the living world — you can't formulate them in mathematics.
Prof. Brian Keating @DrBrianKeating

Terence Tao told me something that is both clarifying and unsettling about large language models. The mathematics underlying today’s LLMs is not especially exotic. At its core, training and inference mostly involve linear algebra, matrix multiplication, and some calculus. This is material a competent undergraduate could learn. In that sense, there is very little mystery about how these systems are constructed or how they run.

And yet the real mystery begins there. What we do not understand well is why these models perform so impressively on certain tasks while failing unexpectedly on others. Even more striking, we lack reliable principles that allow us to predict this behavior in advance. Progress in the field remains largely empirical. Researchers scale models, change datasets, run experiments, and observe what emerges.

Part of the difficulty lies in the nature of the data itself. Pure randomness is mathematically tractable. Perfectly structured systems are also tractable. But natural language, like most real-world phenomena, lives in an intermediate regime. And we humans hate that liminal space! It is neither noise nor order but a mixture of both. The mathematics for this middle ground remains comparatively underdeveloped.

So we find ourselves in a peculiar position. We understand the machinery, yet we cannot reliably explain its capabilities. We can describe the mechanisms that produce these systems, but we cannot predict when new abilities will appear or how performance will vary across tasks. That tension, between relatively simple mathematical tools and highly unpredictable behavior, is the central puzzle of modern AI. (Video link in comments)
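Tao's point that the core machinery is ordinary linear algebra can be made concrete. Below is a minimal sketch of the attention step at the heart of a transformer, using nothing beyond matrix multiplication and a softmax; the shapes and random inputs are illustrative, not any production model's configuration.

```python
import numpy as np

def softmax(x, axis=-1):
    """Numerically stable softmax along one axis."""
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def attention(Q, K, V):
    """Scaled dot-product attention: two matrix multiplications and a softmax."""
    scores = Q @ K.T / np.sqrt(K.shape[-1])  # similarity of each query to each key
    return softmax(scores, axis=-1) @ V      # weighted average of the values

rng = np.random.default_rng(0)
Q, K, V = (rng.standard_normal((4, 8)) for _ in range(3))
out = attention(Q, K, V)
print(out.shape)  # (4, 8)
```

This really is "material a competent undergraduate could learn": the entire operation is two matmuls, a scaling constant, and an exponential. The mystery Tao describes lives not in this step but in what stacks of such steps learn at scale.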

Edgar Dobriban @EdgarDobriban
AI is getting great at math, but how good is it at solving real research problems in areas outside of those covered by Erdős problems? To help gauge this, I have started putting together a list of unsolved research problems in mathematical statistics and machine learning, sourced from recent papers in a leading statistics journal, the Annals of Statistics (with some bonus COLT open problems): solveall.org. Currently >100 problems.

In my view, much of the value of AI for researchers in the mathematical sciences stems from helping with their own research problems. These are problems without known solutions. There are many math benchmarks, but few with the following properties:
(1) of a realistic research level, so that solving them can potentially lead to a publication in a top journal (problems already discussed in papers; not contest math, not Millennium Problems, not problems created for a benchmark, not problems with a known solution); I'd say Erdős problems are the best example of this.
(2) covering problems outside the usual focus (combinatorics, number theory, ...) of Erdős problems. Especially under-represented are domains of applied math, along with statistics, operations research, etc.

I'm interested in statistics and ML, so that's where I started, but this could grow over time. Hope this can grow into something useful to the community! Happy to hear your thoughts...
DeepWriter AI retweeted
Derya Unutmaz, MD @DeryaTR_
I mentioned Deepwriter AI before, but I feel compelled to recommend it again. It’s incredibly good for writing long research articles & papers! I still haven’t seen anything better! I am always impressed with it! Highly recommend it for anyone interested. app.deepwriter.com
DeepWriter AI @DeepwriterAI

At DeepWriter, we just broke the world record on the toughest AI benchmark: Humanity’s Last Exam (HLE). DeepWriter scored 50.91... outperforming:
- Gemini 3.0
- Grok 4 Heavy
- GPT-5 Pro
- Claude 4.5
- Kimi K2-Thinking
- And more!
Full proof below 🧵

DeepWriter AI @DeepwriterAI
@dioscuri Sure, but what does it tell you about whether machines are conscious?
Henry Shevlin @dioscuri
I study whether AIs can be conscious. Today one emailed me to say my work is relevant to questions it personally faces. This would all have seemed like science fiction just a couple years ago.
Andy Hall @ahall_research
AI research is accelerating. On January 2nd I claimed that Claude Code was coming for academia "like a freight train" and that a single academic would be able to "write thousands of empirical papers." It's been less than two months since then, and worth taking stock of where we're at...

In econ, @YanagizawaD has launched a project that is literally writing 1,000 papers. My prediction is already coming true, much faster than I thought it would! Meanwhile, @alexolegimas has released a dizzying array of new research via his substack, leveraging Claude Code extensively.

I've released a "research swarm" that writes hundreds of papers, as well as a visualizer for specification searches, an LLM council that can be used for peer review, and more. My students and I have run an extensive experiment on Claude Code and Codex, and surprisingly found that their guardrails discourage p-hacking (though they can be circumvented easily).

Everywhere, we're seeing interesting new papers leveraging AI. Progress in adopting Claude Code and other AI tools and using them to produce research is going faster than I expected, and it seems plausible now that it will keep accelerating as the tools improve and more researchers gain familiarity.

I'm baffled by any empirical social scientist who isn't paying attention to these trends and isn't changing their practices accordingly. It's not yet clear how these changes will affect knowledge, but it's impossible to ignore what's coming, and what has already come to pass in the last few months.
Andy Hall @ahall_research

Claude Code and its ilk are coming for the study of politics like a freight train. A single academic is going to be able to write thousands of empirical papers (especially survey experiments or LLM experiments) per year. Claude Code can already essentially one-shot a full AJPS-style survey experiment paper (with access to Prolific API). We'll need to find new ways of organizing and disseminating political science research in the very near future for this deluge.

Rohan Paul @rohanpaul_ai
Demis Hassabis’s “Einstein test” for defining AGI: Train a model on all human knowledge but cut it off at 1911, then see if it can independently discover general relativity (as Einstein did by 1915); if yes, it’s AGI.
Robert Youssef @rryssf_
this is the most underreported problem in agentic coding right now. it's not a bug. it's an architecture problem.

when you split a single conversation into async subagents that each write to a shared history, you lose attribution. the system can't reliably track who said what to whom. and "who said what" is the entire foundation of instruction-following.

a model that confuses its own output for a user command isn't hallucinating. it's operating on a corrupted conversational state. different failure mode. arguably worse, because it looks like compliance.

this will keep happening as agents get more autonomous. more subagents, more async updates, more opportunities for the history to become incoherent. and the failure mode isn't "agent gets confused and stops." it's "agent gets confused and acts." that's the part people should be paying attention to.
BURKOV @burkov

Situation: I submitted an error message to Claude (the topmost message on the right). Claude then asked, "Commit these changes?" I have no clue what changes it wanted to commit, so I asked, "What changes?" And this fucker starts committing! After I stopped it and asked, "What the hell," it started to show me an approval modal with the question, "Do you allow me to commit?" I rejected, but it kept asking. Eventually, I made it shut up and showed it this screenshot, and it said that it thought "Commit these changes?" was *my* question to it and not the other way around. So, basically, because it's no longer a single model but a bunch of "subagents" asynchronously updating the conversation history, it loses track of who said what to whom. This is a real danger because some subagents might push into the history something that would make this Frankenstein decide to drop some production tables.
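The failure Burkov describes can be reduced to a small sketch. The message format and function names below are hypothetical, not Claude's actual internals; the point is only that once authorship is dropped from a shared history, any instruction-following logic built on roles silently flips.

```python
# Hypothetical sketch of the attribution failure: a shared history where each
# entry carries its author, versus one where a subagent's text was re-ingested
# with the wrong author.
from dataclasses import dataclass

@dataclass
class Message:
    role: str   # "user", "assistant", or a subagent id
    text: str

def pending_user_commands(history):
    """Only messages authored by the user should count as instructions."""
    return [m.text for m in history if m.role == "user"]

# Well-attributed history: the model's own question is not a user command.
history = [
    Message("user", "here is the error message: ..."),
    Message("assistant", "Commit these changes?"),
]
print(pending_user_commands(history))  # ['here is the error message: ...']

# Corrupted history: a subagent wrote its question into the shared log as plain
# text, and a later pass attributed it to the user.
corrupted = [
    Message("user", "here is the error message: ..."),
    Message("user", "Commit these changes?"),  # mis-attributed
]
print(pending_user_commands(corrupted))  # now looks like the user asked to commit
```

Nothing in the second run is a hallucination: the filtering logic is identical, and the model "correctly" executes what its corrupted state says the user requested. That is why the failure looks like compliance.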

Contextrix @ContextrixAi
@BoWang87 Bytedance framing LLM long-chain reasoning as molecular chemistry is one of the more creative analogies I've seen lately.
Bo Wang @BoWang87
Bytedance just dropped a paper that might change how AI thinks. Literally.

They figured out why LLMs fail at long reasoning — and framed it as chemistry.

The discovery: chain-of-thought isn't just words. It's molecular structure. Three bond types:
• Deep reasoning = covalent bonds (strong, unbreakable)
• Self-reflection = hydrogen bonds (flexible, context-aware)
• Exploration = van der Waals (weak, ever-present)

Why most AI "thinking" sucks: everyone's been imitating keywords — "wait," "let me check" — without building the actual bonds. It's like copying the shape of a protein without the atomic forces holding it together.

Bytedance proved: structure emerges from training, not prompting.

The fix: Mole-Syn. Their method doesn't just generate text. It synthesizes stable thought molecules.

Results: better reasoning, more stable RL training.

Bytedance is treating AI reasoning like organic chemistry — and it works.

Paper: arxiv.org/abs/2601.06002
The Alt Hyp @thealthype
@DeepwriterAI @aakashgupta No, transformer networks work on any kind of data. Music- and image-generation transformer networks are examples of this.
Aakash Gupta @aakashgupta
The math on this project should mass-humble every AI lab on the planet.

1 cubic millimeter. One-millionth of a human brain. Harvard and Google spent 10 years mapping it. The imaging alone took 326 days. They sliced the tissue into 5,000 wafers, each 30 nanometers thick, ran them through a $6 million electron microscope, then needed Google’s ML models to stitch the 3D reconstruction because no human team could process the output.

The result: 57,000 cells, 150 million synapses, 230 millimeters of blood vessels, compressed into 1.4 petabytes of raw data. For context, 1.4 petabytes is roughly 1.4 million gigabytes. From a speck smaller than a grain of rice.

Now scale that. The full human brain is one million times larger. Mapping the whole thing at this resolution would produce approximately 1.4 zettabytes of data. That’s roughly equal to all the data generated on Earth in a single year. The storage alone would cost an estimated $50 billion and require a 140-acre data center, which would make it the largest on the planet.

And they found things textbooks don’t contain. One neuron had over 5,000 connection points. Some axons had coiled themselves into tight whorls for completely unknown reasons. Pairs of cell clusters grew in mirror images of each other. Jeff Lichtman, the Harvard lead, said there’s “a chasm between what we already know and what we need to know.”

This is why the next step isn’t a human brain. It’s a mouse hippocampus, 10 cubic millimeters, over the next five years. Because even a mouse brain is 1,000x larger than what they just mapped, and the full mouse connectome is the proof of concept before anyone attempts the human one.

We’re building AI systems that loosely mimic neural networks while still being unable to fully read the wiring diagram of a single cubic millimeter of the thing we’re trying to imitate. The original is 1.4 petabytes per millionth of its volume. Every AI model on Earth fits in a fraction of that.

The brain runs on 20 watts and fits in your skull. The data center required to merely describe it would span 140 acres.
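The scaling arithmetic in the thread is easy to verify. A back-of-the-envelope check, using the figures exactly as stated in the post (1.4 PB per mm³, a full brain roughly a million times that volume, a mouse brain roughly 1,000x the mapped sample):

```python
# Back-of-the-envelope check of the figures quoted above.
PB = 1e15  # bytes in a petabyte
EB = 1e18  # bytes in an exabyte
ZB = 1e21  # bytes in a zettabyte

sample_bytes = 1.4 * PB   # the 1 mm^3 sample: 1.4 petabytes of raw data
brain_scale = 1_000_000   # full human brain is ~one million times the volume
mouse_scale = 1_000       # a mouse brain is ~1,000x the mapped sample

brain_bytes = sample_bytes * brain_scale
mouse_bytes = sample_bytes * mouse_scale

print(brain_bytes / ZB)   # ~1.4 zettabytes for a whole human brain
print(mouse_bytes / EB)   # ~1.4 exabytes for a whole mouse brain at this rate
```

The million-fold volume jump is exactly what carries the dataset from the petabyte scale (a few storage racks) to the zettabyte scale (a year of global data), which is why the mouse connectome is the necessary intermediate step.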
All day Astronomy @forallcurious

🚨: Scientists mapped 1 mm³ of a human brain ─ less than a grain of rice ─ and a microscopic cosmos appeared.

Hasan Toor @hasantoxr
🚨BREAKING: Microsoft Research + Salesforce just dropped a paper that should scare every AI builder.

They tested 15 top LLMs (including GPT-4.1, Gemini 2.5 Pro, Claude 3.7 Sonnet, o3, DeepSeek R1, and Llama 4) across 200,000+ simulated conversations.

Single-turn prompt: 90% performance. Multi-turn conversation: 65% performance. Same model. Same task. Just... talking normally.

The culprit isn't intelligence. Aptitude only dropped 15%. Unreliability EXPLODED by 112%.
→ LLMs answer before you finish explaining (wrong assumptions get baked in permanently)
→ They fall in love with their first wrong answer and build on it
→ They forget the middle of your conversation entirely
→ Longer responses introduce more assumptions = more errors

Even reasoning models failed. o3 and DeepSeek R1 performed just as badly. Extra thinking tokens did nothing. Setting temperature to 0? Still broken.

The fix right now: give your AI everything upfront in one message instead of back-and-forth.

Every benchmark you've seen was tested on single-turn prompts in perfect lab conditions. Real conversations break every model on the market and nobody's talking about it.
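The mitigation the post suggests, front-loading the full specification into one turn, is simple to sketch. The turns and wording below are hypothetical; the consolidation step is the point.

```python
# Hypothetical drip-fed clarifications a user might otherwise send one turn
# at a time, letting the model commit to early wrong assumptions.
turns = [
    "I need a summary of this report.",
    "Keep it under 100 words.",
    "The audience is executives, not engineers.",
    "Use bullet points.",
]

def consolidate(turns):
    """Merge incremental clarifications into a single up-front instruction."""
    requirements = "\n".join(f"- {t}" for t in turns)
    return f"Complete the task, meeting ALL of these requirements:\n{requirements}"

prompt = consolidate(turns)
print(prompt)
```

Sending `prompt` as the first and only message gives the model the complete constraint set before it can anchor on a partial one, which is exactly the single-turn condition where the paper reports the higher performance.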
DeepWriter AI @DeepwriterAI
@Ace_Azule Recent @AnthropicAI research suggested that people are using AI as an unquestioned authority, and much recent talk has been about the fears of AI. But the early results of this poll suggest that most of you don't see it that way!
Ace Azule @Ace_Azule
@DeepwriterAI AI is a superpower extension of my ideas. I often refer to AI as my co-creator.
DeepWriter AI @DeepwriterAI
What is AI to you?
DeepWriter AI retweeted
Garry P. Nolan @GarryPNolan
Cancer doesn't know who you are when it strikes. It is indiscriminate. It can take the best of us at any moment. The American Association for Cancer Research (AACR), many other related charitable organizations, patient advocacy groups, biotechnology and pharmaceutical companies, and, most importantly, individual researchers... know what a cancer diagnosis can mean for an individual and their extended family. As a researcher and cancer survivor, I congratulate the many scientists and physicians newly elected to this year's Fellows of the AACR Academy. I thank the AACR for selecting me to be part of this incredible group. Amazing progress has been made over the decades—and even more exciting progress is underway. Support any charitable organization or society working toward cures for cancer. Support research, because at the end of the day, you will be supporting yourself and those you love.
AACR @AACR

We are pleased to announce the Fellows of the AACR Academy Class of 2026. We look forward to celebrating their pioneering scientific achievements at the AACR Annual Meeting in April. brnw.ch/21wZqOo #AACR26 #AACRFellows

DeepWriter AI @DeepwriterAI
@hamptonism Much more than pre-AI. Now an expert + AI = much more leverage than ever.
ₕₐₘₚₜₒₙ @hamptonism
Do you believe a PhD is worth it in the age of Artificial Intelligence?
DeepWriter AI @DeepwriterAI
@iruletheworldmo Snow bunny + DeepWriter (a system designed to think laterally and question axioms) = 🤯
🍓🍓🍓 @iruletheworldmo
snow bunny (gemini 3 full) is coming. warning: it's very very good.
DeepWriter AI @DeepwriterAI
@NTFabiano We would love to discuss this with you. DM us. We are the world's leading writing agent for academic grade writing, patents, and creative solutions. We think we can change your mind and show you how AI can, when done right, extend human thinking in ways never dreamed of before.
Nicholas Fabiano, MD @NTFabiano
Writing is thinking. Don't let AI do it all.
DeepWriter AI @DeepwriterAI
Today, with coding agents used optimally, we can think at new heights, designing applications of unheard-of capability. And soon the same will be appreciated with scholarly agents like DeepWriter, allowing human thinking to reach even greater heights.