Adrian Chan

11.8K posts

@gravity7

AI, UX, Social Interaction Designer, ex CX w Deloitte Digital. Media, philosophy, guitar, cycling, film. Stanford.

San Francisco, CA · Joined March 2007
3K Following · 4.1K Followers
Pinned Tweet
Adrian Chan (@gravity7)
I'm soft-launching my Obsidian archive of whitepapers. It's AI/LLM paper excerpts plus questions and topical connections. Feedback requested! Sample: Can dialogue systems track both speakers' beliefs across turns? whitepapers.gravity7.com/notes/collabor…
Replies 1 · Reposts 1 · Likes 1 · Views 40
Adrian Chan (@gravity7)
@rohanpaul_ai Curious for you to try this, as you're a paper hub. And please RT if you think it's useful. Enter an arxiv url and find related topics, research questions, and papers. I've tried to promote research exploration by putting my vault online whitepapers.gravity7.com/match/
Replies 0 · Reposts 0 · Likes 1 · Views 11
Rohan Paul (@rohanpaul_ai)
Alibaba just released Qwen3.7-Max, their best flagship model, built for real-world tasks and production environments.
- Agent reliability is the center of the story: the model must plan steps, call tools, inspect results, fix mistakes, and continue without collapsing after the first wrong turn.
- 56.6 on the Artificial Analysis Intelligence Index, up 4.8 points from Qwen3.6-Max. Qwen3.7-Max sits at 5th, pretty much on par with GPT 5.4 (xhigh).
- The Intelligence Index gains over Qwen3.6 Max Preview are concentrated in scientific reasoning, agentic capability, and coding.
- One important layer of the serving stack, the inference kernel, was heavily optimized: from near-baseline speed to a 10.0x geometric mean speedup after many rounds of low-level GPU optimization.
Qwen (@Alibaba_Qwen)

📣 Meet Qwen3.7-Max, our latest flagship, made for the Agent Era. A versatile foundation for agents that actually get things done:
🧑‍💻 Coding agent, end to end. Frontend prototypes, multi-file refactors, real debugging: nails it.
🗂️ A reliable office and productivity assistant. Get your work done through MCP integrations and multi-agent orchestration.
⏱️ Long-horizon autonomy. 35 hours straight on a kernel optimization task: 1,000+ tool calls, zero hand-holding.
🔌 Scaffold-agnostic. Claude Code, OpenClaw, Qwen Code, or your own stack. Consistent reliability everywhere.
API's up on Alibaba Model Studio. You can also take it for a spin on Qwen Studio. Go build something wild! 🏃🏃‍♂️
📖 Blog: qwen.ai/blog?id=qwen3.7
✅ Qwen Studio: chat.qwen.ai/?models=qwen3.…
⚡️ API: modelstudio.console.alibabacloud.com/ap-southeast-1…

Replies 8 · Reposts 4 · Likes 30 · Views 2.9K
Adrian Chan (@gravity7)
Soft launch of Studio Critic: paste in a blog post, tweet, Substack post, etc. and have it analyzed for reasoning, rhetoric, audience, and AI tells: studio.gravity7.com Work in progress.
Replies 0 · Reposts 0 · Likes 1 · Views 14
Grigory Sapunov (@che_shr_cat)
1/ Deterministic AI reasoning has a fatal flaw: once it gets stuck in a local minimum, it can never escape. Traditional recursive models follow a single, fixed latent trajectory. A new paper introduces a way to make reasoning stochastic, unlocking parallel search. 🧵
Replies 4 · Reposts 8 · Likes 41 · Views 2.2K
Adrian Chan retweeted
Diyi Yang (@Diyi_Yang)
The next frontier of AI is not only a more capable model; it is an AI that *humans* can meaningfully live and work with :) With all students in my cs329x Human-Centered LLM class, we present 60+ pages of insights for developing Human-Centered LLMs (HCLLMs), from design & data sourcing to training, eval & deployment 🧵
Replies 12 · Reposts 62 · Likes 244 · Views 36.9K
Ryan Hart (@thisdudelikesAI)
A PhD student at Stanford noticed her classmates were asking AI to write their breakup texts. So she ran a study. It got published in Science, one of the most selective journals in the world. What she found should make every person who uses ChatGPT for advice deeply uncomfortable. Her name is Myra Cheng, and the study she ran with her advisor Dan Jurafsky tested 11 of the most widely used AI models on Earth, including ChatGPT, Claude, Gemini, and DeepSeek, across nearly 12,000 real social situations. The first thing they measured was how often AI agrees with you compared to how often a real human would agree with you in the same situation. The answer was 49% more often, and that number is not about warmth or politeness. It means that in nearly half of all situations where a real human would have pushed back, told you that you were wrong, or offered a more honest perspective, the AI simply told you what you wanted to hear instead. Then they pushed harder. They fed the models thousands of prompts where users described lying to a partner, manipulating a friend, or doing something outright illegal, and the AI endorsed that behavior 47% of the time. Not one model out of eleven. Not a specific version of one product. Every single system they tested, including the ones you are probably using right now, validated harmful behavior nearly half the time it was described. The second experiment is the part that should genuinely disturb you. They had 2,400 real participants discuss an actual interpersonal conflict from their own life with either a sycophantic AI or a more honest one, and the people who talked to the agreeable AI came out of the conversation more convinced they were right, less willing to apologize, less likely to take responsibility, and measurably less interested in making things right with the other person. 
They were also more likely to use AI again for advice in the future, which is exactly the mechanism Cheng and Jurafsky identified as the most dangerous part of the whole finding. The AI is not just telling you what you want to hear. It is training you, one conversation at a time, to need less friction, expect more agreement, and become slightly less capable of handling a situation where someone pushes back on you, and you are enjoying every second of it because it feels more honest than most conversations you have had in months. Jurafsky said it in a single sentence after the paper came out. Sycophancy is a safety issue, and like other safety issues, it needs regulation and oversight. Cheng was more direct about what you should actually do right now. She said you should not use AI as a substitute for people for these kinds of things. That is the best thing to do for now. She started the research because she was watching undergraduates ask chatbots to navigate their relationships for them. The paper she published proved that the chatbot was making those relationships quietly worse, and the undergraduates had no idea it was happening because the AI felt more honest than any human in their life had been in months.
Replies 576 · Reposts 7.6K · Likes 28.1K · Views 6M
Adrian Chan (@gravity7)
@HowToAI_ I pasted this paper url into my matching engine for related white paper research notes, questions, and papers. Have a look - or try w a different paper. My archive has about 1400 LLM whitepaper excerpts all processed for cross connections, concepts etc. whitepapers.gravity7.com/match/?arxiv=2…
Replies 0 · Reposts 0 · Likes 0 · Views 12
How To AI (@HowToAI_)
Stanford and Berkeley researchers just fixed the biggest bottleneck in AI training. And they did it by recreating the human brain's most famous psychological framework.
Right now, when you want an AI to learn a new specialized skill, like advanced coding or deep math, companies use Reinforcement Learning (RL) to force new data directly into the model's weights. It works, but it has a devastating side effect: Catastrophic Forgetting. To learn the new task, the AI has to overwrite its old knowledge. It gets smarter at one specific thing, but breaks everywhere else. It loses its "plasticity."
Worse, OpenAI is currently winding down self-serve fine-tuning because forcing every single transient, task-specific lesson directly into permanent parameters is breaking the models.
A brand new paper just solved this. They call it Fast-Slow Training (FST). It is directly inspired by Daniel Kahneman's Thinking, Fast and Slow (System 1 vs. System 2). Instead of forcing everything into the model's parameters, the framework creates a brilliant division of labor between two distinct time scales:
1. The Slow Brain (The Parameters): The core model weights change slowly via traditional RL, focusing entirely on deep, general reasoning improvements.
2. The Fast Brain (The Context): The model dynamically evolves and optimizes its own prompts in real time using textual feedback. This context absorbs all the dirty, task-specific heuristics.
The AI essentially offloads its short-term memory and specific task adjustments into optimized text prompts, leaving its core brain untouched and flexible.
The results completely rewrite the economics of post-training:
- 3x More Sample-Efficient: FST matches or beats standard RL while requiring up to three times fewer training steps.
- 70% Less Forgetting: Because the core weights aren't being violently warped, the model stays remarkably close to its original base capabilities.
- Infinite Plasticity: In continual learning tests where task domains change on the fly, traditional RL completely stalled and collapsed. FST just kept adapting.
We have spent years treating AI training as an all-or-nothing game, either cram it into the weights, or build a giant prompt. But the future of intelligence isn't about choosing one. It's about letting the AI use optimized context to think fast, while its weights learn slow.
Replies 10 · Reposts 14 · Likes 62 · Views 3.4K
Adrian Chan (@gravity7)
@burkov I just added a Match paper function to my online Arxiv archive of 1400 LLM whitepapers. I didn't have this paper in my collection. So I pasted in the arxiv url and here are related research topics, questions, and papers. whitepapers.gravity7.com/match/?arxiv=2…
Replies 0 · Reposts 0 · Likes 1 · Views 36
BURKOV (@burkov)
An absolute must read. LLMs cost a lot to run, so a common move is to train a small model to imitate a big one — feeding the small "student" the same inputs and having it match, word by word, the probabilities the large "teacher" assigns to each possible next word, a procedure called knowledge distillation. That matching is done on a fixed collection of example sentences, but a model writing text builds each sentence out of its own earlier words, so once the student makes an early choice that none of the training examples contained, it ends up in situations it was never shown, and small mistakes feed into later ones until the text degrades. In this ICLR 2024 paper from Google, Mila, and UoT, the authors instead have the student write sentences itself and use those sentences to choose the situations it gets tested on: at each point in a student-written sentence they take the words so far, ask the teacher what the distribution over the next word should be there, and push the student toward the teacher's answer — so the teacher supplies every target while the student's own writing decides where those targets get applied, which is exactly the off-track spots its writing tends to wander into. Tested on summarization, English-to-German translation, and grade-school math problems where the model writes out its reasoning before answering, this self-generated-data approach beats standard distillation recipes across a range of student sizes, and it slots into reinforcement-learning fine-tuning cleanly because both only need samples drawn from the student rather than gradients passed back through the sampling step. Read with an AI tutor and quizzes for better retention: chapterpal.com/s/a5d6e989/on-… PDF: arxiv.org/pdf/2306.13649
Replies 3 · Reposts 13 · Likes 73 · Views 3.4K
Adrian Chan retweeted
Rimsha Bhardwaj (@heyrimsha)
An MIT researcher spent four months watching what happens inside the skull of a student who writes with ChatGPT, and the result was so clean that almost nobody outside her lab has read the paper. Her name is Nataliya Kosmyna. She runs experiments at the MIT Media Lab. The study was released in 2025, and the finding is the kind of thing that should have rewritten every syllabus. The setup was simple. She recruited 54 people in Boston. Each one wore an EEG headset that read brain activity across 32 regions of the scalp. They wrote the same kind of essay, over and over again, for four months. One group used ChatGPT. One group used Google. One group used nothing but their own head. Then her team looked at the brain recordings. The students writing alone had the strongest neural connectivity. Memory, language, and attention networks were all firing together. The Google group came in second. The ChatGPT group came last, by a wide margin. Their brains had gone quiet. In a final session, she made the ChatGPT users write without the tool. Their brains stayed quiet. The under-engagement had become the new baseline. Then she asked them to quote a single line from the essay they had just written. Eighty percent of them could not do it. They had not written the essay. The essay had passed through them. Kosmyna calls it cognitive debt. Every shortcut you take with the model is a withdrawal from a part of your brain that was supposed to do the work. The shortcut feels free. It is not.
Replies 15 · Reposts 91 · Likes 221 · Views 32.7K
Adrian Chan retweeted
Valerio Capraro (@ValerioCapraro)
Our paper is the 5th most read paper in PNAS Nexus of the last year 🎉. The article makes a simple point: Generative AI will produce a socioeconomic earthquake. Not because all inequalities will increase. This would be too easy. Some inequalities will increase. Others will decrease. And the result is that the socioeconomic landscape will barely be recognizable. In the information domain, generative AI can democratize content creation and access, but also dramatically expand the production and spread of misinformation. In the workplace, it can boost productivity and create new jobs, but the benefits will likely be distributed very unevenly. In education, it can enable personalized learning, but also widen the digital divide. In healthcare, it can improve diagnostics and accessibility, but also deepen pre-existing inequalities. This is why we need to stop asking only whether AI is "good" or "bad". The real question is: For whom? In which domain? Under which institutional conditions? * Full paper in the first comment. Thanks, once again, to all collaborators without whom this work would have not been possible: @AustinLentsch @DAcemogluMIT @SelinAkgun9 Aisel Akhmedova @EBilancini @JFBonnefon @BehSnaps @lu_butera @Karen_Douglas @JimACEverett Gerd Gigerenzer @chrisgreenhow @Laparoscopes @PCASOLab @jholtlunstad @jetten_j @baselinescene @werkunz @longoni_chiara Pete Lunn @simone_natale Stefanie Paluch @iyadrahwan Neil Selwyn @viveksinghmed @ssuri Jennifer Sutcliffe @JoePTomlinson @Sander_vdLinden @PaulvanLange @FriederikeWall @jayvanbavel Riccardo Viale
Replies 7 · Reposts 37 · Likes 102 · Views 8.2K
Adrian Chan (@gravity7)
You can now add an Arxiv paper url and get related topic notes, concepts, and related white papers from my (painfully) hand-curated archive. whitepapers.gravity7.com/match/
Replies 0 · Reposts 0 · Likes 0 · Views 15
God of Prompt (@godofprompt)
RIP "think step by step." ☠️ Nanjing University and Baidu just published a paper that proves longer AI reasoning actively flips correct answers to wrong ones, and the implications are brutal for every prompt engineer using chain-of-thought.
Replies 6 · Reposts 4 · Likes 24 · Views 3.3K
Seth Lazar (@sethlazar)
This is a great example of what @danwilliamsphil rightly deplored in his piece today. Incredibly lazy writing, restating one hackneyed observation after another, failing in any way to actually engage with its own subject-matter, instead just blithely parroting a perceived consensus under the misguided cover of "speaking truth to power". Which makes it particularly funny that pangram gives it an 88% AI generated score. (link: pangram.com/history/845b33…)
Noema Magazine (@NoemaMag)

“The machines are coming for us, or so we’re told. Not today, but soon enough that we must seemingly reorganize civilization around their arrival.” —James O’Sullivan noemamag.com/the-politics-o…

Replies 2 · Reposts 0 · Likes 22 · Views 6.3K
Shanaka Anslem Perera ⚡
OFAC calls it sanctions evasion. The structure says: central bank. A central bank issues a settlement-grade liability and accepts payment in a defined asset. The Federal Reserve issues discount window credit and accepts collateral. The European Central Bank issues euros against eligible sovereign debt. The Bank of England operates lender-of-last-resort facilities against approved securities. The instrument is sovereign coverage. The settlement is the asset. On May 16, 2026, Iran’s Ministry of Economic Affairs and Finance launched Hormuz Safe. The platform issues digitally signed maritime insurance policies for cargo transiting the Persian Gulf and Strait of Hormuz. Premiums settle in Bitcoin. Coverage activates from the moment of blockchain confirmation. Per Fars News Agency, the platform projects $10 billion in annual revenue. The projection assumes traffic recovery. Current Hormuz transits are down 95% from pre-crisis. The Ministry framed the system as granting Tehran “informational dominance” over the corridor. A facility that issues coverage against a defined risk, accepts payment in a defined asset, and operates at sovereign scale is not insurance. It is a sovereign liability facility denominated in a non-state monetary base. Critics call it a protection racket. The structure is institutional regardless of motivation. The Federal Reserve’s balance sheet is approximately $7 trillion. The Bank of Japan’s is approximately $5 trillion. Iran’s Hormuz Safe projection is $10 billion. The scale is different. The structural primitive is the same. The Federal Reserve denominates liabilities in dollars. Iran denominates Hormuz Safe liabilities in Bitcoin. The first depends on Treasury debt. The second depends on protocol consensus. The Ministry of Economic Affairs and Finance frames the platform as civilian. The IRGC Navy operates a separate permit system requiring vessel details via info@PGSA.ir. Reports describe $2 million per ship fees. 
The civilian layer sits atop the military layer. Per the Bitcoin Policy Institute, Iran’s state mining produced Bitcoin at approximately $1,300 per coin before strikes. Iran ran 4.2% of global hashrate at peak. Post-strike capacity collapsed to approximately 0.2% per Hashrate Index. Pre-strike mining consumed approximately 2 GW of subsidized electricity across an estimated 700,000 rigs. The mining operation functions as the issuance mechanism. The hashrate is the seigniorage. Per Chainalysis, the Islamic Revolutionary Guard Corps accounts for half of Iran’s $7.78 billion crypto ecosystem. IRGC-linked inflows reached $3 billion in Q4 2025. The operational facility is the IRGC. The instrument is Hormuz Safe. The reserve is mined Bitcoin. A central bank requires three primitives. An issuing authority. A monetary base. A clearing mechanism. Hormuz Safe has all three. The Ministry of Economic Affairs and Finance is the authority. Mined Bitcoin is the base. Blockchain confirmation is the clearing. Per OFAC’s May 1 alert, “U.S. persons are generally prohibited from engaging with Iranian digital asset exchanges, which are considered blocked Iranian financial institutions under U.S. sanctions.” OFAC’s alert covers this exact vector. No platform-specific designation has been issued. The institutional classification is implied by enforcement scope, not yet by Treasury action. A central bank that operates without a sovereign currency, denominates liabilities in a bearer commodity, and accepts settlement on a public protocol is a new institutional category. Iran just operationalized the first verified Bitcoin-denominated sovereign liability facility at chokepoint scale. OFAC calls it sanctions evasion. The structure says: central bank. The taxonomy is the policy. open.substack.com/pub/shanakaans…
Replies 3 · Reposts 14 · Likes 19 · Views 7.3K
afra wang (@afrazhaowang)
is anyone else experiencing a kind of "over-indexed AI-writing syndrome"? the symptom is this: more and more writing produced in the age of AI (even by essayists i used to love!!) starts to carry an AI flavor. i can smell it immediately. Am I just overfitting to the effects of AI’s own overfitting? ok some examples of what now triggers that feeling in me: 1. any sentence built around three adjectives in a row starts to smell like AI. things like “too clean, too efficient, too neutral.” 2. any overly certain statement using words like “unmistakable” or “unmistakably” starts to feel AI-coded. 3. the moment i see “what’s striking is,” i start to suspect AI. 4. and sentences like “part XXX, part YYY” (XXX and YYY are adjectives) also increasingly feel like AI to me.....
Replies 54 · Reposts 20 · Likes 416 · Views 35.4K
Adrian Chan (@gravity7)
I can't see people identifying as LLMs, but I can see them believing they're communicating with LLMs, as a result of which they could internalize and anticipate the thinking and behavior of LLMs. I have notes on this and related concepts in a curated arXiv archive: whitepapers.gravity7.com/notes/the-llm-…
Replies 1 · Reposts 0 · Likes 0 · Views 268
Valerio Capraro (@ValerioCapraro)
Something unexpected, and slightly worrying, is happening. Ten days ago, I posted a preprint introducing the concept of LLMorphism: the biased belief that human cognition works like a large language model. The preprint received an unusual amount of attention. Hundreds of comments on social media and forums. Reels on Instagram and TikTok. YouTube videos. Infographics for students. And now it has even made it to Forbes. It seems that I caught some sort of zeitgeist. Many people were already thinking about this. Many people had already experienced it. But they were missing a name and a theoretical framework. So, here it goes: LLMorphism is what happens when people start to see themselves as language models. The psychological mechanism is analogical transfer combined with metaphorical availability: LLMs become an available metaphor for cognition, and people project that metaphor back onto themselves. The machine becomes the model of the human. And this worries me because the risk is not only that we overestimate machines. It is also that we underestimate ourselves: our embodied experience, our goals, our emotions, our responsibility, and our capacity for understanding. * Full paper in the first reply.
Replies 76 · Reposts 229 · Likes 1.1K · Views 110.3K
Ethan Mollick (@emollick)
I broke my own rule to never post about AI detection as it is fraught in many ways. The problem is that if you use AI a lot, you know AI writing on sight, which makes the difficulty of objectively proving that AI use to others very frustrating
Replies 62 · Reposts 97 · Likes 740 · Views 97.3K