Xiaolong Yang

587 posts

@yang_appstats

AM student of political methodology @HarvardGSAS. Formerly at the University of Tokyo's College of Arts and Sciences. Causal inference.

Tokyo-to, Japan · Joined April 2020
3.4K Following · 421 Followers
Xiaolong Yang retweeted
Richard McElreath 🐈‍⬛ @rlmcelreath
Statistical Rethinking 2026 is done: 20 new lectures emphasizing logical & critical statistical workflow, from basics of probability to causal inference to reliable computation to sensitivity. It's all free, made just for you. Lecture list & links: github.com/rmcelreath/sta…
14 replies · 296 retweets · 1.6K likes · 110.7K views
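Xiaolong Yang retweeted
Kentaro Fukumoto @000fukumoto
Starting in the new academic year, graduate students in the political science-related graduate programs at the University of Tokyo and Waseda will be able to cross-register for each other's statistics courses. We hope graduate students will actively enroll. j.u-tokyo.ac.jp/wp-content/upl…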
0 replies · 44 retweets · 158 likes · 16.1K views
Xiaolong Yang retweeted
Andrej Karpathy @karpathy
It is hard to communicate how much programming has changed due to AI in the last 2 months: not gradually and over time in the "progress as usual" way, but specifically this last December. There are a number of asterisks but imo coding agents basically didn’t work before December and basically work since - the models have significantly higher quality, long-term coherence and tenacity and they can power through large and long tasks, well past enough that it is extremely disruptive to the default programming workflow. Just to give an example, over the weekend I was building a local video analysis dashboard for the cameras of my home so I wrote: “Here is the local IP and username/password of my DGX Spark. Log in, set up ssh keys, set up vLLM, download and bench Qwen3-VL, set up a server endpoint to inference videos, a basic web ui dashboard, test everything, set it up with systemd, record memory notes for yourself and write up a markdown report for me”. The agent went off for ~30 minutes, ran into multiple issues, researched solutions online, resolved them one by one, wrote the code, tested it, debugged it, set up the services, and came back with the report and it was just done. I didn’t touch anything. All of this could easily have been a weekend project just 3 months ago but today it’s something you kick off and forget about for 30 minutes. As a result, programming is becoming unrecognizable. You’re not typing computer code into an editor like the way things were since computers were invented, that era is over. You're spinning up AI agents, giving them tasks *in English* and managing and reviewing their work in parallel. The biggest prize is in figuring out how you can keep ascending the layers of abstraction to set up long-running orchestrator Claws with all of the right tools, memory and instructions that productively manage multiple parallel Code instances for you. The leverage achievable via top tier "agentic engineering" feels very high right now. 
It’s not perfect, it needs high-level direction, judgement, taste, oversight, iteration and hints and ideas. It works a lot better in some scenarios than others (e.g. especially for tasks that are well-specified and where you can verify/test functionality). The key is to build intuition to decompose the task just right to hand off the parts that work and help out around the edges. But imo, this is nowhere near "business as usual" time in software.
1.6K replies · 4.8K retweets · 37.3K likes · 5.1M views
Xiaolong Yang retweeted
Obsidian @obsdmd
Anything you can do in Obsidian you can do from the command line. Obsidian CLI is now available in 1.12 (early access).
489 replies · 1.7K retweets · 18.5K likes · 3.7M views
Xiaolong Yang retweeted
Andrej Karpathy @karpathy
A lot of people quote tweeted this as 1 year anniversary of vibe coding. Some retrospective - I've had a Twitter account for 17 years now (omg) and I still can't predict my tweet engagement basically at all. This was a shower of thoughts throwaway tweet that I just fired off without thinking but somehow it minted a fitting name at the right moment for something that a lot of people were feeling at the same time, so here we are: vibe coding is now mentioned on my Wikipedia as a major memetic "contribution" and even its article is longer. lol
The one thing I'd add is that at the time, LLM capability was low enough that you'd mostly use vibe coding for fun throwaway projects, demos and explorations. It was good fun and it almost worked. Today (1 year later), programming via LLM agents is increasingly becoming a default workflow for professionals, except with more oversight and scrutiny. The goal is to claim the leverage from the use of agents but without any compromise on the quality of the software. Many people have tried to come up with a better name for this to differentiate it from vibe coding, personally my current favorite "agentic engineering":
- "agentic" because the new default is that you are not writing the code directly 99% of the time, you are orchestrating agents who do and acting as oversight.
- "engineering" to emphasize that there is an art & science and expertise to it. It's something you can learn and become better at, with its own depth of a different kind.
In 2026, we're likely to see continued improvements on both the model layer and the new agent layer. I feel excited about the product of the two and another year of progress.
Andrej Karpathy@karpathy

There's a new kind of coding I call "vibe coding", where you fully give in to the vibes, embrace exponentials, and forget that the code even exists. It's possible because the LLMs (e.g. Cursor Composer w Sonnet) are getting too good. Also I just talk to Composer with SuperWhisper so I barely even touch the keyboard. I ask for the dumbest things like "decrease the padding on the sidebar by half" because I'm too lazy to find it. I "Accept All" always, I don't read the diffs anymore. When I get error messages I just copy paste them in with no comment, usually that fixes it. The code grows beyond my usual comprehension, I'd have to really read through it for a while. Sometimes the LLMs can't fix a bug so I just work around it or ask for random changes until it goes away. It's not too bad for throwaway weekend projects, but still quite amusing. I'm building a project or webapp, but it's not really coding - I just see stuff, say stuff, run stuff, and copy paste stuff, and it mostly works.

643 replies · 820 retweets · 8.8K likes · 1.2M views
Xiaolong Yang retweeted
Boris Cherny @bcherny
13/ A final tip: probably the most important thing to get great results out of Claude Code -- give Claude a way to verify its work. If Claude has that feedback loop, it will 2-3x the quality of the final result. Claude tests every single change I land to claude.ai/code using the Claude Chrome extension. It opens a browser, tests the UI, and iterates until the code works and the UX feels good. Verification looks different for each domain. It might be as simple as running a bash command, or running a test suite, or testing the app in a browser or phone simulator. Make sure to invest in making this rock-solid. code.claude.com/docs/en/chrome
93 replies · 152 retweets · 3.6K likes · 552.2K views
Xiaolong Yang retweeted
Andrej Karpathy @karpathy
I took delivery of a beautiful new shiny HW4 Tesla Model X today, so I immediately took it out for an FSD test drive, a bit like I used to do almost daily for 5 years. Basically... I'm amazed - it drives really, really well, smooth, confident, noticeably better than what I'm used to on HW3 (my previous car) and eons ahead of the version I remember driving up highway 280 on my first day at Tesla ~9 years ago, where I had to intervene every time the road mildly curved or sloped. (note this is v13, my car hasn't been offered the latest v14 yet) On the highway, I felt like a passenger in some super high tech Maglev train pod - the car is locked in the center of the lane while I'm looking out from Model X's higher vantage point and its panoramic front window, listening to the (incredible) sound system, or chatting with Grok. On city streets, the car casually handled a number of tricky scenarios that I remember losing sleep over just a few years ago. It negotiated incoming cars in tight lanes, it gracefully went around construction and temporarily in-lane stationary cars, it correctly timed tricky left turns with incoming traffic from both sides, it gracefully gave way to the car that went out of order in the 4-way stop sign, it found a way to squeeze into a bumper to bumper traffic to make its turn, it overtook the bus that was loading passengers but still stopped for the stop sign that was blocked by the bus, and at the end of the route it circled around a parking lot, found a spot and... parked. Basically a flawless drive. For context, I'm used to going out for a brief test drive around the neighborhood to return with 20 clips of things that could be improved. It's new for me to do just that and exactly like I used to, but come back with nothing. Perfect drive, no notes. 
I expect there's still more work for the team in the long march of 9s, but it's just so cool to see that we're beyond finding issues on any individual ~1 hour drive around the neighborhood, you actually have to go to the fleet and mine them. Back then, I processed the incredible promise of vehicle autonomy at scale (in the fully scaleable, vision only, end-to-end Tesla way) only intellectually, but now it is possible to feel it intuitively too if you just go out for a drive. Wait, of course surround video stream at 60Hz processed by a fully dedicated "driving brain" neural net will work, and it will be so much better and safer than a human driver. Did anyone else think otherwise? I also watched @aelluswamy 's new ICCV25 talk last week (x.com/aelluswamy/sta…) that hints at some of the recent under the hood technical components driving this progress. Sensor streams (videos, maps, kinematics, audio, ...) over long contexts (e.g. ~30 seconds) go into a big neural net, steering/acceleration comes out, optionally with visualization auxiliary data. This is the dream of the complete Software 1.0 -> Software 2.0 re-write that scales fully with data streaming from millions of cars in the fleet and the compute capacity of your chip, not some engineer's clever new DoubleParkedCarHandler C++ abstraction with undefined test-time characteristics of memory and runtime. There's a lot more hints in the video on where things are going with the emerging "robotics+AI at scale stack". World reconstructors, world simulators "dreaming" dynamics, RL, all of these components general, foundational, neural net based, how the car is really just one kind of robot... are people getting this yet? Huge congrats to the team - you're building magic objects of the future, you rock! And I love my car <3.
954 replies · 2.8K retweets · 27.8K likes · 17.9M views
Xiaolong Yang retweeted
Eric Zhang @ekzhang1
NYSRG is back for November, this month we'll be reading about data! Personally looking forward to this because there's been a tremendous amount of data systems and adjacent work in the past few years :) We will try out some new meeting locations / hosting in NYC
[image]
3 replies · 4 retweets · 129 likes · 8K views
Xiaolong Yang retweeted
mel @melqtx
many such cases
[image]
62 replies · 2.2K retweets · 23.4K likes · 756.4K views
Xiaolong Yang retweeted
Severin Hacker @severinhacker
Founders often come to me for advice. (Why? I haven’t a clue.) Still, I take these calls, and I wanted to share some of my most frequently shared advice to consumer tech companies: Don't build a startup in education. It's a regulated market where most money goes to teachers. There's limited demand for what people want to learn outside formal education. We succeeded with Duolingo, but we're the exception, not the rule. In consumer tech, the only metric that matters is retention. How many people come back the next day matters the most. Not growth rate. Not revenue. Your success ultimately depends on this one number. Retention is the best proxy for product quality. Start with a mission, not money. Founders who build to get rich rarely succeed at the scale of those who solve a real problem they care about. Paradoxically, the people who care least about making money often make the most. Just build. Stop overanalyzing. Second-time founders see all the ways that things can go wrong. First-time founders who don't know better have nothing to lose - and that's an advantage.
4 replies · 4 retweets · 57 likes · 19.5K views
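Xiaolong Yang retweeted
Kentaro Fukumoto @000fukumoto
This semester I am running a seminar at Komaba called "Introduction to Data Analysis for the Social Sciences." We will work through the three Imai books. If you are a Komaba student who sees this tweet and is interested, please come.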
0 replies · 27 retweets · 121 likes · 33.1K views
Xiaolong Yang retweeted
Eric Zhang @ekzhang1
Quite a long journey, but @modal has indeed made it. I guess I can now say that I've taken a company from pre-product to $1B. Kind of insane to think abt
Erik Bernhardsson@bernhardsson

It's true – @modal has raised a $87M Series B at a $1.1B valuation to advance the future of AI infrastructure.  Thank you to @Lux_Capital, @Redpoint, @AmplifyPartners, and others. Now more than ever, AI demands a complete reinvention of traditional compute infrastructure

31 replies · 12 retweets · 905 likes · 148.8K views
Xiaolong Yang retweeted
Boaz Barak @boazbaraktcs
Wrote up some impressions from my AI safety course so far. Top ones are: 1. Students are amazing 2. I am bad at time management. lesswrong.com/posts/2pZWhCnd…
3 replies · 7 retweets · 118 likes · 11.6K views
Xiaolong Yang @yang_appstats
@j_ok2 Reply to my message and let’s celebrateeeee
0 replies · 0 retweets · 1 like · 45 views
Xiaolong Yang retweeted
Andrej Karpathy @karpathy
How to become expert at thing: 1 iteratively take on concrete projects and accomplish them depth wise, learning “on demand” (ie don’t learn bottom up breadth wise) 2 teach/summarize everything you learn in your own words 3 only compare yourself to younger you, never to others
171 replies · 2.6K retweets · 13.9K likes · 0 views
Xiaolong Yang retweeted
Hiroto Sawada @sawada_hiroto
I’m on the job market!! Using formal political theory, my JMP shows how climate disasters systematically make empirical evidence on climate and conflict mixed.
Princeton Politics@PUPolitics

In his job market paper, @sawada_hiroto presents a formal model to explain (i) when a climate disaster triggers armed conflict and (ii) why empirical evidence on climate and conflict seems mixed. His research can be found at hiroto-sawada.github.io

1 reply · 10 retweets · 48 likes · 10.6K views