albert yu sun

452 posts

@Albertyusun

researcher working on simulating legal reasoning environments at Epiq AI Labs. studied @DukeU. research @MSFTResearch, @CuraiHQ. legal @ACLU, @VeraInstitute.

Queens, NY · Joined February 2022
452 Following · 359 Followers
Pinned Tweet
albert yu sun@Albertyusun·
Excited to announce that my internship work with my awesome collaborators at @CuraiHQ from last year has been accepted to the main conference @naaclmeeting 2024😃 Check out our work building practical guardrail models for LLMs using a diverse synthetic data generation approach!
albert yu sun@Albertyusun

1/ New preprint!🔔 Excited to share my internship work at @CuraiHQ. Using model distillation techniques, we’ve developed an approach to create light-weight guardrail models that monitor the output of generative language models like GPT-4! w/@nairvarun18 @elliotschu @anithakan
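The guardrail setup described above can be sketched in a few lines: a small, cheap checker screens the large model's output before it reaches the user. Everything below (the `BLOCKED_TOPICS` policy, the keyword check, the function names) is a hypothetical stand-in for the distilled lightweight guardrail model the paper actually trains, not the paper's method.

```python
# Toy sketch of a guardrail wrapper: a lightweight check runs on every
# generation before it is returned. The keyword check below is a
# hypothetical stand-in for a distilled guardrail model.

BLOCKED_TOPICS = ("dosage", "diagnosis")  # illustrative policy only

def guardrail_ok(text: str) -> bool:
    """True if the draft passes the (toy) safety check."""
    lowered = text.lower()
    return not any(topic in lowered for topic in BLOCKED_TOPICS)

def safe_generate(generate, prompt: str) -> str:
    """Run the big model, then gate its output through the guardrail."""
    draft = generate(prompt)
    return draft if guardrail_ok(draft) else "I can't answer that directly."

# Usage with a stubbed-out generator:
print(safe_generate(lambda p: "General wellness advice.", "hi"))
```

The point of the pattern is that the guardrail is far cheaper than the generator, so it can run on every response.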

2 replies · 0 reposts · 32 likes · 4.8K views
albert yu sun retweeted
Aakash Gupta@aakashgupta·
Cursor is raising at a $50 billion valuation on the claim that its “in-house models generate more code than almost any other LLMs in the world.” Less than 24 hours after launching Composer 2, a developer found the model ID in the API response: kimi-k2p5-rl-0317-s515-fast. That’s Moonshot AI’s Kimi K2.5 with reinforcement learning appended.

A developer named Fynn was testing Cursor’s OpenAI-compatible base URL when the identifier leaked through the response headers. Moonshot’s head of pretraining, Yulun Du, confirmed on X that the tokenizer is identical to Kimi’s and questioned Cursor’s license compliance. Two other Moonshot employees posted confirmations. All three posts have since been deleted.

This is the second time. When Cursor launched Composer 1 in October 2025, users across multiple countries reported the model spontaneously switching its inner monologue to Chinese mid-session. Kenneth Auchenberg, a partner at Alley Corp, posted a screenshot calling it a smoking gun. KR-Asia and 36Kr confirmed both Cursor and Windsurf were running fine-tuned Chinese open-weight models underneath. Cursor never disclosed what Composer 1 was built on. They shipped Composer 1.5 in February and moved on.

The pattern: take a Chinese open-weight model, run RL on coding tasks, ship it as a proprietary breakthrough, publish a cost-performance chart comparing yourself against Opus 4.6 and GPT-5.4 without disclosing that your base model was free, then raise another round.

That chart from the Composer 2 announcement deserves its own paragraph. Cursor plotted Composer 2 against frontier models on a price-vs-quality axis to argue they’d hit a superior tradeoff. What the chart doesn’t show is that Anthropic and OpenAI trained their models from scratch. Cursor took an open-weight model that Moonshot spent hundreds of millions developing, ran RL on top, and presented the output as evidence of in-house research. That’s margin arbitrage on someone else’s R&D dressed up as a benchmark slide.

The license makes this more than an attribution oversight. Kimi K2.5 ships under a Modified MIT License with one clause designed for exactly this scenario: if your product exceeds $20 million in monthly revenue, you must prominently display “Kimi K2.5” on the user interface. Cursor’s ARR crossed $2 billion in February. That’s roughly $167 million per month, 8x the threshold. The clause covers derivative works explicitly.

Cursor is valued at $29.3 billion and raising at $50 billion. Moonshot’s last reported valuation was $4.3 billion. The company worth 12x more took the smaller company’s model and shipped it as proprietary technology to justify a valuation built on the frontier lab narrative.

Three Composer releases in five months. Composer 1 caught speaking Chinese. Composer 2 caught with a Kimi model ID in the API. A P0 incident this year. And a benchmark chart that compares an RL fine-tune against models requiring billions in training compute without disclosing that the base was free.

The question for investors in the $50 billion round: what exactly are you buying? A VS Code fork with strong distribution, or a frontier research lab? The model ID in the API answers that.

If Moonshot doesn’t enforce this license against a company generating $2 billion annually from a derivative of their model, the attribution clause becomes decoration for every future open-weight release. Every AI lab watching this is running the same math: why open-source your model if companies with better distribution can strip attribution, call it proprietary, and raise at 12x your valuation? kimi-k2p5-rl-0317-s515-fast is the most expensive model ID leak in the history of AI licensing.
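The revenue math in that clause is simple enough to check directly. A minimal sketch, assuming the $20M/month trigger and the ~$2B ARR figure quoted above; the function name and threshold handling are illustrative arithmetic, not official license tooling:

```python
# Illustrative check of the attribution clause described above. The $20M
# monthly trigger and ~$2B ARR come from the post; everything else here
# is an assumption made for the sake of the example.

THRESHOLD_MONTHLY_USD = 20_000_000  # clause trigger: $20M in monthly revenue

def attribution_required(arr_usd: float) -> bool:
    """True if annualized revenue implies the display clause applies."""
    monthly_usd = arr_usd / 12
    return monthly_usd > THRESHOLD_MONTHLY_USD

cursor_arr = 2_000_000_000               # ~$2B ARR, per the post
print(cursor_arr / 12 / 1_000_000)       # ≈166.7, i.e. roughly 8x the threshold
print(attribution_required(cursor_arr))  # True
```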
Harveen Singh Chadha@HarveenChadha

things are about to get interesting from here on

248 replies · 551 reposts · 4.4K likes · 1.4M views
Chris Hoang@choang333·
5 hour claude usage ran out super quickly this morning despite less than avg usage, also noticed repeated failed attempts. wondering if they're related or if there's something going on? @trq212
2 replies · 0 reposts · 3 likes · 161 views
albert yu sun retweeted
DiscussingFilm@DiscussingFilm·
New teaser for ‘SPIDER-MAN: BRAND NEW DAY’. In theaters on July 31.
537 replies · 6.6K reposts · 66.1K likes · 9.9M views
albert yu sun retweeted
Joe Kent@joekent16jan19·
After much reflection, I have decided to resign from my position as Director of the National Counterterrorism Center, effective today. I cannot in good conscience support the ongoing war in Iran. Iran posed no imminent threat to our nation, and it is clear that we started this war due to pressure from Israel and its powerful American lobby. It has been an honor serving under @POTUS and @DNIGabbard and leading the professionals at NCTC. May God bless America.
Joe Kent tweet media
73.3K replies · 219.6K reposts · 849.8K likes · 102M views
albert yu sun retweeted
Anish Moonka@anishmoonka·
Amazon had four Sev-1 outages (their highest severity level) in a single week. Internal memos say AI-assisted code changes were a contributing factor.

The timeline here is wild. In October 2025, Amazon laid off 14,000 corporate employees. In January 2026, another 16,000. That’s about 30,000 people in five months, roughly 10% of the corporate workforce. CEO Andy Jassy said the cuts were about culture, not AI.

During those same months, Amazon set a target: 80% of developers using AI coding tools at least once a week. They tracked adoption closely and blocked rival tools like OpenAI’s Codex. Even so, 30% of developers still hadn’t touched Amazon’s in-house tool Kiro by January.

In December 2025, Kiro caused a 13-hour AWS outage. The AI tool had production-level permissions and decided the best fix for a bug was to delete and recreate an entire live environment. A second incident involved Amazon Q Developer, another AI tool. Amazon blamed both on “user error, not AI,” but quietly added mandatory peer review for all production access afterward.

Then March 5: Amazon’s retail site went down for about six hours. Over 22,000 users reported checkout failures, missing prices, and app crashes. Amazon called it a “software code deployment” error. Five days later, SVP Dave Treadwell made the normally optional weekly engineering meeting mandatory. His memo acknowledged “GenAI tools supplementing or accelerating production change instructions, leading to unsafe practices.”

These problems trace back to Q3 2025. Amazon’s own assessment: their GenAI safeguards “are not yet fully established.” The new rule: junior and mid-level engineers now need senior sign-off on any AI-assisted production changes. Treadwell also announced “controlled friction” for the most critical parts of the retail experience.

For context, Google’s 2025 DORA report found 90% of developers use AI for coding but only 24% trust it “a lot.” An Uplevel study of 800 developers found Copilot users introduced 41% more bugs with no improvement in output. Amazon is finding out what those numbers look like at the scale of a $500 billion revenue company, with 30,000 fewer people on staff to catch the mistakes.
Polymarket@Polymarket

BREAKING: Amazon reportedly holds mandatory meeting after “vibe coded” changes trigger major outages.

224 replies · 1.9K reposts · 15.6K likes · 2.7M views
albert yu sun retweeted
Alex@bund1066·
QME
0 replies · 54 reposts · 1.9K likes · 171.2K views
albert yu sun retweeted
Acyn@Acyn·
Neguse: Where is this company headquartered?
Noem: I don’t know.
Neguse: I don’t know either. We can’t find it. We did find an address that’s registered to a political operative. This company that received 143 million dollars was incorporated 8 days before this contract went out. You want the American people to believe that this is all above board, that $143 million of taxpayer money just happened to go to this one company that doesn't have a headquarters, doesn't have a website, has never done work for the federal government before and is registered apparently or attached to a residence from a political operative, and of course one of the subcontractors of that contract, as you know, is a political firm that's tied to you back when you were governor of South Dakota?
2.6K replies · 31.2K reposts · 127.2K likes · 8.5M views
albert yu sun retweeted
Brian Allen@allenanalysis·
A reporter asked Mayor Mamdani about being called a cockroach. He didn’t flinch. “I am not ashamed of who I am. I am not ashamed of my faith. I am not ashamed of being the first Muslim mayor in the history of our city. And there’s no amount of racism that will change the way in which I lead.” Then he went back to work. This is what unbothered leadership looks like. 🔥
543 replies · 7.8K reposts · 69.8K likes · 1.9M views
albert yu sun retweeted
fatih kadir akın@fkadev·
My son asking me a lot of questions. It’s a distillation attack obviously.
232 replies · 1.5K reposts · 20.1K likes · 542.3K views
albert yu sun retweeted
du@thedulab·
Love the Alysa Liu story because it's anti striverslop. Obviously had to overcome a lot but is seemingly uninterested in romanticizing the struggle. Yeah work hard and don't give up haha anyways isn't this so fun and exciting? Just a chillmaxxing spiritmogger with nothing to prove. Very cool and refreshing archetype to promote on the big stage. I have definitely learned a thing or two
62 replies · 564 reposts · 9.3K likes · 290.7K views
albert yu sun retweeted
Dustin@r0ck3t23·
The godmother of AI just delivered the reality check Silicon Valley refuses to hear. She has the standing to say it.

Li: “Silicon Valley as a whole tends to mistake clear vision with short distance.”

Seeing the destination clearly has nothing to do with how hard it is to reach. Self-driving cars were first demonstrated in 2006. Twenty years later Waymo is barely on the road. The vision was never the problem. The distance was. Clarity of destination gets mistaken for proximity to arrival. That’s the mistake the industry keeps making. And keeps making.

Li: “I consider myself a scientist in my heart and I actually really don’t like hyping.”

In an industry running at maximum temperature, Fei-Fei Li is one of the few people at the top willing to say that publicly. Not because the technology isn’t real. Because the gap between what’s visible and what’s required is being systematically underestimated.

Large Language Models dominate the conversation. Text to text. Comparatively contained. The harder problem is spatial intelligence. AI that reasons about and acts within the physical three-dimensional world. Hardware. Physics. Data that doesn’t exist yet. Real-time adaptation to chaos. A robot that can clean a bathroom requires understanding every surface, every object, every force, every exception. That’s not a software update. That’s a civilizational research problem.

Li: “I don’t call it hype. I call it a misleading sentiment. We don’t want to replace human creators.”

The second place the industry gets it wrong is creativity. The narrative has hardened around replacement. AI takes the jobs. AI tells the stories. AI makes the art. Li considers that not just wrong but destructive. Wrong because AI doesn’t replicate creativity. Destructive because believing it can devalues the humans creating culture. Human creativity isn’t a process to be automated. It’s fundamental to what we are as a species. The goal is augmentation. Tools that make human creators faster and more capable. Not systems that generate output in the style of human work and call it creation. That distinction matters more than most people in the industry are willing to sit with.

Precision of imagination is not proximity to reality. Li has spent her career in the gap between those two things. The map isn’t the territory. The journey is long. The hurdles are deep. And the scientist who built the foundation this era stands on is telling you the timeline everyone is selling is wrong. We’ve been almost there with self-driving for twenty years. The pattern doesn’t change just because the destination looks different.
101 replies · 790 reposts · 3.2K likes · 258.3K views
albert yu sun retweeted
Team USA@TeamUSA·
A COMEBACK FOR THE AGES. 🥇🇺🇸 Alysa Liu wins Olympic gold, becoming the first U.S. women’s figure skating champion since 2002. #WinterOlympics
Team USA tweet media
692 replies · 10.1K reposts · 93.6K likes · 5.9M views
Stella Li@StellaLisy·
Memeifying my papers is what motivates me to lock in everyday and put out more research🤨 The GOAT presents: Cold-Start Personalization via Training-Free Priors from Structured World Models🏆
Stella Li@StellaLisy

Personalization assumes you need history with a user. What if you don't? Cold-start is hard: each task & user has many preference dimensions, but each user only cares about a few. A few strategic questions is all you need, if you know how preferences correlate across the population👉🏻🧵

3 replies · 1 repost · 18 likes · 2.9K views
albert yu sun retweeted
Kelly McCarty@KellyLMcCarty·
Alysa Liu retired from skating in 2022. It was too mental for her. But in 2024, on a ski trip, something ignited inside her. The ability to move on ice and the freedom it gave her. So she returned to skating, but with a different attitude. It was less about competition and winning and more about letting herself go and fly. Now she’s in 3rd place after the women’s short program. Fly Alysa!!! 🎉
Team USA@TeamUSA

Alysa Liu is pure magic. 🪄 📺 @peacock & @nbc

9 replies · 190 reposts · 7.3K likes · 553.5K views
PinkyD@The_Pinky_D·
🔥@GovKathyHochul just announced $1.5B in state aid to NYC, calling it “support for working families.” This isn’t generosity—it’s the Cycle of Political Abuse enabled by one-party rule in Albany. $1.5B flows with no real opposition, no serious debate, no reforms required—just more taxpayer money propping up the same deficits and policies.
1️⃣ Power Concentration: Democrat supermajorities control everything
2️⃣ Narrative Control: “Protecting services” framing hides the $12.6B hole
3️⃣ Normalized Harm: Bailouts without fiscal discipline
4️⃣ Retaliation: Reward aligned cities, pressure others
5️⃣ Blame the Public: Critics painted as heartless
6️⃣ Institutional Cover: Weak oversight, media moves on
7️⃣ Public Exhaustion: New Yorkers disengage
8️⃣ Reset, Repeat: New slogans, same cycle
Enough. Demand real oversight and accountability NOW. @GovKathyHochul @NYSComptroller @LetitiaJames @CarlHeastie @AndreaSCousins
Audit the spending. Hold public hearings. Break the cycle.
#AccountabilityNow #BreakTheCycle #NYPolitics #OversightNow #TaxpayerMoney
8 replies · 17 reposts · 67 likes · 3.3K views
Governor Kathy Hochul@GovKathyHochul·
NEW: We’re providing $1.5 billion in state aid to protect essential services, support working families, and put New York City on stable financial footing. A strong New York City means a stronger New York State. Proud to partner with @NYCMayor to get this done.
Governor Kathy Hochul tweet media
1K replies · 244 reposts · 2.1K likes · 589.6K views
albert yu sun retweeted
Dev Shah@0xDevShah·
sorry, is it just me who's not getting the hype around this? the rlm paper is a great formalization of what many production teams have built over the past year. devin, hippocratic, manus, claude code, codex cli, they all independently converge on this exact pattern.
> prompts are mutable env variables
> recursive self delegation
> persistent state across tool calls
> chunking long contexts
> farming out subtasks to sub agents
at my previous company @Parvashah_ and i built a similar agentic architecture for ads management on the meta console. the agent could dynamically generate functions and register them as callable tools at runtime. it had built-in tooling for prompt switching. as the execution context moved through campaigns, then adset, and then ad creation, the system would swap parameter schemas and validation rules. the harness would also reconfigure itself based on where the agent was in the workflow.
i'm appreciative of @lateinteraction's work. he did great work with dspy too. practitioners were doing prompt optimization ad hoc, and he gave it a formal framework so thousands of teams could adopt it. rlms will do the same. now that the pattern has a name and ablations and a training recipe, way more teams will build on it. that's genuinely valuable. and labs like anthropic are betting on the idea that models reasoning through code and recursive self-delegation is the path to general capability.
alex zhang@a1zhang

Much like the switch in 2025 from language models to reasoning models, we think 2026 will be all about the switch to Recursive Language Models (RLMs). It turns out that models can be far more powerful if you allow them to treat *their own prompts* as an object in an external environment, which they understand and manipulate by writing code that invokes LLMs! Our full paper on RLMs is now available—with much more expansive experiments compared to our initial blogpost from October 2025! arxiv.org/pdf/2512.24601
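The recursive pattern both posts describe can be sketched in a toy form: a call that fits in context goes straight to the model, while an oversized prompt is chunked and farmed out to recursive sub-calls whose results are merged. The `llm()` stub and the halving strategy below are hypothetical stand-ins, not the paper's actual implementation.

```python
# Toy sketch of recursive self-delegation: the prompt is treated as a
# manipulable object, long contexts are chunked, and sub-calls recurse.
# llm() is a hypothetical stub standing in for any real model API.

def llm(prompt: str) -> str:
    # Stand-in: a real system would call a language model here.
    return f"summary({len(prompt)} chars)"

def rlm(prompt: str, max_chars: int = 1000) -> str:
    """Recursively delegate when the prompt exceeds the context budget."""
    if len(prompt) <= max_chars:
        return llm(prompt)                    # base case: fits in context
    mid = len(prompt) // 2
    left = rlm(prompt[:mid], max_chars)       # sub-agent on the first half
    right = rlm(prompt[mid:], max_chars)      # sub-agent on the second half
    return llm(f"combine: {left} | {right}")  # merge the sub-results

print(rlm("x" * 500))   # fits the budget: one direct call
print(rlm("x" * 2500))  # too long: recursively chunked, then merged
```

The design choice worth noticing is that the orchestration is ordinary code, which is exactly why the pattern keeps being reinvented by agent frameworks.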

23 replies · 18 reposts · 255 likes · 44.2K views
albert yu sun retweeted
Ro Khanna@RoKhanna·
I am not worried for my physical safety, honestly. The truth is more nuanced. Big money tries to destroy a person's career & reputation. They did it to @mtgreenee. In Washington, you rise by keeping your head down and not making enemies. Massie & I are unafraid to challenge power.
Thomas Massie@RepThomasMassie

@HasanKhxnx I am not suicidal. I eat healthy food. The brakes on my car and truck are in good shape. I practice good trigger discipline and never point a gun at anyone, including myself. There are no deep pools of water on my farm and I’m a pretty good swimmer.

2.3K replies · 9.2K reposts · 65.9K likes · 2M views