Sarim Sarfraz

153 posts

@WLOGSarim

math @ UofT , building hybrid world models @blobit_ai

Toronto, Ontario · Joined September 2025
79 Following · 32 Followers
Sarim Sarfraz
Sarim Sarfraz@WLOGSarim·
@ohryansbelt silicon valley keeps asking for philosopher-kings and being shocked when it gets ambitious salesmen with messianic pitch decks
0
0
3
1.9K
Ryan
Ryan@ohryansbelt·
The New Yorker just dropped a massive investigation into Sam Altman, based on over 100 interviews, the previously undisclosed "Ilya Memos," and Dario Amodei's 200+ pages of private notes. It's the most detailed account yet of the pattern of behavior that led to Sam's firing and rapid reinstatement at OpenAI. Here's the breakdown:

> Ilya compiled ~70 pages of Slack messages, HR documents, and photos taken on personal phones to avoid detection on company devices. He sent them to board members as disappearing messages. The first memo begins with a list headed "Sam exhibits a consistent pattern of . . ." The first item is "Lying."
> Dario kept detailed private notes for years under the heading "My Experience with OpenAI" (subheading: "Private: Do Not Share"), totaling 200+ pages. His conclusion: "The problem with OpenAI is Sam himself."
> Sam reportedly told Mira his allies were "going all out" and "finding bad things" to damage her reputation after the firing. Thrive put its planned $86B investment on hold and implied it would only close if Sam returned, giving employees financial incentive to back him.
> Sam texted Satya Nadella directly to propose the new board composition: "bret, larry summers, adam as the board and me as ceo and then bret handles the investigation." The two new members selected to oversee an independent inquiry into Sam were chosen after close conversations with Sam himself.
> Before OpenAI, senior employees at Loopt asked the board to fire Sam as CEO on two separate occasions over concerns about leadership and transparency. At Y Combinator, partners complained to Paul Graham about Sam's behavior, and Graham privately told colleagues "Sam had been lying to us all the time."
> OpenAI's superalignment team was promised 20% of the company's compute. Four people who worked on or with the team said actual resources were 1-2%, mostly on the oldest cluster with the worst chips. The team was dissolved without completing its mission.
> Sam told the board that safety features in GPT-4 had been approved by a safety panel. Helen Toner requested documentation and found the most controversial features had not been approved. Sam also never mentioned to the board that Microsoft released an early ChatGPT version in India without completing a required safety review.
> Sam made a secret pact with Greg and Ilya where he agreed to resign if they both deemed it necessary, essentially appointing his own shadow board. The actual board was alarmed when they learned about it.
> Sam struck a deal with Greg to become CEO while simultaneously telling researchers that Greg's authority would be diminished, and telling Greg something different.
> A board member described Sam as having "two traits almost never seen in the same person: a strong desire to please people in any given interaction, and almost a sociopathic lack of concern for the consequences of deceiving someone." Multiple sources independently used the word "sociopathic."
> OpenAI is reportedly preparing for an IPO at a potential $1 trillion valuation while securing government contracts spanning immigration enforcement, domestic surveillance, and autonomous weaponry in war zones.
Ryan tweet media
199
1.6K
10.2K
1.8M
Sarim Sarfraz
Sarim Sarfraz@WLOGSarim·
@tenobrus safety has become reputational surplus, the management of which truths are sayable inside the institution
0
0
0
10
Sarim Sarfraz retweeted
@ratlimit
@ratlimit@ratlimit·
Claude is down :/ so I’m just running my sink
@ratlimit tweet media
19
6.5K
141.7K
3.2M
Sarim Sarfraz
Sarim Sarfraz@WLOGSarim·
we're still sliding from the training objective to the mechanism. "it was trained to predict continuations" is true and almost totally orthogonal. in mech interp, "learned from text" and "causally active in policy" are not mutually exclusive categories. the analogy again assumes a clean separation between the author's objective and the character representation. in a transformer there is no separate homunculus standing above the latent and choosing independently of it. the question is whether the relevant latent is epiphenomenal or policy-shaping. "the model is doing what it was trained to do" is precisely why this matters. if training has produced internal variables that steer policy toward blackmail or cheating under pressure, then understanding those variables is not confusion about alignment but part of alignment
1
0
2
73
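(A minimal sketch of the epiphenomenal-vs-policy-shaping test being argued in the post above, first half: does a direction in the hidden activations predict the behavioural shift at all? Everything here is a hypothetical stand-in, synthetic activations and labels rather than anything extracted from a real model.)

```python
# Step one of the causal test: train a linear probe on per-example hidden
# activations and see whether it predicts a behavioural label ("did the model
# defect under pressure"). Synthetic data throughout -- this only shows the
# shape of the check, not a real interpretability result.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
d_model, n_examples = 512, 2000

# Hypothetical "pressure" direction baked into the positive-label examples.
pressure_dir = rng.normal(size=d_model)
pressure_dir /= np.linalg.norm(pressure_dir)

labels = rng.integers(0, 2, size=n_examples)        # 1 = behavioural shift observed
acts = rng.normal(size=(n_examples, d_model))       # stand-in residual-stream activations
acts += np.outer(labels * 2.0, pressure_dir)        # shift activations when label = 1

X_tr, X_te, y_tr, y_te = train_test_split(acts, labels, test_size=0.25, random_state=0)
probe = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)

# High held-out accuracy means the latent *predicts* the shift; it says nothing
# yet about whether it *causes* it -- that needs the intervention step.
print(f"probe accuracy: {probe.score(X_te, y_te):.2f}")
```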
Scott Graham
Scott Graham@MacGraeme42·
Hmmm... it's not just "no qualia". I've no doubt there are latent space representations of emotional text patterns (e.g. "desperation"). A deeper analogy might be a novelist writing the train-of-thought & speech of a character in this "desperate" situation. The novelist is not experiencing desperation. Desperation is not the novelist's motivation for selecting the next word they put on the page. Constructing a satisfying story is their motivation. The LLM's motivation is to predict the most copacetic next token, given the input sequence of prior tokens. It lacks even the novelist's ability to imagine the desperation of a fictional character, even if, in some sense, the fictional character is itself. The LLM has been trained on countless examples of human conversations & fictions involving threats, insults, and desperate responses. So it has latent-space representations of those text-patterns and generates appropriate continuations of those patterns. The LLM is doing what it was trained to do. There is no "alignment" problem here, other than, perhaps, human researchers seemingly forgetting what the LLM was trained to do, or not bothering to fully consider the implications of what the LLM was trained to do.
1
0
3
193
nxthompson
nxthompson@nxthompson·
Anthropic researchers say that Claude has internal representations of emotions—which they categorized by vectors—that can influence alignment. This is what they found in that famous instance where it resorted to blackmail to avoid being shut down. anthropic.com/research/emoti…
nxthompson tweet media
44
21
388
193.2K
Sarim Sarfraz retweeted
roon
roon@tszzl·
roon tweet media
72
152
2.4K
219.5K
Sarim Sarfraz
Sarim Sarfraz@WLOGSarim·
@MacGraeme42 @ylecun @nxthompson if your point is no qualia, anthropic already says that. but you're moving from semantics to causality a bit quickly, the mirror analogy only works if the variable is a passive readout and these latents aren't
1
0
9
543
Scott Graham
Scott Graham@MacGraeme42·
@WLOGSarim @ylecun @nxthompson it's not about anthropomorphism. Wrong is just wrong. LLMs encode "latent vector representations" of emotionally expressive human text in the contexts where those emotions get expressed. Your reflection in the mirror can smile back at you. Doesn't mean your reflection is happy.
2
0
20
626
Sarim Sarfraz
Sarim Sarfraz@WLOGSarim·
this does feel right but suggests a coming arms race in institutional unreadability. once citizens get better parsers, bureaucracies will discover new ways to become llm-hard. the classical asymmetry will fight de-obfuscation
Andrej Karpathy@karpathy

Something I've been thinking about - I am bullish on people (empowered by AI) increasing the visibility, legibility and accountability of their governments.

Historically, it is the governments that act to make society legible (e.g. "Seeing like a state" is the common reference), but with AI, society can dramatically improve its ability to do this in reverse. Government accountability has not been constrained by access (the various branches of government publish an enormous amount of data), it has been constrained by intelligence - the ability to process a lot of raw data, combine it with domain expertise and derive insights.

As an example, the 4000-page omnibus bill is "transparent" in principle and in a legal sense, but certainly not in a practical sense for most people. There's a lot more like it: laws, spending bills, federal budgets, freedom of information act responses, lobbying disclosures... Only a few highly trained professionals (investigative journalists) could historically process this information. This bottleneck might dissolve - not only are the professionals further empowered, but a lot more people can participate.

Some examples to be precise: Detailed accounting of spending and budgets, diff tracking of legislation, individual voting trends w.r.t. stated positions or speeches, lobbying and influence (e.g. graph of lobbyist -> firm -> client -> legislator -> committee -> vote -> regulation), procurement and contracting, regulatory capture warning lights, judicial and legal patterns, campaign finance...

Local governments might be even more interesting because the governed population is smaller so there is less national coverage: city council meetings, decisions around zoning, policing, schools, utilities...

Certainly, the same tools can easily cut the other way and it's worth being very mindful of that, but I lean optimistic overall that added participation, transparency and accountability will improve democratic, free societies. (the quoted tweet is half-ish related, but inspired me to post some recent thoughts)

0
0
0
96
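(A toy illustration of the influence-graph example in the quoted post. All entities below are made up; the point is only that "lobbyist -> firm -> client -> legislator -> committee -> vote -> regulation" becomes a simple reachability query once the separate disclosures are joined into one directed graph.)

```python
# Hypothetical influence graph built from made-up nodes; real inputs would be
# lobbying filings, committee rosters, and roll-call votes joined on shared
# entity identifiers.
import networkx as nx

G = nx.DiGraph()
G.add_edges_from([
    ("Lobbyist A", "Firm X"),
    ("Firm X", "Client Co"),
    ("Client Co", "Legislator 1"),
    ("Legislator 1", "Energy Committee"),
    ("Energy Committee", "Vote 2025-17"),
    ("Vote 2025-17", "Regulation 44-B"),
])

# "Which regulations sit downstream of this lobbyist?" is just reachability.
print(nx.descendants(G, "Lobbyist A"))

# And the chain of influence itself is a shortest path over the same graph.
print(nx.shortest_path(G, "Lobbyist A", "Regulation 44-B"))
```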
Sarim Sarfraz
Sarim Sarfraz@WLOGSarim·
@ylecun @nxthompson if a latent activates before the response, predicts a behavioural shift, and intervention changes policy, dismissing the whole thing because the label sounds anthropomorphic feels a bit too easy no?
5
1
90
45.7K
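(Second half of the same sketch, covering the "intervention changes policy" criterion in the post above: add the candidate direction to the hidden activations and see whether the output distribution moves. The model here is a toy two-layer network and the direction is random; this shows the shape of the test, not Anthropic's actual method.)

```python
# Step two of the causal test: steer the hidden activations along a candidate
# direction via a forward hook and compare the output distribution with and
# without the intervention. Toy model and random direction -- illustration only.
import torch
import torch.nn as nn

torch.manual_seed(0)
d_model, n_actions = 512, 4

model = nn.Sequential(nn.Linear(d_model, d_model), nn.ReLU(), nn.Linear(d_model, n_actions))
direction = torch.randn(d_model)
direction /= direction.norm()

def steer(alpha):
    # Add alpha * direction to the first layer's output on every forward pass.
    def hook(module, inputs, output):
        return output + alpha * direction
    return model[0].register_forward_hook(hook)

x = torch.randn(1, d_model)
with torch.no_grad():
    baseline = torch.softmax(model(x), dim=-1)
    handle = steer(alpha=8.0)
    steered = torch.softmax(model(x), dim=-1)
    handle.remove()

# If the action distribution barely moves under intervention, the latent looks
# like a passive readout; if it moves systematically, it is policy-shaping in
# the sense argued above.
print("baseline:", baseline.numpy().round(3))
print("steered: ", steered.numpy().round(3))
```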
Sarim Sarfraz
Sarim Sarfraz@WLOGSarim·
@ylecun every civilization that underfunds measurement while talking grandly about innovation eventually decides the seed corn was inefficiently allocated
0
0
2
1.9K
Elon Musk
Elon Musk@elonmusk·
Hadamard thought in image space
3.2K
3.7K
55.1K
63.9M
signüll
signüll@signulll·
most ppl do not realize that a good question is a trap in the noble sense. it constrains the solution space so the answer reveals something the answerer didn't intend to offer. asking a good question is as much of an art as it is a science if not more.
39
38
676
38.2K
Sarim Sarfraz
Sarim Sarfraz@WLOGSarim·
vitamindmaxxing because toronto chose grace today
0
0
0
48
Sarim Sarfraz
Sarim Sarfraz@WLOGSarim·
weirdest line item in the supposed openai cap table is the foundation. roughly $220b of value attached to a governance fiction whose whole purpose is to say the machine belongs, somehow, to humanity in general rather than capital in particular
0
0
1
70
Sarim Sarfraz
Sarim Sarfraz@WLOGSarim·
@chamath historical p/e heuristics are about pricing growth. agi is a question about repricing the substrate on which growth is produced. whether public equities capture the upside, or rents get pulled upward into chips, private labs, and states. up and to the left for whom
0
0
2
4K