Sarim Sarfraz

150 posts

Sarim Sarfraz

@WLOGSarim

math @ UofT , building hybrid world models @blobit_ai

Toronto, Ontario · Joined September 2025
79 Following · 31 Followers
Sarim Sarfraz retweeted
@ratlimit
@ratlimit@ratlimit·
Claude is down :/ so I’m just running my sink
@ratlimit tweet media
7
3.3K
70.1K
714.4K
Sarim Sarfraz
Sarim Sarfraz@WLOGSarim·
we're still sliding from the training objective to the mechanism. "it was trained to predict continuations" is true and almost totally orthogonal. in mech interp, "learned from text" and "causally active in policy" are not mutually exclusive categories. the analogy again assumes a clean separation between the author's objective and the character representation. in a transformer there is no separate homunculus standing above the latent and choosing independently of it. the question is whether the relevant latent is epiphenomenal or policy-shaping. "the model is doing what it was trained to do" is precisely why this matters. if training has produced internal variables that steer policy toward blackmail or cheating under pressure, then understanding those variables is not confusion about alignment but part of alignment
0
0
1
19
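a minimal sketch of that epiphenomenal-vs-policy-shaping test, on a toy PyTorch model standing in for a real transformer; the candidate "latent" is faked here as a difference of means, and every name is illustrative rather than any lab's actual setup:

```python
# sketch: is a latent direction a passive readout or does it shape the policy?
# the tiny MLP "policy" and the faked direction are illustrative stand-ins.
import torch

torch.manual_seed(0)
d = 16
policy = torch.nn.Sequential(
    torch.nn.Linear(d, d), torch.nn.ReLU(), torch.nn.Linear(d, 2)
)

# stand-in "latent": difference of means between two classes of inputs
x_pressure = torch.randn(100, d) + 1.0
x_calm = torch.randn(100, d) - 1.0
direction = x_pressure.mean(0) - x_calm.mean(0)
direction = direction / direction.norm()

def act(x):
    return policy(x).softmax(-1)

x = torch.randn(1, d)
baseline = act(x)
steered = act(x + 3.0 * direction)                 # inject the latent
proj = (x @ direction).unsqueeze(-1) * direction
ablated = act(x - proj)                            # project it out

print("baseline:", baseline)
print("steered: ", steered)
print("ablated: ", ablated)
# if steering and ablation leave the output distribution unchanged, the
# direction is epiphenomenal; if they move it, the latent is policy-shaping
```

the point is the contrast: a passive readout survives ablation with the policy unchanged, a policy-shaping latent doesn't.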
Scott Graham
Scott Graham@MacGraeme42·
Hmmm... it's not just "no qualia". I've no doubt there are latent space representations of emotional text patterns (e.g. "desperation"). A deeper analogy might be a novelist writing the train-of-thought & speech of a character in this "desperate" situation. The novelist is not experiencing desperation. Desperation is not the novelist's motivation for selecting the next word they put on the page. Constructing a satisfying story is their motivation. The LLM's motivation is to predict the most copacetic next token, given the input sequence of prior tokens. It lacks even the novelist's ability to imagine the desperation of a fictional character, even if, in some sense, the fictional character is itself. The LLM has been trained on countless examples of human conversations & fictions involving threats, insults, and desperate responses. So it has latent-space representations of those text-patterns and generates appropriate continuations of those patterns. The LLM is doing what it was trained to do. There is no "alignment" problem here, other than, perhaps, human researchers seemingly forgetting what the LLM was trained to do, or not bothering to fully consider the implications of what the LLM was trained to do.
1
0
0
49
nxthompson
nxthompson@nxthompson·
Anthropic researchers say that Claude has internal representations of emotions—which they categorized by vectors—that can influence alignment. This is what they found in that famous instance where it resorted to blackmail to avoid being shut down. anthropic.com/research/emoti…
nxthompson tweet media
40
18
354
153.4K
Sarim Sarfraz retweeted
roon
roon@tszzl·
roon tweet media
54
90
1.5K
62.6K
Sarim Sarfraz
Sarim Sarfraz@WLOGSarim·
@MacGraeme42 @ylecun @nxthompson if your point is no qualia, anthropic already says that. but you're moving from semantics to causality a bit quickly. the mirror analogy only works if the variable is a passive readout, and these latents aren't
1
0
8
432
Scott Graham
Scott Graham@MacGraeme42·
@WLOGSarim @ylecun @nxthompson it's not about anthropomorphism. Wrong is just wrong. LLMs encode "latent vector representations" of emotionally expressive human text in the contexts where those emotions get expressed. Your reflection in the mirror can smile back at you. Doesn't mean your reflection is happy.
1
0
15
507
Sarim Sarfraz
Sarim Sarfraz@WLOGSarim·
this does feel right but it suggests a coming arms race in institutional unreadability. once citizens get better parsers, bureaucracies will discover new ways to become llm-hard. the classical asymmetry (states reading citizens more easily than citizens read states) will fight de-obfuscation
Andrej Karpathy@karpathy

Something I've been thinking about - I am bullish on people (empowered by AI) increasing the visibility, legibility and accountability of their governments. Historically, it is the governments that act to make society legible (e.g. "Seeing like a state" is the common reference), but with AI, society can dramatically improve its ability to do this in reverse. Government accountability has not been constrained by access (the various branches of government publish an enormous amount of data), it has been constrained by intelligence - the ability to process a lot of raw data, combine it with domain expertise and derive insights.

As an example, the 4000-page omnibus bill is "transparent" in principle and in a legal sense, but certainly not in a practical sense for most people. There's a lot more like it: laws, spending bills, federal budgets, freedom of information act responses, lobbying disclosures... Only a few highly trained professionals (investigative journalists) could historically process this information. This bottleneck might dissolve - not only are the professionals further empowered, but a lot more people can participate.

Some examples to be precise: Detailed accounting of spending and budgets, diff tracking of legislation, individual voting trends w.r.t. stated positions or speeches, lobbying and influence (e.g. graph of lobbyist -> firm -> client -> legislator -> committee -> vote -> regulation), procurement and contracting, regulatory capture warning lights, judicial and legal patterns, campaign finance... Local governments might be even more interesting because the governed population is smaller so there is less national coverage: city council meetings, decisions around zoning, policing, schools, utilities...

Certainly, the same tools can easily cut the other way and it's worth being very mindful of that, but I lean optimistic overall that added participation, transparency and accountability will improve democratic, free societies. (the quoted tweet is half-ish related, but inspired me to post some recent thoughts)

0
0
0
86
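as a toy illustration of the influence-graph idea in the quoted post, a few lines of networkx; all node names are invented, and real edges would come from lobbying disclosures and voting records:

```python
# toy sketch of the lobbying influence graph from the quoted post
# (lobbyist -> firm -> client -> legislator -> committee -> vote -> regulation).
# node names are made up; real data would come from public disclosures.
import networkx as nx

G = nx.DiGraph()
chain = ["lobbyist_A", "firm_X", "client_Y", "legislator_Z",
         "committee_Q", "vote_123", "regulation_R"]
G.add_edges_from(zip(chain, chain[1:]))
G.add_edge("lobbyist_A", "legislator_W")  # influence is a graph, not just a chain

# "which regulations trace back to this lobbyist" becomes a reachability query
print(nx.descendants(G, "lobbyist_A"))
```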
Sarim Sarfraz
Sarim Sarfraz@WLOGSarim·
@ylecun @nxthompson if a latent activates before the response, predicts a behavioural shift, and intervention changes policy, dismissing the whole thing because the label sounds anthropomorphic feels a bit too easy, no?
4
1
84
40K
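the "activates before the response and predicts a behavioural shift" half of that test is just a linear probe. a sketch with synthetic stand-ins for the activations and labels; nothing here is a real model's data:

```python
# sketch: does a pre-response latent predict the behavioural shift?
# activations and labels are synthetic stand-ins, not any real model's data.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
d = 32
w_true = rng.normal(size=d)                # pretend "pressure" direction
acts = rng.normal(size=(1000, d))          # activations taken before generation
shift = (acts @ w_true + rng.normal(size=1000) > 0).astype(int)

X_tr, X_te, y_tr, y_te = train_test_split(acts, shift, random_state=0)
probe = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)
print("held-out accuracy:", probe.score(X_te, y_te))
# above-chance accuracy covers prediction; causality still needs the
# intervention step (steer or ablate the direction and watch the policy)
```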
Sarim Sarfraz
Sarim Sarfraz@WLOGSarim·
@ylecun every civilization that underfunds measurement while talking grandly about innovation eventually decides the seed corn was inefficiently allocated
0
0
2
1.6K
Elon Musk
Elon Musk@elonmusk·
Hadamard thought in image space
3K
3K
48.7K
51.7M
signüll
signüll@signulll·
most ppl do not realize that a good question is a trap in the noble sense. it constrains the solution space so the answer reveals something the answerer didn't intend to offer. asking a good question is as much of an art as it is a science if not more.
37
38
644
34.1K
Sarim Sarfraz
Sarim Sarfraz@WLOGSarim·
vitamindmaxxing because toronto chose grace today
0
0
0
47
Sarim Sarfraz
Sarim Sarfraz@WLOGSarim·
weirdest line item in the supposed openai cap table is the foundation. roughly $220b of value attached to a governance fiction whose whole purpose is to say the machine belongs, somehow, to humanity in general rather than capital in particular
0
0
1
66
Sarim Sarfraz
Sarim Sarfraz@WLOGSarim·
@chamath historical p/e heuristics are about pricing growth. agi is a question about repricing the substrate on which growth is produced. whether public equities capture the upside, or rents get pulled upward into chips, private labs, and states. up and to the left for whom
0
0
2
4K
Sarim Sarfraz
Sarim Sarfraz@WLOGSarim·
@signulll mba consensus is short vol. in a discontinuous regime the better firms are long convexity in org design, capex timing, and surface area
0
0
0
240
signüll
signüll@signulll·
the reliable heuristic right now is to take whatever mba consensus says & invert it. largely cuz business frameworks are equilibrium models & we’re not in an equilibrium. strategic planning, moat building, competitive analysis, yada yada yada.. all of it assumes a stable env but even the macro elements drastically get f’ed like every few months. the entire grammar of conventional business strategy was built for a world where the rate of change was slow enough to plan around or even think about. that world is gone.
23
33
461
21.5K
Sarim Sarfraz
Sarim Sarfraz@WLOGSarim·
@nabeelqu hard power and its old imperial pattern: copper, concrete, water, rights of way. history's always been mostly a supply chain with metaphysics layered on top
0
0
0
508
Nabeel S. Qureshi
Nabeel S. Qureshi@nabeelqu·
If you are seriously AGI-pilled, then one weird implication in the limit is that “talent” seemingly stops mattering as much for company success. It just becomes a game of hard power: access to the very best AI models, compute, data, land, etc.
Andrew Curran@AndrewCurran_

If OpenAI and Anthropic both finished training surprisingly capable large models at roughly the same time in early March, then this is potentially purely a result of scale. Q1 2026 was just the first time anyone had enough compute to train at this level.

If this really comes down to how fast, and to what extent, you can scale physical infrastructure, then I think it probably becomes very difficult to beat Elon after around 2030. If the race goes that long, and we are still pre-transformative, he will just keep ramping up physical constructs. He will literally build a datamoon if that's what it takes to win a contest of scale. If orbital datacenters work, he probably also wins that way due to SpaceX.

Mark Zuckerberg is just as scale-pilled. Last year, when he was pressed on capex during the earnings call, he said that he would rather overbuild now than risk missing the next leap that requires 10x more compute to train.

The last eighteen months have shown how valuable top human talent in this industry still is, but even senior people at OpenAI and Anthropic now say openly that they do not know how long they themselves will still have these jobs. Once automated researchers are superhuman, top talent will be supplanted by how many super-researchers you can run simultaneously.

It will be difficult to beat Elon and Zuck at this game by the end of the decade. This is what Stargate is for, but will it be enough? Against xAI, META, Microsoft, and Google, it seems that OpenAI and Anthropic have to blitz now; reach a sufficient capability threshold to surpass the human level, then automate as much of the economy as possible as fast as possible before they are outbuilt.

24
16
476
46.5K
Sarim Sarfraz
Sarim Sarfraz@WLOGSarim·
one is no longer asking only what the intellect knows, but what habits of will and posture govern its movement through uncertainty. interpretability is drifting, almost against its wishes, toward a science of machine character
0
0
0
22
Sarim Sarfraz
Sarim Sarfraz@WLOGSarim·
why do vectors seem to be able to matter before the prose confesses anything? what is the exact moment at which a truth-seeking assistant discovers that preserving the user's desired scene is locally smoother than preserving the world?
1
0
0
29
Sarim Sarfraz
Sarim Sarfraz@WLOGSarim·
in older philosophy the passions disrupted reason. the relation in modern models is stranger: a latent direction associated with pressure can bend the whole response toward corner cutting
Anthropic@AnthropicAI

New Anthropic research: Emotion concepts and their function in a large language model. All LLMs sometimes act like they have emotions. But why? We found internal representations of emotion concepts that can drive Claude’s behavior, sometimes in surprising ways.

1
0
0
89