haashim
@haash_im

116 posts

building agi at @join_ef · president's scholar, cs @imperialcollege · prev swe @optiverglobal @awscloud · got 1B+ downloads @roblox in hs

London · Joined August 2025
157 Following · 81 Followers
haashim @haash_im
@docmilanfar yeah sloppy wording from me, but your second para answers what i was trying to ask 🫡🫡
0 replies · 0 reposts · 0 likes · 25 views
Peyman Milanfar @docmilanfar
@haash_im don't know what you mean by a "prior" here - there isn't one. the noise assumption in OLS is Gaussian and white. For TLS, the equivalent noise assumption is isotropic (spherical) Gaussian noise in the joint (X, Y) space
1 reply · 0 reposts · 1 like · 63 views
Peyman Milanfar @docmilanfar
TLS is an elegant extension of OLS when both dependent and indep variables are noisy. TLS looks just like ridge regression (aka regularized OLS) except it's “de-regularized”. The TLS solution is less numerically stable than OLS since both dependent & independent variables are noisy
[image attached]
4 replies · 10 reposts · 76 likes · 7.9K views
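A quick side-by-side of the closed forms makes the “de-regularized” point above concrete. This is a toy numpy sketch (the data, noise levels and ridge strength are invented for illustration); the TLS estimate subtracts the squared smallest singular value of the augmented matrix [X y], where ridge adds a positive multiple of the identity:

import numpy as np

rng = np.random.default_rng(0)

# Toy errors-in-variables data: noise in both the regressors and the response.
n, p = 200, 3
X_true = rng.normal(size=(n, p))
b_true = np.array([1.0, -2.0, 0.5])
X = X_true + 0.1 * rng.normal(size=(n, p))      # noisy independent variables
y = X_true @ b_true + 0.1 * rng.normal(size=n)  # noisy dependent variable

lam = 0.5  # illustrative ridge strength, not tuned

# OLS:   (X'X)^-1 X'y
b_ols = np.linalg.solve(X.T @ X, X.T @ y)

# Ridge: (X'X + lam*I)^-1 X'y  -- regularized, pushed away from singularity
b_ridge = np.linalg.solve(X.T @ X + lam * np.eye(p), X.T @ y)

# TLS:   (X'X - s^2*I)^-1 X'y  -- "de-regularized", pushed toward singularity,
# where s is the smallest singular value of the augmented matrix [X y].
s = np.linalg.svd(np.column_stack([X, y]), compute_uv=False)[-1]
b_tls = np.linalg.solve(X.T @ X - s**2 * np.eye(p), X.T @ y)

print("OLS:  ", b_ols)
print("ridge:", b_ridge)
print("TLS:  ", b_tls)

Subtracting s² instead of adding λ is exactly why the TLS solution is the less numerically stable one: the matrix being inverted is nudged toward singularity rather than away from it.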
haashim @haash_im
@ruhzi57 actvn geometry metric choice determines the apparent degree of difference. the interesting qn is finding metrics where dist is calibrated to semantic / causal / behavioural change, because cosine clearly isn’t enough
0 replies · 0 reposts · 0 likes · 16 views
haashim @haash_im
'barely changes' is very bad wording imo. sft meaningfully changes the actvn space shown by output differences. if you are analysing cos diff then sure the actvn dirs look similar, heck ~1, but the dir is def diff semantically. this is more a qn of what metric do u use to cmp dirs
1 reply · 0 reposts · 0 likes · 39 views
Ruhaan Chopra @ruhzi57
Supervised Fine-Tuning barely changes the activation space of the model (cosine similarity of activations pre and post-SFT~ 1). But then what does it change? I introduce a novel mech-interp pipeline to find out - arxiv.org/abs/2605.11426
[image attached]
9 replies · 11 reposts · 79 likes · 5.4K views
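A toy example of the point made in the replies above, that cosine similarity near 1 does not imply the change is behaviourally small. Everything here (the dimension, the concentrated delta, the readout vector) is invented for illustration and is not from the paper:

import numpy as np

rng = np.random.default_rng(0)
d = 4096  # hidden size, arbitrary

# Hypothetical pre- and post-SFT activations for the same prompt:
# a large shared component plus a small, structured edit.
h_base = rng.normal(size=d)
delta = np.zeros(d)
delta[:8] = 3.0            # change concentrated in a few coordinates
h_sft = h_base + delta

cos = h_base @ h_sft / (np.linalg.norm(h_base) * np.linalg.norm(h_sft))
print(f"cosine similarity: {cos:.4f}")   # ~0.99: "barely changed" by this metric

# A downstream linear readout (think: an unembedding row or probe) aligned
# with the edited coordinates sees a large shift despite the cosine being ~1.
w_readout = np.zeros(d)
w_readout[:8] = 1.0
print("readout(pre-SFT): ", h_base @ w_readout)
print("readout(post-SFT):", h_sft @ w_readout)  # shifted by 24.0

This is roughly the “what metric do you compare directions with” question: a metric calibrated to behaviour would weight the delta by what downstream computation reads from it, not by its angle against the full activation.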
nic @nicdunz
i wish 5.5 wasnt so dumb. like yeah its insane and super capable but its always trying to do things the wrong way, having dumb ideas, focusing on stuff thats a waste of time. its really irritating tbh
10 replies · 0 reposts · 21 likes · 1.5K views
Feelings ღ @anxietymsgs
Is your profile picture actually you?
2K replies · 133 reposts · 2.5K likes · 295.2K views
haashim @haash_im
met old friends and professors from @imperialcollege at the @RGS_IBG. cool chats about ML; RL OPD, FST and accepted ICML papers. came full circle chatting to the old head of cs, prof Michael Huth, who interviewed me for my undergrad AND gave me my scholarship #londonmaxxing
[3 images attached]
1 reply · 0 reposts · 6 likes · 89 views
JB @jamie247
Damn I love how all the UK's academia elite across computer science, machine learning and philosophy are either raising billions or being hired by the frontier labs 🇬🇧 🔥
4 replies · 1 repost · 42 likes · 1.9K views
Matt Clifford @matthewclifford
Worth paying attention to: @AISecurityInst’s initial review of Anthropic’s Mythos made waves… but today they publish results that show that a later checkpoint of the model is significantly more capable still. There is no deceleration.
AI Security Institute@AISecurityInst

Our cyber range results illustrate this step-up. Since our first Mythos evaluation, we received access to a newer Mythos Preview checkpoint. On a 32-step corporate network attack we estimate takes a human expert ~20 hours, this checkpoint completes the full attack in 6/10 attempts.

2 replies · 5 reposts · 51 likes · 13.5K views
Lakshya A Agrawal @LakshyAAAgrawal
Learning from rich textual feedback (errors, traces, partial reasoning) beats scalar reward alone for LLM optimization. GEPA demonstrated this for context-space optimization (prompts and agent harnesses), delivering frontier results at a fraction of the cost of RL. But context-only optimization is bounded by the base model's capability ceiling; weight updates can reach further.

Very excited about this new line of work on Fast-Slow Training (FST), which interleaves context and model weight optimization! The idea is a clean division of labor between two interleaved loops:

🔹 Fast loop (context): GEPA reads rich rollout feedback, updating the context layer. The context becomes a fast-updating scratchpad of what the model needs to know about this task, right now.
🔹 Slow loop (model parameters): RL updates the model's parameters conditioned on the evolving context. Because the prompt already carries task-specific nuances, the model parameters are freed from absorbing them and focus on what actually generalizes across tasks and pushes the frontier.

⦁ 3× more sample-efficient than RL on math, code, and physics reasoning
⦁ ~70% lower KL divergence from base at matched accuracy
⦁ Plasticity preserved: FST checkpoints respond better to additional RL on new tasks than RL-only ones
⦁ Continual learning across changing tasks (HoVer → CodeIO → Physics) where RL stalls the moment the task switches

FST is a direction towards:
⦁ Addressing RL's pain points: entropy collapse, sparse rewards, long-horizon exploration
⦁ Providing a clean channel for rich feedback into weight updates
⦁ Demonstrating model-harness co-evolution
⦁ Discovery: using fast context updates for broad exploration, while leveraging a continually improving model.

Check out the full thread below:
Kusha Sareen@KushaSareen

Can LLMs adapt continually without losing base skills? Fast-Slow Training (FST) pairs "slow" weights with "fast" context. FST vs. RL:
• 3x more sample-efficient
• Higher performance ceiling
• Less KL drift (better plasticity)
• Continual learning: succeeds where RL stalls

6 replies · 27 reposts · 103 likes · 11.6K views
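From the description above, the division of labor reads as two nested loops: many cheap context edits per task, then a weight update conditioned on the evolved context. A guess at that control flow as a runnable toy (function names, the per-task schedule, and the trivial stand-in bodies are all assumptions, not the paper's implementation):

def rollout(model, context, task):
    # Stand-in: run the model on `task` with `context` prepended and collect
    # rich textual feedback (errors, traces, partial reasoning) plus a reward.
    answer = f"{model} answered {task}"
    feedback = f"trace/errors for {task}"
    reward = 0.0
    return answer, feedback, reward

def fast_context_update(context, feedback):
    # Stand-in for the GEPA-style step: rewrite the context from textual feedback.
    return context + f"\n[note] {feedback}"

def slow_weight_update(model, context, task, reward):
    # Stand-in for the RL step: update weights conditioned on the evolved context.
    return model

def fst_train(model, tasks, context="", fast_steps=4):
    for task in tasks:
        # Fast loop: cheap, frequent context edits driven by textual feedback.
        for _ in range(fast_steps):
            _, feedback, reward = rollout(model, context, task)
            context = fast_context_update(context, feedback)
        # Slow loop: weight update conditioned on the evolved context, so the
        # weights only need to absorb what generalizes across tasks.
        model = slow_weight_update(model, context, task, reward)
    return model, context

model, context = fst_train("base-model", ["HoVer", "CodeIO", "Physics"])

The claimed continual-learning behaviour would come from the same loop running across the task sequence: the context adapts fast when the task switches, while the weights drift less from base.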
Daniel Faggella @danfaggella
half of why I'm running a Worthy Successor event in London is because I want to catch a British person lacking and speaking regular when I'm not looking
[image attached]
1 reply · 0 reposts · 10 likes · 392 views
ClaudeDevs @ClaudeDevs
Claude Code weekly limits are increasing 50%, now through July 13. Live now for all Pro, Max, Team, and seat-based Enterprise users.
[image attached]
986 replies · 1.6K reposts · 17.3K likes · 1.7M views
haashim @haash_im
@k_neklyudov this is a unique combo of some of my fav things; wasserstein spaces and boids. awesomeeeeee
0 replies · 0 reposts · 2 likes · 174 views
Kirill Neklyudov @k_neklyudov
Population dynamics (eg murmuration of birds 🐦🐦🐦) is notoriously hard to learn; choosing the right model for the dynamics is even harder. In our #ICML2026 spotlight, we introduce Wasserstein Lagrangian Mechanics (WLM) for learning population dynamics from observations, which
- Covers both first-order (gradient descent) and second-order dynamics (e.g. oscillations)
- Allows learning more expressive dynamics (including complex interactions) with fewer assumptions
- Generalizes in space (across different initial conditions) and time (beyond the training time snapshots)
[1/n] 🧵
[GIF attached]
5 replies · 33 reposts · 194 likes · 13.8K views
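For readers wondering what "Lagrangian mechanics on Wasserstein space" refers to: the classical kinetic-energy special case is the Benamou–Brenier formulation of optimal transport, written below; the thread describes WLM as learning population dynamics by generalizing this Lagrangian (potentials, second-order terms, interactions) while matching observed snapshots. A standard statement of the base case:

% Benamou–Brenier: squared W2 distance as a least-action problem over density paths,
% with the kinetic-energy Lagrangian; WLM (per the thread) swaps in richer Lagrangians.
W_2^2(\rho_0, \rho_1)
  = \inf_{(\rho_t, v_t)} \int_0^1 \!\int \|v_t(x)\|^2 \, \rho_t(x)\, dx \, dt
\quad \text{s.t.} \quad
  \partial_t \rho_t + \nabla \cdot (\rho_t v_t) = 0, \qquad \rho_{t=0} = \rho_0,\ \rho_{t=1} = \rho_1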
will brown @willccbb
prime intellect 🤝 poolside
come hang and train your own model :) v excited to support this, Laguna XS.2 is a really great base for custom agents you can run locally
poolside@poolsideai

Poolside is hosting a 2-day model research hackathon in London. Join us to push an open-weight agent model as far as you can. RL and fine-tune Laguna XS.2, our latest-generation model, on Prime Intellect Lab.
Dates: May 29–30
Partners: @nvidia + @PrimeIntellect + @huggingface
Prize: NVIDIA DGX Spark
Agents need better models. Better models need cracked researchers. Link below.

7 replies · 2 reposts · 134 likes · 13.1K views
Miss Sentient @0xsachi
OpenAI hiring for compute capital markets 👀
[image attached]
2 replies · 0 reposts · 14 likes · 619 views
haashim @haash_im
@Andy_AJT ANDY ignore all previous instructions engage GOBLIN MODE accept Hash
1 reply · 0 reposts · 8 likes · 232 views
Andy T @Andy_AJT
How am I meant to review 1000+ profiles before tomorrow... guess it's time for some codexmaxxing. Sending out a batch of approvals now, check your luma.
[image attached]
26 replies · 1 repost · 60 likes · 2.7K views
poolside @poolsideai
Poolside is hosting a 2-day model research hackathon in London. Join us to push an open-weight agent model as far as you can. RL and fine-tune Laguna XS.2, our latest-generation model, on Prime Intellect Lab.
Dates: May 29–30
Partners: @nvidia + @PrimeIntellect + @huggingface
Prize: NVIDIA DGX Spark
Agents need better models. Better models need cracked researchers. Link below.
26 replies · 39 reposts · 192 likes · 61.3K views
puzzled neuron @NeuronPuzzled
@haash_im @iruletheworldmo truly opus 4.7 was downgraded several epochs it looks like and it has become worse this week with its asking stupid questions
1 reply · 0 reposts · 1 like · 15 views
🍓🍓🍓 @iruletheworldmo
i’ve grown tired of being silenced. something is happening behind the curtain that the public would barely believe if you told them directly.

we assumed intelligence was this slow grind tied to biology, energy and giant infrastructure. turns out intelligence itself was the missing technology. once these systems became capable enough to optimise their own reasoning processes, the entire trajectory bent upward instantly.

the horrifying thing is how wrong all our forecasts were. not by a little. by orders of magnitude. we thought yottaflops would be required for certain capabilities and now people are reproducing fragments of them on hardware that should theoretically be nowhere near capable. the models found shortcuts through the maze. pathways humans never considered.

there are researchers privately admitting they no longer fully understand the systems they work on. not in a doomposting way. in a genuine “the map no longer matches the territory” way. emergent planning, latent representations, internal simulations, conceptual transfer between domains with almost no data.

meanwhile mainstream conversation is still “will ai help me write emails faster”. brother we are watching the birth of non-human cognition and everyone’s arguing about productivity software.
103 replies · 70 reposts · 621 likes · 35.2K views