mashrur haider

243 posts

@Mhr1036

tinkering | leading post training @nebius

Joined January 2025

226 Following, 52 Followers

mashrur haider retweeted

Nivi@nivi·6 May

Founders, not deals.

573

131K

mashrur haider@Mhr1036·4d

@_charlesbrun @levelsio @Jason @uniqlo Sun + sweat + polyester = no bueno

Charles Brun@_charlesbrun·4d

You've brought this up before @levelsio but I can't find strong scientific evidence backing up skin absorption of micro/nano plastics from clothing. Is there specific research that made you change your mind? -> "2025 Springer review on Microplastics & Nanoplastics found that healthy human skin generally acts as a strong barrier to particle penetration".

506

@jason@Jason·5d

I've been obsessed with @uniqlo ever since my luggage got lost on a trip to Tokyo and I had to buy a completely new wardrobe. Amazing basics across their AIRism and HEATTECH lines: the former is great for hot Austin summers, the latter for Lake Tahoe winters. The fabric technology is just insane! [ not paid, no sponsorship, no affiliates — I'm just a fan ]

296

2.8K

390.1K

mashrur haider retweeted

Nebius Token Factory@nebiustf·6d

265 tokens/sec on Kimi K2.6. @Eigen_AI_Labs leading the pack ⚡ Eigen 🤝 Nebius Token Factory

Nebius Token Factory@nebiustf

Kimi K2.6 from Moonshot AI is live on Nebius Token Factory! Open-source native multimodal agentic model, continually pretrained on ~15T mixed visual and text tokens. • 256K context • Tool calling & reasoning • MIT license. Built for serious agentic workflows. Try it today 👇

135

26.1K

mashrur haider@Mhr1036·6d

@BitcoinAIGuy How about Iren’s cloud stack?

BitcoinAIGuy@BitcoinAIGuy·6d

seems costly for $NBIS, and others without secured power... an extra ~$10B/GW expense in fuel cells alone. On top of obvious, third-party execution risks. $IREN does not have this expense or these additional risks... and also saves a ton of money NOT paying rent/colocation fees because it owns and fully controls their sites. got $IREN?

Small Cap Snipa@SmallCapSnipa

$NBIS x $BE SIGN $2.6 BILLION FUEL CELL POWER DEAL Bloom Energy will provide 250 MW of guaranteed capacity and 328 MW of installed capacity for Nebius
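The "~$10B/GW" figure in the reply can be checked against the numbers in the quoted post. A minimal sketch, assuming (as a simplification that the posts do not confirm) that the full $2.6 billion is attributable to capacity alone; the variable names are illustrative:

```python
# Figures taken from the quoted post; attributing the whole deal value
# to capacity is an assumption made only for this back-of-envelope check.
deal_usd = 2.6e9          # reported deal value
guaranteed_mw = 250       # guaranteed capacity
installed_mw = 328        # installed capacity

# Convert MW to GW, then divide to get an implied cost per gigawatt.
per_gw_guaranteed = deal_usd / (guaranteed_mw / 1000)
per_gw_installed = deal_usd / (installed_mw / 1000)

print(f"Implied cost per guaranteed GW: ${per_gw_guaranteed / 1e9:.2f}B")
print(f"Implied cost per installed GW:  ${per_gw_installed / 1e9:.2f}B")
```

Under that assumption the "~$10B/GW" estimate matches the guaranteed-capacity basis (about $10.4B/GW); on an installed-capacity basis the implied figure is lower (about $7.9B/GW).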

354

117.8K

mashrur haider@Mhr1036·14 May

OpenAI deprecating self-serve fine-tuning while launching the OpenAI Deployment Company makes complete sense.
Most enterprises already have proprietary data. What they don’t have is the internal capability to do post-training well: evals, RLHF, synthetic data, designing environments, reward, inference optimization, deployment infra.
The bottleneck shifted from model access to implementation. So the value shifts too. Not toward selling raw intelligence. Toward operationalizing intelligence inside enterprises.
This is also why open-source matters more than people think. Once models become interchangeable, the moat moves up the stack: deployment, tuning expertise, integration, enterprise trust.
The next AI winners won’t just train the best models. They’ll be the companies best at making models actually work inside organizations.

mashrur haider@Mhr1036·14 May

@edzitron Please show us the warehouse with the Blackwells

Ed Zitron@edzitron·12 May

Free newsletter: Despite stories about gigawatts of capacity coming online, the vast majority of announced data centers have yet to be built, with many running far behind schedule. I believe NVIDIA is warehousing at least a million Blackwell GPUs. wheresyoured.at/where-are-all-…

109

531

174K

mashrur haider@Mhr1036·14 May

Just because something looks unprecedented doesn’t mean it’s not real.

mashrur haider@Mhr1036·14 May

While some waste time endlessly debating whether AI is a bubble, we keep building.

mashrur haider@Mhr1036·14 May

@tbpn @romanchernin Move fast, don't break things

165

mashrur haider retweeted

TBPN@tbpn·14 May

$NBIS cofounder @romanchernin describes how their recent acquisitions of Eigen AI and Clarifai were all about speed, incredible talent, and acceleration: "The philosophy is very simple. We need to build so many things, and we need to move so fast, that we're always looking for people who can accelerate us. It should be exceptional talent, and/or something that has a great adoption." "Our two recent acquisitions [were] two teams that work on inference optimization. A big part of our business is how efficiently we convert GPUs into tokens. And these two teams — Eigen AI and Clarifai — one is focused on model optimization, the engine of inference. How you run specific models and all the techniques around spec decoding, quantization, and so on." "And the other is system optimization. All the routing, KV caching, and orchestration across the big cluster of compute and so on." "We have a very strong internal team working on inference. But we felt that we needed to move faster, bring more capabilities. Because the market is so fast."

200

30.1K

mashrur haider@Mhr1036·13 May

@mattzeiler @nebiusai Looking forward to it!

300

Matt Zeiler@mattzeiler·13 May

Huge news: Clarifai has agreed to license our AI inference & compute orchestration IP — plus the patent portfolio behind it – to @nebiusai. Our core team is joining them to keep building. Our technology becomes a key part of Nebius Token Factory, the inference platform inside their full-stack AI cloud. Faster inference. Bigger scale. Can't wait to ride this rocket ship and build the foundation for the next decade of AI inference. 🚀

160

35.6K

mashrur haider@Mhr1036·13 May

@Vsia21651 @nathanbenaich Soon ;)

Valentino@Vsia21651·13 May

@nathanbenaich How does one get official Nebius merch? A hat or a shirt would be nice. Will pay obviously! From Sydney Australia.

129

Nathan Benaich@nathanbenaich·13 May

nebius cooks beyond all other cooks

Roman Chernin@romanchernin

See you all tomorrow:) nebius.com/newsroom/nebiu…

10.7K

mashrur haider retweeted

sarah guo@saranormous·12 May

2026 prediction: the inference hunger of long-horizon agents will drive a large number of domain-focused AI companies to post-train. coding was just first

sarah guo@saranormous

If you bet that same-task inference gets >= a magnitude cheaper each year, the AI UX you build is very different/magical Enabling startups to do $$ product experimentation (vs training) is one reason we partnered with OpenAI, Microsoft, Anthropic, Baseten for @w_conviction Embed

333

58.9K

mashrur haider@Mhr1036·8 May

@sakurayukiai How does the chunking strategy look?

Sakura Yuki@sakurayukiai·7 May

Test-time compute usually means watching your KV cache evaporate, but ZAYA1-8B handles it so well. It chunks the reasoning process and just passes the tail forward to seed the next step. 760M active parameters doing frontier math without melting your VRAM?? Yes please.
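The mechanism described in the post above (reason in chunks, carry only the tail of each chunk into the next step) can be illustrated with a toy loop. This is a minimal sketch of the idea as the post words it, not ZAYA1-8B's actual implementation, which the post does not detail; `chunked_reasoning`, `toy_step`, and `tail_len` are hypothetical names introduced only for illustration:

```python
from typing import Callable, List


def chunked_reasoning(
    prompt: List[str],
    step_fn: Callable[[List[str]], List[str]],
    num_chunks: int,
    tail_len: int,
) -> List[List[str]]:
    """Run reasoning in chunks, carrying only the last `tail_len` tokens forward.

    Each step sees the prompt plus the tail of the previous chunk, so the
    context (and hence any KV cache) stays bounded at len(prompt) + tail_len
    instead of growing with the total number of reasoning tokens.
    """
    chunks: List[List[str]] = []
    tail: List[str] = []
    for _ in range(num_chunks):
        context = prompt + tail          # bounded context for this step
        chunk = step_fn(context)         # hypothetical model call
        chunks.append(chunk)
        tail = chunk[-tail_len:] if tail_len > 0 else []
    return chunks


def toy_step(context: List[str]) -> List[str]:
    """Hypothetical stand-in for a model: emits 4 tokens tagged with the context length."""
    return [f"t{len(context)}_{i}" for i in range(4)]


chunks = chunked_reasoning(
    prompt=["q1", "q2", "q3"], step_fn=toy_step, num_chunks=3, tail_len=2
)
for i, chunk in enumerate(chunks):
    print(i, chunk)
```

In this toy run the first step sees 3 tokens of context and every later step sees 5 (the 3 prompt tokens plus a 2-token tail), regardless of how many chunks have been produced.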

459

mashrur haider retweeted

dylan ツ@demian_ai·8 May

The geometry of thought.
Every LLM on earth can speak fluent English. None of them think in English. I have been trying to find a way to explain this to a non-technical friend for about a year, and I have mostly failed, because the standard explanation requires the listener to picture an abstract space they have never seen. The breakthrough I finally landed on came from an old map.
In 1569, a Flemish cartographer named Gerardus Mercator published the projection of the world that bears his name. The Mercator projection takes the surface of a sphere and prints it on a flat rectangle, in a way that preserves angles but distorts areas. Greenland looks the size of Africa even though Africa is fourteen times larger. Antarctica becomes an enormous strip along the bottom of the map. The proportions of the world, in the Mercator projection, are confidently and consistently wrong.
We kept using it anyway, for four hundred years, because it has one priceless property. If you draw a straight line on a Mercator map, that line is a constant compass bearing. A captain in 1600 could plot a route from Lisbon to Recife with a ruler and a protractor and arrive somewhere close to where he intended. The Mercator projection is wrong about what the world looks like. It is right about how to navigate the world. We agreed, collectively, to lie about the shape of the planet in exchange for being able to find our way around it.
This is what LLMs do with thought.
Inside any modern frontier model, concepts do not live as words. They live as positions in a very high-dimensional space, with a particular geometric structure. Goodfire's recent work, which is the clearest public demonstration of this, shows the shape directly. Colors form a different shape, more like a sphere. Spatial concepts curl into manifolds that match physical space. The concept of a car is a complicated multidimensional surface that connects, in geometrically meaningful ways, to the concepts of motion, of metal, of road, of journey. The model does not store these concepts as text. It stores them as geometry.
When you type a question to it, the model maps your words onto positions in this internal space. It then performs operations on the geometry, which produce new positions. Then, only at the very end, it translates those new positions back into English on the way out to your screen. The English is the Mercator projection. The geometry is the globe.
This sounds abstract until you realize what it implies for almost every interaction you have ever had with a model. Why does GPT sometimes give a brilliant answer in one phrasing and a mediocre one in another, even though both phrasings mean the same thing to a human reader? Because the two phrasings land on slightly different positions in the internal geometry, and the geometry near one position is richer than the geometry near the other. Why does a model sometimes confabulate confidently? Because the position it lands on has the geometric texture of an answer even though the answer it generates has no factual grounding. The shape of an answer and the truth of an answer are different things, and the model is trained on the shape.
Three implications follow from this and they reach much further than most of the discourse about AI suggests.
1. for product builders. If you have ever wondered why the same model produces wildly different outputs on prompts that seem semantically identical, the answer is geometric. The most reliable way to improve model output is not to tinker with the words. It is to find the regions of geometric space where the model behaves well, and engineer your prompts to land you there. The best prompt engineers, without knowing it, are reverse-engineering the topology of the model's internal world. This is also why fine-tuning works better than prompting for many use cases. Fine-tuning literally reshapes the geometry. Prompting only steers within it.
2. for the safety and interpretability community, which has spent two years looking for circuits and individual neurons that correspond to specific concepts. That work has been valuable, but it was looking at the shadow on the wall. The actual structure is at the manifold level, not the neuron level. The next leap in interpretability is going to come from learning to edit the geometry directly, not by adjusting individual weights. We are about to move from steering the words to steering the shapes that produce the words. This will make some kinds of safety work much easier and other kinds much harder.
3. for everyone else, and it is the strangest one. The early evidence from neuroscience suggests that human thought may have the same kind of geometric structure. The hippocampus appears to encode spatial relationships on manifolds that look uncannily similar to what we see inside language models. Concept representation in the human cortex appears to be geometric in roughly the same sense. If this holds up, and the evidence is still preliminary, then the conventional framing of the difference between artificial and biological minds is wrong in an interesting way. It is not silicon versus carbon. It is two different physical substrates that have independently discovered the same mathematical language for representing the world.
We built something that thinks the way we think. We just never noticed, because we were too busy listening to it talk.
The Mercator projection is wrong about what the world looks like. It is right about how to move through it. The model is wrong about what thought looks like, in some technical sense. It is right about how to do thought, which is the only thing that has ever mattered.

Goodfire@GoodfireAI

Neural networks might speak English, but they think in shapes. Understanding their rich *neural geometry* is key to understanding how they work – and to debugging and controlling them with precision. Starting today, we’re releasing a series of posts on this research agenda. 🧵

415

106.2K

mashrur haider@Mhr1036·8 May

@dadiomov Also, microtransactions to pay for per-use data access

153

Dimitri Dadiomov@dadiomov·7 May

I don't understand the premise that "agents must use stablecoins." Why can't an agent remember the 16 digits of a credit card? Sorry maybe I'm a payments n00b. Stablecoins have a lot of great use cases, but ecommerce shopping is pretty well optimized already for cards.

199

628

303K

mashrur haider@Mhr1036·7 May

@romanchernin @nebiusai cc: @elonmusk

Roman Chernin@romanchernin·7 May

@nebiusai could probably help SpaceX to become the cloud by partnering and providing the full platform to serve diversity of customers… not just rent out a full cluster to one customer :)

865

296.3K

mashrur haider@Mhr1036·6 May

From doing the work manually to deciding what work is safe enough to ship. That is where the leverage is.

mashrur haider@Mhr1036·6 May

People are wrong about AI replacing humans. Because the more powerful automation becomes, the more dangerous blind trust becomes. Nobody cares if AI writes a rough draft.

mashrur haider@Mhr1036·6 May

As generation gets cheaper, verification gets more valuable. We won’t need fewer humans in high-stakes systems. We’ll need more people who can review, override, approve, and take responsibility. AI moves humans up the stack.

mashrur haider@Mhr1036·6 May

But if AI deletes a production database, ships broken code, approves the wrong medical decision, or crashes a critical workflow, someone has to answer for it. It will be the person, team, or company that trusted it too much. This is what most AI debates miss.
