Ian Hailey

3.2K posts

@IanHailey

Learning

Cotswolds Joined May 2014

2.3K Following 176 Followers

Ian Hailey@IanHailey·5h

@alexocheema @kDasme @nvidia @Apple @GroqInc Some clever pipeline split between the two?

342

Alex Cheema@alexocheema·8h

@kDasme @nvidia @Apple Here’s a clue. @nvidia x @GroqInc

16K

Alex Cheema@alexocheema·9h

Big moment coming for local AI. @nvidia x @apple

210

159

3.9K

234.8K

Ian Hailey@IanHailey·6d

@0xSero You can be the Linus of base models!

0xSero@0xSero·6d

Good news a private donor has reached out to me personally and compute will be secured in the next 3 months. It’s been a crazy week but we made it happen.

0xSero@0xSero

x.com/i/article/2034…

1.6K

39.7K

Ian Hailey@IanHailey·6d

@simonw @dataeatworld It will be too slow to be of any practical use.

154

Simon Willison@simonw·6d

Turns out you can run enormous Mixture-of-Experts models on Mac hardware without fitting the whole model in RAM, by streaming a subset of expert weights from SSD for each generated token, and people keep finding ways to run bigger models. Kimi 2.5 is 1T, but only 32B active, so it fits in 96GB

seikixtc@seikixtc

I got a 1T-parameter model running locally on my MacBook Pro.
LLM: Kimi K2.5
1,026,408,232,448 params (~1.026T)
Hardware: M2 Max MacBook Pro (2023) w/ 96GB unified memory
Running on MLX with a flash-style SSD streaming path + local patching.
This is an experimental setup and I haven’t optimized speed yet, but it’s stable enough that I’ve started testing it in an autoresearch-style loop.
#LocalAI #MLX #MoE

121

284

3.8K

326.5K
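
The "1T total, 32B active, fits 96GB" claim in the thread above can be checked with rough arithmetic. The sketch below is only an estimate: the parameter counts and RAM size are the ones quoted in the posts, while the bit widths are assumptions (the posts do not say which quantization was used), and KV cache, activations and runtime overhead are ignored.

```python
# Back-of-envelope weight footprint for a Mixture-of-Experts model.
# Parameter counts and RAM size are quoted in the posts above;
# the bit widths tried below are assumptions, not reported settings.

TOTAL_PARAMS = 1_026_408_232_448   # Kimi K2.5 total parameters, as quoted
ACTIVE_PARAMS = 32e9               # "32B active", as quoted
RAM_GB = 96                        # M2 Max unified memory, as quoted


def weights_gb(params: float, bits_per_weight: float) -> float:
    """Size of `params` weights in decimal gigabytes at a given bit width."""
    return params * bits_per_weight / 8 / 1e9


if __name__ == "__main__":
    for bits in (16, 8, 4):  # hypothetical quantization levels
        total = weights_gb(TOTAL_PARAMS, bits)
        active = weights_gb(ACTIVE_PARAMS, bits)
        print(f"{bits:>2}-bit: full model {total:7.1f} GB "
              f"(fits in RAM: {total <= RAM_GB}), "
              f"active per token {active:5.1f} GB "
              f"(fits in RAM: {active <= RAM_GB})")
```

Under these assumptions the full model exceeds 96GB at every bit width tried, while the weights active for a single token fit at all of them, which is why streaming the remaining experts from SSD is needed at all.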

Ian Hailey@IanHailey·23 Mar

Every time I hear about this new Terrafab I imagine it’s a TerraHawks sequel youtu.be/vZC9B0YJ8W8

Ian Hailey@IanHailey·20 Mar

New MiniMax M2.7 now closed, I hope eventually we can get a Linux of LLMs.

Ian Hailey@IanHailey·20 Mar

@_thomasip @0xSero Exactly this. NVIDIA needs to keep open-source models alive, especially if it can target its own hardware optimisations.

Thomas Ip@_thomasip·20 Mar

@0xSero sadge. Nvidia has to pick up the slack and become the next llama/qwen

1.4K

0xSero@0xSero·20 Mar

You think I was joking? Say goodbye to open weight models, we frankly fucked it up for ourselves. x.com/0xSero/status/…

sudo rm -rf@itsjustmarky

@0xSero @MiniMax_AI @Scobleizer

349

38.3K

Ian Hailey@IanHailey·19 Mar

@simonw @DanielleFong Unless the answer you’re expecting is 42 this is going to be painfully slow.

441

Simon Willison@simonw·18 Mar

Dan says he's got Qwen 3.5 397B-A17B - a 209GB on disk MoE model - running on an M3 Mac at ~5.7 tokens per second using only 5.5 GB of active memory (!) by quantizing and then streaming weights from SSD (at ~17GB/s), since MoE models only use a small subset of their weights for each token

Dan Woods@danveloper

x.com/i/article/2034…

186

252.8K
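
The figures in the thread above (209GB on disk, 397B total and 17B active parameters, ~17GB/s SSD, ~5.7 tokens per second) allow a rough consistency check. This is a sketch under two loud assumptions that the posts do not confirm: the quantization is treated as uniform across all weights, and the SSD is treated as the only bottleneck.

```python
# Rough check of the SSD-streaming numbers quoted above.
# ASSUMPTIONS (not from the source): uniform bits per weight across
# the whole model, and SSD bandwidth as the only limit on token rate.

DISK_GB = 209          # model size on disk, as quoted
TOTAL_PARAMS = 397e9   # total parameters, as quoted
ACTIVE_PARAMS = 17e9   # active parameters per token, as quoted
SSD_GB_PER_S = 17      # SSD read bandwidth, as quoted
TOKENS_PER_S = 5.7     # reported generation speed


def avg_bits_per_weight(disk_gb: float, total_params: float) -> float:
    """Average bits stored per parameter, given the on-disk size."""
    return disk_gb * 1e9 * 8 / total_params


def active_gb_per_token(active_params: float, bits: float) -> float:
    """Gigabytes of weights touched per token at `bits` per weight."""
    return active_params * bits / 8 / 1e9


def ssd_budget_gb_per_token(ssd_gb_per_s: float, tokens_per_s: float) -> float:
    """Most data the SSD can deliver per token at the reported speed."""
    return ssd_gb_per_s / tokens_per_s


if __name__ == "__main__":
    bits = avg_bits_per_weight(DISK_GB, TOTAL_PARAMS)
    touched = active_gb_per_token(ACTIVE_PARAMS, bits)
    budget = ssd_budget_gb_per_token(SSD_GB_PER_S, TOKENS_PER_S)
    print(f"average bits per weight: {bits:.2f}")
    print(f"weights touched per token: {touched:.2f} GB")
    print(f"SSD budget per token: {budget:.2f} GB")
    print(f"share that could come from SSD: {budget / touched:.0%}")
```

Under these assumptions the SSD could supply only about a third of the weights touched per token, so the rest would have to be resident or cached in memory (for example, weights shared across all tokens). The posts do not describe the actual split.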

Ian Hailey@IanHailey·18 Mar

@krunkosaurus @gmoneyNFT This is quite good, faster than a single DGX Spark.

Mau “Rules without Rulers” Ledford@krunkosaurus·18 Mar

@gmoneyNFT It’s 🔥. Can even run Qwen3.5 122b 4bit @ 38 TPS unoptimized. Both 35b and 122b variants are with full multimodal reasoning. Much rather have 512gb Mac Studio tho but waiting for the next gen. This is way better than my Macbook Air so i’ll take it.

508

Mau “Rules without Rulers” Ledford@krunkosaurus·17 Mar

Apple M5 Max Macbook Pro with 128GB ram arrived. First unoptimized chat with Qwen3.5 is 108 TPS! Local AI is here.

197

162

403.8K

Ian Hailey@IanHailey·16 Mar

@jukan05 Exactly, this is good for Samsung, which doesn’t care if the mobile unit loses margin when that margin has gone to one of its other units.

Jukan@jukan05·15 Mar

Bruh, they’re saying Samsung’s Galaxy division may struggle to even achieve an operating margin in the 1% range. I now think that second- and third-tier brands like TCL and Xiaomi may not just face surging memory prices, but could actually struggle to launch products at all because the memory itself may not be available.

Jukan@jukan05

[EXCLUSIVE] Samsung Electronics Extends 'Emergency Management' to Mobile Division — DX Business as a Whole Now on Crisis Footing
Samsung Electronics has officially declared an emergency management regime for its mobile phone division, following similar moves in its TV and home appliance businesses. This effectively puts the entire Device Experience (DX) division — excluding the semiconductor (DS) segment — on crisis footing.
Despite the Galaxy S26 series setting pre-order records and signaling a blockbuster launch, sources within the Mobile Experience (MX) division say that soaring semiconductor procurement costs driven by "chipflation" — the explosive surge in memory chip prices — have raised internal concerns about the possibility of the division posting its first-ever operating loss in the company's history.
According to multiple Samsung Electronics insiders on the 15th, the company declared an emergency management regime for the MX division at the end of February — following similar declarations last year for the VD division (TVs) and DA division (home appliances). The MX division had been the last pillar of DX profitability.
The primary driver cited is a severe deterioration in margins due to the explosion in memory semiconductor prices. Memory semiconductor prices have surged more than 850% over the past year, marking a historically unprecedented rally in the industry. The fact that Samsung quietly activated emergency management at the very moment it was holding the global Unpacked event for the Galaxy S26 — its flagship product of the year — signals that the "chipflation" shock has far exceeded what the market had anticipated. Compounding the pressure, the outbreak of war in the Middle East has triggered a spike in oil prices, adding further logistics cost burdens.
A Samsung Electronics spokesperson stated: "With raw material costs under extreme pressure from rising semiconductor prices, and logistics costs increasing on top of that, we ultimately had no choice but to put the MX division under emergency management as well."
According to Samsung Electronics' 2025 annual business report, the company's raw material procurement costs (excluding Samsung Display) reached KRW 99.9475 trillion last year, up 8.8% (KRW 8.0177 trillion) from KRW 91.8398 trillion the prior year. The bulk of this increase was driven by rising cost burdens within the DX division.
As a result, forecasts suggest the MX division's operating profit this year could fall more than 60% from last year's KRW 12.9 trillion, to approximately KRW 5 trillion. Under the conservative scenario, the possibility of an operating loss has not been ruled out. Market estimates from late January projected the MX division's operating margin — 11% in Q1 of last year — to decline to the low-3% range in Q1 of this year and drop further into the 2% range from Q2 onward. However, internal voices are reportedly saying even 1% may be difficult to achieve. The DA and VD divisions recorded an operating loss of approximately KRW 200 billion last year and are expected to post a similar-sized deficit this year.
As part of its emergency management response, the DX division has instructed all business units to cut costs by 30%. Business travel policies have also been revised. Executives at the vice president level and below within the DX division will now be assigned economy class on flights of less than 10 hours — previously, business class had been provided — effective immediately, citing cost reduction. Following the TV division's lead, the home appliance and MX divisions are widely expected to face workforce restructuring measures, including internal redeployment under the label of "job redesign" and voluntary separation programs, depending on how the earnings deterioration unfolds.

329

85.3K
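
The percentages in the quoted report can be recomputed from its own KRW figures. The sketch below uses only numbers that appear in the article. One caveat: the difference between the two procurement totals works out to about KRW 8.1077 trillion, not the 8.0177 trillion printed in the quote. That may be a transposed digit, but this is an inference and not something the source states.

```python
# Recompute the changes quoted in the Samsung report (all values in KRW trillion).
# Every input below is copied from the article text above.

PROCUREMENT_2025 = 99.9475   # raw material procurement cost, last year
PROCUREMENT_2024 = 91.8398   # raw material procurement cost, prior year
MX_PROFIT_LAST_YEAR = 12.9   # MX division operating profit, last year
MX_PROFIT_FORECAST = 5.0     # "approximately KRW 5 trillion" forecast


def change(new: float, old: float) -> tuple[float, float]:
    """Return (absolute change, percentage change) from old to new."""
    diff = new - old
    return diff, diff / old * 100


if __name__ == "__main__":
    cost_diff, cost_pct = change(PROCUREMENT_2025, PROCUREMENT_2024)
    profit_diff, profit_pct = change(MX_PROFIT_FORECAST, MX_PROFIT_LAST_YEAR)
    print(f"procurement cost change: {cost_diff:+.4f} trillion ({cost_pct:+.1f}%)")
    print(f"MX operating profit change: {profit_diff:+.1f} trillion ({profit_pct:+.1f}%)")
```

The recomputed cost increase rounds to the article's 8.8%, and the forecast profit decline is a little over 60%, matching the "more than 60%" wording.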

Ian Hailey@IanHailey·13 Mar

Plain Old Telephony is like the old HTTP days but unlike HTTP can’t easily be patched with SSL.

Ian Hailey@IanHailey·12 Mar

@ctnzr Wonder how the NVFP4 quant of this compares to Sehyo/Qwen3.5-122B-A10B-NVFP4

Bryan Catanzaro@ctnzr·11 Mar

Announcing NVIDIA Nemotron 3 Super!
💚 120B-12A Hybrid SSM Latent MoE, designed for Blackwell
💚 36 on AAIndex v4
💚 up to 2.2X faster than GPT-OSS-120B in FP4
💚 Open data, open recipe, open weights
Models, Tech report, etc. here: research.nvidia.com/labs/nemotron/…
And yes, Ultra is coming!

205

1.2K

203K

Ian Hailey@IanHailey·11 Mar

@jxwalker @TheAhmadOsman The NVFP4 is 80.4GB

James Walker 🇬🇧🇺🇦@jxwalker·11 Mar

@TheAhmadOsman NVFP4 ~60GB so will run on my DGX Spark I hope!

852

Ahmad@TheAhmadOsman·11 Mar

INCREDIBLE Nemotron 3 Super 120B-A12B by NVIDIA is here
Most important part to me?
> NVFP4 = ~4x BF16 throughput
> and ~7× the BF16 throughput of Qwen 3.5 120B-A5B
> despite Qwen having less than half the active parameters per token
The future is looking SO GOOOOD for NVFP4

331

28.5K
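
The gap between the "~60GB" guess and the reported 80.4GB in this thread can be explored with a small estimate. The sketch assumes NVFP4 stores 4-bit values plus one 8-bit scale per 16-element block (4.5 bits per weight on average) and treats the parameter count as a nominal 120B. Both are assumptions of this example, and which tensors in the actual checkpoint stay at higher precision is not stated in the posts.

```python
# Estimate checkpoint size for a nominal 120B-parameter model.
# ASSUMPTIONS: NVFP4 = 4-bit values + one 8-bit scale per 16-value block;
# parameter count taken as a round 120e9.

PARAMS = 120e9          # nominal parameter count (assumption)
REPORTED_GB = 80.4      # NVFP4 checkpoint size reported in the thread


def nvfp4_bits_per_weight(block_size: int = 16, scale_bits: int = 8) -> float:
    """Average bits per weight including the per-block scale factor."""
    return 4 + scale_bits / block_size


def size_gb(params: float, bits_per_weight: float) -> float:
    """Weight storage in decimal gigabytes."""
    return params * bits_per_weight / 8 / 1e9


def implied_bits(gb: float, params: float) -> float:
    """Average bits per weight implied by a checkpoint size."""
    return gb * 1e9 * 8 / params


if __name__ == "__main__":
    print(f"pure 4-bit:        {size_gb(PARAMS, 4):.1f} GB")
    print(f"NVFP4 with scales: {size_gb(PARAMS, nvfp4_bits_per_weight()):.1f} GB")
    print(f"reported size implies {implied_bits(REPORTED_GB, PARAMS):.2f} bits per weight")
```

Pure 4-bit storage gives the ~60GB figure, and block scales alone raise it to about 67.5GB. The reported 80.4GB implies a higher average, which would be consistent with some layers being kept at higher precision, though that is a guess.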

Ian Hailey@IanHailey·10 Mar

@VadimYuryev I hope it proves able to beat the DGX Spark for a similar price on all LLM workloads.

Vadim Yuryev@VadimYuryev·10 Mar

M5 Ultra will shake the AI community. Been saying this for a while, but it's only becoming clear now that we can see M5 Max performance.

Ronald Mannak@ronaldmannak

MLX benchmarks are in and I did not expect these results. The M5 Max blows the M3 Ultra out of the water, despite the M3 Ultra having more GPU cores and higher memory bandwidth. Compute-bound prefill is much faster (up to 2x) thanks to the new M5 Neural Accelerators, but memory-bound decoding is also faster, so long as you use MoE models instead of dense models. The M5 Ultra will be a beast. Can’t wait to see those numbers

266

31.3K

Ian Hailey@IanHailey·9 Mar

@alexocheema @exolabs Sorry film guys, we need your compute.

Alex Cheema@alexocheema·8 Mar

this is going viral on chinese social media right now. exo is being used by a school in china to deploy private ai agents locally. they repurposed m1 ultra macs from their film lab, clustered them together with @exolabs, and ingested their entire school corpus including curriculums, reports, handbooks and class schedules. with this, each student and teacher has a personalised ai agent that is free and private, grounded to real school data.

Bo Wang@BoWang87

Few people know how popular @openclaw is in China. Two scenes that went viral on Chinese social media this week: 👵 Thousands of elderly people lined up so Tencent engineers could help them install it. 🎓 A Beijing school deploying AI agents for every student. From grandparents to students. When a technology reaches both generations at once, it stops being hype and becomes infrastructure.

120

389

3.7K

626.8K

Ian Hailey@IanHailey·7 Mar

@realmasroork @__tinygrad__ Shame this can’t easily be done, like finding 198 other people and making that an entity that invests $1m (or more) into the project.

Masroor@realmasroork·7 Mar

@__tinygrad__ Can I invest 5,000?

718

the tiny corp@__tinygrad__·7 Mar

if tiny corp was raising $20M (@ $200M), who'd be interested? business model is basically this.
buy this $11.5M building (with 5MW of power): link in our discord
wait for AMD to launch the RDNA5 96GB cards (mid 2027). preorder 3000 cards (hopefully we can negotiate for $2500 each), build 500 $20k tinyboxes with 6 of the card. run all the chinese llms.
make $600k / month revenue selling tokens on openrouter (market depth is there, this is 1% of openrouter). improvements to tinygrad yield revenue improvements.
due to how power is priced in oregon it's only like $50k for the electric bill (below 4MW they price for peak, not usage, we get like 3c kWh power). we can also make ~$100k / month leasing colo space to comma.
building and cards paid off in 3 years max, investment made back. low risk of being undercut since we're using consumer GPUs and running the cheapest colo you can believe.
if someone chill wants in, i'd do it. i'm not gonna hype fake tech, but demand for tokens is going to skyrocket (look at the openclaw install numbers). with crazy good optimizations we could potentially get 3x more from the machines, and we have electricity for 3x more machines. $5.4M revenue per month. then continue to scale from there, custom chips, etc...

164

2.3K

253.9K
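
The payback claim in the post above ("building and cards paid off in 3 years max") can be worked through from its own numbers. This is a simplified sketch: it assumes the $50k electric bill is monthly, counts only the building and the GPUs as capital cost (as the post does), and ignores the rest of the tinybox hardware, staff, taxes and financing.

```python
# Payback arithmetic for the plan described in the post above.
# ASSUMPTIONS: electric bill is per month; capital cost is building + cards only.

BUILDING_USD = 11_500_000
CARDS = 3_000
CARD_PRICE_USD = 2_500             # hoped-for negotiated price, per the post
BOXES = 500
CARDS_PER_BOX = 6
TOKEN_REVENUE_PER_MONTH = 600_000
COLO_REVENUE_PER_MONTH = 100_000
ELECTRIC_PER_MONTH = 50_000        # assumed to be a monthly figure


def capex_usd() -> int:
    """Building plus GPU purchase."""
    return BUILDING_USD + CARDS * CARD_PRICE_USD


def net_monthly_usd() -> int:
    """Token and colo revenue minus electricity."""
    return TOKEN_REVENUE_PER_MONTH + COLO_REVENUE_PER_MONTH - ELECTRIC_PER_MONTH


def payback_months() -> float:
    """Months of net income needed to cover building and cards."""
    return capex_usd() / net_monthly_usd()


def upside_revenue_usd(optimization: int = 3, machines: int = 3) -> int:
    """Monthly token revenue if both multipliers from the post apply."""
    return TOKEN_REVENUE_PER_MONTH * optimization * machines


if __name__ == "__main__":
    assert BOXES * CARDS_PER_BOX == CARDS  # 500 boxes x 6 cards = 3000 cards
    print(f"capex: ${capex_usd():,}")
    print(f"net per month: ${net_monthly_usd():,}")
    print(f"payback: {payback_months():.1f} months")
    print(f"upside revenue: ${upside_revenue_usd():,} per month")
```

Under these assumptions the payback comes in under the stated three years, and the 3x optimization times 3x machines case reproduces the $5.4M per month figure.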

Ian Hailey@IanHailey·7 Mar

Can we have all the 2008 or 1986 F1 rounds rerun every race weekend this year as a backup?

Ian Hailey@IanHailey·7 Mar

@HillF1 Perhaps they should listen more to the drivers next time.

Damon Hill@HillF1·7 Mar

Oh dear 😬

The Race@wearetherace

Two champions, two pretty strong opinions on the new F1 regulations... 😬

280

202

3.8K

185K

Ian Hailey@IanHailey·6 Mar

@blaiklockBP DRAM also way up although admittedly people are talking about that.

185

Catherine Blaiklock@blaiklockBP·6 Mar

On Saturday 28th February, AFTER the crisis had started, the price for 1000 litres of heating oil was £676. I know because I bought some. The price is now £1371. No one is talking about central heating oil.

193

554

1.8K

74.6K

Ian Hailey@IanHailey·6 Mar

@MartinSLewis It’s market priced like most things.

266

Martin Lewis@MartinSLewis·6 Mar

The home heating oil situation is terrible for many unlucky enough to be about to refill. The solutions of 'comparison' and 'collective buying' are weak at this time. The outrage is this is an unregulated, unprotected, market (we've long called for that to change). We are subserving many, especially those who live in rural communities.

s0@st3v3ns35

@MartinSLewis @BBCWatchdog Home oil order plcd 3 weeks ago- pre-middle east war at 0.60p/ltr. Supplier cancels order today stating they can’t fulfil orders under 0.95 p/l AND they have no oil to supply or they will go out of business. Now on hook to pay £1.40 p/ltr #OilPrices

116

431

204K

Ian Hailey@IanHailey·4 Mar

@BrandonButch No Touch ID base is silly.

Brandon Butch@BrandonButch·4 Mar

Apple just announced the MacBook Neo!
- 13-inch display
- A18 Pro chip
- Multi-touch trackpad + Touch ID
- 2x USB-C ports + headphone jack
- Comes in Silver, Indigo, Blush, and Citrus
- 16 hours of battery life
- 1080p camera + dual mics
- Starts at $599
- Releases March 11

1.2K

84.6K
