Ian Hailey

3.2K posts

@IanHailey

Learning

Cotswolds · Joined May 2014
2.3K Following · 176 Followers
0xSero
0xSero@0xSero·
Good news: a private donor has reached out to me personally and compute will be secured in the next 3 months. It’s been a crazy week but we made it happen.
0xSero@0xSero

x.com/i/article/2034…

English
84
51
1.6K
39.7K
Simon Willison
Simon Willison@simonw·
Turns out you can run enormous Mixture-of-Experts models on Mac hardware without fitting the whole model in RAM by streaming a subset of expert weights from SSD for each generated token - and people keep finding ways to run bigger models. Kimi K2.5 is 1T parameters, but only 32B are active, so it fits in 96GB.
seikixtc@seikixtc

I got a 1T-parameter model running locally on my MacBook Pro.
LLM: Kimi K2.5, 1,026,408,232,448 params (~1.026T)
Hardware: M2 Max MacBook Pro (2023) with 96GB unified memory
Running on MLX with a flash-style SSD streaming path + local patching. This is an experimental setup and I haven’t optimized speed yet, but it’s stable enough that I’ve started testing it in an autoresearch-style loop. #LocalAI #MLX #MoE

English
121
284
3.8K
326.5K
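A rough sketch of the memory arithmetic behind the post above. The parameter counts and the 96GB figure come from the posts; the 4-bit quantization width is an assumption (the posts do not state one), and the calculation ignores KV cache, activations, and OS overhead.

```python
# Weight footprint at a given quantization width.
# TOTAL_PARAMS, ACTIVE_PARAMS and RAM_BYTES are taken from the posts;
# the bit width passed in is an illustrative assumption.
TOTAL_PARAMS = 1_026_408_232_448   # Kimi K2.5 total parameters
ACTIVE_PARAMS = 32e9               # parameters active per token
RAM_BYTES = 96e9                   # 96GB unified memory (decimal GB assumed)


def weight_bytes(params, bits_per_weight):
    """Bytes needed to hold `params` weights at `bits_per_weight` bits each."""
    return params * bits_per_weight / 8


def fits_in_ram(params, bits_per_weight, ram_bytes=RAM_BYTES):
    """True if the weights alone fit in RAM (no KV cache or OS overhead)."""
    return weight_bytes(params, bits_per_weight) <= ram_bytes


if __name__ == "__main__":
    for label, params in (("total", TOTAL_PARAMS), ("active", ACTIVE_PARAMS)):
        gb = weight_bytes(params, 4) / 1e9
        print(f"{label}: {gb:.1f} GB at 4-bit, fits: {fits_in_ram(params, 4)}")
```

Under these assumptions the full model is several times larger than RAM while the per-token active set fits. Since the active experts can change from token to token, the remaining weights still have to be read from SSD on demand.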
Ian Hailey
Ian Hailey@IanHailey·
The new MiniMax M2.7 is now closed. I hope we can eventually get a Linux of LLMs.
English
0
0
0
29
Ian Hailey
Ian Hailey@IanHailey·
@_thomasip @0xSero Exactly this. NVIDIA needs to keep the open source models alive, especially if they can target their own HW optimisations.
English
0
0
3
74
Thomas Ip
Thomas Ip@_thomasip·
@0xSero sadge. Nvidia has to pick up the slack and become the next llama/qwen
English
1
0
12
1.4K
Simon Willison
Simon Willison@simonw·
Dan says he's got Qwen 3.5 397B-A17B - a 209GB on disk MoE model - running on an M3 Mac at ~5.7 tokens per second using only 5.5 GB of active memory (!) by quantizing and then streaming weights from SSD (at ~17GB/s), since MoE models only use a small subset of their weights for each token
Dan Woods@danveloper

x.com/i/article/2034…

English
99
186
2K
252.8K
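A back-of-envelope check of the figures quoted in the post above. All constants are taken from the post; the assumption that active weights are quantized like the model-wide average is mine, so treat the result as a rough bound rather than a description of Dan's actual setup.

```python
# Constants quoted in the post.
DISK_BYTES = 209e9        # model size on disk
TOTAL_PARAMS = 397e9      # total parameters (397B)
ACTIVE_PARAMS = 17e9      # active parameters per token (A17B)
SSD_BYTES_PER_S = 17e9    # quoted SSD read rate (~17GB/s)
TOKENS_PER_S = 5.7        # quoted generation speed


def bytes_per_param(disk_bytes, total_params):
    """Average on-disk bytes per parameter."""
    return disk_bytes / total_params


def active_bytes_per_token(disk_bytes, total_params, active_params):
    """Bytes of weights touched per token.

    Assumption: active weights use the same average quantization
    as the model as a whole.
    """
    return bytes_per_param(disk_bytes, total_params) * active_params


def max_streamed_bytes_per_token(ssd_bytes_per_s, tokens_per_s):
    """Upper bound on bytes the SSD can deliver per generated token."""
    return ssd_bytes_per_s / tokens_per_s


if __name__ == "__main__":
    bits = bytes_per_param(DISK_BYTES, TOTAL_PARAMS) * 8
    active = active_bytes_per_token(DISK_BYTES, TOTAL_PARAMS, ACTIVE_PARAMS)
    streamed = max_streamed_bytes_per_token(SSD_BYTES_PER_S, TOKENS_PER_S)
    print(f"average bits per weight: {bits:.2f}")
    print(f"active weights per token: {active / 1e9:.2f} GB")
    print(f"max SSD bytes per token:  {streamed / 1e9:.2f} GB")
```

Under these assumptions the SSD can supply only part of the active weights for each token at the quoted speed, which suggests the rest stays resident in memory or is cached between tokens.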
Mau “Rules without Rulers” Ledford
@gmoneyNFT It’s 🔥. Can even run Qwen3.5 122b 4bit @ 38 TPS unoptimized. Both 35b and 122b variants have full multimodal reasoning. Much rather have 512gb Mac Studio tho but waiting for the next gen. This is way better than my MacBook Air so I’ll take it.
English
2
0
2
508
Ian Hailey
Ian Hailey@IanHailey·
@jukan05 Exactly, this is good for Samsung, who don’t care if the mobile unit loses margin when that margin has gone to one of the other units.
English
0
0
1
79
Jukan
Jukan@jukan05·
Bruh, they’re saying Samsung’s Galaxy division may struggle to even achieve an operating margin in the 1% range. I now think that second- and third-tier brands like TCL and Xiaomi may not just face surging memory prices, but could actually struggle to launch products at all because the memory itself may not be available.
Jukan tweet media
Jukan@jukan05

[EXCLUSIVE] Samsung Electronics Extends 'Emergency Management' to Mobile Division: DX Business as a Whole Now on Crisis Footing
Samsung Electronics has officially declared an emergency management regime for its mobile phone division, following similar moves in its TV and home appliance businesses. This effectively puts the entire Device Experience (DX) division, excluding the semiconductor (DS) segment, on crisis footing.
Despite the Galaxy S26 series setting pre-order records and signaling a blockbuster launch, sources within the Mobile Experience (MX) division say that soaring semiconductor procurement costs driven by "chipflation" (the explosive surge in memory chip prices) have raised internal concerns about the possibility of the division posting its first-ever operating loss in the company's history.
According to multiple Samsung Electronics insiders on the 15th, the company declared an emergency management regime for the MX division at the end of February, following similar declarations last year for the VD division (TVs) and DA division (home appliances). The MX division had been the last pillar of DX profitability.
The primary driver cited is a severe deterioration in margins due to the explosion in memory semiconductor prices. Memory semiconductor prices have surged more than 850% over the past year, marking a historically unprecedented rally in the industry. The fact that Samsung quietly activated emergency management at the very moment it was holding the global Unpacked event for the Galaxy S26, its flagship product of the year, signals that the "chipflation" shock has far exceeded what the market had anticipated. Compounding the pressure, the outbreak of war in the Middle East has triggered a spike in oil prices, adding further logistics cost burdens.
A Samsung Electronics spokesperson stated: "With raw material costs under extreme pressure from rising semiconductor prices, and logistics costs increasing on top of that, we ultimately had no choice but to put the MX division under emergency management as well."
According to Samsung Electronics' 2025 annual business report, the company's raw material procurement costs (excluding Samsung Display) reached KRW 99.9475 trillion last year, up 8.8% (KRW 8.1077 trillion) from KRW 91.8398 trillion the prior year. The bulk of this increase was driven by rising cost burdens within the DX division.
As a result, forecasts suggest the MX division's operating profit this year could fall more than 60% from last year's KRW 12.9 trillion, to approximately KRW 5 trillion. Under the conservative scenario, the possibility of an operating loss has not been ruled out.
Market estimates from late January projected the MX division's operating margin (11% in Q1 of last year) to decline to the low-3% range in Q1 of this year and drop further into the 2% range from Q2 onward. However, internal voices are reportedly saying even 1% may be difficult to achieve. The DA and VD divisions recorded an operating loss of approximately KRW 200 billion last year and are expected to post a similar-sized deficit this year.
As part of its emergency management response, the DX division has instructed all business units to cut costs by 30%. Business travel policies have also been revised. Executives at the vice president level and below within the DX division will now be assigned economy class on flights of less than 10 hours (previously, business class had been provided), effective immediately, citing cost reduction.
Following the TV division's lead, the home appliance and MX divisions are widely expected to face workforce restructuring measures, including internal redeployment under the label of "job redesign" and voluntary separation programs, depending on how the earnings deterioration unfolds.

English
24
26
329
85.3K
Ian Hailey
Ian Hailey@IanHailey·
Plain Old Telephony is like the old HTTP days but, unlike HTTP, it can’t easily be patched with SSL.
English
0
0
0
10
Ian Hailey
Ian Hailey@IanHailey·
@ctnzr Wonder how the NVFP4 quant of this compares to Sehyo/Qwen3.5-122B-A10B-NVFP4.
English
0
0
0
89
Bryan Catanzaro
Bryan Catanzaro@ctnzr·
Announcing NVIDIA Nemotron 3 Super! 💚120B-12A Hybrid SSM Latent MoE, designed for Blackwell 💚36 on AAIndex v4 💚up to 2.2X faster than GPT-OSS-120B in FP4 💚Open data, open recipe, open weights Models, Tech report, etc. here: research.nvidia.com/labs/nemotron/… And yes, Ultra is coming!
Bryan Catanzaro tweet media
English
62
205
1.2K
203K
Ahmad
Ahmad@TheAhmadOsman·
INCREDIBLE Nemotron 3 Super 120B-A12B by NVIDIA is here Most important part to me? > NVFP4 = ~4x BF16 throughput > and ~7× the BF16 throughput of Qwen 3.5 120B-A5B > despite Qwen having less than half the active parameters per token The future is looking SO GOOOOD for NVFP4
Ahmad tweet media
English
32
23
331
28.5K
Ian Hailey
Ian Hailey@IanHailey·
@VadimYuryev I hope it proves able to beat the DGX Spark for a similar price on all LLM workloads.
English
0
0
0
62
Alex Cheema
Alex Cheema@alexocheema·
this is going viral on chinese social media right now. exo is being used by a school in china to deploy private ai agents locally. they repurposed m1 ultra macs from their film lab, clustered them together with @exolabs, and ingested their entire school corpus including curriculums, reports, handbooks and class schedules. with this, each student and teacher has a personalised ai agent that is free and private, grounded to real school data.
Alex Cheema tweet media
Bo Wang@BoWang87

Few people know how popular @openclaw is in China. Two scenes that went viral on Chinese social media this week: 👵 Thousands of elderly people lined up so Tencent engineers could help them install it. 🎓 A Beijing school deploying AI agents for every student. From grandparents to students. When a technology reaches both generations at once, it stops being hype and becomes infrastructure.

English
120
389
3.7K
626.8K
Ian Hailey
Ian Hailey@IanHailey·
@realmasroork @__tinygrad__ Shame this can’t easily be done, like finding 198 other people and making that an entity that invests $1m (or more) in the project.
English
2
0
1
68
the tiny corp
the tiny corp@__tinygrad__·
if tiny corp was raising $20M (@ $200M), who'd be interested? business model is basically this. buy this $11.5M building (with 5MW of power): link in our discord. wait for AMD to launch the RDNA5 96GB cards (mid 2027). preorder 3000 cards (hopefully we can negotiate for $2500 each), build 500 $20k tinyboxes with 6 of the cards. run all the chinese llms. make $600k / month revenue selling tokens on openrouter (market depth is there, this is 1% of openrouter). improvements to tinygrad yield revenue improvements. due to how power is priced in oregon it's only like $50k for the electric bill (below 4MW they price for peak, not usage, we get like 3c/kWh power). we can also make ~$100k / month leasing colo space to comma. building and cards paid off in 3 years max, investment made back. low risk of being undercut since we're using consumer GPUs and running the cheapest colo you can believe. if someone chill wants in, i'd do it. i'm not gonna hype fake tech, but demand for tokens is going to skyrocket (look at the openclaw install numbers). with crazy good optimizations we could potentially get 3x more from the machines, and we have electricity for 3x more machines. $5.4M revenue per month. then continue to scale from there, custom chips, etc...
English
164
71
2.3K
253.9K
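The arithmetic in the post above can be checked with a short sketch. Every constant is taken from the post; the simplifications are mine: capital cost counts only the building and the cards, and the only operating cost is the quoted electric bill (no staff, taxes, chassis parts, or financing).

```python
# Figures quoted in the post.
BUILDING_USD = 11.5e6
CARDS = 3000
CARD_PRICE_USD = 2500             # hoped-for negotiated price
BOXES = 500
CARDS_PER_BOX = 6
TOKEN_REVENUE_PER_MONTH = 600e3   # selling tokens on openrouter
COLO_REVENUE_PER_MONTH = 100e3    # leasing colo space
POWER_COST_PER_MONTH = 50e3       # quoted electric bill


def capex():
    """Building plus cards only (a simplifying assumption)."""
    return BUILDING_USD + CARDS * CARD_PRICE_USD


def payback_months():
    """Months to recover capex from revenue net of electricity."""
    net = (TOKEN_REVENUE_PER_MONTH + COLO_REVENUE_PER_MONTH
           - POWER_COST_PER_MONTH)
    return capex() / net


def scaled_token_revenue(opt_factor=3, machine_factor=3):
    """Token revenue with 3x optimization and 3x machines, as in the post."""
    return TOKEN_REVENUE_PER_MONTH * opt_factor * machine_factor


if __name__ == "__main__":
    print(f"cards needed: {BOXES * CARDS_PER_BOX}")
    print(f"capex: ${capex() / 1e6:.1f}M")
    print(f"payback: {payback_months():.1f} months")
    print(f"scaled revenue: ${scaled_token_revenue() / 1e6:.1f}M / month")
```

Under these simplifications the card count, the "3 years max" payback, and the $5.4M scaled revenue figure are all internally consistent with the other numbers in the post.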
Ian Hailey
Ian Hailey@IanHailey·
Can we have all the 2008 or 1986 F1 rounds rerun every race weekend this year as a backup?
English
0
0
0
46
Ian Hailey
Ian Hailey@IanHailey·
@HillF1 Perhaps they should listen more to the drivers next time.
English
0
0
2
68
Ian Hailey
Ian Hailey@IanHailey·
@blaiklockBP DRAM also way up although admittedly people are talking about that.
English
0
0
0
185
Catherine Blaiklock
Catherine Blaiklock@blaiklockBP·
On Saturday 28th February, AFTER the crisis had started, the price for 1,000 litres of heating oil was £676. I know because I bought some. The price is now £1371. No one is talking about central heating oil.
Catherine Blaiklock tweet media
English
193
554
1.8K
74.6K
Martin Lewis
Martin Lewis@MartinSLewis·
The home heating oil situation is terrible for many unlucky enough to be about to refill. The solutions of 'comparison' and 'collective buying' are weak at this time. The outrage is this is an unregulated, unprotected, market (we've long called for that to change). We are subserving many, especially those who live in rural communities.
s0@st3v3ns35

@MartinSLewis @BBCWatchdog Home oil order placed 3 weeks ago, pre-Middle East war, at 0.60p/ltr. Supplier cancels order today stating they can’t fulfil orders under 0.95 p/l AND they have no oil to supply or they will go out of business. Now on hook to pay £1.40 p/ltr #OilPrices

English
98
116
431
204K
Brandon Butch
Brandon Butch@BrandonButch·
Apple just announced the MacBook Neo!
- 13-inch display
- A18 Pro chip
- Multi-touch trackpad + Touch ID
- 2x USB-C ports + headphone jack
- Comes in Silver, Indigo, Blush, and Citrus
- 16 hours of battery life
- 1080p camera + dual mics
- Starts at $599
- Releases March 11
Brandon Butch tweet media
English
42
66
1.2K
84.6K