
Akshobya
It is unconventional, but it actually works, depending on the workload of course. There are strengths and weaknesses for sure. There are real deployments (governments, big companies) running this setup in production: pods of 4 MacBooks. It's the best price-to-performance for many workloads (e.g. transcription, low-batch LLM inference), and those customers landed on this hardware themselves as the best fit for their workloads. I can share more in private if you're interested (I don't want to turn this into a sales pitch for exo).

If Apple actually sold us an M5 Max / M5 Ultra Mac Studio, we'd use that. But we could be waiting until October for that (or longer; the supply-chain issues seem pretty bad). The MacBook has the same M5 Max chip a Mac Studio would, and it goes up to 128GB of unified memory. Each chip has 614GB/s of memory bandwidth (2.24x a DGX Spark).

I'd say the main downside (which we should make more clear) is the software ecosystem: it's still quite immature. It has gotten much better in the last year, e.g. clustering came a long way with low-latency RDMA in macOS 26.2.

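Why memory bandwidth is the headline number here: low-batch LLM decoding is memory-bandwidth-bound, so single-stream tokens/s is roughly usable bandwidth divided by bytes read per token (about the model's size in memory). A minimal sketch of that back-of-envelope calculation, using the 614GB/s and 2.24x figures from the post; the model size and efficiency factor are illustrative assumptions, not measurements:

```python
def decode_tokens_per_sec(bandwidth_gbs: float, model_gb: float,
                          efficiency: float = 0.7) -> float:
    """Rough upper bound on single-stream decode speed: each generated
    token requires streaming (roughly) all model weights from memory."""
    return bandwidth_gbs * efficiency / model_gb

# 614 GB/s per M5 Max chip (from the post); DGX Spark derived from the
# stated 2.24x ratio. Assumed: a ~40GB model (e.g. ~70B params at 4-bit)
# and 70% of peak bandwidth actually achieved.
m5_max = decode_tokens_per_sec(614, model_gb=40)
dgx_spark = decode_tokens_per_sec(614 / 2.24, model_gb=40)

print(f"M5 Max:    ~{m5_max:.0f} tok/s")
print(f"DGX Spark: ~{dgx_spark:.0f} tok/s")
print(f"ratio:     {m5_max / dgx_spark:.2f}x")
```

Because the workload is bandwidth-bound, the throughput ratio tracks the bandwidth ratio directly, which is why that 2.24x figure matters more than raw FLOPS for this use case.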
Unitree CEO Wang Xingxing and his robotics army of G1s. These are designed for mass production at a starting price of roughly US$16,000. This is just the beginning of an exponential era!