Parallax

87 posts

@tryParallax

build your own ai cluster. run open models across your machines.

Joined December 2025
38 Following · 1.2K Followers
Parallax@tryParallax·
@VitalikButerin buy a GPU, get together a group of friends. don’t carry the world on your own shoulders. we’ve been building this for a while. try parallax for local ai.
Parallax@tryParallax·
@RoundtableSpace 35b model on a macbook with compressed cache is a solid result. local inference keeps getting more accessible and it's fun to watch people push the limits of what consumer hardware can do!
0xMarioNawfal@RoundtableSpace·
A solo dev rebuilt Google’s new algorithm with Claude in 7 days, made it 3.7x faster, and got a 35B model running on a MacBook with 4.6x compressed cache. Google published the paper. He shipped the code.
Parallax@tryParallax·
@adrgrondin @PrismML 1-bit model running at 40 tok/s on an iphone. mlx is making on-device inference surprisingly usable now.
Adrien Grondin@adrgrondin·
Demo of 1-bit Bonsai 8B from @PrismML running on-device on iPhone 17 Pro. More than 40 tok/s for a dense 8B model on iPhone, that's a first. Powered by Apple MLX and available now in Locally AI
Parallax@tryParallax·
@ollama local llm + mlx is a great combo! apple silicon keeps getting better for local inference and it's nice to see more players in the ecosystem lean into it properly.
ollama@ollama·
Ollama is now updated to run the fastest on Apple silicon, powered by MLX, Apple's machine learning framework. This change unlocks much faster performance to accelerate demanding work on macOS:
- Personal assistants like OpenClaw
- Coding agents like Claude Code, OpenCode, or Codex
Parallax@tryParallax·
@tom_doerr single binary, self-hosted, no dependencies. this is the way local ai should ship. less config, more building.
Parallax@tryParallax·
@karaage0703 the gap in local performance from 9b to 27b is striking. qwen3.5 is one of the best models to self-host right now, especially if you're sharding it across different devices.
からあげ@karaage0703·
For my use case, running on a DGX Spark, Qwen3.5 27B is overwhelmingly better than 9B. Interesting how much the experience differs depending on use case and environment. > "Qwen3.5 27B lost to 9B: the RTX 4060 paradox" | ぷらずもん zenn.dev/plasmon/articl… #zenn
Parallax@tryParallax·
TurboQuant tackles one bottleneck: KV cache memory. there's another one that matters just as much in distributed setups: communication latency between nodes. we built Decentralized Speculative Decoding (DSD) to turn that idle network wait time into useful computation, 2.56x speedup on HumanEval, no retraining needed. combine cache compression with latency compression and local inference starts looking very different. arxiv.org/abs/2511.11733
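The speculative-decoding idea that DSD extends to the distributed setting can be sketched in a few lines. The `draft_model`/`target_model` below are toy deterministic stand-ins, not Parallax code: cheap local guesses get verified against the target, and in a real system the verify phase is one batched pass, so every round trip yields between 1 and k tokens instead of exactly one.

```python
VOCAB = list("abcde")

def draft_model(context):
    # Cheap local "model": deterministic toy guess for the next token.
    return VOCAB[sum(map(ord, context)) % len(VOCAB)]

def target_model(context):
    # Expensive "remote" model: the output we actually want to match.
    return VOCAB[(sum(map(ord, context)) + len(context)) % len(VOCAB)]

def speculative_step(context, k=4):
    """Draft k tokens locally, then verify them against the target."""
    # 1. Draft phase: k cheap sequential guesses.
    drafts, ctx = [], context
    for _ in range(k):
        tok = draft_model(ctx)
        drafts.append(tok)
        ctx += tok

    # 2. Verify phase: accept the matching prefix, correct the first miss.
    accepted, ctx = [], context
    for tok in drafts:
        expected = target_model(ctx)
        if tok == expected:
            accepted.append(tok)
            ctx += tok
        else:
            accepted.append(expected)  # take the target's token and stop
            break
    return accepted

tokens = speculative_step("seed", k=4)
assert 1 <= len(tokens) <= 4
```

The key property: acceptance never changes the output distribution relative to the target model alone, it only shifts when the expensive calls happen, which is what makes network latency hideable.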
Shay Boloor@StockSavvyShay·
$GOOGL just released TurboQuant, a new compression method that can cut LLM cache memory by at least 6x and deliver ~8x speedups without sacrificing quality. This could make local AI inference far more capable, with larger context windows and less memory strain across devices.
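A generic symmetric-quantization sketch shows where cache-compression savings like these come from. This is illustrative int4-style rounding on made-up numbers, not TurboQuant's actual method:

```python
def quantize_sym(values, bits=4):
    """Symmetric uniform quantization with one float scale per block.

    Maps floats to signed ints in [-(2**(bits-1) - 1), 2**(bits-1) - 1].
    """
    qmax = 2 ** (bits - 1) - 1                       # 7 for 4-bit
    scale = max(abs(v) for v in values) / qmax or 1.0
    return [round(v / scale) for v in values], scale

def dequantize(quants, scale):
    return [q * scale for q in quants]

# A toy "KV cache row" of 8 activations (made-up numbers).
row = [0.12, -0.55, 0.31, 0.02, -0.91, 0.44, -0.08, 0.67]
quants, scale = quantize_sym(row, bits=4)
approx = dequantize(quants, scale)

# Storing 4-bit ints instead of 16-bit floats is a 4x reduction per
# element (plus one amortized scale per block); the price is a bounded
# rounding error of at most half a quantization step.
max_err = max(abs(a - b) for a, b in zip(row, approx))
assert max_err <= scale / 2 + 1e-9
```

Real methods add per-channel scales, outlier handling, and vector codebooks on top, but the memory arithmetic works the same way.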
Parallax@tryParallax·
hf-mount solves the storage side: any model, mounted locally like a drive. the next piece is actually running those models across whatever hardware you have. that's what parallax does: schedule inference across a pool of heterogeneous GPUs so the model doesn't just live on your machine, it runs there too. mount + serve, fully local.
clem 🤗@ClementDelangue·
Local AI is free, fast & secure! So today we're introducing hf-mount: attach any storage bucket, model or dataset from @huggingface as a local filesystem. This is a game changer, as it allows you to attach remote storage that is 100x bigger than your local machine's disk. This is also perfect for Agentic storage!! Let's go!
Parallax@tryParallax·
@oprydai you don't need to go into debt though. a couple of mac minis or an nvidia card can already run serious models locally. parallax lets you connect whatever hardware you have into one cluster. start small, add devices as you go. the whole point is using what's already on your desk.
Mustafa@oprydai·
get into debt if you must, but build a hardware home lab.
Parallax@tryParallax·
@openclaw solid release. deepseek provider plugin + qwen pay-as-you-go opens up a lot of new local setups. parallax users running openclaw stacks should have a smoother time with this one.
OpenClaw🦞@openclaw·
OpenClaw 2026.3.23 🦞
🧪 DeepSeek provider plugin
☁️ Qwen pay-as-you-go
♻️ OpenRouter auto pricing + Anthropic thinking order
🖥️ Chrome MCP waits for tabs
🔧 Discord/Slack/Matrix + Web UI fixes
Upgrade before your agent does it for you. github.com/openclaw/openc…
Parallax@tryParallax·
@wolfejosh the ceiling for on-device keeps moving. a year ago people argued you couldn't run anything useful locally. now it's 400B on a phone. parallax already supports mixed hardware clusters — apple silicon, nvidia, whatever you've got. the trend is clear.
Parallax@tryParallax·
the $3,469 single-night burn is a good reminder of what you're actually signing up for with cloud inference. when the meter's always running, one stuck agent is a bill. parallax runs models on your own machines. no token meter, no overnight surprises.
Ziwen@ziwenxu_

x.com/i/article/2034…

Junyang Lin@JustinLin610·
local agents will definitely become more important in the coming days and months as agents become part of our life and work. privacy always matters
Zach Mueller@TheZachMueller

PinchBench results for Qwen3.5 27B using @UnslothAI K_XL quants, best of 3, thinking enabled. TL;DR: Q3 KXL (14.5GB) or Q4 KXL (18GB). While the "best" results overall showed little degradation, if you dig into mean/std, Q4_K_XL was the best overall at ~84% on average. Q3 seems viable, while Q2 is, of course, the lowest performing.

Parallax@tryParallax·
local ai has picked up fast since openclaw dropped. with the latest wave of small capable models, more people are running serious workloads on their own hardware. if you missed this good local ai tutorial from @yacinelearning or want a refresher on how distributed scheduling actually works under the hood, it's worth the rewatch over the weekend!
Yacine Mahdid@yacinelearning

I am continuing my adventure into distributed AI systems with the parallax scheduling strategy from @Gradient_HQ. In this 37-min tutorial I go through:
- the heuristic used to make scheduling tractable
- the dynamic programming formulation
- filling GPUs with water
- shoving them into shelves
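The "shelves" step in the tutorial is essentially memory-aware placement of contiguous layer ranges. A toy sketch of that idea, not Gradient's actual scheduler:

```python
def shard_layers(layer_mem_gb, gpu_mem_gb):
    """Assign a contiguous run of layers to each GPU, in order.

    Pipeline-parallel inference needs contiguous layer ranges, so rather
    than classic bin packing we walk the layers once and move to the
    next GPU when the current one runs out of memory.
    """
    plan = [[] for _ in gpu_mem_gb]
    gpu, used = 0, 0.0
    for i, mem in enumerate(layer_mem_gb):
        while gpu < len(gpu_mem_gb) and used + mem > gpu_mem_gb[gpu]:
            gpu, used = gpu + 1, 0.0       # current GPU is full; advance
        if gpu == len(gpu_mem_gb):
            raise ValueError("model does not fit on this cluster")
        plan[gpu].append(i)
        used += mem
    return plan

# 8 transformer layers of 3 GB each on one 16 GB card and two 8 GB cards.
plan = shard_layers([3.0] * 8, [16.0, 8.0, 8.0])
assert plan == [[0, 1, 2, 3, 4], [5, 6], [7]]
```

A production scheduler also weighs compute throughput and link bandwidth between devices, which is where the dynamic-programming formulation in the tutorial comes in.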

Parallax@tryParallax·
@tomosman @openclaw @NousResearch mac minis are underrated for this. we've been running multi-node setups on apple silicon with parallax and the performance-per-dollar is hard to beat. nice to see more people building this way.
Tom Osman 🐦‍⬛@tomosman·
Infinitely bullish on a stack of MacMinis or Studios at home running @openclaw or @NousResearch Hermes. Run local models and soon you will have AGI at home. Lots of other epic stuff too but feels downstream of being able to do this.
Parallax@tryParallax·
@cyb3rops the "local" label is doing a lot of heavy lifting for some of these apps. if your data still round-trips to someone else's server, it's not really local. with parallax, your inference actually stays on your machines. your devices, your models, even offline.
Florian Roth ⚡️@cyb3rops·
Can anyone explain this to me? First Claude Workspace, then Perplexity, now Manus - they keep using words like “my”, “personal”, and “local” in a way that suggests local information isn’t being sent to a remote LLM or RAG system for evaluation. But if no local LLM is actually running, then almost nothing except maybe config stays local. The reasoning still happens remotely. Right? Also - does anyone really think this belongs on a corporate workstation?
Manus@ManusAI

Today, we're taking Manus out of the cloud and putting it on your desktop. Introducing My Computer, the core feature of the new Manus Desktop app. It’s your AI agent, now on your local machine.

Parallax@tryParallax·
some parallax dev lunch break fun:
- a macbook pro, a mac mini, some cables
- zero internet, zero cost
- openclaw running on parallax
no subs. no token burn. nothing leaves the desk. just local agents vibing.
Alex Finn@AlexFinn·
If you have your OpenClaw working 24/7 using frontier models like Opus, you're easily burning $300 a day. That's $100,000 a year.

I have 3 Mac Studios and a DGX Spark running 4 high-end local models (Nemotron 3, Qwen 3.5, Kimi K2.5, MiniMax2.5). They're chugging 24/7/365. I spent a third of that yearly cost to buy these computers, and I'll be able to use them for years for free.

On top of that, they're completely private, secure, and personalized. Not a single prompt goes to a cloud server where it can be read by an employee or used to train another model.

I hope this makes it painfully obvious why local is the future for AI agents. And why America needs to enter the local AI race.
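The back-of-envelope arithmetic in this post is easy to check. The dollar figures below are the post's own claims, not measured costs:

```python
# Dollar figures are the quoted post's claims, not measurements.
cloud_per_day = 300                     # quoted frontier-API burn, $/day
cloud_per_year = cloud_per_day * 365    # "That's $100,000 a year"
hardware_cost = cloud_per_year // 3     # "a third of that yearly cost"

assert cloud_per_year == 109_500        # rounds to the quoted ~$100k
assert hardware_cost == 36_500
```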
Parallax@tryParallax·
messari's new report on echo-2 highlights how parallax powers the rollout plane. consumer RTX 5090s served as distributed rollout actors via parallax, feeding a centralized learner cluster. and we got 33-36% lower hardware costs with no quality loss. this is parallax doing what it was built for. turning consumer hardware into production AI infrastructure.
Youssef@0xYoussef_

Beyond improvements in speed and cost, Echo-2 demonstrates a high standard of model performance. Benchmarking data across five math reasoning tasks shows Echo-2 achieving an average score of 35.75, compared to 35.30 for ByteDance’s verl. These results confirm that the architectural efficiencies of the Open Intelligence Stack (OIS) do not come at the expense of reasoning capabilities.

Parallax@tryParallax·
@farukomerekinci @perplexity_ai this is the right framing. but owning the stack means owning the inference too. openclaw + cloud api = local agent, cloud brain. openclaw + parallax = local agent, local brain. that's the difference between "your data stays on your machine" as marketing vs as reality.
Faruk Ekinci@farukomerekinci·
Here's the difference with OpenClaw:

Perplexity: Their AI, their servers, your data through their pipeline. One model. One product. Take it or leave it.

OpenClaw: Open source. Runs any model: Claude, Grok, Kimi, whatever you want. Your data never leaves your machine. You build the agents, you set the rules, you own the stack.

What's now on the table, with a $1B company validating the category:
- AI that checks your email before you wake up
- Agents monitoring your business 24/7
- Cron jobs running strategies while you're offline
- Your entire workflow automated, on hardware you own

The difference between Perplexity's version and what you can build yourself isn't features. It's control. Perplexity = Shopify. OpenClaw = owning the server.
Perplexity@perplexity_ai·
Announcing Personal Computer. Personal Computer is an always on, local merge with Perplexity Computer that works for you 24/7. It's personal, secure, and works across your files, apps, and sessions through a continuously running Mac mini.