Parallax

92 posts

@tryParallax

build your own ai cluster. run open models across your machines.

Joined December 2025
39 Following · 1.3K Followers
Parallax reposted
Yuan ./@yuangao·
Thrilled to see @tryParallax live in production on @Theta_Network. This is exactly why @Gradient_HQ built Parallax: turning the world’s GPU mesh into a sovereign, distributed token factory. Congrats on the milestone! 🫡
Theta Network@Theta_Network

To make this work, we adapted Parallax, @Gradient_HQ's distributed inference framework, to run across EdgeCloud's global node network. One API endpoint, model split across many machines, no centralized cluster required.

Replies: 25 · Reposts: 64 · Likes: 353 · Views: 35K
Parallax@tryParallax·
glad we could help! with agentic adoption soaring, privacy and token cost are already the top concerns for both agent and human users. that's what parallax is built for.
Theta Network@Theta_Network

To make this work, we adapted Parallax, @Gradient_HQ's distributed inference framework, to run across EdgeCloud's global node network. One API endpoint, model split across many machines, no centralized cluster required.

Replies: 17 · Reposts: 27 · Likes: 204 · Views: 20.4K
Theta Network@Theta_Network·
To make this work, we adapted Parallax, @Gradient_HQ's distributed inference framework, to run across EdgeCloud's global node network. One API endpoint, model split across many machines, no centralized cluster required.
Replies: 11 · Reposts: 35 · Likes: 267 · Views: 77.8K
Theta Network@Theta_Network·
Qwen3 32B by Alibaba is now live on Theta EdgeCloud as a decentralized on-demand inference API, a large-scale LLM served across community GPU nodes using pipeline parallelism over the internet. 🧵
Replies: 24 · Reposts: 115 · Likes: 506 · Views: 40.5K
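A minimal sketch of the idea behind "model split across many machines" with pipeline parallelism: each node holds a contiguous range of layers, and activations pass from one node to the next. This is not Theta's or Parallax's actual scheduler. The function name `partition_layers`, the node names, the memory figures, and the 64-layer count are all assumptions made for illustration.

```python
from typing import Dict, Tuple


def partition_layers(
    num_layers: int, node_memory_gb: Dict[str, float]
) -> Dict[str, Tuple[int, int]]:
    """Give each node a contiguous [start, end) layer range sized by its memory share."""
    total = sum(node_memory_gb.values())
    names = list(node_memory_gb)
    ranges: Dict[str, Tuple[int, int]] = {}
    start = 0
    for i, name in enumerate(names):
        if i == len(names) - 1:
            # The last node takes whatever is left so every layer is assigned.
            end = num_layers
        else:
            share = node_memory_gb[name] / total
            end = min(num_layers, start + round(num_layers * share))
        ranges[name] = (start, end)
        start = end
    return ranges


# Hypothetical two-node pool: a 24 GB GPU and an 8 GB GPU, 64 layers assumed.
plan = partition_layers(64, {"node-a": 24.0, "node-b": 8.0})
print(plan)  # {'node-a': (0, 48), 'node-b': (48, 64)}
```

A real deployment would also have to account for network latency between nodes and for nodes joining or leaving, which a memory-proportional split ignores.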
Parallax@tryParallax·
@VitalikButerin buy a GPU, get together a group of friends. don’t carry the world on your own shoulders. we’ve been building this for a while. try parallax for local ai.
Replies: 1 · Reposts: 0 · Likes: 13 · Views: 1.6K
Parallax@tryParallax·
@RoundtableSpace 35b model on a macbook with compressed cache is a solid result. local inference keeps getting more accessible and it's fun to watch people push the limits of what consumer hardware can do!
Replies: 0 · Reposts: 0 · Likes: 4 · Views: 229
0xMarioNawfal@RoundtableSpace·
A solo dev rebuilt Google’s new algorithm with Claude in 7 days, made it 3.7x faster, and got a 35B model running on a MacBook with 4.6x compressed cache. Google published the paper. He shipped the code.
Replies: 64 · Reposts: 196 · Likes: 2.5K · Views: 251.9K
Parallax@tryParallax·
@adrgrondin @PrismML 1-bit model running at 40 tok/s on an iphone. mlx is making on-device inference surprisingly usable now.
Replies: 0 · Reposts: 0 · Likes: 6 · Views: 552
Adrien Grondin@adrgrondin·
Demo of 1-bit Bonsai 8B from @PrismML running on-device on iPhone 17 Pro. More than 40tk/s for a dense 8B model on iPhone, that’s a first. Powered by Apple MLX and available now in Locally AI.
Replies: 45 · Reposts: 123 · Likes: 1.8K · Views: 159.6K
Parallax@tryParallax·
@ollama local llm + mlx is a great combo! apple silicon keeps getting better for local inference and it's nice to see more players in the ecosystem lean into it properly.
Replies: 1 · Reposts: 0 · Likes: 6 · Views: 189
ollama@ollama·
Ollama is now updated to run the fastest on Apple silicon, powered by MLX, Apple's machine learning framework. This change unlocks much faster performance to accelerate demanding work on macOS:
- Personal assistants like OpenClaw
- Coding agents like Claude Code, OpenCode, or Codex
Replies: 293 · Reposts: 732 · Likes: 5.8K · Views: 778K
Parallax@tryParallax·
@tom_doerr single binary, self-hosted, no dependencies. this is the way local ai should ship. less config, more building.
Replies: 1 · Reposts: 0 · Likes: 6 · Views: 59
Parallax@tryParallax·
@karaage0703 the local performance gap between 9b and 27b is huge. qwen3.5 is one of the best models to self-host right now, especially if you're sharding across different devices.
Replies: 1 · Reposts: 0 · Likes: 6 · Views: 137
からあげ@karaage0703·
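For my use case, running on DGX Spark, Qwen3.5 27B is overwhelmingly better than 9B. It seems the experience really does differ depending on use case and environment. > Qwen3.5's 27B lost to the 9B: the RTX 4060 paradox | ぷらずもん zenn.dev/plasmon/articl… #zenn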
Replies: 3 · Reposts: 17 · Likes: 150 · Views: 16K
Parallax@tryParallax·
TurboQuant tackles one bottleneck: KV cache memory. there's another one that matters just as much in distributed setups: communication latency between nodes. we built Decentralized Speculative Decoding (DSD) to turn that idle network wait time into useful computation, 2.56x speedup on HumanEval, no retraining needed. combine cache compression with latency compression and local inference starts looking very different. arxiv.org/abs/2511.11733
Replies: 6 · Reposts: 3 · Likes: 20 · Views: 561
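For readers new to the technique, here is a toy sketch of generic greedy speculative decoding: a cheap draft model proposes a few tokens and the target model checks them. It is not the DSD method from the linked paper, whose details are not given in the post. The toy `draft` and `target` functions and all other names are hypothetical.

```python
from typing import Callable, List

NextToken = Callable[[List[int]], int]


def speculative_step(prefix: List[int], draft: NextToken, target: NextToken, k: int) -> List[int]:
    """Draft k tokens, then keep the longest run the target model agrees with."""
    ctx = list(prefix)
    proposed = []
    for _ in range(k):
        tok = draft(ctx)
        proposed.append(tok)
        ctx.append(tok)

    # Verification. Real systems score all k positions in one batched target pass;
    # this toy version calls the target once per position for clarity.
    ctx = list(prefix)
    accepted = []
    for tok in proposed:
        expected = target(ctx)
        if expected != tok:
            accepted.append(expected)  # replace the first mismatch and stop
            return accepted
        accepted.append(tok)
        ctx.append(tok)
    accepted.append(target(ctx))  # all drafts accepted: one extra target token
    return accepted


def generate(prefix: List[int], draft: NextToken, target: NextToken, k: int, max_new: int) -> List[int]:
    out = list(prefix)
    while len(out) - len(prefix) < max_new:
        out.extend(speculative_step(out, draft, target, k))
    return out[: len(prefix) + max_new]


def target(ctx: List[int]) -> int:
    return (ctx[-1] + 1) % 10  # toy "large" model: counts upward


def draft(ctx: List[int]) -> int:
    return 0 if ctx[-1] == 4 else (ctx[-1] + 1) % 10  # toy draft: wrong after a 4


print(generate([0], draft, target, k=3, max_new=8))  # [0, 1, 2, 3, 4, 5, 6, 7, 8]
```

With greedy verification the output matches what the target model would produce on its own. The gain comes from accepting several tokens per target pass.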
Shay Boloor@StockSavvyShay·
$GOOGL just released TurboQuant which is a new compression method that can cut LLM cache memory by at least 6x & deliver ~8x speedups without sacrificing quality This could make local AI inference far more capable with larger context windows & less memory strain across devices
Replies: 53 · Reposts: 104 · Likes: 715 · Views: 758.6K
Parallax@tryParallax·
hf-mount solves the storage side: any model, mounted locally like a drive. the next piece is actually running those models across whatever hardware you have. that's what parallax does: schedule inference across a pool of heterogeneous GPUs so the model doesn't just live on your machine, it runs there too. mount + serve, fully local.
Replies: 0 · Reposts: 0 · Likes: 1 · Views: 132
clem 🤗@ClementDelangue·
Local AI is free, fast & secure! So today we're introducing hf-mount: attach any storage bucket, model or dataset from @huggingface as a local filesystem. This is a game changer, as it allows you to attach remote storage that is 100x bigger than your local machine's disk. This is also perfect for Agentic storage!! Let's go!
Replies: 67 · Reposts: 226 · Likes: 1.3K · Views: 252.8K
Parallax@tryParallax·
@oprydai you don't need to go into debt though. a couple of mac minis or an nvidia card can already run serious models locally. parallax lets you connect whatever hardware you have into one cluster. start small, add devices as you go. the whole point is using what's already on your desk.
Replies: 0 · Reposts: 0 · Likes: 4 · Views: 58
Mustafa@oprydai·
get into debt if you must, but build a hardware home lab.
Replies: 44 · Reposts: 68 · Likes: 1.2K · Views: 32.1K
Parallax@tryParallax·
@openclaw solid release. deepseek provider plugin + qwen pay-as-you-go opens up a lot of new local setups. parallax users running openclaw stacks should have a smoother time with this one.
Replies: 0 · Reposts: 0 · Likes: 6 · Views: 1.5K
OpenClaw🦞@openclaw·
OpenClaw 2026.3.23 🦞
🧪 DeepSeek provider plugin
☁️ Qwen pay-as-you-go
♻️ OpenRouter auto pricing + Anthropic thinking order
🖥️ Chrome MCP waits for tabs
🔧 Discord/Slack/Matrix + Web UI fixes
Upgrade before your agent does it for you. github.com/openclaw/openc…
Replies: 217 · Reposts: 216 · Likes: 2.4K · Views: 374.3K
Parallax@tryParallax·
@wolfejosh the ceiling for on-device keeps moving. a year ago people argued you couldn't run anything useful locally. now it's 400B on a phone. parallax already supports mixed hardware clusters — apple silicon, nvidia, whatever you've got. the trend is clear.
Replies: 0 · Reposts: 0 · Likes: 4 · Views: 182
Parallax@tryParallax·
the $3,469 single-night burn is a good reminder of what you're actually signing up for with cloud inference. when the meter's always running, one stuck agent is a bill. parallax runs models on your own machines. no token meter, no overnight surprises.
Ziwen@ziwenxu_

x.com/i/article/2034…

Replies: 5 · Reposts: 8 · Likes: 40 · Views: 1.4K
Junyang Lin@JustinLin610·
local agents will definitely become more important in the coming days and months while agents become part of our life and work. privacy always matters
Zach Mueller@TheZachMueller

PinchBench results for Qwen3.5 27B using @UnslothAI K_XL quants, best of 3, thinking enabled. TL;DR: Q3 KXL (14.5GB) or Q4 KXL (18GB). While overall the "best" results showed little degradation, if you dig into mean/std, Q4_K_XL was the best overall at ~84% on average. Q3 seems viable, while Q2 is the lowest performing, of course.

Replies: 35 · Reposts: 38 · Likes: 570 · Views: 65.7K
Parallax@tryParallax·
local ai has picked up fast since openclaw dropped. with the latest wave of small capable models, more people are running serious workloads on their own hardware. if you missed this good local ai tutorial from @yacinelearning or want a refresher on how distributed scheduling actually works under the hood, it's worth the rewatch over the weekend!
Yacine Mahdid@yacinelearning

I am continuing my adventure into distributed AI systems with the Parallax scheduling strategy from @Gradient_HQ. In this 37-minute tutorial I go through:
- heuristic used to make scheduling tractable
- dynamic programming formulation
- filling GPU with water
- shoving them into shelves

Replies: 14 · Reposts: 15 · Likes: 121 · Views: 15.2K
Parallax@tryParallax·
@tomosman @openclaw @NousResearch mac minis are underrated for this. we've been running multi-node setups on apple silicon with parallax and the performance-per-dollar is hard to beat. nice to see more people building this way.
Replies: 0 · Reposts: 0 · Likes: 0 · Views: 12
Tom Osman 🐦‍⬛@tomosman·
Infinitely bullish on a stack of MacMinis or Studios at home running @openclaw or @NousResearch Hermes. Run local models and soon you will have AGI at home. Lots of other epic stuff too but feels downstream of being able to do this.
Replies: 2 · Reposts: 2 · Likes: 26 · Views: 1.3K