ByteShape
@ByteShape

Accelerating AI by Learning Optimal Datatypes

Canada · Joined October 2024
5 Following · 33 Followers
12 posts
ByteShape @ByteShape
GPUs are consistent. CPUs are not. With our ByteShape Qwen 3.5 9B quants, the same models perform well across GPUs, but CPUs each have their own “favorites”. No one-size-fits-all. Optimize for your hardware. byteshape.com/blogs/Qwen3.5-…
ByteShape retweeted
Allen Lau 🇨🇦 @allenlau
ByteShape was quietly launched just before year end. Two weeks ago, we announced our investment in the company. Since its launch, and with minimal fanfare on purpose, @ByteShape's cumulative downloads have easily blown past 100,000. No small feat for a new startup!
ByteShape retweeted
Allen Lau 🇨🇦 @allenlau
Announcing @twosmallfishvc's investment in @ByteShape. In short, ByteShape is delivering step-function gains in AI efficiency, including up to 7x faster training, up to 10x faster inference, plus up to 40% compression to reduce model size.
Qwen @Alibaba_Qwen
🚀 Introducing the Qwen 3.5 Medium Model Series: Qwen3.5-Flash · Qwen3.5-35B-A3B · Qwen3.5-122B-A10B · Qwen3.5-27B
✨ More intelligence, less compute.
• Qwen3.5-35B-A3B now surpasses Qwen3-235B-A22B-2507 and Qwen3-VL-235B-A22B — a reminder that better architecture, data quality, and RL can move intelligence forward, not just bigger parameter counts.
• Qwen3.5-122B-A10B and 27B continue narrowing the gap between medium-sized and frontier models — especially in more complex agent scenarios.
• Qwen3.5-Flash is the hosted production version aligned with 35B-A3B, featuring:
– 1M context length by default
– Official built-in tools
🔗 Hugging Face: huggingface.co/collections/Qw…
🔗 ModelScope: modelscope.cn/collections/Qw…
🔗 Qwen3.5-Flash API: modelstudio.console.alibabacloud.com/ap-southeast-1…
Try in Qwen Chat 👇
Flash: chat.qwen.ai/?models=qwen3.…
27B: chat.qwen.ai/?models=qwen3.…
35B-A3B: chat.qwen.ai/?models=qwen3.…
122B-A10B: chat.qwen.ai/?models=qwen3.…
Would love to hear what you build with it.
ByteShape @ByteShape
We released ShapeLearn-optimized GGUFs for:
• Devstral Small 2 24B, tuned for RTX 40/50 GPUs
• Qwen3 Coder 30B, runs everywhere, yes even the Pi
Maximum quality. Fastest TPS. Minimal compromise.
GGUFs + interactive plots are live: byteshape.com/blogs/Devstral…
ByteShape retweeted
Jeff Geerling @geerlingguy
Raspberry Pi has a new AI HAT. This time with built-in 8 GB of RAM, so you can run machine vision + LLM inference all without touching the Pi's CPU. It's $130 and a little bit of a niche item. Find out why in my video: youtube.com/watch?v=jRQaur…
ByteShape @ByteShape
New ShapeLearn GGUFs! Qwen3-30B-A3B-Instruct-2507, tuned for real hardware, not just “fewer bits.” We treat memory as a budget: fit first, then optimize TPS vs quality. And yes, a 30B runs on a Raspberry Pi 5. byteshape.com/blogs/Qwen3-30…
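The "memory as a budget: fit first, then optimize" idea can be sketched as a simple two-stage selection. Everything below is a hypothetical illustration: the quant variant names, file sizes, and quality scores are made-up placeholders, not ByteShape's actual data or algorithm.

```python
# Hypothetical sketch of "fit first, then optimize": discard quant variants
# that exceed the RAM budget, then pick the best of what remains.
RAM_BUDGET_GB = 8.0  # e.g. a Raspberry Pi 5 with 8 GB

# (variant name, file size in GB, relative quality score) -- illustrative numbers
quants = [
    ("Q8_0",   32.5, 0.99),
    ("Q5_K_M", 21.7, 0.97),
    ("Q4_K_M", 18.6, 0.95),
    ("Q3_K_S", 14.4, 0.90),
    ("IQ2_XS",  9.3, 0.80),
    ("IQ1_S",   7.2, 0.65),
]

def pick_quant(budget_gb, candidates):
    """Fit first: keep only variants that fit the memory budget.
    Then optimize: among those, return the highest-quality one."""
    fitting = [q for q in candidates if q[1] <= budget_gb]
    if not fitting:
        return None  # nothing fits; a smaller model is needed
    return max(fitting, key=lambda q: q[2])

print(pick_quant(RAM_BUDGET_GB, quants))  # -> ('IQ1_S', 7.2, 0.65)
```

With an 8 GB budget only the smallest variant fits, so it wins by default; on a 24 GB GPU the same function would return the higher-quality Q5_K_M instead.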
ByteShape @ByteShape
What if AI models could learn their own compression? ByteShape's tech learns the best datatype per tensor. On Qwen3-4B and Llama 3.1-8B: 2.5x–3.5x better accuracy vs GGUF; the same quality at higher speed, or higher quality at the same speed, from RTX 5090 to Raspberry Pi 5. byteshape.com/blogs/Qwen3-4B…
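One way to picture "learning the best datatype per tensor" is a search: quantize each tensor at several candidate bit widths and keep the cheapest one whose round-trip error stays acceptable. The sketch below is an assumption-laden illustration — the tensor names, the uniform quantizer, and the MSE tolerance are all placeholders, not ByteShape's actual ShapeLearn method.

```python
# Illustrative per-tensor datatype search: try narrow datatypes first and
# widen only when the quantization error exceeds a tolerance.
import random

def quantize_dequantize(values, bits):
    """Uniform symmetric quantization to `bits` bits, then back to float."""
    levels = 2 ** (bits - 1) - 1
    scale = max(abs(v) for v in values) / levels or 1.0
    return [round(v / scale) * scale for v in values]

def mse(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)

def pick_datatype(tensor, candidate_bits=(2, 3, 4, 8), tol=1e-4):
    """Return the fewest bits whose round-trip MSE is within `tol`,
    falling back to the widest candidate."""
    for bits in sorted(candidate_bits):
        if mse(tensor, quantize_dequantize(tensor, bits)) <= tol:
            return bits
    return max(candidate_bits)

random.seed(0)
# Stand-in "model": a few named tensors of Gaussian weights.
model = {name: [random.gauss(0, 0.1) for _ in range(256)]
         for name in ("attn.q_proj", "attn.k_proj", "mlp.up_proj")}
plan = {name: pick_datatype(t) for name, t in model.items()}
print(plan)  # a per-tensor bit-width plan, e.g. different widths per tensor
```

Real systems would score end-to-end model quality rather than per-tensor MSE, and would weigh throughput on the target hardware, but the shape of the search — a per-tensor decision rather than one global datatype — is the same.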