ByteShape
@ByteShape

Accelerating AI by Learning Optimal Datatypes

Canada · Joined October 2024
5 Following · 33 Followers
12 posts
ByteShape @ByteShape
GPUs are consistent. CPUs are not. With our ByteShape Qwen 3.5 9B quants, the same models perform well across GPUs, but CPUs each have their own “favorites”. No one-size-fits-all. Optimize for your hardware. byteshape.com/blogs/Qwen3.5-…
ByteShape retweeted
Allen Lau 🇨🇦 @allenlau
ByteShape was quietly launched just before year end. Two weeks ago, we announced our investment in the company. Since its launch, and with minimal fanfare on purpose, @ByteShape's cumulative downloads have easily blown past 100,000. No small feat for a new startup!
ByteShape retweeted
Allen Lau 🇨🇦 @allenlau
Announcing @twosmallfishvc's investment in @ByteShape. In short, ByteShape is delivering step-function gains in AI efficiency, including up to 7x faster training, up to 10x faster inference, plus up to 40% compression to reduce model size.
Qwen @Alibaba_Qwen
🚀 Introducing the Qwen 3.5 Medium Model Series: Qwen3.5-Flash · Qwen3.5-35B-A3B · Qwen3.5-122B-A10B · Qwen3.5-27B
✨ More intelligence, less compute.
• Qwen3.5-35B-A3B now surpasses Qwen3-235B-A22B-2507 and Qwen3-VL-235B-A22B — a reminder that better architecture, data quality, and RL can move intelligence forward, not just bigger parameter counts.
• Qwen3.5-122B-A10B and 27B continue narrowing the gap between medium-sized and frontier models — especially in more complex agent scenarios.
• Qwen3.5-Flash is the hosted production version aligned with 35B-A3B, featuring:
– 1M context length by default
– Official built-in tools
🔗 Hugging Face: huggingface.co/collections/Qw…
🔗 ModelScope: modelscope.cn/collections/Qw…
🔗 Qwen3.5-Flash API: modelstudio.console.alibabacloud.com/ap-southeast-1…
Try in Qwen Chat 👇
Flash: chat.qwen.ai/?models=qwen3.…
27B: chat.qwen.ai/?models=qwen3.…
35B-A3B: chat.qwen.ai/?models=qwen3.…
122B-A10B: chat.qwen.ai/?models=qwen3.…
Would love to hear what you build with it.
ByteShape @ByteShape
We released ShapeLearn-optimized GGUFs for:
• Devstral Small 2 24B, tuned for RTX 40/50 GPUs
• Qwen3 Coder 30B, runs everywhere, yes even the Pi
Maximum quality. Fastest TPS. Minimal compromise.
GGUFs + interactive plots are live: byteshape.com/blogs/Devstral…
ByteShape retweeted
Jeff Geerling @geerlingguy
Raspberry Pi has a new AI HAT. This time with built-in 8 GB of RAM, so you can run machine vision + LLM inference all without touching the Pi's CPU. It's $130 and a little bit of a niche item. Find out why in my video: youtube.com/watch?v=jRQaur…
ByteShape @ByteShape
New ShapeLearn GGUFs! Qwen3-30B-A3B-Instruct-2507, tuned for real hardware, not just “fewer bits.” We treat memory as a budget: fit first, then optimize TPS vs quality. And yes, a 30B runs on a Raspberry Pi 5. byteshape.com/blogs/Qwen3-30…
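The "memory as a budget: fit first, then optimize" idea can be sketched as a simple two-stage selection. Everything below is a hypothetical illustration: the quant variant names, file sizes, and quality scores are made-up placeholders, not ByteShape's actual data or algorithm.

```python
# Hypothetical sketch of "fit first, then optimize": discard quant variants
# that exceed the RAM budget, then pick the best of what remains.
RAM_BUDGET_GB = 8.0  # e.g. a Raspberry Pi 5 with 8 GB

# (variant name, file size in GB, relative quality score) -- illustrative numbers
quants = [
    ("Q8_0",   32.5, 0.99),
    ("Q5_K_M", 21.7, 0.97),
    ("Q4_K_M", 18.6, 0.95),
    ("Q3_K_S", 14.4, 0.90),
    ("IQ2_XS",  9.3, 0.80),
    ("IQ1_S",   7.2, 0.65),
]

def pick_quant(budget_gb, candidates):
    """Fit first: keep only variants that fit the memory budget.
    Then optimize: among those, return the highest-quality one."""
    fitting = [q for q in candidates if q[1] <= budget_gb]
    if not fitting:
        return None  # nothing fits; a smaller model is needed
    return max(fitting, key=lambda q: q[2])

print(pick_quant(RAM_BUDGET_GB, quants))  # -> ('IQ1_S', 7.2, 0.65)
```

With an 8 GB budget only the smallest variant fits, so it wins by default; on a 24 GB GPU the same function would return the higher-quality Q5_K_M instead.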
ByteShape @ByteShape
What if AI models could learn their own compression? ByteShape's tech learns the best datatype per tensor. On Qwen3-4B and Llama 3.1-8B: 2.5x–3.5x better accuracy vs GGUF; the same quality at higher speed, or higher quality at the same speed, from RTX 5090 to Raspberry Pi 5. byteshape.com/blogs/Qwen3-4B…
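One way to picture "learning the best datatype per tensor" is a search: quantize each tensor at several candidate bit widths and keep the cheapest one whose round-trip error stays acceptable. The sketch below is an assumption-laden illustration — the tensor names, the uniform quantizer, and the MSE tolerance are all placeholders, not ByteShape's actual ShapeLearn method.

```python
# Illustrative per-tensor datatype search: try narrow datatypes first and
# widen only when the quantization error exceeds a tolerance.
import random

def quantize_dequantize(values, bits):
    """Uniform symmetric quantization to `bits` bits, then back to float."""
    levels = 2 ** (bits - 1) - 1
    scale = max(abs(v) for v in values) / levels or 1.0
    return [round(v / scale) * scale for v in values]

def mse(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)

def pick_datatype(tensor, candidate_bits=(2, 3, 4, 8), tol=1e-4):
    """Return the fewest bits whose round-trip MSE is within `tol`,
    falling back to the widest candidate."""
    for bits in sorted(candidate_bits):
        if mse(tensor, quantize_dequantize(tensor, bits)) <= tol:
            return bits
    return max(candidate_bits)

random.seed(0)
# Stand-in "model": a few named tensors of Gaussian weights.
model = {name: [random.gauss(0, 0.1) for _ in range(256)]
         for name in ("attn.q_proj", "attn.k_proj", "mlp.up_proj")}
plan = {name: pick_datatype(t) for name, t in model.items()}
print(plan)  # a per-tensor bit-width plan, e.g. different widths per tensor
```

Real systems would score end-to-end model quality rather than per-tensor MSE, and would weigh throughput on the target hardware, but the shape of the search — a per-tensor decision rather than one global datatype — is the same.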