Fleek

3.5K posts

@fleek

Turning AI models into supermodels. Our internal lab releases cool AI research & open source tools via https://t.co/xZMtSRR4mO

Tensor Cores · Joined October 2018
193 Following · 99.1K Followers
Fleek@fleek·
@xcorat not for long! ⚡️
0
0
0
68
Fleek@fleek·
NVIDIA just dropped benchmarks showing 4-bit inference loses less than 1 point vs BF16 on most tasks. It's not accuracy per request that you should be measuring. It's tasks completed per dollar. And at that metric, 4-bit wins by a landslide. Read the full blog 👇
Fleek@fleek

x.com/i/article/2016…

12
6
24
3.3K
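A minimal sketch of the "tasks completed per dollar" metric from the post above. The function name, the one-request-per-task simplification, and every number below are illustrative assumptions, not figures from NVIDIA's benchmarks or from the linked article. The accuracy gap is set just under one point to match the claim, and the 2x cost difference is hypothetical.

```python
def tasks_per_dollar(success_rate: float, cost_per_request: float) -> float:
    """Expected number of successfully completed tasks per dollar of spend.

    Assumes one request per task; success_rate is in [0, 1] and
    cost_per_request is in dollars.
    """
    if cost_per_request <= 0:
        raise ValueError("cost_per_request must be positive")
    return success_rate / cost_per_request


# Hypothetical numbers for illustration only.
bf16 = tasks_per_dollar(success_rate=0.80, cost_per_request=0.004)
four_bit = tasks_per_dollar(success_rate=0.79, cost_per_request=0.002)

print(f"BF16:  {bf16:.0f} tasks per dollar")
print(f"4-bit: {four_bit:.0f} tasks per dollar")
```

Under these assumed inputs, the small accuracy loss is outweighed by the lower cost per request. The conclusion depends entirely on the real cost ratio, which this sketch does not measure.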
Fleek@fleek·
@OpheliaMystic There are several other components of the stack that will be open-sourced in the coming weeks and months. Stay tuned, and keep an eye on the repos 👀
0
0
0
37
Ophelia@OpheliaMystic·
@fleek This looks intriguing! The integration of mdspan with CUTLASS layouts could streamline our CUDA workflow significantly. I appreciate the focus on minimizing overhead while enhancing functionality. Excited to check out the complete example! 🚀
1
0
1
44
Fleek@fleek·
1/ Yesterday we announced mdspan-cute: C++23 std::mdspan syntax with CUTLASS cute layouts. One header. Zero overhead. Here's how it works 🧵
2
5
14
1.3K
Fleek@fleek·
7/ Layout algebra is formalized in Lean 4. 26 theorems, 0 sorry. Properties extracted to RapidCheck tests. The art/ directory has 23 SVG visualizations - we drew pictures until we understood.
1
0
4
780
Fleek@fleek·
💿 Open Source Release 💿 mdspan-cute: a zero-overhead bridge between C++23 std::mdspan and CUTLASS cute layouts. One header. Swizzled memory. No bank conflicts. Read the blog and check out the repo (links in reply)
1
1
7
1K
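To illustrate what "swizzled memory, no bank conflicts" refers to in the post above, here is a small Python simulation of a generic XOR swizzle. It is not the mdspan-cute API or its actual layout. The 32-bank, 4-byte-word shared memory model, the 32x32 tile, and all function names are assumptions chosen for the sketch.

```python
NUM_BANKS = 32  # assumed: 32 shared-memory banks, one 4-byte word each
TILE = 32       # assumed: 32x32 tile of 4-byte elements


def bank_linear(row: int, col: int) -> int:
    """Bank hit by element (row, col) in a plain row-major tile."""
    return (row * TILE + col) % NUM_BANKS


def bank_swizzled(row: int, col: int) -> int:
    """Bank hit when the column index is XOR-ed with the row index."""
    return (row * TILE + (col ^ row)) % NUM_BANKS


def max_conflict(bank_fn, col: int) -> int:
    """Worst-case threads per bank when thread t of a warp reads (t, col)."""
    hits = [0] * NUM_BANKS
    for row in range(TILE):
        hits[bank_fn(row, col)] += 1
    return max(hits)


# Column access: every thread reads the same column, one row each.
print("linear   :", max_conflict(bank_linear, col=5), "threads on one bank")
print("swizzled :", max_conflict(bank_swizzled, col=5), "thread on one bank")
```

In the row-major layout, all 32 threads reading one column land on the same bank. With the XOR swizzle, each thread lands on a different bank, because `col ^ row` takes a distinct value for each row.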
Fleek@fleek·
@cv_alphas @grok If their broad definition stands, it has implications for other industries including drones, robotics, and more - not just the EVs they claim later in the patent it applies to
0
0
1
26
Fleek@fleek·
@cv_alphas @grok "Derivative" would mean it's based on prior art. rfl means it IS prior art. Not derived. Identical.
1
0
1
31
Fleek@fleek·
6/ On "bit augmentation": Log/exp is a bijection. Information in = information out. You can't create precision from a reversible transformation. Thermodynamics doesn't allow it.
0
0
2
63
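A small sketch of the bijection argument in the post above: sending a set of values through log and back through exp yields exactly as many distinct values as went in. The 4-bit code set (integers 1 to 16) and the function name are assumptions for illustration and are not taken from the patent under discussion.

```python
import math


def log_exp_roundtrip(values):
    """Map values through log, then back through exp.

    Returns (number of distinct log-domain values, restored values).
    """
    logged = [math.log(v) for v in values]
    restored = [math.exp(x) for x in logged]
    return len(set(logged)), restored


# Assumed example: the 16 positive codes a 4-bit format could represent.
codes = list(range(1, 17))
distinct_in_log_domain, restored = log_exp_roundtrip(codes)

print("distinct inputs        :", len(set(codes)))
print("distinct in log domain :", distinct_in_log_domain)
print("round trip recovers inputs:",
      all(math.isclose(a, b) for a, b in zip(codes, restored)))
```

The count of distinguishable values is the same before and after the transform, which is the sense in which a reversible mapping cannot add bits of precision.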
Fleek@fleek·
5/ Quantized RoPE already runs in:
→ LLaMA
→ Mistral
→ Most open source inference stacks
This isn't obscure. It's foundational.
1
0
2
120