PyTorch

3.3K posts

@PyTorch

Tensors and neural networks in Python with strong hardware acceleration. PyTorch is an open source project at the Linux Foundation. #PyTorchFoundation

Joined September 2016
86 Following, 495.2K Followers
NVIDIA AI @NVIDIAAI
We're adopting the Linux Foundation’s OpenMDW framework across our open model families. This helps make open model licensing simpler and more consistent at scale. A single legal framework across models, code, documentation, and data helps reduce friction for developers and enterprises building with open source.
The Linux Foundation @linuxfoundation

OpenMDW-1.1 is now available — and @NVIDIAAI is adopting it across Cosmos, Isaac GR00T, Ising, and Nemotron model families. A permissive, unified legal framework purpose-built for AI models. Learn more at openmdw.ai

26
73
622
157.1K
PyTorch @PyTorch
Wondering what it's like to attend a PyTorch Conference? Attendees of the first ever #PyTorchCon Europe 2026 in Paris share a glimpse into what it's like... Watch the highlight reel on YouTube: youtu.be/0lKdvXZkS-4?si… Will we see you at PyTorch Conference China and PyTorch Conference North America later this year?
3
6
25
11.3K
PyTorch @PyTorch
How to Eliminate Pipeline Friction in AI Model Serving
Pipeline friction is a collective name for a range of issues that cost organizations time, money, and competitive advantage. This post provides actionable best practices for eliminating the most common sources of friction in AI model serving pipelines: unsupported operations, dynamic input sizes, version mismatches, and model export issues that can arise, for example, when converting from a training framework like PyTorch into optimized inference formats.
Read the full post: developer.nvidia.com/blog/how-to-el…
2
13
46
6.5K
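The friction sources named in the post above (unsupported operations, version mismatches, dynamic input sizes) can be checked before a model reaches the serving stack. The following is a minimal sketch of such a preflight check, not the method from the linked blog post. The manifest layout, the `SUPPORTED_OPS` set, and the `preflight` function are all hypothetical names invented for illustration.

```python
# Hypothetical op set for an imaginary inference runtime (illustration only).
SUPPORTED_OPS = {"conv2d", "relu", "matmul", "softmax"}


def preflight(manifest, runtime_version, supported_ops=frozenset(SUPPORTED_OPS)):
    """Return a list of human-readable problems found in a model manifest.

    `manifest` is a hypothetical dict with keys:
      ops            - op names used by the exported model
      exported_with  - framework version string used at export time
      inputs         - input name -> shape, with None marking a dynamic axis
      dynamic_axes   - input name -> axes declared dynamic at export time
    """
    problems = []

    # 1. Unsupported operations.
    missing = sorted(set(manifest["ops"]) - set(supported_ops))
    if missing:
        problems.append("unsupported ops: " + ", ".join(missing))

    # 2. Version mismatch, compared on major.minor only.
    exported = manifest["exported_with"]
    if exported.split(".")[:2] != runtime_version.split(".")[:2]:
        problems.append(
            f"version mismatch: exported with {exported}, runtime is {runtime_version}"
        )

    # 3. Dynamic input sizes that were never declared at export time.
    declared = manifest.get("dynamic_axes", {})
    for name, shape in manifest["inputs"].items():
        for axis, dim in enumerate(shape):
            if dim is None and axis not in declared.get(name, []):
                problems.append(f"input '{name}' axis {axis} is dynamic but not declared")

    return problems


manifest = {
    "ops": ["conv2d", "relu", "grid_sample"],
    "exported_with": "2.4.0",
    "inputs": {"image": [None, 3, 224, 224]},
    "dynamic_axes": {},
}
for problem in preflight(manifest, runtime_version="2.6.1"):
    print(problem)
```

Running a check like this in CI would surface all three classes of problem at export time instead of at deployment.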
PyTorch @PyTorch
"We believe the future of AI is built on open, production-proven infrastructure - and PyTorch sits at the heart of that future. Joining the PyTorch Foundation is a natural step given our years of running PyTorch at scale across heterogeneous hardware on Alibaba Cloud. We look forward to working alongside the PyTorch Foundation to raise the bar for AI infrastructure and help developers build the next generation of models with confidence," said Dr. Feifei Li, Chief Technology Officer, Alibaba Cloud. ITBrief @techday covers @alibaba_cloud joining the PyTorch Foundation as a Platinum member itbrief.news/story/alibaba-…
1
7
52
8.8K
PyTorch @PyTorch
PyTorch Compile can make models run dramatically faster, but the real magic is what happens under the hood. This blog breaks down one of the most important optimizations behind torch.compile: kernel fusion. Instead of launching separate GPU kernels for every PyTorch operation, PyTorch Inductor can combine dependent operations into a single optimized Triton kernel. The result? Fewer kernel launches, less memory traffic, fewer intermediate tensors, and more efficient GPU execution. Learn more here 👉 tinyurl.com/ms9sdnyn
3
24
191
13K
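The kernel-fusion idea in the post above can be illustrated without a GPU. The sketch below is a plain-Python analogy, not Inductor or Triton code: each list comprehension stands in for one kernel launch, and each intermediate list stands in for a tensor written to and read back from memory. The function names are made up for the example.

```python
import math


def unfused(xs):
    """Three separate passes; each one materializes an intermediate list."""
    scaled = [2.0 * x for x in xs]           # "kernel" 1: writes an intermediate
    shifted = [s + 1.0 for s in scaled]      # "kernel" 2: reads it, writes another
    return [math.tanh(t) for t in shifted]   # "kernel" 3: reads it, writes the result


def fused(xs):
    """One pass over the data: same arithmetic, no intermediate lists."""
    return [math.tanh(2.0 * x + 1.0) for x in xs]


data = [-1.0, 0.0, 0.5, 2.0]
# Both versions apply the same floating-point operations in the same order.
assert unfused(data) == fused(data)
```

The fused version touches each element once and allocates only the output, which mirrors the claimed benefits: fewer launches, less memory traffic, and fewer intermediate tensors.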
PyTorch @PyTorch
The speed-of-light optimization for Qwen3.5 on the TokenSpeed inference engine is a significant milestone, achieving a record-breaking 580 tokens per second (tps) for agentic workloads on NVIDIA GPUs. In the PyTorch Foundation's latest community blog post, you can learn all about the complete design, implementation, and optimization of Qwen3.5 models in the TokenSpeed inference framework and see for yourself how this work is improving performance 👉 bit.ly/4uGUvIS This achievement was a joint effort between the @Alibaba_Qwen inference team, @lightseekorg Foundation TokenSpeed team, @NVIDIAAI, and the Mooncake team, with special contributions from @tri_dao for FlashAttention-4 (FA4) optimization. @KVCache_AI
12
49
284
264.3K
PyTorch @PyTorch
Don't miss the flagship #PyTorch event of the year! 🚀 Join us in San Jose, CA, from Oct 20-21 for #PyTorchCon North America. Early bird registration saves you $400 through July 31. Register: bit.ly/4sh3DSw
0
3
16
4.5K
PyTorch retweeted
LightSeek Foundation @lightseekorg
Introducing EAGLE 3.1: the next evolution of speculative decoding from @EagleCorp, developed by @hongyangzh, @dogacel0, and the EAGLE team in collaboration with vLLM @vllm_project and TorchSpec @lightseekorg.
EAGLE 3.1 introduces a new FC normalization + post-normalization hidden-state feedback architecture that significantly improves long-context robustness, acceptance length, and serving stability across real-world inference environments.
@NVIDIA has been instrumental in the large-scale training, benchmarking, and inference validation of EAGLE 3.1 to help bring this next step in inference acceleration to production environments.
For EAGLE 3.1, the EAGLE team identified attention drift as a key bottleneck behind deeper-step acceptance-length degradation in speculative decoding.
The Results:
• Up to 2× longer acceptance length in long-context
• Stronger long-context + chat-template robustness
• More stable serving across diverse prompts/environments
• Native vLLM support
• TorchSpec training support
• Open-source Kimi K2.6 EAGLE 3.1 draft model
Read more below 👇 lightseek.org/blog/eagle-3-1…
2
11
57
16.8K
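For readers unfamiliar with the term "acceptance length" in the announcement above, here is a minimal sketch of greedy speculative decoding in plain Python. It is not EAGLE 3.1, vLLM, or TorchSpec code. The two "models" (`target_next`, `draft_next`) are hypothetical toy functions, and real systems verify all drafted tokens in one batched forward pass instead of a loop.

```python
def target_next(prefix):
    """Hypothetical large model: next token is a fixed function of the prefix."""
    return (sum(prefix) * 31 + len(prefix)) % 7


def draft_next(prefix):
    """Hypothetical small model: matches the target except when the
    prefix length is a multiple of 4, where it guesses wrong."""
    guess = target_next(prefix)
    return (guess + 1) % 7 if len(prefix) % 4 == 0 else guess


def speculative_step(prefix, k):
    """Draft k tokens, then keep the longest run the target agrees with.

    Returns (new_tokens, accepted), where `accepted` counts drafted tokens kept.
    """
    drafted, ctx = [], list(prefix)
    for _ in range(k):
        token = draft_next(ctx)
        drafted.append(token)
        ctx.append(token)

    new_tokens, ctx = [], list(prefix)
    for token in drafted:
        expected = target_next(ctx)
        if token != expected:
            new_tokens.append(expected)  # target's token replaces the first mismatch
            return new_tokens, len(new_tokens) - 1
        new_tokens.append(token)
        ctx.append(token)
    new_tokens.append(target_next(ctx))  # all drafts accepted: one bonus token
    return new_tokens, k


def generate(prompt, n_tokens, k=3):
    """Generate n_tokens with speculation; also return per-step acceptance counts."""
    out, accepted_per_step = list(prompt), []
    while len(out) - len(prompt) < n_tokens:
        new_tokens, accepted = speculative_step(out, k)
        out.extend(new_tokens)
        accepted_per_step.append(accepted)
    return out[: len(prompt) + n_tokens], accepted_per_step


def baseline(prompt, n_tokens):
    """Plain greedy decoding with the target model only."""
    out = list(prompt)
    for _ in range(n_tokens):
        out.append(target_next(out))
    return out


tokens, accepted_per_step = generate([1, 2, 3], 10)
assert tokens == baseline([1, 2, 3], 10)  # speculation never changes the output
print(accepted_per_step)
```

The output always matches plain greedy decoding with the target; a longer acceptance length only means fewer target steps are needed to produce it, which is why a draft model that stays accurate deeper into the drafted sequence speeds up serving.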
PyTorch @PyTorch
We’re excited to welcome @alibaba_cloud as a Platinum Member of the PyTorch Foundation 🎉 Alibaba Cloud is a global leader in full-stack AI infrastructure and the force behind Qwen—one of the world’s most influential open-weight model families. Having run PyTorch at massive scale across diverse hardware, they bring invaluable, production-hardened engineering expertise to the upstream community.
2
5
82
16.3K
vLLM @vllm_project
🎉 Thrilled to announce EAGLE 3.1 - the next evolution of speculative decoding from @EagleCorp, developed by @hongyangzh, @dogacel0, and the EAGLE team in collaboration with vLLM @vllm_project and TorchSpec @lightseekorg!
💡 EAGLE 3.1 introduces a new FC normalization + post-normalization hidden-state feedback architecture that significantly improves long-context robustness, acceptance length, and serving stability across real-world inference environments.
Shoutout to @NVIDIA, who has been instrumental in the large-scale training, benchmarking, and inference validation of EAGLE 3.1 to help bring this next step in inference acceleration to production environments.
For EAGLE 3.1, the EAGLE team identified attention drift as a key bottleneck behind deeper-step acceptance-length degradation in speculative decoding.
✨ What's new:
• Up to 2× longer acceptance length in long-context
• Stronger long-context + chat-template robustness
• More stable serving across diverse prompts or environments
• Native vLLM support
• TorchSpec training support
• Open-source Kimi K2.6 EAGLE 3.1 draft model
🔗 Blog: vllm.ai/blog/2026-05-2…
11
37
348
34.9K
PyTorch @PyTorch
@Birchlabs We apologize, the folder sync was in progress due to folder restructuring. It should be visible now!
0
0
2
238
PyTorch @PyTorch
PyTorch member Meta just open-sourced a GPU kernel that makes attention 2.3x faster on NVIDIA Blackwell. TLX Block Attention is a warp-specialized Triton kernel built for block-diagonal self-attention — a pattern widely used in recommendation and feature-interaction models. By exploiting compile-time knowledge of the attention structure, entire stages of the Flash Attention algorithm have been eliminated: no multi-tile loops, no correction factors, no auxiliary tensors. The result: 2.3x kernel speedup, 3.5x when fused with rotary embeddings, and +30.6% MFU on production layers. Learn more here 👉 bit.ly/4fOoh9E Code: bit.ly/4e6SPlK
6
36
265
18K
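The block-diagonal pattern mentioned in the post above means each token attends only to tokens in its own block. The sketch below shows only that attention pattern in plain Python; it is not the TLX kernel and says nothing about warp specialization or Triton. The function names and the toy inputs are assumptions made for illustration.

```python
import math


def softmax(row):
    """Numerically stable softmax over a list of scores."""
    peak = max(row)
    exps = [math.exp(value - peak) for value in row]
    total = sum(exps)
    return [e / total for e in exps]


def attend(q, k, v, allowed):
    """Single-head attention; allowed(i, j) says whether query i may see key j."""
    dim = len(q[0])
    out = []
    for i, query in enumerate(q):
        visible = [j for j in range(len(k)) if allowed(i, j)]
        scores = [
            sum(a * b for a, b in zip(query, k[j])) / math.sqrt(dim) for j in visible
        ]
        weights = softmax(scores)
        out.append(
            [sum(w * v[j][c] for w, j in zip(weights, visible)) for c in range(len(v[0]))]
        )
    return out


def block_diagonal_attention(q, k, v, block):
    """Run full attention independently inside each block of `block` tokens."""
    out = []
    for start in range(0, len(q), block):
        end = start + block
        out.extend(attend(q[start:end], k[start:end], v[start:end], lambda i, j: True))
    return out


x = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.5, -0.5]]
masked = attend(x, x, x, lambda i, j: i // 2 == j // 2)  # full attention + mask
blocked = block_diagonal_attention(x, x, x, block=2)     # per-block attention
assert all(
    math.isclose(a, b, rel_tol=1e-12, abs_tol=1e-12)
    for row_a, row_b in zip(masked, blocked)
    for a, b in zip(row_a, row_b)
)
```

Because the block boundaries are known ahead of time, each block's softmax can be computed in a single pass over a small, fixed set of keys. That is the kind of structural knowledge the post says the kernel exploits to drop multi-tile loops and correction factors.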
PyTorch @PyTorch
Model Optimization and Post-Training Quantization
Model quantization is an effective method to reduce VRAM usage and improve inference performance on consumer devices. By lowering computational and memory requirements while preserving model quality, quantization helps AI models run more efficiently in resource-constrained environments. This post walks through how to use NVIDIA Model Optimizer to quantize a CLIP model in FP8 format with the post-training quantization (PTQ) method, including an example workflow exporting a PyTorch checkpoint.
Read the complete blog post: developer.nvidia.com/blog/model-qua…
2
22
142
12.1K
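The core PTQ loop described in the post above (calibrate a scale from existing weights, quantize, then dequantize at inference) can be sketched in a few lines. Note the substitution: this example uses symmetric int8 with an absolute-maximum scale because it is easy to show in plain Python. It is not FP8, and it does not use the NVIDIA Model Optimizer API. All function names are hypothetical.

```python
def calibrate_scale(samples, qmax=127):
    """Pick one scale per tensor so the largest magnitude maps to qmax."""
    amax = max(abs(x) for x in samples)
    return amax / qmax if amax > 0 else 1.0


def quantize(values, scale, qmax=127):
    """Round to the nearest integer level and clamp to [-qmax, qmax]."""
    return [max(-qmax, min(qmax, round(x / scale))) for x in values]


def dequantize(levels, scale):
    """Map integer levels back to approximate real values."""
    return [level * scale for level in levels]


weights = [0.5, -1.27, 0.003, 1.0]        # stand-in for trained weights
scale = calibrate_scale(weights)          # no retraining: calibration only
levels = quantize(weights, scale)
restored = dequantize(levels, scale)

# Rounding to the nearest level bounds the error by half a quantization step.
worst_error = max(abs(w, ) if False else abs(w - r) for w, r in zip(weights, restored))
assert worst_error <= scale / 2 + 1e-12
print(levels)
```

Storing one small integer per weight plus a single scale is what reduces memory use; the half-step error bound is the price paid in precision.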
PyTorch retweeted
Matt White @matthew_d_white
Back from MLSys 2026 in Bellevue — a packed week at the intersection of AI and systems. Highlights: @marksaroufim on AI writing systems code, Lidong Zhou on “system intelligence,” deep sessions on LLM serving/training, agentic AI, kernels, compilers, edge ML, benchmarking, and Industry Day. Also great to see strong interest at the PyTorch Foundation booth all week. Thank you to everyone who stopped by — and especially to the volunteers who represented the community so well. #MLSys2026 #PyTorch #MLSystems #AIInfrastructure
0
2
15
6.6K