Hayate Iso

39 posts

@iso_map

@nvidia

Santa Clara, CA · Joined September 2023
130 Following · 95 Followers
Hayate Iso retweeted
Bryan Catanzaro (@ctnzr)
Announcing NVIDIA Nemotron 3 Super!
💚 120B-12A Hybrid SSM Latent MoE, designed for Blackwell
💚 36 on AAIndex v4
💚 up to 2.2X faster than GPT-OSS-120B in FP4
💚 Open data, open recipe, open weights
Models, Tech report, etc. here: research.nvidia.com/labs/nemotron/…
And yes, Ultra is coming!
62
205
1.2K
203.3K
Hayate Iso retweeted
NVIDIA AI Developer (@NVIDIAAIDev)
What if you could ask a chatbot a question the size of an entire encyclopedia—and get an answer in real time? Multi-million token queries with 32x more users are now possible with Helix Parallelism, an innovation by #NVIDIAResearch that drives inference at huge scale. 🔗 nvda.ws/4eCXxqh
4
33
146
16.9K
Hayate Iso retweeted
NVIDIA AI Developer (@NVIDIAAIDev)
Inference at scale is pushing the boundaries of what today's systems can handle. One promising direction? #DisaggregatedInference, which splits the serving pipeline into distinct stages like prefill and decode to unlock better performance across the throughput-interactivity spectrum.
While the concept has gained traction and sparked a wave of open source experimentation, real-world deployments remain rare. Why? Because the optimization space is vast, and orchestrating system-level coordination is incredibly complex.
📄 In our latest #NVIDIAResearch, we present a comprehensive systematic study of disaggregated inference at scale, evaluating hundreds of thousands of design points across varied model sizes, traffic patterns, and hardware configurations.
Key findings:
1️⃣ Disaggregation shines in prefill-heavy traffic patterns and with larger models.
2️⃣ Dynamic rate matching and elastic scaling are essential to achieving Pareto-optimal performance.
3️⃣ There's no one-size-fits-all architecture; workload-aware tuning is critical.
Whether you're building production inference infrastructure or exploring #LLM optimization techniques, this work offers actionable insights to balance system throughput with low-latency responsiveness.
🔬 Read the paper and dive into the data. Let's shape the future of scalable #AI serving, together.
📗 research.nvidia.com/publication/20…
0
5
22
2.3K
Hayate Iso retweeted
NVIDIA AI Developer (@NVIDIAAIDev)
👀 New #NVIDIAResearch on boosting MoE model performance with disaggregated serving. Learn how our NVIDIA Dynamo and GB200 NVL72 work together to boost the performance of #AI data centers running MoE models like DeepSeek R1 and the new Llama 4. ⚡ Technical deep dive ➡️ nvda.ws/45eElfW
7
16
61
5.3K
Hayate Iso retweeted
Sajjadur Rahman (@subZero_saj)
📢Excited to bring the DAIS workshop to ICDE'25 (w/ @SainyamGalhotra @FarihaAnna @MikeCafarella @sairamgv) The focus is on the emerging idea of compound AI systems with a specific emphasis on data discovery, interactions w/ data, architectures for #AgenticAI+#LLM, and evaluation.
Quoted post from DAIS@ICDE2025 (@DAIS_workshop):

We are excited to announce the First Workshop on Data-AI Systems@ICDE 2025 (DAIS). The workshop will focus on exploring innovative approaches towards building data-aware compound AI systems in the #LLM era. More details: dais-workshop-icde.github.io #DataManagement #AgenticAI

0
5
9
1K
Hayate Iso (@iso_map)
Introducing 🫧 HoloBench, a new benchmark for measuring LLMs' reasoning capabilities across massive document collections!
- RAG models fetch info well but struggle with multi-doc reasoning.
- HoloBench evaluates how LLMs synthesize and aggregate info to solve complex tasks.
🧵 1/
1
0
3
680
Hayate Iso retweeted
Megagon Labs (@MegagonLabs)
⚡️Are you working with #NLP or #AI-driven products for Natural Language Generation (#NLG)? Task ambiguity is a common pain point and AmbigNLG changes that! AmbigNLG is designed to solve task ambiguity in instructions for NLG. What it is... 👇🧵
1
2
5
591
Hayate Iso retweeted
Ayana Niwa (@ayaniwa1213)
🎉 Our long paper has been accepted for the #EMNLP2024 main conference! In "AmbigNLG," co-authored with @iso_map, we tackle task ambiguity in NLG instructions to better align LLM outputs with your expectations. 📃 Read more: arxiv.org/abs/2402.17717
5
10
123
9.6K
Hayate Iso retweeted
Megagon Labs (@MegagonLabs)
Let’s push the boundaries of #LLMs in text editing tasks! XATU is the first-of-its-kind benchmark that addresses the nuances of text editing, revolutionizing how we edit text with large language models. Don't miss the presentation at #Coling2024! #AI #NLP megagon.ai/xatu-fine-grai…
1
4
8
2K
Hayate Iso retweeted
Pouya Pezeshkpour (@PPezeshkpour)
📢New Preprint📢 LLMs excel at ranking items for retrieval/recommender systems. But, what if we reduce the number of items and instead enforce multiple conditions as ranking instructions? It turns out, LLMs have a long way to go. See our new paper: arxiv.org/abs/2404.00211. 1/n
2
4
28
3.9K