NVIDIA AI (@NVIDIAAI) - Twitter Profile

Pinned Tweet

NVIDIA AI@NVIDIAAI·Jun 4

Today we're shipping Nemotron 3 Ultra. A 550B MoE frontier-intelligence open model built for long-running agents. It delivers 5x faster inference and lowers the cost of complex agentic tasks by up to 30% versus other open frontier models.

English

199

463

3.5K

1.2M

NVIDIA AI retweeted

Alex Cheema@alexocheema·1d

What are people using to run local inference on DGX Spark? Ollama? LM Studio? vLLM? SGLang? TRT-LLM? Do you run in Docker? @TheAhmadOsman @AlicanKiraz0 @spark_arena @NVIDIAAI @NVIDIARTXSpark

English

64

7

145

43.6K

NVIDIA AI@NVIDIAAI·1d

@AcceleratedMu3n 😍

2

0

38

2.7K

acc-mu3n@AcceleratedMu3n·2d

(Only for a moment, but) it's grown to four units 👏

Japanese

8

6

160

23.4K

NVIDIA AI@NVIDIAAI·1d

Congrats to the @MiniMax_AI team on the release of MiniMax M3, a long-context multimodal model for text, image, and video reasoning. 🙌 Try it today with our free GPU-accelerated endpoint on build.nvidia.com. Details: nvda.ws/4v4BWhD

MiniMax (official)@MiniMax_AI

MiniMax M3, Open-Weight, Now On Hugging Face, with only ~428B parameters and ~23B activated parameters. Weights: huggingface.co/MiniMaxAI/Mini… MiniMax Sparse Attention: huggingface.co/papers/2606.13…

English

51

116

1.3K

134.2K

NVIDIA AI@NVIDIAAI·1d

@_inception_ai @augmentcode 🔥 Congrats to the @_inception_ai and @baseten teams!

English

1

0

13

1.4K

Inception@_inception_ai·2d

The fastest reasoning LLM is now in production on Baseten. Mercury 2 is a diffusion LLM, so it generates tokens in parallel and hits 1,000+ tokens/sec on @NVIDIAAI GPUs, speeds that used to require specialized hardware. @augmentcode is already using Mercury 2, cutting cost 90% and latency 82%. Proud to partner with the @baseten team to bring dLLMs to production.

Baseten@baseten

We are excited to announce that we have partnered with @_inception_ai to make Mercury 2 available on Baseten. This makes us the first inference platform to bring Inception’s diffusion LLM to production. Inception’s dLLM architecture fixes the bottlenecks of sequential token generation and can deliver 1,000+ tokens/sec on standard NVIDIA GPUs. Early users like @augmentcode have seen impressive results, such as an 82% reduction in latency and 90% cost savings, while maintaining high quality.

English

5

11

113

12K

NVIDIA AI@NVIDIAAI·2d

@slashreboot 👀

0

1

16

1.2K

Matthew@slashreboot·2d

A peek behind the curtain. I really need to dust my home lab...

English

5

1

32

1.9K

NVIDIA AI@NVIDIAAI·2d

@david_nix 🦾

4

0

34

2.1K

David Nix@david_nix·2d

Absolute monster of a GPU. Pictures online don't do it justice.

English

15

1

84

8K

NVIDIA AI@NVIDIAAI·2d

@xyz2maureen 🔥

2

1

20

2.2K

NVIDIA AI@NVIDIAAI·2d

@recursive_wave 🤣 you'll still make people jealous with those stacks though

English

1

0

1

77

NVIDIA AI@NVIDIAAI·2d

@Smallzero it's a workhorse!

English

1

123

NVIDIA AI@NVIDIAAI·2d

@knaidu78 super clean

English

0

1

86

Kamlesh Naidu@knaidu78·2d

@NVIDIAAI Running Open WebUI, SearXNG, Hermes agent, and local RAG with Qwen3.6-35B-A3B.

English

1

0

2

122

NVIDIA AI@NVIDIAAI·3d

Love seeing photos like this. Let's see some more setups. Curious what everyone's running these days. What models are you using and what are you building?

Alican Kiraz@AlicanKiraz0

4x Nvidia GB10 128GB, 400G QSFP-DD Switch, 2x QSFP-DD 400G to 2x200G QSFP cable and @NVIDIAAI Magic🔥🦾

English

69

29

684

50.5K

NVIDIA AI@NVIDIAAI·2d

@TingwuWang 🙌

0

7

945

tingwu.wang@TingwuWang·2d

We are also preparing an interactive session for our UE5 demo at SIGGRAPH 2026 in Los Angeles 🌴🌴🌴. See you there! 🚀🚀🚀

NVIDIA AI@NVIDIAAI

One open model. 350,000+ motion clips. 15,000 FPS. MotionBricks from NVIDIA Research runs real-time character animation at scale, without hand-crafted transitions or fine-tuning. And yes, it works for robotics too. #SIGGRAPH2026 paper, demos + code: nvlabs.github.io/motionbricks

English

1

0

13

2K

NVIDIA AI@NVIDIAAI·2d

NVIDIA is coming to #SIGGRAPH2026 in Los Angeles 🌴 Neural rendering, world models, physical AI, hands-on labs, and more. All the details 👉 nvidia.com/en-us/events/s…

English

2

5

49

6.2K

NVIDIA AI@NVIDIAAI·2d

One open model. 350,000+ motion clips. 15,000 FPS. MotionBricks from NVIDIA Research runs real-time character animation at scale, without hand-crafted transitions or fine-tuning. And yes, it works for robotics too. #SIGGRAPH2026 paper, demos + code: nvlabs.github.io/motionbricks

English

36

129

1.1K

98.9K

NVIDIA AI@NVIDIAAI·2d

Shoutout to Caleb for putting together a great deep dive on Nemotron 3 🙌 Check it out.

Caleb Eom@calebfoundry

Nemotron 3 Full Breakdown. With the help of Joey Conway from @NVIDIAAI, getting into the specifics around why Nemotron 3 is kind of a big deal. Biggest headline with Nemotron is: Hybrid Mamba Transformer, Latent MoE, and MTP. Hybrid Mamba Transformer essentially attacks right at the attention mechanism to make the overhead sub-quadratic, but unlike quantizing the KV cache or swapping out attention heads, NVIDIA chose Mamba-2. Latent MoE helps further optimize on sparsity by down-projecting the dimensions so you're doing less math and less memory movement between HBM and SRAM. You're saving a ton, and NVIDIA made a conscious choice to add more experts given the surplus. Finally, MTP, or multi-token prediction, where the model can see future tokens to be more expressive in training, with the option to use it for speculative decoding during inference. Oh, also the model adopts the new OpenMDW 1.1 License.

English

13

27

322

29.3K

NVIDIA AI@NVIDIAAI·2d

Generate Synthetic Data for Physical AI With NVIDIA Brev Launchables and Agent Skills x.com/i/broadcasts/1…

English

7

11

82

5.9K

NVIDIA AI@NVIDIAAI·2d

@ztnimbus @claudeai @pollenrobotics 😍

6

0

24

6.4K

Nimbus@ztnimbus·3d

Fable-tailored Reachy running locally on a DGX Spark cluster is magical. 🌠 @NVIDIAAI @claudeai @pollenrobotics

English

32

2

32

6K

NVIDIA AI@NVIDIAAI·2d

@calebfoundry 🔥 Joey is the best! Thanks for sharing.

English

0

22

1.8K

Caleb Eom@calebfoundry·3d

Nemotron 3 Full Breakdown. With the help of Joey Conway from @NVIDIAAI, getting into the specifics around why Nemotron 3 is kind of a big deal. Biggest headline with Nemotron is: Hybrid Mamba Transformer, Latent MoE, and MTP. Hybrid Mamba Transformer essentially attacks right at the attention mechanism to make the overhead sub-quadratic, but unlike quantizing the KV cache or swapping out attention heads, NVIDIA chose Mamba-2. Latent MoE helps further optimize on sparsity by down-projecting the dimensions so you're doing less math and less memory movement between HBM and SRAM. You're saving a ton, and NVIDIA made a conscious choice to add more experts given the surplus. Finally, MTP, or multi-token prediction, where the model can see future tokens to be more expressive in training, with the option to use it for speculative decoding during inference. Oh, also the model adopts the new OpenMDW 1.1 License.

English

5

19

153

47.3K

NVIDIA AI@NVIDIAAI·2d

@hashsriram Great setup! Thanks for sharing. 💚

English

0

20

2.9K

Sriram Sivakumar@hashsriram·2d

Dual GB10 cluster @NVIDIAAI

English

22

17

470

46.4K

NVIDIA AI@NVIDIAAI·2d

@Soveryn_AI Nice! Keep us posted once the Spark lands on your doorstep.

English

1

0

2

142

Soveryn Intelligence@Soveryn_AI·2d

@NVIDIAAI Waiting for my Spark to arrive. I am tying it to my main server, which has 3 GPUs with 144 GB of VRAM. Should be epic.

English

1

0

1

173

NVIDIA AI@NVIDIAAI·2d

@_rockt @Recursive_SI @karpathy 🙌

0

1

8

901

Tim Rocktäschel@_rockt·2d

Excited to show results of the first steps towards automated AI research at @Recursive_SI. The same general system achieved state of the art on @NVIDIAAI's SOL-ExecBench GPU Kernel Optimization, nanoGPT Speedrun, and @karpathy's NanoChat autoresearch benchmarks.

English

6

12

133

9K
