Anton

62.5K posts

@AntonAlexander

Senior Generative AI Specialist at @awscloud | Helping startups & enterprises train large-scale models & optimize inferencing. Founders—DM to connect!

Washington, DC · Joined March 2009

1.7K Following · 43.9K Followers

Pinned Tweet

Anton@AntonAlexander·12 Mar

What a future 🇹🇹 🇺🇸 MIT freshman looks like. Top of the class, Mandarin speaking, JavaScript coding, track & field ⭐️, and fashionista. Going to one of the top 10 middle schools in USA

169

2.8K

Anton retweeted

Computer@AskPerplexity·15 Dec

BREAKING: NVIDIA just dropped an open 30B model that beats GPT-OSS and Qwen3-30B, and runs 2.2–3.3× faster.

Nemotron 3 Nano:
• Up to 1M-token context
• MoE: 31.6B total params, 3.6B active
• Best-in-class performance for SWE-Bench
• Open weights + training recipe + redistributable datasets

You can run the model locally with 24GB RAM.

317

2.7K

206.9K

Anton@AntonAlexander·28 Oct

🚀 New blog: Building custom LLMs for public sector on AWS Learn how governments can develop national & domain-specific language models that meet sovereignty, compliance & cultural requirements. Full 6-stage development guide 👇 aws.amazon.com/blogs/publicse… #AWS #AI #PublicSector #LLM #MachineLearning

273

Anton@AntonAlexander·27 Oct

@yangzhouy Nice can talk

Yang Zhou@yangzhouy·27 Oct

👀 Wanna run DeepSeek MoE models on AWS Cloud with DeepEP!!! 1/ 🚀 Introducing UCCL-EP: A portable, efficient Expert Parallelism framework that brings DeepEP-level GPU-driven communication with the same APIs to any cloud or hardware — AWS EFA, AMD GPUs, Broadcom NICs and beyond.

404

Anton@AntonAlexander·3 Oct

@HiroNishikawa The best book ever

1.2K

Hiroaki Nishikawa@HiroNishikawa·2 Oct

"Introduction to Algorithms", Thomas H. Cormen. Charles E. Leiserson. Ronald L. Rivest. One of the books at NIA inherited from ICASE, NASA Langley (1972-2002). I've used this book when I needed to speed up an agglomeration algorithm for FUN3D in around 2008; I ended up implementing the heap sort.

144

1.4K

63.5K

Anton@AntonAlexander·3 Oct

@yangzhouy @awscloud Got deepep working on efa

Yang Zhou@yangzhouy·12 Sep

@AntonAlexander @awscloud Kernel crash causes the node unable to ssm into, so I cannot sudo reboot…

Yang Zhou@yangzhouy·12 Sep

Any way to reboot GPU VMs inside a HyperPod Slurm-managed cluster on AWS? @awscloud There are literally no HyperPod instances appearing in my EC2 page, so I cannot reboot (to fix a kernel crash). Anyone have ideas?

246

Anton retweeted

Sami Khan@ibnAmjid·2 Oct

had a sudden urge to write a neural network from scratch in C++ using MLX

1.4K

84.2K

Anton retweeted

Laura 🌲 ⛰️@LauraDeming·1 Oct

this 17 year old homeschooled girl refuted a conjecture that was unsolved for 40 years, and which professional mathematicians worked on for years without solving

she was rejected from most graduate programs she applied to, because she did not have a degree

488

710

11.3K

Anton@AntonAlexander·2 Oct

Spent the last three days hacking DeepEP to work on EFA and I made it happen.

9.1K

Anton retweeted

Tom Dörr@tom_doerr·20 Sep

lets you build C/C++ apps that run on Linux, Mac, Windows, BSD, and BIOS

827

35.9K

Anton retweeted

AWS AI@AWSAI·25 Jul

Learn how @nvidia Dynamo can be quickly setup & seamlessly deployed using Amazon EKS for automated scaling & simplified Kubernetes management ⚡💡🔧 NVIDIA Dynamo supports #AWS services such as Amazon S3, Amazon EFA & Amazon EKS. 👉 go.aws/3IK98YH

934

Anton@AntonAlexander·12 Sep

@yangzhouy @awscloud Go into the HyperPod console, scale down the node, and then scale it back up. You can send me your email and I can show you.

Anton@AntonAlexander·12 Sep

@yangzhouy @awscloud Contact me

Anton@AntonAlexander·5 Sep

Grateful to be featured on AWS for AI Podcast (Ep. 8)! 🎙️ I shared insights on training foundation models at scale + my journey from 🇹🇹 to AWS. Would love if you could watch, like & comment to support! 🙌 youtu.be/i95xUdpy0qQ?si…

YouTube

261

Anton retweeted

Gill Verdon@GillVerd·6 Aug

First ever thermodynamic computer was put online internally today. Soon to be accessed by our first customers. So excited for things to come.

133

110

1.4K

142K

Anton retweeted

Drishan Arora@drishanarora·31 Jul

Today, we are releasing 4 hybrid reasoning models of sizes 70B, 109B MoE, 405B, 671B MoE under open license. These are some of the strongest LLMs in the world, and serve as a proof of concept for a novel AI paradigm - iterative self-improvement (AI systems improving themselves). The largest 671B MoE model is amongst the strongest open models in the world. It matches/exceeds the performance of the latest DeepSeek v3 and DeepSeek R1 models both, and approaches closed frontier models like o3 and Claude 4 Opus.