Anton

62.5K posts

@AntonAlexander

Senior Generative AI Specialist at @awscloud | Helping startups & enterprises train large-scale models & optimize inferencing. Founders—DM to connect!

Washington, DC · Joined March 2009
1.7K Following · 43.9K Followers
Pinned Tweet
Anton @AntonAlexander
What a future 🇹🇹 🇺🇸 MIT freshman looks like. Top of the class, Mandarin speaking, JavaScript coding, track & field ⭐️, and fashionista. Going to one of the top 10 middle schools in USA
30
169
2.8K
0
Anton retweeted
Computer @AskPerplexity
BREAKING: NVIDIA just dropped an open 30B model that beats GPT-OSS and Qwen3-30B, and runs 2.2–3.3× faster
Nemotron 3 Nano:
• Up to 1M-token context
• MoE: 31.6B total params, 3.6B active
• Best-in-class performance for SWE-Bench
• Open weights + training recipe + redistributable datasets
You can run the model locally with 24GB RAM.
84
317
2.7K
206.9K
Yang Zhou @yangzhouy
👀 Wanna run DeepSeek MoE models on AWS Cloud with DeepEP!!! 1/ 🚀 Introducing UCCL-EP: A portable, efficient Expert Parallelism framework that brings DeepEP-level GPU-driven communication with the same APIs to any cloud or hardware — AWS EFA, AMD GPUs, Broadcom NICs and beyond.
2
0
4
404
Hiroaki Nishikawa @HiroNishikawa
"Introduction to Algorithms" by Thomas H. Cormen, Charles E. Leiserson, and Ronald L. Rivest. One of the books at NIA inherited from ICASE, NASA Langley (1972-2002). I used this book around 2008 when I needed to speed up an agglomeration algorithm for FUN3D; I ended up implementing heap sort.
16
144
1.4K
63.5K
Yang Zhou @yangzhouy
Any way to reboot GPU VMs inside a Slurm-managed HyperPod cluster on AWS? @awscloud There are literally no HyperPod instances appearing on my EC2 page, so I cannot reboot them (to fix a kernel crash). Anyone have ideas?
3
0
0
246
Anton retweeted
Sami Khan @ibnAmjid
had a sudden urge to write a neural network from scratch in C++ using MLX
46
54
1.4K
84.2K
Anton retweeted
Laura 🌲 ⛰️ @LauraDeming
this 17-year-old homeschooled girl refuted a conjecture that was unsolved for 40 years, and which professional mathematicians worked on for years without solving
she was rejected from most graduate programs she applied to, because she did not have a degree
488
710
11.3K
2M
Anton @AntonAlexander
Spent the last three days hacking DeepEP to work on EFA and I made it happen.
0
0
14
9.1K
Anton retweeted
Tom Dörr @tom_doerr
lets you build C/C++ apps that run on Linux, Mac, Windows, BSD, and BIOS
7
75
827
35.9K
Anton retweeted
AWS AI @AWSAI
Learn how @nvidia Dynamo can be quickly set up & seamlessly deployed using Amazon EKS for automated scaling & simplified Kubernetes management ⚡💡🔧 NVIDIA Dynamo supports #AWS services such as Amazon S3, Amazon EFA & Amazon EKS. 👉 go.aws/3IK98YH
2
4
9
934
Anton @AntonAlexander
@yangzhouy @awscloud Go into the HyperPod console, scale the node down, and then scale it back up. Send me your email and I can show you.
0
0
0
36
Anton @AntonAlexander
Grateful to be featured on AWS for AI Podcast (Ep. 8)! 🎙️ I shared insights on training foundation models at scale + my journey from 🇹🇹 to AWS. Would love if you could watch, like & comment to support! 🙌 youtu.be/i95xUdpy0qQ?si…
0
0
5
261
Anton retweeted
Gill Verdon @GillVerd
First ever thermodynamic computer was put online internally today. Soon to be accessed by our first customers. So excited for things to come.
133
110
1.4K
142K
Anton retweeted
Drishan Arora @drishanarora
Today, we are releasing 4 hybrid reasoning models of sizes 70B, 109B MoE, 405B, 671B MoE under open license. These are some of the strongest LLMs in the world, and serve as a proof of concept for a novel AI paradigm - iterative self-improvement (AI systems improving themselves). The largest 671B MoE model is amongst the strongest open models in the world. It matches/exceeds the performance of the latest DeepSeek v3 and DeepSeek R1 models both, and approaches closed frontier models like o3 and Claude 4 Opus.
43
257
2K
451.7K
Anton retweeted
tim. @ygg0f
lock in. no one else's gonna do it.
40
974
9.3K
424.5K
Anton retweeted
Hiroaki Nishikawa @HiroNishikawa
Verification and Validation in Computational Science and Engineering by Patrick J. Roache.
2
112
984
33.3K
Anton retweeted
ₕₐₘₚₜₒₙ @hamptonism
Probability Theory and Mathematical Statistics:
9
47
559
26.2K
Anton retweeted
Piotr Mazurek in SF 🌉 @tugot17
I solved every single problem in the CUDA mode book. A quick thread summarizing this experience and what I learned 1/x
31
237
2.4K
283.9K