Michael Xu

7 posts

@MichaelXu25

NLP & Computational Linguistics Researcher

Hangzhou, China · Joined February 2019
16 Following · 2 Followers
Shunyu Yao @ShunyuYao14
Not the best, but a better model is in line. You can have a taste. This time, more on user experience, less on numbers. blog.google/innovation-and…
10
2
36
1.3K
F. Güney @ftm_guney
Last Wednesday we were at the Parlar Vakfı award ceremony in Ankara. Together with Emre, Erdal, Ayşegül, and many other dear friends, we received the incentive award given to young scientists. On Friday we met with the Vehbi Koç Vakfı scholarship recipients. We met very bright students from fields ranging from maritime studies to nursing and talked about AI. The students asked how I manage to be so energetic. In truth I was dying of exhaustion, but when you see young people who have dreams and are running toward them at full speed, you find that energy; you want to listen to the difficulties they face and share in them. Because truly, "all our hope is in the youth." 👌🏾💯
F. Güney tweet media
1
0
26
1.2K
Michael Xu @MichaelXu25
@LBunzel @StefanoErmon @StartupGrind Fascinating keynote. The framing of efficiency, rather than raw capability alone, as a central constraint for high-volume agentic workloads is very compelling. Was the talk recorded? I would be very interested in watching the full keynote.
1
0
1
64
Lucas Bunzel @LBunzel
"Beyond autoregressive: why diffusion is the future of language models" @StefanoErmon's keynote at @startupgrind yesterday. Fully packed Fox Theatre. Mercury 2 is hitting >1,000 tok/sec on standard GPUs at a fraction of the cost, comparable quality to frontier speed-optimized models. Diffusion. Parallel token generation. His closing line: the question isn't which model is smartest, it's which model is most efficient, without sacrificing quality, on the highest-volume tasks. When agents make 50 LLM calls per task, latency is the product. @_inception_ai
Lucas Bunzel tweet media
2
11
34
5.2K
Yu Yang @YuYang_i
Sharing a little late update (before it’s no longer news): I wrapped up my PhD at the end of last year and recently joined @OpenAI’s reasoning team 🍓✨!
117
29
2.2K
200.3K
DeepSeek @deepseek_ai
🚀 Day 5 of #OpenSourceWeek: 3FS, Thruster for All DeepSeek Data Access
Fire-Flyer File System (3FS) - a parallel file system that utilizes the full bandwidth of modern SSDs and RDMA networks.
⚡ 6.6 TiB/s aggregate read throughput in a 180-node cluster
⚡ 3.66 TiB/min throughput on GraySort benchmark in a 25-node cluster
⚡ 40+ GiB/s peak throughput per client node for KVCache lookup
🧬 Disaggregated architecture with strong consistency semantics
✅ Training data preprocessing, dataset loading, checkpoint saving/reloading, embedding vector search & KVCache lookups for inference in V3/R1
📥 3FS → github.com/deepseek-ai/3FS
⛲ Smallpond - data processing framework on 3FS → github.com/deepseek-ai/sm…
523
1.2K
10.2K
3.2M
Michael Xu @MichaelXu25
@Xianbao_QIAN Sorry, I just got online. Where did this picture come from? 🤣
1
0
1
361
Tiezhen WANG @Xianbao_QIAN
Wait... What....
Tiezhen WANG tweet media
18
25
326
92K