Varad Pimpalkhute
@varad0309
RS @ IFM | Prev @Articul8_AI @AmazonScience @allen_ai | MS CS @UMassAmherst. Towards super intelligence, One Algorithm at a Time.

Can language models learn useful priors without ever seeing language? We pre-pre-train transformers on neural cellular automata — fully synthetic, zero language. This improves language modeling by up to 6%, speeds up convergence by 40%, and strengthens downstream reasoning. Surprisingly, it even beats pre-pre-training on natural text! Blog: hanseungwook.github.io/blog/nca-pre-p… (1/n)

Introducing n1 — Yutori’s browser-use model. Available today via our API. If you’ve been using Claude, Gemini or OpenAI’s computer use models for browser automation — you should switch to Yutori’s n1. It is more accurate, significantly cheaper, and a drop-in replacement.

🚀 New Blog: INT4 Quantization-Aware Training (QAT) is fired up! Inspired by the Kimi K2 team, our SGLang RL team shipped an end-to-end INT4 Quantization-Aware Training (QAT) pipeline that achieved BF16-level stability & train–infer consistency with:
📙 Fake quant during training + real W4A16 at inference
💻 INT4 compression shrinks ~1TB-scale models to fit on a single H200 GPU
💡 Single-node rollout: no cross-node synchronization and communication overhead, so we have faster, more stable RL sampling
More to come: speeding up QAT on the training side, exploring FP4 RL on NVIDIA Blackwell and beyond.

Please welcome K2 Think V2, our first fully sovereign 70B reasoning model. Built on the K2-V2 base, this release bridges the gap between community-owned AI and proprietary models. About K2 Think V2:
🧠 70B parameters, RLVR-tuned
🛡️ 100% Sovereign (IFM-curated data only)
🔓 Fully Open (Pre-training to Post-training)
💡 Top-tier Openness & Intelligence
