

xonecas
1.2K posts





AI can be personalized to almost anything. But unlocking the nuances of a specific language, sometimes a specific region, is genuinely hard. Most of the open data for these "niche" languages is low-grade, and almost nobody is working on it. @duarteocarmo built Bagaço, a pretraining dataset for European Portuguese, and with it a method anyone can copy for their own language: how to find usable data when there isn't much, how to score and filter it, how to build the evals, how to train a model that speaks one specific dialect instead of a flattened average. While Portugal's government spent €5.5M on its own European Portuguese LLM, Amália, Duarte is doing it solo, fully in the open. At LISBON AI he'll run the pipeline live and you can judge who did it better. Come to Lisbon on 23–24 September and watch Duarte teach a machine to speak Portuguese with a dataset named after Portuguese moonshine. That's exactly what this conference is for.




🎙️ State of Agentic Coding with @mitsuhiko returns for Episode 7
In this pre-Fable 5 episode:
- looping?
- no more human-designed programming languages?
- the state of local coding models
- the marketing of token-maxing, and more




@mjovanovictech Do you think everyone uses LLMs for coding? Most people do trivial tasks like writing essays, emails, and Excel work, and for that local models are more than sufficient. Local LLMs took a back seat because of the RAM shortage; otherwise 128 GB unified-memory systems would be common by now.





This was fixed. You know what's coming 👀 Give us 24 hours to reset the Codex rate limits across all plans.

📣 Introducing the Qwen-Robot Suite: Qwen-RobotNav, Qwen-RobotManip, Qwen-RobotWorld. Three foundation models, a full stack for embodied intelligence.

🧭 Qwen-RobotNav: the gateway to mobility.
• Unifies 5 navigation tasks in one model: instruction following, point-goal, object-goal, target tracking, autonomous driving
• Controllable observation protocol
• Tool interface for agentic systems

🤖 Qwen-RobotManip: the foundation of interaction.
• Unified state-action space across heterogeneous robots
• Camera-frame delta poses for coherent cross-embodiment training
• Pretrained on a 38,100+ hour open-source corpus

🌍 Qwen-RobotWorld: infinite worlds for physical agents.
• Single world model, 20+ embodiments
• Natural-language action interface
• Predicts physically grounded futures across manipulation, driving, and navigation

Each model is independently useful, and could be composed as physical-world tools. Together, they form the low-level toolkit for general-purpose agentic systems that don't just see the world, but act in it.

📷 Blog: qwen.ai/blog?id=qwen-r…
📖 Report:
Qwen-RobotNav: …anwen-res.oss-accelerate.aliyuncs.com/qwenrobot/pape…
Qwen-RobotManip: …anwen-res.oss-accelerate.aliyuncs.com/qwenrobot/pape…
Qwen-RobotWorld: …anwen-res.oss-accelerate.aliyuncs.com/qwenrobot/pape…


SpaceXAI and @cursor_ai are now working closely together to create the world’s best coding and knowledge-work AI. The combination of Cursor’s leading product and distribution to expert software engineers with SpaceX’s million-H100-equivalent Colossus training supercomputer will allow us to build the world’s most useful models. Cursor has also given SpaceX the right to acquire Cursor later this year for $60 billion or pay $10 billion for our work together.


