

Dado





I'm really not a fan of Skills in agentic AI systems. They add unnecessary complexity and open up a whole new set of problems, especially once Skills can be downloaded and suddenly need versioning, trust, compatibility, and package management. I understand why they are useful right now. But the more general these AI systems become, the less they should need this extra layer.

New CursorBench results just dropped. Two big takeaways. Composer 2.5 is way better than most people think. 63.2% score at $0.55 per task. Nearly matching Opus 4.7 Max and GPT 5.5 Extra High at roughly one-twentieth the cost. This is insane value. Gemini 3.5 Flash is #10 at 49.8%. Below GPT 5.5 Low. Below Opus 4.7 Low. Google's newest model can't even beat budget-tier competition. Composer 2.5 is the sleeper. Gemini 3.5 Flash is the disappointment.







For Qwen3.7-Max, we have invested far more compute into RL training than ever before. Its top-tier AA score confirms the resulting general and agentic capabilities. This is just the start. We will firmly push forward RL scaling to build more powerful Qwen models. Stay tuned!




‼️UPDATE: Pixal3D is now under the MIT License. We hope this makes Pixal3D easier to use, build upon, and adopt for broader research and applications. Thanks again for your support, and feel free to let us know your feedback!




🚀🚀 Introducing Pixal3D (SIGGRAPH’26): a new pixel-aligned image-to-3D generation paradigm for high-fidelity 3D asset creation. Today’s image-to-3D has become pretty good at producing plausible 3D assets. But plausibility is not enough. Fidelity is a hidden bottleneck. ❓A generated model may look “about right,” yet still fail to truly align with the input pixels. Can we make 3D generation as faithful as reconstruction, while still allowing it to complete the unseen? Pixal3D is our answer. 💡We believe the core bottleneck behind fidelity is 2D–3D correspondence. Most 3D-native generators synthesize shapes in canonical space and inject image cues through cross-attention, forcing the model to implicitly search for which pixels correspond to which 3D regions. 🍀Pixal3D takes a different route. Instead of generating in canonical space, Pixal3D generates directly in pixel-aligned camera space: what you see is what you get. The generated 3D asset is aligned with the input view from the start. ☕️Meanwhile, Pixal3D introduces a back-projection-based image conditioning scheme that explicitly back-projects multi-scale pixel features into 3D voxels, thus resolving the 2D–3D association problem. The input image is no longer just a prompt; it becomes a geometric anchor. 🚩Pixal3D shows that pixel-aligned 3D generation is not only feasible and scalable, but also significantly improves fidelity, pushing 3D-native generation closer to reconstruction-level faithfulness. It also naturally extends to multi-view and scene-level 3D generation. ✅Faithful to the input view. ✅Generative for the unseen. Closer to reconstruction-level fidelity, with the creativity of 3D generation. Pixal3D also represents an effort towards the unification of 3D generation and reconstruction. 📢Paper, code, and demo are fully released. Try it out and let us know your feedback!
🌐Project page: ldyang694.github.io/projects/pixal… 🤗Huggingface Demo: huggingface.co/spaces/Tencent… 💻Code: github.com/TencentARC/Pix… 📄Paper: arxiv.org/abs/2605.10922
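
For readers who want a feel for what "back-projecting pixel features into 3D voxels" could look like, here is a minimal pure-Python sketch. It is not the Pixal3D implementation: the function names, the pinhole camera model, the nearest-pixel sampling, and the way intrinsics are rescaled per pyramid level are all assumptions made for illustration.

```python
import math
from typing import List, Sequence, Tuple

Vec = List[float]
FeatureMap = List[List[Vec]]  # indexed as [row][col] -> feature vector


def back_project(feature_map: FeatureMap,
                 voxel_centers: Sequence[Tuple[float, float, float]],
                 fx: float, fy: float, cx: float, cy: float) -> List[Vec]:
    """Give each camera-space voxel the feature of the pixel it projects to.

    Hypothetical helper. Assumes a pinhole camera looking down +z and the
    convention that pixel (row, col) covers [col, col+1) x [row, row+1).
    Voxels behind the camera or outside the image receive a zero vector.
    """
    height, width = len(feature_map), len(feature_map[0])
    channels = len(feature_map[0][0])
    out: List[Vec] = []
    for x, y, z in voxel_centers:
        if z <= 0:
            out.append([0.0] * channels)
            continue
        # Pinhole projection of the voxel center onto the image plane.
        u = fx * x / z + cx
        v = fy * y / z + cy
        col, row = math.floor(u), math.floor(v)
        if 0 <= row < height and 0 <= col < width:
            out.append(list(feature_map[row][col]))
        else:
            out.append([0.0] * channels)
    return out


def back_project_multiscale(pyramid: Sequence[Tuple[FeatureMap, float]],
                            voxel_centers: Sequence[Tuple[float, float, float]],
                            fx: float, fy: float, cx: float, cy: float) -> List[Vec]:
    """Concatenate back-projected features from several pyramid levels.

    Each pyramid entry is (feature_map, downsample_factor). Intrinsics are
    divided by the factor, which is an assumption tied to the pixel
    convention used in back_project.
    """
    per_scale = [
        back_project(fm, voxel_centers, fx / s, fy / s, cx / s, cy / s)
        for fm, s in pyramid
    ]
    fused: List[Vec] = []
    for feats in zip(*per_scale):
        merged: Vec = []
        for f in feats:
            merged.extend(f)
        fused.append(merged)
    return fused
```

Because every voxel looks up its feature through the camera geometry, the pixel-to-voxel association is fixed by construction rather than learned through attention, which is the intuition the post describes. A real system would presumably use bilinear sampling and tensor operations instead of Python loops.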




1/5 TL;DR: We used Codex to discover and maintain heuristic learning for hard fluid dynamics control cases. I’ve been applying DRL and GNNs to physics since 2019, and over the past 3 months I’ve been toying with the idea of using agents in our processes. Inspired by the blog post from @Trinkle23897, I decided to use the same strategy and have agents find readable control strategies. This means a lot to our field, where interpretability can be key for industry.