
Introducing Gemini 3.1 Flash Live, our new realtime model for building voice and vision agents!! We've spent more than a year improving the model + infra + experience. The result? A step-function improvement in quality, reliability, and latency.

When @ashtom goes off to start something new, you pay attention. It could end up signaling the future of how developers and AI will work together. @EntireHQ is a developer-first AI platform where humans and AI agents can truly collaborate to build, learn, and evolve together. It's a vision that goes far beyond being a place to store code. We designed the @EntireHQ logo to humanize AI: we gave the logomark a face, then brought it to life as a mascot with an entire behavioral system of its own. A friendly robot that embodies the developer-first mindset, making cutting-edge tech feel approachable and relatable, and less like impenetrable science fiction. Sometimes the best way to introduce the future is to make it smile back at you.

Beep, boop. Come in, rebels. We've raised a $60M seed round to build the next developer platform. Open. Scalable. Independent. And we ship our first OSS release today. entire.io/blog/hello-ent…


We're excited to announce the release and open-sourcing of HunyuanImage 3.0, the largest and most powerful open-source text-to-image model to date, with over 80 billion total parameters, of which 13 billion are activated per token during inference. Its quality is comparable to the industry's flagship closed-source models. 🚀🚀🚀

HunyuanImage 3.0 originates from our internally developed native multimodal large language model, fine-tuned and post-trained for text-to-image generation. This unique foundation gives the model a powerful set of capabilities:
✅ Reason with world knowledge
✅ Understand complex, thousand-word prompts
✅ Generate precise text within images

Unlike traditional DiT-architecture image generation models, HunyuanImage 3.0's MoE architecture uses a Transfusion-based approach to deeply couple Diffusion and LLM training into a single, powerful system. Built on Hunyuan-A13B, HunyuanImage 3.0 was trained on a massive dataset: 5 billion image-text pairs, video frames, interleaved image-text data, and 6 trillion tokens of text corpora. This hybrid training across multimodal generation, understanding, and LLM capabilities allows the model to seamlessly integrate multiple tasks.

Whether you're an illustrator, designer, or creator, this is built to cut your workflow from hours to minutes. HunyuanImage 3.0 can generate intricate text, detailed comics, expressive emojis, and lively, engaging illustrations for educational content. The current release focuses solely on text-to-image generation; future updates will add image-to-image, image editing, multi-turn interaction, and more.

👉🏻 Try it now: hunyuan.tencent.com/image
🔗 GitHub: github.com/Tencent-Hunyua…
🤗 Hugging Face: huggingface.co/tencent/Hunyua…
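As a back-of-the-envelope illustration of the MoE numbers in the announcement (80B total parameters, 13B activated per token), here's a small sketch of what "sparse activation" means for per-token compute versus resident memory. The figures come from the post above; the helper function is purely illustrative.

```python
def moe_active_fraction(total_params_b: float, active_params_b: float) -> float:
    """Fraction of parameters actually used per token in an MoE model."""
    return active_params_b / total_params_b

# HunyuanImage 3.0: ~80B total parameters, ~13B activated per token.
frac = moe_active_fraction(80.0, 13.0)
print(f"Active per token: {frac:.1%}")  # ~16.2% of weights touched per token

# All 80B weights must still be resident for inference; at 2 bytes/param
# (bf16) that is roughly 160 GB before activations and caches.
weights_gb = 80e9 * 2 / 1e9
print(f"Approx. bf16 weight memory: {weights_gb:.0f} GB")
```

The point of the sketch: MoE buys roughly dense-13B compute cost per token while still requiring dense-80B memory capacity.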


🚀 LongCat-Flash-Chat Launches!
▫️ 560B Total Params | 18.6B-31.3B Dynamic Activation
▫️ Trained on 20T Tokens | 100+ tokens/sec Inference
▫️ High Performance: TerminalBench 39.5 | τ²-Bench 67.7
🔗 Model: huggingface.co/meituan-longca…
💻 Try Now: longcat.ai
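To put the dynamic-activation range above in perspective, a quick sketch using only the announced figures (560B total, 18.6B-31.3B active per token); the function name is just for illustration.

```python
def activation_range_pct(total_b: float, low_b: float, high_b: float) -> tuple[float, float]:
    """Share of total parameters active per token, as percentages."""
    return 100 * low_b / total_b, 100 * high_b / total_b

# LongCat-Flash-Chat: per-token activation varies dynamically with the input.
lo, hi = activation_range_pct(560.0, 18.6, 31.3)
print(f"Active per token: {lo:.1f}% to {hi:.1f}% of 560B total")
```

So each token exercises only a few percent of the network, which is how a 560B model can sustain 100+ tokens/sec inference.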


BTW, I've basically stopped using Opus entirely, and I now have several Codex tabs with GPT-5-high working on different tasks across the 3 codebases (HVM, Bend, Kolmo). Progress has never been so intense. My job now is basically passing well-specified tasks to Codex and reviewing its outputs. OpenAI isn't paying me and couldn't care less about me. This model is just very good, and the fact that people can't see it made me realize most of you are probably using chatbots as girlfriends or something, rather than for assisting with complex coding tasks.

🚀 Meet Qwen-Image, a 20B MMDiT model for next-gen text-to-image generation. Especially strong at creating stunning graphic posters with native text. Now open-source.
🔍 Key Highlights:
🔹 SOTA text rendering: rivals GPT-4o in English, best-in-class for Chinese
🔹 In-pixel text generation: no overlays, fully integrated
🔹 Bilingual support, diverse fonts, complex layouts
🎨 Also excels at general image generation, from photorealistic to anime, impressionist to minimalist. A true creative powerhouse.
Blog: qwenlm.github.io/blog/qwen-imag…
Hugging Face: huggingface.co/Qwen/Qwen-Image
ModelScope: modelscope.cn/models/Qwen/Qw…
Github: github.com/QwenLM/Qwen-Im…
Technical report: …anwen-res.oss-cn-beijing.aliyuncs.com/Qwen-Image/Qwe…
Demo: modelscope.cn/aigc/imageGene…
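For anyone wondering whether a 20B dense model like this fits on local hardware, here's a rough feasibility sketch (weights only, ignoring activations and caches; the 20B figure is from the announcement, the precision options and the 24 GB threshold are assumptions for illustration):

```python
def weight_gb(params_b: float, bytes_per_param: float) -> float:
    """Approximate weight memory in GB (1 GB = 1e9 bytes)."""
    return params_b * 1e9 * bytes_per_param / 1e9

# 20B parameters at a few common precisions vs. a 24 GB consumer GPU.
for name, bpp in [("bf16", 2), ("int8", 1), ("int4", 0.5)]:
    gb = weight_gb(20.0, bpp)
    fits = "yes" if gb <= 24 else "no"
    print(f"{name}: ~{gb:.0f} GB weights; fits a 24 GB GPU: {fits}")
```

Rule of thumb: at bf16 the weights alone are ~40 GB, so single consumer-GPU use generally implies quantization.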



We're thrilled to release & open-source Hunyuan3D World Model 1.0! This model lets you generate immersive, explorable, and interactive 3D worlds from just a sentence or an image. It's the industry's first open-source 3D world generation model, compatible with CG pipelines for full editability & simulation, and set to transform game development, VR, digital content creation, and more.
Get started now👇🏻
Project Page: 3d-models.hunyuan.tencent.com/world/
Try it now: 3d.hunyuan.tencent.com/sceneTo3D
Github: github.com/Tencent-Hunyua…
Hugging Face: huggingface.co/tencent/Hunyua…
