DeepInfra
659 posts

DeepInfra
@DeepInfra
Fast ML inference. Run top AI models using a simple API.
Palo Alto · Joined February 2023
65 Following · 5.1K Followers

Your own AI agent, always on, from $13/mo.
Deep Infra Hosted Agents: OpenClaw (web dashboard) or Hermes (SSH/terminal). One-click setup & updates. Pre-wired for fast inference from second one. Auto-backups + restore. Stop it and pay $0 while idle.
Spin one up → deepinfra.com/dash/agents

📉 Price cut on @NVIDIAAI Nemotron 3 Ultra.
$0.50 in / $2.20 out / $0.10 cached per 1M tokens. Output down 12%, cached down 33%.
550B/55B MoE for agentic reasoning and deep research. Multimodal, function calling, 256K context.
deepinfra.com/nvidia/NVIDIA-…
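
For a rough sense of what these rates mean per request, here is a minimal sketch that turns token counts into a dollar estimate. The three prices are copied from the post; the function name, the split of input into uncached and cached tokens, and the assumption that cached tokens are billed only at the cached rate are illustrative and not taken from DeepInfra's billing documentation.

```python
# Prices in USD per 1M tokens, as quoted in the post.
PRICE_PER_MILLION = {"input": 0.50, "output": 2.20, "cached": 0.10}


def estimate_cost(uncached_input_tokens: int,
                  output_tokens: int,
                  cached_input_tokens: int = 0) -> float:
    """Estimate the USD cost of one request (hypothetical helper).

    Assumption: cached input tokens are billed at the cached rate only,
    and all other input tokens at the regular input rate.
    """
    cost = (
        uncached_input_tokens * PRICE_PER_MILLION["input"]
        + output_tokens * PRICE_PER_MILLION["output"]
        + cached_input_tokens * PRICE_PER_MILLION["cached"]
    ) / 1_000_000
    return cost


if __name__ == "__main__":
    # An agent turn with a long, mostly cached context and a short reply.
    cost = estimate_cost(
        uncached_input_tokens=200_000,
        output_tokens=50_000,
        cached_input_tokens=800_000,
    )
    print(f"Estimated cost: ${cost:.2f}")
```

In this example most of the context is served from cache, so the output tokens end up as the largest share of the estimate.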

Congrats on the launch @Zai_org 🚀
Day zero on DeepInfra → deepinfra.com/zai-org/GLM-5.2
Z.ai @Zai_org
Introducing GLM-5.2: Frontier Intelligence, Open Weights
- Significant improvements in coding and agentic tasks
- Strong long-horizon capabilities with a 1M context window
- Two levels of reasoning effort: GLM-5.2 (max) pushes the limits, while GLM-5.2 (high) strikes a strong balance between performance and token efficiency
- MIT-licensed open weights
- Same API pricing as GLM-5.1
Tech Blog: z.ai/blog/glm-5.2
Weights: huggingface.co/zai-org/GLM-5.2
API: docs.z.ai/guides/llm/glm…
Coding Plan: z.ai/subscribe
Chat: chat.z.ai

@nvidia Blackwell powers agentic AI, and DeepInfra is one of the places you can run it.
New AgentPerf benchmark from @ArtificialAnlys is the first to properly measure agentic workload performance: sequential LLM + tool calls, real coding trajectories, concurrent sessions.
Customers like @pamdotai are already running gpt-oss-120b on DeepInfra for production agentic workloads.
blogs.nvidia.com/blog/nvidia-bl…

We just added text-to-music on DeepInfra.
ACE-Step v1.5 XL — open-source, full song generation from a text prompt. Vocals, lyrics, instrumentation. Quality that rivals commercial tools.
We run the XL checkpoint with the planning step on by default, so it optimizes for musical structure and coherence.
$0.001 / second of audio.
@ACEStep_Music

Excited to announce Concentrate AI’s $5.1M pre-seed!
Todd Lieberman and I are launching Concentrate AI (the fifth company we've started together).
Most companies are not in control of their AI spend or the data they’re sending to AI.
@concentrateai solves that.
Why are we building it? What problem does it solve? And why now?

Play around with it here: deepinfra.com/nvidia/NVIDIA-…
deepinfra.com/nvidia/Nemotro…
deepinfra.com/nvidia/Nemotro…

We just added @NVIDIA Nemotron 3.x to DeepInfra — Day 0.
Two open and highly efficient models, live now:
→ Nemotron 3 Ultra: Frontier reasoning for long-running agents, with up to 5x faster inference and up to 30% lower cost
→ Nemotron 3.5 Content Safety: 4B multimodal, multilingual safety model with custom policy support, reasoning traces, and coverage across 23 safety categories for enterprise AI guardrails
→ Nemotron 3.5 ASR (coming soon): 0.6B streaming model with ~40 language-locales.
Built for agentic AI. Same API as everything else on DeepInfra.


