Susid Sprava
140 posts


10M developers use @opencode every month. This means the experience has to feel the same every hour of every day; slow or inconsistent inference breaks productivity and flow.
Enter Baseten. With Baseten's Model APIs, OpenCode achieves 5x faster TPS, sub-second TTFT, and 33% blended cost savings passed directly to users through cache token pricing.


@aipulseda1ly There is no way I can run it; it's 550B parameters.

Nemotron 3 Ultra is live on OpenRouter and it's free right now.
NVIDIA's first open frontier model built for agents.
The specs:
550B params / 55B active MoE
1M context window
300+ tok/s
It's open weights, which sounds like you can run it locally.
You can't, realistically. 550B parameters need serious datacenter GPUs, not a desktop.
So the real story is the free hosted tier.
Frontier class reasoning at $0 in and $0 out, while it lasts.
For anyone building agents, that's the unlock.
Test it before the free window closes.
Dropping it into Hermes today.
Will report back with real numbers.
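
A rough sketch of why 550B parameters will not fit on a desktop, assuming weights only (KV cache, activations, and runtime overhead are ignored) and the standard byte sizes for each storage format. The function and constant names are illustrative, not from any library. Note that a MoE model with 55B active parameters still generally needs all 550B resident in memory, since any expert may be routed to.

```python
# Back-of-the-envelope weight memory for a 550B-parameter model.
# Assumption: weights only; KV cache, activations, and overhead are ignored.

TOTAL_PARAMS = 550e9  # from the post: 550B total parameters

# Bytes per parameter for common storage formats.
BYTES_PER_PARAM = {"bf16": 2.0, "int8": 1.0, "int4": 0.5}


def weight_memory_gb(num_params: float, bytes_per_param: float) -> float:
    """Return weight storage in decimal gigabytes (1 GB = 1e9 bytes)."""
    return num_params * bytes_per_param / 1e9


if __name__ == "__main__":
    for fmt, size in BYTES_PER_PARAM.items():
        print(f"{fmt}: {weight_memory_gb(TOTAL_PARAMS, size):,.0f} GB")
```

Under these assumptions the weights alone come to about 1,100 GB in BF16 and 275 GB at 4 bits, which is well beyond the memory of a typical desktop GPU.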



🚀 Gemma 4 12B is here!
We partnered with @GoogleDeepMind to bring and optimize their new dense and unified multimodal model for Apple Silicon.
◈ 12B dense · 256K context
◈ Thinking mode (built-in reasoning)
◈ Vision: dynamic res, OCR, UI + charts
◈ Native audio: ASR + speech translation
◈ Function calling for agents
◈ Text + image + audio, interleaved
Runs local. Get started now ⚡
> uv pip install -U mlx-vlm
github.com/Blaizzy/mlx-vlm

Google Gemma@googlegemma
Meet Gemma 4 12B! A unified, encoder-free multimodal model designed to bring high-performance intelligence directly to your laptop, and released under an Apache 2.0 license. Bridging the gap between edge efficiency and advanced reasoning. Here is what’s new with Gemma 4 12B: 👇
Susid Sprava retweeted

Qwen 3.7 Max is now on Hyper.
We waited a week to make sure it met our standards for performance and zero data retention.
This one is impressive. Try it here:
⚡️ hyper.charm.land
Susid Sprava retweeted

Because humanity would try to put their d!ck in it. And if they did, they wouldn't be breathing anymore.
narsa.🪺@rathor7_
Why can't we all have one big hole instead of two holes in nose? It would make breathing much easier

@sosichest We don’t allow email domains that are temp emails. Been getting a lot of abuse through that.
Sign up with gmail?


@r0ktech Before you use it, show your passport to the camera. This is absurd. Zuckerberg is losing his mind.

@bridgemindai Be objective. Don't underestimate the models based on your own subjective opinion.

MiniMax M3 is free right now in OpenCode.
On paper it beats GPT 5.5 on SWE-bench. But benchmarks get benchmaxed, so I trust them about as far as I can throw them.
I am testing it live on real work right now.
Real codebase, real TODOs, no cherry picking.
We will see if the score holds up or falls apart.
Report coming.


@TeksEdge @OpenRouter I suspect they implemented 4-bit quantization, and the inference became so cheap that there’s no point in charging people those few cents.

Holy 💩! I didn’t get the memo. Did Kimi K3.0 release? Because Kimi K2.6 is free on @OpenRouter. What?

Susid Sprava retweeted

@kilocode What nonsense. Even the free version of Deepseek Flash can handle this problem.

@bridgemindai Complaining that Claude has cut the limits and canceling the subscription... in 3... 2...

UltraCode is running over 100 Opus 4.8 agents right now.
This is the future of software development.
Not 5 agents. Not 10.
Over 100 working simultaneously across the entire BridgeMind ecosystem.
This is what vibe coding looks like at scale.
One person orchestrating a swarm of AI agents building, debugging, and shipping in parallel.
We are so early.

@jcarlos2001 Once everyone has jumped on, the team needs to walk a few more meters.