
Run Claude Code using local LLMs for FREE.
No API costs. No data leaving your machine.
Here's how it works:
Claude Code lets you swap its backend via a single env variable. Point `ANTHROPIC_BASE_URL` to a local llama.cpp server, and it'll route all requests to whatever model you're running locally.
The Unsloth team put together a step-by-step guide for this, showing how to run Claude Code using Qwen3.5.
It covers everything from model download to server setup to running Claude Code.
The trick is serving your model on port 8001 via llama-server, then setting two env vars: `ANTHROPIC_BASE_URL` and a dummy `ANTHROPIC_API_KEY`. That's it. Claude Code thinks it's talking to Anthropic's API.
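The setup described above can be sketched in a few shell commands. The model filename is a hypothetical placeholder (grab the actual GGUF via the Unsloth guide), and the exact llama-server flags you'll want depend on your hardware:

```shell
# 1. Serve a local model on port 8001 (hypothetical model filename).
#    Uncomment once you've downloaded a GGUF per the Unsloth guide:
# llama-server --model Qwen3.5-GGUF.gguf --port 8001 --host 127.0.0.1

# 2. Point Claude Code at the local server instead of Anthropic's API.
export ANTHROPIC_BASE_URL="http://127.0.0.1:8001"

# 3. The key just needs to be non-empty; it's never validated locally.
export ANTHROPIC_API_KEY="dummy-key"

# 4. Launch Claude Code from this same shell:
# claude
```

Run this in the shell session where you start Claude Code (or put the exports in your shell profile), since env vars only apply to child processes of that session.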
If you don't want to pay per token for every agentic loop and want fast, private, cost-free coding runs, this is exactly what you're looking for.
Link to the guide in the next tweet
