LocalAI

1.5K posts

LocalAI banner
LocalAI

LocalAI

@LocalAI_API

OpenAI Open Source alternative. LocalAI is a community, drop-in replacement API compatible with OpenAI for local CPU/GPU inferencing

Katılım Nisan 2023
39 Takip Edilen3.7K Takipçiler
Sabitlenmiş Tweet
LocalAI
LocalAI@LocalAI_API·
LocalAI 3.9 and 3.10 are out! 🎉 Now we are a fully extensible OS for your AI applications. Highlights: 🤖 Support for Open Responses and Anthropic API 🗓️ Schedule background jobs (Cron/API) 🎥 New UI for Text-to-Video, Qwen TTS ⚡ Easy GPU: One docker image for everything👇
LocalAI tweet media
English
4
3
17
4.8K
LocalAI retweetledi
Ettore Di Giacinto
Ettore Di Giacinto@mudler_it·
LocalAI ( @LocalAI_API ) 4.2.0 is out, just few numbers and facts: - +392 commits ( we squash these 😄 ) - +11 Backends: voice and face recognition, vibevoice.cpp (from me), LocalQVE from @jichiep and among @sgl_project , @__tinygrad__ , @no_stp_on_snek 's Turboquant, ik_llama.cpp, sam.cpp from @el_PA_B - Many new QoL improvements, increased sglang and VLLM support and hardening on distributed mode - 16+ new contributors ! Thanks to the community! LocalAI is all about give you flexibility to run the latest from the community, and ds4 support from @antirez is on its way! This is the year of Local AI!
Ettore Di Giacinto tweet mediaEttore Di Giacinto tweet media
English
10
8
33
7.1K
LocalAI retweetledi
Richard Palethorpe
Richard Palethorpe@jichiep·
Also incoming is a @LocalAI_API module with websocket and REST APIs. It'll also be usable through the UI
Richard Palethorpe tweet media
English
1
2
3
666
Enrico - big-AGI
Enrico - big-AGI@enricoros·
Deepseek V4 is genuinely impressive. Quick, smart, extra resourceful (scanned all my 2,700 chats and knows everything about me already!) It's not cheap at $3.48 out, but @LocalAI_API will push this model to the people 🤍 GG @deepseek_ai
English
2
0
3
246
Ettore Di Giacinto
Ettore Di Giacinto@mudler_it·
I've just realized that this comment was worded very poorly - @exolabs is the pioneer in distributed MLX, I didn't wanted to shed any bad light - actually, I'm a huge fan of it and without the amazing work from @alexocheema it would have not been possible for @LocalAI_API to support MLX distributed! Please keep up the great work!! Indeed, our backend is based from @exolabs implemenation and we call that out clearly in the docs, in our readme and in the code (thank you!). What I wanted just to show is that we ( @LocalAI_API ) are trying to be more Apple friendly and feedback from the MLX community would be really appreciated!
Ettore Di Giacinto@mudler_it

@ivanfioravanti I'll do a shameless plug and tell to try @LocalAI_API ? Opencode and Claude code works well here, but I couldn't test mlx distributed backend ( I have only a small Mac mini, so I could just validate it).

English
2
0
8
980
LocalAI retweetledi
Ettore Di Giacinto
Ettore Di Giacinto@mudler_it·
@LocalAI_API next release will blow it. It features many new backends that lets you swap and run AI models in different ways and bench side by side in a way that you couldn't do before: - tinygrad (by cc @__tinygrad__ ) - one of the most flexible and promising torch replacement (if you'd ask me) - sglang ( @sgl_project ) one of the fastest engine out there - ikawrakow/ik_llama.cpp fork which optimizes GGUF on CPUs - TheTom/llama-cpp-turboquant ( Turbo quant llama.cpp fork by @no_stp_on_snek ) - qwen3tts.cpp (qwen 3 tts everywhere!) - kokoros (rust implemenetaion of kokoro, damn fast on CPU!) All in a compact, extensible framework that lets you download, manage, remove and manage backend releases with ease, allowing to share your instance with authentication and distribute it across all your devices!
English
3
2
14
7.2K
LocalAI retweetledi
Richard Palethorpe
Richard Palethorpe@jichiep·
How to install and run @LocalAI_API using Docker compose. Including a tour of the basic features like installing models and backends for inference, debugging requests, chatting, images, TTS, voice sessions, using the API and so on.
English
1
3
6
902
Manu Sheel Gupta
Manu Sheel Gupta@manusheel·
@mudler_it @LocalAI_API @libp2p Thank you for using @libp2p at @LocalAI_API. If you need any help, support, or have feature requests that could enable your project further, please reach out, great efforts here 🙌 Great to see the P2P federation using @libp2p + gossip for coordination-first systems. Neat :)
English
2
2
9
191
LocalAI retweetledi
Ettore Di Giacinto
Ettore Di Giacinto@mudler_it·
Not everyone knows - but @LocalAI_API has two ways of distributing load across nodes (if you are building a cluster of GPUs) 1) P2P Fedaration: this uses @libp2p behind the scenes - has a ledger and an in-memory state storage which is distributed across nodes. It uses Gossip protocol for co-ordination, suited for community use (very simple to setup) 2) full-fledged distributed mode: LocalAI uses workers that are connected via NATS and to the frontend. This allows to scale horizontally multiple frontends and to multiple worker machines. LocalAI orchestrates building, maintenance, of models and backends. LocalAI has an extensible backend system that allows to support ANY backend for inferencing. With 2) you get control, with 1) you get decentralization.
Ettore Di Giacinto tweet media
English
2
2
12
1.1K
LocalAI retweetledi
Paul Smith 🇬🇧
Paul Smith 🇬🇧@PJSmith·
I just blind-tested two quants of Qwen3.5-35B-A3B (MoE, 35B total / ~3B active): • Unsloth UD-Q4_K_XL (standard 4-bit) • APEX-I-Quality (MoE-aware, near-Q8 claims, +~1GB) And, I am quite excited ;)
English
5
6
51
8.5K
LocalAI retweetledi
Ettore Di Giacinto
Ettore Di Giacinto@mudler_it·
I've just released APEX (Adaptive Precision for EXpert Models): a novel MoE quantization technique that outperforms @UnslothAI Dynamic 2.0 on accuracy while being 2x smaller for MoE architectures. Benchmarked on Qwen3.5-35B-A3B, but the method applies to any MoE model. Half the size of Q8. Perplexity comparable to F16. Works with stock @ggml_org's llama.cpp. Open source (of course!), with ❤️ from the @LocalAI_API team. 👇Links to the model, repository and benchmarks below! (+ Bonus TurboQuant benchmarks with @no_stp_on_snek's TQ+! )
Ettore Di Giacinto tweet media
English
26
51
366
33.5K
Ivan Fioravanti ᯅ
Ivan Fioravanti ᯅ@ivanfioravanti·
LocalAI is becoming stronger and better release, after release! Keep pushing @mudler_it and @LocalAI_API 🙌
Ettore Di Giacinto@mudler_it

@LocalAI_API 4.0 is out and its crazy - New UI with React (huh!) - Canvas mode in chat - Agentic orchestration, Memory (Hybrid search), Skills management integrated - WebRTC for realtime (@jichiep ) - New backends (ace-step.cpp, faster-qwen3-tts) And this is just the tip of it👇

English
3
9
69
7.4K
LocalAI retweetledi
Ettore Di Giacinto
Ettore Di Giacinto@mudler_it·
so I just got pure C inference working for Mistral's Voxtral TTS model and I'm unreasonably excited about it (inspired by @antirez and @vllm_project ) no pytorch, no python, no dependencies. ~4400 lines of C that go from text → speech at 24kHz. 20 voices across 9 languages. github.com/mudler/voxtral…
Mistral AI@MistralAI

🔊Introducing Voxtral TTS: our new frontier open-weight model for natural, expressive, and ultra-fast text-to-speech 🎭Realistic, emotionally expressive speech. 🌍Supports 9 languages and accurately captures diverse dialects. ⚡Very low latency for time-to-first-audio. 🔄Easily adaptable to new voices

English
10
16
269
24.2K
LocalAI retweetledi
Enrico - big-AGI
Enrico - big-AGI@enricoros·
Big-AGI Open 2.0.4 is out. The best place to enter your AI API keys, 2.0.4 comes with native support for the latest parameters of the latest frontier models. Enjoy Anthropic Fast mode and dynamic web filtering, AWS Bedrock (3 protos), new models and UX improvements galore.
Enrico - big-AGI tweet media
English
2
4
5
562
LocalAI retweetledi
Richard Palethorpe
Richard Palethorpe@jichiep·
I used Claude to train Microsoft's DeepVQE model and implement it in @ggml_org for inference. It removes echo's and noise from your microphone.
English
1
2
7
525