LocalAI

1.5K posts

LocalAI

@LocalAI_API

OpenAI Open Source alternative. LocalAI is a community, drop-in replacement API compatible with OpenAI for local CPU/GPU inferencing

Katılım Nisan 2023

39 Takip Edilen3.7K Takipçiler

Sabitlenmiş Tweet

LocalAI@LocalAI_API·23 Oca

LocalAI 3.9 and 3.10 are out! 🎉 Now we are a fully extensible OS for your AI applications. Highlights: 🤖 Support for Open Responses and Anthropic API 🗓️ Schedule background jobs (Cron/API) 🎥 New UI for Text-to-Video, Qwen TTS ⚡ Easy GPU: One docker image for everything👇

English

4.8K

LocalAI retweetledi

Molly Mackinlay | momack@momack28·20 May

Solid demo lineup from @LocalAI_API, @coinbase, @latticexyz, @ekacareHQ, & more!

English

700

LocalAI retweetledi

Ettore Di Giacinto@mudler_it·11 May

LocalAI ( @LocalAI_API ) 4.2.0 is out, just few numbers and facts: - +392 commits ( we squash these 😄 ) - +11 Backends: voice and face recognition, vibevoice.cpp (from me), LocalQVE from @jichiep and among @sgl_project , @__tinygrad__ , @no_stp_on_snek 's Turboquant, ik_llama.cpp, sam.cpp from @el_PA_B - Many new QoL improvements, increased sglang and VLLM support and hardening on distributed mode - 16+ new contributors ! Thanks to the community! LocalAI is all about give you flexibility to run the latest from the community, and ds4 support from @antirez is on its way! This is the year of Local AI!

English

7.1K

LocalAI retweetledi

Ettore Di Giacinto@mudler_it·29 Nis

Say hello to vibevoice.cpp, @Microsoft 's Vibevoice in pure C++ with @ggerganov 's ggml (@ggml_org). TTS and ASR (with diarization). CPU + CUDA + Metal + Vulkan via ggml backends. Quantized models live on @huggingface. Built with ❤️ from the @LocalAI_API team github.com/mudler/vibevoi…

English

3.9K

LocalAI retweetledi

Richard Palethorpe@jichiep·30 Nis

There is a live demo on @huggingface huggingface.co/spaces/LocalAI… A @LocalAI_API module is in the making. @mudler_it @ggerganov

English

LocalAI retweetledi

Richard Palethorpe@jichiep·4 May

Also incoming is a @LocalAI_API module with websocket and REST APIs. It'll also be usable through the UI

English

668

LocalAI@LocalAI_API·24 Nis

@enricoros @deepseek_ai we are on it! 🫡

English

Enrico - big-AGI@enricoros·24 Nis

Deepseek V4 is genuinely impressive. Quick, smart, extra resourceful (scanned all my 2,700 chats and knows everything about me already!) It's not cheap at $3.48 out, but @LocalAI_API will push this model to the people 🤍 GG @deepseek_ai

English

246

LocalAI@LocalAI_API·20 Nis

@alexocheema @mudler_it @exolabs we ❤️ @exolabs !

Alex Cheema@alexocheema·20 Nis

@mudler_it @exolabs @LocalAI_API All love 💛

English

171

Ettore Di Giacinto@mudler_it·20 Nis

I've just realized that this comment was worded very poorly - @exolabs is the pioneer in distributed MLX, I didn't wanted to shed any bad light - actually, I'm a huge fan of it and without the amazing work from @alexocheema it would have not been possible for @LocalAI_API to support MLX distributed! Please keep up the great work!! Indeed, our backend is based from @exolabs implemenation and we call that out clearly in the docs, in our readme and in the code (thank you!). What I wanted just to show is that we ( @LocalAI_API ) are trying to be more Apple friendly and feedback from the MLX community would be really appreciated!

Ettore Di Giacinto@mudler_it

@ivanfioravanti I'll do a shameless plug and tell to try @LocalAI_API ? Opencode and Claude code works well here, but I couldn't test mlx distributed backend ( I have only a small Mac mini, so I could just validate it).

English

980

LocalAI retweetledi

Ettore Di Giacinto@mudler_it·20 Nis

@LocalAI_API next release will blow it. It features many new backends that lets you swap and run AI models in different ways and bench side by side in a way that you couldn't do before: - tinygrad (by cc @__tinygrad__ ) - one of the most flexible and promising torch replacement (if you'd ask me) - sglang ( @sgl_project ) one of the fastest engine out there - ikawrakow/ik_llama.cpp fork which optimizes GGUF on CPUs - TheTom/llama-cpp-turboquant ( Turbo quant llama.cpp fork by @no_stp_on_snek ) - qwen3tts.cpp (qwen 3 tts everywhere!) - kokoros (rust implemenetaion of kokoro, damn fast on CPU!) All in a compact, extensible framework that lets you download, manage, remove and manage backend releases with ease, allowing to share your instance with authentication and distribute it across all your devices!

English

7.2K

LocalAI retweetledi

Richard Palethorpe@jichiep·5 Nis

How to install and run @LocalAI_API using Docker compose. Including a tour of the basic features like installing models and backends for inference, debugging requests, chatting, images, TTS, voice sessions, using the API and so on.

English

904

LocalAI@LocalAI_API·4 Nis

@manusheel @mudler_it @libp2p 🫶 @libp2p

QME

151

Manu Sheel Gupta@manusheel·4 Nis

@mudler_it @LocalAI_API @libp2p Thank you for using @libp2p at @LocalAI_API. If you need any help, support, or have feature requests that could enable your project further, please reach out, great efforts here 🙌 Great to see the P2P federation using @libp2p + gossip for coordination-first systems. Neat :)

English

191

LocalAI retweetledi

Ettore Di Giacinto@mudler_it·3 Nis

Not everyone knows - but @LocalAI_API has two ways of distributing load across nodes (if you are building a cluster of GPUs) 1) P2P Fedaration: this uses @libp2p behind the scenes - has a ledger and an in-memory state storage which is distributed across nodes. It uses Gossip protocol for co-ordination, suited for community use (very simple to setup) 2) full-fledged distributed mode: LocalAI uses workers that are connected via NATS and to the frontend. This allows to scale horizontally multiple frontends and to multiple worker machines. LocalAI orchestrates building, maintenance, of models and backends. LocalAI has an extensible backend system that allows to support ANY backend for inferencing. With 2) you get control, with 1) you get decentralization.

English

1.1K

LocalAI@LocalAI_API·3 Nis

LocalAI 4.1.0 is out!

Ettore Di Giacinto@mudler_it

Ok, notoriously I don't sleep that much. Time to share @LocalAI_API 4.1.0 (why not?) ! TLDR: - Distributed, hybrid clusters with production ready setup - Built-in auth, quota, user metrics - Fine-tuning and quantization from the UI 🔥Details below! 👇

English

917

LocalAI retweetledi

Paul Smith 🇬🇧@PJSmith·2 Nis

I just blind-tested two quants of Qwen3.5-35B-A3B (MoE, 35B total / ~3B active): • Unsloth UD-Q4_K_XL (standard 4-bit) • APEX-I-Quality (MoE-aware, near-Q8 claims, +~1GB) And, I am quite excited ;)

English

8.5K

LocalAI retweetledi

Ettore Di Giacinto@mudler_it·1 Nis

I've just released APEX (Adaptive Precision for EXpert Models): a novel MoE quantization technique that outperforms @UnslothAI Dynamic 2.0 on accuracy while being 2x smaller for MoE architectures. Benchmarked on Qwen3.5-35B-A3B, but the method applies to any MoE model. Half the size of Q8. Perplexity comparable to F16. Works with stock @ggml_org's llama.cpp. Open source (of course!), with ❤️ from the @LocalAI_API team. 👇Links to the model, repository and benchmarks below! (+ Bonus TurboQuant benchmarks with @no_stp_on_snek's TQ+! )

English

366

33.5K

Ivan Fioravanti ᯅ@ivanfioravanti·29 Mar

LocalAI is becoming stronger and better release, after release! Keep pushing @mudler_it and @LocalAI_API 🙌

Ettore Di Giacinto@mudler_it

@LocalAI_API 4.0 is out and its crazy - New UI with React (huh!) - Canvas mode in chat - Agentic orchestration, Memory (Hybrid search), Skills management integrated - WebRTC for realtime (@jichiep ) - New backends (ace-step.cpp, faster-qwen3-tts) And this is just the tip of it👇

English

7.4K

LocalAI@LocalAI_API·30 Mar

@ivanfioravanti @mudler_it 😍 Thank you @ivanfioravanti ! and 4.1 gonna look even more strong!! stay tuned!

English

LocalAI retweetledi

Ettore Di Giacinto@mudler_it·26 Mar

so I just got pure C inference working for Mistral's Voxtral TTS model and I'm unreasonably excited about it (inspired by @antirez and @vllm_project ) no pytorch, no python, no dependencies. ~4400 lines of C that go from text → speech at 24kHz. 20 voices across 9 languages. github.com/mudler/voxtral…

Mistral AI@MistralAI

🔊Introducing Voxtral TTS: our new frontier open-weight model for natural, expressive, and ultra-fast text-to-speech 🎭Realistic, emotionally expressive speech. 🌍Supports 9 languages and accurately captures diverse dialects. ⚡Very low latency for time-to-first-audio. 🔄Easily adaptable to new voices

English

269

24.2K

LocalAI@LocalAI_API·25 Mar

@enricoros Congrats for the release!! 😍

English

LocalAI retweetledi

Enrico - big-AGI@enricoros·25 Mar

Big-AGI Open 2.0.4 is out. The best place to enter your AI API keys, 2.0.4 comes with native support for the latest parameters of the latest frontier models. Enjoy Anthropic Fast mode and dynamic web filtering, AWS Bedrock (3 protos), new models and UX improvements galore.

English

562

LocalAI retweetledi

Richard Palethorpe@jichiep·23 Mar

I used Claude to train Microsoft's DeepVQE model and implement it in @ggml_org for inference. It removes echo's and noise from your microphone.

English

525

Keşfet

@coinbase @latticexyz @ekacareHQ @jichiep @sgl_project @__tinygrad__ @no_stp_on_snek @el_PA_B