Ettore Di Giacinto

2.7K posts

@mudler_it

dad, creator of LocalAI (https://t.co/ReVYw5Pf4D) and Kairos (https://t.co/R6M51FYVs7), ex @SUSE/@Rancher, ex-Gentoo Dev.

Italy · Joined January 2016
256 Following · 3K Followers
Pinned Tweet
Ettore Di Giacinto @mudler_it
LocalAI (@LocalAI_API) 4.2.0 is out. Just a few numbers and facts:
- +392 commits (we squash these 😄)
- +11 backends: voice and face recognition, vibevoice.cpp (from me), LocalVQE from @jichiep and, among others, @sgl_project, @__tinygrad__, @no_stp_on_snek's TurboQuant, ik_llama.cpp, and sam.cpp from @el_PA_B
- Many new QoL improvements, increased SGLang and vLLM support, and hardening of distributed mode
- 16+ new contributors! Thanks to the community!
LocalAI is all about giving you the flexibility to run the latest from the community, and ds4 support from @antirez is on its way! This is the year of Local AI!
Replies: 10 · Reposts: 8 · Likes: 33 · Views: 7.1K
Eric ⚡️ Building...
My thesis is to build a tool that detects your hardware -> lets you select your use case -> auto-loads a swarm of agents on your hardware. Maximum potential to get 100% utilization by loading multiple models across your stack. It’s not about what model you can load, it’s about how MANY!
0xSero @0xSero

I think the play is to have multiple tiers of hardware, each for a different use case. Instead of focusing on having 1 huge model that can do it all, we can have vision, audio, media gen, agent, recursive problem solving, security, etc., all small and focused. Cheaper, better.

Replies: 7 · Reposts: 1 · Likes: 29 · Views: 3.4K
Ettore Di Giacinto @mudler_it
@ngiocoli Or, with even more foresight, we give more opportunities to what we have in our own country instead of favoring American big corps, with a monopoly that is very risky because there is currently a huge imbalance in the AI field.
Replies: 1 · Reposts: 0 · Likes: 1 · Views: 227
Nicola Giocoli @ngiocoli
A government with even minimal foresight would try to convince Amodei to turn a simple office into a research center, giving him carte blanche on how to do it, and backing the offer with ad hoc tax breaks (while negotiating with the EU so that it does not make a fuss over state aid).
Seb Johnson @SebJohnsonUK

Anthropic is doubling down on Europe and opening an office in Milan. Europe is Anthropic's fastest growing region with revenue up 9x YoY, while Milan is the centre of Italian tech. Anthropic now has hubs in London, Dublin, Zurich, Paris, Munich and Milan. It's pushing more and more into Europe as its relationship with the US continues to sour. Great stuff for Italy!

Replies: 50 · Reposts: 60 · Likes: 593 · Views: 58.9K
SpacemiT @spacemit_riscv
Just merged: SpacemiT K3 IME2 support is now upstream in llama.cpp (ggml CPU backend).
1. IME2 instructions for SpacemiT K3
2. Native quantized formats: Q2_K → Q8_0, efficient 4-bit ops for Q4 models
3. TCM access: first-time public interface with usage examples
This upstream patch adds K3’s IME2-based acceleration in llama.cpp, providing a maintainable path for developers to leverage hardware-optimized kernels.
Details: github.com/ggml-org/llama…
🔧 Detailed technical walkthrough and examples are coming soon. Stay tuned.
#RISC_V #llamacpp #AI #BackendDev #OpenSource
Replies: 1 · Reposts: 3 · Likes: 13 · Views: 1.2K
Damian Z. 💛🇪🇺
@TheAhmadOsman I would swap pure llama.cpp for @LocalAI_API by @mudler_it, unless you have hardware like RISC-V/ARM or another odd setup. Why? You get most of llama.cpp's portability, plus the option to orchestrate 2+ PCs over the network for distributed inference. Easier scaling, etc.
Replies: 1 · Reposts: 0 · Likes: 2 · Views: 105
Ahmad @TheAhmadOsman
DROP EVERYTHING
The bible for running LLMs locally is now available online to read for free
Covers what to use on
- Laptop / edge / odd hardware
- Mac-first workflows
- Single RTX GPUs
- 2-4+ NVIDIA / CUDA GPUs
- General production serving
- Long-context / MoE / routing
- NVIDIA max performance
- Cluster orchestration
Software
- llama.cpp
- MLX / MLX-LM
- ExLlamaV2
- ExLlamaV3
- vLLM
- SGLang
- TensorRT-LLM
- NVIDIA Dynamo
You should read this, and if you cannot now then you most definitely wanna bookmark it for later
Local AI FTW
Ahmad @TheAhmadOsman

x.com/i/article/2057…

Replies: 45 · Reposts: 238 · Likes: 1.8K · Views: 236.8K
Ettore Di Giacinto retweeted
Richard Palethorpe @jichiep
The idea that LLMs produce slop and are therefore useless is as dangerous to creators as the idea that because your vibe coded OS works in some capacity, the code base is therefore a sound foundation.
While the unconstrained output of LLMs is dangerous, when you restrict the output so that it can only make errors that you have a method of recovering from, you move far, far quicker. Not as fast as people may naively think from a weekend of vibe coding, but still very fast and potentially with better quality, because covering every edge case is far cheaper (as with fuzzing). What's dangerous is when you don't know what kind of error the system can produce or be subject to.
There are a lot of people in AI using FOMO and alarm to get attention and I'm in AI, but I have to call it as I see it and there is very high risk to companies and projects where the failure modes of the product are well known and the agentic coding system can be suitably constrained so that the system identifies errors and can "learn" from them, reverse them or only suffers from harmless errors the user can tolerate.
A lot of companies and individuals are failing to identify the ways in which LLMs, and in particular LLM coding agents, can go wrong, and this is a public safety issue. We don't fully understand what happens when software complexity increases, and some types of software are just not well understood at all. However, you have to look at what you are creating and ask if that's really your present project?
Replies: 0 · Reposts: 1 · Likes: 2 · Views: 1.1K
Ettore Di Giacinto @mudler_it
Congratulations on the release, @allen_ai! This is what real OSS AI looks like, and how it should look. I hope they can be an inspiration for many. Other companies claiming to do OSS AI can't match the openness that @allen_ai constantly shows. Yes, they do indeed publish pre-training datasets. Thank you from the community!
Ai2 @allen_ai

Today we’re releasing OlmoEarth v1.1. It’s 3x cheaper to run than v1 while delivering the same state-of-the-art performance—and fully open. 🧵

Replies: 2 · Reposts: 2 · Likes: 11 · Views: 1.6K
Ettore Di Giacinto retweeted
Richard Palethorpe @jichiep
OBS now has realtime echo cancellation and noise suppression via the new LocalVQE plugin! As far as I know, there isn't another AI solution that combines echo cancellation with noise suppression for OBS.
Replies: 2 · Reposts: 1 · Likes: 6 · Views: 802
Oppollo @oppollo11
@mudler_it Any benchmarks on how it compares to the base model?
Replies: 1 · Reposts: 0 · Likes: 1 · Views: 57
Ettore Di Giacinto @mudler_it
New APEX-MTP quant drop! Qwen3.6-35B-A3B-APEX-MTP-GGUF 👇 HF link
Replies: 4 · Reposts: 7 · Likes: 79 · Views: 8.1K
Carlo @Italianclownz
There are a few people in this AI community I would really like to thank for all they do.
The first person is @no_stp_on_snek. He has worked in his free time to make AI more accessible to a majority of people with TurboQuant and his vLLM Swift project. Not only has he done these projects, but he has also written about his thoughts and ideas.
I would like to thank @UnslothAI for their dedication to AI and for making great quants, giving the community the tools to make AI more accessible.
@mr_r0b0t for his contributions in the AI space and posting results.
@mudler_it for giving us some great options in model choices. I see a lot of great APEX models being released.
@morganlinton for being a positive member, contributing to the space, eager and happy to put models through their long paces. Probably had the longest /goal I have ever seen.
And @Scobleizer, who is always pointing out more amazing people in the AI space who are contributing to projects and building with new ideas.
I just want to let you all know I appreciate all of you. There are many others, but the Xeet would just go on and on. Sorry for this tweet with all the tags, but I hope people who see this and follow me also follow all of you because of all the great work you just keep doing.
Replies: 6 · Reposts: 1 · Likes: 16 · Views: 1.1K
maxgreco @maxgreco
@mudler_it Thanks a lot. Are there speed improvements for Pascal-generation NVIDIA cards as well?
Replies: 1 · Reposts: 0 · Likes: 1 · Views: 253
Ettore Di Giacinto @mudler_it
@cursedrobot APEX was developed with the help of Claude. It is not vibe science, but Claude helps a lot with automation and with keeping long quantization and benchmark runs going.
Replies: 0 · Reposts: 0 · Likes: 0 · Views: 93
Id est @cursedrobot
@mudler_it Gotta ask, how exactly did you develop your APEX method? I see Claude among the contributors. Does that mean it is vibe science?
Replies: 1 · Reposts: 0 · Likes: 1 · Views: 177
Ettore Di Giacinto @mudler_it
@iotcoi These have imatrix applied, which usually performs slightly better in benchmarks.
Replies: 1 · Reposts: 0 · Likes: 1 · Views: 257
Ettore Di Giacinto @mudler_it
New APEX-MTP quant drop (and one of my favorites): Qwen3.6-35B-A3B-Claude-4.7-Opus-Reasoning-Distilled-APEX-MTP-GGUF 👇 Link below
Replies: 4 · Reposts: 5 · Likes: 46 · Views: 4.7K