Xuan-Son Nguyen

1K posts


@ngxson

Engineer @huggingface

Paris · Joined August 2020
240 Following · 6.3K Followers
Pinned Tweet
Xuan-Son Nguyen @ngxson ·
Updated my GH profile with a list of what I'm doing on llama.cpp 😂 Why? Because sometimes I forget what I did...
[image]
3 replies · 0 retweets · 35 likes · 3.8K views
Xuan-Son Nguyen retweeted
Julien Chaumond @julien_c ·
did you know that huggingface_hub (just the Python client) is sending almost 6B requests/week? wow 😮 @huggingface
13 replies · 4 retweets · 69 likes · 6.1K views
Xuan-Son Nguyen retweeted
Lysandre @LysandreJik ·
We're opening a Hugging Face office in Tokyo! Our goal: help open-source AI develop in Japan and grow the local community. Let's meet!
[image]
125 replies · 473 retweets · 3.3K likes · 274.7K views
Xuan-Son Nguyen @ngxson ·
@garyfung Yes, qwen3.5 / 3.6 is also quite good. However, its recurrent architecture poses quite a few problems for KV cache reuse. On gemma 4, cache reuse is also pretty much a mess because of sliding-window attention, but you can actually bypass it via the --swa-full flag in llama.cpp
0 replies · 0 retweets · 0 likes · 28 views
gary IH fung @garyfung ·
@ngxson gemma 4 is getting mogged by qwen3.6, or even 3.5 x.com/garyfung/statu…
gary IH fung @garyfung

this is one surprisingly impressive small model! Ran the @UnslothAI q3 quant of this Qwen3.6 35b A3B at 110-130 tps on my local 4090 card. One-shotted the pelican-on-bike SVG (not the best, but it's a small model). Asteroids: one-shotted a full working version. Was it just memorization? Went down the rabbit hole of vibing iterations and looking at its thinking tokens: it thinks in code snippets and makes pretty bang-on decisions based on vague product-esque prompts without explicit spec'ing. You can see the full @lmstudio chat export at github.com/fungilation/as… Genuinely impressed! Feels like Claude Sonnet, internally agentic for the coding use case, and running at 110+ tps locally is amazing. Thanks @Alibaba_Qwen for keeping up the open-weights momentum!

1 reply · 0 retweets · 0 likes · 116 views
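For concreteness, here is a minimal sketch of the --swa-full workaround @ngxson mentions above. The GGUF file name is a placeholder, not an official release; --swa-full tells llama.cpp to keep a full-size KV cache for sliding-window-attention layers, spending extra memory so cached prompt prefixes can be reused.

```bash
# Sketch only: the model path is a placeholder.
# --swa-full keeps a full-size KV cache for SWA layers (more memory,
# but cached prompt prefixes survive and can be reused across requests).
llama-server -m gemma-4-26b-a4b-q4_k_m.gguf --swa-full -c 32768
```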
Xuan-Son Nguyen @ngxson ·
I stopped using Claude Code on all of my llama.cpp workflows for the past few days. The quality degradation is just too significant. I'm experimenting with a mix of Gemma 4 26B-A4B and Gemini 3.1 Pro; so far it's much better than what Anthropic can offer.
Simon Willison @simonw

Shocking result on my pelican benchmark this morning: I got a better pelican from a 21GB local Qwen3.6-35B-A3B running on my laptop than I did from the new Opus 4.7! Qwen on the left, Opus on the right

2 replies · 1 retweet · 29 likes · 2.2K views
Xuan-Son Nguyen retweeted
Julien Chaumond @julien_c ·
opus 4.7 slightly more dangerous, slightly more expensive OR: run local models!
8 replies · 4 retweets · 62 likes · 3.9K views
Xuan-Son Nguyen @ngxson ·
llama.cpp now supports Qwen3-ASR, Qwen3-Omni and Gemma 4 audio/vision input 🔥 Mixed modalities are the future 😼😼
[image]
4 replies · 11 retweets · 97 likes · 4.1K views
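As a rough illustration of the mixed-modality support announced above, here is how image and audio inputs are passed to llama.cpp's multimodal CLI. The GGUF and mmproj file names are placeholders, not official releases.

```bash
# Placeholders throughout: substitute real GGUF + mmproj files.
# Vision input with a Gemma 4 build:
llama-mtmd-cli -m gemma-4.gguf --mmproj mmproj-gemma-4.gguf \
  --image photo.jpg -p "Describe this image."

# Audio input with a Qwen3-Omni build, using the same tool:
llama-mtmd-cli -m qwen3-omni.gguf --mmproj mmproj-qwen3-omni.gguf \
  --audio clip.wav -p "Transcribe this recording."
```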
Xuan-Son Nguyen @ngxson ·
AFAICT there are no metrics that can determine whether an OCR model is "best". Example: one can be better at OCR for English, and another can be better for Chinese. The "best quality" model may even be a big model that can't fit into your RAM. So unfortunately, it still requires trials to know which one is the "best fit" for your particular use case.
0 replies · 0 retweets · 3 likes · 995 views
Harry Zhang @tokeemb ·
@ngxson What is the best OCR model? I don't want various, I want one
1 reply · 0 retweets · 0 likes · 1.1K views
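One way to run the trials @ngxson suggests is to loop the same test image through each candidate model and compare the transcripts. A minimal sketch; the model and mmproj file names are placeholders for whichever OCR GGUFs you want to compare.

```bash
# Placeholders: swap in the OCR GGUFs (and matching mmproj files) you want to try.
for name in ocr-model-a ocr-model-b; do
  echo "== $name =="
  llama-mtmd-cli -m "$name.gguf" --mmproj "mmproj-$name.gguf" \
    --image sample-page.png -p "Transcribe all text in this image."
done
```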
Xuan-Son Nguyen @ngxson ·
llama.cpp now supports various small OCR models that can run on low-end devices. These models are small enough to run on a GPU with 4 GB of VRAM, and some of them can even run on a CPU with decent performance. In this post, I'll show you how to use these OCR models with llama.cpp 👇
10 replies · 24 retweets · 244 likes · 21.8K views
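A minimal sketch of the kind of invocation that post walks through; the file names are placeholders for any small OCR-capable GGUF plus its mmproj.

```bash
# Placeholders: use any small OCR-capable GGUF and its matching mmproj.
# -ngl 99 offloads all layers to the GPU (small models fit in ~4 GB VRAM);
# drop -ngl (or set it to 0) to run entirely on the CPU.
llama-mtmd-cli -m small-ocr.gguf --mmproj mmproj-small-ocr.gguf \
  --image receipt.png -p "Extract the text from this image." -ngl 99
```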
jlcjak @jlcjak ·
idea: instead of having a single connector operate at multiple voltages without warning, what if we had some way to exchange information between charger and device so they can negotiate an appropriate voltage. call it "power delivery". could we use usbc and barrel jack for this?
[image]
96 replies · 73 retweets · 1.8K likes · 91.4K views
Xuan-Son Nguyen @ngxson ·
@le0z00s @jlcjak I hope you're not gonna have a heart attack because someone *cough Panasonic* *cough CF-SC6* still makes laptops with a VGA port in 2026
1 reply · 0 retweets · 7 likes · 750 views
Rafał Zbojak @le0z00s ·
@jlcjak @ngxson Why not DA-15? It was good enough for MIDI devices. Or better: either DB-25 or 36-pin Mini-Centronics IEEE 1284. I think I'm having a stroke.
1 reply · 0 retweets · 5 likes · 782 views
Xuan-Son Nguyen @ngxson ·
@Prince_Canuma Same here, I skip the comment altogether since most of the time contributors won't question it. Also, we (kinda) enforce a PR template where contributors have to explicitly indicate that they agree to the guidelines.
1 reply · 0 retweets · 1 like · 37 views
Prince Canuma @Prince_Canuma ·
@ngxson Tell me about it! I usually respond and close with the comment, but it's exhausting when you're getting dozens a day. I might need to build a skill to triage and tag PRs automatically every 24h. Agent-on-agent action 😂
[GIF]
2 replies · 0 retweets · 1 like · 168 views
Prince Canuma @Prince_Canuma ·
Have a new label for certain types of PRs 😤
[image]
3 replies · 0 retweets · 37 likes · 3.3K views
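For reference, a sketch of the kind of PR template @ngxson describes upthread, where contributors tick a box to confirm agreement. The wording is hypothetical, not llama.cpp's actual template.

```bash
# Hypothetical wording: the real llama.cpp template differs.
mkdir -p .github
cat > .github/PULL_REQUEST_TEMPLATE.md <<'EOF'
- [ ] I have read the contributing guidelines and agree to follow them
EOF
```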