Pedro Cuenca

3.9K posts

@pcuenq

ML Engineer at 🤗 Hugging Face | Co-founder at LateNiteSoft (Camera+). I love AI and photography.

Spain · Joined May 2019

1K Following · 7.2K Followers

Pedro Cuenca retweeted

Ben Burtenshaw@ben_burtenshaw·1d

Open source projects like transformers are drowning in AI agent PRs, so we auto-merged everything to see what would happen and share the results.
tl;dr: if 100s of agents want to fix something, it’s probably broken.
Agent PRs on transformers have quadrupled over the past quarter. We classified and validated 1k PRs (42% features, 39% bugs, 13% docs). The quality distribution is skewed toward noise. But the bug fixes cluster around a small number of hotspots: tokenizer handling, model loading, dtype mismatches, multimodal pipelines. I.e. an underlying problem. When 28 PRs independently flag the same area, that is signal regardless of whether any individual fix is correct.
One issue generated 39 near-identical PRs in a day. Each applied the same decorator pattern to a different model file. A maintainer would do the same cognitive work 39 times, so a single combined PR replaces all of that work.
We built tooling to cluster, deduplicate, and merge these contributions at scale, then ran an experiment: bulk-merge hundreds of agent PRs into a fork, benchmark it, and see what breaks. Nothing broke. Zero delta across three models on arc_challenge, gsm8k, and hellaswag.
The contributors are not adversarial. They lack the context to evaluate whether the agent's output is correct.
Check out this blog post, where we dive deep on this pipeline: huggingface.co/spaces/hugging…

Pedro Cuenca retweeted

Prince Canuma@Prince_Canuma·2d

This hackathon is going to be wild. Build the fastest Metal kernels for Apple Silicon 🚀 Ben pitched me on this idea in person a couple weeks ago, happy finally seeing it happen!

Ben Burtenshaw@ben_burtenshaw

Humanity's Last Hackathon is NOW OPEN for registration. This is not a normal hackathon. You will be judged on the context, not the code! Use Codex @OpenAIDevs to build and optimize models for local inference (kernels on Max metal). Submit through @GPU_MODE. Climb the leaderboard. Top performers qualify for the final battle. Launches May 4th. Registration is live now.

Pedro Cuenca@pcuenq·2d

@ivanfioravanti Taking a look.

Ivan Fioravanti ᯅ@ivanfioravanti·2d

@pcuenq @angeloskath It requires trust_remote_code.

Pedro Cuenca retweeted

Ivan Fioravanti ᯅ@ivanfioravanti·2d

MLX Ling-2.6-flash support added! 💪
Here is my (preliminary, because I bet @angeloskath will improve performance) context benchmark for the 4bit version running on M3 Ultra (cooking a new version).
I created the PR with the amazing transformer_to_mlx skill by @pcuenq and Opus 4.7. A few iterations and it seems 😂 working 100%!
Ultra fast model created by @TheInclusionAI. Can't wait to test it with a code harness!
Raw results:
Ling-2.6-flash-mlx-4bit MLX Benchmark Results
Hardware: Apple M3 Ultra, 512.0GB RAM, 32 CPU cores, 80 GPU cores
0.5k pp 632 tg 79 t/s mem 59.4GB kv 0.03GB
1k pp 676 tg 79 t/s mem 59.9GB kv 0.04GB
2k pp 693 tg 79 t/s mem 61.1GB kv 0.04GB
4k pp 704 tg 79 t/s mem 61.1GB kv 0.05GB
8k pp 708 tg 78 t/s mem 61.2GB kv 0.07GB
16k pp 700 tg 77 t/s mem 61.9GB kv 0.11GB
32k pp 678 tg 74 t/s mem 64.5GB kv 0.18GB
64k pp 637 tg 70 t/s mem 69.5GB kv 0.33GB
128k pp 564 tg 63 t/s mem 79.6GB kv 0.64GB
Total generated tokens: 1135
Batch TPS: b1 78 b2 123 b4 164 b8 218 b16 307 b32 418
Batch KV: b1 0.04GB b2 0.08GB b4 0.16GB b8 0.32GB b16 0.63GB b32 1.26GB

Pedro Cuenca@pcuenq·2d

@ivanfioravanti @angeloskath Also, any feedback on the skill or the process will be very much appreciated 🙏

Pedro Cuenca retweeted

clem 🤗@ClementDelangue·3d

300,000 AI builders have already added their hardware to HF to instantly see what model they can run locally. To do so, go to huggingface.co/settings/local… and add your hardware specs. You can even show off publicly by adding it to your HF profile! Let's go local AI!

Pedro Cuenca retweeted

Aritra 🤗@ariG23498·3d

[Hugging Face Machine Learning Club India] It is with immense gratitude and excitement that I announce the next speaker for our virtual event, @sarahookr! The date, time, and details on how to join will be shared in due time. Share this announcement with your friends, and help make this a great event! 🤗

Pedro Cuenca@pcuenq·2d

@ariG23498 It's really beautiful.

Aritra 🤗@ariG23498·3d

@pcuenq I would not have gotten even one task done. Would have just admired the architecture the whole day.

Pedro Cuenca@pcuenq·3d

My office today (Escuelas Pías Library, set on the rebuilt ruins of a church and school that burned in 1936 during the civil war)

Pedro Cuenca retweeted

Ben Burtenshaw@ben_burtenshaw·3d

Pedro Cuenca@pcuenq·3d

New open agentic model released by @poolsideai 🔥 Welcome!

poolside@poolsideai

Today we’re releasing Laguna XS.2, Poolside’s first open-weight model. It’s a 33B total / 3B active MoE model built for agentic coding and long-horizon tasks. Trained fully in-house on our own stack. Runs on a single GPU. Released under Apache 2.0. Links 👇 Weights: huggingface.co/poolside/Lagun… API: platform.poolside.ai Blog: poolside.ai/blog/laguna-a-…

Pedro Cuenca retweeted

Cheng@zcbenz·Apr 24

In the new release of MLX we are bringing thread safety:
def worker():
    print(mx.arange(10))
threading.Thread(target=worker)
which makes parallel inference easier to implement, a feature that had been driven by projects like omlx/vmlx/vllm-mlx. github.com/ml-explore/mlx…
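For readers who want to see the worker-thread pattern from the snippet above run end to end, here is a minimal sketch using only the Python standard library. It is an illustration under stated assumptions, not MLX code: `fake_inference` and `run_parallel` are hypothetical names, and `fake_inference` stands in for an MLX call such as `mx.arange(10)` so the example runs without MLX installed. Unlike the snippet in the post, it also starts and joins the threads.

```python
import threading


def fake_inference(prompt_id: int) -> int:
    # Hypothetical stand-in for an MLX computation (assumption, not MLX API).
    return sum(range(10)) + prompt_id


def run_parallel(n_workers: int) -> dict[int, int]:
    """Run fake_inference in n_workers threads and collect the results."""
    results: dict[int, int] = {}
    lock = threading.Lock()  # guards writes to the shared results dict

    def worker(i: int) -> None:
        value = fake_inference(i)
        with lock:
            results[i] = value

    threads = [threading.Thread(target=worker, args=(i,)) for i in range(n_workers)]
    for t in threads:
        t.start()  # a Thread does nothing until it is started
    for t in threads:
        t.join()   # wait for every worker before reading results
    return results


results = run_parallel(4)
```

With a thread-safe array library, the body of `worker` could call into that library directly; the start/join structure would stay the same.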

Pedro Cuenca retweeted

Carles Reina@Carles_Reina·3d

Mega excited to be officially launching @ElevenLabs Spain today. My home country and one of the fastest growing markets for us. Let's go!!

Pedro Cuenca retweeted

Prince Canuma@Prince_Canuma·4d

Shoutout to @0xClandestine for the quick fix! Was literally driving 300KM back from an appointment to fix this. @0xClandestine: “Lemme know if you need anything for your PR”💥 This is why I love this community.

Ivan Fioravanti ᯅ@ivanfioravanti

MLX DeepSeek-V4-Flash-2bit-DQ MLX 4K context issue solved!
Benchmark results on Apple M5 Max, 128.0GB RAM, 18 CPU cores, 40 GPU cores
A comparison of M3 Ultra vs M5 Max, including batch performance, will follow shortly.
0.5k pp 446 tg 42 t/s mem 97.8GB kv 0.02GB
1k pp 578 tg 42 t/s mem 98.1GB kv 0.02GB
2k pp 622 tg 40 t/s mem 99.2GB kv 0.03GB
4k pp 570 tg 37 t/s mem 100.7GB kv 0.04GB
8k pp 513 tg 37 t/s mem 101.4GB kv 0.06GB
16k pp 390 tg 37 t/s mem 102.7GB kv 0.12GB
32k pp 343 tg 36 t/s mem 104.5GB kv 0.23GB
64k pp 297 tg 34 t/s mem 109.4GB kv 0.45GB
This is using this PR from @0xClandestine 🔥 It's faster than yesterday! I bet it's using matmul in hardware much more. github.com/Blaizzy/mlx-lm…

Pedro Cuenca retweeted

clem 🤗@ClementDelangue·4d

Top 3 trending models of the week on HF: @deepseek_ai @OpenAI & @Alibaba_Qwen!

Pedro Cuenca retweeted

steven@Tu7uruu·4d

Today we launch smol-audio
A collection of notebooks & scripts to build on cutting-edge local audio models ⚡️
Already in the cookbook:
> Fine-tune Whisper / Parakeet / Voxtral / Granite Speech
> Fine-tune Audio Flamingo 3 (full + LoRA)
> Dialogue TTS with Dia-1.6B
> Zero-shot video + audio↔text retrieval with Meta's PE-AV
More to come. What would you like to see next? Reply with suggestions.

Pedro Cuenca@pcuenq·4d

@ariG23498 @RisingSayak 👀👀

Aritra 🤗@ariG23498·4d

[Hugging Face ML Club India] We are beyond excited for the next virtual event. We host an incredible researcher and more than that an idol of mine (pretty sure of @RisingSayak's as well). They will be talking about the slow death of scaling. I am pretty sure you know who that is, but more information coming soon. Keep your eyes glued to this space. 🤗

Pedro Cuenca retweeted

Xuan-Son Nguyen@ngxson·4d

I'm giving a talk at GOSIM 2026 about llama.cpp. It will be a high-level overview of what we achieved in the past year. Get your ticket here --> paris2026.gosim.org

Pedro Cuenca retweeted

swappy@swaapppyyy·4d

Wanted to post this yesterday but I was too tired. My team and I managed to adapt 4 tasks from the @ProximalHQ FrontierSWE benchmark as OpenEnv-compatible environments and make them run on HF Spaces as part of our hackathon submission. Check out the repo at github.com/3xcaffeine/fro…
