Florian Soudan

74 posts

@FSoudan

AI team lead. I manage cross-functional teams delivering industrial AI solutions.

Montréal, Québec · Joined October 2013

195 Following · 20 Followers

Florian Soudan@FSoudan·2 Mar

@RisingSayak @ben_nebulous Surprisingly, Nano Banana 2 is doing quite well, but I love the work done to get physically correct edits. Groundwork for world models!

Sayak Paul@RisingSayak·27 Feb

Editing images is a series of state transitions between the source image and the edited image that we want. Yet, the existing paradigm doesn't explicitly include any transitioning priors in the editing process. This becomes particularly prevalent for edits involving causal dynamics (e.g., refraction, deformation). To model this kind of physics-informed information, we leverage the rich priors present in videos and introduce PhysicEdit 🔥 TL;DR: We fine-tune QwenImage Edit on a curated dataset of videos with reasoning traces and fixed-length transition queries to do solid physics-aware image editing! In the process, we introduce a cool dataset, "PhysicTran38K", consisting of 38K transition trajectories across five physical domains, and devise a method to provide supervision from it to QwenImage Edit. Hop in to learn more ⬇️

344

42.6K

Florian Soudan@FSoudan·15 Dec

@PontiEdoardo Very interesting! Isn’t the model way slower as the inputs and outputs are now 3 to 5 times longer?

103

Edoardo Ponti@PontiEdoardo·15 Dec

Finally, you can count the r's in strawberry and check if 3.11 is higher than 3.9 without tokenisation interfering: Here's Bolmo, a fully open byte-level LLM with latent tokenisation, derived from a SOTA LLM (Olmo 3). Promising on coding and char-level understanding!

Ai2@allen_ai

Introducing Bolmo, a new family of byte-level language models built by "byteifying" our open Olmo 3—and to our knowledge, the first fully open byte-level LM to match or surpass SOTA subword models across a wide range of tasks. 🧵

4.2K
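
The thread above turns on what a byte-level model sees. As a rough illustration only (this is not Bolmo's code; the helper names `byte_view` and `count_char` are made up for this sketch), the Python below shows the UTF-8 byte sequence for a word, counts a character directly in that sequence, and prints how many positions a few strings occupy when each byte is one input step. How many subword tokens the same strings would take depends on the tokenizer and is not computed here.

```python
def byte_view(text: str) -> list[int]:
    """Return the UTF-8 byte values for `text`, one integer per byte."""
    return list(text.encode("utf-8"))


def count_char(text: str, char: str) -> int:
    """Count occurrences of `char` by scanning the UTF-8 byte sequence."""
    return text.encode("utf-8").count(char.encode("utf-8"))


if __name__ == "__main__":
    word = "strawberry"
    # One position per byte: every letter is individually visible.
    print(byte_view(word))
    print(count_char(word, "r"))

    # Sequence length in bytes for a few inputs (hypothetical examples).
    for sample in ["3.11", "3.9", "Montréal"]:
        print(sample, len(byte_view(sample)))
```

Non-ASCII characters such as "é" take two bytes in UTF-8, so byte-level sequence length also depends on the script being written.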

Florian Soudan@FSoudan·15 Aug

@mervenoyann Thanks for that. Generic question on your training notebooks: why don’t you use Lightning or any other PyTorch training library? Do you need specific control that you would not get otherwise?

merve@mervenoyann·15 Aug

HF: huggingface.co/merve/smol-vis… GH: github.com/merveenoyan/sm… 🤗

1.6K

merve@mervenoyann·15 Aug

made a small notebook on fine-tuning DINOv3 on image classification 🦖🦕 we will have DINOv3 task heads in transformers at some point, but you can customize and use this notebook in the meantime! 🤗

421

26.1K

Florian Soudan@FSoudan·8 Apr

@Prince_Canuma Hey MLX Prince! I was wondering if you were planning to keep working on Phi 4? I tried the transformers code and the ONNX version, both on MPS, with no luck. Thanks again for all the work you do for the community

Prince Canuma@Prince_Canuma·12 Mar

Hell yeah! 🔥 Phi-4-multimodal port to MLX update #05 Language model only inference is working 🚀 Next step, load LoRAs and test vision and audio inference.

6.7K

Florian Soudan@FSoudan·22 Mar

@Prince_Canuma @lin72h @lllucas @ivanfioravanti @love_police_ok Hey! Thanks so much for your work! Do you plan to finish the Phi 4 multimodal port in the near future? 🙏🏻

Prince Canuma@Prince_Canuma·22 Mar

@lin72h @lllucas @ivanfioravanti @love_police_ok The fastest community🚀

430

Prince Canuma@Prince_Canuma·22 Mar

New Audio Language Models on MLX 🔥🚀
- Orpheus
- Sesame
- Suno Bark
Shoutout to @lllucas, @ivanfioravanti @love_police_ok and Chi Kim for their amazing contributions
Get started today:
> pip install -U mlx-audio
Please leave us a star and send us a PR: github.com/Blaizzy/mlx-au…

315

35.7K

Florian Soudan@FSoudan·2 Mar

@ClementDelangue @huggingface Me! 🤗

clem 🤗@ClementDelangue·2 Mar

Who should join @huggingface?

141

367

78.6K

Florian Soudan@FSoudan·2 Mar

@fchollet Same here. Let us know where we can still follow you 🙏🏻

François Chollet@fchollet·2 Mar

Twitter used to be my favorite place on the Internet. I've derived enormous value from it in the past 16 years. Not true anymore. Most of the people I enjoyed reading have left. My feed, which used to feature art and science and technology and humor, has become constant political propaganda -- on the opposite side of the Enlightenment values I believe in. (I like the rule of law, free speech, free markets, and democracy. I like science and reject obscurantism. I can see that Putin is a dictator, not a genius role model, and that it is Russia that is invading Ukraine, not the other way around.) I have been coming here less and less as a result -- it only brings me negative emotions. I never thought that would be possible, but I might eventually stop coming altogether.

553

405

6.9K

457K

Florian Soudan@FSoudan·13 Jul

@_akhaliq @huggingface Do you post different content on HF?

AK@_akhaliq·13 Jul

4K followers on @huggingface 🔥

116

17.9K

Florian Soudan@FSoudan·4 Jun

@simonw We did extensive OCR tests and Gemini Flash 1.5 is amazing; you can disable any safety check through the arguments of the call. No open-source model comes close to it

115

Simon Willison@simonw·3 Jun

Multimodal models like GPT-4o and Claude 3 Opus and Google Gemini seem great for OCR at first, but they're no good if they're going to refuse to return text because the content disagrees with their content policies, or they skip text labeled "ignore this text:" in the document!

133

14.2K

Simon Willison@simonw·3 Jun

Any OCR models out there with LLM-like capabilities - like the ability to "guess" partial words based on context - but that don't follow extra instructions or apply safety filters of any kind? I want reliable OCR that can't be prompt injected and that won't sometimes refuse text

580

218.6K

Florian Soudan@FSoudan·22 May

@Prince_Canuma Love the composition and design of B. I think it just lacks "Llama" for your usage

Prince Canuma@Prince_Canuma·21 May

A, B or C?

623

Florian Soudan reposted

Prince Canuma@Prince_Canuma·1 May

LLaVA Llama-3 and Phi-3 now on MLX 🎉🚀
You can now run inference locally on your Mac.
pip install -U mlx-vlm
I’m getting ~50 tokens on an M3 Max.
Model cards 👇🏾

merve@mervenoyann

Collection of Llama-3 based VLMs: huggingface.co/collections/xt… Collection of Phi-3 based VLMs: huggingface.co/collections/xt…

284

74.4K

Florian Soudan@FSoudan·21 Apr

@Prince_Canuma @qnguyen3 That’s great! Thanks. Next on your list should be Idefics 2, I got great results with it

Prince Canuma@Prince_Canuma·20 Apr

Current support: Llava and NanoLlava by @qnguyen3 🔥

796

Prince Canuma@Prince_Canuma·20 Apr

Excited to announce MLX-VLLM 🎉 The first local framework for Vision Large Language inference powered by MLX. Still WIP and we are open to contributions. It will be on PyPI soon. 🚀 github.com/Blaizzy/mlx-vl…

198

36.7K

Florian Soudan@FSoudan·19 Apr

@arpitingle Can we talk about the answer of the model? 😂

arpit@arpitingle·18 Apr

running llama3-8b-instruct-4bit using mlx

123

10.3K

Florian Soudan@FSoudan·19 Apr

@julien_c Excellent, thanks 🤗 team! It works perfectly and Llama 3 70B is so fast!

Julien Chaumond@julien_c·18 Apr

we just shipped HuggingChat on iOS 💬 The app is super polished and gives you access to the community's best open AI models, on the go. Give it a try! link to Appstore below ⤵️

133

829

176K

Florian Soudan@FSoudan·13 Apr

@mervenoyann Amazing! Thanks @mervenoyann. You are now my reference at work: « here’s a new post from Merve, let’s check it out, guys » 😅

merve@mervenoyann·12 Apr

Ever wanted to learn about fantastic vision language models and how to find and fine-tune them? 🧙🏻 We've just added support to train VLMs like LLaVa in TRL and wrote a walkthrough on vision language models! 🎉 Read about VLMs and SFTTrainer for vision hf.co/blog/vlms

177

37.5K

Florian Soudan@FSoudan·10 Mar

@Lykon4072 Congrats! This result is amazing with very difficult features usually not well understood by diffusion models: a single contrastive color applied only on the proper spots and high contrast. Question: do you get the texture directly out of #SD3 (no upscaler or refiner or post)?

439

Lykon@Lykon4072·9 Mar

#SD3

26.2K

Florian Soudan@FSoudan·26 Feb

@BenGeskin Haha @Bouletcorp, people must have tagged you 100 times on this video 🤣

Ben Geskin@BenGeskin·26 Feb

Lenovo's transparent laptop is so futuristic 🤩

269

460

4.2K

734K

Florian Soudan@FSoudan·26 Feb

@Lykon4072 Thanks, great work! The model looks amazing, can’t wait to get access to it

Florian Soudan@FSoudan·25 Feb

@Lykon4072 SD3 seems to be very "raw" compared to MJ6 which tends to force its aesthetic. How much have you worked on the style in the prompt to get this image? Or was it a lucky generation?

162

Lykon@Lykon4072·24 Feb

#SD3