rotem (@irotem98) - Twitter Profile

rotem@irotem98·13h

@KT_Project_AI can’t wait for the gemma-4-26B-A4B-it-eagle3

0

21

KVCache.AI@KT_Project_AI·13h

Huge milestone for kimi-k2.5-eagle3 reaching 40K downloads on Hugging Face, especially in just two weeks 🚀🚀🚀 It is also a great signal for the growing adoption of speculative decoding in production.

LightSeek Foundation@lightseekorg

🚀TorchSpec has been live for 2 weeks — and kimi-k2.5-eagle3 just hit 40K downloads on HuggingFace! Thanks to @KT_Project_AI Team and @vllm_project Team for the amazing collaboration. Links in comments.

1

2

7

512

rotem@irotem98·13h

@lightseekorg @KT_Project_AI @vllm_project can’t wait for the gemma-4-26B-A4B-it-eagle3

0

87

LightSeek Foundation@lightseekorg·15h

🚀TorchSpec has been live for 2 weeks — and kimi-k2.5-eagle3 just hit 40K downloads on HuggingFace! Thanks to @KT_Project_AI Team and @vllm_project Team for the amazing collaboration. Links in comments.

2

4

15

7.8K

rotem@irotem98·15 Feb

@JiaweiYang118 👀

0

203

Jiawei Yang@JiaweiYang118·15 Feb

0.9+ FID with a 150M model, 1-step pixel space😅

Jiawei Yang@JiaweiYang118

I guess I just have some 1-step pixel space generation models with 1.3 FID on ImageNet256?

9

2

104

22.2K

rotem@irotem98·11 Feb

@Alibaba_Qwen ❌apache 2.0

0

1

77

Qwen@Alibaba_Qwen·10 Feb

🚀 Introducing Qwen-Image-2.0 — our next-gen image generation model! 🎨 Your imagination, unleashed. ✨ Type a paragraph → get a pro slides ✨ Describe a scene → get photoreal 2K magic ✨ Add text → it just works (no more glitchy letters!) ✨ Key upgrades: ✅ Professional typography (1K-token prompts for slides, posters & comics) ✅ 2K native resolution with stunning detail ✅ Flawless text rendering + unified generation/editing ✅ Lighter architecture = faster inference Try it now → chat.qwen.ai/?inputFeature=… Full details → qwen.ai/blog?id=qwen-i…

154

338

2.6K

299.2K

rotem@irotem98·10 Feb

@UnslothAI @huggingface OMG😱

0

2

148

Unsloth AI@UnslothAI·10 Feb

You can now train MoE models 12× faster with 35% less VRAM via our new Triton kernels (no accuracy loss). Train gpt-oss locally on 12.8GB VRAM. In collab with @HuggingFace, Unsloth trains DeepSeek, Qwen3, GLM faster. Repo: github.com/unslothai/unsl… Blog: unsloth.ai/docs/new/faste…

50

197

1.5K

211.6K

rotem@irotem98·21 Jan

@QGallouedec i always cache tokenized dataset for fast iterations

0

12

Quentin Gallouédec@QGallouedec·20 Jan

sft, dpo, reward modeling, they all involve dataset preparation. one simple arg can significantly speed up this stage

2

57

2K

rotem@irotem98·20 Jan

@SwayStar123 woww🥺 what are the layers you target with muon?

1

0

535

sway@SwayStar123·20 Jan

Prodigy muon

7

3

83

7K

rotem@irotem98·15 Jan

@UnslothAI @vllm_project when unsloth is quiet for two weeks you know something big is cooking

1

0

2

235

Unsloth AI@UnslothAI·15 Jan

You can now do reinforcement learning training with 7× longer context and no accuracy loss, via our new batching algorithms. Long reasoning chains in RL are costly, but now we enable you to train gpt-oss with GRPO & reach 380K context on a 192GB GPU. unsloth.ai/docs/new/grpo-…

15

78

548

72.4K

rotem@irotem98·15 Jan

@SwayStar123 sooo cool! what implementation and kernels do you use?

0

77

sway@SwayStar123·15 Jan

Oh the original hyper connections paper already tested it out on diffusion! Will try out mHC on SR-DiT. (Btw i hit 3.13 FID now, maybe we can break the 3 FID wall!)

4

46

8.2K

rotem@irotem98·6 Jan

@classiclarryd @ChrisJMcCormick amazing! will i be able to take this adam to completely different training like diffusion?

0

31

Larry Dial@classiclarryd·4 Jan

New NanoGPT Speedrun WR at 113.7 (-1.4s) from @ChrisJMcCormick, w/ param bank to centralize certain per-layer params, optimized Adam, ema buffer precision increase, and gate matrices from Muon to Adam. Scientists claim records must stop after reaching 0s. github.com/KellerJordan/m…

7

18

172

13.2K

rotem@irotem98·25 Nov

@UnslothAI @PyTorch @NVIDIAAIDev @vllm_project goated

0

4

961

Unsloth AI@UnslothAI·25 Nov

You can now run FP8 reinforcement learning on consumer GPUs! Try DeepSeek-R1’s FP8 GRPO at home using only a 5GB GPU. Qwen3-1.7B fits in 5GB VRAM. We collabed with PyTorch to make FP8 RL inference 1.4× faster. Unsloth: 60% less VRAM, 12× longer context. docs.unsloth.ai/new/fp8-reinfo…

19

129

892

144.9K

rotem@irotem98·7 Nov

@xieenze_jr @lawrence_cjs @RisingSayak when sana video sprint

1

0

37

Enze Xie@xieenze_jr·6 Nov

🥳🎉Sana-video inference code has been integrated into diffusers! Thanks to @lawrence_cjs @RisingSayak and the team for making it happen. huggingface.co/docs/diffusers…

Enze Xie@xieenze_jr

The training/ Inference code and checkpoints are released. Welcome to try! github.com/NVlabs/Sana

2

7

36

7.8K

rotem@irotem98·3 Nov

@Alibaba_Qwen will training vlm support soon images with any resolution?

0

34

Qwen@Alibaba_Qwen·2 Nov

You can now run Qwen3-VL locally with Unsloth AI. 👇Fine-tune & RL via free notebooks.

Unsloth AI@UnslothAI

You can now run Qwen3-VL locally! 💜 Run the 235B variant for SOTA vision/OCR on 128GB unified memory (dynamic 4-bit). Includes our chat template fixes. Qwen3-VL-2B runs at ~40 t/s on 4GB RAM. Fine-tune & RL via Unsloth free notebooks & export to GGUF. docs.unsloth.ai/models/qwen3-vl

21

68

573

64.5K

rotem@irotem98·1 Nov

@UnslothAI @Alibaba_Qwen goat support for vllm fast inference for qwen3 vl for rl soon?

0

1

249

Unsloth AI@UnslothAI·31 Oct

You can now run Qwen3-VL locally! 💜 Run the 235B variant for SOTA vision/OCR on 128GB unified memory (dynamic 4-bit). Includes our chat template fixes. Qwen3-VL-2B runs at ~40 t/s on 4GB RAM. Fine-tune & RL via Unsloth free notebooks & export to GGUF. docs.unsloth.ai/models/qwen3-vl

25

95

587

92.2K

rotem@irotem98·23 Oct

@paulabartabajo_ @liquidai Thanks for the reply. Benchmarks for agents will be most interesting to me, but the average scores like that post will be great for everyone.

0

1

11

Pau Labarta Bajo@paulabartabajo_·23 Oct

@irotem98 @liquidai For what tasks/use cases you want to benchmark LFM2-VL vs Qwen?

1

0

47

Liquid AI@liquidai·22 Oct

Introducing our new tiny vision language model: LFM2-VL-3B 👀
> Expanded multilingual visual understanding: English, Japanese, French, Spanish, German, Italian, Portuguese, Arabic, Chinese, Korean
> 51.8% on MM-IFEval (instruction following)
> 71.4% on RealWorldQA (real-world understanding)
> Excels in single and multi-image understanding and English OCR
> Low object hallucination rate (POPE benchmark)
Download below 👇

16

58

379

57.4K

rotem@irotem98·19 Oct

@moondreamai great! is there multiple images support?

0

67

moondream@moondreamai·17 Oct

We just launched Moondream Cloud ☁️ Our hosted vision AI that’s faster, cheaper, and smarter than Gemini 2.5 Flash and GPT-5 Mini. No subs. $5 free monthly credits. Pay-as-you-go: $0.30/M input · $2.50/M output.

14

25

368

83.1K

rotem@irotem98·16 Oct

@vikhyatk mistral existed before rl was even a thing

0

9