rotem

110 posts

@irotem98

👨🏻‍🍳

Joined July 2019
123 Following · 12 Followers
LightSeek Foundation
LightSeek Foundation@lightseekorg·
🚀TorchSpec has been live for 2 weeks — and kimi-k2.5-eagle3 just hit 40K downloads on HuggingFace! Thanks to @KT_Project_AI Team and @vllm_project Team for the amazing collaboration. Links in comments.
2
4
15
7.8K
Qwen
Qwen@Alibaba_Qwen·
🚀 Introducing Qwen-Image-2.0: our next-gen image generation model!
🎨 Your imagination, unleashed.
✨ Type a paragraph → get a pro slides
✨ Describe a scene → get photoreal 2K magic
✨ Add text → it just works (no more glitchy letters!)
✨ Key upgrades:
✅ Professional typography (1K-token prompts for slides, posters & comics)
✅ 2K native resolution with stunning detail
✅ Flawless text rendering + unified generation/editing
✅ Lighter architecture = faster inference
Try it now → chat.qwen.ai/?inputFeature=…
Full details → qwen.ai/blog?id=qwen-i…
154
338
2.6K
299.2K
rotem
rotem@irotem98·
@QGallouedec i always cache tokenized dataset for fast iterations
0
0
0
12
Quentin Gallouédec
Quentin Gallouédec@QGallouedec·
sft, dpo, reward modeling, they all involve dataset preparation. one simple arg can significantly speed up this stage
2
2
57
2K
rotem
rotem@irotem98·
@SwayStar123 woww🥺 what are the layers you target with muon?
1
0
0
535
sway
sway@SwayStar123·
Prodigy muon
7
3
83
7K
Unsloth AI
Unsloth AI@UnslothAI·
You can now do reinforcement learning training with 7× longer context and no accuracy loss, via our new batching algorithms. Long reasoning chains in RL are costly, but now we enable you to train gpt-oss with GRPO & reach 380K context on a 192GB GPU. unsloth.ai/docs/new/grpo-…
15
78
548
72.4K
rotem
rotem@irotem98·
@SwayStar123 sooo cool! what implementation and kernels do you use?
0
0
0
77
sway
sway@SwayStar123·
Oh the original hyper connections paper already tested it out on diffusion! Will try out mHC on SR-DiT. (Btw i hit 3.13 FID now, maybe we can break the 3 FID wall!)
4
4
46
8.2K
Larry Dial
Larry Dial@classiclarryd·
New NanoGPT Speedrun WR at 113.7 (-1.4s) from @ChrisJMcCormick, w/ param bank to centralize certain per-layer params, optimized Adam, ema buffer precision increase, and gate matrices from Muon to Adam. Scientists claim records must stop after reaching 0s. github.com/KellerJordan/m…
7
18
172
13.2K
Unsloth AI
Unsloth AI@UnslothAI·
You can now run FP8 reinforcement learning on consumer GPUs! Try DeepSeek-R1’s FP8 GRPO at home using only a 5GB GPU. Qwen3-1.7B fits in 5GB VRAM. We collabed with PyTorch to make FP8 RL inference 1.4× faster. Unsloth: 60% less VRAM, 12× longer context. docs.unsloth.ai/new/fp8-reinfo…
19
129
892
144.9K
rotem
rotem@irotem98·
@Alibaba_Qwen will training vlm support soon images with any resolution?
0
0
0
34
Unsloth AI
Unsloth AI@UnslothAI·
You can now run Qwen3-VL locally! 💜 Run the 235B variant for SOTA vision/OCR on 128GB unified memory (dynamic 4-bit). Includes our chat template fixes. Qwen3-VL-2B runs at ~40 t/s on 4GB RAM. Fine-tune & RL via Unsloth free notebooks & export to GGUF. docs.unsloth.ai/models/qwen3-vl
25
95
587
92.2K
rotem
rotem@irotem98·
@paulabartabajo_ @liquidai Thanks for the reply. Benchmarks for agents will be most interesting to me, but the average scores like that post will be great for everyone.
0
0
1
11
Liquid AI
Liquid AI@liquidai·
Introducing our new tiny vision language model: LFM2-VL-3B 👀
> Expanded multilingual visual understanding: English, Japanese, French, Spanish, German, Italian, Portuguese, Arabic, Chinese, Korean
> 51.8% on MM-IFEval (instruction following)
> 71.4% on RealWorldQA (real-world understanding)
> Excels in single and multi-image understanding and English OCR
> Low object hallucination rate (POPE benchmark)
Download below 👇
16
58
379
57.4K
rotem
rotem@irotem98·
@moondreamai great! is there multiple images support?
0
0
0
67
moondream
moondream@moondreamai·
We just launched Moondream Cloud ☁️ Our hosted vision AI that’s faster, cheaper, and smarter than Gemini 2.5 Flash and GPT-5 Mini. No subs. $5 free monthly credits. Pay-as-you-go: $0.30/M input · $2.50/M output.
14
25
368
83.1K
rotem
rotem@irotem98·
@vikhyatk mistral existed before rl was even a thing
0
0
0
9
vik
vik@vikhyatk·
@irotem98 i’m pretty sure the bird evolved first
1
0
1
23
vik
vik@vikhyatk·
made a cute logo for our internal inference engine
9
0
90
4.2K