eisneim
955 posts

@eisneim
Gen AI | Web developer | videographer 视频大拍档特里
Shenzhen, China · Joined April 2017
327 Following · 318 Followers

Pinned Tweet
eisneim @eisneim:
I'm working on a Flux dev-based model that can relight a photo conditioned on time of day (e.g. 6 AM, 7 AM) without changing the background, unlike IC-Light and the LBM model.
2 replies · 0 reposts · 21 likes · 581 views
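A minimal sketch of how time-of-day conditioning could be wired into a diffusion backbone; the module name, hidden size, and the circular hour encoding below are assumptions for illustration, not the actual Flux-based design:

import math
import torch
import torch.nn as nn

class TimeOfDayEmbedding(nn.Module):
    """Map an hour of day (0-24) to a conditioning vector.

    The hour is encoded on a circle so 23:00 and 01:00 stay close,
    then projected to the model's hidden size.
    """
    def __init__(self, hidden_size: int = 3072):
        super().__init__()
        self.proj = nn.Sequential(
            nn.Linear(2, hidden_size),
            nn.SiLU(),
            nn.Linear(hidden_size, hidden_size),
        )

    def forward(self, hour: torch.Tensor) -> torch.Tensor:
        angle = hour / 24.0 * 2.0 * math.pi
        circ = torch.stack([angle.sin(), angle.cos()], dim=-1)
        return self.proj(circ)

# The resulting vector would be added to the usual timestep/guidance
# embedding that modulates the DiT blocks (hypothetical integration point).
emb = TimeOfDayEmbedding(hidden_size=3072)
cond = emb(torch.tensor([6.0, 7.0, 18.5]))   # 6 AM, 7 AM, 6:30 PM
print(cond.shape)                            # torch.Size([3, 3072])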
eisneim retweeted
Mickmumpitz @mickmumpitz:
Another test with the LTX 2.3 vid2vid lip sync workflow. I've been finding the inpainting mode works more reliably overall, so I'd actually recommend turning it on even for close-ups.
5 replies · 12 reposts · 143 likes · 8.6K views
eisneim retweeted
Brie Wensleydale🧀🐭 @SlipperyGem:
Yet another amazing-looking IC LoRA for LTX 2.3 lands on the scene. It's v2v and text-prompted, and does editing, removal, replacement, and restyling. Personally, I would REALLY like to know if it can handle a first frame as a reference. I'm only guessing for now, though. civitai.red/models/2553102…
3 replies · 14 reposts · 160 likes · 7.4K views
eisneim retweeted
Purz.ai @PurzBeats:
LTX 2.3 IC LoRA - EditAnything by Alisson Pereira
9 replies · 7 reposts · 102 likes · 7.9K views
eisneim retweeted
A.Robot @100PercentRobot:
Just discovered frame injection in LTX-2.3, so of course I did something weird
12 replies · 9 reposts · 153 likes · 12.7K views
eisneim @eisneim:
github.com/eisneim/LTX-2_… I created a new repo for faster and better image-to-video generation using LTX 2.3 with triple-stage sampling.
0 replies · 0 reposts · 0 likes · 164 views
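As a rough illustration only (the repo link above is truncated, so this is not its code): a triple-stage image-to-video sampler could split the denoising schedule into three passes with different step counts and guidance. The stage split, step counts, and stub denoiser below are assumptions:

import torch

def denoise(latents: torch.Tensor, steps: int, guidance: float) -> torch.Tensor:
    """Stub for one denoising pass; a real sampler would call the LTX transformer here."""
    return latents * 0.9  # placeholder update

def triple_stage_sample(init_latents: torch.Tensor) -> torch.Tensor:
    # Stage 1: a few high-guidance steps to lock in global motion and composition.
    x = denoise(init_latents, steps=8, guidance=6.0)
    # Stage 2: a medium pass to develop detail at moderate guidance.
    x = denoise(x, steps=12, guidance=4.0)
    # Stage 3: a short low-guidance pass to clean up texture and flicker.
    x = denoise(x, steps=6, guidance=2.0)
    return x

# Illustrative latent shape for a short clip.
video_latents = triple_stage_sample(torch.randn(1, 16, 97, 60, 104))
print(video_latents.shape)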
eisneim @eisneim:
github.com/GAIR-NLP/daVin… New video model: a 15B-parameter, 40-layer Transformer that jointly processes text, video, and audio via self-attention only. No cross-attention, no multi-stream complexity. Achieves an 80.0% win rate vs Ovi 1.1 and 60.9% vs LTX 2.3.
0 replies · 1 repost · 5 likes · 157 views
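A toy sketch of the "self-attention only" idea: tokens from all three modalities are concatenated into a single sequence and processed by ordinary transformer blocks, with no per-modality cross-attention streams. Dimensions and the single layer below are illustrative, not the actual 15B architecture:

import torch
import torch.nn as nn

d_model = 512
block = nn.TransformerEncoderLayer(d_model=d_model, nhead=8, batch_first=True)

# Per-modality token sequences, already embedded to a shared width.
text_tokens  = torch.randn(1, 77, d_model)
video_tokens = torch.randn(1, 1024, d_model)
audio_tokens = torch.randn(1, 256, d_model)

# One joint sequence; every token attends to every other token.
joint = torch.cat([text_tokens, video_tokens, audio_tokens], dim=1)
out = block(joint)   # a real model would stack ~40 such layers
print(out.shape)     # torch.Size([1, 1357, 512])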
eisneim retweeted
Tongyi Lab @Ali_TongyiLab:
1/2 Qwen3.5 is here. The next frontier of Native Multimodal Agents is open. 🚀 We are thrilled to release Qwen3.5-397B-A17B, our flagship open-weight vision-language model, built for the future of coding, reasoning, and seamless multimodal interaction.
Key highlights:
- Inference efficiency: a massive 397B total parameters, but only 17B active, delivering flagship power at a fraction of the cost.
- Hybrid architecture: innovative Gated Delta Networks (linear attention) + sparse MoE for extreme speed.
- True multimodality: exceptional performance across GUI interaction, video comprehension, and agentic workflows.
- Global scale: Qwen3.5 now supports over 200 languages, empowering developers and enterprises to build smarter, faster, and more versatile AI agents.
45 replies · 186 reposts · 1.6K likes · 449.4K views
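A toy sketch of why a 397B-total / 17B-active MoE is cheap at inference: a router selects a small top-k subset of experts per token, so only those experts' weights participate in the forward pass. Expert count, top-k, and sizes below are illustrative and not Qwen3.5's actual configuration:

import torch
import torch.nn as nn
import torch.nn.functional as F

class TinySparseMoE(nn.Module):
    def __init__(self, d_model=256, d_ff=1024, n_experts=16, top_k=2):
        super().__init__()
        self.router = nn.Linear(d_model, n_experts)
        self.experts = nn.ModuleList([
            nn.Sequential(nn.Linear(d_model, d_ff), nn.SiLU(), nn.Linear(d_ff, d_model))
            for _ in range(n_experts)
        ])
        self.top_k = top_k

    def forward(self, x):                      # x: (tokens, d_model)
        logits = self.router(x)
        weights, idx = logits.topk(self.top_k, dim=-1)
        weights = F.softmax(weights, dim=-1)
        out = torch.zeros_like(x)
        for slot in range(self.top_k):
            for e, expert in enumerate(self.experts):
                mask = idx[:, slot] == e       # tokens routed to expert e in this slot
                if mask.any():
                    out[mask] += weights[mask, slot, None] * expert(x[mask])
        return out

moe = TinySparseMoE()
y = moe(torch.randn(10, 256))
# Only 2 of 16 experts run per token; the same principle is what keeps
# roughly 17B of the 397B parameters active at a time.
print(y.shape)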
eisneim retweeted
Ivan Fioravanti ᯅ @ivanfioravanti:
OpenCode + MLX + Qwen3.5-397B-A17B-4bit. The video is sped up 8x, but the goal is showing that it works! This is something unimaginable just a few months ago. The MLX team is pushing like crazy and M5 Ultra will do the rest 🚀
24 replies · 48 reposts · 522 likes · 48.4K views
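A minimal sketch of running an MLX-quantized model locally, assuming the mlx_lm package's load/generate helpers; the 4-bit repo id below is hypothetical and should be replaced with the real conversion:

# pip install mlx-lm   (Apple Silicon only)
from mlx_lm import load, generate

# Hypothetical 4-bit MLX conversion name; substitute the actual repo id.
model, tokenizer = load("mlx-community/Qwen3.5-397B-A17B-4bit")

prompt = "Write a Python function that reverses a linked list."
text = generate(model, tokenizer, prompt=prompt, max_tokens=256)
print(text)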
eisneim retweeted
Wildminder @wildmindai:
Capybara? A 14B model for T2V, T2I, TV2V, TI2I. Based on HunyuanVideo1.5; uses byt5-small, Glyph-SDXL-v2, SigLIP; 480p-1080p; 16.7GB model, 5GB VAE. Mostly for video editing. huggingface.co/xgen-universe/…
0 replies · 22 reposts · 148 likes · 16.8K views
eisneim retweeted
Gorden Sun @Gorden_Sun:
BitDance: ByteDance's AI image-generation model, open-sourced on the first day of Chinese New Year. Its biggest selling point is speed: it uses a high-compression visual tokenizer to map an image to a compact sequence of binary tokens, and each diffusion step predicts 64 tokens in parallel. So even though the model is 14B, image generation is very fast. Model: huggingface.co/collections/sh… GitHub: github.com/shallowdream20…
[attached image]
3 replies · 28 reposts · 120 likes · 15.4K views
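A toy sketch of the two ideas described in the post: an image mapped to a compact sequence of binary tokens, and a decoding loop that fills in 64 tokens per step in parallel rather than one at a time. Everything below is illustrative and not the BitDance code:

import torch

seq_len, group = 4096, 64          # compact binary token sequence, 64 tokens per step

def predict_logits(tokens: torch.Tensor) -> torch.Tensor:
    """Stub for the 14B model: returns a probability for every still-masked bit."""
    return torch.rand(tokens.shape[0], seq_len)

def parallel_decode(batch: int = 1) -> torch.Tensor:
    tokens = torch.full((batch, seq_len), -1)          # -1 marks "not yet generated"
    for start in range(0, seq_len, group):
        probs = predict_logits(tokens)
        block = slice(start, start + group)
        # All 64 tokens in the block are sampled at once, in parallel.
        tokens[:, block] = torch.bernoulli(probs[:, block]).long()
    return tokens                                      # binary token ids in {0, 1}

ids = parallel_decode()
print(ids.shape, ids.unique())      # 4096 binary tokens produced in 64 parallel steps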
eisneim retweeted
Wildminder @wildmindai:
Self-Refining Video Sampling: an inference-time method that uses a video generator as its own refiner to correct physics and motion. No retraining needed; scores >70% human preference; validated on Wan2.2 & Cosmos. agwmon.github.io/self-refine-vi…
4 replies · 39 reposts · 261 likes · 32K views
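A rough sketch of the inference-time self-refinement idea as I read it: partially re-noise the generator's own output and denoise it again with the same model so it can correct its own motion and physics errors. The stub model, noise strength, and round count are assumptions, not the paper's exact procedure:

import torch

def sample(model, noise):
    """Stub: a full text-to-video sampling pass."""
    return model(noise)

def denoise_from(model, noisy, strength):
    """Stub: resume denoising from an intermediate noise level."""
    return model(noisy)

def self_refine(model, noise, rounds: int = 2, strength: float = 0.4):
    video = sample(model, noise)
    for _ in range(rounds):
        # Partially re-noise the model's own output...
        noisy = (1 - strength) * video + strength * torch.randn_like(video)
        # ...and let the same generator act as its own refiner.
        video = denoise_from(model, noisy, strength)
    return video

toy_model = lambda x: torch.tanh(x)            # stand-in for Wan2.2 / Cosmos
out = self_refine(toy_model, torch.randn(1, 16, 49, 60, 104))
print(out.shape)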
eisneim retweeted
AK @_akhaliq:
OmniTransfer: All-in-one Framework for Spatio-temporal Video Transfer
2 replies · 11 reposts · 88 likes · 10.7K views
eisneim retweeted
DailyPapers @HuggingPapers:
Qwen just dropped Qwen3-TTS on Hugging Face: voice cloning from 3s of audio, 10-language support, and 97ms streaming latency for ultra-realistic speech generation.
[attached image]
2 replies · 32 reposts · 209 likes · 19.1K views
eisneim retweeted
Radical Numerics @RadicalNumerics:
Scaling scientific world models requires co-designing architectures, training objectives, and numerics. Today, we share the first posts in our series on low-precision pretraining, starting with NVIDIA's NVFP4 recipe for stable 4-bit training.
Part 1: radicalnumerics.ai/blog/nvfp4-par…
Part 2: radicalnumerics.ai/blog/nvfp4-par…
We cover floating-point fundamentals, heuristics, custom CUDA kernels, and stabilization techniques. Future entries will cover custom recipes and results on hybrid architectures.
[attached image]
9 replies · 94 reposts · 528 likes · 67.9K views
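A small sketch of the core idea behind block-scaled FP4: each short block of values shares one scale, and the scaled values snap to the few magnitudes a 4-bit E2M1 float can represent. The block size of 16 follows NVFP4's description, but the scale handling here is simplified and is not NVIDIA's exact recipe:

import torch

# Magnitudes representable by a signed E2M1 (FP4) value.
FP4_LEVELS = torch.tensor([0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0])

def quantize_fp4_blocks(x: torch.Tensor, block: int = 16) -> torch.Tensor:
    """Fake-quantize a 1-D tensor using one scale per 16-element block."""
    x = x.reshape(-1, block)
    scale = x.abs().amax(dim=1, keepdim=True) / FP4_LEVELS.max()   # per-block scale
    scale = torch.clamp(scale, min=1e-12)
    scaled = (x / scale).abs()
    # Snap each magnitude to the nearest representable FP4 level.
    idx = (scaled.unsqueeze(-1) - FP4_LEVELS).abs().argmin(dim=-1)
    q = FP4_LEVELS[idx] * x.sign()
    return (q * scale).reshape(-1)                                  # dequantized view

w = torch.randn(1024)
w_q = quantize_fp4_blocks(w)
print((w - w_q).abs().mean())   # quantization error introduced by 4-bit storage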
eisneim retweeted
AK @_akhaliq:
HeartMuLa: A Family of Open-Sourced Music Foundation Models
5 replies · 13 reposts · 64 likes · 9.7K views
eisneim retweeted
PhotoX86 @PhotoX86:
An interesting little node that relights an image using painted color blocks as guidance. Image 1 is the original; image 2 shows the painted color blocks (normally you paint the blocks along the image edges, see image 4; here I tried painting them on the body); image 3 is the result, with the lighting regenerated according to the position and color of the blocks. The results aren't always satisfying yet, but this is only an alpha release, so I'm looking forward to future updates. Node name: Qwen-Edit-2511_LightingRemap_Alpha0.2
[four attached images]
1 reply · 1 repost · 1 like · 230 views
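Purely as an illustration of the input format described above (not the node's implementation): painted color blocks can be turned into a soft lighting hint by blurring them heavily, and a relighting model would condition on that hint together with the original photo:

import torch
import torch.nn.functional as F

def lighting_hint_from_blocks(blocks: torch.Tensor, blur_passes: int = 20) -> torch.Tensor:
    """blocks: (3, H, W) image that is black except for the painted color blocks.
    Repeated box blurring spreads each block's color into a soft light falloff."""
    x = blocks.unsqueeze(0)
    kernel = torch.ones(3, 1, 15, 15) / (15 * 15)
    for _ in range(blur_passes):
        x = F.conv2d(x, kernel, padding=7, groups=3)
    return x.squeeze(0)

# Toy example: a warm orange block painted near the top-left of a 256x256 canvas.
blocks = torch.zeros(3, 256, 256)
blocks[:, 20:60, 20:60] = torch.tensor([1.0, 0.6, 0.2]).view(3, 1, 1)
hint = lighting_hint_from_blocks(blocks)
print(hint.shape)   # (3, 256, 256) soft map fed alongside the original photo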
eisneim retweeted
Wildminder @wildmindai:
Wan again. Reward Forcing: real-time streaming video generation, 23 FPS with interactive control; infinite generation; built on Wan2.1-T2V-1.3B. reward-forcing.github.io
3 replies · 14 reposts · 125 likes · 7.9K views
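A rough sketch of how infinite streaming generation is usually structured: the model emits one short chunk of frames at a time, conditioned on the tail of what it just produced, so the stream can run indefinitely and prompts can change between chunks. Chunk size, context length, and the stub generator below are assumptions, not the Reward Forcing code:

import torch

def generate_chunk(context: torch.Tensor, prompt: str, frames: int = 16) -> torch.Tensor:
    """Stub for the streaming generator: new frames conditioned on recent ones."""
    b, c, _, h, w = context.shape
    return torch.randn(b, c, frames, h, w)

def stream(prompt: str, chunks: int = 4, ctx_frames: int = 8):
    context = torch.zeros(1, 16, ctx_frames, 60, 104)        # latent context window
    for _ in range(chunks):                                   # in practice: while True
        new = generate_chunk(context, prompt)
        yield new                                             # hand frames to the player
        # Keep only the most recent frames as context for the next chunk.
        context = torch.cat([context, new], dim=2)[:, :, -ctx_frames:]

for chunk in stream("a capybara surfing at sunset"):
    print(chunk.shape)   # each chunk is decoded and displayed as it arrives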
eisneim retweeted
Tencent Hy @TencentHunyuan:
💡HunyuanVideo1.5 Update: We are now releasing the 480p I2V step-distilled model, which generates videos in 8 or 12 steps (recommended)! On an RTX 4090, end-to-end generation time is reduced by 75%, and a single RTX 4090 can generate a video within 75 seconds. The step-distilled model maintains quality comparable to the original model while achieving a significant speedup. For even faster generation, you can also try 4 steps (faster, with slightly reduced quality). 🔗 Check out the GitHub repo: github.com/Tencent-Hunyua…
15 replies · 70 reposts · 545 likes · 39K views