Wildminder

2.2K posts

@wildmindai

Physicist, Programmer, Designer

Berlin · Joined December 2024
88 Following · 10K Followers
Wildminder @wildmindai
Helix4D. Turns a simple video into a high-quality, animated 3D model.
- reconstructs shattering, melting, and transparent dynamics
- Trellis2 backbone
- 2.3x faster than full-attention baselines
- optimized for animation and 3D content creation
snap-research.github.io/helix4d/
1
4
27
1.2K
Wildminder @wildmindai
SUPIR upscaler is outdated. ASASR turns blurry, low-quality photos into sharp, high-res images and prevents fake hallucinated details.
- improves OCR
- high segmentation accuracy
- based on FLUX.1 dev
This looks sweet.
github.com/wafer-bob/ASASR
4
18
178
7.6K
Wildminder @wildmindai
SCOPE: Wan2.2 for creating playable FPS scenes. You use a standard gamepad to control the video output in real time. Keeps the background stable while you execute fast actions.
- up to 10-DoF control: firing, reloading, 4-axis movement
- 0.9 Dynamic Degree with 71% zero-shot action completion
Useful for rapid prototyping.
z2tong.github.io/SCOPE/
0
5
44
4.1K
Wildminder @wildmindai
GenRecon turns a handful of room photos into professional-grade 3D models. Nice for games, VR, and architectural design.
- PBR meshes from sparse RGB images
- Trellis.2 + DINOv3
- partitions large scenes into overlapping chunks for global consistency
kasothaphie.github.io/GenRecon/
2
30
223
11.8K
Wildminder @wildmindai
Awesome. NVIDIA dropped PiD: fast high-res latent decoding via pixel diffusion!
- replaces the VAE
- 4x/8x upsampling
- 2K decoding in <1s on an RTX 5090
- works with FLUX.1/SD3/Z
- rapid generation previews
Sharper details and much lower hardware lag compared to standard methods.
research.nvidia.com/labs/sil/proje…
7
57
419
20.9K
Wildminder @wildmindai
InstructAV2AV: Instruction-guided joint audio-video editing.
- Ovi + Wan2.2 + T5
- change people
- edit speech
- add/remove things
- mask-free object/sound replacement
You type what you want, and the model keeps the background perfectly intact while syncing the new visuals and sounds.
hjzheng.net/projects/Instr…
2
10
99
6.7K
Wildminder @wildmindai
Microsoft Lens in ComfyUI. Not sure how I feel about this model: it’s good in one spot, awful in another.
> people: 4/10
> abstract stuff: 5/10
> nature/animals: 8/10
The training dataset was probably quite limited.
huggingface.co/Comfy-Org/Lens
1
4
66
6.8K
Wildminder @wildmindai
Microsoft finally releases the full weights for the Lens T2I 3.8B models (Lens/Turbo/Base).
- uses FLUX.2 VAE + GPT-OSS
- 1440x1440
- 4-step gen with Turbo
Looks pretty interesting.
huggingface.co/microsoft/Lens
6
26
231
12.4K
Wildminder @wildmindai
Tencent dropped Z-Image 6B with pixel-space gen.
- no VAE
- 1K resolution
- high efficiency
It’s a transfer learning framework that turns any regular Flux/SD model into a full pixel-space beast.
nju-pcalab.github.io/projects/L2P/
4
68
513
45.4K
Wildminder @wildmindai
CogOmniControl by Tencent. Reasoning-driven controllable video gen. Uses CogVLM + CogOmniDiT to translate sparse storyboards/sketches into production-quality video. Beats VINO and VACE-Wan2.1.
um-lab.github.io/CogOmniControl/
6
51
338
23.3K
Wildminder @wildmindai
WavFlow by Meta. It's the audio equivalent of pixel-space image gen. Works directly on the raw waveform to avoid any information loss, no VAE.
- T2A / VT2A via 1.03B MMDiT
- high-fidelity 44.1 kHz output
facebookresearch.github.io/WavFlow/
6
16
156
11K
Wildminder @wildmindai
ComfyUI just turned into the ultimate Swiss Army knife. MediaPipe face detection is now natively supported. We’ll likely get more stuff too: object detection, text classification, embeddings, audio classification, image segmentation.
huggingface.co/Comfy-Org/medi…
0
9
98
4.7K
Wildminder @wildmindai
Awesome AI Auto-Research. A comprehensive guide to automating the scientific research lifecycle. Shifts AI from being a simple assistant to an autonomous researcher. The GitHub repo has a huge collection of papers and code on agentic AI research.
github.com/worldbench/awe…
0
2
15
1.6K
Wildminder @wildmindai
Aurora: agent-driven video editor designed to fix lazy user requests. A bridge between a vague idea and a high-quality video.
- Qwen3-VL-8B + Wan2.2
- automatically retrieves assets, makes masks
- handles object insertion, removal, and background changes in one pipeline
Useful for brand placement, concept editing, smart swaps, etc.
yongshengyu.com/Aurora-Page/
1
16
100
6K
Wildminder @wildmindai
PanoWorld. An interesting way to use Qwen-Edit. It converts 2D floor plans into photorealistic, consistent VR home tours. Great for real estate and interior designers. It lets you walk through a home that hasn’t been built or furnished yet. Ensures seamless 360° views via CPRoPE.
jjrcn.github.io/PanoWorld-proj…
7
39
331
19.8K
Wildminder @wildmindai
Pixal3D-ComfyUI. Nice, full-featured nodes for Pixel-Aligned 3D Generation.
- GLB export
- FlashAttention support
- manual camera control
- native model management
github.com/Saganaki22/Pix…
0
15
117
5.1K
Wildminder @wildmindai
LiTo by Apple: Surface light field tokenization for high-fidelity 3D generation. Turns a single photo into a realistic 3D model, capturing both its shape and how light reflects off its surface. Accurately reproduces shiny highlights and glossy finishes.
apple.github.io/ml-lito/
0
15
93
37.9K