Jürgen Fey

902 posts

@androidian60

ai consulting, industrial applications. tech geek. classic car geek. gadget geek. system architect. IoT. WebRTC and related realtime use cases. CTO

Munich and Europe · Joined May 2009
1.1K Following · 132 Followers
Jürgen Fey retweeted
Wildminder @wildmindai
More cool stuff from NVIDIA: LocateAnything, a high-speed visual search engine. You provide a text prompt and it instantly pinpoints that object's exact location in an image.
- 10x speedup for dense object detection
- Qwen2.5-3B + Moon-ViT
- Fast/Slow/Hybrid modes
- trained on 138M samples for UI, docs, generic grounding
research.nvidia.com/labs/lpr/locat…
English
13
151
1.2K
48.9K
Jürgen Fey retweeted
Aleksa Gordić (水平问题)
new in-depth blog post time: Inside the Transformer: The Life of a Token
a deep dive into a modern dense transformer, i cover YaRN (why does pairwise coordinate rotation induce positional information?), hybrid attention (getting to 160k context length), soft capping, QK normalization, etc. as the token flows through the transformer
bonus transformer math: FLOPs/token formula (and when is 6N formula broken), cluster sizing (how big of a cluster do you need given the model/data size and experiment throughput of interest), and more
English
19
142
999
47K
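The "6N" figure mentioned in the post above is the common rule of thumb that training costs roughly 6 FLOPs per parameter per token (about 2N for the forward pass and 4N for the backward pass). The blog post itself is not reproduced here, so the following is only a minimal sketch of that rule and of a back-of-the-envelope cluster-sizing estimate. The function names, the peak throughput, and the utilization value are assumptions chosen for illustration, and the estimate ignores attention FLOPs that grow with context length, which is one situation where the rule is known to undercount.

```go
package main

import "fmt"

// trainingFLOPs applies the rule of thumb C ≈ 6 * N * D:
// about 2N FLOPs per token for the forward pass and 4N for the backward pass.
// It deliberately ignores attention FLOPs that scale with context length.
func trainingFLOPs(params, tokens float64) float64 {
	return 6 * params * tokens
}

// gpusNeeded estimates how many accelerators are needed to finish in `days`,
// given a per-device peak throughput in FLOP/s and an assumed utilization (MFU).
func gpusNeeded(totalFLOPs, peakFLOPsPerGPU, mfu, days float64) float64 {
	seconds := days * 24 * 3600
	return totalFLOPs / (peakFLOPsPerGPU * mfu * seconds)
}

func main() {
	// Hypothetical numbers for illustration only: 1B parameters, 20B tokens,
	// 1e15 FLOP/s peak per device, 40% utilization, one-day budget.
	c := trainingFLOPs(1e9, 20e9)
	fmt.Printf("total training FLOPs: %.3g\n", c)
	fmt.Printf("devices for a 1-day run: %.2f\n", gpusNeeded(c, 1e15, 0.4, 1))
}
```

Keeping utilization as an explicit parameter makes it easy to see how sensitive the device count is to that one assumption.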
Jürgen Fey retweeted
Sebastian Raschka
Added a DeepSeek Sparse Attention (DSA) from-scratch implementation to my LLMs-from-scratch repo thanks to an awesome new reader contrib. With motivation, overview, and GPT-style model reference implementation as standalone example code: github.com/rasbt/LLMs-fro…
English
42
239
1.8K
72K
Jürgen Fey retweeted
NVIDIA AI @NVIDIAAI
A closer look at LongLive-2.0:
• Generates long video at 720p
• Preserves subject and background consistency across multi-shot sequences
• Able to switch prompts at chunk boundaries
English
2
6
68
8.3K
Jürgen Fey retweeted
Meituan LongCat @Meituan_LongCat
Meet LongCat-Video-Avatar 1.5 🐱, our upgraded, open-source digital human framework. Built for real production, not just short demos.
What's New:
🔹 Upgraded Audio Encoder: Replaces Wav2Vec2 with Whisper-Large, yielding significantly smoother and more natural lip dynamics.
🔹 Production-Ready Stability: Achieves accurate lip-synchronization, full-body temporal stability, and robust long-video generation with strict identity consistency.
🔹 Stylized Domain Generalization: Robustly generalizes to anime, animals, and complex real-world conditions such as multi-person interactions and object handling.
🔹 Efficient 8-Step Inference: Advanced step distillation accelerates inference to 8 NFE, balancing cost-effective serving with exceptional visual fidelity.
📊 LongCat-Video-Avatar 1.5 performs strongly in realism, naturalness, and stability, outperforming leading open-source models and closed systems.
🐱 Avatar 1.5 framework is now open source:
🔗 Weights & Code: github.com/meituan-longca…
🔗 HuggingFace: huggingface.co/meituan-longca…
🔗 Tech Report: github.com/meituan-longca…
🔗 Project Page: meigen-ai.github.io/LongCat-Video-…
English
13
41
223
29.6K
Jürgen Fey @androidian60
Part 5/7 of my Local Agent with Go series is online. This one extends the agent with narrow MCP servers, smart web fetching via llms.txt, Lemonade Server management for AMD, full end-to-end execution, and troubleshooting. bit.ly/4dqXLjS #GenAI #Golang #MCP #LocalAI #LLM
English
0
0
1
33
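The post above does not say how the series implements "smart web fetching via llms.txt". As a rough illustration only, the sketch below assumes the llms.txt convention of a Markdown index served at the site root, and shows how an agent might derive that location from any page URL before deciding what to fetch. The function name `llmsTxtURL` is hypothetical and is not taken from the series.

```go
package main

import (
	"fmt"
	"net/url"
)

// llmsTxtURL returns the site-root /llms.txt location for an absolute page URL.
// Query strings and fragments of the input are dropped on purpose.
func llmsTxtURL(page string) (string, error) {
	u, err := url.Parse(page)
	if err != nil {
		return "", err
	}
	if u.Scheme == "" || u.Host == "" {
		return "", fmt.Errorf("not an absolute URL: %q", page)
	}
	root := url.URL{Scheme: u.Scheme, Host: u.Host, Path: "/llms.txt"}
	return root.String(), nil
}

func main() {
	s, err := llmsTxtURL("https://example.com/docs/guide?x=1")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(s)
}
```

A real fetcher would still need to handle a missing file (for example by falling back to the HTML page), which this sketch leaves out.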
Jürgen Fey retweeted
Shubham Sharma @HappyyPablo
open sourcing Marlin-2B 🐟
a tiny VLM to extract structured information from videos
Marlin is finetuned for two questions devs want to ask in their videos: what is happening, and when?
Best open model in its weight class, competitive with Gemini-2.5-flash at only 2B params 🧵
English
135
521
4.6K
297.3K
Jürgen Fey retweeted
GitHub Projects Community @GithubProjects
Image generation used to require a cluster. SANA runs 4096x4096 on a 16GB laptop. 0.6B params. Linear attention. 32x latent compression. Sub-second at 1024px. 4-bit quantization fits under 8GB. Open source with full training pipeline.
English
3
43
365
19.1K
Jürgen Fey retweeted
Georgi Gerganov @ggerganov
llama.cpp adds MTP for the Qwen3.6 family
This is a significant milestone for the local AI ecosystem. The performance jump with these changes is massive and elevates local inference on commodity hardware further. Special thanks to Aman Gupta for leading this development! github.com/ggml-org/llama…
English
48
185
1.2K
271.2K
Jürgen Fey retweeted
まだ面白い @madaomoshiroi
The "paper airplane mass-production robot" shown off at Robocon is just too cool.
Japanese
164
909
11.3K
3M
Jürgen Fey @androidian60
Part 3 of my Go agent series is online. It covers the core of the system: the agent loop. LLM calls, MCP tool execution, tool result handling, loop limits, runaway protection, compatibility across local/cloud endpoints. Link: bit.ly/4tHQYZg #Golang #LLM #Agents #MCP #Go
English
0
0
0
32
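The Part 3 post above lists the pieces of an agent loop: LLM calls, tool execution, tool result handling, loop limits, and runaway protection. The series' actual code is not shown in this feed, so the following is only a minimal sketch of that general pattern. Every type and function name here (`ToolCall`, `Reply`, `LLM`, `Tool`, `runAgent`) is hypothetical, and the function-typed stand-ins replace what would be a real chat endpoint and an MCP tool client.

```go
package main

import (
	"errors"
	"fmt"
)

// ToolCall is a hypothetical request from the model to run a named tool.
type ToolCall struct {
	Name string
	Args string
}

// Reply is a hypothetical model response: a final answer, or a tool call.
type Reply struct {
	Final string
	Call  *ToolCall
}

// LLM and Tool are stand-ins for a chat endpoint and an MCP tool client.
type LLM func(history []string) (Reply, error)
type Tool func(args string) (string, error)

// ErrLoopLimit is returned when the step budget runs out.
var ErrLoopLimit = errors.New("agent: step limit reached")

// runAgent alternates model calls and tool executions until the model returns
// a final answer or maxSteps is exhausted (the runaway guard).
func runAgent(llm LLM, tools map[string]Tool, prompt string, maxSteps int) (string, error) {
	history := []string{"user: " + prompt}
	for step := 0; step < maxSteps; step++ {
		reply, err := llm(history)
		if err != nil {
			return "", err
		}
		if reply.Call == nil {
			return reply.Final, nil
		}
		tool, ok := tools[reply.Call.Name]
		if !ok {
			// Report the problem to the model instead of aborting the run.
			history = append(history, "tool error: unknown tool "+reply.Call.Name)
			continue
		}
		out, err := tool(reply.Call.Args)
		if err != nil {
			history = append(history, "tool error: "+err.Error())
			continue
		}
		history = append(history, "tool "+reply.Call.Name+": "+out)
	}
	return "", ErrLoopLimit
}

func main() {
	// Scripted fake model: requests one tool call, then answers.
	fake := func(history []string) (Reply, error) {
		if len(history) == 1 {
			return Reply{Call: &ToolCall{Name: "echo", Args: "hi"}}, nil
		}
		return Reply{Final: "done after " + history[len(history)-1]}, nil
	}
	tools := map[string]Tool{"echo": func(a string) (string, error) { return a, nil }}
	answer, err := runAgent(fake, tools, "say hi", 5)
	fmt.Println(answer, err)
}
```

Failed or unknown tool calls still consume a step, so a model that keeps requesting a broken tool cannot spin forever.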
Jürgen Fey retweeted
Kyle Hessling @KyleHessling1
Hello again, everyone! We've got another really fun 9b, this one specifically trained for tool calling and agentic coding workflows in @NousResearch Hermes agent. Happy to report that it crushes, and as a 9b it runs on super affordable hardware.
We also hit this one with some coding domain-specific training, and it scored a 53.33% on SWE bench on a slice of 200 samples! To me, I was really shocked to see this high of a score on a 9B model in swe, correct me if I'm wrong, but I think that's nipping at the heels of the Gemma 4 series, much larger models on this particular benchmark, which is really incredible to see! It also crushes the HermesAgent-20 benchmark, scoring an 85 vs the base model's 71!
Make sure to run it hot, --temp around 1, that seems to be the sweet spot for running these particular fine tunes in harnesses. If you have trouble, you can work your way down, but it does a much better job departing from base models, overthinking when you run it, high temp ~1.
Please spin it up in Hermes and let us know your thoughts! Looking forward to hearing your feedback as always!
Also, those of you waiting for Qwopus 3.6 27B, I have put together a preliminary evaluation for you in my HF repo, go check it out; we will be releasing the full model very soon! I will put the preliminary repo in the comments! huggingface.co/Jackrong/Qwopu…
English
71
146
1.5K
119.7K
Jürgen Fey @androidian60
Part 2 of Building a Local AI Agent in Go: Understanding MCP.
▸ agent.json works against Lemonade, vLLM, Gemini, Anthropic
▸ "LLM proposes, agent disposes"
▸ MCP is symmetric: your agent as tool consumer and provider
bit.ly/4fmd8Ne #MCP #Golang #Agents
English
2
0
3
54
Jürgen Fey @androidian60
I started a new 7-part blog series: Building a Portable AI Agent in Go. Part 1 introduces the motivation and architecture behind a Go-based agent system that can orchestrate LLMs and MCP tools across local runtimes. Pt 1: bit.ly/4nvq0CF #go #genai #llm #aiagents #mcp
English
0
0
0
23
Jürgen Fey retweeted
Sebastian Raschka @rasbt
New article: a visual tour of recent LLM architecture advances, from Gemma 4 to DeepSeek V4. I focus on long-context efficiency tweaks like KV sharing, per-layer embeddings, layer-wise attention budgets, compressed attention, and mHC. Link: magazine.sebastianraschka.com/p/recent-devel…
English
45
424
2.4K
121.4K
Jürgen Fey retweeted
Simplifying AI @simplifyinAI
🚨 BREAKING: NVIDIA proved back-propagation isn't the only way to build an AI.
Billion-parameter models were trained without a single gradient. No calculus, no exploding memory, no massive GPU clusters.
The culprit? A long-dismissed technique called Evolution Strategies. NVIDIA and Oxford just made it scalable with EGGROLL, which replaces bloated mutation matrices with two tiny ones, enabling hundreds of thousands of parallel mutations at inference-level speed.
They're pretraining models from scratch using only simple integers. No backprop. No decimals.
We assumed the future of AI required endless precision hardware. Evolution had other plans.
English
30
178
1.3K
268.1K
Jürgen Fey retweeted
Kevin Lin @KevinQHLin
🌟 Introducing 🎻 Violin, an open-source Video Translation Skill.
📹 Video is the dominant medium on the internet, yet most high-quality content (lecture, talk, podcast) is locked behind a single language, leaving global audiences behind. So we built Violin: a video skill that combines speech recognition, LLM translation, and speech synthesis into one seamless pipeline.
🌐 Demo: violin-ai.com
📝 Blog: together.ai/blog/violin-op…
🔗 GitHub: github.com/shang-zhu/viol…
✨ Key Features:
🎙️ High-quality multilingual ASR & Translation & TTS.
🗣️ Personalize translation & voice (turn an academic talk into something children can follow).
💬 Chat with the video: ask any questions grounded in the video.
🧩 Supports Web app, CLI, and Agent skill.
🍃 Fully open-source under MIT.
❤️ Built with the wonderful @ShangZhu18 and advised by @james_y_zou! All features powered by @togethercompute. Try it and let us know what you think! 🎻
English
24
140
654
135.4K
Jürgen Fey retweeted
Sebastian Raschka @rasbt
A little talk on what we can learn from implementing LLM architectures from scratch in Python and PyTorch, and on how I approach new open-weight models, compare them against reference implementations, etc.: youtube.com/watch?v=TXzQ7P…
English
23
167
995
69.8K