Jürgen Fey

902 posts

@androidian60

ai consulting, industrial applications. tech geek. classic car geek. gadget geek. system architect. IoT. WebRTC and related realtime use cases. CTO

Munich and Europe · Joined May 2009

1.1K Following · 132 Followers

Jürgen Fey retweeted

Wildminder @wildmindai · 2d

More cool stuff from NVIDIA. LocateAnything is a high-speed visual search engine: you provide a text prompt and it instantly pinpoints that object's exact location in an image.
- 10x speedup for dense object detection
- Qwen2.5-3B + Moon-ViT
- Fast/Slow/Hybrid modes
- trained on 138M samples for UI, docs, generic grounding

research.nvidia.com/labs/lpr/locat…

Jürgen Fey retweeted

Aleksa Gordić (水平问题) @gordic_aleksa · 2d

new in-depth blog post time: Inside the Transformer: The Life of a Token

a deep dive into a modern dense transformer. i cover YaRN (why does pairwise coordinate rotation induce positional information?), hybrid attention (getting to 160k context length), soft capping, QK normalization, etc. as the token flows through the transformer

bonus transformer math: FLOPs/token formula (and when the 6N formula breaks), cluster sizing (how big of a cluster do you need given the model/data size and experiment throughput of interest), and more

Jürgen Fey retweeted

Sebastian Raschka @rasbt · 6d

Added a DeepSeek Sparse Attention (DSA) from-scratch implementation to my LLMs-from-scratch repo thanks to an awesome new reader contrib. With motivation, overview, and GPT-style model reference implementation as standalone example code: github.com/rasbt/LLMs-fro…

Jürgen Fey retweeted

NVIDIA AI @NVIDIAAI · 6d

A closer look at LongLive-2.0:
• Generates long video at 720p
• Preserves subject and background consistency across multi-shot sequences
• Able to switch prompts at chunk boundaries

Jürgen Fey retweeted

Meituan LongCat @Meituan_LongCat · 21 May

Meet LongCat-Video-Avatar 1.5 🐱, our upgraded, open-source digital human framework. Built for real production, not just short demos.

What's New:
🔹 Upgraded Audio Encoder: Replaces Wav2Vec2 with Whisper-Large, yielding significantly smoother and more natural lip dynamics.
🔹 Production-Ready Stability: Achieves accurate lip-synchronization, full-body temporal stability, and robust long-video generation with strict identity consistency.
🔹 Stylized Domain Generalization: Robustly generalizes to anime, animals, and complex real-world conditions such as multi-person interactions and object handling.
🔹 Efficient 8-Step Inference: Advanced step distillation accelerates inference to 8 NFE, balancing cost-effective serving with exceptional visual fidelity.

📊 LongCat-Video-Avatar 1.5 performs strongly in realism, naturalness, and stability, outperforming leading open-source models and closed systems.

🐱 Avatar 1.5 framework is now open source:
🔗 Weights & Code: github.com/meituan-longca…
🔗 HuggingFace: huggingface.co/meituan-longca…
🔗 Tech Report: github.com/meituan-longca…
🔗 Project Page: meigen-ai.github.io/LongCat-Video-…

Jürgen Fey @androidian60 · 21 May

The Local Agent with Go sources are now on GitHub: github.com/smarttechlabs-… A Go-based LLM agent with MCP tools, local/cloud LLM backends, observability, safety limits, REST/MCP surfaces, and dashboard support. #Golang #GenAI #MCP #LocalAI #Agents #LLM #AMD #NVIDIA

Jürgen Fey @androidian60 · 20 May

Part 5/7 of my Local Agent with Go series is online. This one extends the agent with narrow MCP servers, smart web fetching via llms.txt, Lemonade Server management for AMD, full end-to-end execution, and troubleshooting: bit.ly/4dqXLjS #GenAI #Golang #MCP #LocalAI #LLM

Jürgen Fey retweeted

Shubham Sharma @HappyyPablo · 19 May

open sourcing Marlin-2B 🐟 a tiny VLM to extract structured information from videos Marlin is finetuned for two questions devs want to ask in their videos: what is happening, and when? Best open model in its weight class, competitive with Gemini-2.5-flash at only 2B params 🧵

Jürgen Fey retweeted

GitHub Projects Community @GithubProjects · 19 May

Image generation used to require a cluster. SANA runs 4096x4096 on a 16GB laptop. 0.6B params. Linear attention. 32x latent compression. Sub-second at 1024px. 4-bit quantization fits under 8GB. Open source with full training pipeline.

Jürgen Fey retweeted

Georgi Gerganov @ggerganov · 18 May

llama.cpp adds MTP for the Qwen3.6 family

This is a significant milestone for the local AI ecosystem. The performance jump with these changes is massive and elevates local inference on commodity hardware further. Special thanks to Aman Gupta for leading this development!

github.com/ggml-org/llama…

Jürgen Fey @androidian60 · 19 May

Part 4 of 7 of the Local Agent with Go series is online. A walkthrough of the agent: entry point, config, agent loop, LLM client, MCP manager, tool parser, events, web dashboard, REST API, HITL, and more. bit.ly/4dfVJo4 #Golang #GenAI #LLM #AIAgents #MCP #LocalAI

Jürgen Fey retweeted

まだ面白い @madaomoshiroi · 17 May

The "paper airplane mass-production robot" shown off at Robocon is just too cool

Jürgen Fey @androidian60 · 18 May

Part 3 of my Go agent series is online. It covers the core of the system, the agent loop: LLM calls, MCP tool execution, tool result handling, loop limits, runaway protection, and compatibility across local/cloud endpoints. Link: bit.ly/4tHQYZg #Golang #LLM #Agents #MCP #Go

Jürgen Fey retweeted

Kyle Hessling @KyleHessling1 · 17 May

Hello again, everyone! We've got another really fun 9b, this one specifically trained for tool calling and agentic coding workflows in @NousResearch Hermes agent. Happy to report that it crushes, and as a 9b it runs on super affordable hardware.

We also hit this one with some coding domain-specific training, and it scored a 53.33% on SWE bench on a slice of 200 samples! To me, I was really shocked to see this high of a score on a 9B model in swe, correct me if I'm wrong, but I think that's nipping at the heels of the Gemma 4 series, much larger models on this particular benchmark, which is really incredible to see! It also crushes the HermesAgent-20 benchmark, scoring an 85 vs the base model's 71!

Make sure to run it hot, --temp around 1, that seems to be the sweet spot for running these particular fine tunes in harnesses. If you have trouble, you can work your way down, but it does a much better job departing from base models, overthinking when you run it, high temp ~1.

Please spin it up in Hermes and let us know your thoughts! Looking forward to hearing your feedback as always!

Also, those of you waiting for Qwopus 3.6 27B, I have put together a preliminary evaluation for you in my HF repo, go check it out; we will be releasing the full model very soon! I will put the preliminary repo in the comments! huggingface.co/Jackrong/Qwopu…

Jürgen Fey @androidian60 · 17 May

Part 2 of Building a Local AI Agent in Go: Understanding MCP.
▸ agent.json works against Lemonade, vLLM, Gemini, Anthropic
▸ "LLM proposes, agent disposes"
▸ MCP is symmetric: your agent as tool consumer and provider

bit.ly/4fmd8Ne #MCP #Golang #Agents

Jürgen Fey @androidian60 · 16 May

I started a new 7-part blog series: Building a Portable AI Agent in Go. Part 1 introduces the motivation and architecture behind a Go-based agent system that can orchestrate LLMs and MCP tools across local runtimes. Pt 1: bit.ly/4nvq0CF #go #genai #llm #aiagents #mcp

Jürgen Fey retweeted

Sebastian Raschka @rasbt · 16 May

New article: a visual tour of recent LLM architecture advances, from Gemma 4 to DeepSeek V4. I focus on long-context efficiency tweaks like KV sharing, per-layer embeddings, layer-wise attention budgets, compressed attention, and mHC. Link: magazine.sebastianraschka.com/p/recent-devel…

Jürgen Fey retweeted

Simplifying AI @simplifyinAI · 14 May

🚨 BREAKING: NVIDIA proved back-propagation isn't the only way to build an AI.

Billion-parameter models were trained without a single gradient. No calculus, no exploding memory, no massive GPU clusters.

The culprit? A long-dismissed technique called Evolution Strategies. NVIDIA and Oxford just made it scalable with EGGROLL, which replaces bloated mutation matrices with two tiny ones, enabling hundreds of thousands of parallel mutations at inference-level speed.

They're pretraining models from scratch using only simple integers. No backprop. No decimals.

We assumed the future of AI required endless precision hardware. Evolution had other plans.

Jürgen Fey retweeted

Kevin Lin @KevinQHLin · 14 May

🌟 Introducing 🎻 Violin: an open-source Video Translation Skill.

📹 Video is the dominant medium on the internet, yet most high-quality content (lecture, talk, podcast) is locked behind a single language, leaving global audiences behind. So we built Violin: a video skill that combines speech recognition, LLM translation, and speech synthesis into one seamless pipeline.

🌐 Demo: violin-ai.com
📝 Blog: together.ai/blog/violin-op…
🔗 GitHub: github.com/shang-zhu/viol…

✨ Key Features:
🎙️ High-quality multilingual ASR, translation, and TTS.
🗣️ Personalized translation and voice (turn an academic talk into something children can follow).
💬 Chat with the video: ask any questions grounded in the video.
🧩 Supports web app, CLI, and agent skill.
🍃 Fully open-source under MIT.

❤️ Built with the wonderful @ShangZhu18 and advised by @james_y_zou! All features powered by @togethercompute. Try it and let us know what you think! 🎻

Jürgen Fey retweeted

Sebastian Raschka @rasbt · 13 May

A little talk on what we can learn from implementing LLM architectures from scratch in Python and PyTorch. And how I approach new open-weight models, compare them against reference implementations etc: youtube.com/watch?v=TXzQ7P…
