Moritz

1.5K posts

@moritzp82

Living in different worlds :) #Munich #IT #Triathlon #Marathon #agile

Munich · Joined March 2010
124 Following · 140 Followers
Moritz retweeted
TNG Technology Consulting GmbH
Today we release DeepSeek-TNG R1T2 Chimera. This new Chimera is a Tri-Mind Assembly-of-Experts model with three parents, namely R1-0528, R1 and V3-0324. R1T2 operates at a sweet spot in intelligence vs. output token length. It appears to be...
* about 20% faster than R1, and more than twice as fast as R1-0528
* significantly more intelligent than R1 in benchmarks such as GPQA Diamond and AIME-24/25, albeit not quite on R1-0528 level
* much more intelligent than our first R1T Chimera, and also think-token consistent, which is a major improvement
We perceive it as generally well-behaved and a nice persona to talk to. The weights are on @huggingface under the MIT licence. We are looking forward to your experiments and feedback!
Thanks to @deepseek_ai for giving their models to the world, to @chutes_ai and @openrouter for hosting R1T, to @WolframRvnwlf for benchmarking it, to @xlr8harder for beta-testing the new Chimera, and to @natolambert for constructive discussions at @aiDotEngineer.
21 replies · 87 retweets · 392 likes · 126.5K views
Moritz retweeted
TNG Technology Consulting GmbH
Today we release DeepSeek-R1T-Chimera, an open-weights model adding R1 reasoning to @deepseek_ai V3-0324 with a novel construction method. In benchmarks, it appears to be as smart as R1 but much faster, using 40% fewer output tokens. The Chimera is a child LLM, using V3's shared experts augmented with a custom merge of R1's and V3's routed experts. It is not a finetune or distillation, but constructed from neural network parts of both parent MoE models. A bit surprisingly, we did not detect defects in the hybrid child model. Instead, its reasoning and thinking processes appear to be more compact and orderly than the sometimes very long and wandering thoughts of the R1 parent model. Model weights are on @huggingface, just a little late for #ICLR2025. Kudos to @deepseek_ai for V3 and R1!
26 replies · 106 retweets · 595 likes · 80.7K views
Moritz retweeted
TNG Technology Consulting GmbH
At TNG, we handle 5,000+ #LLM requests per hour and generate 10+ million tokens every day. Learn how our team optimizes inference serving for low-latency responses in high-traffic environments in the second article of our series on LLM performance: huggingface.co/blog/tngtech/l…
0 replies · 4 retweets · 16 likes · 447 views
Moritz retweeted
TNG Technology Consulting GmbH
Presenting Mixture of Tunable Experts (MoTE): Behavior Modification of DeepSeek-R1 at Inference Time - a method extending the MoE architecture of LLMs. By tuning 10 key experts, it enables meaningful and focused behavior changes on-the-fly. Mon 18h CET: linkedin.com/events/mixture…
4 replies · 21 retweets · 35 likes · 4.7K views
Moritz retweeted
TNG Apps @tngapps
Every company has its own optimized workflows, but not all can be expressed in #Jira out of the box. Remove that limitation with Workflow Enhancer, and access all fields in your transition element to craft the perfect #workflow for your work processes 👉 marketplace.atlassian.com/apps/575829/wo…
0 replies · 1 retweet · 2 likes · 89 views
Moritz retweeted
TNG Apps @tngapps
Are you subscribed to our TNG Apps YouTube channel yet? In our videos, you will find an overview of our popular apps for #Jira and #Confluence and gain new ideas on how to significantly enhance your #Atlassian experience. Check out the channel: youtube.com/@tngapps
0 replies · 3 retweets · 4 likes · 161 views
Moritz retweeted
TNG Apps @tngapps
Join us at Atlas Camp '25 in Brussels on Feb. 3rd and 4th. Our colleagues Lina and Miroslava will share how they pioneered the migration of our Enterprise Survey app to the Cloud using @Atlassian's app migration platform for Forge. You can sign up here: events.atlassian.com/atlascamp25
0 replies · 3 retweets · 4 likes · 177 views
Moritz @moritzp82
It's pretty impressive to see this live. You can sense quite some tension in the air and heads burning 😅 #tngbtd @VincentKeymer04
0 replies · 0 retweets · 3 likes · 93 views