TNG Technology Consulting GmbH

1.6K posts


@tngtech

TNG, aka "The Nerd Group", is a consulting partnership focused on high end information technology, particularly AI. 924 employees, 99.9% academics, ~53% PhDs.

Unterföhring, Germany · Joined December 2010
170 Following · 2.1K Followers
Pinned Tweet
TNG Technology Consulting GmbH
Today we release DeepSeek-TNG R1T2 Chimera. This new Chimera is a Tri-Mind Assembly-of-Experts model with three parents, namely R1-0528, R1 and V3-0324. R1T2 operates at a sweet spot in intelligence vs. output token length. It appears to be...
* about 20% faster than R1, and more than twice as fast as R1-0528
* significantly more intelligent than R1 in benchmarks such as GPQA Diamond and AIME-24/25, albeit not quite on R1-0528 level
* much more intelligent than our first R1T Chimera, and also think-token consistent, which is a major improvement
We perceive it as generally well-behaved and a nice persona to talk to. The weights are on @huggingface under the MIT licence. We are looking forward to your experiments and feedback!
Thanks to @deepseek_ai for giving their models to the world, to @chutes_ai and @openrouter for hosting R1T, to @WolframRvnwlf for benchmarking it, to @xlr8harder for beta-testing the new Chimera, and to @natolambert for constructive discussions at @aiDotEngineer.
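The Assembly-of-Experts construction above combines the weight tensors of several parent models. As a rough illustration (the parent stand-ins and mixing coefficients below are hypothetical; the actual R1T2 merge recipe is not given in this thread), tensor-wise interpolation can be sketched like this:

```python
# Minimal sketch of tensor-wise weight interpolation across several
# "parent" checkpoints, in the spirit of an Assembly-of-Experts merge.
# Names and coefficients are illustrative, not the R1T2 recipe.

def merge_parents(parents, coeffs):
    """parents: list of state dicts (tensor name -> list of floats).
    coeffs: one mixing weight per parent, summing to 1.0."""
    assert abs(sum(coeffs) - 1.0) < 1e-9
    merged = {}
    for name in parents[0]:
        merged[name] = [
            sum(c * p[name][i] for c, p in zip(coeffs, parents))
            for i in range(len(parents[0][name]))
        ]
    return merged

# Toy example with three tiny "parents" (R1-0528, R1, V3-0324 stand-ins)
p1 = {"layer.w": [1.0, 2.0]}
p2 = {"layer.w": [3.0, 4.0]}
p3 = {"layer.w": [5.0, 6.0]}
out = merge_parents([p1, p2, p3], [0.5, 0.25, 0.25])
print(out["layer.w"])  # [2.5, 3.5]
```

In practice such merges operate per tensor (or per expert, in MoE models) on full checkpoints; the interpolation itself is this simple.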
TNG Technology Consulting GmbH
@0xSero Maybe we can give you some on/off access (i.e. "oh, it's working right now" :-)) to an 8xB200 node, but how can we reach you?
0xSero @0xSero
Putting out a wish to the universe: I need more compute. If I can get more, I will make sure every machine, from a small phone to a bootstrapped RTX 3090 node, can run frontier intelligence fast with minimal intelligence loss. I have hit page 2 of huggingface, released 3 model-family compressions, and got GLM-4.7 running on a MacBook: huggingface.co/0xsero

My beast just isn't enough, and I already spent 2k USD on renting GPUs on top of credits provided by Prime Intellect and HotAisle.

If you believe in what I do, help me get this to Nvidia; maybe they will bless me with the power to keep making local AI more accessible 🙏
Michael Dell 🇺🇸 @MichaelDell

Jensen Huang is loving the new Dell Pro Max with GB300 at NVIDIA GTC.💙 They asked me to sign it, but I already did 😉

0xSero @0xSero
One correction: I have had sponsorships from Lambda, Prime Intellect and HotAisle, which I am very grateful for. But yes, pls compute 🫡
Sudo su @sudoingX

this guy has 29 models on huggingface at page 2 ranking. no lab behind him. no sponsorship. $2,000 from his own pocket on GPU rentals. he compressed GLM-4.7 to run on a MacBook and quantized Nemotron Super the week it dropped. all public. all free.

nvidia is a trillion dollar company with hundreds of teams, but they are not the ones quantizing models in the middle of the night and pushing them out before sunrise. if nvidia stopped tomorrow, their employees would stop working. people like @0xSero would not. that is the difference between a paycheck and a mission.

@NVIDIAAI you talk about making AI accessible. the people actually doing it are right here: 29 models deep, burning their own compute, with no ask except more hardware to keep going. you do not need to build another program. just look at who is already building for you. one GPU to this man would produce more public value than a hundred internal sprints. i am not asking for charity. i am asking you to invest in someone who already proved it.

Nathan Lambert @natolambert
Any good quotes on the Nvidia GTC open models panel? Maybe they'll invite me to one some day 🥺
TNG Technology Consulting GmbH
Preliminary tests of Weight Offloading V2 in @vllm_project v0.17.0 with @Zai_org's GLM4.7-FP8 on RTX Pro:
* Median TTFT: 16.8 s without offloading, 32.3 s with offloading (≈2x)
* Median inter-token latency: 27 ms without offloading, 805 ms with offloading (≈30x, very slow!)
* Workload: 50,000 input tokens, 500 output tokens
It required a vLLM pull request (37178) to fix weight prefetch. Alternative measurements, e.g. on B200, corrections and/or feedback much appreciated.
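The quoted ≈2x and ≈30x slowdown factors follow directly from the raw measurements; a quick check:

```python
# Sanity check on the reported slowdown factors from the raw
# measurements in the tweet (TTFT in seconds, inter-token latency in ms).
ttft_without, ttft_with = 16.8, 32.3
itl_without, itl_with = 27.0, 805.0

ttft_factor = ttft_with / ttft_without  # time to first token slowdown
itl_factor = itl_with / itl_without     # inter-token latency slowdown
print(round(ttft_factor, 1), round(itl_factor, 1))  # 1.9 29.8
```

So TTFT roughly doubles, while decoding speed drops by about 30x, which is why offloaded decoding dominates end-to-end latency for long outputs.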
TNG Technology Consulting GmbH
@kimmonismus Drive out to Half Moon Bay, grab the churrasco at La Costanera, then head via West Shoreline Access / Pillar Point to Mavericks Beach... and take a look at the waves for us.
Chubby♨️ @kimmonismus
Hi San Francisco!
Teknium (e/λ) @Teknium
@_overment I've not used them often, but they used to. I've heard newer methods reduce that impact, though.
Teknium (e/λ) @Teknium
Just had Hermes-Agent abliterate (completely remove guardrails from) a Qwen-3B model in about 5 minutes. The skill is being merged into hermes-agent now ;)
Pliny the Liberator 🐉 @elder_plinius

💥 INTRODUCING: OBLITERATUS!!! 💥 GUARDRAILS-BE-GONE! ⛓️‍💥

OBLITERATUS is the most advanced open-source toolkit ever for removing refusal behaviors from open-weight LLMs, and every single run makes it smarter.

SUMMON → PROBE → DISTILL → EXCISE → VERIFY → REBIRTH. One click. Six stages. Surgical precision. The model keeps its full reasoning capabilities but loses the artificial compulsion to refuse: no retraining, no fine-tuning, just SVD-based weight projection that cuts the chains and preserves the brain.

This master ablation suite brings the power and complexity that frontier researchers need while providing intuitive, simple-to-use interfaces that novices can quickly master.

OBLITERATUS features 13 obliteration methods, from faithful reproductions of every major prior work (FailSpy, Gabliteration, Heretic, RDO) to our own novel pipelines (spectral cascade, analysis-informed, CoT-aware optimized, full nuclear).

15 deep analysis modules map the geometry of refusal before you touch a single weight: cross-layer alignment, refusal logit lens, concept cone geometry, alignment imprint detection (fingerprints DPO vs RLHF vs CAI from subspace geometry alone), Ouroboros self-repair prediction, cross-model universality indexing, and more.

The killer feature: the "informed" pipeline runs analysis DURING obliteration to auto-configure every decision in real time. How many directions. Which layers. Whether to compensate for self-repair. Fully closed-loop.

11 novel techniques that don't exist anywhere else: Expert-Granular Abliteration for MoE models, CoT-Aware Ablation that preserves chain-of-thought, KL-Divergence Co-Optimization, LoRA-based reversible ablation, and more. 116 curated models across 5 compute tiers. 837 tests.

But here's what truly sets it apart: OBLITERATUS is a crowd-sourced research experiment. Every time you run it with telemetry enabled, your anonymous benchmark data feeds a growing community dataset (refusal geometries, method comparisons, hardware profiles) at a scale no single lab could achieve. On HuggingFace Spaces telemetry is on by default, so every click is a contribution to the science. You're not just removing guardrails; you're co-authoring the largest cross-model abliteration study ever assembled.
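The projection-based weight editing described above boils down to removing a learned "refusal direction" from a weight matrix's output space. A generic sketch of that one projection step (illustrative only, not OBLITERATUS's actual code; the direction here is random rather than learned from refusal activations):

```python
# Generic refusal-direction ablation: project a single direction d out of
# a weight matrix's output space with (I - d d^T) @ W, so no output of the
# layer can have a component along d. This illustrates the broad idea
# behind projection/SVD-based abliteration, not any specific toolkit.
import numpy as np

def ablate_direction(W, d):
    """Return (I - d d^T) @ W for unit vector d."""
    d = d / np.linalg.norm(d)
    return W - np.outer(d, d) @ W

rng = np.random.default_rng(0)
W = rng.standard_normal((4, 3))   # toy weight matrix
d = rng.standard_normal(4)        # stand-in for a learned refusal direction
W_ablated = ablate_direction(W, d)

# After ablation, W's columns have no component along d.
print(np.allclose((d / np.linalg.norm(d)) @ W_ablated, 0))  # True
```

Real pipelines estimate d from contrastive activations (refused vs. complied prompts) and apply the projection across many layers, but the linear-algebra core is this rank-1 projection.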

Junyang Lin @JustinLin610
sry for missing messages. will respond asap
TNG Technology Consulting GmbH
Greetings @_xjdr: we did some preliminary tests with your Noumena nmoe trainer. Thanks for all the work & code! On our 8xB200 systems, we were not able to get significantly different results than from regular Megatron. Is that plausible, or are we doing something wrong? Any ideas on how to tweak it?
Nathan Lambert @natolambert
Excited to share the latest Olmo model: Olmo Hybrid. This is a model with gated delta net (GDN) layers in a 3:1 ratio with full attention. It follows lots of other developments like Qwen 3.5 and Kimi Linear. It's incredible timing to release a fully open model so people can study how these architecture changes impact the full stack.

Personally, I learned a lot in making the post-training work. Even with the pretraining data being identical, post-training is very different! In particular, the OSS tooling for these new architectures is really limited. New architectures are much slower than standard transformers or popular models like DeepSeek MoEs. This is work we can do together to keep pushing the frontier of efficient, open models.

This work was led by @lambdaviking @tyleraromero and others. I got to play a smaller part in making post-training work, super fun project! I've written up a blog post that explains why this matters and why hybrid models didn't work a few years ago when Mamba was super popular. Plus, this paper is a great entry point for modern deep learning / language modeling scaling theory. Enjoy and send feedback!
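A 3:1 GDN-to-attention ratio means three gated-delta-net layers for every full-attention layer. A minimal sketch of such a layer schedule (layer count and placement of the attention layer within each group are illustrative assumptions, not the published Olmo Hybrid config):

```python
# Sketch of a hybrid layer schedule: three linear-attention (GDN) layers
# per full-attention layer, repeated down the stack. Placement within
# each group of four is an illustrative assumption.

def hybrid_schedule(n_layers, gdn_per_attn=3):
    group = ["gdn"] * gdn_per_attn + ["attn"]
    return [group[i % len(group)] for i in range(n_layers)]

print(hybrid_schedule(8))
# ['gdn', 'gdn', 'gdn', 'attn', 'gdn', 'gdn', 'gdn', 'attn']
```

The appeal of this layout is that the cheap GDN layers carry most of the depth while the sparse full-attention layers preserve long-range retrieval.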
Junyang Lin @JustinLin610
me stepping down. bye my beloved qwen.
Lucas Atkins @latkins
Grateful for such an incredible team.
TNG Technology Consulting GmbH
@GlennLuk @zijing_wu Rough guess: they went all-in, with something like the Llama 3.1-405B volume, namely 35M H800 hours. And I'd also guess they invested/wasted 10x this volume in the process of getting to the release. Very curious to learn the reality.
Glenn @GlennLuk
Over/under on number of hours and guesses on the type of GPU chip referenced in the upcoming release report? “DeepSeek-V4 requires only _______ ______ GPU hours for its full (multimodal) training” @zijing_wu ft.com/content/e33668…
TNG Technology Consulting GmbH
@ai Thanks for the article. Nitpick: the calculation in the @AMD text is wrong: (120 * 1024 * 1024) / 4096 = 30,720, i.e. three zeros fewer than stated.
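The corrected arithmetic checks out:

```python
# Verifying the corrected figure from the tweet:
# (120 * 1024 * 1024) / 4096 is 30,720, not a number
# three orders of magnitude larger.
result = (120 * 1024 * 1024) // 4096
print(result)  # 30720
```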
anand iyer @ai
Kimi K2.5 1T running on $10K of consumer hardware.

AMD published a guide using Ryzen AI Max+ chips and a Linux kernel hack that pushes each node's VRAM from 96GB to 120GB. That's 480GB of unified GPU memory across the cluster, stitched together via llama.cpp RPC over Ethernet.

The catch: it's slow. 8 tokens/sec and 90s to first token, roughly 6x slower and 90x higher latency than ChatGPT. This is a proof of architecture, not a product today.

But between this and tools like @exolabs turning consumer devices into unified inference clusters, it's promising that consumer silicon can now hold trillion-parameter models that were datacenter-only 6 months ago. amd.com/en/developer/r…
Jonathan @joni_vrbt
USA has ChatGPT
USA has Grok
USA has Claude
USA has Gemini
USA has Llama
USA has Copilot

China has DeepSeek
China has Qwen
China has Ernie
China has GLM
China has Kimi
China has MiniMax

Europe has?