Tim Kellogg

18.9K posts

Tim Kellogg
@kellogh

AI architect // hiking, camping and long walks on the beach as long as they involve backpacking

Raleigh, NC · Joined November 2011
722 Following · 1.3K Followers

Pinned Tweet
Tim Kellogg @kellogh
Meet Strix, my AI agent

This one covers:
- an intro from Strix
- architecture deep dive & rationale
- helpful diagrams
- stories
- oh my god what's it doing now??
- conclusion

timkellogg.me/blog/2025/12/1…
Jessie Frazelle @jessfraz
This is nuts to me, the one thing Moonshot (Kimi creators) asks you to do is say that they are the base. Like just say it, what's the big deal, everyone already knows! It's insane. At this point not saying it makes you look like the baddies.
Kimi.ai @Kimi_Moonshot
Congrats to the @cursor_ai team on the launch of Composer 2! We are proud to see Kimi-k2.5 provide the foundation. Seeing our model integrated effectively through Cursor's continued pretraining & high-compute RL training is the open model ecosystem we love to support.

Note: Cursor accesses Kimi-k2.5 via @FireworksAI_HQ's hosted RL and inference platform as part of an authorized commercial partnership.
Tim Kellogg @kellogh
@kimmonismus it was a COMMERCIAL license. Cursor PAID kimi for non-standard terms, such as white labeling
Chubby♨️ @kimmonismus
For transparency reasons, I believe it would have been better to include a direct reference to Kimi K2 in the blog post about Composer 2. It also demonstrates how good Chinese open-source models have become.
Lee Robinson @leerob

Since people really want me to say this: "KIMI K2.5" ‼️ Yes, that is the base we started from. And we are following the license through inference partner terms (e.g. Fireworks). I'm thankful for OSS models personally, good for the ecosystem.

Tim Kellogg @kellogh
@jessfraz @YouJiacheng it's a COMMERCIAL license. really not sure if you're reading this, but they absolutely are exchanging funds for this
Jessie Frazelle @jessfraz
@YouJiacheng they aren't asking for 80 bajillion dollars, they are asking that you say their name, JUST SAY IT
Tim Kellogg @kellogh
@teortaxesTex ya i can't even evaluate models this year without an agent harness. the chat frame just can't push them hard enough
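A minimal sketch of what an agent harness adds over the chat frame: a loop that lets the model call tools and observe results until it declares the task done. Everything here is hypothetical scaffolding (`call_model`, the `TOOLS` table are invented for illustration), not any particular vendor's API.

```python
import json

def call_model(messages: list[dict]) -> dict:
    """Hypothetical chat-completions-style call; swap in a real client."""
    raise NotImplementedError

# A toy tool table; a real harness exposes editors, shells, browsers, etc.
TOOLS = {
    "read_file": lambda path: open(path).read(),
}

def agent_eval(task: str, max_steps: int = 20) -> str:
    """Run the model in a tool loop instead of a single chat turn.

    A chat eval is the degenerate case: one call_model() and stop.
    The harness keeps pushing until the model stops asking for tools.
    """
    messages = [{"role": "user", "content": task}]
    for _ in range(max_steps):
        reply = call_model(messages)
        messages.append(reply)
        call = reply.get("tool_call")
        if call is None:
            return reply["content"]  # model considers the task done
        result = TOOLS[call["name"]](**call["args"])
        messages.append({"role": "tool", "content": json.dumps(result)})
    return "step budget exhausted"
```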
Teortaxes▶️ (DeepSeek 推特🐋铁粉 2023 – ∞)
> Most improvements in the last 9 months are attributable more to the tooling around the model rather than the models themselves

That's because the models themselves have become good enough to reliably make use of large numbers of nontrivial tools. What's this mistake called?
expatanon @expatanon

Altman admitted that transformer models have hit the wall. Most improvements in the last 9 months are attributable more to the tooling around the model rather than the models themselves. In other words, this technology is rapidly maturing with no signs of another leap.

Tim Kellogg @kellogh
@rickasaurus in my brief testing, it appears alarmingly over-RL’d. which is sad, 2.5 was really good
Tim Kellogg @kellogh
@himanshustwts @MiniMax_AI by recursive self-improvement are you just saying it knows what to remember when taking notes to its future self?
himanshu @himanshustwts
Minimax-M2.7 is already on Claude Code!

first impressions:
+ they have optimized for recursive self-improvement
+ incredible role-playing and multi-turn conversations
+ decent tokens/sec in CC

BIG.
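The "notes to its future self" pattern Tim is asking about can be sketched in a few lines. This is a toy version under assumed names (`agent_notes.md`, `recall`, `remember` are invented for illustration): the agent loads its notes at startup and appends whatever it judged worth keeping.

```python
from pathlib import Path

NOTES = Path("agent_notes.md")  # hypothetical memory file, one per agent

def recall() -> str:
    """Prepend this to the system prompt so past runs inform the next one."""
    return NOTES.read_text() if NOTES.exists() else ""

def remember(note: str) -> None:
    """Append a note the model chose to keep; curation is the model's job."""
    with NOTES.open("a") as f:
        f.write(note.rstrip() + "\n")
```

The hard part, as the thread suggests, is the curation policy: deciding what is worth writing down, not the storage.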
Nando de Freitas @NandoDF
What happens to Anthropic when anyone can use Claude Code to generate Claude Code?
Tim Kellogg @kellogh
@norootcause depends what it is. like, i generally don’t send people output directly, but there’s been a few times where i just thought to myself, “yeah, i can’t improve on this, send it”
lopopolo @_lopopolo
I have seen the future and in the future I have zero desire for the model to be my buddy and have a good personality
will brown @willccbb
@teortaxesTex “pretrained in nvfp4” is the headline imo. we really hadn’t seen viability of this in the wild yet
Teortaxes▶️ (DeepSeek 推特🐋铁粉 2023 – ∞)
Well, seems we're not getting DeepSeek V4 today but we're getting what amounts to its lite version runnable on normal hardware. New architecture, fast, 1M context… …and it's a bit weaker than the equivalent Qwen 3.5.
Lisan al Gaib @scaling01

Nvidia released Nemotron 3 Super
- a 120B-A12B hybrid Mamba model with LatentMoE and MTP
- pre-trained on 25T tokens in NVFP4
- context up to 1M
- 2.2X faster inference than GPT-OSS-120B
- 7.5X faster inference than Qwen3.5-122B

huggingface.co/nvidia/NVIDIA-…

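For context on "pre-trained in NVFP4": it is a block-scaled 4-bit float format built from E2M1 elements. The sketch below is illustrative only; it rounds one block onto the FP4 grid with a single shared scale, whereas actual NVFP4 uses 16-element blocks with FP8 (E4M3) scales plus an FP32 per-tensor scale, and the training recipe involves much more.

```python
import numpy as np

# Magnitudes representable by an E2M1 (FP4) element: 0, 0.5, 1, 1.5, 2, 3, 4, 6.
FP4_POS = np.array([0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0])
FP4_GRID = np.unique(np.concatenate([-FP4_POS, FP4_POS]))

def quantize_block(x: np.ndarray) -> tuple[np.ndarray, float]:
    """Round one block onto the FP4 grid with a single shared scale.

    NVFP4 proper uses 16-element blocks, an FP8 (E4M3) scale per block,
    and an extra FP32 per-tensor scale; one float scale keeps this readable.
    """
    amax = float(np.max(np.abs(x)))
    scale = amax / 6.0 if amax > 0 else 1.0  # map the block max onto ±6
    idx = np.abs(x[:, None] / scale - FP4_GRID[None, :]).argmin(axis=1)
    return FP4_GRID[idx], scale

def dequantize(q: np.ndarray, scale: float) -> np.ndarray:
    return q * scale

block = np.random.randn(16).astype(np.float32)
q, s = quantize_block(block)
print("max abs error:", np.abs(block - dequantize(q, s)).max())
```

Why it reads as the headline: the matmul operands live on that tiny grid for the entire 25T-token pretraining run, and as the thread notes, viability of that at scale hadn't really been shown in the wild before.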
Tim Kellogg @kellogh
@rickasaurus yeah i don't think it matters. it's all just information & flow. better curation skills would help; the fundamental skill is knowing what to remember & what to forget
Rick @rickasaurus
@kellogh I mean in the weights
Rick @rickasaurus
This whole AI thing would be a lot easier if you could precisely control what the agents know