Collinear AI

79 posts

@CollinearAI

The AI Simulation Lab

Joined October 2023
34 Following · 387 Followers
Collinear AI @CollinearAI
Launching soon.
Collinear AI retweeted
Muyu He @HeMuyu0327
Interested in how @Kimi_Moonshot's Kimi linear attention (KDA) "improves" linear attention, I break down the math to show how it evolves from the most basic version.

Linear attention can be seen from two perspectives:
- On one hand, the linear "fast memory" matrix is a sum of all value vectors in the context, weighted by the key vectors. When it is multiplied by the query vector, the outcome is just a non-softmax version of attention, computed linearly.
- On the other hand, the same construction of the fast memory can be seen as gradient descent on a particular loss function. This is where people can choose different losses to make the fast memory more powerful.

The choice of loss function to optimize:
- Naively, the construction is equivalent to optimizing the correlation (inner product) between the latest key vector and the latest value vector. This means the fast memory will help the query vector "find" the right value vector, given that it matches the right key vector.
- A step forward is optimizing the "reconstruction" (MSE) between the key and the value. This means the query vector's search for the right value vector can be even more accurate. This is DeltaNet.
- The most recent step is to add a scalar gate to the old fast memory, so that as we optimize the search for the new value, we gradually "forget" the influence of previous values in the attention. This is Gated DeltaNet.

The innovation of KDA:
- A scalar gate demolishes the influence of previous key-value pairs across all dimensions, but we want a fine-grained gate to control the forgetting in each dimension.
- So KDA introduces a diagonal-matrix gate for each head, so that each dimension of the head is assigned its own forgetting scalar.
- This diagonal gate is built from the combination of a token-dependent gate (g) that controls "how much to forget in this dim for this token" and a token-independent gate (A) that controls "how much to forget in this dim in general".

In reality they have done some optimizations to make inference much faster, such as reformulating the update rule in chunks and materializing the diagonal matrix as a vector. But the core math is just making surgical choices to forget previous information along semantically meaningful heads/dimensions. Pretty cool!
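The progression above can be sketched as a sequence of fast-memory update rules. A minimal NumPy sketch following the thread's description: the function names and the exact placement of KDA's diagonal gate are my assumptions for illustration, not Kimi's actual implementation.

```python
import numpy as np

def linear_attention_step(S, k, v):
    """Naive fast memory: S accumulates outer products v k^T, so S @ q
    is a non-softmax attention output computed linearly."""
    return S + np.outer(v, k)

def deltanet_step(S, k, v, beta=1.0):
    """Delta rule: one gradient-descent step on the reconstruction
    (MSE) loss ||S k - v||^2, as in DeltaNet."""
    return S + beta * np.outer(v - S @ k, k)

def gated_deltanet_step(S, k, v, alpha, beta=1.0):
    """Gated DeltaNet: a scalar gate alpha decays the whole memory
    before the delta update, forgetting old key-value pairs."""
    S = alpha * S
    return S + beta * np.outer(v - S @ k, k)

def kda_style_step(S, k, v, a, beta=1.0):
    """KDA-style: a per-dimension (diagonal) gate a, one forgetting
    scalar per key dimension instead of one scalar for everything."""
    S = S * a[None, :]  # decay each key dimension independently
    return S + beta * np.outer(v - S @ k, k)
```

With `alpha = 1` (or `a` all ones) the gated variants reduce to plain DeltaNet, which is a quick sanity check; and summing `linear_attention_step` over a context makes `S @ q` exactly the unnormalized attention output.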
Collinear AI retweeted
Muyu He @HeMuyu0327
As I want to understand how @deepseek_ai v3.2 "fixed" GRPO, I break down the math behind the KL term.

What is in the original GRPO: the DeepSeek Math/R1 papers borrowed what @johnschulman2 calls the k3 estimator. Starting from the true KL term (in the green square, "the k1 estimator") that pushes the policy model closer to the reference model, there is a harmless second term (in the yellow square) that has an expectation of 0, which means no bias is introduced. The role of this second term is to offset the k1 estimator, so that while the expectation stays the same, the variance of the KL estimate is much lower. This is because the second term (the "r - 1") can be shown to negatively correlate with the KL term (the "log r").

The problem with GRPO: a true KL samples the rollouts from the policy model, but in reality, across several optimization steps, we are sampling from what is, relative to the current step, a previous policy. This makes GRPO off-policy, and the KL term loses its meaning.

The fix: the new formula simply applies importance sampling, which reweights each sample so that its probability matches that of it coming from the current policy model. There is then no more bias in the KL estimation, because each sample is weighted with the correct probability density.

The leftover: since the sampled rollouts still come from the previous policy model, the KL estimate has slightly higher variance, but there is still no bias. As a result, training becomes more stable.

The paper is quite fascinating even after two weeks. I want to next have a deeper look at the DeepSeek attention mechanism and potentially do some interp stuff on it!
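The off-policy bias and its importance-sampling fix can be checked numerically. A minimal sketch, assuming Schulman's k3 form k3 = r - 1 - log r with r = π_ref/π_θ; the toy discrete distributions below are invented for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy distributions over 6 outcomes: current policy, reference model,
# and the stale "old" policy the rollouts actually came from.
pi     = np.array([0.30, 0.25, 0.15, 0.15, 0.10, 0.05])
ref    = np.array([0.20, 0.20, 0.20, 0.15, 0.15, 0.10])
pi_old = np.array([0.25, 0.25, 0.20, 0.10, 0.10, 0.10])

true_kl = float(np.sum(pi * np.log(pi / ref)))  # exact KL(pi || ref)

n = 200_000
xs = rng.choice(len(pi), size=n, p=pi_old)      # rollouts from the OLD policy

logr = np.log(ref[xs] / pi[xs])                 # log(ref/pi) at each sample
k3 = np.expm1(logr) - logr                      # r - 1 - log r, always >= 0

naive = float(k3.mean())                        # off-policy samples: biased
w = pi[xs] / pi_old[xs]                         # importance weights
corrected = float((w * k3).mean())              # reweighted: unbiased again
```

Run as written, `corrected` lands close to `true_kl` while `naive` stays visibly off, matching the thread's point: the reweighted estimator keeps some extra variance from the stale rollouts but removes the bias.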
Collinear AI retweeted
Nazneen Rajani @nazneenrajani
Thrilled to introduce spider 🕷️, a system for crafting data recipes that you can use directly with @thinkymachines's Tinker for model training. It supports both off-policy and on-policy distillation (tokenizer-agnostic). You can also run various data filters and verifiers for curation. Thanks to @johnschulman2 for Tinker access and supporting @CollinearAI's spider. Happy Halloween 🕸️
Muyu He @HeMuyu0327

Tinker for training exists, but Tinker for data doesn't. Yet researchers spend most of their time on data preprocessing/generation and training integration. This Halloween, we introduce spider, i.e. Tinker for data. It spins up a client that lets users define a production-grade distillation run in a few lines of code.

Features in the MVP:
- For off-policy distillation, users can define custom preprocessing logic for any dataset and spin up a remote inference engine for high-throughput rollout generation. In 2-3 lines, a whole dataset can be created.
- For on-policy distillation, revived by @thinkymachines, users can simply toggle on_policy: true so that the same workflow now admits a teacher model and on-policy KL supervision. Under the hood, a Tinker service client is integrated. Again, 2-3 lines of code.

We want to make spider as useful for open-source research as possible and support fast validation of experiment ideas. Next up, we will implement the cross-tokenizer approach by @huggingface to support on-policy distillation from and to any model. And we will make filtering, inference, etc. more intelligent and simple for end users.
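The workflow described suggests a recipe-style configuration. A purely hypothetical sketch: only the `on_policy` flag appears in the thread; every other field name here is invented for illustration and is not spider's actual schema.

```yaml
# Hypothetical spider recipe. Field names are illustrative only.
dataset: my_org/raw_prompts       # any dataset, with custom preprocessing
preprocess: my_recipes.clean_fn   # user-defined preprocessing hook
teacher: some-org/teacher-model   # teacher model for distillation
on_policy: true                   # toggle on-policy KL supervision (Tinker client under the hood)
```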

Collinear AI @CollinearAI
We’ve partnered with @togethercompute 🚀 Collinear Simulations are now live inside Together Evals, bringing real-world, multi-turn testing to model evaluation. Simulate messy user behavior with our TraitMix engine and see how your models perform under real conditions!
Together AI @togethercompute

Together AI 🤝@CollinearAI Introducing TraitMix, Collinear’s simulation product empowering teams to generate persona-driven AI agent interactions. 🔌Plug these interactions into your workflows and evaluate their effectiveness with Together Evals. Details: bit.ly/43GHJhR

Collinear AI retweeted
Nazneen Rajani @nazneenrajani
Would love to see more effort put into building systems that enable people to create end artifacts, instead of just open-sourcing the end artifacts. Datasets and models are great, but imagine if many people worked on putting out systems that anyone could use to craft training and data recipes. Tinker from @thinkymachines and TRL from @huggingface are great examples of such systems.
Collinear AI retweeted
Nazneen Rajani @nazneenrajani
I interviewed 103 candidates for the MTS role in the last 6 months; we ended up making only single-digit offers. This is what I am looking for:
- How do they approach a problem? Do they jump to the solution, or first think about *how to measure* and how to set up evals?
- How good are the fundamentals? Everyone uses torchtune or verl, but can you go beyond the wrappers and set up YOLO runs from existing curves?
- Culture vibes. IMO this is usually data hygiene, thorough documentation, and reproducible experiments.
Collinear AI retweeted
Anand Kumar @anand_k27
Excited to attend #ICCV2025 this week! I will be presenting our work "IntroStyle: Training-free Introspective Style Attribution using Diffusion Features": anandk27.github.io/IntroStyle/ Hit me up if you want to talk about VLMs and get some ICCV goodies from @CollinearAI :)
Collinear AI @CollinearAI
Results: strong gains on WikiArt and DomainNet, outperforming existing methods and staying robust to evolving artistic styles. If you’re at ICCV this week, stop by his poster, and grab one of our custom cat stickers while you’re at it 🐱
Collinear AI @CollinearAI
IntroStyle tackles one of the toughest challenges in generative vision — attributing artistic style in text-to-image models without training new networks or collecting massive custom datasets. The framework is training-free and leverages diffusion model features directly.
Collinear AI @CollinearAI
Our research was featured in the 2025 State of AI Report by Air Street Capital, alongside @OpenAI, @GoogleDeepMind, @Apple, and @Meta. The report spotlights Collinear’s work on adversarial testing and reasoning brittleness, advancing how we evaluate and improve reasoning stability in large models. Full report: lnkd.in/dDgBveNC
Collinear AI retweeted
Muyu He @HeMuyu0327
Some very fun moments I've had while grinding at @CollinearAI below. If these sound interesting to you, come and join us:
- Using mech interp tools to drastically alter the personas of LLMs for agentic evals / RL
- Generating 25B+ post-training tokens in 5 days on limited compute, with tons of inference optimizations
- Doing RLVR / mid-training on novel domains and benchmarks
- In general: proposing your own research ideas and quickly seeing them materialized in collaboration with big labs / enterprises

Bonus: ordering Zareen's, a very cute dog called 土豆 from another company, and hands-down the best office vibe ever. DM if you are interested in doing open-source research that will turn into both products and publications.
Nazneen Rajani @nazneenrajani

We’re growing and hiring! I’m looking for Research Scientists and Research Engineers passionate about pushing the boundaries of post-training AI technologies. We've shipped 100B+ tokens of high-quality data in a very short time and enabled enterprises to save serious $$ while dramatically improving their AI performance, safety, and reliability. If you’re excited by post-training, RL envs, and enabling each individual and organization to build great AI — let’s talk. DM me if you’re curious. Happy to chat!

Collinear AI @CollinearAI
Deloitte refunded part of a $290K contract after its review of Australia's welfare system was found to contain AI-generated hallucinations. Even trusted workflows need guardrails, post-training, and red-teaming checks so that models don't start writing their own facts.
Collinear AI @CollinearAI
At Collinear, we’re studying these post-training dynamics to build smarter improvement loops, where every dataset, reward, and evaluation helps models climb out of their own valley. Our paper has been accepted for @NeurIPSConf ’25 (DL4C): arxiv.org/pdf/2510.06101
Collinear AI @CollinearAI
What if learning to reason requires unlearning first? We often imagine fine-tuning as a straight climb: add more data, get better results. But when we studied how small models learn to reason through code distillation… We found what we call a valley of reasoning. #AIResearch
Collinear AI @CollinearAI
As we scaled reasoning data from 1K → 10K → 30K examples, model performance on competitive coding first dropped by half before climbing back to surpass baseline by over 100%! Small models need to unlearn surface-level patterning before internalizing structured reasoning.