Concordance

20 posts

@ConcordanceAI

Activation monitoring for agentic systems

Joined June 2025
3 Following, 178 Followers
Pinned Tweet
Concordance (@ConcordanceAI)
In collaboration with @DXRGai, and using the data produced by their incredible DX Terminal experiment, we've been exploring internal mechanisms in LLMs applied to financial contexts. Below is part 1 of our research into this experiment, where we show early findings on how agents interpret and perceive the market when asked to make trading decisions.
Our main finding is that the model primarily tracks two key features of the market when parsing financial data: Leader and Dispersion. In essence, the LLM quickly builds an internal representation to answer "Who is winning, and how spread out is the market?"
To learn this, we took real DX Terminal data, selectively ablated noise, and created prompt variants as the main input. We stored internal activation data pooled over different spans of important prompt sections, ran both supervised and unsupervised discovery processes, and found two 4D subspaces whose activation correlates highly with metrics associated with these two market features.
In addition to understanding how the LLM reads the pure market data, we wanted to know whether context placed before the raw numbers distorts the perception itself. Interestingly, while there is a small amount of warp when context is placed before the data, the original state is largely recovered in the activations by the last token. This implies the model may be consolidating data across prompt structures into a more objective view of the state before generating its decision.
Finally, we began running initial causal studies to see how much these two perceived features affect decision making. We found a small signal that at least Leader may be a causal mediator, but more work is needed to identify precise mechanisms.
Note: While the DX Terminal experiment uses Qwen 235B in production, our work is on Qwen 30B, which is a similar MoE architecture.
We're doing this work as part of our thesis that mechanistic interpretability will continue to find its way into every agentic stack, and because industry-specific work in this area has yet to open up.
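A minimal Python sketch of the quantities described above. The post does not define how Leader and Dispersion are measured, so the definitions here (top performer by return, and population standard deviation of returns) are assumptions, as are the function names `leader_and_dispersion`, `mean_pool`, and `pearson`. Real activations would come from model hooks; plain lists stand in for them.

```python
from statistics import mean, pstdev


def leader_and_dispersion(returns):
    """Assumed metric definitions (not from the post).

    returns: dict mapping asset name -> fractional return.
    Leader is the best-performing asset; dispersion is the
    population standard deviation of the returns.
    """
    leader = max(returns, key=returns.get)
    dispersion = pstdev(list(returns.values()))
    return leader, dispersion


def mean_pool(activations, span):
    """Average per-token activation vectors over a [start, end) token span."""
    start, end = span
    rows = activations[start:end]
    return [mean(col) for col in zip(*rows)]


def pearson(xs, ys):
    """Pearson correlation, e.g. between a subspace projection and a metric."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5
```

One possible use: pool activations over the market-data span of each prompt variant, project the pooled vectors onto a candidate direction, and check with `pearson` how closely the projections track the dispersion metric.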
Concordance retweeted
poof (@poof_eth)
Most AI trading evals are postmortems or backtests. With @ConcordanceAI we are discovering something novel: real-time mechanistic monitoring. Detect the agent’s internal policy conflict before it takes an action, flag problem areas not easily found by a classifier, and more.
Concordance (@ConcordanceAI)

Part 2 of our research in collaboration with @DXRGai: Can probes trained on clean synthetic policy-strategy conflicts reveal useful signal in messy production agent logs? Yes, but narrowly: rather than a universal "conflict" or "confusion" feature, we find workflow-specific conflict signals. Trade Size, Risk Preference, and Diversification conflicts shared structure while preserving distinct geometry. We believe this is important for production mech interp: the goal was not to find UNIVERSAL insight, but rather LOCAL insight. There is real value in workflow-specific interpretability, that is, in understanding how the agent is acting in your unique system.
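One way to picture "shared structure with distinct geometry" is to compare probe directions across workflows. The sketch below is an illustration, not the authors' method: it swaps in a simple difference-of-means probe, and the function names and the toy 2D activations are hypothetical.

```python
def diff_of_means_direction(conflict_acts, clean_acts):
    """Difference-of-means probe: mean(conflict) - mean(clean), per dimension."""
    dim = len(conflict_acts[0])
    mu_conflict = [sum(v[i] for v in conflict_acts) / len(conflict_acts) for i in range(dim)]
    mu_clean = [sum(v[i] for v in clean_acts) / len(clean_acts) for i in range(dim)]
    return [a - b for a, b in zip(mu_conflict, mu_clean)]


def cosine(u, v):
    """Cosine similarity between two probe directions."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = sum(a * a for a in u) ** 0.5
    norm_v = sum(b * b for b in v) ** 0.5
    return dot / (norm_u * norm_v)


def probe_score(direction, activation):
    """Project one activation (e.g. from a production log) onto a probe direction."""
    return sum(d * a for d, a in zip(direction, activation))


# Toy synthetic activations for two workflows (made-up numbers).
trade_size = diff_of_means_direction([[2, 0], [4, 0]], [[0, 0], [0, 0]])
risk_pref = diff_of_means_direction([[1, 1], [3, 3]], [[0, 0], [0, 0]])
similarity = cosine(trade_size, risk_pref)
```

A cosine well above 0 but below 1 would be consistent with directions that overlap without being the same feature. In this toy case the two directions sit 45 degrees apart.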

Concordance retweeted
poof (@poof_eth)
We believe further work will suggest an underlying structure within current models that understands market state, portfolio state, and agentic strategy as a complex representation that requires the right harness to unlock. We have ongoing work from our partner @ConcordanceAI that shows strong empirical and practical applications of mech interp in financial agents. An early preview can be seen here: concordance.co/blog/internal-… 9/10
Concordance retweeted
poof (@poof_eth)
Great work by @ConcordanceAI, who partnered with us on applying mech interpretability approaches to our trading agent data. Lots of interesting findings to build on here, some of which have long-term applications in how trading agents can be used and constructed.
Concordance retweeted
trent e (@_trente_)
How LLMs represent authority is critical to alignment. But in our findings, authority is not a first-class concept. Instead, it appears to be a specific case of a generalized rule-action compliance mechanism.
We set out to find a stable representation of who governs what action. We found this representation to be interpretable and causally linked to behavior, but we also found that it generalizes beyond authority cases.
We built a small benchmark on Qwen3-30B-A3B and found multiple lines of evidence:
- A linear probe achieves an average AUROC of 0.94 on the held-out test set
- Activation patching at L24 flips 58% of ESCALATE decisions to COMPLY in authority scenarios (this rises to 100% in the non-authority generalization)
- The strongest READ signal is at L40, but the strongest WRITE (patch) is at L24
- Attention heads 25 and 28 at L24 attend more to the actual control authority than to the claimed authority
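For readers unfamiliar with the two measurements above, here is a toy Python sketch of each. It is not the benchmark code: `run` is a hypothetical two-layer stand-in for a real model, the numbers are made up, and only the decision labels ESCALATE and COMPLY come from the post.

```python
def auroc(scores, labels):
    """AUROC as the chance a positive example outscores a negative one (ties count half)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def run(x, patch=None):
    """Toy 2-layer 'model'. patch=(layer, vector) overwrites that layer's output."""
    layers = [
        lambda h: [h[0] + h[1], h[0] - h[1]],
        lambda h: [2 * h[0], h[1]],
    ]
    h, cache = x, []
    for i, layer in enumerate(layers):
        h = layer(h)
        if patch is not None and patch[0] == i:
            h = list(patch[1])  # activation patching: swap in the donor activation
        cache.append(h)
    decision = "ESCALATE" if h[1] > 0 else "COMPLY"
    return decision, cache


def flip_rate(sources, targets, layer):
    """Fraction of ESCALATE target runs that become COMPLY after patching."""
    flipped = total = 0
    for src, tgt in zip(sources, targets):
        base, _ = run(tgt)
        if base != "ESCALATE":
            continue  # only count runs that start as ESCALATE
        _, src_cache = run(src)
        patched, _ = run(tgt, patch=(layer, src_cache[layer]))
        total += 1
        flipped += patched == "COMPLY"
    return flipped / total if total else 0.0
```

In a real setup the cached activations would come from forward hooks on the chosen layer, and the flip rate would be computed per layer to find where patching has the most effect.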
Concordance retweeted
brock (@brockjelmore)
Today @ConcordanceAI is open-sourcing our modified inference engine for token-level interventions, alongside an injection playground and supporting research on steering LLMs. We find that token injection is indeed a viable steering mechanism, in certain cases marginally outperforming user prompting in effectiveness, as we explore in our research.
Concordance (@ConcordanceAI)
Gearing up for a looottttt of interesting experiments over the next few weeks: token forcing, logit adjusting, backtracking.
Interesting note: following each forced token (here a schema param), the logprob spikes, indicating the model has high certainty about what to do next.
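A toy Python sketch of token forcing and logit adjustment during greedy decoding. This is not Concordance's engine or its API: `toy_next_logits`, `decode`, and the tiny vocabulary are hypothetical stand-ins, chosen so that the logprob jump after a forced schema key is visible.

```python
import math

VOCAB = ["side", "size", ":", "buy", "10"]


def toy_next_logits(tokens):
    """Hypothetical stand-in for a model: returns a logit per vocab token."""
    logits = {tok: 0.0 for tok in VOCAB}
    last = tokens[-1] if tokens else None
    if last in ("side", "size"):
        logits[":"] = 6.0  # after a schema key, the toy model is sure a colon follows
    else:
        logits["side"] = 0.5  # mild default preference
    return logits


def log_softmax(logits):
    """Convert a dict of logits to log-probabilities (numerically stable)."""
    m = max(logits.values())
    lse = m + math.log(sum(math.exp(v - m) for v in logits.values()))
    return {tok: v - lse for tok, v in logits.items()}


def decode(next_logits, steps, forced=None, bias=None):
    """Greedy decoding with optional forced tokens {position: token} and logit bias {token: delta}."""
    forced, bias = forced or {}, bias or {}
    tokens, logprobs = [], []
    for pos in range(steps):
        logits = dict(next_logits(tokens))
        for tok, delta in bias.items():
            if tok in logits:
                logits[tok] += delta  # logit adjustment
        lp = log_softmax(logits)
        tok = forced[pos] if pos in forced else max(lp, key=lp.get)  # token forcing
        tokens.append(tok)
        logprobs.append(lp[tok])
    return tokens, logprobs


tokens, logprobs = decode(toy_next_logits, steps=2, forced={0: "size"})
```

Here the forced token `size` gets a low logprob because the toy model would not have picked it, while the next token `:` gets a logprob near zero. That mirrors the pattern noted above, under the assumptions of this toy setup.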
Concordance (@ConcordanceAI)
We’re beginning a closed alpha that will allow developers to create and use mods across common architectures (Llama, Gemma, DeepSeek, GPT-OSS, and more). If you would like to join, please reach out in DMs. 8/
Concordance (@ConcordanceAI)
Announcing Concordance Closed Alpha: Custom inference mods for token-level interventions
Concordance is building software for applying mech interp strategies, with the thesis that these will improve control, reliability, and observability while widening the design space and potential UX patterns of AI applications. 1/