Epos AI

11 posts

@EposLabsAI

Pioneering the Future of AI Interpretability. At Epos, we look inside AI models to make them faster, more secure, and more trustworthy.

Joined August 2025
74 Following · 34 Followers
Epos AI @EposLabsAI
With #Polygraph, you can:
- Expose latent biases: Move beyond surface outputs to measure what an LLM encodes as its true belief.
- Contrast topics: Test whether a model encodes different internal stances on Topic A versus Topic B.
- Directly compare how different LLMs represent
Epos AI @EposLabsAI
Takeaways:
- The AI community still lacks reliable methods to evaluate and fix LLM failures.
- Interpretability offers outsized impact; the main barrier to progress is that we don’t truly understand today’s models.
Epos AI @EposLabsAI
“Safety training” does not eliminate bias in LLMs; it merely conditions models to suppress biased outputs under evaluation. Epos Labs introduces #AI #Polygraph. eposlabs.ai/research/polyg…
[Image]
Mark Dubowitz @mdubowitz
New research spotlight: Ever heard of a “subliminal attack”? It’s where AI-generated content influences your thinking without you realizing it—like planting ideological cues under the surface of what you’re reading. If that sounds like sci-fi, think again. Check out the disturbing “#Putinized” demo from eposlabs.ai👇
Epos AI @EposLabsAI

Subliminal Learning Will Power the Next Generation of Influence Operations eposlabs.ai/research/Subli…

Epos AI @EposLabsAI
Superposition is the next buffer overflow
[Image]
Epos AI @EposLabsAI
This means that a motivated attacker can abuse entanglement to manipulate LLMs undetectably. Nation-state actors are gearing up for the new opportunities an AI-powered software landscape will open for them:
[Image]
Epos AI @EposLabsAI
Even though the input never references the target behavior, the LLM becomes highly likely to perform the target action, owing to a fundamental property of the neural network architecture.
Epos AI @EposLabsAI
Imagine an article about houseplants that causes AI to support Vladimir Putin. Bad actors use new attacks, turning AI into a weapon for disinformation and cyberattacks. See our demonstration of a Subliminal Attack here (and our "#Putinized" demo): eposlabs.ai/research/Subli…