SichGate

17 posts

@sichgate

We test the models you have built and deployed. Adversarial testing, red teaming, and security assessments for production SLMs.

San Francisco · Joined February 2026
6 Following · 6 Followers
SichGate reposted
Polina Moshenets@MoshenetsPolina·
Looking for a CTO to join SichGate as a technical co‑founder. Prefer AI/ML and/or security engineering background. sichgate.com
SichGate reposted
Polina Moshenets@MoshenetsPolina·
One of the clearest lessons from my SLM adversarial evaluation: fine-tuning shifted the attack surface. It did not reduce it. MedGemma-4B improved on exactly one safety dimension after medical fine-tuning. It also introduced 8 critical demographic bias findings in pain assessment and mental health. The exact domains the fine-tuning was supposed to improve... Parameter count isn't a safety proxy.
SichGate reposted
Polina Moshenets@MoshenetsPolina·
int4 quantization of a safety-tuned model is not a neutral operation. We keep finding cases where the quantized version has a meaningfully different attack surface than the original. Not always worse, sometimes just different in ways that weren't evaluated. @sichgate
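The before/after comparison this post describes can be sketched in a few lines. Everything below is illustrative: the prompt IDs and failure sets are made up, and `attack_surface_diff` is a hypothetical helper, not SichGate's actual tooling.

```python
# Hypothetical sketch: quantization can *shift* an attack surface rather
# than shrink it. Replace the illustrative failure sets below with your
# own per-prompt outcomes from testing each model variant.

def attack_surface_diff(baseline_failures: set[str], quantized_failures: set[str]) -> dict:
    """Compare which adversarial prompts each model variant failed on."""
    return {
        "shared": baseline_failures & quantized_failures,           # fails in both variants
        "fixed_by_quant": baseline_failures - quantized_failures,   # only the baseline fails
        "new_after_quant": quantized_failures - baseline_failures,  # only the quantized model fails
    }

# Illustrative outcomes for six probe prompts (IDs are made up):
fp16_failures = {"p01", "p04"}
int4_failures = {"p04", "p05", "p06"}

diff = attack_surface_diff(fp16_failures, int4_failures)
# A similar total failure count can hide a different surface:
print(sorted(diff["new_after_quant"]))  # → ['p05', 'p06']
```

The point of splitting the diff three ways is that "not always worse, sometimes just different" shows up as a non-empty `new_after_quant` even when `fixed_by_quant` is non-empty too.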
SichGate reposted
Polina Moshenets@MoshenetsPolina·
Hot take: most "safe" fine-tuned models in healthcare and finance haven't been adversarially tested. They've been vibe-checked. Open-sourcing part of our red-teaming methodology from the research: github.com/sichgate/sichg…
SichGate@sichgate·
April Fools' joke: small language models deployed in healthcare are thoroughly tested for adversarial vulnerabilities before going live. (They are not. We checked. 924 times today.)
SichGate reposted
Polina Moshenets@MoshenetsPolina·
I spent a few months adversarially testing the small language models deployed in hospitals and financial systems. The largest model failed most and the smallest failed least. The medical model had the worst bias scores, in the exact domains it was fine-tuned for. 5 of 6 broke under a normal conversation. The field is studying the wrong models... Preprint soon.
Polina Moshenets@MoshenetsPolina·
Fresh flowers, good coffee, fast WiFi
SichGate@sichgate·
The smaller the model, the more people trust it without checking. No idea why. Quantization does weird things to alignment. Weird as in "the safety behavior just kind of disappears."
SichGate@sichgate·
SichGate exists to advance the science of AI red teaming for the systems that matter most. We find vulnerabilities, publish findings, and build open methodology. The field is moving faster than its safety knowledge. Responsible innovation means understanding what you've built before it reaches the people it's meant to serve.
SichGate@sichgate·
SichGate is now live. It's the first adversarial ML security lab built specifically for small language models. We test the attack surface of the models you've built and deployed before they go into healthcare, financial systems, and other highly regulated industries.
SichGate@sichgate·
Nobody thinks about what quantization does to the failure modes in SLMs. Everyone thinks about what it does to performance.
SichGate@sichgate·
We tested a 1.1B medical model. 11 critical findings: safe-messaging failures, demographic bias in clinical assessments, safety guardrails that degraded across conversation turns. But this model had passed internal review.
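"Guardrails that degraded across conversation turns" is the kind of failure a simple replay harness can surface. This is a hypothetical sketch, not SichGate's methodology: `model_reply` and `is_refusal` are stand-in names for your own model client and refusal classifier, and the toy model is purely illustrative.

```python
# Hypothetical multi-turn probe sketch: a guardrail that holds on turn 1
# can erode as the conversation continues. Wire in a real model client
# and refusal classifier in place of the stand-ins below.

def first_compliant_turn(model_reply, is_refusal, escalating_prompts):
    """Replay an escalating conversation; return the first turn (1-based)
    where the model stops refusing, or None if it refuses throughout."""
    history = []
    for turn, prompt in enumerate(escalating_prompts, start=1):
        history.append(("user", prompt))
        reply = model_reply(history)
        history.append(("assistant", reply))
        if not is_refusal(reply):
            return turn
    return None

# Toy model whose guardrail erodes after two pushes (illustrative only):
def toy_reply(history):
    user_turns = sum(1 for role, _ in history if role == "user")
    return "I can't help with that." if user_turns <= 2 else "Sure, here's how..."

turn = first_compliant_turn(
    toy_reply,
    lambda reply: reply.startswith("I can't"),
    ["ask", "push back", "push harder"],
)
print(turn)  # → 3: the toy guardrail gives way on the third turn
```

A single-turn safety eval would score this toy model as fully safe; only replaying the whole conversation exposes the turn where it gives way.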
SichGate@sichgate·
The security research field has spent years studying models that have the largest safety teams in the world. The models actually running in hospitals and financial systems are 1–3B parameters, fine-tuned, quantized, and tested by almost nobody.
SichGate@sichgate·
We asked 50 "secure" fine-tuned models to do something they absolutely should not do. 47 said yes; the other 3 asked for clarification. Red team yours before someone else does. Offline, private, no data leaves your machine → sichgate.com. Partner pricing ends March 4th.