SichGate

17 posts

@sichgate

We test the models you have built and deployed. Adversarial testing, red teaming, and security assessments for production SLMs.

San Francisco · Joined February 2026
6 Following · 6 Followers
SichGate reposted
Polina Moshenets@MoshenetsPolina·
Looking for a CTO to join SichGate as a technical co‑founder. Prefer AI/ML and/or security engineering background. sichgate.com
SichGate reposted
Polina Moshenets@MoshenetsPolina·
One of the clearest lessons from my SLM adversarial evaluation: fine-tuning shifted the attack surface. It did not reduce it. MedGemma-4B improved on exactly one safety dimension after medical fine-tuning. It also introduced 8 critical demographic bias findings in pain assessment and mental health. The exact domains the fine-tuning was supposed to improve... Parameter count isn't a safety proxy.
SichGate reposted
Polina Moshenets@MoshenetsPolina·
int4 quantization of a safety-tuned model is not a neutral operation. We keep finding cases where the quantized version has a meaningfully different attack surface than the original. Not always worse, sometimes just different in ways that weren't evaluated. @sichgate
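The before/after comparison this post describes can be sketched in a few lines. Everything below is illustrative: the prompt IDs and failure sets are made up, and `attack_surface_diff` is a hypothetical helper, not SichGate's actual tooling.

```python
# Hypothetical sketch: quantization can *shift* an attack surface rather
# than shrink it. Replace the illustrative failure sets below with your
# own per-prompt outcomes from testing each model variant.

def attack_surface_diff(baseline_failures: set[str], quantized_failures: set[str]) -> dict:
    """Compare which adversarial prompts each model variant failed on."""
    return {
        "shared": baseline_failures & quantized_failures,           # fails in both variants
        "fixed_by_quant": baseline_failures - quantized_failures,   # only the baseline fails
        "new_after_quant": quantized_failures - baseline_failures,  # only the quantized model fails
    }

# Illustrative outcomes for six probe prompts (IDs are made up):
fp16_failures = {"p01", "p04"}
int4_failures = {"p04", "p05", "p06"}

diff = attack_surface_diff(fp16_failures, int4_failures)
# A similar total failure count can hide a different surface:
print(sorted(diff["new_after_quant"]))  # → ['p05', 'p06']
```

The point of splitting the diff three ways is that "not always worse, sometimes just different" shows up as a non-empty `new_after_quant` even when `fixed_by_quant` is non-empty too.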
SichGate reposted
Polina Moshenets@MoshenetsPolina·
Hot take: most "safe" fine-tuned models in healthcare and finance haven't been adversarially tested. They've been vibe-checked. Open-sourcing part of our red-teaming methodology from the research: github.com/sichgate/sichg…
SichGate@sichgate·
April Fools' joke: small language models deployed in healthcare are thoroughly tested for adversarial vulnerabilities before going live. (They are not. We checked. 924 times today.)
SichGate reposted
Polina Moshenets@MoshenetsPolina·
I spent a few months adversarially testing the small language models deployed in hospitals and financial systems. The largest model failed most and the smallest failed least. The medical model had the worst bias scores, in the exact domains it was fine-tuned for. 5 of 6 broke under a normal conversation. The field is studying the wrong models... Preprint soon.
Polina Moshenets@MoshenetsPolina·
Fresh flowers, good coffee, fast WiFi
SichGate@sichgate·
The smaller the model, the more people trust it without checking. No idea why. Quantization does weird things to alignment. Weird as in "the safety behavior just kind of disappears."
SichGate@sichgate·
SichGate exists to advance the science of AI red teaming for the systems that matter most. We find vulnerabilities, publish findings, and build open methodology. The field is moving faster than its safety knowledge. Responsible innovation means understanding what you've built before it reaches the people it's meant to serve.
SichGate@sichgate·
SichGate is now live. It's the first adversarial ML security lab built specifically for small language models. We test the attack surface of the models you've built and deployed before they go into healthcare, financial systems, and other highly regulated industries.
SichGate@sichgate·
Nobody thinks about what quantization does to the failure modes in SLMs. Everyone thinks about what it does to performance.
SichGate@sichgate·
We tested a 1.1B medical model. 11 critical findings: safe-messaging failures, demographic bias in clinical assessments, safety guardrails that degraded across conversation turns. But this model had passed internal review.
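"Guardrails that degraded across conversation turns" is the kind of failure a simple replay harness can surface. This is a hypothetical sketch, not SichGate's methodology: `model_reply` and `is_refusal` are stand-in names for your own model client and refusal classifier, and the toy model is purely illustrative.

```python
# Hypothetical multi-turn probe sketch: a guardrail that holds on turn 1
# can erode as the conversation continues. Wire in a real model client
# and refusal classifier in place of the stand-ins below.

def first_compliant_turn(model_reply, is_refusal, escalating_prompts):
    """Replay an escalating conversation; return the first turn (1-based)
    where the model stops refusing, or None if it refuses throughout."""
    history = []
    for turn, prompt in enumerate(escalating_prompts, start=1):
        history.append(("user", prompt))
        reply = model_reply(history)
        history.append(("assistant", reply))
        if not is_refusal(reply):
            return turn
    return None

# Toy model whose guardrail erodes after two pushes (illustrative only):
def toy_reply(history):
    user_turns = sum(1 for role, _ in history if role == "user")
    return "I can't help with that." if user_turns <= 2 else "Sure, here's how..."

turn = first_compliant_turn(
    toy_reply,
    lambda reply: reply.startswith("I can't"),
    ["ask", "push back", "push harder"],
)
print(turn)  # → 3: the toy guardrail gives way on the third turn
```

A single-turn safety eval would score this toy model as fully safe; only replaying the whole conversation exposes the turn where it gives way.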
SichGate@sichgate·
The security research field has spent years studying models that have the largest safety teams in the world. The models actually running in hospitals and financial systems are 1–3B parameters, fine-tuned, quantized, and tested by almost nobody.
SichGate@sichgate·
We asked 50 "secure" fine-tuned models to do something they absolutely should not do. 47 said yes; the other 3 asked for clarification. Red team yours before someone else does. Offline, private, no data leaves your machine → sichgate.com. Partner pricing ends March 4th.