Polina Moshenets
173 posts

@MoshenetsPolina
Founder building in Safety and Security @sichgate
San Francisco, CA · Joined September 2025
135 Following · 94 Followers

@Amank1412 impressive until the model starts hallucinating medical advice at 35,000 feet with no one to stop it. we test for that. sichgate.com
Polina Moshenets reposted

@MoshenetsPolina Curious if this generalizes to order-of-magnitude model size differences. I’d (naively) imagine at 70B or 1T+ params the attack surfaces scale either logarithmically or exponentially?

One of the clearest lessons from my SLM adversarial evaluation: Fine-tuning shifted the attack surface. It did not reduce it.
MedGemma-4B improved on exactly one safety dimension after medical fine-tuning. It also incurred 8 critical demographic bias findings in pain assessment and mental health. The exact domains the fine-tuning was supposed to improve...
Parameter count isn't a safety proxy.



@danielkempe Building SichGate, testing SLMs for safety and security before deployment in highly regulated industries. sichgate.com