Roko Pozaric
7 posts

@RPozaric
building @sentientAGI prev @princeton @pwaterpolo
Joined October 2022
22 Following · 185 Followers

1/ New ways to bypass model protection: How @SentientAGI shapes the future of AI research.
tl;dr.
Sentient @SentientAGI
Are Robust LLM Fingerprints Adversarially Robust? Our latest paper, accepted to IEEE SaTML 2026, analyzes the robustness of model fingerprinting under adversarial conditions and shows that simple, targeted attacks can reliably defeat many existing fingerprinting strategies. 🧵 Read more for a deep dive on our latest research
Roko Pozaric retweeted

🚀 𝐓𝐇𝐄 𝐅𝐈𝐑𝐒𝐓 𝐒𝐄𝐍𝐓𝐈𝐄𝐍𝐓 𝐌𝐎𝐃𝐄𝐋, 𝐃𝐎𝐁𝐁𝐘-𝐦𝐢𝐧𝐢, 𝐈𝐒 𝐋𝐀𝐔𝐍𝐂𝐇𝐈𝐍𝐆 𝐓𝐎𝐃𝐀𝐘
The signature fingerprinted Dobby 70B model is coming in early February.
𝐃𝐨𝐛𝐛𝐲-𝐦𝐢𝐧𝐢 𝐜𝐨𝐦𝐞𝐬 𝐢𝐧 𝐭𝐰𝐨 𝐟𝐥𝐚𝐯𝐨𝐫𝐬
😇𝐃𝐨𝐛𝐛𝐲-𝐦𝐢𝐧𝐢 𝐋𝐞𝐚𝐬𝐡𝐞𝐝—Friendly and conversational, your regular chill guy
😈𝐃𝐨𝐛𝐛𝐲-𝐦𝐢𝐧𝐢 𝐔𝐧𝐡𝐢𝐧𝐠𝐞𝐝—Brutally honest and unfiltered, speaks uncomfortable truths others won't
Try them now by running Dobby-mini locally:
👉 huggingface.co/collections/Se…
…with a helpful how-to guide from our friend @chrisaubin_:
👉 huggingface.co/blog/chrisaubi…

finally able to get my hands on one of these Dobby Fingerprints
x.com/SentientAGI/st…
Sentient @SentientAGI
Dobby enthusiasts, the time is now. We had 661,494 pre-registrations, and we want to give 𝐘𝐎𝐔 participants early access to a fingerprint and ownership claim in Dobby. Prove your intelligence. Claim your ownership. campaign.sentient.xyz campaign.sentient.xyz/?dobby=true
