Mark Breitenbach retweeted

Repeated token replay attacks continue to be viable
"After the Scalable Extraction paper was published, OpenAI implemented filtering of prompt inputs containing repeated single tokens. As part of our regular application security review, Dropbox engineers discovered that OpenAI’s models were, under certain circumstances, still vulnerable to the repeated token attack. Dropbox used repeated multi-token (>1) sequences to induce divergence in ChatGPT models and demonstrated extraction of memorized training data from both GPT-3.5 and GPT-4"
dropbox.tech/machine-learni…
