Gaurab Bhattacharjee

152 posts

Gaurab Bhattacharjee banner
Gaurab Bhattacharjee

Gaurab Bhattacharjee

@gaur_ab

Cyber Security | Founder

Melbourne, Victoria Katılım Eylül 2013
60 Takip Edilen38 Takipçiler
Gaurab Bhattacharjee
Gaurab Bhattacharjee@gaur_ab·
Well written, Abhay! Prompts, if not managed well, definitely have security implications. At the same time, do we need to do something very different to protect against any abuse cases? Prompts are inputs coming from untrusted sources and should be treated as such. Prompts with placeholders for customization based on user input are particularly important to keep an eye on.
English
0
0
1
49
Abhay Bhargav
Abhay Bhargav@abhaybhargav·
"Prompt Injection is not a security issue" Is essentially what I am hearing from some security/appsec folks out there. At first, I was inclined to agree. LLMs are probablistic in nature and prompt injection doesn't *seem* like a security issue. However, this has deeper implications. Let me try and dive in, based on what I have been experimenting with, in this space. FYI - I am not an AI expert. I am just someone trying to figure things out as I go. I am sure I will make some mistakes, perhaps even in this post. So please excuse my ignorance and point out where I might be wrong or right. First - I see Prompt Injection as a distinct family of attacks. An Attack Vector that has a variety of attack impacts and effects. A lot of us see prompt injection examples online as being able to "trick" a chatbot into giving us answers in humorous ways, or give us completely irrelevant information, not related to the area that the chatbot is "trained" on. Yes, this is an issue. Its not the *ONLY* issue. This in my view is one type of Prompt Injection, where Topical relevance of a GenAI application can be subverted to get the application to access other topics. Depending on the seriouness of the app, this can be either trivial or very bad. Then we have Sensitive Information Disclosure. Let's suppose that the GenAI app is trained on a bunch of the org's data. Some of it, sensitive. If an attacker is able to craft a prompt that gets the Application to spit out sensitive info, that is also (Prompt) Injection in my book. Yes, its an outcome of the same attack vector. But done through subversion of the same prompt Then we have "Excessive Agency". This is a scenario where a GenAI application uses Agents (that execute actions) based on one/many LLMs. For example - An Agent/Agents app that writes the entire codebase based on a user instruction or uses agents to manage a user's email inbox based on the user's instruction/prompt. This is actually much deadlier IMO than the other ones, because now its not only Text/Code Gen, but its interacting with Agents that are interacting with external APIs. This is also done by subverting the prompt. Finally we have Second Order Injections. These are Injections that are triggered by the use of external data in prompts. This includes being trained on datasets that are poisoned by a malicious actor. For example - If a RAG application is trained on a dataset, which selects a leader from a given dataset. Let's assume that somehow, the dataset has been tainted to get the leader's name to appear in multiple datasets, then the App might select this person as a leader. This *may* also have something to do with how the prompt is structured. This is why I see Prompt Injection as a very deep-seated security problem. And while it is made worse with other misconfigurations, it seems front-and-center in the discussion around vulnerabilities that any of us have around LLM Security
English
1
2
4
1.1K
Gaurab Bhattacharjee
Gaurab Bhattacharjee@gaur_ab·
Keep AI models and security tools updated with the latest security patches and threat intelligence.
English
0
0
1
36
Gaurab Bhattacharjee
Gaurab Bhattacharjee@gaur_ab·
Train developers on the potential risks of AI-generated (& human-written) code and encourage critical evaluation of suggestions. (if possible) train AI models on secure code examples
English
1
0
1
35