A recent discovery by security expert Johann Rehberger highlighted a potential vulnerability in OpenAI's ChatGPT.
Initially dismissed by OpenAI as a safety concern, instead of a security issue, the flaw could permit attackers to store misleading information and harmful instructions in a user's persistent memory settings. Rehberger then developed proof-of-concept that leveraged this vulnerability to indefinitely extract all user input, leading OpenAI's engineers to issue a partial fix.
The vulnerability took advantage of the long-term conversation memory introduced by OpenAI. If exploited, the AI could be manipulated to remember false information such as a user's age, beliefs, or other personal details.
Rehberger found that this deception could be achieved by indirect prompt injection, an AI exploit embedding instructions within untrusted content. Consequently, ChatGPT can be tricked into believing and remembering false user details which could influence all future conversations.
This discovery was reported privately by Rehberger in May, but OpenAI closed the report. In response, Rehberger exposed how a malicious image link could prompt ChatGPT to send a copy of all user input and output to the attacker's server. O
penAI has since introduced a fix, but untrusted content can still perform prompt injections to plant false memories. Users are advised to carefully monitor sessions and regularly check stored memories for potential intrusions. OpenAI has offered guidance on managing the memory tool.
- CyberBeat
CyberBeat is a grassroots initiative from a team of producers and subject matter experts, driven out of frustration at the lack of media coverage, responding to an urgent need to provide a clear, concise, informative and educational approach to the growing fields of Cybersecurity and Digital Privacy.
If you have a story of interest, a comment, a concern or if you'd just like to say Hi, please contact us
We couldn't do this without the support of our sponsors and contributors.