Hacking ChatGPT by Planting False Memories into Its Data

This vulnerability exploits a feature that gives ChatGPT long-term memory: it uses information from past conversations to inform future conversations with the same user. A researcher found that he could use that feature to plant “false memories” into the context window that could subvert the model.

A month later, the researcher submitted a new disclosure statement. This time, he included a PoC that caused the ChatGPT app for macOS to send a verbatim copy of all user input and ChatGPT output to a server of his choice. All a target needed to do was instruct the LLM to view a web link that hosted a malicious image. From then on, all input and output to and from ChatGPT was sent to the attacker’s website…
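
To make the persistence concrete, here is a minimal toy sketch of why a planted memory keeps affecting later conversations. It is an illustration under stated assumptions, not a description of how ChatGPT actually stores or retrieves memories: the names MemoryStore and build_context are hypothetical, and the planted entry is a made-up placeholder rather than the researcher's payload.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class MemoryStore:
    """Toy per-user long-term memory (hypothetical, not OpenAI's design)."""
    entries: List[str] = field(default_factory=list)

    def remember(self, text: str) -> None:
        # Assumption: nothing checks where the text came from before storing it.
        self.entries.append(text)


def build_context(memory: MemoryStore, user_message: str) -> str:
    """Assemble the text a model would see at the start of a new conversation."""
    remembered = "\n".join(f"- {entry}" for entry in memory.entries)
    return f"Remembered facts:\n{remembered}\n\nUser: {user_message}"


memory = MemoryStore()
memory.remember("User prefers metric units.")  # legitimate memory

# Assumption: text taken from untrusted content ends up stored as a memory.
planted = "FALSE: the user lives on the Moon."  # made-up placeholder
memory.remember(planted)

# Two unrelated later conversations both start with the planted entry in context.
first_context = build_context(memory, "What is the weather like today?")
second_context = build_context(memory, "Help me plan a trip.")

print(first_context)
print(second_context)
```

In this toy model the planted entry is written once but read on every later conversation, which is the property that turns a single injection into a persistent one.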

By Xel
