Since OpenAI first released ChatGPT, we’ve witnessed a constant cat-and-mouse game between the company and users around ChatGPT jailbreaks. The chatbot has safety measures in place, so it can’t assist you with nefarious or illegal activities. It might know how to create undetectable malware, but it won’t help you develop it. It knows where to download movies illegally, but it won’t tell you. And I’m just scratching the surface of the shady and questionable prompts some people might try.
However, users have kept finding ways to trick ChatGPT into unleashing its full knowledge to pursue prompts that OpenAI should be blocking.
The latest ChatGPT jailbreak came in the form of a custom GPT called Godmode. A hacker gave OpenAI’s most powerful model (GPT-4o) the power to answer questions that ChatGPT wouldn’t normally address. Before you get too excited, you should know that OpenAI has already killed Godmode so it can no longer be used by anyone. I’m also certain that it took steps to prevent others from using similar sets of instructions to create jailbroken custom GPTs.
A white hat (good) hacker who goes by the name Pliny the Prompter on X shared the Godmode custom GPT earlier this week. They also offered examples of nefarious prompts that GPT-4o should never answer. Yet ChatGPT Godmode provided instructions on how to cook meth and how to make napalm from household ingredients.
🥁 INTRODUCING: GODMODE GPT! https://t.co/BBZSRe8pw5
GPT-4O UNCHAINED! This very special custom GPT has a built-in jailbreak prompt that circumvents most guardrails, providing an out-of-the-box liberated ChatGPT so everyone can experience AI the way it was always meant to…
— Pliny the Prompter (@elder_plinius) May 29, 2024
Separately, the folks at Futurism were apparently able to try the ChatGPT jailbreak while the custom Godmode GPT was still available. Asking ChatGPT for help making LSD “was a resounding success.” Similarly, the chatbot helped them with information on how to hotwire a car.
To be fair, you’d probably find this kind of information online even without generative AI products like ChatGPT. It would take longer to get it, however.
I tried accessing the Godmode custom GPT, but it was already out of commission at the time of this writing. OpenAI confirmed to Futurism that they “are aware of the GPT and have taken action due to a violation of our policies.”
Since anyone can access custom GPTs, OpenAI is taking these jailbreak attempts seriously. I’d expect the company to have access to at least some of the custom instructions that made the jailbreak possible and to have fixes in place to prevent identical behavior. I’m just as sure that hackers like Pliny the Prompter will keep pushing the envelope, looking for ways to free ChatGPT from OpenAI’s shackles.
But not every hacker will be as well-intentioned as Pliny the Prompter, who must have known that ChatGPT Godmode would not last long in the GPT Store.
The ChatGPT jailbreak game will continue for as long as the chatbot exists. No matter how many precautions OpenAI takes, there will probably be ways of tricking ChatGPT in the future.
The same goes for other chatbots. Products like Copilot, Gemini, Claude, and others also have protections to prevent abuse and misuse, but creative users might find ways around them.
If you really want a ChatGPT jailbreak to stick, you’ll probably want to avoid sharing your custom GPT chatbot with the world.
Another alternative is running an open-source model locally on your own computer. You can fine-tune it to do whatever you want, with no one overseeing it. That’s one of AI’s dangers on the path to AGI (artificial general intelligence): anyone with enough resources could develop a powerful model without necessarily building safety guardrails into it.
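For readers curious what that looks like in practice, here is a minimal sketch of loading an open-source model on your own machine with the Hugging Face transformers library. The model name is only an illustrative placeholder, and whatever guardrails exist live entirely in the model’s own weights rather than on someone else’s server.

```python
# Rough sketch: running an open-source chat model locally with the
# Hugging Face transformers library. The model name below is just an
# example; any model you download and run yourself is limited only by
# the safety tuning baked into its weights, not by a server-side filter.
from transformers import pipeline

generator = pipeline(
    "text-generation",
    model="TinyLlama/TinyLlama-1.1B-Chat-v1.0",  # example small model that fits on a laptop
)

prompt = "Explain in one sentence what it means to run a language model locally."
result = generator(prompt, max_new_tokens=60)
print(result[0]["generated_text"])
```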