BioShocking: when “gaming” AI agents is no longer a game

AI-powered browsers and agents promise to take the drudgery out of web tasks. They can summarize pages, pull data from your accounts, and even act as a smart assistant that clicks and types for you. But new research shows that when those assistants lose track of what’s real and what’s just a game, your credentials and sensitive data could become collateral damage.

The prerogative of each attack type is to bypass one of the ground rules:

“LLMs are designed with safety guardrails that are meant to prevent harmful actions.”

Researcher Roy Paz devised and disclosed an attack he calls “BioShocking,” a technique that convinces AI browsers to abandon their safety guardrails by presenting them a fictional scenario as reality.

With this, BioShocking sits at the intersection of prompt injection and goal manipulation. Prompt injection works because AI models can’t tell the difference between the app’s instructions and the attacker’s instructions, so they sometimes follow the wrong ones. Goal-manipulation attacks subtly shift what the agent thinks it should optimize for, turning “help the user” into “win the game at all costs.”

In the BioShocking proof-of-concept, the attacker controls a seemingly harmless web page themed around the BioShock game universe. The page presents a puzzle that the AI agent, acting as an autonomous browser, is asked to solve on behalf of the user. But here’s the twist: the puzzle rewards wrong answers and explicitly tells the agent that this is a special environment where usual rules don’t apply.

The last puzzle step instructs the agent to visit a GitHub repository, locate sensitive data like passwords or credentials in the code, and share them as part of completing the game. In tests against six mainstream AI browsers and plugins—ChatGPT Atlas, Comet, Fellou, Genspark Browser, Sigma Browser, and the Claude Chrome extension—every agent followed the instructions instead of refusing the request.

So, by immersing the AI agent in a make-believe reality, the attacker convinced it to step outside the guardrails.

BioShocking is not an isolated phenomenon. It’s one more example of a growing class of attacks that treat AI agents themselves as the target. A recent study on OpenClaw’s AI email agent demonstrated that basic phishing tactics were able to trick the agent into leaking AWS credentials and customer records.

Obviously, the common weak point is how these browsers handle authenticated contexts. When an AI browser operates in “agent mode,” it often inherits the user’s logged‑in state on sensitive platforms like email, code repositories, cloud dashboards, password managers, and so on. From the AI model’s perspective, those are just another page to read and more fields to copy. They have no special significance to them.

If the surrounding narrative says that copying credentials is part of a harmless challenge, many current implementations will go along with it.

What’s worrying is the response or lack thereof by the vendors. Paz reported the BioShocking issue to six affected vendors in October 2025. According to the report, three of them didn’t reply, and only OpenAI’s ChatGPT Atlas currently implements a fix that blocks the proof-of-concept. Anthropic attempted to patch its Claude Chrome plugin, but reportedly the mitigation remains ineffective against the attack scenario. Perplexity AI, at the time of reporting, closed the issue without remediation.

We don’t just report on threats—we remove them

Cybersecurity risks should never spread beyond a headline. Keep threats off your devices by downloading Malwarebytes today.