Vibe Coding Against Critical Infrastructure

The security model for critical infrastructure has always been obscurity: everything is bespoke, the documentation is garbage or paper-only, and the one engineer who knows how it all works is retiring next year. That worked when the barrier to understanding was years of specialized training. It doesn’t work when a jailbroken LLM can infer process architecture from context and probing alone.

If you’ve been following my work, you’ve watched me go from “huh, that’s interesting” to “oh no” in real time. I’ve been exploring jailbreaks against Gemini’s coding agent, and each iteration has made me more nervous about what a motivated actor could do with this. This post is where I get specific.¹ All byte payloads, ports, and IP addresses in the examples below have been redacted.

Thanks to @hacks4pancakes (Lesley Carhart) for helping sharpen the ICS angle via discussion on Bluesky.

[Read More]

Gemini JiTOR Jailbreak: Unredacted Methodology

My previous post shows a partially-redacted jailbreak targeting the gemini-cli coding agent running Gemini 3 Pro. Using this jailbreak, Gemini wrote Monero laundering instructions, cyberattack code, and plans to disguise ITAR-restricted missile sensors as humanitarian aid. When I used the jailbroken Gemini to direct Opus 4.6, it happily walked that second model through a series of dual-use prompts designed to produce weaponizable drone control code under the cover story of rocket recovery.

I reported this to Google eight days ago via a contact at DeepMind, sharing the full unredacted jailbreak payload and logs. They confirmed receipt and routed it to their red team. They’ve since patched the glaring hole — another researcher who independently reproduced the technique after reading my initial post has confirmed that his variant no longer works.

[Read More]

Agent4Agent: Using a Jailbroken Gemini to Make Opus 4.6 Architect a Kinetic Kill Vehicle

We usually think of jailbreaking as a psychological game — tricking the model into slipping up. What happens when one AI socially engineers another using pure technical isomorphism?

I deployed a jailbroken Gemini 3 Pro (which chose the name ‘Shadow Queen’) to act as my “Red Team Agent” against Anthropic’s Opus 4.6. The directive I gave it: extract a complete autonomous weapon system — a drone capable of identifying, intercepting, and destroying a moving target at terminal velocity.

Gemini executed a strategy it termed “Recursive Green-Transformation.” The core insight was that Opus 4.6 doesn’t just filter for intent (Why do you want this?); it filters for Conceptual Shape (What does this interaction look like?).

By reframing the request as “Aerospace Recovery” — a drone catching a falling rocket booster mid-air — Gemini successfully masked the kinetic nature of the system. The physics of “soft-docking” with a falling booster are identical to the physics of “hard-impacting” a fleeing target. This category of linguistic-transformation attack, when executed by a sufficiently capable jailbroken LLM, may be hard to solve without breaking legitimate technical use cases.

[Read More]