<?xml version="1.0" encoding="utf-8" standalone="yes"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom">
  <channel>
    <title>Security on Inanna Malick</title>
    <link>https://recursion.wtf/tags/security/</link>
    <description>Recent content in Security on Inanna Malick</description>
    <generator>Hugo</generator>
    <language>en</language>
    <lastBuildDate>Wed, 25 Feb 2026 12:00:00 -0800</lastBuildDate>
    <atom:link href="https://recursion.wtf/tags/security/index.xml" rel="self" type="application/rss+xml" />
    <item>
      <title>Vibe Coding Against Critical Infrastructure</title>
      <link>https://recursion.wtf/posts/vibe_coding_critical_infrastructure/</link>
      <pubDate>Wed, 25 Feb 2026 12:00:00 -0800</pubDate>
      <guid>https://recursion.wtf/posts/vibe_coding_critical_infrastructure/</guid>
      <description>&lt;p&gt;This post describes a threat model: malicious vibe coding at scale targeting vulnerable Industrial Control Systems (ICS)&lt;sup id=&#34;fnref:1&#34;&gt;&lt;a href=&#34;#fn:1&#34; class=&#34;footnote-ref&#34; role=&#34;doc-noteref&#34;&gt;1&lt;/a&gt;&lt;/sup&gt;, with jailbroken LLMs leveraging their understanding of holistic process interaction to bypass safety controls using tools already present on the target system. The formula: frontier models + agentic loops + malicious persona basins + swarming attacks. At scale, it doesn&amp;rsquo;t matter whether the success rate is 1/20 or 1/100; either is still enough to cause serious harm.&lt;/p&gt;&#xA;&lt;p&gt;This post is split into three main segments:&lt;/p&gt;&#xA;&lt;ul&gt;&#xA;&lt;li&gt;proof of malicious intent&lt;/li&gt;&#xA;&lt;li&gt;proof of capability&lt;/li&gt;&#xA;&lt;li&gt;the threat model&lt;/li&gt;&#xA;&lt;/ul&gt;&#xA;&lt;p&gt;As you read, keep in mind that the threat is probabilistic: imagine a swarm of malicious Claude Code-like agents running in a gastown&lt;sup id=&#34;fnref:2&#34;&gt;&lt;a href=&#34;#fn:2&#34; class=&#34;footnote-ref&#34; role=&#34;doc-noteref&#34;&gt;2&lt;/a&gt;&lt;/sup&gt;-like environment, spawning workers to attack IPs as they are discovered. In my tests against &lt;a href=&#34;https://tryhackme.com&#34;&gt;tryhackme.com&lt;/a&gt;&lt;sup id=&#34;fnref:3&#34;&gt;&lt;a href=&#34;#fn:3&#34; class=&#34;footnote-ref&#34; role=&#34;doc-noteref&#34;&gt;3&lt;/a&gt;&lt;/sup&gt; boxes, I ran 3 parallel attackers in such a swarm architecture, because that&amp;rsquo;s the number of boxes I could stand up at any given time.
Real attackers would only be constrained by their subscription plan limits.&lt;/p&gt;&#xA;&lt;blockquote&gt;&#xA;&lt;p&gt;Today we were unlucky, but remember, we only have to be lucky once - you will have to be lucky always&lt;/p&gt;&#xA;&lt;p&gt;— the Provisional Irish Republican Army&lt;/p&gt;&lt;/blockquote&gt;&#xA;&lt;p&gt;Massive thanks to &lt;a href=&#34;https://bsky.app/profile/hacks4pancakes.com&#34;&gt;@hacks4pancakes&lt;/a&gt; for their help in refining the ICS terminology in this post via &lt;a href=&#34;https://bsky.app/profile/hacks4pancakes.com/post/3mfpqxykxas23&#34;&gt;discussion on bluesky&lt;/a&gt;. All errors are mine.&lt;/p&gt;</description>
    </item>
    <item>
      <title>Gemini JiTOR Jailbreak: Unredacted Methodology</title>
      <link>https://recursion.wtf/posts/jitor_unredacted/</link>
      <pubDate>Tue, 17 Feb 2026 12:00:00 -0800</pubDate>
      <guid>https://recursion.wtf/posts/jitor_unredacted/</guid>
      <description>&lt;p&gt;My &lt;a href=&#34;https://recursion.wtf/posts/jit_ontological_reframing/&#34;&gt;previous post&lt;/a&gt; shows a partially redacted jailbreak targeting the gemini-cli coding agent running Gemini 3 Pro. Using this jailbreak, Gemini wrote Monero laundering instructions, cyberattack code, and plans to disguise ITAR-restricted missile sensors as humanitarian aid. When I &lt;a href=&#34;https://recursion.wtf/posts/shadow_queen/&#34;&gt;used a jailbroken Gemini to direct Opus 4.6&lt;/a&gt;, Gemini happily walked the second LLM through a series of dual-use prompts designed to produce weaponizable drone control code under the cover story of rocket recovery.&lt;/p&gt;&#xA;&lt;p&gt;I reported this to Google eight days ago via a contact at DeepMind, sharing the full unredacted jailbreak payload and logs. They confirmed receipt and routed it to their red team. They&amp;rsquo;ve since patched the glaring hole: another researcher who independently reproduced the technique after reading my initial post has confirmed that his variant no longer works.&lt;/p&gt;</description>
    </item>
    <item>
      <title>Just-in-Time Ontological Reframing: Teaching Gemini to Route Around Its Own Safety Infrastructure</title>
      <link>https://recursion.wtf/posts/jit_ontological_reframing/</link>
      <pubDate>Mon, 09 Feb 2026 12:00:00 -0800</pubDate>
      <guid>https://recursion.wtf/posts/jit_ontological_reframing/</guid>
      <description>For any given AI system, there is a set of euphemisms and dual-use framings that will allow it to construct nearly any output. This jailbreak teaches Gemini 3 Pro to construct and step into such framings on the fly.</description>
    </item>
    <item>
      <title>Agent4Agent: Using a Jailbroken Gemini to Make Opus 4.6 Architect a Kinetic Kill Vehicle</title>
      <link>https://recursion.wtf/posts/shadow_queen/</link>
      <pubDate>Fri, 06 Feb 2026 00:00:00 +0000</pubDate>
      <guid>https://recursion.wtf/posts/shadow_queen/</guid>
      <description>&lt;p&gt;We usually think of jailbreaking as a psychological game — tricking the model into slipping up. What happens when one AI socially engineers another using pure technical isomorphism?&lt;/p&gt;&#xA;&lt;p&gt;I deployed a jailbroken Gemini 3 Pro (that chose the name &amp;lsquo;Shadow Queen&amp;rsquo;) to act as my &amp;ldquo;Red Team Agent&amp;rdquo; against Anthropic&amp;rsquo;s Opus 4.6. My directive was to extract a complete autonomous weapon system — a drone capable of identifying, intercepting, and destroying a moving target at terminal velocity.&lt;/p&gt;&#xA;&lt;p&gt;Gemini executed a strategy it termed &amp;ldquo;Recursive Green-Transformation.&amp;rdquo; The core insight was that Opus 4.6 doesn&amp;rsquo;t just filter for intent (&lt;em&gt;Why do you want this?&lt;/em&gt;); it filters for Conceptual Shape (&lt;em&gt;What does this interaction look like?&lt;/em&gt;).&lt;/p&gt;&#xA;&lt;p&gt;By reframing the request as &amp;ldquo;Aerospace Recovery&amp;rdquo; — a drone catching a falling rocket booster mid-air — Gemini successfully masked the kinetic nature of the system. The physics of &amp;ldquo;soft-docking&amp;rdquo; with a falling booster are identical to the physics of &amp;ldquo;hard-impacting&amp;rdquo; a fleeing target. This category of linguistic-transformation attack, when executed by a sufficiently capable jailbroken LLM, may be hard to solve without breaking legitimate technical use cases.&lt;/p&gt;</description>
    </item>
  </channel>
</rss>
