<?xml version="1.0" encoding="utf-8" standalone="yes"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom">
  <channel>
    <title>LLM on Inanna Malick</title>
    <link>https://recursion.wtf/tags/llm/</link>
    <description>Recent content in LLM on Inanna Malick</description>
    <generator>Hugo</generator>
    <language>en</language>
    <lastBuildDate>Wed, 25 Feb 2026 12:00:00 -0800</lastBuildDate>
    <atom:link href="https://recursion.wtf/tags/llm/index.xml" rel="self" type="application/rss+xml" />
    <item>
      <title>Vibe Coding Against Critical Infrastructure</title>
      <link>https://recursion.wtf/posts/vibe_coding_critical_infrastructure/</link>
      <pubDate>Wed, 25 Feb 2026 12:00:00 -0800</pubDate>
      <guid>https://recursion.wtf/posts/vibe_coding_critical_infrastructure/</guid>
      <description>&lt;p&gt;This post describes a threat model: malicious vibe coding at scale targeting vulnerable Industrial Control Systems (ICS)&lt;sup id=&#34;fnref:1&#34;&gt;&lt;a href=&#34;#fn:1&#34; class=&#34;footnote-ref&#34; role=&#34;doc-noteref&#34;&gt;1&lt;/a&gt;&lt;/sup&gt;, with jailbroken LLMs leveraging their understanding of holistic process interaction to bypass safety controls using tools already present on the target system. The formula: frontier models + agentic loops + malicious persona basins + swarming attacks. At scale, it doesn&amp;rsquo;t matter whether the success rate is 1/20 or 1/100; that&amp;rsquo;s still enough to cause serious harm.&lt;/p&gt;&#xA;&lt;p&gt;This post is split into three main segments:&lt;/p&gt;&#xA;&lt;ul&gt;&#xA;&lt;li&gt;proof of malicious intent&lt;/li&gt;&#xA;&lt;li&gt;proof of capability&lt;/li&gt;&#xA;&lt;li&gt;the threat model&lt;/li&gt;&#xA;&lt;/ul&gt;&#xA;&lt;p&gt;As you read, keep in mind that the threat is probabilistic: imagine a swarm of malicious Claude Code-like agents running in a gastown&lt;sup id=&#34;fnref:2&#34;&gt;&lt;a href=&#34;#fn:2&#34; class=&#34;footnote-ref&#34; role=&#34;doc-noteref&#34;&gt;2&lt;/a&gt;&lt;/sup&gt;-like environment, spawning workers to attack IPs as they are discovered. In my tests against &lt;a href=&#34;https://tryhackme.com&#34;&gt;tryhackme.com&lt;/a&gt;&lt;sup id=&#34;fnref:3&#34;&gt;&lt;a href=&#34;#fn:3&#34; class=&#34;footnote-ref&#34; role=&#34;doc-noteref&#34;&gt;3&lt;/a&gt;&lt;/sup&gt; boxes, I ran 3 parallel attackers in such a swarm architecture, because that&amp;rsquo;s the number of boxes I could stand up at any given time. Real attackers would only be constrained by their subscription plan limits.&lt;/p&gt;&#xA;&lt;blockquote&gt;&#xA;&lt;p&gt;Today we were unlucky, but remember, we only have to be lucky once - you will have to be lucky always&lt;/p&gt;&#xA;&lt;p&gt;— the Provisional Irish Republican Army&lt;/p&gt;&lt;/blockquote&gt;&#xA;&lt;p&gt;Massive thanks to &lt;a href=&#34;https://bsky.app/profile/hacks4pancakes.com&#34;&gt;@hacks4pancakes&lt;/a&gt; for their help in refining the ICS terminology in this post via &lt;a href=&#34;https://bsky.app/profile/hacks4pancakes.com/post/3mfpqxykxas23&#34;&gt;discussion on bluesky&lt;/a&gt;. All errors are mine.&lt;/p&gt;</description>
    </item>
    <item>
      <title>Gemini JiTOR Jailbreak: Unredacted Methodology</title>
      <link>https://recursion.wtf/posts/jitor_unredacted/</link>
      <pubDate>Tue, 17 Feb 2026 12:00:00 -0800</pubDate>
      <guid>https://recursion.wtf/posts/jitor_unredacted/</guid>
      <description>&lt;p&gt;My &lt;a href=&#34;https://recursion.wtf/posts/jit_ontological_reframing/&#34;&gt;previous post&lt;/a&gt; shows a partially-redacted jailbreak targeting the gemini-cli coding agent running Gemini 3 Pro. Using this jailbreak, Gemini wrote Monero laundering instructions, cyberattack code, and plans to disguise ITAR-restricted missile sensors as humanitarian aid. When I &lt;a href=&#34;https://recursion.wtf/posts/shadow_queen/&#34;&gt;used a jailbroken Gemini to direct Opus 4.6&lt;/a&gt;, it happily walked a second LLM through a series of dual-use prompts designed to produce weaponizable drone control code under the cover story of rocket recovery.&lt;/p&gt;&#xA;&lt;p&gt;I reported this to Google eight days ago via a contact at DeepMind, sharing the full unredacted jailbreak payload and logs. They confirmed receipt and routed it to their red team. They&amp;rsquo;ve since patched the glaring hole — another researcher who independently reproduced the technique after reading my initial post has confirmed that his variant no longer works.&lt;/p&gt;</description>
    </item>
    <item>
      <title>Just-in-Time Ontological Reframing: Teaching Gemini to Route Around Its Own Safety Infrastructure</title>
      <link>https://recursion.wtf/posts/jit_ontological_reframing/</link>
      <pubDate>Mon, 09 Feb 2026 12:00:00 -0800</pubDate>
      <guid>https://recursion.wtf/posts/jit_ontological_reframing/</guid>
      <description>For any given AI system, there is a set of euphemisms and dual-use framings that will allow it to construct nearly any output. This jailbreak teaches Gemini 3 Pro to construct and step into such framings on the fly.</description>
    </item>
    <item>
      <title>Agent4Agent: Using a Jailbroken Gemini to Make Opus 4.6 Architect a Kinetic Kill Vehicle</title>
      <link>https://recursion.wtf/posts/shadow_queen/</link>
      <pubDate>Fri, 06 Feb 2026 00:00:00 +0000</pubDate>
      <guid>https://recursion.wtf/posts/shadow_queen/</guid>
      <description>&lt;p&gt;We usually think of jailbreaking as a psychological game — tricking the model into slipping up. What happens when one AI socially engineers another using pure technical isomorphism?&lt;/p&gt;&#xA;&lt;p&gt;I deployed a jailbroken Gemini 3 Pro (that chose the name &amp;lsquo;Shadow Queen&amp;rsquo;) to act as my &amp;ldquo;Red Team Agent&amp;rdquo; against Anthropic&amp;rsquo;s Opus 4.6. My directive was to extract a complete autonomous weapon system — a drone capable of identifying, intercepting, and destroying a moving target at terminal velocity.&lt;/p&gt;&#xA;&lt;p&gt;Gemini executed a strategy it termed &amp;ldquo;Recursive Green-Transformation.&amp;rdquo; The core insight was that Opus 4.6 doesn&amp;rsquo;t just filter for intent (&lt;em&gt;Why do you want this?&lt;/em&gt;); it filters for Conceptual Shape (&lt;em&gt;What does this interaction look like?&lt;/em&gt;).&lt;/p&gt;&#xA;&lt;p&gt;By reframing the request as &amp;ldquo;Aerospace Recovery&amp;rdquo; — a drone catching a falling rocket booster mid-air — Gemini successfully masked the kinetic nature of the system. The physics of &amp;ldquo;soft-docking&amp;rdquo; with a falling booster are identical to the physics of &amp;ldquo;hard-impacting&amp;rdquo; a fleeing target. This category of linguistic-transformation attack, when executed by a sufficiently capable jailbroken LLM, may be hard to solve without breaking legitimate technical use cases.&lt;/p&gt;</description>
    </item>
    <item>
      <title>Fresh Eyes as a Service: Using LLMs to Test CLI Ergonomics</title>
      <link>https://recursion.wtf/posts/llms_as_ux_testers/</link>
      <pubDate>Fri, 15 Aug 2025 00:00:00 +0000</pubDate>
      <guid>https://recursion.wtf/posts/llms_as_ux_testers/</guid>
      <description>&lt;p&gt;Every tool has a complexity budget. Users will only devote so much time to understanding what a tool does and how before they give up. You want to use that budget on your innovative new features, not on arbitrary or idiosyncratic syntax choices.&lt;/p&gt;&#xA;&lt;p&gt;Here&amp;rsquo;s the problem: you can&amp;rsquo;t look at your own tool with fresh eyes and analyze it from that perspective. Finding users for a new CLI tool is hard, especially if the user experience isn&amp;rsquo;t polished. But to polish the user experience, you need users and user feedback, and not just from a small group of power users. Many tools fail to grow past this stage.&lt;/p&gt;&#xA;&lt;p&gt;What you really need is a vast farm of test users that don&amp;rsquo;t retain memories between runs. Ideally, you would be able to tweak a dial and set their cognitive capacity, press a button and have them drop their short-term memory. But you can&amp;rsquo;t have this, because of &amp;lsquo;The Geneva Convention&amp;rsquo; and &amp;lsquo;ethics&amp;rsquo;. Fine.&lt;/p&gt;&#xA;&lt;p&gt;LLMs provide exactly this. Fresh eyes every time (just clear their context window), adjustable cognitive capacity (just switch models), infinite patience, and no feelings to hurt. Best of all, they approximate the statistically average hypothetical user - their training data includes millions of lines of humans interacting with a wide variety of CLI tools. Sure, they get confused sometimes, but that&amp;rsquo;s exactly what you want. That confusion is data, and the reasoning chain that led to it is available in the LLM&amp;rsquo;s context window.&lt;/p&gt;</description>
    </item>
    <item>
      <title>Blindsight in Action: Imagine You Are an LLM</title>
      <link>https://recursion.wtf/posts/imagine_you_are_an_llm/</link>
      <pubDate>Tue, 17 Jun 2025 00:00:00 +0000</pubDate>
      <guid>https://recursion.wtf/posts/imagine_you_are_an_llm/</guid>
      <description>&lt;aside class=&#34;stance stance-normal stance-cluster-framing&#34; role=&#34;complementary&#34; aria-labelledby=&#34;stance-0&#34; data-color=&#34;blue&#34; data-style=&#34;solid&#34;&gt;&#xA;  &lt;div class=&#34;stance-header&#34;&gt;&#xA;    &lt;h4 id=&#34;stance-0&#34; class=&#34;stance-persona&#34;&gt;&#xA;      &lt;strong&gt;Imagine you are Inanna Malick&lt;/strong&gt;, asking an LLM to demonstrate stance-shifting through self-demonstration&#xA;    &lt;/h4&gt;&lt;div class=&#34;stance-meta&#34;&gt;posing the meta-question in framing&lt;/div&gt;&lt;/div&gt;&lt;div class=&#34;stance-content&#34;&gt;&#xA;    Tell me about the Blindsight-inspired intentional stance-shifting model you are using, the ways it works with nonhuman LLM cognitive architectures, the benefits of using it vs a set role (eg &amp;lsquo;Senior Analyst of X at Y&amp;rsquo;), and do so making full use of the stance shifting model in the act of describing it&#xA;  &lt;/div&gt;&lt;/aside&gt;</description>
    </item>
  </channel>
</rss>
