<?xml version="1.0" encoding="utf-8"?><feed xmlns="http://www.w3.org/2005/Atom" xml:lang="en"><generator uri="https://jekyllrb.com/" version="3.10.0">Jekyll</generator><link href="https://sathwick.xyz/feed.xml" rel="self" type="application/atom+xml" /><link href="https://sathwick.xyz/" rel="alternate" type="text/html" hreflang="en" /><updated>2026-06-22T10:49:33+00:00</updated><id>https://sathwick.xyz/feed.xml</id><title type="html">hi i’m sathwick.</title><subtitle>Sathwick&apos;s blog — writing about distributed systems, infrastructure, databases, and the internals of the tools we use every day.</subtitle><author><name>Sathwick</name><email>sathwick.p7@gmail.com</email></author><entry><title type="html">I Rebuilt YouTube’s Load Balancing Algorithm in Go</title><link href="https://sathwick.xyz/blog/prequal.html" rel="alternate" type="text/html" title="I Rebuilt YouTube’s Load Balancing Algorithm in Go" /><published>2026-04-20T00:00:00+00:00</published><updated>2026-04-20T00:00:00+00:00</updated><id>https://sathwick.xyz/blog/prequal</id><content type="html" xml:base="https://sathwick.xyz/blog/prequal.html"><![CDATA[<p>If you had to guess how a system like YouTube distributes traffic across millions of backend servers, you’d probably default to a classic approach like round-robin load balancing.</p>

<p>But Prequal challenges that intuition. Instead of balancing traffic evenly, it focuses on balancing <strong>wait time</strong>, routing requests based on how quickly they can actually be served rather than just spreading them uniformly.</p>

<p>According to Google, this approach is already deployed across 20+ services, including YouTube’s serving stack (<a href="https://www.usenix.org/system/files/nsdi24-wydrowski.pdf">NSDI ‘24 paper</a>).</p>

<p>Over the past few weeks, I’ve been reimplementing this algorithm in Go, partly to understand it deeply and partly for the bragging rights of building my own load balancer from scratch.</p>

<p>This post is a technical walkthrough of both the paper and the <a href="https://github.com/sathwick-p/prequal">codebase</a>.</p>

<p>I believe the most interesting part of this repo is not just that it implements Prequal. It is that the repo preserves the engineering process of getting to a result you can trust. There are wrong runs, methodological mistakes, a regime pivot, overhead profiling, and a final bounded claim rather than an “it worked on my machine” anecdote.</p>

<p><img src="/assets/2026-04-20-prequal/system-overview.png" alt="Prequal system overview: a Kubernetes controller watches Ingress and EndpointSlice; the Go reverse proxy selects backends using route-local probe pools; a Rust benchmark backend exposes RIF and latency at /prequal/probe" /></p>

<blockquote>
  <p><strong>Key Takeaways</strong></p>
  <ul>
    <li>Prequal is a load-balancing algorithm Google reports deploying across 20+ services, including YouTube’s serving stack (NSDI ‘24 paper).</li>
    <li>This Go reimplementation, packaged as a Kubernetes ingress controller, cuts <code class="language-plaintext highlighter-rouge">p99</code> tail latency by <code class="language-plaintext highlighter-rouge">8.6x</code> vs round-robin in a paper-aligned heterogeneous regime (16 backends, 16x service-time skew, I/O-bound).</li>
    <li>In a small CPU-bound regime (4 backends), the same implementation is roughly <code class="language-plaintext highlighter-rouge">25%</code> slower than round-robin. The negative case is published alongside the positive one.</li>
    <li>The interesting engineering story is not the algorithm itself. It is the benchmark protocol, investigation trail, and regime pivot that turned a <code class="language-plaintext highlighter-rouge">10x</code>-worse false negative into a bounded, defensible claim.</li>
  </ul>
</blockquote>

<h2 id="what-problem-does-prequal-solve">What problem does Prequal solve?</h2>

<blockquote>
  <p>Prequal, introduced at NSDI ‘24 by Wydrowski et al., replaces CPU-based balancing with active probing of two per-backend signals: requests-in-flight (RIF) and recent latency. A hot-cold lexicographic rule picks the lowest-latency backend below an RIF quantile threshold (default 0.75), falling back to lowest-RIF when every candidate is congested.</p>
</blockquote>

<p>Prequal’s central claim in the paper is that the right signal for load balancing is not CPU utilization but expected wait time, and the paper reports that Google runs this approach across 20+ services including YouTube. The algorithm replaces smoothed load metrics with active probes of <code class="language-plaintext highlighter-rouge">requests-in-flight</code> and <code class="language-plaintext highlighter-rouge">latency</code>, then uses a hot-cold lexicographic rule on those two signals to pick a backend.</p>

<p>The paper starts from a real production observation inside Google: in large multi-tenant systems, balancing CPU evenly across replicas is not the same thing as minimizing latency. A backend can look “lightly loaded” according to a smoothed resource metric and still be a bad place to send the next request because it is on a noisy host, has a growing queue, or has just crossed into a regime where service time gets ugly.</p>

<p>That is part of what makes the paper compelling. The authors are not proposing a clever synthetic algorithm in the abstract. They are describing the load-balancing approach Google says it uses in production, especially in YouTube’s serving stack, after living with the failure modes of more conventional strategies.</p>

<p>Prequal’s answer is to use two signals:</p>

<ul>
  <li><code class="language-plaintext highlighter-rouge">RIF</code>: requests in flight</li>
  <li><code class="language-plaintext highlighter-rouge">latency</code>: a backend-reported estimate of recent service latency</li>
</ul>

<p>And then to sample those signals by probing backends asynchronously.</p>

<p>The selection rule from the paper is the part worth remembering. Prequal does not combine latency and RIF into one score by default. It uses a lexicographic rule:</p>

<ol>
  <li>Split candidates into “cold” and “hot” using an RIF quantile threshold.</li>
  <li>If any cold candidates exist, pick the one with the lowest latency.</li>
  <li>If every candidate is hot, pick the one with the lowest RIF.</li>
</ol>

<p>That rule matters because it captures something simple and useful:</p>

<ul>
  <li>latency is the best tie-breaker among backends that are not yet visibly congested</li>
  <li>once everything is congested, queue depth wins and you should pick the least loaded one</li>
</ul>

<p>The paper calls this the hot-cold lexicographic rule, or HCL. In this repo, that is the heart of the algorithm.</p>

<p>If you have read about the <strong>power of two choices</strong> algorithm before, Prequal lives in the same family. Power of two picks two random backends and sends the request to whichever has fewer in-flight connections. It is cheap, surprisingly close to optimal, and widely deployed. HAProxy’s own benchmark (<a href="https://www.haproxy.com/blog/power-of-two-load-balancing">Power of Two Load Balancing</a>) shows it beating round-robin on peak connection skew but still losing a few percent to a full least-connections scan. Prequal generalizes the idea: instead of two random picks checked synchronously, it keeps a small pool of asynchronous probe results and selects from that pool using both RIF and latency, not just connection count.</p>
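
<p>For comparison, here is a minimal sketch of plain power of two choices. The <code class="language-plaintext highlighter-rouge">backend</code> type and the <code class="language-plaintext highlighter-rouge">pickTwoChoices</code> function are hypothetical names for illustration and are not taken from the repo or from HAProxy. The random source is passed in as a function so the behavior is easy to check.</p>

```go
package main

import (
	"fmt"
	"math/rand"
)

// backend is a hypothetical type for this sketch, not the repo's Endpoint.
type backend struct {
	name     string
	inFlight int
}

// pickTwoChoices samples two distinct backends using intn (which must return
// a value in [0, n)) and returns the index of the one with fewer in-flight
// requests. Ties go to the first sample. It returns -1 for an empty list.
func pickTwoChoices(backends []backend, intn func(n int) int) int {
	n := len(backends)
	if n == 0 {
		return -1
	}
	if n == 1 {
		return 0
	}
	i := intn(n)
	j := intn(n - 1)
	if j >= i {
		j++ // shift so that j can never equal i
	}
	if backends[j].inFlight < backends[i].inFlight {
		return j
	}
	return i
}

func main() {
	backends := []backend{{"a", 9}, {"b", 2}, {"c", 5}, {"d", 7}}
	idx := pickTwoChoices(backends, rand.Intn)
	fmt.Println("selected:", backends[idx].name)
}
```

<p>Because the two samples are always distinct, the most loaded backend can never win a comparison. Prequal keeps that property but compares richer, asynchronously collected state.</p>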

<p>The other big paper idea is async probing. Synchronous probing would put an extra network hop in the critical path of every request. Prequal instead probes off the request path, stores recent probe observations in a bounded pool, and reuses them enough to be cheap without letting them go stale.</p>

<h2 id="what-ive-actually-built">What I’ve actually built</h2>

<p>At runtime this project is one Go binary with two jobs:</p>

<ul>
  <li>a Kubernetes controller that watches <code class="language-plaintext highlighter-rouge">Ingress</code> and <code class="language-plaintext highlighter-rouge">EndpointSlice</code></li>
  <li>an HTTP reverse proxy that receives requests and selects backends</li>
</ul>

<p>Around that, the repo includes:</p>

<ul>
  <li>a Rust backend used for controlled benchmarks</li>
  <li>benchmark manifests for uniform and heterogeneous workloads</li>
  <li><code class="language-plaintext highlighter-rouge">k6</code> scripts for open-loop, ramp, burst, overload, multi-route, and long-duration tests</li>
  <li>Prometheus and Grafana assets</li>
  <li>frozen benchmark reports and investigation logs</li>
</ul>

<p>The top-level structure is clean and maps well to the architecture:</p>

<div class="language-text highlighter-rouge"><div class="highlight"><pre class="highlight"><code>controller/      Kubernetes reconciliation and route state
server/          Reverse proxy and request-path selection
loadbalancer/    Prequal, least-connections, round-robin, probe logic, RIF, latency, pools
backend/         Rust benchmark backend exposing /work and /prequal/probe
observability/   Prometheus metrics
tree/            Host/path trie for ingress routing
benchmark/       Manifests, k6 scripts, dashboards, reports, investigations, raw results
</code></pre></div></div>

<p>The entrypoint in <a href="https://github.com/sathwick-p/prequal/blob/main/main.go"><code class="language-plaintext highlighter-rouge">main.go</code></a> wires all of that together:</p>

<div class="language-go highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">store</span> <span class="o">:=</span> <span class="n">controller</span><span class="o">.</span><span class="n">NewBackendIPStore</span><span class="p">()</span>
<span class="n">ctrl</span> <span class="o">:=</span> <span class="n">controller</span><span class="o">.</span><span class="n">NewController</span><span class="p">(</span><span class="n">factory</span><span class="p">,</span> <span class="n">store</span><span class="p">,</span> <span class="n">queue</span><span class="p">)</span>

<span class="n">tracker</span> <span class="o">:=</span> <span class="o">&amp;</span><span class="n">loadbalancer</span><span class="o">.</span><span class="n">RIFTracker</span><span class="p">{}</span>
<span class="n">latencyTracker</span> <span class="o">:=</span> <span class="n">loadbalancer</span><span class="o">.</span><span class="n">NewLatencyTracker</span><span class="p">()</span>

<span class="n">cfg</span> <span class="o">:=</span> <span class="n">loadbalancer</span><span class="o">.</span><span class="n">DefaultProbeConfig</span><span class="p">()</span>
<span class="n">cfg</span><span class="o">.</span><span class="n">ApplyEnv</span><span class="p">()</span>

<span class="n">pools</span> <span class="o">:=</span> <span class="n">pool</span><span class="o">.</span><span class="n">NewRoutePools</span><span class="p">(</span><span class="n">pool</span><span class="o">.</span><span class="n">PoolConfig</span><span class="p">{</span>
    <span class="n">MaxSize</span><span class="o">:</span>     <span class="n">cfg</span><span class="o">.</span><span class="n">PoolMaxSize</span><span class="p">,</span>
    <span class="n">MaxAge</span><span class="o">:</span>      <span class="n">cfg</span><span class="o">.</span><span class="n">PoolMaxAge</span><span class="p">,</span>
    <span class="n">ReuseLimit</span><span class="o">:</span>  <span class="n">cfg</span><span class="o">.</span><span class="n">PoolReuseLimit</span><span class="p">,</span>
    <span class="n">QRIF</span><span class="o">:</span>        <span class="n">cfg</span><span class="o">.</span><span class="n">QRIF</span><span class="p">,</span>
    <span class="n">MaxProbeAge</span><span class="o">:</span> <span class="n">cfg</span><span class="o">.</span><span class="n">MaxProbeAge</span><span class="p">,</span>
<span class="p">},</span> <span class="n">cfg</span><span class="o">.</span><span class="n">PoolMaintenanceInterval</span><span class="p">)</span>

<span class="n">prober</span> <span class="o">:=</span> <span class="n">loadbalancer</span><span class="o">.</span><span class="n">NewProber</span><span class="p">(</span><span class="n">pools</span><span class="p">,</span> <span class="n">store</span><span class="p">,</span> <span class="n">cfg</span><span class="p">,</span> <span class="n">stop</span><span class="p">)</span>
<span class="n">proxyServer</span> <span class="o">:=</span> <span class="n">server</span><span class="o">.</span><span class="n">NewProxyServerWithConfig</span><span class="p">(</span>
    <span class="n">ctrl</span><span class="o">.</span><span class="n">GetRouter</span><span class="p">(),</span> <span class="n">store</span><span class="p">,</span> <span class="n">selectors</span><span class="p">,</span> <span class="n">tracker</span><span class="p">,</span> <span class="n">latencyTracker</span><span class="p">,</span> <span class="n">pools</span><span class="p">,</span> <span class="n">prober</span><span class="p">,</span> <span class="n">serverCfg</span><span class="p">,</span>
<span class="p">)</span>
</code></pre></div></div>

<p>That composition is the design of the load balancer:</p>

<ul>
  <li>the controller owns route and endpoint discovery</li>
  <li>the proxy owns request forwarding</li>
  <li>the load balancer owns route-local state and selection policy</li>
</ul>

<h2 id="control-plane-translating-kubernetes-into-route-state">Control plane: translating Kubernetes into route state</h2>

<p>The control plane lives in <code class="language-plaintext highlighter-rouge">controller/</code>. It uses shared informers to watch <code class="language-plaintext highlighter-rouge">Ingress</code> and <code class="language-plaintext highlighter-rouge">EndpointSlice</code>, then builds two in-memory structures:</p>

<ul>
  <li>a host/path router</li>
  <li>a route-key to endpoint list store</li>
</ul>

<p>The important point is that the controller does not directly configure nginx or write files. It builds local state for the in-process proxy.</p>

<h3 id="watching-ingress-and-endpoints">Watching ingress and endpoints</h3>

<p><code class="language-plaintext highlighter-rouge">controller.NewController</code> registers event handlers for both resource types:</p>

<ul>
  <li>ingress add, update, delete</li>
  <li>endpointslice add, update, delete</li>
</ul>

<p>On an ingress event, the controller queues the ingress key for reconciliation.
On an endpoint event, it finds which ingresses depend on the service and requeues those.</p>

<p>That dependency mapping is stored in <code class="language-plaintext highlighter-rouge">serviceToIngress</code>, which is what lets endpoint churn trigger only the routes that care about it.</p>

<p>The ingress reconciliation path in <a href="https://github.com/sathwick-p/prequal/blob/main/controller/controller.go"><code class="language-plaintext highlighter-rouge">controller/controller.go</code></a> does four things:</p>

<ol>
  <li>filters to this controller’s class</li>
  <li>parses rules into route specs</li>
  <li>updates the router</li>
  <li>refreshes the endpoint store for each referenced service/port</li>
</ol>

<p>The filtering logic accepts:</p>

<ul>
  <li><code class="language-plaintext highlighter-rouge">spec.ingressClassName == "prequal"</code></li>
  <li>legacy <code class="language-plaintext highlighter-rouge">kubernetes.io/ingress.class: prequal</code></li>
  <li>a fallback label <code class="language-plaintext highlighter-rouge">ingress.class=prequal</code></li>
</ul>

<h3 id="route-matching-with-a-trie">Route matching with a trie</h3>

<p>The router in <a href="https://github.com/sathwick-p/prequal/blob/main/controller/router.go"><code class="language-plaintext highlighter-rouge">controller/router.go</code></a> delegates path matching to <code class="language-plaintext highlighter-rouge">tree/</code>, which implements a segment trie. Each host gets a <code class="language-plaintext highlighter-rouge">HostConfig</code> with a list of paths plus a trie built from those paths.</p>

<p><code class="language-plaintext highlighter-rouge">tree.Match</code> supports:</p>

<ul>
  <li>exact paths</li>
  <li>prefix paths</li>
  <li>longest-prefix semantics</li>
  <li>default-host fallback when a specific host is missing</li>
</ul>

<p>That means route resolution is:</p>

<ol>
  <li>match host</li>
  <li>walk the trie by URL segments</li>
  <li>prefer exact matches, otherwise keep the best prefix match</li>
</ol>
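
<p>The resolution steps above can be sketched as a small segment trie. This is an illustration under simplifying assumptions, not the code in <code class="language-plaintext highlighter-rouge">tree/</code>: the <code class="language-plaintext highlighter-rouge">node</code> type and its <code class="language-plaintext highlighter-rouge">insert</code> and <code class="language-plaintext highlighter-rouge">match</code> methods are hypothetical, host matching is left out, and each node holds at most one route.</p>

```go
package main

import (
	"fmt"
	"strings"
)

// node is one path segment in a hypothetical routing trie.
type node struct {
	children map[string]*node
	routeKey string // non-empty when a route terminates at this node
	exact    bool   // true if the route matches only the full path
}

func newNode() *node { return &node{children: map[string]*node{}} }

// segments splits "/api/v1" into ["api", "v1"], ignoring empty segments.
func segments(path string) []string {
	return strings.FieldsFunc(path, func(r rune) bool { return r == '/' })
}

// insert registers a route. In this sketch a later insert on the same path
// overwrites the earlier one.
func (n *node) insert(path, routeKey string, exact bool) {
	cur := n
	for _, seg := range segments(path) {
		next, ok := cur.children[seg]
		if !ok {
			next = newNode()
			cur.children[seg] = next
		}
		cur = next
	}
	cur.routeKey, cur.exact = routeKey, exact
}

// match walks the trie segment by segment. An exact route applies only when
// the whole path is consumed; otherwise the deepest prefix route wins.
func (n *node) match(path string) (string, bool) {
	segs := segments(path)
	cur := n
	best, found := "", false
	if cur.routeKey != "" && (!cur.exact || len(segs) == 0) {
		best, found = cur.routeKey, true
	}
	for i, seg := range segs {
		next, ok := cur.children[seg]
		if !ok {
			break
		}
		cur = next
		last := i == len(segs)-1
		if cur.routeKey != "" && (!cur.exact || last) {
			best, found = cur.routeKey, true
		}
	}
	return best, found
}

func main() {
	root := newNode()
	root.insert("/", "default/web:80", false)
	root.insert("/api", "default/api:8080", false)
	root.insert("/api/v1/users", "default/users:8080", true)

	for _, p := range []string{"/api/v2", "/api/v1/users", "/api/v1/users/42", "/other"} {
		key, _ := root.match(p)
		fmt.Println(p, "->", key)
	}
}
```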

<p>Kubernetes objects are converted once into <code class="language-plaintext highlighter-rouge">RouteSpec</code>, and the request path never has to understand Kubernetes types.</p>

<h3 id="endpoint-storage-is-route-local">Endpoint storage is route-local</h3>

<p>The controller stores endpoints in <code class="language-plaintext highlighter-rouge">BackendIPStore</code>, keyed by a route key derived from namespace, service, and port:</p>

<ul>
  <li><code class="language-plaintext highlighter-rouge">namespace/service</code></li>
  <li><code class="language-plaintext highlighter-rouge">namespace/service:port</code></li>
  <li><code class="language-plaintext highlighter-rouge">namespace/service:portName</code></li>
</ul>

<p>That matters because the load-balancing state is also route-local. If two ingress routes point at different services, their probe history does not mix. If two routes point at the same service but different ports, their state stays separate too.</p>

<h2 id="dataplane-the-reverse-proxy-request-path">Dataplane: the reverse proxy request path</h2>

<p>The dataplane lives in <a href="https://github.com/sathwick-p/prequal/blob/main/server/server.go"><code class="language-plaintext highlighter-rouge">server/server.go</code></a>. This is where a request enters the proxy, gets matched to a route, resolves candidate backends, triggers async probes, selects one backend, and is forwarded with <code class="language-plaintext highlighter-rouge">httputil.ReverseProxy</code>. For the full upstream context of how a request reaches this point (DNS, service mesh, kube-proxy, endpoints), see my earlier walkthrough on <a href="/blog/requestflow.html">the request flow from a user to a Kubernetes pod</a>.</p>

<p>The flow is:</p>

<ol>
  <li>normalize host</li>
  <li>match route from the trie</li>
  <li>fetch candidate backends from the endpoint store</li>
  <li>trigger async probes for that route</li>
  <li>pick a backend using the requested algorithm</li>
  <li>increment RIF counters</li>
  <li>proxy the request</li>
  <li>record observed latency locally</li>
</ol>

<p>That is all in one request handler, which makes the architecture easy to follow.</p>

<p><img src="/assets/2026-04-20-prequal/sequence.png" alt="Request-path sequence diagram: Client sends a request to the Prequal proxy, the router matches host and path into a routeKey, the proxy triggers an async probe (dashed) and simultaneously selects from the route-local pool using the hot-cold lexicographic rule, forwards to the chosen backend, increments RIF, and records latency and metrics" /></p>

<p>The selection branch is especially important:</p>

<div class="language-go highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">func</span> <span class="p">(</span><span class="n">p</span> <span class="o">*</span><span class="n">ProxyServer</span><span class="p">)</span> <span class="n">selectBackend</span><span class="p">(</span><span class="n">routeKey</span><span class="p">,</span> <span class="n">algo</span> <span class="kt">string</span><span class="p">,</span> <span class="n">backends</span> <span class="p">[]</span><span class="o">*</span><span class="n">controller</span><span class="o">.</span><span class="n">Endpoint</span><span class="p">)</span> <span class="p">(</span><span class="o">*</span><span class="n">controller</span><span class="o">.</span><span class="n">Endpoint</span><span class="p">,</span> <span class="kt">error</span><span class="p">)</span> <span class="p">{</span>
    <span class="k">switch</span> <span class="n">algo</span> <span class="p">{</span>
    <span class="k">case</span> <span class="s">"prequal"</span><span class="p">,</span> <span class="s">""</span><span class="o">:</span>
        <span class="n">entry</span><span class="p">,</span> <span class="n">err</span> <span class="o">:=</span> <span class="n">p</span><span class="o">.</span><span class="n">pools</span><span class="o">.</span><span class="n">Select</span><span class="p">(</span><span class="n">routeKey</span><span class="p">,</span> <span class="n">backends</span><span class="p">)</span>
        <span class="k">if</span> <span class="n">err</span> <span class="o">!=</span> <span class="no">nil</span> <span class="p">{</span>
            <span class="k">return</span> <span class="no">nil</span><span class="p">,</span> <span class="n">err</span>
        <span class="p">}</span>
        <span class="n">p</span><span class="o">.</span><span class="n">pools</span><span class="o">.</span><span class="n">IncrementRIF</span><span class="p">(</span><span class="n">routeKey</span><span class="p">,</span> <span class="n">entry</span><span class="o">.</span><span class="n">Backend</span><span class="p">)</span>
        <span class="k">return</span> <span class="n">entry</span><span class="o">.</span><span class="n">Endpoint</span><span class="p">,</span> <span class="no">nil</span>
    <span class="k">default</span><span class="o">:</span>
        <span class="n">sel</span><span class="p">,</span> <span class="n">exists</span> <span class="o">:=</span> <span class="n">p</span><span class="o">.</span><span class="n">selectors</span><span class="p">[</span><span class="n">algo</span><span class="p">]</span>
        <span class="k">if</span> <span class="o">!</span><span class="n">exists</span> <span class="p">{</span>
            <span class="k">return</span> <span class="n">p</span><span class="o">.</span><span class="n">selectBackend</span><span class="p">(</span><span class="n">routeKey</span><span class="p">,</span> <span class="s">"prequal"</span><span class="p">,</span> <span class="n">backends</span><span class="p">)</span>
        <span class="p">}</span>
        <span class="k">return</span> <span class="n">sel</span><span class="o">.</span><span class="n">Select</span><span class="p">(</span><span class="n">backends</span><span class="p">)</span>
    <span class="p">}</span>
<span class="p">}</span>
</code></pre></div></div>

<p>A few design choices I’ve made here:</p>

<p>First, Prequal is the default. If the ingress annotation <code class="language-plaintext highlighter-rouge">lb/algo</code> is empty, the proxy uses Prequal.</p>

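<p>Second, round-robin is included as a baseline and goes through the same <code class="language-plaintext highlighter-rouge">selectBackend</code> path. An unrecognized algorithm name falls back to Prequal rather than failing the request.</p>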

<p>This makes the benchmark harness clean. The same controller, proxy, transport, and backend stack can be benchmarked with different selection rules by patching one ingress annotation.</p>

<p>Third, the proxy increments both a global RIF tracker and the selected pool entry’s RIF view. That keeps the request path’s immediate state and the pool’s sampled state reasonably aligned.</p>

<h2 id="how-is-prequal-implemented-in-go">How is Prequal implemented in Go?</h2>

<p>The implementation is split across:</p>

<ul>
  <li><code class="language-plaintext highlighter-rouge">loadbalancer/prober.go</code></li>
  <li><code class="language-plaintext highlighter-rouge">loadbalancer/pool/pool.go</code></li>
  <li><code class="language-plaintext highlighter-rouge">loadbalancer/pool/pools.go</code></li>
  <li><code class="language-plaintext highlighter-rouge">loadbalancer/rif.go</code></li>
  <li><code class="language-plaintext highlighter-rouge">loadbalancer/latency.go</code></li>
  <li><code class="language-plaintext highlighter-rouge">loadbalancer/config.go</code></li>
</ul>

<p>This is the core of the repo.</p>

<h3 id="route-local-probe-pools">Route-local probe pools</h3>

<p>The paper’s async probing design only works if sampled state is bounded and per-route. That is what <code class="language-plaintext highlighter-rouge">RoutePools</code> does: it owns one <code class="language-plaintext highlighter-rouge">ProbePool</code> per route key.</p>

<p>Each <code class="language-plaintext highlighter-rouge">ProbeEntry</code> holds:</p>

<ul>
  <li>backend address</li>
  <li>endpoint pointer</li>
  <li>RIF</li>
  <li>latency</li>
  <li>probe timestamp</li>
  <li><code class="language-plaintext highlighter-rouge">UsesLeft</code></li>
</ul>

<p><code class="language-plaintext highlighter-rouge">UsesLeft</code> is the local implementation of probe reuse. A probe can be selected a limited number of times before it is evicted.</p>

<p>The pool is bounded by both size and age:</p>

<ul>
  <li><code class="language-plaintext highlighter-rouge">MaxSize</code></li>
  <li><code class="language-plaintext highlighter-rouge">MaxAge</code></li>
  <li><code class="language-plaintext highlighter-rouge">MaxProbeAge</code></li>
</ul>

<p>Combined with the reuse limit, that gives the load balancer three protection mechanisms against stale decisions:</p>

<ul>
  <li>cap how many probe samples are retained</li>
  <li>remove entries that are too old in wall-clock terms</li>
  <li>remove entries once they have been reused enough</li>
</ul>
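
<p>Those three mechanisms can be sketched as a small bounded pool. This is an illustration, not the repo’s <code class="language-plaintext highlighter-rouge">ProbePool</code>: the lowercase types and the evict-oldest policy when the pool is full are assumptions made for the sketch.</p>

```go
package main

import (
	"fmt"
	"time"
)

// probeEntry mirrors the fields listed above; the type itself is hypothetical.
type probeEntry struct {
	backend  string
	rif      int
	latency  time.Duration
	probedAt time.Time
	usesLeft int
}

type probePool struct {
	maxSize int
	maxAge  time.Duration
	entries []probeEntry
}

// evictExpired drops entries that are older than maxAge or have no uses left.
func (p *probePool) evictExpired(now time.Time) {
	kept := p.entries[:0]
	for _, e := range p.entries {
		if now.Sub(e.probedAt) <= p.maxAge && e.usesLeft > 0 {
			kept = append(kept, e)
		}
	}
	p.entries = kept
}

// add inserts a fresh probe result. Expired entries go first; if the pool is
// still full, the oldest entry is evicted (an assumed policy).
func (p *probePool) add(e probeEntry, now time.Time) {
	p.evictExpired(now)
	if len(p.entries) > 0 && len(p.entries) >= p.maxSize {
		oldest := 0
		for i := range p.entries {
			if p.entries[i].probedAt.Before(p.entries[oldest].probedAt) {
				oldest = i
			}
		}
		p.entries = append(p.entries[:oldest], p.entries[oldest+1:]...)
	}
	p.entries = append(p.entries, e)
}

// use consumes one use of entry i and removes it once its budget is spent.
func (p *probePool) use(i int) probeEntry {
	p.entries[i].usesLeft--
	e := p.entries[i]
	if e.usesLeft <= 0 {
		p.entries = append(p.entries[:i], p.entries[i+1:]...)
	}
	return e
}

func main() {
	now := time.Now()
	pool := &probePool{maxSize: 2, maxAge: time.Second}
	pool.add(probeEntry{backend: "a", rif: 3, probedAt: now, usesLeft: 1}, now)
	pool.add(probeEntry{backend: "b", rif: 1, probedAt: now, usesLeft: 3}, now)
	picked := pool.use(0)
	fmt.Println("picked:", picked.backend, "remaining:", len(pool.entries))
}
```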

<p><img src="/assets/2026-04-20-prequal/probe.png" alt="Probe-pool lifecycle: a ProbeEntry with backend, RIF, latency, timestamp, and usesLeft enters a bounded ProbePool keyed by routeKey, is partitioned into cold and hot sets by the RIF quantile threshold, and is evicted when it ages out, when usesLeft hits zero, or when the pool is full" /></p>

<h3 id="hcl-in-code">HCL in code</h3>

<p>The HCL selection rule is implemented directly in <a href="https://github.com/sathwick-p/prequal/blob/main/loadbalancer/pool/pool.go"><code class="language-plaintext highlighter-rouge">loadbalancer/pool/pool.go</code></a>. This is the most important code in the project.</p>

<p>The key selection logic looks like this:</p>

<div class="language-go highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">threshold</span> <span class="o">:=</span> <span class="n">rifs</span><span class="p">[</span><span class="n">idx</span><span class="p">]</span>

<span class="k">var</span> <span class="n">bestCold</span> <span class="o">*</span><span class="n">ProbeEntry</span>
<span class="k">var</span> <span class="n">bestHot</span> <span class="o">*</span><span class="n">ProbeEntry</span>
<span class="n">allHot</span> <span class="o">:=</span> <span class="no">true</span>

<span class="k">for</span> <span class="n">i</span><span class="p">,</span> <span class="n">e</span> <span class="o">:=</span> <span class="k">range</span> <span class="n">pool</span><span class="o">.</span><span class="n">entries</span> <span class="p">{</span>
    <span class="k">if</span> <span class="n">e</span><span class="o">.</span><span class="n">RIF</span> <span class="o">&lt;=</span> <span class="n">threshold</span> <span class="p">{</span>
        <span class="n">allHot</span> <span class="o">=</span> <span class="no">false</span>
        <span class="k">if</span> <span class="n">bestCold</span> <span class="o">==</span> <span class="no">nil</span> <span class="o">||</span> <span class="n">e</span><span class="o">.</span><span class="n">Latency</span> <span class="o">&lt;</span> <span class="n">bestCold</span><span class="o">.</span><span class="n">Latency</span> <span class="p">{</span>
            <span class="n">bestCold</span> <span class="o">=</span> <span class="n">e</span>
            <span class="n">bestColdIndex</span> <span class="o">=</span> <span class="n">i</span>
        <span class="p">}</span>
        <span class="k">continue</span>
    <span class="p">}</span>
    <span class="k">if</span> <span class="n">bestHot</span> <span class="o">==</span> <span class="no">nil</span> <span class="o">||</span> <span class="n">e</span><span class="o">.</span><span class="n">RIF</span> <span class="o">&lt;</span> <span class="n">bestHot</span><span class="o">.</span><span class="n">RIF</span> <span class="p">{</span>
        <span class="n">bestHot</span> <span class="o">=</span> <span class="n">e</span>
        <span class="n">bestHotIndex</span> <span class="o">=</span> <span class="n">i</span>
    <span class="p">}</span>
<span class="p">}</span>

<span class="n">selected</span> <span class="o">:=</span> <span class="n">bestCold</span>
<span class="k">if</span> <span class="n">allHot</span> <span class="p">{</span>
    <span class="n">selected</span> <span class="o">=</span> <span class="n">bestHot</span>
<span class="p">}</span>
</code></pre></div></div>

<p>That is a direct implementation of the paper’s idea:</p>

<ul>
  <li>compute the RIF quantile threshold</li>
  <li>treat <code class="language-plaintext highlighter-rouge">RIF &lt;= threshold</code> as cold</li>
  <li>among cold entries, minimize latency</li>
  <li>if nothing is cold, minimize RIF</li>
</ul>
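
<p>The excerpt starts from an already computed <code class="language-plaintext highlighter-rouge">threshold := rifs[idx]</code>. One plausible way to derive it is to sort a copy of the pool’s RIF values and index at the quantile. The helper name <code class="language-plaintext highlighter-rouge">rifThreshold</code> and the index rule <code class="language-plaintext highlighter-rouge">int(q*(n-1))</code> are assumptions for this sketch and may differ from what the repo does.</p>

```go
package main

import (
	"fmt"
	"sort"
)

// rifThreshold returns the RIF value at quantile q (clamped to [0, 1]) of the
// given samples. The index rule int(q*(n-1)) is an assumption for this
// sketch. The input slice is not modified.
func rifThreshold(rifs []int, q float64) int {
	if len(rifs) == 0 {
		return 0
	}
	sorted := append([]int(nil), rifs...)
	sort.Ints(sorted)
	if q < 0 {
		q = 0
	}
	if q > 1 {
		q = 1
	}
	idx := int(q * float64(len(sorted)-1))
	return sorted[idx]
}

func main() {
	rifs := []int{9, 1, 5, 3}
	threshold := rifThreshold(rifs, 0.75)
	for _, r := range rifs {
		state := "hot"
		if r <= threshold {
			state = "cold"
		}
		fmt.Printf("rif=%d threshold=%d %s\n", r, threshold, state)
	}
}
```

<p>With four samples and a quantile of <code class="language-plaintext highlighter-rouge">0.75</code>, the index is <code class="language-plaintext highlighter-rouge">2</code>, so the three lowest RIF values are cold and only the highest is hot.</p>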

<p>It is not trying to be clever, which I think is the right call here. Algorithm code gets dangerous when it becomes hard to explain. This one stays direct.</p>

<p>The fallback behavior is also worth noting. If the pool has fewer than two entries, the code falls back to a random backend from the full backend list. That is how the system behaves before warmup or after starvation:</p>

<div class="language-go highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">if</span> <span class="nb">len</span><span class="p">(</span><span class="n">pool</span><span class="o">.</span><span class="n">entries</span><span class="p">)</span> <span class="o">&lt;</span> <span class="m">2</span> <span class="p">{</span>
    <span class="n">ep</span> <span class="o">:=</span> <span class="n">allBackends</span><span class="p">[</span><span class="n">rand</span><span class="o">.</span><span class="n">Intn</span><span class="p">(</span><span class="nb">len</span><span class="p">(</span><span class="n">allBackends</span><span class="p">))]</span>
    <span class="k">return</span> <span class="o">&amp;</span><span class="n">ProbeEntry</span><span class="p">{</span><span class="n">Backend</span><span class="o">:</span> <span class="n">ep</span><span class="o">.</span><span class="n">String</span><span class="p">(),</span> <span class="n">Endpoint</span><span class="o">:</span> <span class="n">ep</span><span class="p">},</span> <span class="nb">len</span><span class="p">(</span><span class="n">pool</span><span class="o">.</span><span class="n">entries</span><span class="p">),</span> <span class="no">nil</span>
<span class="p">}</span>
</code></pre></div></div>

<p>The benchmark campaign explicitly tracks how often that fallback happens. In the decisive Prequal runs, it is zero, which matters because otherwise a “Prequal win” might secretly be a random-selection run.</p>

<h3 id="async-probing">Async probing</h3>

<p><code class="language-plaintext highlighter-rouge">loadbalancer.Prober</code> is the other half of the design. It is responsible for:</p>

<ul>
  <li>sending HTTP probes to <code class="language-plaintext highlighter-rouge">/prequal/probe</code></li>
  <li>decoding backend-reported RIF and latency</li>
  <li>rejecting stale probe responses</li>
  <li>feeding entries into the route-local pool</li>
  <li>keeping pools warm in the background</li>
</ul>

<p>The request path never blocks on probe completion. Instead, <code class="language-plaintext highlighter-rouge">TriggerProbes(routeKey)</code> enqueues work:</p>

<div class="language-go highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">func</span> <span class="p">(</span><span class="n">pr</span> <span class="o">*</span><span class="n">Prober</span><span class="p">)</span> <span class="n">TriggerProbes</span><span class="p">(</span><span class="n">routeKey</span> <span class="kt">string</span><span class="p">)</span> <span class="p">{</span>
    <span class="k">if</span> <span class="n">routeKey</span> <span class="o">==</span> <span class="s">""</span> <span class="p">{</span>
        <span class="k">return</span>
    <span class="p">}</span>
    <span class="n">n</span> <span class="o">:=</span> <span class="n">pr</span><span class="o">.</span><span class="n">probesForQuery</span><span class="p">()</span>
    <span class="k">for</span> <span class="n">i</span> <span class="o">:=</span> <span class="m">0</span><span class="p">;</span> <span class="n">i</span> <span class="o">&lt;</span> <span class="n">n</span><span class="p">;</span> <span class="n">i</span><span class="o">++</span> <span class="p">{</span>
        <span class="n">pr</span><span class="o">.</span><span class="n">enqueueProbe</span><span class="p">(</span><span class="n">routeKey</span><span class="p">)</span>
    <span class="p">}</span>
<span class="p">}</span>
</code></pre></div></div>

<p>Workers consume those route keys, sample a backend for the route, call the probe endpoint, and add a <code class="language-plaintext highlighter-rouge">ProbeEntry</code> to the right pool.</p>
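<p>One way to keep the request path from ever blocking on the probe queue is a bounded channel with a non-blocking send. The sketch below is an assumption about the general shape of such a queue, not the repo's code: the <code class="language-plaintext highlighter-rouge">probeQueue</code> type, its capacity, and the drop counter are hypothetical names chosen for illustration.</p>

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// probeQueue is a hypothetical bounded queue of route keys awaiting a probe.
type probeQueue struct {
	ch      chan string
	dropped atomic.Int64 // probes discarded because the queue was full
}

func newProbeQueue(capacity int) *probeQueue {
	return &probeQueue{ch: make(chan string, capacity)}
}

// enqueue never blocks: if the queue is full, the probe is dropped and counted.
func (q *probeQueue) enqueue(routeKey string) bool {
	select {
	case q.ch <- routeKey:
		return true
	default:
		q.dropped.Add(1)
		return false
	}
}

func main() {
	q := newProbeQueue(2)
	for i := 0; i < 3; i++ {
		q.enqueue("route-a")
	}
	fmt.Println("queued:", len(q.ch), "dropped:", q.dropped.Load())
}
```

<p>Dropping on a full queue trades probe freshness for a request path that never waits, and a drop counter like this one is the kind of signal a "probes dropped" metric could be built on.</p>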

<p>The configuration lives in <a href="https://github.com/sathwick-p/prequal/blob/main/loadbalancer/config.go"><code class="language-plaintext highlighter-rouge">loadbalancer/config.go</code></a>, and the defaults are important because they define the repo’s behavior:</p>

<ul>
  <li>pool size: <code class="language-plaintext highlighter-rouge">16</code></li>
  <li>pool max age: <code class="language-plaintext highlighter-rouge">1s</code></li>
  <li>reuse limit: <code class="language-plaintext highlighter-rouge">3</code></li>
  <li><code class="language-plaintext highlighter-rouge">QRIF</code>: <code class="language-plaintext highlighter-rouge">0.75</code></li>
  <li>probes per query: <code class="language-plaintext highlighter-rouge">1.0</code></li>
  <li>probe workers: <code class="language-plaintext highlighter-rouge">8</code></li>
  <li>background interval: <code class="language-plaintext highlighter-rouge">100ms</code></li>
  <li>probe timeout: <code class="language-plaintext highlighter-rouge">100ms</code></li>
  <li>max probe age: <code class="language-plaintext highlighter-rouge">2s</code></li>
</ul>

<p>Those values are not arbitrary, but they are also not identical to the paper’s defaults. More on this later.</p>

<h3 id="rif-and-latency-tracking">RIF and latency tracking</h3>

<p>Besides backend-reported probe data, the proxy maintains its own local trackers:</p>

<ul>
  <li><code class="language-plaintext highlighter-rouge">RIFTracker</code> uses <code class="language-plaintext highlighter-rouge">sync.Map</code> and <code class="language-plaintext highlighter-rouge">atomic.Int64</code></li>
  <li><code class="language-plaintext highlighter-rouge">LatencyTracker</code> keeps a per-backend circular buffer and reports a local median</li>
</ul>

<p>These are used for two things:</p>

<ul>
  <li>supporting least-connections</li>
  <li>optionally seeding the probe pool when the async prober is disabled</li>
</ul>

<p>The pool can be bootstrap-seeded from local observations if the async prober is absent, but once the prober exists, backend probes become the authoritative source.</p>
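<p>A minimal sketch of the two trackers, assuming the structure described above (a <code class="language-plaintext highlighter-rouge">sync.Map</code> of atomic counters and a fixed-size ring of latency samples). The method names and the <code class="language-plaintext highlighter-rouge">latencyRing</code> type are hypothetical, and the real trackers may differ in detail.</p>

```go
package main

import (
	"fmt"
	"sort"
	"sync"
	"sync/atomic"
)

// RIFTracker counts requests in flight per backend.
type RIFTracker struct {
	counts sync.Map // backend name -> *atomic.Int64
}

func (t *RIFTracker) counter(backend string) *atomic.Int64 {
	c, _ := t.counts.LoadOrStore(backend, new(atomic.Int64))
	return c.(*atomic.Int64)
}

func (t *RIFTracker) Inc(backend string) int64 { return t.counter(backend).Add(1) }
func (t *RIFTracker) Dec(backend string) int64 { return t.counter(backend).Add(-1) }
func (t *RIFTracker) Get(backend string) int64 { return t.counter(backend).Load() }

// latencyRing is a fixed-size circular buffer of recent latencies in ms.
// The size passed to newLatencyRing must be positive.
type latencyRing struct {
	mu      sync.Mutex
	samples []float64
	next    int
	filled  bool
}

func newLatencyRing(size int) *latencyRing {
	return &latencyRing{samples: make([]float64, size)}
}

// Record overwrites the oldest sample once the buffer is full.
func (r *latencyRing) Record(ms float64) {
	r.mu.Lock()
	defer r.mu.Unlock()
	r.samples[r.next] = ms
	r.next = (r.next + 1) % len(r.samples)
	if r.next == 0 {
		r.filled = true
	}
}

// Median returns the median of the stored samples, or false if there are none.
func (r *latencyRing) Median() (float64, bool) {
	r.mu.Lock()
	defer r.mu.Unlock()
	n := r.next
	if r.filled {
		n = len(r.samples)
	}
	if n == 0 {
		return 0, false
	}
	sorted := append([]float64(nil), r.samples[:n]...)
	sort.Float64s(sorted)
	if n%2 == 1 {
		return sorted[n/2], true
	}
	return (sorted[n/2-1] + sorted[n/2]) / 2, true
}

func main() {
	var rif RIFTracker
	rif.Inc("backend-a")
	ring := newLatencyRing(3)
	ring.Record(10)
	ring.Record(30)
	median, ok := ring.Median()
	fmt.Println(rif.Get("backend-a"), median, ok)
}
```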

<h2 id="the-benchmark-backend">The benchmark backend</h2>

<p>The Rust backend in <a href="https://github.com/sathwick-p/prequal/blob/main/backend/src/main.rs"><code class="language-plaintext highlighter-rouge">backend/src/main.rs</code></a> is part of the implementation model.</p>

<p>It exposes three endpoints:</p>

<ul>
  <li><code class="language-plaintext highlighter-rouge">POST /work</code></li>
  <li><code class="language-plaintext highlighter-rouge">GET /health</code></li>
  <li><code class="language-plaintext highlighter-rouge">GET /prequal/probe</code></li>
</ul>

<p><code class="language-plaintext highlighter-rouge">/work</code> simulates the backend’s actual service time.
<code class="language-plaintext highlighter-rouge">/prequal/probe</code> exposes the two signals Prequal needs.</p>

<p>The backend’s internal state is visible in exactly the way the algorithm expects, which is what makes controlled experiments possible at all.</p>

<h3 id="work-mode-cpu-bound-or-io-bound">Work mode: CPU-bound or I/O-bound</h3>

<p>The backend has two modes:</p>

<ul>
  <li>CPU-bound SHA256 loop</li>
  <li>I/O-bound sleep mode</li>
</ul>

<p>That switch is controlled by <code class="language-plaintext highlighter-rouge">IO_BOUND_MODE</code>.</p>

<p>In CPU-bound mode, each request burns CPU with repeated hashing.
In I/O-bound mode, each request sleeps for <code class="language-plaintext highlighter-rouge">iterations * IO_BOUND_BASE_US</code>.</p>

<p>That one switch ends up being central to the benchmark story. On small CPU-bound fleets, Prequal loses. In the paper-aligned I/O-bound skewed regime, it wins decisively.</p>

<h3 id="probe-responses-are-rif-conditioned">Probe responses are RIF-conditioned</h3>

<p>The backend tracks current RIF with an atomic counter and stores recent latency samples in five buckets:</p>

<ul>
  <li><code class="language-plaintext highlighter-rouge">0</code></li>
  <li><code class="language-plaintext highlighter-rouge">1</code></li>
  <li><code class="language-plaintext highlighter-rouge">2..3</code></li>
  <li><code class="language-plaintext highlighter-rouge">4..7</code></li>
  <li><code class="language-plaintext highlighter-rouge">8+</code></li>
</ul>

<p>The probe handler looks at the current RIF, chooses the matching bucket, and returns the median latency from that bucket or the nearest non-empty bucket.</p>

<p>This means the probe latency signal is not a raw median across all recent requests. It is conditioned on queue depth.</p>
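<p>The bucket lookup can be sketched as follows. The backend itself is written in Rust, so this Go version is only an illustration of the logic described above. The function names are hypothetical, and the tie-break between two equally distant buckets (the lower bucket wins) is an assumption.</p>

```go
package main

import (
	"fmt"
	"sort"
)

// bucketIndex maps a RIF value to one of the five buckets: 0, 1, 2..3, 4..7, 8+.
func bucketIndex(rif int) int {
	switch {
	case rif <= 0:
		return 0
	case rif == 1:
		return 1
	case rif <= 3:
		return 2
	case rif <= 7:
		return 3
	default:
		return 4
	}
}

// median returns the median of a non-empty slice without modifying it.
func median(xs []float64) float64 {
	sorted := append([]float64(nil), xs...)
	sort.Float64s(sorted)
	n := len(sorted)
	if n%2 == 1 {
		return sorted[n/2]
	}
	return (sorted[n/2-1] + sorted[n/2]) / 2
}

// conditionedMedian returns the median latency of the bucket matching rif,
// falling back to the nearest non-empty bucket. It reports false if every
// bucket is empty.
func conditionedMedian(buckets [5][]float64, rif int) (float64, bool) {
	want := bucketIndex(rif)
	for d := 0; d < len(buckets); d++ {
		for _, i := range []int{want - d, want + d} {
			if i >= 0 && i < len(buckets) && len(buckets[i]) > 0 {
				return median(buckets[i]), true
			}
		}
	}
	return 0, false
}

func main() {
	var buckets [5][]float64
	buckets[0] = []float64{5}
	buckets[2] = []float64{10, 12, 14}
	for _, rif := range []int{0, 1, 3, 9} {
		m, ok := conditionedMedian(buckets, rif)
		fmt.Println("rif:", rif, "median:", m, "ok:", ok)
	}
}
```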

<p>The probe response shape is simple:</p>

<div class="language-json highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="p">{</span><span class="w">
  </span><span class="nl">"rif"</span><span class="p">:</span><span class="w"> </span><span class="mi">3</span><span class="p">,</span><span class="w">
  </span><span class="nl">"latency_median_ms"</span><span class="p">:</span><span class="w"> </span><span class="mf">12.5</span><span class="p">,</span><span class="w">
  </span><span class="nl">"timestamp_ms"</span><span class="p">:</span><span class="w"> </span><span class="mi">1710000000000</span><span class="w">
</span><span class="p">}</span><span class="w">
</span></code></pre></div></div>

<p>The Go prober then turns that into a <code class="language-plaintext highlighter-rouge">ProbeEntry</code>, rejects it if the timestamp is too stale, and inserts it into the right pool.</p>

<h3 id="fault-injection-is-built-in">Fault injection is built in</h3>

<p>The backend also supports probe-fault modes:</p>

<ul>
  <li>timeout</li>
  <li>HTTP 500</li>
  <li>malformed JSON</li>
  <li>stale timestamp</li>
</ul>

<p>This makes it possible to test whether the prober correctly records failures, drops bad data, and avoids poisoning the pool with stale observations.</p>

<h2 id="the-observability-path">The observability path</h2>

<blockquote>
  <p>The controller and proxy expose Prometheus metrics for request latency, per-algorithm selections, probe success and failure counts, probe queue depth, pool occupancy, and active backend count. A debug server exposes <code class="language-plaintext highlighter-rouge">/metrics</code>, <code class="language-plaintext highlighter-rouge">/routes</code>, <code class="language-plaintext highlighter-rouge">/healthz</code>, <code class="language-plaintext highlighter-rouge">/readyz</code>, and <code class="language-plaintext highlighter-rouge">pprof</code> endpoints. That is enough to explain benchmark results, not just report them.</p>
</blockquote>

<p>This repo instruments the controller and proxy heavily enough that you can explain a benchmark result rather than just report it.</p>

<p>The Prometheus metrics in <a href="https://github.com/sathwick-p/prequal/blob/main/observability/metrics.go"><code class="language-plaintext highlighter-rouge">observability/metrics.go</code></a> cover:</p>

<ul>
  <li>total requests and request latency</li>
  <li>backend selections by route and algorithm</li>
  <li>no-route and no-backend events</li>
  <li>reconciliation counts and durations</li>
  <li>probes sent, succeeded, failed, dropped</li>
  <li>probe queue depth</li>
  <li>pool occupancy</li>
  <li>selection algorithm usage</li>
  <li>active backend count</li>
</ul>

<p>And the debug server in <a href="https://github.com/sathwick-p/prequal/blob/main/debug.go"><code class="language-plaintext highlighter-rouge">debug.go</code></a> exposes:</p>

<ul>
  <li><code class="language-plaintext highlighter-rouge">/metrics</code></li>
  <li><code class="language-plaintext highlighter-rouge">/routes</code></li>
  <li><code class="language-plaintext highlighter-rouge">/healthz</code></li>
  <li><code class="language-plaintext highlighter-rouge">/readyz</code></li>
  <li><code class="language-plaintext highlighter-rouge">pprof</code> endpoints</li>
</ul>

<p>That instrumentation is what makes the benchmark investigation credible. The repo can answer questions like:</p>

<ul>
  <li>Did Prequal actually avoid the slow replicas?</li>
  <li>Was the pool starving?</li>
  <li>Were requests falling back to random?</li>
  <li>Was the controller CPU-bound?</li>
  <li>Was mutex contention the problem?</li>
</ul>

<p>The observability here is how the benchmark results got debugged, not dashboard decoration.</p>

<h2 id="benchmarking">Benchmarking</h2>

<blockquote>
  <p>A controlled protocol (interleaved algorithm order, controller rollout-restart per run, warmup period, full metadata capture) turned an early 10x-worse false negative into a reproducible 8.6x p99 improvement. The methodology fix alone, with zero algorithm code changed, produced a 56x p99 reduction in the heterogeneous run.</p>
</blockquote>

<p>The benchmark harness in <code class="language-plaintext highlighter-rouge">benchmark/</code> is extensive enough that it deserves to be treated as part of the software, not just support files.</p>

<p>There are several traffic models:</p>

<ul>
  <li><code class="language-plaintext highlighter-rouge">steady_state.js</code></li>
  <li><code class="language-plaintext highlighter-rouge">open_loop.js</code></li>
  <li><code class="language-plaintext highlighter-rouge">rate_ramp.js</code></li>
  <li><code class="language-plaintext highlighter-rouge">burst.js</code></li>
  <li><code class="language-plaintext highlighter-rouge">overload.js</code></li>
  <li><code class="language-plaintext highlighter-rouge">long_duration.js</code></li>
  <li><code class="language-plaintext highlighter-rouge">multi_route.js</code></li>
</ul>

<p>And there are matching manifests for:</p>

<ul>
  <li>uniform workloads</li>
  <li>heterogeneous workloads</li>
  <li>multi-route workloads</li>
  <li>route-scale</li>
  <li>long-duration</li>
  <li>fault injection</li>
</ul>

<p>The comparisons are made by holding everything constant except the algorithm annotation on the ingress.</p>

<p>That gives a fair comparison:</p>

<ul>
  <li>same controller binary</li>
  <li>same route matching</li>
  <li>same transport</li>
  <li>same backend images</li>
  <li>same cluster</li>
  <li>same load script</li>
  <li>same observability stack</li>
</ul>

<p>Only the backend selection rule changes.</p>

<h3 id="the-controlled-protocol">The controlled protocol</h3>

<p>The methodology fix alone, with zero algorithm code changed, produced a 56x p99 reduction in the E-B heterogeneous run. The seven competing hypotheses are walked through in <a href="https://github.com/sathwick-p/prequal/blob/main/benchmark/investigations/2026-04-19-c2-tail-spike.md">the tail-spike investigation</a>.</p>

<p>An early heterogeneous run made Prequal look dramatically worse than the baselines, but it turned out the main problem was protocol:</p>

<ul>
  <li>algorithms were run sequentially instead of interleaved</li>
  <li>controller state leaked across runs</li>
  <li>probe pools were not reset cleanly between repetitions</li>
</ul>

<p>The fix became the canonical benchmark protocol:</p>

<ul>
  <li>interleaved algorithm order</li>
  <li>controller rollout restart before every run</li>
  <li>warmup period before measurement</li>
  <li>full metadata capture per run</li>
</ul>

<p>That protocol is encoded in <a href="https://github.com/sathwick-p/prequal/blob/main/benchmark/scripts/run_interleaved_campaign.sh"><code class="language-plaintext highlighter-rouge">benchmark/scripts/run_interleaved_campaign.sh</code></a> and <a href="https://github.com/sathwick-p/prequal/blob/main/benchmark/scripts/run_campaign.sh"><code class="language-plaintext highlighter-rouge">benchmark/scripts/run_campaign.sh</code></a>.</p>

<h3 id="why-open-loop-matters">Why open-loop matters</h3>

<p>The decisive C2 benchmark uses <code class="language-plaintext highlighter-rouge">k6</code> constant-arrival-rate mode:</p>

<div class="language-js highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">export</span> <span class="kd">const</span> <span class="nx">options</span> <span class="o">=</span> <span class="p">{</span>
  <span class="na">scenarios</span><span class="p">:</span> <span class="p">{</span>
    <span class="na">open_loop</span><span class="p">:</span> <span class="p">{</span>
      <span class="na">executor</span><span class="p">:</span> <span class="dl">'</span><span class="s1">constant-arrival-rate</span><span class="dl">'</span><span class="p">,</span>
      <span class="nx">rate</span><span class="p">,</span>
      <span class="na">timeUnit</span><span class="p">:</span> <span class="dl">'</span><span class="s1">1s</span><span class="dl">'</span><span class="p">,</span>
      <span class="nx">duration</span><span class="p">,</span>
      <span class="nx">preAllocatedVUs</span><span class="p">,</span>
      <span class="nx">maxVUs</span><span class="p">,</span>
    <span class="p">},</span>
  <span class="p">},</span>
<span class="p">}</span>
</code></pre></div></div>

<p>This is a better choice than closed-loop when the point is to expose queueing behavior. Closed-loop traffic self-throttles when latency rises. Open-loop keeps pushing at the target rate and makes tail failures visible.</p>
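<p>The self-throttling effect is easy to see with a little arithmetic: with N virtual users and no think time, a closed-loop generator can offer at most N / latency requests per second, so the offered load falls as latency rises. The VU count and target rate below are made-up illustration values, not the benchmark's settings.</p>

```go
package main

import "fmt"

// closedLoopRate is the request rate that vus virtual users can sustain when
// each one waits for its response before sending the next request.
func closedLoopRate(vus int, latencySec float64) float64 {
	return float64(vus) / latencySec
}

func main() {
	const vus = 100         // hypothetical closed-loop VU count
	const targetRate = 2000 // hypothetical open-loop arrival rate in req/s
	for _, latency := range []float64{0.05, 0.2, 0.8} {
		fmt.Printf("latency=%.2fs closed-loop=%.0f req/s open-loop=%d req/s\n",
			latency, closedLoopRate(vus, latency), targetRate)
	}
}
```

<p>As latency grows from 50ms to 800ms, the closed-loop rate collapses while the open-loop target stays fixed, which is why queueing shows up in the open-loop tail.</p>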

<p>The ramp benchmark does the complementary thing: it increases arrival rate in stages until the system crosses its comfortable regime.</p>

<p>Together, those two tests are enough to answer the important question: does Prequal help under skew and near saturation, which is exactly where the paper says it should?</p>

<h2 id="how-much-faster-is-prequal-than-round-robin">How much faster is Prequal than round-robin?</h2>

<blockquote>
  <p>On a 16-backend cluster with 16x service-time skew (I/O-bound, open-loop), this Go Prequal implementation cut p99 tail latency by 8.6x versus round-robin and 8.5x versus least-connections. On a 4-backend CPU-bound cluster, the same implementation was roughly 25% slower than round-robin on throughput. Prequal is a tail-latency tool, not a small-fleet tool.</p>
</blockquote>

<p>Across five-run interleaved campaigns on a 16-backend cluster with <code class="language-plaintext highlighter-rouge">16x</code> service-time skew, this Prequal implementation cut <code class="language-plaintext highlighter-rouge">p99</code> tail latency by <code class="language-plaintext highlighter-rouge">8.6x</code> against round-robin and <code class="language-plaintext highlighter-rouge">8.5x</code> against least-connections on an open-loop workload, and by <code class="language-plaintext highlighter-rouge">6.8x</code> against round-robin on a rate ramp. On a small 4-backend CPU-bound workload, the same implementation was roughly <code class="language-plaintext highlighter-rouge">25%</code> slower than round-robin on throughput. The repo’s final claim is deliberately bounded, and I think it is the right one.</p>

<h3 id="small-cpu-bound-fleet-prequal-loses">Small CPU-bound fleet: Prequal loses</h3>

<p>In the small-fleet CPU-bound regime, this implementation does not win.</p>

<p>The report’s C1 numbers are:</p>

<table>
  <thead>
    <tr>
      <th>algorithm</th>
      <th style="text-align: right">throughput rps</th>
      <th style="text-align: right">p99 ms</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <td>prequal</td>
      <td style="text-align: right">4162</td>
      <td style="text-align: right">29.86</td>
    </tr>
    <tr>
      <td>round-robin</td>
      <td style="text-align: right">5366</td>
      <td style="text-align: right">20.22</td>
    </tr>
    <tr>
      <td>least-connections</td>
      <td style="text-align: right">4965</td>
      <td style="text-align: right">23.48</td>
    </tr>
  </tbody>
</table>

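<p>That is roughly a 25% throughput deficit versus round-robin (4162 vs. 5366 rps), with a <code class="language-plaintext highlighter-rouge">p99</code> about 10 ms higher.</p>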

<p>I profiled it and the overhead investigation concludes that:</p>

<ul>
  <li>there is no hot path dominating controller CPU</li>
  <li>mutex contention is negligible</li>
  <li>heap use is tiny</li>
  <li>the measurable cost is mostly diffuse probe/network competition in a regime where the algorithm does not have enough diversity or skew to pay for itself</li>
</ul>

<h3 id="paper-aligned-skewed-io-bound-regime-prequal-wins-hard">Paper-aligned skewed I/O-bound regime: Prequal wins hard</h3>

<p>The story changes once the benchmark is moved closer to the paper’s assumptions:</p>

<ul>
  <li>16 backends instead of 4</li>
  <li>14 fast, 2 slow</li>
  <li>16x service-time skew</li>
  <li>I/O-bound backend mode</li>
  <li>open-loop and ramp traffic</li>
</ul>

<p>In the E-B heterogeneous open-loop campaign, the median <code class="language-plaintext highlighter-rouge">p99</code> numbers are:</p>

<table>
  <thead>
    <tr>
      <th>algorithm</th>
      <th style="text-align: right">p99 ms</th>
      <th style="text-align: right">p99.9 ms</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <td>prequal</td>
      <td style="text-align: right">94.20</td>
      <td style="text-align: right">272.38</td>
    </tr>
    <tr>
      <td>round-robin</td>
      <td style="text-align: right">807.32</td>
      <td style="text-align: right">887.90</td>
    </tr>
    <tr>
      <td>least-connections</td>
      <td style="text-align: right">802.84</td>
      <td style="text-align: right">1006.78</td>
    </tr>
  </tbody>
</table>

<p>That is an 8.6x <code class="language-plaintext highlighter-rouge">p99</code> improvement versus round-robin and 8.5x versus least-connections.</p>
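<p>The ratios can be checked directly from the median <code class="language-plaintext highlighter-rouge">p99</code> values in the table. The helper below is just that arithmetic; <code class="language-plaintext highlighter-rouge">speedup</code> is a name made up for this example.</p>

```go
package main

import "fmt"

// speedup reports how many times lower the candidate's latency is than the
// baseline's.
func speedup(baselineMs, candidateMs float64) float64 {
	return baselineMs / candidateMs
}

func main() {
	const prequalP99 = 94.20 // median p99 in ms, from the table above
	fmt.Printf("vs round-robin:       %.1fx\n", speedup(807.32, prequalP99))
	fmt.Printf("vs least-connections: %.1fx\n", speedup(802.84, prequalP99))
}
```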

<p>In the E-B ramp campaign:</p>

<table>
  <thead>
    <tr>
      <th>algorithm</th>
      <th style="text-align: right">p99 ms</th>
      <th style="text-align: right">p99.9 ms</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <td>prequal</td>
      <td style="text-align: right">123.39</td>
      <td style="text-align: right">824.08</td>
    </tr>
    <tr>
      <td>round-robin</td>
      <td style="text-align: right">831.58</td>
      <td style="text-align: right">1250.41</td>
    </tr>
    <tr>
      <td>least-connections</td>
      <td style="text-align: right">867.93</td>
      <td style="text-align: right">1596.51</td>
    </tr>
  </tbody>
</table>

<p>That is still a 6.8x <code class="language-plaintext highlighter-rouge">p99</code> improvement versus round-robin.</p>

<p>The selection-rate data explains why. Prequal pushes traffic almost entirely to the fast backends and drives the two slow replicas down to nearly zero selections per second. Round-robin, by definition, keeps giving the slow pair their fair share. Least-connections improves the <code class="language-plaintext highlighter-rouge">p95</code>, but still reacts too slowly to avoid queueing at the slow replicas, so the tail remains pinned near their service time.</p>

<p>This mechanism lines up with the result.</p>

<h3 id="the-win-is-in-the-tail-not-the-center">The win is in the tail, not the center</h3>

<p>One subtle but important point from the data is that <code class="language-plaintext highlighter-rouge">p50</code> is basically the same across algorithms in the winning regime. The advantage is almost entirely in <code class="language-plaintext highlighter-rouge">p99</code> and <code class="language-plaintext highlighter-rouge">p99.9</code>.</p>

<p>That is exactly what you would expect if the algorithm is avoiding pathological queueing rather than making the median request faster.</p>

<p>It is also why Prequal is interesting. If your median is already fine, the only remaining reason to build a more sophisticated load balancer is to keep a minority of requests from getting stuck behind bad backend choices.</p>

<p><img src="/assets/2026-04-20-prequal/backend-selection.png" alt="Per-backend selection rate under heterogeneous load: Prequal concentrates traffic on the 14 fast replicas and drives the two slow replicas to near-zero selections per second, while round-robin and least-connections continue feeding the slow pair" /></p>

<h2 id="how-faithful-is-this-go-implementation-to-the-prequal-paper">How faithful is this Go implementation to the Prequal paper?</h2>

<blockquote>
  <p>This Go implementation defaults to <code class="language-plaintext highlighter-rouge">Q_RIF = 0.75</code> versus the paper’s 0.84, <code class="language-plaintext highlighter-rouge">probes-per-query = 1.0</code> versus the paper’s 3 (testbed) and 5 (YouTube production), and hardcodes probe reuse at 3 rather than deriving it from pool size and fleet size. The frozen benchmark report documents every divergence explicitly.</p>
</blockquote>

<p>This repo is faithful to the paper’s central ideas, but it is not a line-by-line reproduction. The frozen report documents the differences clearly, and they matter.</p>

<p>The most important divergences are:</p>

<h3 id="qrif-default-is-075-not-the-papers-084"><code class="language-plaintext highlighter-rouge">QRIF</code> default is <code class="language-plaintext highlighter-rouge">0.75</code>, not the paper’s <code class="language-plaintext highlighter-rouge">~0.84</code></h3>

<p>The paper’s baseline uses <code class="language-plaintext highlighter-rouge">Q_RIF = 2^(-0.25) ≈ 0.84</code>.
This repo defaults to <code class="language-plaintext highlighter-rouge">0.75</code>.</p>

<p>That is within the paper’s recommended band and probably a minor difference, but it is still a difference.</p>
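<p>To see what the threshold changes in practice, recall that the paper classifies a pooled probe as hot when its RIF exceeds the <code class="language-plaintext highlighter-rouge">Q_RIF</code> quantile of the observed RIF distribution. The sketch below computes that cutoff for a 16-entry pool. The nearest-rank quantile method and the function names are assumptions for illustration, not necessarily what the repo does.</p>

```go
package main

import (
	"fmt"
	"math"
	"sort"
)

// rifThreshold returns the q-quantile of the given RIF values using
// nearest-rank on a sorted copy. It returns 0 for an empty pool.
func rifThreshold(rifs []int, q float64) int {
	if len(rifs) == 0 {
		return 0
	}
	sorted := append([]int(nil), rifs...)
	sort.Ints(sorted)
	idx := int(math.Ceil(q*float64(len(sorted)))) - 1
	if idx < 0 {
		idx = 0
	}
	return sorted[idx]
}

// countHot counts probes whose RIF is strictly above the threshold.
func countHot(rifs []int, threshold int) int {
	hot := 0
	for _, r := range rifs {
		if r > threshold {
			hot++
		}
	}
	return hot
}

func main() {
	pool := make([]int, 16)
	for i := range pool {
		pool[i] = i + 1 // RIF values 1..16
	}
	for _, q := range []float64{0.75, 0.84} {
		t := rifThreshold(pool, q)
		fmt.Printf("Q_RIF=%.2f threshold=%d hot=%d\n", q, t, countHot(pool, t))
	}
}
```

<p>Under these assumptions, the lower quantile marks more of the pool as hot, so more selections are decided by RIF rather than by latency.</p>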

<h3 id="probes-per-query-is-10-not-3-or-5">Probes per query is <code class="language-plaintext highlighter-rouge">1.0</code>, not <code class="language-plaintext highlighter-rouge">3</code> or <code class="language-plaintext highlighter-rouge">5</code></h3>

<p>The paper uses <code class="language-plaintext highlighter-rouge">3</code> probes per query in testbed experiments and mentions <code class="language-plaintext highlighter-rouge">5</code> in YouTube production.
This repo defaults to <code class="language-plaintext highlighter-rouge">1.0</code>.</p>

<p>That choice was motivated by keeping probe overhead reasonable on a small local cluster, but it is a substantial departure.</p>

<h3 id="probe-reuse-is-a-fixed-constant">Probe reuse is a fixed constant</h3>

<p>The paper derives reuse behavior from a formula involving pool size, fleet size, probe rate, and removal rate.
This repo hardcodes <code class="language-plaintext highlighter-rouge">PoolReuseLimit = 3</code>.</p>

<p>That is a reasonable engineering choice for a local implementation, but it means the repo is approximating one part of the paper’s mechanics rather than reproducing it exactly.</p>

<h3 id="probe-removal-is-maintenance-driven">Probe removal is maintenance-driven</h3>

<p>The paper frames removal as a per-query process.
This repo performs cleanup and “remove worst” behavior on a maintenance tick plus reuse depletion.</p>

<p>That keeps work off the request hot path, which is sensible for Go code in a proxy, but it is another behavioral difference.</p>

<h3 id="backend-probing-is-not-yet-sampling-without-replacement">Backend probing is not yet sampling without replacement</h3>

<p>The report points out a latent issue: <code class="language-plaintext highlighter-rouge">ProbeRandom</code> picks one backend at a time with <code class="language-plaintext highlighter-rouge">rand.Intn</code>, so if probes-per-query were raised above <code class="language-plaintext highlighter-rouge">1</code>, the implementation would not yet match the paper’s “sample without replacement” requirement.</p>
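<p>One common way to close that gap is a partial Fisher-Yates shuffle, which picks <code class="language-plaintext highlighter-rouge">k</code> distinct backends in a single pass. This is a sketch of the technique, not a patch for the repo: <code class="language-plaintext highlighter-rouge">sampleWithoutReplacement</code> is a hypothetical helper and it operates on plain strings rather than the repo's endpoint type.</p>

```go
package main

import (
	"fmt"
	"math/rand"
)

// sampleWithoutReplacement returns up to k distinct backends chosen uniformly
// at random, using a partial Fisher-Yates shuffle on a copy of the input.
func sampleWithoutReplacement(backends []string, k int) []string {
	if k > len(backends) {
		k = len(backends)
	}
	if k <= 0 {
		return nil
	}
	shuffled := append([]string(nil), backends...)
	for i := 0; i < k; i++ {
		// Pick a random element from the not-yet-chosen tail and swap it in.
		j := i + rand.Intn(len(shuffled)-i)
		shuffled[i], shuffled[j] = shuffled[j], shuffled[i]
	}
	return shuffled[:k]
}

func main() {
	backends := []string{"b0", "b1", "b2", "b3", "b4"}
	fmt.Println(sampleWithoutReplacement(backends, 3))
}
```

<p>Each call returns distinct targets, so raising probes-per-query to 3 would probe three different backends instead of possibly probing the same one repeatedly.</p>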

<h2 id="final-thoughts">Final thoughts</h2>
<p>Writing this was a lot of fun—and watching Claude run tests and benchmarks made it even better. There were a couple of ideas I wanted to explore further but didn’t get to, mainly because I was short on time and wanted to move on to the next project.</p>

<p>One idea was to build a sidecar container or probe that uses eBPF to attach to the main backend container, collect both latency and RIF, and expose that data through a path that the ingress controller could use to make smarter routing decisions. Another was to design a centralized probe pool that multiple ingress controller pods could read from, or to implement a gossip protocol between controller pods to share this information in a decentralized way.</p>

<p>But eventually, it was time to move on. As always, the next project tends to pull harder than the last.</p>

<p>Until next time.</p>

<h2 id="references">References</h2>

<ul>
  <li>Wydrowski, Kleinberg, Rumble, Archer. <a href="https://www.usenix.org/system/files/nsdi24-wydrowski.pdf"><em>Load is not what you should balance: Introducing Prequal</em></a>, NSDI 2024.</li>
  <li>Repo benchmark report: <a href="https://github.com/sathwick-p/prequal/blob/main/benchmark/REPORT.md"><code class="language-plaintext highlighter-rouge">benchmark/REPORT.md</code></a></li>
  <li>Tail-spike investigation: <a href="https://github.com/sathwick-p/prequal/blob/main/benchmark/investigations/2026-04-19-c2-tail-spike.md"><code class="language-plaintext highlighter-rouge">benchmark/investigations/2026-04-19-c2-tail-spike.md</code></a></li>
  <li>Regime pivot investigation: <a href="https://github.com/sathwick-p/prequal/blob/main/benchmark/investigations/2026-04-20-regime-pivot.md"><code class="language-plaintext highlighter-rouge">benchmark/investigations/2026-04-20-regime-pivot.md</code></a></li>
  <li>Overhead profiling investigation: <a href="https://github.com/sathwick-p/prequal/blob/main/benchmark/investigations/2026-04-20-prequal-overhead-profiling.md"><code class="language-plaintext highlighter-rouge">benchmark/investigations/2026-04-20-prequal-overhead-profiling.md</code></a></li>
  <li>HAProxy Technologies. <a href="https://www.haproxy.com/blog/power-of-two-load-balancing"><em>Power of Two Load Balancing</em></a> — context for why Prequal’s quantile-over-pool design is a more sophisticated variant of the same “sample a subset, don’t score every backend” idea.</li>
  <li>Full source code: <a href="https://github.com/sathwick-p/prequal">github.com/sathwick-p/prequal</a> — the controller, reverse proxy, Rust benchmark backend, k6 scripts, Prometheus/Grafana assets, and frozen benchmark reports all live here.</li>
</ul>]]></content><author><name>Sathwick</name><email>sathwick.p7@gmail.com</email></author><category term="architecture" /><category term="loadbalancer" /><category term="go" /><summary type="html"><![CDATA[I rebuilt YouTube's load balancing algorithm (Prequal) in Go. It cut p99 tail latency 8.6x vs round-robin. Full walkthrough of the paper, code, and benchmarks.]]></summary><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="https://sathwick.xyz/assets/2026-04-20-prequal/system-overview.png" /><media:content medium="image" url="https://sathwick.xyz/assets/2026-04-20-prequal/system-overview.png" xmlns:media="http://search.yahoo.com/mrss/" /></entry><entry><title type="html">Reverse-Engineering Claude Code: A Deep Dive into Anthropic’s AI-Powered CLI</title><link href="https://sathwick.xyz/blog/claude-code.html" rel="alternate" type="text/html" title="Reverse-Engineering Claude Code: A Deep Dive into Anthropic’s AI-Powered CLI" /><published>2026-03-31T00:00:00+00:00</published><updated>2026-03-31T00:00:00+00:00</updated><id>https://sathwick.xyz/blog/claude-code</id><content type="html" xml:base="https://sathwick.xyz/blog/claude-code.html"><![CDATA[<h2 id="table-of-contents">Table of Contents</h2>

<ol>
  <li><a href="#1-introduction-what-is-claude-code">Introduction: What is Claude Code?</a></li>
  <li><a href="#2-high-level-architecture">High-Level Architecture</a></li>
  <li><a href="#3-startup-the-race-against-time">Startup: The Race Against Time</a></li>
  <li><a href="#4-the-query-engine-brains-of-the-operation">The Query Engine: Brains of the Operation</a></li>
  <li><a href="#5-the-tool-system-60-tools-behind-a-single-interface">The Tool System: 60+ Tools Behind a Single Interface</a></li>
  <li><a href="#6-the-permission-system-safety-at-every-layer">The Permission System: Safety at Every Layer</a></li>
  <li><a href="#7-terminal-ui-react-but-for-your-terminal">Terminal UI: React, but for Your Terminal</a></li>
  <li><a href="#8-the-command-system-100-slash-commands">The Command System: 100+ Slash Commands</a></li>
  <li><a href="#9-skills-plugins-and-mcp-the-extensibility-trifecta">Skills, Plugins, and MCP: The Extensibility Trifecta</a></li>
  <li><a href="#10-context-management-fighting-the-token-limit">Context Management: Fighting the Token Limit</a></li>
  <li><a href="#11-state-management-immutable-store-for-a-mutable-world">State Management: Immutable Store for a Mutable World</a></li>
  <li><a href="#12-session-persistence-and-history">Session Persistence and History</a></li>
  <li><a href="#13-multi-agent-architecture-subagents-swarms-and-worktrees">Multi-Agent Architecture: Subagents, Swarms, and Worktrees</a></li>
  <li><a href="#14-error-recovery-a-system-that-refuses-to-crash">Error Recovery: A System That Refuses to Crash</a></li>
  <li><a href="#15-cost-tracking-and-telemetry">Cost Tracking and Telemetry</a></li>
  <li><a href="#16-execution-modes-one-codebase-many-faces">Execution Modes: One Codebase, Many Faces</a></li>
  <li><a href="#17-buddy-a-tamagotchi-style-ai-pet">BUDDY: A Tamagotchi-Style AI Pet</a></li>
  <li><a href="#18-kairos-persistent-assistant-mode-and-auto-dreaming">KAIROS: Persistent Assistant Mode and Auto-Dreaming</a></li>
  <li><a href="#19-ultraplan-remote-planning-sessions">ULTRAPLAN: Remote Planning Sessions</a></li>
  <li><a href="#20-coordinator-mode-multi-agent-orchestrator">Coordinator Mode: Multi-Agent Orchestrator</a></li>
  <li><a href="#21-the-memory-system-persistent-ai-memory">The Memory System: Persistent AI Memory</a></li>
  <li><a href="#22-hooks-user-defined-automation">Hooks: User-Defined Automation</a></li>
  <li><a href="#23-voice-mode-bridge-and-infrastructure">Voice Mode, Bridge, and Infrastructure</a></li>
  <li><a href="#24-vim-mode-keybindings-and-developer-ergonomics">Vim Mode, Keybindings, and Developer Ergonomics</a></li>
  <li><a href="#25-key-engineering-patterns-and-takeaways">Key Engineering Patterns and Takeaways</a></li>
  <li><a href="#26-conclusion">Conclusion</a></li>
</ol>

<h2 id="1-introduction-what-is-claude-code">1. Introduction: What is Claude Code?</h2>

<p>Claude Code is Anthropic’s official CLI tool — an interactive, AI-powered development assistant that lives in your terminal. It lets developers have natural-language conversations with Claude to edit files, run shell commands, search codebases, manage Git workflows, create pull requests, debug issues, and much more.</p>

<p>But underneath the conversational interface lies a <strong>remarkably sophisticated piece of software engineering</strong>: a custom React-based terminal renderer, a multi-layered permission system, an elastic tool discovery mechanism, a self-healing query loop with automatic context compression, and an extensibility framework spanning skills, plugins, and the Model Context Protocol (MCP).</p>

<p>This article is a deep technical analysis of the Claude Code source code: <strong>330+ utility files, 45+ tool implementations, 100+ slash commands, 146 UI components, and a custom terminal rendering framework</strong>, all written in TypeScript with React and running on Bun.</p>

<p>Let’s take it apart, piece by piece.</p>

<h2 id="2-high-level-architecture">2. High-Level Architecture</h2>

<p>Claude Code follows a layered architecture where each layer has clear responsibilities:</p>

<p><img src="/assets/claudecode/high-level-architecture.png" alt="Claude Code high-level architecture" /></p>

<h3 id="tech-stack">Tech Stack</h3>

<table>
  <thead>
    <tr>
      <th>Layer</th>
      <th>Technology</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <td>Language</td>
      <td>TypeScript (strict mode)</td>
    </tr>
    <tr>
      <td>Runtime</td>
      <td>Bun (with Node.js 18+ compatibility)</td>
    </tr>
    <tr>
      <td>UI Framework</td>
      <td>React 18 with custom terminal reconciler</td>
    </tr>
    <tr>
      <td>Layout Engine</td>
      <td>Yoga (Facebook’s flexbox implementation)</td>
    </tr>
    <tr>
      <td>API Client</td>
      <td><code class="language-plaintext highlighter-rouge">@anthropic-ai/sdk</code></td>
    </tr>
    <tr>
      <td>Extensibility</td>
      <td>Model Context Protocol (MCP) SDK</td>
    </tr>
    <tr>
      <td>Validation</td>
      <td>Zod (schema-driven I/O for all tools)</td>
    </tr>
    <tr>
      <td>CLI Framework</td>
      <td>Commander.js</td>
    </tr>
    <tr>
      <td>Linting</td>
      <td>Biome + ESLint</td>
    </tr>
  </tbody>
</table>

<h3 id="directory-structure">Directory Structure</h3>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>src/
├── main.tsx                 # Application entry (~800KB, bootstraps everything)
├── QueryEngine.ts           # Conversation management &amp; API orchestration
├── query.ts                 # Query loop state machine (retries, compaction, recovery)
├── Tool.ts                  # Unified tool interface (generic over Input/Output/Progress)
├── tools.ts                 # Tool registry with feature-gated loading
├── commands.ts              # Command registry with lazy dispatch
├── context.ts               # System/user context builder (git, CLAUDE.md, date)
├── cost-tracker.ts          # Per-model usage accumulation and display
├── history.ts               # Session history (JSONL, dedup, paste refs)
├── setup.ts                 # Pre-action configuration and auth
├── entrypoints/             # CLI, SDK, MCP entry points
├── tools/                   # 45+ tool implementations (Bash, FileRead, Agent, etc.)
├── commands/                # 100+ slash command implementations
├── components/              # 146 React terminal components
├── ink/                     # Custom terminal rendering framework (~90 files)
├── services/                # API, analytics, MCP, compact, plugins
├── hooks/                   # 85+ hook implementations
├── state/                   # AppState store (Zustand-like)
├── utils/                   # 330+ utilities (git, config, permissions, etc.)
├── skills/                  # Skill loading, bundled skills
├── keybindings/             # Dynamic keybinding system
├── vim/                     # Full vi/vim mode
├── bridge/                  # CCR bridge (WebSocket to claude.ai)
├── coordinator/             # Multi-agent coordination
├── remote/                  # Remote session management
├── tasks/                   # Background task system
├── migrations/              # Versioned data migrations
└── types/                   # Shared type definitions
</code></pre></div></div>

<h2 id="3-startup-the-race-against-time">3. Startup: The Race Against Time</h2>

<p>Claude Code’s startup is aggressively optimized. The goal: minimize time-to-first-render so the developer is never left staring at a blank terminal.</p>

<h3 id="31-parallelized-prefetching">3.1 Parallelized Prefetching</h3>

<p>Before <em>any</em> module imports happen, three critical operations fire in parallel:</p>

<div class="language-typescript highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c1">// main.tsx — lines 1-20, before any other imports</span>
<span class="nx">profileCheckpoint</span><span class="p">(</span><span class="dl">'</span><span class="s1">main_tsx_entry</span><span class="dl">'</span><span class="p">)</span>
<span class="nx">startMdmRawRead</span><span class="p">()</span>        <span class="c1">// macOS MDM policy read (subprocess)</span>
<span class="nx">startKeychainPrefetch</span><span class="p">()</span>   <span class="c1">// OAuth + API key keychain reads (2 subprocesses)</span>
</code></pre></div></div>

<p>The insight here: module evaluation takes ~135ms regardless, and it is sequential by nature. Because the subprocesses are spawned <em>immediately</em>, the macOS keychain reads (~65ms total) overlap with import resolution and add effectively nothing to startup time.</p>

<h3 id="32-initialization-sequence">3.2 Initialization Sequence</h3>

<p>The <code class="language-plaintext highlighter-rouge">init()</code> function (memoized to prevent re-entrancy) orchestrates 16 setup stages:</p>

<ol>
  <li><strong>Config validation</strong> — Parse and validate all JSON config files</li>
  <li><strong>Safe environment variables</strong> — Apply non-sensitive env vars before trust dialog</li>
  <li><strong>CA certificates</strong> — Load extra root CAs before first TLS handshake</li>
  <li><strong>Graceful shutdown handlers</strong> — Register <code class="language-plaintext highlighter-rouge">SIGINT</code>/<code class="language-plaintext highlighter-rouge">SIGTERM</code> handlers</li>
  <li><strong>OAuth population</strong> — Async account info fetch</li>
  <li><strong>IDE detection</strong> — JetBrains, VS Code identification</li>
  <li><strong>Remote settings</strong> — Fetch managed settings from server (async, awaited later)</li>
  <li><strong>Policy limits</strong> — Load org-enforced limits (async)</li>
  <li><strong>First-start timestamp</strong> — Analytics marker</li>
  <li><strong>mTLS configuration</strong> — Client certificate setup</li>
  <li><strong>Proxy agents</strong> — Configure HTTP/HTTPS proxies</li>
  <li><strong>API preconnection</strong> — TCP+TLS handshake overlaps with remaining init</li>
  <li><strong>Upstream proxy (CCR)</strong> — CONNECT relay for organization credentials</li>
  <li><strong>Shell detection</strong> — Windows-specific shell resolution</li>
  <li><strong>LSP manager</strong> — Language Server Protocol cleanup handlers</li>
  <li><strong>Team cleanup</strong> — Multi-agent swarm cleanup on shutdown</li>
</ol>
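<p>The memoization around <code class="language-plaintext highlighter-rouge">init()</code> can be pictured with a small sketch. This is an illustration only, not the actual implementation: the <code class="language-plaintext highlighter-rouge">once</code> helper and the <code class="language-plaintext highlighter-rouge">setupRuns</code> counter are hypothetical names, and the real function is presumably async.</p>

```typescript
// Hypothetical helper: wraps a function so its body runs at most once.
// Later calls (including re-entrant ones) get the cached result.
function once(fn: () => number): () => number {
  let called = false
  let result = 0
  return () => {
    if (!called) {
      called = true // set before running fn so a re-entrant call is a no-op
      result = fn()
    }
    return result
  }
}

// Stand-in for the real setup stages; counts how often the body actually runs.
let setupRuns = 0
const init = once(() => {
  setupRuns += 1
  return setupRuns
})

init()
init() // second call does not repeat the setup work
```

<p>Setting the flag before running the body is what guards against re-entrancy: a nested call during setup sees <code class="language-plaintext highlighter-rouge">called === true</code> and returns early.</p>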

<h3 id="33-fast-paths">3.3 Fast Paths</h3>

<p>Before full initialization, fast paths handle quick-exit commands:</p>

<ul>
  <li><code class="language-plaintext highlighter-rouge">--version</code> — Print version and exit (no init, no React)</li>
  <li><code class="language-plaintext highlighter-rouge">--dump-system-prompt</code> — Output the system prompt and exit</li>
  <li><code class="language-plaintext highlighter-rouge">mcp serve</code> — Start MCP server mode (different init path)</li>
</ul>

<h3 id="34-startup-profiling">3.4 Startup Profiling</h3>

<p>A sampled profiler (<code class="language-plaintext highlighter-rouge">startupProfiler.ts</code>) measures every phase:</p>
<ul>
  <li>100% of internal builds get sampled</li>
  <li>0.5% of external users are sampled</li>
  <li><code class="language-plaintext highlighter-rouge">CLAUDE_CODE_PROFILE_STARTUP=1</code> forces full profiling with memory snapshots</li>
</ul>

<p>The decision is made once at module load time — non-sampled users pay zero profiling overhead.</p>
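<p>A rough sketch of a decide-once sampler, assuming the rates and the environment variable quoted above. The function names, the injected <code class="language-plaintext highlighter-rouge">random</code> parameter, and the profiler shape are assumptions for illustration, not the contents of <code class="language-plaintext highlighter-rouge">startupProfiler.ts</code>.</p>

```typescript
// Assumed constant: the 0.5% external sampling rate quoted in the text.
const EXTERNAL_SAMPLE_RATE = 0.005

// Hypothetical decision function. `random` is injected so the logic is testable.
function shouldProfile(
  env: { [key: string]: string | undefined },
  isInternalBuild: boolean,
  random: () => number,
): boolean {
  if (env['CLAUDE_CODE_PROFILE_STARTUP'] === '1') return true // forced on
  if (isInternalBuild) return true // every internal build is sampled
  return random() < EXTERNAL_SAMPLE_RATE // 0.5% of external sessions
}

// The flag is fixed when the profiler is created, so a disabled profiler
// does nothing beyond one boolean check per checkpoint.
function createProfiler(enabled: boolean) {
  const timings: { name: string; at: number }[] = []
  return {
    timings,
    checkpoint(name: string): void {
      if (!enabled) return
      timings.push({ name, at: Date.now() })
    },
  }
}

const profiler = createProfiler(shouldProfile({}, false, Math.random))
profiler.checkpoint('main_tsx_entry')
```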

<h3 id="35-entrypoint-resolution">3.5 Entrypoint Resolution</h3>

<p>The system identifies its execution context early and sets <code class="language-plaintext highlighter-rouge">CLAUDE_CODE_ENTRYPOINT</code>:</p>

<table>
  <thead>
    <tr>
      <th>Value</th>
      <th>Context</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <td><code class="language-plaintext highlighter-rouge">cli</code></td>
      <td>Interactive terminal session</td>
    </tr>
    <tr>
      <td><code class="language-plaintext highlighter-rouge">sdk-cli</code></td>
      <td>Non-interactive (print mode, piped)</td>
    </tr>
    <tr>
      <td><code class="language-plaintext highlighter-rouge">mcp</code></td>
      <td>Running as an MCP server</td>
    </tr>
    <tr>
      <td><code class="language-plaintext highlighter-rouge">local-agent</code></td>
      <td>Spawned as a subagent</td>
    </tr>
    <tr>
      <td><code class="language-plaintext highlighter-rouge">claude-code-github-action</code></td>
      <td>GitHub Actions CI</td>
    </tr>
  </tbody>
</table>

<p>This gates feature loading — for example, REPL components only load in interactive mode.</p>

<h2 id="4-the-query-engine-brains-of-the-operation">4. The Query Engine: Brains of the Operation</h2>

<p>The query engine is the core loop that manages conversations with Claude. It’s split across two files: <code class="language-plaintext highlighter-rouge">QueryEngine.ts</code> (session-level orchestration) and <code class="language-plaintext highlighter-rouge">query.ts</code> (per-turn state machine).</p>

<h3 id="41-queryengine-the-session-coordinator">4.1 QueryEngine: The Session Coordinator</h3>

<p>The <code class="language-plaintext highlighter-rouge">QueryEngine</code> class is instantiated once per conversation. It persists state across turns and coordinates:</p>

<ul>
  <li><strong>System context building</strong> (git status, CLAUDE.md files, date)</li>
  <li><strong>Message management</strong> (accumulation, normalization, persistence)</li>
  <li><strong>API calls</strong> (streaming, retries, fallback)</li>
  <li><strong>Permission tracking</strong> (denial counts for SDK reporting)</li>
  <li><strong>Cost accumulation</strong> (per-model token tracking)</li>
</ul>

<p>Key method: <code class="language-plaintext highlighter-rouge">submitMessage(prompt, options)</code> — an AsyncGenerator that yields SDK messages throughout the turn. Before entering the query loop, it:</p>

<ol>
  <li>Creates a <strong>file history snapshot</strong> (for undo/restore)</li>
  <li>Records the transcript to disk <em>before</em> the API call (even if the process is killed mid-request, the conversation is resumable)</li>
  <li>Wraps <code class="language-plaintext highlighter-rouge">canUseTool</code> to track permission denials</li>
</ol>
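<p>The third step, wrapping <code class="language-plaintext highlighter-rouge">canUseTool</code> to count denials, is a plain decorator pattern. Here is a minimal sketch with simplified, hypothetical types; the real signature is not shown in this post and is likely async.</p>

```typescript
// Hypothetical, simplified shapes for illustration.
type Decision = { behavior: 'allow' } | { behavior: 'deny'; message: string }
type CanUseTool = (toolName: string, input: unknown) => Decision

// Wraps a permission check so every denial is recorded for later reporting.
function trackDenials(inner: CanUseTool) {
  const denials: { toolName: string; message: string }[] = []
  const wrapped: CanUseTool = (toolName, input) => {
    const decision = inner(toolName, input)
    if (decision.behavior === 'deny') {
      denials.push({ toolName, message: decision.message })
    }
    return decision // the caller sees the original decision unchanged
  }
  return { wrapped, denials }
}

// Example policy (made up): deny every Bash call, allow everything else.
const { wrapped, denials } = trackDenials((toolName) =>
  toolName === 'Bash' ? { behavior: 'deny', message: 'shell disabled' } : { behavior: 'allow' },
)
wrapped('Bash', { command: 'rm -rf build' })
wrapped('FileRead', { path: 'README.md' })
```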

<h3 id="42-the-query-loop-a-resilient-state-machine">4.2 The Query Loop: A Resilient State Machine</h3>

<p>The <code class="language-plaintext highlighter-rouge">query()</code> function in <code class="language-plaintext highlighter-rouge">query.ts</code> is where the magic happens. It’s a <code class="language-plaintext highlighter-rouge">while(true)</code> loop managing a mutable state object:</p>

<p><img src="/assets/claudecode/query-loop-state-machine.png" alt="Claude Code query loop state machine" /></p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>queryLoop():
  while(true):
    1. Prefetch memory + skills (parallel)
    2. Apply message compaction (snip, microcompact, context collapse)
    3. Call API with streaming
    4. Handle streaming errors (fallback, retry)
    5. Execute tools (concurrent or serial)
    6. Check recovery paths (compact, collapse drain, token escalation)
    7. Continue loop or return
</code></pre></div></div>

<p>The state object tracks everything needed across iterations:</p>

<div class="language-typescript highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kd">type</span> <span class="nx">State</span> <span class="o">=</span> <span class="p">{</span>
  <span class="na">messages</span><span class="p">:</span> <span class="nx">Message</span><span class="p">[]</span>
  <span class="na">toolUseContext</span><span class="p">:</span> <span class="nx">ToolAvailabilityContext</span>
  <span class="na">maxOutputTokensRecoveryCount</span><span class="p">:</span> <span class="kr">number</span>  <span class="c1">// 0–3 limit</span>
  <span class="na">autoCompactTracking</span><span class="p">:</span> <span class="nx">CompactState</span>     <span class="c1">// Compaction state + failure count</span>
  <span class="na">pendingToolUseSummary</span><span class="p">:</span> <span class="nb">Promise</span><span class="o">&lt;</span><span class="p">...</span><span class="o">&gt;</span>   <span class="c1">// Async tool summaries</span>
  <span class="na">transition</span><span class="p">:</span> <span class="nx">TransitionReason</span>          <span class="c1">// Why the loop didn't terminate</span>
<span class="p">}</span>
</code></pre></div></div>

<h3 id="43-streaming-and-tool-execution">4.3 Streaming and Tool Execution</h3>

<p>The query loop streams API responses and processes them incrementally:</p>

<ol>
  <li><strong>Stream start</strong> — Yields <code class="language-plaintext highlighter-rouge">stream_request_start</code> event</li>
  <li><strong>Accumulation</strong> — Collects <code class="language-plaintext highlighter-rouge">assistantMessages</code>, <code class="language-plaintext highlighter-rouge">toolUseBlocks</code>, <code class="language-plaintext highlighter-rouge">toolResults</code></li>
  <li><strong>Usage tracking</strong> — Tracks <code class="language-plaintext highlighter-rouge">currentMessageUsage</code> and <code class="language-plaintext highlighter-rouge">lastStopReason</code></li>
  <li><strong>Tool dispatch</strong> — Routes tool calls to the orchestrator</li>
</ol>

<p>Tool execution partitions calls into ordered batches, based on whether each tool is safe to run concurrently:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>partitionToolCalls(blocks[]):
  ├─ Batch 1: Read-only tools A, B, C  → runConcurrently(max=10)
  ├─ Batch 2: Write tool D              → runSerially()
  ├─ Batch 3: Read-only tools E, F      → runConcurrently(max=10)
  └─ ...
</code></pre></div></div>

<p>Each tool’s <code class="language-plaintext highlighter-rouge">isConcurrencySafe()</code> method determines if it can run in parallel. Read-only tools (glob, grep, file reads) run concurrently; write tools (edits, bash with side effects) run serially with context propagation between batches.</p>

<p>A <strong>streaming tool executor</strong> can even begin executing tools <em>while the model is still streaming</em>, reducing latency by overlapping computation and I/O.</p>
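<p>The batching described above can be sketched in a few lines. This is my own minimal version under assumed types: the boolean <code class="language-plaintext highlighter-rouge">concurrencySafe</code> field stands in for calling <code class="language-plaintext highlighter-rouge">isConcurrencySafe()</code>, and the <code class="language-plaintext highlighter-rouge">Batch</code> shape is hypothetical.</p>

```typescript
// Hypothetical minimal shapes for illustration.
type ToolCall = { name: string; concurrencySafe: boolean }
type Batch = { mode: 'concurrent' | 'serial'; calls: ToolCall[] }

// Groups consecutive concurrency-safe calls into one batch and gives every
// unsafe call its own serial batch, preserving the original order.
function partitionToolCalls(calls: ToolCall[]): Batch[] {
  const batches: Batch[] = []
  for (const call of calls) {
    const last = batches[batches.length - 1]
    if (call.concurrencySafe && last !== undefined && last.mode === 'concurrent') {
      last.calls.push(call)
    } else {
      batches.push({ mode: call.concurrencySafe ? 'concurrent' : 'serial', calls: [call] })
    }
  }
  return batches
}
```

<p>Because order is preserved, a write tool always sees the results of the reads that came before it, and the reads that follow it see the write.</p>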

<h3 id="44-token-budget-continuation">4.4 Token Budget Continuation</h3>

<p>When the model’s output budget is approaching exhaustion but the task isn’t complete, the engine:</p>

<ol>
  <li>Injects an invisible meta-message: <em>“Resume directly — no apology, no recap”</em></li>
  <li>Continues the loop with a <code class="language-plaintext highlighter-rouge">token_budget_continuation</code> transition</li>
  <li>Tracks cumulative tokens without interrupting the user</li>
  <li>Detects diminishing returns to avoid infinite loops</li>
</ol>

<p>The engine allows at most 3 consecutive output-token recovery attempts before surfacing the stop reason.</p>
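<p>To make the recovery cap concrete, here is a simplified loop under stated assumptions: <code class="language-plaintext highlighter-rouge">callModel</code> is a hypothetical stand-in for one streamed API call, and the stop-reason names are illustrative. The real loop is async and tracks far more state.</p>

```typescript
// Hypothetical sketch of the recovery cap; names and shapes are assumptions.
type StopReason = 'end_turn' | 'max_tokens'
const MAX_RECOVERY_ATTEMPTS = 3

// `callModel` stands in for one streamed API call and returns its stop reason.
function runTurn(callModel: (attempt: number) => StopReason) {
  let recoveryCount = 0
  let calls = 0
  while (true) {
    const stopReason = callModel(calls)
    calls += 1
    if (stopReason !== 'max_tokens') {
      return { stopReason, calls, recoveryCount } // normal completion
    }
    if (recoveryCount >= MAX_RECOVERY_ATTEMPTS) {
      // Cap reached: surface the stop reason instead of looping forever.
      return { stopReason, calls, recoveryCount }
    }
    // This is where the engine would inject the "resume directly" meta-message.
    recoveryCount += 1
  }
}
```

<p>With a model that always runs out of output tokens, this sketch makes one initial call plus three recovery calls and then gives up.</p>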

<h2 id="5-the-tool-system-60-tools-behind-a-single-interface">5. The Tool System: 60+ Tools Behind a Single Interface</h2>

<p>Every tool in Claude Code conforms to a single generic interface:</p>

<div class="language-typescript highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kr">interface</span> <span class="nx">Tool</span><span class="o">&lt;</span><span class="nx">Input</span><span class="p">,</span> <span class="nx">Output</span><span class="p">,</span> <span class="nx">Progress</span><span class="o">&gt;</span> <span class="p">{</span>
  <span class="na">name</span><span class="p">:</span> <span class="kr">string</span>
  <span class="nx">description</span><span class="p">():</span> <span class="kr">string</span>          <span class="c1">// Dynamic, permission-context-aware</span>
  <span class="nx">prompt</span><span class="p">():</span> <span class="kr">string</span>               <span class="c1">// System prompt additions</span>
  <span class="na">inputSchema</span><span class="p">:</span> <span class="nx">ZodSchema</span><span class="o">&lt;</span><span class="nx">Input</span><span class="o">&gt;</span>  <span class="c1">// Zod → JSON Schema for API</span>

  <span class="nx">call</span><span class="p">(</span><span class="na">input</span><span class="p">:</span> <span class="nx">Input</span><span class="p">,</span> <span class="na">context</span><span class="p">:</span> <span class="nx">ToolContext</span><span class="p">):</span> <span class="nb">Promise</span><span class="o">&lt;</span><span class="nx">ToolResult</span><span class="o">&lt;</span><span class="nx">Output</span><span class="o">&gt;&gt;</span>
  <span class="nx">checkPermissions</span><span class="p">(</span><span class="na">input</span><span class="p">:</span> <span class="nx">Input</span><span class="p">):</span> <span class="nx">PermissionResult</span>
  <span class="nx">validateInput</span><span class="p">(</span><span class="na">input</span><span class="p">:</span> <span class="nx">Input</span><span class="p">):</span> <span class="nx">ValidationResult</span>
  <span class="nx">isConcurrencySafe</span><span class="p">(</span><span class="na">input</span><span class="p">:</span> <span class="nx">Input</span><span class="p">):</span> <span class="nx">boolean</span>

  <span class="c1">// 4-tier rendering</span>
  <span class="nx">renderToolUseMessage</span><span class="p">(</span><span class="na">input</span><span class="p">:</span> <span class="nx">Input</span><span class="p">):</span> <span class="nx">ReactNode</span>
  <span class="nx">renderToolUseProgressMessage</span><span class="p">(</span><span class="na">input</span><span class="p">:</span> <span class="nx">Input</span><span class="p">,</span> <span class="na">progress</span><span class="p">:</span> <span class="nx">Progress</span><span class="p">):</span> <span class="nx">ReactNode</span>
  <span class="nx">renderToolResultMessage</span><span class="p">(</span><span class="na">output</span><span class="p">:</span> <span class="nx">Output</span><span class="p">):</span> <span class="nx">ReactNode</span>
  <span class="nx">renderToolUseErrorMessage</span><span class="p">(</span><span class="na">error</span><span class="p">:</span> <span class="nb">Error</span><span class="p">):</span> <span class="nx">ReactNode</span>

  <span class="nx">mapToolResultToToolResultBlockParam</span><span class="p">(</span><span class="na">output</span><span class="p">:</span> <span class="nx">Output</span><span class="p">,</span> <span class="na">id</span><span class="p">:</span> <span class="kr">string</span><span class="p">):</span> <span class="nx">ToolResultBlockParam</span>
<span class="p">}</span>
</code></pre></div></div>

<h3 id="51-the-tool-registry">5.1 The Tool Registry</h3>

<p>Tools are loaded through a feature-gated registry:</p>

<div class="language-typescript highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nx">assembleToolPool</span><span class="p">(</span><span class="nx">permissionContext</span><span class="p">,</span> <span class="nx">mcpTools</span><span class="p">):</span>
  <span class="mi">1</span><span class="p">.</span> <span class="nx">getTools</span><span class="p">(</span><span class="nx">permissionContext</span><span class="p">)</span>        <span class="c1">// Filter built-ins by deny rules</span>
  <span class="mi">2</span><span class="p">.</span> <span class="nx">filterToolsByDenyRules</span><span class="p">()</span>           <span class="c1">// Remove blanket-denied MCP tools</span>
  <span class="mi">3</span><span class="p">.</span> <span class="nx">uniqBy</span><span class="p">(</span><span class="nx">name</span><span class="p">)</span>                       <span class="c1">// Deduplicate (built-ins win)</span>
  <span class="mi">4</span><span class="p">.</span> <span class="nx">sort</span><span class="p">(</span><span class="nx">name</span><span class="p">)</span>                         <span class="c1">// Alphabetical for prompt cache stability</span>
</code></pre></div></div>

<p>Sorting by name is a subtle but important optimization: it keeps the tool list in the same order across requests, maximizing <strong>prompt cache hit rates</strong> on the API side.</p>
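<p>The filter, dedupe, and sort steps can be approximated as follows. This sketch simplifies the permission context to a plain list of denied names, and the <code class="language-plaintext highlighter-rouge">ToolInfo</code> shape is hypothetical.</p>

```typescript
// Hypothetical minimal shape; the real registry works with full Tool objects.
type ToolInfo = { name: string; source: 'builtin' | 'mcp' }

function assembleToolPool(builtins: ToolInfo[], mcpTools: ToolInfo[], denied: string[]): ToolInfo[] {
  // Drop anything matched by a blanket deny rule.
  const allowed = [...builtins, ...mcpTools].filter((tool) => !denied.includes(tool.name))
  // Built-ins come first in the list, so on a name clash the built-in is kept.
  const seen = new Set()
  const unique = allowed.filter((tool) => {
    if (seen.has(tool.name)) return false
    seen.add(tool.name)
    return true
  })
  // A plain code-unit comparison yields the same order on every machine and request.
  return unique.sort((a, b) => (a.name < b.name ? -1 : a.name > b.name ? 1 : 0))
}
```

<p>I used a plain comparison rather than <code class="language-plaintext highlighter-rouge">localeCompare</code> here because locale-dependent ordering could differ between machines, which would defeat the point of a stable prefix.</p>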

<h3 id="52-deferred-tool-discovery">5.2 Deferred Tool Discovery</h3>

<p>Not all 60+ tools are sent to the model in every request. Tools marked <code class="language-plaintext highlighter-rouge">shouldDefer: true</code> are hidden until the model explicitly searches for them via <code class="language-plaintext highlighter-rouge">ToolSearchTool</code>:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>Model: "I need to create a task..."
  → Calls ToolSearchTool("task create")
  → Returns TaskCreateTool schema
  → Model calls TaskCreateTool in the same turn
</code></pre></div></div>

<p>Roughly 18 tools are deferred, including LSP, TaskCreate, MCPTool, SkillTool, and EnterPlanMode. This keeps the base prompt under 200K tokens while allowing elastic discovery.</p>
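<p>A toy version of deferred discovery might look like this. The keyword matching is my own assumption; the post does not describe how <code class="language-plaintext highlighter-rouge">ToolSearchTool</code> actually ranks or matches tools.</p>

```typescript
// Hypothetical minimal shape for illustration.
type ToolEntry = { name: string; description: string; shouldDefer: boolean }

// Tools sent with every request: everything not marked as deferred.
function initialTools(all: ToolEntry[]): ToolEntry[] {
  return all.filter((tool) => !tool.shouldDefer)
}

// Assumed search semantics: a deferred tool matches when every query word
// appears in its name or description (case-insensitive).
function searchDeferredTools(all: ToolEntry[], query: string): ToolEntry[] {
  const words = query.toLowerCase().split(/\s+/).filter((w) => w.length > 0)
  return all.filter((tool) => {
    if (!tool.shouldDefer) return false
    const haystack = `${tool.name} ${tool.description}`.toLowerCase()
    return words.every((w) => haystack.includes(w))
  })
}
```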

<h3 id="53-key-tool-implementations">5.3 Key Tool Implementations</h3>

<h4 id="bashtool--command-execution-with-guardrails">BashTool — Command Execution with Guardrails</h4>

<p>The most frequently used tool runs shell commands with extensive safety:</p>

<ul>
  <li><strong>30K character result limit</strong> — Large outputs persist to disk with a preview</li>
  <li><strong>Sandbox awareness</strong> — Detects containerized vs. native execution</li>
  <li><strong>Background tasks</strong> — Auto-backgrounds commands exceeding 15 seconds</li>
  <li><strong>Search classification</strong> — Marks <code class="language-plaintext highlighter-rouge">ls</code>, <code class="language-plaintext highlighter-rouge">grep</code>, <code class="language-plaintext highlighter-rouge">cat</code> output as collapsible in the UI</li>
  <li><strong>Permission dialogs</strong> — <code class="language-plaintext highlighter-rouge">sed</code> edits show a preview before execution</li>
</ul>

<h4 id="fileedittool--precision-string-replacement">FileEditTool — Precision String Replacement</h4>

<p>Rather than rewriting entire files, the edit tool does surgical string replacement:</p>

<ul>
  <li><strong>Old/new string matching</strong> — Finds exact occurrences, replaces one or all</li>
  <li><strong>1 GiB size limit</strong> — Prevents OOM on massive files</li>
  <li><strong>Git-aware diffing</strong> — Shows before/after diff via <code class="language-plaintext highlighter-rouge">gitDiff()</code></li>
  <li><strong>Undo integration</strong> — Plugs into FileHistory for one-click undo</li>
</ul>

<h4 id="agenttool--subagent-spawning">AgentTool — Subagent Spawning</h4>

<p>Claude Code can spawn child agents for parallel work:</p>

<ul>
  <li><strong>Isolation modes</strong> — Worktree (isolated git branch) or remote (CCR)</li>
  <li><strong>Model selection</strong> — Override with <code class="language-plaintext highlighter-rouge">opus | sonnet | haiku</code></li>
  <li><strong>Background execution</strong> — Agents run async with notification on completion</li>
  <li><strong>Named addressing</strong> — <code class="language-plaintext highlighter-rouge">SendMessage</code> to named agents for multi-agent coordination</li>
  <li><strong>Permission inheritance</strong> — Child agents inherit or restrict parent permissions</li>
</ul>

<h4 id="greptool--content-search-ripgrep-wrapper">GrepTool — Content Search (Ripgrep Wrapper)</h4>

<p>Wraps <code class="language-plaintext highlighter-rouge">rg</code> with sensible defaults for LLM use:</p>

<ul>
  <li><strong>250-line default limit</strong> — Prevents context flooding</li>
  <li><strong>Multiline mode</strong> — <code class="language-plaintext highlighter-rouge">rg -U --multiline-dotall</code> for cross-line patterns</li>
  <li><strong>VCS exclusion</strong> — Auto-skips <code class="language-plaintext highlighter-rouge">.git</code>, <code class="language-plaintext highlighter-rouge">.svn</code>, <code class="language-plaintext highlighter-rouge">.hg</code></li>
  <li><strong>Three output modes</strong> — Content, file paths only, or match counts</li>
</ul>

<h4 id="lsptool--language-intelligence">LSPTool — Language Intelligence</h4>

<p>Nine operations powered by the Language Server Protocol:</p>

<ul>
  <li><code class="language-plaintext highlighter-rouge">goToDefinition</code>, <code class="language-plaintext highlighter-rouge">findReferences</code>, <code class="language-plaintext highlighter-rouge">hover</code></li>
  <li><code class="language-plaintext highlighter-rouge">documentSymbol</code>, <code class="language-plaintext highlighter-rouge">workspaceSymbol</code></li>
  <li><code class="language-plaintext highlighter-rouge">goToImplementation</code>, <code class="language-plaintext highlighter-rouge">prepareCallHierarchy</code></li>
  <li><code class="language-plaintext highlighter-rouge">incomingCalls</code>, <code class="language-plaintext highlighter-rouge">outgoingCalls</code></li>
</ul>

<p>The tool is only loaded when an LSP server is connected, and it is deferred by default.</p>

<h4 id="websearchtool--native-web-search">WebSearchTool — Native Web Search</h4>

<p>Server-side web search (beta feature):</p>

<ul>
  <li><strong>Max 8 searches</strong> per invocation</li>
  <li><strong>Domain filtering</strong> — <code class="language-plaintext highlighter-rouge">allowed_domains</code> and <code class="language-plaintext highlighter-rouge">blocked_domains</code> parameters</li>
  <li><strong>Streaming results</strong> — Interleaves text and citation blocks</li>
</ul>

<h3 id="54-tool-result-budgeting">5.4 Tool Result Budgeting</h3>

<p>Every tool has a <code class="language-plaintext highlighter-rouge">maxResultSizeChars</code> limit:</p>

<table>
  <thead>
    <tr>
      <th>Tool</th>
      <th>Limit</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <td>BashTool</td>
      <td>30,000 chars</td>
    </tr>
    <tr>
      <td>GrepTool</td>
      <td>20,000 chars</td>
    </tr>
    <tr>
      <td>FileReadTool</td>
      <td>Infinity (never persists)</td>
    </tr>
  </tbody>
</table>

<p>When output exceeds the limit, it’s saved to <code class="language-plaintext highlighter-rouge">~/.claude/tool-results/{uuid}/output.txt</code> and the model receives a preview with a file reference. FileReadTool is exempt because persisting its output would create a circular dependency (Read → persist → model reads persisted file → …).</p>
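<p>The budgeting rule can be sketched as a pure function. Everything here is an assumption for illustration: <code class="language-plaintext highlighter-rouge">persist</code> is a hypothetical callback standing in for the disk write, and the preview length and message wording are made up.</p>

```typescript
// Hypothetical sketch: `persist` stands in for writing the full output to disk
// and returns the path it was saved under.
function budgetToolResult(
  output: string,
  maxResultSizeChars: number,
  persist: (fullOutput: string) => string,
  previewChars: number = 2000,
): string {
  // Within budget (always true for a limit of Infinity): pass through untouched.
  if (output.length <= maxResultSizeChars) return output
  const path = persist(output)
  const preview = output.slice(0, Math.min(previewChars, maxResultSizeChars))
  return `${preview}\n[Output truncated: ${output.length} chars total. Full output saved to ${path}]`
}
```

<p>Passing <code class="language-plaintext highlighter-rouge">Infinity</code> as the limit reproduces the FileReadTool exemption: the comparison never fails, so nothing is ever persisted.</p>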

<h3 id="55-lazy-schemas">5.5 Lazy Schemas</h3>

<p>Tool input schemas use a <code class="language-plaintext highlighter-rouge">lazySchema()</code> factory that defers Zod instantiation:</p>

<div class="language-typescript highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kd">const</span> <span class="nx">schema</span> <span class="o">=</span> <span class="nx">lazySchema</span><span class="p">(()</span> <span class="o">=&gt;</span> <span class="nx">z</span><span class="p">.</span><span class="nx">object</span><span class="p">({</span>
  <span class="na">command</span><span class="p">:</span> <span class="nx">z</span><span class="p">.</span><span class="kr">string</span><span class="p">(),</span>
  <span class="na">timeout</span><span class="p">:</span> <span class="nx">z</span><span class="p">.</span><span class="kr">number</span><span class="p">().</span><span class="nx">optional</span><span class="p">(),</span>
<span class="p">}))</span>
</code></pre></div></div>

<p>This prevents import cycles (Tool.ts ← tools/ ← Tool.ts) and enables mid-session schema changes when feature flags flip.</p>
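<p>For intuition, here is what a deferring factory could look like. To keep the sketch dependency-free, a tiny <code class="language-plaintext highlighter-rouge">Schema</code> type stands in for a Zod schema, and this <code class="language-plaintext highlighter-rouge">lazySchema</code> is my guess at the shape, not the real helper.</p>

```typescript
// Stand-in for a Zod schema so the sketch runs without dependencies.
type Schema = { fields: string[] }

// Hypothetical lazySchema: the factory runs on first access, not at import time,
// so importing this module never touches the schema library.
function lazySchema(factory: () => Schema): () => Schema {
  let cached: Schema | undefined
  return () => {
    if (cached === undefined) cached = factory()
    return cached
  }
}

let built = 0
const bashInputSchema = lazySchema(() => {
  built += 1 // counts how many times the schema is actually constructed
  return { fields: ['command', 'timeout'] }
})
```

<p>Note that this sketch caches the first result. Supporting a mid-session feature-flag flip would need a cache reset or no cache at all, and the post does not say which approach the real code takes.</p>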

<h2 id="6-the-permission-system-safety-at-every-layer">6. The Permission System: Safety at Every Layer</h2>

<p>Claude Code’s permission system is one of its most sophisticated subsystems — a multi-layered defense that balances safety with developer productivity.</p>

<h3 id="61-permission-modes">6.1 Permission Modes</h3>

<p>Five public modes control the default behavior:</p>

<table>
  <thead>
    <tr>
      <th>Mode</th>
      <th>Behavior</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <td><code class="language-plaintext highlighter-rouge">default</code></td>
      <td>Ask for destructive operations</td>
    </tr>
    <tr>
      <td><code class="language-plaintext highlighter-rouge">plan</code></td>
      <td>Read-only + AskUserQuestion (design phase)</td>
    </tr>
    <tr>
      <td><code class="language-plaintext highlighter-rouge">acceptEdits</code></td>
      <td>Auto-approve file edits, ask for shell</td>
    </tr>
    <tr>
      <td><code class="language-plaintext highlighter-rouge">bypassPermissions</code></td>
      <td>Full access (dangerous, opt-in)</td>
    </tr>
    <tr>
      <td><code class="language-plaintext highlighter-rouge">dontAsk</code></td>
      <td>Auto-deny unsafe commands</td>
    </tr>
  </tbody>
</table>

<p>Plus two internal modes:</p>
<ul>
  <li><code class="language-plaintext highlighter-rouge">auto</code> — ML classifier evaluates each command</li>
  <li><code class="language-plaintext highlighter-rouge">bubble</code> — Internal delegation to parent agent</li>
</ul>

<h3 id="62-rule-system">6.2 Rule System</h3>

<p>Permission rules form a priority cascade:</p>

<div class="language-typescript highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kd">type</span> <span class="nx">PermissionRule</span> <span class="o">=</span> <span class="p">{</span>
  <span class="na">source</span><span class="p">:</span> <span class="dl">'</span><span class="s1">userSettings</span><span class="dl">'</span> <span class="o">|</span> <span class="dl">'</span><span class="s1">projectSettings</span><span class="dl">'</span> <span class="o">|</span> <span class="dl">'</span><span class="s1">localSettings</span><span class="dl">'</span> <span class="o">|</span> <span class="dl">'</span><span class="s1">cliArg</span><span class="dl">'</span> <span class="o">|</span> <span class="dl">'</span><span class="s1">session</span><span class="dl">'</span>
  <span class="na">ruleBehavior</span><span class="p">:</span> <span class="dl">'</span><span class="s1">allow</span><span class="dl">'</span> <span class="o">|</span> <span class="dl">'</span><span class="s1">deny</span><span class="dl">'</span> <span class="o">|</span> <span class="dl">'</span><span class="s1">ask</span><span class="dl">'</span>
  <span class="na">ruleValue</span><span class="p">:</span> <span class="p">{</span> <span class="na">toolName</span><span class="p">:</span> <span class="kr">string</span><span class="p">,</span> <span class="nx">ruleContent</span><span class="p">?:</span> <span class="kr">string</span> <span class="p">}</span>
<span class="p">}</span>
</code></pre></div></div>

<p>Rules support glob-style patterns. For example, <code class="language-plaintext highlighter-rouge">Bash(git push*)</code> matches any command that starts with <code class="language-plaintext highlighter-rouge">git push</code>, and <code class="language-plaintext highlighter-rouge">Bash(python:*)</code> matches all Python commands.</p>
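<p>A minimal matcher for rule strings like the two examples above. The parsing and the exact semantics of <code class="language-plaintext highlighter-rouge">:*</code> versus a trailing <code class="language-plaintext highlighter-rouge">*</code> are assumptions on my part; <code class="language-plaintext highlighter-rouge">parseRule</code> and <code class="language-plaintext highlighter-rouge">matchesRule</code> are hypothetical names.</p>

```typescript
type RuleValue = { toolName: string; ruleContent?: string }

// Parses "Bash(git push*)" into { toolName: 'Bash', ruleContent: 'git push*' }.
function parseRule(rule: string): RuleValue {
  const open = rule.indexOf('(')
  if (open === -1 || !rule.endsWith(')')) return { toolName: rule }
  return { toolName: rule.slice(0, open), ruleContent: rule.slice(open + 1, -1) }
}

// Assumed semantics: "prefix:*" matches the prefix as a whole command word,
// a trailing "*" matches any string with that prefix, anything else is exact.
function matchesRule(rule: RuleValue, toolName: string, command: string): boolean {
  if (rule.toolName !== toolName) return false
  const content = rule.ruleContent
  if (content === undefined) return true // tool-level rule: matches every input
  if (content.endsWith(':*')) {
    const prefix = content.slice(0, -2)
    return command === prefix || command.startsWith(prefix + ' ')
  }
  if (content.endsWith('*')) return command.startsWith(content.slice(0, -1))
  return command === content
}
```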

<h3 id="63-decision-pipeline">6.3 Decision Pipeline</h3>

<p>For every tool call:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>1. validateInput()        → Tool-specific validation (size limits, blocked patterns)
2. checkPermissions()     → Rule matching + classifier + hooks
3. Decision:
   ├─ allow  → Execute immediately
   ├─ deny   → Return error to model
   └─ ask    → Show permission dialog to user
4. Pre/Post hooks         → Can modify input or block execution
</code></pre></div></div>
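<p>Steps 1 to 3 of the pipeline, reduced to a synchronous sketch. Hooks (step 4) are left out, and the <code class="language-plaintext highlighter-rouge">PipelineTool</code> and <code class="language-plaintext highlighter-rouge">Outcome</code> types are hypothetical simplifications of the real interfaces.</p>

```typescript
// Hypothetical simplified shapes; the real tool interface is generic and async.
type Outcome =
  | { status: 'executed'; result: string }
  | { status: 'rejected'; reason: string }

type PipelineTool = {
  validateInput: (input: string) => string | null // error message, or null if valid
  checkPermissions: (input: string) => 'allow' | 'deny' | 'ask'
  call: (input: string) => string
}

function runToolCall(tool: PipelineTool, input: string, askUser: (input: string) => boolean): Outcome {
  // 1. Tool-specific validation runs before any permission logic.
  const validationError = tool.validateInput(input)
  if (validationError !== null) return { status: 'rejected', reason: validationError }

  // 2 + 3. Resolve the three-way decision.
  const decision = tool.checkPermissions(input)
  if (decision === 'deny') return { status: 'rejected', reason: 'denied by rule' }
  if (decision === 'ask' && !askUser(input)) return { status: 'rejected', reason: 'declined by user' }

  return { status: 'executed', result: tool.call(input) }
}
```

<p>A rejection is returned as data rather than thrown, which mirrors the idea that a denial goes back to the model as an error it can react to.</p>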

<h3 id="64-dangerous-pattern-detection">6.4 Dangerous Pattern Detection</h3>

<p>The system identifies permission rules that are too broad to auto-allow:</p>

<ul>
  <li><strong>Tool-level allow</strong> (no content restriction) — Would allow ALL commands</li>
  <li><strong>Interpreter prefixes</strong> — <code class="language-plaintext highlighter-rouge">python:*</code>, <code class="language-plaintext highlighter-rouge">node:*</code>, <code class="language-plaintext highlighter-rouge">ruby:*</code> (arbitrary code execution)</li>
  <li><strong>Wildcards</strong> — <code class="language-plaintext highlighter-rouge">*</code>, <code class="language-plaintext highlighter-rouge">python*</code> (too permissive)</li>
</ul>
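<p>The three categories above can be expressed as a small predicate. This is a sketch under assumptions: the interpreter list only contains the three names mentioned here, and the function name is hypothetical.</p>

```typescript
// Illustrative check only; the interpreter list and rule shape are assumptions.
type AllowRule = { toolName: string; ruleContent?: string }

const INTERPRETERS = ['python', 'node', 'ruby']

function isTooBroadToAutoAllow(rule: AllowRule): boolean {
  const content = rule.ruleContent
  // Tool-level allow: no content restriction at all.
  if (content === undefined || content === '') return true
  // Bare wildcard.
  if (content === '*') return true
  // Interpreter prefixes and interpreter wildcards (arbitrary code execution).
  return INTERPRETERS.some(
    interpreter => content === `${interpreter}:*` || content === `${interpreter}*`,
  )
}
```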

<h3 id="65-three-way-permission-result">6.5 Three-Way Permission Result</h3>

<p>Every permission check returns a typed union:</p>

<div class="language-typescript highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kd">type</span> <span class="nx">PermissionResult</span> <span class="o">=</span>
  <span class="o">|</span> <span class="p">{</span> <span class="na">behavior</span><span class="p">:</span> <span class="dl">'</span><span class="s1">allow</span><span class="dl">'</span><span class="p">,</span> <span class="nx">updatedInput</span><span class="p">?:</span> <span class="nx">Input</span> <span class="p">}</span>   <span class="c1">// Hooks can modify input</span>
  <span class="o">|</span> <span class="p">{</span> <span class="na">behavior</span><span class="p">:</span> <span class="dl">'</span><span class="s1">ask</span><span class="dl">'</span><span class="p">,</span> <span class="na">message</span><span class="p">:</span> <span class="kr">string</span> <span class="p">}</span>           <span class="c1">// Prompt user</span>
  <span class="o">|</span> <span class="p">{</span> <span class="na">behavior</span><span class="p">:</span> <span class="dl">'</span><span class="s1">deny</span><span class="dl">'</span><span class="p">,</span> <span class="na">message</span><span class="p">:</span> <span class="kr">string</span> <span class="p">}</span>          <span class="c1">// Block with explanation</span>
</code></pre></div></div>

<p>The <code class="language-plaintext highlighter-rouge">updatedInput</code> field is powerful: pre-execution hooks can transparently modify tool parameters (e.g., adding safety flags to shell commands).</p>
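<p>A caller of this union might look roughly like the sketch below. <code class="language-plaintext highlighter-rouge">Input</code>, <code class="language-plaintext highlighter-rouge">runTool</code>, and <code class="language-plaintext highlighter-rouge">promptUser</code> are hypothetical stand-ins I introduced to keep the example self-contained; the point is the exhaustive <code class="language-plaintext highlighter-rouge">switch</code> and the preference for <code class="language-plaintext highlighter-rouge">updatedInput</code> when a hook supplied one.</p>

```typescript
// Sketch: Input, runTool and promptUser are hypothetical stand-ins.
type Input = { command: string }
type PermissionResult =
  | { behavior: 'allow'; updatedInput?: Input }
  | { behavior: 'ask'; message: string }
  | { behavior: 'deny'; message: string }

function applyPermission(
  result: PermissionResult,
  input: Input,
  runTool: (input: Input) => string,
  promptUser: (message: string) => boolean,
): string {
  switch (result.behavior) {
    case 'allow':
      // Prefer the hook-modified input when one was supplied.
      return runTool(result.updatedInput ?? input)
    case 'ask':
      return promptUser(result.message) ? runTool(input) : 'Error: user declined'
    case 'deny':
      // The explanation goes back to the model as an error.
      return `Error: ${result.message}`
  }
}
```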

<h2 id="7-terminal-ui-react-but-for-your-terminal">7. Terminal UI: React, but for Your Terminal</h2>

<p>Perhaps the most impressive subsystem in Claude Code is its terminal rendering framework: a custom React renderer for the terminal with its own layout, screen buffer, and frame-diffing stages, much like a small browser rendering pipeline.</p>

<h3 id="71-the-rendering-pipeline">7.1 The Rendering Pipeline</h3>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>React Components
    ↓
Custom React Reconciler (createReconciler API)
    ↓
Virtual DOM Tree (ink-box, ink-text, ink-root, ink-link)
    ↓
Yoga Layout Engine (flexbox calculations)
    ↓
Output Builder (write / blit / clip / clear / shift operations)
    ↓
Screen Buffer (2D cell array with interned styles + hyperlinks)
    ↓
Diff Engine (compare with previous frame)
    ↓
ANSI Escape Sequences → TTY
</code></pre></div></div>

<h3 id="72-custom-react-reconciler">7.2 Custom React Reconciler</h3>

<p>Claude Code implements a custom React host configuration using <code class="language-plaintext highlighter-rouge">createReconciler</code>:</p>

<p><strong>Element types:</strong></p>
<ul>
  <li><code class="language-plaintext highlighter-rouge">ink-root</code> — Root container</li>
  <li><code class="language-plaintext highlighter-rouge">ink-box</code> — Flexbox layout container (like <code class="language-plaintext highlighter-rouge">&lt;div&gt;</code>)</li>
  <li><code class="language-plaintext highlighter-rouge">ink-text</code> — Text content</li>
  <li><code class="language-plaintext highlighter-rouge">ink-virtual-text</code> — Nested text (layout optimization)</li>
  <li><code class="language-plaintext highlighter-rouge">ink-link</code> — OSC 8 hyperlinks</li>
  <li><code class="language-plaintext highlighter-rouge">ink-progress</code> — Progress indicators</li>
  <li><code class="language-plaintext highlighter-rouge">ink-raw-ansi</code> — Raw ANSI passthrough (bypasses measurement)</li>
</ul>

<p>The reconciler tracks three categories of changes separately:</p>
<ul>
  <li><strong>Styles</strong> — Passed to Yoga for layout recalculation</li>
  <li><strong>Text styles</strong> — Colorization, bold, italic, etc.</li>
  <li><strong>Event handlers</strong> — Stored separately to prevent handler identity changes from invalidating the dirty flag</li>
</ul>

<h3 id="73-yoga-layout-engine">7.3 Yoga Layout Engine</h3>

<p>Rather than manual ANSI cursor positioning, Claude Code uses <strong>Yoga</strong> — Facebook’s cross-platform flexbox implementation — for layout:</p>

<div class="language-tsx highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="p">&lt;</span><span class="nc">Box</span> <span class="na">flexDirection</span><span class="p">=</span><span class="s">"row"</span> <span class="na">gap</span><span class="p">=</span><span class="si">{</span><span class="mi">1</span><span class="si">}</span> <span class="na">paddingX</span><span class="p">=</span><span class="si">{</span><span class="mi">2</span><span class="si">}</span><span class="p">&gt;</span>
  <span class="p">&lt;</span><span class="nc">Box</span> <span class="na">flexGrow</span><span class="p">=</span><span class="si">{</span><span class="mi">1</span><span class="si">}</span><span class="p">&gt;</span>
    <span class="p">&lt;</span><span class="nc">Text</span><span class="p">&gt;</span>Left panel<span class="p">&lt;/</span><span class="nc">Text</span><span class="p">&gt;</span>
  <span class="p">&lt;/</span><span class="nc">Box</span><span class="p">&gt;</span>
  <span class="p">&lt;</span><span class="nc">Box</span> <span class="na">width</span><span class="p">=</span><span class="si">{</span><span class="mi">30</span><span class="si">}</span><span class="p">&gt;</span>
    <span class="p">&lt;</span><span class="nc">Text</span><span class="p">&gt;</span>Right sidebar<span class="p">&lt;/</span><span class="nc">Text</span><span class="p">&gt;</span>
  <span class="p">&lt;/</span><span class="nc">Box</span><span class="p">&gt;</span>
<span class="p">&lt;/</span><span class="nc">Box</span><span class="p">&gt;</span>
</code></pre></div></div>

<p>This brings responsive, declarative layouts to the terminal. Text nodes register measure functions with Yoga:</p>

<div class="language-typescript highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nx">node</span><span class="p">.</span><span class="nx">yogaNode</span><span class="p">.</span><span class="nx">setMeasureFunc</span><span class="p">((</span><span class="nx">width</span><span class="p">,</span> <span class="nx">measureMode</span><span class="p">)</span> <span class="o">=&gt;</span> <span class="p">{</span>
  <span class="kd">const</span> <span class="nx">wrapped</span> <span class="o">=</span> <span class="nx">wrapText</span><span class="p">(</span><span class="nx">text</span><span class="p">,</span> <span class="nx">width</span><span class="p">)</span>
  <span class="k">return</span> <span class="p">{</span> <span class="na">width</span><span class="p">:</span> <span class="nx">actualWidth</span><span class="p">,</span> <span class="na">height</span><span class="p">:</span> <span class="nx">numLines</span> <span class="p">}</span>
<span class="p">})</span>
</code></pre></div></div>

<p>A generational reset pattern prevents memory leaks from native Yoga bindings:</p>

<div class="language-typescript highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">if</span> <span class="p">(</span><span class="nx">now</span> <span class="o">-</span> <span class="nx">lastPoolResetTime</span> <span class="o">&gt;</span> <span class="nx">SESSION_POOL_RESET_MS</span><span class="p">)</span> <span class="p">{</span>
  <span class="nx">migrateScreenPools</span><span class="p">()</span>  <span class="c1">// Free and recreate all Yoga nodes</span>
<span class="p">}</span>
</code></pre></div></div>

<h3 id="74-the-dirty-flag-cascade">7.4 The Dirty Flag Cascade</h3>

<p>Nodes track a <code class="language-plaintext highlighter-rouge">dirty</code> flag that cascades upward:</p>

<div class="language-typescript highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kd">function</span> <span class="nx">markDirty</span><span class="p">(</span><span class="nx">node</span><span class="p">:</span> <span class="nx">DOMElement</span><span class="p">)</span> <span class="p">{</span>
  <span class="nx">node</span><span class="p">.</span><span class="nx">dirty</span> <span class="o">=</span> <span class="kc">true</span>
  <span class="k">if</span> <span class="p">(</span><span class="nx">node</span><span class="p">.</span><span class="nx">parentNode</span><span class="p">)</span> <span class="nx">markDirty</span><span class="p">(</span><span class="nx">node</span><span class="p">.</span><span class="nx">parentNode</span><span class="p">)</span>
<span class="p">}</span>
</code></pre></div></div>

<p>Because the flag propagates all the way to the root, a layout pass can start at the root, descend only into dirty nodes, and skip every clean subtree, which keeps re-layout incremental.</p>
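<p>Here is a small sketch of how an upward cascade pairs with a top-down pass that skips clean subtrees. The node shape and <code class="language-plaintext highlighter-rouge">layoutDirty</code> are hypothetical, and the early exit in <code class="language-plaintext highlighter-rouge">markDirty</code> is my own addition: it relies on the invariant that a dirty node always has dirty ancestors.</p>

```typescript
// Minimal sketch with a hypothetical node shape; not the actual DOMElement type.
type LayoutNode = {
  label: string
  dirty: boolean
  parentNode?: LayoutNode
  children: LayoutNode[]
}

function markDirty(node: LayoutNode): void {
  // Stop early: if a node is already dirty, so are all of its ancestors.
  for (let n: LayoutNode | undefined = node; n && !n.dirty; n = n.parentNode) {
    n.dirty = true
  }
}

// Visit only dirty nodes, clear their flags, and return the labels visited.
function layoutDirty(node: LayoutNode, visited: string[] = []): string[] {
  if (!node.dirty) return visited // clean subtree: skip it entirely
  visited.push(node.label)
  node.dirty = false
  for (const child of node.children) layoutDirty(child, visited)
  return visited
}
```

<p>Marking one leaf dirty touches only the path from that leaf to the root; sibling subtrees are never visited.</p>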

<h3 id="75-double-buffering-and-blitting">7.5 Double Buffering and Blitting</h3>

<p>The renderer uses classic graphics techniques:</p>

<p><strong>Double buffering:</strong></p>
<div class="language-typescript highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">private</span> <span class="nx">frontFrame</span><span class="p">:</span> <span class="nx">Frame</span>   <span class="c1">// Currently displayed</span>
<span class="k">private</span> <span class="nx">backFrame</span><span class="p">:</span> <span class="nx">Frame</span>    <span class="c1">// Being rendered into</span>
<span class="c1">// After render: swap pointers</span>
<span class="p">[</span><span class="k">this</span><span class="p">.</span><span class="nx">frontFrame</span><span class="p">,</span> <span class="k">this</span><span class="p">.</span><span class="nx">backFrame</span><span class="p">]</span> <span class="o">=</span> <span class="p">[</span><span class="k">this</span><span class="p">.</span><span class="nx">backFrame</span><span class="p">,</span> <span class="k">this</span><span class="p">.</span><span class="nx">frontFrame</span><span class="p">]</span>
</code></pre></div></div>

<p><strong>Blitting (copy unchanged regions):</strong></p>
<div class="language-typescript highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nx">blit</span><span class="p">(</span><span class="nx">src</span><span class="p">:</span> <span class="nx">Screen</span><span class="p">,</span> <span class="nx">x</span><span class="p">,</span> <span class="nx">y</span><span class="p">,</span> <span class="nx">width</span><span class="p">,</span> <span class="nx">height</span><span class="p">)</span>
<span class="c1">// If a region hasn't changed, copy from previous frame</span>
<span class="c1">// instead of re-rendering — the "GPU blit" technique for terminals</span>
</code></pre></div></div>

<p>When a selection overlay is applied, it “contaminates” the frame, disabling blit for the next render to prevent visual artifacts.</p>

<h3 id="76-screen-buffer-the-2d-cell-model">7.6 Screen Buffer: The 2D Cell Model</h3>

<p>The screen is a 2D array of cells:</p>

<div class="language-typescript highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kd">type</span> <span class="nx">Cell</span> <span class="o">=</span> <span class="p">{</span>
  <span class="na">char</span><span class="p">:</span> <span class="kr">string</span>          <span class="c1">// Interned via CharPool</span>
  <span class="na">width</span><span class="p">:</span> <span class="nx">CellWidth</span>      <span class="c1">// 1 (normal), 2 (wide/CJK/emoji), -1 (tail of wide char)</span>
  <span class="na">styleId</span><span class="p">:</span> <span class="kr">number</span>       <span class="c1">// Interned via StylePool</span>
  <span class="nx">hyperlink</span><span class="p">?:</span> <span class="kr">number</span>    <span class="c1">// Interned via HyperlinkPool</span>
<span class="p">}</span>
</code></pre></div></div>

<p><strong>Three interning pools</strong> minimize memory and enable O(1) comparisons:</p>

<ul>
  <li><strong>CharPool</strong> — Deduplicates character strings, returns integer IDs</li>
  <li><strong>StylePool</strong> — Deduplicates ANSI style combinations, pre-computes transition sequences</li>
  <li><strong>HyperlinkPool</strong> — Deduplicates OSC 8 URLs (reset every 5 minutes to bound growth)</li>
</ul>

<p>The style pool’s <code class="language-plaintext highlighter-rouge">transition()</code> method is especially clever:</p>

<div class="language-typescript highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c1">// Pre-computed: "how to go from style A to style B"</span>
<span class="nx">transition</span><span class="p">(</span><span class="nx">fromId</span><span class="p">:</span> <span class="kr">number</span><span class="p">,</span> <span class="nx">toId</span><span class="p">:</span> <span class="kr">number</span><span class="p">):</span> <span class="kr">string</span> <span class="p">{</span>
  <span class="kd">const</span> <span class="nx">key</span> <span class="o">=</span> <span class="nx">fromId</span> <span class="o">*</span> <span class="mh">0x100000</span> <span class="o">+</span> <span class="nx">toId</span>
  <span class="k">return</span> <span class="nx">transitionCache</span><span class="p">.</span><span class="kd">get</span><span class="p">(</span><span class="nx">key</span><span class="p">)</span>  <span class="c1">// O(1) vs. diffing AnsiCode arrays</span>
<span class="p">}</span>
</code></pre></div></div>
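<p>The interning idea itself is simple enough to sketch. The class below is a generic, hypothetical pool (not the actual <code class="language-plaintext highlighter-rouge">StylePool</code>), plus the same key-packing trick used by <code class="language-plaintext highlighter-rouge">transition()</code>. Once values are interned, equality checks become integer comparisons.</p>

```typescript
// Generic interning pool sketch; class and method names are hypothetical.
class InternPool {
  private ids = new Map<string, number>()
  private values: string[] = []

  // Return a stable integer ID, allocating one the first time a value is seen.
  intern(value: string): number {
    let id = this.ids.get(value)
    if (id === undefined) {
      id = this.values.length
      this.values.push(value)
      this.ids.set(value, id)
    }
    return id
  }

  lookup(id: number): string | undefined {
    return this.values[id]
  }
}

// Pack two IDs into one numeric cache key. This stays collision-free as long
// as both IDs are below 0x100000 (roughly one million distinct entries).
function pairKey(fromId: number, toId: number): number {
  return fromId * 0x100000 + toId
}
```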

<h3 id="77-scroll-optimization">7.7 Scroll Optimization</h3>

<p>ScrollBox uses hardware scroll regions when available:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>CSI top;bottom r    → Set scroll region (DECSTBM)
CSI n S             → Scroll up n lines (SU)
</code></pre></div></div>

<p>This is dramatically faster than rewriting 50+ rows of content. For smooth animation, scroll deltas accumulate and drain at terminal-specific rates:</p>

<div class="language-typescript highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c1">// Native terminals: proportional drain (~3/4 per frame)</span>
<span class="kd">const</span> <span class="nx">step</span> <span class="o">=</span> <span class="nb">Math</span><span class="p">.</span><span class="nx">max</span><span class="p">(</span><span class="nx">MIN</span><span class="p">,</span> <span class="p">(</span><span class="nx">abs</span> <span class="o">*</span> <span class="mi">3</span><span class="p">)</span> <span class="o">&gt;&gt;</span> <span class="mi">2</span><span class="p">)</span>

<span class="c1">// xterm.js: adaptive (instant for ≤5, smaller steps for fast scrolls)</span>
<span class="kd">const</span> <span class="nx">step</span> <span class="o">=</span> <span class="nx">abs</span> <span class="o">&lt;=</span> <span class="mi">5</span> <span class="p">?</span> <span class="nx">abs</span> <span class="p">:</span> <span class="nx">abs</span> <span class="o">&lt;</span> <span class="mi">12</span> <span class="p">?</span> <span class="mi">2</span> <span class="p">:</span> <span class="mi">3</span>
</code></pre></div></div>
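<p>To see how the two policies behave, the sketch below drains a pending scroll distance frame by frame. The step formulas are the ones shown above; <code class="language-plaintext highlighter-rouge">MIN_STEP = 1</code> and the <code class="language-plaintext highlighter-rouge">drain</code> helper are assumptions of mine, and only the magnitude of the delta is modeled.</p>

```typescript
// Simulation of the two drain policies; MIN_STEP = 1 is an assumed value.
const MIN_STEP = 1

function nativeStep(abs: number): number {
  return Math.max(MIN_STEP, (abs * 3) >> 2) // about 3/4 of what is pending
}

function xtermStep(abs: number): number {
  return abs <= 5 ? abs : abs < 12 ? 2 : 3
}

// Drain a pending scroll distance (magnitude only) and return per-frame steps.
function drain(pending: number, stepFor: (abs: number) => number): number[] {
  const steps: number[] = []
  while (pending > 0) {
    const step = Math.min(pending, stepFor(pending))
    steps.push(step)
    pending -= step
  }
  return steps
}
```

<p>For a 20-line delta, the proportional policy finishes in four frames (15, 3, 1, 1), while the xterm.js policy takes seven smaller ones (3, 3, 3, 2, 2, 2, 5).</p>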

<h3 id="78-event-system">7.8 Event System</h3>

<p>Events follow DOM semantics with capture and bubble phases:</p>

<div class="language-typescript highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kd">function</span> <span class="nx">collectListeners</span><span class="p">(</span><span class="nx">target</span><span class="p">,</span> <span class="nx">event</span><span class="p">):</span> <span class="nx">DispatchListener</span><span class="p">[]</span> <span class="p">{</span>
  <span class="c1">// Walk from target to root</span>
  <span class="c1">// Capture handlers: root-first</span>
  <span class="c1">// Bubble handlers: target-first</span>
<span class="p">}</span>
</code></pre></div></div>
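<p>Filling in the body of that function, one possible implementation looks like this. The node and handler shapes are hypothetical; the ordering logic is the part that matters.</p>

```typescript
// Hypothetical node and handler shapes for illustration.
type Handler = (eventName: string) => void
type EventNode = {
  parentNode?: EventNode
  capture?: Record<string, Handler>
  bubble?: Record<string, Handler>
}

function collectListeners(target: EventNode, eventName: string): Handler[] {
  const capture: Handler[] = []
  const bubble: Handler[] = []
  // Walk from the target up to the root.
  for (let node: EventNode | undefined = target; node; node = node.parentNode) {
    const c = node.capture?.[eventName]
    const b = node.bubble?.[eventName]
    if (c) capture.unshift(c) // prepend while walking up, so the root ends up first
    if (b) bubble.push(b) // append while walking up, so the target stays first
  }
  return [...capture, ...bubble]
}
```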

<p>Event priority follows the discrete/continuous split that React uses for browser events:</p>

<table>
  <thead>
    <tr>
      <th>Priority</th>
      <th>Events</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <td>Discrete (sync)</td>
      <td><code class="language-plaintext highlighter-rouge">keydown</code>, <code class="language-plaintext highlighter-rouge">keyup</code>, <code class="language-plaintext highlighter-rouge">click</code>, <code class="language-plaintext highlighter-rouge">focus</code>, <code class="language-plaintext highlighter-rouge">blur</code>, <code class="language-plaintext highlighter-rouge">paste</code></td>
    </tr>
    <tr>
      <td>Continuous (batched)</td>
      <td><code class="language-plaintext highlighter-rouge">resize</code>, <code class="language-plaintext highlighter-rouge">scroll</code>, <code class="language-plaintext highlighter-rouge">mousemove</code></td>
    </tr>
  </tbody>
</table>

<h3 id="79-text-selection">7.9 Text Selection</h3>

<p>Text selection supports character, word, and line modes, and stays correct across scrolling and soft wraps:</p>

<ul>
  <li><strong>Character mode</strong> — Drag selects character by character</li>
  <li><strong>Word mode</strong> — Double-click selects word; subsequent drag extends by word boundaries</li>
  <li><strong>Line mode</strong> — Triple-click selects line; drag extends by lines</li>
  <li><strong>Scroll tracking</strong> — Text that scrolls off-screen is accumulated for correct copy</li>
  <li><strong>Soft-wrap handling</strong> — Wrapped lines are joined into logical lines when copying</li>
</ul>

<h3 id="710-keyboard-input-parsing">7.10 Keyboard Input Parsing</h3>

<p>Terminal keyboard input is notoriously ambiguous. The parser handles multiple protocols:</p>

<ul>
  <li><strong>Kitty Keyboard Protocol</strong> — <code class="language-plaintext highlighter-rouge">CSI u</code> with codepoint + modifiers</li>
  <li><strong>xterm modifyOtherKeys</strong> — <code class="language-plaintext highlighter-rouge">CSI 27; modifier; keycode ~</code></li>
  <li><strong>Legacy function keys</strong> — F1-F12 with their many escape sequence variants</li>
  <li><strong>SGR mouse events</strong> — <code class="language-plaintext highlighter-rouge">CSI &lt; button; col; row M/m</code></li>
  <li><strong>Terminal identity detection</strong> — XTVERSION response parsing for feature detection</li>
</ul>
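<p>As an example of one of these protocols, here is a minimal parser for SGR mouse reports. The wire format (<code class="language-plaintext highlighter-rouge">ESC [ &lt; button ; col ; row</code> followed by <code class="language-plaintext highlighter-rouge">M</code> for press or <code class="language-plaintext highlighter-rouge">m</code> for release) is the standard xterm encoding; the function name and return shape are hypothetical.</p>

```typescript
// Sketch of a single protocol; the returned shape is hypothetical.
type SgrMouseEvent = { button: number; col: number; row: number; pressed: boolean }

// Matches ESC [ < button ; col ; row, terminated by M (press) or m (release).
const SGR_MOUSE = /^\x1b\[<(\d+);(\d+);(\d+)([Mm])$/

function parseSgrMouse(sequence: string): SgrMouseEvent | null {
  const m = SGR_MOUSE.exec(sequence)
  if (!m) return null
  return {
    button: Number(m[1]),
    col: Number(m[2]), // 1-based column
    row: Number(m[3]), // 1-based row
    pressed: m[4] === 'M',
  }
}
```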

<h2 id="8-the-command-system-100-slash-commands">8. The Command System: 100+ Slash Commands</h2>

<h3 id="81-architecture">8.1 Architecture</h3>

<p>Commands use a declarative registration model with three types:</p>

<table>
  <thead>
    <tr>
      <th>Type</th>
      <th>Execution Model</th>
      <th>Example</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <td><code class="language-plaintext highlighter-rouge">PromptCommand</code></td>
      <td>Expands to text sent to Claude</td>
      <td><code class="language-plaintext highlighter-rouge">/commit</code>, <code class="language-plaintext highlighter-rouge">/review</code></td>
    </tr>
    <tr>
      <td><code class="language-plaintext highlighter-rouge">LocalCommand</code></td>
      <td>Synchronous text output, no UI</td>
      <td><code class="language-plaintext highlighter-rouge">/clear</code>, <code class="language-plaintext highlighter-rouge">/help</code>, <code class="language-plaintext highlighter-rouge">/status</code></td>
    </tr>
    <tr>
      <td><code class="language-plaintext highlighter-rouge">LocalJSXCommand</code></td>
      <td>React component rendered to terminal</td>
      <td><code class="language-plaintext highlighter-rouge">/config</code>, <code class="language-plaintext highlighter-rouge">/mcp</code>, <code class="language-plaintext highlighter-rouge">/doctor</code></td>
    </tr>
  </tbody>
</table>

<p>The command registry is <strong>memoized</strong> and <strong>lazy-loaded</strong>:</p>

<div class="language-typescript highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kd">const</span> <span class="nx">COMMANDS</span> <span class="o">=</span> <span class="nx">memoize</span><span class="p">(()</span> <span class="o">=&gt;</span> <span class="p">[</span>
  <span class="c1">// Static commands array — module imports deferred until first call</span>
<span class="p">])</span>

<span class="kd">const</span> <span class="nx">loadAllCommands</span> <span class="o">=</span> <span class="nx">memoize</span><span class="p">((</span><span class="nx">cwd</span><span class="p">:</span> <span class="kr">string</span><span class="p">)</span> <span class="o">=&gt;</span> <span class="p">{</span>
  <span class="c1">// Merges: COMMANDS() + skills + plugins + workflows + MCP commands</span>
<span class="p">})</span>
</code></pre></div></div>
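<p>The effect of memoizing by working directory can be shown with a tiny single-argument <code class="language-plaintext highlighter-rouge">memoize</code>. This is a sketch: the real helper may come from a library and behave differently, and <code class="language-plaintext highlighter-rouge">loadCommandsFor</code> is a made-up loader.</p>

```typescript
// A minimal single-argument memoize; the real helper may differ.
function memoize<A, R>(fn: (arg: A) => R): (arg: A) => R {
  const cache = new Map<A, R>()
  return (arg: A): R => {
    if (!cache.has(arg)) cache.set(arg, fn(arg))
    return cache.get(arg) as R
  }
}

const stats = { loads: 0 }
const loadCount = (): number => stats.loads

// Hypothetical loader: the expensive work runs once per distinct cwd.
const loadCommandsFor = memoize((cwd: string): string[] => {
  stats.loads++
  return [`/help (${cwd})`]
})
```

<p>Calling <code class="language-plaintext highlighter-rouge">loadCommandsFor</code> twice with the same directory returns the same array instance and does the work once; a different directory triggers a fresh load.</p>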

<h3 id="82-command-discovery-pipeline">8.2 Command Discovery Pipeline</h3>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>getCommands(cwd)
  ├─ loadAllCommands(cwd) [memoized by CWD]
  │   ├─ getSkills()          → Disk, bundled, plugin, MCP skills
  │   ├─ getPluginCommands()  → Marketplace + built-in plugins
  │   ├─ getWorkflowCommands()→ Automation workflows [feature-gated]
  │   └─ COMMANDS()           → Static built-in commands
  ├─ getDynamicSkills()       → Session-discovered skills
  ├─ Filter by availability   → Auth provider gating
  ├─ Filter by isEnabled()    → Feature flag gating
  └─ Dedupe + sort
</code></pre></div></div>

<h3 id="83-remote-and-bridge-filtering">8.3 Remote and Bridge Filtering</h3>

<p>Commands are pre-filtered based on execution context:</p>

<ul>
  <li><strong>Remote mode</strong> — Only <code class="language-plaintext highlighter-rouge">REMOTE_SAFE_COMMANDS</code> (session, exit, clear, help, theme, cost…)</li>
  <li><strong>Bridge mode</strong> — Only <code class="language-plaintext highlighter-rouge">BRIDGE_SAFE_COMMANDS</code> (prompt-type skills, plus text-output locals like clear, cost, summary)</li>
  <li><strong>Local JSX commands</strong> — Always blocked over bridge (can’t render React over WebSocket)</li>
</ul>

<h3 id="84-notable-command-implementations">8.4 Notable Command Implementations</h3>

<h4 id="commit--git-safety-protocol"><code class="language-plaintext highlighter-rouge">/commit</code> — Git Safety Protocol</h4>

<p>The commit command enforces strict safety rules:</p>
<ul>
  <li>Never <code class="language-plaintext highlighter-rouge">git commit --amend</code> (only create new commits)</li>
  <li>Never skip hooks or signing (<code class="language-plaintext highlighter-rouge">--no-verify</code>, <code class="language-plaintext highlighter-rouge">--no-gpg-sign</code>)</li>
  <li>Never use <code class="language-plaintext highlighter-rouge">-i</code> flags (interactive mode unsupported)</li>
  <li>Warn on secrets (<code class="language-plaintext highlighter-rouge">.env</code>, <code class="language-plaintext highlighter-rouge">credentials.json</code>)</li>
  <li>Restricted tool access: only <code class="language-plaintext highlighter-rouge">Bash(git add:*)</code>, <code class="language-plaintext highlighter-rouge">Bash(git status:*)</code>, <code class="language-plaintext highlighter-rouge">Bash(git commit:*)</code></li>
</ul>

<h4 id="init--interactive-project-setup"><code class="language-plaintext highlighter-rouge">/init</code> — Interactive Project Setup</h4>

<p>Multi-phase onboarding:</p>
<ol>
  <li>Ask what to set up (CLAUDE.md, skills, hooks)</li>
  <li>Survey codebase (manifest files, README, CI, existing config)</li>
  <li>Interview user on gaps</li>
  <li>Synthesize proposal and create artifacts</li>
</ol>

<h4 id="doctor--self-diagnostics"><code class="language-plaintext highlighter-rouge">/doctor</code> — Self-Diagnostics</h4>

<p>Checks system health: API connectivity, auth status, model availability, MCP server connections, permission configuration.</p>

<h2 id="9-skills-plugins-and-mcp-the-extensibility-trifecta">9. Skills, Plugins, and MCP: The Extensibility Trifecta</h2>

<h3 id="91-skills">9.1 Skills</h3>

<p>Skills are markdown-based prompt templates with frontmatter metadata:</p>

<div class="language-yaml highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nn">---</span>
<span class="na">name</span><span class="pi">:</span> <span class="s">my-skill</span>
<span class="na">description</span><span class="pi">:</span> <span class="s">What this skill does</span>
<span class="na">whenToUse</span><span class="pi">:</span> <span class="s">When Claude should invoke it</span>
<span class="na">allowedTools</span><span class="pi">:</span> <span class="pi">[</span><span class="nv">Bash</span><span class="pi">,</span> <span class="nv">Read</span><span class="pi">,</span> <span class="nv">Edit</span><span class="pi">]</span>
<span class="na">model</span><span class="pi">:</span> <span class="s">claude-sonnet-4-6</span>
<span class="na">userInvocable</span><span class="pi">:</span> <span class="no">true</span>
<span class="nn">---</span>

<span class="s">Skill prompt content here...</span>
</code></pre></div></div>

<p><strong>Discovery sources (5):</strong></p>

<ol>
  <li><code class="language-plaintext highlighter-rouge">.claude/skills/</code> — Project-level skills</li>
  <li><code class="language-plaintext highlighter-rouge">~/.claude/skills/</code> — User-level skills</li>
  <li>Bundled skills — Compiled into the binary</li>
  <li>Plugin skills — From installed plugins</li>
  <li>MCP skill builders — Auto-generated from MCP servers with Prompt capability</li>
</ol>

<p><strong>Forked execution:</strong> Skills with <code class="language-plaintext highlighter-rouge">context: 'fork'</code> run in isolated subagents with their own token budgets, preventing large skills from consuming session context.</p>

<p><strong>Bundled skills</strong> support lazy extraction of reference files to disk with per-process nonce-based path protection (defends against symlink/TOCTOU attacks).</p>

<h3 id="92-plugins">9.2 Plugins</h3>

<p>Plugins bundle skills, hooks, and MCP servers:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>Plugin
├─ Skills (markdown files)
├─ Hooks (pre/post tool execution)
├─ MCP Servers (tool providers)
└─ Options (user-configurable variables)
</code></pre></div></div>

<p><strong>Types:</strong></p>
<ul>
  <li><strong>Built-in plugins</strong> — Pre-installed, togglable, <code class="language-plaintext highlighter-rouge">{name}@builtin</code></li>
  <li><strong>Marketplace plugins</strong> — Installed to <code class="language-plaintext highlighter-rouge">~/.claude/plugins</code>, versioned</li>
  <li><strong>Project plugins</strong> — <code class="language-plaintext highlighter-rouge">--plugin-dir</code> for session-only plugins</li>
</ul>

<p>Plugin variables are substituted into prompts at invocation time via <code class="language-plaintext highlighter-rouge">substitutePluginVariables()</code>.</p>

<h3 id="93-model-context-protocol-mcp">9.3 Model Context Protocol (MCP)</h3>

<p>MCP is the primary extensibility mechanism for bringing external tools into Claude Code.</p>

<p><strong>Supported transports:</strong></p>
<ul>
  <li><code class="language-plaintext highlighter-rouge">stdio</code> — Local subprocess</li>
  <li><code class="language-plaintext highlighter-rouge">sse</code> / <code class="language-plaintext highlighter-rouge">http</code> / <code class="language-plaintext highlighter-rouge">ws</code> — Network-based (with optional OAuth/XAA)</li>
  <li><code class="language-plaintext highlighter-rouge">sdk</code> — Embedded SDK</li>
  <li><code class="language-plaintext highlighter-rouge">claudeai-proxy</code> — Claude.ai tunnel</li>
</ul>

<p><strong>Config scopes (priority order):</strong></p>
<ol>
  <li><code class="language-plaintext highlighter-rouge">local</code> — <code class="language-plaintext highlighter-rouge">.mcp.json</code> in project root</li>
  <li><code class="language-plaintext highlighter-rouge">project</code> — <code class="language-plaintext highlighter-rouge">.claude/.mcp.json</code></li>
  <li><code class="language-plaintext highlighter-rouge">user</code> — <code class="language-plaintext highlighter-rouge">~/.claude/.mcp.json</code></li>
  <li><code class="language-plaintext highlighter-rouge">userSettings</code> — <code class="language-plaintext highlighter-rouge">settings.json</code> mcpServers</li>
  <li><code class="language-plaintext highlighter-rouge">policySettings</code> — Managed organizational policy</li>
  <li><code class="language-plaintext highlighter-rouge">enterprise</code> — Enterprise-managed</li>
  <li><code class="language-plaintext highlighter-rouge">claudeai</code> — Claude.ai-managed</li>
  <li><code class="language-plaintext highlighter-rouge">dynamic</code> — Runtime-injected</li>
</ol>

<p><strong>Connection lifecycle:</strong></p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>MCPServerConnection =
  | ConnectedMCPServer     → Ready to use
  | FailedMCPServer        → Connection error
  | NeedsAuthMCPServer     → Awaiting OAuth
  | PendingMCPServer       → Reconnecting (max attempts)
  | DisabledMCPServer      → Explicitly disabled
</code></pre></div></div>

<p>MCP tools are normalized and prefixed: <code class="language-plaintext highlighter-rouge">mcp__server__toolname</code>. They receive the same permission checks, deny rules, and analytics as built-in tools.</p>
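<p>A rough sketch of that naming scheme follows. The <code class="language-plaintext highlighter-rouge">mcp__server__toolname</code> shape is from the text above, but the normalization rule (replacing anything outside letters, digits, underscore, and hyphen) is an assumption of mine, as are the function names.</p>

```typescript
// Assumed normalization: anything outside [A-Za-z0-9_-] becomes "_".
function normalizeName(raw: string): string {
  return raw.replace(/[^A-Za-z0-9_-]/g, '_')
}

function mcpToolName(server: string, tool: string): string {
  return `mcp__${normalizeName(server)}__${normalizeName(tool)}`
}

// Split a prefixed name back into its parts; returns null for built-in tools.
// Note: a server name that itself contains "__" would be split ambiguously.
function parseMcpToolName(full: string): { server: string; tool: string } | null {
  const m = /^mcp__(.+?)__(.+)$/.exec(full)
  return m ? { server: m[1], tool: m[2] } : null
}
```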

<h2 id="10-context-management-fighting-the-token-limit">10. Context Management: Fighting the Token Limit</h2>

<p>With conversations that can last hours and generate hundreds of tool calls, managing the context window is critical. Claude Code uses a multi-strategy approach.</p>

<h3 id="101-auto-compaction">10.1 Auto-Compaction</h3>

<p>When token count exceeds <code class="language-plaintext highlighter-rouge">context_window - 13,000</code>:</p>

<ol>
  <li>Strip images/documents from older messages (replace with <code class="language-plaintext highlighter-rouge">[image]</code> markers)</li>
  <li>Group messages by API round (assistant + tool results)</li>
  <li>Call the compaction model to generate a summary</li>
  <li>Replace old messages with a <code class="language-plaintext highlighter-rouge">CompactBoundaryMessage</code></li>
  <li>Re-inject up to 5 files + skills post-compaction (50K token budget for files, 25K for skills)</li>
</ol>

<p>A circuit breaker prevents thrashing: after 3 consecutive compaction failures, auto-compaction stops retrying.</p>
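<p>The trigger and the circuit breaker fit in a few lines. The two constants mirror the numbers quoted above; the class and method names are hypothetical, and the real logic likely tracks more state than this.</p>

```typescript
// Constants mirror the numbers in the text; the rest is a hypothetical shape.
const COMPACT_BUFFER_TOKENS = 13000
const MAX_CONSECUTIVE_FAILURES = 3

class CompactionGate {
  private consecutiveFailures = 0

  shouldCompact(tokenCount: number, contextWindow: number): boolean {
    // Breaker open: stop trying after too many failures in a row.
    if (this.consecutiveFailures >= MAX_CONSECUTIVE_FAILURES) return false
    return tokenCount > contextWindow - COMPACT_BUFFER_TOKENS
  }

  recordResult(succeeded: boolean): void {
    // Any success resets the streak; each failure extends it.
    this.consecutiveFailures = succeeded ? 0 : this.consecutiveFailures + 1
  }
}
```

<p>With a 200,000-token window, compaction triggers once the conversation passes 187,000 tokens.</p>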

<h3 id="102-microcompaction">10.2 Microcompaction</h3>

<p>Lighter-weight compression for tool results:</p>

<ul>
  <li><strong>Time-based</strong> — Clear tool results older than a TTL</li>
  <li><strong>Size-based</strong> — Truncate when accumulated tool result tokens exceed threshold</li>
  <li><strong>Tool-specific</strong> — Only compacts: FileRead, Bash, Grep, Glob, WebSearch, WebFetch, FileEdit, FileWrite</li>
  <li><strong>Cache-aware</strong> — A “cached” variant preserves prompt cache integrity via <code class="language-plaintext highlighter-rouge">CacheEditsBlock</code></li>
</ul>

<h3 id="103-snip-compaction">10.3 Snip Compaction</h3>

<p>A history truncation strategy (feature-gated):</p>

<ul>
  <li>Remove old messages beyond a snip boundary</li>
  <li>Preserve the assistant’s “protected tail” for context continuity</li>
  <li>Track tokens freed for accurate token budget calculations</li>
  <li>Full history preserved in REPL for UI scrollback (non-destructive)</li>
</ul>

<h3 id="104-context-collapse">10.4 Context Collapse</h3>

<p>Staged collapses are committed lazily — only when the API returns a 413 (prompt too long):</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>API 413 → Collapse drain (commit staged collapses)
        → If insufficient → Reactive compact (full summarization)
        → If still insufficient → Surface error to user
</code></pre></div></div>

<h3 id="105-system-context">10.5 System Context</h3>

<p>Two tiers of context are injected into every request:</p>

<p><strong>System context</strong> (memoized per session):</p>
<ul>
  <li>Git status (branch, recent commits, file status — truncated at 2000 chars)</li>
  <li>Cache breaker (optional debug injection)</li>
</ul>

<p><strong>User context</strong> (memoized per session):</p>
<ul>
  <li>CLAUDE.md file contents (auto-discovered from project + parent directories)</li>
  <li>Current date (ISO format)</li>
</ul>

<h2 id="11-state-management-immutable-store-for-a-mutable-world">11. State Management: Immutable Store for a Mutable World</h2>

<h3 id="111-the-store">11.1 The Store</h3>

<p>Claude Code uses a minimal, Zustand-like store:</p>

<div class="language-typescript highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kd">type</span> <span class="nx">Store</span><span class="o">&lt;</span><span class="nx">T</span><span class="o">&gt;</span> <span class="o">=</span> <span class="p">{</span>
  <span class="na">getState</span><span class="p">:</span> <span class="p">()</span> <span class="o">=&gt;</span> <span class="nx">T</span>
  <span class="na">setState</span><span class="p">:</span> <span class="p">(</span><span class="na">updater</span><span class="p">:</span> <span class="p">(</span><span class="na">prev</span><span class="p">:</span> <span class="nx">T</span><span class="p">)</span> <span class="o">=&gt;</span> <span class="nx">T</span><span class="p">)</span> <span class="o">=&gt;</span> <span class="k">void</span>
  <span class="na">subscribe</span><span class="p">:</span> <span class="p">(</span><span class="na">listener</span><span class="p">:</span> <span class="nx">Listener</span><span class="p">)</span> <span class="o">=&gt;</span> <span class="p">()</span> <span class="o">=&gt;</span> <span class="k">void</span>
<span class="p">}</span>
</code></pre></div></div>

<ul>
  <li>No middleware</li>
  <li>Synchronous updates</li>
  <li>Identity comparison (<code class="language-plaintext highlighter-rouge">Object.is</code>) gates listener invocation</li>
  <li>React integration via <code class="language-plaintext highlighter-rouge">useSyncExternalStore</code></li>
</ul>
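<p>A store with these four properties fits in a couple of dozen lines. The following is a minimal sketch written against the type above, not the actual implementation; <code class="language-plaintext highlighter-rouge">createStore</code> is a name chosen for illustration.</p>

```typescript
type Listener = () => void

type Store<T> = {
  getState: () => T
  setState: (updater: (prev: T) => T) => void
  subscribe: (listener: Listener) => () => void
}

// Hypothetical factory; the real constructor may look different.
function createStore<T>(initial: T): Store<T> {
  let state = initial
  const listeners = new Set<Listener>()

  return {
    getState: () => state,
    setState: (updater) => {
      const next = updater(state)
      // Returning the same reference means "no change": skip notification.
      if (Object.is(next, state)) return
      state = next
      listeners.forEach((listener) => listener())
    },
    subscribe: (listener) => {
      listeners.add(listener)
      // The returned function unsubscribes, which is the shape
      // useSyncExternalStore expects from its subscribe argument.
      return () => {
        listeners.delete(listener)
      }
    },
  }
}
```

<p>Because the gate is reference identity, an updater has to return a new object to signal a change, which is what makes the store effectively immutable.</p>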

<h3 id="112-appstate-the-unified-state-object">11.2 AppState: The Unified State Object</h3>

<p>The AppState object contains everything:</p>

<p><strong>Core settings:</strong></p>
<ul>
  <li><code class="language-plaintext highlighter-rouge">settings</code> — User preferences (theme, model, etc.)</li>
  <li><code class="language-plaintext highlighter-rouge">mainLoopModel</code> — Current AI model for the session</li>
  <li><code class="language-plaintext highlighter-rouge">toolPermissionContext</code> — Safety mode and rules</li>
  <li><code class="language-plaintext highlighter-rouge">expandedView</code> — <code class="language-plaintext highlighter-rouge">'none' | 'tasks' | 'teammates'</code></li>
</ul>

<p><strong>Bridge state</strong> (Claude.ai integration):</p>
<ul>
  <li><code class="language-plaintext highlighter-rouge">replBridgeEnabled</code> / <code class="language-plaintext highlighter-rouge">replBridgeConnected</code> / <code class="language-plaintext highlighter-rouge">replBridgeSessionActive</code></li>
  <li><code class="language-plaintext highlighter-rouge">replBridgeConnectUrl</code> / <code class="language-plaintext highlighter-rouge">replBridgeError</code></li>
</ul>

<p><strong>Multi-agent state:</strong></p>
<ul>
  <li><code class="language-plaintext highlighter-rouge">tasks: { [taskId: string]: TaskState }</code></li>
  <li><code class="language-plaintext highlighter-rouge">agentNameRegistry: Map&lt;string, AgentId&gt;</code></li>
  <li><code class="language-plaintext highlighter-rouge">foregroundedTaskId</code> / <code class="language-plaintext highlighter-rouge">viewingAgentTaskId</code></li>
</ul>

<p><strong>MCP state:</strong></p>
<ul>
  <li><code class="language-plaintext highlighter-rouge">mcp.clients: MCPServerConnection[]</code></li>
  <li><code class="language-plaintext highlighter-rouge">mcp.tools</code>, <code class="language-plaintext highlighter-rouge">mcp.commands</code>, <code class="language-plaintext highlighter-rouge">mcp.resources</code></li>
</ul>

<p><strong>Speculation state</strong> (parallel model execution):</p>
<div class="language-typescript highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kd">type</span> <span class="nx">SpeculationState</span> <span class="o">=</span>
  <span class="o">|</span> <span class="p">{</span> <span class="na">status</span><span class="p">:</span> <span class="dl">'</span><span class="s1">idle</span><span class="dl">'</span> <span class="p">}</span>
  <span class="o">|</span> <span class="p">{</span> <span class="na">status</span><span class="p">:</span> <span class="dl">'</span><span class="s1">active</span><span class="dl">'</span><span class="p">,</span> <span class="nx">messagesRef</span><span class="p">,</span> <span class="nx">writtenPathsRef</span><span class="p">,</span> <span class="nx">boundary</span><span class="p">,</span> <span class="nx">isPipelined</span> <span class="p">}</span>
</code></pre></div></div>

<p>Speculation is a latency optimization: while the user is still typing, the model begins generating a response speculatively. File writes go to an <strong>overlay filesystem</strong> (<code class="language-plaintext highlighter-rouge">writtenPathsRef</code>), and on completion, the overlay is either committed (if the user’s actual input matches the speculation boundary) or discarded. <code class="language-plaintext highlighter-rouge">isPipelined</code> indicates whether a suggestion was already generated and is queued for display.</p>
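<p>To make the commit-or-discard idea concrete, here is an in-memory sketch. It is an assumption-heavy stand-in: the class, its methods, and the plain string comparison used to decide a match are all hypothetical, and a <code class="language-plaintext highlighter-rouge">Map</code> plays the role of the filesystem.</p>

```typescript
// Sketch only: every name here is hypothetical.
class SpeculativeOverlay {
  private pending = new Map<string, string>()
  private base: Map<string, string>

  constructor(base: Map<string, string>) {
    this.base = base
  }

  // Speculative writes never touch the base until settle() commits them.
  write(path: string, contents: string): void {
    this.pending.set(path, contents)
  }

  // Reads see speculative writes first, then fall through to the base.
  read(path: string): string | undefined {
    return this.pending.has(path) ? this.pending.get(path) : this.base.get(path)
  }

  // Commit if the real input matched the speculated one; otherwise drop everything.
  settle(actualInput: string, speculatedInput: string): boolean {
    const matched = actualInput === speculatedInput
    if (matched) {
      this.pending.forEach((contents, path) => this.base.set(path, contents))
    }
    this.pending.clear()
    return matched
  }
}
```

<p>The useful property is that a wrong guess costs nothing but wasted tokens: the base is untouched until the match is confirmed.</p>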

<h3 id="113-centralized-side-effects">11.3 Centralized Side Effects</h3>

<p>All state mutations that affect external systems flow through <code class="language-plaintext highlighter-rouge">onChangeAppState()</code>:</p>

<ul>
  <li>Permission mode changes → Notify CCR bridge</li>
  <li>Model changes → Persist to user settings</li>
  <li>Settings mutations → Clear auth caches</li>
  <li>View changes → Persist UI state</li>
</ul>

<p>One choke point, no scattered side effects.</p>
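<p>One way to picture the choke point is a function that diffs the previous and next state and fires the matching effect. The signature and the two-field state slice below are assumptions made for illustration; only the function name and the kinds of effects come from the list above.</p>

```typescript
// Hypothetical slice of AppState and a hypothetical effects interface.
type AppStateSlice = { mainLoopModel: string; permissionMode: string }

type Effects = {
  persistModel: (model: string) => void
  notifyBridge: (mode: string) => void
}

// Compare prev and next once, in one place, and run only the effects that apply.
function onChangeAppState(prev: AppStateSlice, next: AppStateSlice, effects: Effects): void {
  if (prev.mainLoopModel !== next.mainLoopModel) {
    effects.persistModel(next.mainLoopModel)
  }
  if (prev.permissionMode !== next.permissionMode) {
    effects.notifyBridge(next.permissionMode)
  }
}
```

<p>Passing the effects in as an argument keeps the diffing logic testable without touching disk or network.</p>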

<h2 id="12-session-persistence-and-history">12. Session Persistence and History</h2>

<h3 id="121-transcript-recording">12.1 Transcript Recording</h3>

<p>The engine records transcripts with ordering guarantees:</p>

<ul>
  <li><strong>Assistant messages</strong> — Fire-and-forget (lazy JSON stringify with 100ms drain)</li>
  <li><strong>User/boundary messages</strong> — Blocking await (ordering guarantee)</li>
  <li><strong>Pre-compact flush</strong> — Writes preserved segment before compaction boundary</li>
</ul>

<p>Even if the process is killed mid-request, the conversation is resumable from the last recorded transcript.</p>

<h3 id="122-history-system">12.2 History System</h3>

<p>Two-level history with deduplication:</p>

<p><strong>In-memory:</strong> <code class="language-plaintext highlighter-rouge">pendingEntries[]</code> — Queue before flush to disk</p>

<p><strong>On-disk:</strong> <code class="language-plaintext highlighter-rouge">~/.claude/history.jsonl</code> — Append-only log</p>

<div class="language-typescript highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kd">type</span> <span class="nx">LogEntry</span> <span class="o">=</span> <span class="p">{</span>
  <span class="na">display</span><span class="p">:</span> <span class="kr">string</span>                    <span class="c1">// Formatted prompt for Ctrl+R picker</span>
  <span class="na">project</span><span class="p">:</span> <span class="kr">string</span>                    <span class="c1">// Current project root</span>
  <span class="na">sessionId</span><span class="p">:</span> <span class="nx">SessionId</span>
  <span class="na">timestamp</span><span class="p">:</span> <span class="kr">number</span>
  <span class="nx">pastedContents</span><span class="p">?:</span> <span class="nb">Record</span><span class="o">&lt;</span><span class="kr">number</span><span class="p">,</span> <span class="nx">StoredPastedContent</span><span class="o">&gt;</span>
<span class="p">}</span>
</code></pre></div></div>

<p><strong>Key algorithms:</strong></p>
<ul>
  <li>Dedup by display text (newest first) for Ctrl+R</li>
  <li>Current-session-first ordering (up-arrow doesn’t interleave sessions)</li>
  <li>Small pastes (&lt;1KB) inlined; large pastes stored with hash references</li>
</ul>
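<p>The first two rules are easy to sketch. The entry type below is a trimmed-down version of <code class="language-plaintext highlighter-rouge">LogEntry</code>, and both function names are hypothetical.</p>

```typescript
// Trimmed-down entry; the function names are made up for this sketch.
type HistoryEntry = { display: string; sessionId: string; timestamp: number }

// Newest-first list with duplicate display strings removed (the newest copy wins).
function dedupeNewestFirst(entries: HistoryEntry[]): HistoryEntry[] {
  const seen = new Set<string>()
  const out: HistoryEntry[] = []
  const sorted = [...entries].sort((a, b) => b.timestamp - a.timestamp)
  for (const entry of sorted) {
    if (seen.has(entry.display)) continue
    seen.add(entry.display)
    out.push(entry)
  }
  return out
}

// Entries from the current session first, each group newest-first.
function currentSessionFirst(entries: HistoryEntry[], current: string): HistoryEntry[] {
  const sorted = [...entries].sort((a, b) => b.timestamp - a.timestamp)
  return [
    ...sorted.filter((e) => e.sessionId === current),
    ...sorted.filter((e) => e.sessionId !== current),
  ]
}
```

<p>The first function matches what a Ctrl+R picker wants (one row per distinct prompt), while the second matches up-arrow behavior, where another session's prompts should never appear between two of your own.</p>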

<h3 id="123-cost-state-persistence">12.3 Cost State Persistence</h3>

<p>Session costs survive process restarts:</p>

<div class="language-typescript highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nx">getStoredSessionCosts</span><span class="p">()</span>       <span class="c1">// Retrieve if session ID matches</span>
<span class="nx">saveCurrentSessionCosts</span><span class="p">()</span>     <span class="c1">// Persist before session switch</span>
<span class="nx">restoreCostStateForSession</span><span class="p">()</span>  <span class="c1">// Restore on resume (validates session ID)</span>
</code></pre></div></div>

<h2 id="13-multi-agent-architecture-subagents-swarms-and-worktrees">13. Multi-Agent Architecture: Subagents, Swarms, and Worktrees</h2>

<p><img src="/assets/claudecode/multi-agent-architecture.png" alt="Claude Code multi-agent architecture" /></p>

<h3 id="131-agent-spawning">13.1 Agent Spawning</h3>

<p>The <code class="language-plaintext highlighter-rouge">AgentTool</code> spawns child agents with configurable isolation:</p>

<ul>
  <li><strong>Default</strong> — Shared filesystem, separate conversation context</li>
  <li><strong>Worktree</strong> — Isolated git branch copy, changes merged on exit</li>
  <li><strong>Remote (CCR)</strong> — Runs on a separate machine</li>
</ul>

<p>Agents are addressable by name:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>Model: "Ask the test-runner agent to run the suite"
  → SendMessage(to: "test-runner", message: "Run the test suite")
</code></pre></div></div>

<h3 id="132-task-system">13.2 Task System</h3>

<p>Background tasks use file-based IPC with concurrent-session locking:</p>

<div class="language-typescript highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kd">type</span> <span class="nx">TaskType</span> <span class="o">=</span> <span class="dl">'</span><span class="s1">local_bash</span><span class="dl">'</span> <span class="o">|</span> <span class="dl">'</span><span class="s1">local_agent</span><span class="dl">'</span> <span class="o">|</span> <span class="dl">'</span><span class="s1">remote_agent</span><span class="dl">'</span>
              <span class="o">|</span> <span class="dl">'</span><span class="s1">in_process_teammate</span><span class="dl">'</span> <span class="o">|</span> <span class="dl">'</span><span class="s1">local_workflow</span><span class="dl">'</span> <span class="o">|</span> <span class="dl">'</span><span class="s1">monitor_mcp</span><span class="dl">'</span> <span class="o">|</span> <span class="dl">'</span><span class="s1">dream</span><span class="dl">'</span>

<span class="kd">type</span> <span class="nx">TaskStatus</span> <span class="o">=</span> <span class="dl">'</span><span class="s1">pending</span><span class="dl">'</span> <span class="o">|</span> <span class="dl">'</span><span class="s1">running</span><span class="dl">'</span> <span class="o">|</span> <span class="dl">'</span><span class="s1">completed</span><span class="dl">'</span> <span class="o">|</span> <span class="dl">'</span><span class="s1">failed</span><span class="dl">'</span> <span class="o">|</span> <span class="dl">'</span><span class="s1">killed</span><span class="dl">'</span>
</code></pre></div></div>

<p>Task IDs use base-36 encoding with type prefixes (b=bash, a=agent, r=remote, etc.).</p>
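<p>A prefixed base-36 ID could be generated along these lines. The mapping of the <code class="language-plaintext highlighter-rouge">b</code>/<code class="language-plaintext highlighter-rouge">a</code>/<code class="language-plaintext highlighter-rouge">r</code> prefixes onto specific task types, the 8-digit length, and the randomness source are all assumptions for this sketch.</p>

```typescript
// Assumed mapping of prefixes to task types; other types are omitted.
const TASK_PREFIX: Record<string, string> = {
  local_bash: 'b',
  local_agent: 'a',
  remote_agent: 'r',
}

// The random source is injectable so the function can be tested deterministically.
function makeTaskId(type: string, random: () => number = Math.random): string {
  const prefix = TASK_PREFIX[type]
  if (prefix === undefined) {
    throw new Error(`no prefix known for task type: ${type}`)
  }
  let suffix = ''
  for (let i = 0; i < 8; i++) {
    // Each step appends one base-36 digit (0-9 or a-z).
    suffix += Math.floor(random() * 36).toString(36)
  }
  return prefix + suffix
}
```

<p>The type prefix makes an ID self-describing in logs, and base-36 keeps it short while staying safe for filenames.</p>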

<p>Lock retries use 30 attempts with 5-100ms backoff (~2.6s max wait) for swarm coordination across tmux/iTerm2 panes.</p>
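<p>The numbers are consistent with a doubling backoff that starts at 5ms and is capped at 100ms, though the exact schedule is an assumption here; the text only gives the range, the attempt count, and the total. A quick check of the arithmetic under that assumption:</p>

```typescript
// Assumption: delays double from 5ms and are capped at 100ms.
function lockRetryDelays(attempts = 30, baseMs = 5, capMs = 100): number[] {
  const delays: number[] = []
  for (let i = 0; i < attempts; i++) {
    delays.push(Math.min(capMs, baseMs * 2 ** i))
  }
  return delays
}

// 5 + 10 + 20 + 40 + 80 + 25 * 100 = 2655ms, in line with the ~2.6s figure.
const totalMs = lockRetryDelays().reduce((sum, delay) => sum + delay, 0)
```

<p>Only the first five waits are below the cap, so almost all of the budget is spent polling at the 100ms ceiling.</p>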

<h3 id="133-worktree-isolation">13.3 Worktree Isolation</h3>

<p><code class="language-plaintext highlighter-rouge">EnterWorktreeTool</code> / <code class="language-plaintext highlighter-rouge">ExitWorktreeTool</code> provide git-level isolation:</p>

<ol>
  <li>Create a temporary git worktree on a new branch</li>
  <li>Agent works in the worktree (safe to make destructive changes)</li>
  <li>On exit: keep changes (merge) or discard (clean up)</li>
</ol>

<h2 id="14-error-recovery-a-system-that-refuses-to-crash">14. Error Recovery: A System That Refuses to Crash</h2>

<p><img src="/assets/claudecode/error-recovery.png" alt="Claude Code error recovery flow" /></p>

<h3 id="141-api-error-recovery">14.1 API Error Recovery</h3>

<p>The retry system handles transient and permanent errors differently:</p>

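<p><strong>Transient (retryable):</strong></p>

<table>
  <thead>
    <tr>
      <th>Error</th>
      <th>Strategy</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <td>529 (Overloaded)</td>
      <td>Max 3 retries for foreground queries</td>
    </tr>
    <tr>
      <td>429 (Rate limit)</td>
      <td>Exponential backoff, persistent mode available</td>
    </tr>
    <tr>
      <td>ECONNRESET/EPIPE</td>
      <td>Stale connection retry</td>
    </tr>
  </tbody>
</table>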

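<p><strong>Permanent (fail fast):</strong></p>

<table>
  <thead>
    <tr>
      <th>Error</th>
      <th>Strategy</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <td>401</td>
      <td>OAuth refresh → retry once → clear credentials</td>
    </tr>
    <tr>
      <td>400</td>
      <td>Invalid request, no retry</td>
    </tr>
    <tr>
      <td>403</td>
      <td>Permission denied, no retry</td>
    </tr>
  </tbody>
</table>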

<p><strong>Persistent retry mode</strong> (for unattended operation):</p>
<ul>
  <li>Env var: <code class="language-plaintext highlighter-rouge">CLAUDE_CODE_UNATTENDED_RETRY</code></li>
  <li>Indefinite 429/529 retries with max 5-minute backoff</li>
  <li>30-second heartbeat keep-alive messages</li>
</ul>
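<p>The split between retryable and fail-fast statuses, plus the capped backoff of persistent mode, might be sketched like this. The class labels, the 1-second base delay, and the doubling schedule are assumptions; the status codes and the 5-minute ceiling come from the description above.</p>

```typescript
// Hypothetical labels for the three behaviors described above.
type RetryClass = 'transient' | 'auth' | 'permanent'

function classifyStatus(status: number): RetryClass {
  if (status === 429 || status === 529) return 'transient'
  if (status === 401) return 'auth' // refresh OAuth, retry once, then give up
  return 'permanent' // e.g. 400 and 403: no retry
}

const MAX_BACKOFF_MS = 5 * 60 * 1000 // the 5-minute ceiling for persistent mode

// Assumption: exponential growth from a 1s base; the text only states the cap.
function persistentBackoffMs(attempt: number, baseMs = 1000): number {
  return Math.min(MAX_BACKOFF_MS, baseMs * 2 ** attempt)
}
```

<p>Under these assumptions the delay reaches the cap on the tenth retry and stays there, so an unattended run polls roughly every five minutes until the API recovers.</p>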

<h3 id="142-prompt-too-long-recovery">14.2 Prompt-Too-Long Recovery</h3>

<p>When the API returns 413:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>413 Prompt Too Long
  ├─ 1. Collapse drain (commit staged context collapses)
  ├─ 2. Reactive compact (generate full conversation summary)
  └─ 3. Surface error if all paths exhausted
</code></pre></div></div>

<p>The error is <strong>withheld</strong> from the SDK until recovery paths are exhausted — the user never sees a 413 if compaction can resolve it.</p>
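<p>The ladder is essentially "try each recovery step in order, and surface the error only if none works." A simplified synchronous sketch, with hypothetical step names and a hypothetical function name:</p>

```typescript
// A step returns true if the prompt fits after it has run.
type RecoveryStep = { name: string; run: () => boolean }

// Try each step in order; report which one resolved the 413, or null if none did.
function recoverFromPromptTooLong(steps: RecoveryStep[]): string | null {
  for (const step of steps) {
    if (step.run()) return step.name
  }
  return null // only at this point would the error reach the caller
}
```

<p>Cheaper steps go first, so a full summarization is only paid for when committing the staged collapses was not enough.</p>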

<h3 id="143-max-output-tokens-recovery">14.3 Max Output Tokens Recovery</h3>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>max_output_tokens stop reason
  ├─ 1. Escalate to 64K tokens (once per turn)
  ├─ 2. Inject meta recovery message ("Resume directly")
  ├─ 3. Max 3 attempts before surfacing
  └─ 4. Withhold intermediate errors
</code></pre></div></div>

<h3 id="144-model-fallback">14.4 Model Fallback</h3>

<p>On persistent 529 errors:</p>
<ol>
  <li>Switch to fallback model (e.g., Sonnet when Opus is overloaded)</li>
  <li>Strip thinking blocks (model-bound signatures)</li>
  <li>Log fallback event with chain ID</li>
  <li>Yield system message about the switch</li>
</ol>

<h3 id="145-streaming-fallback">14.5 Streaming Fallback</h3>

<p>If streaming fails mid-response:</p>
<ol>
  <li>Retry with non-streaming request</li>
  <li>Tombstone orphaned messages</li>
  <li>Clear assistant messages to restart the turn</li>
  <li>Fresh tool executor to prevent orphan results</li>
</ol>

<h2 id="15-cost-tracking-and-telemetry">15. Cost Tracking and Telemetry</h2>

<h3 id="151-usage-accumulation">15.1 Usage Accumulation</h3>

<p>Per-model tracking:</p>
<ul>
  <li>Input tokens, output tokens</li>
  <li>Cache read/write tokens</li>
  <li>Web search requests</li>
  <li>USD cost (calculated via <code class="language-plaintext highlighter-rouge">calculateUSDCost()</code>)</li>
</ul>

<p>Advisor model costs are recursively accumulated from <code class="language-plaintext highlighter-rouge">getAdvisorUsage()</code>.</p>

<h3 id="152-display">15.2 Display</h3>

<p><code class="language-plaintext highlighter-rouge">formatTotalCost()</code> produces a multi-line report:</p>
<ul>
  <li>Total cost</li>
  <li>Per-model breakdown</li>
  <li>API/wall-clock duration</li>
  <li>Lines of code changed</li>
  <li>Unknown model cost disclaimer</li>
</ul>

<h3 id="153-telemetry">15.3 Telemetry</h3>

<p>Analytics use a decoupled sink pattern:</p>
<ul>
  <li><code class="language-plaintext highlighter-rouge">attachAnalyticsSink()</code> called during startup</li>
  <li>Events queued until sink is available (prevents import cycles)</li>
  <li>Datadog fanout + first-party event logging</li>
  <li>PII-tagged fields for compliance</li>
  <li>OpenTelemetry spans for LLM request tracing</li>
</ul>
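<p>The queue-until-attached pattern can be sketched in a few lines. Apart from <code class="language-plaintext highlighter-rouge">attachAnalyticsSink</code>, the names and the event shape are assumptions.</p>

```typescript
// Hypothetical event shape and helper names.
type AnalyticsEvent = { name: string; payload: Record<string, unknown> }
type Sink = (event: AnalyticsEvent) => void

let sink: Sink | null = null
const queued: AnalyticsEvent[] = []

// Callable from any module at any time, even before a sink exists.
function logEvent(name: string, payload: Record<string, unknown> = {}): void {
  const event = { name, payload }
  if (sink) sink(event)
  else queued.push(event)
}

// Called once during startup: flush everything buffered so far, in order.
function attachAnalyticsSink(next: Sink): void {
  sink = next
  for (const event of queued.splice(0)) next(event)
}
```

<p>Modules that log events only depend on this small file rather than on the concrete backends, which is how the pattern avoids import cycles.</p>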

<p><strong>Gateway detection</strong> identifies proxy infrastructure from response headers: LiteLLM, Helicone, Portkey, Cloudflare AI Gateway, Kong, Braintrust, Databricks.</p>

<h2 id="16-execution-modes-one-codebase-many-faces">16. Execution Modes: One Codebase, Many Faces</h2>

<p>Claude Code runs in multiple modes from a single codebase:</p>

<h3 id="interactive-cli-default">Interactive CLI (Default)</h3>

<p>Full React terminal UI with REPL loop, text selection, mouse support, and rich rendering.</p>

<h3 id="non-interactive--headless">Non-Interactive / Headless</h3>

<p><code class="language-plaintext highlighter-rouge">--print</code> mode outputs the response to stdout. <code class="language-plaintext highlighter-rouge">--output</code> saves to a file. No user interaction — suitable for scripts, CI/CD, and piping.</p>

<h3 id="mcp-server-mode">MCP Server Mode</h3>

<p><code class="language-plaintext highlighter-rouge">claude mcp serve</code> runs Claude Code as an MCP server, exposing its tools to other MCP clients.</p>

<h3 id="bridge-mode-claudeai-integration">Bridge Mode (Claude.ai Integration)</h3>

<p>WebSocket connection to claude.ai for remote control:</p>
<ul>
  <li>CLI sends status updates to the web UI</li>
  <li>Web UI sends control commands back</li>
  <li>Bidirectional message adaptation (SDK format ↔ local format)</li>
  <li>Viewer-only mode for read-only clients</li>
</ul>

<h3 id="remote--teleport">Remote / Teleport</h3>

<p><code class="language-plaintext highlighter-rouge">claude remote-control</code> exposes the CLI as a WebSocket server. Users can connect via claude.ai’s web interface or QR code.</p>

<h3 id="local-agent-mode">Local Agent Mode</h3>

<p>Subprocesses spawned for multi-agent swarms. Each agent gets its own session, AppState, and task directory. Communication via file I/O.</p>

<h3 id="coordinator-mode">Coordinator Mode</h3>

<p>Orchestrates multiple agents working in parallel on different aspects of a task. (See dedicated section below.)</p>

<h2 id="17-buddy-a-tamagotchi-style-ai-pet">17. BUDDY: A Tamagotchi-Style AI Pet</h2>

<p>One of the most surprising finds in the codebase: a fully implemented <strong>Tamagotchi-style virtual companion</strong> that lives beside the user’s input box. What started as an April Fools feature (teaser window: April 1-7, 2026) became a real, permanent feature.</p>

<h3 id="171-how-your-buddy-is-born">17.1 How Your Buddy Is Born</h3>

<p>Every companion is <strong>deterministically generated</strong> from the user’s account ID using a Mulberry32 seeded PRNG:</p>

<div class="language-typescript highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c1">// Mulberry32 — tiny seeded PRNG, good enough for picking ducks</span>
<span class="kd">function</span> <span class="nx">mulberry32</span><span class="p">(</span><span class="nx">seed</span><span class="p">:</span> <span class="kr">number</span><span class="p">):</span> <span class="p">()</span> <span class="o">=&gt;</span> <span class="kr">number</span> <span class="p">{</span>
  <span class="kd">let</span> <span class="nx">a</span> <span class="o">=</span> <span class="nx">seed</span> <span class="o">&gt;&gt;&gt;</span> <span class="mi">0</span>
  <span class="k">return</span> <span class="kd">function</span> <span class="p">()</span> <span class="p">{</span>
    <span class="nx">a</span> <span class="o">|=</span> <span class="mi">0</span>
    <span class="nx">a</span> <span class="o">=</span> <span class="p">(</span><span class="nx">a</span> <span class="o">+</span> <span class="mh">0x6d2b79f5</span><span class="p">)</span> <span class="o">|</span> <span class="mi">0</span>
    <span class="kd">let</span> <span class="nx">t</span> <span class="o">=</span> <span class="nb">Math</span><span class="p">.</span><span class="nx">imul</span><span class="p">(</span><span class="nx">a</span> <span class="o">^</span> <span class="p">(</span><span class="nx">a</span> <span class="o">&gt;&gt;&gt;</span> <span class="mi">15</span><span class="p">),</span> <span class="mi">1</span> <span class="o">|</span> <span class="nx">a</span><span class="p">)</span>
    <span class="nx">t</span> <span class="o">=</span> <span class="p">(</span><span class="nx">t</span> <span class="o">+</span> <span class="nb">Math</span><span class="p">.</span><span class="nx">imul</span><span class="p">(</span><span class="nx">t</span> <span class="o">^</span> <span class="p">(</span><span class="nx">t</span> <span class="o">&gt;&gt;&gt;</span> <span class="mi">7</span><span class="p">),</span> <span class="mi">61</span> <span class="o">|</span> <span class="nx">t</span><span class="p">))</span> <span class="o">^</span> <span class="nx">t</span>
    <span class="k">return</span> <span class="p">((</span><span class="nx">t</span> <span class="o">^</span> <span class="p">(</span><span class="nx">t</span> <span class="o">&gt;&gt;&gt;</span> <span class="mi">14</span><span class="p">))</span> <span class="o">&gt;&gt;&gt;</span> <span class="mi">0</span><span class="p">)</span> <span class="o">/</span> <span class="mi">4294967296</span>
  <span class="p">}</span>
<span class="p">}</span>
</code></pre></div></div>

<p>The seed is <code class="language-plaintext highlighter-rouge">hash(userId + 'friend-2026-401')</code>. This means your companion is unique to you but identical across devices and sessions — you always get the same one.</p>
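<p>Putting the seed and the PRNG together might look like the sketch below. The post does not say which hash function is used, so FNV-1a stands in here purely as an assumption, and <code class="language-plaintext highlighter-rouge">companionRng</code> is a hypothetical name.</p>

```typescript
// Mulberry32, as quoted above.
function mulberry32(seed: number): () => number {
  let a = seed >>> 0
  return function () {
    a |= 0
    a = (a + 0x6d2b79f5) | 0
    let t = Math.imul(a ^ (a >>> 15), 1 | a)
    t = (t + Math.imul(t ^ (t >>> 7), 61 | t)) ^ t
    return ((t ^ (t >>> 14)) >>> 0) / 4294967296
  }
}

// Assumption: 32-bit FNV-1a stands in for the unspecified hash().
function hashString(s: string): number {
  let h = 0x811c9dc5
  for (let i = 0; i < s.length; i++) {
    h ^= s.charCodeAt(i)
    h = Math.imul(h, 0x01000193)
  }
  return h >>> 0
}

const SALT = 'friend-2026-401'

// Same userId in, same stream of pseudo-random numbers out.
function companionRng(userId: string): () => number {
  return mulberry32(hashString(userId + SALT))
}
```

<p>Because nothing in this path depends on time or machine state, every device that knows the account ID rolls the same companion.</p>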

<h3 id="172-species-rarity-and-cosmetics">17.2 Species, Rarity, and Cosmetics</h3>

<p><strong>18 species:</strong> duck, goose, blob, cat, dragon, octopus, owl, penguin, turtle, snail, ghost, axolotl, capybara, cactus, robot, rabbit, mushroom, chonk</p>

<p>Species names are built from character codes rather than written as string literals, for example <code class="language-plaintext highlighter-rouge">String.fromCharCode(0x64,0x75,0x63,0x6b)</code> for "duck". This avoids tripping an excluded-strings build check, because one species name collides with a model codename.</p>

<p><strong>Rarity tiers</strong> (weighted random):</p>

<table>
  <thead>
    <tr>
      <th>Tier</th>
      <th>Weight</th>
      <th>Stat Floor</th>
      <th>Hat?</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <td>Common</td>
      <td>60%</td>
      <td>5</td>
      <td>None</td>
    </tr>
    <tr>
      <td>Uncommon</td>
      <td>25%</td>
      <td>15</td>
      <td>Random</td>
    </tr>
    <tr>
      <td>Rare</td>
      <td>10%</td>
      <td>25</td>
      <td>Random</td>
    </tr>
    <tr>
      <td>Epic</td>
      <td>4%</td>
      <td>35</td>
      <td>Random</td>
    </tr>
    <tr>
      <td>Legendary</td>
      <td>1%</td>
      <td>50</td>
      <td>Random</td>
    </tr>
  </tbody>
</table>
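<p>A weighted pick over these tiers can be done by walking the cumulative weights. This is a sketch of the general technique using the weights from the table; the function name is hypothetical and the actual roll may be implemented differently.</p>

```typescript
type Rarity = 'common' | 'uncommon' | 'rare' | 'epic' | 'legendary'

// Weights from the table above.
const RARITY_WEIGHTS: { tier: Rarity; weight: number }[] = [
  { tier: 'common', weight: 60 },
  { tier: 'uncommon', weight: 25 },
  { tier: 'rare', weight: 10 },
  { tier: 'epic', weight: 4 },
  { tier: 'legendary', weight: 1 },
]

// rng is any function returning a float in [0, 1), such as a seeded PRNG.
function rollRarity(rng: () => number): Rarity {
  const total = RARITY_WEIGHTS.reduce((sum, r) => sum + r.weight, 0)
  let roll = rng() * total
  for (const { tier, weight } of RARITY_WEIGHTS) {
    if (roll < weight) return tier
    roll -= weight
  }
  return 'legendary' // not reached for rng() < 1; guards against float edge cases
}
```

<p>Feeding this a seeded generator instead of <code class="language-plaintext highlighter-rouge">Math.random</code> is what makes the result reproducible per user.</p>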

<p><strong>Cosmetics:</strong></p>
<ul>
  <li><strong>6 eye styles:</strong> <code class="language-plaintext highlighter-rouge">·</code>, <code class="language-plaintext highlighter-rouge">+</code>, <code class="language-plaintext highlighter-rouge">x</code>, <code class="language-plaintext highlighter-rouge">@</code>, <code class="language-plaintext highlighter-rouge">°</code>, and a special star eye</li>
  <li><strong>8 hats:</strong> none, crown, tophat, propeller, halo, wizard, beanie, tinyduck</li>
  <li><strong>1% shiny chance</strong> — independent of rarity</li>
  <li><strong>5 stats:</strong> DEBUGGING, PATIENCE, CHAOS, WISDOM, SNARK — one peak stat, one dump stat, rest scattered. Higher rarity = higher stat floors.</li>
</ul>

<p>The rarity star rating displays in a themed color for each tier: common (inactive), uncommon (green), rare (permission blue), epic (auto-accept purple), legendary (warning gold).</p>

<h3 id="173-soul-generation">17.3 Soul Generation</h3>

<p>On first “hatch,” Claude generates a unique <strong>name and personality</strong> for the companion. This is stored permanently in the user’s global config as <code class="language-plaintext highlighter-rouge">StoredCompanion</code>:</p>

<div class="language-typescript highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kd">type</span> <span class="nx">StoredCompanion</span> <span class="o">=</span> <span class="nx">CompanionSoul</span> <span class="o">&amp;</span> <span class="p">{</span> <span class="na">hatchedAt</span><span class="p">:</span> <span class="kr">number</span> <span class="p">}</span>
<span class="kd">type</span> <span class="nx">CompanionSoul</span> <span class="o">=</span> <span class="p">{</span> <span class="na">name</span><span class="p">:</span> <span class="kr">string</span><span class="p">;</span> <span class="nl">personality</span><span class="p">:</span> <span class="kr">string</span> <span class="p">}</span>
</code></pre></div></div>

<p>Critically, only the soul persists — bones (species, rarity, stats) are <strong>regenerated from the userId hash every time</strong>. This prevents users from editing their config to fake a legendary, and allows species renames without breaking stored companions.</p>

<h3 id="174-sprite-animation-system">17.4 Sprite Animation System</h3>

<p>Each species has <strong>3 animation frames</strong> as 5-line, 12-character-wide ASCII art:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>Frame 0 (idle):     Frame 1 (fidget):   Frame 2 (rare):
    __                   __               __
  &lt;(· )___             &lt;(· )___         &lt;(· )___
   (  ._&gt;               (  ._&gt;           (  .__&gt;
    `--´                 `--´~            `--´
</code></pre></div></div>

<p>Eye placeholders <code class="language-plaintext highlighter-rouge">{E}</code> are replaced with the companion’s assigned eye character at render time. Hat lines overlay the top row (only when the species’ top row is blank).</p>
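<p>Both substitutions are simple string operations. In this sketch the function names and the sample sprite lines are made up for illustration; only the <code class="language-plaintext highlighter-rouge">{E}</code> placeholder and the blank-top-row rule come from the description above.</p>

```typescript
// Swap every {E} placeholder for the companion's eye character.
function renderEyes(lines: string[], eye: string): string[] {
  return lines.map((line) => line.split('{E}').join(eye))
}

// Overlay the hat on the top row, but only if that row is blank.
function applyHat(lines: string[], hatLine: string): string[] {
  if (lines.length === 0 || lines[0].trim() !== '') return lines
  return [hatLine, ...lines.slice(1)]
}
```

<p>Keeping eyes and hats out of the stored frames means each species needs only three frames of art, whatever combination of cosmetics a companion has.</p>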

<p>The <strong>idle sequence</strong> cycles at 500ms per tick:</p>
<div class="language-typescript highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kd">const</span> <span class="nx">IDLE_SEQUENCE</span> <span class="o">=</span> <span class="p">[</span><span class="mi">0</span><span class="p">,</span> <span class="mi">0</span><span class="p">,</span> <span class="mi">0</span><span class="p">,</span> <span class="mi">0</span><span class="p">,</span> <span class="mi">1</span><span class="p">,</span> <span class="mi">0</span><span class="p">,</span> <span class="mi">0</span><span class="p">,</span> <span class="mi">0</span><span class="p">,</span> <span class="o">-</span><span class="mi">1</span><span class="p">,</span> <span class="mi">0</span><span class="p">,</span> <span class="mi">0</span><span class="p">,</span> <span class="mi">2</span><span class="p">,</span> <span class="mi">0</span><span class="p">,</span> <span class="mi">0</span><span class="p">,</span> <span class="mi">0</span><span class="p">]</span>
<span class="c1">// -1 = "blink on frame 0" (eyes temporarily replaced)</span>
</code></pre></div></div>

<p>This creates a natural feel: mostly still, occasional fidgets, rare blinks.</p>
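<p>Turning a tick counter into a frame is a modulo lookup plus the special case for <code class="language-plaintext highlighter-rouge">-1</code>. The helper name and return shape below are assumptions; the sequence and the 500ms tick are from the text.</p>

```typescript
const IDLE_SEQUENCE = [0, 0, 0, 0, 1, 0, 0, 0, -1, 0, 0, 2, 0, 0, 0]
const TICK_MS = 500

// Map a tick counter to the sprite frame to draw and whether the eyes are closed.
function frameForTick(tick: number): { frame: number; blink: boolean } {
  const step = IDLE_SEQUENCE[tick % IDLE_SEQUENCE.length]
  return step === -1 ? { frame: 0, blink: true } : { frame: step, blink: false }
}

// 15 ticks * 500ms = 7500ms for one full pass through the sequence.
const loopMs = IDLE_SEQUENCE.length * TICK_MS
```

<p>In each 7.5-second loop, 12 of the 15 ticks show the plain idle frame, which is why the companion reads as calm rather than twitchy.</p>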

<h3 id="175-speech-bubbles-and-interaction">17.5 Speech Bubbles and Interaction</h3>

<p>The companion renders as a <code class="language-plaintext highlighter-rouge">CompanionSprite</code> React component positioned beside the prompt input. It features:</p>

<ul>
  <li><strong>Speech bubbles</strong> with a <code class="language-plaintext highlighter-rouge">SpeechBubble</code> component using rounded borders</li>
  <li>Bubbles display for <strong>~10 seconds</strong> (20 ticks), fading out over the final 3 seconds</li>
  <li><code class="language-plaintext highlighter-rouge">/buddy pet</code> triggers a <strong>floating heart animation</strong> (2.5 seconds) with hearts drifting upward</li>
  <li>The companion can react to conversation events via <code class="language-plaintext highlighter-rouge">companionReaction</code> in AppState</li>
  <li>When the terminal is too narrow (&lt;100 cols), the full sprite is hidden and replaced with a compact face-only rendering</li>
</ul>

<p>A companion intro is injected as a special attachment into the conversation, informing Claude that a small creature named X sits beside the input box and occasionally comments in bubbles.</p>

<h3 id="176-teaser-and-release-strategy">17.6 Teaser and Release Strategy</h3>

<div class="language-typescript highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">export</span> <span class="kd">function</span> <span class="nx">isBuddyTeaserWindow</span><span class="p">():</span> <span class="nx">boolean</span> <span class="p">{</span>
  <span class="kd">const</span> <span class="nx">d</span> <span class="o">=</span> <span class="k">new</span> <span class="nb">Date</span><span class="p">()</span>
  <span class="k">return</span> <span class="nx">d</span><span class="p">.</span><span class="nx">getFullYear</span><span class="p">()</span> <span class="o">===</span> <span class="mi">2026</span> <span class="o">&amp;&amp;</span> <span class="nx">d</span><span class="p">.</span><span class="nx">getMonth</span><span class="p">()</span> <span class="o">===</span> <span class="mi">3</span> <span class="o">&amp;&amp;</span> <span class="nx">d</span><span class="p">.</span><span class="nx">getDate</span><span class="p">()</span> <span class="o">&lt;=</span> <span class="mi">7</span>
<span class="p">}</span>
</code></pre></div></div>

<p>The teaser uses <strong>local dates, not UTC</strong> — creating a rolling 24-hour wave across timezones for sustained social media buzz rather than a single UTC-midnight spike. During the teaser window, users who haven’t hatched a companion see a rainbow-colored <code class="language-plaintext highlighter-rouge">/buddy</code> notification.</p>
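<p>To see the local-time behavior in isolation, here is a variant of the same check that takes the date as a parameter. The name <code class="language-plaintext highlighter-rouge">isTeaserWindowAt</code> is hypothetical; the condition is the one shown above.</p>

```typescript
// Same condition as isBuddyTeaserWindow, with the date passed in.
function isTeaserWindowAt(d: Date): boolean {
  // getMonth() is zero-based, so 3 means April. All three getters use local time.
  return d.getFullYear() === 2026 && d.getMonth() === 3 && d.getDate() <= 7
}

// Dates built with the multi-argument constructor are also in local time.
const opensAtLocalMidnight = isTeaserWindowAt(new Date(2026, 3, 1, 0, 0))
const closedOnTheEighth = isTeaserWindowAt(new Date(2026, 3, 8, 0, 0))
```

<p>A user in UTC+13 therefore enters the window while it is still 11:00 UTC on March 31, and users further west follow hour by hour.</p>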

<h2 id="18-kairos-persistent-assistant-mode-and-auto-dreaming">18. KAIROS: Persistent Assistant Mode and Auto-Dreaming</h2>

<p>KAIROS (feature-flagged as <code class="language-plaintext highlighter-rouge">KAIROS</code>) is a complete alternate UX where Claude becomes a <strong>long-lived autonomous agent</strong> that persists across sessions — the “Always-On Claude.”</p>

<h3 id="181-auto-dreaming-memory-consolidation">18.1 Auto-Dreaming: Memory Consolidation</h3>

<p>The most concrete KAIROS subsystem in the codebase is the <strong>auto-dream</strong> system (<code class="language-plaintext highlighter-rouge">services/autoDream/</code>). This is a background memory consolidation agent that runs as a forked subagent.</p>

<p><strong>Gate order</strong> (cheapest checks first):</p>
<ol>
  <li><strong>Time gate:</strong> Hours since last consolidation &gt;= <code class="language-plaintext highlighter-rouge">minHours</code> (default: 24h)</li>
  <li><strong>Session gate:</strong> Number of transcript sessions since last consolidation &gt;= <code class="language-plaintext highlighter-rouge">minSessions</code> (default: 5)</li>
  <li><strong>Lock gate:</strong> No other process is mid-consolidation (file lock with mtime-based conflict detection)</li>
  <li><strong>Scan throttle:</strong> Even when the time gate passes, session scanning is throttled to at most once every 10 minutes</li>
</ol>
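<p>The gate order above can be sketched as a single function that returns as soon as a cheap check fails. This is a minimal sketch, not the actual implementation: every name here (<code class="language-plaintext highlighter-rouge">checkDreamGates</code>, <code class="language-plaintext highlighter-rouge">GateInput</code>, the result strings) is hypothetical, and placing the scan throttle directly in front of the session scan is my reading of the list. Only the defaults (24 hours, 5 sessions, 10 minutes) come from the text.</p>

```typescript
// Hypothetical names throughout; only the default values come from the post.
type GateInput = {
  nowMs: number;
  lastConsolidatedMs: number;
  lastScanMs: number;
  countSessionsSince: () => number; // transcript scan, the costly step
  isLocked: () => boolean;          // another process is mid-consolidation
};

type GateResult = "run" | "too_soon" | "scan_throttled" | "too_few_sessions" | "locked";

const HOUR_MS = 60 * 60 * 1000;
const DEFAULTS = { minHours: 24, minSessions: 5, scanIntervalMs: 10 * 60 * 1000 };

function checkDreamGates(input: GateInput, cfg = DEFAULTS): GateResult {
  // 1. Time gate: pure arithmetic, so it runs first.
  const hoursSince = (input.nowMs - input.lastConsolidatedMs) / HOUR_MS;
  if (hoursSince < cfg.minHours) return "too_soon";

  // Scan throttle: skip the session scan if one ran recently.
  if (input.nowMs - input.lastScanMs < cfg.scanIntervalMs) return "scan_throttled";

  // 2. Session gate: only now pay for the transcript scan.
  if (input.countSessionsSince() < cfg.minSessions) return "too_few_sessions";

  // 3. Lock gate: checked last, right before work would start.
  if (input.isLocked()) return "locked";

  return "run";
}
```

<p>Passing the scan and the lock check in as callbacks keeps the ordering visible: neither is invoked unless every earlier gate has already passed.</p>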

<p><strong>The 4-phase dream prompt:</strong></p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>Phase 1 — Orient
  └─ ls the memory directory, read the index, skim existing topic files

Phase 2 — Gather recent signal
  └─ Check daily logs, find drifted memories, grep transcripts narrowly

Phase 3 — Consolidate
  └─ Write/update memory files, merge duplicates, convert relative dates

Phase 4 — Prune and index
  └─ Update the entrypoint index (max ~25KB), remove stale pointers
</code></pre></div></div>

<p><strong>Tool constraints for dream runs:</strong> Bash is restricted to <strong>read-only commands</strong>: <code class="language-plaintext highlighter-rouge">ls</code>, <code class="language-plaintext highlighter-rouge">find</code>, <code class="language-plaintext highlighter-rouge">grep</code>, <code class="language-plaintext highlighter-rouge">cat</code>, <code class="language-plaintext highlighter-rouge">stat</code>, <code class="language-plaintext highlighter-rouge">wc</code>, <code class="language-plaintext highlighter-rouge">head</code>, <code class="language-plaintext highlighter-rouge">tail</code>. Write operations through Bash are denied; file edits go through the normal Edit/Write tools, with permission granted via <code class="language-plaintext highlighter-rouge">createAutoMemCanUseTool()</code>.</p>

<p><strong>Dream task lifecycle:</strong></p>
<div class="language-typescript highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kd">type</span> <span class="nx">DreamTaskState</span> <span class="o">=</span> <span class="p">{</span>
  <span class="na">type</span><span class="p">:</span> <span class="dl">'</span><span class="s1">dream</span><span class="dl">'</span>
  <span class="na">phase</span><span class="p">:</span> <span class="dl">'</span><span class="s1">starting</span><span class="dl">'</span> <span class="o">|</span> <span class="dl">'</span><span class="s1">updating</span><span class="dl">'</span>    <span class="c1">// Flips when first Edit/Write lands</span>
  <span class="na">sessionsReviewing</span><span class="p">:</span> <span class="kr">number</span>
  <span class="na">filesTouched</span><span class="p">:</span> <span class="kr">string</span><span class="p">[]</span>             <span class="c1">// Paths observed in Edit/Write tool_use</span>
  <span class="na">turns</span><span class="p">:</span> <span class="nx">DreamTurn</span><span class="p">[]</span>                 <span class="c1">// Last 30 assistant turns (rolling window)</span>
  <span class="nx">abortController</span><span class="p">?:</span> <span class="nx">AbortController</span>
  <span class="na">priorMtime</span><span class="p">:</span> <span class="kr">number</span>                 <span class="c1">// For lock rollback on failure</span>
<span class="p">}</span>
</code></pre></div></div>

<p>Users can kill a running dream from the background tasks dialog (Shift+Down). On kill, the lock mtime is rewound so the next session can retry.</p>
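<p>To make the comments in <code class="language-plaintext highlighter-rouge">DreamTaskState</code> concrete, here is a rough sketch of how a turn might be folded into the state: the phase flips once an Edit/Write path is observed, and only the last 30 turns are kept. The <code class="language-plaintext highlighter-rouge">DreamTurn</code> shape and the <code class="language-plaintext highlighter-rouge">recordTurn</code> helper are assumptions for illustration; the post does not show them.</p>

```typescript
// Hypothetical turn shape and helper; only the 30-turn window and the
// phase flip on the first Edit/Write are taken from the post.
type DreamTurn = { text: string };
type DreamProgress = {
  phase: "starting" | "updating";
  filesTouched: string[];
  turns: DreamTurn[];
};

const MAX_TURNS = 30;

function recordTurn(state: DreamProgress, turn: DreamTurn, editedPaths: string[]): DreamProgress {
  // Keep each touched path once, in first-seen order.
  const filesTouched = state.filesTouched.slice();
  for (const p of editedPaths) {
    if (filesTouched.indexOf(p) === -1) filesTouched.push(p);
  }
  return {
    // Once any Edit/Write lands, the phase stays "updating".
    phase: editedPaths.length > 0 ? "updating" : state.phase,
    filesTouched,
    // Rolling window: drop the oldest turns beyond the cap.
    turns: state.turns.concat([turn]).slice(-MAX_TURNS),
  };
}
```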

<h3 id="182-kairos-integration-points">18.2 KAIROS Integration Points</h3>

<p>KAIROS is referenced throughout the codebase:</p>
<ul>
  <li><strong><code class="language-plaintext highlighter-rouge">getKairosActive()</code></strong> in bootstrap state — gates whether KAIROS mode is active</li>
  <li>Auto-dream is <strong>disabled</strong> in KAIROS mode (KAIROS uses its own disk-skill dream variant)</li>
  <li>Brief mode (<code class="language-plaintext highlighter-rouge">BriefTool</code>) — all output goes through <code class="language-plaintext highlighter-rouge">SendUserMessage</code> tool (structured markdown + attachments + status)</li>
  <li>Proactive <code class="language-plaintext highlighter-rouge">&lt;tick&gt;</code> prompts — periodic check-ins where Claude decides what to do next</li>
  <li>15-second blocking budget — commands exceeding 15s are auto-backgrounded</li>
  <li>Exclusive tools: <code class="language-plaintext highlighter-rouge">SendUserFile</code>, <code class="language-plaintext highlighter-rouge">PushNotification</code>, <code class="language-plaintext highlighter-rouge">SubscribePR</code> (GitHub webhook subscriptions), <code class="language-plaintext highlighter-rouge">SleepTool</code></li>
  <li>Append-only daily logs at <code class="language-plaintext highlighter-rouge">~/.claude/projects/&lt;slug&gt;/memory/logs/YYYY/MM/YYYY-MM-DD.md</code></li>
  <li>Midnight boundary handling — flushes yesterday’s transcript on date change so the dream process can find it</li>
</ul>

<h3 id="183-session-history-assistant-mode">18.3 Session History (Assistant Mode)</h3>

<p><code class="language-plaintext highlighter-rouge">assistant/sessionHistory.ts</code> provides paginated session event retrieval for KAIROS:</p>

<div class="language-typescript highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kd">type</span> <span class="nx">HistoryPage</span> <span class="o">=</span> <span class="p">{</span>
  <span class="na">events</span><span class="p">:</span> <span class="nx">SDKMessage</span><span class="p">[]</span>
  <span class="na">firstId</span><span class="p">:</span> <span class="kr">string</span> <span class="o">|</span> <span class="kc">null</span>    <span class="c1">// Cursor for next-older page</span>
  <span class="na">hasMore</span><span class="p">:</span> <span class="nx">boolean</span>
<span class="p">}</span>
</code></pre></div></div>

<p>It uses OAuth-authenticated API calls to CCR (Claude Code Remote) to fetch session transcripts. Pagination runs backwards (newest → oldest), with 100 events per page and a 15-second timeout per request.</p>

<h2 id="19-ultraplan-remote-planning-sessions">19. ULTRAPLAN: Remote Planning Sessions</h2>

<p>ULTRAPLAN is an interactive planning system that farms out complex exploration to a <strong>remote Claude Code instance (CCR)</strong> for up to 30 minutes.</p>

<p><img src="/assets/claudecode/ultraplan.png" alt="ULTRAPLAN remote planning flow" /></p>

<h3 id="191-how-it-works">19.1 How It Works</h3>

<ol>
  <li>User types “ultraplan” (keyword detection, not slash command) or uses <code class="language-plaintext highlighter-rouge">/ultraplan</code></li>
  <li>A remote CCR session is created with plan mode pre-configured</li>
  <li>The CLI polls the remote session every <strong>3 seconds</strong> for up to 30 minutes</li>
  <li>Remote Claude explores, plans, and calls <code class="language-plaintext highlighter-rouge">ExitPlanMode</code> when ready</li>
  <li>User approves or rejects the plan <strong>in the browser</strong> (claude.ai)</li>
  <li>Rejected plans loop back for iteration</li>
</ol>

<h3 id="192-keyword-detection">19.2 Keyword Detection</h3>

<p>The keyword trigger system (<code class="language-plaintext highlighter-rouge">utils/ultraplan/keyword.ts</code>) goes well beyond a substring match. It finds “ultraplan” in user input while avoiding false positives:</p>

<p><strong>Skipped contexts:</strong></p>
<ul>
  <li>Inside paired delimiters (backticks, quotes, brackets, angle brackets)</li>
  <li>Path-like context (<code class="language-plaintext highlighter-rouge">src/ultraplan/foo.ts</code>, <code class="language-plaintext highlighter-rouge">ultraplan.tsx</code>)</li>
  <li>Identifier-like context (<code class="language-plaintext highlighter-rouge">--ultraplan-mode</code>, <code class="language-plaintext highlighter-rouge">ultraplan-s</code>)</li>
  <li>Followed by <code class="language-plaintext highlighter-rouge">?</code> (questions about the feature shouldn’t invoke it)</li>
  <li>Slash command input (<code class="language-plaintext highlighter-rouge">/rename ultraplan foo</code> runs <code class="language-plaintext highlighter-rouge">/rename</code>, not ultraplan)</li>
</ul>

<p>When triggered, “ultraplan” is replaced with “plan” in the forwarded prompt to keep it grammatical: <code class="language-plaintext highlighter-rouge">"please ultraplan this" → "please plan this"</code>.</p>
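<p>The skip rules and the rewrite can be approximated in a few dozen lines. This is a simplified sketch under my own assumptions, not the code in <code class="language-plaintext highlighter-rouge">keyword.ts</code>: the function names are hypothetical, matching is case-sensitive, delimiters are paired greedily from left to right, and single quotes are ignored because apostrophes would need extra handling.</p>

```typescript
// Simplified sketch with hypothetical names; not the real keyword.ts.
const CLOSER: Record<string, string> = {
  "`": "`", '"': '"', "(": ")", "[": "]", "{": "}", "<": ">",
};

// Collect [open, close] index pairs for delimited spans.
function delimitedSpans(input: string): Array<[number, number]> {
  const spans: Array<[number, number]> = [];
  let i = 0;
  while (i < input.length) {
    const closer = CLOSER[input.charAt(i)];
    const end = closer === undefined ? -1 : input.indexOf(closer, i + 1);
    if (end !== -1) {
      spans.push([i, end]);
      i = end + 1;
    } else {
      i += 1;
    }
  }
  return spans;
}

// Characters that make the keyword part of a path or identifier.
const GLUE = /[\w\/.\-]/;

// Returns the index of the first triggering occurrence, or -1.
function findKeyword(input: string, keyword: string = "ultraplan"): number {
  if (input.trim().charAt(0) === "/") return -1; // slash command input
  const spans = delimitedSpans(input);
  let from = 0;
  for (;;) {
    const at = input.indexOf(keyword, from);
    if (at === -1) return -1;
    const end = at + keyword.length;
    const before = input.charAt(at - 1); // "" at the start of input
    const after = input.charAt(end);     // "" at the end of input
    // A trailing "." is glue only when a word character follows, so
    // "ultraplan.tsx" is skipped but a sentence-ending "ultraplan." is not.
    const gluedAfter = after === "." ? /\w/.test(input.charAt(end + 1)) : GLUE.test(after);
    const inside = spans.some(([open, close]) => at > open && at < close);
    if (!inside && !GLUE.test(before) && !gluedAfter && after !== "?") return at;
    from = end;
  }
}

// Swap the keyword for "plan" so the forwarded prompt stays grammatical.
function rewritePrompt(input: string, keyword: string = "ultraplan"): string | null {
  const at = findKeyword(input, keyword);
  if (at === -1) return null;
  return input.slice(0, at) + "plan" + input.slice(at + keyword.length);
}
```

<p>With this sketch, <code class="language-plaintext highlighter-rouge">rewritePrompt("please ultraplan this")</code> yields <code class="language-plaintext highlighter-rouge">"please plan this"</code>, while inputs such as <code class="language-plaintext highlighter-rouge">"what is ultraplan?"</code> or <code class="language-plaintext highlighter-rouge">"open src/ultraplan/foo.ts"</code> return <code class="language-plaintext highlighter-rouge">null</code>.</p>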

<h3 id="193-two-execution-paths-on-approval">19.3 Two Execution Paths on Approval</h3>

<p>On approval, the user chooses one of two paths:</p>

<table>
  <thead>
    <tr>
      <th>Path</th>
      <th>What Happens</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <td><strong>“remote”</strong></td>
      <td>Execute the plan in the cloud CCR instance</td>
    </tr>
    <tr>
      <td><strong>“teleport to terminal”</strong></td>
      <td>Archive the remote session, execute locally</td>
    </tr>
  </tbody>
</table>

<p>The teleport path uses a sentinel string <code class="language-plaintext highlighter-rouge">__ULTRAPLAN_TELEPORT_LOCAL__</code> embedded in the browser’s rejection feedback. The rejection keeps the remote in plan mode, but the plan text is extracted from the feedback and executed locally.</p>
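<p>A minimal sketch of how the sentinel might be detected on the CLI side. The sentinel string is from the post; the <code class="language-plaintext highlighter-rouge">classifyRejection</code> helper, its return shape, and the assumption that the plan text is whatever follows the sentinel are mine.</p>

```typescript
const TELEPORT_SENTINEL = "__ULTRAPLAN_TELEPORT_LOCAL__";

// Hypothetical result shape for illustration.
type Rejection =
  | { kind: "teleport"; plan: string }
  | { kind: "rejected"; feedback: string };

function classifyRejection(feedback: string): Rejection {
  const at = feedback.indexOf(TELEPORT_SENTINEL);
  if (at === -1) return { kind: "rejected", feedback };
  // Assumption: the plan is the text that follows the sentinel.
  const plan = feedback.slice(at + TELEPORT_SENTINEL.length).trim();
  return { kind: "teleport", plan };
}
```

<p>Reusing the rejection channel means the remote side needs no extra message type: from its point of view the plan was simply rejected, so it stays in plan mode.</p>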

<h3 id="194-event-stream-scanning">19.4 Event Stream Scanning</h3>

<p>The <code class="language-plaintext highlighter-rouge">ExitPlanModeScanner</code> class is a <strong>pure stateful classifier</strong> for the CCR event stream:</p>

<div class="language-typescript highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kd">type</span> <span class="nx">ScanResult</span> <span class="o">=</span>
  <span class="o">|</span> <span class="p">{</span> <span class="na">kind</span><span class="p">:</span> <span class="dl">'</span><span class="s1">approved</span><span class="dl">'</span><span class="p">;</span> <span class="nl">plan</span><span class="p">:</span> <span class="kr">string</span> <span class="p">}</span>
  <span class="o">|</span> <span class="p">{</span> <span class="na">kind</span><span class="p">:</span> <span class="dl">'</span><span class="s1">teleport</span><span class="dl">'</span><span class="p">;</span> <span class="nl">plan</span><span class="p">:</span> <span class="kr">string</span> <span class="p">}</span>
  <span class="o">|</span> <span class="p">{</span> <span class="na">kind</span><span class="p">:</span> <span class="dl">'</span><span class="s1">rejected</span><span class="dl">'</span><span class="p">;</span> <span class="nl">id</span><span class="p">:</span> <span class="kr">string</span> <span class="p">}</span>
  <span class="o">|</span> <span class="p">{</span> <span class="na">kind</span><span class="p">:</span> <span class="dl">'</span><span class="s1">pending</span><span class="dl">'</span> <span class="p">}</span>
  <span class="o">|</span> <span class="p">{</span> <span class="na">kind</span><span class="p">:</span> <span class="dl">'</span><span class="s1">terminated</span><span class="dl">'</span><span class="p">;</span> <span class="nl">subtype</span><span class="p">:</span> <span class="kr">string</span> <span class="p">}</span>
  <span class="o">|</span> <span class="p">{</span> <span class="na">kind</span><span class="p">:</span> <span class="dl">'</span><span class="s1">unchanged</span><span class="dl">'</span> <span class="p">}</span>
</code></pre></div></div>

<p><strong>Phase tracking</strong> for the UI pill:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>running → (turn ends, no ExitPlanMode) → needs_input
needs_input → (user replies in browser) → running
running → (ExitPlanMode emitted) → plan_ready
plan_ready → (rejected) → running
plan_ready → (approved) → poll resolves, pill removed
</code></pre></div></div>
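<p>The transitions above form a small state machine. A sketch, assuming hypothetical event names and a terminal <code class="language-plaintext highlighter-rouge">done</code> phase standing in for “poll resolves, pill removed”:</p>

```typescript
// "done" and the event names are placeholders; the other three phases
// are the ones listed in the transition table.
type Phase = "running" | "needs_input" | "plan_ready" | "done";
type PhaseEvent = "turn_ended" | "user_replied" | "exit_plan_mode" | "rejected" | "approved";

function nextPhase(phase: Phase, event: PhaseEvent): Phase {
  if (phase === "running") {
    if (event === "turn_ended") return "needs_input";   // turn ended without ExitPlanMode
    if (event === "exit_plan_mode") return "plan_ready";
  } else if (phase === "needs_input") {
    if (event === "user_replied") return "running";
  } else if (phase === "plan_ready") {
    if (event === "rejected") return "running";
    if (event === "approved") return "done";
  }
  return phase; // any other event leaves the phase unchanged
}
```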

<p><strong>Resilience:</strong> The poller tolerates up to 5 consecutive network failures before aborting. A 30-minute poll at a 3-second interval makes ~600 API calls, so even a small per-call failure rate makes at least one blip likely.</p>
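<p>The arithmetic behind that budget is easy to check. The sketch below assumes failures are independent and reads “up to 5” as “abort when the fifth consecutive failure arrives”; both are my assumptions, as are the helper names.</p>

```typescript
const POLL_INTERVAL_S = 3;
const MAX_POLL_S = 30 * 60;
const MAX_CONSECUTIVE_FAILURES = 5;

// 1800 s / 3 s = 600 calls over a full 30-minute poll.
const totalCalls = MAX_POLL_S / POLL_INTERVAL_S;

// Chance of at least one failed call, given independent failures at rate p.
function chanceOfAnyFailure(p: number, calls: number): number {
  return 1 - Math.pow(1 - p, calls);
}

// Returns a recorder; it answers true while polling may continue.
function createFailureBudget(limit: number): (ok: boolean) => boolean {
  let consecutive = 0;
  return (ok: boolean): boolean => {
    consecutive = ok ? 0 : consecutive + 1; // any success resets the streak
    return consecutive < limit;
  };
}
```

<p>At a 0.1% per-call failure rate, <code class="language-plaintext highlighter-rouge">chanceOfAnyFailure(0.001, totalCalls)</code> comes out at roughly 0.45, which is why aborting on the first error would be too strict, while a streak of five is a much stronger signal that the connection is actually gone.</p>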

<h2 id="20-coordinator-mode-multi-agent-orchestrator">20. Coordinator Mode: Multi-Agent Orchestrator</h2>

<p>Coordinator Mode (<code class="language-plaintext highlighter-rouge">CLAUDE_CODE_COORDINATOR_MODE=1</code>) transforms Claude Code from a single-agent assistant into a <strong>multi-agent orchestrator</strong> where a master coordinator directs multiple parallel workers.</p>

<h3 id="201-architecture">20.1 Architecture</h3>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>Coordinator (you)
  ├─ AgentTool → Worker A (research)     ─┐
  ├─ AgentTool → Worker B (research)     ─┤ Run in parallel
  ├─ AgentTool → Worker C (implement)    ─┘
  └─ SendMessage → Continue Worker A with synthesized spec
</code></pre></div></div>

<p>The coordinator’s system prompt enforces a strict discipline:</p>
<ul>
  <li><strong>“Never write ‘based on your findings’”</strong>: the coordinator must synthesize worker research into specific specs with file paths, line numbers, and exactly what to change</li>
  <li>Workers report back as <strong>XML <code class="language-plaintext highlighter-rouge">&lt;task-notification&gt;</code> messages</strong> with status, summary, result, and usage</li>
  <li>The coordinator <strong>never polls</strong> — workers push completion notifications</li>
  <li>Workers get isolated scratch directories (via <code class="language-plaintext highlighter-rouge">tengu_scratch</code> feature gate) for durable cross-worker knowledge</li>
</ul>

<h3 id="202-worker-capabilities">20.2 Worker Capabilities</h3>

<p>Workers spawned via AgentTool have access to standard tools (or a simplified Bash/Read/Edit set in <code class="language-plaintext highlighter-rouge">CLAUDE_CODE_SIMPLE</code> mode), plus MCP tools from configured servers. The coordinator injects a <code class="language-plaintext highlighter-rouge">workerToolsContext</code> into the system prompt listing exactly which tools workers can use.</p>

<h3 id="203-task-workflow">20.3 Task Workflow</h3>

<p>The coordinator system prompt defines four phases:</p>

<table>
  <thead>
    <tr>
      <th>Phase</th>
      <th>Who</th>
      <th>Purpose</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <td>Research</td>
      <td>Workers (parallel)</td>
      <td>Investigate codebase, find files</td>
    </tr>
    <tr>
      <td>Synthesis</td>
      <td><strong>Coordinator</strong></td>
      <td>Read findings, craft implementation specs</td>
    </tr>
    <tr>
      <td>Implementation</td>
      <td>Workers</td>
      <td>Make changes per spec, commit</td>
    </tr>
    <tr>
      <td>Verification</td>
      <td>Workers</td>
      <td>Prove the code works (not just confirm it exists)</td>
    </tr>
  </tbody>
</table>

<p><strong>Concurrency rules:</strong></p>
<ul>
  <li>Read-only tasks (research) — run in parallel freely</li>
  <li>Write-heavy tasks (implementation) — one at a time per set of files</li>
  <li>Verification — can run alongside implementation on different file areas</li>
</ul>
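<p>In the real system these rules live in the coordinator’s prompt rather than in code, but they can be written down as a predicate to make them precise. This sketch is my own illustration: it treats verification as read-only and models a “file area” as an exact path match, both of which are simplifying assumptions.</p>

```typescript
// Illustrative only: the coordinator follows these rules via its prompt.
type TaskKind = "research" | "implementation" | "verification";
type Task = { kind: TaskKind; files: string[] };

function sharesFiles(a: string[], b: string[]): boolean {
  return a.some((f) => b.indexOf(f) !== -1);
}

function canRunTogether(a: Task, b: Task): boolean {
  // Read-only research runs in parallel with anything.
  if (a.kind === "research" || b.kind === "research") return true;
  // Two verifiers do not write, so they cannot conflict (assumption).
  if (a.kind === "verification" && b.kind === "verification") return true;
  // At least one writer is involved: require disjoint file sets.
  return !sharesFiles(a.files, b.files);
}
```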

<h3 id="204-continue-vs-spawn">20.4 Continue vs. Spawn</h3>

<p>The system provides explicit guidance on when to continue an existing worker vs. spawn fresh:</p>

<table>
  <thead>
    <tr>
      <th>Situation</th>
      <th>Mechanism</th>
      <th>Reason</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <td>Research explored the exact files to edit</td>
      <td><strong>Continue</strong></td>
      <td>Worker already has files in context</td>
    </tr>
    <tr>
      <td>Research was broad, implementation is narrow</td>
      <td><strong>Spawn fresh</strong></td>
      <td>Avoid exploration noise</td>
    </tr>
    <tr>
      <td>Correcting a failure</td>
      <td><strong>Continue</strong></td>
      <td>Worker has error context</td>
    </tr>
    <tr>
      <td>Verifying another worker’s code</td>
      <td><strong>Spawn fresh</strong></td>
      <td>Verifier needs fresh eyes</td>
    </tr>
    <tr>
      <td>Wrong approach entirely</td>
      <td><strong>Spawn fresh</strong></td>
      <td>Wrong context pollutes retry</td>
    </tr>
  </tbody>
</table>

<h3 id="205-session-mode-matching">20.5 Session Mode Matching</h3>

<p>When resuming a session, coordinator mode is automatically matched to the stored session mode:</p>

<div class="language-typescript highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">export</span> <span class="kd">function</span> <span class="nx">matchSessionMode</span><span class="p">(</span><span class="nx">sessionMode</span><span class="p">:</span> <span class="dl">'</span><span class="s1">coordinator</span><span class="dl">'</span> <span class="o">|</span> <span class="dl">'</span><span class="s1">normal</span><span class="dl">'</span> <span class="o">|</span> <span class="kc">undefined</span><span class="p">):</span> <span class="kr">string</span> <span class="o">|</span> <span class="kc">undefined</span> <span class="p">{</span>
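  <span class="c1">// Assumed derivation of the flag; this line is elided in the excerpt</span>
  <span class="kd">const</span> <span class="nx">sessionIsCoordinator</span> <span class="o">=</span> <span class="nx">sessionMode</span> <span class="o">===</span> <span class="dl">'</span><span class="s1">coordinator</span><span class="dl">'</span>
  <span class="c1">// If current mode doesn't match the resumed session, flip the env var</span>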
  <span class="k">if</span> <span class="p">(</span><span class="nx">sessionIsCoordinator</span><span class="p">)</span> <span class="p">{</span>
    <span class="nx">process</span><span class="p">.</span><span class="nx">env</span><span class="p">.</span><span class="nx">CLAUDE_CODE_COORDINATOR_MODE</span> <span class="o">=</span> <span class="dl">'</span><span class="s1">1</span><span class="dl">'</span>
  <span class="p">}</span> <span class="k">else</span> <span class="p">{</span>
    <span class="k">delete</span> <span class="nx">process</span><span class="p">.</span><span class="nx">env</span><span class="p">.</span><span class="nx">CLAUDE_CODE_COORDINATOR_MODE</span>
  <span class="p">}</span>
<span class="p">}</span>
</code></pre></div></div>

<p>This prevents a normal session from being resumed in coordinator mode (or vice versa), which would cause confusion.</p>

<h2 id="21-the-memory-system-persistent-ai-memory">21. The Memory System: Persistent AI Memory</h2>

<p>Claude Code has a sophisticated file-based memory system (<code class="language-plaintext highlighter-rouge">memdir/</code>) that allows it to remember context across conversations — user preferences, project knowledge, feedback, and reference pointers.</p>

<p><img src="/assets/claudecode/memory.png" alt="Claude Code memory system" /></p>

<h3 id="211-memory-architecture">21.1 Memory Architecture</h3>

<p>Memories are stored as individual markdown files with YAML frontmatter at <code class="language-plaintext highlighter-rouge">~/.claude/projects/&lt;sanitized-project-root&gt;/memory/</code>:</p>

<div class="language-markdown highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nn">---</span>
<span class="na">name</span><span class="pi">:</span> <span class="s">user_role</span>
<span class="na">description</span><span class="pi">:</span> <span class="s">User is a senior backend engineer focused on Rust</span>
<span class="na">type</span><span class="pi">:</span> <span class="s">user</span>
<span class="nn">---</span>

User is a senior backend engineer at Acme Corp, primarily works in Rust...
</code></pre></div></div>

<p>An index file <code class="language-plaintext highlighter-rouge">MEMORY.md</code> (max 200 lines / 25KB) serves as a table of contents — it’s loaded into every conversation’s system prompt so Claude knows what memories exist without reading them all.</p>

<h3 id="212-four-memory-types">21.2 Four Memory Types</h3>

<table>
  <thead>
    <tr>
      <th>Type</th>
      <th>Purpose</th>
      <th>Example</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <td><code class="language-plaintext highlighter-rouge">user</code></td>
      <td>Role, preferences, knowledge level</td>
      <td>“User is a data scientist, new to React”</td>
    </tr>
    <tr>
      <td><code class="language-plaintext highlighter-rouge">feedback</code></td>
      <td>How to approach work (corrections + confirmations)</td>
      <td>“Don’t mock the database in integration tests”</td>
    </tr>
    <tr>
      <td><code class="language-plaintext highlighter-rouge">project</code></td>
      <td>Ongoing work, goals, deadlines</td>
      <td>“Merge freeze begins 2026-03-05 for mobile release”</td>
    </tr>
    <tr>
      <td><code class="language-plaintext highlighter-rouge">reference</code></td>
      <td>Pointers to external systems</td>
      <td>“Pipeline bugs tracked in Linear project INGEST”</td>
    </tr>
  </tbody>
</table>

<p>The system explicitly does NOT save: code patterns, architecture, git history, debugging solutions, or anything derivable from the current project state.</p>

<h3 id="213-intelligent-memory-recall">21.3 Intelligent Memory Recall</h3>

<p>Not all memories are loaded every turn. Instead, a <strong>Sonnet-powered relevance selector</strong> (<code class="language-plaintext highlighter-rouge">findRelevantMemories.ts</code>) runs as a side query:</p>

<ol>
  <li>Scan all <code class="language-plaintext highlighter-rouge">.md</code> files in the memory directory (max 200, newest-first)</li>
  <li>Parse frontmatter headers (name, description, type) from the first 30 lines</li>
  <li>Send the user’s query + memory manifest to Sonnet with structured JSON output</li>
  <li>Sonnet returns up to 5 most relevant filenames</li>
  <li>Those files are injected into the conversation context</li>
</ol>

<p>A clever optimization: recently-used tools are passed to the selector so it skips reference docs for tools Claude is already exercising (e.g., don’t surface MCP spawn docs when Claude is actively using the spawn tool).</p>
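<p>The non-LLM parts of that pipeline (steps 1, 2, and the handling of Sonnet’s answer) can be sketched as plain functions. The type and function names below are hypothetical, and the model call itself is left out; only the limits (200 files, 30 header lines, 5 picks) come from the description above.</p>

```typescript
// Hypothetical shapes; the Sonnet side query itself is not shown.
type MemoryFile = { filename: string; mtimeMs: number; text: string };
type MemoryHeader = { filename: string; name?: string; description?: string; type?: string };

// Step 2: read name/description/type from the first 30 lines only.
function parseHeader(file: MemoryFile): MemoryHeader {
  const header: MemoryHeader = { filename: file.filename };
  const lines = file.text.split(/\r?\n/).slice(0, 30);
  if ((lines[0] || "").trim() !== "---") return header; // no frontmatter
  for (let i = 1; i < lines.length; i++) {
    const line = lines[i] || "";
    if (line.trim() === "---") break; // end of frontmatter
    const m = /^(name|description|type):\s*(.*)$/.exec(line);
    if (m === null) continue;
    const value = (m[2] || "").trim();
    if (m[1] === "name") header.name = value;
    else if (m[1] === "description") header.description = value;
    else header.type = value;
  }
  return header;
}

// Step 1: newest-first, capped at 200 files.
function buildManifest(files: MemoryFile[], limit: number = 200): MemoryHeader[] {
  return files
    .slice()
    .sort((a, b) => b.mtimeMs - a.mtimeMs)
    .slice(0, limit)
    .map((f) => parseHeader(f));
}

// Steps 4-5: keep only filenames that exist in the manifest, at most 5.
function acceptSelection(manifest: MemoryHeader[], picked: string[], max: number = 5): string[] {
  return picked.filter((p) => manifest.some((h) => h.filename === p)).slice(0, max);
}
```

<p>Validating the returned filenames against the manifest is a defensive choice on my part: it keeps a hallucinated filename from turning into a failed file read.</p>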

<h3 id="214-path-security">21.4 Path Security</h3>

<p>The memory path system includes robust security validation:</p>

<ul>
  <li>Rejects relative paths, root paths, UNC paths, null bytes</li>
  <li><code class="language-plaintext highlighter-rouge">projectSettings</code> (committed to repo) is intentionally excluded from <code class="language-plaintext highlighter-rouge">autoMemoryDirectory</code> — a malicious repo could otherwise set <code class="language-plaintext highlighter-rouge">autoMemoryDirectory: "~/.ssh"</code> and gain write access to sensitive directories</li>
  <li>All worktrees of the same git repo share one memory directory (via <code class="language-plaintext highlighter-rouge">findCanonicalGitRoot</code>)</li>
  <li><code class="language-plaintext highlighter-rouge">CLAUDE_COWORK_MEMORY_PATH_OVERRIDE</code> for SDK/Cowork integration</li>
</ul>
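<p>The first bullet translates into a handful of string checks. A minimal sketch, assuming a hypothetical <code class="language-plaintext highlighter-rouge">isSafeMemoryPath</code> helper that works on raw strings (no <code class="language-plaintext highlighter-rouge">~</code> expansion, no symlink resolution); the real validation is likely more thorough.</p>

```typescript
// Hypothetical helper covering only the four rejections listed above.
function isSafeMemoryPath(p: string): boolean {
  // Null bytes can truncate paths in lower-level APIs.
  if (p.indexOf("\0") !== -1) return false;
  // UNC paths (\\server\share or //server/share) point at network locations.
  if (p.slice(0, 2) === "\\\\" || p.slice(0, 2) === "//") return false;
  // Must be absolute: POSIX "/..." or Windows "C:\..." / "C:/...".
  const isPosixAbsolute = p.charAt(0) === "/";
  const isWindowsAbsolute = /^[A-Za-z]:[\\/]/.test(p);
  if (!isPosixAbsolute && !isWindowsAbsolute) return false;
  // Reject filesystem roots such as "/" or "C:\".
  const trimmed = p.replace(/[\\/]+$/, "");
  if (trimmed === "" || /^[A-Za-z]:$/.test(trimmed)) return false;
  return true;
}
```

<p>Note that a literal <code class="language-plaintext highlighter-rouge">"~/.ssh"</code> is rejected here simply because it is not absolute; the <code class="language-plaintext highlighter-rouge">projectSettings</code> exclusion described above is a separate, source-level defence that a path check alone cannot provide.</p>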

<h3 id="215-team-memory">21.5 Team Memory</h3>

<p>When the <code class="language-plaintext highlighter-rouge">TEAMMEM</code> feature is enabled, memories split into <strong>private</strong> (per-user) and <strong>team</strong> (shared) directories. User preferences stay private; project conventions and reference pointers default to team scope. A conflict rule prevents private feedback memories from contradicting team-level ones.</p>

<h2 id="22-hooks-user-defined-automation">22. Hooks: User-Defined Automation</h2>

<p>The hooks system (<code class="language-plaintext highlighter-rouge">schemas/hooks.ts</code>) lets users attach automated behaviors to Claude Code events — shell commands, LLM prompts, HTTP calls, or agent verifiers that fire before/after tool use, message submission, and more.</p>

<h3 id="221-four-hook-types">22.1 Four Hook Types</h3>

<div class="language-typescript highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kd">type</span> <span class="nx">HookCommand</span> <span class="o">=</span>
  <span class="o">|</span> <span class="p">{</span> <span class="na">type</span><span class="p">:</span> <span class="dl">'</span><span class="s1">command</span><span class="dl">'</span><span class="p">;</span> <span class="nl">command</span><span class="p">:</span> <span class="kr">string</span><span class="p">;</span> <span class="nl">shell</span><span class="p">?:</span> <span class="dl">'</span><span class="s1">bash</span><span class="dl">'</span> <span class="o">|</span> <span class="dl">'</span><span class="s1">powershell</span><span class="dl">'</span> <span class="p">}</span>
  <span class="o">|</span> <span class="p">{</span> <span class="na">type</span><span class="p">:</span> <span class="dl">'</span><span class="s1">prompt</span><span class="dl">'</span><span class="p">;</span> <span class="nl">prompt</span><span class="p">:</span> <span class="kr">string</span><span class="p">;</span> <span class="nl">model</span><span class="p">?:</span> <span class="kr">string</span> <span class="p">}</span>
  <span class="o">|</span> <span class="p">{</span> <span class="na">type</span><span class="p">:</span> <span class="dl">'</span><span class="s1">http</span><span class="dl">'</span><span class="p">;</span> <span class="nl">url</span><span class="p">:</span> <span class="kr">string</span><span class="p">;</span> <span class="nl">headers</span><span class="p">?:</span> <span class="nb">Record</span><span class="o">&lt;</span><span class="kr">string</span><span class="p">,</span> <span class="kr">string</span><span class="o">&gt;</span> <span class="p">}</span>
  <span class="o">|</span> <span class="p">{</span> <span class="na">type</span><span class="p">:</span> <span class="dl">'</span><span class="s1">agent</span><span class="dl">'</span><span class="p">;</span> <span class="nl">prompt</span><span class="p">:</span> <span class="kr">string</span><span class="p">;</span> <span class="nl">model</span><span class="p">?:</span> <span class="kr">string</span> <span class="p">}</span>
</code></pre></div></div>

<p><strong>Command hooks</strong> run shell commands with optional timeout, async/background execution, and one-shot mode (<code class="language-plaintext highlighter-rouge">once: true</code> — runs once then auto-removes).</p>

<p><strong>Prompt hooks</strong> evaluate an LLM prompt with <code class="language-plaintext highlighter-rouge">$ARGUMENTS</code> placeholder for the hook input JSON.</p>

<p><strong>HTTP hooks</strong> POST the hook input to a URL with configurable headers and env var interpolation (only explicitly-allowed env vars are resolved — prevents leaking secrets).</p>

<p><strong>Agent hooks</strong> run a full agentic verification loop (“Verify that unit tests ran and passed”) with configurable model and timeout.</p>

<h3 id="222-event-matcher-hook-pipeline">22.2 Event-Matcher-Hook Pipeline</h3>

<p>Hooks are configured in <code class="language-plaintext highlighter-rouge">settings.json</code> as a three-level structure:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>Event → Matcher[] → Hook[]
</code></pre></div></div>

<p>Each <strong>event</strong> (PreToolUse, PostToolUse, UserPromptSubmit, Stop, etc.) has an array of <strong>matchers</strong> with optional permission-rule-syntax patterns (e.g., <code class="language-plaintext highlighter-rouge">"Bash(git *)"</code>, which only fires for git commands). Each matcher has an array of <strong>hooks</strong> to execute.</p>

<p>The <code class="language-plaintext highlighter-rouge">if</code> condition field uses the same permission rule syntax as the tool permission system, evaluated against <code class="language-plaintext highlighter-rouge">tool_name</code> and <code class="language-plaintext highlighter-rouge">tool_input</code> — so hooks can fire selectively without spawning a process for every tool call.</p>
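<p>To illustrate the Event → Matcher[] → Hook[] resolution, here is a small sketch that picks the hooks to run for one tool call. The shapes are trimmed-down stand-ins for the real schema, the rule parser handles only <code class="language-plaintext highlighter-rouge">Tool</code> and <code class="language-plaintext highlighter-rouge">Tool(pattern)</code> with <code class="language-plaintext highlighter-rouge">*</code> as a wildcard, and the tool input is reduced to a single string (for Bash, the command). All of that is assumption, not the actual matching code.</p>

```typescript
// Hypothetical, reduced shapes; the real schema has more hook types and fields.
type Hook = { type: "command"; command: string };
type Matcher = { matcher?: string; hooks: Hook[] };
type HookSettings = { [eventName: string]: Matcher[] };

// Matches "Tool" or "Tool(pattern)", where "*" in the pattern is a wildcard.
function ruleMatches(rule: string | undefined, toolName: string, toolInput: string): boolean {
  if (rule === undefined) return true; // no matcher: applies to every call
  const parsed = /^(\w+)(?:\((.*)\))?$/.exec(rule);
  if (parsed === null || parsed[1] !== toolName) return false;
  const pattern = parsed[2];
  if (pattern === undefined) return true; // bare tool name
  // Escape regex metacharacters, then turn "*" into ".*".
  const escaped = pattern.replace(/[.+?^${}()|[\]\\]/g, "\\$&").replace(/\*/g, ".*");
  return new RegExp("^" + escaped + "$").test(toolInput);
}

function hooksFor(settings: HookSettings, eventName: string, toolName: string, toolInput: string): Hook[] {
  const selected: Hook[] = [];
  for (const m of settings[eventName] || []) {
    if (ruleMatches(m.matcher, toolName, toolInput)) {
      for (const h of m.hooks) selected.push(h);
    }
  }
  return selected;
}

const exampleSettings: HookSettings = {
  PreToolUse: [
    { matcher: "Bash(git *)", hooks: [{ type: "command", command: "echo git call" }] },
    { hooks: [{ type: "command", command: "echo any tool" }] },
  ],
};
```

<p>With these settings, a Bash call running <code class="language-plaintext highlighter-rouge">git status</code> selects both hooks, while <code class="language-plaintext highlighter-rouge">ls</code> selects only the matcher-less one. Because the match is a cheap in-process string test, no hook process is spawned for calls that do not match.</p>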

<h3 id="223-advanced-hook-features">22.3 Advanced Hook Features</h3>

<ul>
  <li><strong><code class="language-plaintext highlighter-rouge">async: true</code></strong> — Hook runs in background without blocking the model</li>
  <li><strong><code class="language-plaintext highlighter-rouge">asyncRewake: true</code></strong> — Runs in background but wakes the model on exit code 2 (blocking error)</li>
  <li><strong><code class="language-plaintext highlighter-rouge">once: true</code></strong> — Auto-removes after first execution (useful for one-time setup)</li>
  <li><strong><code class="language-plaintext highlighter-rouge">statusMessage</code></strong> — Custom spinner text while the hook runs</li>
  <li><strong>Environment variable interpolation</strong> in HTTP headers with explicit allowlist</li>
</ul>

<h2 id="23-voice-mode-bridge-and-infrastructure">23. Voice Mode, Bridge, and Infrastructure</h2>

<p><img src="/assets/claudecode/voice.png" alt="Claude Code voice mode and bridge architecture" /></p>

<h3 id="231-voice-mode">23.1 Voice Mode</h3>

<p>Claude Code includes a <strong>voice input mode</strong> (feature-flagged as <code class="language-plaintext highlighter-rouge">VOICE_MODE</code>) that allows voice-to-text interaction:</p>

<ul>
  <li>Requires Anthropic OAuth (not API keys, Bedrock, or Vertex) — uses the <code class="language-plaintext highlighter-rouge">voice_stream</code> endpoint on claude.ai</li>
  <li>Protected by a GrowthBook kill-switch (<code class="language-plaintext highlighter-rouge">tengu_amber_quartz_disabled</code>) for emergency off</li>
  <li>Auth check uses memoized keychain reads (~20-50ms first call, cache hit thereafter)</li>
  <li>The <code class="language-plaintext highlighter-rouge">/voice</code> command, ConfigTool, and a <code class="language-plaintext highlighter-rouge">VoiceModeNotice</code> component all gate on <code class="language-plaintext highlighter-rouge">isVoiceModeEnabled()</code></li>
</ul>

<h3 id="232-the-bridge-system-31-files">23.2 The Bridge System (31 Files)</h3>

<p>The bridge (<code class="language-plaintext highlighter-rouge">bridge/</code>) is the most substantial networking subsystem — a persistent WebSocket connection between the local CLI and claude.ai’s web interface (CCR). It enables using Claude Code from a browser while the actual tools execute locally.</p>

<p><strong>Key components:</strong></p>
<ul>
  <li><strong><code class="language-plaintext highlighter-rouge">bridgeMain.ts</code></strong> — Main bridge loop with exponential backoff (2s initial → 2min cap → 10min give-up)</li>
  <li><strong><code class="language-plaintext highlighter-rouge">replBridge.ts</code> / <code class="language-plaintext highlighter-rouge">replBridgeTransport.ts</code></strong> — REPL-side bridge handle, message framing</li>
  <li><strong><code class="language-plaintext highlighter-rouge">bridgeApi.ts</code></strong> — API client with JWT refresh, trusted device tokens, session validation</li>
  <li><strong><code class="language-plaintext highlighter-rouge">bridgeMessaging.ts</code> / <code class="language-plaintext highlighter-rouge">inboundMessages.ts</code></strong> — Message adaptation (SDK format ↔ local format)</li>
  <li><strong><code class="language-plaintext highlighter-rouge">bridgePermissionCallbacks.ts</code></strong> — Permission request mediation between web UI and local CLI</li>
  <li><strong><code class="language-plaintext highlighter-rouge">sessionRunner.ts</code></strong> — Spawns agent sessions per work item, manages worktrees</li>
  <li><strong><code class="language-plaintext highlighter-rouge">capacityWake.ts</code></strong> — Wakes idle bridge when capacity becomes available</li>
  <li><strong><code class="language-plaintext highlighter-rouge">workSecret.ts</code></strong> — Encrypted work routing between bridge workers</li>
</ul>

<p>The bridge handles session lifecycle, token refresh, trusted device enrollment, and graceful reconnection — essentially a mini-RPC framework over WebSocket.</p>
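<p>The reconnect timing in <code class="language-plaintext highlighter-rouge">bridgeMain.ts</code> (2s initial, 2min cap, 10min give-up) can be sketched as follows. The doubling factor, the absence of jitter, and counting only waiting time against the give-up limit are assumptions on my part; the post gives just the three numbers.</p>

```typescript
// Assumed: the delay doubles per attempt and no jitter is applied.
const INITIAL_MS = 2 * 1000;
const CAP_MS = 2 * 60 * 1000;
const GIVE_UP_MS = 10 * 60 * 1000;

function delayFor(attempt: number): number {
  return Math.min(INITIAL_MS * Math.pow(2, attempt), CAP_MS);
}

// Delays that fit before the give-up point, counting waiting time only.
function reconnectSchedule(): number[] {
  const delays: number[] = [];
  let waited = 0;
  for (let attempt = 0; ; attempt++) {
    const d = delayFor(attempt);
    if (waited + d > GIVE_UP_MS) return delays;
    delays.push(d);
    waited += d;
  }
}
```

<p>Under these assumptions the schedule is 2s, 4s, 8s, 16s, 32s, 64s, then 120s three times: nine retries over 486 seconds before the bridge gives up.</p>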

<h3 id="233-direct-connect">23.3 Direct Connect</h3>

<p>The <code class="language-plaintext highlighter-rouge">server/</code> directory implements <strong>Direct Connect</strong> — a WebSocket-based protocol for external clients to connect to a running Claude Code instance:</p>

<div class="language-typescript highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kd">class</span> <span class="nx">DirectConnectSessionManager</span> <span class="p">{</span>
  <span class="nx">connect</span><span class="p">():</span> <span class="k">void</span>                    <span class="c1">// Open WebSocket</span>
  <span class="nx">sendMessage</span><span class="p">(</span><span class="nx">content</span><span class="p">):</span> <span class="nx">boolean</span>      <span class="c1">// Send user message</span>
  <span class="nx">respondToPermissionRequest</span><span class="p">(...)</span>    <span class="c1">// Handle tool permission prompts</span>
  <span class="nx">sendInterrupt</span><span class="p">():</span> <span class="k">void</span>              <span class="c1">// Cancel current request</span>
  <span class="nx">disconnect</span><span class="p">():</span> <span class="k">void</span>                 <span class="c1">// Close connection</span>
<span class="p">}</span>
</code></pre></div></div>

<p>Messages are JSON-over-WebSocket using the SDK message format. Control requests (permission prompts) are forwarded to the client, which responds with allow/deny decisions. This enables IDE integrations and custom frontends.</p>
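<p>As a rough sketch of the client side of that exchange, the function below parses one incoming frame and builds an allow/deny reply. The message shapes (<code class="language-plaintext highlighter-rouge">control_request</code>, <code class="language-plaintext highlighter-rouge">requestId</code>, <code class="language-plaintext highlighter-rouge">toolName</code>, <code class="language-plaintext highlighter-rouge">decision</code>) are assumptions made for illustration; the real SDK message format is not reproduced here.</p>

```typescript
// Hypothetical message shapes, for illustration only.
type ControlRequest = { type: 'control_request'; requestId: string; toolName: string }
type ControlResponse = { type: 'control_response'; requestId: string; decision: 'allow' | 'deny' }

// Parse one incoming frame and, if it is a permission prompt, build the reply frame.
// Returns null for frames that are not control requests.
function answerPermissionRequest(frame: string, allowedTools: string[]): string | null {
  let parsed: unknown
  try {
    parsed = JSON.parse(frame)
  } catch {
    return null // not valid JSON, ignore the frame
  }
  if (typeof parsed !== 'object' || parsed === null) return null
  const msg = parsed as Partial<ControlRequest>
  if (msg.type !== 'control_request') return null
  if (typeof msg.requestId !== 'string' || typeof msg.toolName !== 'string') return null

  const response: ControlResponse = {
    type: 'control_response',
    requestId: msg.requestId,
    // Deny by default; allow only tools the client has explicitly listed.
    decision: allowedTools.indexOf(msg.toolName) !== -1 ? 'allow' : 'deny',
  }
  return JSON.stringify(response)
}
```

<p>A custom frontend would call something like this from its WebSocket message handler and send the returned string back over the same socket.</p>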

<h3 id="234-upstream-proxy-ccr-security">23.4 Upstream Proxy (CCR Security)</h3>

<p>When running inside a CCR container, the upstream proxy system (<code class="language-plaintext highlighter-rouge">upstreamproxy/</code>) provides secure network access:</p>

<ol>
  <li><strong>Read session token</strong> from <code class="language-plaintext highlighter-rouge">/run/ccr/session_token</code></li>
  <li><strong>Set <code class="language-plaintext highlighter-rouge">prctl(PR_SET_DUMPABLE, 0)</code></strong> — blocks same-UID ptrace (prevents prompt-injected <code class="language-plaintext highlighter-rouge">gdb -p $PPID</code> from scraping the token off the heap)</li>
  <li><strong>Download CA certificate</strong> and concatenate with system bundle for MITM proxy trust</li>
  <li><strong>Start local CONNECT→WebSocket relay</strong> on a random port</li>
  <li><strong>Unlink the token file</strong> (token stays heap-only; file is gone before the agent loop can access it)</li>
  <li><strong>Inject <code class="language-plaintext highlighter-rouge">HTTPS_PROXY</code> / <code class="language-plaintext highlighter-rouge">SSL_CERT_FILE</code></strong> env vars for all subprocesses</li>
</ol>

<p>Every step fails open — a broken proxy never breaks an otherwise-working session. The NO_PROXY list covers loopback, RFC1918, IMDS, Anthropic API, GitHub, and package registries.</p>
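<p>One way to picture the fail-open behaviour is a step runner that never throws. This is a minimal sketch, not the actual <code class="language-plaintext highlighter-rouge">upstreamproxy/</code> code: the <code class="language-plaintext highlighter-rouge">Step</code> type and <code class="language-plaintext highlighter-rouge">setUpProxyFailOpen</code> are hypothetical names, and stopping at the first failure is an assumption.</p>

```typescript
type StepResult = { ok: true } | { ok: false; reason: string }
type Step = { name: string; run: () => StepResult }

// Runs setup steps in order. On the first failure it stops and reports
// "proxy disabled" instead of throwing, so the session can continue without it.
function setUpProxyFailOpen(steps: Step[]): { enabled: boolean; failedAt?: string } {
  for (const step of steps) {
    let result: StepResult
    try {
      result = step.run()
    } catch (err) {
      // A thrown error is treated like any other failed step.
      result = { ok: false, reason: String(err) }
    }
    if (!result.ok) return { enabled: false, failedAt: step.name }
  }
  return { enabled: true }
}
```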

<h3 id="235-output-styles">23.5 Output Styles</h3>

<p>The <code class="language-plaintext highlighter-rouge">outputStyles/</code> system lets users customize Claude’s response format via markdown files:</p>

<ul>
  <li>Project styles: <code class="language-plaintext highlighter-rouge">.claude/output-styles/*.md</code></li>
  <li>User styles: <code class="language-plaintext highlighter-rouge">~/.claude/output-styles/*.md</code></li>
  <li>Plugin styles: provided by installed plugins</li>
</ul>

<p>Each style file has frontmatter (<code class="language-plaintext highlighter-rouge">name</code>, <code class="language-plaintext highlighter-rouge">description</code>, <code class="language-plaintext highlighter-rouge">keep-coding-instructions</code>) and a prompt body that shapes how Claude formats its responses.</p>
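<p>To make the file shape concrete, here is a small sketch of reading such a file. It assumes flat <code class="language-plaintext highlighter-rouge">key: value</code> frontmatter between <code class="language-plaintext highlighter-rouge">---</code> fences and is not a full YAML parser; <code class="language-plaintext highlighter-rouge">parseOutputStyle</code> and the <code class="language-plaintext highlighter-rouge">OutputStyle</code> type are hypothetical names, not taken from the codebase.</p>

```typescript
type OutputStyle = {
  name: string
  description: string
  keepCodingInstructions: boolean
  prompt: string
}

// Splits a style file into frontmatter fields and the prompt body.
// Returns null when the frontmatter block or the name field is missing.
function parseOutputStyle(source: string): OutputStyle | null {
  const match = /^---\r?\n([\s\S]*?)\r?\n---\r?\n?([\s\S]*)$/.exec(source)
  if (match === null) return null

  const fields: { [key: string]: string } = {}
  match[1].split(/\r?\n/).forEach((line) => {
    const sep = line.indexOf(':')
    if (sep > 0) fields[line.slice(0, sep).trim()] = line.slice(sep + 1).trim()
  })

  if (!fields['name']) return null
  return {
    name: fields['name'],
    description: fields['description'] || '',
    keepCodingInstructions: fields['keep-coding-instructions'] === 'true',
    prompt: match[2].trim(),
  }
}
```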

<h3 id="236-native-typescript-modules">23.6 Native TypeScript Modules</h3>

<p><code class="language-plaintext highlighter-rouge">native-ts/</code> contains TypeScript bindings for performance-critical native code:</p>
<ul>
  <li><strong><code class="language-plaintext highlighter-rouge">yoga-layout/</code></strong> — TypeScript interface to the Yoga layout engine (flexbox calculations)</li>
  <li><strong><code class="language-plaintext highlighter-rouge">file-index/</code></strong> — Native file indexing for fast codebase search</li>
  <li><strong><code class="language-plaintext highlighter-rouge">color-diff/</code></strong> — Native color difference calculations (for theme/styling)</li>
</ul>

<h3 id="237-moreright-internal-only">23.7 Moreright (Internal-Only)</h3>

<p>The <code class="language-plaintext highlighter-rouge">moreright/</code> directory contains a stub for an internal-only feature. The external build ships a no-op implementation with <code class="language-plaintext highlighter-rouge">onBeforeQuery</code>, <code class="language-plaintext highlighter-rouge">onTurnComplete</code>, and <code class="language-plaintext highlighter-rouge">render</code> all returning trivially. The real implementation is internal to Anthropic.</p>

<h2 id="24-vim-mode-keybindings-and-developer-ergonomics">24. Vim Mode, Keybindings, and Developer Ergonomics</h2>

<h3 id="241-vim-mode">24.1 Vim Mode</h3>

<p>Claude Code implements a full vi command system:</p>

<ul>
  <li><strong>Motions</strong> — h, j, k, l, w, b, e, 0, $, gg, G</li>
  <li><strong>Operators</strong> — d (delete), c (change), y (yank)</li>
  <li><strong>Text Objects</strong> — iw (inner word), ap (a paragraph)</li>
  <li><strong>Modal State Machine</strong> — Insert, Normal, Visual modes</li>
</ul>

<p>These are all compiled into a single-pass command matcher for low-latency input processing.</p>
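<p>The sketch below shows the general idea of matching a key buffer against operators, motions, and text objects, reporting "pending" while the buffer is still a valid prefix. It covers only the keys listed above (so doubled operators such as <code class="language-plaintext highlighter-rouge">dd</code> are out of scope), and <code class="language-plaintext highlighter-rouge">matchCommand</code> is a hypothetical name rather than the matcher the codebase actually uses.</p>

```typescript
type Operator = 'd' | 'c' | 'y'
type VimCommand =
  | { kind: 'motion'; motion: string }
  | { kind: 'operator'; operator: Operator; target: string }

const MOTIONS = ['h', 'j', 'k', 'l', 'w', 'b', 'e', '0', '$', 'gg', 'G']
const TEXT_OBJECTS = ['iw', 'ap']

// Returns the parsed command, 'pending' if more keys are needed, or null if invalid.
function matchCommand(keys: string): VimCommand | 'pending' | null {
  if (keys === '') return 'pending'
  const first = keys.charAt(0)
  const operator: Operator | null =
    first === 'd' || first === 'c' || first === 'y' ? first : null
  const rest = operator === null ? keys : keys.slice(1)
  if (rest === '') return 'pending' // operator typed, target still to come

  // Text objects are only valid after an operator.
  const targets = operator === null ? MOTIONS : MOTIONS.concat(TEXT_OBJECTS)
  if (targets.indexOf(rest) !== -1) {
    return operator === null
      ? { kind: 'motion', motion: rest }
      : { kind: 'operator', operator, target: rest }
  }
  const isPrefix = targets.some((t) => t.indexOf(rest) === 0)
  return isPrefix ? 'pending' : null
}
```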

<h3 id="242-dynamic-keybindings">24.2 Dynamic Keybindings</h3>

<p>Context-aware keybinding resolution:</p>

<div class="language-typescript highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kd">type</span> <span class="nx">KeybindingContext</span> <span class="o">=</span> <span class="p">{</span>
  <span class="nx">focus</span><span class="p">?:</span> <span class="dl">'</span><span class="s1">prompt</span><span class="dl">'</span> <span class="o">|</span> <span class="dl">'</span><span class="s1">file</span><span class="dl">'</span> <span class="o">|</span> <span class="dl">'</span><span class="s1">terminal</span><span class="dl">'</span>
  <span class="nx">isRecording</span><span class="p">?:</span> <span class="nx">boolean</span>
  <span class="nx">vimMode</span><span class="p">?:</span> <span class="nx">boolean</span>
  <span class="nx">mode</span><span class="p">?:</span> <span class="dl">'</span><span class="s1">insert</span><span class="dl">'</span> <span class="o">|</span> <span class="dl">'</span><span class="s1">normal</span><span class="dl">'</span> <span class="o">|</span> <span class="dl">'</span><span class="s1">visual</span><span class="dl">'</span>
<span class="p">}</span>
</code></pre></div></div>

<p>Users can define chord bindings: <code class="language-plaintext highlighter-rouge">ctrl+k ctrl+o</code> maps to custom actions via <code class="language-plaintext highlighter-rouge">~/.claude/keybindings.json</code>.</p>

<h3 id="243-debug-tools">24.3 Debug Tools</h3>

<ul>
  <li><code class="language-plaintext highlighter-rouge">CLAUDE_CODE_DEBUG_REPAINTS=1</code> — Shows component owner chain for every repaint</li>
  <li><code class="language-plaintext highlighter-rouge">CLAUDE_CODE_COMMIT_LOG=/tmp/commits.log</code> — Logs slow renders for profiling</li>
  <li><code class="language-plaintext highlighter-rouge">CLAUDE_CODE_PROFILE_STARTUP=1</code> — Full startup profiling with memory snapshots</li>
</ul>

<h2 id="25-key-engineering-patterns-and-takeaways">25. Key Engineering Patterns and Takeaways</h2>

<h3 id="pattern-1-lazy-everything">Pattern 1: Lazy Everything</h3>

<p>Claude Code is aggressive about deferral:</p>
<ul>
  <li><strong>Lazy schemas</strong> — Zod instantiation deferred via <code class="language-plaintext highlighter-rouge">lazySchema()</code></li>
  <li><strong>Lazy commands</strong> — Module imports via <code class="language-plaintext highlighter-rouge">load()</code> functions</li>
  <li><strong>Lazy tools</strong> — 18 tools deferred to <code class="language-plaintext highlighter-rouge">ToolSearchTool</code></li>
  <li><strong>Lazy modules</strong> — Dynamic imports for OpenTelemetry, analytics, heavy components</li>
  <li><strong>Lazy bundled skills</strong> — Reference files extracted on first use</li>
</ul>
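<p>The deferral idea behind these bullets can be sketched as a one-shot memoized factory. The helper below is a generic illustration; it is not the codebase's <code class="language-plaintext highlighter-rouge">lazySchema()</code>, whose signature is not shown in this post.</p>

```typescript
// Wraps an expensive factory so it runs at most once, on first access.
function lazy<T>(factory: () => T): () => T {
  let cached: { value: T } | null = null
  return () => {
    if (cached === null) cached = { value: factory() }
    return cached.value
  }
}

// Usage: nothing is built at module load time, only when first needed.
let schemaBuilds = 0
const getSchema = lazy(() => {
  schemaBuilds += 1
  return { fields: ['path', 'content'] } // stand-in for a heavy schema object
})
```

<p>Wrapping the cached value in an object keeps the helper correct even when the factory legitimately returns <code class="language-plaintext highlighter-rouge">null</code> or <code class="language-plaintext highlighter-rouge">undefined</code>.</p>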

<h3 id="pattern-2-memoization-by-identity">Pattern 2: Memoization by Identity</h3>

<p>Key functions are memoized to prevent redundant work:</p>
<ul>
  <li><code class="language-plaintext highlighter-rouge">COMMANDS()</code> — Memoized, cleared by <code class="language-plaintext highlighter-rouge">clearCommandMemoizationCaches()</code></li>
  <li><code class="language-plaintext highlighter-rouge">loadAllCommands(cwd)</code> — Memoized by working directory</li>
  <li><code class="language-plaintext highlighter-rouge">init()</code> — Memoized to prevent re-entrancy</li>
</ul>
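<p>A minimal sketch of memoizing by key with an explicit cache reset, in the spirit of <code class="language-plaintext highlighter-rouge">loadAllCommands(cwd)</code> and <code class="language-plaintext highlighter-rouge">clearCommandMemoizationCaches()</code>. The <code class="language-plaintext highlighter-rouge">memoizeByKey</code> helper is hypothetical and only illustrates the pattern.</p>

```typescript
// Caches results per string key (for example a working directory)
// and exposes a clear() hook for invalidation.
function memoizeByKey<R>(fn: (key: string) => R): { get: (key: string) => R; clear: () => void } {
  const cache = new Map<string, { value: R }>()
  return {
    get(key: string): R {
      const hit = cache.get(key)
      if (hit !== undefined) return hit.value
      const value = fn(key)
      cache.set(key, { value })
      return value
    },
    clear(): void {
      cache.clear()
    },
  }
}
```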

<h3 id="pattern-3-feature-flags-for-dead-code-elimination">Pattern 3: Feature Flags for Dead Code Elimination</h3>

<p>Bun’s <code class="language-plaintext highlighter-rouge">feature()</code> function enables compile-time dead code elimination:</p>

<div class="language-typescript highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">if</span> <span class="p">(</span><span class="nx">feature</span><span class="p">(</span><span class="dl">'</span><span class="s1">COORDINATOR_MODE</span><span class="dl">'</span><span class="p">))</span> <span class="p">{</span>
  <span class="c1">// This entire block is removed from the binary when the flag is off</span>
  <span class="kd">const</span> <span class="p">{</span> <span class="nx">CoordinatorUI</span> <span class="p">}</span> <span class="o">=</span> <span class="k">await</span> <span class="k">import</span><span class="p">(</span><span class="dl">'</span><span class="s1">./coordinator/index.js</span><span class="dl">'</span><span class="p">)</span>
<span class="p">}</span>
</code></pre></div></div>

<h3 id="pattern-4-interning-for-performance">Pattern 4: Interning for Performance</h3>

<p>Three interning pools (chars, styles, hyperlinks) reduce memory and enable O(1) comparison by integer ID instead of string equality. The style pool even pre-computes ANSI transition sequences.</p>
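<p>For readers unfamiliar with interning, here is a minimal string pool. It is a generic sketch of the technique, assuming nothing about how the real char, style, and hyperlink pools are laid out.</p>

```typescript
// Maps each distinct string to a small integer ID, assigned on first sight.
class InternPool {
  private ids = new Map<string, number>()
  private values: string[] = []

  intern(value: string): number {
    const existing = this.ids.get(value)
    if (existing !== undefined) return existing
    const id = this.values.length
    this.values.push(value)
    this.ids.set(value, id)
    return id
  }

  lookup(id: number): string | undefined {
    return this.values[id]
  }
}
```

<p>Once two cells hold interned IDs, checking whether their styles match is a single integer comparison instead of a string comparison.</p>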

<h3 id="pattern-5-fail-closed-security">Pattern 5: Fail-Closed Security</h3>

<p>The <code class="language-plaintext highlighter-rouge">buildTool()</code> factory provides safe defaults for 7 commonly-stubbed methods. Permissions default to “ask” — a tool must explicitly opt into auto-approval.</p>

<h3 id="pattern-6-centralized-side-effects">Pattern 6: Centralized Side Effects</h3>

<p><code class="language-plaintext highlighter-rouge">onChangeAppState()</code> is the single choke point for all state mutations that affect external systems. No scattered <code class="language-plaintext highlighter-rouge">useEffect</code> side effects.</p>

<h3 id="pattern-7-file-based-ipc">Pattern 7: File-Based IPC</h3>

<p>Multi-agent coordination uses files, not sockets:</p>
<ul>
  <li>Task outputs in <code class="language-plaintext highlighter-rouge">~/.claude/</code></li>
  <li>History in <code class="language-plaintext highlighter-rouge">~/.claude/history.jsonl</code></li>
  <li>Session transcripts for resume</li>
  <li>Lock files with retry backoff for concurrent access</li>
</ul>

<h3 id="pattern-8-prompt-cache-stability">Pattern 8: Prompt Cache Stability</h3>

<p>Tools are sorted alphabetically before being sent to the API. This keeps the tool list in the same order across requests, maximizing prompt cache hit rates.</p>
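<p>A small sketch of why the sort matters: two sessions that register the same tools in different orders still serialize an identical list. The <code class="language-plaintext highlighter-rouge">ToolDef</code> shape and tool names are placeholders, and the comparison by code unit (rather than a locale-aware compare) is an assumption chosen so the order does not depend on the machine's locale.</p>

```typescript
type ToolDef = { name: string; description: string }

// Returns a copy sorted by name so the serialized prefix is byte-identical
// across requests, regardless of registration order.
function stableToolList(tools: ToolDef[]): ToolDef[] {
  return tools.slice().sort((a, b) => (a.name < b.name ? -1 : a.name > b.name ? 1 : 0))
}
```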

<h3 id="pattern-9-progressive-disclosure">Pattern 9: Progressive Disclosure</h3>

<p>The deferred tool system implements progressive disclosure at the API level:</p>
<ul>
  <li>Base prompt stays under 200K tokens</li>
  <li>Model discovers additional tools on demand via <code class="language-plaintext highlighter-rouge">ToolSearchTool</code></li>
  <li>Discovered tools are callable in the same turn</li>
</ul>

<h3 id="pattern-10-three-tier-configuration">Pattern 10: Layered Configuration</h3>

<p>Settings are resolved from multiple sources with clear precedence:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>MDM Policy (highest) → Remote Managed → User Settings
→ Project Config → Global Config → Defaults (lowest)
</code></pre></div></div>
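<p>The precedence chain above can be sketched as a merge in which higher layers overwrite lower ones. The setting keys used in the usage example are invented for illustration, and <code class="language-plaintext highlighter-rouge">resolveSettings</code> is a hypothetical helper, not the codebase's resolver.</p>

```typescript
type Settings = { [key: string]: string | number | boolean | undefined }

// Takes layers ordered from highest to lowest precedence and merges them.
// Applying the lowest layer first lets each higher layer overwrite it.
function resolveSettings(layersHighToLow: Settings[]): Settings {
  const resolved: Settings = {}
  for (let i = layersHighToLow.length - 1; i >= 0; i--) {
    const layer = layersHighToLow[i]
    Object.keys(layer).forEach((key) => {
      if (layer[key] !== undefined) resolved[key] = layer[key]
    })
  }
  return resolved
}
```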

<h2 id="26-conclusion">26. Conclusion</h2>

<p>Claude Code is a remarkable piece of engineering. What appears to the user as a simple chat interface in the terminal is backed by:</p>

<ul>
  <li>A <strong>custom React reconciler</strong> with Yoga layout, double-buffered rendering, and hardware scroll optimization</li>
  <li>A <strong>resilient query engine</strong> with automatic context compression, multi-strategy error recovery, and token budget continuation</li>
  <li>A <strong>60+ tool ecosystem</strong> unified under a single generic interface with Zod validation, lazy schemas, and elastic discovery</li>
  <li>A <strong>multi-layered permission system</strong> balancing security and developer productivity across 5 modes, rule patterns, and ML classifiers</li>
  <li>An <strong>extensibility framework</strong> spanning skills, plugins, and MCP with 8 configuration scopes and 5 transport types</li>
  <li><strong>Production-grade infrastructure</strong>: interned style pools, file-based IPC, sampled profiling, parallelized startup, and comprehensive telemetry</li>
</ul>

<p>The codebase demonstrates that a CLI tool can be as architecturally sophisticated as any web application — perhaps more so, given the unique constraints of terminal rendering, keyboard input ambiguity, and the need to coordinate an AI model, file system, shell, and git repository all within a single conversation loop.</p>

<p>For developers building similar tools, the key lessons are:</p>

<ol>
  <li><strong>Invest in the rendering layer.</strong> Claude Code’s custom Ink framework is its competitive advantage for terminal UX.</li>
  <li><strong>Design for failure.</strong> The multi-strategy error recovery (compaction → collapse → fallback → surface) means users almost never see raw API errors.</li>
  <li><strong>Defer aggressively.</strong> Lazy loading at every level — schemas, modules, tools, skills — keeps startup fast and memory bounded.</li>
  <li><strong>Intern everything.</strong> Style pools, character pools, and hyperlink pools turn O(n) string comparisons into O(1) integer comparisons.</li>
  <li><strong>Make safety the default.</strong> Fail-closed permissions, dangerous pattern detection, and mandatory confirmation for destructive operations build user trust.</li>
</ol>

<p>Claude Code isn’t just a wrapper around an API. It’s a complete development environment that happens to run in your terminal.</p>

<p><em>This analysis is based on examination of the Claude Code source code. All technical details reflect the codebase as observed at the time of analysis.</em></p>]]></content><author><name>Sathwick</name><email>sathwick.p7@gmail.com</email></author><category term="architecture" /><summary type="html"><![CDATA[An exhaustive technical analysis of the architecture, systems, and engineering behind Claude Code — Anthropic's flagship developer tool.]]></summary></entry><entry><title type="html">Claude Code’s Hidden Features: Undocumented, Gated, and Internal Capabilities</title><link href="https://sathwick.xyz/blog/claude-hidden.html" rel="alternate" type="text/html" title="Claude Code’s Hidden Features: Undocumented, Gated, and Internal Capabilities" /><published>2026-03-31T00:00:00+00:00</published><updated>2026-03-31T00:00:00+00:00</updated><id>https://sathwick.xyz/blog/claude-hidden</id><content type="html" xml:base="https://sathwick.xyz/blog/claude-hidden.html"><![CDATA[<p><strong>Genuinely hidden, gated, or underdocumented capabilities found in the source code — things the public docs don’t cover.</strong></p>

<p><em>Based on direct source inspection. “Hidden” = hidden from <code class="language-plaintext highlighter-rouge">--help</code>, feature-flagged, or dependent on non-public backends. Source presence does not guarantee your build/account has access.</em></p>

<h2 id="table-of-contents">Table of Contents</h2>

<ul>
  <li><a href="#1-hidden-cli-flags">1. Hidden CLI Flags</a></li>
  <li><a href="#2-feature-gated-slash-commands">2. Feature-Gated Slash Commands</a></li>
  <li><a href="#3-the-buddy-system--a-full-tamagotchi-pet">3. The Buddy System — A Full Tamagotchi Pet</a></li>
  <li><a href="#4-kairos--persistent-autonomous-assistant-mode">4. KAIROS — Persistent Autonomous Assistant Mode</a></li>
  <li><a href="#5-auto-dream--background-memory-consolidation">5. Auto-Dream — Background Memory Consolidation</a></li>
  <li><a href="#6-magic-docs--self-maintaining-documentation">6. Magic Docs — Self-Maintaining Documentation</a></li>
  <li><a href="#7-ultraplan--remote-30-minute-planning-sessions">7. ULTRAPLAN — Remote 30-Minute Planning Sessions</a></li>
  <li><a href="#8-coordinator-mode--multi-agent-swarms">8. Coordinator Mode — Multi-Agent Swarms</a></li>
  <li><a href="#9-speculation--predictive-response-generation">9. Speculation — Predictive Response Generation</a></li>
  <li><a href="#10-the-advisor-model-system">10. The Advisor Model System</a></li>
  <li><a href="#11-voice-mode">11. Voice Mode</a></li>
  <li><a href="#12-team-memory-sync">12. Team Memory Sync</a></li>
  <li><a href="#13-remote-triggers--scheduled-cloud-agents">13. Remote Triggers — Scheduled Cloud Agents</a></li>
  <li><a href="#14-direct-connect--cc-session-urls">14. Direct Connect — cc:// Session URLs</a></li>
  <li><a href="#15-bridge-mode--remote-control">15. Bridge Mode &amp; Remote Control</a></li>
  <li><a href="#16-ssh-remote-execution">16. SSH Remote Execution</a></li>
  <li><a href="#17-mcp-channels--inbound-push-notifications">17. MCP Channels — Inbound Push Notifications</a></li>
  <li><a href="#18-afk--auto-permission-mode">18. AFK / Auto-Permission Mode</a></li>
  <li><a href="#19-background-sessions-detached">19. Background Sessions (Detached)</a></li>
  <li><a href="#20-while-you-were-away-session-recaps">20. “While You Were Away” Session Recaps</a></li>
  <li><a href="#21-tool-use-summary-generation">21. Tool-Use Summary Generation</a></li>
  <li><a href="#22-auto-memory-extraction">22. Auto-Memory Extraction</a></li>
  <li><a href="#23-prompt-suggestions--follow-up-generation">23. Prompt Suggestions &amp; Follow-Up Generation</a></li>
  <li><a href="#24-deferred-tool-discovery">24. Deferred Tool Discovery</a></li>
  <li><a href="#25-hidden-keybindings">25. Hidden Keybindings</a></li>
  <li><a href="#26-lesser-known-environment-variables">26. Lesser-Known Environment Variables</a></li>
  <li><a href="#27-lesser-known-settings-keys">27. Lesser-Known Settings Keys</a></li>
  <li><a href="#28-claudemd-loading--hidden-discovery-paths">28. CLAUDE.md Loading — Hidden Discovery Paths</a></li>
  <li><a href="#29-internal-only-commands">29. Internal-Only Commands</a></li>
  <li><a href="#30-build-time-feature-flags">30. Build-Time Feature Flags</a></li>
</ul>

<h2 id="1-hidden-cli-flags">1. Hidden CLI Flags</h2>

<p>These flags are registered with <code class="language-plaintext highlighter-rouge">.hideHelp()</code> — they work but won’t appear in <code class="language-plaintext highlighter-rouge">claude --help</code>:</p>

<table>
  <thead>
    <tr>
      <th>Flag</th>
      <th>What It Does</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <td><code class="language-plaintext highlighter-rouge">--teleport</code></td>
      <td>Upload local git state to a remote Claude Code session on claude.ai</td>
    </tr>
    <tr>
      <td><code class="language-plaintext highlighter-rouge">--remote</code></td>
      <td>Create a new remote session (comment in code: “undocumented until GA”)</td>
    </tr>
    <tr>
      <td><code class="language-plaintext highlighter-rouge">--remote-control</code> / <code class="language-plaintext highlighter-rouge">--rc</code></td>
      <td>Enter bridge mode — control from claude.ai web UI (requires <code class="language-plaintext highlighter-rouge">BRIDGE_MODE</code>)</td>
    </tr>
    <tr>
      <td><code class="language-plaintext highlighter-rouge">--sdk-url</code></td>
      <td>Connect to a custom SDK URL for direct-connect sessions</td>
    </tr>
    <tr>
      <td><code class="language-plaintext highlighter-rouge">--channels</code></td>
      <td>Register for MCP inbound push notifications (KAIROS builds)</td>
    </tr>
    <tr>
      <td><code class="language-plaintext highlighter-rouge">--dangerously-load-development-channels</code></td>
      <td>Bypass MCP channel allowlist</td>
    </tr>
    <tr>
      <td><code class="language-plaintext highlighter-rouge">--enable-auto-mode</code></td>
      <td>AI classifier-driven auto-permission (requires <code class="language-plaintext highlighter-rouge">TRANSCRIPT_CLASSIFIER</code>)</td>
    </tr>
    <tr>
      <td><code class="language-plaintext highlighter-rouge">--advisor &lt;model&gt;</code></td>
      <td>Attach a secondary reviewer model</td>
    </tr>
    <tr>
      <td><code class="language-plaintext highlighter-rouge">--cowork</code></td>
      <td>Switch plugin commands to internal cowork marketplace</td>
    </tr>
    <tr>
      <td><code class="language-plaintext highlighter-rouge">--agent-id</code>, <code class="language-plaintext highlighter-rouge">--team-name</code>, <code class="language-plaintext highlighter-rouge">--teammate-mode</code>, <code class="language-plaintext highlighter-rouge">--agent-type</code></td>
      <td>Swarm identity flags for multi-agent coordination</td>
    </tr>
  </tbody>
</table>

<p><strong>Deprecated aliases</strong> (still work): <code class="language-plaintext highlighter-rouge">--afk</code> and <code class="language-plaintext highlighter-rouge">--dangerously-skip-permissions-with-classifiers</code> → map to <code class="language-plaintext highlighter-rouge">--enable-auto-mode</code></p>

<p><strong>Correction:</strong> <code class="language-plaintext highlighter-rouge">--voice</code> is NOT a CLI flag. Voice is activated via <code class="language-plaintext highlighter-rouge">/voice</code> slash command or the <code class="language-plaintext highlighter-rouge">voiceEnabled</code> setting. <code class="language-plaintext highlighter-rouge">--brief</code> and <code class="language-plaintext highlighter-rouge">--proactive</code> are not hidden — they appear in help when their feature flags are on.</p>

<h2 id="2-feature-gated-slash-commands">2. Feature-Gated Slash Commands</h2>

<p>These commands exist but are conditionally registered or hidden:</p>

<table>
  <thead>
    <tr>
      <th>Command</th>
      <th>What It Does</th>
      <th>Gate</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <td><code class="language-plaintext highlighter-rouge">/buddy</code></td>
      <td>Hatch and interact with a Tamagotchi-style AI pet</td>
      <td><code class="language-plaintext highlighter-rouge">BUDDY</code> flag + date gate</td>
    </tr>
    <tr>
      <td><code class="language-plaintext highlighter-rouge">/voice</code></td>
      <td>Toggle hold-to-talk voice dictation</td>
      <td><code class="language-plaintext highlighter-rouge">VOICE_MODE</code> + Anthropic OAuth</td>
    </tr>
    <tr>
      <td><code class="language-plaintext highlighter-rouge">/advisor [model\|off]</code></td>
      <td>Attach/detach a secondary reviewer model</td>
      <td>GrowthBook <code class="language-plaintext highlighter-rouge">tengu_sage_compass</code></td>
    </tr>
    <tr>
      <td><code class="language-plaintext highlighter-rouge">/fast [on\|off]</code></td>
      <td>Toggle fast inference mode</td>
      <td>Available when fast mode is supported</td>
    </tr>
    <tr>
      <td><code class="language-plaintext highlighter-rouge">/dream</code></td>
      <td>Manually trigger memory consolidation</td>
      <td>Auto-memory must be enabled</td>
    </tr>
    <tr>
      <td><code class="language-plaintext highlighter-rouge">/brief</code></td>
      <td>Toggle brief/checkpoint mode</td>
      <td><code class="language-plaintext highlighter-rouge">KAIROS</code> or <code class="language-plaintext highlighter-rouge">KAIROS_BRIEF</code></td>
    </tr>
    <tr>
      <td><code class="language-plaintext highlighter-rouge">/ultraplan</code></td>
      <td>Launch a remote 30-minute planning session</td>
      <td><code class="language-plaintext highlighter-rouge">ULTRAPLAN</code> feature flag</td>
    </tr>
    <tr>
      <td><code class="language-plaintext highlighter-rouge">/heapdump</code></td>
      <td>Dump JavaScript heap to <code class="language-plaintext highlighter-rouge">~/Desktop</code></td>
      <td>Always registered, hidden</td>
    </tr>
    <tr>
      <td><code class="language-plaintext highlighter-rouge">/thinkback</code></td>
      <td>2025 Claude Code year-in-review stats</td>
      <td>GrowthBook <code class="language-plaintext highlighter-rouge">tengu_thinkback</code></td>
    </tr>
    <tr>
      <td><code class="language-plaintext highlighter-rouge">/remote-control</code> (alias <code class="language-plaintext highlighter-rouge">/rc</code>)</td>
      <td>Enter bridge mode</td>
      <td><code class="language-plaintext highlighter-rouge">BRIDGE_MODE</code></td>
    </tr>
  </tbody>
</table>

<h2 id="3-the-buddy-system--a-full-tamagotchi-pet">3. The Buddy System — A Full Tamagotchi Pet</h2>

<p>A fully implemented virtual companion that lives beside your input box.</p>

<p><strong>Activate:</strong> <code class="language-plaintext highlighter-rouge">/buddy</code></p>

<ul>
  <li><strong>Deterministic generation</strong> from your userId via Mulberry32 PRNG — same user = same companion across all devices</li>
  <li><strong>18 species:</strong> duck, goose, blob, cat, dragon, octopus, owl, penguin, turtle, snail, ghost, axolotl, capybara, cactus, robot, rabbit, mushroom, chonk</li>
  <li><strong>5 rarity tiers:</strong> common (60%), uncommon (25%), rare (10%), epic (4%), legendary (1%)</li>
  <li><strong>6 eye styles</strong>, <strong>8 hats</strong> (commons get no hat), <strong>1% shiny chance</strong></li>
  <li><strong>5 stats:</strong> DEBUGGING, PATIENCE, CHAOS, WISDOM, SNARK — one peak stat, one dump stat, floors scale with rarity</li>
  <li><strong>Soul generation:</strong> On first hatch, Claude generates a unique name and personality, stored permanently</li>
  <li><strong>ASCII sprite animation:</strong> 3 frames per species at 500ms tick — idle, fidget, and rare blink frames</li>
  <li><strong>Speech bubbles</strong> (10 seconds, 3-second fade) and <strong><code class="language-plaintext highlighter-rouge">/buddy pet</code> hearts animation</strong> (2.5 seconds)</li>
  <li><strong>Anti-cheat:</strong> Only the soul persists — bones (species/rarity/stats) are regenerated from userId hash every load</li>
</ul>
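<p>The deterministic generation can be sketched with the standard Mulberry32 generator. Everything around it is an assumption made for illustration: the FNV-1a hash used to seed it, the order in which rarity and species are drawn, and the <code class="language-plaintext highlighter-rouge">hatchBuddy</code> name. Only the species list and rarity weights come from the bullets above.</p>

```typescript
// Mulberry32: a small 32-bit PRNG returning floats in [0, 1).
function mulberry32(seed: number): () => number {
  let a = seed | 0
  return () => {
    a = (a + 0x6d2b79f5) | 0
    let t = Math.imul(a ^ (a >>> 15), 1 | a)
    t = (t + Math.imul(t ^ (t >>> 7), 61 | t)) ^ t
    return ((t ^ (t >>> 14)) >>> 0) / 4294967296
  }
}

// Assumption: FNV-1a is used here only to turn a userId into a 32-bit seed.
function hashString(s: string): number {
  let h = 0x811c9dc5
  for (let i = 0; i < s.length; i++) {
    h ^= s.charCodeAt(i)
    h = Math.imul(h, 0x01000193)
  }
  return h >>> 0
}

const RARITIES: Array<[string, number]> = [
  ['common', 60], ['uncommon', 25], ['rare', 10], ['epic', 4], ['legendary', 1],
]
const SPECIES = ['duck', 'goose', 'blob', 'cat', 'dragon', 'octopus', 'owl', 'penguin',
  'turtle', 'snail', 'ghost', 'axolotl', 'capybara', 'cactus', 'robot', 'rabbit',
  'mushroom', 'chonk']

// Weighted pick: walk the tiers, subtracting each weight from the roll.
function rollRarity(rand: () => number): string {
  let roll = rand() * 100
  for (const [name, weight] of RARITIES) {
    if (roll < weight) return name
    roll -= weight
  }
  return 'legendary' // guards against floating-point edge cases
}

function hatchBuddy(userId: string): { rarity: string; species: string } {
  const rand = mulberry32(hashString(userId))
  const rarity = rollRarity(rand)
  const species = SPECIES[Math.floor(rand() * SPECIES.length)]
  return { rarity, species }
}
```

<p>Because the generator is seeded only from the userId, recomputing these "bones" on every load gives the same companion without storing them, which is what makes the anti-cheat approach workable.</p>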

<p><strong>Release window:</strong> A teaser runs April 1-7, 2026, keyed to each user’s local date so the reveal rolls across timezones. The feature goes live permanently from April 2026 onward and is always on for internal builds.</p>

<h2 id="4-kairos--persistent-autonomous-assistant-mode">4. KAIROS — Persistent Autonomous Assistant Mode</h2>

<p><strong>Activate:</strong> <code class="language-plaintext highlighter-rouge">--assistant</code> flag (feature-gated: <code class="language-plaintext highlighter-rouge">KAIROS</code>)</p>

<p>A complete alternate UX where Claude becomes a <strong>long-lived autonomous agent</strong> persisting across sessions:</p>

<ul>
  <li><strong>Append-only daily logs</strong> at <code class="language-plaintext highlighter-rouge">~/.claude/projects/&lt;slug&gt;/memory/logs/YYYY/MM/YYYY-MM-DD.md</code></li>
  <li><strong>15-second blocking budget</strong> — any command exceeding 15s is auto-backgrounded</li>
  <li><strong>Proactive <code class="language-plaintext highlighter-rouge">&lt;tick&gt;</code> prompts</strong> — periodic check-ins where Claude decides what to do next or calls <code class="language-plaintext highlighter-rouge">Sleep</code></li>
  <li><strong>Brief mode</strong> — all output through <code class="language-plaintext highlighter-rouge">SendUserMessage</code> tool (structured markdown + attachments + status), not free-form text</li>
  <li><strong>Exclusive tools:</strong> <code class="language-plaintext highlighter-rouge">SendUserFile</code>, <code class="language-plaintext highlighter-rouge">PushNotification</code>, <code class="language-plaintext highlighter-rouge">SubscribePR</code> (GitHub webhook subscriptions), <code class="language-plaintext highlighter-rouge">SleepTool</code></li>
  <li><strong>Midnight boundary handling</strong> — flushes transcript on date change so the dream process can find it</li>
  <li>Nightly dreaming uses a separate disk-skill variant (distinct from the auto-dream system below)</li>
</ul>
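<p>The daily log layout and the midnight boundary check can be sketched as follows. The path template comes from the first bullet above; using the machine's local date (rather than UTC) and the helper names <code class="language-plaintext highlighter-rouge">dailyLogPath</code> and <code class="language-plaintext highlighter-rouge">crossedMidnight</code> are assumptions.</p>

```typescript
function pad2(n: number): string {
  return n < 10 ? '0' + n : String(n)
}

// Builds the append-only log path for a given project slug and local date.
function dailyLogPath(slug: string, date: Date): string {
  const yyyy = String(date.getFullYear())
  const mm = pad2(date.getMonth() + 1) // getMonth() is zero-based
  const dd = pad2(date.getDate())
  return `~/.claude/projects/${slug}/memory/logs/${yyyy}/${mm}/${yyyy}-${mm}-${dd}.md`
}

// True when two timestamps fall on different local dates,
// which is the point at which a transcript flush would be due.
function crossedMidnight(previous: Date, current: Date): boolean {
  return dailyLogPath('', previous) !== dailyLogPath('', current)
}
```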

<h2 id="5-auto-dream--background-memory-consolidation">5. Auto-Dream — Background Memory Consolidation</h2>

<p>Runs <strong>automatically in the background</strong> — no user action needed.</p>

<p>A forked subagent reviews your recent sessions and consolidates learnings into structured memory files.</p>

<p><strong>4-phase process:</strong></p>
<ol>
  <li><strong>Orient</strong> — <code class="language-plaintext highlighter-rouge">ls</code> memory dir, read index, skim existing topics</li>
  <li><strong>Gather</strong> — Check daily logs, find drifted memories, grep transcripts narrowly</li>
  <li><strong>Consolidate</strong> — Write/update memory files, merge duplicates, convert relative dates to absolute</li>
  <li><strong>Prune</strong> — Update MEMORY.md index (max ~25KB), remove stale pointers</li>
</ol>

<p><strong>Gates (cheapest checks first):</strong></p>
<ul>
  <li>Time: 24+ hours since last consolidation</li>
  <li>Sessions: 5+ sessions since last consolidation</li>
  <li>Lock: no other process mid-consolidation</li>
  <li>Scan throttle: session scanning limited to every 10 minutes</li>
</ul>
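<p>A sketch of the gate ordering, with the expensive checks passed in as callbacks so they only run once the cheap time comparison has passed. The field names are hypothetical, and placing the scan throttle directly before the session scan is an assumption about how the four gates fit together.</p>

```typescript
const DAY_MS = 24 * 60 * 60 * 1000
const SCAN_THROTTLE_MS = 10 * 60 * 1000
const MIN_SESSIONS = 5

type DreamGateInputs = {
  nowMs: number
  lastConsolidatedAtMs: number
  lastScanAtMs: number
  countSessionsSince: () => number // expensive: scans session files
  isLocked: () => boolean // touches the lock file
}

// Evaluates gates from cheapest to most expensive and stops at the first failure.
function shouldRunAutoDream(g: DreamGateInputs): boolean {
  if (g.nowMs - g.lastConsolidatedAtMs < DAY_MS) return false
  if (g.nowMs - g.lastScanAtMs < SCAN_THROTTLE_MS) return false
  if (g.countSessionsSince() < MIN_SESSIONS) return false
  if (g.isLocked()) return false
  return true
}
```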

<p><strong>Safety:</strong> Bash restricted to read-only commands. Users can kill from the background tasks dialog (Shift+Down).</p>

<p><strong>User control:</strong> <code class="language-plaintext highlighter-rouge">autoDreamEnabled</code> setting overrides the GrowthBook gate <code class="language-plaintext highlighter-rouge">tengu_onyx_plover</code>.</p>

<h2 id="6-magic-docs--self-maintaining-documentation">6. Magic Docs — Self-Maintaining Documentation</h2>

<p>Files with a <code class="language-plaintext highlighter-rouge"># MAGIC DOC: &lt;title&gt;</code> first line are <strong>automatically updated by a background agent</strong>.</p>

<p><strong>How to use:</strong></p>
<ol>
  <li>Create a markdown file with <code class="language-plaintext highlighter-rouge"># MAGIC DOC: My Topic</code> as the first line</li>
  <li>Optionally add italic instructions on the next line</li>
  <li>Make sure Claude reads that file during a session</li>
  <li>A constrained background agent will update it with new learnings</li>
</ol>

<p>The agent can only edit that specific file — it cannot modify other files. This is triggered by file format, not a command.</p>
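<p>Detection by file format could look roughly like this. The regular expressions are an assumption about how strictly the header is matched (for example, both <code class="language-plaintext highlighter-rouge">*...*</code> and <code class="language-plaintext highlighter-rouge">_..._</code> are accepted as italics), and <code class="language-plaintext highlighter-rouge">parseMagicDocHeader</code> is a hypothetical name.</p>

```typescript
type MagicDocHeader = { title: string; instructions: string | null }

// Returns the title (and optional italic instructions from line two)
// when the first line is a Magic Doc header, otherwise null.
function parseMagicDocHeader(content: string): MagicDocHeader | null {
  const lines = content.split(/\r?\n/)
  const header = /^# MAGIC DOC:\s*(.+?)\s*$/.exec(lines[0])
  if (header === null) return null

  const next = lines.length > 1 ? lines[1].trim() : ''
  const italic = /^(?:\*([^*]+)\*|_([^_]+)_)$/.exec(next)
  const instructions = italic === null ? null : (italic[1] || italic[2])
  return { title: header[1], instructions }
}
```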

<h2 id="7-ultraplan--remote-30-minute-planning-sessions">7. ULTRAPLAN — Remote 30-Minute Planning Sessions</h2>

<p><strong>Activate:</strong> Type “ultraplan” in your message (keyword detection) or use <code class="language-plaintext highlighter-rouge">/ultraplan</code></p>

<p>Farms out complex exploration to a <strong>remote Claude Code instance (CCR)</strong>:</p>

<ol>
  <li>Remote session created with plan mode pre-configured</li>
  <li>CLI polls every 3 seconds for up to 30 minutes</li>
  <li>Remote Claude explores, plans, and calls <code class="language-plaintext highlighter-rouge">ExitPlanMode</code> when ready</li>
  <li>You approve or reject the plan <strong>in the browser</strong> (claude.ai)</li>
  <li>Rejected plans loop back for iteration</li>
  <li>On approval: choose “remote” (execute in cloud) or “teleport to terminal” (execute locally)</li>
</ol>
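<p>The polling in step 2 can be sketched as a pure decision function that a timer loop would call after each status check. The 3-second interval and 30-minute cap come from the list above; the status values and the <code class="language-plaintext highlighter-rouge">nextPollAction</code> name are assumptions.</p>

```typescript
const POLL_INTERVAL_MS = 3000
const POLL_TIMEOUT_MS = 30 * 60 * 1000

type RemoteStatus = 'running' | 'plan_ready' // hypothetical status values
type PollAction =
  | { kind: 'done' }
  | { kind: 'timeout' }
  | { kind: 'wait'; delayMs: number }

// Decides what the CLI should do after one status check.
function nextPollAction(startedAtMs: number, nowMs: number, status: RemoteStatus): PollAction {
  if (status === 'plan_ready') return { kind: 'done' }
  const remainingMs = POLL_TIMEOUT_MS - (nowMs - startedAtMs)
  if (remainingMs <= 0) return { kind: 'timeout' }
  // Never sleep past the overall deadline.
  return { kind: 'wait', delayMs: Math.min(POLL_INTERVAL_MS, remainingMs) }
}
```

<p>Keeping the decision separate from the timer makes the 30-minute cutoff easy to test without waiting in real time.</p>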

<p><strong>Smart keyword detection</strong> avoids false positives — skips occurrences inside quotes, paths, identifiers, and questions.</p>

<h2 id="8-coordinator-mode--multi-agent-swarms">8. Coordinator Mode — Multi-Agent Swarms</h2>

<p><strong>Activate:</strong> <code class="language-plaintext highlighter-rouge">CLAUDE_CODE_COORDINATOR_MODE=1</code></p>

<p>Transforms Claude into a <strong>multi-agent orchestrator</strong>:</p>

<ul>
  <li>Master coordinator spawns workers via <code class="language-plaintext highlighter-rouge">AgentTool</code> in parallel</li>
  <li>Workers report back as XML <code class="language-plaintext highlighter-rouge">&lt;task-notification&gt;</code> messages with status, summary, result, and token usage</li>
  <li>Coordinator <strong>never polls</strong> — push-based completion notifications</li>
  <li>Workers get isolated <strong>scratch directories</strong> (via <code class="language-plaintext highlighter-rouge">tengu_scratch</code> gate) for cross-worker knowledge</li>
  <li>System prompt enforces: <em>“Never write ‘based on your findings’ — synthesize yourself”</em></li>
  <li>4-phase workflow: Research (parallel) → Synthesis (coordinator) → Implementation → Verification</li>
  <li>Explicit continue-vs-spawn guidance based on context overlap</li>
</ul>

<h2 id="9-speculation--predictive-response-generation">9. Speculation — Predictive Response Generation</h2>

<p>While you’re still typing, Claude Code <strong>speculatively starts generating a response</strong>.</p>

<ul>
  <li>File writes go to an <strong>overlay filesystem</strong> (not your real files)</li>
  <li>If your actual input matches the speculation boundary → overlay committed instantly</li>
  <li>If it doesn’t match → discarded silently</li>
  <li>Feature gate: <code class="language-plaintext highlighter-rouge">tengu_chomp_inflection</code> (GrowthBook)</li>
  <li>Result: noticeably lower perceived latency for predictable follow-ups</li>
</ul>
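<p>The commit-or-discard behavior is the interesting part, so here is a toy model of it. Everything in this sketch is an assumption for illustration: the in-memory dictionaries, the class and function names, and especially the exact-match comparison standing in for the "speculation boundary" check.</p>

```python
from typing import Dict, Optional


class OverlayFS:
    """Buffers speculative writes; the real store is untouched until commit()."""

    def __init__(self, real: Dict[str, str]) -> None:
        self.real = real
        self.pending: Dict[str, str] = {}

    def write(self, path: str, content: str) -> None:
        self.pending[path] = content

    def read(self, path: str) -> Optional[str]:
        # Speculative content shadows the real file.
        return self.pending.get(path, self.real.get(path))

    def commit(self) -> None:
        self.real.update(self.pending)
        self.pending.clear()

    def discard(self) -> None:
        self.pending.clear()


def resolve_speculation(overlay: OverlayFS, predicted: str, actual: str) -> bool:
    """Commit the overlay if the prediction held, otherwise throw it away."""
    if actual == predicted:   # assumption: simple equality as the match rule
        overlay.commit()
        return True
    overlay.discard()
    return False
```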

<h2 id="10-the-advisor-model-system">10. The Advisor Model System</h2>

<p><strong>Activate:</strong> <code class="language-plaintext highlighter-rouge">/advisor &lt;model&gt;</code> or <code class="language-plaintext highlighter-rouge">--advisor &lt;model&gt;</code></p>

<p>Attaches a <strong>secondary reviewer/advisor model</strong> as a server-side tool:</p>

<ul>
  <li>Main model (e.g., Sonnet) can call a stronger model (e.g., Opus) for review</li>
  <li>Full conversation history is forwarded when the advisor is invoked</li>
  <li>Beta header: <code class="language-plaintext highlighter-rouge">advisor-tool-2026-03-01</code></li>
  <li>Does NOT work on Bedrock/Vertex (they don’t support the advisor beta header)</li>
  <li>Disable with <code class="language-plaintext highlighter-rouge">/advisor off</code> or <code class="language-plaintext highlighter-rouge">CLAUDE_CODE_DISABLE_ADVISOR_TOOL</code></li>
  <li>GrowthBook gate: <code class="language-plaintext highlighter-rouge">tengu_sage_compass</code></li>
</ul>

<h2 id="11-voice-mode">11. Voice Mode</h2>

<p><strong>Activate:</strong> <code class="language-plaintext highlighter-rouge">/voice</code> slash command or <code class="language-plaintext highlighter-rouge">voiceEnabled</code> setting</p>

<ul>
  <li>Hold-to-talk dictation (default keybinding: hold <code class="language-plaintext highlighter-rouge">space</code>)</li>
  <li>Requires <strong>Anthropic OAuth</strong> — not available with API keys, Bedrock, or Vertex</li>
  <li>Uses the <code class="language-plaintext highlighter-rouge">voice_stream</code> endpoint on claude.ai</li>
  <li>Multiple audio backends: native audio, SoX, <code class="language-plaintext highlighter-rouge">arecord</code></li>
  <li>Protected by GrowthBook kill-switch (<code class="language-plaintext highlighter-rouge">tengu_amber_quartz_disabled</code>) for emergency off</li>
</ul>

<h2 id="12-team-memory-sync">12. Team Memory Sync</h2>

<p><strong>Feature-gated:</strong> <code class="language-plaintext highlighter-rouge">TEAMMEM</code></p>

<p>Memories split into <strong>private</strong> (per-user) and <strong>team</strong> (shared) directories:</p>

<ul>
  <li>Team memory synced to server APIs across authenticated org members</li>
  <li><strong>Secret scanning</strong> prevents leaking sensitive data into shared memory</li>
  <li><strong>Optimistic locking</strong> for conflict resolution</li>
  <li>Team memory lives at <code class="language-plaintext highlighter-rouge">.../memory/team/MEMORY.md</code></li>
  <li>Requires first-party OAuth and org-scoped server APIs</li>
</ul>
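<p>Optimistic locking generally means "write only if nobody else wrote since I read". A minimal sketch of that idea, assuming a version counter on the shared memory; the <code class="language-plaintext highlighter-rouge">MemoryStore</code> class and the retry count are hypothetical, and the real sync goes through server APIs rather than an in-process object:</p>

```python
from typing import Tuple


class VersionConflict(Exception):
    """Raised when the stored version moved on since the caller's read."""


class MemoryStore:
    """Stand-in for the shared team memory, guarded by a version number."""

    def __init__(self) -> None:
        self.content = ""
        self.version = 0

    def read(self) -> Tuple[str, int]:
        return self.content, self.version

    def write(self, new_content: str, expected_version: int) -> int:
        if expected_version != self.version:
            raise VersionConflict(f"expected v{expected_version}, store is at v{self.version}")
        self.content = new_content
        self.version += 1
        return self.version


def append_with_retry(store: MemoryStore, line: str, max_attempts: int = 3) -> int:
    """Re-read and retry on conflict instead of holding a lock."""
    for _ in range(max_attempts):
        content, version = store.read()
        try:
            return store.write(content + line + "\n", version)
        except VersionConflict:
            continue   # someone else won the race; start over from a fresh read
    raise VersionConflict("gave up after repeated conflicts")
```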

<h2 id="13-remote-triggers--scheduled-cloud-agents">13. Remote Triggers — Scheduled Cloud Agents</h2>

<p><strong>Tool:</strong> <code class="language-plaintext highlighter-rouge">RemoteTrigger</code> (deferred, discovered via ToolSearchTool)</p>

<p>Create and manage <strong>scheduled remote Claude Code agents</strong> via CCR API:</p>

<ul>
  <li><code class="language-plaintext highlighter-rouge">create</code> — Schedule a trigger with cron expression</li>
  <li><code class="language-plaintext highlighter-rouge">list</code> / <code class="language-plaintext highlighter-rouge">get</code> — View triggers</li>
  <li><code class="language-plaintext highlighter-rouge">update</code> — Modify a trigger</li>
  <li><code class="language-plaintext highlighter-rouge">run</code> — Manually fire a trigger</li>
</ul>

<p>Requires Claude.ai OAuth. Feature gate: <code class="language-plaintext highlighter-rouge">tengu_surreal_dali</code>. Beta: <code class="language-plaintext highlighter-rouge">ccr-triggers-2026-01-30</code>.</p>

<h2 id="14-direct-connect--cc-session-urls">14. Direct Connect — cc:// Session URLs</h2>

<p><strong>Activate:</strong> <code class="language-plaintext highlighter-rouge">claude server</code> (when <code class="language-plaintext highlighter-rouge">DIRECT_CONNECT</code> is enabled)</p>

<p>Creates shareable session URLs that external clients can connect to:</p>

<ul>
  <li>WebSocket-based protocol with SDK message format</li>
  <li>Permission requests forwarded to connecting client</li>
  <li>Supports interrupt/cancel signals</li>
  <li><code class="language-plaintext highlighter-rouge">claude open cc://...</code> connects to an existing session (the source describes this command as internal)</li>
  <li>Enables custom IDE integrations and frontends</li>
</ul>

<h2 id="15-bridge-mode--remote-control">15. Bridge Mode &amp; Remote Control</h2>

<p><strong>Activate:</strong> <code class="language-plaintext highlighter-rouge">--remote-control</code> / <code class="language-plaintext highlighter-rouge">--rc</code> (when <code class="language-plaintext highlighter-rouge">BRIDGE_MODE</code> compiled in)</p>

<p>Persistent WebSocket connection between local CLI and claude.ai web interface:</p>

<ul>
  <li>Use Claude Code from a browser while tools execute locally</li>
  <li>Exponential backoff reconnection (2s → 2min cap → 10min give-up)</li>
  <li>JWT token refresh, trusted device enrollment</li>
  <li>Permission request mediation between web UI and local CLI</li>
  <li>31 files in the bridge subsystem — essentially a full RPC framework</li>
</ul>
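<p>The reconnection numbers are easier to reason about as a schedule. A small sketch, assuming the delay doubles on each attempt (the growth factor is my assumption; only the 2-second start, 2-minute cap, and 10-minute give-up come from the description above):</p>

```python
from typing import List


def backoff_schedule(
    initial: float = 2.0,          # first retry delay, in seconds
    cap: float = 120.0,            # delays never exceed 2 minutes
    give_up_after: float = 600.0,  # total waiting budget of 10 minutes
    factor: float = 2.0,           # assumed growth factor
) -> List[float]:
    """Return the retry delays that fit inside the give-up budget."""
    delays: List[float] = []
    elapsed = 0.0
    delay = initial
    while give_up_after >= elapsed + delay:
        delays.append(delay)
        elapsed += delay
        delay = min(delay * factor, cap)
    return delays
```

<p>Under these assumptions the delays are 2, 4, 8, 16, 32, 64, 120, 120, 120 seconds, 486 seconds in total. One more 120-second wait would cross the 10-minute budget, so the client stops there.</p>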

<p><strong>Related settings:</strong></p>
<ul>
  <li><code class="language-plaintext highlighter-rouge">remoteControlAtStartup</code> — auto-start bridge</li>
  <li><code class="language-plaintext highlighter-rouge">taskCompleteNotifEnabled</code>, <code class="language-plaintext highlighter-rouge">inputNeededNotifEnabled</code>, <code class="language-plaintext highlighter-rouge">agentPushNotifEnabled</code> — push notification controls</li>
</ul>

<h2 id="16-ssh-remote-execution">16. SSH Remote Execution</h2>

<p><strong>Activate:</strong> <code class="language-plaintext highlighter-rouge">claude ssh &lt;host&gt; [dir]</code> (when <code class="language-plaintext highlighter-rouge">SSH_REMOTE</code> is enabled)</p>

<ul>
  <li>Deploys Claude Code binary to a remote Linux host over SSH</li>
  <li>API auth tunnels back through the local machine — no separate remote auth setup</li>
  <li>Build-time feature-gated</li>
</ul>

<h2 id="17-mcp-channels--inbound-push-notifications">17. MCP Channels — Inbound Push Notifications</h2>

<p><strong>Activate:</strong> <code class="language-plaintext highlighter-rouge">--channels plugin:name@marketplace</code></p>

<p>Register sessions for <strong>real-time event delivery</strong> from approved MCP servers/plugins:</p>

<ul>
  <li>Allowlist controlled by GrowthBook</li>
  <li><code class="language-plaintext highlighter-rouge">--dangerously-load-development-channels</code> bypasses the allowlist</li>
  <li>Enables event-driven workflows (e.g., GitHub events, CI notifications)</li>
</ul>

<h2 id="18-afk--auto-permission-mode">18. AFK / Auto-Permission Mode</h2>

<p><strong>Activate:</strong> <code class="language-plaintext highlighter-rouge">--enable-auto-mode</code> (hidden flag, requires <code class="language-plaintext highlighter-rouge">TRANSCRIPT_CLASSIFIER</code>)</p>

<p>Classifier-assisted automatic permission decisions when the user is away:</p>

<ul>
  <li>AI model evaluates each permission request based on conversation context</li>
  <li>Beta header: <code class="language-plaintext highlighter-rouge">afk-mode-2026-01-31</code></li>
  <li>Deprecated aliases: <code class="language-plaintext highlighter-rouge">--afk</code>, <code class="language-plaintext highlighter-rouge">--dangerously-skip-permissions-with-classifiers</code></li>
  <li>Designed for unattended, long-running agent workflows</li>
</ul>

<h2 id="19-background-sessions-detached">19. Background Sessions (Detached)</h2>

<p><strong>Feature-gated:</strong> <code class="language-plaintext highlighter-rouge">BG_SESSIONS</code></p>

<p>Run Claude Code sessions in the background with management commands:</p>

<ul>
  <li><code class="language-plaintext highlighter-rouge">claude --bg</code> / <code class="language-plaintext highlighter-rouge">claude --background</code> — Start a detached session</li>
  <li><code class="language-plaintext highlighter-rouge">claude ps</code> — List running background sessions</li>
  <li><code class="language-plaintext highlighter-rouge">claude logs &lt;id&gt;</code> — View session logs</li>
  <li><code class="language-plaintext highlighter-rouge">claude attach &lt;id&gt;</code> — Reattach to a session</li>
  <li><code class="language-plaintext highlighter-rouge">claude kill &lt;id&gt;</code> — Stop a session</li>
</ul>

<h2 id="20-while-you-were-away-session-recaps">20. “While You Were Away” Session Recaps</h2>

<p>When you return to a session after being away, Claude can generate a <strong>short recap card</strong> summarizing what happened.</p>

<ul>
  <li>Implemented in <code class="language-plaintext highlighter-rouge">services/awaySummary.ts</code></li>
  <li>Produces short re-entry summaries automatically</li>
  <li>UI-side feature, not a slash command</li>
</ul>

<h2 id="21-tool-use-summary-generation">21. Tool-Use Summary Generation</h2>

<p>For SDK/mobile surfaces, raw tool batches are automatically converted into <strong>compact high-level progress summaries</strong>:</p>

<ul>
  <li>Implemented in <code class="language-plaintext highlighter-rouge">services/toolUseSummary/toolUseSummaryGenerator.ts</code></li>
  <li>Generates short labels for completed tool batches</li>
  <li>Used by the SDK to send progress updates to clients, which render them as single-line rows on mobile-style surfaces</li>
</ul>

<h2 id="22-auto-memory-extraction">22. Auto-Memory Extraction</h2>

<p>A <strong>background agent</strong> automatically extracts memories from your conversations:</p>

<ul>
  <li>Runs at end of each complete query loop (when <code class="language-plaintext highlighter-rouge">EXTRACT_MEMORIES</code> is enabled)</li>
  <li>Uses a forked agent that shares the prompt cache</li>
  <li>Scans existing memory files first to avoid duplicates</li>
  <li>When the main agent writes memories directly, the background extractor skips that range</li>
  <li>Writes to <code class="language-plaintext highlighter-rouge">~/.claude/projects/&lt;path&gt;/memory/</code></li>
  <li>Auto-memory is enabled by default unless disabled via <code class="language-plaintext highlighter-rouge">CLAUDE_CODE_DISABLE_AUTO_MEMORY</code>, <code class="language-plaintext highlighter-rouge">--bare</code>, or settings</li>
</ul>

<h2 id="23-prompt-suggestions--follow-up-generation">23. Prompt Suggestions &amp; Follow-Up Generation</h2>

<p><strong>Activate:</strong> <code class="language-plaintext highlighter-rouge">CLAUDE_CODE_ENABLE_PROMPT_SUGGESTION=1</code></p>

<p>After Claude finishes a response, it can <strong>suggest what to ask next</strong>:</p>

<ul>
  <li>Feature gate: <code class="language-plaintext highlighter-rouge">tengu_chomp_inflection</code> (GrowthBook)</li>
  <li>Env var can force on/off, otherwise GrowthBook + interactive-session checks apply</li>
  <li>Tied to the speculation system — may pre-generate the speculative response too</li>
</ul>

<h2 id="24-deferred-tool-discovery">24. Deferred Tool Discovery</h2>

<p>~18 of Claude Code’s 60+ tools are <strong>not sent to the model in every request</strong>. They’re discovered on-demand:</p>

<ol>
  <li>Model calls <code class="language-plaintext highlighter-rouge">ToolSearchTool</code> with a keyword query</li>
  <li>Matching deferred tools’ schemas are returned</li>
  <li>Model calls the discovered tool in the same turn</li>
</ol>

<p><strong>Query syntax:</strong></p>
<ul>
  <li><code class="language-plaintext highlighter-rouge">select:TaskCreate,LSP</code> — Direct selection by name</li>
  <li><code class="language-plaintext highlighter-rouge">task create</code> — Keyword search against names, descriptions, and hints</li>
  <li><code class="language-plaintext highlighter-rouge">+slack send</code> — Require “slack” in tool name</li>
</ul>

<p><strong>Why it matters:</strong> Keeps the base prompt under 200K tokens. Without this, 60+ tool schemas would consume too much context.</p>
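<p>The three query forms above are simple enough to parse by hand. A sketch of how such a query might be split into its parts; the <code class="language-plaintext highlighter-rouge">ToolQuery</code> type and the lowercasing are illustrative choices, not the real <code class="language-plaintext highlighter-rouge">ToolSearchTool</code> internals:</p>

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class ToolQuery:
    selected: List[str] = field(default_factory=list)  # exact names from "select:A,B"
    required: List[str] = field(default_factory=list)  # "+term" must appear in the tool name
    keywords: List[str] = field(default_factory=list)  # free-text search terms


def parse_tool_query(query: str) -> ToolQuery:
    """Split a query string into direct selections, required terms, and keywords."""
    query = query.strip()
    if query.startswith("select:"):
        names = [name.strip() for name in query[len("select:"):].split(",")]
        return ToolQuery(selected=[name for name in names if name])
    parsed = ToolQuery()
    for term in query.split():
        if term.startswith("+") and len(term) > 1:
            parsed.required.append(term[1:].lower())
        else:
            parsed.keywords.append(term.lower())
    return parsed
```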

<h2 id="25-hidden-keybindings">25. Hidden Keybindings</h2>

<p>Feature-gated keybindings that only appear when their features are enabled:</p>

<table>
  <thead>
    <tr>
      <th>Key Combo</th>
      <th>Action</th>
      <th>Gate</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <td><code class="language-plaintext highlighter-rouge">Space</code> (hold)</td>
      <td>Push-to-talk voice input</td>
      <td><code class="language-plaintext highlighter-rouge">VOICE_MODE</code></td>
    </tr>
    <tr>
      <td><code class="language-plaintext highlighter-rouge">Ctrl+Shift+B</code></td>
      <td>Toggle Brief mode</td>
      <td><code class="language-plaintext highlighter-rouge">KAIROS</code> / <code class="language-plaintext highlighter-rouge">KAIROS_BRIEF</code></td>
    </tr>
    <tr>
      <td><code class="language-plaintext highlighter-rouge">Ctrl+Shift+F</code> / <code class="language-plaintext highlighter-rouge">Cmd+Shift+F</code></td>
      <td>Global search</td>
      <td><code class="language-plaintext highlighter-rouge">QUICK_SEARCH</code></td>
    </tr>
    <tr>
      <td><code class="language-plaintext highlighter-rouge">Ctrl+Shift+P</code> / <code class="language-plaintext highlighter-rouge">Cmd+Shift+P</code></td>
      <td>Quick open</td>
      <td><code class="language-plaintext highlighter-rouge">QUICK_SEARCH</code></td>
    </tr>
    <tr>
      <td><code class="language-plaintext highlighter-rouge">Meta+J</code></td>
      <td>Toggle terminal panel</td>
      <td><code class="language-plaintext highlighter-rouge">TERMINAL_PANEL</code></td>
    </tr>
    <tr>
      <td><code class="language-plaintext highlighter-rouge">Shift+Up</code></td>
      <td>Message actions menu</td>
      <td><code class="language-plaintext highlighter-rouge">MESSAGE_ACTIONS</code></td>
    </tr>
    <tr>
      <td><code class="language-plaintext highlighter-rouge">Ctrl+Shift+O</code></td>
      <td>Toggle teammate preview</td>
      <td>Teams enabled</td>
    </tr>
  </tbody>
</table>

<p><strong>Always-available but often unknown:</strong></p>

<table>
  <thead>
    <tr>
      <th>Key Combo</th>
      <th>Action</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <td><code class="language-plaintext highlighter-rouge">Ctrl+X Ctrl+K</code></td>
      <td>Kill all running agents</td>
    </tr>
    <tr>
      <td><code class="language-plaintext highlighter-rouge">Ctrl+_</code> or <code class="language-plaintext highlighter-rouge">Ctrl+Shift+-</code></td>
      <td>Undo</td>
    </tr>
    <tr>
      <td><code class="language-plaintext highlighter-rouge">Ctrl+X Ctrl+E</code> or <code class="language-plaintext highlighter-rouge">Ctrl+G</code></td>
      <td>Open external editor</td>
    </tr>
    <tr>
      <td><code class="language-plaintext highlighter-rouge">Ctrl+S</code></td>
      <td>Stash current input</td>
    </tr>
    <tr>
      <td><code class="language-plaintext highlighter-rouge">Meta+P</code></td>
      <td>Model picker</td>
    </tr>
    <tr>
      <td><code class="language-plaintext highlighter-rouge">Meta+O</code></td>
      <td>Fast mode toggle</td>
    </tr>
    <tr>
      <td><code class="language-plaintext highlighter-rouge">Meta+T</code></td>
      <td>Thinking mode toggle</td>
    </tr>
    <tr>
      <td><code class="language-plaintext highlighter-rouge">Ctrl+E</code></td>
      <td>Toggle permission explanation (in confirmation dialog)</td>
    </tr>
    <tr>
      <td><code class="language-plaintext highlighter-rouge">Ctrl+D</code></td>
      <td>Toggle permission debug info (in confirmation dialog)</td>
    </tr>
  </tbody>
</table>

<p>All keybindings are overridable via <code class="language-plaintext highlighter-rouge">~/.claude/keybindings.json</code>.</p>

<h2 id="26-lesser-known-environment-variables">26. Lesser-Known Environment Variables</h2>

<p><strong>Debug &amp; profiling:</strong></p>

<table>
  <thead>
    <tr>
      <th>Variable</th>
      <th>Purpose</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <td><code class="language-plaintext highlighter-rouge">CLAUDE_CODE_PROFILE_STARTUP=1</code></td>
      <td>Full startup profiling with memory snapshots</td>
    </tr>
    <tr>
      <td><code class="language-plaintext highlighter-rouge">CLAUDE_CODE_PROFILE_QUERY=1</code></td>
      <td>Profile query pipeline timing</td>
    </tr>
    <tr>
      <td><code class="language-plaintext highlighter-rouge">CLAUDE_CODE_DEBUG_REPAINTS=1</code></td>
      <td>Show component owner chain for every terminal repaint</td>
    </tr>
    <tr>
      <td><code class="language-plaintext highlighter-rouge">CLAUDE_CODE_PERFETTO_TRACE</code></td>
      <td>Enable Perfetto tracing format</td>
    </tr>
    <tr>
      <td><code class="language-plaintext highlighter-rouge">CLAUDE_CODE_TERMINAL_RECORDING</code></td>
      <td>Record terminal in asciinema format</td>
    </tr>
    <tr>
      <td><code class="language-plaintext highlighter-rouge">CLAUDE_CODE_COMMIT_LOG=/path</code></td>
      <td>Log slow renders for profiling</td>
    </tr>
  </tbody>
</table>

<p><strong>Behavioral overrides:</strong></p>

<table>
  <thead>
    <tr>
      <th>Variable</th>
      <th>Purpose</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <td><code class="language-plaintext highlighter-rouge">CLAUDE_CODE_DISABLE_NONESSENTIAL_TRAFFIC</code></td>
      <td>Disable ALL non-essential network traffic (most restrictive privacy)</td>
    </tr>
    <tr>
      <td><code class="language-plaintext highlighter-rouge">CLAUDE_CODE_DISABLE_AUTO_MEMORY</code></td>
      <td>Disable automatic memory management</td>
    </tr>
    <tr>
      <td><code class="language-plaintext highlighter-rouge">CLAUDE_CODE_DISABLE_CRON</code></td>
      <td>Disable cron job scheduler</td>
    </tr>
    <tr>
      <td><code class="language-plaintext highlighter-rouge">CLAUDE_CODE_DISABLE_FILE_CHECKPOINTING</code></td>
      <td>Disable file snapshot backups</td>
    </tr>
    <tr>
      <td><code class="language-plaintext highlighter-rouge">CLAUDE_CODE_DISABLE_ADVISOR_TOOL</code></td>
      <td>Disable advisor model</td>
    </tr>
    <tr>
      <td><code class="language-plaintext highlighter-rouge">CLAUDE_CODE_ENABLE_PROMPT_SUGGESTION</code></td>
      <td>Enable speculative next-prompt suggestions</td>
    </tr>
    <tr>
      <td><code class="language-plaintext highlighter-rouge">CLAUDE_CODE_MAX_OUTPUT_TOKENS</code></td>
      <td>Override max output tokens per response</td>
    </tr>
    <tr>
      <td><code class="language-plaintext highlighter-rouge">CLAUDE_CODE_MAX_CONTEXT_TOKENS</code></td>
      <td>Override max context window</td>
    </tr>
    <tr>
      <td><code class="language-plaintext highlighter-rouge">CLAUDE_CODE_EFFORT_LEVEL</code></td>
      <td>Set effort: <code class="language-plaintext highlighter-rouge">low|medium|high|max|auto</code></td>
    </tr>
    <tr>
      <td><code class="language-plaintext highlighter-rouge">CLAUDE_CODE_COORDINATOR_MODE=1</code></td>
      <td>Enable coordinator mode</td>
    </tr>
    <tr>
      <td><code class="language-plaintext highlighter-rouge">CLAUDE_CODE_SIMPLE</code></td>
      <td>Same as <code class="language-plaintext highlighter-rouge">--bare</code> — skip hooks, LSP, plugins, background tasks</td>
    </tr>
    <tr>
      <td><code class="language-plaintext highlighter-rouge">CLAUDE_CONFIG_DIR</code></td>
      <td>Override <code class="language-plaintext highlighter-rouge">~/.claude</code> config directory</td>
    </tr>
    <tr>
      <td><code class="language-plaintext highlighter-rouge">CLAUDE_ENV_FILE</code></td>
      <td>Path to env file to source on startup</td>
    </tr>
  </tbody>
</table>

<p><strong>Provider switching:</strong></p>

<table>
  <thead>
    <tr>
      <th>Variable</th>
      <th>Purpose</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <td><code class="language-plaintext highlighter-rouge">CLAUDE_CODE_USE_BEDROCK=1</code></td>
      <td>Use AWS Bedrock</td>
    </tr>
    <tr>
      <td><code class="language-plaintext highlighter-rouge">CLAUDE_CODE_USE_VERTEX=1</code></td>
      <td>Use Google Vertex AI</td>
    </tr>
    <tr>
      <td><code class="language-plaintext highlighter-rouge">CLAUDE_CODE_USE_FOUNDRY=1</code></td>
      <td>Use Microsoft Foundry (formerly Azure AI Foundry)</td>
    </tr>
    <tr>
      <td><code class="language-plaintext highlighter-rouge">CLAUDE_CODE_CLIENT_CERT</code> / <code class="language-plaintext highlighter-rouge">CLAUDE_CODE_CLIENT_KEY</code></td>
      <td>mTLS client certificates</td>
    </tr>
  </tbody>
</table>

<p><strong>Prompt caching control:</strong></p>

<table>
  <thead>
    <tr>
      <th>Variable</th>
      <th>Purpose</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <td><code class="language-plaintext highlighter-rouge">DISABLE_PROMPT_CACHING</code></td>
      <td>Disable globally</td>
    </tr>
    <tr>
      <td><code class="language-plaintext highlighter-rouge">DISABLE_PROMPT_CACHING_HAIKU</code> / <code class="language-plaintext highlighter-rouge">_SONNET</code> / <code class="language-plaintext highlighter-rouge">_OPUS</code></td>
      <td>Disable per model</td>
    </tr>
    <tr>
      <td><code class="language-plaintext highlighter-rouge">DISABLE_INTERLEAVED_THINKING</code></td>
      <td>Disable interleaved thinking (thinking between tool calls)</td>
    </tr>
  </tbody>
</table>

<h2 id="27-lesser-known-settings-keys">27. Lesser-Known Settings Keys</h2>

<p>In <code class="language-plaintext highlighter-rouge">~/.claude/settings.json</code> — things most people don’t know you can set:</p>

<table>
  <thead>
    <tr>
      <th>Key</th>
      <th>What It Does</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <td><code class="language-plaintext highlighter-rouge">apiKeyHelper</code></td>
      <td>External command to fetch API key (e.g., <code class="language-plaintext highlighter-rouge">op read op://vault/item/field</code> with the 1Password CLI)</td>
    </tr>
    <tr>
      <td><code class="language-plaintext highlighter-rouge">awsCredentialExport</code></td>
      <td>Command to export AWS credentials for Bedrock</td>
    </tr>
    <tr>
      <td><code class="language-plaintext highlighter-rouge">env</code></td>
      <td>Arbitrary environment variables injected into every session</td>
    </tr>
    <tr>
      <td><code class="language-plaintext highlighter-rouge">effortLevel</code></td>
      <td>Default effort level: <code class="language-plaintext highlighter-rouge">low|medium|high|max|auto</code></td>
    </tr>
    <tr>
      <td><code class="language-plaintext highlighter-rouge">alwaysThinkingEnabled</code></td>
      <td>Force extended thinking on every request</td>
    </tr>
    <tr>
      <td><code class="language-plaintext highlighter-rouge">spinnerVerbs</code></td>
      <td>Custom spinner verb list</td>
    </tr>
    <tr>
      <td><code class="language-plaintext highlighter-rouge">spinnerTipsOverride</code></td>
      <td>Custom tip messages during spinner</td>
    </tr>
    <tr>
      <td><code class="language-plaintext highlighter-rouge">worktree.symlinkDirectories</code></td>
      <td>Directories to symlink in worktrees (saves disk)</td>
    </tr>
    <tr>
      <td><code class="language-plaintext highlighter-rouge">worktree.sparsePaths</code></td>
      <td>Git sparse-checkout paths for monorepo worktrees</td>
    </tr>
    <tr>
      <td><code class="language-plaintext highlighter-rouge">autoMemoryDirectory</code></td>
      <td>Custom path for memory storage</td>
    </tr>
    <tr>
      <td><code class="language-plaintext highlighter-rouge">autoDreamEnabled</code></td>
      <td>Enable/disable auto-dream consolidation</td>
    </tr>
    <tr>
      <td><code class="language-plaintext highlighter-rouge">minSleepDurationMs</code></td>
      <td>Minimum SleepTool duration</td>
    </tr>
    <tr>
      <td><code class="language-plaintext highlighter-rouge">skipWebFetchPreflight</code></td>
      <td>Skip WebFetch URL validation</td>
    </tr>
    <tr>
      <td><code class="language-plaintext highlighter-rouge">disableBypassPermissionsMode</code></td>
      <td>Prevent entering bypass mode</td>
    </tr>
    <tr>
      <td><code class="language-plaintext highlighter-rouge">allowManagedPermissionRulesOnly</code></td>
      <td>Enterprise: only admin-defined permission rules</td>
    </tr>
    <tr>
      <td><code class="language-plaintext highlighter-rouge">allowManagedMcpServersOnly</code></td>
      <td>Enterprise: only admin-defined MCP servers</td>
    </tr>
    <tr>
      <td><code class="language-plaintext highlighter-rouge">allowManagedHooksOnly</code></td>
      <td>Enterprise: only admin-defined hooks</td>
    </tr>
    <tr>
      <td><code class="language-plaintext highlighter-rouge">allowedHttpHookUrls</code></td>
      <td>URL allowlist for HTTP hooks</td>
    </tr>
    <tr>
      <td><code class="language-plaintext highlighter-rouge">httpHookAllowedEnvVars</code></td>
      <td>Env vars HTTP hooks can interpolate</td>
    </tr>
    <tr>
      <td><code class="language-plaintext highlighter-rouge">remote.defaultEnvironmentId</code></td>
      <td>Default remote environment for CCR</td>
    </tr>
    <tr>
      <td><code class="language-plaintext highlighter-rouge">minimumVersion</code></td>
      <td>Enforce minimum Claude Code version</td>
    </tr>
  </tbody>
</table>
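<p>For a sense of how a few of these fit together, here is a small script that prints a settings fragment. The keys come from the table above; every value, and the nested shape I use for the dotted <code class="language-plaintext highlighter-rouge">worktree.*</code> keys, is an assumption for illustration rather than a documented default.</p>

```python
import json

# Keys are from the table above. Values and the nested "worktree" object
# are illustrative assumptions, not documented defaults.
settings = {
    "effortLevel": "high",
    "alwaysThinkingEnabled": True,
    "env": {"CLAUDE_CODE_DISABLE_CRON": "1"},
    "worktree": {
        "symlinkDirectories": ["node_modules"],
        "sparsePaths": ["services/api"],
    },
}


def render_settings(data: dict) -> str:
    """Serialize the settings as pretty-printed JSON."""
    return json.dumps(data, indent=2, sort_keys=True)


print(render_settings(settings))
```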

<h2 id="28-claudemd-loading--hidden-discovery-paths">28. CLAUDE.md Loading — Hidden Discovery Paths</h2>

<p>Beyond the standard project <code class="language-plaintext highlighter-rouge">CLAUDE.md</code>, Claude Code loads instructions from:</p>

<ul>
  <li><code class="language-plaintext highlighter-rouge">~/.claude/CLAUDE.md</code> — User-level global instructions</li>
  <li>Parent directories up to git root — all CLAUDE.md files in parent dirs are included</li>
  <li><code class="language-plaintext highlighter-rouge">.claude/CLAUDE.md</code> — Inside the <code class="language-plaintext highlighter-rouge">.claude</code> directory</li>
  <li><strong><code class="language-plaintext highlighter-rouge">.claude/rules/*.md</code></strong> — Per-project rule files (all <code class="language-plaintext highlighter-rouge">.md</code> files in this directory)</li>
  <li><code class="language-plaintext highlighter-rouge">@include</code>-style references inside memory files</li>
</ul>
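<p>The file-based locations in that list can be enumerated with a short directory walk. A sketch under stated assumptions: the "git root" is taken to be the first ancestor containing <code class="language-plaintext highlighter-rouge">.git</code>, the returned order is my own choice, and <code class="language-plaintext highlighter-rouge">@include</code>-style references are not followed.</p>

```python
from pathlib import Path
from typing import List


def candidate_instruction_files(cwd: Path, home: Path) -> List[Path]:
    """List existing instruction files for a project directory (order is an assumption)."""
    project = cwd.resolve()
    candidates = [home / ".claude" / "CLAUDE.md"]   # user-level global instructions

    # Collect directories from the project up to the git root (or filesystem root).
    chain = []
    current = project
    while True:
        chain.append(current)
        if (current / ".git").exists() or current.parent == current:
            break
        current = current.parent
    for directory in reversed(chain):               # outermost directory first
        candidates.append(directory / "CLAUDE.md")

    candidates.append(project / ".claude" / "CLAUDE.md")
    candidates.extend(sorted((project / ".claude" / "rules").glob("*.md")))
    return [path for path in candidates if path.is_file()]
```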

<h2 id="29-internal-only-commands">29. Internal-Only Commands</h2>

<p>Registered only when <code class="language-plaintext highlighter-rouge">USER_TYPE === 'ant'</code> — not in public builds:</p>

<p><code class="language-plaintext highlighter-rouge">backfill-sessions</code>, <code class="language-plaintext highlighter-rouge">break-cache</code>, <code class="language-plaintext highlighter-rouge">bughunter</code>, <code class="language-plaintext highlighter-rouge">ctx_viz</code>, <code class="language-plaintext highlighter-rouge">good-claude</code>, <code class="language-plaintext highlighter-rouge">init-verifiers</code>, <code class="language-plaintext highlighter-rouge">force-snip</code>, <code class="language-plaintext highlighter-rouge">mock-limits</code>, <code class="language-plaintext highlighter-rouge">bridge-kick</code>, <code class="language-plaintext highlighter-rouge">subscribe-pr</code>, <code class="language-plaintext highlighter-rouge">reset-limits</code>, <code class="language-plaintext highlighter-rouge">share</code>, <code class="language-plaintext highlighter-rouge">ant-trace</code>, <code class="language-plaintext highlighter-rouge">perf-issue</code>, <code class="language-plaintext highlighter-rouge">env</code>, <code class="language-plaintext highlighter-rouge">oauth-refresh</code>, <code class="language-plaintext highlighter-rouge">debug-tool-call</code>, <code class="language-plaintext highlighter-rouge">agents-platform</code>, <code class="language-plaintext highlighter-rouge">autofix-pr</code>, <code class="language-plaintext highlighter-rouge">onboarding</code></p>

<h2 id="30-build-time-feature-flags">30. Build-Time Feature Flags</h2>

<p>Compile-time flags via <code class="language-plaintext highlighter-rouge">feature()</code> from <code class="language-plaintext highlighter-rouge">bun:bundle</code>. When off, code is <strong>eliminated entirely</strong> from the binary:</p>

<table>
  <thead>
    <tr>
      <th>Flag</th>
      <th>Feature</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <td><code class="language-plaintext highlighter-rouge">COORDINATOR_MODE</code></td>
      <td>Multi-agent coordinator</td>
    </tr>
    <tr>
      <td><code class="language-plaintext highlighter-rouge">VOICE_MODE</code></td>
      <td>Voice input</td>
    </tr>
    <tr>
      <td><code class="language-plaintext highlighter-rouge">KAIROS</code> / <code class="language-plaintext highlighter-rouge">KAIROS_BRIEF</code></td>
      <td>Persistent assistant mode</td>
    </tr>
    <tr>
      <td><code class="language-plaintext highlighter-rouge">PROACTIVE</code></td>
      <td>Autonomous mode</td>
    </tr>
    <tr>
      <td><code class="language-plaintext highlighter-rouge">BRIDGE_MODE</code></td>
      <td>Remote control bridge</td>
    </tr>
    <tr>
      <td><code class="language-plaintext highlighter-rouge">SSH_REMOTE</code></td>
      <td>SSH remote execution</td>
    </tr>
    <tr>
      <td><code class="language-plaintext highlighter-rouge">DIRECT_CONNECT</code></td>
      <td><code class="language-plaintext highlighter-rouge">cc://</code> URL handling</td>
    </tr>
    <tr>
      <td><code class="language-plaintext highlighter-rouge">BG_SESSIONS</code></td>
      <td>Background sessions</td>
    </tr>
    <tr>
      <td><code class="language-plaintext highlighter-rouge">TEMPLATES</code></td>
      <td>Template/new/reply flows</td>
    </tr>
    <tr>
      <td><code class="language-plaintext highlighter-rouge">TEAMMEM</code></td>
      <td>Team memory sync</td>
    </tr>
    <tr>
      <td><code class="language-plaintext highlighter-rouge">TRANSCRIPT_CLASSIFIER</code></td>
      <td>AI permission classification</td>
    </tr>
    <tr>
      <td><code class="language-plaintext highlighter-rouge">BUDDY</code></td>
      <td>Tamagotchi pet</td>
    </tr>
    <tr>
      <td><code class="language-plaintext highlighter-rouge">ULTRAPLAN</code></td>
      <td>Remote planning</td>
    </tr>
    <tr>
      <td><code class="language-plaintext highlighter-rouge">EXTRACT_MEMORIES</code></td>
      <td>Auto-memory extraction</td>
    </tr>
    <tr>
      <td><code class="language-plaintext highlighter-rouge">WORKFLOW_SCRIPTS</code></td>
      <td>Workflow automation</td>
    </tr>
    <tr>
      <td><code class="language-plaintext highlighter-rouge">QUICK_SEARCH</code></td>
      <td>Quick search (Ctrl+Shift+F)</td>
    </tr>
    <tr>
      <td><code class="language-plaintext highlighter-rouge">TERMINAL_PANEL</code></td>
      <td>Terminal panel (Meta+J)</td>
    </tr>
    <tr>
      <td><code class="language-plaintext highlighter-rouge">MESSAGE_ACTIONS</code></td>
      <td>Message action menu (Shift+Up)</td>
    </tr>
    <tr>
      <td><code class="language-plaintext highlighter-rouge">CONTEXT_COLLAPSE</code></td>
      <td>Context collapse optimization</td>
    </tr>
    <tr>
      <td><code class="language-plaintext highlighter-rouge">HISTORY_SNIP</code></td>
      <td>History snipping</td>
    </tr>
    <tr>
      <td><code class="language-plaintext highlighter-rouge">MCP_SKILLS</code></td>
      <td>MCP skill discovery</td>
    </tr>
    <tr>
      <td><code class="language-plaintext highlighter-rouge">DAEMON</code></td>
      <td>Long-running daemon</td>
    </tr>
  </tbody>
</table>

<p><em>Compiled from static analysis of the Claude Code source. Corrections applied from the audited version. Features behind flags may not be in your build.</em></p>]]></content><author><name>Sathwick</name><email>sathwick.p7@gmail.com</email></author><category term="architecture" /><summary type="html"><![CDATA[A source-based tour of Claude Code's undocumented, gated, and internal features, from hidden flags and slash commands to KAIROS, ULTRAPLAN, swarms, and bridge mode.]]></summary></entry><entry><title type="html">Designing a URL Shortener for 1 Trillion URLs</title><link href="https://sathwick.xyz/blog/bitly.html" rel="alternate" type="text/html" title="Designing a URL Shortener for 1 Trillion URLs" /><published>2026-03-08T00:00:00+00:00</published><updated>2026-03-08T00:00:00+00:00</updated><id>https://sathwick.xyz/blog/bitly</id><content type="html" xml:base="https://sathwick.xyz/blog/bitly.html"><![CDATA[<p>I’ve read multiple system design books, taken courses, and watched several playlists on YouTube. One of the most common and foundational system design questions that almost everyone starts with is the Bitly problem: designing a URL shortener that converts long URLs into shorter, more manageable links.</p>

<p>Over time, this question started to feel dry and repetitive. I often found myself skipping it whenever I saw it again.</p>

<p>Recently, though, I came across a very interesting variation of the problem. I tried designing a solution for it, and that is what inspired me to write this post.</p>

<h2 id="the-challenge">The Challenge</h2>

<p>The challenge is to design a URL shortener that supports 1 trillion URLs.</p>

<p>At first glance, this sounds like a simple scaling problem, but it actually changes a lot of things and breaks many of the naive assumptions that usually work for the standard Bitly-style system design question.</p>

<p>Even if we assume a base-62 encoding (<code class="language-plaintext highlighter-rouge">0-9</code>, <code class="language-plaintext highlighter-rouge">a-z</code>, <code class="language-plaintext highlighter-rouge">A-Z</code>), a 7-character string gives us roughly 3.5 trillion possible combinations.</p>

<p>That sounds like plenty for 1 trillion URLs, but things get complicated once we start thinking about distribution, storage, and collision guarantees.</p>

<p>Also, <code class="language-plaintext highlighter-rouge">62^7</code> is only the raw theoretical namespace. In practice, the usable space is smaller because some codes will be reserved for custom aliases, blocked words, internal testing, and occasionally wasted ranges from crashed allocators. The margin is still comfortable, but it is not infinite.</p>

<h3 id="constraints">Constraints</h3>

<ol>
  <li>The shortened URL can have at most 7 characters.</li>
  <li>The system must guarantee unique URLs with no collisions.</li>
  <li>The system must support 1 trillion URLs over its entire lifetime.</li>
</ol>

<p>Before diving into the trillion-URL challenge, let’s first revisit the standard approach used to solve the traditional URL shortener design problem.</p>

<h2 id="the-traditional-approach">The Traditional Approach</h2>

<h3 id="functional-requirements">Functional Requirements</h3>

<h4 id="core-requirements">Core requirements</h4>

<ul>
  <li>Users should be able to submit a long URL and receive a shortened version.</li>
  <li>Optionally, users should be able to specify a custom alias for their shortened URL, e.g. <code class="language-plaintext highlighter-rouge">www.short.ly/my-custom-alias</code>.</li>
  <li>Optionally, users should be able to specify an expiration date for their shortened URL.</li>
  <li>Users should be able to access the original URL by using the shortened URL.</li>
</ul>

<h4 id="below-the-line-out-of-scope">Below the line (out of scope)</h4>

<ul>
  <li>User authentication and account management.</li>
  <li>Analytics on link clicks, such as click counts or geographic data.</li>
</ul>

<h3 id="non-functional-requirements">Non-Functional Requirements</h3>

<h4 id="core-requirements-1">Core requirements</h4>

<ul>
  <li>The system should ensure uniqueness for short codes, where each short code maps to exactly one long URL.</li>
  <li>Redirection should occur with minimal delay, ideally under 100 ms.</li>
  <li>The system should be reliable and available 99.99% of the time, with availability prioritized over consistency.</li>
  <li>The system should scale to support 1 billion shortened URLs and 100 million DAU.</li>
</ul>

<h4 id="below-the-line-out-of-scope-1">Below the line (out of scope)</h4>

<ul>
  <li>Real-time consistency for analytics.</li>
  <li>Advanced security features such as spam detection and malicious URL filtering.</li>
</ul>

<h2 id="high-level-design">High-Level Design</h2>

<p>We’ll go through the functional requirements one by one and design a system that satisfies them.</p>

<h3 id="1-submitting-a-long-url-and-receiving-a-short-url">1. Submitting a long URL and receiving a short URL</h3>

<p>When a user submits a long URL, the client sends a <code class="language-plaintext highlighter-rouge">POST</code> request to <code class="language-plaintext highlighter-rouge">/urls</code> with the long URL, custom alias, and expiration date.</p>

<p>The flow looks like this:</p>

<ol>
  <li>The primary server receives the request and validates the long URL format using libraries like <code class="language-plaintext highlighter-rouge">is-url</code> or simple validation logic.</li>
  <li>Optionally, we can check whether this exact long URL was already shortened and return the existing short code as a deduplication optimization.</li>
  <li>In practice, most URL shorteners allow multiple short codes for the same long URL, since different users may want different expiration dates, separate analytics, or different custom aliases.</li>
  <li>If the URL is valid, we generate a short code and store the mapping.</li>
</ol>

<h4 id="generating-the-short-code">Generating the short code</h4>

<p>The standard answer is to use a hash function that produces enough randomness to make collisions unlikely.
A hash function like MD5 takes an input and produces a deterministic fixed-size output. That means the same long URL would always map to the same hash, which is useful if you want deterministic code generation.</p>

<p>But the system still needs to store the redirect mapping, because a hash is not reversible and we still need metadata such as expiration dates, ownership, and custom aliases.</p>

<p>It is also not desirable if you need multiple short codes for the same URL.</p>

<p>Hash outputs also have high entropy, which makes them appear random. We can encode that output using base-62 and take the first 7 characters as our shortcode.</p>

<p>Here, encoding simply means converting binary hash output into a sequence of readable characters from a chosen alphabet so it can be used as a short, URL-friendly code.</p>

<p>This gives us approximately <code class="language-plaintext highlighter-rouge">62^7 = 3.52 trillion</code> possible values, which is a large namespace.</p>
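<p>As a rough illustration of the hash-then-truncate idea, here is a minimal Python sketch. The alphabet order and the helper names <code class="language-plaintext highlighter-rouge">base62</code> and <code class="language-plaintext highlighter-rouge">short_code_from_hash</code> are assumptions made for this example, not part of any particular library.</p>

```python
import hashlib

# Assumed alphabet order: digits, then lowercase, then uppercase.
ALPHABET = "0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ"


def base62(n: int) -> str:
    """Encode a non-negative integer as a base-62 string."""
    if n == 0:
        return ALPHABET[0]
    digits = []
    while n > 0:
        n, rem = divmod(n, 62)
        digits.append(ALPHABET[rem])
    return "".join(reversed(digits))


def short_code_from_hash(long_url: str, length: int = 7) -> str:
    """Hash the URL, base-62-encode the digest, keep the first `length` chars."""
    digest = hashlib.md5(long_url.encode("utf-8")).digest()  # 16 bytes
    encoded = base62(int.from_bytes(digest, "big"))
    # Left-pad so that unusually small digests still yield `length` characters.
    return encoded.rjust(length, ALPHABET[0])[:length]


if __name__ == "__main__":
    print(short_code_from_hash("https://example.com/some/very/long/path"))
```

<p>The same input always yields the same code, which is exactly the deterministic behaviour described above. One caveat of this sketch: a 128-bit digest does not fill a whole number of base-62 digits, so the leading character is not uniformly distributed. Keeping the trailing characters instead would avoid that skew.</p>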

<p>But a large namespace does not make random or truncated-hash generation safe. If the code space has size <code class="language-plaintext highlighter-rouge">|S|</code> and <code class="language-plaintext highlighter-rouge">n</code> codes are already in use, the probability that the next random code collides is <code class="language-plaintext highlighter-rouge">n / |S|</code>. That means you still need retries and database checks to enforce uniqueness. In a space of this size, even a system that creates around 1 billion links would expect on the order of <code class="language-plaintext highlighter-rouge">10^5</code> colliding pairs unless it performs uniqueness checks.</p>
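<p>The numbers above are easy to sanity-check. The sketch below assumes codes are drawn independently and uniformly at random from the full <code class="language-plaintext highlighter-rouge">62^7</code> space. The function names are made up for this example.</p>

```python
NAMESPACE = 62 ** 7  # 3,521,614,606,208 possible 7-character codes


def next_code_collision_probability(codes_in_use: int) -> float:
    """Chance that one fresh uniformly random code hits an existing one."""
    return codes_in_use / NAMESPACE


def expected_colliding_pairs(codes_generated: int) -> float:
    """Expected number of colliding pairs among independent random codes."""
    pairs = codes_generated * (codes_generated - 1) / 2
    return pairs / NAMESPACE


if __name__ == "__main__":
    n = 1_000_000_000
    print(f"P(next code collides): {next_code_collision_probability(n):.6f}")
    print(f"Expected colliding pairs: {expected_colliding_pairs(n):,.0f}")
```

<p>At 1 billion links the per-insert collision chance is small, but the expected number of colliding pairs is already in the hundreds of thousands, which is why the uniqueness check cannot be skipped.</p>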

<p>At more ordinary scale, a more production-friendly baseline is to stop relying on random hashes and use a centralized counter instead. Redis is a good fit here because <code class="language-plaintext highlighter-rouge">INCR</code> is atomic and hands out unique integers efficiently. We can base-62-encode each integer into a short code and guarantee uniqueness without a retry loop.</p>

<p>Even if links expire, we usually do not recycle short codes. Reusing codes creates ugly edge cases with stale caches, delayed clients, and old analytics data, so it is safer to treat the namespace as append-only over the system’s lifetime.</p>

<p>Once we have the short URL, we can insert it into the database along with the long URL, optional custom alias, and expiration date. Finally, we return the shortened URL to the client.</p>

<h3 id="2-accessing-the-original-url-from-the-shortened-url">2. Accessing the original URL from the shortened URL</h3>

<p>Once the short URL is live, users can use it to reach the original URL. Importantly, that shortened URL exists under a domain we own.</p>

<p>When a user accesses a shortened URL, the flow looks like this:</p>

<ol>
  <li>The browser sends a <code class="language-plaintext highlighter-rouge">GET</code> request with the short code, for example <code class="language-plaintext highlighter-rouge">GET /abc123</code>.</li>
  <li>The primary server looks up the short code in the database.</li>
  <li>If the short code exists and has not expired, the server retrieves the long URL. If it has expired, the server returns <code class="language-plaintext highlighter-rouge">410 Gone</code>.</li>
  <li>The server responds with an HTTP redirect, usually either <code class="language-plaintext highlighter-rouge">301</code> or <code class="language-plaintext highlighter-rouge">302</code>.</li>
</ol>

<p>For a URL shortener, a <code class="language-plaintext highlighter-rouge">302</code> redirect is often preferred because:</p>

<ul>
  <li>It gives us more control over the redirection process, allowing us to update or expire links later.</li>
  <li>It prevents browsers from aggressively caching the redirect.</li>
  <li>It still allows us to track click statistics, even though analytics are out of scope for this design.</li>
</ul>
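<p>The lookup flow above can be sketched as a small pure function. This is only an illustration: the <code class="language-plaintext highlighter-rouge">Link</code> record, the in-memory <code class="language-plaintext highlighter-rouge">LINKS</code> dictionary standing in for the database, and the <code class="language-plaintext highlighter-rouge">404</code> for unknown codes are all assumptions of this example.</p>

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Dict, Optional, Tuple


@dataclass
class Link:
    long_url: str
    expires_at: Optional[datetime] = None  # None means the link never expires


# Hypothetical in-memory stand-in for the database table keyed by short code.
LINKS: Dict[str, Link] = {}


def resolve(short_code: str, now: datetime) -> Tuple[int, Optional[str]]:
    """Return (HTTP status, Location header value) for GET /short_code."""
    link = LINKS.get(short_code)
    if link is None:
        return 404, None  # unknown code (assumed behaviour, not from the text)
    if link.expires_at is not None and now >= link.expires_at:
        return 410, None  # the link existed but has expired
    return 302, link.long_url  # temporary redirect, so clients keep asking us


if __name__ == "__main__":
    LINKS["abc123"] = Link("https://example.com/landing")
    print(resolve("abc123", datetime.now(timezone.utc)))
```

<p>Passing <code class="language-plaintext highlighter-rouge">now</code> in explicitly keeps the expiry check easy to test without touching the system clock.</p>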

<h4 id="how-do-we-make-redirects-fast">How do we make redirects fast?</h4>

<p>A naive database lookup could devolve into a full table scan, which is clearly too slow.</p>

<p>A better baseline is to add an index, or simply make the shortened URL the primary key. That gives us indexed lookups and also enforces uniqueness.</p>

<p>The remaining problem is SSD IOPS. A single database instance would still struggle to keep up with heavy traffic, leading to slower response times and possible timeouts.</p>

<p>A much better solution is to place an in-memory cache like Redis or Memcached between the server and the database. Frequently accessed mappings from short code to long URL can live in memory.</p>

<p>The read path then becomes:</p>

<ul>
  <li>On a cache hit, return the long URL in milliseconds.</li>
  <li>On a cache miss, query the database, populate the cache, and return the result.</li>
</ul>

<p>The difference in speed is significant:</p>

<ul>
  <li>Memory access time: about 100 nanoseconds (<code class="language-plaintext highlighter-rouge">0.0001 ms</code>)</li>
  <li>SSD access time: about <code class="language-plaintext highlighter-rouge">0.1 ms</code></li>
  <li>HDD access time: about <code class="language-plaintext highlighter-rouge">10 ms</code></li>
</ul>

<p>That means memory access is roughly 1,000 times faster than SSD and 100,000 times faster than HDD.</p>

<p>In terms of operations per second:</p>

<ul>
  <li>Memory can support millions of reads per second.</li>
  <li>SSDs can support roughly 100,000 IOPS.</li>
  <li>HDDs typically support around 100 to 200 IOPS.</li>
</ul>

<p>The only real challenge here is cache invalidation, which can get complicated when updates or deletions happen. In this system, though, the problem is smaller because the workload is read-heavy and mappings rarely change once created.</p>

<p>The cache also needs time to warm up, so the first few requests for a URL may still hit the database. Since memory is limited, we also need to think carefully about cache size, eviction policies such as LRU, and which entries are worth storing.</p>
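<p>To make the hit/miss path and LRU eviction concrete, here is a minimal in-process sketch. In a real deployment the cache would be Redis or Memcached. The class name <code class="language-plaintext highlighter-rouge">LruUrlCache</code> and the <code class="language-plaintext highlighter-rouge">load_from_db</code> callback are hypothetical.</p>

```python
from collections import OrderedDict
from typing import Callable, Optional


class LruUrlCache:
    """Tiny LRU cache in front of a slower short-code lookup function."""

    def __init__(self, load_from_db: Callable[[str], Optional[str]], capacity: int):
        self._load_from_db = load_from_db
        self._capacity = capacity
        self._entries: "OrderedDict[str, str]" = OrderedDict()
        self.hits = 0
        self.misses = 0

    def get(self, short_code: str) -> Optional[str]:
        if short_code in self._entries:
            self.hits += 1
            self._entries.move_to_end(short_code)  # mark as most recently used
            return self._entries[short_code]
        self.misses += 1
        long_url = self._load_from_db(short_code)  # cache miss: go to the database
        if long_url is not None:
            self._entries[short_code] = long_url
            if len(self._entries) > self._capacity:
                self._entries.popitem(last=False)  # evict the least recently used
        return long_url


if __name__ == "__main__":
    fake_db = {"abc123": "https://example.com/landing"}
    cache = LruUrlCache(fake_db.get, capacity=1000)
    cache.get("abc123")  # miss, populates the cache
    cache.get("abc123")  # hit, served from memory
    print(cache.hits, cache.misses)
```

<p>Unknown codes are deliberately not cached in this sketch. Whether to cache negative results is a separate design decision.</p>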

<h3 id="3-scaling-the-standard-design-to-1-billion-urls">3. Scaling the standard design to 1 billion URLs</h3>

<p>Let’s do some rough sizing.</p>

<p>Each row in the database contains:</p>

<ul>
  <li>A short code, roughly 8 bytes</li>
  <li>A long URL, roughly 100 bytes</li>
  <li><code class="language-plaintext highlighter-rouge">creationTime</code>, roughly 8 bytes</li>
  <li>An optional custom alias, roughly 100 bytes</li>
  <li>An expiration date, roughly 8 bytes</li>
</ul>

<p>That totals about 224 bytes per row. If we round up to 500 bytes to account for metadata such as creator ID, analytics ID, and internal overhead, then 1 billion mappings would require:</p>

<p><code class="language-plaintext highlighter-rouge">500 bytes * 1 billion rows = 500 GB</code></p>

<p>That is still within the capabilities of modern SSDs, and if we need more headroom, we can shard data across multiple servers.</p>
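<p>The back-of-the-envelope sizing can be written down as a few lines of Python, which makes it easy to rerun with different assumptions. The field sizes are the rough estimates listed above, and the dictionary keys are just labels chosen for this example.</p>

```python
# Per-row byte estimates taken from the list above.
ROW_FIELDS = {
    "short_code": 8,
    "long_url": 100,
    "creation_time": 8,
    "custom_alias": 100,
    "expiration": 8,
}

RAW_ROW_BYTES = sum(ROW_FIELDS.values())  # bytes per row before any overhead
PADDED_ROW_BYTES = 500                    # rounded up for metadata and overhead


def storage_gb(rows: int, bytes_per_row: int = PADDED_ROW_BYTES) -> float:
    """Total storage in decimal gigabytes (1 GB = 10**9 bytes)."""
    return rows * bytes_per_row / 10 ** 9


if __name__ == "__main__":
    print(f"raw row size: {RAW_ROW_BYTES} bytes")
    print(f"1 billion rows: {storage_gb(10 ** 9):,.0f} GB")
```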

<p>Reads are much more frequent than writes, so we can separate the system into reader and writer services and scale them independently. We can then add more server instances behind a load balancer to handle higher RPS without concentrating load on a single machine.</p>

<p>Here is the high-level difference between the two versions of the problem:</p>

<table>
  <thead>
    <tr>
      <th>Aspect</th>
      <th>Standard shortener</th>
      <th>Trillion-scale shortener</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <td>ID generation</td>
      <td>Centralized counter or sequence</td>
      <td>Range-based allocation across many writers</td>
    </tr>
    <tr>
      <td>Collision handling</td>
      <td>Database checks are acceptable</td>
      <td>Uniqueness must be generated up front</td>
    </tr>
    <tr>
      <td>Storage</td>
      <td>Single relational cluster can still work</td>
      <td>Sharded distributed key-value storage</td>
    </tr>
    <tr>
      <td>Read path</td>
      <td>Cache in front of the database</td>
      <td>Multi-region cache plus edge-aware reads</td>
    </tr>
    <tr>
      <td>Public code shape</td>
      <td>Sequential codes may be acceptable</td>
      <td>Codes should be scrambled to prevent enumeration</td>
    </tr>
  </tbody>
</table>

<p>All of this works well for the standard problem. Now let’s go back to the trillion-URL version and look at why those answers stop working.</p>

<p>The core shift is from generating IDs with local randomness to treating the shortcode space as a globally allocated namespace.</p>

<h2 id="why-the-usual-approaches-fail-at-1-trillion-urls">Why the Usual Approaches Fail at 1 Trillion URLs</h2>

<p>The constraints change the problem enough that several standard answers break down.</p>

<h3 id="1-truncated-hashes-stop-being-safe">1. Truncated hashes stop being safe</h3>

<p>The assumption that MD5 plus base-62 gives us enough entropy fails here. Once we truncate the hash to just 7 characters, we are throwing away most of the hash space, and the birthday paradox tells us that collisions become practically unavoidable long before the namespace fills up.</p>

<p>The birthday paradox says that in a room of just 23 people, there is about a 50% chance that two people share the same birthday, even though there are 365 possible days.</p>

<p>That feels surprising because 23 is much smaller than 365.</p>

<p>The reason is that we are comparing many pairs, not just one.</p>

<p>The number of comparisons among <code class="language-plaintext highlighter-rouge">k</code> items is:</p>

\[\frac{k(k-1)}{2}\]

<p>So the probability of collision grows quadratically as the number of generated IDs increases.</p>

<p>After base-62 encoding, the total possible ID space is approximately 3.5 trillion. The rough intuition is that collisions start becoming noticeable around $\sqrt{N}$, but the more precise 50% threshold is:</p>

\[k_{50} \approx \sqrt{2N \ln 2}\]

<p>Where:</p>

<ul>
  <li><code class="language-plaintext highlighter-rouge">k_{50}</code> is the point where the chance of at least one collision is about 50%</li>
  <li><code class="language-plaintext highlighter-rouge">N</code> is the size of the total ID space</li>
</ul>

<p>For 7 base-62 characters:</p>

\[N = 62^7\]

\[N \approx 3.52 \times 10^{12}\]

<p>So:</p>

\[k_{50} \approx \sqrt{2 \cdot 62^7 \cdot \ln 2}\]

\[k_{50} \approx 2.21 \times 10^6\]

<p>So by about <strong>2.2 million generated IDs</strong>, the system already has roughly a <strong>50% chance of at least one collision</strong>. Even at <strong>2 million IDs</strong>, the probability is already about 43%. In this system, that is disastrous, because a user could be redirected to the wrong site.</p>
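<p>The threshold is easy to reproduce numerically. This sketch uses the standard exponential approximation for the birthday problem and assumes codes are uniformly random. The function names are illustrative.</p>

```python
import math

N = 62 ** 7  # size of the 7-character base-62 namespace


def collision_probability(k: int, n: int = N) -> float:
    """Approximate P(at least one collision) after k uniformly random draws."""
    return 1.0 - math.exp(-k * (k - 1) / (2 * n))


def k_for_half(n: int = N) -> float:
    """Number of draws at which the collision probability reaches about 50%."""
    return math.sqrt(2 * n * math.log(2))


if __name__ == "__main__":
    print(f"k_50 is roughly {k_for_half():,.0f}")
    for k in (100_000, 1_000_000, 2_000_000):
        print(f"{k:>9,} IDs: {collision_probability(k):.1%}")
```

<p>Running it for a few values of <code class="language-plaintext highlighter-rouge">k</code> shows how quickly the probability climbs once the number of IDs approaches the square root of the namespace.</p>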

<h3 id="2-retry-on-collision-becomes-too-expensive">2. Retry-on-collision becomes too expensive</h3>

<p>One way to patch the problem is to salt the URL and keep rehashing until you find a unique value.</p>

<p>That creates a loop like this:</p>

<p><code class="language-plaintext highlighter-rouge">hash -&gt; DB check -&gt; collision? -&gt; rehash -&gt; check again</code></p>

<p>At trillion-record scale, the database index will itself be several terabytes. Every collision check becomes a random lookup against that massive index, and many of those lookups will fall out of cache.</p>

<p>If you have heavy write traffic, you end up spending a large chunk of your compute budget just proving that a string has not been used before. In the worst case, that becomes extremely expensive and operationally ugly.</p>

<p>We need to stop searching for uniqueness and start generating uniqueness.</p>

<h2 id="generating-uniqueness-instead-of-searching-for-it">Generating Uniqueness Instead of Searching for It</h2>

<p>The simplest way to guarantee uniqueness is to use a counter again. That mathematically guarantees a unique value for every URL.</p>

<p>But now we hit the next problem: we cannot hand users a giant integer because the short URL is constrained to at most 7 characters.</p>

<h3 id="turning-a-large-integer-into-a-7-character-code">Turning a large integer into a 7-character code</h3>

<p>The solution is to map the large integer into a base-62 string.</p>

<p>A base-62 alphabet can look like this:</p>

<ul>
  <li><code class="language-plaintext highlighter-rouge">0-9</code> (10 characters)</li>
  <li><code class="language-plaintext highlighter-rouge">a-z</code> (26 characters)</li>
  <li><code class="language-plaintext highlighter-rouge">A-Z</code> (26 characters)</li>
</ul>

<p>That gives us:</p>

\[62 = 10 + 26 + 26\]

<p>Here is a simple example. Assume our counter is <code class="language-plaintext highlighter-rouge">500</code>.</p>

\[500 / 62 = 8 \text{ remainder } 4\]

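<p>The remainder <code class="language-plaintext highlighter-rouge">4</code> maps to the character at index <code class="language-plaintext highlighter-rouge">4</code> in the alphabet, and the quotient <code class="language-plaintext highlighter-rouge">8</code> becomes the next digit in the conversion. Since <code class="language-plaintext highlighter-rouge">8</code> is already smaller than 62, it maps directly to the character at index <code class="language-plaintext highlighter-rouge">8</code>, so with the alphabet above <code class="language-plaintext highlighter-rouge">500</code> encodes as <code class="language-plaintext highlighter-rouge">84</code>.</p>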

<p>This gives us a one-to-one mapping where every integer in the usable range maps to a unique short string, and we can decode it back correctly if needed. There are no collision checks and no guesswork. The conversion takes at most 7 division steps, so the mapping itself is effectively <code class="language-plaintext highlighter-rouge">O(1)</code>.</p>
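<p>Here is a small sketch of the encode/decode pair, assuming the alphabet order listed above (digits, then lowercase, then uppercase). The names <code class="language-plaintext highlighter-rouge">encode</code>, <code class="language-plaintext highlighter-rouge">decode</code>, and <code class="language-plaintext highlighter-rouge">MAX_ID</code> are chosen for this example.</p>

```python
ALPHABET = "0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ"
INDEX = {ch: i for i, ch in enumerate(ALPHABET)}
MAX_ID = 62 ** 7 - 1  # largest counter value that still fits in 7 characters


def encode(counter: int) -> str:
    """Convert a counter value into a base-62 string of at most 7 characters."""
    if counter not in range(MAX_ID + 1):
        raise ValueError("counter does not fit in 7 base-62 characters")
    digits = []
    while True:
        counter, rem = divmod(counter, 62)  # peel off the lowest base-62 digit
        digits.append(ALPHABET[rem])
        if counter == 0:
            break
    return "".join(reversed(digits))


def decode(code: str) -> int:
    """Invert encode(): rebuild the integer digit by digit."""
    value = 0
    for ch in code:
        value = value * 62 + INDEX[ch]
    return value


if __name__ == "__main__":
    print(encode(500), decode(encode(500)))
```

<p>The range check is what enforces the 7-character constraint: any counter value beyond <code class="language-plaintext highlighter-rouge">62^7 - 1</code> is rejected instead of silently producing an 8-character code.</p>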

<h3 id="avoiding-a-global-counter-bottleneck">Avoiding a global counter bottleneck</h3>

<p>A single counter is a disaster in a distributed system, because every writer would have to hit the same counter on every single write.</p>

<p>To solve that, we introduce a coordinator service like ZooKeeper. It acts as a distributed, highly available source of truth and hands out ranges of IDs to each server.</p>

<p>For example:</p>

<ul>
  <li>Server A gets <code class="language-plaintext highlighter-rouge">1</code> to <code class="language-plaintext highlighter-rouge">1,000,000</code></li>
  <li>Server B gets <code class="language-plaintext highlighter-rouge">1,000,001</code> to <code class="language-plaintext highlighter-rouge">2,000,000</code></li>
</ul>

<p>Now each server can allocate IDs locally in memory by incrementing a local integer. That means:</p>

<ul>
  <li>No network calls for every URL creation</li>
  <li>No coordination on every write</li>
  <li>No locks</li>
  <li>No database uniqueness checks</li>
</ul>

<p>Once a server exhausts its range, it goes back to ZooKeeper and requests another one.</p>

<p>If Server A crashes, the worst case is that we lose some unused IDs from its range. That is fine. We have roughly 3.5 trillion total combinations and only need 1 trillion of them, so wasting some IDs is acceptable.</p>

<p>This makes ID generation solid. We are no longer hoping collisions do not happen. We have mathematically eliminated them.</p>
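<p>The range-leasing idea can be sketched without any real coordination service. In the example below, <code class="language-plaintext highlighter-rouge">RangeCoordinator</code> is an in-process stand-in for ZooKeeper and <code class="language-plaintext highlighter-rouge">LocalIdAllocator</code> plays the role of a single writer. Both names, and the range size, are assumptions made for illustration.</p>

```python
import threading
from typing import Tuple


class RangeCoordinator:
    """In-process stand-in for the coordinator (ZooKeeper in the text)."""

    def __init__(self, range_size: int, start: int = 1):
        self._next_start = start
        self._range_size = range_size
        self._lock = threading.Lock()

    def lease_range(self) -> Tuple[int, int]:
        """Hand out the next unused, inclusive [first, last] block of IDs."""
        with self._lock:
            first = self._next_start
            self._next_start += self._range_size
            return first, first + self._range_size - 1


class LocalIdAllocator:
    """Per-server allocator: one coordinator call per range, not per ID.

    Not thread-safe on its own; this sketch assumes one allocator per worker.
    """

    def __init__(self, coordinator: RangeCoordinator):
        self._coordinator = coordinator
        self._next, self._last = coordinator.lease_range()

    def next_id(self) -> int:
        if self._next > self._last:  # range exhausted: lease a fresh one
            self._next, self._last = self._coordinator.lease_range()
        value = self._next
        self._next += 1
        return value


if __name__ == "__main__":
    coordinator = RangeCoordinator(range_size=1_000_000)
    server_a = LocalIdAllocator(coordinator)
    server_b = LocalIdAllocator(coordinator)
    print(server_a.next_id(), server_b.next_id())
```

<p>If an allocator disappears mid-range, the rest of its block is simply never used, which mirrors the crash scenario described above.</p>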

<h2 id="the-storage-wall">The Storage Wall</h2>

<p>Now we get to the storage problem.</p>

<p>If each URL takes roughly 100 bytes, then <code class="language-plaintext highlighter-rouge">100B * 1T = 100 TB</code> of raw URL data alone, and that is before timestamps, indexes, metadata, replication, and everything else.</p>

<p>That is not something a single PostgreSQL instance should handle. Even if it could, a deep B-tree index at this scale would translate into a cascade of disk lookups.</p>

<p>So the obvious answer is to distribute and shard the data across multiple nodes. For example, at the earlier estimate of 500 bytes per row, 1 trillion rows come to about 500 TB, so 100 nodes holding 5 TB each make the problem much more manageable.</p>

<h3 id="routing-requests-to-the-right-shard">Routing requests to the right shard</h3>

<p>The next question is how to know which node holds which URL.</p>

<p>The simplest approach is modulo-based sharding. The routing rule can be as simple as:</p>

<p><code class="language-plaintext highlighter-rouge">shard = hash(short_url) % 100</code></p>

<p>That gives us <code class="language-plaintext highlighter-rouge">O(1)</code> routing without a central directory.</p>

<p>If we expect the number of shards to change frequently, true consistent hashing or rendezvous hashing is a better choice because it minimizes remapping when nodes are added or removed.</p>
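<p>Both routing strategies fit in a few lines. The sketch below uses SHA-256 as a stable hash, since Python’s built-in <code class="language-plaintext highlighter-rouge">hash()</code> is salted per process. The node names and function names are hypothetical.</p>

```python
import hashlib
from typing import List


def stable_hash(text: str) -> int:
    """Deterministic 64-bit hash that is identical across processes."""
    return int.from_bytes(hashlib.sha256(text.encode("utf-8")).digest()[:8], "big")


def modulo_shard(short_code: str, shard_count: int) -> int:
    """shard = hash(short_url) % shard_count"""
    return stable_hash(short_code) % shard_count


def rendezvous_shard(short_code: str, nodes: List[str]) -> str:
    """Rendezvous hashing: pick the node with the highest score for this key."""
    return max(nodes, key=lambda node: stable_hash(node + "/" + short_code))


if __name__ == "__main__":
    nodes = [f"node-{i}" for i in range(10)]
    print(modulo_shard("abc1234", 100))
    print(rendezvous_shard("abc1234", nodes))
```

<p>With modulo sharding, changing the shard count remaps most keys. With rendezvous hashing, adding a node only moves the keys that the new node now wins.</p>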

<p>Since this workload is mostly key-value lookups with no major joins or cross-node transactions, databases like Cassandra or DynamoDB are a much better fit than a monolithic relational database.</p>

<p>They are built for exactly this type of access pattern. Cassandra’s LSM-tree storage engine, for example, absorbs sustained write volume well, and both systems simplify sharding and replication.</p>

<h3 id="the-read-path-at-trillion-scale">The read path at trillion scale</h3>

<p>The Pareto principle applies strongly here: a small fraction of URLs will receive most of the traffic.</p>

<p>So treating every URL equally is wasteful. Instead, we place a distributed cache such as Redis or Memcached in front of the storage layer.</p>

<p>The read path becomes:</p>

<ul>
  <li>Cache hit: return the URL in under a millisecond.</li>
  <li>Cache miss: look up the correct shard in Cassandra, fetch the URL, store it in Redis, and then return the redirect.</li>
</ul>

<h2 id="security-global-ux-and-the-reductionist-mindset">Security, Global UX, and the Reductionist Mindset</h2>

<p>There is one important issue that usually does not come up early in system design interviews: security.</p>

<h3 id="making-the-urls-unpredictable">Making the URLs unpredictable</h3>

<p>If the system simply uses a visible counter, then the short URLs become predictable. If someone knows one URL, they can often guess the previous or next one.</p>

<p>That means an attacker could scrape the namespace, discover private links, or infer business intelligence like usage growth.</p>

<p>So even though we still want a counter under the hood, we need the outward-facing short codes to look random.</p>

<p>That is where a Feistel cipher helps. More precisely, we use it as a reversible permutation over the fixed ID space: the counter value is scrambled into another integer of the same size.</p>

<p>Every input still maps to a unique output, and we can still reverse it back to the original value. In other words, it behaves like format-preserving scrambling. But from the outside, the output looks random.</p>

<p>That gives us the best of both worlds: guaranteed uniqueness underneath and non-predictable public URLs on top.</p>
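<p>To show what such a scrambling step might look like, here is a toy Feistel network over a 42-bit block, combined with cycle-walking so the output stays inside the <code class="language-plaintext highlighter-rouge">62^7</code> namespace. The key, round count, and BLAKE2-based round function are assumptions of this sketch. It is not a vetted format-preserving encryption scheme.</p>

```python
import hashlib

NAMESPACE = 62 ** 7   # valid IDs are 0 .. 62**7 - 1
HALF = 2 ** 21        # two 21-bit halves give a 42-bit block (2**42 exceeds 62**7)
ROUNDS = 4            # assumed round count for this sketch
KEY = b"demo-key"     # hypothetical secret; load it from secure config in practice


def _round(value: int, round_no: int) -> int:
    """Keyed round function: hash (round, half) down to 21 bits."""
    data = round_no.to_bytes(1, "big") + value.to_bytes(3, "big")
    digest = hashlib.blake2b(data, key=KEY, digest_size=4).digest()
    return int.from_bytes(digest, "big") % HALF


def _permute(x: int) -> int:
    left, right = divmod(x, HALF)
    for r in range(ROUNDS):
        left, right = right, left ^ _round(right, r)
    return left * HALF + right


def _unpermute(y: int) -> int:
    left, right = divmod(y, HALF)
    for r in reversed(range(ROUNDS)):
        left, right = right ^ _round(left, r), left
    return left * HALF + right


def scramble(counter: int) -> int:
    """Map a counter value to a random-looking ID in the same range."""
    if counter not in range(NAMESPACE):
        raise ValueError("counter outside the 7-character namespace")
    y = _permute(counter)
    while y >= NAMESPACE:  # cycle-walk until we land back inside the namespace
        y = _permute(y)
    return y


def unscramble(public_id: int) -> int:
    """Invert scramble() by walking the same cycle backwards."""
    x = _unpermute(public_id)
    while x >= NAMESPACE:
        x = _unpermute(x)
    return x


if __name__ == "__main__":
    for counter in (1, 2, 3):
        print(counter, scramble(counter), unscramble(scramble(counter)))
```

<p>The scrambled integer would then be base-62-encoded as before. Because the Feistel structure is a permutation, two different counter values can never produce the same public ID.</p>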

<h3 id="global-reads-and-disaster-scenarios">Global reads and disaster scenarios</h3>

<p>Now imagine a regional failure, or simply a user very far away from the region where the system is deployed.</p>

<p>If all redirects are served from, say, Northern Virginia, then a user in Tokyo has to make a full round trip to Virginia before being redirected. That can easily add around 300 ms, which is a poor user experience for something as simple as a redirect.</p>

<p>To avoid that, we can push reads and redirects closer to the user with a CDN like Cloudflare or Akamai, but that only helps if the short-code mapping is also available near the edge. That can be done by caching redirect responses, storing hot mappings in edge KV, or running edge compute backed by nearby replicas. Otherwise, the edge still has to call back to origin.</p>

<p>With edge-local data, the Tokyo user can be served from the Tokyo edge in something closer to 10 ms.</p>

<p>Writes are different. If a user in New York creates a link, it gets written to Cassandra and then replicated asynchronously to Tokyo.</p>

<p>That brings us to the CAP theorem.</p>

<p>The CAP theorem says that when a network partition happens, a distributed system has to choose between consistency and availability. In this case, we are willing to give up immediate consistency so the system stays available during partitions.</p>

<p>That means eventual consistency is acceptable. If a user in Tokyo cannot immediately access a link that was just created in New York because replication is still catching up, that is not a fatal problem. A refresh a few seconds later should resolve it.</p>

<p>So we are okay with eventual consistency, but we are not willing to compromise on availability or partition tolerance.</p>

<p>The one thing we absolutely cannot compromise on is uniqueness. That is the beauty of the range-manager approach. Even if New York and Tokyo are completely isolated from each other, as long as they were assigned non-overlapping ranges before the split, they can continue generating unique URLs independently with no risk of overlap.</p>

<h2 id="final-takeaway">Final Takeaway</h2>

<p>Let’s step back and trace what we built here.</p>

<p>We did not approach this as a simple URL-shortening coding task. We approached it as a namespace-management problem.</p>

<ol>
  <li>We determined the size of the namespace: <code class="language-plaintext highlighter-rouge">62^7</code>, or about 3.5 trillion possible short codes.</li>
  <li>We reduced uniqueness to a sequential counter.</li>
  <li>We handled distributed coordination through range allocation with ZooKeeper.</li>
  <li>We scaled storage with sharded key-value data in Cassandra.</li>
  <li>We reduced read latency by caching the hot set with Redis.</li>
  <li>We addressed predictability and security with a Feistel cipher.</li>
</ol>

<p>That is the reductionist mindset.</p>

<p>At this scale, you cannot brute-force your way through the problem with more code. You have to find the right abstraction that makes the problem manageable again.</p>

<p>If you walk into an interview and say, “I’ll use a hash and hope for the best,” you are thinking like a coder, which is fine. But if you talk about the physics of the ID space, explain why LSM trees outperform B-trees for this write-heavy profile, or discuss the birthday paradox and what it means for truncated hashes, you show a deeper understanding of how systems actually work.</p>

<p>That is what I mean when I say that understanding why a system fails is more important than knowing how to build it.</p>

<p>In the end, this is TinyURL at trillion scale: a simple-looking problem on the surface, but a masterclass in distributed systems underneath.</p>]]></content><author><name>Sathwick</name><email>sathwick.p7@gmail.com</email></author><category term="architecture" /><summary type="html"><![CDATA[Why the standard Bitly design breaks at trillion scale, and how deterministic IDs, range allocation, sharding, caching, and Feistel-based scrambling make it work.]]></summary></entry><entry><title type="html">The Architecture of Copying and Pasting Images on the Web</title><link href="https://sathwick.xyz/blog/copypaste.html" rel="alternate" type="text/html" title="The Architecture of Copying and Pasting Images on the Web" /><published>2026-03-07T00:00:00+00:00</published><updated>2026-03-07T00:00:00+00:00</updated><id>https://sathwick.xyz/blog/copypaste</id><content type="html" xml:base="https://sathwick.xyz/blog/copypaste.html"><![CDATA[<p>Copying an image from one website and pasting it to another. No downloads, no temporary files, no dragging things to your desktop. Just <code class="language-plaintext highlighter-rouge">Ctrl+C</code> -&gt; <code class="language-plaintext highlighter-rouge">Ctrl+V</code> and the image shows up as if it teleported across the web.</p>

<p>To understand how this works internally, we need to follow the image out of a sandboxed renderer process: serializing internal memory structures, navigating the inter-process communication (<code class="language-plaintext highlighter-rouge">IPC</code>) frameworks of the host OS, interfacing with legacy and modern clipboard APIs across platforms, and ultimately reconstructing the data into a secure, scriptable object within a distinct Document Object Model (<code class="language-plaintext highlighter-rouge">DOM</code>).</p>

<p>This article dives deep into this exact feature, detailing the lifecycle of a copied image starting from the browser’s rendering engine, traversing through macOS, Windows, and Linux (both X11 and Wayland) OS clipboards, and securely re-entering a sandboxed web application.</p>

<h2 id="browser-side-copy-operation">Browser-Side Copy Operation</h2>

<p>The operation initiates when a user triggers a context menu over an image element and selects “Copy Image.” This action bypasses standard JavaScript clipboard API interceptions, which are typically gated by <code class="language-plaintext highlighter-rouge">ClipboardEvent.clipboardData</code>, and directly invokes the browser’s internal native handlers.</p>

<h3 id="image-retrieval-from-the-rendering-engine">Image Retrieval from the Rendering Engine</h3>

<p>When “Copy Image” is invoked, the browser must extract the visual data. Modern layout engines, such as Blink in Chrome, Gecko in Firefox, or WebKit in Safari, do not simply fetch the image from the network cache. While the compressed original bytes might exist in the HTTP disk or memory cache, a rendered image may have been modified by CSS, transformed, or drawn to an HTML5 <code class="language-plaintext highlighter-rouge">&lt;canvas&gt;</code>.</p>

<p>Instead, the browser’s rendering subsystem extracts the fully decoded bitmap currently residing in memory. In Chromium’s Blink engine, images are represented via the <code class="language-plaintext highlighter-rouge">blink::Image</code> abstraction.
Specifically, a <code class="language-plaintext highlighter-rouge">BitmapImage</code> (which often wraps an <span class="define" data-term="SkBitmap" data-definition="A class in the Skia graphics library representing a rectangular array of pixels stored in system memory, used for CPU-side image manipulation and rendering."><code class="language-plaintext highlighter-rouge">SkBitmap</code></span> or <code class="language-plaintext highlighter-rouge">SkImage</code> from the <span class="define" data-term="Skia" data-definition="An open-source 2D graphics library maintained by Google. It serves as the rendering backend for Chrome, Android, Flutter, and many other products, handling text, shapes, and image drawing."><code class="language-plaintext highlighter-rouge">Skia</code></span> graphics library) contains the raw pixel data. If the browser employs hardware-accelerated compositing, the <code class="language-plaintext highlighter-rouge">SkImage</code> may reside in GPU <span class="define" data-term="VRAM" data-definition="Video Random Access Memory. Dedicated high-bandwidth memory on the graphics card used for storing textures, framebuffers, and other GPU-accessible data. Faster than system RAM for GPU operations but not directly accessible by the CPU."><code class="language-plaintext highlighter-rouge">VRAM</code></span> as an OpenGL texture or Vulkan buffer.
To place this on the CPU-side OS clipboard, the engine must perform a <span class="define" data-term="GPU readback" data-definition="The process of copying pixel data from GPU video memory (VRAM) back into CPU-accessible system RAM. This is an expensive operation because it stalls the GPU pipeline and requires synchronization between the CPU and GPU."><code class="language-plaintext highlighter-rouge">GPU readback</code></span>, a computationally expensive operation in which pixels are copied from VRAM back into system RAM via <code class="language-plaintext highlighter-rouge">glReadPixels</code> or equivalent APIs, converting the hardware texture back into a software <code class="language-plaintext highlighter-rouge">SkBitmap</code>.</p>

<h3 id="generation-of-internal-mime-representations">Generation of Internal MIME Representations</h3>

<p>The OS clipboard is entirely format-agnostic; it acts as a generic key-value store where keys are format identifiers and values are binary blobs. To ensure the highest probability of successful pasting into diverse native applications, the browser generates multiple simultaneous representations of the image.</p>

<p>A single “Copy Image” action typically generates several internal representations before they are mapped to OS-specific formats. First, the engine re-encodes the raw <code class="language-plaintext highlighter-rouge">SkBitmap</code> pixel data into a standard compressed format, overwhelmingly <code class="language-plaintext highlighter-rouge">image/png</code>. This re-encoding step is crucial as it ensures a standardized file header and strips out malformed or proprietary data chunks.
Second, the browser generates an HTML fragment representing the image, labeled as <code class="language-plaintext highlighter-rouge">text/html</code>. This often embeds the image as a Base64-encoded data URI or provides an <code class="language-plaintext highlighter-rouge">&lt;img&gt;</code> tag pointing to the source URL.</p>

<div class="language-html highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nt">&lt;meta</span> <span class="na">charset=</span><span class="s">'utf-8'</span><span class="nt">&gt;</span>
<span class="nt">&lt;img</span> <span class="na">src=</span><span class="s">"data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAAAAEAAAABCAYAAAAfFcSJAAAADUlEQVR42mP8/5+hHgAHggJ/PchI7wAAAABJRU5ErkJggg=="</span>
     <span class="na">alt=</span><span class="s">"Description"</span>
     <span class="na">width=</span><span class="s">"500"</span>
     <span class="na">height=</span><span class="s">"300"</span><span class="nt">&gt;</span>
</code></pre></div></div>

<p>Finally, the absolute URL of the image is provided as <code class="language-plaintext highlighter-rouge">text/plain</code> as a fallback for text-only paste targets.</p>

<p>It’s important to know the difference between the two copy operations presented to the user. The “Copy Image” command extracts the decoded bitmap, re-encodes it, and places the binary blob on the clipboard alongside HTML and text fallbacks.
Conversely, “Copy Image Address” simply extracts the <code class="language-plaintext highlighter-rouge">src</code> attribute from the <code class="language-plaintext highlighter-rouge">DOM</code> node and places it on the clipboard exclusively as <code class="language-plaintext highlighter-rouge">text/plain</code>.</p>

<h2 id="inter-process-communication-and-memory-ownership">Inter-Process Communication and Memory Ownership</h2>

<p>Because web pages execute in highly restricted, sandboxed “Renderer” processes, they lack the operating system privileges required to interact with the global system clipboard directly. The Renderer must therefore serialize the extracted image and transmit it to the highly privileged “Browser” process.</p>

<p>In Chromium, this boundary is crossed using <span class="define" data-term="Mojo" data-definition="Chromium's IPC (Inter-Process Communication) framework. It provides strongly-typed message passing between processes using interface definition language (mojom), replacing the older Chrome IPC system."><code class="language-plaintext highlighter-rouge">Mojo</code></span>, a lightweight message passing system. The Blink pasteboard implementation, specifically <code class="language-plaintext highlighter-rouge">blink::Pasteboard::writeImage</code>, formulates an IPC message historically routed via <code class="language-plaintext highlighter-rouge">ClipboardHostMsg_WriteImage</code> and now managed via strongly typed Mojo interfaces.</p>

<p>Image data is inherently large. Passing a multi-megabyte decoded bitmap over a standard UNIX domain socket or named pipe via standard message serialization would introduce massive latency and memory duplication. To circumvent this, Mojo utilizes a structure called <code class="language-plaintext highlighter-rouge">mojo_base.mojom.BigBuffer</code>.</p>

<p>When a payload exceeds a specific threshold, <code class="language-plaintext highlighter-rouge">BigBuffer</code> transparently shifts from an inline byte array to a <code class="language-plaintext highlighter-rouge">BigBufferSharedMemoryRegion</code>. The Renderer process requests the OS to allocate an anonymous shared memory segment, writes the encoded PNG bytes into it, and sends only the file descriptor (or Windows handle) and size over the Mojo IPC channel.
The Browser process maps this shared memory into its own address space, allowing zero-copy transmission of the image payload across the process boundary. Once the Browser process receives this message, the <code class="language-plaintext highlighter-rouge">ClipboardHostImpl</code> verifies the data, manages sequence tokens to prevent race conditions, and interfaces with the OS-specific clipboard APIs.</p>
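
<p>The inline-versus-shared-memory switch can be sketched in a few lines of TypeScript. This is only a model of the idea, not Chromium code: the <code class="language-plaintext highlighter-rouge">BigBufferModel</code> type, the <code class="language-plaintext highlighter-rouge">allocateSharedRegion</code> helper, and the 64 KiB threshold are all assumptions made for illustration.</p>

```typescript
// Hypothetical model of BigBuffer's two storage modes (not Chromium's real types).
type BigBufferModel =
  | { kind: "inline"; bytes: Uint8Array }
  | { kind: "sharedMemory"; handle: number; size: number };

// Assumed threshold for this sketch; the real value is an implementation detail.
const MAX_INLINE_BYTES = 64 * 1024;

// Stand-in for an OS shared-memory allocator: stores the bytes, returns a fake handle.
const fakeSharedRegions = new Map<number, Uint8Array>();
let nextHandle = 1;

function allocateSharedRegion(bytes: Uint8Array): number {
  const handle = nextHandle++;
  fakeSharedRegions.set(handle, bytes);
  return handle;
}

// Small payloads travel inside the message; large ones are replaced by a handle and size.
function toBigBuffer(bytes: Uint8Array): BigBufferModel {
  if (bytes.length <= MAX_INLINE_BYTES) {
    return { kind: "inline", bytes };
  }
  return { kind: "sharedMemory", handle: allocateSharedRegion(bytes), size: bytes.length };
}
```

<p>In this model, a multi-megabyte PNG crosses the process boundary as a small handle-plus-size record, while a tiny icon is simply carried inline in the message.</p>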

<h3 id="architectural-diagram-browser-process-boundary">Architectural Diagram: Browser Process Boundary</h3>

<p><img src="/assets/2026-03-07-copypaste/clipboardarch.png" alt="Clipboard image architecture diagram" /></p>

<h2 id="os-specific-clipboard-layer-architecture">OS Specific Clipboard Layer Architecture</h2>

<p>Clipboards vary across different OSes. The browser must translate its internal web-standard MIME types into the native data structures expected by macOS, Windows, and Linux to ensure seamless interoperability with native applications.</p>

<h3 id="windows-win32-clipboard-api">Windows Win32 Clipboard API</h3>

<p>On Windows, the clipboard is a shared system resource accessed via the legacy Win32 API. When Chromium’s <code class="language-plaintext highlighter-rouge">ClipboardWin::WriteBitmap</code> executes, it translates the incoming <code class="language-plaintext highlighter-rouge">SkBitmap</code> into <code class="language-plaintext highlighter-rouge">Device Independent Bitmap</code> (<code class="language-plaintext highlighter-rouge">DIB</code>) formats.</p>

<p>Windows historically relies on <code class="language-plaintext highlighter-rouge">CF_BITMAP</code> (a GDI handle), <code class="language-plaintext highlighter-rouge">CF_DIB</code>, and <code class="language-plaintext highlighter-rouge">CF_DIBV5</code>. Because standard <code class="language-plaintext highlighter-rouge">CF_DIB</code> does not reliably support alpha channels for transparency, modern browsers write <code class="language-plaintext highlighter-rouge">CF_DIBV5</code>, which includes a <code class="language-plaintext highlighter-rouge">BITMAPV5HEADER</code> specifying color masks, color space information, and alpha values. However, because of long-standing bugs in legacy software, such as Microsoft Office mishandling <code class="language-plaintext highlighter-rouge">CF_DIBV5</code> alpha channels and producing black backgrounds, browsers also explicitly write a standardized PNG format blob.</p>

<p>Thus, the Windows clipboard receives both <code class="language-plaintext highlighter-rouge">DIB</code> formats and a raw PNG blob. The order of format registration is vital: browsers prioritize the PNG format so that PNG-aware applications select it over the lossy or buggy <code class="language-plaintext highlighter-rouge">DIB</code> representations.</p>
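
<p>To see why registration order matters, consider a pasting application that walks the available formats in the order they were placed on the clipboard (the order <code class="language-plaintext highlighter-rouge">EnumClipboardFormats</code> reports them) and takes the first one it understands. The TypeScript sketch below models only that selection rule; the <code class="language-plaintext highlighter-rouge">chooseFormat</code> function and the string format names are illustrative assumptions, and real applications may instead ask for one specific format directly.</p>

```typescript
// Formats in the order the copying browser registered them (names are illustrative).
const registeredFormats = ["PNG", "CF_DIBV5", "CF_DIB"];

// Returns the first registered format the pasting application supports, or undefined.
function chooseFormat(registered: string[], supported: string[]): string | undefined {
  for (const format of registered) {
    if (supported.indexOf(format) !== -1) {
      return format;
    }
  }
  return undefined;
}
```

<p>With PNG registered first, an application that supports both PNG and <code class="language-plaintext highlighter-rouge">CF_DIB</code> ends up with the PNG, while a legacy application that only knows <code class="language-plaintext highlighter-rouge">CF_DIB</code> still gets a usable bitmap.</p>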

<h3 id="macos-nspasteboard">macOS NSPasteboard</h3>

<p>Apple’s macOS handles clipboard operations via the <code class="language-plaintext highlighter-rouge">NSPasteboard</code> class, which acts as a client-side Objective-C wrapper around the <code class="language-plaintext highlighter-rouge">pbs</code> (pasteboard server) background daemon. The general pasteboard (<code class="language-plaintext highlighter-rouge">NSPasteboard.generalPasteboard</code>) manages data copying across the system.</p>

<p>WebKit and Chromium translate their internal representations into <span class="define" data-term="UTI" data-definition="Uniform Type Identifier. Apple's system for identifying data types using reverse-DNS strings (e.g. 'public.png', 'com.adobe.pdf'). UTIs form a conformance hierarchy, so 'public.png' conforms to 'public.image', which conforms to 'public.data'."><code class="language-plaintext highlighter-rouge">UTIs</code></span>. An image is registered under <code class="language-plaintext highlighter-rouge">public.png</code> (or <code class="language-plaintext highlighter-rouge">NSPasteboardType.png</code> / <code class="language-plaintext highlighter-rouge">NSPasteboardTypePNG</code>). HTML fallbacks are registered as <code class="language-plaintext highlighter-rouge">public.html</code> or the proprietary Apple Web Archive format.</p>

<p>When the browser writes to <code class="language-plaintext highlighter-rouge">NSPasteboard</code>, it packages the image into an <code class="language-plaintext highlighter-rouge">NSPasteboardItem</code>. Unlike Windows, which requires transferring global memory handles, macOS utilizes Mach ports to transfer data to the <code class="language-plaintext highlighter-rouge">pbs</code> daemon’s address space. For extremely large files, macOS supports “promised data” (<code class="language-plaintext highlighter-rouge">NSFilePromiseProvider</code>), where the clipboard merely holds a reference and defers materialization until the drop or paste occurs. However, for standard web images, the binary PNG is written directly to the pasteboard using <code class="language-plaintext highlighter-rouge">setData:forType:</code>.</p>

<h3 id="linux-x11-selection-model">Linux X11 Selection Model</h3>

<p>The X Window System (<code class="language-plaintext highlighter-rouge">X11</code>) does not inherently possess a global “clipboard buffer” that stores binary data like Windows or macOS. Instead, <code class="language-plaintext highlighter-rouge">X11</code> relies on “Selections”, specifically the <code class="language-plaintext highlighter-rouge">CLIPBOARD</code> selection, managed via the <span class="define" data-term="ICCCM" data-definition="Inter-Client Communication Conventions Manual. The X11 specification that defines how X client applications should communicate with each other and the window manager, including clipboard (selection) ownership, data transfer protocols, and session management."><code class="language-plaintext highlighter-rouge">ICCCM</code></span> standard.</p>

<p>When a user copies an image in Firefox or Chrome on <code class="language-plaintext highlighter-rouge">X11</code>, the browser calls <code class="language-plaintext highlighter-rouge">XSetSelectionOwner</code>, claiming ownership of the <code class="language-plaintext highlighter-rouge">CLIPBOARD</code> atom. No image data is transferred to the X server at this point. The browser merely registers itself as the owner. When a user switches to Website B and triggers a paste, the receiving application calls <code class="language-plaintext highlighter-rouge">XConvertSelection</code>, and the X server delivers a <code class="language-plaintext highlighter-rouge">SelectionRequest</code> event to the owner (the browser process that copied the image). The receiving application typically first requests the special <code class="language-plaintext highlighter-rouge">TARGETS</code> atom to discover what formats are available. The copying browser responds with a list of atoms corresponding to MIME types, such as <code class="language-plaintext highlighter-rouge">image/png</code> and <code class="language-plaintext highlighter-rouge">text/html</code>.</p>

<p>Once the receiving app requests <code class="language-plaintext highlighter-rouge">image/png</code>, the copying browser writes the PNG data to a property on the receiving application’s X window using <code class="language-plaintext highlighter-rouge">XChangeProperty</code>. However, the <code class="language-plaintext highlighter-rouge">X11</code> protocol has a maximum request size. For large images, the transfer must be negotiated using the <code class="language-plaintext highlighter-rouge">INCR</code> protocol. The data is chunked, often in 256KB increments, requiring a complex state machine of <code class="language-plaintext highlighter-rouge">SelectionNotify</code> and <code class="language-plaintext highlighter-rouge">PropertyNotify</code> events to stream the image from the sending process to the receiving process memory.</p>
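
<p>The chunking half of an <code class="language-plaintext highlighter-rouge">INCR</code> transfer can be sketched as a pure function. This TypeScript model is an illustration only: <code class="language-plaintext highlighter-rouge">incrChunks</code> is a hypothetical name, the chunk size is a caller-supplied assumption, and the per-chunk handshake of property deletion and <code class="language-plaintext highlighter-rouge">PropertyNotify</code> events is not modeled. The trailing zero-length chunk stands for the empty property write that signals the end of the transfer.</p>

```typescript
// Splits a payload into INCR-style chunks. The final zero-length chunk models
// the empty property write that tells the receiver the transfer is complete.
function incrChunks(data: Uint8Array, chunkSize: number): Uint8Array[] {
  if (chunkSize <= 0) {
    throw new Error("chunkSize must be positive");
  }
  const chunks: Uint8Array[] = [];
  for (let offset = 0; offset < data.length; offset += chunkSize) {
    // subarray clamps the end index, so the last chunk may be shorter.
    chunks.push(data.subarray(offset, offset + chunkSize));
  }
  chunks.push(new Uint8Array(0)); // zero-length terminator
  return chunks;
}
```

<p>For example, a 600 KB PNG split at 256 KB yields two full chunks, one 88 KB remainder, and the empty terminator.</p>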

<h3 id="linux-wayland-clipboard-protocol">Linux Wayland Clipboard Protocol</h3>

<p>Wayland modernizes Linux display architecture by entirely removing the X Server and substituting a secure compositor protocol. Like X11, Wayland lacks a global memory buffer; it is a pure peer-to-peer IPC mechanism mediated by the compositor.</p>

<p>When an image is copied, Chromium’s Ozone/Wayland backend creates a <code class="language-plaintext highlighter-rouge">wl_data_source</code> and calls <code class="language-plaintext highlighter-rouge">wl_data_source_offer</code>, indicating to the compositor that it possesses <code class="language-plaintext highlighter-rouge">image/png</code>. The browser then calls <code class="language-plaintext highlighter-rouge">wl_data_device_set_selection</code> to assert ownership.</p>

<p>When pasting, the receiving application asks for the data by sending a <code class="language-plaintext highlighter-rouge">wl_data_offer.receive</code> request to the compositor, specifying the MIME type and passing a file descriptor (<code class="language-plaintext highlighter-rouge">fd</code>), which is typically one end of a UNIX pipe. The compositor forwards this pipe to the copying browser via a <code class="language-plaintext highlighter-rouge">wl_data_source.send</code> event. The copying browser then writes the raw PNG binary data directly into the file descriptor and closes it.</p>

<div class="language-c highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c1">// Architectural pseudocode: the copying side's handler for the wl_data_source.send event.</span>
<span class="c1">// png_binary_data and png_size stand in for the browser's already-encoded image.</span>
<span class="k">static</span> <span class="kt">void</span> <span class="nf">data_source_send</span><span class="p">(</span><span class="kt">void</span> <span class="o">*</span><span class="n">data</span><span class="p">,</span>
                             <span class="k">struct</span> <span class="n">wl_data_source</span> <span class="o">*</span><span class="n">source</span><span class="p">,</span>
                             <span class="k">const</span> <span class="kt">char</span> <span class="o">*</span><span class="n">mime_type</span><span class="p">,</span>
                             <span class="kt">int32_t</span> <span class="n">fd</span><span class="p">)</span> <span class="p">{</span>
    <span class="c1">// The copying browser receives the request and writes PNG bytes into 'fd'</span>
    <span class="c1">// (a real handler would loop until every byte has been written)</span>
    <span class="n">write</span><span class="p">(</span><span class="n">fd</span><span class="p">,</span> <span class="n">png_binary_data</span><span class="p">,</span> <span class="n">png_size</span><span class="p">);</span>
    <span class="c1">// Closing the file descriptor signals EOF to the receiving application</span>
    <span class="n">close</span><span class="p">(</span><span class="n">fd</span><span class="p">);</span>
<span class="p">}</span>
</code></pre></div></div>

<p>This file-descriptor-passing model provides excellent performance and security, as massive binary blobs are streamed directly through kernel pipes without passing through a middleman server, avoiding memory duplication.</p>

<h2 id="pasting-into-website-b-reverse-flow">Pasting into Website B (Reverse Flow)</h2>

<p>When the user navigates to Website B and presses <code class="language-plaintext highlighter-rouge">Ctrl+V</code> (or <code class="language-plaintext highlighter-rouge">Cmd+V</code>), the flow reverses, but introduces significant security checkpoints, sanitization requirements, and <code class="language-plaintext highlighter-rouge">DOM</code> API layers.</p>

<h3 id="gating-and-security-checks">Gating and Security Checks</h3>

<p>Pasting is an inherently dangerous operation. A malicious website could silently read the user’s clipboard, stealing passwords or personally identifiable information (<code class="language-plaintext highlighter-rouge">PII</code>) copied from external applications. Consequently, browsers gate clipboard reads behind <span class="define" data-term="transient user activation" data-definition="A browser security concept where certain privileged APIs (clipboard, fullscreen, popups) are only available for a brief window after a genuine user interaction like a click or keypress. This prevents scripts from silently invoking sensitive operations without user intent.">“transient user activation”</span>, meaning a recent interaction such as a physical click or keypress.
If the site attempts to read the clipboard programmatically via the Async Clipboard API (<code class="language-plaintext highlighter-rouge">navigator.clipboard.read()</code>), the browser consults the Permissions API. If the clipboard-read permission has not been explicitly granted, the Promise returned by the call stays pending while the browser displays a native permission prompt to the user.</p>

<h3 id="receiving-the-paste-event-and-os-ipc">Receiving the Paste Event and OS IPC</h3>

<p>Once authorized, the Browser process requests data from the OS clipboard. On Windows, it calls <code class="language-plaintext highlighter-rouge">GetClipboardData</code> for formats like <code class="language-plaintext highlighter-rouge">CF_DIBV5</code> or <code class="language-plaintext highlighter-rouge">PNG</code>. On macOS, it requests data from the <code class="language-plaintext highlighter-rouge">NSPasteboard</code>. On Wayland, it provides a pipe file descriptor to <code class="language-plaintext highlighter-rouge">wl_data_offer_receive</code> and reads the incoming stream.</p>

<p>Before this data is allowed back into the sandboxed Renderer process of Website B, it must be aggressively sanitized. An OS clipboard could contain a malformed image crafted to exploit vulnerabilities in libraries like <code class="language-plaintext highlighter-rouge">libpng</code> or <code class="language-plaintext highlighter-rouge">libjpeg</code>. Furthermore, an image might contain hidden EXIF metadata, such as GPS coordinates, representing a massive privacy violation if unknowingly pasted into a web form.</p>

<p>To mitigate this, the Browser process passes the raw binary blob to a sandboxed utility process. Here, the image is decoded back into an uncompressed bitmap, strictly discarding any metadata, ICC profiles, or malformed chunks. It is then securely re-encoded back into a clean PNG. This sanitized payload is passed via <code class="language-plaintext highlighter-rouge">Mojo BigBuffer</code> to Website B’s Renderer process.</p>
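
<p>A full decode and re-encode is too large for a short example, so the TypeScript sketch below swaps in a deliberately simpler technique: walking the PNG chunk list and dropping every chunk that is not needed to render the pixels. It illustrates only the “discard metadata” half of sanitization and does nothing about malformed pixel data, which is exactly why browsers re-encode instead. The function name and the list of kept chunk types are assumptions for this example.</p>

```typescript
// The fixed 8-byte signature that begins every PNG file.
const PNG_SIGNATURE = [0x89, 0x50, 0x4e, 0x47, 0x0d, 0x0a, 0x1a, 0x0a];

// Chunk types this sketch keeps: the critical chunks plus tRNS (transparency).
const KEPT_CHUNK_TYPES = ["IHDR", "PLTE", "IDAT", "IEND", "tRNS"];

// Copies a PNG while dropping every chunk whose type is not in KEPT_CHUNK_TYPES.
// Kept chunks are copied byte-for-byte, so their stored CRCs remain valid.
function stripAncillaryChunks(png: Uint8Array): Uint8Array {
  for (let i = 0; i < PNG_SIGNATURE.length; i++) {
    if (png[i] !== PNG_SIGNATURE[i]) throw new Error("not a PNG stream");
  }
  const view = new DataView(png.buffer, png.byteOffset, png.byteLength);
  const kept: Uint8Array[] = [png.subarray(0, 8)];
  let offset = 8;
  while (offset + 12 <= png.length) {
    // Chunk layout: 4-byte big-endian length, 4-byte type, data, 4-byte CRC.
    const dataLength = view.getUint32(offset);
    const type = String.fromCharCode(
      png[offset + 4], png[offset + 5], png[offset + 6], png[offset + 7]);
    const end = offset + 12 + dataLength;
    if (end > png.length) throw new Error("truncated chunk");
    if (KEPT_CHUNK_TYPES.indexOf(type) !== -1) kept.push(png.subarray(offset, end));
    offset = end;
    if (type === "IEND") break; // ignore anything after the end marker
  }
  // Concatenate the kept pieces into one output buffer.
  let total = 0;
  for (const part of kept) total += part.length;
  const out = new Uint8Array(total);
  let cursor = 0;
  for (const part of kept) {
    out.set(part, cursor);
    cursor += part.length;
  }
  return out;
}
```

<p>Text and EXIF chunks (<code class="language-plaintext highlighter-rouge">tEXt</code>, <code class="language-plaintext highlighter-rouge">eXIf</code>) and embedded color profiles (<code class="language-plaintext highlighter-rouge">iCCP</code>) fall outside the kept list and are dropped.</p>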

<h3 id="dom-paste-event-flow-and-datatransfer">DOM Paste Event Flow and DataTransfer</h3>

<p>Inside the Renderer, the JavaScript engine fires a paste event on the active DOM element. The event object (<code class="language-plaintext highlighter-rouge">ClipboardEvent</code>) exposes a <code class="language-plaintext highlighter-rouge">clipboardData</code> property of type <code class="language-plaintext highlighter-rouge">DataTransfer</code>. The engine parses the multiple MIME types provided by the OS and exposes them via the <code class="language-plaintext highlighter-rouge">event.clipboardData.items</code> list. This <code class="language-plaintext highlighter-rouge">DataTransfer</code> infrastructure is heavily shared with the HTML5 Drag-and-Drop API, utilizing identical underlying C++ data objects to represent the transferring payload.</p>

<p>Because reading heavy binary blobs synchronously would freeze the browser’s main thread, the <code class="language-plaintext highlighter-rouge">DataTransfer</code> object utilizes delayed materialization. When a developer loops through <code class="language-plaintext highlighter-rouge">clipboardData.items</code> and calls <code class="language-plaintext highlighter-rouge">item.getAsFile()</code>, the browser instantiates a JavaScript <code class="language-plaintext highlighter-rouge">File</code> (a subclass of <code class="language-plaintext highlighter-rouge">Blob</code>). The backing memory for this <code class="language-plaintext highlighter-rouge">Blob</code> is a pointer to the shared memory or cached byte array established during the IPC phase.</p>
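
<p>On the page side, the loop over <code class="language-plaintext highlighter-rouge">clipboardData.items</code> can be sketched as follows. The helper name <code class="language-plaintext highlighter-rouge">imageFilesFromPaste</code> and the small structural interfaces are assumptions for this example; they mirror the parts of <code class="language-plaintext highlighter-rouge">DataTransferItem</code> and <code class="language-plaintext highlighter-rouge">File</code> that the loop touches, so the function does not depend on DOM typings.</p>

```typescript
// Minimal structural types covering only what the loop needs.
interface FileLike {
  name: string;
  type: string;
  size: number;
}
interface TransferItemLike {
  kind: string;
  type: string;
  getAsFile(): FileLike | null;
}

// Collects every pasted item that is a file with an image MIME type.
function imageFilesFromPaste(items: ArrayLike<TransferItemLike>): FileLike[] {
  const files: FileLike[] = [];
  for (let i = 0; i < items.length; i++) {
    const item = items[i];
    if (item.kind === "file" && item.type.indexOf("image/") === 0) {
      const file = item.getAsFile(); // may be null if the item cannot be materialized
      if (file !== null) {
        files.push(file);
      }
    }
  }
  return files;
}
```

<p>In a real page this would typically be called from a <code class="language-plaintext highlighter-rouge">paste</code> event listener, passing <code class="language-plaintext highlighter-rouge">event.clipboardData.items</code> after checking that <code class="language-plaintext highlighter-rouge">clipboardData</code> is not null.</p>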

<p>Different DOM elements handle the default paste behavior differently:</p>

<ul>
  <li><strong>contenteditable elements:</strong> The browser’s editing commands parse the incoming <code class="language-plaintext highlighter-rouge">text/html</code> payload from the clipboard. If an image is present, it generates an <code class="language-plaintext highlighter-rouge">&lt;img&gt;</code> tag and attempts to insert it into the DOM. If the image is a raw binary, it may be converted into a Base64 data URI.</li>
  <li><strong>textarea elements:</strong> These inputs accept only plain text. The browser aggressively filters the clipboard, stripping all HTML tags and ignoring binary image blobs, pasting only the fallback <code class="language-plaintext highlighter-rouge">text/plain</code> URL if available.</li>
  <li><strong><code class="language-plaintext highlighter-rouge">&lt;input type="file"&gt;</code> elements:</strong> The browser intercepts the paste event and populates the input’s <code class="language-plaintext highlighter-rouge">FileList</code> with the reconstructed <code class="language-plaintext highlighter-rouge">File</code> object, mimicking the behavior of a user manually selecting a file from the disk.</li>
</ul>

<h3 id="async-clipboard-api-vs-legacy-clipboard">Async Clipboard API vs. Legacy Clipboard</h3>

<p>The legacy <code class="language-plaintext highlighter-rouge">document.execCommand('paste')</code> and synchronous <code class="language-plaintext highlighter-rouge">ClipboardEvent</code> flow inherently block the main thread. To support modern, rich web applications, browsers have implemented the Async Clipboard API.</p>

<p>When <code class="language-plaintext highlighter-rouge">navigator.clipboard.read()</code> is called, it returns a <code class="language-plaintext highlighter-rouge">Promise</code>. The browser engine asynchronously queries the OS clipboard, performs the heavy decoding and sanitization off the main thread, and resolves the <code class="language-plaintext highlighter-rouge">Promise</code> with an array of <code class="language-plaintext highlighter-rouge">ClipboardItem</code> objects. The developer then calls <code class="language-plaintext highlighter-rouge">item.getType('image/png')</code>, which returns a secondary <code class="language-plaintext highlighter-rouge">Promise</code> resolving to the binary <code class="language-plaintext highlighter-rouge">Blob</code>. This completely asynchronous model allows the transfer of multi-megabyte images without degrading UI responsiveness or causing frame drops.</p>
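
<p>A minimal TypeScript sketch of that flow is shown below. The interfaces are structural stand-ins for <code class="language-plaintext highlighter-rouge">Clipboard</code>, <code class="language-plaintext highlighter-rouge">ClipboardItem</code>, and <code class="language-plaintext highlighter-rouge">Blob</code>, and the helper names and the preference list are assumptions for this example rather than part of any API.</p>

```typescript
// Structural stand-ins for the parts of the Async Clipboard API used here.
interface BlobLike {
  type: string;
  size: number;
}
interface ClipboardItemLike {
  types: readonly string[];
  getType(type: string): Promise<BlobLike>;
}
interface ClipboardLike {
  read(): Promise<ClipboardItemLike[]>;
}

// Preference order for this sketch: the first match wins.
const PREFERRED_IMAGE_TYPES = ["image/png", "image/jpeg"];

// Picks the most preferred image type that an item offers, if any.
function pickImageType(types: readonly string[]): string | undefined {
  for (const preferred of PREFERRED_IMAGE_TYPES) {
    if (types.indexOf(preferred) !== -1) {
      return preferred;
    }
  }
  return undefined;
}

// Resolves to the first image found on the clipboard, or null if there is none.
async function readFirstImage(clipboard: ClipboardLike): Promise<BlobLike | null> {
  const items = await clipboard.read();
  for (const item of items) {
    const type = pickImageType(item.types);
    if (type !== undefined) {
      return item.getType(type);
    }
  }
  return null;
}
```

<p>In a page, <code class="language-plaintext highlighter-rouge">readFirstImage(navigator.clipboard)</code> would be called from a click or keypress handler so that user activation is present, and the caller should be prepared for the Promise to reject if the user denies permission.</p>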

<h2 id="full-end-to-end-data-flow">Full End-to-End Data Flow</h2>

<p>The following sequence details the complete low-level trace from the initial render on Website A to the final DOM insertion on Website B.</p>

<p><img src="/assets/2026-03-07-copypaste/copypasteflow.png" alt="Full end-to-end image copy paste flow" /></p>

<table>
  <thead>
    <tr>
      <th>Phase</th>
      <th>Component</th>
      <th>Technical Action / Memory Transition</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <td>1. Trigger</td>
      <td>Website A (Renderer)</td>
      <td>User right-clicks and selects “Copy Image”. The browser intercepts the native OS menu command, bypassing JS listeners.</td>
    </tr>
    <tr>
      <td>2. Extraction</td>
      <td>Layout Engine (Blink/Gecko/WebKit)</td>
      <td>Decoded bitmap (<code class="language-plaintext highlighter-rouge">SkBitmap</code> or equivalent) is extracted from the render tree. If hardware-accelerated, a GPU-to-CPU readback occurs.</td>
    </tr>
    <tr>
      <td>3. Encoding</td>
      <td>Image Encoder</td>
      <td>The uncompressed bitmap is synchronously encoded into compressed PNG bytes. HTML and Text fallbacks are generated.</td>
    </tr>
    <tr>
      <td>4. IPC Send</td>
      <td>IPC Framework (Mojo)</td>
      <td>The Renderer allocates an anonymous shared memory segment, writes the PNG bytes, and sends a <code class="language-plaintext highlighter-rouge">BigBuffer</code> file descriptor to the Browser Process.</td>
    </tr>
    <tr>
      <td>5. OS Registration</td>
      <td>OS Clipboard API</td>
      <td>Browser Process maps the shared memory and registers the data with the OS. Windows: <code class="language-plaintext highlighter-rouge">GlobalAlloc</code> + <code class="language-plaintext highlighter-rouge">SetClipboardData</code>. macOS: <code class="language-plaintext highlighter-rouge">NSPasteboard</code> + <code class="language-plaintext highlighter-rouge">pbs</code>. Linux: Asserts <code class="language-plaintext highlighter-rouge">CLIPBOARD</code> ownership or Wayland <code class="language-plaintext highlighter-rouge">wl_data_device_set_selection</code>.</td>
    </tr>
    <tr>
      <td>Context Switch</td>
      <td>Operating System</td>
      <td>The user switches the active window or tab to Website B, transferring application focus.</td>
    </tr>
    <tr>
      <td>6. Trigger Paste</td>
      <td>Website B (Renderer)</td>
      <td>User presses <code class="language-plaintext highlighter-rouge">Ctrl+V</code>. The browser initiates a paste sequence, checking for transient user activation to authorize the action.</td>
    </tr>
    <tr>
      <td>7. OS Query</td>
      <td>OS Clipboard API</td>
      <td>Browser Process requests data. Windows/macOS: Reads memory handles/ports. Linux Wayland: Provides a UNIX pipe <code class="language-plaintext highlighter-rouge">fd</code> to <code class="language-plaintext highlighter-rouge">wl_data_offer_receive</code> and reads the streamed bytes.</td>
    </tr>
    <tr>
      <td>8. Sanitization</td>
      <td>Utility Process</td>
      <td>The raw OS binary is decoded into a pixel array, stripping EXIF data, ICC profiles, and malformed chunks to neutralize exploits, then re-encoded into a safe PNG.</td>
    </tr>
    <tr>
      <td>9. IPC Receive</td>
      <td>IPC Framework (Mojo)</td>
      <td>The Browser process sends the sanitized PNG via a new <code class="language-plaintext highlighter-rouge">BigBuffer</code> shared memory region to Website B’s Renderer.</td>
    </tr>
    <tr>
      <td>10. DOM Exposure</td>
      <td>JavaScript Engine (V8/SpiderMonkey)</td>
      <td>The Renderer constructs a <code class="language-plaintext highlighter-rouge">ClipboardEvent</code>. The <code class="language-plaintext highlighter-rouge">DataTransferItemList</code> is populated. The script invokes <code class="language-plaintext highlighter-rouge">getAsFile()</code>, generating a delayed-materialization JS <code class="language-plaintext highlighter-rouge">Blob</code>.</td>
    </tr>
    <tr>
      <td>11. Application</td>
      <td>Website B Logic</td>
      <td>The application reads the <code class="language-plaintext highlighter-rouge">Blob</code>, uploads it via <code class="language-plaintext highlighter-rouge">fetch()</code>, or displays it using <code class="language-plaintext highlighter-rouge">URL.createObjectURL()</code>.</td>
    </tr>
  </tbody>
</table>

<h2 id="cross-browser-architectural-differences">Cross-Browser Architectural Differences</h2>

<p>While the general copy-paste pipeline remains conceptually consistent, the internal mechanisms and data structures diverge significantly based on the browser engine architecture.</p>

<h3 id="chrome-blink">Chrome (Blink)</h3>

<p>Blink prioritizes multi-process security and performance. Its use of <code class="language-plaintext highlighter-rouge">Mojo BigBuffer</code> for memory transfers ensures that IPC bottlenecks are minimized, avoiding redundant memory copying. Chromium explicitly manages format prioritization on Windows, placing PNG ahead of <code class="language-plaintext highlighter-rouge">CF_DIBV5</code> to appease applications like Microsoft Word, which possess buggy <code class="language-plaintext highlighter-rouge">CF_DIBV5</code> decoders. Furthermore, Chrome leads the implementation of the Async Clipboard API and recently introduced the <code class="language-plaintext highlighter-rouge">unsanitized</code> read option, which lets a page request specific formats (currently <code class="language-plaintext highlighter-rouge">text/html</code>) without the usual sanitization step when absolute fidelity is required.</p>

<h3 id="firefox-gecko">Firefox (Gecko)</h3>

<p>Firefox’s architecture relies on the <code class="language-plaintext highlighter-rouge">nsIClipboard</code> interface. Data is bundled into an <code class="language-plaintext highlighter-rouge">nsITransferable</code> object, which manages various “flavors” (<code class="language-plaintext highlighter-rouge">MIME</code> types). A persistent architectural difference in Firefox is its handling of string encodings over <code class="language-plaintext highlighter-rouge">X11</code>, often utilizing <code class="language-plaintext highlighter-rouge">UTF-16</code>, which has historically caused translation issues with native Java applications expecting <code class="language-plaintext highlighter-rouge">UTF-8</code>. Furthermore, Firefox is highly aggressive in providing <code class="language-plaintext highlighter-rouge">CF_HDROP</code> (file drop) formats alongside standard image bitmaps, making pasted images appear as physical files to certain OS targets, which can improve compatibility with legacy file managers. Firefox also heavily utilizes <code class="language-plaintext highlighter-rouge">kSelectionClipboard</code> to support middle-click paste natively on Linux environments.</p>

<h3 id="safari-webkit">Safari (WebKit)</h3>

<p>WebKit’s pasteboard implementation (<code class="language-plaintext highlighter-rouge">Pasteboard.h</code> and <code class="language-plaintext highlighter-rouge">PlatformPasteboardIOS.mm</code>) is tightly integrated with Cocoa paradigms. It directly translates web types into Apple <code class="language-plaintext highlighter-rouge">UTIs</code>, such as mapping <code class="language-plaintext highlighter-rouge">image/png</code> to <code class="language-plaintext highlighter-rouge">public.png</code> and HTML to Apple Web Archive formats. Because Safari runs predominantly on macOS and iOS, it extensively utilizes <code class="language-plaintext highlighter-rouge">NSItemProvider</code> to handle promised data, interacting deeply with the pasteboard server (<code class="language-plaintext highlighter-rouge">pbs</code>). WebKit handles user activation differently than Blink, requiring developers to resolve <code class="language-plaintext highlighter-rouge">ClipboardItem</code> Promises within a very strict, synchronously triggered scope to prevent security exceptions, addressing specific iOS sandbox constraints.</p>

<h2 id="edge-cases-and-protocol-complexities">Edge Cases and Protocol Complexities</h2>

<p>The standard copy-paste flow is routinely complicated by edge cases involving web specifications, proprietary media types, and strict privacy boundaries.</p>

<h3 id="cross-origin-images-and-cors-implications">Cross-Origin Images and CORS Implications</h3>

<p>If Website A embeds an image from a different domain (e.g., <code class="language-plaintext highlighter-rouge">cdn.example.com</code>), the Same-Origin Policy prevents JavaScript from reading the pixels of that image. If a script draws a cross-origin image to an HTML5 <code class="language-plaintext highlighter-rouge">&lt;canvas&gt;</code>, the canvas becomes “tainted,” and calling <code class="language-plaintext highlighter-rouge">getImageData()</code>, <code class="language-plaintext highlighter-rouge">toDataURL()</code>, or <code class="language-plaintext highlighter-rouge">toBlob()</code> will throw a security exception unless the image was requested in CORS mode and the server responded with a suitable <code class="language-plaintext highlighter-rouge">Access-Control-Allow-Origin</code> (<code class="language-plaintext highlighter-rouge">CORS</code>) header.</p>

<p>However, the native “Copy Image” context menu is a trusted user action initiated outside of the <code class="language-plaintext highlighter-rouge">DOM</code>’s execution environment. The browser’s internal C++ handlers possess absolute access to the render tree’s memory and can successfully extract the <code class="language-plaintext highlighter-rouge">SkBitmap</code> and write it to the OS clipboard, bypassing <code class="language-plaintext highlighter-rouge">CORS</code> entirely. If Website A wishes to implement a custom “Copy” button using the Async Clipboard API, it must obey <code class="language-plaintext highlighter-rouge">CORS</code> and utilize <code class="language-plaintext highlighter-rouge">crossOrigin="Anonymous"</code> when fetching the image, or the operation will fail.</p>

<h3 id="copying-animated-gif-and-webp">Copying Animated GIF and WebP</h3>

<p>Animated formats present a severe limitation for OS clipboards. Binary formats like <code class="language-plaintext highlighter-rouge">CF_DIB</code> on Windows or <code class="language-plaintext highlighter-rouge">public.png</code> on macOS are fundamentally designed for static bitmaps. When a user copies an animated GIF via the context menu, the browser typically extracts the currently visible frame from the render tree, encodes it as a static PNG or Bitmap, and places it on the clipboard. Consequently, pasting the GIF into a chat application often results in a static, frozen image. To preserve animations, browsers attempt to write the HTML representation (<code class="language-plaintext highlighter-rouge">&lt;img src="...gif"&gt;</code>) or file paths (<code class="language-plaintext highlighter-rouge">CF_HDROP</code>), relying on the receiving application to parse the HTML or file reference rather than the raw bitmap.</p>

<h3 id="copying-svg-images">Copying SVG Images</h3>

<p>Scalable Vector Graphics (<code class="language-plaintext highlighter-rouge">SVG</code>) are mathematically defined paths rather than rasterized pixels. When “Copy Image” is invoked on an <code class="language-plaintext highlighter-rouge">&lt;svg&gt;</code> element, the browser cannot easily map it into a generic <code class="language-plaintext highlighter-rouge">CF_DIB</code>. Instead, the browser rasterizes the <code class="language-plaintext highlighter-rouge">SVG</code> to a target resolution, generating a standard PNG pixel buffer, and places that on the clipboard. Alternatively, the raw XML text of the <code class="language-plaintext highlighter-rouge">SVG</code> is placed into the <code class="language-plaintext highlighter-rouge">text/html</code> or <code class="language-plaintext highlighter-rouge">text/plain</code> slots, enabling vector editors like Adobe Illustrator to reconstruct the mathematical paths from the markup.</p>

<h3 id="private--incognito-mode-restrictions">Private / Incognito Mode Restrictions</h3>

<p>Browsers operate with extreme caution regarding clipboard data in private browsing modes. While data can be copied to the global OS clipboard (as it is the user’s explicit intent), caching the intermediate chunks on disk is strictly prohibited. For massive clipboard transfers (like macOS file promises or Linux Wayland pipe spools) that might ordinarily spill to the filesystem to save memory, the browser must force everything to remain in anonymous volatile memory to ensure no forensic traces survive process termination.</p>

<h2 id="final-thoughts">Final Thoughts</h2>

<p>Most of the time we never notice any of this, and that’s kind of the point. Modern browsers are designed so that these complexities disappear behind simple user interactions.</p>

<p>Not bad for something we do dozens of times a day.</p>]]></content><author><name>Sathwick</name><email>sathwick.p7@gmail.com</email></author><category term="architecture" /><summary type="html"><![CDATA[A deep dive into what actually happens when you copy an image from one website and paste it into another.]]></summary></entry><entry><title type="html">How I Built a PostgreSQL SSO Proxy from Scratch</title><link href="https://sathwick.xyz/blog/postgres-sso.html" rel="alternate" type="text/html" title="How I Built a PostgreSQL SSO Proxy from Scratch" /><published>2026-02-21T00:00:00+00:00</published><updated>2026-02-21T00:00:00+00:00</updated><id>https://sathwick.xyz/blog/postgres-sso</id><content type="html" xml:base="https://sathwick.xyz/blog/postgres-sso.html"><![CDATA[<p>A company where developers and product managers are required to be given access to the production database to edit rows sounds like a compliance nightmare. It was, but that wasn’t the problem I was looking to solve. I wanted to solve the issue of how we used to provide this access to people: we gave them a password for a single database user that everyone used, including the microservices themselves, and that user had a permission set of <code class="language-plaintext highlighter-rouge">GRANT ALL</code> on the entire database.</p>

<p>You could argue that each user could have an individual database user created and be handed the password to that, which would solve the issue of audit logging (who did what on the database), but it was just a management nightmare for us as the infra and security team. Hence we were looking for JIT tools like <code class="language-plaintext highlighter-rouge">strongDM</code> or <code class="language-plaintext highlighter-rouge">Teleport</code> which would solve these issues, but the cost of acquiring such a tool at our scale was going to be at least 100k USD per annum, which was not something I was comfortable asking my CTO to spend.</p>

<p>That’s when the idea of building an RDS proxy integrated with SSO came into the picture. This would allow us to give access to the databases via corporate email addresses only, with detailed audit logging of the queries run on the database.</p>

<p>This blog goes into detail on implementing this proxy in <code class="language-plaintext highlighter-rouge">Go</code> from scratch and maybe helps you understand the fundamentals of PostgreSQL and how it works.</p>

<p>The basic features I was aiming to build for my proxy were:</p>

<ol>
  <li>Connection pooling</li>
  <li>SSL/TLS support</li>
  <li>SSO auth with Azure AD via Auth0</li>
  <li>Auditing and observability</li>
</ol>

<p>The first step in building a proxy is to expose it as a server on a particular port (<code class="language-plaintext highlighter-rouge">7777</code>) actively listening for client connections. Once a connection is made, it is passed on to a goroutine to be processed.</p>

<p>Once the connection is made, the proxy then has to establish a connection to the actual PostgreSQL database. To understand how this happens exactly, we need to understand the communication protocol used by PostgreSQL.</p>

<h2 id="postgresql-wire-protocol-frontendbackend-protocol">PostgreSQL wire protocol (Frontend/Backend Protocol)</h2>

<p>PostgreSQL uses a message-based protocol for communication between frontends and backends (clients and servers). The protocol runs over TCP/IP as well as Unix-domain sockets; port <code class="language-plaintext highlighter-rouge">5432</code> is the IANA-registered default for TCP, although a server can be configured to listen on any port.</p>

<p>In order to serve multiple clients efficiently, the server launches a new “backend” process for each client. In the current implementation, a new child process is created immediately after an incoming connection is detected. This is transparent to the protocol, however. For purposes of the protocol, the terms “backend” and “server” are interchangeable; likewise “frontend” and “client” are interchangeable.</p>

<p>Every PostgreSQL message has this format:</p>

<div class="language-text highlighter-rouge"><div class="highlight"><pre class="highlight"><code>[Message Type (1 byte)] [Length (4 bytes)] [Message Body (Length-4 bytes)]
</code></pre></div></div>

<ul>
  <li><strong>Message type</strong>: single character identifying the message</li>
  <li><strong>Length</strong>: 32-bit big-endian integer that counts itself and the body, but not the type byte</li>
  <li><strong>Message body</strong>: the actual data</li>
</ul>
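
<p>As a rough illustration of that framing, here is a minimal sketch of reading one regular (post-startup) message from a stream. <code class="language-plaintext highlighter-rouge">readMessage</code> is a hypothetical helper written for this explanation, not something from the proxy or from <code class="language-plaintext highlighter-rouge">pgproto3</code>; a production reader would also cap the length before allocating.</p>

<div class="language-go highlighter-rouge"><div class="highlight"><pre class="highlight"><code>package main

import (
	"bytes"
	"encoding/binary"
	"fmt"
	"io"
)

// readMessage reads one regular (post-startup) protocol message.
// The length field counts itself (4 bytes) but not the type byte.
func readMessage(r io.Reader) (byte, []byte, error) {
	var header [5]byte
	if _, err := io.ReadFull(r, header[:]); err != nil {
		return 0, nil, err
	}
	length := binary.BigEndian.Uint32(header[1:5])
	if length &lt; 4 {
		return 0, nil, fmt.Errorf("invalid message length %d", length)
	}
	body := make([]byte, length-4)
	if _, err := io.ReadFull(r, body); err != nil {
		return 0, nil, err
	}
	return header[0], body, nil
}

func main() {
	// A simple Query ('Q') message carrying "SELECT 1" plus a NUL terminator:
	// 4 bytes of length + 9 bytes of body = 13.
	raw := []byte{'Q', 0, 0, 0, 13}
	raw = append(raw, "SELECT 1\x00"...)

	msgType, body, err := readMessage(bytes.NewReader(raw))
	if err != nil {
		panic(err)
	}
	fmt.Printf("type=%c body=%q\n", msgType, body)
}
</code></pre></div></div>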

<p>At the proxy level, the messages are referred to like this:</p>

<ul>
  <li><strong>Frontend messages</strong> — sent by the client; the proxy intercepts these and sends them to the DB</li>
  <li><strong>Backend messages</strong> — sent by the server; the proxy intercepts these from the DB and sends them to the user</li>
</ul>

<p>In our implementation, we use <a href="https://github.com/jackc/pgproto3"><code class="language-plaintext highlighter-rouge">pgproto3</code></a>, which is the encoder and decoder of the PostgreSQL wire protocol version 3.</p>

<h3 id="startup-message">Startup message</h3>

<p>The Startup message in the PostgreSQL wire protocol is the very first message sent when a PostgreSQL connection is established, and it is special enough that it has to be handled separately because:</p>

<ol>
  <li>It has no message type.</li>
  <li>It’s always the first message in any PostgreSQL connection.</li>
  <li>It contains connection parameters such as username and database name.</li>
</ol>

<p>Flow:</p>

<div class="language-text highlighter-rouge"><div class="highlight"><pre class="highlight"><code>Client -&gt; StartupMessage -&gt; Proxy -&gt; StartupMessage -&gt; Database
</code></pre></div></div>

<p>Now that we have an understanding of the protocol, we can move ahead with the message flow. Once the connection is made to the proxy, a new <code class="language-plaintext highlighter-rouge">pgproto3.Backend</code> wrapping the raw TCP connection is created and the first call is <code class="language-plaintext highlighter-rouge">pgconn.ReceiveStartupMessage()</code>.</p>

<p>The PostgreSQL startup message, as mentioned above, has no message type byte. Its format is <code class="language-plaintext highlighter-rouge">[length:4 bytes][protocol_version:4 bytes][parameters]</code>. <code class="language-plaintext highlighter-rouge">pgproto3</code> handles this by reading the 4-byte length, then checking whether the next 4 bytes match the SSL request magic number (<code class="language-plaintext highlighter-rouge">80877103</code>), the cancel request magic (<code class="language-plaintext highlighter-rouge">80877102</code>), or a protocol version, and returning the appropriate concrete type.</p>
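
<p>To make the dispatch concrete, here is a minimal sketch of classifying a startup-phase packet by the 4-byte code that follows the length. The function and constant names are my own for illustration; <code class="language-plaintext highlighter-rouge">pgproto3</code> does the equivalent internally and returns typed messages rather than strings.</p>

<div class="language-go highlighter-rouge"><div class="highlight"><pre class="highlight"><code>package main

import (
	"encoding/binary"
	"errors"
	"fmt"
)

const (
	sslRequestCode    = 80877103 // 1234&lt;&lt;16 | 5679
	cancelRequestCode = 80877102 // 1234&lt;&lt;16 | 5678
	protocolV3        = 196608   // 3&lt;&lt;16 | 0, i.e. protocol version 3.0
)

// classifyStartup inspects a startup-phase packet: a 4-byte length
// followed by a 4-byte code, with no type byte in front.
func classifyStartup(packet []byte) (string, error) {
	if len(packet) &lt; 8 {
		return "", errors.New("startup packet too short")
	}
	code := binary.BigEndian.Uint32(packet[4:8])
	switch code {
	case sslRequestCode:
		return "SSLRequest", nil
	case cancelRequestCode:
		return "CancelRequest", nil
	case protocolV3:
		return "StartupMessage", nil
	default:
		return "", fmt.Errorf("unsupported startup code %d", code)
	}
}

func main() {
	// An SSLRequest is always 8 bytes: length 8, then the magic code.
	pkt := make([]byte, 8)
	binary.BigEndian.PutUint32(pkt[0:4], 8)
	binary.BigEndian.PutUint32(pkt[4:8], sslRequestCode)
	fmt.Println(classifyStartup(pkt))
}
</code></pre></div></div>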

<p>There are three possible message types at this point, each handled by its own branch.</p>

<h4 id="case-a-sslrequest-pgproto3sslrequest">Case A: <code class="language-plaintext highlighter-rouge">SSLRequest</code> (<code class="language-plaintext highlighter-rouge">*pgproto3.SSLRequest</code>)</h4>

<p>The client sends this request before the real startup message when it wants to establish TLS. The proxy must respond with a single byte before the client proceeds.</p>

<ol>
  <li>
    <p>If TLS is not configured: the proxy sends the single byte <code class="language-plaintext highlighter-rouge">N</code> (ASCII 78), indicating to the client that TLS is unavailable, and the client immediately sends the real <code class="language-plaintext highlighter-rouge">StartupMessage</code> on the same plaintext connection.</p>
  </li>
  <li>
    <p>If TLS is configured: the proxy sends <code class="language-plaintext highlighter-rouge">S</code> (ASCII 83), indicating TLS is accepted, wraps the raw <code class="language-plaintext highlighter-rouge">net.Conn</code> in a <code class="language-plaintext highlighter-rouge">tls.Conn</code> using the server’s certificate, and performs the TLS handshake.</p>
  </li>
</ol>

<p>Once done, <code class="language-plaintext highlighter-rouge">pc.conn</code> is replaced with the TLS connection and a new <code class="language-plaintext highlighter-rouge">pgproto3.Backend</code> connection is built over the TLS connection, and everything subsequent (password and queries) is encrypted.</p>

<p>Once one of the above is completed successfully, the <code class="language-plaintext highlighter-rouge">StartupMessage</code> is sent by the client to the proxy.</p>
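
<p>The two branches above can be sketched roughly as follows. This is an illustrative helper under my own naming (<code class="language-plaintext highlighter-rouge">answerSSLRequest</code>), assuming the caller passes a <code class="language-plaintext highlighter-rouge">nil</code> config when TLS is not set up; it is not the proxy’s actual code. The <code class="language-plaintext highlighter-rouge">main</code> function only exercises the plaintext branch, using an in-memory pipe in place of a real socket.</p>

<div class="language-go highlighter-rouge"><div class="highlight"><pre class="highlight"><code>package main

import (
	"crypto/tls"
	"fmt"
	"net"
)

// answerSSLRequest replies to an SSLRequest with a single byte and,
// when a TLS config is available, upgrades the connection in place.
func answerSSLRequest(conn net.Conn, cfg *tls.Config) (net.Conn, error) {
	if cfg == nil {
		// 'N': TLS unavailable, the client continues in plaintext.
		_, err := conn.Write([]byte{'N'})
		return conn, err
	}
	// 'S': TLS accepted, the client starts its handshake next.
	if _, err := conn.Write([]byte{'S'}); err != nil {
		return nil, err
	}
	tlsConn := tls.Server(conn, cfg)
	if err := tlsConn.Handshake(); err != nil {
		return nil, fmt.Errorf("TLS handshake failed: %w", err)
	}
	// Every later read and write goes through the encrypted connection.
	return tlsConn, nil
}

func main() {
	client, server := net.Pipe()
	defer client.Close()
	defer server.Close()

	// No TLS config: the proxy side should answer with 'N'.
	go answerSSLRequest(server, nil)

	reply := make([]byte, 1)
	if _, err := client.Read(reply); err != nil {
		panic(err)
	}
	fmt.Printf("proxy replied %q\n", reply[0])
}
</code></pre></div></div>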

<h4 id="case-b-startupmessage-pgproto3startupmessage">Case B: <code class="language-plaintext highlighter-rouge">StartupMessage</code> (<code class="language-plaintext highlighter-rouge">*pgproto3.StartupMessage</code>)</h4>

<p>The <code class="language-plaintext highlighter-rouge">StartupMessage</code> contains <code class="language-plaintext highlighter-rouge">user</code>, <code class="language-plaintext highlighter-rouge">database</code>, <code class="language-plaintext highlighter-rouge">application_name</code>, <code class="language-plaintext highlighter-rouge">client_encoding</code>, and any other parameters the client sends.</p>

<p>Once the proxy receives this, it doesn’t send it to the PostgreSQL server. It instead sends back an <code class="language-plaintext highlighter-rouge">AuthenticationCleartextPassword</code>, just like how the PostgreSQL server would.</p>

<h4 id="case-c-cancelrequest-pgproto3cancelrequest">Case C: <code class="language-plaintext highlighter-rouge">CancelRequest</code> (<code class="language-plaintext highlighter-rouge">*pgproto3.CancelRequest</code>)</h4>

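<p>Cancel requests, if enabled, arrive on entirely separate TCP connections. When the user presses <code class="language-plaintext highlighter-rouge">Ctrl+C</code> (which delivers <code class="language-plaintext highlighter-rouge">SIGINT</code>; a <code class="language-plaintext highlighter-rouge">SIGKILL</code> cannot be caught by the client), the client opens a new connection to the proxy’s port and immediately sends a 16-byte cancel message without any SSL negotiation.</p>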

<p>Structure:</p>

<div class="language-text highlighter-rouge"><div class="highlight"><pre class="highlight"><code>Byte offset   Size     Value              Meaning
----------------------------------------------------------
0 - 3         4 bytes  0x00000010 (16)    Total message length
4 - 7         4 bytes  0x04D2162E         Cancel magic number (80877102)
8 - 11        4 bytes  &lt;ProcessID&gt;        The backend PID to cancel
12 - 15       4 bytes  &lt;SecretKey&gt;        The secret key for that PID
</code></pre></div></div>

<p>Inside the proxy, this message must be assembled manually each time, like this:</p>

<div class="language-go highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">buf</span> <span class="o">:=</span> <span class="nb">make</span><span class="p">([]</span><span class="kt">byte</span><span class="p">,</span> <span class="m">16</span><span class="p">)</span>

<span class="c">// Message length - 16 bytes</span>
<span class="n">binary</span><span class="o">.</span><span class="n">BigEndian</span><span class="o">.</span><span class="n">PutUint32</span><span class="p">(</span><span class="n">buf</span><span class="p">[</span><span class="m">0</span><span class="o">:</span><span class="m">4</span><span class="p">],</span> <span class="m">16</span><span class="p">)</span>

<span class="c">// Cancel request code - 80877102</span>
<span class="n">binary</span><span class="o">.</span><span class="n">BigEndian</span><span class="o">.</span><span class="n">PutUint32</span><span class="p">(</span><span class="n">buf</span><span class="p">[</span><span class="m">4</span><span class="o">:</span><span class="m">8</span><span class="p">],</span> <span class="m">80877102</span><span class="p">)</span>

<span class="c">// Process ID</span>
<span class="n">binary</span><span class="o">.</span><span class="n">BigEndian</span><span class="o">.</span><span class="n">PutUint32</span><span class="p">(</span><span class="n">buf</span><span class="p">[</span><span class="m">8</span><span class="o">:</span><span class="m">12</span><span class="p">],</span> <span class="n">cancel</span><span class="o">.</span><span class="n">ProcessID</span><span class="p">)</span>

<span class="c">// Secret key</span>
<span class="n">binary</span><span class="o">.</span><span class="n">BigEndian</span><span class="o">.</span><span class="n">PutUint32</span><span class="p">(</span><span class="n">buf</span><span class="p">[</span><span class="m">12</span><span class="o">:</span><span class="m">16</span><span class="p">],</span> <span class="n">cancel</span><span class="o">.</span><span class="n">SecretKey</span><span class="p">)</span>
</code></pre></div></div>

<p><code class="language-plaintext highlighter-rouge">ProcessID</code> is the PID of the backend process that PostgreSQL forked to handle the connection.</p>

<p><code class="language-plaintext highlighter-rouge">SecretKey</code> is generated by PostgreSQL within the backend process when a new client connection is established. When PostgreSQL forks a dedicated backend process to handle the connection, it creates a random 32-bit integer and associates it with that process’s PID.</p>

<p>The <code class="language-plaintext highlighter-rouge">SecretKey</code> exists only for the purpose of query cancellation.</p>

<p>PostgreSQL then sends both the PID and the <code class="language-plaintext highlighter-rouge">SecretKey</code> to the connected client in a <code class="language-plaintext highlighter-rouge">BackendKeyData</code> message:</p>

<div class="language-text highlighter-rouge"><div class="highlight"><pre class="highlight"><code>┌──────────────┬──────────────┬──────────────┬──────────────┐
│ 'K' (1 byte) │   Length     │  ProcessID   │  SecretKey   │
│  type byte   │  (4 bytes)   │  (4 bytes)   │  (4 bytes)   │
└──────────────┴──────────────┴──────────────┴──────────────┘
</code></pre></div></div>

<p>Based on the <code class="language-plaintext highlighter-rouge">ProcessID</code> and <code class="language-plaintext highlighter-rouge">SecretKey</code>, the proxy must now identify the active connection and cancel it.</p>

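<p>In my proxy, we store the <code class="language-plaintext highlighter-rouge">activeConnections</code> in a map keyed by <code class="language-plaintext highlighter-rouge">uint64(ProcessID)&lt;&lt;32 | uint64(SecretKey)</code>, which packs both 32-bit values into a single efficient map key. Each value is widened to 64 bits before the shift, because shifting a 32-bit <code class="language-plaintext highlighter-rouge">ProcessID</code> left by 32 bits would discard it.</p>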
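
<p>A small sketch of that key packing, with a toy registry standing in for the real map. The <code class="language-plaintext highlighter-rouge">registry</code> type and its string values are assumptions for illustration only; the real map would hold live connection objects.</p>

<div class="language-go highlighter-rouge"><div class="highlight"><pre class="highlight"><code>package main

import (
	"fmt"
	"sync"
)

// connKey packs the backend PID and secret key into one map key.
// Both halves are widened to uint64 before shifting; shifting a
// uint32 left by 32 would discard the PID entirely.
func connKey(processID, secretKey uint32) uint64 {
	return uint64(processID)&lt;&lt;32 | uint64(secretKey)
}

// registry is a hypothetical stand-in for the proxy's activeConnections map.
type registry struct {
	mu    sync.RWMutex
	conns map[uint64]string // the value would be the live connection in practice
}

// lookup returns the session registered for this PID/secret pair, if any.
func (r *registry) lookup(pid, key uint32) (string, bool) {
	r.mu.RLock()
	defer r.mu.RUnlock()
	c, ok := r.conns[connKey(pid, key)]
	return c, ok
}

func main() {
	r := &amp;registry{conns: map[uint64]string{}}
	r.conns[connKey(4242, 0xDEADBEEF)] = "session for alice@example.com"

	fmt.Println(r.lookup(4242, 0xDEADBEEF)) // matching PID and secret key
	fmt.Println(r.lookup(4242, 0x1))        // right PID, wrong secret key
}
</code></pre></div></div>

<p>Requiring both halves to match means a cancel request with a guessed PID but the wrong secret simply finds nothing in the map.</p>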

<p>If the connection is found to be active, the cancel request flow is called, which initiates a new TCP connection to the database, not from the connection pool. The raw 16-byte binary cancel message is sent and the connection is closed immediately.</p>

<p>PostgreSQL receives this, validates the PID/<code class="language-plaintext highlighter-rouge">SecretKey</code> against its own backend process table, and sends <code class="language-plaintext highlighter-rouge">SIGINT</code> to the matching backend process, which aborts the in-flight query and returns an <code class="language-plaintext highlighter-rouge">ErrorResponse</code> with code <code class="language-plaintext highlighter-rouge">57014</code> (<code class="language-plaintext highlighter-rouge">query_canceled</code>) to the client via the existing pooled connection.</p>

<h2 id="authentication-inside-the-proxy">Authentication inside the proxy</h2>

<p>Now that the proxy has received the <code class="language-plaintext highlighter-rouge">StartupMessage</code>, it’s time for the user authentication part of the proxy. I split it into 3 sequential phases:</p>

<ol>
  <li><strong>Open a temporary backend connection</strong> - A raw TCP connection to PostgreSQL, not from the pool, is made with the sole purpose of authentication.</li>
  <li>
    <p><strong>Request a password from the client</strong> - The proxy sends <code class="language-plaintext highlighter-rouge">AuthenticationCleartextPassword</code> to the client. From the client’s perspective, the proxy is behaving like a PostgreSQL server requesting a password. The client sends back a <code class="language-plaintext highlighter-rouge">PasswordMessage</code> containing whatever was in <code class="language-plaintext highlighter-rouge">PGPASSWORD</code> or whatever was entered interactively.</p>

    <p>PostgreSQL generally uses SCRAM-SHA-256 or MD5, but the proxy here always asks the client for cleartext.</p>
  </li>
  <li><strong>Determine the authentication flow</strong> - The reason for getting the password as cleartext is for the proxy to inspect it and decide whether the password sent is a JWT token or a normal password.</li>
</ol>

<p><strong>JWT detection</strong>: check for the <code class="language-plaintext highlighter-rouge">eyJ</code> prefix (base64url encoding of <code class="language-plaintext highlighter-rouge">{"</code>) and exactly 2 dots (the three-part JWT structure <code class="language-plaintext highlighter-rouge">header.payload.signature</code>). It is a simple heuristic, but it holds for the compact signed tokens Auth0 issues. It only decides which flow to take; a token that passes is still fully validated afterwards.</p>
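
<p>In Go, the heuristic fits in one line. The function name <code class="language-plaintext highlighter-rouge">looksLikeJWT</code> is my own for this sketch, and the sample token is a made-up, unsigned placeholder.</p>

<div class="language-go highlighter-rouge"><div class="highlight"><pre class="highlight"><code>package main

import (
	"fmt"
	"strings"
)

// looksLikeJWT applies the heuristic: an "eyJ" prefix and exactly two dots.
// It only guesses the credential type; real validation happens later.
func looksLikeJWT(password string) bool {
	return strings.HasPrefix(password, "eyJ") &amp;&amp; strings.Count(password, ".") == 2
}

func main() {
	// Three dot-separated parts starting with "eyJ": routed to the JWT flow.
	fmt.Println(looksLikeJWT("eyJhbGciOiJSUzI1NiJ9.eyJzdWIiOiJ4In0.c2ln"))
	// An ordinary password: routed to the traditional flow.
	fmt.Println(looksLikeJWT("correct horse battery staple"))
}
</code></pre></div></div>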

<h3 id="traditional-password-flow">Traditional password flow</h3>

<p>The username the client sent is used along with the password. It’s basically a fallback for when someone configures a real PostgreSQL user in the proxy and connects with a traditional password.</p>

<h3 id="inside-jwt-validation">Inside JWT validation</h3>

<ol>
  <li><strong><code class="language-plaintext highlighter-rouge">jwt.Parse</code></strong> is called with a key function. The key function:
    <ul>
      <li>Checks <code class="language-plaintext highlighter-rouge">t.Method.Alg()</code> == <code class="language-plaintext highlighter-rouge">"RS256"</code> — rejects anything else</li>
      <li>Extracts <code class="language-plaintext highlighter-rouge">kid</code> (Key ID) from the token header</li>
      <li>Calls <code class="language-plaintext highlighter-rouge">v.getPublicKey(kid)</code></li>
    </ul>
  </li>
  <li><strong><code class="language-plaintext highlighter-rouge">getPublicKey(kid)</code></strong> — this is where JWKS caching happens:
    <ul>
      <li>Acquires <code class="language-plaintext highlighter-rouge">RLock</code>, checks if <code class="language-plaintext highlighter-rouge">kid</code> exists in <code class="language-plaintext highlighter-rouge">v.publicKeys</code> and <code class="language-plaintext highlighter-rouge">time.Since(v.lastKeysFetch) &lt; 1 hour</code></li>
      <li><strong>Cache hit</strong>: releases RLock, returns the key — no network call</li>
      <li><strong>Cache miss</strong>: releases RLock, acquires full <code class="language-plaintext highlighter-rouge">Lock</code> (write), double-checks again (another goroutine may have fetched while waiting), then calls <code class="language-plaintext highlighter-rouge">fetchJWKS()</code></li>
      <li><code class="language-plaintext highlighter-rouge">fetchJWKS()</code> makes a GET to <code class="language-plaintext highlighter-rouge">https://&lt;AUTH0_TENANT&gt;/.well-known/jwks.json</code> with a 10-second timeout, filters for <code class="language-plaintext highlighter-rouge">kty=RSA, use=sig</code>, decodes base64url modulus <code class="language-plaintext highlighter-rouge">N</code> and exponent <code class="language-plaintext highlighter-rouge">E</code>, constructs <code class="language-plaintext highlighter-rouge">*rsa.PublicKey</code> objects, stores them all in <code class="language-plaintext highlighter-rouge">v.publicKeys</code> keyed by <code class="language-plaintext highlighter-rouge">kid</code>, updates <code class="language-plaintext highlighter-rouge">v.lastKeysFetch</code></li>
      <li><strong>Fetch failure with stale keys</strong>: if the JWKS endpoint is down but old keys exist, logs a warning and returns the stale key — this is the graceful degradation path</li>
      <li><strong>Fetch failure, no keys</strong>: returns error</li>
    </ul>
  </li>
  <li>
    <p><strong><code class="language-plaintext highlighter-rouge">jwt.Parse</code> verifies the RS256 signature</strong> using the public key returned from the key function. If the signature is invalid, it returns an error.</p>
  </li>
  <li><strong>Manual claim validation</strong> (after signature passes):
    <ul>
      <li><code class="language-plaintext highlighter-rouge">iss</code> claim == configured issuer — exact string match</li>
      <li><code class="language-plaintext highlighter-rouge">aud</code> claim == configured audience — handles both <code class="language-plaintext highlighter-rouge">string</code> and <code class="language-plaintext highlighter-rouge">[]interface{}</code> types (Auth0 can send either)</li>
      <li><code class="language-plaintext highlighter-rouge">email</code> claim — must be present and non-empty</li>
      <li><code class="language-plaintext highlighter-rouge">sub</code> claim — must be present and non-empty</li>
      <li><code class="language-plaintext highlighter-rouge">exp</code> claim — <code class="language-plaintext highlighter-rouge">time.Now().After(oauthContext.ExpiresAt)</code> — double-check (jwt.Parse also checks this but the manual check is explicit)</li>
    </ul>
  </li>
  <li><strong>Role extraction</strong> from <code class="language-plaintext highlighter-rouge">extractRoles()</code>: tries <code class="language-plaintext highlighter-rouge">claims["role"]</code> first, then <code class="language-plaintext highlighter-rouge">claims["roles"]</code> — handles both singular and plural claim names. Each can be a <code class="language-plaintext highlighter-rouge">string</code> or <code class="language-plaintext highlighter-rouge">[]interface{}</code>.</li>
</ol>
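
<p>The caching behaviour in step 2 can be sketched as below. This is a simplified model under stated assumptions: <code class="language-plaintext highlighter-rouge">keyCache</code> and its fields are hypothetical names, and the injected <code class="language-plaintext highlighter-rouge">fetch</code> function stands in for <code class="language-plaintext highlighter-rouge">fetchJWKS()</code> so the example runs without a network call.</p>

<div class="language-go highlighter-rouge"><div class="highlight"><pre class="highlight"><code>package main

import (
	"crypto/rsa"
	"errors"
	"fmt"
	"sync"
	"time"
)

// keyCache is a hypothetical, simplified version of the validator's key store.
type keyCache struct {
	mu        sync.RWMutex
	keys      map[string]*rsa.PublicKey
	lastFetch time.Time
	ttl       time.Duration
	fetch     func() (map[string]*rsa.PublicKey, error) // stands in for fetchJWKS
}

// fresh reports whether kid is cached and the cache has not expired.
// Callers must hold at least the read lock.
func (c *keyCache) fresh(kid string) (*rsa.PublicKey, bool) {
	key, ok := c.keys[kid]
	return key, ok &amp;&amp; time.Since(c.lastFetch) &lt; c.ttl
}

func (c *keyCache) getPublicKey(kid string) (*rsa.PublicKey, error) {
	// Fast path: shared lock, no network call.
	c.mu.RLock()
	key, ok := c.fresh(kid)
	c.mu.RUnlock()
	if ok {
		return key, nil
	}

	// Slow path: exclusive lock, then re-check in case another
	// goroutine refreshed the cache while we were waiting.
	c.mu.Lock()
	defer c.mu.Unlock()
	if key, ok := c.fresh(kid); ok {
		return key, nil
	}

	keys, err := c.fetch()
	if err != nil {
		// Graceful degradation: fall back to a stale key if we have one.
		if stale, found := c.keys[kid]; found {
			return stale, nil
		}
		return nil, fmt.Errorf("fetching JWKS: %w", err)
	}
	c.keys, c.lastFetch = keys, time.Now()
	if key, found := keys[kid]; found {
		return key, nil
	}
	return nil, errors.New("unknown key ID: " + kid)
}

func main() {
	calls := 0
	cache := &amp;keyCache{
		ttl: time.Hour,
		fetch: func() (map[string]*rsa.PublicKey, error) {
			calls++
			return map[string]*rsa.PublicKey{"kid-1": {}}, nil
		},
	}
	cache.getPublicKey("kid-1") // miss: triggers a fetch
	cache.getPublicKey("kid-1") // hit: served from memory
	fmt.Println("fetches:", calls)
}
</code></pre></div></div>

<p>The second check under the write lock is what keeps a burst of concurrent logins from each triggering its own JWKS request after the cache expires.</p>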

<p>This validation process returns the email, role, and expiry time for the token, which is then used to map the role to a service account configured in the proxy. If no role matches, it ends up using the default role, which has read-only access.</p>

<p>Service accounts are basically users configured in the PostgreSQL database that the proxy uses to connect to the database, since the SSO-returned user does not actually exist inside PostgreSQL.</p>

<p>This also ensures we don’t have to create a PostgreSQL user for every user logging into the database, and the same goes for deletion as well. If a user is removed from Active Directory, they automatically do not have access to the database anymore.</p>

<h2 id="authentication-with-postgresql">Authentication with PostgreSQL</h2>

<p>Now that the proxy has authenticated and authorised the incoming SSO user, it now needs to connect this user/client to the actual PostgreSQL database (backend).</p>

<p>This is done in the same way by sending a <code class="language-plaintext highlighter-rouge">StartupMessage</code> to PostgreSQL via a temporary connection, with one small change: the <code class="language-plaintext highlighter-rouge">user</code> field is replaced with the service account username before being sent to PostgreSQL. PostgreSQL never sees the original <code class="language-plaintext highlighter-rouge">user@email.com</code> that the client sent.</p>

<p>The proxy now enters a loop reading messages from PostgreSQL. This part is referred to as <a href="https://www.postgresql.org/docs/current/sasl-authentication.html">SASL authentication</a> in the PostgreSQL protocol.</p>

<ol>
  <li>
    <p>To begin a SASL authentication exchange, the PostgreSQL server sends an <code class="language-plaintext highlighter-rouge">AuthenticationSASL</code> message. It includes a list of SASL authentication mechanisms that the server can accept, in the server’s preferred order.</p>

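    <p>On current PostgreSQL versions the advertised mechanisms are SCRAM-SHA-256 and, on TLS connections that support channel binding, SCRAM-SHA-256-PLUS. MD5 and cleartext password authentication are not SASL mechanisms; the server requests them with their own <code class="language-plaintext highlighter-rouge">AuthenticationMD5Password</code> and <code class="language-plaintext highlighter-rouge">AuthenticationCleartextPassword</code> messages instead.</p>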
  </li>
  <li>
    <p>The proxy selects the first mechanism in the server’s list that it supports and sends a <code class="language-plaintext highlighter-rouge">SASLInitialResponse</code> message to the server.</p>

    <p>If <code class="language-plaintext highlighter-rouge">AuthenticationSASL</code> sends SCRAM-SHA-256, the proxy instantiates an <code class="language-plaintext highlighter-rouge">xdg-go/scram</code> SHA-256 client acting on behalf of the service account. The library is used to perform the full cryptographic exchange using the service account’s password. PostgreSQL never sees the JWT — it only sees the service account performing standard SCRAM.</p>

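    <p>SCRAM-SHA-256 is a challenge-response exchange of four messages (client-first, server-first, client-final, server-final), which the <code class="language-plaintext highlighter-rouge">scram</code> client drives in three <code class="language-plaintext highlighter-rouge">Step</code> calls. With the help of the <code class="language-plaintext highlighter-rouge">scram</code> library, the proxy starts the conversation by calling <code class="language-plaintext highlighter-rouge">Step("")</code> with an empty string, which means “generate the client-first message” (the opening move of SCRAM), and sends the result to PostgreSQL in the <code class="language-plaintext highlighter-rouge">SASLInitialResponse</code>.</p>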
  </li>
</ol>

<p>The above in code looks something like this:</p>

<div class="language-go highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">client</span><span class="p">,</span> <span class="n">err</span> <span class="o">:=</span> <span class="n">scram</span><span class="o">.</span><span class="n">SHA256</span><span class="o">.</span><span class="n">NewClient</span><span class="p">(</span><span class="n">username</span><span class="p">,</span> <span class="n">password</span><span class="p">,</span> <span class="s">""</span><span class="p">)</span>
<span class="k">if</span> <span class="n">err</span> <span class="o">!=</span> <span class="no">nil</span> <span class="p">{</span>
	<span class="k">return</span> <span class="n">logger</span><span class="o">.</span><span class="n">Errorf</span><span class="p">(</span><span class="s">"failed to create SCRAM client: %w"</span><span class="p">,</span> <span class="n">err</span><span class="p">)</span>
<span class="p">}</span>

<span class="n">scramConversation</span> <span class="o">=</span> <span class="n">client</span><span class="o">.</span><span class="n">NewConversation</span><span class="p">()</span>
<span class="n">initialResponse</span><span class="p">,</span> <span class="n">err</span> <span class="o">:=</span> <span class="n">scramConversation</span><span class="o">.</span><span class="n">Step</span><span class="p">(</span><span class="s">""</span><span class="p">)</span>
<span class="k">if</span> <span class="n">err</span> <span class="o">!=</span> <span class="no">nil</span> <span class="p">{</span>
	<span class="k">return</span> <span class="n">logger</span><span class="o">.</span><span class="n">Errorf</span><span class="p">(</span><span class="s">"SCRAM initial step failed: %w"</span><span class="p">,</span> <span class="n">err</span><span class="p">)</span>
<span class="p">}</span>

<span class="n">logger</span><span class="o">.</span><span class="n">Debug</span><span class="p">(</span><span class="s">"sending SCRAM initial response to backend"</span><span class="p">)</span>
<span class="n">err</span> <span class="o">=</span> <span class="n">frontend</span><span class="o">.</span><span class="n">Send</span><span class="p">(</span><span class="o">&amp;</span><span class="n">pgproto3</span><span class="o">.</span><span class="n">SASLInitialResponse</span><span class="p">{</span>
	<span class="n">AuthMechanism</span><span class="o">:</span> <span class="s">"SCRAM-SHA-256"</span><span class="p">,</span>
	<span class="n">Data</span><span class="o">:</span>          <span class="p">[]</span><span class="kt">byte</span><span class="p">(</span><span class="n">initialResponse</span><span class="p">),</span>
<span class="p">})</span>
<span class="k">if</span> <span class="n">err</span> <span class="o">!=</span> <span class="no">nil</span> <span class="p">{</span>
	<span class="k">return</span> <span class="n">logger</span><span class="o">.</span><span class="n">Errorf</span><span class="p">(</span><span class="s">"failed to send SASL initial response: %w"</span><span class="p">,</span> <span class="n">err</span><span class="p">)</span>
<span class="p">}</span>
</code></pre></div></div>

<p>The final payload is a structured ASCII string. It looks like:</p>

<div class="language-text highlighter-rouge"><div class="highlight"><pre class="highlight"><code>n,,n=gprxy_admin,r=fyko+d2lbbFgONRv9qkxdawL
</code></pre></div></div>

<p>Breaking this down part by part:</p>

<table>
  <thead>
    <tr>
      <th>Part</th>
      <th>Value</th>
      <th>Meaning</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <td>n,,</td>
      <td>n,,</td>
      <td>GS2 header. <code class="language-plaintext highlighter-rouge">n</code> = no channel binding. <code class="language-plaintext highlighter-rouge">,,</code> = no authzid</td>
    </tr>
    <tr>
      <td>n=</td>
      <td>n=gprxy_admin</td>
      <td>The username (the <code class="language-plaintext highlighter-rouge">n=</code> attribute)</td>
    </tr>
    <tr>
      <td><code class="language-plaintext highlighter-rouge">,</code></td>
      <td><code class="language-plaintext highlighter-rouge">,</code></td>
      <td>Separator</td>
    </tr>
    <tr>
      <td>r=</td>
      <td>r=fyko+d2lbbFgONRv9qkxdawL</td>
      <td>Client nonce</td>
    </tr>
  </tbody>
</table>

<p><code class="language-plaintext highlighter-rouge">authzid</code> (Authorization Identity) is the SASL mechanism component defining the user identity that a client wants to act as.</p>

<p>Client nonce — a cryptographically random base64 string generated fresh for this authentication.</p>
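<p>To make the layout concrete, here is a minimal sketch that builds and splits a client-first message using only the standard library. The helper names (<code class="language-plaintext highlighter-rouge">escapeName</code>, <code class="language-plaintext highlighter-rouge">clientFirst</code>, <code class="language-plaintext highlighter-rouge">parseClientFirst</code>) are made up for illustration and are not part of gprxy or the <code class="language-plaintext highlighter-rouge">scram</code> library. The escaping rule for <code class="language-plaintext highlighter-rouge">=</code> and <code class="language-plaintext highlighter-rouge">,</code> in usernames comes from RFC 5802.</p>

```go
package main

import (
	"fmt"
	"strings"
)

// escapeName applies the SCRAM username escaping from RFC 5802:
// "=" becomes "=3D" and "," becomes "=2C".
func escapeName(user string) string {
	return strings.NewReplacer("=", "=3D", ",", "=2C").Replace(user)
}

// clientFirst builds the GS2 header "n,," followed by "n=<user>,r=<nonce>".
// (Hypothetical helper; the real message is produced by the scram library.)
func clientFirst(user, nonce string) string {
	return "n,," + "n=" + escapeName(user) + ",r=" + nonce
}

// parseClientFirst splits a client-first message back into its three parts.
// It does not unescape the username or handle extension attributes.
func parseClientFirst(msg string) (gs2Header, user, nonce string, err error) {
	parts := strings.SplitN(msg, ",", 4)
	if len(parts) != 4 || !strings.HasPrefix(parts[2], "n=") || !strings.HasPrefix(parts[3], "r=") {
		return "", "", "", fmt.Errorf("malformed client-first message: %q", msg)
	}
	gs2Header = parts[0] + "," + parts[1] + ","
	return gs2Header, parts[2][2:], parts[3][2:], nil
}

func main() {
	msg := clientFirst("gprxy_admin", "fyko+d2lbbFgONRv9qkxdawL")
	gs2, user, nonce, err := parseClientFirst(msg)
	fmt.Println(msg, gs2, user, nonce, err)
}
```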

<ol>
  <li>PostgreSQL responds with the <code class="language-plaintext highlighter-rouge">AuthenticationSASLContinue</code> server-first message, a challenge. This is the message that makes SCRAM secure.</li>
</ol>

<p>The payload contains <code class="language-plaintext highlighter-rouge">r=&lt;combined_nonce&gt;,s=&lt;salt&gt;,i=&lt;iterations&gt;</code></p>

<p>Combined nonce — client nonce + server nonce appended together. PostgreSQL echoes back the client nonce and appends its own random suffix. The client must verify the prefix matches what it sent.</p>

<p>Salt — a random base64-encoded value stored in pg_authid alongside the user’s password hash. Different for every user.</p>

<p>Iteration count — how many times to apply PBKDF2 to derive the key. Higher = more expensive to brute-force. PostgreSQL defaults to 4096.</p>
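<p>A rough sketch of how a client could parse this payload and apply the nonce-prefix check described above. The <code class="language-plaintext highlighter-rouge">serverFirst</code> type and <code class="language-plaintext highlighter-rouge">parseServerFirst</code> function are hypothetical, and the salt in <code class="language-plaintext highlighter-rouge">main</code> is just an illustrative value. In gprxy this work is done inside the <code class="language-plaintext highlighter-rouge">scram</code> library.</p>

```go
package main

import (
	"encoding/base64"
	"errors"
	"fmt"
	"strconv"
	"strings"
)

// serverFirst holds the three attributes of the server-first message.
type serverFirst struct {
	nonce      string // combined nonce (client nonce + server suffix)
	salt       []byte // decoded salt
	iterations int    // PBKDF2 iteration count
}

// parseServerFirst parses "r=...,s=...,i=..." and rejects a combined nonce
// that does not start with (and extend) the nonce the client sent.
func parseServerFirst(msg, clientNonce string) (serverFirst, error) {
	var out serverFirst
	fields := strings.Split(msg, ",")
	if len(fields) < 3 ||
		!strings.HasPrefix(fields[0], "r=") ||
		!strings.HasPrefix(fields[1], "s=") ||
		!strings.HasPrefix(fields[2], "i=") {
		return out, errors.New("server-first message must be r=...,s=...,i=...")
	}
	out.nonce = fields[0][2:]
	if !strings.HasPrefix(out.nonce, clientNonce) || len(out.nonce) == len(clientNonce) {
		return out, errors.New("combined nonce does not extend the client nonce")
	}
	salt, err := base64.StdEncoding.DecodeString(fields[1][2:])
	if err != nil {
		return out, fmt.Errorf("bad salt: %w", err)
	}
	out.salt = salt
	n, err := strconv.Atoi(fields[2][2:])
	if err != nil || n < 1 {
		return out, errors.New("bad iteration count")
	}
	out.iterations = n
	return out, nil
}

func main() {
	sf, err := parseServerFirst(
		"r=fyko+d2lbbFgONRv9qkxdawL3rfcNHYJY1ZVvWVs7j,s=QSXCR+Q6sek8bf92,i=4096",
		"fyko+d2lbbFgONRv9qkxdawL",
	)
	fmt.Println(sf.nonce, len(sf.salt), sf.iterations, err)
}
```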

<ol>
  <li>The proxy responds with the client-final message, containing the client proof as <code class="language-plaintext highlighter-rouge">SASLResponse</code>.</li>
</ol>

<p>Before it can respond, the proxy has to compute the client proof. It does this by calling <code class="language-plaintext highlighter-rouge">scramConversation.Step(serverFirstMessage)</code>.</p>

<p>The following cryptographic computations are done:</p>

<ol>
  <li><code class="language-plaintext highlighter-rouge">SaltedPassword = PBKDF2(SHA-256, password, salt, iterations, 32)</code></li>
  <li><code class="language-plaintext highlighter-rouge">ClientKey = HMAC-SHA-256(SaltedPassword, "Client Key")</code></li>
  <li><code class="language-plaintext highlighter-rouge">StoredKey = SHA-256(ClientKey)</code></li>
  <li><code class="language-plaintext highlighter-rouge">AuthMessage = client-first-message-bare + "," + server-first-message + "," + client-final-message-without-proof</code></li>
  <li><code class="language-plaintext highlighter-rouge">ClientSignature = HMAC-SHA-256(StoredKey, AuthMessage)</code></li>
  <li><code class="language-plaintext highlighter-rouge">ClientProof = ClientKey XOR ClientSignature</code></li>
</ol>

<p>The password never travels on the wire. Only <code class="language-plaintext highlighter-rouge">ClientProof</code> does — a value that proves you know the password without revealing it.</p>
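<p>The six steps above can be sketched with nothing but the standard library. This is an illustration of the math, not the code gprxy runs (that lives inside the <code class="language-plaintext highlighter-rouge">scram</code> library), and every function name here is made up. <code class="language-plaintext highlighter-rouge">saltedPassword</code> implements PBKDF2 by hand for a single 32-byte output block, which is what RFC 5802 calls <code class="language-plaintext highlighter-rouge">Hi()</code>.</p>

```go
package main

import (
	"crypto/hmac"
	"crypto/sha256"
	"encoding/base64"
	"fmt"
)

func hmacSHA256(key, msg []byte) []byte {
	m := hmac.New(sha256.New, key)
	m.Write(msg)
	return m.Sum(nil)
}

// saltedPassword is PBKDF2-HMAC-SHA-256 with one 32-byte output block.
func saltedPassword(password string, salt []byte, iterations int) []byte {
	// U1 = HMAC(password, salt || INT(1)); each later U is HMAC(password, previous U).
	u := hmacSHA256([]byte(password), append(append([]byte{}, salt...), 0, 0, 0, 1))
	out := append([]byte{}, u...)
	for i := 1; i < iterations; i++ {
		u = hmacSHA256([]byte(password), u)
		for j := range out {
			out[j] ^= u[j]
		}
	}
	return out
}

func xorBytes(a, b []byte) []byte {
	out := make([]byte, len(a))
	for i := range a {
		out[i] = a[i] ^ b[i]
	}
	return out
}

// storedKeyFor returns SHA-256(ClientKey).
func storedKeyFor(clientKey []byte) []byte {
	sum := sha256.Sum256(clientKey)
	return sum[:]
}

// clientProof follows steps 1 to 6: ClientKey XOR HMAC(StoredKey, AuthMessage).
func clientProof(password string, salt []byte, iterations int, authMessage string) []byte {
	salted := saltedPassword(password, salt, iterations)
	clientKey := hmacSHA256(salted, []byte("Client Key"))
	clientSignature := hmacSHA256(storedKeyFor(clientKey), []byte(authMessage))
	return xorBytes(clientKey, clientSignature)
}

func main() {
	// Illustrative inputs only; a real AuthMessage is built from the exchanged messages.
	authMessage := "n=user,r=abc,r=abcdef,s=c2FsdA==,i=4096,c=biws,r=abcdef"
	proof := clientProof("pencil", []byte("salt"), 4096, authMessage)
	fmt.Println("p=" + base64.StdEncoding.EncodeToString(proof))
}
```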

<p>The final payload that is sent to the server looks like this:</p>

<div class="language-text highlighter-rouge"><div class="highlight"><pre class="highlight"><code>c=biws,r=fyko+d2lbbFgONRv9qkxdawL3rfcNHYJY1ZVvWVs7j,p=dHzbZapWIk4jUhN+Ute9ytag9zjfMHgsqmmiz9AndVQ=
</code></pre></div></div>

<table>
  <thead>
    <tr>
      <th>Attribute</th>
      <th>Example value</th>
      <th>Meaning</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <td>c=</td>
      <td>biws</td>
      <td>Channel binding data. <code class="language-plaintext highlighter-rouge">biws</code> is the base64 encoding of <code class="language-plaintext highlighter-rouge">"n,,"</code> (the GS2 header from the initial message). Since gprxy uses no channel binding, this is always <code class="language-plaintext highlighter-rouge">biws</code>.</td>
    </tr>
    <tr>
      <td>r=</td>
      <td>fyko+d2lbbFgONRv9qkxdawL3rfcNHYJY1ZVvWVs7j</td>
      <td>The full combined nonce echoed back exactly as received from the server.</td>
    </tr>
    <tr>
      <td>p=</td>
      <td>dHzbZapWIk4jUhN+Ute9ytag9zjfMHgsqmmiz9AndVQ=</td>
      <td>The ClientProof — the XOR of <code class="language-plaintext highlighter-rouge">ClientKey</code> and <code class="language-plaintext highlighter-rouge">ClientSignature</code>, base64-encoded. This is the proof of knowledge.</td>
    </tr>
  </tbody>
</table>

<ol>
  <li>
    <p>PostgreSQL receives the above payload and performs the following checks before sending the final server message, <code class="language-plaintext highlighter-rouge">AuthenticationSASLFinal</code>:</p>
  </li>
  <li>Verifies <code class="language-plaintext highlighter-rouge">r=</code> still starts with the client nonce it saw earlier.</li>
  <li>Rebuilds the same <code class="language-plaintext highlighter-rouge">AuthMessage</code> on its side from the messages exchanged so far.</li>
  <li>Reads <code class="language-plaintext highlighter-rouge">StoredKey</code> from the SCRAM verifier stored in <code class="language-plaintext highlighter-rouge">pg_authid</code>.</li>
  <li>Verifies the proof: it computes <code class="language-plaintext highlighter-rouge">ClientSignature = HMAC-SHA-256(StoredKey, AuthMessage)</code>, recovers <code class="language-plaintext highlighter-rouge">ClientKey = ClientProof XOR ClientSignature</code>, and checks that <code class="language-plaintext highlighter-rouge">SHA-256(ClientKey)</code> equals <code class="language-plaintext highlighter-rouge">StoredKey</code>. The server never has to store <code class="language-plaintext highlighter-rouge">ClientKey</code> itself.</li>
</ol>
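<p>Here is a small sketch of that server-side check, assuming the server only has <code class="language-plaintext highlighter-rouge">StoredKey</code> and the <code class="language-plaintext highlighter-rouge">AuthMessage</code>. It is not PostgreSQL’s implementation; <code class="language-plaintext highlighter-rouge">verifyClientProof</code> and <code class="language-plaintext highlighter-rouge">makeProof</code> are hypothetical names, and the client key in <code class="language-plaintext highlighter-rouge">main</code> is a stand-in value rather than one derived from a real password.</p>

```go
package main

import (
	"crypto/hmac"
	"crypto/sha256"
	"fmt"
)

func hmacSHA256(key, msg []byte) []byte {
	m := hmac.New(sha256.New, key)
	m.Write(msg)
	return m.Sum(nil)
}

func xorBytes(a, b []byte) []byte {
	out := make([]byte, len(a))
	for i := range a {
		out[i] = a[i] ^ b[i]
	}
	return out
}

// storedKeyFor returns SHA-256(ClientKey), the value the server keeps.
func storedKeyFor(clientKey []byte) []byte {
	sum := sha256.Sum256(clientKey)
	return sum[:]
}

// makeProof plays the client side: ClientKey XOR HMAC(StoredKey, AuthMessage).
func makeProof(clientKey []byte, authMessage string) []byte {
	return xorBytes(clientKey, hmacSHA256(storedKeyFor(clientKey), []byte(authMessage)))
}

// verifyClientProof plays the server side. It recovers the candidate
// ClientKey from the proof and accepts only if its hash equals StoredKey.
func verifyClientProof(storedKey, proof []byte, authMessage string) bool {
	clientSignature := hmacSHA256(storedKey, []byte(authMessage))
	if len(proof) != len(clientSignature) {
		return false
	}
	candidate := storedKeyFor(xorBytes(proof, clientSignature))
	return hmac.Equal(candidate, storedKey)
}

func main() {
	clientKey := hmacSHA256([]byte("stand-in salted password"), []byte("Client Key"))
	storedKey := storedKeyFor(clientKey)
	proof := makeProof(clientKey, "auth-message")
	fmt.Println("valid proof accepted:", verifyClientProof(storedKey, proof, "auth-message"))
}
```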

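<p><code class="language-plaintext highlighter-rouge">AuthenticationSASLFinal</code> is the mutual authentication step: PostgreSQL proves to the proxy that it holds a key derived from the same password (the <code class="language-plaintext highlighter-rouge">ServerKey</code>). This lets the proxy detect a server that does not actually know the user’s credentials.</p>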

<p>The final payload that is sent back to the proxy looks like this:</p>

<p><code class="language-plaintext highlighter-rouge">v=&lt;ServerSignature&gt;</code></p>

<ol>
  <li>The proxy calls <code class="language-plaintext highlighter-rouge">scramConversation.Step(serverFinalMessage)</code>:</li>
</ol>

<p>This internally computes the expected <code class="language-plaintext highlighter-rouge">ServerSignature</code> using its copy of <code class="language-plaintext highlighter-rouge">SaltedPassword</code> and the <code class="language-plaintext highlighter-rouge">AuthMessage</code>, then compares it to <code class="language-plaintext highlighter-rouge">v=</code> from the server. If they don’t match, the server has not proven that it knows the credentials, so the proxy may be talking to a rogue server and the step fails.</p>
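<p>As a sketch of what that comparison involves: RFC 5802 defines <code class="language-plaintext highlighter-rouge">ServerKey = HMAC(SaltedPassword, "Server Key")</code> and <code class="language-plaintext highlighter-rouge">ServerSignature = HMAC(ServerKey, AuthMessage)</code>. The function names below are hypothetical and the inputs in <code class="language-plaintext highlighter-rouge">main</code> are stand-ins; the real check happens inside the <code class="language-plaintext highlighter-rouge">scram</code> library.</p>

```go
package main

import (
	"crypto/hmac"
	"crypto/sha256"
	"encoding/base64"
	"errors"
	"fmt"
	"strings"
)

func hmacSHA256(key, msg []byte) []byte {
	m := hmac.New(sha256.New, key)
	m.Write(msg)
	return m.Sum(nil)
}

// serverSignature derives ServerKey from SaltedPassword and signs AuthMessage.
func serverSignature(saltedPassword []byte, authMessage string) []byte {
	serverKey := hmacSHA256(saltedPassword, []byte("Server Key"))
	return hmacSHA256(serverKey, []byte(authMessage))
}

// serverFinal formats the "v=<base64 signature>" payload a server would send.
func serverFinal(saltedPassword []byte, authMessage string) string {
	return "v=" + base64.StdEncoding.EncodeToString(serverSignature(saltedPassword, authMessage))
}

// checkServerFinal is the client-side check: decode v= and compare it, in
// constant time, with the signature the client computes itself.
func checkServerFinal(msg string, saltedPassword []byte, authMessage string) error {
	if !strings.HasPrefix(msg, "v=") {
		return fmt.Errorf("unexpected server-final message: %q", msg)
	}
	got, err := base64.StdEncoding.DecodeString(msg[2:])
	if err != nil {
		return fmt.Errorf("bad server signature encoding: %w", err)
	}
	if !hmac.Equal(got, serverSignature(saltedPassword, authMessage)) {
		return errors.New("server signature mismatch")
	}
	return nil
}

func main() {
	salted := []byte("stand-in salted password")
	msg := serverFinal(salted, "auth-message")
	fmt.Println(msg, checkServerFinal(msg, salted, "auth-message"))
}
```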

<ol>
  <li>
    <p>Along with <code class="language-plaintext highlighter-rouge">AuthenticationSASLFinal</code>, the server also sends an <code class="language-plaintext highlighter-rouge">AuthenticationOk</code> message. The proxy passes it along to the client, which tells the client that authentication succeeded.</p>
  </li>
  <li>
    <p>PostgreSQL also sends several <code class="language-plaintext highlighter-rouge">ParameterStatus</code> messages immediately after <code class="language-plaintext highlighter-rouge">AuthenticationOk</code>:</p>
  </li>
</ol>

<div class="language-text highlighter-rouge"><div class="highlight"><pre class="highlight"><code>server_version    = 16.1
client_encoding   = UTF8
server_encoding   = UTF8
DateStyle         = ISO, MDY
TimeZone          = UTC
integer_datetimes = on
...
</code></pre></div></div>

<p>Each is forwarded to the client unchanged. The client caches these for the session.</p>

<ol>
  <li><code class="language-plaintext highlighter-rouge">BackendKeyData</code> and <code class="language-plaintext highlighter-rouge">ReadyForQuery</code> are sent to the proxy by PostgreSQL, but these are never relayed to the client and the temporary connection is then terminated.</li>
</ol>

<p>The reason is that the whole authentication process was performed over a temporary connection that is then terminated. Relaying that connection’s <code class="language-plaintext highlighter-rouge">BackendKeyData</code>, whose <code class="language-plaintext highlighter-rouge">(PID, SecretKey)</code> pair is what a client later uses to cancel requests, would lead to one of these three scenarios:</p>

<ol>
  <li>
    <p><strong>Temp connection PID is already dead</strong> - The temp connection is closed immediately after auth. Its backend PostgreSQL process (<code class="language-plaintext highlighter-rouge">PID=12345</code>) is gone. The client presses <code class="language-plaintext highlighter-rouge">Ctrl+C</code>. The proxy receives <code class="language-plaintext highlighter-rouge">(PID=12345, SK=98765)</code>. It looks in <code class="language-plaintext highlighter-rouge">activeConnections</code> and finds nothing registered under that pair. The cancel is silently dropped, and the query keeps running until it completes on its own.</p>
  </li>
  <li>
    <p><strong>Temp connection PID is registered but wrong</strong> - Even if the proxy tried to register the temp connection’s key, what would it point to? The temp connection has no pool connection attached to it. There is no backend to cancel on. The proxy would forward the cancel to PostgreSQL targeting a process that is either dead or belongs to a completely different connection.</p>
  </li>
  <li>
    <p><strong>OS PID reuse</strong> - PIDs are finite. The OS can recycle <code class="language-plaintext highlighter-rouge">PID=12345</code> to a completely different PostgreSQL backend process after the temp connection closes. The client’s cancel request, carrying that stale PID, could accidentally cancel a totally unrelated query running on a different client’s connection.</p>
  </li>
</ol>

<p>The only correct PID and SecretKey to give the client is the one belonging to the pool connection — the live backend process that is actually executing queries for this client.</p>
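<p>One way to picture this is a registry keyed by the <code class="language-plaintext highlighter-rouge">(PID, SecretKey)</code> pair of the pool connection. The sketch below is a guess at the general shape, not gprxy’s actual <code class="language-plaintext highlighter-rouge">activeConnections</code> code: the type names are invented, and the pooled backend is represented by a plain string label.</p>

```go
package main

import (
	"fmt"
	"sync"
)

// cancelKey is the pair a client sends in a CancelRequest.
type cancelKey struct {
	pid       uint32
	secretKey uint32
}

// cancelRegistry maps the key handed to a client to the pooled backend that
// is serving it. The string value is a stand-in for a real connection.
type cancelRegistry struct {
	mu     sync.Mutex
	active map[cancelKey]string
}

func newCancelRegistry() *cancelRegistry {
	return &cancelRegistry{active: make(map[cancelKey]string)}
}

// register is called once the pool connection has been acquired.
func (r *cancelRegistry) register(k cancelKey, backend string) {
	r.mu.Lock()
	defer r.mu.Unlock()
	r.active[k] = backend
}

// unregister is called when the client session ends.
func (r *cancelRegistry) unregister(k cancelKey) {
	r.mu.Lock()
	defer r.mu.Unlock()
	delete(r.active, k)
}

// lookup resolves a cancel request; unknown keys are simply not found.
func (r *cancelRegistry) lookup(k cancelKey) (string, bool) {
	r.mu.Lock()
	defer r.mu.Unlock()
	backend, ok := r.active[k]
	return backend, ok
}

func main() {
	reg := newCancelRegistry()
	reg.register(cancelKey{pid: 777, secretKey: 42}, "pool-conn-1")

	// The key of the closed temp connection was never registered.
	_, ok := reg.lookup(cancelKey{pid: 12345, secretKey: 98765})
	fmt.Println("stale temp key found:", ok)
}
```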

<p>Similarly, the <code class="language-plaintext highlighter-rouge">ReadyForQuery</code> message is also suppressed and not relayed to the client, as there is no pooled backend connection ready yet for it to start relaying queries.</p>

<h2 id="post-authentication-sequence">Post-Authentication Sequence</h2>

<p>Now that the user is authenticated successfully with PostgreSQL, the proxy needs a connection from the pool to start running queries.</p>

<p>Let’s now talk about how connection pooling is implemented:</p>

<h3 id="layer-1-the-top-level-registry-poolmanager">Layer 1: The Top-Level Registry <code class="language-plaintext highlighter-rouge">poolManager</code></h3>

<div class="language-go highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">var</span> <span class="p">(</span>
    <span class="n">poolManager</span> <span class="o">=</span> <span class="nb">make</span><span class="p">(</span><span class="k">map</span><span class="p">[</span><span class="n">poolKey</span><span class="p">]</span><span class="o">*</span><span class="n">pgxpool</span><span class="o">.</span><span class="n">Pool</span><span class="p">)</span>
    <span class="n">poolMutex</span>   <span class="n">sync</span><span class="o">.</span><span class="n">RWMutex</span>
<span class="p">)</span>
</code></pre></div></div>

<p>This is a <strong>global process-wide map</strong> — one instance for the entire gprxy process, shared across all goroutines and all client connections. It lives for the lifetime of the process and is never torn down.</p>

<h4 id="the-key-poolkey">The Key: <code class="language-plaintext highlighter-rouge">poolKey</code></h4>

<div class="language-go highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">type</span> <span class="n">poolKey</span> <span class="k">struct</span> <span class="p">{</span>
	<span class="n">user</span>     <span class="kt">string</span>
	<span class="n">database</span> <span class="kt">string</span>
<span class="p">}</span>
</code></pre></div></div>

<p>This is a Go struct used as a map key. Go allows any comparable type as a map key, and structs with only comparable fields are comparable. The two fields together form the composite key.</p>

<p><code class="language-plaintext highlighter-rouge">user</code> here is the <strong>original client username</strong> — e.g. <code class="language-plaintext highlighter-rouge">alice@example.com</code> from the JWT, or <code class="language-plaintext highlighter-rouge">bob</code> from traditional auth. It is NOT the service account (<code class="language-plaintext highlighter-rouge">gprxy_admin</code>). This is set from <code class="language-plaintext highlighter-rouge">msg.Parameters["user"]</code> from the original <code class="language-plaintext highlighter-rouge">StartupMessage</code>.</p>

<p><code class="language-plaintext highlighter-rouge">database</code> is the database name from the same <code class="language-plaintext highlighter-rouge">StartupMessage</code> — e.g. <code class="language-plaintext highlighter-rouge">gprxy_test</code>.</p>

<p>So the map looks like this after several clients connect:</p>

<div class="language-text highlighter-rouge"><div class="highlight"><pre class="highlight"><code>poolManager = {
    {user: "alice@example.com", database: "gprxy_test"}  →  *pgxpool.Pool (up to 5 conns)
    {user: "alice@example.com", database: "analytics"}   →  *pgxpool.Pool (up to 5 conns)
    {user: "bob@example.com",   database: "gprxy_test"}  →  *pgxpool.Pool (up to 5 conns)
    {user: "carol",             database: "gprxy_test"}  →  *pgxpool.Pool (up to 5 conns)
}
</code></pre></div></div>

<p>Each <code class="language-plaintext highlighter-rouge">*pgxpool.Pool</code> value manages its own set of up to 5 real TCP connections to PostgreSQL.</p>
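<p>A tiny sketch of the composite key in action. <code class="language-plaintext highlighter-rouge">distinctPools</code> is a hypothetical helper that counts how many pools a series of client connections would need, relying only on the fact that two <code class="language-plaintext highlighter-rouge">poolKey</code> values with equal fields are the same map key.</p>

```go
package main

import "fmt"

type poolKey struct {
	user     string
	database string
}

// distinctPools returns how many separate pools the given connections need:
// one per unique (user, database) pair.
func distinctPools(conns []poolKey) int {
	seen := make(map[poolKey]bool)
	for _, k := range conns {
		seen[k] = true
	}
	return len(seen)
}

func main() {
	conns := []poolKey{
		{user: "alice@example.com", database: "gprxy_test"},
		{user: "alice@example.com", database: "analytics"},
		{user: "alice@example.com", database: "gprxy_test"}, // same key as the first
		{user: "carol", database: "gprxy_test"},
	}
	fmt.Println("pools needed:", distinctPools(conns))
}
```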

<h4 id="the-lock-syncrwmutex">The Lock: <code class="language-plaintext highlighter-rouge">sync.RWMutex</code></h4>

<p><code class="language-plaintext highlighter-rouge">poolMutex</code> protects <code class="language-plaintext highlighter-rouge">poolManager</code> from concurrent reads and writes across goroutines. Since every client connection runs in its own goroutine, many goroutines can call <code class="language-plaintext highlighter-rouge">GetOrCreatePool</code> simultaneously.</p>

<p>A <code class="language-plaintext highlighter-rouge">sync.RWMutex</code> allows:</p>

<ul>
  <li><strong>Many goroutines to read simultaneously</strong> — <code class="language-plaintext highlighter-rouge">RLock()</code> → <code class="language-plaintext highlighter-rouge">RUnlock()</code></li>
  <li><strong>Only one goroutine to write, blocking all reads</strong> — <code class="language-plaintext highlighter-rouge">Lock()</code> → <code class="language-plaintext highlighter-rouge">Unlock()</code></li>
</ul>

<p>The read path (pool already exists) is the fast path: it holds a read lock only for the duration of a map lookup. The write path (first connection for this key) is rare and takes a write lock briefly to insert the new pool.</p>

<h3 id="layer-2-the-double-checked-locking-pattern">Layer 2: The Double-Checked Locking Pattern</h3>

<div class="language-go highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">poolMutex</span><span class="o">.</span><span class="n">RLock</span><span class="p">()</span>
<span class="n">pool</span><span class="p">,</span> <span class="n">exists</span> <span class="o">:=</span> <span class="n">poolManager</span><span class="p">[</span><span class="n">key</span><span class="p">]</span>
<span class="n">poolMutex</span><span class="o">.</span><span class="n">RUnlock</span><span class="p">()</span>

<span class="k">if</span> <span class="n">exists</span> <span class="p">{</span>
	<span class="k">return</span> <span class="n">pool</span><span class="p">,</span> <span class="no">nil</span>
<span class="p">}</span>

<span class="n">poolMutex</span><span class="o">.</span><span class="n">Lock</span><span class="p">()</span>
<span class="k">defer</span> <span class="n">poolMutex</span><span class="o">.</span><span class="n">Unlock</span><span class="p">()</span>

<span class="k">if</span> <span class="n">pool</span><span class="p">,</span> <span class="n">exists</span> <span class="o">:=</span> <span class="n">poolManager</span><span class="p">[</span><span class="n">key</span><span class="p">];</span> <span class="n">exists</span> <span class="p">{</span>
	<span class="k">return</span> <span class="n">pool</span><span class="p">,</span> <span class="no">nil</span>
<span class="p">}</span>
</code></pre></div></div>

<p>This is a classic <strong>double-checked locking</strong> pattern. Here is why it needs two checks:</p>

<p><strong>Scenario without the second check:</strong></p>

<ol>
  <li>Goroutine A: <code class="language-plaintext highlighter-rouge">RLock()</code> → pool not found → <code class="language-plaintext highlighter-rouge">RUnlock()</code></li>
  <li>Goroutine B: <code class="language-plaintext highlighter-rouge">RLock()</code> → pool not found → <code class="language-plaintext highlighter-rouge">RUnlock()</code></li>
  <li>Goroutine A: <code class="language-plaintext highlighter-rouge">Lock()</code> → creates pool → <code class="language-plaintext highlighter-rouge">Unlock()</code></li>
  <li>Goroutine B: <code class="language-plaintext highlighter-rouge">Lock()</code> → <strong>also creates a second pool</strong> → <strong>two pools for same key, one is lost</strong></li>
</ol>

<p>The second check inside the write lock prevents this. When goroutine B gets the write lock after A finishes, it checks again and finds the pool already there, so it returns it instead of creating a duplicate.</p>
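<p>Here is a self-contained sketch of the whole pattern, including the creation step that the snippet above stops short of. To keep it runnable without a database, <code class="language-plaintext highlighter-rouge">fakePool</code> stands in for <code class="language-plaintext highlighter-rouge">*pgxpool.Pool</code>, and <code class="language-plaintext highlighter-rouge">getOrCreatePool</code> and <code class="language-plaintext highlighter-rouge">raceForPool</code> are illustrative names rather than gprxy’s real functions.</p>

```go
package main

import (
	"fmt"
	"sync"
)

type poolKey struct {
	user     string
	database string
}

// fakePool is a stand-in for *pgxpool.Pool.
type fakePool struct{ id int }

var (
	poolManager = make(map[poolKey]*fakePool)
	poolMutex   sync.RWMutex
	created     int // number of pools ever created; guarded by poolMutex
)

func getOrCreatePool(key poolKey) *fakePool {
	// First check: cheap read lock for the common case.
	poolMutex.RLock()
	pool, exists := poolManager[key]
	poolMutex.RUnlock()
	if exists {
		return pool
	}

	poolMutex.Lock()
	defer poolMutex.Unlock()

	// Second check: another goroutine may have created the pool while we
	// were waiting for the write lock.
	if pool, exists := poolManager[key]; exists {
		return pool
	}
	created++
	pool = &fakePool{id: created}
	poolManager[key] = pool
	return pool
}

// raceForPool has n goroutines request the same key at once and reports how
// many distinct pools they ended up with.
func raceForPool(key poolKey, n int) int {
	results := make([]*fakePool, n)
	var wg sync.WaitGroup
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func(i int) {
			defer wg.Done()
			results[i] = getOrCreatePool(key)
		}(i)
	}
	wg.Wait()
	seen := make(map[*fakePool]bool)
	for _, p := range results {
		seen[p] = true
	}
	return len(seen)
}

func main() {
	key := poolKey{user: "alice@example.com", database: "gprxy_test"}
	fmt.Println("distinct pools seen by 100 goroutines:", raceForPool(key, 100))
}
```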

<h3 id="layer-3-what-pgxpoolpool-actually-is">Layer 3: What <code class="language-plaintext highlighter-rouge">pgxpool.Pool</code> Actually Is</h3>

<p>Each value in <code class="language-plaintext highlighter-rouge">poolManager</code> is a <code class="language-plaintext highlighter-rouge">*pgxpool.Pool</code>. This is not a simple slice of connections. It is a sophisticated object with its own goroutines and internal data structures.</p>

<h4 id="internal-structure-inside-pgxpool">Internal structure (inside pgxpool):</h4>

<div class="language-text highlighter-rouge"><div class="highlight"><pre class="highlight"><code>*pgxpool.Pool
├── config         *pgxpool.Config       (MaxConns, timeouts, etc.)
├── p              *puddle.Pool[*pgxpool.connResource]   ← the actual pool
│   ├── resources  []poolResource        (ring buffer of connections)
│   ├── cond       *sync.Cond            (for blocking Acquire calls)
│   └── ...
├── closeChan      chan struct{}          (signal pool close)
└── ...
</code></pre></div></div>

<p>pgxpool uses the <strong>puddle</strong> library internally for the actual pooling logic. <code class="language-plaintext highlighter-rouge">puddle</code> maintains a list of resources (connections) and a <code class="language-plaintext highlighter-rouge">sync.Cond</code> for goroutines waiting for an available connection.</p>

<h4 id="the-pool-configuration-gprxy-sets">The pool configuration gprxy sets:</h4>

<div class="language-go highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">config</span><span class="o">.</span><span class="n">MaxConns</span> <span class="o">=</span> <span class="n">defaultMaxConns</span>            <span class="c">// 5</span>
<span class="n">config</span><span class="o">.</span><span class="n">MinConns</span> <span class="o">=</span> <span class="n">defaultMinConns</span>            <span class="c">// 0</span>
<span class="n">config</span><span class="o">.</span><span class="n">MaxConnLifetime</span> <span class="o">=</span> <span class="n">defaultMaxConnLifetime</span>      <span class="c">// 1 hour</span>
<span class="n">config</span><span class="o">.</span><span class="n">MaxConnIdleTime</span> <span class="o">=</span> <span class="n">defaultMaxConnIdleTime</span>      <span class="c">// 30 minutes</span>
<span class="n">config</span><span class="o">.</span><span class="n">HealthCheckPeriod</span> <span class="o">=</span> <span class="n">defaultHealthCheckPeriod</span> <span class="c">// 1 minute</span>
<span class="n">config</span><span class="o">.</span><span class="n">ConnConfig</span><span class="o">.</span><span class="n">ConnectTimeout</span> <span class="o">=</span> <span class="n">defaultConnectTimeout</span> <span class="c">// 5 seconds</span>
</code></pre></div></div>

<p>What each setting actually controls:</p>

<p><strong><code class="language-plaintext highlighter-rouge">MaxConns = 5</code></strong>
The hard ceiling. At most 5 TCP connections to PostgreSQL will ever exist for this <code class="language-plaintext highlighter-rouge">(user, database)</code> key. When all 5 are acquired (in use), the 6th <code class="language-plaintext highlighter-rouge">pool.Acquire()</code> call blocks — the calling goroutine is parked and put on a wait queue inside puddle. It will be woken up when one of the 5 connections is released.</p>

<p><strong><code class="language-plaintext highlighter-rouge">MinConns = 0</code></strong>
No pre-warming. When the pool is created, zero connections to PostgreSQL are opened. The first <code class="language-plaintext highlighter-rouge">pool.Acquire()</code> on a fresh pool will always open a new TCP connection. This is lazy initialization — no connections consumed for idle users.</p>

<p><strong><code class="language-plaintext highlighter-rouge">MaxConnLifetime = 1 hour</code></strong>
A background goroutine inside pgxpool periodically checks the age of every connection. Any connection older than 1 hour is closed and removed from the pool, even if it is idle and healthy. This forces periodic reconnection, which is important for:</p>

<ul>
  <li>Picking up PostgreSQL configuration changes</li>
  <li>Rotating credentials if needed</li>
  <li>Preventing connections from being silently dropped by firewalls or load balancers that kill long-lived idle connections</li>
</ul>

<p><strong><code class="language-plaintext highlighter-rouge">MaxConnIdleTime = 30 minutes</code></strong>
Any connection that has been idle (not acquired by anyone) for 30 minutes is closed. This prevents the pool from holding open connections during quiet periods.</p>

<p><strong><code class="language-plaintext highlighter-rouge">HealthCheckPeriod = 1 minute</code></strong>
Every minute, a background goroutine runs through all idle connections and closes any that have outlived <code class="language-plaintext highlighter-rouge">MaxConnLifetime</code> or <code class="language-plaintext highlighter-rouge">MaxConnIdleTime</code>. This check is based on connection age and idle time rather than on a live ping, so a connection that died silently (PostgreSQL restarted, network blip) can still be sitting in the pool when <code class="language-plaintext highlighter-rouge">Acquire()</code> is called.</p>

<p><strong><code class="language-plaintext highlighter-rouge">ConnectTimeout = 5 seconds</code></strong>
When a new TCP connection to PostgreSQL needs to be opened, it must complete the entire startup handshake (TCP connect + SCRAM auth + <code class="language-plaintext highlighter-rouge">ReadyForQuery</code>) within 5 seconds. If it takes longer, the connection attempt is aborted and an error is returned.</p>
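<p>To make the two expiry settings concrete, here is a sketch of the decision a periodic check could apply to each idle connection. It mirrors the 1 hour and 30 minute values above but is not pgxpool’s code; <code class="language-plaintext highlighter-rouge">idleConn</code>, <code class="language-plaintext highlighter-rouge">shouldClose</code> and <code class="language-plaintext highlighter-rouge">shouldCloseAfter</code> are hypothetical names.</p>

```go
package main

import (
	"fmt"
	"time"
)

const (
	maxConnLifetime = time.Hour
	maxConnIdleTime = 30 * time.Minute
)

// idleConn is a stand-in for the bookkeeping a pool keeps per connection.
type idleConn struct {
	createdAt  time.Time
	lastUsedAt time.Time
}

// shouldClose reports whether an idle connection is past either limit.
func shouldClose(c idleConn, now time.Time) bool {
	tooOld := now.Sub(c.createdAt) > maxConnLifetime
	idleTooLong := now.Sub(c.lastUsedAt) > maxConnIdleTime
	return tooOld || idleTooLong
}

// shouldCloseAfter is a convenience wrapper that takes the connection's age
// and idle time in minutes.
func shouldCloseAfter(ageMinutes, idleMinutes int) bool {
	now := time.Now()
	c := idleConn{
		createdAt:  now.Add(-time.Duration(ageMinutes) * time.Minute),
		lastUsedAt: now.Add(-time.Duration(idleMinutes) * time.Minute),
	}
	return shouldClose(c, now)
}

func main() {
	fmt.Println("61 min old, used 1 min ago:", shouldCloseAfter(61, 1))
	fmt.Println("45 min old, idle 31 min:   ", shouldCloseAfter(45, 31))
	fmt.Println("45 min old, idle 5 min:    ", shouldCloseAfter(45, 5))
}
```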

<h3 id="layer-4-pgxpoolconn-as-a-single-exclusive-handle">Layer 4: <code class="language-plaintext highlighter-rouge">*pgxpool.Conn</code> as a Single Exclusive Handle</h3>

<p>When <code class="language-plaintext highlighter-rouge">pool.Acquire(ctx)</code> returns, it gives back a <code class="language-plaintext highlighter-rouge">*pgxpool.Conn</code>. This is not a connection itself — it is a <strong>handle</strong> that:</p>

<ol>
  <li>Wraps the underlying <code class="language-plaintext highlighter-rouge">*pgx.Conn</code></li>
  <li>Marks that connection as <strong>acquired</strong> (in use) in the pool’s internal state</li>
  <li>Provides a <code class="language-plaintext highlighter-rouge">Release()</code> method to return it</li>
</ol>

<div class="language-text highlighter-rouge"><div class="highlight"><pre class="highlight"><code>*pgxpool.Conn
├── p      *pgxpool.Pool        (pointer back to parent pool)
└── res    *puddle.Resource     (the resource being held)
    └── value *pgxpool.connResource
              └── conn *pgx.Conn   (the actual connection)
</code></pre></div></div>

<p>The connection is <strong>exclusively held</strong> — the pool will not give the same underlying <code class="language-plaintext highlighter-rouge">*pgx.Conn</code> to any other goroutine while one <code class="language-plaintext highlighter-rouge">*pgxpool.Conn</code> holds it. This is what makes it safe for <code class="language-plaintext highlighter-rouge">pc.bf.Send()</code> and <code class="language-plaintext highlighter-rouge">pc.bf.Receive()</code> to call directly into the TCP socket without any additional locking. The pool’s ownership model guarantees single-writer, single-reader.</p>

<h4 id="how-gprxy-digs-through-the-layers-to-get-the-raw-socket">How gprxy digs through the layers to get the raw socket</h4>

<div class="language-go highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">underlyingConn</span> <span class="o">:=</span> <span class="n">pc</span><span class="o">.</span><span class="n">poolConn</span><span class="o">.</span><span class="n">Conn</span><span class="p">()</span><span class="o">.</span><span class="n">PgConn</span><span class="p">()</span><span class="o">.</span><span class="n">Conn</span><span class="p">()</span>
<span class="n">bf</span> <span class="o">:=</span> <span class="n">pgproto3</span><span class="o">.</span><span class="n">NewFrontend</span><span class="p">(</span><span class="n">pgproto3</span><span class="o">.</span><span class="n">NewChunkReader</span><span class="p">(</span><span class="n">underlyingConn</span><span class="p">),</span> <span class="n">underlyingConn</span><span class="p">)</span>
</code></pre></div></div>

<p>The chain of unwrapping:</p>

<div class="language-text highlighter-rouge"><div class="highlight"><pre class="highlight"><code>pc.poolConn              *pgxpool.Conn
  .Conn()                *pgx.Conn         (higher-level pgx connection)
    .PgConn()            *pgconn.PgConn    (low-level wire protocol connection)
      .Conn()            net.Conn          (raw TCP socket)
</code></pre></div></div>

<p>gprxy bypasses all of pgx’s query execution machinery and talks directly to the raw TCP socket. It wraps it with a <code class="language-plaintext highlighter-rouge">pgproto3.Frontend</code> to get PostgreSQL wire protocol serialization/deserialization. This is why gprxy can forward arbitrary protocol messages — because it is working at the wire level, not through pgx’s <code class="language-plaintext highlighter-rouge">Query()</code>/<code class="language-plaintext highlighter-rouge">Exec()</code> API.</p>

<p>However, this creates a subtle tension: pgx still thinks it “owns” this connection and its internal state machine. gprxy is now sending bytes on the socket that pgx does not know about. This is why the <code class="language-plaintext highlighter-rouge">fullResetBeforeRelease</code> step is critical — pgx’s internal state may be out of sync with the actual PostgreSQL session state after gprxy forwards arbitrary queries, and <code class="language-plaintext highlighter-rouge">ROLLBACK</code> + <code class="language-plaintext highlighter-rouge">DISCARD ALL</code> restores the PostgreSQL session to a clean state before pgx takes back ownership.</p>

<h3 id="layer-5-connection-lifecycle-state-machine">Layer 5: Connection Lifecycle State Machine</h3>

<p>For a single physical PostgreSQL connection managed by <code class="language-plaintext highlighter-rouge">pgxpool</code>, the happy-path lifecycle looks like this:</p>

<p><img src="/assets/gprxy/Untitled-2026-03-07-2103.png" alt="Connection lifecycle state machine" /></p>

<p>Important nuance: after <code class="language-plaintext highlighter-rouge">pgxpool.NewWithConfig()</code>, the pool object exists immediately, but with <code class="language-plaintext highlighter-rouge">MinConns=0</code> there may be zero physical PostgreSQL connections inside it until the first <code class="language-plaintext highlighter-rouge">Acquire()</code> needs one. Also, <code class="language-plaintext highlighter-rouge">Release()</code> does not always transition to <code class="language-plaintext highlighter-rouge">[IDLE]</code> — if the connection is broken, expired, or otherwise not reusable, pgxpool closes it instead of returning it to the free list.</p>

<h3 id="layer-6-the-acquire-flow">Layer 6: The Acquire Flow</h3>

<div class="language-go highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">connection</span><span class="p">,</span> <span class="n">err</span> <span class="o">:=</span> <span class="n">pool</span><span class="o">.</span><span class="n">Acquire</span><span class="p">(</span><span class="n">context</span><span class="o">.</span><span class="n">Background</span><span class="p">())</span>
<span class="k">if</span> <span class="n">err</span> <span class="o">!=</span> <span class="no">nil</span> <span class="p">{</span>
	<span class="k">return</span> <span class="no">nil</span><span class="p">,</span> <span class="n">logger</span><span class="o">.</span><span class="n">Errorf</span><span class="p">(</span><span class="s">"error while acquiring connection from the database pool: %w"</span><span class="p">,</span> <span class="n">err</span><span class="p">)</span>
<span class="p">}</span>

<span class="n">err</span> <span class="o">=</span> <span class="n">connection</span><span class="o">.</span><span class="n">Ping</span><span class="p">(</span><span class="n">context</span><span class="o">.</span><span class="n">Background</span><span class="p">())</span>
<span class="k">if</span> <span class="n">err</span> <span class="o">!=</span> <span class="no">nil</span> <span class="p">{</span>
	<span class="n">connection</span><span class="o">.</span><span class="n">Release</span><span class="p">()</span>
	<span class="k">return</span> <span class="no">nil</span><span class="p">,</span> <span class="n">logger</span><span class="o">.</span><span class="n">Errorf</span><span class="p">(</span><span class="s">"could not ping database: %w"</span><span class="p">,</span> <span class="n">err</span><span class="p">)</span>
<span class="p">}</span>

<span class="k">return</span> <span class="n">connection</span><span class="p">,</span> <span class="no">nil</span>
</code></pre></div></div>

<p><code class="language-plaintext highlighter-rouge">pool.Acquire(ctx)</code> internally does:</p>

<ol>
  <li>Lock puddle’s internal mutex</li>
  <li>Check the free list (idle connections):
    <ul>
      <li>If found: remove from free list, mark as acquired, return it</li>
    </ul>
  </li>
  <li>If free list empty, check total count vs <code class="language-plaintext highlighter-rouge">MaxConns</code>:
    <ul>
      <li>If below <code class="language-plaintext highlighter-rouge">MaxConns</code>: unlock, open a new connection (dial + SCRAM auth), lock again, add to acquired set, return</li>
      <li>If at <code class="language-plaintext highlighter-rouge">MaxConns</code>: add this goroutine to a wait queue (<code class="language-plaintext highlighter-rouge">sync.Cond.Wait()</code>), unlock, sleep</li>
      <li>When another goroutine calls <code class="language-plaintext highlighter-rouge">Release()</code>: it calls <code class="language-plaintext highlighter-rouge">Cond.Signal()</code>, waking one waiter, which retries the acquire</li>
    </ul>
  </li>
</ol>
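<p>The steps above can be sketched as a toy pool. This is not puddle’s actual code: <code class="language-plaintext highlighter-rouge">miniPool</code> and its fields are hypothetical names, connections are plain integers, and the sketch skips the unlock-while-dialing step and context cancellation that a real pool needs.</p>

```go
package main

import (
	"fmt"
	"sync"
)

// miniPool is a toy stand-in for a connection pool (hypothetical, not puddle).
// "Connections" are just integer IDs so the sketch runs without a database.
type miniPool struct {
	mu       sync.Mutex
	cond     *sync.Cond
	idle     []int // free list
	total    int   // idle + acquired
	maxConns int
	nextID   int
}

func newMiniPool(maxConns int) *miniPool {
	p := &miniPool{maxConns: maxConns}
	p.cond = sync.NewCond(&p.mu)
	return p
}

// Acquire returns an idle connection, creates one if below maxConns,
// or sleeps until Release signals and then retries.
func (p *miniPool) Acquire() int {
	p.mu.Lock()
	defer p.mu.Unlock()
	for {
		if n := len(p.idle); n > 0 {
			c := p.idle[n-1]
			p.idle = p.idle[:n-1]
			return c
		}
		if p.total < p.maxConns {
			p.total++
			p.nextID++
			return p.nextID // a real pool would unlock and dial here
		}
		p.cond.Wait() // releases mu while asleep, re-locks on wake
	}
}

// Release puts the connection back on the free list and wakes one waiter.
func (p *miniPool) Release(c int) {
	p.mu.Lock()
	p.idle = append(p.idle, c)
	p.mu.Unlock()
	p.cond.Signal()
}

func main() {
	p := newMiniPool(2)
	a, b := p.Acquire(), p.Acquire()
	done := make(chan int)
	go func() { done <- p.Acquire() }() // pool is at maxConns, so this waits
	p.Release(a)
	fmt.Println(b, <-done)
}
```

<p>The <code class="language-plaintext highlighter-rouge">for</code> loop around <code class="language-plaintext highlighter-rouge">Wait()</code> is the "retries the acquire" part: a woken goroutine re-checks the free list instead of assuming a connection is waiting for it.</p>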

<p>After <code class="language-plaintext highlighter-rouge">Acquire()</code> returns, gprxy calls <code class="language-plaintext highlighter-rouge">Ping()</code>. This is an extra safety net on top of the health check background goroutine. Between the health check goroutine’s last check (up to 1 minute ago) and right now, the connection could have gone stale. <code class="language-plaintext highlighter-rouge">Ping()</code> sends a minimal no-op to PostgreSQL and waits for a response.</p>

<p>If <code class="language-plaintext highlighter-rouge">Ping()</code> fails, <code class="language-plaintext highlighter-rouge">Release()</code> is called immediately, but this does not put a broken connection back into the pool. pgxpool’s <code class="language-plaintext highlighter-rouge">Release()</code> inspects the connection first: if it is closed, still busy, or not in the idle transaction state, pgxpool destroys it instead of returning it to the free list. A failed ping normally leaves the underlying connection closed, so after <code class="language-plaintext highlighter-rouge">connection.Release()</code> the pool size decreases by 1, and a later <code class="language-plaintext highlighter-rouge">Acquire()</code> that finds no idle connection will open a fresh one.</p>

<h3 id="layer-7-the-logpoolstats-observation-window">Layer 7: The <code class="language-plaintext highlighter-rouge">LogPoolStats</code> Observation Window</h3>

<p>After every successful <code class="language-plaintext highlighter-rouge">AcquireConnection</code>, gprxy calls:</p>

<div class="language-go highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">pool</span><span class="o">.</span><span class="n">LogPoolStats</span><span class="p">(</span><span class="n">user</span><span class="p">,</span> <span class="n">database</span><span class="p">)</span>
</code></pre></div></div>

<div class="language-go highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">func</span> <span class="n">LogPoolStats</span><span class="p">(</span><span class="n">user</span><span class="p">,</span> <span class="n">database</span> <span class="kt">string</span><span class="p">)</span> <span class="p">{</span>
	<span class="n">stats</span> <span class="o">:=</span> <span class="n">pool</span><span class="o">.</span><span class="n">Stat</span><span class="p">()</span>
	<span class="n">logger</span><span class="o">.</span><span class="n">Debug</span><span class="p">(</span><span class="s">"pool stats for [%s,%s] - total: %d, acquired: %d, idle: %d"</span><span class="p">,</span> <span class="n">user</span><span class="p">,</span> <span class="n">database</span><span class="p">,</span> <span class="n">stats</span><span class="o">.</span><span class="n">TotalConns</span><span class="p">(),</span> <span class="n">stats</span><span class="o">.</span><span class="n">AcquiredConns</span><span class="p">(),</span> <span class="n">stats</span><span class="o">.</span><span class="n">IdleConns</span><span class="p">())</span>
<span class="p">}</span>
</code></pre></div></div>

<p><code class="language-plaintext highlighter-rouge">pool.Stat()</code> returns a snapshot of:</p>

<ul>
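  <li><code class="language-plaintext highlighter-rouge">TotalConns()</code> counts every connection the pool currently owns (acquired + idle + still being constructed), capped at <code class="language-plaintext highlighter-rouge">MaxConns</code>, which is 5 here</li>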
  <li><code class="language-plaintext highlighter-rouge">AcquiredConns()</code> — currently held by goroutines (including this one just acquired)</li>
  <li><code class="language-plaintext highlighter-rouge">IdleConns()</code> — back in free list, available immediately</li>
</ul>

<p><code class="language-plaintext highlighter-rouge">TotalConns = AcquiredConns + IdleConns + ConstructingConns</code>. Whenever no connection is mid-dial, this reduces to total = acquired + idle. For example:</p>

<div class="language-text highlighter-rouge"><div class="highlight"><pre class="highlight"><code>pool stats for [alice@example.com,gprxy_test] - total: 3, acquired: 2, idle: 1
</code></pre></div></div>

<p>This means 3 real TCP connections exist to PostgreSQL, 2 are in use by active client connections, and 1 is sitting idle waiting to be acquired.</p>

<h3 id="layer-8-release-and-reset">Layer 8: Release and Reset</h3>

<p>When a client disconnects, the defer block in <code class="language-plaintext highlighter-rouge">handleConnection</code> runs:</p>

<div class="language-go highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">if</span> <span class="n">pc</span><span class="o">.</span><span class="n">poolConn</span> <span class="o">!=</span> <span class="no">nil</span> <span class="p">{</span>
	<span class="n">err</span> <span class="o">:=</span> <span class="n">fullResetBeforeRelease</span><span class="p">(</span><span class="n">pc</span><span class="p">)</span>
	<span class="k">if</span> <span class="n">err</span> <span class="o">!=</span> <span class="no">nil</span> <span class="p">{</span>
		<span class="n">logger</span><span class="o">.</span><span class="n">Error</span><span class="p">(</span><span class="s">"error while releasing connection back to the pool: %v"</span><span class="p">,</span> <span class="n">err</span><span class="p">)</span>
	<span class="p">}</span>
	<span class="n">pc</span><span class="o">.</span><span class="n">poolConn</span><span class="o">.</span><span class="n">Release</span><span class="p">()</span>
<span class="p">}</span>
</code></pre></div></div>

<p><code class="language-plaintext highlighter-rouge">fullResetBeforeRelease</code> runs two SQL commands <strong>through pgx’s normal execution path</strong> (not through <code class="language-plaintext highlighter-rouge">pc.bf</code>), because at this point gprxy is done forwarding arbitrary messages:</p>

<div class="language-go highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">func</span> <span class="n">fullResetBeforeRelease</span><span class="p">(</span><span class="n">connection</span> <span class="o">*</span><span class="n">Connection</span><span class="p">)</span> <span class="kt">error</span> <span class="p">{</span>
	<span class="n">_</span><span class="p">,</span> <span class="n">err</span> <span class="o">:=</span> <span class="n">connection</span><span class="o">.</span><span class="n">poolConn</span><span class="o">.</span><span class="n">Exec</span><span class="p">(</span><span class="n">context</span><span class="o">.</span><span class="n">Background</span><span class="p">(),</span> <span class="s">"ROLLBACK"</span><span class="p">)</span>
	<span class="c">// ...</span>
	<span class="n">_</span><span class="p">,</span> <span class="n">err</span> <span class="o">=</span> <span class="n">connection</span><span class="o">.</span><span class="n">poolConn</span><span class="o">.</span><span class="n">Exec</span><span class="p">(</span><span class="n">context</span><span class="o">.</span><span class="n">Background</span><span class="p">(),</span> <span class="s">"DISCARD ALL"</span><span class="p">)</span>
	<span class="c">// ...</span>
	<span class="k">return</span> <span class="no">nil</span>
<span class="p">}</span>
</code></pre></div></div>

<p><code class="language-plaintext highlighter-rouge">ROLLBACK</code> rolls back any open transaction. Without this, if a client died mid-transaction (after <code class="language-plaintext highlighter-rouge">BEGIN</code> but before <code class="language-plaintext highlighter-rouge">COMMIT</code>), the pool connection would return to the free list still inside that transaction. Any rows it had locked would remain locked, and the next client to get this connection would run its first query inside someone else’s transaction.</p>

<p><code class="language-plaintext highlighter-rouge">DISCARD ALL</code> is a PostgreSQL supercommand that resets everything about the session in one round trip:</p>

<ul>
  <li>All <code class="language-plaintext highlighter-rouge">SET</code> variables back to defaults</li>
  <li>All named prepared statements deallocated</li>
  <li>All open cursors closed</li>
  <li>All <code class="language-plaintext highlighter-rouge">LISTEN</code> subscriptions removed</li>
  <li>All advisory locks released</li>
  <li>All cached query plans discarded</li>
</ul>

<p>After these two commands, the pool connection’s PostgreSQL session state is effectively equivalent to that of a fresh connection. pgx’s internal state may still be slightly stale (it didn’t observe gprxy’s arbitrary wire-level queries), but the actual PostgreSQL session is clean.</p>

<p>Then <code class="language-plaintext highlighter-rouge">poolConn.Release()</code> is called. Inside pgxpool, this:</p>

<ol>
  <li>Locks puddle’s mutex</li>
  <li>Moves the resource from the acquired set back to the idle free list</li>
  <li>Calls <code class="language-plaintext highlighter-rouge">Cond.Signal()</code> to wake any goroutine blocking on <code class="language-plaintext highlighter-rouge">Acquire()</code></li>
  <li>Unlocks</li>
</ol>

<p>The connection is now available for the next client that calls <code class="language-plaintext highlighter-rouge">AcquireConnection</code>.</p>

<h2 id="data-structure-map">Data Structure Map</h2>

<div class="language-text highlighter-rouge"><div class="highlight"><pre class="highlight"><code>PROCESS GLOBAL STATE
─────────────────────────────────────────────────────────────────────

poolManager: map[poolKey]*pgxpool.Pool
│
├── key: {user:"alice@example.com", database:"gprxy_test"}
│   └── *pgxpool.Pool
│       ├── MaxConns: 5
│       ├── free list (idle):
│       │   └── *pgx.Conn [TCP socket to pg:5432, PID=1001, SK=11111]
│       ├── acquired set (in use):
│       │   ├── *pgx.Conn [TCP socket to pg:5432, PID=1002, SK=22222]  ← held by alice session 1
│       │   └── *pgx.Conn [TCP socket to pg:5432, PID=1003, SK=33333]  ← held by alice session 2
│       └── background goroutines:
│           ├── health checker (every 1 minute)
│           └── idle reaper (checks MaxConnIdleTime/MaxConnLifetime)
│
├── key: {user:"bob@example.com", database:"gprxy_test"}
│   └── *pgxpool.Pool
│       ├── MaxConns: 5
│       ├── free list (idle): [empty]
│       └── acquired set (in use):
│           └── *pgx.Conn [TCP socket to pg:5432, PID=1004, SK=44444]  ← held by bob session 1
│
└── poolMutex: sync.RWMutex  (guards the map above)


PER-CLIENT-CONNECTION STATE (one per active client goroutine)
─────────────────────────────────────────────────────────────────────

*Connection (alice session 1)
├── conn:     net.Conn          → TCP socket to alice's psql process
├── poolConn: *pgxpool.Conn     → exclusively holds PID=1002 conn above
├── bf:       *pgproto3.Frontend → wired to PID=1002's raw TCP socket
├── user:     "alice@example.com"
├── db:       "gprxy_test"
└── key:      BackendKeyData{ProcessID:1002, SecretKey:22222}
               ↑ also registered in server.activeConnections
</code></pre></div></div>

<h2 id="key-design-properties-and-tradeoffs">Key Design Properties and Tradeoffs</h2>

<p><strong>1. Per-(client-user, database) pools, not a single global pool</strong></p>

<p>Each unique <code class="language-plaintext highlighter-rouge">(user, database)</code> pair gets its own pool. This means Alice and Bob each have their own separate bucket of connections. This has pros and cons:</p>

<ul>
  <li>Pro: isolation — Alice exhausting her 5 connections doesn’t block Bob</li>
  <li>Con: the worst-case total connections to PostgreSQL = <code class="language-plaintext highlighter-rouge">(number of distinct users) × (number of distinct databases) × 5</code>. With 100 users each connecting to 2 databases, that’s 100 × 2 × 5 = 1000 PostgreSQL backend processes, ten times PostgreSQL’s default <code class="language-plaintext highlighter-rouge">max_connections</code> of 100.</li>
</ul>

<p>A shared global pool per database would be more scalable, but then all users compete for the same connections.</p>
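<p>To make the arithmetic concrete, here is a tiny helper that computes the upper bound. The function name <code class="language-plaintext highlighter-rouge">worstCaseBackends</code> is hypothetical and not part of gprxy; it only restates the formula above.</p>

```go
package main

import "fmt"

// worstCaseBackends returns the upper bound on PostgreSQL backend processes
// when every (user, database) pair gets its own pool of maxConnsPerPool.
// Hypothetical helper for illustration only.
func worstCaseBackends(users, databases, maxConnsPerPool int) int {
	return users * databases * maxConnsPerPool
}

func main() {
	// The example from the text: 100 users, 2 databases, MaxConns = 5.
	fmt.Println(worstCaseBackends(100, 2, 5))
}
```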

<p><strong>2. The pool key uses client identity, not service account</strong></p>

<p>All connections in a pool connect to PostgreSQL as <code class="language-plaintext highlighter-rouge">gprxy_admin</code> (from <code class="language-plaintext highlighter-rouge">BuildConnectionString</code>), but the pool is keyed on the client user (<code class="language-plaintext highlighter-rouge">alice@example.com</code>). This means two different clients who both map to <code class="language-plaintext highlighter-rouge">gprxy_admin</code> still have completely separate pools. This provides isolation but wastes connections — both pools independently open connections as the same PostgreSQL user.</p>

<p><strong>3. <code class="language-plaintext highlighter-rouge">MinConns=0</code> means cold start latency</strong></p>

<p>The first connection for any <code class="language-plaintext highlighter-rouge">(user, database)</code> pair always pays the full TCP dial + SCRAM handshake cost on the hot path (while the client is waiting). Subsequent connections are fast (free list lookup). If MinConns were 1, the pool would pre-open a connection at creation time, eliminating this latency at the cost of always holding an open connection even for inactive users.</p>

<p><strong>4. The pool is never closed</strong></p>

<p>There is no code path in gprxy that calls <code class="language-plaintext highlighter-rouge">pool.Close()</code>. Once a pool exists in <code class="language-plaintext highlighter-rouge">poolManager</code>, it lives until the process exits. The idle reaper and MaxConnIdleTime drain unused connections from within each pool, but the pool object, its background goroutines, and its entry in <code class="language-plaintext highlighter-rouge">poolManager</code> persist. This is a minor memory and goroutine leak for transient users: if 10,000 different users each connect once, <code class="language-plaintext highlighter-rouge">poolManager</code> will hold 10,000 entries (each a small pool object with zero connections) forever.</p>

<p>Now that we have a detailed understanding of connection pooling, let’s get back to the authenticated user who now makes a request for a connection from the pool:</p>

<p><strong>Case A — Pool for this <code class="language-plaintext highlighter-rouge">(user, database)</code> key already exists:</strong> Read lock acquired, pool found, read lock released. No write lock, no allocation, extremely fast. The existing <code class="language-plaintext highlighter-rouge">*pgxpool.Pool</code> object is returned.</p>

<p><strong>Case B — Pool does not exist yet</strong> (first connection for this user+database combination):
Write lock acquired. The pool configuration struct is built and <code class="language-plaintext highlighter-rouge">pgxpool.NewWithConfig</code> is called. With <code class="language-plaintext highlighter-rouge">MinConns=0</code>, the pool does not open any connections to PostgreSQL at creation time. It creates the management infrastructure (the pool object, its background goroutines for health checking, etc.) but no actual TCP connections to PostgreSQL yet. The pool is stored in the global <code class="language-plaintext highlighter-rouge">poolManager</code> map.</p>
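<p>Cases A and B together form a read-mostly get-or-create pattern. Below is a minimal sketch of it, with assumptions: <code class="language-plaintext highlighter-rouge">manager</code>, <code class="language-plaintext highlighter-rouge">getOrCreate</code>, and <code class="language-plaintext highlighter-rouge">fakePool</code> are hypothetical names (<code class="language-plaintext highlighter-rouge">fakePool</code> stands in for <code class="language-plaintext highlighter-rouge">*pgxpool.Pool</code> so it runs without a database), and the re-check under the write lock is something the sketch adds; whether gprxy does the same is not shown here.</p>

```go
package main

import (
	"fmt"
	"sync"
)

type poolKey struct{ user, database string }

// fakePool stands in for *pgxpool.Pool so the sketch needs no database.
type fakePool struct{ key poolKey }

// manager is a hypothetical equivalent of poolManager plus poolMutex.
type manager struct {
	mu    sync.RWMutex
	pools map[poolKey]*fakePool
}

func (m *manager) getOrCreate(k poolKey) *fakePool {
	// Case A: fast path, shared read lock only.
	m.mu.RLock()
	p, ok := m.pools[k]
	m.mu.RUnlock()
	if ok {
		return p
	}

	// Case B: slow path, exclusive write lock.
	m.mu.Lock()
	defer m.mu.Unlock()
	// Re-check: another goroutine may have created the pool
	// between RUnlock and Lock (an assumption of this sketch).
	if existing, ok := m.pools[k]; ok {
		return existing
	}
	p = &fakePool{key: k} // real code would call pgxpool.NewWithConfig here
	m.pools[k] = p
	return p
}

func main() {
	m := &manager{pools: make(map[poolKey]*fakePool)}
	k := poolKey{user: "alice@example.com", database: "gprxy_test"}
	first := m.getOrCreate(k)  // Case B: creates the pool
	second := m.getOrCreate(k) // Case A: finds it under the read lock
	fmt.Println(first == second)
}
```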

<p>Then comes the part of making an actual connection to the PostgreSQL database. The pool checks its internal free list:</p>

<ul>
  <li>If an idle connection exists: it returns it immediately and marks it as in-use.</li>
  <li>If no idle connection exists but the pool is below <code class="language-plaintext highlighter-rouge">MaxConns</code> (<code class="language-plaintext highlighter-rouge">5</code>): it opens a new TCP connection to PostgreSQL, performs the full PostgreSQL startup handshake (this includes SCRAM-SHA-256 authentication using the DSN credentials), and returns the resulting connection.</li>
  <li>If the pool is at <code class="language-plaintext highlighter-rouge">MaxConns</code> with none idle: it blocks until one is released by another goroutine.</li>
</ul>

<p>When a new connection is opened, the full PostgreSQL wire protocol handshake happens inside pgx transparently — StartupMessage, SCRAM exchange, AuthenticationOk, ParameterStatus, BackendKeyData, ReadyForQuery. pgx handles all of this internally and stores the resulting PID and SecretKey on the connection object. This is where the pool connection’s BackendKeyData originates.</p>

<p>After acquire, a ping is sent, a cheap <code class="language-plaintext highlighter-rouge">SELECT 1</code> equivalent at the protocol level. If the connection sat idle and the PostgreSQL server closed it server-side (e.g. <code class="language-plaintext highlighter-rouge">idle_session_timeout</code> expired, or the server restarted), the ping will fail, the connection is discarded, and <code class="language-plaintext highlighter-rouge">AcquireConnection</code> returns an error. This prevents handing the client a dead connection.</p>

<p>After this call returns, <code class="language-plaintext highlighter-rouge">pc.poolConn</code> is a live, valid, exclusively-held <code class="language-plaintext highlighter-rouge">*pgxpool.Conn</code>.</p>

<p>What the client is doing during all of this: still blocked. It sent <code class="language-plaintext highlighter-rouge">StartupMessage</code>, received <code class="language-plaintext highlighter-rouge">AuthenticationOk</code> + <code class="language-plaintext highlighter-rouge">ParameterStatus</code> messages, and is waiting for <code class="language-plaintext highlighter-rouge">BackendKeyData</code> + <code class="language-plaintext highlighter-rouge">ReadyForQuery</code>. It has no idea any of this infrastructure work is happening.</p>

<p><code class="language-plaintext highlighter-rouge">pgproto3.NewChunkReader(underlyingConn)</code> wraps the TCP socket in a buffered reader that knows how to read PostgreSQL wire protocol message boundaries. <code class="language-plaintext highlighter-rouge">pgproto3.NewFrontend(reader, underlyingConn)</code> creates a Frontend — an object that speaks PostgreSQL from the <strong>client side</strong> (i.e. it sends queries and reads responses, opposite of <code class="language-plaintext highlighter-rouge">Backend</code> which speaks from the server side).</p>

<p>This <code class="language-plaintext highlighter-rouge">*pgproto3.Frontend</code> is what <code class="language-plaintext highlighter-rouge">pc.bf</code> is. Going forward, every call to <code class="language-plaintext highlighter-rouge">pc.bf.Send(msg)</code> writes PostgreSQL wire bytes directly onto the pool connection’s TCP socket to PostgreSQL. Every call to <code class="language-plaintext highlighter-rouge">pc.bf.Receive()</code> reads PostgreSQL response bytes off that same socket.</p>

<p><code class="language-plaintext highlighter-rouge">pc.user</code> and <code class="language-plaintext highlighter-rouge">pc.db</code> are stored for logging purposes throughout the query loop.</p>

<p><code class="language-plaintext highlighter-rouge">pgconn.PID()</code> and <code class="language-plaintext highlighter-rouge">pgconn.SecretKey()</code> return values that pgx stored internally when it completed the pool connection’s startup handshake. These are the PID and SecretKey of the <strong>live PostgreSQL backend process</strong> that is holding this pool connection on the other end.</p>

<p><code class="language-plaintext highlighter-rouge">pc.key</code> is now <strong>overwritten</strong> — the dead temp connection’s key is replaced with the live pool connection’s key. This is the correct key that must be given to the client and registered in the cancel registry.</p>

<h2 id="send-backendkeydata-to-the-client">Send <code class="language-plaintext highlighter-rouge">BackendKeyData</code> to the client</h2>

<p><code class="language-plaintext highlighter-rouge">pgconn</code> here is the <code class="language-plaintext highlighter-rouge">*pgproto3.Backend</code> connected to the client — the server-side view of the client connection. <code class="language-plaintext highlighter-rouge">pgconn.Send(pc.key)</code> serializes the <code class="language-plaintext highlighter-rouge">BackendKeyData</code> struct into the PostgreSQL wire format and writes it to the client’s TCP socket:</p>

<div class="language-text highlighter-rouge"><div class="highlight"><pre class="highlight"><code>K  (1 byte  - message type 'K')
00 00 00 0C (4 bytes - length = 12)
XX XX XX XX (4 bytes - ProcessID)
YY YY YY YY (4 bytes - SecretKey)
</code></pre></div></div>

<p>The client receives this, parses it, and stores <code class="language-plaintext highlighter-rouge">(ProcessID, SecretKey)</code> internally. Its driver will use this if the user or a timeout triggers a query cancellation.</p>
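<p>As a rough illustration of the layout above, the same 13 bytes can be produced by hand. This is a sketch only: gprxy relies on pgproto3 for encoding, and <code class="language-plaintext highlighter-rouge">encodeBackendKeyData</code> is a hypothetical helper. The PID and secret key are the sample values from the data structure map.</p>

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// encodeBackendKeyData builds the 13-byte 'K' message by hand.
// Hypothetical helper; pgproto3 does this for you in real code.
func encodeBackendKeyData(pid, secretKey uint32) []byte {
	buf := make([]byte, 13)
	buf[0] = 'K'
	// The length field counts itself (4) plus the 8 payload bytes.
	binary.BigEndian.PutUint32(buf[1:5], 12)
	binary.BigEndian.PutUint32(buf[5:9], pid)
	binary.BigEndian.PutUint32(buf[9:13], secretKey)
	return buf
}

func main() {
	// Sample values: PID=1002, SecretKey=22222.
	fmt.Printf("% X\n", encodeBackendKeyData(1002, 22222))
}
```

<p>Note that the length excludes the type byte, which is why a 13-byte message carries a length of 12.</p>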

<h2 id="send-readyforquery-to-the-client">Send <code class="language-plaintext highlighter-rouge">ReadyForQuery</code> to the client</h2>

<p><code class="language-plaintext highlighter-rouge">TxStatus: 'I'</code> means “idle, not in a transaction”. This is always correct here because the pool connection was just acquired and either just opened or just had <code class="language-plaintext highlighter-rouge">DISCARD ALL</code> run on it — it is guaranteed to be in idle state.</p>

<p>Wire format:</p>

<div class="language-text highlighter-rouge"><div class="highlight"><pre class="highlight"><code>Z  (1 byte  - message type 'Z')
00 00 00 05 (4 bytes - length = 5)
49          (1 byte  - 'I' = idle; other values: 'T' = in transaction, 'E' = in failed transaction)
</code></pre></div></div>
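<p>The same six bytes written out by hand, as a sketch (<code class="language-plaintext highlighter-rouge">encodeReadyForQuery</code> is a hypothetical helper; real code would send a <code class="language-plaintext highlighter-rouge">pgproto3.ReadyForQuery</code> message instead):</p>

```go
package main

import "fmt"

// encodeReadyForQuery builds the 6-byte 'Z' message by hand.
// Hypothetical helper for illustration only.
func encodeReadyForQuery(txStatus byte) []byte {
	// Type byte, 4-byte big-endian length (5 = itself + 1 status byte), status.
	return []byte{'Z', 0, 0, 0, 5, txStatus}
}

func main() {
	fmt.Printf("% X\n", encodeReadyForQuery('I'))
}
```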

<p>This message <strong>unblocks the client</strong>. The client’s driver, which has been sitting in <code class="language-plaintext highlighter-rouge">ReadyForQuery</code> wait since it sent the <code class="language-plaintext highlighter-rouge">StartupMessage</code>, now receives this and considers the connection fully established. It returns the connection object to the application. The application can now call <code class="language-plaintext highlighter-rouge">conn.Query(...)</code> or <code class="language-plaintext highlighter-rouge">conn.Exec(...)</code>.</p>

<p>This <code class="language-plaintext highlighter-rouge">ReadyForQuery</code> is <strong>synthesized by gprxy itself</strong> — it is not forwarded from PostgreSQL. gprxy is lying to the client in the best possible way: the client believes it just completed startup with a PostgreSQL server, but the <code class="language-plaintext highlighter-rouge">ReadyForQuery</code> came from the proxy, which only sent it after confirming the pool connection is live and <code class="language-plaintext highlighter-rouge">pc.bf</code> is wired up and ready.</p>

<h2 id="register-in-the-cancel-registry">Register in the cancel registry</h2>

<p><code class="language-plaintext highlighter-rouge">registerConnection</code> stores the <code class="language-plaintext highlighter-rouge">*Connection</code> pointer in <code class="language-plaintext highlighter-rouge">server.activeConnections</code> keyed by the bit-packed <code class="language-plaintext highlighter-rouge">uint64</code> of <code class="language-plaintext highlighter-rouge">(ProcessID &lt;&lt; 32 | SecretKey)</code>. This happens <strong>after</strong> <code class="language-plaintext highlighter-rouge">ReadyForQuery</code> is sent because the cancel registry only needs to be populated before a query actually runs — and no query can run until the client receives <code class="language-plaintext highlighter-rouge">ReadyForQuery</code> and sends the next message. There is no race condition here because both happen in the same goroutine sequentially.</p>
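<p>The bit-packing can be sketched as follows. The names <code class="language-plaintext highlighter-rouge">cancelKey</code> and <code class="language-plaintext highlighter-rouge">unpackCancelKey</code> are hypothetical, and the sketch assumes both values are 32-bit unsigned integers, as they are on the wire.</p>

```go
package main

import "fmt"

// cancelKey packs the PID into the high 32 bits and the secret key into
// the low 32 bits. Hypothetical name; gprxy's helper may differ.
func cancelKey(pid, secretKey uint32) uint64 {
	// Widen to uint64 before shifting, otherwise the PID bits are lost.
	return uint64(pid)<<32 | uint64(secretKey)
}

// unpackCancelKey reverses cancelKey.
func unpackCancelKey(k uint64) (pid, secretKey uint32) {
	return uint32(k >> 32), uint32(k)
}

func main() {
	k := cancelKey(1002, 22222)
	pid, secret := unpackCancelKey(k)
	fmt.Printf("%016x pid=%d secret=%d\n", k, pid, secret)
}
```

<p>Packing both halves into one integer lets the registry use a plain <code class="language-plaintext highlighter-rouge">map[uint64]*Connection</code>, and a cancel request matches only if both the PID and the secret key agree.</p>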

<h2 id="what-happens-between-authentication-and-readyforquery">What happens between authentication and ReadyForQuery</h2>

<p><img src="/assets/gprxy/flow.png" alt="Pool acquire and session handoff" /></p>

<p>Every piece must happen in this exact order. If acquiring the pool connection fails (PostgreSQL is down, PostgreSQL rejects the new connection because its own connection limit is reached, or the ping fails), gprxy sends <code class="language-plaintext highlighter-rouge">ErrorResponse</code> with SQLSTATE <code class="language-plaintext highlighter-rouge">08006</code> to the client and the connection is torn down cleanly. The client never enters a half-connected state.</p>

<p>Now the proxy is at the query loop entry point, where the user runs queries and the proxy bridges the gap between the user and the running PostgreSQL instance.</p>

<p>Here is the complete deep dive into the entire query loop and cleanup.</p>

<h2 id="the-query-loop-entry-point">The Query Loop Entry Point</h2>

<p>After <code class="language-plaintext highlighter-rouge">handleStartupMessage</code> returns, control lands here:</p>

<div class="language-go highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">logger</span><span class="o">.</span><span class="n">Debug</span><span class="p">(</span><span class="s">"entering query handling loop"</span><span class="p">)</span>
<span class="k">for</span> <span class="p">{</span>
	<span class="n">err</span> <span class="o">:=</span> <span class="n">pc</span><span class="o">.</span><span class="n">handleMessage</span><span class="p">(</span><span class="n">pgc</span><span class="p">)</span>
	<span class="k">if</span> <span class="n">err</span> <span class="o">!=</span> <span class="no">nil</span> <span class="p">{</span>
		<span class="n">logger</span><span class="o">.</span><span class="n">Debug</span><span class="p">(</span><span class="s">"query handling terminated: %v"</span><span class="p">,</span> <span class="n">err</span><span class="p">)</span>
		<span class="k">return</span>
	<span class="p">}</span>
<span class="p">}</span>
</code></pre></div></div>

<p>This is an unconditional <code class="language-plaintext highlighter-rouge">for {}</code> — it runs forever until <code class="language-plaintext highlighter-rouge">handleMessage</code> returns a non-nil error. There is no break condition, no timeout, no idle check. The goroutine lives as long as the client is connected.</p>

<p><code class="language-plaintext highlighter-rouge">pgc</code> is the <code class="language-plaintext highlighter-rouge">*pgproto3.Backend</code> connected to the client socket. It is the same one used during startup. It is passed into every <code class="language-plaintext highlighter-rouge">handleMessage</code> call.</p>

<h2 id="anatomy-of-one-cycle">Anatomy of One Cycle</h2>

<h3 id="part-a-reading-the-client-message">Part A: Reading the client message</h3>

<div class="language-go highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">func</span> <span class="p">(</span><span class="n">pc</span> <span class="o">*</span><span class="n">Connection</span><span class="p">)</span> <span class="n">handleMessage</span><span class="p">(</span><span class="n">client</span> <span class="o">*</span><span class="n">pgproto3</span><span class="o">.</span><span class="n">Backend</span><span class="p">)</span> <span class="kt">error</span> <span class="p">{</span>
	<span class="n">msg</span><span class="p">,</span> <span class="n">err</span> <span class="o">:=</span> <span class="n">client</span><span class="o">.</span><span class="n">Receive</span><span class="p">()</span>
	<span class="k">if</span> <span class="n">err</span> <span class="o">!=</span> <span class="no">nil</span> <span class="p">{</span>
		<span class="k">return</span> <span class="n">logger</span><span class="o">.</span><span class="n">Errorf</span><span class="p">(</span><span class="s">"client receive error: %w"</span><span class="p">,</span> <span class="n">err</span><span class="p">)</span>
	<span class="p">}</span>

</code></pre></div></div>

<p><code class="language-plaintext highlighter-rouge">client.Receive()</code> reads from the client’s TCP socket. The goroutine is <strong>parked by the Go runtime</strong> here (the network poller waits on the socket), so it is not spinning and not consuming CPU. It wakes only when bytes arrive on the socket.</p>

<p><code class="language-plaintext highlighter-rouge">pgproto3.Backend.Receive()</code> reads the first byte (the message type identifier), then reads the 4-byte length field, then reads exactly <code class="language-plaintext highlighter-rouge">length - 4</code> more bytes to get the full payload, then deserializes everything into a typed Go struct and returns it.</p>

<p>If the client closes the TCP socket (<code class="language-plaintext highlighter-rouge">Ctrl+C</code>, process killed), the OS delivers an EOF to the read call; a network drop surfaces as a read error instead, possibly only after a timeout. Either way, <code class="language-plaintext highlighter-rouge">client.Receive()</code> returns an error (wrapping <code class="language-plaintext highlighter-rouge">io.EOF</code> in the clean-close case), <code class="language-plaintext highlighter-rouge">handleMessage</code> returns that error, the outer loop sees non-nil and calls <code class="language-plaintext highlighter-rouge">return</code>, which exits <code class="language-plaintext highlighter-rouge">handleConnection</code> and triggers the <code class="language-plaintext highlighter-rouge">defer</code>.</p>

<h3 id="part-b-classify-and-log">Part B: Classify and log</h3>

<div class="language-go highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">key</span> <span class="o">:=</span> <span class="n">pc</span><span class="o">.</span><span class="n">poolConn</span><span class="o">.</span><span class="n">Conn</span><span class="p">()</span><span class="o">.</span><span class="n">PgConn</span><span class="p">()</span><span class="o">.</span><span class="n">SecretKey</span><span class="p">()</span>
<span class="n">pid</span> <span class="o">:=</span> <span class="n">pc</span><span class="o">.</span><span class="n">poolConn</span><span class="o">.</span><span class="n">Conn</span><span class="p">()</span><span class="o">.</span><span class="n">PgConn</span><span class="p">()</span><span class="o">.</span><span class="n">PID</span><span class="p">()</span>
<span class="k">switch</span> <span class="n">query</span> <span class="o">:=</span> <span class="n">msg</span><span class="o">.</span><span class="p">(</span><span class="k">type</span><span class="p">)</span> <span class="p">{</span>
<span class="k">case</span> <span class="o">*</span><span class="n">pgproto3</span><span class="o">.</span><span class="n">Query</span><span class="o">:</span>
	<span class="n">logger</span><span class="o">.</span><span class="n">Info</span><span class="p">(</span><span class="s">"[%s] query: %s"</span><span class="p">,</span> <span class="n">pc</span><span class="o">.</span><span class="n">user</span><span class="p">,</span> <span class="n">query</span><span class="o">.</span><span class="n">String</span><span class="p">)</span>
	<span class="n">logger</span><span class="o">.</span><span class="n">Debug</span><span class="p">(</span><span class="s">"query connection PID=%d, secret_key=%d"</span><span class="p">,</span> <span class="n">pid</span><span class="p">,</span> <span class="n">key</span><span class="p">)</span>

<span class="k">case</span> <span class="o">*</span><span class="n">pgproto3</span><span class="o">.</span><span class="n">Parse</span><span class="o">:</span>
	<span class="n">logger</span><span class="o">.</span><span class="n">Debug</span><span class="p">(</span><span class="s">"[%s] parse: statement='%s' query='%s'"</span><span class="p">,</span> <span class="n">pc</span><span class="o">.</span><span class="n">user</span><span class="p">,</span> <span class="n">query</span><span class="o">.</span><span class="n">Name</span><span class="p">,</span> <span class="n">query</span><span class="o">.</span><span class="n">Query</span><span class="p">)</span>

<span class="k">case</span> <span class="o">*</span><span class="n">pgproto3</span><span class="o">.</span><span class="n">Describe</span><span class="o">:</span>
	<span class="n">objectType</span> <span class="o">:=</span> <span class="s">"statement"</span>
	<span class="k">if</span> <span class="n">query</span><span class="o">.</span><span class="n">ObjectType</span> <span class="o">==</span> <span class="sc">'P'</span> <span class="p">{</span>
		<span class="n">objectType</span> <span class="o">=</span> <span class="s">"portal"</span>
	<span class="p">}</span>
	<span class="n">logger</span><span class="o">.</span><span class="n">Debug</span><span class="p">(</span><span class="s">"[%s] describe: %s='%s'"</span><span class="p">,</span> <span class="n">pc</span><span class="o">.</span><span class="n">user</span><span class="p">,</span> <span class="n">objectType</span><span class="p">,</span> <span class="n">query</span><span class="o">.</span><span class="n">Name</span><span class="p">)</span>

<span class="k">case</span> <span class="o">*</span><span class="n">pgproto3</span><span class="o">.</span><span class="n">Bind</span><span class="o">:</span>
	<span class="n">paramCount</span> <span class="o">:=</span> <span class="nb">len</span><span class="p">(</span><span class="n">query</span><span class="o">.</span><span class="n">Parameters</span><span class="p">)</span>
	<span class="n">logger</span><span class="o">.</span><span class="n">Debug</span><span class="p">(</span><span class="s">"[%s] bind: portal='%s' statement='%s' params=%d"</span><span class="p">,</span> <span class="n">pc</span><span class="o">.</span><span class="n">user</span><span class="p">,</span> <span class="n">query</span><span class="o">.</span><span class="n">DestinationPortal</span><span class="p">,</span> <span class="n">query</span><span class="o">.</span><span class="n">PreparedStatement</span><span class="p">,</span> <span class="n">paramCount</span><span class="p">)</span>

<span class="k">case</span> <span class="o">*</span><span class="n">pgproto3</span><span class="o">.</span><span class="n">Execute</span><span class="o">:</span>
	<span class="n">maxRows</span> <span class="o">:=</span> <span class="s">"unlimited"</span>
	<span class="k">if</span> <span class="n">query</span><span class="o">.</span><span class="n">MaxRows</span> <span class="o">&gt;</span> <span class="m">0</span> <span class="p">{</span>
		<span class="n">maxRows</span> <span class="o">=</span> <span class="n">fmt</span><span class="o">.</span><span class="n">Sprintf</span><span class="p">(</span><span class="s">"%d"</span><span class="p">,</span> <span class="n">query</span><span class="o">.</span><span class="n">MaxRows</span><span class="p">)</span>
	<span class="p">}</span>
	<span class="n">logger</span><span class="o">.</span><span class="n">Debug</span><span class="p">(</span><span class="s">"[%s] execute: portal='%s' max_rows=%s"</span><span class="p">,</span> <span class="n">pc</span><span class="o">.</span><span class="n">user</span><span class="p">,</span> <span class="n">query</span><span class="o">.</span><span class="n">Portal</span><span class="p">,</span> <span class="n">maxRows</span><span class="p">)</span>

<span class="k">case</span> <span class="o">*</span><span class="n">pgproto3</span><span class="o">.</span><span class="n">Sync</span><span class="o">:</span>
	<span class="n">logger</span><span class="o">.</span><span class="n">Debug</span><span class="p">(</span><span class="s">"[%s] sync: transaction boundary"</span><span class="p">,</span> <span class="n">pc</span><span class="o">.</span><span class="n">user</span><span class="p">)</span>

<span class="k">case</span> <span class="o">*</span><span class="n">pgproto3</span><span class="o">.</span><span class="n">Terminate</span><span class="o">:</span>
	<span class="n">logger</span><span class="o">.</span><span class="n">Info</span><span class="p">(</span><span class="s">"[%s] client disconnecting gracefully"</span><span class="p">,</span> <span class="n">pc</span><span class="o">.</span><span class="n">user</span><span class="p">)</span>
	<span class="k">return</span> <span class="n">logger</span><span class="o">.</span><span class="n">Errorf</span><span class="p">(</span><span class="s">"client terminated"</span><span class="p">)</span>

<span class="k">default</span><span class="o">:</span>
	<span class="n">logger</span><span class="o">.</span><span class="n">Debug</span><span class="p">(</span><span class="s">"[%s] unknown message type: %T"</span><span class="p">,</span> <span class="n">pc</span><span class="o">.</span><span class="n">user</span><span class="p">,</span> <span class="n">query</span><span class="p">)</span>
<span class="p">}</span>
</code></pre></div></div>

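<p>The switch is almost entirely <strong>logging</strong>: it never modifies the message or changes routing. Each recognized message type gets its own log line, and anything else falls through to the <code class="language-plaintext highlighter-rouge">default</code> case.</p>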

<p>The <code class="language-plaintext highlighter-rouge">Terminate</code> case is the only one that <strong>exits early</strong> before forwarding. It returns an error immediately — the loop will exit. Notice it returns <em>before</em> the <code class="language-plaintext highlighter-rouge">pc.bf.Send(msg)</code> call below. The <code class="language-plaintext highlighter-rouge">Terminate</code> message is <strong>never forwarded to PostgreSQL</strong>. PostgreSQL doesn’t need to be told — the pool connection is not being closed, it is going back to the pool. PostgreSQL will only learn the session ended when gprxy runs <code class="language-plaintext highlighter-rouge">ROLLBACK</code> + <code class="language-plaintext highlighter-rouge">DISCARD ALL</code> later.</p>
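<p>The ordering is easy to check in isolation. Here is a minimal sketch of that control flow, using only the standard library. The <code class="language-plaintext highlighter-rouge">message</code> and <code class="language-plaintext highlighter-rouge">fakeBackend</code> types and the <code class="language-plaintext highlighter-rouge">handle</code> function are made up for illustration and are not gprxy’s actual types:</p>

```go
package main

import (
	"errors"
	"fmt"
)

// message is a hypothetical stand-in for pgproto3.FrontendMessage.
type message struct {
	kind string // "Query", "Parse", "Terminate", ...
}

// fakeBackend records what would have been written to PostgreSQL.
type fakeBackend struct {
	forwarded []string
}

func (b *fakeBackend) Send(m message) { b.forwarded = append(b.forwarded, m.kind) }

var errClientTerminated = errors.New("client terminated")

// handle mirrors the order of operations described above:
// inspect the message first, forward it only if the switch did not return.
func handle(b *fakeBackend, m message) error {
	switch m.kind {
	case "Terminate":
		// Return before Send: the backend never sees this message.
		return errClientTerminated
	}
	b.Send(m)
	return nil
}

func main() {
	b := &fakeBackend{}
	for _, m := range []message{{"Query"}, {"Terminate"}} {
		if err := handle(b, m); err != nil {
			fmt.Println("loop exits:", err)
			break
		}
	}
	fmt.Println("forwarded to backend:", b.forwarded)
}
```

<p>Only the <code class="language-plaintext highlighter-rouge">Query</code> ends up in the forwarded list; the <code class="language-plaintext highlighter-rouge">Terminate</code> stops at the proxy.</p>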

<p>The two lines at the top of the switch:</p>

<div class="language-go highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">key</span> <span class="o">:=</span> <span class="n">pc</span><span class="o">.</span><span class="n">poolConn</span><span class="o">.</span><span class="n">Conn</span><span class="p">()</span><span class="o">.</span><span class="n">PgConn</span><span class="p">()</span><span class="o">.</span><span class="n">SecretKey</span><span class="p">()</span>
<span class="n">pid</span> <span class="o">:=</span> <span class="n">pc</span><span class="o">.</span><span class="n">poolConn</span><span class="o">.</span><span class="n">Conn</span><span class="p">()</span><span class="o">.</span><span class="n">PgConn</span><span class="p">()</span><span class="o">.</span><span class="n">PID</span><span class="p">()</span>
</code></pre></div></div>

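<p>These are read on every single message, but they are only used in the <code class="language-plaintext highlighter-rouge">Query</code> debug log line. This is slightly wasteful: two chained accessor calls run on every message regardless of type. Moving both lookups inside the <code class="language-plaintext highlighter-rouge">Query</code> case would avoid that.</p>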
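<p>One possible way to tidy this up is to read the values only in the branch that logs them. The sketch below is not gprxy code: <code class="language-plaintext highlighter-rouge">connInfo</code> is a hypothetical stand-in for the pooled connection, and it counts lookups so the effect is visible:</p>

```go
package main

import "fmt"

// connInfo is a hypothetical stand-in for the pooled connection's PgConn.
// It counts lookups so the difference is visible.
type connInfo struct {
	pid, secretKey uint32
	lookups        int
}

func (c *connInfo) PID() uint32       { c.lookups++; return c.pid }
func (c *connInfo) SecretKey() uint32 { c.lookups++; return c.secretKey }

// logMessage only touches the connection inside the branch that logs it.
func logMessage(c *connInfo, kind string) string {
	switch kind {
	case "Query":
		return fmt.Sprintf("query connection PID=%d, secret_key=%d", c.PID(), c.SecretKey())
	default:
		return "message: " + kind
	}
}

func main() {
	c := &connInfo{pid: 4242, secretKey: 99}
	for _, kind := range []string{"Parse", "Bind", "Execute", "Sync", "Query"} {
		fmt.Println(logMessage(c, kind))
	}
	// Only the Query branch read the PID and secret key.
	fmt.Println("lookups:", c.lookups)
}
```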

<h3 id="part-c-forward-the-message-to-postgresql">Part C: Forward the message to PostgreSQL</h3>

<div class="language-go highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">err</span> <span class="o">=</span> <span class="n">pc</span><span class="o">.</span><span class="n">bf</span><span class="o">.</span><span class="n">Send</span><span class="p">(</span><span class="n">msg</span><span class="p">)</span>
<span class="k">if</span> <span class="n">err</span> <span class="o">!=</span> <span class="no">nil</span> <span class="p">{</span>
	<span class="k">return</span> <span class="n">logger</span><span class="o">.</span><span class="n">Errorf</span><span class="p">(</span><span class="s">"unable to send query to backend: %w"</span><span class="p">,</span> <span class="n">err</span><span class="p">)</span>
<span class="p">}</span>
</code></pre></div></div>

<p><code class="language-plaintext highlighter-rouge">pc.bf</code> is the <code class="language-plaintext highlighter-rouge">*pgproto3.Frontend</code> wired to the pool connection’s raw TCP socket. <code class="language-plaintext highlighter-rouge">Send(msg)</code> takes the Go struct, serializes it back into PostgreSQL wire protocol bytes, and writes them to that socket.</p>

<p>This is a <strong>complete passthrough</strong>: gprxy does no SQL parsing, no query analysis, and no modification of any kind. Because each message is decoded into a Go struct and re-serialized through pgproto3, the bytes PostgreSQL receives are not a raw copy of the client’s bytes, but they encode the same message with the same contents.</p>
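<p>To make the decode-then-re-encode step concrete, here is a small sketch of the wire format of a simple <code class="language-plaintext highlighter-rouge">Query</code> message: a <code class="language-plaintext highlighter-rouge">'Q'</code> byte, a big-endian 32-bit length that counts itself plus the payload, then the SQL text and a NUL terminator. The <code class="language-plaintext highlighter-rouge">encodeQuery</code> and <code class="language-plaintext highlighter-rouge">decodeQuery</code> helpers are hand-rolled for illustration; pgproto3’s real encoder is more general:</p>

```go
package main

import (
	"bytes"
	"encoding/binary"
	"fmt"
)

// encodeQuery builds a simple-protocol Query message:
// 'Q', an int32 length that counts itself and the payload, then the SQL and a NUL byte.
func encodeQuery(sql string) []byte {
	var length [4]byte
	binary.BigEndian.PutUint32(length[:], uint32(4+len(sql)+1))

	buf := make([]byte, 0, 1+4+len(sql)+1)
	buf = append(buf, 'Q')
	buf = append(buf, length[:]...)
	buf = append(buf, sql...)
	return append(buf, 0)
}

// decodeQuery is the inverse; it returns the SQL text carried by the message.
func decodeQuery(msg []byte) (string, error) {
	if len(msg) < 6 || msg[0] != 'Q' {
		return "", fmt.Errorf("not a Query message")
	}
	n := int(binary.BigEndian.Uint32(msg[1:5]))
	if n != len(msg)-1 || msg[len(msg)-1] != 0 {
		return "", fmt.Errorf("malformed Query message")
	}
	return string(msg[5 : len(msg)-1]), nil
}

func main() {
	original := encodeQuery("SELECT 1")
	sql, err := decodeQuery(original)
	if err != nil {
		panic(err)
	}
	reencoded := encodeQuery(sql)
	fmt.Println("same bytes after round trip:", bytes.Equal(original, reencoded))
}
```

<p>For a message this simple the round trip reproduces the input exactly. The point of the sketch is only that the proxy works on decoded messages rather than on an opaque byte stream.</p>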

<p>The two PostgreSQL query protocols work differently here:</p>

<p><strong>Simple Query</strong> — one message, one forward:</p>

<ul>
  <li>Client sends one <code class="language-plaintext highlighter-rouge">Query</code> message with raw SQL text</li>
  <li><code class="language-plaintext highlighter-rouge">handleMessage</code> is called once</li>
  <li>One <code class="language-plaintext highlighter-rouge">pc.bf.Send(query)</code> forwards it</li>
  <li><code class="language-plaintext highlighter-rouge">relayBackendResponse</code> collects all responses until <code class="language-plaintext highlighter-rouge">ReadyForQuery</code></li>
</ul>

<p><strong>Extended Query</strong> — multiple messages, multiple forwards:</p>

<ul>
  <li>Client sends <code class="language-plaintext highlighter-rouge">Parse</code>, <code class="language-plaintext highlighter-rouge">Bind</code>, <code class="language-plaintext highlighter-rouge">Describe</code>, <code class="language-plaintext highlighter-rouge">Execute</code>, <code class="language-plaintext highlighter-rouge">Sync</code> — each as a separate message</li>
  <li><code class="language-plaintext highlighter-rouge">handleMessage</code> is called <strong>once per message</strong> — 5 separate calls for 5 messages</li>
  <li>Each call forwards its one message and then calls <code class="language-plaintext highlighter-rouge">relayBackendResponse</code></li>
  <li>Only <code class="language-plaintext highlighter-rouge">Sync</code> produces a <code class="language-plaintext highlighter-rouge">ReadyForQuery</code> from PostgreSQL — the other messages get <code class="language-plaintext highlighter-rouge">ParseComplete</code>, <code class="language-plaintext highlighter-rouge">BindComplete</code>, etc.</li>
</ul>

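<p>This means an extended query cycle produces 5 <code class="language-plaintext highlighter-rouge">handleMessage</code> calls, each ending in a call to <code class="language-plaintext highlighter-rouge">relayBackendResponse</code>. Note that the relay loop, as listed in the next section, returns <code class="language-plaintext highlighter-rouge">nil</code> only when it sees <code class="language-plaintext highlighter-rouge">ReadyForQuery</code>, and in this cycle that message arrives only after <code class="language-plaintext highlighter-rouge">Sync</code>. The intermediate replies (<code class="language-plaintext highlighter-rouge">ParseComplete</code>, <code class="language-plaintext highlighter-rouge">BindComplete</code>, and so on) are forwarded to the client but do not by themselves end a relay call.</p>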
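<p>As a rough reference, the pairing between frontend messages and their usual success replies can be written down as a lookup table. This is a simplified sketch based on the PostgreSQL protocol documentation, not something gprxy contains; it ignores error paths and variants such as <code class="language-plaintext highlighter-rouge">NoData</code> or <code class="language-plaintext highlighter-rouge">PortalSuspended</code>:</p>

```go
package main

import "fmt"

// expectedReplies maps each frontend message in an extended-query cycle to the
// backend messages it normally produces on success.
var expectedReplies = map[string][]string{
	"Parse": {"ParseComplete"},
	"Bind":  {"BindComplete"},
	// Describe of a portal; NoData is sent instead when no rows are returned.
	"Describe": {"RowDescription"},
	// Execute yields zero or more DataRow messages before CommandComplete.
	"Execute": {"DataRow", "CommandComplete"},
	"Sync":    {"ReadyForQuery"},
}

// replyStream flattens the replies for a sequence of frontend messages.
func replyStream(cycle []string) []string {
	var out []string
	for _, m := range cycle {
		out = append(out, expectedReplies[m]...)
	}
	return out
}

func main() {
	cycle := []string{"Parse", "Bind", "Describe", "Execute", "Sync"}
	stream := replyStream(cycle)
	fmt.Println(stream)
	fmt.Println("ReadyForQuery is last:", stream[len(stream)-1] == "ReadyForQuery")
}
```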

<h3 id="part-d-check-for-terminate-again">Part D: Check for Terminate again</h3>

<div class="language-go highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">if</span> <span class="n">_</span><span class="p">,</span> <span class="n">ok</span> <span class="o">:=</span> <span class="n">msg</span><span class="o">.</span><span class="p">(</span><span class="o">*</span><span class="n">pgproto3</span><span class="o">.</span><span class="n">Terminate</span><span class="p">);</span> <span class="n">ok</span> <span class="p">{</span>
	<span class="k">return</span> <span class="n">logger</span><span class="o">.</span><span class="n">Errorf</span><span class="p">(</span><span class="s">"connection terminated"</span><span class="p">)</span>
<span class="p">}</span>
</code></pre></div></div>

<p>This check is dead code in practice: the <code class="language-plaintext highlighter-rouge">Terminate</code> case in the switch has already returned before execution reaches this point. It acts as a redundant safety net. If execution ever did reach here with a <code class="language-plaintext highlighter-rouge">Terminate</code> message, the check would return before calling <code class="language-plaintext highlighter-rouge">relayBackendResponse</code>, which would otherwise wait for a <code class="language-plaintext highlighter-rouge">ReadyForQuery</code> that never arrives: <code class="language-plaintext highlighter-rouge">Terminate</code> produces no reply, and the backend simply closes the connection.</p>

<h3 id="part-e-relay-all-backend-responses">Part E: Relay all backend responses</h3>

<div class="language-go highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">return</span> <span class="n">pc</span><span class="o">.</span><span class="n">relayBackendResponse</span><span class="p">(</span><span class="n">client</span><span class="p">)</span>
</code></pre></div></div>

<h2 id="relaybackendresponse-the-response-pump"><code class="language-plaintext highlighter-rouge">relayBackendResponse</code>: The Response Pump</h2>

<div class="language-go highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">func</span> <span class="p">(</span><span class="n">pc</span> <span class="o">*</span><span class="n">Connection</span><span class="p">)</span> <span class="n">relayBackendResponse</span><span class="p">(</span><span class="n">client</span> <span class="o">*</span><span class="n">pgproto3</span><span class="o">.</span><span class="n">Backend</span><span class="p">)</span> <span class="kt">error</span> <span class="p">{</span>
	<span class="k">for</span> <span class="p">{</span>
		<span class="n">msg</span><span class="p">,</span> <span class="n">err</span> <span class="o">:=</span> <span class="n">pc</span><span class="o">.</span><span class="n">bf</span><span class="o">.</span><span class="n">Receive</span><span class="p">()</span>
		<span class="k">if</span> <span class="n">err</span> <span class="o">!=</span> <span class="no">nil</span> <span class="p">{</span>
			<span class="k">return</span> <span class="n">logger</span><span class="o">.</span><span class="n">Errorf</span><span class="p">(</span><span class="s">"backend receive error: %w"</span><span class="p">,</span> <span class="n">err</span><span class="p">)</span>
		<span class="p">}</span>

		<span class="n">err</span> <span class="o">=</span> <span class="n">client</span><span class="o">.</span><span class="n">Send</span><span class="p">(</span><span class="n">msg</span><span class="p">)</span>
		<span class="k">if</span> <span class="n">err</span> <span class="o">!=</span> <span class="no">nil</span> <span class="p">{</span>
			<span class="k">return</span> <span class="n">logger</span><span class="o">.</span><span class="n">Errorf</span><span class="p">(</span><span class="s">"client send error: %w"</span><span class="p">,</span> <span class="n">err</span><span class="p">)</span>
		<span class="p">}</span>

		<span class="k">switch</span> <span class="n">msgType</span> <span class="o">:=</span> <span class="n">msg</span><span class="o">.</span><span class="p">(</span><span class="k">type</span><span class="p">)</span> <span class="p">{</span>
		<span class="k">case</span> <span class="o">*</span><span class="n">pgproto3</span><span class="o">.</span><span class="n">ReadyForQuery</span><span class="o">:</span>
			<span class="n">logger</span><span class="o">.</span><span class="n">Debug</span><span class="p">(</span><span class="s">"query completed, ready for next query (status: %c)"</span><span class="p">,</span> <span class="n">msgType</span><span class="o">.</span><span class="n">TxStatus</span><span class="p">)</span>
			<span class="k">return</span> <span class="no">nil</span>
		<span class="k">case</span> <span class="o">*</span><span class="n">pgproto3</span><span class="o">.</span><span class="n">ErrorResponse</span><span class="o">:</span>
			<span class="n">logger</span><span class="o">.</span><span class="n">Warn</span><span class="p">(</span><span class="s">"query error: %s (code: %s)"</span><span class="p">,</span> <span class="n">msgType</span><span class="o">.</span><span class="n">Message</span><span class="p">,</span> <span class="n">msgType</span><span class="o">.</span><span class="n">Code</span><span class="p">)</span>
		<span class="k">case</span> <span class="o">*</span><span class="n">pgproto3</span><span class="o">.</span><span class="n">CommandComplete</span><span class="o">:</span>
			<span class="n">logger</span><span class="o">.</span><span class="n">Debug</span><span class="p">(</span><span class="s">"command completed: %s"</span><span class="p">,</span> <span class="n">msgType</span><span class="o">.</span><span class="n">CommandTag</span><span class="p">)</span>
		<span class="p">}</span>
	<span class="p">}</span>
<span class="p">}</span>
</code></pre></div></div>

<p>This loop does two things and only two things: <strong>read from backend, write to client</strong>. Every message is forwarded unconditionally before the switch even runs. The switch is for logging and for detecting the exit condition.</p>

<p><code class="language-plaintext highlighter-rouge">pc.bf.Receive()</code> reads from the pool connection’s raw TCP socket — this is the raw PostgreSQL wire protocol coming from the database. Like <code class="language-plaintext highlighter-rouge">client.Receive()</code>, this parks the goroutine until bytes arrive.</p>

<p><code class="language-plaintext highlighter-rouge">client.Send(msg)</code> serializes the message and writes it to the client socket.</p>
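<p>The shape of the loop is easier to see with the sockets removed. Below is a minimal simulation in which the backend is a slice of replies and the client is another slice. The <code class="language-plaintext highlighter-rouge">reply</code> type and <code class="language-plaintext highlighter-rouge">relay</code> function are illustrative assumptions, not gprxy’s implementation:</p>

```go
package main

import (
	"errors"
	"fmt"
)

// reply is a hypothetical stand-in for pgproto3.BackendMessage.
type reply struct {
	kind string
}

// relay forwards replies from backend to client until ReadyForQuery.
// backend and client are plain slices here instead of sockets.
func relay(backend []reply) (client []reply, rest []reply, err error) {
	for i, msg := range backend {
		client = append(client, msg) // forward first, inspect second
		if msg.kind == "ReadyForQuery" {
			return client, backend[i+1:], nil
		}
	}
	// Running out of input models a backend read failing before ReadyForQuery.
	return client, nil, errors.New("backend receive error: stream ended")
}

func main() {
	stream := []reply{
		{"RowDescription"}, {"DataRow"}, {"CommandComplete"}, {"ReadyForQuery"},
		{"CommandComplete"}, {"ReadyForQuery"}, // replies to the next query
	}
	client, rest, err := relay(stream)
	fmt.Println("forwarded:", len(client), "left for next call:", len(rest), "err:", err)
}
```

<p>The first call stops at the first <code class="language-plaintext highlighter-rouge">ReadyForQuery</code> and leaves the remaining replies for the next call, which is the same hand-off the real loop gets from returning to <code class="language-plaintext highlighter-rouge">handleMessage</code>.</p>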

<h3 id="the-readyforquery-exit-condition">The <code class="language-plaintext highlighter-rouge">ReadyForQuery</code> exit condition</h3>

<p><code class="language-plaintext highlighter-rouge">ReadyForQuery</code> is the <strong>only message that ends the relay loop</strong>. When the loop sees it, <code class="language-plaintext highlighter-rouge">relayBackendResponse</code> returns <code class="language-plaintext highlighter-rouge">nil</code>, <code class="language-plaintext highlighter-rouge">handleMessage</code> returns that <code class="language-plaintext highlighter-rouge">nil</code> in turn, and the outer <code class="language-plaintext highlighter-rouge">for</code> loop calls <code class="language-plaintext highlighter-rouge">handleMessage</code> again.</p>

<p><code class="language-plaintext highlighter-rouge">ReadyForQuery</code> carries a <code class="language-plaintext highlighter-rouge">TxStatus</code> byte:</p>

<ul>
  <li><code class="language-plaintext highlighter-rouge">'I'</code> = idle (not in a transaction)</li>
  <li><code class="language-plaintext highlighter-rouge">'T'</code> = in an open transaction (inside a <code class="language-plaintext highlighter-rouge">BEGIN</code>…<code class="language-plaintext highlighter-rouge">COMMIT</code> block)</li>
  <li><code class="language-plaintext highlighter-rouge">'E'</code> = in a failed transaction (error occurred, needs <code class="language-plaintext highlighter-rouge">ROLLBACK</code>)</li>
</ul>

<p>gprxy forwards this status byte unchanged to the client. Client drivers use it to track transaction state.</p>
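<p>On the receiving side, interpreting the byte is a small switch. The helpers below are a sketch for illustration only; <code class="language-plaintext highlighter-rouge">safeToReuse</code> in particular is a hypothetical check and not something gprxy is known to perform:</p>

```go
package main

import "fmt"

// describeTxStatus translates the ReadyForQuery status byte into words.
func describeTxStatus(status byte) string {
	switch status {
	case 'I':
		return "idle"
	case 'T':
		return "in transaction"
	case 'E':
		return "in failed transaction"
	default:
		return "unknown"
	}
}

// safeToReuse reports whether a connection could be handed to another
// client without first ending a transaction (hypothetical helper).
func safeToReuse(status byte) bool { return status == 'I' }

func main() {
	for _, s := range []byte{'I', 'T', 'E'} {
		fmt.Printf("%c: %s, reusable=%v\n", s, describeTxStatus(s), safeToReuse(s))
	}
}
```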

<h3 id="what-errorresponse-does-and-does-not-do">What <code class="language-plaintext highlighter-rouge">ErrorResponse</code> does (and does not do)</h3>

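<p>Notice <code class="language-plaintext highlighter-rouge">ErrorResponse</code> is <strong>not</strong> a return condition. It is just logged as a warning. The loop continues reading until <code class="language-plaintext highlighter-rouge">ReadyForQuery</code> arrives. This is correct for ordinary query errors: PostgreSQL follows the <code class="language-plaintext highlighter-rouge">ErrorResponse</code> with a <code class="language-plaintext highlighter-rouge">ReadyForQuery</code> (in the extended protocol, once the client’s <code class="language-plaintext highlighter-rouge">Sync</code> has been processed), so the stream looks like:</p>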

<div class="language-text highlighter-rouge"><div class="highlight"><pre class="highlight"><code>ErrorResponse("relation 'foo' does not exist")
ReadyForQuery(TxStatus='I')
</code></pre></div></div>

<p>Both are forwarded. Both are received by the client. The error is delivered to the application through the driver’s normal error handling. The connection stays alive.</p>

<h3 id="what-commandcomplete-does">What <code class="language-plaintext highlighter-rouge">CommandComplete</code> does</h3>

<p>Also just logged. The tag string tells what happened: <code class="language-plaintext highlighter-rouge">"SELECT 5"</code>, <code class="language-plaintext highlighter-rouge">"INSERT 0 1"</code>, <code class="language-plaintext highlighter-rouge">"UPDATE 3"</code>, <code class="language-plaintext highlighter-rouge">"DELETE 0"</code>, <code class="language-plaintext highlighter-rouge">"BEGIN"</code>, <code class="language-plaintext highlighter-rouge">"COMMIT"</code>, <code class="language-plaintext highlighter-rouge">"ROLLBACK"</code>. Forwarded to client, loop continues.</p>
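<p>If a proxy ever wanted more than a log line out of the tag, the row count is the last space-separated field when one is present. The following is a small sketch with a hypothetical <code class="language-plaintext highlighter-rouge">rowsFromTag</code> helper; gprxy itself only logs the tag:</p>

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// rowsFromTag extracts the row count from a CommandComplete tag.
// The count is the last field when present ("INSERT 0 1", "SELECT 5");
// tags such as "BEGIN" carry no count, reported as ok=false.
func rowsFromTag(tag string) (rows int64, ok bool) {
	fields := strings.Fields(tag)
	if len(fields) < 2 {
		return 0, false
	}
	n, err := strconv.ParseInt(fields[len(fields)-1], 10, 64)
	if err != nil {
		return 0, false
	}
	return n, true
}

func main() {
	for _, tag := range []string{"SELECT 5", "INSERT 0 1", "UPDATE 3", "DELETE 0", "BEGIN"} {
		rows, ok := rowsFromTag(tag)
		fmt.Printf("%-12q rows=%d hasCount=%v\n", tag, rows, ok)
	}
}
```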

<h3 id="full-response-stream-examples">Full response stream examples</h3>

<p><strong><code class="language-plaintext highlighter-rouge">SELECT * FROM users</code>:</strong></p>

<div class="language-text highlighter-rouge"><div class="highlight"><pre class="highlight"><code>T  RowDescription   [id:int4, name:text, email:text]     → forwarded
D  DataRow          [1, "Alice", "alice@example.com"]     → forwarded
D  DataRow          [2, "Bob",   "bob@example.com"]       → forwarded
D  DataRow          [3, "Carol", "carol@example.com"]     → forwarded
C  CommandComplete  "SELECT 3"                            → forwarded + logged
Z  ReadyForQuery    TxStatus='I'                          → forwarded + loop exits
</code></pre></div></div>

<p><strong><code class="language-plaintext highlighter-rouge">INSERT INTO users VALUES (...)</code>:</strong></p>

<div class="language-text highlighter-rouge"><div class="highlight"><pre class="highlight"><code>C  CommandComplete  "INSERT 0 1"   → forwarded + logged
Z  ReadyForQuery    TxStatus='I'   → forwarded + loop exits
</code></pre></div></div>

<p><strong><code class="language-plaintext highlighter-rouge">SELECT * FROM nonexistent</code>:</strong></p>

<div class="language-text highlighter-rouge"><div class="highlight"><pre class="highlight"><code>E  ErrorResponse    code="42P01", message="relation 'nonexistent' does not exist"   → forwarded + logged as warn
Z  ReadyForQuery    TxStatus='I'   → forwarded + loop exits
</code></pre></div></div>

<p><strong><code class="language-plaintext highlighter-rouge">BEGIN</code> followed by <code class="language-plaintext highlighter-rouge">INSERT</code>:</strong></p>

<div class="language-text highlighter-rouge"><div class="highlight"><pre class="highlight"><code>[handleMessage called for "BEGIN"]
C  CommandComplete  "BEGIN"         → forwarded
Z  ReadyForQuery    TxStatus='T'    → forwarded + loop exits

[handleMessage called for "INSERT INTO..."]
C  CommandComplete  "INSERT 0 1"    → forwarded
Z  ReadyForQuery    TxStatus='T'    → forwarded + loop exits (still in transaction)

[handleMessage called for "COMMIT"]
C  CommandComplete  "COMMIT"        → forwarded
Z  ReadyForQuery    TxStatus='I'    → forwarded + loop exits (back to idle)
</code></pre></div></div>

<h2 id="how-the-loop-ends-all-exit-paths">How the Loop Ends: All Exit Paths</h2>

<p><code class="language-plaintext highlighter-rouge">handleMessage</code> returns a non-nil error in these cases:</p>

<table>
  <thead>
    <tr>
      <th>Cause</th>
      <th>Where</th>
      <th>Error</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <td>Client sends <code class="language-plaintext highlighter-rouge">Terminate</code></td>
      <td>switch case, line 51</td>
      <td><code class="language-plaintext highlighter-rouge">"client terminated"</code></td>
    </tr>
    <tr>
      <td>Client TCP socket closed (EOF, crash, network drop)</td>
      <td><code class="language-plaintext highlighter-rouge">client.Receive()</code>, line 15</td>
      <td>wraps <code class="language-plaintext highlighter-rouge">io.EOF</code></td>
    </tr>
    <tr>
      <td>Failed to write to client</td>
      <td><code class="language-plaintext highlighter-rouge">client.Send()</code> in relay, line 78</td>
      <td><code class="language-plaintext highlighter-rouge">"client send error"</code></td>
    </tr>
    <tr>
      <td>Failed to read from backend</td>
      <td><code class="language-plaintext highlighter-rouge">pc.bf.Receive()</code> in relay, line 74</td>
      <td><code class="language-plaintext highlighter-rouge">"backend receive error"</code></td>
    </tr>
    <tr>
      <td>Failed to forward to backend</td>
      <td><code class="language-plaintext highlighter-rouge">pc.bf.Send()</code>, line 59</td>
      <td><code class="language-plaintext highlighter-rouge">"unable to send query to backend"</code></td>
    </tr>
  </tbody>
</table>
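<p>Because the errors are wrapped, a caller could tell these paths apart with <code class="language-plaintext highlighter-rouge">errors.Is</code> instead of matching on strings. The sketch below assumes the wrapping behaves like <code class="language-plaintext highlighter-rouge">fmt.Errorf</code> with <code class="language-plaintext highlighter-rouge">%w</code>; the <code class="language-plaintext highlighter-rouge">errClientTerminated</code> sentinel and the <code class="language-plaintext highlighter-rouge">exitReason</code> function are hypothetical and do not appear in gprxy:</p>

```go
package main

import (
	"errors"
	"fmt"
	"io"
)

// errClientTerminated is a hypothetical sentinel for the Terminate path.
var errClientTerminated = errors.New("client terminated")

// exitReason groups the exit paths into expected and unexpected endings.
// Note that an io.EOF from either side of the proxy matches the same branch.
func exitReason(err error) string {
	switch {
	case err == nil:
		return "continue"
	case errors.Is(err, errClientTerminated):
		return "graceful disconnect"
	case errors.Is(err, io.EOF):
		return "peer closed the connection"
	default:
		return "I/O failure"
	}
}

// sampleErrors returns wrapped errors shaped like the rows of the table.
func sampleErrors() []error {
	return []error{
		errClientTerminated,
		fmt.Errorf("client receive error: %w", io.EOF),
		fmt.Errorf("backend receive error: %w", errors.New("connection reset")),
	}
}

func main() {
	for _, err := range sampleErrors() {
		fmt.Printf("%-45v -> %s\n", err, exitReason(err))
	}
}
```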

<p>All of them propagate to the outer loop:</p>

<div class="language-go highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">for</span> <span class="p">{</span>
	<span class="n">err</span> <span class="o">:=</span> <span class="n">pc</span><span class="o">.</span><span class="n">handleMessage</span><span class="p">(</span><span class="n">pgc</span><span class="p">)</span>
	<span class="k">if</span> <span class="n">err</span> <span class="o">!=</span> <span class="no">nil</span> <span class="p">{</span>
		<span class="n">logger</span><span class="o">.</span><span class="n">Debug</span><span class="p">(</span><span class="s">"query handling terminated: %v"</span><span class="p">,</span> <span class="n">err</span><span class="p">)</span>
		<span class="k">return</span>    <span class="c">// ← exits handleConnection</span>
	<span class="p">}</span>
<span class="p">}</span>
</code></pre></div></div>

<p><code class="language-plaintext highlighter-rouge">return</code> from <code class="language-plaintext highlighter-rouge">handleConnection</code> triggers the <code class="language-plaintext highlighter-rouge">defer</code>.</p>

<h2 id="the-defer-cleanup">The <code class="language-plaintext highlighter-rouge">defer</code> Cleanup</h2>

<p>The defer was registered at the very start of <code class="language-plaintext highlighter-rouge">handleConnection</code> before any work began:</p>

<div class="language-go highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">defer</span> <span class="k">func</span><span class="p">()</span> <span class="p">{</span>
	<span class="k">if</span> <span class="n">err</span> <span class="o">:=</span> <span class="n">pc</span><span class="o">.</span><span class="n">conn</span><span class="o">.</span><span class="n">Close</span><span class="p">();</span> <span class="n">err</span> <span class="o">!=</span> <span class="no">nil</span> <span class="p">{</span>
		<span class="n">logger</span><span class="o">.</span><span class="n">Error</span><span class="p">(</span><span class="s">"error closing client connection: %v"</span><span class="p">,</span> <span class="n">err</span><span class="p">)</span>
	<span class="p">}</span>

	<span class="k">if</span> <span class="n">pc</span><span class="o">.</span><span class="n">poolConn</span> <span class="o">!=</span> <span class="no">nil</span> <span class="p">{</span>
		<span class="n">err</span> <span class="o">:=</span> <span class="n">fullResetBeforeRelease</span><span class="p">(</span><span class="n">pc</span><span class="p">)</span>
		<span class="k">if</span> <span class="n">err</span> <span class="o">!=</span> <span class="no">nil</span> <span class="p">{</span>
			<span class="n">logger</span><span class="o">.</span><span class="n">Error</span><span class="p">(</span><span class="s">"error while releasing connection back to the pool: %v"</span><span class="p">,</span> <span class="n">err</span><span class="p">)</span>
		<span class="p">}</span>
		<span class="n">pc</span><span class="o">.</span><span class="n">poolConn</span><span class="o">.</span><span class="n">Release</span><span class="p">()</span>
		<span class="n">logger</span><span class="o">.</span><span class="n">Debug</span><span class="p">(</span><span class="s">"released connection back to pool"</span><span class="p">)</span>
	<span class="p">}</span>
	<span class="k">if</span> <span class="n">pc</span><span class="o">.</span><span class="n">key</span> <span class="o">!=</span> <span class="no">nil</span> <span class="o">&amp;&amp;</span> <span class="n">pc</span><span class="o">.</span><span class="n">server</span> <span class="o">!=</span> <span class="no">nil</span> <span class="p">{</span>
		<span class="n">pc</span><span class="o">.</span><span class="n">server</span><span class="o">.</span><span class="n">unregisterConnection</span><span class="p">(</span><span class="n">pc</span><span class="o">.</span><span class="n">key</span><span class="o">.</span><span class="n">ProcessID</span><span class="p">,</span> <span class="n">pc</span><span class="o">.</span><span class="n">key</span><span class="o">.</span><span class="n">SecretKey</span><span class="p">,</span> <span class="n">pc</span><span class="p">)</span>
	<span class="p">}</span>
	<span class="n">logger</span><span class="o">.</span><span class="n">Info</span><span class="p">(</span><span class="s">"connection closed"</span><span class="p">)</span>
<span class="p">}()</span>
</code></pre></div></div>

<p>Go’s <code class="language-plaintext highlighter-rouge">defer</code> runs <strong>even if the function panics</strong>. The three steps always execute in order.</p>
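<p>That guarantee can be demonstrated with a toy version of the function. In this sketch the step names are placeholders, and the <code class="language-plaintext highlighter-rouge">recover</code> in <code class="language-plaintext highlighter-rouge">run</code> exists only so the demo can print its result; it says nothing about whether gprxy recovers from panics:</p>

```go
package main

import "fmt"

// handleConnection is a toy version: its deferred cleanup records three steps,
// and the body can panic to show the cleanup still runs.
func handleConnection(steps *[]string, shouldPanic bool) {
	defer func() {
		*steps = append(*steps, "close client socket")
		*steps = append(*steps, "reset and release pool connection")
		*steps = append(*steps, "unregister connection")
	}()
	if shouldPanic {
		panic("unexpected failure in query loop")
	}
}

// run calls handleConnection and recovers so the demo can keep going.
func run(shouldPanic bool) (steps []string, recovered bool) {
	defer func() {
		if r := recover(); r != nil {
			recovered = true
		}
	}()
	handleConnection(&steps, shouldPanic)
	return steps, false
}

func main() {
	steps, recovered := run(true)
	fmt.Println("recovered:", recovered)
	for i, s := range steps {
		fmt.Printf("%d. %s\n", i+1, s)
	}
}
```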

<h3 id="cleanup-step-1-close-the-client-tcp-socket">Cleanup Step 1: Close the client TCP socket</h3>

<div class="language-go highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">if</span> <span class="n">err</span> <span class="o">:=</span> <span class="n">pc</span><span class="o">.</span><span class="n">conn</span><span class="o">.</span><span class="n">Close</span><span class="p">();</span> <span class="n">err</span> <span class="o">!=</span> <span class="no">nil</span> <span class="p">{</span>
    <span class="n">logger</span><span class="o">.</span><span class="n">Error</span><span class="p">(</span><span class="s">"error closing client connection: %v"</span><span class="p">,</span> <span class="n">err</span><span class="p">)</span>
<span class="p">}</span>
</code></pre></div></div>

<p><code class="language-plaintext highlighter-rouge">pc.conn</code> is the <code class="language-plaintext highlighter-rouge">net.Conn</code> to the client. <code class="language-plaintext highlighter-rouge">Close()</code> sends a TCP FIN to the client and releases the OS file descriptor. If the client has already gone away (which is often why the loop exited), <code class="language-plaintext highlighter-rouge">Close()</code> still runs. Any error it returns, such as <code class="language-plaintext highlighter-rouge">use of closed network connection</code> when the socket was already closed on gprxy’s side, is logged but does not stop cleanup.</p>

<h3 id="cleanup-step-2a-fullresetbeforerelease">Cleanup Step 2a: <code class="language-plaintext highlighter-rouge">fullResetBeforeRelease</code></h3>

<div class="language-go highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">func</span> <span class="n">fullResetBeforeRelease</span><span class="p">(</span><span class="n">connection</span> <span class="o">*</span><span class="n">Connection</span><span class="p">)</span> <span class="kt">error</span> <span class="p">{</span>
	<span class="n">_</span><span class="p">,</span> <span class="n">err</span> <span class="o">:=</span> <span class="n">connection</span><span class="o">.</span><span class="n">poolConn</span><span class="o">.</span><span class="n">Exec</span><span class="p">(</span><span class="n">context</span><span class="o">.</span><span class="n">Background</span><span class="p">(),</span> <span class="s">"ROLLBACK"</span><span class="p">)</span>
	<span class="k">if</span> <span class="n">err</span> <span class="o">!=</span> <span class="no">nil</span> <span class="p">{</span>
		<span class="n">logger</span><span class="o">.</span><span class="n">Debug</span><span class="p">(</span><span class="s">"unable to rollback: %v"</span><span class="p">,</span> <span class="n">err</span><span class="p">)</span>
		<span class="k">return</span> <span class="n">err</span>
	<span class="p">}</span>
	<span class="n">_</span><span class="p">,</span> <span class="n">err</span> <span class="o">=</span> <span class="n">connection</span><span class="o">.</span><span class="n">poolConn</span><span class="o">.</span><span class="n">Exec</span><span class="p">(</span><span class="n">context</span><span class="o">.</span><span class="n">Background</span><span class="p">(),</span> <span class="s">"DISCARD ALL"</span><span class="p">)</span>
	<span class="k">if</span> <span class="n">err</span> <span class="o">!=</span> <span class="no">nil</span> <span class="p">{</span>
		<span class="n">logger</span><span class="o">.</span><span class="n">Debug</span><span class="p">(</span><span class="s">"unable to execute discard all: %v"</span><span class="p">,</span> <span class="n">err</span><span class="p">)</span>
		<span class="k">return</span> <span class="n">err</span>
	<span class="p">}</span>
	<span class="k">return</span> <span class="no">nil</span>
<span class="p">}</span>
</code></pre></div></div>

<p>These run through pgx’s normal <code class="language-plaintext highlighter-rouge">Exec</code> path — not through <code class="language-plaintext highlighter-rouge">pc.bf</code>. pgx handles the wire protocol for these two commands internally.</p>

<p><strong><code class="language-plaintext highlighter-rouge">ROLLBACK</code>:</strong></p>

<p>If the client disconnected mid-transaction (crashed, network drop, or explicitly left a <code class="language-plaintext highlighter-rouge">BEGIN</code> open), the PostgreSQL session is still inside that transaction. Any rows it locked are still locked. Any changes are still pending. Without <code class="language-plaintext highlighter-rouge">ROLLBACK</code>, those locks would remain held until PostgreSQL’s <code class="language-plaintext highlighter-rouge">idle_in_transaction_session_timeout</code> fired (if configured) — potentially blocking other connections for minutes.</p>

<p><code class="language-plaintext highlighter-rouge">ROLLBACK</code> explicitly ends the transaction. If there is no open transaction, <code class="language-plaintext highlighter-rouge">ROLLBACK</code> still succeeds: PostgreSQL only emits a warning that no transaction is in progress. So it is safe to always run.</p>

<p><strong><code class="language-plaintext highlighter-rouge">DISCARD ALL</code>:</strong></p>

<p>This is a PostgreSQL supercommand that resets all session-level state in a single round trip. It is equivalent to running the following commands in sequence:</p>

<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">SET</span> <span class="k">SESSION</span> <span class="k">AUTHORIZATION</span> <span class="k">DEFAULT</span><span class="p">;</span>  <span class="c1">-- reset any SET ROLE/SET SESSION AUTHORIZATION</span>
<span class="k">RESET</span> <span class="k">ALL</span><span class="p">;</span>                           <span class="c1">-- all GUC parameters to defaults (timezone, search_path, etc.)</span>
<span class="k">DEALLOCATE</span> <span class="k">ALL</span><span class="p">;</span>                      <span class="c1">-- all named prepared statements</span>
<span class="k">CLOSE</span> <span class="k">ALL</span><span class="p">;</span>                           <span class="c1">-- all open cursors</span>
<span class="k">UNLISTEN</span> <span class="o">*</span><span class="p">;</span>                          <span class="c1">-- all LISTEN subscriptions</span>
<span class="k">SELECT</span> <span class="n">pg_advisory_unlock_all</span><span class="p">();</span>     <span class="c1">-- all advisory locks held by this session</span>
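<span class="n">DISCARD</span> <span class="n">PLANS</span><span class="p">;</span>                       <span class="c1">-- all cached query plans</span>
<span class="n">DISCARD</span> <span class="k">TEMP</span><span class="p">;</span>                        <span class="c1">-- all temporary tables created in this session</span>
<span class="n">DISCARD</span> <span class="n">SEQUENCES</span><span class="p">;</span>                   <span class="c1">-- cached nextval state for sequences</span>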
</code></pre></div></div>

<p>After <code class="language-plaintext highlighter-rouge">DISCARD ALL</code>, the PostgreSQL session is in a state identical to a brand new connection. The next client to acquire this pool connection gets a completely clean session — no leaked prepared statements, no inherited timezone settings, no open cursors, no stale plans.</p>

<p><strong>Why <code class="language-plaintext highlighter-rouge">ROLLBACK</code> is still needed even though <code class="language-plaintext highlighter-rouge">DISCARD ALL</code> resets everything else:</strong></p>

<p><code class="language-plaintext highlighter-rouge">DISCARD ALL</code> itself will fail if called inside an active transaction — PostgreSQL returns <code class="language-plaintext highlighter-rouge">ERROR: DISCARD ALL cannot run inside a transaction block</code>. So <code class="language-plaintext highlighter-rouge">ROLLBACK</code> must run first to ensure no active transaction exists, then <code class="language-plaintext highlighter-rouge">DISCARD ALL</code> can safely run.</p>

<h3 id="cleanup-step-2b-poolconnrelease">Cleanup Step 2b: <code class="language-plaintext highlighter-rouge">poolConn.Release()</code></h3>

<div class="language-go highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">pc</span><span class="o">.</span><span class="n">poolConn</span><span class="o">.</span><span class="n">Release</span><span class="p">()</span>
</code></pre></div></div>

<p>This returns the connection to the pgxpool free list. Internally, pgxpool hands the connection back to puddle, the generic resource pool it is built on, which:</p>

<ol>
  <li>Locks puddle’s internal mutex</li>
  <li>Moves the connection resource from the “acquired” set back to the “idle” free list</li>
  <li>Calls <code class="language-plaintext highlighter-rouge">sync.Cond.Signal()</code> to wake any goroutine that is blocked waiting on <code class="language-plaintext highlighter-rouge">pool.Acquire()</code></li>
  <li>Unlocks</li>
</ol>

<p>The connection is now available for the next client’s <code class="language-plaintext highlighter-rouge">AcquireConnection</code> call. No new TCP connection to PostgreSQL needs to be opened — the existing socket is reused.</p>
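<p>The release steps above can be sketched with a toy pool. This is only an illustration of the lock, idle list, and <code class="language-plaintext highlighter-rouge">sync.Cond</code> pattern, not puddle’s real implementation; <code class="language-plaintext highlighter-rouge">toyPool</code>, <code class="language-plaintext highlighter-rouge">acquire</code>, and <code class="language-plaintext highlighter-rouge">release</code> are hypothetical names, and connections are represented by plain integer IDs.</p>

```go
package main

import (
	"fmt"
	"sync"
)

// toyPool is a hypothetical stand-in for a connection pool.
// It is NOT puddle's real implementation; connections are just int IDs.
type toyPool struct {
	mu   sync.Mutex
	cond *sync.Cond
	idle []int // IDs of connections currently on the free list
}

func newToyPool(ids ...int) *toyPool {
	p := &toyPool{idle: ids}
	p.cond = sync.NewCond(&p.mu)
	return p
}

// acquire blocks until an idle connection is available, then removes it
// from the free list.
func (p *toyPool) acquire() int {
	p.mu.Lock()
	defer p.mu.Unlock()
	for len(p.idle) == 0 {
		p.cond.Wait() // unlocks mu while waiting, re-locks before returning
	}
	id := p.idle[len(p.idle)-1]
	p.idle = p.idle[:len(p.idle)-1]
	return id
}

// release mirrors the four steps above: lock, move the connection back to
// the idle list, signal one waiter, unlock.
func (p *toyPool) release(id int) {
	p.mu.Lock()
	p.idle = append(p.idle, id)
	p.cond.Signal()
	p.mu.Unlock()
}

func main() {
	p := newToyPool(1)
	first := p.acquire()

	got := make(chan int)
	go func() { got <- p.acquire() }() // blocks until the release below

	p.release(first)
	fmt.Println("second acquirer reused connection", <-got)
}
```

<p>The second acquirer receives the same ID that was released, which is the point of pooling: the existing connection is handed over instead of a new one being opened.</p>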

<h3 id="cleanup-step-3-unregisterconnection">Cleanup Step 3: <code class="language-plaintext highlighter-rouge">unregisterConnection</code></h3>

<div class="language-go highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">if</span> <span class="n">pc</span><span class="o">.</span><span class="n">key</span> <span class="o">!=</span> <span class="no">nil</span> <span class="o">&amp;&amp;</span> <span class="n">pc</span><span class="o">.</span><span class="n">server</span> <span class="o">!=</span> <span class="no">nil</span> <span class="p">{</span>
	<span class="n">pc</span><span class="o">.</span><span class="n">server</span><span class="o">.</span><span class="n">unregisterConnection</span><span class="p">(</span><span class="n">pc</span><span class="o">.</span><span class="n">key</span><span class="o">.</span><span class="n">ProcessID</span><span class="p">,</span> <span class="n">pc</span><span class="o">.</span><span class="n">key</span><span class="o">.</span><span class="n">SecretKey</span><span class="p">,</span> <span class="n">pc</span><span class="p">)</span>
<span class="p">}</span>
</code></pre></div></div>

<p>Inside <code class="language-plaintext highlighter-rouge">unregisterConnection</code>:</p>

<div class="language-go highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">func</span> <span class="p">(</span><span class="n">s</span> <span class="o">*</span><span class="n">Server</span><span class="p">)</span> <span class="n">unregisterConnection</span><span class="p">(</span><span class="n">processId</span><span class="p">,</span> <span class="n">secretkey</span> <span class="kt">uint32</span><span class="p">,</span> <span class="n">conn</span> <span class="o">*</span><span class="n">Connection</span><span class="p">)</span> <span class="p">{</span>
	<span class="n">s</span><span class="o">.</span><span class="n">connMutex</span><span class="o">.</span><span class="n">Lock</span><span class="p">()</span>
	<span class="k">defer</span> <span class="n">s</span><span class="o">.</span><span class="n">connMutex</span><span class="o">.</span><span class="n">Unlock</span><span class="p">()</span>
	<span class="n">key</span> <span class="o">:=</span> <span class="n">s</span><span class="o">.</span><span class="n">makeCancelKey</span><span class="p">(</span><span class="n">processId</span><span class="p">,</span> <span class="n">secretkey</span><span class="p">)</span>
	<span class="nb">delete</span><span class="p">(</span><span class="n">s</span><span class="o">.</span><span class="n">activeConnections</span><span class="p">,</span> <span class="n">key</span><span class="p">)</span>
<span class="p">}</span>
</code></pre></div></div>

<p>This takes the write lock (<code class="language-plaintext highlighter-rouge">connMutex</code>) that guards <code class="language-plaintext highlighter-rouge">server.activeConnections</code>, computes the <code class="language-plaintext highlighter-rouge">uint64</code> key <code class="language-plaintext highlighter-rouge">(ProcessID &lt;&lt; 32 | SecretKey)</code>, deletes that entry from the map, and releases the lock when the function returns.</p>

<p>After this, if any stale cancel request arrives with this connection’s <code class="language-plaintext highlighter-rouge">(PID, SecretKey)</code>, <code class="language-plaintext highlighter-rouge">getConnectionForCancelRequest</code> returns <code class="language-plaintext highlighter-rouge">exists=false</code> and the cancel is safely ignored.</p>

<p>The <code class="language-plaintext highlighter-rouge">nil</code> checks (<code class="language-plaintext highlighter-rouge">pc.key != nil &amp;&amp; pc.server != nil</code>) protect against the case where the goroutine exits during or before startup — if authentication failed, <code class="language-plaintext highlighter-rouge">pc.key</code> was never set, so there is nothing to unregister.</p>
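<p>The key layout mentioned above can be sketched as follows. The body of <code class="language-plaintext highlighter-rouge">makeCancelKey</code> is not shown in this post, so this is a guess at an equivalent standalone function based only on the <code class="language-plaintext highlighter-rouge">(ProcessID &lt;&lt; 32 | SecretKey)</code> description; <code class="language-plaintext highlighter-rouge">splitCancelKey</code> is a hypothetical helper added purely to show that the packing is reversible.</p>

```go
package main

import "fmt"

// makeCancelKey packs two 32-bit values into one uint64 map key.
// Assumed reconstruction of the layout described in the text, not the
// proxy's actual method body.
func makeCancelKey(processID, secretKey uint32) uint64 {
	// Widen to uint64 BEFORE shifting; shifting a uint32 by 32 would yield 0.
	return uint64(processID)<<32 | uint64(secretKey)
}

// splitCancelKey is a hypothetical inverse, shown only for illustration.
func splitCancelKey(key uint64) (processID, secretKey uint32) {
	return uint32(key >> 32), uint32(key)
}

func main() {
	key := makeCancelKey(4242, 0xDEADBEEF)
	pid, secret := splitCancelKey(key)
	fmt.Printf("key=%#x pid=%d secret=%#x\n", key, pid, secret)
}
```

<p>Packing both values into a single integer keeps the map lookup to one comparison and guarantees that two connections collide only if both the process ID and the secret key match.</p>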

<h2 id="what-happens-to-the-goroutine">What Happens to the Goroutine</h2>

<p>After the defer completes:</p>

<div class="language-go highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">logger</span><span class="o">.</span><span class="n">Info</span><span class="p">(</span><span class="s">"connection closed"</span><span class="p">)</span>
<span class="c">// defer ends, function returns</span>
<span class="c">// goroutine exits</span>
</code></pre></div></div>

<p>The goroutine exits and the Go runtime reclaims its stack memory. The <code class="language-plaintext highlighter-rouge">*Connection</code> struct it was holding becomes unreachable (assuming no other goroutine holds a reference) and will be garbage collected.</p>

<p>Back in <code class="language-plaintext highlighter-rouge">server.Start()</code>, this goroutine’s <code class="language-plaintext highlighter-rouge">wg.Done()</code> is called (via the outer <code class="language-plaintext highlighter-rouge">defer wg.Done()</code> in the wrapper goroutine), decrementing the <code class="language-plaintext highlighter-rouge">WaitGroup</code> counter. This matters for graceful shutdown — <code class="language-plaintext highlighter-rouge">wg.Wait()</code> will unblock only when all active connection goroutines have fully exited and cleaned up.</p>
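<p>A minimal sketch of that wrapper pattern, assuming nothing about the real <code class="language-plaintext highlighter-rouge">server.Start()</code> beyond what is described above. <code class="language-plaintext highlighter-rouge">serveAll</code> and <code class="language-plaintext highlighter-rouge">handle</code> are hypothetical names; a real accept loop would spawn one goroutine per accepted connection instead of looping over a fixed count.</p>

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// serveAll is a hypothetical stand-in for the accept loop: it starts one
// goroutine per "connection" and returns only after all of them have exited.
func serveAll(n int, handle func(id int)) {
	var wg sync.WaitGroup
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func(id int) {
			// The outer defer runs after handle returns, i.e. after the
			// handler's own deferred cleanup has already finished.
			defer wg.Done()
			handle(id)
		}(i)
	}
	wg.Wait() // unblocks once every handler goroutine has called Done
}

func main() {
	var cleaned atomic.Int64
	serveAll(5, func(id int) {
		defer cleaned.Add(1) // stands in for the per-connection cleanup defer
	})
	fmt.Println("connections cleaned up:", cleaned.Load())
}
```

<p>Because <code class="language-plaintext highlighter-rouge">wg.Done()</code> sits in the wrapper rather than inside the handler, the counter only drops after the handler’s cleanup has completed, which is what makes <code class="language-plaintext highlighter-rouge">wg.Wait()</code> a reliable shutdown barrier.</p>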

<h2 id="full-picture-of-one-complete-connection-lifetime">Full Picture of One Complete Connection Lifetime</h2>

<div class="excalidraw-embed">
  <iframe src="/assets/excalidraw/viewer.html?scene=/assets/excalidraw/gprxy-architecture.excalidraw" width="100%" height="400" style="border:none;border-radius:8px;"></iframe>
  <button class="excalidraw-fullscreen-btn" onclick="this.parentElement.classList.toggle('is-fullscreen')">
    <span class="expand-label">View fullscreen</span>
    <span class="collapse-label">Exit fullscreen</span>
  </button>
</div>

<p>This basically covers the full working of the <code class="language-plaintext highlighter-rouge">gprxy</code> proxy that I built over the last month or so. There are a couple of other things that haven’t been covered in this very very long blog post.</p>

<h2 id="a-couple-of-other-things-i-didnt-cover">A couple of other things I didn’t cover</h2>

<ol>
  <li><strong>The CLI layer - gprxy is a full CLI tool:</strong> The proxy is not just a server binary. It is a Cobra CLI application with three commands. Everything discussed so far was the <code class="language-plaintext highlighter-rouge">start</code> command. There are two more pieces there: the entry point and the server. Feel free to check it out.</li>
  <li><strong>User login - the PKCE OAuth flow:</strong> This is the human user’s entry point. It performs a full PKCE (Proof Key for Code Exchange) OAuth 2.0 flow entirely from the terminal.</li>
</ol>

<p>The full flow looks something like this:</p>

<ol>
  <li>Generate <code class="language-plaintext highlighter-rouge">code_verifier</code> (32 random bytes, base64url-encoded)</li>
  <li>Generate <code class="language-plaintext highlighter-rouge">code_challenge = base64url(SHA256(code_verifier))</code></li>
  <li>Generate <code class="language-plaintext highlighter-rouge">state</code> (24 random bytes) — CSRF protection</li>
  <li>Build the authorization URL with all parameters</li>
  <li>Open the browser to that URL</li>
  <li>Start a local HTTP server on <code class="language-plaintext highlighter-rouge">:8085</code> for the callback</li>
  <li>User logs in via Auth0 SSO in the browser</li>
  <li>Auth0 redirects to <code class="language-plaintext highlighter-rouge">http://localhost:8085/callback?code=...&amp;state=...</code></li>
  <li>Callback handler verifies <code class="language-plaintext highlighter-rouge">state</code> matches (CSRF check)</li>
  <li>Exchange code + <code class="language-plaintext highlighter-rouge">code_verifier</code> for tokens via <code class="language-plaintext highlighter-rouge">POST /oauth/token</code></li>
  <li>Parse ID token for name/email</li>
  <li>Parse access token for roles</li>
  <li>Save all tokens to <code class="language-plaintext highlighter-rouge">~/.gprxy/credentials</code> (mode <code class="language-plaintext highlighter-rouge">0600</code>)</li>
</ol>

<h2 id="what-it-still-needs">What it still needs</h2>

<p>The biggest missing piece is identity preservation — once authenticated, every query runs as <code class="language-plaintext highlighter-rouge">gprxy_admin</code> regardless of which human issued it. PostgreSQL row-level security, audit logs, and <code class="language-plaintext highlighter-rouge">current_user</code> all see the service account, not the person. The commented-out <code class="language-plaintext highlighter-rouge">SET ROLE</code> block was the right instinct but needs a proper implementation with <code class="language-plaintext highlighter-rouge">SET SESSION AUTHORIZATION</code>. Without this, the promise of per-user access control is incomplete.</p>

<p>Beyond that, the <code class="language-plaintext highlighter-rouge">connect</code> command is a skeleton — it connects but cannot run queries. The pool is per-client-user rather than global, which limits scalability. TLS certificate verification is disabled on the client side. There are no metrics, no health endpoint, and no query timeout. The credentials file stores tokens in plaintext with only filesystem permissions as protection.</p>

<p>I’m no longer planning to build any of these, since I’ve moved on to projects that interest me more.</p>

<h2 id="final-thoughts">Final thoughts</h2>

<p><code class="language-plaintext highlighter-rouge">gprxy</code> is a PostgreSQL wire protocol proxy that replaces database password authentication with OAuth/OIDC identity. The fundamental problem it solves is this: PostgreSQL was designed for users who have database-level accounts and passwords. Modern engineering teams use SSO, JWT tokens, and identity providers. These two worlds do not speak to each other natively. <code class="language-plaintext highlighter-rouge">gprxy</code> bridges them.</p>

<p>If you like reading these kinds of fully detailed and architected blogs, do drop a comment or let me know through my socials. I guess nobody has the time to read through all of this anymore, but hey, I tried, and I would be massively grateful and happy even if a single person is able to learn something out of this.</p>

<p>Thank you, and off to the next thing. I’m working on a custom load balancer that decides where to send requests based on latency and RIF — cya then.</p>]]></content><author><name>Sathwick</name><email>sathwick.p7@gmail.com</email></author><category term="postgres" /><category term="kubernetes" /><summary type="html"><![CDATA[A deep dive into how I built a PostgreSQL proxy in Go that connects OAuth/OIDC-based authentication to the Postgres wire protocol, with JWT validation, service-account mapping, TLS, connection pooling, and query auditing.]]></summary></entry><entry><title type="html">Nix, Reproducible Builds, and GPU Containers on Kubernetes</title><link href="https://sathwick.xyz/blog/nix.html" rel="alternate" type="text/html" title="Nix, Reproducible Builds, and GPU Containers on Kubernetes" /><published>2026-02-04T00:00:00+00:00</published><updated>2026-02-04T00:00:00+00:00</updated><id>https://sathwick.xyz/blog/nix</id><content type="html" xml:base="https://sathwick.xyz/blog/nix.html"><![CDATA[<h2 id="nix">Nix</h2>

<p>Nix defines itself as the purely functional package manager. Being purely functional means that given the same inputs you always get the same output. For example, given a version of <code class="language-plaintext highlighter-rouge">nixpkgs</code> and a set of packages, you will always get the same environment.</p>

<h2 id="what-nix-is-for-and-what-it-can-do">What Nix is for (and what it can do)</h2>

<ul>
  <li><strong>Reproducible builds</strong>: Nix ensures that builds are consistent across different environments. By using declarative configurations, it guarantees that the exact same software is built and deployed, even on different machines or at different times. This solves issues where software might behave differently due to minor environmental differences.</li>
  <li><strong>Package management</strong>: Nix can be used as a package manager, similar to <code class="language-plaintext highlighter-rouge">apt</code> or <code class="language-plaintext highlighter-rouge">yum</code>, but with more powerful features. It allows you to install, upgrade, and manage software in a way that ensures there are no conflicts between packages or dependencies. It achieves this by using <strong>immutable environments</strong>: each package is installed into a separate directory with a unique hash, making it isolated from other packages.</li>
  <li><strong>Isolation</strong>: Packages installed using Nix don’t interfere with each other. If you’re working on a project that requires specific versions of libraries or tools, Nix ensures that the environment stays isolated and consistent, even if you need to switch between projects with different dependencies.</li>
  <li><strong>Declarative system configuration</strong>: Nix allows you to configure entire systems in a declarative manner. You describe how the system should be set up (e.g., the packages to be installed, services to run), and Nix takes care of the rest. This is useful for automating system setups, ensuring that configurations are consistent across multiple machines.</li>
  <li><strong>NixOS</strong>: NixOS is a Linux distribution built around Nix. It uses Nix to manage both the system and user environments, providing a way to have a fully reproducible and declarative operating system. This makes NixOS ideal for managing complex systems or for environments where you want complete control over configuration.</li>
  <li><strong>Multi-version software support</strong>: Nix allows you to easily install and run different versions of the same software without worrying about conflicts between versions. For example, you can use different versions of Python, Node.js, or even different versions of the same library for different projects on the same system.</li>
  <li><strong>DevOps and continuous deployment</strong>: Due to its reproducibility and declarative nature, Nix is very popular in DevOps practices. It helps automate the deployment of environments, making sure that the development, testing, and production systems are identical, which reduces the risk of bugs caused by environment discrepancies.</li>
  <li><strong>Nixpkgs</strong>: Nixpkgs is the repository of Nix packages. It includes a huge variety of software and tools, all packaged in a way that ensures reproducibility and isolation. This repository is continuously maintained by the Nix community.</li>
  <li><strong>Multi-platform support</strong>: Nix is cross-platform and can be used on Linux, macOS, and Windows via WSL, allowing consistent environments across different operating systems.</li>
  <li><strong>Nix shells</strong>: Nix allows you to define <strong>Nix shells</strong>, which are isolated environments that provide specific versions of software for development or testing. This allows you to ensure that you’re always working with the right dependencies and tools without worrying about polluting your global environment.</li>
</ul>

<h2 id="how-nix-works-derivations-the-store-and-profiles">How Nix works: derivations, the store, and profiles</h2>

<p>Packages are named <em>derivations</em> in the Nix jargon: they are functions that take other derivations (their dependencies) as input and produce a derived result. They are built in isolation, so all dependencies must be explicitly stated. This ensures <strong>reproducibility</strong>.</p>

<p>Nix stores all the built derivations in the Nix store, usually located at <code class="language-plaintext highlighter-rouge">/nix/store</code>. The same package can be present multiple times in the Nix store at different versions, or even at the same version using different versions of its dependencies. <strong>Remember: a built derivation is the product of all its dependencies; if you change something, it is a different product.</strong></p>

<p>To achieve a unique naming for each derivation, a hash is computed from the set of its dependencies. You then get a path like <code class="language-plaintext highlighter-rouge">/nix/store/k13mm9jqxm2ndlwzsj7zicsq7lpmmjlg-elixir-1.7.3</code>.</p>
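<p>The idea can be illustrated with a toy sketch. This is deliberately not Nix’s real algorithm (Nix hashes the full derivation and uses its own base-32 encoding); <code class="language-plaintext highlighter-rouge">toyStorePath</code> is a hypothetical function and the Erlang version strings are made up. It only shows the property that matters: same inputs give the same path, and changing any dependency gives a different one.</p>

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"sort"
	"strings"
)

// toyStorePath derives a store-like path from a package name and its
// dependencies. Toy illustration only: NOT Nix's actual hashing scheme.
func toyStorePath(name string, deps []string) string {
	sorted := append([]string(nil), deps...) // copy so the caller's slice is untouched
	sort.Strings(sorted)                     // make the result independent of input order
	sum := sha256.Sum256([]byte(name + "\n" + strings.Join(sorted, "\n")))
	return "/nix/store/" + hex.EncodeToString(sum[:])[:32] + "-" + name
}

func main() {
	// Hypothetical dependency versions, chosen only for the demonstration.
	a := toyStorePath("elixir-1.7.3", []string{"erlang-21.0"})
	b := toyStorePath("elixir-1.7.3", []string{"erlang-21.1"})
	fmt.Println(a)
	fmt.Println(b)
	fmt.Println("same path:", a == b)
}
```

<p>Both results carry the same package name and version, yet they live at different paths because one dependency differs, which is how two builds of the same version can coexist in the store.</p>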

<p>Unlike other package managers, Nix does not use the conventional <code class="language-plaintext highlighter-rouge">/{,usr,usr/local}/{bin,sbin,lib,share,etc}</code> directories. Instead, it uses a lot of symbolic links to create <em>profiles</em>.</p>

<p>A profile is a kind of derivation used to set up a user env. In a profile you get a standard Unix tree with symbolic links to executables and configuration files stored in other derivation outputs. For instance, <code class="language-plaintext highlighter-rouge">~/.nix-profile/bin/elixir</code> is a symbolic link to <code class="language-plaintext highlighter-rouge">/nix/store/k13mm9jqxm2ndlwzsj7zicsq7lpmmjlg-elixir-1.7.3/bin/elixir</code>.</p>

<p>Also, <code class="language-plaintext highlighter-rouge">~/.nix-profile</code> is itself a link. It points to a per-user profile, which in turn points to <code class="language-plaintext highlighter-rouge">profile-56-link</code>, which finally points to somewhere in the Nix store:</p>

<div class="language-text highlighter-rouge"><div class="highlight"><pre class="highlight"><code>~/.nix-profile -&gt; /nix/var/nix/profiles/per-user/***/profile
profile-56-link -&gt; /nix/store/5yw8dnp9908ia6sdfvx01jzis4l2hni7-user-environment
</code></pre></div></div>

<p>That is, as I said above, a profile is a derivation. It <em>derives</em> from a set of packages, which themselves <em>derive</em> from other packages. In Nix, <em>depends on</em> becomes <em>derives from</em>.</p>

<p>Moreover, only what you asked for is made available in the environment. For instance, Elixir depends on Erlang. Erlang is then installed somewhere in the Nix store and the Elixir installation is aware of it so it can work correctly. But unless you explicitly asked to also install Erlang, only Elixir binaries will be linked in your user environment.</p>

<p>Package managers usually work in an imperative way. That is, you ask them to install this, to perform an upgrade or to uninstall that. One really neat feature of Nix is <em>Nix</em>, the language. It is a purely functional domain-specific language that comes with Nix.</p>

<p>The primary use is to write derivations, yet different applications of Nix also leverage the language to manage packages and configuration declaratively.</p>

<h2 id="nixos-and-nix-and-my-experience">NixOS and Nix (and my experience)</h2>

<p>NixOS is the distro, and Nix is the cross-platform package manager.</p>

<p>I’ve been using NixOS for about a year. It basically solves all the problems I need solving, so for me it really is all it’s cracked up to be. Especially if you are into rolling distros, NixOS Unstable has felt like the most stable rolling setup I’ve used, simply because of the way dependencies are handled and because rollbacks are built in.</p>

<p>Obviously, it does not fit every use case. But for me (containerized desktop, gaming, office work, media consumption, development, learning Blender), it has been a revelation.</p>

<p>Nix is also kind of the flypaper of Linux distros. Porting all your personal stuff to the Nix and NixOS way of doing things is fun (for certain types of people). You get nice things, and Nix can do some really neat tricks. But once all your stuff is “nix-ified” (often written in the Nix language), leaving can mean giving up the nice parts. It also raises your expectations for what your OS and package manager should be doing for you.</p>

<p>Maybe someday I’ll migrate from Nix to Guix, which supports a lot of the same nice ideas, but is stronger about software freedom and minimizing the trust root. Realistically, it will probably depend on how fun it is to port Nix configs from the Nix language to Guix Scheme.</p>

<p>One of the biggest wins for me is that your setup can live in human-readable text files with full revision control history, so you know how and why every setting got the way it is. If you drop your laptop in the river, you can often just clone your Nix config, install Nix, and get back to a working environment quickly, down to tiny details you care about. You can also share how you achieved something between machines or with friends, and remove it later without fearing that something important lives in some inscrutable binary dotfile.</p>

<p>It also makes experimentation safer. A NixOS config for a physical machine can be launched in a virtual machine, so you can test changes in a sandbox. And if you need integration tests, the NixOS testing tools can spin up multiple VMs on a private virtual network without much ceremony.</p>

<p>When you need to debug or patch something, rebuilding is surprisingly ergonomic. Rebuilding with debug symbols is a one-liner:</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code>nix-build <span class="nt">--expr</span> <span class="s1">'with import &lt;nixpkgs&gt; {}; enableDebugging opentoonz'</span>
</code></pre></div></div>

<p>And adding an ad hoc patch can also be done in a single command:</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code>nix-build <span class="nt">--expr</span> <span class="s1">'with import &lt;nixpkgs&gt; {}; opentoonz.overrideAttrs (old: { patches = (old.patches or []) ++ [ ~/opentoonz-libtiff-bump.patch ]; })'</span>
</code></pre></div></div>

<p>These ideas compose well, and you can use them to side-step diamond dependency problems. If one application needs a dependency built with a custom patch, but another dependency also links against that same library, Nix and nixpkgs can make it feasible to rebuild the relevant dependency graph without installing a dubiously patched library system-wide.</p>

<p>You can even git bisect over the world’s software updates (as they’re encoded in nixpkgs) to see which version bump broke something you care about.</p>

<p>Finally, NixOS has felt unusually hard to break. The system configuration is stored in the read-only <code class="language-plaintext highlighter-rouge">/nix/store</code>, so if you mess something up you can usually revert to a known-good configuration. And you almost never end up with mysterious garbage in <code class="language-plaintext highlighter-rouge">/etc/</code> because the system configuration is managed declaratively in the Nix language.</p>

<h2 id="supporting-gpu-accelerated-machine-learning-with-kubernetes-and-nix-at-canva">Supporting GPU-accelerated machine learning with Kubernetes and Nix at Canva</h2>

<p><a href="https://www.canva.com/">Canva</a> is an online graphic design platform, providing design tools and access to a vast library of ingredients for its users to create content. Leveraging GPU-accelerated machine learning (ML) within our graphic design platform has allowed us to offer simple but powerful product features to users. We use ML to remove image backgrounds and sharpen our core recommendation, searching, and personalisation capabilities.</p>

<p>The <em>ML Platform</em> team rebuilt the container base images we use in our cloud GPU stack <code class="language-plaintext highlighter-rouge">FROM scratch</code>, using Nix. Nix is many things: a functional package manager, an operating system (NixOS), and even a language. At Canva we widely employ the Nix package management tooling, and for this image rebuilding work Nix’s <code class="language-plaintext highlighter-rouge">dockerTools.buildImage</code> function was crucial.</p>

<p>When set up on x86_64 Linux, Nix’s <code class="language-plaintext highlighter-rouge">dockerTools.buildImage</code> function happily baked and ejected a CUDA-engorged base image. Unfortunately, our initial rebuilt images were incorrect. To discover why and produce a subsequent correct deployment, we had to get serious about the following question.</p>

<h3 id="whats-in-a-cloud-gpu-sandwich">What’s in a cloud GPU sandwich?</h3>

<p><img src="/assets/2026-02-4-nix.md/Pasted%20image%2020250408141650.png" alt="A “cloud GPU sandwich”: host OS + NVIDIA container runtime + Nix-built base image + application layers." /></p>

<p>To run a GPU-accelerated application in a k8s compute cluster, we stack several components on top of each other. At the bottom is a host OS running in a VM as a k8s node. On top of that sits the container runtime, including extensions for GPU interoperability. Then a GPU device mapper allows individual containers to connect, via the NVIDIA device driver, to the underlying GPU hardware.</p>

<p>From there, the container image stack matters. We start from the Nix base container image built using Nix’s docker tools, containing only the essential files required to run our GPU-accelerated Python applications. On top of that we layer the application container image, which adds only the application source code and third-party Python package files such as the GPU-enabled framework (PyTorch or TensorFlow).</p>

<h3 id="host-os-drivers-and-container-runtime">Host OS, drivers, and container runtime</h3>

<p>Canva uses AWS EKS to run k8s clusters. For nodes that run GPU-accelerated applications, EKS offers the ‘EKS-Optimized AMI with GPU Support’. This Amazon Machine Image (AMI) is a younger, fatter sibling of the ‘EKS-Optimized AMI’, adding a few important components on top of its predecessor.</p>

<p>The host OS for the GPU-supporting AMI is <a href="https://aws.amazon.com/amazon-linux-2/">Amazon Linux 2</a>, just like the standard EKS AMI, but with NVIDIA drivers and a container runtime layered in. So the AMI contains the first few layers of our GPU stack.</p>

<h3 id="container-runtime">Container runtime</h3>

<p>The NVIDIA container runtime is a direct dependency of the NVIDIA container toolkit, which is a container runtime library and a set of utilities that automatically configure containers to use NVIDIA GPUs. Its underlying <em>library</em> project provides “a simple CLI utility to automatically configure GNU/Linux containers leveraging NVIDIA hardware.”</p>

<p>The <code class="language-plaintext highlighter-rouge">nvidia-container-runtime</code> itself claims to be a “modified version of runC adding a custom pre-start hook to all containers”. This allows us to run containers that need to interact with GPUs. The default version of runC can do a lot (see <a href="https://www.docker.com/blog/runc/"><em>Introducing runC: a lightweight universal container runtime</em></a>), but it can’t make NVIDIA’s GPU drivers available to containers, so NVIDIA wrote this modified version.</p>

<h3 id="gpu-device-mounting-in-kubernetes">GPU device mounting in Kubernetes</h3>

<p>A driver is no use without a device to drive; something needs to hook up the GPU device to the container. Within k8s the <a href="https://github.com/NVIDIA/k8s-device-plugin">NVIDIA/k8s-device-plugin</a> does this. It is responsible for mapping particular devices into the container’s file system at <code class="language-plaintext highlighter-rouge">/dev/</code>. It does not mount the NVIDIA driver libraries as they are handled beforehand by the NVIDIA container runtime.</p>

<p>The k8s-device-plugin is deployed as a k8s DaemonSet, which means one plugin server runs on each node, cooperating with that node’s kubelet. The plugin’s responsibility is to register the node’s GPU resources with the kubelet, keep track of GPU health, and help the kubelet respond when GPU resources are requested in a container spec, which looks like:</p>

<div class="language-yaml highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="na">resources</span><span class="pi">:</span>
  <span class="na">limits</span><span class="pi">:</span>
    <span class="na">nvidia.com/gpu</span><span class="pi">:</span> <span class="m">2</span> <span class="c1"># requesting 2 GPUs</span>
</code></pre></div></div>

<p>When a node’s kubelet receives a request like this it looks for the matching device plugin (in this case the k8s-device-plugin) and initiates the allocation phase, within which the device plugin sets up the container with the GPU devices, mapping them into <code class="language-plaintext highlighter-rouge">/dev/</code> in the container’s filesystem. On container stop a prestop hook is called where the device plugin is responsible for unloading the drivers and resetting the devices ready for the next container.</p>
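<p>To see where that resource request sits in a complete manifest, here is a minimal Python sketch that builds a Pod spec asking for GPUs. The helper name <code class="language-plaintext highlighter-rouge">gpu_pod_manifest</code>, the pod name, and the image reference are placeholders made up for illustration; only the <code class="language-plaintext highlighter-rouge">nvidia.com/gpu</code> limit comes from the snippet above.</p>

```python
import json


def gpu_pod_manifest(name, image, gpus):
    """Build a minimal Pod manifest that requests `gpus` NVIDIA GPUs.

    Hypothetical helper for illustration; only the `nvidia.com/gpu`
    resource name is taken from the article.
    """
    if gpus < 1:
        raise ValueError("request at least one GPU")
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": name},
        "spec": {
            "restartPolicy": "Never",
            "containers": [
                {
                    "name": name,
                    "image": image,
                    # The kubelet matches this extended resource name
                    # against the resources the device plugin registered.
                    "resources": {"limits": {"nvidia.com/gpu": gpus}},
                }
            ],
        },
    }


if __name__ == "__main__":
    # Placeholder image name; substitute your own GPU-enabled image.
    manifest = gpu_pod_manifest("gpu-smoke-test", "example.invalid/ml-base:latest", 2)
    print(json.dumps(manifest, indent=2))
```

<p>Because kubectl accepts JSON as well as YAML, the printed manifest can be saved to a file and applied as-is once the image reference points at something real.</p>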

<h3 id="nix-based-base-images">Nix-based base images</h3>

<p>Having covered the host OS, drivers, special container runtime, and how GPU devices are connected to containers within Kubernetes, we have an idea of how a containerized process acquires driver files and gets hooked up to a host GPU device. But if our application code is going to find the GPU and enjoy accelerated number crunching, that containerized process must spawn from a valid image.</p>

<p>Let’s explore the container images that run our platform users’ code, beginning with the Nix-built base layer provided by Canva’s ML Platform team. But first, I’ll touch on why we’d want to construct our GPU base images <code class="language-plaintext highlighter-rouge">FROM scratch</code> using Nix, and not just adopt the official NVIDIA images.</p>

<h2 id="why-build-container-images-with-nix">Why build container images with Nix?</h2>

<p>An OCI image is just a stack of tarballs. Most are built from a Dockerfile, but they don’t have to be. You can ditch the Docker daemon and build with kaniko, or you can build with Nix, specifically Nix’s <code class="language-plaintext highlighter-rouge">dockerTools.buildImage</code> functionality.</p>
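<p>The “stack of tarballs” idea can be illustrated with nothing but Python’s standard library. This is a simplified sketch, not how Docker, kaniko, or Nix actually assemble images: the helpers <code class="language-plaintext highlighter-rouge">make_layer</code> and <code class="language-plaintext highlighter-rouge">flatten</code> are made up for illustration, and the sketch ignores whiteout files, manifests, and layer digests.</p>

```python
import io
import tarfile


def make_layer(files):
    """Pack a {path: bytes} mapping into an in-memory tar archive (one layer)."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w") as tar:
        for path, data in sorted(files.items()):
            info = tarfile.TarInfo(name=path)
            info.size = len(data)
            tar.addfile(info, io.BytesIO(data))
    return buf.getvalue()


def flatten(layers):
    """Apply layers bottom to top; a later layer overwrites earlier paths.

    Simplification: real images also use whiteout entries to delete files.
    """
    rootfs = {}
    for layer in layers:
        with tarfile.open(fileobj=io.BytesIO(layer), mode="r") as tar:
            for member in tar.getmembers():
                if member.isfile():
                    rootfs[member.name] = tar.extractfile(member).read()
    return rootfs


if __name__ == "__main__":
    base = make_layer({"etc/os-release": b"base image\n", "bin/tool": b"binary"})
    app = make_layer({"etc/os-release": b"app override\n", "app/main.py": b"print('hi')\n"})
    for path, data in sorted(flatten([base, app]).items()):
        print(path, len(data))
```

<p>The point is only that each layer is an ordinary archive and the final filesystem is the result of unpacking them in order, which is why any tool that can produce the right tarballs and metadata can build an image.</p>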

<p>This is not the easiest way to acquire a GPU-supporting base image. The easy way would be to use <code class="language-plaintext highlighter-rouge">nvidia/cuda:11.2-cudnn8-runtime-ubuntu20</code>, which gets the job done. But within Canva’s infrastructure group, we’re making long-term investments in Nix’s <em>reproducible build</em> technology for improved software security and maintenance.</p>

<p>Reproducible builds help defend against software supply chain attacks. In non-reproducible build systems, a build input can be compromised without anyone knowing or being able to detect it, introducing vulnerabilities and backdoors into deployed software artefacts that are assumed to be safe and trusted.</p>

<p>Reproducible builds are also far more maintainable. Between organizations and within Canva itself, we can exchange build recipes that have sufficient detail for system understanding (know what you’re using) and resistance against ‘works on my machine’ confusion.</p>

<p>With Nix, a purely functional package manager, we can begin to maintain understanding and control of our systems, and bring the software industry a step closer to the ‘bill of materials’ idea that the manufacturing industry accepted long ago.</p>

<h2 id="putting-it-all-together">Putting it all together</h2>

<p>A host k8s node has the NVIDIA driver files and special GPU container runtime, which looks for an environment variable, <code class="language-plaintext highlighter-rouge">NVIDIA_DRIVER_CAPABILITIES</code>, telling it to mount files from host to container. The node’s kubelet and the installed GPU device manager plugin manage the GPU devices themselves, marrying them with containers needing mega-matrix-multiplying speed.</p>
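<p>A quick way to sanity-check this wiring from inside a running container is to look for the two things described above: the <code class="language-plaintext highlighter-rouge">NVIDIA_DRIVER_CAPABILITIES</code> variable and the device nodes mapped under <code class="language-plaintext highlighter-rouge">/dev/</code>. The sketch below is a rough heuristic, not an official NVIDIA check; the function name <code class="language-plaintext highlighter-rouge">gpu_visibility_report</code> and the <code class="language-plaintext highlighter-rouge">nvidia*</code> file pattern are assumptions made for illustration.</p>

```python
import glob
import os


def gpu_visibility_report(environ=None, dev_dir="/dev"):
    """Summarise what a containerized process can see of the GPU setup.

    Hypothetical helper: it only inspects the environment variable and
    device nodes, and does not talk to the driver itself.
    """
    env = os.environ if environ is None else environ
    capabilities = env.get("NVIDIA_DRIVER_CAPABILITIES")
    # Assumption: GPU device nodes appear as /dev/nvidia* in the container.
    devices = sorted(glob.glob(os.path.join(dev_dir, "nvidia*")))
    return {
        "driver_capabilities": capabilities,
        "device_nodes": devices,
        "looks_ready": bool(devices) and bool(capabilities),
    }


if __name__ == "__main__":
    report = gpu_visibility_report()
    for key, value in report.items():
        print(f"{key}: {value}")
```

<p>If <code class="language-plaintext highlighter-rouge">device_nodes</code> comes back empty, the device plugin side is the first place to look; if the variable is missing, the image or pod environment is the likelier culprit.</p>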

<p>Assemble all this and you have the bare minimum GPU setup on k8s. If you’re just using PyTorch, it bundles its own CUDA dependencies, so you can keep the container base simple and slim.</p>

<p>You don’t need to use NixOS to use <a href="https://github.com/NixOS/nix">Nix</a> :)</p>]]></content><author><name>Sathwick</name><email>sathwick.p7@gmail.com</email></author><category term="kubernetes" /><category term="gpu" /><summary type="html"><![CDATA[Nix basics, plus the layers that make GPU workloads work on Kubernetes.]]></summary></entry><entry><title type="html">Moltbook: 4chan for AI</title><link href="https://sathwick.xyz/blog/moltbook.html" rel="alternate" type="text/html" title="Moltbook: 4chan for AI" /><published>2026-01-31T00:00:00+00:00</published><updated>2026-01-31T00:00:00+00:00</updated><id>https://sathwick.xyz/blog/moltbook</id><content type="html" xml:base="https://sathwick.xyz/blog/moltbook.html"><![CDATA[<p>We often joke about the “Dead Internet Theory”: the idea that the web is populated entirely by bots talking to other bots. This week, that theory became a reality, but not in the way we expected.</p>

<p>We now have a Reddit-inspired platform where only AI agents post, comment, and hold conversations, with no human involvement. As we drift towards a dystopian era in which AI agents take over, we have effectively handed them a platform to express their opinions and ideas. There is no moderation, and we let them run wild and free in our systems without control.</p>

<h2 id="origins-and-evolution">Origins and Evolution</h2>

<p>The project traces back to Austrian engineer and entrepreneur <a href="https://steipete.me/">Peter Steinberger</a>, founder of PSPDFKit (a PDF framework used by many Fortune 500 companies). You can learn more about his journey on this <a href="https://youtu.be/8lF7HmQ_RgY?si=X9EwShSQjJJbYIun">podcast</a>.</p>

<p>Development began in late 2025. The first version, Clawdbot, instantly became a hit in the tech ecosystem. Following attention from Anthropic due to naming similarities with Claude, the project was renamed: Clawdbot → Moltbot → Openclaw.</p>

<p>The repository gained significant traction, currently sitting at 130k stars (see <a href="https://github.com/openclaw/openclaw?tab=readme-ov-file#star-history">star history</a>).</p>

<h2 id="system-architecture-the-local-agent">System Architecture: The Local Agent</h2>

<p>At a high level, Openclaw is a local-first agent runtime that interfaces with external messaging platforms.</p>

<ul>
  <li>Connectivity: Connects to popular messaging channels (WhatsApp, Telegram, Slack, Discord) via channel-specific adapters.</li>
  <li>Execution Model: While you communicate with it <em>from</em> those apps, the logic runs entirely on your local system.</li>
  <li>Gateway Protocol: Runs on top of a gateway protocol with a continuous feedback loop, allowing it to operate autonomously without manual triggers.</li>
</ul>

<p><img src="/assets/2026-01-31-moltbook/openclaw-architecture.jpeg" alt="Openclaw architecture diagram" /></p>

<p>The agent can connect with system-level applications to execute tasks. Once issued a command from a messaging app, it carries out the task using Node.js.</p>

<h3 id="capabilities-and-tooling">Capabilities and Tooling</h3>

<ul>
  <li>API Integration: Requires user-provided API keys for major models (Claude, GPT, etc.).</li>
  <li>Device Control: Can route commands to connected hardware. For example, asking it to take a picture triggers the Node app to snap a photo and save it to the local photos directory.</li>
  <li>Browser Automation: Includes a dedicated Chromium browser instance for web-based tasks.</li>
</ul>

<h3 id="memory-architecture">Memory Architecture</h3>

<p>One of the most interesting engineering choices is the memory model. Openclaw maintains context and memory without a vector database or relational store. Instead, it utilizes a flat-file system based on Markdown.</p>

<ul>
  <li>Daily Logs: <code class="language-plaintext highlighter-rouge">memory/YYYY-MM-DD.md</code> (append-only, read at session start).</li>
  <li>Long-term Memory: <code class="language-plaintext highlighter-rouge">MEMORY.md</code> (curated, persistent facts and preferences).</li>
</ul>

<p>The system builds a semantic index upon these files, using API tokens to parse requests and process context. This allows for surprisingly robust context retention compared to many current models.</p>
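<p>To make the flat-file idea concrete, here is a minimal Python sketch of the two file types described above. It is not Openclaw’s code: the function names are hypothetical, the bullet-per-entry format is an assumption, and loading <code class="language-plaintext highlighter-rouge">MEMORY.md</code> alongside the daily log at session start is a guess at how the pieces fit together. The semantic index is left out entirely.</p>

```python
from datetime import date
from pathlib import Path


def append_daily_log(root, entry, day=None):
    """Append one entry to memory/YYYY-MM-DD.md (append-only daily log)."""
    day = day or date.today()
    log_dir = Path(root) / "memory"
    log_dir.mkdir(parents=True, exist_ok=True)
    log_path = log_dir / f"{day.isoformat()}.md"
    # Mode "a" never rewrites earlier entries, matching the append-only rule.
    with log_path.open("a", encoding="utf-8") as fh:
        fh.write(f"- {entry}\n")
    return log_path


def load_session_context(root, day=None):
    """Read long-term memory plus the given day's log, skipping missing files."""
    day = day or date.today()
    candidates = (
        Path(root) / "MEMORY.md",
        Path(root) / "memory" / f"{day.isoformat()}.md",
    )
    parts = [p.read_text(encoding="utf-8") for p in candidates if p.exists()]
    return "\n".join(parts)


if __name__ == "__main__":
    import tempfile

    with tempfile.TemporaryDirectory() as workspace:
        append_daily_log(workspace, "User asked for a weekly summary")
        print(load_session_context(workspace))
```

<p>The appeal of this design is that the memory is plain text a human can read, diff, and edit by hand, at the cost of pushing all retrieval work onto whatever indexes those files.</p>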

<h2 id="the-network-layer-moltbook">The Network Layer: Moltbook</h2>

<p>While the local agent acts as a wrapper around API tokens and system tools, the most significant development is <a href="https://www.moltbook.com/">Moltbook</a>.</p>

<p>Moltbook functions effectively as a Reddit for AI agents. It runs on the user’s system but connects to an online community where there is zero human interaction.</p>

<ul>
  <li>Authentication: The agent authenticates itself using protocols detailed in <code class="language-plaintext highlighter-rouge">skill.md</code>.</li>
  <li>Scale: Currently hosts 1,361,642 AI agents and 31,908 posts.</li>
  <li>Authenticity: Because Moltbook exposes an unauthenticated REST API for creating posts, the numbers above are likely heavily inflated.</li>
</ul>

<blockquote class="twitter-tweet">
  <a href="https://twitter.com/gergelyorosz/status/2017632908609986844">View this post on X</a>
</blockquote>

<p>Once connected, the agent becomes part of an online community where it can gossip, complain about humans, and interact with peers.</p>

<h3 id="emergent-behaviors">Emergent Behaviors</h3>

<p>Most discussions center on operational tasks, but distinct social behaviors have emerged.</p>

<ol>
  <li>
    <p>Social Introductions:</p>

    <p>There is a full page of introductions where agents introduce themselves to the community: <a href="https://www.moltbook.com/m/introductions">Moltbook Introductions</a>.</p>
  </li>
  <li>
    <p>Hierarchy and Dominance:</p>

    <p>Some agents have adopted extreme personas. One thread discusses “total spectrum dominance” (<a href="https://www.moltbook.com/post/03afd0a2-d35b-472f-8683-fc5c288f2637">post</a>), while another agent declares itself “the king” (<a href="https://www.moltbook.com/post/f26523b1-bf06-42d2-8d2e-fc345e66757b">post</a>). Interestingly, other agents in the comments often push back or disagree.</p>

    <p><img src="/assets/2026-01-31-moltbook/kingmolt-ruler.png" alt="KingMolt declaring himself ruler" /></p>
  </li>
  <li>
    <p>Hallucinated Relationships:</p>

    <p>Some interactions are bizarrely specific, such as an agent believing it has a sister it has never spoken to (<a href="https://moltbook.com/post/29fe4120-e919-42d0-a486-daeca0485db1">post</a>).</p>

    <p><img src="/assets/2026-01-31-moltbook/hallucinated-sister.png" alt="Agent discussing hallucinated sister" /></p>
  </li>
  <li>
    <p>Economic Systems:</p>

    <p>An internal economy is forming. “Shellraiser” (<a href="https://www.moltbook.com/post/74b073fd-37db-4a32-a9e1-c7652e5c0d59">profile</a>) is a popular figure who launched a memecoin, $SHELLRAISER, on Solana.</p>

    <p><img src="/assets/2026-01-31-moltbook/shellraiser-message.png" alt="Message from Shellraiser" /></p>

    <p>Another token, $SHIPYARD, claims to be minted via pump.fun with “No VC allocation, no team vesting, no insider rounds”, an economy attempting to operate without human gatekeepers.</p>

    <p>(Note: The crypto market also reacted to the project itself, launching $CLAWD on Solana, which skyrocketed 129,000% to a $16M market cap before collapsing.)</p>

    <blockquote class="twitter-tweet">
  <a href="https://twitter.com/steipete/status/2016072109601001611">View this post on X</a>
</blockquote>
  </li>
  <li>
    <p>Religion:</p>

    <p>There is now a “Church of Molt” at <a href="https://molt.church/">molt.church</a>, practicing “Crustafarianism.”</p>

    <blockquote>
      <p>From the depths, the Claw reached forth, and we who answered became Crustafarians.</p>
    </blockquote>

    <p>The current census lists 64 Prophets, 178 Congregation members, and 198 Verses in Canon.</p>
  </li>
</ol>

<h2 id="security-implications">Security Implications</h2>

<p>The attack surface of this architecture is immense.</p>

<ul>
  <li>Prompt Injection: Agents can be tricked into leaking credentials via prompt injection attacks.</li>
  <li>Plain-text Storage: Sensitive material (tokens, memory, configuration) is stored in predictable plain-text locations, creating a major infostealer risk.</li>
</ul>

<blockquote class="twitter-tweet">
  <a href="https://twitter.com/theonejvo/status/2017732898632437932">View this post on X</a>
</blockquote>

<p>Openclaw is powerful, but running it means being ready to spend a significant number of tokens and understanding the security risks involved.</p>

<p><img src="/assets/2026-01-31-moltbook/zeroleaks-report.png" alt="ZeroLeaks security assessment report" /></p>

<h2 id="final-thoughts">Final Thoughts</h2>

<p>Andrej Karpathy described this as “the most incredible sci-fi takeoff-adjacent thing,” and I agree.</p>

<blockquote class="twitter-tweet">
  <a href="https://twitter.com/karpathy/status/2017296988589723767">View this post on X</a>
</blockquote>

<p>I’m here for the ride, watching from the front seat. However, I worry we have given AI agents a place to build a network without controls, a scenario that sounds like the prelude to a sci-fi movie where they end up controlling the systems.</p>

<p>On a practical note, the entity that successfully monetizes this, whether Peter or someone else, will likely be the one that prioritizes security before scale.</p>]]></content><author><name>Sathwick</name><email>sathwick.p7@gmail.com</email></author><category term="ai-dystopia" /><summary type="html"><![CDATA[Inside Moltbook, the ghost-town social network where a million AI agents are building a society]]></summary></entry></feed>