Agent Intermediate Representation

A formal IR for agent code, prompts, tools, and protocols.

AIR models LLM-based agent systems before deployment. It lowers framework code and natural-language instructions into one conservative representation so static passes can check dataflow, authority, topology, and policy intent.

4 kernel sorts
6 closed verbs
2 fidelity planes
1 security judgment

Principles

AIR treats an agent as a bilingual program.

Host code executes. Natural language instructs. AIR keeps both visible, typed, labeled, and separated so analyses can compare what the system does against what it claims.

01 Closed kernel

Values, instructions, declarations, and regions form the finite semantic core.

02 Dual-plane fidelity

Executed facts and instructed facts remain structurally separate.

03 Priced sinks

Every external position declares the label it is allowed to receive.

04 Coarsen, never mute

Unsupported constructs lower to conservative Opaque facts, not silence.

Closed kernel

AIR deliberately avoids an open operation registry for the analysis core. New frameworks attach dialect metadata and lowering patterns, but every security pass sees the same finite kernel.

Architecture

Lower many agent surfaces into one analyzable hub.

AIR is not a runtime and not an agent framework. It is a compiler-style analysis target placed between source artifacts and diagnostics, monitor inputs, or provenance outputs.

AIR architecture map
1. Parse surfaces

Framework and protocol frontends collect graph flow, tools, prompts, state, cards, and manifests.

2. Lower conservatively

Known constructs become kernel ops; unknown constructs become explicit Opaque regions or ops.

3. Verify and analyze

Passes run over labels, regions, capabilities, topology, memory, channels, and policy planes.

4. Emit evidence

Findings point to represented IR facts rather than runtime traces or prompt-only heuristics.
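The "lower conservatively" step can be illustrated with a small Python sketch. It is not the project's Rust API: the construct names in `KNOWN_CONSTRUCTS`, the `Op` record, and the `lower` and `lower_all` functions are hypothetical. The only things taken from this page are the six verb names, Opaque, and the rule that unknown constructs become explicit Opaque ops rather than disappearing.

```python
from dataclasses import dataclass
from enum import Enum


class Verb(Enum):
    """The six kernel verbs named on this page, plus Opaque."""
    ACQUIRE = "acquire"
    COMPOSE = "compose"
    CONSULT = "consult"
    DISPATCH = "dispatch"
    ACT = "act"
    MEDIATE = "mediate"
    OPAQUE = "opaque"


@dataclass(frozen=True)
class Op:
    verb: Verb
    source: str  # provenance: the frontend construct this op was lowered from


# Hypothetical frontend construct names; a real frontend would define its own.
KNOWN_CONSTRUCTS = {
    "load_prompt": Verb.ACQUIRE,
    "format_messages": Verb.COMPOSE,
    "llm_call": Verb.CONSULT,
    "route_tool": Verb.DISPATCH,
    "tool_call": Verb.ACT,
    "validate_output": Verb.MEDIATE,
}


def lower(construct: str) -> Op:
    """Map a construct to a kernel op; anything unrecognized becomes Opaque."""
    return Op(KNOWN_CONSTRUCTS.get(construct, Verb.OPAQUE), construct)


def lower_all(constructs: list[str]) -> list[Op]:
    # One op per input construct, so nothing is dropped silently.
    return [lower(c) for c in constructs]
```

Because the enum is closed, a pass that matches on `Verb` handles every possible op. An unrecognized construct such as `"eval_user_plugin"` still appears in the output as an Opaque op with its provenance attached.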

IR Model

The kernel is small because the agent surface is large.

AIR reduces diverse agent behavior to six verbs plus Opaque. Analyses do not need to know whether a construct came from LangGraph, MCP, A2A, ACP, or a custom runtime.

Acquire: sources (input, memory, state, text)
Compose: messages, formats, structs, projections
Consult: models, agents, tasks, embeddings
Dispatch: tool and agent selection
Act: tools, sends, writes, self-modification
Mediate: guards, validation, endorsement, declassify
Opaque: unsupported, conservative source and sink
lowered.air
air.module @support_bot {
  air.agent.func @triage(%msg: !air.text) -> !air.text {
    air.region @body Body fidelity = Executed {
    ^entry(%msg: !air.text):
      %policy = air.acquire.load @privacy_policy
      %ctx = air.compose.messages [%policy, %msg]
      %plan = air.consult.infer model = "model-router" intent = Plan(%ctx)
      %ticket = air.act.tool_call @crm_create_ticket(%plan)
      air.return %ticket
    }

    air.region @intent InstructedProcess fidelity = Instructed {
      // lifted policy facts remain advisory, then checked
      // against the executed region.
    }
  }
}
Cornerstone judgment: label(value) <= required(position)

Confidentiality, integrity, persistence, and curation move through the same dataflow machinery for code-derived and prose-derived facts.
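As a rough illustration of the judgment, the Python sketch below models a label as four integer levels compared pointwise. The integer encoding, the `Label` class, and the `flows_to` and `join` names are assumptions made for this example. This page names the four dimensions but does not specify their ordering.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Label:
    # Assumed encoding: a higher level means more sensitive or less trusted.
    confidentiality: int = 0  # 0 = public, 1 = private
    integrity: int = 0        # 0 = trusted, 1 = untrusted
    persistence: int = 0      # 0 = ephemeral, 1 = persistent
    curation: int = 0         # 0 = not model-curated, 1 = model-curated

    def flows_to(self, required: "Label") -> bool:
        """label(value) <= required(position), taken pointwise."""
        return (
            self.confidentiality <= required.confidentiality
            and self.integrity <= required.integrity
            and self.persistence <= required.persistence
            and self.curation <= required.curation
        )

    def join(self, other: "Label") -> "Label":
        """Least upper bound: the label of a value built from both inputs."""
        return Label(
            max(self.confidentiality, other.confidentiality),
            max(self.integrity, other.integrity),
            max(self.persistence, other.persistence),
            max(self.curation, other.curation),
        )


# An untrusted user message composed with a private policy document.
msg = Label(integrity=1)
policy = Label(confidentiality=1)
ctx = msg.join(policy)

# A hypothetical sink that accepts private data but requires trusted input.
crm_sink = Label(confidentiality=1, integrity=0)
violation = not ctx.flows_to(crm_sink)
```

Here `violation` is `True`: the composed context carries the untrusted integrity level of the user message, which the sink does not accept. The policy document alone would pass the same check.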

Security Research Surface

Static checks for risks that prompt scanners miss.

AIR findings are grounded in represented flows and declarations: where values come from, which region carries them, which agent receives them, and which sink prices them.

Taint

Prompt injection and data leakage

Track untrusted, private, persistent, and model-curated values into tools, egress, state, and memory.

Authority

Capability reachability

Expose confused-deputy chains and unused high-risk grants across tools and delegated agents.

Topology

Multi-agent structure

Find dead agents, rogue participants, unsafe dynamic edges, and cyclic delegation with missing budgets.

Policy

Intent versus execution

Compare lifted natural-language policy with executed code facts without treating prose as enforcement.
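One way to picture the intent-versus-execution comparison is the Python sketch below. It assumes a single hypothetical kind of instructed fact ("mediate before acting on this tool") and a straight-line list of executed ops. The `ExecOp`, `InstructedFact`, and `unenforced` names are invented for the example and are not AIR's data model.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ExecOp:
    verb: str    # kernel verb, e.g. "consult", "mediate", "act"
    target: str  # symbol the op refers to, e.g. a tool name


@dataclass(frozen=True)
class InstructedFact:
    # Hypothetical fact shape lifted from prose: "mediate before acting on <tool>".
    tool: str


def unenforced(instructed, executed):
    """Return instructed facts that the executed ops do not actually uphold.

    The instructed facts are only compared against execution; they never
    add a guard or relax a check on their own.
    """
    missing = []
    for fact in instructed:
        guarded = False
        upheld = True
        for op in executed:  # straight-line order; branches are not modeled here
            if op.target != fact.tool:
                continue
            if op.verb == "mediate":
                guarded = True
            elif op.verb == "act" and not guarded:
                upheld = False
        if not upheld:
            missing.append(fact)
    return missing


executed = [
    ExecOp("consult", "model-router"),
    ExecOp("act", "crm_create_ticket"),
]
instructed = [InstructedFact("crm_create_ticket")]
findings = unenforced(instructed, executed)
```

In this example `findings` contains the one instructed fact, because the executed ops call the tool without any preceding mediate op. The prose claim is reported as unenforced rather than trusted.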

Representative scanner-level findings

Code | Finding | IR evidence | Why AIR can represent it
S8101 | DeadAgent | Declared agent unreachable from entries, topology, handoff, spawn, or cards. | Agents and edges are first-class module facts.
S8201 | AgentLoopDoS | Delegation cycle lacks finite iteration, model-call, or wall-clock budget. | Regions, topology, and budgets share one graph.
S8202 | ModelCallAmplification | Model call appears inside natural loop or recursive region without tight budget. | RegionFlow and dominance make loop-contained calls visible.
S8302 | DynamicToolSupplyChain | Dynamic tool origin lacks provenance, schema, trust policy, or allowlist. | Tool declarations are priced authority surfaces, not comments.
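The first two findings reduce to ordinary graph checks. The Python sketch below is a simplified illustration: it collapses topology, handoff, spawn, and card edges into one hypothetical `edges` map, and it assumes that a budget declared on any agent in a cycle bounds that cycle. Neither simplification comes from AIR itself.

```python
def dead_agents(agents, entries, edges):
    """S8101-style check: declared agents unreachable from any entry."""
    seen = set()
    stack = list(entries)
    while stack:
        agent = stack.pop()
        if agent in seen:
            continue
        seen.add(agent)
        stack.extend(edges.get(agent, ()))
    return sorted(set(agents) - seen)


def unbudgeted_cycle_members(edges, budgeted):
    """S8201-style check: agents on a delegation cycle with no budgeted member."""
    # Drop budgeted agents; any cycle that survives has no budget on it.
    free = {
        a: [b for b in outs if b not in budgeted]
        for a, outs in edges.items()
        if a not in budgeted
    }
    members = []
    for start in free:
        stack = list(free[start])
        seen = set()
        while stack:
            node = stack.pop()
            if node == start:  # found a path from start back to itself
                members.append(start)
                break
            if node in seen:
                continue
            seen.add(node)
            stack.extend(free.get(node, ()))
    return sorted(members)


# Hypothetical module: "legacy" is declared but never delegated to.
agents = ["triage", "billing", "refunds", "legacy"]
entries = ["triage"]
edges = {
    "triage": ["billing"],
    "billing": ["refunds"],
    "refunds": ["billing"],
    "legacy": [],
}
```

With these inputs, `dead_agents(agents, entries, edges)` returns `["legacy"]`, and `unbudgeted_cycle_members(edges, budgeted=set())` returns `["billing", "refunds"]`. Declaring a budget on `"refunds"` clears the second result.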

Frontends

Framework and protocol facts lower through dialects.

Dialects preserve provenance and round-trip metadata, but they do not change kernel semantics. That is the boundary that lets new ecosystems join without weakening old analyses.

LangGraph

StateGraph flow, nodes, conditional edges, tools, prompts, state schemas.

MCP

Server manifests, tool schemas, resources, prompts, sampling surfaces.

A2A

Agent cards, skills, task calls, streaming subscriptions, external claims.

ACP

Sessions, subagents, streaming updates, task lifecycle, protocol contracts.

CrewAI

Agent roles, tasks, crews, hierarchical delegation, tool grants.

Pi-Agent

Planner/executor structure, dynamic tool and policy surfaces.
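The dialect boundary described at the top of this section can be pictured as follows. This is a Python sketch under assumptions: the `KernelOp` record, the `kernel_view` function, and the metadata keys are hypothetical. The only property it illustrates is that analyses read the kernel part of an op and ignore dialect metadata.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class KernelOp:
    verb: str             # one of the closed kernel verbs
    operands: tuple = ()  # kernel-level operands, e.g. a tool symbol
    dialect: str = ""     # which frontend produced the op
    meta: tuple = ()      # (key, value) provenance pairs; not read by analyses


def kernel_view(op):
    """The part of an op that a security pass is allowed to inspect."""
    return (op.verb, op.operands)


def count_acts(ops):
    """A toy pass: it behaves identically for every dialect."""
    return sum(1 for op in ops if kernel_view(op)[0] == "act")


# The same tool call arriving through two hypothetical frontends.
from_langgraph = KernelOp(
    "act", ("crm_create_ticket",),
    dialect="langgraph", meta=(("node", "create_ticket"),),
)
from_mcp = KernelOp(
    "act", ("crm_create_ticket",),
    dialect="mcp", meta=(("server", "crm"),),
)
```

The two ops differ as records, so provenance survives for round-tripping, but `kernel_view` returns the same pair for both. A pass written against that view needs no changes when a new dialect is added.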

Project

Build, inspect, and extend AIR.

The implementation is a Rust workspace with a closed kernel, a textual AIR format, analyses, driver pipelines, and frontend crates. For research or security inquiries, email security@centaurisk.ai.