Stacking policy: defaults + fine-grained overrides for high-value secrets

We built a real ABAC policy engine on top of the identity and step-up work. You can now express 'everything under LLMS/* needs a human in the loop' as a default, while stacking a more specific rule that says only principals with an openai_admin label can write or delete under LLMS/OPENAI. The client (kpm) consumes the policy signals: kpm get forces step-up, kpm env and kpm run warn about high-value paths, and strict mode makes every decrypt a fresh policy-checked round-trip.

The question that finally made the policy engine feel real was simple.

“I want a default that says all LLMS/* keys require cert+human. But for the actual OpenAI ones (LLMS/OPENAI_API_KEY and friends) I only want people who also have the openai_admin label to be able to write or delete them.”

We already had the pieces: mTLS certificates carrying stable user + device identity, the auth_strength claim (“cert-only” vs “cert+human”), WebAuthn step-up with a short sliding TTL, and strict mode where the child process never holds a local session key and every decrypt is a fresh mTLS call back to the server.

What we didn’t have (yet) was a clean way to express those layered requirements in one place, enforced by the server, and automatically visible to the kpm client.

The foundation we already shipped

From the earlier parts in this series:

Part 10 gave us multi-principal identity in the certificates (user:xxx + device:yyy).
Part 11 shipped the WebAuthn ceremony and the auth_strength distinction so a session could prove “a human was recently present.”
Part 12 added the sudo-like admin tooling and made interactive privileged operations require a fresh step-up.

We also had the three security levels (Plaintext / Secure with a local session key / Strict with no local key and per-decrypt round-trips) and the observation that strict mode is especially powerful because the server sees the full authenticated principal and the current auth_strength on every single secret access.

The missing piece was the policy engine that could actually make decisions based on all of that.

Our own ABAC engine (not just “use Vault policies”)

AgentKMS has always been able to use OpenBao or Vault as a storage backend (Transit for keys, KV for the registry, etc.). For the actual authorization rules — who can do what to which paths under what conditions — we built a small, purpose-built policy engine instead of trying to shoehorn everything into Vault’s native policy language.

The engine is:

Deny-by-default.
First-match on an ordered list of rules.
Rich conditions: identity (user_id, device_id, team, caller_id glob, roles), operations, key prefixes or explicit key IDs, auth_strength, time windows, per-rule rate limits, and bounds that can carry custom parameters down to credential vending.

Policies are written in a tiny versioned YAML format (version: "1"). You can load them from a local file (great for dev) or have the server hot-reload them from a Vault/OpenBao KV path at runtime.

The stacking pattern

Here’s the policy that directly answers the question:

# docs/examples/policy-llm-stacked.yaml
version: "1"

rules:
  # 1. Most specific stacked constraint
  - id: allow-openai-mutations-for-admins
    description: >
      Only principals carrying the openai_admin label (in addition to
      cert+human) may write, delete, or purge under LLMS/OPENAI.
    match:
      identity:
        roles: ["openai_admin"]
      operations:
        - secret_write
        - secret_delete
        - secret_purge
      key_prefix: "LLMS/OPENAI"
      auth_strength: "cert+human"
    effect: allow

  # 2. Covering deny for the sensitive sub-operations
  - id: deny-openai-mutations-unless-admin
    description: >
      Deny write/delete/purge on the OpenAI sub-tree unless the caller
      matched the privileged admin rule above.
    match:
      operations:
        - secret_write
        - secret_delete
        - secret_purge
      key_prefix: "LLMS/OPENAI"
    effect: deny

  # 3. The broad default: "all LLMS/* require cert+human"
  - id: require-cert-human-for-llm
    description: >
      Baseline protection for all LLM provider credentials.
    match:
      key_prefix: "LLMS/"
      auth_strength: "cert+human"
    effect: allow

  # 4. (Optional) general crypto can stay cert-only
  - id: allow-general-crypto-cert-only
    match:
      identity: { caller_id_pattern: "*" }
      operations: [sign, encrypt, decrypt, list_keys, rotate_key, credential_vend]
      auth_strength: "cert-only"
    effect: allow

How the stacking actually works

Rule order is the composition mechanism.

Most specific first.
A narrow allow that requires the extra label plus the strength.
A covering deny for the dangerous operations on that prefix (this catches non-admins even if they have cert+human).
Broader defaults later.

Evaluation stops at the first match.

Admin with fresh step-up trying to write LLMS/OPENAI_API_KEY → matches rule 1 → allow.
Normal developer with fresh step-up trying to write the same key → skips rule 1 (no openai_admin label) → hits rule 2 → deny.
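Anyone with only cert-only attempting a secret operation under LLMS/ → never matches an allow, because rules 1 and 3 both require cert+human and rule 4 only covers the general crypto operations (a mutation under LLMS/OPENAI is caught even earlier by the explicit deny in rule 2) → denied, by deny-by-default in every other case.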
Normal developer writing a non-OpenAI LLM key with cert+human → skips the first two rules (wrong prefix) → matches rule 3 → allow.

This is how you express “default policy for the whole category, plus an additional constraint on a sensitive subtree” without duplicating rules or inventing a full inheritance system.
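The walkthrough above can be replayed with a minimal first-match evaluator. This is a sketch under stated assumptions, not the code in internal/policy/: the Request and Rule types, the exact-equality check on auth_strength, the "any listed role" semantics, the secret_read operation name, and the LLMS/ANTHROPIC_API_KEY key are all hypothetical and chosen only for illustration. The optional rule 4 is left out because none of the four scenarios reach it.

```go
package main

import (
	"fmt"
	"strings"
)

// Request is a hypothetical, simplified view of one call as the engine sees it.
type Request struct {
	Roles        []string // labels carried by the caller's identity
	Operation    string
	KeyID        string
	AuthStrength string // "cert-only" or "cert+human"
}

// Rule mirrors a subset of the YAML fields. Empty fields match anything.
type Rule struct {
	ID           string
	Roles        []string // assumed: caller must carry at least one of these
	Operations   []string
	KeyPrefix    string
	AuthStrength string // assumed: exact equality; a real engine may treat it as a minimum
	Allow        bool
}

func contains(list []string, s string) bool {
	for _, v := range list {
		if v == s {
			return true
		}
	}
	return false
}

func anyShared(want, have []string) bool {
	for _, w := range want {
		if contains(have, w) {
			return true
		}
	}
	return false
}

func (r Rule) matches(req Request) bool {
	if len(r.Roles) > 0 && !anyShared(r.Roles, req.Roles) {
		return false
	}
	if len(r.Operations) > 0 && !contains(r.Operations, req.Operation) {
		return false
	}
	if !strings.HasPrefix(req.KeyID, r.KeyPrefix) {
		return false
	}
	return r.AuthStrength == "" || r.AuthStrength == req.AuthStrength
}

// Evaluate stops at the first matching rule. No match means deny with an empty rule ID.
func Evaluate(rules []Rule, req Request) (allowed bool, ruleID string) {
	for _, r := range rules {
		if r.matches(req) {
			return r.Allow, r.ID
		}
	}
	return false, ""
}

var mutations = []string{"secret_write", "secret_delete", "secret_purge"}

// stacked holds rules 1 to 3 from the YAML example, in the same order.
var stacked = []Rule{
	{ID: "allow-openai-mutations-for-admins", Roles: []string{"openai_admin"},
		Operations: mutations, KeyPrefix: "LLMS/OPENAI", AuthStrength: "cert+human", Allow: true},
	{ID: "deny-openai-mutations-unless-admin",
		Operations: mutations, KeyPrefix: "LLMS/OPENAI", Allow: false},
	{ID: "require-cert-human-for-llm", KeyPrefix: "LLMS/", AuthStrength: "cert+human", Allow: true},
}

func main() {
	cases := []Request{
		{[]string{"developer", "openai_admin"}, "secret_write", "LLMS/OPENAI_API_KEY", "cert+human"},
		{[]string{"developer"}, "secret_write", "LLMS/OPENAI_API_KEY", "cert+human"},
		{[]string{"developer"}, "secret_read", "LLMS/ANTHROPIC_API_KEY", "cert-only"},
		{[]string{"developer"}, "secret_write", "LLMS/ANTHROPIC_API_KEY", "cert+human"},
	}
	for _, c := range cases {
		allowed, id := Evaluate(stacked, c)
		fmt.Printf("%s %s (%s): allowed=%v rule=%q\n", c.Operation, c.KeyID, c.AuthStrength, allowed, id)
	}
}
```

The four cases correspond to the four scenarios in order: allowed by rule 1, denied by rule 2, denied with no matching rule, and allowed by rule 3. Swapping the first two rules in the slice would deny the admin as well, which is why the order matters.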

Custom roles and the identity model

We relaxed the old hardcoded list of roles (developer / service / agent) in the policy validator. Any string is now a valid role in a rule. The matcher also consults the raw CallerOU from the client certificate (preserved even if it doesn’t parse to one of the three standard roles).

That means you can have a certificate whose OU (or an additional label you attach at enrollment or via token claims) says openai_admin, and policy rules can reference it directly. The primary Role in the identity can stay developer; the extra label is just another dimension the policy engine can see.
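One way to picture that matching is a helper that checks the primary role, the raw OU, and any extra labels. The sketch below is illustrative only: Role and CallerOU are named in the text above, but the Identity struct, the Labels field, and the HasRole function are assumptions, not the actual types in the AgentKMS tree.

```go
package main

import "fmt"

// Identity is a hypothetical shape for the authenticated caller.
type Identity struct {
	Role     string   // primary role, e.g. "developer"
	CallerOU string   // raw OU from the client certificate, kept even if non-standard
	Labels   []string // assumed: extra labels attached at enrollment or via token claims
}

// HasRole reports whether the identity carries the wanted role through any channel.
func HasRole(id Identity, want string) bool {
	if want == "" {
		return false
	}
	if id.Role == want || id.CallerOU == want {
		return true
	}
	for _, l := range id.Labels {
		if l == want {
			return true
		}
	}
	return false
}

func main() {
	viaOU := Identity{Role: "developer", CallerOU: "openai_admin"}
	viaLabel := Identity{Role: "developer", CallerOU: "developer", Labels: []string{"openai_admin"}}
	plain := Identity{Role: "developer", CallerOU: "developer"}

	fmt.Println(HasRole(viaOU, "openai_admin"))    // true: matched on the raw OU
	fmt.Println(HasRole(viaLabel, "openai_admin")) // true: matched on an extra label
	fmt.Println(HasRole(plain, "openai_admin"))    // false
	fmt.Println(HasRole(viaOU, "developer"))       // true: primary role is unchanged
}
```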

What the kpm client does with all of this

This isn’t just server-side theater. The client is a first-class consumer of the policy signals:

kpm get LLMS/OPENAI_API_KEY on an interactive TTY now forces a fresh WebAuthn step-up (the sudo model we added for admin commands, now also applied to direct high-value reads). Non-interactive / boot paths stay on the device certificate.
When you run kpm env or kpm run, the resolver sees the high_value, strict_required, and min_auth_strength hints that come back from the server (when the handlers choose to surface them). It prints notes like “note: LLMS/OPENAI_API_KEY is high-value (or requires strict) per server policy — consider --strict”.
In --strict mode the background listener never holds a local session key. Every time the child process actually dereferences the secret, it does a fresh mTLS round-trip. The server policy engine evaluates the call with the current bearer (including its auth_strength claim) and the exact path. This is the “rockstar path” we talked about earlier — no local material, per-use policy check, immediate revocation possible.

Change the policy on the server and the behavior changes for everyone without touching client configs or wrapping every command in extra flags.
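To make the client side concrete, here is a rough sketch of how a client could turn those hints into behavior. The hint names (high_value, strict_required, min_auth_strength) come from the text above; the Hints struct, NeedsStepUp, and StrictNote are hypothetical, and the real kpm logic (including how it tracks step-up freshness) may well differ.

```go
package main

import "fmt"

// Hints is a hypothetical container for the policy hints the server may return.
type Hints struct {
	HighValue       bool
	StrictRequired  bool
	MinAuthStrength string // e.g. "cert+human"; empty when the server sends nothing
}

// NeedsStepUp sketches the interactive get path: only a TTY session that is
// not already at cert+human is asked to step up. Non-interactive callers stay
// on the device certificate. Freshness (the sliding TTL) is not modeled here.
func NeedsStepUp(h Hints, interactive bool, currentStrength string) bool {
	if !interactive || currentStrength == "cert+human" {
		return false
	}
	return h.HighValue || h.MinAuthStrength == "cert+human"
}

// StrictNote sketches the env/run path: it returns a warning for paths the
// server flags, or an empty string when strict mode is already on.
func StrictNote(path string, h Hints, strict bool) string {
	if strict || !(h.HighValue || h.StrictRequired) {
		return ""
	}
	return fmt.Sprintf("note: %s is high-value (or requires strict) per server policy; consider --strict", path)
}

func main() {
	h := Hints{HighValue: true, MinAuthStrength: "cert+human"}

	fmt.Println(NeedsStepUp(h, true, "cert-only"))  // interactive read of a flagged path
	fmt.Println(NeedsStepUp(h, false, "cert-only")) // boot path: no prompt
	fmt.Println(StrictNote("LLMS/OPENAI_API_KEY", h, false))
	fmt.Println(StrictNote("LLMS/OPENAI_API_KEY", h, true) == "")
}
```

Keeping both decisions as pure functions of the server hints means the client carries no per-path configuration of its own, which matches the point above: the policy changes on the server and the client behavior follows.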

Loading and operations

For day-to-day work you point the dev server or cmd/server at a local file:

agentkms --policy=docs/examples/policy-llm-stacked.yaml

In production you can (and should) use the VaultPolicyLoader. Point it at a KV v2 path, give it a token with read access, and it will poll and atomically swap the live engine when the document changes. There’s a local fallback for bootstrap and a health signal so you can see when policy is stale.

The engine is hot-reload safe. In-flight evaluations finish against the policy that was current when they started.
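A common way to get that property in Go is to treat each loaded policy as an immutable snapshot and publish it through an atomic pointer. The sketch below illustrates the idea only; it is not confirmed to be how the AgentKMS engine is built, and the Policy, Engine, Swap, and Snapshot names are hypothetical. It needs Go 1.19 or later for atomic.Pointer.

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// Policy is an immutable snapshot. A reload builds a new value instead of
// mutating the live one. Allowed is a stand-in for the compiled rule list.
type Policy struct {
	Version string
	Allowed map[string]bool
}

// Allows is a placeholder for real rule evaluation.
func (p *Policy) Allows(op string) bool { return p.Allowed[op] }

// Engine holds the live policy behind an atomic pointer.
type Engine struct {
	live atomic.Pointer[Policy]
}

func NewEngine(p *Policy) *Engine {
	e := &Engine{}
	e.live.Store(p)
	return e
}

// Swap publishes a new snapshot. Callers that already hold the old snapshot
// keep using it until they finish.
func (e *Engine) Swap(p *Policy) { e.live.Store(p) }

// Snapshot loads the pointer once so a whole evaluation sees one consistent policy.
func (e *Engine) Snapshot() *Policy { return e.live.Load() }

func main() {
	e := NewEngine(&Policy{Version: "a", Allowed: map[string]bool{"secret_write": true}})

	inflight := e.Snapshot() // an evaluation starts against version "a"
	e.Swap(&Policy{Version: "b", Allowed: map[string]bool{}})

	// The in-flight evaluation still sees the policy it started with.
	fmt.Println(inflight.Version, inflight.Allows("secret_write"))
	// New evaluations see the swapped-in policy.
	fmt.Println(e.Snapshot().Version, e.Snapshot().Allows("secret_write"))
}
```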

Why this model

We deliberately did not try to make the entire authorization story live only inside Vault policy HCL. We needed:

First-class auth_strength as a matcher (tied to our WebAuthn step-up flow).
Stable user_id vs per-device device_id distinctions.
The ability to return extra metadata (high_value, strict_required) that the kpm client understands.
Per-rule rate limits, time windows, and bounds that flow into credential vending.
A small, auditable Go implementation with clear deny-by-default semantics and excellent error messages that include the matched rule ID.

Vault/OpenBao is still the right place for the storage and for dynamic secret backends. The AgentKMS policy engine sits in front as the single source of truth for “who is allowed to do what, under what strength and conditions.”

Where this is going

The current single-document + ordered-rules approach is already expressive enough for the “default + exception” cases we care about. If we later want true multi-file stacking (company default + team overlay + path-specific deltas merged with explicit precedence), we can add a small merger in the loader. The rule model already supports bounds.max_params for carrying custom data down to plugins and vended credentials.

We’re also starting to use the same engine for anomaly detection hooks and for driving the “high value” classification that tells kpm to prefer strict mode or force step-up.

The policy engine finally makes the sentence we keep repeating true in practice:

The server policy engine is the source of truth. The client just consumes the signals.

If you’ve been following the series, this is the piece that ties the identity model, the step-up ceremony, strict mode, and the client UX together into something you can actually reason about and change in one place.

Next time someone asks “how do we make sure only the right people can touch the OpenAI keys, but we still don’t make every normal LLM read painful?”, you’ll have a short, declarative answer and a policy file you can show them.

See also: Part 10 (multi-principal identity in certs), Part 11 (shipping WebAuthn and auth_strength), Part 12 (admin tooling and the sudo model for interactive privileged ops).

The example policy lives at docs/examples/policy-llm-stacked.yaml in the AgentKMS tree. The engine itself is in internal/policy/ (engine.go, rules.go, loader.go, vault_loader.go). All of the recent kpm changes that consume high_value / strict_required and call EnsureFreshStepUp on interactive get are in the kpm repo.