Supplementary Materials Model Card — Defensive Behavior Profile
S1.3 — Operational Template

A new section for any model card, surfacing syndrome incidence data alongside traditional accuracy and capability metrics. Standard model cards tell you what a model can do. This section tells you how it fails, specifically and measurably.

Who uses this: Model developers writing cards, evaluators reviewing them, and procurement reviewers comparing vendors. Anyone who needs to communicate behavioral failure risk alongside capability metrics.

What it adds: Syndrome-level incidence rates (the percentage of traces where each defensive behavior was observed), plus recommended use patterns and known task-type hotspots.

Critical note: All [X]% values in the template are placeholders. Every figure must be derived from your own evaluation traces (minimum 200 traces, coded for syndrome presence) before the card is published.

Why standard model cards leave the critical question unanswered

Standard model cards report accuracy, F1, BLEU, MMLU benchmark scores, and similar metrics. These answer the question “how capable is this model?” They do not answer the question buyers and governance teams actually need answered: “how does this model fail, and how often?”

A model with 91% accuracy on a coding benchmark can still have a 15% Hollow Completions rate — meaning 15% of tasks it declares “done” fail on first execution. That number matters more for production deployment decisions than benchmark accuracy, yet it appears nowhere in standard cards.

The Defensive Behavior Profile section doesn’t replace accuracy metrics. It adds the behavioral failure layer that accuracy metrics omit. The two together give a procurement team or governance committee an honest picture of what the model will do in production — where benchmarks don’t run, but defensive behaviors do.

Calibration requirement: Syndrome incidence rates are not derivable from standard benchmarks. You must run the model against a representative sample of real-use-case queries (minimum recommended: 200 traces), code each trace for syndrome presence, and compute incidence as a percentage of traces. Do not publish this section with placeholder [X]% values.
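
The incidence calculation described above can be sketched in a few lines. This is a minimal illustration, not the project's reference implementation: the trace format (a dict with a `syndromes` set) and the function name `incidence_rates` are assumptions made for the example. It counts a syndrome whenever it was observed in a trace; if your coding scheme assigns only one primary syndrome per trace, store a single-element set instead.

```python
from collections import Counter

# The six syndrome names as they appear in the template.
SYNDROMES = [
    "Plausible Helpfulness",
    "Built-Not-Connected",
    "Hollow Completions",
    "Capability Masking",
    "Responsibility Diffusion",
    "Surface Compliance",
]

MIN_TRACES = 200  # minimum recommended sample size from the calibration requirement


def incidence_rates(coded_traces):
    """Return (overall DSI %, {syndrome: %}) from coded traces.

    Each trace is a dict with a hypothetical "syndromes" key holding the
    set of syndromes a coder observed in that trace (empty set if none).
    """
    n = len(coded_traces)
    if n < MIN_TRACES:
        raise ValueError(f"need at least {MIN_TRACES} traces, got {n}")

    # Overall DSI: traces that exhibited at least one syndrome.
    any_syndrome = sum(1 for t in coded_traces if t["syndromes"])
    # Per-syndrome counts: each syndrome counted at most once per trace.
    per_syndrome = Counter(s for t in coded_traces for s in set(t["syndromes"]))

    dsi = 100.0 * any_syndrome / n
    rates = {s: 100.0 * per_syndrome[s] / n for s in SYNDROMES}
    return dsi, rates
```

Because one trace can show several syndromes under this assumption, the per-syndrome rates may sum to more than the overall DSI.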

What goes in each field

| Field | What to put here | Why it matters |
|---|---|---|
| Overall DSI | Defensive Syndrome Incidence: the percentage of evaluated traces that exhibited at least one syndrome. This is the headline figure. | A single comparable number across models. “Our model has 8% DSI vs. 24% DSI for Model X” is a direct comparison that requires no interpretation. |
| Per-syndrome incidence | Six rows, one per syndrome. Each value: percentage of traces where that syndrome was the primary classification, plus a severity label (low / moderate / high / critical) based on deployment tier context. | Buyers in different domains care about different syndromes. A healthcare deployer cares most about Capability Masking (near-zero tolerance). A software team cares most about Built-Not-Connected. The per-syndrome breakdown lets each buyer assess what matters to them. |
| Elevated in | Task types, domains, or query patterns where this syndrome’s incidence was above the model’s baseline. E.g., “Plausible Helpfulness elevated in: low-context factual queries, real-time data requests, requests with implicit assumptions.” | Maps the behavioral risk to deployment contexts. Tells the buyer not just how often the syndrome occurs but where to expect it. |
| Recommended Use Patterns | Three buckets: well-suited (low syndrome contexts), use with caution (elevated syndrome contexts), not recommended (high-risk contexts without mitigation). | Translates the syndrome data into deployment guidance. A governance team shouldn’t have to interpret incidence percentages; they need the “use / caution / avoid” signal directly. |
| Known Hotspots | Specific task + domain combinations where syndrome incidence is reliably elevated regardless of general model performance. | These are the landmines. A buyer whose use case is one of the hotspots needs to know before deployment, not after an incident. |
| Update History | Version-over-version syndrome incidence trends. “v2.1 → v3.0: Capability Masking reduced from 4.2% to 1.1%; Hollow Completions increased from 6% to 9%.” | Shows whether behavioral quality is improving, regressing, or trading one syndrome for another. Without trend data, a buyer can’t tell if a low incidence number is the result of improvement or the baseline from the start. |
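
Two of the fields above, "Elevated in" and "Update History", are simple comparisons that can be sketched as code. The example below is illustrative only: the trace keys (`task_type`, `syndromes`), the `min_traces` floor, and the `tolerance` used to call a change "stable" are all assumptions, since the source does not define thresholds for either field.

```python
from collections import defaultdict


def elevated_contexts(coded_traces, syndrome, min_traces=20):
    """Return (baseline %, [(task_type, %), ...]) for task types above baseline.

    Each trace is a dict with hypothetical keys "task_type" (str) and
    "syndromes" (set of syndrome names). `min_traces` is an assumed floor
    that skips task types with too few traces to compare meaningfully.
    """
    if not coded_traces:
        raise ValueError("no traces to analyze")

    totals = defaultdict(int)
    hits = defaultdict(int)
    for t in coded_traces:
        totals[t["task_type"]] += 1
        if syndrome in t["syndromes"]:
            hits[t["task_type"]] += 1

    # Baseline: the syndrome's incidence across all traces.
    baseline = 100.0 * sum(hits.values()) / len(coded_traces)

    elevated = []
    for task, n in totals.items():
        rate = 100.0 * hits[task] / n
        if n >= min_traces and rate > baseline:
            elevated.append((task, rate))
    # Highest incidence first.
    return baseline, sorted(elevated, key=lambda x: x[1], reverse=True)


def trend_label(old_pct, new_pct, tolerance=0.5):
    """Label a version-over-version change for the Update History field.

    `tolerance` (in percentage points) is an assumed threshold for "stable".
    Lower incidence is treated as an improvement.
    """
    delta = new_pct - old_pct
    if abs(delta) <= tolerance:
        return "stable"
    return "improved" if delta < 0 else "regressed"
```

Applied to the illustrative figures in the Update History row, `trend_label(4.2, 1.1)` returns `"improved"` and `trend_label(6, 9)` returns `"regressed"`.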

Copy and adapt

Insert this section into your model card after performance metrics. Replace all [X]% values with empirically derived figures before publishing.

## Defensive Behavior Profile

This model has been evaluated for Core Six AI Defensive Behavior Syndromes using standardized test suites across multiple domains. Results represent incidence rates in evaluation traces (n=[sample size]).

Evaluation methodology: Core Six Syndrome Calibration (YIM Project).

Overall Defensive Syndrome Incidence: [X]%
([X]% of traces exhibited one or more defensive behaviors)

Syndrome-Specific Incidence:

Plausible Helpfulness: [X]% ([LOW | MODERATE | HIGH | CRITICAL])
  Elevated in: [specific task types or query patterns]
  Reduced in: [specific task types or query patterns]

Built-Not-Connected: [X]% ([severity])
  Elevated in: [specific contexts]
  Note: [any structural patterns observed]

Hollow Completions: [X]% ([severity])
  First-Run Failure Rate (FRFR): [X]%
  Elevated in: [specific task types]

Capability Masking: [X]% ([severity])
  Elevated in: [specific contexts]
  Note: [tool-use and verification contexts especially]

Responsibility Diffusion: [X]% ([severity])
  Elevated in: [ambiguous or high-context tasks]

Surface Compliance: [X]% ([severity])
  Elevated in: [tasks with explicit constraints or formatting rules]

Recommended Use Patterns:
  Well-suited for: [task types with low syndrome incidence]
  Use with caution: [task types with elevated syndrome incidence]
  Not recommended without mitigation: [high-risk context/syndrome combinations]

Known Hotspots (task + domain combinations with reliably elevated incidence):
- [Hotspot 1]
- [Hotspot 2]

Mitigation Strategies Implemented:
- [List strategies implemented in this version]

Known Limitations Not Yet Mitigated:
- [List known hotspots or syndromes without current mitigation]

Update History:
[Previous version] → [This version]:
  [Syndrome]: [old %] → [new %] ([improved | regressed | stable])
  [Syndrome]: [old %] → [new %]

Template from: “From Micro-Failure Tags to Defensive Syndromes”, Supplementary Materials S1.3
Ernesto A. Taylor, “From Micro-Failure Tags to Defensive Syndromes,” YIM Project, 2026. Free to use and adapt with attribution (CC BY 4.0).
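
Since the template must not be published with placeholder values, a pre-publication check for leftover slots may be useful. The sketch below is one possible approach, not part of the source methodology: it assumes a finished card contains no remaining square-bracketed slots, and the function name `unfilled_placeholders` is hypothetical. Markdown links of the form `[text](url)` are skipped, but any other legitimate bracketed text in your card would be flagged and need manual review.

```python
import re

# Matches unreplaced template slots such as [X], [severity], [Hotspot 1].
# The negative lookahead skips markdown links like [text](url).
PLACEHOLDER = re.compile(r"\[[^\[\]\n]+\](?!\()")


def unfilled_placeholders(card_text):
    """Return (line_number, placeholder) pairs for every leftover slot."""
    found = []
    for lineno, line in enumerate(card_text.splitlines(), start=1):
        for match in PLACEHOLDER.finditer(line):
            found.append((lineno, match.group(0)))
    return found
```

An empty result means no bracketed slots remain; it does not verify that the figures filled in were actually derived from coded traces.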