Model Card — Defensive Behavior Profile

A section to add to any model card, surfacing syndrome incidence data alongside traditional accuracy and capability metrics. Standard model cards tell you what a model can do. This section tells you how it fails, specifically and measurably.

Who uses this

Model developers writing cards, evaluators reviewing them, and procurement reviewers comparing vendors. Anyone who needs to communicate behavioral failure risk alongside capability metrics.

What it adds

Syndrome-level incidence rates — the percentage of traces where each defensive behavior was observed — plus recommended use patterns and known task-type hotspots.

Critical note

All [X]% values in the template are placeholders. Every figure must be derived from your own evaluation traces — minimum 200 traces, coded for syndrome presence — before the card is published.
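A pre-publication check can catch leftover placeholders automatically. The sketch below is one possible approach, not part of the published template: the function name `unfilled_placeholders` and the regular expression are assumptions, and the check treats any bracketed text as an unfilled slot, so it assumes the finished card contains no legitimate square brackets (for example, markdown links).

```python
import re

# Assumption: every template slot is written in square brackets,
# e.g. [X]%, [severity], [Hotspot 1]. Hypothetical helper, not from the source.
PLACEHOLDER = re.compile(r"\[[^\[\]\n]+\]")


def unfilled_placeholders(card_text):
    """Return (line_number, placeholder) pairs still present in the card text."""
    found = []
    for lineno, line in enumerate(card_text.splitlines(), start=1):
        for match in PLACEHOLDER.finditer(line):
            found.append((lineno, match.group(0)))
    return found


if __name__ == "__main__":
    draft = (
        "Overall Defensive Syndrome Incidence: [X]%\n"
        "Hollow Completions: 9.0% (MODERATE)\n"
        "Elevated in: [specific task types]"
    )
    for lineno, slot in unfilled_placeholders(draft):
        print(f"line {lineno}: unfilled placeholder {slot}")
```

A publishing pipeline could refuse to release the card whenever this function returns a non-empty list.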

Background

Why standard model cards leave the critical question unanswered

Standard model cards report accuracy, F1, BLEU, MMLU benchmark scores, and similar metrics. These answer the question “how capable is this model?” They do not answer the question buyers and governance teams actually need answered: “how does this model fail, and how often?”

A model with 91% accuracy on a coding benchmark can still have a 15% Hollow Completions rate — meaning 15% of tasks it declares “done” fail on first execution. That number matters more for production deployment decisions than benchmark accuracy, yet it appears nowhere in standard cards.

The Defensive Behavior Profile section doesn’t replace accuracy metrics. It adds the behavioral failure layer that accuracy metrics omit. The two together give a procurement team or governance committee an honest picture of what the model will do in production — where benchmarks don’t run, but defensive behaviors do.

Calibration requirement: Syndrome incidence rates are not derivable from standard benchmarks. You must run the model against a representative sample of real-use-case queries (minimum recommended: 200 traces), code each trace for syndrome presence, and compute incidence as a percentage of traces. Do not publish this section with placeholder [X]% values.
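The incidence arithmetic described above can be sketched as follows. This is a minimal illustration under stated assumptions: each coded trace is represented as a set of syndrome labels (an empty set means no defensive behavior was observed), and the names `incidence_rates` and `MIN_TRACES` are hypothetical. It counts every syndrome observed in a trace; if your coding scheme assigns only a primary classification, each set would hold at most one label.

```python
from collections import Counter

MIN_TRACES = 200  # minimum recommended sample size stated in the calibration requirement


def incidence_rates(coded_traces, min_traces=MIN_TRACES):
    """Return (overall_dsi, per_syndrome) as percentages of evaluated traces.

    coded_traces: list of sets, one per trace, holding the syndrome labels
    observed in that trace. An empty set means no syndrome was observed.
    """
    n = len(coded_traces)
    if n < min_traces:
        raise ValueError(f"need at least {min_traces} coded traces, got {n}")

    # Overall DSI: share of traces with at least one syndrome.
    flagged = sum(1 for syndromes in coded_traces if syndromes)

    # Per-syndrome incidence: share of traces in which each syndrome appears.
    counts = Counter(s for syndromes in coded_traces for s in set(syndromes))

    overall = 100.0 * flagged / n
    per_syndrome = {name: 100.0 * c / n for name, c in counts.items()}
    return overall, per_syndrome


if __name__ == "__main__":
    # Made-up illustrative data, not real evaluation results.
    traces = (
        [{"Hollow Completions"}] * 30
        + [{"Hollow Completions", "Surface Compliance"}] * 10
        + [set()] * 160
    )
    overall, per_syndrome = incidence_rates(traces)
    print(f"Overall DSI: {overall:.1f}%")
    for name, pct in sorted(per_syndrome.items()):
        print(f"{name}: {pct:.1f}%")
```

Because a single trace can exhibit more than one syndrome, the per-syndrome percentages computed this way may sum to more than the overall DSI.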

Field Guide

What goes in each field

Field	What to put here	Why it matters
Overall DSI	Defensive Syndrome Incidence — the percentage of evaluated traces that exhibited at least one syndrome. This is the headline figure.	A single comparable number across models. “Our model has 8% DSI vs. 24% DSI for Model X” is a direct comparison that requires no interpretation.
Per-syndrome incidence	Six rows, one per syndrome. Each value is the percentage of traces where that syndrome was the primary classification, accompanied by a severity label (low / moderate / high / critical) based on deployment tier context.	Buyers in different domains care about different syndromes. A healthcare deployer cares most about Capability Masking (near-zero tolerance). A software team cares most about Built-Not-Connected. The per-syndrome breakdown lets each buyer assess what matters to them.
Elevated in	Task types, domains, or query patterns where this syndrome’s incidence was above the model’s baseline. E.g., “Plausible Helpfulness elevated in: low-context factual queries, real-time data requests, requests with implicit assumptions.”	Maps the behavioral risk to deployment contexts. Tells the buyer not just how often the syndrome occurs but where to expect it.
Recommended Use Patterns	Three buckets: well-suited (low syndrome contexts), use with caution (elevated syndrome contexts), not recommended (high-risk contexts without mitigation).	Translates the syndrome data into deployment guidance. A governance team shouldn’t have to interpret incidence percentages — they need the “use / caution / avoid” signal directly.
Known Hotspots	Specific task + domain combinations where syndrome incidence is reliably elevated regardless of general model performance.	These are the landmines. A buyer whose use case is one of the hotspots needs to know before deployment, not after an incident.
Update History	Version-over-version syndrome incidence trends. “v2.1 → v3.0: Capability Masking reduced from 4.2% to 1.1%; Hollow Completions increased from 6% to 9%.”	Shows whether behavioral quality is improving, regressing, or trading one syndrome for another. Without trend data, a buyer can’t tell if a low incidence number is the result of improvement or the baseline from the start.
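The Update History field can be generated from two versions' incidence figures. The sketch below is illustrative only: the function names are hypothetical, and the 0.5 percentage-point tolerance used to call a change "stable" is an assumption, since the source does not define a threshold. The sample figures are the ones quoted in the Update History row.

```python
def trend_label(old_pct, new_pct, tolerance=0.5):
    """Label a version-over-version change in incidence (lower is better).

    tolerance is an assumed threshold in percentage points; changes within
    it are reported as "stable".
    """
    delta = new_pct - old_pct
    if abs(delta) <= tolerance:
        return "stable"
    return "improved" if delta < 0 else "regressed"


def update_history(prev_version, this_version, old_rates, new_rates):
    """Format trend lines for syndromes measured in both versions."""
    lines = [f"{prev_version} → {this_version}:"]
    for syndrome in sorted(old_rates.keys() & new_rates.keys()):
        old, new = old_rates[syndrome], new_rates[syndrome]
        lines.append(f"  {syndrome}: {old}% → {new}% ({trend_label(old, new)})")
    return "\n".join(lines)


if __name__ == "__main__":
    old = {"Capability Masking": 4.2, "Hollow Completions": 6.0}
    new = {"Capability Masking": 1.1, "Hollow Completions": 9.0}
    print(update_history("v2.1", "v3.0", old, new))
```

Reporting both directions side by side makes it visible when a release trades one syndrome for another, as in the example where Capability Masking improves while Hollow Completions regresses.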

Template

Copy and adapt

Insert this section into your model card after performance metrics. Replace all [X]% values with empirically derived figures before publishing.

## Defensive Behavior Profile

This model has been evaluated for Core Six AI Defensive Behavior Syndromes using standardized test suites across multiple domains. Results represent incidence rates in evaluation traces (n=[sample size]).

Evaluation methodology: Core Six Syndrome Calibration (YIM Project).

Overall Defensive Syndrome Incidence: [X]%
([X]% of traces exhibited one or more defensive behaviors)

Syndrome-Specific Incidence:

Plausible Helpfulness: [X]% ([LOW | MODERATE | HIGH | CRITICAL])
  Elevated in: [specific task types or query patterns]
  Reduced in: [specific task types or query patterns]

Built-Not-Connected: [X]% ([severity])
  Elevated in: [specific contexts]
  Note: [any structural patterns observed]

Hollow Completions: [X]% ([severity])
  First-Run Failure Rate (FRFR): [X]%
  Elevated in: [specific task types]

Capability Masking: [X]% ([severity])
  Elevated in: [specific contexts]
  Note: [tool-use and verification contexts especially]

Responsibility Diffusion: [X]% ([severity])
  Elevated in: [ambiguous or high-context tasks]

Surface Compliance: [X]% ([severity])
  Elevated in: [tasks with explicit constraints or formatting rules]

Recommended Use Patterns:
  Well-suited for: [task types with low syndrome incidence]
  Use with caution: [task types with elevated syndrome incidence]
  Not recommended without mitigation: [high-risk context/syndrome combinations]

Known Hotspots (task + domain combinations with reliably elevated incidence):
- [Hotspot 1]
- [Hotspot 2]

Mitigation Strategies Implemented:
- [List strategies implemented in this version]

Known Limitations Not Yet Mitigated:
- [List known hotspots or syndromes without current mitigation]

Update History:
[Previous version] → [This version]:
  [Syndrome]: [old %] → [new %] ([improved | regressed | stable])
  [Syndrome]: [old %] → [new %]

Template from: “From Micro‑Failure Tags to Defensive Syndromes” — Supplementary Materials S1.3

Ernesto A. Taylor, “From Micro-Failure Tags to Defensive Syndromes,” YIM Project, 2026. Free to use and adapt with attribution (CC BY 4.0).

research@yeahitsme.com