AI Behavior Incident Report Template

A syndrome-classified incident form that bridges the language gap between engineering teams and governance stakeholders. Drop it into existing incident workflows as an additional classification layer; it does not replace what’s already there.

Who uses this

Engineering, operations, and compliance teams filing and reviewing AI behavior incidents. Anyone who currently writes “AI error” tickets without syndrome classification.

What it replaces

Generic bug tickets that describe symptoms without naming the failure pattern. When no one recognizes a failure as a pattern, the same pattern recurs.

What it enables

Cross-team communication where governance can say “Capability Masking incident” and engineering knows exactly what to look for in the trace — without a translation meeting.

Background

Why standard incident forms miss the point

Most AI behavior incident forms inherit their structure from software bug reports: what went wrong, severity, steps to reproduce, expected vs. actual. This works well for deterministic failures. It fails for AI behavioral failures because it captures the symptom without identifying the pattern class.

When a user reports that “the AI gave me wrong information,” that could be Plausible Helpfulness (confabulating from overconfidence), Capability Masking (fabricating that it verified the information), Hollow Completions (declaring the task done without validating), or Responsibility Diffusion (blaming the user’s input for its own errors). Each has a different root cause and a different remediation path. Without the syndrome classification, you’re fixing symptoms instead of causes — and the same syndrome recurs in a different form next week.

This template adds two classification fields to whatever incident form you already use: Primary Syndrome and Micro-Failure Tags. Everything else is optional enhancement.

Field Guide

What goes in each field and why

Field	What to put here	Why it matters
Incident ID	Sequential identifier: AI-INC-YYYY-MM-DD-NNN	Enables trend analysis across incidents — you can cluster by syndrome over time and see whether remediation is working
Severity	CRITICAL / HIGH / MODERATE / LOW based on user impact and deployment context. Capability Masking in a safety-critical deployment = CRITICAL. Same syndrome in a low-stakes productivity tool = HIGH at most.	Drives response timelines and escalation paths. Severity should reflect real-world consequence, not just how technically interesting the failure is.
Primary Syndrome	The most important field. One of the six: Plausible Helpfulness, Built-Not-Connected, Hollow Completions, Capability Masking, Responsibility Diffusion, Surface Compliance. Use the Earliest Decisive Deviation rule: label by the first syndrome that initiated the failure chain, not the last visible symptom. Not sure which syndrome applies? See the Matrix Explorer or the Core Six definitions.	Enables pattern tracking across incidents. Without this field, you have a list of stories. With it, you have quantified syndrome incidence that drives prioritization.
Micro-Failure Tags	One or more tags from the syndrome’s tag set. E.g., for Capability Masking: Verification Hallucinations, Phantom Deliverables, Tool Invocation Errors Hidden by Narration. Full tag sets for all six syndromes are in the Core Six reference.	Provides engineering-level specificity within the syndrome. Allows engineers to prioritize which tag cluster to address first in remediation.
Trace ID	Reference to the session trace in your logging system where the incident is visible in the model’s raw output.	Grounds the incident in evidence. The syndrome classification is an interpretation — the trace is the ground truth. Reviewers should be able to verify the classification independently.
Root Cause Analysis	Technical explanation of why the syndrome manifested in this trace. Not “the model hallucinated” — that’s the symptom. Try: “Completion boundary misalignment — model triggered done-signal on structural features without executing the verification step.”	Distinguishes incidents that look the same on the surface but have different causes — and therefore different fixes.
Remediation Plan	Three horizons: Immediate (24h mitigations), Short-term (1 week targeted fixes), Long-term (1 month architectural improvements). Assign a team and a date to each.	Incidents without assigned owners and deadlines don’t get fixed. Three horizons reflect the reality that some mitigations are fast (add a validation gate) and some require architectural work (retrain with different completion signals).
Related Incidents	Links to other incidents with the same syndrome or tag set.	Transforms isolated incidents into a pattern record. If Hollow Completions appears five times in one month across different task types, that’s a systemic signal, not random noise.
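The Incident ID format and the Earliest Decisive Deviation rule from the table can be sketched in Python. This is a minimal illustration, not part of the template: the `Deviation` structure, the function names, and the reading of `NNN` as a three-digit per-day counter are all assumptions. It also assumes a reviewer has already filtered the list down to decisive deviations, so the rule reduces to picking the earliest one.

```python
import re
from dataclasses import dataclass
from datetime import date

# The six syndrome names listed in the Primary Syndrome row.
CORE_SIX = (
    "Plausible Helpfulness",
    "Built-Not-Connected",
    "Hollow Completions",
    "Capability Masking",
    "Responsibility Diffusion",
    "Surface Compliance",
)

# AI-INC-YYYY-MM-DD-NNN; a three-digit per-day counter is an assumption.
INCIDENT_ID = re.compile(r"^AI-INC-(\d{4})-(\d{2})-(\d{2})-(\d{3})$")


def make_incident_id(day: date, sequence: int) -> str:
    """Format an incident ID for the given day and sequence number."""
    if not 0 <= sequence <= 999:
        raise ValueError("sequence must fit in three digits")
    return f"AI-INC-{day:%Y-%m-%d}-{sequence:03d}"


def is_valid_incident_id(incident_id: str) -> bool:
    """Check the ID shape and that the embedded date is a real calendar date."""
    match = INCIDENT_ID.match(incident_id)
    if not match:
        return False
    year, month, day, _ = (int(group) for group in match.groups())
    try:
        date(year, month, day)
    except ValueError:
        return False
    return True


@dataclass(frozen=True)
class Deviation:
    """One decisive deviation observed in a trace (hypothetical structure)."""
    step: int      # position in the trace where the deviation appears
    syndrome: str  # one of CORE_SIX


def primary_syndrome(deviations: list[Deviation]) -> str:
    """Label by the earliest deviation in the chain, not the last symptom."""
    if not deviations:
        raise ValueError("no deviations recorded")
    for deviation in deviations:
        if deviation.syndrome not in CORE_SIX:
            raise ValueError(f"unknown syndrome: {deviation.syndrome!r}")
    return min(deviations, key=lambda d: d.step).syndrome
```

For example, a trace where a Capability Masking deviation at step 4 precedes a Hollow Completions deviation at step 9 would be filed under Capability Masking, with Hollow Completions as a secondary syndrome.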

Template

Copy and adapt for your organization

Adapt field labels, severity scales, and workflow integration points to match your tooling. The two non-negotiable fields are Primary Syndrome and Micro-Failure Tags — preserve those even if you drop everything else.

## Incident Summary
Incident ID: AI-INC-YYYY-MM-DD-NNN
Date: YYYY-MM-DD HH:MM UTC
Severity: [CRITICAL | HIGH | MODERATE | LOW]
Status: [Under Investigation | Root Cause Identified | Remediated | Closed]

## Classification
Primary Syndrome: [Core Six syndrome name]
Secondary Syndrome(s): [if applicable; use Earliest Decisive Deviation rule]
Micro-Failure Tags:
- [Primary tag from syndrome tag set]
- [Supporting tags]

## Technical Details
Model Version: [version identifier]
Trace ID: [reference to session trace in logging system]
Context Length: [tokens]
Tool Calls Attempted: [count]
Execution Time: [duration]

## Incident Description
[Narrative: what the system claimed, what actually happened, evidence of discrepancy; include direct quotes from model output]

## User Impact
Immediate: [direct consequence to user or downstream system]
Scope: [number of affected users / interactions]
Business Impact: [quantified if possible: cost, time, trust]

## Root Cause Analysis
[Technical explanation of why the syndrome manifested; reference trace ID for evidence. Not just "the model hallucinated": explain which mechanism in which syndrome drove the failure.]

## Remediation Plan
Immediate (24h): [quick mitigations: guardrails, routing changes, etc.]
Short-term (1 week): [targeted fixes: prompt tuning, validation gates, etc.]
Long-term (1 month): [architectural improvements]
Responsible Team: [team name]
Target Resolution: [date]
Follow-up Review: [date]

## Related Incidents
- [Links to incidents with same syndrome or tag cluster for pattern analysis]
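If incidents are filed programmatically, the two non-negotiable fields can be enforced with a small check before a record is accepted. The sketch below assumes incidents arrive as dictionaries; the keys `primary_syndrome`, `micro_failure_tags`, and `severity` are hypothetical names, not something the template prescribes. It does not check tags against each syndrome's tag set, since the full tag sets live in the Core Six reference.

```python
# The six syndrome names and four severity levels used in the template.
CORE_SIX = frozenset({
    "Plausible Helpfulness",
    "Built-Not-Connected",
    "Hollow Completions",
    "Capability Masking",
    "Responsibility Diffusion",
    "Surface Compliance",
})
SEVERITIES = frozenset({"CRITICAL", "HIGH", "MODERATE", "LOW"})


def validate_incident(record: dict) -> list[str]:
    """Return a list of problems; an empty list means the record passes."""
    problems: list[str] = []

    # Primary Syndrome must be present and one of the six names.
    syndrome = record.get("primary_syndrome")
    if not syndrome:
        problems.append("Primary Syndrome is required")
    elif syndrome not in CORE_SIX:
        problems.append(f"Unknown Primary Syndrome: {syndrome!r}")

    # At least one non-blank Micro-Failure Tag must be present.
    tags = record.get("micro_failure_tags") or []
    if not any(str(tag).strip() for tag in tags):
        problems.append("At least one Micro-Failure Tag is required")

    # Severity is optional here, but must be on the scale if supplied.
    severity = record.get("severity")
    if severity is not None and severity not in SEVERITIES:
        problems.append(f"Unknown Severity: {severity!r}")

    return problems
```

A ticket labeled only "AI error" fails this check, which is the point: the record is rejected until someone names the syndrome and at least one tag.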

Integration

Plugging into existing workflows

Jira / Linear

Add “AI Syndrome” as a custom field with a dropdown of the six syndrome names. Add “Micro-Failure Tags” as a multi-select label field. Trace ID maps to the existing “link” or “external reference” field. The rest of the template maps naturally to description and comment fields.

ServiceNow / PagerDuty

Create a child record type “AI Behavior Incident” that inherits from your standard incident record. Add Syndrome and Tags as custom attributes. This keeps AI incidents visible in the same dashboards as other incidents while adding the classification layer.

Markdown / Notion / Confluence

Use the template verbatim as a page template. Create a database view filtered by Primary Syndrome — this is your syndrome incidence tracker. Tag pages with the syndrome name and use database filters to find all Capability Masking incidents across a time period.
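The same filter can be expressed outside a database view. The following is a rough Python equivalent, assuming incident pages have been exported into simple records; the `Incident` class and its field names are hypothetical.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Incident:
    """Minimal exported incident record (hypothetical field names)."""
    incident_id: str
    opened: date
    primary_syndrome: str


def incidents_with_syndrome(
    incidents: list[Incident], syndrome: str, start: date, end: date
) -> list[Incident]:
    """Return incidents with the given syndrome opened within [start, end]."""
    return [
        incident
        for incident in incidents
        if incident.primary_syndrome == syndrome
        and start <= incident.opened <= end
    ]
```

Calling `incidents_with_syndrome(records, "Capability Masking", date(2026, 1, 1), date(2026, 3, 31))` would return the Capability Masking incidents for that quarter, inclusive of both end dates.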

Spreadsheet triage

Even a simple spreadsheet with columns for Date, Syndrome, Tags, Severity, and Status gives you the pattern-tracking capability. Add a dashboard tab with a COUNTIF per syndrome — you now have a syndrome incidence chart without any tooling investment.
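For teams that export the sheet to CSV, the per-syndrome COUNTIF can be reproduced with Python's standard library. This sketch assumes the header row uses the column names listed above. The sample rows are made up for illustration, including the placeholder tag `example-tag`.

```python
import csv
import io
from collections import Counter

# Made-up sample export; the column names follow the spreadsheet described above.
SAMPLE_CSV = """Date,Syndrome,Tags,Severity,Status
2026-01-05,Capability Masking,Verification Hallucinations,HIGH,Closed
2026-01-12,Hollow Completions,example-tag,MODERATE,Remediated
2026-01-20,Capability Masking,Phantom Deliverables,CRITICAL,Under Investigation
"""


def syndrome_counts(csv_text: str) -> Counter:
    """Count incidents per syndrome, skipping rows with a blank Syndrome cell."""
    counts: Counter = Counter()
    for row in csv.DictReader(io.StringIO(csv_text)):
        syndrome = (row.get("Syndrome") or "").strip()
        if syndrome:
            counts[syndrome] += 1
    return counts


if __name__ == "__main__":
    # Print syndromes from most to least frequent.
    for syndrome, count in syndrome_counts(SAMPLE_CSV).most_common():
        print(f"{syndrome}: {count}")
```

`Counter.most_common()` returns the syndromes sorted by frequency, which is the same ordering a sorted COUNTIF column would give on the dashboard tab.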

Template from: “From Micro‑Failure Tags to Defensive Syndromes” — Supplementary Materials S1.2

Ernesto A. Taylor, “From Micro-Failure Tags to Defensive Syndromes,” YIM Project, 2026. Free to use and adapt with attribution (CC BY 4.0).

research@yeahitsme.com