Production Readiness¶

The transition from development to production is where pre-runtime security ends and runtime security begins. This page covers how to make that handoff clean, complete, and visible. A system that reaches production without proper preparation forces runtime security to discover what it is protecting on the fly.

The deployment gate¶

No AI system should reach production without passing through a deployment gate appropriate to its risk tier. The gate is the final checkpoint before the system is live.

Gate checklist by tier¶

All tiers (minimum)¶

Risk classification documented and approved
Model provenance verified (provenance and integrity)
Basic adversarial testing completed (adversarial testing)
PACE resilience plan documented for critical components (resilience)
Logging configured and verified
Rollback procedure documented and tested
Runtime security team notified of pending deployment

MEDIUM tier (add to above)¶

Structured adversarial test suite passed
Data governance review completed (data governance)
Secrets management verified (secrets management)
Input and output guardrails configured and tested
Monitoring dashboards created
On-call rotation identified for AI-specific issues

HIGH tier (add to above)¶

Red-team exercise completed, findings remediated
Regulatory compliance verified (regulatory alignment)
Domain-specific guardrail testing completed
Judge evaluation configured and calibrated
Human oversight workflow tested end-to-end
Incident response runbook created for AI-specific scenarios
Privacy impact assessment completed (if processing personal data)

CRITICAL tier (add to above)¶

Independent adversarial assessment completed
Senior stakeholder sign-off on risk acceptance
Immutable logging configured and verified
Real-time Judge evaluation operational
Human oversight escalation path tested
Regulatory notifications filed (if required)
Board or risk committee briefed
Canary or staged deployment plan approved

Making the system visible¶

A deployed AI system that is invisible to the security operations team, the platform team, and the governance function is an unmanaged risk. Visibility must be established before deployment, not discovered after an incident.

What needs to be visible¶

Audience	What they need to see	Why
Security operations	Alerts on guardrail violations, anomalous usage patterns, Judge escalations	To detect and respond to attacks or misuse
Platform/infrastructure	Resource utilisation, latency, error rates, PACE level status	To maintain service reliability and capacity
AI/ML team	Model performance metrics, drift indicators, output quality trends	To maintain model quality and detect degradation
Governance/compliance	Audit logs, decision records, human oversight actions, tier compliance status	To demonstrate regulatory compliance and track governance effectiveness
Product owner	Usage metrics, user feedback, business outcome indicators	To validate the system delivers value and manage risk

Observability requirements by tier¶

Aspect	LOW	MEDIUM	HIGH	CRITICAL
Metrics	Basic (latency, errors)	Standard (+ usage, quality)	Comprehensive (+ drift, fairness)	Full (+ per-decision tracking)
Alerting	Error-based	Error + anomaly	Real-time on violations	Real-time + escalation chain
Dashboards	Optional	Team-level	Cross-functional	Executive + operational
Log retention	90 days	1 year	3 years	7 years (immutable)
Audit trail	Metadata	Key decisions	All interactions	Full reasoning chain

Integration with existing security tooling¶

AI systems must plug into the organisation's existing security infrastructure. Do not build parallel monitoring that bypasses the SOC.

Pre-deployment integration checklist:

AI system logs flow to the central SIEM or log aggregation platform
AI-specific alerts are routed to the appropriate response team
AI system health metrics are visible on infrastructure monitoring dashboards
AI-specific runbooks are loaded into the incident management system
AI system is registered in the organisation's asset inventory / AI registry

Shadow AI is invisible AI

If an AI system is deployed without integration into existing security and monitoring infrastructure, it is effectively invisible to the organisation's security posture. This is one of the highest-risk outcomes of poor pre-runtime security. Make integration a hard gate, not a post-deployment task.

The runtime security handoff¶

The handoff to AI Runtime Security is not a clean break. It is a structured transition where responsibility shifts but context must be preserved.

What to hand off¶

Item	Description	Format
Risk classification	Tier assignment, driving factors, accepted risks	Classification document
Control configuration	Guardrail settings, Judge policies, human oversight rules	Configuration files or policy documents
Test results	Adversarial testing findings, remediation status, known residual risks	Test report
PACE plan	Resilience plan for each critical component	PACE documentation
Escalation paths	Who to contact for AI-specific issues, what triggers escalation	Runbook
Model documentation	Model card, provenance record, known limitations, failure modes	Model documentation
Regulatory obligations	Applicable regulations, compliance requirements, reporting obligations	Compliance mapping

Handoff meeting¶

For HIGH and CRITICAL tier systems, conduct a formal handoff meeting between the development/deployment team and the runtime security/operations team. Cover:

System purpose, scope, and risk classification
Known risks, accepted risks, and residual risks
Control configuration and rationale
Monitoring and alerting setup
Escalation and incident response procedures
Scheduled review dates

For LOW and MEDIUM tier systems, this can be an asynchronous handoff using documented artifacts, but the information must still be transferred.

Staged deployment¶

Not every system should go from zero to full production in one step. Staged deployment reduces risk by limiting blast radius during the initial production period.

Deployment strategies by tier¶

Tier	Recommended approach
LOW	Direct deployment is acceptable. Monitor for the first week.
MEDIUM	Deploy to a subset of users or use cases first. Expand after one to two weeks of stable operation.
HIGH	Canary deployment: route a small percentage of traffic to the new system. Monitor closely. Expand gradually over weeks.
CRITICAL	Shadow deployment first (run in parallel with existing process, compare outputs, no live impact). Then canary. Then gradual expansion. Each stage requires explicit approval to proceed.

What to monitor during staged deployment¶

Output quality and consistency
Guardrail trigger rates (are they firing too often? too rarely?)
Judge evaluation results (what is the Judge catching that guardrails miss?)
User feedback and behaviour
Error rates and latency
PACE level (is the system operating on primary, or has it already fallen back?)

Post-deployment: the first 30 days¶

The first 30 days in production are the highest-risk period. Controls that worked in testing may behave differently under real-world load, with real users, and with real data.

First-week focus¶

Monitor all metrics at elevated frequency
Review all Judge escalations manually
Track guardrail trigger patterns
Verify logging is capturing what you expect
Confirm alerting is reaching the right people

First-month focus¶

Compare actual usage patterns to design assumptions
Review whether the risk tier assignment is accurate based on real-world behaviour
Identify any gaps between pre-deployment testing and production behaviour
Update adversarial test suite based on production observations
Conduct first scheduled review of the deployment

Tier reassessment¶

After the first 30 days, reassess the risk classification. Real-world deployment may reveal that the system:

Processes data you did not anticipate (tier may need to go up)
Is used differently than intended (tier may need adjustment in either direction)
Has attack surface you did not identify during testing (tier may need to go up)
Operates more safely than anticipated (tier may be eligible for reduction, but only after six months per the tier change requirements)

Connecting both sites¶

This page marks the boundary between AI Secured by Design (pre-runtime) and AI Runtime Security (runtime). The relationship is sequential but connected.

Pre-runtime security ensures that what gets deployed is trustworthy. It covers model selection, platform decisions, pipeline security, data governance, adversarial testing, and the deployment gate.

Runtime security ensures that the deployed system stays trustworthy. It covers guardrails, Judge evaluation, human oversight, monitoring, and incident response.

Neither is sufficient alone. A well-built system without runtime controls will degrade undetected. A poorly-built system with excellent runtime controls will fight fires that should never have started.

Continue to

AI Runtime Security

Your system is deployed. Runtime controls take over: guardrails to prevent, Judge to detect, humans to decide. Start with the risk tier you classified here and implement the corresponding runtime controls.

Continue to AI Runtime Security

References