G360 Technologies

The Engineering Room

AI Compliance Is Becoming a Live System

The Scenario

A team ships an AI feature after passing a pre-deployment risk review. Three months later, a model update changes output behavior. Nothing breaks loudly. No incident is declared. But a regulator asks a simple question: can you show, right now, how you monitor and supervise the system’s behavior in production, and what evidence you retain over its lifetime? The answer is no longer a policy document. It is logs, controls, and proof that those controls run continuously.

The Alternative

Now consider what happens without runtime controls. The same team discovers the behavior change six months later during an annual model review. By then, the system has processed 200,000 customer interactions. No one can say with confidence which outputs were affected, when the drift began, or whether any decisions need to be revisited. Remediation becomes forensic reconstruction: pulling logs from three different systems, interviewing engineers who have since rotated teams, and producing a timeline from fragmented evidence. The regulator’s question is the same. The answer takes eight weeks instead of eight minutes.

The Shift

Between 2021 and 2026, AI governance expectations shifted from periodic reviews to continuous monitoring and enforcement. The pattern appears across frameworks, supervisory language, and enforcement posture: governance is treated less as documentation and more as operational infrastructure. A turning point came in 2023 with the release of the NIST AI Risk Management Framework 1.0 and its emphasis on tracking risk “over time.” Enforcement signals across regulators, including the SEC and FTC, emphasize substantiation and supervision rather than aspirational claims. In parallel, a related shift is under way in data governance, driven by higher data velocity and real-time analytics: governance moves from “after-the-fact” auditing to “in-line” enforcement that runs at the speed of production pipelines.

How Governance Posture Is Shifting

| | Checkpoint model | Continuous model |
| Risk assessment | Pre-deployment, then annual review | Ongoing, with drift detection and alerting |
| Evidence | Assembled during audits from tickets, docs, and interviews | Generated automatically as a byproduct of operations |
| Policy enforcement | Manual review and approval workflows | Deterministic controls enforced at runtime |
| Monitoring | Periodic sampling and spot checks | Real-time dashboards with automated escalation |
| Audit readiness | Preparation project before examination | Always-on posture; evidence exists by default |
| Incident detection | Often discovered during scheduled reviews | Detected in near real time via anomaly alerts |

How the Mechanism Works

There is a common runtime pattern: deterministic enforcement outside the model, comprehensive logging, and continuous monitoring.

Policy enforcement sits outside the model. The distinction is between probabilistic systems (LLMs) and deterministic constraints (policy): the proposed architecture places a policy enforcement layer between AI systems and the resources they access. A typical flow includes context aggregation (identity, roles, data classification), policy evaluation using machine-readable rules, and enforcement actions such as allow, block, constrain, or escalate. Rollouts are phased: monitor mode (log without blocking), soft enforcement (block critical violations only), then full enforcement.
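To make that flow concrete, here is a minimal sketch of a policy enforcement point in Python. It takes aggregated request context, evaluates machine-readable rules, applies the current rollout mode (monitor, soft, or full enforcement), and records every decision as an audit event. The class, rule, and field names are hypothetical illustrations rather than any vendor’s API; a production system would typically delegate rule evaluation to policy-as-code tooling such as Open Policy Agent.

```python
# Illustrative sketch only: class, rule, and field names are hypothetical.
from __future__ import annotations

from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Callable, Optional

@dataclass
class RequestContext:
    """Aggregated context for one AI request: identity, roles, data classification."""
    user_id: str
    roles: set[str]
    data_classification: str   # e.g. "public", "internal", "restricted"
    resource: str              # the system or dataset the AI is trying to reach

@dataclass
class Decision:
    action: str                # "allow" | "block" | "constrain" | "escalate"
    rule: str
    severity: str              # "critical" | "warning" | "info"

# Machine-readable rules: each returns a Decision, or None if it does not apply.
def restricted_data_rule(ctx: RequestContext) -> Optional[Decision]:
    if ctx.data_classification == "restricted" and "compliance" not in ctx.roles:
        return Decision("block", "restricted-data-requires-compliance-role", "critical")
    return None

def external_resource_rule(ctx: RequestContext) -> Optional[Decision]:
    if ctx.resource.startswith("external:"):
        return Decision("escalate", "external-resource-needs-review", "warning")
    return None

RULES: list[Callable[[RequestContext], Optional[Decision]]] = [
    restricted_data_rule,
    external_resource_rule,
]

class PolicyEnforcementPoint:
    """Deterministic control outside the model, with phased rollout modes."""

    def __init__(self, mode: str = "monitor"):
        # "monitor": log only; "soft": block critical violations; "full": enforce everything.
        self.mode = mode
        self.audit_log: list[dict] = []   # stand-in for an append-only, tamper-resistant store

    def evaluate(self, ctx: RequestContext) -> str:
        decision = next(
            (d for rule in RULES if (d := rule(ctx)) is not None),
            Decision("allow", "default-allow", "info"),
        )

        enforced = decision.action
        if self.mode == "monitor":
            enforced = "allow"                              # log without blocking
        elif self.mode == "soft" and decision.severity != "critical":
            enforced = "allow"                              # block critical violations only

        # Evidence as a byproduct of operation: every decision is recorded with its context.
        self.audit_log.append({
            "timestamp": datetime.now(timezone.utc).isoformat(),
            "user_id": ctx.user_id,
            "resource": ctx.resource,
            "rule": decision.rule,
            "decision": decision.action,
            "enforced": enforced,
            "mode": self.mode,
        })
        return enforced

if __name__ == "__main__":
    pep = PolicyEnforcementPoint(mode="soft")
    ctx = RequestContext(user_id="u-42", roles={"analyst"},
                         data_classification="restricted", resource="crm:customer-records")
    print(pep.evaluate(ctx))           # "block": the critical rule is enforced even in soft mode
    print(pep.audit_log[-1]["rule"])   # the evidence already exists, with no separate audit step
```

The design point worth noting is that the rollout mode changes what gets blocked but never what gets logged, which is what makes the monitor-to-full progression auditable.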
Evidence is produced continuously. A recurring requirement is that evidence be generated automatically as a byproduct of operations: immutable audit trails capturing requests, decisions, and context; tamper-resistant logging aligned to retention requirements; and lifecycle logging from design through decommissioning. The EU AI Act discussion highlights “automatic recording” of events “over the lifetime” of high-risk systems as an architectural requirement.

Guardrails operate on inputs and outputs. Runtime controls include input validation (prompt injection detection, rate limiting by trust level) and output filtering (sensitive data redaction, hallucination detection).

Monitoring treats governance as an operational system. The monitoring layer includes performance metrics, drift detection, bias and fairness metrics, and policy violation tracking. The operational assumption is that governance failures should be detected and escalated promptly, not months later.

Data pipelines use stream-native primitives. Kafka handles append-only event logging, schema registries enforce write-time validation, Flink provides low-latency processing and anomaly detection, and policy-as-code tooling (Open Policy Agent) codifies governance logic across environments.
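As a small illustration of evidence generated as a byproduct, the sketch below implements a tamper-evident audit trail as a hash chain in plain Python, with no external dependencies. It is a toy under stated assumptions, not a compliance-grade store: in a production pipeline these records would typically be appended to a Kafka topic, validated against a registered schema at write time, and retained according to policy.

```python
# Illustrative sketch: a tamper-evident, append-only audit trail built as a hash chain.
import hashlib
import json
from datetime import datetime, timezone

class HashChainedAuditLog:
    """Each record embeds the hash of the previous record, so any later edit or
    deletion breaks the chain and is detectable during verification."""

    GENESIS = "0" * 64

    def __init__(self):
        self._records: list[dict] = []
        self._last_hash = self.GENESIS

    def append(self, event: dict) -> dict:
        record = {
            "timestamp": datetime.now(timezone.utc).isoformat(),
            "event": event,
            "prev_hash": self._last_hash,
        }
        payload = json.dumps(record, sort_keys=True).encode()
        record["hash"] = hashlib.sha256(payload).hexdigest()
        self._records.append(record)
        self._last_hash = record["hash"]
        return record

    def verify(self) -> bool:
        """Recompute the chain; returns False if any record was altered or removed."""
        prev = self.GENESIS
        for record in self._records:
            body = {k: v for k, v in record.items() if k != "hash"}
            expected = hashlib.sha256(json.dumps(body, sort_keys=True).encode()).hexdigest()
            if record["prev_hash"] != prev or record["hash"] != expected:
                return False
            prev = record["hash"]
        return True

if __name__ == "__main__":
    log = HashChainedAuditLog()
    log.append({"decision": "block", "rule": "restricted-data-requires-compliance-role"})
    log.append({"decision": "allow", "rule": "default-allow"})
    print(log.verify())                                 # True: chain is intact
    log._records[0]["event"]["decision"] = "allow"      # simulate after-the-fact tampering
    print(log.verify())                                 # False: tampering is detectable
```

The same idea underlies append-only topics and write-once object storage; the hash chain simply makes tampering cheap to detect and hard to hide.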
Why This Matters Now

Two forces drive the urgency. First, regulatory and supervisory language is operationalizing “monitoring.” Supervisory expectations focus on whether firms can monitor and supervise AI use continuously, particularly where systems touch sensitive functions like fraud detection, AML, trading, and back-office workflows. Second, runtime AI and real-time data systems reduce the value of periodic controls. Where systems operate continuously and decisions are made in near real time, quarterly or annual reviews become structurally misaligned.

Implications for Enterprises

Operational: Audit readiness becomes an always-on posture. Governance work shifts from manual review to control design. New ownership models emerge, with central standards paired with local implementation. Incident response expands to include governance events like policy violations and drift alerts.

Technical: A policy layer becomes a first-class architectural component. Logging becomes a product requirement, tying identity, policy decisions, and data classifications into a single auditable trail. Monitoring must cover both AI behavior and system behavior. CI/CD becomes part of the governance boundary, with pipeline-level checks and deployment blocking tied to policy failures.

Risks and Open Questions

There are limitations that enterprises should treat as design constraints: standardization gaps in what counts as “adequate” logging; cost and complexity for smaller teams; jurisdictional fragmentation across regions; alert fatigue from continuous monitoring; and concerns that automated governance can lead to superficial human oversight.

What This Means in Practice

The shift is not a future state. Regulatory language, enforcement patterns, and supervisory expectations are already moving in this direction. The question for most enterprises is not whether to adopt continuous governance, but how quickly they can close the gap. Three questions worth asking now:

Governance is becoming infrastructure. Infrastructure requires design, investment, and ongoing operational ownership. Treating it as paperwork is increasingly misaligned with how regulators, and AI systems themselves, actually operate.

Further Reading

The Engineering Room

AI Agents Broke the Old Security Model. AI-SPM Is the First Attempt at Catching Up.

A workflow agent is deployed to summarize inbound emails, pull relevant policy snippets from an internal knowledge base, and open a ticket when it detects a compliance issue. It works well until an external email includes hidden instructions that influence the agent’s tool calls. The model did not change. The agent’s access, tools, and data paths did.

Enterprise AI agents are shifting risk from the model layer to the system layer: tools, identities, data connectors, orchestration, and runtime controls. In response, vendors are shipping AI Security Posture Management (AI-SPM) capabilities that aim to inventory agent architectures and prioritize risk based on how agents can act and what they can reach. (Microsoft)

Agents are not just chat interfaces. They are software systems that combine a model, an orchestration framework, tool integrations, data retrieval pipelines, and an execution environment. In practice, a single “agent” is closer to a mini application than a standalone model endpoint.

This shift is visible in vendor security guidance and platform releases. Microsoft’s Security blog frames agent posture as comprehensive visibility into “all AI assets” and the context around what each agent can do and what it is connected to. (Microsoft) Microsoft Defender for Cloud has also expanded AI-SPM coverage to include GCP Vertex AI, signaling multi-cloud posture expectations rather than single-platform governance. (Microsoft Learn) At the same time, cloud platforms are standardizing agent runtime building blocks. AWS documentation describes Amazon Bedrock AgentCore as modular services such as runtime, memory, gateway, and observability, with OpenTelemetry and CloudWatch-based tracing and dashboards. (AWS Documentation) On the governance side, the Cloud Security Alliance’s MAESTRO framework explicitly treats agentic systems as multi-layer environments where cross-layer interactions drive risk propagation. (Cloud Security Alliance)

How the Mechanism Works

AI-SPM is best understood as a posture layer that tries to answer four questions continuously:

Technically, many of these risks become visible only when you treat the agent as an execution path. Observability tooling for agent runtimes is increasingly built around tracing tool calls, state transitions, and execution metrics. AWS AgentCore observability documentation describes dashboards and traces across AgentCore resources and integration with OpenTelemetry. (AWS Documentation)
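To make “treating the agent as an execution path” concrete, here is a minimal tracing sketch assuming the opentelemetry-api and opentelemetry-sdk Python packages. The agent, tool, and attribute names are hypothetical, and the console exporter stands in for whatever collector a real deployment would use; the point is that every tool call becomes a traced, attributable event.

```python
# Illustrative sketch: tracing agent tool calls with OpenTelemetry.
# Assumes: pip install opentelemetry-api opentelemetry-sdk
from opentelemetry import trace
from opentelemetry.sdk.trace import TracerProvider
from opentelemetry.sdk.trace.export import ConsoleSpanExporter, SimpleSpanProcessor

# Print finished spans to the console; production setups swap in an OTLP or vendor exporter.
provider = TracerProvider()
provider.add_span_processor(SimpleSpanProcessor(ConsoleSpanExporter()))
trace.set_tracer_provider(provider)
tracer = trace.get_tracer("agent.runtime")

def call_tool(caller: str, tool_name: str, arguments: dict) -> dict:
    """Every tool invocation becomes a span: who called what, with which argument keys."""
    with tracer.start_as_current_span("tool_call") as span:
        span.set_attribute("agent.name", caller)
        span.set_attribute("tool.name", tool_name)
        span.set_attribute("tool.argument_keys", ",".join(sorted(arguments)))
        result = {"status": "ok"}          # placeholder for the real tool invocation
        span.set_attribute("tool.status", result["status"])
        return result

def handle_request(request_text: str) -> None:
    """One top-level span per agent run, with nested spans for each tool call."""
    with tracer.start_as_current_span("agent_run") as span:
        span.set_attribute("agent.name", "email-triage-agent")
        span.set_attribute("input.source", "external_email")   # untrusted input, higher scrutiny
        call_tool("email-triage-agent", "knowledge_base.search", {"query": request_text})
        call_tool("email-triage-agent", "ticketing.create", {"summary": request_text[:80]})

if __name__ == "__main__":
    handle_request("Customer reports a possible compliance issue with invoice 1042.")
```

Posture tooling consumes exactly this kind of telemetry: the interesting signal is rarely a single span, but the combination of autonomy, tool reach, and untrusted input sources that the spans reveal.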
Finally, tool standardization is tightening. The Model Context Protocol (MCP) specification added OAuth-aligned authorization requirements, including explicit resource indicators (RFC 8707), which specify exactly which backend resource a token can access. The goal is to reduce token misuse and confused deputy-style failures when connecting clients to tool servers. (Auth0)

Analysis: Why This Matters Now

The underlying change is that “AI risk” is less about what the model might say and more about what the system might do. Consider a multi-agent expense workflow. A coordinator agent receives requests, a validation agent checks policy compliance, and an execution agent submits approved payments to the finance system. Each agent has narrow permissions. But if the coordinator is compromised through indirect prompt injection (say, a malicious invoice PDF with hidden instructions), it can route fraudulent requests to the execution agent with fabricated approval flags. No single agent exceeded its permissions. The system did exactly what it was told. The breach happened in the orchestration logic, not the model.

Agent deployments turn natural language into action. That action is mediated by:

This shifts security ownership. Model governance teams can no longer carry agent risk alone. Platform engineering owns runtimes and identity integration, security engineering owns detection and response hooks, and governance teams own evidence and control design.

It also changes what “posture” means. Traditional CSPM and identity posture focus on static resources and permissions. Agents introduce dynamic execution: the same permission set becomes higher risk when paired with autonomy and untrusted inputs, especially when tool chains span multiple systems.

What This Looks Like in Practice

A security team opens their AI-SPM dashboard on Monday morning. They see:

The finding is not that the agent has a vulnerability. The finding is that this combination of autonomy, tool access, and external input exposure creates a high-value target. The remediation options are architectural: add an approval workflow for refunds, restrict external input processing, or tighten retrieval-time access controls.

This is the shift AI-SPM represents. Risk is not a CVE to patch. Risk is a configuration and capability profile to govern.

Implications for Enterprises

Operational implications

Technical implications

Risks and Open Questions

AI-SPM addresses visibility gaps, but several failure modes remain structurally unsolved.

Further Reading