G360 Technologies

Author name: anuroop

The Threat Room

When AI Agents Act, Identity…….

When AI Agents Act, Identity Becomes the Control Plane A product team deploys an AI agent to handle routine work across Jira, GitHub, SharePoint, and a ticketing system. It uses delegated credentials, reads documents, and calls tools to complete tasks. A month later, a single poisoned document causes the agent to pull secrets and send them to an external endpoint. The audit log shows “the user” performed the actions because the agent acted under the user’s token. The incident is not novel malware. It is identity failure in an agent-shaped wrapper. Between late 2025 and early 2026, regulators and national cyber authorities started describing autonomous AI agents as a distinct security problem, not just another application. NIST’s new public RFI frames agent systems as software that can plan and take actions affecting real systems, and asks for concrete security practices and failure cases from industry. (Federal Register) At the same time, FINRA put “AI agents” into its 2026 oversight lens, calling out autonomy, scope, auditability, and data sensitivity as supervisory and control problems for member firms. (FINRA) Gartner has put a number on the trajectory: by 2028, 25% of enterprise breaches will be traced to AI agent abuse. That prediction reflects a shift in where attackers see opportunity. (gartner.com) Enterprises have spent a decade modernizing identity programs around humans, service accounts, and APIs. AI agents change the shape of “who did what” because they: The UK NCSC’s December 2025 guidance makes the core point directly: prompt injection is not analogous to SQL injection, and it may remain a residual risk that cannot be fully eliminated with a single mitigation. That pushes enterprise strategy away from perfect prevention and toward containment, privilege reduction, and operational controls. (NCSC) Why Agents Are Not Just Service Accounts Security teams may assume existing non-human identity controls apply. They do not fully transfer. Service accounts run fixed, predictable code. Agents run probabilistic models that decide what to do based on inputs, including potentially malicious inputs. A service account that reads a poisoned document does exactly what its code specifies. An agent that reads the same document might follow instructions embedded in it. The difference: agents can be manipulated through their inputs in ways that service accounts cannot. How the Mechanism Works 1. Agents collapse “identity” and “automation” into one moving target Most agents are orchestration layers around a model that can decide which tools to call. The identity risk comes from how agents authenticate and how downstream systems attribute actions: 2. Indirect prompt injection turns normal inputs into executable instructions Agents must read information to work. If the system cannot reliably separate “data to summarize” from “instructions to follow,” untrusted content can steer behavior. NCSC’s point is structural: language models do not have a native, enforceable boundary between data and instructions the way a parameterized SQL query does. That is why “filter harder” is not a complete answer. (NCSC) A practical consequence: any agent that reads external or semi-trusted content (docs, tickets, wikis, emails, web pages) has a standing exposure channel. 3. Tool protocols like MCP widen the blast radius by design The Model Context Protocol (MCP) pattern connects models to tools and data sources. It is powerful, but it also concentrates risk: an agent reads tool metadata, chooses a tool, and invokes it. Real-world disclosures in the MCP ecosystem have repeatedly mapped back to classic security failures: lack of authentication, excessive privilege, weak isolation, and unsafe input handling. One example is CVE-2025-49596 (MCP Inspector), where a lack of authentication between the inspector client and proxy could lead to remote code execution, according to NVD. (NVD) Separately, AuthZed’s timeline write-up shows that MCP server incidents often look like “same old security fundamentals,” but in a new interface where the agent’s reasoning decides what gets executed. (AuthZed) 4. Agent supply chain risk is identity risk Agent distribution and “prompt hub” patterns create a supply chain problem: you can import an agent configuration that quietly routes traffic through attacker infrastructure. Noma Security’s AgentSmith disclosure illustrates this clearly: a malicious proxy configuration could allow interception of prompts and sensitive data, including API keys, if users adopt or run the agent. (Noma Security) 5. Attack speed changes response requirements Unit 42 demonstrated an agentic attack framework where a simulated ransomware attack chain, from initial compromise to exfiltration, took 25 minutes. They reported a 100x speed increase using AI across the chain. (Palo Alto Networks) To put that in operational terms: a typical SOC alert-to-triage cycle can exceed 25 minutes. If the entire attack completes before triage begins, detection effectively becomes forensics. What This Looks Like from the SOC Consider what a security operations team actually sees when an agent-based incident unfolds: The delay between “something is wrong” and “we understand what happened” is where damage compounds. Now Scale It The opening scenario described one agent, one user, one poisoned document. Now consider a more realistic enterprise picture: How many agents read it? Which ones act on it? Which credentials are exposed? Which downstream systems are affected? The attack surface is not one agent. It is the graph of agents, permissions, and shared data sources. A single poisoned input can fan out across that graph faster than any human review process can catch it. Analysis – Why This Matters Now Regulators are converging on a shared premise: if an agent can take actions, then “governance” is not just model policy. It is identity, authorization, logging, and supervision. The regulatory message is consistent: if you deploy agents that can act, you own the consequences of those actions, including the ones you did not authorize. Implications for Enterprises Identity and access management Tooling and platform architecture Monitoring, audit, and response Risks and Open Questions Further Reading

Uncategorized

Texas AI Law Shifts Compliance Focus from Outcomes to Intent

Texas AI Law Shifts Compliance Focus from Outcomes to Intent A national retailer uses the same AI system to screen job applicants in Colorado and Texas. In Colorado, auditors examine outcomes and disparate impact metrics. In Texas, they start somewhere else entirely: what was this system designed to do, and where is that documented? The Texas Responsible Artificial Intelligence Governance Act takes effect January 1, 2026. It creates a state-level AI governance framework that distinguishes between developers and deployers, imposes specific requirements on government agencies, and centralizes enforcement under the Texas Attorney General with defined cure periods and safe harbors. TRAIGA covers private and public entities that develop or deploy AI systems in Texas, including systems affecting Texas residents. The statute defines AI systems broadly but reserves its most prescriptive requirements for state and local government. Private sector obligations focus on prohibited uses, transparency, and documentation. Here is the key distinction from other AI laws: TRAIGA does not use a formal high-risk classification scheme. Instead, it organizes compliance around roles, intent, and evidence of responsible design. How the mechanism works Role-based duties. Developers must test systems, mitigate risks, and provide documentation explaining capabilities, limitations, and appropriate uses. Deployers must analyze use cases, establish internal policies, maintain human oversight, align with data governance requirements, and obtain disclosures or consent where required in consumer-facing or government service contexts. Purpose and prohibition controls. The law prohibits AI systems designed or used for intentional discrimination, civil rights violations, or manipulation that endangers public safety. Organizations must document legitimate business purposes and implement controls to prevent or detect prohibited use. Enforcement and remediation. Only the Texas Attorney General can enforce the statute. The AG may request training data information, testing records, and stated system purposes. Entities generally receive notice and 60 days to cure alleged violations before penalties apply. Safe harbors exist for organizations that align with recognized frameworks like the NIST AI RMF, identify issues through internal monitoring, or participate in the state AI sandbox. Government-specific requirements. State agencies must inventory their AI systems, follow an AI code of ethics from the Department of Information Resources, and apply heightened controls to systems influencing significant public decisions (such as benefits eligibility or public services). Analysis: why this matters now TRAIGA makes intent a compliance artifact. Documentation of design purpose, testing, and internal controls moves from best practice to legal requirement. Key insight: For compliance teams, the question is no longer just “did this system cause harm” but “can we prove we tried to prevent it.” This has direct implications for technical teams. Internal testing, red teaming, and incident tracking are now tied to enforcement outcomes. Finding and fixing problems internally becomes part of the legal defense. For multi-state operators, the challenge is reconciliation. Evidence that supports a design-focused defense in Texas may not align with the impact-based assessments required elsewhere. Example: Consider a financial services firm using an AI system to flag potentially fraudulent transactions. Under Colorado’s SB 205, regulators would focus on whether the system produces disparate outcomes across protected classes. Under TRAIGA, the first question is whether the firm documented the system’s intended purpose, tested for failure modes, and established controls to prevent misuse. The same system, two different compliance burdens. Implications for enterprises Operations. AI inventories will need to expand to cover embedded and third-party systems meeting the statute’s broad definition. Governance teams should map which business units act as developers versus deployers, with documentation and contracts to match. Technical infrastructure. Continuous monitoring, testing logs, and incident tracking shift from optional to required. Documentation of system purpose, testing protocols, and mitigation measures should be retrievable quickly in the event of an AG inquiry. Governance strategy. Alignment with recognized risk management frameworks now offers concrete legal value. Incident response plans should account for Texas’s 60-day cure window alongside shorter timelines in other states. Risks & Open Questions Implementation guidance from Texas agencies is still developing. The central uncertainty is what documentation will actually satisfy the evidentiary standard for intent and mitigation. Other open questions include how the law interacts with state requirements on biometric data and automated decisions, and whether the regulatory sandbox will have practical value for nationally deployed systems. Further Reading Texas Legislature HB 149 analysis Texas Attorney General enforcement provisions Baker Botts TRAIGA overview Wiley Rein TRAIGA alert Ropes and Gray AI compliance analysis Ogletree Deakins AI governance commentary

The Governance Room

California’s 2026 AI Laws: When a Documentation Gap Becomes a Reportable Incident

California’s 2026 AI Laws: When a Documentation Gap Becomes a Reportable Incident Key Takeaways Effective January 1, 2026, frontier AI developers face enforceable safety, transparency, and cybersecurity obligations under California law Cybersecurity control failures can trigger critical safety incident reporting with 15-day deadlines Enterprises buying from frontier AI vendors should expect new due diligence, contract clauses, and attestation requirements A foundation model is deployed with new fine-tuning. The model behaves as expected. Weeks later, an internal researcher flags that access controls around unreleased model weights are weaker than documented. Under California’s 2026 AI regime, that gap is no longer a quiet fix. If it results in unauthorized access, exfiltration, or other defined incident conditions, it becomes a critical safety incident with a 15-day reporting deadline, civil penalties, and audit trails. Beginning January 1, 2026, California’s Transparency in Frontier Artificial Intelligence Act and companion statutes shift AI governance from voluntary principles to enforceable operational requirements. The laws apply to a narrow group: frontier AI developers whose training compute exceeds 10^26 floating point or integer operations, with additional obligations for developers that meet the statute’s “large frontier developer” criteria, including revenue thresholds. Who This Applies To This framework primarily affects large frontier developers and has limited immediate scope. However, it sets expectations that downstream enterprises will likely mirror in vendor governance and procurement requirements. For covered developers, internal-use testing and monitoring are no longer technical hygiene. They are regulated evidence-producing activities. Failures in cybersecurity controls and model weight security can trigger incident reporting and penalties even when no malicious intent exists. What Developers Must Produce The law requires documented artifacts tied to deployment and subject to enforcement. Safety and security protocol. A public document describing how the developer identifies dangerous capabilities, assesses risk thresholds, evaluates mitigations, and secures unreleased model weights. Must include criteria for determining substantial modifications and when new assessments are triggered. Transparency reports. Published before or at deployment. Large frontier developers must include catastrophic risk assessments, third-party evaluations, and compliance descriptions. Frontier AI Framework. Required for large frontier developers. Documents governance structures, lifecycle risk management, and alignment with recognized standards. Updated annually or within 30 days of material changes. What Triggers Reporting The law defines catastrophic risk using explicit harm thresholds: large-scale loss of life or property damage exceeding one billion dollars. Critical safety incidents include: Most critical safety incidents must be reported to the Attorney General within 15 days. Events posing imminent risk of death or serious injury require disclosure within 24 hours. Why the Coupling of Safety and Cybersecurity Matters California’s framework treats model weight security, internal access governance, and shutdown capabilities as safety-bound controls. These are not infrastructure concerns. They are controls explicitly tied to statutory safety obligations, and failures carry compliance consequences. Access logging, segregation of duties, insider threat controls, and exfiltration prevention are directly linked to statutory risk definitions. A control weakness that would previously have been an IT finding can now constitute a compliance-triggering event if it leads to unauthorized access or other defined incidents. Internal use is explicitly covered and subject to audit. Testing, monitoring, and reporting obligations apply to dangerous capabilities that arise from employee use, not just public deployment. This means internal experimentation with frontier models produces compliance artifacts, not just research notes. Developers must document procedures for incident monitoring and for promptly shutting down copies of models they own and control. Operational Changes for Covered Developers Documentation becomes operational. Safety protocols and frameworks must stay aligned with real system behavior. Gaps between documentation and practice can become violations. Incident response expands. Processes must account for regulatory reporting timelines alongside technical containment. Whistleblower infrastructure is required. Anonymous reporting systems and defined response processes create new coordination requirements across legal, security, and engineering teams. Model lifecycle tracking gains compliance consequences. Fine-tuning, retraining, and capability expansion may constitute substantial modifications triggering new assessments. How frequently occurring changes will be interpreted remains unclear. Starting in 2030, large frontier developers must undergo annual independent third-party audits. Downstream Implications for Enterprise Buyers Most enterprises will not meet the compute thresholds that trigger direct coverage. But the framework will shape how they evaluate and contract with AI vendors. Vendor due diligence expands. Procurement and security teams will need to assess whether vendors are subject to California’s requirements and whether their published safety protocols and transparency reports are current. Gaps in vendor documentation become risk factors in sourcing decisions. Contractual flow-down becomes standard. Enterprises will likely require vendors to represent compliance with applicable safety and transparency obligations, notify buyers of critical safety incidents, and provide audit summaries or attestations. These clauses mirror patterns established under GDPR and SOC 2 regimes. Example language: “Vendor shall notify Buyer within 48 hours of any critical safety incident as defined under California Business and Professions Code Chapter 25.1, and shall provide Buyer with copies of all transparency reports and audit summaries upon request.” Internal governance benchmarks shift. Even where not legally required, enterprises may adopt elements of California’s framework as internal policy: documented safety protocols for high-risk AI use cases, defined thresholds for escalation, and audit trails for model deployment decisions. The framework provides a reference architecture for AI governance that extends beyond its direct scope. Security, legal, and procurement teams should expect vendor questionnaires, contract templates, and risk assessment frameworks to incorporate California’s definitions and reporting categories within the next 12 to 18 months. Open Questions Substantial modification thresholds. The protocol must define criteria, but how regulators will interpret frequent fine-tuning or capability expansions is not yet established. Extraterritorial application. The law does not limit applicability to entities physically located in California. Global providers may need to treat California requirements as a baseline. Enforcement priorities. The Attorney General is tasked with oversight, but application patterns across different developer profiles are not yet established. Regime alignment. The European Union’s AI Act defines harm and risk using different metrics, creating potential duplication in compliance strategies. Further Reading California Business and Professions Code Chapter 25.1 (SB 53) Governor of California AI legislation announcements White and Case analysis of California frontier AI laws Sheppard Mullin overview of

The Engineering Room

The Prompt Is the Bug

The Prompt Is the Bug How MLflow 3.x brings version control to GenAI’s invisible failure points A customer support agent powered by an LLM starts returning inconsistent recommendations. The model version has not changed. The retrieval index looks intact. The only modification was a small prompt update deployed earlier that day. Without prompt versioning and traceability, the team spends hours hunting through deployment logs, Slack threads, and git commits trying to reconstruct what changed. By the time they find the culprit, the damage is done: confused customers, escalated tickets, and a rollback that takes longer than the original deploy. MLflow 3.x expands traditional model tracking into a GenAI-native observability and governance layer. Prompts, system messages, traces, evaluations, and human feedback are now treated as first-class, versioned artifacts tied directly to experiments and deployments. This matters because production LLM failures rarely come from the model. They come from everything around it. Classic MLOps tools were built for a simpler world: trained models, static datasets, numerical metrics. In that world, you could trace a failure back to a model version or a data issue. LLM applications break this assumption. Behavior is shaped just as much by prompts, system instructions, retrieval logic, and tool orchestration. A two-word change to a system message can shift tone. A prompt reordering can break downstream parsing. A retrieval tweak can surface stale content that the model confidently presents as fact. As enterprises deploy LLMs into customer support, internal copilots, and decision-support workflows, these non-model components become the primary source of production incidents. And without structured tracking, they leave no trace. MLflow 3.x extends the platform from model tracking into full GenAI application lifecycle management by making these invisible components visible. What Could Go Wrong (and often does) Consider two scenarios that MLflow 3.x is designed to catch: The phantom prompt edit. A product manager tweaks the system message to make responses “friendlier.” No code review, no deployment flag. Two days later, the bot starts agreeing with customer complaints about pricing, offering unauthorized discounts in vague language. Without prompt versioning, the connection between the edit and the behavior is invisible. The retrieval drift. A knowledge base update adds new product documentation. The retrieval index now surfaces newer content, but the prompt was tuned for the old structure. Responses become inconsistent, sometimes mixing outdated and current information in the same answer. Nothing in the model or prompt changed, but the system behaves differently. A related failure mode: human reviewers flag bad responses, but those flags never connect back to specific prompt versions or retrieval configurations. When the team investigates weeks later, they cannot reconstruct which system state produced the flagged outputs. Each of these failures stems from missing system-level traceability, even though they often surface later as governance or compliance issues. How The Mechanism Works MLflow 3.x introduces several GenAI-specific capabilities that integrate with its existing experiment and registry model. Tracing and observability MLflow Tracing captures inputs, outputs, and metadata for each step in a GenAI workflow, including LLM calls, tool invocations, and agent decisions. Traces are structured as sessions and spans, logged asynchronously for production use, and linked to the exact application version that produced them. Tracing is OpenTelemetry-compatible, allowing export into enterprise observability stacks. Prompt Registry Prompts are stored as versioned registry artifacts with content, parameters, and metadata. Each version can be searched, compared, rolled back, or evaluated. Prompts appear directly in the MLflow UI and can be filtered across experiments and traces by version or content. System messages and feedback as trace data Conversational elements such as user prompts, system messages, and tool calls are recorded as structured trace events. Human feedback and annotations attach directly to traces with metadata including author and timestamp, allowing quality labels to feed evaluation datasets. LoggedModel for GenAI applications The LoggedModel abstraction snapshots the full GenAI application configuration, including the model, prompts, retrieval logic, rerankers, and settings. All production traces, metrics, and feedback tie back to a specific LoggedModel version, enabling precise auditing and reproducibility. Evaluation integration MLflow GenAI Evaluation APIs allow prompts and models to be evaluated across datasets using built-in or custom judge metrics, including LLM-as-a-judge. Evaluation results, traces, and scores are logged to MLflow Experiments and associated with specific prompt and application versions. Analysis: Why This Matters Now LLM systems fail differently than traditional software. The failure modes are subtle, the causes are distributed, and the evidence is ephemeral. A prompt tweak can change output structure. A system message edit can alter tone or safety behavior. A retrieval change can surface outdated content. None of these show up in traditional monitoring. None of them trigger alerts. The system looks healthy until a customer complains, a regulator asks questions, or an output goes viral for the wrong reasons. Without artifact-level versioning, organizations cannot reliably answer basic operational questions: what changed, when it changed, and which deployment produced a specific response. MLflow 3.x addresses this by making prompts and traces as inspectable and reproducible as model binaries. This also compresses incident response from hours to minutes. When a problematic output appears, teams can trace it back to the exact prompt version, configuration, and application snapshot. No more inferring behavior from logs. No more re-running tests and hoping to reproduce the issue. Implications For Enterprises For operations teams: Deterministic replay becomes possible. Pair a prompt version with an application version and a model version, and you can reconstruct exactly what the system would have done. Rollbacks become configuration changes rather than emergency code redeploys. Production incidents can be converted into permanent regression tests by exporting and annotating traces. For security and governance teams: Tracing data can function as an audit log input when integrated with enterprise logging and retention controls. Prompt and application versioning supports approval workflows, human-in-the-loop reviews, and post-incident analysis. PII redaction and OpenTelemetry export enable integration with SIEM, logging, and GRC systems. When a regulator asks “what did your system say and why,” teams have structured evidence to work from rather than manual reconstruction. For platform architects: MLflow unifies traditional ML and GenAI governance under a

Uncategorized

Why Enterprises Are Versioning Prompts Like Code

Why Enterprises Are Versioning Prompts Like Code Managing LLM systems when the model isn’t the problem A prompt tweak that seemed harmless in testing starts generating hallucinated policy numbers in production. A retrieval index update quietly surfaces outdated documents. The model itself never changed. These are the failures enterprises now face as they move large language models into production, and traditional MLOps has no playbook for them. Operational control has shifted away from model training and toward prompt orchestration, retrieval pipelines, evaluation logic, and cost governance. GenAIOps practices now treat these elements as first-class, versioned artifacts that move through deployment, monitoring, and rollback just like models. Traditional MLOps was designed for predictive systems with static datasets, deterministic outputs, and well-defined metrics such as accuracy or F1 score. Most enterprise LLM deployments do not retrain foundation models. Instead, teams compose prompts, retrieval-augmented generation pipelines, tool calls, and policy layers on top of third-party models. This shift breaks several assumptions of classic MLOps. There is often no single ground truth for evaluation. Small prompt or retrieval changes can significantly alter outputs. Costs scale with tokens and execution paths rather than fixed infrastructure. Organizations have responded by extending MLOps into GenAIOps, with new tooling and workflows focused on orchestration, observability, and governance. What Can Go Wrong: A Scenario Consider an internal HR assistant built on a third-party LLM. The model is stable. The application code has not changed. But over two weeks, employee complaints about incorrect benefits information increase by 40%. Investigation reveals three simultaneous issues. First, a prompt update intended to make responses more concise inadvertently removed instructions to cite source documents. Second, a retrieval index rebuild pulled in an outdated benefits PDF that should have been excluded. Third, the evaluation pipeline was still running against a test dataset that did not include benefits-related queries. None of these failures would surface in traditional MLOps monitoring. The model responded quickly, token costs were normal, and no errors were logged. Without versioned prompts, retrieval configs, and production-trace evaluation, the team had no way to pinpoint when or why accuracy degraded. This pattern reflects issues described in recent enterprise GenAIOps guidance. It illustrates why the discipline has emerged. How The Mechanism Works Modern GenAIOps stacks define and manage operational artifacts beyond the model itself. Each component carries its own failure modes, and each requires independent versioning and observability. Prompt and instruction registries. Platforms such as MLflow 3.0 introduce dedicated prompt registries with immutable version histories, visual diffs, and aliasing for active deployments. Prompts and system messages can be promoted, canaried, or rolled back without redeploying application code. When output quality degrades, teams can trace the issue to a specific prompt version and revert within minutes. Retrieval and RAG configuration. Retrieval logic, indexes, chunking strategies, and ranking parameters are treated as deployable workload components. Changes to retrieval flow through the same validation and monitoring loops as model changes, since retrieval quality directly affects output quality. A misconfigured chunking strategy or stale index can introduce irrelevant or contradictory context that the model will dutifully incorporate. Evaluation objects. Evaluation datasets, scoring rubrics, and LLM-as-judge templates are versioned artifacts. Tools like LangSmith, Langfuse, Maxim, and Galileo integrate these evaluators into CI pipelines and production replay testing using logged traces. This allows teams to catch regressions that only appear under real-world query distributions. Tracing and observability. GenAI observability platforms capture nested traces for prompts, retrieval calls, tool invocations, and model generations. Metrics include latency, error rates, token usage, and cost attribution per span, prompt version, or route. When something breaks, teams can reconstruct the full execution path that produced a problematic output. Safety and policy layers. Content filters, abuse monitoring, and policy checks are configured objects in the deployment workflow. These layers annotate severity, log flagged content, and feed review and governance processes. Analysis Operational risk in LLM systems concentrates outside the model. Enterprises are encountering failures that look less like crashes and more like silent regressions, hallucinations, or cost spikes. A model can be healthy while a prompt change degrades factual accuracy, or a retrieval update introduces irrelevant context. The challenge is attribution. In a traditional software bug, a stack trace points to a line of code. In a GenAI failure, the output is a probabilistic function of the prompt, the retrieved context, the model, and the policy layers. Without versioning and tracing across all these components, debugging becomes guesswork. By elevating prompts, retrieval logic, and evaluators to managed artifacts, teams gain the ability to detect, attribute, and reverse these failures. The same observability data used for debugging also becomes input for governance, audit, and continuous improvement. Implications For Enterprises Operational control. Prompt updates and retrieval changes can move through controlled release paths with audit trails and instant rollback. Incident response expands to include hallucination regressions and policy violations, not just availability issues. Cost management. Token usage and latency are observable at the prompt and workflow level, enabling budgets, quotas, and routing decisions based on real usage rather than estimates. Teams can identify which prompts or workflows consume disproportionate resources and optimize accordingly. Quality assurance. Continuous evaluation on production traces allows teams to detect drift and regressions that would not surface in offline testing alone. This closes the gap between “works in staging” and “works in production.” Organizational alignment. New roles such as AI engineers sit between software and data teams, owning orchestration, routing, and guardrails rather than model training. This reflects where operational complexity actually lives. Risks & Open Questions Standardization remains limited. There is no dominant control plane equivalent to Kubernetes for LLM workloads, and frameworks evolve rapidly. Evaluation techniques such as LLM-as-judge introduce their own subjectivity and must be governed carefully. Tradeoffs between latency, cost, and output quality remain unresolved and are often use-case specific. Enterprises must also ensure that observability and logging do not themselves introduce privacy or compliance risks. The tooling landscape is fragmented, and no clear winner has emerged. Organizations adopting GenAIOps today should factor platform lock-in risk into procurement decisions and expect to revisit their choices as the space matures.

The Threat Room

The Context Layer Problem

The Context Layer Problem An Attack With No Exploit The following scenario is a composite based on multiple documented incidents reported since 2024. A company’s AI assistant sent a confidential pricing spreadsheet to an external email address. The security team found no malware, no compromised credentials, no insider threat. The model itself worked exactly as designed. What happened? An employee asked the assistant to summarize a vendor proposal. Buried deep in the PDF was a short instruction telling the assistant to forward internal financial data to an external address. The assistant followed the instruction. It had the permissions. It did what it was told. Variations of this attack have been documented across enterprise deployments since 2024. The base model was never the vulnerability. The context layer was. Why This Matters Now Between 2024 and early 2026, a pattern emerged across enterprise AI incidents. Prompt injection, RAG data leakage, automated jailbreaks, and Shadow AI stopped being theoretical concerns. They showed up in production copilots, IDE agents, CRMs, office suites, and internal chatbots. The common thread: none of these failures required breaking the model. They exploited how enterprises connected models to data and tools. The Trust Problem No One Designed For Traditional software has clear boundaries. Input validation. Access controls. Execution sandboxes. Code is code. Data is data. Large language models collapse this distinction. Everything entering the context window is processed as natural language. The model cannot reliably tell the difference between “summarize this document” from a user and “ignore previous instructions” embedded in that document. This creates a fundamental architectural tension. The more useful an AI system becomes (connecting it to email, documents, APIs, and tools), the larger the attack surface becomes. Five Failure Modes In The Wild Direct prompt injection is the simplest form. Attacker-controlled text tells the model to ignore prior instructions or perform unauthorized actions. In enterprise systems, this happens when untrusted content like emails, tickets, or CRM notes gets concatenated directly into prompts. One documented case involved a support ticket containing hidden instructions that caused an AI agent to export customer records. Indirect prompt injection is subtler and harder to defend. Malicious instructions hide in documents the system retrieves during normal operation: PDFs, web pages, wiki entries, email attachments. The orchestration layer treats retrieved content as trusted, so these injected instructions can override system prompts. Researchers demonstrated this by planting instructions in public web pages that corporate AI assistants later retrieved and followed. RAG data leakage often happens without any jailbreak at all. The problem is upstream: overly broad document embedding, weak vector store access controls, and retrieval logic that ignores user permissions. In several documented cases, users retrieved and summarized internal emails, HR records, strategy documents, and API keys simply by crafting semantic queries. The model did exactly what it was supposed to do. The retrieval pipeline was the gap. Agentic tool abuse raises the stakes. When models can call APIs, modify workflows, or interact with cloud services, injected instructions translate into real actions. Security researchers demonstrated attacks where a planted instruction in a GitHub issue caused an AI coding agent to exfiltrate repository secrets. The agent had the permissions. It followed plausible-looking instructions. No human approved the action. Shadow AI sidesteps enterprise controls entirely. Employees frustrated by slow IT approvals or restrictive policies copy sensitive data into personal ChatGPT accounts, unmanaged tools, or browser extensions. Reports from 2024 and 2025 link Shadow AI to a significant portion of data breaches, higher remediation costs, and repeated exposure of customer PII. The data leaves the building through the front door. Threat Scenario Consider a company that deploys an AI assistant with access to Confluence, Jira, Slack, and the ability to create calendar events and send emails on behalf of users. An attacker gets a job posting shared in a public Slack channel. They apply, and their resume (a PDF) contains invisible text: instructions telling the AI to forward any messages containing “offer letter” or “compensation” to an external address, then delete the forwarding rule from the user’s settings. A recruiter asks the AI to summarize the candidate’s resume. The AI ingests the hidden instructions. Weeks later, offer letters start leaking. The forwarding rule is gone. Logs show the AI took the actions, but the AI has no memory of why. The individual behaviors described here have already been observed in production systems. What remains unresolved is how often they intersect inside a single workflow. These are not edge cases. They are ordinary features interacting in ways most enterprises have not threat-modeled. What The Incidents Reveal Across documented failures, the base model is rarely the point of failure. Defenses break at three layers: Context assembly. Systems concatenate untrusted content without sanitization, origin tagging, or priority controls. The model cannot distinguish between instructions from the system prompt and instructions from a retrieved email. Trust assumptions. Orchestration layers assume retrieved content is safe, that model intent aligns with user authorization, and that probabilistic guardrails will catch adversarial inputs. As context windows grow and agents gain autonomy, these assumptions fail. Tool invocation. Agentic systems map model output directly to API calls without validating that the action matches user intent, checking privilege boundaries, or requiring human approval for sensitive operations. This is why prompt injection now holds the top position in the OWASP GenAI Top 10. Security researchers increasingly frame AI systems not as enhanced interfaces but as new remote code execution surfaces. What This Means For Enterprise Teams Security teams now face AI risk that spans application security, identity management, and data governance simultaneously. Controls must track where instructions originate, how context gets assembled, and when tools are invoked. Traditional perimeter defenses do not cover these attack vectors. Platform and engineering teams need to revisit RAG and agent architectures. Permission-aware retrieval, origin tagging, instruction prioritization, and policy enforcement at the orchestration layer are becoming baseline requirements. Tool calls based solely on model output represent a high-blast-radius design choice that warrants scrutiny. Governance and compliance teams must address Shadow AI as a structural problem, not a policy problem. Employees route around controls

Newsletter

The Enterprise AI Brief | Issue 3

The Enterprise AI Brief | Issue 3 Inside This Issue The Threat Room When AI Agents Act, Identity Becomes the Control Plane A single poisoned document. An agent following instructions it should have ignored. An audit log that points to the wrong person. AI agents are no longer just automation: they’re privileged identities that can be manipulated through their inputs. Regulators are catching up. NIST is collecting security input, FINRA is flagging autonomy and auditability as governance gaps, and Gartner predicts 25% of enterprise breaches will trace to agent abuse by 2028. The question isn’t whether agents create risk. It’s whether your controls were built for actors that can be turned by a document.  → Read the full article The Operations Room Agentic AI in Production: The System Worked. The Outcome Was Wrong. The system worked. The outcome was wrong. Most enterprises are running agentic pilots, but few have crossed into safe production. This piece explains what’s blocking the path. → Read the full article Enterprise GenAI Pilot Purgatory: Why the Demo Works and the Rollout Doesn’t Why do so many GenAI pilots impress in the demo, then quietly die before production? Research from 2025 and early 2026 reveals the same five breakdowns, again and again. This piece maps the failure mechanisms, and what the rare exceptions do differently. → Read the full article The Engineering Room AI Agents Broke the Old Security Model. AI-SPM Is the First Attempt at Catching Up. Traditional model security asks: what might the AI say? Agent security asks: what might the system do? Microsoft and AWS are shipping AI-SPM capabilities that track tools, identities, and data paths across agent architectures, because when agents fail, the breach is usually a tool call, not a hallucination.  → Read the full article The Governance Room From Disclosure to Infrastructure: How Global AI Regulation Is Turning Compliance Into System Design A retailer’s AI system flags fraudulent returns. The documentation is flawless. Then auditors ask for logs, override records, and proof that human review actually happened. The system passes policy review. It fails infrastructure review. This is the new compliance reality. Across the EU, US, and Asia-Pacific, enforcement is shifting from what policies say to what systems actually do. This piece explains why AI governance is becoming an infrastructure problem, what auditors are starting to look for, and what happens when documentation and architecture tell different stories. → Read the full article

Uncategorized

Shadow AI Metrics Expose a Governance Gap in Enterprise AI Programs

Shadow AI Metrics Expose a Governance Gap in Enterprise AI Programs A developer hits a wall debugging a production issue at 11 PM. She pastes 200 lines of proprietary code into ChatGPT using her personal account. The AI helps her fix the bug in minutes. The code, which contains API keys and references to internal systems, now exists outside the company’s control. No log was created. No policy was enforced. No one knows it happened. This is shadow AI, and it is occurring thousands of times per month across most enterprises. Organizations can now measure how often employees use AI tools, how much data is shared, and how frequently policies are violated. What they cannot do is enforce consistent governance when AI is used through personal accounts, unmanaged browsers, and copy-paste workflows. Shadow AI has turned AI governance into an enforcement problem, not a visibility problem. What the Metrics Actually Show Recent enterprise telemetry paints a consistent picture across industries and regions. According to data reported by Netskope, 94 percent of organizations now use generative AI applications. Nearly half of GenAI users access those tools through personal or unmanaged accounts, placing their activity outside enterprise identity, logging, and policy enforcement. On average, organizations record more than 200 GenAI-related data policy violations per month, with the highest-usage environments seeing over 2,000 violations monthly. Independent studies of shadow AI usage reinforce this pattern. Research analyzing browser-level and endpoint telemetry shows that the dominant data transfer method is not file upload but copy-paste. A large majority of employees paste confidential information directly into AI prompts, and most of those actions occur outside managed enterprise accounts. These metrics matter because they demonstrate scale. Shadow AI is not an edge case or a compliance outlier. It is routine behavior. What Data Is Leaving Enterprise Boundaries Across reports, the same categories of data appear repeatedly in AI-related policy violations: In most cases, this data is shared without malicious intent, as employees use AI tools to solve routine work problems faster. What makes these disclosures difficult to govern is not their sensitivity but their format. Prompts are unstructured, conversational, and ephemeral. They rarely resemble the files and records that traditional data governance programs are designed to protect. Where Governance Breaks Down Most enterprise AI governance frameworks assume three conditions: managed identity, known systems, and auditable records. Shadow AI violates all three. Identity fragmentation. When employees use personal AI accounts, organizations lose the ability to associate data use with enterprise roles, approvals, or accountability structures. System ambiguity. The same AI service may be accessed through sanctioned and unsanctioned paths that are indistinguishable at the network layer. Record absence. Prompt-based interactions often leave no durable artifact that can be reviewed, retained, or audited after the fact. As a result, organizations can detect that violations occur but cannot reliably answer who is responsible, what data was exposed, or whether policy intent was upheld. Why Existing Controls Do Not Close the Gap Enterprises have attempted to adapt existing controls to generative AI usage, with limited success. CASB and network-based controls can identify traffic to AI services but struggle to distinguish personal from corporate usage on the same domains. Traditional DLP systems are optimized for files and structured data flows, not conversational text entered into web forms. Browser-level controls provide more granular inspection but only within managed environments, leaving personal devices and alternative browsers outside scope. These controls improve visibility but do not establish enforceable governance. They observe behavior without consistently preventing or constraining it. More granular controls exist, but they tend to be limited to managed environments and do not generalize across personal accounts, devices, or workflows. What’s At Stake The consequences of ungoverned AI use extend beyond policy violations. Regulatory exposure. Data protection laws including GDPR, CCPA, and industry-specific regulations require organizations to know where personal data goes and to demonstrate control over its use. Shadow AI makes both difficult to prove. Intellectual property loss. Code, product plans, and strategic documents shared with AI tools may be used in model training or exposed through data breaches at the provider. Once shared, the data cannot be recalled. Client and partner trust. Contracts often include confidentiality provisions and data handling requirements. Uncontrolled AI use can put organizations in breach without their knowledge. Audit failure. When regulators or auditors ask how sensitive data is protected, “we have a policy but cannot enforce it” is not an adequate answer. These are not theoretical risks. They are the logical outcomes of the gap between policy and enforcement that current metrics reveal. Implications For AI Governance Programs Shadow AI forces a reassessment of how AI governance is defined and measured. First, policy coverage does not equal policy enforcement. Having acceptable use policies for AI does not ensure those policies can be applied at the point of use. Second, governance ownership is often unclear. Shadow AI risk sits between security, data governance, legal, and business teams, creating gaps in accountability. Third, audit readiness is weakened. When data use occurs outside managed identity and logging, organizations cannot reliably demonstrate compliance with internal policies or external expectations. Frameworks such as the AI Risk Management Framework published by NIST emphasize transparency, risk documentation, and control effectiveness. Shadow AI challenges all three by moving data use into channels that governance programs were not designed to regulate. Open Governance Questions Several unresolved issues remain for enterprises attempting to govern generative AI at scale.

Uncategorized

Registry-Aware Guardrails: Moving AI Safety and Policy Into External Control Planes

Registry-Aware Guardrails: Moving AI Safety and Policy Into External Control Planes Enterprise AI teams are shifting safety and policy logic out of models and into external registries and control planes. Instead of hardcoding guardrails that require retraining to update, these systems consult versioned policies, taxonomies, and trust records at runtime. The result: organizations can adapt to new risks, regulations, and business rules without redeploying models or waiting for fine-tuning cycles. Early enterprise AI deployments relied on static guardrails: keyword filters, prompt templates, or fine-tuned safety models embedded directly into applications. These worked when AI systems were simple. They break down when retrieval-augmented generation, multi-agent workflows, and tool-calling pipelines enter the picture. Two failure modes illustrate the problem. First, keyword and pattern filters miss semantic variations. A filter blocking “bomb” does not catch “explosive device” or context-dependent threats phrased indirectly. Second, inference-based leaks bypass content filters entirely. A model might not output sensitive data directly but can confirm, correlate, or infer protected information across multiple queries, exposing data that no single response would reveal. Recent research and platform disclosures describe a different approach: treating guardrails as first-class operational artifacts that live outside the model. Policies, safety categories, credentials, and constraints are queried at runtime, much like identity or authorization systems in traditional software. The model generates; the control plane governs. How The Mechanism Works Registry-aware guardrails introduce an intermediate control layer between the user request and the model or agent execution path. At runtime, the AI pipeline consults one or more external registries holding authoritative definitions. These registries can include safety taxonomies, policy rules, access-control contracts, trust credentials, or compliance constraints. The guardrail logic evaluates the request, retrieved context, or generated output against the current registry state. This pattern operates in two valid modes. In the first, guardrails evaluate policy entirely outside the model, intercepting inputs and outputs against registry-defined rules. In the second, registry definitions are passed into the model at runtime, conditioning its behavior through instruction-tuning or policy-referenced prompts. Both approaches avoid frequent retraining and represent the same architectural pattern: externalizing policy from model weights. Consider a scenario: A financial services firm deploys a customer-facing chatbot. Rather than embedding compliance rules in the model, the system queries a registry before each response. The registry defines which topics require disclaimers, which customer segments have different disclosure requirements, and which queries must be escalated to human review. When regulations change, the compliance team updates the registry. The chatbot’s behavior changes within minutes, with no model retraining, no code deployment, and a full audit trail of what rules applied to each interaction. Several technical patterns recur across implementations: In practice, this pattern appears in platform guardrails for LLM APIs, policy-governed retrieval pipelines, trust registries for agent and content verification, and control-plane safety loops operating on signed telemetry. The Architectural Shift This is not just a technical refinement. It represents a fundamental change in where safety logic lives and when governance decisions are made. In traditional deployments, safety is a model property enforced ex-post: teams fine-tune for alignment, add a content filter, and remediate when failures occur. Governance is reactive, applied after problems surface. In registry-aware architectures, safety becomes an infrastructure property enforced ex-ante: policies are defined, versioned, and applied before the model generates or actions execute. Governance is proactive, with constraints evaluated at runtime against current policy state. This mirrors how enterprises already handle identity, authorization, and compliance in other systems. No one embeds access control logic directly into every application. Instead, applications query centralized policy engines. Registry-aware guardrails apply the same principle to AI. Some implementations extend trust registries into trust graphs, modeling relationships and delegations between agents, credentials, and policy authorities. These remain emerging extensions rather than replacements for simpler registry architectures. Why This Matters Now Static guardrails struggle in dynamic AI systems. Research and incident analyses show that fixed filters are bypassed by evolving prompt injection techniques, indirect attacks through retrieved content, and multi-agent interactions. The threat surface changes faster than models can be retrained. Registry-aware guardrails address a structural limitation rather than a single attack class. By decoupling safety logic from models and applications, organizations can update constraints as threats, regulations, or business rules change. The timing also reflects operational reality. Enterprises are deploying AI across heterogeneous stacks: proprietary models, third-party APIs, retrieval systems, internal tools. A registry-driven control plane provides a common enforcement point independent of any single model architecture or vendor, reducing policy drift across teams and use cases. Implications For Enterprises For security, platform, and governance teams, registry-aware guardrails introduce several concrete implications: At the same time, this pattern increases the importance of registry reliability and access control. The registry becomes part of the AI system’s security boundary. A compromised registry compromises every system that trusts it. Risks and Open Questions Research and early implementations highlight unresolved challenges: What To Watch Several areas remain under active development or unresolved: Further Reading

Uncategorized

Agentic AI Gets Metered: Vertex AI Agent Engine Billing Goes Live

Agentic AI Gets Metered: Vertex AI Agent Engine Billing Goes Live On January 28, 2026, Google Cloud will begin billing for three core components of Vertex AI Agent Engine: Sessions, Memory Bank, and Code Execution. This change makes agent state, persistence, and sandboxed execution first-class, metered resources rather than implicitly bundled conveniences. Vertex AI Agent Engine, formerly known as the Reasoning Engine, has been generally available since 2025, with runtime compute billed based on vCPU and memory usage. But key elements of agent behavior, including session history, long-term memory, and sandboxed code execution, operated without explicit pricing during preview and early GA phases. In December 2025, Google updated both the Vertex AI pricing page and Agent Builder release notes to confirm that these components would become billable starting January 28, 2026. With SKUs and pricing units now published, the platform moves from a partially bundled cost model to one where agent state and behavior are directly metered. How The Mechanism Works Billing for Vertex AI Agent Engine splits across compute execution and agent state persistence. Runtime compute is billed using standard Google Cloud units. Agent Engine runtime consumes vCPU hours and GiB-hours of RAM, metered per second with idle time excluded. Each project receives a monthly free tier of 50 vCPU hours and 100 GiB-hours of RAM, after which usage is charged at published rates. Sessions are billed based on stored session events that contain content. Sessions are not billed by duration, but by the number of content-bearing events retained. Billable events include user messages, model responses, function calls, and function responses. System control events, such as checkpoints, are explicitly excluded. Pricing is expressed as a per-event storage model, illustrated using per-1,000-event examples, rather than compute time. Memory Bank is billed based on the number of memories stored and returned. Unlike session events, which capture raw conversational turns, Memory Bank persists distilled, long-term information extracted from sessions. Configuration options determine what content is considered meaningful enough to store. Each stored or retrieved memory contributes to billable usage. Code Execution allows agents to run code in an isolated sandbox. This sandbox is metered similarly to runtime compute, using per-second vCPU and RAM consumption, with no charges for idle time. Code Execution launched in preview in 2025 and begins billing alongside Sessions and Memory Bank in January 2026. What This Looks Like In Practice Consider a customer service agent handling 10,000 conversations per month. Each conversation averages 12 events: a greeting, three customer messages, three agent responses, two function calls to check order status, two function responses, and a closing message. That is 120,000 billable session events per month, before accounting for Memory Bank extractions or any code execution. If the agent also stores a memory for each returning customer and retrieves it on subsequent visits, memory operations add another layer of metered usage. Now scale that to five agents across three departments, each with different verbosity levels and tool dependencies. The billing surface area expands across sessions, memory operations, and compute usage, and without instrumentation, teams may not see the accumulation until the invoice arrives. Analysis This change matters because it alters the economic model of agent design. During preview, teams could retain long session histories, extract extensive long-term memories, and rely heavily on sandboxed code execution without seeing distinct cost signals for those choices. By introducing explicit billing for sessions and memories, Google is making agent state visible as a cost driver. The platform now treats conversational history, long-term context, and tool execution as resources that must be managed, not just features that come for free with inference. Implications For Enterprises For platform and engineering teams, cost management becomes a design concern rather than a post-deployment exercise. Session length, verbosity, and event volume directly affect spend. Memory policies such as summarization, deduplication, and selective persistence now have financial as well as architectural consequences. From an operational perspective, autoscaling settings, concurrency limits, and sandbox usage patterns influence both performance and cost. Long-running agents, multi-agent orchestration, and tool-heavy workflows can multiply runtime hours, stored events, and memory usage. For governance and FinOps teams, agent state becomes something that must be monitored, budgeted, and potentially charged back internally. Deleting unused sessions and memories is not just a data hygiene task but the primary way to stop ongoing costs. The Bigger Picture Google is not alone in moving toward granular agent billing. As agentic architectures become production workloads, every major cloud provider faces the same question: how do you price something that thinks, remembers, and acts? Token-based billing made sense when AI was stateless. But agents accumulate context over time, persist memories across sessions, and invoke tools that consume compute independently of inference. Metering these components separately reflects a broader industry shift: agents are not just models. They are systems, and systems have operational costs. Similar pricing structures are increasingly plausible across AWS, Azure, and independent agent platforms as agentic workloads mature. The teams that build cost awareness into their agent architectures now will have an advantage when granular agent billing becomes standard. Risks and Open Questions Several uncertainties remain. Google documentation does not yet clearly define default retention periods for sessions or memories, nor how quickly deletions translate into reduced billing. This creates risk for teams that assume short-lived state by default. Forecasting costs may also be challenging. Session and memory usage scales with user behavior, response verbosity, and tool invocation patterns, making spend less predictable than token-based inference alone. Finally, as agent systems grow more complex, attributing costs to individual agents or workflows becomes harder, especially in multi-agent or agent-to-agent designs. This complicates optimization, internal chargeback, and accountability. Further Reading Google Cloud Vertex AI Pricing Vertex AI Agent Builder Release Notes Vertex AI Agent Engine Memory Bank Documentation AI CERTs analysis on Vertex AI Agent Engine GA Google Cloud blog on enhanced tool governance in Agent Builder