How to Wear Model Armor 1: Integration Patterns

Model Armor in Google Cloud is a managed security service that provides a programmable defense layer for sanitizing prompts and responses in Generative AI applications. At its core, Model Armor is a model-agnostic, API-first security solution designed to intercept and sanitize the I/O of Large Language Models (LLMs). It lets developers define and enforce safety policies, referred to as Templates, that sit between the user and the model, ensuring that interactions stay within organizational and security guardrails. Unlike Google Cloud Armor, which focuses on Layer 7 web traffic and DDoS protection, Model Armor operates on the semantic and content layer of GenAI. You can watch a YouTube video for a practical demonstration of these capabilities, including live examples of how the service intercepts and handles malicious requests.

Using Model Armor in your application requires understanding how to invoke it, how to instruct it what to do, and what to do after you receive the results of the call. The sections below cover these topics, plus a few extras.

Two ways to call Model Armor

There are two ways to use Model Armor: direct invocation and built-in integration. Direct invocation, as the name implies, is a single Model Armor API call. It gives you the flexibility to select the specific policy ‒ the template ‒ applied in the call, and it leaves to the caller the decisions of whether to proceed with the LLM inference call and whether to use the original or the sanitized content. Direct invocation requires credentials authorized to invoke the Model Armor API and access to the Model Armor template. The API provides distinct methods for sanitizing the input (pre-call) and the model’s response (post-call). The simplest way to call the Model Armor API is the Google Cloud SDK:

from google.cloud import modelarmor_v1

# project_id, location_id, template_id and prompt_text set up earlier
client_options = {
    "api_endpoint": f"modelarmor.{location_id}.rep.googleapis.com"
}
client = modelarmor_v1.ModelArmorClient(client_options=client_options)
template_name = (
    f"projects/{project_id}/locations/{location_id}/templates/{template_id}"
)
prompt_request = modelarmor_v1.SanitizeUserPromptRequest(
    name=template_name,
    user_prompt_data=modelarmor_v1.DataItem(text=prompt_text),
)
try:
    prompt_response = client.sanitize_user_prompt(request=prompt_request)
except Exception as e:
    print(f"[API Error] Failed to sanitize user prompt:\n{e}")
    raise  # don't continue with an undefined prompt_response
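After the call, it is up to the caller to act on the verdict. A minimal sketch of that decision, assuming the documented response shape where sanitization_result.filter_match_state reports whether any filter matched:

```python
# Decide whether to proceed to the LLM call based on the Model Armor
# verdict. Accepts the enum value or its string name; the attribute path
# prompt_response.sanitization_result.filter_match_state follows the
# documented SanitizeUserPromptResponse shape.

def is_blocked(filter_match_state) -> bool:
    """Return True when any Model Armor filter matched the prompt."""
    name = getattr(filter_match_state, "name", filter_match_state)
    return name == "MATCH_FOUND"

# Typical use after sanitize_user_prompt():
#   if is_blocked(prompt_response.sanitization_result.filter_match_state):
#       ...refuse or fall back instead of calling the model...
```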

You can get hands-on experience using this method by doing the Build a Secure Agent with Model Armor and Identity codelab. In this lab you will implement a call to Model Armor API before your agent interacts with the AI model.

More sophisticated implementations use a plug-in design pattern to invoke the Model Armor API each time any AI agent in your application interacts with a model. Different frameworks provide their own implementation of the pattern: plugins and callbacks in the Agent Development Kit (ADK), ChatGPT plugins, LangChain integrations, and model wrappers in MLflow.
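Whatever the framework, the pattern boils down to wrapping every model call with pre- and post-call sanitization hooks. A framework-agnostic sketch, where sanitize() and model_call() are stand-ins for the Model Armor call shown earlier and your LLM client:

```python
# Plug-in pattern sketch: every model interaction passes through a
# sanitization hook before and after the call. The function names are
# illustrative, not part of any SDK.

def guarded_generate(model_call, sanitize, prompt: str) -> str:
    """Sanitize the prompt, call the model, then sanitize the response."""
    verdict, safe_prompt = sanitize(prompt)
    if verdict == "BLOCK":
        return "Request blocked by policy."
    response = model_call(safe_prompt)
    verdict, safe_response = sanitize(response)
    if verdict == "BLOCK":
        return "Response blocked by policy."
    return safe_response
```

In a real framework this function body would live in an ADK callback or a LangChain wrapper, so agent code never calls the model directly.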

The built-in integration does not require an explicit call to the Model Armor API. Instead, you configure supported Google Cloud services so that all LLM inferences to their models are sanitized by Model Armor. The integration is configured per service, per project: activating the Model Armor integration with Vertex AI in a project sanitizes all I/O to Gemini models made through Vertex AI in that project. As of February 2026, Model Armor integrates with Vertex AI, Google Kubernetes Engine (GKE) and Gemini Enterprise. The following table captures the main differences in how the integration works across services:

| Service | LLM being protected | Type of templates | Auth method | Invocation method |
|---|---|---|---|---|
| GKE | Hosted in GKE | User-defined, per model | Service Extensions service account | Through configuration |
| Vertex AI | Gemini family models | Floor settings & user-defined, per call | Vertex AI service account or agent’s identity | Through configuration or using payload parameters in the Vertex AI API call |
| Gemini Enterprise | Gemini family models | User-defined | Service agent identity | Through configuration |

Configure Vertex AI integration

Model Armor lets you configure the integration with Vertex AI directly using the Cloud console, the gcloud CLI or Terraform. There are a couple of details you should pay attention to when configuring the integration:

  • You configure the Vertex AI integration using floor settings, which are also used to enforce the bottom threshold for Model Armor template configurations
  • The integration is defined at the project scope, i.e. ALL API calls to models.generateContent and models.streamGenerateContent of Vertex AI in this project will be screened by Model Armor
  • The integration uses the floor settings as a template for both user prompt and model response sanitizations
  • You can use a user-defined template by populating the modelArmorConfig parameter in the generate content API calls
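For the last point, a minimal sketch of a generateContent REST request body that selects a user-defined template per call. The modelArmorConfig field names follow the Vertex AI REST reference; verify them against your API version:

```python
# Build a generateContent request body that overrides the floor settings
# with a user-defined Model Armor template via modelArmorConfig.

def build_generate_content_body(prompt: str, project: str,
                                location: str, template: str) -> dict:
    template_name = (
        f"projects/{project}/locations/{location}/templates/{template}"
    )
    return {
        "contents": [{"role": "user", "parts": [{"text": prompt}]}],
        "modelArmorConfig": {
            "promptTemplateName": template_name,
            "responseTemplateName": template_name,
        },
    }
```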

Do the following to activate Model Armor for generate content calls to the Vertex AI API. All code snippets use PROJECT_ID as a placeholder for the project ID.

  1. Enable the Model Armor and Sensitive Data Protection APIs. I assume that the Vertex AI API is already enabled.
gcloud services enable modelarmor.googleapis.com dlp.googleapis.com --project=PROJECT_ID

Or with Terraform resource:

locals {
  gcp_services = [
    "modelarmor.googleapis.com",
    "dlp.googleapis.com"
  ]
}

resource "google_project_service" "gcp_services" {
  for_each = toset(local.gcp_services)
  project  = "PROJECT_ID"
  service  = each.value
}
  2. Set up the filter configuration for the floor settings as described in the documentation.

  3. Activate the integration to enforce blocking of screened input and output of the content-generating calls.

gcloud model-armor floorsettings update \
  --full-uri=projects/PROJECT_ID/locations/global/floorSetting \
  --add-integrated-services=VERTEX_AI \
  --vertex-ai-enforcement-type=INSPECT_AND_BLOCK

Or with Terraform resource:

resource "google_model_armor_floorsetting" "floorsetting-filter-config" {
  location    = "global"
  parent      = "projects/PROJECT_ID"
  filter_config {
    # defines 'default' Template settings
  }
  integrated_services = ["AI_PLATFORM"]
  ai_platform_floor_setting {
    inspect_and_block = true
  }
}

How you know that Model Armor blocked a request

When the integration is active and Model Armor blocks the prompt or response after inspection, the Vertex AI API returns status 200 OK with the following payload in the response body:

{
  "promptFeedback": {
    "blockReason": "MODEL_ARMOR",
    "blockReasonMessage": "Blocked by Floor Setting. The prompt violated Responsible AI Safety settings (Harassment, Dangerous), Prompt Injection and Jailbreak filters."
  }
}

The blockReasonMessage value depends on the filters that blocked the request. The response body can contain additional payload fields. See the documentation for more examples.
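A caller can branch on this payload; a minimal sketch using the field names from the response above:

```python
# Check whether a generateContent response (parsed into a dict) was
# blocked by Model Armor, based on the promptFeedback payload.

def blocked_by_model_armor(response: dict) -> bool:
    feedback = response.get("promptFeedback", {})
    return feedback.get("blockReason") == "MODEL_ARMOR"
```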

When the integration is activated, Model Armor is invoked transparently. The results of the Model Armor screening are logged when audit logging is activated. Note that audit logs do not contain prompt and response payloads. You can log the prompt and response payloads by configuring logging in the floor settings or in templates. However, keep in mind that original prompts and responses may contain sensitive business and personal information, so you will need separate methods for sanitizing sensitive data in your logs. See my post on this topic for an example.

What permissions you need

To run the gcloud CLI commands above or provision the Terraform resource, you need only the roles/modelarmor.floorSettingsAdmin role on the PROJECT_ID project.

Location limitations

There are some limitations on the supported locations:

  • Vertex AI: templates must be in one of the following locations: us-central1, us-east4, us-west1, or europe-west4.
  • Gemini Enterprise: the location of the template and the Gemini Enterprise instance must match

These location limitations can change. Consult the documentation for up-to-date information.

One more word about floor settings

Model Armor floor settings have another function besides configuring the integration with Vertex AI. As the name implies, when enforced they define the minimal filter settings for all new Model Armor templates created in the project. Floor settings can also be defined at the folder and organization level, thus enforcing the minimal filter settings over the hierarchy of projects. One way to do it is to enforce floor settings using the gcloud CLI:

gcloud model-armor floorsettings update \
  --full-uri=projects/PROJECT_ID/locations/global/floorSetting \
  --enable-floor-setting-enforcement=true

Or with Terraform resource:

resource "google_model_armor_floorsetting" "floorsetting-filter-config" {
  location    = "global"
  parent      = "projects/PROJECT_ID"
  filter_config {
    # defines 'default' Template settings
  }
  enable_floor_setting_enforcement = true
}

What’s next and about MCP Servers

The examples for configuring the integration with Vertex AI include only the gcloud CLI and Terraform. You can also do the same using the Google Cloud console or by calling the Model Armor API over REST. Executable samples of a Terraform plan and a Python application that invoke the generateContent API with and without a Model Armor template can be found in the devrel-demo repository on GitHub, together with many other demo applications and samples.

I’d also recommend trying the following codelabs, which provide hands-on experience with Model Armor, Sensitive Data Protection and the integration of these security services into your AI agent application:

There are several topics that I didn’t include in this post. One of them is best practices for integrating Model Armor into an AI agent application built with ADK and LangChain. You may find some of these practices in the codelabs above, but not all. I’m planning to cover this in a separate post: “How to wear Model Armor 2: Use ADK with agent frameworks”.
The second topic is the similarities and differences between sanitizing sensitive data using Model Armor and using the Sensitive Data Protection service (DLP). So, there will be a “How to wear Model Armor 3”.

Last but not least is the topic of using Model Armor with MCP servers. When an AI agent is configured with tools, the output from the tools is fed directly to the model. Model Armor integrations support Google Cloud MCP servers (currently in preview). This integration is controlled via the floor settings, using slightly different flags:

Using gcloud CLI:

gcloud model-armor floorsettings update \
  --full-uri=projects/PROJECT_ID/locations/global/floorSetting \
  --add-integrated-services=GOOGLE_MCP_SERVER \
  --vertex-ai-enforcement-type=INSPECT_AND_BLOCK

Or with Terraform resource:

resource "google_model_armor_floorsetting" "floorsetting-filter-config" {
  location    = "global"
  parent      = "projects/PROJECT_ID"
  filter_config {
    # defines 'default' Template settings
  }
  integrated_services = ["GOOGLE_MCP_SERVER"]
  ai_platform_floor_setting {
    inspect_and_block = true
  }
}

This activation works for all Google Cloud MCP servers.