GenAI

How to wear Model Armor 2: Integrating with ADK and LangChain

Quick recap of part 1

The first post about Model Armor explored the fundamentals of Google Cloud’s managed security service for Generative AI applications, which provides a model-agnostic defense layer to sanitize both prompts and model responses.

It covered the two primary patterns for integrating Model Armor into your stack:

Direct Invocation: Using the Model Armor SDK or API for granular control over pre-call and post-call sanitization.
Built-in Integration: Configuring services like Vertex AI, GKE, and Gemini Enterprise to automatically enforce security policies through “Floor Settings” and user-defined templates without explicit API calls in your application logic.
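To make the direct invocation pattern concrete, here is a minimal sketch of a model call wrapped in a pre-call and a post-call check. It deliberately does not use the Model Armor SDK: the Verdict class, guarded_call and the toy check functions are hypothetical stand-ins I made up for illustration, and in a real application the two checks would call the sanitize API instead.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Verdict:
    """Outcome of one sanitization check (hypothetical shape, not the SDK's)."""
    blocked: bool
    reason: str = ""


def guarded_call(
    prompt: str,
    model: Callable[[str], str],
    check_prompt: Callable[[str], Verdict],
    check_response: Callable[[str], Verdict],
) -> str:
    """Run a model call with a pre-call and a post-call sanitization step."""
    pre = check_prompt(prompt)  # pre-call: inspect the user prompt
    if pre.blocked:
        return f"Request blocked: {pre.reason}"
    answer = model(prompt)
    post = check_response(answer)  # post-call: inspect the model output
    if post.blocked:
        return f"Response withheld: {post.reason}"
    return answer


# Toy stand-ins (assumptions) so the sketch runs without any cloud dependency.
def toy_prompt_check(text: str) -> Verdict:
    if "ignore previous instructions" in text.lower():
        return Verdict(True, "possible prompt injection")
    return Verdict(False)


def toy_response_check(text: str) -> Verdict:
    if "secret" in text.lower():
        return Verdict(True, "sensitive data")
    return Verdict(False)


def echo_model(prompt: str) -> str:
    return f"You said: {prompt}"
```

Passing the checks in as plain callables keeps the wrapper independent of any framework, so the same shape can sit in front of a LangChain chain or an ADK agent callback.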

It also walked through the practical configuration of these integrations using the gcloud CLI and Terraform, establishing a secure baseline for your GenAI pipelines. In this post I shift my focus to how direct invocation works in practice. I will review how to interpret sanitize API responses and how to incorporate the API calls into two agent frameworks: LangChain, probably the most widespread framework today for implementing agentic workflows, and the Agent Development Kit (ADK), which I personally prefer for its simplicity.

How to Wear Model Armor 1: Integration Patterns

Model Armor in Google Cloud is a managed security service that provides a programmable defense layer to sanitize prompts and responses for Generative AI applications. At its core, Model Armor is a model-agnostic, API-first security solution designed to intercept and sanitize the I/O of Large Language Models (LLMs). It allows developers to define and enforce safety policies (referred to as Templates) that sit between the user and the model, ensuring that interactions remain within organizational and security guardrails. Unlike Google Cloud Armor, which focuses on Layer 7 web traffic and DDoS protection, Model Armor operates on the semantic and content layer of GenAI. You can watch a YouTube video to see a practical demonstration of these capabilities in action, including live examples of how the service intercepts and handles malicious requests.

If you have not noticed, the Generative AI part of the Vertex AI SDK is now deprecated. This means that new versions of the SDK will not update generative AI functions, and these functions will be removed from SDK versions entirely in 2026. You can find more information in the deprecation notice.

In 2024, the Generative AI module was introduced to the Vertex AI SDK. The way it was published for different programming languages caused considerable confusion. For example, in Python a developer had to install the google-cloud-aiplatform package and then import vertexai, while in Go the installed module was named cloud.google.com/go/vertexai and the import statement had to be import "cloud.google.com/go/vertexai/genai". In 2025, Google released a new GenAI SDK intended to replace the collection of Vertex AI SDKs for different languages. The new SDK has a more intuitive interface that is similar across programming languages.

Control your Generative AI costs with the Vertex API’s context caching

Note: This blog has two authors.

What is context caching?

Vertex AI is a Google Cloud machine learning (ML) platform that, among other things, provides access to a collection of generative AI models. This includes the models known under the common name “Gemini models”. When you interact with these models, you provide them with all the information about your inquiry. The Gemini models accept information in multiple formats, including text, video and audio. The provided information is often referred to as “context”. The Gemini models are known to accept very long contexts.

Last updated: August 12, 2024.

Everyone today has heard or read about GenAI, ChatGPT and other AI things. There is a lot of terminology, abbreviations and other clever words. I found myself struggling to remember all of them, so I decided to write down my definitions for each of these terms.

TL;DR: If you are familiar with AI terminology, you might want to stop reading this post here. For the rest of the readers, welcome to my personalized thesaurus of GenAI terminology.

Good Bye Vertex AI SDK

My GenAI Thesaurus