Google Cloud

Beyond Chatbots: How to Build Asynchronous AI Agents on Google Cloud

When we think of AI agents, we almost instinctively picture a chatbot: a user types a question, and the agent replies immediately. This request/response model is great for direct human interaction, but it doesn’t fit every use case.
Real-world enterprise systems are often distinct, disparate, and disconnected. They communicate through events—messages sent between systems to trigger actions asynchronously. If you want your AI agent to automate complex orchestrations (like processing insurance claims, analyzing logs as they arrive, or summarizing documents uploaded to a bucket), you need to break out of the synchronous “chat” loop.
In this post, we’ll explore how to plug an AI agent into an event-driven architecture. We will move beyond the standard API call and look at how to trigger a Python-based agent asynchronously using Google Cloud Pub/Sub and Eventarc. This approach allows you to integrate AI into established interfaces without modifying them, effectively turning your agent into a silent, scalable background worker.
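To make this concrete, here is a minimal sketch of the kind of handler such an agent service might expose. It only shows how to unwrap the standard Pub/Sub push envelope (a JSON body with a base64-encoded `message.data` field); the `run_agent` call is a hypothetical placeholder for your actual agent invocation.

```python
import base64
import json


def handle_pubsub_push(envelope: dict) -> str:
    """Extract the text payload from a Pub/Sub push envelope.

    A push subscription POSTs a JSON body shaped like:
    {"message": {"data": "<base64>", "attributes": {...}}, "subscription": "..."}
    """
    message = envelope["message"]
    payload = base64.b64decode(message.get("data", "")).decode("utf-8")
    # Hand the decoded payload to the agent, e.g.:
    # run_agent(payload)   # hypothetical agent entry point
    return payload


# Simulate the envelope Pub/Sub (or Eventarc) would deliver:
envelope = {"message": {"data": base64.b64encode(b'{"claim_id": 42}').decode()}}
print(handle_pubsub_push(envelope))  # → {"claim_id": 42}
```

Because the agent replies to no one, its results typically go to another topic, a database, or a bucket rather than back to a user.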

Lightweight Session State: Using Vertex AI's Session Management Without a Full Agent Deployment

The Agent Development Kit (ADK) from Google is one of the popular frameworks for developing AI applications. It provides a rich set of tools that save development time and enable industry best practices. One such tool is session management, used to maintain the state of a user’s session during interaction with agents. ADK ships several implementations of the session service: an in-memory one for development, one backed by relational databases, and one that maintains state using Vertex AI ‒ Google Cloud’s platform for AI applications and ML models. You can find a lot of information about session management with ADK: read the documentation, learn about managing state and memory, and more.
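ADK’s actual session API differs from the sketch below (and parts of it are asynchronous), so treat this as a conceptual, pure-Python illustration of what a session service does ‒ keep a mutable state dictionary scoped to a user and a session ‒ with all names hypothetical:

```python
import uuid


class InMemorySessionStore:
    """Conceptual sketch of a session service (not ADK's actual API):
    each session holds a mutable state dict keyed by (user_id, session_id)."""

    def __init__(self):
        self._sessions = {}

    def create_session(self, user_id: str) -> str:
        session_id = uuid.uuid4().hex
        self._sessions[(user_id, session_id)] = {}
        return session_id

    def update_state(self, user_id: str, session_id: str, key: str, value) -> None:
        self._sessions[(user_id, session_id)][key] = value

    def get_state(self, user_id: str, session_id: str) -> dict:
        # Return a copy so callers cannot mutate stored state accidentally.
        return dict(self._sessions[(user_id, session_id)])


store = InMemorySessionStore()
sid = store.create_session("alice")
store.update_state("alice", sid, "last_topic", "session management")
print(store.get_state("alice", sid))  # → {'last_topic': 'session management'}
```

Swapping the in-memory dictionary for a database or for Vertex AI’s managed session storage is exactly the substitution ADK’s different session service implementations make for you.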

Securely Call Cloud Run Service From Anywhere

Enabling authentication for your Cloud Run application is easy ‒ a single mouse click (or a parameter in your CI/CD) without writing any code. Calling this application from another one is less straightforward. It is easy when both the caller and the called application run under the same identity in Google Cloud; in all other cases, it requires acquiring an identity token.

The problem begins with the documentation. Sometimes it isn’t clear whether a described token is an identity token or an access token. The former is good for invoking endpoints of your own applications on Google Cloud, while the latter works only for calling Google APIs.
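When your caller runs on a Google Cloud compute platform, the simplest way to get an identity token is the metadata server. The stdlib-only sketch below shows that pattern; the service URL is a placeholder, and off-cloud you would use the google-auth library (`google.oauth2.id_token.fetch_id_token`) instead:

```python
import urllib.request

METADATA_URL = ("http://metadata.google.internal/computeMetadata/v1/"
                "instance/service-accounts/default/identity")


def identity_token_url(audience: str) -> str:
    """Build the metadata-server URL that returns an identity token whose
    `aud` claim equals the target Cloud Run service URL."""
    return f"{METADATA_URL}?audience={audience}"


def fetch_identity_token(audience: str) -> str:
    """Fetch an identity token. Works only when running on Google Cloud,
    where the metadata server is reachable."""
    req = urllib.request.Request(
        identity_token_url(audience),
        headers={"Metadata-Flavor": "Google"},  # required by the metadata server
    )
    with urllib.request.urlopen(req) as resp:
        return resp.read().decode()


# The token is then passed in the request header:
#   Authorization: Bearer <token>
# with audience set to the called service's URL, e.g.
# "https://my-service-abc123-uc.a.run.app" (placeholder).
```

Note that the audience must be the called service’s own URL ‒ a token minted for one audience is rejected by another service.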

Good Bye Vertex AI SDK

In case you did not notice, the generative AI part of the Vertex AI SDK is now deprecated. This means that new versions of the SDK will no longer update the generative AI functionality, and it will be removed from the SDK entirely in 2026. You can find more details in the deprecation notice.

In 2024, the Generative AI module was introduced into the Vertex AI SDK. The way it was published for different programming languages caused quite a bit of confusion. For example, in Python a developer had to install the google-cloud-aiplatform package and then import vertexai, while in Go the installed module was named cloud.google.com/go/vertexai and the import statement was import "cloud.google.com/go/vertexai/genai". In 2025, Google released a new Gen AI SDK intended to replace the collection of Vertex AI SDKs for different languages. The new SDK has a more intuitive interface that is consistent across programming languages.

What is app-management enabled folder in Google Cloud

Do not be confused about the following introduction. This post *is* about app-management enabled folders. But before explaining what they are and how you can make one, it is important to quickly refresh what the term “folder” means in the context of Google Cloud.

If you have used Google Cloud, you know about Google Cloud projects. According to the Google Cloud resource hierarchy, any service resource (e.g. a virtual machine, a GKE cluster, or an IP address) has a project as its parent; the project is the first grouping mechanism of the hierarchy. When a user accesses Google Cloud with an organizational account ‒ a Google Workspace account issued by an organization’s administrator ‒ they have access to additional levels of grouping: folders and, at the top, the organization. Of course, all access is subject to the appropriate IAM permissions. Folders group projects and other folders to model a company’s organizational or production hierarchies and to control access to the underlying resources.
Users can also access Google Cloud with personal accounts ‒ free individual accounts created to access Google services like Gmail, Drive, and more. However, these accounts limit access to the resource hierarchy to the level of projects and their underlying service resources.

Simplify Your OTel Trace With Google Cloud

OpenTelemetry (OTel) is the go-to standard for monitoring applications, offering a vendor-neutral way to capture telemetry data like traces, metrics, and logs. This enables consistent instrumentation and avoids vendor lock-in. Developers widely use OTel to instrument applications and export telemetry data to Google Cloud Observability services.

OTel’s native data format follows the OTLP (OpenTelemetry Protocol) standard. Exporting OTel data to Google Cloud usually requires an exporter, such as the Google Cloud Trace exporter for Go; such exporters exist for most popular programming languages.
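An alternative to wiring a vendor exporter into every application is to emit plain OTLP and let an OpenTelemetry Collector forward traces to Google Cloud. A minimal collector configuration might look like the sketch below; the `googlecloud` exporter name follows the opentelemetry-collector-contrib distribution, and the endpoint and pipeline layout are illustrative:

```yaml
receivers:
  otlp:
    protocols:
      grpc:
        endpoint: 0.0.0.0:4317   # applications send plain OTLP here
exporters:
  googlecloud: {}                # forwards traces to Cloud Trace
service:
  pipelines:
    traces:
      receivers: [otlp]
      exporters: [googlecloud]
```

With this setup, applications stay vendor-neutral, and only the collector knows about Google Cloud.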

Control What You Log

DISCLAIMER: This post is not about log storage billing or managing log sinks.

Have you ever read or heard the phrase “Write everything to logs”? This is good advice: you never know what information will be useful, or when. It is easy to follow in Google Cloud. With the help of audit logs, all infrastructure, security, and other cloud-internal events are stored in Cloud Logging, and you can write application logs by simply printing them to stdout. However, there are situations when you may need to prevent some log entries from being stored:
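Printing to stdout works best when each line is structured JSON, because Cloud Logging on Cloud Run and GKE parses JSON lines and maps well-known fields such as `severity` and `message` onto the resulting LogEntry. A minimal stdlib-only sketch (field names beyond those two are arbitrary examples):

```python
import json
import sys


def log(severity: str, message: str, **fields) -> str:
    """Emit one structured log line to stdout. Cloud Logging maps the
    `severity` and `message` JSON fields to the matching LogEntry fields;
    any extra keys land in the entry's jsonPayload."""
    line = json.dumps({"severity": severity, "message": message, **fields})
    print(line, file=sys.stdout)
    return line


log("INFO", "claim processed", claim_id=42)
log("DEBUG", "raw payload received", payload_bytes=1024)
```

Emitting logs with explicit severities like this is also what later lets you filter out (or exclude from storage) the entries you decide you do not need.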

How to Export Google Cloud Logs

Google Cloud provides efficient and inexpensive storage for application and infrastructure logs. Logs stored in Google Cloud can be queried and analyzed using the analytical power of BigQuery. However, there are scenarios in which Google Cloud customers need to export log data to third-party (3P) solutions. This post reviews the two main use cases of log exporting: exporting logs that are already stored, and exporting logs as they are being ingested. It focuses on configuring and implementing the part of the process that extracts logs from Google Cloud; loading the data into 3P solutions is not covered because of the variety of requirements and constraints that different 3P solutions impose.
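For the streaming use case, a common pattern is a log sink with a Pub/Sub destination: each LogEntry arrives as base64-encoded JSON inside a push-subscription envelope, and your exporter decodes it before forwarding to the 3P system. A stdlib-only sketch of that decoding step (the sample entry’s values are made up; `logName`, `severity`, and `textPayload` are standard LogEntry fields):

```python
import base64
import json


def decode_exported_entry(push_body: bytes) -> dict:
    """Decode a LogEntry delivered by a log sink via a Pub/Sub push
    subscription: the entry is base64-encoded JSON in message.data."""
    envelope = json.loads(push_body)
    data = base64.b64decode(envelope["message"]["data"])
    return json.loads(data)  # a LogEntry dict: logName, severity, textPayload, ...


# Simulate what Pub/Sub would POST to the exporter endpoint:
entry = {"logName": "projects/p/logs/run", "severity": "ERROR", "textPayload": "boom"}
body = json.dumps(
    {"message": {"data": base64.b64encode(json.dumps(entry).encode()).decode()}}
).encode()
print(decode_exported_entry(body)["textPayload"])  # → boom
```

From here, forwarding the decoded entry is whatever the 3P solution requires ‒ an HTTP call, a Kafka producer, and so on.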

Using PromQL in Google Cloud

PromQL stands for Prometheus Query Language. This post is about using PromQL in Cloud Monitoring. PromQL provides an alternative to the Metrics Explorer menu-driven builder and to the Monitoring Query Language (MQL) interfaces for exploring metrics and creating charts and alerts. Google Cloud introduced support for PromQL at the same time as Managed Service for Prometheus; later, PromQL support was added to Monitoring alert management. Practically, this means you can use PromQL instead of MQL to query Cloud Monitoring metrics in the Metrics Explorer, in custom dashboard configurations, and in alert management.
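As a taste of what this looks like, here is a sketch of a query you could enter in Metrics Explorer’s PromQL tab. It assumes the documented mapping of Cloud Monitoring metric types to PromQL metric names (slashes and dots replaced, e.g. `compute.googleapis.com/instance/cpu/usage_time` becomes the name below); verify the exact name against your project’s metrics:

```promql
# Average CPU usage rate per Compute Engine VM over the last 5 minutes
avg by (instance_name) (
  rate(compute_googleapis_com:instance_cpu_usage_time[5m])
)
```

The same expression can be pasted into a dashboard widget or an alerting policy condition.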

Control your Generative AI costs with Vertex AI’s context caching

Note: This blog has two authors.

What is context caching?

Vertex AI is a Google Cloud machine learning (ML) platform that, among other things, provides access to a collection of generative AI models, including the models known collectively as the “Gemini models”. When you interact with these models, you provide them with all the information relevant to your inquiry. Gemini models accept information in multiple formats, including text, video, and audio; the provided information is often referred to as “context”. Gemini models are known for accepting very long contexts.