Name: adk-deploy-guide
Author: google

SKILL.md

$2c

Deployment Target Decision Matrix

Choose the right deployment target based on your requirements:

Criteria

Agent Engine

Cloud Run

GKE

Languages

Python

Python (+ others via custom containers)

Scaling

Managed auto-scaling (configurable min/max, concurrency)

Fully configurable (min/max instances, concurrency, CPU allocation)

Full Kubernetes scaling (HPA, VPA, node auto-provisioning)

Networking

VPC-SC and PSC supported

Full VPC support, direct VPC egress, IAP, ingress rules

Full Kubernetes networking

Session state

Native VertexAiSessionService (persistent, managed)

In-memory (dev), Cloud SQL, or Agent Engine session backend

Batch/event processing

Not supported

/invoke endpoint for Pub/Sub, Eventarc, BigQuery

Custom (Kubernetes Jobs, Pub/Sub)

Cost model

vCPU-hours + memory-hours (not billed when idle)

Per-instance-second + min instance costs

Node pool costs (always-on or auto-provisioned)

Setup complexity

Lower (managed, purpose-built for agents)

Medium (Dockerfile, Terraform, networking)

Higher (Kubernetes expertise required)

Best for

Managed infrastructure, minimal ops

Custom infra, event-driven workloads

Full Kubernetes control

Ask the user which deployment target fits their needs. Each is a valid production choice with different trade-offs.

Quick Deploy (ADK CLI)

For projects without Agent Starter Pack scaffolding. No Makefile, Terraform, or Dockerfile required.

# Cloud Run

adk deploy cloud_run --project=PROJECT --region=REGION path/to/agent/

# Agent Engine

adk deploy agent_engine --project=PROJECT --region=REGION path/to/agent/

# GKE (requires existing cluster)

adk deploy gke --project=PROJECT --cluster_name=CLUSTER --region=REGION path/to/agent/

All commands support --with_ui to deploy the ADK dev UI. Cloud Run also accepts extra gcloud flags after -- (e.g., -- --no-allow-unauthenticated).

See adk deploy --help or the ADK deployment docs for full flag reference.

For CI/CD, observability, or production infrastructure, scaffold with /adk-scaffold and use the sections below.

Dev Environment Setup & Deploy (Scaffolded Projects)

Setting Up Dev Infrastructure (Optional)

make setup-dev-env runs terraform apply in deployment/terraform/dev/. This provisions supporting infrastructure:

Service accounts (app_sa for the agent, used for runtime permissions)

Artifact Registry repository (for container images)

IAM bindings (granting the app SA necessary roles)

Telemetry resources (Cloud Logging bucket, BigQuery dataset)

Any custom resources defined in deployment/terraform/dev/

This step is optional — make deploy works without it (Cloud Run creates the service on the fly via gcloud run deploy --source .). However, running it gives you proper service accounts, observability, and IAM setup.

make setup-dev-env

Note: make deploy doesn't automatically use the Terraform-created app_sa. Pass --service-account explicitly or update the Makefile.

Deploying

Notify the human: "Eval scores meet thresholds and tests pass. Ready to deploy to dev?"

Wait for explicit approval

Once approved: make deploy

IMPORTANT: Never run make deploy without explicit human approval.

Production Deployment — CI/CD Pipeline

Best for: Production applications, teams requiring staging → production promotion.

Prerequisites:

Project must NOT be in a gitignored folder

User must provide staging and production GCP project IDs

GitHub repository name and owner

Steps:

If prototype, first add Terraform/CI-CD files using the Agent Starter Pack CLI (see /adk-scaffold for full options):

uvx agent-starter-pack enhance . --cicd-runner github_actions -y -s

Ensure you're logged in to GitHub CLI:

gh auth login  # (skip if already authenticated)

Run setup-cicd:

uvx agent-starter-pack setup-cicd \

  --staging-project YOUR_STAGING_PROJECT \

  --prod-project YOUR_PROD_PROJECT \

  --repository-name YOUR_REPO_NAME \

  --repository-owner YOUR_GITHUB_USERNAME \

  --auto-approve \

  --create-repository

Push code to trigger deployments

#### Key setup-cicd Flags

Flag

Description

--staging-project

GCP project ID for staging environment

--prod-project

GCP project ID for production environment

--repository-name / --repository-owner

GitHub repository name and owner

--auto-approve

Skip Terraform plan confirmation prompts

--create-repository

Create the GitHub repo if it doesn't exist

--cicd-project

Separate GCP project for CI/CD infrastructure. Defaults to prod project

--local-state

Store Terraform state locally instead of in GCS (see references/terraform-patterns.md)

Run uvx agent-starter-pack setup-cicd --help for the full flag reference (Cloud Build options, dev project, region, etc.).

Choosing a CI/CD Runner

Runner

Pros

Cons

github_actions (Default)

No PAT needed, uses gh auth, WIF-based, fully automated

Requires GitHub CLI authentication

google_cloud_build

Native GCP integration

Requires interactive browser authorization (or PAT + app installation ID for programmatic mode)

How Authentication Works (WIF)

Both runners use Workload Identity Federation (WIF) — GitHub/Cloud Build OIDC tokens are trusted by a GCP Workload Identity Pool, which grants cicd_runner_sa impersonation. No long-lived service account keys needed. Terraform in setup-cicd creates the pool, provider, and SA bindings automatically. If auth fails, re-run terraform apply in the CI/CD Terraform directory.

CI/CD Pipeline Stages

The pipeline has three stages:

CI (PR checks) — Triggered on pull request. Runs unit and integration tests.

Staging CD — Triggered on merge to main. Builds container, deploys to staging, runs load tests.

Path filter: Staging CD uses paths: ['app/**'] — it only triggers when files under app/ change. The first push after setup-cicd won't trigger staging CD unless you modify something in app/. If nothing happens after pushing, this is why.

Production CD — Triggered after successful staging deploy via workflow_run. Might require manual approval before deploying to production.

Approving: Go to GitHub Actions → the production workflow run → click "Review deployments" → approve the pending production environment. This is GitHub's environment protection rules, not a custom mechanism.

IMPORTANT: setup-cicd creates infrastructure but doesn't deploy automatically. Terraform configures all required GitHub secrets and variables (WIF credentials, project IDs, service accounts). Push code to trigger the pipeline:

git add . &#x26;&#x26; git commit -m "Initial agent implementation"

git push origin main

To approve production deployment:

# GitHub Actions: Approve via repository Actions tab (environment protection rules)

# Cloud Build: Find pending build and approve

gcloud builds list --project=PROD_PROJECT --region=REGION --filter="status=PENDING"

gcloud builds approve BUILD_ID --project=PROD_PROJECT

Cloud Run Specifics

For detailed infrastructure configuration (scaling defaults, Dockerfile, FastAPI endpoints, session types, networking), see references/cloud-run.md. For ADK docs on Cloud Run deployment, fetch https://adk.dev/deploy/cloud-run/index.md.

Agent Engine Specifics

Agent Engine is a managed Vertex AI service for deploying Python ADK agents. Uses source-based deployment (no Dockerfile) via deploy.py and the AdkApp class.

**No gcloud CLI exists for Agent Engine.** Deploy via deploy.py or adk deploy agent_engine. Query via the Python vertexai.Client SDK.

Deployments can take 5-10 minutes. If make deploy times out, check if the engine was created and manually populate deployment_metadata.json with the engine resource ID (see reference for details).

For detailed infrastructure configuration (deploy.py flags, AdkApp pattern, Terraform resource, deployment metadata, session/artifact services, CI/CD differences), see references/agent-engine.md. For ADK docs on Agent Engine deployment, fetch https://adk.dev/deploy/agent-engine/index.md.

GKE Specifics

For detailed infrastructure configuration (Terraform-managed Kubernetes resources, Workload Identity, session types, networking), see references/gke.md. For ADK docs on GKE deployment, fetch https://adk.dev/deploy/gke/index.md.

Service Account Architecture

Scaffolded projects use two service accounts:

**app_sa** (per environment) — Runtime identity for the deployed agent. Roles defined in deployment/terraform/iam.tf.

**cicd_runner_sa (CI/CD project) — CI/CD pipeline identity (GitHub Actions / Cloud Build). Lives in the CI/CD project (defaults to prod project), needs permissions in both** staging and prod projects.

Check deployment/terraform/iam.tf for exact role bindings. Cross-project permissions (Cloud Run service agents, artifact registry access) are also configured there.

Common 403 errors:

"Permission denied on Cloud Run" → cicd_runner_sa missing deployment role in the target project

"Cannot act as service account" → Missing iam.serviceAccountUser binding on app_sa

"Secret access denied" → app_sa missing secretmanager.secretAccessor

"Artifact Registry read denied" → Cloud Run service agent missing read access in CI/CD project

Secret Manager (for API Credentials)

Instead of passing sensitive keys as environment variables, use GCP Secret Manager.

# Create a secret

echo -n "YOUR_API_KEY" | gcloud secrets create MY_SECRET_NAME --data-file=-

# Update an existing secret

echo -n "NEW_API_KEY" | gcloud secrets versions add MY_SECRET_NAME --data-file=-

Grant access: For Cloud Run, grant secretmanager.secretAccessor to app_sa. For Agent Engine, grant it to the platform-managed SA (service-PROJECT_NUMBER@gcp-sa-aiplatform-re.iam.gserviceaccount.com). For GKE, grant secretmanager.secretAccessor to app_sa. Access secrets via Kubernetes Secrets or directly via the Secret Manager API with Workload Identity.

Pass secrets at deploy time (Agent Engine):

make deploy SECRETS="API_KEY=my-api-key,DB_PASS=db-password:2"

Format: ENV_VAR=SECRET_ID or ENV_VAR=SECRET_ID:VERSION (defaults to latest). Access in code via os.environ.get("API_KEY").

Observability

See the adk-observability-guide skill for observability configuration (Cloud Trace, prompt-response logging, BigQuery Analytics, third-party integrations).

Testing Your Deployed Agent

Agent Engine Deployment

Option 1: Testing Notebook

jupyter notebook notebooks/adk_app_testing.ipynb

Option 2: Python Script

import json

import vertexai

with open("deployment_metadata.json") as f:

    engine_id = json.load(f)["remote_agent_engine_id"]

client = vertexai.Client(location="us-central1")

agent = client.agent_engines.get(name=engine_id)

async for event in agent.async_stream_query(message="Hello!", user_id="test"):

    print(event)

Option 3: Playground

make playground

Cloud Run Deployment

Auth required by default. Cloud Run deploys with --no-allow-unauthenticated, so all requests need an Authorization: Bearer header with an identity token. Getting a 403? You're likely missing this header. To allow public access, redeploy with --allow-unauthenticated.

SERVICE_URL="https://SERVICE_NAME-PROJECT_NUMBER.REGION.run.app"

AUTH="Authorization: Bearer $(gcloud auth print-identity-token)"

# Test health endpoint

curl -H "$AUTH" "$SERVICE_URL/"

# Step 1: Create a session (required before sending messages)

curl -X POST "$SERVICE_URL/apps/app/users/test-user/sessions" \

  -H "Content-Type: application/json" \

  -H "$AUTH" \

  -d '{}'

# → returns JSON with "id" — use this as SESSION_ID below

# Step 2: Send a message via SSE streaming

curl -X POST "$SERVICE_URL/run_sse" \

  -H "Content-Type: application/json" \

  -H "$AUTH" \

  -d '{

    "app_name": "app",

    "user_id": "test-user",

    "session_id": "SESSION_ID",

    "new_message": {"role": "user", "parts": [{"text": "Hello!"}]}

  }'

Common mistake: Using {"message": "Hello!", "user_id": "...", "session_id": "..."} returns 422 Field required. The ADK HTTP server expects the new_message / parts schema shown above, and the session must already exist.

GKE Deployment

GKE LoadBalancer services are public by default — no auth header needed (unlike Cloud Run). See references/gke.md for curl examples and endpoint details.

Load Tests

make load-test

See tests/load_test/README.md for configuration, default settings, and CI/CD integration details.

Deploying with a UI (IAP)

To expose your agent with a web UI protected by Google identity authentication:

# Deploy with IAP (built-in framework UI)

make deploy IAP=true

# Deploy with custom frontend on a different port

make deploy IAP=true PORT=5173

IAP (Identity-Aware Proxy) secures the Cloud Run service — only authorized Google accounts can access it. After deploying, grant user access via the Cloud Console IAP settings.

For Agent Engine with a custom frontend, use a decoupled deployment — deploy the frontend separately to Cloud Run or Cloud Storage, connecting to the Agent Engine backend API.

Rollback & Recovery

The primary rollback mechanism is git-based: fix the issue, commit, and push to main. The CI/CD pipeline will automatically build and deploy the new version through staging → production.

For immediate Cloud Run rollback without a new commit, use revision traffic shifting:

gcloud run revisions list --service=SERVICE_NAME --region=REGION

gcloud run services update-traffic SERVICE_NAME \

  --to-revisions=REVISION_NAME=100 --region=REGION

Agent Engine doesn't support revision-based rollback — fix and redeploy via make deploy.

For GKE rollback, use kubectl rollout undo:

kubectl rollout undo deployment/DEPLOYMENT_NAME -n NAMESPACE

kubectl rollout status deployment/DEPLOYMENT_NAME -n NAMESPACE

Custom Infrastructure (Terraform)

For custom infrastructure patterns (Pub/Sub, BigQuery, Eventarc, Cloud SQL, IAM), consult references/terraform-patterns.md for:

Where to put custom Terraform files (dev vs CI/CD)

Resource examples (Pub/Sub, BigQuery, Eventarc triggers)

IAM bindings for custom resources

Terraform state management (remote vs local, importing resources)

Common infrastructure patterns

Troubleshooting

Issue

Solution

Terraform state locked

terraform force-unlock -force LOCK_ID in deployment/terraform/

GitHub Actions auth failed

Re-run terraform apply in CI/CD terraform dir; verify WIF pool/provider

Cloud Build authorization pending

Use github_actions runner instead

Resource already exists

terraform import (see references/terraform-patterns.md)

Agent Engine deploy timeout / hangs

Deployments take 5-10 min; check if engine was created (see Agent Engine Specifics)

Secret not available

Verify secretAccessor granted to app_sa (not the default compute SA)

403 on deploy

Check deployment/terraform/iam.tf — cicd_runner_sa needs deployment + SA impersonation roles in the target project

403 when testing Cloud Run

Default is --no-allow-unauthenticated; include Authorization: Bearer $(gcloud auth print-identity-token) header

Cold starts too slow

Set min_instance_count > 0 in Cloud Run Terraform config

Cloud Run 503 errors

Check resource limits (memory/CPU), increase max_instance_count, or check container crash logs

403 right after granting IAM role

IAM propagation is not instant — wait a couple of minutes before retrying. Don't keep re-granting the same role

Resource seems missing but Terraform created it

Run terraform state list to check what Terraform actually manages. Resources created via null_resource + local-exec (e.g., BQ linked datasets) won't appear in gcloud CLI output

adk-deploy-guide