This project is a sample implementation of an agentic RAG system built with the Agent Development Kit (ADK), using Vertex AI Vector Search 2.0 as the unified vector store.
Vector Search 2.0 is Google Cloud's fully managed, self-tuning vector database built on Google's ScaNN (Scalable Nearest Neighbors) algorithm.
- Unified Data Storage: Store both vector embeddings and user data together (no separate database needed)
- Auto-Embeddings: Automatically generate semantic embeddings using Vertex AI embedding models
- Built-in Full Text Search: Provides built-in full-text search without needing to generate sparse embeddings
- Hybrid Search: Combine semantic and keyword search with intelligent RRF ranking
- Zero Indexing to Billion-Scale: Start immediately with kNN, then scale to billions with ANN indexes
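The RRF ranking mentioned above refers to Reciprocal Rank Fusion, a standard technique that merges multiple ranked result lists by scoring each document as the sum of 1/(k + rank) across the lists it appears in. The service's internal fusion parameters are not exposed here, but the core idea can be sketched as:

```python
def rrf_merge(ranked_lists: list[list[str]], k: int = 60) -> list[str]:
    """Merge ranked result lists with Reciprocal Rank Fusion.

    Each document scores sum(1 / (k + rank)) over the lists it appears
    in, where rank is 1-based; higher total score ranks first.
    """
    scores: dict[str, float] = {}
    for results in ranked_lists:
        for rank, doc_id in enumerate(results, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

# Illustrative document IDs: one semantic ranking, one keyword ranking
semantic = ["doc_a", "doc_b", "doc_c"]
keyword = ["doc_b", "doc_d", "doc_a"]
print(rrf_merge([semantic, keyword]))  # ['doc_b', 'doc_a', 'doc_d', 'doc_c']
```

Documents that rank well in both lists (like `doc_b`) float to the top, which is why hybrid search with RRF tends to outperform either semantic or keyword search alone.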
```
rag-with-vectorsearch-2.0/
├── rag_with_vectorsearch_2_0/    # ADK Agent directory
│   ├── .env.example
│   ├── agent.py
│   ├── tools.py                  # Vector Search 2.0 Hybrid Search
│   └── requirements.txt          # Agent dependencies
├── data_ingestion/               # Collection creation & ingestion
│   ├── .env.example
│   ├── create_vector_search_collection.py
│   ├── ingest.py
│   └── requirements.txt          # Data ingestion dependencies
├── source_documents/             # Source documents for RAG
└── README.md
```
Unlike the traditional pattern that requires a separate document store (e.g., Firestore), Vector Search 2.0 stores both vectors and data together, simplifying the architecture.
- Create Collection: Define the data schema and vector schema with auto-embedding configuration
- Ingest Data: Store documents as Data Objects with `data` and empty `vectors` (auto-generated)
- Hybrid Search: Query using both Semantic Search and Text Search combined with RRF
- Retrieve Data: Get results directly from the search response (no secondary lookup needed)
```
+--------------+    (1) Query       +----------------------------+
|              | -----------------> |        Agentic RAG         |
| User/Client  | <----------------- |(Cloud Run, Agent Engine...)|
|              |  (4) Final Result  +----------------------------+
+--------------+                         |                ^
                             (2) Hybrid  |                | (3) Return results
                                Search   v                |     with data
                         +----------------------------------+
                         |   Vertex AI Vector Search 2.0    |
                         |   (Collection with Auto-Embed)   |
                         |   - Semantic Search (Dense)      |
                         |   - Text Search (Keyword)        |
                         |   - RRF Ranking                  |
                         +----------------------------------+
```
Before you begin, you need to have an active Google Cloud project.
First, authenticate with Google Cloud:
```bash
gcloud auth application-default login
```

Set up your project and enable the necessary APIs:

```bash
# Set your project ID and location
export GOOGLE_CLOUD_PROJECT=$(gcloud config get-value project)
export GOOGLE_CLOUD_LOCATION="us-central1"

# Enable the required APIs
gcloud services enable \
  vectorsearch.googleapis.com \
  aiplatform.googleapis.com \
  cloudresourcemanager.googleapis.com
```

To allow the deployed Agent Engine to access your Vector Search collection:
```bash
# Get your project number
export PROJECT_NUMBER=$(gcloud projects describe $GOOGLE_CLOUD_PROJECT --format="value(projectNumber)")

# Grant the Vertex AI User role
gcloud projects add-iam-policy-binding $GOOGLE_CLOUD_PROJECT \
  --member="serviceAccount:service-${PROJECT_NUMBER}@gcp-sa-aiplatform-re.iam.gserviceaccount.com" \
  --role="roles/aiplatform.user"

# Grant the Vertex AI Vector Search Viewer role
gcloud projects add-iam-policy-binding $GOOGLE_CLOUD_PROJECT \
  --member="serviceAccount:service-${PROJECT_NUMBER}@gcp-sa-aiplatform-re.iam.gserviceaccount.com" \
  --role="roles/vectorsearch.viewer"
```

This project uses `uv` to manage the Python virtual environment.
Create and activate the virtual environment:
```bash
# Navigate to the project directory
cd rag-with-vectorsearch-2.0

# Create the virtual environment
uv venv

# Activate the virtual environment (macOS/Linux)
source .venv/bin/activate

# Activate the virtual environment (Windows)
.venv\Scripts\activate
```

Install dependencies:

```bash
# Install agent dependencies
uv pip install -r rag_with_vectorsearch_2_0/requirements.txt

# Install data ingestion script dependencies
uv pip install -r data_ingestion/requirements.txt
```

Run the collection creation script. This creates a Collection with auto-embedding configuration and an ANN index for production-ready search.
```bash
cd data_ingestion
python create_vector_search_collection.py \
  --project_id=$GOOGLE_CLOUD_PROJECT \
  --location=$GOOGLE_CLOUD_LOCATION \
  --collection_name="rag-collection"
```

Available options:

- `--embedding_model`: Embedding model (default: `gemini-embedding-001`)
- `--embedding_dim`: Embedding dimensions (default: `768`)
- `--no-index`: Skip ANN index creation
- `--wait-for-index`: Wait for ANN index creation to complete
Note: Collection creation is immediate, but ANN index creation takes 5-30 minutes.
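For reference, flags like these are typically declared with `argparse`. The sketch below is a hypothetical illustration of that pattern (the actual script's flag handling may differ):

```python
import argparse

def build_parser() -> argparse.ArgumentParser:
    """Illustrative CLI definition matching the documented flags."""
    parser = argparse.ArgumentParser(
        description="Create a Vector Search 2.0 collection (sketch)."
    )
    parser.add_argument("--project_id", required=True)
    parser.add_argument("--location", default="us-central1")
    parser.add_argument("--collection_name", default="rag-collection")
    parser.add_argument("--embedding_model", default="gemini-embedding-001")
    parser.add_argument("--embedding_dim", type=int, default=768)
    # --no-index stores False into args.index, so index creation is skipped
    parser.add_argument("--no-index", dest="index", action="store_false")
    parser.add_argument("--wait-for-index", action="store_true")
    return parser

args = build_parser().parse_args(["--project_id", "my-project", "--no-index"])
print(args.embedding_dim, args.index)  # 768 False
```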
Run the data ingestion script to load documents from source_documents/:
```bash
cd data_ingestion
python ingest.py \
  --project_id=$GOOGLE_CLOUD_PROJECT \
  --location=$GOOGLE_CLOUD_LOCATION \
  --collection_name="rag-collection" \
  --source_dir="../source_documents"
```

This script:

- Loads `.md` and `.txt` files from the source directory
- Splits documents into chunks (1000 characters with 100-character overlap)
- Creates Data Objects with auto-generated embeddings
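The chunking step above can be sketched as a simple sliding window over the text. This is an illustration assuming fixed-size character chunks with overlap; `ingest.py` may implement it differently:

```python
def split_into_chunks(text: str, chunk_size: int = 1000, overlap: int = 100) -> list[str]:
    """Split text into fixed-size chunks with overlapping boundaries.

    Each chunk starts (chunk_size - overlap) characters after the previous
    one, so adjacent chunks share `overlap` characters of context.
    """
    if chunk_size <= overlap:
        raise ValueError("chunk_size must be larger than overlap")
    step = chunk_size - overlap
    return [text[i : i + chunk_size] for i in range(0, len(text), step)]

print([len(c) for c in split_into_chunks("x" * 2500)])  # [1000, 1000, 700]
```

The overlap keeps a sentence that straddles a chunk boundary fully present in at least one chunk, which helps retrieval quality at the cost of some storage duplication.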
Before running the agent, create a `.env` file:

```bash
cp rag_with_vectorsearch_2_0/.env.example rag_with_vectorsearch_2_0/.env
```

Edit the `.env` file with your configuration:

```
GOOGLE_CLOUD_PROJECT=your-project-id
GOOGLE_CLOUD_LOCATION=us-central1
VECTOR_SEARCH_COLLECTION_NAME=rag-collection
```
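A `.env` file is just `KEY=VALUE` lines. As a minimal stdlib-only illustration of how such a file is parsed (in practice the agent likely relies on `python-dotenv` or ADK's own loading):

```python
import tempfile

def load_env_file(path: str) -> dict[str, str]:
    """Parse KEY=VALUE lines from a .env file, skipping blanks and comments."""
    values: dict[str, str] = {}
    with open(path) as fh:
        for line in fh:
            line = line.strip()
            if not line or line.startswith("#") or "=" not in line:
                continue
            key, _, value = line.partition("=")
            values[key.strip()] = value.strip()
    return values

# Demo: write the configuration shown above to a temp file and read it back
with tempfile.NamedTemporaryFile("w", suffix=".env", delete=False) as fh:
    fh.write("# comment\nGOOGLE_CLOUD_PROJECT=your-project-id\n"
             "GOOGLE_CLOUD_LOCATION=us-central1\n")
    env_path = fh.name

config = load_env_file(env_path)
print(config["GOOGLE_CLOUD_LOCATION"])  # us-central1
```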
Using the Command-Line Interface (CLI):

```bash
adk run rag_with_vectorsearch_2_0
```

Using the Web Interface:

```bash
adk web
```
The RAG agent can be deployed to Cloud Run or Vertex AI Agent Engine using the `adk deploy` command.
```bash
export GOOGLE_CLOUD_PROJECT=$(gcloud config get-value project)
export GOOGLE_CLOUD_LOCATION="us-central1"

adk deploy cloud_run \
  --project=$GOOGLE_CLOUD_PROJECT \
  --region=$GOOGLE_CLOUD_LOCATION \
  rag_with_vectorsearch_2_0
```

Or deploy to Agent Engine:
```bash
adk deploy agent_engine \
  --project=$GOOGLE_CLOUD_PROJECT \
  --region=$GOOGLE_CLOUD_LOCATION \
  --staging_bucket="gs://your-staging-bucket" \
  rag_with_vectorsearch_2_0
```

When the deployment finishes, it will print a line like this:
```
Successfully created remote agent: projects/<PROJECT_NUMBER>/locations/<PROJECT_LOCATION>/reasoningEngines/<AGENT_ENGINE_ID>
```
Make a note of the `AGENT_ENGINE_ID`. You will need it to interact with your deployed agent.
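If you capture that output programmatically, the ID is simply the last path segment of the resource name. A small illustrative helper (not part of the repo; the example resource name below is hypothetical):

```python
def parse_agent_engine_id(resource_name: str) -> str:
    """Return the trailing ID from a reasoningEngines resource name."""
    if "reasoningEngines/" not in resource_name:
        raise ValueError(f"Not a reasoning engine resource name: {resource_name!r}")
    return resource_name.rsplit("/", 1)[-1]

# Hypothetical example values for illustration
name = "projects/123456/locations/us-central1/reasoningEngines/987654321"
print(parse_agent_engine_id(name))  # 987654321
```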
Once your agent is deployed, you can use the provided Jupyter notebook to verify its functionality.
- Navigate to the `notebooks/` directory.
- Open `test_vector_search_2_0_agent_on_agent_engine.ipynb`.
- Follow the instructions in the notebook to:
  - Configure your Project ID and Location.
  - Connect to the deployed agent using your unique `AGENT_ENGINE_ID`.
  - Test stateful sessions and real-time streaming queries.
This notebook provides a convenient way to interact with your agent in a stateful manner and serves as a starting point for building your own client applications.
- Vertex AI Vector Search 2.0 Overview
- 📓 Introduction to Vertex AI Vector Search 2.0
- 📓 Vertex AI Vector Search 2.0 Public Preview Quickstart
- Introducing Vertex AI Vector Search 2.0 from zero to billion-scale (2025-12-26)
- 10-minute agentic RAG with the new Vector Search 2.0 and ADK (2026-01-19)
- Improve gen AI search with Vertex AI embeddings and task types (2024-10-03): Explains how Vertex AI's "task type" embeddings provide a streamlined solution to significantly enhance the accuracy and effectiveness of RAG systems.
- 📓 Deploy your first ADK agent on Vertex AI Agent Engine
- Migration from Vertex AI Vector Search 1.0 to 2.0
- RAG with Vertex AI Vector Search 1.0 and Firestore
- RAG with Vertex AI Vector Search 1.0 and GCS
