This project demonstrates how to implement a LightRAG (Light Retrieval-Augmented Generation) agent using the Agent Development Kit (ADK) with Google Cloud Spanner as the storage backend.
It uses the LightRAG library with the `lightrag-spanner` storage plugin, and Gemini models for both the LLM and embeddings.
*Image Source: "LightRAG: Simple and Fast Retrieval-Augmented Generation"*
```
User Query
    |
    v
ADK Agent (Gemini 2.5 Flash)
    | tool call
    v
lightrag_tool(query)
    |
    v
LightRAG.aquery(only_need_context=True)
    |-- Keyword Extraction (LLM)
    |-- Graph Search (Spanner Property Graph)
    |-- Vector Search (Spanner Vector Search)
    +-- Context assembly and return
    |
    v
ADK Agent generates final answer based on context
```
- User sends a query to the ADK Agent.
- The Agent calls `lightrag_tool` with the query.
- LightRAG processes the query:
  - Extracts keywords (high-level and low-level) using the LLM.
  - Searches the Spanner Graph (entities, relationships).
  - Searches the Spanner Vector Store (semantic similarity).
  - Combines the results into structured context.
- The context is returned to the Agent (no LLM answer generation happens inside LightRAG).
- The Agent generates the final answer using the retrieved context.
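The key design point in the flow above is that retrieval and answer generation are separated: the tool returns context only, and the agent does the generating. A minimal pure-Python sketch of that contract, with stub functions standing in for LightRAG retrieval and Gemini generation (all names here are illustrative, not the project's actual code):

```python
def lightrag_tool(query: str, retrieve=None) -> str:
    """Return retrieved context only -- no answer generation here,
    mirroring LightRAG.aquery(..., only_need_context=True)."""
    retrieve = retrieve or (lambda q: f"[entities, relations, chunks for: {q}]")
    return retrieve(query)

def agent_answer(query: str, generate=None) -> str:
    """The agent calls the tool first, then generates from the context."""
    context = lightrag_tool(query)
    generate = generate or (lambda q, c: f"Answer to {q!r}, grounded in {c}")
    return generate(query, context)

print(agent_answer("Who founded Apple?"))
```

Keeping generation out of the tool lets the ADK agent decide how to use the retrieved context (answer, refuse, or call the tool again with refined keywords).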
```
lightrag-with-spanner/
├── lightrag_with_spanner/     # ADK Agent directory
│   ├── __init__.py
│   ├── agent.py               # ADK Agent definition (root_agent)
│   ├── prompt.py              # Agent system instructions
│   ├── tools.py               # lightrag_tool - context retrieval via LightRAG
│   └── .env.example           # Environment variables template
├── data_ingestion/            # Data ingestion directory
│   └── insert.py              # Script to ingest documents
├── requirements.txt           # Project dependencies
└── README.md
```
| File | Description |
|---|---|
| `lightrag_with_spanner/agent.py` | `root_agent` definition using Gemini 2.5 Flash and `lightrag_tool` |
| `lightrag_with_spanner/tools.py` | `lightrag_tool` function; extracts context from LightRAG |
| `lightrag_with_spanner/prompt.py` | System instruction guiding the Agent to answer based on tool-retrieved context |
| `data_ingestion/insert.py` | Script to ingest documents into the LightRAG Knowledge Graph |
This project uses Google Cloud Spanner for production-grade, scalable storage. Tables are created automatically by `lightrag-spanner` on first use via `initialize_storages()`.
| Component | Backend |
|---|---|
| KV Storage | SpannerKVStorage |
| Vector Storage | SpannerVectorStorage |
| Graph Storage | SpannerGraphStorage |
| Doc Status Storage | SpannerDocStatusStorage |
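In LightRAG, storage implementations are selected by name at construction time, so the table above maps directly onto constructor arguments. A sketch of that mapping (the commented construction call is illustrative; see the `lightrag-spanner` repository for the exact setup):

```python
# Storage slot -> Spanner implementation, per the table above.
SPANNER_STORAGES = {
    "kv_storage": "SpannerKVStorage",
    "vector_storage": "SpannerVectorStorage",
    "graph_storage": "SpannerGraphStorage",
    "doc_status_storage": "SpannerDocStatusStorage",
}

# These names would be passed when constructing LightRAG, roughly:
#   rag = LightRAG(working_dir="./rag_storage", **SPANNER_STORAGES)
#   await rag.initialize_storages()  # creates the Spanner tables on first use
print(sorted(SPANNER_STORAGES))
```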
Before you begin, ensure you have the following tools installed:
- uv (for Python package management)
- Google Cloud SDK (gcloud)
First, authenticate with Google Cloud:
```bash
gcloud auth application-default login
```

Next, set up your project and enable the necessary APIs:
```bash
export PROJECT_ID=$(gcloud config get-value project)

gcloud services enable \
  spanner.googleapis.com \
  aiplatform.googleapis.com
```

Create a Spanner instance and a database using the gcloud CLI.
```bash
# Set environment variables
export SPANNER_INSTANCE="lightrag-instance"
export SPANNER_DATABASE="lightrag-db"
export SPANNER_REGION="us-central1"

# Create the Spanner instance
gcloud spanner instances create $SPANNER_INSTANCE \
  --config=regional-$SPANNER_REGION \
  --description="LightRAG Instance" \
  --nodes=1 \
  --edition=ENTERPRISE

# Create the database
gcloud spanner databases create $SPANNER_DATABASE \
  --instance=$SPANNER_INSTANCE
```

To allow the deployed Agent Engine to connect to your Spanner instance, you must grant the necessary IAM roles to the Agent Engine's service account.
Run the following commands to grant both roles to the Agent Engine service account:
```bash
export PROJECT_NUMBER=$(gcloud projects describe $PROJECT_ID --format="value(projectNumber)")

# Grant permission to read database metadata
gcloud projects add-iam-policy-binding $PROJECT_ID \
  --member="serviceAccount:service-${PROJECT_NUMBER}@gcp-sa-aiplatform-re.iam.gserviceaccount.com" \
  --role="roles/spanner.databaseReaderWithDataBoost"

# Grant permission to get databases
gcloud projects add-iam-policy-binding $PROJECT_ID \
  --member="serviceAccount:service-${PROJECT_NUMBER}@gcp-sa-aiplatform-re.iam.gserviceaccount.com" \
  --role="roles/spanner.restoreAdmin"
```

The `roles/spanner.restoreAdmin` role is granted to the Agent Engine service account because it provides the `spanner.databases.get` permission the agent needs.
Copy the example file and edit it:
```bash
cp lightrag_with_spanner/.env.example lightrag_with_spanner/.env
```

Then set the following values in `lightrag_with_spanner/.env`:

```bash
export GOOGLE_CLOUD_PROJECT="your-project-id"
export GOOGLE_CLOUD_LOCATION="us-central1"
export GOOGLE_GENAI_USE_VERTEXAI="true"
export SPANNER_INSTANCE="lightrag-instance"
export SPANNER_DATABASE="lightrag-db"
```

This project uses uv to manage the Python virtual environment and package dependencies.
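As a sanity check, the required variables above can be loaded and validated with only the standard library. The helper below is illustrative and not part of the project; variable names come from `.env.example`:

```python
import os

REQUIRED_VARS = (
    "GOOGLE_CLOUD_PROJECT",
    "GOOGLE_CLOUD_LOCATION",
    "SPANNER_INSTANCE",
    "SPANNER_DATABASE",
)

def load_settings(env=None) -> dict:
    """Collect the required settings, failing early if any are missing."""
    env = os.environ if env is None else env
    missing = [name for name in REQUIRED_VARS if not env.get(name)]
    if missing:
        raise RuntimeError(f"Missing environment variables: {missing}")
    return {name: env[name] for name in REQUIRED_VARS}

# Example with an explicit mapping instead of the real environment:
settings = load_settings({
    "GOOGLE_CLOUD_PROJECT": "your-project-id",
    "GOOGLE_CLOUD_LOCATION": "us-central1",
    "SPANNER_INSTANCE": "lightrag-instance",
    "SPANNER_DATABASE": "lightrag-db",
})
print(settings["SPANNER_INSTANCE"])
```

Failing fast on missing configuration is cheaper than debugging a Spanner connection error later.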
Create and activate the virtual environment:
```bash
# Create the virtual environment
uv venv

# Activate the virtual environment
source .venv/bin/activate
```

Install dependencies:

```bash
uv pip install -r requirements.txt
```

First, load the environment variables from the `.env` file:

```bash
source lightrag_with_spanner/.env
```

Ingest documents into the LightRAG Knowledge Graph.
```bash
# Ingest sample documents (Apple, Steve Jobs, Google)
python data_ingestion/insert.py --sample

# Or ingest your own document
python data_ingestion/insert.py --file your_document.txt
```

You can run the agent using either the command-line interface or a web-based interface.

Run from the command line:

```bash
adk run lightrag_with_spanner
```

Or launch the web UI:

```bash
adk web
```

Screenshots:
*Figure 1. LightRAG with Spanner - ADK Web UI*

*Figure 2. LightRAG with Spanner - ADK Log*

*Figure 3. LightRAG with Spanner - Storages*
- LightRAG GitHub: Simple and Fast Retrieval-Augmented Generation that incorporates graph structures into text indexing and retrieval.
- lightrag-spanner GitHub: Google Cloud Spanner storage backend for LightRAG.
- Intro to GraphRAG - A dive into GraphRAG pattern details
- Google ADK Documentation
- Google Cloud Spanner Graph
- The unified graph solution with Spanner Graph and BigQuery Graph
- Vertex AI Gemini



