gridsync

A self-hosted pipeline that downloads ERCOT's public Electric Service Identifier (ESIID) data, loads it into Postgres, and serves a fast fuzzy search API. Raw files are stored temporarily for processing and not retained long-term.

Built with Rust. No AWS, GCP, or Azure required.


What it does

ERCOT publishes a daily feed of every electric service point in Texas (~3–5 million records) across all Transmission and Distribution Service Providers (TDSPs). This project:

  1. Downloads new files from the ERCOT MIS API on a schedule to a temporary local directory
  2. Queues each downloaded file in a Postgres-backed work queue (esiid_documents)
  3. Processes queued files — parses ZIP/CSV and upserts records into Postgres; files can be deleted after processing
  4. Exposes a REST API for fuzzy address and ESIID search
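The queue in step 2 can be thought of as a small state machine. The sketch below is illustrative only and assumes hypothetical status names (`Pending`, `Processing`, `Done`, `Failed`), not the actual `esiid_documents` schema:

```rust
// Hypothetical lifecycle of a row in the esiid_documents work queue.
// Real column and status names may differ; this only illustrates the
// download -> process -> done flow described above.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum DocStatus {
    Pending,    // downloaded, waiting in the queue
    Processing, // a worker has claimed the file
    Done,       // parsed and upserted into Postgres
    Failed,     // parse/upsert error; eligible for retry
}

fn next(status: DocStatus, ok: bool) -> DocStatus {
    match (status, ok) {
        (DocStatus::Pending, _) => DocStatus::Processing,
        (DocStatus::Processing, true) => DocStatus::Done,
        (DocStatus::Processing, false) => DocStatus::Failed,
        (s, _) => s, // Done and Failed are terminal here
    }
}

fn main() {
    let s = next(DocStatus::Pending, true);
    let s = next(s, true);
    println!("{:?}", s); // Done
}
```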

Stack

| Layer | Choice | Why |
|-------|--------|-----|
| Language | Rust | Performance, safety, single binaries |
| Database | Supabase / Postgres | Portable, open-source, fuzzy search via pg_trgm |
| File storage | Local filesystem (temporary) | Processing buffer only; no long-term archive |
| Search | pg_trgm + Postgres FTS | Built into Postgres, no extra service needed |
| API | Axum | Fast, async, minimal |
| Scheduler | System cron or pg_cron | No external dependency |

Prerequisites

  • mise — manages Rust and dev tools
  • A Postgres database (Supabase free tier works)
  • Disk space for temporary ERCOT ZIP files during processing (~500 MB per run; files can be removed after processing)

Getting Started

1. Clone and configure

git clone <repo-url>
cd gridsync
cp .env.example .env

Edit .env with your credentials:

DATABASE_URL=postgresql://postgres:password@db.your-project.supabase.co:5432/postgres

# Directory where downloaded ERCOT ZIP files are stored
DATA_DIR=./data

2. Install tools and dependencies

mise install              # installs Rust stable
mise run install          # installs sqlx-cli and cargo-watch

3. Run migrations

mise run db:migrate

4. Build

mise run build

5. Download ERCOT data

# Download all available files (first run may take a while)
mise run download

# Or limit to a specific TDSP
mise run download -- --utility ONCOR_ELEC

# Preview what would be downloaded without fetching
mise run download -- --dry-run

Files are saved to $DATA_DIR/ercot/esiid/YYYY/MM/DD/ for processing. They are not retained long-term and can be deleted once the queue is drained.
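The date-partitioned layout can be reproduced with a small helper. This is a sketch, not project code; `data_dir` stands in for whatever `$DATA_DIR` is set to:

```rust
use std::path::PathBuf;

// Build the $DATA_DIR/ercot/esiid/YYYY/MM/DD/ directory for a publish
// date. Month and day are zero-padded so paths sort lexicographically
// in date order.
fn esiid_dir(data_dir: &str, year: u32, month: u32, day: u32) -> PathBuf {
    PathBuf::from(data_dir)
        .join("ercot")
        .join("esiid")
        .join(format!("{year:04}"))
        .join(format!("{month:02}"))
        .join(format!("{day:02}"))
}

fn main() {
    let p = esiid_dir("./data", 2026, 4, 7);
    println!("{}", p.display()); // ./data/ercot/esiid/2026/04/07
}
```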

6. Process the download queue

# Process all downloaded files
mise run process

# Limit to a specific number of files (useful for testing)
mise run process -- --limit 5

7. (Optional) Run the ETL manually for a specific file

mise run etl -- \
  --path ./data/ercot/esiid/2026/04/07/ext.00000203...ONCOR_ELEC_FUL.zip \
  --utility ONCOR_ELEC \
  --file-type FUL \
  --publish-date 2026-04-07

8. Start the API

mise run api
# or with live reload during development
mise run watch

API

GET /search

Fuzzy search across all ESIID records.

Query parameters:

| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| q | string | | Free-text query: fuzzy address, city, zip, or ESIID |
| zip | string | | Filter by exact zip code |
| utility | string | | Filter by TDSP name (e.g. ONCOR_ELEC) |
| premise_type | string | | Filter by premise type (RES, COM, etc.) |
| load_zone | string | | Filter by settlement load zone |
| active_only | bool | true | Only return active service points |
| limit | int | 25 | Results per page (max 100) |
| offset | int | 0 | Pagination offset |
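These parameters combine into an ordinary query string. A minimal client-side sketch (std-only; it percent-encodes only spaces for brevity, where a real client should encode all reserved characters):

```rust
// Assemble a /search URL from a free-text query and optional filters.
// Only spaces are encoded here; use a proper URL library for anything
// beyond a quick test.
fn search_url(base: &str, q: &str, zip: Option<&str>, limit: u32) -> String {
    let mut url = format!("{base}/search?q={}", q.replace(' ', "+"));
    if let Some(z) = zip {
        url.push_str(&format!("&zip={z}"));
    }
    url.push_str(&format!("&limit={limit}"));
    url
}

fn main() {
    let u = search_url("http://localhost:3000", "123 main st", Some("78701"), 10);
    println!("{u}"); // http://localhost:3000/search?q=123+main+st&zip=78701&limit=10
}
```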

Example:

curl "http://localhost:3000/search?q=123+main+st&zip=78701&limit=10"

Response:

{
  "total": 42,
  "limit": 10,
  "offset": 0,
  "results": [
    {
      "id": 1,
      "esiid": "10443720000123456",
      "address": "123 MAIN ST",
      "city": "AUSTIN",
      "state": "TX",
      "zip_code": "78701",
      "utility": "ONCOR_ELEC",
      "premise_type": "RES",
      ...
    }
  ]
}

When q is provided, results are ranked by full-text relevance then trigram similarity. Without q, results are returned ordered by ESIID.
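pg_trgm's similarity score is, roughly, the fraction of distinct three-character substrings two strings share. The sketch below is a simplified approximation of that idea, not the exact pg_trgm algorithm (pg_trgm lowercases and space-pads each word individually; this pads the whole string):

```rust
use std::collections::HashSet;

// Rough approximation of pg_trgm similarity: Jaccard overlap of the
// two strings' distinct trigram sets. Scores will differ slightly from
// Postgres because the padding rules here are simplified.
fn trigrams(s: &str) -> HashSet<String> {
    let padded = format!("  {} ", s.to_lowercase());
    let chars: Vec<char> = padded.chars().collect();
    chars.windows(3).map(|w| w.iter().collect()).collect()
}

fn similarity(a: &str, b: &str) -> f64 {
    let (ta, tb) = (trigrams(a), trigrams(b));
    let shared = ta.intersection(&tb).count() as f64;
    let union = ta.union(&tb).count() as f64;
    if union == 0.0 { 0.0 } else { shared / union }
}

fn main() {
    // A misspelled street name still scores far above an unrelated one,
    // which is what makes the fuzzy address search forgiving of typos.
    let close = similarity("123 MAIN ST", "123 MIAN ST");
    let far = similarity("123 MAIN ST", "900 OAK BLVD");
    println!("{}", close > far); // true
}
```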

GET /health

Returns 200 OK. Use it for uptime checks.


Data Source

ERCOT publishes Report 203 (TDSP ESIID Extract) — a public feed requiring no authentication.

File types:

  • _FUL — Full snapshot of all ESIIDs for a TDSP. Published on the first Tuesday following the first Monday of each month.
  • _DAILY — Incremental delta. Published every day.
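The _FUL cadence ("first Tuesday following the first Monday") can be computed directly. This is an illustration, not project code; it uses Zeller's congruence for the weekday so no date crate is needed:

```rust
// Weekday of a Gregorian date via Zeller's congruence, shifted so
// 0 = Monday .. 6 = Sunday.
fn weekday(year: i32, month: u32, day: u32) -> u32 {
    let (m, y) = if month < 3 { (month + 12, year - 1) } else { (month, year) };
    let (k, j) = (y % 100, y / 100);
    // Zeller gives 0 = Saturday; shift to 0 = Monday.
    let h = (day as i32 + (13 * (m as i32 + 1)) / 5 + k + k / 4 + j / 4 + 5 * j) % 7;
    ((h + 5) % 7) as u32
}

// Day-of-month of the monthly FUL snapshot: the Tuesday right after
// the first Monday, i.e. first Monday + 1.
fn ful_day(year: i32, month: u32) -> u32 {
    let first_monday = 1 + (7 - weekday(year, month, 1)) % 7;
    first_monday + 1
}

fn main() {
    // April 2026: the 1st is a Wednesday, so the first Monday is the
    // 6th and the FUL snapshot lands on Tuesday the 7th.
    println!("{}", ful_day(2026, 4)); // 7
}
```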

TDSPs covered: CENTERPOINT, ONCOR_ELEC, LUBBOCK, NUECES_ELEC, TNMP, AEP_CENTRAL, AEP_NORTH, ENTERGY_GULF, SHARYLAND, AEP_TEXAS_SP, SWEPCO_ENERG

Files are downloaded temporarily to $DATA_DIR/ercot/esiid/{YYYY}/{MM}/{DD}/ for processing. They are not archived and can be removed once processing is complete.


Scheduling

To keep data current, schedule the downloader and processor to run daily via system cron:

# Download new files at 6am UTC
0 6 * * * cd /path/to/gridsync && mise run download >> /var/log/gridsync-download.log 2>&1

# Process the queue at 6:30am UTC
30 6 * * * cd /path/to/gridsync && mise run process >> /var/log/gridsync-process.log 2>&1

Or use pg_cron if running inside Supabase/Postgres.

The downloader skips files that already exist locally (via the esiid_documents table), so reruns are safe. FUL files are picked up automatically when they appear in the ERCOT feed.
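The skip logic amounts to set membership against already-recorded filenames. A minimal in-memory sketch (in the real pipeline `seen` would come from a query against `esiid_documents`, not memory):

```rust
use std::collections::HashSet;

// Return only the remote filenames not yet recorded, so a rerun after
// a partial failure downloads nothing twice.
fn to_download<'a>(remote: &[&'a str], seen: &HashSet<&str>) -> Vec<&'a str> {
    remote.iter().copied().filter(|f| !seen.contains(*f)).collect()
}

fn main() {
    let seen: HashSet<&str> = ["a.zip", "b.zip"].into_iter().collect();
    let remote = ["a.zip", "b.zip", "c.zip"];
    println!("{:?}", to_download(&remote, &seen)); // ["c.zip"]
}
```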


Development

mise run check      # cargo check + clippy
mise run fmt        # cargo fmt
mise run test       # cargo test
mise run watch      # live-reload API server
mise run clean      # cargo clean (remove build artifacts)

After adding or modifying sqlx::query! calls, regenerate the offline query cache so CI can build without a live database:

mise run db:prepare   # writes .sqlx/ — commit this directory

Migrations

mise run db:migrate   # apply pending
mise run db:status    # show applied/pending
mise run db:revert    # revert last migration

Project Layout

crates/
├── common/       # Shared config, DB pool, local storage helpers, models
├── downloader/   # ERCOT API → local filesystem pipeline
├── etl/          # Local filesystem → Postgres loader (manual + queue processor)
└── api/          # Axum search API
db/
└── migrations/   # Versioned SQL migrations
.mise/tasks/      # All runnable project tasks

License

GPL-3.0 — see LICENSE for details.
