This script reads JSON files from the `runs` folder and inserts the structured data into your Supabase database, while avoiding duplicate entries.
- Python 3.7+ installed on your system
- Supabase project with the `environmental_report` table created
- Environment variables set in your `.env` file
Make sure your `.env` file contains:

```
VITE_SUPABASE_URL="https://gsktkvdfgmvavwxatlmn.supabase.co"
VITE_SUPABASE_ANON_KEY="eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJpc3MiOiJzdXBhYmFzZSIsInJlZiI6Imdza3RrdmRmZ212YXZ3eGF0bG1uIiwicm9sZSI6ImFub24iLCJpYXQiOjE3NDU0MDE0NjQsImV4cCI6MjA2MDk3NzQ2NH0.0oZcjPiSG0pF0k2gGcM6bNRujOg251A6FRxuycT_nhI"
SUPABASE_TABLE_NAME="environmental_report"
```
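Loading these variables can be sketched as follows. The real script likely uses a package such as `python-dotenv`; this stdlib-only version, with an illustrative `load_env` helper, just shows the idea:

```python
import os

def load_env(path=".env"):
    """Read KEY="value" lines from a .env file into os.environ.

    Illustrative stand-in for python-dotenv's load_dotenv(); real
    .env parsing handles more edge cases than this sketch does.
    """
    with open(path) as fh:
        for raw in fh:
            line = raw.strip()
            # Skip blanks, comments, and lines without an assignment
            if not line or line.startswith("#") or "=" not in line:
                continue
            key, _, value = line.partition("=")
            os.environ.setdefault(key.strip(), value.strip().strip('"'))

if os.path.exists(".env"):
    load_env()
    if not os.environ.get("VITE_SUPABASE_URL"):
        raise SystemExit("Missing required environment variables")
```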
The script expects a table with this structure:
```sql
CREATE TABLE IF NOT EXISTS environmental_report (
    id SERIAL PRIMARY KEY,
    name TEXT,
    method_of_access TEXT,
    sector TEXT,
    data_format TEXT,
    description TEXT,
    granularity TEXT,
    url TEXT,
    country TEXT,
    country_locode TEXT,
    human_eval INTEGER,
    Accepted INTEGER
);
```

Run the script with the `run_supabase_insert.bat` batch file, the `.\run_supabase_insert.ps1` PowerShell script, or manually:

```bash
# Install dependencies
pip install -r requirements_supabase.txt

# Run the script
python insert_data_to_supabase.py
```

The script performs the following steps:

- Loads Environment Variables: Reads Supabase credentials from the `.env` file
- Connects to Supabase: Creates a client connection to your database
- Checks for Duplicates: Fetches existing URL-country combinations to avoid duplicates
- Reads JSON Files: Scans the `runs` folder for all JSON files
- Extracts Structured Data: Pulls the `structured_data` arrays from each JSON file
- Prepares Records: Formats the data for database insertion (arrays are converted to comma-separated strings)
- Inserts Data: Adds new records to the database, skipping duplicates
- Logs Progress: Provides detailed logging of the insertion process
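The duplicate-checking and insertion steps above can be sketched as a pure function over `(url, country)` keys. The function name and record shape here are illustrative, not the script's actual API:

```python
def split_new_records(records, existing_keys):
    """Split records into (to_insert, skipped) using the (url, country) key.

    existing_keys is the set of URL-country pairs already in the table;
    pairs repeated within the batch itself are also skipped.
    """
    to_insert, skipped = [], []
    seen = set(existing_keys)
    for rec in records:
        key = (rec.get("url"), rec.get("country"))
        if key in seen:
            skipped.append(rec)  # duplicate: already in DB or earlier in batch
        else:
            seen.add(key)
            to_insert.append(rec)
    return to_insert, skipped
```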
- Duplicate Prevention: Checks URL + country combination to avoid inserting the same data twice
- Batch Processing: Handles all JSON files in the runs folder at once
- Error Handling: Gracefully handles file reading errors and database insertion failures
- Detailed Logging: Provides comprehensive logs of the process
- Data Formatting: Automatically converts array fields (`sector`, `data_format`, `subsector`) to comma-separated strings
- Default Values: Sets `human_eval` and `Accepted` to 0 for all new records
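The data-formatting and default-value features can be sketched like this. Field names follow the table schema above; the `prepare_record` helper name is illustrative:

```python
# Fields that may arrive as JSON arrays and must be flattened
ARRAY_FIELDS = ("sector", "data_format", "subsector")

def prepare_record(item):
    """Format one structured_data item for insertion.

    Arrays become comma-separated strings, and the review flags
    human_eval and Accepted default to 0.
    """
    record = dict(item)
    for field in ARRAY_FIELDS:
        value = record.get(field)
        if isinstance(value, list):
            record[field] = ", ".join(str(v) for v in value)
    record.setdefault("human_eval", 0)
    record.setdefault("Accepted", 0)
    return record
```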
The script will log:
- Number of JSON files processed
- Number of structured data items found
- Number of existing records in database
- Number of records inserted
- Number of duplicates skipped
- Any errors encountered
Example output:
```
2025-06-02 17:00:00,123 - INFO - Using table: environmental_report
2025-06-02 17:00:01,456 - INFO - Supabase client created successfully
2025-06-02 17:00:02,789 - INFO - Found 25 existing URL-country combinations
2025-06-02 17:00:03,012 - INFO - Found 18 structured data items in results_Canada_afolu_20250602_152740.json
2025-06-02 17:00:03,345 - INFO - Found 3 structured data items in results_Chile_ippu_20250602_162506.json
2025-06-02 17:00:04,678 - INFO - Total structured data items collected: 67
2025-06-02 17:00:05,901 - INFO - Prepared 67 records for insertion
2025-06-02 17:00:08,234 - INFO - Insertion complete: 42 records inserted, 25 records skipped (duplicates)
2025-06-02 17:00:08,567 - INFO - Process completed successfully!
2025-06-02 17:00:08,890 - INFO - Summary: 42 new records inserted, 25 duplicates skipped
```
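The timestamps and levels in the output above match Python's standard `logging` module with a format string like the one below; the script's exact configuration may differ:

```python
import logging

# Produces lines like: 2025-06-02 17:00:00,123 - INFO - Using table: ...
logging.basicConfig(
    level=logging.INFO,
    format="%(asctime)s - %(levelname)s - %(message)s",
)
logging.info("Using table: environmental_report")
```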
- "Missing required environment variables": Make sure your
.envfile is in the same directory and contains the correct variables - "Runs folder does not exist": Ensure the
runsfolder exists in the same directory as the script - Database connection errors: Verify your Supabase URL and API key are correct
- Permission errors: Make sure your Supabase API key has the necessary permissions to insert data
- The script will skip any JSON files that don't contain a `structured_data` field
- Array fields (`sector`, `data_format`, `subsector`) are automatically converted to comma-separated strings
- The script is idempotent: you can run it multiple times safely without creating duplicates
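The file-scanning behavior in the first note can be sketched as below; the `collect_structured_data` name is illustrative:

```python
import json
from pathlib import Path

def collect_structured_data(runs_dir="runs"):
    """Gather every structured_data item from the JSON files in runs_dir.

    Unreadable files and files without a structured_data field are
    skipped, mirroring the notes above.
    """
    items = []
    for path in sorted(Path(runs_dir).glob("*.json")):
        try:
            payload = json.loads(path.read_text())
        except (OSError, json.JSONDecodeError):
            continue  # unreadable or malformed file: skip and move on
        if isinstance(payload, dict):
            items.extend(payload.get("structured_data", []))
    return items
```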