Add SKILL.md files for connection debugging, core run, and output analysis

luisggoncalves · luisggoncalves · commit ef2948a74c74 · 2026-04-01T17:09:10.000-03:00
diff --git a/skills/sqlcompare-connection-debug/SKILL.md b/skills/sqlcompare-connection-debug/SKILL.md
@@ -0,0 +1,70 @@
+---
+name: sqlcompare-connection-debug
+description: Debug SQLCompare connection and credential problems, especially connector resolution, environment variables, YAML configs, and permission issues. Use when a run fails to connect or create comparison tables.
+---
+
+# SQLCompare Connection Debug
+
+## Overview
+Diagnose connection failures by validating connector resolution, credentials, and required privileges. Focus on issues around `-c` selectors, environment variables, and YAML config files.
+
+## Quick Triage
+1. Capture the exact error text from the failing command.
+2. Verify the command and connector name used (for example, `-c snowflake_prod`).
+3. Run a simple connectivity check with `sqlcompare query` if possible.
+
+## Connector Resolution Order
+SQLCompare resolves connections in this order.
+1. Default connector (if `-c` is omitted).
+2. Direct URL passed in the command.
+3. Environment variables: `SQLCOMPARE_CONN_DEFAULT`, `SQLCOMPARE_CONN_<NAME>`.
+4. YAML file: `~/.sqlcompare/connections.yml`.
+
+## Environment Variable Checklist
+1. Confirm `SQLCOMPARE_CONN_DEFAULT` when no `-c` is supplied.
+2. Confirm `SQLCOMPARE_CONN_<NAME>` matches the `-c <name>` value.
+3. Ensure values are valid SQLAlchemy URLs and include driver, credentials, host, and database.
+4. Check for common format issues: missing scheme, extra quotes, unescaped special characters in passwords, or a missing database name.
+
+For URL parsing and credential encoding warnings, run:
+`python3 skills/sqlcompare-connection-debug/scripts/parse_sqlalchemy_url.py "<sqlalchemy_url>"`
+
+### SQLAlchemy URL Formats (Examples)
+1. Postgres: `postgresql://user:pass@host:5432/dbname`
+2. DuckDB file: `duckdb:////absolute/path/to/db.duckdb`
+3. DuckDB in-memory: `duckdb:///:memory:`
+4. Snowflake (password): `snowflake://user:pass@account/db/schema?warehouse=WH&role=ROLE`
+5. Snowflake (private key): `snowflake://user@account/db/schema?warehouse=WH&role=ROLE&private_key_file=/absolute/path/key.p8&private_key_file_pwd=YOUR_PASSPHRASE`
+
+If a password includes `@`, `:`, or `/`, URL-encode it before using it in the connection string.
+
+## YAML Connection File Checklist
+1. Check `~/.sqlcompare/connections.yml` exists and is readable.
+2. Ensure the top-level key matches the `-c` name.
+3. Validate required fields per connector (drivername, username, password, host, database, schema).
+4. Prefer the YAML file over a connection URL when URL-encoding credentials becomes error-prone.
+
+## Permission And Schema Issues
+1. SQLCompare creates a physical join table in the comparison schema.
+2. Ensure the connector user has `CREATE SCHEMA` and `CREATE TABLE` privileges.
+3. Verify `SQLCOMPARE_COMPARISON_SCHEMA` if a non-default schema is required.
+
+## Debug Logging
+1. Enable verbose logging by setting `SQLCOMPARE_DEBUG=1`.
+2. Re-run the failing command and capture the expanded logs.
+
+## Connection Format Troubleshooting
+1. If the error mentions an unknown dialect or driver, confirm the URL scheme (for example `postgresql://`, `snowflake://`, `duckdb://`).
+2. If the error mentions authentication, verify username/password and URL-encode special characters.
+3. If the error mentions database or schema not found, confirm the database and schema names in the URL or YAML config.
+
+## Connectivity Smoke Tests
+Use these to isolate authentication vs query issues.
+1. Simple query against the target connector.
+`sqlcompare query "SELECT 1" -c <name>`
+2. List available diffs to confirm local metadata access.
+`sqlcompare list-diffs`
+
+## Common Fixes
+1. Use `-c <name>` explicitly to avoid accidental default selection.
+2. Correct the connection name casing to match env vars and YAML keys.
+3. Quote or fully qualify table names if your database requires it.
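Note on the URL-encoding advice in `skills/sqlcompare-connection-debug/SKILL.md` above: the following is a minimal sketch, using only the Python standard library, of how a password containing `@`, `:`, or `/` could be percent-encoded before it goes into a connection string. The helper name `build_url` and the user, host, and database values are hypothetical and exist only for illustration.

```python
from urllib.parse import quote


def build_url(scheme, user, password, host, port, database):
    """Assemble a SQLAlchemy-style URL with percent-encoded credentials.

    Hypothetical helper for illustration; it is not part of SQLCompare.
    """
    # safe="" forces "/", "@" and ":" to be encoded as well.
    user_enc = quote(user, safe="")
    pass_enc = quote(password, safe="")
    return f"{scheme}://{user_enc}:{pass_enc}@{host}:{port}/{database}"


url = build_url("postgresql", "report_user", "p@ss:w/rd", "db.example.com", 5432, "analytics")
print(url)
```

The resulting string is the kind of value that would be exported as `SQLCOMPARE_CONN_<NAME>`.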
diff --git a/skills/sqlcompare-connection-debug/scripts/parse_sqlalchemy_url.py b/skills/sqlcompare-connection-debug/scripts/parse_sqlalchemy_url.py
@@ -0,0 +1,113 @@
+#!/usr/bin/env python3
+"""Parse a SQLAlchemy URL and highlight encoding issues in credentials."""
+
+from __future__ import annotations
+
+import argparse
+import sys
+from urllib.parse import parse_qs, unquote, urlparse
+
+try:
+    from sqlalchemy.engine import make_url
+except Exception:  # pragma: no cover - optional dependency
+    make_url = None
+
+
+RESERVED = set("@:/?#&=%+ ")
+
+
+def _raw_userinfo(netloc: str) -> str | None:
+    if "@" not in netloc:
+        return None
+    # Split on the last "@" so a raw "@" inside the password stays in the userinfo.
+    return netloc.rsplit("@", 1)[0]
+
+
+def _split_user_pass(userinfo: str) -> tuple[str | None, str | None]:
+    if ":" in userinfo:
+        user, pwd = userinfo.split(":", 1)
+        return user, pwd
+    return userinfo, None
+
+
+def _parse_with_sqlalchemy(url: str) -> tuple[object, dict[str, list[str]]]:
+    parsed = make_url(url)
+    query = parsed.query or {}
+    # SQLAlchemy 1.4+ returns repeated query keys as tuples; normalize every value to a list.
+    return parsed, {
+        k: [str(x) for x in v] if isinstance(v, (list, tuple)) else [str(v)]
+        for k, v in query.items()
+    }
+
+
+def _parse_with_stdlib(url: str) -> tuple[object, dict[str, list[str]]]:
+    parsed = urlparse(url)
+    return parsed, parse_qs(parsed.query)
+
+
+def parse_sqlalchemy_url(url: str) -> int:
+    if make_url is not None:
+        try:
+            parsed, query = _parse_with_sqlalchemy(url)
+        except Exception as exc:
+            print(f"ERROR: SQLAlchemy failed to parse URL: {exc}")
+            return 1
+    else:
+        parsed, query = _parse_with_stdlib(url)
+
+    scheme = getattr(parsed, "drivername", None) or getattr(parsed, "scheme", "")
+    if not scheme:
+        print("ERROR: Missing scheme (for example, postgresql://, snowflake://, duckdb://)")
+        return 1
+
+    # Take the raw netloc from the stdlib parser in both modes: SQLAlchemy's
+    # `host` attribute excludes the userinfo, so the raw credentials (and the
+    # reserved-character warnings below) would otherwise never be inspected.
+    netloc = urlparse(url).netloc
+
+    raw_userinfo = _raw_userinfo(netloc)
+    raw_user, raw_pass = (None, None)
+    if raw_userinfo:
+        raw_user, raw_pass = _split_user_pass(raw_userinfo)
+
+    if make_url is not None and hasattr(parsed, "username"):
+        decoded_user = parsed.username
+        decoded_pass = parsed.password
+    else:
+        decoded_user = unquote(raw_user) if raw_user is not None else None
+        decoded_pass = unquote(raw_pass) if raw_pass is not None else None
+
+    print("Parsed URL")
+    print(f"scheme: {scheme}")
+    print(f"username: {decoded_user}")
+    print(f"password: {decoded_pass}")
+    host = getattr(parsed, "host", None) or getattr(parsed, "hostname", None)
+    port = getattr(parsed, "port", None)
+    database = getattr(parsed, "database", None)
+    path = getattr(parsed, "path", None)
+    print(f"host: {host}")
+    print(f"port: {port}")
+    print(f"database/path: {database or path or None}")
+
+    if query:
+        print("query parameters:")
+        for key in sorted(query):
+            print(f"  {key}: {query[key]}")
+    else:
+        print("query parameters: none")
+
+    warnings = []
+    if raw_userinfo:
+        if raw_user and any(ch in RESERVED for ch in raw_user):
+            warnings.append("Username contains reserved characters; URL-encode it.")
+        if raw_pass and any(ch in RESERVED for ch in raw_pass):
+            warnings.append("Password contains reserved characters; URL-encode it.")
+    if warnings:
+        print("warnings:")
+        for warn in warnings:
+            print(f"  - {warn}")
+    return 0
+
+
+def main() -> int:
+    parser = argparse.ArgumentParser(description="Parse a SQLAlchemy URL")
+    parser.add_argument("url", help="SQLAlchemy URL to parse")
+    args = parser.parse_args()
+    return parse_sqlalchemy_url(args.url)
+
+
+if __name__ == "__main__":
+    sys.exit(main())
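Note on `parse_sqlalchemy_url.py` above: the raw-credential check can be exercised on its own. The sketch below is a standalone, stdlib-only approximation of that check, not the script itself. The names `credential_report` and `RAW_RESERVED` are hypothetical, and unlike the script's `RESERVED` set, `%` is deliberately left out here so that already-encoded credentials do not trigger a warning.

```python
from urllib.parse import unquote, urlparse

# Characters that are ambiguous when left raw inside userinfo.
# Assumption: "%" is omitted so percent-encoded values pass the check.
RAW_RESERVED = set("@:/?#&=+ ")


def credential_report(url):
    """Return (decoded_user, decoded_password, warnings) for a URL string."""
    netloc = urlparse(url).netloc
    # rpartition splits on the last "@", keeping any raw "@" in the password.
    userinfo, sep, _hostport = netloc.rpartition("@")
    if not sep:
        return None, None, []
    user, colon, pwd = userinfo.partition(":")
    warnings = []
    if any(ch in RAW_RESERVED for ch in user):
        warnings.append("username needs URL-encoding")
    if colon and any(ch in RAW_RESERVED for ch in pwd):
        warnings.append("password needs URL-encoding")
    return unquote(user), (unquote(pwd) if colon else None), warnings


raw = credential_report("postgresql://user:p@ss@host:5432/db")
encoded = credential_report("postgresql://user:p%40ss@host:5432/db")
print(raw)
print(encoded)
```

Both calls decode to the same password, but only the first one reports a warning. A raw `/` in the password cannot be caught this way, because the stdlib parser ends the netloc at the first `/`.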
diff --git a/skills/sqlcompare-core-run/SKILL.md b/skills/sqlcompare-core-run/SKILL.md
@@ -0,0 +1,51 @@
+---
+name: sqlcompare-core-run
+description: Run SQLCompare core comparisons across tables, SQL queries, and files to produce a diff_id. Use when the user wants to compare two datasets, choose index keys, set connectors, or run dataset configs via sqlcompare run/table/dataset/stats/query.
+---
+
+# SQLCompare Core Run
+
+## Overview
+Run the primary SQLCompare commands that create comparison artifacts and return a diff_id for later analysis. Focus on selecting inputs, defining index keys, and choosing the right connector.
+
+## Quick Start
+1. Set a default connection or pass one explicitly.
+2. Run a comparison command and capture the diff_id from output.
+3. Hand the diff_id to output analysis for inspection and reporting.
+
+## Choose The Comparison Type
+Use these patterns to pick the right entry point.
+
+1. Compare two tables or views.
+`sqlcompare run analytics.fact_sales analytics.fact_sales_new id`
+2. Compare two SQL queries (inline or .sql files).
+`sqlcompare run "SELECT ..." "SELECT ..." id -c snowflake_prod`
+`sqlcompare run queries/previous.sql queries/current.sql id -c snowflake_prod`
+3. Compare local files (CSV/XLSX) with DuckDB.
+`sqlcompare run path/to/previous.csv path/to/current.xlsx id`
+4. Use a dataset config for repeatable runs.
+`sqlcompare dataset path/to/dataset.yaml`
+5. Run a fast statistical comparison when row-level drilldown (and therefore a diff_id) is not needed.
+`sqlcompare stats analytics.users analytics.users_new -c snowflake_prod`
+6. Run a quick sanity query against a connector.
+`sqlcompare query "SELECT COUNT(*) FROM analytics.users" -c snowflake_prod`
+
+## Inputs And Keys
+1. Provide index columns that exist in both datasets.
+2. Use comma-separated keys for composite indexes.
+`sqlcompare run analytics.users analytics.users_new user_id,tenant_id`
+3. Fully qualify or quote identifiers when required by your database.
+
+## Connections And Defaults
+1. Prefer `-c <name>` to select a named connection for a run.
+2. Set `SQLCOMPARE_CONN_DEFAULT` for the default connector.
+3. Set `SQLCOMPARE_CONN_<NAME>` for named connectors.
+4. Use `~/.sqlcompare/connections.yml` for YAML-based connection configs.
+
+## Expected Output
+1. A successful compare prints a diff_id.
+2. Use that diff_id with `sqlcompare-output-analysis` for inspection, exports, and queries.
+
+## Notes
+1. SQLCompare creates comparison tables in the `SQLCOMPARE_COMPARISON_SCHEMA` (default `sqlcompare`).
+2. Ensure the connector has `CREATE SCHEMA` and `CREATE TABLE` privileges.
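Note on `skills/sqlcompare-core-run/SKILL.md` above: the `sqlcompare run` pattern (two inputs, comma-separated index keys, optional `-c <name>`) can be assembled programmatically. This is a rough sketch that only builds the argument list and does not invoke the CLI. The helper name `build_run_command` is hypothetical, and the argument order simply mirrors the examples in the skill file.

```python
import shlex


def build_run_command(previous, current, keys, connector=None):
    """Build an argv list for `sqlcompare run` (hypothetical helper)."""
    if not keys:
        raise ValueError("at least one index column is required")
    # Composite indexes are passed as a single comma-separated argument.
    cmd = ["sqlcompare", "run", previous, current, ",".join(keys)]
    if connector:
        cmd += ["-c", connector]
    return cmd


cmd = build_run_command(
    "analytics.users",
    "analytics.users_new",
    ["user_id", "tenant_id"],
    connector="snowflake_prod",
)
# shlex.join quotes arguments where needed, e.g. inline SQL with spaces.
print(shlex.join(cmd))
```

Keeping the command as a list avoids shell-quoting mistakes when one of the inputs is an inline `SELECT` statement rather than a table name.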
diff --git a/skills/sqlcompare-output-analysis/SKILL.md b/skills/sqlcompare-output-analysis/SKILL.md
@@ -0,0 +1,48 @@
+---
+name: sqlcompare-output-analysis
+description: Analyze SQLCompare outputs after a core run. Use when the user has a diff_id and needs to inspect stats, missing rows, column-level diffs, exports, or AI-friendly diff-queries.
+---
+
+# SQLCompare Output Analysis
+
+## Overview
+Interpret and drill into a SQLCompare diff_id using inspect, list-diffs, and diff-queries. Produce summaries, targeted column reviews, and exportable reports.
+
+## Start From A diff_id
+1. If you already have a diff_id, proceed to inspection commands.
+2. If you do not, list recent diffs and pick the correct one.
+`sqlcompare list-diffs`
+`sqlcompare list-diffs users`
+
+## Core Inspection Commands
+1. Overall statistics summary.
+`sqlcompare inspect <diff_id> --stats`
+2. Inspect a specific column with row samples.
+`sqlcompare inspect <diff_id> --column revenue --limit 100`
+3. Show rows missing on either side.
+`sqlcompare inspect <diff_id> --missing-current`
+`sqlcompare inspect <diff_id> --missing-previous`
+4. List available columns.
+`sqlcompare inspect <diff_id> --list-columns`
+
+## Export Reports (XLSX)
+1. Summary report with capped row samples per column.
+`sqlcompare inspect <diff_id> --save summary`
+2. Full report without per-column row caps.
+`sqlcompare inspect <diff_id> --save complete --file-path ./reports/full_diff.xlsx`
+3. Single-column summary report.
+`sqlcompare inspect <diff_id> --column revenue --save summary --file-path ./reports/revenue_diff.xlsx`
+
+Notes:
+1. `--save summary|complete` is for the standard diff view.
+2. Do not combine `--save summary|complete` with `--stats`, `--missing-current`, `--missing-previous`, or `--list-columns`.
+
+## AI-Friendly Diff Queries
+Use diff-queries to get structured metadata and SQL templates.
+`sqlcompare diff-queries <diff_id>`
+
+## Interpretation Hints
+1. Start with `--stats` to see which columns change most.
+2. Use `--missing-current` and `--missing-previous` to find dropped or new rows.
+3. Use `--column` to inspect high-impact fields.
+4. Export a summary report for sharing, then refine analysis with targeted column checks.
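Note on `skills/sqlcompare-output-analysis/SKILL.md` above: the rule that `--save summary|complete` must not be combined with `--stats`, `--missing-current`, `--missing-previous`, or `--list-columns` can be enforced before a command is ever run. The sketch below only builds an argument list from the flags named in the skill file. The helper `build_inspect_command` and the placeholder `DIFF_ID` are hypothetical, and no other flag combinations are validated.

```python
import shlex

# View flags the skill file says must not be combined with --save.
EXCLUSIVE_VIEWS = {"--stats", "--missing-current", "--missing-previous", "--list-columns"}


def build_inspect_command(diff_id, view=None, column=None, limit=None,
                          save=None, file_path=None):
    """Build an argv list for `sqlcompare inspect` (hypothetical helper)."""
    if view is not None and view not in EXCLUSIVE_VIEWS:
        raise ValueError(f"unknown view flag: {view}")
    if save is not None and save not in ("summary", "complete"):
        raise ValueError("save must be 'summary' or 'complete'")
    if save and view:
        raise ValueError(f"--save cannot be combined with {view}")
    cmd = ["sqlcompare", "inspect", diff_id]
    if view:
        cmd.append(view)
    if column:
        cmd += ["--column", column]
    if limit is not None:
        cmd += ["--limit", str(limit)]
    if save:
        cmd += ["--save", save]
    if file_path:
        cmd += ["--file-path", file_path]
    return cmd


report_cmd = build_inspect_command(
    "DIFF_ID", column="revenue", save="summary",
    file_path="./reports/revenue_diff.xlsx",
)
print(shlex.join(report_cmd))
```

Calling the helper with both `view="--stats"` and `save="summary"` raises `ValueError` instead of producing a command the skill file says to avoid.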