# Spanner Benchmark POC

This repository contains a small Java 17 / Maven benchmark scaffold for evaluating Google Cloud Spanner read and query performance against the `customer_insights` and `customer_insights_phone` tables in [schema.sql](schema.sql).

Configuration is loaded from `.env`, authentication uses Application Default Credentials (ADC) from the VM, and `GOOGLE_SPANNER_ENABLE_DIRECT_ACCESS` is expected to be set explicitly.

NOTE: It's important to set `GOOGLE_SPANNER_ENABLE_DIRECT_ACCESS` to `true` to realize the latency benefits of DirectPath.

## Schema Overview

The benchmark currently targets two tables with the same payload columns but different primary-key shapes:

- `customer_insights`: keyed by `cust_id, acct_no, phone_number, insight_category, insight_name`
- `customer_insights_phone`: keyed by `phone_number, insight_category, insight_name`

Both tables currently contain these columns:

- `cust_id STRING(36) NOT NULL`
- `acct_no STRING(36) NOT NULL`
- `phone_number STRING(20) NOT NULL`
- `insight_category STRING(50) NOT NULL`
- `insight_name STRING(100) NOT NULL`
- `insight_values STRING(MAX)`
- `updated_by STRING(100)`
- `updated_at TIMESTAMP OPTIONS (allow_commit_timestamp=true)`

The benchmark is set up this way so the same logical data can be exercised through two different key orders:

- `customer_insights` supports customer/account-oriented access paths
- `customer_insights_phone` supports phone-oriented access paths

That makes it easier to compare how key order changes point-read behavior, prefix scans, and SQL filtering selectivity while keeping the row contents the same.

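Based on the key shapes and column list above, the two definitions look roughly like this (a sketch only; see [schema.sql](schema.sql) for the authoritative DDL):

```sql
CREATE TABLE customer_insights (
  cust_id          STRING(36) NOT NULL,
  acct_no          STRING(36) NOT NULL,
  phone_number     STRING(20) NOT NULL,
  insight_category STRING(50) NOT NULL,
  insight_name     STRING(100) NOT NULL,
  insight_values   STRING(MAX),
  updated_by       STRING(100),
  updated_at       TIMESTAMP OPTIONS (allow_commit_timestamp = true)
) PRIMARY KEY (cust_id, acct_no, phone_number, insight_category, insight_name);

-- Same payload columns, phone-first key order:
CREATE TABLE customer_insights_phone (
  -- identical column list to customer_insights
) PRIMARY KEY (phone_number, insight_category, insight_name);
```
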
Also see [schema.sql](schema.sql).

## Commands

Compile:

```bash
mvn compile
```

Build an executable jar:

```bash
mvn package
```

Recommended execution path:

```bash
java -jar target/spanner-benchmark-poc.jar benchmark
```

Populate synthetic seed data:

```bash
mvn exec:java -Dexec.args="populate --profile small --truncate-first"
```

Run the basic benchmark:

```bash
mvn exec:java -Dexec.args="benchmark --scenario core-read-paths --warmup 10 --iterations 100 --concurrency 1 --sample-size 200"
```

Run the benchmark with an explicit stale-read window:

```bash
mvn exec:java -Dexec.args="benchmark --scenario core-read-paths --warmup 10 --iterations 100 --concurrency 1 --sample-size 200 --staleness-seconds 15"
```

Preferred runtime path for clean CLI execution:

```bash
java -jar target/spanner-benchmark-poc.jar populate --profile small --truncate-first
java -jar target/spanner-benchmark-poc.jar benchmark --scenario core-read-paths --warmup 10 --iterations 100 --concurrency 1 --sample-size 200 --staleness-seconds 15
```

## Maven Exec Note

`mvn exec:java` may emit a shutdown warning like `NoClassDefFoundError: io/opentelemetry/sdk/common/CompletableResultCode` after the benchmark completes.

That warning comes from the Maven exec launcher's in-process classloader cleanup, not from the benchmark logic or Spanner query execution itself.

For real benchmark runs, prefer the packaged jar:

```bash
mvn package
java -jar target/spanner-benchmark-poc.jar benchmark
```

## Benchmark Workflow

Seed a simple dataset into both tables:

```bash
java -jar target/spanner-benchmark-poc.jar populate --profile small --truncate-first
```

Seed a dataset with many `cust_id + acct_no` combinations and about 250 rows per account pair in both tables:

```bash
java -jar target/spanner-benchmark-poc.jar populate --profile read-heavy-250 --truncate-first
```

Seed a dataset with many `cust_id + acct_no` combinations and about 1000 rows per account pair in both tables:

```bash
java -jar target/spanner-benchmark-poc.jar populate --profile read-heavy-1000 --truncate-first
```

You can also override any profile dimension directly. Example: 100 account pairs (20 customers x 5 accounts) with 250 rows per pair:

```bash
java -jar target/spanner-benchmark-poc.jar populate --profile read-heavy-250 --truncate-first --customers 20 --accounts-per-customer 5
```

Run the default core read paths benchmark across both tables:

```bash
java -jar target/spanner-benchmark-poc.jar benchmark --warmup 10 --iterations 100 --concurrency 1 --sample-size 200 --staleness-seconds 15
```

Limit benchmark execution to one table when needed:

```bash
java -jar target/spanner-benchmark-poc.jar benchmark --table customer_insights --scenario core-read-paths --warmup 10 --iterations 100 --concurrency 1 --sample-size 200 --staleness-seconds 15
java -jar target/spanner-benchmark-poc.jar benchmark --table customer_insights_phone --scenario core-read-paths --warmup 10 --iterations 100 --concurrency 1 --sample-size 200 --staleness-seconds 15
```

Run only the `cust_id + acct_no` query benchmark:

```bash
java -jar target/spanner-benchmark-poc.jar benchmark --scenario customer-account --warmup 10 --iterations 100 --concurrency 1 --sample-size 200 --staleness-seconds 15
java -jar target/spanner-benchmark-poc.jar benchmark --scenario customer-account --warmup 10 --iterations 100 --concurrency 10 --sample-size 200 --staleness-seconds 15
java -jar target/spanner-benchmark-poc.jar benchmark --scenario customer-account --selection-mode random-without-replacement --selection-seed 42 --warmup 100 --iterations 2000 --concurrency 10 --sample-size 1000 --staleness-seconds 15
java -jar target/spanner-benchmark-poc.jar benchmark --scenario customer-account --selection-mode round-robin --warmup 100 --iterations 1000 --concurrency 10 --sample-size 200 --staleness-seconds 15
```

The benchmark output now includes per-query rows-returned stats alongside latency and throughput.

Run a built-in concurrency sweep with a compact report:

```bash
java -jar target/spanner-benchmark-poc.jar benchmark --scenario customer-account --selection-mode random-without-replacement --selection-seed 42 --warmup 1000 --iterations 2000 --concurrency-sweep 1,2,4,8,16 --sample-size 1000 --staleness-seconds 15
```

When `--concurrency-sweep` is set, the benchmark runs each listed concurrency level in order and prints a summary table at the end for easier comparison.

Selection modes:

- `random`: random with replacement
- `random-without-replacement`: randomized coverage without repeats until the sampled set is exhausted, then reshuffle
- `round-robin`: deterministic cycling through the sampled keys
- `shuffle-once`: deterministic shuffle once, then cycle through that order

Run only the full primary-key SQL benchmark:

```bash
java -jar target/spanner-benchmark-poc.jar benchmark --scenario exact-primary-key-sql --warmup 10 --iterations 100 --concurrency 1 --sample-size 200 --staleness-seconds 15
```

## Benchmark Terms

`warmup` means the number of query executions to run before measurements are recorded.

Why this matters:

- early requests can include one-time effects like JVM activity, connection/session setup, and client-side initialization
- those first executions are often not representative of steady-state read latency
- warmup helps separate startup effects from the numbers you want to compare

In this benchmark:

- `--warmup 10` means run 10 unmeasured executions first
- `--iterations 100` means then record 100 measured executions
- `--concurrency 10` means up to 10 measured executions can be in flight at the same time
- `--concurrency-sweep 1,2,4,8` means run the same benchmark repeatedly at each listed concurrency and then print a comparison report

Practical guidance:

- use small warmup values for quick smoke tests
- use larger warmup values when comparing runs seriously
- keep warmup, iterations, concurrency, sample size, and selection mode the same when comparing results across data shapes or consistency modes

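The warmup/measure split can be sketched as a simple loop. This is an illustration only, not the benchmark's actual code; `runQuery` is a hypothetical stand-in for one Spanner query execution:

```java
import java.util.ArrayList;
import java.util.List;

public class WarmupSketch {

    // Hypothetical stand-in for one Spanner query execution; returns elapsed nanos.
    static long runQuery() {
        long start = System.nanoTime();
        // ... execute the query here ...
        return System.nanoTime() - start;
    }

    // Run `warmup` unmeasured executions, then record `iterations` latencies.
    static List<Long> run(int warmup, int iterations) {
        for (int i = 0; i < warmup; i++) {
            runQuery(); // timing discarded: absorbs JVM and session ramp-up
        }
        List<Long> latenciesNanos = new ArrayList<>();
        for (int i = 0; i < iterations; i++) {
            latenciesNanos.add(runQuery()); // only these feed the reported stats
        }
        return latenciesNanos;
    }

    public static void main(String[] args) {
        System.out.println("measured executions: " + run(10, 100).size());
    }
}
```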
`sample-size` controls how many keys or key prefixes are loaded into the benchmark's in-memory sample set before execution starts.

Why this matters:

- the benchmark does not choose from the whole table on every iteration
- it first loads a bounded sample of candidate keys or `cust_id + acct_no` pairs
- query executions are then selected from that sample set

Practical guidance:

- larger sample sizes reduce the chance that a small subset of keys dominates the run
- smaller sample sizes are faster to initialize but can bias the workload toward fewer repeated keys
- keep sample size fixed when comparing two benchmark runs

`selection-mode` controls how benchmark iterations choose from the sampled keys.

Available modes:

- `random`: random with replacement; the same key can be picked multiple times
- `random-without-replacement`: randomized order with no repeats until the sample is exhausted, then reshuffle using the same seeded RNG
- `round-robin`: deterministic cycling through the sampled keys in order
- `shuffle-once`: deterministic shuffle once, then cycle through that shuffled order

Practical guidance:

- use `random` when you want a simple randomized workload
- use `random-without-replacement` when you want broad coverage with less immediate key reuse
- use `round-robin` when you want strict deterministic coverage
- use `shuffle-once` when you want deterministic but less order-biased coverage
- use `random-without-replacement` plus a fixed seed for concurrency sweeps so each run covers the sample broadly without immediate key reuse
- if comparing runs closely, keep both `selection-mode` and `selection-seed` fixed

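As a hedged illustration (a sketch, not the benchmark's actual implementation), round-robin and random-without-replacement selection over a sampled key list might look like:

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;

public class SelectionSketch {

    // round-robin: deterministic cycling through the sampled keys in order.
    static String roundRobin(List<String> sample, int iteration) {
        return sample.get(iteration % sample.size());
    }

    // random-without-replacement: shuffle, consume every key once,
    // then reshuffle with the same seeded RNG and continue.
    static List<String> randomWithoutReplacement(List<String> sample, int iterations, long seed) {
        Random rng = new Random(seed);
        List<String> order = new ArrayList<>(sample);
        Collections.shuffle(order, rng);
        List<String> picks = new ArrayList<>();
        int pos = 0;
        for (int i = 0; i < iterations; i++) {
            if (pos == order.size()) {   // sample exhausted: reshuffle and restart
                Collections.shuffle(order, rng);
                pos = 0;
            }
            picks.add(order.get(pos++));
        }
        return picks;
    }
}
```

With a fixed seed, every run draws keys in the same order, which is what makes concurrency sweeps comparable across runs.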
## Current Benchmark Coverage

The initial default benchmark focus is:

- SQL queries filtered by `cust_id` and `acct_no` with strong consistency
- SQL queries filtered by `cust_id` and `acct_no` with stale consistency
- SQL queries filtered by full primary key with strong consistency
- SQL queries filtered by full primary key with stale consistency

Additional scenario names currently supported:

- `core-read-paths`
- `customer-account`
- `phone-number`
- `exact-primary-key-sql`
- `customer-only`
- `full-key-read`
- `all`

Table selection values:

- `all`: run scenarios for both `customer_insights` and `customer_insights_phone`
- `customer_insights`: run only the original customer/account keyed table scenarios
- `customer_insights_phone`: run only the phone-keyed table scenarios

This is only the first pass. The intent is to refine the scenario set once the target read patterns and reporting requirements are finalized.

## Data Shape Notes

For this schema, rows returned by the `cust_id + acct_no` query are driven by:

`rows per customer-account pair = phone-numbers-per-account * categories * names-per-category`

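For example, with hypothetical profile dimensions (chosen only to illustrate the formula, not the actual settings of any preset):

```java
public class RowsPerPair {

    // rows per customer-account pair = phones-per-account * categories * names-per-category
    static int rowsPerPair(int phonesPerAccount, int categories, int namesPerCategory) {
        return phonesPerAccount * categories * namesPerCategory;
    }

    public static void main(String[] args) {
        // Hypothetical dimensions: 5 phones, 10 categories, 5 names per category.
        System.out.println(rowsPerPair(5, 10, 5)); // prints 250
    }
}
```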
The populate command now prints:

- total expected rows across both tables
- total distinct `cust_id + acct_no` pairs
- rows per `cust_id + acct_no` pair

For deterministic benchmark datasets, use `--truncate-first` so the selected profile replaces existing table contents instead of accumulating with prior loads.

Useful presets:

- `small`: lightweight smoke-test dataset
- `read-heavy-250`: many customer/account combinations with about 250 rows per account pair
- `read-heavy-1000`: many customer/account combinations with about 1000 rows per account pair