refactor: extract stateless-db crate and split pipeline/validator modules #122
Conversation
The PR currently has no labels. Based on the diff, it is primarily a refactor (extracting the stateless-db crate and splitting the pipeline and validator modules).
Codecov Report: ❌ patch coverage flagged; see the full report in Codecov by Sentry.
Review: Clean, well-scoped refactoring. The three structural changes look good overall. A few notes:
Review: Solid refactor. The crate extraction and module splits are clean. One issue to verify before merging: the pre-existing unresolved thread on AGENTS.md.
Good PR overall. One inline comment below on a logging regression introduced by the prefix removal.
Clean refactor. Three issues to address and one nit; see inline comments for the exact suggestions.
Good refactor overall. Two issues to address before merging, plus one nit on the two retry loops. The test plan has one unchecked item (debug-trace-server startup with a non-existent data-dir).
Clean refactor. All prior review comments have been addressed. A few small notes below, but overall this improves the codebase significantly.
Large, well-organized refactoring. The crate split and pipeline module decomposition are clean. Two nits and one inline comment below.
Review: Strong refactoring. The structural decompositions — stateless-db crate, pipeline module split, validator binary split — all reduce coupling cleanly, and the new cache tests cover the important invariants. A few specific observations:

- Infallible RPC client — the shift from `Result` to `T` throughout `RpcClient` and the centralized `round_robin_with_backoff` is a clean design. The round-level backoff semantics (escalate `debug` → `warn` at round 3, jitter, unbounded retry) and primary failover for witnesses are well documented and correctly tested. The metric-recording change (`record_request(true, ...)` unconditionally) is correct given the new invariant.
- `ContractCache` / `Arc` — the three-tier lookup chain (memory → disk → RPC with hash verification) is a meaningful correctness improvement over the previous approach. Having `get_codes` fail the whole batch on a single hash mismatch is stricter and right.
- DB error classification in `get_block_data_by_hash` — distinguishing `StoreError::MissingData` (silent fallthrough to RPC) from other errors (warn + fallthrough) is a correctness improvement over the old `let Ok(data) = ...` that silently swallowed real backend errors.
- Unresolved: `std::mem::forget(temp_dir)` in `tests/integration.rs` — replied to the existing thread. This is the only item to fix before merging.
Good refactor overall — the crate split is clean, the pipeline module decomposition is clear, and previous review feedback has been addressed. Two issues remain:
1. Unresolved: the still-open review thread flagged earlier.
2. Block-range log wrong on fresh start — see inline comment.
let (a, b) = (hits_a.load(Ordering::Relaxed), hits_b.load(Ordering::Relaxed));
assert!(a >= 5, "primary (index 0) must be tried every call: got {a}");
// Backup should only be hit as fallback.
assert!(b <= a, "backup must not outpace primary: a={a}, b={b}");
nit: The doc comment says "the secondary is untouched while primary is up", but the test never has A healthy. After the first RPC-level failure, A returns "v0:AAAA" which also fails at decode, so both A and B fail on every round and both get hit equally (a ≈ b). b <= a is only trivially satisfied.
The missing case is: when A succeeds (returns valid data), B should receive zero hits. A focused test would use a real witness payload (or a mock that returns success) for A and assert b == 0 across multiple calls.
The test does verify that A is always tried first, which is the critical scheduling property — but the "untouched secondary" invariant is not actually exercised.
Large, well-structured refactor. The crate extraction, pipeline split, and retry unification are all clean. Two items to address before merge:
1. PR description inaccuracy — the description's "promote …" wording should be corrected to match the diff.
2. Resolved threads — confirm the previously resolved threads are actually addressed.
Summary
- Extract a `stateless-db` crate from `stateless-common`, moving the redb table definitions, read/write helpers, serialization, and `ContractCache` into a dedicated persistence layer.
- Split `crates/stateless-core/src/pipeline.rs` into a module (`config`, `traits`, `fetcher`, `divergence`, `advancer`, `worker`, `tests`) for clearer responsibility boundaries.
- Split `bin/stateless-validator/src/main.rs` into `main.rs` (entry), `app.rs` (CLI/startup wiring), `workers.rs` (pipeline + signals + reporter), and `lib.rs`; integration tests move to `tests/integration.rs`.
- Update `AGENTS.md` to describe the new layout and file map.
- Change the `PipelineConfig::fetcher_max_in_flight` default from `workers` to `2 * workers`, since fetchers are I/O-bound and should exceed the CPU-bound worker count to keep workers fed (bounded by `--data-max-concurrent-requests` / `--witness-max-concurrent-requests`).
- Auto-create `--data-dir` on startup in both `stateless-validator` and `debug-trace-server`, so operators no longer need to pre-create the directory.
- Log the validated block range (`start`, `end`, `count`) at validator shutdown by snapshotting the canonical tip before the pipeline runs.
- Enable the `ansi` feature of `tracing-subscriber` for colored terminal output, and promote `[Worker] Successfully validated block` from `debug!` to `info!`.
- Add the `rex4Time` hardfork activation to `test_data/mainnet/genesis.json`.
- Remove dependencies (`serde`, `serde_json`, `eyre`, `thiserror`, `tempfile`) from several crate manifests.

Test plan
- [x] `cargo fmt --all --check`
- [x] `cargo clippy --workspace --all-targets --all-features`
- [x] `cargo sort --check --workspace --grouped`
- [x] `cargo test --workspace` (unit + integration, including the updated `test_pipeline_config_default_uses_cpu_count`)
- [x] Run `stateless-validator` against a fresh (non-existent) `--data-dir` and confirm it is auto-created
- [x] Confirm the `[Main] Validated blocks this session` log appears at shutdown with the expected `start`/`end`/`count`
- [ ] `debug-trace-server` startup with a non-existent `--data-dir` in local-cache mode