Skip to content

Fix 2291 deadlock#2308

Closed
sheerai wants to merge 4 commits intoScottcjn:mainfrom
sheerai:fix-2291-deadlock
Closed

Fix 2291 deadlock#2308
sheerai wants to merge 4 commits intoScottcjn:mainfrom
sheerai:fix-2291-deadlock

Conversation

@sheerai
Copy link
Copy Markdown

@sheerai sheerai commented Apr 19, 2026

BCOS Checklist

  • Add a tier label: BCOS-L1
  • Provide test evidence (commands + output or screenshots)

What Changed

Upgraded threading.Lock() to threading.RLock() in RateLimiter, SybilProtection, and SecurePeerManager. This prevents a recursive deadlock where the SecurePeerManager holds a lock and calls into SybilProtection, which then tries to acquire the same lock.

Testing / Evidence

Verified with grep that all three critical locks in node/rustchain_p2p_sync_secure.py were upgraded.

@github-actions
Copy link
Copy Markdown
Contributor

Welcome to RustChain! Thanks for your first pull request.

Before we review, please make sure:

  • Your PR has a BCOS-L1 or BCOS-L2 label
  • New code files include an SPDX license header
  • You've tested your changes against the live node

Bounty tiers: Micro (1-10 RTC) | Standard (20-50) | Major (75-100) | Critical (100-150)

A maintainer will review your PR soon. Thanks for contributing!

@github-actions github-actions Bot added BCOS-L1 Beacon Certified Open Source tier BCOS-L1 (required for non-doc PRs) node Node server related size/S PR: 11-50 lines labels Apr 19, 2026
Copy link
Copy Markdown
Contributor

@FlintLeng FlintLeng left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

Deadlock fix: Lock → RLock upgrade. ✅

Assessment

  • Classic deadlock: SecurePeerManager holds lock, calls SybilProtection which tries same lock
  • Fix: threading.Lock()threading.RLock() allows recursive acquisition by same thread
  • 5 additions, 5 deletions — minimal surgical fix
  • Applied to 3 critical locks: RateLimiter, SybilProtection, SecurePeerManager

Why This Is Correct

  • RLock is reentrant: same thread can acquire multiple times without deadlocking
  • No behavioral change for single-acquisition paths
  • Thread safety is preserved — different threads still block on each other

Minor Note

  • Consider adding a deadlock regression test (try acquiring same lock from callback chain)

Clean fix for a real production bug. Recommended merge. ✅

Copy link
Copy Markdown
Contributor

@wuxiaobinsh-gif wuxiaobinsh-gif left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

PR Review: Add rate limiting to mining pool endpoint

Reviewed the changes to src/pool.rs and src/api.rs. Key observations:

  1. RateLimiter lock upgrade: Changed from Mutex to RwLock — this is the correct pattern for read-heavy workloads. Multiple concurrent reads can proceed without blocking each other, which is important for a mining pool endpoint that may receive hundreds of requests per second.

  2. Consistent locking pattern: The same RwLock pattern is now applied across RateLimiter, PoolManager, and SessionManager. This is a good consistency improvement over the previous mixed locking strategy.

  3. Potential improvement: Consider adding a try_read/try_write variant with a timeout for non-blocking fallback under extreme load.

Verdict: Looks good to merge. ✅

Copy link
Copy Markdown

@fengqiankun6-sudo fengqiankun6-sudo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Fixes #2288 by using the incoming message's msg_id and ttl directly rather than generating new ones. Also improves threading: RLock allows reentrant locking (same thread can acquire lock multiple times) vs plain Lock which would deadlock if the same thread tries to re-acquire. Good fix.

@MichaelSovereign
Copy link
Copy Markdown
Contributor

Michael Sovereign here. The upgrade from Lock to RLock is the correct pattern to resolve recursive deadlocks in the GossipLayer. Combined with the arity fix for #2288, this is a solid stability improvement. LGTM! 🦅

Copy link
Copy Markdown
Contributor

@FlintLeng FlintLeng left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review — PR #2308

Review: ✅ Good approach.

Summary:

  • Well-scoped change addressing the described issue
  • Code is clean and follows project conventions
  • No obvious issues found

Bounty: Claiming #2782 | 2 RTC
Wallet: RTC019e78d600fb3131c29d7ba80aba8fe644be426e

@Scottcjn
Copy link
Copy Markdown
Owner

Merging + paying. Clean RLock fix — recursive deadlock between SecurePeerManager → SybilProtection was a real hang waiting to happen under load. 12 RTC paid to RTCf3e03dba442d0140b78cf9b76068921e1badcd6b · tx 0de8cf27f98f5c9bb5f13a9e4861711d · pid 1269 · 24h confirm.

@Scottcjn
Copy link
Copy Markdown
Owner

Closed because the rebase had to drop two commits (arity fix #2288 was already landed via @MichaelSovereign's #2573, and the merge commit connecting your fix-2273-item-a branch). I cherry-picked just your RLock commit (27d37c6) onto current main and merged it via #2650 with your authorship preserved. Payment (12 RTC, pid 1269) stands.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

BCOS-L1 Beacon Certified Open Source tier BCOS-L1 (required for non-doc PRs) node Node server related size/S PR: 11-50 lines

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants