Fix: fix(hooks): GateGuard permanently blocks all operations on Windows — fs.renameSync EEXIST not handled#1444
Conversation
… operations on Windows — fs.renameSync EEXIST not handled
📝 WalkthroughWalkthroughThis PR updates the session isolation fallback mechanism to use a deterministic cwd-derived SHA256 hash instead of PID-based identification, and adds file deletion before renaming the state file to prevent Windows EEXIST errors during atomic operations. Changes
Estimated code review effort🎯 2 (Simple) | ⏱️ ~10 minutes Possibly related issues
Possibly related PRs
Suggested reviewers
Poem
🚥 Pre-merge checks | ✅ 2 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
Greptile SummaryThis PR fixes two Windows-specific regressions in the GateGuard hook:
Confidence Score: 4/5Safe to merge with minor improvements — failure modes are in the safe direction (spurious re-gate, not gate bypass). Both findings are P2: the atomicity gap causes spurious blocking (not gate bypass), and orphaned temp files are an edge-case disk-leak. The core fixes (EEXIST handling, stable CWD-hash session ID) are correct and address real Windows regressions. Score is 4 rather than 5 because the two P2 findings touch the correctness of the write path and temp-file hygiene. scripts/hooks/gateguard-fact-force.js — saveState atomicity and pruneStaleFiles temp-file coverage. Important Files Changed
Sequence DiagramsequenceDiagram
participant Hook as GateGuard Hook
participant FS as File System
Note over Hook: saveState() called
Hook->>FS: writeFileSync(tmpFile)
FS-->>Hook: tmpFile written
alt Before this PR (Windows EEXIST bug)
Hook->>FS: renameSync(tmpFile → STATE_FILE)
FS-->>Hook: ❌ EEXIST (destination exists)
Note over Hook: outer catch ignores — state lost permanently
else After this PR
Hook->>FS: unlinkSync(STATE_FILE)
FS-->>Hook: STATE_FILE deleted
Note over FS: ⚠️ window: STATE_FILE absent
Hook->>FS: renameSync(tmpFile → STATE_FILE)
FS-->>Hook: ✅ success
end
Note over Hook: SESSION_ID fallback
alt CLAUDE_SESSION_ID set
Hook->>Hook: use env var (stable)
else Fallback before PR
Hook->>Hook: pid-{ppid||pid} — changes each invocation on Windows
else Fallback after PR
Hook->>Hook: cwd-sha256(cwd)[0:16] — stable per project
end
Reviews (1): Last reviewed commit: "Fix issue #1435: fix(hooks): GateGuard p..." | Re-trigger Greptile |
| try { fs.unlinkSync(STATE_FILE); } catch (_) { /* ignore if not exists */ } | ||
| fs.renameSync(tmpFile, STATE_FILE); |
There was a problem hiding this comment.
Unlink-before-rename breaks atomic write semantics
Always unlinking first creates a window between line 87 and 88 where STATE_FILE doesn't exist. Any concurrent loadState() call in that window returns empty state (checked: []), causing isChecked to return false and the gate to fire spuriously. The comment above still says "Atomic write" but the write is no longer atomic.
A safer pattern catches EEXIST specifically, so the unlink only happens when the destination actually exists — preserving the atomic case on the happy path:
| try { fs.unlinkSync(STATE_FILE); } catch (_) { /* ignore if not exists */ } | |
| fs.renameSync(tmpFile, STATE_FILE); | |
| try { | |
| fs.renameSync(tmpFile, STATE_FILE); | |
| } catch (err) { | |
| if (err.code === 'EEXIST') { | |
| try { fs.unlinkSync(STATE_FILE); } catch (_) { /* ignore */ } | |
| fs.renameSync(tmpFile, STATE_FILE); | |
| } | |
| } |
The worst-case failure mode here is still a spurious re-gate (safe direction), but reducing the frequency matters for stability on async hook runs.
| const tmpFile = STATE_FILE + '.tmp.' + process.pid; | ||
| fs.writeFileSync(tmpFile, JSON.stringify(state, null, 2), 'utf8'); | ||
| // On Windows, fs.renameSync throws EEXIST if the destination exists. | ||
| // Unlink first to ensure cross-platform replacement. | ||
| try { fs.unlinkSync(STATE_FILE); } catch (_) { /* ignore if not exists */ } | ||
| fs.renameSync(tmpFile, STATE_FILE); | ||
| } catch (_) { /* ignore */ } |
There was a problem hiding this comment.
Orphaned temp files not pruned
If renameSync throws after unlinkSync succeeds (e.g., an antivirus tool holds a lock on tmpFile), the outer catch (_) { /* ignore */ } silently swallows the error and tmpFile (state-...json.tmp.<PID>) is left on disk permanently.
pruneStaleFiles skips it because the filter requires f.endsWith('.json') — files ending in .json.tmp.<PID> don't match. Consider extending the prune loop to also remove stale .tmp.* files:
// in pruneStaleFiles, add alongside the existing state-*.json cleanup:
if (f.includes('.tmp.') && now - stat.mtimeMs > SESSION_TIMEOUT_MS) {
try { fs.unlinkSync(path.join(STATE_DIR, f)); } catch (_) {}
}There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@scripts/hooks/gateguard-fact-force.js`:
- Around line 85-89: Remove the unconditional fs.unlinkSync(STATE_FILE) and
change the tmpFile -> STATE_FILE replace logic to try fs.renameSync(tmpFile,
STATE_FILE) first; on failure only handle Windows-specific errors (check
error.code === 'EEXIST' || error.code === 'EPERM') by calling
fs.unlinkSync(STATE_FILE) and retrying the rename, and for any other error or a
second failure, surface the error (throw or log and exit) instead of swallowing
it; update the code paths that reference STATE_FILE, tmpFile, fs.renameSync and
fs.unlinkSync so failures are handled explicitly and fail fast.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 3bb049b4-34d6-42a0-a1d3-fa29657f48bf
📒 Files selected for processing (1)
scripts/hooks/gateguard-fact-force.js
| // On Windows, fs.renameSync throws EEXIST if the destination exists. | ||
| // Unlink first to ensure cross-platform replacement. | ||
| try { fs.unlinkSync(STATE_FILE); } catch (_) { /* ignore if not exists */ } | ||
| fs.renameSync(tmpFile, STATE_FILE); | ||
| } catch (_) { /* ignore */ } |
There was a problem hiding this comment.
🧩 Analysis chain
🌐 Web query:
In current Node.js LTS on Windows, what are the documented fs.renameSync behaviors when the destination already exists (including EEXIST/EPERM), and what cross-platform pattern is recommended to preserve atomic file replacement semantics?
💡 Result:
In current Node.js LTS versions (v24 Active LTS "Krypton" and v22/v20 Maintenance LTS as of 2026-04-15), fs.renameSync(oldPath, newPath) behavior when the destination already exists is platform-dependent, emulating POSIX rename(2) semantics via libuv. On POSIX (Linux/macOS): Overwrites the destination if it's a file (atomically replaces it). If destination is an existing non-empty directory, throws ENOTEMPTY. If empty directory, may succeed (replaces it). On Windows: Intended to overwrite existing files using MoveFileEx with MOVEFILE_REPLACE_EXISTING flag in libuv. However, frequently throws EPERM ("operation not permitted") or EBUSY, especially if files are locked by antivirus/indexer/Windows Search, or under concurrent access/high load. EEXIST may occur if replacement fails. Retries (as in graceful-fs) are a common workaround, but not built into core fs. Official Node.js docs (fs.html) do not explicitly document per-platform differences or error codes for existing destinations, referencing POSIX rename(2) generally. Issues like nodejs/node#29481 and libuv discussions confirm Windows unreliability for atomic replacement via rename. Cross-platform pattern for preserving atomic file replacement: Write new content to a temporary file (e.g., destination + unique suffix like '.' + hash/process.pid), then fs.renameSync(tempPath, destination). This leverages rename's atomicity within the same filesystem. If destination exists, rename will attempt replacement (POSIX: succeeds; Windows: may fail with EPERM but often succeeds if temp is new). Popular libraries like write-file-atomic implement this with retries, fsync, chown/mode preservation, handling Windows quirks for reliability. Example: const fs = require('node:fs'); const path = require('node:path'); const crypto = require('node:crypto'); function atomicWriteSync(dest, data) { const tmp = dest + '.' + crypto.createHash('md5').update(dest + process.pid + Date.now).digest('hex'); fs.writeFileSync(tmp, data); fs.renameSync(tmp, dest); // Atomic replace } Use libraries like 'write-file-atomic' for production (handles edge cases, retries EPERM on Windows).
Citations:
- 1: https://nodejs.org/api/fs.html
- 2: https://stackoverflow.com/questions/21219018/node-js-does-fs-rename-overwrite-file-if-already-exists
- 3: EPERM when renaming files on Windows nodejs/node#29481
- 4: https://github.com/iarna/write-file-atomic
- 5: https://github.com/nodejs/LTS
- 6: https://nodejs.org/en/about/releases
- 7: fs.rename doesn't work as documented nodejs/node#21957
- 8: https://www.npmjs.com/package/write-file-atomic
- 9: https://www.nodejs.org
- 10: https://nodejs.org/docs/latest/api/fs.html
- 11: Error renaming file in Win32, if the file already exists joyent/libuv#283
- 12: http://docs.libuv.org/en/latest/fs.html
Remove unconditional unlink and handle state-write errors explicitly.
Line 87 unconditionally deletes STATE_FILE before rename, creating a no-file gap where concurrent reads may find no file during the replacement. Additionally, line 89 silently swallows all errors from the rename, so if the operation fails, state may never persist and the gate can continue re-blocking incorrectly.
On POSIX systems (Linux, macOS), fs.renameSync() atomically replaces an existing file. On Windows, it may fail with EPERM due to file locks; conditional handling is needed only in that case.
Suggested approach
Attempt fs.renameSync() first on all platforms. Only on Windows, if it fails with EEXIST or EPERM, unlink the destination and retry. Wrap the entire operation in explicit error handling to fail fast and prevent silent state corruption.
Per the coding guidelines: "Always handle errors explicitly at every level and never silently swallow errors."
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@scripts/hooks/gateguard-fact-force.js` around lines 85 - 89, Remove the
unconditional fs.unlinkSync(STATE_FILE) and change the tmpFile -> STATE_FILE
replace logic to try fs.renameSync(tmpFile, STATE_FILE) first; on failure only
handle Windows-specific errors (check error.code === 'EEXIST' || error.code ===
'EPERM') by calling fs.unlinkSync(STATE_FILE) and retrying the rename, and for
any other error or a second failure, surface the error (throw or log and exit)
instead of swallowing it; update the code paths that reference STATE_FILE,
tmpFile, fs.renameSync and fs.unlinkSync so failures are handled explicitly and
fail fast.
There was a problem hiding this comment.
1 issue found across 1 file
Prompt for AI agents (unresolved issues)
Check if these issues are valid — if so, understand the root cause of each and fix them. If appropriate, use sub-agents to investigate and fix each issue separately.
<file name="scripts/hooks/gateguard-fact-force.js">
<violation number="1" location="scripts/hooks/gateguard-fact-force.js:87">
P1: Unconditional pre-delete before rename creates a non-atomic write window and can silently lose persisted state if rename fails.</violation>
</file>
Reply with feedback, questions, or to request a fix. Tag @cubic-dev-ai to re-run a review.
| fs.writeFileSync(tmpFile, JSON.stringify(state, null, 2), 'utf8'); | ||
| // On Windows, fs.renameSync throws EEXIST if the destination exists. | ||
| // Unlink first to ensure cross-platform replacement. | ||
| try { fs.unlinkSync(STATE_FILE); } catch (_) { /* ignore if not exists */ } |
There was a problem hiding this comment.
P1: Unconditional pre-delete before rename creates a non-atomic write window and can silently lose persisted state if rename fails.
Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At scripts/hooks/gateguard-fact-force.js, line 87:
<comment>Unconditional pre-delete before rename creates a non-atomic write window and can silently lose persisted state if rename fails.</comment>
<file context>
@@ -82,6 +82,9 @@ function saveState(state) {
fs.writeFileSync(tmpFile, JSON.stringify(state, null, 2), 'utf8');
+ // On Windows, fs.renameSync throws EEXIST if the destination exists.
+ // Unlink first to ensure cross-platform replacement.
+ try { fs.unlinkSync(STATE_FILE); } catch (_) { /* ignore if not exists */ }
fs.renameSync(tmpFile, STATE_FILE);
} catch (_) { /* ignore */ }
</file context>
|
Me too |
Fixed two critical issues on Windows: 1) Handled EEXIST error in fs.renameSync by unlinking the destination file before renaming, ensuring state is correctly persisted across atomic writes. 2) Replaced the unstable pid/ppid fallback for SESSION_ID with a stable hash of the current working directory. This ensures that hook invocations within the same project context share a consistent state file even when spawned through process wrappers that change parent PIDs.
Test: On a Windows environment without CLAUDE_SESSION_ID set, trigger a GateGuard fact-forcing check (e.g., by running a Bash command). Verify that the command is blocked initially. Provide the requested facts/instruction quote and retry the same command. Verify that the command is now permitted and that only a single state file (matching the stable CWD hash) is created in the .gateguard directory.
Summary by cubic
Fixes GateGuard blocking on Windows by safely replacing the state file and using a stable session key. This prevents EEXIST errors during atomic writes and keeps session state consistent across process wrappers.
fs.renameSyncto handle Windows EEXIST and ensure atomic state replacement.SESSION_IDwhenCLAUDE_SESSION_IDis unset, so hooks in the same project share one state file.Written for commit 9a52399. Summary will update on new commits.