ULW verification loop: infinite loop when Oracle VERIFIED not detected in parent session #3212

@supraforge-mueller

Bug

The ultrawork (ULW) loop keeps cycling infinitely after Oracle emits <promise>VERIFIED</promise>. The verification is actually complete, but the system fails to detect the VERIFIED promise in the parent session's messages.

Root Cause

Three interrelated issues in the ralph-loop verification path:

1. detectOracleVerificationFromParentSession only scans assistant role messages

In src/hooks/ralph-loop/pending-verification-handler.ts, the function skips every message whose message.info?.role is not "assistant". Oracle subagent output, however, is returned as a tool_result, which may sit in a non-assistant message or in a separate message part. As a result VERIFIED is never found, recovery always fails, and handleFailedVerification restarts the loop.

2. detectCompletionInSessionMessages only inspects assistant messages and text parts

In src/hooks/ralph-loop/completion-promise-detector.ts, the function filters for assistant role and only collects text parts. Tool result parts (which contain the Oracle's output) are excluded, so VERIFIED is never detected via the API path either.
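To illustrate why both detectors miss the promise, here is a minimal sketch of the filtering described above. The Message and Part shapes, the "user" role on the tool result message, and the helper name collectAssistantText are assumptions for illustration, not the actual types or code in the repository.

```typescript
// Hypothetical message shape, for illustration only; the real session types may differ.
type Part = { type: string; text?: string; content?: string };
type Message = { info?: { role?: string }; parts?: Part[] };

// Mirrors the filtering described above: assistant messages only, text parts only.
function collectAssistantText(messages: Message[]): string {
  const chunks: string[] = [];
  for (const message of messages) {
    if (message.info?.role !== "assistant") continue; // non-assistant messages are skipped
    for (const part of message.parts ?? []) {
      if (part.type === "text" && typeof part.text === "string") {
        chunks.push(part.text); // tool_result parts never reach this point
      }
    }
  }
  return chunks.join("\n");
}

const session: Message[] = [
  { info: { role: "assistant" }, parts: [{ type: "text", text: "Calling Oracle to verify." }] },
  // Assumption: the Oracle output arrives as a tool result in a non-assistant message.
  { info: { role: "user" }, parts: [{ type: "tool_result", content: "<promise>VERIFIED</promise>" }] },
];

// The promise is present in the session but is never collected.
const missed = !collectAssistantText(session).includes("<promise>VERIFIED</promise>");
console.log(missed);
```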

3. No circuit breaker for failed verification attempts

When verification detection fails, restartAfterFailedVerification increments iteration but there's no cap on how many verification rounds are attempted. With max_iterations = 100, the loop can run up to 100 Oracle calls (expensive!) before giving up.

Reproduction

  1. Start a ULW loop (/ulw-loop "some task")
  2. Agent completes work, emits <promise>DONE</promise>
  3. System transitions to verification pending, injects ULTRAWORK_VERIFICATION_PROMPT
  4. Agent calls Oracle, Oracle returns with <promise>VERIFIED</promise>
  5. System fails to detect VERIFIED → calls handleFailedVerification → restarts loop
  6. Loop repeats on every iteration (in practice, until max_iterations is exhausted)

Proposed Fixes

Fix 1: Scan all message roles for VERIFIED detection

In detectOracleVerificationFromParentSession:

  • Remove the role !== "assistant" filter
  • Extract text from parts (all types), content string, and tool_result parts
  • Accept VERIFIED even without explicit "Agent: oracle" marker (fallback)

In detectCompletionInSessionMessages:

  • When checking for VERIFIED promise, scan ALL messages (not just assistant)
  • Include tool_result parts in text collection
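A rough sketch of what Fix 1 could look like, under the assumption that messages carry an optional content string plus a parts array whose entries expose text or content. The type shapes, the helper names extractAllText and hasVerifiedPromise, and the whitespace-tolerant pattern are hypothetical and not taken from the source.

```typescript
// Hypothetical shapes; the real message and part types in the codebase may differ.
type Part = { type: string; text?: string; content?: unknown };
type Message = { info?: { role?: string }; content?: string; parts?: Part[] };

// Assumption: tolerate whitespace inside the tag; the real detector may match exactly.
const VERIFIED_PATTERN = /<promise>\s*VERIFIED\s*<\/promise>/;

// Gather every string found on a message, regardless of role or part type.
function extractAllText(message: Message): string {
  const chunks: string[] = [];
  if (typeof message.content === "string") chunks.push(message.content);
  for (const part of message.parts ?? []) {
    if (typeof part.text === "string") chunks.push(part.text);
    if (typeof part.content === "string") chunks.push(part.content); // e.g. tool_result
  }
  return chunks.join("\n");
}

// True if any message, of any role, contains the VERIFIED promise.
function hasVerifiedPromise(messages: Message[]): boolean {
  return messages.some((message) => VERIFIED_PATTERN.test(extractAllText(message)));
}

const parentSession: Message[] = [
  { info: { role: "assistant" }, parts: [{ type: "text", text: "Asking Oracle." }] },
  { info: { role: "user" }, parts: [{ type: "tool_result", content: "<promise>VERIFIED</promise>" }] },
];
console.log(hasVerifiedPromise(parentSession));
```

One trade-off to keep in mind: once every role is scanned, any message that merely quotes the tag would also match, so it may be worth restricting the scan to messages created after verification started.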

Fix 2: Add verification attempt circuit breaker

  • Add verification_attempts: number to RalphLoopState
  • New constant MAX_VERIFICATION_ATTEMPTS = 3
  • Increment counter in restartAfterFailedVerification
  • In handlePendingVerification: if verification_attempts >= MAX_VERIFICATION_ATTEMPTS, call loopState.clear() and end the loop with a warning toast
  • Serialize/deserialize the counter in the state file
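The circuit breaker could be sketched roughly as follows. Only verification_attempts and MAX_VERIFICATION_ATTEMPTS come from the proposal; the reduced RalphLoopState shape, the helper names, and the JSON state format are assumptions for illustration.

```typescript
// Hypothetical subset of RalphLoopState; only verification_attempts comes from the proposal.
interface RalphLoopState {
  iteration: number;
  verification_attempts?: number; // optional so older state files still load
}

const MAX_VERIFICATION_ATTEMPTS = 3;

// Called on each failed verification round; returns the updated state.
function recordFailedVerification(state: RalphLoopState): RalphLoopState {
  return {
    ...state,
    iteration: state.iteration + 1,
    verification_attempts: (state.verification_attempts ?? 0) + 1,
  };
}

// True once the cap is reached and the loop should be cleared instead of restarted.
function shouldTripBreaker(state: RalphLoopState): boolean {
  return (state.verification_attempts ?? 0) >= MAX_VERIFICATION_ATTEMPTS;
}

// Assumption: the state file is JSON; a missing counter defaults to 0.
function deserializeState(raw: string): RalphLoopState {
  const parsed = JSON.parse(raw) as Partial<RalphLoopState>;
  return {
    iteration: parsed.iteration ?? 0,
    verification_attempts: parsed.verification_attempts ?? 0,
  };
}

// Simulate three failed verification rounds, then round-trip through the state file.
let loopState: RalphLoopState = { iteration: 1 };
for (let round = 0; round < MAX_VERIFICATION_ATTEMPTS; round++) {
  loopState = recordFailedVerification(loopState);
}
const restored = deserializeState(JSON.stringify(loopState));
console.log(shouldTripBreaker(restored));
```

Defaulting a missing counter to 0 on load keeps state files written before the change readable.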

Fix 4: Transcript-based fallback

In handlePendingVerification, before calling handleFailedVerification:

  • If API-based detection failed, try detectCompletionInTranscript(transcriptPath, "VERIFIED", started_at) as a fallback
  • If VERIFIED found in transcript file → clear loop (success)
  • Pass getTranscriptPath through the function chain
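The fallback order could look roughly like the sketch below. The two detectors and getTranscriptPath are injected as plain functions so the sketch does not guess at their real signatures; the VerificationDeps interface, the resolveVerification name, the string type for started_at, and the synchronous calls are all assumptions (the real API-based detection is likely asynchronous).

```typescript
// Hypothetical dependency bundle; the real functions and their signatures may differ.
interface VerificationDeps {
  detectViaApi: () => boolean;
  detectViaTranscript: (transcriptPath: string, promise: string, startedAt: string) => boolean;
  getTranscriptPath: () => string | undefined;
}

type VerificationOutcome = "verified" | "failed";

// Try the API path first; fall back to the transcript file only if that fails.
function resolveVerification(deps: VerificationDeps, startedAt: string): VerificationOutcome {
  if (deps.detectViaApi()) return "verified";

  const transcriptPath = deps.getTranscriptPath();
  if (
    transcriptPath !== undefined &&
    deps.detectViaTranscript(transcriptPath, "VERIFIED", startedAt)
  ) {
    return "verified"; // caller clears the loop (success)
  }

  return "failed"; // caller proceeds to handleFailedVerification
}

// Example: the API path misses the promise, but the transcript contains it.
const outcome = resolveVerification(
  {
    detectViaApi: () => false,
    detectViaTranscript: () => true,
    getTranscriptPath: () => "/tmp/example-transcript.jsonl", // placeholder path
  },
  "2024-01-01T00:00:00Z", // placeholder timestamp
);
console.log(outcome);
```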

Files Affected

  • src/hooks/ralph-loop/pending-verification-handler.ts (Fix 1 + 4)
  • src/hooks/ralph-loop/completion-promise-detector.ts (Fix 1)
  • src/hooks/ralph-loop/verification-failure-handler.ts (Fix 2)
  • src/hooks/ralph-loop/loop-state-controller.ts (Fix 2)
  • src/hooks/ralph-loop/types.ts (Fix 2 - new field)
  • src/hooks/ralph-loop/ralph-loop-event-handler.ts (Fix 4 - pass getTranscriptPath)
  • src/hooks/ralph-loop/storage.ts (Fix 2 - serialize/deserialize)

Patch

I've applied these fixes locally by patching the compiled bundle (dist/index.js). The patches work, and the patched bundle passes a JavaScript syntax check. Happy to submit a PR against the TypeScript source if that is preferred.
