Rejected tool calls use status: 'completed' in function_call_result, causing model hallucinations #1104

@KostaGPT

Problem

When a tool call is rejected via the approval flow, buildApprovalRejectionResult produces a function_call_result with status: 'completed'. The only signal that the tool was rejected is the text content of the output.

I believe this is what causes the model to frequently hallucinate that a rejected tool call succeeded, especially in multi-tool scenarios (e.g. approve one email, reject another, and the model claims both were sent).

Root cause

In runner/toolExecution.mjs, getToolCallOutputItem hardcodes status: 'completed'. buildApprovalRejectionResult passes the rejection message through getToolCallOutputItem, so rejected tool calls get the same status: 'completed' as successful ones.
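
To illustrate the pattern, here is a minimal sketch. The signatures and object shapes are simplified assumptions for this issue, not the actual code in runner/toolExecution.mjs; only the two function names and the hardcoded status are taken from the source.

```javascript
// Hypothetical, simplified stand-ins for the helpers named above.
// The real functions in runner/toolExecution.mjs take different arguments.
function getToolCallOutputItem(toolCall, output) {
  return {
    type: 'function_call_result',
    name: toolCall.name,
    callId: toolCall.callId,
    status: 'completed', // hardcoded, regardless of what actually happened
    output,
  };
}

function buildApprovalRejectionResult(toolCall) {
  // The rejection is expressed only in the output text.
  return getToolCallOutputItem(toolCall, 'Tool execution was not approved.');
}

const approved = getToolCallOutputItem(
  { name: 'SendEmail', callId: 'abc' },
  'Email sent to bob',
);
const rejected = buildApprovalRejectionResult({ name: 'SendEmail', callId: 'def' });

// Both items carry the same structural status; only the text differs.
console.log(approved.status, rejected.status);
```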

The model sees:

function_call_result { name: "SendEmail", callId: "abc", status: "completed", output: "Email sent to bob" }
function_call_result { name: "SendEmail", callId: "def", status: "completed", output: "Tool execution was not approved." }

The structural status: 'completed' contradicts the rejection text. Strengthening the rejection message via ToolErrorFormatter mitigates this, but it cannot fully eliminate the contradiction: the status field still tells the model that the call completed.

Suggested fix

Consider using status: 'incomplete' for rejected tool calls, or ideally adding a dedicated status: 'rejected' value to the function_call_result spec so the model gets a consistent signal.
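
A rough sketch of the first option, under the same simplified assumptions as the rest of this issue: the signatures, the optional status parameter, and the default message argument are hypothetical and not the SDK's actual API. It assumes 'incomplete' is accepted as a status on tool output items; 'rejected' would need a spec change first.

```javascript
// Hypothetical variant: let callers override the status instead of hardcoding it.
function getToolCallOutputItem(toolCall, output, status = 'completed') {
  return {
    type: 'function_call_result',
    name: toolCall.name,
    callId: toolCall.callId,
    status,
    output,
  };
}

function buildApprovalRejectionResult(
  toolCall,
  message = 'Tool execution was not approved.',
) {
  // Assumption: 'incomplete' is a valid status here. Swap in 'rejected'
  // only if the function_call_result spec gains that value.
  return getToolCallOutputItem(toolCall, message, 'incomplete');
}

const sent = getToolCallOutputItem(
  { name: 'SendEmail', callId: 'abc' },
  'Email sent to bob',
);
const denied = buildApprovalRejectionResult({ name: 'SendEmail', callId: 'def' });

// The structural status now agrees with the rejection text.
console.log(sent.status, denied.status);
```

With this shape, successful calls keep the default status, so existing callers of getToolCallOutputItem would not need to change.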
