Skip to content

Commit 68ff59f

Browse files
iangmaiaclaude
andauthored
Fix Claude analysis gating to handle co-failures correctly (#25491)
* Fix Claude analysis gating to handle co-failures correctly The previous script exited early when any non-essential step (e.g. Danger) failed, incorrectly skipping Claude analysis even when essential jobs also failed in the same build. Now counts non-essential failures first, then queries the Buildkite API for the total failed job count. Claude is only skipped when all failures are accounted for by non-essential steps. Fails safe: if the API call fails, Claude runs anyway. Also fixes shebang portability, uses a proper Bash array for step keys, uses YAML literal scalar to preserve prompt formatting, and includes timed_out jobs in the failure count. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * Fix Claude analysis gating: use hard_failed and handle all-passed Two bugs fixed: - buildkite-agent step get outcome returns "hard_failed", not "failed", so the non-essential check never matched - When all steps passed, the script fell through to uploading Claude analysis instead of exiting early Also restructures the script to query the API first, which simplifies the flow and avoids a redundant early-exit branch. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
1 parent c574ef1 commit 68ff59f

File tree

2 files changed

+38
-9
lines changed

2 files changed

+38
-9
lines changed

.buildkite/claude-analysis.yml

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -24,7 +24,7 @@ steps:
2424
build_log_mode: "failed"
2525
trigger: "on-failure"
2626
max_log_lines: 1500
27-
custom_prompt: >
27+
custom_prompt: |
2828
Ignore these jobs — they are not real failures:
2929
(1) The "Claude Build Analysis" job itself (uses `exit 1` to trigger this hook).
3030
(2) The "Danger - PR Check" job (external linter, not indicative of build health).
Lines changed: 37 additions & 8 deletions
Original file line numberDiff line numberDiff line change
@@ -1,21 +1,50 @@
1-
#!/bin/bash -eu
1+
#!/usr/bin/env bash
2+
3+
set -eu
24

35
# Uploads the Claude analysis pipeline only when real build/test failures exist.
4-
# Skips when only non-essential jobs failed (e.g. Danger).
6+
# Skips when only non-essential jobs failed (e.g. Danger) or nothing failed.
57

68
source .buildkite/shared-pipeline-vars
79

10+
# Query the Buildkite API to get the total count of failed jobs in this build.
11+
build_info=$(curl -sf \
12+
-H "Authorization: Bearer ${BUILDKITE_TOKEN_FOR_CLAUDE:-}" \
13+
"https://api.buildkite.com/v2/organizations/${BUILDKITE_ORGANIZATION_SLUG}/pipelines/${BUILDKITE_PIPELINE_SLUG}/builds/${BUILDKITE_BUILD_NUMBER}" \
14+
2>/dev/null || echo "")
15+
16+
if [ -z "${build_info}" ]; then
17+
echo "Could not fetch build info; assuming real failures exist, running Claude analysis"
18+
buildkite-agent pipeline upload .buildkite/claude-analysis.yml
19+
exit 0
20+
fi
21+
22+
total_failures=$(echo "${build_info}" | jq '[.jobs[] | select(.state == "failed" or .state == "timed_out")] | length' 2>/dev/null || echo "-1")
23+
24+
# Nothing failed — skip Claude analysis.
25+
if [ "${total_failures}" -eq 0 ]; then
26+
echo "All steps passed, skipping Claude analysis"
27+
exit 0
28+
fi
29+
830
# Non-essential steps whose failure alone should NOT trigger Claude analysis.
931
# Add step keys here as needed.
10-
NON_ESSENTIAL_STEPS="danger"
32+
NON_ESSENTIAL_STEPS=("danger")
1133

12-
for step_key in ${NON_ESSENTIAL_STEPS}; do
13-
OUTCOME=$(buildkite-agent step get outcome --step "${step_key}" 2>/dev/null || echo "not_run")
14-
if [ "${OUTCOME}" = "failed" ]; then
15-
echo "Only non-essential job '${step_key}' failed, skipping Claude analysis"
16-
exit 0
34+
# Count how many non-essential steps have a "hard_failed" outcome.
35+
non_essential_failures=0
36+
for step_key in "${NON_ESSENTIAL_STEPS[@]}"; do
37+
outcome=$(buildkite-agent step get outcome --step "${step_key}" 2>/dev/null || echo "not_run")
38+
if [ "${outcome}" = "hard_failed" ]; then
39+
echo "Non-essential job '${step_key}' failed"
40+
non_essential_failures=$((non_essential_failures + 1))
1741
fi
1842
done
1943

44+
if [ "${total_failures}" -eq "${non_essential_failures}" ]; then
45+
echo "Only non-essential job(s) failed (${non_essential_failures}/${total_failures}), skipping Claude analysis"
46+
exit 0
47+
fi
48+
2049
echo "Real failures detected, running Claude analysis"
2150
buildkite-agent pipeline upload .buildkite/claude-analysis.yml

0 commit comments

Comments
 (0)