When a CronJob has spec.timeZone set and the kube-state-metrics container does not have tzdata installed, kube-state-metrics panics and crashes, taking down all cluster metrics collection. #2898

@vxe

What happened:

When a CronJob has spec.timeZone set to a non-UTC timezone (e.g. Asia/Singapore) and the
kube-state-metrics container image does not include tzdata, kube-state-metrics panics and enters
CrashLoopBackOff, taking down all cluster metrics collection.

The panic originates in internal/store/cronjob.go:254. There, getNextScheduledTime() prefixes
the schedule with CRON_TZ=<timezone> and passes it to cron.ParseStandard(), which calls
time.LoadLocation(). Without tzdata that call fails, and the returned error goes straight to
panic(err) with no recovery.
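
The failure path can be sketched with the standard library alone. This is a rough illustration, not the kube-state-metrics code: buildCronSpec and resolveLocation are hypothetical names, the cron parser itself is left out, and an invalid zone name stands in for a valid zone on an image without tzdata, since both end in an error from time.LoadLocation().

```go
package main

import (
	"fmt"
	"time"
)

// buildCronSpec mirrors the CRON_TZ prefixing described in the report.
// Hypothetical helper, not taken from kube-state-metrics.
func buildCronSpec(schedule, timeZone string) string {
	if timeZone == "" {
		return schedule
	}
	return fmt.Sprintf("CRON_TZ=%s %s", timeZone, schedule)
}

// resolveLocation wraps time.LoadLocation, the call that fails when no
// zone database entry can be found for the requested name.
// Hypothetical helper; the wrapping text imitates the message in the report.
func resolveLocation(timeZone string) (*time.Location, error) {
	loc, err := time.LoadLocation(timeZone)
	if err != nil {
		return nil, fmt.Errorf("provided bad location %s: %w", timeZone, err)
	}
	return loc, nil
}

func main() {
	fmt.Println(buildCronSpec("0 4 * * 1-5", "Asia/Singapore"))

	// Assumption: "Not/AZone" simulates an unresolvable zone so the
	// error branch is reached on any host, with or without tzdata.
	if _, err := resolveLocation("Not/AZone"); err != nil {
		fmt.Println("error:", err)
	}
}
```

Passing an error like this one to panic() is what terminates the process; returning it to the caller leaves room to handle it.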

What you expected to happen:

The error should be logged and the kube_cronjob_next_schedule_time metric skipped for the
affected CronJob. The process should continue running and serving metrics for all other resources.

How to reproduce it (as minimally and precisely as possible):

# Apply any CronJob with spec.timeZone set to a named timezone
kubectl apply -f - <<EOF
apiVersion: batch/v1
kind: CronJob
metadata:
  name: tz-test
spec:
  schedule: "0 4 * * 1-5"
  timeZone: "Asia/Singapore"
  jobTemplate:
    spec:
      template:
        spec:
          containers:
          - name: test
            image: busybox
            command: ["echo", "hello"]
          restartPolicy: OnFailure
EOF

# kube-state-metrics will panic immediately with:
# panic: Failed to parse cron job schedule 'CRON_TZ=Asia/Singapore 0 4 * * 1-5':
#        provided bad location Asia/Singapore: unknown time zone Asia/Singapore

Anything else we need to know?:

spec.timeZone has been a stable Kubernetes API field since v1.27. Because of the panic(err) in
cronjob.go:254, any cluster that uses this field while running kube-state-metrics without tzdata
loses all metrics. The fix is two-fold:

  1. Replace panic(err) with a logged error that skips the metric for the affected CronJob
  2. Include tzdata in the container image so named timezones resolve correctly

Suggested fix for cronjob.go:254:

if err != nil {
    // Log the failure and return the metrics gathered so far instead of panicking.
    klog.Errorf("failed to compute next schedule time for cronjob %s/%s: %v",
        j.Namespace, j.Name, err)
    return &metric.Family{Metrics: ms}
}
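
The intended behaviour, one bad CronJob being skipped while the rest keep reporting, can be sketched in isolation. Everything here is a stand-in: cronJob, sample, collectNextSchedule and stubNext are hypothetical, the standard log package replaces klog, and the stub fails on a made-up zone name rather than parsing a real schedule.

```go
package main

import (
	"errors"
	"fmt"
	"log"
	"time"
)

// cronJob and sample are hypothetical stand-ins for the real types.
type cronJob struct {
	Namespace, Name, Schedule, TimeZone string
}

type sample struct {
	Job   string
	Value float64
}

// collectNextSchedule computes one sample per job. A job whose schedule
// cannot be evaluated is logged and skipped; the others are unaffected.
func collectNextSchedule(jobs []cronJob, next func(cronJob) (time.Time, error)) []sample {
	out := make([]sample, 0, len(jobs))
	for _, j := range jobs {
		t, err := next(j)
		if err != nil {
			log.Printf("failed to compute next schedule time for cronjob %s/%s: %v",
				j.Namespace, j.Name, err)
			continue
		}
		out = append(out, sample{Job: j.Namespace + "/" + j.Name, Value: float64(t.Unix())})
	}
	return out
}

// stubNext simulates the schedule computation. Assumption: the zone name
// "Unresolvable/Zone" represents any zone that cannot be loaded.
func stubNext(j cronJob) (time.Time, error) {
	if j.TimeZone == "Unresolvable/Zone" {
		return time.Time{}, errors.New("unknown time zone " + j.TimeZone)
	}
	return time.Unix(1700000000, 0), nil
}

func main() {
	jobs := []cronJob{
		{Namespace: "default", Name: "tz-test", Schedule: "0 4 * * 1-5", TimeZone: "Unresolvable/Zone"},
		{Namespace: "default", Name: "ok", Schedule: "*/5 * * * *"},
	}
	for _, s := range collectNextSchedule(jobs, stubNext) {
		fmt.Println(s.Job, s.Value)
	}
}
```

Returning the error from the computation and deciding at the call site keeps a single malformed or unresolvable CronJob from affecting any other resource.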

Environment:

  • kube-state-metrics version: (please fill in)
  • Kubernetes version: v1.32.12
  • Cloud provider or hardware configuration: AWS EKS
  • Other info: kube-state-metrics container image does not include tzdata; time.LoadLocation() fails for any named timezone

Labels

kind/bug: Categorizes issue or PR as related to a bug.
triage/accepted: Indicates an issue or PR is ready to be actively worked on.
