Description
We observe that jobs submitted via our front-end wrapper script sometimes end with "failed 100 : assumedly after job", and the qmaster messages file reports that it cannot read the usage file for the job. Importantly, the job is never executed on the execution host.
qmaster message
2025-12-11 16:21:27.352581| worker|03|rdocs01|W|job 142486 .1 failed on host rd2696 assumedly after job because: can't read usage file for job 142486 .1
Client-side messages
waiting for interactive job to be scheduled ...
Your interactive job 142486 has been successfully scheduled.
Establishing builtin session to host rd2696 ...
Your job 142486 ("test_job") has been submitted
qacct -j 142486
start_time -/-
end_time -/-
granted_pe NONE
slots 1
failed 100 : assumedly after job
exit_status 0
ru_wallclock 0
ru_utime 0.000
ru_stime 0.000
ru_maxrss 0
ru_ixrss 0
ru_ismrss 0
ru_idrss 0
ru_isrss 0
ru_minflt 0
ru_majflt 0
ru_nswap 0
ru_inblock 0
ru_oublock 0
ru_msgsnd 0
ru_msgrcv 0
ru_nsignals 0
ru_nvcsw 0
ru_nivcsw 0
wallclock 0.000
cpu 0.000
mem 0.000
io 0.000
iow 0.000
maxvmem 0
maxrss 0
arid undefined
Environment
- Product: OCS 9.0.9 (build 141125-1311), official binaries
- OS/Distro: Oracle Linux 8.10
- Front-end: wrapper submit script (run_job) that asks for yes/no confirmation before invoking qrsh and related tools
Observations
- The job never starts on the execution host (no start_time, no resource usage).
- Accounting shows zero usage and failed=100.
- qmaster logs indicate a missing usage file, suggesting the shepherd never wrote it.
- The issue occurs only when confirmation is piped (e.g., echo y | run_job or y | run_job).
- When typing y interactively, the job runs normally.
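One difference between the two cases that can be checked directly is whether stdin is a terminal when the wrapper runs. The following is only a small diagnostic sketch; stdin_kind is a hypothetical helper name and is not part of run_job or OCS.

```shell
# Hypothetical helper (not part of run_job or OCS):
# reports whether file descriptor 0 (stdin) is attached to a terminal.
stdin_kind() {
    if [ -t 0 ]; then
        echo "tty"
    else
        echo "not-a-tty"
    fi
}
```

Calling it as `echo y | stdin_kind` shows the piped case, and calling it directly at a shell prompt shows the interactive case. This only confirms how the two invocations differ; it does not by itself explain the failed 100 status.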
Steps to Reproduce
- Use the submit wrapper script that prompts for confirmation.
- Pipe y into the script:
  echo y | run_job
  # or
  y | run_job
- Observe that the job is scheduled but never executed, and accounting shows failed 100.
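Since run_job itself is not shown here, the steps above can be approximated with a stand-in. This is a minimal sketch under assumptions: confirm_then_run is a hypothetical function, the prompt text and accepted answers are invented for illustration, and the command to run is passed as arguments instead of being a fixed qrsh call.

```shell
# Hypothetical stand-in for run_job (the real wrapper is not shown in this report).
# Prompts on stderr, reads one line of confirmation from stdin, then runs
# the command given as arguments with whatever is left on stdin.
confirm_then_run() {
    printf 'Submit job? [y/n] ' >&2
    # If stdin is already at end-of-file, treat it as "no answer".
    IFS= read -r answer || answer=""
    case "$answer" in
        y|Y|yes)
            "$@"
            ;;
        *)
            echo "aborted" >&2
            return 1
            ;;
    esac
}
```

With `echo y | confirm_then_run cat`, the confirmation line is consumed by read and cat then sees end-of-file immediately on the inherited pipe, whereas in the interactive case the child inherits an open terminal. Substituting the wrapper's real qrsh invocation for cat may help narrow down whether the state of stdin is related to the failure; that remains a guess until tested.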
Control Case (works)
- When typing y manually at the prompt (interactive stdin), the job runs normally and accounting is written.
Thanks in advance!