Bug #5845
Closed
Pipeline has failed but no jobs are marked as failed
Description
https://workbench.su92l.arvadosapi.com/pipeline_instances/su92l-d1hrv-5hkbkuwvsve9lsk#Components
Pipeline has failed but one job reports "complete" while the other reports "Not ready". No jobs report "failed" even though the whole pipeline has failed.
Updated by Bryan Cosca over 9 years ago
From the pipeline instance logs:
Error creating job for component RefreshReport: Repository not found: '$USER'
Job submission was: {"job":{"script":"run-command","script_parameters":{"command":["$(job.srcdir)/crunch_scripts/get-evidence-refresh-shim","$(file $(GET_EVIDENCE_JSON))","$(file $(GETEV_LATEST))"],"OUT_DATA_DIR":"01a1bf596b269e487d220053f8f29724+249","GET_EVIDENCE_JSON":"$(OUT_DATA_DIR)/out-data/get-evidence.json","GETEV_LATEST":"2511736ccd170e3be28b7d10077ea8e5+74/getev-latest.json.gz"},"script_version":"get-evidence-refresh","repository":"$USER","runtime_constraints":{"docker_image":"arvados/jobs","arvados_sdk_version":"38e27663cf656f0c9c443a2715f249afe39a8bfb","min_nodes":1},"owner_uuid":"su92l-tpzed-6cw59akrlzqb2sl","submit_id":"instance su92l-d1hrv-5hkbkuwvsve9lsk rand d7xy2p313grr","state":"Queued"},"find_or_create":true}
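As a rough illustration of how this kind of mistake could be caught before submission, the sketch below checks whether the "repository" field of a job submission is a bare shell-style variable such as '$USER'. The function name and the trimmed-down submission are hypothetical and are not part of Arvados; the check deliberately ignores the "$(...)" substitutions that appear in script_parameters above.

```python
import json
import re

# Matches a bare shell-style variable such as "$USER" or "${USER}".
# It does not match "$(...)" substitutions like "$(job.srcdir)".
SHELL_VAR = re.compile(r"^\$(\w+|\{\w+\})$")


def repository_looks_unexpanded(submission_json: str) -> bool:
    """Return True if the job's repository is a literal shell variable.

    Hypothetical helper for illustration only; not an Arvados API.
    """
    job = json.loads(submission_json)["job"]
    return bool(SHELL_VAR.match(job.get("repository", "")))


# Trimmed-down version of the submission from the log above.
submission = (
    '{"job": {"script": "run-command", '
    '"script_version": "get-evidence-refresh", '
    '"repository": "$USER"}, "find_or_create": true}'
)

if repository_looks_unexpanded(submission):
    print("repository looks like an unexpanded shell variable")
```

A check like this would only be a convenience for whoever writes the pipeline template; it does not change how the failure is displayed.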
Updated by Abram Connelly over 9 years ago
The bug is not that the job failed but that workbench is not displaying the failure.
The job failed because I mis-specified the repository. Arvados looked for a repository named by the literal string '$USER', did not find one, and so the job never even started.
Once the failure was noticed, Arvados (correctly) cleaned up the job, stopped the pipeline, and did not continue. When viewing the pipeline, though, the whole pipeline is marked as 'failed', the first job in the two-job pipeline is (correctly) marked as successful, and the second job is marked as 'not ready'.
The second job marked as 'not ready' is, in my opinion, a bug. From my perspective, the second job failed and should be marked as such.
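The behaviour suggested here can be sketched as a small display rule. This is only a model of the suggestion, not Workbench code and not the specification in #5906; the function name, its parameters, and the lowercase status strings are assumptions made for illustration.

```python
from typing import Optional


def component_display_state(job_state: Optional[str],
                            creation_error: Optional[str]) -> str:
    """Pick the status label to show for one pipeline component.

    Hypothetical sketch: job_state is the state of the component's job,
    or None if no job exists; creation_error is the error recorded when
    job creation failed, or None if creation was never attempted.
    """
    if job_state is not None:
        # A job exists, so report its own state.
        return job_state
    if creation_error is not None:
        # No job exists because creating it failed: show a failure,
        # rather than implying the component is still waiting.
        return "failed"
    # No job and no error: the component has not been reached yet.
    return "not ready"


# The two components of the pipeline in this report.
first = component_display_state("complete", None)
second = component_display_state(None, "Repository not found: '$USER'")
print(first, second)
```

Under this rule the second component would be shown as failed, while a component that was simply never reached would still be shown as not ready.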
Updated by Brett Smith over 9 years ago
- Status changed from New to Closed
- Target version deleted (Bug Triage)
Closing as a duplicate of #5906 (since I already wrote the specification there).