Bug #9753

Task process exited 1, but never updated its task record to indicate success and record its output.

Added by Bryan Cosca over 9 years ago. Updated about 6 years ago.

Status: Closed
Priority: Normal
Assigned To: -
Category: -
Target version: -
Story points: -

Description

Using arvados-cwl-runner...

https://cloud.curoverse.com/collections/8fa1d98e0fcd4a0304d8c91b68435459+85/qr1hi-8i9sb-q36siyp3wfe8pcs.log.txt

2016-08-07_22:45:14 qr1hi-8i9sb-q36siyp3wfe8pcs 44336 0 stderr Using Arvados SDK version 0.1.20160622134640
2016-08-07_22:45:14 qr1hi-8i9sb-q36siyp3wfe8pcs 44336 0 stderr 2016/08/07 22:45:13 Using TLS certificates at /etc/ssl/certs/ca-certificates.crt
2016-08-07_22:45:14 qr1hi-8i9sb-q36siyp3wfe8pcs 44336 0 stderr 2016/08/07 22:45:13 Using TLS certificates at /etc/arvados/ca-certificates.crt
2016-08-07_22:45:20 qr1hi-8i9sb-q36siyp3wfe8pcs 44336 0 stderr 2016/08/07 22:45:19 Get https://qr1hi.arvadosapi.com/arvados/v1/jobs/qr1hi-8i9sb-q36siyp3wfe8pcs: EOF
2016-08-07_22:45:20 qr1hi-8i9sb-q36siyp3wfe8pcs 44336 0 stderr srun: error: compute1: task 0: Exited with exit code 1
2016-08-07_22:45:20 qr1hi-8i9sb-q36siyp3wfe8pcs 44336 0 child 44464 on compute1.1 exit 1 success=
2016-08-07_22:45:20 qr1hi-8i9sb-q36siyp3wfe8pcs 44336 0 ERROR: Task process exited 1, but never updated its task record to indicate success and record its output.

Probably not useful, but here's the CWL output:

[step bgzip_freebayes] completion status is permanentFail
2016-08-07 22:59:19 arvados.cwl-runner[38899] ERROR: Caught unhandled exception, marking pipeline as failed.  Error was: Output for workflow not available
Traceback (most recent call last):
  File "/home/bcosc/venv-test/local/lib/python2.7/site-packages/arvados_cwl/__init__.py", line 189, in arvExecutor
    for runnable in jobiter:
  File "/home/bcosc/venv-test/local/lib/python2.7/site-packages/cwltool/workflow.py", line 410, in job
    for w in wj.job(builder.job, output_callback, **kwargs):
  File "/home/bcosc/venv-test/local/lib/python2.7/site-packages/cwltool/workflow.py", line 381, in job
    raise WorkflowException("Output for workflow not available")
WorkflowException: Output for workflow not available
2016-08-07 22:59:21 arvados.cwl-runner[38899] INFO: Job bgzip_varscan (qr1hi-8i9sb-i8uah5udbcjqftn) is Running
Workflow error, try again with --debug for more information:
  Workflow failed.
Traceback (most recent call last):
  File "/home/bcosc/venv-test/local/lib/python2.7/site-packages/cwltool/main.py", line 715, in main
    **vars(args))
  File "/home/bcosc/venv-test/local/lib/python2.7/site-packages/arvados_cwl/__init__.py", line 223, in arvExecutor
    raise WorkflowException("Workflow failed.")
WorkflowException: Workflow failed.
Actions #1

Updated by Tom Morris over 9 years ago

  • Priority changed from Normal to High

Bumping to high priority so we remember to triage early.

Actions #2

Updated by Tom Morris over 9 years ago

  • Subject changed from TLS certificates error to Task process exited 1, but never updated its task record to indicate success and record its output.
  • Priority changed from High to Normal

I don't think this has anything to do with TLS, so updating the title.

The two things that jump out:
1. The job record fetch returned EOF (apparently, although the record is there now)
2. The task record hadn't been updated

Returning to normal priority, as this is one in a litany of weird pipeline errors.
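
For illustration only: the first observation above (a single GET on the job record ending in EOF, with the record readable later) looks like a transient failure. The log does not show whether the request was retried, so the following is a minimal sketch of retrying such a fetch with backoff, not the Arvados SDK's or crunch's actual behaviour. The names fetch_with_retry, fetch, attempts, base_delay and transient are all hypothetical.

```python
import time


def fetch_with_retry(fetch, attempts=3, base_delay=1.0,
                     transient=(EOFError, ConnectionError)):
    """Call fetch() until it succeeds or the attempts are used up.

    fetch is a hypothetical zero-argument callable that returns the
    job record or raises one of the exceptions listed in transient.
    """
    last_error = None
    for attempt in range(attempts):
        try:
            return fetch()
        except transient as err:
            last_error = err
            # Back off exponentially, but do not sleep after the last try.
            if attempt < attempts - 1:
                time.sleep(base_delay * (2 ** attempt))
    # Every attempt failed: surface the last transient error.
    raise last_error


if __name__ == "__main__":
    calls = []

    def flaky_fetch():
        # Stand-in for the real request: fails once, then succeeds.
        calls.append(1)
        if len(calls) < 2:
            raise EOFError("EOF")
        return {"uuid": "qr1hi-8i9sb-q36siyp3wfe8pcs"}

    print(fetch_with_retry(flaky_fetch, base_delay=0))
```

A retry of this kind would only address the first observation; it would not explain why the task record was never updated.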

Actions #3

Updated by Peter Amstutz about 6 years ago

  • Status changed from New to Closed