Feature #17301
closed
Special case report exit_code 137 as likely out of memory error
Added by Peter Amstutz about 4 years ago.
Updated over 2 years ago.
Estimated time: (Total: 0.00 h)
Release relationship: Auto
Description
One of the most common reasons for containers to fail is running out of memory and being OOM killed. When this happens the container exit code is 137. arvados-cwl-runner should detect that and print a warning. Workbench2 needs to display container warnings and errors, similar to how Workbench 1 already does.
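A minimal sketch of the check described above. The helper name `describe_exit_code` is hypothetical and is not the actual arvados-cwl-runner code. It relies on the common convention that an exit code above 128 means the process was terminated by signal `exit_code - 128`, so 137 corresponds to signal 9 (SIGKILL), which is the signal the Linux OOM killer sends. The warning text is the shortened wording quoted later in this ticket.

```python
import signal

# 128 + 9 (SIGKILL); the exit code this ticket treats as "likely OOM".
OOM_EXIT_CODE = 137


def describe_exit_code(exit_code):
    """Return a warning string for a suspicious exit code, or None.

    Hypothetical helper for illustration only.
    """
    if exit_code == OOM_EXIT_CODE:
        return ("Container may have been killed for using too much RAM. "
                "Try resubmitting with a higher 'ramMin'.")
    if exit_code > 128:
        signum = exit_code - 128
        try:
            name = signal.Signals(signum).name
        except ValueError:
            # Not a signal number known on this platform.
            name = "unknown"
        return "Container exited from signal %d (%s)" % (signum, name)
    return None
```

Note that 137 only indicates SIGKILL; the kill may have another cause, which is why the message says "may have been".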
- Description updated (diff)
- Category set to Workbench2
- Target version set to 2021-02-17 sprint
- Related to Story #16945: WB2 Workflows / containers feature parity added
- Target version deleted (2021-02-17 sprint)
- Release deleted (31)
- Target version set to 2022-03-30 Sprint
- Target version changed from 2022-03-30 Sprint to 2022-04-13 Sprint
- Related to Feature #18513: Print "exited from signal XY" for exit codes >128 added
- Assigned To set to Peter Amstutz
- Category changed from Workbench2 to CWL
- Target version changed from 2022-04-13 Sprint to 2022-04-27 Sprint
- Status changed from New to In Progress
Reviewing c22d905
- The code assumes that runtime_status['activityDetail'] is legal. Do we know if it's at least accepted in railsAPI/controller? (The documentation doesn't mention it)
- The warning message seems to me a little too wordy. I was thinking that we could have an indexed documentation page where to point the user for broader explanations of the summarized messages that we display in WB2's UI. Food for thought, not sure if it should apply to this story.
- At executor.py:
- Line 264: That comment seems to be outdated now.
- Line 268: There's a trailing semicolon.
- If we're going to use runtime_status as some sort of logging store (as I understand, any error/warning will be appended to this field) we'll need to think how to handle long texts on WB2.
Lucas Di Pentima wrote:
Reviewing c22d905
- The code assumes that runtime_status['activityDetail'] is legal. Do we know if it's at least accepted in railsAPI/controller? (The documentation doesn't mention it)
Since a-c-r never posts 'activity' status I just took it out.
- The warning message seems to me a little too wordy. I was thinking that we could have an indexed documentation page where to point the user for broader explanations of the summarized messages that we display in WB2's UI. Food for thought, not sure if it should apply to this story.
I cut the text back to "Container may have been killed for using too much RAM. Try resubmitting with a higher 'ramMin'."
- At executor.py:
- Line 264: That comment seems to be outdated now.
- Line 268: There's a trailing semicolon.
Fixed
- If we're going to use runtime_status as some sort of logging store (as I understand, any error/warning will be appended to this field) we'll need to think how to handle long texts on WB2.
I added a 40 line limit to details.
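As a rough illustration of a 40 line limit on detail text, here is a sketch using a hypothetical `truncate_detail` helper. Whether the real change keeps the first or the last lines is not stated in this ticket; keeping the last lines is an assumption made here because the end of a log usually contains the failure.

```python
MAX_DETAIL_LINES = 40  # limit mentioned in the ticket


def truncate_detail(text, max_lines=MAX_DETAIL_LINES):
    """Keep at most max_lines lines of text (hypothetical helper).

    Assumption: the trailing lines are kept and earlier ones are dropped.
    """
    lines = text.splitlines()
    if len(lines) <= max_lines:
        return text
    return "\n".join(lines[-max_lines:])
```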
17301-cwl-oom @ 332b0d1b4a9095f4e43893ec741f901b74b36ceb
developer-run-tests: #3071
This was annoying because it wasn't failing for me locally.
I fixed up the test cases to make sure RuntimeStatusLoggingHandler gets removed from the global logger.
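The kind of cleanup described here can be sketched as follows. `ListHandler` and `run_with_handler` are hypothetical stand-ins, not the actual RuntimeStatusLoggingHandler or the project's test code. The point is that a handler attached to a shared logger is removed in a `finally` block, so it cannot leak into later tests even when the code under test raises.

```python
import logging


class ListHandler(logging.Handler):
    """Stand-in handler that records formatted messages in a list."""

    def __init__(self):
        super().__init__()
        self.records = []

    def emit(self, record):
        self.records.append(record.getMessage())


def run_with_handler(logger, handler, func):
    """Attach handler, call func(), and always detach the handler."""
    logger.addHandler(handler)
    try:
        return func()
    finally:
        # Runs on success and on exception, so no handler is left behind.
        logger.removeHandler(handler)
```

A leftover handler on a global logger is shared process state, so whether a test passes can depend on which tests ran before it.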
developer-run-tests: #3072
- Status changed from In Progress to Resolved