Bug #19429

open

Somehow detect that scratch space failed to mount

Added by Peter Amstutz over 2 years ago. Updated almost 2 years ago.

Status: New
Priority: Normal
Assigned To: -
Category: Crunch
Target version: -
Start date:
Due date:
% Done: 0%
Estimated time:
Description

A failure mode with cloud compute nodes is that the scratch space can fail to mount due to misconfiguration. When this happens, jobs write to the much smaller root filesystem instead and may mysteriously run out of disk space, which can be very difficult to debug.

Maybe "arvados-server cloudtest" could include a check that an expected amount of disk space is available, or even check on AWS that we have the right IAM permissions.
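
As a rough illustration of the kind of check proposed above, here is a minimal Python sketch. It is not part of "arvados-server cloudtest" and does not reflect its actual interface; the function name `check_scratch`, its parameters, and the example path are all hypothetical. It only covers the disk-space side of the idea (is the directory a mount point, and does it have the expected free space), not the IAM permission check.

```python
import os
import shutil


def check_scratch(path, min_free_bytes, require_mount=True):
    """Return a list of problems found with the scratch directory at `path`.

    Hypothetical helper for illustration only. An empty list means every
    check passed.
    """
    if not os.path.isdir(path):
        return [f"{path} does not exist or is not a directory"]

    problems = []

    # If the scratch volume failed to mount, `path` is just an ordinary
    # directory on the root filesystem, so os.path.ismount() returns False.
    if require_mount and not os.path.ismount(path):
        problems.append(f"{path} is not a mount point")

    # Compare the free space actually available against what we expect
    # the scratch volume to provide.
    free = shutil.disk_usage(path).free
    if free < min_free_bytes:
        problems.append(
            f"{path} has {free} bytes free, expected at least {min_free_bytes}"
        )

    return problems
```

A test run on a freshly booted node might call something like `check_scratch("/tmp", 100 * 2**30)` (the path and the 100 GiB threshold are assumptions) and fail the node if the returned list is non-empty. Checking free space in addition to the mount point catches the case where something is mounted but it is the wrong, smaller volume.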

Actions #1

Updated by Peter Amstutz over 2 years ago

  • Description updated

Actions #2

Updated by Ward Vandewege almost 2 years ago

  • Release set to 69