Backlogs error: Couldn't find RbRelease with 'id'=69 (ActiveRecord::RecordNotFound)
Actions
Bug #19429
openSomehow detect that scratch space failed to mount
Status:
New
Priority:
Normal
Assigned To:
-
Category:
Crunch
Target version:
-
Start date:
Due date:
% Done:
0%
Estimated time:
Description
A failure mode with cloud compute nodes is that the scratch space can fail to mount due to misconfiguration. When this happens, jobs may mysteriously run out of disk space and it can be very difficult to debug.
Maybe "arvados-server cloudtest" could include a check that an expected amount of disk space is available, or even check on AWS that we have the right IAM permissions.
Actions