Bug #9688
[Crunch2] Limit number of dispatch attempts per container
Status:
Resolved
Priority:
Normal
Assigned To:
-
Category:
-
Target version:
-
Start date:
08/02/2016
Due date:
% Done:
0%
Estimated time:
Story points:
-
Description
Problem
There are circumstances where crunch-dispatch-* tries to run a container, but something fails before the container gets to Running state, so the container goes back to Queued state. See #9679.
If the same problem keeps happening, the container just flaps between Queued and Locked.
After a certain amount of time, or number of retries, we should really just give up and cancel the container.
Proposed solutions
Pick one of:
- (crunch-dispatch-*) If a single container gets dispatched more than N times (over a period of at least M seconds) by a single crunch-dispatch-* process, but still won't run, give up and change state to Cancelled.
- (API server) If a container has been Locked and returned to Queued state, and is more than M seconds old, cancel it.
- Introduce a delay between "return container X to queue" and "re-attempt X".
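As a rough illustration of the first and third options combined, here is a minimal sketch in Python. It is not taken from the crunch-dispatch-* code: the `DispatchLimiter` class, its method names, and the "run" / "wait" / "cancel" return values are all hypothetical, and the clock is passed in explicitly so the behaviour is easy to check. It assumes the dispatcher keeps per-container state in memory for the lifetime of a single process.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class AttemptRecord:
    first_attempt: float  # time of the first dispatch attempt, in seconds
    count: int = 0        # number of dispatch attempts recorded so far


@dataclass
class DispatchLimiter:
    """Hypothetical helper: tracks dispatch attempts per container UUID."""
    max_attempts: int   # N: attempts allowed before giving up
    min_period: float   # M: minimum seconds that must elapse before giving up
    retry_delay: float  # delay between "return to queue" and "re-attempt"
    records: Dict[str, AttemptRecord] = field(default_factory=dict)
    not_before: Dict[str, float] = field(default_factory=dict)

    def next_action(self, uuid: str, now: float) -> str:
        """Decide what to do with a Queued container: 'run', 'wait', or 'cancel'."""
        # Option 3: hold off until the retry delay has passed.
        if now < self.not_before.get(uuid, 0.0):
            return "wait"
        rec = self.records.setdefault(uuid, AttemptRecord(first_attempt=now))
        rec.count += 1
        # Option 1: more than N attempts over a period of at least M seconds.
        too_many = rec.count > self.max_attempts
        long_enough = now - rec.first_attempt >= self.min_period
        if too_many and long_enough:
            return "cancel"
        return "run"

    def returned_to_queue(self, uuid: str, now: float) -> None:
        """Call when a Locked container falls back to Queued."""
        self.not_before[uuid] = now + self.retry_delay

    def started(self, uuid: str) -> None:
        """Call when the container reaches Running; forget its history."""
        self.records.pop(uuid, None)
        self.not_before.pop(uuid, None)
```

With `DispatchLimiter(max_attempts=3, min_period=60, retry_delay=10)`, a container that keeps failing would be retried no sooner than 10 seconds after each return to the queue, and would only be cancelled once it has been attempted more than 3 times and at least 60 seconds have passed since its first attempt. Requiring both conditions avoids cancelling a container that merely hit a short burst of transient failures.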
Updated by Peter Amstutz almost 6 years ago
- Related to Bug #11561: [API] Limit number of lock/unlock cycles for a given container added
Updated by Tom Morris almost 6 years ago
- Priority changed from Normal to High
- Target version set to To Be Groomed
Updated by Tom Morris almost 6 years ago
- Related to Bug #14540: [API] Limit number of container lock/unlock cycles added
Updated by Peter Amstutz almost 6 years ago
- Related to deleted (Bug #14540: [API] Limit number of container lock/unlock cycles)
Updated by Peter Amstutz almost 6 years ago
- Has duplicate Bug #14540: [API] Limit number of container lock/unlock cycles added
Updated by Peter Amstutz almost 6 years ago
- Related to deleted (Bug #11561: [API] Limit number of lock/unlock cycles for a given container)
Updated by Peter Amstutz almost 6 years ago
- Has duplicate Bug #11561: [API] Limit number of lock/unlock cycles for a given container added
Updated by Peter Amstutz almost 6 years ago
- Status changed from New to Duplicate
Updated by Tom Clegg over 3 years ago
- Status changed from Duplicate to Resolved
- Priority changed from High to Normal