Bug #23196
closed
Should track instance capacity state separately for preemptible and non-preemptible instance types
Description
Currently, if arvados-dispatch-cloud gets a capacity error while trying to create a preemptible a1.large instance, it stops trying to create any a1.large instances, preemptible or non-preemptible, for one minute.
"At capacity" state should be tracked separately for preemptible and non-preemptible instance types.
Updated by Tom Clegg 6 months ago
23196-preemptible-capacity @ 895af9eaf2a9c9c0bb5ccfd5dc2e70846e5ad319 -- developer-run-tests: #4902
(workbench tests failed)
Updated by Tom Clegg 6 months ago
- Related to Bug #22017: a-d-c needs to handle different quotas for difference instance types added
Updated by Tom Clegg 6 months ago
- Related to Bug #23178: a-d-c is treating rate limit errors as capacity issues, should not added
Updated by Brett Smith 6 months ago
Tom Clegg wrote in #note-1:
23196-preemptible-capacity @ 895af9eaf2a9c9c0bb5ccfd5dc2e70846e5ad319 -- developer-run-tests: #4902
This is fine. I do wonder if it would be nicer for maintainability to add a method to arvados.InstanceType that returns a unique key on all the axes we care about. That way we could extend the logic in the future if needed without tracking down all the places the logic is duplicated here. But I don't feel too strongly about it at this point.
(workbench tests failed)
Known issue #23180.
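The key method suggested above could look roughly like the following Go sketch. It uses a local stand-in struct, not the real arvados.InstanceType, and the method name `CapacityKey` and the choice of fields in the key are assumptions made for illustration. The idea is that every cache derives its key from one place, so adding another axis later means changing only this method.

```go
package main

import "fmt"

// instanceType is a local stand-in for illustration only. Its fields
// are assumptions and may not match the real arvados.InstanceType.
type instanceType struct {
	Name         string
	ProviderType string
	Preemptible  bool
}

// capacityKey holds every axis on which capacity state is tracked.
// It is comparable, so it can be used directly as a map key.
type capacityKey struct {
	ProviderType string
	Preemptible  bool
}

// CapacityKey is a hypothetical method: the single place that decides
// which instance types share "at capacity" state.
func (it instanceType) CapacityKey() capacityKey {
	return capacityKey{ProviderType: it.ProviderType, Preemptible: it.Preemptible}
}

func main() {
	spot := instanceType{Name: "a1.large.spot", ProviderType: "a1.large", Preemptible: true}
	onDemand := instanceType{Name: "a1.large", ProviderType: "a1.large", Preemptible: false}

	// Two independent caches, standing in for the worker pool and the
	// scheduler, both keyed the same way.
	poolAtCapacity := map[capacityKey]bool{}
	schedulerAtCapacity := map[capacityKey]bool{}

	poolAtCapacity[spot.CapacityKey()] = true
	schedulerAtCapacity[spot.CapacityKey()] = true

	fmt.Println(poolAtCapacity[onDemand.CapacityKey()], schedulerAtCapacity[onDemand.CapacityKey()]) // false false
	fmt.Println(poolAtCapacity[spot.CapacityKey()], schedulerAtCapacity[spot.CapacityKey()])         // true true
}
```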
Updated by Tom Clegg 6 months ago
- Status changed from In Progress to Resolved
Applied in changeset arvados|ced8535a60991b2d437243ba3cd06c67e0679150.
Updated by Tom Clegg 6 months ago
Previous commit fixed worker pool, but scheduler also has its own cache, which was still conflating preemptible and non-preemptible types.
23196-preemptible-capacity @ 25d5c32f947fc5115be9d348413af5ccc90f48fe -- developer-run-tests: #4904
Updated by Brett Smith 6 months ago
Tom Clegg wrote in #note-7:
23196-preemptible-capacity @ 25d5c32f947fc5115be9d348413af5ccc90f48fe -- developer-run-tests: #4904
Assuming tests pass, LGTM.
Updated by Tom Clegg 6 months ago
- Status changed from In Progress to Resolved
Applied in changeset arvados|71424c7aed9d7b760404127dca9692672a3e5e1a.