Bug #14804
closed
[keepstore] Return 5xx (not 4xx) if block is not found due to transient backend device failure
Added by Tom Clegg almost 6 years ago.
Updated almost 6 years ago.
Release relationship:
Auto
Description
Currently, when keepstore is trying to read a block, if one Azure-backed volume encounters a 503 error and all other volumes return 404, keepstore returns 404 to its client. This is a non-retryable error so the client will give up.
The correct behavior is to return a 502 or 503 status in this situation.
Azure error message:
storage: service returned error: StatusCode=503, ErrorCode=ServerBusy, ErrorMessage=The server is busy.
- Target version changed from Arvados Future Sprints to 2019-02-27 Sprint
- Assigned To set to Lucas Di Pentima
- Status changed from New to In Progress
Updates at 601764a10 - branch 14804-keepstore-transient-backend-errors
Test run: https://ci.curoverse.com/job/developer-run-tests/1082/
Previously, when a block was requested and keepstore got errors from all of its volumes, it returned 404 to the client no matter which errors the volumes had returned.
Now, when keepstore receives a VolumeBusyError (a transient error) from a volume backend, it returns a 503 status so that the client can retry instead of mistakenly concluding that the block does not exist.
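The status selection described above can be sketched in Go roughly as follows. This is an illustration only, not the code from the branch: errVolumeBusy and errNotFound are hypothetical stand-ins for keepstore's VolumeBusyError and its not-found error, and getStatus is an assumed helper name.

```go
package main

import (
	"errors"
	"fmt"
	"net/http"
)

// Hypothetical sentinel errors standing in for keepstore's real error values.
var (
	errVolumeBusy = errors.New("volume busy") // transient backend failure
	errNotFound   = errors.New("not found")   // block is absent from this volume
)

// getStatus picks the HTTP status for a read that failed on every volume.
// Each element of volErrs is the error returned by one volume.
func getStatus(volErrs []error) int {
	for _, err := range volErrs {
		// One busy volume is enough: the block may live there, so the
		// client should be told to retry rather than give up.
		if errors.Is(err, errVolumeBusy) {
			return http.StatusServiceUnavailable
		}
	}
	return http.StatusNotFound
}

func main() {
	// One volume is busy (e.g. Azure ServerBusy), the others report not found.
	errs := []error{errNotFound, fmt.Errorf("azure: %w", errVolumeBusy), errNotFound}
	fmt.Println(getStatus(errs)) // prints 503
}
```

Using errors.Is means a busy error is still recognized after a backend driver wraps it with extra context.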
Small nitpick: I would update the comment for TestGetHandler to include your test scenario. Otherwise, LGTM.
- Status changed from In Progress to Resolved
- Related to Bug #15118: [keepstore] Return 5xx (not 4xx) if block is not found due to transient backend device failure added