Bug #17926
closed: [controller] lib/pq 1.3.0 does not handle stale db connections properly (Aurora RDS)
Start date: 07/20/2021
Due date:
% Done: 100%
Estimated time: (Total: 0.00 h)
Story points: -
Release:
Release relationship: Auto
Description
Context: Arvados cluster with Aurora RDS as db backend.
Symptom: After the cluster has been idle for a while, a fresh login fails with a "broken pipe" error. The logs say
{"PID":14505,"RequestID":"req-22mvdy7j9r6di9xzn6os","level":"info","msg":"response","remoteAddr":"127.0.0.1:47966","reqBytes":38,"reqForwardedFor":"1.2.3.4","reqHost":"somewhere.over.the.rainbow","reqMethod":"POST","reqPath":"arvados/v1/users/authenticate","reqQuery":"","respBody":"{\"errors\":[\"write tcp 9.1.2.3:57210-\\u003e5.6.7.8:5432: write: broken pipe\"]}\n","respBytes":91,"respStatus":"Internal Server Error","respStatusCode":500,"time":"2021-07-20T15:57:14.887346237Z","timeToStatus":0.177528,"timeTotal":0.177538,"timeWriteBody":0.000018}
Likely cause: a bug in `lib/pq`, as described here: https://blog.bossylobster.com/2020/12/broken-pipe.html
The fix has been merged and is available in version 1.10.0 and up, but we are on version 1.3.0.
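Until the dependency is upgraded, a caller-side stopgap could detect the broken-pipe error and retry once on a fresh connection. The Go sketch below is only an illustration under that assumption: it is not the upstream `lib/pq` fix and not necessarily what the controller does. The names `isBrokenPipe`, `retryOnBrokenPipe`, and `simulatedBrokenPipe` are hypothetical, and the database call is replaced by a stand-in function so the example runs without a database.

```go
package main

import (
	"errors"
	"fmt"
	"net"
	"os"
	"syscall"
)

// isBrokenPipe reports whether err wraps EPIPE, the errno behind
// "write: broken pipe". (Hypothetical helper, not part of lib/pq.)
func isBrokenPipe(err error) bool {
	return errors.Is(err, syscall.EPIPE)
}

// retryOnBrokenPipe runs op up to attempts times. It retries only when
// op fails with a broken pipe; any other result is returned immediately.
// With attempts <= 0, op is never run and nil is returned.
func retryOnBrokenPipe(attempts int, op func() error) error {
	var err error
	for i := 0; i < attempts; i++ {
		err = op()
		if err == nil || !isBrokenPipe(err) {
			return err
		}
	}
	return err
}

// simulatedBrokenPipe builds an error shaped like the one in the log
// above: a TCP write failing with EPIPE. It stands in for a real query
// on a stale connection.
func simulatedBrokenPipe() error {
	return &net.OpError{
		Op:  "write",
		Net: "tcp",
		Err: os.NewSyscallError("write", syscall.EPIPE),
	}
}

func main() {
	calls := 0
	err := retryOnBrokenPipe(2, func() error {
		calls++
		if calls == 1 {
			return simulatedBrokenPipe() // first attempt hits the stale connection
		}
		return nil // second attempt succeeds
	})
	fmt.Println(calls, err) // prints: 2 <nil>
}
```

Retrying is only safe here on the assumption that a failed write means the statement never reached the server. Capping idle connection age with `database/sql`'s `(*sql.DB).SetConnMaxIdleTime` (Go 1.15 and later) may also reduce how often stale connections are picked up, but upgrading `lib/pq` remains the actual fix.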