Bug #11901
[arvados-ws] Fix leaking postgres connections and subsequent stall
Status: closed
Start date: 06/26/2017
Due date:
% Done: 100%
Estimated time: (Total: 0.00 h)
Story points: -
Description
Occasionally arvados-ws reaches its database connection pool limit and stops responding.
- Fix leaking connections
- Report something helpful in debug.json, like how many connections are in use and what for (expect 1 for listener, ≤1 per server queue slot, and 1 per client connection doing "sendOldEvents")
- Add a health check that fails when we're at connection pool limit
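A minimal sketch of the third item, assuming the pool is a database/sql *sql.DB whose Stats() snapshot is available. The names poolStatus, dbHealthHandler and probePool are hypothetical and not taken from arvados-ws, and the 503 response is an assumption about what "fails" should mean:

```go
package main

import (
	"database/sql"
	"encoding/json"
	"fmt"
	"net/http"
	"net/http/httptest"
)

// poolStatus returns 503 when every allowed connection is in use and 200
// otherwise. max <= 0 means "no limit", as with (*sql.DB).SetMaxOpenConns.
func poolStatus(inUse, max int) int {
	if max > 0 && inUse >= max {
		return http.StatusServiceUnavailable
	}
	return http.StatusOK
}

// dbHealthHandler is a hypothetical health check. With a real pool, stats
// would be the Stats method of a *sql.DB.
func dbHealthHandler(stats func() sql.DBStats) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		s := stats()
		w.Header().Set("Content-Type", "application/json")
		w.WriteHeader(poolStatus(s.InUse, s.MaxOpenConnections))
		json.NewEncoder(w).Encode(map[string]int{
			"in_use": s.InUse,
			"max":    s.MaxOpenConnections,
		})
	})
}

// probePool runs the handler once against fixed stats and returns the
// HTTP status code it produced.
func probePool(inUse, max int) int {
	h := dbHealthHandler(func() sql.DBStats {
		return sql.DBStats{InUse: inUse, MaxOpenConnections: max}
	})
	rec := httptest.NewRecorder()
	h.ServeHTTP(rec, httptest.NewRequest("GET", "/_health/db", nil))
	return rec.Code
}

func main() {
	fmt.Println(probePool(3, 10))  // 200
	fmt.Println(probePool(10, 10)) // 503
}
```

The check uses InUse rather than OpenConnections because idle connections are still available to new queries, so an open-but-idle pool is not exhausted.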
Updated by Tom Clegg over 7 years ago
11901-ws-db-conns @ c5a8ad7751e13560a6cde34395ea76f380c8a80d
- fix an unclosed "rows" object
- add authenticated /_health/ping and /_health/db handlers
- add the number of open db connections to /debug.json
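For illustration, the kind of leak an unclosed "rows" object causes can be reproduced with database/sql alone. This is a sketch, not the arvados-ws code: fakeDriver is a hypothetical stand-in for a Postgres driver so the example runs without a database, and firstLeaky, firstClosed and inUseAfter are made-up names. Returning before the rows are exhausted and without calling rows.Close() leaves the connection checked out of the pool:

```go
package main

import (
	"database/sql"
	"database/sql/driver"
	"fmt"
	"io"
)

// Fake driver (assumption, for demonstration only): every query returns
// a single row containing the integer 1.
type fakeDriver struct{}
type fakeConn struct{}
type fakeStmt struct{}
type fakeRows struct{ done bool }

func (fakeDriver) Open(string) (driver.Conn, error)         { return fakeConn{}, nil }
func (fakeConn) Prepare(string) (driver.Stmt, error)        { return fakeStmt{}, nil }
func (fakeConn) Close() error                               { return nil }
func (fakeConn) Begin() (driver.Tx, error)                  { return nil, driver.ErrSkip }
func (fakeStmt) Close() error                               { return nil }
func (fakeStmt) NumInput() int                              { return 0 }
func (fakeStmt) Exec([]driver.Value) (driver.Result, error) { return nil, driver.ErrSkip }
func (fakeStmt) Query([]driver.Value) (driver.Rows, error)  { return &fakeRows{}, nil }
func (*fakeRows) Columns() []string                         { return []string{"n"} }
func (*fakeRows) Close() error                              { return nil }
func (r *fakeRows) Next(dest []driver.Value) error {
	if r.done {
		return io.EOF
	}
	r.done = true
	dest[0] = int64(1)
	return nil
}

func init() { sql.Register("fake-leakdemo", fakeDriver{}) }

// firstLeaky reads the first row and returns without closing rows.
// BUG: the connection stays checked out of the pool.
func firstLeaky(db *sql.DB) (int, error) {
	rows, err := db.Query("SELECT 1")
	if err != nil {
		return 0, err
	}
	var n int
	if rows.Next() {
		err = rows.Scan(&n)
	}
	return n, err
}

// firstClosed does the same work but always releases the connection.
func firstClosed(db *sql.DB) (int, error) {
	rows, err := db.Query("SELECT 1")
	if err != nil {
		return 0, err
	}
	defer rows.Close()
	var n int
	if rows.Next() {
		if err := rows.Scan(&n); err != nil {
			return 0, err
		}
	}
	return n, rows.Err()
}

// inUseAfter runs query against a fresh pool and reports how many
// connections are still in use afterwards (-1 on error).
func inUseAfter(query func(*sql.DB) (int, error)) int {
	db, err := sql.Open("fake-leakdemo", "")
	if err != nil {
		return -1
	}
	defer db.Close()
	if _, err := query(db); err != nil {
		return -1
	}
	return db.Stats().InUse
}

func main() {
	fmt.Println(inUseAfter(firstLeaky))  // 1: connection leaked
	fmt.Println(inUseAfter(firstClosed)) // 0: connection returned to the pool
}
```

The same db.Stats() snapshot is one plausible source for a connection count in /debug.json.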
Updated by Tom Clegg over 7 years ago
The health-check specs here (authentication, URLs, responses) are the ones Nico and I developed last week based on existing conventions and ease of integration with consul, nagios, etc. I've since written them up on #11906.
Updated by Radhika Chippada over 7 years ago
- Should "/_health/ping" and "/_health/db" also check if the ManagementToken is configured and bearer token matches? (I could not tell if this was already the case ...)
- Would it make sense to test when management token is not configured as well (disabled)?
if rtr.Config.ManagementToken == "" { http.Error(w, "disabled", http.StatusNotFound) }
Updated by Tom Clegg over 7 years ago
Radhika Chippada wrote:
> - Should "/_health/ping" and "/_health/db" also check if the ManagementToken is configured and bearer token matches? (I could not tell if this was already the case ...)
Yes, mgmtAuth() covers the http.ServeMux handling /_health/ so the individual handlers don't need to re-check.
> - Would it make sense to test when management token is not configured as well (disabled)?
Yes, added this test.
11901-ws-db-conns @ 5c860fdbf28128e7d11a9dff8b5c30777c2cbfeb
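The arrangement described in this reply, one auth wrapper around the whole /_health/ mux, might look roughly like the sketch below. Only the 404 "disabled" response comes from the snippet quoted in this thread. The 401 and 403 responses, the function signature, the ping response body, and the names newHealthMux and probeAuth are assumptions made for illustration:

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
)

// mgmtAuth wraps h so every request is checked against the management
// token before any individual handler runs.
func mgmtAuth(managementToken string, h http.Handler) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		auth := r.Header.Get("Authorization")
		switch {
		case managementToken == "":
			// No token configured: health checks are disabled.
			http.Error(w, "disabled", http.StatusNotFound)
		case auth == "":
			// Assumed response for a missing Authorization header.
			http.Error(w, "authorization required", http.StatusUnauthorized)
		case auth != "Bearer "+managementToken:
			// Assumed response for a non-matching bearer token.
			http.Error(w, "authorization error", http.StatusForbidden)
		default:
			h.ServeHTTP(w, r)
		}
	})
}

// newHealthMux registers a health handler on a ServeMux and wraps the
// whole mux, so the handler itself does not re-check the token.
func newHealthMux(token string) http.Handler {
	mux := http.NewServeMux()
	mux.HandleFunc("/_health/ping", func(w http.ResponseWriter, r *http.Request) {
		fmt.Fprintln(w, `{"health":"OK"}`) // response body is an assumption
	})
	return mgmtAuth(token, mux)
}

// probeAuth sends one request and returns the resulting status code.
func probeAuth(configured, header string) int {
	req := httptest.NewRequest("GET", "/_health/ping", nil)
	if header != "" {
		req.Header.Set("Authorization", header)
	}
	rec := httptest.NewRecorder()
	newHealthMux(configured).ServeHTTP(rec, req)
	return rec.Code
}

func main() {
	fmt.Println(probeAuth("", "Bearer s3cret"))       // 404
	fmt.Println(probeAuth("s3cret", ""))              // 401
	fmt.Println(probeAuth("s3cret", "Bearer wrong"))  // 403
	fmt.Println(probeAuth("s3cret", "Bearer s3cret")) // 200
}
```

The first probeAuth call corresponds to the "management token not configured" test case discussed above.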
Updated by Tom Clegg over 7 years ago
- Status changed from In Progress to Resolved
- % Done changed from 0 to 100
Applied in changeset arvados commit 8051c3a14d40f0d410e4ddf54d89a084475d807e.