Project

General

Profile

Actions

Bug #13630

open

locking api_client_authorizations for update can deadlock ContainersController#update

Added by Joshua Randall over 6 years ago. Updated almost 2 years ago.

Status:
New
Priority:
Normal
Assigned To:
-
Category:
API
Target version:
-
Start date:
Due date:
% Done:

0%

Estimated time:
Story points:
-
Release:
Release relationship:
Auto

Description

We've observed an apparent deadlock (though not detected by postgres) when a containers update request (ContainersController#update ) calls assign_auth which results in an update_attributes call on api_client_authorizations.

When this happened, the database ended up with a number of queries of the form:

UPDATE "api_client_authorizations" SET "expires_at" = $1, "updated_at" = $2 WHERE "api_client_authorizations"."id" = $3

All of which were waiting on row and/or tuple exclusive locks (i.e. none of those queries were actively running - they were ALL waiting). We let that go for over 45m and saw every new query get stuck in the same situation before we manually killed all of the waiting postgres processes to clear them out.

Partial stack trace (from new relic) leading up to the stuck query:

…/active_record/mass_assignment_security/
persistence.rb:  65:in `block in update'
…/active_record/mass_assignment_security/
persistence.rb:  63:in `update'
…api-db/arvados-api/current/app/models/
arvados_model.rb: 125:in `update_attributes'
…dos-api-db/arvados-api/current/app/models/
container.rb: 492:in `assign_auth'
…model-4.2.10/lib/active_model/validations/
callbacks.rb: 113:in `run_validations!'
…ems/activemodel-4.2.10/lib/active_model/
validations.rb: 338:in `valid?'
…/active_record/mass_assignment_security/
persistence.rb:  77:in `block in update!'
…/active_record/mass_assignment_security/
persistence.rb:  75:in `update!'
…-api/current/app/controllers/
application_controller.rb: 119:in `update'
…nt/app/controllers/arvados/v1/
containers_controller.rb:  30:in `block in update'
…nt/app/controllers/arvados/v1/
containers_controller.rb:  27:in `update'
…-api/current/app/controllers/
application_controller.rb: 398:in `block in set_current_request_id'
…-api/current/app/controllers/
application_controller.rb: 397:in `set_current_request_id'
….0/gems/actionview-4.2.10/lib/action_view/
rendering.rb:  30:in `process'
…uby/2.3.0/gems/omniauth-1.4.3/lib/omniauth/
strategy.rb: 186:in `call!'
…uby/2.3.0/gems/omniauth-1.4.3/lib/omniauth/
strategy.rb: 164:in `call'
…ruby/2.3.0/gems/omniauth-1.4.3/lib/omniauth/
builder.rb:  63:in `call'
…or_bundle/ruby/2.3.0/gems/rack-1.6.10/lib/rack/
etag.rb:  24:in `call'
…ruby/2.3.0/gems/rack-1.6.10/lib/rack/
conditionalget.rb:  38:in `call'
…or_bundle/ruby/2.3.0/gems/rack-1.6.10/lib/rack/
head.rb:  13:in `call'
…2.3.0/gems/rack-1.6.10/lib/rack/session/abstract/
id.rb: 225:in `context'
…2.3.0/gems/rack-1.6.10/lib/rack/session/abstract/
id.rb: 220:in `call'
…ms/lograge-0.10.0/lib/lograge/rails_ext/rack/
logger.rb:  15:in `call_app'
…ms/request_store-1.4.1/lib/request_store/
middleware.rb:  19:in `call'
…ruby/2.3.0/gems/rack-1.6.10/lib/rack/
methodoverride.rb:  22:in `call'
…bundle/ruby/2.3.0/gems/rack-1.6.10/lib/rack/
runtime.rb:  18:in `call'
…undle/ruby/2.3.0/gems/rack-1.6.10/lib/rack/
sendfile.rb: 113:in `call'
…vados-api/current/app/middlewares/
arvados_api_token.rb:  63:in `call'
…uby/phusion_passenger/rack/
thread_handler_extension.rb:  97:in `process_request'
…by/phusion_passenger/request_handler/
thread_handler.rb: 157:in `accept_and_process_next_request'
…by/phusion_passenger/request_handler/
thread_handler.rb: 110:in `main_loop'
…/ruby/vendor_ruby/phusion_passenger/
request_handler.rb: 415:in `block (3 levels) in start_threads'
   /usr/lib/ruby/vendor_ruby/phusion_passenger/
utils.rb: 113:in `block in create_thread_and_abort_on_exception'

Actions #1

Updated by Joshua Randall over 6 years ago

We have mitigated this issue by adding a timeout on all postgres statements - there is no sense in the API server waiting for postgres statements for longer than the request timeout!

We modified the `initialize` method in `models/arvados_model.rb` so that it looks like this:

  def initialize raw_params={}, *args
    ActiveRecord::Base.connection.execute('SET statement_timeout = 300000')
    super(self.class.permit_attribute_params(raw_params), *args)
  end

Actions #4

Updated by Lucas Di Pentima almost 2 years ago

  • Release set to 60
Actions

Also available in: Atom PDF