Bug #18346

closed

Login federation: request storm overwhelming login cluster rails api server

Added by Peter Amstutz about 3 years ago. Updated about 3 years ago.

Status: Resolved
Priority: Normal
Assigned To:
Category: -
Target version:
Start date: 11/10/2021
Due date:
% Done: 100%
Estimated time: (Total: 0.00 h)
Story points: -
Release relationship: Auto

Description

A customer has seen this behavior in 2 different scenarios:

a) when a user used an old token that had been issued by a local cluster before the migration to a login federation. Both the local cluster and the login cluster were on Arvados 2.2.2.
b) when a big workflow was run on a 2.3.0 cluster with the login cluster on 2.2.2.

Case b) appears to be a 2.3 regression: the workflow that triggered the outage is a re-run of one that did not cause problems on Arvados 2.2.x (or possibly older; that is not clear).

The requests that end up at the login cluster api server have a specific request parameter pattern (include_trash=true&select=[uuid]). They seem to be user and collection requests.
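To pick these requests out of an access log, the parameter pattern can be checked mechanically. The following is a minimal sketch, not part of Arvados: the function name `matches_storm_pattern` and the example request paths are hypothetical, and it assumes `select` may arrive either literally as `[uuid]` or as a URL-encoded JSON array.

```python
from urllib.parse import urlsplit, parse_qs


def matches_storm_pattern(request_target: str) -> bool:
    """Return True if the request carries include_trash=true and selects only uuid.

    Hypothetical helper for log triage; the exact encoding of `select`
    in real traffic is an assumption.
    """
    query = parse_qs(urlsplit(request_target).query)
    if query.get("include_trash") != ["true"]:
        return False
    select_values = query.get("select", [])
    if len(select_values) != 1:
        return False
    # Accept both `[uuid]` and a decoded JSON array such as `["uuid"]`.
    fields = [f.strip(" \"'") for f in select_values[0].strip("[]").split(",")]
    return fields == ["uuid"]
```

Applied to each request line of the login cluster's API server log, a check like this would separate the storm traffic from ordinary requests before counting them per resource type.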

The collection requests seem to be for log collections (presumably the ones the workflow steps are writing to).

The requests all get a 401 response from the login cluster api server, but this does not appear to impede the running of the big workflow on the local cluster.

The customer implemented a workaround: greatly increasing the number of passenger workers on the login cluster api server made it able to handle many more concurrent requests (and return a 401 for them), which avoids the overload death spiral when clients retry.
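Why extra workers help can be illustrated with a toy model. This is a sketch under simplifying assumptions, not a measurement of the real system: requests arrive at a fixed rate per tick, the server answers at most `capacity_per_tick` of them (with a 401, which is assumed not to be retried), and every unanswered request is retried on the next tick. All names and numbers are hypothetical.

```python
def simulate(arrivals_per_tick: int, capacity_per_tick: int, ticks: int) -> list[int]:
    """Return the number of unanswered (retrying) requests after each tick."""
    backlog = 0
    history = []
    for _ in range(ticks):
        # Offered load is the new requests plus retries of unanswered ones.
        offered = arrivals_per_tick + backlog
        answered = min(offered, capacity_per_tick)
        backlog = offered - answered
        history.append(backlog)
    return history


# Capacity below the arrival rate: the retry backlog grows every tick.
print(simulate(100, 60, 5))   # [40, 80, 120, 160, 200]
# Capacity above the arrival rate: every request gets its 401 and nothing piles up.
print(simulate(100, 150, 5))  # [0, 0, 0, 0, 0]
```

In this model the backlog only stops growing once capacity exceeds the arrival rate, which is consistent with the workaround taking effect once the worker count was raised far enough.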


Subtasks 4 (0 open, 4 closed)

Task #18351: Review 18346-container-token (Resolved, Tom Clegg, 11/10/2021)
Task #18365: build 2.3.1~rc1 with bugfix (Resolved, Peter Amstutz, 11/10/2021)
Task #18366: Review 18346-crunchrun-no-events (Resolved, Peter Amstutz, 11/10/2021)
Task #18373: merge fixes into 2.3.1 (Resolved, 11/10/2021)

Related issues 1 (0 open, 1 closed)

Related to Arvados - Bug #18887: [federation] wb1 fiddlesticks in login federation (Resolved, Ward Vandewege, 03/25/2022)

#1

Updated by Peter Amstutz about 3 years ago

  • Status changed from New to In Progress
#2

Updated by Peter Amstutz about 3 years ago

  • Description updated (diff)
#3

Updated by Ward Vandewege about 3 years ago

  • Subject changed from Request storm overwhelming federation to Request storm overwhelming login federation
#4

Updated by Ward Vandewege about 3 years ago

  • Subject changed from Request storm overwhelming login federation to Login federation: request storm overwhelming login cluster rails api server
#5

Updated by Ward Vandewege about 3 years ago

  • Description updated (diff)
#6

Updated by Ward Vandewege about 3 years ago

  • Description updated (diff)
#8

Updated by Ward Vandewege about 3 years ago

  • Description updated (diff)
#11

Updated by Ward Vandewege about 3 years ago

  • Release set to 45
#12

Updated by Peter Amstutz about 3 years ago

  • Target version changed from 2021-11-10 sprint to 2021-11-24 sprint
#13

Updated by Peter Amstutz about 3 years ago

  • Assigned To set to Tom Clegg
#16

Updated by Tom Clegg about 3 years ago

  • Description updated (diff)
#18

Updated by Tom Clegg about 3 years ago

  • Project changed from 35 to Arvados
#27

Updated by Peter Amstutz about 3 years ago

  • Status changed from In Progress to Resolved
#28

Updated by Tom Clegg almost 3 years ago

  • Related to Bug #18887: [federation] wb1 fiddlesticks in login federation added