Bug #7235
Updated by Tom Clegg about 9 years ago
h2. Background

The Python Keep client sets a 300-second timeout to complete all requests. There are some real-world scenarios where this is too strict. For example, purely hypothetically, an Arvados developer might be working across an ocean, tethered through a cellular network. Everything will complete just fine, but whole 64MiB blocks might not be able to finish transferring in five minutes.

The functional requirement is that a user with a slow but stable connection can successfully interact with a Keep proxy. (I am willing to let timeouts continue to serve as a performance sanity check for the not-proxy case, on the expectation that one admin has sufficient control over the entire stack there.)

h2. Implementation

Refer to "libcurl connection options":http://curl.haxx.se/libcurl/c/curl_easy_setopt.html for details.

When the Python Keep client connects to a proxy service, instead of setting TIMEOUT_MS, set LOW_SPEED_LIMIT and LOW_SPEED_TIME to ensure a minimum transfer rate of 2 MiB per 64 second interval: i.e., give up if the transfer speed drops below 32 KiB/s.

Expected outcomes, assuming "mid-transfer outage" happens after 2 MiB has been transferred:

|Available bandwidth|Server delay (req finish to resp start)|Mid-transfer outage|Outcome||
|33 KiB/s |None |None |Success||
|32 KiB/s |1s |None |Fail ||
|32 KiB/s |None |1s |Fail ||
|64 KiB/s |31s |None |Success||
|64 KiB/s |31s |<32s |Success||
|64 KiB/s |<=31s |>32s |Fail ||
|<2 MiB/s |63s |None |Fail ||
|>2 MiB/s |<63s |None |Success||
|>2 MiB/s |<63s |<63s |Success|"normal"|
|>2 MiB/s |>=64s |None |Fail |"classic timeout"|
|>2 MiB/s |<63s |>64s |Fail |"classic timeout"|

Of the last 100393 keepstore transactions on keep14.q:
* 31 (0.03%) took longer than 30 seconds.
* 8 (0.008%) took longer than 60 seconds.
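
The option choice described under Implementation could be sketched roughly as follows. This is not the actual Keep client code: the function names @low_speed_options@, @configure_timeouts@ and @meets_minimum@ are hypothetical, and @meets_minimum@ is a deliberately simplified model of a single 64-second interval, not libcurl's real speed-measurement algorithm. Only @pycurl.LOW_SPEED_LIMIT@, @pycurl.LOW_SPEED_TIME@, @pycurl.TIMEOUT_MS@ and @Curl.setopt@ are real PycURL names.

```python
MIB = 1024 * 1024
KIB = 1024


def low_speed_options(min_bytes=2 * MIB, interval_seconds=64):
    """Return (bytes_per_second, seconds) for LOW_SPEED_LIMIT / LOW_SPEED_TIME.

    2 MiB per 64 seconds works out to 32768 bytes/s, i.e. 32 KiB/s.
    """
    return min_bytes // interval_seconds, interval_seconds


def configure_timeouts(curl, is_proxy, timeout_ms=300_000):
    """Apply either the low-speed rule (proxy) or the overall timeout (non-proxy).

    Hypothetical helper: `curl` is assumed to be a pycurl.Curl handle.
    """
    import pycurl  # imported lazily so the arithmetic helpers work without pycurl

    if is_proxy:
        limit, seconds = low_speed_options()
        curl.setopt(pycurl.LOW_SPEED_LIMIT, limit)
        curl.setopt(pycurl.LOW_SPEED_TIME, seconds)
    else:
        curl.setopt(pycurl.TIMEOUT_MS, timeout_ms)


def meets_minimum(bandwidth_bytes_per_s, stalled_seconds,
                  min_bytes=2 * MIB, interval_seconds=64):
    """Simplified model: does one interval containing `stalled_seconds` of
    idle time (server delay or outage) still move at least `min_bytes`?"""
    active_seconds = max(interval_seconds - stalled_seconds, 0)
    return bandwidth_bytes_per_s * active_seconds >= min_bytes


if __name__ == "__main__":
    print(low_speed_options())
    print(meets_minimum(33 * KIB, 0))   # steady 33 KiB/s
    print(meets_minimum(32 * KIB, 1))   # 32 KiB/s with a 1s stall
    print(meets_minimum(64 * KIB, 31))  # 64 KiB/s with a 31s stall
    print(meets_minimum(64 * KIB, 33))  # 64 KiB/s with a 33s stall
```

Under this simplified model, the first few table rows follow from plain arithmetic: 33 KiB/s for 64 seconds moves 2112 KiB (above the 2048 KiB floor), while 32 KiB/s with a one-second stall moves only 2016 KiB.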