Bug #7235

Updated by Tom Clegg about 9 years ago

h2. Background 

The Python Keep client allows each request 300 seconds to complete. There are some real-world scenarios where this is too strict. For example, purely hypothetically, an Arvados developer might be working across an ocean, tethered through a cellular network. Every request will eventually complete, but a whole 64 MiB block might not finish transferring within five minutes.

The functional requirement is that a user with a slow but stable connection can successfully interact with a Keep proxy. (I am willing to let timeouts continue to serve as a performance sanity check for the non-proxy case, on the expectation that one admin has sufficient control over the entire stack there.)

h2. Implementation

Refer to "libcurl connection options":http://curl.haxx.se/libcurl/c/curl_easy_setopt.html for details.

When the Python Keep client connects to a proxy service, instead of setting TIMEOUT_MS, set LOW_SPEED_LIMIT and LOW_SPEED_TIME to ensure a minimum transfer rate of 1 MiB per 32-second interval: i.e., give up if the transfer speed drops below 32 KiB/s.
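The option selection could look roughly like the sketch below. It assumes the 1 MiB per 32 seconds threshold and the existing 300-second timeout; the helper names @transfer_options@ and @apply_options@ are hypothetical and are not part of the Arvados SDK. The option names are the libcurl ones (LOW_SPEED_LIMIT is in bytes per second, LOW_SPEED_TIME in seconds).

```python
# Sketch only: helper names are hypothetical, not the Arvados SDK's API.
MIN_BYTES_PER_INTERVAL = 1 * 1024 * 1024   # 1 MiB
INTERVAL_SECONDS = 32
CLASSIC_TIMEOUT_MS = 300 * 1000            # the existing 300-second limit


def transfer_options(is_proxy):
    """Return {libcurl option name: value} for one Keep request."""
    if is_proxy:
        return {
            # Bytes per second: 1 MiB / 32 s = 32768 B/s = 32 KiB/s.
            "LOW_SPEED_LIMIT": MIN_BYTES_PER_INTERVAL // INTERVAL_SECONDS,
            # Seconds the transfer may stay below the limit before aborting.
            "LOW_SPEED_TIME": INTERVAL_SECONDS,
        }
    # Non-proxy case: keep the overall timeout as a sanity check.
    return {"TIMEOUT_MS": CLASSIC_TIMEOUT_MS}


def apply_options(curl, pycurl_module, options):
    """Apply the options to a pycurl.Curl-like handle.

    The module is passed in so the constants (e.g. pycurl.LOW_SPEED_LIMIT)
    are looked up by name and the sketch can run without pycurl installed.
    """
    for name, value in options.items():
        curl.setopt(getattr(pycurl_module, name), value)
```

With PycURL installed, usage would be along the lines of @apply_options(pycurl.Curl(), pycurl, transfer_options(is_proxy=True))@.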

Expected outcomes, assuming "mid-transfer outage" happens after 1 MiB has been transferred:

|Available bandwidth|Server delay (req finish to resp start)|Mid-transfer outage|Outcome||
|33 KiB/s|None|None|Success||
|32 KiB/s|1s|None|Fail||
|32 KiB/s|None|1s|Fail||
|64 KiB/s|15s|None|Success||
|64 KiB/s|15s|<16s|Success||
|64 KiB/s|<=15s|>16s|Fail||
|<1 MiB/s|31s|None|Fail||
|>1 MiB/s|<=31s|None|Success||
|>1 MiB/s|<=31s|<=31s|Success|"normal"|
|>1 MiB/s|>=32s|None|Fail|"classic timeout"|
|>1 MiB/s|<=31s|>=32s|Fail|"classic timeout"|
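The expected outcomes can be sanity-checked with a small model. This is a simplified reading of the rule stated in this ticket (at least 1 MiB must move in every 32-second window), not a reproduction of libcurl's internal speed check. It treats the server delay and the outage as independent stalls, which relies on the stated assumption that the outage starts only after 1 MiB has been transferred. The function name and the sample values are illustrative.

```python
WINDOW_SECONDS = 32
MIN_KIB_PER_WINDOW = 1024  # 1 MiB


def expected_outcome(bandwidth_kib_s, server_delay_s=0, outage_s=0):
    """Return "Success" or "Fail" under the simplified window model.

    Each stall is judged by the worst 32-second window containing it:
    the window carries data only for the seconds not spent stalled.
    """
    for stall in (server_delay_s, outage_s):
        active_seconds = max(0, WINDOW_SECONDS - stall)
        if bandwidth_kib_s * active_seconds < MIN_KIB_PER_WINDOW:
            return "Fail"
    return "Success"


# One concrete sample per scenario: (KiB/s, delay s, outage s, expected).
SAMPLES = [
    (33, 0, 0, "Success"),
    (32, 1, 0, "Fail"),
    (32, 0, 1, "Fail"),
    (64, 15, 0, "Success"),
    (64, 15, 17, "Fail"),
    (1000, 31, 0, "Fail"),       # just under 1 MiB/s
    (2048, 31, 0, "Success"),    # above 1 MiB/s
    (2048, 31, 31, "Success"),   # "normal"
    (2048, 32, 0, "Fail"),       # "classic timeout"
    (2048, 31, 32, "Fail"),      # "classic timeout"
]

for bandwidth, delay, outage, expected in SAMPLES:
    assert expected_outcome(bandwidth, delay, outage) == expected
```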

Of the last 100393 keepstore transactions on keep14.q:
* 31 (0.03%) took longer than 30 seconds.
* 8 (0.008%) took longer than 60 seconds.
