Story #13925
Updated by Tom Clegg about 3 years ago
The default keep cache size is 256 MiB. For certain workloads, this is much too small. In particular, multithreaded workloads which read from multiple files experience severe cache contention.

Unfortunately, it is difficult for users to diagnose performance problems caused by the keep cache. Often the response is simply to request more resources via runtime_constraints. However, because the keep cache does not scale with container/machine size, this has no effect.

Based on the observation that (a) users request more VCPUs for multithreaded workloads and (b) users' typical response to performance problems is to request more resources, we should scale the default keep cache based on runtime_constraints. The cache should be either a percentage of RAM (say 12.5%) or scaled by the number of cores (say 384 MiB per core). This could be computed by a-c-r or on the API server.
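
For illustration only, here is a minimal sketch of the two sizing options proposed above. The function name `scaled_keep_cache`, the `policy` argument, and the choice to never go below the current 256 MiB default are assumptions made for this example, not existing a-c-r or API server behavior. It also assumes runtime_constraints carries `vcpus` and `ram` (in bytes).

```python
MIB = 1024 * 1024
DEFAULT_KEEP_CACHE = 256 * MIB  # current default from this story


def scaled_keep_cache(runtime_constraints, policy="per_core",
                      per_core=384 * MIB, ram_fraction=0.125):
    """Return a keep cache size in bytes (hypothetical helper).

    policy="per_core": per_core bytes for each requested VCPU.
    policy="ram_fraction": a fixed fraction of the requested RAM.
    Assumption: the result is never smaller than the 256 MiB default.
    """
    if policy == "per_core":
        size = runtime_constraints.get("vcpus", 1) * per_core
    elif policy == "ram_fraction":
        size = int(runtime_constraints.get("ram", 0) * ram_fraction)
    else:
        raise ValueError("unknown policy: %r" % (policy,))
    return max(DEFAULT_KEEP_CACHE, size)


if __name__ == "__main__":
    rc = {"vcpus": 4, "ram": 8 * 1024 * MIB}
    print(scaled_keep_cache(rc) // MIB)                         # 1536
    print(scaled_keep_cache(rc, policy="ram_fraction") // MIB)  # 1024
```

With either policy, a request for a larger container yields a larger cache, which is the behavior users already expect when they scale up runtime_constraints.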