Story #6311
[Maybe] [SDKs] Support caching Keep blocks in memcached
Status: Closed
% Done: 0%
Description
We could potentially improve job performance by running memcached on each compute node to store Keep blocks. When a node is running many tasks from a job that access the same data, this cache would let each block be downloaded to the node once and then shared across tasks.
If we decide to go ahead with this caching strategy, add the necessary support to the Python SDK Keepclient to use a memcached store when available.
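
A minimal sketch of what the memcached path might look like from the client side, assuming a per-node memcached on the default port; fetch_block_from_keep and MemcachedBlockCache are hypothetical stand-ins, not part of the SDK's KeepClient. Note that Keep blocks can be up to 64 MiB, so memcached would need a larger item size limit than its 1 MiB default (e.g. memcached -I 64m).

from pymemcache.client.base import Client


def fetch_block_from_keep(locator):
    """Hypothetical placeholder for the SDK's normal Keep block fetch."""
    raise NotImplementedError


class MemcachedBlockCache:
    def __init__(self, server=('127.0.0.1', 11211)):
        self.client = Client(server)

    def get_block(self, locator):
        # Use the md5 portion of the locator as the cache key.
        key = locator.split('+')[0]
        data = self.client.get(key)
        if data is not None:
            return data          # cache hit: block is already on this node
        data = fetch_block_from_keep(locator)
        try:
            self.client.set(key, data)
        except Exception:
            pass                 # treat the cache as best-effort
        return data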
Updated by Brett Smith over 9 years ago
- Description updated (diff)
- Category set to SDKs
Updated by Tom Clegg over 9 years ago
Some complications to consider:
- Orchestrating starting and stopping memcached as jobs start and stop.
- Firewalling user/job A's memcached from user/job B's memcached (e.g., if crunch2 allows more than one job per node, or on a shared shell VM).
Memcached is good for sharing free memory (freely!) across nodes. Given that each job has distinct permissions, we'd essentially need a VPN per job in order to take advantage of that feature. And without that feature, I'm not sure memcached would perform any better than a tmpfs-backed filesystem cache.
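
For comparison, the tmpfs-backed alternative could look roughly like the sketch below: blocks fetched by one task are written as files under a tmpfs mount and read back by later tasks on the same node. The cache path and fetch_block_from_keep helper are hypothetical, not part of the SDK.

import os
import tempfile

CACHE_DIR = '/dev/shm/keep-cache'   # assumed tmpfs mount point on the node


def get_block(locator, fetch_block_from_keep):
    os.makedirs(CACHE_DIR, exist_ok=True)
    path = os.path.join(CACHE_DIR, locator.split('+')[0])
    try:
        with open(path, 'rb') as f:
            return f.read()          # another task already cached this block
    except FileNotFoundError:
        pass
    data = fetch_block_from_keep(locator)
    # Write to a temp file and rename so readers never see a partial block.
    fd, tmp = tempfile.mkstemp(dir=CACHE_DIR)
    with os.fdopen(fd, 'wb') as f:
        f.write(data)
    os.rename(tmp, path)
    return data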
Updated by Tom Morris almost 6 years ago
- Target version deleted (Arvados Future Sprints)