Bug #4309
[SDK] arv-copy collection copy performance (closed)
Status: Resolved
Priority: Normal
Assigned To: -
Category: -
Target version: -
Story points: 2.0
Description
Originally https://arvados.org/issues/3699#note-43
I'm copying a pipeline with a 5990M collection. I noticed this code:
data = src_keep.get(word)
dst_locator = dst_keep.put(data)
See the attached image: there is a very clear falloff between blocks, so doing this sequentially isn't optimal. Download and upload could proceed concurrently. We might also get better utilization by transferring multiple blocks at a time (e.g. 2x down / 2x up) and talking to multiple Keep servers. Consider a producer-consumer pattern using Python queues.
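A minimal sketch of the producer-consumer idea suggested above, not the change that eventually landed in arv-copy. The function `copy_blocks` and its parameters are hypothetical; `src_get` and `dst_put` are assumed stand-ins for `src_keep.get` and `dst_keep.put` from the snippet, passed in as plain callables so the sketch runs without a Keep server.

```python
import queue
import threading

_DONE = object()  # sentinel telling an uploader thread to stop


def copy_blocks(locators, src_get, dst_put,
                downloaders=2, uploaders=2, max_buffered=4):
    """Copy blocks with downloads and uploads overlapping.

    src_get(locator) -> bytes and dst_put(data) -> new locator are
    hypothetical stand-ins for src_keep.get and dst_keep.put.
    Returns the destination locators in the same order as `locators`.
    """
    # Work list of (index, locator) pairs shared by the downloaders.
    todo = queue.Queue()
    for item in enumerate(locators):
        todo.put(item)

    # Bounded buffer so downloads cannot run far ahead of uploads
    # and hold many large blocks in memory.
    buffered = queue.Queue(maxsize=max_buffered)
    results = [None] * len(locators)
    errors = []

    def download():
        try:
            while True:
                try:
                    index, locator = todo.get_nowait()
                except queue.Empty:
                    return
                buffered.put((index, src_get(locator)))
        except Exception as exc:
            errors.append(exc)

    def upload():
        while True:
            item = buffered.get()
            if item is _DONE:
                return
            index, data = item
            try:
                results[index] = dst_put(data)
            except Exception as exc:
                # Keep draining so downloaders never block forever.
                errors.append(exc)

    down = [threading.Thread(target=download) for _ in range(downloaders)]
    up = [threading.Thread(target=upload) for _ in range(uploaders)]
    for thread in down + up:
        thread.start()
    for thread in down:
        thread.join()
    # All downloads are finished; tell each uploader to exit.
    for _ in up:
        buffered.put(_DONE)
    for thread in up:
        thread.join()

    if errors:
        raise errors[0]
    return results
```

The bounded queue is the main design choice here: it lets a download of the next block overlap with the upload of the current one, while capping how many blocks sit in memory at once. Setting `downloaders` and `uploaders` to 2 corresponds to the "2x down / 2x up" case.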
Updated by Peter Amstutz over 11 years ago
- File arv-copy-perf.png added
- Description updated
Updated by Peter Amstutz over 11 years ago
- Subject changed from [SDK] arv-copy performance to [SDK] arv-copy collection copy performance
- Description updated
Updated by Tom Clegg over 11 years ago
- Target version changed from Bug Triage to Arvados Future Sprints
Updated by Ward Vandewege over 4 years ago
- Target version deleted (Arvados Future Sprints)
Updated by Peter Amstutz over 1 year ago
- Release deleted (60)
- Target version deleted (Future)
- Status changed from New to Resolved
This was finally addressed in #20937