Bug #10932
closed
[arv-put] job resume too slow & uninformative
100%
Description
Completed 97% of a 636959M upload (637GB) and then attempted to resume it after arv-put was killed. It took almost 3 hours (175 minutes) of 100% CPU for arv-put to calculate/confirm the restart point (which seemed to be only 45%, not 97%) before it began any additional uploading. During this time there was no user feedback other than "0M / 636959M 0.0%" for hours on end.
Some other, possibly relevant, details:
98 MB cache file in ~/.cache/arvados/ :
97913649 Jan 18 21:54 476acf1a6d4cfb9b630a13289c4a72be
of which ~7 MB is manifest text:
$ jq .manifest 476acf1a6d4cfb9b630a13289c4a72be | wc
1 240325 6861939
and the rest is info for the 478502 files:
$ jq '.files | length' 476acf1a6d4cfb9b630a13289c4a72be
478502
$ jq -r '.files[] | .size ' 476acf1a6d4cfb9b630a13289c4a72be | wc
478502 478502 3827704
$ jq -r '.files[] | .mtime ' 476acf1a6d4cfb9b630a13289c4a72be | wc
478502 478502 8928344
$ jq -r '.files | keys ' 476acf1a6d4cfb9b630a13289c4a72be | wc
478504 480984 69204103
That's 69 MB of file names, which are compressible to 1.4 MB due to high levels of redundancy:
$ jq -r '.files | keys ' 476acf1a6d4cfb9b630a13289c4a72be | gzip | wc
10074 39234 1442694
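
For anyone wanting to repeat the measurements above in one pass, here is a rough Python sketch. It assumes only what the jq commands already show: the cache file is JSON with a top-level `manifest` string and a `files` object keyed by file name. The function names `summarize_cache` and `summarize_cache_file` are made up for this example and are not part of arv-put. Byte counts will differ slightly from the `wc` figures, since `jq '.files | keys'` prints a quoted, indented JSON array rather than bare names.

```python
import gzip
import json


def summarize_cache(state):
    """Return size statistics for a resume-cache dict (hypothetical helper)."""
    manifest = state.get("manifest") or ""
    files = state.get("files") or {}
    # One bare file name per line, sorted so the output is deterministic.
    names_blob = "\n".join(sorted(files)).encode("utf-8")
    return {
        "manifest_bytes": len(manifest.encode("utf-8")),
        "file_count": len(files),
        "name_bytes": len(names_blob),
        # How small the file names get once their redundancy is squeezed out.
        "name_bytes_gzipped": len(gzip.compress(names_blob)),
    }


def summarize_cache_file(path):
    """Load a cache file from disk and summarize it (hypothetical helper)."""
    with open(path, "r", encoding="utf-8") as handle:
        return summarize_cache(json.load(handle))
```

Calling `summarize_cache_file("476acf1a6d4cfb9b630a13289c4a72be")` from within ~/.cache/arvados/ should give figures of the same order as those listed above.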