Story #8568
closed[SDKs] `arv keep docker` supports Docker 1.10+
100%
Description
If you try to run arv keep docker
today on a host running Docker 1.10+, you'll get an exception when the tool tries to find a specific file in the generated tar file:
$ arv keep docker --project-uuid=qr1hi-j7d0g-rerv400sudof0mn name 1474M / 1474M 100.0% Collection saved as 'Docker image name:latest sha256:... (2)' qr1hi-4zz18-k4b79jr1uamnila Traceback (most recent call last): File "/usr/local/bin/arv-keepdocker", line 5, in <module> pkg_resources.run_script('arvados-python-client==0.1.20160128214108', 'arv-keepdocker') File "/usr/local/lib/python2.7/dist-packages/pkg_resources.py", line 517, in run_script self.require(requires)[0].run_script(script_name, ns) File "/usr/local/lib/python2.7/dist-packages/pkg_resources.py", line 1436, in run_script exec(code, namespace, namespace) File "/usr/local/lib/python2.7/dist-packages/arvados_python_client-0.1.20160128214108-py2.7.egg/EGG-INFO/scripts/arv-keepdocker", line 4, in <module> main() File "/usr/local/lib/python2.7/dist-packages/arvados_python_client-0.1.20160128214108-py2.7.egg/arvados/commands/keepdocker.py", line 401, in main json_file = image_tar.extractfile(image_tar.getmember(image_hash + '/json')) File "/usr/lib/python2.7/tarfile.py", line 1800, in getmember raise KeyError("filename %r not found" % name) KeyError: "filename 'sha256:...d2b6/json' not found"
This happens because Docker changed the image tar format in Docker 1.10. Recognize and correctly upload images generated by Docker 1.10.
Changes to make¶
The output of docker images --no-trunc
has changed so the image hash now includes the cryptographic hash name. It looks like sha256:4fe79ae514594715386c226e89a53c8037e081603f93f65a47c82573359f70cb
.
Right now arv-keepdocker looks for a JSON metadata file in the .tar under <image hash>/json
:
json_file = image_tar.extractfile(image_tar.getmember(image_hash + '/json'))
This needs to be updated to look for a file named <unmarked hash>.json
. So, in the example hash above, arv-keepdocker needs to look for a metadata file named 4fe79ae514594715386c226e89a53c8037e081603f93f65a47c82573359f70cb.json
.
Suggest finding the metadata file with something like the following:
image_hash_type, _, raw_image_hash = image_hash.rpartition(':')
if image_hash_type:
json_filename = raw_image_hash + '.json'
else:
json_filename = raw_image_hash + '/json'
json_file = image_tar.extractfile(image_tar.getmember(json_filename))
I have confirmed that the JSON file still has the information arv-keepdocker needs in the new image format.
Updated by Brett Smith over 8 years ago
- Target version set to Arvados Future Sprints
Updated by Radhika Chippada over 8 years ago
- Assigned To set to Radhika Chippada
- Story points set to 1.0
Updated by Radhika Chippada over 8 years ago
- Target version changed from Arvados Future Sprints to 2016-08-03 sprint
Updated by Radhika Chippada over 8 years ago
Branch 8568-docker-version-support is ready for review.
Just did the replacement code copy paste from the description. Manually tested the code block is sane. There do not seem to be any existing tests in this area and hence I did not make any tests updates.
All python sdk tests passed.
Updated by Radhika Chippada over 8 years ago
- Status changed from New to In Progress
Updated by Radhika Chippada over 8 years ago
- % Done changed from 0 to 100
- Status changed from In Progress to Resolved
Applied in changeset arvados|commit:be87361dedf4e35405616e802fba12dedf86dfde.
Updated by Tom Morris over 8 years ago
Do we need test coverage here? Was test writing the reason it was estimated at a full story point?
Do we need to improve our process so that we test with Docker releases when they hit beta (or no later than immediately after they're released) or should we instead have a compatibility matrix which shows which versions we've tested with and are supported? Docker's such a critical component for the system that I think we need to have a good story here.
The current release is 1.11.2 and 1.12 is up to RC5 https://github.com/docker/docker/releases/tag/v1.12.0-rc5
1.10 was released Feb. 4 after four release candidates over a two week period, so has been in the wild for a number of months.
Updated by Brett Smith over 8 years ago
Tom Morris wrote:
Do we need test coverage here?
Yes. arv keep docker
registers Docker images in Arvados. It is critical to building workflows under both Crunch v1 and v2. No other tool does its job, so if it fails, users are up a creek.
Was test writing the reason it was estimated at a full story point?
My recollection is that it was pointed at one point primarily because of uncertainty. There was a possibility that additional code would be necessary to achieve full Docker 1.10+ compatibility. The conservative pointing was intended to help compensate for this.
Do we need to improve our process so that we test with Docker releases when they hit beta (or no later than immediately after they're released) or should we instead have a compatibility matrix which shows which versions we've tested with and are supported?
It would be an improvement. Docker's development pace vastly exceeds ours; in the less than two years we've integrated with it, there have been two breaking compatibility changes like this (the other was #4196).
#9428 would also help mitigate the risk of compatibility drift, by having this client tool talk to a specific version of the Docker API directly, rather than calling out to the Docker CLI.
Docker's such a critical component for the system that I think we need to have a good story here.
The current release is 1.11.2 and 1.12 is up to RC5 https://github.com/docker/docker/releases/tag/v1.12.0-rc5
1.10 was released Feb. 4 after four release candidates over a two week period, so has been in the wild for a number of months.
Part of why we've been dragging our feet is that the image file format change means that we have to devise and write a way to convert Docker images that are already registered in extant Arvados clusters, and so far there hasn't been demand for that. That's #8567.