Story #17103
closed
Developer shell inside running container
Description
Ability to connect to a shell running inside a crunch container.
Draft implementation proposal:
1. When the dispatcher starts crunch-run, it passes the hmac of systemroottoken and container id to crunch-run as a command-line argument. Crunch-run can use that hmac to validate incoming shell connections (see below).
2. When crunch-run starts, it updates the container record with its IP address and the port on which it listens for shell connections. TBD: best name for this field.
3. Users can invoke this command to connect to a container:
arvados-client shell --user username container_uuid command...
(--user username is optional)
4. When invoked like that, arvados-client connects to controller on the standard port, using a new shell endpoint. Controller verifies the container record is readable by this user. If so, it uses connection hijacking (like we do in websockets) to set up bidirectional communication with arvados-client. That way controller doesn't need to listen on another port.
Controller then forwards the connection to the container, using the host and port recorded by crunch-run in the container record. It authenticates the request by sending along
hmac(hmac(systemroottoken and container id) and the container ip and the container shell port)
Crunch-run validates the hmac. We use an hmac of the hmac for two reasons. First, it makes accidental connections to the wrong container impossible. Second, it prevents malicious connections from a node on which an earlier attempt to start the container was made: that node could also generate hmac(systemroottoken and container id), but it would not know the ip and shell port of the container, so it could not compute the final hmac.
5. Assuming the connection validates, crunch-run then executes
docker exec -u username -it docker_container_id command
or, if the username was not specified,
docker exec -it docker_container_id command
and it connects stdin/stdout/stderr to the incoming connection. The communication between arvados-client and crunch-run uses the SSH protocol under the hood, but that is transparent to the user. Because crunch-run reaches the container through docker exec, we cannot use SSH all the way through, which is unfortunate. The alternative would be to always run an SSH server inside the container, but that would be much more complicated.
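The validation in step 4 and the command construction in step 5 can be sketched in Python as follows. This is only an illustration, not the crunch-run implementation: the function names, the SHA-256 digest, the choice of which value acts as the HMAC key, and the "ip:port" message encoding are all assumptions, since the proposal does not specify them.

```python
import hashlib
import hmac


def inner_hmac(system_root_token: str, container_uuid: str) -> bytes:
    """hmac(systemroottoken and container id), as passed to crunch-run.

    Assumption: the token is the key and the container id is the message.
    """
    return hmac.new(
        system_root_token.encode(), container_uuid.encode(), hashlib.sha256
    ).digest()


def connection_hmac(inner: bytes, ip: str, port: int) -> str:
    """Outer hmac that binds the inner hmac to the container ip and shell port.

    Assumption: the inner hmac is the key and "ip:port" is the message.
    """
    return hmac.new(inner, f"{ip}:{port}".encode(), hashlib.sha256).hexdigest()


def validate(inner: bytes, ip: str, port: int, presented: str) -> bool:
    """crunch-run side: recompute the outer hmac and compare in constant time."""
    expected = connection_hmac(inner, ip, port)
    return hmac.compare_digest(expected.encode(), presented.encode())


def docker_exec_argv(docker_container_id, command, username=None):
    """Build the docker exec argument list from step 5.

    The user option is only added when a username was given
    (docker exec accepts it as -u / --user).
    """
    argv = ["docker", "exec"]
    if username:
        argv += ["-u", username]
    argv += ["-it", docker_container_id, *command]
    return argv
```

A node that knows only the inner hmac but guesses the wrong ip or port produces a different outer hmac, so `validate` returns False for it. Building the command as an argument list rather than a shell string avoids quoting problems with user-supplied commands.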
Open questions:
- connecting with ssh should probably mark the container as not eligible for reuse. How should we approach that?
- a system-wide setting should probably govern the availability of the container shell feature (disabled, default off, or default on?)
- maybe the availability of the shell feature should be a runtime constraint, like api_access?