Story #2256
closed [Workbench] Users can access an Arvados shell through the Web
80%
Description
The goal: users who only have access to a Web browser can access an Arvados shell, so they can use a FUSE mount, our CLI tools, etc. Configure and deploy whatever services are necessary for users to get this functionality through a Web client.
The specific implementation is up to you. You can research possible solutions as part of this story, although I'm not sure we need a very thorough survey. Abram has done some promising experiments with (IIRC) tty.js, which may be useful for us.
One requirement is that it must be possible to integrate the Web client in Workbench. However, doing the actual integration work is a separate story.
Files
Updated by Brett Smith about 10 years ago
- Subject changed from Workbench has a built-in SSH client that can be used to operate Arvados VMs, so people with highly restrictive firewalls and no SSH client can use VMs. to [Workbench] Include SSH client that can be used to operate Arvados VMs, so people with highly restrictive firewalls and no SSH client can use VMs.
- Target version set to Arvados Future Sprints
Voting for a priority bump on this. It can be very handy for cases where the user doesn't have direct access to the data they want to use (e.g., it's on a Web server). Doing the shuffle on a shell node makes a lot of sense in this case, and a Web-based SSH client could help smooth that process over.
Updated by Tom Clegg about 10 years ago
- Project changed from 35 to Arvados
- Subject changed from [Workbench] Include SSH client that can be used to operate Arvados VMs, so people with highly restrictive firewalls and no SSH client can use VMs. to [Workbench] Browser-based SSH client for logging in to VMs from an OS with no built-in SSH client
- Category set to Workbench
Updated by Tom Clegg about 10 years ago
One benefit here is that the SSH client could be pre-configured: the user doesn't have to figure out how to set up ~/.ssh/config, follow a long list of instructions for PuTTY, copy and paste the right VM name, etc.
Updated by Tom Clegg about 10 years ago
There is an SSH app for Chrome (I use this regularly on my Chromebook). I wonder if we can rig up a one-click "[install the app and] open a connection using these settings"?
https://chrome.google.com/webstore/detail/secure-shell/pnhechapfaindjhompbnflcldabbghjo
Not sure what key generation/management would look like.
Here's a JS crypto library (although the demo seems to have hosting/setup trouble atm, one JS library is 502). "Generally RSA key generation takes several minutes. It occasionally takes several hours." But at least it is implemented so as not to freeze the browser.
Updated by Abram Connelly over 9 years ago
As a 5 minute proof of concept, I installed tty.js on one of my shell nodes where I had root:
$ npm install tty.js
$ cat > tty_web_srv.js <<EOF
var tty = require('tty.js');
var app = tty.createServer({
  shell: 'bash',
  users: { dax: 'arv' },
  term: { "cursorBlink": false },
  port: 80
});
app.cursorBlink = false;
app.listen();
EOF
$ sudo node tty_web_srv.js
[tty.js] You should sha1 your user information.
[tty.js] Listening on port 80.
Note that it's running over http and will run shells as the root user. tty.js has options for running over https which should probably be used.
Updated by Peter Amstutz over 9 years ago
Design sketch:
- Run a tty.js daemon by default on each shell node. Add authentication logic that uses the API token and checks in with the API server instead of using HTTP basic auth.
- In the "virtual machines" table on "manage accounts", add a "login" button that links to the page served by "tty.js" server on the shell node.
Updated by Brett Smith over 9 years ago
- Target version changed from Arvados Future Sprints to 2015-07-08 sprint
Updated by Brett Smith over 9 years ago
- Subject changed from [Workbench] Browser-based SSH client for logging in to VMs from an OS with no built-in SSH client to [Workbench] Users can access an Arvados shell through the Web
- Description updated (diff)
Updated by Peter Amstutz over 9 years ago
Some alternatives to tty.js to look at:
- GateOne https://github.com/liftoff/GateOne (AGPLv3)
  - Designed to be embeddable, hopefully easy to embed in Workbench.
  - Has an ssh plugin for managing keys
  - Has gimmicks like being able to transform output into clickable links, embedded images
  - Written in Python
  - Based on my research this is the most promising so far
- KeyBox https://github.com/skavanagh/KeyBox/ (Apache 2)
  - Uses websockets over TLS and acts as gateway from TLS to SSH by installing its own key into the user's authorized_keys file.
  - Written in Java
- Guacamole http://guac-dev.org/
  - Web based remote desktop. Apparently started life as a web based terminal but doesn't appear to actually support ssh any more.
- AnyTerm http://anyterm.org/index.html (GPL) (looks a little bit old and unmaintained)
- ShellInABox https://code.google.com/p/shellinabox/ (GPL) (also looks a bit old and unmaintained)
Updated by Peter Amstutz over 9 years ago
It seems like the usual method for doing websockets/TLS -> SSH is for authorized_keys on the login host to include a key which is controlled by the gateway, then spawn the ssh client on the gateway to connect to the login host. This is not awesome since it makes the gateway an obvious point of attack.
An alternative architecture would forward TLS all the way through to the login host, and then have a process similar to sshd which speaks the websocket/TLS protocol, does authentication (based on Arvados API token or whatever), and spawns a shell as the correct user using /bin/login -f user. To avoid a MITM proxy, a gateway node would assign a separate port to each login host that it forwards to.
So far I haven't found anything that does this out of the box, but we might be able to rig something up with a non-privileged tty.js or GateOne service on each login host and a suid script that does Arvados API token authentication and then runs /bin/login.
Updated by Nico César over 9 years ago
Some progress has been made.
I've been researching Shell In A Box: https://code.google.com/p/shellinabox/wiki/shellinaboxd_man
and installed it on su92l and 4xphq (with firewalling problems still to solve), using a self-signed cert.
We should:
- add non-ephemeral external IPs
- add DNS entries for shell nodes
- get a wildcard certificate for each cluster (I wish no web browser complained about *.*.arvadosapi.com, but we have to live with that restriction)
I saw some "mismatch cypher" errors. I believe they are caused by the server offering TLSv1.2 and the client not supporting it, or something like that.
I also applied the following patch
https://code.google.com/p/shellinabox/issues/detail?id=215
with no success.
Updated by Nico César over 9 years ago
I think that the "shell-via-switchyard" route on GCE is creating some issues when trying to connect to an external IP. I'm creating a new machine to be a webshell on su92l.
Updated by Peter Amstutz over 9 years ago
ShellInABox uses AJAX (which means it has to poll for activity), not Websockets, and looks a little bit old and unmaintained (last commit is March 2012...)
The websockets based solutions (tty.js, gateone, keybox) are more recently developed and look like they have more features/better performance.
However with all these applications the basic problem remains of how best to securely multiplex access to multiple VMs using only https and a single IP address. I think KeyBox comes the closest to actually solving that part of the problem.
Updated by Nico César over 9 years ago
On 4xphq (in AWS) the current shell node doesn't have a public IP. From http://docs.aws.amazon.com/AWSEC2/latest/UserGuide/using-instance-addressing.html#public-ip-addresses :
If you launch an instance in EC2-Classic, it is assigned a public IP address by default. You can't modify this behavior.
(..)
The public IP addressing feature is only available during launch. However, whether you assign a public IP address to your instance during launch or not, you can associate an Elastic IP address with your instance after it's launched. For more information, see Elastic IP Addresses (EIP). You can also modify your subnet's public IP addressing behavior.
So I'll try an EIP first.
Updated by Nico César over 9 years ago
Peter Amstutz wrote:
ShellInABox uses AJAX (which means it has to poll for activity), not Websockets, and looks a little bit old and unmaintained (last commit is March 2012...)
The websockets based solutions (tty.js, gateone, keybox) are more recently developed and look like they have more features/better performance.
I tried shellinabox and the user experience was really good. Anyway, I'll go through all of them ( https://en.wikipedia.org/wiki/Web-based_SSH#External_links ) and make the decision later. But my current fight is with the network and the way we set up clusters (more below).
However with all these applications the basic problem remains of how best to securely multiplex access to multiple VMs using only https and a single IP address. I think KeyBox comes the closest to actually solving that part of the problem.
Can you explain further why it's only a single IP? I'm aware that the current ssh access is via ssh ProxyCommand with turnout@switchyard.4xphq.arvadosapi.com, but we can think of other architectures. I'm thinking: what's the downside of having external IPs on the current shell nodes? Is it possible to set up an HTTP reverse proxy that handles all the SSL and authentication, then redirects to the correct backend? (See how simple this config is using Apache: https://code.google.com/p/shellinabox/wiki/shellinaboxd_man#CONFIGURATION )
A multi-port solution could be done too, but in my opinion we should avoid it: I have never had good experiences with https certs and random port numbers passing a should-work-in-all-browsers test.
Updated by Nico César over 9 years ago
I see in /etc/nginx/sites-enabled/0000-su92l.arvadosapi.com-ssl.conf in host su92l.arvadosapi.com:
server {
  listen 10.22.206.25:443 ssl;
  server_name su92l.arvadosapi.com;
  (..)
  location / {
    proxy_pass http://api;
    (..)
one option is to add
location /shell {proxy_pass http(s)://shell; ...}
location /shell/customer_1 {proxy_pass http(s)://customer1.shell; ...}
location /shell/customer_2 {proxy_pass http(s)://customer2.shell; ...}
so people will access https://su92l.arvadosapi.com/shell and the request will be answered by the reverse proxy.
We are currently doing this with my-dev-pgp-hms.shell.su92l.arvadosapi.com in the switchyard. I don't have any particular preference so far (switchyard or API server).
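To make the routing idea concrete: when only prefix locations are involved, nginx picks the longest matching prefix, so /shell/customer_1 wins over /shell, which wins over /. The following is a rough Python sketch of that selection rule only; the function name pick_backend and the ROUTES table are made up for illustration, and the backend names are the placeholders from the config fragments above.

```python
def pick_backend(path, routes):
    """Return the backend for the longest route prefix matching path, or None."""
    matching = [prefix for prefix in routes if path.startswith(prefix)]
    if not matching:
        return None
    return routes[max(matching, key=len)]


# Hypothetical table mirroring the location blocks discussed above.
ROUTES = {
    "/": "api",
    "/shell": "shell",
    "/shell/customer_1": "customer1.shell",
    "/shell/customer_2": "customer2.shell",
}

for path in ["/shell", "/shell/customer_1/", "/shell/customer_2/x", "/other"]:
    print(path, "->", pick_backend(path, ROUTES))
```

This ignores regex locations and everything else a real proxy does (TLS, headers, websocket upgrades); it only shows why the per-customer locations do not get swallowed by the generic /shell one.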
Updated by Peter Amstutz over 9 years ago
From conversation on engineering:
- Write a PAM module that will go on each shell node and authenticate logins by contacting the API server, using the API token as the password.
  - We can write a PAM module in Python: http://pam-python.sourceforge.net/
    apt-get install libpam-python
- There are two options for architecture:
  - run an https-to-ssh gateway (gateway runs ssh client and uses password based login on the shell node based on the above PAM module)
  - run a reverse https proxy and run a service on each shell node, use a suid script to authorize the user and run a login shell. (in this case, a full PAM module may not be necessary)
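For reference, a pam-python module is an ordinary Python file that defines pam_sm_* functions taking a pamh handle. Below is a minimal sketch of what the token-as-password module could look like. It is not the module that was eventually written: check_arvados_token is a placeholder that always refuses, and the prompt text is an assumption.

```python
def check_arvados_token(username, token):
    """Placeholder (hypothetical): ask the API server whether this token
    may log in as username on this VM. Always refuses until implemented."""
    return False


def pam_sm_authenticate(pamh, flags, argv):
    """Entry point that pam_python calls for the 'auth' stack."""
    try:
        username = pamh.get_user(None)
        # Prompt without echo; the user pastes an API token as the password.
        reply = pamh.conversation(
            pamh.Message(pamh.PAM_PROMPT_ECHO_OFF, "Arvados API token: "))
    except pamh.exception as error:
        return error.pam_result
    if username and check_arvados_token(username, reply.resp):
        return pamh.PAM_SUCCESS
    return pamh.PAM_AUTH_ERR


def pam_sm_setcred(pamh, flags, argv):
    """Nothing to establish beyond the authentication step itself."""
    return pamh.PAM_SUCCESS
```

Such a file would be referenced from the PAM configuration of the service in question via pam_python.so with the module path as its argument; the exact stack entry depends on which of the two architectures is chosen.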
Updated by Tom Clegg over 9 years ago
Login checker could look something like this
import os
import subprocess

import arvados

requested_username = "foobar"
arv = arvados.api('v1', host=os.getenv('ARVADOS_API_HOST'), token='whatever-the-client-provided')
my_hostname = subprocess.check_output(['hostname']).decode().strip()
# BUG: hostname stored on the API is just "foo.shell", not "foo.shell.zzzzz.arvadosapi.com"!
matches = arv.virtual_machines().list(filters=[['hostname','=',my_hostname]]).execute()['items']
if len(matches) != 1:
    raise RuntimeError("I don't know who I am!")
this_vm_uuid = matches[0]['uuid']
client_user_uuid = arv.users().current().execute()['uuid']
# Permission links that let this user log in to this VM.
filters = [
    ['link_class','=','permission'],
    ['name','=','can_login'],
    ['head_uuid','=',this_vm_uuid],
    ['tail_uuid','=',client_user_uuid]]
allowed = False
# Note: arvados.api('v1') with no host/token uses the default configuration
# (e.g. the environment), not the client-provided token used for arv above.
for l in arvados.api('v1').links().list(filters=filters).execute()['items']:
    if requested_username == l['properties']['username']:
        allowed = True
        break
if allowed:
    print("OK")
else:
    print("Cancel")
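The last step of the checker, matching the requested username against the can_login permission links, can be pulled out into a pure function so it is testable without an API server. A small sketch follows; the function name login_allowed and the sample links are hypothetical, and unlike the loop above it tolerates links that lack a username property.

```python
def login_allowed(permission_links, requested_username):
    """True if any can_login link grants access under requested_username."""
    return any(
        link.get("properties", {}).get("username") == requested_username
        for link in permission_links
    )


# Made-up sample data shaped like the 'items' of a links().list() response.
sample_links = [
    {"name": "can_login", "properties": {"username": "foobar"}},
    {"name": "can_login", "properties": {}},
]

print("OK" if login_allowed(sample_links, "foobar") else "Cancel")
```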
Updated by Nico César over 9 years ago
It seems that this module needs more than we want it to, for example _sys.argv[0]:
2015-06-16T20:53:32+00:00 Traceback (most recent call last):
2015-06-16T20:53:32+00:00   File "/home/nico/nico_pam.py", line 2, in <module>
2015-06-16T20:53:32+00:00     import arvados
2015-06-16T20:53:32+00:00   File "/usr/local/lib/python2.7/dist-packages/arvados/__init__.py", line 21, in <module>
2015-06-16T20:53:32+00:00     from .api import api, http_cache
2015-06-16T20:53:32+00:00   File "/usr/local/lib/python2.7/dist-packages/arvados/api.py", line 9, in <module>
2015-06-16T20:53:32+00:00     import apiclient
2015-06-16T20:53:32+00:00   File "/usr/local/lib/python2.7/dist-packages/apiclient/__init__.py", line 22, in <module>
2015-06-16T20:53:32+00:00     from googleapiclient import sample_tools
2015-06-16T20:53:32+00:00   File "/usr/local/lib/python2.7/dist-packages/googleapiclient/sample_tools.py", line 31, in <module>
2015-06-16T20:53:32+00:00     from oauth2client import tools
2015-06-16T20:53:32+00:00   File "/usr/local/lib/python2.7/dist-packages/oauth2client/tools.py", line 69, in <module>
2015-06-16T20:53:32+00:00     argparser = _CreateArgumentParser()
2015-06-16T20:53:32+00:00   File "/usr/local/lib/python2.7/dist-packages/oauth2client/tools.py", line 54, in _CreateArgumentParser
2015-06-16T20:53:32+00:00     parser = argparse.ArgumentParser(add_help=False)
2015-06-16T20:53:32+00:00   File "/usr/lib/python2.7/argparse.py", line 1573, in __init__
2015-06-16T20:53:32+00:00     prog = _os.path.basename(_sys.argv[0])
2015-06-16T20:53:32+00:00 AttributeError: 'module' object has no attribute 'argv'
Updated by Tom Clegg over 9 years ago
http://www.linux-pam.org/Linux-PAM-html/sag-pam_exec.html is a possibility. "User can have control over the environment" is a nice warning but doesn't seem insurmountable, especially if we avoid using this for stuff other than sshd.
Updated by Nico César over 9 years ago
I fixed it by doing:
import sys
sys.argv=['']
import arvados
but there is another global variable error:
2015-06-16T21:51:06+00:00 Traceback (most recent call last):
2015-06-16T21:51:06+00:00   File "/home/nico/nico_pam.py", line 59, in pam_sm_authenticate
2015-06-16T21:51:06+00:00     if not check_arvados_token(user, resp.resp):
2015-06-16T21:51:06+00:00   File "/home/nico/nico_pam.py", line 22, in check_arvados_token
2015-06-16T21:51:06+00:00     arv = arvados.api('v1',host=ARVADOS_API_HOST, token=token)
2015-06-16T21:51:06+00:00   File "/usr/local/lib/python2.7/dist-packages/arvados/api.py", line 164, in api
2015-06-16T21:51:06+00:00     http_kwargs['cache'] = http_cache('discovery')
2015-06-16T21:51:06+00:00   File "/usr/local/lib/python2.7/dist-packages/arvados/api.py", line 93, in http_cache
2015-06-16T21:51:06+00:00     path = os.environ['HOME'] + '/.cache/arvados/' + data_type
2015-06-16T21:51:06+00:00   File "/usr/lib/python2.7/UserDict.py", line 23, in __getitem__
2015-06-16T21:51:06+00:00     raise KeyError(key)
2015-06-16T21:51:06+00:00 KeyError: 'HOME'
Can http_cache keep everything somewhere else? Or use no cache at all? This is executed by the HOME-less user nobody.
Updated by Nico César over 9 years ago
I disabled the cache, then bumped into:
2015-06-16T22:14:52+00:00 Traceback (most recent call last):
2015-06-16T22:14:52+00:00   File "/home/nico/nico_pam.py", line 59, in pam_sm_authenticate
2015-06-16T22:14:52+00:00     if not check_arvados_token(user, resp.resp):
2015-06-16T22:14:52+00:00   File "/home/nico/nico_pam.py", line 24, in check_arvados_token
2015-06-16T22:14:52+00:00     matches = arv.virtual_machines().list(filters=[['hostname','=',my_hostname]]).execute()['items']
2015-06-16T22:14:52+00:00   File "/usr/local/lib/python2.7/dist-packages/oauth2client/util.py", line 137, in positional_wrapper
2015-06-16T22:14:52+00:00     return wrapped(*args, **kwargs)
2015-06-16T22:14:52+00:00   File "/usr/local/lib/python2.7/dist-packages/googleapiclient/http.py", line 723, in execute
2015-06-16T22:14:52+00:00     raise HttpError(resp, content, uri=self.uri)
2015-06-16T22:14:52+00:00 ApiError: <HttpError 401 when requesting https://4xphq.arvadosapi.com/arvados/v1/virtual_machines?alt=json&filters=%5B%5B%22hostname%22%2C+%22%3D%22%2C+%22shell.4xphq%22%5D%5D returned "Not logged in">
Password:
Updated by Nico César over 9 years ago
more global variables found.
2015-06-17T15:40:32+00:00 Traceback (most recent call last):
2015-06-17T15:40:32+00:00   File "/home/nico/nico_pam.py", line 72, in pam_sm_authenticate
2015-06-17T15:40:32+00:00     if not check_arvados_token(user, resp.resp):
2015-06-17T15:40:32+00:00   File "/home/nico/nico_pam.py", line 49, in check_arvados_token
2015-06-17T15:40:32+00:00     auth_log(str(arvados.api('v1').links().list().execute()))
2015-06-17T15:40:32+00:00   File "/usr/local/lib/python2.7/dist-packages/arvados/api.py", line 147, in api
2015-06-17T15:40:32+00:00     return api_from_config(version=version, cache=cache, **kwargs)
2015-06-17T15:40:32+00:00   File "/usr/local/lib/python2.7/dist-packages/arvados/api.py", line 204, in api_from_config
2015-06-17T15:40:32+00:00     raise ValueError("%s is not set. Aborting." % x)
2015-06-17T15:40:32+00:00 ValueError: ARVADOS_API_HOST is not set. Aborting.
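All three failures come from process-level state that a normal command-line run has but the interpreter embedded in the PAM stack does not: sys.argv, $HOME, and ARVADOS_API_HOST. One way to handle them together is a small setup step that runs before import arvados. This is only a sketch: the function name, the temp-directory choice for HOME, and the host value are assumptions, and it is untested whether the SDK picks up the environment variable when set this way.

```python
import os
import sys
import tempfile


def prepare_embedded_environment(api_host, environ=os.environ):
    """Fill in process state that the tracebacks above show to be missing."""
    # Embedded interpreters may not define sys.argv; argparse reads sys.argv[0].
    if not hasattr(sys, "argv"):
        sys.argv = [""]
    # The SDK's http_cache builds a path from $HOME; point it at a scratch dir.
    if "HOME" not in environ:
        environ["HOME"] = tempfile.mkdtemp(prefix="arvados-pam-")
    # arvados.api('v1') without host/token falls back to configuration.
    environ.setdefault("ARVADOS_API_HOST", api_host)
    return environ


# Demonstrate on a scratch dict instead of the real process environment.
scratch = prepare_embedded_environment("zzzzz.arvadosapi.com", environ={})
print(sorted(scratch))
```

In the module itself the call would use the default environ and be followed by import arvados. Passing host and token explicitly to arvados.api() on every call, as in the second traceback, avoids the ARVADOS_API_HOST dependency altogether.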
Updated by Nico César over 9 years ago
I finally made the libpam module work.
So shellinabox works, with the tricky part being copying and pasting the Arvados API token. But it lets you select the user.
- window resizing works
- tab works
- color works
- copy works
- ñññ-test works
- pasting half-works: right click will bring a menu to
- BUG: some letters in my non-US layout don't work. Solved in v2.15rc2 https://github.com/shellinabox/shellinabox/commit/3570f20b0b0db1909bf19685128ed3ae3a3445dd
I wanted to try out GateOne https://github.com/liftoff/GateOne.git , but it requires python-tornado >= 4.0 and Debian sid comes with 3.2 (Jessie had 2.3). Before going down the rabbit hole of 400 pip installs, I prefer to look at other options.
Anyterm seems to be very old: it's an Apache module and it requires an old libboost (1.32) to compile. http://anyterm.org/1.1/install.html I have dealt with libboost version madness before. I don't think it is worth using the story point budget on this.
Updated by Nico César over 9 years ago
- File webmux.png webmux.png added
webmux is a MULTI-connection client. In our case it makes sense to run it on switchyard.
It has its OWN authentication screen with email/password. You then add connections, with the PRIVATE key for each connection loaded into the database (sqlite3 by default).
- Multi-tab
- window resizing works
- tab works
- color works
- copy works
- ñññ-test doesn't work
- pasting DOESN'T WORK!!
- websocket based
- python-twisted installed from backports and some extra package (python-twisted-sockjs) from http://sousmonlit.zincube.net/~niol/apt/pool/main/s/sockjs-twisted/python-twisted-sockjs_1.2.1-1_all.deb
- slower than
Updated by Nico César over 9 years ago
iframes are used to embed shellinabox
https://code.google.com/p/shellinabox/issues/detail?id=253#c1
Updated by Nico César over 9 years ago
- File shellinabox.png shellinabox.png added
Updated by Nico César over 9 years ago
- buy 1 certificate per installation: *.zzzzz.arvadosapi.com OR webshell.zzzzz.arvadosapi.com DONE
- make sure webshell.zzzzz.arvadosapi.com resolves to switchyard DONE
- have a map in hiera that location -> host for switchyard DONE
  - webshell.zzzzz.arvadosapi.com/location-example: location-example.shell.zzzzz.arvadosapi.com
  - webshell.zzzzz.arvadosapi.com/johns-shell: johns-shell.shell.zzzzz.arvadosapi.com
  - webshell.zzzzz.arvadosapi.com/default-shell: default-shell.shell.zzzzz.arvadosapi.com
- create a puppet manifest to handle reverse proxy and CORS in switchyard's nginx ( #6410 ) DONE
- package and install libpam module for arvados authentication ( #6360 ) in all shells
- package and install shellinabox-2.14 in shell via puppet (check if any extra patching is needed) DONE
- create a sample HTML needed for workbench
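To illustrate how the location -> host map turns into proxy configuration, here is a rough Python sketch that renders one nginx location block per entry. The real work is done by the puppet manifest in #6410, which is not shown here; the function name, the https scheme, and the bare proxy_pass line (no CORS or other proxy settings) are simplifying assumptions.

```python
def render_locations(location_to_host, scheme="https"):
    """Render one nginx 'location' block per entry of the location -> host map."""
    blocks = []
    for location, host in sorted(location_to_host.items()):
        blocks.append(
            "location /%s {\n    proxy_pass %s://%s;\n}" % (location, scheme, host))
    return "\n".join(blocks)


# The example map from the checklist above.
WEBSHELL_MAP = {
    "location-example": "location-example.shell.zzzzz.arvadosapi.com",
    "johns-shell": "johns-shell.shell.zzzzz.arvadosapi.com",
    "default-shell": "default-shell.shell.zzzzz.arvadosapi.com",
}

rendered = render_locations(WEBSHELL_MAP)
print(rendered)
```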
Updated by Nico César over 9 years ago
- Status changed from In Progress to Resolved