h1. Crunch2 installation

(DRAFT -- when ready, this will move to doc.arvados.org→install)

{{toc}}

h2. Set up a crunch-dispatch service

Currently, dispatching containers via SLURM is supported.

Install crunch-dispatch-slurm on a node that can submit SLURM jobs. This can be the SLURM controller node, a worker node, or any other node that has the appropriate SLURM/munge configuration.

<pre><code class="shell">
sudo apt-get install crunch-dispatch-slurm
</code></pre>

Create a privileged token for use by the dispatcher. If you have multiple dispatch processes, give each one a different token.

<pre><code class="shell">
apiserver:~$ cd /var/www/arvados-api/current
apiserver:/var/www/arvados-api/current$ sudo -u webserver-user RAILS_ENV=production bundle exec script/create_superuser_token.rb
zzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzz
</code></pre>

Save the token on the dispatch node, in <code>/etc/sv/crunch-dispatch-slurm/env/ARVADOS_API_TOKEN</code>.
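
For example (a sketch: the example @run@ script below starts the dispatcher with @chpst -e ./env@, which exports each file in that directory as an environment variable named after the file):

<pre><code class="shell">
dispatch-node$ sudo mkdir -p /etc/sv/crunch-dispatch-slurm/env
dispatch-node$ echo zzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzz |
    sudo tee /etc/sv/crunch-dispatch-slurm/env/ARVADOS_API_TOKEN
dispatch-node$ sudo chmod 0600 /etc/sv/crunch-dispatch-slurm/env/ARVADOS_API_TOKEN
</code></pre>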

Example runit script (@/etc/sv/crunch-dispatch-slurm/run@):

<pre><code class="shell">
#!/bin/sh
set -e
exec 2>&1

export ARVADOS_API_HOST=uuid_prefix.your.domain

exec chpst -e ./env -u crunch crunch-dispatch-slurm
</code></pre>

Example runit logging script (@/etc/sv/crunch-dispatch-slurm/log/run@):

<pre><code class="shell">
#!/bin/sh
set -e
[ -d main ] || mkdir main
exec svlogd -tt ./main
</code></pre>
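
With both scripts in place and executable, enable the service by symlinking it into the directory scanned by @runsvdir@ (assumed here to be @/etc/service@; some systems use @/service@ instead):

<pre><code class="shell">
dispatch-node$ sudo chmod +x /etc/sv/crunch-dispatch-slurm/run /etc/sv/crunch-dispatch-slurm/log/run
dispatch-node$ sudo ln -s /etc/sv/crunch-dispatch-slurm /etc/service/
</code></pre>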

Ensure the @crunch@ user exists -- and has the same UID, GID, and home directory -- on the dispatch node and all SLURM compute nodes. Ensure the @crunch@ user can run docker containers on the SLURM compute nodes.
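
For example (a sketch: the UID/GID of 5000 is an arbitrary illustration -- use any values unused on every node -- and @docker@ group membership is one common way to grant access to the docker daemon):

<pre><code class="shell">
# Run on the dispatch node and on every SLURM compute node.
sudo groupadd --gid 5000 crunch
sudo useradd --uid 5000 --gid 5000 --create-home crunch
# Compute nodes only: let the crunch user run docker containers.
sudo usermod -aG docker crunch
</code></pre>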

h2. Install crunch-run on all compute nodes

<pre><code class="shell">
sudo apt-get install crunch-run
</code></pre>

h2. Enable cgroup accounting on all compute nodes

(This requirement isn't new for crunch2/containers, but it seems to be a FAQ. The Docker install guide mentions it's optional and performance-degrading, so it's not too surprising if people skip it. Perhaps we should say why/when it's a good idea to enable it?)

Check https://docs.docker.com/engine/installation/linux/ for instructions specific to your distribution.

For example, on Ubuntu:
# Update @/etc/default/grub@ to include: <pre>
GRUB_CMDLINE_LINUX="cgroup_enable=memory swapaccount=1"
</pre>
# @sudo update-grub@
# Reboot
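
After rebooting, you can verify that the kernel picked up the new parameters (a sketch; @docker info@ prints a "No swap limit support" warning to stderr when swap accounting is missing):

<pre><code class="shell">
compute-node$ grep -o 'cgroup_enable=memory swapaccount=1' /proc/cmdline
cgroup_enable=memory swapaccount=1
compute-node$ sudo docker info >/dev/null    # should print no swap-limit warning
</code></pre>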

h2. Configure docker

Unchanged from current docs.

h2. Test the dispatcher

On the dispatch node, monitor the crunch-dispatch logs.

<pre><code class="shell">
dispatch-node$ tail -F /etc/sv/crunch-dispatch-slurm/log/main/current
</code></pre>

On a shell VM, install a docker image for testing.

<pre><code class="shell">
user@shellvm:~$ arv-keepdocker busybox
</code></pre>
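
If your client version supports it, running @arv-keepdocker@ with no arguments lists the images stored in Keep, which confirms the upload:

<pre><code class="shell">
user@shellvm:~$ arv-keepdocker    # with no arguments, lists images stored in Keep
</code></pre>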

On a shell VM, run a trivial container.

<pre><code class="shell">
user@shellvm:~$ arv container_request create --container-request '{
  "name": "test",
  "state": "Committed",
  "priority": 1,
  "container_image": "busybox",
  "command": ["echo", "OK"],
  "output_path": "/dev/null"
}'
</code></pre>

Measures of success:
* You should see dispatcher log entries indicating it has submitted a SLURM job.
* Provided the container doesn't finish before you get a chance to look, SLURM's @squeue@ command should show the new job in the list of queued/running jobs.
* After the SLURM job finishes, @arv container list --limit 1@ should indicate the outcome: <pre>
{
 ...
 "exit_code":0,
 ...
 "state":"Complete",
 ...
}
</pre>
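
To see the test container's output, fetch its log collection from Keep with @arv-get@ (a sketch: @log_pdh@ is a placeholder for the portable data hash in the container's @log@ field, and the log file names may vary between versions):

<pre><code class="shell">
# "log_pdh" is a placeholder for the portable data hash in the container's "log" field.
user@shellvm:~$ arv-get log_pdh/stdout.txt
OK
</code></pre>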