Crunch2 installation » History » Version 9
Brett Smith, 06/17/2016 07:38 PM
minor copyedits
h1. Crunch2 installation

(DRAFT -- when ready, this will move to doc.arvados.org→install)

{{toc}}

h2. Set up a crunch-dispatch service

Currently, dispatching containers via SLURM is supported.

Install crunch-dispatch-slurm on a node that can submit SLURM jobs. This can be any node appropriately configured to connect to the SLURM controller node.

<pre><code class="shell">
sudo apt-get install crunch-dispatch-slurm
</code></pre>

Create a privileged Arvados API token for use by the dispatcher. If you have multiple dispatch processes, you should give each one a different token.

<pre><code class="shell">
apiserver:~$ cd /var/www/arvados-api/current
apiserver:/var/www/arvados-api/current$ sudo -u webserver-user RAILS_ENV=production bundle exec script/create_superuser_token.rb
zzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzz
</code></pre>

Save the token on the dispatch node, in the file @/etc/sv/crunch-dispatch-slurm/env/ARVADOS_API_TOKEN@.

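One way to save the token is sketched below. The helper name @save_token@ is hypothetical and not part of Arvados or runit. Restricting the file to owner-only permissions is an assumption about what is appropriate for a privileged token, not a documented requirement.

```shell
#!/bin/sh
# Sketch: store an API token as a file in a runit "env" directory.
# save_token is a hypothetical helper name, not an Arvados or runit command.
# Usage: save_token TOKEN [ENV_DIR]
save_token() {
    token=$1
    # Default to the env directory named in this guide.
    env_dir=${2:-/etc/sv/crunch-dispatch-slurm/env}
    mkdir -p "$env_dir" || return 1
    (
        # The subshell keeps the umask change local to this function.
        umask 077
        printf '%s\n' "$token" > "$env_dir/ARVADOS_API_TOKEN" &&
        # Assumption: owner-only access is wanted; also covers a pre-existing file.
        chmod 600 "$env_dir/ARVADOS_API_TOKEN"
    )
}
```

For example, running @save_token zzzzz...@ as root (with the token printed by the previous step) writes the file in the default location.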
Example runit script (@/etc/sv/crunch-dispatch-slurm/run@):

<pre><code class="shell">
#!/bin/sh
set -e
# Send stderr to stdout so the runit log service captures both.
exec 2>&1

export ARVADOS_API_HOST=uuid_prefix.your.domain

# -e ./env: set environment variables from the files in ./env
# -u crunch: run the dispatcher as the crunch user
exec chpst -e ./env -u crunch crunch-dispatch-slurm
</code></pre>

Example runit logging script (@/etc/sv/crunch-dispatch-slurm/log/run@):

<pre><code class="shell">
#!/bin/sh
set -e
[ -d main ] || mkdir main
exec svlogd -tt ./main
</code></pre>

Ensure the @crunch@ user exists -- and has the same UID, GID, and home directory -- on the dispatch node and all SLURM compute nodes. Ensure the @crunch@ user can run Docker containers on SLURM compute nodes.

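The account check can be scripted roughly as follows. This is a sketch, not an official tool: @account_fields@ and @same_account@ are hypothetical helper names, and it assumes @getent@ is available on each node.

```shell
#!/bin/sh
# account_fields USER: print "uid:gid:home" for USER on the local node.
# Fields 3, 4 and 6 of a passwd entry are the UID, GID and home directory.
account_fields() {
    getent passwd "$1" | cut -d: -f3,4,6
}

# same_account REFERENCE CANDIDATE...: succeed only if every CANDIDATE
# string is identical to REFERENCE; report the first mismatch on stderr.
same_account() {
    reference=$1
    shift
    for candidate in "$@"; do
        if [ "$candidate" != "$reference" ]; then
            echo "mismatch: expected $reference, got $candidate" >&2
            return 1
        fi
    done
    return 0
}
```

On the dispatch node, something like @same_account "$(account_fields crunch)" "$(ssh compute0 getent passwd crunch | cut -d: -f3,4,6)"@ would compare the local account with one compute node. Here @compute0@ is a placeholder hostname, and working SSH access to that node is assumed.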
h2. Install crunch-run on all compute nodes

<pre><code class="shell">
sudo apt-get install crunch-run
</code></pre>

h2. Enable cgroup accounting on all compute nodes

(This requirement isn't new for crunch2/containers, but it seems to be a FAQ. The Docker install guide mentions it's optional and performance-degrading, so it's not too surprising if people skip it. Perhaps we should say why/when it's a good idea to enable it?)

Check https://docs.docker.com/engine/installation/linux/ for instructions specific to your distribution.

For example, on Ubuntu:

# Update @/etc/default/grub@ to include: <pre>
GRUB_CMDLINE_LINUX="cgroup_enable=memory swapaccount=1"
</pre>
# @sudo update-grub@
# Reboot

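After the reboot, a quick check like the one below can confirm that the running kernel was booted with both parameters. This is a sketch: @cgroup_params_enabled@ is a hypothetical helper name, and it only inspects the kernel command line (by default @/proc/cmdline@). It does not test Docker itself.

```shell
#!/bin/sh
# cgroup_params_enabled [CMDLINE_FILE]: succeed if both kernel parameters
# from the GRUB example appear in CMDLINE_FILE (default: /proc/cmdline).
cgroup_params_enabled() {
    cmdline_file=${1:-/proc/cmdline}
    for param in cgroup_enable=memory swapaccount=1; do
        # -F: fixed string, -w: whole-word match, -q: no output.
        if ! grep -qwF -- "$param" "$cmdline_file"; then
            echo "missing kernel parameter: $param" >&2
            return 1
        fi
    done
    return 0
}
```

Running @cgroup_params_enabled && echo ok@ on a compute node prints @ok@ only when both parameters are present.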
h2. Configure Docker

Unchanged from current docs.

h2. Test the dispatcher

On the dispatch node, monitor the crunch-dispatch logs.

<pre><code class="shell">
dispatch-node$ tail -F /etc/sv/crunch-dispatch-slurm/log/main/current
</code></pre>

On a shell VM, install a Docker image for testing.

<pre><code class="shell">
user@shellvm:~$ arv keep docker busybox
</code></pre>

On a shell VM, run a trivial container.

<pre><code class="shell">
user@shellvm:~$ arv container_request create --container-request '{
  "name": "test",
  "state": "Committed",
  "priority": 1,
  "container_image": "busybox",
  "command": ["true"],
  "output_path": "/out",
  "mounts": {
    "/out": {
      "kind": "tmp",
      "capacity": 1000
    }
  }
}'
</code></pre>

Measures of success:

* Dispatcher log entries will indicate it has submitted a SLURM job.
* Before the container finishes, SLURM's @squeue@ command will show the new job in the list of queued/running jobs.
* After the container finishes, @arv container list --limit 1@ will indicate the outcome: <pre>
{
 ...
 "exit_code":0,
 ...
 "state":"Complete",
 ...
}
</pre>
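
The last check can be automated with a small filter such as the sketch below. @container_succeeded@ is a hypothetical helper name. It uses plain text matching rather than a JSON parser, so it assumes the output resembles the excerpt above and describes a single container.

```shell
#!/bin/sh
# container_succeeded: read container JSON on stdin and succeed only if it
# shows "exit_code" equal to 0 and "state" equal to "Complete".
# Assumption: plain text matching is good enough for a one-container listing.
container_succeeded() {
    json=$(cat)
    printf '%s\n' "$json" |
        grep -Eq '"exit_code"[[:space:]]*:[[:space:]]*0[[:space:]]*([,}]|$)' &&
    printf '%s\n' "$json" |
        grep -Eq '"state"[[:space:]]*:[[:space:]]*"Complete"'
}
```

For example, @arv container list --limit 1 | container_succeeded && echo "test container completed"@ prints the message only when both fields match.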