h1. Crunch2 installation

(DRAFT -- when ready, this will move to doc.arvados.org→install)

{{toc}}

h2. Set up a crunch-dispatch service

Currently, dispatching containers via SLURM is supported.

Install crunch-dispatch-slurm on a node that can submit SLURM jobs. This can be the SLURM controller node, a worker node, or any other node that has the appropriate SLURM/munge configuration.

<pre><code class="shell">
sudo apt-get install crunch-dispatch-slurm
</code></pre>
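
Before going further, it can be worth confirming that this node really can submit SLURM jobs (a sketch using standard SLURM client commands; hostnames and output are site-specific):

<pre><code class="shell">
# Show partitions/nodes known to the SLURM controller.
dispatch-node$ sinfo
# Submit a trivial job and wait for its output.
dispatch-node$ srun -N1 hostname
</code></pre>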

Create a privileged token for use by the dispatcher. If you have multiple dispatch processes, you should give each one a different token.

<pre><code class="shell">
apiserver:~$ cd /var/www/arvados-api/current
apiserver:/var/www/arvados-api/current$ sudo -u webserver-user RAILS_ENV=production bundle exec script/create_superuser_token.rb
zzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzz
</code></pre>
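
To sanity-check the new token, you can ask the API server who it belongs to (a sketch; assumes the @arv@ CLI is installed, and uses the same placeholder host and token as the rest of this guide):

<pre><code class="shell">
dispatch-node$ ARVADOS_API_HOST=uuid_prefix.your.domain \
  ARVADOS_API_TOKEN=zzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzz \
  arv user current
</code></pre>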

Save the token on the dispatch node, in <code>/etc/sv/crunch-dispatch-slurm/env/ARVADOS_API_TOKEN</code>.
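
For example (a sketch; runit's @chpst -e@ reads one file per environment variable, so the file's name is the variable and its contents are the value -- the token shown is a placeholder):

<pre><code class="shell">
dispatch-node$ sudo mkdir -p /etc/sv/crunch-dispatch-slurm/env
dispatch-node$ echo zzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzz \
  | sudo tee /etc/sv/crunch-dispatch-slurm/env/ARVADOS_API_TOKEN
# This is a privileged token, so keep the file readable only by root
# (chpst reads it as root before dropping privileges to the crunch user).
dispatch-node$ sudo chmod 600 /etc/sv/crunch-dispatch-slurm/env/ARVADOS_API_TOKEN
</code></pre>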

Example runit script (@/etc/sv/crunch-dispatch-slurm/run@):

<pre><code class="shell">
#!/bin/sh
set -e
exec 2>&1

export ARVADOS_API_HOST=uuid_prefix.your.domain

exec chpst -e ./env -u crunch crunch-dispatch-slurm
</code></pre>

Example runit logging script (@/etc/sv/crunch-dispatch-slurm/log/run@):

<pre><code class="shell">
#!/bin/sh
set -e
[ -d main ] || mkdir main
exec svlogd -tt ./main
</code></pre>
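
With both scripts in place, enabling the service looks something like this under a typical runit setup (a sketch; the service directory is often @/etc/service@, but the exact path and mechanism vary by distribution):

<pre><code class="shell">
# Scripts must be executable for runsv to start them.
dispatch-node$ sudo chmod +x /etc/sv/crunch-dispatch-slurm/run /etc/sv/crunch-dispatch-slurm/log/run
# Symlink into the directory scanned by runsvdir to enable the service.
dispatch-node$ sudo ln -s /etc/sv/crunch-dispatch-slurm /etc/service/
dispatch-node$ sudo sv status crunch-dispatch-slurm
</code></pre>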

Ensure the @crunch@ user exists -- and has the same UID, GID, and home directory -- on the dispatch node and all SLURM compute nodes. Ensure the @crunch@ user can run docker containers on SLURM compute nodes.
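
One way to set this up (a sketch; the UID/GID of 6000 is only an example, pick values unused on all nodes, and the @docker@ group assumes the standard Docker group-based access control):

<pre><code class="shell">
# Run on the dispatch node and every SLURM compute node, with identical UID/GID.
sudo groupadd --gid 6000 crunch
sudo useradd --uid 6000 --gid crunch --create-home crunch
# On compute nodes: let the crunch user run docker containers.
sudo usermod -aG docker crunch
</code></pre>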

h2. Install crunch-run on all compute nodes

<pre><code class="shell">
sudo apt-get install crunch-run
</code></pre>

h2. Enable cgroup accounting on all compute nodes

(This requirement isn't new for crunch2/containers, but it seems to be a FAQ. The Docker install guide mentions it's optional and performance-degrading, so it's not too surprising if people skip it. Perhaps we should say why/when it's a good idea to enable it?)

Check https://docs.docker.com/engine/installation/linux/ for instructions specific to your distribution.

For example, on Ubuntu:
# Update @/etc/default/grub@ to include: <pre>
GRUB_CMDLINE_LINUX="cgroup_enable=memory swapaccount=1"
</pre>
# @sudo update-grub@
# Reboot (see the verification check below)
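
After rebooting, you can confirm the options took effect (a sketch; @/proc/cmdline@ shows the kernel's boot parameters):

<pre><code class="shell">
compute-node$ cat /proc/cmdline
# Expect cgroup_enable=memory swapaccount=1 to appear in the output.
</code></pre>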

h2. Configure docker

Unchanged from current docs.

h2. Test the dispatcher

On the dispatch node, monitor the crunch-dispatch logs.

<pre><code class="shell">
dispatch-node$ tail -F /etc/sv/crunch-dispatch-slurm/log/main/current
</code></pre>

On a shell VM, install a docker image for testing.

<pre><code class="shell">
user@shellvm:~$ arv-keepdocker busybox
</code></pre>
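
Running @arv-keepdocker@ with no image argument lists the Docker images already stored in Arvados, which is a quick way to confirm the upload (if your version behaves differently, check @arv-keepdocker --help@):

<pre><code class="shell">
user@shellvm:~$ arv-keepdocker
</code></pre>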

On a shell VM, run a trivial container.

<pre><code class="shell">
user@shellvm:~$ arv container_request create --container-request '{
  "name": "test",
  "state": "Committed",
  "priority": 1,
  "container_image": "busybox",
  "command": ["true"],
  "output_path": "/out",
  "mounts": {
    "/out": {
      "kind": "tmp",
      "capacity": 1000
    }
  }
}'
</code></pre>
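
To follow the request itself, you can poll it by name (a sketch; @--filters@ is the standard @arv@ list filter syntax, and "test" matches the request created above):

<pre><code class="shell">
user@shellvm:~$ arv container_request list --filters '[["name","=","test"]]'
# Once the dispatcher schedules it, the response includes the uuid of the
# container running on the request's behalf.
</code></pre>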

Measures of success:
* Dispatcher log entries will indicate it has submitted a SLURM job.
* Before the container finishes, SLURM's @squeue@ command will show the new job in the list of queued/running jobs.
* After the container finishes, @arv container list --limit 1@ will indicate the outcome: <pre>
{
 ...
 "exit_code":0,
 ...
 "state":"Complete",
 ...
}
</pre>