Bug #17589
Updated by Ward Vandewege almost 4 years ago
This is happening on Tordo (AWS test cluster), at version 2.2.0~dev20210426141415-1. Tordo is in a LoginCluster federation with ce8i5 (Azure test cluster) as the login cluster.
As a superuser on the tordo shell node, I mounted Keep with `arv-mount keep --debug --foreground` and then ran `ls keep/home`.
The `ls` hangs indefinitely. The only output from arv-mount is this:
<pre>
unique: 10, opcode: LOOKUP (1), nodeid: 1, insize: 45, pid: 29513
2021-04-27 01:35:09 arvados.arvados_fuse[29440] DEBUG: arv-mount lookup: parent_inode 1 name 'home' inode 5
unique: 10, success, outsize: 144
unique: 11, opcode: LOOKUP (1), nodeid: 1, insize: 45, pid: 29513
2021-04-27 01:35:09 arvados.arvados_fuse[29440] DEBUG: arv-mount lookup: parent_inode 1 name 'home' inode 5
unique: 11, success, outsize: 144
unique: 12, opcode: OPENDIR (27), nodeid: 5, insize: 48, pid: 29513
2021-04-27 01:35:09 arvados.arvados_fuse[29440] DEBUG: arv-mount opendir: inode 5
2021-04-27 01:35:09 arvados.arvados_fuse[29440] DEBUG: InodeCache touched inode 5 (size 0) (uuid ce8i5-tpzed-xo2k4i24bjzwedw) total now 0 (1 entries)
2021-04-27 01:35:09 arvados.arvados_fuse[29440] DEBUG: InodeCache touched inode 8 (size 0) (uuid ce8i5-tpzed-xo2k4i24bjzwedw) total now 0 (2 entries)
</pre>
After that, arv-mount slowly consumes more and more memory until the OOM killer kills it.
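To put numbers on the memory growth, something like the following could be run alongside the hung mount. It is only a sketch: it assumes a Linux `/proc` filesystem, and the helper names (`parse_vm_rss_kib`, `sample_rss`) are made up for illustration and are not part of arv-mount. The PID would be the arv-mount process (29440 in the log above).

```python
import re
import time


def parse_vm_rss_kib(status_text):
    """Return the VmRSS value (in KiB) from /proc/<pid>/status text, or None."""
    match = re.search(r"^VmRSS:\s+(\d+)\s+kB", status_text, re.MULTILINE)
    return int(match.group(1)) if match else None


def sample_rss(pid, samples=5, interval=1.0):
    """Read the resident set size of `pid` several times, `interval` seconds apart.

    Assumes Linux: the data comes from /proc/<pid>/status.
    """
    readings = []
    for _ in range(samples):
        with open(f"/proc/{pid}/status") as status_file:
            readings.append(parse_vm_rss_kib(status_file.read()))
        time.sleep(interval)
    return readings
```

A steadily increasing list from `sample_rss(<arv-mount pid>, samples=60, interval=5)` would confirm the leak and give a rough growth rate to attach to this ticket.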
While in this state, with `ls` hanging, accessing specific collections via `keep/by_id` works just fine.
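A small timeout wrapper makes it easier to check which paths hang and which still respond without tying up a shell. This is a hedged sketch, not a tested reproduction script: `completes_within` is a hypothetical helper, and it assumes the stuck `ls` can still be killed with SIGKILL (if the process were unkillable, the cleanup after the timeout would block as well).

```python
import subprocess


def completes_within(cmd, timeout_s=10.0):
    """Return True if `cmd` exits within `timeout_s` seconds, False if it times out.

    On timeout, subprocess.run kills the child and waits for it, so this
    assumes the child responds to SIGKILL.
    """
    try:
        subprocess.run(
            cmd,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=timeout_s,
        )
    except subprocess.TimeoutExpired:
        return False
    return True
```

With the mount from this report, `completes_within(["ls", "keep/home"])` would be expected to return `False`, while the same call on a specific collection path under `keep/by_id` would be expected to return `True`.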