Story #12308
Updated by Tom Clegg over 7 years ago
Background:
Python+llfuse was expedient and has done lots of good work for us, but it's not promising as a long-term (fast+reliable+maintainable) solution.
Replacement strategy:
* use collection-backed filesystem from #12483
* add a more general arvados-backed filesystem ("by_id" directory, etc.)
* present as FUSE using a library like https://godoc.org/github.com/hanwen/go-fuse/fuse or https://godoc.org/bazil.org/fuse
The arvados-to-filesystem mapping should be implemented as a native Go interface, with a separate thin layer attaching that to the FUSE library. This way we can export the same filesystem behavior through other interfaces. In particular, we will want at least:
* bazil.org/fuse is a popular choice for doing FUSE with Go
* billziss-gh/cgofuse cross-compiles to Linux, Windows, and OSX (but is probably not as good as bazil.org/fuse on Linux)
* WebDAV should export the same hierarchy
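The layering described above could look roughly like the following sketch. It assumes the standard library's @io/fs.FS@ as the "native Go interface", which this ticket does not specify, and uses an in-memory @fstest.MapFS@ as a stand-in for the arvados-backed filesystem. The names @newDemoFS@, @listFiles@, and @fetchOverHTTP@ and the example paths are hypothetical. A real FUSE or WebDAV adapter would replace the two stand-in functions, but the point is the same: each adapter sees only the interface.

```go
package main

import (
	"fmt"
	"io/fs"
	"net/http"
	"net/http/httptest"
	"testing/fstest"
)

// newDemoFS returns an in-memory stand-in for the arvados-backed filesystem.
// The "by_id" layout and file contents are made up for illustration.
func newDemoFS() fs.FS {
	return fstest.MapFS{
		"by_id/example-collection/hello.txt":    {Data: []byte("hello\n")},
		"by_id/example-collection/dir/data.bin": {Data: []byte{1, 2, 3}},
	}
}

// listFiles stands in for a FUSE adapter: it walks the tree using only the
// fs.FS interface and returns the paths of all regular files.
func listFiles(fsys fs.FS) ([]string, error) {
	var names []string
	err := fs.WalkDir(fsys, ".", func(p string, d fs.DirEntry, err error) error {
		if err != nil {
			return err
		}
		if !d.IsDir() {
			names = append(names, p)
		}
		return nil
	})
	return names, err
}

// fetchOverHTTP stands in for a WebDAV adapter: it serves the same fs.FS
// through net/http and reads one file back, without opening a network socket.
func fetchOverHTTP(fsys fs.FS, urlPath string) (string, error) {
	req := httptest.NewRequest(http.MethodGet, urlPath, nil)
	rec := httptest.NewRecorder()
	http.FileServer(http.FS(fsys)).ServeHTTP(rec, req)
	if rec.Code != http.StatusOK {
		return "", fmt.Errorf("unexpected status %d for %s", rec.Code, urlPath)
	}
	return rec.Body.String(), nil
}

func main() {
	fsys := newDemoFS()
	names, err := listFiles(fsys)
	if err != nil {
		panic(err)
	}
	fmt.Println(names)
	body, err := fetchOverHTTP(fsys, "/by_id/example-collection/hello.txt")
	if err != nil {
		panic(err)
	}
	fmt.Print(body)
}
```

Both adapters receive the same @fs.FS@ value, so they export an identical hierarchy by construction.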
To combat the proliferation of separately packaged client programs, we should build this as the first subcommand ("mount") of a new eventually-all-encompassing CLI tool "arvados". [[CLI client]]
TBD:
* Approach for handling websocket "update" events
* Selectable mechanisms/options for syncing to server (fflush, fsync, close) (on a shell node, flush-on-close might be best; in crunch-run, flush-on-exit might be best)
* Desired behavior when updates conflict (write error? clobber? create "oops,clobbered" file?)
* Control overall cache size (currently collectionfs can use lots of RAM in certain non-sequential write scenarios; we need the ability to trade speed for space efficiency in memory-constrained environments)
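As a starting point for the "selectable sync mechanisms" item, here is a minimal sketch of how a sync policy could be selected per deployment. Everything in it is an assumption for discussion, not a decided design: the @Event@ and @SyncPolicy@ types, the @ParseSyncPolicy@ function, and the option names @fsync@, @close@, and @exit@ are all invented here.

```go
package main

import (
	"fmt"
	"strings"
)

// Event is a point at which the filesystem could push changes to the server.
type Event int

const (
	EventFsync Event = iota // caller invoked fsync on an open file
	EventClose              // caller closed a file
	EventExit               // the mount is shutting down
)

// SyncPolicy is the set of events that trigger a sync (hypothetical type).
type SyncPolicy map[Event]bool

// ParseSyncPolicy turns an option string such as "fsync,close" into a policy.
// The option names are invented for this sketch.
func ParseSyncPolicy(opt string) (SyncPolicy, error) {
	known := map[string]Event{"fsync": EventFsync, "close": EventClose, "exit": EventExit}
	policy := SyncPolicy{}
	for _, name := range strings.Split(opt, ",") {
		ev, ok := known[strings.TrimSpace(name)]
		if !ok {
			return nil, fmt.Errorf("unknown sync option %q", name)
		}
		policy[ev] = true
	}
	return policy, nil
}

// ShouldSync reports whether the given event should trigger a sync.
func (p SyncPolicy) ShouldSync(ev Event) bool {
	return p[ev]
}

func main() {
	// Shell node: flush when a file is closed (or explicitly fsynced).
	shellNode, err := ParseSyncPolicy("fsync,close")
	if err != nil {
		panic(err)
	}
	// crunch-run: flush only when the mount exits.
	crunchRun, err := ParseSyncPolicy("exit")
	if err != nil {
		panic(err)
	}
	fmt.Println(shellNode.ShouldSync(EventClose), crunchRun.ShouldSync(EventClose)) // true false
}
```

Keeping the policy as plain data would let the FUSE layer, WebDAV, and crunch-run each choose their own setting without touching the filesystem code.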