Bug #5095
closed
[FUSE] arv-mount takes up too much memory and occasionally crashes when listing large 'home' directory
100%
Description
On lightning-dev2 (qr1hi), in the home directory of my keep mount (/home/abram/keep/home), running 'ls' causes arv-mount to balloon to 95% memory usage and sometimes crash. The 'ls' command itself sometimes takes 5 to 15 minutes to complete.
~$ cd keep/home
~/keep/home$ ls
(This can take 5 minutes or more.)
If it completes without crashing arv-mount (a crash forces me to remount the keep mount point), subsequent directory listings run without much trouble. I think arv-mount also settles down to a more reasonable memory usage (right now it's at 50% or so).
File access in the keep mount, for example through ~/keep/by_id/<PDH>, works fine while the 'ls' is running.
I believe I have around 1.7k elements in my 'home' project.
Updated by Brett Smith almost 10 years ago
- Subject changed from arv-mount takes up too much memory and occasionally crashes when listing the 'home' directory in my keep mount to [FUSE] arv-mount takes up too much memory and occasionally crashes when listing large 'home' directory
- Category set to Keep
Updated by Tom Clegg almost 10 years ago
- arv-mount is retrieving the manifest_text for all those collections, even though it doesn't need to do that in order to show a list of files.
- arv-mount isn't retrieving the manifest_text, but apiserver is spending a lot of time preprocessing each collection/page anyway.
- arv-mount is fetching smaller pages than necessary. (Perhaps limit=1000 is better if you know you're just going to keep asking for more pages anyway?)
Worth up to 1.0 points for investigating and reporting how much these (and other) factors contribute, even if the corresponding fixes aren't trivial.
Worth 2.0 points if the second point is a significant improvement.
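To illustrate the first and third factors above, here is a minimal sketch of how page size and field selection affect a full listing. It does not use the Arvados SDK: fetch_page and list_all are hypothetical stand-ins for a paged list call, the 1,700 sample records mimic the size of the project in the description, and the page size of 100 used for comparison is an assumption rather than a measured value.

```python
# Hypothetical stand-in for a paged list API; this is NOT the Arvados SDK.
def fetch_page(items, offset, limit, select=None):
    """Return one page of records, keeping only the fields named in select."""
    page = items[offset:offset + limit]
    if select is not None:
        page = [{key: rec[key] for key in select} for rec in page]
    return page


def list_all(items, limit, select=None):
    """Fetch every record page by page; return (records, request_count)."""
    records, requests, offset = [], 0, 0
    while True:
        page = fetch_page(items, offset, limit, select)
        requests += 1
        records.extend(page)
        if len(page) < limit:  # a short page means there is nothing left
            break
        offset += limit
    return records, requests


# Sample data: 1,700 collections, each carrying a bulky manifest_text field.
collections = [
    {"uuid": "c%04d" % i, "name": "coll%d" % i, "manifest_text": "x" * 1000}
    for i in range(1700)
]

# Listing file names only needs uuid and name, so manifest_text is left out.
small, small_requests = list_all(collections, limit=100, select=["uuid", "name"])
large, large_requests = list_all(collections, limit=1000, select=["uuid", "name"])
print(small_requests, large_requests)
```

In this toy model the larger page size cuts the number of round trips, and the select list keeps manifest_text out of memory entirely. Whether either effect dominates on a real mount is exactly what the investigation above would need to measure.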
Updated by Tom Clegg almost 10 years ago
- Category changed from Keep to FUSE
- Story points set to 2.0
Updated by Tom Clegg almost 10 years ago
- Target version changed from Bug Triage to 2015-02-18 sprint
Updated by Brett Smith almost 10 years ago
Reviewing 4106786.
- Directory uses the current time as the default mtime. CollectionDirectory overrides this with 0 for API collections that don't provide a modified_at value, but leaves Directory's default in place for collections retrieved from Keep. Isn't this kind of inconsistent?
- Sort of following along with the theme of this branch, wouldn't it save both time and code to update self._mtime in CollectionDirectory.new_collection? Then CollectionDirectory wouldn't need to override the mtime method at all, it could just use Directory's.
Thanks.
Updated by Peter Amstutz almost 10 years ago
- Status changed from New to In Progress
Updated by Peter Amstutz almost 10 years ago
Brett Smith wrote:
> Reviewing 4106786.
> - Directory uses the current time as the default mtime. CollectionDirectory overrides this with 0 for API collections that don't provide a modified_at value, but leaves Directory's default in place for collections retrieved from Keep. Isn't this kind of inconsistent?
I had not thought of that. It now defaults to 0 in both cases.
> - Sort of following along with the theme of this branch, wouldn't it save both time and code to update self._mtime in CollectionDirectory.new_collection? Then CollectionDirectory wouldn't need to override the mtime method at all, it could just use Directory's.
Yes. Fixed.
Now at ef4e4a3
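The two review points can be sketched roughly as follows. This is a simplified illustration, not the code in the branch: the class and method names come from the review comments, but the class bodies, the record dictionary, and the timestamp format passed to strptime are assumptions.

```python
import calendar
import time


class Directory:
    """Simplified stand-in for the FUSE directory base class."""

    def __init__(self):
        # Default to 0 (the epoch) rather than the current time, so that
        # every directory without a known timestamp reports the same value.
        self._mtime = 0

    def mtime(self):
        return self._mtime


class CollectionDirectory(Directory):
    """Simplified stand-in; it no longer overrides mtime()."""

    def new_collection(self, record):
        # Compute the timestamp once, when the collection record arrives,
        # instead of parsing modified_at on every mtime() call.
        modified_at = record.get("modified_at")
        if modified_at:
            # Assumed format: UTC timestamp such as 2015-02-18T00:00:00Z.
            parsed = time.strptime(modified_at, "%Y-%m-%dT%H:%M:%SZ")
            self._mtime = calendar.timegm(parsed)
        else:
            self._mtime = 0


api_dir = CollectionDirectory()
api_dir.new_collection({"modified_at": "1970-01-02T00:00:00Z"})

keep_dir = CollectionDirectory()
keep_dir.new_collection({})  # no modified_at, e.g. a collection read from Keep
print(api_dir.mtime(), keep_dir.mtime())
```

Storing the value in self._mtime means the inherited Directory.mtime serves both cases, which is the simplification the review asked for.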
Updated by Brett Smith almost 10 years ago
Updated by Peter Amstutz almost 10 years ago
- Status changed from In Progress to Resolved
Applied in changeset arvados|commit:8b90f80efca772efd2697ffc70d7809c32564171.