Bug #18719
s3 sync problems with arv-mount
Status: Resolved
Priority: Normal
Assigned To:
Category: FUSE
Target version:
Start date: 02/07/2022
Due date:
% Done: 100%
Estimated time: (Total: 0.00 h)
Story points: -
Release:
Release relationship: Auto
Description
Running "aws s3 sync" with an arv-mount directory as the destination fails with "[Errno 2] No such file or directory".
aws s3 sync --endpoint=https://collections.ce8i5.arvadosapi.com s3://ce8i5-4zz18-jaj3bafbrt0cb7s/ .
[Errno 2] No such file or directory

Traceback (most recent call last):
  File "awscli/clidriver.py", line 459, in main
  File "awscli/customizations/commands.py", line 197, in __call__
  File "awscli/customizations/commands.py", line 191, in __call__
  File "awscli/customizations/s3/subcommands.py", line 725, in _run_main
  File "awscli/customizations/s3/subcommands.py", line 991, in run
  File "awscli/customizations/s3/fileformat.py", line 55, in format
  File "awscli/customizations/s3/fileformat.py", line 84, in local_format
  File "posixpath.py", line 379, in abspath
FileNotFoundError: [Errno 2] No such file or directory

>>> os.path.abspath(".")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/lib64/python2.7/posixpath.py", line 371, in abspath
    cwd = os.getcwd()
OSError: [Errno 2] No such file or directory

sh-4.2$ cat .arvados#collection
cat: .arvados#collection: Input/output error

2022-02-04 14:17:39 arvados.arvados_fuse[1631186] ERROR: Unhandled exception during FUSE operation
Traceback (most recent call last):
  File "/home/peter/work/arvados/services/fuse/arvados_fuse/__init__.py", line 327, in catch_exceptions_wrapper
    return orig_func(self, *args, **kwargs)
  File "/home/peter/work/arvados/services/fuse/arvados_fuse/__init__.py", line 583, in lookup
    inode = p[name].inode
  File "/home/peter/work/arvados/services/fuse/arvados_fuse/fresh.py", line 25, in use_counter_wrapper
    return orig_func(self, *args, **kwargs)
  File "/home/peter/work/arvados/services/fuse/arvados_fuse/fresh.py", line 34, in check_update_wrapper
    return orig_func(self, *args, **kwargs)
  File "/home/peter/work/arvados/services/fuse/arvados_fuse/fusedir.py", line 566, in __getitem__
    self.collection_record_file = ObjectFile(self.inode, self.collection_record)
  File "/home/peter/work/arvados/services/fuse/arvados_fuse/fusefile.py", line 102, in __init__
    self.object_uuid = obj['uuid']
TypeError: 'NoneType' object is not subscriptable
Updated by Peter Amstutz almost 3 years ago
- Status changed from New to In Progress
Updated by Peter Amstutz almost 3 years ago
Issues seem to be:
- Generating conflicts with self (longstanding bug #11553)
- Replacing a collection entry with a new entry that has the same uuid allocates a new inode and discards the old one, so getcwd() then fails for any process whose working directory was the old inode
- collection_record not updated with new one (this might be a straightforward omission)
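The getcwd() failure can be reproduced without arv-mount. A minimal sketch, assuming a Linux-like system where a process may remove its own empty working directory; a temporary directory stands in for the collection directory whose inode went away, and `cwd_survives_removal` is a hypothetical helper name, not part of arvados_fuse:

```python
import os
import tempfile


def cwd_survives_removal():
    """Return True if os.getcwd() still works after the cwd is removed."""
    original = os.getcwd()
    doomed = tempfile.mkdtemp()
    try:
        os.chdir(doomed)
        os.rmdir(doomed)  # the directory this process sits in goes away
        try:
            os.getcwd()
        except FileNotFoundError:
            # Same [Errno 2] that os.path.abspath(".") hit in the report.
            return False
        return True
    finally:
        # Restore a valid working directory for the rest of the process.
        os.chdir(original)


if __name__ == "__main__":
    print("getcwd() survived:", cwd_survives_removal())
```

This only illustrates the symptom seen by the aws CLI; it says nothing about how the FUSE driver manages inodes internally.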
Updated by Peter Amstutz almost 3 years ago
"contents" response is returning past collection versions?
Updated by Peter Amstutz almost 3 years ago
aws s3 sync --endpoint=https://collections.ce8i5.arvadosapi.com s3://ce8i5-4zz18-pe56veug61jpmds/ .
Updated by Peter Amstutz almost 3 years ago
18719-fuse-fixes @ 97ace0ab8a33f488715909ba1058c790aeb0900b
Addresses the following bugs:
- group.contents appears to include previous version snapshots. Because the previous version typically has the same name as the head version, the snapshot ends up replacing the head in the directory listing.
- Websockets also sends out a creation event for the snapshot. It appears that websockets currently doesn't provide enough of the collection record to distinguish a new collection from a snapshot of a past version (and FUSE wasn't aware of snapshots anyway). This produces a similar problem where the snapshot ends up replacing the head version because they have the same name.
- When triggering live updates based on websocket events, FUSE would try to update a collection to a specific portable data hash, but this also ran into the issue of confusing the snapshot with the head version.
- When running update(), or on flush() after the user made a change to the collection, FUSE did not replace its internal copy of the collection record with the copy returned by the update operation. As a result, the special ".arvados#collection" file was always stale.
- When multiple files are being flushed at once, the commit_all() part of flush() can overlap, which can result in double attempts to delete a buffer block
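The first bug in this list can be pictured with a name-keyed listing, together with one possible filter. This is a sketch under assumptions, not the code in the branch: it assumes a snapshot can be recognized because its `current_version_uuid` differs from its own `uuid`, and the helper names and records are made up for illustration.

```python
def build_listing(items):
    """Key entries by name, as a directory listing does; later items win."""
    listing = {}
    for item in items:
        listing[item["name"]] = item
    return listing


def drop_snapshots(items):
    """Keep only head versions.

    Assumption: a past-version snapshot carries a current_version_uuid
    that differs from its own uuid, while a head version points at itself.
    """
    return [
        item for item in items
        if item.get("current_version_uuid", item["uuid"]) == item["uuid"]
    ]


# Made-up records for illustration; the UUIDs are placeholders.
head = {"uuid": "head-uuid", "current_version_uuid": "head-uuid",
        "name": "results", "version": 2}
snapshot = {"uuid": "snap-uuid", "current_version_uuid": "head-uuid",
            "name": "results", "version": 1}

contents = [head, snapshot]
unfiltered = build_listing(contents)                 # snapshot shadows head
filtered = build_listing(drop_snapshots(contents))   # head is kept
```

Without the filter, the entry named "results" resolves to the snapshot simply because it arrives later under the same name.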
Fixes:
- Filter snapshots from group.contents response
- Only invalidate directories based on websocket events (to be updated on next access); stop trying to be clever about applying incremental changes or updating to specific PDHs
- Change collection_record_file to use a callback that flushes and then gets the latest collection record from the Collection object
- Be more strategic about when to invalidate the collection_record_file
- Looking up ".arvados#collection" invalidates it and forces a refresh
- If a buffer block doesn't exist and we try to delete it, don't throw an error
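The callback and buffer-block fixes can be sketched roughly as follows. All class and method names here (`RecordFile`, `FakeCollection`, `latest_record`, `delete_buffer_block`) are hypothetical stand-ins chosen for illustration; the real arvados_fuse and SDK classes are structured differently.

```python
import json


class RecordFile:
    """Hypothetical stand-in for the special ".arvados#collection" file."""

    def __init__(self, get_record):
        self._get_record = get_record  # callback supplying the latest record
        self._cached = None

    def invalidate(self):
        self._cached = None

    def read(self):
        # Ask the callback only when the cached copy has been invalidated.
        if self._cached is None:
            self._cached = json.dumps(self._get_record(), sort_keys=True)
        return self._cached


class FakeCollection:
    """Hypothetical collection holding a record and pending buffer blocks."""

    def __init__(self):
        self.record = {"uuid": "placeholder-uuid", "version": 1}
        self.buffer_blocks = {"block-1": b"data"}

    def delete_buffer_block(self, block_id):
        # Deleting a block that is already gone is not an error.
        self.buffer_blocks.pop(block_id, None)

    def flush(self):
        pending = list(self.buffer_blocks)
        for block_id in pending:
            self.delete_buffer_block(block_id)
        if pending:
            # Pretend the update call returned a newer record.
            self.record = dict(self.record, version=self.record["version"] + 1)

    def latest_record(self):
        self.flush()  # flush first, then hand back the fresh record
        return self.record


collection = FakeCollection()
record_file = RecordFile(collection.latest_record)
first = json.loads(record_file.read())

collection.buffer_blocks["block-2"] = b"more"
record_file.invalidate()  # e.g. triggered by a lookup of the special file
second = json.loads(record_file.read())

collection.delete_buffer_block("block-2")  # already deleted, no exception
```

Because the file holds a callback rather than a copy of the record, a read after invalidation always reflects the state left by the most recent flush.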
Updated by Ward Vandewege almost 3 years ago
I did some manual testing and this version is definitely more reliable.
Just one note:
- In services/fuse/arvados_fuse/__init__.py, around line 480, a whole section of code is commented out. Should it just be removed?
LGTM otherwise. Thanks for the new comments!
Updated by Peter Amstutz almost 3 years ago
- Status changed from In Progress to Resolved