Bug #6706
closed[FUSE] Crash in forget()
100%
Description
I think this is getting called unexpectedly during shutdown, and self.inodes == None:
2015-07-21_23:55:01 su92l-8i9sb-m4m1girz122q3f1 4686 5 stderr 2015-07-21 23:55:00 arvados.arvados_fuse[9793] ERROR: Unhandled exception during FUSE operation 2015-07-21_23:55:01 su92l-8i9sb-m4m1girz122q3f1 4686 5 stderr Traceback (most recent call last): 2015-07-21_23:55:01 su92l-8i9sb-m4m1girz122q3f1 4686 5 stderr File "/usr/local/lib/python2.7/dist-packages/arvados_fuse/__init__.py", line 254, in catch_exceptions_wrapper 2015-07-21_23:55:01 su92l-8i9sb-m4m1girz122q3f1 4686 5 stderr return orig_func(self, *args, **kwargs) 2015-07-21_23:55:01 su92l-8i9sb-m4m1girz122q3f1 4686 5 stderr File "/usr/local/lib/python2.7/dist-packages/arvados_fuse/__init__.py", line 443, in forget 2015-07-21_23:55:01 su92l-8i9sb-m4m1girz122q3f1 4686 5 stderr ent = self.inodes[inode] 2015-07-21_23:55:01 su92l-8i9sb-m4m1girz122q3f1 4686 5 stderr TypeError: 'NoneType' object has no attribute '__getitem__'
Updated by Brett Smith over 9 years ago
- Category set to FUSE
- Assigned To set to Peter Amstutz
- Target version changed from Bug Triage to 2015-08-05 sprint
Updated by Brett Smith over 9 years ago
Reviewing 87ff392
This definitely does the job, and obviously it's straightforward. I wonder if we might be better off avoiding setting self.inodes = None
, though? Going over the llfuse docs, it doesn't look like it would be required, and avoiding it would help prevent future surprises like this. No other methods are prepared to cope with self.inodes = None
, though I agree it's hard to imagine why they would ever be called after destroy()
.
Just an idea, though. If it's an insufficient fix or inappropriate change, you can definitely merge this. Thanks.
Updated by Peter Amstutz over 9 years ago
While fixing this, I noticed another problem:
Exception in thread WebSocketClient (most likely raised during interpreter shutdown): Traceback (most recent call last): File "/usr/lib/python2.7/threading.py", line 810, in __bootstrap_inner File "/usr/lib/python2.7/threading.py", line 763, in run File "build/bdist.linux-x86_64/egg/ws4py/websocket.py", line 427, in run File "build/bdist.linux-x86_64/egg/ws4py/websocket.py", line 305, in once File "build/bdist.linux-x86_64/egg/ws4py/websocket.py", line 357, in process File "build/bdist.linux-x86_64/egg/ws4py/streaming.py", line 185, in receiver File "build/bdist.linux-x86_64/egg/ws4py/framing.py", line 138, in _parsing <type 'exceptions.TypeError'>: 'NoneType' object is not callable
Pulling on this thread (hah), it turns out that the correct way to shut down the websockets client is to use terminate()
, not close()
(from reading the code, it turns out the close()
method only sends a shutdown message to the server and does not actually close the socket or stop the client thread).
So as a bonus bugfix I am fixing up all the places in the Arvados tree that use Python websockets client to use terminate()
instead of close()
.
Updated by Peter Amstutz over 9 years ago
Brett Smith wrote:
Reviewing 87ff392
This definitely does the job, and obviously it's straightforward. I wonder if we might be better off avoiding setting
self.inodes = None
, though? Going over the llfuse docs, it doesn't look like it would be required, and avoiding it would help prevent future surprises like this. No other methods are prepared to cope withself.inodes = None
, though I agree it's hard to imagine why they would ever be called afterdestroy()
.Just an idea, though. If it's an insufficient fix or inappropriate change, you can definitely merge this. Thanks.
Introduced a clear()
method which wipes out the inode/inode cache entries dicts instead of setting to None. Thanks.
Updated by Peter Amstutz over 9 years ago
Peter Amstutz wrote:
So as a bonus bugfix I am fixing up all the places in the Arvados tree that use Python websockets client to use
terminate()
instead ofclose()
.
So, whoops, this turns out to be wrong. terminate()
seems to be intended as an internal method, but not documented as such. The correct way to shut down a connection is to call close()
and then wait for the closed()
callback.
EventClient now overrides close()
to provide the desired behavior in way that's compatible with the existing code using events.py.
Updated by Peter Amstutz over 9 years ago
- Status changed from New to In Progress
Updated by Brett Smith over 9 years ago
17ce65c is good to merge. Because we're changing the semantics of EventClient.close() from its superclass, a docstring that explains that might help avoid surprises for future users. That's the only other idea I had. Thanks.
Updated by Peter Amstutz over 9 years ago
- Status changed from In Progress to Resolved
- % Done changed from 50 to 100
Applied in changeset arvados|commit:3e6ba5fa5f225c8aa431ce9a2796369c1e1dda2d.