Bug #23452

open

Ensure KeepClient recovers memory on garbage collection

Added by Brett Smith about 1 month ago.

Status: New
Priority: Normal
Assigned To: -
Category: SDKs
Target version: -
Story points: -

Description

A user wrote a script that had basically this structure:

import arvados.collection

# inputs: an iterable of (collection ID, file path) pairs
for coll_id, coll_path in inputs:
    c = arvados.collection.Collection(coll_id)
    with c.open(coll_path) as f:
        ...  # Copy f somewhere

This ballooned in RAM use until it got OOM-killed or brought down the box. At a quick read, I note a couple of things: Each new Collection constructs a new KeepClient for itself. And as far as I can see, nothing stops that KeepClient's threads when the KeepClient, or its parent Collection object, is garbage collected. I suspect those threads are lingering. At the very least, the problem seems to be something in KeepClient: restructuring the script to build a single KeepClient and reuse it worked around the problem. That is consistent with what we see in our own tools that do similar work, like arv-copy and arv-mount.
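One way to test the lingering-thread suspicion is to count live threads before and after creating and discarding a batch of objects. The sketch below uses only the standard library. thread_growth and LeakyClient are hypothetical names made up for illustration, and LeakyClient is a stand-in that leaks on purpose; it is not how KeepClient is actually implemented.

```python
import gc
import threading


def thread_growth(make_object, rounds=5):
    """Create and drop `rounds` objects; return how many extra threads stay alive."""
    gc.collect()
    before = threading.active_count()
    for _ in range(rounds):
        obj = make_object()
        del obj  # drop the only reference, as the loop in the script does
    gc.collect()
    return threading.active_count() - before


class LeakyClient:
    """Hypothetical stand-in: starts a worker thread and never stops it."""

    def __init__(self):
        self._stop = threading.Event()
        # The thread waits on the event forever because nothing ever sets it.
        # daemon=True only lets this demo process exit; the thread still
        # lingers for as long as the process runs.
        self._worker = threading.Thread(target=self._stop.wait, daemon=True)
        self._worker.start()
```

With this stand-in, thread_growth(LeakyClient) reports one surviving thread per discarded object. Passing a callable that builds a Collection and reads a file from it might show whether the real SDK behaves the same way; that has not been checked here.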

Make sure KeepClient fully cleans up after itself when it is garbage collected. This is hopefully as simple as adding a __del__ method that joins threads, but we should confirm that and add any other necessary cleanup.
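As a rough illustration of the kind of cleanup being asked for, here is a minimal sketch of an object that stops and joins its worker threads when it is garbage collected. It uses weakref.finalize from the standard library rather than a __del__ method; a __del__ that does the same work may be enough in simple cases. PooledClient, _worker, and _shutdown are hypothetical names, and none of this reflects KeepClient's real internals. The sketch assumes the worker threads do not hold a reference back to the owning object, since a thread that did would keep the object from being collected at all.

```python
import queue
import threading
import weakref


def _worker(tasks):
    """Run callables from the queue until a None sentinel arrives."""
    while True:
        task = tasks.get()
        if task is None:
            return
        task()


def _shutdown(tasks, threads):
    """Send one sentinel per worker, then join each worker."""
    for _ in threads:
        tasks.put(None)
    for t in threads:
        # A thread cannot join itself, so skip it if cleanup runs in a worker.
        if t is not threading.current_thread():
            t.join()


class PooledClient:
    """Hypothetical stand-in for a client that owns worker threads."""

    def __init__(self, num_threads=2):
        self._tasks = queue.Queue()
        # Workers are given only the queue, never `self`.
        self._threads = [
            threading.Thread(target=_worker, args=(self._tasks,), daemon=True)
            for _ in range(num_threads)
        ]
        for t in self._threads:
            t.start()
        # Calls _shutdown once: on garbage collection, on close(), or at exit.
        self._finalizer = weakref.finalize(
            self, _shutdown, self._tasks, self._threads)

    def submit(self, task):
        """Queue a zero-argument callable for a worker to run."""
        self._tasks.put(task)

    def close(self):
        """Stop the workers explicitly; safe to call more than once."""
        self._finalizer()
```

The explicit close() gives callers a deterministic way to release the threads, while the finalizer covers the case in the script above, where the object is simply dropped on each loop iteration.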
