Bug #10445

open

Fix memory leak in Python SDK Collection class

Added by Tom Clegg about 8 years ago. Updated about 2 years ago.

Status: New
Priority: Normal
Assigned To: -
Category: SDKs
Target version:
Start date: 11/02/2016
Due date:
% Done: 0%
Estimated time:
Story points: 2.0

Description

Currently, each new CollectionReader creates its own API client, Keep client, and block cache unless the caller supplies an API client object (and Keep client object?). If a caller creates 10 CollectionReaders and reads 64 MiB from each one, the program will use 640 MiB. Possibly due to HTTP KeepAlive behavior, Python does not reclaim the memory even after the caller drops its references to the CollectionReaders. For example, this script leaks memory and network connections:

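import arvados

uuid = '......'
for i in range(20):
    # Each iteration builds a new reader with no shared clients, so it
    # gets its own API client, Keep client, and block cache.
    cr = arvados.collection.CollectionReader(uuid)
    for fn in cr:
        # Read every file in full so its blocks land in the block cache.
        f = cr.open(fn)
        f.read()
        f.close()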

Proposed improvement:

Share block caches between auto-instantiated API clients that use the same settings.

This also applies to KeepClient and BlockManager.
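A minimal sketch of the proposed sharing, assuming caches can be keyed by the settings that distinguish one auto-instantiated client from another. `BlockCache`, `shared_block_cache`, and the choice of key fields (`api_host`, `api_token`, `cache_max`) are hypothetical stand-ins for illustration, not the SDK's actual classes or API:

```python
import threading


class BlockCache:
    """Hypothetical stand-in for a block cache (not the SDK class)."""

    def __init__(self, cache_max):
        self.cache_max = cache_max
        self.blocks = {}


# Registry of caches, keyed by the settings they were created with.
_shared_caches = {}
_shared_caches_lock = threading.Lock()


def shared_block_cache(api_host, api_token, cache_max=256 * 1024 * 1024):
    """Return the one BlockCache for this settings tuple, creating it once."""
    key = (api_host, api_token, cache_max)
    with _shared_caches_lock:
        cache = _shared_caches.get(key)
        if cache is None:
            cache = _shared_caches[key] = BlockCache(cache_max)
        return cache


# Two clients built with the same settings end up with the same cache,
# so 20 readers would no longer mean 20 separate caches.
first = shared_block_cache("api.example.invalid", "token-a")
second = shared_block_cache("api.example.invalid", "token-a")
different = shared_block_cache("api.example.invalid", "token-b")
```

In this sketch the registry holds strong references, so a cache lives for the life of the process. Whether that is acceptable, or whether the caches should be weakly referenced, is a design question the sketch does not settle.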


Files

leaktest.py (1.11 KB), Tom Clegg, 11/02/2016 03:32 PM

Related issues 1 (1 open, 0 closed)

Related to Arvados - Feature #10493: [Python SDK] Provide more control over pre-fetch behavior (New, 11/09/2016)

Actions #1

Updated by Tom Clegg about 8 years ago

Actions #2

Updated by Peter Amstutz about 8 years ago

The read cache is actually in KeepClient. Reads go through the block manager, but sharing the read cache is just a matter of using a shared KeepClient object.

(Technically, you could even create multiple KeepClient objects and initialize them with the same KeepBlockCache.)
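To illustrate the point about several clients sharing one cache, here is a self-contained sketch using stand-in classes. `FakeBlockCache` and `FakeKeepClient` are hypothetical and only mimic the shape of the idea. In the real SDK, the equivalent would presumably be passing one shared object through the constructors' keyword arguments (assumed here to be `keep_client` on CollectionReader and `block_cache` on KeepClient; check the SDK before relying on those names).

```python
class FakeBlockCache:
    """Stand-in for a block cache: maps a block locator to its bytes."""

    def __init__(self):
        self._blocks = {}

    def get(self, locator):
        return self._blocks.get(locator)

    def put(self, locator, data):
        self._blocks[locator] = data


class FakeKeepClient:
    """Stand-in client that counts how often it has to 'fetch' a block."""

    def __init__(self, block_cache=None):
        # Without a supplied cache, each client creates its own, which is
        # the behavior that multiplies memory use.
        self.block_cache = block_cache if block_cache is not None else FakeBlockCache()
        self.fetches = 0

    def get(self, locator):
        data = self.block_cache.get(locator)
        if data is None:
            # Cache miss: simulate a network fetch and remember the result.
            self.fetches += 1
            data = b"data for " + locator.encode()
            self.block_cache.put(locator, data)
        return data


# Two clients initialized with the same cache: the second read is a hit.
cache = FakeBlockCache()
kc1 = FakeKeepClient(block_cache=cache)
kc2 = FakeKeepClient(block_cache=cache)
kc1.get("abc")
kc2.get("abc")

# Two clients with private caches: both have to fetch.
kc3 = FakeKeepClient()
kc4 = FakeKeepClient()
kc3.get("abc")
kc4.get("abc")
```

With the shared cache only `kc1` performs a fetch. With private caches both `kc3` and `kc4` fetch and each keeps its own copy of the block.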

Actions #3

Updated by Tom Clegg about 8 years ago

  • Description updated (diff)
Actions #4

Updated by Tom Morris over 7 years ago

  • Target version set to Arvados Future Sprints
Actions #5

Updated by Ward Vandewege over 3 years ago

  • Target version deleted (Arvados Future Sprints)
Actions #6

Updated by Peter Amstutz about 2 years ago

  • Target version set to 2022-11-23 sprint
Actions #7

Updated by Peter Amstutz about 2 years ago

  • Target version changed from 2022-11-23 sprint to 2022-12-07 Sprint
Actions #8

Updated by Peter Amstutz about 2 years ago

  • Description updated (diff)
Actions #9

Updated by Peter Amstutz about 2 years ago

  • Target version changed from 2022-12-07 Sprint to 2022-12-21 Sprint
Actions #10

Updated by Peter Amstutz about 2 years ago

  • Subject changed from [SDKs] Fix memory leak in Python SDK Collection class to Fix memory leak in Python SDK Collection class
Actions #11

Updated by Tom Clegg about 2 years ago

  • Story points set to 1.0
Actions #12

Updated by Peter Amstutz about 2 years ago

  • Story points changed from 1.0 to 2.0
Actions #13

Updated by Peter Amstutz about 2 years ago

  • Target version changed from 2022-12-21 Sprint to 2023-01-18 sprint
Actions #14

Updated by Peter Amstutz about 2 years ago

  • Target version changed from 2023-01-18 sprint to 2023-02-01 sprint
Actions #15

Updated by Peter Amstutz about 2 years ago

  • Target version deleted (2023-02-01 sprint)
  • Release set to 59
Actions #16

Updated by Peter Amstutz about 2 years ago

  • Target version set to To be scheduled