Feature #5202

Hash individual files in collections.

Added by Peter Amstutz about 11 years ago. Updated about 6 years ago.

Status: Closed
Priority: Normal
Assigned To: -
Category: -
Target version: -
Story points: -

Description

It would be very helpful to have hashes for the individual files in a collection available by default. This supports several use cases:

  1. Checking whether a file in Keep is the same as a local file (even if they have different file names), for validation or to avoid redundant file uploads.
  2. Searching to see whether a given file is present in Keep (for example, auditing whether a file was uploaded that should not have been).
  3. Identifying files that are stored multiple times in Keep with different block alignment (resulting in different blocks). Such files could be re-packed, and the relevant collections updated, to achieve deduplication.
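
To illustrate why a per-file hash helps with these use cases, here is a minimal sketch. It is not the Arvados API or the proposed implementation: the function names, the in-memory layout of a "collection" (a dict of file name to list of blocks), and the choice of SHA-256 are all assumptions made for the example. It shows that a hash computed over file content is independent of how the content is split into blocks, so it can match files across different names and block alignments.

```python
import hashlib
from typing import Dict, Iterable, List, Tuple


def file_digest(blocks: Iterable[bytes]) -> str:
    """Hash a file's content by streaming its blocks in order.

    SHA-256 is an assumption for this sketch; the ticket does not
    name a hash algorithm.
    """
    h = hashlib.sha256()
    for block in blocks:
        h.update(block)
    return h.hexdigest()


def block_digests(blocks: Iterable[bytes]) -> List[str]:
    """Hash each block separately (depends on block alignment)."""
    return [hashlib.sha256(b).hexdigest() for b in blocks]


def split(data: bytes, size: int) -> List[bytes]:
    """Split data into fixed-size blocks (hypothetical block layout)."""
    return [data[i:i + size] for i in range(0, len(data), size)]


def build_file_index(
    collections: Dict[str, Dict[str, List[bytes]]]
) -> Dict[str, List[Tuple[str, str]]]:
    """Map each file digest to every (collection, file name) holding it.

    `collections` is a hypothetical in-memory stand-in:
    collection id -> file name -> list of blocks.
    """
    index: Dict[str, List[Tuple[str, str]]] = {}
    for coll_id, files in collections.items():
        for name, blocks in files.items():
            index.setdefault(file_digest(blocks), []).append((coll_id, name))
    return index


def duplicated_files(
    index: Dict[str, List[Tuple[str, str]]]
) -> Dict[str, List[Tuple[str, str]]]:
    """Keep only digests that appear in more than one place."""
    return {d: locs for d, locs in index.items() if len(locs) > 1}
```

With an index like this, use case 1 is a lookup of a local file's digest, use case 2 is the same lookup run as an audit, and use case 3 is the output of `duplicated_files`, which lists files whose content matches even though their per-block hashes differ.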

Implementation proposal sketch here:

https://arvados.org/projects/arvados/wiki/Separating_files_from_collections

Actions #1

Updated by Peter Amstutz about 11 years ago

  • Description updated (diff)
Actions #5

Updated by Peter Amstutz about 6 years ago

  • Status changed from New to Closed