Idea #22458
open
Ability to intentionally turn a collection into a "ghost" collection
Description
For provenance, I would like to keep collection records around.
However, in some cases I don't want to store the intermediate data. For example, I might have processing steps where the output is just as large or larger than the input data.
I propose allowing replication_desired to be set to zero, indicating that keep-balance may trash the underlying blocks without reporting them as "missing" blocks. Once set to zero, replication_desired cannot be increased. I call these "ghost collections".
(Other names that just came to me are "dehydrated" or "freeze-dried" collections.)
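A minimal sketch of the one-way rule described above, in plain Python. The function name and the error strings are hypothetical, and treating a null replication_desired as "not a ghost" is an assumption; this is not existing API server code.

```python
from typing import Optional


def validate_replication_change(old: Optional[int], new: Optional[int]) -> Optional[str]:
    """Return None if the change is allowed, otherwise an error message.

    Hypothetical helper: `old` and `new` are the stored and requested
    replication_desired values. None is assumed to mean "cluster default".
    """
    if new is not None and new < 0:
        return "replication_desired must not be negative"
    # Once a collection is a ghost (replication_desired == 0), the value is
    # frozen at zero: any other value, including None, is rejected.
    if old == 0 and new != 0:
        return "replication_desired cannot be increased on a ghost collection"
    return None
```

Setting the value to zero is always accepted in this sketch; whether that should require extra confirmation is left open.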
Fetching a ghost collection returns an unsigned manifest.
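As a rough illustration of what "unsigned" could mean here, the sketch below strips permission hints from block locators in a manifest. It assumes a hint looks like "+A", a hex signature, "@", and a hex expiry time, and that locators start with a 32-digit hex hash and a size. The function names are hypothetical, and this is not how the API server is known to do it.

```python
import re

# Assumed locator shape: 32 hex digits, "+size", then optional "+hint" parts.
_LOCATOR = re.compile(r"^[0-9a-f]{32}\+\d+(\+\S+)*$")
# Assumed permission hint shape: "+A<hex signature>@<hex expiry>".
_SIGNATURE_HINT = re.compile(r"\+A[0-9a-f]+@[0-9a-f]+")


def unsign_locator(token: str) -> str:
    """Drop the permission hint from a block locator; leave other tokens alone."""
    if not _LOCATOR.match(token):
        return token  # stream name or file segment, not a block locator
    return _SIGNATURE_HINT.sub("", token)


def unsign_manifest(manifest_text: str) -> str:
    """Return the manifest with every block locator unsigned."""
    lines = []
    for line in manifest_text.split("\n"):
        lines.append(" ".join(unsign_locator(t) for t in line.split(" ")))
    return "\n".join(lines)
```

Only tokens that look like locators are touched, so a file name that happens to contain "+A...@..." is left unchanged.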
Ghost collection records should behave similarly to frozen projects: read-only, except for being moved between projects (it might be ok to edit metadata such as name and properties as well).
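The read-only rule could be checked along these lines. This is only a sketch: it assumes that moving a collection between projects means changing owner_uuid, and the set names, function name, and allow_metadata_edits flag are hypothetical.

```python
# Fields a ghost collection may always change (moving between projects).
ALWAYS_MUTABLE = {"owner_uuid"}
# Fields that might also be allowed, per the open question above.
MAYBE_MUTABLE = {"name", "properties"}


def rejected_fields(changed, allow_metadata_edits=False):
    """Return the sorted list of changed fields a ghost collection must refuse."""
    allowed = set(ALWAYS_MUTABLE)
    if allow_metadata_edits:
        allowed |= MAYBE_MUTABLE
    return sorted(set(changed) - allowed)
```

An update would be accepted only when the returned list is empty, for example `rejected_fields(["owner_uuid"])`.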
Similar to trash_at / delete_at, it would also be nice to have a ghost_at field, and a corresponding output_ghost_ttl on container requests that lets you specify that a collection should be ghosted at some point in the future -- helpful to keep intermediate results around for a little while, but not forever.
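A small sketch of how ghost_at might be derived from output_ghost_ttl. It assumes the TTL is a number of seconds counted from when the container request finishes, and that zero or less means "never ghost"; neither is specified above, and both function names are hypothetical.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional


def compute_ghost_at(finished_at: datetime, output_ghost_ttl: int) -> Optional[datetime]:
    """Return the time at which the output collection should be ghosted.

    Assumption: output_ghost_ttl is in seconds; <= 0 disables ghosting.
    """
    if output_ghost_ttl <= 0:
        return None
    return finished_at + timedelta(seconds=output_ghost_ttl)


def is_due(ghost_at: Optional[datetime], now: datetime) -> bool:
    """True once the ghost_at time has been reached."""
    return ghost_at is not None and now >= ghost_at


if __name__ == "__main__":
    finished = datetime(2025, 1, 1, tzinfo=timezone.utc)
    # Keep intermediate results for one week, then allow them to be ghosted.
    print(compute_ghost_at(finished, 7 * 24 * 3600))
```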
Clients such as Workbench, keep-web, and the Python SDK should be made aware of ghost collections, so that they return a sensible error if the user tries to read a file, instead of a scary "failed to read block" error.
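On the client side, the check could look something like the following. It is a sketch only: GhostCollectionError and check_readable are hypothetical names, not part of the Python SDK, and the collection record is modeled as a plain dict.

```python
class GhostCollectionError(IOError):
    """Raised when file data is requested from a ghost collection."""


def check_readable(collection: dict, path: str) -> None:
    """Raise a clear error before any block read is attempted.

    Assumption: a ghost collection is one whose record has
    replication_desired == 0.
    """
    if collection.get("replication_desired") == 0:
        raise GhostCollectionError(
            "cannot read %r: collection %s is a ghost collection, "
            "so its data blocks are no longer stored"
            % (path, collection.get("uuid", "(unknown)"))
        )
```

A client would call this before opening a file, so the user sees the explanation instead of a low-level block read failure.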
If the ghost collection exists on another cluster readable by the user, it should be possible to fetch the blocks automatically via federation. It should also be possible to rematerialize/rehydrate the collection by downloading all the blocks from somewhere else and rewriting the manifest with current block signatures, as proof that the collection is readable again.
Updated by Peter Amstutz about 1 year ago
- Position changed from -933268 to -933261
Updated by Peter Amstutz about 1 year ago
- Description updated (diff)
- Subject changed from Ability to intentional make a collection a "ghost" collection to Ability to intentionally turn a collection a "ghost" collection
Updated by Peter Amstutz about 1 year ago
- Related to Idea #22459: Design proposal for manual "empty trash" command added