Feature #11184
closed
[Keep] Support multiple storage classes
Added by Tom Morris almost 8 years ago.
Updated over 3 years ago.
Description
As an Arvados system administrator, I want to take advantage of the cool/cold storage classes offered by cloud vendors.
This involves designating the desired storage class(es) at some level (per collection, for all collections in a project, etc.) and providing a way to migrate data between storage classes.
Keep storage classes
Overview
- Each keep volume offers one or more storage classes (the default is just the "default" class).
- Each collection has one or more desired storage classes (the default is just the "default" class).
- When writing, clients may specify one or more required storage classes; if not, the required class is "default". A keepstore server will only write the data on a volume that offers all of the required classes.
- Keep-balance moves data to volumes that have the desired attributes and updates collection records to reflect the storage classes currently satisfied by all blocks in the collection (much like replication level, these are not necessarily equal to the desired classes).
If overlapping collections (i.e., with common data blocks) request different storage classes, keep-balance will maintain multiple copies of the common blocks if necessary to satisfy all collections' requirements.
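The write rule and the overlapping-collections rule above can be sketched as simple set operations. This is only an illustration under stated assumptions: `eligible_volumes` and `classes_needed_per_block` are hypothetical names, volumes are modeled as a plain mapping from volume name to offered classes, and nothing here reflects the actual keepstore or keep-balance code.

```python
from typing import Dict, Iterable, List, Optional, Set, Tuple

DEFAULT_CLASS = "default"


def eligible_volumes(
    volumes: Dict[str, Iterable[str]],
    required: Optional[Iterable[str]] = None,
) -> List[str]:
    """Return the names of volumes that offer ALL required classes.

    If the client specifies no classes, the required class is "default".
    (Hypothetical helper; the volume mapping is an assumed data model.)
    """
    required_set = set(required) if required else {DEFAULT_CLASS}
    return [
        name
        for name, offered in volumes.items()
        if required_set <= set(offered)  # subset test: every class is offered
    ]


def classes_needed_per_block(
    collections: Iterable[Tuple[Iterable[str], Iterable[str]]],
) -> Dict[str, Set[str]]:
    """Union the desired classes of every collection referencing each block.

    Each collection is given as (desired_classes, block_ids). A block shared
    by several collections must end up satisfying all of their classes.
    """
    needed: Dict[str, Set[str]] = {}
    for desired, blocks in collections:
        desired_set = set(desired) or {DEFAULT_CLASS}
        for block in blocks:
            needed.setdefault(block, set()).update(desired_set)
    return needed
```

For example, with volumes `{"vol-a": ["default"], "vol-b": ["default", "archive"]}`, a write requiring both `default` and `archive` can only land on `vol-b`. A block shared by a `default` collection and an `archive` collection needs both classes, which may take two copies if no single volume offers both.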
Simplifying restrictions in initial implementation (#12708):
- No client side support.
- No keepstore support for writing data to a given storage class.
- API server configuration specifies the set of classes that can be requested (in addition to "default", which is always available).
- Keepstore configuration specifies the set of classes offered by each volume.
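As a rough sketch of the API-side restriction, a requested class list could be checked against the configured set plus the always-available "default" class. The function name `validate_desired_classes` and the error behavior are assumptions for illustration, not the API server's actual validation.

```python
from typing import Iterable, List

DEFAULT_CLASS = "default"


def validate_desired_classes(
    desired: Iterable[str], configured: Iterable[str]
) -> List[str]:
    """Check desired classes against the configured set (hypothetical helper).

    "default" is always allowed, even if it is not listed in the
    configuration. An empty request falls back to ["default"].
    """
    allowed = set(configured) | {DEFAULT_CLASS}
    desired_set = set(desired)
    unknown = sorted(desired_set - allowed)
    if unknown:
        raise ValueError("unknown storage classes: " + ", ".join(unknown))
    return sorted(desired_set) or [DEFAULT_CLASS]
```

Rejecting unknown classes up front is one possible design; it avoids a collection requesting a class that no volume could ever satisfy.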
- Assigned To set to Tom Clegg
- Description updated (diff)
- Assigned To changed from Tom Clegg to Tom Morris
- Description updated (diff)
Suggest:
- specify desired functionality (user can push X buttons, result is Y)
- enumerate features/improvements needed to achieve desired functionality, and define any new APIs needed
- estimate points (or maybe split into separate stories if needed)
- Description updated (diff)
- Target version changed from Arvados Future Sprints to To Be Groomed
- Assigned To changed from Tom Morris to Tom Clegg
- Blocked by Feature #11645: [keepstore] Add "StorageClasses" field to volume config added
- Subject changed from [Keep] Support multiple storage tiers to [Keep] Support multiple storage classes
- Description updated (diff)
Pricing examples (moved from description)
Microsoft Azure (pricing at 50-500 TB level)
- LRS-COOL $0.01/GB/mo, $0.01/10Kops + $0.01/GB
- LRS-HOT $0.0177/GB/mo, $0.05/10Kops
- GRS-COOL $0.02/GB/mo, $0.20/10Kops + $0.01/GB
- GRS-HOT $0.0354/GB/mo, $0.10/10Kops
- RAGRS-COOL $0.025/GB/mo, $0.20/10Kops + $0.01/GB
- RAGRS-HOT $0.0442/GB/mo, $0.10/10Kops
Amazon S3 (pricing at 50-500 TB level)
- Standard - $0.022/GB/mo, $0.004/10Kops (get)
- Infrequent Access - $0.0125/GB/mo, $0.01/10Kops (get)
- Glacier - $0.004/GB/mo + variable retrieval charge depending on speed
Google
- Multi-Regional Storage $0.026/GB/mo
- Regional Storage $0.02/GB/mo
- Nearline Storage $0.01/GB/mo, $0.01/GB retrieval charge
- Coldline Storage $0.007/GB/mo, $0.05/GB retrieval charge
- Optional bucket versioning
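To make the trade-off concrete, here is a small worked comparison using the Google Regional and Nearline figures listed above. It is only a sketch: it assumes 1 TB = 1000 GB, ignores per-operation charges, and the helper names are hypothetical.

```python
GB_PER_TB = 1000  # assumption: decimal terabytes


def monthly_storage_cost(tb_stored: float, price_per_gb_month: float) -> float:
    """Monthly storage charge in dollars, excluding operations and retrieval."""
    return tb_stored * GB_PER_TB * price_per_gb_month


def break_even_retrieval_gb(
    tb_stored: float,
    warm_price: float,
    cool_price: float,
    retrieval_price_per_gb: float,
) -> float:
    """GB retrieved per month at which the cooler class stops being cheaper."""
    saving = monthly_storage_cost(tb_stored, warm_price) - monthly_storage_cost(
        tb_stored, cool_price
    )
    return saving / retrieval_price_per_gb


# 100 TB stored: Regional ($0.02/GB/mo) vs Nearline ($0.01/GB/mo + $0.01/GB retrieval)
regional = monthly_storage_cost(100, 0.02)
nearline = monthly_storage_cost(100, 0.01)
threshold_gb = break_even_retrieval_gb(100, 0.02, 0.01, 0.01)
```

With these figures, 100 TB costs roughly $2,000/mo in Regional versus $1,000/mo in Nearline, so Nearline stays cheaper as long as less than about 100,000 GB (the full 100 TB) is retrieved per month.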
- Blocked by Story #12707: [API] Add columns for desired/actual storage classes for each collection added
- Blocked by Story #12708: [keep-balance] Move blocks to satisfy storage_classes_desired added
- Description updated (diff)
- Related to Story #7929: [SDKs] PySDK KeepClient considers volume IDs when replicating added
- Related to Story #7930: [SDKs] GoSDK KeepClient considers volume IDs when replicating added
- Related to Story #7931: [keep-balance] Count block replication by volume IDs added
- Related to Story #7932: [Keep] keepproxy aggregates and reports volume IDs from Keepstores added
- Target version changed from To Be Groomed to 2018-03-14 Sprint
- Target version changed from 2018-03-14 Sprint to 2018-03-28 Sprint
- Target version changed from 2018-03-28 Sprint to Arvados Future Sprints
- Related to deleted (Story #7932: [Keep] keepproxy aggregates and reports volume IDs from Keepstores)
- Related to Feature #13382: [keepstore] Write new blocks to appropriate storage class added
- Related to Story #13429: [API] [arvados-cwl-runner] Save workflow outputs to desired storage classes added
- Related to Story #13430: [arv-put] [Python] Allow caller to specify storage classes when writing data to Keep added
- Blocked by Feature #13431: [keepproxy] [GoSDK] Propagate desired storage classes in PUT request headers added
- Related to deleted (Story #7929: [SDKs] PySDK KeepClient considers volume IDs when replicating)
- Related to deleted (Story #7930: [SDKs] GoSDK KeepClient considers volume IDs when replicating)
- Related to deleted (Story #13429: [API] [arvados-cwl-runner] Save workflow outputs to desired storage classes)
- Blocked by Story #13429: [API] [arvados-cwl-runner] Save workflow outputs to desired storage classes added
- Related to deleted (Story #13430: [arv-put] [Python] Allow caller to specify storage classes when writing data to Keep)
- Blocked by Story #13430: [arv-put] [Python] Allow caller to specify storage classes when writing data to Keep added
- Description updated (diff)
- Target version changed from Arvados Future Sprints to 2018-06-20 Sprint
- Assigned To changed from Tom Clegg to Tom Morris
- Story points deleted (3.0)
- Status changed from New to In Progress
- Target version changed from 2018-06-20 Sprint to Arvados Future Sprints
- Related to Feature #17392: Support writing blocks to correct storage classes in Go SDK added
- Status changed from In Progress to Resolved
- Target version changed from Arvados Future Sprints to 0
- Target version deleted (0)