Story #9278
open
[Crunch2] Document/fix handling of collections with non-nil expires_at field
100%
Description
Draft desired behavior: Expiring collections
Current behavior:
- When deleting collections from the project view, Workbench sets expires_at=now().
- API server does not return expired collections in list or get responses.
default_scope where("expires_at IS NULL or expires_at > CURRENT_TIMESTAMP")
- API server does return expiring collections in list responses.
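The visibility rule expressed by the default_scope above can be restated as a small predicate. This is a minimal Python sketch, not the API server's code: the function names `is_visible` and `classify` and the label "persistent" are assumptions made for illustration, while "expiring" and "expired" follow the usage in this story.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional


def is_visible(expires_at: Optional[datetime], now: datetime) -> bool:
    """Mirror of: expires_at IS NULL or expires_at > CURRENT_TIMESTAMP."""
    return expires_at is None or expires_at > now


def classify(expires_at: Optional[datetime], now: datetime) -> str:
    """Hypothetical helper: label a collection by its expires_at value."""
    if expires_at is None:
        return "persistent"
    return "expiring" if expires_at > now else "expired"


now = datetime(2016, 6, 8, tzinfo=timezone.utc)
for label, value in [
    ("no expiry", None),
    ("future", now + timedelta(days=14)),
    ("past", now - timedelta(seconds=1)),
]:
    print(label, classify(value, now), is_visible(value, now))
```

Under this rule a collection deleted with expires_at=now() stops being visible immediately, because the comparison is strict.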
Updated by Tom Clegg over 8 years ago
- Assigned To set to Tom Clegg
- Target version set to 2016-06-08 sprint
- Story points set to 2.0
Updated by Brett Smith over 8 years ago
Tom Clegg wrote:
Draft at Expiring collections
Thoughts:
At this point, I'm convinced that sysadmins need some way to accelerate deletion of blocks. Say someone accidentally uploads something much larger than the cluster was specced to store: usually what people want in this case is to delete that newly-created collection and blocks uniquely associated with it. A rule as simple as "Admins can set expires_at to an arbitrary time" would be sufficient to make this possible. Can the proposal grow to accommodate something like that? (I realize we need a way to solve this on the Keepstore end, too, but for now I think I'd be happy as long as changes on the API end don't make anything harder.)
More generally, I do wonder what ways you intend for it to be possible for clients to set or change expires_at, besides the existing delete method.
About the idea of creating collections to hold references to unreferenced blocks: I'm concerned about the ops impact of creating potentially many collections like this. It's also a little redundant: we already have the data we need in the Logs table. How would you feel about extending keep-balance to read those logs and use them as a new data source? The rule would be something like "Read all Logs for collection updates from the past [TTL duration]. Any block referenced in a manifest_text in any of those records is not eligible for deletion." If I've thought this through correctly, that would seem to eliminate any need for separate tracking of collection manifest changes. It would also correctly handle updates that change replication_desired from >0 to 0.
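The log-based rule proposed above could be sketched roughly as follows. This is only an illustration in Python, not keep-balance code: the record shape (dicts with "event_at" and "manifest_text" keys) is a simplified assumption rather than the real Logs schema, `protected_blocks` is a hypothetical name, and the locator pattern assumes block locators of the form "<32 hex digits>+<size>" with optional trailing hints.

```python
import re
from datetime import datetime, timedelta, timezone

# Assumption: block locators look like "<32 hex digits>+<size>", optionally
# followed by further "+hint" parts, which this pattern ignores.
LOCATOR_RE = re.compile(r"\b([0-9a-f]{32}\+\d+)")


def protected_blocks(log_records, now, ttl):
    """Return locators referenced by any collection update newer than now - ttl.

    log_records is a hypothetical simplified shape: an iterable of dicts with
    "event_at" (datetime) and "manifest_text" (str) keys.
    """
    cutoff = now - ttl
    protected = set()
    for record in log_records:
        if record["event_at"] >= cutoff:
            protected.update(LOCATOR_RE.findall(record["manifest_text"]))
    return protected


now = datetime(2016, 6, 8, 12, 0, tzinfo=timezone.utc)
logs = [
    {"event_at": now - timedelta(days=1),
     "manifest_text": ". 37b51d194a7513e45b56f6524f2d51f2+3 0:3:bar\n"},
    {"event_at": now - timedelta(days=30),
     "manifest_text": ". acbd18db4cc2f85cedef654fccc4a4d8+3 0:3:foo\n"},
]
print(protected_blocks(logs, now, ttl=timedelta(days=14)))
```

Any block in the returned set would be treated as ineligible for deletion, regardless of whether a current collection still references it.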
"In any case, an application that undeletes collections must be prepared to encounter name conflicts." - Will clients be able to just set ensure_unique_name=True to DTRT? If not, can we make that possible?
Updated by Peter Amstutz over 8 years ago
Design comments:
| expires_at | significance | get (pdh) | get (uuid) | appears in default list | can appear in list when filtering by expires_at |
| >now | expiring | yes(*) | yes(*) | no(**) | yes |
(**) Change to "yes" after updating clients (arv-mount and Workbench) to behave appropriately, i.e., either use an expires_at filter when requesting collection lists, or skip over them in default views.
On the principle of least surprise I would suggest settling on the behavior listed in the table and not plan on changing it as mentioned in (**). API clients that would need to care are not limited to Workbench and arv-mount but also crunch scripts written by users. API clients that care about expiring collections (Workbench, block manager) can set the expires_at filter accordingly.
1. When expiring a collection, stash the original name somewhere and change its name to something unique (e.g., incorporating uuid and timestamp).
This is a little wonky but I'd be fine with that; I believe we have a "properties" hash on collections already. Partial indexes sound a bit tricky to set up.
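A rough sketch of the rename-and-stash idea, using the "properties" hash mentioned above. Everything specific here is an assumption for illustration: the function name `rename_for_expiry`, the "original_name" property key, the format of the generated name, and the sample UUID are all made up, and the collection is modeled as a plain dict.

```python
from datetime import datetime, timezone


def rename_for_expiry(collection: dict, now: datetime) -> dict:
    """Return a copy whose name is unique and whose old name is stashed.

    The "original_name" property key and the name format are hypothetical.
    """
    updated = dict(collection)
    properties = dict(collection.get("properties", {}))
    properties["original_name"] = collection["name"]
    updated["properties"] = properties
    stamp = now.strftime("%Y%m%dT%H%M%SZ")
    updated["name"] = f"{collection['name']} (expired {collection['uuid']} {stamp})"
    return updated


sample = {
    "uuid": "zzzzz-4zz18-0123456789abcde",  # made-up UUID
    "name": "scratch data",
    "properties": {},
}
renamed = rename_for_expiry(sample, datetime(2016, 6, 8, tzinfo=timezone.utc))
print(renamed["name"])
print(renamed["properties"]["original_name"])
```

Because the generated name includes the UUID, two expiring collections that shared a name would no longer collide; an undelete could read the stashed name back, subject to the name conflicts discussed in this story.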
Questions/clarifications:
What happens if a collection is deleted twice? Does expires_at get updated on the second delete?
Do you undelete a collection by setting expires_at to null? What happens if the user tries to do that on a collection that is past its expiration date?
What happens if a user is trying to delete a Project and there are expiring/expired collections?
Can expiring collections have their manifest text or other fields be updated? (probably not...)
Updated by Tom Clegg over 8 years ago
Peter Amstutz wrote:
Design comments:
| expires_at | significance | get (pdh) | get (uuid) | appears in default list | can appear in list when filtering by expires_at |
| >now | expiring | yes(*) | yes(*) | no(**) | yes |
(**) Change to "yes" after updating clients (arv-mount and Workbench) to behave appropriately, i.e., either use an expires_at filter when requesting collection lists, or skip over them in default views.
On the principle of least surprise I would suggest settling on the behavior listed in the table and not plan on changing it as mentioned in (**). API clients that would need to care are not limited to Workbench and arv-mount but also crunch scripts written by users. API clients that care about expiring collections (Workbench, block manager) can set the expires_at filter accordingly.
Interesting. I like the "least surprise" principle, but I see it the other way around: the least surprising behavior is for "list without filters" to return all of the items, like it does elsewhere.
Adding an "expires_at is null" filter to an API request seems really easy, compared to combining the results of two (multi-page) queries using different filters.
Another option is to do what we do with keep services: one API for "list all", and one API for "list the ones I think you want".
The current behavior here is "yes", fwiw. It hasn't caused any confusion because it never comes up: nobody ever sets expires_at to a time in the future.
1. When expiring a collection, stash the original name somewhere and change its name to something unique (e.g., incorporating uuid and timestamp).
This is a little wonky but I'd be fine with that; I believe we have a "properties" hash on collections already. Partial indexes sound a bit tricky to set up.
We already rely on partial indexes, so I don't think the setup trickery should be a factor.
What happens if a collection is deleted twice? Does expires_at get updated on the second delete?
Earlier of {default expiry time} and {existing expiry time}, I think. (updated wiki)
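The "earlier of" rule for a repeated delete can be written out as a short function. This is a minimal Python sketch under stated assumptions: `expiry_after_delete` and `default_lifetime` are hypothetical names, and the sketch covers only the "earlier of" rule, not any additional constraint involving blobSignatureTTL.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional


def expiry_after_delete(existing: Optional[datetime], now: datetime,
                        default_lifetime: timedelta) -> datetime:
    """Pick expires_at for a delete: earlier of default and existing expiry."""
    default_expiry = now + default_lifetime
    if existing is None:
        # First delete: nothing to compare against.
        return default_expiry
    return min(existing, default_expiry)


lifetime = timedelta(days=14)
first_delete = datetime(2016, 6, 8, tzinfo=timezone.utc)
second_delete = first_delete + timedelta(days=3)

first = expiry_after_delete(None, first_delete, lifetime)
second = expiry_after_delete(first, second_delete, lifetime)
print(first, second)
```

With these inputs the second delete leaves expires_at unchanged, since the existing expiry is earlier than the new default.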
"The given expires_at cannot be sooner than the existing expires_at and sooner than now+blobSignatureTTL."
Do you undelete a collection by setting expires_at to null?
Yes. (updated wiki)
What happens if the user tries to do that on a collection that is past its expiration date?
404. (updated wiki)
What happens if a user is trying to delete a Project and there are expiring/expired collections?
Expired: they're invisible, so deleting a project should work exactly the same way it would without them.
Expiring: Not sure about this one. Dumping them in the parent project is one possibility.
Can expiring collections have their manifest text or other fields be updated? (probably not...)
I don't think there's any reason to disallow this. One of the use cases is "scratch space". A rule "can't update expiring collection" would merely force clients to jump through undelete-modify-delete hoops, introducing the possibility of crashing in the middle and wasting storage space indefinitely.
Updated by Tom Clegg over 8 years ago
- Clients: Use 'expires_at is null' filter in arv-mount and workbench (separate story: "show trash" feature for both)
- Clients: set expires_at=(now+permissionTTL) (or now+defaultTrashLifetime) instead of deleting collections outright
- API: Use shorter permission TTL in API response for expiring collection
- API: enforce expires_at >= now during update
- (#9363) handle deletion races: keep-balance needs to look in the logs table
- (#9364) expedited delete feature
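Two of the items above, the client-side "set expires_at instead of deleting" step and the API-side "expires_at >= now" check, can be sketched as follows. This is an illustration in Python only: `trash_expiry` and `validate_expires_at` are hypothetical names, the TTL value is made up, and treating a null expires_at as always acceptable is an assumption based on the undelete answer earlier in this story.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional


def trash_expiry(now: datetime, permission_ttl: timedelta) -> datetime:
    """Client side: the expires_at to set instead of deleting outright."""
    return now + permission_ttl


def validate_expires_at(new_expires_at: Optional[datetime], now: datetime) -> None:
    """API side: reject an update that sets expires_at earlier than now.

    None is accepted here on the assumption that it means "undelete".
    """
    if new_expires_at is not None and new_expires_at < now:
        raise ValueError("expires_at must not be earlier than now")


now = datetime(2016, 6, 8, tzinfo=timezone.utc)
ttl = timedelta(days=14)  # made-up value for illustration

validate_expires_at(trash_expiry(now, ttl), now)  # accepted
validate_expires_at(None, now)                    # accepted
try:
    validate_expires_at(now - timedelta(hours=1), now)
except ValueError as err:
    print("rejected:", err)
```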
Updated by Tom Clegg over 8 years ago
Updated by Tom Clegg over 8 years ago
- Target version changed from 2016-06-08 sprint to 2016-06-22 sprint
Updated by Peter Amstutz over 8 years ago
9278-expiring-collections @ 5c11190dda23801d0f7d177bf2c0a0ac5d899898 LGTM
Updated by Tom Clegg over 8 years ago
- Target version changed from 2016-06-22 sprint to Arvados Future Sprints
Updated by Brett Smith over 8 years ago
- Assigned To changed from Tom Clegg to Brett Smith
I will split off the subtasks into their own stories and make sure the desired behavior is specified in them directly.
Updated by Brett Smith over 8 years ago
Tom Clegg wrote:
- Clients: Use 'expires_at is null' filter in arv-mount and workbench (separate story: "show trash" feature for both)
Workbench #9587 (see also #9589), FUSE #9590 and #9591.
- Clients: set expires_at=(now+permissionTTL) (or now+defaultTrashLifetime) instead of deleting collections outright
Updated by Ward Vandewege over 3 years ago
- Target version deleted (Arvados Future Sprints)