Story #9278

open

[Crunch2] Document/fix handling of collections with non-nil expires_at field

Added by Tom Clegg over 8 years ago. Updated over 3 years ago.

Status: In Progress
Priority: Normal
Assigned To: -
Category: -
Target version: -
Start date: 06/01/2016
Due date:
% Done: 100%
Estimated time: (Total: 0.00 h)
Story points: 1.0

Description

Draft desired behavior: Expiring collections

Current behavior:
  • When deleting collections from the project view, Workbench sets expires_at=now().
  • API server does not return expired collections in list or get responses. The default scope enforces this: default_scope where("expires_at IS NULL or expires_at > CURRENT_TIMESTAMP")
  • API server does return expiring collections (expires_at set to a time in the future) in list responses.
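
The scope quoted above can be read as a three-way classification of collections. The following is a minimal plain-Ruby sketch of that reading, outside Rails; the method names `expiry_state` and `visible_by_default?` are hypothetical and are not part of the API server.

```ruby
# Hypothetical helpers mirroring the predicate in the quoted default_scope:
#   expires_at IS NULL or expires_at > CURRENT_TIMESTAMP

# Classify a collection by its expires_at value (a Time or nil).
def expiry_state(expires_at, now = Time.now)
  return :persistent if expires_at.nil?
  expires_at > now ? :expiring : :expired
end

# True when the predicate would let the collection through.
def visible_by_default?(expires_at, now = Time.now)
  expiry_state(expires_at, now) != :expired
end

now = Time.utc(2016, 6, 1, 12, 0, 0)
puts expiry_state(nil, now)         # persistent
puts expiry_state(now + 3600, now)  # expiring
puts expiry_state(now - 1, now)     # expired
```

Because the comparison is strict, a collection whose expires_at equals the current time already counts as expired in this sketch.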

Subtasks 6 (0 open, 6 closed)

  • Task #9298: Update arv-mount to desired behavior | Closed | Tom Clegg | 07/13/2016
  • Task #9297: Update Workbench to desired behavior | Closed | Tom Clegg | 07/13/2016
  • Task #9293: Review docs/spec [[Expiring collections]] | Resolved | Tom Clegg | 06/07/2016
  • Task #9294: Document desired behavior and interpretation | Resolved | Tom Clegg | 06/01/2016
  • Task #9296: Update API to desired behavior | Resolved | Tom Clegg | 06/07/2016
  • Task #9302: Review 9278-expiring-collections | Resolved | Peter Amstutz | 06/09/2016

Related issues 10 (5 open, 5 closed)

  • Related to Arvados - Feature #9364: [keep-balance] "Expedited delete" tool: perform garbage collection on some specific (recently deleted) collections, bypassing usual GC race protections | New
  • Related to Arvados - Story #9277: [Crunch2] System-owned container outputs should be garbage-collected | Resolved | Peter Amstutz | 02/16/2017
  • Related to Arvados - Feature #3900: [Workbench] Trash button on collection uses "delete" API instead of setting expires_at/trash_at | Resolved | Lucas Di Pentima | 02/10/2017
  • Related to Arvados - Story #9582: [Workbench] Don't display trashed collections in regular collection listings | Resolved | 07/12/2016
  • Related to Arvados - Story #9584: [FUSE] Don't display expiring collections in regular collection listings | New | Tom Morris | 07/12/2016
  • Related to Arvados - Story #9587: [Workbench] Interface to list and untrash trashed collections | Resolved | Radhika Chippada | 07/12/2016
  • Related to Arvados - Story #9589: [Workbench] Update collection interface for collections with non-nil trash_at | Closed | Radhika Chippada | 07/13/2016
  • Related to Arvados - Story #9590: [FUSE] Trash directory to list, inspect, and un-trash trashed collections | New | Tom Morris | 07/13/2016
  • Related to Arvados - Story #9591: [FUSE] Undelete collections by moving them out of the TrashDirectory | New | Tom Morris | 07/13/2016
  • Related to Arvados - Story #9592: [FUSE] rmdir on CollectionDirectory sets expires_at | New | Tom Morris | 07/13/2016
Actions #1

Updated by Tom Clegg over 8 years ago

  • Assigned To set to Tom Clegg
  • Target version set to 2016-06-08 sprint
  • Story points set to 2.0
Actions #2

Updated by Tom Clegg over 8 years ago

Actions #3

Updated by Brett Smith over 8 years ago

Tom Clegg wrote:

Draft at Expiring collections

Thoughts:

At this point, I'm convinced that sysadmins need some way to accelerate deletion of blocks. Say someone accidentally uploads something much larger than the cluster was specced to store: usually what people want in this case is to delete that newly-created collection and blocks uniquely associated with it. A rule as simple as "Admins can set expires_at to an arbitrary time" would be sufficient to make this possible. Can the proposal grow to accommodate something like that? (I realize we need a way to solve this on the Keepstore end, too, but for now I think I'd be happy as long as changes on the API end don't make anything harder.)

More generally, I do wonder what ways you intend for it to be possible for clients to set or change expires_at, besides the existing delete method.

About the idea of creating collections to hold references to unreferenced blocks: I'm concerned about the ops impact of creating potentially many collections like this. It's also a little redundant: we already have the data we need in the Logs table. How would you feel about extending keep-balance to read those logs and use them as a new data source? The rule would be something like "Read all Logs for collection updates from the past [TTL duration]. Any block referenced in a manifest_text in any of those records is not eligible for deletion." If I've thought this through correctly, that would seem to eliminate any need for separate tracking of collection manifest changes. It would also correctly handle updates that change replication_desired from >0 to 0.
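
The log-based rule proposed above could look roughly like the following sketch. It is not keep-balance code (keep-balance is a separate service); the log record shape (`:created_at`, `:manifest_text`) and the method name `protected_blocks` are assumptions made for illustration, as is the simplified locator pattern.

```ruby
require 'set'

# Assumption: a block locator starts with a 32-hex-digit MD5 and a byte
# size ("<md5>+<size>"); any further "+hint" suffixes are ignored here.
LOCATOR = /\b[0-9a-f]{32}\+\d+/

# logs: array of hashes with :created_at (Time) and :manifest_text
# (String or nil). This record shape is hypothetical.
# Returns the set of blocks referenced by any log entry newer than the TTL,
# i.e. the blocks that would not be eligible for deletion under the rule.
def protected_blocks(logs, ttl_seconds, now = Time.now)
  cutoff = now - ttl_seconds
  logs.each_with_object(Set.new) do |log, blocks|
    next if log[:created_at] < cutoff
    log[:manifest_text].to_s.scan(LOCATOR) { |loc| blocks << loc }
  end
end
```

A garbage collector following this rule would skip any block in the returned set, whether or not a current collection still references it.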

"In any case, an application that undeletes collections must be prepared to encounter name conflicts." - Will clients be able to just set ensure_unique_name=True to DTRT? If not, can we make that possible?

Actions #4

Updated by Tom Clegg over 8 years ago

  • Description updated (diff)
Actions #5

Updated by Peter Amstutz over 8 years ago

Design comments:

expires_at | significance | get (pdh) | get (uuid) | appears in default list | can appear in list when filtering by expires_at
>now | expiring | yes(*) | yes(*) | no(**) | yes

(**) Change to "yes" after updating clients (arv-mount and Workbench) to behave appropriately, i.e., either use an expires_at filter when requesting collection lists, or skip over them in default views.

On the principle of least surprise I would suggest settling on the behavior listed in the table and not plan on changing it as mentioned in (**). API clients that would need to care are not limited to Workbench and arv-mount but also crunch scripts written by users. API clients that care about expiring collections (Workbench, block manager) can set the expires_at filter accordingly.

1. When expiring a collection, stash the original name somewhere and change its name to something unique (e.g., incorporating uuid and timestamp).

This is a little wonky but I'd be fine with that; I believe we have a "properties" hash on collections already. Partial indexes sound a bit tricky to set up.

Questions/clarifications:

What happens if a collection is deleted twice? Does expires_at get updated on the second delete?

Do you undelete a collection by setting expires_at to null? What happens if the user tries to do that on a collection that is past its expiration date?

What happens if a user is trying to delete a Project and there are expiring/expired collections?

Can expiring collections have their manifest text or other fields be updated? (probably not...)

Actions #6

Updated by Tom Clegg over 8 years ago

Peter Amstutz wrote:

Design comments:

expires_at | significance | get (pdh) | get (uuid) | appears in default list | can appear in list when filtering by expires_at
>now | expiring | yes(*) | yes(*) | no(**) | yes

(**) Change to "yes" after updating clients (arv-mount and Workbench) to behave appropriately, i.e., either use an expires_at filter when requesting collection lists, or skip over them in default views.

On the principle of least surprise I would suggest settling on the behavior listed in the table and not plan on changing it as mentioned in (**). API clients that would need to care are not limited to Workbench and arv-mount but also crunch scripts written by users. API clients that care about expiring collections (Workbench, block manager) can set the expires_at filter accordingly.

Interesting. I like the "least surprise" principle, but I see it the other way around: the least surprising behavior is for "list without filters" to return all of the items, like it does elsewhere.

Adding an "expires_at is null" filter to an API request seems really easy, compared to combining the results of two (multi-page) queries using different filters.
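
As an illustration of how small that client-side change could be, here is a sketch that appends such a filter to an existing filter list. It assumes list requests take filters as `[attribute, operator, operand]` triples and that a null operand with `=` matches rows where the column is NULL; the helper name and the owner UUID are placeholders.

```ruby
require 'json'

# Assumption: filters are [attribute, operator, operand] triples, and
# ["expires_at", "=", nil] selects collections with no expiry set.
ONLY_PERSISTENT = [["expires_at", "=", nil]].freeze

# Hypothetical helper: add the filter to whatever the caller already uses.
def with_persistent_filter(filters = [])
  filters + ONLY_PERSISTENT
end

# Placeholder owner UUID, for illustration only.
filters = with_persistent_filter([["owner_uuid", "=", "zzzzz-j7d0g-0123456789abcde"]])
puts JSON.generate(filters)
```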

Another option is to do what we do with keep services: one API for "list all", and one API for "list the ones I think you want".

The current behavior here is "yes", fwiw. It hasn't caused any confusion because it never comes up: nobody ever sets expires_at to a time in the future.

1. When expiring a collection, stash the original name somewhere and change its name to something unique (e.g., incorporating uuid and timestamp).

This is a little wonky but I'd be fine with that; I believe we have a "properties" hash on collections already. Partial indexes sound a bit tricky to set up.

We already rely on partial indexes, so I don't think the setup trickery should be a factor.

What happens if a collection is deleted twice? Does expires_at get updated on the second delete?

Earlier of {default expiry time} and {existing expiry time}, I think. (updated wiki)

"The given expires_at cannot be sooner than the existing expires_at and sooner than now+blobSignatureTTL."
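
One reading of the quoted rule is that a requested value is refused only when it is sooner than both bounds, so the effective lower bound is the earlier of the existing expires_at and now+blobSignatureTTL. The sketch below encodes that reading; the method names are hypothetical, and undeleting (setting expires_at to nil) is outside its scope.

```ruby
# Hypothetical helper. blob_signature_ttl is in seconds;
# existing_expires_at is a Time, or nil if the collection is not expiring.
def earliest_allowed_expiry(existing_expires_at, blob_signature_ttl, now = Time.now)
  [existing_expires_at, now + blob_signature_ttl].compact.min
end

# True when the requested expires_at (a Time) is not sooner than the
# effective lower bound under this reading of the rule.
def acceptable_expiry?(requested, existing_expires_at, blob_signature_ttl, now = Time.now)
  requested >= earliest_allowed_expiry(existing_expires_at, blob_signature_ttl, now)
end
```

Under this reading, deleting an already-expiring collection a second time can never push its expiry earlier than the one it already has, unless the existing expiry is later than now+blobSignatureTTL.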

Do you undelete a collection by setting expires_at to null?

Yes. (updated wiki)

What happens if the user tries to do that on a collection that is past its expiration date?

404. (updated wiki)

What happens if a user is trying to delete a Project and there are expiring/expired collections?

Expired: they're invisible, so deleting a project should work exactly the same way it would without them.

Expiring: Not sure about this one. Dumping them in the parent project is one possibility.

Can expiring collections have their manifest text or other fields be updated? (probably not...)

I don't think there's any reason to disallow this. One of the use cases is "scratch space". A rule "can't update expiring collection" would merely force clients to jump through undelete-modify-delete hoops, introducing the possibility of crashing in the middle and wasting storage space indefinitely.

Actions #7

Updated by Tom Clegg over 8 years ago

  • Status changed from New to In Progress
Actions #8

Updated by Tom Clegg over 8 years ago

Changes needed:
  • Clients: Use 'expires_at is null' filter in arv-mount and workbench (separate story: "show trash" feature for both)
  • Clients: set expires_at=(now+permissionTTL) (or now+defaultTrashLifetime) instead of deleting collections outright
  • API: Use shorter permission TTL in API response for expiring collection
  • API: enforce expires_at >= now during update
  • (#9363) handle deletion races: keep-balance needs to look in the logs table
  • (#9364) expedited delete feature
Actions #9

Updated by Tom Clegg over 8 years ago

9278-expiring-collections @ 5c11190 includes
  • 5c11190 9278: Ensure locator signatures expire no later than expires_at.
  • 1a0738a 9278: Expose expires_at in API response.
  • 67d893f 9278: Set expires_at=now if a client sets it to a time in the past.
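
The behavior described by two of these commit messages can be sketched in plain Ruby as follows. This is an illustration of the stated behavior only, not the code in those commits; the method names and the `signature_ttl` parameter (in seconds) are hypothetical.

```ruby
# 67d893f: a requested expires_at in the past is stored as "now".
# nil (no expiry) is passed through unchanged.
def clamp_expires_at(requested, now = Time.now)
  return nil if requested.nil?
  [requested, now].max
end

# 5c11190: a locator signature expires no later than the collection does.
# expires_at may be nil, in which case only the TTL applies.
def signature_expiry(expires_at, signature_ttl, now = Time.now)
  [now + signature_ttl, expires_at].compact.min
end
```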
Actions #10

Updated by Tom Clegg over 8 years ago

  • Target version changed from 2016-06-08 sprint to 2016-06-22 sprint
Actions #11

Updated by Tom Clegg over 8 years ago

  • Story points changed from 2.0 to 1.0
Actions #12

Updated by Peter Amstutz over 8 years ago

9278-expiring-collections @ 5c11190dda23801d0f7d177bf2c0a0ac5d899898 LGTM

Actions #13

Updated by Tom Clegg over 8 years ago

  • Target version changed from 2016-06-22 sprint to Arvados Future Sprints
Actions #14

Updated by Brett Smith over 8 years ago

  • Assigned To changed from Tom Clegg to Brett Smith

I will split off the subtasks into their own stories and make sure the desired behavior is specified in them directly.

Actions #15

Updated by Brett Smith over 8 years ago

Tom Clegg wrote:

  • Clients: Use 'expires_at is null' filter in arv-mount and workbench

Workbench #9582, FUSE #9584.

(separate story: "show trash" feature for both)

Workbench #9587 (see also #9589), FUSE #9590 and #9591.

  • Clients: set expires_at=(now+permissionTTL) (or now+defaultTrashLifetime) instead of deleting collections outright

Workbench #3900, FUSE #9592.

Actions #16

Updated by Tom Morris over 8 years ago

  • Assigned To deleted (Brett Smith)
Actions #17

Updated by Tom Morris over 6 years ago

  • Release deleted (11)
Actions #18

Updated by Ward Vandewege over 3 years ago

  • Target version deleted (Arvados Future Sprints)