Feature #13561
closed[API] Store, and add APIs to retrieve, previous versions of collection objects
Added by Tom Clegg over 6 years ago. Updated about 6 years ago.
100%
Updated by Tom Morris over 6 years ago
- Blocks Story #13494: Browse previous versions of a collection added
Updated by Tom Clegg over 6 years ago
- Related to Feature #13109: Support collection versions added
Updated by Tom Clegg over 6 years ago
- Subject changed from [API] Retrieve previous versions of collection objects to [API] Store, and add APIs to retrieve, previous versions of collection objects
Updated by Tom Morris over 6 years ago
- Target version changed from To Be Groomed to Arvados Future Sprints
Updated by Tom Morris over 6 years ago
- Assigned To set to Lucas Di Pentima
- Target version changed from Arvados Future Sprints to 2018-09-19 Sprint
Updated by Lucas Di Pentima over 6 years ago
- Status changed from New to In Progress
Updated by Lucas Di Pentima over 6 years ago
- Target version changed from 2018-09-19 Sprint to 2018-10-03 Sprint
Updated by Lucas Di Pentima over 6 years ago
- Target version changed from 2018-10-03 Sprint to 2018-09-19 Sprint
Updated by Lucas Di Pentima over 6 years ago
- Target version changed from 2018-09-19 Sprint to 2018-10-03 Sprint
Updated by Lucas Di Pentima about 6 years ago
- Target version changed from 2018-10-03 Sprint to 2018-10-17 sprint
Updated by Tom Clegg about 6 years ago
- Related to Story #14299: [keep-balance] Ensure blocks referenced by old collection versions are safe from garbage collection added
Updated by Lucas Di Pentima about 6 years ago
Initial implementation at a18fe628853e2042bb104088dd586cb8f41adcef - branch 13561-collection-versions-api
Test run: https://ci.curoverse.com/job/developer-run-tests/920/
- Collection model creates versions when being updated, depending of site configuration parameters.
- Old versions can't be updated. They have selected fields synchronized with the current version' values.
- Links can't be attached to old versions.
- Collections.index API expanded to include past versions.
- Tests added
(Some wb integration tests failed -- investigating)
Updated by Lucas Di Pentima about 6 years ago
Updates at 652e26fee
Test run: https://ci.curoverse.com/job/developer-run-tests/921/
- Ignore
version
¤t_version_uuid
attributes at create API endpoint - Add
include_old_versions
param on discovery doc
Updated by Lucas Di Pentima about 6 years ago
Update at 728b7ee1f
Test run: https://ci.curoverse.com/job/developer-run-tests/922/
Realized one key feature was missing: Allow old collection versions to be accessible by UUID.
Updated by Lucas Di Pentima about 6 years ago
Merged master into the branch at 0f11eb78e
Test run: https://ci.curoverse.com/job/developer-run-tests/924/
Updated by Lucas Di Pentima about 6 years ago
Documentation updates at 9d8a391d4 - branch 13561-collection-versions-doc
Ready for review
Updated by Tom Clegg about 6 years ago
Implementing this by overriding save!
seems sketchy: AFAIK neither update_attributes
nor save
call save!
(rather, save!
is implemented as save || raise
). Can we rephrase this as a Rails callback -- perhaps after_save?
I find it hard to follow the copying of attributes between self and snapshot. I wonder if we should either use @old_attributes
from log_start_state, or use SQL directly (after getting the row lock) to copy the current version to a new row with a new UUID.
A unit test like this might be useful to detect loopholes:
c = collections(...)
c.name = 'foobar'
c.save
c.properties['foo'] = 'bar'
c.update_attributes(name: 'baz')
The tests are in test/functional but they don't call the controllers -- seems like they should move to test/unit?
"older versions should no(sic) be directly updatable" should use an admin token.
Updated by Peter Amstutz about 6 years ago
Lucas Di Pentima wrote:
Documentation updates at 9d8a391d4 - branch
13561-collection-versions-doc
In "Using collection versioning"
- Suggest rephrasing first section:
When collection versioning is enabled, updating certain collection attributes (name, description, properties and manifest_text) will save a copy of the previous collection version...
- Need to explain versioning more clearly: the past version is saved with a new uuid, current_version_uuid points to the most recent version, the version number is incremented.
- Cut out the extra fields in the "arv" examples, limit it to "uuid", "current_version_uuid", "version", "created_at", "modified_at"
- Adjust heading: "Manually preserving a version" → "Ensuring a version is preserved"
Updated by Lucas Di Pentima about 6 years ago
Updates at 1499536da
Test run: https://ci.curoverse.com/job/developer-run-tests/926/
Tom Clegg wrote:
Implementing this by overriding
save!
seems sketchy: AFAIK neitherupdate_attributes
norsave
callsave!
(rather,save!
is implemented assave || raise
). Can we rephrase this as a Rails callback -- perhaps after_save?
I find it hard to follow the copying of attributes between self and snapshot. I wonder if we should either use@old_attributes
from log_start_state, or use SQL directly (after getting the row lock) to copy the current version to a new row with a new UUID.
Refactored the code so that's hopefully more readable, and also to avoid overriding save!()
, by taking advantage of around_update
and after_update
callbacks.
"older versions should no(sic) be directly updatable" should use an admin token.
Fixed
Updated by Tom Clegg about 6 years ago
AFAICT after_update runs after around_update is done, which seems to mean sync_past_versions is outside the "with_lock" block. There's also a potential race where should_preserve_version? returns false early in the process, then starts returning true later on because time has advanced. Would it be simpler to do all of the steps in around_update? Something like
with_lock return(yield) unless { ... need snapshot ... } snapshot = ... yield snapshot.save end
Since old_versions_cannot_be_updated is a validation, I think it should call errors.add() and return false, rather than raising PermissionDeniedError.
Should test that current_version_uuid is ignored during update (create is already tested)
Updated by Lucas Di Pentima about 6 years ago
Peter Amstutz wrote:
In "Using collection versioning"
- Suggest rephrasing first section:
When collection versioning is enabled, updating certain collection attributes (name, description, properties and manifest_text) will save a copy of the previous collection version...
[...]
- Adjust heading: "Manually preserving a version" → "Ensuring a version is preserved"
Suggestions addressed at dff1e79
Re-phrased/re-arranged user guide. Added admin guide.
Updated by Lucas Di Pentima about 6 years ago
Tom Clegg wrote:
AFAICT after_update runs after around_update is done, which seems to mean sync_past_versions is outside the "with_lock" block. There's also a potential race where should_preserve_version? returns false early in the process, then starts returning true later on because time has advanced. Would it be simpler to do all of the steps in around_update? Something like
Oops, sorry... I guess I misunderstood the docs when checking that. Nevertheless, this worked because with_lock
uses a "SELECT ... FOR UPDATE
" lock that's released after COMMIT, but I agree with you that's better to be explicit about it.
[...]
Should test that current_version_uuid is ignored during update (create is already tested)
All comments addressed at c8b76b1d7
Test run: https://ci.curoverse.com/job/developer-run-tests/927/
Updated by Peter Amstutz about 6 years ago
Reviewing 13561-collection-versions-doc @ dff1e793412d9fc673c0158149db812649214399
Overall this writeup is much better, thank you.
- Formatting comment: in a few places you have a line break mid-paragraph, which is awkward. You need to either join those into one long line in textile, or add an extra space so they are separate. Example:
Every collection has a
version
attribute that indicates its version number, starting from 1 on new collections and incrementing by 1 with every versionable update. All collections point to their most current version via thecurrent_version_uuid
attribute, beinguuid
andcurrent_version_uuid
equal on those collection records that are the the current version of themselves.
Note that the "current version" collection record doesn't change itsuuid
, "past versions" are saved as new records every time it's needed, pointing to the current collection record.
- Needs discussion about what happens if I delete a collection with versioning enabled, or with preserve_version = true.
- In your example:
arv collection index --filters '[["uuid", "=", "o967z-4zz18-ynmlhyjbg1arnr2"]]' --include-old-versions
The filter is ["uuid", "=", "o967z-4zz18-ynmlhyjbg1arnr2"] but it also returns one or more collection with a different uuid, which don't actually fit the filter. That seems really weird. Is that really how it works or is there a typo? If this is a special case, you need to elaborate more on how it works. On the design side, was there any discussion of using the "included" response field to hold past versions?
- Might want to provide an example of filtering on "current_version_uuid" and "version" to get a stable version even as new versions are created.
- Is there a way to request a particular version as part of a GET request?
Updated by Lucas Di Pentima about 6 years ago
- Target version changed from 2018-10-17 sprint to 2018-10-31 sprint
- Story points changed from 4.0 to 0.5
Updated by Lucas Di Pentima about 6 years ago
After-demo fixes on 9fd178164 - branch 13561-collection-versions-fixes
Test run: https://ci.curoverse.com/job/developer-run-tests/937/
- Request individual collections with its past versions using
[["current_version_uuid", "=", "some-uuid"]]
instead of filtering by UUID, to avoid confusing behavior. - Maintain consistency on illegal update attempts by raising an exception when trying to change
version
or setpreserve_version
to false when already set to true.
Updated by Tom Clegg about 6 years ago
I think the spec and previous version were right about ignoring preserve_version=false, rather than raising an error. Since there is no case where passing preserve_version=false could prevent a version from being preserved, the flag means "don't require this version to be preserved". It doesn't mean "un-preserve".
Filter changes LGTM.
Updated by Lucas Di Pentima about 6 years ago
Reverted requested changes & rebased at eab1bd7
Test run: https://ci.curoverse.com/job/developer-run-tests/938/
Waiting for the tests to pass before merging.
Updated by Lucas Di Pentima about 6 years ago
While testing things for the documentation branch, came across a bug that made trashed past versions to appear on index API calls when being an admin and passing include_trash
.
Fixed at 89acce74a (branch 13561-trashed-collection-versions-fix
) and added new tests.
Test run: https://ci.curoverse.com/job/developer-run-tests/942/
Updated by Lucas Di Pentima about 6 years ago
Peter Amstutz wrote:
- Formatting comment: in a few places you have a line break mid-paragraph, which is awkward. You need to either join those into one long line in textile, or add an extra space so they are separate.
Done
- Needs discussion about what happens if I delete a collection with versioning enabled, or with preserve_version = true.
Added a note specific to this topic, I hope it's enough.
- In your example:
[...]
The filter is ["uuid", "=", "o967z-4zz18-ynmlhyjbg1arnr2"] but it also returns one or more collection with a different uuid, which don't actually fit the filter. That seems really weird. Is that really how it works or is there a typo? If this is a special case, you need to elaborate more on how it works. On the design side, was there any discussion of using the "included" response field to hold past versions?
Fixed on the other branch, updated the doc accordingly.
- Might want to provide an example of filtering on "current_version_uuid" and "version" to get a stable version even as new versions are created.
Added example.
- Is there a way to request a particular version as part of a GET request?
Right now, you can do a GET by UUID to past versions (there's an example about that). Didn't add a specific call like /collections/UUID/VerNum
to the API. Should I do it?
Rebased branch 13561-collection-versions-doc
- e5dc0c9de
Updated by Tom Clegg about 6 years ago
13561-trashed-collection-versions-fix @ 89acce74a LGTM, thanks
Updated by Peter Amstutz about 6 years ago
This looks good, just a few suggested edits, then please merge:
Note that if you set
collection_versioning
tofalse
after being enabled, old versions that could have been saved will still be accessible.
The second half of this sentence is confusing. Suggest: "old versions will still be accessible, but further changes will not be versioned."
There are two ways that this feature works:
Suggest: "A version will be saved when one of the following conditions is true:"
When a collection’s current version is deleted, its past versions will follow it.
Suggest: "When a collection is deleted, any past versions are deleted along with it."
Updated by Lucas Di Pentima about 6 years ago
- Status changed from In Progress to Resolved
Applied in changeset arvados|0adb096afc227db376823f84956de6d7ea30dc10.
Updated by Tom Morris about 5 years ago
- Related to Bug #15148: keep-balance incorrectly accounts for blocks in collections with null `modified_at` field added