Bug #17152
closedVersion snapshots have modified_at of when snapshot was created, not when content was last modified; also flag to preserve current version
100%
Description
When a collection snapshot is created (recording a version), it appears that modified_at gets set to that point in time. However, because created_at is when version 1 is created, and the same for every snapshot, we don't have a field that tells us when the content of that snapshot came into existence.
In other words, the modifed_at timestamp of a collection snapshot at version N is actually the point in time when version N+1 was created. This is not what we want.
Fix the API server so that snapshots get the modified_at of the collection at the moment the snapshot is taken.
Write a migration that adjust the existing modified_at fields when versioning is enabled. The modified_at of version N should take the modified_at of version N-1. The "modified_at" of version 1 should take the value of "created_at".
Related thing to fix: there's currently an option that delays making snapshots on updates. However, sometimes the client knows better and you want every change to create a new snapshot. The "preserve_version" flag seems like it should do that, but actually it doesn't (???) so either we need to adjust the behavior of that flag to have the expected behavior, or we need a new flag.
It would probably make sense for "preserve_version" flag to both ensure that the previous version is snapshotted, as well as indicating that the new version should be snapshotted when updated as well.
Updated by Lucas Di Pentima about 4 years ago
- Assigned To set to Lucas Di Pentima
Updated by Peter Amstutz about 4 years ago
- Description updated (diff)
- Assigned To deleted (
Lucas Di Pentima)
Updated by Lucas Di Pentima about 4 years ago
- Status changed from New to In Progress
Updated by Lucas Di Pentima about 4 years ago
- Status changed from In Progress to New
Updated by Peter Amstutz about 4 years ago
- Subject changed from Version snapshots have modified_at of when snapshot was created, not when content was last modified to Version snapshots have modified_at of when snapshot was created, not when content was last modified; also flag to preserve current version
- Description updated (diff)
Updated by Lucas Di Pentima about 4 years ago
- Status changed from New to In Progress
Updated by Lucas Di Pentima about 4 years ago
Updates at b3fad38 - branch 17152-collection-versions-fixes
Test run: developer-run-tests: #2200
- Fixes code and test related to
modified_at
handling for collection versions. - The
preserve_version
issue may not be one: We can flip it fromfalse
totrue
to preserve the current head version previous to make further updates. This can be done client side (specially on wb2 as it already has the head version data on the store, we can check if the extra request is necessary)
Pending: database migration for previous installations.
Updated by Lucas Di Pentima about 4 years ago
- Target version changed from 2020-12-02 Sprint to 2020-12-16 Sprint
Updated by Lucas Di Pentima about 4 years ago
Updates at 67ef4d23e
Test run: developer-run-tests: #2204
- Fixed
arvbox
script, that for some reason didn't published the webdav download port, making wb2 fail to access collection's files at least inpublicdev
mode. - Added migration to check for collections that need its version's timestamp fixing.
- Added test for this migration, simulating a buggy state via fixtures.
- Also tested the migration manually with
arvbox
.
Updated by Lucas Di Pentima about 4 years ago
Updates at arvados-workbench2|09a02c4 - branch 17152-always-preserve-collection-versions
Test run: developer-tests-workbench2: #214
- Updates wb2's collection service methods to preserve the head version before making any further changes. This includes webdav operations.
- Updates integration tests.
Updated by Peter Amstutz about 4 years ago
17152-collection-versions-fixes @ arvados|67ef4d23ece096e72da0ada75d4f2faa181df412
The migration and modified_at fix LGTM. I like the check for (head.modified_at == head~1.modified_at) as a way to determine if the fix-up is needed or not.
However it occurred to me that it looks like setting the preserve_version
flag might update modified_at? I don't think we want it to do that.
17152-always-preserve-collection-versions @ arvados-workbench2|09a02c42b441f2191641649e5193ea0a39a35e58
The way I had expected this to work was to ensure that preserve_version
was set before doing the update, to ensure that the version you were looking at before you made the change gets preserved.
Then you'd also set preserve_version
on the update (I guess for WebDAV updates you still need to set preserve_version
afterwards as well.) As noted, you probably need to tweak it so that setting preserve_version
doesn't touch modified_at.
It's a little annoying to have to do 1 or 2 extra API calls to set preserve_version
, perhaps we could avoid an extra call when we believe preserve_version
was already set.
Separate from that, a slightly annoying/confusing behavior I noticed while testing this:
- go to the head version
- open the action menu and choose "edit collection"
- change the collection name in the edit dialog
- save it
For some reason, when you return to the collection view from the dialog, the parent project is selected on the right-hand details panel, which is unexpected, since it was showing the version history tab before.
Updated by Lucas Di Pentima about 4 years ago
From chat:
Peter Amstutz @tetron 18:25
we have upgrade notes
it's not exactly the same
it's something that can be included in release notes
let's nail down exactly what it should do
the easiest behavior from the client perspective is that an update with preserve_version and a snapshot-eligible change preserves both the old and new versions
and that setting preserve_version on its own does not change modified_at
if you do an update and it doesn't include preserve_version: true then you get the default behavior and preserve_version on the record reverts to false
Lucas Di Pentima @Lucas Di Pentima 18:28
…and setting only preserve_version to true on an update call should behave the old way?
Peter Amstutz @tetron 18:28
(existing behavior)
Lucas Di Pentima @Lucas Di Pentima 18:28
ok
Peter Amstutz @tetron 18:29
the only difference would be an update with preserve_version would act as if you'd set preserve_version first and then sent an update that also had preserve_version
and setting preserve_version on its own doesn't mess up modified_at
Updated by Peter Amstutz about 4 years ago
From chat:
Tweak behavior such that an update that includes preserve_version: true
would act as if you'd set preserve_version: true
first and committed an update and set preserve_version: true
again.
In other words, it would snapshot the version about to be replaced, and also mark the new version as required for snapshot on the next update.
The second tweak is that setting preserve_version: true
on its own no longer affects modified_at
. (We might want to suppress creating an entry in the audit log as well, as it is kind of expensive to do for collections.)
Update the API "revision" timestamp in the discovery document.
The API behavior change should be documented in the upgrade notes.
Workbench2 collection create and update should always add preserveVersion: true
, but should avoid making an extra API call.
WebDAV operations should set preserveVersion: true
after an update.
Updated by Lucas Di Pentima about 4 years ago
Updates at 7a31eec9cb5c99361b45cdb629120e4b16ec10ee - branch 17152-collection-preserve-version-changes
Test run: developer-run-tests: #2215
- Updates API revision number.
- Changes
preserve_version
semantics as discussed on chat. - Updates documentation, upgrade notes and tests.
- Avoids creating audit logs when only
preserve_version
is changed.
Updated by Lucas Di Pentima about 4 years ago
Updates at d5745d536 - branch 17152-collection-preserve-version-changes (arvados repo)
Test run: developer-run-tests: #2220
While working on the wb2 side of this fix, I realised that preserve_version
was being flipped to false
on updates with current version's preserve_version
was true, and the request also included it as true
.
This update fixes that bug and adds proper integration testing, as the conditional disabling of preserve_version
was moved to the controller logic.
Updated by Lucas Di Pentima about 4 years ago
Updates at arvados-workbench2|2cabc3d9
Test run: developer-tests-workbench2: #217 (failed because it needs the arvados branch -- tested locally)
- Removed the additional call on the collection service
update()
function, passingpreserveVersion: true
to the first and only call. - WebDAV operations already made an extra call to update
preserve_version
to true after the webdav request.
Updated by Peter Amstutz about 4 years ago
Lucas Di Pentima wrote:
Updates at d5745d536 - branch 17152-collection-preserve-version-changes (arvados repo)
Test run: developer-run-tests: #2220While working on the wb2 side of this fix, I realised that
preserve_version
was being flipped tofalse
on updates with current version'spreserve_version
was true, and the request also included it astrue
.
This update fixes that bug and adds proper integration testing, as the conditional disabling ofpreserve_version
was moved to the controller logic.
before_update :preserve_version_exclusive_updates_leave_modified_at_alone, if: Proc.new { |col| col.changes.keys.sort == ['modified_at', 'updated_at', 'preserve_version'].sort }
Instead of forcing modified_at
back to the earlier time, I think it would be cleaner for the Collection
class to override def maybe_update_modified_by_fields
and do the check there, and then either it calls super
or does nothing.
Updated by Lucas Di Pentima about 4 years ago
Peter Amstutz wrote:
Instead of forcing
modified_at
back to the earlier time, I think it would be cleaner for theCollection
class to overridedef maybe_update_modified_by_fields
and do the check there, and then either it callssuper
or does nothing.
Good call, thanks! Update at 9e75bd68c - Test run: developer-run-tests: #2223
Updated by Peter Amstutz about 4 years ago
Lucas Di Pentima wrote:
Peter Amstutz wrote:
Instead of forcing
modified_at
back to the earlier time, I think it would be cleaner for theCollection
class to overridedef maybe_update_modified_by_fields
and do the check there, and then either it callssuper
or does nothing.Good call, thanks! Update at 9e75bd68c - Test run: developer-run-tests: #2223
This LGTM
Updated by Peter Amstutz about 4 years ago
workbench2 17152-always-preserve-collection-versions LGTM as well, thanks!
Updated by Lucas Di Pentima about 4 years ago
Updated by Lucas Di Pentima about 4 years ago
- Status changed from In Progress to Resolved