Project

General

Profile

Actions

Bug #17152

closed

Version snapshots have modified_at of when snapshot was created, not when content was last modified; also flag to preserve current version

Added by Peter Amstutz about 4 years ago. Updated almost 4 years ago.

Status:
Resolved
Priority:
Normal
Assigned To:
Category:
-
Target version:
Start date:
12/03/2020
Due date:
% Done:

100%

Estimated time:
(Total: 0.00 h)
Story points:
-
Release relationship:
Auto

Description

When a collection snapshot is created (recording a version), it appears that modified_at gets set to that point in time. However, because created_at is when version 1 is created, and the same for every snapshot, we don't have a field that tells us when the content of that snapshot came into existence.

In other words, the modifed_at timestamp of a collection snapshot at version N is actually the point in time when version N+1 was created. This is not what we want.

Fix the API server so that snapshots get the modified_at of the collection at the moment the snapshot is taken.

Write a migration that adjust the existing modified_at fields when versioning is enabled. The modified_at of version N should take the modified_at of version N-1. The "modified_at" of version 1 should take the value of "created_at".

Related thing to fix: there's currently an option that delays making snapshots on updates. However, sometimes the client knows better and you want every change to create a new snapshot. The "preserve_version" flag seems like it should do that, but actually it doesn't (???) so either we need to adjust the behavior of that flag to have the expected behavior, or we need a new flag.

It would probably make sense for "preserve_version" flag to both ensure that the previous version is snapshotted, as well as indicating that the new version should be snapshotted when updated as well.


Subtasks 1 (0 open1 closed)

Task #17180: Review 17152-collection-preserve-version-changes (arvados repo) & 17152-always-preserve-collection-versions (wb2 repo)ResolvedPeter Amstutz12/03/2020

Actions
Actions #1

Updated by Peter Amstutz about 4 years ago

  • Description updated (diff)
Actions #2

Updated by Lucas Di Pentima about 4 years ago

  • Assigned To set to Lucas Di Pentima
Actions #3

Updated by Peter Amstutz about 4 years ago

  • Description updated (diff)
  • Assigned To deleted (Lucas Di Pentima)
Actions #4

Updated by Peter Amstutz about 4 years ago

  • Assigned To set to Lucas Di Pentima
Actions #5

Updated by Lucas Di Pentima about 4 years ago

  • Status changed from New to In Progress
Actions #6

Updated by Lucas Di Pentima about 4 years ago

  • Status changed from In Progress to New
Actions #7

Updated by Peter Amstutz about 4 years ago

  • Subject changed from Version snapshots have modified_at of when snapshot was created, not when content was last modified to Version snapshots have modified_at of when snapshot was created, not when content was last modified; also flag to preserve current version
  • Description updated (diff)
Actions #8

Updated by Lucas Di Pentima about 4 years ago

  • Status changed from New to In Progress
Actions #9

Updated by Lucas Di Pentima about 4 years ago

Updates at b3fad38 - branch 17152-collection-versions-fixes
Test run: developer-run-tests: #2200

  • Fixes code and test related to modified_at handling for collection versions.
  • The preserve_version issue may not be one: We can flip it from false to true to preserve the current head version previous to make further updates. This can be done client side (specially on wb2 as it already has the head version data on the store, we can check if the extra request is necessary)

Pending: database migration for previous installations.

Actions #10

Updated by Lucas Di Pentima about 4 years ago

  • Target version changed from 2020-12-02 Sprint to 2020-12-16 Sprint
Actions #11

Updated by Lucas Di Pentima about 4 years ago

Updates at 67ef4d23e
Test run: developer-run-tests: #2204

  • Fixed arvbox script, that for some reason didn't published the webdav download port, making wb2 fail to access collection's files at least in publicdev mode.
  • Added migration to check for collections that need its version's timestamp fixing.
  • Added test for this migration, simulating a buggy state via fixtures.
  • Also tested the migration manually with arvbox.
Actions #12

Updated by Lucas Di Pentima about 4 years ago

Updates at arvados-workbench2|09a02c4 - branch 17152-always-preserve-collection-versions
Test run: developer-tests-workbench2: #214

  • Updates wb2's collection service methods to preserve the head version before making any further changes. This includes webdav operations.
  • Updates integration tests.
Actions #13

Updated by Peter Amstutz about 4 years ago

17152-collection-versions-fixes @ arvados|67ef4d23ece096e72da0ada75d4f2faa181df412

The migration and modified_at fix LGTM. I like the check for (head.modified_at == head~1.modified_at) as a way to determine if the fix-up is needed or not.

However it occurred to me that it looks like setting the preserve_version flag might update modified_at? I don't think we want it to do that.

17152-always-preserve-collection-versions @ arvados-workbench2|09a02c42b441f2191641649e5193ea0a39a35e58

The way I had expected this to work was to ensure that preserve_version was set before doing the update, to ensure that the version you were looking at before you made the change gets preserved.

Then you'd also set preserve_version on the update (I guess for WebDAV updates you still need to set preserve_version afterwards as well.) As noted, you probably need to tweak it so that setting preserve_version doesn't touch modified_at.

It's a little annoying to have to do 1 or 2 extra API calls to set preserve_version, perhaps we could avoid an extra call when we believe preserve_version was already set.

Separate from that, a slightly annoying/confusing behavior I noticed while testing this:

  1. go to the head version
  2. open the action menu and choose "edit collection"
  3. change the collection name in the edit dialog
  4. save it

For some reason, when you return to the collection view from the dialog, the parent project is selected on the right-hand details panel, which is unexpected, since it was showing the version history tab before.

Actions #14

Updated by Lucas Di Pentima about 4 years ago

From chat:

Peter Amstutz @tetron 18:25
we have upgrade notes
it's not exactly the same
it's something that can be included in release notes
let's nail down exactly what it should do
the easiest behavior from the client perspective is that an update with preserve_version and a snapshot-eligible change preserves both the old and new versions
and that setting preserve_version on its own does not change modified_at
if you do an update and it doesn't include preserve_version: true then you get the default behavior and preserve_version on the record reverts to false

Lucas Di Pentima @Lucas Di Pentima 18:28
…and setting only preserve_version to true on an update call should behave the old way?

Peter Amstutz @tetron 18:28
(existing behavior)

Lucas Di Pentima @Lucas Di Pentima 18:28
ok

Peter Amstutz @tetron 18:29
the only difference would be an update with preserve_version would act as if you'd set preserve_version first and then sent an update that also had preserve_version
and setting preserve_version on its own doesn't mess up modified_at

Actions #15

Updated by Peter Amstutz about 4 years ago

From chat:

Tweak behavior such that an update that includes preserve_version: true would act as if you'd set preserve_version: true first and committed an update and set preserve_version: true again.

In other words, it would snapshot the version about to be replaced, and also mark the new version as required for snapshot on the next update.

The second tweak is that setting preserve_version: true on its own no longer affects modified_at. (We might want to suppress creating an entry in the audit log as well, as it is kind of expensive to do for collections.)

Update the API "revision" timestamp in the discovery document.

The API behavior change should be documented in the upgrade notes.

Workbench2 collection create and update should always add preserveVersion: true, but should avoid making an extra API call.

WebDAV operations should set preserveVersion: true after an update.

Actions #16

Updated by Lucas Di Pentima about 4 years ago

Updates at 7a31eec9cb5c99361b45cdb629120e4b16ec10ee - branch 17152-collection-preserve-version-changes
Test run: developer-run-tests: #2215

  • Updates API revision number.
  • Changes preserve_version semantics as discussed on chat.
  • Updates documentation, upgrade notes and tests.
  • Avoids creating audit logs when only preserve_version is changed.
Actions #17

Updated by Lucas Di Pentima about 4 years ago

Updates at d5745d536 - branch 17152-collection-preserve-version-changes (arvados repo)
Test run: developer-run-tests: #2220

While working on the wb2 side of this fix, I realised that preserve_version was being flipped to false on updates with current version's preserve_version was true, and the request also included it as true.
This update fixes that bug and adds proper integration testing, as the conditional disabling of preserve_version was moved to the controller logic.

Actions #18

Updated by Lucas Di Pentima about 4 years ago

Updates at arvados-workbench2|2cabc3d9
Test run: developer-tests-workbench2: #217 (failed because it needs the arvados branch -- tested locally)

  • Removed the additional call on the collection service update() function, passing preserveVersion: true to the first and only call.
  • WebDAV operations already made an extra call to update preserve_version to true after the webdav request.
Actions #19

Updated by Peter Amstutz about 4 years ago

Lucas Di Pentima wrote:

Updates at d5745d536 - branch 17152-collection-preserve-version-changes (arvados repo)
Test run: developer-run-tests: #2220

While working on the wb2 side of this fix, I realised that preserve_version was being flipped to false on updates with current version's preserve_version was true, and the request also included it as true.
This update fixes that bug and adds proper integration testing, as the conditional disabling of preserve_version was moved to the controller logic.

  before_update :preserve_version_exclusive_updates_leave_modified_at_alone,
    if: Proc.new { |col| col.changes.keys.sort == ['modified_at', 'updated_at', 'preserve_version'].sort }

Instead of forcing modified_at back to the earlier time, I think it would be cleaner for the Collection class to override def maybe_update_modified_by_fields and do the check there, and then either it calls super or does nothing.

Actions #20

Updated by Lucas Di Pentima about 4 years ago

Peter Amstutz wrote:

Instead of forcing modified_at back to the earlier time, I think it would be cleaner for the Collection class to override def maybe_update_modified_by_fields and do the check there, and then either it calls super or does nothing.

Good call, thanks! Update at 9e75bd68c - Test run: developer-run-tests: #2223

Actions #21

Updated by Peter Amstutz about 4 years ago

Lucas Di Pentima wrote:

Peter Amstutz wrote:

Instead of forcing modified_at back to the earlier time, I think it would be cleaner for the Collection class to override def maybe_update_modified_by_fields and do the check there, and then either it calls super or does nothing.

Good call, thanks! Update at 9e75bd68c - Test run: developer-run-tests: #2223

This LGTM

Actions #22

Updated by Peter Amstutz about 4 years ago

workbench2 17152-always-preserve-collection-versions LGTM as well, thanks!

Actions #23

Updated by Lucas Di Pentima about 4 years ago

Thanks!

WB2 test run with arvandos branch merged: developer-tests-workbench2: #220

Merging wb2!

Actions #24

Updated by Lucas Di Pentima about 4 years ago

  • Status changed from In Progress to Resolved
Actions #25

Updated by Peter Amstutz almost 4 years ago

  • Release set to 37
Actions

Also available in: Atom PDF