Bug #14878: [API] Fix unreliable test - Arvados

Added by Tom Clegg about 6 years ago. Updated almost 6 years ago.

Status: Resolved
Priority: Normal
Assigned To: Tom Clegg
Category: Tests
Target version: 2019-06-05 Sprint
Start date: 05/28/2019
% Done: 100%
Estimated time: (Total: 0.00 h)

Description

"Update priority" tests fail occasionally.

From https://ci.curoverse.com/job/developer-run-tests-services-api/1095/console

  1) Failure:
UpdatePriorityTest#test_priority_0_but_should_be_>0 [/home/ci-jenkins/.jenkins-slave/workspace/developer-run-tests-services-api/services/api/test/unit/update_priority_test.rb:20]:
Expected 0 to be < 0.

Possible explanation: if a background update task (UpdatePriority.run_update_thread(), started from an after_commit callback in a previous test) is still running when the test case does an explicit inline call to UpdatePriority.update_priority(), the inline call fails to get the lock, and returns immediately. But the background task's updates do not necessarily incorporate the latest priority updates, and even if they do, the resulting updates are not necessarily committed before the test case checks for them.
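
The "fails to get the lock, and returns immediately" behavior can be illustrated with a minimal sketch. This is not the Arvados code: the helper name `with_nonblocking_lock` and the temporary lock file are hypothetical, and the only assumption carried over from the description is that the lock is a non-blocking file lock. A second file handle stands in for the background task.

```ruby
require 'tempfile'

# Hypothetical helper (not from the Arvados source): try to take an
# exclusive file lock without blocking. Returns :ran if the block was
# executed, or :skipped if another handle already held the lock.
def with_nonblocking_lock(path)
  File.open(path, File::RDWR | File::CREAT, 0600) do |f|
    # With LOCK_NB, flock returns false instead of waiting when the
    # lock is held elsewhere.
    return :skipped unless f.flock(File::LOCK_EX | File::LOCK_NB)
    yield
    :ran
  end
end

lockfile = Tempfile.new('update_priority.lock')

# Stand-in for the background task: a separate handle holding the lock.
background = File.open(lockfile.path, File::RDWR)
background.flock(File::LOCK_EX)

# The "inline" call gives up immediately and does no work.
inline_result = with_nonblocking_lock(lockfile.path) { :work }
puts inline_result

# Once the background holder releases the lock, the same call runs.
background.flock(File::LOCK_UN)
background.close
after_release = with_nonblocking_lock(lockfile.path) { :work }
puts after_release
```

In this sketch the skipped call is silent, which matches the failure mode described above: the test proceeds to its assertion although its own update never ran.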

Possible ways to fix:

Don't start the background update thread in the test environment.
Pass a "block if needed" flag to update_priority(), so the test case works even when background tasks are running.

Subtasks 1 (0 open — 1 closed)

Updated by Tom Clegg about 6 years ago

Description updated

Updated by Tom Morris about 6 years ago

Target version set to 2019-03-27 Sprint

Updated by Lucas Di Pentima about 6 years ago

Assigned To set to Lucas Di Pentima

Updated by Tom Morris about 6 years ago

Target version changed from 2019-03-27 Sprint to 2019-04-10 Sprint

Updated by Lucas Di Pentima about 6 years ago

Target version changed from 2019-04-10 Sprint to 2019-04-24 Sprint

Updated by Lucas Di Pentima almost 6 years ago

Target version changed from 2019-04-24 Sprint to 2019-05-08 Sprint

Updated by Lucas Di Pentima almost 6 years ago

Target version changed from 2019-05-08 Sprint to 2019-05-22 Sprint

Updated by Lucas Di Pentima almost 6 years ago

Target version changed from 2019-05-22 Sprint to 2019-06-05 Sprint

Updated by Tom Clegg almost 6 years ago

Assigned To changed from Lucas Di Pentima to Tom Clegg

#10

Updated by Tom Clegg almost 6 years ago

On second thought, disabling the background thread (and updating synchronously instead) in test cases seems undesirable: it would prevent all tests (even other components' integration tests) from experiencing async behavior.

A blocking flock() doesn't work: the background task sometimes gets the lock, then does a postgresql statement that blocks to wait for the main thread to commit/rollback. If the main thread then waits for the lock before (instead of) committing, everyone deadlocks in a way that postgresql can't detect.

Instead, solved by just skipping the lock when running "update priority" from the test case.
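
A rough sketch of what "skipping the lock" could look like follows. The method name `update_priority_sketch` and the keyword `skip_lock` are hypothetical and chosen only for illustration; the actual change is in the branch referenced below. The sketch assumes the same non-blocking file lock as the original code path.

```ruby
require 'tempfile'

# Hypothetical stand-in for the update routine (not the Arvados code).
# Ordinary callers return early if another holder has the lock; a
# caller passing skip_lock: true proceeds without trying to lock.
def update_priority_sketch(lockfile, skip_lock: false)
  File.open(lockfile, File::RDWR | File::CREAT, 0600) do |f|
    # When skip_lock is true, the || short-circuits and flock is
    # never attempted, so there is nothing to wait on or fail at.
    return :skipped unless skip_lock || f.flock(File::LOCK_EX | File::LOCK_NB)
    yield if block_given?
    :updated
  end
end

lock = Tempfile.new('update_priority.lock')

# Stand-in for a background task that currently holds the lock.
holder = File.open(lock.path, File::RDWR)
holder.flock(File::LOCK_EX)

default_result = update_priority_sketch(lock.path)
skip_result = update_priority_sketch(lock.path, skip_lock: true)
puts default_result
puts skip_result

holder.close
```

The trade-off in this sketch is that a caller using the flag gets no mutual exclusion against the background task, so it suits a test case that only needs its own update to run, as described above.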

14878-priority-race @ 435a5df3e505dfbf67467bd02073f97e63c4c61d -- https://ci.curoverse.com/view/Developer/job/developer-run-tests/1262/

#11

Updated by Tom Clegg almost 6 years ago

Status changed from New to In Progress

#12

Updated by Lucas Di Pentima almost 6 years ago

I tried to make master fail by running update_priority_test.rb repeatedly on my local machine but wasn't able to reproduce the flaky behavior, even running the entire unit test suite several times.

I think it would be useful to add a comment on update_priority() explaining noblock's use, for posterity. Apart from that, it LGTM.
