Bug #12107
closed[API] jobs.create should not do slow git operations
100%
Description
It's typical for jobs.create to take ~6 seconds, with the database only accounting for about 75ms.
Our "copy commit to internal.git and tag it" procedure might be responsible for a significant portion of this.
Currently, when creating a new job, source:services/api/app/models/commit.rb does this, whether or not the commit already exists in internal.git:
must_pipe("echo #{sha1.shellescape}",
"git --git-dir #{src_gitdir.shellescape} pack-objects -q --revs --stdout",
"git --git-dir #{dst_gitdir.shellescape} unpack-objects -q")
must_git(dst_gitdir,
"tag --force #{tag.shellescape} #{sha1.shellescape}")
Using a copy of an internal.git repo from a production cluster on my workstation, and a commit that is already present in internal.git, I found
git operations | time | |
pack-objects | unpack-objects | 4-5s | |
fetch file:///src_gitdir $sha1 | 35s | including initial auto gc |
fetch file:///src_gitdir $sha1 | 12-14s | while auto gc running in background |
fetch file:///src_gitdir $sha1 | 1s | after gc done |
- Check whether the commit is already present in internal.git before copying it.
[[ `git rev-parse --verify $sha1^{commit}` = $sha1 ]]
takes <60ms for both success and fail cases.
- Use "git fetch file://..." instead of "pack|unpack" to avoid unnecessary work (with the pipe method, pack-objects has to prepare the entire pack whether it's needed or not; fetch is surely smarter).
git fetch --no-tags file://$src_gitdir $sha1:refs/tags/$uuid
takes ~1s including adding/updating the tag.
- See if there's a way to do this even more efficiently in cases where the commit isn't yet present. "fetch" takes 1s even when the commit is already present, so perhaps there's another way that finds more shortcuts.
Updated by Tom Morris over 7 years ago
- Target version set to 2017-08-30 Sprint
- Story points set to 1.0
Updated by Lucas Di Pentima over 7 years ago
- Assigned To set to Lucas Di Pentima
Updated by Lucas Di Pentima over 7 years ago
- Target version changed from 2017-08-30 Sprint to 2017-09-13 Sprint
Updated by Tom Clegg over 7 years ago
- Assigned To changed from Lucas Di Pentima to Tom Clegg
Updated by Tom Clegg over 7 years ago
12107-faster-git @ 32341a4777692a28358f2af5afa0d9e975e2da73
This makes the first two proposed improvements:- If the commit is already present, just tag it.
- If the commit isn't present, use fetch instead of pack-objects|unpack-objects.
Updated by Lucas Di Pentima over 7 years ago
Running services/api
tests on my Debian 9 VM produced a lot of errors, like this one:
74) Error: JobTest#test_verify_job_status_[["state",_nil,_[["state",_"Queued"]]]]: Commit::GitError: git --git-dir /home/lucas/arvados_local/services/api/tmp/internal.git fetch --no-tags file:///home/lucas/arvados_local/services/api/tmp/git/test/zzzzz-s0uqq-382brsig8rp3666/.git 077ba2ad3ea24a929091a9e6ce545c93199b8e57:refs/tags/zzzzz-8i9sb-3tdw50ao2bulisx 2>&1: pid 23753 exit 1: app/models/commit.rb:240:in `must_pipe' app/models/commit.rb:228:in `block in must_git' app/models/commit.rb:227:in `each' app/models/commit.rb:227:in `must_git' app/models/commit.rb:161:in `tag_in_internal_repository' app/models/job.rb:443:in `tag_version_in_internal_repository' test_after_commit (1.1.0) lib/test_after_commit/database_statements.rb:11:in `block in transaction' test_after_commit (1.1.0) lib/test_after_commit/database_statements.rb:5:in `transaction' test/unit/job_test.rb:234:in `block (2 levels) in <class:JobTest>’
Not sure what's causing this. The git version I'm using is 2.11.0.
Updated by Tom Clegg over 7 years ago
Not sure why the same tests didn't fail for me but I have been able to reproduce this with "git fetch".
I found this:
No, out of security concerns; imagine you included some proprietary source code by mistake, and undo the damage by forcing a push with a branch that does not have the incriminating code. Usually you do not control the garbage-collection on the server, yet you still do not want other people to fetch “by SHA-1”.
Silently exiting 1 is rather unfortunate, and most references to this question say the remote "server" decides whether to honor a fetch by sha1.
Updated by Lucas Di Pentima over 7 years ago
Ok, I ran the tests on jenkins just to be sure, and they passed without issues. LGTM, thanks!
Updated by Tom Clegg over 7 years ago
(continuing from note-11, which I saved mid-investigation)
In any case, experiments confirm we can't rely on fetch, so I've done this:- Check whether commit is already present in internal.git. If so, just tag it and stop.
- Ask git for a list of branches that include the desired commit. Take the first one and git-fetch it. Tag the desired commit. If all of these steps work, stop.
- Last resort: pack-objects|unpack-objects like we did before.
There are a number of things that can go wrong in step 2 (commit isn't on any branch, regexp isn't good enough to catch the branch name, selected branch gets non-FF pushed just before we fetch and no longer has the commit we want) but in all of these cases, we just try the pack-unpack fallback -- if that doesn't work, nothing would.
12107-faster-git @ bd5a371cc0f57aea20c0954f532db5890dae5c59
Updated by Tom Clegg over 7 years ago
- Target version changed from 2017-09-13 Sprint to 2017-09-27 Sprint
Updated by Tom Clegg over 7 years ago
- initializes the internal.git repo at test setup time instead of preserving internal.git from one test run to the next (tests were passing for me with broken code, because a previous version's test suite had already fetched all the tested commits into internal.git)
- adds tests for a "not on any branch" commit and a "not at tip of any branch" commit
Updated by Lucas Di Pentima over 7 years ago
Just one comment:
- File
services/api/app/models/commit.rb
- Lines 176 & 181: I think those
must_git()
calls could be condensed into one by using theensure
block, or maybe place it immediately after thebegin/rescue
block?
- Lines 176 & 181: I think those
The rest LGTM.
Updated by Tom Clegg over 7 years ago
Lucas Di Pentima wrote:
- Lines 176 & 181: I think those
must_git()
calls could be condensed into one by using theensure
block, or maybe place it immediately after thebegin/rescue
block?
Added protective comment:
# Even if all of the above steps succeeded, we might still not
# have the right commit due to a race, in which case tag_cmd
# will fail, and we'll need to fall back to pack|unpack. So
# don't be tempted to condense this tag_cmd and the one in the
# rescue block into a single attempt.
must_git(dst_gitdir, tag_cmd)
Updated by Anonymous over 7 years ago
- Status changed from In Progress to Resolved
- % Done changed from 0 to 100
Applied in changeset arvados|commit:95000b5d6bb4264533a1770346342c8fd3dbe61a.