Bug #8787
closed
[API] repositories/get_all_permissions method is too slow
Added by Nico César almost 9 years ago.
Updated almost 8 years ago.
Description
In su92l we're getting a Error: Net::ReadTimeout :
Error: Net::ReadTimeout
/usr/local/rvm/rubies/ruby-2.1.4/lib/ruby/2.1.0/net/protocol.rb:158:in `rescue in rbuf_fill'
/usr/local/rvm/rubies/ruby-2.1.4/lib/ruby/2.1.0/net/protocol.rb:152:in `rbuf_fill'
/usr/local/rvm/rubies/ruby-2.1.4/lib/ruby/2.1.0/net/protocol.rb:134:in `readuntil'
/usr/local/rvm/rubies/ruby-2.1.4/lib/ruby/2.1.0/net/protocol.rb:144:in `readline'
/usr/local/rvm/rubies/ruby-2.1.4/lib/ruby/2.1.0/net/http/response.rb:39:in `read_status_line'
/usr/local/rvm/rubies/ruby-2.1.4/lib/ruby/2.1.0/net/http/response.rb:28:in `read_new'
/usr/local/rvm/rubies/ruby-2.1.4/lib/ruby/2.1.0/net/http.rb:1408:in `block in transport_request'
/usr/local/rvm/rubies/ruby-2.1.4/lib/ruby/2.1.0/net/http.rb:1405:in `catch'
/usr/local/rvm/rubies/ruby-2.1.4/lib/ruby/2.1.0/net/http.rb:1405:in `transport_request'
/usr/local/rvm/rubies/ruby-2.1.4/lib/ruby/2.1.0/net/http.rb:1378:in `request'
/usr/local/rvm/rubies/ruby-2.1.4/lib/ruby/2.1.0/net/http.rb:1371:in `block in request'
/usr/local/rvm/rubies/ruby-2.1.4/lib/ruby/2.1.0/net/http.rb:853:in `start'
/usr/local/rvm/rubies/ruby-2.1.4/lib/ruby/2.1.0/net/http.rb:1369:in `request'
/usr/local/rvm/gems/ruby-2.1.4/gems/faraday-0.8.11/lib/faraday/adapter/net_http.rb:76:in `perform_request'
/usr/local/rvm/gems/ruby-2.1.4/gems/faraday-0.8.11/lib/faraday/adapter/net_http.rb:39:in `call'
/usr/local/rvm/gems/ruby-2.1.4/gems/faraday-0.8.11/lib/faraday/request/url_encoded.rb:14:in `call'
/usr/local/rvm/gems/ruby-2.1.4/gems/google-api-client-0.6.4/lib/google/api_client/request.rb:159:in `send'
/usr/local/rvm/gems/ruby-2.1.4/gems/google-api-client-0.6.4/lib/google/api_client.rb:563:in `execute'
/usr/local/rvm/gems/ruby-2.1.4/gems/arvados-0.1.20160315200920/lib/arvados.rb:207:in `api_exec'
/usr/local/rvm/gems/ruby-2.1.4/gems/arvados-0.1.20160315200920/lib/arvados.rb:83:in `block (4 levels) in initialize'
/usr/local/arvados/update-gitolite.rb:233:in `<main>'
this is
arv = Arvados.new
permissions = arv.repository.get_all_permissions
which makes a
Started GET "/arvados/v1/repositories/get_all_permissions" for 127.0.0.1 at 2016-03-24 13:14:56 +0000
Processing by Arvados::V1::RepositoriesController#get_all_permissions as */*
and I think is taking ~150 seconds sometimes (sometimes is almost instant, caching maybe? )
we need to optimize this so
/bin/bash -c 'source /etc/profile.d/rvm.sh && /usr/local/arvados/update-gitolite.rb production'
can run predictably
this started happening Tue, Mar 22, 2016 at 9:12 AM (Boston time)
Comparing indexes looking for a missing one:
# select
t.relname as table_name,
i.relname as index_name,
array_to_string(array_agg(a.attname), ', ') as column_names
from
pg_class t,
pg_class i,
pg_index ix,
pg_attribute a
where
t.oid = ix.indrelid
and i.oid = ix.indexrelid
and a.attrelid = t.oid
and a.attnum = ANY(ix.indkey)
and t.relkind = 'r'
group by
t.relname,
i.relname
order by
t.relname,
i.relname;
on qr1hi there is a difference with table logs, column uuid,
but c97qk is identical
this is how to reproduce it:
#!/usr/bin/env ruby
require 'arvados'
beginning_time = Time.now
arv = Arvados.new
permissions = arv.repository.get_all_permissions
end_time = Time.now
puts "Time elapsed #{(end_time - beginning_time)*1000} milliseconds"
nico@shell:~$ ./test.rb
/usr/local/rvm/rubies/ruby-2.1.4/lib/ruby/2.1.0/net/protocol.rb:158:in `rescue in rbuf_fill': Net::ReadTimeout (Faraday::Error::TimeoutError)
from /usr/local/rvm/rubies/ruby-2.1.4/lib/ruby/2.1.0/net/protocol.rb:152:in `rbuf_fill'
from /usr/local/rvm/rubies/ruby-2.1.4/lib/ruby/2.1.0/net/protocol.rb:134:in `readuntil'
from /usr/local/rvm/rubies/ruby-2.1.4/lib/ruby/2.1.0/net/protocol.rb:144:in `readline'
from /usr/local/rvm/rubies/ruby-2.1.4/lib/ruby/2.1.0/net/http/response.rb:39:in `read_status_line'
from /usr/local/rvm/rubies/ruby-2.1.4/lib/ruby/2.1.0/net/http/response.rb:28:in `read_new'
from /usr/local/rvm/rubies/ruby-2.1.4/lib/ruby/2.1.0/net/http.rb:1408:in `block in transport_request'
from /usr/local/rvm/rubies/ruby-2.1.4/lib/ruby/2.1.0/net/http.rb:1405:in `catch'
from /usr/local/rvm/rubies/ruby-2.1.4/lib/ruby/2.1.0/net/http.rb:1405:in `transport_request'
from /usr/local/rvm/rubies/ruby-2.1.4/lib/ruby/2.1.0/net/http.rb:1378:in `request'
from /usr/local/rvm/rubies/ruby-2.1.4/lib/ruby/2.1.0/net/http.rb:1371:in `block in request'
from /usr/local/rvm/rubies/ruby-2.1.4/lib/ruby/2.1.0/net/http.rb:853:in `start'
from /usr/local/rvm/rubies/ruby-2.1.4/lib/ruby/2.1.0/net/http.rb:1369:in `request'
from /usr/local/rvm/gems/ruby-2.1.4/gems/faraday-0.8.11/lib/faraday/adapter/net_http.rb:76:in `perform_request'
from /usr/local/rvm/gems/ruby-2.1.4/gems/faraday-0.8.11/lib/faraday/adapter/net_http.rb:39:in `call'
from /usr/local/rvm/gems/ruby-2.1.4/gems/faraday-0.8.11/lib/faraday/request/url_encoded.rb:14:in `call'
from /usr/local/rvm/gems/ruby-2.1.4/gems/google-api-client-0.6.4/lib/google/api_client/request.rb:159:in `send'
from /usr/local/rvm/gems/ruby-2.1.4/gems/google-api-client-0.6.4/lib/google/api_client.rb:563:in `execute'
from /usr/local/rvm/gems/ruby-2.1.4/gems/arvados-0.1.20160315200920/lib/arvados.rb:207:in `api_exec'
from /usr/local/rvm/gems/ruby-2.1.4/gems/arvados-0.1.20160315200920/lib/arvados.rb:83:in `block (4 levels) in initialize'
from ./test.rb:7:in `<main>'
nico@shell:~$ ./test.rb
Time elapsed 15450.9472 milliseconds
nico@shell:~$ time ./test.rb
Time elapsed 111464.593272 milliseconds
real 1m51.765s
user 0m0.524s
sys 0m0.176s
nico@shell:~$ time ./test.rb
Time elapsed 15467.253309 milliseconds
real 0m15.989s
user 0m0.568s
sys 0m0.348s
nico@shell:~$ time ./test.rb
Time elapsed 13511.453208 milliseconds
real 0m13.801s
user 0m0.516s
sys 0m0.036s
timeout is 2 minutes, here is the proof
nico@shell:~$ cat ./test.rb
#!/usr/bin/env ruby
require 'arvados'
beginning_time = Time.now
arv = Arvados.new
permissions = arv.repository.get_all_permissions
end_time = Time.now
puts "Time elapsed #{(end_time - beginning_time)*1000} milliseconds"
nico@shell:~$ time ./test.rb
/usr/local/rvm/rubies/ruby-2.1.4/lib/ruby/2.1.0/net/protocol.rb:158:in `rescue in rbuf_fill': Net::ReadTimeout (Faraday::Error::TimeoutError)
from /usr/local/rvm/rubies/ruby-2.1.4/lib/ruby/2.1.0/net/protocol.rb:152:in `rbuf_fill'
from /usr/local/rvm/rubies/ruby-2.1.4/lib/ruby/2.1.0/net/protocol.rb:134:in `readuntil'
from /usr/local/rvm/rubies/ruby-2.1.4/lib/ruby/2.1.0/net/protocol.rb:144:in `readline'
from /usr/local/rvm/rubies/ruby-2.1.4/lib/ruby/2.1.0/net/http/response.rb:39:in `read_status_line'
from /usr/local/rvm/rubies/ruby-2.1.4/lib/ruby/2.1.0/net/http/response.rb:28:in `read_new'
from /usr/local/rvm/rubies/ruby-2.1.4/lib/ruby/2.1.0/net/http.rb:1408:in `block in transport_request'
from /usr/local/rvm/rubies/ruby-2.1.4/lib/ruby/2.1.0/net/http.rb:1405:in `catch'
from /usr/local/rvm/rubies/ruby-2.1.4/lib/ruby/2.1.0/net/http.rb:1405:in `transport_request'
from /usr/local/rvm/rubies/ruby-2.1.4/lib/ruby/2.1.0/net/http.rb:1378:in `request'
from /usr/local/rvm/rubies/ruby-2.1.4/lib/ruby/2.1.0/net/http.rb:1371:in `block in request'
from /usr/local/rvm/rubies/ruby-2.1.4/lib/ruby/2.1.0/net/http.rb:853:in `start'
from /usr/local/rvm/rubies/ruby-2.1.4/lib/ruby/2.1.0/net/http.rb:1369:in `request'
from /usr/local/rvm/gems/ruby-2.1.4/gems/faraday-0.8.11/lib/faraday/adapter/net_http.rb:76:in `perform_request'
from /usr/local/rvm/gems/ruby-2.1.4/gems/faraday-0.8.11/lib/faraday/adapter/net_http.rb:39:in `call'
from /usr/local/rvm/gems/ruby-2.1.4/gems/faraday-0.8.11/lib/faraday/request/url_encoded.rb:14:in `call'
from /usr/local/rvm/gems/ruby-2.1.4/gems/google-api-client-0.6.4/lib/google/api_client/request.rb:159:in `send'
from /usr/local/rvm/gems/ruby-2.1.4/gems/google-api-client-0.6.4/lib/google/api_client.rb:563:in `execute'
from /usr/local/rvm/gems/ruby-2.1.4/gems/arvados-0.1.20160315200920/lib/arvados.rb:207:in `api_exec'
from /usr/local/rvm/gems/ruby-2.1.4/gems/arvados-0.1.20160315200920/lib/arvados.rb:83:in `block (4 levels) in initialize'
from ./test.rb:7:in `<main>'
real 2m0.688s
user 0m0.456s
sys 0m0.084s
- Project changed from 40 to Arvados
- Subject changed from [API] Error: Net::ReadTimeout on arv.repository.get_all_permissions to [API] repositories/get_all_permissions method is too slow
- Category set to API
- Target version set to Arvados Future Sprints
There are a lot of contributing factors here. Given the wide timing variation shown in the notes, one idea is that the response time is greatly affected by whether or not particular database results are cached.
The #1 culprit is the permissions graph. We completely flush that whenever it's affected, and then the next API query has to rebuild the graph completely.
Addressing this performance issue would help a lot of things, not just this individual call. We'd like to start there and see whether it's sufficient to solve this issue.
There was a philosophical discussion about whether our permissions model (and by extension, API generally) should move to a sort of eventual consistency model, where the permissions graph gets rebuilt in the background without that work blocking the handling of any individual API query.
see #10917 looks like we've go an order of magnitude of improvement
- Status changed from New to Resolved
This got resolved with the last deploy no more Timeout alerts in my inbox
- Target version deleted (
Arvados Future Sprints)
Also available in: Atom
PDF