Bug #14009
closedcontainer request creation very slow when there are many potentially reusable containers
100%
Description
Container request creation on our system was typically taking 90-120s (N.B. this is after disabling all audit logging so it has already been sped up dramatically from stock). We found this to be due to the queries that look up the reusable containers were the query conditions were pretty much all on columns that have no indices of any kind.
Our interim solution has been to add a hash-based index (to keep index size down) on the `command` column:
arvados_api_production=# create index index_containers_on_command_hash on containers using hash (command);
After adding this index, container request creation was sped up by ~20x (to around 5s each).
However, note that hash indices are not recommended for postgres versions < 10 as they are not included in WAL replication. An alternative would be to have explicit hash columns included in the table (which are btree indexed) that can be used to query the exact-match fields for container reuse.
In addition to making for faster reuse queries, this should also pave the way to enable formulation of client-side container queries that don't need to transmit the large text values (such as mounts, command, etc) to the API server.
Updated by Tom Clegg over 6 years ago
- Status changed from New to In Progress
14009-container-reuse-index @ 6525b509825dbbf1cbe8b30b34080aafc4e5bde3
The test database is so small postgres declines to use the index, but I pasted the "create index" statement and a reuse query from the test suite into a dev cluster with lots of containers (9tee4) to confirm postgres uses it. The query took ~130ms before adding the index, ~1ms after.
Updated by Peter Amstutz over 6 years ago
Tom Clegg wrote:
14009-container-reuse-index @ 6525b509825dbbf1cbe8b30b34080aafc4e5bde3
The test database is so small postgres declines to use the index, but I pasted the "create index" statement and a reuse query from the test suite into a dev cluster with lots of containers (9tee4) to confirm postgres uses it. The query took ~130ms before adding the index, ~1ms after.
This seems like the sort of thing a hash index would be useful for, but Postgres 9.x hash indexes come with a recommendation not to use them, so this seems to be the next best thing. LGTM.
Updated by Tom Clegg over 6 years ago
- Status changed from In Progress to Resolved