Project

General

Profile

Actions

Bug #9542

closed

[API] review logs queries

Added by Nico César over 8 years ago. Updated almost 5 years ago.

Status:
Closed
Priority:
Normal
Assigned To:
Category:
Performance
Target version:
-
Start date:
07/05/2016
Due date:
% Done:

0%

Estimated time:
Story points:
-

Description

I see several

SELECT "logs".* FROM "logs"  WHERE (logs.id > 2341779 AND (((logs.event_type in ('create','update','delete'))))) ORDER BY id asc;

which takes 20+ minutes to execute and seems to use 100% of 1 core.

there is a index_logs_on_event_type index

arvados_production=# select
    t.relname as table_name,
    i.relname as index_name,
    array_to_string(array_agg(a.attname), ', ') as column_names
from
    pg_class t,
    pg_class i,
    pg_index ix,
    pg_attribute a
where
    t.oid = ix.indrelid
    and i.oid = ix.indexrelid
    and a.attrelid = t.oid
    and a.attnum = ANY(ix.indkey)
    and t.relkind = 'r'
    and t.relname like 'logs%'
group by
    t.relname,
    i.relname
order by
    t.relname,
    i.relname;
 table_name |        index_name         |                                                 column_names                                                 
------------+---------------------------+--------------------------------------------------------------------------------------------------------------
 logs       | index_logs_on_created_at  | created_at
 logs       | index_logs_on_event_at    | event_at
 logs       | index_logs_on_event_type  | event_type
 logs       | index_logs_on_modified_at | modified_at
 logs       | index_logs_on_object_uuid | object_uuid
 logs       | index_logs_on_owner_uuid  | owner_uuid
 logs       | index_logs_on_summary     | summary
 logs       | index_logs_on_uuid        | uuid
 logs       | logs_pkey                 | id
 logs       | logs_search_index         | uuid, owner_uuid, modified_by_client_uuid, modified_by_user_uuid, object_uuid, event_type, object_owner_uuid
 logs       | logs_uuid                 | uuid
(11 rows)
arvados_production=# explain SELECT "logs".* FROM "logs"  WHERE (logs.id > 2341779 AND (((logs.event_type in ('create','update','delete'))))) ORDER BY id asc;
                                     QUERY PLAN                                      
-------------------------------------------------------------------------------------
 Index Scan using logs_pkey on logs  (cost=0.44..7547930.83 rows=16786039 width=744)
   Index Cond: (id > 2341779)
   Filter: ((event_type)::text = ANY ('{create,update,delete}'::text[]))
(3 rows)

Related issues 1 (0 open1 closed)

Related to Arvados - Bug #8289: [API] Avoid making queries unnecessarily slow by adding superfluous "order by" columnsResolvedTom Clegg02/08/2016

Actions
Actions #1

Updated by Nico César over 8 years ago

20+ minutes is under load: actually is 3m22s when is idle

$ time psql arvados_production <<< "SELECT "logs".* FROM "logs"  WHERE (logs.id > 2341779 AND (((logs.event_type in ('create','update','delete'))))) ORDER BY id asc;" 
Killed

real    3m22.794s
user    0m5.548s
sys     0m27.716s

Actions #2

Updated by Tom Clegg over 8 years ago

9542-websocket-queries @ f1c9f13

Actions #3

Updated by Nico César over 8 years ago

Tom Clegg wrote:

9542-websocket-queries @ f1c9f13

I looked at the code and looks better than before. we should see how this works out in a cluster under heavy load.

Actions #4

Updated by Tom Clegg over 8 years ago

Added find_each so a client saying "start at id=1" doesn't make the server fetch millions of id's in one batch.

7a77da2

Actions #5

Updated by Nico César over 8 years ago

Tom Clegg wrote:

Added find_each so a client saying "start at id=1" doesn't make the server fetch millions of id's in one batch.

7a77da2

7a77da2 looks good to me. let's merge and deploy as usual

Actions #6

Updated by Nico César over 8 years ago

just to add more logs to the fire ... I get a fiddlesticks trying to list a COLLECTION https://workbench.qr1hi.arvadosapi.com/collections/qr1hi-4zz18-gx0qr5w1386siuh

API request URL
    https://qr1hi.arvadosapi.com/arvados/v1/logs
API response

    {
      ":errors":[
        "HTTPClient::ReceiveTimeoutError error connecting to API server" 
      ]
    }
Actions #7

Updated by Tom Morris over 8 years ago

  • Category set to Performance
  • Assigned To set to Tom Clegg

Is this fixed?

Assigning to Tom Clegg for resolution since he worked on it.

Actions #8

Updated by Peter Amstutz almost 5 years ago

  • Status changed from New to Closed
Actions

Also available in: Atom PDF