Project

General

Profile

Actions

Bug #8289

closed

[API] Avoid making queries unnecessarily slow by adding superfluous "order by" columns

Added by Tom Clegg almost 9 years ago. Updated almost 9 years ago.

Status:
Resolved
Priority:
Normal
Assigned To:
Category:
API
Target version:
Start date:
02/08/2016
Due date:
% Done:

100%

Estimated time:
(Total: 0.00 h)
Story points:
0.5

Description

As an example, when the Python SDK's websocket fallback PollClient starts up, it does an API request with order="id asc". API server translates that to a PostgreSQL query like this:

arvados_production=# SELECT  "logs".* FROM "logs"  WHERE ((logs.event_type in ('create','update','delete')) AND (logs.id > '3464274')) ORDER BY logs.id desc, logs.modified_at desc, logs.uuid LIMIT 1 OFFSET 0;
arvados_production=# explain analyze SELECT  "logs".* FROM "logs"  WHERE ((logs.event_type in ('create','update','delete')) AND (logs.id > '3464274')) ORDER BY logs.id desc, logs.modified_at desc, logs.uuid LIMIT 1 OFFSET 0;
                                                                              QUERY PLAN                                                                               
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=822081.89..822081.89 rows=1 width=1290) (actual time=4007.367..4007.368 rows=1 loops=1)
   ->  Sort  (cost=822081.89..823218.65 rows=454703 width=1290) (actual time=4007.364..4007.364 rows=1 loops=1)
         Sort Key: id, modified_at, uuid
         Sort Method: top-N heapsort  Memory: 27kB
         ->  Bitmap Heap Scan on logs  (cost=69925.24..819808.37 rows=454703 width=1290) (actual time=531.924..3086.689 rows=869534 loops=1)
               Recheck Cond: ((id > 3464274) AND ((event_type)::text = ANY ('{create,update,delete}'::text[])))
               ->  BitmapAnd  (cost=69925.24..69925.24 rows=454703 width=0) (actual time=530.935..530.935 rows=0 loops=1)
                     ->  Bitmap Index Scan on logs_pkey  (cost=0.00..29223.51 rows=1433428 width=0) (actual time=249.940..249.940 rows=1954308 loops=1)
                           Index Cond: (id > 3464274)
                     ->  Bitmap Index Scan on index_logs_on_event_type  (cost=0.00..40474.13 rows=1341921 width=0) (actual time=270.632..270.632 rows=1735755 loops=1)
                           Index Cond: ((event_type)::text = ANY ('{create,update,delete}'::text[]))
 Total runtime: 4007.417 ms

"id" is a primary key, so the "logs.modified_at desc, logs.uuid" part of the "order by" clause can't possibly make a difference to the results. If we remove them, the query looks like this:

arvados_production=# explain analyze SELECT  "logs".* FROM "logs"  WHERE ((logs.event_type in ('create','update','delete')) AND (logs.id > '3464274')) ORDER BY logs.id desc LIMIT 1 OFFSET 0;
                                                                 QUERY PLAN                                                                  
---------------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=0.00..8.67 rows=1 width=1290) (actual time=0.049..0.050 rows=1 loops=1)
   ->  Index Scan Backward using logs_pkey on logs  (cost=0.00..3943415.80 rows=454703 width=1290) (actual time=0.046..0.046 rows=1 loops=1)
         Index Cond: (id > 3464274)
         Filter: ((event_type)::text = ANY ('{create,update,delete}'::text[]))
 Total runtime: 0.078 ms

Subtasks 1 (0 open1 closed)

Task #8354: review 8289-no-extra-ordersResolvedTom Clegg02/08/2016

Actions

Related issues 1 (0 open1 closed)

Related to Arvados - Bug #9542: [API] review logs queries ClosedTom Clegg07/05/2016

Actions
Actions

Also available in: Atom PDF