Bug #10543

open

implement approximate (estimated) counts for API list method

Added by Joshua Randall about 8 years ago. Updated over 3 years ago.

Status: New
Priority: Normal
Assigned To: -
Category: -
Target version: -
Start date: 11/16/2016
Due date:
% Done: 0%
Estimated time:
Story points: -

Description

Implement a -count=estimate option for API list queries, to return an estimated/approximate row count in `items_available` rather than the exact count (or no count, as the option 'none' introduced in #9998 allows).

Postgres has a simple way of getting an approximate row count for an entire table very quickly, and a somewhat more involved way of getting an approximate count for more sophisticated queries (https://wiki.postgresql.org/wiki/Count_estimate), which should still be much faster than a full table scan.
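A minimal sketch of the two approaches from that wiki page, in Python. The table name `collections`, the helper names, and the sample plan are assumptions for illustration, not existing API server code. The SQL strings would be run through whatever database client the server already uses; only the parsing of `EXPLAIN (FORMAT JSON)` output is exercised here.

```python
import json

# Whole-table estimate: read the planner's statistics instead of scanning.
# "collections" is an assumed table name for illustration.
WHOLE_TABLE_ESTIMATE_SQL = (
    "SELECT reltuples::bigint AS estimate "
    "FROM pg_class WHERE relname = 'collections'"
)


def explain_sql(filtered_query: str) -> str:
    """Wrap a SELECT so Postgres returns its plan as JSON without running it."""
    return "EXPLAIN (FORMAT JSON) " + filtered_query


def estimate_from_explain(explain_output) -> int:
    """Extract the planner's row estimate from EXPLAIN (FORMAT JSON) output.

    Accepts either the raw JSON text or the already-decoded list.
    """
    if isinstance(explain_output, str):
        plans = json.loads(explain_output)
    else:
        plans = explain_output
    # The top-level plan node carries the estimate for the whole query.
    return int(plans[0]["Plan"]["Plan Rows"])


# Example plan shaped like Postgres output; the numbers are made up.
sample_plan = json.dumps([
    {"Plan": {"Node Type": "Seq Scan", "Relation Name": "collections",
              "Plan Rows": 7323212, "Plan Width": 100}}
])
print(estimate_from_explain(sample_plan))  # prints 7323212
```

Both numbers come from planner statistics, so their accuracy depends on how recently the table was analyzed (by `ANALYZE` or autovacuum).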

This could be used anywhere only an approximate count is needed. That could include:
- to populate a UI that displays the number of pages available rather than the count
- to populate a UI that displays the number of items available in approximate terms (e.g. instead of showing "Data Collections (7323212)", Workbench could say "Data Collections (7.3M)")
- to create an appropriately sized data structure to accommodate all the data (e.g. to set the collection map size at the beginning of a keep-balance run, which already uses 110% of the returned value)
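The three uses above each need only simple arithmetic on the estimate. A rough sketch in Python follows; the function names, the page size of 100, and the suffix scheme are hypothetical, and the 110% headroom mirrors the keep-balance behavior mentioned in the last item.

```python
def humanize_count(n: int) -> str:
    """Format a count in approximate terms, e.g. 7323212 -> '7.3M'."""
    for threshold, suffix in ((1_000_000_000, "B"), (1_000_000, "M"), (1_000, "K")):
        if n >= threshold:
            return f"{n / threshold:.1f}{suffix}"
    return str(n)


def page_count(estimate: int, page_size: int) -> int:
    """Number of pages needed to show `estimate` items (ceiling division)."""
    return (estimate + page_size - 1) // page_size


def map_capacity(estimate: int, headroom_percent: int = 110) -> int:
    """Initial capacity for a data structure sized from the estimate."""
    return estimate * headroom_percent // 100


estimate = 7323212
print(f"Data Collections ({humanize_count(estimate)})")  # Data Collections (7.3M)
print(page_count(estimate, 100))                         # 73233
print(map_capacity(estimate))                            # 8055533
```

Because the input is only an estimate, a consumer that sizes a data structure this way still has to cope with the real item count exceeding the preallocated capacity.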

#1

Updated by Tom Morris over 7 years ago

  • Target version set to Arvados Future Sprints
#2

Updated by Ward Vandewege over 3 years ago

  • Target version deleted (Arvados Future Sprints)