Bug #3400
Updated by Tom Clegg about 10 years ago
Background: If a developer writes something like @Link.where(...).each { ... }@ and the API call matches more results than the page limit (default 100), the code will silently iterate over only the first 100 results. This is error-prone and leads to subtle bugs: code works as expected while there are fewer than 100 results, then quietly starts losing results once there are more than 100. The client libraries should clearly distinguish between cases where paging is desired and cases where the caller wants to iterate over the entire result set. This problem currently exists in all client SDKs: Python, Go, Ruby, Java, Perl, command line, and Rails (Workbench).
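To make the failure mode concrete, here is a minimal sketch (the @where_first_page@ helper and the item counts are hypothetical stand-ins, not SDK code): the server caps each response at 100 items, so naive iteration sees only the first page even though 250 records match.

```ruby
MATCHING = 250   # records that actually match the filter (assumed)
PAGE_CAP = 100   # server-imposed page size (assumed)

# Stand-in for Link.where(...): like the current SDKs, it returns
# only the first server page of results.
def where_first_page
  (0...[MATCHING, PAGE_CAP].min).to_a
end

seen = 0
where_first_page.each { seen += 1 }
# seen is 100, not 250 -- the remaining 150 results are silently dropped.
```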
Current priority:
* Add and document "all pages" feature in Rails/Workbench client library
* Document the limitation in other libraries. If possible, show an example of how to do it with the @offset@ parameter, like the @$newtask_results@ code in source:sdk/cli/bin/crunch-job#L1056.
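Until "all pages" support lands, the documented workaround is manual @offset@ paging, as in the @$newtask_results@ loop. A minimal sketch of that pattern in Ruby follows; @api_list@ and its item counts are stand-ins simulating a server that caps each response at 100 items, not a real SDK call.

```ruby
TOTAL_ITEMS = 250   # assumed number of matching records
PAGE_LIMIT  = 100   # assumed server page cap

# Simulated API call: returns at most PAGE_LIMIT items starting at offset,
# along with the total number of matching items.
def api_list(offset: 0)
  items = (offset...[offset + PAGE_LIMIT, TOTAL_ITEMS].min).to_a
  { 'items' => items, 'items_available' => TOTAL_ITEMS }
end

# Page through all results by advancing offset until everything is fetched.
all_items = []
offset = 0
loop do
  page = api_list(offset: offset)
  all_items.concat(page['items'])
  break if page['items'].empty? || all_items.size >= page['items_available']
  offset += page['items'].size
end
# all_items now holds every matching result, not just the first page.
```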
Implementation:
"Get all pages automatically" should be the default behavior, as it is in ActiveRecord. Making this change will involve fixing all existing cases where Workbench relies on the current "one page" behavior.
* @Model.limit(100)@ sets @limit=100@ in the API call, _and_ fetches multiple pages in order to reach 100 results (or EOF) even if the API server imposes a maximum page size smaller than the @limit@ given.
* @each()@, @select()@, and @collect()@ fetch subsequent pages as needed to get all matching results (an @each@ loop that breaks in the first iteration will involve just one API call).
* The code that retrieves results will finally get extricated from @where()@.
* @to_ary@ will have to fetch all pages before returning. Therefore, it should be avoided where possible.
* Accept @limit(:page)@ to signify "one page, using the API server's default @limit@" (i.e., don't send a @limit@ parameter, and don't fetch additional pages).
* Use @limit(1)@, @limit(:page)@, etc. in Workbench where appropriate (e.g., retrieving a single page of results for infinite-scroll content).
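The proposed semantics above can be sketched with a lazy @Enumerable@ wrapper. Everything here is hypothetical (the @ApiStub@ class, its page size, and the @ResultSet@ name are illustration only, not the real SDK): @each@ fetches pages only as the caller consumes results, an integer @limit@ keeps fetching pages until it is satisfied even when the server's page cap is smaller, and @limit(:page)@ stops after exactly one API call.

```ruby
# Stand-in for the API server: caps each response at page_size items
# and counts how many list calls were made.
class ApiStub
  attr_reader :calls
  def initialize(total, page_size)
    @total, @page_size, @calls = total, page_size, 0
  end
  def list(offset)
    @calls += 1
    items = (offset...[offset + @page_size, @total].min).to_a
    { items: items, items_available: @total }
  end
end

class ResultSet
  include Enumerable
  def initialize(api, limit: nil)
    @api, @limit = api, limit
  end
  # Fetch pages lazily: a loop that breaks during the first page
  # triggers only one API call.
  def each
    offset = 0
    yielded = 0
    loop do
      resp = @api.list(offset)
      resp[:items].each do |item|
        # An integer limit is honored across page boundaries.
        return if @limit.is_a?(Integer) && yielded >= @limit
        yield item
        yielded += 1
      end
      # :page means "exactly one API page, server's default limit".
      break if @limit == :page || resp[:items].empty?
      offset += resp[:items].size
      break if offset >= resp[:items_available]
    end
  end
end

api = ApiStub.new(250, 100)
all = ResultSet.new(api).to_a                       # walks all three pages
api2 = ApiStub.new(250, 100)
one_page = ResultSet.new(api2, limit: :page).to_a   # exactly one API call
api3 = ApiStub.new(250, 100)
first = ResultSet.new(api3, limit: 1).to_a          # stops after one result
```

Note the design choice this sketch reflects: the result-fetching logic lives in the enumerator, not in @where()@, so @to_ary@ (and anything else that materializes the whole set) is the only path forced to fetch every page up front.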