Story #15781
closed[WB2] Support multiple values for a given property key
100%
Description
Backend¶
https://www.postgresql.org/docs/9.5/functions-json.html
arvados_development=# select '{"a":["foo","bar"]}'::jsonb @> '{"a":["bar"]}'::jsonb; ?column? ---------- t
So jsonb subset match works if you know the key and are looking for an exact match on the value.
It looks like an Arvados filter of [["properties.a", "=", ["bar"]]
might already work by since it uses @>
in the implementation. Although that's arguably a bug since the expectation is to get an exact match. Instead we'll introduce a new operator [["properties.a", "contains", "bar"]
.
Frontend UI¶
For editing, rather than overwriting a tag:value pair as is done currently when the tag is specified multiple times, add the new tag:value pair to the list. When writing to the backend, merge the values into an array which becomes value that's written into the properties. Arrays are only used when there are multiple values.
For searching the implementation will use the new contains
operator internally, but continue to use the "has" notation externally.
Updated by Tom Morris about 5 years ago
- Description updated (diff)
- Target version changed from To Be Groomed to Arvados Future Sprints
- Story points set to 4.0
Updated by Peter Amstutz about 5 years ago
- Target version deleted (
Arvados Future Sprints) - Release set to 20
Updated by Lucas Di Pentima about 5 years ago
- Category set to Workbench2
- Assigned To set to Lucas Di Pentima
- Target version set to 2020-01-29 Sprint
Updated by Lucas Di Pentima almost 5 years ago
- Status changed from New to In Progress
Updated by Lucas Di Pentima almost 5 years ago
- Target version changed from 2020-01-29 Sprint to 2020-02-12 Sprint
Updated by Lucas Di Pentima almost 5 years ago
Updates at commit:8a88ceb33 - branch 15781-multi-value-property-search
(arvados repo)
Test run: developer-run-tests: #1711
- Adds
contains
subproperty filter to the API server - Adds tests checking that the new operator works on scalar and array subproperty values and also that the
=
filter doesn't work on array subproperty elements. - Adds documentation update.
Updated by Lucas Di Pentima almost 5 years ago
Updates for workbench2 at commit 41658ace - branch 15781-multi-value-property-edit
- Adds support for multi-value properties editing on collections and projects
- Adds support for multi-value properties on the advanced search editor
- Uses 'contains' filter when searching for properties
Updated by Lucas Di Pentima almost 5 years ago
- Has duplicate Bug #15774: Advanced Search editor keeps adding search terms instead of replacing added
Updated by Peter Amstutz almost 5 years ago
Lucas Di Pentima wrote:
Updates at commit:8a88ceb33 - branch
15781-multi-value-property-search
(arvados repo)
Test run: developer-run-tests: #1711
- Adds
contains
subproperty filter to the API server- Adds tests checking that the new operator works on scalar and array subproperty values and also that the
=
filter doesn't work on array subproperty elements.- Adds documentation update.
The documentation for "contains" operation is not very clear. Suggest something like:
Filter where subproperty has a value either by exact match or value is element of subproperty list. For example, ["foo", "containes", "bar"] will find both {"foo": "bar"} and {"foo": ["bar", "baz"]}.
Updated by Lucas Di Pentima almost 5 years ago
Peter Amstutz wrote:
Lucas Di Pentima wrote:
Updates at commit:8a88ceb33 - branch
15781-multi-value-property-search
(arvados repo)
[...]
The documentation for "contains" operation is not very clear. Suggest something like:
Filter where subproperty has a value either by exact match or value is element of subproperty list. For example, ["foo", "containes", "bar"] will find both {"foo": "bar"} and {"foo": ["bar", "baz"]}.
Fixed at commit:dd7da09fe
Updated by Peter Amstutz almost 5 years ago
Lucas Di Pentima wrote:
Updates for workbench2 at commit 41658ace - branch
15781-multi-value-property-edit
- Adds support for multi-value properties editing on collections and projects
- Adds support for multi-value properties on the advanced search editor
- Uses 'contains' filter when searching for properties
In the advanced search popup, removing property filters doesn't remove them from the query bar.
If you attempt delete a property and the update fails, it doesn't restore the property in the display.
If you do a federated search on clusters that don't support 'contains' you get back errors. Should increment to increment the API version and test for it, and decide if client is able to use 'contains' or needs to fall back on '=' or whatever it used before.
Updated by Lucas Di Pentima almost 5 years ago
- Target version changed from 2020-02-12 Sprint to 2020-02-26 Sprint
Updated by Lucas Di Pentima almost 5 years ago
Update at commit:8141f4c40 - branch 15781-multi-value-property-search
(arvados repo)
- Bumps the API revision number to indicate the new
contains
filter operator to clients.
Updated by Lucas Di Pentima almost 5 years ago
Update at commit ecd7fabb - branch 15781-multi-value-property-edit
(wb2 repo)
Test run: developer-tests-workbench2: #12
- Adds API revision number (from discovery document) to sessions, defaulting to zero.
- Uses 'contains' or 'ilike' when searching properties depending on API revision number.
- Fixes error handling on properties add/remove operations.
- Fixes property removal on advanced search editor.
- Fixes property Chip UI duplication on advanced search editor.
Updated by Peter Amstutz almost 5 years ago
Lucas Di Pentima wrote:
Update at commit ecd7fabb - branch
15781-multi-value-property-edit
(wb2 repo)
Test run: developer-tests-workbench2: #12
- Uses 'contains' or 'ilike' when searching properties depending on API revision number.
Will 'contains' ignore case? Or should it fall back to 'like' instead of 'ilike'?
Updated by Lucas Di Pentima almost 5 years ago
Updates at commit:7a7447ac7
Test run: developer-run-tests-services-api: #1779
As workbench2 was previously using 'ilike', I preferred to give 'contains' case-insensitive matching powers. Added a couple of tests for this.
Updated by Lucas Di Pentima almost 5 years ago
I've run PG's EXPLAIN on both types of queries, and it seems that my solution isn't the best as it forces a sequential scan:
Before¶
arvados_test=# explain select uuid from collections where collections.properties @> '{"listprop":"ELEM6"}'::jsonb; QUERY PLAN ---------------------------------------------------------------------------------------------- Bitmap Heap Scan on collections (cost=12.00..16.01 rows=1 width=27) Recheck Cond: (properties @> '{"listprop": "ELEM6"}'::jsonb) -> Bitmap Index Scan on collection_index_on_properties (cost=0.00..12.00 rows=1 width=0) Index Cond: (properties @> '{"listprop": "ELEM6"}'::jsonb)
After¶
arvados_test=# explain select uuid from collections where lower(collections.properties::text)::jsonb @> lower('{"listprop":"ELEM6"}')::jsonb; QUERY PLAN ---------------------------------------------------------------------------------- Seq Scan on collections (cost=0.00..42.15 rows=1 width=27) Filter: ((lower((properties)::text))::jsonb @> '{"listprop": "elem6"}'::jsonb)
Maybe there's an index we can add? I'll look into it. The plan B would be to make wb2 use like as a fallback, but that would change previous behavior that users may already be using. This might not be too disruptive as wb2 is still in beta.
Updated by Lucas Di Pentima almost 5 years ago
Adding an expression GIN index makes the difference:
arvados_test=# create index coll_props_lower_2 on collections using gin ((lower(properties::text)::jsonb)); CREATE INDEX arvados_test=# explain select uuid from collections where (lower(collections.properties::text)::jsonb) @> lower('{"listprop":"ELEM6"}')::jsonb; QUERY PLAN -------------------------------------------------------------------------------------------- Bitmap Heap Scan on collections (cost=12.00..16.03 rows=1 width=27) Recheck Cond: ((lower((properties)::text))::jsonb @> '{"listprop": "elem6"}'::jsonb) -> Bitmap Index Scan on coll_props_lower_2 (cost=0.00..12.00 rows=1 width=0) Index Cond: ((lower((properties)::text))::jsonb @> '{"listprop": "elem6"}'::jsonb) (4 rows)
Updated by Lucas Di Pentima almost 5 years ago
After talking about the case [in]sensitive search topic with Peter, we concluded that for now, workbench2 will support case-sensitive search on properties to avoid having a second index.
Just in case we need to make the contains
filter operator to do case-insensitive matches, here's the diff:
diff --git a/services/api/lib/record_filters.rb b/services/api/lib/record_filters.rb
index 5688ca614..b3bf055d6 100644
--- a/services/api/lib/record_filters.rb
+++ b/services/api/lib/record_filters.rb
@@ -141,7 +141,7 @@ module RecordFilters
end
param_out << proppath
when 'contains'
- cond_out << "#{attr_table_name}.#{attr} @> ?::jsonb OR #{attr_table_name}.#{attr} @> ?::jsonb"
+ cond_out << "lower(#{attr_table_name}.#{attr}::text)::jsonb @> lower(?)::jsonb OR lower(#{attr_table_name}.#{attr}::text)::jsonb @> lower(?)::jsonb"
param_out << SafeJSON.dump({proppath => operand})
param_out << SafeJSON.dump({proppath => [operand]})
else
Updated by Lucas Di Pentima almost 5 years ago
Updates at commit:1d922bf21 undo the last 2 commits, and just adds a couple of tests to confirm that contains
does case-sensitive matching only.
Test run: developer-run-tests-services-api: #1784
Updated by Lucas Di Pentima almost 5 years ago
Updates at commit e8d27222 (wb2 repo) makes the search fallback to use like
instead of ilike
for property matching.
Test run: developer-tests-workbench2: #13
Updated by Peter Amstutz almost 5 years ago
If we're just doing exact matches, shouldn't the fallback query be
["groups.properties.IDTAGANIMALS","=","IDVALANIMALS2"]
not
["groups.properties.IDTAGANIMALS","like","%IDVALANIMALS2%"]
Doesn't using "like" force a sequential scan?
Updated by Peter Amstutz almost 5 years ago
Bitmap Heap Scan on collections (cost=164.25..282.31 rows=32 width=87) (actual time=0.614..0.615 rows=1 loops=1) Recheck Cond: (properties @> '{"IDTAGANIMALS": "IDVALANIMALS2"}'::jsonb) Heap Blocks: exact=1 -> Bitmap Index Scan on collection_index_on_properties (cost=0.00..164.24 rows=32 width=0) (actual time=0.607..0.607 rows=1 loops=1) Index Cond: (properties @> '{"IDTAGANIMALS": "IDVALANIMALS2"}'::jsonb) Planning time: 0.137 ms Execution time: 0.675 ms
Seq Scan on collections (cost=0.00..3241.22 rows=3 width=87) (actual time=9.591..11.891 rows=1 loops=1) Filter: ((properties ->> 'IDTAGANIMALS'::text) ~~ '%IDVALANIMALS2%'::text) Rows Removed by Filter: 32114 Planning time: 0.132 ms Execution time: 11.910 ms
Updated by Peter Amstutz almost 5 years ago
Bitmap Heap Scan on collections (cost=328.52..559.55 rows=65 width=87) (actual time=1.135..1.135 rows=1 loops=1) Recheck Cond: ((properties @> '{"IDTAGANIMALS": "IDVALANIMALS2"}'::jsonb) OR (properties @> '{"IDTAGANIMALS": ["IDVALANIMALS2"]}'::jsonb)) Heap Blocks: exact=1 -> BitmapOr (cost=328.52..328.52 rows=65 width=0) (actual time=1.125..1.125 rows=0 loops=1) -> Bitmap Index Scan on collection_index_on_properties (cost=0.00..164.24 rows=32 width=0) (actual time=0.569..0.569 rows=1 loops=1) Index Cond: (properties @> '{"IDTAGANIMALS": "IDVALANIMALS2"}'::jsonb) -> Bitmap Index Scan on collection_index_on_properties (cost=0.00..164.24 rows=32 width=0) (actual time=0.555..0.555 rows=0 loops=1) Index Cond: (properties @> '{"IDTAGANIMALS": ["IDVALANIMALS2"]}'::jsonb) Planning time: 0.165 ms Execution time: 1.182 ms
Updated by Peter Amstutz almost 5 years ago
The wb2 filter builder automatically adds leading and trailing '%' wildcard with 'addLike'.
Updated by Peter Amstutz almost 5 years ago
Container requests also have jsonb properties, that should be included in queryToFilters.
Updated by Lucas Di Pentima almost 5 years ago
Above comments fixed at commit 5b0a7e37. Thanks!
Updated by Peter Amstutz almost 5 years ago
Peter Amstutz wrote:
The wb2 filter builder automatically adds leading and trailing '%' wildcard with 'addLike'.
I realize it's not clear, that's just a note, not necessarily something to fix, although it is a little bit surprising.
Updated by Lucas Di Pentima almost 5 years ago
- Status changed from In Progress to Resolved