Feature #5011
closed[SDKs] arv-put accepts --replication argument (saved with collection record)
Added by Tom Clegg almost 10 years ago. Updated almost 10 years ago.
100%
Updated by Brett Smith almost 10 years ago
- Subject changed from [SDKs] arv-get accepts --replication argument (saved with collection record) to [SDKs] arv-put accepts --replication argument (saved with collection record)
Updated by Ward Vandewege almost 10 years ago
- Target version changed from Arvados Future Sprints to 2015-02-18 sprint
Updated by Tom Clegg almost 10 years ago
- Category set to SDKs
- Assigned To set to Tom Clegg
Updated by Brett Smith almost 10 years ago
Reviewing 655b69e. Unfortunately, I got these test failures:
====================================================================== FAIL: test_ArvPutSignedManifest (tests.test_arv_put.ArvPutIntegrationTest) ---------------------------------------------------------------------- Traceback (most recent call last): File "/home/brett/repos/arvados/sdk/python/tests/test_arv_put.py", line 507, in test_ArvPutSignedManifest self.assertEqual(p.returncode, 0) AssertionError: 1 != 0 ====================================================================== FAIL: test_put_collection_with_default_redundancy (tests.test_arv_put.ArvPutIntegrationTest) ---------------------------------------------------------------------- Traceback (most recent call last): File "/home/brett/repos/arvados/sdk/python/tests/test_arv_put.py", line 539, in test_put_collection_with_default_redundancy collection = self.run_and_find_collection("", []) File "/home/brett/repos/arvados/sdk/python/tests/test_arv_put.py", line 528, in run_and_find_collection self.assertEqual(1, len(collection_list)) AssertionError: 1 != 0 ====================================================================== FAIL: test_put_collection_with_high_redundancy (tests.test_arv_put.ArvPutIntegrationTest) ---------------------------------------------------------------------- Traceback (most recent call last): File "/home/brett/repos/arvados/sdk/python/tests/test_arv_put.py", line 535, in test_put_collection_with_high_redundancy collection = self.run_and_find_collection("", ['--replication', '4']) File "/home/brett/repos/arvados/sdk/python/tests/test_arv_put.py", line 528, in run_and_find_collection self.assertEqual(1, len(collection_list)) AssertionError: 1 != 0 ====================================================================== FAIL: test_put_collection_with_name_and_no_project (tests.test_arv_put.ArvPutIntegrationTest) ---------------------------------------------------------------------- Traceback (most recent call last): File "/home/brett/repos/arvados/sdk/python/tests/test_arv_put.py", line 553, in test_put_collection_with_name_and_no_project ['--portable-data-hash', '--name', link_name]) File "/home/brett/repos/arvados/sdk/python/tests/test_arv_put.py", line 528, in run_and_find_collection self.assertEqual(1, len(collection_list)) AssertionError: 1 != 0 ====================================================================== FAIL: test_put_collection_with_named_project_link (tests.test_arv_put.ArvPutIntegrationTest) ---------------------------------------------------------------------- Traceback (most recent call last): File "/home/brett/repos/arvados/sdk/python/tests/test_arv_put.py", line 563, in test_put_collection_with_named_project_link '--project-uuid', self.PROJECT_UUID]) File "/home/brett/repos/arvados/sdk/python/tests/test_arv_put.py", line 528, in run_and_find_collection self.assertEqual(1, len(collection_list)) AssertionError: 1 != 0 ====================================================================== FAIL: test_put_collection_with_unnamed_project_link (tests.test_arv_put.ArvPutIntegrationTest) ---------------------------------------------------------------------- Traceback (most recent call last): File "/home/brett/repos/arvados/sdk/python/tests/test_arv_put.py", line 544, in test_put_collection_with_unnamed_project_link ['--portable-data-hash', '--project-uuid', self.PROJECT_UUID]) File "/home/brett/repos/arvados/sdk/python/tests/test_arv_put.py", line 528, in run_and_find_collection self.assertEqual(1, len(collection_list)) AssertionError: 1 != 0 ----------------------------------------------------------------------
This showed up in the run, which seems like a likely culprit:
Traceback (most recent call last): File "/home/brett/repos/arvados/sdk/python/arvados/commands/put.py", line 518, in <module> main() File "/home/brett/repos/arvados/sdk/python/arvados/commands/put.py", line 483, in main 'redundancy': replication, NameError: global name 'replication' is not defined
Beyond that:
- I think it would be nicer to use None as a default value for replication in both for CollectionWriter and --replication (i.e., don't set a default). None more clearly signals to users "we'll figure out our own default value," bringing it in line with the documentation. It will also help expose bugs in case we ever fail to set that default: later code that might work with 0 will probably raise an exception with None.
- It looks like we should just have the signatures for
local_store_put
andlocal_store_get
matchput
andget
, respectively. This meanscopies
should go in front ofnum_retries
inlocal_store_put
, andlocal_store_get
shouldn't have acopies
argument (that makes no sense). It would be extra-nice to update the comment abovelocal_store_put
to talk about this general rule. - Do you want to go all the way, and write a proper docstring for CollectionWriter.finish() from the comment?
- I thought disk_services_available looked awfully familiar—there's a lot of overlap between setup methods in CollectionTestMixin and KeepClientServiceTestCase. Is it reasonable to DRY this up, perhaps in arvados_testutil?
Thanks.
Updated by Tom Clegg almost 10 years ago
Brett Smith wrote:
Reviewing 655b69e. Unfortunately, I got these test failures:
[...]
This showed up in the run, which seems like a likely culprit:
[...]
My bad, unpushed commits. :(
- I think it would be nicer to use None as a default value for replication in both for CollectionWriter and --replication (i.e., don't set a default). None more clearly signals to users "we'll figure out our own default value," bringing it in line with the documentation. It will also help expose bugs in case we ever fail to set that default: later code that might work with 0 will probably raise an exception with None.
Done.
- It looks like we should just have the signatures for
local_store_put
andlocal_store_get
matchput
andget
, respectively. This meanscopies
should go in front ofnum_retries
inlocal_store_put
, andlocal_store_get
shouldn't have acopies
argument (that makes no sense). It would be extra-nice to update the comment abovelocal_store_put
to talk about this general rule.
Done. (And moved into docstring.)
- Do you want to go all the way, and write a proper docstring for CollectionWriter.finish() from the comment?
Sure thing, done.
- I thought disk_services_available looked awfully familiar—there's a lot of overlap between setup methods in CollectionTestMixin and KeepClientServiceTestCase. Is it reasonable to DRY this up, perhaps in arvados_testutil?
Done, in a new arvados_testutil.ApiClientMock
class.
Now at cbf80c0
Updated by Brett Smith almost 10 years ago
Reviewing cbf80c0. Thank you very much for DRYing up the test infrastructure. This is much nicer.
- Please add unit tests for
replication_desired
. At least one for case where redundancy is defined, and another when it isn't. - The comment at the top of
Collection.attributes_required_columns
would benefit from being made more general, since the code is too. - Please add
replication_desired
to the API schema documentation. It would be ideal to add a note or two to explain how it relates to theredundancy
attributes, too, mentioning what's deprecated and so on. - I feel like the
CollectionWriter.finish
docstring is a little too scary. Yes, users should be aware that it doesn't make an Arvados collection object, but using this method to save task output is (currently) normal and expected. Suggest documenting the distinction but back off language like "beware" and "special cases." As a suggestion: "This is primarily used to save and record task output. In all other cases, you should make a collection instead…" - When you instantiate a KeepClient with a local store, it actually switches out its own
put
andget
methods for thelocal_store
versions. The docstrings might paint a fuller picture by describing this relationship rather than saying it's just "for use in test cases;" the__init__
docstring already points out that using a local store is primarily for testing. - Please make docstring formatting follow PEP 257. In particular, one-line docstrings should get closing quotes on the same line, while multiline docstrings should end with closing quotes on a line by themselves, with no blank lines on either side.
Thanks again.
Updated by Tom Clegg almost 10 years ago
Brett Smith wrote:
- Please add unit tests for
replication_desired
. At least one for case where redundancy is defined, and another when it isn't.
Added.
Also added functional tests to protect/document Python SDK's usage.
- The comment at the top of
Collection.attributes_required_columns
would benefit from being made more general, since the code is too.
Updated comments.
- Please add
replication_desired
to the API schema documentation. It would be ideal to add a note or two to explain how it relates to theredundancy
attributes, too, mentioning what's deprecated and so on.
Can we leave this for #3410?
Meanwhile I realized that forward compatibility in the SDK (for a few days) would be better than backward compatibility in the API (for longer than that), so I did this
replication_attr = 'replication_desired'
if api_client._schema.schemas['Collection']['properties'].get(replication_attr, None) is None:
# API calls it 'redundancy' until #3410.
replication_attr = 'redundancy'
- I feel like the
CollectionWriter.finish
docstring is a little too scary. Yes, users should be aware that it doesn't make an Arvados collection object, but using this method to save task output is (currently) normal and expected. Suggest documenting the distinction but back off language like "beware" and "special cases." As a suggestion: "This is primarily used to save and record task output. In all other cases, you should make a collection instead…"
Updated. I left out "all" from "all other cases", though. Fear of absolutes.
- When you instantiate a KeepClient with a local store, it actually switches out its own
put
andget
methods for thelocal_store
versions. The docstrings might paint a fuller picture by describing this relationship rather than saying it's just "for use in test cases;" the__init__
docstring already points out that using a local store is primarily for testing.
Updated:
- """A stub for put(), for use in test cases.
+ """A stub for put().
+
+ This method is used in place of the real put() method in a
+ KeepClient constructed with local_store=True.
- Please make docstring formatting follow PEP 257. In particular, one-line docstrings should get closing quotes on the same line, while multiline docstrings should end with closing quotes on a line by themselves, with no blank lines on either side.
Fascinating, sort of. PEP-257 says "Blank lines should be removed from the beginning and end of the docstring." But this is in a paragraph describing what processing tools should do, not(?) code authors. Emacs python-mode defaults to ending with a blank line (which it calls "pep-257" style). But now I found (setq python-fill-docstring-style 'pep-257-nn)
and stuck it in ~/.emacs and Coding Standards so I don't have to [fail to] remember to remove that trailing blank line after every M-q. Phew.
Now at 71a556d
Updated by Brett Smith almost 10 years ago
Tom Clegg wrote:
- Please add
replication_desired
to the API schema documentation. It would be ideal to add a note or two to explain how it relates to theredundancy
attributes, too, mentioning what's deprecated and so on.Can we leave this for #3410?
Sure.
Meanwhile I realized that forward compatibility in the SDK (for a few days) would be better than backward compatibility in the API (for longer than that), so I did this
- It would be simpler to write
if replication_attr not in api_client._schema.schemas['Collection']['properties']
. - We're already doing this elsewhere, but I'll point out that
_schema
is pretty clearly marked as being not public. It's handy, but every time we do this, we risk making more work for ourselves to keep up with future googleapiclient updates. (No, I don't see a public interface to achieve the same goal.)
- When you instantiate a KeepClient with a local store, it actually switches out its own
put
andget
methods for thelocal_store
versions. The docstrings might paint a fuller picture by describing this relationship rather than saying it's just "for use in test cases;" the__init__
docstring already points out that using a local store is primarily for testing.Updated:
The local_store
argument is expected to be a directory path string, not a boolean.
- Please make docstring formatting follow PEP 257. In particular, one-line docstrings should get closing quotes on the same line, while multiline docstrings should end with closing quotes on a line by themselves, with no blank lines on either side.
Fascinating, sort of. PEP-257 says "Blank lines should be removed from the beginning and end of the docstring." But this is in a paragraph describing what processing tools should do, not(?) code authors.
You're right, sorry about that. I think I sort of assumed this from the example multi-line docstring.
My points here are all comparatively minor, so I think you're good to merge with or without addressing them. Thanks.
Updated by Anonymous almost 10 years ago
- Status changed from New to Resolved
- % Done changed from 60 to 100
Applied in changeset arvados|commit:64c70939c414881de61ac65512701d0ba4068786.
Updated by Tom Clegg almost 10 years ago
- Status changed from Resolved to In Progress
Updated by Brett Smith almost 10 years ago
Reviewing 5011-thread-safe-test at 3068c62
Instead of writing our own threadsafe_iter, would it be possible to just stuff the mock responses in a Queue.Queue, and then set the mock's side_effect to that_queue.get
?
Updated by Tom Clegg almost 10 years ago
Brett Smith wrote:
Reviewing 5011-thread-safe-test at 3068c62
Instead of writing our own threadsafe_iter, would it be possible to just stuff the mock responses in a Queue.Queue, and then set the mock's side_effect to
that_queue.get
?
Yes! In fact, I started out trying to use a Queue, but it didn't work out, and the iter seemed simple enough. But I tried Queue again and it worked better this time.
Now, in 50df495:queue = Queue.Queue() for val in items: queue.put(val) return lambda *args, **kwargs: queue.get(block=False)
Now at d5809a1
Updated by Brett Smith almost 10 years ago
Tom Clegg wrote:
Brett Smith wrote:
Instead of writing our own threadsafe_iter, would it be possible to just stuff the mock responses in a Queue.Queue, and then set the mock's side_effect to
that_queue.get
?Yes! In fact, I started out trying to use a Queue, but it didn't work out, and the iter seemed simple enough. But I tried Queue again and it worked better this time.
Sweet, thank you for indulging me. I agree threadsafe_iter was pretty straightforward, but still, less code is less code. d5809a1e is good to merge, thanks.
Updated by Tom Clegg almost 10 years ago
- Status changed from In Progress to Resolved