Support #18799
open
Strategy to generate Python SDK docstrings based on API docs
Added by Peter Amstutz almost 3 years ago. Updated about 2 years ago.
0%
Description
Write a script that:
- takes the discovery document
- produces Python stubs with docstrings, type annotations, etc., corresponding to the methods generated by the Google API client
- adds the stub files to the Python SDK
- runs pydoc
The goal is for the methods/objects found under arvados.api() (generated on the fly by google api client) to be browsable in pydoc.
Files
GroupsIndexDoc.png (124 KB) | Brett Smith, 01/16/2023 08:39 PM
GroupsIndexReturns.png (213 KB) | Brett Smith, 01/16/2023 08:39 PM
discovery-pydoc-prototype.py (1.71 KB) | Brett Smith, 01/16/2023 08:39 PM
Updated by Peter Amstutz almost 3 years ago
- Related to Support #18263: Plan to document the Python SDK added
Updated by Peter Amstutz almost 3 years ago
- Related to Story #18800: Update Python SDK documentation added
Updated by Peter Amstutz about 2 years ago
- Target version set to 2022-11-23 sprint
Updated by Brett Smith about 2 years ago
One possible implementation: google-api-python-client already generates docstrings for API methods, based on information in the discovery document. For example:
>>> arv = arvados.api('v1')
>>> print(arv.users().create.__doc__)
Create a new User.

Args:
  body: object, The request body. (required)
  select: array, Attributes of the new object to return in the response.
  ensure_unique_name: boolean, Adjust name to ensure uniqueness instead of returning an error on (owner_uuid, name) collision.
  cluster_id: string, Create object on a remote federated cluster instead of the current one.

Returns:
  An object of the form:

    { # User
      "uuid": "A String",
      "etag": "A String", # Object version.
      "owner_uuid": "A String",
      "created_at": Unknown type! datetime
      …
Probably the cheapest implementation is to instantiate an API client as normal, then introspect the generated methods to write the stubs. One major downside of this approach is that the docstring generation seems to be very static. I don't think we could customize it (e.g., to follow our own docstring style) without serious monkeypatching. See every mention of docs starting from https://github.com/googleapis/google-api-python-client/blob/3bbefc1352bcb2e302f7736643c9363799d5f5df/googleapiclient/discovery.py#L1193
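A rough sketch of the introspection approach, for illustration only. It is not the attached prototype. It assumes the generated methods show up in dir() on a resource object and carry their docstring in __doc__; stub_for_resource and FakeUsers are hypothetical names, and FakeUsers stands in for something like arv.users() so the example runs without an Arvados configuration.

```python
import inspect
import textwrap


def stub_for_resource(class_name, resource):
    """Return Python source for a stub class mirroring resource's public methods."""
    lines = [f"class {class_name}:"]
    for name in sorted(dir(resource)):
        attr = getattr(resource, name)
        if name.startswith("_") or not callable(attr):
            continue
        # Reuse whatever docstring the client library generated.
        doc = inspect.getdoc(attr) or ""
        # Escape characters that would break a triple-quoted string literal.
        doc = doc.replace("\\", "\\\\").replace('"""', '\\"\\"\\"')
        lines.append(f"    def {name}(self, **kwargs):")
        lines.append(textwrap.indent(f'"""{doc}\n"""', " " * 8))
        lines.append("")
    if len(lines) == 1:
        lines.append("    pass")
    return "\n".join(lines)


class FakeUsers:
    """Hypothetical stand-in for a generated resource object."""

    def create(self, **kwargs):
        """Create a new User.

        Args:
          body: object, The request body. (required)
        """


print(stub_for_resource("Users", FakeUsers()))
```

The output is valid Python source, so it can be written to a module in the SDK and picked up by the documentation generator like any other file. The docstring text itself is copied unchanged, which is exactly the limitation described above.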
If we want more control over the formatting, we'll probably end up basically rewriting all this ourselves. At which point, yeah, we can just work from the discovery document directly instead of the generated Python objects. (We can still use discovery document deserialization from apiclient.schema.)
Question: Where should the stubs go? In real code all these methods will be attached to the return value of arvados.api. Maybe call that result arvados.api.Client or arvados.api.Resources, and write the stubs under there?
Updated by Peter Amstutz about 2 years ago
- Target version changed from 2022-11-23 sprint to 2022-12-21 Sprint
Updated by Peter Amstutz about 2 years ago
- Target version changed from 2022-12-21 Sprint to 2023-01-18 sprint
Updated by Brett Smith about 2 years ago
- Subject changed from Strategy to tie the Python SDK to the API docs to Strategy to generate Python SDK docstrings based on API docs
Updated by Peter Amstutz about 2 years ago
- Target version changed from 2023-01-18 sprint to 2023-02-01 sprint
Updated by Peter Amstutz about 2 years ago
- Tracker changed from Story to Support
Updated by Peter Amstutz about 2 years ago
- Target version changed from 2023-02-01 sprint to To be groomed
Updated by Brett Smith about 2 years ago
- File GroupsIndexDoc.png GroupsIndexDoc.png added
- File GroupsIndexReturns.png GroupsIndexReturns.png added
- File discovery-pydoc-prototype.py discovery-pydoc-prototype.py added
Brett Smith wrote in #note-7:
Probably the cheapest implementation is to instantiate an API client as normal, then introspect the generated methods to write the stubs.
I prototyped this. See the attached script (it's just one page!). Call it like this with an Arvados API configuration in place:
python3 discovery-pydoc-prototype.py >arvados/sdk/python/arvados/api_resources.py
Then generate documentation as normal. The documentation will include this api_resources stub with information about all the API resources and methods.
The formatting is pretty rough. The docstrings only seem to care about plaintext presentation, so pdoc3 makes relatively big formatting decisions based on small whitespace inconsistencies. See attached for a couple of examples of how it looks.
If we need to do the cheapest thing that could possibly work, this is probably it. But there are definitely noticeable presentation improvements to be found by walking the discovery document ourselves and writing our own docstrings instead of using the ones generated by apiclient.
Updated by Brett Smith about 2 years ago
Doing it ourselves is a matter of iterating over the method definitions that match:
arv_client._resourceDesc['resources'][resource_name]['methods'][method_name]
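As a minimal sketch of that iteration, assuming the discovery document is already loaded as a plain dict and that resources are not nested: iter_methods is a hypothetical helper name, and EXAMPLE_DOC is a made-up, heavily trimmed fragment rather than real Arvados output.

```python
def iter_methods(discovery_doc):
    """Yield (resource_name, method_name, method_def) for every API method."""
    resources = discovery_doc.get("resources", {})
    for resource_name in sorted(resources):
        methods = resources[resource_name].get("methods", {})
        for method_name in sorted(methods):
            yield resource_name, method_name, methods[method_name]


# Hypothetical, trimmed-down discovery document fragment for illustration.
EXAMPLE_DOC = {
    "resources": {
        "users": {
            "methods": {
                "get": {"description": "Get a User."},
                "create": {"description": "Create a new User."},
            }
        },
        "groups": {
            "methods": {
                "list": {"description": "List Groups."},
            }
        },
    }
}

for resource_name, method_name, method_def in iter_methods(EXAMPLE_DOC):
    print(f"{resource_name}.{method_name}: {method_def['description']}")
```

Sorting the names keeps the generated stub file stable between runs, which makes diffs easier to review.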
For each method, look at description, parameters, and response. For each parameter, look at description, type, required, default, enum, and enumDescriptions. Not every parameter will define every key but those should all be checked. Consider special-casing parameters that have only a single enum possibility.
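One way the parameter handling could look, as a sketch only. The output layout is an arbitrary choice, not a decided docstring style. describe_parameter and method_docstring are hypothetical helper names, and the example parameters are made up for illustration.

```python
def describe_parameter(name, spec):
    """Render one parameter definition as docstring text, tolerating missing keys."""
    traits = [spec.get("type", "any")]
    if spec.get("required"):
        traits.append("required")
    if "default" in spec:
        traits.append(f"default {spec['default']!r}")
    text = f"{name} ({', '.join(traits)}): {spec.get('description', '').strip()}"
    enum = spec.get("enum", [])
    enum_descs = spec.get("enumDescriptions", [])
    if len(enum) == 1:
        # Special case: a single allowed value reads better inline than as a list.
        text += f" Must be {enum[0]!r}."
    else:
        for index, value in enumerate(enum):
            desc = enum_descs[index] if index < len(enum_descs) else ""
            entry = f"{value!r}: {desc}" if desc else repr(value)
            text += f"\n    * {entry}"
    return text


def method_docstring(method_def):
    """Build a docstring from a method's description and parameters."""
    lines = [method_def.get("description", "").strip()]
    params = method_def.get("parameters", {})
    if params:
        lines += ["", "Arguments:"]
        for name in sorted(params):
            lines += ["", "* " + describe_parameter(name, params[name])]
    return "\n".join(lines)


# Hypothetical method definition, shaped like a discovery document entry.
EXAMPLE_METHOD = {
    "description": "Get a User.",
    "parameters": {
        "uuid": {
            "type": "string",
            "required": True,
            "description": "The UUID of the User.",
        },
        "format": {
            "type": "string",
            "default": "json",
            "enum": ["json"],
            "description": "Response format.",
        },
    },
}

print(method_docstring(EXAMPLE_METHOD))
```

Every key is read with a fallback, since not every parameter defines every key.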
Cross-reference response against arv_client._resourceDesc['schemas'][response_type].
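A sketch of that cross-reference. It assumes the response entry names its schema through a "$ref" key, as discovery documents conventionally do, and that schemas list their fields under "properties". describe_response is a hypothetical helper name and EXAMPLE_DOC is a made-up fragment.

```python
def describe_response(discovery_doc, method_def):
    """Render a Returns section by looking up the method's response schema."""
    response_type = method_def.get("response", {}).get("$ref")
    schema = discovery_doc.get("schemas", {}).get(response_type)
    if schema is None:
        # Either the method declares no response or the schema is missing.
        return "Returns: unspecified"
    lines = [f"Returns: {response_type}"]
    properties = schema.get("properties", {})
    for prop_name in sorted(properties):
        spec = properties[prop_name]
        line = f"  {prop_name} ({spec.get('type', 'any')})"
        if spec.get("description"):
            line += f": {spec['description']}"
        lines.append(line)
    return "\n".join(lines)


# Hypothetical, trimmed-down discovery document fragment for illustration.
EXAMPLE_DOC = {
    "schemas": {
        "User": {
            "properties": {
                "uuid": {"type": "string"},
                "etag": {"type": "string", "description": "Object version."},
            }
        }
    },
    "resources": {
        "users": {
            "methods": {
                "get": {"description": "Get a User.", "response": {"$ref": "User"}},
            }
        }
    },
}

get_method = EXAMPLE_DOC["resources"]["users"]["methods"]["get"]
print(describe_response(EXAMPLE_DOC, get_method))
```

Looking up the schema by name also makes it possible to document each schema once and link to it from every method that returns it, instead of repeating the field list in each docstring.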
Updated by Brett Smith about 2 years ago
- Related to Bug #19929: Improve documentation in the discovery document added