Story #12706
closed
[SDK] R SDK support for Collections
100%
Description
As a first step, the R SDK should allow me to find collections and files in Keep by filtering on metadata, load the files into R, process them, and then write the results back to a collection.
For this, we will provide a high-level API. The low-level Arvados API access can be provided either by googleAuthR (as described in #11876) or by hand. If not using googleAuthR, the low-level API should not be accessible to the user, so that we can replace it with an auto-generated API later.
High-level requirements:
- User can get a specific collection by UUID or portable_data_hash (PDH).
- User can get a list of collections, with standard Arvados filters.
- User can create a new, empty collection in a specific project (project is owner_uuid).
- Collection object supports these operations (using WebDAV unless otherwise noted):
  - Update collection name (via Arvados API)
  - Open a file or directory that already exists and get a File or Directory object
  - Read the listing of a Directory
  - Get size of a file
  - Read the contents of a File. API should support reading a portion of the file at a certain offset and length
  - Put some text or bytes to file (replaces entire file)
  - Create a new File object under a certain path
  - Delete a File under a certain path
  - Move/rename a file or directory from one path to another within the same collection
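As a rough illustration of how the WebDAV-backed operations above could map onto HTTP requests, here is a minimal sketch in Python (used only for illustration, not the R SDK itself). The function name `webdav_request`, its parameters, and the base URL are hypothetical. The methods and headers (PROPFIND with Depth, GET with Range, PUT, DELETE, MOVE with Destination) are standard HTTP/WebDAV, but whether the SDK uses exactly these is an assumption.

```python
from urllib.parse import quote


def webdav_request(base_url, operation, path, *, dest=None,
                   offset=None, length=None, body=None):
    """Describe the HTTP request for one collection operation (no network I/O).

    Hypothetical helper: returns a plain dict instead of sending anything.
    """
    root = base_url.rstrip("/") + "/"
    url = root + quote(path.lstrip("/"))
    headers = {}
    if operation == "list":        # read the listing of a directory
        method = "PROPFIND"
        headers["Depth"] = "1"     # the directory and its direct children
    elif operation == "size":      # size is reported as DAV:getcontentlength
        method = "PROPFIND"
        headers["Depth"] = "0"     # the resource itself only
    elif operation == "read":
        method = "GET"
        if offset is not None and length is not None:
            if length < 1:
                raise ValueError("length must be at least 1")
            # HTTP byte ranges are inclusive on both ends.
            headers["Range"] = f"bytes={offset}-{offset + length - 1}"
    elif operation == "write":     # replaces the entire file
        method = "PUT"
    elif operation == "delete":
        method = "DELETE"
    elif operation == "move":      # rename within the same collection
        if dest is None:
            raise ValueError("move requires dest")
        method = "MOVE"
        headers["Destination"] = root + quote(dest.lstrip("/"))
    else:
        raise ValueError(f"unknown operation: {operation}")
    return {"method": method, "url": url, "headers": headers, "body": body}
```

For example, `webdav_request(base, "read", "foo/bar.txt", offset=10, length=5)` would describe a GET with the header `Range: bytes=10-14`.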
If such a thing exists, implement the R equivalent of "file-like objects" so that open Collection File objects can be used as input to R functions.
Writable WebDAV support is in progress and should be available soon. Start by working on Arvados API access and reading from WebDAV.
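For the Arvados API access part (get by UUID or PDH, list with filters), the following Python sketch shows one way to tell the two identifier kinds apart and to build a filtered list query. The helper names `identifier_kind` and `list_collections_query` are hypothetical. The sketch assumes the usual Arvados conventions: collection UUIDs carry the `4zz18` type infix, a PDH is an MD5 hex digest followed by `+` and a size, and filters are `[attribute, operator, operand]` triples passed as JSON.

```python
import json
import re
from urllib.parse import urlencode

# Arvados UUID: 5-char cluster prefix, 5-char object type, 15-char suffix.
# "4zz18" is the object type infix for collections.
COLLECTION_UUID = re.compile(r"^[0-9a-z]{5}-4zz18-[0-9a-z]{15}$")
# Portable data hash: 32 hex digits (MD5), "+", size in bytes.
PDH = re.compile(r"^[0-9a-f]{32}\+\d+$")


def identifier_kind(identifier):
    """Return "uuid" or "portable_data_hash" for a collection identifier."""
    if COLLECTION_UUID.match(identifier):
        return "uuid"
    if PDH.match(identifier):
        return "portable_data_hash"
    raise ValueError(f"not a collection UUID or PDH: {identifier!r}")


def list_collections_query(filters, limit=100):
    """Build the path and query string for a filtered collection listing.

    `filters` is a list of [attribute, operator, operand] triples.
    """
    query = urlencode({"filters": json.dumps(filters), "limit": limit})
    return "/arvados/v1/collections?" + query
```

A call such as `list_collections_query([["owner_uuid", "=", project_uuid]])` would then list the collections owned by a given project.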
Updated by Peter Amstutz over 7 years ago
- Description updated (diff)
- Assigned To set to Fuad Muhic
Updated by Peter Amstutz over 7 years ago
- Related to Story #11876: [R SDK] Create a Bioconductor/R SDK added
Updated by Tom Clegg over 7 years ago
re "file-like objects": I take this to mean we want something like
f = collection.open("foo/bar.txt")
f.write("baz")
f.close()
...but we do not (yet) need to optimize away the webdav round-trips by having an in-memory representation of a collection's directory structure. Is that correct?
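To make the question concrete, here is a minimal Python sketch of a file-like handle that keeps no in-memory picture of the collection's directory structure: it buffers the writes of a single file and uploads once on `close()`. The classes `CollectionFile` and `Collection` are hypothetical, and a dict stands in for the remote collection where a real implementation would presumably issue a WebDAV PUT. The sketch writes bytes rather than text because it builds on `io.BytesIO`.

```python
import io


class CollectionFile(io.BytesIO):
    """Hypothetical write handle: buffers in memory, uploads once on close()."""

    def __init__(self, path, upload):
        super().__init__()
        self.path = path
        self._upload = upload  # callable(path, data), e.g. a WebDAV PUT

    def close(self):
        # Upload only on the first close(); later calls are no-ops.
        if not self.closed:
            self._upload(self.path, self.getvalue())
        super().close()


class Collection:
    """Stand-in for a collection; `stored` plays the role of the server."""

    def __init__(self):
        self.stored = {}

    def open(self, path):
        return CollectionFile(path, self._put)

    def _put(self, path, data):
        # Replaces the entire file, matching the "put" requirement.
        self.stored[path] = data


collection = Collection()
f = collection.open("foo/bar.txt")
f.write(b"baz")
f.close()
```

Every `open`/`close` pair costs one round-trip in this design, which is the unoptimized behaviour the question refers to.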
Updated by Peter Amstutz over 7 years ago
Tom Clegg wrote:
re "file-like objects": I take this to mean we want something like
[...]
...but we do not (yet) need to optimize away the webdav round-trips by having an in-memory representation of a collection's directory structure. Is that correct?
Yes. But we should probably look at what the R standard library for working with files looks like, to see what the expectations are.
Updated by Peter Amstutz over 7 years ago
WebDAV support also requires discovering the address of the keep-web server, see https://dev.arvados.org/issues/11876#note-18
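A minimal Python sketch of that discovery step follows. It assumes the keep-web address is published in the API server's discovery document under a field named `keepWebServiceUrl`, and that a collection's files are reachable under a `c=<uuid>/` path prefix. Both details are assumptions here and should be checked against the linked note. The function names and the sample document are hypothetical.

```python
import json


def keep_web_url(discovery_document):
    """Return the keep-web base URL with a trailing slash.

    Assumes the discovery document has a "keepWebServiceUrl" field.
    """
    url = discovery_document.get("keepWebServiceUrl")
    if not url:
        raise LookupError("discovery document has no keepWebServiceUrl")
    return url.rstrip("/") + "/"


def collection_file_url(discovery_document, collection_uuid, path):
    """Build a WebDAV URL for a file, assuming the c=<uuid>/ path form."""
    base = keep_web_url(discovery_document)
    return f"{base}c={collection_uuid}/{path.lstrip('/')}"


# Hypothetical excerpt of a discovery document, for illustration only.
sample = json.loads('{"keepWebServiceUrl": "https://collections.zzzzz.example.com/"}')
url = collection_file_url(sample, "zzzzz-4zz18-0123456789abcde", "/foo/bar.txt")
```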
Updated by Tom Morris over 7 years ago
- Target version changed from 2017-12-06 Sprint to 2017-12-20 Sprint
Updated by Tom Morris about 7 years ago
- Target version changed from 2017-12-20 Sprint to 2018-01-17 Sprint