Feature #6280
Updated by Tom Clegg over 9 years ago
Given a collection (as retrieved from the API server via "list" or "get"), the new method (CollectionFileReader?) should return an object that implements the io.Reader interface for a file in that collection.
Factors to consider:
* Handle fragmented files correctly ("bar" in "./foo" is the same file as "foo/bar" in ".")
* Fetch data from proxy, disk, or gateway service as appropriate
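To make the first factor concrete, here is a minimal sketch of how file segments from different manifest streams could be grouped under one canonical path. The `fileSegment` type and both function names are hypothetical, introduced only for illustration; they are not part of the existing SDK.

```go
package main

import "path"

// fileSegment is a hypothetical record for one file token in a manifest
// stream: the file called Name occupies Length bytes starting at Offset
// within the stream called Stream.
type fileSegment struct {
	Stream string
	Name   string
	Offset int64
	Length int64
}

// canonicalPath joins a stream name such as "." or "./foo" with a file name
// such as "bar" or "foo/bar" and returns a cleaned path relative to the
// collection root.
func canonicalPath(stream, name string) string {
	return path.Join(stream, name)
}

// groupByFile collects segments under their canonical path, keeping the
// order in which they were given, so a fragmented file ends up as one
// entry with several segments.
func groupByFile(segs []fileSegment) map[string][]fileSegment {
	files := make(map[string][]fileSegment)
	for _, s := range segs {
		p := canonicalPath(s.Stream, s.Name)
		files[p] = append(files[p], s)
	}
	return files
}
```

With this, "bar" in "./foo" and "foo/bar" in "." both map to "foo/bar", so their segments land in the same list.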
The returned object should also have a Len() method that returns the size of the file in bytes.
The reader should use the (*KeepClient)Get() method so its Read() method can start returning data before an entire block is retrieved.
If the reader detects a hash mismatch while reading from Keep, it should (of course) return an error, and should make sure to return fewer than Len() bytes in total. (This makes it possible for a web client, for example, to detect that an error occurred while downloading.) This should be taken care of already by (*KeepClient)Get().
For an initial implementation, it's acceptable to have a delay at each block boundary ("reached end of current block, so get a reader for the next block"). Ideally, though, this part of the reader will also work asynchronously: e.g., a worker goroutine calls (*KeepClient)Get(), reads byte slices of at most 1 MiB from the returned reader, and sends them to a buffered channel with capacity 8, while Read() just returns the next chunk from the channel. If this is as simple as it sounds even with proper error handling, it would be Nice To Have.