Feature #17009
Closed
[keep-web] S3 API should accept bucket name as first component of domain name
100%
Description
Currently the S3 API accepts the bucket name only in the request path. It should be easy enough to also accept the bucket name as the first component of the domain name, as keep-web already does for non-S3 requests.
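As an illustration of the two request styles described above, here is a minimal sketch in Python. It is not keep-web's implementation; the function name resolve_bucket and the service_host parameter are hypothetical, and the hostnames are taken from the examples later in this thread.

```python
def resolve_bucket(host, path, service_host):
    """Return (bucket, key) for a virtual-host-style or path-style request.

    Hypothetical helper for illustration only; keep-web's real routing
    may differ.
    """
    host = host.split(":", 1)[0].lower()  # drop an optional :port
    suffix = "." + service_host.lower()
    if host.endswith(suffix) and len(host) > len(suffix):
        # Virtual-host style: the bucket is the leading part of the
        # domain name, and the whole path is the object key.
        return host[: -len(suffix)], path.lstrip("/")
    # Path style: the bucket is the first path component.
    parts = path.lstrip("/").split("/", 1)
    key = parts[1] if len(parts) > 1 else ""
    return parts[0], key
```

With this sketch, a request for /output.txt on ce8i5-4zz18-ohp73xy8om7aipj.collections.ce8i5.arvadosapi.com and a request for /ce8i5-4zz18-ohp73xy8om7aipj/output.txt on collections.ce8i5.arvadosapi.com resolve to the same bucket and key.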
Updated by Tom Clegg about 4 years ago
- Related to Story #16360: Keep-web supports S3 compatible interface added
Updated by Peter Amstutz about 4 years ago
I'm trying to use the command-line version of Cyberduck from https://duck.sh/ to list the contents of a bucket:
duck -l s3://download.ce8i5.arvadosapi.com/ce8i5-j7d0g-g6r8w0853s32ged/
This doesn't work because it connects to ce8i5-j7d0g-g6r8w0853s32ged.download.ce8i5.arvadosapi.com instead.
From debugging, I see this setting:
s3service.disable-dns-buckets=false
This seems to be a configuration option of the jets3t Java library used by Duck. I don't know how to set it, though; creating ~/.duck/jets3t.properties didn't seem to work.
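The behavior above can be sketched as follows. This is an illustrative Python model of how a client might choose between the two URL styles, assuming a boolean that mirrors the meaning of s3service.disable-dns-buckets; it is not jets3t's actual code, and the function name request_url is hypothetical.

```python
def request_url(endpoint, bucket, key="", disable_dns_buckets=False):
    """Build the URL a client would request for a bucket or object.

    Hypothetical model: with DNS-style buckets enabled (the default
    seen in the debug output), the bucket becomes a hostname prefix.
    """
    if disable_dns_buckets:
        # Path style: bucket stays in the request path.
        return f"https://{endpoint}/{bucket}/{key}"
    # DNS (virtual-host) style: bucket is prepended to the endpoint.
    return f"https://{bucket}.{endpoint}/{key}"
```

Under this model, the default setting explains why the client contacted ce8i5-j7d0g-g6r8w0853s32ged.download.ce8i5.arvadosapi.com rather than download.ce8i5.arvadosapi.com.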
Updated by Peter Amstutz about 4 years ago
- Target version set to 2020-12-02 Sprint
Updated by Peter Amstutz about 4 years ago
- Blocked by Feature #17011: Add keep-web wildcard DNS to salt added
Updated by Tom Clegg about 4 years ago
17009-s3-bucket-vhost @ baeef76a2b3b60fb3613d01b1df2916397e8c589 -- developer-run-tests: #2186
Updated by Tom Clegg about 4 years ago
Worth adding a note to that keep-web install page along these lines? "The *.collections.ClusterID.example.com option is preferred if you plan to access Keep using third-party S3 client software."
(Some clients can be configured to use a different pattern like {bucket}--collections.example.com, but even for them it's probably less effort overall to use the default pattern.)
Updated by Peter Amstutz about 4 years ago
17009-s3-bucket-vhost @ baeef76a2b3b60fb3613d01b1df2916397e8c589
Well, that was easy.
We'll want to do some manual testing when the wildcard certificates get set up on one of the dev clusters.
Otherwise, this LGTM.
Tom Clegg wrote:
Worth adding a note to that keep-web install page along these lines? "The *.collections.ClusterID.example.com option is preferred if you plan to access Keep using third-party S3 client software."
(Some clients can be configured to use a different pattern like {bucket}--collections.example.com, but even for them it's probably less effort overall to use the default pattern.)
Yes, it should be recommended. Also, the introduction on that page should mention support for the S3 API.
Updated by Tom Clegg about 4 years ago
Install doc updates:
17009-s3-bucket-vhost @ 2c3df643bc9effb76a26d56c6b4881856003c053
Updated by Anonymous about 4 years ago
- Status changed from In Progress to Resolved
Applied in changeset arvados|40a4776f3e3b55944aa1267ac51a329d77218b18.
Updated by Peter Amstutz about 4 years ago
- Status changed from Resolved to Feedback
Updated by Peter Amstutz about 4 years ago
Cyberduck still doesn't quite work. It is supposed to be returning a list of bucket contents but instead it is returning an application/x-directory object.
$ duck -v -l s3://collections.ce8i5.arvadosapi.com/ce8i5-4zz18-ohp73xy8om7aipj
Listing directory ce8i5-4zz18-ohp73xy8om7aipj…
Login collections.ce8i5.arvadosapi.com.
Login collections.ce8i5.arvadosapi.com – S3 with username and password. No login credentials could be found in the Keychain.
Access Key ID (peter): ce8i5-gj3su-02f1ov5mgblpf5b
Login as ce8i5-gj3su-02f1ov5mgblpf5b
Secret Access Key:
WARNING! Passwords are stored in plain text in ~/.duck/credentials.
Save password (y/n): y
Authenticating as ce8i5-gj3su-02f1ov5mgblpf5b…
> GET / HTTP/1.1
> Date: Wed, 25 Nov 2020 16:14:18 GMT
> x-amz-content-sha256: e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855
> Host: collections.ce8i5.arvadosapi.com
> x-amz-date: 20201125T161418Z
> Authorization: ********
> Connection: Keep-Alive
> User-Agent: Cyberduck/7.7.0.33744 (Linux/4.19.0-10-amd64) (amd64)
< HTTP/1.1 200 OK
< Server: nginx/1.14.0 (Ubuntu)
< Date: Wed, 25 Nov 2020 16:14:18 GMT
< Content-Type: application/xml
< Content-Length: 271
< Connection: keep-alive
< Strict-Transport-Security: max-age=63072000
> GET /?encoding-type=url&max-keys=1000&prefix&delimiter=%2F HTTP/1.1
> Date: Wed, 25 Nov 2020 16:14:18 GMT
> x-amz-content-sha256: e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855
> Host: collections.ce8i5.arvadosapi.com
> x-amz-date: 20201125T161418Z
> Authorization: ********
> Connection: Keep-Alive
> User-Agent: Cyberduck/7.7.0.33744 (Linux/4.19.0-10-amd64) (amd64)
< HTTP/1.1 200 OK
< Server: nginx/1.14.0 (Ubuntu)
< Date: Wed, 25 Nov 2020 16:14:18 GMT
< Content-Type: application/xml
< Content-Length: 272
< Connection: keep-alive
Login successful…
> GET /?versioning HTTP/1.1
> Date: Wed, 25 Nov 2020 16:14:18 GMT
> x-amz-content-sha256: e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855
> Host: ce8i5-4zz18-ohp73xy8om7aipj.collections.ce8i5.arvadosapi.com
> x-amz-date: 20201125T161418Z
> Authorization: ********
> Connection: Keep-Alive
> User-Agent: Cyberduck/7.7.0.33744 (Linux/4.19.0-10-amd64) (amd64)
< HTTP/1.1 200 OK
< Server: nginx/1.14.0 (Ubuntu)
< Date: Wed, 25 Nov 2020 16:14:19 GMT
< Content-Type: application/x-directory
< Content-Length: 0
< Connection: keep-alive
< Strict-Transport-Security: max-age=63072000
> GET /?encoding-type=url&max-keys=1000&prefix&delimiter=%2F HTTP/1.1
> Date: Wed, 25 Nov 2020 16:14:19 GMT
> x-amz-content-sha256: e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855
> Host: ce8i5-4zz18-ohp73xy8om7aipj.collections.ce8i5.arvadosapi.com
> x-amz-date: 20201125T161419Z
> Authorization: ********
> Connection: Keep-Alive
> User-Agent: Cyberduck/7.7.0.33744 (Linux/4.19.0-10-amd64) (amd64)
< HTTP/1.1 200 OK
< Server: nginx/1.14.0 (Ubuntu)
< Date: Wed, 25 Nov 2020 16:14:19 GMT
< Content-Type: application/x-directory
< Content-Length: 0
< Connection: keep-alive
< Strict-Transport-Security: max-age=63072000
Listing directory ce8i5-4zz18-ohp73xy8om7aipj failed. Failed to parse XML document with handler class org.jets3t.service.impl.rest.XmlResponsesSaxParser$ListBucketHandler. Please contact your web hosting service provider for assistance.
Updated by Tom Clegg about 4 years ago
- Target version changed from 2020-12-02 Sprint to 2020-12-16 Sprint
Updated by Tom Clegg about 4 years ago
- Status changed from Feedback to In Progress
Updated by Tom Clegg about 4 years ago
The XML parsing failure in #17009#note-14 was caused by incorrect routing, fixed in master at 0c5e55d63. But the path handling was still broken for list operations, which is fixed here. Also adds a test for list/get/put using vhost style requests.
17009-s3-vhost-list @ f46eee810702b655737007bdfecf91201cdb27ca -- developer-run-tests: #2205
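For manual checks against a dev cluster, a small helper along these lines could distinguish the failing responses reported earlier (Content-Type application/x-directory, empty body) from a proper list response. This is a hedged sketch, not the test added in the branch; the function name is hypothetical, and it assumes the list response uses the standard S3 ListBucketResult root element.

```python
import xml.etree.ElementTree as ET


def is_list_bucket_response(content_type, body):
    """Return True if the response looks like an S3 bucket listing.

    Hypothetical helper: checks the media type, then the XML root
    element name (ignoring any namespace).
    """
    media_type = content_type.split(";", 1)[0].strip().lower()
    if media_type != "application/xml":
        return False
    try:
        root = ET.fromstring(body)
    except ET.ParseError:
        # Empty or malformed body, as in the failing case.
        return False
    # ElementTree renders namespaced tags as "{namespace}LocalName".
    return root.tag.rsplit("}", 1)[-1] == "ListBucketResult"
```

A client-side XML parse failure like the one Cyberduck reported would correspond to this helper returning False.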
Updated by Lucas Di Pentima about 4 years ago
This LGTM. Was trying to manually test it on arvbox, but I think it would be quicker to merge and test against our dev clusters. Thanks!
Updated by Lucas Di Pentima about 4 years ago
Tried with the duck command as described in #note-14 and it listed the collection correctly:
$ duck -v -l s3://collections.ce8i5.arvadosapi.com/ce8i5-4zz18-ohp73xy8om7aipj
Listing directory ce8i5-4zz18-ohp73xy8om7aipj…
[...]
Authenticating as ce8i5-gj3su-ggs1g0lp3coa7bc…
> GET / HTTP/1.1
> Date: Wed, 09 Dec 2020 19:46:43 GMT
> x-amz-content-sha256: e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855
> Host: collections.ce8i5.arvadosapi.com
> x-amz-date: 20201209T194643Z
> Authorization: ********
> Connection: Keep-Alive
> User-Agent: Cyberduck/7.7.2.33862 (Mac OS X/10.15.7) (x86_64)
< HTTP/1.1 200 OK
< Server: nginx/1.14.0 (Ubuntu)
< Date: Wed, 09 Dec 2020 19:46:44 GMT
< Content-Type: application/xml
< Content-Length: 271
< Connection: keep-alive
< Strict-Transport-Security: max-age=63072000
> GET /?encoding-type=url&max-keys=1000&prefix&delimiter=%2F HTTP/1.1
> Date: Wed, 09 Dec 2020 19:46:44 GMT
> x-amz-content-sha256: e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855
> Host: collections.ce8i5.arvadosapi.com
> x-amz-date: 20201209T194644Z
> Authorization: ********
> Connection: Keep-Alive
> User-Agent: Cyberduck/7.7.2.33862 (Mac OS X/10.15.7) (x86_64)
< HTTP/1.1 200 OK
< Server: nginx/1.14.0 (Ubuntu)
< Date: Wed, 09 Dec 2020 19:46:44 GMT
< Content-Type: application/xml
< Content-Length: 272
< Connection: keep-alive
Login successful…
> GET /?versioning HTTP/1.1
> Date: Wed, 09 Dec 2020 19:46:45 GMT
> x-amz-content-sha256: e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855
> Host: ce8i5-4zz18-ohp73xy8om7aipj.collections.ce8i5.arvadosapi.com
> x-amz-date: 20201209T194645Z
> Authorization: ********
> Connection: Keep-Alive
> User-Agent: Cyberduck/7.7.2.33862 (Mac OS X/10.15.7) (x86_64)
< HTTP/1.1 200 OK
< Server: nginx/1.14.0 (Ubuntu)
< Date: Wed, 09 Dec 2020 19:46:45 GMT
< Content-Type: application/xml
< Content-Length: 114
< Connection: keep-alive
< Strict-Transport-Security: max-age=63072000
> GET /?encoding-type=url&max-keys=1000&prefix&delimiter=%2F HTTP/1.1
> Date: Wed, 09 Dec 2020 19:46:45 GMT
> x-amz-content-sha256: e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855
> Host: ce8i5-4zz18-ohp73xy8om7aipj.collections.ce8i5.arvadosapi.com
> x-amz-date: 20201209T194645Z
> Authorization: ********
> Connection: Keep-Alive
> User-Agent: Cyberduck/7.7.2.33862 (Mac OS X/10.15.7) (x86_64)
< HTTP/1.1 200 OK
< Server: nginx/1.14.0 (Ubuntu)
< Date: Wed, 09 Dec 2020 19:46:46 GMT
< Content-Type: application/xml
< Content-Length: 710
< Connection: keep-alive
< Strict-Transport-Security: max-age=63072000
cwl.output.json
output.txt
> GET /?prefix&delimiter=%2F&uploads HTTP/1.1
> Date: Wed, 09 Dec 2020 19:46:46 GMT
> x-amz-content-sha256: e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855
> Host: ce8i5-4zz18-ohp73xy8om7aipj.collections.ce8i5.arvadosapi.com
> x-amz-date: 20201209T194646Z
> Authorization: ********
> Connection: Keep-Alive
> User-Agent: Cyberduck/7.7.2.33862 (Mac OS X/10.15.7) (x86_64)
< HTTP/1.1 200 OK
< Server: nginx/1.14.0 (Ubuntu)
< Date: Wed, 09 Dec 2020 19:46:46 GMT
< Content-Type: application/xml
< Content-Length: 710
< Connection: keep-alive
< Strict-Transport-Security: max-age=63072000
I think this is ready to be marked as resolved.
Updated by Tom Clegg about 4 years ago
- Status changed from In Progress to Resolved