Feature #16513

closed

Get reference Keep performance numbers for Keep-on-S3

Added by Ward Vandewege over 4 years ago. Updated about 4 years ago.

Status:
Resolved
Priority:
Normal
Assigned To:
Category:
-
Target version:
Start date:
06/15/2020
Due date:
% Done:

100%

Estimated time:
(Total: 0.00 h)
Story points:
-
Release relationship:
Auto

Subtasks 1 (0 open, 1 closed)

Task #16528: review 16513-keep-exercise-improvements (Resolved, Ward Vandewege, 06/15/2020)

Related issues 4 (2 open, 2 closed)

Related to Arvados - Story #10477: [keepstore] switch s3 driver from goamz to a more actively maintained client library (Resolved, Ward Vandewege, 11/08/2016)
Related to Arvados - Feature #16518: [keep] Allow clients to set a header to disable md5sum calculations in keepstore (New)
Related to Arvados - Feature #16519: [keepstore] optimize md5sum calculations (New)
Blocks Arvados Epics - Story #16516: Run Keepstore on local compute nodes (Resolved, 10/01/2021, 11/30/2021)

Actions #1

Updated by Ward Vandewege over 4 years ago

  • Related to Story #16514: Actionable insight into keep usage added
Actions #2

Updated by Ward Vandewege over 4 years ago

  • Related to deleted (Story #16514: Actionable insight into keep usage)
Actions #3

Updated by Ward Vandewege over 4 years ago

  • Blocks Story #16516: Run Keepstore on local compute nodes added
Actions #4

Updated by Ward Vandewege over 4 years ago

e710f1b2da3095d6152ac7f6ed1ffab8bfc2c0c7 on branch 16513-keep-exercise-improvements is ready for review.

Actions #5

Updated by Ward Vandewege over 4 years ago

  • Status changed from New to In Progress
  • Target version set to 2020-06-17 Sprint
Actions #6

Updated by Tom Clegg over 4 years ago

I have a few nits / suggested improvements but you could ignore them and/or merge e710f1b in the meantime.

Repeating the expression float64(bytesOut) / elapsed.Seconds() / 1048576 is a bit crufty. Should probably compute that once as rateOut and then use it 3 times.

We probably don't need 2 different stats reporting formats. We could print the header line at start, then print a CSV row once every stats-interval plus one at the end.

Printing the final summary on SIGINT/SIGALRM would be a nice touch. (then "alarm 60 keep-exercise ..." would work well, fwiw)

endChan could be a Timer rather than a Ticker. context.WithDeadline() and <-ctx.Done() would be another way to do it.

If we send the CSV data to stdout and logs to stderr, we'll be more ... | tee stats.csv -friendly.

Actions #7

Updated by Ward Vandewege over 4 years ago

  • Target version changed from 2020-06-17 Sprint to 2020-07-01 Sprint
Actions #8

Updated by Ward Vandewege over 4 years ago

Tom Clegg wrote:

I have a few nits / suggested improvements but you could ignore them and/or merge e710f1b in the meantime.

Repeating the expression float64(bytesOut) / elapsed.Seconds() / 1048576 is a bit crufty. Should probably compute that once as rateOut and then use it 3 times.

We probably don't need 2 different stats reporting formats. We could print the header line at start, then print a CSV row once every stats-interval plus one at the end.

Printing the final summary on SIGINT/SIGALRM would be a nice touch. (then "alarm 60 keep-exercise ..." would work well, fwiw)

endChan could be a Timer rather than a Ticker. context.WithDeadline() and <-ctx.Done() would be another way to do it.

If we send the CSV data to stdout and logs to stderr, we'll be more ... | tee stats.csv -friendly.

I've implemented everything in cba1b4145e8fcc57a851839f77fd020e5aaff722, ready for another look.

Actions #9

Updated by Tom Clegg over 4 years ago

LGTM @ a5a6111e3, thanks!

Actions #10

Updated by Ward Vandewege over 4 years ago

Arvados version: 2.0.2; AWS VPC with S3 endpoint

Single-threaded write to Keep backed by S3: ~42 MiB/sec
Single-threaded read from Keep backed by S3: ~62 MiB/sec

Single-threaded write to S3 with a 3rd party client (s3-cli): ~46 MiB/sec
Single-threaded read from S3 with a 3rd party client (s3-cli): ~106 MiB/sec

It's worth noting that S3 and Keep are optimized for aggregate throughput. With X reader/writer processes, you would expect to see roughly X times the single-threaded performance, up to the capacity (CPU/bandwidth/memory) of the keepstores (and of the clients, but these tend to be spread out over many machines).

That said, we have identified a few areas for future improvement:

a) Keep does not currently use multipart writes when writing to S3, because the S3 library we use does not support them. Using multipart writes is recommended to increase write throughput. We are looking into adopting the official AWS S3 Go library (#10477); our Keep S3 backend predates it.

b) Keep's single-threaded read performance: some of the slowdown is caused by the md5sum that Keepstore does on reading every block. We are considering adding an option to disable the md5sum on read in Keepstore (#16518). We are investigating additional performance improvements as well (e.g. #16519).

Actions #11

Updated by Ward Vandewege over 4 years ago

  • Related to Story #10477: [keepstore] switch s3 driver from goamz to a more actively maintained client library added
Actions #12

Updated by Ward Vandewege over 4 years ago

  • Related to Feature #16518: [keep] Allow clients to set a header to disable md5sum calculations in keepstore added
Actions #13

Updated by Ward Vandewege over 4 years ago

  • Related to Feature #16519: [keepstore] optimize md5sum calculations added
Actions #14

Updated by Ward Vandewege over 4 years ago

  • Status changed from In Progress to Resolved
Actions #16

Updated by Peter Amstutz about 4 years ago

  • Release set to 25