Project

General

Profile

Actions

Feature #13025

closed

[keepstore] runtime metrics endpoint

Added by Tom Clegg almost 7 years ago. Updated over 6 years ago.

Status:
Resolved
Priority:
Normal
Assigned To:
Category:
-
Target version:
Start date:
02/08/2018
Due date:
% Done:

100%

Estimated time:
(Total: 0.00 h)
Story points:
-
Release relationship:
Auto

Description

Start by exporting request time statistics using https://godoc.org/github.com/prometheus/client_golang/prometheus or similar.


Subtasks 1 (0 open1 closed)

Task #13027: Review 13025-keepstore-metricsResolvedTom Clegg02/08/2018

Actions

Related issues 1 (0 open1 closed)

Related to Arvados - Story #10484: [keepstore] Add S3 bucket statistics and current #clients to status.jsonResolvedTom Clegg11/17/2016

Actions
Actions #1

Updated by Tom Morris almost 7 years ago

  • Related to Story #10484: [keepstore] Add S3 bucket statistics and current #clients to status.json added
Actions #2

Updated by Tom Morris almost 7 years ago

Here are some of the background notes from previous discussions:

Not to dive into implementation details, but introducing a general purpose metrics package may provide benefits going forward.

I've used the Dropwizard metrics in the past and like them. A Go port of that package can be found here: https://github.com/armon/go-metrics

There's also: https://github.com/rcrowley/go-metrics which is described in this blog post: http://blog.dekstroza.io/monitoring-software-written-in-golang/

  • June 2017 email before status.json meeting:

Some background reading:

- metrics vs logs - https://grafana.com/blog/2016/01/05/logs-and-metrics-and-graphs-oh-my/
- Dropwizard metrics - http://metrics.dropwizard.io/3.2.2/
- Go port of above - https://github.com/rcrowley/go-metrics
- Coda Hale's (author of Dropwizard) Go metrics package - https://github.com/codahale/metrics
- Prometheus - https://prometheus.io/docs/introduction/comparison/

I think we have two different things that we're interested here:
1. Health checks
2. Metrics

We should be sure to distinguish between them and understand which we're talking about at any given time.

It seems like some of the metrics from #10484 are candidates to get consolidated here, but probably best to focus on additional metrics first.

Actions #3

Updated by Tom Clegg almost 7 years ago

  • Status changed from New to In Progress
Actions #4

Updated by Tom Clegg almost 7 years ago

[{
  "name": "request_duration_seconds",
  "help": "Summary of request duration.",
  "type": "SUMMARY",
  "metric": [
    {
      "label": [
        {
          "name": "code",
          "value": "200" 
        },
        {
          "name": "method",
          "value": "get" 
        } 
      ],
      "summary": {
        "sample_count": "1",
        "sample_sum": 0.000342534,
        "quantile": [
          {
            "quantile": 0.5,
            "value": 0.000342534
          },
          {
            "quantile": 0.9,
            "value": 0.000342534
          },
          {
            "quantile": 0.99,
            "value": 0.000342534
          } 
        ] 
      }
    },
    { 
      "label": [
        { 
          "name": "code",
          "value": "200" 
        },
        { 
          "name": "method",
          "value": "put" 
        }
      ],
      "summary": {
        "sample_count": "9",
        "sample_sum": 0.258729368,
        "quantile": [
          { 
            "quantile": 0.5,
            "value": 0.023052607
          },
          {
            "quantile": 0.9,
            "value": 0.072060673
          },
          {
            "quantile": 0.99,
            "value": 0.072060673
          }
        ]
      }
    }
  ]
}]
Actions #5

Updated by Tom Clegg almost 7 years ago

13025-keepstore-metrics @ 85398442d095524c5b2f315c294ea81f9d17853b

This adds "request_duration_seconds" and "time_to_status_seconds" summary metrics.

I haven't added s3 and Azure backend stats yet, but I don't think that's a reason to delay merging these basic stats.

Actions #6

Updated by Lucas Di Pentima almost 7 years ago

  • Some tests are failing: https://ci.curoverse.com/job/developer-run-tests/573/
  • File services/keepstore/config.go
    • Line 99: I think the ‘else’ clause is not needed.
    • Line 197: Can ‘req’ param be replaced by _ on exportJSON()’s signature if it’s not going to be used?
  • Does this feature need some sort of documentation?
Actions #7

Updated by Tom Clegg almost 7 years ago

Lucas Di Pentima wrote:

Aha... keep-web was counting on WroteStatus() to return 0 when there has been no explicit call to WriteHeader(). Reverted that change and fixed the status=0 logging bug at log-emitting time instead.

  • File services/keepstore/config.go
    • Line 99: I think the ‘else’ clause is not needed.

But without the else, log.Formatter wouldn't get set...? (I rearranged this to put the "else" case in the function body, as recommended by golint.)

  • Line 197: Can ‘req’ param be replaced by _ on exportJSON()’s signature if it’s not going to be used?

Can be, but even go vet and golint seem ok with it... :)

  • Does this feature need some sort of documentation?

Sure, although we might try it for a bit before we advertise it.

13025-keepstore-metrics @ 3826a6339ba1c901c054053920ed20547b3ba54d

https://ci.curoverse.com/job/developer-run-tests/577/

Actions #8

Updated by Lucas Di Pentima almost 7 years ago

LGTM, thanks!

Actions #9

Updated by Tom Clegg almost 7 years ago

  • Status changed from In Progress to Resolved
Actions #10

Updated by Tom Morris over 6 years ago

  • Release set to 17
Actions

Also available in: Atom PDF