Feature #23284
Closed
Keepclient reports backend keep metrics
Description
Add net:keep0, keepcalls, and keepcache output to mimic arv-mount:
crunchstat: keepcalls 0 put 3753 get -- interval 10.0000 seconds 0 put 0 get
crunchstat: net:keep0 0 tx 484065792 rx -- interval 10.0000 seconds 0 tx 0 rx
crunchstat: keepcache 3700 hit 8 miss -- interval 10.0000 seconds 0 hit 0 miss
Implementation:
Add prometheus metrics to sdk/go/keepclient such that a caller like lib/mount.keepFS can pass in a *prometheus.Registry to start collecting back-end and cache metrics, e.g.,
reg := prometheus.NewRegistry()
kc, _ := keepclient.MakeKeepClient(ac)
kc.CollectMetrics(reg)
Connecting multiple KeepClients to the same registry also needs to work, e.g.,
reg := prometheus.NewRegistry()
kc, _ := keepclient.MakeKeepClient(ac)
kc.CollectMetrics(reg)
kc2 := kc.Clone()
kc3, _ := keepclient.MakeKeepClient(ac)
kc3.CollectMetrics(reg)
// reg now reports combined metrics for kc, kc2, and kc3
In lib/mount.keepFS, use those resulting metrics to generate the new crunchstat entries.
Updated by Tom Clegg 5 months ago
- Follows Feature #23245: Go FUSE driver reports crunchstats added
Updated by Brett Smith 4 months ago
- Target version changed from Development 2025-11-12 to Development 2025-11-26
Updated by Tom Clegg 4 months ago
23284-keepclient-metrics @ c9ba673c419be2f9b41baef9f993449cad55868a -- developer-run-tests: #4942
fix sdk/go/arvados tests, re-run remainder @ 7e07cfc04c0d76fa33dd161d34f91b5f66249492 run-tests-remainder: #5662
- track cache hits/misses, blocks in/out, and network traffic in/out
- enable client metrics in keepproxy
- enable client metrics in keep-web
- track cache bytes in/out, application bytes in/out (arv-mount doesn't report it, but lib/mount probably should, because network traffic ÷ application traffic is an indicator of a cache thrashing pattern where we re-fetch entire blocks to serve short reads)
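To make the network traffic ÷ application traffic indicator concrete, here is a minimal sketch of the calculation. The function name trafficRatio and its parameters are assumptions made for this illustration and do not exist in keepclient or lib/mount. What ratio counts as thrashing is left open here.

```go
package main

// trafficRatio returns bytes received from the network divided by bytes
// delivered to the application. A ratio well above 1 suggests that whole
// blocks are being re-fetched to serve short reads.
// ok is false when the application has not read anything yet, in which
// case the ratio is undefined.
func trafficRatio(netRx, appRead uint64) (ratio float64, ok bool) {
	if appRead == 0 {
		return 0, false
	}
	return float64(netRx) / float64(appRead), true
}
```

For example, fetching one 64 MiB block to serve a single 4096-byte read gives a ratio of 16384.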
Updated by Tom Clegg 4 months ago
- Related to Feature #23308: Go FUSE driver crunchstat should report client-side traffic (need to add to keepclient metrics) added
Updated by Tom Clegg 4 months ago
23284-keepclient-metrics @ b9ac9378f404a10a07816eb5a8d5c6ae2e6f15d6 -- developer-run-tests: #4944
(workbench2 test failed)
- All agreed upon points are implemented / addressed. Describe changes from pre-implementation design.
- ✅ Add (*keepclient.KeepClient)RegisterMetrics(*prometheus.Registry) method
- ✨ Add keepclient backend metrics to keepproxy and keep-web
- ❌ Piggybacking already-registered metrics is not a thing, so the kc3 example in the description is not possible. Keep-web would have been easier to instrument that way, but instead I refactored it to use Clone so all metrics can be shared.
- We could also add something like (*keepclient.KeepClient)CombineMetrics(*keepclient.KeepClient) but it would not be trivial, because concurrency. As long as the Clone approach works I think we should leave it at that.
- Anything not implemented (discovered or discussed during work) has a follow-up story.
- Code is tested and passing, both automated and manual, what manual testing was done is described.
- ✅ automated tests added to keepclient pkg
- ✅ keep-web metrics test updated to check keepclient stats are reported too
- The tested code incorporates recent main branch changes.
- ✅
- New or changed UI/UX has gotten feedback from stakeholders.
- n/a
- Documentation has been updated.
- n/a
- Behaves appropriately at the intended scale (describe intended scale).
- n/a (even for small reads, overhead of tracking metrics should not be noticeable)
- Considered backwards and forwards compatibility issues between client and server.
- n/a
- Follows our coding standards and GUI style guidelines.
- ✅
(This branch doesn't actually add the metrics to the Go FUSE driver; that part is blocked on #23245.)
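The Clone approach mentioned in the checklist can be pictured with a small stand-in. This is not the keepclient implementation, which registers Prometheus collectors. It is a sketch using only the standard library, assuming that a clone simply keeps a pointer to the same counters as the original. The client and clientMetrics types and the recordCache method are hypothetical.

```go
package main

import "sync/atomic"

// clientMetrics is a hypothetical set of counters shared by a client and
// all of its clones. Atomic types keep updates safe when clones are used
// from different goroutines.
type clientMetrics struct {
	cacheHit  atomic.Int64
	cacheMiss atomic.Int64
}

// client stands in for keepclient.KeepClient in this sketch.
type client struct {
	metrics *clientMetrics
}

// newClient returns a client with its own, initially zero, counters.
func newClient() *client {
	return &client{metrics: &clientMetrics{}}
}

// Clone returns a shallow copy. The copy holds the same metrics pointer,
// so activity on either client is counted in one place.
func (c *client) Clone() *client {
	cp := *c
	return &cp
}

// recordCache counts one cache lookup as a hit or a miss.
func (c *client) recordCache(hit bool) {
	if hit {
		c.metrics.cacheHit.Add(1)
	} else {
		c.metrics.cacheMiss.Add(1)
	}
}
```

In this sketch a second client made with newClient has separate counters, which mirrors why independently created clients cannot piggyback on metrics that are already registered, while clones can.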
Updated by Brett Smith 4 months ago
- Assigned To set to Tom Clegg
- Status changed from In Progress to Resolved
- Subject changed from Go FUSE driver reports backend keep metrics in crunchstat output to Keepclient reports backend keep metrics
Updated by Brett Smith 4 months ago
- Precedes Feature #23332: Go FUSE driver reports crunchstats net:keep0, keepcalls, keepcache added
Updated by Brett Smith 4 months ago
- Related to Feature #23333: Go FUSE Driver Phase 1: crunch-run uses arvados-client mount added