Feature #23245
closedGo FUSE driver reports crunchstats
Description
Have the Go FUSE driver write statistics to stderr in the same way that the Python FUSE driver does. It is invoked with the same command line switch (--crunchstat-interval=N prints stats every N seconds). Statistics are written in the exact same format as the Python FUSE driver, which looks like this (every --crunchstat-interval seconds, on stderr):
crunchstat: keepcalls 0 put 3753 get -- interval 10.0000 seconds 0 put 0 get crunchstat: net:keep0 0 tx 484065792 rx -- interval 10.0000 seconds 0 tx 0 rx crunchstat: keepcache 3700 hit 8 miss -- interval 10.0000 seconds 0 hit 0 miss crunchstat: blkio:0:0 0 write 484065792 read -- interval 10.0000 seconds 0 write 0 read crunchstat: fuseops 0 write 3694 read -- interval 10.0000 seconds 0 write 0 read crunchstat: fuseop:read 3694 count 8.274071 time -- interval 10.0000 seconds 0 count 0.000000 time crunchstat: fuseop:write 0 count 0.000000 time -- interval 10.0000 seconds 0 count 0.000000 time crunchstat: fuseop:destroy 0 count 0.000000 time -- interval 10.0000 seconds 0 count 0.000000 time crunchstat: fuseop:on_event 0 count 0.000000 time -- interval 10.0000 seconds 0 count 0.000000 time crunchstat: fuseop:getattr 12 count 0.000357 time -- interval 10.0000 seconds 0 count 0.000000 time crunchstat: fuseop:setattr 0 count 0.000000 time -- interval 10.0000 seconds 0 count 0.000000 time crunchstat: fuseop:lookup 10 count 0.066938 time -- interval 10.0000 seconds 0 count 0.000000 time crunchstat: fuseop:forget 0 count 0.000000 time -- interval 10.0000 seconds 0 count 0.000000 time crunchstat: fuseop:open 1 count 0.000043 time -- interval 10.0000 seconds 0 count 0.000000 time crunchstat: fuseop:release 2 count 0.000208 time -- interval 10.0000 seconds 0 count 0.000000 time crunchstat: fuseop:opendir 1 count 0.000027 time -- interval 10.0000 seconds 0 count 0.000000 time crunchstat: fuseop:readdir 2 count 0.000004 time -- interval 10.0000 seconds 0 count 0.000000 time crunchstat: fuseop:statfs 0 count 0.000000 time -- interval 10.0000 seconds 0 count 0.000000 time crunchstat: fuseop:create 1 count 0.000523 time -- interval 10.0000 seconds 0 count 0.000000 time crunchstat: fuseop:mkdir 0 count 0.000000 time -- interval 10.0000 seconds 0 count 0.000000 time crunchstat: fuseop:unlink 0 count 0.000000 time -- interval 10.0000 seconds 0 count 0.000000 time crunchstat: fuseop:rmdir 0 count 0.000000 time -- interval 10.0000 seconds 0 count 0.000000 time crunchstat: fuseop:rename 0 count 0.000000 time -- interval 10.0000 seconds 0 count 0.000000 time crunchstat: fuseop:flush 6 count 0.000083 time -- interval 10.0000 seconds 0 count 0.000000 time
The ordering of stats is not significant.
Meaning/source of stats:fuseop:*reports calls and cumulative elapsed time for each fuse API callblkio:0:0reports traffic at the fuse layernet:keep0,keepcalls, andkeepcachereport back-end data traffic (keep), which will require instrumenting keepclient as well -- these are not included here, they will be added in #23284fuseopsappears to be an additional spelling of fuseop:read and fuseop:write
Implementation:
We should use prometheus to collect metrics like- arvados_fuse_io_bytes{op="write"} (and "read")
- arvados_fuse_ops{op="write"} (and "read" and ...)
- arvados_fuse_op_seconds{op="write"}
This leaves the option to also export fuse and keepclient metrics via normal prometheus API in future.
For now, we can use Gather (source:sdk/go/httpserver/metrics.go has an example of the first part of this) and report them in crunchstat format.
- keepFS should add fields
Registry *prometheus.Registry StatsInterval time.Duration StatsOutput io.Writer mBytes *prometheus.CounterVec mSeconds *prometheus.CounterVec mOps *prometheus.CounterVec
- keepFS.Init should
- create the CounterVecs and register them (similar to source:lib/dispatchcloud/worker/pool.go)
- (if StatsInterval > 0 && StatsOutput != nil) start a ticker, and start a goroutine to call
fs.writeStats()on each tick - populate each vec with all of the applicable op values (otherwise prometheus won't start reporting "chmod" stats until the first time chmod is called) -- see example in newInstrumentedInstanceSet in source:lib/dispatchcloud/driver.go
- keepFS should get a Destroy method to stop the stats goroutine (fuse library will call this at shutdown)
- keepfs_test.go should get a TearDownTest method that calls Destroy
- (*mountCommand)RunCommand should create a prometheus.NewRegistry() and pass it to keepFS (Registry field) and keepclient.KeepClient (TBD)
Something like this would let us add metrics to every op with one line:
func (fs *keepFS) Chmod(path string, mode uint32) (errc int) {
defer fs.reportMetrics("chmod", time.Now(), nil)
// ...
}
func (fs *keepFS) Write(path string, buf []byte, ofst int64, fh uint64) (n int) {
defer fs.reportMetrics("write", time.Now(), &n)
// ...
}
func (fs *keepFS) reportMetrics(op string, t0 time.Time, bytes *int) {
fs.mOps.WithLabelValues(op).Add(time.Since(t0).Seconds())
fs.mSeconds.WithLabelValues(op).Add(time.Since(t0).Seconds())
if bytes != nil {
fs.mBytes.WithLabelValues(op).Add(float64(*bytes))
}
}
Updated by Tom Clegg 5 months ago
- Precedes Feature #23284: Keepclient reports backend keep metrics added
Updated by Lisa Knox 4 months ago
It was decided in discussion that the first iteration of this ticket can be to have a string printed to stderr at the interval passed with -crunchstat-interval.
23245-go-fuse-crunchstat @ 19fb7325d5a973b11e5a0b3aae8ec093edc68ef5
- passing
-crunchstat-interval nnow logs the word "tick" every n seconds with the INFO flag - the default interval is 0, which disables logging
- passing < 0 throws an error
- an assertion was added to lib/mount/command_test.go
Updated by Tom Clegg 4 months ago
· Edited
- Status changed from New to In Progress
- Description updated (diff)
I think we should move the logging code into keepFS, so RunCommand() parses the command line flag and passes the interval to keepFS in a new StatsInterval time.Duration field.
As written, ticker.Stop() in repeatEvery() never gets called, because for range ticker.C runs forever. (In fact, even Stop doesn't close the channel, so for range ticker.C will wait forever regardless of who calls Stop. Stop is no longer needed for the sake of garbage collection either, so we actually don't need to call it at all.)
- keepFS adds a field
done chan struct{} - (*keepFS)Init() starts a timer, creates the
donechannel, and runs a goroutinefs.done = make(chan struct{}) go func() { if fs.StatsInterval <= 0 { return } ticker := time.NewTicker(fs.StatsInterval) for { select { case <-done: return case <-ticker.C: fs.writeStats() } } }() - (*keepFS)Destroy() calls
close(fs.done)to stop stats (it doesn't need to check whether stats are active, because Init created the channel regardless)
I dislike sleeping 1.2 seconds in the test. We can avoid that by making -crunchstat-interval a float, and testing it with 0.001.
The test could also test that logging stopped as expected: after unmount, sleep 0.01, check stderr.Len(), wait another 2x interval, check stderr.Len() again. In principle this is inherently racy unless we add more coordination -- logStats() might start before done is closed, and produce output after Unmount returns -- but practically I think "sleep 0.01" will be good enough.
Updated by Brett Smith 4 months ago
- Target version changed from Development 2025-11-12 to Development 2025-11-26
Updated by Lisa Knox 4 months ago
23245-go-fuse-crunchstat @ cb4eb8a27e69d9e821c3267ed6cf2a77224fe039
I think we should move the logging code into keepFS...
done
As written, ticker.Stop() in repeatEvery() never gets called...
Implemented the Destroy method to close the done channel
I dislike sleeping 1.2 seconds in the test...
Changed the command line parse to coerce input to float64 and adjust test to only sleep for 20ms. Going down to 0.0001s as suggested introduces some flakiness
The test could also test that logging stopped as expected...
Fixed, but kinda moot because if the logs keep printing, that means the channel hasn't closed and the test will run forever.
Updated by Tom Clegg 4 months ago
Look good, just a couple of things:
Help message "-crunch-stat-interval must be non-negative" should be "-crunchstat-interval"
fs_test.go should get a TearDownTest method that calls s.fs.Destroy() (rather than rely on the fact that currently Destroy isn't needed as long as statsInterval <= 0)
Going down to 0.0001s as suggested introduces some flakiness
Hm, curious what kind of flakiness? I can see that making the "logging stopped" check flaky without the "after unmount, sleep 0.01, check stderr.Len(), ..." part, but I tried 0.0001s here and it passed reliably so I'm wondering.
Updated by Lisa Knox 4 months ago
· Edited
23245-go-fuse-crunchstat @ f855137bdb31fa403ca0ea9c2df5118af8019513
All items addressed.
In testing, I found that when I set the interval at 0.0001, even if I set the sleep timer for 0, it somehow always prints a few logs. Therefore, it made no difference if I set the sleep to 0.0002 0r 0.000000001, the test always passes. Raising both the interval and the sleep timer gave me more rational behavior, so I went with it.
Updated by Tom Clegg 4 months ago
Gotcha. IOW, if the interval is too small, the test can't detect whether there's an interval before the first log entry, right? But we aren't actually checking this, and we probably shouldn't, because it would be fine (perhaps preferable) to log zeros right away before the first interval. I don't want to get hung up on this one but in general it's a good investment to do "sleep tiny intervals until the thing happens or {timeout} passes" rather than "sleep {timeout}, then check that the thing has happened". "Confirm logging has stopped" is the only place we can't do that.
In Go you can use 0 as a float64, you don't need to say 0.0.
I don't think we need to fix this but now that I've thought of it I have to say it: if you ask for -crunchstat-interval 0.0000000000000001 the non-negative check passes, but then we truncate it to a duration of 0 nanoseconds, so there's no ticker. And if you ask for -crunchstat-interval NaN you get the same thing, because NaN < 0 is also false.
LGTM, thanks!
Updated by Tom Clegg 4 months ago
- Related to Feature #23308: Go FUSE driver crunchstat should report client-side traffic (need to add to keepclient metrics) added
Updated by Lisa Knox 4 months ago
· Edited
23245-fuse-logging @ 88a1c5eb8ab5f485b08f8a608ee362f7484e6121 (Note the changed branch name)
- ✅ All agreed upon points are implemented / addressed.
- Logging for FUSE operations is in place.
- Tests written
- ✅ Anything not implemented (discovered or discussed during work) has a follow-up story.
- Adding Keep metrics to the logs (per #23284) should immediately follow this iteration.
- ✅ Code is tested and passing, both automated and manual, what manual testing was done is described
- ✅ The tested code incorporates recent main branch changes.
- ❌ New or changed UX/UX and has gotten feedback from stakeholders.
- I copied the formatting from the example on the ticket, but I suspect we don't actually want it to look exactly like that.
- n/a Documentation has been updated.
- ✅ Behaves appropriately at the intended scale (describe intended scale).
- ✅ Considered backwards and forwards compatibility issues between client and server.
- no change
- ✅ Follows our coding standards and GUI style guidelines.
- I elected to move the command test out of the main command test to eliminate excess logging during tests
- The only reason logrus is used for logging instead of fmt.Println is because I could not figure out how to check the stdout in a test and I think it's probably ok either way.
- I saw the Namespace & Subsystem fields in #23284 but I'm not sure if or how they should be applied here. As in, I can't find where the values are consumed, so it's not clear why they are there.
keepfs_test.go should get a TearDownTest method that calls Destroy
- keepfs_test.go doesn't exist, but fs.go already has a teardown method that calls destroy, and I think that's what this line means?
Updated by Tom Clegg 4 months ago
I elected to move the command test out of the main command test to eliminate excess logging during tests
Having a separate test specifically for crunchstat logging is good, yes. We could also add c.Check(stderr.String(), check.Equals, "") to TestMount().
I think we should add ahow to check the stdout in a test
StatsOutput io.Writer field to keepFS, and write stats to it with fmt.Fprintf().
- We don't really want the structured log formatting stuff logrus provides -- we just want to write literal text in our own format.
- Test cases can provide a bytes.Buffer, mountCommand can provide stderr
- Maybe someday mountCommand will have an option like
-crunchstat-fd=3to send to a different file descriptor instead of combining with stderr
Namespace & Subsystem
They get prepended to the metric name, so instead of
fs.mBytes = prometheus.NewCounterVec(prometheus.CounterOpts{
Name: "arvados_fuse_bytes",
Help: "Bytes read/written by the FUSE filesystem",
}, []string{"fuseop"})
we can say
fs.mBytes = prometheus.NewCounterVec(prometheus.CounterOpts{
Namespace: "arvados",
Subsystem: "fuse",
Name: "bytes",
Help: "Bytes read/written by the FUSE filesystem",
}, []string{"fuseop"})
The resulting metrics are identical but the latter is how we spell it elsewhere.
fs.go already has a teardown method that calls destroy
Right, I meant fs_test.go, and we already added the TearDown method in the previous branch.
In the keepFS struct definition:
type keepFS struct {
fuse.FileSystemBase
Client *arvados.Client
KeepClient *keepclient.KeepClient
ReadOnly bool
Uid int
Gid int
Logger logrus.FieldLogger
root arvados.CustomFileSystem
open map[uint64]*sharedFile
lastFH uint64
statsInterval time.Duration
done chan struct{}
Registry *prometheus.Registry
mBytes *prometheus.CounterVec
mOps *prometheus.CounterVec
mSeconds *prometheus.CounterVec
I think statsInterval should be StatsInterval and move up to the exported/capitalized section. That section has fields that get set by the caller. The second section has fields that are managed by keepFS itself. (Since keepFS itself is private to this package, the caller is by definition also in this package, so the language doesn't actually prevent the private fields from being accessed, but it's a helpful distinction to make anyway.)
In Init, we are assuming the initial previousMetrics will have all zeroes. So we could just use a nil map (var previousMetrics map[string]float64) instead of calling gatherMetrics.
In gatherMetrics:
metricFamilies, err := reg.Gather()
if err == nil {
for _, mf := range metricFamilies {
...
should be
metricFamilies, err := reg.Gather()
if err != nil {
return nil
}
for _, mf := range metricFamilies {
or even
metricFamilies, _ := reg.Gather() // ignore errors
for _, mf := range metricFamilies {
(since it's safe to read/iterate over a nil slice/map)
FormatMetrics() should be private, i.e., formatMetrics(). Or better, (*keepFS)writeMetrics(), and it writes directly to fs.CrunchstatOutput instead of building up a slice of strings for the caller to write.
I did a bit of manual testing and the output looks plausible.
The blkio:0.0 entries aren't implemented yet.
I think the test case would be more manageable with a "get the line for stat X" closure, like
statline := func(stat string) string {
for _, line := range lines {
if strings.HasPrefix(line, "crunchstat: "+stat+" ") {
return line
}
}
return ""
}
c.Check(statline("fuseop:read"), Equals, "...")
c.Check(statline("fuseop:write"), Equals, "...")
That would also avoid this pattern
for _, line := range lines {
if strings.HasPrefix(line, "crunchstat: fuseop:read ") {
c.Check(line, Equals, "crunchstat: fuseop:read 8 count 0.250000 time -- interval 1.0000 seconds 5 count 0.150000 time")
...which passes when there is no fuseop:read line at all.
Updated by Tom Clegg 4 months ago
Tom Clegg wrote in #note-15:
FormatMetrics() should be private, i.e., formatMetrics(). Or better, (*keepFS)writeMetrics(), and it writes directly to fs.CrunchstatOutput instead of building up a slice of strings for the caller to write.
On second thought I take back the part about making it a *keepfs method. writeMetrics(out io.Writer, ...) would be tidier for testing.
Updated by Brett Smith 4 months ago
Lisa Knox wrote in #note-14:
- I copied the formatting from the example on the ticket, but I suspect we don't actually want it to look exactly like that.
We do because crunchstat-summary consumes this format. source:tools/crunchstat-summary/crunchstat_summary/summarizer.py#L86
Updated by Lisa Knox 4 months ago
23245-fuse-logging @ 38e47b672aba2f2b959d5c840f0c403ac59a98ac
I think we should add a
StatsOutput io.Writerfield to keepFS...
Done, and it was surprisingly straightforward to do.
I think statsInterval should be StatsInterval and move up to the exported/capitalized section...
Thank you for this, I had wrongly assumed that the different sections were arranged semantically by their functions.
In Init, we are assuming the initial previousMetrics will have all zeroes. So we could just use a nil map (
var previousMetrics map[string]float64) instead of calling gatherMetrics...
I had assumed it would panic because that feels like typical Go behavior.
The blkio:0.0 entries aren't implemented yet.
Done, plus keepcache, keepcalls, and net:keep0, the last of which I don't really know how to test. It looks like the stats are initiated and registered properly, and that I'm collecting them properly, but I have yet to figure out how it's triggered.
I think the test case would be more manageable with a "get the line for stat X" closure...
Done, it's much cleaner now.
On second thought I take back the part about making it a *keepfs method. writeMetrics(out io.Writer, ...) would be tidier for testing.
I was planning on pushing back on this for that reason exactly.
Updated by Brett Smith 4 months ago
- Precedes Feature #23332: Go FUSE driver reports crunchstats net:keep0, keepcalls, keepcache added
Updated by Brett Smith 4 months ago
- Related to Feature #23333: Go FUSE Driver Phase 1: crunch-run uses arvados-client mount added
Updated by Brett Smith 4 months ago
- Target version changed from Development 2025-11-26 to Development 2025-12-10
Updated by Brett Smith 4 months ago
Lisa Knox wrote in #note-18:
23245-fuse-logging @ 38e47b672aba2f2b959d5c840f0c403ac59a98ac
Going over Tom's comments and this branch, I agree you've addressed most everything, except for one note below.
In Init, we are assuming the initial previousMetrics will have all zeroes. So we could just use a nil map (
var previousMetrics map[string]float64) instead of calling gatherMetrics...I had assumed it would panic because that feels like typical Go behavior.
Specifically for slices and maps, reads from nil are treated the same as an empty slice/map, and idiomatic Go often leans on this fact.
The blkio:0.0 entries aren't implemented yet.
Done, plus keepcache, keepcalls, and net:keep0, the last of which I don't really know how to test. It looks like the stats are initiated and registered properly, and that I'm collecting them properly, but I have yet to figure out how it's triggered.
Unfortunately this is beyond me too. Personally I think I would be comfortable merging the branch with the current testing, and then seeing if Tom has any follow-up ideas for a test.
On second thought I take back the part about making it a *keepfs method. writeMetrics(out io.Writer, ...) would be tidier for testing.
I was planning on pushing back on this for that reason exactly.
If I'm following right, it seems like there was some miscommunication here. Tom took back the part about making it a *keepfs method, yet that's exactly what the latest changes do. Am I missing something?
One style note, to avoid pyramiding, in gatherMetrics instead of this:
metricName := mf.GetName()
if len(metric.GetLabel()) > 0 {
// … do the rest…
}
it would be nicer written like:
metricName := mf.GetName()
if len(metric.GetLabel()) == 0 {
continue
}
// … do the rest…
Thanks.
Updated by Lisa Knox 4 months ago
· Edited
23245-fuse-logging @ 8faaebd3553d42e4e0f9b29ec3454f04adc6fc77
Style note taken
If I'm following right, it seems like there was some miscommunication here. Tom took back the part about making it a *keepfs method, yet that's exactly what the latest changes do. Am I missing something?
No, I misunderstood the comment, or rather, the retraction. I returned it to a public function. The part I wanted to push back on is the idea of passing the writer in and having FormatMetrics itself write to stdout. We already test that the logging happens, so it makes sense to me to independently test that FormatMetrics formats the metrics, which I like because FormatMetrics is a somewhat complicated function.
Actually, looking at it again, what we would ideally do is to test that each individual operation populates the correct values in the correct line in the logs. For example, mount the fs, then create a directory, then check that mkdir was called in the log lines. As far as I can tell, that is not what happens in the python FUSE driver tests. I don't know how feasible or worthwhile that is, though.
Updated by Brett Smith 4 months ago
Lisa Knox wrote in #note-23:
No, I misunderstood the comment, or rather, the retraction. I returned it to a public function. The part I wanted to push back on is the idea of passing the writer in and having FormatMetrics itself write to stdout. We already test that the logging happens, so it makes sense to me to independently test that FormatMetrics formats the metrics, which I like because FormatMetrics is a somewhat complicated function.
So, I do think the way Tom retracted and revised his comment was, er, pretty terse and a little confusing. But I understand the suggestion is to replace the original:
func formatMetrics(currentMetrics, previousMetrics map[string]float64, intervalSeconds float64) []string
with:
func writeMetrics(out io.Writer, currentMetrics, previousMetrics map[string]float64, intervalSeconds float64) error
Then instead of building a slice, the function writes strings directly to out. If any write returns an error, it can immediately return that back.
Tests can pass in a strings.Builder so they don't have to try to intercept stdout writes or any other magic like that. Then, to me at least, writing checks against the builder seems about the same effort as writing checks against the slice. (If it's still helpful to have a slice, it's easy enough to make one with strings.Split(builder.String(), "\n").
Actually, looking at it again, what we would ideally do is to test that each individual operation populates the correct values in the correct line in the logs. For example, mount the fs, then create a directory, then check that mkdir was called in the log lines. As far as I can tell, that is not what happens in the python FUSE driver tests. I don't know how feasible or worthwhile that is, though.
I think where I'm at with this right now is, it's a good idea and I'm interested in doing it, but given where this ticket is and that there's other active development on the Go FUSE driver right now, I would rather have it be a separate ticket.
I think it would be reasonable to set up a single mount, then do one operation, check the logs, do a different operations, check the logs, etc. This would reduce overhead significantly. It would add some risk that one bad operation ruins all the following tests, but overall I think it's a reasonable trade-off. Does that seem doable?
Updated by Tom Clegg 4 months ago
- FormatMetrics should be formatMetrics. We aren't planning for anyone outside this package to use it. (I meant to retract only the part about making it a method of
*keepFS, not the part about making it private.) - formatMetrics should take an io.Writer argument instead of returning a slice of strings, because it's idiomatic Go and it's more efficient. Tests should write to a bytes.Buffer (or strings.Builder) and split on newline.
test that each individual operation populates the correct values in the correct line in the logs
I think it would be reasonable to set up a single mount, then do one operation, check the logs, do a different operations, check the logs, etc. This would reduce overhead significantly.
We should be testing this in fs_test.go, which doesn't actually set up a fuse mount, so overhead isn't an issue.
This test doesn't need to check logs (which is super inconvenient because logs are produced asynchronously on a timer, and is redundant because we already test the logging parts in TestFormatMetrics and TestCrunchstatLogger). It just needs to call gatherMetrics and check the returned map.
That said, I'm not sure we need to prioritize an exhaustive version of this test. We could start with a simple version that writes and reads back a file and confirms the metrics have non-zero entries for bytes written, cache hits, and open/write/read ops. That will catch the most likely bugs. Maybe this will even use our new feature to mount a temp collection -- s.fs.root.MountTmp("tmp") -- which should be more convenient/tidy than our usual approach of explicitly creating a test collection or using a test fixture.
Updated by Tom Clegg 4 months ago
We don't really need the t0 variable, we could just do this, right?
- t0 := time.Now()
- defer fs.reportMetrics("open", t0, nil)
+ defer fs.reportMetrics("open", time.Now(), nil)
For read and write, I think it would be more correct to increment the operation count even when the number of bytes read/written is zero:
- t0 := time.Now()
- defer func() {
- if n > 0 {
- fs.reportMetrics("read", t0, &n)
- }
- }()
+ defer fs.reportMetrics("read", time.Now(), &n)
Thanks for adding the missing defer fs.debugPanics() in Fsyncdir.
Updated by Lisa Knox 3 months ago
23245-fuse-logging @ e1f41c11b1cbe637895df4ef7c4567db4ddd326f
- ✅ All agreed upon points are implemented / addressed.
- formatMetrics is now writeMetrics and writes directly to an io.Writer
- I chose to keep the existing writeMetrics functionality of compiling all the strings and then writing them out, as this allows me to write the error check only once.
- Added a test for getherMetrics()
- Anything not implemented (discovered or discussed during work) has a follow-up story.
- We need to nail down how exhaustive the tests need to be and either include them here or on a separate ticket.
- ✅ Code is tested and passing, both automated and manual, what manual testing was done is described
- ✅ The tested code incorporates recent main branch changes.
- n/a New or changed UX/UX and has gotten feedback from stakeholders.
- n/a Documentation has been updated.
- ✅ Behaves appropriately at the intended scale (describe intended scale).
- ✅ Considered backwards and forwards compatibility issues between client and server.
- no change
- ✅ Follows our coding standards and GUI style guidelines.
Updated by Brett Smith 3 months ago
- Target version changed from Development 2025-12-10 to Development 2026-01-06
Updated by Tom Clegg 3 months ago
Lisa Knox wrote in #note-27:
- I chose to keep the existing writeMetrics functionality of compiling all the strings and then writing them out, as this allows me to write the error check only once.
Since the caller is ignoring the returned error anyway, I think we should just call fmt.Fprintf(out, ...) and not bother checking/returning errors from that.
- We need to nail down how exhaustive the tests need to be and either include them here or on a separate ticket.
I think the minimum we should have on this branch is a check for a non-zero statistic in TestCrunchstatLogger. Something like: write a 2KiB file before the 20 ms sleep, and check logs for crunchstat: blkio:0:0 2048 write and crunchstat: fuseop:open 1 count instead of just crunchstat.
In TestWriteMetrics, we could sort the written lines and then compare to a pre-sorted slice of expected lines using DeepEquals. I think the way that gets reported in a failure will be easier to make sense of.
Gocheck comes with a NotNil checker that you can use in place of Not(IsNil).
The trailing tab here looks like maybe a debugging thing that wasn't meant to be in the final version?
lines = append(lines, fmt.Sprintf("crunchstat: net:keep0 %.0f tx %.0f rx -- interval %.4f seconds %.0f tx %.0f rx ",
c.Check(lines1[0], Matches, "crunchstat: net:keep0 1024 tx 2048 rx -- interval 1.0000 seconds 1024 tx 2048 rx.*")
c.Check(lines1[1], Equals, "crunchstat: keepcalls 5 put 10 get -- interval 1.0000 seconds 5 put 10 get")
Updated by Lisa Knox 3 months ago
· Edited
23245-fuse-logging @ 8d91b8ab2263d7087bbf05986816b3693c345c78
Since the caller is ignoring the returned error anyway, I think we should just call
fmt.Fprintf(out, ...)and not bother checking/returning errors from that.
Done
I think the minimum we should have on this branch is a check for a non-zero statistic in TestCrunchstatLogger. Something like: write a 2KiB file before the 20 ms sleep, and check logs for
crunchstat: blkio:0:0 2048 writeandcrunchstat: fuseop:open 1 countinstead of justcrunchstat.
Done
In TestWriteMetrics, we could sort the written lines and then compare to a pre-sorted slice of expected lines using DeepEquals.
This makes it much simpler, thank you.
Gocheck comes with a
NotNilchecker...
The trailing tab here...
Done and done
Updated by Lisa Knox 3 months ago
- Status changed from In Progress to Resolved
Applied in changeset arvados|96600b7ede39f0a14a08d8f28c2e614fc21787de.