Bug #5043
closed[Crunch] accept arbitrarily long stderr lines in crunchstat
100%
Description
Crunchstat uses a bufio.Scanner
to read stderr lines, and as a consequence cannot process input lines longer than bufio.MaxScanTokenSize
(64K). (See #4889)
Using a bufio.Reader
with ReadString()
or something similar would make it more robust to unexpectedly long stderr output, without killing the child task.
Updated by Tom Clegg almost 10 years ago
- Target version changed from Arvados Future Sprints to 2015-03-11 sprint
Updated by Tom Clegg almost 10 years ago
5043-crunchstat-long-lines at 508bcab includes
Accept arbitrarily long lines on stderr. Split long stderr lines into <= 16KiB segments, emitting "firstpart[...]" then "[...]nextpart[...]" then "[...]lastpart". Splitting provides a few benefits:- crunchstat's caller (and, typically, Workbench) can see some stderr even though a newline might not arrive for a long time.
- programs trying to display the resulting log are less likely to behave badly (e.g.,
less(1)
gets very slow when displaying long lines) - crunchstat doesn't consume arbitrary amounts of RAM trying to buffer long lines.
Also: Get rid of the "log channel" and use Go's own logging library instead: it guarantees goroutine safety and allows you to specify a prefix ("crunchstat: "), which were the two things our logChan code was doing.
Updated by Peter Amstutz almost 10 years ago
I took some time to beat on this a little bit. Tested this way:
./crunchstat -cgroup-root=/sys/fs/cgroup -poll=1 bash -c "cat ~/ff4a131626aa0883c715d3c5abf394a6 1>&2" 2> out
(ff4a131626aa0883c715d3c5abf394a6 is 28 MiB of manifest on a single line I had lying around from something else..)
Works great!
Also: Get rid of the "log channel" and use Go's own logging library instead: it guarantees goroutine safety and allows you to specify a prefix ("crunchstat: "), which were the two things our logChan code was doing.
You're creating separate "statLog" and "childLog" Loggers that both write to Stderr, and (from looking at the actual log.go code) each one has its own mutex, which means writes to "statLog" are serialized, but this has no effect on writes to "childLog". In practice this doesn't seem to be an issue, but I think we're just getting lucky that Write()
on the underlying stderr stream is effectively atomic for small writes.
Just one thing puzzled me in the tests:
t.Fatalf("sent %d bytes, got %d different bytes", len(sentBytes)+1, len(receivedBytes))
Why does it report len(sentBytes)+1
when the test is bytes.Compare(receivedBytes, sentBytes)
?
Updated by Tom Clegg almost 10 years ago
Peter Amstutz wrote:
Also: Get rid of the "log channel" and use Go's own logging library instead: it guarantees goroutine safety and allows you to specify a prefix ("crunchstat: "), which were the two things our logChan code was doing.
You're creating separate "statLog" and "childLog" Loggers that both write to Stderr, and (from looking at the actual log.go code) each one has its own mutex, which means writes to "statLog" are serialized, but this has no effect on writes to "childLog". In practice this doesn't seem to be an issue, but I think we're just getting lucky that
Write()
on the underlying stderr stream is effectively atomic for small writes.
It's not just luck: the Go log library also guarantees "each logging operation makes a single call to the Writer's Write method." The mutex just controls access to the logger's output buffer; os.Stderr.Write()
itself is already safe to use from multiple goroutines.
You can exercise this with something like
./crunchstat -cgroup-root=/sys/fs/cgroup -poll=1 sh -c 'yes c------------------------------------------------------------------------ >&2' 2>&1 |egrep -v ^c
Why does it report
len(sentBytes)+1
when the test isbytes.Compare(receivedBytes, sentBytes)
?
Oops, fixed @ f1589a6. (This was left over from an uglier version in which sentBytes didn't end with a newline.)
Updated by Peter Amstutz almost 10 years ago
Tom Clegg wrote:
It's not just luck: the Go log library also guarantees "each logging operation makes a single call to the Writer's Write method." The mutex just controls access to the logger's output buffer;
os.Stderr.Write()
itself is already safe to use from multiple goroutines.
Citation? I believe you, but I'm curious where it actually says that in the Go documentation.
In my own test I didn't see any interference, so it does seem to be working in practice.
LGTM.
Updated by Anonymous almost 10 years ago
- Status changed from In Progress to Resolved
Applied in changeset arvados|commit:a04ea95e79c60ed2a54eaec5b5c2e235fe39ef9a.