Bug #5856
closedgzip.open(f) sometimes throws struct.error (unpack requires a string argument of length 4) on File handle from CollectionReader
100%
Description
One of the tasks from https://workbench.su92l.arvadosapi.com/pipeline_instances/su92l-d1hrv-pfu68urmcrq00mo failed with this error
Three tasks from https://workbench.su92l.arvadosapi.com/pipeline_instances/su92l-d1hrv-8sgdxqj8dq8tqxa failed with this error as well
2015-04-29_17:40:17 su92l-8i9sb-bb7pxsb146b8suq 720 182 stderr Traceback (most recent call last):
2015-04-29_17:40:17 su92l-8i9sb-bb7pxsb146b8suq 720 182 stderr File "/tmp/crunch-job/src/crunch_scripts/add_callsets.py", line 234, in <module>
2015-04-29_17:40:17 su92l-8i9sb-bb7pxsb146b8suq 720 182 stderr for line in callset_fastj_file_handle:
2015-04-29_17:40:17 su92l-8i9sb-bb7pxsb146b8suq 720 182 stderr File "/usr/lib/python2.7/gzip.py", line 455, in readline
2015-04-29_17:40:17 su92l-8i9sb-bb7pxsb146b8suq 720 182 stderr c = self.read(readsize)
2015-04-29_17:40:17 su92l-8i9sb-bb7pxsb146b8suq 720 182 stderr File "/usr/lib/python2.7/gzip.py", line 261, in read
2015-04-29_17:40:17 su92l-8i9sb-bb7pxsb146b8suq 720 182 stderr self._read(readsize)
2015-04-29_17:40:17 su92l-8i9sb-bb7pxsb146b8suq 720 182 stderr File "/usr/lib/python2.7/gzip.py", line 325, in _read
2015-04-29_17:40:17 su92l-8i9sb-bb7pxsb146b8suq 720 182 stderr self._read_eof()
2015-04-29_17:40:17 su92l-8i9sb-bb7pxsb146b8suq 720 182 stderr File "/usr/lib/python2.7/gzip.py", line 344, in _read_eof
2015-04-29_17:40:17 su92l-8i9sb-bb7pxsb146b8suq 720 182 stderr isize = read32(self.fileobj) # may exceed 2GB
2015-04-29_17:40:17 su92l-8i9sb-bb7pxsb146b8suq 720 182 stderr File "/usr/lib/python2.7/gzip.py", line 25, in read32
2015-04-29_17:40:17 su92l-8i9sb-bb7pxsb146b8suq 720 182 stderr return struct.unpack("<I", input.read(4))[0]
2015-04-29_17:40:17 su92l-8i9sb-bb7pxsb146b8suq 720 182 stderr struct.error: unpack requires a string argument of length 4
zcat-ing the file that broke the task from the first pipeline (using keepmount) did not produce any errors
Updated by Peter Amstutz over 9 years ago
- Status changed from New to Resolved
- % Done changed from 0 to 100
Applied in changeset arvados|commit:c807aef7059b253512ba5da5d535e89e800b9c11.
Updated by Peter Amstutz over 9 years ago
- Status changed from Resolved to In Progress
Updated by Peter Amstutz over 9 years ago
- Target version changed from Bug Triage to 2015-05-20 sprint
Updated by Peter Amstutz over 9 years ago
- Status changed from In Progress to Resolved
- % Done changed from 0 to 100
Applied in changeset arvados|commit:4570521e9d639806b1f25c93822d5cf0dbfe42d0.
Updated by Peter Amstutz over 9 years ago
- Status changed from Resolved to In Progress
- Assigned To set to Peter Amstutz
Updated by Brett Smith over 9 years ago
Reviewing 4570521
I'm not sure it makes much sense for the tests to manipulate sfile.arvadosfile
directly. While it's not marked as private, it's not really part of the interface users are expected to use. What do you think of this:
- Rename
test_read_returns_first_block
totest_read_block_crossing_behavior
in both test case classes. In ArvadosFileReaderTestCase, this would have the code you added fortest_read_crosses_blocks
. This way the name is sensible in both places and we don't have to try to erase one in the child. test_successive_reads
is meant to ensure that state is maintained correctly across calls, so calling the statelessreadfrom
misses the mark. What if it calledsfile.read(4)
multiple times and checked for correct results until EOF? That would maintain the spirit of the test, and double-check that block crossing happens correctly.
In the docstring, "upto" should be two words.
Thanks.
Updated by Peter Amstutz over 9 years ago
- Status changed from In Progress to Resolved
- Assigned To deleted (
Peter Amstutz)
Brett Smith wrote:
Reviewing 4570521
I'm not sure it makes much sense for the tests to manipulate
sfile.arvadosfile
directly. While it's not marked as private, it's not really part of the interface users are expected to use. What do you think of this:
- Rename
test_read_returns_first_block
totest_read_block_crossing_behavior
in both test case classes. In ArvadosFileReaderTestCase, this would have the code you added fortest_read_crosses_blocks
. This way the name is sensible in both places and we don't have to try to erase one in the child.test_successive_reads
is meant to ensure that state is maintained correctly across calls, so calling the statelessreadfrom
misses the mark. What if it calledsfile.read(4)
multiple times and checked for correct results until EOF? That would maintain the spirit of the test, and double-check that block crossing happens correctly.
Done.
In the docstring, "upto" should be two words.
Fixed.
Updated by Brett Smith over 9 years ago
- Assigned To set to Peter Amstutz
c42ccd1f is good to merge. Thanks.