Bug #7255: [Data Manager] Test some non-trivial manifests in integration test - Arvados

Actions

Copy link

Bug #7255

closed

[Data Manager] Test some non-trivial manifests in integration test

Added by Tom Clegg over 9 years ago. Updated over 9 years ago.

Status:

Resolved

Priority:

Normal

Assigned To:

Radhika Chippada

Category:

Keep

Target version:

2015-12-02 sprint

Start date:

09/10/2015

Due date:

% Done:

Estimated time:

(Total: 0.00 h)

Story points:

1.0

Description

Currently, the test data used in integration tests include only one data block per manifest, and the test program relies on this: it doesn't test the safety of anything after the first block in the first stream of each collection.

The testing scheme should expand to include the following cases:

Empty collection (manifest_text == "")
Collection with 100 streams with 10 blocks each (total 1000 different blocks)

Related, but not included here:

Handle errors in manifest format (see #7253)

Subtasks 1 (0 open — 1 closed)

Related issues 1 (0 open — 1 closed)

Actions

Copy link

Updated by Tom Clegg over 9 years ago

Story points changed from 1.0 to 0.5

Actions

Copy link

Updated by Brett Smith over 9 years ago

Target version set to Arvados Future Sprints

Actions

Copy link

Updated by Brett Smith over 9 years ago

Assigned To set to Radhika Chippada
Target version changed from Arvados Future Sprints to 2015-12-02 sprint

Actions

Copy link

Updated by Radhika Chippada over 9 years ago

Story points changed from 0.5 to 1.0

Actions

Copy link

Updated by Radhika Chippada over 9 years ago

Status changed from New to In Progress

Actions

Copy link

Updated by Tom Clegg over 9 years ago

I don't think we can rely on arv-put in createMultiStreamBlockCollection. It will only write one block per stream unless we write more than 64 MiB of data; we'd have to write ~60 GiB in order to comply with the story spec. We're only interested in the number of blocks and streams, not bytes and files.

How about something like

var locs []string
for s := 0; s < 100; s++ {
  manifest += fmt.Sprintf("./stream%d ", s)
  for b := 0; b < 10; b++ {
    loc, _, err := kc.PutB([]byte(fmt.Sprintf("s %d b %d", s, b)))
    locs = append(locs, strings.Split(loc, "+A")[0])
    manifest += loc + " " 
  }
  manifest += "0:1:dummyfile.txt\n" 
}
coll := make(map[string]string)
arvadosclient.Create("collections", nil, map[string]string{"manifest_text":manifest}, &coll)
return coll["uuid"], locs

The difference betwen TestPutAndGetCollectionsWithMultipleStreamsAndBlocks and TestPutAndGetCollectionsWithMultipleBlocks seems to be that one has multiple streams and the other has just one; both have multiple files, but only a single block per stream. There are also some unnecessary complications that seem to negate the value of the test: if the first block appearing in the manifest is the only one that's backdated, for all we know only the rest of the blocks' age, not their presence in the manifest, is keeping them alive.

I think we only need one test here:

Create one collection with 100 streams x 10 blocks each = 1000 different block IDs (but the total data should be small)
Ensure list of locs returned by collection-creator func has 1000 different entries (sort and then test all adjacent pairs?)
Write one additional "stray" data block (unreferenced by any collection)
Backdate all of the blocks
Run garbage collection
Ensure the "stray" block is missing (i.e., we didn't just skip GC entirely for some reason), and all others ("locs") are still present.

Actions

Copy link

Updated by Tom Clegg over 9 years ago

LGTM @ e1c0499

really minor nits:

// instead of /* */ for short comments
TestPutAndGetCollections[...] is a bit odd as a test name. Maybe TestManifestWithMultipleStreamsAndBlocks. Not a big deal.

Actions

Copy link

Updated by Radhika Chippada over 9 years ago

Status changed from In Progress to Resolved
% Done changed from 0 to 100

Applied in changeset arvados|commit:efacbb4c548a4d8c9bda4a8b99a1f2f13e15cd2f.

Actions

Copy link

Also available in: Atom PDF

Project

General

Profile

Arvados

Custom queries

Bug #7255

[Data Manager] Test some non-trivial manifests in integration test

Updated by Tom Clegg over 9 years ago

Updated by Brett Smith over 9 years ago

Updated by Brett Smith over 9 years ago

Updated by Radhika Chippada over 9 years ago

Updated by Radhika Chippada over 9 years ago

Updated by Tom Clegg over 9 years ago

Updated by Tom Clegg over 9 years ago

Updated by Radhika Chippada over 9 years ago