Story #3761

closed

[Keep] Process entries on the current pull list.

Added by Tom Clegg over 10 years ago. Updated almost 10 years ago.

Status: Resolved
Priority: Normal
Assigned To: Radhika Chippada
Category: Keep
Target version:
Start date: 03/02/2015
Due date:
% Done: 100%
Estimated time: (Total: 0.00 h)
Story points: 3.0

Description

Currently, when receiving its first pull list, keepstore sets up a WorkQueue instance called pullq. At the same time it should also start a pull worker goroutine:

go RunPullWorker(pullq.NextItem)

The resulting goroutine will run forever, processing pull requests on the WorkQueue one at a time.

"RunPullWorker" will:
  • Get the next pull request.
  • For each server, try Pull(). Stop when one succeeds.
  • Repeat.
"Pull" will:
  • Generate a random API token [1].
  • Generate a permission signature using the random token, a timestamp ~60 seconds in the future, and the desired block hash.
  • Using this token & signature, retrieve the given block from the given keepstore server.
  • Verify checksum and write to storage, just as if it had been provided by a client in a PUT transaction. I.e., PutBlock().

RunPullWorker() and Pull() will look something like this:

// Needs imports: fmt, io/ioutil, log, net/http

func RunPullWorker(nextItem <-chan interface{}) {
  for item := range nextItem {
    pullReq := item.(PullRequest)
    for _, addr := range pullReq.Servers {
      // Stop at the first server that delivers the block.
      if err := Pull(addr, pullReq.Locator); err == nil {
        break
      }
    }
  }
}

func Pull(addr string, locator string) (err error) {
  log.Printf("Pull %s/%s starting", addr, locator)
  // The named return value lets the deferred func log the final outcome.
  defer func() {
    if err == nil {
      log.Printf("Pull %s/%s success", addr, locator)
    } else {
      log.Printf("Pull %s/%s error: %s", addr, locator, err)
    }
  }()
  // (will also need to set auth headers and add a signature token to the locator here)
  resp, err := http.Get(fmt.Sprintf("http://%s/%s", addr, locator))
  if err != nil {
    return
  }
  defer resp.Body.Close()
  data, err := ioutil.ReadAll(resp.Body)
  if err != nil {
    return
  }
  // PutBlock verifies the checksum before writing to storage.
  err = PutBlock(data, locator)
  return
}

PullWorker doesn't need to worry about:
  • Retrying (Data Manager will tell us to do the pull again, if it's still needed)
  • Concurrency (we can add concurrency safely & easily by starting multiple PullWorkers)
  • Noticing when the pull list changes, or is empty (WorkQueue already does all this: we just read from the channel, and something will arrive when there's something for this thread to do)
  • Detecting whether a given pull request is useless, e.g., data already present, before pulling (instead, trust Data Manager to give us useful pull lists, and be OK with an occasional superfluous GET)

[1] Currently, Keep doesn't actually verify API tokens, only the permission signature, so a random token is just as effective as a real one.


Subtasks 1 (0 open, 1 closed)

Task #5346: Review branch: 3761-pull-list-worker (Resolved, Radhika Chippada, 03/02/2015)
