h1. Data Manager

The Data Manager enforces policies and generates reports about storage resource usage. It interacts with the [[Keep server]] and the metadata database. Clients/users do not interact with the Data Manager directly: the metadata database service acts as a proxy/cache on their behalf, and is responsible for access controls.

See also:
* [[Keep server]]
* [[Keep manifest format]]
* source: n/a (design phase)

Responsibilities:
* Garbage collector: decide what is eligible for deletion (and some partial order of preference)
* Replication enforcer: copy and delete blocks in various backing stores to achieve the desired replication level (sketched after this list)
* Rebalancer: move blocks to redistribute free space and reduce client probes
* Data location index: know which backing stores should be contacted to retrieve a given block
* Report query engine

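The replication enforcer's core decision is a comparison of observed vs. desired replica counts per block. A minimal sketch of that decision in Go; all of the type and field names here are hypothetical, since the design above does not specify any:

<pre><code class="go">
package main

import "fmt"

// Block describes one Keep block as the index sees it. All names in
// this sketch are hypothetical.
type Block struct {
	Hash       string
	WantCopies int      // desired replication level from storage policy
	Stores     []string // backing stores currently holding a copy
}

// Action is the replication enforcer's decision for one block.
type Action string

const (
	Keep Action = "keep" // replication satisfied; nothing to do
	Copy Action = "copy" // under-replicated; add copies
	Trim Action = "trim" // over-replicated; delete surplus copies
)

// plan compares observed vs. desired replication and returns the
// action plus the number of copies to add or remove.
func plan(b Block) (Action, int) {
	diff := b.WantCopies - len(b.Stores)
	switch {
	case diff > 0:
		return Copy, diff
	case diff < 0:
		return Trim, -diff
	default:
		return Keep, 0
	}
}

func main() {
	b := Block{Hash: "d41d8cd98f00b204e9800998ecf8427e", WantCopies: 2, Stores: []string{"keep0"}}
	action, n := plan(b)
	fmt.Println(action, n) // copy 1: one more copy is needed
}
</code></pre>

The garbage collector's "partial order of preference" would slot in naturally here, e.g., by ranking Trim candidates before issuing deletes.
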
Example reports/queries:
* for managers: how much disk space is being conserved due to CAS (see the sketch after this list)
* for managers: how much disk space is occupied in a given backing store service
* for managers: how disk usage would be affected by modifying storage policy
* for managers: how much disk space+time is used (per user, group, node, disk)
* for users: when the replication/policy specified for a collection is not currently satisfied (and why, for how long, etc.)
* for users: how much disk space is represented by a given set of collections
* for users: how much disk space can be made available by garbage collection
* for users: how soon they should expect their cached data to disappear
* for users: performance statistics (how fast should I expect my job to read data?)
* for ops: where each block was most recently read/written, in case data recovery is needed
* for ops: how unbalanced the backing stores are across the cluster
* for ops: activity level and performance statistics
* for ops: activity level vs. amount of space (how much of the data is being accessed by users?)
* for ops: disk performance/error/status trends (and SMART reports) to help identify bad hardware
* for ops: history of disk adds, removals, moves

Basic kinds of data in the index (one possible in-memory shape is sketched after this list):
* Which blocks are used by which collections (and which collections are valued by which users/groups)
* Which blocks are stored in which services (local Keep, remote Keep, other storage service)
* Which blocks are stored on which disks
* Which disks are attached to which nodes
* Aggregate read/write activity per block and per disk (where applicable, e.g., block stored in local Keep)
* Exceptions (checksum mismatch, I/O error)

h2. Implementation considerations

Overview
* REST service for queries
** All requests require authentication. Token validity is verified against the metadata server and cached locally (see the sketch after this list).
* Subscribes to the system event log
* Connects to the metadata server (has a system_user token), at least periodically, to ensure eventual consistency with the metadata DB's idea of what data is important
* Persistent database
* In-memory database

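A minimal sketch of the authenticated REST entry point with a local token cache; the @validateWithMetadataServer@ call, the cache TTL, and the @/reports@ route are all assumptions for illustration, not part of the design:

<pre><code class="go">
package main

import (
	"fmt"
	"log"
	"net/http"
	"sync"
	"time"
)

// tokenCache remembers which tokens the metadata server recently
// accepted, so most requests skip the validation round trip.
type tokenCache struct {
	mu  sync.Mutex
	ttl time.Duration
	ok  map[string]time.Time // token -> time it was last verified
}

// validateWithMetadataServer stands in for the real check against
// the metadata server (hypothetical).
func validateWithMetadataServer(token string) bool {
	return token != ""
}

func (c *tokenCache) valid(token string) bool {
	c.mu.Lock()
	defer c.mu.Unlock()
	if t, hit := c.ok[token]; hit && time.Since(t) < c.ttl {
		return true // verified recently; trust the cache
	}
	if !validateWithMetadataServer(token) {
		return false
	}
	c.ok[token] = time.Now()
	return true
}

func main() {
	cache := &tokenCache{ttl: 5 * time.Minute, ok: map[string]time.Time{}}
	http.HandleFunc("/reports", func(w http.ResponseWriter, r *http.Request) {
		if !cache.valid(r.Header.Get("Authorization")) {
			http.Error(w, "invalid token", http.StatusUnauthorized)
			return
		}
		fmt.Fprintln(w, "report results would go here")
	})
	log.Fatal(http.ListenAndServe(":8080", nil))
}
</code></pre>

A short TTL bounds how long a revoked token keeps working while still avoiding a metadata-server round trip on every request.
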
Distributed/asynchronous
* Easy to run multiple Keep index services.
* Most features do not need synchronous operation or real-time data.
* Features that move or delete data should be tied to a single "primary" indexing service (a failover event likely requires resetting some state).
* Substantial disagreement between multiple index services should be easy to flag on an admin dashboard (see the sketch below).
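
One way to flag "substantial disagreement" is to compare each service's view of the same aggregate (e.g., block count per backing store) and report stores whose counts differ beyond a threshold. A rough sketch; the 10% threshold and the shape of the per-service counts are assumptions:

<pre><code class="go">
package main

import "fmt"

// disagreement returns the backing stores for which two index
// services' block counts differ by more than frac (e.g. 0.10).
// For simplicity it only scans stores known to service a.
func disagreement(a, b map[string]int, frac float64) []string {
	var flagged []string
	for store, na := range a {
		nb := b[store]
		hi := na
		if nb > hi {
			hi = nb
		}
		if hi == 0 {
			continue
		}
		diff := na - nb
		if diff < 0 {
			diff = -diff
		}
		if float64(diff)/float64(hi) > frac {
			flagged = append(flagged, store)
		}
	}
	return flagged
}

func main() {
	svcA := map[string]int{"keep0": 1000, "keep1": 800}
	svcB := map[string]int{"keep0": 1005, "keep1": 600} // keep1 differs by 25%
	fmt.Println(disagreement(svcA, svcB, 0.10))         // [keep1]
}
</code></pre>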