Tom Clegg, 02/04/2014 02:09 AM
h1. Keep index

See also:
* [[Keep server]]
* [[Keep manifest format]]
* source: n/a (design phase)

Purposes of index:
* Tell garbage collector what is eligible for deletion (and some partial order of preference)
* Tell replication enforcer which blocks should be stored how many times (and in which [types of] backing store)
* Tell rebalancer which blocks should be moved to redistribute free space and reduce probe time
* Tell managers how much disk space is being conserved due to CAS
* Tell managers how much disk space is occupied in a given backing store service
* Tell managers how disk usage would be affected by modifying storage policy
* Tell users how much disk space is represented by a given set of collections
* Tell users how much disk space can be made available by garbage collection
* Tell users how soon they should expect their cached data to disappear
* Tell users performance statistics (how fast should I expect my job to read data?)
* Tell ops where each block was most recently read/written, in case data recovery is needed
* Tell ops how unbalanced the backing stores are across the cluster
* Tell ops activity level and performance statistics
* Tell ops activity level vs. amount of space (how much of the data is being accessed by users?)
* Tell ops disk performance/error/status trends to help identify bad hardware

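Several of the purposes above (space conserved due to CAS, space represented by a set of collections, blocks eligible for garbage collection) reduce to set arithmetic over the block-to-collection mapping. The following is a minimal sketch only, since the index is still in the design phase: the input shape (collection UUID mapped to block hashes and sizes) and the function names @space_report@ and @unreferenced_blocks@ are assumptions made for illustration, and replication is ignored.

```python
from typing import Dict, Iterable, Set

# Hypothetical input shape: collection UUID -> {block hash: size in bytes}.
CollectionBlocks = Dict[str, Dict[str, int]]


def space_report(collections: CollectionBlocks) -> Dict[str, int]:
    """Compare the logical size of the collections with the unique bytes stored."""
    logical = 0                    # every reference to a block counts
    unique: Dict[str, int] = {}    # each distinct block counts once
    for blocks in collections.values():
        for block_hash, size in blocks.items():
            logical += size
            unique[block_hash] = size
    stored = sum(unique.values())
    return {"logical": logical, "stored": stored, "conserved": logical - stored}


def unreferenced_blocks(stored_blocks: Iterable[str],
                        collections: CollectionBlocks) -> Set[str]:
    """Blocks present in a backing store that no collection references."""
    referenced = {h for blocks in collections.values() for h in blocks}
    return set(stored_blocks) - referenced
```

Passing only a subset of collections to @space_report@ would give the "space represented by a given set of collections" figure. A garbage collector would additionally need the partial order of preference, which this sketch does not model.
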
Basic kinds of data in the index:
* Which blocks are used by which collections (and which collections are valued by which users/groups)
* Which blocks are stored on which disks
* Which disks are attached to which nodes
* Read events
* Write events
* Exceptions (checksum mismatch, IO error)

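One way to picture the kinds of data listed above is as a handful of small record types. This is a sketch under assumptions, not a proposed schema: every class and field name below (@CollectionBlock@, @BlockLocation@, @DiskAttachment@, @Event@, @last_access@) is hypothetical.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Iterable, Optional


class EventKind(Enum):
    READ = "read"
    WRITE = "write"
    CHECKSUM_MISMATCH = "checksum_mismatch"
    IO_ERROR = "io_error"


@dataclass(frozen=True)
class CollectionBlock:
    """A block used by a collection."""
    collection_uuid: str
    block_hash: str


@dataclass(frozen=True)
class CollectionValue:
    """A collection valued by a user or group."""
    collection_uuid: str
    owner_uuid: str


@dataclass(frozen=True)
class BlockLocation:
    """A block stored on a disk."""
    block_hash: str
    disk_id: str


@dataclass(frozen=True)
class DiskAttachment:
    """A disk attached to a node."""
    disk_id: str
    node: str


@dataclass(frozen=True)
class Event:
    """A read, a write, or an exception observed for a block on a disk."""
    kind: EventKind
    block_hash: str
    disk_id: str
    timestamp: float
    detail: Optional[str] = None


def last_access(events: Iterable[Event], block_hash: str) -> Optional[Event]:
    """Most recent read or write of a block, or None if it was never accessed."""
    hits = [e for e in events
            if e.block_hash == block_hash
            and e.kind in (EventKind.READ, EventKind.WRITE)]
    return max(hits, key=lambda e: e.timestamp, default=None)
```

A query such as @last_access@ corresponds to the "where each block was most recently read/written" purpose; joining its @disk_id@ against @DiskAttachment@ records would give the node.
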
h2. Implementation considerations

Overview
* REST service
* API server may cache/proxy some queries
* API server may redirect some queries

Permissions
* Support +A tokens, as [[Keep server]] does, when accepting collection/blob UUIDs in a request?
* Require an admin api_token for some queries (site-configurable)?

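To make the permission questions above concrete, the sketch below pulls a +A hint out of a block locator and applies a placeholder policy. It assumes locators of the form @<32 hex digits>+<size>+<hint>...@ and treats the text after @+A@ as opaque. The names @parse_locator@ and @may_query@ are hypothetical, and verifying the hint is deliberately left out.

```python
import re
from typing import NamedTuple, Optional


class Locator(NamedTuple):
    block_hash: str
    size: Optional[int]
    permission_hint: Optional[str]  # text after "+A", treated as opaque here


# Assumption: block hashes are 32 lowercase hex digits.
_HASH_RE = re.compile(r"[0-9a-f]{32}")


def parse_locator(locator: str) -> Locator:
    """Split a locator into its hash, optional size, and optional +A hint."""
    parts = locator.split("+")
    block_hash = parts[0]
    if not _HASH_RE.fullmatch(block_hash):
        raise ValueError("not a block locator: %r" % locator)
    size = None
    permission = None
    for part in parts[1:]:
        if part.isdigit() and size is None:
            size = int(part)
        elif part.startswith("A"):
            permission = part[1:]
    return Locator(block_hash, size, permission)


def may_query(locator: str, is_admin: bool) -> bool:
    """Placeholder policy: admins may ask about any block; everyone else
    must present a +A hint. A real service would also have to verify the
    hint, which is not shown here."""
    return is_admin or parse_locator(locator).permission_hint is not None
```

Whether the admin bypass applies to all queries or only some is the site-configurable part raised above; here it is a single boolean for brevity.
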
Distributed/asynchronous
* Easy to run multiple Keep index services.
* Most features do not need synchronous operation / real-time data.
* Features that move or delete data should be tied to a single "primary" indexing service (a failover event likely requires resetting some state).
* Substantial disagreement between multiple index services should be easy to flag on the admin dashboard.
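
A rough illustration of how disagreement between two index services might be measured for a dashboard: compare their views of which disks hold each block and report the fraction of blocks on which they differ. The snapshot shape, the function names, and the 5% threshold are all assumptions chosen for the example.

```python
from typing import Dict, Set

# Hypothetical snapshot shape: block hash -> set of disk ids holding it.
Snapshot = Dict[str, Set[str]]


def disagreement(a: Snapshot, b: Snapshot) -> float:
    """Fraction of blocks known to either service whose disk sets differ."""
    blocks = set(a) | set(b)
    if not blocks:
        return 0.0
    differing = sum(1 for h in blocks if a.get(h, set()) != b.get(h, set()))
    return differing / len(blocks)


def should_flag(a: Snapshot, b: Snapshot, threshold: float = 0.05) -> bool:
    """True when the two services disagree on more than `threshold` of blocks."""
    return disagreement(a, b) > threshold
```

Because most features do not need real-time data, some small disagreement between services is expected at any moment; the threshold is what separates normal lag from "substantial" disagreement.
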