Jonathan Sheffi, 10/25/2013 02:43 PM

h1. Arvados Summit Fall 2013 Breakout 1

h2. User stories (Jonathan & Ward facilitating)

* As an admin, if I change my DB structure, I want Arvados to help me update the config.
* As an admin, I want to see the mapping of another dataset to my own.
* When I run a job, I want to be able to designate its results as Draft or Final/Real.
* As a consumer of genomic data, I want to visualize my data.
* As a commercial leader of a clinical lab, I want to be able to trace quote to cash for diagnostic tests.
* I want to be able to know where any file is.
* As a patient or participant, I want to be able to export my data to another study.
* As someone who works with data, I want the genotypic and phenotypic data I use to conform to a standard ontology.
* As a clinician, I want to quantify the uncertainty of the data & analysis underlying my report, so that the patient and I understand the clinical decision more fully.
* As a clinician, I want to run the same experiment on multiple data sets.
* As a lab director and oncologist, I want exome processing, from raw reads to called variants, to take 15 minutes.
* As a data miner, I want to be able to query *all* public data without downloading it.
* As a researcher, I want to be able to set up a standard pipeline for a particular type of data set.
* As an informatician, I want all my data to conform to a standard format so that I can analyze across multiple data sets.
* As a clinician, I want to collect & track inbound case data, such as referral letters, ICD-9 diagnosis codes, case summaries, consents, medical reports, and insurance pre-verifications.
* As an informatician, I want to be able to track & manage ICD-9/10 data.
* As a lab director or clinician, I want to share a report with another clinician at another institution.
* As a clinician, if I discover a mutation, I want to share that with an analytical tool or aggregator of data (e.g. GeneInsight).
* As a user, I want to associate ‘keepalive’ metadata with my intermediate data.
* As Arvados, I record profiling information on which the expiration of intermediate data can be based.
* As an informatician, I can easily manipulate VCF files in parallel (as easy as GNU parallel).
* As a compliance officer, I have structured insight into the consents for my data.
* As a researcher, I want to be able to collaborate on big datasets without having to copy them.
* As an informatician, I want to associate metadata with (a section of) my pipelines.
* As a new user, I can browse pipelines for metadata and see how ‘popular’ datasets and pipelines are [‘social features’].

h2. Technical discussion (Tom facilitating)

* Test for functionality
* Documentation
** What can Keep do?
** High-level functional description
** How would one replace an existing storage system with Keep?
** How to migrate?
** How to MapReduce?
** Examples
* Databases as input to job
* Permissions
* Audit trail
* Prioritizing jobs - squeaky wheel
* Monitoring - activity & status
* Checkpointing
* Self-starter kit