Story #7988
[Keep] Single keepstore responsible for trash lists on S3
Status: Closed
Priority: Normal
Assigned To: -
Category: -
Target version: -
Start date:
Due date:
% Done: 0%
Estimated time:
Story points: -
Description
The S3 API does not support transactional deletes, so there is a race between checking a block's timestamp and deleting the block: the block's metadata can be refreshed after the check, and the block is then deleted anyway.
AWS S3 object versioning could potentially solve this, but it is not available on other storage systems that provide S3-compatible APIs, such as Google Cloud Storage and Ceph.
Proposed solution:
- Designate a single keepstore to handle trash lists for a given S3 bucket.
- On PUT, if the block is new, or is an existing block less than 2 weeks old, the request can be handled by any server as usual.
- Otherwise (the existing block is 2 weeks old or older), do a PUT-copy of the block to "hash.copy".
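- For each block on the trash list:
  - Get the modification time.
  - Try to PUT-copy from "hash.copy" to "hash". If this succeeds, the block has been written again since it was listed, so keep it and move on to the next block.
  - Send the delete request.
  - Try to PUT-copy from "hash.copy" to "hash" again, ignoring errors. This restores the block if "hash.copy" appeared while the delete was in flight.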
- When the trash list is empty (because we finished processing it, or because an empty trash list was received from the data manager), find and delete all blocks matching the pattern "*.copy".