Feature #18863
Updated by Ward Vandewege almost 3 years ago
As identified in https://dev.arvados.org/issues/18763#note-5, we have a "deleted_old_container_logs" rake task that is supposed to be running in a cron job to clear out old container logs.
Following the pattern we started using for the trash sweeps (#18339), add a background job that executes the SQL query below.
Remove the rake task file from the repository. Add a note to the upgrade notes document that the cron job should be removed when upgrading.
The query used by the existing rake task is
DELETE FROM logs WHERE id in (SELECT logs.id FROM logs JOIN containers ON logs.object_uuid = containers.uuid WHERE event_type IN ('stdout', 'stderr', 'arv-mount', 'crunch-run', 'crunchstat') AND containers.log IS NOT NULL AND now() - containers.finished_at > interval '#{Rails.configuration.Containers.Logging.MaxAge.to_i} seconds')
Determined empirically, a more efficient (though PostgreSQL-specific) version of that query is
delete from logs using containers where logs.object_uuid=containers.uuid and logs.event_type in ('stdout', 'stderr', 'arv-mount', 'crunch-run', 'crunchstat') AND containers.log IS NOT NULL AND now() - containers.finished_at > interval '#{Rails.configuration.Containers.Logging.MaxAge.to_i} seconds'
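
As a rough sketch of how a background job might assemble the PostgreSQL-specific statement, the helper below builds the query text from a maximum age in seconds. The names delete_old_container_logs_sql and CONTAINER_LOG_EVENT_TYPES are hypothetical and not part of the existing code. The value 1_209_600 is only an example, and the actual job would presumably pass Rails.configuration.Containers.Logging.MaxAge.to_i instead.

```ruby
# Hypothetical helper (not taken from the Arvados code base): builds the
# PostgreSQL-specific DELETE statement quoted in this ticket.
CONTAINER_LOG_EVENT_TYPES = %w[stdout stderr arv-mount crunch-run crunchstat].freeze

def delete_old_container_logs_sql(max_age_seconds)
  # Integer() raises on non-numeric input, so only a plain integer is
  # ever interpolated into the SQL text.
  seconds = Integer(max_age_seconds)
  event_types = CONTAINER_LOG_EVENT_TYPES.map { |t| "'#{t}'" }.join(', ')
  "DELETE FROM logs USING containers " \
    "WHERE logs.object_uuid = containers.uuid " \
    "AND logs.event_type IN (#{event_types}) " \
    "AND containers.log IS NOT NULL " \
    "AND now() - containers.finished_at > interval '#{seconds} seconds'"
end

# Inside a Rails process the statement could then be run with, for example:
#   ActiveRecord::Base.connection.execute(delete_old_container_logs_sql(max_age))
# Here we only print it, using an example age of 14 days.
puts delete_old_container_logs_sql(1_209_600)
```

Keeping the statement construction separate from its execution makes it possible to check the generated SQL without a database connection.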