We currently wipe the Solr database as the first step in the init pod:
translator-devops/helm/name-lookup/templates/scripts-config-map.yaml
Line 15 in 3333b46
Yesterday we ended up with a weird state in ITRB Prod where the Solr pod wasn't restarting because of a configuration issue. Once we got it restarted, it started in download mode -- so the first thing it did was wipe the Solr database!
Given the weird state, it's not clear to me that we could have restarted in LOAD_DATA=no mode and avoided wiping the database, but if we move this deletion line later in the script, that at least becomes an option, so hopefully we can catch this kind of problem before the database is wiped.
I think the right move would be to download this file to a separate directory -- this could be a separate mount point, which we can hopefully configure to be InitContainer-only so we don't need to lock up 400G for the Solr pod. We can then wipe the main database only after the download is complete.
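As a rough sketch of the ordering proposed above (not the actual init script): download into a staging directory first, and only wipe the main data directory once the download has succeeded. The function name `refresh_solr_data`, the directory arguments, and the pluggable download command are all hypothetical. The `LOAD_DATA=no` check reflects my assumption about how that setting would be honored. Moving files into place stands in for whatever restore step the real script performs.

```shell
#!/bin/sh
# Sketch only: names, paths, and the download command are assumptions,
# not taken from scripts-config-map.yaml.

# Usage: refresh_solr_data STAGING_DIR DATA_DIR DOWNLOAD_CMD [ARGS...]
# DOWNLOAD_CMD is called with STAGING_DIR appended as its last argument.
refresh_solr_data() {
    staging_dir=$1
    data_dir=$2
    shift 2

    # With LOAD_DATA=no, leave the existing database untouched.
    if [ "${LOAD_DATA:-yes}" = "no" ]; then
        echo "LOAD_DATA=no, keeping existing data in $data_dir"
        return 0
    fi

    # Start from an empty staging area (this could be an init-only mount).
    mkdir -p "$staging_dir"
    rm -rf "${staging_dir:?}"/*

    # Download first. If it fails, the old database is still intact.
    if ! "$@" "$staging_dir"; then
        echo "download failed, keeping existing data in $data_dir" >&2
        return 1
    fi

    # Refuse to wipe anything if the download produced no files.
    if [ -z "$(ls -A "$staging_dir")" ]; then
        echo "staging directory is empty, keeping existing data" >&2
        return 1
    fi

    # Only now wipe the main database and move the new data into place.
    mkdir -p "$data_dir"
    rm -rf "${data_dir:?}"/*
    mv "$staging_dir"/* "$data_dir"/
}
```

With this ordering, a failed or interrupted download leaves the previous database in place, and a restart with `LOAD_DATA=no` never reaches the deletion step.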