AB#62209: Add local backup and error handling into prepare_data.sh #9


Merged: 5 commits, Jul 21, 2025

Conversation

e-halinen (Contributor)

No description provided.

@haphut left a comment

Great start at making this thing more reliable!

Note that I have not delved deeply into the larger context of this script or into the overall process and data flow of hsl-map-matcher. Some thoughts anyway:

  1. Bash is great for one-off operations, for combining tools with pipes, or maaaybe for file operations.

    Once a Bash script reaches 20-50 lines, I would consider migrating to a language with a friendlier developer experience. In this case, TypeScript.

    The script downloads a new data set and runs binaries against it. This logic can be done in TypeScript. That would take more development work right now, though.

  2. If you think we should continue with a Bash script for now, I would change the following:

    1. Add the great default settings, `set -Eeuo pipefail`, to the top. Maybe then just do not handle errors at all and let the script fail on any error, depending on how the TypeScript wrapper reacts to that.
    2. Create a temporary directory, set a trap to remove it, and download and process the new data in a subdirectory of that temporary directory. If everything goes well, i.e. the script does not end prematurely with a non-zero exit code, replace the previous directory with the new one. This way a working copy of the data is always available.

    Pseudocode:

    #!/bin/bash
    
    set -Eeuo pipefail
    
    readonly OSM_DATA_URL="${OSM_DATA_URL}"
    readonly DATA_DIR='data'
    readonly PROFILE_SOURCE_DIR='./osrm-profiles'
    readonly OSM_DATA_FILE='map-data.osm.pbf'
    readonly OSRM_BIN_PATH='./node_modules/@project-osrm/osrm/lib/binding'
    
    tmp_dir="$(mktemp -d)"
    trap 'rm -rf "${tmp_dir}"' EXIT
    
    readonly NEW_DATA_DIR="${tmp_dir}/new-data-dir"
    
    # curl into NEW_DATA_DIR
    # run osrm binaries against NEW_DATA_DIR:
    # "${OSRM_BIN_PATH}" blah-blah
    
    # Two options:
    #
    ## Easier but not atomic:
    # rm -rf "${DATA_DIR}"
    # mv "${NEW_DATA_DIR}" "${DATA_DIR}"
    #
    ## Harder but atomic:
    # 1. Move NEW_DATA_DIR out of /tmp similarly to above.
    # 2. Create a symlink (`ln -s`) or move the existing symlink (`mv -T`) to
    #    point to the latest data directory. That's atomic.
    # 3. Remove the older directory. Maybe use ISO 8601 timestamps
    #    (`date --utc '+%Y%m%dT%H%M%SZ'`) in the directory names to recognize the
    #    latest directory.
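The "harder but atomic" option above could be sketched as follows. The `data-<timestamp>` directory naming and the `data` symlink path are illustrative assumptions, not the merged prepare_data.sh, and `mv -T` plus `date --utc` are GNU coreutils extensions:

```shell
#!/bin/bash
# Sketch: publish a freshly prepared data directory by retargeting a
# symlink, so consumers always see a complete data set.
set -Eeuo pipefail

readonly DATA_LINK='data'  # stable path that consumers read through
readonly new_dir="data-$(date --utc '+%Y%m%dT%H%M%SZ')"

tmp_dir="$(mktemp -d)"
trap 'rm -rf "${tmp_dir}"' EXIT

# Stand-in for the real download and osrm processing steps.
mkdir "${tmp_dir}/new-data-dir"
echo 'processed' > "${tmp_dir}/new-data-dir/map-data.osm.pbf"

# 1. Move the finished data out of /tmp under a timestamped name.
mv "${tmp_dir}/new-data-dir" "${new_dir}"

# 2. Build the new symlink under a temporary name, then rename it over
#    the old one. The rename is atomic, so readers see either the old
#    target or the new one, never a missing or half-written path.
ln -s "${new_dir}" "${DATA_LINK}.tmp"
mv -T "${DATA_LINK}.tmp" "${DATA_LINK}"

# 3. Older data-* directories could be pruned here; the ISO 8601
#    timestamps sort lexicographically, so the newest sorts last.
```

Note this assumes `data` is already a symlink (or absent); an existing real `data` directory would have to be migrated to the symlink layout once, since `mv -T` refuses to overwrite a non-empty directory.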

@haphut left a comment

LGTM

@e-halinen e-halinen merged commit ccd61c0 into dev Jul 21, 2025