Details
Current Status
Version 1.0.2
- Standard imports: Use osmium to report more accurate time of the data from pbf metadata #390
- First imports w/ Replication: Use osmium to report more accurate time of the data from pbf metadata #390
Version TBD
- Updates with replication
Description
The osm.pgosm_flex table has an osm_date column, which is intended to indicate when the data itself is from. By default, the current behavior uses "today's date" according to the computer running the import. If the --pgosm-date option is used, the provided date (e.g. 2024-05-18) is used instead. Neither option is perfect, due to time zones and the actual difference between "when I downloaded the file" and "when the data was pulled from OSM."
I'd like to make the following changes.
- Use osmium fileinfo to retrieve the timestamp from the pbf metadata when it exists
  - Fall back to current behavior when metadata is missing
- Consider changing osm_date to timestamptz instead of date
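The fallback order proposed above could look roughly like the sketch below. The function name `resolve_osm_date` and its parameters are hypothetical and not part of PgOSM Flex. It assumes the metadata timestamp, when present, takes precedence over --pgosm-date, which this issue does not settle.

```python
from datetime import date, datetime, timezone
from typing import Optional


def resolve_osm_date(pbf_timestamp: Optional[str],
                     pgosm_date: Optional[str] = None,
                     today: Optional[date] = None) -> date:
    """Hypothetical helper: choose the value to store in osm_date."""
    if pbf_timestamp:
        # PBF header timestamps are UTC strings such as 2024-05-17T20:20:59Z.
        parsed = datetime.strptime(pbf_timestamp, "%Y-%m-%dT%H:%M:%SZ")
        return parsed.replace(tzinfo=timezone.utc).date()
    # No metadata: fall back to the current behavior.
    if pgosm_date:
        # Value passed via --pgosm-date, e.g. "2024-05-18".
        return date.fromisoformat(pgosm_date)
    # `today` is only a test hook; normally the local date is used.
    return today or date.today()
```

Keeping the UTC date avoids depending on the time zone of the machine running the import. Switching the column to timestamptz would make that choice unnecessary.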
Example in DB
I ran an import on 5/18/2024 local time. The data saved in the osm.pgosm_flex table is shown by the following query.
SELECT imported, osm_date, region, pgosm_flex_version
FROM osm.pgosm_flex
;
imported |osm_date |region |pgosm_flex_version|
-----------------------------+----------+-------------------------+------------------+
2024-05-18 08:25:03.747 -0600|2024-05-18|north-america/us-colorado|1.0.0-c946501 |
PBF metadata
The timestamp from the pbf's metadata is reported as 2024-05-17T20:20:59Z, which is 2024-05-17T14:20:59 MDT local time for me. The current method therefore reports a date one day later than when the data was actually sourced.
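The one-day discrepancy can be checked with a few lines of Python. This is only an illustration using the values from this issue. MDT is modeled as a fixed UTC-6 offset so the example does not need a time zone database.

```python
from datetime import date, datetime, timedelta, timezone

# Timestamp from the PBF header, which is in UTC.
pbf_utc = datetime(2024, 5, 17, 20, 20, 59, tzinfo=timezone.utc)

# MDT modeled as a fixed UTC-6 offset (an assumption for this example).
mdt = timezone(timedelta(hours=-6), "MDT")
pbf_local = pbf_utc.astimezone(mdt)

# Date the import stored in osm_date under the current behavior.
recorded = date(2024, 5, 18)

# How far the recorded date is from the date the data was sourced.
days_off = (recorded - pbf_local.date()).days

print(pbf_local.isoformat())  # 2024-05-17T14:20:59-06:00
print(days_off)               # 1
```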
We should be able to run this command and load the returned JSON into Python as a dict to extract the timestamp and/or osmosis_replication_timestamp keys.
osmium fileinfo district-of-columbia-2024-05-18.osm.pbf --json
{
  "file": {
    "name": "district-of-columbia-2024-05-18.osm.pbf",
    "format": "PBF",
    "compression": "none",
    "size": 19026604
  },
  "header": {
    "boxes": [
      [
        -77.1201,
        38.79134,
        -76.90906,
        38.99603
      ]
    ],
    "with_history": false,
    "option": {
      "generator": "osmium/1.14.0",
      "osmosis_replication_base_url": "http://download.geofabrik.de/north-america/us/district-of-columbia-updates",
      "osmosis_replication_sequence_number": "4066",
      "osmosis_replication_timestamp": "2024-05-17T20:20:59Z",
      "pbf_dense_nodes": "true",
      "pbf_optional_feature_0": "Sort.Type_then_ID",
      "sorting": "Type_then_ID",
      "timestamp": "2024-05-17T20:20:59Z"
    }
  }
}
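A minimal sketch of reading that output from Python follows. The helper names `read_fileinfo` and `extract_pbf_timestamp` are hypothetical. It assumes osmium is on the PATH, and that timestamp should be preferred over osmosis_replication_timestamp when both exist, which is a guess rather than a decision made here.

```python
import json
import subprocess
from typing import Optional


def read_fileinfo(pbf_path: str) -> dict:
    """Run `osmium fileinfo <file> --json` and return the parsed output."""
    result = subprocess.run(
        ["osmium", "fileinfo", pbf_path, "--json"],
        capture_output=True, text=True, check=True,
    )
    return json.loads(result.stdout)


def extract_pbf_timestamp(fileinfo: dict) -> Optional[str]:
    """Return the header timestamp, or None when the metadata is missing."""
    options = fileinfo.get("header", {}).get("option", {})
    # Assumed preference: `timestamp` first, then the replication timestamp.
    return (options.get("timestamp")
            or options.get("osmosis_replication_timestamp"))
```

A return value of None is the signal to fall back to the current behavior. For the file above, `extract_pbf_timestamp(read_fileinfo("district-of-columbia-2024-05-18.osm.pbf"))` should give the 2024-05-17T20:20:59Z value shown in the JSON.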