
Some research on middle performance #2110

Open
joto opened this issue Dec 4, 2023 · 9 comments

Comments

@joto
Collaborator

joto commented Dec 4, 2023

I did some research on how fast the import into the middle tables is and how much we could possibly improve performance. For that I compared actual osm2pgsql runs with running COPY "manually".

All the numbers are based on a single run, so take them with a grain of salt.

All experiments were done with the new middle (--middle-database-format=new) and without flat node files, i.e. all nodes were imported into the database.

First I did an import with --slim -O null, i.e. without output tables, to get the current baseline. Internally this creates the tables with a primary key constraint on the id column and then imports the data using COPY. I looked only at the timings for that part, not at building the extra indexes, which happens later. Indexing has to be done anyway and runs completely inside the database, so it is unlikely that we can do much about that part. I then dumped out the data in COPY format and re-created the same database by creating the tables and running COPY with psql. This time is the shortest we can likely get; the difference between it and the import time is the time needed to read and convert the OSM data, i.e. the necessary or unnecessary overhead generated by osm2pgsql.
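The baseline comparison described above can be sketched as a psql session. Table and column names below are illustrative placeholders for a middle table, not the exact osm2pgsql schema:

```sql
-- After a normal osm2pgsql import, dump one middle table in COPY text format:
\copy planet_osm_ways TO 'ways.copy'

-- Recreate the table the way osm2pgsql does (with the PK constraint on id)
-- and time a plain reload:
CREATE TABLE planet_osm_ways (
    id    int8 PRIMARY KEY,
    nodes int8[] NOT NULL,
    tags  jsonb
);
\timing on
\copy planet_osm_ways FROM 'ways.copy'
```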

I then tried some variations:

  • Create the tables without primary key (PK) constraint and add that constraint later (which will build the index)
  • COPY with FREEZE option
  • Both of the above
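A sketch of the two variations, using a placeholder table definition (PostgreSQL only accepts the FREEZE option when the table was created or truncated in the same transaction):

```sql
-- Variation 1: load without the PK constraint, build the index afterwards:
CREATE TABLE planet_osm_ways (id int8, nodes int8[] NOT NULL, tags jsonb);
\copy planet_osm_ways FROM 'ways.copy'
ALTER TABLE planet_osm_ways ADD PRIMARY KEY (id);

-- Variation 2: COPY with the FREEZE option; CREATE TABLE and COPY must
-- happen in one transaction or PostgreSQL rejects the FREEZE:
BEGIN;
CREATE TABLE planet_osm_ways (id int8 PRIMARY KEY, nodes int8[] NOT NULL, tags jsonb);
\copy planet_osm_ways FROM 'ways.copy' WITH (FREEZE)
COMMIT;
```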

Here are the timings (in minutes):

| | PK | nodes | ways | rels | sum |
|---|---|---|---|---|---|
| osm2pgsql import | yes | 263 | 77 | 3 | 343 (5.7h) |
| COPY | yes | 241 | 78 | 3 | 322 (5.4h) |
| COPY FREEZE | yes | 196 | 65 | 2 | 263 (4.4h) |
| COPY | no | 142 | 60 | 2 | 204 (3.4h) |
| COPY FREEZE | no | 127 | 50 | 2 | 179 (3.0h) |
| add primary key | | 29 | 4 | 0 | 33 (0.6h) |
| COPY + add PK | | 171 | 64 | 2 | 237 (4.0h) |
| COPY FREEZE + add PK | | 156 | 54 | 2 | 212 (3.5h) |

Some results from this research:

  • Our approach can most likely be improved upon, but it is not horrible. Compared to the best other result we need 5.7 instead of 3.5 hours. But we have to decode the OSM data and convert it into the COPY format; while this mostly happens in different threads, some overhead is inevitable here.
  • It looks like using COPY FREEZE can improve performance. For this to work we have to create the table in the same transaction in which we do the COPY. This is not easily possible with the current code, but it is a change we could make.
  • From the numbers here it seems to be significantly faster to create the table without the primary key constraint and to add it later. But in another test, where I also generated output tables, this did not make a difference. The reason is probably that osm2pgsql was so often busy doing other things that PostgreSQL had time to update the indexes while the import ran. So it might be possible to get some improvement here in a real situation, but it is not as clear-cut as the numbers above suggest.

We also have to keep in mind that the situation is different if we use a flat nodes file. (And also different if we use the --extra-attributes option.)

And in real situations there is interaction between the middle and the output, which I haven't looked at in detail so far. Most nodes don't have any tags, so they don't take up any time in the output code; the middle code is the bottleneck there. For the ways the situation is reversed: the middle is reasonably simple, while the output runs some Lua code for basically every way, which is almost certainly the bottleneck.

@pnorman
Collaborator

pnorman commented Dec 6, 2023

> From the numbers here it seems to be significantly faster to create the table without the primary key constraint and add that later. But in another test where I also generated output tables, this did not make a difference. The reason is probably that osm2pgsql was busy so often doing other things, that PostgreSQL had the time to update the indexes while the import ran. So it might be possible to get some improvement here in a real situation, but it is not quite as clear-cut as the numbers here suggest.

We should move to creating the UNIQUE index after loading the data. It might not have sped up your test, but I believe it would on some hardware with a different number of threads. Additionally, the resulting index is properly balanced without dead tuples in it.

> It looks like using COPY FREEZE can improve the performance. For this to work we have to create the table in the same transaction as we do the COPY. This is not easily possible with the current code, but it is a change we could do.

This would rule out ever having multiple threads writing to the middle at the same time. Do we want to do that?


@joto
Collaborator Author

joto commented Jan 2, 2025

I did some further tests:

All numbers reported here are for the ways table only. The simple COPY took 57 minutes. Tests were done using PostgreSQL 15. Supposedly there are some improvements in COPY performance in Pg 16, so we should also test with Pg 16 and/or 17.

**Using binary format**

COPY can be done with the text format (which we currently use) or a binary format. Using the binary format, the COPY time drops to 40 minutes, so we are about 30% faster. The binary format is not well documented, and the documentation says it might change, but as far as I can see it hasn't changed since PostgreSQL 7, so it is unlikely that this will be a problem. Even more so because we are only writing into the database, not reading, so we have more control over the format than if we had to parse the format the database generates.

This should also save us some time on the osm2pgsql side, because generating the binary format is probably faster than generating the text format. For geometries, which no longer have to be hex-encoded, the number of bytes transferred is only half, which should also help. I can see no real downside; we should consider switching.
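On the SQL side the switch is just a COPY option; the wire layout (PGCOPY header, per-tuple field count, length-prefixed field values) is described in the "Binary Format" section of the COPY documentation. Table name is a placeholder:

```sql
-- Text format (what osm2pgsql currently sends):
COPY planet_osm_ways FROM STDIN;

-- Binary format: same statement with a different option; the client then
-- sends the PGCOPY signature, a 16-bit field count per tuple, and each
-- field as a 32-bit length followed by raw bytes, all in network byte order.
COPY planet_osm_ways FROM STDIN WITH (FORMAT binary);
```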

**Parallel COPYs**

I tried simulating parallel COPYs by splitting a COPY file into 2 (or 4) pieces and doing 2 (or 4) COPYs simultaneously. This is with the text format. The import times were 33 minutes (or 23 minutes with 4 COPYs). It is hard to tell how well this would work in practice, but it is definitely worth a shot. The CPU usage of the postgres process doing the COPY went from 100% with one COPY to something like 70%, so it looks like we are no longer CPU-bound but are either I/O-bound or hitting some limit on locks or the like.

This might make the on-disk data layout less efficient though; maybe creating the index will take longer, or usage of the resulting table will be slightly slower.
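The simulation can be reproduced with line-aligned file chunks and one psql session per chunk (file and table names are placeholders):

```sql
-- Beforehand, split the dump into line-aligned pieces, e.g. with
-- GNU coreutils: split -n l/4 ways.copy ways_part_

-- psql session 1:
\copy planet_osm_ways FROM 'ways_part_aa'
-- psql session 2 (running at the same time):
\copy planet_osm_ways FROM 'ways_part_ab'
-- ...and so on; PostgreSQL accepts concurrent COPYs into the same table.
```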

**With FREEZE**

Using COPY FREEZE the import takes 49 minutes; using FREEZE and the binary format we are at 32 minutes, almost halving the current time.

Unfortunately, COPY FREEZE does not work together with parallel COPYs, because you need to create the table in the same transaction in which you do the COPY FREEZE. I tried using snapshot synchronization to overcome this, but it didn't work. Maybe I didn't do it right, but there is probably something in there that prevents this from working; my transactions always got rolled back when I tried.
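For the record, the snapshot-synchronization attempt looks roughly like this (placeholder names; the snapshot ID is just an example). It cannot work: the uncommitted CREATE TABLE from the first session is invisible to the second one, and even for a visible table COPY FREEZE insists that the table was created or truncated in the current (sub)transaction:

```sql
-- session 1: create the table and export a snapshot
BEGIN ISOLATION LEVEL REPEATABLE READ;
CREATE TABLE planet_osm_ways (id int8, nodes int8[] NOT NULL, tags jsonb);
SELECT pg_export_snapshot();  -- returns e.g. '00000004-0000002B-1'

-- session 2: adopt that snapshot and attempt a frozen COPY
BEGIN ISOLATION LEVEL REPEATABLE READ;
SET TRANSACTION SNAPSHOT '00000004-0000002B-1';
\copy planet_osm_ways FROM 'ways_part_aa' WITH (FREEZE)
-- fails: session 2 did not create the table in its own transaction
```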

Using COPY FREEZE would be simpler to implement than having multiple connections for parallel COPYs, so this is still something to consider.

**UNLOGGED TABLE**

Somewhere on the Internet I found the suggestion to create the table as an UNLOGGED TABLE, do the import, and then ALTER the table to LOGGED. This does not help: the COPY took about the same time, and the ALTER TABLE took quite some time, so this is a dead end.
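For completeness, the unlogged-table variant that was tested (placeholder schema):

```sql
-- Skip WAL during the load...
CREATE UNLOGGED TABLE planet_osm_ways (id int8, nodes int8[] NOT NULL, tags jsonb);
\copy planet_osm_ways FROM 'ways.copy'
-- ...then switch the table to logged. This rewrites/WAL-logs the whole
-- table, which is where the saved time was spent again in the test above.
ALTER TABLE planet_osm_ways SET LOGGED;
```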

@pnorman
Collaborator

pnorman commented Jan 2, 2025

> Unfortunately using COPY FREEZE does not work together with parallel COPYs, because you need to create the table in the same transaction that you do the COPY FREEZE in. I tried using snapshot synchronization to overcome this, but it didn't work. Maybe I didn't do it right, but there is probably something in there that prevents this from working, my transactions always got rolled back when I tried doing this.

I don't think parallel COPY FREEZE is possible with snapshot synchronization. The docs state:

> But note that any database changes made by any one of these transactions remain invisible to the other transactions, as is usual for changes made by uncommitted transactions. So the transactions are synchronized with respect to pre-existing data, but act normally for changes they make themselves.

The starting point is synchronized, but they're still separate transactions, which can diverge.

Do we need to worry about the gains from freezing tuples at COPY time? We have to run VACUUM ANALYZE on the table to update the free space map and get statistics for query planning. We should be running VACUUM FREEZE ANALYZE after tables are done, not just VACUUM ANALYZE. I'm wondering if that changes any of the numbers.

@joto
Collaborator Author

joto commented Jan 2, 2025

We are currently running ANALYZE on all tables after the import to update statistics. From what I understand we don't need VACUUM, because there can be no dead tuples to clean up. I don't know enough about the FSM to understand whether it needs updating after a from-scratch import!?

@pnorman
Collaborator

pnorman commented Jan 4, 2025

The advice I've had from PostgreSQL experts is that for optimal performance you should run VACUUM on a freshly loaded table. It does maintenance work other than dead-tuple cleanup: the visibility map, the free space map (FSM), and xid information are some of the things it updates.

However, although VACUUM does important work, I'm not sure any of it really matters for the typical osm2pgsql workload. The docs state:

> PostgreSQL's VACUUM command has to process each table on a regular basis for several reasons:
>
> 1. To recover or reuse disk space occupied by updated or deleted rows.
> 2. To update data statistics used by the PostgreSQL query planner.
> 3. To update the visibility map, which speeds up index-only scans.
> 4. To protect against loss of very old data due to transaction ID wraparound or multixact ID wraparound.

Point 1 does not matter because there are no updated or deleted rows at this point in time. Point 2 is taken care of by a separate ANALYZE. For point 3, the queries we run against the middle (or rendering) tables don't use index-only scans. Point 4 is a valid concern, but regular autovacuums will take care of it, and osm2pgsql DBs don't normally have enough transactions per second for wraparound to matter.

I still can't figure out whether the FSM matters for us or not.

@joto
Collaborator Author

joto commented Jan 6, 2025

I did some further tests regarding VACUUM:

  • When I run VACUUM tablename; after a COPY FREEZE, the VACUUM takes about half a second, so apparently it doesn't have to do anything.
  • After a normal COPY of the ways table (done in 55 minutes), it takes nearly 7 minutes to run VACUUM tablename;.

So not only is the COPY FREEZE itself faster, it also spares us the vacuum time. Normally we don't see that time, because it hits us randomly a bit later when autovacuum kicks in.
