-
-
Notifications
You must be signed in to change notification settings - Fork 475
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Very slow node parsing on updates (caused by PG's jit) #1045
Comments
What is your osm2pgsql command line, and does it happen with recent versions? |
I've checked the update logs we keep on each server which contains timing for each step (osmosis, osm2pgsql, tile expiration) So... I've compared the postgresql.conf files and there's only minor differences and nothing that could explain the change. |
Slowdown potentially related to fix for #1014
|
The full osm2pgsql command line... osm2pgsql diff.osc --number-processes=4 -a -C 64 -k -m -G -s -S ./osm2pgsql-choosen.style -d osm --flat-nodes /ssd/osm2pgsql/flat-nodes.raw --keep-coastlines --tag-transform-script ./config/activate-relation-type-waterway.lua --prefix=planet_osm I'll do more tests on my workstation... the servers have enough troubles to test news things on them ;) |
Here are some test results, all done on the same machine (24 cores, nvme SSD) on a France extract with 1 daily diff:
I've looked at postgres changelog, and maybe found something to look at. |
How did you upgrade form 10 to 12? Did you do a reimport or a database upgrade? Is this postgres 12.0 or 12.1? Did your tests use the same command line as above? If not do you mind running the same test against a standard import with the example lua to exclude that the issue is related to your specific database setup? |
Same problem here, seems to have something to do with changes introduced in PG 11, since problems with slow node parsing during updates were already reported on osm-dev mailing list after update from PG 10 to PG 11. I was doing a full planet import for tile rendering in a fresh PG 12.1 instance, this import ran as expected and didn't show decreased node parsing speed as when compared to previous full imports. Before we had osm2pgsql 0.9.4 and PG 9.6, daily updates took 2-3 hours, now they are up to more than 10 hours, due to slow node parsing. |
I've just check what was different in my PG config files between 10 and 12 and I've found what causes the slowdown... JIT !
just set it to
And I got 11s for the node parsing, like with PG10 :) |
On the rendering servers, I did the upgrade using pg_upgradecluster. But as you can see above, the cause it definitely not the upgrade, but the new jit feature of PG which has a very bad side effect on osm2pgsql prepared statements (I presume). I'm checking in a few minutes on the rendering servers if BINGO: confirmed :) |
Hm, I turned JIT off in our PG 12.1 instance, but couldn't see any positive impact on node parsing speed during updates, still at 0.1K/s. |
Have you reloaded PG config ? All my servers are ok since I changed+reloaded the config with jit=off |
Sure reloaded the config. Anything else in PG configuration, that may affect update performance? |
Make sure you ANALYZE the tables after a PG upgrade, this can really help the planner. FYI, I've submit a bug to the postgres dev team... here is the thread on the bugs mailing list: |
@cquest thanks for the follow up. I'm copying the query analysis over here:
My reading of this is that using prepared statements is not directly responsible for the slow down. The true culprit is the badly balanced way-node index. Even if we get rid of those and simply execute the SQL at the appropriate place, it still would try to use JIT when it shouldn't. Two things we should do here:
|
As mentioned above, turning off jit doesn't improve node update performance on my system.
This was with a daily update running in parallel. |
My understanding is that something is wrong at PG level: estimated plan cost. In my EXPLAIN ANALYSIS (with or withour jit) rows returned by the index bitmap scan are estimated at 300k where we only got 2. That's a factor of 150 000 x ! The guess from statistics seems plain wrong 100% of the time. This is causing PG to take bad decisions in the query planning, and activate jit when it should not, etc... to me it looks like bad statistics maintained (or computed) on GIN index or something like that, but I don't know enough about PG internals to do more than a guess. On @akara13 EXPLAIN, it's even worse (3.1M, 10x more) causing the planner to launch a parallel worker ! About improving the update process... I got a couple of ideas after looking at osm2pgsql source code to try to understand the original problem. Better to open a separate #1046 issue for that ;) |
Thanks @cquest for the suggestions. |
The estimates for the way-node index have always been widely off. It The index is so huge and badly balanced that postgres cannot make sense of it. It just never mattered because the query is simple enough that postgres always figured that it should use the index as opposed to a sequential scan, which were the only choices then. We are running now into trouble only because postgres tries to be even more clever and use such advanced features as workers and jit. There might be a way to influence the statistics. Newer versions of postgres have more advanced tools for that, see Postgresql Extended Statistics. |
I have analyzed planet_osm_ways, row estimate is still ridicously high at 3.1M. |
The way_nodes index is definitely a problem... so slow to build, larger than the data itself, bloating more than others... but hard to replace by something more efficient. I've not looked yet at how imposm deals with that. |
The row estimates being off is definitely an issue. There was a simple patch proposed on the Postgres bugs list, I was able to test it today and found it does wonders for the estimated row counts. I tested on a much smaller data set of just the Colorado region, the estimates were showing >10,000 when actual was 1. With the patch applied it estimates 68. Assuming the improvements in estimates scale to the larger data set this should help the planner choose the correct path regarding JIT / parallel query. |
They are known to have problems with the int arrays. See osm2pgsql-dev#1045
They are known to have problems with the int arrays. See osm2pgsql-dev#1045
#1256 disables JIT and parallel workers for the connections where intarray indexes are used. This should fix the slowness. We can't really require patches to Postgres for running osm2pgsql. |
On 2 worldwide rendering servers, I'm facing a very slow parsing of nodes during updates.
I've moved to postgresql 12 recently.
These servers are running 0.96 from the osmadmins ppa, with flatnodes and the whole PG database on SSD.
Here is a log extract:
Node parsing takes 90% of the overall update time.
Am I the only one ?
On a third worldwide rendering server, I'm still running a very old 0.86 and the node parsing is way faster !
The text was updated successfully, but these errors were encountered: