Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Skip optimize #895

Closed
wants to merge 7 commits into from
Closed

Skip optimize #895

wants to merge 7 commits into from

Conversation

Komzpa
Copy link
Contributor

@Komzpa Komzpa commented Jan 7, 2019

Initially a rebase of patch by @thbeutin in #27.

Implements only skip for optimization, needs adding a skip for indexing.

@Komzpa Komzpa changed the title [wip] Skip optimize Skip optimize Jan 15, 2019
@Komzpa
Copy link
Contributor Author

Komzpa commented Jan 15, 2019

This works for me and makes life better, implementing #27.

Copy link
Collaborator

@lonvia lonvia left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I admit that I have some reservations about adding yet more options for relatively limited use-cases. For one this is because it is getting harder and harder to judge if the combination of parameters makes sense and doesn't have any unintended side-effects. The other reason is that there are a couple of other issues open that address something similar (e.g. #799) and I would rather see the pre- and postprocessing of the tables reconsidered as a whole. But I don't expect that to happen any time soon, so I guess we have to be realistic here.

@@ -211,96 +224,143 @@ void table_t::start()
void table_t::stop()
{
stop_copy();
if (!append)
{
// Post-procrssing for initial import only.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

typo

if (srid == "4326") {
/* libosmium assures validity of geometries in 4326, so the WHERE can be skipped.
Because we know the geom is already in 4326, no reprojection is needed for GeoHashing */
fprintf(stderr, "Sorting data in %s\n", name.c_str());
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This printf() looks like it belongs into the if{} below.

Because we know the geom is already in 4326, no reprojection is needed for GeoHashing */
fprintf(stderr, "Sorting data in %s\n", name.c_str());
if (!skip_optimizing) {
if (srid == "4326") {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To break down this horribly long code, move the contents of the if{} into its own private function.

fprintf(stderr, "Creating osm_id index on %s\n", name.c_str());
pgsql_exec_simple(sql_conn, PGRES_COMMAND_OK, (fmt("CREATE INDEX ON %1% USING BTREE (osm_id) %2%") % name %
(table_space_index ? "TABLESPACE " + table_space_index.get() : "")).str());
// We haven't removed invalid geometries as we did not perform a COPY.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The table has originally been created UNLOGGED because we know that it will be dropped here. If the sorting step is skipped, then the table needs to be a standard one.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But this can IMHO be forced on inital parameter checks the way that "--skip-(optimiz|index)ing" is only allowed when "--unlogged" is requested as well - after all, the user should know what he is doing by requesting it.

}

if (!skip_indexing) {
fprintf(stderr, "Creating geometry index on %s\n", name.c_str());
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This if{} too deserves its own private function.

@pnorman
Copy link
Collaborator

pnorman commented Jan 16, 2019

I admit that I have some reservations about adding yet more options for relatively limited use-cases. For one this is because it is getting harder and harder to judge if the combination of parameters makes sense and doesn't have any unintended side-effects. The other reason is that there are a couple of other issues open that address something similar (e.g. #799) and I would rather see the pre- and postprocessing of the tables reconsidered as a whole. But I don't expect that to happen any time soon, so I guess we have to be realistic here.

Do you think it would make more sense to completely skip all the Postgres post-processing? (i.e. everything after pending)

@lonvia
Copy link
Collaborator

lonvia commented Jan 16, 2019

Do you think it would make more sense to completely skip all the Postgres post-processing? (i.e. everything after pending)

I was more thinking that the user could supply a custom script for pre- and post-processing. The default ones would do exactly the same stuff as today (optimisation and index creation) and could then be adapted as required.

@lonvia
Copy link
Collaborator

lonvia commented Aug 7, 2019

Closing because of unresponsiveness of submitter.

@lonvia lonvia closed this Aug 7, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants