Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Manual index configuration for id indexes? #1854

Closed
joto opened this issue Dec 19, 2022 · 8 comments
Closed

Manual index configuration for id indexes? #1854

joto opened this issue Dec 19, 2022 · 8 comments

Comments

@joto
Copy link
Collaborator

joto commented Dec 19, 2022

I noticed a problem with the new manual index creation options as per #1780: The id indexes are explicitly exempt from the configuration. Osm2pgsql will create an id index when it needs the index itself, i.e. when it creates a database for updates. Now if the user needs an index on the id column for some other reason everything is fine as long as we are creating a database for updates, the user can just use the index osm2pgsql creates for itself. But if the database is not created for updates, the user has to add that index to the configuration. But this means the configuration depends on whether the user creates an updatable database or not which is somewhat confusing and annoying for the user, especially because there is a good chance testing is done on a non-updatable database and then the final production import is updatable and the result are two (probably even identical) indexes on the id column.

@pnorman
Copy link
Collaborator

pnorman commented Dec 19, 2022

This is something I've run into a few times. I have ID indexes for dev work so I can fetch particular objects for testing, but have to manually create them.

@lonvia
Copy link
Collaborator

lonvia commented Dec 19, 2022

I see two possible solutions:

  1. Have a special index method that creates a id index and allow that do be added to the index definition, something like: ... indexes={{column='node_id', method='id-index'}} ....
  2. Add an option to the id definition: ... ids = { type = 'node', id_column = 'node_id', create_index = 'always'}, ....

The first has the advantage that all indexes are in the same place but it is less obvious what it means and has some potential to be used wrongly. So I'd go for the second option.

@joto
Copy link
Collaborator Author

joto commented Dec 19, 2022

@pnorman This issue is specific to id indexes. If you need other indexes only in some setups you have to write your Lua script accordingly and use, for instance, an environment variable to switch between different setups.

@pnorman
Copy link
Collaborator

pnorman commented Dec 20, 2022

I am talking about ID indexes.

@rustprooflabs
Copy link
Contributor

I see two possible solutions:

  1. Have a special index method that creates a id index and allow that do be added to the index definition, something like: ... indexes={{column='node_id', method='id-index'}} ....
  2. Add an option to the id definition: ... ids = { type = 'node', id_column = 'node_id', create_index = 'always'}, ....

The first has the advantage that all indexes are in the same place but it is less obvious what it means and has some potential to be used wrongly. So I'd go for the second option.

I prefer option two as well.

@joto
Copy link
Collaborator Author

joto commented Dec 22, 2022

The problem with (2) is that you can't use the more simple define_node/way/relation/area_table() functions any more but have to use define_table(). Maybe with all the extra things we have been doing on the table definitions the simplified functions aren't that great any more anyway and we want to deprecate them?

For option (2), that are the values for create_index?

  • Do we want to use native data types: true for always create, nil (default) for "create for updateable databases", maybe false for "never create" (which might be useful if you want to create them manually in some specific way?) or
  • use strings: "always", "never", "auto" (default)?

@lonvia
Copy link
Collaborator

lonvia commented Dec 22, 2022

I'd go with strings always and auto for now. never might just be asking for trouble.

joto added a commit to joto/osm2pgsql that referenced this issue Jan 19, 2023
If the user always needs the id index, they can force the index build
even in non-slim (or slim+drop) mode by setting the `create_index`
option of the `ids` setting in the define_table() Lua command to
`always`. The default is `autpo` which means: Only build the index in
slim mode.

See osm2pgsql-dev#1854

The test runner (steps_db.py) has been extended so that the
table_exists() check also checks for views, so that we can access the
data in the pg_catalog.pg_indexes view.
@joto
Copy link
Collaborator Author

joto commented Jan 25, 2023

This was implemented in #1894.

@joto joto closed this as completed Jan 25, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants