Skip to content

Add DB indexes #155

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 3 commits into from
May 2, 2025
Merged

Add DB indexes #155

merged 3 commits into from
May 2, 2025

Conversation

RealOrangeOne
Copy link
Owner

@RealOrangeOne RealOrangeOne commented Apr 27, 2025

Closes #113
Fixes #140
Fixes #15

Add some indexes to improve performance. From various testing in the linked PRs, it should make quite a difference.

The ordering index seems to be the one which makes the most difference. The other indexes are mostly there to make other bits of filtering better.

@RealOrangeOne
Copy link
Owner Author

@kakulukia @c-soft @jonatron @eviljeff interested in your thoughts on this patch, given you've experienced these issues first-hand.

@RealOrangeOne RealOrangeOne force-pushed the add-db-indexes branch 2 times, most recently from ebd4572 to 26ee0f1 Compare April 27, 2025 16:33
@RealOrangeOne RealOrangeOne added the database-backend Issues relating to the database backend label Apr 27, 2025
@kakulukia
Copy link

I currently have the following indices set
sqlite> PRAGMA index_list('django_tasks_database_dbtaskresult');
0|django_tasks_database_dbtaskresult_run_after_2fa1dfb3|0|c|0
1|django_tasks_database_dbtaskresult_enqueued_at_0d65f00d|0|c|0
2|django_tasks_database_dbtaskresult_priority_5f260db0|0|c|0
3|django_tasks_database_dbtaskresult_status_3cbcbca3|0|c|0
4|sqlite_autoindex_django_tasks_database_dbtaskresult_1|1|pk|0

So i looks like we have a match on pk and priority. I figured as you are using DBTaskResult.objects.ready() that the status index should also be valuable. But i dont have the time to further test this during the coming weeks.

The other two are probably irrelevant.
Whats do the others say?

@RealOrangeOne
Copy link
Owner Author

If you try running with the indexes in this PR, do you get the same performance improvements you saw?

@kakulukia
Copy link

i will try to find some time for testing this in the next days - lets see

@jonatron
Copy link
Contributor

Seems ok for postgres, after run_after order is updated #154 .
For mysql, I've been having a look, experimenting with adding status to the index:
"status", *ordering,
and changing the ready query into a union:

        return self.filter(
            status=ResultStatus.NEW
        ).filter(
            models.Q(run_after=None),
        ).union(
            self.filter(
                status=ResultStatus.NEW,
            ).filter(
                models.Q(run_after__lte=timezone.now()),
            ),
            all=True,
        )

Although this doesn't work for sqlite. As mentioned in #113 (comment) , another workaround is make run_after not null.

@RealOrangeOne
Copy link
Owner Author

The NULLS LAST fix here should be good for all backends, even if it's a bit of a bodge. I'd hope that means your approach would now work for SQLite?

Copy link

@eviljeff eviljeff left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've not checked out and tested locally but lgtm

Comment on lines 115 to 116
# HACK: SQLite doesn't support NULLS LAST in indexes
models.Case(models.When(run_after=None, then=True), default=False),

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

great hack! I was playing around with Coalesce but this is cleaner

Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah I got some strange SQL syntax errors when I tried that. I need to try and distil them down to see if it's a bug in what I did or in the ORM itself.

F("priority").desc(),
# HACK: SQLite doesn't support NULLS LAST in indexes
models.Case(models.When(run_after=None, then=True), default=False),
F("run_after").desc(),

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this needs updating to .asc() after #154

@jonatron
Copy link
Contributor

This is so far the only way I've found to make the index be used on all of mysql, postgres and sqlite: master...jonatron:django-tasks:db-indexes-2 .
It's similar to this branch but makes run_after not null. I don't think mysql knows to use the index with models.Case, so this avoids it. There's probably a better way of translating None/null into DATE_MAX and vice versa.

@RealOrangeOne
Copy link
Owner Author

Yeah the addition of a Case in an index is probably strange for the query planner. I'll see about incorporating your changes in.

@jonatron
Copy link
Contributor

jonatron commented May 2, 2025

Checking if @adamchainz , the django mysql guy wants to advise

@RealOrangeOne
Copy link
Owner Author

@jonatron I decided to just pull in your changes, since they're basically exactly what I ended up with 😆. Nice work! ❤️

I don't think adding status in to the index is that much of a bodge, and definitely not a blocker at this stage.

@RealOrangeOne RealOrangeOne merged commit 6c67bf2 into master May 2, 2025
121 of 123 checks passed
@RealOrangeOne RealOrangeOne deleted the add-db-indexes branch May 2, 2025 15:21
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
database-backend Issues relating to the database backend
Projects
None yet
Development

Successfully merging this pull request may close these issues.

very bad performance with too many DBTaskResult entries Optimise DB worker queries
4 participants