
Performance of CouchDB is an issue once reaching 20k records - require indexing of regularly queried fields #2505

@Mark-J-Lawrence


Background

We've had an ecosystem running for a couple of weeks now, and there are 17774 run records in the CouchDB database.

This number was unexpected. After investigating, I found that our RAS cleanup process wasn't working correctly. That is now resolved and will delete ~11k of the 17k+ rows. However, the reason I started looking into this was a huge performance slowdown for API calls to the ecosystem, which would often result in 504s and in group runs failing within our polling CI run. Our processes absolutely hammer the API endpoints and we need them to be performant.

The ecosystem should be able to handle 20k+ records in the database.

I think the key thing here is that we are constantly querying the DSS and RAS using:

  • group
  • runName
  • runId
  • from
  • to
  • we always want detail=methods

via the RAS API, which will map to DB fields.

We also often query via our Eclipse plugin on requestor and owner, and sometimes on certain tags.

We also search regularly on streamName via the Streams API, which will relate to a DB field, and on namespace and propertyName via the CPS API, which will relate to DB fields.
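
To make that mapping concrete, here is a minimal sketch of how the RAS query parameters above might translate into a CouchDB Mango selector (the body of a `POST /{db}/_find` request). The document field names (`group`, `runName`, `requestor`, `queued`) and the idea that `from`/`to` bound a single timestamp field are assumptions for illustration, not taken from the actual schema.

```python
import json

# Hypothetical mapping from RAS API query parameters to document fields.
# The real field names in the run documents may differ.
PARAM_TO_FIELD = {
    "group": "group",
    "runName": "runName",
    "requestor": "requestor",
}


def build_selector(params, time_field="queued"):
    """Build a Mango selector dict from API-style query parameters."""
    selector = {}
    for param, field in PARAM_TO_FIELD.items():
        if param in params:
            selector[field] = {"$eq": params[param]}
    # 'from' and 'to' become one range condition on an assumed timestamp field.
    time_range = {}
    if "from" in params:
        time_range["$gte"] = params["from"]
    if "to" in params:
        time_range["$lte"] = params["to"]
    if time_range:
        selector[time_field] = time_range
    return selector


if __name__ == "__main__":
    # The group value is the one quoted in the evidence; the timestamp is a placeholder.
    query = {"group": "yueeYxJiFl", "from": "2025-01-01T00:00:00Z"}
    print(json.dumps({"selector": build_selector(query)}, indent=2))
```

Every field that appears in a selector like this is a candidate for an index; without one, CouchDB has to examine every document to evaluate the selector.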

I suspect no indexing has been set up on any of the fields that customers regularly query. Queries on those fields will result in a full database scan, which would explain their poor performance. CouchDB sets up a default "Primary Index" on the document ID (i.e. runId), which would explain why it's very quick when searching on that.
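
As one possible starting point, the sketch below builds Mango JSON index definitions and posts them to CouchDB's `POST /{db}/_index` endpoint. The index names, the field names, the database name passed to `create_index`, and the `auth_header` parameter are all hypothetical. Whether Mango indexes or map/reduce views suit the ecosystem better is an open question.

```python
import json
import urllib.request


def build_index_request(fields, name):
    """Return the JSON body for CouchDB's POST /{db}/_index endpoint."""
    return {
        "index": {"fields": list(fields)},
        "name": name,
        "type": "json",
    }


def create_index(base_url, db, body, auth_header=None):
    """POST one index definition to CouchDB and return the parsed response."""
    request = urllib.request.Request(
        f"{base_url}/{db}/_index",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    if auth_header:
        request.add_header("Authorization", auth_header)
    with urllib.request.urlopen(request) as response:
        return json.load(response)


# Hypothetical index names and field names, based on the fields listed above.
CANDIDATE_INDEXES = {
    "run-name-idx": ["runName"],
    "group-idx": ["group"],
    "requestor-idx": ["requestor"],
}

if __name__ == "__main__":
    # Print the request bodies only; call create_index() to actually send them.
    for name, fields in CANDIDATE_INDEXES.items():
        print(json.dumps(build_index_request(fields, name)))
```

After creating an index, posting the same selector to `POST /{db}/_explain` reports which index the query planner chose, which is a quick way to confirm that a query no longer falls back to a full scan.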

There are a number of ways to set up indexing in CouchDB, which are described well in this blog.

Some evidence:

  • 10.3s (!!!) for a GET on https://<server>/api/ras/runs?runname=U134115
  • 483ms for a GET on the same run but using https://<server>/api/ras/runs?runId=cdb-db069dde-a163-40da-ae5e-ccd6910cf24d-1766079846823-U134115
  • 8.6s for a GET on the group that contains the above run using https://<server>/api/ras/runs?group=yueeYxJiFl
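
To make these measurements repeatable before and after any indexing change, a small timing helper along the following lines could be used. This is only a sketch: the function names are made up for illustration, and authentication against the ecosystem is left out.

```python
import statistics
import time
import urllib.request


def time_calls(fetch, repeats=5):
    """Call fetch() repeatedly and return the elapsed seconds for each call."""
    timings = []
    for _ in range(repeats):
        start = time.perf_counter()
        fetch()
        timings.append(time.perf_counter() - start)
    return timings


def http_get(url):
    """Return a zero-argument callable that performs one GET and reads the body."""
    def fetch():
        with urllib.request.urlopen(url) as response:
            response.read()
    return fetch


def summarise(timings):
    """Reduce a list of timings to its median and worst case."""
    return {"median_s": statistics.median(timings), "max_s": max(timings)}
```

For example, `summarise(time_calls(http_get(url)))` with `url` set to one of the three requests above gives a median and worst-case latency that can be compared across runs.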

