Description
Background
We've had an ecosystem running for a couple of weeks now, and there are 17,774 run records in the CouchDB database.
This volume was unexpected. After investigating, I found our RAS cleanup process wasn't working correctly; that is now resolved and will delete ~11k of the 17k+ records. HOWEVER, the reason I started looking into this was a huge performance slowdown for API calls to the ecosystem, often resulting in 504s and group runs failing within our polling CI run. Our processes absolutely hammer the API endpoints and we need them to be performant.
The ecosystem should be able to handle having 20k+ records in the database.
I think the key thing here is that we are constantly querying the DSS and RAS using:
- group
- runName
- runId
- from
- to
- detail=methods (we always want this)

via the RAS API, which will map to db fields (a sketch of the corresponding CouchDB query follows below).
...and then we often query via our Eclipse plugin on requestor and owner, then maybe on certain tags.
We also search regularly on streamName via the Streams API, which will relate to a db field.
Additionally on namespace & propertyName via the CPS API, which will relate to db fields.
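
For illustration, a runName filter at the API level corresponds to a CouchDB Mango `/_find` request roughly like the sketch below; without a matching index, CouchDB answers by scanning every document and flags this with a "warning" field in the response. The database name (`galasa_run`), the flat field name, and the server URL are assumptions here, not the actual RAS document schema.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

public class FindRunsByName {
    public static void main(String[] args) throws Exception {
        // Hypothetical database and field names; the real RAS documents may nest runName differently.
        String query = """
            { "selector": { "runName": "U134115" }, "limit": 100 }
            """;

        HttpRequest request = HttpRequest.newBuilder()
            .uri(URI.create("https://couchdb.example.com:5984/galasa_run/_find"))
            .header("Content-Type", "application/json")
            .POST(HttpRequest.BodyPublishers.ofString(query))
            .build();

        HttpResponse<String> response = HttpClient.newHttpClient()
            .send(request, HttpResponse.BodyHandlers.ofString());

        // With no index covering "runName", CouchDB still answers, but the body
        // includes a "warning" that no matching index was found, i.e. the query
        // scanned the whole database.
        System.out.println(response.body());
    }
}
```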
I suspect no form of indexing has been set up on any of the fields that are regularly queried by customers. This results in a full db scan, which would explain the poor performance. CouchDB sets up a default "primary index" on the document ID (i.e. runId), which explains why searching on that is very quick.
There are a number of ways to set up indexing in CouchDB, which are described well in this blog.
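
As a minimal sketch of the Mango (JSON) index route, something like the following could create an index covering the fields we filter on most. The database name, field names, and index name are assumptions; the real RAS documents may nest these fields, in which case the index definition would need the nested paths.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

public class CreateRunIndex {
    public static void main(String[] args) throws Exception {
        // Hypothetical database name, field names, and index name - adjust to the real RAS schema.
        String indexDefinition = """
            {
              "index": { "fields": ["runName", "group", "requestor"] },
              "name": "runName-group-requestor-index",
              "type": "json"
            }
            """;

        HttpRequest request = HttpRequest.newBuilder()
            .uri(URI.create("https://couchdb.example.com:5984/galasa_run/_index"))
            .header("Content-Type", "application/json")
            .POST(HttpRequest.BodyPublishers.ofString(indexDefinition))
            .build();

        HttpResponse<String> response = HttpClient.newHttpClient()
            .send(request, HttpResponse.BodyHandlers.ofString());

        // CouchDB replies with {"result":"created", ...} on success,
        // or {"result":"exists", ...} if an equivalent index is already present.
        System.out.println(response.body());
    }
}
```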
Some evidence:
- 10.3s (!!!) for a GET on https://<server>/api/ras/runs?runname=U134115
- 483ms for a GET on the same run, but using https://<server>/api/ras/runs?runId=cdb-db069dde-a163-40da-ae5e-ccd6910cf24d-1766079846823-U134115
- 8.6s for a GET on the group that contains the above run, using https://<server>/api/ras/runs?group=yueeYxJiFl
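
One way to confirm which index (if any) a given selector would use, and so pin down where the slow responses come from, is CouchDB's `/_explain` endpoint, which reports the chosen index without executing the query. Again, the database and field names below are assumed for the sake of the example.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

public class ExplainRunQuery {
    public static void main(String[] args) throws Exception {
        // Same hypothetical selector as the slow runname query above.
        String query = """
            { "selector": { "runName": "U134115" } }
            """;

        HttpRequest request = HttpRequest.newBuilder()
            .uri(URI.create("https://couchdb.example.com:5984/galasa_run/_explain"))
            .header("Content-Type", "application/json")
            .POST(HttpRequest.BodyPublishers.ofString(query))
            .build();

        HttpResponse<String> response = HttpClient.newHttpClient()
            .send(request, HttpResponse.BodyHandlers.ofString());

        // The response's "index" block names the index CouchDB would use;
        // "_all_docs" here means it would fall back to a full scan over the
        // primary (document ID) index.
        System.out.println(response.body());
    }
}
```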
Tasks
- <task>