
[DOCS] Enhance troubleshooting high cpu page. Opster migration #909


Open: wants to merge 12 commits into base `main`

@thekofimensah (Contributor) commented Mar 25, 2025

This PR imports additional information from the Opster docs on high CPU issues.

Summary of changes:

  1. Moved the “Hotspotting” paragraph at the beginning into the “Tips and Solutions” section for better contextual fit, incorporating extra details from the Opster documentation.
  2. Added new subsections to “Tips and Solutions,” specifically covering JVM garbage collection and oversharding.
  3. Separated the previous three points (“Scale your cluster,” “Spread out bulk requests,” and “Cancel long-running searches”) into a distinct section, as they are less common scenarios and more general best practices rather than direct solutions to typical high CPU issues.

This is the backport: elastic/elasticsearch#125558

@thekofimensah (Contributor, Author)

@leemthompo I'm still seeing the issue where the md redirect lands in the middle of the page. I think the new page is automatically scrolled to the same position that the link occupies in the original page.

It would be great if you could check the "hot spotting" or "data tier" link. If it's not just me, I can create a ticket.

@thekofimensah (Contributor, Author)

@georgewallace

Contributor, Author

@shainaraskas, I made the changes to the headings. I'm not sure whether an anchor such as "check-cpu-usage" needs to be more unique to this page.

I also rewrote the oversharding paragraph to remove that first sentence, and realized I needed to clarify it a bit.

@shainaraskas (Collaborator) left a comment

I think this is ok to go now, but I provided a couple of suggestions we could address while we're here


For optimal JVM performance, garbage collection should meet these criteria:

| GC Type | Completion Time | Occurrence Frequency |
Collaborator

these should be sentence case

Suggested change (current line, then proposed replacement):
| GC Type | Completion Time | Occurrence Frequency |
| GC type | Completion time | Frequency |

@@ -77,17 +75,61 @@ This API returns a breakdown of any hot threads in plain text. High CPU usage fr

The following tips outline the most common causes of high CPU usage and their solutions.
Collaborator

looking at this section, the first 3 items are not CPU usage reduction recommendations.

what would you think about breaking the section into "Common causes of high CPU usage" and "Reduce high CPU usage" so one has links to additional problem spaces and one has general recommendations?


### Oversharding [high-cpu-usage-oversharding]

Oversharding occurs when a cluster has too many shards, often times caused by shards being smaller than optimal. While Elasticsearch doesn’t have a strict minimum shard size, an excessive number of small shards can negatively impact performance. Each shard consumes cluster resources since Elasticsearch must maintain metadata and manage shard states across all nodes.
Collaborator

Suggested change (current line, then proposed replacement):
Oversharding occurs when a cluster has too many shards, often times caused by shards being smaller than optimal. While Elasticsearch doesn’t have a strict minimum shard size, an excessive number of small shards can negatively impact performance. Each shard consumes cluster resources since Elasticsearch must maintain metadata and manage shard states across all nodes.
Oversharding occurs when a cluster has too many shards, often times caused by shards being smaller than optimal. While {{es}} doesn’t have a strict minimum shard size, an excessive number of small shards can negatively impact performance. Each shard consumes cluster resources because {{es}} must maintain metadata and manage shard states across all nodes.
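As an illustration of the oversharding paragraph under review, the following is a minimal sketch of how one might count small shards and shards per node from a list of shard records. It is not part of the PR. The record keys (`index`, `node`, `store_bytes`), the function name `summarize_shards`, and the 1 GiB threshold are all assumptions chosen for the example. The paragraph itself states that there is no strict minimum shard size.

```python
from collections import Counter

# Hypothetical cutoff for calling a shard "small" (1 GiB).
# This value is an assumption for illustration, not a documented limit.
SMALL_SHARD_BYTES = 1 * 1024**3


def summarize_shards(shards, small_threshold=SMALL_SHARD_BYTES):
    """Summarize shard counts from a list of shard records.

    Each record is a dict with the hypothetical keys
    "index", "node", and "store_bytes" (shard size in bytes).
    """
    # Count how many shards each node holds.
    per_node = Counter(s["node"] for s in shards)
    # Collect shards whose on-disk size is below the threshold.
    small = [s for s in shards if s["store_bytes"] < small_threshold]
    return {
        "total": len(shards),
        "small": len(small),
        "small_ratio": len(small) / len(shards) if shards else 0.0,
        "per_node": dict(per_node),
    }


if __name__ == "__main__":
    # Made-up sample data, not taken from a real cluster.
    sample = [
        {"index": "logs-1", "node": "node-a", "store_bytes": 200 * 1024**2},
        {"index": "logs-2", "node": "node-a", "store_bytes": 150 * 1024**2},
        {"index": "logs-3", "node": "node-b", "store_bytes": 30 * 1024**3},
        {"index": "logs-4", "node": "node-b", "store_bytes": 10 * 1024**2},
    ]
    print(summarize_shards(sample))
```

A high `small_ratio` combined with a large `total` would be one rough signal worth investigating, though what counts as "too many" depends on the cluster and is not defined by this sketch.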


2 participants