Issue page revamp for NRAI #19790

Merged
merged 11 commits into from
Feb 24, 2025
@@ -1,16 +1,9 @@
---
title: Issues and incident management and response
tags:
- Alerts
metaDescription: 'Read about how to analyze alert issues and incidents to determine the root cause of an issue.'
redirects:
- /docs/alerts-applied-intelligence/new-relic-alerts/alert-incidents/view-alert-incidents-our-products/
- /docs/alerts-applied-intelligence/applied-intelligence/incident-intelligence/Issues-and-Incident-management-and-response
freshnessValidatedDate: never
---

## Issues feed [#issues-feed]

The <DNT>**Issues feed**</DNT> page is where you can find an overview of all your issues, along with helpful information about them. You can also click any individual issue for more detail, including its analysis summary, event log, and details about correlated issues.

<img
@@ -90,6 +83,14 @@ We've created a quick demo here to walk you through the issue page user interfac
id="b13vkx69yv"
/>


### About Issue page [#about-issue-page]

<Collapser
id="issue page details"
title="About Issue page"
>

The <DNT>**Issue page**</DNT> includes the following sections:

* <DNT>**Issue payload:**</DNT> This provides you with the issue payload details and lets you copy the payload with a click of a button.
@@ -115,6 +116,8 @@ The <DNT>**Issue page**</DNT> includes the following sections:
src="/images/accounts_screenshot-crop_new-issue-page.webp"
/>

</Collapser>

## Postmortem [#postmortem-intro]

A postmortem is a retrospective process that teams use to analyze what worked and what didn't when responding to and resolving an incident.
@@ -123,7 +126,7 @@ In the New Relic platform, the postmortem feature is a tool that automatically c

The postmortem includes:

* the record of an incident, including descriptions
* The record of an incident, including descriptions
* A timeline of the incident
* The incident's impact
* The incident's root causes
@@ -132,30 +135,6 @@ The postmortem includes:

For detailed steps on creating a postmortem or to watch our walk-through demo, visit our [Postmortem documentation](/docs/alerts-applied-intelligence/applied-intelligence/postmortems-applied-intelligence/) page.

## Root cause analysis [#root-cause-analysis]

Root cause analysis automatically finds potential causes for an issue and its impacted entities. It shows you why open issues occurred, which deployments contributed, and relevant error logs and attributes. With this, you can investigate the problem and reduce your mean time to resolution (MTTR).

<Callout variant="tip">
Note that root cause analysis is dependent on other New Relic data sources and features. This is why root cause analysis information may not always be present for every issue.
</Callout>

<img
title="An example root cause analysis."
alt="A screenshot example root cause analysis"
src="/images/accounts_screenshot-crop_root-cause-analysis-.webp"
/>

<figcaption>
When you select an issue, you may see <DNT>**Root cause analysis**</DNT> information.
</figcaption>

Root cause analysis includes three main UI sections:

* <DNT>**Deployment events**</DNT>: When you set up deployments, we provide the deployment nearest to the issue creation. Changes, such as deployments, account for a high percentage of the root causes of incidents and having that information at hand can help diagnose and resolve issues.
* <DNT>**Error logs**</DNT>: You can explore millions of log messages with a single click and use manual querying to help you find anomalous patterns and hard-to-find problems.
* <DNT>**Attributes to investigate**</DNT>: We scan the distribution of attributes and surface possible causes by finding significant changes in the distribution. This section also shows changes in database and external metrics. You can also [query interesting attributes](/docs/query-your-data/nrql-new-relic-query-language/get-started/introduction-nrql-new-relics-query-language).

## Impacted entities and issue map

<img
@@ -233,4 +212,4 @@ To view the issues in a text format, in the right hand corner, click <DNT>**Swit

To further reduce noise or get improved incident correlation, you can change or customize your decisions. Decisions determine how incidents are grouped together.

To get started, see [Decisions](/docs/new-relic-one/use-new-relic-one/new-relic-ai/get-started-decisions).
To get started, refer to [Decisions](/docs/new-relic-one/use-new-relic-one/new-relic-ai/get-started-decisions).
@@ -0,0 +1,58 @@
---
title: Response intelligence with New Relic AI
metaDescription: 'Learn to reduce the mean time to resolve issues and incidents by using New Relic AI with response intelligence.'
freshnessValidatedDate: never
---

<Callout title="preview">
We're still working on this feature, but we'd love for you to try it out!

If you have questions or feedback, or if you need help during the preview of the .NET agent's <DNT>**Instrumentation**</DNT> editor, send an email to [[email protected]](mailto:[email protected]).

This feature is currently provided as part of a preview program pursuant to our [pre-release policies](/docs/licenses/license-information/referenced-policies/new-relic-pre-release-policy).
</Callout>

The Issue page, now integrated with New Relic AI, offers real-time insights to help reduce the mean time to resolution (MTTR) of issues and incidents. It brings together the key details of an incident into a single view, enabling you to quickly understand the context of an issue without navigating through multiple screens.

<img
title="The alerts issues feed."
alt="A screenshot of the alerts issues feed."
src="/images/accounts_screenshot-full_issue-feed.webp"
/>

The AI-powered issue page provides a concise summary that includes the affected entity, the severity of the issue, an explanation of the alert condition, and additional details to assist with debugging. Additionally, the issue page includes a new ‘overview’ tab with three widgets that address the most critical questions first responders ask when dealing with an issue.

**What’s impacted?**<br/>
First responders need to assess the "blast radius" to determine the severity of an issue and decide on the next steps. This widget provides an overview of the affected entities along with the impact to the end users of the application or service.

**What happened previously?**<br/>
Many IT issues tend to recur. Knowing whether an issue has happened before, why it occurred, and how it was resolved can save first responders valuable time during an incident. To support this, you can use the widget to link your existing retrospective or postmortem documents. Using retrieval-augmented generation (RAG), the New Relic AI platform will index and store this information for future contextual reference. Once configured, first responders will see a summary of similar past issues, along with links to the retrospective documents for detailed analysis.

**What to check?**<br/>
First responders often need contextual guidance on immediate actions to mitigate an issue. This widget provides customized steps to help them quickly restore services to normal operational levels. Additionally, the Potential causes tab identifies likely causes through causal analysis, covering a range of possible anomalies and performance issues. For more information, refer to causal analysis.

Review comment (Contributor):

After this paragraph, please add the following:

Overview
The causal analysis engine identifies potential symptoms that might have triggered an alert event and suggests immediate mitigating actions to address them.

Consider a scenario where a PHP application encounters a memory leak, leading to a failure in the throughput SLI and triggering an alert. Our engine investigates by moving from the service level to the APM application and then to the infrastructure container to detect the symptom.

How does the engine work? The causal analysis engine uses distinct analysis categories, such as deployment events, infrastructure resource limits, and more. Each category is designed to address various potential sources of anomalies and performance issues. These categories focus on specific data types and metrics, enabling precise analysis and more accurate identification of causal relationships.

At the moment, we’re primarily focused on APM entity causal analysis. In the near term, our plan is to include infrastructure, Browser, among other entity types.

Screenshot 2025-02-20 at 10 50 57 PM

Mitigating actions & visualizations

For every identified potential cause, the engine offers tailored mitigation actions that guide users through the necessary steps to quickly restore services and entities to their normal operational states. We recognize that many of our customers typically rely on NRQL to analyze significant queries, hence we provide relevant visuals alongside the underlying query for each cause.

New Relic AI generated analysis

In some scenarios, our causal engine may not identify an algorithm-driven cause. However, we have insights that can be utilized with LLMs to offer you actionable steps. Customers interested in this capability must have the New Relic AI entitlement enabled.


## Causal analysis
The causal analysis engine identifies potential symptoms that might have triggered an alert event and suggests immediate mitigating actions to address them.

Consider a scenario where a PHP application encounters a memory leak, leading to a failure in the throughput SLI and triggering an alert. Our engine investigates by moving from the service level to the APM application and then to the infrastructure container to detect the symptom.

How does the engine work? The causal analysis engine uses distinct analysis categories, such as deployment events, infrastructure resource limits, and more. Each category is designed to address various potential sources of anomalies and performance issues. These categories focus on specific data types and metrics, enabling precise analysis and more accurate identification of causal relationships.

At the moment, New Relic only supports causal analysis for APM entities.
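
The category-based approach described above can be pictured with a small sketch. Everything in it is a hypothetical illustration: the `Signals` fields, the two category checks, and the 30-minute and 90% thresholds are assumptions made for the example and do not describe New Relic's actual engine or any New Relic API.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Signals:
    """Hypothetical snapshot of data collected around an alert event."""
    minutes_since_last_deploy: Optional[float]  # None when no deployment is known
    container_memory_used_pct: float            # 0-100, share of the container limit


@dataclass
class PotentialCause:
    category: str
    summary: str


def check_deployment(s: Signals) -> Optional[PotentialCause]:
    # Assumed rule: a deployment in the last 30 minutes is worth flagging.
    if s.minutes_since_last_deploy is not None and s.minutes_since_last_deploy <= 30:
        return PotentialCause("deployment events", "Recent deployment near issue creation")
    return None


def check_memory_limit(s: Signals) -> Optional[PotentialCause]:
    # Assumed rule: memory at or above 90% of the limit suggests a resource problem.
    if s.container_memory_used_pct >= 90:
        return PotentialCause("infrastructure resource limits", "Container memory near its limit")
    return None


# Each analysis category looks at its own kind of data.
CATEGORIES: list[Callable[[Signals], Optional[PotentialCause]]] = [
    check_deployment,
    check_memory_limit,
]


def analyze(s: Signals) -> list[PotentialCause]:
    """Run every category check and collect the causes that fire."""
    return [cause for check in CATEGORIES if (cause := check(s)) is not None]
```

In this toy model, a memory-leak scenario like the PHP example would be represented as `analyze(Signals(None, 95.0))`, which reports a single potential cause in the infrastructure resource limits category.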

<img
title="The alerts issues feed."
alt="A screenshot of the alerts issues feed."
src="/images/potential_issues.webp"
/>

### Mitigating actions & visualizations

For every potential cause it identifies, the engine offers tailored mitigation actions that guide you through the steps needed to quickly restore services and entities to their normal operational states. Because many customers rely on NRQL for their analysis, we show relevant visualizations alongside the underlying query for each cause.
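
As a rough illustration of pairing a cause with its mitigation steps and underlying query, the sketch below models one such record in Python. The `CauseReport` class, the `throughput_query` helper, the mitigation text, and the app name `my-php-app` are all hypothetical; only the NRQL statement itself uses standard NRQL syntax, and it is one plausible throughput query, not necessarily the one the product shows.

```python
from dataclasses import dataclass, field


def throughput_query(app_name: str) -> str:
    """Build an example NRQL query charting per-minute throughput for one app."""
    return (
        "SELECT rate(count(*), 1 minute) FROM Transaction "
        f"WHERE appName = '{app_name}' SINCE 1 hour ago TIMESERIES"
    )


@dataclass
class CauseReport:
    """Hypothetical record: one potential cause, its steps, and its query."""
    cause: str
    nrql: str
    mitigation_steps: list[str] = field(default_factory=list)

    def render(self) -> str:
        # Present the cause, numbered mitigation steps, then the query behind the visual.
        lines = [f"Potential cause: {self.cause}", "Mitigation:"]
        lines += [f"  {i}. {step}" for i, step in enumerate(self.mitigation_steps, start=1)]
        lines += ["Underlying query:", f"  {self.nrql}"]
        return "\n".join(lines)


report = CauseReport(
    cause="Container memory near its limit",
    nrql=throughput_query("my-php-app"),  # hypothetical app name
    mitigation_steps=[
        "Restart the affected container",
        "Review recent changes for a memory leak",
    ],
)
print(report.render())
```

Keeping the query next to the cause means a responder can paste it into the query builder and adjust the time window or filters without reconstructing it from the chart.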

### New Relic AI generated analysis

In some scenarios, our causal engine may not identify an algorithm-driven cause. However, we have insights that, when combined with LLMs, can offer you actionable steps. To access this capability, you must have the New Relic AI entitlement enabled.

2 changes: 2 additions & 0 deletions src/nav/alerts.yml
@@ -89,6 +89,8 @@ pages:
pages:
- title: Issue and incident management and response
path: /docs/alerts/incident-management/Issues-and-Incident-management-and-response
- title: Intelligence response
path: /docs/alerts/incident-management/response-intelligence-ai
- title: Create postmortems
path: /docs/alerts/incident-management/postmortems-applied-intelligence
- title: View incidents and events
Binary file added static/images/potential_issues.webp
Binary file not shown.