
Commit b95f80a

Guillaume-Barrier and iadjivon authored and committed
[error-tracking] Apply general doc update (#28724)
Co-authored-by: iadjivon <[email protected]>
1 parent f23f30b commit b95f80a

14 files changed

+129
-24
lines changed

content/en/error_tracking/dynamic_sampling.md

Lines changed: 1 addition & 1 deletion
@@ -41,7 +41,7 @@ A `Dynamic Sampling activated` event is generated when Dynamic Sampling is appli
4141
When Dynamic Sampling is applied, the following steps are recommended:
4242

4343
- Check which issue is consuming your quota. The issue to which Dynamic Sampling is applied is linked in the event generated in Event Management.
44-
- If you'd like collect additional samples for this issue, raise your daily quota on the [Error Tracking Rate Limits page][2].
44+
- If you'd like to collect additional samples for this issue, raise your daily quota on the [Error Tracking Rate Limits page][2].
4545
- If you'd like to avoid collecting samples for this issue in the future, consider creating an [exclusion filter][3] to prevent additional events from being ingested into Error Tracking.
4646

4747
## Further Reading

content/en/error_tracking/error_grouping.md

Lines changed: 7 additions & 3 deletions
@@ -21,9 +21,13 @@ Error Tracking intelligently groups similar errors into issues. This grouping is
2121

2222
The error stack trace is the code path followed by an error between being thrown and being captured by Datadog instrumentation. Error Tracking evaluates the topmost stack frame (the **location** of the error) and uses it to group the error.
2323

24-
If any stack-frame properties differ for two given errors, the two errors are grouped under different issues. For example, Error Tracking does not group issues across services or error types. Error Tracking also ignores numbers, punctuation, and anything that is between quotes or parentheses: only word-like tokens are used.
24+
If any stack frame properties differ for two given errors, the two errors are grouped under different issues. For example, Error Tracking does not group issues across services or error types. Error Tracking also ignores numbers, punctuation, and anything that is between quotes or parentheses: only word-like tokens are used.
2525

26-
**Note**: To improve grouping accuracy, Error Tracking removes variable stack-frame properties such as versions, ids, dates, and so on.
26+
<div class="alert alert-info">
27+
<strong>Tip:</strong> To ensure optimal grouping, enclose variables in your error messages in quotes or parentheses.
28+
</div>
29+
30+
**Note**: To improve grouping accuracy, Error Tracking removes variable stack frame properties such as versions, ids, dates, and so on.
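The token rule described above can be illustrated with a minimal sketch. This is not Datadog's implementation: the function name `grouping_tokens` and both regular expressions are assumptions made for illustration, and the real normalization is likely more involved. It only shows why wrapping variables in quotes or parentheses helps two messages reduce to the same word-like tokens.

```python
import re

# Hypothetical patterns: spans in single quotes, double quotes, or parentheses.
_VARIABLE_SPANS = re.compile(r"'[^']*'" r'|"[^"]*"' r"|\([^()]*\)")
# Hypothetical definition of a "word-like" token: letters and underscores only.
_WORD = re.compile(r"[A-Za-z_]+")


def grouping_tokens(message: str) -> list:
    """Drop quoted/parenthesized spans, then keep only word-like tokens."""
    without_variables = _VARIABLE_SPANS.sub(" ", message)
    return _WORD.findall(without_variables)


# Variables enclosed in quotes or parentheses disappear, so both messages match.
same_a = grouping_tokens('User "alice42" not found (id=7)')
same_b = grouping_tokens('User "bob" not found (id=9)')

# Without quotes, the user name leaks into the tokens and the messages differ.
diff_a = grouping_tokens("User alice42 not found")
diff_b = grouping_tokens("User bob not found")
```

In this sketch, `same_a` and `same_b` are equal while `diff_a` and `diff_b` are not, which is the effect the tip is aiming for.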
2731

2832

2933
## Custom Grouping
@@ -36,7 +40,7 @@ If `error.fingerprint` is provided, the grouping behavior follows these rules:
3640

3741
* Custom grouping takes precedence over the default strategy.
3842
* Custom grouping can be applied only to a subset of your errors and can coexist with the default strategy.
39-
* The content of `error.fingerprint` is used as-is without any modification.
43+
* The content of `error.fingerprint` is used as-is without any modification (although it is converted to a standardized fingerprint format).
4044
* Errors from the same service and with the same `error.fingerprint` attribute are grouped into the same issue.
4145
* Errors with different `service` attributes are grouped into different issues.
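As a rough sketch of the rules above, an issue can be thought of as keyed by the pair of service and fingerprint. The helper `issue_key` and the `default_fingerprint` callback are hypothetical names, and the sketch skips the conversion to a standardized fingerprint format mentioned above.

```python
from typing import Callable


def issue_key(error: dict, default_fingerprint: Callable[[dict], str]) -> tuple:
    """Return a (service, fingerprint) grouping key for one error event."""
    custom = error.get("error.fingerprint")
    # Custom grouping takes precedence; otherwise fall back to the default strategy.
    fingerprint = custom if custom else default_fingerprint(error)
    return (error["service"], fingerprint)


def by_error_type(error: dict) -> str:
    # Stand-in for the default strategy, for illustration only.
    return error.get("error.type", "unknown")


a = issue_key({"service": "web", "error.fingerprint": "checkout"}, by_error_type)
b = issue_key({"service": "web", "error.fingerprint": "checkout"}, by_error_type)
c = issue_key({"service": "api", "error.fingerprint": "checkout"}, by_error_type)
d = issue_key({"service": "web", "error.type": "TypeError"}, by_error_type)
```

Here `a` and `b` land in the same issue, `c` is separate because the service differs, and `d` falls back to the default strategy because no custom fingerprint is set.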
4246

content/en/error_tracking/explorer.md

Lines changed: 42 additions & 8 deletions
@@ -24,6 +24,12 @@ Each item listed in the Error Tracking Explorer is an issue that contains high-l
2424
- Graph of occurrences over time
2525
- Number of occurrences in the selected time period
2626

27+
Issues are also tagged as:
28+
- `New` if the issue was first seen less than two days ago and is in state **FOR REVIEW** (see [Issue States][5])
29+
- `Regression` if the issue was **RESOLVED** and occurred again in a newer version (see [Regression Detection][6])
30+
- `Crash` if the application crashed
31+
- Having a [Suspected Cause][3]
32+
2733
### Time range
2834

2935
{{< img src="real_user_monitoring/error_tracking/time_range.png" alt="Error Tracking Time Range" style="width:80%;" >}}
@@ -48,6 +54,36 @@ Click the Edit icon to see the list of available facets that you can show or hid
4854

4955
{{< img src="/error_tracking/error-tracking-facets.png" alt="Click the pencil icon to hide or show available Error Tracking facets from view." style="width:100%;" >}}
5056

57+
### Issue level filters
58+
59+
In addition to error events, Error Tracking offers issue level filters to refine the list of displayed issues.
60+
61+
{{< img src="error_tracking/issue-level-filters.png" alt="Issue level filters in Error Tracking" style="width:100%;" >}}
62+
63+
#### Sources
64+
65+
Error Tracking consolidates errors from multiple Datadog products (RUM, Logs, APM) into a unified view, allowing you to watch and troubleshoot errors across your entire stack. You can choose to display **All**, **Browser**, **Mobile**, or **Backend** issues in the explorer.
66+
67+
For more granular filtering, you can narrow down issues by specific log sources or by SDK and scope to a programming language.
68+
69+
#### Fix available
70+
71+
Display only issues that have an AI-generated fix available to quickly remediate problems.
72+
73+
#### Teams filters
74+
75+
Issue Team Ownership helps you quickly identify issues and focus on relevant errors by using Git `CODEOWNERS`. Datadog will automatically filter your issues so your team can cut through noise and prioritize what matters.
76+
77+
Issue ownership is derived from the `CODEOWNERS` files of your repositories. To use this feature, you need to link your Datadog teams to their GitHub counterparts. All errors coming from RUM and APM are eligible for Team Ownership.
78+
79+
#### Assigned to
80+
81+
Track and assign issues to yourself or the most knowledgeable team members, and easily refine the issue list by assignee.
82+
83+
#### Suspected Cause
84+
85+
[Suspected Cause][3] enables quicker filtering and prioritization of errors, empowering teams to address potential root causes more effectively.
86+
5187
## Inspect an issue
5288

5389
Click on any issue to open the issue panel and see more information about it.
@@ -64,21 +100,19 @@ The lower part of the issue panel gives you the ability to navigate error sample
64100

65101
## Get alerted on new errors
66102

67-
Seeing a new issue as soon as it happens gives you the chance to proactively identify and fix it before it becomes critical. Error Tracking generates a [Datadog event][1] whenever an issue is first seen in a given service and environment and, as a result, gives you the ability to be alerted in such cases by configuring [Event Monitors][2].
103+
Seeing a new issue as soon as it happens gives you the chance to proactively identify and fix it before it becomes critical. Error Tracking monitors allow you to track any new issue or issues that have a high impact in your systems or on your users (see [Error Tracking Monitors][7]).
68104

69-
Each event generated is tagged with the version, the service, and the environment so that you have a fine-grained control over issues you want to be alerted for. You can directly export your search query from the explorer to create an event monitor on the related scope:
105+
You can directly export your search query from the explorer to create an Error Tracking Monitor on the related scope:
70106

71107
{{< img src="/error_tracking/create-monitor.mp4" alt="Export your search query to an Error Tracking monitor" video=true >}}
72108

73-
## Suspected Cause
74-
75-
[Suspected Cause][3] enables quicker filtering and prioritization of errors, empowering teams to address potential root causes more effectively.
76-
77109
## Further Reading
78110

79111
{{< partial name="whats-next/whats-next.html" >}}
80112

81113
[1]: /events
82-
[2]: /monitors/types/event/
83114
[3]: /error_tracking/suspected_causes
84-
[4]: /real_user_monitoring/explorer/search/#event-types
115+
[4]: /real_user_monitoring/explorer/search/#event-types
116+
[5]: /error_tracking/issue_states
117+
[6]: /error_tracking/regression_detection
118+
[7]: /monitors/types/error_tracking

content/en/error_tracking/issue_states.md

Lines changed: 5 additions & 5 deletions
@@ -11,7 +11,7 @@ further_reading:
1111
All issues in Error Tracking have a status to help you triage and prioritize issues or dismiss noise. There are four statuses:
1212

1313
- **FOR REVIEW**: Ongoing and in need of attention because the issue is new or it's a regression.
14-
- **REVIEWED**: Triaged and needs to be fixed, either now or later.
14+
- **REVIEWED**: Triaged and needs to be fixed, either now or later.
1515
- **IGNORED**: Requiring no additional investigation or action.
1616
- **RESOLVED**: Fixed and no longer occurring.
1717

@@ -27,13 +27,13 @@ Error Tracking automatically marks issues as **REVIEWED** if one of the followin
2727
- The issue has been assigned
2828
- A case has been created from the issue
2929

30-
{{< img src="error_tracking/auto-review-actions.png" alt="Error Tracking automatic review actions" style="width:75%;" >}}
30+
{{< img src="error_tracking/auto-review-actions-2.png" alt="Error Tracking automatic review actions" style="width:75%;" >}}
3131

3232
## Automatic resolution
3333

3434
Error Tracking automatically marks issues as **RESOLVED** that appear to be inactive or resolved due to a lack of recent error occurrences:
3535

36-
- If the issue was last reported in a version that is more than 14 days old, and a newer version has been released but does not report the same error, Error Tracking automatically resolves the issue. Configure your services with version tags (see instructions for [APM][1], [RUM][2], and [Logs][3]) to ensure that automatic resolution accounts for versions of your services.
36+
- If the issue was last reported in a version that is more than 14 days old, and a newer version has been released but does not report the same error, Error Tracking automatically resolves the issue. Configure your services with version tags (see instructions for [APM][1], [RUM][2], and [Logs][3]) to ensure that automatic resolution accounts for versions of your services.
3737
- If `version` tags are not set up, Error Tracking automatically resolves an issue if there have been no new errors reported for that issue within the last 14 days.
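The two conditions above can be summarized in a short sketch. The function and parameter names are hypothetical, and measuring a version's age from the time it was first seen is an assumption; the documentation only states the 14-day window.

```python
from datetime import datetime, timedelta
from typing import Optional

INACTIVITY_WINDOW = timedelta(days=14)


def should_auto_resolve(
    now: datetime,
    last_error_at: datetime,
    last_version_first_seen: Optional[datetime] = None,
    newer_version_without_error: bool = False,
) -> bool:
    """Sketch of the automatic resolution conditions (names are assumptions)."""
    if last_version_first_seen is None:
        # No version tags: resolve after 14 days without new errors.
        return now - last_error_at > INACTIVITY_WINDOW
    # Version tags set: the last reporting version must be older than 14 days,
    # and a newer version must exist that does not report the same error.
    version_is_old = now - last_version_first_seen > INACTIVITY_WINDOW
    return version_is_old and newer_version_without_error


now = datetime(2025, 1, 31)
unversioned_stale = should_auto_resolve(now, last_error_at=datetime(2025, 1, 1))
unversioned_recent = should_auto_resolve(now, last_error_at=datetime(2025, 1, 25))
versioned_fixed = should_auto_resolve(
    now, datetime(2025, 1, 10), datetime(2025, 1, 1), newer_version_without_error=True
)
versioned_no_release = should_auto_resolve(
    now, datetime(2025, 1, 10), datetime(2025, 1, 1), newer_version_without_error=False
)
```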
3838

3939
## Automatic re-opening through regression detection
@@ -44,10 +44,10 @@ See [Regression Detection][4].
4444

4545
The issue status appears anywhere the issue can be viewed, such as in the issues list or on the details panel for a given issue. To manually update the status of an issue, click the status and choose a different one in the dropdown menu.
4646

47-
{{< img src="error_tracking/updating-issue-status.png" alt="The Activity Timeline in the Error Tracking Issue" style="width:100%;" >}}
47+
{{< img src="error_tracking/updating-issue-status-2.png" alt="The Activity Timeline in the Error Tracking Issue" style="width:100%;" >}}
4848

4949
## Issue history
50-
View a history of your issue activity with the **Activity Timeline**. On the details panel of any Error Tracking issue, view the Activity Timeline by clicking the **Activity** tab.
50+
View a history of your issue activity with the **Activity Timeline**. On the details panel of any Error Tracking issue, view the Activity Timeline by clicking the **Activity** tab.
5151

5252
{{< img src="error_tracking/issue-status-history.png" alt="The Activity Timeline in the Error Tracking Issue" style="width:80%;" >}}
5353

content/en/error_tracking/manage_data_collection.md

Lines changed: 66 additions & 6 deletions
@@ -19,15 +19,15 @@ You can define what data is included in Error Tracking in two ways:
1919
- [Rules](#rules-inclusion)
2020
- [Rate limits](#rate-limits)
2121

22-
You can configure both rules and rate limits on the [**Error Tracking** > **Settings**][1] page.
22+
You can configure both rules and rate limits on the [**Error Tracking** > **Settings**][1] page.
2323

2424
## Rules
2525

26-
Rules allow you to select which errors are ingested into Error Tracking. They apply to both billable and non-billable errors.
26+
Rules allow you to select which errors are ingested into Error Tracking. They apply to both billable and non-billable errors.
2727

2828
Each rule consists of:
2929
- A scope: an inclusion filter, which contains a search query, such as `service:my-web-store`.
30-
- Optionally, one or more nested exclusion filters to further refine the rule. For example, an exclusion filter might use the `env:staging` query to exclude staging errors.
30+
- Optionally, one or more nested exclusion filters to further refine the rule and ignore some of the matching events. For example, an exclusion filter might use the `env:staging` query to exclude staging errors.
3131

3232
A given rule can be toggled on or off. An error event is included if it matches a query in one of the active inclusion filters _and_ it does not match any active nested exclusion queries.
3333

@@ -39,6 +39,33 @@ Each error event is checked against the rules in order. The event is processed o
3939

4040
Rules are evaluated in order, with the evaluation stopping at the first matching rule. The priority of the rules and their nested filters depends on their order in the list.
4141

42+
{{% collapse-content title="Example" level="p" %}}
43+
Given a list of rules:
44+
- Rule 1: `env:prod`
45+
- Exclusion filter 1-1: `service:api`
46+
- Exclusion filter 1-2: `status:warn`
47+
- Rule 2: `service:web`
48+
- Rule 3 (this rule is disabled): `team:security`
49+
- Rule 4: `service:foo`
50+
51+
52+
{{< img src="error_tracking/error-tracking-filters-example.png" alt="Error Tracking Filters example of setup" style="width:75%;" >}}
53+
54+
The processing flow is as follows:
55+
{{< img src="error_tracking/error-tracking-filters-diagram-brand-design.png" alt="Error Tracking Filters" style="width:90%;" >}}
56+
57+
58+
An event with `env:prod service:my-service status:warn`
59+
- will match rule 1 and go to its exclusion filters
60+
- will not match exclusion 1-1 so will go to exclusion 1-2
61+
- at exclusion 1-2, it will be a match, so the event will be discarded
62+
63+
An event with `env:staging service:web`
64+
- will not match rule 1, so will go to rule 2
65+
- at rule 2, it will be a match, so the event will be kept
66+
67+
{{% /collapse-content %}}
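The evaluation order from the example above can also be sketched in code. This is a simplified model, not the product's implementation: the `Rule` class, `matches`, and `is_kept` are hypothetical names, and each query is reduced to a single `key:value` term, whereas real search queries are richer.

```python
from dataclasses import dataclass, field


def matches(query: str, event: dict) -> bool:
    """Match one simplified `key:value` term against an event's attributes."""
    key, _, value = query.partition(":")
    return event.get(key) == value


@dataclass
class Rule:
    query: str                                      # inclusion filter (scope)
    exclusions: list = field(default_factory=list)  # nested exclusion filters
    enabled: bool = True


def is_kept(event: dict, rules: list) -> bool:
    """Evaluate rules in order and stop at the first enabled rule that matches."""
    for rule in rules:
        if not rule.enabled or not matches(rule.query, event):
            continue
        # The first matching rule decides: any matching exclusion discards the event.
        return not any(matches(q, event) for q in rule.exclusions)
    return False  # no inclusion filter matched


RULES = [
    Rule("env:prod", exclusions=["service:api", "status:warn"]),
    Rule("service:web"),
    Rule("team:security", enabled=False),
    Rule("service:foo"),
]

discarded = is_kept({"env": "prod", "service": "my-service", "status": "warn"}, RULES)
kept = is_kept({"env": "staging", "service": "web"}, RULES)
skipped = is_kept({"env": "staging", "team": "security"}, RULES)
```

With these rules, `discarded` is `False` (excluded by filter 1-2), `kept` is `True` (matched by rule 2), and `skipped` is `False` because rule 3 is disabled.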
68+
4269
### Default rules
4370

4471
By default, Error Tracking has an `*` inclusion filter and no exclusion filters. This means all errors that meet the [requirements][2] to be fingerprinted are ingested into Error Tracking.
@@ -54,7 +81,7 @@ To add a rule (inclusion filter):
5481
6. Click **Save Changes**
5582
7. Optionally, reorder the rules to change their [evaluation order](#evaluation-order). Click and drag the six-dot icon on a given rule to move the rule up or down in the list.
5683

57-
{{< img src="logs/error_tracking/reorder_filters.png" alt="On the right side of each rule is a six-dot icon, which you can drag vertically to reorder rules." style="width:80%;">}}
84+
{{< img src="error_tracking/reorder-filters.png" alt="On the right side of each rule is a six-dot icon, which you can drag vertically to reorder rules." style="width:80%;">}}
5885

5986

6087
## Rate limits
@@ -71,22 +98,55 @@ To set a rate limit:
7198
1. Edit the **errors/month** field.
7299
1. Click **Save Rate Limit**.
73100

74-
{{< img src="logs/error_tracking/rate_limit.png" alt="On the left side of this page, under 'Set your Rate Limit below,' is a drop-down menu where you can set your rate limit." style="width:70%;">}}
101+
{{< img src="error_tracking/rate-limit.png" alt="On the left side of this page, under 'Set your Rate Limit below,' is a drop-down menu where you can set your rate limit." style="width:70%;">}}
75102

76103
A `Rate limit applied` event is generated when you reach the rate limit. See the [Event Management documentation][4] for details on viewing and using events.
77104

78105
{{< img src="logs/error_tracking/rate_limit_reached_event.png" alt="Screenshot of a 'Rate limit applied' event in the Event Explorer. The event's status is INFO, the source is Error Tracking, the timestamp reads '6h ago', and the title is 'Rate limit applied.' The event is tagged 'source:error_tracking'. The message reads 'Your rate limit has been applied as more than 60000000 logs error events were sent to Error Tracking. Rate limit can be changed from the ingestion control page. " style="width:70%;">}}
79106

80107
## Monitoring usage
81108

82-
You can monitor your Error Tracking on Logs usage by setting up monitors and alerts for the `datadog.estimated_usage.error_tracking.logs.events` metric, which tracks the number of ingested error logs.
109+
You can monitor your Error Tracking on Logs usage by setting up monitors and alerts for the `datadog.estimated_usage.error_tracking.logs.events` metric, which tracks the number of ingested error logs.
83110

84111
This metric is available by default at no additional cost, and its data is retained for 15 months.
85112

113+
## Dynamic Sampling
114+
115+
Because Error Tracking billing is based on the number of errors, large increases in the errors for a single issue can quickly consume your Error Tracking budget. Dynamic Sampling protects you by establishing a threshold for the error rate per issue based on your daily rate limit and historical error volumes, sampling errors when that threshold is reached. Dynamic Sampling automatically deactivates when the error rate of your issue decreases below the given threshold.
116+
117+
### Setup
118+
119+
Dynamic Sampling is automatically enabled with Error Tracking with a default intake threshold based on your daily rate limit and historical volume.
120+
121+
For best results, set up a daily rate limit on the [Error Tracking Rate Limits page][5]: Click **Edit Rate Limit** and enter a new value.
122+
123+
{{< img src="error_tracking/dynamic-sampling-rate-limit.png" alt="Error Tracking Rate Limit" style="width:90%" >}}
124+
125+
### Disable Dynamic Sampling
126+
127+
Dynamic Sampling can be disabled on the [Error Tracking Settings page][4].
128+
129+
{{< img src="error_tracking/dynamic-sampling-settings.png" alt="Error Tracking Dynamic Sampling Settings" style="width:90%" >}}
130+
131+
### Monitor Dynamic Sampling
132+
133+
A `Dynamic Sampling activated` event is generated when Dynamic Sampling is applied to an issue. See the [Event Management documentation][4] for details on viewing and using events.
134+
135+
{{< img src="error_tracking/dynamic-sampling-event.png" alt="Error Tracking Rate Limit" style="width:90%" >}}
136+
137+
#### Investigation and mitigation steps
138+
139+
When Dynamic Sampling is applied, the following steps are recommended:
140+
141+
- Check which issue is consuming your quota. The issue to which Dynamic Sampling is applied is linked in the event generated in Event Management.
142+
- If you'd like to collect additional samples for this issue, raise your daily quota on the [Error Tracking Rate Limits page][5].
143+
- If you'd like to avoid collecting samples for this issue in the future, consider creating an exclusion filter to prevent additional events from being ingested into Error Tracking.
144+
86145
## Further Reading
87146

88147
{{< partial name="whats-next/whats-next.html" >}}
89148

90149
[1]: https://app.datadoghq.com/error-tracking/settings/rules
91150
[2]: /error_tracking/troubleshooting/?tab=java#errors-are-not-found-in-error-tracking
92151
[4]: /service_management/events/
152+
[5]: https://app.datadoghq.com/error-tracking/settings/rate-limits

content/en/error_tracking/troubleshooting.md

Lines changed: 8 additions & 1 deletion
@@ -22,7 +22,9 @@ To be processed by Error Tracking, a span must have these attributes:
2222
- `error.message`
2323
- `error.stack`
2424

25-
**Note**: The stack must have at least two lines and one *meaningful* frame (a frame with a function name and a filename in most languages).
25+
<div class="alert alert-info">
26+
<strong>Note:</strong> The stack must have at least two lines and one <em>meaningful</em> frame (a frame with a function name and a filename in most languages).
27+
</div>
2628

2729
This [example query][5] searches for spans meeting the criteria for inclusion in Error Tracking.
2830

@@ -32,6 +34,10 @@ Error Tracking only processes errors that are sent with the source set to `custo
3234

3335
This [example query][6] shows RUM errors that meet the criteria for inclusion in Error Tracking.
3436

37+
### Inclusion/Exclusion filters
38+
39+
Make sure the errors you are looking for match at least one inclusion filter and no exclusion filters. Check your filter setup (see [Manage Data Collection][8] for more information).
40+
3541
## No error samples found for an issue
3642

3743
All errors are processed, but only retained errors are available in the issue panel as an error sample.
@@ -46,3 +52,4 @@ Spans associated with the error need to be retained with a custom retention filt
4652
[5]: https://app.datadoghq.com/apm/traces?query=%40_top_level%3A1%20%40error.stack%3A%2A%20AND%20%40error.message%3A%2A%20AND%20error.type%3A%2A%20
4753
[6]: https://app.datadoghq.com/rum/sessions?query=%40type%3Aerror%20%40error.stack%3A%2A
4854
[7]: https://app.datadoghq.com/error-tracking/settings
55+
[8]: /error_tracking/manage_data_collection/