From 9d5e00aa3ec220dbf3069181c02611116bae52d8 Mon Sep 17 00:00:00 2001 From: Michael Cretzman Date: Mon, 31 Mar 2025 10:09:47 -0700 Subject: [PATCH 01/15] first draft --- .gitignore | 2 + .../application_security/guide/_index.md | 1 + .../guide/manage_account_theft_appsec.md | 660 ++++++++++++++++++ 3 files changed, 663 insertions(+) create mode 100644 content/en/security/application_security/guide/manage_account_theft_appsec.md diff --git a/.gitignore b/.gitignore index 172387200e217..6391628b4cfec 100644 --- a/.gitignore +++ b/.gitignore @@ -274,3 +274,5 @@ local/bin/py/submit_github_status_check.py !local/bin/py/build/configuration/pull_config_preview.yaml !local/bin/py/build/configuration/pull_config.yaml !local/bin/py/build/configuration/integration_merge.yaml +go.mod +go.sum diff --git a/content/en/security/application_security/guide/_index.md b/content/en/security/application_security/guide/_index.md index 648216a1341d1..797fde0fafce7 100644 --- a/content/en/security/application_security/guide/_index.md +++ b/content/en/security/application_security/guide/_index.md @@ -10,4 +10,5 @@ disable_toc: true {{< whatsnext desc="Advanced Topics" >}} {{< nextlink href="/security/application_security/guide/standalone_application_security/" >}}Standalone Application Security{{< /nextlink >}} + {{< nextlink href="/security/application_security/guide/manage_account_theft_appsec/" >}}Managing account theft with ASM{{< /nextlink >}} {{< /whatsnext >}} diff --git a/content/en/security/application_security/guide/manage_account_theft_appsec.md b/content/en/security/application_security/guide/manage_account_theft_appsec.md new file mode 100644 index 0000000000000..a28bcefae0205 --- /dev/null +++ b/content/en/security/application_security/guide/manage_account_theft_appsec.md @@ -0,0 +1,660 @@ +--- +title: Managing Account Theft with ASM +disable_toc: false +--- + +Users are trusted entities in your systems with access to sensitive information and the ability to perform sensitive actions. Malicious actors have identified users as an opportunity to target websites and steal valuable data and resources. + +Datadog Application Security Management (ASM) provides [builtin][1] detection and protection capabilities to help you manage this threat. + +This guide describes how to use ASM to prepare for and respond to account takeover campaigns. This guide is divided into three phases: + +- [Phase 1: Collecting login information](#phase-1:-collecting-login-information) +- [Phase 2: Preparing for account takeover campaigns](#phase-2:-preparing-for-account-takeover-campaigns) +- [Phase 3: Reacting to account takeover campaigns](#phase-3:-reacting-to-account-takeover-campaigns) + +## Phase 1: Collecting login information + +To detect malicious patterns, ASM requires visibility into your users' login activity. This phase describes how to enable and validate this visibility. + +### Step 1.1: Ensure ASM is enabled on your identity service + +This step describes how to set up your service to use ASM. + +If your service is already using ASM, you can go to [Step 1.3: Validating whether login information is automatically collected](#step-1.3:-validating-login-information-is-automatically-collected). + +1. Go to [**Service Catalog**][2], click the **Security** lens, and search for your login service name. + + + +2. Click on the service to open its details. 
If the **Threat management** pill is green, ASM is enabled and you may move to [Step 1.3: Validating whether login information is automatically collected](#step-1.3:-validating-login-information-is-automatically-collected). + + + + If ASM isn't enabled, the panel displays the **Discover ASM** button. + + + + To set up ASM, move to [Step 1.2: Enabling ASM on login service](#step-1.2:-enabling-app-&-api-protection-on-your-login-service). + +### Step 1.2: Enabling ASM on your login service + +To enable ASM on your login service, ensure you meet the following requirements: + +* Similarly to Datadog APM, ASM requires a library integration in your services and a running Datadog agent. +* ASM generally benefits from using the newest library possible; however, minimum supported versions are documented in [Compatibility Requirements][3]. +* **Threat Detection** is required at a minimum, and **Automatic user activity event tracking** should also be enabled, ideally. + +To enable ASM using a new deployment, use the `APPSEC\_ENABLED` environment variable/library configuration or [Remote Configuration]. You can use either method, but Remote Configuration can be set up using the Datadog UI. + +To enable ASM using Remote Configuration, do the following: + +1. Go to [Remote Configuration][5]. +2. Click **Get Started with ASM**. +3. In **Threat Management**, click **Select Services.** +4. Select your service(s), and then click **Next** and proceed with the setup instructions. + +When you see traces from your service in [ASM Traces][6], move to [Step 1.3: Validating login information is automatically collected](#step-1.3:-validating-login-information-is-automatically-collected). + +For more detailed instructions on using a new deployment, see [Enabling ASM Threat Detection using Datadog Tracing Libraries][7]. + +### Step 1.3: Validating login information is automatically collected {#step-1.3:-validating-login-information-is-automatically-collected} + +After you have enabled ASM, you can validate that login information is collected by Datadog. + +**Note:** Once ASM is enabled on a service, wait a few minutes for users to log into the service or log into the service yourself. + +To validate login information is collected, do the following: + +1. Go to [Traces][8] in ASM. +2. Look for traces tagged with login activity from your login service. For example, in **Search for**, you might have `@appsec.security\_activity:business\_logic.users.login.\*`. +3. Check if all your login services are reporting login activity. You can see this in the **Service** facet. + + + +**If you don't see login activity from a service**, go to [Step 1.5: Manually instrumenting your services](#step-1.5:-manually-instrumenting-your-services). + +### Step 1.4: Validating login metadata is automatically collected {#step-1.4:-validating-login-metadata-is-automatically-collected} + +To validate that login metadata is collected, do the following: + +1. Go to [Traces][8] in ASM. +2. Look for traces tagged with successful and failed login activity from your login service. For example, in **Search for**, you might have all. +3. Open a trace. +4. In the trace details, is the **Security** tab, review **Business Logic Event**. +5. Check if the event has a false user. 
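
The login metadata shown in the **Business Logic Event** is attached by the Datadog tracing library, either automatically or through the SDK calls covered in [Step 1.5: Manually instrumenting your services](#step-1.5:-manually-instrumenting-your-services). As a point of reference, the following is a hedged Python sketch based on the `ddtrace` helpers documented for manual instrumentation; module paths and signatures differ by tracer language and version, and `user`, `submitted_login`, and `account_exists` are placeholders:

```python
from ddtrace import tracer
from ddtrace.appsec.trace_utils import track_user_login_failure_event
from ddtrace.appsec.trace_utils import track_user_login_success_event

def on_login_success(user):
    # The identifier passed here populates usr.id, which ASM uses for
    # user blocking and for investigation.
    track_user_login_success_event(tracer, user.id)

def on_login_failure(submitted_login, account_exists):
    # For failures, report the login that was attempted (usr.login) and
    # whether that account exists in your systems (usr.exists).
    track_user_login_failure_event(tracer, submitted_login, account_exists)
```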
+ + + +In the event of a **false** user (`usr.exists:false`), look for the following issues: + +- A single event: if the trace contains multiple login events, such as both successes and failures, go to [Step 1.5: Manually instrumenting your services](#step-1.5:-manually-instrumenting-your-services). +- If the event does not contain the mandatory metadata, it might appear as a user attribution section. The mandatory metadata is `usr.login` and `usr.exists` in the case of login failure, and `usr.id` in the case of login success. In this case, go to [Step 1.5: Manually instrumenting your services](#step-1.5:-manually-instrumenting-your-services). + +**If the instrumentation is correct, go to [Phase 2: Preparing for Account Takeover campaigns](#phase-2:-preparing-for-account-takeover-campaigns).** + +### Step 1.5: Manually instrumenting your services + +ASM collects login information and metadata using an SDK embedded in the Datadog libraries. Instrumentation is performed by calling the SDK when a user login is successful/fails and by providing the SDK with the metadata of the login. The SDK attaches the login and the metadata to the trace and sends it to Datadog where it is retained. + +**For an alternative to modifying the code of the service**, go to [Step 1.6: Remote instrumentation of your services](#step-1.6:-remote-instrumentation-of-your-services). + +To manually instrument your services, do the following: + +1. If auto-instrumentation is providing incorrect data (multiple events in a single trace), see [Disable auto-instrumentation][9]. + For detailed instrumentation instructions for each language, go to [Adding business logic information (login success, login failure, any business logic) to traces][10]. +2. Add the following metadata: + * `usr.login`: **Mandatory for login success and failure**. This field contains the *name* used to log into the account. The name might be an email address, a phone number, a username, or something else. The purpose of this field is to identify targeted accounts even if they don't exist in your systems because a user might be able to change those accounts. + * `usr.exists`: **Recommended for login failures**. This field is required for some default detections. The field helps to lower the priority of attempts targeted at accounts that don't exist in your systems. + * `usr.id`: **Recommended for login success and failure (if available)**. This field contains a unique identifier for the account. User blocking is based on this value. This field also helps with investigation. If no identifier is available (because the account doesn't exist), you don't need to populate this field. + +**After deploying the code, validate the instrumentation is correct by following the steps in** [Step 1.4: Validating login metadata is automatically collected](#step-1.4:-validating-login-metadata-is-automatically-collected). + +### Step 1.6: Remote instrumentation of your services + +ASM can use custom In-App WAF rules to flag login attempts and extract the metadata from the request needed by detection rules. + +This approach requires that [Remote Configuration][11] is enabled and working. Verify Remote Configuration is running in [Remote Configuration][12]. + +To use custom In-App WAF rules, do the following: + +1. Open the [In-App WAF custom rule creation form](https://app.datadoghq.com/security/appsec/in-app-waf?column=services-count&config_by=custom-rules&group_by=RULESET&order=desc&policy_by=rules&ruleColumn=modifiedAt&ruleId=newRule). +2. 
Name your rule and select the **Business Logic** category. +3. Set the rule type as `users.login.failure` for login failures and `users.login.success` for login successes. + +4. Select your service and write the rule to match the login attempts. Typically, you match the method (`POST`), the URI with a regex (`^/login`) and the status code (403 for failures, 302 or 200 for success). +5. Collect the tags required by detection rules. The most important tag is `usr.login`. Assuming the login was provided in the request, you can add a condition and set `store value as tag` as the operator. + +6. Select a specific user parameter as an input, either in the body or the query. +7. Set the `Tag` field to the name of the tag where we want to save the value captured using `usr.login`. + +8. Click **Save**. The rule is automatically sent to every instance of the service and will start capturing login failures. + +**To validate that the instrumentation is correct**, see [Step 1.4: Validating login metadata is automatically collected](#step-1.4:-validating-login-metadata-is-automatically-collected). + +For more details, see [Tracking business logic information without modifying the code][13]. + +## Phase 2: Preparing for Account Takeover campaigns + +After setting up instrumentation for your services, ASM monitord for attack campaigns. You can review the monitoring in the [Attacks overview][14] **Business logic** section. + + + +ASM detects [multiple attacker strategies][15]. Upon detecting an attack with a high level of confidence, the [built-in detection rules][16] generate a signal. + +The severity of the signal is set based on the urgency of the threat: from **Low** in case of unsuccessful attacks to **Critical** in case of successful account compromises. + +To fully leverage detections, take the following actions. + +### Step 2.1: Configuring notifications + +[Notifications][17] provide warnings when a signal is triggered. + +To create a notification rule using the [Create a new rule][18] setting, do the following: + +1. Open [Create a new rule][18]. +2. Enter a name for the rule. +3. Select **Signal** and remove all entries except **ASM**. +4. Restrict the rule to `category:account_takeover.` +5. Add notification recipients (Slack, Teams, PagerDuty). + To learn more, see [Notification channels][19]. +6. Save the rule. + + The notification is sent the next time a signal is generated. + +### Step 2.2: Validate proper data propagation + +In microservice environments, services are generally reached by internal hosts running other services. This internal environment makes it challenging to identify the unique traits of the attacker, such as IP, user agent, fingerprint, etc., and validate the data. + +[ASM Traces][20] can help to validate the data by exposing the source IPs and user agent traffic. + +To validate the data, do the following: + +1. Review login traces in the [Traces][21] and check for the following: +* Source IPs (`@http.client_ip`) are varied and public IPs. + * **Problem:** If login attempts are coming from a few IPs only, this might be a proxy that you can't block without risking availability. + * **Solution:** Forward the client IP of the initial request through a HTTP header, such as `X-Forwarded-For`. You can use a custom header for [better security][22] and configure the tracer to read it using the `DD_TRACE_CLIENT_IP_HEADER` environment variable. +* The user agent (`@http.user_agent`) is consistent with the expected traffic (web browser, mobile app, etc.) 
+ * **Problem:** The user agent could be replaced by the user agent in the calling microservice network library. + * **Solution:** Use the client user agent when calling subsequent services. + +### Step 2.3: Configure automatic blocking + +**Before you begin:** Verify that the IP addresses are properly configured, as described in [Step 2.2: Validate proper data propagation](#step-2.2:-validate-proper-data-propagation). + +ASM automatic blocking can be used to block attacks at any time of the day. Automatic blocking can help block attacks before your team members are online, providing security during off hours. + +You can configure automatic blocking to block IPs identified as part of an attack. This is only a partial remediation because attackers can change IPs; however, it can give you more time to implement comprehensive remediation. + +To configure automatic blocking, do the following: + +1. Go to **ASM** > **Protection** > [Detection Rules][23]. +2. In **Search**, enter `tag:"category:account_takeover"`. +3. Open the rules where you want to turn on blocking. Datadog recommends turning IP blocking on for **High** or **Critical** severity. +4. In the rule, in **Define Conditions**, in **Security Responses**, enable **IP automated blocking**. + You can control the blocking behavior per condition. Each rule can have multiple conditions based on your confidence and the attack success. + +**Datadog does not recommend permanent blocking of IP addresses**. Attackers are unlikely to reuse IPs and permanent blocking could result in blocking users. Moreover, ASM has a limit of how many IPs it can block (`~10000`), and this could fill this list with unnecessary IPs. + + + +## Phase 3: Reacting to account takeover campaigns + +This section describes common account takeover hacker behavior and how to triage, investigate, and monitor detections. + +### How attackers run their campaigns + +Eventually, your systems come under attack. The wave of malicious login attempts can often eclipse the volume of normal login activity the service is expecting. The load might increase causing availability problems and the attacker could at any time successfully log into an account. + +The actions the attackers take depend on their strategy and the configurations of your systems. Some attackers might decide to immediately abuse their access to extract value before you've had time to freeze their compromised accounts. Others might keep the accounts dormant until a later time. + +Many strategies are available, but it's important to understand that the value chain of attacks is often carefully divided: + +1. The actor who initiates the attack often buys a database of credentials from a vendor (likely acquired by the compromise of another service). +2. The actor procures a script designed to automate login attempts while evading detection (randomizing headers, trying to look as similar to normal traffic as possible). +3. The actor buys access to a botnet, letting them leverage many different IPs to run their attack. There are extreme cases where large campaigns with 500k+ attempts were so distributed that Datadog saw an average of 1.01 requests per IP and a single attempt per account. +4. When valid credentials are discovered, they might be sold downstream to another actor to leverage them to some end such as financial theft, spam, abuse, etc. + +Whenever an attack starts against your systems, signals are generated mentioning **Credential Stuffing**, **Distributed Credential Stuffing**, or **Bruteforce**. 
These signal terms are based on the strategy used by the attacker. + +### Step 3.1: Triage + +The first step is to confirm that the detection is correct. Certain behaviors, such as a security scan on a login endpoint or a lot of token rotation, might appear to the detection as an attack. The analysis depends on the signal, and the following examples provide general guidance that you'll need to adapt to your situation. + +{{< tabs >}} +{{% tab "Bruteforce" %}} + +The signal is looking for an attempt to steal a user account by trying many different passwords for this account. Generally, a small number of accounts are targeted by those campaigns. + +Review the accounts flagged as compromised. Click on a user to open a summary of recent activity. + +Questions for triage: + +* Has there been a sharp increase of activity? +* Are the IPs attempting logins known? +* Are they flagged by threat intelligence? + +If the answer to those questions is yes, the signal is likely legitimate. + +You can adapt your response based on the sensitivity of the account (for example, a free account without much access/secrets vs admin account). + +{{% /tab %}} + +{{% tab "Credential Stuffing" %}} + +This signal is looking for a large number of accounts with failed logins coming from a small number of IPs. This is often caused by unsophisticated attackers. + +Review the accounts flagged as targeted. + +If they share attributes, such as all coming from one institution, check whether the IP might be a proxy for this institution by reviewing its past activity. + +Questions for triage: + +* Has there been a sharp increase of activity? +* Are the accounts uncorrelated? +* Are IPs flagged by threat intelligence? +* Are there much more login failures than successes ? + +If the answer to those questions is yes, the signal is likely legitimate. +You can adapt your response based on the scale of the attack and whether accounts are being compromised. + +{{% /tab %}} + +{{% tab "Distributed Credential Stuffing" %}} + +This signal is looking for a large increase in the overall number of login failures on a service. This is caused by sophisticated attackers leveraging a botnet. + +Datadog tries to identify common attributes between the login failures in your service. This can surface defects in the attacker script that can be used to isolate the malicious activity. When found, a section called **Attacker Attributes** is shown. If present, review whether this is legitimate activity by selecting the cluster and clicking on **Explore clusters**. + +If accurate, the activity of the cluster should closely match the increase in login failures while also being very low/nonexistent before. +If no cluster is available, click **Investigate in full screen** and review the targeted users/IPs for outliers. + +If the list is truncated, click **View in App & API Protection Traces Explorer** and run the investigation with the Traces explorer. For additional tools, see [Step 3.3: Investigation](#step-33-investigation). + +{{% /tab %}} +{{< /tabs >}} + + +If the conclusion of the triage is that the signal is a false positive, you can flag it as a false positive and close it. + +If the false positive was caused by a unique setting in your service, you might introduce suppression filters to silence them out. + +**If the signal is legitimate**, move to step [Step 3.2: Preliminary response](#step-3.2:-disrupting-the-attacker-as-a-preliminary-response). 
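
The triage questions above (for example, whether login failures far outnumber successes, or how many distinct accounts are targeted) can also be answered programmatically when the signal panels truncate their lists. The following is a hedged sketch using the Spans aggregation API referenced later in this guide; the endpoint and payload shape follow the public Spans API reference, and the query, time range, and keys are placeholders to adapt to your attack window:

```python
import requests

HEADERS = {
    "DD-API-KEY": "<your_api_key>",          # placeholder credentials
    "DD-APPLICATION-KEY": "<your_app_key>",
    "Content-Type": "application/json",
}

body = {
    "data": {
        "type": "aggregate_request",
        "attributes": {
            # Count login events during the suspected attack window...
            "filter": {
                "query": "@appsec.security_activity:business_logic.users.login.*",
                "from": "now-4h",
                "to": "now",
            },
            "compute": [{"aggregation": "count", "type": "total"}],
            # ...grouped by event type (login success vs. failure). Swap the
            # facet to @appsec.events_data.usr.login to list targeted accounts.
            "group_by": [{"facet": "@appsec.security_activity", "limit": 100}],
        },
    }
}

response = requests.post(
    "https://api.datadoghq.com/api/v2/spans/analytics/aggregate",
    headers=HEADERS,
    json=body,
    timeout=30,
)
response.raise_for_status()
for bucket in response.json().get("data", []):
    print(bucket)
```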
+ +### Step 3.2: Disrupting the attacker as a preliminary response + +If the attack is ongoing, you might want to disrupt the attacker as you investigate further. Disrupting the attacker will slow down the attack and reduce the number of compromised accounts. + +**Note:** This is a common step, although you might want to skip this step in the following circumstances: + +* Accounts aren't immediately valuable: you can block compromised accounts after the fact with no negative consequences. +* You want to maintain the maximum visibility on the attack by avoiding preventing the attacker from learning that an investigation is ongoing and changing their strategy to something more difficult to track. + +Enforcing this preliminary response requires [Remote Configuration][11] is enabled for your services. + +If you want to initiate a partial response, do the following: + +{{< tabs >}} +{{% tab "Bruteforce or Credential Stuffing" %}} + +The attackers are likely using a small number of IPs. To block them, open the signal and use Next Steps. You can set the duration of blocking. + +We recommend **12h**, which will be enough for the attack to stop and avoid blocking legitimate users when, after the attack, those IPs get recycled to legitimate users. We do not recommend permanent blocking. +You can also block compromised users, although a better approach would be to extract them and reset their credentials using your own systems. +Finally, you can introduce automated IP blocking while running your investigation. + +{{% /tab %}} + +{{% tab "Distributed Credential Stuffing" %}} + +Those attacks often rely on a large number of disposable IPs. The latency from the Datadog platform makes it impractical to block login attempts by blocking the IP before the IP gets dropped from the attacker's pool. + +Instead, block traits of the request that are unique to the malicious attempt (a user agent, a specific header, a fingerprint, etc.). + +In a **Distributed Credential Stuffing campaign** signal, Datadog automatically identifies clear traits and presents them as **Attacker Attributes**. + +Before blocking, we recommend that you review the activity from the cluster to confirm that the activity is indeed malicious. + +The questions you're trying to answer are: + +- Is the traffic malicious? +- Will a meaningful volume of legitimate traffic be caught? +- Will blocking based on this cluster be effective? + +To do so, select your cluster and click on **Explore clusters**. + +The **Investigate** explorer appears and provides cluster traffic indicators: a large share of the traffic from the attack and a high proportion of IPs flagged by Threat Intelligence. + +Those are two important indicators: + +- Threat Intel % +- Traffic Distribution + +Click an indicator to see further information about the cluster traffic. + +In **Cluster Activity**, there is a visualization of the volume of the overall APM traffic matching this cluster. + +In the following example, a lot of traffic comes from before the attack. This means a legitimate activity matches this cluster in normal traffic and it would get blocked if you were to take action. You don't need to escalate or click **Block All Attacking IPs** in the signal. + +In a different example, the activity from the cluster started with the attack. This means there shouldn't be collateral damage and you can proceed to block. + + + +After confirming that the traits match the attackers, you can push an In-App WAF rule that will block requests matching those traits. 
Currently, this is supported for user agent-based traits only. + +To create the rule, do the following: + +1. Go to **ASM** > **In-App WAF** > [Custom Rules][24]. +2. Click **Create New Rule** and complete the configuration. +3. Select your login service (or a service where you want to block the requests). You can target blocking to the login route also. +4. Configure the conditions of the rule. In this example, the user agent is used. If you want to block a specific user agent, you can paste it with the operator `matches value in list`. If you want more flexibility, you can also use a regex. +5. Use the **Preview matching traces** section as a final review of the impact of the rule. + +If no unexpected traces are shown, select a blocking mode and proceed to save the rule. The response will be automatically pushed to tracers. You'll soon see blocked traces appear in the Trace Explorer. + +Multiple blocking actions are available, each more or less obvious. Depending on the sophistication of the attackers, you might want a more stealthy answer so that they don't immediately realize they were blocked. + + +{{% /tab %}} +{{< /tabs >}} + +### Step 3.3: Investigation + +When you have [disrupted the attacker as a preliminary response](#step-3.2:-disrupting-the-attacker-as-a-preliminary-response), you can identify the following: + +- Accounts compromised by the attackers so you can reset their credentials. +- Hints about the source of the targeted accounts you can use for proactive password reset or higher scrutiny. +- Data on the attacker infrastructure you can use to catch future attempts or other malicious activity (credit card stuffing, abuse, etc.). + +The first step is to isolate the attacker activity from the overall traffic of the application. + +#### Isolate attacker activity + +To isolate attacker activity, ensure that your current filters are exhaustive: + + +1. Go to [Traces][25], and then *exclude* traces so that the remaining traffic closely tracks your normal traffic. If you're still seeing a spike during the attack, it means further filters are necessary to comprehensively neutralize the attack. +2. Look at the matching traffic over an expanded time frame (for example, if the attack lasted an hour, use one day). Any traffic before or after the attack is likely be a false positive. + +Next, start by isolating the attack's activity. + +{{< tabs >}} +{{% tab "Bruteforce" %}} + +Extract the list of targeted users by going to [Signals][26]. + + + +To craft a query to review all the activity from targeted users, follow this template: + +`@appsec.security_activity:business_logic.users.login.* @appsec.events_data.usr.login:()` + +Successful logins should be considered very suspicious. + +{{% /tab %}} + +{{% tab "Credential Stuffing" %}} + +This signal flagged a lot of activity coming from a few IPs. This signal is closely related to its distributed variant. You might need to use the distributed credential stuffing method. + +Start by extracting a list of suspicious IPs from the signal side panel + + + +To craft a query to review all the activity from suspected IPs, follow this template: + +`@appsec.security_activity:business_logic.users.login.* @http.client_ip:()` + +Successful logins should be considered highly suspicious. + +{{% /tab %}} + +{{% tab "Distributed Credential Stuffing" %}} + +This signal flagged a large increase in login failures in one service. If the attack is large enough, this signal might trigger both Bruteforce or Credential Stuffing signals. 
The signal is also able to detect very diffuse attacks. + +In the diffuse attacks case, attacker attributes are available in the signal. + + + +1. Click **Investigate in full screen**. +2. In **Attacker Attributes**, select the cluster and click, then, in **Traces**, click **View in ASM Protection Trace Explorer**. + +This will get you to the trace explorer with filters set to the flagged attributes. You can start the investigation with the current query but will likely want to expand it to also match login successes on top of the failures. Review the exhaustiveness/accuracy of the filter using the technique described above (in the paragraph before the table). + + + +In the case those attributes are inaccurate/incomplete, you may try to identify further traits to isolate the attacker activity. The traits we historically found the most useful are: + +1. User agent: `@http.user_agent` +2. ASN: `@http.client_ip_details.as.domain` +3. Threat intelligence: `@threat_intel.results.category` +4. URL: `@http.url` +5. Fingerprint, when available: `@appsec.fingerpring.*` + + + +You may use Top List or Timeseries to identify the traits whose distribution most closely matches the attack. + +You may need multiple sets of filters, each possibly including multiple traits. Behind the scenes, the attacker may be using multiple randomized templates. This work identifies the constants in those templates. + +{{% /tab %}} +{{< /tabs >}} + +#### Review login successes and failures + +Reviewing login successes and failures helps to identify the following: + +* What the attackers are after so that you can block them. +* What the attackers are doing so that you can catch them, even if they change their scripts. +* How successful the attackers are so that you can take back the accounts where they took control and see how much time you have to react. + +When attacker activity is isolated, review login successes and consider the following questions: + +* Have any accounts been compromised? +* Are attackers doing something with their compromised accounts or are they leaving them dormant? +* Are the accounts accessed by a different infrastructure? +* Is there any past activity from this infrastructure? + +For the login failures, consider the following questions: + +* Are attackers targeting a specific subset of users? +* How successful are they? The accuracy of the attacks should be in the 1/100-1/1000 range. +* Are they defeating captchas or multifactor authentication? + +As your investigation progresses, you can go back and forth between this step and the next as you're ready to enforce a response based on your findings. + +### Step 3.4: Response + +Datadog's investigation capabilities are enriched by data from our Backend, which isn't available to the library running the response. Because of that, not all fields are compatible with enforcing a response. + +Motivated attackers try to circumvent your response as soon as they become aware of it. In anticipation of this approach, do the following: + +1. Ensure you don't lose visibility on the attack. +2. Make blocking as hard as possible to *identify* by the attacker. For example, make the blocking response the same as your login failure. This can confuse attackers and lead them to believe their attack is still successful. +3. Make blocking as hard as possible to *circumvent* by the attacker. Use subtle traits, such as specific header values, instead of IPs. 
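
As an illustration of the second point, the login endpoint itself can answer requests carrying the attacker's traits with exactly the same response as a genuine failed login, so the attacker cannot tell they are blocked. The sketch below is illustrative only: the framework, the trait values, and the `BLOCKED_TRAITS` feed (for example, populated from Datadog signal notifications or an exported list) are assumptions, not a built-in integration:

```python
from flask import Flask, jsonify, request

app = Flask(__name__)

# Hypothetical traits identified during the investigation (IPs, user agents,
# header values). In practice this set would be refreshed from your own
# tooling, for example a consumer of Datadog signal notifications.
BLOCKED_TRAITS = {"203.0.113.10", "PseudoClient/3.14"}

def matches_attacker_traits(req) -> bool:
    client_ip = req.headers.get("X-Forwarded-For", req.remote_addr or "")
    user_agent = req.headers.get("User-Agent", "")
    return client_ip in BLOCKED_TRAITS or user_agent in BLOCKED_TRAITS

@app.post("/login")
def login():
    if matches_attacker_traits(request):
        # Indistinguishable from a real failed login: same status code and body.
        return jsonify({"error": "invalid username or password"}), 401
    # ... normal authentication logic ...
    return jsonify({"status": "ok"}), 200
```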
+ +You can either use Datadog's built-in blocking capabilities to deny any request that matches some criteria, or export the data automatically to one of your systems to perform a response (credentials reset, mimic login failures upon blocking, etc.). + +### Datadog blocking + +Users that are part of traffic blocked by Datadog will see a **You're blocked** page, or they receive a custom status code, such as a redirection. Blocking can be applied through two mechanisms, each with different performance characteristics: the Denylist and custom WAF rules. + + + +#### Denylist + +The [Denylist][27] is an efficient way to block a large number of entries, but is limited to IPs and users. If your investigation uncovered a small set of IPs responsible for the attack (`<1000`), blocking these IPs is the best course of action. + +The Denylist can be managed and automated using the Datadog platform by clicking **Automate Attacker Blocking** in the signal. + +Use the **Automate Attacker Blocking** or **Block All Attacking IPs** signal options to block all attacking IPs for a few hours, a week, or permanently. Similarly, you can block compromised users. + + + +The blocking can be rescinded or extended from the [Denylist][27]. + + + +If the signal wasn't accurate, you can extract the list and add it to the Denylist manually. + + + +#### In-App WAF rules + +If the Denylist isn't sufficient, you can create a WAF rule. A WAF rule evaluates slower than the Denylist, but it is more flexible. To create the rule, go to **ASM** > **Protection** > **In-App WAF** > [Custom Rules][28]. + + + +To create a new rule, do the following: + +1. Go to **ASM** > **Protection** > **In-App WAF** > [Custom Rules][28]. +2. Click **Create New Rule**. +3. Follow the steps in **Define your custom rule**. +4. In **Select the services you want this rule to apply to**, select your login service, or whichever services where you want to block requests. You can also target the blocking to the login route. + +1. In **If incoming requests match these conditions**, configure the conditions of the rule. The following example uses the user agent. + 1. If you want to block a specific user agent, you can paste it in **Values**. In **Operator**, you can use **matches value in list**, or if you want more flexibility, you can also use a **Matches RegEx**. + 2. Use the **Preview matching traces** section as a final review of the rule's impact. If no unexpected traces are shown, select a blocking mode and save the rule. + + The response is pushed to the Traces explorer automatically and blocked traces appear. + + + +Multiple blocking actions are available. Depending on the sophistication of the attackers, you might want a stealthier response so that attackers don't immediately realize they were blocked. + +For more information, see [In-App WAF Rules][30]. + +### Step 3.5: Monitor + +After the attacker introduces the response, they might suspend or adapt their attack. Keep monitoring the rate of login attempts after introducing the response, especially failures. Attacks might drop off only to resume after a few minutes, hours, or days. + +If a large-scale attack resumes, the Distributed Credential Stuffing signal should re-execute. In this case, review the following considerations: + +* Persistent attackers often require multiple iterations of defensive measures before giving up. +* The ideal defense is a robust blocking strategy that the attacker cannot circumvent. +* Attackers frequently attempt to evade detection by altering IPs and user agents. 
+* Effective strategies include fingerprint-based or correlation methods that identify rare header combinations. +* Monitor blocked traffic resulting from previous defensive responses. +* Blocking attacker traffic may inadvertently block legitimate traffic. Implement mechanisms to unblock legitimate traffic, either adapt the Datadog response or ensure it is unblocked post attack. + +### Step 3.6: Cleanup + +After a few days with no significant attacker activity, you might consider the attack over and move to a cleanup phase. + +The goals of the cleanup phase are the following: + +- Disable any mitigations that were added. +- Ensure no legitimate traffic is blocked. +- Identify opportunities to harden services against future attacks. +- Identify the source of the data the attacker used against users. + +#### Disabling mitigations + +User blocking should be based on the timer you set when you selected **Block All Attacking IPs** in the signal. This user blocking configuration doesn't require any further action. + +If you configured permanent blocking, unblock users and IPs from the Denylist by doing the following: + +1. Open the [Denylist][27]. +2. Click **Blocked IPs** or **Blocked users**. +3. In the entity list, locate the IP or user, and then click **Unblock**. + + + +Disable or delete any custom In-App WAF rule(s). + +To disable or delete In-App WAF rule(s), in [custom In-App WAF rules][28], disable the rules by clicking on **Monitoring** or **Blocking**, and selecting **Disable Rule**. + +If the rule is no longer relevant, you can delete it by clicking more options (**...**) and selecting **Delete**. + + + +#### Validate no legitimate traffic is blocked + +To validate that no legitimate traffic is blocked, the volume of traffic should match that of the attack closely, with virtually no blocked traces outside the main waves. + +To validate that no legitimate traffic is blocked, do the following: + +1. Go to [Traces][25] and search for blocked traces with the search `@appsec.blocked:true`. +2. If you see significant traffic blocked on an ongoing basis, the traffic is likely legitimate users. + 1. Disable the incorrect blocking rule to avoid blocking further users. + 2. Prioritize unblocking that traffic from the [Denylist][27]. + +#### Hardening your services + +Large ATO campaigns are rarely an isolated occurrence. You might want to leverage the time between attacks to harden your services and establish configurations you can leverage during subsequent attacks. + +Here are some common hardening examples: + +* **Rate limit login attempt per IP/user/network range/user agent:** This soft-blocking feature lets you aggressively curtail the scale of the attack in some circumstances with minimal impact on normal users, even if they happen to share traits with the attacker +* **Adding friction at login:** To break attackers' automation without significantly impacting users, use captchas or modifying the login flow during an attack (for example, require that a token is fetched from a new endpoint). +* **Limiting sensitive actions for users:** If your services allow users to perform sensitive actions (spending money, accessing sensitive information, changing contact information, etc.), you might want to prohibit high risk users with suspicious logins until they are reviewed manually or through multifactor authentication. Suspicious logins can be programmatically fed to your systems by Datadog through a webhook. 
+* **Ability to consume signal findings programmatically:** Create an endpoint to consume Datadog webhooks and automatically take action against suspected users/IPs/traits. + +#### Identifying the attacker data source + +Attackers acquire lists of compromised accounts in bulk. By identifying the source of their database, you can proactively identify users at risk. + +To identify the source of their database, export users impacted by the attack using one of these options: + +* In the signal details, in **Targeted users**, click **Export to CSV**. This option exports up to 10k users. +* If you need to export more than 10k users, manually paginate your query by performing manual [API calls][31]. The Traces explorer performs similar calls, so you can base your requests on the call it's performing by grouping by `@appsec.events_data.usr.login`. Set the limit to 10000 and use smaller time ranges to avoid the backend cap. + + + +When you have a list, review it for common attributes: +- If all users are coming from one region or one customer. +- A large majority of users share any known compromise (use the [Have I Been Pwned][32] API). + +When the source of the database is identified, proactively force a password reset of those customers or flag them as higher risk. This will increase confidence that future suspicious logins were indeed compromised. + +#### Review additional attacker activity + +Leveraging the signature from the attacker, expand filters to look at what non-login activity they performed. + +This filter can be less accurate. For example, a filter that matches the signature of a mobile application with legitimate traffic but that was cloned by the attacker for their attack. The filter might show research done by the attacker ahead of time, and share hints on what the attacker may be looking to do next. + +You can also pivot on the infrastructure used by the attacker. Did those malicious IPs do anything but logins? Are they accessing other sensitive APIs? + +## Take aways + +Account theft is a common threat but also much more complex than traditional injection exploits. Catching them requires tight integration with your systems and involves enough uncertainty that automated responses aren't possible for the most advanced attacks. + +In this guide, you did the following: +- Learned what account takeover campaigns can look like, how to triage them, and how to counter them. +- Instrumented your login services to provide Datadog ASM with all the context it needs. +- Configured your login services to provide every capability at the time of the attack. + +This is general guidance. Depending on your applications and environments, there might be a need for additional response strategies. 
[1]: /security/account_takeover_protection/
[2]: https://app.datadoghq.com/services?query=service%3Auser-auth&env=%2A&fromUser=false&hostGroup=%2A&lens=Security&sort=-fave%2C-team&start=1735636008863&end=1735639608863
[3]: /security/application_security/threats/setup/compatibility/
[4]: /agent/remote_config/?tab=configurationyamlfile
[5]: https://app.datadoghq.com/security/appsec/onboarding
[6]: https://app.datadoghq.com/security/appsec/traces?query=&agg_m=count&agg_m_source=base&agg_t=count&fromUser=false&track=appsecspan&start=1735036043639&end=1735640843639&paused=false
[7]: /security/application_security/threats/setup/threat_detection/
[8]: https://app.datadoghq.com/security/appsec/traces?query=%40appsec.security_activity%3Abusiness_logic.users.login.%2A&agg_m=count&agg_m_source=base&agg_t=count&fromUser=false&track=appsecspan&start=1735036164646&end=1735640964646&paused=false
[9]: https://docs.datadoghq.com/security/application_security/threats/add-user-info/?tab=set_user#disabling-user-activity-event-tracking
[10]: https://docs.datadoghq.com/security/application_security/threats/add-user-info/?tab=set_user#adding-business-logic-information-login-success-login-failure-any-business-logic-to-traces
[11]: https://docs.datadoghq.com/agent/remote_config/?tab=configurationyamlfile
[12]: https://app.datadoghq.com/organization-settings/remote-config?resource_type=agents
[13]: https://docs.datadoghq.com/security/application_security/threats/add-user-info/?tab=set_user#tracking-business-logic-information-without-modifying-the-code
[14]: https://app.datadoghq.com/security/appsec/threat
[15]: https://docs.datadoghq.com/security/account_takeover_protection/#attacker-strategies
[16]: https://app.datadoghq.com/security/appsec/detection-rules?query=type%3Aapplication_security%20tag%3A%22category%3Aaccount_takeover%22&deprecated=hide&groupBy=none&sort=date&viz=rules
[17]: https://docs.datadoghq.com/security/notifications/
[18]: https://app.datadoghq.com/security/configuration/notification-rules/new?notificationData=
[19]: https://docs.datadoghq.com/security/notifications/#notification-channels
[20]: https://app.datadoghq.com/security/appsec/traces?query=%40appsec.security_activity%3Abusiness_logic.users.login.%2A&agg_m=count&agg_m_source=base&agg_t=count&fromUser=false&track=appsecspan&start=1735222832468&end=1735827632468&paused=false
[21]: https://app.datadoghq.com/security/appsec/traces?query=%40appsec.security_activity%3Abusiness_logic.users.login.%2A&agg_m=count&agg_m_source=base&agg_t=count&fromUser=false&track=appsecspan&start=1735222832468&end=1735827632468&paused=false
[22]: https://securitylabs.datadoghq.com/articles/challenges-with-ip-spoofing-in-cloud-environments/#what-should-you-do
[23]: https://app.datadoghq.com/security/appsec/detection-rules?query=type%3Aapplication_security%20tag%3A%22category%3Aaccount_takeover%22&deprecated=hide&groupBy=none&sort=date&viz=rules
[24]: https://app.datadoghq.com/security/appsec/in-app-waf?column=services-count&config_by=custom-rules
[25]: https://app.datadoghq.com/security/appsec/traces
[26]: https://app.datadoghq.com/security
[27]: https://app.datadoghq.com/security/appsec/denylist
[28]: https://app.datadoghq.com/security/appsec/in-app-waf?column=services-count&config_by=custom-rules
[30]: https://docs.datadoghq.com/security/application_security/threats/inapp_waf_rules/
[31]: https://docs.datadoghq.com/api/latest/spans/#aggregate-spans
[32]: https://haveibeenpwned.com/ \ No newline at end of file From
4c308f6e4ff5ae9ce27f5f18f6d677f0c13d472d Mon Sep 17 00:00:00 2001 From: Michael Cretzman Date: Mon, 31 Mar 2025 15:57:54 -0700 Subject: [PATCH 02/15] Managing Account Theft with ASM draft --- .../guide/manage_account_theft_appsec.md | 25 +++++++++++++------ 1 file changed, 17 insertions(+), 8 deletions(-) diff --git a/content/en/security/application_security/guide/manage_account_theft_appsec.md b/content/en/security/application_security/guide/manage_account_theft_appsec.md index a28bcefae0205..670cf0a10c924 100644 --- a/content/en/security/application_security/guide/manage_account_theft_appsec.md +++ b/content/en/security/application_security/guide/manage_account_theft_appsec.md @@ -9,11 +9,20 @@ Datadog Application Security Management (ASM) provides [builtin][1] detection an This guide describes how to use ASM to prepare for and respond to account takeover campaigns. This guide is divided into three phases: -- [Phase 1: Collecting login information](#phase-1:-collecting-login-information) -- [Phase 2: Preparing for account takeover campaigns](#phase-2:-preparing-for-account-takeover-campaigns) -- [Phase 3: Reacting to account takeover campaigns](#phase-3:-reacting-to-account-takeover-campaigns) - -## Phase 1: Collecting login information +1. [Collecting login information](#collecting-login-information) + - Enable and verify login activity collection in Datadog ASM using automatic or manual instrumentation methods. + - Use remote configuration options if you cannot modify your service code. + - Troubleshoot missing or incorrect data. +2. [Preparing for account takeover campaigns](#preparing-for-account-takeover-campaigns) + - Prepare for ATO campaigns detected by ASM. + - Configure notifications for attack alerts. + - Validate proper data propagation for accurate attacker identification. + - Set up automatic IP blocking for immediate mitigation. + - Learn about the importance of temporary blocking due to dynamic attacker IPs. +3. [Reacting to account takeover campaigns](#reacting-to-account-takeover-campaigns) + - Learn how to react to ATO campaigns, including attacker strategies, triage, response, investigation, monitoring, and cleanup. + +## Collecting login information To detect malicious patterns, ASM requires visibility into your users' login activity. This phase describes how to enable and validate this visibility. @@ -134,7 +143,7 @@ To use custom In-App WAF rules, do the following: For more details, see [Tracking business logic information without modifying the code][13]. -## Phase 2: Preparing for Account Takeover campaigns +## Preparing for Account Takeover campaigns After setting up instrumentation for your services, ASM monitord for attack campaigns. You can review the monitoring in the [Attacks overview][14] **Business logic** section. @@ -198,7 +207,7 @@ To configure automatic blocking, do the following: -## Phase 3: Reacting to account takeover campaigns +## Reacting to account takeover campaigns This section describes common account takeover hacker behavior and how to triage, investigate, and monitor detections. @@ -616,7 +625,7 @@ This filter can be less accurate. For example, a filter that matches the signatu You can also pivot on the infrastructure used by the attacker. Did those malicious IPs do anything but logins? Are they accessing other sensitive APIs? -## Take aways +## Conclusion Account theft is a common threat but also much more complex than traditional injection exploits. 
Catching them requires tight integration with your systems and involves enough uncertainty that automated responses aren't possible for the most advanced attacks. From f900185ed395cf4d99ba8cfc2fdfe1a4551efa97 Mon Sep 17 00:00:00 2001 From: Michael Cretzman Date: Tue, 1 Apr 2025 12:48:11 -0700 Subject: [PATCH 03/15] Fixing some headings --- .../guide/manage_account_theft_appsec.md | 16 ++++++++-------- 1 file changed, 8 insertions(+), 8 deletions(-) diff --git a/content/en/security/application_security/guide/manage_account_theft_appsec.md b/content/en/security/application_security/guide/manage_account_theft_appsec.md index 670cf0a10c924..0bdbafe0f4cc9 100644 --- a/content/en/security/application_security/guide/manage_account_theft_appsec.md +++ b/content/en/security/application_security/guide/manage_account_theft_appsec.md @@ -5,24 +5,24 @@ disable_toc: false Users are trusted entities in your systems with access to sensitive information and the ability to perform sensitive actions. Malicious actors have identified users as an opportunity to target websites and steal valuable data and resources. -Datadog Application Security Management (ASM) provides [builtin][1] detection and protection capabilities to help you manage this threat. +Datadog Application Security Management (ASM) provides [built-in][1] detection and protection capabilities to help you manage this threat. -This guide describes how to use ASM to prepare for and respond to account takeover campaigns. This guide is divided into three phases: +This guide describes how to use ASM to prepare for and respond to account takeover (ATO) campaigns. This guide is divided into three phases: -1. [Collecting login information](#collecting-login-information) +1. [Collecting login information](#phase-1-collecting-login-information): - Enable and verify login activity collection in Datadog ASM using automatic or manual instrumentation methods. - Use remote configuration options if you cannot modify your service code. - Troubleshoot missing or incorrect data. -2. [Preparing for account takeover campaigns](#preparing-for-account-takeover-campaigns) +2. [Preparing for account takeover campaigns](#phase-2-preparing-for-ato-campaigns): - Prepare for ATO campaigns detected by ASM. - Configure notifications for attack alerts. - Validate proper data propagation for accurate attacker identification. - Set up automatic IP blocking for immediate mitigation. - Learn about the importance of temporary blocking due to dynamic attacker IPs. -3. [Reacting to account takeover campaigns](#reacting-to-account-takeover-campaigns) +3. [Reacting to account takeover campaigns](#phase-3-reacting-to-ato-campaigns): - Learn how to react to ATO campaigns, including attacker strategies, triage, response, investigation, monitoring, and cleanup. -## Collecting login information +## Phase 1: Collecting login information To detect malicious patterns, ASM requires visibility into your users' login activity. This phase describes how to enable and validate this visibility. @@ -143,7 +143,7 @@ To use custom In-App WAF rules, do the following: For more details, see [Tracking business logic information without modifying the code][13]. -## Preparing for Account Takeover campaigns +## Phase 2: Preparing for ATO campaigns After setting up instrumentation for your services, ASM monitord for attack campaigns. You can review the monitoring in the [Attacks overview][14] **Business logic** section. 
@@ -207,7 +207,7 @@ To configure automatic blocking, do the following: -## Reacting to account takeover campaigns +## Phase 3: Reacting to ATO campaigns This section describes common account takeover hacker behavior and how to triage, investigate, and monitor detections. From bd87f4a4c8dbc7af611d7e71004caa725bdf0970 Mon Sep 17 00:00:00 2001 From: Michael Cretzman <58786311+michaelcretzman@users.noreply.github.com> Date: Thu, 3 Apr 2025 13:23:27 -0700 Subject: [PATCH 04/15] Apply suggestions from code review Incorp peer edit review Co-authored-by: DeForest Richards <56796055+drichards-87@users.noreply.github.com> --- .../guide/manage_account_theft_appsec.md | 44 +++++++++---------- 1 file changed, 22 insertions(+), 22 deletions(-) diff --git a/content/en/security/application_security/guide/manage_account_theft_appsec.md b/content/en/security/application_security/guide/manage_account_theft_appsec.md index 0bdbafe0f4cc9..efc7a9183695a 100644 --- a/content/en/security/application_security/guide/manage_account_theft_appsec.md +++ b/content/en/security/application_security/guide/manage_account_theft_appsec.md @@ -131,13 +131,13 @@ To use custom In-App WAF rules, do the following: 2. Name your rule and select the **Business Logic** category. 3. Set the rule type as `users.login.failure` for login failures and `users.login.success` for login successes. -4. Select your service and write the rule to match the login attempts. Typically, you match the method (`POST`), the URI with a regex (`^/login`) and the status code (403 for failures, 302 or 200 for success). +4. Select your service and write the rule to match the login attempts. Typically, you match the method (`POST`), the URI with a regex (`^/login`), and the status code (403 for failures, 302 or 200 for success). 5. Collect the tags required by detection rules. The most important tag is `usr.login`. Assuming the login was provided in the request, you can add a condition and set `store value as tag` as the operator. 6. Select a specific user parameter as an input, either in the body or the query. -7. Set the `Tag` field to the name of the tag where we want to save the value captured using `usr.login`. +7. Set the `Tag` field to the name of the tag where you want to save the value captured using `usr.login`. -8. Click **Save**. The rule is automatically sent to every instance of the service and will start capturing login failures. +8. Click **Save**. The rule is automatically sent to every instance of the service and begins capturing login failures. **To validate that the instrumentation is correct**, see [Step 1.4: Validating login metadata is automatically collected](#step-1.4:-validating-login-metadata-is-automatically-collected). @@ -200,7 +200,7 @@ To configure automatic blocking, do the following: 1. Go to **ASM** > **Protection** > [Detection Rules][23]. 2. In **Search**, enter `tag:"category:account_takeover"`. 3. Open the rules where you want to turn on blocking. Datadog recommends turning IP blocking on for **High** or **Critical** severity. -4. In the rule, in **Define Conditions**, in **Security Responses**, enable **IP automated blocking**. +4. In the rule, navigate to **Set Conditions** > **Security Responses**, and enable **IP automated blocking**. You can control the blocking behavior per condition. Each rule can have multiple conditions based on your confidence and the attack success. **Datadog does not recommend permanent blocking of IP addresses**. 
Attackers are unlikely to reuse IPs and permanent blocking could result in blocking users. Moreover, ASM has a limit of how many IPs it can block (`~10000`), and this could fill this list with unnecessary IPs. @@ -224,16 +224,16 @@ Many strategies are available, but it's important to understand that the value c 3. The actor buys access to a botnet, letting them leverage many different IPs to run their attack. There are extreme cases where large campaigns with 500k+ attempts were so distributed that Datadog saw an average of 1.01 requests per IP and a single attempt per account. 4. When valid credentials are discovered, they might be sold downstream to another actor to leverage them to some end such as financial theft, spam, abuse, etc. -Whenever an attack starts against your systems, signals are generated mentioning **Credential Stuffing**, **Distributed Credential Stuffing**, or **Bruteforce**. These signal terms are based on the strategy used by the attacker. +When an attack begins against your systems, the system generates signals labeled **Credential Stuffing**, **Distributed Credential Stuffing**, or **Bruteforce**, depending on the attacker’s strategy. ### Step 3.1: Triage -The first step is to confirm that the detection is correct. Certain behaviors, such as a security scan on a login endpoint or a lot of token rotation, might appear to the detection as an attack. The analysis depends on the signal, and the following examples provide general guidance that you'll need to adapt to your situation. +The first step is to confirm that the detection is correct. Certain behaviors, such as a security scan on a login endpoint or frequent token rotation, might appear to the detection as an attack. The analysis depends on the signal, and the following examples provide general guidance that you'll need to adapt to your situation. {{< tabs >}} {{% tab "Bruteforce" %}} -The signal is looking for an attempt to steal a user account by trying many different passwords for this account. Generally, a small number of accounts are targeted by those campaigns. +The signal is looking for an attempt to steal a user account by trying many different passwords for this account. Generally, a small number of accounts are targeted by these campaigns. Review the accounts flagged as compromised. Click on a user to open a summary of recent activity. @@ -245,7 +245,7 @@ Questions for triage: If the answer to those questions is yes, the signal is likely legitimate. -You can adapt your response based on the sensitivity of the account (for example, a free account without much access/secrets vs admin account). +You can adapt your response based on the sensitivity of the account, for example, a free account with limited access versus an admin account. {{% /tab %}} @@ -286,7 +286,7 @@ If the list is truncated, click **View in App & API Protection Traces Explorer** If the conclusion of the triage is that the signal is a false positive, you can flag it as a false positive and close it. -If the false positive was caused by a unique setting in your service, you might introduce suppression filters to silence them out. +If the false positive was caused by a unique setting in your service, you can add suppression filters to silence false positives. **If the signal is legitimate**, move to step [Step 3.2: Preliminary response](#step-3.2:-disrupting-the-attacker-as-a-preliminary-response). 
@@ -296,10 +296,10 @@ If the attack is ongoing, you might want to disrupt the attacker as you investig **Note:** This is a common step, although you might want to skip this step in the following circumstances: -* Accounts aren't immediately valuable: you can block compromised accounts after the fact with no negative consequences. -* You want to maintain the maximum visibility on the attack by avoiding preventing the attacker from learning that an investigation is ongoing and changing their strategy to something more difficult to track. +* The accounts have little immediate value. You can block these post-compromise without causing harm. +* You want to maintain maximum visibility into the attack by avoiding any action that alerts the attacker to the investigation and causes them to change tactics. -Enforcing this preliminary response requires [Remote Configuration][11] is enabled for your services. +Enforcing this preliminary response requires that [Remote Configuration][11] is enabled for your services. If you want to initiate a partial response, do the following: @@ -308,7 +308,7 @@ If you want to initiate a partial response, do the following: The attackers are likely using a small number of IPs. To block them, open the signal and use Next Steps. You can set the duration of blocking. -We recommend **12h**, which will be enough for the attack to stop and avoid blocking legitimate users when, after the attack, those IPs get recycled to legitimate users. We do not recommend permanent blocking. +Datadog recommends a duration of **12h**, which is typically sufficient for the attack to stop and helps avoid blocking legitimate users if those IPs are later recycled. Permanent blocking is not recommended. You can also block compromised users, although a better approach would be to extract them and reset their credentials using your own systems. Finally, you can introduce automated IP blocking while running your investigation. @@ -316,13 +316,13 @@ Finally, you can introduce automated IP blocking while running your investigatio {{% tab "Distributed Credential Stuffing" %}} -Those attacks often rely on a large number of disposable IPs. The latency from the Datadog platform makes it impractical to block login attempts by blocking the IP before the IP gets dropped from the attacker's pool. +These attacks often use a large number of disposable IPs. Due to Datadog’s latency, it’s impractical to block login attempts by blocking the IP before the attacker drops it from their pool. Instead, block traits of the request that are unique to the malicious attempt (a user agent, a specific header, a fingerprint, etc.). In a **Distributed Credential Stuffing campaign** signal, Datadog automatically identifies clear traits and presents them as **Attacker Attributes**. -Before blocking, we recommend that you review the activity from the cluster to confirm that the activity is indeed malicious. +Before blocking, Datadog recommends that you review the activity from the cluster to confirm that the activity is indeed malicious. The questions you're trying to answer are: @@ -359,7 +359,7 @@ To create the rule, do the following: 4. Configure the conditions of the rule. In this example, the user agent is used. If you want to block a specific user agent, you can paste it with the operator `matches value in list`. If you want more flexibility, you can also use a regex. 5. Use the **Preview matching traces** section as a final review of the impact of the rule. 
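Before selecting a blocking mode, it can also help to sanity-check the expression offline against user agents you have actually seen in the signal. The sketch below is purely illustrative: the regex and the user-agent strings are hypothetical placeholders, not values taken from a real signal.

```python
import re

# Hypothetical expression you plan to paste into the rule condition; it targets
# scripted HTTP clients observed in the (made-up) attack traffic.
suspect_ua = re.compile(r"^(python-requests/|okhttp/3\.)")

samples = [
    "python-requests/2.31.0",          # example attack tooling
    "okhttp/3.12.1",                   # example attack tooling
    "Mozilla/5.0 (Windows NT 10.0)",   # example legitimate browser
]

for ua in samples:
    verdict = "matches (would be blocked)" if suspect_ua.search(ua) else "does not match"
    print(f"{ua}: {verdict}")
```

If a legitimate user agent matches, tighten the expression before enabling a blocking mode.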
-If no unexpected traces are shown, select a blocking mode and proceed to save the rule. The response will be automatically pushed to tracers. You'll soon see blocked traces appear in the Trace Explorer. +If no unexpected traces appear, select a blocking mode and save the rule. The system pushes the response to tracers automatically, and blocked traces will soon appear in the Trace Explorer. Multiple blocking actions are available, each more or less obvious. Depending on the sophistication of the attackers, you might want a more stealthy answer so that they don't immediately realize they were blocked. @@ -372,8 +372,8 @@ Multiple blocking actions are available, each more or less obvious. Depending on When you have [disrupted the attacker as a preliminary response](#step-3.2:-disrupting-the-attacker-as-a-preliminary-response), you can identify the following: - Accounts compromised by the attackers so you can reset their credentials. -- Hints about the source of the targeted accounts you can use for proactive password reset or higher scrutiny. -- Data on the attacker infrastructure you can use to catch future attempts or other malicious activity (credit card stuffing, abuse, etc.). +- Hints about the source of the targeted accounts, which you can use for proactive password resets or higher scrutiny. +- Data on the attacker infrastructure, which you can use to catch future attempts or other malicious activity (credit card stuffing, abuse, etc.). The first step is to isolate the attacker activity from the overall traffic of the application. @@ -404,7 +404,7 @@ Successful logins should be considered very suspicious. {{% tab "Credential Stuffing" %}} -This signal flagged a lot of activity coming from a few IPs. This signal is closely related to its distributed variant. You might need to use the distributed credential stuffing method. +This signal flagged a lot of activity coming from a few IPs and is closely related to its distributed variant. You might need to use the distributed credential stuffing method. Start by extracting a list of suspicious IPs from the signal side panel @@ -429,11 +429,11 @@ In the diffuse attacks case, attacker attributes are available in the signal. 1. Click **Investigate in full screen**. 2. In **Attacker Attributes**, select the cluster and click, then, in **Traces**, click **View in ASM Protection Trace Explorer**. -This will get you to the trace explorer with filters set to the flagged attributes. You can start the investigation with the current query but will likely want to expand it to also match login successes on top of the failures. Review the exhaustiveness/accuracy of the filter using the technique described above (in the paragraph before the table). +This takes you to the Trace Explorer with filters set to the flagged attributes. You can start the investigation with the current query, but will likely want to expand it to also match login successes on top of the failures. Review the exhaustiveness/accuracy of the filter using the technique described above (in the paragraph before the table). -In the case those attributes are inaccurate/incomplete, you may try to identify further traits to isolate the attacker activity. The traits we historically found the most useful are: +In the case where those attributes are inaccurate/incomplete, you may try to identify further traits to isolate the attacker activity. The traits that Datadog historically found the most useful are: 1. User agent: `@http.user_agent` 2. 
ASN: `@http.client_ip_details.as.domain` @@ -487,7 +487,7 @@ You can either use Datadog's built-in blocking capabilities to deny any request ### Datadog blocking -Users that are part of traffic blocked by Datadog will see a **You're blocked** page, or they receive a custom status code, such as a redirection. Blocking can be applied through two mechanisms, each with different performance characteristics: the Denylist and custom WAF rules. +Users whose traffic is blocked by Datadog are shown a **You're blocked** page or receive a custom status code, such as a redirect. Blocking can be applied through two mechanisms, each with different performance characteristics: the Denylist and custom WAF rules. From 921787b72582cedb9d7a64836bebe34fb4ca193c Mon Sep 17 00:00:00 2001 From: Michael Cretzman <58786311+michaelcretzman@users.noreply.github.com> Date: Thu, 3 Apr 2025 13:28:58 -0700 Subject: [PATCH 05/15] Apply suggestions from code review Incorp Dev review phase 1 Co-authored-by: Taiki --- .../guide/manage_account_theft_appsec.md | 44 +++++++++---------- 1 file changed, 22 insertions(+), 22 deletions(-) diff --git a/content/en/security/application_security/guide/manage_account_theft_appsec.md b/content/en/security/application_security/guide/manage_account_theft_appsec.md index efc7a9183695a..b3ac4654fc23d 100644 --- a/content/en/security/application_security/guide/manage_account_theft_appsec.md +++ b/content/en/security/application_security/guide/manage_account_theft_appsec.md @@ -97,25 +97,24 @@ To validate that login metadata is collected, do the following: In the event of a **false** user (`usr.exists:false`), look for the following issues: -- A single event: if the trace contains multiple login events, such as both successes and failures, go to [Step 1.5: Manually instrumenting your services](#step-1.5:-manually-instrumenting-your-services). -- If the event does not contain the mandatory metadata, it might appear as a user attribution section. The mandatory metadata is `usr.login` and `usr.exists` in the case of login failure, and `usr.id` in the case of login success. In this case, go to [Step 1.5: Manually instrumenting your services](#step-1.5:-manually-instrumenting-your-services). +- A single event: if the trace contains multiple login events, such as both successes and failures, this might be caused by incorrect auto-instrumentation. To change auto-instrumentation, go to [Step 1.5: Manually instrumenting your services](#step-1.5:-manually-instrumenting-your-services). +- Does the event contain the mandatory metadata? It might appear as a user attribution section in the case of a login success. The mandatory metadata is `usr.login` and `usr.exists` in the case of login failure, and `usr.login` and `usr.id` in the case of login success. If some metadata is missing, go to [Step 1.5: Manually instrumenting your services](#step-1.5:-manually-instrumenting-your-services). -**If the instrumentation is correct, go to [Phase 2: Preparing for Account Takeover campaigns](#phase-2:-preparing-for-account-takeover-campaigns).** +**If the instrumentation is correct, go to [Phase 2: Preparing for Account Takeover campaigns](#phase-2-preparing-for-ato-campaigns).** ### Step 1.5: Manually instrumenting your services ASM collects login information and metadata using an SDK embedded in the Datadog libraries. Instrumentation is performed by calling the SDK when a user login is successful/fails and by providing the SDK with the metadata of the login. 
The SDK attaches the login and the metadata to the trace and sends it to Datadog where it is retained. -**For an alternative to modifying the code of the service**, go to [Step 1.6: Remote instrumentation of your services](#step-1.6:-remote-instrumentation-of-your-services). +
For an alternative to modifying the service's code, go to [Step 1.6: Remote instrumentation of your services](#step-16-remote-instrumentation-of-your-services).
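As a point of reference for the steps that follow, this is roughly what the SDK calls can look like in a Python service instrumented with `ddtrace`. Treat it as a sketch only: the helper names (`track_user_login_success_event` and `track_user_login_failure_event` under `ddtrace.appsec.trace_utils`) and their exact signatures vary by tracer version, and the authentication hook itself is hypothetical, so follow the language-specific instructions linked below for your library.

```python
from ddtrace import tracer
# Helper locations and signatures differ between tracer versions; check the
# language-specific instructions before relying on these exact names.
from ddtrace.appsec.trace_utils import (
    track_user_login_failure_event,
    track_user_login_success_event,
)

def report_login_attempt(login: str, user, password_ok: bool) -> None:
    """Hypothetical hook called by your authentication code after each attempt."""
    if user is not None and password_ok:
        # Successful login: attach usr.id (and the login that was used) to the trace.
        # Recent tracer versions also accept the login value directly; it is passed
        # through metadata here as an approximation.
        track_user_login_success_event(
            tracer,
            user_id=user.id,
            metadata={"usr.login": login},
        )
    else:
        # Failed login: record the login that was targeted and whether it exists.
        # When the account does not exist, the attempted login is used as the
        # identifier here (an assumption; adapt to your tracer's guidance).
        track_user_login_failure_event(
            tracer,
            user_id=user.id if user is not None else login,
            exists=user is not None,
            metadata={"usr.login": login},
        )
```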
To manually instrument your services, do the following: 1. If auto-instrumentation is providing incorrect data (multiple events in a single trace), see [Disable auto-instrumentation][9]. - For detailed instrumentation instructions for each language, go to [Adding business logic information (login success, login failure, any business logic) to traces][10]. -2. Add the following metadata: - * `usr.login`: **Mandatory for login success and failure**. This field contains the *name* used to log into the account. The name might be an email address, a phone number, a username, or something else. The purpose of this field is to identify targeted accounts even if they don't exist in your systems because a user might be able to change those accounts. - * `usr.exists`: **Recommended for login failures**. This field is required for some default detections. The field helps to lower the priority of attempts targeted at accounts that don't exist in your systems. - * `usr.id`: **Recommended for login success and failure (if available)**. This field contains a unique identifier for the account. User blocking is based on this value. This field also helps with investigation. If no identifier is available (because the account doesn't exist), you don't need to populate this field. +2. For detailed instrumentation instructions for each language, go to [Adding business logic information (login success, login failure, any business logic) to traces][10]. Make sure to add the following metadata: + * `usr.login`: **Mandatory for login success and failure**. This field contains the *name* used to log into the account. The name might be an email address, a phone number, a username, or something else. The purpose of this field is to identify targeted accounts even if they don't exist in your systems because a user might be able to change those accounts. Also, this field provides information on the location of the database used by the attacker. This value shouldn't be confused with `usr.id`.``` + * `usr.exists`: **Mandatory for login failures**. This field is required for some default detections. The field helps to lower the priority of attempts targeted at accounts that don't exist in your systems. + * `usr.id`: **Mandatory for login success and recommended for login failures (if the user exists)**. This field contains a unique identifier for the account. User blocking is based on this value. This field also helps extract post-compromise activity. If no identifier is available (the account doesn't exist), you don't need to populate this field. **After deploying the code, validate the instrumentation is correct by following the steps in** [Step 1.4: Validating login metadata is automatically collected](#step-1.4:-validating-login-metadata-is-automatically-collected). @@ -123,7 +122,7 @@ To manually instrument your services, do the following: ASM can use custom In-App WAF rules to flag login attempts and extract the metadata from the request needed by detection rules. -This approach requires that [Remote Configuration][11] is enabled and working. Verify Remote Configuration is running in [Remote Configuration][12]. +This approach requires that [Remote Configuration][11] is enabled and working. Verify Remote Configuration is running for this service in [Remote Configuration][12]. 
To use custom In-App WAF rules, do the following: @@ -145,7 +144,7 @@ For more details, see [Tracking business logic information without modifying the ## Phase 2: Preparing for ATO campaigns -After setting up instrumentation for your services, ASM monitord for attack campaigns. You can review the monitoring in the [Attacks overview][14] **Business logic** section. +After setting up instrumentation for your services, ASM monitors for attack campaigns. You can review the traffic in the [Attacks overview][14] **Business logic** section. @@ -153,27 +152,25 @@ ASM detects [multiple attacker strategies][15]. Upon detecting an attack with a The severity of the signal is set based on the urgency of the threat: from **Low** in case of unsuccessful attacks to **Critical** in case of successful account compromises. -To fully leverage detections, take the following actions. +The following actions will help you to fully leverage detections and to become aware them quicker. ### Step 2.1: Configuring notifications -[Notifications][17] provide warnings when a signal is triggered. - -To create a notification rule using the [Create a new rule][18] setting, do the following: +[Notifications][17] provide a warning on your preferred channel when a signal is triggered. To create a notification rule, do the following: 1. Open [Create a new rule][18]. 2. Enter a name for the rule. -3. Select **Signal** and remove all entries except **ASM**. -4. Restrict the rule to `category:account_takeover.` +3. Select **Signal** and remove all entries except **Application Security**. +4. Restrict the rule to `category:account_takeover`, and expand the severities to include `Medium`. 5. Add notification recipients (Slack, Teams, PagerDuty). To learn more, see [Notification channels][19]. -6. Save the rule. +6. Test, then save the rule. The notification is sent the next time a signal is generated. ### Step 2.2: Validate proper data propagation -In microservice environments, services are generally reached by internal hosts running other services. This internal environment makes it challenging to identify the unique traits of the attacker, such as IP, user agent, fingerprint, etc., and validate the data. +In microservice environments, services are generally reached by internal hosts running other services. This internal environment makes it challenging to identify the unique traits of the original attacker's request, such as IP, user agent, fingerprint, etc. [ASM Traces][20] can help to validate the data by exposing the source IPs and user agent traffic. @@ -181,17 +178,20 @@ To validate the data, do the following: 1. Review login traces in the [Traces][21] and check for the following: * Source IPs (`@http.client_ip`) are varied and public IPs. - * **Problem:** If login attempts are coming from a few IPs only, this might be a proxy that you can't block without risking availability. + * **Problem:** If login attempts are coming from a few IPs only, this might be a proxy that you can't block without risking availability. To review the IPs, use the **Top List** feature and a **group by** filter such as `@http.client_ip_details.as.domain`. * **Solution:** Forward the client IP of the initial request through a HTTP header, such as `X-Forwarded-For`. You can use a custom header for [better security][22] and configure the tracer to read it using the `DD_TRACE_CLIENT_IP_HEADER` environment variable. * The user agent (`@http.user_agent`) is consistent with the expected traffic (web browser, mobile app, etc.) 
* **Problem:** The user agent could be replaced by the user agent in the calling microservice network library. * **Solution:** Use the client user agent when calling subsequent services. +* Multiple headers are populated (in the trace's **See more details** menu of the **Request** block). + * **Problem:** Normal request headers (for example, `accept-encoding`) aren't forwarded to the instrumented service. This impairs the generation of fingerprints (`@appsec.fingerprint.*`) and degrades the signal's ability to isolate an attacker's activity. + * **Solution:** Forward those headers when calling a subsequent microservice. ### Step 2.3: Configure automatic blocking -**Before you begin:** Verify that the IP addresses are properly configured, as described in [Step 2.2: Validate proper data propagation](#step-2.2:-validate-proper-data-propagation). +
Before you begin: Verify that the IP addresses are properly configured, as described in [Step 2.2: Validate proper data propagation](#step-22-validate-proper-data-propagation).
-ASM automatic blocking can be used to block attacks at any time of the day. Automatic blocking can help block attacks before your team members are online, providing security during off hours. +ASM automatic blocking can be used to block attacks at any time of the day. Automatic blocking can help block attacks before your team members are online, providing security during off hours. Within an ATO, automatic blocking can help mitigate the load issues caused by the increase in failed login attempts or prevent the attacker from using compromised accounts. You can configure automatic blocking to block IPs identified as part of an attack. This is only a partial remediation because attackers can change IPs; however, it can give you more time to implement comprehensive remediation. From 98ced864b1a3fce43d508372ba43c2ea6343d1d8 Mon Sep 17 00:00:00 2001 From: Michael Cretzman Date: Thu, 3 Apr 2025 15:05:56 -0700 Subject: [PATCH 06/15] incorp tech review --- .../guide/manage_account_theft_appsec.md | 126 +++++++++--------- 1 file changed, 64 insertions(+), 62 deletions(-) diff --git a/content/en/security/application_security/guide/manage_account_theft_appsec.md b/content/en/security/application_security/guide/manage_account_theft_appsec.md index b3ac4654fc23d..4490a56703daa 100644 --- a/content/en/security/application_security/guide/manage_account_theft_appsec.md +++ b/content/en/security/application_security/guide/manage_account_theft_appsec.md @@ -30,9 +30,9 @@ To detect malicious patterns, ASM requires visibility into your users' login act This step describes how to set up your service to use ASM. -If your service is already using ASM, you can go to [Step 1.3: Validating whether login information is automatically collected](#step-1.3:-validating-login-information-is-automatically-collected). +
If your service is already using ASM, you can go to [Step 1.3: Validating login information is automatically collected](#step-13-validating-login-information-is-automatically-collected).
-1. Go to [**Service Catalog**][2], click the **Security** lens, and search for your login service name. +1. Go to [**Software Catalog**][2], click the **Security** lens, and search for your login service name. @@ -44,7 +44,7 @@ If your service is already using ASM, you can go to [Step 1.3: Validating whethe - To set up ASM, move to [Step 1.2: Enabling ASM on login service](#step-1.2:-enabling-app-&-api-protection-on-your-login-service). + To set up ASM, move to [Step 1.2: Enabling ASM on login service](#step-12-enabling-asm-on-your-login-service). ### Step 1.2: Enabling ASM on your login service @@ -52,9 +52,9 @@ To enable ASM on your login service, ensure you meet the following requirements: * Similarly to Datadog APM, ASM requires a library integration in your services and a running Datadog agent. * ASM generally benefits from using the newest library possible; however, minimum supported versions are documented in [Compatibility Requirements][3]. -* **Threat Detection** is required at a minimum, and **Automatic user activity event tracking** should also be enabled, ideally. +* At a minimum, **Threat Detection** must be enabled. Ideally, **Automatic user activity event tracking** should be enabled as well. -To enable ASM using a new deployment, use the `APPSEC\_ENABLED` environment variable/library configuration or [Remote Configuration]. You can use either method, but Remote Configuration can be set up using the Datadog UI. +To enable ASM using a new deployment, use the `APPSEC_ENABLED` environment variable/library configuration or [Remote Configuration][11]. You can use either method, but Remote Configuration can be set up using the Datadog UI. To enable ASM using Remote Configuration, do the following: @@ -71,7 +71,7 @@ For more detailed instructions on using a new deployment, see [Enabling ASM Thre After you have enabled ASM, you can validate that login information is collected by Datadog. -**Note:** Once ASM is enabled on a service, wait a few minutes for users to log into the service or log into the service yourself. +**Note:** After ASM is enabled on a service, wait a few minutes for users to log into the service or log into the service yourself. To validate login information is collected, do the following: @@ -106,15 +106,15 @@ In the event of a **false** user (`usr.exists:false`), look for the following is ASM collects login information and metadata using an SDK embedded in the Datadog libraries. Instrumentation is performed by calling the SDK when a user login is successful/fails and by providing the SDK with the metadata of the login. The SDK attaches the login and the metadata to the trace and sends it to Datadog where it is retained. -
For an alternative to modifying the service's code, go to [Step 1.6: Remote instrumentation of your services](#step-16-remote-instrumentation-of-your-services).
+
For an alternative to modifying the service's code, go to [Step 1.6: Remote instrumentation of your services](#step-16-remote-instrumentation-of-your-services).
To manually instrument your services, do the following: 1. If auto-instrumentation is providing incorrect data (multiple events in a single trace), see [Disable auto-instrumentation][9]. 2. For detailed instrumentation instructions for each language, go to [Adding business logic information (login success, login failure, any business logic) to traces][10]. Make sure to add the following metadata: - * `usr.login`: **Mandatory for login success and failure**. This field contains the *name* used to log into the account. The name might be an email address, a phone number, a username, or something else. The purpose of this field is to identify targeted accounts even if they don't exist in your systems because a user might be able to change those accounts. Also, this field provides information on the location of the database used by the attacker. This value shouldn't be confused with `usr.id`.``` + * `usr.login`: **Mandatory for login success and failure**. This field contains the *name* used to log into the account. The name might be an email address, a phone number, a username, or something else. The purpose of this field is to identify targeted accounts even if they don't exist in your systems because a user might be able to change those accounts. + * `usr.exists`: **Recommended for login failures**. This field is required for some default detections. The field helps to lower the priority of attempts targeted at accounts that don't exist in your systems. * `usr.exists`: **Mandatory for login failures**. This field is required for some default detections. The field helps to lower the priority of attempts targeted at accounts that don't exist in your systems. - * `usr.id`: **Mandatory for login success and recommended for login failures (if the user exists)**. This field contains a unique identifier for the account. User blocking is based on this value. This field also helps extract post-compromise activity. If no identifier is available (the account doesn't exist), you don't need to populate this field. **After deploying the code, validate the instrumentation is correct by following the steps in** [Step 1.4: Validating login metadata is automatically collected](#step-1.4:-validating-login-metadata-is-automatically-collected). @@ -122,21 +122,21 @@ To manually instrument your services, do the following: ASM can use custom In-App WAF rules to flag login attempts and extract the metadata from the request needed by detection rules. -This approach requires that [Remote Configuration][11] is enabled and working. Verify Remote Configuration is running for this service in [Remote Configuration][12]. +This approach requires that [Remote Configuration][11] is enabled and working. Verify Remote Configuration is running in [Remote Configuration][12]. To use custom In-App WAF rules, do the following: -1. Open the [In-App WAF custom rule creation form](https://app.datadoghq.com/security/appsec/in-app-waf?column=services-count&config_by=custom-rules&group_by=RULESET&order=desc&policy_by=rules&ruleColumn=modifiedAt&ruleId=newRule). +1. Open the [In-App WAF custom rule creation form][24]. 2. Name your rule and select the **Business Logic** category. 3. Set the rule type as `users.login.failure` for login failures and `users.login.success` for login successes. -4. Select your service and write the rule to match the login attempts. Typically, you match the method (`POST`), the URI with a regex (`^/login`), and the status code (403 for failures, 302 or 200 for success). +4. 
Select your service and write the rule to match the login attempts. Typically, you match the method (`POST`), the URI with a regex (`^/login`) and the status code (403 for failures, 302 or 200 for success). 5. Collect the tags required by detection rules. The most important tag is `usr.login`. Assuming the login was provided in the request, you can add a condition and set `store value as tag` as the operator. 6. Select a specific user parameter as an input, either in the body or the query. 7. Set the `Tag` field to the name of the tag where you want to save the value captured using `usr.login`. -8. Click **Save**. The rule is automatically sent to every instance of the service and begins capturing login failures. +8. Click **Save**. The rule is automatically sent to every instance of the service and then begins capturing login failures. **To validate that the instrumentation is correct**, see [Step 1.4: Validating login metadata is automatically collected](#step-1.4:-validating-login-metadata-is-automatically-collected). @@ -144,7 +144,7 @@ For more details, see [Tracking business logic information without modifying the ## Phase 2: Preparing for ATO campaigns -After setting up instrumentation for your services, ASM monitors for attack campaigns. You can review the traffic in the [Attacks overview][14] **Business logic** section. +After setting up instrumentation for your services, ASM monitord for attack campaigns. You can review the monitoring in the [Attacks overview][14] **Business logic** section. @@ -152,25 +152,27 @@ ASM detects [multiple attacker strategies][15]. Upon detecting an attack with a The severity of the signal is set based on the urgency of the threat: from **Low** in case of unsuccessful attacks to **Critical** in case of successful account compromises. -The following actions will help you to fully leverage detections and to become aware them quicker. +To fully leverage detections, take the following actions. ### Step 2.1: Configuring notifications -[Notifications][17] provide a warning on your preferred channel when a signal is triggered. To create a notification rule, do the following: +[Notifications][17] provide warnings when a signal is triggered. + +To create a notification rule using the [Create a new rule][18] setting, do the following: 1. Open [Create a new rule][18]. 2. Enter a name for the rule. -3. Select **Signal** and remove all entries except **Application Security**. -4. Restrict the rule to `category:account_takeover`, and expand the severities to include `Medium`. +3. Select **Signal** and remove all entries except **ASM**. +4. Restrict the rule to `category:account_takeover.` 5. Add notification recipients (Slack, Teams, PagerDuty). To learn more, see [Notification channels][19]. -6. Test, then save the rule. +6. Save the rule. The notification is sent the next time a signal is generated. ### Step 2.2: Validate proper data propagation -In microservice environments, services are generally reached by internal hosts running other services. This internal environment makes it challenging to identify the unique traits of the original attacker's request, such as IP, user agent, fingerprint, etc. +In microservice environments, services are generally reached by internal hosts running other services. This internal environment makes it challenging to identify the unique traits of the attacker, such as IP, user agent, fingerprint, etc., and validate the data. [ASM Traces][20] can help to validate the data by exposing the source IPs and user agent traffic. 
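A common way to address the propagation problems called out in the checks below is for the edge or gateway service to pass the original client's traits along when it calls the internal login service. The sketch below is an illustration only: the internal URL and the `X-Client-Real-IP` header name are made up, and the login service's tracer would then be configured with `DD_TRACE_CLIENT_IP_HEADER` set to that header.

```python
import requests

# Hypothetical internal endpoint of the instrumented login service.
LOGIN_SERVICE_URL = "http://login-service.internal/login"

def forward_login(original_headers: dict, client_ip: str, payload: dict):
    """Call the login service while preserving the original client's traits."""
    headers = {
        # Custom header carrying the real client IP; set
        # DD_TRACE_CLIENT_IP_HEADER=x-client-real-ip on the login service so
        # ASM attributes the attempt to this address instead of the gateway's.
        "X-Client-Real-IP": client_ip,
        # Keep the end user's User-Agent rather than the HTTP library's default.
        "User-Agent": original_headers.get("User-Agent", ""),
        # Forwarding other request headers helps preserve fingerprinting data.
        "Accept-Language": original_headers.get("Accept-Language", ""),
    }
    return requests.post(LOGIN_SERVICE_URL, json=payload, headers=headers, timeout=5)
```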
@@ -178,20 +180,17 @@ To validate the data, do the following: 1. Review login traces in the [Traces][21] and check for the following: * Source IPs (`@http.client_ip`) are varied and public IPs. - * **Problem:** If login attempts are coming from a few IPs only, this might be a proxy that you can't block without risking availability. To review the IPs, use the **Top List** feature and a **group by** filter such as `@http.client_ip_details.as.domain`. + * **Problem:** If login attempts are coming from a few IPs only, this might be a proxy that you can't block without risking availability. * **Solution:** Forward the client IP of the initial request through a HTTP header, such as `X-Forwarded-For`. You can use a custom header for [better security][22] and configure the tracer to read it using the `DD_TRACE_CLIENT_IP_HEADER` environment variable. * The user agent (`@http.user_agent`) is consistent with the expected traffic (web browser, mobile app, etc.) * **Problem:** The user agent could be replaced by the user agent in the calling microservice network library. * **Solution:** Use the client user agent when calling subsequent services. -* Multiple headers are populated (in the trace's **See more details** menu of the **Request** block). - * **Problem:** Normal request headers (for example, `accept-encoding`) aren't forwarded to the instrumented service. This impairs the generation of fingerprints (`@appsec.fingerprint.*`) and degrades the signal's ability to isolate an attacker's activity. - * **Solution:** Forward those headers when calling a subsequent microservice. ### Step 2.3: Configure automatic blocking -
Before you begin: Verify that the IP addresses are properly configured, as described in [Step 2.2: Validate proper data propagation](#step-22-validate-proper-data-propagation).
+**Before you begin:** Verify that the IP addresses are properly configured, as described in [Step 2.2: Validate proper data propagation](#step-2.2:-validate-proper-data-propagation). -ASM automatic blocking can be used to block attacks at any time of the day. Automatic blocking can help block attacks before your team members are online, providing security during off hours. Within an ATO, automatic blocking can help mitigate the load issues caused by the increase in failed login attempts or prevent the attacker from using compromised accounts. +ASM automatic blocking can be used to block attacks at any time of the day. Automatic blocking can help block attacks before your team members are online, providing security during off hours. You can configure automatic blocking to block IPs identified as part of an attack. This is only a partial remediation because attackers can change IPs; however, it can give you more time to implement comprehensive remediation. @@ -200,7 +199,7 @@ To configure automatic blocking, do the following: 1. Go to **ASM** > **Protection** > [Detection Rules][23]. 2. In **Search**, enter `tag:"category:account_takeover"`. 3. Open the rules where you want to turn on blocking. Datadog recommends turning IP blocking on for **High** or **Critical** severity. -4. In the rule, navigate to **Set Conditions** > **Security Responses**, and enable **IP automated blocking**. +4. In the rule, in **Define Conditions**, in **Security Responses**, enable **IP automated blocking**. You can control the blocking behavior per condition. Each rule can have multiple conditions based on your confidence and the attack success. **Datadog does not recommend permanent blocking of IP addresses**. Attackers are unlikely to reuse IPs and permanent blocking could result in blocking users. Moreover, ASM has a limit of how many IPs it can block (`~10000`), and this could fill this list with unnecessary IPs. @@ -222,18 +221,18 @@ Many strategies are available, but it's important to understand that the value c 1. The actor who initiates the attack often buys a database of credentials from a vendor (likely acquired by the compromise of another service). 2. The actor procures a script designed to automate login attempts while evading detection (randomizing headers, trying to look as similar to normal traffic as possible). 3. The actor buys access to a botnet, letting them leverage many different IPs to run their attack. There are extreme cases where large campaigns with 500k+ attempts were so distributed that Datadog saw an average of 1.01 requests per IP and a single attempt per account. -4. When valid credentials are discovered, they might be sold downstream to another actor to leverage them to some end such as financial theft, spam, abuse, etc. +4. When valid credentials are discovered, they might be sold downstream to another actor to leverage them to some end such as financial theft, spam, abuse, etc. -When an attack begins against your systems, the system generates signals labeled **Credential Stuffing**, **Distributed Credential Stuffing**, or **Bruteforce**, depending on the attacker’s strategy. +Whenever an attack starts against your systems, signals are generated mentioning **Credential Stuffing**, **Distributed Credential Stuffing**, or **Bruteforce**. These signal terms are based on the strategy used by the attacker. ### Step 3.1: Triage -The first step is to confirm that the detection is correct. 
Certain behaviors, such as a security scan on a login endpoint or frequent token rotation, might appear to the detection as an attack. The analysis depends on the signal, and the following examples provide general guidance that you'll need to adapt to your situation. +The first step is to confirm that the detection is correct. Certain behaviors, such as a security scan on a login endpoint or a lot of token rotation, might appear to the detection as an attack. The analysis depends on the signal, and the following examples provide general guidance that should be customized for your situation. {{< tabs >}} {{% tab "Bruteforce" %}} -The signal is looking for an attempt to steal a user account by trying many different passwords for this account. Generally, a small number of accounts are targeted by these campaigns. +The signal is looking for an attempt to steal a user account by trying many different passwords for this account. Generally, a small number of accounts are targeted by those campaigns. Review the accounts flagged as compromised. Click on a user to open a summary of recent activity. @@ -245,7 +244,7 @@ Questions for triage: If the answer to those questions is yes, the signal is likely legitimate. -You can adapt your response based on the sensitivity of the account, for example, a free account with limited access versus an admin account. +You can adapt your response based on the sensitivity of the account (for example, a free account without much access/secrets vs admin account). {{% /tab %}} @@ -275,7 +274,7 @@ This signal is looking for a large increase in the overall number of login failu Datadog tries to identify common attributes between the login failures in your service. This can surface defects in the attacker script that can be used to isolate the malicious activity. When found, a section called **Attacker Attributes** is shown. If present, review whether this is legitimate activity by selecting the cluster and clicking on **Explore clusters**. -If accurate, the activity of the cluster should closely match the increase in login failures while also being very low/nonexistent before. +If accurate, the activity of the cluster should closely match the increase in login failures while also being low/nonexistent before. If no cluster is available, click **Investigate in full screen** and review the targeted users/IPs for outliers. If the list is truncated, click **View in App & API Protection Traces Explorer** and run the investigation with the Traces explorer. For additional tools, see [Step 3.3: Investigation](#step-33-investigation). @@ -286,20 +285,20 @@ If the list is truncated, click **View in App & API Protection Traces Explorer** If the conclusion of the triage is that the signal is a false positive, you can flag it as a false positive and close it. -If the false positive was caused by a unique setting in your service, you can add suppression filters to silence false positives. +If the false positive was caused by a unique setting in your service, you might introduce suppression filters to silence them out. -**If the signal is legitimate**, move to step [Step 3.2: Preliminary response](#step-3.2:-disrupting-the-attacker-as-a-preliminary-response). +**If the signal is legitimate**, move to step [Step 3.2: Preliminary response](#step-32-disrupting-the-attacker-as-a-preliminary-response). ### Step 3.2: Disrupting the attacker as a preliminary response -If the attack is ongoing, you might want to disrupt the attacker as you investigate further. 
Disrupting the attacker will slow down the attack and reduce the number of compromised accounts. +If the attack is ongoing, you might want to disrupt the attacker as you investigate further. Disrupting the attacker slows down the attack and reduce the number of compromised accounts. **Note:** This is a common step, although you might want to skip this step in the following circumstances: -* The accounts have little immediate value. You can block these post-compromise without causing harm. -* You want to maintain maximum visibility into the attack by avoiding any action that alerts the attacker to the investigation and causes them to change tactics. +* Accounts aren't immediately valuable: you can block compromised accounts after the fact with no negative consequences. +* You want to maintain the maximum visibility on the attack by avoiding preventing the attacker from learning that an investigation is ongoing and changing their strategy to something more difficult to track. -Enforcing this preliminary response requires that [Remote Configuration][11] is enabled for your services. +Enforcing this preliminary response requires [Remote Configuration][11] is enabled for your services. If you want to initiate a partial response, do the following: @@ -308,15 +307,16 @@ If you want to initiate a partial response, do the following: The attackers are likely using a small number of IPs. To block them, open the signal and use Next Steps. You can set the duration of blocking. -Datadog recommends a duration of **12h**, which is typically sufficient for the attack to stop and helps avoid blocking legitimate users if those IPs are later recycled. Permanent blocking is not recommended. -You can also block compromised users, although a better approach would be to extract them and reset their credentials using your own systems. +Datadog recommend **12h**, which is enough for the attack to stop and avoid blocking legitimate users when, after the attack, those IPs get recycled to legitimate users. Datadog does not recommend permanent blocking. +You can also block compromised users, although a better approach would be to extract them and reset their credentials using your own systems. + Finally, you can introduce automated IP blocking while running your investigation. {{% /tab %}} {{% tab "Distributed Credential Stuffing" %}} -These attacks often use a large number of disposable IPs. Due to Datadog’s latency, it’s impractical to block login attempts by blocking the IP before the attacker drops it from their pool. +Those attacks often rely on a large number of disposable IPs. The latency from the Datadog platform makes it impractical to block login attempts by blocking the IP before the IP gets dropped from the attacker's pool. Instead, block traits of the request that are unique to the malicious attempt (a user agent, a specific header, a fingerprint, etc.). @@ -327,8 +327,8 @@ Before blocking, Datadog recommends that you review the activity from the cluste The questions you're trying to answer are: - Is the traffic malicious? -- Will a meaningful volume of legitimate traffic be caught? -- Will blocking based on this cluster be effective? +- Can a meaningful volume of legitimate traffic be caught? +- Can blocking based on this cluster be effective? To do so, select your cluster and click on **Explore clusters**. @@ -349,7 +349,7 @@ In a different example, the activity from the cluster started with the attack. 
T -After confirming that the traits match the attackers, you can push an In-App WAF rule that will block requests matching those traits. Currently, this is supported for user agent-based traits only. +After confirming that the traits match the attackers, you can push an In-App WAF rule to block requests matching those traits. This is supported for user agent-based traits only. To create the rule, do the following: @@ -359,9 +359,9 @@ To create the rule, do the following: 4. Configure the conditions of the rule. In this example, the user agent is used. If you want to block a specific user agent, you can paste it with the operator `matches value in list`. If you want more flexibility, you can also use a regex. 5. Use the **Preview matching traces** section as a final review of the impact of the rule. -If no unexpected traces appear, select a blocking mode and save the rule. The system pushes the response to tracers automatically, and blocked traces will soon appear in the Trace Explorer. +If no unexpected traces are shown, select a blocking mode and proceed to save the rule. The response is automatically pushed to tracers. Blocked traces appear in the Trace Explorer. -Multiple blocking actions are available, each more or less obvious. Depending on the sophistication of the attackers, you might want a more stealthy answer so that they don't immediately realize they were blocked. +Multiple blocking actions are available. Depending on the sophistication of the attackers, you might want a more stealthy answer so that they don't immediately realize they were blocked. {{% /tab %}} @@ -372,8 +372,8 @@ Multiple blocking actions are available, each more or less obvious. Depending on When you have [disrupted the attacker as a preliminary response](#step-3.2:-disrupting-the-attacker-as-a-preliminary-response), you can identify the following: - Accounts compromised by the attackers so you can reset their credentials. -- Hints about the source of the targeted accounts, which you can use for proactive password resets or higher scrutiny. -- Data on the attacker infrastructure, which you can use to catch future attempts or other malicious activity (credit card stuffing, abuse, etc.). +- Hints about the source of the targeted accounts you can use for proactive password reset or higher scrutiny. +- Data on the attacker infrastructure you can use to catch future attempts or other malicious activity (credit card stuffing, abuse, etc.). The first step is to isolate the attacker activity from the overall traffic of the application. @@ -390,7 +390,7 @@ Next, start by isolating the attack's activity. {{< tabs >}} {{% tab "Bruteforce" %}} -Extract the list of targeted users by going to [Signals][26]. +Extract the list of targeted users by going to [Signals][1]. @@ -398,13 +398,15 @@ To craft a query to review all the activity from targeted users, follow this tem `@appsec.security_activity:business_logic.users.login.* @appsec.events_data.usr.login:()` -Successful logins should be considered very suspicious. +Successful logins should be considered suspicious. + +[1]: https://app.datadoghq.com/security {{% /tab %}} {{% tab "Credential Stuffing" %}} -This signal flagged a lot of activity coming from a few IPs and is closely related to its distributed variant. You might need to use the distributed credential stuffing method. +This signal flagged a lot of activity coming from a few IPs. This signal is closely related to its distributed variant. You might need to use the distributed credential stuffing method. 
Start by extracting a list of suspicious IPs from the signal side panel @@ -414,13 +416,13 @@ To craft a query to review all the activity from suspected IPs, follow this temp `@appsec.security_activity:business_logic.users.login.* @http.client_ip:()` -Successful logins should be considered highly suspicious. +Successful logins should be considered suspicious. {{% /tab %}} {{% tab "Distributed Credential Stuffing" %}} -This signal flagged a large increase in login failures in one service. If the attack is large enough, this signal might trigger both Bruteforce or Credential Stuffing signals. The signal is also able to detect very diffuse attacks. +This signal flagged a large increase in login failures in one service. If the attack is large enough, this signal might trigger both Bruteforce or Credential Stuffing signals. The signal is also able to detect diffuse attacks. In the diffuse attacks case, attacker attributes are available in the signal. @@ -429,11 +431,11 @@ In the diffuse attacks case, attacker attributes are available in the signal. 1. Click **Investigate in full screen**. 2. In **Attacker Attributes**, select the cluster and click, then, in **Traces**, click **View in ASM Protection Trace Explorer**. -This takes you to the Trace Explorer with filters set to the flagged attributes. You can start the investigation with the current query, but will likely want to expand it to also match login successes on top of the failures. Review the exhaustiveness/accuracy of the filter using the technique described above (in the paragraph before the table). +This gets you to the trace explorer with filters set to the flagged attributes. You can start the investigation with the current query, but you should expand it to also match login successes on top of the failures. Review the exhaustiveness/accuracy of the filter using the technique described above (in the paragraph before the table). -In the case where those attributes are inaccurate/incomplete, you may try to identify further traits to isolate the attacker activity. The traits that Datadog historically found the most useful are: +In the case those attributes are inaccurate/incomplete, you may try to identify further traits to isolate the attacker activity. The most useful traits are: 1. User agent: `@http.user_agent` 2. ASN: `@http.client_ip_details.as.domain` @@ -475,7 +477,7 @@ As your investigation progresses, you can go back and forth between this step an ### Step 3.4: Response -Datadog's investigation capabilities are enriched by data from our Backend, which isn't available to the library running the response. Because of that, not all fields are compatible with enforcing a response. +Datadog's investigation capabilities are enriched by data from its backend, which isn't available to the library running the response. Because of that, not all fields are compatible with enforcing a response. Motivated attackers try to circumvent your response as soon as they become aware of it. In anticipation of this approach, do the following: @@ -487,7 +489,7 @@ You can either use Datadog's built-in blocking capabilities to deny any request ### Datadog blocking -Users whose traffic is blocked by Datadog are shown a **You're blocked** page or receive a custom status code, such as a redirect. Blocking can be applied through two mechanisms, each with different performance characteristics: the Denylist and custom WAF rules. 
+Users that are part of traffic blocked by Datadog see a **You're blocked** page, or they receive a custom status code, such as a redirection. Blocking can be applied through two mechanisms, each with different performance characteristics: the Denylist and custom WAF rules. @@ -615,11 +617,11 @@ When you have a list, review it for common attributes: - If all users are coming from one region or one customer. - A large majority of users share any known compromise (use the [Have I Been Pwned][32] API). -When the source of the database is identified, proactively force a password reset of those customers or flag them as higher risk. This will increase confidence that future suspicious logins were indeed compromised. +When the source of the database is identified, proactively force a password reset of those customers or flag them as higher risk. This increases confidence that future suspicious logins were indeed compromised. #### Review additional attacker activity -Leveraging the signature from the attacker, expand filters to look at what non-login activity they performed. +Leveraging the signature from the attacker, expand filters to look at what non-login activity they performed. This filter can be less accurate. For example, a filter that matches the signature of a mobile application with legitimate traffic but that was cloned by the attacker for their attack. The filter might show research done by the attacker ahead of time, and share hints on what the attacker may be looking to do next. @@ -645,7 +647,7 @@ This is general guidance. Depending on your applications and environments, there [7]: /security/application_security/threats/setup/threat_detection/ [8]: https://app.datadoghq.com/security/appsec/traces?query=%40appsec.security_activity%3Abusiness_logic.users.login.%2A&agg_m=count&agg_m_source=base&agg_t=count&fromUser=false&track=appsecspan&start=1735036164646&end=1735640964646&paused=false [9]: https://docs.datadoghq.com/security/application_security/threats/add-user-info/?tab=set_user#disabling-user-activity-event-tracking -[10]: https://docs.datadoghq.com/security/application_security/threats/add-user-info/?tab=set_user#adding-business-logic-information-login-success-login-failure-any-business-logic-to-traces +[10]: /security/application_security/threats/add-user-info/?tab=set_user#adding-business-logic-information-login-success-login-failure-any-business-logic-to-traces [11]: https://docs.datadoghq.com/agent/remote_config/?tab=configurationyamlfile [12]: https://app.datadoghq.com/organization-settings/remote-config?resource_type=agents [13]: https://docs.datadoghq.com/security/application_security/threats/add-user-info/?tab=set_user#tracking-business-logic-information-without-modifying-the-code @@ -665,5 +667,5 @@ This is general guidance. 
Depending on your applications and environments, there [27]: https://app.datadoghq.com/security/appsec/denylist [28]: https://ddstaging.datadoghq.com/security/appsec/in-app-waf?column=services-count&config_by=custom-rules [30]: https://docs.datadoghq.com/security/application_security/threats/inapp_waf_rules/ -[31]: https://docs.datadoghq.com/api/latest/spans/#aggregate-spans +[31]: /api/latest/spans/#aggregate-spans [32]: https://haveibeenpwned.com/ \ No newline at end of file From faf313cfbf17faf8a8317443f71ff838bb828b5d Mon Sep 17 00:00:00 2001 From: Michael Cretzman <58786311+michaelcretzman@users.noreply.github.com> Date: Thu, 3 Apr 2025 18:13:37 -0400 Subject: [PATCH 07/15] Discard changes to .gitignore --- .gitignore | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/.gitignore b/.gitignore index a6d84e0019eec..4c77e458b43c2 100644 --- a/.gitignore +++ b/.gitignore @@ -276,4 +276,4 @@ local/bin/py/submit_github_status_check.py !local/bin/py/build/configuration/integration_merge.yaml # Cursor editor configuration -.cursor/ +.cursor/ \ No newline at end of file From 3091a4cef920301d62bc96eb05b1aa9bef19e9a9 Mon Sep 17 00:00:00 2001 From: Michael Cretzman <58786311+michaelcretzman@users.noreply.github.com> Date: Thu, 3 Apr 2025 18:30:40 -0700 Subject: [PATCH 08/15] Apply suggestions from code review additional edits from peer reviewer Co-authored-by: DeForest Richards <56796055+drichards-87@users.noreply.github.com> Co-authored-by: Taiki --- .../guide/manage_account_theft_appsec.md | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/content/en/security/application_security/guide/manage_account_theft_appsec.md b/content/en/security/application_security/guide/manage_account_theft_appsec.md index 4490a56703daa..c8f8ba57328fb 100644 --- a/content/en/security/application_security/guide/manage_account_theft_appsec.md +++ b/content/en/security/application_security/guide/manage_account_theft_appsec.md @@ -97,8 +97,8 @@ To validate that login metadata is collected, do the following: In the event of a **false** user (`usr.exists:false`), look for the following issues: -- A single event: if the trace contains multiple login events, such as both successes and failures, this might be caused by incorrect auto-instrumentation. To change auto-instrumentation, go to [Step 1.5: Manually instrumenting your services](#step-1.5:-manually-instrumenting-your-services). -- Does the event contain the mandatory metadata? It might appear as a user attribution section in the case of a login success. The mandatory metadata is `usr.login` and `usr.exists` in the case of login failure, and `usr.login` and `usr.id` in the case of login success. If some metadata is missing, go to [Step 1.5: Manually instrumenting your services](#step-1.5:-manually-instrumenting-your-services). +- A single event: if the trace contains multiple login events, such as both successes and failures, this might be caused by incorrect auto-instrumentation. To change auto-instrumentation, go to [Step 1.5: Manually instrumenting your services](#step-15-manually-instrumenting-your-services). +- Does the event contain the mandatory metadata? It might appear as a user attribution section in the case of a login success. The mandatory metadata is `usr.login` and `usr.exists` in the case of login failure, and `usr.login` and `usr.id` in the case of login success. If some metadata is missing, go to [Step 1.5: Manually instrumenting your services](#step-15-manually-instrumenting-your-services). 
**If the instrumentation is correct, go to [Phase 2: Preparing for Account Takeover campaigns](#phase-2-preparing-for-ato-campaigns).** @@ -199,7 +199,7 @@ To configure automatic blocking, do the following: 1. Go to **ASM** > **Protection** > [Detection Rules][23]. 2. In **Search**, enter `tag:"category:account_takeover"`. 3. Open the rules where you want to turn on blocking. Datadog recommends turning IP blocking on for **High** or **Critical** severity. -4. In the rule, in **Define Conditions**, in **Security Responses**, enable **IP automated blocking**. +4. In the rule, in **Define Conditions**, in **Security Responses**, enable **IP automated blocking**. You may also enable **User automated blocking**. You can control the blocking behavior per condition. Each rule can have multiple conditions based on your confidence and the attack success. **Datadog does not recommend permanent blocking of IP addresses**. Attackers are unlikely to reuse IPs and permanent blocking could result in blocking users. Moreover, ASM has a limit of how many IPs it can block (`~10000`), and this could fill this list with unnecessary IPs. @@ -307,7 +307,7 @@ If you want to initiate a partial response, do the following: The attackers are likely using a small number of IPs. To block them, open the signal and use Next Steps. You can set the duration of blocking. -Datadog recommend **12h**, which is enough for the attack to stop and avoid blocking legitimate users when, after the attack, those IPs get recycled to legitimate users. Datadog does not recommend permanent blocking. +Datadog recommends **12h**, which is enough for the attack to stop and avoid blocking legitimate users when, after the attack, those IPs get recycled to legitimate users. Datadog does not recommend permanent blocking. You can also block compromised users, although a better approach would be to extract them and reset their credentials using your own systems. Finally, you can introduce automated IP blocking while running your investigation. From ced78061f16e8dae4a5e249d76838255ad717563 Mon Sep 17 00:00:00 2001 From: Michael Cretzman Date: Fri, 4 Apr 2025 14:11:42 -0700 Subject: [PATCH 09/15] incorporating the tech and peer edits again they were lost when I used Visual Studio Code to manage them. I went thru each one and did it manually in this commit. --- .../guide/manage_account_theft_appsec.md | 102 +++++++++--------- 1 file changed, 52 insertions(+), 50 deletions(-) diff --git a/content/en/security/application_security/guide/manage_account_theft_appsec.md b/content/en/security/application_security/guide/manage_account_theft_appsec.md index c8f8ba57328fb..427eca8de0fec 100644 --- a/content/en/security/application_security/guide/manage_account_theft_appsec.md +++ b/content/en/security/application_security/guide/manage_account_theft_appsec.md @@ -50,24 +50,24 @@ This step describes how to set up your service to use ASM. To enable ASM on your login service, ensure you meet the following requirements: -* Similarly to Datadog APM, ASM requires a library integration in your services and a running Datadog agent. +* Similarly to Datadog APM, ASM requires a library integration in your services and a running Datadog Agent. * ASM generally benefits from using the newest library possible; however, minimum supported versions are documented in [Compatibility Requirements][3]. * At a minimum, **Threat Detection** must be enabled. Ideally, **Automatic user activity event tracking** should be enabled as well. 
To enable ASM using a new deployment, use the `APPSEC_ENABLED` environment variable/library configuration or [Remote Configuration][11]. You can use either method, but Remote Configuration can be set up using the Datadog UI. -To enable ASM using Remote Configuration, do the following: +**To enable ASM using Remote Configuration**, and without having to restart your services, do the following: -1. Go to [Remote Configuration][5]. +1. Go to [ASM onboarding][5]. 2. Click **Get Started with ASM**. -3. In **Threat Management**, click **Select Services.** +3. In **Activate on services already monitored by Datadog**, click **Select Services.** 4. Select your service(s), and then click **Next** and proceed with the setup instructions. When you see traces from your service in [ASM Traces][6], move to [Step 1.3: Validating login information is automatically collected](#step-1.3:-validating-login-information-is-automatically-collected). For more detailed instructions on using a new deployment, see [Enabling ASM Threat Detection using Datadog Tracing Libraries][7]. -### Step 1.3: Validating login information is automatically collected {#step-1.3:-validating-login-information-is-automatically-collected} +### Step 1.3: Validating login information is automatically collected After you have enabled ASM, you can validate that login information is collected by Datadog. @@ -76,25 +76,29 @@ After you have enabled ASM, you can validate that login information is collected To validate login information is collected, do the following: 1. Go to [Traces][8] in ASM. -2. Look for traces tagged with login activity from your login service. For example, in **Search for**, you might have `@appsec.security\_activity:business\_logic.users.login.\*`. +2. Look for traces tagged with login activity from your login service. For example, in **Search for**, you might have `@appsec.security\activity:business\logic.users.login.*`. 3. Check if all your login services are reporting login activity. You can see this in the **Service** facet. -**If you don't see login activity from a service**, go to [Step 1.5: Manually instrumenting your services](#step-1.5:-manually-instrumenting-your-services). +**If you don't see login activity from a service**, go to [Step 1.5: Manually instrumenting your services](#step-15-manually-instrumenting-your-services). -### Step 1.4: Validating login metadata is automatically collected {#step-1.4:-validating-login-metadata-is-automatically-collected} +### Step 1.4: Validating login metadata is automatically collected To validate that login metadata is collected, do the following: 1. Go to [Traces][8] in ASM. -2. Look for traces tagged with successful and failed login activity from your login service. For example, in **Search for**, you might have all. +2. Look for traces tagged with successful and failed login activity from your login service. You can update the search query in **Search for** to filter `business_logic.users.login.success` or `business_logic.users.login.failure`. 3. Open a trace. -4. In the trace details, is the **Security** tab, review **Business Logic Event**. +4. On the **Security** tab, review the **Business Logic Event**. 5. Check if the event has a false user. +Review a few traces, both login successes and login failures. For login failures, look for traces with `usr.exists` as `true` (failed login attempt by an existing user) and `false`. + +The checks must be done whether or not the user exists. 
+ In the event of a **false** user (`usr.exists:false`), look for the following issues: - A single event: if the trace contains multiple login events, such as both successes and failures, this might be caused by incorrect auto-instrumentation. To change auto-instrumentation, go to [Step 1.5: Manually instrumenting your services](#step-15-manually-instrumenting-your-services). @@ -112,8 +116,8 @@ To manually instrument your services, do the following: 1. If auto-instrumentation is providing incorrect data (multiple events in a single trace), see [Disable auto-instrumentation][9]. 2. For detailed instrumentation instructions for each language, go to [Adding business logic information (login success, login failure, any business logic) to traces][10]. Make sure to add the following metadata: - * `usr.login`: **Mandatory for login success and failure**. This field contains the *name* used to log into the account. The name might be an email address, a phone number, a username, or something else. The purpose of this field is to identify targeted accounts even if they don't exist in your systems because a user might be able to change those accounts. - * `usr.exists`: **Recommended for login failures**. This field is required for some default detections. The field helps to lower the priority of attempts targeted at accounts that don't exist in your systems. + * `usr.login`: **Mandatory for login success and failure**. This field contains the *name* used to log into the account. The name might be an email address, a phone number, a username, or something else. The purpose of this field is to identify targeted accounts even if they don't exist in your systems because a user might be able to change those accounts. Also, this field provides information on the location of the database used by the attacker. This value shouldn't be confused with `usr.id`. + * `usr.exists`: **Mandatory for login failures**. This field is required for some default detections. The field helps to lower the priority of attempts targeted at accounts that don't exist in your systems. * `usr.exists`: **Mandatory for login failures**. This field is required for some default detections. The field helps to lower the priority of attempts targeted at accounts that don't exist in your systems. **After deploying the code, validate the instrumentation is correct by following the steps in** [Step 1.4: Validating login metadata is automatically collected](#step-1.4:-validating-login-metadata-is-automatically-collected). @@ -122,7 +126,7 @@ To manually instrument your services, do the following: ASM can use custom In-App WAF rules to flag login attempts and extract the metadata from the request needed by detection rules. -This approach requires that [Remote Configuration][11] is enabled and working. Verify Remote Configuration is running in [Remote Configuration][12]. +This approach requires that [Remote Configuration][11] is enabled and working. Verify Remote Configuration is running for this service in [Remote Configuration][12]. To use custom In-App WAF rules, do the following: @@ -130,7 +134,7 @@ To use custom In-App WAF rules, do the following: 2. Name your rule and select the **Business Logic** category. 3. Set the rule type as `users.login.failure` for login failures and `users.login.success` for login successes. -4. Select your service and write the rule to match the login attempts. Typically, you match the method (`POST`), the URI with a regex (`^/login`) and the status code (403 for failures, 302 or 200 for success). +4. 
Select your service and write the rule to match the login attempts. Typically, you match the method (`POST`), the URI with a regex (`^/login`), and the status code (403 for failures, 302 or 200 for success). 5. Collect the tags required by detection rules. The most important tag is `usr.login`. Assuming the login was provided in the request, you can add a condition and set `store value as tag` as the operator. 6. Select a specific user parameter as an input, either in the body or the query. @@ -144,7 +148,7 @@ For more details, see [Tracking business logic information without modifying the ## Phase 2: Preparing for ATO campaigns -After setting up instrumentation for your services, ASM monitord for attack campaigns. You can review the monitoring in the [Attacks overview][14] **Business logic** section. +After setting up instrumentation for your services, ASM monitors for attack campaigns. You can review the traffic in the [Attacks overview][14] **Business logic** section. @@ -152,45 +156,43 @@ ASM detects [multiple attacker strategies][15]. Upon detecting an attack with a The severity of the signal is set based on the urgency of the threat: from **Low** in case of unsuccessful attacks to **Critical** in case of successful account compromises. -To fully leverage detections, take the following actions. +The actions covered in the next sections help you to identify and leverage detections faster. ### Step 2.1: Configuring notifications -[Notifications][17] provide warnings when a signal is triggered. - -To create a notification rule using the [Create a new rule][18] setting, do the following: +[Notifications][17] provide a warning on your preferred channel when a signal is triggered. To create a notification rule, do the following: 1. Open [Create a new rule][18]. 2. Enter a name for the rule. -3. Select **Signal** and remove all entries except **ASM**. -4. Restrict the rule to `category:account_takeover.` +3. Select **Signal** and remove all entries except **Application Security**. +4. Restrict the rule to `category:account_takeover`, and expand the severities to include `Medium`. 5. Add notification recipients (Slack, Teams, PagerDuty). To learn more, see [Notification channels][19]. -6. Save the rule. +6. Test, and then save the rule. The notification is sent the next time a signal is generated. ### Step 2.2: Validate proper data propagation -In microservice environments, services are generally reached by internal hosts running other services. This internal environment makes it challenging to identify the unique traits of the attacker, such as IP, user agent, fingerprint, etc., and validate the data. +In microservice environments, services are generally reached by internal hosts running other services. This internal environment makes it challenging to identify the unique traits of the original attacker's request, such as IP, user agent, fingerprint, etc. -[ASM Traces][20] can help to validate the data by exposing the source IPs and user agent traffic. - -To validate the data, do the following: - -1. Review login traces in the [Traces][21] and check for the following: +[ASM Traces][20] can help you validate that the login event is properly tagged with the source IPs, user agent, etc. To validate, review login traces in [Traces][21] and check for the following: + * Source IPs (`@http.client_ip`) are varied and public IPs. * **Problem:** If login attempts are coming from a few IPs only, this might be a proxy that you can't block without risking availability. 
* **Solution:** Forward the client IP of the initial request through a HTTP header, such as `X-Forwarded-For`. You can use a custom header for [better security][22] and configure the tracer to read it using the `DD_TRACE_CLIENT_IP_HEADER` environment variable. * The user agent (`@http.user_agent`) is consistent with the expected traffic (web browser, mobile app, etc.) * **Problem:** The user agent could be replaced by the user agent in the calling microservice network library. * **Solution:** Use the client user agent when calling subsequent services. +* Multiple headers are populated. You can see this in a trace's **See more details** in the **Request** block. + * **Problem:** Normal request headers (for example, `accept-encoding`) aren't forwarded to the instrumented service. This impairs the generation of fingerprints (`@appsec.fingerprint.*`) and degrades the signal's ability to isolate an attacker's activity. + * **Solution:** Forward those headers when calling a subsequent microservice. ### Step 2.3: Configure automatic blocking -**Before you begin:** Verify that the IP addresses are properly configured, as described in [Step 2.2: Validate proper data propagation](#step-2.2:-validate-proper-data-propagation). +
Before you begin: Verify that the IP addresses are properly configured, as described in Step 2.2: Validate proper data propagation.
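The propagation checks referenced in this note are easier to reason about with a concrete example. The sketch below shows one way an internal service might forward the original client's IP and identifying headers when it calls the login service, so that the login service's tracer (for example, configured with `DD_TRACE_CLIENT_IP_HEADER`) attributes each attempt to the real source. The custom header name, internal URL, and Flask-style `original_request` object are assumptions for illustration only.

```python
# Illustrative sketch: an internal gateway forwarding the end user's IP and
# headers to the login service. The login service's tracer would be configured
# to read the custom header (for example, DD_TRACE_CLIENT_IP_HEADER=x-real-client-ip).
# The header name and URL are placeholders for your own setup.
import requests

def forward_login(original_request, payload):
    headers = {
        # Custom header carrying the end user's IP; harder to spoof than
        # X-Forwarded-For if your edge strips it from inbound traffic.
        "x-real-client-ip": original_request.remote_addr,
        # Preserve the original client's headers so ASM fingerprints stay meaningful.
        "User-Agent": original_request.headers.get("User-Agent", ""),
        "Accept-Encoding": original_request.headers.get("Accept-Encoding", ""),
        "Accept-Language": original_request.headers.get("Accept-Language", ""),
    }
    return requests.post(
        "http://login-service.internal/login",
        json=payload,
        headers=headers,
        timeout=5,
    )
```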
-ASM automatic blocking can be used to block attacks at any time of the day. Automatic blocking can help block attacks before your team members are online, providing security during off hours. +ASM automatic blocking can be used to block attacks at any time of the day. Automatic blocking can help block attacks before your team members are online, providing security during off hours. Within an ATO, automatic blocking can help mitigate the load issues caused by the increase in failed login attempts or prevent the attacker from using compromised accounts. You can configure automatic blocking to block IPs identified as part of an attack. This is only a partial remediation because attackers can change IPs; however, it can give you more time to implement comprehensive remediation. @@ -223,7 +225,7 @@ Many strategies are available, but it's important to understand that the value c 3. The actor buys access to a botnet, letting them leverage many different IPs to run their attack. There are extreme cases where large campaigns with 500k+ attempts were so distributed that Datadog saw an average of 1.01 requests per IP and a single attempt per account. 4. When valid credentials are discovered, they might be sold downstream to another actor to leverage them to some end such as financial theft, spam, abuse, etc. -Whenever an attack starts against your systems, signals are generated mentioning **Credential Stuffing**, **Distributed Credential Stuffing**, or **Bruteforce**. These signal terms are based on the strategy used by the attacker. +When an attack begins against your systems, the system generates signals labeled **Credential Stuffing**, **Distributed Credential Stuffing**, or **Bruteforce**, depending on the attacker's strategy. ### Step 3.1: Triage @@ -232,7 +234,7 @@ The first step is to confirm that the detection is correct. Certain behaviors, s {{< tabs >}} {{% tab "Bruteforce" %}} -The signal is looking for an attempt to steal a user account by trying many different passwords for this account. Generally, a small number of accounts are targeted by those campaigns. +The signal is looking for an attempt to steal a user account by trying many different passwords for this account. Generally, a small number of accounts are targeted by these campaigns. Review the accounts flagged as compromised. Click on a user to open a summary of recent activity. @@ -244,7 +246,7 @@ Questions for triage: If the answer to those questions is yes, the signal is likely legitimate. -You can adapt your response based on the sensitivity of the account (for example, a free account without much access/secrets vs admin account). +You can adapt your response based on the sensitivity of the account. For example, a free account with limited access versus an admin account. {{% /tab %}} @@ -285,7 +287,7 @@ If the list is truncated, click **View in App & API Protection Traces Explorer** If the conclusion of the triage is that the signal is a false positive, you can flag it as a false positive and close it. -If the false positive was caused by a unique setting in your service, you might introduce suppression filters to silence them out. +If the false positive was caused by a unique setting in your service, you can add suppression filters to silence false positives. **If the signal is legitimate**, move to step [Step 3.2: Preliminary response](#step-32-disrupting-the-attacker-as-a-preliminary-response). 
@@ -295,10 +297,10 @@ If the attack is ongoing, you might want to disrupt the attacker as you investig **Note:** This is a common step, although you might want to skip this step in the following circumstances: -* Accounts aren't immediately valuable: you can block compromised accounts after the fact with no negative consequences. -* You want to maintain the maximum visibility on the attack by avoiding preventing the attacker from learning that an investigation is ongoing and changing their strategy to something more difficult to track. +* The accounts have little immediate value. You can block these post-compromise without causing harm. +* You want to maintain maximum visibility into the attack by avoiding any action that alerts the attacker to the investigation and causes them to change tactics. -Enforcing this preliminary response requires [Remote Configuration][11] is enabled for your services. +Enforcing this preliminary response requires that [Remote Configuration][11] is enabled for your services. If you want to initiate a partial response, do the following: @@ -316,7 +318,7 @@ Finally, you can introduce automated IP blocking while running your investigatio {{% tab "Distributed Credential Stuffing" %}} -Those attacks often rely on a large number of disposable IPs. The latency from the Datadog platform makes it impractical to block login attempts by blocking the IP before the IP gets dropped from the attacker's pool. +These attacks often use a large number of disposable IPs. Due to Datadog's latency, it's impractical to block login attempts by blocking the IP before the attacker drops it from their pool. Instead, block traits of the request that are unique to the malicious attempt (a user agent, a specific header, a fingerprint, etc.). @@ -353,7 +355,7 @@ After confirming that the traits match the attackers, you can push an In-App WAF To create the rule, do the following: -1. Go to **ASM** > **In-App WAF** > [Custom Rules][24]. +1. Go to **ASM** > **In-App WAF** > [Custom Rules](https://app.datadoghq.com/security/appsec/in-app-waf?column=services-count&config_by=custom-rules). 2. Click **Create New Rule** and complete the configuration. 3. Select your login service (or a service where you want to block the requests). You can target blocking to the login route also. 4. Configure the conditions of the rule. In this example, the user agent is used. If you want to block a specific user agent, you can paste it with the operator `matches value in list`. If you want more flexibility, you can also use a regex. @@ -372,8 +374,8 @@ Multiple blocking actions are available. Depending on the sophistication of the When you have [disrupted the attacker as a preliminary response](#step-3.2:-disrupting-the-attacker-as-a-preliminary-response), you can identify the following: - Accounts compromised by the attackers so you can reset their credentials. -- Hints about the source of the targeted accounts you can use for proactive password reset or higher scrutiny. -- Data on the attacker infrastructure you can use to catch future attempts or other malicious activity (credit card stuffing, abuse, etc.). +- Hints about the source of the targeted accounts, which you can use for proactive password resets or higher scrutiny. +- Data on the attacker infrastructure, which you can use to catch future attempts or other malicious activity (credit card stuffing, abuse, etc.). The first step is to isolate the attacker activity from the overall traffic of the application. 
@@ -406,7 +408,7 @@ Successful logins should be considered suspicious. {{% tab "Credential Stuffing" %}} -This signal flagged a lot of activity coming from a few IPs. This signal is closely related to its distributed variant. You might need to use the distributed credential stuffing method. +This signal flagged a lot of activity coming from a few IPs and is closely related to its distributed variant. You might need to use the distributed credential stuffing method. Start by extracting a list of suspicious IPs from the signal side panel @@ -646,17 +648,17 @@ This is general guidance. Depending on your applications and environments, there [6]: https://app.datadoghq.com/security/appsec/traces?query=&agg_m=count&agg_m_source=base&agg_t=count&fromUser=false&track=appsecspan&start=1735036043639&end=1735640843639&paused=false [7]: /security/application_security/threats/setup/threat_detection/ [8]: https://app.datadoghq.com/security/appsec/traces?query=%40appsec.security_activity%3Abusiness_logic.users.login.%2A&agg_m=count&agg_m_source=base&agg_t=count&fromUser=false&track=appsecspan&start=1735036164646&end=1735640964646&paused=false -[9]: https://docs.datadoghq.com/security/application_security/threats/add-user-info/?tab=set_user#disabling-user-activity-event-tracking +[9]: /security/application_security/threats/add-user-info/?tab=set_user#disabling-user-activity-event-tracking [10]: /security/application_security/threats/add-user-info/?tab=set_user#adding-business-logic-information-login-success-login-failure-any-business-logic-to-traces -[11]: https://docs.datadoghq.com/agent/remote_config/?tab=configurationyamlfile +[11]: /agent/remote_config/?tab=configurationyamlfile [12]: https://app.datadoghq.com/organization-settings/remote-config?resource_type=agents -[13]: https://docs.datadoghq.com/security/application_security/threats/add-user-info/?tab=set_user#tracking-business-logic-information-without-modifying-the-code +[13]: /security/application_security/threats/add-user-info/?tab=set_user#tracking-business-logic-information-without-modifying-the-code [14]: https://app.datadoghq.com/security/appsec/threat -[15]: https://docs.datadoghq.com/security/account_takeover_protection/#attacker-strategies +[15]: /security/account_takeover_protection/#attacker-strategies [16]: https://app.datadoghq.com/security/appsec/detection-rules?query=type%3Aapplication_security%20tag%3A%22category%3Aaccount_takeover%22&deprecated=hide&groupBy=none&sort=date&viz=rules -[17]: https://docs.datadoghq.com/security/notifications/ +[17]: /security/notifications/ [18]: https://app.datadoghq.com/security/configuration/notification-rules/new?notificationData= -[19]: https://docs.datadoghq.com/security/notifications/#notification-channels +[19]: /security/notifications/#notification-channels [20]: https://app.datadoghq.com/security/appsec/traces?query=%40appsec.security_activity%3Abusiness_logic.users.login.%2A&agg_m=count&agg_m_source=base&agg_t=count&fromUser=false&track=appsecspan&start=1735222832468&end=1735827632468&paused=false [21]: https://app.datadoghq.com/security/appsec/traces?query=%40appsec.security_activity%3Abusiness_logic.users.login.%2A&agg_m=count&agg_m_source=base&agg_t=count&fromUser=false&track=appsecspan&start=1735222832468&end=1735827632468&paused=false [22]: https://securitylabs.datadoghq.com/articles/challenges-with-ip-spoofing-in-cloud-environments/#what-should-you-do @@ -665,7 +667,7 @@ This is general guidance. 
Depending on your applications and environments, there [25]: https://app.datadoghq.com/security/appsec/traces [26]: https://app.datadoghq.com/security [27]: https://app.datadoghq.com/security/appsec/denylist -[28]: https://ddstaging.datadoghq.com/security/appsec/in-app-waf?column=services-count&config_by=custom-rules -[30]: https://docs.datadoghq.com/security/application_security/threats/inapp_waf_rules/ +[28]: https://app.datadoghq.com/security/appsec/in-app-waf?column=services-count&config_by=custom-rules +[30]: /security/application_security/threats/inapp_waf_rules/ [31]: /api/latest/spans/#aggregate-spans [32]: https://haveibeenpwned.com/ \ No newline at end of file From 8c4e77f7e4fae43261b4cc5181371289d13d3b90 Mon Sep 17 00:00:00 2001 From: Michael Cretzman <58786311+michaelcretzman@users.noreply.github.com> Date: Mon, 7 Apr 2025 11:46:18 -0700 Subject: [PATCH 10/15] Apply suggestions from code review Committing some dev edits Co-authored-by: Taiki --- .../guide/manage_account_theft_appsec.md | 19 ++++++++++--------- 1 file changed, 10 insertions(+), 9 deletions(-) diff --git a/content/en/security/application_security/guide/manage_account_theft_appsec.md b/content/en/security/application_security/guide/manage_account_theft_appsec.md index 427eca8de0fec..7706307388176 100644 --- a/content/en/security/application_security/guide/manage_account_theft_appsec.md +++ b/content/en/security/application_security/guide/manage_account_theft_appsec.md @@ -241,7 +241,7 @@ Review the accounts flagged as compromised. Click on a user to open a summary of Questions for triage: * Has there been a sharp increase of activity? -* Are the IPs attempting logins known? +* Is it the first time those IPs are attempting logins? * Are they flagged by threat intelligence? If the answer to those questions is yes, the signal is likely legitimate. @@ -256,14 +256,14 @@ This signal is looking for a large number of accounts with failed logins coming Review the accounts flagged as targeted. -If they share attributes, such as all coming from one institution, check whether the IP might be a proxy for this institution by reviewing its past activity. +If they share attributes, such as all coming from one institution, check whether the IP might be a proxy for this institution by reviewing its past activity by hovering over it and opening the side panel. Questions for triage: * Has there been a sharp increase of activity? * Are the accounts uncorrelated? * Are IPs flagged by threat intelligence? -* Are there much more login failures than successes ? +* Are there many more login failures than successes ? If the answer to those questions is yes, the signal is likely legitimate. You can adapt your response based on the scale of the attack and whether accounts are being compromised. @@ -279,7 +279,7 @@ Datadog tries to identify common attributes between the login failures in your s If accurate, the activity of the cluster should closely match the increase in login failures while also being low/nonexistent before. If no cluster is available, click **Investigate in full screen** and review the targeted users/IPs for outliers. -If the list is truncated, click **View in App & API Protection Traces Explorer** and run the investigation with the Traces explorer. For additional tools, see [Step 3.3: Investigation](#step-33-investigation). +If the list is truncated, click **View in ASM Protection Trace Explorer** and run the investigation with the Traces explorer. 
For additional tools, see [Step 3.3: Investigation](#step-33-investigation). {{% /tab %}} {{< /tabs >}} @@ -295,10 +295,11 @@ If the false positive was caused by a unique setting in your service, you can ad If the attack is ongoing, you might want to disrupt the attacker as you investigate further. Disrupting the attacker slows down the attack and reduce the number of compromised accounts. -**Note:** This is a common step, although you might want to skip this step in the following circumstances: +
**Note:** This is a common step, although you might want to skip this step in the following circumstances: * The accounts have little immediate value. You can block these post-compromise without causing harm. * You want to maintain maximum visibility into the attack by avoiding any action that alerts the attacker to the investigation and causes them to change tactics. +
Enforcing this preliminary response requires that [Remote Configuration][11] is enabled for your services. @@ -312,7 +313,7 @@ The attackers are likely using a small number of IPs. To block them, open the si Datadog recommends **12h**, which is enough for the attack to stop and avoid blocking legitimate users when, after the attack, those IPs get recycled to legitimate users. Datadog does not recommend permanent blocking. You can also block compromised users, although a better approach would be to extract them and reset their credentials using your own systems. -Finally, you can introduce automated IP blocking while running your investigation. +Finally, you can enable automated IP blocking from the Next Step section so that new IPs are automatically blocked while you're running your investigation. {{% /tab %}} @@ -328,7 +329,7 @@ Before blocking, Datadog recommends that you review the activity from the cluste The questions you're trying to answer are: -- Is the traffic malicious? +- Is the traffic malicious? Did this traffic exist before the beginning of th attack? - Can a meaningful volume of legitimate traffic be caught? - Can blocking based on this cluster be effective? @@ -371,7 +372,7 @@ Multiple blocking actions are available. Depending on the sophistication of the ### Step 3.3: Investigation -When you have [disrupted the attacker as a preliminary response](#step-3.2:-disrupting-the-attacker-as-a-preliminary-response), you can identify the following: +When you have [disrupted the attacker as a preliminary response](#step-32-disrupting-the-attacker-as-a-preliminary-response), you can identify the following: - Accounts compromised by the attackers so you can reset their credentials. - Hints about the source of the targeted accounts, which you can use for proactive password resets or higher scrutiny. @@ -381,7 +382,7 @@ The first step is to isolate the attacker activity from the overall traffic of t #### Isolate attacker activity -To isolate attacker activity, ensure that your current filters are exhaustive: +While isolating attacker activity, ensure that your current filters are exhaustive through two tests: 1. Go to [Traces][25], and then *exclude* traces so that the remaining traffic closely tracks your normal traffic. If you're still seeing a spike during the attack, it means further filters are necessary to comprehensively neutralize the attack. From 728c1215d69f5b025bc878a104ded4abd2cc2138 Mon Sep 17 00:00:00 2001 From: Michael Cretzman <58786311+michaelcretzman@users.noreply.github.com> Date: Mon, 7 Apr 2025 11:48:33 -0700 Subject: [PATCH 11/15] Apply suggestions from code review Edited revision Co-authored-by: Taiki --- .../application_security/guide/manage_account_theft_appsec.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/content/en/security/application_security/guide/manage_account_theft_appsec.md b/content/en/security/application_security/guide/manage_account_theft_appsec.md index 7706307388176..f6481b06e66c8 100644 --- a/content/en/security/application_security/guide/manage_account_theft_appsec.md +++ b/content/en/security/application_security/guide/manage_account_theft_appsec.md @@ -385,7 +385,7 @@ The first step is to isolate the attacker activity from the overall traffic of t While isolating attacker activity, ensure that your current filters are exhaustive through two tests: -1. Go to [Traces][25], and then *exclude* traces so that the remaining traffic closely tracks your normal traffic. 
If you're still seeing a spike during the attack, it means further filters are necessary to comprehensively neutralize the attack. +1. Go to [Traces][25], and then *exclude* traces based on the filters you identify. The goal is to have the remaining traffic volume similar to your normal traffic volume. If you're still seeing a spike of logins during the attack, it means further filters are necessary to comprehensively isolate the attack. 2. Look at the matching traffic over an expanded time frame (for example, if the attack lasted an hour, use one day). Any traffic before or after the attack is likely be a false positive. Next, start by isolating the attack's activity. From 40fbac26ac3c2ef47411d1d27b1a411ab50258cd Mon Sep 17 00:00:00 2001 From: Michael Cretzman <58786311+michaelcretzman@users.noreply.github.com> Date: Mon, 7 Apr 2025 12:07:10 -0700 Subject: [PATCH 12/15] Apply suggestions from code review last of dev edit Co-authored-by: Taiki --- .../guide/manage_account_theft_appsec.md | 18 +++++++++--------- 1 file changed, 9 insertions(+), 9 deletions(-) diff --git a/content/en/security/application_security/guide/manage_account_theft_appsec.md b/content/en/security/application_security/guide/manage_account_theft_appsec.md index f6481b06e66c8..d5bccbc079838 100644 --- a/content/en/security/application_security/guide/manage_account_theft_appsec.md +++ b/content/en/security/application_security/guide/manage_account_theft_appsec.md @@ -386,7 +386,7 @@ While isolating attacker activity, ensure that your current filters are exhausti 1. Go to [Traces][25], and then *exclude* traces based on the filters you identify. The goal is to have the remaining traffic volume similar to your normal traffic volume. If you're still seeing a spike of logins during the attack, it means further filters are necessary to comprehensively isolate the attack. -2. Look at the matching traffic over an expanded time frame (for example, if the attack lasted an hour, use one day). Any traffic before or after the attack is likely be a false positive. +2. Look at the traffic matching your filters over an expanded time frame (for example, if the attack lasted an hour, use one day). Any traffic matched before or after the attack is likely be a false positive. Next, start by isolating the attack's activity. @@ -397,13 +397,13 @@ Extract the list of targeted users by going to [Signals][1]. -To craft a query to review all the activity from targeted users, follow this template: +From this list of users, you can craft a [Traces][25] query to review all the activity from targeted users. Follow this template: `@appsec.security_activity:business_logic.users.login.* @appsec.events_data.usr.login:()` Successful logins should be considered suspicious. -[1]: https://app.datadoghq.com/security +[1]: https://app.datadoghq.com/security?query=%40workflow.rule.type%3A"Application%20Security"%20category%3Aaccount_takeover&product=appsec {{% /tab %}} @@ -492,7 +492,7 @@ You can either use Datadog's built-in blocking capabilities to deny any request ### Datadog blocking -Users that are part of traffic blocked by Datadog see a **You're blocked** page, or they receive a custom status code, such as a redirection. Blocking can be applied through two mechanisms, each with different performance characteristics: the Denylist and custom WAF rules. +Users that are part of the traffic blocked by Datadog see a **You're blocked** page, or receive a custom status code, such as a redirection. 
Blocking can be applied through two mechanisms, each with different performance characteristics: the Denylist and custom WAF rules. @@ -502,7 +502,7 @@ The [Denylist][27] is an efficient way to block a large number of entries, but i The Denylist can be managed and automated using the Datadog platform by clicking **Automate Attacker Blocking** in the signal. -Use the **Automate Attacker Blocking** or **Block All Attacking IPs** signal options to block all attacking IPs for a few hours, a week, or permanently. Similarly, you can block compromised users. +Use the **Automate Attacker Blocking** or **Block All Attacking IPs** signal options to block all attacking IPs for a few hours, a week, or permanently. Similarly, you can block compromised users. As a reminder, Datadog doesn't recommend blocking IPs permanently due to risks of blocking legitimate traffic after IPs get recycled into public pools. @@ -510,7 +510,7 @@ The blocking can be rescinded or extended from the [Denylist][27]. -If the signal wasn't accurate, you can extract the list and add it to the Denylist manually. +If the signal wasn't accurate, you can extract the list or users or IPs and add it to the Denylist manually. @@ -527,11 +527,11 @@ To create a new rule, do the following: 3. Follow the steps in **Define your custom rule**. 4. In **Select the services you want this rule to apply to**, select your login service, or whichever services where you want to block requests. You can also target the blocking to the login route. -1. In **If incoming requests match these conditions**, configure the conditions of the rule. The following example uses the user agent. +1. In **If incoming requests match these conditions**, configure the conditions of the rule. 1. If you want to block a specific user agent, you can paste it in **Values**. In **Operator**, you can use **matches value in list**, or if you want more flexibility, you can also use a **Matches RegEx**. 2. Use the **Preview matching traces** section as a final review of the rule's impact. If no unexpected traces are shown, select a blocking mode and save the rule. - The response is pushed to the Traces explorer automatically and blocked traces appear. +The response is pushed to tracers automatically and blocked traces appear in the [Traces explorer][25]. @@ -547,7 +547,7 @@ If a large-scale attack resumes, the Distributed Credential Stuffing signal shou * Persistent attackers often require multiple iterations of defensive measures before giving up. * The ideal defense is a robust blocking strategy that the attacker cannot circumvent. -* Attackers frequently attempt to evade detection by altering IPs and user agents. +* Attackers frequently attempt to evade detection by altering IPs and user agents. They're less likely to deeply modify the script they procured to send their login attempts so headers are a more resilient target. * Effective strategies include fingerprint-based or correlation methods that identify rare header combinations. * Monitor blocked traffic resulting from previous defensive responses. * Blocking attacker traffic may inadvertently block legitimate traffic. Implement mechanisms to unblock legitimate traffic, either adapt the Datadog response or ensure it is unblocked post attack. 
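To make the correlation idea in the list above more concrete, the sketch below counts header combinations across login-failure spans exported for an attack window and for a baseline window, then surfaces the most common ones. The file paths and field names are assumptions about how you export the data; the point is only to show how a header combination that dominates during the attack but is absent from normal traffic becomes a candidate for an In-App WAF condition.

```python
# Rough sketch: find rare header combinations that characterize the attack.
# Input format and field names are assumptions -- adapt them to however you
# export or query your login-failure spans.
import json
from collections import Counter

HEADER_KEYS = ("user_agent", "accept_encoding", "accept_language")

def header_signature(span: dict) -> tuple:
    http = span.get("http", {})
    return tuple(http.get(key, "") for key in HEADER_KEYS)

def top_signatures(path: str, limit: int = 10):
    with open(path) as f:
        spans = json.load(f)
    counts = Counter(header_signature(span) for span in spans)
    return counts.most_common(limit)

if __name__ == "__main__":
    # Signatures that dominate the attack window but do not appear in the
    # baseline are good candidates for a blocking condition.
    print("attack:", top_signatures("login_failures_attack.json"))
    print("baseline:", top_signatures("login_failures_baseline.json"))
```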
From fd06d64f68ef724f297b7cf592b108adc644bb5a Mon Sep 17 00:00:00 2001 From: Michael Cretzman <58786311+michaelcretzman@users.noreply.github.com> Date: Mon, 7 Apr 2025 12:18:27 -0700 Subject: [PATCH 13/15] Update content/en/security/application_security/guide/manage_account_theft_appsec.md Co-authored-by: Taiki --- .../application_security/guide/manage_account_theft_appsec.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/content/en/security/application_security/guide/manage_account_theft_appsec.md b/content/en/security/application_security/guide/manage_account_theft_appsec.md index d5bccbc079838..26665037aef3d 100644 --- a/content/en/security/application_security/guide/manage_account_theft_appsec.md +++ b/content/en/security/application_security/guide/manage_account_theft_appsec.md @@ -415,7 +415,7 @@ Start by extracting a list of suspicious IPs from the signal side panel -To craft a query to review all the activity from suspected IPs, follow this template: +From the list of IPs, you can craft a [Traces][25] query to review all the activity from suspected IPs. Follow this template: `@appsec.security_activity:business_logic.users.login.* @http.client_ip:()` From 0a084cc8dc7ef6b7a9ad717d634fc31ade79b2b6 Mon Sep 17 00:00:00 2001 From: Michael Cretzman <58786311+michaelcretzman@users.noreply.github.com> Date: Tue, 8 Apr 2025 11:49:38 -0700 Subject: [PATCH 14/15] Apply suggestions from code review incorp last dev edit pass Co-authored-by: Taiki --- .../guide/manage_account_theft_appsec.md | 25 ++++++++++--------- 1 file changed, 13 insertions(+), 12 deletions(-) diff --git a/content/en/security/application_security/guide/manage_account_theft_appsec.md b/content/en/security/application_security/guide/manage_account_theft_appsec.md index 26665037aef3d..8b230caf3e632 100644 --- a/content/en/security/application_security/guide/manage_account_theft_appsec.md +++ b/content/en/security/application_security/guide/manage_account_theft_appsec.md @@ -295,7 +295,7 @@ If the false positive was caused by a unique setting in your service, you can ad If the attack is ongoing, you might want to disrupt the attacker as you investigate further. Disrupting the attacker slows down the attack and reduce the number of compromised accounts. -
**Note:** This is a common step, although you might want to skip this step in the following circumstances: +
Note: This is a common step, although you might want to skip this step in the following circumstances: * The accounts have little immediate value. You can block these post-compromise without causing harm. * You want to maintain maximum visibility into the attack by avoiding any action that alerts the attacker to the investigation and causes them to change tactics. @@ -344,7 +344,7 @@ Those are two important indicators: Click an indicator to see further information about the cluster traffic. -In **Cluster Activity**, there is a visualization of the volume of the overall APM traffic matching this cluster. +In **Cluster Activity**, there is a visualization of the volume of the overall APM traffic matching this cluster. While comparing it to the ASM data, beware the scale, since APM data may be sampled while ASM's isn't. In the following example, a lot of traffic comes from before the attack. This means a legitimate activity matches this cluster in normal traffic and it would get blocked if you were to take action. You don't need to escalate or click **Block All Attacking IPs** in the signal. @@ -425,26 +425,26 @@ Successful logins should be considered suspicious. {{% tab "Distributed Credential Stuffing" %}} -This signal flagged a large increase in login failures in one service. If the attack is large enough, this signal might trigger both Bruteforce or Credential Stuffing signals. The signal is also able to detect diffuse attacks. +This signal flagged a large increase in login failures in one service. If the attack is large enough, this signal might also trigger either the Bruteforce or Credential Stuffing signals. The signal is also able to detect diffuse attacks more comprehensively. In the diffuse attacks case, attacker attributes are available in the signal. -1. Click **Investigate in full screen**. -2. In **Attacker Attributes**, select the cluster and click, then, in **Traces**, click **View in ASM Protection Trace Explorer**. +1. After opening the signal in the side panel, click **Investigate in full screen**. +2. In **Attacker Attributes**, select the cluster and click on **Filter this signal by selection**, then, in **Traces**, click **View in ASM Protection Trace Explorer**. -This gets you to the trace explorer with filters set to the flagged attributes. You can start the investigation with the current query, but you should expand it to also match login successes on top of the failures. Review the exhaustiveness/accuracy of the filter using the technique described above (in the paragraph before the table). +This gets you to the trace explorer with filters set to the flagged attributes. You can start the investigation with the current query, but you should expand it to also match login successes on top of the failures. You can do that by replacing `@appsec.security_activity:business_logic.users.login.failure` with `@appsec.security_activity:business_logic.users.login.*`. Review the exhaustiveness and accuracy of the filter using [the technique described above](#isolate-attacker-activity). -In the case those attributes are inaccurate/incomplete, you may try to identify further traits to isolate the attacker activity. The most useful traits are: +In the case those attributes are inaccurate or incomplete, you may try to identify further traits to isolate the attacker activity. The most useful traits are: 1. User agent: `@http.user_agent` 2. ASN: `@http.client_ip_details.as.domain` 3. Threat intelligence: `@threat_intel.results.category` 4. URL: `@http.url` -5. 
Fingerprint, when available: `@appsec.fingerpring.*` +5. Fingerprint, when available: `@appsec.fingerprint.*` @@ -461,13 +461,13 @@ Reviewing login successes and failures helps to identify the following: * What the attackers are after so that you can block them. * What the attackers are doing so that you can catch them, even if they change their scripts. -* How successful the attackers are so that you can take back the accounts where they took control and see how much time you have to react. +* How successful the attackers are so that you can take back the accounts they took control of and see how much time you have to react. When attacker activity is isolated, review login successes and consider the following questions: * Have any accounts been compromised? * Are attackers doing something with their compromised accounts or are they leaving them dormant? -* Are the accounts accessed by a different infrastructure? +* Are the accounts then accessed by a different infrastructure? * Is there any past activity from this infrastructure? For the login failures, consider the following questions: @@ -575,9 +575,9 @@ If you configured permanent blocking, unblock users and IPs from the Denylist by -Disable or delete any custom In-App WAF rule(s). +#### Disable or delete any custom In-App WAF rule(s) -To disable or delete In-App WAF rule(s), in [custom In-App WAF rules][28], disable the rules by clicking on **Monitoring** or **Blocking**, and selecting **Disable Rule**. +To disable or delete In-App WAF rule(s), go to the [custom In-App WAF rules page][28] and disable the rules by clicking on **Monitoring** or **Blocking**, and selecting **Disable Rule**. If the rule is no longer relevant, you can delete it by clicking more options (**...**) and selecting **Delete**. @@ -602,6 +602,7 @@ Here are some common hardening examples: * **Rate limit login attempt per IP/user/network range/user agent:** This soft-blocking feature lets you aggressively curtail the scale of the attack in some circumstances with minimal impact on normal users, even if they happen to share traits with the attacker * **Adding friction at login:** To break attackers' automation without significantly impacting users, use captchas or modifying the login flow during an attack (for example, require that a token is fetched from a new endpoint). +* **Enforce multi-factor authentication (MFA):** Datadog found MFA extremely effective in stopping account compromise. You could require your most privileged users to use MFA, especially during attacks. * **Limiting sensitive actions for users:** If your services allow users to perform sensitive actions (spending money, accessing sensitive information, changing contact information, etc.), you might want to prohibit high risk users with suspicious logins until they are reviewed manually or through multifactor authentication. Suspicious logins can be programmatically fed to your systems by Datadog through a webhook. * **Ability to consume signal findings programmatically:** Create an endpoint to consume Datadog webhooks and automatically take action against suspected users/IPs/traits. 
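The last hardening example above, consuming signal findings programmatically, might look something like the following sketch. The payload fields are not a fixed Datadog schema; they are whatever you define in your webhook's custom payload template, and the blocklist and password-reset calls stand in for your own systems.

```python
# Hedged sketch of a webhook consumer endpoint. The attacker_ips and
# compromised_users fields are assumptions about the custom payload you
# configure on the webhook, not a fixed Datadog schema.
from flask import Flask, request

app = Flask(__name__)

def add_to_blocklist(ip, ttl_hours):
    print(f"blocking {ip} for {ttl_hours}h")   # stand-in for your edge/WAF tooling

def force_password_reset(user):
    print(f"resetting credentials for {user}")  # stand-in for your identity system

@app.route("/integrations/datadog/asm-signal", methods=["POST"])
def handle_asm_signal():
    payload = request.get_json(force=True) or {}
    for ip in payload.get("attacker_ips", []):
        add_to_blocklist(ip, ttl_hours=12)
    for user in payload.get("compromised_users", []):
        force_password_reset(user)
    return {"status": "ok"}, 200
```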
From 7906f6bd596f089bfe34d01e87ae22bdddb07541 Mon Sep 17 00:00:00 2001 From: Michael Cretzman Date: Tue, 8 Apr 2025 12:44:16 -0700 Subject: [PATCH 15/15] fixing links in tabs --- .../guide/manage_account_theft_appsec.md | 8 +++++--- 1 file changed, 5 insertions(+), 3 deletions(-) diff --git a/content/en/security/application_security/guide/manage_account_theft_appsec.md b/content/en/security/application_security/guide/manage_account_theft_appsec.md index 8b230caf3e632..d51550b47138e 100644 --- a/content/en/security/application_security/guide/manage_account_theft_appsec.md +++ b/content/en/security/application_security/guide/manage_account_theft_appsec.md @@ -397,14 +397,14 @@ Extract the list of targeted users by going to [Signals][1]. -From this list of users, you can craft a [Traces][25] query to review all the activity from targeted users. Follow this template: +From this list of users, you can craft a [Traces][2] query to review all the activity from targeted users. Follow this template: `@appsec.security_activity:business_logic.users.login.* @appsec.events_data.usr.login:()` Successful logins should be considered suspicious. [1]: https://app.datadoghq.com/security?query=%40workflow.rule.type%3A"Application%20Security"%20category%3Aaccount_takeover&product=appsec - +[2]: https://app.datadoghq.com/security/appsec/traces {{% /tab %}} {{% tab "Credential Stuffing" %}} @@ -415,12 +415,14 @@ Start by extracting a list of suspicious IPs from the signal side panel -From the list of IPs, you can craft a [Traces][25] query to review all the activity from suspected IPs. Follow this template: +From the list of IPs, you can craft a [Traces][2] query to review all the activity from suspected IPs. Follow this template: `@appsec.security_activity:business_logic.users.login.* @http.client_ip:()` Successful logins should be considered suspicious. +[2]: https://app.datadoghq.com/security/appsec/traces + {{% /tab %}} {{% tab "Distributed Credential Stuffing" %}}