Add more details about cross-domain tracking #1210

Open: wants to merge 6 commits into main

mscwilson
Collaborator

I took bits out of the reusable partial to use directly in the JS tracker page because it was easier to edit and break it up with subheadings

@mscwilson requested a review from jethron on April 10, 2025 16:18

netlify bot commented Apr 10, 2025

Deploy Preview for snowplow-docs ready!

| Name | Link |
| --- | --- |
| 🔨 Latest commit | 314d6e4 |
| 🔍 Latest deploy log | https://app.netlify.com/projects/snowplow-docs/deploys/6887b8c148347d0008b5ed22 |
| 😎 Deploy Preview | https://deploy-preview-1210--snowplow-docs.netlify.app |

Contributor

@jethron left a comment

Great start, thanks for picking this up!

Comment on lines 11 to 13
:::note Base64 encoding
This enrichment expects the events to be base64-encoded. Configure this in the trackers.
:::
Contributor

What is the source for this?

The tracker base64-encodes the user_id, source app_id, and reason fields to make them URL-safe and to (slightly) obfuscate them in case they contain personal data (which could be unintentionally leaked to the destination site), but this is distinct from the normal base64 encoding config trackers have for SDJ payloads. No enrichment should need to be aware of the base64 encoding setting in trackers; the payload is already decoded by the pipeline when the enrichment runs.


- The `_sp` parameter can be attached by our Web ([see cross-domain tracking](/docs/sources/trackers/javascript-trackers/web-tracker/cross-domain-tracking/index.md)) and [mobile trackers](/docs/sources/trackers/mobile-trackers/tracking-events/session-tracking/index.md#decorating-outgoing-links-using-cross-navigation-tracking) and contains user, session and app identifiers (e.g., domain user and session IDs, business user ID, source app ID). The information to include in the parameters is configurable in the trackers. This is useful for tracking the movement of users across different apps and platforms.
+ To add the `_sp` querystring, configure cross-domain tracking in the [web](/docs/sources/trackers/javascript-trackers/web-tracker/cross-domain-tracking/index.md) or [mobile trackers](/docs/sources/trackers/mobile-trackers/tracking-events/session-tracking/index.md#decorating-outgoing-links-using-cross-navigation-tracking). The querystring contains user, session, and app identifiers, for example domain user and session IDs, business user ID, or source application ID. This is useful for tracking the movement of users across different apps and platforms. The information to include in the parameters is configurable in the trackers.
Contributor

I think by the end of this paragraph the reader should kind of know if they need this or not, but it's very "what"/"how" rather than "why" at the moment, so that's not clear?

The link to cross-domain-tracking is doing a lot of work here. Also, this text is kind of ambiguous between the default cross-domain tracking and the extended version. Maybe it needs a refresher on the normal behaviour and some explanation of the actual differences?

  • They both use _sp and include domain_userid + timestamp
  • The default doesn't require any enrichment to be enabled
  • Both default and extended will populate the atomic refr_domain_userid and refr_dvce_tstamp fields
  • This enrichment adds the information in an entity as well
  • Extended lets you include the domain_sessionid, user_id, source app_id and a custom reason, which are all configurable, in addition to the default domain_userid + timestamp (which cannot be disabled)
  • If enabled, this enrichment will still parse the non-extended format correctly, so you do not need to co-ordinate enabling the configuration and updating tracking
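
To make the two formats concrete, here is a minimal sketch of splitting an `_sp` value by position. `parseSpParam` and `EXTENDED_FIELDS` are hypothetical names, not Enrich or tracker APIs. The field order follows the extended format string in this PR, and the sketch leaves any base64-encoded fields undecoded.

```javascript
// Hypothetical helper, not part of any Snowplow library.
// Field order assumed from the extended format string in this PR.
const EXTENDED_FIELDS = [
  "domainUserId",
  "timestamp",
  "sessionId",
  "subjectUserId",
  "sourceId",
  "sourcePlatform",
  "reason",
];

function parseSpParam(value) {
  const parts = value.split(".");
  const result = {};
  EXTENDED_FIELDS.forEach((name, i) => {
    // Missing or empty positions become null (field not attached).
    result[name] = parts[i] ? parts[i] : null;
  });
  return result;
}
```

A short value such as `abc.1700000000000` yields only `domainUserId` and `timestamp`, with the remaining fields `null`, which mirrors the point that one parser can accept both formats.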


If this enrichment isn't enabled, Enrich parses `_sp` querystring parameter according to the old format, `_sp={domainUserId}.{timestamp}`
The extended cross-navigation format is `_sp={domainUserId}.{timestamp}.{sessionId}.{subjectUserId}.{sourceId}.{platform}.{reason}`.
Contributor

I think sourceAppId is a bit clearer than sourceId -- and then probably sourcePlatform just for consistency; these are the app_id/platform values of the tracker that generates the parameter.


## Configuration

- [Schema](https://github.com/snowplow/iglu-central/blob/master/schemas/com.snowplowanalytics.snowplow.enrichments/cross_navigation_config/jsonschema/1-0-0)
- [Example](https://github.com/snowplow/enrich/blob/master/config/enrichments/cross_navigation_config.json)

```json reference
Contributor

TIL, cool!

This kind of makes the Schema link above redundant. Maybe add title="Schema" and swap them?

I'd say embed the example as well, but this is about as boring as enrichment configs get so I'm not sure it matters. 😅 I guess it makes it easy to copy/paste?

Contributor

Oh, it's not documented but apparently we can change the "See full example on GitHub" text too if that doesn't make sense in this context.

```json reference title="Schema" referenceLinkText="See schema on GitHub"

| Property | Description | Extended | Short |
| --------------- | ---------------------------------------------- | -------- | ----- |
| `domainUserId` | Current tracker-generated UUID user identifier | ✅ | ✅ |
| `timestamp` | Current epoch timestamp, ms precision | ✅ | ✅ |
Contributor

Nit: do these columns need width adjustment?

@mscwilson force-pushed the add_details_cross_navigation branch from 281110a to 314d6e4 on July 28, 2025 17:51
@mscwilson requested a review from jethron on July 28, 2025 17:52

sidebar_position: 5
---

When users navigate between different domains in your ecosystem—such as from your main website to a subdomain, partner site, or mobile app—their user identity is typically lost. This creates gaps in your user journey data and makes it difficult to understand the complete customer experience across your digital properties.
Contributor

Subdomains should be covered by discoverRootDomain (now the default), so I'm hesitant to put it here unqualified as it's a bit misleading. It's usually a special case to want per-subdomain identities and need these settings to apply there.


## Querystring properties

The `_sp` querystring parameter has two different formats: extended or short. You can also configure exactly which properties you want to include.
Contributor

You can only configure them when using extended; domainUserId & timestamp are both required for both formats.


- The extended cross navigation format can be described by `_sp={domainUserId}.{timestamp}.{sessionId}.{subjectUserId}.{sourceId}.{platform}.{reason}`
+ The extended cross-navigation format is `_sp={domainUserId}.{timestamp}.{sessionId}.{subjectUserId}.{sourceId}.{sourcePlatform}.{reason}`.
Contributor

Maybe note that some of these fields may be empty/null, because the fields to attach are configured in the tracker.

@@ -28,10 +32,10 @@ import TestingWithMicro from "@site/docs/reusable/test-enrichment-with-micro/_in
This enrichment extracts `_sp` querystring parameter from the following inputs:

- The `page_url` field from the Snowplow event
-- The referer uri extracted from corresponding HTTP header in the raw event
+- The `referer` URI extracted from corresponding HTTP header in the raw event
Contributor

I would drop this list and just say page_url. It's an implementation detail that Enrich will use the referer if an event doesn't explicitly specify a page_url; that referer just becomes the page_url.
Calling this out kind of implies the query strings will be merged, which isn't the case AFAIK.
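
A rough sketch of the behaviour described in this comment: one effective page URL is chosen, and `_sp` is read from that URL only. The event shape (`pageUrl`, `refererHeader`) and both function names are assumptions for illustration, not Enrich internals.

```javascript
// Hypothetical event shape: { pageUrl?: string, refererHeader?: string }.
function effectivePageUrl(event) {
  // The referer is only a fallback; it is never combined with pageUrl.
  return event.pageUrl || event.refererHeader || null;
}

function extractSp(event) {
  const url = effectivePageUrl(event);
  if (url === null) return null;
  try {
    // searchParams.get returns null when the parameter is absent.
    return new URL(url).searchParams.get("_sp");
  } catch {
    return null; // not a parseable absolute URL
  }
}
```

If both URLs carry an `_sp` value, only the one on the page URL is returned, so nothing is merged.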

```
appSchema://path/to/page?_sp=domainUserId.timestamp.sessionId.subjectUserId.sourceId.platform.reason
```
Choose which parameters to include using a `CrossDeviceParameterConfiguration` object. The `domainUserId` and `timestamp` are always included automatically.
Contributor

Should this link to the API docs or something? The table below implies to me that we need to pass the values from the Controller/Configurations, but instead we actually just need to pass booleans (or a string for the reason) for each field and the tracker collects the values. This is only apparent from the examples further down ATM.
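
One way to picture the point about booleans: the configuration only says which fields to attach, and the values come from tracker state. The sketch below is an illustration under assumptions, not the mobile tracker's implementation. `buildSpValue`, the `state` object, and the trimming of trailing empty fields are all hypothetical, and the base64 encoding of some fields is left out.

```javascript
// Hypothetical sketch: flags choose WHICH fields to attach,
// while the values are read from tracker state.
function buildSpValue(state, config = {}) {
  const optional = [
    config.sessionId ? state.sessionId : "",
    config.subjectUserId ? state.subjectUserId : "",
    config.sourceId ? state.sourceId : "",
    config.sourcePlatform ? state.sourcePlatform : "",
    // reason is the one field passed as a string rather than a flag.
    typeof config.reason === "string" ? config.reason : "",
  ];
  // domainUserId and timestamp are always included.
  const parts = [state.domainUserId, String(state.timestamp), ...optional];
  // Assumption: trailing empty fields are dropped.
  return parts.join(".").replace(/\.+$/, "");
}
```

With an empty configuration the result collapses to `domainUserId.timestamp`, which is the short format.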

Comment on lines +227 to +231
| `sessionId` | Current session UUID identifier | `SessionController.sessionId` |
| `subjectUserId` | Custom business user identifier | `SubjectController.userId` |
| `sourceId` | Application identifier | `TrackerConfiguration.appId` |
| `sourcePlatform` | Platform of the current device | `TrackerController.devicePlatform` |
| `reason` | Custom information or identifier | Custom string |
Contributor

Should this note what is/isn't enabled by default, and perhaps the reasoning for that? E.g. user_id isn't enabled by default to prevent accidentally leaking personal data to third party URIs, so requires enabling explicitly, and platform can usually be inferred from the source App ID.


:::tip

If you enable link decoration, aim to track at least one event on the starting page. The tracker writes the `domain_userid` to a cookie when it tracks an event. If the cookie doesn't exist when the user navigates to the cross-domain destination, the tracker will generate a new ID for them when they return, rather than keeping the old ID. This can make user stitching difficult.
Contributor

I don't believe this is true anymore? Just creating a tracker (required to set these settings) should be enough to create the cookie in question.

@@ -53,54 +55,56 @@ newTracker('sp', '{{collector_url_here}}', {
</TabItem>
</Tabs>

- The tracker will be named `sp` (tracker namespace) and will send events to the a collector url you specify by replacing `{{collector_url_here}}`. The final argument is the configuration object. Here it is just used to set the app ID and the common webPage context for each event. Each event the tracker sends will have an app ID field set to my-app-id.
+ The tracker will send events to the Collector URL you specify by replacing `{{collector_url_here}}`. The final argument is the configuration object. Here it's used to set the app ID and the webPage entity for each event. Each event the tracker sends will have an `app_id` field set to `my-app-id`.
Contributor

This example sets more than just appId and contexts.webPage now, maybe this needs a revisit?

cookieSameSite: 'Lax',
crossDomainLinker: function(defineWhichLinksToDecorate) { },
useExtendedCrossDomainLinker: {
  userId: true,
Contributor

Maybe add a comment here stating that if your userId contains personal data, you should make sure your crossDomainLinker won't leak it to third-party sites.
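
As a hedged illustration of that suggestion, the linker function could restrict decoration to an allowlist of first-party hosts. This assumes the tracker calls `crossDomainLinker` with each link element and decorates the link when the function returns `true`. The hostnames and the name `decorateFirstPartyOnly` are placeholders.

```javascript
// Hypothetical allowlist of first-party hostnames; replace with your own.
const FIRST_PARTY_HOSTS = new Set(["shop.example.com", "app.example.org"]);

// Returns true only for links whose hostname is on the allowlist,
// so a userId in `_sp` is not attached to third-party URLs.
function decorateFirstPartyOnly(linkElement) {
  return FIRST_PARTY_HOSTS.has(linkElement.hostname);
}
```

Under that assumption, the function would be passed as the `crossDomainLinker` value in the configuration object quoted above.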
