Ignore AI assistant campaign attribution in tracker #24534

Open
tzi wants to merge 17 commits into 5.x-dev from dev-20141

Conversation

@tzi tzi commented May 20, 2026

Contributor

Description

This PR fixes tracker attribution so AI assistant sources such as utm_source=chatgpt.com are no longer incorrectly stored as campaign attribution when there is no normal browser referrer.

Screenshots: tracker-before (before), tracker-after (after)

issue dev-20141

Checklist

  • [✔] I have understood, reviewed, and tested all AI outputs before use
  • [✔] All AI instructions respect security, IP, and privacy rules

Review

@tzi tzi added this to the 5.11.0 milestone May 20, 2026
@tzi tzi changed the title Dev 20141 Ignore AI assistant campaign attribution in tracker May 20, 2026
@tzi tzi marked this pull request as ready for review May 20, 2026 20:56
@tzi tzi requested a review from a team May 20, 2026 20:56

@caddoo caddoo left a comment

Contributor

Thanks @tzi

Quick work getting a solution for this, and your fix partially solves it.

However, when I see updates to the JS tracker I panic: we have multiple tracking clients, and the JS tracker is just one of them.

I think we should pivot to fixing this on the server side and not touch the JS tracker at all.

Why server side?

The bug isn't really that the JS tracker is sending campaign attribution. It is that the server treats _rcn as the #1 priority for the conversion's referrer type, even when the visit has already been correctly classified as an AI assistant.

So we end up with: visit = AI Assistant, conversion = Campaign. That's the divergence the customer sees in their reports. The JS fix avoids hitting this code path by not sending _rcn in the first place, but the underlying server bug is still there for everyone else.
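
A minimal sketch of the divergence described above, written in JavaScript for illustration only. The real server code is PHP and is not shown in this thread. The function names, the 'aiAssistant' and 'campaign' labels, and the shape of the request object are all assumptions; only _rcn comes from the discussion.

```javascript
// Hypothetical model of the behaviour described above, not Matomo's actual code.
// `_rcn` is the campaign name a tracking client sends from its attribution data.

// Behaviour as described: a campaign name on the request always wins.
function conversionReferrerTypeCurrent(visitReferrerType, request) {
    if (request._rcn) {
        return 'campaign';
    }
    return visitReferrerType;
}

// Server-side idea: keep the visit's classification when it is an AI assistant.
function conversionReferrerTypeProposed(visitReferrerType, request) {
    if (visitReferrerType === 'aiAssistant') {
        return 'aiAssistant';
    }
    return conversionReferrerTypeCurrent(visitReferrerType, request);
}

var request = { _rcn: 'chatgpt.com' };
console.log(conversionReferrerTypeCurrent('aiAssistant', request));  // campaign
console.log(conversionReferrerTypeProposed('aiAssistant', request)); // aiAssistant
```

Because the check runs on the server, it would apply to any client that sends _rcn, which is the argument for fixing it there.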

What other tracking clients?

  • matomo-php-tracker
  • matomo-log-analytics
  • mobile SDKs
  • bulk tracking
  • older Matomo JS trackers

A server-side fix covers all of them in one place, with less risk in the tracker JS.

Also, the fix is most likely in the archiving, so we could fix this for historical data, not just new data.

tzi commented May 22, 2026

Contributor Author

Yes, good catch!

I did not have the other trackers in mind, and I agree the fix should be server-side rather than in the JS tracker then.

I updated the PR to fix future data. Can you review it again?

⚠️ One nuance: I think that any archiving or historical-data fix should be handled in a separate PR.

@caddoo caddoo left a comment

Contributor

@sgiehl Can you look at this as well when you get a moment, please?

Comment thread plugins/Referrers/Columns/Base.php Outdated
Comment thread plugins/Referrers/tests/Integration/ReferrerAttributionTest.php Outdated
@caddoo caddoo requested a review from sgiehl May 26, 2026 01:50

sgiehl commented May 28, 2026

Member

@caddoo Sorry, I'm not sure if I fully understand the problem we are trying to solve.
The JavaScript tracker should already ignore referrers coming from ChatGPT. This config can be extended to others if needed.
However, this might not work if, for example, the referrer is not provided. In that case the campaign will pass through and the JavaScript tracker might write it to the attribution cookie, which to me sounds like the actual problem. The attribution cookie should, as far as I know, always hold the last campaign that was detected.
So if this is not fixed in the tracker, you risk that a conversion is not attributed to a campaign that had been stored in the attribution cookie before. But this whole attribution topic is risky. If you try fixing an edge case for one person, you risk that two others start complaining afterwards...
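
A rough sketch of the pass-through case described above, assuming a referrer-only ignore list. This is not the code from js/piwik.js: the helper names, the single utm_source lookup and the chatgpt.com default are illustrative assumptions.

```javascript
// Hypothetical sketch, not the tracker's real implementation.
var ignoredReferrerHosts = ['chatgpt.com']; // assumed default

// Returns '' when no referrer was provided or it cannot be parsed.
function hostOf(url) {
    if (!url) {
        return '';
    }
    try {
        return new URL(url).hostname;
    } catch (e) {
        return '';
    }
}

// Referrer-based check: it can only match when a referrer host is known.
function shouldIgnoreCampaign(referrerUrl) {
    var host = hostOf(referrerUrl);
    return host !== '' && ignoredReferrerHosts.indexOf(host) !== -1;
}

// Campaign value that would end up in the attribution cookie, or null.
function campaignForCookie(pageUrl, referrerUrl) {
    var source = new URL(pageUrl).searchParams.get('utm_source');
    if (!source || shouldIgnoreCampaign(referrerUrl)) {
        return null;
    }
    return source;
}

var page = 'https://example.org/?utm_source=chatgpt.com';
console.log(campaignForCookie(page, 'https://chatgpt.com/')); // null
console.log(campaignForCookie(page, ''));                     // chatgpt.com
```

With a referrer present the campaign is dropped. Without one, the same URL parameter reaches the cookie.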

caddoo commented May 29, 2026

Contributor

@tzi sorry for the back and forth with this one.

@sgiehl

You are right on the technical points. I spent a bit of time testing again.

I will lean heavily on your decision here. I went for the back-end approach to cover all trackers and avoid fragmentation. I'm not sure how much other trackers might care about this behaviour, though.

How about this:

  • Do the JS fix again as primary (sorry Thomas).
  • Extend it so the ignore list also applies when checking utm_source values.
  • Keep or drop the server side fix depending on how much we care about non-js trackers.

tzi commented Jun 1, 2026

Contributor Author

@caddoo @sgiehl After thinking and analyzing it, I agree that the JS tracker should be the primary fix here, since it is the code responsible for maintaining the attribution cookie. The server-side change only protects conversion attribution after the cookie has already been written, so it does not fully address the underlying issue.

First, I’ll restore the JS-side guard and extend the ignore handling to cover utm_source values as well.

Then, we can decide whether we want to keep the server-side fallback to support non-JS trackers too, but my vote would be to remove it.

@tzi tzi force-pushed the dev-20141 branch 2 times, most recently from 6084869 to 283c7d6 Compare June 2, 2026 07:34
Comment thread js/piwik.js Outdated
for (i = 0; i < configCampaignNameParameters.length; i++) {
campaignParameterValue = getUrlParameter(currentUrl, configCampaignNameParameters[i]);

if (campaignParameterValue.length && shouldIgnoreCampaignForReferrer(campaignParameterValue)) {

Member

Technically speaking, this looks like it mixes concerns. shouldIgnoreCampaignForReferrer is meant to match referrer domains like chatgpt.com, but someone could also configure that list with anything else. With this change, any campaign parameter whose value matches a configured referrer host would then be ignored as well.

This is the documentation for that method:

setIgnoreCampaignsForReferrers( string | array ) - Set array with hostnames or domains for referrers where campaign parameters should be ignored. For wildcard subdomains, you can use: setIgnoreCampaignsForReferrers('.referrer.com'); or setIgnoreCampaignsForReferrers('*.referrer.com');. You can also specify a path along a domain: setIgnoreCampaignsForReferrers('*.referrer.com/subsite1');.
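
To make the referrer-only semantics of that documentation concrete, here is a hedged sketch of host matching with wildcard subdomains. The matcher is hypothetical and the tracker's real implementation may differ. Accepting both a leading dot and a leading `*.` as the wildcard prefix, matching the bare domain for wildcard patterns, and skipping path handling are all assumptions made here.

```javascript
// Hypothetical matcher, shown only to illustrate referrer-host semantics.
function normalizePattern(pattern) {
    var p = pattern.toLowerCase();
    if (p.indexOf('*.') === 0) {
        p = p.slice(1); // '*.referrer.com' becomes '.referrer.com'
    }
    return p;
}

function hostMatches(host, pattern) {
    var h = host.toLowerCase();
    var p = normalizePattern(pattern);
    if (p.charAt(0) !== '.') {
        return h === p; // plain hostname: exact match only (assumed)
    }
    // Wildcard: any subdomain, and (assumed here) the bare domain as well.
    return h === p.slice(1) || h.slice(-p.length) === p;
}

console.log(hostMatches('www.referrer.com', '.referrer.com'));  // true
console.log(hostMatches('evilreferrer.com', '*.referrer.com')); // false
```

The input is always a referrer host, which is why feeding campaign parameter values through the same list changes what the setting means.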

This new approach is more like an ignoreCampaignValues setting: specific values that are ignored regardless of any referrer. So adding it that way may actually have unexpected results.

We may need to consider adding a new config in the JS tracker for that, with known default values, and also some methods to set that config.

@tzi tzi Jun 7, 2026

Contributor Author

You’re right, the previous version overloaded setIgnoreCampaignsForReferrers() beyond its documented semantics.
I reworked it so the referrer-based API stays referrer-only, and the AI-style utm_source=chatgpt.com handling now uses a separate tracker config/API for ignored campaign values with its own defaults.
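
A minimal sketch of what such a separation could look like, assuming a value-based list kept apart from the referrer-based one. The names createCampaignValueFilter, setIgnoredCampaignValues and shouldIgnoreCampaignValue, and the chatgpt.com default, are hypothetical and are not taken from the PR's actual code.

```javascript
// Hypothetical sketch of a value-based ignore list, independent of referrers.
function createCampaignValueFilter(defaultIgnoredValues) {
    function lower(value) {
        return String(value).toLowerCase();
    }

    var ignoredValues = defaultIgnoredValues.map(lower);

    return {
        // Replace the list; accepts a string or an array of strings.
        setIgnoredCampaignValues: function (values) {
            var list = typeof values === 'string' ? [values] : values;
            ignoredValues = list.map(lower);
        },
        // True when the campaign parameter value itself is on the list,
        // regardless of which referrer (if any) the visit came from.
        shouldIgnoreCampaignValue: function (value) {
            return ignoredValues.indexOf(lower(value)) !== -1;
        }
    };
}

var filter = createCampaignValueFilter(['chatgpt.com']);
console.log(filter.shouldIgnoreCampaignValue('chatgpt.com')); // true
console.log(filter.shouldIgnoreCampaignValue('newsletter'));  // false
```

Keeping the two lists apart means a site that customises its ignored referrers does not silently change which campaign values are dropped.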

@sgiehl sgiehl modified the milestones: 5.11.0, 5.12.0 Jun 5, 2026
@tzi tzi requested review from caddoo and sgiehl June 9, 2026 11:16
3 participants