Implement consent forwarding pipeline #380

Open
ChristianPavilonis wants to merge 10 commits into main from feature/consent-management

@ChristianPavilonis
Collaborator

@ChristianPavilonis ChristianPavilonis commented Feb 26, 2026

Summary

  • Wire CMP consent signals end-to-end from cookie/header extraction through OpenRTB bid requests, partner integrations, and KV Store persistence so publishers can comply with GDPR and US state privacy laws without additional integration work.
  • Add a configurable [consent] section with jurisdiction detection, per-partner forwarding modes, expiration checking, and GPC-to-US-Privacy construction.
  • Gate SSC creation on consent signals: EU/UK users require explicit TCF opt-in, US users are allowed unless they opt out, and non-regulated jurisdictions are unaffected.
  • Handle consent revocation: when consent is withdrawn and an SSC cookie exists, the cookie is expired and the KV store entry is deleted.
  • Grow test coverage from 431 to 493 tests (62 new), with unit tests for every new module.

Changes

Consent-gated SSC creation

| File | Change |
| --- | --- |
| crates/common/src/consent/mod.rs | allows_ssc_creation() — consent decision function: GDPR requires TCF P1 opt-in, US checks opt-out, non-regulated allows |
| crates/common/src/cookies.rs | expire_synthetic_cookie() — expires SSC cookie with Max-Age=0 for revocation |
| crates/common/src/consent/kv.rs | delete_consent_from_kv() — deletes KV store entry on consent revocation |
| crates/common/src/publisher.rs | Three-way consent gate in handle_publisher_request(): set SSC (consent given), revoke (consent withdrawn + cookie present), or skip (no consent + no cookie) |
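
The consent gate described in this table can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the PR's code: ConsentSignals, SscAction, and decide_ssc_action are hypothetical names, the Jurisdiction variants carry no payload here, and the real allows_ssc_creation() signature may differ.

```rust
/// Jurisdiction variants as named in the PR description (payloads omitted).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Jurisdiction {
    Gdpr,
    UsState,
    NonRegulated,
    Unknown,
}

/// Hypothetical, flattened view of the decoded consent signals.
#[derive(Debug, Clone, Copy, Default)]
pub struct ConsentSignals {
    /// TCF Purpose 1 (store/access information on a device); None if no TC string.
    pub tcf_purpose1: Option<bool>,
    /// True if the user explicitly opted out (US Privacy or GPC).
    pub us_opt_out: bool,
}

/// Outcome of the three-way gate (hypothetical name).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum SscAction {
    Set,
    Revoke,
    Skip,
}

/// GDPR is opt-in, US states are opt-out, everything else allows.
pub fn allows_ssc_creation(jurisdiction: Jurisdiction, signals: &ConsentSignals) -> bool {
    match jurisdiction {
        Jurisdiction::Gdpr => signals.tcf_purpose1 == Some(true),
        Jurisdiction::UsState => !signals.us_opt_out,
        Jurisdiction::NonRegulated | Jurisdiction::Unknown => true,
    }
}

/// Set when allowed; revoke only if a cookie exists to revoke; otherwise skip.
pub fn decide_ssc_action(allowed: bool, has_ssc_cookie: bool) -> SscAction {
    match (allowed, has_ssc_cookie) {
        (true, _) => SscAction::Set,
        (false, true) => SscAction::Revoke,
        (false, false) => SscAction::Skip,
    }
}
```

Keeping the decision separate from the action makes the revoke path easy to test without a request object: a GDPR user with no TC string and an existing cookie maps to SscAction::Revoke.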

OpenRTB integration

| File | Change |
| --- | --- |
| crates/common/src/openrtb.rs | Populate regs/user consent fields with dual-placement (top-level 2.6 + ext for older exchanges); add Eid, Uid, ConsentedProvidersSettings structs |

Configuration & observability

| File | Change |
| --- | --- |
| crates/common/src/consent_config.rs | Full [consent] config section: ConsentConfig, ConsentMode, ConsentForwardingMode, GdprConfig (31 countries), UsStatesConfig (20 states), conflict resolution, expiration checking |
| crates/common/src/consent/jurisdiction.rs | Jurisdiction enum (Gdpr, UsState, NonRegulated, Unknown) + detect_jurisdiction() from geo + config |
| crates/common/src/consent/mod.rs | Pipeline orchestrator: build_consent_context(), ConsentPipelineInput, KV fallback/write, expiration checking, GPC-to-US-Privacy, EID gating |
| crates/common/src/consent/types.rs | TcfConsent helper methods (has_purpose_consent, has_storage_consent, etc.) |
| crates/common/src/settings.rs | Added consent: ConsentConfig field |
| crates/common/src/lib.rs | Module declaration for consent_config |
| crates/common/build.rs | Include consent_config.rs in build inputs |
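
For readers unfamiliar with the jurisdiction step, here is a rough sketch of how a geo lookup plus config lists could map to the Jurisdiction enum. The Geo and JurisdictionConfig types, their field names, and the String payload on UsState are assumptions for illustration; only the enum variant names and detect_jurisdiction() come from the PR.

```rust
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum Jurisdiction {
    Gdpr,
    /// Payload (state code) is an assumption; the PR only names the variant.
    UsState(String),
    NonRegulated,
    Unknown,
}

/// Hypothetical geo lookup result: ISO country code plus optional region code.
pub struct Geo<'a> {
    pub country: &'a str,
    pub region: Option<&'a str>,
}

/// Hypothetical subset of the [consent] config.
pub struct JurisdictionConfig {
    pub gdpr_countries: Vec<String>,
    pub us_states: Vec<String>,
}

pub fn detect_jurisdiction(geo: Option<&Geo<'_>>, cfg: &JurisdictionConfig) -> Jurisdiction {
    // No geo data at all: the jurisdiction cannot be determined.
    let Some(geo) = geo else {
        return Jurisdiction::Unknown;
    };
    if cfg.gdpr_countries.iter().any(|c| c.eq_ignore_ascii_case(geo.country)) {
        return Jurisdiction::Gdpr;
    }
    if geo.country.eq_ignore_ascii_case("US") {
        if let Some(region) = geo.region {
            if cfg.us_states.iter().any(|s| s.eq_ignore_ascii_case(region)) {
                return Jurisdiction::UsState(region.to_ascii_uppercase());
            }
        }
    }
    // Known location that matches neither configured list.
    Jurisdiction::NonRegulated
}
```

Distinguishing Unknown (no geo data) from NonRegulated (geo data, no matching list) lets downstream code avoid asserting "GDPR does not apply" when it simply does not know.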

Partner integrations

| File | Change |
| --- | --- |
| crates/common/src/cookies.rs | Cookie stripping utilities (strip_cookies, forward_cookie_header, CONSENT_COOKIE_NAMES) |
| crates/common/src/integrations/prebid.rs | ConsentForwardingMode support, consent cookie stripping in OpenrtbOnly mode |
| crates/common/src/integrations/lockr.rs | Always strips consent cookies via forward_cookie_header |
| crates/common/src/integrations/aps.rs | ApsGdprConsent struct, consent fields in ApsBidRequest |
| crates/common/src/integrations/adserver_mock.rs | Consent summary in mediation request ext |
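
A minimal sketch of the cookie-stripping idea, assuming the common IAB cookie names below. The actual contents of CONSENT_COOKIE_NAMES and the signatures of strip_cookies and forward_cookie_header are not shown in this PR description, so cookie_header_for_partner is a hypothetical stand-in.

```rust
/// Assumed cookie names; the PR's CONSENT_COOKIE_NAMES may differ.
pub const CONSENT_COOKIE_NAMES: [&str; 4] = ["euconsent-v2", "__gpp", "__gpp_sid", "usprivacy"];

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ConsentForwardingMode {
    OpenrtbOnly,
    CookiesAndBody,
}

/// Removes the named cookies from a Cookie header value, keeping the rest in order.
pub fn strip_cookies(header: &str, names: &[&str]) -> String {
    header
        .split(';')
        .map(str::trim)
        .filter(|pair| !pair.is_empty())
        .filter(|pair| {
            let name = pair.split('=').next().unwrap_or("").trim();
            !names.contains(&name)
        })
        .collect::<Vec<_>>()
        .join("; ")
}

/// Hypothetical helper: returns the Cookie header to send to a partner,
/// or None if nothing is left after stripping.
pub fn cookie_header_for_partner(header: &str, mode: ConsentForwardingMode) -> Option<String> {
    let forwarded = match mode {
        ConsentForwardingMode::OpenrtbOnly => strip_cookies(header, &CONSENT_COOKIE_NAMES),
        ConsentForwardingMode::CookiesAndBody => header.to_string(),
    };
    (!forwarded.is_empty()).then_some(forwarded)
}
```

In OpenrtbOnly mode the partner sees consent only in the bid request body, which avoids two copies of the signal drifting apart.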

KV Store persistence

| File | Change |
| --- | --- |
| crates/common/src/consent/kv.rs | KvConsentEntry and ConsentKvMetadata types, SHA-256 fingerprint change detection, read fallback when cookies absent, write-on-change via Fastly KV Store API |
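
The write-on-change idea can be illustrated with a small sketch. To stay dependency-free it swaps SHA-256 for the standard library's DefaultHasher, which is not stable across Rust releases and would not be suitable for persisted fingerprints; the RawSignals type and both function names are hypothetical.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::Hasher;

/// Hypothetical raw consent signals as read from cookies/headers.
#[derive(Default)]
pub struct RawSignals<'a> {
    pub tc_string: Option<&'a str>,
    pub gpp_string: Option<&'a str>,
    pub us_privacy: Option<&'a str>,
    pub gpp_section_ids: Option<Vec<u16>>,
}

/// Stand-in fingerprint (DefaultHasher instead of SHA-256, for illustration only).
pub fn fingerprint(signals: &RawSignals<'_>) -> u64 {
    let mut hasher = DefaultHasher::new();
    for field in [signals.tc_string, signals.gpp_string, signals.us_privacy] {
        hasher.write(field.unwrap_or("").as_bytes());
        // 0xFF never occurs in valid UTF-8, so it unambiguously separates fields.
        hasher.write_u8(0xff);
    }
    if let Some(ids) = &signals.gpp_section_ids {
        // Sort so that "2,6" and "6,2" produce the same fingerprint.
        let mut sorted = ids.clone();
        sorted.sort_unstable();
        for id in sorted {
            hasher.write_u16(id);
        }
    }
    hasher.finish()
}

/// Write only when nothing is stored yet or the fingerprint changed.
pub fn needs_kv_write(stored: Option<u64>, current: &RawSignals<'_>) -> bool {
    stored != Some(fingerprint(current))
}
```

The separator bytes matter: without them, moving characters from the end of one field to the start of the next would hash identically.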

Wiring & config

| File | Change |
| --- | --- |
| crates/common/src/auction/endpoints.rs | Wire consent pipeline into /auction endpoint |
| crates/common/src/publisher.rs | Wire consent pipeline with synthetic_id into publisher handler |
| fastly.toml | Added consent_store KV store for local dev |
| trusted-server.toml | Added commented [consent] config section with all options |

Key design decisions

  • Dual-placement OpenRTB fields: consent values placed both at top-level (2.6 spec) and in ext for backward compatibility with older exchanges.
  • Consent cookie stripping: per-partner ConsentForwardingMode controls whether consent travels via OpenRTB body only (OpenrtbOnly strips cookies) or both cookies and body (CookiesAndBody).
  • Write-on-change KV persistence: SHA-256 fingerprint of consent signals avoids redundant KV writes; KV read used as fallback when cookies are absent (e.g., Safari ITP).
  • SSC consent gating: allows_ssc_creation() centralizes the consent decision. GDPR requires TCF Purpose 1 (store/access device) opt-in; US state privacy uses opt-out model (block only if user explicitly opted out). Non-regulated and unknown jurisdictions default to allowing SSC.
  • Revocation uses cookie value for KV deletion: the revoke path reads the synthetic ID from the cookie (not the x-synthetic-id header) to ensure the correct KV entry is deleted when both are present.

How to enable

  1. Uncomment the [consent] section in trusted-server.toml
  2. For KV persistence, configure consent_store in fastly.toml (already added for local dev)
  3. Optionally set mode = "proxy" or mode = "interpreter" depending on desired consent processing depth

Test plan

  • cargo fmt --all -- --check
  • cargo clippy --all-targets --all-features -- -D warnings
  • cargo test --workspace — 493 tests passing
  • npx vitest run — 150 JS tests passing
  • npm run format (js + docs) — clean
  • cargo build --bin trusted-server-fastly --release --target wasm32-wasip1 — WASM build passes

Checklist

  • Code compiles without warnings
  • All existing tests pass
  • New tests added for all new modules (62 new tests)
  • No secrets or credentials committed
  • Configuration is opt-in (commented out by default)

Closes #312
Closes #464
Closes #465
Closes #466

@ChristianPavilonis ChristianPavilonis self-assigned this Feb 26, 2026
@ChristianPavilonis ChristianPavilonis marked this pull request as draft February 26, 2026 00:41
@ChristianPavilonis ChristianPavilonis force-pushed the feature/consent-management branch from b4dfdde to 3e8e3c5 Compare February 26, 2026 00:46
@ChristianPavilonis
Collaborator Author

ChristianPavilonis commented Mar 2, 2026

This has a minimal TCF decoding implementation.
Do we want to build a full implementation as a separate crate?

@ChristianPavilonis ChristianPavilonis marked this pull request as ready for review March 2, 2026 13:59
@ChristianPavilonis
Collaborator Author

Also, #390 should probably be merged first; otherwise there will be conflicts.

@aram356
Collaborator

aram356 commented Mar 5, 2026

@ChristianPavilonis Please resolve conflicts

Collaborator

@aram356 aram356 left a comment

Summary

Comprehensive consent forwarding pipeline spanning signal extraction, TCF/GPP/USP decoding, jurisdiction detection, KV persistence, and OpenRTB integration. The architecture is clean and well-decomposed. Four blocking issues around edge-case correctness and input validation need attention before merge.

Blocking

🔧 wrench

  • Expired TCF leaks through GPP-embedded fallback: when both standalone TC and GPP-embedded TCF are present and expired, only one is cleared — effective_tcf() still returns the GPP copy (crates/common/src/consent/mod.rs:171)
  • is_empty() misclassifies __gpp_sid-only requests: gpp_section_ids is not checked, so requests with only __gpp_sid=2,6 are treated as empty — skipping logging and KV writes (crates/common/src/consent/types.rs:155)
  • KV Store consent persistence disabled on /auction path: synthetic_id: None causes both try_kv_fallback and try_kv_write to early-return (crates/common/src/auction/endpoints.rs:57)
  • No input length validation on consent strings before decoding: unbounded base64 decode from cookie input could allocate large heap buffers from malicious input (crates/common/src/consent/tcf.rs:57)

Non-blocking

🤔 thinking

  • KV fingerprint ignores gpp_section_ids: SID-only changes are invisible to the fingerprint, skipping KV writes (crates/common/src/consent/kv.rs:178)
  • regs.gdpr = Some(0) false negative in ambiguous cases: a GDPR-jurisdiction user without a TCF cookie gets regs.gdpr=0, signaling "GDPR does not apply" (crates/common/src/integrations/prebid.rs:644)

♻️ refactor

  • apply_tcf_conflict_resolution clones eagerly: both TcfConsent structs are cloned before determining if a conflict exists (crates/common/src/consent/mod.rs:312)
  • now_deciseconds() uses as u64 truncation from u128: safe in practice but avoidable with u64-native arithmetic (crates/common/src/consent/mod.rs:377)

🌱 seedling

  • extract_and_log_consent is declared but never called: public function with zero callers — either document the intent or remove (crates/common/src/consent/mod.rs:203)
  • No end-to-end test for consent → OpenRTB pipeline: individual consent modules are well-tested, but there's no integration test that starts from a Request with consent cookies and asserts the final OpenRtbRequest JSON contains correct regs.gdpr, regs.gpp, user.consent, etc. This cross-module path (endpoints.rs → formats.rs → prebid.rs) is where mismatches are most likely. Consider adding one in a follow-up.
  • gate_eids_by_consent is defined but not wired in: gate_eids_by_consent (mod.rs:430) is implemented and unit-tested but never called in the actual bid request path. The EIDs field is always None in prebid.rs. Document whether this is deferred to a future phase or wire it in.

👍 praise

  • Clean pipeline architecture: the extract → decode → normalize → gate pipeline is well-decomposed. Each stage is independently testable, proxy mode is a clean escape hatch, and the KV fingerprint-based write deduplication is smart. The conflict resolution strategies (Restrictive/Newest/Permissive) and config-driven jurisdiction lists are particularly well thought out.
  • Dual-placement for Prebid compatibility: populating both OpenRTB 2.6 top-level fields and regs.ext.* / user.ext.consent for Prebid compatibility is the right call. The RegsExt mirroring pattern with corresponding tests shows attention to real-world integration needs.

CI Status

  • Analyze (javascript-typescript): PASS

@ChristianPavilonis ChristianPavilonis marked this pull request as draft March 6, 2026 17:33
@ChristianPavilonis
Collaborator Author

Review feedback addressed in 798e4a2

All 9 review comments addressed in a single commit. Here's how each was resolved:

Bug fixes

  1. Expired TCF leaks through GPP fallback (consent/mod.rs:171) — Changed the if/else if to unconditionally clear both ctx.tcf and ctx.gpp.eu_tcf on expiration, so effective_tcf() can no longer fall back to stale GPP-embedded consent.

  2. is_empty() ignores gpp_section_ids (consent/types.rs:155) — Added && self.gpp_section_ids.is_none() to the check so requests with only __gpp_sid are no longer treated as empty.

  3. KV fallback disabled due to synthetic_id: None (auction/endpoints.rs:55) — Moved synthetic ID generation before build_consent_context in the auction endpoint. Updated convert_tsjs_to_auction_request to accept the pre-generated ID as a &str parameter instead of generating its own.

  4. No length limit on TC string before base64 decode (consent/tcf.rs:57) — Added MAX_TC_STRING_LEN = 4096 guard before decoding. US Privacy already validates exact length (4 chars) and GPP delegates to iab_gpp which handles its own parsing, so no changes needed there.
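
Two of the fixes above lend themselves to a short sketch: the length guard before decoding, and clearing both TCF sources on expiration. This is an illustration only; GuardError, Ctx, and the method names other than effective_tcf() are hypothetical, and the real context stores decoded TcfConsent values rather than strings.

```rust
/// Upper bound named in the review response.
pub const MAX_TC_STRING_LEN: usize = 4096;

#[derive(Debug, PartialEq, Eq)]
pub enum GuardError {
    Empty,
    TooLong { len: usize },
}

/// Rejects oversized input before any base64 decoding can allocate.
pub fn guard_tc_string(raw: &str) -> Result<&str, GuardError> {
    if raw.is_empty() {
        return Err(GuardError::Empty);
    }
    if raw.len() > MAX_TC_STRING_LEN {
        return Err(GuardError::TooLong { len: raw.len() });
    }
    Ok(raw)
}

/// Hypothetical, simplified consent context (strings stand in for decoded TCF).
#[derive(Default)]
pub struct Ctx {
    pub tcf: Option<String>,
    pub gpp_eu_tcf: Option<String>,
}

impl Ctx {
    /// Standalone TC string first, GPP-embedded copy as fallback.
    pub fn effective_tcf(&self) -> Option<&str> {
        self.tcf.as_deref().or(self.gpp_eu_tcf.as_deref())
    }

    /// On expiration, clear both sources so the fallback cannot resurrect stale consent.
    pub fn clear_expired_tcf(&mut self) {
        self.tcf = None;
        self.gpp_eu_tcf = None;
    }
}
```

Clearing only one field is what allowed the original leak: effective_tcf() would silently fall through to the other copy.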

Design concerns

  1. Fingerprint omits gpp_section_ids (consent/kv.rs:178) — Added sorted section IDs to the SHA-256 hash with sentinel byte separators, so SID-only changes now trigger KV writes.

  2. regs.gdpr = Some(0) false negative (prebid.rs:643) — Now uses jurisdiction from ConsentContext: GDPR jurisdiction sets gdpr=1 even without a TCF string; unknown jurisdiction emits None instead of Some(0).
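
The jurisdiction-informed regs.gdpr mapping might look roughly like the sketch below. Only two behaviors are stated above (GDPR jurisdiction yields 1 even without a TCF string; unknown jurisdiction yields None). The other two arms, marked in the comments, are assumptions, and regs_gdpr is a hypothetical function name.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Jurisdiction {
    Gdpr,
    UsState,
    NonRegulated,
    Unknown,
}

/// Maps jurisdiction and TCF presence to the OpenRTB regs.gdpr flag.
pub fn regs_gdpr(jurisdiction: Jurisdiction, has_tcf_string: bool) -> Option<u8> {
    match (jurisdiction, has_tcf_string) {
        // Stated in the review response: GDPR applies even without a TC string.
        (Jurisdiction::Gdpr, _) => Some(1),
        // Assumption: a TC string on the request implies GDPR applies.
        (_, true) => Some(1),
        // Stated in the review response: do not claim "GDPR does not apply" when unsure.
        (Jurisdiction::Unknown, false) => None,
        // Assumption: known non-GDPR jurisdictions without a TC string emit 0.
        (Jurisdiction::UsState | Jurisdiction::NonRegulated, false) => Some(0),
    }
}
```

Emitting None rather than Some(0) leaves the field out of the request, so downstream exchanges apply their own default instead of trusting a guess.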

Cleanup

  1. Unnecessary cloning in apply_tcf_conflict_resolution (consent/mod.rs:312) — Switched to working with references for the conflict check. Only clones the GPP winner when selected; early-returns when standalone wins since it's already in ctx.tcf.

  2. as_millis() as u64 truncation (consent/mod.rs:377) — Replaced with dur.as_secs() * 10 + u64::from(dur.subsec_millis()) / 100 to stay in u64 throughout.

  3. Dead extract_and_log_consent (consent/mod.rs:203) — Removed. Zero callers in the codebase.
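
For reference, the u64-native decisecond arithmetic from item 2 can be exercised in isolation. The split into a pure deciseconds() helper is an assumption made here for testability; the expression itself and the expect() on the clock come from this thread.

```rust
use std::time::{Duration, SystemTime, UNIX_EPOCH};

/// Converts a duration to deciseconds (tenths of a second) without leaving u64.
pub fn deciseconds(dur: Duration) -> u64 {
    dur.as_secs() * 10 + u64::from(dur.subsec_millis()) / 100
}

/// Current time in deciseconds since the Unix epoch, the unit TCF timestamps use.
pub fn now_deciseconds() -> u64 {
    let dur = SystemTime::now()
        .duration_since(UNIX_EPOCH)
        .expect("system clock should be after the Unix epoch");
    deciseconds(dur)
}
```

For example, a duration of 1,250 ms gives 1 * 10 + 250 / 100 = 12 deciseconds, with the sub-decisecond remainder truncated.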

ChristianPavilonis added a commit that referenced this pull request Mar 9, 2026
Address PR #380 review findings:
- Cap TCF vendor range expansion to MAX_VENDOR_ID (10,000) to prevent DoS
- Add GPP string length guard (8,192 bytes) before parsing
- Replace unwrap_or_default() with expect() in now_deciseconds
- Document gate_eids_by_consent as deferred until EID wiring
- Drop plotters feature from criterion to reduce compile-time deps
- Add explanatory comment for dead_code allow in build.rs

Wire consent signals into OpenRTB bid requests, add per-partner
forwarding modes, and persist consent to KV Store for returning users.

Phase 2 - OpenRTB integration: populate regs/user consent fields with
dual-placement (top-level 2.6 + ext), add EID consent gating, AC string
forwarding, and new Eid/Uid/ConsentedProvidersSettings structs.

Phase 3 - Configuration + observability: add [consent] config section
with jurisdiction detection, expiration checking, GPC-to-US-Privacy
construction, and structured logging.

Phase 4 - Partner integrations: cookie stripping via ConsentForwardingMode,
Prebid/Lockr consent cookie filtering, APS consent fields, adserver mock
consent summary.

Phase 5 - KV Store persistence: consent/kv.rs with KvConsentEntry and
ConsentKvMetadata types, SHA-256 fingerprint change detection, read
fallback when cookies absent, write-on-change via Fastly KV Store API.

Fix expired TCF leaking through GPP fallback by clearing both sources.
Add gpp_section_ids to is_empty() check and KV fingerprint hash.
Generate synthetic ID before consent pipeline in auction endpoint so
KV fallback and write operations work correctly.
Add max-length guard on TC strings before base64 decoding.
Use jurisdiction to inform regs.gdpr instead of falsely emitting 0.
Defer cloning in TCF conflict resolution and remove dead code.

Check TCF/GPP consent signals before setting the Synthetic Session
Cookie. EU/UK users require explicit opt-in (TCF Purpose 1), US users
are allowed unless they opt out via US Privacy or GPC, and
non-regulated jurisdictions are unaffected.

When consent is absent but an existing SSC cookie is present, the
cookie is expired and the corresponding KV store entry is deleted.
Revocation targets the cookie value directly (not the x-synthetic-id
header) to avoid mismatched deletions.

Closes #464, closes #465, closes #466.
@ChristianPavilonis ChristianPavilonis force-pushed the feature/consent-management branch from 6834a2e to ba8eb3c Compare March 9, 2026 19:56
@ChristianPavilonis ChristianPavilonis marked this pull request as ready for review March 12, 2026 16:17
Collaborator

@aram356 aram356 left a comment

Summary

Well-structured consent management pipeline (~5,600 lines, 29 files). Architecture is sound: extraction → decoding → normalization → enforcement → forwarding. Security hardening (input caps, vendor bounds, write-on-change KV) is thorough. 62 new tests with good edge-case coverage. All CI checks pass.

No blocking issues found — five non-blocking observations below.

Non-blocking

🤔 thinking

  • TCF language values not validated against spec range: 6-bit fields can hold 0–63 but only 0–25 are valid per ISO 639-1. Malformed TC strings could produce non-ASCII characters in OpenRTB output. (tcf.rs:127)
  • vendor_section_end_offset intentionally uncapped: Correct behavior but diverges from decode_vendor_section without explaining why. Fragile for future editors. (tcf.rs:261)

🏕 camp site

  • Silent fallback to restrictive mode: When select_newest_signal returns None, .unwrap_or(!gpp_allows) falls back to the restrictive choice — a significant policy decision buried in an unwrap_or. (mod.rs:325)

♻️ refactor

  • Verbose test helpers: make_tcf / make_tcf_with_storage each build 24-element purpose vectors manually. A builder helper would reduce boilerplate. (mod.rs:622-664)

🌱 seedling

  • Vec<u16> for vendor lists: Linear search is fine today, but worth switching to HashSet when vendor-level consent checking becomes a hot path. (types.rs:248)

CI Status

  • fmt: PASS
  • clippy: PASS
  • rust tests: PASS
  • js tests: PASS

// Consent language: two 6-bit values, each offset by 'A' (65)
let lang_a = reader.read_u8(108, 6);
let lang_b = reader.read_u8(114, 6);
let consent_language = format!(
    "{}{}",
    char::from(b'A' + lang_a),
    char::from(b'A' + lang_b),
);

🤔 thinking — TCF language values not validated against spec range

The two 6-bit fields can hold values 0–63, but only 0–25 are valid per the spec (maps to A–Z for ISO 639-1). A malformed TC string with e.g. value 63 produces b'A' + 63 = 128, a non-ASCII character in consent_language. Impact is limited since this is an informational field, but it could produce garbage in OpenRTB consent_language output.

A guard before the format would make this airtight:

if lang_a > 25 || lang_b > 25 {
    return Err(Report::new(ConsentDecodeError::InvalidTcString {
        reason: format!("invalid consent language values: ({lang_a}, {lang_b})"),
    }));
}
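
A self-contained variant of this guard, for trying the boundary values outside the decoder. It returns a plain String error instead of the crate's Report/ConsentDecodeError types, and decode_consent_language is a hypothetical function name.

```rust
/// Decodes the two 6-bit TCF language values into a two-letter code.
/// Values above 25 fall outside A-Z and are rejected.
pub fn decode_consent_language(lang_a: u8, lang_b: u8) -> Result<String, String> {
    if lang_a > 25 || lang_b > 25 {
        return Err(format!(
            "invalid consent language values: ({lang_a}, {lang_b})"
        ));
    }
    Ok(format!(
        "{}{}",
        char::from(b'A' + lang_a),
        char::from(b'A' + lang_b)
    ))
}
```

With this in place, the values (4, 13) decode to "EN", while (63, 0) is rejected before any character arithmetic happens.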

return Ok(offset);
}

let max_vendor_id = reader.read_u16(offset, 16);

🤔 thinking — vendor_section_end_offset intentionally skips cap but diverges from decode_vendor_section

decode_vendor_section caps max_vendor_id to MAX_VENDOR_ID (line 184), but this function uses the raw value. This is correct — the offset must reflect the actual bitstream layout to find where the LI section starts. But the divergence is fragile; a future editor might "fix" this by adding the cap and introduce a subtle bug.

A comment would help:

// Note: intentionally uncapped — we need the raw max_vendor_id to
// compute the true end-of-section offset in the bitstream, even
// though decode_vendor_section caps at MAX_VENDOR_ID for allocation.
let max_vendor_id = reader.read_u16(offset, 16);

gpp_tcf,
config.conflict_resolution.freshness_threshold_days,
)
.unwrap_or(!gpp_allows),

🏕 camp site — Silent fallback to restrictive mode deserves a comment

When both TCF timestamps are within freshness_threshold_days, select_newest_signal returns None and .unwrap_or(!gpp_allows) falls back to the restrictive choice. This is a significant policy decision that's easy to miss.

// When both signals are within the freshness threshold, fall back
// to the restrictive choice (prefer whichever denies consent).
.unwrap_or(!gpp_allows),
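
To make the policy explicit, here is a rough sketch of the fallback. It assumes the Option<bool> means "prefer the GPP-embedded signal" and that "within the freshness threshold" means the two timestamps are within that many days of each other; both are readings of this thread, not confirmed behavior, and the parameter names and use_gpp_in_conflict are hypothetical.

```rust
const DECISECONDS_PER_DAY: u64 = 24 * 60 * 60 * 10;

/// Returns Some(true) if the GPP-embedded signal is decisively newer,
/// Some(false) if the standalone signal is, and None if they are too close to call.
pub fn select_newest_signal(
    standalone_updated_ds: u64,
    gpp_updated_ds: u64,
    freshness_threshold_days: u64,
) -> Option<bool> {
    let threshold = freshness_threshold_days.saturating_mul(DECISECONDS_PER_DAY);
    if gpp_updated_ds.abs_diff(standalone_updated_ds) <= threshold {
        return None;
    }
    Some(gpp_updated_ds > standalone_updated_ds)
}

/// In a conflict exactly one side allows. With no clear winner, pick the side
/// that denies: use GPP only if GPP is the one that does not allow.
pub fn use_gpp_in_conflict(newest: Option<bool>, gpp_allows: bool) -> bool {
    newest.unwrap_or(!gpp_allows)
}
```

Spelled out this way, the restrictive default is visible at the call site: when GPP allows and the standalone string denies, an inconclusive freshness check selects the standalone string.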

vendor_legitimate_interests: Vec::new(),
special_feature_opt_ins: vec![false; 12],
}
}

♻️ refactor — Test helpers make_tcf / make_tcf_with_storage are verbose

Each constructs TcfConsent with a 24-element purpose_consents vector. A builder helper that defaults everything and lets callers set just the relevant purpose flags (e.g. .with_purpose1(true)) would cut ~40 lines of boilerplate per constructor and make the tests more readable.
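
One possible shape for such a builder, as a sketch. The TcfConsent here carries only the four fields visible in this thread (the real struct likely has more), and TcfConsentBuilder with its method names is hypothetical.

```rust
/// Reduced stand-in for the crate's TcfConsent (subset of fields only).
#[derive(Debug, Clone, PartialEq)]
pub struct TcfConsent {
    pub purpose_consents: Vec<bool>,
    pub vendor_consents: Vec<u16>,
    pub vendor_legitimate_interests: Vec<u16>,
    pub special_feature_opt_ins: Vec<bool>,
}

/// Test-only builder: everything denied by default, callers flip what they need.
pub struct TcfConsentBuilder {
    inner: TcfConsent,
}

impl TcfConsentBuilder {
    pub fn new() -> Self {
        Self {
            inner: TcfConsent {
                purpose_consents: vec![false; 24],
                vendor_consents: Vec::new(),
                vendor_legitimate_interests: Vec::new(),
                special_feature_opt_ins: vec![false; 12],
            },
        }
    }

    /// Sets a purpose by its 1-based TCF number (1..=24).
    pub fn with_purpose(mut self, purpose: usize, granted: bool) -> Self {
        assert!((1..=24).contains(&purpose), "purpose must be in 1..=24");
        self.inner.purpose_consents[purpose - 1] = granted;
        self
    }

    pub fn with_purpose1(self, granted: bool) -> Self {
        self.with_purpose(1, granted)
    }

    pub fn with_vendor_consent(mut self, vendor_id: u16) -> Self {
        self.inner.vendor_consents.push(vendor_id);
        self
    }

    pub fn build(self) -> TcfConsent {
        self.inner
    }
}
```

A test would then read TcfConsentBuilder::new().with_purpose1(true).build() instead of spelling out all 24 purpose flags.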

/// Checks whether a specific vendor has been granted consent.
#[must_use]
pub fn has_vendor_consent(&self, vendor_id: u16) -> bool {
    self.vendor_consents.contains(&vendor_id)

🌱 seedling — Vendor lists use Vec<u16> with linear search

has_vendor_consent / has_vendor_li call Vec::contains, which is O(n). With the MAX_VENDOR_ID cap of 10,000 this could become noticeable if vendor-level consent checking lands on the hot path (e.g. per-EID gating). A HashSet<u16> or sorted vec with binary search would be O(1)/O(log n). Fine for now — just flagging for when EID gating is wired in.
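
If this ever becomes worth doing, the two alternatives could look like the sketch below. VendorSet and to_hash_set are hypothetical names; the sorted-vec variant keeps the compact Vec<u16> storage while making lookups O(log n).

```rust
use std::collections::HashSet;

/// Sorted, de-duplicated vendor IDs with binary-search lookup.
pub struct VendorSet {
    sorted: Vec<u16>,
}

impl VendorSet {
    pub fn from_ids(mut ids: Vec<u16>) -> Self {
        ids.sort_unstable();
        ids.dedup();
        Self { sorted: ids }
    }

    /// O(log n) membership test.
    pub fn contains(&self, vendor_id: u16) -> bool {
        self.sorted.binary_search(&vendor_id).is_ok()
    }

    pub fn len(&self) -> usize {
        self.sorted.len()
    }
}

/// Alternative: hash-based set with O(1) average lookup.
pub fn to_hash_set(ids: &[u16]) -> HashSet<u16> {
    ids.iter().copied().collect()
}
```

Because decoded vendor lists are built once and queried many times, the one-time sort at construction is likely the cheaper trade.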
