Implement consent forwarding pipeline#380
Conversation
b4dfdde to
3e8e3c5
Compare
|
This has a minimal TCF Decoding implementation. |
|
Also maybe #390 should be merged first, there will be conflicts. |
|
@ChristianPavilonis Please resolve conflicts |
aram356
left a comment
There was a problem hiding this comment.
Summary
Comprehensive consent forwarding pipeline spanning signal extraction, TCF/GPP/USP decoding, jurisdiction detection, KV persistence, and OpenRTB integration. The architecture is clean and well-decomposed. Four blocking issues around edge-case correctness and input validation need attention before merge.
Blocking
🔧 wrench
- Expired TCF leaks through GPP-embedded fallback: when both standalone TC and GPP-embedded TCF are present and expired, only one is cleared —
effective_tcf()still returns the GPP copy (crates/common/src/consent/mod.rs:171) is_empty()misclassifies__gpp_sid-only requests:gpp_section_idsis not checked, so requests with only__gpp_sid=2,6are treated as empty — skipping logging and KV writes (crates/common/src/consent/types.rs:155)- KV Store consent persistence disabled on
/auctionpath:synthetic_id: Nonecauses bothtry_kv_fallbackandtry_kv_writeto early-return (crates/common/src/auction/endpoints.rs:57) - No input length validation on consent strings before decoding: unbounded base64 decode from cookie input could allocate large heap buffers from malicious input (
crates/common/src/consent/tcf.rs:57)
Non-blocking
🤔 thinking
- KV fingerprint ignores
gpp_section_ids: SID-only changes are invisible to the fingerprint, skipping KV writes (crates/common/src/consent/kv.rs:178) regs.gdpr = Some(0)false negative in ambiguous cases: a GDPR-jurisdiction user without a TCF cookie getsregs.gdpr=0, signaling "GDPR does not apply" (crates/common/src/integrations/prebid.rs:644)
♻️ refactor
apply_tcf_conflict_resolutionclones eagerly: bothTcfConsentstructs are cloned before determining if a conflict exists (crates/common/src/consent/mod.rs:312)now_deciseconds()usesas u64truncation fromu128: safe in practice but avoidable withu64-native arithmetic (crates/common/src/consent/mod.rs:377)
🌱 seedling
extract_and_log_consentis declared but never called: public function with zero callers — either document the intent or remove (crates/common/src/consent/mod.rs:203)- No end-to-end test for consent → OpenRTB pipeline: individual consent modules are well-tested, but there's no integration test that starts from a
Requestwith consent cookies and asserts the finalOpenRtbRequestJSON contains correctregs.gdpr,regs.gpp,user.consent, etc. This cross-module path (endpoints.rs→formats.rs→prebid.rs) is where mismatches are most likely. Consider adding one in a follow-up. gate_eids_by_consentis defined but not wired in:gate_eids_by_consent(mod.rs:430) is implemented and unit-tested but never called in the actual bid request path. The EIDs field is alwaysNonein prebid.rs. Document whether this is deferred to a future phase or wire it in.
👍 praise
- Clean pipeline architecture: the extract → decode → normalize → gate pipeline is well-decomposed. Each stage is independently testable, proxy mode is a clean escape hatch, and the KV fingerprint-based write deduplication is smart. The conflict resolution strategies (Restrictive/Newest/Permissive) and config-driven jurisdiction lists are particularly well thought out.
- Dual-placement for Prebid compatibility: populating both OpenRTB 2.6 top-level fields and
regs.ext.*/user.ext.consentfor Prebid compatibility is the right call. TheRegsExtmirroring pattern with corresponding tests shows attention to real-world integration needs.
CI Status
- Analyze (javascript-typescript): PASS
Review feedback addressed in 798e4a2All 9 review comments addressed in a single commit. Here's how each was resolved: Bug fixes
Design concerns
Cleanup
|
Address PR #380 review findings: - Cap TCF vendor range expansion to MAX_VENDOR_ID (10,000) to prevent DoS - Add GPP string length guard (8,192 bytes) before parsing - Replace unwrap_or_default() with expect() in now_deciseconds - Document gate_eids_by_consent as deferred until EID wiring - Drop plotters feature from criterion to reduce compile-time deps - Add explanatory comment for dead_code allow in build.rs
Wire consent signals into OpenRTB bid requests, add per-partner forwarding modes, and persist consent to KV Store for returning users. Phase 2 - OpenRTB integration: populate regs/user consent fields with dual-placement (top-level 2.6 + ext), add EID consent gating, AC string forwarding, and new Eid/Uid/ConsentedProvidersSettings structs. Phase 3 - Configuration + observability: add [consent] config section with jurisdiction detection, expiration checking, GPC-to-US-Privacy construction, and structured logging. Phase 4 - Partner integrations: cookie stripping via ConsentForwardingMode, Prebid/Lockr consent cookie filtering, APS consent fields, adserver mock consent summary. Phase 5 - KV Store persistence: consent/kv.rs with KvConsentEntry and ConsentKvMetadata types, SHA-256 fingerprint change detection, read fallback when cookies absent, write-on-change via Fastly KV Store API.
Fix expired TCF leaking through GPP fallback by clearing both sources. Add gpp_section_ids to is_empty() check and KV fingerprint hash. Generate synthetic ID before consent pipeline in auction endpoint so KV fallback and write operations work correctly. Add max-length guard on TC strings before base64 decoding. Use jurisdiction to inform regs.gdpr instead of falsely emitting 0. Defer cloning in TCF conflict resolution and remove dead code.
Check TCF/GPP consent signals before setting the Synthetic Session Cookie. EU/UK users require explicit opt-in (TCF Purpose 1), US users are allowed unless they opt out via US Privacy or GPC, and non-regulated jurisdictions are unaffected. When consent is absent but an existing SSC cookie is present, the cookie is expired and the corresponding KV store entry is deleted. Revocation targets the cookie value directly (not the x-synthetic-id header) to avoid mismatched deletions. Closes #464, closes #465, closes #466.
Address PR #380 review findings: - Cap TCF vendor range expansion to MAX_VENDOR_ID (10,000) to prevent DoS - Add GPP string length guard (8,192 bytes) before parsing - Replace unwrap_or_default() with expect() in now_deciseconds - Document gate_eids_by_consent as deferred until EID wiring - Drop plotters feature from criterion to reduce compile-time deps - Add explanatory comment for dead_code allow in build.rs
6834a2e to
ba8eb3c
Compare
aram356
left a comment
There was a problem hiding this comment.
Summary
Well-structured consent management pipeline (~5,600 lines, 29 files). Architecture is sound: extraction → decoding → normalization → enforcement → forwarding. Security hardening (input caps, vendor bounds, write-on-change KV) is thorough. 62 new tests with good edge-case coverage. All CI checks pass.
No blocking issues found — five non-blocking observations below.
Non-blocking
🤔 thinking
- TCF language values not validated against spec range: 6-bit fields can hold 0–63 but only 0–25 are valid per ISO 639-1. Malformed TC strings could produce non-ASCII characters in OpenRTB output. (
tcf.rs:127) vendor_section_end_offsetintentionally uncapped: Correct behavior but diverges fromdecode_vendor_sectionwithout explaining why. Fragile for future editors. (tcf.rs:261)
🏕 camp site
- Silent fallback to restrictive mode: When
select_newest_signalreturnsNone,.unwrap_or(!gpp_allows)falls back to the restrictive choice — a significant policy decision buried in anunwrap_or. (mod.rs:325)
♻️ refactor
- Verbose test helpers:
make_tcf/make_tcf_with_storageeach build 24-element purpose vectors manually. A builder helper would reduce boilerplate. (mod.rs:622-664)
🌱 seedling
Vec<u16>for vendor lists: Linear search is fine today, but worth switching toHashSetwhen vendor-level consent checking becomes a hot path. (types.rs:248)
CI Status
- fmt: PASS
- clippy: PASS
- rust tests: PASS
- js tests: PASS
| // Consent language: two 6-bit values, each offset by 'A' (65) | ||
| let lang_a = reader.read_u8(108, 6); | ||
| let lang_b = reader.read_u8(114, 6); | ||
| let consent_language = format!("{}{}", char::from(b'A' + lang_a), char::from(b'A' + lang_b),); |
There was a problem hiding this comment.
🤔 thinking — TCF language values not validated against spec range
The two 6-bit fields can hold values 0–63, but only 0–25 are valid per the spec (maps to A–Z for ISO 639-1). A malformed TC string with e.g. value 63 produces b'A' + 63 = 128, a non-ASCII character in consent_language. Impact is limited since this is an informational field, but it could produce garbage in OpenRTB consent_language output.
A guard before the format would make this airtight:
if lang_a > 25 || lang_b > 25 {
return Err(Report::new(ConsentDecodeError::InvalidTcString {
reason: format!("invalid consent language values: ({lang_a}, {lang_b})"),
}));
}| return Ok(offset); | ||
| } | ||
|
|
||
| let max_vendor_id = reader.read_u16(offset, 16); |
There was a problem hiding this comment.
🤔 thinking — vendor_section_end_offset intentionally skips cap but diverges from decode_vendor_section
decode_vendor_section caps max_vendor_id to MAX_VENDOR_ID (line 184), but this function uses the raw value. This is correct — the offset must reflect the actual bitstream layout to find where the LI section starts. But the divergence is fragile; a future editor might "fix" this by adding the cap and introduce a subtle bug.
A comment would help:
// Note: intentionally uncapped — we need the raw max_vendor_id to
// compute the true end-of-section offset in the bitstream, even
// though decode_vendor_section caps at MAX_VENDOR_ID for allocation.
let max_vendor_id = reader.read_u16(offset, 16);| gpp_tcf, | ||
| config.conflict_resolution.freshness_threshold_days, | ||
| ) | ||
| .unwrap_or(!gpp_allows), |
There was a problem hiding this comment.
🏕 camp site — Silent fallback to restrictive mode deserves a comment
When both TCF timestamps are within freshness_threshold_days, select_newest_signal returns None and .unwrap_or(!gpp_allows) falls back to the restrictive choice. This is a significant policy decision that's easy to miss.
// When both signals are within the freshness threshold, fall back
// to the restrictive choice (prefer whichever denies consent).
.unwrap_or(!gpp_allows),| vendor_legitimate_interests: Vec::new(), | ||
| special_feature_opt_ins: vec![false; 12], | ||
| } | ||
| } |
There was a problem hiding this comment.
♻️ refactor — Test helpers make_tcf / make_tcf_with_storage are verbose
Each constructs TcfConsent with a 24-element purpose_consents vector. A builder helper that defaults everything and lets callers set just the relevant purpose flags (e.g. .with_purpose1(true)) would cut ~40 lines of boilerplate per constructor and make the tests more readable.
| /// Checks whether a specific vendor has been granted consent. | ||
| #[must_use] | ||
| pub fn has_vendor_consent(&self, vendor_id: u16) -> bool { | ||
| self.vendor_consents.contains(&vendor_id) |
There was a problem hiding this comment.
🌱 seedling — Vendor lists use Vec<u16> with linear search
has_vendor_consent / has_vendor_li call Vec::contains, which is O(n). With the MAX_VENDOR_ID cap of 10,000 this could become noticeable if vendor-level consent checking lands on the hot path (e.g. per-EID gating). A HashSet<u16> or sorted vec with binary search would be O(1)/O(log n). Fine for now — just flagging for when EID gating is wired in.
Summary
[consent]section with jurisdiction detection, per-partner forwarding modes, expiration checking, and GPC-to-US-Privacy construction.Changes
Consent-gated SSC creation
crates/common/src/consent/mod.rsallows_ssc_creation()— consent decision function: GDPR requires TCF P1 opt-in, US checks opt-out, non-regulated allowscrates/common/src/cookies.rsexpire_synthetic_cookie()— expires SSC cookie withMax-Age=0for revocationcrates/common/src/consent/kv.rsdelete_consent_from_kv()— deletes KV store entry on consent revocationcrates/common/src/publisher.rshandle_publisher_request(): set SSC (consent given), revoke (consent withdrawn + cookie present), or skip (no consent + no cookie)OpenRTB integration
crates/common/src/openrtb.rsregs/userconsent fields with dual-placement (top-level 2.6 +extfor older exchanges); addEid,Uid,ConsentedProvidersSettingsstructsConfiguration & observability
crates/common/src/consent_config.rs[consent]config section:ConsentConfig,ConsentMode,ConsentForwardingMode,GdprConfig(31 countries),UsStatesConfig(20 states), conflict resolution, expiration checkingcrates/common/src/consent/jurisdiction.rsJurisdictionenum (Gdpr, UsState, NonRegulated, Unknown) +detect_jurisdiction()from geo + configcrates/common/src/consent/mod.rsbuild_consent_context(),ConsentPipelineInput, KV fallback/write, expiration checking, GPC-to-US-Privacy, EID gatingcrates/common/src/consent/types.rsTcfConsenthelper methods (has_purpose_consent,has_storage_consent, etc.)crates/common/src/settings.rsconsent: ConsentConfigfieldcrates/common/src/lib.rsconsent_configcrates/common/build.rsconsent_config.rsin build inputsPartner integrations
crates/common/src/cookies.rsstrip_cookies,forward_cookie_header,CONSENT_COOKIE_NAMES)crates/common/src/integrations/prebid.rsConsentForwardingModesupport, consent cookie stripping inOpenrtbOnlymodecrates/common/src/integrations/lockr.rsforward_cookie_headercrates/common/src/integrations/aps.rsApsGdprConsentstruct, consent fields inApsBidRequestcrates/common/src/integrations/adserver_mock.rsextKV Store persistence
crates/common/src/consent/kv.rsKvConsentEntryandConsentKvMetadatatypes, SHA-256 fingerprint change detection, read fallback when cookies absent, write-on-change via Fastly KV Store APIWiring & config
crates/common/src/auction/endpoints.rs/auctionendpointcrates/common/src/publisher.rssynthetic_idinto publisher handlerfastly.tomlconsent_storeKV store for local devtrusted-server.toml[consent]config section with all optionsKey design decisions
extfor backward compatibility with older exchanges.ConsentForwardingModecontrols whether consent travels via OpenRTB body only (OpenrtbOnlystrips cookies) or both cookies and body (CookiesAndBody).allows_ssc_creation()centralizes the consent decision. GDPR requires TCF Purpose 1 (store/access device) opt-in; US state privacy uses opt-out model (block only if user explicitly opted out). Non-regulated and unknown jurisdictions default to allowing SSC.x-synthetic-idheader) to ensure the correct KV entry is deleted when both are present.How to enable
[consent]section intrusted-server.tomlconsent_storeinfastly.toml(already added for local dev)mode = "proxy"ormode = "interpreter"depending on desired consent processing depthTest plan
cargo fmt --all -- --checkcargo clippy --all-targets --all-features -- -D warningscargo test --workspace— 493 tests passingnpx vitest run— 150 JS tests passingnpm run format(js + docs) — cleancargo build --bin trusted-server-fastly --release --target wasm32-wasip1— WASM build passesChecklist
Closes #312
Closes #464
Closes #465
Closes #466