[SPIKE / DO NOT MERGE] Segment storage injection: prove encryption-at-rest is feasible without forking #712
…encryption (no fork needed)

Evidence-only branch. NOT FOR MERGE.

Implements a WrappingStorageProvider that wraps analytics-kotlin's default AndroidStorageProvider via Configuration.storageProvider. On write/writePrefs the payload is prepended with a 13-byte sentinel and XORed with 0x55; on read/readAsStream the mutation is reversed so the library never sees the wrapping. A spike sequence in the java_layout sample app plants SPIKE_ needles via the public CIO API (identify / track / screen).

Findings (full report in SPIKE_REPORT.md):
- Event-queue path: fully intercepted. 12 wrapped writes to the segment event-queue file, all sentinel-prefixed and XORed; zero SPIKE_ plaintext in the data directory after the run. Rollover + upload + remove all observed through the wrapper.
- KVS path: partial bypass. Zero writePrefs calls were observed, yet the analytics-android-<writeKey>.xml ended up with plaintext anonymousId and settings. AndroidKVS writes some keys directly via SharedPreferences.edit(), bypassing the Storage interface. Closed by feeding analytics-kotlin a custom Context whose getSharedPreferences returns a CryptoProvider-backed wrapper, not by forking the library.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
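The reversible mutation described in the commit message can be sketched in plain Java as below. This is a minimal illustration, not the code on the branch: the class name `SpikeWrapping` and its methods are hypothetical, and it assumes the sentinel is stored in the clear while only the payload bytes are XORed. Passing unwrapped data through unchanged on read is also an assumption.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical stand-in for the spike's wrap/unwrap step; not the branch's actual code.
final class SpikeWrapping {
    // 13-byte marker prepended to every wrapped payload.
    static final byte[] SENTINEL = "WRAPPED::v1::".getBytes(StandardCharsets.US_ASCII);
    static final byte XOR_KEY = 0x55;

    private SpikeWrapping() {}

    // Write path: sentinel (in the clear, by assumption) followed by payload XOR 0x55.
    static byte[] wrap(byte[] plain) {
        byte[] out = new byte[SENTINEL.length + plain.length];
        System.arraycopy(SENTINEL, 0, out, 0, SENTINEL.length);
        for (int i = 0; i < plain.length; i++) {
            out[SENTINEL.length + i] = (byte) (plain[i] ^ XOR_KEY);
        }
        return out;
    }

    // True when the stored bytes begin with the sentinel.
    static boolean isWrapped(byte[] stored) {
        if (stored.length < SENTINEL.length) {
            return false;
        }
        return Arrays.equals(Arrays.copyOfRange(stored, 0, SENTINEL.length), SENTINEL);
    }

    // Read path: strip the sentinel and undo the XOR.
    // Data without the sentinel is returned unchanged (an assumption of this sketch).
    static byte[] unwrap(byte[] stored) {
        if (!isWrapped(stored)) {
            return stored;
        }
        byte[] out = new byte[stored.length - SENTINEL.length];
        for (int i = 0; i < out.length; i++) {
            out[i] = (byte) (stored[SENTINEL.length + i] ^ XOR_KEY);
        }
        return out;
    }
}
```

A wrapped payload is always 13 bytes longer than its input, and `unwrap(wrap(x))` returns the original bytes. XOR with a fixed byte is not encryption. It only makes interception visible on disk, which is all the spike needs.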
Codecov Report
❌ Patch coverage is
Additional details and impacted files

@@             Coverage Diff              @@
##               main     #712      +/-   ##
============================================
- Coverage     69.07%   65.79%   -3.28%
+ Complexity      838      793      -45
============================================
  Files           149      156       +7
  Lines          4601     4765     +164
  Branches        628      651      +23
============================================
- Hits           3178     3135      -43
- Misses         1189     1389     +200
- Partials        234      241       +7
Evidence only — DO NOT MERGE
This branch exists to document a spike outcome. It will never be merged into main.

Spike goal
Prove (or disprove) that we can fully influence every byte the Segment analytics-kotlin library persists to disk by injecting a custom StorageProvider via Configuration.storageProvider, without forking the library.

TL;DR
Yes for the event queue + most KVS keys. No for anonymousId/settings (which aren't PII, so this is acceptable under the PII-only scope decision in the parent plan).

- Event queue (app_segment-disk-queue/<writeKey>-N) is fully wrapped: 12 wrapped writes, every byte through Storage.write, no SPIKE_ plaintext on disk after the run. Rollover, upload to the Segment CDP HTTP endpoint, and post-flush removeFile were all observed through the wrapper.
- segment.app.version, segment.app.build, segment.device.id in the analytics-android-<writeKey> SharedPreferences file all start with WRAPPED::v1::.
- segment.anonymousId and segment.settings landed plaintext in the same SharedPreferences file. The trace log recorded zero writePrefs calls during the entire run: AndroidKVS writes these directly via SharedPreferences.edit(), bypassing the Storage interface.

What that means for the encryption-at-rest plan
The fork-vs-public-API decision is resolved: we use the public APIs. Phase 3 Android:

- Configuration.storageProvider = CioEncryptedStorageProvider(...). Path validated by this spike.
- userId/traits in the analytics-android-<writeKey> SharedPreferences (PII per the scope): feed analytics-kotlin a custom Context whose getSharedPreferences("analytics-android-<writeKey>", MODE_PRIVATE) returns a CryptoProvider-backed wrapper around regular SharedPreferences. Not AndroidX EncryptedSharedPreferences: Iterable's documented main-thread ANR rules it out.
- anonymousId and settings get encrypted as a side effect of wrapping the whole file; they could have been left plaintext too per the PII-only scope, but wrapping the entire file is simpler than per-key carve-outs.

What's on the branch
- datapipelines/src/main/kotlin/io/customer/datapipelines/spike/WrappingStorageProvider.kt: StorageProvider wrapping AndroidStorageProvider. Traces every Storage call via logcat tag SegmentStorageTrace, wraps writes with WRAPPED::v1:: sentinel + XOR-with-0x55, unwraps reads transparently.
- datapipelines/.../extensions/AnalyticsExt.kt: storageProvider = WrappingStorageProvider() on the Segment Configuration.
- samples/java_layout/.../SampleApplication.java: runSegmentStorageSpike() planting SPIKE_ needles via the public CIO API.
- SPIKE_REPORT.md

Verification
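The "no SPIKE_ plaintext on disk" check can be reproduced against a local copy of the app's data directory with a scan along these lines. This is a hedged sketch, not the tooling used for the report: the class `NeedleScan` and the method `filesContaining` are hypothetical names, and it assumes the directory has already been copied off the device.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical helper: lists every regular file under a directory that contains a plaintext needle.
final class NeedleScan {
    private NeedleScan() {}

    static List<Path> filesContaining(Path root, String needle) throws IOException {
        List<Path> files;
        try (Stream<Path> paths = Files.walk(root)) {
            files = paths.filter(Files::isRegularFile).collect(Collectors.toList());
        }
        List<Path> hits = new ArrayList<>();
        for (Path file : files) {
            // ISO-8859-1 maps each byte to exactly one char, so binary files
            // decode without errors and ASCII needles still match.
            String content = new String(Files.readAllBytes(file), StandardCharsets.ISO_8859_1);
            if (content.contains(needle)) {
                hits.add(file);
            }
        }
        return hits;
    }
}
```

Calling `NeedleScan.filesContaining(dataDir, "SPIKE_")` should return an empty list for a fully wrapped directory. Any path it returns is a file where a planted needle reached disk unwrapped.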
See SPIKE_REPORT.md for the full reconciliation table.

Related
- ~/.claude/plans/encryption-at-rest.md (§5 Phase 3, updated with these findings)
- customerio-ios repo