IPIP-0421: HTTP Delegated Routing Reader Privacy Upgrade #421

Open: wants to merge 15 commits into base: main
Rewrite IPIP document to reflect on comments and refine text
* Refine paragraphs for better readability.
* Change section on router selection based on code, since from multihash
code alone we cannot determine whether the request is encrypted
or not.
* Update alternatives section to explain how the IPIP can be enhanced
with OHTTP and Tor.
masih committed Jul 12, 2023
commit b8942796e3ff07b82bf67e4f96fa6fadd3aea572
86 changes: 36 additions & 50 deletions src/ipips/ipip-0421.md
@@ -17,94 +17,80 @@ tags: ['ipips', 'routing', 'privacy', 'double hashing']

## Summary

This IPIP specifies a new HTTP API for Privacy Preserving Delegated Content Routing provider lookups.
This IPIP introduces an HTTP API designed for Privacy Preserving Delegated Content Routing provider lookups.

## Motivation

IPFS is currently lacking of many privacy protections. One of its main weak points lies in the lack
of privacy protections for the Content Routing subsystem. Currently neither Readers (clients accessing files)
nor Writers (hosts storing and distributing content) have much privacy with regard to content they publish or
consume. It is very easy for a Content Router or a Passive Observer to learn which file is requested by
which client during the routing process, as the potential adversary easily learns about the requested `CID`.
A curious actor could request the same `CID` and download the associated file to monitor the user’s behavior.
This is obviously undesirable and has been for some time now a strong request from the community.
Currently, IPFS's privacy safeguards are notably deficient, particularly regarding the Content Routing subsystem. Neither Readers (clients who access files) nor Writers (hosts that store and distribute content) can maintain significant privacy related to the content they produce or consume. Presently, a Content Router or a Passive Observer can discern the identity of a file requested by a client and the specific client making the request during the routing process. This situation allows potential adversaries to gain knowledge about the requested CID. An interested party could then request the same CID and download the corresponding file to track the user's activities. Addressing these privacy concerns has been a long-standing demand from the community.

The latest upgrades to the DHT and IPNI have introduced Double Hashing - a technique that aims to better preserve Reader Privacy.
With Double Hashing in place Provider Records are encrypted and opaque to Content Routers. If presented with the original `CID` a
Content Router can decrypt the relevant Provider Records and serve them via the existing Delegated Routing API.
However in order to benefit from the privacy enhancement users need to change the way they interact with Content Routers, in particular:
- A second hash over the original `Multihash` must be used when looking up the content;
- Returned Provider Records are encrypted and must be decrypted by the client before using them;
- The client might choose to fetch additional encrypted Metadata from the Content Router.
Recent enhancements to the [IPFS DHT](https://github.com/ipfs/specs/pull/373) and [InterPlanetary Network Indexer (IPNI)](https://github.com/ipni/specs/pull/5) have incorporated Double Hashing to improve Reader Privacy. With Double Hashing, Provider Records become encrypted and non-transparent to Content Routers. Given the original CID, a Content Router can decrypt the relevant Provider Records and supply them through the existing Delegated Routing API. To make use of these privacy enhancements, users must modify their interactions with Content Routers by:

This new way of interaction can not be fullfilled by the existing API. This IPIP is an incremental improvement to the HTTP Delegated Routing API that adds
new endpoints for serving encrypted content. The original API can still be used for not Privacy Preserving lookups.
* Utilizing a secondary hash over the original Multihash during content lookup;
* Decrypting the returned, encrypted Provider Records prior to use; and
* Optionally retrieving additional encrypted Metadata from the Content Router.

Writer Privacy is out of scope of this IPIP and is going to be addressed separately.
Existing APIs cannot support these changes in interaction, necessitating this IPIP as a step to improve the HTTP Delegated Routing API. This proposal adds new endpoints for delivering encrypted content while maintaining the original API for non-privacy-preserving lookups. Writer Privacy, however, is not within the scope of this IPIP and will be handled separately.
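
The three client-side changes listed above can be sketched roughly as follows. This is a minimal illustration, not the construction mandated by the specification: the domain-separation labels, the use of SHA-256 for both derivations, and the toy keystream cipher (a stand-in for a real authenticated cipher) are all assumptions made for this example. The authoritative functions are defined in the Reader Privacy specification referenced by this IPIP.

```python
import hashlib

# Hypothetical domain-separation labels, chosen for this sketch only.
LOOKUP_LABEL = b"example-lookup\x00"
KEY_LABEL = b"example-key\x00"


def lookup_key(multihash: bytes) -> bytes:
    """Second hash over the original multihash; this is all the router sees."""
    return hashlib.sha256(LOOKUP_LABEL + multihash).digest()


def record_key(multihash: bytes) -> bytes:
    """Symmetric key derived from the original multihash; never sent to the router."""
    return hashlib.sha256(KEY_LABEL + multihash).digest()


def xor_stream(key: bytes, data: bytes) -> bytes:
    """Toy SHA-256 counter-mode keystream, standing in for a real AEAD cipher.

    Not secure: there is no nonce and no authentication. Applying it twice
    with the same key returns the original data.
    """
    out = bytearray()
    for offset in range(0, len(data), 32):
        block = hashlib.sha256(key + (offset // 32).to_bytes(8, "big")).digest()
        chunk = data[offset:offset + 32]
        out.extend(b ^ k for b, k in zip(chunk, block))
    return bytes(out)


# A sha2-256 multihash: code 0x12, length 0x20, then the 32-byte digest.
original_mh = bytes.fromhex("1220") + hashlib.sha256(b"hello").digest()

# Router side: the record is indexed by the second hash and stored encrypted.
store = {lookup_key(original_mh): xor_stream(record_key(original_mh), b"provider: peer-A")}

# Reader side: look up by second hash, then decrypt locally.
encrypted = store[lookup_key(original_mh)]
decrypted = xor_stream(record_key(original_mh), encrypted)
```

The point of the sketch is the asymmetry: a party holding only the lookup key cannot derive the record key, while a party holding the original multihash can derive both.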

## Detailed design

See the Delegated Routing Reader Privacy Upgrade spec (:cite[http-routing-reader-privacy-v1]) included with this IPIP.
Please refer to the Delegated Routing Reader Privacy Upgrade specification (:cite[http-routing-reader-privacy-v1]) included with this IPIP for detailed design information.

## Design rationale

This API proposal makes the following changes:
- Adds new methods for looking up encrypted Provider Records and encrypted Metadata;
- Defines Hashing and Encryption functions and response payloads structure.
The proposed API makes two key changes:

There are no ideomatic changes to the API - all data formats, design rationale and principles outlined in the original :cite[ipip-0337] apply here.
1. It introduces new methods for looking up encrypted Provider Records and encrypted Metadata.
2. It establishes Hashing and Encryption functions and structures the response payloads.

### User benefit
This proposal does not alter the API's idioms, upholding all data formats, design rationale, and principles established in the original :cite[ipip-0337].

With the new APIs users can protect themselves from:
- a malicious actor spying on the user by observing the user to Content Router traffic and then downloading the same data;
- the new API is a first step towards fully private HTTP Delegated Routing protocol that will eliminate IPNI as centralised observers.
### User benefit

There are no other functional improvements.
With the proposed APIs, users can protect themselves against malicious actors who might spy on their activities by monitoring their traffic to Content Routers and subsequently downloading identical data. Additionally, this API serves as a first step towards a fully private HTTP Delegated Routing protocol, which would eliminate centralized observers like IPNI routers.

### Compatibility

#### Backwards Compatibility

Users will need to explicitly turn on Reader Privacy on their nodes. A new flag can be introduced to the Kubo's HTTP Delegated Content Router configuration to facilitate that functionality.
Users on older nodes can continue using the old API and turn on reader Privacy at a alter point.
Users will need to deliberately activate Reader Privacy on their nodes. A new flag could be introduced into the HTTP Delegated Content Router configuration of IPFS implementations such as Kubo to streamline this process. Users on older nodes can continue using the existing API and switch on Reader Privacy later.

Review comment (Contributor):

I'd hope this doesn't need to be the case in an application that has some IPFS smarts (rather than a simple HTTP client). If enough features are expressed through something like #388 then the client should be able to have plausible defaults here (e.g. if my delegated router supports IPNI + DHT, but only IPNI has double-hashing support and the client can run its own DHT client it could choose to send double-hashed requests to the delegated router for IPNI and do the DHT lookups itself).

Obviously some clients will still offer configurability (e.g. would you rather ask the delegated router to do DHT lookups for you in cleartext, or not do them at all) but having reasonable default behavior should be possible.


Content Routers should provide the same QoS for both Privacy Preserving and regular APIs. This is because both can be served over the same encrypted data. If presented with a regular CID, a Content Router
can perform decryption operations on behalf of the user (i.e. mimic the client logic) and return results in clear text. If presented with a second hash the Content Router can return encrypted results and let the
user to do decryption themselves.
Content Routers should maintain the same Quality of Service (QoS) for both Privacy Preserving and regular APIs, as both can be served over the same encrypted data. A shim non-encrypted content router can be implemented to encrypt regular CIDs on the fly, proxy the requests through an encrypted content router and finally decrypt the results before returning them to the user.
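
A rough sketch of such a shim is shown below, under stated assumptions. The class names, the `find`/`find_providers` method names, and the two placeholder functions are hypothetical and exist only for this illustration; a real shim would speak the HTTP API and use the hashing and encryption functions defined in the specification.

```python
import hashlib
from typing import Callable, Dict, List


class InMemoryEncryptedRouter:
    """Stand-in for a privacy-preserving router: it only ever sees lookup keys."""

    def __init__(self) -> None:
        self._records: Dict[bytes, List[bytes]] = {}

    def put(self, lookup_key: bytes, encrypted_record: bytes) -> None:
        self._records.setdefault(lookup_key, []).append(encrypted_record)

    def find(self, lookup_key: bytes) -> List[bytes]:
        return self._records.get(lookup_key, [])


class ClearTextShim:
    """Accepts regular multihashes and proxies them through an encrypted router."""

    def __init__(
        self,
        upstream: InMemoryEncryptedRouter,
        to_lookup_key: Callable[[bytes], bytes],
        decrypt: Callable[[bytes, bytes], bytes],
    ) -> None:
        self._upstream = upstream
        self._to_lookup_key = to_lookup_key
        self._decrypt = decrypt

    def find_providers(self, multihash: bytes) -> List[bytes]:
        # 1. Hash the clear-text multihash on the fly.
        key = self._to_lookup_key(multihash)
        # 2. Query the encrypted router, 3. decrypt before returning.
        return [self._decrypt(multihash, rec) for rec in self._upstream.find(key)]


# Placeholder functions, not the ones mandated by the specification.
def demo_lookup_key(multihash: bytes) -> bytes:
    return hashlib.sha256(b"demo-lookup" + multihash).digest()


def demo_xor(multihash: bytes, data: bytes) -> bytes:
    # Toy reversible transform keyed by the multihash; insecure, demo only.
    pad = hashlib.sha256(b"demo-key" + multihash).digest()
    return bytes(b ^ pad[i % len(pad)] for i, b in enumerate(data))


upstream = InMemoryEncryptedRouter()
mh = b"\x12\x20" + hashlib.sha256(b"example content").digest()
upstream.put(demo_lookup_key(mh), demo_xor(mh, b"peer-A"))

shim = ClearTextShim(upstream, demo_lookup_key, demo_xor)
providers = shim.find_providers(mh)
```

Because the shim sees the original multihash, it offers no Reader Privacy to its own callers; it only lets clear-text clients share the same encrypted data set.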

It's possible that not all Content Routers will adopt Reader Privacy. The default HTTP Delegated Router like `cid.contact` should have Reader Privacy enabled by default in the newer versions of Kubo / Helia.
Users should verify themselves whether a custom router of their choice supports Reader Privacy or not when configuring it.
It is worth noting that not all Content Routers might adopt Reader Privacy. Default HTTP Delegated Routers like `cid.contact` should have Reader Privacy enabled by default in the latest versions of IPFS implementations such as Kubo and Helia. Users should confirm if their chosen custom router supports Reader Privacy when setting it up.

The `/routing/v1/encrypted/` API will be implemented in existing libraries like [`boxo/routing/http`](https://github.com/ipfs/boxo/tree/main/routing/http) and will not introduce any breaking changes to existing clear text endpoints.
The API will be released in a new minor version.
The `/routing/v1/encrypted/` API will be implemented in existing libraries, such as [`boxo/routing/http`](https://github.com/ipfs/boxo/tree/main/routing/http), and will not introduce any breaking changes to existing clear text endpoints. The API will be introduced in a new minor version.

#### Forwards Compatibility
#### Forward Compatibility

Reader Privacy relies on usage of specific hashing and encryption functions. Function rotation will require a network-wide migration. Content Routers might not be able to migrate "under the hood" as they
don't possess the original values. Function rotation should be a very infrequent event and will require network-wide efforts. When function rotation is needed - a version of the API will be incremented.
Reader Privacy relies on the use of specific hashing and encryption functions. Altering these functions would require a network-wide migration. Content Routers might not be able to migrate seamlessly, as they do not possess the original values. Such function rotation should therefore occur infrequently, since it necessitates network-wide effort. When function rotation is required, the API version will be incremented.

### Security

See "Threat Modelling" section of :cite[http-routing-reader-privacy-v1]
For details on security, please see the "Threat Modelling" section of :cite[http-routing-reader-privacy-v1].

### Alternatives

TODO: Describe alternate designs that were considered and related work.
When considering alternatives to this IPIP, two potential scenarios and their corresponding technologies are worth exploring:

1. Oblivious HTTP (OHTTP)
2. Onion Services

In scenario (1), `/routing/v1` would be served behind Oblivious HTTP (OHTTP), a protocol proposed by IETF and Cloudflare. OHTTP separates the information about 'who' is making a request from 'what' they are requesting, so that routing systems such as IPNI instances cannot see both pieces of information at once. This would add a layer of privacy by obscuring metadata such as user behavior patterns, IP addresses, and user-agents.

Scenario (2) envisages `/routing/v1` served behind Onion Services. Onion Services provide another approach to concealing the origin of requests by routing them through the Tor network, further enhancing user privacy.

- TODO: Oblivious HTTP ([IETF](https://www.ietf.org/archive/id/draft-thomson-http-oblivious-01.html), [Cloudflare](https://blog.cloudflare.com/stronger-than-a-promise-proving-oblivious-http-privacy-properties/))
These two scenarios and their corresponding technologies are not mutually exclusive with this IPIP. Instead, they can be viewed as complementary solutions that could be deployed in conjunction with Double Hashed records, as proposed in this IPIP, to create a more comprehensive privacy solution. The Double Hashing technique encrypts the content of the communication, making it opaque to passive observers. OHTTP and Onion Services could then provide additional privacy layers by obfuscating metadata about who is making a request.

## Test fixtures
For more information on OHTTP and Onion Services, please refer to these resources:

TODO: List relevant CIDs or JSON payloads. Describe how implementations can use them to determine
specification compliance. This section can be skipped if IPIP does not deal
with the way IPFS handles content-addressed data, or the modified specification
file already includes this information.
- [Oblivious HTTP: IETF](https://www.ietf.org/archive/id/draft-thomson-http-oblivious-01.html)
- [Oblivious HTTP: Cloudflare](https://blog.cloudflare.com/stronger-than-a-promise-proving-oblivious-http-privacy-properties/)
- [Onion Services](https://community.torproject.org/onion-services/)

### Resources

- [IPIP-272 (double hashed DHT)](https://github.com/ipfs/specs/pull/373/)
- [ipni#5 (reader privacy in indexers)](https://github.com/ipni/specs/pull/5)
- [Double-hashed DHT](https://github.com/ipfs/specs/pull/373/)
- [Reader Privacy in Indexers](https://github.com/ipni/specs/pull/5)

### Copyright