vindex/README.md
31 additions & 13 deletions
@@ -9,14 +9,29 @@ Discussions are welcome, please join us on [Transparency-Dev Slack](https://tran
## Overview

-The core idea is basically to construct an index like you would find in the back of a book, i.e. search terms are mapped to a _pointer_ to where the data can be found.
-A verifiable index represents an efficient data structure to allow point lookups to common queries over a single log.
-For example, a verifiable index over a module/package repository could be constructed to allow efficient lookup of all modules/packages with a given name.
+### The Problem: Verifiability vs. Efficiency

-The result of looking up a key in a verifiable index is a list of uint64 pointers to the origin log, i.e. a list of indices in the origin log where the leaf data matches the index function.
-The index has a checkpoint that commits to its state at any particular log size.
-Every point lookup (i.e. query) in the map is verifiable, as is the construction of the index itself.
-The verifiable index commits to all evolutions of its state by committing to all published index roots in a witnessed output log.
+Logs, such as those used in Certificate Transparency or Software Supply Chains, provide a strong foundation for discoverability. You can prove that an entry exists in a log. However, they lack a critical feature: the ability to _verifiably_ query for entries based on their content.
+
+This forces users who need to find specific data, like a domain owner finding their certificates, or a developer finding their software packages, into a painful choice:
+
+1. **Massive Inefficiency**: Download and process the _entire_ log, which can be terabytes of mostly irrelevant data, just to find the few entries that matter to you.
+2. **Losing Verifiability**: Rely on a third-party service to index the data. This breaks the chain of verifiability, as the index operator could, by accident or design, fail to show you all the results. You are forced to trust them.
+
+Neither option is acceptable. Users should not have to sacrifice efficiency for security, or security for efficiency.
+
+### The Solution: A Verifiable "Back-of-the-Book" Index
+
+A Verifiable Index resolves this conflict by providing a third option: an efficient, cryptographically verifiable way to query log data.
+
+At its core it works like a familiar index, much like one would find in the back of a book. It maps search terms (like a domain or package name) to the exact locations (pointers) in the main log where that data can be found.
+
+This provides two key guarantees:
+
+- **Efficiency**: Users can look up data by a meaningful key and receive a small, targeted list of pointers back, avoiding the need to download the entire log.
+- **Verifiability**: Every lookup response comes with a cryptographic proof. This proof guarantees that the list of results is complete and that the index operator has not omitted any entries for your query.
+
+The result is a system that extends the verifiability of the underlying log to its queries, preserving the end-to-end chain of trust while providing the efficiency modern systems require.
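Editorial sketch (not part of the diff): the "back-of-the-book" mapping described in the Overview can be pictured as a map from a search key to a list of `uint64` positions in the origin log. The Go snippet below is a minimal, unverified illustration under stated assumptions: `Index`, `BuildIndex`, and `moduleName` are hypothetical names invented here, the leaf format `<module>@<version>` is made up for the example, and none of this is the vindex API. It deliberately leaves out the cryptographic proofs and checkpoints that make a real lookup verifiable.

```go
package main

import (
	"fmt"
	"strings"
)

// Index maps a search key to the positions (pointers) of matching
// leaves in the origin log. Hypothetical type, for illustration only.
type Index map[string][]uint64

// BuildIndex scans the log once and records, for every key that mapFn
// extracts from a leaf, that leaf's position in the log.
// No proofs are produced here; this shows only the mapping.
func BuildIndex(leaves []string, mapFn func(leaf string) []string) Index {
	idx := make(Index)
	for i, leaf := range leaves {
		for _, key := range mapFn(leaf) {
			idx[key] = append(idx[key], uint64(i))
		}
	}
	return idx
}

// moduleName is a toy index function (an assumption of this sketch):
// it treats each leaf as "<module>@<version>" and indexes by module.
func moduleName(leaf string) []string {
	name, _, ok := strings.Cut(leaf, "@")
	if !ok {
		return nil // leaf does not match the assumed format; index nothing
	}
	return []string{name}
}

func main() {
	originLog := []string{
		"example.com/foo@v1.0.0",
		"example.com/bar@v0.1.0",
		"example.com/foo@v1.1.0",
	}
	idx := BuildIndex(originLog, moduleName)
	fmt.Println(idx["example.com/foo"]) // prints [0 2]
}
```

A client holding such a pointer list would fetch only the referenced leaves from the origin log instead of the whole log. In the design the README describes, the list would additionally arrive with a proof that it is complete, which this sketch does not attempt to model.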

## Applications

@@ -186,14 +201,17 @@ You will also have a WAL file at `~/sumdb.wal`, which will make future boots fas