Skip to content

Commit

Permalink
adr for conceptual view
Browse files Browse the repository at this point in the history
  • Loading branch information
JimFuller-RedHat committed Jan 24, 2025
1 parent e3af852 commit 327efec
Showing 1 changed file with 256 additions and 0 deletions.
256 changes: 256 additions & 0 deletions docs/adrs/00002-analysis-graph.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,256 @@
# 00002. Analysis graph API in Trustify

Date: 2025-01-23

## Status

DRAFT

## Context

This ADR is an addenda to [previous ADR](00001-graph-analytics.md) as an attempt to clarify the differences between the graph
relationships we capture and the view we want to create from the forest of graphs.

Ingesting an sbom captures a set of trustify relationships, which are instantiated
in the forest of graphs as;

```mermaid
graph TD
PackageA -->|CONTAINS| PackageOther
PackageD -->|CONTAINED_BY| PackageA
PackageA -->|DEPENDS_ON| PackageB
PackageB -->|DEPENDENCY_OF| PackageA
SBOMDOC1 -->|DESCRIBES| PackageA
UpstreamComponent -->|ANCESTOR_OF| PackageA
image.arch1 -->|VARIANT_OF| ImageIndex1
image.arch2 -->|VARIANT_OF| ImageIndex1
SBOMDOC2 -->|DESCRIBES| ImageIndex1
SBOMDOC3 -->|DESCRIBES| srpm_component
binarycomponent1 -->|GENERATED_FROM| srpm_component
binarycomponent2 -->|GENERATED_FROM| srpm_component
```

Trustify relationships attempt to put an abstraction over relationships
defined by any format of sbom (eg. cyclonedx, spdx).

This graph encapsulates provenance of sbom relationship though end users are unlikely
to directly navigate such graphs, as it would mean forcing concrete understanding of relationship directionality.
If we were to _normalise_ all relationships (similar to the rewrite of all CONTAINS into CONTAINED_BY) the
resultant DAG is probably ok ex. we could rewrite the following relationships:

[ _binarycomponent1_ **GENERATED_FROM** _srpm_component_] to [_srpm_component_ **GENERATES** _binarycomponent1_]

[ _image.arch2_ **VARIANT_OF** _ImageIndex1_] to [_ImageIndex1_ **VARIANTS** _image.arch2_]

So while we make the logical model easier to comprehend it is a good practice to keep logical model separate and not try to overload that model to serve as conceptual model.

The `api/v2/analysis` endpoints are responsible for building up the conceptual view. Where we want to query, filter and
traverse on the following.

```mermaid
graph TD
SBOMDOC1 -->|DESCRIBES| PackageA
PackageA -->|CONTAINS| PackageOther
PackageA -->|CONTAINS| PackageD
PackageA -->|DEPENDS| PackageB
SBOMDOC2 -->|DESCRIBES| ImageIndex1
UpstreamComponent -->|ANCESTOR_OF| PackageA
ImageIndex1 -->|VARIANTS| image.arch1
ImageIndex1 -->|VARIANTS| image.arch2
SBOMDOC3 -->|DESCRIBES| srpm_component
srpm_component -->|GENERATES| binarycomponent1
srpm_component -->|GENERATES| binarycomponent2
```

It is a feature that this conceptual model spans beyond traversal of just transitive software dependencies.

For example, searching for any node in above view, should let us traverse ancestors and descendents ... a
few illustrative examples:

**Search for 'PackageA'**
* component ancestors would be `[UpstreamComponent]`
* component descendents would be the tree underneath 'PackageA' `[PackageOther,PackageD,PackageB]`

**Search for 'image.arch1'**
* component ancestors would be `[ImageIndex1]`
* component descendents would be `[]`

_Note: every node in the graph already knows its relationship to original SBOM so no need
to enumerate DESCRIBES relationship ... though in the future we may see other artifacts (eg. sbom)
DESCRIBES._

We should make it easy to visualise this conceptual model direct from the endpoints (ex. Accept: image/svg
would pull down an svg representation).

## Decision

### Implement `api/v2/analysis/component`

payload returns immediate ancestor/descendent relations (eg. 'one deep')
```json
{
"sbom_id": "",
"node_id": "",
"purl": [
""
],
"cpe": [],
"name": "PackageA",
"version": "",
"published": "2024-12-19 18:04:12+00",
"document_id": "urn:uuid:537c8dc3-6f66-3cac-b504-cc5fb0a09ece",
"product_name": "",
"product_version": "",
"ancestor": [
{
"sbom_id": "",
"node_id": "",
"relationship": "AncestorOf",
"purl": [
""
],
"cpe": [],
"name": "UpstreamPackage",
"version": ""
}
],
"descendent": [
{
"sbom_id": "",
"node_id": "",
"relationship": "Variants",
"purl": [
""
],
"cpe": [],
"name": "PackageC",
"version": ""
}
]
}
}
```

Where items in `ancestor` array imply _Ancestor component_ VIEWRELATION _Searched component_ ... in the above example that would
mean _UpstreamPackage_ **AncestorOf** _PackageA_

Where items in `descendent` array imply _Searched component_ VIEWRELATION _Descendent component_ ... in the above example that would
mean _PackageA_ **Variants** _PackageC_

This endpoint provides the following url params with the following defaults
- depth=1
- show_ancestor=True
- show_descendent=True
- relationshipType=*

### Implement `api/v2/analysis/ancestor`
payload returns all ancestor relations
```json
{
"sbom_id": "",
"node_id": "",
"purl": [
""
],
"cpe": [],
"name": "",
"version": "",
"published": "2024-12-19 18:04:12+00",
"document_id": "urn:uuid:537c8dc3-6f66-3cac-b504-cc5fb0a09ece",
"product_name": "",
"product_version": "",
"ancestor": [
{
"sbom_id": "",
"node_id": "",
"relationship": "AncestorOf",
"purl": [
""
],
"cpe": [],
"name": "",
"version": ""
}, {} ....
]
}
```

The `ancestor` array contains a list of ancestors with the last item
in the list being considered the _root component_.

This is equiv to
- depth=0
- show_ancestor=True
- show_descendent=False
- relationshipType=*

### Implement `api/v2/analysis/descendent`
returns all descendent relations
```json
{
"sbom_id": "",
"node_id": "",
"purl": [
""
],
"cpe": [],
"name": "",
"version": "",
"published": "2024-12-19 18:04:12+00",
"document_id": "urn:uuid:537c8dc3-6f66-3cac-b504-cc5fb0a09ece",
"product_name": "",
"product_version": "",
"descendent": [
{
"sbom_id": "",
"node_id": "",
"relationship": "Variants",
"purl": [
""
],
"cpe": [],
"name": "",
"version": "",
"descendent": [{} ....]
}, {} ....
]
}
```
The `descendent` array contains a list of descendents with each component also containing any nested descendents.

This is equiv to
- depth=0
- show_ancestor=False
- show_descendent=True
- relationshipType=*

### Implement visualisation

Generation of a high quality svg
Generation of mermaid represenation

### Document analysis graph API

Openapi definitions should provide good REST API level documentation.

Additional sections should be added to docs/book

## Alternative approaches

**Directly use graphs:** It is likely that we will provide raw interface to the graphs (aka `api/v2/analysis/relationship`) though
we do not want to move responsibility of building up the 'view' to a client so still need to provide the other endpoints for that.

**Build a new graph representing the conceptual model:** As graphs do not mutate, its not so far fetched to
consider additionally generating a conceptual graph. It might be something we consider as an optimisation in
the future though for now thinking it would be good to avoid the cost (ram, memory). The conceptual graph model might be
considered a replacement for logical model though that would be flawed thinking as we always need the logical
model to tell us relationship provenance eg. the logical model is absolutely required.

## Consequences

* We need to identify all the relationships that are candidates for rewrites into DAG (ex. DEPENDENCY_OF to DEPENDS_ON)
* Having a clear conceptual model will reduce cognitive load of having to mentally reparse graph relations
* A DAG based conceptual model means we can also do neat stuff like generate visual representations (mermaid, svg, etc)

0 comments on commit 327efec

Please sign in to comment.