128 changes: 128 additions & 0 deletions docs/rfc/0006_multiscan.md
@@ -0,0 +1,128 @@
| | |
| :----------- | :------------------------------ |
| Feature Name | Multi-Scan |
| Start Date | 6 November 2025 |
| Category | Architecture |
| RFC PR | [fill this in after opening PR] |
| State | **ACCEPTED** |

# Summary

[summary]: #summary

Support `grype` as an additional tool to generate and scan SBOMs.

# Motivation

[motivation]: #motivation

We want to add support for `grype` in order to enrich the vulnerability reports, making them more complete and accurate.

This will allow us to be vendor-neutral, since we are currently relying only on `trivy` to generate SBOMs and scan for vulnerabilities.

Additionally, in our research `grype` reported more vulnerabilities than `trivy` for every image we tested. Below is a recap (number of vulnerabilities reported per image):

Member

I think it would be worth mentioning that grype can also find vulnerabilities in Go binaries that are shipped without a package manager.

It can find issues that affect the binary project itself, not just the dependency tree of that project. An example is the nginx-ingress controller CVE-2025-1974, which trivy cannot find.

Contributor Author

absolutely. that's a good point.


| image | `trivy` | `grype` |
|-------|---------|---------|
| `golang:1.12-alpine` | 45 | 210 |
| `nginx:1.21.0` | 396 | 522 |
| `redis:6.2.0-alpine` | 44 | 127 |
| `postgres:13.0-alpine` | 63 | 151 |

## Examples / User Stories

[examples]: #examples

### User story 1

As a user, I want to make use of KEV and the EPSS score (which are currently provided by grype) to prioritize vulnerability remediation efforts.

Collaborator

We are missing a few user stories here:

  • As a user, I want image scans to include results from additional vulnerability data sources, providing broader CVE coverage.

  • As a user, I want vulnerability findings from multiple scanners to be merged into a single unified report.

  • As a user, I want to choose which scan engines are enabled when installing sbomscanner.


# Detailed design

[design]: #detailed-design

The feature is opt-in and is enabled per scan.

This adds a new boolean field, `multiScan`, to the `ScanJob` CRD. It defaults to `false`, in which case only `trivy` runs, as it does today.

To enable it, set the field on the `ScanJob` like this:

```yaml
apiVersion: sbomscanner.kubewarden.io/v1alpha1
kind: ScanJob
metadata:
  name: scanjob-example
  namespace: default
spec:
  registry: example-registry
  multiScan: true
```

Member

Note: that poses a question about which scanner is the default one when multiScan is disabled.

Contributor Author

I assumed that when multiScan is not enabled, this will scan only with trivy, as it currently does.
Sounds good to you?

Member

Yes, but let's clarify that inside the RFC.

Collaborator

I am not sure whether this setting belongs at the Registry level or at the Installation level, or if we should even add it in the first iteration.

I also don't think a multiScan field is explicit enough.
It would be clearer to let users select the engines, for example:

`scanEngines: [trivy, grype]`

Contributor Author

Personally, I don't like the idea of letting the user choose the scan engine to use.
I think the real advantage of using the multiscan is to have a richer and more exhaustive result by adding information to the vulnerability report. If we let the user decide which scan engine to use, we are just going to generate different vulnerability reports, as you can already do manually.
In my opinion, the multiscan is something the user wants to activate for a deeper view of the cluster security posture, at the cost of scanning speed (since we are going to scan each image twice).

Collaborator

> I think the real advantage of using the multiscan is to have a richer and more exhaustive result by adding information to the vulnerability report. If we let the user decide which scan engine to use, we are just going to generate different vulnerability reports as you can already do manually.

We are not going to generate different vulnerability reports, just one, by merging the results.
However, we are going to ask the user which "engines" they would like to use, without hiding them.
Why should trivy be the default?

Member

I would add the multi-scan setting on a per-repository level. I would make it a parameter set at installation time.

I think we can start with a simple approach and make multiscan a boolean flag that can be enabled/disabled at installation time. We can then iterate from that based on user feedback.

> Why should trivy be the default?

I think this should be the case since:

  • As far as I remember, it provided richer metadata compared to grype
  • It's the most popular image scanner, and people are used to scanning images with it. By using it, we reduce the chances of having different results that would confuse someone who is looking into SBOMscanner for the first time

Member

I take it back, I changed my mind after the last call we had. I think we should make the names of the scanners to use explicit.

This also paves the way to introduce multiple scanners (like for licenses, secrets, ...)

## Scan

For the multiscan feature, each of the following operations runs twice, once per tool:

* SBOM generation

* SBOM scan

This will let `grype` generate its own report, so that we can then compare and merge it with the one obtained with `trivy`.

When the `multiScan` field is set to `true` in the `ScanJob` CRD, we can run the tools sequentially: first `trivy`, then `grype`.
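
The sequencing described above could be sketched as follows. This is only an illustration under stated assumptions, not the project's implementation: `run_scans`, the `Scanner` type, and the stub scanners are hypothetical names, and a real scanner would generate an SBOM and scan it instead of returning canned data.

```python
from typing import Callable, Dict, List

# A scanner takes an image reference and returns a list of findings.
# The type and the stubs below are hypothetical stand-ins for
# "generate the SBOM, then scan it" with the real tools.
Scanner = Callable[[str], List[dict]]


def run_scans(image: str, multi_scan: bool,
              scanners: Dict[str, Scanner]) -> Dict[str, List[dict]]:
    """Run trivy first, then grype only when multi_scan is enabled."""
    order = ["trivy", "grype"] if multi_scan else ["trivy"]
    return {name: scanners[name](image) for name in order}


# Stub scanners returning placeholder findings, for illustration only.
stubs: Dict[str, Scanner] = {
    "trivy": lambda image: [{"id": "CVE-0000-0001"}],
    "grype": lambda image: [{"id": "CVE-0000-0001"}, {"id": "CVE-0000-0002"}],
}

single = run_scans("example-image", multi_scan=False, scanners=stubs)
both = run_scans("example-image", multi_scan=True, scanners=stubs)
```

With `multi_scan=False` only the `trivy` entry is produced, which matches the current behaviour; with `multi_scan=True` both result sets are available for the merge phase.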

## Merge

The second phase of the multiscan process is about merging results.

If both scans succeed, we can merge their results. We have already defined our own `VulnerabilityReport` format [here](./0004_vulnerability_report.md). Starting from it, we are going to enrich the struct with information that is provided exclusively by `grype`:

* `kev` is a list of entries from the CISA Known Exploited Vulnerabilities (KEV) catalog.

* `epss` is a list of Exploit Prediction Scoring System (EPSS) scores for the vulnerability.

* `risk` is the risk score reported by `grype`.

* `licenses` is a list of the licenses used by all the components within the affected software.
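
As a rough sketch, the `grype`-only fields might sit alongside an existing entry like this. The field types are assumptions (the RFC does not fix them), and `VulnerabilityEntry` is a hypothetical name, not the struct defined in RFC 0004.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class VulnerabilityEntry:
    # Stands in for the fields already defined in RFC 0004.
    id: str
    # The fields below would be populated only from grype.
    # Their types are assumptions made for this sketch.
    kev: List[dict] = field(default_factory=list)      # CISA KEV entries
    epss: List[dict] = field(default_factory=list)     # EPSS scores
    risk: Optional[float] = None                       # risk score
    licenses: List[str] = field(default_factory=list)  # component licenses


# An entry found only by trivy keeps the grype-only fields empty.
entry = VulnerabilityEntry(id="CVE-0000-0001")
```

Giving the new fields empty defaults keeps reports produced without multi-scan valid, since nothing has to be filled in when `grype` does not run.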

In addition to that, we are going to optionally update or overwrite existing fields retrieved from `trivy`, in case `grype` has better results:

* `cvss` version and scores.

* `references` with additional links.

* `description` if not provided by trivy.
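
One possible reading of the update rules above is sketched below. The RFC does not define what "better results" means, so the policy here is an assumption: keep the `trivy` description unless it is empty, append reference links that `trivy` did not report, and prefer the CVSS entry with the higher version number. `merge_fields` and the dictionary keys are hypothetical, and the sample values are placeholders.

```python
def merge_fields(trivy_vuln: dict, grype_vuln: dict) -> dict:
    """Return a copy of trivy_vuln updated with grype data where it helps."""
    merged = dict(trivy_vuln)

    # description: only filled in when trivy did not provide one.
    if not merged.get("description"):
        merged["description"] = grype_vuln.get("description", "")

    # references: keep trivy's links, append grype links not already present.
    refs = list(merged.get("references", []))
    for link in grype_vuln.get("references", []):
        if link not in refs:
            refs.append(link)
    merged["references"] = refs

    # cvss: ASSUMPTION - "better" means a higher CVSS version number.
    t_cvss, g_cvss = merged.get("cvss"), grype_vuln.get("cvss")
    if g_cvss and (not t_cvss or
                   float(g_cvss["version"]) > float(t_cvss["version"])):
        merged["cvss"] = g_cvss
    return merged


# Placeholder data, for illustration only.
merged = merge_fields(
    {"id": "CVE-0000-0001", "description": "",
     "references": ["https://a.example"],
     "cvss": {"version": "3.0", "score": 7.5}},
    {"id": "CVE-0000-0001", "description": "from grype",
     "references": ["https://a.example", "https://b.example"],
     "cvss": {"version": "3.1", "score": 7.8}},
)
```

Whatever policy is chosen, making it a single function keeps the "which tool wins" decision in one place and easy to revisit.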

We cannot be sure that both tools will find the same results. For this reason, we have to adopt the following merging strategy:

```
vuln_report
for vuln in trivy.vulnerabilities:
    vuln_report add vuln
    if grype has vuln:
        vuln_report add grype.kev
        vuln_report add grype.epss
        ...
for vuln in grype.vulnerabilities:
    if vuln not in vuln_report:
        vuln_report add vuln
```
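
The strategy above can be made concrete with a small runnable sketch. It assumes findings are plain dictionaries matched by an `id` key alone; real matching would probably also need the affected package and version. `merge_reports` and `GRYPE_ONLY_FIELDS` are hypothetical names, and the sample data is made up.

```python
from typing import Dict, List

# Grype-only fields copied onto matching trivy findings.
GRYPE_ONLY_FIELDS = ("kev", "epss", "risk", "licenses")


def merge_reports(trivy: List[dict], grype: List[dict]) -> List[dict]:
    """Merge two lists of findings, matching them by vulnerability id."""
    grype_by_id: Dict[str, dict] = {v["id"]: v for v in grype}
    report: Dict[str, dict] = {}

    # Start from trivy and enrich with grype data when both found the vuln.
    for vuln in trivy:
        merged = dict(vuln)
        match = grype_by_id.get(vuln["id"])
        if match is not None:
            for name in GRYPE_ONLY_FIELDS:
                if name in match:
                    merged[name] = match[name]
        report[vuln["id"]] = merged

    # Add the findings that only grype reported.
    for vuln in grype:
        report.setdefault(vuln["id"], dict(vuln))

    return list(report.values())


# Placeholder data, for illustration only.
result = merge_reports(
    trivy=[{"id": "CVE-0000-0001"}, {"id": "CVE-0000-0002"}],
    grype=[{"id": "CVE-0000-0002", "epss": [{"score": 0.5}]},
           {"id": "CVE-0000-0003"}],
)
```

The result keeps the `trivy` findings first, enriches the one both tools found, and appends the finding that only `grype` reported.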

# Drawbacks

[drawbacks]: #drawbacks

<!---
Why should we **not** do this?

* obscure corner cases
* will it impact performance?
* what other parts of the product will be affected?
* will the solution be hard to maintain in the future?
--->

There are no specific concerns about this new feature.

`multiScan` is disabled by default, so existing users will not see any performance change.

When the feature is enabled, each image is scanned a second time and the two result sets are merged. This shouldn't have a huge impact, but users should keep the longer scan time in mind when enabling it.
