Base64 decoding depth assessment #4744
Draft
+200
−30
Description:
This PR introduces an iterative decoding pipeline, allowing decoders (e.g., Base64, UTF-16) to chain their outputs. Previously, decoders ran independently on the original chunk, missing secrets hidden behind layered encoding (e.g., base64 within UTF-16, or double-base64-encoded values).
The `scannerWorker` now re-runs decoders on any new output, up to a configurable `--max-decode-depth` (default 5). This enables detection of secrets such as GCP service accounts and private keys inside base64-encoded Docker auth configs, or Artifactory tokens inside base64. The pipeline exits early when a pass produces no new decoded data, so higher depths add negligible overhead (typically <5% compared to depth 1).
Checklist:
Tests passing (`make test-community`)?
Lint passing (`make lint`; this requires golangci-lint)?