22 commits
439a0df
feat(sdk): replace detect-secrets with kingfisher for secret scanning
danibarranqueroo Jun 24, 2026
018c58d
feat(sdk): add opt-in secret validation with critical escalation for …
danibarranqueroo Jun 24, 2026
2600402
test(sdk): update secret checks tests and fixtures for kingfisher
danibarranqueroo Jun 24, 2026
8eec932
docs: document kingfisher secret scanning and validation
danibarranqueroo Jun 24, 2026
a0f4472
Merge branch 'master' into PROWLER-2083-replace-detect-secrets-librar…
danibarranqueroo Jun 25, 2026
6f418cb
chore: add secrets flag into aws schema
danibarranqueroo Jun 25, 2026
5e54451
chore(sdk): remove deprecated detect_secrets_plugins config option
danibarranqueroo Jun 25, 2026
5e4ddf4
fix(sdk): skip secret re-validation in cloudwatch line-number rescan
danibarranqueroo Jun 25, 2026
26d9f61
docs: complete secret-scanning check list and note provider-wide vali…
danibarranqueroo Jun 25, 2026
21e07b1
feat(sdk): add batched secret scanning to amortize kingfisher subproc…
danibarranqueroo Jun 25, 2026
7bd6884
perf(sdk): batch secret scanning across cloudwatch, lambda and ecs re…
danibarranqueroo Jun 25, 2026
741cc94
refactor(sdk): batch secret scanning across all remaining secret checks
danibarranqueroo Jun 25, 2026
92262a6
docs: document the batched secret-scanning check structure
danibarranqueroo Jun 25, 2026
6be7e01
perf(sdk): batch the cloudwatch multiline-event rescan
danibarranqueroo Jun 25, 2026
388e471
refactor(sdk): remove unused single-payload detect_secrets_scan helper
danibarranqueroo Jun 25, 2026
c01297a
fix(sdk): bound the kingfisher subprocess with a timeout and type the…
danibarranqueroo Jun 25, 2026
0e9731a
test(sdk): cover the verified-secret escalation path across secret ch…
danibarranqueroo Jun 25, 2026
b13b3f0
docs(sdk): correct secret-scanning docs, changelog and OpenStack chec…
danibarranqueroo Jun 25, 2026
d9b9bd7
docs(sdk): add version badge and engine-history note to secret-scanni…
danibarranqueroo Jun 25, 2026
faa50ee
fix(sdk): apply ignore patterns to the cloudwatch multiline-event rescan
danibarranqueroo Jun 26, 2026
869a08e
fix(sdk): report undecodable user data as MANUAL instead of dropping …
danibarranqueroo Jun 26, 2026
2004494
Merge branch 'master' into PROWLER-2083-replace-detect-secrets-librar…
danibarranqueroo Jun 26, 2026
2 changes: 2 additions & 0 deletions docs/developer-guide/checks.mdx
@@ -445,3 +445,5 @@ The metadata structure is enforced in code using a Pydantic model. For reference
## Specific Check Patterns

Details for specific providers can be found in documentation pages named using the pattern `<provider_name>-details`.

Checks that scan resources for plaintext secrets follow a dedicated batched structure. Refer to [Secret-Scanning Checks](/developer-guide/secret-scanning-checks) before creating or updating one.
1 change: 0 additions & 1 deletion docs/developer-guide/configurable-checks.mdx
@@ -149,7 +149,6 @@ Only fields with a numeric range, a fixed value set, or a length cap are listed.
| `max_days_secret_unused` | `7..365` days | |
| `max_days_secret_unrotated` | `1..180` days | NIST IA-5: rotate quarterly; CIS ≤90 |
| `min_kinesis_stream_retention_hours` | `24..8760` h | 1 day .. 1 year |
| `detect_secrets_plugins[].limit` | `0.0..10.0` | Shannon entropy threshold |
| `shodan_api_key` | ≤512 chars | |

### Azure
119 changes: 119 additions & 0 deletions docs/developer-guide/secret-scanning-checks.mdx
@@ -0,0 +1,119 @@
---
title: 'Secret-Scanning Checks'
---

import { VersionBadge } from "/snippets/version-badge.mdx"

<VersionBadge version="5.32.0" />

Prowler scans audited resources for plaintext secrets using [Kingfisher](https://github.com/mongodb/kingfisher), an open-source secret-scanning engine that Prowler invokes as a subprocess. This guide explains the structure every secret-scanning check must follow to keep scanning correct and efficient on large accounts.

<Note>
Since Prowler 5.32.0 the secret-scanning checks scan with Kingfisher. Earlier versions used the `detect-secrets` library.
</Note>

## Overview

Secret detection runs through a single helper in `prowler/lib/utils/utils.py`:

- **`detect_secrets_scan_batch(payloads, excluded_secrets=..., validate=...)`** scans many payloads in chunked subprocess invocations and returns a `{key: [findings]}` dictionary. To scan a single payload, pass a one-entry mapping (for example, `{0: data}`).

Every Kingfisher invocation carries a fixed process-startup cost (around 100 ms). Scanning once per resource would spawn thousands of subprocesses on large accounts (for example, thousands of CloudWatch log groups). `detect_secrets_scan_batch` amortizes that cost: it writes each payload to a temporary file as it consumes them, runs one subprocess per chunk (500 payloads by default), and maps the findings back to each payload by key.
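
The amortization described above can be sketched in a few lines. This is only an illustration of lazy chunking, not the helper's actual implementation: `chunked` and `count_invocations` are hypothetical names, and the real helper also writes temporary files and runs the subprocess, which is omitted here.

```python
from itertools import islice


def chunked(pairs, size=500):
    """Yield lists of at most `size` (key, payload) pairs, consuming lazily."""
    iterator = iter(pairs)
    while True:
        chunk = list(islice(iterator, size))
        if not chunk:
            return
        yield chunk


def count_invocations(total_payloads, chunk_size=500):
    """Count how many subprocess runs a chunked scan would need."""
    # The generator stands in for Phase 1; nothing is materialized up front.
    payloads = ((index, f"payload-{index}") for index in range(total_payloads))
    return sum(1 for _ in chunked(payloads, chunk_size))
```

Under these assumptions, 5,000 payloads need 10 invocations instead of 5,000. At roughly 100 ms of startup cost each, that is about 1 second of overhead rather than about 500 seconds.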

## The Batched Structure

Every secret-scanning check follows three phases.

### Phase 1: Collect

Define a generator that yields `(key, payload)` for each scannable unit. The generator builds payload strings only — it does not call Kingfisher. Lazy yielding keeps memory and temporary-disk usage bounded to a single chunk, which matters when an account holds thousands of resources.

### Phase 2: Batch

Call `detect_secrets_scan_batch` once with the generator. The helper consumes it in chunks, runs Kingfisher per chunk, and returns the keys that produced findings mapped to their finding lists.

### Phase 3: Report

Iterate the resources, look up the findings by key, and build one report per resource. Emit a finding for **every** iterated resource — never drop one silently. When a resource's payload cannot be prepared for scanning (for example, user data that fails to base64-decode or decompress), report it as `MANUAL` with a status explaining the scan could not inspect it, rather than omitting it or claiming `PASS`.

```python
from prowler.lib.check.models import Check, Check_Report_AWS
from prowler.lib.utils.utils import (
    annotate_verified_secrets,
    detect_secrets_scan_batch,
)
from prowler.providers.aws.services.example.example_client import example_client


class example_resource_no_secrets(Check):
    def execute(self):
        findings = []
        excluded = example_client.audit_config.get("secrets_ignore_patterns", [])
        validate = example_client.audit_config.get("secrets_validate", False)
        resources = list(example_client.resources)

        # Phase 1: collect (builds strings only, no scan).
        def payloads():
            for index, resource in enumerate(resources):
                if resource.scannable_data:
                    yield index, serialize(resource)

        # Phase 2: batch (one call, chunked subprocesses).
        batch_results = detect_secrets_scan_batch(
            payloads(), excluded_secrets=excluded, validate=validate
        )

        # Phase 3: report (look up findings by key).
        for index, resource in enumerate(resources):
            report = Check_Report_AWS(metadata=self.metadata(), resource=resource)
            report.status = "PASS"
            report.status_extended = f"No secrets found in {resource.name}."
            detect_secrets_output = batch_results.get(index)
            if detect_secrets_output:
                report.status = "FAIL"
                report.status_extended = (
                    f"Potential secret found in {resource.name} -> ..."
                )
                annotate_verified_secrets(report, detect_secrets_output)
            findings.append(report)

        return findings
```
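
The Phase 3 rule for payloads that cannot be prepared can be sketched as follows. This is a minimal illustration, not Prowler's code: `prepare_user_data` and `status_for` are hypothetical names, and detecting compression by the gzip magic bytes is an assumption.

```python
import base64
import gzip
import zlib
from typing import Optional


def prepare_user_data(encoded: str) -> Optional[str]:
    """Return the decoded text, or None when the payload cannot be prepared."""
    try:
        raw = base64.b64decode(encoded, validate=True)
        if raw[:2] == b"\x1f\x8b":  # gzip magic bytes (assumed detection rule)
            raw = gzip.decompress(raw)
        return raw.decode("utf-8")
    except (ValueError, OSError, EOFError, zlib.error):
        # binascii.Error and UnicodeDecodeError are ValueError subclasses;
        # gzip.BadGzipFile is an OSError subclass.
        return None


def status_for(payload: Optional[str], findings: list) -> str:
    """Never drop a resource: an unpreparable payload becomes MANUAL."""
    if payload is None:
        return "MANUAL"
    return "FAIL" if findings else "PASS"
```

In a check built this way, Phase 1 would skip the resources whose payload is `None`, and Phase 3 would still emit a report for them with the `MANUAL` status.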

## Choosing the Key

The key maps each finding back to its source. Two shapes cover every check:

- **One payload per resource:** use the resource index. This fits checks that serialize a single payload per resource, such as launch configurations, CloudFormation outputs, SSM documents, Step Functions definitions, and OpenStack metadata.
- **Several payloads per resource:** use a `(resource_index, fragment)` tuple, where the fragment identifies the variable, log stream, container, file, or version. Phase 3 groups the per-fragment findings to build the resource report. This fits CloudWatch log streams, ECS containers, CodeBuild variables, Glue arguments, and Lambda code files.

Derive the indices from the same `list(...)` of resources in both Phase 1 and Phase 3 so the order stays stable and the keys align.
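
For the tuple-key shape, the Phase 3 grouping step might look like the sketch below. `group_by_resource` is a hypothetical helper, and the finding dictionaries are placeholders that only mirror the fields shown elsewhere in this guide.

```python
from collections import defaultdict


def group_by_resource(batch_results):
    """Regroup {(resource_index, fragment): findings} by resource index."""
    grouped = defaultdict(dict)
    for (resource_index, fragment), findings in batch_results.items():
        grouped[resource_index][fragment] = findings
    return dict(grouped)


# Example: two variables of resource 0 and one of resource 2 produced findings.
example_results = {
    (0, "DB_PASSWORD"): [{"type": "...", "line_number": 1}],
    (0, "API_KEY"): [{"type": "...", "line_number": 1}],
    (2, "TOKEN"): [{"type": "...", "line_number": 1}],
}
example_grouped = group_by_resource(example_results)
```

A resource index that is absent from the grouped result (index `1` in the example) produced no findings and is reported as `PASS`.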

## Preserving Per-Payload Results

`detect_secrets_scan_batch` runs Kingfisher with `--no-dedup`, so a secret that appears in more than one payload is reported for each one. This reproduces the result of scanning each payload individually. Build payload strings exactly as a single scan would: serialize the same data and keep line ordering, because messages often map a finding's `line_number` back to a variable name or metadata key.
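
The reason line ordering matters can be shown with a small sketch. It assumes one variable per line and a 1-based `line_number`; `build_payload` and `variable_for` are hypothetical helpers, not part of Prowler.

```python
import json


def build_payload(variables):
    """Serialize one variable per line and remember the line order."""
    names = list(variables)
    lines = [json.dumps({name: variables[name]}) for name in names]
    return "\n".join(lines), names


def variable_for(finding, names):
    """Map a finding back to its variable (line_number assumed 1-based)."""
    return names[finding["line_number"] - 1]


payload, names = build_payload({"REGION": "eu-west-1", "DB_PASSWORD": "example"})
```

If the payload were built in a different order than `names`, the message would blame the wrong variable, which is why Phase 1 should serialize exactly as a single scan would.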

## Validation and Severity

`detect_secrets_scan_batch` accepts `validate`, read from `secrets_validate` in the provider configuration or the `--scan-secrets-validate` flag. When enabled, Kingfisher confirms whether each secret is live, and confirmed secrets carry `is_verified: True`.

After marking a report as `FAIL`, pass the findings to `annotate_verified_secrets(report, findings)`. When any secret is verified, the helper escalates the finding to critical severity and appends a note that the secret was confirmed live. Validation stays off by default because it sends the discovered secret to the provider API.
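
The escalation behavior described above can be pictured with a stand-in. This is not the implementation of `annotate_verified_secrets`: the `Report` dataclass, the `escalate_if_verified` name, the default severity, and the exact note text are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Report:
    """Hypothetical stand-in for a check report."""

    status: str = "FAIL"
    status_extended: str = "Potential secret found."
    severity: str = "high"


def escalate_if_verified(report, findings):
    """Raise severity to critical when any finding is confirmed live."""
    if any(finding.get("is_verified") for finding in findings):
        report.severity = "critical"
        report.status_extended += " The secret was confirmed live."
    return report
```

Findings without `is_verified`, which is the case whenever validation is off, leave the report unchanged.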

## Excluded Secrets

`detect_secrets_scan_batch` applies `secrets_ignore_patterns` — regular expressions from the provider configuration — against each finding's source line and drops the matches, mirroring single-scan behavior.
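
As a rough sketch of that filtering, assuming the patterns are applied with `re.search` and that `line_number` is 1-based (neither is confirmed here), a hypothetical `drop_excluded` helper could look like this:

```python
import re


def drop_excluded(findings, source_lines, patterns):
    """Keep only findings whose source line matches no ignore pattern."""
    compiled = [re.compile(pattern) for pattern in patterns]
    kept = []
    for finding in findings:
        line = source_lines[finding["line_number"] - 1]
        if not any(regex.search(line) for regex in compiled):
            kept.append(finding)
    return kept


source_lines = ["password=EXAMPLE_PLACEHOLDER", "password=hunter2-real-value"]
all_findings = [{"line_number": 1}, {"line_number": 2}]
remaining = drop_excluded(all_findings, source_lines, [r"EXAMPLE_.*"])
```

With an empty pattern list, which is the default for `secrets_ignore_patterns`, every finding is kept.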

## Testing

To assert on the verified-secret path, mock `detect_secrets_scan_batch` in the check module and return the keyed dictionary. For a single resource scanned at index `0`:

```python
mock.patch(
    "prowler.providers.aws.services.example.example_resource_no_secrets.example_resource_no_secrets.detect_secrets_scan_batch",
    return_value={
        0: [{"type": "...", "line_number": 1, "is_verified": True}]
    },
)
```

Most tests need no mock at all: they seed resources that contain example secrets and assert on the `FAIL` status and message, which exercises the real batched path. Refer to the [Testing](/developer-guide/unit-testing) documentation for the general structure.
1 change: 1 addition & 0 deletions docs/docs.json
@@ -396,6 +396,7 @@
"developer-guide/provider",
"developer-guide/services",
"developer-guide/checks",
"developer-guide/secret-scanning-checks",
"developer-guide/outputs",
"developer-guide/integrations",
"developer-guide/security-compliance-framework",
28 changes: 28 additions & 0 deletions docs/user-guide/cli/tutorials/configuration_file.mdx
@@ -2,6 +2,8 @@
title: "Configuration File"
---

import { VersionBadge } from "/snippets/version-badge.mdx"

Several Prowler's checks have user configurable variables that can be modified in a common **configuration file**. This file can be found in the following [path](https://github.com/prowler-cloud/prowler/blob/master/prowler/config/config.yaml):

```
@@ -87,6 +89,32 @@ The following list includes all the AWS checks with configurable variables that
| `opensearch_service_domains_not_publicly_accessible` | `trusted_ips` | List of Strings |


### Validating Discovered Secrets

<VersionBadge version="5.32.0" />

By default, the secret-scanning checks run fully offline: secrets are detected but never sent anywhere. Setting `secrets_validate` to `True` additionally confirms whether each discovered secret is live by authenticating with it against the corresponding provider API. The discovered secret itself serves as the credential, so Prowler requires no additional permissions to validate it.

`secrets_validate` applies to every AWS secret-scanning check listed above (those that accept `secrets_ignore_patterns`). The `--scan-secrets-validate` CLI flag is provider-wide: it also enables validation for the secret-scanning checks of other providers, such as the OpenStack metadata checks.

To enable validation through the configuration file, set the value under the `aws` section:

```yaml
aws:
  secrets_validate: True
```

To enable validation for a single scan (any provider), use Prowler CLI:

```
prowler aws --scan-secrets-validate
```

<Warning>
Secret validation makes outbound network calls that authenticate with each discovered secret. The credential is exercised against the provider, so the call appears in the audited account's logs and can trigger its monitoring (for example, AWS CloudTrail records the validation request). Validation stays disabled by default so that scans remain fully offline.
</Warning>


## Azure

### Configurable Checks
19 changes: 16 additions & 3 deletions docs/user-guide/cli/tutorials/pentesting.mdx
@@ -6,20 +6,33 @@ Prowler has some checks that analyse pentesting risks (Secrets, Internet Exposed

## Detect Secrets

Prowler uses `detect-secrets` library to search for any secrets that are stores in plaintext within your environment.
Prowler scans for secrets stored in plaintext within the audited environment using [Kingfisher](https://github.com/mongodb/kingfisher), an open-source secret-scanning engine. By default these scans run fully offline, so no data leaves the audited environment. Discovered secrets can optionally be validated against the provider APIs to confirm whether they are live — see [Validating Discovered Secrets](/user-guide/cli/tutorials/configuration_file#validating-discovered-secrets).

The actual checks that have this functionality are the following:
The checks with this functionality are the following.

AWS:

- autoscaling\_find\_secrets\_ec2\_launch\_configuration
- awslambda\_function\_no\_secrets\_in\_code
- awslambda\_function\_no\_secrets\_in\_variables
- cloudformation\_stack\_outputs\_find\_secrets
- cloudwatch\_log\_group\_no\_secrets\_in\_logs
- codebuild\_project\_no\_secrets\_in\_variables
- ec2\_instance\_secrets\_user\_data
- ec2\_launch\_template\_no\_secrets
- ecs\_task\_definitions\_no\_environment\_secrets
- glue\_etl\_jobs\_no\_secrets\_in\_arguments
- ssm\_document\_secrets
- stepfunctions\_statemachine\_no\_secrets\_in\_definition

OpenStack:

- compute\_instance\_metadata\_sensitive\_data
- blockstorage\_volume\_metadata\_sensitive\_data
- blockstorage\_snapshot\_metadata\_sensitive\_data
- objectstorage\_container\_metadata\_sensitive\_data

To execute detect-secrets related checks, you can run the following command:
To execute the secret-scanning checks, run the following command:

```console
prowler <provider> --categories secrets
6 changes: 6 additions & 0 deletions prowler/CHANGELOG.md
@@ -9,6 +9,12 @@ All notable changes to the **Prowler SDK** are documented in this file.
- `entra_conditional_access_policy_explicitly_targets_azure_devops` check for M365 provider, verifying at least one enabled Conditional Access policy explicitly includes the Azure DevOps cloud application instead of relying on a broad "All cloud apps" policy [(#11182)](https://github.com/prowler-cloud/prowler/pull/11182)
- `entra_conditional_access_policy_no_exclusion_gaps` check for M365 provider, verifying every user, group, role, or application excluded from an enabled Conditional Access policy stays in scope of another enabled policy [(#11577)](https://github.com/prowler-cloud/prowler/pull/11577)
- `stepfunctions_statemachine_encrypted_with_cmk` check for AWS provider, verifying that each Step Functions state machine uses a customer-managed KMS key for encryption at rest rather than the default AWS-owned key [(#11538)](https://github.com/prowler-cloud/prowler/pull/11538)
- `--scan-secrets-validate` flag and `aws.secrets_validate` configuration option to optionally validate the secrets discovered by the secret-scanning checks against the provider APIs; secrets confirmed to be live are reported as critical [(#11694)](https://github.com/prowler-cloud/prowler/pull/11694)

### 🔄 Changed

- Replaced the `detect-secrets` library with [Kingfisher](https://github.com/mongodb/kingfisher) as the engine for the secret-scanning checks; scans run fully offline by default and obvious placeholder values are no longer reported as findings [(#11694)](https://github.com/prowler-cloud/prowler/pull/11694)
- Removed the `detect_secrets_plugins` configuration option, which is no longer used by the new secret-scanning engine [(#11694)](https://github.com/prowler-cloud/prowler/pull/11694)

---

38 changes: 7 additions & 31 deletions prowler/config/config.yaml
@@ -423,6 +423,13 @@ aws:
  # Patterns to ignore in the secrets checks
  secrets_ignore_patterns: []

  # Validate discovered secrets by checking whether they are live against the
  # provider APIs. WARNING: this makes outbound network calls that authenticate
  # with the discovered secret itself; the credential is exercised against the
  # provider and the call will appear in the audited account's logs (and may
  # trigger its monitoring). Disabled by default (scans stay fully offline).
  secrets_validate: False

  # AWS Secrets Manager Configuration
  # aws.secretsmanager_secret_unused
  # Maximum number of days a secret can be unused
@@ -436,37 +443,6 @@ aws:
  # Minimum retention period in hours for Kinesis streams
  min_kinesis_stream_retention_hours: 168 # 7 days

  # Detect Secrets plugin configuration
  detect_secrets_plugins: [
    {"name": "ArtifactoryDetector"},
    {"name": "AWSKeyDetector"},
    {"name": "AzureStorageKeyDetector"},
    {"name": "BasicAuthDetector"},
    {"name": "CloudantDetector"},
    {"name": "DiscordBotTokenDetector"},
    {"name": "GitHubTokenDetector"},
    {"name": "GitLabTokenDetector"},
    {"name": "Base64HighEntropyString", "limit": 6.0},
    {"name": "HexHighEntropyString", "limit": 3.0},
    {"name": "IbmCloudIamDetector"},
    {"name": "IbmCosHmacDetector"},
    # {"name": "IPPublicDetector"}, https://github.com/Yelp/detect-secrets/pull/885
    {"name": "JwtTokenDetector"},
    {"name": "KeywordDetector"},
    {"name": "MailchimpDetector"},
    {"name": "NpmDetector"},
    {"name": "OpenAIDetector"},
    {"name": "PrivateKeyDetector"},
    {"name": "PypiTokenDetector"},
    {"name": "SendGridDetector"},
    {"name": "SlackDetector"},
    {"name": "SoftlayerDetector"},
    {"name": "SquareOAuthDetector"},
    {"name": "StripeDetector"},
    # {"name": "TelegramBotTokenDetector"}, https://github.com/Yelp/detect-secrets/pull/878
    {"name": "TwilioKeyDetector"},
  ]

  # AWS CodeBuild Configuration
  # aws.codebuild_project_uses_allowed_github_organizations
  codebuild_github_allowed_organizations:
34 changes: 8 additions & 26 deletions prowler/config/schema/aws.py
@@ -101,29 +101,6 @@ def _validate_account_ids(v: Optional[list[str]]) -> Optional[list[str]]:
    return v


# ---- Nested models ----------------------------------------------------------


class _DetectSecretsPlugin(ProviderConfigBase):
    """One entry inside ``detect_secrets_plugins``.

    Only ``name`` is required by the upstream library. ``limit`` is used by
    the entropy detectors. Any other plugin-specific kwarg is preserved by
    the ``extra="allow"`` policy inherited from ProviderConfigBase.
    """

    name: str
    limit: Optional[float] = Field(
        default=None,
        ge=0.0,
        le=10.0,
        description=(
            "Entropy threshold for detect-secrets entropy plugins. Range: 0..10 "
            "(Shannon entropy is bounded by log2(256)=8; >10 is meaningless)."
        ),
    )


# ---- Main schema ------------------------------------------------------------


@@ -394,6 +371,14 @@ class AWSProviderConfig(ProviderConfigBase):

    # --- Secrets ---------------------------------------------------------
    secrets_ignore_patterns: Optional[list[str]] = None
    secrets_validate: Optional[bool] = Field(
        default=None,
        description=(
            "Validate discovered secrets against the provider APIs (live check). "
            "Makes outbound network calls that authenticate with the discovered "
            "secret. Disabled by default."
        ),
    )
    max_days_secret_unused: Optional[int] = Field(
        default=None,
        ge=7,
@@ -417,6 +402,3 @@ class AWSProviderConfig(ProviderConfigBase):
        le=8760,
        description="Hours of Kinesis stream retention. Range: 24..8760 (1 day .. 1 year).",
    )

    # --- detect-secrets plugin list -------------------------------------
    detect_secrets_plugins: Optional[list[_DetectSecretsPlugin]] = None
12 changes: 12 additions & 0 deletions prowler/lib/cli/parser.py
@@ -473,6 +473,18 @@ def __init_config_parser__(self):
            default=default_fixer_config_file_path,
            help="Set configuration fixer file path",
        )
        config_parser.add_argument(
            "--scan-secrets-validate",
            action="store_true",
            default=False,
            help=(
                "Validate secrets discovered by the secrets checks by checking "
                "whether they are live against the provider APIs. WARNING: this "
                "makes outbound network calls using the discovered secret itself; "
                "the credential is exercised against the provider and the call "
                "appears in the audited account's logs. Disabled by default."
            ),
        )

    def __init_custom_checks_metadata_parser__(self):
        # CustomChecksMetadata