Feature: Sanitize credential-like URLs in telemetry events to avoid False CredScan detection #340
base: master
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,44 @@ | ||
| # Copyright 2026 Microsoft Corporation | ||
| # | ||
| # Licensed under the Apache License, Version 2.0 (the "License"); | ||
| # you may not use this file except in compliance with the License. | ||
| # You may obtain a copy of the License at | ||
| # | ||
| # http://www.apache.org/licenses/LICENSE-2.0 | ||
| # | ||
| # Unless required by applicable law or agreed to in writing, software | ||
| # distributed under the License is distributed on an "AS IS" BASIS, | ||
| # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. | ||
| # See the License for the specific language governing permissions and | ||
| # limitations under the License. | ||
| # | ||
| # Requires Python 2.7+ | ||
| import re | ||
| | ||
| | ||
| class CredentialSanitizer(object): | ||
| """Service that sanitizes credential-like values from URIs by removing password/token from URI userinfo.""" | ||
| | ||
| def __init__(self, composite_logger): | ||
| self.composite_logger = composite_logger | ||
| | ||
| def sanitize(self, message): | ||
| """Removes password/token from URI credentials in the given message. | ||
| Args: | ||
| message: The message to sanitize | ||
| Returns: The message with credentials removed from URIs | ||
| """ | ||
| try: | ||
| # Pattern matches: scheme://user:password@host -> scheme://user@host | ||
| # Handles credentials containing special characters (except @, /, whitespace) | ||
| # Groups: | ||
| # (1) scheme: https://, http://, or ftp:// | ||
| # (2) username: one or more non-whitespace, non-slash, non-colon, non-@ characters | ||
| # (3) password: zero or more non-whitespace, non-slash, non-@ characters | ||
| sanitized_message = re.sub(r'(https?://|ftp://)([^:/@\s]+):([^@/\s]*)@',r'\1\2@',message) | ||
| self.composite_logger.log_verbose("Message was sanitized to remove sensitive information. [InputMessage={0}][SanitizedMessage={1}]".format(str(message), str(sanitized_message))) | ||
Contributor
Logging the original message defeats the purpose, doesn't it? Why do we want to log InputMessage, which contains the credentials?
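One possible way to address the concern in the comment above is to log only the sanitized text plus a count of replacements, so the raw input never reaches the log. The following is a minimal sketch, not the change adopted in this PR. `RecordingLogger` and `sanitize_without_logging_input` are hypothetical names introduced for illustration; only the regular expression is taken from the diff.

```python
import re

# Same pattern as in the diff: scheme://user:password@ -> scheme://user@
URI_CREDENTIAL_PATTERN = re.compile(r'(https?://|ftp://)([^:/@\s]+):([^@/\s]*)@')


class RecordingLogger(object):
    """Hypothetical stand-in for the composite logger; it only records what it is asked to log."""

    def __init__(self):
        self.verbose_lines = []

    def log_verbose(self, text):
        self.verbose_lines.append(text)


def sanitize_without_logging_input(message, logger):
    """Sanitizes the message and logs only the sanitized text, never the input."""
    # subn returns the new string and the number of substitutions made.
    sanitized_message, replacement_count = URI_CREDENTIAL_PATTERN.subn(r'\1\2@', message)
    if replacement_count > 0:
        logger.log_verbose(
            "Message was sanitized to remove sensitive information. "
            "[Replacements={0}][SanitizedMessage={1}]".format(replacement_count, sanitized_message))
    return sanitized_message
```

Logging only when a replacement happened also keeps verbose logs quieter for messages that contain no credential-like URI.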
| return sanitized_message | ||
| except Exception as error: | ||
| self.composite_logger.log_error("Error occurred while sanitizing credentials from message: [Error={0}]".format(repr(error))) | ||
| return message | ||
rane-rajasi marked this conversation as resolved.
Contributor
I think we should not return a message that contains credentials when an exception occurs. When we cannot sanitize, it is better to take the safer route and avoid a credential leak. Please check for alternatives to returning the original message.
Contributor
Author
Please refer to this thread: #340 (comment)
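The outcome of the linked thread is not shown on this page. As an illustration of the "safer route" the reviewer describes, a fail-closed variant could return a fixed placeholder instead of the original message when sanitization raises. This is a sketch under assumptions: `sanitize_fail_closed`, `regex_sanitize`, and the placeholder text are hypothetical and do not come from the PR.

```python
import re

# Hypothetical placeholder text; the PR does not define one.
REDACTED_PLACEHOLDER = "[message withheld: credential sanitization failed]"


def regex_sanitize(message):
    """Applies the same substitution as the diff."""
    return re.sub(r'(https?://|ftp://)([^:/@\s]+):([^@/\s]*)@', r'\1\2@', message)


def sanitize_fail_closed(message, sanitize_func=regex_sanitize):
    """Returns the sanitized message, or a placeholder if sanitization raises."""
    try:
        return sanitize_func(message)
    except Exception:
        # Fail closed: drop the content rather than risk emitting credentials.
        return REDACTED_PLACEHOLDER
```

The trade-off is that a sanitization failure loses the telemetry message entirely, which may or may not be acceptable for the events this PR targets.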
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,45 @@ | ||
| # Copyright 2026 Microsoft Corporation | ||
| # | ||
| # Licensed under the Apache License, Version 2.0 (the "License"); | ||
| # you may not use this file except in compliance with the License. | ||
| # You may obtain a copy of the License at | ||
| # | ||
| # http://www.apache.org/licenses/LICENSE-2.0 | ||
| # | ||
| # Unless required by applicable law or agreed to in writing, software | ||
| # distributed under the License is distributed on an "AS IS" BASIS, | ||
| # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. | ||
| # See the License for the specific language governing permissions and | ||
| # limitations under the License. | ||
| # | ||
| # Requires Python 2.7+ | ||
| | ||
| import re | ||
| | ||
| | ||
| class CredentialSanitizer(object): | ||
| """Service that sanitizes credential-like values from URIs by removing password/token from URI userinfo.""" | ||
| | ||
rane-rajasi marked this conversation as resolved.
| | ||
| def __init__(self, logger): | ||
| self.logger = logger | ||
| | ||
| def sanitize(self, message): | ||
| """Removes password/token from URI credentials in the given message. | ||
| Args: | ||
| message: The message to sanitize | ||
| Returns: The message with credentials removed from URIs | ||
| """ | ||
| try: | ||
| # Pattern matches: scheme://user:password@host -> scheme://user@host | ||
| # Handles credentials containing special characters (except @, /, whitespace) | ||
| # Groups: | ||
| # (1) scheme: https://, http://, or ftp:// | ||
| # (2) username: one or more non-whitespace, non-slash, non-colon, non-@ characters | ||
| # (3) password: zero or more non-whitespace, non-slash, non-@ characters | ||
| sanitized_message = re.sub(r'(https?://|ftp://)([^:/@\s]+):([^@/\s]*)@',r'\1\2@',message) | ||
| self.logger.log_verbose("Message was sanitized to remove sensitive information. [InputMessage={0}][SanitizedMessage={1}]".format(str(message), str(sanitized_message))) | ||
| return sanitized_message | ||
| except Exception as error: | ||
| self.logger.log_error("Error occurred while sanitizing credentials from message: [Error={0}]".format(repr(error))) | ||
| return message | ||
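To make the pattern's documented limits concrete ("except @, /, whitespace"), the short script below runs the same regular expression from the diff on a few sample strings. The helper name `strip_uri_password` and the sample URLs are made up for illustration. The results in the comments follow from the character classes in the pattern.

```python
import re

# Pattern copied from the diff.
PATTERN = r'(https?://|ftp://)([^:/@\s]+):([^@/\s]*)@'


def strip_uri_password(message):
    """Hypothetical helper wrapping the substitution used in sanitize()."""
    return re.sub(PATTERN, r'\1\2@', message)


# Password with special characters other than @ and / is removed.
print(strip_uri_password("https://user:p%40ss!@host/path"))  # https://user@host/path

# Empty password: the trailing colon is removed.
print(strip_uri_password("https://user:@host"))               # https://user@host

# A host:port pair is left alone because no '@' follows before the '/'.
print(strip_uri_password("https://host:8080/path"))           # https://host:8080/path

# A literal '@' inside the password ends the match early; "ss" remains.
print(strip_uri_password("https://user:p@ss@host"))           # https://user@ss@host

# A literal '/' inside the password prevents any match; nothing is removed.
print(strip_uri_password("https://user:pa/ss@host"))          # https://user:pa/ss@host

# Schemes other than http, https, and ftp are not covered.
print(strip_uri_password("ssh://user:pw@host"))               # ssh://user:pw@host
```

The last three cases show inputs that still carry part or all of a secret after sanitization, so they may be worth covering in unit tests if such URLs can appear in telemetry.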