Commit f9a51d8

feat(core): add data anonymization integration to sdk (#93)
1 parent 500b699 commit f9a51d8

29 files changed

Lines changed: 3582 additions & 9 deletions

.env_integration_tests.example

Lines changed: 6 additions & 0 deletions
@@ -14,6 +14,12 @@ CLOUD_SDK_CFG_DESTINATION_DEFAULT_URL=https://your-destination-auth-url-here
 CLOUD_SDK_CFG_DESTINATION_DEFAULT_URI=https://your-destination-configuration-uri-here
 CLOUD_SDK_CFG_DESTINATION_DEFAULT_IDENTITYZONE=your-identity-zone-here
 
+CLOUD_SDK_CFG_DATA_ANONYMIZATION_DEFAULT_URL=https://your-data-anonymization-api-url-here
+CLOUD_SDK_CFG_DATA_ANONYMIZATION_DEFAULT_CERT=your-base64-encoded-client-certificate-pem
+CLOUD_SDK_CFG_DATA_ANONYMIZATION_DEFAULT_KEY=your-base64-encoded-client-private-key-pem
+# Alternative to inline base64 cert/key values:
+# CLOUD_SDK_CFG_DATA_ANONYMIZATION_DEFAULT_DESTINATION_NAME=your-client-certificate-destination-name
+
 CLOUD_SDK_CFG_SDM_DEFAULT_URI=https://your-sdm-api-uri-here
 CLOUD_SDK_CFG_SDM_DEFAULT_UAA='{"url":"https://your-auth-url","clientid":"your-client-id","clientsecret":"your-client-secret","identityzone":"your-identity-zone"}'

README.md

Lines changed: 3 additions & 1 deletion
@@ -4,7 +4,7 @@
 
 ## About this project
 
-This SDK provides consistent interfaces for interacting with foundational services such as object storage, destination management, audit logging, telemetry, and secure credential handling.
+This SDK provides consistent interfaces for interacting with foundational services such as object storage, destination management, audit logging, data anonymization, telemetry, and secure credential handling.
 
 The Python SDK offers a clean, type-safe API following Python best practices while maintaining compatibility with the SAP Application Foundation ecosystem.
 
@@ -23,6 +23,7 @@ The Python SDK offers a clean, type-safe API following Python best practices whi
 - **ObjectStore Service**
 - **Secret Resolver**
 - **Telemetry & Observability**
+- **Data Anonymization Service**
 
 ## Requirements and Setup
 
@@ -76,6 +77,7 @@ Each module has comprehensive usage guides:
 - [ObjectStore](src/sap_cloud_sdk/objectstore/user-guide.md)
 - [Secret Resolver](src/sap_cloud_sdk/core/secret_resolver/user-guide.md)
 - [Telemetry](src/sap_cloud_sdk/core/telemetry/user-guide.md)
+- [Data Anonymization](src/sap_cloud_sdk/core/data_anonymization/user-guide.md)
 
 ## Support, Feedback, Contributing

docs/INTEGRATION_TESTS.md

Lines changed: 23 additions & 0 deletions
@@ -74,6 +74,28 @@ CLOUD_SDK_CFG_HANA_AGENT_MEMORY_DEFAULT_APPLICATION_URL=https://your-agent-memor
 CLOUD_SDK_CFG_HANA_AGENT_MEMORY_DEFAULT_UAA='{"url":"https://your-auth-url","clientid":"your-client-id","clientsecret":"your-client-secret"}'
 ```
 
+### Data Anonymization Integration Tests
+
+For Data Anonymization integration tests, configure the following variables in `.env_integration_tests`:
+
+```bash
+# Data Anonymization Configuration
+CLOUD_SDK_CFG_DATA_ANONYMIZATION_DEFAULT_URL=https://your-data-anonymization-api-url-here
+CLOUD_SDK_CFG_DATA_ANONYMIZATION_DEFAULT_CERT=your-base64-encoded-client-certificate-pem
+CLOUD_SDK_CFG_DATA_ANONYMIZATION_DEFAULT_KEY=your-base64-encoded-client-private-key-pem
+```
+
+`CLOUD_SDK_CFG_DATA_ANONYMIZATION_DEFAULT_CERT` and `CLOUD_SDK_CFG_DATA_ANONYMIZATION_DEFAULT_KEY` must contain the base64-encoded PEM content, not filesystem paths.
+
+If the certificate is managed through BTP Destination service, you can use a destination instead of inline certificate values:
+
+```bash
+CLOUD_SDK_CFG_DATA_ANONYMIZATION_DEFAULT_URL=https://your-data-anonymization-api-url-here
+CLOUD_SDK_CFG_DATA_ANONYMIZATION_DEFAULT_DESTINATION_NAME=your-client-certificate-destination-name
+```
+
+The destination must be configured with `ClientCertificateAuthentication` and reference a certificate bundle containing the client certificate and private key.
+
 ## Running Integration Tests
 
 ```bash
@@ -82,6 +104,7 @@ uv run pytest tests/ -m integration -v
 
 # Run specific module integration tests
 uv run pytest tests/core/integration/auditlog -v
+uv run pytest tests/core/integration/data_anonymization -v
 uv run pytest tests/objectstore/integration/ -v
 uv run pytest tests/destination/integration/ -v
 uv run pytest tests/agent_memory/integration/ -v
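
Review note on the `CERT` / `KEY` variables documented in this file: the values must be base64-encoded PEM content rather than file paths. A minimal sketch of producing such a value follows. It assumes standard base64 without line wrapping is what the SDK expects, which the diff does not state explicitly. The helper names `encode_pem_for_env` and `decode_pem_from_env` are hypothetical and not part of the SDK.

```python
import base64


def encode_pem_for_env(pem_text: str) -> str:
    """Return PEM text as a single-line base64 string usable as a .env value."""
    return base64.b64encode(pem_text.encode("utf-8")).decode("ascii")


def decode_pem_from_env(value: str) -> str:
    """Reverse of encode_pem_for_env; handy for sanity-checking a value."""
    return base64.b64decode(value, validate=True).decode("utf-8")


# Placeholder PEM text; in practice this would be read from the certificate file.
sample_pem = (
    "-----BEGIN CERTIFICATE-----\n"
    "MIIB...placeholder...\n"
    "-----END CERTIFICATE-----\n"
)
encoded = encode_pem_for_env(sample_pem)
```

The encoded value contains no newlines, so it can be pasted after `CLOUD_SDK_CFG_DATA_ANONYMIZATION_DEFAULT_CERT=` without quoting. Decoding it again should yield the original PEM text.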

pyproject.toml

Lines changed: 1 addition & 1 deletion
@@ -1,6 +1,6 @@
 [project]
 name = "sap-cloud-sdk"
-version = "0.19.3"
+version = "0.20.0"
 description = "SAP Cloud SDK for Python"
 readme = "README.md"
 license = "Apache-2.0"

Lines changed: 138 additions & 0 deletions
@@ -0,0 +1,138 @@
+"""SAP Cloud SDK extension – Data Anonymization module.
+
+Telemetry for this module is limited to operation-level metrics. Sensitive
+payloads such as source text, pseudonymization metadata, file contents, or
+certificate material are never emitted as telemetry attributes.
+
+Usage::
+
+    from sap_cloud_sdk.core.data_anonymization import (
+        create_client,
+        AnonymizeRequest,
+        PseudonymizeRequest,
+    )
+
+    # Auto-detect config from environment / mount
+    client = create_client()
+
+    # Anonymize (irreversible)
+    result = client.anonymize(AnonymizeRequest(text="John Doe, john@example.com"))
+    assert result.result is not None
+
+    # Pseudonymize (reversible)
+    pseudo = client.pseudonymize(PseudonymizeRequest(text="John Doe"))
+    assert pseudo.result is not None
+    assert len(pseudo.metadata) >= 0
+
+    # Explicit config with inline base64 Key Store
+    from sap_cloud_sdk.core.data_anonymization import DataAnonymizationConfig
+    cfg = DataAnonymizationConfig(
+        service_url="https://anonymization.example.com",
+        cert="<base64-encoded-client-certificate>",
+        key="<base64-encoded-client-private-key>",
+    )
+    client = create_client(config=cfg)
+
+    # BTP Destination Key Store (cloud)
+    client = create_client(config=DataAnonymizationConfig(
+        service_url="https://anonymization.example.com",
+        destination_name="my-anonymization-dest",
+    ))
+"""
+
+from typing import Optional
+
+from sap_cloud_sdk.core.data_anonymization.client import DataAnonymizationClient
+from sap_cloud_sdk.core.data_anonymization.config import (
+    DataAnonymizationConfig,
+    _load_config_from_env,
+)
+from sap_cloud_sdk.core.data_anonymization._http_transport import HttpTransport
+from sap_cloud_sdk.core.data_anonymization.models import (
+    AnonymizeTextRequest,
+    AnonymizeFileRequest,
+    AnonymizeRequest,
+    AnonymizeFileResult,
+    AnonymizeResult,
+    FileOperationResult,
+    PseudonymizeTextRequest,
+    PseudonymizeFileRequest,
+    PseudonymizeRequest,
+    PseudonymizeFileResult,
+    PseudonymizeResult,
+    EntityMapping,
+)
+from sap_cloud_sdk.core.data_anonymization.exceptions import (
+    DataAnonymizationError,
+    ClientCreationError,
+    TransportError,
+    AuthenticationError,
+)
+from sap_cloud_sdk.core.telemetry import Module, Operation, record_metrics
+
+
+@record_metrics(
+    Module.DATA_ANONYMIZATION,
+    Operation.DATA_ANONYMIZATION_CREATE_CLIENT,
+)
+def create_client(
+    *,
+    config: Optional[DataAnonymizationConfig] = None,
+    instance: str = "default",
+    _telemetry_source: Optional[Module] = None,
+) -> DataAnonymizationClient:
+    """Create a DataAnonymizationClient with automatic configuration detection.
+
+    Args:
+        config: Optional explicit DataAnonymizationConfig. When omitted the
+            config is loaded from environment variables / secret mounts.
+        instance: Service instance name used for secret resolution when
+            *config* is not provided. Defaults to ``"default"``.
+        _telemetry_source: Internal parameter; not for end-user use.
+
+    Returns:
+        DataAnonymizationClient ready for anonymize / pseudonymize calls.
+
+    Raises:
+        ClientCreationError: If client creation fails.
+
+    Note:
+        Telemetry for client creation records only module/operation metadata and
+        never includes configuration values or processed user content.
+    """
+    try:
+        resolved = config if config is not None else _load_config_from_env(instance)
+        transport = HttpTransport(resolved)
+        return DataAnonymizationClient(transport, _telemetry_source=_telemetry_source)
+    except Exception as e:
+        raise ClientCreationError(
+            f"Failed to create DataAnonymizationClient: {e}"
+        ) from e
+
+
+__all__ = [
+    # Factory
+    "create_client",
+    # Client
+    "DataAnonymizationClient",
+    # Config
+    "DataAnonymizationConfig",
+    # Request / response models
+    "AnonymizeTextRequest",
+    "AnonymizeRequest",
+    "AnonymizeFileRequest",
+    "AnonymizeFileResult",
+    "AnonymizeResult",
+    "FileOperationResult",
+    "PseudonymizeTextRequest",
+    "PseudonymizeRequest",
+    "PseudonymizeFileRequest",
+    "PseudonymizeFileResult",
+    "PseudonymizeResult",
+    "EntityMapping",
+    # Exceptions
+    "DataAnonymizationError",
+    "ClientCreationError",
+    "TransportError",
+    "AuthenticationError",
+]
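
Review note on `create_client`: the factory uses an explicit config when one is passed, otherwise it loads one for the named instance, and it wraps any failure in `ClientCreationError`. The stand-alone sketch below mirrors that control flow without importing the SDK. `SketchConfig`, `load_config_from_env`, and `resolve_config` are hypothetical stand-ins. The variable naming `CLOUD_SDK_CFG_DATA_ANONYMIZATION_<INSTANCE>_*` is inferred from the `.env` example in this commit. The rule that either `CERT` plus `KEY` or `DESTINATION_NAME` must be present is an assumption. The real `_load_config_from_env` may also read secret mounts, which this sketch does not model.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


class ClientCreationError(Exception):
    """Stand-in for the SDK exception of the same name."""


@dataclass(frozen=True)
class SketchConfig:
    """Hypothetical stand-in for DataAnonymizationConfig."""

    service_url: str
    cert: Optional[str] = None
    key: Optional[str] = None
    destination_name: Optional[str] = None


def load_config_from_env(
    instance: str = "default",
    environ: Optional[Mapping[str, str]] = None,
) -> SketchConfig:
    """Read settings for one instance from environment-style variables."""
    env = os.environ if environ is None else environ
    # Assumed naming scheme, inferred from the .env example in this commit.
    prefix = f"CLOUD_SDK_CFG_DATA_ANONYMIZATION_{instance.upper()}_"
    url = env.get(prefix + "URL")
    if not url:
        raise ValueError(f"{prefix}URL is not set")
    cert = env.get(prefix + "CERT")
    key = env.get(prefix + "KEY")
    destination = env.get(prefix + "DESTINATION_NAME")
    # Assumed rule: inline cert/key and destination are alternatives.
    if not destination and not (cert and key):
        raise ValueError("set either CERT and KEY, or DESTINATION_NAME")
    return SketchConfig(url, cert, key, destination)


def resolve_config(
    config: Optional[SketchConfig] = None,
    instance: str = "default",
    environ: Optional[Mapping[str, str]] = None,
) -> SketchConfig:
    """Prefer an explicit config; otherwise load one, wrapping any failure."""
    try:
        if config is not None:
            return config
        return load_config_from_env(instance, environ)
    except Exception as e:
        # Chain the original error so the root cause stays inspectable.
        raise ClientCreationError(f"Failed to create client: {e}") from e
```

Because the wrapper uses `raise ... from e`, callers can catch the single `ClientCreationError` type and still reach the underlying problem through `__cause__`.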
