2 changes: 2 additions & 0 deletions AGENTS.md
@@ -33,6 +33,7 @@ These three documents define the patterns this codebase already uses. Generating
| `list_*` methods, pagination, iterator vs list, the `_list` helper | [`docs/ITERATORS.md`](docs/ITERATORS.md) |
| Pydantic model conventions: `ConfigDict`, aliases, validators, relationships, exporting | [`docs/MODELS.md`](docs/MODELS.md) |
| Resource service patterns: method shape, JSON:API envelopes, client wiring, examples | [`docs/RESOURCE.md`](docs/RESOURCE.md) |
| Logging: namespace, redaction, env-var setup, debug round-trip traces | [`docs/LOGGING.md`](docs/LOGGING.md) |

Each doc ends with a checklist. Use those checklists; they encode the rules a reviewer will look for.

@@ -75,6 +76,7 @@ These are mistakes a competent Python developer would make if they hadn't read t
- **Don't add features beyond what was asked.** This codebase is approaching v1.0.0. Adding "while I'm here" refactors or speculative abstractions slows reviews and risks breaking the Ansible collection.
- **Don't assume every successful response is `{"data": ...}`.** Check the docs/go-tfe/spec for each endpoint: some return a JSON:API envelope, some return a bare resource object, `204 No Content`, `null`, raw bytes, or a redirect to a blob URL. Add tests for non-standard shapes.
- **Don't use bare `list[...]` annotations inside a resource class after defining `def list(...)`.** In class scope, mypy can resolve `list` to the method instead of the builtin. Use `builtins.list[...]`, `Sequence[...]`, or another unshadowed type.
- **Don't `print()` or use ad-hoc `logging.getLogger(__name__)` calls in library code.** The SDK has a structured logging framework — use `pytfe._logging.transport_logger` for HTTP traffic, or `pytfe._logging.logger` (the `pytfe` root) for higher-level events. Everything from that namespace is silent by default (NullHandler) and respects the user's `setup_logging()` or stdlib configuration. See [LOGGING.md](docs/LOGGING.md) for redaction rules — bearer tokens and `token`/`secret`/`password` keys are auto-redacted by `RoundTrip`, but only inside that formatter. Never `log.info(token)` directly.

## Known cross-dependencies you should not break

63 changes: 63 additions & 0 deletions README.md
@@ -93,10 +93,73 @@ A couple of things worth knowing:
- The iterator is **single-use**. Once you've walked it, iterating again gives you nothing. Capture it with `list(...)` first if you need to reuse the result.
- Filters and page size live on the `*ListOptions` model for each resource — e.g. `WorkspaceListOptions(search="prod", page_size=50)`. Pagination still happens transparently; `page_size` only controls how big each underlying API page is.

## Logging

pyTFE integrates with Python's standard `logging` module and is **silent by default** — nothing is emitted unless you opt in. The library publishes two loggers:

- `pytfe` — root namespace; rarely emits directly
- `pytfe.transport` — HTTP request/response and retry trace

### Turn it on with an environment variable

The quickest way is to set `PYTFE_LOG`:

```bash
PYTFE_LOG=debug python my_script.py
```

`setup_logging()` is invoked automatically on package import, so the env var alone is enough — no code change required. Use the programmatic call only when you need to (re)apply env vars set after import (e.g. in a REPL or test):

```python
import pytfe
pytfe.setup_logging()
```

Levels: `debug` shows every request/response in addition to retry decisions; `info` shows retry decisions only.

### Sample output

```
[2026-05-25 14:12:26 pytfe.transport DEBUG]
> GET /api/v2/organizations/acme/workspaces?page[number]=1&page[size]=100
< 200 OK
< {
< "data": [
< { "id": "ws-...", "type": "workspaces", ... }
< ]
< }
```

### Safe by default

Bearer tokens and other credentials are redacted **before** they reach the logger:

- Sensitive headers (`Authorization`, `Cookie`, anything containing `token` / `secret` / `password` / `api-key`) are replaced with `**REDACTED**`. Headers are off by default; even when you turn them on with `PYTFE_LOG_HEADERS=true`, redaction still applies.
- JSON bodies have sensitive keys (`token`, `access_token`, `refresh_token`, `secret`, `password`, `private_key`, `client_secret`) replaced recursively.
- Large bodies are truncated to `PYTFE_LOG_TRUNCATE_BYTES` (default `1024`). Long arrays are clipped with `"... (N additional elements)"`.
- Binary responses (state-version downloads, configuration-version tarballs, etc.) render as `[raw stream]` — the bytes are never decoded into the log.

### Compose with your existing logging

Because pyTFE uses stdlib `logging`, all the standard knobs work:

```python
import logging

# Just the HTTP traffic, at DEBUG
logging.getLogger("pytfe.transport").setLevel(logging.DEBUG)

# Send pyTFE logs to your existing handler instead of stderr
logging.getLogger("pytfe").addHandler(my_json_handler)
```
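The snippet above leaves `my_json_handler` undefined. A minimal sketch of one way to build it with the standard library alone follows; `JsonFormatter` and the in-memory stream are illustrative assumptions, not part of pyTFE.

```python
import io
import json
import logging


class JsonFormatter(logging.Formatter):
    """Hypothetical formatter: renders each record as one JSON object per line."""

    def format(self, record: logging.LogRecord) -> str:
        return json.dumps(
            {
                "logger": record.name,
                "level": record.levelname,
                "message": record.getMessage(),
            }
        )


# A StringIO stream keeps the sketch self-contained; a real application would
# more likely write to stderr or a file.
stream = io.StringIO()
my_json_handler = logging.StreamHandler(stream)
my_json_handler.setFormatter(JsonFormatter())

pytfe_logger = logging.getLogger("pytfe")
pytfe_logger.addHandler(my_json_handler)
pytfe_logger.setLevel(logging.INFO)

# Records from child loggers such as "pytfe.transport" propagate up to
# handlers attached to "pytfe".
logging.getLogger("pytfe.transport").info("retrying %s", "GET /api/v2/ping")
```

Attaching the handler to `pytfe` rather than to `pytfe.transport` means any logger under the namespace is covered by the same handler.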

For full details — environment variables, redaction guarantees, and how to add log statements to new SDK code — see [`docs/LOGGING.md`](./docs/LOGGING.md).

## Documentation

- API reference and guides (SDK): **coming soon**
- Terraform Enterprise API: https://developer.hashicorp.com/terraform/enterprise/api-docs
- Internal reference: [`docs/ITERATORS.md`](./docs/ITERATORS.md), [`docs/MODELS.md`](./docs/MODELS.md), [`docs/RESOURCE.md`](./docs/RESOURCE.md), [`docs/LOGGING.md`](./docs/LOGGING.md)

## Examples

150 changes: 150 additions & 0 deletions docs/LOGGING.md
@@ -0,0 +1,150 @@
# Logging in pyTFE

Internal reference for the SDK's logging framework. Companion to [`ITERATORS.md`](ITERATORS.md), [`MODELS.md`](MODELS.md), [`RESOURCE.md`](RESOURCE.md).

The framework is designed to be **silent by default** (library best practice — no logs unless the caller opts in), **integrated with stdlib `logging`** (so it composes with the user's existing setup), and **safe** (bearer tokens and other credentials are redacted before they ever reach a handler).

## The one-line quickstart for users

```bash
PYTFE_LOG=debug python my_script.py
```

`setup_logging()` is invoked automatically when the `pytfe` package is imported, so the env var is the entire user surface — no code change required. Programmatic equivalent (handy in tests or in REPLs where the env var was set after import):

```python
import pytfe
pytfe.setup_logging()
```

`setup_logging()` is idempotent — calling it more than once is safe.

That's it. With `PYTFE_LOG=debug`, both the `DEBUG`-level HTTP request/response traces and the `INFO`-level retry decisions are written to stderr, in a format like:

```
[2026-05-25 14:12:26 pytfe.transport DEBUG]
> GET /api/v2/organizations/acme/workspaces?page[number]=1&page[size]=100
< 200 OK
< {
< "data": [
< { "id": "ws-...", "type": "workspaces", ... }
< ]
< }
```
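To make the idempotency claim concrete, here is a minimal sketch of how an env-driven, repeat-safe setup function could be written. It is not the code in `src/pytfe/_logging.py`; `setup_logging_sketch`, the `_sketch_installed` marker attribute, and the logger name are all hypothetical.

```python
import logging
import os

_LEVELS = {"debug": logging.DEBUG, "info": logging.INFO}
_HANDLER_TAG = "_sketch_installed"  # hypothetical marker attribute


def setup_logging_sketch(logger_name: str = "pytfe_sketch") -> logging.Logger:
    """Configure a logger from PYTFE_LOG; safe to call repeatedly."""
    log = logging.getLogger(logger_name)
    level = _LEVELS.get(os.environ.get("PYTFE_LOG", "").strip().lower())
    if level is None:
        return log  # unset or unrecognized value: stay silent
    log.setLevel(level)
    # Only install a handler if an earlier call has not already done so.
    if not any(getattr(h, _HANDLER_TAG, False) for h in log.handlers):
        handler = logging.StreamHandler()  # StreamHandler defaults to stderr
        setattr(handler, _HANDLER_TAG, True)
        log.addHandler(handler)
    return log
```

Tagging the handler lets a second call update the level without stacking a duplicate handler, which would otherwise print every line twice.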

## Logger namespace

| Logger | What it logs |
|---|---|
| `pytfe` | Root namespace; rarely emits directly. Use it to dial **everything** pytfe says up or down at once. |
| `pytfe.transport` | HTTP request/response (DEBUG), retry decisions (INFO), transport exceptions (DEBUG). The noisy one. |

There is no `pytfe.resource.*` per-service logger. Resource methods do not log; if a caller needs visibility into "the SDK is calling `client.workspaces.read('ws-abc')`", they get it via the transport log right below it.

Standard stdlib selectors apply:

```python
import logging
logging.getLogger("pytfe").setLevel(logging.INFO) # everything
logging.getLogger("pytfe.transport").setLevel(logging.DEBUG) # just HTTP
```

## Configuration knobs

The framework has three environment variables. All are optional.

| Variable | Default | Effect |
|---|---|---|
| `PYTFE_LOG` | unset | `debug` or `info` (case-insensitive) configures stdlib `logging` for you. Anything else is ignored. |
| `PYTFE_LOG_HEADERS` | `false` | When truthy, include request/response headers in `RoundTrip` output. Sensitive ones are still redacted; this just turns on the `> * Header: value` lines at all. |
| `PYTFE_LOG_TRUNCATE_BYTES` | `1024` | Truncation budget for any single string in a logged body. Values below 96 are clamped up. |

All three are read at call time, not at import — switching `PYTFE_LOG_HEADERS=true` in the middle of a long-running process takes effect on the next request.
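A minimal sketch of what "read at call time" can look like for the two formatting knobs. The helper names, the set of truthy spellings, and the fallback on unparsable input are assumptions for illustration, not the SDK's actual code.

```python
import os

_MIN_TRUNCATE = 96
_DEFAULT_TRUNCATE = 1024
_TRUTHY = {"1", "true", "yes", "on"}  # assumed spellings of "truthy"


def headers_enabled() -> bool:
    """Look up PYTFE_LOG_HEADERS on every call instead of caching at import."""
    return os.environ.get("PYTFE_LOG_HEADERS", "false").strip().lower() in _TRUTHY


def truncate_budget() -> int:
    """Look up PYTFE_LOG_TRUNCATE_BYTES on every call, clamping small values up."""
    raw = os.environ.get("PYTFE_LOG_TRUNCATE_BYTES", "")
    try:
        value = int(raw)
    except ValueError:
        return _DEFAULT_TRUNCATE  # assumption: fall back when unset or unparsable
    return max(value, _MIN_TRUNCATE)
```

Because nothing is cached in a module-level constant, changing the environment inside a running process changes the result of the next call.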

## Redaction guarantees

The `RoundTrip` formatter (in [`src/pytfe/_logging.py`](../src/pytfe/_logging.py)) redacts before formatting, so the unredacted value never reaches the logger:

**Headers** — replaced with `**REDACTED**` when matched. Names matched case-insensitively:

```
authorization
cookie
set-cookie
proxy-authorization
x-tfc-task-signature
```

Plus any header whose name contains the substring `token`, `secret`, `password`, `api-key`, or `apikey`.
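The matching rule above (exact names plus substrings, both case-insensitive) can be sketched as follows. `redact_headers_sketch` is a hypothetical stand-in written for this document; the real helper may differ in signature and behavior.

```python
from collections.abc import Mapping

REDACTED = "**REDACTED**"
_EXACT = {
    "authorization",
    "cookie",
    "set-cookie",
    "proxy-authorization",
    "x-tfc-task-signature",
}
_SUBSTRINGS = ("token", "secret", "password", "api-key", "apikey")


def redact_headers_sketch(headers: Mapping[str, str]) -> dict[str, str]:
    """Return a copy of headers with sensitive values replaced."""
    out: dict[str, str] = {}
    for name, value in headers.items():
        lowered = name.lower()
        sensitive = lowered in _EXACT or any(s in lowered for s in _SUBSTRINGS)
        out[name] = REDACTED if sensitive else value
    return out
```

For example, `X-Api-Key` is caught by the `api-key` substring even though it is not in the exact-name list, while `Accept` passes through unchanged.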

**JSON bodies** — when the response body is JSON, these top-level *and nested* keys have their values replaced (case-insensitive key match):

```
token, access_token, refresh_token,
secret, password,
private_key, client_secret
```

This is structural: a value can be redacted even if it's deep inside an array of nested objects. **String values themselves are not scanned for tokens** — only the keys are matched. If you stuff a bearer token into a field named `"description"`, it will appear in the log.

The `**REDACTED**` constant is exported from `pytfe._logging` if you ever need to assert on it in a test.
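A minimal sketch of key-based, recursive redaction as described in this section. It defines its own `REDACTED` constant to stay self-contained, and `redact_json_sketch` is a hypothetical name, not the SDK's function.

```python
from typing import Any

REDACTED = "**REDACTED**"
_SENSITIVE_KEYS = {
    "token",
    "access_token",
    "refresh_token",
    "secret",
    "password",
    "private_key",
    "client_secret",
}


def redact_json_sketch(value: Any) -> Any:
    """Return a redacted copy; only keys are matched, never string contents."""
    if isinstance(value, dict):
        return {
            k: REDACTED
            if isinstance(k, str) and k.lower() in _SENSITIVE_KEYS
            else redact_json_sketch(v)
            for k, v in value.items()
        }
    if isinstance(value, list):
        return [redact_json_sketch(item) for item in value]
    return value
```

The sketch also shows the limitation called out above: a secret stored under a non-sensitive key such as `"description"` is returned untouched.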

## Truncation behavior

Bodies are formatted, not echoed:

- JSON arrays beyond the budget are clipped with `"... (N additional elements)"`.
- JSON string values longer than the per-string budget are clipped with `"... (N more bytes)"`.
- Non-JSON bodies are shown verbatim (after the same per-string truncation).
- Binary bodies (state-version downloads, CV tarballs, anything with a non-text/non-JSON `Content-Type`) are rendered as `[raw stream]` — the body is not decoded or formatted.

This keeps a `--list` over a 10,000-workspace organization to one screen of log output instead of 10MB.
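The clipping rules for strings and arrays can be sketched like this. How the real formatter derives its array budget is not stated here, so the `max_items` parameter is an assumption, as is the function name `truncate_sketch`.

```python
from typing import Any


def truncate_sketch(value: Any, max_bytes: int = 1024, max_items: int = 3) -> Any:
    """Return a clipped copy of a decoded JSON value."""
    if isinstance(value, str):
        encoded = value.encode("utf-8")
        if len(encoded) <= max_bytes:
            return value
        # errors="ignore" drops a multi-byte character cut in half by the slice.
        kept = encoded[:max_bytes].decode("utf-8", errors="ignore")
        return f"{kept}... ({len(encoded) - max_bytes} more bytes)"
    if isinstance(value, list):
        head = [truncate_sketch(v, max_bytes, max_items) for v in value[:max_items]]
        if len(value) > max_items:
            head.append(f"... ({len(value) - max_items} additional elements)")
        return head
    if isinstance(value, dict):
        return {k: truncate_sketch(v, max_bytes, max_items) for k, v in value.items()}
    return value
```

Clipping works on the decoded structure rather than on the serialized text, so the logged body stays valid, readable JSON after truncation.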

## How the transport uses it

[`src/pytfe/_http.py`](../src/pytfe/_http.py) emits:

| Event | Logger | Level | Cost when disabled |
|---|---|---|---|
| Every HTTP request/response round-trip | `pytfe.transport` | DEBUG | Zero — guarded by `isEnabledFor(DEBUG)`. The `RoundTrip` object is only constructed when the level is enabled. |
| Retry decisions (`429`, `5xx`, `Retry-After`) | `pytfe.transport` | INFO | One conditional + format-string evaluation. |
| Transport exceptions during retry loop | `pytfe.transport` | DEBUG | Zero (same guard pattern). |

There is no DEBUG cost when logging is off, even on 10k-request workloads.
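The guard pattern from the table can be demonstrated with the standard library alone. In this sketch, `expensive_trace` stands in for constructing a `RoundTrip`, and the logger name and counter are hypothetical.

```python
import logging

sketch_logger = logging.getLogger("pytfe_guard_sketch")
sketch_logger.addHandler(logging.NullHandler())  # swallow output in this demo
sketch_logger.propagate = False
build_count = 0


def expensive_trace() -> str:
    """Stand-in for body decoding, redaction, and formatting."""
    global build_count
    build_count += 1
    return "> GET /api/v2/ping\n< 200 OK"


def log_round_trip_sketch() -> None:
    # Return before doing any work when DEBUG is off.
    if not sketch_logger.isEnabledFor(logging.DEBUG):
        return
    sketch_logger.debug("\n%s", expensive_trace())


sketch_logger.setLevel(logging.INFO)
log_round_trip_sketch()  # guard short-circuits; the trace is never built
sketch_logger.setLevel(logging.DEBUG)
log_round_trip_sketch()  # the trace is built exactly once
```

Without the guard, `expensive_trace()` would run on every call because Python evaluates arguments before `debug()` gets a chance to check the level.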

## How to use it in new SDK code

If you're adding code under `src/pytfe/`, prefer the framework over `print` or ad-hoc `logging.getLogger(__name__)`:

```python
# resources/something.py
import logging

from .._logging import logger


def some_operation(self, foo):
    # Guard so the argument is only formatted when INFO is enabled.
    if logger.isEnabledFor(logging.INFO):
        logger.info("performing some_operation on %s", foo)
    ...
```

Two rules:

1. **Always guard non-trivial log argument construction** with `isEnabledFor`. Don't pay format/serialize cost when the level is off.
2. **Never log a token, password, or other secret yourself.** Only `RoundTrip` knows how to redact, and it only redacts what it knows about. If you're tempted to write `logger.info("got token %s", token)` — don't.

For low-level transport additions, use `transport_logger` (also exported from `pytfe._logging`).

## What this isn't

- **Not a metrics framework.** No counters, gauges, timing histograms. If you want metrics, wrap the client.
- **Not an audit log.** Logs are for debugging, not for compliance trails.
- **Not a tracing framework.** No correlation IDs, no OpenTelemetry spans. Standard stdlib `logging` only.
- **Not Ansible-aware.** When the Ansible collection imports pytfe, the `pytfe` logger inherits from Ansible's root logger like any other library — which means it stays silent unless the Ansible user explicitly raises the level. No special integration is needed or provided.

## Checklist when reviewing log-touching code

- [ ] New library log calls use `pytfe._logging.logger` or `pytfe._logging.transport_logger`, not `logging.getLogger(__name__)` ad hoc
- [ ] Anything more expensive than a literal format string is guarded with `isEnabledFor(...)`
- [ ] No raw tokens, passwords, or other credentials in any log call
- [ ] If logging a header dict, it goes through `redact_headers(...)`
- [ ] If logging a request/response, it uses `RoundTrip(resp).generate()` so the standard redaction + truncation applies
- [ ] Logger default level is unchanged (i.e. NullHandler still active for callers who don't opt in)
10 changes: 9 additions & 1 deletion src/pytfe/__init__.py
@@ -5,6 +5,7 @@
from importlib.metadata import version as _pkg_version

from . import errors, models
from ._logging import setup_logging
from .client import TFEClient
from .config import TFEConfig

@@ -13,4 +14,11 @@
except PackageNotFoundError:  # running from a source checkout without install
    __version__ = "0.0.0+unknown"

__all__ = ["TFEConfig", "TFEClient", "errors", "models", "__version__"]
__all__ = [
    "TFEConfig",
    "TFEClient",
    "errors",
    "models",
    "setup_logging",
    "__version__",
]
39 changes: 38 additions & 1 deletion src/pytfe/_http.py
@@ -3,6 +3,7 @@

from __future__ import annotations

import logging
import re
import time
from collections.abc import Mapping
@@ -12,6 +13,7 @@
import httpx

from ._jsonapi import build_headers, parse_error_payload
from ._logging import RoundTrip, transport_logger
from .errors import (
    AuthError,
    NotFound,
@@ -87,7 +89,6 @@ def request(
        if headers:
            hdrs.update(headers)
        attempt = 0
        # print(method, url, params, json_body, hdrs)
        while True:
            try:
                resp = self._sync.request(
@@ -100,24 +101,60 @@ def request(
                    follow_redirects=allow_redirects,
                )
            except httpx.HTTPError as e:
                transport_logger.debug(
                    "transport exception on %s %s (attempt %d): %s",
                    method,
                    url,
                    attempt,
                    e,
                )
                if attempt >= self.max_retries:
                    raise ServerError(str(e)) from e
                self._sleep(attempt, None)
                attempt += 1
                continue
            if resp.status_code in _RETRY_STATUSES and attempt < self.max_retries:
                retry_after = _parse_retry_after(resp)
                transport_logger.info(
                    "retrying %s %s after %s (status=%d, attempt=%d)",
                    method,
                    url,
                    # Compare against None so a Retry-After of 0 is reported as
                    # "0.00s", matching the `is not None` check in `_sleep`.
                    f"{retry_after:.2f}s" if retry_after is not None else "backoff",
                    resp.status_code,
                    attempt,
                )
                self._sleep(attempt, retry_after)
                attempt += 1
                continue
            # When the caller explicitly opted out of redirect-following,
            # surface 3xx responses to them (so they can read Location)
            # rather than treating them as errors.
            if not allow_redirects and 300 <= resp.status_code < 400:
                self._log_round_trip(resp)
                return resp
            self._log_round_trip(resp)
            self._raise_if_error(resp)
            return resp

    def _log_round_trip(self, resp: httpx.Response) -> None:
        """Emit a DEBUG-level request/response trace when enabled.

        Cheap when disabled: ``isEnabledFor(DEBUG)`` short-circuits before any
        body decoding or JSON parsing happens.
        """
        if not transport_logger.isEnabledFor(logging.DEBUG):
            return
        # Treat binary content types as raw streams so we don't try to JSON
        # parse a state-version download or a CV tarball.
        ct = (resp.headers.get("content-type") or "").lower()
        raw = not (
            "json" in ct
            or ct.startswith("text/")
            or ct == ""
            or "application/vnd.api+json" in ct
        )
        transport_logger.debug("\n%s", RoundTrip(resp, raw=raw).generate())

    def _sleep(self, attempt: int, retry_after: float | None) -> None:
        if retry_after is not None:
            time.sleep(retry_after)