Security: zbze-org/zbze_ocr_cli

Security Policy

Supported Versions

We provide security updates for the following versions of zbzeocr-cli:

Version   Supported
0.1.x     Yes
< 0.1     No

Reporting a Vulnerability

We take the security of zbzeocr-cli seriously. If you discover a security vulnerability, please report it responsibly.

How to Report

Please do NOT open a public GitHub issue for security vulnerabilities.

Instead, please email your findings to:

a.panagoa@gmail.com

What to Include in Your Report

To help us understand and resolve the issue quickly, please include:

  1. Description of the vulnerability

    • What is the security issue?
    • What type of vulnerability is it? (e.g., code injection, path traversal, etc.)
  2. Steps to reproduce

    • Detailed steps to reproduce the vulnerability
    • Sample files or commands that demonstrate the issue
  3. Potential impact

    • What could an attacker achieve by exploiting this vulnerability?
    • What data or systems could be compromised?
  4. Affected versions

    • Which versions of zbzeocr-cli are affected?
    • Have you tested multiple versions?
  5. Suggested fix (optional)

    • If you have ideas on how to fix the vulnerability, please share them
    • Patches or pull requests are welcome after coordinated disclosure

Example Report

Subject: Security Vulnerability: Path Traversal in PDF Processing

Description:
The split-pdf-to-jpeg command does not properly sanitize output
directory paths, allowing path traversal attacks.

Steps to Reproduce:
1. Run: zbzeocr split-pdf-to-jpeg -f test.pdf -o "../../../etc/"
2. Files are written outside intended directory

Potential Impact:
An attacker could write files to arbitrary locations on the filesystem,
potentially overwriting system files or placing malicious files in
sensitive directories.

Affected Versions:
v0.1.0 (tested)

Suggested Fix:
Add path normalization and validation in cli_utils.validate_output_directory()
to reject paths with ".." components.

Response Timeline

We are committed to responding to security reports promptly:

  • Acknowledgment: Within 48 hours of receiving your report
  • Initial assessment: Within 1 week
  • Progress updates: Regular updates on investigation and fix development
  • Resolution notification: When the fix is released or the issue is resolved

Disclosure Policy

Coordinated Disclosure

We practice coordinated disclosure:

  1. Investigation: We investigate and develop a fix
  2. Private notification: We notify you when the fix is ready
  3. Public disclosure: After the fix is released, we:
    • Create a GitHub Security Advisory
    • Credit the reporter (unless they prefer anonymity)
    • Release a patch version
    • Update this document if needed

Public Disclosure Timeline

  • Standard timeline: 90 days from initial report
  • May be extended if the fix requires significant changes
  • May be shortened if the vulnerability is being actively exploited

Security Best Practices for Users

When using zbzeocr-cli, follow these security best practices:

Input Validation

  1. Only process trusted PDF files

    • PDFs can contain malicious content
    • Validate the source and integrity of documents before processing
  2. Use dedicated processing directories

    • Create isolated directories for OCR processing
    • Don't process files in system directories
  3. Verify file permissions

    • Ensure output directories have appropriate permissions
    • Don't run zbzeocr with elevated privileges unless absolutely necessary
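One way to act on the integrity and isolation advice above is to compare a PDF's SHA-256 digest with a value obtained from the sender over a separate channel, and to process it in a private working directory. The following is a minimal sketch, not part of zbzeocr-cli; the helper names `sha256_of`, `is_trusted_pdf`, and `make_work_dir` are hypothetical.

```python
import hashlib
import tempfile
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA-256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def is_trusted_pdf(path: Path, expected_sha256: str) -> bool:
    """Compare the file's digest with one supplied by the document's sender."""
    return sha256_of(path) == expected_sha256.strip().lower()


def make_work_dir() -> Path:
    """Create a dedicated working directory (hypothetical helper).

    tempfile.mkdtemp creates a directory that only the current user can
    read, write, or search.
    """
    return Path(tempfile.mkdtemp(prefix="zbzeocr-"))
```

A matching digest only shows that the file was not altered in transit; it does not show that the content itself is harmless.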

Environment Security

  1. Keep dependencies updated

    pip install --upgrade zbzeocr-cli
    zbzeocr check-deps
  2. Update external tools

    • Keep Tesseract OCR updated (5.5.0+)
    • Keep ImageMagick updated (7.1.1+)
    • Keep Unpaper updated (7.0.0+)
  3. Use virtual environments

    • Isolate zbzeocr-cli dependencies
    • Prevent conflicts with system packages
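For automation that cannot rely on `zbzeocr check-deps` alone, the minimum versions listed above can also be checked from a script. This is a rough sketch under stated assumptions: the binary names (`tesseract`, `magick`, `unpaper`) and their version flags are assumptions about a typical installation, and the helper names are hypothetical.

```python
import re
import subprocess
from typing import Optional, Tuple

Version = Tuple[int, int, int]

# Minimum versions from the list above. Binary names and version flags are
# assumptions and may differ on your system.
MINIMUMS = {
    "tesseract": ("--version", (5, 5, 0)),
    "magick": ("-version", (7, 1, 1)),
    "unpaper": ("--version", (7, 0, 0)),
}


def parse_version(text: str) -> Optional[Version]:
    """Extract the first X.Y.Z triple found in a tool's version output."""
    match = re.search(r"(\d+)\.(\d+)\.(\d+)", text)
    if match is None:
        return None
    major, minor, patch = (int(part) for part in match.groups())
    return (major, minor, patch)


def meets_minimum(text: str, minimum: Version) -> bool:
    """True if the version found in `text` is at least `minimum`."""
    found = parse_version(text)
    return found is not None and found >= minimum


def version_output(tool: str, flag: str) -> str:
    """Run the tool with its version flag and return everything it printed."""
    result = subprocess.run(
        [tool, flag], capture_output=True, text=True, check=False
    )
    return result.stdout + result.stderr
```

A script could loop over `MINIMUMS`, call `version_output(tool, flag)`, and stop when `meets_minimum` returns False.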

Configuration Security

  1. Protect configuration files

    • Don't store sensitive data in profile YAML files
    • Use appropriate file permissions (0600 or 0644)
  2. Validate custom profiles

    • Review custom profiles before use
    • Don't load profiles from untrusted sources
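The permission advice above can be checked before a profile is loaded. The sketch below treats a profile as acceptable when neither group nor others can write to it, which covers both 0600 and 0644; the function name `profile_mode_is_safe` is hypothetical and the check is meaningful on POSIX systems only.

```python
import stat
from pathlib import Path


def profile_mode_is_safe(path: Path) -> bool:
    """Return True if the file is not writable by group or others."""
    mode = stat.S_IMODE(path.stat().st_mode)
    return (mode & (stat.S_IWGRP | stat.S_IWOTH)) == 0
```

A wrapper script might refuse to continue, or call `path.chmod(0o600)`, when this returns False.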

Processing Security

  1. Resource limits

    • Be aware of disk space when processing large PDFs
    • Monitor CPU/memory usage for batch operations
  2. Temporary file cleanup

    • The pipeline cleans up temporary files automatically
    • Verify cleanup completed after processing
  3. Output verification

    • Verify processed PDFs before sharing
    • Check for data leakage in OCR output
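For the disk-space point above, a batch script can check free space on the output volume before starting a large job. This is a minimal sketch; `has_free_space` is a hypothetical helper, and the required size is left to the caller because it depends on page count and image resolution.

```python
import shutil
from pathlib import Path


def has_free_space(directory: Path, required_bytes: int) -> bool:
    """True if the filesystem holding `directory` has enough free bytes."""
    return shutil.disk_usage(directory).free >= required_bytes
```

For example, `has_free_space(Path("output"), 5 * 1024**3)` asks for roughly 5 GiB of headroom; the threshold is illustrative, not a project recommendation.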

Known Security Considerations

PDF Processing

  • Risk: PDFs can contain embedded JavaScript or malicious content
  • Mitigation: We only extract images; embedded content is never executed
  • Recommendation: Still only process trusted documents

Command Injection

  • Status: No known command injection vulnerabilities
  • Protection: All shell commands use proper escaping
  • Note: Report any suspected command injection immediately
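Scripts that call zbzeocr should apply the same care on their side. A common approach, sketched below, is to pass arguments as a list so no shell ever interprets the file names. The subcommand and flags come from the example report earlier in this document; the helper names are hypothetical.

```python
import subprocess
from pathlib import Path
from typing import List


def build_split_command(pdf: Path, out_dir: Path) -> List[str]:
    """Build the argument list; each path stays a single, unparsed argument."""
    return ["zbzeocr", "split-pdf-to-jpeg", "-f", str(pdf), "-o", str(out_dir)]


def run_split(pdf: Path, out_dir: Path) -> subprocess.CompletedProcess:
    """Run the command without a shell (shell=False is the default)."""
    return subprocess.run(build_split_command(pdf, out_dir), check=True)
```

Because no shell is involved, a file name such as `scan; rm -rf x.pdf` reaches the program as one literal argument rather than being split into commands.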

Path Traversal

  • Status: Paths are validated in the CLI layer
  • Protection: Output directories are checked before processing
  • Note: Still verify output paths in automation scripts
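An automation script can verify an output path by resolving it and confirming that it stays inside an allowed base directory. This is a sketch of that general technique, not the validation zbzeocr-cli itself performs; `resolve_inside` is a hypothetical helper.

```python
from pathlib import Path


def resolve_inside(base: Path, candidate: str) -> Path:
    """Resolve `candidate` against `base` and reject anything that escapes it."""
    base_resolved = base.resolve()
    # Joining with an absolute candidate discards the base, so absolute
    # paths outside the base are rejected by the same containment check.
    target = (base_resolved / candidate).resolve()
    if target != base_resolved and base_resolved not in target.parents:
        raise ValueError(f"output path escapes {base_resolved}: {candidate!r}")
    return target
```

With this check, `resolve_inside(Path("work"), "pages")` returns a path under `work`, while `resolve_inside(Path("work"), "../../../etc/")` raises `ValueError`.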

Dependency Vulnerabilities

  • Risk: Dependencies may have vulnerabilities
  • Mitigation: Dependencies are declared in pyproject.toml
  • Recommendation: Regularly check for updates with pip list --outdated
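The update check can be scripted by asking pip for machine-readable output. The sketch below assumes pip is available in the current interpreter's environment; the helper names are hypothetical.

```python
import json
import subprocess
import sys
from typing import List


def parse_outdated(pip_json: str) -> List[str]:
    """Turn pip's JSON output into 'name installed -> latest' lines."""
    packages = json.loads(pip_json)
    return [
        f"{pkg['name']} {pkg['version']} -> {pkg['latest_version']}"
        for pkg in packages
    ]


def list_outdated() -> List[str]:
    """Run `pip list --outdated --format=json` for this interpreter."""
    result = subprocess.run(
        [sys.executable, "-m", "pip", "list", "--outdated", "--format=json"],
        capture_output=True,
        text=True,
        check=True,
    )
    return parse_outdated(result.stdout)
```

An outdated package is not necessarily a vulnerable one, so this list is a starting point for review rather than a vulnerability scan.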

Security Features

Current Security Features

  1. Input validation

    • File existence checks
    • Path sanitization
    • File type verification
  2. Safe file operations

    • Atomic file writes
    • Temporary directory isolation
    • Automatic cleanup
  3. No network operations

    • All processing is local
    • No external network calls
    • No telemetry or tracking

Planned Security Enhancements

  • Checksum verification for processed files
  • Sandboxed processing environment option
  • Enhanced logging for security auditing
  • Security-focused CI/CD checks

Security Updates

Security updates will be released as:

  • Patch versions (e.g., 0.1.1) for minor security fixes
  • Minor versions (e.g., 0.2.0) for significant security improvements
  • Announcements via GitHub Security Advisories

Contact

For security-related questions or concerns:

For general questions and support:

Attribution

We thank the following security researchers for responsibly disclosing vulnerabilities:

(None yet - you could be the first!)


Thank you for helping keep zbzeocr-cli and its users safe!

Last updated: 2025-11-08
