We provide security updates for the following versions of zbzeocr-cli:
| Version | Supported |
|---|---|
| 0.1.x | ✅ |
| < 0.1 | ❌ |
We take the security of zbzeocr-cli seriously. If you discover a security vulnerability, please report it responsibly.
Please do NOT open a public GitHub issue for security vulnerabilities.
Instead, please email your findings to a.panagoa@gmail.com.
To help us understand and resolve the issue quickly, please include:
- Description of the vulnerability
  - What is the security issue?
  - What type of vulnerability is it? (e.g., code injection, path traversal, etc.)
- Steps to reproduce
  - Detailed steps to reproduce the vulnerability
  - Sample files or commands that demonstrate the issue
- Potential impact
  - What could an attacker achieve by exploiting this vulnerability?
  - What data or systems could be compromised?
- Affected versions
  - Which versions of zbzeocr-cli are affected?
  - Have you tested multiple versions?
- Suggested fix (optional)
  - If you have ideas on how to fix the vulnerability, please share them
  - Patches or pull requests are welcome after coordinated disclosure

An example report might look like this:

Subject: Security Vulnerability: Path Traversal in PDF Processing
Description:
The split-pdf-to-jpeg command does not properly sanitize output
directory paths, allowing path traversal attacks.
Steps to Reproduce:
1. Run: zbzeocr split-pdf-to-jpeg -f test.pdf -o "../../../etc/"
2. Files are written outside intended directory
Potential Impact:
An attacker could write files to arbitrary locations on the filesystem,
potentially overwriting system files or placing malicious files in
sensitive directories.
Affected Versions:
v0.1.0 (tested)
Suggested Fix:
Add path normalization and validation in cli_utils.validate_output_directory()
to reject paths with ".." components.
We are committed to responding to security reports promptly:
- Acknowledgment: Within 48 hours of receiving your report
- Initial assessment: Within 1 week
- Progress updates: Regular updates during investigation and fix development
- Resolution notification: When the fix is released or the issue is resolved
We practice coordinated disclosure:
- Investigation: We investigate and develop a fix
- Private notification: We notify you when the fix is ready
- Public disclosure: After the fix is released, we:
  - Create a GitHub Security Advisory
  - Credit the reporter (unless they prefer anonymity)
  - Release a patch version
  - Update this document if needed
- Standard timeline: 90 days from initial report
  - May be extended if the fix requires significant changes
  - May be shortened if the vulnerability is being actively exploited
When using zbzeocr-cli, follow these security best practices:
- Only process trusted PDF files
  - PDFs can contain malicious content
  - Validate the source and integrity of documents before processing
- Use dedicated processing directories
  - Create isolated directories for OCR processing
  - Don't process files in system directories
- Verify file permissions
  - Ensure output directories have appropriate permissions
  - Don't run zbzeocr with elevated privileges unless absolutely necessary
- Keep dependencies updated
  - Upgrade the package with `pip install --upgrade zbzeocr-cli`
  - Then check dependencies with `zbzeocr check-deps`
- Update external tools
  - Keep Tesseract OCR updated (5.5.0+)
  - Keep ImageMagick updated (7.1.1+)
  - Keep Unpaper updated (7.0.0+)
- Use virtual environments
  - Isolate zbzeocr-cli dependencies
  - Prevent conflicts with system packages
- Protect configuration files
  - Don't store sensitive data in profile YAML files
  - Use appropriate file permissions (0600 or 0644)
- Validate custom profiles
  - Review custom profiles before use
  - Don't load profiles from untrusted sources
- Resource limits
  - Be aware of disk space when processing large PDFs
  - Monitor CPU/memory usage for batch operations
- Temporary file cleanup
  - The pipeline automatically cleans up temporary files
  - Verify that cleanup completed after processing
- Output verification
  - Verify processed PDFs before sharing
  - Check for data leakage in OCR output
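
The first two practices above (checking document integrity and using a dedicated processing directory) could be scripted roughly as follows. This is a minimal sketch, not part of zbzeocr-cli: the helper names `sha256_of`, `verify_checksum`, and `make_work_dir` are hypothetical, and it assumes the document's publisher gives you a SHA-256 digest through a separate trusted channel.

```python
import hashlib
import os
import tempfile
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA-256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_checksum(path: Path, expected_hex: str) -> bool:
    """Compare a file's digest with an expected value (hypothetical helper)."""
    # Normalize whitespace and case so pasted digests compare cleanly.
    return sha256_of(path) == expected_hex.strip().lower()


def make_work_dir(prefix: str = "zbzeocr-") -> Path:
    """Create a private directory for a single processing run."""
    # mkdtemp creates the directory accessible only to the current user;
    # the chmod just makes that intent explicit on POSIX systems.
    work_dir = Path(tempfile.mkdtemp(prefix=prefix))
    os.chmod(work_dir, 0o700)
    return work_dir
```

A script would call `verify_checksum` on the incoming PDF and refuse to continue on a mismatch, then copy the file into the directory returned by `make_work_dir` before running any OCR step.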
- Malicious PDF content
  - Risk: PDFs can contain embedded JavaScript or malicious content
  - Mitigation: We only extract images and do not execute content
  - Recommendation: Still only process trusted documents
- Command injection
  - Status: No known command injection vulnerabilities
  - Protection: All shell commands use proper escaping
  - Note: Report any suspected command injection immediately
- Path traversal
  - Status: Paths are validated in the CLI layer
  - Protection: Output directories are checked before processing
  - Note: Still verify output paths in automation scripts
- Dependencies
  - Risk: Dependencies may have vulnerabilities
  - Mitigation: Dependencies are listed in pyproject.toml
  - Recommendation: Regularly check for updates with `pip list --outdated`
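
For automation scripts, the advice above about verifying output paths can be sketched as a containment check performed before the CLI is called. This is an illustrative sketch, not zbzeocr-cli's own validation: `resolve_inside` and `build_command` are hypothetical helpers, and the choice of base directory is an assumption. The command follows the `split-pdf-to-jpeg -f ... -o ...` form shown in the example report earlier in this document.

```python
from __future__ import annotations

from pathlib import Path


def resolve_inside(base_dir: Path, candidate: str) -> Path:
    """Resolve `candidate` under `base_dir`, rejecting paths that escape it."""
    base = base_dir.resolve()
    # Joining an absolute candidate discards `base`, so absolute paths
    # are also caught by the containment check below.
    target = (base / candidate).resolve()
    try:
        target.relative_to(base)  # raises ValueError if target is outside base
    except ValueError:
        raise ValueError(f"output path escapes {base}: {candidate!r}") from None
    return target


def build_command(pdf: Path, out_dir: Path) -> list[str]:
    """Build the argument list for the CLI call (no shell string involved)."""
    return ["zbzeocr", "split-pdf-to-jpeg", "-f", str(pdf), "-o", str(out_dir)]
```

Passing the result of `build_command` to `subprocess.run(..., check=True)` as a list keeps file names with spaces or shell metacharacters from being interpreted by a shell.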
- Input validation
  - File existence checks
  - Path sanitization
  - File type verification
- Safe file operations
  - Atomic file writes
  - Temporary directory isolation
  - Automatic cleanup
- No network operations
  - All processing is local
  - No external network calls
  - No telemetry or tracking
- Checksum verification for processed files
- Sandboxed processing environment option
- Enhanced logging for security auditing
- Security-focused CI/CD checks
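
The "atomic file writes" item above refers to a common pattern: write to a temporary file in the destination directory, then rename it into place so readers never see a partial file. The sketch below illustrates that general pattern, assuming a local filesystem; it is not taken from zbzeocr-cli's source, and `atomic_write_bytes` is a hypothetical name.

```python
import os
import tempfile
from pathlib import Path


def atomic_write_bytes(dest: Path, data: bytes) -> None:
    """Write `data` to `dest` so that `dest` is never left half-written."""
    dest = Path(dest)
    # The temp file lives in the same directory so the final rename
    # stays on one filesystem. mkstemp creates it with mode 0600.
    fd, tmp_name = tempfile.mkstemp(
        dir=dest.parent, prefix=dest.name + ".", suffix=".tmp"
    )
    try:
        with os.fdopen(fd, "wb") as fh:
            fh.write(data)
            fh.flush()
            os.fsync(fh.fileno())  # push the bytes to disk before renaming
        os.replace(tmp_name, dest)  # replaces any existing file in one step
    except BaseException:
        # Clean up the temp file if anything failed before the rename.
        try:
            os.unlink(tmp_name)
        except FileNotFoundError:
            pass
        raise
```

Because the temporary file is created with owner-only permissions, the final file keeps those permissions unless the caller changes them afterwards.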
Security updates will be released as:
- Patch versions (e.g., 0.1.1) for minor security fixes
- Minor versions (e.g., 0.2.0) for significant security improvements
- Announcements via GitHub Security Advisories
For security-related questions or concerns:
- Email: a.panagoa@gmail.com
- PGP Key: Not currently available (planned)
For general questions and support:
- Issues: https://github.com/zbze-org/zbze_ocr_cli/issues
- Discussions: https://github.com/zbze-org/zbze_ocr_cli/discussions
We thank the following security researchers for responsibly disclosing vulnerabilities:
(None yet - you could be the first!)
Thank you for helping keep zbzeocr-cli and its users safe!
Last updated: 2025-11-08