This project implements Huffman Coding, a lossless data compression algorithm. The goal is to analyze and optimize text compression by constructing Huffman trees, calculating entropy, and comparing compression efficiency with ASCII encoding.
The project uses a Python script to:
- Read text from a `.docx` file.
- Compute character frequencies and probabilities.
- Build a Huffman tree and generate Huffman codes.
- Calculate entropy and compression efficiency.
- Compare Huffman encoding with standard ASCII encoding.
- Lossless compression using Huffman Coding.
- Entropy calculation to evaluate compression efficiency.
- Comparison with ASCII encoding to measure compression performance.
- Table representation of frequencies, probabilities, and Huffman codes.
- Compression percentage calculation.
- Huffman coding achieved about 47.27% compression relative to 8-bit ASCII encoding.
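- The average Huffman code length was close to the entropy of the text, which is the theoretical lower bound for symbol-by-symbol coding, indicating an efficient encoding.
- Shorter codes were assigned to more frequent characters, which keeps the average code length low.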
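
The entropy comparison mentioned above can be illustrated with a minimal sketch. The function names `shannon_entropy` and `average_code_length` and the sample code table are hypothetical and are not taken from the project's script; the sketch only shows how the average code length can be checked against the entropy bound.

```python
import math
from collections import Counter


def shannon_entropy(text):
    """Entropy in bits per character, from empirical character probabilities."""
    counts = Counter(text)
    total = len(text)
    # H = sum over symbols of p * log2(1 / p), with p = n / total
    return sum((n / total) * math.log2(total / n) for n in counts.values())


def average_code_length(text, codes):
    """Average number of code bits per character for a given code table."""
    if not text:
        return 0.0
    return sum(len(codes[ch]) for ch in text) / len(text)


# Hypothetical sample: a small text and a valid prefix code for it.
sample = "aaaabbc"
sample_codes = {"a": "1", "b": "01", "c": "00"}

entropy = shannon_entropy(sample)
avg_len = average_code_length(sample, sample_codes)
print(f"entropy: {entropy:.4f} bits/char, average code length: {avg_len:.4f} bits/char")
```

For any prefix code, the average code length cannot be smaller than the entropy, so the gap between the two numbers indicates how close the code is to the bound.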
Huffman coding is a greedy algorithm used for lossless data compression. It assigns variable-length binary codes to input symbols based on their frequency of occurrence.
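
The greedy construction can be sketched as follows, assuming a priority queue built with the standard `heapq` module. The function name `build_huffman_codes` and the tuple-based tree representation are illustrative assumptions, not necessarily what the project's script uses.

```python
import heapq
from collections import Counter
from itertools import count


def build_huffman_codes(text):
    """Return a dict mapping each character in text to its Huffman code string."""
    freq = Counter(text)
    if not freq:
        return {}

    # Heap entries are (frequency, tiebreaker, node). The tiebreaker keeps
    # comparisons well-defined when two frequencies are equal.
    tiebreak = count()
    heap = [(f, next(tiebreak), sym) for sym, f in freq.items()]
    heapq.heapify(heap)

    # Special case: a single distinct symbol still needs a one-bit code.
    if len(heap) == 1:
        return {heap[0][2]: "0"}

    # Greedy step: repeatedly merge the two least frequent nodes.
    while len(heap) > 1:
        f1, _, left = heapq.heappop(heap)
        f2, _, right = heapq.heappop(heap)
        heapq.heappush(heap, (f1 + f2, next(tiebreak), (left, right)))

    # Walk the tree: internal nodes are (left, right) tuples, leaves are characters.
    codes = {}

    def walk(node, prefix):
        if isinstance(node, tuple):
            walk(node[0], prefix + "0")
            walk(node[1], prefix + "1")
        else:
            codes[node] = prefix

    walk(heap[0][2], "")
    return codes


codes = build_huffman_codes("aaaabbc")
for symbol, code in sorted(codes.items()):
    print(repr(symbol), code)
```

In this example the most frequent character, `a`, receives a one-bit code, while `b` and `c` receive two-bit codes. The exact bit patterns depend on how ties are broken, but the code lengths follow the frequencies.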
Compression efficiency is evaluated by comparing:
- Fixed-length ASCII encoding (8 bits per character)
- Variable-length Huffman coding (code length varies with character frequency)
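
The comparison can be sketched as below. This assumes the compression percentage is defined as the relative reduction in total bits versus 8-bit ASCII; the project's script may define it differently. The function name `compression_stats` and the sample code table are hypothetical.

```python
def compression_stats(text, codes, ascii_bits_per_char=8):
    """Return (ascii_bits, huffman_bits, percent_saved) for text under a code table."""
    ascii_bits = ascii_bits_per_char * len(text)
    huffman_bits = sum(len(codes[ch]) for ch in text)
    if ascii_bits == 0:
        return 0, 0, 0.0
    # Assumed definition: share of bits saved relative to fixed-length ASCII.
    percent_saved = (1 - huffman_bits / ascii_bits) * 100
    return ascii_bits, huffman_bits, percent_saved


# Hypothetical sample: a small text and a valid prefix code for it.
sample = "aaaabbc"
sample_codes = {"a": "1", "b": "01", "c": "00"}

ascii_bits, huffman_bits, saved = compression_stats(sample, sample_codes)
print(f"ASCII: {ascii_bits} bits, Huffman: {huffman_bits} bits, saved: {saved:.2f}%")
```

This count covers only the encoded payload. A practical compressed file would also need to store the code table or the frequencies so the decoder can rebuild the tree.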
This project was developed by:
- Yazeed Hamdan
- Mahmoud Hamdan
Feel free to reach out for collaboration or inquiries! 😊