Add PPM Algorithm Implementation #12392

LukOlen · 2024-11-21T01:25:40Z

Describe your change:

This pull request adds an implementation of the Prediction by Partial Matching (PPM) algorithm to the repository. PPM is a statistical data compression technique based on context modeling: it estimates the probability of each symbol from the symbols that precede it.

Key Features:

Context-Based Compression: The algorithm maintains a context of previous symbols to predict the next symbol, which improves compression efficiency. For a fuller description of the algorithm, see the Wikipedia article: https://en.wikipedia.org/wiki/Prediction_by_partial_matching
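To make the idea of context-based prediction concrete, here is a minimal sketch. It is not the code from this PR: the `ContextModel` class, its method names, and the default order of 2 are assumptions chosen for illustration. It also simplifies real PPM, which blends context orders using escape probabilities; this version just falls back to the longest context it has already seen.

```python
from collections import Counter, defaultdict


class ContextModel:
    """Toy order-k context model (illustrative only, not the PR's code)."""

    def __init__(self, order: int = 2) -> None:
        self.order = order
        # Maps a context string (length 0..order) to counts of following symbols.
        self.counts: defaultdict[str, Counter[str]] = defaultdict(Counter)

    def update(self, history: str, symbol: str) -> None:
        """Record `symbol` under every suffix of `history` up to `order` long."""
        for k in range(min(self.order, len(history)) + 1):
            context = history[len(history) - k:]
            self.counts[context][symbol] += 1

    def predict(self, history: str) -> dict[str, float]:
        """Return relative frequencies from the longest context seen so far."""
        for k in range(min(self.order, len(history)), -1, -1):
            context = history[len(history) - k:]
            seen = self.counts.get(context)
            if seen:
                total = sum(seen.values())
                return {s: c / total for s, c in seen.items()}
        return {}  # nothing has been observed yet


# Train on a short string, one symbol at a time.
text = "abracadabra"
model = ContextModel(order=2)
for i, ch in enumerate(text):
    model.update(text[:i], ch)

print(model.predict("ab"))  # symbols that followed "ab"
print(model.predict("xa"))  # "xa" is unseen, so this falls back to context "a"
```

The fallback in `predict` is the "partial matching" part: when the full context has never occurred, a shorter suffix of it is tried instead.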

Compression and Decompression Functions:

The implementation includes both compression and decompression methods in one file, allowing for reversible data transformation.
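The reversibility works because the decompressor can rebuild exactly the same model as the compressor, one symbol at a time. The sketch below illustrates that principle only and is not the PR's code: instead of emitting probabilities, it emits each symbol's rank in the model's current prediction order, a simple stand-in for the entropy coder normally paired with PPM. The names `compress`, `decompress`, and `ORDER` are assumptions, and the context is a fixed-length suffix with no escape mechanism.

```python
from collections import Counter, defaultdict

ORDER = 2  # assumed maximum context length for this sketch


def _context(history: str) -> str:
    """Return the last ORDER symbols of the history (fewer at the start)."""
    return history[max(0, len(history) - ORDER):]


def _ranking(counts: dict, history: str, alphabet: str) -> list[str]:
    """Order the alphabet by how often each symbol followed this context."""
    seen = counts[_context(history)]
    # Most frequent first; ties broken alphabetically so both sides agree.
    return sorted(alphabet, key=lambda s: (-seen[s], s))


def compress(data: str, alphabet: str) -> list[int]:
    """Encode each symbol as its rank under the model built so far."""
    counts = defaultdict(Counter)
    ranks = []
    for i, symbol in enumerate(data):
        history = data[:i]
        ranks.append(_ranking(counts, history, alphabet).index(symbol))
        counts[_context(history)][symbol] += 1  # update after encoding
    return ranks


def decompress(ranks: list[int], alphabet: str) -> str:
    """Rebuild the same model step by step and map ranks back to symbols."""
    counts = defaultdict(Counter)
    out = ""
    for rank in ranks:
        symbol = _ranking(counts, out, alphabet)[rank]
        counts[_context(out)][symbol] += 1  # same update as the compressor
        out += symbol
    return out


text = "abracadabra"
alphabet = "".join(sorted(set(text)))
ranks = compress(text, alphabet)
restored = decompress(ranks, alphabet)
print(ranks)
print(restored == text)
```

Well-predicted symbols get rank 0, so repetitive input yields many small numbers, which is what a later entropy-coding stage would exploit.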

File Handling:

A utility function is provided to read data from a file, making it easy to test the algorithm with real data.

Usage:

To use the PPM algorithm, call the ppm function with the path to the file you wish to compress. It outputs the compressed data as probabilities and also provides the decompressed data so the round trip can be verified.

Testing:

I have tested the implementation on several datasets to check its functionality and performance. In these tests the algorithm compressed and decompressed the data accurately.

[x] Add an algorithm?
[ ] Fix a bug or typo in an existing algorithm?
[ ] Add or change doctests? -- Note: Please avoid changing both code and tests in a single pull request.
[ ] Documentation change?

Checklist:

[x] I have read CONTRIBUTING.md.
[x] This pull request is all my own work -- I have not plagiarized.
[x] I know that pull requests will not be merged if they fail the automated tests.
[x] This PR only changes one algorithm file. To ease review, please open separate PRs for separate algorithms.
[x] All new Python files are placed inside an existing directory.
[x] All filenames are in all lowercase characters with no spaces or dashes.
[x] All functions and variable names follow Python naming conventions.
[x] All function parameters and return values are annotated with Python type hints.
[x] All functions have doctests that pass the automated testing.
[x] All new algorithms include at least one URL that points to Wikipedia or another similar explanation.
[x] If this pull request resolves one or more open issues then the description above includes the issue number(s) with a closing keyword: "Fixes #ISSUE-NUMBER".

- Implemented the PPM algorithm for data compression and decompression.
- Added methods for updating the model, encoding, and decoding symbols.
- Included utility functions for reading from files and testing the algorithm.
- Verified functionality with various datasets to ensure accuracy.

This addition enhances the repository's collection of Python algorithms.

for more information, see https://pre-commit.ci

LukOlen and others added 3 commits November 21, 2024 02:10

[pre-commit.ci] auto fixes from pre-commit.com hooks

521d7a2

for more information, see https://pre-commit.ci

algorithms-keeper bot added the "tests are failing" label (Do not merge until tests pass) Nov 21, 2024

LukOlen and others added 5 commits November 21, 2024 09:48

trying to make the code pass ruff auto review

653f8e4

trying to pass ruff tests

435f451

[pre-commit.ci] auto fixes from pre-commit.com hooks

4359762

for more information, see https://pre-commit.ci

fixed last issues with ruff

bad910e

[pre-commit.ci] auto fixes from pre-commit.com hooks

930c4d4

for more information, see https://pre-commit.ci

algorithms-keeper bot added the "awaiting reviews" label (This PR is ready to be reviewed) Nov 21, 2024

ruff fixes

fe3a43c

algorithms-keeper bot removed the "tests are failing" label (Do not merge until tests pass) Nov 21, 2024
