Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: Introduce FormattingConfig and deprecate DefaultFormats #127

Merged
merged 10 commits into from
Jul 20, 2024

Conversation

fpgmaas
Copy link
Collaborator

@fpgmaas fpgmaas commented Jul 17, 2024

PR Checklist

  • A description of the changes is added to the description of this PR.
  • If there is a related issue, make sure it is linked to this PR.
  • If you've fixed a bug or added code that should be tested, add tests!
  • If you've added or modified a feature, documentation in docs is updated

Description of changes

This PR proposes to deprecate the existing methods of configuring the output format of chispa through the use of the DefaultFormats and arbitrary dataclasses. Currently there are a few issues with this approach;

  • Poor error handling; what if a color does not exist?
  • How does a user know which colors are available?
  • Colors and formatting styles (e.g. bold) are all mixed together
  • A user can specify multiple colors without that raising an error,
  • Difficult to add proper static typing in the future
  • A user needs to provide all keys in his custom formats dataclass, even if they want to keep most the same to the defaults.
  • How do a user know which keys their custom dataclass should have?

As an alternative, this PR proposes to replace it with the classes Color, Style, Format, and FormattingConfig, so it can be used as

from chispa.formatting import FormattingConfig

formats = FormattingConfig(
        mismatched_rows={"color": "light_yellow"},
        matched_rows={"color": "cyan", "style": "bold"},
        mismatched_cells={"color": "purple"},
        matched_cells={"color": "blue"},
    )

assert_basic_rows_equality(df1.collect(), df2.collect(), formats=formats)

or

formats = FormattingConfig(
        mismatched_rows={"color": "light_yellow"},
    )

assert_basic_rows_equality(df1.collect(), df2.collect(), formats=formats)

or similarly:

from chispa.formatting import FormattingConfig, Color, Style

formats = FormattingConfig(
        mismatched_rows={"color": Color.LIGHT_YELLOW},
        matched_rows={"color": Color.CYAN, "style": Style.BOLD},
        mismatched_cells={"color": Color.PURPLE},
        matched_cells={"color": Color.BLUE},
    )

assert_basic_rows_equality(df1.collect(), df2.collect(), formats=formats)

This brings:

  • Improved code readability and maintainability.
  • Easier to extend and modify formatting options.
  • Enhanced error handling and validation of color and style inputs.

Notes

  • Even though we do not actively use type hints yet, this PR introduces them for the new classes. In the future we probably want to add type hints to the entire project and enable mypy. This PR also adds the line from __future__ import annotations to all files through the use of ruff, to provide type hinting compatiblity in Python 3.8 and 3.9
  • The changes should be backwards compatible; see test_deprectaed.py.
  • Running assert_basic_rows_equality(df1.collect(), df2.collect(), formats=CustomFormats()) will display
    DeprecationWarning: Using an arbitrary dataclass for formatting is deprecated. Use `chispa.formatting.FormattingConfig` instead.
    
  • Running assert_basic_rows_equality(df1.collect(), df2.collect(), formats=DefaultFormats()) will display
    DeprecationWarning: DefaultFormats is deprecated. Use `chispa.formatting.FormattingConfig` instead.
    

chispa/rows_comparer.py Outdated Show resolved Hide resolved
chispa/default_formats.py Show resolved Hide resolved
@fpgmaas fpgmaas marked this pull request as draft July 17, 2024 19:31
@fpgmaas
Copy link
Collaborator Author

fpgmaas commented Jul 17, 2024

@SemyonSinchenko I have some more ideas for improvements based on your feedback. Will change it to a draft PR and get back to you soon!

@fpgmaas fpgmaas changed the title feat: Introduce TerminalStringFormatter and FormattingConfig feat: Introduce TerminalStringFormatter and FormattingConfig and deprecate DefaultFormats Jul 18, 2024
@fpgmaas fpgmaas requested a review from SemyonSinchenko July 19, 2024 05:08
@fpgmaas fpgmaas mentioned this pull request Jul 19, 2024
fpgmaas added 3 commits July 19, 2024 07:28
wip

wip

functioning unit tests

fix

fix

fix

add import fix

fix mutable defaults

fix mutable defaults

moved format-parsing logic to FormattingConfig class

make backwards compatible

fix init

add docstring

changed TerminalStringFormatter class into a function
move dict parsing logic to format class

inherit Enum from str

rename string formatter

simplify
@fpgmaas fpgmaas marked this pull request as ready for review July 19, 2024 05:37
chispa/__init__.py Outdated Show resolved Hide resolved
chispa/default_formats.py Outdated Show resolved Hide resolved
chispa/default_formats.py Outdated Show resolved Hide resolved
@fpgmaas fpgmaas requested a review from SemyonSinchenko July 19, 2024 09:18
@fpgmaas fpgmaas changed the title feat: Introduce TerminalStringFormatter and FormattingConfig and deprecate DefaultFormats feat: Introduce FormattingConfig and deprecate DefaultFormats Jul 19, 2024
Copy link
Collaborator

@SemyonSinchenko SemyonSinchenko left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM! I left one minor comment, feel free to merge it.

chispa/formatting/formatting_config.py Outdated Show resolved Hide resolved
@fpgmaas fpgmaas merged commit 580631a into MrPowers:main Jul 20, 2024
7 checks passed
@fpgmaas fpgmaas deleted the feat/formatter branch July 20, 2024 06:00
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants