Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

plots: allow definition of plots section as list #8412

Merged
merged 1 commit into from
Oct 17, 2022

Conversation

dtrifiro
Copy link
Contributor

@dtrifiro dtrifiro commented Oct 7, 2022

Top-level section for plots currently requires a mapping, which looks like this:

# dvc.yaml
plots:
  evaluation/test/plots/confusion_matrix.json: # Configure template and axes.
    template: confusion
    x: actual
    y: predicted
  logs.csv:
    x: actual
    y: [predicted_class, predicted_class2]

with this PR, we allow for the plots top-level section to be defined as a list, which should make it more similar to how plots are defined in the stage sections:

# dvc.yaml
plots:
  - evaluation/test/plots/confusion_matrix.json: # Configure template and axes.
      template: confusion
      x: actual
      y: predicted
  - logs.csv:
      x: actual
      y: [predicted_class, predicted_class2]

@dtrifiro dtrifiro requested review from pared and dberenbaum October 7, 2022 15:28
@dtrifiro
Copy link
Contributor Author

dtrifiro commented Oct 7, 2022

Somehow related to #8371, since in that case I'd expect params/metrics to be lists and not mappings

@dtrifiro dtrifiro self-assigned this Oct 7, 2022
@dtrifiro dtrifiro added the A: plots Related to the plots label Oct 7, 2022
@dberenbaum
Copy link
Collaborator

@pared What do you think? I think this was purposely a mapping since unique keys are required, but isn't that true for stage plots and other deps/outs that take lists?

@pared
Copy link
Contributor

pared commented Oct 10, 2022

@dberenbaum yep, I think that we can enforce uniqueness with schema validation rules. Also that would make plots more conforming with how we handle outputs.

@dberenbaum
Copy link
Collaborator

@dtrifiro Great idea. Let's do it!

@dberenbaum
Copy link
Collaborator

Somehow related to #8371, since in that case I'd expect params/metrics to be lists and not mappings

Agreed, I don't think we should need mappings support for these.

@dtrifiro dtrifiro marked this pull request as ready for review October 10, 2022 16:52
@dtrifiro dtrifiro marked this pull request as draft October 10, 2022 16:54
@dtrifiro dtrifiro marked this pull request as ready for review October 10, 2022 16:58
Copy link
Contributor

@daavoo daavoo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@dtrifiro
Copy link
Contributor Author

Thanks @daavoo: iterative/dvcyaml-schema#23

@dtrifiro
Copy link
Contributor Author

@dberenbaum can we merge this?

Copy link
Collaborator

@dberenbaum dberenbaum left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's merge, although I don't have permission to merge

@daavoo daavoo merged commit e924492 into iterative:main Oct 17, 2022
@skshetry
Copy link
Member

What is the canonical way of writing top-level plots now?

Although I prefer this schema, I am -1 on this as I don't see much difference between the two, and only increases the maintenance burden and worsens the error message (voluptuous falls back to the first schema on Any, giving confusing error message). We are offering two similar formats to define plots.

Also, the top-level plots section is not the same as stages.<stage>.plots section and felt like a conscious choice to me. The plots section is also very overloaded:

plots: Union[
      Dict[
          PlotIdOrFilePath, Union[TopLevelPlotFlags, EmptyTopLevelPlotFlags]
      ],
      List[
          Dict[
              PlotIdOrFilePath,
              Union[TopLevelPlotFlags, EmptyTopLevelPlotFlags],
        ]
    ],
]

Regarding 3.0 deprecation, we haven't deprecated anything in dvc.yaml, and we are most likely stuck with that schema.

@dberenbaum
Copy link
Collaborator

What is the canonical way of writing top-level plots now?

I think list format should be the canonical way. It matches stage plots and stage outputs and also upcoming top-level sections like params and metrics.

We need to update docs here. We can update now to only mention list support. We can even only add list support to schema validation.

Not sure how much it saves us to make the breaking change in 3.0 of dropping mapping/dict support. I'm fine with leaving it unless it will cause major headaches so as not to break anyone's projects.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A: plots Related to the plots
Projects
No open projects
Archived in project
Development

Successfully merging this pull request may close these issues.

5 participants