Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Create rules for emitting PackageFilesInfo, PackageDirsInfo #273

Merged
merged 13 commits into from
Feb 10, 2021

Conversation

nacl
Copy link
Collaborator

@nacl nacl commented Jan 19, 2021

This change creates rules for emitting PackageFilesInfo (pkg_files) and
PackageDirsInfo (pkg_mkdirs) providers.

The implementations and tests of these rules are largely similar to the rules
that currently exist in experimental/pkg_filegroup.bzl, with a few notable
differences:

  • pkg_filegroup is replaced with pkg_files. pkg_filegroup will be
    re-implemented as a grouping provider in a future change.

  • pkg_files now checks for files that map to the same destination. The check
    was previously deferred to the packaging rules themselves. This behavior is
    now tested.

  • attr checks for pkg_files and friends are now no longer strict. Only the
    "unix" attribute is verified. The logic for how to check this isn't yet
    complete, but it's enough to make forward progress.

  • section for pkg_files and pkg_mkdirs is removed. It will be replaced
    with more specific attributes as needed.

  • make_strip_prefix is removed in favor of the strip_prefix pseudo-module.

pkg_mkdirs is effectively unchanged, other than differences in the providers.
The tests are also effectively unchanged, excepting those regarding section,
which was removed, and those added, as mentioned above.


Do pardon the length of the review. I can separate the tests from the rule implementation if that would be easier for you.

Note also: pkg_mklinks will be added in a future change, as it looks like it may be nontrivially different than it was before.

This change creates rules for emitting `PackageFilesInfo` (`pkg_files`) and
`PackageDirsInfo` (`pkg_mkdirs`) providers.

The implementations and tests of these rules are largely similar to the rules
that currently exist in `experimental/pkg_filegroup.bzl`, with a few notable
differences:

- `pkg_filegroup` is replaced with `pkg_files`.  `pkg_filegroup` will be
  re-implemented as a grouping provider in a future change.

- `pkg_files` now checks for files that map to the same destination.  The check
  was previously deferred to the packaging rules themselves.  This behavior is
  now tested.

- `attr` checks for `pkg_files` and friends are now no longer strict.  Only the
  "unix" attribute is verified.  The logic for how to check this isn't yet
  complete, but it's enough to make forward progress.

- `section` for `pkg_files` and `pkg_mkdirs` is removed.  It will be replaced
  with more specific attributes as needed.

- `make_strip_prefix` is removed in favor of the `strip_prefix` pseudo-module.

`pkg_mkdirs` is effectively unchanged, other than differences in the providers.
The tests are also effectively unchanged, excepting those regarding `section`
and the ones added above.
@nacl nacl requested a review from aiuto as a code owner January 19, 2021 21:17
@aiuto
Copy link
Collaborator

aiuto commented Jan 20, 2021

It's going to take me a little while to understand this. The first question is whey the tests have to create their own WORKSPACE under tests. That makes it much harder to do bazel test ...

@nacl
Copy link
Collaborator Author

nacl commented Jan 20, 2021

It's going to take me a little while to understand this.

If you'd like to hop on a call to walk through this, I'd be happy to facilitate that.

The first question is whey the tests have to create their own WORKSPACE under tests. That makes it much harder to do bazel test ...

It's to allow us to test pkg_files when they reach across WORKSPACE boundaries. Files within are used in tests in //tests/mappings/..., and all tests within the repo are a part of tested as a part of the //tests/mappings:pkg_files_analysis_tests.

I can make this into a repository rule if that would be easier to understand.

Copy link
Collaborator

@aiuto aiuto left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On the whole, cool. This is most of my thoughts. I have not looked at the logic of the path stripping. My brain is full now.

pkg/mappings.bzl Outdated Show resolved Hide resolved
pkg/mappings.bzl Show resolved Hide resolved
pkg/mappings.bzl Outdated
- `from_pkg(path)`: strip all directory components up to the current
package, plus what's in `path`, if provided.

- `from_root(path)`: strip beginning from the file's WORKSPACE root plus
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't understand what this means. I'm also having trouble reasoning about what happens with filtering filegroups that are a mix of sources and targets, especially if some of the targets are in a different package

Some of your tests, which filter by package names rather than paths hint that you are thinking of something more complex than me. What I am missing is a definition of how we interpret paths from objects in filegroup.

Let's take this one:

toast/BUILD:
filegroup(
name = "tfiles",
srcs = glob(["foo/**"]) + [
"x.dat",
"//:build_tar",
],
)

And we inspect the DefaultInfo.files for :tfiles. We get
dirname basename => short_path path
toast/foo | bar => toast/foo/bar toast/foo/bar
toast | toast.bzl => toast/toast.bzl toast/toast.bzl
. | build_tar.py => build_tar.py build_tar.py
bazel-out/k8-fastbuild/bin | build_tar => build_tar bazel-out/k8-fastbuild/bin/build_tar

strip_files_only is easy to understand (unless there is a name clash).
but what else could be reasonable?

strip("toast") => foo/bar, toast.bzl, and what of build_tar.py? I think we should fail if the strip is not a prefix of every short_path. It may not be the most flexible, but it will not lead to surprise.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

pkg_files operates on the level of Starlark's File struct, which, in the case, means that filegroups expand directly to the files in question. This, indeed, can be a bit confusing when you try using strip_prefix across package boundaries. Generally:

  • from_pkg always begins relative to the package root for each file in question, by calculating relative to the "owner" of the file. The package root may be different for each file.
  • from_root always begins relative to the repository root for each file in question. The identity of the repository in question is not considered.

I don't know of a great way to solve this problem that satisfies all potential concerns. Perhaps some accompanying documentation with best practices should be provided too? One point in there could be that to avoid confusion, strip_prefix.from_pkg / strip_prefix.from_root should only be used on sets of files from the same package/repository.

The current behavior for prefix stripping is that stripping is "optional". This is probably bad for user surprise, and as you suggested, this should hard fail if it's not possible.

Your thoughts are appreciated as always. I'll work on something that makes invalid strip prefixes fail hard for this change, and documentation can be added in an additional change. WDYT?

pkg/mappings.bzl Show resolved Hide resolved
pkg/mappings.bzl Show resolved Hide resolved
pkg/mappings.bzl Outdated Show resolved Hide resolved
pkg/mappings.bzl Outdated Show resolved Hide resolved

The following keys are rejected:

- Any label that expands to more than one file (mappings must be
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you are not allowing that, why does this have to be labels instead of plain strings?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The only reason for this is to make it easier to identify source files.

I tried to make this match what we do in excludes, which uses attr.label_list to name source files. All calculations are based on Files, which (AFAICT) cannot be created outside of the rule ctx struct.

I'm open to alternatives if you have them.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't have a better alternative now. It would be interesting to handle glob style renaming, and to handle target renaming where the target has multiple outputs. But unifying them is hard - and maybe more than needed right now.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Gotcha. Thanks!

pkg/mappings.bzl Outdated Show resolved Hide resolved
pkg/mappings.bzl Show resolved Hide resolved
@nacl
Copy link
Collaborator Author

nacl commented Feb 1, 2021

On the whole, cool. This is most of my thoughts. I have not looked at the logic of the path stripping. My brain is full now.

Thanks for taking a look; I totally missed this last week in the pile of github mail and I was super distracted on top of that. I'm working through these comments now. They're giving me ideas.

Copy link
Collaborator Author

@nacl nacl left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've addressed all but the following:

  • strip_prefix should fail if the strip prefix path does not apply cleanly everywhere

With the following deferred:

  • Choosing an alternate scheme for mapping rule attributes.
  • Fixing strip_prefix.from_pkg() inconsistency
  • Templating of package file parameters with PackageVariablesInfo

I would appreciate any additional feedback on the path stripping logic. It's worked fine for us so far, but any gotchas would be nice to understand.

A part of me also wonders if "layout" would be a better name than "mapping" here. Not a massive concern, though.

pkg/mappings.bzl Outdated Show resolved Hide resolved
pkg/mappings.bzl Outdated Show resolved Hide resolved
pkg/mappings.bzl Outdated Show resolved Hide resolved

The following keys are rejected:

- Any label that expands to more than one file (mappings must be
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The only reason for this is to make it easier to identify source files.

I tried to make this match what we do in excludes, which uses attr.label_list to name source files. All calculations are based on Files, which (AFAICT) cannot be created outside of the rule ctx struct.

I'm open to alternatives if you have them.

pkg/mappings.bzl Show resolved Hide resolved
pkg/mappings.bzl Show resolved Hide resolved
pkg/mappings.bzl Outdated
""",
mandatory = True,
),
"attrs": attr.string_list_dict(
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

SGTM.

pkg/mappings.bzl Show resolved Hide resolved
pkg/mappings.bzl Outdated
- `from_pkg(path)`: strip all directory components up to the current
package, plus what's in `path`, if provided.

- `from_root(path)`: strip beginning from the file's WORKSPACE root plus
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

pkg_files operates on the level of Starlark's File struct, which, in the case, means that filegroups expand directly to the files in question. This, indeed, can be a bit confusing when you try using strip_prefix across package boundaries. Generally:

  • from_pkg always begins relative to the package root for each file in question, by calculating relative to the "owner" of the file. The package root may be different for each file.
  • from_root always begins relative to the repository root for each file in question. The identity of the repository in question is not considered.

I don't know of a great way to solve this problem that satisfies all potential concerns. Perhaps some accompanying documentation with best practices should be provided too? One point in there could be that to avoid confusion, strip_prefix.from_pkg / strip_prefix.from_root should only be used on sets of files from the same package/repository.

The current behavior for prefix stripping is that stripping is "optional". This is probably bad for user surprise, and as you suggested, this should hard fail if it's not possible.

Your thoughts are appreciated as always. I'll work on something that makes invalid strip prefixes fail hard for this change, and documentation can be added in an additional change. WDYT?

pkg/mappings.bzl Show resolved Hide resolved
Andrew Psaltis added 3 commits February 2, 2021 12:03
…t tests, docs

This reduces user surprise at the expense of flexibility.

Also cleaned up test and tagging issues that were detected while doing said
adaptation.
Add negative test root prefix strip failures.
@nacl
Copy link
Collaborator Author

nacl commented Feb 2, 2021

Just added in the changes to cause _do_strip_prefix to fail if it can't complete the strip operation. PTAL.

Andrew Psaltis added 3 commits February 4, 2021 10:29
Document well-known attributes within.  This currently is just "unix".  Upcoming
ones will include "rpm_filetag" for RPM packages.

Also, reformat the **ATTRIBUTES** table so that rows/columns line up properly.
@nacl nacl force-pushed the topic/pkg_mkfiles_mkdirs branch from 128e1f0 to a83356a Compare February 4, 2021 21:33
@nacl
Copy link
Collaborator Author

nacl commented Feb 4, 2021

Sorry about the commit spam. I was experimenting with pushing and it didn't do what I wanted. Everything should be in the right state now.

Copy link
Collaborator

@aiuto aiuto left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm still not quite happy with the WORKSPACE files for the tests - for reasons.
But that shouldn't block work right now.

pkg/mappings.bzl Show resolved Hide resolved

The following keys are rejected:

- Any label that expands to more than one file (mappings must be
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't have a better alternative now. It would be interesting to handle glob style renaming, and to handle target renaming where the target has multiple outputs. But unifying them is hard - and maybe more than needed right now.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants