Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

document that directories are flattened unless data_path, strip_prefix are set to "." #82

Open
jmhodges opened this issue Dec 4, 2016 · 26 comments
Labels
documentation P2 An issue that should be worked on when time is available

Comments

@jmhodges
Copy link

jmhodges commented Dec 4, 2016

In bazel 0.4.1 (and I think 0.3.2), if you give docker_build a filegroup with a glob that contains multiple directories of files, those files will be flattened into one directory.

Here's the repo I made for the reproduction. It needed a bit of set up. (You'll need xz installed to get the wheezy rootfs file unpacked.)

If you run bazel run //subdir:buggy_flatten_image, I expected a /foo/quux and a /bar/quux file to exist in the created image but instead only a /quux exists (verifiable with a docker run -it bazel/subdir:buggy_flatten_image). A warning about duplicate filepaths is given but not error occurs.

The addition of data_path = "." as in the //subdir:has_subdirs_image target fixes this and that's very surprising.

I think it's inheriting some behavior from pkg_tar's strip_prefix maybe, but it's definitely not what I expected and finding the solution wasn't easy, either. This also makes it difficult to merge together multiple filegroups with subdirectories into one image because you have to give each of them their own docker_build (or, perhaps, pkg_tar) in order to keep them from ending up at /.

Seems like this, or something similar, was addressed in bazelbuild/bazel#677 / bazelbuild/bazel@0e22258 but maybe something has happened in between then and now.

(I noticed this when a build was throwing up warnings, but not errors, of duplicate filepaths being included. Maybe that should be another ticket?)

Environment info

  • Operating System:

macOS 10.12 Sierra

  • Bazel version (output of bazel info release):

release 0.4.1-homebrew

@damienmg
Copy link

damienmg commented Dec 5, 2016

This look like a bug to me. Normally default should be "." so not strip the path.

@jmhodges
Copy link
Author

jmhodges commented Dec 9, 2016

Yeah, this same flattening-every-dir problem seems to be affecting pkg_tar. I've just added some targets that demonstrate it in pkg_tar, too.

I also added another file at bar/okay/welp so that you can see that the welp is being dropped at the very top of the fs unless strip_prefix = "." is included.

Changing pkg_tar to use strip_prefix = "." by default would mean that filegroups with srcs like glob("somedir/*") would start getting somedir included in them. That might be too much of an API change? Not sure.

There might be a middle ground we need to hit where the default doesn't flatten all of the directories, but also doesn't include the dirs that the filegroup was made by globbing over. That'd get us closer to what people might expect, but not being able to rewrite filegroups target directories easily when creating a pkg_tar or docker_build from multiple targets would still be a bummer.

Lmk what you think.

@jmhodges jmhodges changed the title docker_build: directories are flattened unless data_path set to "." docker_build, pkg_tar: directories are flattened unless data_path, strip_prefix are set to "." Dec 9, 2016
@damienmg
Copy link

This sounds reasonable to me. Can you ask the question on bazel-discuss@ to see what other thinks?

Wrapping it with a macro for people might be an acceptable workaround for people that complains.

@jmhodges
Copy link
Author

@jmhodges
Copy link
Author

Nobody bit on that thread! Seems like it's up to y'all.

@damienmg
Copy link

Ok thanks, this will get done next quarter. Happy holidays!

@treaster
Copy link

treaster commented Mar 3, 2017

I know I'm late to the discussion here. However, I just ran into this problem, so I thought I'd cast a vote in the event that it's not too late.

Personally, I'd prefer to see the default behavior be to preserve the directory structure below the BUILD file containing the filegroup rule, then allow options like strip_prefix or a hypothetical flatten to modify the paths from there (probably on the filegroup rather than on the docker rule, although maybe I'm overlooking some use case...). Following the original directory structure preserves the most information, while allowing the options provides the necessary flexibility. If the behavior change is well documented in release notes and is easy to reverse by simply applying flatten, hopefully it won't cause too much of a problem for people.

@mattmoor
Copy link

mattmoor commented May 1, 2017

@damienmg FWIW +1 to making the default preserve directory structure. I just got bit by this playing with some samples and it probably would have burned way more time if I didn't have a feint recollection of this issue from triaging docker_build issues.

@damienmg damienmg removed their assignment May 2, 2017
@jmhodges
Copy link
Author

jmhodges commented Aug 3, 2017

Just got hit by this again. Any updates?

@damienmg
Copy link

docker_build should be fixed IIUC

@damienmg damienmg removed their assignment Aug 28, 2017
@raliste
Copy link

raliste commented Jan 13, 2018

Still happening at container_image

@jmhodges
Copy link
Author

jmhodges commented Feb 9, 2018

I'm no longer seeing this behavior in container_image, at least in the cases I've looked at. I'm running 3caddbe7f75fde6afb2e2c63654b5bbeeeedf2ac of rules_docker.

(In fact, using my data_path = "." now does a weird thing by including an extra subdirectory naned something like amd64_stripped for Go binaries)

@jmhodges
Copy link
Author

jmhodges commented Feb 9, 2018

Oh, the data_path = "." is doing weird things with Go binaries but the data_path still needed on any container_image that that a container_image that includes a go_binary is using as its base.

So now we're in a really weird spot where you have to split up your container_image calls in order to get data_path = "." to do the workaround stuff without being too weird.

@mattmoor
Copy link

IIUC that change was something that bit go_image too, the rules_go changes something about their output paths in a breaking way. I don't believe that change had anything to do with data_path.

Having been bitten by this strange default behavior multiple times myself, I'm completely on-board that this behavior is bizarre and not what users expect. :)

We should probably add a flag to restore the behavior as it exists today, in case folks are broken by it. It would be useful to have a Googler to shepherd this into the mono-repo (I don't have time right now).

@ash2k
Copy link

ash2k commented Jun 26, 2018

Having to add strip_prefix = "." to pkg_tar to preserve the directory structure is really weird.

@cgoy-nvidia
Copy link

These rules, pkg_tar and container_layer, are pretty much broken. strip_prefix/data_path doesn't seem to work as advertised at all. We have two repos which are selected based on a config_setting. We want to be able to preserve the directory structure of a filegroup we are using within each repo. Let's say we have a filegroup named tools in each repo, and our repos are named local and prebuilt.

If we add the strip_prefix/data_path to preserve the directory structure we end up with the root of the repo in the member names of the tar file.

./../prebuilt/tools/some_script.sh
or
./../local/tools/some_script.sh

It also ignores the package_dir attribute if strip_prefix is set.

We even tried setting the strip_prefix based on which repo we are selecting, but it doesn't matter. The layer gets unpacked with the repo name as the top level folder in the container we are using. It doesn't matter if the container_image was created with container_layers or pkg_tars.

@shettich
Copy link

In case anyone else runs into this, I just wasted a few hrs figuring out that:

The strip_prefix="." only works if the pkg_tar target is in the top level BUILD file. If it is down a couple levels it does not work and you get a flattened tree.

You can see that the helper function _short_path_dirname special cases top level targets

@nlopezgi
Copy link

From docs in rules_docker (https://github.com/bazelbuild/rules_docker#container_image)

Hint: if you want to put files in specific directories inside the image use pkg_tar rule to create the desired directory structure and pass that to container_image via tars attribute. Note you might need to set strip_prefix = "." or strip_prefix = "{some directory}" in your rule for the files to not be flattened. See https://github.com/bazelbuild/bazel/issues/2176 and bazelbuild/rules_docker#317 for more details.

@shettich , can you try with strip_prefix = "{some directory}" and see if that works for you?
If not please let us know so that we at least can improve documentation.

@shettich
Copy link

I was not building a container image, just a tar.

You can see in the code that the prefix to strip off is calculated by compute_data_path() based on the output path and the strip_prefix your provide. If you provide strip_prefix="." it will use _short_path_dirname() on the output file as the prefix. That function special cases top level target paths, which is why the strip_prefix trick works for some targets but not all.

@aiuto
Copy link
Collaborator

aiuto commented Aug 26, 2019

I agree with @cgoy-nvidia that pkg_tar is pretty much broken. We need a new design for how fileset's and/or other targets are combined into a tar/zip/deb/rpm, ... file. Moving this to rules_pkg so we can work on it there.

@aiuto aiuto transferred this issue from bazelbuild/bazel Aug 26, 2019
@kylecordes
Copy link

Having to add strip_prefix = "." to pkg_tar to preserve the directory structure is really weird

... it's still pretty weird in 1.0; might a PR be accepted to note this on the pkg_tar doc page, between now and when pkg_tar gets revamped?

@jsharpe
Copy link

jsharpe commented Feb 13, 2020

Just ran into this - in my case I was using pkg_tar with a filegroup in a external repo. In this case I had to set strip_prefix = "./../"
Which appears to be the issue in #131 / PR #132

@aiuto aiuto added the P2 An issue that should be worked on when time is available label Jan 28, 2021
@jmhodges
Copy link
Author

As the original reporter who just got bit by this again today, it would be lovely to see a fix. What direction do y'all want to take it in?

@aiuto
Copy link
Collaborator

aiuto commented Feb 10, 2021

@NCal and I are tag teaming work on sensible replacements for strip prefix. #273 is the latest bit in that.

@jmhodges
Copy link
Author

Oh, wonderful! Thank you so much for the update!

@aiuto aiuto changed the title docker_build, pkg_tar: directories are flattened unless data_path, strip_prefix are set to "." document that directories are flattened unless data_path, strip_prefix are set to "." May 3, 2021
@aiuto
Copy link
Collaborator

aiuto commented Oct 22, 2021

Related to #354

fionera added a commit to monogon-dev/NetMeta that referenced this issue Feb 1, 2023
This is a fix for bazelbuild/rules_pkg#82

Change-Id: I7733ea110103298c36106e40233d57fa9b1d2a1a
monogon-bot pushed a commit to monogon-dev/NetMeta that referenced this issue Feb 1, 2023
This is a fix for bazelbuild/rules_pkg#82

Change-Id: I7733ea110103298c36106e40233d57fa9b1d2a1a
Reviewed-on: https://review.monogon.dev/c/NetMeta/+/1115
Reviewed-by: Leopold Schabel <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
documentation P2 An issue that should be worked on when time is available
Projects
None yet
Development

No branches or pull requests