-
Notifications
You must be signed in to change notification settings - Fork 81
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Bazelify compiler's global package database #859
Conversation
baafa20
to
86495d5
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks great! This should help simplify Hazel's handling of unix
and maybe also rts
.
This requires building a bunch of Nix targets. Depending on how quickly the required changes go into upstream, we could also push to cachix, so that people working on rules_haskell don't have to rebuild locally.
@@ -80,7 +81,7 @@ def _haskell_doctest_single(target, ctx): | |||
File: the doctest log. | |||
""" | |||
|
|||
if HaskellInfo not in target: | |||
if HaskellInfo not in target or HaskellPrebuiltPackageInfo in target: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could we filter by whether the target is in an external workspace? This would also filter Hazel targets.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This test will be removed in a future PR (that actually and finally removes HaskellPrebuiltPackageInfo
). In this PR I'm just doing the simplest thing.
haskell/private/pkgdb_to_bzl.py
Outdated
pkg_confs = {} | ||
for conf in glob.glob(os.path.join(topdir, "package.conf.d", "*.conf")): | ||
with open(conf, 'r') as f: | ||
pkg_confs[conf] = f.read() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why load all files into memory before iterating over the pkg_confs
? Couldn't this loop be fused with the next?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point. I will fix this to be more lazy and reduce peak memory usage.
haskell/private/pkgdb_to_bzl.py
Outdated
if path_to_label(include_dir, pkg.pkgroot) | ||
], | ||
static_library = "glob({})".format([ | ||
path_to_label("{}/lib{}.a".format(library_dir, hs_library), pkg.pkgroot) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Doesn't this also catch profiling libraries libHSxyz_p.a
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In theory no. There is no wildcard in the glob here. The only reason we use glob
is so that if the library file is missing (some packages don't have both a shared and a static library), then it won't error out.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh, I see. Sorry, assumed there was a wild-card.
haskell/private/pkgdb_to_bzl.py
Outdated
for library_dir in pkg.dynamic_library_dirs + pkg.library_dirs | ||
if not path_to_label(library_dir, pkg.pkgroot) | ||
] + [ | ||
"-l{}".format(extra_library) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If extra_library
is a static library, do these linkopts
end up in the right position in the linker flags? I.e. behind the object that depends on the extra_library
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Needs a comment.
haskell/toolchain.bzl
Outdated
@@ -206,6 +212,10 @@ _haskell_toolchain = rule( | |||
doc = "GHC and executables that come with it. First item take precedance.", | |||
mandatory = True, | |||
), | |||
"libraries": attr.label_list( | |||
doc = "The set of libraries that come with GHC.", | |||
#mandatory = True, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
#mandatory = True, |
9800126
to
a3f2f1d
Compare
@@ -62,13 +63,13 @@ def _get_extra_libs(libnames, extra_libs): | |||
result.append(lookup) | |||
return result | |||
|
|||
def _get_core_dependency_includes(ghc_workspace): | |||
def _get_core_dependency_includes(name): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
IIRC this is only necessary, because haskell_import
used to not include headers. Now that it does, I suspect that we can completely drop this part.
deps[condition] += [_CORE_DEPENDENCY_INCLUDES[p]] | ||
# Create per-package unique import of core dependency. | ||
haskell_toolchain_library( | ||
name = _CORE_DEPENDENCY_INCLUDES[p], |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hazel generates haskell_toolchain_library
and cc_library
targets for all items of core_packages
. This includes unix
, and rts
. So, they should be available as @haskell_unix//:unix(-cbits)
and @haskell_rts//:rts(-cbits)
. However, the -cbits
targets are empty dummy targets. Now that haskell_toolchain_library
contains a CcInfo
, we can probably drop much of the cbits
logic.
@@ -425,6 +429,12 @@ def _get_build_attrs( | |||
|
|||
elibs_targets = _get_extra_libs(build_info.extraLibs, extra_libs) | |||
|
|||
# Create per-package unique import of core dependency. | |||
rts_import_name = "__hazel_{}_rts".format(name) | |||
haskell_toolchain_library( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same here. That should already exist in @haskell_rts//:rts
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could creating these per-package targets be the reason for the large (LD_)LIBRARY_PATH
. IIUC each transitive Hackage dependency will add another instance of these libraries to the dependencies.
@@ -200,7 +201,6 @@ def _get_build_attrs( | |||
generated_srcs_dir, | |||
extra_modules, | |||
ghc_version, | |||
ghc_workspace, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🎉 Good riddance!
@@ -436,7 +446,7 @@ def _get_build_attrs( | |||
]), | |||
defines = [o[2:] for o in build_info.ccOptions if o.startswith("-D")], | |||
textual_hdrs = list(headers), | |||
deps = ["{}//:rts-headers".format(ghc_workspace)] + select(cdeps) + cc_deps + elibs_targets, | |||
deps = [rts_import_name] + select(cdeps) + cc_deps + elibs_targets, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure if this is related to the MacOS segfault. But, a side effect of this is that HSrts-ghc8.6.4
appears in the extra-libraries
field of Hazel haskell_library targets.
Generally, in the following setup
haskell_library(name = "hs-lib-a", ...)
cc_library(name = "c-lib", deps = [":hs-lib-a"], ...)
haskell_library(name = "hs-lib-b", deps = [":c-lib", "hs-lib-a"], ...)
hs-lib-b
's package configuration file would list hs-lib-a
both in depends
and in extra-libraries
.
We initially modeled `haskell_toolchain` on `go_toolchain` in rules_go. `haskell_toolchain` is a macro that defines a private toolchain implementation rule and then the `toolchain` rule to specify execution and target constraints. This design, while convenient, creates problems. Ideally, we want the toolchain implementation definition to be next to the code, and we want the `toolchain` metadata to be in a separate repository. That way Bazel can see the `toolchain` definition without downloading the toolchain itself, and decide based on the platform constraints whether the toolchain should be selected during toolchain resolution. Keeping `haskell_toolchain` and `toolchain` separate matches the design of the CC rules, where there is likewise a `cc_toolchain` and `toolchain`. This is a breaking change: using `haskell_toolchain` now requires calling it and then also `toolchain`.
Exposing GHC's global package database to Bazel enables us to stop treating prebuilt libraries specially. They now export a `HaskellLibraryInfo` like any other library. Even better, they now also export a `CcInfo` provider like any other native code library, which means that CC rules have a full view of all static archives needed for linking a binary statically (see the `cc_haskell_import` test, which no longer requires `linkstatic = False`). `HaskellPrebuiltPackageInfo` is still there but now has no content. It's only there to prevent aspects from reaching too far. We'll get rid of it in future commits. Fixes #838
Transitive Haskell dependencies can show up in `extra-libraries` when these transitive dependencies arise via a CC library direct dependency. This is a problem for reasons explained in #873. This commit implements a simple temporary hack to avoid the problem.
Transitive Haskell dependencies can show up in the ext_libs dependencies when these transitive dependencies arise via a CC library direct dependency. This is a problem for reasons explained in #873. This commit implements a simple temporary hack to avoid the problem.
- rules_haskell now handles the global package db within Bazel tweag/rules_haskell#859 - We no longer use the Nix provided c2hs. So, we drop it. - Rename `ghcWithC2hs` to `ghcStatic` to clarify that that's where the static linking patches are applied. - Extend package-db patches to align Nix store paths with the new $out. This works around a restriction in current rules_haskell, where the paths in the package config files must have the same prefix as the path to the package config files themselves. - Don't exclude haskell libraries from extra-libraries entries. Test rules_haskell pkgdb-to-bzl fix-for-da Drop c2hs Fix GHC package-db patch
- rules_haskell now handles the global package db within Bazel tweag/rules_haskell#859 - We no longer use the Nix provided c2hs. So, we drop it. - Rename `ghcWithC2hs` to `ghcStatic` to clarify that that's where the static linking patches are applied. - Extend package-db patches to align Nix store paths with the new $out. This works around a restriction in current rules_haskell, where the paths in the package config files must have the same prefix as the path to the package config files themselves. - Don't exclude haskell libraries from extra-libraries entries. Test rules_haskell pkgdb-to-bzl fix-for-da Drop c2hs Fix GHC package-db patch
- rules_haskell now handles the global package db within Bazel tweag/rules_haskell#859 - We no longer use the Nix provided c2hs. So, we drop it. - Rename `ghcWithC2hs` to `ghcStatic` to clarify that that's where the static linking patches are applied. - Extend package-db patches to align Nix store paths with the new $out. This works around a restriction in current rules_haskell, where the paths in the package config files must have the same prefix as the path to the package config files themselves. - Don't exclude haskell libraries from extra-libraries entries.
- rules_haskell now handles the global package db within Bazel tweag/rules_haskell#859 - We no longer use the Nix provided c2hs. So, we drop it. - Rename `ghcWithC2hs` to `ghcStatic` to clarify that that's where the static linking patches are applied. - Extend package-db patches to align Nix store paths with the new $out. This works around a restriction in current rules_haskell, where the paths in the package config files must have the same prefix as the path to the package config files themselves. - Don't exclude haskell libraries from extra-libraries entries.
- rules_haskell now handles the global package db within Bazel tweag/rules_haskell#859 - We no longer use the Nix provided c2hs. So, we drop it. - Rename `ghcWithC2hs` to `ghcStatic` to clarify that that's where the static linking patches are applied. - Extend package-db patches to align Nix store paths with the new $out. This works around a restriction in current rules_haskell, where the paths in the package config files must have the same prefix as the path to the package config files themselves. - Don't exclude haskell libraries from extra-libraries entries.
Allow users to override the default BUILD file used in the GHC workspace in haskell_register_ghc_nixpkgs. The default (and previously enforced) is a BUILD file that assumes the presence of a BUILD.bzl file as generated in `haskell/nix/default.nix`.
Some fields in a package configuration file are allowed to occur multiple times, e.g. `extra-libraries`. pkbdb_to_bzl.py did not take this into account and only kept the value of the last entry, instead of merging multiple entries.
The hs-libraries field can be empty in custom package databases.
- rules_haskell now handles the global package db within Bazel tweag/rules_haskell#859 - We no longer use the Nix provided c2hs. So, we drop it. - Rename `ghcWithC2hs` to `ghcStatic` to clarify that that's where the static linking patches are applied. - Extend package-db patches to align Nix store paths with the new $out. This works around a restriction in current rules_haskell, where the paths in the package config files must have the same prefix as the path to the package config files themselves. - Don't exclude haskell libraries from extra-libraries entries.
- rules_haskell now handles the global package db within Bazel tweag/rules_haskell#859 - We no longer use the Nix provided c2hs. So, we drop it. - Rename `ghcWithC2hs` to `ghcStatic` to clarify that that's where the static linking patches are applied. - Extend package-db patches to align Nix store paths with the new $out. This works around a restriction in current rules_haskell, where the paths in the package config files must have the same prefix as the path to the package config files themselves. - Don't exclude haskell libraries from extra-libraries entries.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was able to test this on https://github.com/digital-asset/daml and see it succeed on Linux, MacOS, and Windows, by adding a few more commits to this PR:
- Make targets in the GHC bindist/nixpkgs workspace public. Our use-case is to access the
ghc-pkg
binary. - I've changed
haskell_register_ghc_nixpkgs
to allow users to specify custom BUILD files.
The default in rules_haskell assumes the presence ofBUILD.bzl
which is generated in rules_haskell'shaskell/nix/default.nix
and therefore typically not present in nixpkgs imports outside of the rules_haskell repository. pkgdb_to_bzl.py
assumed that each field in the package configuration file appeared exactly once. That is true for vanilla bindist/nixpkgs. However, multiple occurrences of the same field are allowed and should be combined. Our use-case is a patched GHC build with static-only linking. I've extendedpkgdb_to_bzl.py
accordingly.pkgdb_to_bzl.py
assumed that thehs-libraries
field is always present. In our use-case we've patched the GHC package-db for static-only linking and moved thehs-libraries
entries to theextra-libraries
. I've extendedpkgdb_to_bzl.py
to handle a missinghs-libraries
field.- Some of the changes in this PR had initially broken
alex
andhappy
. However, rules_haskell's CI didn't cover those tools. I've added them to the Hazel test step.
Additional note:
Unfortunately, pkgdb_to_bzl.py
breaks with nixpkgs' ghcWithPackages
. The latter regenerates the package-db in a new Nix store path. However, the entries in the package configuration files (e.g. include-dirs
) still point to the Nix store path of the original ghc derivation. This logic will then cause those entries to be stripped away silently, causing compiler errors due to missing headers and the like later on.
- It might be better to not silently strip those entries away, but instead fail right there if
pkgdb_to_bzl.py
cannot handle an entry. Missinginclude-dirs
etc. will cause failures later on anyway. This way it's at least clear to the user where the error originated. - It is possible to work around this issue by modifying
ghcWithPackages
to not only regenerate the package-db but also patch the entries to point to the new Nix store path. - I'm not sure how to fix this properly. Maybe if a path does not start with
pkgroot
pkgdb_to_bzl.py
could successively strip directories from the path and search for the remainder in the workspace?
Best to start with at least failing early for now. |
- rules_haskell now handles the global package db within Bazel tweag/rules_haskell#859 - We no longer use the Nix provided c2hs. So, we drop it. - Rename `ghcWithC2hs` to `ghcStatic` to clarify that that's where the static linking patches are applied. - Extend package-db patches to align Nix store paths with the new $out. This works around a restriction in current rules_haskell, where the paths in the package config files must have the same prefix as the path to the package config files themselves. - Don't exclude haskell libraries from extra-libraries entries.
- rules_haskell now handles the global package db within Bazel tweag/rules_haskell#859 - We no longer use the Nix provided c2hs. So, we drop it. - Rename `ghcWithC2hs` to `ghcStatic` to clarify that that's where the static linking patches are applied. - Extend package-db patches to align Nix store paths with the new $out. This works around a restriction in current rules_haskell, where the paths in the package config files must have the same prefix as the path to the package config files themselves. - Don't exclude haskell libraries from extra-libraries entries.
* Update rules_haskell - rules_haskell now handles the global package db within Bazel tweag/rules_haskell#859 - We no longer use the Nix provided c2hs. So, we drop it. - Rename `ghcWithC2hs` to `ghcStatic` to clarify that that's where the static linking patches are applied. - Extend package-db patches to align Nix store paths with the new $out. This works around a restriction in current rules_haskell, where the paths in the package config files must have the same prefix as the path to the package config files themselves. - Don't exclude haskell libraries from extra-libraries entries. * Drop redundant unix-compat override This is a left-over from when the package was patched. * Windows GHC bindist includes ffi header * Drop unused language-c Nix override
Exposing GHC's global package database to Bazel enables us to stop
treating prebuilt libraries specially. They now export
a
HaskellLibraryInfo
like any other library. Even better, they nowalso export a
CcInfo
provider like any other native code library,which means that CC rules have a full view of all static archives
needed for linking a binary statically (see the
cc_haskell_import
test, which no longer requires
linkstatic = False
).HaskellPrebuiltPackageInfo
is still there but now has no content.It's only there to prevent aspects from reaching too far. We'll get
rid of it in future commits.
To get here, we had to make the
haskell_toolchain
macro stopdefining a
toolchain
target. This design is less like rules_go, butmore like the built-in CC rules, so we're still covered by precedent.
Fixes #838