Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Bazelify compiler's global package database #859

Merged
merged 10 commits into from
May 17, 2019
Merged

Bazelify compiler's global package database #859

merged 10 commits into from
May 17, 2019

Conversation

mboes
Copy link
Member

@mboes mboes commented May 4, 2019

Exposing GHC's global package database to Bazel enables us to stop
treating prebuilt libraries specially. They now export
a HaskellLibraryInfo like any other library. Even better, they now
also export a CcInfo provider like any other native code library,
which means that CC rules have a full view of all static archives
needed for linking a binary statically (see the cc_haskell_import
test, which no longer requires linkstatic = False).

HaskellPrebuiltPackageInfo is still there but now has no content.
It's only there to prevent aspects from reaching too far. We'll get
rid of it in future commits.

To get here, we had to make the haskell_toolchain macro stop
defining a toolchain target. This design is less like rules_go, but
more like the built-in CC rules, so we're still covered by precedent.

Fixes #838

@mboes mboes requested a review from aherrmann May 4, 2019 12:09
@mboes mboes force-pushed the pkgdb-to-bzl branch 2 times, most recently from baafa20 to 86495d5 Compare May 4, 2019 12:53
Copy link
Member

@aherrmann aherrmann left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks great! This should help simplify Hazel's handling of unix and maybe also rts.

This requires building a bunch of Nix targets. Depending on how quickly the required changes go into upstream, we could also push to cachix, so that people working on rules_haskell don't have to rebuild locally.

@@ -80,7 +81,7 @@ def _haskell_doctest_single(target, ctx):
File: the doctest log.
"""

if HaskellInfo not in target:
if HaskellInfo not in target or HaskellPrebuiltPackageInfo in target:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could we filter by whether the target is in an external workspace? This would also filter Hazel targets.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This test will be removed in a future PR (that actually and finally removes HaskellPrebuiltPackageInfo). In this PR I'm just doing the simplest thing.

pkg_confs = {}
for conf in glob.glob(os.path.join(topdir, "package.conf.d", "*.conf")):
with open(conf, 'r') as f:
pkg_confs[conf] = f.read()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why load all files into memory before iterating over the pkg_confs? Couldn't this loop be fused with the next?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point. I will fix this to be more lazy and reduce peak memory usage.

if path_to_label(include_dir, pkg.pkgroot)
],
static_library = "glob({})".format([
path_to_label("{}/lib{}.a".format(library_dir, hs_library), pkg.pkgroot)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Doesn't this also catch profiling libraries libHSxyz_p.a?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In theory no. There is no wildcard in the glob here. The only reason we use glob is so that if the library file is missing (some packages don't have both a shared and a static library), then it won't error out.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, I see. Sorry, assumed there was a wild-card.

for library_dir in pkg.dynamic_library_dirs + pkg.library_dirs
if not path_to_label(library_dir, pkg.pkgroot)
] + [
"-l{}".format(extra_library)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If extra_library is a static library, do these linkopts end up in the right position in the linker flags? I.e. behind the object that depends on the extra_library.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Needs a comment.

@@ -206,6 +212,10 @@ _haskell_toolchain = rule(
doc = "GHC and executables that come with it. First item take precedance.",
mandatory = True,
),
"libraries": attr.label_list(
doc = "The set of libraries that come with GHC.",
#mandatory = True,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
#mandatory = True,

@mboes mboes force-pushed the pkgdb-to-bzl branch 2 times, most recently from 9800126 to a3f2f1d Compare May 6, 2019 21:23
@@ -62,13 +63,13 @@ def _get_extra_libs(libnames, extra_libs):
result.append(lookup)
return result

def _get_core_dependency_includes(ghc_workspace):
def _get_core_dependency_includes(name):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IIRC this is only necessary, because haskell_import used to not include headers. Now that it does, I suspect that we can completely drop this part.

deps[condition] += [_CORE_DEPENDENCY_INCLUDES[p]]
# Create per-package unique import of core dependency.
haskell_toolchain_library(
name = _CORE_DEPENDENCY_INCLUDES[p],
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hazel generates haskell_toolchain_library and cc_library targets for all items of core_packages. This includes unix, and rts. So, they should be available as @haskell_unix//:unix(-cbits) and @haskell_rts//:rts(-cbits). However, the -cbits targets are empty dummy targets. Now that haskell_toolchain_library contains a CcInfo, we can probably drop much of the cbits logic.

@@ -425,6 +429,12 @@ def _get_build_attrs(

elibs_targets = _get_extra_libs(build_info.extraLibs, extra_libs)

# Create per-package unique import of core dependency.
rts_import_name = "__hazel_{}_rts".format(name)
haskell_toolchain_library(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same here. That should already exist in @haskell_rts//:rts.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could creating these per-package targets be the reason for the large (LD_)LIBRARY_PATH. IIUC each transitive Hackage dependency will add another instance of these libraries to the dependencies.

@@ -200,7 +201,6 @@ def _get_build_attrs(
generated_srcs_dir,
extra_modules,
ghc_version,
ghc_workspace,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🎉 Good riddance!

@@ -436,7 +446,7 @@ def _get_build_attrs(
]),
defines = [o[2:] for o in build_info.ccOptions if o.startswith("-D")],
textual_hdrs = list(headers),
deps = ["{}//:rts-headers".format(ghc_workspace)] + select(cdeps) + cc_deps + elibs_targets,
deps = [rts_import_name] + select(cdeps) + cc_deps + elibs_targets,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure if this is related to the MacOS segfault. But, a side effect of this is that HSrts-ghc8.6.4 appears in the extra-libraries field of Hazel haskell_library targets.

Generally, in the following setup

haskell_library(name = "hs-lib-a", ...)
cc_library(name = "c-lib", deps = [":hs-lib-a"], ...)
haskell_library(name = "hs-lib-b", deps = [":c-lib", "hs-lib-a"], ...)

hs-lib-b's package configuration file would list hs-lib-a both in depends and in extra-libraries.

mboes and others added 5 commits May 13, 2019 13:59
We initially modeled `haskell_toolchain` on `go_toolchain` in
rules_go. `haskell_toolchain` is a macro that defines a private
toolchain implementation rule and then the `toolchain` rule to specify
execution and target constraints. This design, while convenient,
creates problems. Ideally, we want the toolchain implementation
definition to be next to the code, and we want the `toolchain`
metadata to be in a separate repository. That way Bazel can see the
`toolchain` definition without downloading the toolchain itself, and
decide based on the platform constraints whether the toolchain should
be selected during toolchain resolution.

Keeping `haskell_toolchain` and `toolchain` separate matches the
design of the CC rules, where there is likewise a `cc_toolchain` and
`toolchain`.

This is a breaking change: using `haskell_toolchain` now requires
calling it and then also `toolchain`.
Exposing GHC's global package database to Bazel enables us to stop
treating prebuilt libraries specially. They now export
a `HaskellLibraryInfo` like any other library. Even better, they now
also export a `CcInfo` provider like any other native code library,
which means that CC rules have a full view of all static archives
needed for linking a binary statically (see the `cc_haskell_import`
test, which no longer requires `linkstatic = False`).

`HaskellPrebuiltPackageInfo` is still there but now has no content.
It's only there to prevent aspects from reaching too far. We'll get
rid of it in future commits.

Fixes #838
Transitive Haskell dependencies can show up in `extra-libraries` when
these transitive dependencies arise via a CC library direct
dependency. This is a problem for reasons explained in #873. This
commit implements a simple temporary hack to avoid the problem.
Transitive Haskell dependencies can show up in the ext_libs dependencies
when these transitive dependencies arise via a CC library direct
dependency. This is a problem for reasons explained in #873. This commit
implements a simple temporary hack to avoid the problem.
aherrmann-da pushed a commit to digital-asset/daml that referenced this pull request May 14, 2019
- rules_haskell now handles the global package db within Bazel
    tweag/rules_haskell#859
- We no longer use the Nix provided c2hs. So, we drop it.
- Rename `ghcWithC2hs` to `ghcStatic` to clarify that that's where the
    static linking patches are applied.
- Extend package-db patches to align Nix store paths with the new $out.
    This works around a restriction in current rules_haskell, where
    the paths in the package config files must have the same prefix as
    the path to the package config files themselves.
- Don't exclude haskell libraries from extra-libraries entries.

Test rules_haskell pkgdb-to-bzl

fix-for-da

Drop c2hs

Fix GHC package-db patch
aherrmann-da pushed a commit to digital-asset/daml that referenced this pull request May 14, 2019
- rules_haskell now handles the global package db within Bazel
    tweag/rules_haskell#859
- We no longer use the Nix provided c2hs. So, we drop it.
- Rename `ghcWithC2hs` to `ghcStatic` to clarify that that's where the
    static linking patches are applied.
- Extend package-db patches to align Nix store paths with the new $out.
    This works around a restriction in current rules_haskell, where
    the paths in the package config files must have the same prefix as
    the path to the package config files themselves.
- Don't exclude haskell libraries from extra-libraries entries.

Test rules_haskell pkgdb-to-bzl

fix-for-da

Drop c2hs

Fix GHC package-db patch
aherrmann-da pushed a commit to digital-asset/daml that referenced this pull request May 16, 2019
- rules_haskell now handles the global package db within Bazel
    tweag/rules_haskell#859
- We no longer use the Nix provided c2hs. So, we drop it.
- Rename `ghcWithC2hs` to `ghcStatic` to clarify that that's where the
    static linking patches are applied.
- Extend package-db patches to align Nix store paths with the new $out.
    This works around a restriction in current rules_haskell, where
    the paths in the package config files must have the same prefix as
    the path to the package config files themselves.
- Don't exclude haskell libraries from extra-libraries entries.
aherrmann-da pushed a commit to digital-asset/daml that referenced this pull request May 16, 2019
- rules_haskell now handles the global package db within Bazel
    tweag/rules_haskell#859
- We no longer use the Nix provided c2hs. So, we drop it.
- Rename `ghcWithC2hs` to `ghcStatic` to clarify that that's where the
    static linking patches are applied.
- Extend package-db patches to align Nix store paths with the new $out.
    This works around a restriction in current rules_haskell, where
    the paths in the package config files must have the same prefix as
    the path to the package config files themselves.
- Don't exclude haskell libraries from extra-libraries entries.
aherrmann-da pushed a commit to digital-asset/daml that referenced this pull request May 16, 2019
- rules_haskell now handles the global package db within Bazel
    tweag/rules_haskell#859
- We no longer use the Nix provided c2hs. So, we drop it.
- Rename `ghcWithC2hs` to `ghcStatic` to clarify that that's where the
    static linking patches are applied.
- Extend package-db patches to align Nix store paths with the new $out.
    This works around a restriction in current rules_haskell, where
    the paths in the package config files must have the same prefix as
    the path to the package config files themselves.
- Don't exclude haskell libraries from extra-libraries entries.
aherrmann added 4 commits May 16, 2019 17:19
Allow users to override the default BUILD file used in the GHC workspace
in haskell_register_ghc_nixpkgs. The default (and previously enforced)
is a BUILD file that assumes the presence of a BUILD.bzl file as
generated in `haskell/nix/default.nix`.
Some fields in a package configuration file are allowed to occur
multiple times, e.g. `extra-libraries`. pkbdb_to_bzl.py did not take
this into account and only kept the value of the last entry, instead of
merging multiple entries.
The hs-libraries field can be empty in custom package databases.
aherrmann-da pushed a commit to digital-asset/daml that referenced this pull request May 16, 2019
- rules_haskell now handles the global package db within Bazel
    tweag/rules_haskell#859
- We no longer use the Nix provided c2hs. So, we drop it.
- Rename `ghcWithC2hs` to `ghcStatic` to clarify that that's where the
    static linking patches are applied.
- Extend package-db patches to align Nix store paths with the new $out.
    This works around a restriction in current rules_haskell, where
    the paths in the package config files must have the same prefix as
    the path to the package config files themselves.
- Don't exclude haskell libraries from extra-libraries entries.
aherrmann-da pushed a commit to digital-asset/daml that referenced this pull request May 17, 2019
- rules_haskell now handles the global package db within Bazel
    tweag/rules_haskell#859
- We no longer use the Nix provided c2hs. So, we drop it.
- Rename `ghcWithC2hs` to `ghcStatic` to clarify that that's where the
    static linking patches are applied.
- Extend package-db patches to align Nix store paths with the new $out.
    This works around a restriction in current rules_haskell, where
    the paths in the package config files must have the same prefix as
    the path to the package config files themselves.
- Don't exclude haskell libraries from extra-libraries entries.
Copy link
Member

@aherrmann aherrmann left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was able to test this on https://github.com/digital-asset/daml and see it succeed on Linux, MacOS, and Windows, by adding a few more commits to this PR:

  • Make targets in the GHC bindist/nixpkgs workspace public. Our use-case is to access the ghc-pkg binary.
  • I've changed haskell_register_ghc_nixpkgs to allow users to specify custom BUILD files.
    The default in rules_haskell assumes the presence of BUILD.bzl which is generated in rules_haskell's haskell/nix/default.nix and therefore typically not present in nixpkgs imports outside of the rules_haskell repository.
  • pkgdb_to_bzl.py assumed that each field in the package configuration file appeared exactly once. That is true for vanilla bindist/nixpkgs. However, multiple occurrences of the same field are allowed and should be combined. Our use-case is a patched GHC build with static-only linking. I've extended pkgdb_to_bzl.py accordingly.
  • pkgdb_to_bzl.py assumed that the hs-libraries field is always present. In our use-case we've patched the GHC package-db for static-only linking and moved the hs-libraries entries to the extra-libraries. I've extended pkgdb_to_bzl.py to handle a missing hs-libraries field.
  • Some of the changes in this PR had initially broken alex and happy. However, rules_haskell's CI didn't cover those tools. I've added them to the Hazel test step.

Additional note:

Unfortunately, pkgdb_to_bzl.py breaks with nixpkgs' ghcWithPackages. The latter regenerates the package-db in a new Nix store path. However, the entries in the package configuration files (e.g. include-dirs) still point to the Nix store path of the original ghc derivation. This logic will then cause those entries to be stripped away silently, causing compiler errors due to missing headers and the like later on.

  • It might be better to not silently strip those entries away, but instead fail right there if pkgdb_to_bzl.py cannot handle an entry. Missing include-dirs etc. will cause failures later on anyway. This way it's at least clear to the user where the error originated.
  • It is possible to work around this issue by modifying ghcWithPackages to not only regenerate the package-db but also patch the entries to point to the new Nix store path.
  • I'm not sure how to fix this properly. Maybe if a path does not start with pkgroot pkgdb_to_bzl.py could successively strip directories from the path and search for the remainder in the workspace?

@mboes
Copy link
Member Author

mboes commented May 17, 2019

This logic will then cause those entries to be stripped away silently, causing compiler errors due to missing headers and the like later on.

Best to start with at least failing early for now.

@mboes mboes merged commit 1abeb65 into master May 17, 2019
@mboes mboes deleted the pkgdb-to-bzl branch May 17, 2019 10:25
aherrmann-da pushed a commit to digital-asset/daml that referenced this pull request May 17, 2019
- rules_haskell now handles the global package db within Bazel
    tweag/rules_haskell#859
- We no longer use the Nix provided c2hs. So, we drop it.
- Rename `ghcWithC2hs` to `ghcStatic` to clarify that that's where the
    static linking patches are applied.
- Extend package-db patches to align Nix store paths with the new $out.
    This works around a restriction in current rules_haskell, where
    the paths in the package config files must have the same prefix as
    the path to the package config files themselves.
- Don't exclude haskell libraries from extra-libraries entries.
aherrmann-da pushed a commit to digital-asset/daml that referenced this pull request May 20, 2019
- rules_haskell now handles the global package db within Bazel
    tweag/rules_haskell#859
- We no longer use the Nix provided c2hs. So, we drop it.
- Rename `ghcWithC2hs` to `ghcStatic` to clarify that that's where the
    static linking patches are applied.
- Extend package-db patches to align Nix store paths with the new $out.
    This works around a restriction in current rules_haskell, where
    the paths in the package config files must have the same prefix as
    the path to the package config files themselves.
- Don't exclude haskell libraries from extra-libraries entries.
aherrmann-da added a commit to digital-asset/daml that referenced this pull request May 20, 2019
* Update rules_haskell

- rules_haskell now handles the global package db within Bazel
    tweag/rules_haskell#859
- We no longer use the Nix provided c2hs. So, we drop it.
- Rename `ghcWithC2hs` to `ghcStatic` to clarify that that's where the
    static linking patches are applied.
- Extend package-db patches to align Nix store paths with the new $out.
    This works around a restriction in current rules_haskell, where
    the paths in the package config files must have the same prefix as
    the path to the package config files themselves.
- Don't exclude haskell libraries from extra-libraries entries.

* Drop redundant unix-compat override

This is a left-over from when the package was patched.

* Windows GHC bindist includes ffi header

* Drop unused language-c Nix override
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Support static linking of Haskell libraries in CC binaries
2 participants