-
Notifications
You must be signed in to change notification settings - Fork 81
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Update to htslib 1.10.2 #184
Update to htslib 1.10.2 #184
Conversation
Small side-detail CI/CD detail: How come there's a "local", deprecated rust-bio htslib fork when we could be tracking tags/versions from upstream's htslib?
I just noticed TravisCI failing right now since I didn't realise that rust-htslib was pointing to rust-bio's outdated htslib fork :-S |
@brainstorm https://github.com/brainstorm/rust-htslib/pull/1 should fix the issue with the unknown |
Fix compression levels
…large_positions.md to spot 64 bit reference position changes and sam_* function renaming. Only 4 errors left, but need to closely check types afterwards very carefully. Specially for the binary formats where this change does not apply and (sub)fields that do not necessarily require big representations
Thanks @juliangehring, it's good to have more eyes on this! Please feel free to pull in my last commit and quickly review? I'm paranoid on putting an i64 on the wrong place and blow things up for somebody down the line ;) Speaking of which I'll try to fix the TravisCI submodules issue next so that the tests run. I still don't get why rust-bio keeps an outdated fork in the org when it can perfectly point the submodule upstream, which is to my eyes more desirable? :-S |
…t-bio/rust-htslib/jobs/645812050#L166, rollback and try with bionic ubuntu distro. It does compile locally on OSX
…compiles fine in OSX and fails on Linux now...
…es.ubuntu.com/source/disco/htslib ... and it even comes with pre-shipped libcurl4-gnutls-dev, so no need to install it explicitly?: https://travis-ci.org/rust-bio/rust-htslib/jobs/645816116#L160
…fix one testsuite warning
EDIT: nevermind, fixed, at least for OSX, need to test other platforms ;) |
hts-sys/Cargo.toml
Outdated
@@ -8,7 +8,7 @@ description = "This library provides HTSlib bindings." | |||
readme = "README.md" | |||
keywords = ["htslib", "bam", "bioinformatics", "pileup", "sequencing"] | |||
license = "MIT" | |||
repository = "https://github.com/rust-bio/rust-htslib.git" | |||
repository = "https://github.com/samtools/rust-htslib.git" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That line now points to the wrong repo, I'm afraid :)
repository = "https://github.com/samtools/rust-htslib.git" | |
repository = "https://github.com/rust-bio/rust-htslib.git" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @jch-13 for the quick insight but as I pointed out earlier I think that rust-htslib should avoid this antipattern, for the following reasons:
- Keeping a local, rust-bio org, htslib fork might encourage small tweaks to that package, making it diverge from upstream and making future rust-htslib upgrades painful.
- Any required change to
htslib
should be upstreamed anyway, so there's no need to keep a local (outdated) fork that needs to be updated periodically. - Since we use submodules to point to htslib, we might as well just point to specific tags and commits from upstream as I'm doing here since I'm pointing to version 1.10.2 of htslib from the samtools repo.
Hope those make sense? I would definitely like to hear more opinions, perhaps I'm missing very valid reasons to keep that htslib repo/fork around?
Cheers!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think there's some confusion here. The repository =
entry in the package
section of the Cargo.toml
merely points to a place, where more information / documentation can be found about this package. Thus, to accurately reflect how hts-sys
currently points to the rust-bio
fork of htslib
, it should actually read:
repository = "https://github.com/rust-bio/htslib.git"
However, I think we can revert to actually pointing to the original htslib
repo (see below for the history of this). Thus, this submodule import should be changed to point to the htslib 1.10.2
release you are updating the wrapper to and the pointer here should be changed to:
repository = "https://github.com/samtools/htslib.git"
However, the current suggested change (repository = "https://github.com/samtools/rust-htslib.git"
) simply doesn't exist (with the rust-
prefix remaining in the repo name).
As to the history of this, I did a bit of digging:
This was originally introduced to have a commit applied that was not part of the htslib releases at the time. See for example PR #121 and especially the commit that introduced this. However, if you go over to the htslib repo, you will find that the respective commit 52368d672ce4b33a607688fc987ff396bdce68a1
has been released as of htslib 1.10
, see e.g. the 1.10
release notes and the responsible htslib pull request 766 which contains this commit. Thus, as part of this update of rust-htslib
to htslib 1.10.2
, it should be OK to revert to pointing to the original samtools/htslib.git
repo in its release 1.10.2
state.
Hope, I cleared this up a bit?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
P.S.: I see that the submodule obviously already points to the correct upstream release tag, so just correct the repository entry with the rust-
prefix released. And we can now all feel safe about doing this, knowing the history of it all... :D
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Argh, you are absolutely right, thanks @dlaehnemann, sorry @jch-13 for the misunderstanding. I'll amend that right now ;)
…o/rust-htslib/jobs/647667984#L212 ... Not sure if travis_wait will work with homebrew addons?: https://docs.travis-ci.com/user/common-build-problems/#build-times-out-because-no-output-was-received... trying
For future reference, the flags required (shown as htslib Makefile diff, but can be passed as env, of course) for an OSX host with (base) rvalls@umccr htslib % git diff
diff --git a/Makefile b/Makefile
index e8101d5..5f2eb8f 100644
--- a/Makefile
+++ b/Makefile
@@ -22,20 +22,20 @@
# FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
# DEALINGS IN THE SOFTWARE.
-CC = gcc
-AR = ar
-RANLIB = ranlib
+CC = x86_64-linux-musl-gcc -static
+AR = x86_64-linux-musl-ar
+RANLIB = x86_64-linux-musl-ranlib
# Default libraries to link if configure is not used
htslib_default_libs = -lz -lm -lbz2 -llzma -lcurl
-CPPFLAGS =
+CPPFLAGS = -I/usr/local/opt/bzip2/include -I/usr/local/Cellar/curl-openssl/7.68.0/include -I/usr/local/Cellar/xz/5.2.4/include -I/usr/local/opt/zlib/include
# TODO: make the 64-bit support for VCF optional via configure, for now add -DVCF_ALLOW_INT64
# to CFLAGS manually, here or in config.mk if the latter exists.
# TODO: probably update cram code to make it compile cleanly with -Wc++-compat
# For testing strict C99 support add -std=c99 -D_XOPEN_SOURCE=600
-#CFLAGS = -g -Wall -O2 -pedantic -std=c99 -D_XOPEN_SOURCE=600
-CFLAGS = -g -Wall -O2 -fvisibility=hidden
+CFLAGS = -g -Wall -O2 -pedantic -std=c99 -D_XOPEN_SOURCE=600
+#CFLAGS = -g -Wall -O2 -fvisibility=hidden
EXTRA_CFLAGS_PIC = -fpic
LDFLAGS = -fvisibility=hidden
LIBS = $(htslib_default_libs) This is assuming that the required packages are installed via Homebrew:
Next up, let's figure out a more portable/clean version of that leveraging |
…ttps://github.com/brainstorm/homebrew-musl-cross/commit/9304bf9cd6bc2b784083948771977d502030815a, high sierra bottle coming up soon. Cargo build script draft
I just noticed that @johanneskoester went for GitHub Actions instead of TravisCI and musl via #191, cool! Today I've homebrew-bottled musl for OSX catalina and mojave, otherwise the builds take unnecessarily way too long, so the github actions can use it via homebrew too as: $ brew install brainstorm/musl-cross/musl-cross This could bypass the need to use docker containers, but it's really up to you guys on how you want to deliver this crate for devs? Of course I'll be rebasing all the TravisCI tests and rollbacks out from this PR soon, btw, too many unrelated CI tests for what was supposed to be a htslib-1.10.2 version bump. |
Btw @johanneskoester, I'm hitting this issue with GitHub actions and clippy on this very fork/pullrequest, see: https://github.com/rust-bio/rust-htslib/pull/184/checks?check_run_id=499633681#step:7:212 |
After a few hours of testing if we can get away without docker for the musl-static building, I don't think that it's very ergonomic to have reliable/portable musl builds for rust-htslib devs without docker due to tricky curl+openssl dependencies on the new htslib. As an example, see the following Dockerfile and imagine a dev machine getting tweaked that way to have a proper setup: https://github.com/clux/muslrust/blob/master/Dockerfile So, given that @johanneskoester is already on track with the new github actions with musl support via cross, I will assume this docker-based building strategy (or clux, or muslrust or whatever other container) will be the default from now on. |
@brainstorm sounds good. I have no real opinion on musl. Cross seems the way to go indeed. |
I would like to merge this as soon as possible, cause we are hit by yet another htslib bug which I hope to fix by upgrading: varlociraptor/varlociraptor#101. |
Do you think you can quickly fix the musl-related errors with the docker approach? If not, let us deactivate musl tests for now, and fix it in a later PR. |
@johanneskoester The musl stuff is indeed non trivial, specially with the openssl+curl combo. I guess it'll take me a few more hours to crack that nut, imho. So yes, please, go ahead and prepare to merge this PR w/o musl for now? Feel free to rebase and make any changes you see fit from this point on, going to sleep now ;) /cc @ohofmann |
* Update to htslib 1.10.2 (#184) * Pull 1.10.2 submodule. Initial pass on i32->i64 type conversion, will eventually fix issue #183 * Fixes issue #109 while I'm at it, long hanging fruit fix * Fix compression levels * Going through https://github.com/samtools/htslib/blob/develop/README.large_positions.md to spot 64 bit reference position changes and sam_* function renaming. Only 4 errors left, but need to closely check types afterwards very carefully. Specially for the binary formats where this change does not apply and (sub)fields that do not necessarily require big representations * Repoint submodule to upstream release tag * Switch from rust-bio hosted htslib to upstream's samtools htslib. * Try to amend submodule * Handle submodules manually on TravisCI, since it is failing now: https://travis-ci.org/rust-bio/rust-htslib/builds/645772614 * Odd balls that would need review marked with XXX. New htslib seems to require libcurl * Add curl-sys crate for hplugin_curl, aws s3 et al * Add conditional libcurl compilation flags * Be nice with cargo fmt [ci skip] * Try installing libcurl meta-package instead of the openssl dependent one?: https://travis-ci.org/rust-bio/rust-htslib/builds/645790138#L1529 * Does not work with libcurl-dev metapackage: https://travis-ci.org/rust-bio/rust-htslib/jobs/645812050#L166, rollback and try with bionic ubuntu distro. It does compile locally on OSX * Add matrix compile support for TravisCI, I would like to know why it compiles fine in OSX and fails on Linux now... * PPA source no longer needed for ubuntu bionic? [ci skip] * Bionic packages htslib with gnutls instead of openssl: https://packages.ubuntu.com/source/disco/htslib ... and it even comes with pre-shipped libcurl4-gnutls-dev, so no need to install it explicitly?: https://travis-ci.org/rust-bio/rust-htslib/jobs/645816116#L160 * sed is not gsed on OSX... not entirely sure this substitution is 100% needed * Add libssl-dev to the mix for the openssl-sys: https://travis-ci.org/rust-bio/rust-htslib/jobs/645825022#L1000. Testsuite seems to run partially now: https://travis-ci.org/rust-bio/rust-htslib/jobs/645825022#L623 * Add build-essential in hopes to fix libcurl issue, incorporate #185, fix one testsuite warning * Add curl on default features for hts-sys * Point to the right htslib upstream repo, thanks @dlaehnemann [skip ci] * Appease OSX musl-cross toolchain requirements: https://travis-ci.org/rust-bio/rust-htslib/jobs/645907695#L1222 * TravisCI timesout installing brew deps: https://travis-ci.org/rust-bio/rust-htslib/jobs/647667984#L212 ... Not sure if travis_wait will work with homebrew addons?: https://docs.travis-ci.com/user/common-build-problems/#build-times-out-because-no-output-was-received... trying * Try workaround for https://twitter.com/braincode/status/1226103002550849542 * Do not update homebrew (time intensive), switch from travis_wait to verbose to keep the build output being generated, hopefully * brew install --verbose is not verbose enough for travis, reverting travis_wait and bumping to 40min for brew deps install and musl-cross * Explicitly install system libcurl on linux and osx * This .cargo/config makes it test right locally: rust-lang/rust#60149 (comment), only 5 tests failing now and no linking errors from libcurl (on OSX, at least) :) * Cargo.toml missing quote misshap * Same linker flags should apply for Linux builds? build-essential not necessary * tid is i64 now, CRAM vs BAM test l_data is off by 3 and m_data is off too * There should be no need for travis_wait with -v * Debugging htslib structs with @jkbonfield guidance... * Disable osx and musl-cross homebrew on the TravisCI matrix since it times out anyway, start looking into l_extranul and alignment issues for CRAM/BAM. Add curl_sys crate in hts-sys/src/lib.rs. * Mystery solved for test_read_cram, testing some of the struct fields, explicitly state that l_data should differ according to #184 (comment) * Extract selective CRAM/BAM field comparison to test helper function * Formatting * Temporarily commenting the hfile format category detection makes .bed filetype tests pass but .BAI idx locator fails * Format detection seems to be broken upstream on htslib, see #184 (comment) * Revert to libcurl4-openssl-dev instead of gnutls. All TravisCI script targets compiling and passing tests otherwise * Bump up curl-sys version * Experimenting musl cross-compilation with rustembedded cross. Ideally this would get rid of the homebrew musl-cross issue (too long compile times). /cc @nlhepler * Remove assert_ne since cannot guarantee that l_data is different on all bam/crams * Add minimum deps needed for rustembedded cross docker container * Run docker in TravisCI, even if it is only supported on linux jobs :/ * Add CFLAGS suggested by @nlhepler * Separate musl from GNU toolchains on Rust cross * Bump up bindgen, vendorize openssl, further experimentation with compile flags * Deprecate the mixed GNU/MUSL container * Add minimal explanation on how to use the new docker files [ci skip] * cross 0.2.0 with openssl vendored. Progress but more htslib compile time issues remain still * Make sure libzma is present in both containers [ci skip] * Focusing on LLVMv8 GNU cross build first, musl will come later on (if possible w/ htslib) * Two more usize that should be u64 * Amend and settle BED header debate * Rename test name accordingly * Transition to cross instead of cargo for builds and tests * All GNU/clang/llvm8 builds and tests seem to succeed, transitioning to musl testing now... * Remove stray (repeated) musl build * Remove @FiloSottile musl-cross homebrew formula since it takes too long to compile and timeouts CI, using clux/muslrust gets me a bit farther. Remove local rustup musl target... since htslib 1.10, the introduction of openssl complicated local compilation w/ musl greatly, unfortunately we have to resort to docker these days instead :/ * Experiment with toolchain setup in https://gitlab.com/rust_musl_docker/image. Remove vendored openssl since it's shipped in cross containers. * Remove build.rs flags noise and different tests, containers should handle them transparently depending on the defined environment within, thanks @nlhepler * Indicate dependencies and build flags required for OSX [ci skip]. * enable musl builds * add libclang * minor * dbg * revert * dbg * dbg * Revert cross/docker madness, bottled up musl-cross myself over here: https://github.com/brainstorm/homebrew-musl-cross/commit/9304bf9cd6bc2b784083948771977d502030815a, high sierra bottle coming up soon. Cargo build script draft * Add in openssl to hts-sys * Make clippy boolean logic happy * Remove homebrew OSX check since we'll just use rustembedded cross Co-authored-by: Julian Gehring <[email protected]> Co-authored-by: Johannes Köster <[email protected]> * disable musl tests Co-authored-by: Roman Valls Guimera <[email protected]> Co-authored-by: Julian Gehring <[email protected]>
Thanks to everybody! |
Last mile TODO:
Determine ifmusl
can be used to statically compile htslib (via rustembbeded'scross
tool). Also see Static compilation possible? #80. This is useful for reliable AWS Lambda runtime and deployment.Go through @brentp equivalent Nim-htslib migration and compare notes: brentp/hts-nim@1dc9d41Fix.musl
building as attempted in enable musl builds #191Make sure AWS S3.s3_vopen()
works correctly (my personal main aim with this PR)Rebase to remove all CI test commits and merge to master🎉Please feel free to jump in and pullrequest against this branch and I'll merge in/rebase/coordinate the changes as fast as I can since I would like to have this shipped relatively soon.