-
-
Notifications
You must be signed in to change notification settings - Fork 5.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
improve --heap-size-hint
arg handling
#48050
Conversation
Previously, --heap-size-hint would silently ignore many flavors of "bad" input, parsing things like "3gigs" as 3 bytes. This change makes it significantly less permissive, erroring unless it can parse a number (still relying on the C `sscanf` `%Lf` format specifier there) with an optional unit (case-insensitive, either with or without the trailing `b`). Also test it.
Is this ready to go? |
Is this ready to merge? |
Backported PRs: - [x] #47782 <!-- Generalize Bool parse method to AbstractString --> - [x] #48634 <!-- Remove unused "deps" mechanism in internal sorting keywords [NFC] --> - [x] #49931 <!-- Lock finalizers' lists at exit --> - [x] #50064 <!-- Fix numbered prompt with input only with comment --> - [x] #50474 <!-- docs: Fix a `!!! note` which was miscapitalized --> - [x] #50516 <!-- Fix visibility of assert on GCC12/13 --> - [x] #50635 <!-- `versioninfo()`: include build info and unofficial warning --> - [x] #49915 <!-- Revert "Remove number / vector (#44358)" --> - [x] #50781 <!-- fix `bit_map!` with aliasing --> - [x] #50845 <!-- fix #50438, use default pool for at-threads --> - [x] #49031 <!-- Update inference.md --> - [x] #50289 <!-- Initialize prev_nold and nold in gc_reset_page --> - [x] #50559 <!-- Expand kwcall lowering positional default check to vararg --> - [x] #49582 <!-- Update HISTORY.md for `DelimitedFiles` --> - [x] #50341 <!-- invokelatest docs should say not exported before 1.9 --> - [x] #50525 <!-- only check that values are finite in `generic_lufact` when `check=true` --> - [x] #50444 <!-- Optimize getfield lowering to avoid boxing in some cases --> - [x] #50523 <!-- Avoid generic call in most cases for getproperty --> - [x] #50860 <!-- Add `Base.get_extension` to docs/API --> - [x] #50164 <!-- codegen: handle dead code with unsafe_store of FCA pointers --> - [x] #50568 <!-- `Array(::AbstractRange)` should return an `Array` --> - [x] #50871 <!-- macOS: Don't inspect dead threadtls during exception handling. --> Need manual backport: - [ ] #48542 <!-- Add docs on task-specific buffering using multithreading --> - [ ] #50591 <!-- build: fix various makefile bugs --> Non-merged PRs with backport label: - [ ] #50842 <!-- Avoid race conditions with recursive rm --> - [ ] #50823 <!-- Make ranges more robust with unsigned indexes. --> - [ ] #50663 <!-- Fix Expr(:loopinfo) codegen --> - [ ] #49716 <!-- Update varinfo() docstring signature --> - [ ] #49713 <!-- prevent REPL from erroring in numbered mode in some situations --> - [ ] #49573 <!-- Implement jl_cpu_pause on PPC64 --> - [ ] #48726 <!-- fix macro expansion of property destructuring --> - [ ] #48642 <!-- Use gc alloc instead of alloc typed in lowering --> - [ ] #48183 <!-- Don't use pkgimage for package if any includes fall in tracked path for coverage or alloc tracking --> - [ ] #48050 <!-- improve `--heap-size-hint` arg handling --> - [ ] #47615 <!-- Allow threadsafe access to buffer of type inference profiling trees -->
@mbauman do you still have plans to update this, taking the existing comments into account? |
Oh yeah, this was a bit of a fly-by code dump here. Let's see if we can quickly get this back into shape. I've removed the debugging test code (tests pass locally; I don't recall whether I was trying to debug something in CI?). And I've changed the docs to match with #50480. This looking better? |
g = 1024m | ||
t = 1024g | ||
@testset "--heap-size-hint=$str" for (str, val) in [("1", 1), ("1e7", 1e7), ("2.5e7", 2.5e7), ("1MB", 1m), ("2.5g", 2.5g), ("1e4kB", 1e4k), | ||
("1e100", typemax(UInt64)), ("1e500g", typemax(UInt64)), ("1e-12t", 1), ("500000000b", 500000000)] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmmm, clearly I did not look closely enough before: I thought this was about adding support for inputs like 2.5g
which seems reasonable to me
But is a heap size hint of the form 1e-12t
or 1e100
really something we want to / should support?
I guess it would be more work to not support them, though. Ah well.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This PR doesn't change the numeric syntaxes supported here — it just adds tests for them.
The big change here isn't about the supported numeric values, but rather the byte unit. Before Julia would silently discard/ignore unrecognized units. After this PR, Julia errors. That's the big improvement.
Currently:
▶ julia --heap-size-hint=2.5GB -E "Base.JLOptions().heap_size_hint"
0x0000000000000002
▶ julia --heap-size-hint=2.5Gibberish -E "Base.JLOptions().heap_size_hint"
0x0000000000000002
▶ julia --heap-size-hint=2.5Gobbledeegook -E "Base.JLOptions().heap_size_hint"
0x0000000000000a00
After:
▶ ./julia --heap-size-hint=2.5GB -E "Base.JLOptions().heap_size_hint"
0x00000000a0000000
▶ ./julia --heap-size-hint=2.5Gibberish -E "Base.JLOptions().heap_size_hint"
ERROR: julia: invalid argument to --heap-size-hint (2.5Gibberish)
▶ ./julia --heap-size-hint=2.5Gobbledeegook -E "Base.JLOptions().heap_size_hint"
ERROR: julia: invalid argument to --heap-size-hint (2.5Gobbledeegook)
Apparently something is wrong with |
Also needs a rebase to fix some conflicting doc changes |
static char ascii_tolower(char c) | ||
{ | ||
if ('A' <= c && c <= 'Z') | ||
return c - 'A' + 'a'; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So, let me recap: this PR changed things to use tolower
, which has issues, so it was suggested to use utf8proc_tolower
instead which then was replaced by this hand-written function...
How about my original suggestion to just stick with what the code did before this PR, and simply have two cases in the switch
for each letter? That seems by far simplest and is guaranteed to have no UTF-8 related issues or anything else and requires no additional explanation?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Haha, yeah. But this works too, and there are a couple extra cases now, including one that is not in a switch statement, so it is not entirely obvious to me that it is significantly better one way or the other; just a style preference.
Previously, `--heap-size-hint` would silently ignore many flavors of "bad" input, parsing things like "3PB" as 3 bytes. This change makes it significantly less permissive, erroring unless it can parse a number (still relying on the C `sscanf` `%Lf` format specifier there) with an optional unit (case-insensitive, either with or without the trailing `b`). Also test it.
Previously, `--heap-size-hint` would silently ignore many flavors of "bad" input, parsing things like "3PB" as 3 bytes. This change makes it significantly less permissive, erroring unless it can parse a number (still relying on the C `sscanf` `%Lf` format specifier there) with an optional unit (case-insensitive, either with or without the trailing `b`). Also test it. (cherry picked from commit 138aba7)
Previously, `--heap-size-hint` would silently ignore many flavors of "bad" input, parsing things like "3PB" as 3 bytes. This change makes it significantly less permissive, erroring unless it can parse a number (still relying on the C `sscanf` `%Lf` format specifier there) with an optional unit (case-insensitive, either with or without the trailing `b`). Also test it.
Backported PRs: - [x] #39071 <!-- Add a lazy `logrange` function and `LogRange` type --> - [x] #51802 <!-- Allow AnnotatedStrings in log messages --> - [x] #53369 <!-- Orthogonalize re-indexing for FastSubArrays --> - [x] #48050 <!-- improve `--heap-size-hint` arg handling --> - [x] #53482 <!-- add IR encoding for EnterNode --> - [x] #53499 <!-- Avoid compiler warning about redefining jl_globalref_t --> - [x] #53507 <!-- update staled `Core.Compiler.Effects` documentation --> - [x] #53408 <!-- task splitting: change additive accumulation to multiplicative --> - [x] #53523 <!-- add back an alias for `check_top_bit` --> - [x] #53377 <!-- add _readdirx for returning more object info gathered during dir scan --> - [x] #53525 <!-- fix InteractiveUtils call in Base.runtests on failure --> - [x] #53540 <!-- use more efficient `_readdirx` for tab completion --> - [x] #53545 <!-- use `_readdirx` for `walkdir` --> - [x] #53551 <!-- revert "Add @create_log_macro for making custom styled logging macros (#52196)" --> - [x] #53554 <!-- Always return a value in 1-d circshift! of abstractarray.jl --> - [x] #53424 <!-- yet more atomics & cache-line fixes on work-stealing queue --> - [x] #53571 <!-- Update Documenter to v1.3 for inventory writing --> - [x] #53403 <!-- Move parallel precompilation to Base --> - [x] #53589 <!-- add back `unsafe_convert` to pointer for arrays --> - [x] #53596 <!-- build: remove extra .a file --> - [x] #53606 <!-- fix error path in `precompilepkgs` --> - [x] #53004 <!-- Unexport with, at_with, and ScopedValue from Base --> - [x] #53629 <!-- typo fix in scoped values docs --> - [x] #53630 <!-- sroa: Fix incorrect scope counting --> - [x] #53598 <!-- Use Base parallel precompilation to build stdlibs --> - [x] #53649 <!-- precompilepkgs: package in boths deps and weakdeps are in fact only weak --> - [x] #53671 <!-- Fix bootstrap Base precompile in cross compile configuration --> - [x] #52125 <!-- Load Pkg if not already to reinstate missing package add prompt --> - [x] #53602 <!-- Handle zero on arrays of unions of number types and missings --> - [x] #53516 <!-- permit NamedTuple{<:Any, Union{}} to be created --> - [x] #53643 <!-- Bump CSL to 1.1.1 to fix libgomp bug --> - [x] #53679 <!-- move precompile workload back from Base --> - [x] #53663 <!-- add isassigned methods for reinterpretarray --> - [x] #53662 <!-- [REPL] fix incorrectly cleared line after completions accepted --> - [x] #53611 <!-- Linalg: matprod_dest for Diagonal and adjvec --> - [x] #53659 <!-- fix #52025, re-allow all implicit pointer casts in cconvert for Array --> - [x] #53631 <!-- LAPACK: validate input parameters to throw informative errors --> - [x] #53628 <!-- Make some improvements to the Scoped Values documentation. --> - [x] #53655 <!-- Change tbaa of ptr_phi to tbaa_value --> - [x] #53391 <!-- Default to the medium code model in x86 linux --> - [x] #53699 <!-- Move `isexecutable, isreadable, iswritable` to `filesystem.jl` --> - [x] #41232 <!-- Fix linear indexing for ReshapedArray if the parent has offset axes --> - [x] #53527 <!-- Enable analyzegc checks for try catch and fix found issues --> - [x] #52092 - [x] #53682 <!-- Increase build precompilation --> - [x] #53720 - [x] #53553 <!-- typeintersect: fix `UnionAll` unaliasing bug caused by innervars. --> Contains multiple commits, manual intervention needed: - [ ] #53305 <!-- Propagate inbounds in isassigned with CartesianIndex indices --> Non-merged PRs with backport label: - [ ] #53736 <!-- fix literal-pow to return the right type when the base is -1 --> - [ ] #53707 <!-- Make ScopedValue public --> - [ ] #53696 <!-- add invokelatest to on_done callback in bracketed paste --> - [ ] #53660 <!-- put Logging back in default sysimage --> - [ ] #53509 <!-- revert moving "creating packages" from Pkg.jl --> - [ ] #53452 <!-- RFC: allow Tuple{Union{}}, returning Union{} --> - [ ] #53402 <!-- Add `jl_getaffinity` and `jl_setaffinity` --> - [ ] #52694 <!-- Reinstate similar for AbstractQ for backward compatibility --> - [ ] #51928 <!-- Styled markdown, with a few tweaks --> - [ ] #51816 <!-- User-themable stacktraces --> - [ ] #51811 <!-- Make banner size depend on terminal size --> - [ ] #51479 <!-- prevent code loading from lookin in the versioned environment when building Julia -->
Previously,
--heap-size-hint
would silently ignore many flavors of "bad" input, parsing things like "3PB" as 3 bytes. This change makes it significantly less permissive, erroring unless it can parse a number (still relying on the Csscanf
%Lf
format specifier there) with an optional unit (case-insensitive, either with or without the trailingb
). Also test it.