Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Generalize Bool parse method to AbstractString #47782

Merged
merged 2 commits into from
Dec 6, 2022

Conversation

quinnj
Copy link
Member

@quinnj quinnj commented Dec 2, 2022

Fixes JuliaStrings/InlineStrings.jl#57.

We currently have a specialization for parse(Bool, ::Union{String, SubString{String}) where true and false are parsed appropriately. The restriction to Union{String, SubString{String}}, however, means we don't get this behavior for other AbstractStrings. In the linked issue above, for InlineStrings, we end up going through the generic integer parsing codepath which results in an InexactError when we try to do Bool(10).

The proposal in this PR takes advantage of the fact that there is only the 2 comparisons where we do _memcmp that require the input string to be "dense" (in memory), and otherwise, we just do a comparison against a SubString of the input string.

Relatedly, I've wanted to introduce the concept of an abstrac type like:

abstract type MemoryAddressableString <: AbstractString

where the additional required interface would be being able to call pointer(::MemoryAddressableString), since a lot of our string algorithms depend on doing these kind of pointer operations and hence makes it quite a pain to implement your own custom string type.

Fixes JuliaStrings/InlineStrings.jl#57.

We currently have a specialization for `parse(Bool, ::Union{String, SubString{String})`
where `true` and `false` are parsed appropriately. The restriction to
`Union{String, SubString{String}}`, however, means we don't get this behavior
for other `AbstractString`s. In the linked issue above, for InlineStrings, we end up
going through the generic integer parsing codepath which results in an `InexactError`
when we try to do `Bool(10)`.

The proposal in this PR takes advantage of the fact that there is only the 2 comparisons
where we do `_memcmp` that require the input string to be "dense" (in memory), and otherwise,
we just do a comparison against a `SubString` of the input string.

Relatedly, I've wanted to introduce the concept of an abstrac type like:

```julia
abstract type MemoryAddressableString <: AbstractString
```

where the additional required interface would be being able to call
`pointer(::MemoryAddressableString)`, since a lot of our string algorithms depend on doing
these kind of pointer operations and hence makes it quite a pain to implement your own
custom string type.
base/parse.jl Outdated Show resolved Hide resolved
@brenhinkeller brenhinkeller added the strings "Strings!" label Dec 2, 2022
Copy link
Contributor

@nickrobinson251 nickrobinson251 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

test/parse.jl Outdated Show resolved Hide resolved
Co-authored-by: Stefan Karpinski <[email protected]>
Co-authored-by: Nick Robinson <[email protected]>
@LilithHafner
Copy link
Member

Possibly worth changing this 1 to startpos

if isnumeric(sbuff[1])

@quinnj quinnj merged commit 63830a6 into master Dec 6, 2022
@quinnj quinnj deleted the jq-parse-bool-abstractstring branch December 6, 2022 19:15
@mkitti
Copy link
Contributor

mkitti commented Jul 4, 2023

@KristofferC could we back port this? Someone else just ran into this issue:
https://discourse.julialang.org/t/string7-and-bools-bug-or-feature/101188/5

@DilumAluthge DilumAluthge added bugfix This change fixes an existing bug backport 1.6 Change should be backported to release-1.6 backport 1.8 Change should be backported to release-1.8 backport 1.9 Change should be backported to release-1.9 backport 1.10 Change should be backported to the 1.10 release labels Jul 5, 2023
KristofferC pushed a commit that referenced this pull request Jul 11, 2023
* Generalize Bool parse method to AbstractString

Fixes JuliaStrings/InlineStrings.jl#57.

We currently have a specialization for `parse(Bool, ::Union{String, SubString{String})`
where `true` and `false` are parsed appropriately. The restriction to
`Union{String, SubString{String}}`, however, means we don't get this behavior
for other `AbstractString`s. In the linked issue above, for InlineStrings, we end up
going through the generic integer parsing codepath which results in an `InexactError`
when we try to do `Bool(10)`.

The proposal in this PR takes advantage of the fact that there is only the 2 comparisons
where we do `_memcmp` that require the input string to be "dense" (in memory), and otherwise,
we just do a comparison against a `SubString` of the input string.

Relatedly, I've wanted to introduce the concept of an abstrac type like:

```julia
abstract type MemoryAddressableString <: AbstractString
```

where the additional required interface would be being able to call
`pointer(::MemoryAddressableString)`, since a lot of our string algorithms depend on doing
these kind of pointer operations and hence makes it quite a pain to implement your own
custom string type.

* Apply suggestions from code review

Co-authored-by: Stefan Karpinski <[email protected]>
Co-authored-by: Nick Robinson <[email protected]>

Co-authored-by: Stefan Karpinski <[email protected]>
Co-authored-by: Nick Robinson <[email protected]>
(cherry picked from commit 63830a6)
@KristofferC KristofferC mentioned this pull request Jul 11, 2023
35 tasks
KristofferC added a commit that referenced this pull request Aug 18, 2023
Backported PRs:
- [x] #47782 <!-- Generalize Bool parse method to AbstractString -->
- [x] #48634 <!-- Remove unused "deps" mechanism in internal sorting
keywords [NFC] -->
- [x] #49931 <!-- Lock finalizers' lists at exit -->
- [x] #50064 <!-- Fix numbered prompt with input only with comment -->
- [x] #50474 <!-- docs: Fix a `!!! note` which was miscapitalized -->
- [x] #50516 <!-- Fix visibility of assert on GCC12/13 -->
- [x] #50635 <!-- `versioninfo()`: include build info and unofficial
warning -->
- [x] #49915 <!-- Revert "Remove number / vector (#44358)" -->
- [x] #50781 <!-- fix `bit_map!` with aliasing -->
- [x] #50845 <!-- fix #50438, use default pool for at-threads -->
- [x] #49031 <!-- Update inference.md -->
- [x] #50289 <!-- Initialize prev_nold and nold in gc_reset_page -->
- [x] #50559 <!-- Expand kwcall lowering positional default check to
vararg -->
- [x] #49582 <!-- Update HISTORY.md for `DelimitedFiles` -->
- [x] #50341 <!-- invokelatest docs should say not exported before 1.9
-->
- [x] #50525 <!-- only check that values are finite in `generic_lufact`
when `check=true` -->
- [x] #50444 <!-- Optimize getfield lowering to avoid boxing in some
cases -->
- [x] #50523 <!-- Avoid generic call in most cases for getproperty -->
- [x] #50860 <!-- Add `Base.get_extension` to docs/API -->
- [x] #50164 <!-- codegen: handle dead code with unsafe_store of FCA
pointers -->
- [x] #50568 <!-- `Array(::AbstractRange)` should return an `Array` -->
- [x] #50871 <!-- macOS: Don't inspect dead threadtls during exception
handling. -->

Need manual backport:
- [ ] #48542 <!-- Add docs on task-specific buffering using
multithreading -->
- [ ] #50591 <!-- build: fix various makefile bugs -->


Non-merged PRs with backport label:
- [ ] #50842 <!-- Avoid race conditions with recursive rm -->
- [ ] #50823 <!-- Make ranges more robust with unsigned indexes. -->
- [ ] #50663 <!-- Fix Expr(:loopinfo) codegen -->
- [ ] #49716 <!-- Update varinfo() docstring signature -->
- [ ] #49713 <!-- prevent REPL from erroring in numbered mode in some
situations -->
- [ ] #49573 <!-- Implement jl_cpu_pause on PPC64 -->
- [ ] #48726 <!-- fix macro expansion of property destructuring -->
- [ ] #48642 <!-- Use gc alloc instead of alloc typed in lowering -->
- [ ] #48183 <!-- Don't use pkgimage for package if any includes fall in
tracked path for coverage or alloc tracking -->
- [ ] #48050 <!-- improve `--heap-size-hint` arg handling -->
- [ ] #47615 <!-- Allow threadsafe access to buffer of type inference
profiling trees -->
@KristofferC KristofferC added backport 1.10 Change should be backported to the 1.10 release and removed backport 1.10 Change should be backported to the 1.10 release labels Aug 18, 2023
@IanButterworth IanButterworth removed the backport 1.9 Change should be backported to release-1.9 label Aug 19, 2023
@IanButterworth IanButterworth removed the backport 1.10 Change should be backported to the 1.10 release label Aug 19, 2023
KristofferC pushed a commit that referenced this pull request Oct 11, 2023
* Generalize Bool parse method to AbstractString

Fixes JuliaStrings/InlineStrings.jl#57.

We currently have a specialization for `parse(Bool, ::Union{String, SubString{String})`
where `true` and `false` are parsed appropriately. The restriction to
`Union{String, SubString{String}}`, however, means we don't get this behavior
for other `AbstractString`s. In the linked issue above, for InlineStrings, we end up
going through the generic integer parsing codepath which results in an `InexactError`
when we try to do `Bool(10)`.

The proposal in this PR takes advantage of the fact that there is only the 2 comparisons
where we do `_memcmp` that require the input string to be "dense" (in memory), and otherwise,
we just do a comparison against a `SubString` of the input string.

Relatedly, I've wanted to introduce the concept of an abstrac type like:

```julia
abstract type MemoryAddressableString <: AbstractString
```

where the additional required interface would be being able to call
`pointer(::MemoryAddressableString)`, since a lot of our string algorithms depend on doing
these kind of pointer operations and hence makes it quite a pain to implement your own
custom string type.

* Apply suggestions from code review

Co-authored-by: Stefan Karpinski <[email protected]>
Co-authored-by: Nick Robinson <[email protected]>

Co-authored-by: Stefan Karpinski <[email protected]>
Co-authored-by: Nick Robinson <[email protected]>
(cherry picked from commit 63830a6)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
backport 1.6 Change should be backported to release-1.6 backport 1.8 Change should be backported to release-1.8 bugfix This change fixes an existing bug strings "Strings!"
Projects
None yet
Development

Successfully merging this pull request may close these issues.

parse(Bool, s::String3) giving InexactError
9 participants