Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

codegen: optimize const fields of mutable objects #53484

Merged
merged 2 commits into from
Mar 5, 2024
Merged

Conversation

vtjnash
Copy link
Member

@vtjnash vtjnash commented Feb 26, 2024

For example, we seek to eliminate the gc frame from this function, as
observed here:

julia> code_llvm((BitSet,), raw=true) do x; r = x.bits; GC.safepoint(); @inbounds r[1]; end
; Function Signature: var"https://github.com/JuliaLang/julia/issues/3"(Base.BitSet)
;  @ REPL[1]:1 within `https://github.com/JuliaLang/julia/issues/3`
define swiftcc i64 @"julia_#3_494"(ptr nonnull swiftself %pgcstack, ptr noundef nonnull align 8 dereferenceable(16) %"x::BitSet") #0 !dbg !5 {
top:
  call void @llvm.dbg.declare(metadata ptr %"x::BitSet", metadata !21, metadata !DIExpression()), !dbg !22
  %ptls_field = getelementptr inbounds ptr, ptr %pgcstack, i64 2
  %ptls_load = load ptr, ptr %ptls_field, align 8, !tbaa !23
  %0 = getelementptr inbounds ptr, ptr %ptls_load, i64 2
  %safepoint = load ptr, ptr %0, align 8, !tbaa !27
  fence syncscope("singlethread") seq_cst
  %1 = load volatile i64, ptr %safepoint, align 8, !dbg !22
  fence syncscope("singlethread") seq_cst
; ┌ @ Base.jl:49 within `getproperty`
   %"x::BitSet.bits" = load atomic ptr, ptr %"x::BitSet" unordered, align 8, !dbg !29, !tbaa !27, !alias.scope !33, !noalias !36, !nonnull !11, !dereferenceable !41, !align !42
; └
; ┌ @ gcutils.jl:253 within `safepoint`
   %ptls_load4 = load ptr, ptr %ptls_field, align 8, !dbg !43, !tbaa !23
   %2 = getelementptr inbounds ptr, ptr %ptls_load4, i64 2, !dbg !43
   %safepoint5 = load ptr, ptr %2, align 8, !dbg !43, !tbaa !27
   fence syncscope("singlethread") seq_cst, !dbg !43
   %3 = load volatile i64, ptr %safepoint5, align 8, !dbg !43
   fence syncscope("singlethread") seq_cst, !dbg !43
; └
; ┌ @ essentials.jl:892 within `getindex`
   %4 = load ptr, ptr %"x::BitSet.bits", align 8, !dbg !46, !tbaa !49, !alias.scope !52, !noalias !53
   %5 = load i64, ptr %4, align 8, !dbg !46, !tbaa !54, !alias.scope !57, !noalias !58
   ret i64 %5, !dbg !46
; └
}

@vtjnash vtjnash added compiler:codegen Generation of LLVM IR and native code needs nanosoldier run This PR should have benchmarks run on it don't squash Don't squash merge labels Feb 26, 2024
@vtjnash vtjnash force-pushed the jn/mutable-const-opt branch from f2243b4 to 0f05f3a Compare February 27, 2024 16:39
vtjnash added 2 commits March 4, 2024 16:14
For example, we seek to eliminate the gc frame from this function, as
observed here:

```julia
julia> code_llvm((BitSet,), raw=true) do x; r = x.bits; GC.safepoint(); @inbounds r[1]; end
; Function Signature: var"#3"(Base.BitSet)
;  @ REPL[1]:1 within `#3`
define swiftcc i64 @"julia_#3_494"(ptr nonnull swiftself %pgcstack, ptr noundef nonnull align 8 dereferenceable(16) %"x::BitSet") #0 !dbg !5 {
top:
  call void @llvm.dbg.declare(metadata ptr %"x::BitSet", metadata !21, metadata !DIExpression()), !dbg !22
  %ptls_field = getelementptr inbounds ptr, ptr %pgcstack, i64 2
  %ptls_load = load ptr, ptr %ptls_field, align 8, !tbaa !23
  %0 = getelementptr inbounds ptr, ptr %ptls_load, i64 2
  %safepoint = load ptr, ptr %0, align 8, !tbaa !27
  fence syncscope("singlethread") seq_cst
  %1 = load volatile i64, ptr %safepoint, align 8, !dbg !22
  fence syncscope("singlethread") seq_cst
; ┌ @ Base.jl:49 within `getproperty`
   %"x::BitSet.bits" = load atomic ptr, ptr %"x::BitSet" unordered, align 8, !dbg !29, !tbaa !27, !alias.scope !33, !noalias !36, !nonnull !11, !dereferenceable !41, !align !42
; └
; ┌ @ gcutils.jl:253 within `safepoint`
   %ptls_load4 = load ptr, ptr %ptls_field, align 8, !dbg !43, !tbaa !23
   %2 = getelementptr inbounds ptr, ptr %ptls_load4, i64 2, !dbg !43
   %safepoint5 = load ptr, ptr %2, align 8, !dbg !43, !tbaa !27
   fence syncscope("singlethread") seq_cst, !dbg !43
   %3 = load volatile i64, ptr %safepoint5, align 8, !dbg !43
   fence syncscope("singlethread") seq_cst, !dbg !43
; └
; ┌ @ essentials.jl:892 within `getindex`
   %4 = load ptr, ptr %"x::BitSet.bits", align 8, !dbg !46, !tbaa !49, !alias.scope !52, !noalias !53
   %5 = load i64, ptr %4, align 8, !dbg !46, !tbaa !54, !alias.scope !57, !noalias !58
   ret i64 %5, !dbg !46
; └
}
```
Make this analysis even stronger, using a function from
llvm-late-gc-lowering.cpp that implements it more aggressively
@vtjnash vtjnash force-pushed the jn/mutable-const-opt branch from 0f05f3a to 50114d7 Compare March 4, 2024 21:14
@vtjnash
Copy link
Member Author

vtjnash commented Mar 4, 2024

@nanosoldier runbenchmarks(ALL, vs=":master")

@nanosoldier
Copy link
Collaborator

Your benchmark job has completed - possible performance regressions were detected. A full report can be found here.

@vtjnash vtjnash removed the needs nanosoldier run This PR should have benchmarks run on it label Mar 5, 2024
@vtjnash
Copy link
Member Author

vtjnash commented Mar 5, 2024

One non-null result here, since I assume that means alloc-opt failed for this:

["inference", "abstract interpretation", "broadcasting"] | 1.56 (5%) ❌ | 1.63 (1%) ❌

@vtjnash vtjnash merged commit 58291db into master Mar 5, 2024
5 of 9 checks passed
@vtjnash vtjnash deleted the jn/mutable-const-opt branch March 5, 2024 15:00
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
compiler:codegen Generation of LLVM IR and native code don't squash Don't squash merge
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants