Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Make task sticky before entering threaded region #43420

Closed
wants to merge 1 commit into from

Conversation

tkf
Copy link
Member

@tkf tkf commented Dec 14, 2021

When we have t = @spawn @threads for ..., the task t that was at the primary thread when scheduling @threads for may be migrated away to a non-primary thread. This makes me wonder if we need to set t.sticky = true before calling jl_enter_threaded_region to ensure that jl_exit_threaded_region is called at the primary thread. I haven't encountered a bug due to this but I'm opening as a PR to see if my concern is valid.

@tkf tkf added the multithreading Base.Threads and related functionality label Dec 14, 2021
@tkf tkf requested review from vchuravy and vtjnash December 14, 2021 06:54
@tkf tkf force-pushed the prefer_threaded_region branch from 4e017fe to c78dde4 Compare December 14, 2021 06:55
@tkf
Copy link
Member Author

tkf commented Dec 14, 2021

Actually, this is (accidentally?) not a problem ATM since all tasks in @threads for are sticky and so schedule'ing them sets current_task().sticky = true. But making it explicit like this may be better? Or, I think it's better to put some comments in the code at least. (Edit: No, this is not the case since @threads for sets the tid of the child tasks before schedule. So, I think something like this PR is required.)

Copy link
Member

@vchuravy vchuravy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't understand jl_enter_threaded_region well enough. But yes I agree that this seems needed in the context of migration and I question the usefulness of threaded_region in the context of migration.

So maybe we should remove it, and support IO on all threads even outside a threaded_region.

Copy link
Member

@vtjnash vtjnash left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think this is necessary, or desirable. The calling Task can freely migrate, as all of the worker tasks will be pinned to 1:n, and this Task will simply sleep until all of the workers are done.

@tkf
Copy link
Member Author

tkf commented Dec 15, 2021

@vtjnash How do we balance the increments and decrements of _threadedregion without disabling the task migration?

@vtjnash
Copy link
Member

vtjnash commented Dec 15, 2021

_threadedregion is a simple global counter

@tkf
Copy link
Member Author

tkf commented Dec 15, 2021

Uh, sorry, I just realized that I've been reading the code from #43418 (which motivated me to look into the threaded region business) where you added if (jl_atomic_load_relaxed(&jl_current_task->tid) != 0) checks #43418 (comment)

julia/src/threading.c

Lines 538 to 549 in 2a6a8ab

JL_DLLEXPORT void jl_enter_threaded_region(void)
{
if (jl_atomic_load_relaxed(&jl_current_task->tid) != 0)
return;
jl_atomic_fetch_add(&_threadedregion, 1);
}
JL_DLLEXPORT void jl_exit_threaded_region(void)
{
if (jl_atomic_load_relaxed(&jl_current_task->tid) != 0)
return;
jl_atomic_fetch_add(&_threadedregion, -1);

I can demonstrate the scenario that I mentioned in the OP with

function demo(n)
    ok = false
    nprimaries = 0
    niters = 0
    post = zeros(Int, Threads.nthreads())
    for outer niters in 1:n
        t = Threads.@spawn begin
            @assert !current_task().sticky
            Threads.threadid() == 1 || return nothing
            Threads.@threads :static for i in 1:2
                if i == 1
                    t0 = time_ns() + 1000
                    while t0 > time_ns()
                    end
                end
            end
            @assert !current_task().sticky
            return Threads.threadid()
        end
        y = fetch(t)
        if y !== nothing
            nprimaries += 1
            tid = y::Int
            post[tid] += 1
        end
        if ccall(:jl_in_threaded_region, Cint, ()) != 0
            ok = true
            break
        end
    end
    return (; ok, nprimaries, niters, post)
end

With Julia from the master branch, I always get demo(_).ok == false. But with #43418 I get

julia> ccall(:jl_in_threaded_region, Cint, ())
0

julia> demo(10000)
(ok = true, nprimaries = 2, niters = 39, post = [1, 1])

julia> ccall(:jl_in_threaded_region, Cint, ())
1

julia> Base.GIT_VERSION_INFO
Base.GitVersionInfo("2a6a8ab0b2ac5c2c6826c8895cc650ea05ab2159", "2a6a8ab0b2", "improve-threads", 1135, "2021-12-13 19:46 UTC", false, 9, 1.639413997e9, "", "")

i.e., now the threaded region is permanently enabled.

I'll change the merge target of this PR to #43418.

@tkf tkf changed the base branch from master to improve-threads December 15, 2021 19:30
@@ -22,8 +22,22 @@ See also: `BLAS.get_num_threads` and `BLAS.set_num_threads` in the
"""
nthreads() = Int(unsafe_load(cglobal(:jl_n_threads, Cint)))

function threading_run(func)
function prefer_threaded_region(f)
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe a better name is

Suggested change
function prefer_threaded_region(f)
function prefer_io(f)

?

I also wonder if it makes sense to expose this as a public API with the caveat that it is just a hint and f -> f() is a valid implementation of this. It then maybe even makes sense to not do this in @threads for by default to help CPU-bound (almost no I/O) use cases since people can recover the old behavior with prefer_io() do; @threads for ... end; end.

@tkf
Copy link
Member Author

tkf commented Dec 15, 2021

It seems like a better fix is coming #43418 (comment)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
multithreading Base.Threads and related functionality
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants