Make get_global_resource host/device #3040

Open: wants to merge 1 commit into base: main

Artem-B
Contributor

@Artem-B Artem-B commented Dec 3, 2024

Description

The function is called from host/device constructors and may therefore be invoked from device code.

As it happens, neither NVCC nor clang diagnoses such an invalid call: llvm/llvm-project#118415

As a result, in unoptimized builds ptxas fails with an unresolved reference to this function, because it is never generated during GPU-side compilation.
#2813 (comment)

closes #3023
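
For illustration, here is a minimal sketch of the pattern described above. All names in it (`SKETCH_HOST_DEVICE`, `placeholder_resource`, `get_global_resource_sketch`, `stateless_allocator_sketch`) are hypothetical stand-ins, not the library's identifiers, and the function body is an assumption about the general shape rather than the actual implementation. The macro expands to nothing outside CUDA so the snippet also builds as plain C++.

```cpp
#include <cassert>

// Hypothetical stand-in for _CCCL_HOST_DEVICE; expands to nothing when this
// file is built as plain C++ instead of CUDA.
#if defined(__CUDACC__)
#  define SKETCH_HOST_DEVICE __host__ __device__
#else
#  define SKETCH_HOST_DEVICE
#endif

// Hypothetical resource type with no state.
struct placeholder_resource
{};

// Sketch of the accessor with the host/device annotation. With a host-only
// annotation, no device-side definition would be emitted for it.
template <typename MR>
SKETCH_HOST_DEVICE MR* get_global_resource_sketch()
{
  static MR resource; // assumed shape: one shared instance per MR on the host
  return &resource;
}

// Mirrors the shape of a host/device constructor that calls the accessor.
template <typename Upstream>
struct stateless_allocator_sketch
{
  SKETCH_HOST_DEVICE stateless_allocator_sketch()
      : upstream(get_global_resource_sketch<Upstream>())
  {}

  Upstream* upstream;
};
```

On the host, two default-constructed `stateless_allocator_sketch<placeholder_resource>` objects end up pointing at the same instance. Whether a given CUDA compiler accepts the function-local static in device code is a separate question.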

Checklist

  • [x] New or existing tests cover these changes.
  • [n/a] The documentation is up to date with these changes.

@Artem-B Artem-B requested review from a team as code owners December 3, 2024 23:46

copy-pr-bot bot commented Dec 3, 2024

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

@Artem-B
Contributor Author

Artem-B commented Dec 3, 2024

@miscco FYI

@miscco
Collaborator

miscco commented Dec 4, 2024

Thanks @Artem-B, I was stretched a bit thin the last few weeks.

@miscco
Collaborator

miscco commented Dec 4, 2024

/ok to test

@miscco miscco enabled auto-merge (squash) December 4, 2024 08:23
auto-merge was automatically disabled December 4, 2024 18:58

Head branch was pushed to by a user without write access

@Artem-B
Contributor Author

Artem-B commented Dec 4, 2024

@miscco looks like my rebase to update the branch has disabled auto-merge. Can you get it going again, please?

@miscco
Collaborator

miscco commented Dec 4, 2024

/ok to test

@miscco
Collaborator

miscco commented Dec 4, 2024

Oh, it's broken because NVCC cannot handle dynamic initialization in a function-local static

So I am wondering whether we need to restrict this to only clang-cuda for now
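
One way such a restriction could look is sketched below. This is only an illustration: `SKETCH_RESOURCE_ATTR`, `guarded_resource` and `get_global_resource_guarded` are hypothetical names, and the function body is an assumed shape. The detection relies on clang defining `__CUDA__` when it compiles CUDA code, while both clang and NVCC define `__CUDACC__`.

```cpp
#include <cassert>

// Hypothetical attribute macro: host/device only under clang-cuda,
// host-only under other CUDA compilers, empty for plain C++ builds.
#if defined(__clang__) && defined(__CUDA__)
#  define SKETCH_RESOURCE_ATTR __host__ __device__
#elif defined(__CUDACC__)
#  define SKETCH_RESOURCE_ATTR __host__
#else
#  define SKETCH_RESOURCE_ATTR
#endif

// Hypothetical resource type used only for this sketch.
struct guarded_resource
{};

template <typename MR>
SKETCH_RESOURCE_ATTR MR* get_global_resource_guarded()
{
  static MR resource; // assumed shape of the accessor body
  return &resource;
}
```

Under this scheme NVCC builds would keep the host-only behavior, so the unresolved-reference problem would remain possible there.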

@@ -204,7 +204,7 @@ _CCCL_HOST_DEVICE bool operator!=(const memory_resource<Pointer>& lhs, const mem
* \return a pointer to a global instance of \p MR.
*/
template <typename MR>
_CCCL_HOST MR* get_global_resource()
Collaborator

Where is this getting used in a host/device function?

Contributor Author

_CCCL_HOST_DEVICE stateless_resource_allocator()
    : base(get_global_resource<Upstream>())
{}

@Artem-B
Copy link
Contributor Author

Artem-B commented Dec 5, 2024

Oh, it's broken because NVCC cannot handle dynamic initialization in a function-local static

Interesting. It looks like the issue I ran into before: #2813 (comment)

The error went away with clang after disabling initializer guards, which suggests that the initializer is indeed empty as far as clang is concerned. It's possible, though, that the compilers differ in what exactly they consider to be empty.
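
To make the "empty initializer" question concrete, here is a small sketch. It uses `std::is_trivially_default_constructible` as a conservative approximation chosen for this example; it is not the rule that either compiler actually applies, and all type and function names are hypothetical.

```cpp
#include <cassert>
#include <type_traits>

// Implicit, trivial default constructor: nothing to run at initialization.
struct trivial_resource
{};

// User-provided constructor that does nothing. Not trivial in the language
// sense, although its body is empty.
struct empty_body_resource
{
  empty_body_resource() {}
};

// Constructor that writes a member: clearly requires dynamic initialization.
struct stateful_resource
{
  int value;
  stateful_resource() : value(42) {}
};

// Conservative check (an assumption of this sketch): true only when default
// construction is trivial.
template <typename T>
constexpr bool has_trivial_default_init()
{
  return std::is_trivially_default_constructible<T>::value;
}
```

A type like `empty_body_resource` is the kind of case where two compilers could plausibly disagree about whether a function-local static needs an initializer guard.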

Status: In Review
Development

Successfully merging this pull request may close these issues.

[BUG]: host-only get_global_resource() may be called from GPU-side code.
3 participants