Use semantic versioning #3

Merged
merged 1 commit into from
Dec 18, 2015
lukeyeager
Member

Adherence to Semantic Versioning.

build/lib/
├── libnccl.so -> libnccl.so.1
├── libnccl.so.1 -> libnccl.so.1.0.0
└── libnccl.so.1.0.0

Necessary to build a proper deb package (#2).
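The layout above follows the usual Linux convention of a linker name, a major-version name, and a fully versioned real file. As a rough sketch of how the three names relate, the helper below derives them from a base name and a version; `SharedLibNames` and `sharedLibNames` are hypothetical names used only for illustration and are not part of NCCL or its build system.

```cpp
#include <string>

// Hypothetical helper (not part of NCCL): the three file names conventionally
// used for a semantically versioned shared library on Linux.
struct SharedLibNames {
  std::string linkerName;  // e.g. libnccl.so       (symlink used when linking with -lnccl)
  std::string majorName;   // e.g. libnccl.so.1     (symlink that changes only with the major version)
  std::string realName;    // e.g. libnccl.so.1.0.0 (the actual file)
};

SharedLibNames sharedLibNames(const std::string& base, int major, int minor, int patch) {
  SharedLibNames names;
  names.linkerName = "lib" + base + ".so";
  names.majorName = names.linkerName + "." + std::to_string(major);
  names.realName = names.majorName + "." + std::to_string(minor) + "." + std::to_string(patch);
  return names;
}
```

Calling `sharedLibNames("nccl", 1, 0, 0)` yields the three names shown in the tree: each symlink points at the next, more specific name.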

/cc @borisfom

nluehr added a commit that referenced this pull request Dec 18, 2015
@nluehr nluehr merged commit 4807909 into NVIDIA:master Dec 18, 2015
@lukeyeager lukeyeager deleted the semver branch December 18, 2015 21:28
@getengqing getengqing mentioned this pull request Mar 3, 2017
@weberxie weberxie mentioned this pull request Sep 30, 2020
minsii pushed a commit to minsii/nccl that referenced this pull request Sep 16, 2022
Co-authored-by: Facebook Community Bot <[email protected]>
liralon added a commit to liralon/nccl that referenced this pull request Aug 9, 2023
ncclUniqueId is an opaque array of bytes that ncclCommInitRankFunc() casts
to struct ncclBootstrapHandle, which must be aligned to 8 bytes in memory
for the access to have defined behaviour.

This issue was caught by running nccl-tests with NCCL built with UBSAN:
```
bootstrap.cc:242:40: runtime error: member access within misaligned address 0x61100043846c for type 'struct ncclBootstrapHandle', which requires 8 byte alignment
0x61100043846c: note: pointer points here
  5f 00 00 00 ce 7d 2b 40  b1 c5 ce 59 02 00 9f 21  0a c1 35 7a 00 00 00 00  00 00 00 00 00 00 00 00
              ^
    #0 0x7feba7d327b4 in bootstrapInit(ncclBootstrapHandle*, ncclComm*) /home/ubuntu/nccl/src/bootstrap.cc:242
    #1 0x7feba7c7bd48 in ncclCommInitRankFunc /home/ubuntu/nccl/src/init.cc:1350
    #2 0x7feba7e0cebc in ncclAsyncJobMain(void*) /home/ubuntu/nccl/src/group.cc:63
    #3 0x7feba72e2608 in start_thread /build/glibc-2.31/nptl/pthread_create.c:477
    #4 0x7feba6ed4132 in __clone (/lib/x86_64-linux-gnu/libc.so.6)
```

Signed-off-by: Liran Alon <[email protected]>
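
One common way to avoid this class of misaligned access is to copy the opaque bytes into a properly aligned object instead of casting the byte pointer. The sketch below illustrates that idea only; `OpaqueId`, `Handle`, `loadHandle`, and `storeHandle` are hypothetical stand-ins, and the sizes and fields are assumptions, not NCCL's actual definitions or the fix applied in the commit.

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical stand-ins for an opaque byte-array id and the struct stored in it.
struct OpaqueId { char bytes[128]; };   // byte array: alignment requirement of 1
struct Handle {                         // contains 64-bit fields: typically 8-byte aligned
  std::uint64_t magic;
  std::uint64_t port;
};

static_assert(sizeof(Handle) <= sizeof(OpaqueId), "handle must fit inside the id");

// Instead of reinterpret_cast<const Handle*>(id.bytes), which is undefined
// when the bytes are not suitably aligned, copy them into an aligned local.
Handle loadHandle(const OpaqueId& id) {
  Handle h;
  std::memcpy(&h, id.bytes, sizeof h);
  return h;
}

// Writing goes the other way: serialise the aligned struct into the bytes.
void storeHandle(OpaqueId& id, const Handle& h) {
  std::memcpy(id.bytes, &h, sizeof h);
}
```

Because `memcpy` has no alignment requirement on its arguments, this works wherever the id happens to be placed in memory.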
liralon added a commit to liralon/nccl that referenced this pull request Aug 9, 2023
ncclRealloc() is also used to perform the initial allocation.
This results in calling memcpy() with a NULL source pointer and a copy size of 0.

This may seem harmless, but memcpy() is declared to never accept NULL pointers,
so the call could lead to undefined behaviour.

This issue was caught by running nccl-tests with NCCL built with UBSAN:
```
include/alloc.h:68:9: runtime error: null pointer passed as argument 2, which is declared to never be null
    #0 0x7f867cf80932 in ncclResult_t ncclRealloc<ncclProxyConnection*>(ncclProxyConnection***, unsigned long, unsigned long) include/alloc.h:68
    #1 0x7f867cf6c838 in ncclProxyNewConnection /home/ubuntu/nccl/src/proxy.cc:942
    #2 0x7f867cf74ba8 in proxyConnInit /home/ubuntu/nccl/src/proxy.cc:1261
    #3 0x7f867cf78021 in proxyProgressAsync /home/liran/nccl/src/proxy.cc:1318
    #4 0x7f867cf79ff0 in proxyServiceInitOp /home/liran/nccl/src/proxy.cc:1377
    #5 0x7f867cf7c052 in ncclProxyService(void*) /home/ubuntu/nccl/src/proxy.cc:1507
    #6 0x7f867c3c8608 in start_thread /build/glibc-2.31/nptl/pthread_create.c:477
    #7 0x7f867bfba132 in __clone (/lib/x86_64-linux-gnu/libc.so.6)
```

Signed-off-by: Liran Alon <[email protected]>
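
A minimal sketch of the usual guard for this pattern follows: skip the copy when there is nothing to copy, so that memcpy() is never handed a NULL source. `growArray` is a hypothetical, simplified analogue of a grow-only reallocation helper; it is not NCCL's ncclRealloc() and its signature and error handling are assumptions.

```cpp
#include <cstddef>
#include <cstdlib>
#include <cstring>
#include <type_traits>

// Hypothetical grow-only reallocation helper (not NCCL's ncclRealloc).
// Returns true on success; on failure *ptr is left untouched.
template <typename T>
bool growArray(T** ptr, std::size_t oldCount, std::size_t newCount) {
  static_assert(std::is_trivially_copyable<T>::value, "memcpy needs a trivially copyable type");
  if (newCount == 0 || newCount < oldCount) return false;

  // Zero-initialised storage for the new, larger array.
  T* fresh = static_cast<T*>(std::calloc(newCount, sizeof(T)));
  if (fresh == nullptr) return false;

  // Guard: on the initial allocation *ptr is NULL and oldCount is 0, so
  // there is nothing to copy and memcpy() must not be called at all.
  if (*ptr != nullptr && oldCount > 0) {
    std::memcpy(fresh, *ptr, oldCount * sizeof(T));
  }

  std::free(*ptr);  // free(NULL) is a well-defined no-op
  *ptr = fresh;
  return true;
}
```

With the guard in place, the same helper can serve both the first allocation (`*ptr == nullptr`, `oldCount == 0`) and later growth without tripping the sanitizer check.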
liralon added a commit to liralon/nccl that referenced this pull request Aug 10, 2023
ncclUniqueId is an opaque array of bytes that ncclCommInitRankFunc() casts
to struct ncclBootstrapHandle, which must be aligned to 8 bytes in memory
for the access to have defined behaviour.

This issue was caught by running nccl-tests with NCCL built with UBSAN:
```
bootstrap.cc:242:40: runtime error: member access within misaligned address 0x61100043846c for type 'struct ncclBootstrapHandle', which requires 8 byte alignment
0x61100043846c: note: pointer points here
  5f 00 00 00 ce 7d 2b 40  b1 c5 ce 59 02 00 9f 21  0a c1 35 7a 00 00 00 00  00 00 00 00 00 00 00 00
              ^
    #0 0x7feba7d327b4 in bootstrapInit(ncclBootstrapHandle*, ncclComm*) /home/ubuntu/nccl/src/bootstrap.cc:242
    #1 0x7feba7c7bd48 in ncclCommInitRankFunc /home/ubuntu/nccl/src/init.cc:1350
    #2 0x7feba7e0cebc in ncclAsyncJobMain(void*) /home/ubuntu/nccl/src/group.cc:63
    #3 0x7feba72e2608 in start_thread /build/glibc-2.31/nptl/pthread_create.c:477
    #4 0x7feba6ed4132 in __clone (/lib/x86_64-linux-gnu/libc.so.6)
```

Signed-off-by: Liran Alon <[email protected]>
liralon added a commit to liralon/nccl that referenced this pull request Aug 10, 2023
ncclRealloc() is also used to perform the initial allocation. This
results in calling memcpy() with a NULL source pointer and a copy size
of 0.

This may seem harmless, but memcpy() is declared to never accept NULL
pointers, so the call could lead to undefined behaviour.

This issue was caught by running nccl-tests with NCCL built with UBSAN:
```
include/alloc.h:68:9: runtime error: null pointer passed as argument 2, which is declared to never be null
    #0 0x7f867cf80932 in ncclResult_t ncclRealloc<ncclProxyConnection*>(ncclProxyConnection***, unsigned long, unsigned long) include/alloc.h:68
    #1 0x7f867cf6c838 in ncclProxyNewConnection /home/ubuntu/nccl/src/proxy.cc:942
    #2 0x7f867cf74ba8 in proxyConnInit /home/ubuntu/nccl/src/proxy.cc:1261
    #3 0x7f867cf78021 in proxyProgressAsync /home/liran/nccl/src/proxy.cc:1318
    #4 0x7f867cf79ff0 in proxyServiceInitOp /home/liran/nccl/src/proxy.cc:1377
    #5 0x7f867cf7c052 in ncclProxyService(void*) /home/ubuntu/nccl/src/proxy.cc:1507
    #6 0x7f867c3c8608 in start_thread /build/glibc-2.31/nptl/pthread_create.c:477
    #7 0x7f867bfba132 in __clone (/lib/x86_64-linux-gnu/libc.so.6)
```

Signed-off-by: Liran Alon <[email protected]>
liralon added a commit to liralon/nccl that referenced this pull request Aug 10, 2023
ncclRealloc() is also used to perform the initial allocation. This
results in calling memcpy() with a NULL source pointer and a copy size
of 0.

This may seem harmless, but memcpy() is declared to never accept NULL
pointers, so the call could lead to undefined behaviour.

This issue was caught by running nccl-tests with NCCL built with UBSAN:
```
include/alloc.h:68:9: runtime error: null pointer passed as argument 2, which is declared to never be null
    #0 0x7f867cf80932 in ncclResult_t ncclRealloc<ncclProxyConnection*>(ncclProxyConnection***, unsigned long, unsigned long) include/alloc.h:68
    #1 0x7f867cf6c838 in ncclProxyNewConnection /home/ubuntu/nccl/src/proxy.cc:942
    #2 0x7f867cf74ba8 in proxyConnInit /home/ubuntu/nccl/src/proxy.cc:1261
    #3 0x7f867cf78021 in proxyProgressAsync /home/ubuntu/nccl/src/proxy.cc:1318
    #4 0x7f867cf79ff0 in proxyServiceInitOp /home/ubuntu/nccl/src/proxy.cc:1377
    #5 0x7f867cf7c052 in ncclProxyService(void*) /home/ubuntu/nccl/src/proxy.cc:1507
    #6 0x7f867c3c8608 in start_thread /build/glibc-2.31/nptl/pthread_create.c:477
    #7 0x7f867bfba132 in __clone (/lib/x86_64-linux-gnu/libc.so.6)
```

Signed-off-by: Liran Alon <[email protected]>
alexander-zinoviev pushed a commit to alexander-zinoviev/nccl that referenced this pull request Nov 7, 2024
[v2.21.5-1] Add our patches to NCCL v2.21.5-1