bootstrap: Align ncclUniqueId to 8 bytes
ncclUniqueId is an opaque array of bytes that ncclCommInitRankFunc()
casts to struct ncclBootstrapHandle. That struct must be aligned to
8 bytes in memory for accesses through the cast pointer to be defined
behaviour.

This issue was caught by running nccl-tests with NCCL built with UBSAN:
```
bootstrap.cc:242:40: runtime error: member access within misaligned address 0x61100043846c for type 'struct ncclBootstrapHandle', which requires 8 byte alignment
0x61100043846c: note: pointer points here
  5f 00 00 00 ce 7d 2b 40  b1 c5 ce 59 02 00 9f 21  0a c1 35 7a 00 00 00 00  00 00 00 00 00 00 00 00
              ^
    #0 0x7feba7d327b4 in bootstrapInit(ncclBootstrapHandle*, ncclComm*) /home/ubuntu/nccl/src/bootstrap.cc:242
    #1 0x7feba7c7bd48 in ncclCommInitRankFunc /home/ubuntu/nccl/src/init.cc:1350
    #2 0x7feba7e0cebc in ncclAsyncJobMain(void*) /home/ubuntu/nccl/src/group.cc:63
    #3 0x7feba72e2608 in start_thread /build/glibc-2.31/nptl/pthread_create.c:477
    #4 0x7feba6ed4132 in __clone (/lib/x86_64-linux-gnu/libc.so.6)
```

Signed-off-by: Liran Alon <[email protected]>
liralon committed Aug 10, 2023
1 parent 800dd1d commit 1b09084
Showing 1 changed file with 3 additions and 1 deletion.
4 changes: 3 additions & 1 deletion src/nccl.h.in
@@ -31,7 +31,9 @@ typedef struct ncclComm* ncclComm_t;
 #define NCCL_COMM_NULL NULL
 
 #define NCCL_UNIQUE_ID_BYTES 128
-typedef struct { char internal[NCCL_UNIQUE_ID_BYTES]; } ncclUniqueId;
+typedef struct {
+  char internal[NCCL_UNIQUE_ID_BYTES];
+} ncclUniqueId __attribute__((aligned(8)));
 
 /* Error type */
 typedef enum { ncclSuccess = 0,
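
The effect of the attribute can be checked at compile time. The following is a minimal sketch, not NCCL code: `demo_unique_id`, `demo_handle`, `demo_is_aligned` and `demo_load_handle` are hypothetical stand-ins, and the layout of `demo_handle` is an assumption chosen only so that it contains an 8-byte member. It assumes a C11 compiler that supports the GCC/Clang `__attribute__((aligned))` extension.

```c
#include <assert.h>  /* static_assert (C11) */
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define DEMO_UNIQUE_ID_BYTES 128

/* Hypothetical stand-in for struct ncclBootstrapHandle. The uint64_t
 * member is what gives the struct an 8-byte alignment requirement on
 * typical 64-bit ABIs. */
struct demo_handle {
  uint64_t magic;
  char addr[28];
};

/* Same shape as the patched ncclUniqueId: an opaque byte array whose
 * typedef is forced to 8-byte alignment. */
typedef struct {
  char internal[DEMO_UNIQUE_ID_BYTES];
} demo_unique_id __attribute__((aligned(8)));

/* Compile-time checks: the opaque type is at least as aligned as the
 * struct it will be cast to, and large enough to hold it. */
static_assert(_Alignof(demo_unique_id) >= _Alignof(struct demo_handle),
              "opaque id must be at least as aligned as the handle");
static_assert(sizeof(struct demo_handle) <= sizeof(demo_unique_id),
              "handle must fit inside the opaque id");

/* Returns 1 if p is a multiple of alignment, 0 otherwise. */
static int demo_is_aligned(const void *p, size_t alignment) {
  return ((uintptr_t)p % alignment) == 0;
}

/* Alternative that is valid for any source alignment: copy the bytes
 * into a properly aligned local instead of casting the pointer. */
static struct demo_handle demo_load_handle(const void *bytes) {
  struct demo_handle h;
  memcpy(&h, bytes, sizeof h);
  return h;
}
```

Without the attribute, a struct that contains only a `char` array has an alignment requirement of 1, so the compiler is free to place it at an address such as the `...46c` one in the UBSAN report. The `memcpy` helper shows a different way to avoid the misaligned access when the storage alignment cannot be changed; it is not what this commit does.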
