memory leak #9

Open
zcswdt opened this issue Dec 5, 2023 · 11 comments

Comments

zcswdt commented Dec 5, 2023

Can you re-upload the code? Thank you very much.

Entongsu commented Mar 7, 2024

I fixed it by removing this line.

zcswdt commented Mar 7, 2024

> I fixed it by removing this line.

That's impressive. I tried for a long time but couldn't solve it. After deleting that line, have you verified that the code still functions correctly?

zcswdt commented Mar 7, 2024

> I fixed it by removing this line.

I just tried it, but there is still a memory leak. Have you tested it?

ray.exceptions.OutOfMemoryError: Task was killed due to the node running low on memory.
Memory on the node (IP: 100.79.61.171, ID: dd64af687cb1299d0339f770b1cf8002a43d25240e4f9a1aea3abe31) where the task (actor ID: 198f27dd8eedd8a529db2fdc01000000, name=SimEnv.init, pid=25063, memory used=2.28GB) was running was 29.74GB / 31.30GB (0.950172), which exceeds the memory usage threshold of 0.95. Ray killed this worker (ID: cfe2287dcee7a2bc8a0bef249ec702f3351b919a353a178bc04fa50b) because it was the most recently scheduled task; to see more information about memory usage on this node, use `ray logs raylet.out -ip 100.79.61.171`. To see the logs of the worker, use `ray logs worker-cfe2287dcee7a2bc8a0bef249ec702f3351b919a353a178bc04fa50b*out -ip 100.79.61.171`. Top 10 memory users:
PID MEM(GB) COMMAND
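
To make the numbers in the log above easier to read, here is a minimal sketch that recomputes the usage fraction and compares it with the 0.95 threshold. The function names `usage_fraction` and `exceeds_threshold` are hypothetical and used only for illustration; the values are copied from the error message. The log reports 0.950172 rather than the value computed here, presumably because Ray works from unrounded byte counts.

```python
def usage_fraction(used_gb: float, total_gb: float) -> float:
    """Return the fraction of node memory currently in use."""
    return used_gb / total_gb


def exceeds_threshold(used_gb: float, total_gb: float, threshold: float = 0.95) -> bool:
    """Return True when usage is above the threshold quoted in the log."""
    return usage_fraction(used_gb, total_gb) > threshold


# Values taken from the error message: 29.74GB used out of 31.30GB.
frac = usage_fraction(29.74, 31.30)
print(f"node usage fraction: {frac:.4f}")
print("above threshold:", exceeds_threshold(29.74, 31.30))
```

Note that, according to the same message, the killed task itself was using only 2.28GB and was chosen because it was the most recently scheduled task, so the killed worker is not necessarily the process that is leaking.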

Entongsu commented Mar 7, 2024

My code works now and can be run directly, although there are many OpenGL warnings. I modified the code so that it does not use any Ray-related functions when running.

zcswdt commented Mar 7, 2024

> My code works now and can be run directly, although there are many OpenGL warnings. I modified the code so that it does not use any Ray-related functions when running.

Can you explain why you deleted this line? Can you also tell me your machine configuration? Please run nvidia-smi, nvcc -V, and free -h, and include your Ubuntu version. Thanks.
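
For anyone who wants to attach the same information to a report, the following is a minimal sketch that runs the commands requested above and gathers their output into one text block. The helper names `run_command` and `collect_report` are hypothetical, and `lsb_release -a` is an assumed way to obtain the Ubuntu version; it may not be installed on every system. Commands that are missing are reported rather than raising an error.

```python
import shutil
import subprocess

# Commands requested in the comment above; lsb_release is an assumption
# for reporting the Ubuntu version and may not be installed everywhere.
COMMANDS = [
    ["nvidia-smi"],
    ["nvcc", "-V"],
    ["free", "-h"],
    ["lsb_release", "-a"],
]


def run_command(cmd, timeout=30):
    """Run one command and return its combined stdout/stderr, or a short note on failure."""
    if shutil.which(cmd[0]) is None:
        return f"{cmd[0]}: not found on PATH"
    try:
        result = subprocess.run(
            cmd,
            stdout=subprocess.PIPE,
            stderr=subprocess.STDOUT,
            text=True,
            timeout=timeout,
        )
    except subprocess.TimeoutExpired:
        return f"{cmd[0]}: timed out after {timeout}s"
    except OSError as exc:
        return f"{cmd[0]}: failed to start ({exc})"
    return result.stdout.strip()


def collect_report(commands=COMMANDS):
    """Return one text block with each command line followed by its output."""
    sections = []
    for cmd in commands:
        sections.append("$ " + " ".join(cmd))
        sections.append(run_command(cmd))
    return "\n".join(sections)


if __name__ == "__main__":
    print(collect_report())
```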

Entongsu commented Mar 7, 2024

I have fixed the warnings now (they were caused by my own mistakes), and the code runs after removing only this line. I removed it because I was getting an assertion error on this line.

+-----------------------------------------------------------------------------+
| NVIDIA-SMI 520.61.05    Driver Version: 520.61.05    CUDA Version: 11.8     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  NVIDIA GeForce ...  On   | 00000000:2B:00.0  On |                  N/A |
|  0%   51C    P8    38W / 420W |   1481MiB / 24576MiB |     54%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2022 NVIDIA Corporation
Built on Wed_Sep_21_10:33:58_PDT_2022
Cuda compilation tools, release 11.8, V11.8.89
Build cuda_11.8.r11.8/compiler.31833905_0
              total        used        free      shared  buff/cache   available
Mem:           62Gi        10Gi        15Gi       1.0Gi        36Gi        50Gi
Swap:          47Mi        47Mi       0.0Ki

zcswdt commented Mar 7, 2024

> I have fixed the warnings now (they were caused by my own mistakes), and the code runs after removing only this line. I removed it because I was getting an assertion error on this line.

Thank you very much for providing this. Can you explain why deleting this line of code solves the problem?

Entongsu commented Mar 7, 2024

Because I got an assertion error from this line. I tried removing it and found that the code works.

zcswdt commented Mar 7, 2024

> Because I got an assertion error from this line. I tried removing it and found that the code works.

Got it. Try training for half an hour and see whether memory leaks. Maybe the driver and CUDA versions you and I use are different. My CUDA version is 10.0.
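
One way to carry out the half-hour check suggested above is to sample the training process's resident memory at intervals and look at the trend. The sketch below is Linux-only (it reads `/proc`) and uses only the standard library; `rss_mib`, `growth_per_sample`, `monitor`, and the `train_step` callable are hypothetical names standing in for the real training loop, which is not shown in this thread.

```python
import os


def rss_mib(pid=None):
    """Return the resident set size of a process in MiB, read from /proc (Linux only)."""
    pid = os.getpid() if pid is None else pid
    with open(f"/proc/{pid}/status") as f:
        for line in f:
            if line.startswith("VmRSS:"):
                return int(line.split()[1]) / 1024  # the kernel reports this value in kB
    raise RuntimeError("VmRSS not found in /proc status file")


def growth_per_sample(samples):
    """Least-squares slope of the samples, in MiB per sample."""
    n = len(samples)
    if n < 2:
        return 0.0
    mean_x = (n - 1) / 2
    mean_y = sum(samples) / n
    num = sum((i - mean_x) * (y - mean_y) for i, y in enumerate(samples))
    den = sum((i - mean_x) ** 2 for i in range(n))
    return num / den


def monitor(train_step, steps, every=100):
    """Call train_step `steps` times, recording RSS every `every` steps."""
    samples = []
    for step in range(steps):
        train_step()  # hypothetical: one iteration of the real training loop
        if step % every == 0:
            samples.append(rss_mib())
    return samples
```

A slope that stays clearly positive over a long run suggests a leak, while a curve that rises and then flattens is more consistent with normal warm-up allocation. Because this measures resident memory of the whole process, it also captures growth in native libraries, which Python-level tools such as `tracemalloc` would not see.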

xiaolijz commented

> Can you re-upload the code? Thank you very much.

Hello, have you successfully run this project? I am encountering some of the same problems you faced before. Could you please give me some advice? Thank you, and I look forward to your reply.

xiaolijz commented Dec 5, 2024

> Because I got an assertion error from this line. I tried removing it and found that the code works.

Hello, did you run the author's demo successfully?
