Excuse me, but when running inference on a single RTX 4090 with `python cli_demo_sat.py --from_pretrained cogcom-base-17b --local_tokenizer tokenizer --english --quant 4`, the output is CUDA out of memory. Does it need more GPUs, or do I need to add some arguments? Thank you!
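For rough context (my own back-of-the-envelope numbers, not measurements from CogCoM), here is why a 17B-parameter model can OOM on a 24 GB card even with `--quant 4`:

```python
# Rough VRAM estimate for the weights of a 17B-parameter model.
# Assumption: parameter count ~17e9; activations, the vision encoder,
# and KV cache add further overhead on top of these figures.
PARAMS = 17e9

for name, bytes_per_param in [("fp16", 2), ("int8", 1), ("int4", 0.5)]:
    gib = PARAMS * bytes_per_param / 2**30
    print(f"{name:>5}: ~{gib:.1f} GiB for weights alone")

# fp16: ~31.7 GiB -- does not fit in 24 GiB.
# int4:  ~7.9 GiB -- fits, but only if quantization happens during
# loading; materializing the full fp16 checkpoint on the GPU first
# would still OOM before quantization kicks in.
```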
Hi, thanks for your interest! I am currently trying to investigate this quantization problem.
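In case it helps narrow things down, a small diagnostic one could wrap around the loading step to see where memory peaks (`report_peak` is a hypothetical helper, not part of the repo):

```python
import torch

def report_peak(fn, *args, **kwargs):
    """Run fn and report peak CUDA memory, to locate where loading OOMs."""
    torch.cuda.reset_peak_memory_stats()
    result = fn(*args, **kwargs)
    peak_gib = torch.cuda.max_memory_allocated() / 2**30
    print(f"peak allocated: {peak_gib:.2f} GiB")
    return result

# Stand-in workload for illustration; replace the lambda with the
# actual model-loading call from cli_demo_sat.py.
if torch.cuda.is_available():
    report_peak(lambda: torch.empty(1024, 1024, 1024,
                                    dtype=torch.float16, device="cuda"))
```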