NVIDIA AMMO documentation #1368
Comments
@RalphMao do you have any comments on this ask? :)
Same here. How can I find the source code of this library? I want to write a custom quantization pipeline for encoder-decoder models like T5.
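To make the ask concrete, here is a minimal, hypothetical sketch (pure Python, not the AMMO / Model Optimizer API) of the core step such a custom pipeline would apply to each weight tensor: symmetric per-tensor int8 fake quantization with a calibrated scale.

```python
# Hypothetical illustration, NOT the AMMO/Model Optimizer API:
# symmetric per-tensor int8 quantize/dequantize for a weight tensor.

def quantize_int8(values):
    """Quantize floats to int8 codes with a symmetric per-tensor scale."""
    amax = max(abs(v) for v in values)      # calibration: absolute max
    scale = amax / 127.0 if amax else 1.0   # map [-amax, amax] -> [-127, 127]
    q = [max(-127, min(127, round(v / scale))) for v in values]
    return q, scale

def dequantize_int8(q, scale):
    """Recover approximate float values from int8 codes."""
    return [v * scale for v in q]

weights = [0.5, -1.0, 0.25, 0.75]
q, scale = quantize_int8(weights)
approx = dequantize_int8(q, scale)
```

A real pipeline for T5 would calibrate `amax` over activation statistics from a few forward passes and attach the scales to the exported checkpoint, but the round-and-clip step above is the same.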
Same question here.
x2
Hi folks!
+1
Hi all, thank you for your interest. The AMMO toolkit has been renamed to "TensorRT Model Optimizer", and the documentation is available at https://nvidia.github.io/TensorRT-Model-Optimizer/ . Examples related to Model Optimizer are available at https://github.com/NVIDIA/TensorRT-Model-Optimizer?tab=readme-ov-file
The library is available on PyPI with its source open (rather than fully open source). You can access most of the files, but some files don't have approval for open-source release (yet).
Hi @dmitrymailk , I am also exploring ways to run a 4-bit quantized encoder-decoder model in TensorRT-LLM. Were you able to make any progress on that front?
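For anyone new to the 4-bit case: the sketch below (pure Python, not the TensorRT-LLM API) shows what "4-bit quantized weights" means at the storage level, with two signed 4-bit codes packed per byte, which is why int4 halves weight memory relative to int8.

```python
# Hypothetical sketch, NOT the TensorRT-LLM API: symmetric 4-bit
# quantization plus nibble packing (two weights per byte).

def quantize_int4(values):
    """Map floats to signed 4-bit codes in [-7, 7] with one shared scale."""
    amax = max(abs(v) for v in values)
    scale = amax / 7.0 if amax else 1.0
    return [max(-7, min(7, round(v / scale))) for v in values], scale

def pack_int4(codes):
    """Pack pairs of signed 4-bit codes into bytes (two's-complement nibbles)."""
    assert len(codes) % 2 == 0
    return bytes((codes[i] & 0xF) | ((codes[i + 1] & 0xF) << 4)
                 for i in range(0, len(codes), 2))

def unpack_int4(packed):
    """Recover signed 4-bit codes from packed bytes."""
    out = []
    for b in packed:
        for nib in (b & 0xF, b >> 4):
            out.append(nib - 16 if nib >= 8 else nib)
    return out

codes, scale = quantize_int4([0.7, -0.7, 0.1, 0.35])
packed = pack_int4(codes)  # 4 weights stored in 2 bytes
```

Production kernels use per-group scales and fused dequantization inside the GEMM, but the packed layout is the same idea.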
This issue is stale because it has been open 30 days with no activity. Remove the stale label or comment, or this will be closed in 15 days.
Hi @fedem96 , please feel free to reopen it if needed.
Is there any official documentation of NVIDIA AMMO toolkit? If so, where is it?
In particular, I'd be interested in documentation about:
@Tracin @juney-nvidia