
NVIDIA AMMO documentation #1368

Closed
fedem96 opened this issue Mar 28, 2024 · 11 comments


fedem96 commented Mar 28, 2024

Is there any official documentation of NVIDIA AMMO toolkit? If so, where is it?

In particular, I'd be interested in documentation about:

  • implemented features
  • supported quantization techniques for each model type
  • changelog between versions

@Tracin @juney-nvidia

@juney-nvidia
Collaborator

@RalphMao do you have any comments on this ask? :)

@dmitrymailk

Same here. How can I find the source code of this library? I want to write a custom quantization pipeline for encoder-decoder models like T5.

@yao-matrix

same question here.

@puppetm4st3r

x2

@ChristianPala

Hi folks!
Are there updates on the docs?

@lix19937

+1

@RalphMao
Collaborator

Hi all, thank you for your interest. The AMMO toolkit has been renamed to "TensorRT Model Optimizer" and the documentation is available at https://nvidia.github.io/TensorRT-Model-Optimizer/. Examples for Model Optimizer are available at https://github.com/NVIDIA/TensorRT-Model-Optimizer?tab=readme-ov-file
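
For a quick start, here is a minimal post-training quantization sketch, assuming the `nvidia-modelopt` package and the `mtq.quantize` / `mtq.INT8_DEFAULT_CFG` names shown in the linked docs; the toy model and calibration data below are placeholders for a real network and dataset:

```python
import torch
import torch.nn as nn
import modelopt.torch.quantization as mtq  # ships in the nvidia-modelopt wheel

# Toy stand-in for a real network; replace with e.g. a Hugging Face model.
model = nn.Sequential(nn.Linear(16, 32), nn.ReLU(), nn.Linear(32, 8))
calib_data = [torch.randn(4, 16) for _ in range(8)]

# Forward loop the quantizer runs to collect calibration statistics.
def forward_loop(m):
    for batch in calib_data:
        m(batch)

# Post-training quantization with the default INT8 recipe; other recipes
# (FP8, INT4 AWQ, ...) are selected by swapping the config object.
model = mtq.quantize(model, mtq.INT8_DEFAULT_CFG, forward_loop)
```

The quantized module can then be exported through the usual TensorRT-LLM / Model Optimizer export paths described in the docs above.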

@RalphMao
Collaborator

> Same here. How can I find the source code of this library? I want to write a custom quantization pipeline for encoder-decoder models like T5.

The library is available on PyPI with its source visible ("source open" rather than open source). You can access most of the files, but some files don't have approval for open sourcing (yet).
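
As a sketch of how to browse those shipped sources after `pip install nvidia-modelopt` (package name per the docs above; most of the wheel's `.py` files are readable as-is):

```python
# Print where the installed modelopt package lives so its Python
# files can be inspected directly on disk.
import modelopt
print(modelopt.__path__)
```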

@ashwin-js

Hi @dmitrymailk, I am also exploring ways to run a 4-bit quantized encoder-decoder model in TensorRT-LLM. Were you able to make any progress on that front?


This issue is stale because it has been open 30 days with no activity. Remove stale label or comment or this will be closed in 15 days.

@nv-guomingz
Collaborator

Hi @fedem96, please feel free to reopen it if needed.
