Commit
feat: automatically adapt to current free VRAM state (#182)
* feat: read tensor info from `gguf` files
* feat: `inspect gguf` command
* feat: `inspect measure` command
* feat: `readGgufFileInfo` function
* feat: GGUF file info on `LlamaModel`
* feat: estimate VRAM usage of the model and context with certain options to adapt to current VRAM state and set great defaults for `gpuLayers` and `contextSize`. no manual configuration of those options is needed anymore to maximize performance
* feat: `JinjaTemplateChatWrapper`
* feat: use the `tokenizer.chat_template` header from the `gguf` file when available - use it to find a better specialized chat wrapper or use `JinjaTemplateChatWrapper` with it as a fallback
* feat: improve `resolveChatWrapper`
* feat: simplify generation CLI commands: `chat`, `complete`, `infill`
* feat: read GPU device names
* feat: get token type
* refactor: gguf
* test: separate gguf tests to model dependent and model independent tests
* test: switch to new vitest test signature
* fix: use the new `llama.cpp` CUDA flag
* fix: improve chat wrappers tokenization
* fix: bugs
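To illustrate the idea behind the VRAM estimation item above, here is a minimal sketch of picking `gpuLayers` and `contextSize` from the currently free VRAM. It is not the implementation in this commit: `fitToFreeVram`, its input fields, and the linear cost model (a fixed cost per offloaded layer and per context token, with the context always charged to VRAM) are all assumptions made for illustration.

```typescript
// Hypothetical cost model: none of these names come from the library.
interface VramEstimateInput {
    freeVramBytes: number;        // VRAM currently free on the GPU
    totalLayers: number;          // number of layers in the model
    bytesPerLayer: number;        // assumed VRAM cost of offloading one layer
    contextBytesPerToken: number; // assumed VRAM cost of one context token
    minContextSize: number;       // smallest context size considered acceptable
    maxContextSize: number;       // largest context size worth allocating
}

interface VramFit {
    gpuLayers: number;
    contextSize: number;
}

// Prefer offloading as many layers as possible, then give the context
// whatever VRAM is left, as long as it reaches the minimum context size.
function fitToFreeVram(input: VramEstimateInput): VramFit | null {
    for (let gpuLayers = input.totalLayers; gpuLayers >= 0; gpuLayers--) {
        const remaining = input.freeVramBytes - gpuLayers * input.bytesPerLayer;
        if (remaining < 0)
            continue; // these layers alone do not fit

        const affordableTokens = Math.floor(remaining / input.contextBytesPerToken);
        const contextSize = Math.min(input.maxContextSize, affordableTokens);
        if (contextSize >= input.minContextSize)
            return {gpuLayers, contextSize};
    }

    return null; // nothing fits, even with zero offloaded layers
}
```

The sketch favors layer count over context size; a real estimator would likely weigh the two against each other and use per-tensor sizes read from the model file rather than a flat per-layer cost.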
Showing 146 changed files with 10,767 additions and 2,632 deletions.
@@ -0,0 +1,17 @@
---
outline: deep
---
# `inspect gguf` command

<script setup lang="ts">
import {data as docs} from "../cli.data.js";
const commandDoc = docs.inspect.gguf;
</script>

{{commandDoc.description}}

## Usage
```shell-vue
{{commandDoc.usage}}
```
<div v-html="commandDoc.options"></div>
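
As background for what an `inspect gguf`-style command has to read first: a GGUF file begins with a fixed header holding the magic bytes `GGUF`, a format version, a tensor count, and a metadata key-value count. The sketch below parses only that header. It is not the `readGgufFileInfo` implementation from this commit; `parseGgufHeader` and `readUint64LE` are hypothetical names, and the code assumes a little-endian file of format version 2 or later, where both counts are 64-bit.

```typescript
interface GgufHeader {
    version: number;
    tensorCount: number;
    metadataKvCount: number;
}

// "GGUF" (0x47 0x47 0x55 0x46) read as a little-endian uint32
const GGUF_MAGIC_LE = 0x46554747;

// Reads a little-endian uint64 as a number; exact only below 2^53
function readUint64LE(view: DataView, offset: number): number {
    const low = view.getUint32(offset, true);
    const high = view.getUint32(offset + 4, true);
    return high * 0x100000000 + low;
}

function parseGgufHeader(bytes: Uint8Array): GgufHeader {
    // magic (4) + version (4) + tensor count (8) + metadata KV count (8)
    if (bytes.byteLength < 24)
        throw new Error("Buffer is too small to contain a GGUF header");

    const view = new DataView(bytes.buffer, bytes.byteOffset, bytes.byteLength);
    if (view.getUint32(0, true) !== GGUF_MAGIC_LE)
        throw new Error("Not a GGUF file: bad magic bytes");

    return {
        version: view.getUint32(4, true),
        tensorCount: readUint64LE(view, 8),
        metadataKvCount: readUint64LE(view, 16)
    };
}
```

The metadata key-value pairs and tensor infos follow the header and are variable-length, so reading them requires a streaming parser rather than fixed offsets.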
@@ -0,0 +1,17 @@
---
outline: deep
---
# `inspect gpu` command

<script setup lang="ts">
import {data as docs} from "../cli.data.js";
const commandDoc = docs.inspect.gpu;
</script>

{{commandDoc.description}}

## Usage
```shell-vue
{{commandDoc.usage}}
```
<div v-html="commandDoc.options"></div>
@@ -0,0 +1,17 @@
---
outline: deep
---
# `inspect measure` command

<script setup lang="ts">
import {data as docs} from "../cli.data.js";
const commandDoc = docs.inspect.measure;
</script>

{{commandDoc.description}}

## Usage
```shell-vue
{{commandDoc.usage}}
```
<div v-html="commandDoc.options"></div>