feat: resolveModelFile method (#351)
* feat: `resolveModelFile` method
* feat: `hf:` URI support
* fix: improve GGUF metadata read times
* fix: hide internal type
* docs: document the `hf:` URI
giladgd authored Sep 29, 2024
1 parent 578e710 commit 4ee10a9
Showing 38 changed files with 1,150 additions and 614 deletions.
1 change: 1 addition & 0 deletions .vitepress/config/apiReferenceSidebar.ts
@@ -10,6 +10,7 @@ const categoryOrder = [

const functionsOrder = [
"getLlama",
"resolveModelFile",
"defineChatSessionFunction",
"createModelDownloader",
"resolveChatWrapper",
8 changes: 7 additions & 1 deletion docs/cli/pull.md
@@ -13,10 +13,16 @@ const commandDoc = docs.pull;
A wrapper around [`ipull`](https://www.npmjs.com/package/ipull)
to download model files as fast as possible with parallel connections and other optimizations.

Automatically handles split and binary-split model files, so only pass the URL to the first file of a model.
Automatically handles split and binary-split model files, so only pass the URI to the first file of a model.

If a file already exists and its size matches the expected size, it will not be downloaded again unless the `--override` flag is used.

The supported URI schemes are:
- **HTTP:** `https://`, `http://`
- **Hugging Face:** `hf:<user>/<model>/<file-path>#<branch>` (`#<branch>` is optional)

Learn more about using model URIs in the [Downloading Models guide](../guide/downloading-models.md#model-uris).

> To programmatically download a model file in your code, use [`createModelDownloader()`](../api/functions/createModelDownloader.md)
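
For instance, here's a minimal sketch of downloading a model with an `hf:` URI using [`createModelDownloader()`](../api/functions/createModelDownloader.md) (the URI below is a placeholder; substitute a real model URI):

```typescript
import {fileURLToPath} from "url";
import path from "path";
import {createModelDownloader} from "node-llama-cpp";

const __dirname = path.dirname(fileURLToPath(import.meta.url));

// "hf:user/model/model-file.gguf" is a placeholder URI; replace it with a real model URI
const downloader = await createModelDownloader({
    modelUri: "hf:user/model/model-file.gguf",
    dirPath: path.join(__dirname, "models")
});
const modelPath = await downloader.download();
```
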
## Usage
4 changes: 4 additions & 0 deletions docs/guide/choosing-a-model.md
@@ -164,3 +164,7 @@ npx --no node-llama-cpp pull --dir ./models <model-file-url>
>
> If the model file URL is of a single part of a multi-part model (for example, [this model](https://huggingface.co/bartowski/Meta-Llama-3-70B-Instruct-GGUF/blob/main/Meta-Llama-3-70B-Instruct-Q5_K_L.gguf/Meta-Llama-3-70B-Instruct-Q5_K_L-00001-of-00002.gguf)),
> it will also download all the other parts into the same directory.
::: tip
Consider using [model URIs](./downloading-models.md#model-uris) to download and load models.
:::
49 changes: 47 additions & 2 deletions docs/guide/downloading-models.md
@@ -69,16 +69,61 @@ This option is recommended for more advanced use cases, such as downloading mode
If you know the exact model URLs you're going to need every time in your project, it's better to download the models
automatically after running `npm install` as described in the [Using the CLI](#cli) section.

## Model URIs {#model-uris}
You can reference models using URIs instead of their full download URLs when using the CLI and relevant methods.

When downloading a model from a URI, the names of the downloaded model files are prefixed with an adaptation of that URI.

To reference a model from Hugging Face, you can use the scheme
<br/>
`hf:<user>/<model>/<file-path>#<branch>` (`#<branch>` is optional).

Here's an example usage of the Hugging Face URI scheme:
```
hf:mradermacher/Meta-Llama-3.1-8B-Instruct-GGUF/Meta-Llama-3.1-8B-Instruct.Q4_K_M.gguf
```

When using a URI to reference a model,
it's recommended [to add it to your `package.json` file](#cli) to ensure it's downloaded when running `npm install`,
and to resolve it using the [`resolveModelFile`](../api/functions/resolveModelFile.md) method to get the full path of the resolved model file.

Here's an example usage of the [`resolveModelFile`](../api/functions/resolveModelFile.md) method:
```typescript
import {fileURLToPath} from "url";
import path from "path";
import {getLlama, resolveModelFile} from "node-llama-cpp";

const __dirname = path.dirname(fileURLToPath(import.meta.url));
const modelsDirectory = path.join(__dirname, "models");

const modelPath = await resolveModelFile(
    "hf:user/model/model-file.gguf",
    modelsDirectory
);

const llama = await getLlama();
const model = await llama.loadModel({modelPath});
```

::: tip NOTE
If a corresponding model file is not found in the given directory, the model will automatically be downloaded.

When a file is being downloaded, the download progress is shown in the console by default.
<br/>
Set the [`cli`](../api/type-aliases/ResolveModelFileOptions#cli) option to `false` to disable this behavior.
:::
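
For example, here's a minimal sketch of resolving a model without console output. The `directory` field here is an assumption about the shape of [`ResolveModelFileOptions`](../api/type-aliases/ResolveModelFileOptions); check the API reference for the exact shape:

```typescript
import {fileURLToPath} from "url";
import path from "path";
import {resolveModelFile} from "node-llama-cpp";

const __dirname = path.dirname(fileURLToPath(import.meta.url));

// assumption: an options object with `directory` and `cli` fields;
// see the ResolveModelFileOptions API reference for the exact shape
const modelPath = await resolveModelFile("hf:user/model/model-file.gguf", {
    directory: path.join(__dirname, "models"),
    cli: false // don't print download progress to the console
});
```
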

## Downloading Gated Models From Hugging Face {#hf-token}
Some models on Hugging Face are "gated", meaning they require your manual consent before you can download them.

To download such models, after completing the consent form on the model card, you need to create a [Hugging Face token](https://huggingface.co/docs/hub/en/security-tokens) and set it in one of the following locations:
* Set an environment variable called `HF_TOKEN` with the token
* Set the content of the `~/.cache/huggingface/token` file to the token

Now, using the CLI or the [`createModelDownloader`](../api/functions/createModelDownloader.md) method will automatically use the token to download gated models.
Now, the CLI, the [`createModelDownloader`](../api/functions/createModelDownloader.md) method,
and the [`resolveModelFile`](../api/functions/resolveModelFile.md) method will automatically use the token to download gated models.

Alternatively, you can use the token in the [`tokens`](../api/type-aliases/ModelDownloaderOptions.md#tokens) option when using [`createModelDownloader`](../api/functions/createModelDownloader.md).
Alternatively, you can use the token in the [`tokens`](../api/type-aliases/ModelDownloaderOptions.md#tokens) option when using [`createModelDownloader`](../api/functions/createModelDownloader.md) or [`resolveModelFile`](../api/functions/resolveModelFile.md).
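
For example, here's a sketch of passing a token explicitly when downloading a gated model. The `{huggingFace: ...}` shape of the [`tokens`](../api/type-aliases/ModelDownloaderOptions.md#tokens) option is an assumption here; verify it against the API reference:

```typescript
import {createModelDownloader} from "node-llama-cpp";

// assumption: `tokens` accepts a Hugging Face token under a `huggingFace` key
const downloader = await createModelDownloader({
    modelUri: "hf:user/gated-model/model-file.gguf", // placeholder URI
    dirPath: "./models",
    tokens: {huggingFace: process.env.HF_TOKEN}
});
await downloader.download();
```
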

## Inspecting Remote Models
You can inspect the metadata of a remote model without downloading it by either using the [`inspect gguf` command](../cli/inspect/gguf.md) with a URL,
4 changes: 2 additions & 2 deletions docs/guide/index.md
@@ -51,9 +51,9 @@ npx --no node-llama-cpp inspect gpu
```

## Getting a Model File
We recommend you to get a GGUF model from either [Michael Radermacher on Hugging Face](https://huggingface.co/mradermacher) or [search HuggingFace directly](https://huggingface.co/models?library=gguf) for a GGUF model.
We recommend getting a GGUF model from either [Michael Radermacher on Hugging Face](https://huggingface.co/mradermacher) or by [searching HuggingFace directly](https://huggingface.co/models?library=gguf) for a GGUF model.

We recommend you to start by getting a small model that doesn't have a lot of parameters just to ensure everything works, so try downloading a `7B`/`8B` parameters model first (search for models with both `7B`/`8B` and `GGUF` in their name).
We recommend starting by getting a small model that doesn't have a lot of parameters just to ensure everything works, so try downloading a `7B`/`8B` parameters model first (search for models with both `7B`/`8B` and `GGUF` in their name).

For improved download speeds, you can use the [`pull`](../cli/pull.md) command to download a model:
```shell
npx --no node-llama-cpp pull --dir ./models <model-file-url>
```
4 changes: 2 additions & 2 deletions scripts/scaffoldElectronExampleForCiBuild.ts
@@ -40,8 +40,8 @@ await scaffoldProjectTemplate({
    directoryPath: resolvedPackageFolderPath,
    parameters: {
        [ProjectTemplateParameter.ProjectName]: projectName,
        [ProjectTemplateParameter.ModelUrl]: "https://github.com/withcatai/node-llama-cpp",
        [ProjectTemplateParameter.ModelFilename]: "model.gguf",
        [ProjectTemplateParameter.ModelUriOrUrl]: "https://github.com/withcatai/node-llama-cpp",
        [ProjectTemplateParameter.ModelUriOrFilename]: "model.gguf",
        [ProjectTemplateParameter.CurrentModuleVersion]: packageVersion
    }
});
16 changes: 9 additions & 7 deletions src/chatWrappers/Llama3_1ChatWrapper.ts
@@ -36,13 +36,7 @@ export class Llama3_1ChatWrapper extends ChatWrapper {
    /**
     * @param options
     */
    public constructor({
        cuttingKnowledgeDate = new Date("2023-12-01T00:00:00Z"),
        todayDate = () => new Date(),
        noToolInstructions = false,

        _specialTokensTextForPreamble = false
    }: {
    public constructor(options: {
        /**
         * Set to `null` to disable
         *
@@ -64,6 +58,14 @@
    } = {}) {
        super();

        const {
            cuttingKnowledgeDate = new Date("2023-12-01T00:00:00Z"),
            todayDate = () => new Date(),
            noToolInstructions = false,

            _specialTokensTextForPreamble = false
        } = options;

        this.cuttingKnowledgeDate = cuttingKnowledgeDate == null
            ? null
            : cuttingKnowledgeDate instanceof Function
4 changes: 2 additions & 2 deletions src/cli/commands/ChatCommand.ts
@@ -77,9 +77,9 @@ export const ChatCommand: CommandModule<object, ChatCommand> = {

        return yargs
            .option("modelPath", {
                alias: ["m", "model", "path", "url"],
                alias: ["m", "model", "path", "url", "uri"],
                type: "string",
                description: "Model file to use for the chat. Can be a path to a local file or a URL of a model file to download. Leave empty to choose from a list of recommended models"
                description: "Model file to use for the chat. Can be a path to a local file or a URI of a model file to download. Leave empty to choose from a list of recommended models"
            })
            .option("header", {
                alias: ["H"],
4 changes: 2 additions & 2 deletions src/cli/commands/CompleteCommand.ts
@@ -57,9 +57,9 @@ export const CompleteCommand: CommandModule<object, CompleteCommand> = {
    builder(yargs) {
        return yargs
            .option("modelPath", {
                alias: ["m", "model", "path", "url"],
                alias: ["m", "model", "path", "url", "uri"],
                type: "string",
                description: "Model file to use for the chat. Can be a path to a local file or a URL of a model file to download. Leave empty to choose from a list of recommended models"
                description: "Model file to use for the completion. Can be a path to a local file or a URI of a model file to download. Leave empty to choose from a list of recommended models"
            })
            .option("header", {
                alias: ["H"],
4 changes: 2 additions & 2 deletions src/cli/commands/InfillCommand.ts
@@ -59,9 +59,9 @@ export const InfillCommand: CommandModule<object, InfillCommand> = {
    builder(yargs) {
        return yargs
            .option("modelPath", {
                alias: ["m", "model", "path", "url"],
                alias: ["m", "model", "path", "url", "uri"],
                type: "string",
                description: "Model file to use for the chat. Can be a path to a local file or a URL of a model file to download. Leave empty to choose from a list of recommended models"
                description: "Model file to use for the infill. Can be a path to a local file or a URI of a model file to download. Leave empty to choose from a list of recommended models"
            })
            .option("header", {
                alias: ["H"],
59 changes: 40 additions & 19 deletions src/cli/commands/InitCommand.ts
@@ -6,7 +6,6 @@ import logSymbols from "log-symbols";
import validateNpmPackageName from "validate-npm-package-name";
import fs from "fs-extra";
import {consolePromptQuestion} from "../utils/consolePromptQuestion.js";
import {isUrl} from "../../utils/isUrl.js";
import {basicChooseFromListConsoleInteraction} from "../utils/basicChooseFromListConsoleInteraction.js";
import {splitAnsiToLines} from "../utils/splitAnsiToLines.js";
import {arrowChar} from "../../consts.js";
@@ -21,6 +20,7 @@ import {ProjectTemplateOption, projectTemplates} from "../projectTemplates.js";
import {getReadablePath} from "../utils/getReadablePath.js";
import {createModelDownloader} from "../../utils/createModelDownloader.js";
import {withCliCommandDescriptionDocsUrl} from "../utils/withCliCommandDescriptionDocsUrl.js";
import {resolveModelDestination} from "../../utils/resolveModelDestination.js";

type InitCommand = {
    name?: string,
@@ -93,7 +93,7 @@ export async function InitCommandHandler({name, template, gpu}: InitCommand) {
        logLevel: LlamaLogLevel.error
    });

    const modelUrl = await interactivelyAskForModel({
    const modelUri = await interactivelyAskForModel({
        llama,
        allowLocalModels: false,
        downloadIntent: false
@@ -113,29 +113,53 @@

        await fs.ensureDir(targetDirectory);

        const modelDownloader = await createModelDownloader({
            modelUrl,
            showCliProgress: false,
            deleteTempFileOnCancel: false
        });
        const modelEntrypointFilename = modelDownloader.entrypointFilename;
        async function resolveModelInfo() {
            const resolvedModelDestination = resolveModelDestination(modelUri);

            if (resolvedModelDestination.type === "uri")
                return {
                    modelUriOrUrl: resolvedModelDestination.uri,
                    modelUriOrFilename: resolvedModelDestination.uri,
                    cancelDownloader: async () => void 0
                };

            if (resolvedModelDestination.type === "file")
                throw new Error("Unexpected file model destination");

            const modelDownloader = await createModelDownloader({
                modelUri: resolvedModelDestination.url,
                showCliProgress: false,
                deleteTempFileOnCancel: false
            });
            const modelEntrypointFilename = modelDownloader.entrypointFilename;

            return {
                modelUriOrUrl: resolvedModelDestination.url,
                modelUriOrFilename: modelEntrypointFilename,
                async cancelDownloader() {
                    try {
                        await modelDownloader.cancel();
                    } catch (err) {
                        // do nothing
                    }
                }
            };
        }

        const {modelUriOrFilename, modelUriOrUrl, cancelDownloader} = await resolveModelInfo();

        await scaffoldProjectTemplate({
            template,
            directoryPath: targetDirectory,
            parameters: {
                [ProjectTemplateParameter.ProjectName]: projectName,
                [ProjectTemplateParameter.ModelUrl]: modelUrl,
                [ProjectTemplateParameter.ModelFilename]: modelEntrypointFilename,
                [ProjectTemplateParameter.ModelUriOrUrl]: modelUriOrUrl,
                [ProjectTemplateParameter.ModelUriOrFilename]: modelUriOrFilename,
                [ProjectTemplateParameter.CurrentModuleVersion]: await getModuleVersion()
            }
        });

        try {
            await modelDownloader.cancel();
        } catch (err) {
            // do nothing
        }
        await cancelDownloader();

        await new Promise((resolve) => setTimeout(resolve, Math.max(0, minScaffoldTime - (Date.now() - startTime))));
    });
@@ -213,10 +237,7 @@ async function askForProjectName(currentDirectory: string) {
            if (item == null)
                return "";

            if (isUrl(item, false))
                return logSymbols.success + " Entered project name " + chalk.blue(item);
            else
                return logSymbols.success + " Entered project name " + chalk.blue(item);
            return logSymbols.success + " Entered project name " + chalk.blue(item);
        }
    });

21 changes: 10 additions & 11 deletions src/cli/commands/PullCommand.ts
@@ -34,13 +34,13 @@ export const PullCommand: CommandModule<object, PullCommand> = {
        return yargs
            .option("urls", {
                type: "string",
                alias: ["url"],
                alias: ["url", "uris", "uri"],
                array: true,
                description: [
                    "A `.gguf` model URL to pull.",
                    !isInDocumentationMode && "Automatically handles split and binary-split model files, so only pass the URL to the first file of a model.",
                    "A `.gguf` model URI to pull.",
                    !isInDocumentationMode && "Automatically handles split and binary-split model files, so only pass the URI to the first file of a model.",
                    !isInDocumentationMode && "If a file already exists and its size matches the expected size, it will not be downloaded again unless the `--override` flag is used.",
                    "Pass multiple URLs to download multiple models at once."
                    "Pass multiple URIs to download multiple models at once."
                ].filter(Boolean).join(
                    isInDocumentationMode
                        ? "\n"
@@ -104,13 +104,13 @@
        const headers = resolveHeaderFlag(headerArg);

        if (urls.length === 0)
            throw new Error("At least one URL must be provided");
            throw new Error("At least one URI must be provided");
        else if (urls.length > 1 && filename != null)
            throw new Error("The `--filename` flag can only be used when a single URL is passed");
            throw new Error("The `--filename` flag can only be used when a single URI is passed");

        if (urls.length === 1) {
            const downloader = await createModelDownloader({
                modelUrl: urls[0]!,
                modelUri: urls[0]!,
                dirPath: directory,
                headers,
                showCliProgress: !noProgress,
@@ -155,14 +155,13 @@
            console.info(`Downloaded to ${chalk.yellow(getReadablePath(downloader.entrypointFilePath))}`);
        } else {
            const downloader = await combineModelDownloaders(
                urls.map((url) => createModelDownloader({
                    modelUrl: url,
                urls.map((uri) => createModelDownloader({
                    modelUri: uri,
                    dirPath: directory,
                    headers,
                    showCliProgress: false,
                    deleteTempFileOnCancel: noTempFile,
                    skipExisting: !override,
                    fileName: filename || undefined
                    skipExisting: !override
                })),
                {
                    showCliProgress: !noProgress,