feat: resolveModelFile method (#351)
* feat: `resolveModelFile` method
* feat: `hf:` URI support
* fix: improve GGUF metadata read times
* fix: hide internal type
* docs: document the `hf:` URI
giladgd authored Sep 29, 2024
1 parent 578e710 commit 4ee10a9
Showing 38 changed files with 1,150 additions and 614 deletions.
1 change: 1 addition & 0 deletions .vitepress/config/apiReferenceSidebar.ts
@@ -10,6 +10,7 @@ const categoryOrder = [

const functionsOrder = [
"getLlama",
"resolveModelFile",
"defineChatSessionFunction",
"createModelDownloader",
"resolveChatWrapper",
8 changes: 7 additions & 1 deletion docs/cli/pull.md
@@ -13,10 +13,16 @@ const commandDoc = docs.pull;
A wrapper around [`ipull`](https://www.npmjs.com/package/ipull)
to download model files as fast as possible with parallel connections and other optimizations.

Automatically handles split and binary-split model files, so only pass the URL to the first file of a model.
Automatically handles split and binary-split model files, so only pass the URI to the first file of a model.

If a file already exists and its size matches the expected size, it will not be downloaded again unless the `--override` flag is used.

The supported URI schemes are:
- **HTTP:** `https://`, `http://`
- **Hugging Face:** `hf:<user>/<model>/<file-path>#<branch>` (`#<branch>` is optional)

Learn more about using model URIs in the [Downloading Models guide](../guide/downloading-models.md#model-uris).

> To programmatically download a model file in your code, use [`createModelDownloader()`](../api/functions/createModelDownloader.md)
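
For instance, here's a minimal sketch of downloading a model with an `hf:` URI using [`createModelDownloader()`](../api/functions/createModelDownloader.md) (the URI below is a placeholder; substitute a real model URI):

```typescript
import {fileURLToPath} from "url";
import path from "path";
import {createModelDownloader} from "node-llama-cpp";

const __dirname = path.dirname(fileURLToPath(import.meta.url));

// "hf:user/model/model-file.gguf" is a placeholder URI; replace it with a real model URI
const downloader = await createModelDownloader({
    modelUri: "hf:user/model/model-file.gguf",
    dirPath: path.join(__dirname, "models")
});
const modelPath = await downloader.download();
```
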
## Usage
4 changes: 4 additions & 0 deletions docs/guide/choosing-a-model.md
@@ -164,3 +164,7 @@ npx --no node-llama-cpp pull --dir ./models <model-file-url>
>
> If the model file URL is of a single part of a multi-part model (for example, [this model](https://huggingface.co/bartowski/Meta-Llama-3-70B-Instruct-GGUF/blob/main/Meta-Llama-3-70B-Instruct-Q5_K_L.gguf/Meta-Llama-3-70B-Instruct-Q5_K_L-00001-of-00002.gguf)),
> it will also download all the other parts into the same directory.
::: tip
Consider using [model URIs](./downloading-models.md#model-uris) to download and load models.
:::
49 changes: 47 additions & 2 deletions docs/guide/downloading-models.md
@@ -69,16 +69,61 @@ This option is recommended for more advanced use cases, such as downloading mode
If you know the exact model URLs you're going to need every time in your project, it's better to download the models
automatically after running `npm install` as described in the [Using the CLI](#cli) section.

## Model URIs {#model-uris}
You can reference models using URIs instead of their full download URLs when using the CLI and relevant methods.

When downloading a model from a URI, the names of the downloaded model files are prefixed with an adaptation of that URI.

To reference a model from Hugging Face, you can use the scheme
<br/>
`hf:<user>/<model>/<file-path>#<branch>` (`#<branch>` is optional).

Here's an example usage of the Hugging Face URI scheme:
```
hf:mradermacher/Meta-Llama-3.1-8B-Instruct-GGUF/Meta-Llama-3.1-8B-Instruct.Q4_K_M.gguf
```

When using a URI to reference a model,
it's recommended [to add it to your `package.json` file](#cli) to ensure it's downloaded when running `npm install`,
and to resolve it using the [`resolveModelFile`](../api/functions/resolveModelFile.md) method to get the full path of the resolved model file.

Here's an example usage of the [`resolveModelFile`](../api/functions/resolveModelFile.md) method:
```typescript
import {fileURLToPath} from "url";
import path from "path";
import {getLlama, resolveModelFile} from "node-llama-cpp";

const __dirname = path.dirname(fileURLToPath(import.meta.url));
const modelsDirectory = path.join(__dirname, "models");

const modelPath = await resolveModelFile(
    "hf:user/model/model-file.gguf",
    modelsDirectory
);

const llama = await getLlama();
const model = await llama.loadModel({modelPath});
```

::: tip NOTE
If a corresponding model file is not found in the given directory, the model will automatically be downloaded.

When a file is being downloaded, the download progress is shown in the console by default.
<br/>
Set the [`cli`](../api/type-aliases/ResolveModelFileOptions#cli) option to `false` to disable this behavior.
:::
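
For example, here's a minimal sketch of resolving a model without console output. The `directory` field here is an assumption about the shape of [`ResolveModelFileOptions`](../api/type-aliases/ResolveModelFileOptions); check the API reference for the exact shape:

```typescript
import {fileURLToPath} from "url";
import path from "path";
import {resolveModelFile} from "node-llama-cpp";

const __dirname = path.dirname(fileURLToPath(import.meta.url));

// assumption: an options object with `directory` and `cli` fields;
// see the ResolveModelFileOptions API reference for the exact shape
const modelPath = await resolveModelFile("hf:user/model/model-file.gguf", {
    directory: path.join(__dirname, "models"),
    cli: false // don't print download progress to the console
});
```
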

## Downloading Gated Models From Hugging Face {#hf-token}
Some models on Hugging Face are "gated", meaning they require your manual consent before you can download them.

To download such models, after completing the consent form on the model card, you need to create a [Hugging Face token](https://huggingface.co/docs/hub/en/security-tokens) and set it in one of the following locations:
* Set an environment variable called `HF_TOKEN` with the token
* Set the content of the `~/.cache/huggingface/token` file to the token

Now, using the CLI or the [`createModelDownloader`](../api/functions/createModelDownloader.md) method will automatically use the token to download gated models.
Now, the CLI, the [`createModelDownloader`](../api/functions/createModelDownloader.md) method,
and the [`resolveModelFile`](../api/functions/resolveModelFile.md) method will automatically use the token to download gated models.

Alternatively, you can use the token in the [`tokens`](../api/type-aliases/ModelDownloaderOptions.md#tokens) option when using [`createModelDownloader`](../api/functions/createModelDownloader.md).
Alternatively, you can use the token in the [`tokens`](../api/type-aliases/ModelDownloaderOptions.md#tokens) option when using [`createModelDownloader`](../api/functions/createModelDownloader.md) or [`resolveModelFile`](../api/functions/resolveModelFile.md).
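
For example, here's a sketch of passing a token explicitly when downloading a gated model. The `{huggingFace: ...}` shape of the [`tokens`](../api/type-aliases/ModelDownloaderOptions.md#tokens) option is an assumption here; verify it against the API reference:

```typescript
import {createModelDownloader} from "node-llama-cpp";

// assumption: `tokens` accepts a Hugging Face token under a `huggingFace` key
const downloader = await createModelDownloader({
    modelUri: "hf:user/gated-model/model-file.gguf", // placeholder URI
    dirPath: "./models",
    tokens: {huggingFace: process.env.HF_TOKEN}
});
await downloader.download();
```
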

## Inspecting Remote Models
You can inspect the metadata of a remote model without downloading it by either using the [`inspect gguf` command](../cli/inspect/gguf.md) with a URL,
4 changes: 2 additions & 2 deletions docs/guide/index.md
@@ -51,9 +51,9 @@ npx --no node-llama-cpp inspect gpu
```

## Getting a Model File
We recommend you to get a GGUF model from either [Michael Radermacher on Hugging Face](https://huggingface.co/mradermacher) or [search HuggingFace directly](https://huggingface.co/models?library=gguf) for a GGUF model.
We recommend getting a GGUF model from either [Michael Radermacher on Hugging Face](https://huggingface.co/mradermacher) or by [searching HuggingFace directly](https://huggingface.co/models?library=gguf) for a GGUF model.

We recommend you to start by getting a small model that doesn't have a lot of parameters just to ensure everything works, so try downloading a `7B`/`8B` parameters model first (search for models with both `7B`/`8B` and `GGUF` in their name).
We recommend starting by getting a small model that doesn't have a lot of parameters just to ensure everything works, so try downloading a `7B`/`8B` parameters model first (search for models with both `7B`/`8B` and `GGUF` in their name).

For improved download speeds, you can use the [`pull`](../cli/pull.md) command to download a model:
```shell
npx --no node-llama-cpp pull --dir ./models <model-file-url>
```
4 changes: 2 additions & 2 deletions scripts/scaffoldElectronExampleForCiBuild.ts
@@ -40,8 +40,8 @@ await scaffoldProjectTemplate({
    directoryPath: resolvedPackageFolderPath,
    parameters: {
        [ProjectTemplateParameter.ProjectName]: projectName,
        [ProjectTemplateParameter.ModelUrl]: "https://github.com/withcatai/node-llama-cpp",
        [ProjectTemplateParameter.ModelFilename]: "model.gguf",
        [ProjectTemplateParameter.ModelUriOrUrl]: "https://github.com/withcatai/node-llama-cpp",
        [ProjectTemplateParameter.ModelUriOrFilename]: "model.gguf",
        [ProjectTemplateParameter.CurrentModuleVersion]: packageVersion
    }
});
16 changes: 9 additions & 7 deletions src/chatWrappers/Llama3_1ChatWrapper.ts
@@ -36,13 +36,7 @@ export class Llama3_1ChatWrapper extends ChatWrapper {
    /**
     * @param options
     */
    public constructor({
        cuttingKnowledgeDate = new Date("2023-12-01T00:00:00Z"),
        todayDate = () => new Date(),
        noToolInstructions = false,

        _specialTokensTextForPreamble = false
    }: {
    public constructor(options: {
        /**
         * Set to `null` to disable
         *
@@ -64,6 +58,14 @@
    } = {}) {
        super();

        const {
            cuttingKnowledgeDate = new Date("2023-12-01T00:00:00Z"),
            todayDate = () => new Date(),
            noToolInstructions = false,

            _specialTokensTextForPreamble = false
        } = options;

        this.cuttingKnowledgeDate = cuttingKnowledgeDate == null
            ? null
            : cuttingKnowledgeDate instanceof Function
4 changes: 2 additions & 2 deletions src/cli/commands/ChatCommand.ts
@@ -77,9 +77,9 @@ export const ChatCommand: CommandModule<object, ChatCommand> = {

        return yargs
            .option("modelPath", {
                alias: ["m", "model", "path", "url"],
                alias: ["m", "model", "path", "url", "uri"],
                type: "string",
                description: "Model file to use for the chat. Can be a path to a local file or a URL of a model file to download. Leave empty to choose from a list of recommended models"
                description: "Model file to use for the chat. Can be a path to a local file or a URI of a model file to download. Leave empty to choose from a list of recommended models"
            })
            .option("header", {
                alias: ["H"],
4 changes: 2 additions & 2 deletions src/cli/commands/CompleteCommand.ts
@@ -57,9 +57,9 @@ export const CompleteCommand: CommandModule<object, CompleteCommand> = {
    builder(yargs) {
        return yargs
            .option("modelPath", {
                alias: ["m", "model", "path", "url"],
                alias: ["m", "model", "path", "url", "uri"],
                type: "string",
                description: "Model file to use for the chat. Can be a path to a local file or a URL of a model file to download. Leave empty to choose from a list of recommended models"
                description: "Model file to use for the completion. Can be a path to a local file or a URI of a model file to download. Leave empty to choose from a list of recommended models"
            })
            .option("header", {
                alias: ["H"],
4 changes: 2 additions & 2 deletions src/cli/commands/InfillCommand.ts
@@ -59,9 +59,9 @@ export const InfillCommand: CommandModule<object, InfillCommand> = {
    builder(yargs) {
        return yargs
            .option("modelPath", {
                alias: ["m", "model", "path", "url"],
                alias: ["m", "model", "path", "url", "uri"],
                type: "string",
                description: "Model file to use for the chat. Can be a path to a local file or a URL of a model file to download. Leave empty to choose from a list of recommended models"
                description: "Model file to use for the infill. Can be a path to a local file or a URI of a model file to download. Leave empty to choose from a list of recommended models"
            })
            .option("header", {
                alias: ["H"],
59 changes: 40 additions & 19 deletions src/cli/commands/InitCommand.ts
@@ -6,7 +6,6 @@ import logSymbols from "log-symbols";
import validateNpmPackageName from "validate-npm-package-name";
import fs from "fs-extra";
import {consolePromptQuestion} from "../utils/consolePromptQuestion.js";
import {isUrl} from "../../utils/isUrl.js";
import {basicChooseFromListConsoleInteraction} from "../utils/basicChooseFromListConsoleInteraction.js";
import {splitAnsiToLines} from "../utils/splitAnsiToLines.js";
import {arrowChar} from "../../consts.js";
@@ -21,6 +20,7 @@ import {ProjectTemplateOption, projectTemplates} from "../projectTemplates.js";
import {getReadablePath} from "../utils/getReadablePath.js";
import {createModelDownloader} from "../../utils/createModelDownloader.js";
import {withCliCommandDescriptionDocsUrl} from "../utils/withCliCommandDescriptionDocsUrl.js";
import {resolveModelDestination} from "../../utils/resolveModelDestination.js";

type InitCommand = {
    name?: string,
@@ -93,7 +93,7 @@ export async function InitCommandHandler({name, template, gpu}: InitCommand) {
        logLevel: LlamaLogLevel.error
    });

    const modelUrl = await interactivelyAskForModel({
    const modelUri = await interactivelyAskForModel({
        llama,
        allowLocalModels: false,
        downloadIntent: false
@@ -113,29 +113,53 @@

        await fs.ensureDir(targetDirectory);

        const modelDownloader = await createModelDownloader({
            modelUrl,
            showCliProgress: false,
            deleteTempFileOnCancel: false
        });
        const modelEntrypointFilename = modelDownloader.entrypointFilename;
        async function resolveModelInfo() {
            const resolvedModelDestination = resolveModelDestination(modelUri);

            if (resolvedModelDestination.type === "uri")
                return {
                    modelUriOrUrl: resolvedModelDestination.uri,
                    modelUriOrFilename: resolvedModelDestination.uri,
                    cancelDownloader: async () => void 0
                };

            if (resolvedModelDestination.type === "file")
                throw new Error("Unexpected file model destination");

            const modelDownloader = await createModelDownloader({
                modelUri: resolvedModelDestination.url,
                showCliProgress: false,
                deleteTempFileOnCancel: false
            });
            const modelEntrypointFilename = modelDownloader.entrypointFilename;

            return {
                modelUriOrUrl: resolvedModelDestination.url,
                modelUriOrFilename: modelEntrypointFilename,
                async cancelDownloader() {
                    try {
                        await modelDownloader.cancel();
                    } catch (err) {
                        // do nothing
                    }
                }
            };
        }

        const {modelUriOrFilename, modelUriOrUrl, cancelDownloader} = await resolveModelInfo();

        await scaffoldProjectTemplate({
            template,
            directoryPath: targetDirectory,
            parameters: {
                [ProjectTemplateParameter.ProjectName]: projectName,
                [ProjectTemplateParameter.ModelUrl]: modelUrl,
                [ProjectTemplateParameter.ModelFilename]: modelEntrypointFilename,
                [ProjectTemplateParameter.ModelUriOrUrl]: modelUriOrUrl,
                [ProjectTemplateParameter.ModelUriOrFilename]: modelUriOrFilename,
                [ProjectTemplateParameter.CurrentModuleVersion]: await getModuleVersion()
            }
        });

        try {
            await modelDownloader.cancel();
        } catch (err) {
            // do nothing
        }
        await cancelDownloader();

        await new Promise((resolve) => setTimeout(resolve, Math.max(0, minScaffoldTime - (Date.now() - startTime))));
    });
@@ -213,10 +237,7 @@ async function askForProjectName(currentDirectory: string) {
            if (item == null)
                return "";

            if (isUrl(item, false))
                return logSymbols.success + " Entered project name " + chalk.blue(item);
            else
                return logSymbols.success + " Entered project name " + chalk.blue(item);
            return logSymbols.success + " Entered project name " + chalk.blue(item);
        }
    });

21 changes: 10 additions & 11 deletions src/cli/commands/PullCommand.ts
@@ -34,13 +34,13 @@ export const PullCommand: CommandModule<object, PullCommand> = {
        return yargs
            .option("urls", {
                type: "string",
                alias: ["url"],
                alias: ["url", "uris", "uri"],
                array: true,
                description: [
                    "A `.gguf` model URL to pull.",
                    !isInDocumentationMode && "Automatically handles split and binary-split model files, so only pass the URL to the first file of a model.",
                    "A `.gguf` model URI to pull.",
                    !isInDocumentationMode && "Automatically handles split and binary-split model files, so only pass the URI to the first file of a model.",
                    !isInDocumentationMode && "If a file already exists and its size matches the expected size, it will not be downloaded again unless the `--override` flag is used.",
                    "Pass multiple URLs to download multiple models at once."
                    "Pass multiple URIs to download multiple models at once."
                ].filter(Boolean).join(
                    isInDocumentationMode
                        ? "\n"
@@ -104,13 +104,13 @@
        const headers = resolveHeaderFlag(headerArg);

        if (urls.length === 0)
            throw new Error("At least one URL must be provided");
            throw new Error("At least one URI must be provided");
        else if (urls.length > 1 && filename != null)
            throw new Error("The `--filename` flag can only be used when a single URL is passed");
            throw new Error("The `--filename` flag can only be used when a single URI is passed");

        if (urls.length === 1) {
            const downloader = await createModelDownloader({
                modelUrl: urls[0]!,
                modelUri: urls[0]!,
                dirPath: directory,
                headers,
                showCliProgress: !noProgress,
@@ -155,14 +155,13 @@
            console.info(`Downloaded to ${chalk.yellow(getReadablePath(downloader.entrypointFilePath))}`);
        } else {
            const downloader = await combineModelDownloaders(
                urls.map((url) => createModelDownloader({
                    modelUrl: url,
                urls.map((uri) => createModelDownloader({
                    modelUri: uri,
                    dirPath: directory,
                    headers,
                    showCliProgress: false,
                    deleteTempFileOnCancel: noTempFile,
                    skipExisting: !override,
                    fileName: filename || undefined
                    skipExisting: !override
                })),
                {
                    showCliProgress: !noProgress,