[pull] master from kserve:master #357
* docs: Move Alibi explainer to docs Signed-off-by: Yuan Tang <[email protected]> * Empty-Commit Signed-off-by: Yuan Tang <[email protected]> * fix test Signed-off-by: Yuan Tang <[email protected]> * Empty-Commit Signed-off-by: Yuan Tang <[email protected]> --------- Signed-off-by: Yuan Tang <[email protected]>
* build: Add flake8 and black to pre-commit hooks Signed-off-by: Yuan Tang <[email protected]> * fix path Signed-off-by: Yuan Tang <[email protected]> * pass config Signed-off-by: Yuan Tang <[email protected]> * fix flake8 Signed-off-by: Yuan Tang <[email protected]> --------- Signed-off-by: Yuan Tang <[email protected]>
…#3576) * Set writable cache folder to avoid permission issue. Fixes #3562 Signed-off-by: Yuan Tang <[email protected]> * Update huggingface_server.Dockerfile Signed-off-by: Yuan Tang <[email protected]> * Empty-Commit Signed-off-by: Yuan Tang <[email protected]> --------- Signed-off-by: Yuan Tang <[email protected]>
…#3596) chore: Fixes [CVE-2023-45288](https://www.cve.org/CVERecord?id=CVE-2023-45288) Signed-off-by: Spolti <[email protected]>
* Add OpenAIModel support to model repository. Signed-off-by: grandbora <[email protected]> * Allow model server to register an openai model Signed-off-by: grandbora <[email protected]> * address comments Signed-off-by: grandbora <[email protected]> * fix format Signed-off-by: grandbora <[email protected]> * make black happy Signed-off-by: grandbora <[email protected]> * Python 3.9 can not do isinstance on union type Signed-off-by: grandbora <[email protected]> * add comment Signed-off-by: grandbora <[email protected]> * Use a base model Signed-off-by: grandbora <[email protected]> * fix formatting Signed-off-by: grandbora <[email protected]> * Fix case Co-authored-by: Dan Sun <[email protected]> Signed-off-by: Bora <[email protected]> * Fix case Signed-off-by: grandbora <[email protected]> --------- Signed-off-by: grandbora <[email protected]> Signed-off-by: Bora <[email protected]> Co-authored-by: Dan Sun <[email protected]>
* updated xgboost to support json and ubj models Signed-off-by: Andrews Arokiam <[email protected]> * rename bst_model dir Signed-off-by: Andrews Arokiam <[email protected]> * bug fix Signed-off-by: Andrews Arokiam <[email protected]> * black format Signed-off-by: Andrews Arokiam <[email protected]> * black formatter Signed-off-by: Andrews Arokiam <[email protected]> * bug fix Signed-off-by: Andrews Arokiam <[email protected]> --------- Signed-off-by: Andrews Arokiam <[email protected]>
* google.golang.org/protobuf version upgrade Signed-off-by: Andrews Arokiam <[email protected]> * version upgrade Signed-off-by: Andrews Arokiam <[email protected]> --------- Signed-off-by: Andrews Arokiam <[email protected]>
* VLLM support for OpenAI Completions in HF server Signed-off-by: Gavrish Prabhu <[email protected]> * remove unwanted imports Signed-off-by: Gavrish Prabhu <[email protected]> * minor fixes Signed-off-by: Gavrish Prabhu <[email protected]> * fix lint Signed-off-by: Gavrish Prabhu <[email protected]> * fix verify license Signed-off-by: Gavrish Prabhu <[email protected]> * fix verify license Signed-off-by: Gavrish Prabhu <[email protected]> * Change base model Signed-off-by: Gavrish Prabhu <[email protected]> * fix linter Signed-off-by: Gavrish Prabhu <[email protected]> * fix tests Signed-off-by: Gavrish Prabhu <[email protected]> * Fix vllm Base and Chat Completion template Signed-off-by: Gavrish Prabhu <[email protected]> * Include Readme Signed-off-by: Gavrish Prabhu <[email protected]> * ignore file from linter and generate Signed-off-by: Gavrish Prabhu <[email protected]> * ignore file from linter and generate Signed-off-by: Gavrish Prabhu <[email protected]> * add codegen license Signed-off-by: Gavrish Prabhu <[email protected]> * bring in openai errors Signed-off-by: Gavrish Prabhu <[email protected]> * fix linting Signed-off-by: Gavrish Prabhu <[email protected]> * Remove openai import and update openai types codegen cmd Signed-off-by: Gavrish Prabhu <[email protected]> * remove unwanted import Signed-off-by: Gavrish Prabhu <[email protected]> * fix tests after conflict Signed-off-by: Gavrish Prabhu <[email protected]> * fix poetry lock Signed-off-by: Gavrish Prabhu <[email protected]> * fix poetry lock Signed-off-by: Gavrish Prabhu <[email protected]> * remove openai from extras Signed-off-by: Gavrish Prabhu <[email protected]> * fix logprobs Signed-off-by: Gavrish Prabhu <[email protected]> * send json response content of type openai error Signed-off-by: Gavrish Prabhu <[email protected]> --------- Signed-off-by: Gavrish Prabhu <[email protected]>
Signed-off-by: grandbora <[email protected]> Co-authored-by: Bora Tunca <[email protected]>
Fix model server stop method Signed-off-by: Sivanantham Chinnaiyan <[email protected]>
* Provide full Signed-off-by: Yuan Tang <[email protected]> * Move to cmd directory Signed-off-by: Yuan Tang <[email protected]> * Add helm charts Signed-off-by: Yuan Tang <[email protected]> * regen Signed-off-by: Yuan Tang <[email protected]> * fix conflict Signed-off-by: Yuan Tang <[email protected]> * rebase Signed-off-by: Yuan Tang <[email protected]> * remove unused Signed-off-by: Yuan Tang <[email protected]> * remove redundant files Signed-off-by: Yuan Tang <[email protected]> * Empty-Commit Signed-off-by: Yuan Tang <[email protected]> * Rename file Signed-off-by: Yuan Tang <[email protected]> --------- Signed-off-by: Yuan Tang <[email protected]>
Signed-off-by: Yuan Tang <[email protected]>
* go lint fix Signed-off-by: Andrews Arokiam <[email protected]> * commit for golangci Signed-off-by: Andrews Arokiam <[email protected]> * rewrite if-else to switch statement Signed-off-by: Andrews Arokiam <[email protected]> * fix for the response body Signed-off-by: Andrews Arokiam <[email protected]> --------- Signed-off-by: Andrews Arokiam <[email protected]>
Signed-off-by: Sivanantham Chinnaiyan <[email protected]>
Ignore protected namespaces. Don't set json_loads Signed-off-by: Curtis Maddalozzo <[email protected]>
* build: Fix CRD copying in generate-install.sh Signed-off-by: Yuan Tang <[email protected]> * Empty-Commit Signed-off-by: Yuan Tang <[email protected]> --------- Signed-off-by: Yuan Tang <[email protected]>
Signed-off-by: Yuan Tang <[email protected]> Co-authored-by: Sivanantham <[email protected]>
Remove replace for golang.org/x/net and fix CVE-2023-45288 for qpext Signed-off-by: Sivanantham Chinnaiyan <[email protected]>
test re run Signed-off-by: Andrews Arokiam <[email protected]>
* OpenAI data models and endpoints from vLLM Signed-off-by: Tessa Pham <[email protected]> more components for OpenAI endpoints Signed-off-by: Tessa Pham <[email protected]> add OpenAI endpoints to router Signed-off-by: Tessa Pham <[email protected]> modify generate() in data plane Signed-off-by: Tessa Pham <[email protected]> class OpenAIModel Signed-off-by: Tessa Pham <[email protected]> delete and rename files Signed-off-by: Tessa Pham <[email protected]> add create_chat_completion() to OpenAIModel Signed-off-by: Tessa Pham <[email protected]> update routers and lint Signed-off-by: Tessa Pham <[email protected]> * Implement streaming Signed-off-by: Curtis Maddalozzo <[email protected]> * Register OpenAI endpoints when appropriate Signed-off-by: Curtis Maddalozzo <[email protected]> * Remove completion types from dataplane methods Signed-off-by: Curtis Maddalozzo <[email protected]> * Add OpenAI endpoint support to huggingfaceserver Signed-off-by: Curtis Maddalozzo <[email protected]> * Allow accessing headers and response in completion methods Signed-off-by: Curtis Maddalozzo <[email protected]> * Create separate model for completion and chat completion requests Signed-off-by: Curtis Maddalozzo <[email protected]> * Add stop function for handling model shutdown Signed-off-by: Curtis Maddalozzo <[email protected]> * Add arg for remote code param Signed-off-by: Curtis Maddalozzo <[email protected]> * Add option to allow selecting model backend Signed-off-by: Curtis Maddalozzo <[email protected]> * Pin ray to 2.10.x Signed-off-by: Curtis Maddalozzo <[email protected]> * Use correct type in tests Signed-off-by: Curtis Maddalozzo <[email protected]> * Refactor encoder-decoder and decoder only models into separate classes. Fix tests. Signed-off-by: Curtis Maddalozzo <[email protected]> * Add more test cases. Factor models out into fixtures. Pass loop as argument to the background request handler. 
Signed-off-by: Curtis Maddalozzo <[email protected]> * Remove unnecessary None check Signed-off-by: Curtis Maddalozzo <[email protected]> * Properly handle unsupported models. Don't try to load table question answering models as they are not supported. Signed-off-by: Curtis Maddalozzo <[email protected]> * Remove models we don't support Signed-off-by: Curtis Maddalozzo <[email protected]> * Pass in predictor config Signed-off-by: Curtis Maddalozzo <[email protected]> * Fix test assertion. Remove debug lines Signed-off-by: Curtis Maddalozzo <[email protected]> --------- Signed-off-by: Tessa Pham <[email protected]> Signed-off-by: Curtis Maddalozzo <[email protected]> Co-authored-by: Tessa Pham <[email protected]>
Signed-off-by: Spolti <[email protected]>
Signed-off-by: Yuan Tang <[email protected]>
* build: Remove misleading logs from minimal-crdgen.sh Signed-off-by: Yuan Tang <[email protected]> * Add file Signed-off-by: Yuan Tang <[email protected]> --------- Signed-off-by: Yuan Tang <[email protected]>
* Fix v2 predict for hf Signed-off-by: Sivanantham Chinnaiyan <[email protected]> * Add e2e test for hf Signed-off-by: Sivanantham Chinnaiyan <[email protected]> * Fix post processing and e2e image build Signed-off-by: Sivanantham Chinnaiyan <[email protected]> * Increase memory limit Signed-off-by: Sivanantham Chinnaiyan <[email protected]> * Fix output for v2 Signed-off-by: Sivanantham Chinnaiyan <[email protected]> * Add more tests Signed-off-by: Sivanantham Chinnaiyan <[email protected]> * Reduce parallelism Signed-off-by: Sivanantham Chinnaiyan <[email protected]> * Use backend argument Signed-off-by: Dan Sun <[email protected]> * Update to use chat completion endpoint Signed-off-by: Dan Sun <[email protected]> * Fix openai tests Signed-off-by: Dan Sun <[email protected]> --------- Signed-off-by: Sivanantham Chinnaiyan <[email protected]> Signed-off-by: Dan Sun <[email protected]> Co-authored-by: Dan Sun <[email protected]>
…erver (#3594) * set SAFETENSORS_FAST_GPU and HF_HUB_DISABLE_TELEMETRY Signed-off-by: Lize Cai <[email protected]> * add doc on the default value Signed-off-by: Lize Cai <[email protected]> --------- Signed-off-by: Lize Cai <[email protected]>
Signed-off-by: Curtis Maddalozzo <[email protected]>
… backend (#3657) * Assign device of input tensors Signed-off-by: sailgpu <[email protected]> * lint fix Signed-off-by: sailgpu <[email protected]> --------- Signed-off-by: sailgpu <[email protected]>
* Test image builds for ARM64 arch in CI Signed-off-by: Sivanantham Chinnaiyan <[email protected]> * Update lockfiles Signed-off-by: Sivanantham Chinnaiyan <[email protected]> * Add ARM64 support for paddle Signed-off-by: Sivanantham Chinnaiyan <[email protected]> --------- Signed-off-by: Sivanantham Chinnaiyan <[email protected]>
* Encoder-decoder models do not include input tokens in their output Signed-off-by: Curtis Maddalozzo <[email protected]> * Pass stopping criteria into streamer Signed-off-by: Curtis Maddalozzo <[email protected]> --------- Signed-off-by: Curtis Maddalozzo <[email protected]>
* Added the field AdditionalIngressDomains into the struct IngressConfig Signed-off-by: Vincent Hou <[email protected]> * Added the additional ingress domains into the hosts Signed-off-by: Vincent Hou <[email protected]> * Fixed the indentation Signed-off-by: Vincent Hou <[email protected]> * Added isvc name and namespace into the domain name * Added the validation for the URLs Signed-off-by: Vincent Hou <[email protected]> * Validate the domain in the additionalIngressDomains Signed-off-by: Vincent Hou <[email protected]> * Create the hosts from the list of additionalIngressDomains Signed-off-by: Vincent Hou <[email protected]> * Change the way to validate the host Signed-off-by: Vincent Hou <[email protected]> * Change the validation error message Signed-off-by: Vincent Hou <[email protected]> * Revert the name to url Signed-off-by: Vincent Hou <[email protected]> * Get all the available domain list Signed-off-by: Vincent Hou <[email protected]> * gofmt -s -w the file Signed-off-by: Vincent Hou <[email protected]> * Add additionalIngressDomains into the charts Signed-off-by: Vincent Hou <[email protected]> * Added the comments and refactor the tests Signed-off-by: Vincent Hou <[email protected]> * Regenerate the manifests Signed-off-by: Vincent Hou <[email protected]> * Modify createHTTPMatchRequest, the charts and the test cases Signed-off-by: Vincent Hou <[email protected]> * Run make generate Signed-off-by: Vincent Hou <[email protected]> --------- Signed-off-by: Vincent Hou <[email protected]>
Signed-off-by: Curtis Maddalozzo <[email protected]>
upgrade vllm version Signed-off-by: Johnu George <[email protected]>
Signed-off-by: Curtis Maddalozzo <[email protected]>
…Fixes #3452 (#3603) * feat: Support customizable deployment strategy for RawDeployment mode Signed-off-by: Yuan Tang <[email protected]> * regen Signed-off-by: Yuan Tang <[email protected]> * lint Signed-off-by: Yuan Tang <[email protected]> * Correctly apply rollingupdate Signed-off-by: Yuan Tang <[email protected]> * address comments Signed-off-by: Yuan Tang <[email protected]> * Add validation Signed-off-by: Yuan Tang <[email protected]> --------- Signed-off-by: Yuan Tang <[email protected]>
* Enable dtype for huggingface server Signed-off-by: Dattu Sharma <[email protected]> * Set float16 as default. Fixup linter Signed-off-by: Dattu Sharma <[email protected]> * Add small comment to make the changes understandable Signed-off-by: Dattu Sharma <[email protected]> * Fixup linter Signed-off-by: Dattu Sharma <[email protected]> * Adapt to new huggingfacemodel Signed-off-by: Dattu Sharma <[email protected]> * Fixup merge :) Signed-off-by: Dattu Sharma <[email protected]> * Explicitly mention the behaviour of dtype flag on auto. Signed-off-by: Dattu Sharma <[email protected]> * Default to FP32 for encoder models Signed-off-by: Dattu Sharma <[email protected]> * Selectively add --dtype to parser. Use FP16 for GPU and FP32 for CPU Signed-off-by: Dattu Sharma <[email protected]> * Fixup linter Signed-off-by: Dattu Sharma <[email protected]> * Update poetry Signed-off-by: Dattu Sharma <[email protected]> * Use torch.float32 for tests explicitly Signed-off-by: Dattu Sharma <[email protected]> --------- Signed-off-by: Dattu Sharma <[email protected]>
Signed-off-by: Curtis Maddalozzo <[email protected]>
* fix for extract zip from gcs Signed-off-by: Andrews Arokiam <[email protected]> * initial commit for gcs model download unittests Signed-off-by: Andrews Arokiam <[email protected]> * unittests for model download from gcs Signed-off-by: Andrews Arokiam <[email protected]> * black format fix Signed-off-by: Andrews Arokiam <[email protected]> * code verification Signed-off-by: Andrews Arokiam <[email protected]> --------- Signed-off-by: Andrews Arokiam <[email protected]>
Signed-off-by: Gavrish Prabhu <[email protected]>
* update wording for huggingface README small update to make readme easier to understand Signed-off-by: Alexa Griffith <[email protected]> * Update README.md Signed-off-by: Alexa Griffith [email protected] * Update python/huggingfaceserver/README.md Co-authored-by: Filippe Spolti <[email protected]> Signed-off-by: Alexa Griffith <[email protected]> * update vllm Signed-off-by: alexagriffith <[email protected]> * Update README.md --------- Signed-off-by: Alexa Griffith <[email protected]> Signed-off-by: Alexa Griffith [email protected] Signed-off-by: alexagriffith <[email protected]> Signed-off-by: Dan Sun <[email protected]> Co-authored-by: Filippe Spolti <[email protected]> Co-authored-by: Dan Sun <[email protected]>
* fix: HPA equality check should include annotations Signed-off-by: Yuan Tang <[email protected]> * Only watch related autoscalerclass annotation Signed-off-by: Yuan Tang <[email protected]> * simplify Signed-off-by: Yuan Tang <[email protected]> * Add missing delete action Signed-off-by: Yuan Tang <[email protected]> * fix logic Signed-off-by: Yuan Tang <[email protected]> --------- Signed-off-by: Yuan Tang <[email protected]>
fix huggingface runtime in chart Signed-off-by: Dan Sun <[email protected]>
* fix huggingface runtime in chart Signed-off-by: Dan Sun <[email protected]> * Allow model_dir to be specified on template Signed-off-by: Dan Sun <[email protected]> * Default model_dir to /mnt/models for HF Signed-off-by: Dan Sun <[email protected]> * Lint format Signed-off-by: Dan Sun <[email protected]> --------- Signed-off-by: Dan Sun <[email protected]>
* Fix: vLLM Model Supported check throwing circular dependency Signed-off-by: Gavrish Prabhu <[email protected]> * remove unwanted comments Signed-off-by: Gavrish Prabhu <[email protected]> * remove unwanted comments Signed-off-by: Gavrish Prabhu <[email protected]> * fix return case Signed-off-by: Gavrish Prabhu <[email protected]> * fix to check all arch in model config for vllm support Signed-off-by: Gavrish Prabhu <[email protected]> * fix lint Signed-off-by: Gavrish Prabhu <[email protected]> --------- Signed-off-by: Gavrish Prabhu <[email protected]>
Fix: allow null in Finish reason Signed-off-by: Gavrish Prabhu <[email protected]>
PR needs rebase. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.
Hi @pull[bot]. Thanks for your PR. I'm waiting for an opendatahub-io member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test. Once the patch is verified, the new status will be reflected by the ok-to-test label. I understand the commands that are listed here. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.
[APPROVALNOTIFIER] This PR is NOT APPROVED. This pull-request has been approved by: pull[bot]. The full list of commands accepted by this bot can be found here.
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing /approve in a comment.
929471b into opendatahub-io:master
…storage-initializer-211 Red Hat Konflux update kserve-storage-initializer-211
See Commits and Changes for more details.
Created by pull[bot]