[RFC] Dynamic tool support in skills #459

Open
zane-neo opened this issue Nov 5, 2024 · 2 comments
Labels
enhancement New feature or request

Comments

@zane-neo
Collaborator

zane-neo commented Nov 5, 2024

Dynamic tool support in skills

Problem statement

Tools are an important part of the ml-commons agent framework, but using them currently has several pain points for both users and developers. The tool implementation is not elegant: each tool copies the corresponding core or plugin code and applies certain transformations to make it suit the tool's purpose. This causes several problems:

  1. The tool has to keep tracking the source code it was copied from and adapt to every change, which takes a lot of maintenance effort. For example, when the _cat/indices API is changed to the _list/indices API ([Enhancement] Change CatIndexTool implementation from _cat/index action to _list/index action ml-commons#3182), the tool maintainer needs to copy the latest code.
  2. The tool needs to be versioned for different AOS versions, and users need to be aware of the version differences to ensure the tool runs as expected.
  3. A user can't use a tool that hasn't been implemented in skills or ml-commons, which is not a good user experience.

Proposed solution

If tools can be configured dynamically and executed based on that configuration, the pain points above can be eliminated. The high-level solution is as follows:

  1. A new tool named DynamicTool will be created. This tool is in charge of executing tools that are defined by configuration.
  2. In the /agents/_register API, the request body needs a small modification to support dynamic tools. The uri, request_body, and tool_steps keys in the parameters map are the critical keys that identify whether a tool is dynamic.
    • uri: the actual URI of the REST API the tool uses, e.g. CatIndexTool uses the _cat/indices API and MLModelTool uses _ml/{model_id}/_predict.
    • request_body: the request body of the corresponding REST API.
    • tool_steps: a tool can be a simple tool or a composite tool. For example, a RAGTool is a composite of two steps: the first retrieves context from the knowledge base, and the second invokes the LLM to generate a response from the context and the user question.
{
  "name": "dynamic tool for composite tool",
  "type": "conversational_flow",
  "description": "this is a test agent",
  "memory": {
    "type": "conversation_index"
  },
  "tools": [
    {
      "name": "text embedding RAG tool",
      "type": "RAGTool",
      "tool_steps": [
        {
          "type": "DynamicTool",
          "name": "text embedding model query tool",
          "parameters": {
            "uri": "/_ml/${parameters.textEmbeddingModelId}/_predict",
            "textEmbeddingModelId": "FSdp4ZIBKOcmWBSuKJGR",
            "request_body": "{\"query\": {\"nested\": {\"path\": \"${parameters.nestedPath:-null}\",\"score_mode\": \"${parameters.score_mode:-null}\",\"query\": {\"neural\": {\"embeddingField\": {\"query_text\": \"${parameters.query_text:-null}\",\"model_id\": \"${parameters.model_id:-null}\",\"k\": \"${parameters.k:-null}\"}}}}}}"
          }
        },
        {
          "type": "MLModelTool",
          "name": "LLM interaction tool",
          "parameters": {
            "uri": "/${parameters.index}/_search",
            "textEmbeddingModelId": "FSdp4ZIBKOcmWBSuKJGR",
            "request_body": "{\"parameters\": {\"prompt\": \"You're an political expert can answer any questions related to politics.\",\"question\": \"${parameters.question}\",\"max_token\": 10}}"
          }
        }
      ]
    }
  ],
  "app_type": "rag"
}
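The uri and request_body values above use placeholders of the form ${parameters.name} and ${parameters.name:-default}. The RFC does not spell out how they are resolved, so the following is only a minimal Python sketch under one assumption: the `:-` suffix works like a shell-style default that applies when the parameter is missing, and a missing parameter without a default is left untouched. The function name `resolve_placeholders` is hypothetical and not part of ml-commons.

```python
import re

# Matches ${parameters.name} and ${parameters.name:-default}.
_PLACEHOLDER = re.compile(
    r"\$\{parameters\.([A-Za-z_][A-Za-z0-9_]*)(?::-([^}]*))?\}"
)


def resolve_placeholders(template: str, parameters: dict) -> str:
    """Substitute placeholders in `template` with values from `parameters`.

    Assumption (not stated in the RFC): a missing parameter falls back to
    the text after ':-'; with no default the placeholder is left as-is.
    """

    def _replace(match):
        name, default = match.group(1), match.group(2)
        if name in parameters:
            return str(parameters[name])
        if default is not None:
            return default
        return match.group(0)

    return _PLACEHOLDER.sub(_replace, template)


# Values taken from the example request above; "k": 5 is illustrative.
params = {"textEmbeddingModelId": "FSdp4ZIBKOcmWBSuKJGR", "k": 5}
uri = resolve_placeholders(
    "/_ml/${parameters.textEmbeddingModelId}/_predict", params
)
body = resolve_placeholders(
    '{"k": "${parameters.k:-null}", "path": "${parameters.nestedPath:-null}"}',
    params,
)
print(uri)   # /_ml/FSdp4ZIBKOcmWBSuKJGR/_predict
print(body)  # {"k": "5", "path": "null"}
```

Resolving the templates as plain strings before parsing keeps the tool configuration independent of any particular API's body schema, at the cost of leaving JSON escaping of substituted values to the caller.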
  1. At agent runtime, the tools are created from the configuration, so a tool does not have to be implemented in the code base. The only constraint is that the user must ensure the uri exists in OpenSearch.
  2. During tool execution, a dummy RestRequest will be created and the corresponding TransportHandler will be selected to handle the request. With this approach, API-level operations such as AuthN/AuthZ are not needed, and the API response can be returned to the tool and the agent.
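The execution flow described above (build a dummy request from the configured uri, pick the matching handler, return its response) can be modelled roughly as follows. This is a toy Python model and not OpenSearch code: `DummyRequest`, `RouteRegistry`, and `run_steps` are hypothetical names, the handlers are stubs, and the index name and model ID in the usage are made up. A real implementation would live in Java and rely on OpenSearch's own request and handler classes.

```python
import re
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple


@dataclass
class DummyRequest:
    """Hypothetical stand-in for the dummy RestRequest built per tool call."""
    uri: str
    body: str
    path_params: Dict[str, str] = field(default_factory=dict)


Handler = Callable[[DummyRequest], dict]


class RouteRegistry:
    """Toy route table mapping URI templates like '/{index}/_search' to handlers."""

    def __init__(self) -> None:
        self._routes: List[Tuple[re.Pattern, Handler]] = []

    def register(self, template: str, handler: Handler) -> None:
        # '/{index}/_search' becomes '^/(?P<index>[^/]+)/_search$'.
        pattern = re.sub(r"\{(\w+)\}", r"(?P<\1>[^/]+)", template)
        self._routes.append((re.compile("^" + pattern + "$"), handler))

    def dispatch(self, uri: str, body: str) -> dict:
        # The first registered template that matches the uri wins.
        for pattern, handler in self._routes:
            match = pattern.match(uri)
            if match:
                return handler(DummyRequest(uri, body, match.groupdict()))
        raise LookupError(f"no handler registered for uri {uri!r}")


def run_steps(tool: dict, registry: RouteRegistry) -> List[dict]:
    """Run a simple tool as a single step, or a composite tool step by step."""
    steps = tool.get("tool_steps", [tool])
    return [
        registry.dispatch(s["parameters"]["uri"], s["parameters"]["request_body"])
        for s in steps
    ]


registry = RouteRegistry()
registry.register("/_ml/{model_id}/_predict",
                  lambda req: {"handler": "predict", **req.path_params})
registry.register("/{index}/_search",
                  lambda req: {"handler": "search", **req.path_params})

rag_tool = {
    "type": "RAGTool",
    "tool_steps": [
        {"parameters": {"uri": "/my-index/_search", "request_body": "{}"}},
        {"parameters": {"uri": "/_ml/abc123/_predict", "request_body": "{}"}},
    ],
}
results = run_steps(rag_tool, registry)
print(results)
```

A uri with no registered handler raises `LookupError` here, which mirrors the constraint that the user must ensure the uri exists in OpenSearch.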

Future plans

Phase 1

In the first phase, we will give users the capability to use dynamic tools. Mixing dynamic tools with existing tools will not be supported.

Phase 2

In the second phase, we will migrate the existing tools to dynamic tools and deprecate the old tools. Several built-in functions will be created for their input/output processing.

@yuye-aws
Member

It's really good to see you create this RFC. It gives users a tool that can call any API and saves developers much of the time needed to develop a new tool for each API.

@dblock dblock removed the untriaged label Nov 25, 2024
@dblock
Member

dblock commented Nov 25, 2024

[Catch All Triage - 1, 2, 3, 4, 5]


4 participants