Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[RFC] Add a new (PainlessScriptTool) as a generic tool #375

Open
Hailong-am opened this issue Jul 29, 2024 · 9 comments
Open

[RFC] Add a new (PainlessScriptTool) as a generic tool #375

Hailong-am opened this issue Jul 29, 2024 · 9 comments
Labels
enhancement New feature or request

Comments

@Hailong-am
Copy link
Contributor

Hailong-am commented Jul 29, 2024

Is your feature request related to a problem?

Nowadays, a single tool can typically produce only output of string type, which could be in various formats such as plain text, JSON, XML, and so on. With agent frameworks like Flow Agent, we can create pipelines that leverage the output of previous tools. However, this approach has some limitations.

For example, if tool A produces a JSON output like this:

{"key1":"value1","key2":"value2"}

and tool B wants to use only the key1 value, then tool B needs to have logic to parse the JSON object. This is not a generic solution, as each tool would need to have the necessary logic to adapt to the output format of the previous tool. Furthermore, the output format of the previous tool might change or be in a different format altogether, making it impractical to anticipate and handle all possible cases within the tools themselves.

To address these limitations, we need a generic tool that can handle string parsing and parameter processing, acting as a glue between the different tools in the pipeline. The PainlessScript tool, which leverages the built-in Painless scripting capabilities of OpenSearch, provides a solution. This tool allows you to pass scripts into it, making it a generic and adaptable component that can handle various input and output formats across the pipeline.

Here is the POC implementation

    @Override
    public <T> void run(Map<String, String> parameters, ActionListener<T> listener) {
       
        Script script = new Script(ScriptType.INLINE, "painless", scriptCode, Collections.emptyMap());
        // flatten parameters from previous tool
        Map<String, Object> flattenedParameters = flattenMap(parameters);
        // execute Painless script
        TemplateScript templateScript = scriptService.compile(script, TemplateScript.CONTEXT).newInstance(flattenedParameters);
        try {
            String result = templateScript.execute();
            listener.onResponse(result == null ? (T) "" : (T) result);
        } catch (Exception e) {
            listener.onFailure(e);
        }
    }
@Hailong-am Hailong-am added enhancement New feature or request untriaged labels Jul 29, 2024
@zane-neo
Copy link
Collaborator

This is a valid problem but I have concern on using painless script. In ml-commons when remote inference initially supported, the process functions are in painless format, e.g. https://github.com/opensearch-project/ml-commons/blob/main/docs/remote_inference_blueprints/bedrock_connector_titan_multimodal_embedding_blueprint.md#2-create-connector-for-amazon-bedrock. Painless script has flexibility though it's difficult to read and maintain. So in later releases, more and more build-in process functions are added, e.g. connector.pre_process.cohere.embedding. I would suggest to write different build-in parsers for different format, e.g. jsonpath for json format.

@Hailong-am
Copy link
Contributor Author

Painless script has flexibility though it's difficult to read and maintain. So in later releases, more and more build-in process functions are added, e.g. connector.pre_process.cohere.embedding. I would suggest to write different build-in parsers for different format, e.g. jsonpath for json format.

Agree on this. As Painless script don't support json or xml/yml parsing, these parsing logic will in the tool itself and we can definitely have some built in parsers.

This part will handled by flattenMap, it will have some built in parsers and Painless script will do string format and other pure java functions.

// flatten parameters from previous tool
Map<String, Object> flattenedParameters = flattenMap(parameters);

An example of json parse and flatten

// ppl tool output
return '{\\"executionResult\\":\\"result\\",\\"ppl\\":\\"source=demo| head 1\\"}'

// after parse and flatten, flattenedParameters will have 
// PPL.output.executionResult and PPL.output.ppl two entries.

@zane-neo
Copy link
Collaborator

zane-neo commented Aug 1, 2024

The tool's response will be parsed by build-in parsers first and then this painless script tool will be used as the final tool to build a final response to user? Do you have an example that painlessTool can be used as glue placed between two tools, the second tool consumes the results generated by the painlessTool?

It seems the tool's response is included in the parameter map with key {toolName}.output: https://github.com/opensearch-project/ml-commons/blob/main/ml-algorithms/src/main/java/org/opensearch/ml/engine/algorithms/agent/MLFlowAgentRunner.java#L109, I was thinking it's possible to enhance the tool's response parse logic to generate a formatted result and put them into the parameters map with example key {toolName}.output.formatted. This approach adds an extra benefit: If next tool needs to consume some key in the response of previous tool, it can extract the value from the formatted map easily.

@Hailong-am
Copy link
Contributor Author

Do you have an example that painlessTool can be used as glue placed between two tools, the second tool consumes the results generated by the painlessTool?

Here is one example for test purpose https://github.com/opensearch-project/skills/pull/380/files#diff-b36a3f988b4f9a022ca2be29cfa45a99ca309d0fc4cdf5a74245fef3968e6a7eR52-R79

@Hailong-am
Copy link
Contributor Author

I was thinking it's possible to enhance the tool's response parse logic to generate a formatted result and put them into the parameters map with example key {toolName}.output.formatted. This approach adds an extra benefit: If next tool needs to consume some key in the response of previous tool, it can extract the value from the formatted map easily.

the issue here is {toolName}.output.formatted need to format as string first and then parse to next tool

  String key = getToolName(previousToolSpec);
  String outputKey = key + ".output";

  String outputResponse = parseResponse(output);
  params.put(outputKey, escapeJson(outputResponse));

https://github.com/opensearch-project/ml-commons/blob/a436a946b439ab5a4644e575112911df5a6d26f6/ml-algorithms/src/main/java/org/opensearch/ml/engine/algorithms/agent/MLFlowAgentRunner.java#L108-L112

@zane-neo
Copy link
Collaborator

zane-neo commented Aug 5, 2024

Do you have an example that painlessTool can be used as glue placed between two tools, the second tool consumes the results generated by the painlessTool?

Here is one example for test purpose https://github.com/opensearch-project/skills/pull/380/files#diff-b36a3f988b4f9a022ca2be29cfa45a99ca309d0fc4cdf5a74245fef3968e6a7eR52-R79

It seems the painlessTool is the last tool in the agent instead of a glue to combine two different tools.

@zane-neo
Copy link
Collaborator

zane-neo commented Aug 5, 2024

I was thinking it's possible to enhance the tool's response parse logic to generate a formatted result and put them into the parameters map with example key {toolName}.output.formatted. This approach adds an extra benefit: If next tool needs to consume some key in the response of previous tool, it can extract the value from the formatted map easily.

the issue here is {toolName}.output.formatted need to format as string first and then parse to next tool

  String key = getToolName(previousToolSpec);
  String outputKey = key + ".output";

  String outputResponse = parseResponse(output);
  params.put(outputKey, escapeJson(outputResponse));

https://github.com/opensearch-project/ml-commons/blob/a436a946b439ab5a4644e575112911df5a6d26f6/ml-algorithms/src/main/java/org/opensearch/ml/engine/algorithms/agent/MLFlowAgentRunner.java#L108-L112

I was thinking we can change the behavior in ml-commons, but it's fine if you think this is a different issue than current proposal to solve. But maybe we should consider where is a better place to put the parsers, skills or ml-commons.

@Hailong-am
Copy link
Contributor Author

But maybe we should consider where is a better place to put the parsers, skills or ml-commons.

That could be options too, changes in ml-commons need to consider both flow/conversational agents

@dblock dblock removed the untriaged label Aug 19, 2024
@dblock
Copy link
Member

dblock commented Aug 19, 2024

Catch All Triage - 1, 2, 3

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

No branches or pull requests

3 participants