File processing application design
The file processing application is designed to be extensible. It can be extended with native actions, which rely on native packages managed by a language-specific package manager (Cargo for Rust, NPM for JavaScript, Pip for Python...). These packages can be local or hosted on the package manager's registry. The application can also be extended with docker actions, which are based on Docker containers. In the same way, the Docker image can be local or hosted on a Docker registry.
fpa/
|__ bin/
|__ src/
| |__ commands/
| | |__ plugin/
| | | |__ add.js
| | |
| | |__ consume.js
| | |__ install.js
| |
| |__ actioner.js
| |__ index.js
| |__ ta-util.js
|
|__ tests/
|__ tmp/
|__ actions.json
|__ .env
This folder contains the executable to use the command line interface
The sources of the application
The entrypoint of the command line application
A set of functions reused across the code base of the application
- The actioner takes as an argument the payload of the task given by the consumer.
- The actioner then creates a temporary work folder inside tmp/, named tmp/xxx/, and two subfolders: tmp/xxx/input/ and tmp/xxx/output/.
- The actioner then gets the file(s) listed in the payload from the file storage service and puts them inside tmp/xxx/input/.
- The actioner then uses the action-list.conf file to get the type of the action (native/docker).
- If the action is a native action:
- Then the actioner imports the corresponding library and runs it, passing it the location of the temporary work folder tmp/xxx/ and the arguments of the action.
- If the action is a docker action:
- Then the actioner launches the container named after the action.
- The tmp/xxx/ folder is mounted as a volume with the binding:
/path/to/tmp/xxx:/app/files/
- The arguments of the command used by the launched container are the arguments of the action.
- For both types of action:
- The action writes the resulting files inside the folder tmp/xxx/output, which is bound to /app/files/output inside a container.
- The action reads the tmp/xxx/results.json file (bound to /app/files/results.json for a docker action) to know the naming convention of the output files enforced by the task payload. Sometimes no file naming is enforced inside the task payload, and it is not possible to know in advance how many files result from the action. In that case, the actioner names the resulting files randomly.
- The action writes the metadata resulting from the action into the tmp/xxx/output/metadata.json file (bound to /app/files/output/metadata.json for a docker action).
- The action writes a live report of the process into the tmp/xxx/output/status.json file (bound to /app/files/output/status.json for a docker action).
- The actioner reads callbacks from the action.
- The actioner gives feedback to the MMF API.
- Then the actioner pushes the resulting files to the file storage service.
- Then the actioner gives feedback to the MMF API concerning the location of these files on the file storage service and the metadata created by the action.
- Then the actioner cleans the temporary work folder.
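The workspace steps above (create tmp/xxx/ with its input/ and output/ subfolders, then clean it up at the end) can be sketched as follows. This is a minimal sketch, not the application's actual code: the function names `createWorkspace` and `cleanWorkspace`, the `tmpRoot` parameter and the `task-` prefix are hypothetical.

```javascript
const fs = require('fs');
const path = require('path');

// Create a unique work folder tmp/xxx/ with input/ and output/ subfolders.
// `tmpRoot` is a hypothetical parameter pointing at the application's tmp/ folder.
function createWorkspace(tmpRoot) {
  fs.mkdirSync(tmpRoot, { recursive: true });
  // mkdtempSync appends six random characters to the given prefix.
  const workDir = fs.mkdtempSync(path.join(tmpRoot, 'task-'));
  const inputDir = path.join(workDir, 'input');
  const outputDir = path.join(workDir, 'output');
  fs.mkdirSync(inputDir);
  fs.mkdirSync(outputDir);
  return { workDir, inputDir, outputDir };
}

// Remove the whole work folder once the results have been pushed.
function cleanWorkspace(workspace) {
  fs.rmSync(workspace.workDir, { recursive: true, force: true });
}
```

Using a randomly suffixed folder per task keeps concurrent tasks from writing into each other's input/ and output/ folders.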
The folder containing the available commands of the command line interface
The folder contains the commands used for plugin management
This is the part of the code used to add a plugin to the task-actioner.
This is the code for the command that creates an AMQP consumer, which gets a task from a queue and transfers the payload of the task to the actioner. The payload of the task is a JSON document containing the name of the action, the files to be processed and a list of arguments for the action.
This is the code of the command that installs plugins based on the actions.json file.
The tests folder
This file contains the URLs of the file storage service and MMF platform APIs, the secrets associated with these APIs, and the uid, gid and name of the user running actions inside Docker containers.
MMF_API_BASE_URL=
MMF_API_SECRET_KEY=
FILE_STORAGE_HOST=
FILE_STORAGE_PORT=
FILE_STORAGE_USE_SSL=
FILE_STORAGE_SECRET_KEY=
FILE_STORAGE_ACCESS_KEY=
RABBITMQ_HOST=
RABBITMQ_PORT=
RABBITMQ_USER=
RABBITMQ_PASSWORD=
UID=
GID=
UNAME=
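As an illustration of how these variables might be consumed, the sketch below checks that each of them is set and groups them into a configuration object. The function name `loadConfig`, the shape of the returned object, and the assumption that FILE_STORAGE_USE_SSL holds the string `true` or `false` are all hypothetical; the source only lists the variable names.

```javascript
// Variable names taken from the .env file above.
const REQUIRED_VARIABLES = [
  'MMF_API_BASE_URL', 'MMF_API_SECRET_KEY',
  'FILE_STORAGE_HOST', 'FILE_STORAGE_PORT', 'FILE_STORAGE_USE_SSL',
  'FILE_STORAGE_SECRET_KEY', 'FILE_STORAGE_ACCESS_KEY',
  'RABBITMQ_HOST', 'RABBITMQ_PORT', 'RABBITMQ_USER', 'RABBITMQ_PASSWORD',
  'UID', 'GID', 'UNAME',
];

// Build a configuration object from an environment-like object
// (for example process.env once the .env file has been loaded).
function loadConfig(env) {
  const missing = REQUIRED_VARIABLES.filter((name) => !env[name]);
  if (missing.length > 0) {
    throw new Error(`Missing environment variables: ${missing.join(', ')}`);
  }
  return {
    mmf: { baseUrl: env.MMF_API_BASE_URL, secretKey: env.MMF_API_SECRET_KEY },
    fileStorage: {
      host: env.FILE_STORAGE_HOST,
      port: Number(env.FILE_STORAGE_PORT),
      // Assumption: the flag is written as the string "true" or "false".
      useSSL: env.FILE_STORAGE_USE_SSL === 'true',
      accessKey: env.FILE_STORAGE_ACCESS_KEY,
      secretKey: env.FILE_STORAGE_SECRET_KEY,
    },
    rabbitmq: {
      host: env.RABBITMQ_HOST,
      port: Number(env.RABBITMQ_PORT),
      user: env.RABBITMQ_USER,
      password: env.RABBITMQ_PASSWORD,
    },
    dockerUser: { uid: Number(env.UID), gid: Number(env.GID), name: env.UNAME },
  };
}
```

Failing early on a missing variable avoids starting a consumer that can pull tasks but cannot reach the file storage service or the MMF API.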
An action is either a docker action or a native action.
It contains the library of the action, written in the native language of the task actioner. It must be a Node module whose main function is exported as run(), which takes as arguments the arguments of the action contained in the task payload and the workspace of the action.
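A native action following this contract could look like the sketch below. It is a hypothetical example: the argument order `run(args, workspace)`, the assumption that `workspace` is the path of the tmp/xxx/ folder, and the layout of metadata.json are not specified by the source.

```javascript
const fs = require('fs');
const path = require('path');

// Hypothetical native action: copies every input file to output/ unchanged.
// Assumed signature: run(args, workspace), with workspace = path of tmp/xxx/.
function run(args, workspace) {
  const inputDir = path.join(workspace, 'input');
  const outputDir = path.join(workspace, 'output');
  const copied = [];
  for (const name of fs.readdirSync(inputDir)) {
    fs.copyFileSync(path.join(inputDir, name), path.join(outputDir, name));
    copied.push(name);
  }
  // The metadata layout is an assumption; the source does not define it.
  fs.writeFileSync(
    path.join(outputDir, 'metadata.json'),
    JSON.stringify({ args, copied })
  );
  return copied;
}

module.exports = { run };
```

The actioner would then load the module by name and call its `run()` export with the task arguments and the work folder.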
It contains:
- a README
- a Dockerfile
The README describes the different files the action can output and how to access them following the output files convention.
The dockerfile of the docker image
It must respect this template:
FROM <parent_image>
ARG UNAME=worker
ARG UID=1000
ARG GID=1000
# Keep only one of the two RUN lines below, depending on the parent image.
# For a classic (non-Alpine) parent image:
RUN groupadd --gid $GID $UNAME && useradd --gid $GID --uid $UID $UNAME
# For an Alpine Linux parent image:
RUN addgroup -g $GID -S $UNAME && adduser -u $UID -S $UNAME -G $UNAME
WORKDIR /app
# Do everything you need to be done
# RUN something
# Copy something
# ...
RUN chown -R $UNAME:$UNAME /app
USER $UNAME
ENTRYPOINT ["my_entrypoint"]
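Because the template ends with an ENTRYPOINT, the arguments placed after the image name in `docker run` are handed to that entrypoint. The sketch below shows how the actioner might assemble such a command line, using the volume binding described earlier. The function name `buildDockerRunArgs`, the `--rm` flag and the assumption that the image is tagged with the action name are illustrative, not taken from the source.

```javascript
const path = require('path');

// Build the argument list for `docker run` for a docker action.
// Assumptions: the image is tagged with the action name, and the container
// is removed after it exits (--rm).
function buildDockerRunArgs(task, workDir) {
  return [
    'run',
    '--rm',
    // Mount the work folder tmp/xxx/ at /app/files/ inside the container.
    '-v', `${path.resolve(workDir)}:/app/files/`,
    task.action,
    // Everything after the image name is passed to the ENTRYPOINT.
    ...task.args,
  ];
}
```

The resulting array can be passed to `child_process.spawn('docker', args)`, which avoids shell quoting issues with the action arguments.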
A JSON string
{
"id": 12315,
"action": "action-a",
"inputFiles": [
{
"bucketName": "my bucket",
"objectName": "my object"
},
{
"bucketName": "my other or same bucket",
"objectName": "my other object"
},
...
],
"gpu": true,
"args": [
"arg1",
"arg2",
...
],
"outputFiles": {
"key1": {
"location": "my/location",
"name": "my_name"
},
"key2": {
"location": "my/location",
"name": "my_name"
},
...
},
"s3Location": {
"bucketName": "my bucket",
"keyPrefix": "my prefix"
}
}
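Before handing a payload like the one above to the actioner, the consumer could check its basic shape. The following is a minimal sketch under stated assumptions: the function name `parseTaskPayload` is hypothetical, and treating `action` and `inputFiles` as mandatory while defaulting `args` and `outputFiles` is a guess based on the description of the payload.

```javascript
// Parse a task payload received as a JSON string and check its basic shape.
function parseTaskPayload(raw) {
  const task = JSON.parse(raw);
  if (typeof task.action !== 'string' || task.action === '') {
    throw new Error('Task payload: "action" must be a non-empty string');
  }
  if (!Array.isArray(task.inputFiles)) {
    throw new Error('Task payload: "inputFiles" must be an array');
  }
  for (const file of task.inputFiles) {
    if (typeof file.bucketName !== 'string' || typeof file.objectName !== 'string') {
      throw new Error('Task payload: each input file needs "bucketName" and "objectName"');
    }
  }
  return {
    ...task,
    // Assumption: a missing argument list means "no arguments".
    args: Array.isArray(task.args) ? task.args : [],
    // The payload may not enforce any output naming.
    outputFiles: task.outputFiles || {},
  };
}
```

Rejecting a malformed payload at this point keeps the actioner from creating a work folder for a task it cannot run.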
{
"key1": {
"location": "my_location",
"name": "my_name"
},
"key2": {
"location": "my/location",
"name": "my_name"
},
...
}
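Assuming the object above is the content of results.json, the naming rule described earlier (use the enforced name when there is one, otherwise a random name) can be sketched like this. The function name `resolveOutputPath`, the way `location` and `name` are joined, and the 16-character hexadecimal random name are assumptions.

```javascript
const crypto = require('crypto');
const path = require('path');

// Return the path of an output file for the given key.
// `results` is the parsed content of results.json (possibly an empty object).
function resolveOutputPath(results, key, extension = '') {
  const entry = results[key];
  if (entry) {
    // Assumption: the final path is "<location>/<name>".
    return path.posix.join(entry.location, entry.name);
  }
  // No naming enforced for this file: fall back to a random name.
  return crypto.randomBytes(8).toString('hex') + extension;
}
```

For example, `resolveOutputPath({ key1: { location: 'my/location', name: 'my_name' } }, 'key1')` returns `my/location/my_name`, while an unknown key yields a random name.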
This file is used to give a live short report of the process.
{
"step_name":{
"status":"done" or "in progress",
"progress": percentage representing the completion of the step
},
"step_name":{
"status":"done" or "in progress",
"progress": percentage representing the completion of the step
},
...
}
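An action could maintain this file with a small helper such as the one below. It is only a sketch: the function name `updateStatus` is hypothetical, and it assumes that `progress` is a number between 0 and 100 and that a step is "done" once it reaches 100.

```javascript
const fs = require('fs');
const path = require('path');

// Record the progress of one step in <outputDir>/status.json.
// Assumption: progress is a percentage between 0 and 100.
function updateStatus(outputDir, stepName, progress) {
  const statusPath = path.join(outputDir, 'status.json');
  // Keep the steps already reported, if the file exists.
  const status = fs.existsSync(statusPath)
    ? JSON.parse(fs.readFileSync(statusPath, 'utf8'))
    : {};
  status[stepName] = {
    status: progress >= 100 ? 'done' : 'in progress',
    progress,
  };
  fs.writeFileSync(statusPath, JSON.stringify(status, null, 2));
  return status;
}
```

Each call rewrites the whole file, so a reader such as the actioner always sees the latest state of every step.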