A hydrus API plugin to download ExH archives
hex is a tool that connects to the hydrus client API and allows you to download and import ExH galleries via the archive download method (instead of scraping the images), alongside their tags and URL association.
A simple HTTP API allows you to send gallery URLs to hex that it will then proceed to download and import automatically. A userscript that adds a hex toolbar to ExH gallery pages is included for convenience.
The recommended way to run hex is via Docker. Basic instructions on how to run it without Docker are also provided.
To use hex with Docker, you can simply pull the prebuilt image from Docker Hub:
user@local:~$ docker pull mtbl/hex
Alternatively, you can build the image yourself. The user inside the container has UID `1000` and GID `1000` by default. You can adjust this (e.g., to match your host UID/GID) by providing the build arguments `USER_ID` and `GROUP_ID`.
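As a rough sketch of such a build, the following Python snippet assembles a `docker build` invocation that passes the host's UID and GID as the `USER_ID` and `GROUP_ID` build arguments. The image tag `hex-local` and the build context `.` are placeholders chosen for illustration, and `os.getuid()`/`os.getgid()` are only available on Unix-like hosts.

```python
import os
import shlex


def docker_build_command(user_id, group_id, tag="hex-local", context="."):
    """Return the argv list for building the image with a custom UID/GID.

    `tag` and `context` are illustrative defaults, not values taken from hex.
    """
    return [
        "docker", "build",
        "--build-arg", f"USER_ID={user_id}",
        "--build-arg", f"GROUP_ID={group_id}",
        "-t", tag,
        context,
    ]


if __name__ == "__main__":
    # Match the container user to the current host user (Unix-like hosts only).
    command = docker_build_command(os.getuid(), os.getgid())
    print(shlex.join(command))
```

The printed command can be reviewed and then run from the repository root, or the list can be passed to `subprocess.run` directly.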
To install without Docker, you can simply clone the repository and install dependencies.
user@local:~$ git clone https://github.com/imtbl/hex.git
user@local:~$ cd hex
user@local:hex$ yarn
- hydrus
- Docker (when using Docker)
- Node.js (when not using Docker)
- Yarn (when not using Docker)
- A Puppeteer-compatible browser that hex uses for navigating and parsing ExH; browserless is recommended as a headless solution
hex should work with both the latest LTS and the latest stable version of Node.js. If you encounter any issues with either of those versions when not using Docker, please let me know.
This application follows semantic versioning, and any breaking changes that require additional attention will be released under a new major version (e.g., `2.0.0`). Minor version updates (e.g., `1.1.0` or `1.2.0`) are therefore always safe to simply install.
When necessary, this section will be expanded with upgrade guides for new major versions.
Simply pull the latest Docker image to update:
user@local:~$ docker pull mtbl/hex
If you chose not to use Docker, you can update via Git:
user@local:hex$ git pull
user@local:hex$ yarn
hex has a few specific requirements that you need to keep in mind, regardless of whether you run it with or without Docker:
- It requires a Puppeteer-compatible browser that is used for navigating and parsing ExH; browserless is recommended as a headless solution and used in the included Docker Compose example setup.
- hex needs access to the hydrus client API. The access key needs to have the following permissions: `import files`, `add tags to files` and `add urls for processing`.
- The import path needs to be accessible by both hex and hydrus, so you need to find a way to achieve this should you decide to run hex on a different machine. This can, for example, be achieved by using sshfs.
To make using Docker as easy as possible, a working Docker Compose example setup is provided. To get started with this example setup, duplicate `docker-compose.yml.example` as `docker-compose.yml` and adjust the variables in the `environment` section as described in the configuration section.
Pay special attention to the variable `HEX_DOCKER_HOST_IMPORT_PATH`. It is only required when using Docker and needs to contain the absolute path to the import directory on the host machine that is mounted as a volume. This is the path hydrus will import the files from when hex asks it to.
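To illustrate why both paths are needed, here is a minimal Python sketch of the translation involved: a file that hex sees below its own import path inside the container has to be handed to hydrus as a path below the host directory. The function name and the example paths are hypothetical, POSIX-style paths are assumed, and none of this is taken from hex's implementation.

```python
from pathlib import PurePosixPath


def to_host_path(container_file, container_import_path, host_import_path):
    """Map a file below the container import path to the matching host path."""
    # Raises ValueError if the file is not located below the import path.
    relative = PurePosixPath(container_file).relative_to(container_import_path)
    return str(PurePosixPath(host_import_path) / relative)


# Hypothetical example: /home/user/hex-import is mounted at /usr/src/app/import.
print(to_host_path(
    "/usr/src/app/import/gallery/001.jpg",
    "/usr/src/app/import",
    "/home/user/hex-import",
))
# prints /home/user/hex-import/gallery/001.jpg
```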
Finally, start the containers:
user@local:hex$ docker-compose up -d
To run without Docker, you will first need to duplicate `.env.example` as `.env` and adjust the variables as described in the configuration section.
After that, you can start hex:
user@local:hex$ yarn start
Configuration is done entirely via environment variables. Please pay special attention to the instructions to prevent issues.
- `HEX_PORT=8000`: the port hex is listening on.
- `HEX_ACCESS_KEY=`: an arbitrary string used as access key for hex's API. Can be of any (reasonable) length, the only requirement being that it is set at all.
- `HEX_BROWSER_WS_ENDPOINT=ws://localhost:3000`: the WebSocket endpoint of the Puppeteer-compatible browser hex needs to be able to connect to. No trailing slashes.
- `HEX_BROWSER_USER_AGENT=Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/88.0.4324.96 Safari/537.36`: the user agent to use with Puppeteer.
- `HEX_HYDRUS_BASE_URL=http://localhost:45869`: the hydrus client API base URL. No trailing slashes.
- `HEX_HYDRUS_ALLOW_NEWER_API_VERSION=false`: setting this to `true` allows hex to connect to newer hydrus client API versions than it officially supports. Enable this at your own risk to (possibly) be able to continue using hex if the hydrus API gets updated and a new hex release that reflects this is not out yet. Be aware that this can lead to hex not working at all or imports getting (partially) broken.
- `HEX_HYDRUS_ACCESS_KEY=`: the hydrus client API access key hex uses to connect. Needs to have the following permissions: `import files`, `add tags to files` and `add urls for processing`.
- `HEX_HYDRUS_TAG_SERVICE=my tags`: the hydrus tag service hex adds tags to.
- `HEX_SKIP_IMPORT=false`: setting this to `true` downloads the archive and extracts it, but skips the hydrus import altogether.
- `HEX_IMPORT_PATH=./import`: the path hex saves archive downloads at and extracts them in. Can be relative or absolute.
- `HEX_DOCKER_HOST_IMPORT_PATH=`: only required when using Docker and needs to contain the absolute path to the import directory on the host machine that is mounted as a volume. The path has to be in a format your host operating system supports.
- `HEX_SKIP_KNOWN_FILES=false`: setting this to `true` skips files hydrus knows about altogether (this will neither try to import them nor attempt to update their tags).
- `HEX_DELETE_ARCHIVES_AFTER_IMPORT=true`: setting this to `false` will cause the extracted archives stored under `HEX_IMPORT_PATH` not to be deleted once the hydrus import finishes.
- `HEX_SKIP_TAGS=false`: setting this to `true` will prevent any new tags from being added to imported files and disregards any other tag-related settings.
- `HEX_BLACKLISTED_NAMESPACES=`: namespaces that are added here separated with `###` will be excluded from getting added to hydrus. E.g., `artist###language###misc`. This only applies to tags sourced from ExH and the special `page:<page number>` tag that is added by default, not to tags added via `HEX_ADDITIONAL_TAGS`. In addition, if `HEX_NAMESPACE_REPLACEMENTS` is used and the replacement (but not the original) is a blacklisted namespace, it will still be added as well.
- `HEX_NAMESPACE_REPLACEMENTS='artist|||creator###parody|||series###female|||###male|||###group|||###misc|||'`: namespaces that are added here in the format `<original>|||<replacement>` and separated with `###` will be replaced accordingly. Leaving out the replacement altogether (e.g., `###misc|||`) unnamespaces them.
- `HEX_ADDITIONAL_TAGS=`: additional tags to be added. Have to be provided in the format `<namespace>:<tag>` or simply `<tag>` (for unnamespaced tags) and separated with `###`.
- `HEX_ADD_UNIQUE_IDENTIFIER_TAG=false`: setting this to `true` causes hex to add a tag in the form `<HEX_UNIQUE_IDENTIFIER_NAMESPACE>:<ExH gallery ID>-<page>`. This tag is intended to uniquely identify the position of an image inside an archive in case it is used across multiple ones (in which case it might have multiple/different `page` tags, making it hard to determine which one belongs to which archive).
- `HEX_UNIQUE_IDENTIFIER_NAMESPACE=unique`: the namespace to use for the unique identifier tag when `HEX_ADD_UNIQUE_IDENTIFIER_TAG` is `true`.
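The `###` and `|||` formats used by the tag-related variables can be easier to follow with a small example. The Python sketch below is one reading of the rules described for `HEX_BLACKLISTED_NAMESPACES`, `HEX_NAMESPACE_REPLACEMENTS` and `HEX_ADDITIONAL_TAGS`; the function names and sample tags are made up for illustration, and hex's actual implementation may differ in its details.

```python
def parse_list(value):
    """Split a `###`-separated setting into its non-empty items."""
    return [item for item in value.split("###") if item]


def parse_replacements(value):
    """Parse `<original>|||<replacement>` pairs into a dict."""
    replacements = {}
    for pair in parse_list(value):
        original, _, replacement = pair.partition("|||")
        replacements[original] = replacement  # "" means: drop the namespace
    return replacements


def transform_tags(source_tags, blacklisted, replacements, additional_tags):
    """Apply blacklist and replacements to sourced tags, then append extras."""
    result = []
    for tag in source_tags:
        namespace, separator, name = tag.partition(":")
        if not separator:
            result.append(tag)  # already unnamespaced
            continue
        if namespace in blacklisted:
            continue  # checked against the original namespace only
        replacement = replacements.get(namespace, namespace)
        result.append(f"{replacement}:{name}" if replacement else name)
    # Additional tags are not subject to the blacklist.
    result.extend(additional_tags)
    return result


replacements = parse_replacements("artist|||creator###female|||###misc|||")
tags = transform_tags(
    ["artist:someone", "language:english", "female:example", "page:1"],
    blacklisted=parse_list("language###creator"),
    replacements=replacements,
    additional_tags=parse_list("source:exh"),
)
print(tags)
# prints ['creator:someone', 'example', 'page:1', 'source:exh']
```

Note that `creator:someone` survives even though `creator` is blacklisted in this example, because the blacklist is matched against the original namespace (`artist`), which mirrors the behavior described for `HEX_BLACKLISTED_NAMESPACES`.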
To make using hex as comfortable as possible, a userscript that adds a hex toolbar to ExH gallery pages is included. When you first open ExH with the userscript enabled, it will prompt you for the hex base URL and the access key. You can adjust these at any point in the settings, but be sure to refresh any open ExH page after doing so (as the changes will not be reflected on a page that had already been loaded before changing base URL or access key). Please keep in mind that the access key is stored in plaintext and that anyone with access to the browser can read it.
Request and response bodies are always in JSON format. The `Authorization` header in the format `Authorization: Bearer <HEX_ACCESS_KEY>` is used to authenticate for all routes except the base route (`/`).
Requests with missing or malformed parameters are answered with status code `400` and an error in the following format:
{
"error": <field name>
}
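A minimal Python sketch of a client that follows these conventions is shown below. It uses only the standard library; the helper names `build_request` and `send`, the base URL `http://localhost:8000` and the key `secret` are placeholders for illustration, and the error handling assumes that error responses carry a JSON body as described above.

```python
import json
import urllib.error
import urllib.request


def build_request(base_url, path, access_key=None, body=None):
    """Build a JSON request; the Bearer header is added when a key is given."""
    headers = {}
    if access_key is not None:
        headers["Authorization"] = f"Bearer {access_key}"
    data = None
    if body is not None:
        headers["Content-Type"] = "application/json"
        data = json.dumps(body).encode("utf-8")
    method = "POST" if body is not None else "GET"
    return urllib.request.Request(
        base_url + path, data=data, headers=headers, method=method
    )


def send(request):
    """Send the request and return (status code, decoded JSON body)."""
    try:
        with urllib.request.urlopen(request) as response:
            return response.status, json.loads(response.read())
    except urllib.error.HTTPError as error:
        # For example, a 400 response carrying {"error": <field name>}.
        return error.code, json.loads(error.read())


# Placeholder values; the base route `/` would not need the access key.
request = build_request("http://localhost:8000", "/settings", access_key="secret")
```

Calling `send(request)` against a running hex instance would then return the status code together with the parsed response body.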
Responds with the version number and the API version number of the hex installation. The API version number increases by 1 every time an existing API endpoint is removed or modified in a way that changes its behavior.
Route: GET /
Response on success:
{
"hex": {
"version": <version number of hex installation>,
"apiVersion": <API version number of hex installation>
}
}
Responds with a number of default settings configured via environment variables. These settings can then be overridden per import.
Route: GET /settings
Response on success:
{
"settings": {
"skipImport": <boolean indicating if hydrus imports should be skipped>,
"skipKnownFiles": <boolean indicating if known files should be skipped>,
"deleteArchivesAfterImport": <boolean indicating if archives should be deleted after import>,
"skipTags": <boolean indicating if adding tags should be skipped>,
"blacklistedNamespaces": <array of blacklisted namespaces>,
"namespaceReplacements": <object where the key is the original and the value the replacement namespace>,
"additionalTags": <array of additional tags>
}
}
Used to send ExH gallery URLs to hex for processing.
Route: POST /import
Request body:
{
"cookies": <ExH `Cookie` header>,
"url": <ExH gallery URL to be processed>,
"skipImport": <boolean indicating if hydrus imports should be skipped>, // optional, if not provided, the default will be used
"skipKnownFiles": <boolean indicating if known files should be skipped>, // optional, if not provided, the default will be used
"deleteArchivesAfterImport": <boolean indicating if archives should be deleted after import>, // optional, if not provided, the default will be used
"skipTags": <boolean indicating if adding tags should be skipped>, // optional, if not provided, the default will be used
"blacklistedNamespaces": <blacklisted namespaces in the same format as `HEX_BLACKLISTED_NAMESPACES`>, // optional, if not provided, the default will be used
"namespaceReplacements": <namespace replacements in the same format as `HEX_NAMESPACE_REPLACEMENTS`>, // optional, if not provided, the default will be used
"additionalTags": <additional tags in the same format as `HEX_ADDITIONAL_TAGS`> // optional, if not provided, the default will be used
}
Response on success:
{
"import": <ExH gallery URL> // does not indicate success, only that the processing of the gallery has been started
}
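As an illustration, the following Python sketch assembles a request body for `POST /import`. It assumes that the namespace and tag overrides are sent as strings in the same `###` and `|||` formats as the corresponding environment variables, which is how the field descriptions above read. The helper name, the gallery URL and the cookie string are placeholders.

```python
import json


def build_import_body(url, cookies, **overrides):
    """Build the POST /import body, serializing optional overrides."""
    body = {"cookies": cookies, "url": url}
    for key, value in overrides.items():
        if value is None:
            continue  # omitted fields fall back to the configured defaults
        if isinstance(value, dict):
            # {"artist": "creator"} -> "artist|||creator"
            value = "###".join(f"{k}|||{v}" for k, v in value.items())
        elif isinstance(value, (list, tuple)):
            # ["artist", "language"] -> "artist###language"
            value = "###".join(value)
        body[key] = value
    return body


body = build_import_body(
    "https://example.org/g/1/abc/",  # placeholder gallery URL
    "name=value",                    # placeholder Cookie header
    skipTags=False,
    blacklistedNamespaces=["artist", "language"],
    namespaceReplacements={"artist": "creator", "misc": ""},
    additionalTags=None,
)
print(json.dumps(body, indent=2))
```

Because `additionalTags` is `None` here, it is left out of the body, so the default from `HEX_ADDITIONAL_TAGS` would apply.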
You are welcome to help out!
Open an issue or submit a pull request.
AGPLv3 © imtbl