marine
marine is a toolkit for building the Japanese accent estimation model proposed in our paper (demo).
For academic use, please cite the following paper (ISCA archive).
@inproceedings{park22b_interspeech,
author={Byeongseon Park and Ryuichi Yamamoto and Kentaro Tachibana},
title={{A Unified Accent Estimation Method Based on Multi-Task Learning for Japanese Text-to-Speech}},
year=2022,
booktitle={Proc. Interspeech 2022},
pages={1931--1935},
doi={10.21437/Interspeech.2022-334}
}
The model included in this package is trained on the JSUT corpus, which differs from the dataset used in our paper. Its performance therefore differs from the results reported in the paper.
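Install the released package with pip:
$ pip install marine
For development, install from a local checkout of the source in editable mode with the dev extras:
$ pip install -e ".[dev]"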
In [1]: from marine.predict import Predictor
In [2]: nodes = [{"surface": "こんにちは", "pos": "感動詞:*:*:*", "pron": "コンニチワ", "c_type": "*", "c_form": "*", "accent_type": 0, "accent_con_type": "-1", "chain_flag": -1}]
In [3]: predictor = Predictor()
In [4]: predictor.predict([nodes])
Out[4]:
{'mora': [['コ', 'ン', 'ニ', 'チ', 'ワ']],
'intonation_phrase_boundary': [[0, 0, 0, 0, 0]],
'accent_phrase_boundary': [[0, 0, 0, 0, 0]],
'accent_status': [[0, 0, 0, 0, 0]]}
In [5]: predictor.predict([nodes], accent_represent_mode="high_low")
Out[5]:
{'mora': [['コ', 'ン', 'ニ', 'チ', 'ワ']],
'intonation_phrase_boundary': [[0, 0, 0, 0, 0]],
'accent_phrase_boundary': [[0, 0, 0, 0, 0]],
'accent_status': [[0, 1, 1, 1, 1]]}
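The dictionaries returned above hold parallel lists, one entry per mora for each input sentence. The sketch below pairs each mora with its label so the result is easier to read. It does not call marine: `pair_moras_with_labels` is a hypothetical helper, and `sample` is copied from the `high_low` output shown above. Reading 1 as a high-pitched mora and 0 as a low one is an assumption based on the mode name; it matches the flat pattern expected for `accent_type` 0.

```python
from typing import Dict, List, Tuple


def pair_moras_with_labels(
    result: Dict[str, list], key: str = "accent_status"
) -> List[List[Tuple[str, int]]]:
    """Zip every mora with its label, sentence by sentence.

    Hypothetical helper, not part of marine. `result` is expected to have the
    same shape as the dictionaries shown in the session above.
    """
    paired = []
    for moras, labels in zip(result["mora"], result[key]):
        if len(moras) != len(labels):
            raise ValueError("mora and label sequences differ in length")
        paired.append(list(zip(moras, labels)))
    return paired


# Output copied from the high_low example above.
sample = {
    "mora": [["コ", "ン", "ニ", "チ", "ワ"]],
    "intonation_phrase_boundary": [[0, 0, 0, 0, 0]],
    "accent_phrase_boundary": [[0, 0, 0, 0, 0]],
    "accent_status": [[0, 1, 1, 1, 1]],
}

for sentence in pair_moras_with_labels(sample):
    # Assumption: 1 means high pitch (H) and 0 means low pitch (L).
    print(" ".join(f"{mora}:{'H' if label else 'L'}" for mora, label in sentence))
```

Passing `key="accent_phrase_boundary"` or `key="intonation_phrase_boundary"` pairs the moras with the boundary labels instead.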
- marine: Apache 2.0 license (LICENSE)
- JSUT: CC-BY-SA 4.0 license, etc. (Please check jsut-label/LICENCE.txt)