GitHub - vyhuholl/russian_detoxification: Models for automatic Russian texts detoxification

Models for automatic detoxification of Russian texts.

Data

The data folder contains:

data/train.csv — train dataset, 262702 texts (213271 non-toxic, 37073 toxic);
data/test.txt — test dataset, 12358 toxic texts;
data/parallel_corpus.csv — parallel train dataset, 500 samples;
data/toxic_vocab.txt — pre-defined vocab of rude and toxic words, 139490 words;
data/preds_delete.txt — predictions of the delete baseline on the test dataset.

Models

Baselines

We provide two baselines:

Duplicate — simple duplication of the input;
Delete (baselines/delete.py) — removal of rude and toxic words that appear in the pre-defined vocab.
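The idea behind the Delete baseline can be sketched in a few lines. This is a minimal illustration, not the contents of baselines/delete.py: the function name `delete_toxic_words` is hypothetical, and it assumes exact matching on lowercased word forms, whereas the real script may normalize or lemmatize words differently.

```python
import re


def delete_toxic_words(text: str, toxic_vocab: set[str]) -> str:
    """Remove every word whose lowercased form is in toxic_vocab.

    Assumption: exact lowercase matching, no lemmatization.
    """

    def drop(match: re.Match) -> str:
        word = match.group(0)
        return "" if word.lower() in toxic_vocab else word

    # \w+ is Unicode-aware in Python 3, so it matches Cyrillic words too.
    cleaned = re.sub(r"\w+", drop, text)
    # Collapse the whitespace left behind by removed words.
    return re.sub(r"\s+", " ", cleaned).strip()
```

In practice the vocabulary would be loaded once from data/toxic_vocab.txt into a set so that each lookup is constant-time.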

Models

The general algorithm of text detoxification:

Toxic word detection — we train a binary classifier to detect toxic words;
Toxic word replacement — to replace words classified as toxic, we use one of two pre-trained language models for Russian (either ruBERT-large or ruRoBERTa-large). From the model's top-10 predictions we select the one that is 1) non-toxic and 2) closest to the original word (word embeddings are generated with the FastText model).
Toxic word deletion — if a non-toxic replacement wasn't found in the top-10 of model predictions, we delete the word.
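The three steps above can be sketched as follows. This is an illustration under stated assumptions rather than the repository's code: `is_toxic`, `propose`, and `embed` are hypothetical callables standing in for the word-level classifier, the masked language model's ranked predictions, and the FastText embedding lookup, and "closest" is taken to mean highest cosine similarity.

```python
import math
from typing import Callable, Optional, Sequence

Vector = Sequence[float]


def cosine(u: Vector, v: Vector) -> float:
    """Cosine similarity of two vectors (0.0 if either has zero norm)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def choose_replacement(
    word: str,
    candidates: Sequence[str],
    is_toxic: Callable[[str], bool],
    embed: Callable[[str], Vector],
) -> Optional[str]:
    """Pick the non-toxic candidate (from the top 10) closest to `word`."""
    safe = [c for c in candidates[:10] if not is_toxic(c)]
    if not safe:
        return None  # signals that the word should be deleted
    target = embed(word)
    return max(safe, key=lambda c: cosine(embed(c), target))


def detoxify(
    tokens: Sequence[str],
    is_toxic: Callable[[str], bool],
    propose: Callable[[Sequence[str], int], Sequence[str]],
    embed: Callable[[str], Vector],
) -> list[str]:
    """Detect toxic tokens, replace them if possible, otherwise drop them."""
    result = []
    for i, token in enumerate(tokens):
        if not is_toxic(token):
            result.append(token)
            continue
        # `propose` returns ranked fill-in candidates for position i.
        replacement = choose_replacement(token, propose(tokens, i), is_toxic, embed)
        if replacement is not None:
            result.append(replacement)
    return result
```

Passing the models in as callables keeps the selection logic independent of which masked language model or embedding model is plugged in.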

Evaluation

The evaluation consists of three types of metrics:

style transfer accuracy (STA) — the average confidence of the pre-trained BERT-based toxic/non-toxic text classifier (SkolkovoInstitute/russian_toxicity_classifier);
cosine similarity (CS) — the average cosine similarity between the embeddings of the input and output texts. The embeddings are generated with the FastText Skipgram model;
fluency score (FL) — the average difference in confidence of the pre-trained BERT-based corrupted/non-corrupted text classifier (SkolkovoInstitute/rubert-base-corruption-detector) between the input and output texts.

Finally, the joint score (JS) is the sentence-level product of the STA, CS, and FL scores.
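A small sketch of how a sentence-level joint score can be aggregated is shown below. The function name `joint_score` is hypothetical, and averaging the per-sentence products over the dataset is an assumption; the text above only states that the multiplication is done per sentence.

```python
from statistics import mean
from typing import Sequence


def joint_score(
    sta: Sequence[float], cs: Sequence[float], fl: Sequence[float]
) -> float:
    """Multiply the three scores per sentence, then average the products.

    Assumption: the dataset-level JS is the mean of per-sentence products.
    """
    if not (len(sta) == len(cs) == len(fl)):
        raise ValueError("all score lists must have the same length")
    return mean(s * c * f for s, c, f in zip(sta, cs, fl))
```

Because the product is taken before averaging, a reported JS need not equal the product of the averaged STA, CS, and FL columns.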

You can run the metric.py script for evaluation with the following parameters:

-i, --inputs — the path to the input dataset, a .txt file;
-p, --preds — the path to the file with the model's predictions, a .txt file;
-b, --batch_size — batch size for the classifiers, 32 by default;
-m, --model — the name of your model, empty by default;
-f, --file — the path to the output file. If not specified, results will not be written to a file.
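For orientation, the options listed above could be declared with argparse roughly as follows. This is a sketch of the documented interface, not the actual contents of metric.py: the `build_parser` helper, the help strings, and the choice to make the two input paths required are assumptions.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Declare the command-line options documented for metric.py."""
    parser = argparse.ArgumentParser(description="Evaluate detoxification output.")
    # Assumption: both input paths are mandatory.
    parser.add_argument("-i", "--inputs", required=True, help="input .txt file")
    parser.add_argument("-p", "--preds", required=True, help="predictions .txt file")
    parser.add_argument("-b", "--batch_size", type=int, default=32,
                        help="batch size for the classifiers")
    parser.add_argument("-m", "--model", default="", help="name of the model")
    parser.add_argument("-f", "--file", default=None,
                        help="output file; results are not saved if omitted")
    return parser


# Parse an explicit argument list instead of sys.argv, for demonstration.
args = build_parser().parse_args(
    ["-i", "data/test.txt", "-p", "data/preds_delete.txt"]
)
```

On the command line this corresponds to something like `python metric.py -i data/test.txt -p data/preds_delete.txt -m delete`, where the model name `delete` is only an illustrative label.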

Results

Method	STA↑	CS↑	FL↑	JS↑
Baselines
Duplicate	0.07	1.00	1.00	0.06
Delete	0.35	0.97	0.84	0.26
Models
ruBERT-large
ruRoBERTa-large
