Biomedical Imaging (Obrazowanie Biomedyczne)

[INFO] Different datasets and experiments are available on separate branches

A study of the impact of feature extraction methods and SVM hyperparameters on a classification problem for a 4-class COVID-19 and pneumonia dataset.

How to use

To preprocess the original dataset (on Google Colab), the kaggle.json authorization file must be in the repo directory:

!git clone https://github.com/Shandelier/Classification-Experiments
%cd Classification-Experiments

# Create the target directory for the dataset and install the Kaggle CLI
!mkdir -p /content/Classification-Experiments/curated-chest-xray-image-dataset-for-covid19
!pip install kaggle

import json
import os

# Load the Kaggle credentials and expose them to the Kaggle CLI
with open('/content/Classification-Experiments/kaggle.json', 'r') as file:
    auth = json.load(file)
os.environ['KAGGLE_USERNAME'] = auth["username"]
os.environ['KAGGLE_KEY'] = auth["key"]

# Download the dataset archive into the current (repo) directory
# !kaggle config set -n path -v '/content/drive/MyDrive/Colab Notebooks/PWr9'
!kaggle datasets download -d unaissait/curated-chest-xray-image-dataset-for-covid19

# Unzip dataset
!unzip curated-chest-xray-image-dataset-for-covid19.zip -d /content/Classification-Experiments/curated-chest-xray-image-dataset-for-covid19

!python ./preprocess.py --dataset_dir "/content/Classification-Experiments/curated-chest-xray-image-dataset-for-covid19" --results_dir "/content/Classification-Experiments/results" --output_dir "/content/Classification-Experiments/output" --output_dataset_dir "/content/Classification-Experiments/datasets"

To analyze the prepared dataset:

python ./analyze_extraction.py
python ./post_extraction.py
python ./extract.py
python ./analyze_SVM.py
python ./post_svm.py

The results are in "results*" folders.

Extraction Methods impact

We compared the PCA and Chi2 extraction methods on the basic dataset from the COVID-19 Kaggle dataset. The 48 features were extracted using the methods in the preprocess.py file. Then, in a loop, we compared the balanced accuracy score for different numbers of preserved components.
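The comparison loop described above could look roughly like the sketch below. It is not the code from this repo: the synthetic data from make_classification, the SVC classifier used for scoring, the 5-fold cross-validation, and the helper name compare_extractors are all assumptions made for illustration.

```python
from sklearn.datasets import make_classification
from sklearn.decomposition import PCA
from sklearn.feature_selection import SelectKBest, chi2
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import MinMaxScaler
from sklearn.svm import SVC

# Synthetic stand-in for the 48-feature, 4-class dataset (assumption).
X, y = make_classification(
    n_samples=400, n_features=48, n_informative=12,
    n_classes=4, random_state=0,
)


def compare_extractors(X, y, component_counts):
    """Return {method: {k: mean balanced accuracy}} for PCA and Chi2."""
    scores = {"PCA": {}, "Chi2": {}}
    for k in component_counts:
        extractors = {
            "PCA": PCA(n_components=k, random_state=0),
            "Chi2": SelectKBest(chi2, k=k),
        }
        for name, extractor in extractors.items():
            # MinMaxScaler keeps the training features non-negative,
            # which the chi2 score function requires.
            model = make_pipeline(MinMaxScaler(), extractor, SVC())
            cv = cross_val_score(
                model, X, y, cv=5, scoring="balanced_accuracy"
            )
            scores[name][k] = float(cv.mean())
    return scores


scores = compare_extractors(X, y, component_counts=[2, 10, 24])
for name, by_k in scores.items():
    print(name, {k: round(v, 3) for k, v in by_k.items()})
```

Putting the extractor inside the pipeline means it is refitted on each training fold, so the held-out fold does not leak into the choice of components.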

SVM parameters

From the extraction experiment we chose the best settings (PCA with n_components=10) and prepared the dataset for this experiment. Here we compared 3 categories of parameters:

Kernels – Linear, Sigmoid, RBF;
Gammas – from 1 down to 0.0001;
C parameters – from 1000 down to 0.01.
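One way to sweep these three categories is sketched below, varying one parameter at a time while the others stay at scikit-learn defaults. This is an illustration only: the intermediate gamma and C values (powers of ten between the stated endpoints), the synthetic 10-feature data, the 5-fold cross-validation, and the helper name sweep_parameter are assumptions, not details taken from analyze_SVM.py.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC

# Synthetic stand-in for the 10-component PCA dataset (assumption).
X, y = make_classification(
    n_samples=300, n_features=10, n_informative=6,
    n_classes=4, random_state=0,
)

# Endpoints come from the list above; the steps in between are assumed.
sweeps = {
    "kernel": ["linear", "sigmoid", "rbf"],
    "gamma": [1.0, 0.1, 0.01, 0.001, 0.0001],
    "C": [1000.0, 100.0, 10.0, 1.0, 0.1, 0.01],
}


def sweep_parameter(name, values, X, y):
    """Vary one SVC parameter; all others keep scikit-learn defaults."""
    results = {}
    for value in values:
        model = SVC(**{name: value})
        cv = cross_val_score(model, X, y, cv=5, scoring="balanced_accuracy")
        results[value] = float(cv.mean())
    return results


results = {
    name: sweep_parameter(name, values, X, y)
    for name, values in sweeps.items()
}
for name, by_value in results.items():
    print(name, {v: round(s, 3) for v, s in by_value.items()})
```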

Wilcoxon Test

The results were tested with the Wilcoxon test to reveal statistically different scores. Under every score in the table there is a list of the scores that were significantly worse than it.
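A minimal sketch of such a pairwise comparison is shown below, using scipy.stats.wilcoxon. The per-fold scores are made-up numbers, and the one-sided alternative, the 0.05 significance level, and the helper name significantly_worse are assumptions; the scripts in this repo may set the test up differently.

```python
from scipy.stats import wilcoxon

# Hypothetical per-fold balanced accuracy scores (made-up numbers).
fold_scores = {
    "rbf":     [0.90, 0.88, 0.91, 0.89, 0.92, 0.87, 0.93, 0.90, 0.89, 0.91],
    "linear":  [0.85, 0.84, 0.86, 0.83, 0.88, 0.82, 0.87, 0.86, 0.84, 0.85],
    "sigmoid": [0.70, 0.72, 0.69, 0.71, 0.73, 0.68, 0.74, 0.70, 0.71, 0.72],
}


def significantly_worse(fold_scores, alpha=0.05):
    """For each configuration, list those it significantly outperforms."""
    worse = {name: [] for name in fold_scores}
    for a, scores_a in fold_scores.items():
        for b, scores_b in fold_scores.items():
            if a == b:
                continue
            # Paired one-sided test: are a's scores greater than b's?
            p_value = wilcoxon(scores_a, scores_b, alternative="greater").pvalue
            if p_value < alpha:
                worse[a].append(b)
    return worse


worse = significantly_worse(fold_scores)
for name, losers in worse.items():
    print(name, "is significantly better than:", losers)
```

The test is paired, so each list must hold scores from the same folds in the same order.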

