diff --git a/projects/medical/2d_image/x_ray/chest_image_pneum/README.md b/projects/medical/2d_image/x_ray/chest_image_pneum/README.md
new file mode 100644
index 0000000000..a1cd27ba45
--- /dev/null
+++ b/projects/medical/2d_image/x_ray/chest_image_pneum/README.md
@@ -0,0 +1,147 @@
+# Chest Image Dataset for Pneumothorax Segmentation
+
+## Description
+
+This project supports **`Chest Image Dataset for Pneumothorax Segmentation`**, which can be downloaded from [here](https://tianchi.aliyun.com/dataset/83075).
+
+### Dataset Overview
+
+Pneumothorax can be caused by a blunt chest injury or by damage from underlying lung disease, or, most alarmingly, it may occur for no obvious reason at all. On some occasions, a collapsed lung can be a life-threatening event.
+Pneumothorax is usually diagnosed by a radiologist on a chest x-ray, and can sometimes be very difficult to confirm. An accurate AI algorithm to detect pneumothorax would be useful in many clinical scenarios. AI could be used to triage chest radiographs for priority interpretation, or to provide a more confident diagnosis for non-radiologists.
+
+The dataset is provided by the Society for Imaging Informatics in Medicine (SIIM), the American College of Radiology (ACR), the Society of Thoracic Radiology (STR) and MD.ai. You can develop a model to classify (and if present, segment) pneumothorax from a set of chest radiographic images. If successful, you could aid in the early recognition of pneumothoraces and save lives.
+
+### Original Statistic Information
+
+| Dataset name | Anatomical region | Task type | Modality | Num. Classes | Train/Val/Test Images | Train/Val/Test Labeled | Release Date | License |
+| --------------------------------------------------------------------- | ----------------- | ------------ | -------- | ------------ | --------------------- | ---------------------- | ------------ | ------------------------------------------------------------------ |
+| [pneumothorax segmentation](https://tianchi.aliyun.com/dataset/83075) | thorax | segmentation | x_ray | 2 | 12089/-/3205 | yes/-/no | - | [CC-BY-SA-NC 4.0](https://creativecommons.org/licenses/by-sa/4.0/) |
+
+| Class Name | Num. Train | Pct. Train | Num. Val | Pct. Val | Num. Test | Pct. Test |
+| :---------------: | :--------: | :--------: | :------: | :------: | :-------: | :-------: |
+| normal | 12089 | 99.75 | - | - | - | - |
+| pneumothorax area | 2669 | 0.25 | - | - | - | - |
+
+Note:
+
+- `Pct` is the percentage of all pixels that belong to this category.
+
+### Visualization
+
+![chest image pneum dataset](https://raw.githubusercontent.com/uni-medical/medical-datasets-visualization/main/2d/semantic_seg/x_ray/chest_image_pneum/chest_image_pneum_dataset.png)
+
+### Prerequisites
+
+- Python v3.8
+- PyTorch v1.10.0
+- [MIM](https://github.com/open-mmlab/mim) v0.3.4
+- [MMCV](https://github.com/open-mmlab/mmcv) v2.0.0rc4
+- [MMEngine](https://github.com/open-mmlab/mmengine) v0.2.0 or higher
+- [MMSegmentation](https://github.com/open-mmlab/mmsegmentation) v1.0.0rc5
+
+All the commands below rely on the correct configuration of `PYTHONPATH`, which should point to the project's directory so that Python can locate the module files. From the `chest_image_pneum/` root directory, run the following command to add the current directory to `PYTHONPATH`:
+
+```shell
+export PYTHONPATH=`pwd`:$PYTHONPATH
+```
+
+### Dataset preparation
+
+- download the dataset from [here](https://tianchi.aliyun.com/dataset/83075) and decompress it to `data/`.
+- run `python tools/prepare_dataset.py` to convert the data and arrange the folders as shown below.
+- run `python ../../tools/split_seg_dataset.py` to split the dataset and generate `train.txt`, `val.txt` and `test.txt`. If labels for the official validation and test sets are not available, `train.txt` and `val.txt` are generated by randomly splitting the training set.
+
+```none
+  mmsegmentation
+  ├── mmseg
+  ├── projects
+  │   ├── medical
+  │   │   ├── 2d_image
+  │   │   │   ├── x_ray
+  │   │   │   │   ├── chest_image_pneum
+  │   │   │   │   │   ├── configs
+  │   │   │   │   │   ├── datasets
+  │   │   │   │   │   ├── tools
+  │   │   │   │   │   ├── data
+  │   │   │   │   │   │   ├── train.txt
+  │   │   │   │   │   │   ├── test.txt
+  │   │   │   │   │   │   ├── images
+  │   │   │   │   │   │   │   ├── train
+  │   │   │   │   │   │   │   │   ├── xxx.png
+  │   │   │   │   │   │   │   │   ├── ...
+  │   │   │   │   │   │   │   │   └── xxx.png
+  │   │   │   │   │   │   ├── masks
+  │   │   │   │   │   │   │   ├── train
+  │   │   │   │   │   │   │   │   ├── xxx.png
+  │   │   │   │   │   │   │   │   ├── ...
+  │   │   │   │   │   │   │   │   └── xxx.png
+```
+
+### Divided Dataset Information
+
+***Note: The statistics below describe our own split of the training set.***
+
+| Class Name | Num. Train | Pct. Train | Num. Val | Pct. Val | Num. Test | Pct. Test |
+| :---------------: | :--------: | :--------: | :------: | :------: | :-------: | :-------: |
+| normal | 9637 | 99.75 | 2410 | 99.74 | - | - |
+| pneumothorax area | 2137 | 0.25 | 532 | 0.26 | - | - |
+
+### Training commands
+
+Train models on a single server with one GPU.
+
+```shell
+mim train mmseg ./configs/${CONFIG_FILE}
+```
+
+### Testing commands
+
+Test models on a single server with one GPU.
+
+```shell
+mim test mmseg ./configs/${CONFIG_FILE} --checkpoint ${CHECKPOINT_PATH}
+```
+
+
+
+## Results
+
+### Chest Image Dataset for Pneumothorax Segmentation
+
+| Method | Backbone | Crop Size | lr | mIoU | mDice | config | download |
+| :-------------: | :------: | :-------: | :----: | :--: | :---: | :------------------------------------------------------------------------------------: | :----------------------: |
+| fcn_unet_s5-d16 | unet | 512x512 | 0.01 | - | - | [config](./configs/fcn-unet-s5-d16_unet_1xb16-0.01-20k_chest-image-pneum-512x512.py) | [model](<>) \| [log](<>) |
+| fcn_unet_s5-d16 | unet | 512x512 | 0.001 | - | - | [config](./configs/fcn-unet-s5-d16_unet_1xb16-0.001-20k_chest-image-pneum-512x512.py) | [model](<>) \| [log](<>) |
+| fcn_unet_s5-d16 | unet | 512x512 | 0.0001 | - | - | [config](./configs/fcn-unet-s5-d16_unet_1xb16-0.0001-20k_chest-image-pneum-512x512.py) | [model](<>) \| [log](<>) |
+
+## Checklist
+
+- [x] Milestone 1: PR-ready, and acceptable to be one of the `projects/`.
+
+  - [x] Finish the code
+
+  - [x] Basic docstrings & proper citation
+
+  - [x] Test-time correctness
+
+  - [x] A full README
+
+- [x] Milestone 2: Indicates a successful model implementation.
+
+  - [x] Training-time correctness
+
+- [ ] Milestone 3: Good to be a part of our core package!
+
+  - [ ] Type hints and docstrings
+
+  - [ ] Unit tests
+
+  - [ ] Code polishing
+
+  - [ ] Metafile.yml
+
+- [ ] Move your modules into the core package following the codebase's file hierarchy structure.
+
+- [ ] Refactor your modules into the core package following the codebase's file hierarchy structure.
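The `Pct` columns in the README tables report the share of pixels belonging to each class. A minimal sketch of how such a figure could be computed from decoded masks follows, assuming each mask is an array holding 0 for `normal` and 1 for `pneumothorax area`. The helper name `pixel_percentages` and the toy masks are hypothetical and not part of this PR.

```python
import numpy as np


def pixel_percentages(masks, num_classes=2):
    """Return the percentage of pixels per class over all masks."""
    counts = np.zeros(num_classes, dtype=np.int64)
    for mask in masks:
        # bincount tallies how many pixels carry each class index
        counts += np.bincount(
            np.asarray(mask).ravel(), minlength=num_classes)[:num_classes]
    return 100.0 * counts / counts.sum()


# Toy data (assumption): two 4x4 masks, 2 foreground pixels out of 32
masks = [np.zeros((4, 4), dtype=np.uint8) for _ in range(2)]
masks[0][0, :2] = 1
pct = pixel_percentages(masks)
print(pct)  # 93.75 % normal, 6.25 % pneumothorax area
```

Because the percentages are taken over pixels rather than images, a class that appears in many images can still account for a very small share, as in the tables above.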
diff --git a/projects/medical/2d_image/x_ray/chest_image_pneum/configs/chest-image-pneum_512x512.py b/projects/medical/2d_image/x_ray/chest_image_pneum/configs/chest-image-pneum_512x512.py
new file mode 100644
index 0000000000..411229bd41
--- /dev/null
+++ b/projects/medical/2d_image/x_ray/chest_image_pneum/configs/chest-image-pneum_512x512.py
@@ -0,0 +1,42 @@
+dataset_type = 'ChestImagePneumDataset'
+data_root = 'data/'
+img_scale = (512, 512)
+train_pipeline = [
+    dict(type='LoadImageFromFile'),
+    dict(type='LoadAnnotations'),
+    dict(type='Resize', scale=img_scale, keep_ratio=False),
+    dict(type='RandomFlip', prob=0.5),
+    dict(type='PhotoMetricDistortion'),
+    dict(type='PackSegInputs')
+]
+test_pipeline = [
+    dict(type='LoadImageFromFile'),
+    dict(type='Resize', scale=img_scale, keep_ratio=False),
+    dict(type='LoadAnnotations'),
+    dict(type='PackSegInputs')
+]
+train_dataloader = dict(
+    batch_size=16,
+    num_workers=4,
+    persistent_workers=True,
+    sampler=dict(type='InfiniteSampler', shuffle=True),
+    dataset=dict(
+        type=dataset_type,
+        data_root=data_root,
+        ann_file='train.txt',
+        data_prefix=dict(img_path='images/', seg_map_path='masks/'),
+        pipeline=train_pipeline))
+val_dataloader = dict(
+    batch_size=1,
+    num_workers=4,
+    persistent_workers=True,
+    sampler=dict(type='DefaultSampler', shuffle=False),
+    dataset=dict(
+        type=dataset_type,
+        data_root=data_root,
+        ann_file='val.txt',
+        data_prefix=dict(img_path='images/', seg_map_path='masks/'),
+        pipeline=test_pipeline))
+test_dataloader = val_dataloader
+val_evaluator = dict(type='IoUMetric', iou_metrics=['mIoU', 'mDice'])
+test_evaluator = dict(type='IoUMetric', iou_metrics=['mIoU', 'mDice'])
diff --git a/projects/medical/2d_image/x_ray/chest_image_pneum/configs/fcn-unet-s5-d16_unet_1xb16-0.0001-20k_chest-image-pneum-512x512.py b/projects/medical/2d_image/x_ray/chest_image_pneum/configs/fcn-unet-s5-d16_unet_1xb16-0.0001-20k_chest-image-pneum-512x512.py
new file mode 100644
index 0000000000..0f26459467
--- /dev/null
+++ b/projects/medical/2d_image/x_ray/chest_image_pneum/configs/fcn-unet-s5-d16_unet_1xb16-0.0001-20k_chest-image-pneum-512x512.py
@@ -0,0 +1,18 @@
+_base_ = [
+    './chest-image-pneum_512x512.py',
+    'mmseg::_base_/models/fcn_unet_s5-d16.py',
+    'mmseg::_base_/default_runtime.py',
+    'mmseg::_base_/schedules/schedule_20k.py'
+]
+custom_imports = dict(imports='datasets.chest-image-pneum_dataset')
+img_scale = (512, 512)
+data_preprocessor = dict(size=img_scale)
+optimizer = dict(lr=0.0001)
+optim_wrapper = dict(optimizer=optimizer)
+model = dict(
+    data_preprocessor=data_preprocessor,
+    decode_head=dict(num_classes=2),
+    auxiliary_head=None,
+    test_cfg=dict(mode='whole', _delete_=True))
+vis_backends = None
+visualizer = dict(vis_backends=vis_backends)
diff --git a/projects/medical/2d_image/x_ray/chest_image_pneum/configs/fcn-unet-s5-d16_unet_1xb16-0.001-20k_chest-image-pneum-512x512.py b/projects/medical/2d_image/x_ray/chest_image_pneum/configs/fcn-unet-s5-d16_unet_1xb16-0.001-20k_chest-image-pneum-512x512.py
new file mode 100644
index 0000000000..37b91889d8
--- /dev/null
+++ b/projects/medical/2d_image/x_ray/chest_image_pneum/configs/fcn-unet-s5-d16_unet_1xb16-0.001-20k_chest-image-pneum-512x512.py
@@ -0,0 +1,18 @@
+_base_ = [
+    './chest-image-pneum_512x512.py',
+    'mmseg::_base_/models/fcn_unet_s5-d16.py',
+    'mmseg::_base_/default_runtime.py',
+    'mmseg::_base_/schedules/schedule_20k.py'
+]
+custom_imports = dict(imports='datasets.chest-image-pneum_dataset')
+img_scale = (512, 512)
+data_preprocessor = dict(size=img_scale)
+optimizer = dict(lr=0.001)
+optim_wrapper = dict(optimizer=optimizer)
+model = dict(
+    data_preprocessor=data_preprocessor,
+    decode_head=dict(num_classes=2),
+    auxiliary_head=None,
+    test_cfg=dict(mode='whole', _delete_=True))
+vis_backends = None
+visualizer = dict(vis_backends=vis_backends)
diff --git a/projects/medical/2d_image/x_ray/chest_image_pneum/configs/fcn-unet-s5-d16_unet_1xb16-0.01-20k_chest-image-pneum-512x512.py b/projects/medical/2d_image/x_ray/chest_image_pneum/configs/fcn-unet-s5-d16_unet_1xb16-0.01-20k_chest-image-pneum-512x512.py
new file mode 100644
index 0000000000..379e8181f3
--- /dev/null
+++ b/projects/medical/2d_image/x_ray/chest_image_pneum/configs/fcn-unet-s5-d16_unet_1xb16-0.01-20k_chest-image-pneum-512x512.py
@@ -0,0 +1,18 @@
+_base_ = [
+    './chest-image-pneum_512x512.py',
+    'mmseg::_base_/models/fcn_unet_s5-d16.py',
+    'mmseg::_base_/default_runtime.py',
+    'mmseg::_base_/schedules/schedule_20k.py'
+]
+custom_imports = dict(imports='datasets.chest-image-pneum_dataset')
+img_scale = (512, 512)
+data_preprocessor = dict(size=img_scale)
+optimizer = dict(lr=0.01)
+optim_wrapper = dict(optimizer=optimizer)
+model = dict(
+    data_preprocessor=data_preprocessor,
+    decode_head=dict(num_classes=2),
+    auxiliary_head=None,
+    test_cfg=dict(mode='whole', _delete_=True))
+vis_backends = None
+visualizer = dict(vis_backends=vis_backends)
diff --git a/projects/medical/2d_image/x_ray/chest_image_pneum/datasets/chest-image-pneum_dataset.py b/projects/medical/2d_image/x_ray/chest_image_pneum/datasets/chest-image-pneum_dataset.py
new file mode 100644
index 0000000000..aeee60ae92
--- /dev/null
+++ b/projects/medical/2d_image/x_ray/chest_image_pneum/datasets/chest-image-pneum_dataset.py
@@ -0,0 +1,27 @@
+from mmseg.datasets import BaseSegDataset
+from mmseg.registry import DATASETS
+
+
+@DATASETS.register_module()
+class ChestImagePneumDataset(BaseSegDataset):
+    """ChestImagePneumDataset dataset.
+
+    In segmentation map annotation for ChestImagePneumDataset,
+    ``reduce_zero_label`` is fixed to False. The ``img_suffix``
+    is fixed to '.png' and ``seg_map_suffix`` is fixed to '.png'.
+
+    Args:
+        img_suffix (str): Suffix of images. Default: '.png'
+        seg_map_suffix (str): Suffix of segmentation maps. Default: '.png'
+    """
+    METAINFO = dict(classes=('normal', 'pneumothorax area'))
+
+    def __init__(self,
+                 img_suffix='.png',
+                 seg_map_suffix='.png',
+                 **kwargs) -> None:
+        super().__init__(
+            img_suffix=img_suffix,
+            seg_map_suffix=seg_map_suffix,
+            reduce_zero_label=False,
+            **kwargs)
diff --git a/projects/medical/2d_image/x_ray/chest_image_pneum/tools/prepare_dataset.py b/projects/medical/2d_image/x_ray/chest_image_pneum/tools/prepare_dataset.py
new file mode 100755
index 0000000000..47eddc96dc
--- /dev/null
+++ b/projects/medical/2d_image/x_ray/chest_image_pneum/tools/prepare_dataset.py
@@ -0,0 +1,73 @@
+import os
+
+import numpy as np
+import pandas as pd
+import pydicom
+from PIL import Image
+
+root_path = 'data/'
+img_suffix = '.dcm'
+seg_map_suffix = '.png'
+save_img_suffix = '.png'
+save_seg_map_suffix = '.png'
+
+x_train = []
+for fpath, dirname, fnames in os.walk('data/chestimage_train_datasets'):
+    for fname in fnames:
+        if fname.endswith(img_suffix):
+            x_train.append(os.path.join(fpath, fname))
+x_test = []
+for fpath, dirname, fnames in os.walk('data/chestimage_test_datasets/'):
+    for fname in fnames:
+        if fname.endswith(img_suffix):
+            x_test.append(os.path.join(fpath, fname))
+
+os.makedirs(root_path + 'images/train/', exist_ok=True)
+os.makedirs(root_path + 'images/test/', exist_ok=True)
+os.makedirs(root_path + 'masks/train/', exist_ok=True)
+
+
+def rle_decode(rle, width, height):
+    mask = np.zeros(width * height, dtype=np.uint8)
+    array = np.asarray([int(x) for x in rle.split()])
+    starts = array[0::2]
+    lengths = array[1::2]
+
+    current_position = 0
+    for index, start in enumerate(starts):
+        current_position += start  # offset from the end of the previous run
+        mask[current_position:current_position + lengths[index]] = 1
+        current_position += lengths[index]
+
+    return mask.reshape(width, height, order='F')  # column-major fill
+
+
+part_dir_dict = {0: 'train/', 1: 'test/'}
+dict_from_csv = pd.read_csv(
+    root_path + 'chestimage_train-rle_datasets.csv', sep=',',
+    index_col=0).to_dict()[' EncodedPixels']  # header has a leading space
+
+for ith, part in enumerate([x_train, x_test]):
+    part_dir = part_dir_dict[ith]
+    for img in part:
+        basename = os.path.basename(img)
+        img_id = '.'.join(basename.split('.')[:-1])
+        if ith == 0 and img_id not in dict_from_csv:
+            continue
+        image = pydicom.dcmread(img).pixel_array
+        save_img_path = root_path + 'images/' + part_dir + '.'.join(
+            basename.split('.')[:-1]) + save_img_suffix
+        print(save_img_path)
+        img_h, img_w = image.shape[:2]
+        image = Image.fromarray(image)
+        image.save(save_img_path)
+        if ith == 1:
+            continue
+        if dict_from_csv[img_id] == '-1':
+            mask = np.zeros((img_h, img_w), dtype=np.uint8)
+        else:
+            mask = rle_decode(dict_from_csv[img_id], img_h, img_w)
+        save_mask_path = root_path + 'masks/' + part_dir + '.'.join(
+            basename.split('.')[:-1]) + save_seg_map_suffix
+        mask = Image.fromarray(mask)
+        mask.save(save_mask_path)
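The `rle_decode` helper in `tools/prepare_dataset.py` reads the run-length string as (offset, length) pairs. Each offset is counted from the end of the previous run, not from the start of the image, and the flat mask is then reshaped column by column (`order='F'`). The worked example below reuses the function body from the script on a 3x3 toy mask; the RLE string `'1 2 3 1'` is made up for illustration and does not come from the dataset.

```python
import numpy as np


def rle_decode(rle, width, height):
    # Same logic as rle_decode in tools/prepare_dataset.py
    mask = np.zeros(width * height, dtype=np.uint8)
    array = np.asarray([int(x) for x in rle.split()])
    starts = array[0::2]
    lengths = array[1::2]

    current_position = 0
    for index, start in enumerate(starts):
        current_position += start  # offset from the end of the previous run
        mask[current_position:current_position + lengths[index]] = 1
        current_position += lengths[index]

    return mask.reshape(width, height, order='F')  # column-major fill


# Hypothetical RLE: skip 1 pixel, mark 2, skip 3, mark 1
toy_mask = rle_decode('1 2 3 1', 3, 3)
print(toy_mask)
```

Flat positions 1, 2 and 6 are set, so with column-major filling the first column holds two foreground pixels and the third column holds one.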