[Project] Medical semantic seg dataset: chest_image_pneum (#2727)
Showing 7 changed files with 343 additions and 0 deletions.
147 changes: 147 additions & 0 deletions
projects/medical/2d_image/x_ray/chest_image_pneum/README.md
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,147 @@ | ||
# Chest Image Dataset for Pneumothorax Segmentation | ||
|
||
## Description | ||
|
||
This project supports **`Chest Image Dataset for Pneumothorax Segmentation`**, which can be downloaded from [here](https://tianchi.aliyun.com/dataset/83075). | ||
|
||
### Dataset Overview | ||
|
||
Pneumothorax can be caused by a blunt chest injury or by damage from underlying lung disease, and in some cases it occurs for no obvious reason at all. A collapsed lung can occasionally be life-threatening. | ||
Pneumothorax is usually diagnosed by a radiologist on a chest X-ray and can be difficult to confirm. An accurate AI algorithm for detecting pneumothorax would be useful in many clinical scenarios: it could triage chest radiographs for priority interpretation or give non-radiologists a more confident diagnosis. | ||
|
||
The dataset is provided by the Society for Imaging Informatics in Medicine (SIIM), the American College of Radiology (ACR), the Society of Thoracic Radiology (STR), and MD.ai. It can be used to develop a model that classifies pneumothorax and, if present, segments it in chest radiographs. A successful model could aid the early recognition of pneumothoraces and save lives. | ||
|
||
### Original Statistic Information | ||
|
||
| Dataset name | Anatomical region | Task type | Modality | Num. Classes | Train/Val/Test Images | Train/Val/Test Labeled | Release Date | License | | ||
| --------------------------------------------------------------------- | ----------------- | ------------ | -------- | ------------ | --------------------- | ---------------------- | ------------ | ------------------------------------------------------------------ | | ||
| [pneumothorax segmentation](https://tianchi.aliyun.com/dataset/83075) | thorax | segmentation | x_ray | 2 | 12089/-/3205 | yes/-/no | - | [CC-BY-SA-NC 4.0](https://creativecommons.org/licenses/by-sa/4.0/) | | ||
|
||
| Class Name | Num. Train | Pct. Train | Num. Val | Pct. Val | Num. Test | Pct. Test | | ||
| :---------------: | :--------: | :--------: | :------: | :------: | :-------: | :-------: | | ||
| normal | 12089 | 99.75 | - | - | - | - | | ||
| pneumothorax area | 2669 | 0.25 | - | - | - | - | | ||
|
||
Note: | ||
|
||
- `Pct` is the percentage of all pixels in the split that belong to the given class. | ||
|
||
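As a rough illustration of how a `Pct` column of this kind could be computed, the sketch below counts pixels per class id over a set of masks. The function name `pixel_percentages` and the toy masks are hypothetical; the script actually used to produce the table is not part of this commit.

```python
import numpy as np


def pixel_percentages(masks, num_classes=2):
    """Return the share of pixels (in percent) that belongs to each class id."""
    counts = np.zeros(num_classes, dtype=np.int64)
    for mask in masks:
        # bincount tallies how often each class id occurs in the flattened mask
        counts += np.bincount(mask.ravel(), minlength=num_classes)[:num_classes]
    return 100.0 * counts / counts.sum()


# Two toy 4x4 masks: 0 = normal, 1 = pneumothorax area
toy_masks = [np.zeros((4, 4), dtype=np.uint8), np.zeros((4, 4), dtype=np.uint8)]
toy_masks[1][0, :2] = 1  # mark two pixels as pneumothorax
pct = pixel_percentages(toy_masks)
```

With 2 of the 32 toy pixels labeled as pneumothorax, `pct` is `[93.75, 6.25]`. The same strong imbalance, on a much larger scale, is what the 99.75 / 0.25 figures in the table express.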
### Visualization | ||
|
||
![chest_image_pneum](https://raw.githubusercontent.com/uni-medical/medical-datasets-visualization/main/2d/semantic_seg/x_ray/chest_image_pneum/chest_image_pneum_dataset.png) | ||
|
||
### Prerequisites | ||
|
||
- Python v3.8 | ||
- PyTorch v1.10.0 | ||
- [MIM](https://github.com/open-mmlab/mim) v0.3.4 | ||
- [MMCV](https://github.com/open-mmlab/mmcv) v2.0.0rc4 | ||
- [MMEngine](https://github.com/open-mmlab/mmengine) v0.2.0 or higher | ||
- [MMSegmentation](https://github.com/open-mmlab/mmsegmentation) v1.0.0rc5 | ||
|
||
All the commands below rely on a correctly configured `PYTHONPATH`, which should point to the project's directory so that Python can locate the module files. From the `chest_image_pneum/` root directory, run the following line to add the current directory to `PYTHONPATH`: | ||
|
||
```shell | ||
export PYTHONPATH=`pwd`:$PYTHONPATH | ||
``` | ||
|
||
### Dataset preparing | ||
|
||
- Download the dataset from [here](https://tianchi.aliyun.com/dataset/83075) and decompress it into `data/`. | ||
- Run `python tools/prepare_dataset.py` to convert the data and arrange the folder structure as shown below. | ||
- Run `python ../../tools/split_seg_dataset.py` to split the dataset and generate `train.txt`, `val.txt` and `test.txt`. If labels for the official validation and test sets are not available, `train.txt` and `val.txt` are generated by randomly splitting the training set. | ||
|
||
```none | ||
mmsegmentation | ||
├── mmseg | ||
├── projects | ||
│ ├── medical | ||
│ │ ├── 2d_image | ||
│ │ │ ├── x_ray | ||
│ │ │ │ ├── chest_image_pneum | ||
│ │ │ │ │ ├── configs | ||
│ │ │ │ │ ├── datasets | ||
│ │ │ │ │ ├── tools | ||
│ │ │ │ │ ├── data | ||
│ │ │ │ │ │ ├── train.txt | ||
│ │ │ │ │ │ ├── test.txt | ||
│ │ │ │ │ │ ├── images | ||
│ │ │ │ │ │ │ ├── train | ||
│ │ │ │ │ │ │ │ ├── xxx.png | ||
│ │ │ │ │ │ │ │ ├── ... | ||
│ │ │ │ │ │ │ │ └── xxx.png | ||
│ │ │ │ │ │ ├── masks | ||
│ │ │ │ │ │ │ ├── train | ||
│ │ │ │ │ │ │ │ ├── xxx.png | ||
│ │ │ │ │ │ │ │ ├── ... | ||
│ │ │ │ │ │ │ │ └── xxx.png | ||
``` | ||
|
||
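The random train/val split mentioned in the steps above can be sketched as follows. This is not the code of `split_seg_dataset.py`, which is not shown in this commit; the helper names, the fixed seed, the 80/20 ratio (which roughly matches the counts in the divided-dataset table) and the sample names are all assumptions made for illustration.

```python
import random


def split_train_val(names, val_ratio=0.2, seed=0):
    """Shuffle sample names and split them into train and val lists."""
    names = sorted(names)  # sort first so the result does not depend on input order
    random.Random(seed).shuffle(names)
    num_val = int(len(names) * val_ratio)
    return names[num_val:], names[:num_val]


def write_list(path, names):
    """Write one sample name per line, e.g. to train.txt or val.txt."""
    with open(path, "w") as f:
        f.write("\n".join(names) + "\n")


sample_names = [f"train/case_{i:03d}" for i in range(10)]  # made-up names
train_names, val_names = split_train_val(sample_names)
```

Calling `write_list('data/train.txt', train_names)` and `write_list('data/val.txt', val_names)` would then produce the two annotation files. Fixing the seed keeps the split reproducible across runs.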
### Divided Dataset Information | ||
|
||
***Note: The split in the table below was made by us and is not an official split.*** | ||
|
||
| Class Name | Num. Train | Pct. Train | Num. Val | Pct. Val | Num. Test | Pct. Test | | ||
| :---------------: | :--------: | :--------: | :------: | :------: | :-------: | :-------: | | ||
| normal | 9637 | 99.75 | 2410 | 99.74 | - | - | | ||
| pneumothorax area | 2137 | 0.25 | 532 | 0.26 | - | - | | ||
|
||
### Training commands | ||
|
||
Train models on a single server with one GPU. | ||
|
||
```shell | ||
mim train mmseg ./configs/${CONFIG_FILE} | ||
``` | ||
|
||
### Testing commands | ||
|
||
Test models on a single server with one GPU. | ||
|
||
```shell | ||
mim test mmseg ./configs/${CONFIG_FILE} --checkpoint ${CHECKPOINT_PATH} | ||
``` | ||
|
||
<!-- List the results as usually done in other model's README. [Example](https://github.com/open-mmlab/mmsegmentation/tree/dev-1.x/configs/fcn#results-and-models) | ||
You should claim whether this is based on the pre-trained weights, which are converted from the official release; or it's a reproduced result obtained from retraining the model in this project. --> | ||
|
||
## Results | ||
|
||
### Chest Image Dataset for Pneumothorax Segmentation | ||
|
||
| Method | Backbone | Crop Size | lr | mIoU | mDice | config | download | | ||
| :-------------: | :------: | :-------: | :----: | :--: | :---: | :------------------------------------------------------------------------------------: | :----------------------: | | ||
| fcn_unet_s5-d16 | unet | 512x512 | 0.01 | - | - | [config](./configs/fcn-unet-s5-d16_unet_1xb16-0.01-20k_chest-image-pneum-512x512.py) | [model](<>) \| [log](<>) | | ||
| fcn_unet_s5-d16 | unet | 512x512 | 0.001 | - | - | [config](./configs/fcn-unet-s5-d16_unet_1xb16-0.001-20k_chest-image-pneum-512x512.py) | [model](<>) \| [log](<>) | | ||
| fcn_unet_s5-d16 | unet | 512x512 | 0.0001 | - | - | [config](./configs/fcn-unet-s5-d16_unet_1xb16-0.0001-20k_chest-image-pneum-512x512.py) | [model](<>) \| [log](<>) | | ||
|
||
## Checklist | ||
|
||
- [x] Milestone 1: PR-ready, and acceptable to be one of the `projects/`. | ||
|
||
- [x] Finish the code | ||
|
||
- [x] Basic docstrings & proper citation | ||
|
||
- [x] Test-time correctness | ||
|
||
- [x] A full README | ||
|
||
- [x] Milestone 2: Indicates a successful model implementation. | ||
|
||
- [x] Training-time correctness | ||
|
||
- [ ] Milestone 3: Good to be a part of our core package! | ||
|
||
- [ ] Type hints and docstrings | ||
|
||
- [ ] Unit tests | ||
|
||
- [ ] Code polishing | ||
|
||
- [ ] Metafile.yml | ||
|
||
- [ ] Move your modules into the core package following the codebase's file hierarchy structure. | ||
|
||
- [ ] Refactor your modules into the core package following the codebase's file hierarchy structure. |
42 changes: 42 additions & 0 deletions
projects/medical/2d_image/x_ray/chest_image_pneum/configs/chest-image-pneum_512x512.py
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,42 @@ | ||
dataset_type = 'ChestImagePneumDataset' | ||
data_root = 'data/' | ||
img_scale = (512, 512) | ||
train_pipeline = [ | ||
    dict(type='LoadImageFromFile'), | ||
    dict(type='LoadAnnotations'), | ||
    dict(type='Resize', scale=img_scale, keep_ratio=False), | ||
    dict(type='RandomFlip', prob=0.5), | ||
    dict(type='PhotoMetricDistortion'), | ||
    dict(type='PackSegInputs') | ||
] | ||
test_pipeline = [ | ||
    dict(type='LoadImageFromFile'), | ||
    dict(type='Resize', scale=img_scale, keep_ratio=False), | ||
    dict(type='LoadAnnotations'), | ||
    dict(type='PackSegInputs') | ||
] | ||
train_dataloader = dict( | ||
    batch_size=16, | ||
    num_workers=4, | ||
    persistent_workers=True, | ||
    sampler=dict(type='InfiniteSampler', shuffle=True), | ||
    dataset=dict( | ||
        type=dataset_type, | ||
        data_root=data_root, | ||
        ann_file='train.txt', | ||
        data_prefix=dict(img_path='images/', seg_map_path='masks/'), | ||
        pipeline=train_pipeline)) | ||
val_dataloader = dict( | ||
    batch_size=1, | ||
    num_workers=4, | ||
    persistent_workers=True, | ||
    sampler=dict(type='DefaultSampler', shuffle=False), | ||
    dataset=dict( | ||
        type=dataset_type, | ||
        data_root=data_root, | ||
        ann_file='val.txt', | ||
        data_prefix=dict(img_path='images/', seg_map_path='masks/'), | ||
        pipeline=test_pipeline)) | ||
test_dataloader = val_dataloader | ||
val_evaluator = dict(type='IoUMetric', iou_metrics=['mIoU', 'mDice']) | ||
test_evaluator = dict(type='IoUMetric', iou_metrics=['mIoU', 'mDice']) |
18 changes: 18 additions & 0 deletions
...st_image_pneum/configs/fcn-unet-s5-d16_unet_1xb16-0.0001-20k_chest-image-pneum-512x512.py
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,18 @@ | ||
_base_ = [ | ||
    './chest-image-pneum_512x512.py', | ||
    'mmseg::_base_/models/fcn_unet_s5-d16.py', | ||
    'mmseg::_base_/default_runtime.py', | ||
    'mmseg::_base_/schedules/schedule_20k.py' | ||
] | ||
custom_imports = dict(imports='datasets.chest-image-pneum_dataset') | ||
img_scale = (512, 512) | ||
data_preprocessor = dict(size=img_scale) | ||
optimizer = dict(lr=0.0001) | ||
optim_wrapper = dict(optimizer=optimizer) | ||
model = dict( | ||
    data_preprocessor=data_preprocessor, | ||
    decode_head=dict(num_classes=2), | ||
    auxiliary_head=None, | ||
    test_cfg=dict(mode='whole', _delete_=True)) | ||
vis_backends = None | ||
visualizer = dict(vis_backends=vis_backends) |
18 changes: 18 additions & 0 deletions
...est_image_pneum/configs/fcn-unet-s5-d16_unet_1xb16-0.001-20k_chest-image-pneum-512x512.py
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,18 @@ | ||
_base_ = [ | ||
    './chest-image-pneum_512x512.py', | ||
    'mmseg::_base_/models/fcn_unet_s5-d16.py', | ||
    'mmseg::_base_/default_runtime.py', | ||
    'mmseg::_base_/schedules/schedule_20k.py' | ||
] | ||
custom_imports = dict(imports='datasets.chest-image-pneum_dataset') | ||
img_scale = (512, 512) | ||
data_preprocessor = dict(size=img_scale) | ||
optimizer = dict(lr=0.001) | ||
optim_wrapper = dict(optimizer=optimizer) | ||
model = dict( | ||
    data_preprocessor=data_preprocessor, | ||
    decode_head=dict(num_classes=2), | ||
    auxiliary_head=None, | ||
    test_cfg=dict(mode='whole', _delete_=True)) | ||
vis_backends = None | ||
visualizer = dict(vis_backends=vis_backends) |
18 changes: 18 additions & 0 deletions
...hest_image_pneum/configs/fcn-unet-s5-d16_unet_1xb16-0.01-20k_chest-image-pneum-512x512.py
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,18 @@ | ||
_base_ = [ | ||
    './chest-image-pneum_512x512.py', | ||
    'mmseg::_base_/models/fcn_unet_s5-d16.py', | ||
    'mmseg::_base_/default_runtime.py', | ||
    'mmseg::_base_/schedules/schedule_20k.py' | ||
] | ||
custom_imports = dict(imports='datasets.chest-image-pneum_dataset') | ||
img_scale = (512, 512) | ||
data_preprocessor = dict(size=img_scale) | ||
optimizer = dict(lr=0.01) | ||
optim_wrapper = dict(optimizer=optimizer) | ||
model = dict( | ||
    data_preprocessor=data_preprocessor, | ||
    decode_head=dict(num_classes=2), | ||
    auxiliary_head=None, | ||
    test_cfg=dict(mode='whole', _delete_=True)) | ||
vis_backends = None | ||
visualizer = dict(vis_backends=vis_backends) |
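The three learning-rate configs differ only in `optimizer = dict(lr=...)`; everything else is inherited from the `_base_` files. The sketch below is a simplified, plain-Python illustration of this kind of recursive override, including a `_delete_=True` key that replaces a base dict instead of merging into it. It is not MMEngine's implementation, and `merge_cfg` and the base values are hypothetical.

```python
import copy


def merge_cfg(base, override):
    """Recursively merge `override` into a copy of `base`."""
    merged = copy.deepcopy(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            value = dict(value)  # copy so the caller's dict is not mutated
            if value.pop("_delete_", False):
                merged[key] = value  # drop the base dict entirely
            else:
                merged[key] = merge_cfg(merged[key], value)
        else:
            if isinstance(value, dict):
                value = {k: v for k, v in value.items() if k != "_delete_"}
            merged[key] = value
    return merged


# Made-up base values, for illustration only
base = dict(
    optimizer=dict(type="SGD", lr=0.01, momentum=0.9),
    model=dict(test_cfg=dict(mode="slide", crop_size=(128, 128), stride=(85, 85))),
)
override = dict(
    optimizer=dict(lr=0.0001),
    model=dict(test_cfg=dict(mode="whole", _delete_=True)),
)
cfg = merge_cfg(base, override)
```

Here `cfg["optimizer"]` keeps the base `type` and `momentum` and only changes `lr`, while `cfg["model"]["test_cfg"]` becomes just `{"mode": "whole"}` because the base entry was deleted rather than merged.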
27 changes: 27 additions & 0 deletions
projects/medical/2d_image/x_ray/chest_image_pneum/datasets/chest-image-pneum_dataset.py
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,27 @@ | ||
from mmseg.datasets import BaseSegDataset | ||
from mmseg.registry import DATASETS | ||
|
||
|
||
@DATASETS.register_module() | ||
class ChestImagePneumDataset(BaseSegDataset): | ||
    """ChestImagePneumDataset dataset. | ||
    In segmentation map annotation for ChestImagePneumDataset, | ||
    ``reduce_zero_label`` is fixed to False. The ``img_suffix`` | ||
    is fixed to '.png' and ``seg_map_suffix`` is fixed to '.png'. | ||
    Args: | ||
        img_suffix (str): Suffix of images. Default: '.png' | ||
        seg_map_suffix (str): Suffix of segmentation maps. Default: '.png' | ||
    """ | ||
    METAINFO = dict(classes=('normal', 'pneumothorax area')) | ||
|
||
    def __init__(self, | ||
                 img_suffix='.png', | ||
                 seg_map_suffix='.png', | ||
                 **kwargs) -> None: | ||
        super().__init__( | ||
            img_suffix=img_suffix, | ||
            seg_map_suffix=seg_map_suffix, | ||
            reduce_zero_label=False, | ||
            **kwargs) |
73 changes: 73 additions & 0 deletions
projects/medical/2d_image/x_ray/chest_image_pneum/tools/prepare_dataset.py
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,73 @@ | ||
import os | ||
|
||
import numpy as np | ||
import pandas as pd | ||
import pydicom | ||
from PIL import Image | ||
|
||
root_path = 'data/' | ||
img_suffix = '.dcm' | ||
seg_map_suffix = '.png' | ||
save_img_suffix = '.png' | ||
save_seg_map_suffix = '.png' | ||
|
||
x_train = [] | ||
for fpath, dirname, fnames in os.walk('data/chestimage_train_datasets'): | ||
    for fname in fnames: | ||
        if fname.endswith('.dcm'): | ||
            x_train.append(os.path.join(fpath, fname)) | ||
x_test = [] | ||
for fpath, dirname, fnames in os.walk('data/chestimage_test_datasets/'): | ||
    for fname in fnames: | ||
        if fname.endswith('.dcm'): | ||
            x_test.append(os.path.join(fpath, fname)) | ||
|
||
# os.makedirs is portable and does not depend on a shell `mkdir` | ||
os.makedirs(root_path + 'images/train/', exist_ok=True) | ||
os.makedirs(root_path + 'images/test/', exist_ok=True) | ||
os.makedirs(root_path + 'masks/train/', exist_ok=True) | ||
|
||
|
||
def rle_decode(rle, height, width): | ||
    """Decode a relative run-length string into a (height, width) mask. | ||
|
||
    Each start value is an offset from the end of the previous run, and | ||
    pixels are numbered column by column (hence ``order='F'``). | ||
    """ | ||
    mask = np.zeros(height * width, dtype=np.uint8) | ||
    array = np.asarray([int(x) for x in rle.split()]) | ||
    starts = array[0::2] | ||
    lengths = array[1::2] | ||
|
||
    current_position = 0 | ||
    for index, start in enumerate(starts): | ||
        current_position += start | ||
        mask[current_position:current_position + lengths[index]] = 1 | ||
        current_position += lengths[index] | ||
|
||
    return mask.reshape(height, width, order='F') | ||
|
||
|
||
part_dir_dict = {0: 'train/', 1: 'test/'} | ||
dict_from_csv = pd.read_csv( | ||
    root_path + 'chestimage_train-rle_datasets.csv', sep=',', | ||
    index_col=0).to_dict()[' EncodedPixels'] | ||
|
||
for ith, part in enumerate([x_train, x_test]): | ||
    part_dir = part_dir_dict[ith] | ||
    for img in part: | ||
        basename = os.path.basename(img) | ||
        img_id = '.'.join(basename.split('.')[:-1]) | ||
        # skip training images that have no entry in the RLE csv | ||
        if ith == 0 and (img_id not in dict_from_csv.keys()): | ||
            continue | ||
        # pydicom.dcmread replaces the deprecated pydicom.read_file | ||
        image = pydicom.dcmread(img).pixel_array | ||
        save_img_path = root_path + 'images/' + part_dir + '.'.join( | ||
            basename.split('.')[:-1]) + save_img_suffix | ||
        print(save_img_path) | ||
        img_h, img_w = image.shape[:2] | ||
        image = Image.fromarray(image) | ||
        image.save(save_img_path) | ||
        # test images have no masks, so only the image is saved | ||
        if ith == 1: | ||
            continue | ||
        if dict_from_csv[img_id] == '-1': | ||
            mask = np.zeros((img_h, img_w), dtype=np.uint8) | ||
        else: | ||
            mask = rle_decode(dict_from_csv[img_id], img_h, img_w) | ||
        save_mask_path = root_path + 'masks/' + part_dir + '.'.join( | ||
            basename.split('.')[:-1]) + save_seg_map_suffix | ||
        mask = Image.fromarray(mask) | ||
        mask.save(save_mask_path) |
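To make the relative run-length format handled by `rle_decode` concrete, here is a small worked example on a 3x4 mask. It re-implements the same decoding logic in a standalone function (`decode_relative_rle` is a hypothetical name) and assumes, as the script does, that each start value is an offset from the end of the previous run and that pixels are numbered column by column.

```python
import numpy as np


def decode_relative_rle(rle, height, width):
    """Decode a relative, column-major RLE string into a (height, width) mask."""
    flat = np.zeros(height * width, dtype=np.uint8)
    values = [int(x) for x in rle.split()]
    position = 0
    for offset, length in zip(values[0::2], values[1::2]):
        position += offset                    # skip the gap since the last run
        flat[position:position + length] = 1  # fill the run
        position += length
    # order='F' places consecutive flat indices down each column
    return flat.reshape(height, width, order="F")


# "1 2 3 1": skip 1 pixel, fill 2, skip 3, fill 1
mask = decode_relative_rle("1 2 3 1", height=3, width=4)
```

The flat indices 1, 2 and 6 are set, so the decoded mask is `[[0, 0, 1, 0], [1, 0, 0, 0], [1, 0, 0, 0]]`: the first run fills the lower two pixels of column 0, and the second run fills the top pixel of column 2.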