ENH: Enable heatmaps when tiling on the fly #491
base: main
Conversation
…d for homogeneity
for more information, see https://pre-commit.ci
Codecov Report
Flags with carried forward coverage won't be shown.
Looking good, I left some suggestions.
Resolved (outdated) review threads on hi-ml-histopathology/src/histopathology/configs/classification/DeepSMILEPanda.py and hi-ml-histopathology/src/histopathology/datamodules/base_module.py.
    return faulty_slides_idx

def get_slide_patch_coordinates(
    self, slide_offset: List, patches_location: List, patch_size: List
Can you add a docstring, please?
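For reference, one possible shape for it (a sketch only; the coordinate-order assumption is taken from the arithmetic in get_patch_coordinate further down, and the exact wording is for the author to decide):

```python
from typing import List, Tuple


def get_slide_patch_coordinates(
    self, slide_offset: List, patches_location: List, patch_size: List
) -> Tuple[List, List, List, List]:
    """Compute absolute coordinates for all patches in a slide.

    :param slide_offset: offset of the slide within the whole-slide image, as (vertical, horizontal).
    :param patches_location: per-patch locations relative to the slide offset, one
        (vertical, horizontal) pair per patch.
    :param patch_size: size of a patch, as (height, width).
    :return: four lists (top, bottom, left, right) with one absolute coordinate per patch.
    """
```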
results.update(metadata_dict)
# each slide can have a different number of patches
for i in range(n_slides):
    updated_metadata_dict = self.compute_slide_metadata(batch, i, metadata_dict)
I wonder if all this metadata processing should be wrapped into a transform. The data is prefetched anyway, so it should be more efficient to apply it as a transform that is multi-processed by the dataloader. Also, from a design perspective, the model shouldn't be handling any metadata processing...
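As a rough illustration of that idea (the class below is hypothetical; a plain callable is enough for a monai Compose pipeline, although it could also subclass monai.transforms.MapTransform like the existing transforms):

```python
from typing import Any, Dict, Hashable


class ComputePatchCoordinatesd:
    """Hypothetical dictionary transform that attaches the per-patch metadata to each sample."""

    def __call__(self, data: Dict[Hashable, Any]) -> Dict[Hashable, Any]:
        sample = dict(data)  # copy, so the incoming sample is not mutated
        # Run here the same arithmetic that compute_slide_metadata currently runs inside the
        # model, using the slide offset, patch locations and patch size already present in
        # the sample, and write top/bottom/left/right back into it.  Because transforms
        # execute in the dataloader workers, the processing is multi-processed for free.
        return sample
```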
@@ -72,6 +72,8 @@ def normalize_dict_for_df(dict_old: Dict[ResultsKey, Any]) -> Dict[str, Any]:
value = value.squeeze(0).cpu().numpy()
if value.ndim == 0:
    value = np.full(bag_size, fill_value=value)
if isinstance(value, List) and isinstance(value[0], torch.Tensor):
Can you explain why we need to do that here? Is it for the coordinates? Maybe turn them into numpy arrays in the first place?
    self, slide_offset: List, patches_location: List, patch_size: List
) -> Tuple[List, List, List, List]:
    """ computing absolute coordinates for all patches in a slide"""
    top, bottom, left, right = self.get_empty_lists(len(patches_location), 4)
Might this be cleaner using numpy arrays, perhaps?
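For example, under the assumption (taken from get_patch_coordinate below) that index 0 is the vertical axis and index 1 the horizontal one, the loop and the get_empty_lists call could collapse to something like this (shown as a free function for brevity):

```python
from typing import List, Tuple

import numpy as np


def get_slide_patch_coordinates(
    slide_offset: List, patches_location: List, patch_size: List
) -> Tuple[np.ndarray, np.ndarray, np.ndarray, np.ndarray]:
    """Compute absolute coordinates for all patches in a slide (vectorised sketch)."""
    offset = np.asarray(slide_offset)          # shape (2,)
    locations = np.asarray(patches_location)   # shape (n_patches, 2)
    size = np.asarray(patch_size)              # shape (2,)
    top, left = (offset + locations).T         # row 0 -> vertical, row 1 -> horizontal
    bottom, right = (offset + locations + size).T
    return top, bottom, left, right
```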
@dccastro, I like your suggestion above to use the `Box` class instead. Much cleaner than using a 4-tuple of ints; that's just asking for errors to happen.
    return ll

@staticmethod
def get_patch_coordinate(slide_offset: List, patch_location: List, patch_size: List) -> Tuple[int, int, int, int]:
What would you think of using our `Box` class?
+1 on that. Tuples with 4 elements of the same type are just too easy to mix up

def compute_slide_metadata(self, batch: Dict, index: int, metadata_dict: Dict) -> Dict:
    """compute patch-dependent and patch-invariante metadata for a single slide """
    offset = batch[SlideKey.OFFSET.value][index]
[Minor] The `.value` shouldn't be necessary.
    top[i], bottom[i], left[i], right[i] = self.get_patch_coordinate(slide_offset, location, patch_size)
return top, bottom, left, right

def compute_slide_metadata(self, batch: Dict, index: int, metadata_dict: Dict) -> Dict: |
Why does this method need to mutate the input `metadata_dict` in-place, instead of returning a new dictionary?
+1. If it needs to mutate in place, then the signature and name of the function should reflect that mutation (`update_metadata`, `-> None`).
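Both shapes are cheap to write; a sketch of the two options being discussed, with illustrative names only:

```python
from typing import Dict


def compute_slide_metadata(batch: Dict, index: int) -> Dict:
    """Pure variant: build and return a fresh metadata dictionary for one slide."""
    metadata: Dict = {}
    # ... fill in coordinates, image path, etc. for the slide at `index` ...
    return metadata


def update_metadata(metadata_dict: Dict, batch: Dict, index: int) -> None:
    """In-place variant: the name and the `-> None` return type make the mutation explicit."""
    # ... append this slide's values to the lists already held in metadata_dict ...
```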
],
if all(key.value in batch.keys() for key in [SlideKey.OFFSET, SlideKey.PATCH_LOCATION, SlideKey.PATCH_SIZE]):
    n_slides = len(batch[SlideKey.SLIDE_ID])
    metadata_dict = {
Couldn't this be replaced by just adding `if key not in results: results[key] = []` in the loop below?
ResultsKey.IMAGE_PATH: [
    [img_path] * bag_sizes[i] for i, img_path in enumerate(batch[SlideKey.IMAGE_PATH])
],
if all(key.value in batch.keys() for key in [SlideKey.OFFSET, SlideKey.PATCH_LOCATION, SlideKey.PATCH_SIZE]):
I think a set operation would be easier to understand: `set([SlideKey.OFFSET, SlideKey....]) <= set(batch.keys())`.
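For instance, keeping the `.value` convention the surrounding code already uses (the helper name and import path below are illustrative assumptions):

```python
# Assumed import path for SlideKey; adjust to wherever it actually lives in the repo.
from histopathology.utils.naming import SlideKey


def has_tiling_metadata(batch: dict) -> bool:
    """True if the batch carries everything needed to compute absolute patch coordinates."""
    required = {SlideKey.OFFSET.value, SlideKey.PATCH_LOCATION.value, SlideKey.PATCH_SIZE.value}
    return required <= batch.keys()
```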
@@ -72,6 +72,8 @@ def normalize_dict_for_df(dict_old: Dict[ResultsKey, Any]) -> Dict[str, Any]:
value = value.squeeze(0).cpu().numpy()
if value.ndim == 0:
    value = np.full(bag_size, fill_value=value)
if isinstance(value, List) and isinstance(value[0], torch.Tensor):
This function could do with a lot more documentation. A docstring would be great. And each of the `if` branches should have a clear description of which case we are handling here, and where those cases arise.
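A sketch of the kind of inline documentation being asked for, wrapped in a small helper to keep it self-contained (the `isinstance(value, torch.Tensor)` guard is an assumption about the surrounding code, and the "why" comments are guesses for the author to confirm or correct):

```python
from typing import Any

import numpy as np
import torch


def _normalize_value(value: Any, bag_size: int) -> Any:
    if isinstance(value, torch.Tensor):
        # Batched tensors carry a leading batch dimension of 1: drop it and move to CPU/numpy.
        value = value.squeeze(0).cpu().numpy()
        if value.ndim == 0:
            # A scalar (e.g. one label for the whole bag) is repeated once per tile.
            value = np.full(bag_size, fill_value=value)
    elif isinstance(value, list) and isinstance(value[0], torch.Tensor):
        # A list of 0-d tensors (e.g. the new per-tile coordinates?) becomes plain Python numbers.
        value = [v.item() for v in value]
    return value
```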
@@ -33,6 +33,47 @@ dependencies:
  - ruamel.yaml==0.16.12
  - tensorboard==2.6.0
  # Histopathology requirements
  - coloredlogs==15.0.1
Changes to `primary_deps.yml` should be made via `requirements_run.txt`.
    If only one float number is given, it will be applied to all dimensions. Defaults to 0.0.
:param intensity_threshold: a value to keep only the patches whose sum of intensities are less than the
    threshold. Defaults to no filtering.
:pad_mode: refer to NumpyPadMode and PytorchPadMode. If None, no padding will be applied.
Suggested change:
-:pad_mode: refer to NumpyPadMode and PytorchPadMode. If None, no padding will be applied.
+:param pad_mode: refer to NumpyPadMode and PytorchPadMode. If `None`, no padding will be applied.
monai transform for tiling on the fly.
:param filter_mode: when `num_patches` is provided, it determines if keep patches with highest values
    (`"max"`), lowest values (`"min"`), or in their default order (`None`). Default to None.
:param overlap: the amount of overlap of neighboring patches in each dimension (a value between 0.0 and 1.0).
What's the order of values? (width, height) or the other way around?
max_offset = None if (self.random_offset and stage == ModelKey.TRAIN) else 0

if stage != ModelKey.TRAIN:
    grid_transform = RandGridPatchd(
Something I don't get here: We are using random tiles when we are NOT training?
bottom = slide_offset[0] + patch_location[0] + patch_size[0]
left = slide_offset[1] + patch_location[1]
right = slide_offset[1] + patch_location[1] + patch_size[1]
return top, bottom, left, right
Tuples of 4 integers are really error-prone. Can we use the `Box` class instead?
(maybe changing it to use top, bottom, left, right in the same wash-up?)
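A sketch of that idea, with a stand-in dataclass because the exact fields of the repository's Box class aren't shown in this thread:

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Box:
    """Stand-in for the repository's Box class; the real field names and order may differ."""
    x: int
    y: int
    w: int
    h: int


def get_patch_box(slide_offset: List, patch_location: List, patch_size: List) -> Box:
    # Same arithmetic as above, but returned as a named structure instead of a 4-tuple,
    # so callers cannot silently swap top/bottom/left/right.
    return Box(x=slide_offset[1] + patch_location[1],
               y=slide_offset[0] + patch_location[0],
               w=patch_size[1],
               h=patch_size[0])
```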
@staticmethod
def expand_slide_constant_metadata(id: str, path: str, n_patches: int, top: List[int],
                                   bottom: List[int], left: List[int], right: List[int]) -> Tuple[List, List, List]:
    """Duplicate metadata that is patch invariant to match the shape of other arrays"""
Can you expand the documentation a bit here? Also, "match the shape of other arrays" is not completely correct; it is matching the number given in `n_patches` (and assumes that many arrays have matching lengths).
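Perhaps something along these lines, based on the signature quoted above (wording is only a suggestion):

```python
"""Replicate metadata that is constant across a slide (its id and image path) so that it
lines up with the per-patch lists.

:param n_patches: how many copies of each constant value to produce; the top/bottom/left/right
    lists are assumed to already contain this many entries each.
"""
```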
@@ -72,6 +72,8 @@ def normalize_dict_for_df(dict_old: Dict[ResultsKey, Any]) -> Dict[str, Any]:
value = value.squeeze(0).cpu().numpy()
if value.ndim == 0:
    value = np.full(bag_size, fill_value=value)
if isinstance(value, List) and isinstance(value[0], torch.Tensor):
    value = [value[i].item() for i in range(len(value))]
Suggested change:
-value = [value[i].item() for i in range(len(value))]
+value = [v.item() for v in value]
@@ -39,7 +39,7 @@ def __getitem__(self, index: int) -> List[Dict[SlideKey, Any]]:


 @pytest.mark.parametrize("random_n_tiles", [False, True])
-def test_image_collate(random_n_tiles: bool) -> None:
+def test_array_collate(random_n_tiles: bool) -> None:
There is no test coverage for the new functionality?
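If it helps, one obvious candidate is a direct test of the coordinate arithmetic; the import path and class name below are assumptions and would need to match the real code:

```python
# Assumed location of the new static method; adjust the import to the actual class.
from histopathology.datamodules.base_module import TilesDataModule


def test_get_patch_coordinate() -> None:
    slide_offset, patch_location, patch_size = [10, 20], [3, 4], [5, 6]
    top, bottom, left, right = TilesDataModule.get_patch_coordinate(slide_offset, patch_location, patch_size)
    # Index 0 is treated as the vertical axis and index 1 as the horizontal one,
    # matching the arithmetic in get_patch_coordinate quoted earlier in this review.
    assert (top, bottom, left, right) == (13, 18, 24, 30)
```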
In this PR we enable heatmap outputs when tiling on the fly. Specifically: