This repository contains the official code for the paper:
A Good Feature Extractor Is All You Need for Weakly Supervised Pathology Slide Classification
Georg Wölflein, Dyke Ferber, Asier Rabasco Meneghetti, Omar S. M. El Nahhas, Daniel Truhn, Zunamys I. Carrero, David J. Harrison, Ognjen Arandjelović and Jakob N. Kather
arXiv, Nov 2023.
Abstract:
Stain normalisation is thought to be a crucial preprocessing step in computational pathology pipelines. We question this belief in the context of weakly supervised whole slide image classification, motivated by the emergence of powerful feature extractors trained using self-supervised learning on diverse pathology datasets. To this end, we performed the most comprehensive evaluation of publicly available pathology feature extractors to date, involving more than 8,000 training runs across nine tasks, five datasets, three downstream architectures, and various preprocessing setups. Notably, we find that omitting stain normalisation and image augmentations does not compromise downstream slide-level classification performance, while incurring substantial savings in memory and compute. Using a new evaluation metric that facilitates relative downstream performance comparison, we identify the best publicly available extractors, and show that their latent spaces are remarkably robust to variations in stain and augmentations like rotation. Contrary to previous patch-level benchmarking studies, our approach emphasises clinical relevance by focusing on slide-level biomarker prediction tasks in a weakly supervised setting with external validation cohorts. Our findings stand to streamline digital pathology workflows by minimising preprocessing needs and informing the selection of feature extractors.
- We compare 14 feature extractors, and find that UNI, CTransPath and Lunit's DINO produce the best representations for downstream weakly supervised slide classification tasks.
- We show that stain normalisation and image augmentations can be omitted without compromising downstream performance.
Note
June 2024: We released an extended version of our preprint that includes two additional feature extractors (UNI and ViT-L), alongside extensive additional experiments at
Note
March 2024: We updated our preprint to include two additional feature extractors: Phikon-Teacher and Lunit-MoCo.
If you find this useful, please cite:
@misc{wolflein2023good,
title = {A Good Feature Extractor Is All You Need for Weakly Supervised Pathology Slide Classification},
author = {W\"{o}lflein, Georg and Ferber, Dyke and Meneghetti, Asier Rabasco and El Nahhas, Omar S. M. and Truhn, Daniel and Carrero, Zunamys I. and Harrison, David J. and Arandjelovi\'{c}, Ognjen and Kather, Jakob N.},
journal = {arXiv:2311.11772},
year = {2023},
}