PanNuke Dataset

Dataset details

Last updated: 15 Dec 2022
Meta Album ID MCR.PNU
Domain ID MCR
Domain Name Microscopic
Set Number 1
Dataset ID PNU
Dataset Name PanNuke
Short Description 19 Human Tissues Dataset
Long Description The PanNuke dataset (https://jgamper.github.io/PanNukeDataset/) is a semi-automatically generated dataset for nuclei segmentation and classification. It contains 7 753 images covering 19 different tissue types. Although PanNuke was designed as a segmentation task, the tissue type is available for each sample, so for the Meta-Album meta-dataset we transformed it into a tissue classification task. We also resized the images to 128x128 pixels and applied stain normalization to avoid bias and remove some spurious features.
# Classes 19
# Images 5530
Keywords microscopic, human tissues
Data Format images
Image size 128x128
License (original data release) Attribution-NonCommercial-ShareAlike 4.0 International
License URL (original data release) https://warwick.ac.uk/fac/cross_fac/tia/data/pannuke, https://creativecommons.org/licenses/by-nc-sa/4.0/
License (Meta-Album data release) Attribution-NonCommercial-ShareAlike 4.0 International
License URL (Meta-Album data release) https://creativecommons.org/licenses/by-nc-sa/4.0/
Source PanNuke: An Open Pan-Cancer Histology Dataset for Nuclei Instance Segmentation and Classification
Source URL https://jgamper.github.io/PanNukeDataset/
Original Author Gamper, Jevgenij and Koohbanani, Navid Alemi and Benet, Ksenija and Khuram, Ali and Rajpoot, Nasir
Original contact j.gamper@warwick.ac.uk
Meta Album author Romain Mussard
Created Date 01 March 2022
Contact Name Ihsan Ullah
Contact Email meta-album@chalearn.org
Contact URL https://meta-album.github.io/
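
The Long Description above says the segmentation data was recast as tissue classification because each sample carries a tissue type. The snippet below is a minimal sketch of that relabeling idea, not the Meta-Album pipeline itself: the file names, the tissue names, and the `tissue_labels` helper are hypothetical and serve only to illustrate the mapping from tissue type to class index.

```python
from collections import Counter

# Hypothetical (file name, tissue type) pairs standing in for real samples.
samples = [
    ("img_0001.png", "Breast"),
    ("img_0002.png", "Colon"),
    ("img_0003.png", "Breast"),
    ("img_0004.png", "Lung"),
]


def tissue_labels(samples):
    """Assign an integer class index to each tissue type and label every sample."""
    classes = sorted({tissue for _, tissue in samples})
    class_to_index = {tissue: i for i, tissue in enumerate(classes)}
    labels = [(name, class_to_index[tissue]) for name, tissue in samples]
    return class_to_index, labels


class_to_index, labels = tissue_labels(samples)

# Per-class image counts, useful for checking class balance.
counts = Counter(tissue for _, tissue in samples)
```

Sorting the tissue names before indexing keeps the class indices stable across runs, which matters when labels are written to disk and read back later.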

Download Dataset from OpenML

Dataset Version    OpenML ID
Micro              44312
Mini               44297
Extended           44330

Code to download dataset using OpenML API

      import openml

      # OpenML ID of the version to download (see the table above); 44312 is Micro
      DATASET_ID = 44312

      # Fetch the dataset from OpenML
      dataset = openml.datasets.get_dataset(DATASET_ID)

      # Display the dataset name
      print(dataset.name)
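
Since the three versions differ only in their OpenML ID, it can be convenient to select one by name. The following is a small sketch under that assumption: the IDs are copied from the table above, while `OPENML_IDS`, `resolve_openml_id`, and `fetch_version` are hypothetical helper names, not part of Meta-Album or the OpenML package.

```python
# OpenML IDs copied from the version table above.
OPENML_IDS = {"micro": 44312, "mini": 44297, "extended": 44330}


def resolve_openml_id(version: str) -> int:
    """Return the OpenML ID for a version name (case-insensitive)."""
    key = version.strip().lower()
    if key not in OPENML_IDS:
        raise ValueError(
            f"Unknown version {version!r}; expected one of {sorted(OPENML_IDS)}"
        )
    return OPENML_IDS[key]


def fetch_version(version: str):
    """Download the requested version; needs the openml package and network access."""
    import openml  # imported lazily so the lookup works without openml installed

    return openml.datasets.get_dataset(resolve_openml_id(version))
```

For example, `fetch_version("Mini")` would request dataset 44297, and an unrecognized name fails early with a `ValueError` instead of sending a request to OpenML.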
              

Cite this dataset

@inproceedings{gamper2019pannuke,
  title={PanNuke: an open pan-cancer histology dataset for nuclei instance segmentation and classification},
  author={Gamper, Jevgenij and Koohbanani, Navid Alemi and Benet, Ksenija and Khuram, Ali and Rajpoot, Nasir},
  booktitle={European Congress on Digital Pathology},
  pages={11--19},
  year={2019},
  organization={Springer}
}

@article{gamper2020pannuke,
  title={PanNuke Dataset Extension, Insights and Baselines},
  author={Gamper, Jevgenij and Koohbanani, Navid Alemi and Graham, Simon and Jahanifar, Mostafa and Khurram, Syed Ali and Azam, Ayesha and Hewitt, Katherine and Rajpoot, Nasir},
  journal={arXiv preprint arXiv:2003.10778},
  year={2020}
}
              

Cite Meta-Album

  @inproceedings{meta-album-2022,
    title={Meta-Album: Multi-domain Meta-Dataset for Few-Shot Image Classification},
    author={Ullah, Ihsan and Carrion, Dustin and Escalera, Sergio and Guyon, Isabelle M and Huisman, Mike and Mohr, Felix and van Rijn, Jan N and Sun, Haozhe and Vanschoren, Joaquin and Vu, Phan Anh},
    booktitle={Thirty-sixth Conference on Neural Information Processing Systems Datasets and Benchmarks Track},
    url = {https://meta-album.github.io/},
    year = {2022}
  }
              