A meta-dataset for
few-shot image classification

few-shot learning, meta-learning, continual learning, transfer learning, image classification

About

Meta-Album is a meta-dataset created for few-shot learning, meta-learning, continual learning, and related research. It consists of 40 datasets from 10 domains, arranged in sets of 10 datasets with one dataset from each domain. Meta-Album is continuously growing. See the Datasets section for details.

We repurposed datasets that were generously made available by their original creators; see the credits page. All datasets are free to use for academic purposes, provided that proper credit is given. For your convenience, you may cite our paper, which references all original creators.

License

Meta-Album is released under a CC BY-NC 4.0 license, which permits non-commercial use for research purposes, provided that you cite us. Additionally, redistributed datasets have their own licenses; see the credits page. All resources made available through this website are provided “as is”. The curators of Meta-Album (and their home institutions and sponsors) who have worked on its preparation, this website, and the code provided to read and process data and to run baseline methods make no warranties concerning the licensed material, including fitness for any purpose, non-infringement, absence of defects or errors, and accuracy, and they decline any liability for losses or other consequences that may arise from using such material.
This briefly summarizes the terms of the CC BY-NC 4.0 license and the disclaimer it includes.

Recommended use

The recommended use of Meta-Album is to conduct fundamental research on machine learning algorithms and to run benchmarks, particularly in few-shot learning, meta-learning, continual learning, transfer learning, and image classification.

Code

We provide code in our GitHub Repository for

  1. Data processing
  2. Data formatting
  3. Quality control
  4. Meta-Album use cases
  5. Challenge winners' code (NeurIPS 2021 MetaDL competition)

Visit our GitHub repository for more details.

Datasets

The tables below list the statistics of the original datasets. We generally chose image classification datasets with at least 20 classes and at least 40 examples per class. Each dataset was converted to 128x128 pixel images. The data is available in 4 versions:

  1. Original data (from the original creators’ website)
  2. Meta-Album extended = all classes that have at least 40 examples, images of 128x128 pixels
  3. Meta-Album mini = same as Meta-Album extended, but with only 40 randomly sampled examples per class (hence the datasets are class-balanced)
  4. Meta-Album micro = same as Meta-Album mini, but with only 20 randomly selected classes
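
The relationship between the extended, mini, and micro versions can be sketched in a few lines of Python. This is only an illustration of the selection rules listed above, not the curators' actual processing code (which lives in the GitHub repository). The function name `build_versions`, its parameters, and the fixed random seed are assumptions made for this example.

```python
import random
from collections import defaultdict


def build_versions(labels, min_per_class=40, micro_classes=20, seed=0):
    """Derive extended / mini / micro index sets from one label per image.

    `labels[i]` is the class of image i. Each returned dict maps a class
    name to the list of image indices kept for that class.
    Hypothetical helper: names and defaults are illustrative only.
    """
    rng = random.Random(seed)

    # Group image indices by class.
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    # Extended: keep every class that has at least `min_per_class` examples.
    extended = {c: idxs for c, idxs in by_class.items() if len(idxs) >= min_per_class}

    # Mini: sample exactly `min_per_class` examples from each kept class.
    mini = {c: sorted(rng.sample(extended[c], min_per_class)) for c in sorted(extended)}

    # Micro: keep a random subset of classes from mini (all of them if fewer exist).
    chosen = rng.sample(sorted(mini), min(micro_classes, len(mini)))
    micro = {c: mini[c] for c in chosen}

    return extended, mini, micro
```

With the default values, every class in the mini result holds exactly 40 indices, and the micro result reuses the mini samples for at most 20 classes.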

Download Instructions

Our datasets are hosted on the OpenML platform. The following code will help you download them.
Install OpenML for Python (the package is published on PyPI as openml, so pip install openml should work)

Download using the OpenML Python API

    import openml

    # Download the dataset with the given DATASET_ID.
    # Replace DATASET_ID with the ID shown on the dataset's detail page.
    dataset = openml.datasets.get_dataset(DATASET_ID, download_data=True, download_all_files=True)

    # display dataset info
    print(dataset.name)
            

Datasets are downloaded to the OpenML cache directory. You can print its location with this code:

    # display openml cache directory
    print(openml.config.cache_directory)
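
To check what a download actually placed in the cache, it can help to count the files found under that directory. The sketch below uses only the Python standard library and takes the directory path as an argument. The helper name `summarize_directory` is hypothetical, and the sketch makes no assumption about how OpenML or Meta-Album lay out the files.

```python
from collections import Counter
from pathlib import Path


def summarize_directory(root):
    """Count the files under `root` (recursively), grouped by lowercase extension.

    Files without an extension are counted under the key "<none>".
    """
    counts = Counter()
    for path in Path(root).rglob("*"):
        if path.is_file():
            counts[path.suffix.lower() or "<none>"] += 1
    return counts
```

Passing the cache directory printed above to this function gives a quick overview of how many image and metadata files were downloaded.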
            
Datasets Set-0, release date: 06 June 2022

| Meta Album ID | Domain | Original Dataset | # Classes | # Images |
|---|---|---|---|---|
| LR_AM.BRD | Large Animals | Birds | 315 | 62,454 |
| SM_AM.PLK | Small Animals | Plankton | 102 | 477,513 |
| PLT.FLW | Plants | Flowers | 102 | 13,069 |
| PLT_DIS.PLT_VIL | Plant Diseases | Plant Village | 38 | 56,625 |
| MCR.BCT | Microscopy | Bacteria | 33 | 6,180 |
| REM_SEN.RESISC | Remote Sensing | RESISC | 45 | 34,100 |
| VCL.CRS | Vehicles | Cars | 196 | 24,825 |
| MNF.TEX | Manufacturing | Textures | 64 | 12,035 |
| HUM_ACT.SPT | Human Actions | 73 Sports | 73 | 14,136 |
| OCR.MD_MIX | OCR | Omniprint-MD-mix | 706 | 29,040 |

Datasets Set-1, release date: 28 November 2022

| Meta Album ID | Domain | Original Dataset | # Classes | # Images |
|---|---|---|---|---|
| LR_AM.DOG | Large Animals | Dogs | 120 | 26,080 |
| SM_AM.INS_2 | Small Animals | Insects 2 | 102 | 80,102 |
| PLT.PLT_NET | Plants | PlantNet | 25 | 122,488 |
| PLT_DIS.MED_LF | Plant Diseases | Medicinal Leaf | 26 | 3,396 |
| MCR.PNU | Microscopy | PanNuke | 19 | 7,090 |
| REM_SEN.RSICB | Remote Sensing | RSICB | 45 | 39,307 |
| VCL.APL | Vehicles | Airplanes | 21 | 11,265 |
| MNF.TEX_DTD | Manufacturing | Textures DTD | 47 | 8,320 |
| HUM_ACT.ACT_40 | Human Actions | Stanford 40 Actions | 40 | 5,749 |
| OCR.MD_5_BIS | OCR | Omniprint-MD-5-bis | 706 | 29,040 |

Datasets Set-2, release date: 28 November 2022

| Meta Album ID | Domain | Original Dataset | # Classes | # Images |
|---|---|---|---|---|
| LR_AM.AWA | Large Animals | Animals with Attributes | 50 | 40,118 |
| SM_AM.INS | Small Animals | Insects | 117 | 175,416 |
| PLT.FNG | Plants | Fungi | 25 | 16,922 |
| PLT_DIS.PLT_DOC | Plant Diseases | PlantDoc | 27 | 4,429 |
| MCR.PRT | Microscopy | Subcel. Human Protein | 21 | 16,690 |
| REM_SEN.RSD | Remote Sensing | RSD | 43 | 46,145 |
| VCL.BTS | Vehicles | Boats | 26 | 140,207 |
| MNF.TEX_ALOT | Manufacturing | Textures ALOT | 250 | 35,800 |
| HUM_ACT.ACT_410 | Human Actions | MPII Human Pose | 29 | 4,362 |
| OCR.MD_6 | OCR | Omniprint-MD-6 | 703 | 28,920 |

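
The Meta Album IDs listed above appear to follow a DOMAIN.DATASET pattern (for example, PLT_DIS.PLT_VIL for Plant Village in the Plant Diseases domain). Assuming that pattern holds, which is inferred from the listed IDs rather than stated by the curators, the following sketch splits an ID into its two parts and checks that a set contains exactly one dataset per domain. The helper names `split_id` and `one_per_domain` are hypothetical.

```python
from collections import Counter

# IDs copied from the Set-0 listing above.
SET_0 = [
    "LR_AM.BRD", "SM_AM.PLK", "PLT.FLW", "PLT_DIS.PLT_VIL", "MCR.BCT",
    "REM_SEN.RESISC", "VCL.CRS", "MNF.TEX", "HUM_ACT.SPT", "OCR.MD_MIX",
]


def split_id(meta_album_id):
    """Split 'DOMAIN.DATASET' at the first dot (assumed ID convention)."""
    domain, _, dataset = meta_album_id.partition(".")
    if not domain or not dataset:
        raise ValueError(f"unexpected Meta Album ID: {meta_album_id!r}")
    return domain, dataset


def one_per_domain(ids):
    """Return True if no domain prefix occurs more than once in `ids`."""
    counts = Counter(split_id(i)[0] for i in ids)
    return all(n == 1 for n in counts.values())
```

A check like `one_per_domain(SET_0)` is a cheap sanity test when assembling a custom set of datasets for cross-domain experiments.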
Datasets Set-3, release date: June 2023

Citation

If you use Meta-Album, please cite our paper:

  @inproceedings{meta-album-2022,
    title={Meta-Album: Multi-domain Meta-Dataset for Few-Shot Image Classification},
    author={Ullah, Ihsan and Carrion, Dustin and Escalera, Sergio and Guyon, Isabelle M and Huisman, Mike and Mohr, Felix and van Rijn, Jan N and Sun, Haozhe and Vanschoren, Joaquin and Vu, Phan Anh},
    booktitle={Thirty-sixth Conference on Neural Information Processing Systems Datasets and Benchmarks Track},
    url = {https://meta-album.github.io/},
    year = {2022}
  }
            

Contributors and Institutions

Sponsors