OmniPrint-MD-6 Dataset

Dataset details

Last updated: 15 Dec 2022
Meta Album ID OCR.MD_6
Domain ID OCR
Domain Name Optical Character Recognition
Set Number 2
Dataset ID MD_6
Dataset Name OmniPrint-MD-6
Short Description Character images with a specific set of nuisance parameters
Long Description The OmniPrint-MD-6 dataset consists of 28 120 images (128x128, RGB) from 703 categories. The images were synthesized with OmniPrint; no further processing was done. The OmniPrint synthesis parameters are as follows: font size is 192; image size is 128; the strength of the random perspective transformation is 0.04; the left, right, top, and bottom margins are each 20% of the image size; the strength of the pre-rasterization elastic transformation is 0.035; random translation is activated both horizontally and vertically; the image blending method is Poisson Image Editing; rotation is between -60 and 60 degrees; horizontal shear is between -0.5 and 0.5; both foreground and background are images taken with a personal mobile phone.
# Classes 703
# Images 28120
Keywords ocr
Data Format images
Image size 128x128
License (original data release) CC BY 4.0
License URL (original data release) https://creativecommons.org/licenses/by/4.0/
License (Meta-Album data release) CC BY 4.0
License URL (Meta-Album data release) https://creativecommons.org/licenses/by/4.0/
Source OmniPrint
Source URL https://github.com/SunHaozhe/OmniPrint
Original Author Haozhe Sun
Original contact sunhaozhe275940200@gmail.com
Meta Album author Haozhe Sun
Created Date 25 June 2021
Contact Name Haozhe Sun
Contact Email meta-album@chalearn.org
Contact URL https://meta-album.github.io/
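
The figures in the dataset details above can be cross-checked with a few lines of arithmetic. The sketch below is illustrative only: the constant names are made up for this example, and reading the 20% margin as a fraction of the 128-pixel image side is an assumption about how the parameter translates to pixels.

```python
# Figures copied from the dataset details; the constant names are illustrative.
NUM_IMAGES = 28120
NUM_CLASSES = 703
IMAGE_SIZE = 128          # pixels per side
MARGIN_FRACTION = 0.20    # each margin, as a fraction of the image size

# If the classes are balanced, the image count divides evenly by the class count.
images_per_class, remainder = divmod(NUM_IMAGES, NUM_CLASSES)

# Assumption: the margin is measured against the 128-pixel side.
margin_px = MARGIN_FRACTION * IMAGE_SIZE

print(images_per_class, remainder)  # 40 0
print(margin_px)
```

The zero remainder is consistent with 40 images for each of the 703 categories.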

Download Dataset from OpenML

Dataset Version OpenML ID
Micro 44280
Mini 44310

Code to download dataset using OpenML API

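      import openml

      # OpenML ID of the version to fetch, taken from the table above
      # (44280 = Micro, 44310 = Mini).
      DATASET_ID = 44280

      # Download the dataset with the given OpenML ID.
      dataset = openml.datasets.get_dataset(DATASET_ID)

      # Display the dataset name to confirm the download.
      print(dataset.name)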
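
To switch between the Micro and Mini versions without hard-coding an ID, the lookup can be wrapped in a small helper. This is a minimal sketch, not part of the Meta-Album tooling: `OPENML_IDS`, `resolve_openml_id`, and `fetch_version` are hypothetical names introduced here, and only `openml.datasets.get_dataset` comes from the snippet above.

```python
# OpenML IDs copied from the version table on this page.
OPENML_IDS = {"Micro": 44280, "Mini": 44310}


def resolve_openml_id(version: str) -> int:
    """Return the OpenML ID for a version name (case-insensitive)."""
    wanted = version.strip().lower()
    for name, dataset_id in OPENML_IDS.items():
        if name.lower() == wanted:
            return dataset_id
    raise ValueError(
        f"Unknown version {version!r}; expected one of {sorted(OPENML_IDS)}"
    )


def fetch_version(version: str):
    """Download the requested version (needs the openml package and network access)."""
    # Imported lazily so the ID lookup works even without openml installed.
    import openml

    return openml.datasets.get_dataset(resolve_openml_id(version))
```

For example, `fetch_version("Mini")` would request the dataset with ID 44310, while an unlisted name raises a `ValueError` before any network call is made.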
              

Cite this dataset

@inproceedings{sun2021omniprint,
    title={OmniPrint: A Configurable Printed Character Synthesizer},
    author={Haozhe Sun and Wei-Wei Tu and Isabelle M Guyon},
    booktitle={Thirty-fifth Conference on Neural Information Processing Systems Datasets and Benchmarks Track (Round 1)},
    year={2021},
    url={https://openreview.net/forum?id=R07XwJPmgpl}
}
              

Cite Meta-Album

  @inproceedings{meta-album-2022,
    title={Meta-Album: Multi-domain Meta-Dataset for Few-Shot Image Classification},
    author={Ullah, Ihsan and Carrion, Dustin and Escalera, Sergio and Guyon, Isabelle M and Huisman, Mike and Mohr, Felix and van Rijn, Jan N and Sun, Haozhe and Vanschoren, Joaquin and Vu, Phan Anh},
    booktitle={Thirty-sixth Conference on Neural Information Processing Systems Datasets and Benchmarks Track},
    url = {https://meta-album.github.io/},
    year = {2022}
  }
              