Download Dataset
Documentation PDF
The dataset is a compilation of six datasets that were gathered from different sources and at different times. However, each of them was checked rigorously under the same evaluation criterion, so that every digit is legible to at least one human being without any prior knowledge.
UPDATE (14th August 2018): The initial release of the NumtaDB dataset was used for the Bengali.AI Computer Vision Challenge. The testing set was found to contain some illegible and ambiguous digits. These digits have been replaced by legible digits of the same label. The new testing digits, along with the old legible ones, can be downloaded here:
Download Revised Testing Set
To check your results on the (revised) testing set, click here.
Disclaimer: Dataset-e is an abridged and curated version of BanglaLekha-Isolated and was not collected by Bengali.AI volunteers.