Title: | Small Image Training Sets: Exploring the Limits of Conventional and CNN-based Methods |
Authors: | Kovalev, V. |
Keywords: | conference proceedings;Image Classification;Benchmarking;Convolutional Neural Networks;Histology images |
Issue Date: | 2021 |
Publisher: | UIIP NASB |
Citation: | Kovalev, V. Small Image Training Sets: Exploring the Limits of Conventional and CNN-based Methods / Kovalev V. // Pattern Recognition and Information Processing (PRIP'2021) = Распознавание образов и обработка информации (2021) : Proceedings of the 15th International Conference, 21–24 Sept. 2021, Minsk, Belarus / United Institute of Informatics Problems of the National Academy of Sciences of Belarus. – Minsk, 2021. – P. 178–182. |
Abstract: | This work addresses image classification when only small image datasets are available. Both traditional and CNN-based methods are examined and compared on a benchmark image dataset. The dataset consists of 12000 routine hematoxylin-eosin stained histological images representing biopsy samples of normal tissue and of malignant breast cancer tumors. For the traditional approach, image features were derived from color co-occurrence matrices of images converted to an adaptive 32-color space, reduced to a limited number of principal components (PCA). These features were fed to SVM and Random Forest classifiers. The original image training set was gradually reduced from 8400 to 840 images in steps of 10%. In addition, very small sub-samples of 5% (420), 2.5% (210), 1.25% (105), and 1% (84) of the original training set were examined. For the CNN-based approach, a classical CNN was employed that consisted of only 3 convolutional + MaxPooling layers with 16, 32, and 64 filters respectively; this compact architecture was chosen because small image training sets were specifically targeted in this study. The convolutional part was followed by a fully connected neural network with 512 intermediate nodes. It was found that traditional methods outperform the CNN-based image classification technique on training sets of fewer than 840 images. |
URI: | https://libeldoc.bsuir.by/handle/123456789/45826 |
Appears in Collections: | Pattern Recognition and Information Processing (PRIP'2021) = Распознавание образов и обработка информации (2021) |
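
The training-set reduction schedule described in the abstract can be checked with a few lines of arithmetic. The sketch below is illustrative only and is not taken from the paper: the function name `subset_sizes` and the use of exact fractions are assumptions made here, and the only figures drawn from the record are the 8400-image training set and the listed percentages.

```python
from fractions import Fraction

# Size of the full training set, as stated in the abstract.
FULL_TRAINING_SET = 8400


def subset_sizes(full_size=FULL_TRAINING_SET):
    """Return (percent, image_count) pairs for every training subset.

    Hypothetical helper: the schedule (100% down to 10% in 10% steps,
    then 5%, 2.5%, 1.25% and 1%) follows the abstract; everything else
    is an assumption for illustration.
    """
    # Main schedule: 100%, 90%, ..., 10% of the training set.
    main = [Fraction(p, 100) for p in range(100, 0, -10)]
    # Very small sub-samples: 5%, 2.5%, 1.25% and 1%.
    small = [Fraction(5, 100), Fraction(25, 1000),
             Fraction(125, 10000), Fraction(1, 100)]
    # Exact fractions avoid floating-point rounding in the image counts.
    return [(float(f * 100), int(f * full_size)) for f in main + small]


if __name__ == "__main__":
    for percent, count in subset_sizes():
        print(f"{percent:g}% of the training set: {count} images")
```

The computed counts agree with the figures quoted in the abstract (840 images at 10%, and 420, 210, 105, and 84 images for the very small sub-samples), which confirms that the percentages refer to the 8400-image training set rather than to the full 12000-image dataset.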