Transferred Parallel Convolutional Neural Network for Large Imbalanced Plankton Database Classification

Chao Wang, Xueer Zheng, Chunfeng Guo, Zhibin Yu, Jia Yu, Haiyong Zheng, Bing Zheng
DOI: https://doi.org/10.1109/oceanskobe.2018.8558836
2018-01-01
Oceans
Abstract: Plankton are critically important to our ecosystem, accounting for more than half of the primary productivity on Earth and nearly half of the total carbon fixed in the global carbon cycle. Loss of plankton populations could result in ecological upheaval as well as negative societal impacts. Conversely, a phytoplankton bloom can cause red tides, which lead to huge economic losses. Information on species population and distribution is therefore valuable. Recently, convolutional neural networks (CNNs) have achieved state-of-the-art results on large-scale image classification. We apply several popular CNN models to the WHOI large-scale plankton database. They achieve high accuracy on this dataset, but the class distribution of WHOI is imbalanced, so the data imbalance problem must be addressed. To evaluate the classifier impartially, we adopt the F1 score as an evaluation criterion. Although the CNN models achieve high global accuracy on the database, they achieve low F1 scores: 0.17 and 0.29 for the CIFAR10 CNN model and the VGG16 model, respectively. In this paper, we introduce a transferred parallel model approach to overcome this problem. We pre-train a CNN model on the small classes, those with fewer than 5,000 images. The pre-trained model is then used as a feature extractor to enhance the features of the small classes: we fix all of its weights and combine it with a parallel network that is trained on the whole training database. With this transferred-feature-based approach, we achieve higher F1 scores of 0.3752 and 0.5444 with our models based on the CIFAR10 CNN model and the VGG16 model, respectively.
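The gap between high global accuracy and a low F1 score can be illustrated with a minimal sketch. It assumes the F1 score is averaged over classes with equal weight (macro averaging), which the abstract does not state. The two-class data and the function names per_class_f1, macro_f1 and accuracy are hypothetical and are not taken from the paper or the WHOI database.

```python
from collections import Counter


def per_class_f1(y_true, y_pred):
    """Return {label: F1} using F1 = 2*TP / (2*TP + FP + FN) per class."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1
        else:
            fp[p] += 1  # predicted class gains a false positive
            fn[t] += 1  # true class gains a false negative
    scores = {}
    for label in sorted(set(y_true) | set(y_pred)):
        denom = 2 * tp[label] + fp[label] + fn[label]
        scores[label] = 2 * tp[label] / denom if denom else 0.0
    return scores


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 (assumed averaging scheme)."""
    scores = per_class_f1(y_true, y_pred)
    return sum(scores.values()) / len(scores)


def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label matches the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


# Hypothetical imbalanced toy set: 95 samples of a large class, 5 of a small one.
y_true = ["common"] * 95 + ["rare"] * 5
# A classifier that always predicts the large class.
y_pred = ["common"] * 100

print(f"accuracy = {accuracy(y_true, y_pred):.4f}")
print(f"macro F1 = {macro_f1(y_true, y_pred):.4f}")
print(per_class_f1(y_true, y_pred))
```

In this toy case the classifier never predicts the small class, so that class gets an F1 of zero and pulls the macro average down even though most samples are labelled correctly. This is the kind of effect the F1 criterion is meant to expose on an imbalanced database.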