Breast mass classification method based on convolutional neural networks

Classifying X-ray mammogram images as benign or malignant is a long-standing unresolved problem, owing to the high similarity between different mammogram images. In this study, a novel convolutional neural network based X-ray breast mass classification method is proposed. The method receives the original breast mass image and its transformed image simultaneously, and extracts more abundant features from the breast mass images. Although the transformed image is a simple inverse of the original image, it allows the network to perceive another side of the data at very low cost. Experiment results demonstrate that the proposed method significantly outperforms the compared state-of-the-art breast mass classification methods.


Introduction
With the growth of the ageing population, the burden caused by cancer continues to grow and has become a severe social problem. Breast cancer is the leading cause of cancer death among women and is the most fatal cancer threatening women's lives [1]. Using computers to diagnose breast cancer can accelerate detection, improve diagnostic accuracy and save medical costs [2,3]. The most common methods for breast cancer screening are ultrasound, molybdenum-target X-ray mammography, nuclear magnetic resonance imaging and so on, among which molybdenum-target X-ray mammography is the most widely used in breast cancer diagnosis owing to its low cost and little injury to patients.
Machine learning methods have been widely used in the field of molybdenum-target X-ray mammography image recognition. Pattern recognition methods generally comprise two stages: feature extraction and classification. Feature extraction precedes classification and plays an important role in the pattern recognition process [4,5,6]. In past decades, several breast mass classification methods have been proposed to extract the intrinsic characteristics of mammography and improve classification accuracy. For example, Aarthi et al. [7] extracted image features from the region of interest in the mammography image by a statistical method, generated fusion features by combining the image features with clinical features, and then used a support vector machine (SVM) to classify the fusion features as benign or malignant breast mass. Johra and Shuvo [8] extracted a six-dimensional feature from the segmented image, and classified the extracted features as benign or malignant breast mass with artificial neural network (ANN), SVM and fuzzy logic methods. Experimental results showed that fuzzy logic achieved a higher accuracy than the ANN and SVM.
In recent years, deep learning techniques, especially convolutional neural networks (CNNs), have attracted widespread attention and have been widely used in image processing tasks such as target detection, image classification and image denoising, owing to their excellent capability for automatic feature extraction [9,10,11]. For example, AlexNet [12] used the non-linear ReLU activation to speed up training and the Dropout mechanism to reduce the risk of over-fitting, which enhances the generalisation ability of the model and improves prediction accuracy; it achieved a top-5 error rate of 16.4% on ImageNet. By visualising the output of each layer in a CNN, it is evident that the features extracted by successive layers have a hierarchical structure; in other words, the deeper the layer, the more abstract the obtained features. ZFNet [13] modified the AlexNet architecture and achieved a top-5 error rate of 11% on ImageNet by adjusting the network depth, convolution kernel sizes and strides. Szegedy et al. [14] proposed the Inception module, which combines convolution kernels of size 1 × 1, 3 × 3 and 5 × 5. GoogLeNet [14] simulates convolution with a large receptive field by stacking multiple Inception modules, which increases the network's width and enhances its representation ability; on the ImageNet database it achieved a top-5 image classification error rate of 6.7%.
The CNN has also been widely used in mammography processing owing to its powerful ability to extract features from images automatically. For example, Qiu et al. [15] used an eight-layer CNN to extract breast mass image features and classify them as benign or malignant; evaluated on a dataset containing 560 breast mass images of 512 × 512 pixels, the method achieved an AUC of 0.79. Since producing and labelling mammograms requires doctors with rich experience, it is difficult to build a large-scale mammography dataset. The problem of lacking mammograms can be relieved by transfer learning, which pre-trains the model on a non-medical dataset and fine-tunes it on a medical dataset. Huynh et al. [16] pre-trained a deep learning model on ImageNet [17] and fine-tuned it on a mammography dataset collected by the University of Chicago Medical Center, which contains 607 mammograms, including 261 benign and 346 malignant breast masses.
The breast mass image is not as colourful as a natural image, and the differences in shape and texture between benign and malignant masses are very small, so it is difficult to distinguish a benign mass from a malignant one. Moreover, breast images are difficult to acquire, so we face a serious shortage of breast mass images.
To address the above problems, we propose a novel and effective CNN-based breast mass classification method in this paper. Compared with the existing methods, the proposed method has the following superiorities: (i) the proposed method integrates two types of input images to extract discriminative features from breast mass, which can obtain more abundant information about the breast mass and improve the classification accuracy; (ii) we propose a breast mass image transformation method that maps the original breast mass image into another data space, which is beneficial for obtaining an alternative representation of the breast mass; (iii) experimental results on several datasets illustrate that the proposed method achieves the highest classification accuracies among the compared state-of-the-art methods of mammography classification.

Related works
In recent years, the achievements of the CNN have attracted the attention of researchers in the field of X-ray mammography image recognition, and some scholars have tried to solve medical image recognition problems with CNNs. For example, Jiao et al. [18] extracted middle-level and high-level image features of breast masses from different layers of a CNN, classified the extracted features as benign or malignant breast mass, and used the classification results for final decision making. Evaluated on a dataset of 600 mammograms extracted from the Digital Database for Screening Mammography (DDSM) [19], the method achieved an accuracy of 96.7%. In order to obtain features with a stronger discriminative ability from the breast mass image, Sun and Xu [20] extracted features with multi-scale kernels and achieved an AUC of 0.7129 on the DDSM database.
The literature mentioned above shows that the existing methods of mammography classification improve classification accuracy merely by extracting more effective features from the mammography image directly. For example, Jiao et al. extracted discriminative features from different layers of the CNN, and Sun and Xu extracted feature maps at different scales from the breast mass image with two computing paths. These methods can extract more effective features from the mammography image than a typical CNN, but they only improved the feature extraction method, while failing to exploit the mammography's intrinsic features sufficiently. In order to extract more abundant features from the mammography image, we convert the mammography to another representation data space, and propose a CNN-based method that extracts more discriminative features for classification from the mammography and its corresponding transformed image.

Proposed method
In order to extract the underlying features from breast mass images effectively, we propose a method that captures the difference between benign and malignant breast masses, which is beneficial to improving breast mass classification performance.
The proposed feature fusion sub-network (FFSN) receives two types of breast mass images simultaneously and generates fusion features from the input images. The architecture of the FFSN is shown in Fig. 1.
The FFSN consists of two image inputs, two sub-networks and one feature fusion layer. The image inputs receive the two types of breast mass images and feed them into the two sub-networks, respectively. Each sub-network generates feature maps with two convolutional layers; every convolutional layer contains 30 kernels of size 5 × 5 and is followed by a maximum down-pooling layer with a 2 × 2 filter and a stride of 2. The feature fusion layer receives the feature maps generated by the two sub-networks and outputs the fusion features.
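To make the architecture concrete, the following is a minimal PyTorch sketch of the FFSN under stated assumptions: single-channel (grayscale) inputs, ReLU activations, no padding, and channel-wise concatenation as the fusion operation (the fusion operator and activation function are not specified above, so they are illustrative choices only).

```python
import torch
import torch.nn as nn

class SubNetwork(nn.Module):
    """One FFSN branch: two 5x5 convolutional layers (30 kernels each),
    each followed by 2x2 max pooling with stride 2."""
    def __init__(self, in_channels=1):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(in_channels, 30, kernel_size=5), nn.ReLU(),
            nn.MaxPool2d(kernel_size=2, stride=2),
            nn.Conv2d(30, 30, kernel_size=5), nn.ReLU(),
            nn.MaxPool2d(kernel_size=2, stride=2),
        )

    def forward(self, x):
        return self.features(x)

class FFSN(nn.Module):
    """Feature fusion sub-network: two parallel branches whose feature
    maps are fused (here: concatenated along the channel dimension)."""
    def __init__(self):
        super().__init__()
        self.branch_original = SubNetwork()
        self.branch_transformed = SubNetwork()

    def forward(self, x_original, x_transformed):
        f1 = self.branch_original(x_original)
        f2 = self.branch_transformed(x_transformed)
        return torch.cat([f1, f2], dim=1)  # fusion features: 60 feature maps
```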
We propose the CNN based on fusion features (CNN-FF) by integrating the CNN and the FFSN; the architecture of the CNN-FF is shown in Fig. 2. It contains one FFSN, two convolutional layers, two fully-connected layers and one output layer, and every convolutional layer is followed by a maximum down-pooling layer. The CNN-FF has two inputs, which receive two breast mass images of 200 × 200 pixels, and one output, which produces the classification result. The two types of breast mass images are fed into the FFSN and the fusion features are obtained at the end of the FFSN; two convolutional layers and two fully-connected layers then extract abstract information from the fusion features. Finally, the abstract information is fed into a classification layer, which outputs the classification result.
The convolutional layer that follows the FFSN contains 30 kernels of size 3 × 3, and each convolutional layer is followed by a maximum down-pooling layer. The features obtained by the two convolutional layers and their down-pooling layers are fed into two fully-connected layers to further extract discriminative features. Finally, the extracted features are fed to the classification layer to classify the image as benign or malignant breast mass.
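Continuing the sketch above, a possible CNN-FF stacks the 3 × 3 convolutional stages, two fully-connected layers and a benign/malignant output on top of the FFSN. The fully-connected layer widths are placeholders, since they are not reported here.

```python
import torch.nn as nn
# FFSN is defined in the previous sketch.

class CNNFF(nn.Module):
    """CNN-FF: FFSN followed by two 3x3 conv + pooling stages,
    two fully-connected layers and a benign/malignant output."""
    def __init__(self, fc_width=256):  # fc_width is a placeholder, not from the paper
        super().__init__()
        self.ffsn = FFSN()
        self.conv = nn.Sequential(
            nn.Conv2d(60, 30, kernel_size=3), nn.ReLU(),
            nn.MaxPool2d(2, 2),
            nn.Conv2d(30, 30, kernel_size=3), nn.ReLU(),
            nn.MaxPool2d(2, 2),
        )
        self.classifier = nn.Sequential(
            nn.Flatten(),
            nn.LazyLinear(fc_width), nn.ReLU(),   # infers the flattened size on first call
            nn.Linear(fc_width, fc_width), nn.ReLU(),
            nn.Linear(fc_width, 2),               # benign vs. malignant
        )

    def forward(self, x_original, x_transformed):
        fused = self.ffsn(x_original, x_transformed)
        return self.classifier(self.conv(fused))
```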
The CNN based on fusion features with transformation (CNN-FF-TF) is generated by modifying the inputs of the CNN-FF: it receives the breast mass image and its transformed image and feeds them into the two sub-networks of the FFSN. The CNN based on fusion features with transformation and batch normalisation (CNN-FF-TF-BN) is generated by appending a BN layer after every convolutional layer of the CNN-FF-TF. In order to adapt the two image inputs and enhance the representation ability of the breast mass image, we propose a transformation method that maps the breast mass image into another representation space. In this way, more implicit and intrinsic features can be obtained from breast mass images in the two representation spaces. The image transformation is expressed as

$\tilde{x}_{i,j,c} = x_{\max} - x_{i,j,c}$

where $x_{\max}$ is the global maximum pixel value and $x_{i,j,c}$ is the pixel value at the $i$th row and $j$th column of the $c$th channel. Fig. 3 shows the different visual appearances of the breast mass image and its transformed image.
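As a concrete illustration, the transformation amounts to a simple image inversion and can be implemented in a few lines. This is a minimal NumPy sketch under the assumption, stated in the definition above, that the maximum is taken over the whole image.

```python
import numpy as np

def invert_breast_mass_image(image: np.ndarray) -> np.ndarray:
    """Map a breast mass image into the inverted representation space:
    every pixel x[i, j, c] becomes x_max - x[i, j, c]."""
    x_max = image.max()  # global maximum pixel value
    return x_max - image

# Usage: feed the original and the transformed image to the two FFSN branches.
# transformed = invert_breast_mass_image(original)
```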

Experiments and results
In this section, we evaluate our methods and the compared state-of-the-art methods of mammography classification on the DDSM mammography database and analyse the experimental results.

Datasets
The DDSM mammography database is widely used in medical image processing because it contains more samples than other mammography databases. We extracted the subset of the DDSM in which each mammogram contains only one breast mass. The subset contains 2496 breast mass images, including 1256 benign and 1240 malignant breast masses; 1005 benign and 992 malignant breast mass images are used as the training set, and 251 benign and 248 malignant breast mass images are used as the testing set. We evaluate the robustness of our method on the MIAS mammography database. Since the subset extracted from MIAS contains only 111 breast mass images, which is too small to exploit the compared methods fully, we used the models trained on the DDSM to predict the samples extracted from the MIAS. The validation set extracted from MIAS contains 111 breast mass images, including 60 benign and 51 malignant breast mass images.

Comparison of convolution results for two types of images
The breast mass image and its transformed image are fed into the two sub-networks of the FFSN, and two types of feature maps are obtained at the end of the sub-networks. The two types of feature maps obtained from the second convolutional layer of the two sub-networks are shown in Fig. 4: the feature maps in the first row were obtained from the sub-network that receives the original breast mass image, and the feature maps in the second row were obtained from the sub-network that receives the transformed image. The two types of feature maps differ considerably, and more abundant representation information about the breast mass can be obtained by fusing them.

Result comparison and analysis
As shown in Table 1, LeNet achieves an accuracy of 0.7054, while the accuracies of our proposed methods CNN-FF, CNN-FF-TF and CNN-FF-TF-BN are 0.7495, 0.7996 and 0.8156, respectively; the accuracy improvement of our methods over LeNet ranges from 4.41 to 11.02 percentage points. The CNN-FF has higher classification accuracy than LeNet because it enlarges the network's width and uses two sub-networks to extract feature maps from the breast mass images. The CNN-FF-TF generates discriminative features from the breast mass image and its corresponding transformed image, and can thus retain the intrinsic information of the mammography as much as possible; as a result, the accuracy of CNN-FF-TF is 0.0501 higher than that of CNN-FF, which merely uses the original breast mass image. Because the data generated by each convolutional layer are normalised to zero mean and unit variance by the BN layer in CNN-FF-TF-BN, the number of iterations is reduced and model training is accelerated. Moreover, adding a BN layer after each convolutional layer avoids the unbalanced contributions of different data dimensions caused by their different value ranges during model training, which is beneficial to the classification accuracy of the model. As a result, the CNN-FF-TF-BN achieves better performance than CNN-FF-TF. The experimental results of the compared state-of-the-art classification methods for mammography and our methods on the DDSM dataset are listed in Table 2.
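For transparency, the improvement range quoted above follows directly from the Table 1 accuracies:

```latex
0.7495 - 0.7054 = 0.0441 \;(4.41\ \text{points}), \qquad
0.8156 - 0.7054 = 0.1102 \;(11.02\ \text{points})
```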
The accuracy of CNN-FF-TF-BN is 0.8156, whereas the accuracies of CNN + SVM and CNN(Conv5 + Fc7) + SVM are 0.7114 and 0.7174, respectively. The CNN + SVM method is divided into two parts: feature extraction from the breast mass image by a CNN, and classification of the image features by the SVM algorithm. Because layers at different levels of the CNN extract different types of features, the CNN(Conv5 + Fc7) + SVM method fuses the outputs of the fifth convolutional layer and the last fully-connected layer of the CNN, and classifies the fused features with the SVM algorithm; its accuracy is 0.006 higher than that of the CNN + SVM method, which merely uses the discriminative features extracted from the last fully-connected layer. The experimental results illustrate that the proposed CNN-FF-TF-BN outperforms the compared state-of-the-art methods of mammography classification on the DDSM dataset.
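The compared CNN-plus-SVM pipeline can be illustrated with a short sketch. This is not the exact implementation of [18] or [22]; it only assumes that per-image feature vectors from a convolutional layer and a fully-connected layer are already available as arrays, and the RBF kernel is an illustrative choice.

```python
import numpy as np
from sklearn.svm import SVC

def classify_fused_features(conv_feats, fc_feats, labels, test_conv, test_fc):
    """Concatenate two types of CNN features per image and classify with an SVM.

    conv_feats, test_conv: (n, d1) features from a convolutional layer
    fc_feats,   test_fc:   (n, d2) features from a fully-connected layer
    labels:                 (n,)   benign/malignant labels for the training set
    """
    train_x = np.concatenate([conv_feats, fc_feats], axis=1)
    test_x = np.concatenate([test_conv, test_fc], axis=1)
    clf = SVC(kernel="rbf")  # kernel choice is an assumption
    clf.fit(train_x, labels)
    return clf.predict(test_x)
```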
As shown in Table 3, our methods all outperform the compared state-of-the-art methods of mammography classification on the MIAS. The experimental results demonstrate that the robustness and effectiveness of our methods are better than those of the compared methods. In addition, we find that on MIAS the accuracy of CNN-FF-TF-BN is equal to that of CNN-FF-TF, which has no BN layer; this illustrates that the BN layer is not the crucial component for improving the performance of our methods.
The methods evaluated in this paper have different convergence rates. As shown in Fig. 5, LeNet, CNN + SVM, CNN(Conv5 + Fc7) + SVM, CNN-FF and CNN-FF-TF reach their highest prediction accuracy at around iteration 30, whereas CNN-FF-TF-BN converges at around iteration 15 because a BN layer is added after every convolutional layer. The BN layer normalises the output of the convolutional layer towards a standard normal distribution with a mean of 0 and a variance of 1, which maps the data into a region that is conducive to discrimination and greatly accelerates convergence during model training.
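For reference, the standard batch-normalisation step applied by each BN layer to a mini-batch of m activations {x_1, …, x_m} can be written as

```latex
\mu_B = \frac{1}{m}\sum_{k=1}^{m} x_k, \qquad
\sigma_B^2 = \frac{1}{m}\sum_{k=1}^{m}\left(x_k - \mu_B\right)^2, \qquad
\hat{x}_k = \frac{x_k - \mu_B}{\sqrt{\sigma_B^2 + \epsilon}}, \qquad
y_k = \gamma\,\hat{x}_k + \beta
```

where γ and β are learnable scale and shift parameters and ε is a small constant for numerical stability.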

Conclusion
In this paper, we proposed a breast mass image classification method that combines a transformation method for breast mass images with the FFSN. We carried out the following work: (i) we proposed the FFSN, which can extract abundant discriminative features from the breast mass image through its two sub-networks; (ii) we proposed an image transformation method for breast masses that enhances the representation ability of the breast mass image; (iii) we proposed the CNN-FF-TF-BN method, which fuses the proposed transformation method and the FFSN. The proposed methods were compared with the state-of-the-art methods of mammography classification on the DDSM and MIAS mammography datasets.
Experiment results showed that CNN-FF-TF-BN outperformed the compared state-of-the-art convolutional deep learning methods of mammography classification.

Table 2 Comparison with the compared state-of-the-art methods of mammography classification on DDSM
Method name                    Accuracy
CNN + SVM [22]                 0.7114
CNN(Conv5 + Fc7) + SVM [18]    0.7174
CNN-FF-TF-BN                   0.8156