Detection of t(9;22) Chromosome Translocation Using Deep Residual Neural Network

Karyotype analysis has significant clinical importance. Accurately detecting specific chromosomal abnormalities contributes to the diagnosis of certain diseases. In this paper, I present a convenient and reliable system capable of detecting the t(9;22) chromosome translocation, a specific chromosomal abnormality in chronic myeloid leukemia (CML) patients. The system is based on deep learning: I built a classifier using ResNet that detects the t(9;22) translocation from images of chromosomes 9 and 22. The model achieves 97.5% accuracy on the validation set.

Figure 1. A sample of chromosomes within a human cell (left) and the accompanying karyotype image sorted by Denver group (right).
the 22q derivative chromosome known as the Philadelphia (Ph) chromosome [2] (Figure 2). Discovered in 1960, it was the first chromosome abnormality found in leukemia; it is now known to be present in 95% of CML cases and is regarded as a specific genetic marker of CML [3]. Over the past decades, it has been shown that the BCR-ABL fusion gene encodes a protein with tyrosine kinase activity that initiates and maintains the disease [3]. Treatment with tyrosine kinase inhibitors directed against BCR-ABL, such as imatinib mesylate, has had a revolutionary effect. The International Randomized Study confirmed survival of up to 77.1% of patients treated with imatinib mesylate [4]. Therefore, it is crucial to identify the t(9;22) chromosome translocation effectively and to begin treatment in time. Current chromosome analysis uses automated karyotyping systems (AKS), which provide an interactive graphical environment [5]. However, these systems still require manual chromosome classification, which is usually highly time-consuming and requires professional knowledge. A technician may need years of experience to perform karyotype analysis effectively and independently. These constraints make karyotype analysis difficult to perform, especially in underdeveloped areas that lack trained professionals.

Related Work
Karyotype analysis is fundamentally an image analysis problem. In recent years, in order to reduce the burden of karyotype analysis, many computer-based auto-  image classification. Gulshan showed the ability of CNNs to detect diabetic retinopathy [7]. Esteva and others showed the ability of CNNs to detect skin cancer [8]. Ehteshami Bejnordi showed the excellent performance of an improved CNN in lymph node metastasis detection [9]. Monika [10] combined crowdsourcing, preprocessing, and deep learning to segment and classify chromosomes, particularly overlapping ones, with a classification accuracy of 86.7%. Joshi [11] proposed incremental learning for classifying metaphase chromosomes in automated karyotyping and achieved an accuracy of 97%. In general, such methods first preprocess each chromosome image using skeletonization algorithms, then extract features along each computed axis, and finally build classifiers on the extracted features to estimate the chromosome type. However, all of the studies above focused on normal chromosomes. None of them used artificial intelligence to specifically address the identification of chromosomal abnormalities from the visual appearance of chromosomes.
In this work, I present an approach to identify a specific chromosome abnormality, t(9;22), from images containing chromosomes 9 and 22. The model is a 50-layer ResNet built with the TensorFlow framework. I designed the system to extract shape features from chromosome images and to classify them with a Residual Network, which allows the depth of the network to increase while reducing the effect of the vanishing gradient problem, and can therefore produce better accuracy in the image classification task I want to address. Through image preprocessing, deep learning, and feature extraction, I improved performance on t(9;22) chromosome translocation detection.

Data
The raw chromosome images were collected from 200 individual samples provided by Dr. Liu. Each image was karyotyped and assigned the correct labels. All images were received in a de-identified format to protect patient identity. Figure 3 shows two of the samples. I extracted the images of chromo-  To feed the data into the CNN for training, I modified each image to obtain a uniform image size of 90 × 90 with a pixel value of 72. Figure 4 shows the image sets after preprocessing.
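As an illustration of bringing images to a uniform 90 × 90 size, the following is a minimal sketch that assumes nearest-neighbor resampling on a grayscale image stored as a list of rows. The resampling method, the function name `resize_nearest`, and the toy image are assumptions for illustration; the text does not state how the resizing was actually done.

```python
def resize_nearest(image, out_h=90, out_w=90):
    """Resize a 2-D grayscale image (list of rows) with nearest-neighbor sampling.

    Assumption: nearest-neighbor is used here only because it is the simplest
    resampling rule; the original preprocessing method is not specified.
    """
    in_h, in_w = len(image), len(image[0])
    resized = []
    for r in range(out_h):
        # Map each output row back to the source row it falls on.
        src_r = min(in_h - 1, r * in_h // out_h)
        row = [image[src_r][min(in_w - 1, c * in_w // out_w)] for c in range(out_w)]
        resized.append(row)
    return resized


# Hypothetical 3 x 2 toy image standing in for a cropped chromosome image.
toy = [[0, 255], [128, 64], [32, 16]]
out = resize_nearest(toy)
print(len(out), len(out[0]))
```

Every image passed through such a function has the same shape, which is what a CNN with a fixed input layer requires.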

System Design
The structure of the proposed ResNet is illustrated in Figure 5.
Unlike traditional fully connected neural networks, a CNN also has convolutional layers and pooling layers for feature extraction. These two layer types are described below:
1) Convolutional layer: Convolutional layers perform a convolution operation to extract features from images. A convolution is a mathematical operation on two real-valued functions and is a key component of a CNN. The initial convolutional layers in a CNN extract low-level features, which in my experiment may be the edges or shapes of chromosomes. Later convolutional layers extract more complex features from each image. Stacking multiple convolutional layers yields a sophisticated feature map. The mathematical representation of a convolutional layer is as follows:
2) Pooling layer: The pooling layer is located after the convolutional layer and compresses its input to keep only the main features, which makes the feature map smaller and reduces computational complexity.
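The two layer types described above can be sketched in a few lines of plain Python. This is a toy illustration under stated assumptions, not the TensorFlow implementation used in this work: the function names, the 5 × 5 image, and the difference kernel are all hypothetical, there is a single channel, no padding, and a stride of 1. As in most CNN libraries, the "convolution" is computed without flipping the kernel (cross-correlation).

```python
def conv2d_valid(image, kernel):
    """Slide `kernel` over `image` (no padding, stride 1) and sum elementwise products."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[r + i][c + j] * kernel[i][j] for i in range(kh) for j in range(kw))
            for c in range(out_w)
        ]
        for r in range(out_h)
    ]


def max_pool(feature_map, size=2):
    """Keep the maximum of each non-overlapping size x size window."""
    out_h = len(feature_map) // size
    out_w = len(feature_map[0]) // size
    return [
        [
            max(feature_map[r * size + i][c * size + j] for i in range(size) for j in range(size))
            for c in range(out_w)
        ]
        for r in range(out_h)
    ]


# Hypothetical toy image with a vertical edge between columns 1 and 2.
image = [[0, 0, 1, 1, 1] for _ in range(5)]
kernel = [[-1, 1]]                    # horizontal difference: responds to vertical edges
edges = conv2d_valid(image, kernel)   # 5 x 4 feature map
pooled = max_pool(edges, size=2)      # 2 x 2 after pooling
print(edges[0], pooled)
```

The convolution responds only where the intensity changes, and pooling keeps that response while shrinking the feature map.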
The CNN learns to detect the t(9;22) chromosome translocation by training on existing images of the abnormality. CNNs today commonly use the ReLU activation function to improve the generalization ability of the model, a practice that has been shown to perform well [16]. I adopted it in my model as well.
Kaiming He and others first proposed the Residual Network (ResNet) [17]. The 152-layer ResNet model they trained achieved a top-5 accuracy of 96.43% at ILSVRC 2015 (ImageNet classification) [18]. The main idea of ResNet is similar to that of the Highway Network [19]. Unlike a traditional network structure, which only allows non-linear transformations, the Highway Network allows the model to carry forward a portion of the previously computed output. ResNet has a similar mechanism: it includes a direct connection so that the input can be passed straight to later layers.
If a layer receives an input x, denote the feature it should learn as H(x). The residual that I want the layer to learn is therefore F(x) = H(x) − x, so the original feature becomes F(x) + x. In theory, if the residual approaches zero, the layer simply passes its input on to the next layer (an identity mapping) without losing accuracy. In practice the residual is not zero, and it improves the performance of the model by letting the layer learn additional features from the input. Figure 6 shows the general composition of ResNet.
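The residual formulation above can be written out compactly, together with the gradient it implies. This is a sketch in the notation of this section; the loss symbol \mathcal{L} and the identity matrix I are introduced here only for illustration.

```latex
\begin{aligned}
H(x) &= F(x) + x, \\
F(x) \to 0 \;&\Rightarrow\; H(x) \to x \quad \text{(the block reduces to an identity mapping)}, \\
\frac{\partial \mathcal{L}}{\partial x}
  &= \frac{\partial \mathcal{L}}{\partial H}\left(\frac{\partial F}{\partial x} + I\right).
\end{aligned}
```

Because of the identity term I, part of the gradient reaches earlier layers unchanged even when \partial F / \partial x is small, which is why the shortcut eases the training of deep networks.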
ResNet uses residual learning to reduce the effect of vanishing gradients as more layers are added to the model. With residual learning, the network becomes more sensitive to small changes. The residual function usually has small responses and is paired with a shortcut connection. As Figure 6 illustrates, the residual branch bypassed by the shortcut has two layers, given by the following formula, where σ represents a nonlinear function (ReLU) and W_1 and W_2 are the weights of the two layers:
$$F = W_2 \, \sigma(W_1 x)$$
The output y is then obtained through the shortcut connection and element-wise addition, with the second ReLU function applied after the addition:
$$y = F(x, \{W_i\}) + x$$
When the dimensions of the input and output differ, for example when the number of channels changes, a linear projection W_s can be applied to x on the shortcut:
$$y = F(x, \{W_i\}) + W_s x$$
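The two-layer residual block described above, with either an identity shortcut or a projection shortcut, can be sketched as follows. This is a simplified illustration and not the 50-layer TensorFlow model: it assumes small dense weight matrices in place of convolutions, omits biases and batch normalization, and all names and numeric values (`residual_block`, `W1`, `W2`, `Ws`) are hypothetical.

```python
def relu(v):
    """Elementwise ReLU: max(0, a) for each component."""
    return [max(0.0, a) for a in v]


def matvec(W, v):
    """Multiply matrix W (list of rows) by vector v."""
    return [sum(w * a for w, a in zip(row, v)) for row in W]


def residual_block(x, W1, W2, Ws=None):
    """Return relu(F(x) + shortcut(x)) with F(x) = W2 relu(W1 x).

    The shortcut is the identity when Ws is None, otherwise the projection Ws x.
    """
    residual = matvec(W2, relu(matvec(W1, x)))
    shortcut = x if Ws is None else matvec(Ws, x)
    return relu([f + s for f, s in zip(residual, shortcut)])


x = [1.0, 2.0]

# Zero weights give a zero residual, so a non-negative input passes through unchanged.
zeros = [[0.0, 0.0], [0.0, 0.0]]
identity_out = residual_block(x, zeros, zeros)

# Changing the dimension from 2 to 3 requires a projection on the shortcut.
W1 = [[1.0, -1.0], [0.5, 0.5]]
W2 = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
Ws = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
projected_out = residual_block(x, W1, W2, Ws)
print(identity_out, projected_out)
```

The first call shows the identity behavior that makes extra layers harmless when they have nothing to add. The second shows why a projection is needed: without `Ws`, the 3-component residual could not be added to the 2-component input.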

Experimental Result
Training was performed on a computer running Ubuntu 16.04 with 32 GB of memory and an Nvidia GTX 1080 Ti GPU. After tuning the parameters, I trained the network for twenty epochs. The results of the four trials I conducted are presented in Table 1. The model achieves an average accuracy of 97.5% on the validation set. Figure 7 shows how accuracy changes on the training and validation sets; the horizontal axis represents the number of epochs, from zero to twenty, and the vertical axis represents accuracy, from 0% to 100%. The green and black curves in Figure 7 represent the loss, and their approach toward zero indicates a reliable model. The blue and red curves represent accuracy, and their horizontal asymptotes near 1 demonstrate the accuracy of my model.
Figure 7 shows that after 20 epochs the model has an accuracy near 100% and a loss near 0, except in trial 2, where the training loss curve fails to approach 0. Based on the trend of all four curves, I conclude that the model shows no obvious overfitting and is effective in detecting the t(9;22) chromosomal translocation.
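The accuracy figures above are fractions of correctly classified validation images, averaged over trials. A minimal sketch of that bookkeeping follows. The label lists are invented toy values for illustration only and are not data from this study, and the function names are hypothetical.

```python
def accuracy(predictions, labels):
    """Fraction of predictions that match the true labels."""
    correct = sum(p == t for p, t in zip(predictions, labels))
    return correct / len(labels)


def mean(values):
    """Arithmetic mean of a non-empty list."""
    return sum(values) / len(values)


# Toy labels: 1 = t(9;22) translocation present, 0 = normal (made-up values).
labels = [1, 0, 1, 1, 0, 0, 1, 0]
trial_predictions = [
    [1, 0, 1, 1, 0, 0, 1, 0],  # hypothetical trial A: all correct
    [1, 0, 1, 0, 0, 0, 1, 0],  # hypothetical trial B: one mistake
]
trial_accuracy = [accuracy(p, labels) for p in trial_predictions]
average_accuracy = mean(trial_accuracy)
print(trial_accuracy, average_accuracy)
```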

Conclusion
Detecting chromosomal abnormalities effectively has great clinical significance. It is crucial to detect gene abnormalities in patients at the earliest stage to ensure
I hope that, as I analyze and address these and other remaining areas and apply these technologies to the field of chromosome abnormality diagnosis, the efficiency of detecting such abnormalities will continue to improve.