Elektrotehniški vestnik 87(1-2): 68-73, 2020
Original scientific paper

Automatic Segmentation Based on the Cardiac Magnetic Resonance Image Using a Modified Fully Convolutional Network

Xinyu Yang1,2, Yingming Sun2,*, Yuan Zhang1,*, Anton Kos3

1 College of Electronic and Information Engineering, Southwest University, China
2 School of Information Science and Engineering, University of Jinan, Jinan, China
3 University of Ljubljana, Faculty of Electrical Engineering, Ljubljana, Slovenia
*Corresponding author e-mail: ise_sunym@ujn.edu.cn, yuanzhang@swu.edu.cn

Abstract. Segmentation of cardiac magnetic resonance images (MRI) is an indispensable step in evaluating cardiac function. Traditional cardiac MRI segmentation requires manually delineating the left ventricle (LV), right ventricle (RV) and myocardium (MYO), which is time-consuming and prone to mistakes. It is therefore desirable to develop automatic MRI segmentation methods. Inspired by the power of deep neural networks, we propose an image-to-image modified Fully Convolutional Network (FCN) to perform cardiac MRI segmentation. Firstly, the MRI data is preprocessed. Then, the preprocessed data is fed into the modified FCN, which is designed to learn low-layer and high-layer representations from the cardiac MRI. The modified FCN is trained directly on cardiac MRI and the corresponding ground truth. Finally, a novel constraint scheme is introduced that combines the region loss (Loss_R) with the multi-class cross-entropy loss (Loss_C) to learn more representative features. Experimental results show that the proposed method achieves good agreement with manual MRI segmentation results and outperforms previous approaches in terms of the Dice Similarity Coefficient, Hausdorff distance and sensitivity.
Keywords: Cardiac MRI, Medical Image Segmentation, Deep Neural Networks

Samodejna razčlenitev srčne magnetno resonančne slike z uporabo modificiranih polno konvolucijskih mrež

Razčlenitev srčne magnetno resonančne slike (MRI) je nepogrešljiv korak za oceno srčne funkcije. Za razčlenitev MRI srca s tradicionalnimi metodami je potrebno ročno razčleniti levi prekat (LV), desni prekat (RV) in srčno mišico (MYO), kar je zamudno in je nagnjeno k napakam. Zato je še vedno zaželeno razviti metode samodejne razčlenitve. Navdihnjeni z močjo globokih nevronskih mrež predlagamo izvedbo razčlenitve MRI srca s spremenjenimi polno konvolucijskimi mrežami (FCN), ki delujejo na posameznih slikah. Najprej izvedemo predobdelavo MRI podatkov, potem pa s spremenjenimi FCN, ki so zasnovane tako, da se nauči nizkoplastnih in visokoplastnih predstavitev s srčno magnetno resonanco. Model spremenjene FCN se uči neposredno iz MRI srca in pripadajočih oznak. Na koncu uvedemo novo shemo omejitev, tako da izgubo regije LossR kombiniramo z večvrstno navzkrižno entropijo, za pridobivanje več značilnih lastnosti. Rezultati kažejo, da predlagana metoda dosega dobro skladnost z rezultati ročne razčlenitve in prekaša prejšnje pristope v smislu koeficienta podobnosti kock, Hausdorffove oddaljenosti in občutljivosti.

1 Introduction

Cardiovascular disease is one of the deadliest diseases in the world, even though mortality has been decreasing over the years thanks to the development of cardiac imaging technologies [1]. In particular, the use of short-axis MRI allows an accurate evaluation of cardiac function [2]. In clinical cardiology, clinicians manually delineate the left ventricle (LV), right ventricle (RV), and myocardium (MYO) in cardiac MRI to calculate cardiac function indicators such as myocardial thickness, ventricular volume and ejection fraction. These indicators are important references for the diagnosis of cardiovascular diseases.
However, manually delineating massive MRI volumes is both time-consuming and prone to subjective errors. Therefore, there is an urgent demand for automatic cardiac MRI segmentation methods.

Since the beginning of the 21st century, researchers have used machine learning methods to implement automatic cardiac MRI segmentation. For instance, Codella et al. [3] proposed a region-growing method to segment the LV. However, the seed points need to be picked manually. Pluempitiwiriyawej et al. [4] presented a stochastic active contour scheme (STACS) which overcomes some of the unique challenges in cardiac MRI segmentation, such as low contrast and the adverse effect of papillary muscles on the MRI segmentation. However, this method is less sensitive to the contour, which can lead to low segmentation accuracy.

Received 6 December 2019
Accepted 8 January 2020

In the last few years, researchers have turned to deep learning to perform automatic cardiac MRI segmentation. Ngo et al. [5] introduced a methodology that combines deep learning and level sets to train on a small dataset for automated LV MRI segmentation. Avendi et al. [6] proposed an automatic LV MRI segmentation method that adopts convolutional networks to detect the LV chamber in the MRI dataset. Long et al. [7] proposed the fully convolutional network (FCN), which defined the state of the art in semantic segmentation. In the field of RV segmentation, FCN has also achieved unprecedented performance [8]. In addition to the above approaches, some researchers perform cardiac MRI segmentation by modifying the neural network architecture [9]-[11]. Despite the desired results achieved by these methods in certain fields, some methods may fail to perform the LV, RV and MYO segmentation simultaneously.
Furthermore, the results of most algorithms are sensitive to the ventricle contours.

In this paper, an image-to-image modified FCN is proposed for automatic cardiac MRI segmentation, which can segment the LV, RV and MYO simultaneously. Our contributions can be summarized as follows:
1) We introduce a modified FCN architecture to learn a multi-layer feature representation for automatic cardiac MRI segmentation;
2) We introduce a novel constraint scheme by adopting a constraint term to guide the training of the modified FCN to effectively minimize the variance of the posterior probabilities for target regions between the MRI segmentation result and the ground truth.

The paper is structured as follows. Section 2 details our proposed method of cardiac MRI segmentation. Section 3 presents our experimental results and analysis. Section 4 draws the conclusions of our work.

2 Methodology

This section presents the details of our method. Firstly, the cardiac MRI data is preprocessed. A modified FCN is then trained directly on cardiac MRI and the corresponding ground truth. After the training stage, we obtain the model. Finally, we use the model to output the segmentation results. The flowchart of the proposed approach is shown in Figure 1.

Figure 1. A flowchart of the proposed method.

2.1 Preprocessing

The MRI data needs to be preprocessed prior to the training of the proposed architecture. The preprocessing stage includes the following four steps.

2.1.1 Dimension Transformation

In this work, the dataset from the Automated Cardiac Diagnosis Challenge (ACDC) is used. However, the large slice thickness (5 to 10 mm) of the 3D data impedes the generalization ability of the model and leads to insufficient connectivity information between adjacent slices [12]. Considering the above, we convert the 3D MRI data into 2D images by slicing them.

2.1.2 Size Normalization

The image size in the ACDC dataset varies from 154 × 224 to 428 × 512.
In order to facilitate the input of the neural network, the size of the 2D data is normalized to 256 × 256 by padding the residual regions with the minimum gray value of each image. Images whose X or Y dimension is larger than 256 are cropped to 256 along that dimension.

2.1.3 Data Augmentation

As each case of the 3D MRI data contains only approximately ten slices, training the model on so few images is likely to cause overfitting. To avoid this, rotations from 0° to 120° at a uniform interval of 15° are applied to augment the resized 2D images.

2.1.4 Pixel Intensity Normalization

The MRI datasets have a wide range of pixel intensities due to different scanner types or acquisition protocols, and this variety impedes the segmentation performance. Hence, the Z-score method is used to normalize the pixel intensity of the processed MRI, as follows:

$$z = \frac{x - \mu}{\sigma} \qquad (1)$$

where $z$ is the value after normalization, $x$ is the value before normalization, $\mu$ is the mean of the 2D image, and $\sigma$ denotes the standard deviation of the image.

2.2 Modified FCN

To realize the image-to-image idea, an architecture is designed to learn multi-layer feature representations and to directly output the MRI segmentation results at the resolution of the original images. Furthermore, a constraint scheme is introduced in the model training stage.

Figure 2. Architecture of the modified FCN.

2.2.1 Architecture

FCN achieves state-of-the-art performance in semantic segmentation mainly because of its feature representation for dense classification. FCN achieves end-to-end training by taking images as inputs and producing segmentation results as outputs. The FCN architecture contains two modules: a down-sampling path and an up-sampling path. The down-sampling path includes convolution and pooling layers, which are widely used in CNN models.
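As a rough illustration of the preprocessing steps of Section 2.1 (slicing, size normalization to 256 × 256, rotation angles, and Z-score normalization), the following NumPy sketch may help. The function names, the (H, W, slices) axis order, and the choice to center both the crop and the padding are assumptions made for illustration; the paper does not specify them. The rotation itself is omitted, and only the list of angles is shown.

```python
import numpy as np

TARGET = 256  # network input size stated in Section 2.1.2


def volume_to_slices(volume):
    """Split a 3D volume into 2D slices.
    Assumes shape (H, W, n_slices); the axis order is an assumption."""
    return [volume[:, :, k] for k in range(volume.shape[2])]


def pad_or_crop(img, target=TARGET):
    """Crop axes longer than `target`, then pad shorter axes with the
    minimum gray value of the image. Centering is an assumption."""
    pad_value = img.min()
    h, w = img.shape
    top = max((h - target) // 2, 0)
    left = max((w - target) // 2, 0)
    out = img[top:top + target, left:left + target]
    dh, dw = target - out.shape[0], target - out.shape[1]
    pads = ((dh // 2, dh - dh // 2), (dw // 2, dw - dw // 2))
    return np.pad(out, pads, mode="constant", constant_values=pad_value)


def z_score(img, eps=1e-8):
    """Eq. (1): subtract the image mean and divide by its standard deviation.
    `eps` guards against division by zero and is not part of Eq. (1)."""
    img = img.astype(np.float64)
    return (img - img.mean()) / (img.std() + eps)


# Augmentation angles: 0 to 120 degrees in steps of 15 degrees.
ROTATION_ANGLES = list(range(0, 121, 15))


def preprocess_volume(volume):
    """Slice, resize and normalize one volume (rotation not applied here)."""
    return [z_score(pad_or_crop(s)) for s in volume_to_slices(volume)]
```

Applied to a volume of shape (154, 300, 3), `preprocess_volume` returns three 256 × 256 arrays with approximately zero mean and unit standard deviation.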
The FCN model replaces the fully connected layers with convolutional layers, because fully connected layers may destroy the spatial structure of the image. The up-sampling path consists of convolutional and deconvolutional layers, which fuse the feature maps and upsample the fused feature maps into a probability map [12]. The FCN process is defined as follows:

$$L_i = L_{i-1} \otimes W_i + b_i \qquad (2)$$

where $L_i$ is the output of the $i$-th convolution or deconvolution layer, $W_i$ denotes the weight vector of the $i$-th convolution or deconvolution kernel, $b_i$ is the bias, and $\otimes$ is the convolution or deconvolution operation. An activation function is then applied, with which FCN can show satisfying performance. A combination of the output of the deconvolution layer and the corresponding convolution layer is used as the input of the following layer in the FCN up-sampling path. The up-sampling process is formulated as:

$$F = \begin{cases} \ldots, & \text{if } i = 1 \\ \ldots, & \text{if } i > 1 \end{cases} \qquad (3)$$

where $F$ is the output of the convolution or deconvolution layer and $O$ is the output of the corresponding convolution layer.

However, the traditional FCN only makes use of the high-layer features, which are effective in distinguishing between the target organ and other organs of a similar shape, while the low-layer features are very helpful for obtaining edge information for more accurate segmentation results [13]. Thus, FCN is modified to use multi-layer information.

The proposed architecture mainly includes a down-sampling path and an up-sampling path. The down-sampling path contains convolutional layers and max-pooling layers, designed to recognize semantic meaning based on abstract information. The up-sampling path contains convolutional and deconvolutional layers, which predict the segmentation results through different feature maps in hierarchical layers.
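The interplay of the two paths can be illustrated with a small NumPy sketch: 2 × 2 max-pooling for down-sampling, a 2× upsampling step, and the combination of the upsampled map with the encoder map of matching resolution. This is only a sketch under assumptions: nearest-neighbour upsampling stands in for the learned deconvolution, the combination is assumed to be a channel-wise concatenation, and all function names are hypothetical.

```python
import numpy as np


def max_pool_2x2(x):
    """2x2 max-pooling with stride 2 on a (H, W, C) map; H and W must be even."""
    h, w, c = x.shape
    return x.reshape(h // 2, 2, w // 2, 2, c).max(axis=(1, 3))


def upsample_2x(x):
    """Nearest-neighbour 2x upsampling.
    A fixed stand-in for the learned deconvolution layer (assumption)."""
    return x.repeat(2, axis=0).repeat(2, axis=1)


def fuse(decoder_map, encoder_map):
    """Upsample the decoder map and concatenate it with the encoder map of
    the same resolution along the channel axis (concatenation is assumed)."""
    return np.concatenate([upsample_2x(decoder_map), encoder_map], axis=-1)


# Toy feature map with 2 channels.
encoder = np.arange(4 * 4 * 2, dtype=float).reshape(4, 4, 2)
pooled = max_pool_2x2(encoder)   # shape (2, 2, 2): coarser, more abstract map
fused = fuse(pooled, encoder)    # shape (4, 4, 4): high- and low-layer channels
```

The fused map keeps the full-resolution encoder channels next to the coarser ones, which is the sense in which low-layer edge information and high-layer semantic information become available to the same subsequent layer.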
Furthermore, the outputs of the first and the second max-pooling layers are upsampled and concatenated with the outputs of the other up-sampling paths to make full use of the high-layer and the low-layer features simultaneously. The concatenated feature maps are then fed to the main classifier, which is a 1 × 1 convolutional layer. In this way, the contextual information is fused and the concatenated feature maps are refined. Finally, the MRI segmentation probability map is obtained from the main classifier, and the segmentation result is obtained by applying the argmax function. Figure 2 shows the architecture of the modified FCN.

2.2.2 Constraint Scheme

A novel constraint scheme is proposed to guide the modified FCN to minimize the variance of the posterior probabilities for target regions between the segmentation result and the ground truth. In this scheme, the loss function is defined as:

$$\mathrm{Loss} = \mathrm{Loss}_C + \mathrm{Loss}_R \qquad (4)$$

where $\mathrm{Loss}_C$ is the multi-class cross-entropy loss and $\mathrm{Loss}_R$ is the region loss. The modified FCN is trained to classify each pixel by minimizing the loss function. The multi-class cross-entropy loss is defined as:

$$\mathrm{Loss}_C = \sum_{m} L_{mce}\big(S(X_m), Y_m\big) \qquad (5)$$

where $L_{mce}$ is the multi-class cross-entropy, $X_m$ is the original image, $Y_m$ represents the ground truth, and $S(X_m)$ is the MRI segmentation probability map. Unlike the multi-class cross-entropy loss, which is based on the classification of each pixel in the image, the proposed region loss is based on the target region and is used to minimize the difference of the posterior probabilities for the target region between the predicted result and the ground truth. It is defined as:

$$\mathrm{Loss}_R = \lVert C_{P,m} - C_{G,m} \rVert_2^2 \qquad (6)$$

where $C_{P,m}$ and $C_{G,m}$ are the posterior probabilities of the target region (LV, RV and MYO) in the MRI segmentation result and the ground truth, respectively. $C_{P,m}$ and $C_{G,m}$ can be formulated as:

$$C_{P,m} = \frac{S_P}{S_W}, \qquad C_{G,m} = \frac{S_G}{S_W} \qquad (7)$$

where $S_W$ is the whole area of an image, $S_P$ denotes the target region (LV, RV and MYO) of the segmentation result, and $S_G$ represents the target region of the ground truth. By combining the multi-class cross-entropy loss and the region loss, the modified FCN can learn more representative features to produce satisfying segmentation results.

3 Experimental results and analysis

3.1 Dataset

The dataset from the Automated Cardiac Diagnosis Challenge (ACDC) is used [14]. It contains 100 cases of cardiac short-axis MRI, for which the corresponding manual delineations are provided. It is divided into five groups: normal cases (NOR), heart failure with infarction (MINF), dilated cardiomyopathy (DCM), hypertrophic cardiomyopathy (HCM), and abnormal right ventricle (ARV).

3.2 Implementation Details

The dataset is categorized into five groups according to the type of disease, and each group includes 20 subjects. We randomly select four subjects from each group, forming a total of 20 subjects as the test set, while the remaining 80 subjects form the training set. The proposed method is implemented using the TensorFlow framework and the model is trained on an Nvidia GTX 1080Ti GPU. To train the modified FCN, the Adam optimization method is adopted; the initial learning rate is set to $10^{-3}$ and is decreased by a polynomial decay with a power of 0.7.

3.3 Result and Quantitative Analysis

A comparison between the MRI segmentation result of the proposed method and the ground truth is shown in Figure 3. As can be seen from Figure 3, the proposed method achieves good agreement with the manual MRI segmentation result.

Figure 3. Visual comparison between results of the proposed method and the ground truth (panels: Ground Truth, Modified FCN).
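The loss of Eqs. (4)-(7) and the learning-rate schedule of Section 3.2 can be sketched in NumPy as follows. Several details are assumptions rather than the authors' implementation: the region loss is taken to be the squared difference of the area fractions, summed over the three target classes; the predicted area S_P is approximated by the sum of the class probabilities; class indices 1 to 3 are assumed to denote the target regions; and the polynomial decay is assumed to reach zero after a hypothetical `total_steps`.

```python
import numpy as np


def cross_entropy(probs, labels, eps=1e-12):
    """Mean multi-class cross-entropy over pixels.
    probs: (H, W, K) softmax output; labels: (H, W) integer class map."""
    k = probs.shape[-1]
    one_hot = np.eye(k)[labels]                          # (H, W, K)
    return float(-(one_hot * np.log(probs + eps)).sum(axis=-1).mean())


def region_loss(probs, labels, target_classes=(1, 2, 3)):
    """Squared difference between predicted and ground-truth area fractions,
    C_P = S_P / S_W and C_G = S_G / S_W, summed over the target classes.
    The squared form and the soft area S_P are assumptions."""
    s_w = labels.size                                    # whole image area
    loss = 0.0
    for c in target_classes:
        c_p = probs[..., c].sum() / s_w                  # predicted fraction
        c_g = (labels == c).sum() / s_w                  # ground-truth fraction
        loss += (c_p - c_g) ** 2
    return float(loss)


def total_loss(probs, labels):
    """Eq. (4): cross-entropy loss plus region loss."""
    return cross_entropy(probs, labels) + region_loss(probs, labels)


def polynomial_decay(step, total_steps, lr0=1e-3, power=0.7, end_lr=0.0):
    """Polynomial learning-rate decay from lr0 to end_lr.
    `total_steps` and `end_lr` are hypothetical; the paper gives only
    the initial rate (1e-3) and the power (0.7)."""
    frac = min(step, total_steps) / total_steps
    return (lr0 - end_lr) * (1.0 - frac) ** power + end_lr
```

With a prediction that matches the ground truth exactly, both terms vanish (up to the numerical `eps`), whereas a prediction with the right per-pixel confidence but the wrong overall area for a target region is penalized by the region term.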
To quantitatively analyze the performance of the proposed method, three criteria, the mean Dice Similarity Coefficient (DSC), the Hausdorff Distance (HD) and the Sensitivity (SEN), are used to evaluate the segmentation performance. The related formulas are:

$$\mathrm{DSC} = \frac{2\,\lvert S_R \cap S_{GT} \rvert}{\lvert S_R \rvert + \lvert S_{GT} \rvert} \qquad (8)$$

where $S_R$ represents the result of the MRI segmentation and $S_{GT}$ is the ground truth. This index mainly measures the pixel overlap ratio.

$$\mathrm{HD} = \max\Big( \max_{x \in C_R} \min_{y \in C_{GT}} d(x, y),\; \max_{y \in C_{GT}} \min_{x \in C_R} d(x, y) \Big) \qquad (9)$$

where $C_R$ is the contour of the segmentation result, $C_{GT}$ represents the contour of the ground truth, and $d(x, y)$ is the distance between the two points. HD is the largest distance from a point on one contour to the closest point on the other contour.

$$\mathrm{SEN} = \frac{TP}{TP + FN} \qquad (10)$$

where $TP$ is the number of true positives of the pixel classification and $FN$ is the number of false negatives. SEN evaluates the ability of the proposed method to distinguish the target region.

The performance of the state-of-the-art solutions and our solution is shown in Tables 1, 2 and 3, respectively. FCN [7] is a classic neural network architecture for semantic segmentation. U-NET [15] is an effective architecture for medical image segmentation which adds skip connections between the feature maps. Chang's method [16] adopts YOLO and FCN to perform the cardiac MRI segmentation and achieves an outstanding segmentation result. As seen from the three tables, our approach exhibits excellent performance. Our modified FCN also outperforms FCN in terms of all three evaluation criteria. In the LV region, the modified FCN improves the DSC by 4.4 percentage points (0.877 to 0.921). In the RV region, our method brings a 5.7-point improvement over FCN in terms of DSC (0.826 to 0.883). In MYO, the DSC increases by 4.9 points (0.841 to 0.890). These comparisons demonstrate that our modified FCN effectively improves the performance of cardiac MRI segmentation compared to the conventional FCN.

As shown in Table 1, our approach achieves the best results among the compared methods in the LV segmentation. The modified FCN reaches 92.1%, 10.126 and 93.0% in terms of DSC, HD, and SEN, respectively.

Table 2 shows that the modified FCN is superior to FCN and U-NET in the RV MRI segmentation in terms of the three criteria, but it is still inferior to Chang's method in HD (11.843 versus 9.857). This is because the shape of the RV varies with the breathing movement, which makes it difficult for our method to handle the variation in the RV shape with the limited amount of data. Chang's method uses YOLO to detect the heart region, which improves the ability of the model to recognize the varying RV shapes. In our future work, we will explore semi-supervised methods to acquire more useful features for handling the shape-variation problem.

As seen from Table 3, the modified FCN achieves the best results among the compared methods for the MYO segmentation. Compared with the LV region, MYO has a smaller area, which makes it difficult to perform the MRI segmentation correctly. As a consequence, the MYO evaluation result is lower than that of LV, as shown in Tables 1 and 3. Unlike the RV shape, the MYO shape does not change frequently. Therefore, the HD of MYO is better (lower) than the HD of RV.

Table 1. Performance comparison of the LV MRI segmentation on the test set.
Method                DSC     HD       SEN
Our Method            0.921   10.126   0.930
FCN [7]               0.877   11.261   0.869
U-NET [15]            0.901   11.002   0.891
Chang's Method [16]   0.919   10.452   0.909

Table 2. Performance comparison of the RV MRI segmentation on the test set.
Method                DSC     HD       SEN
Our Method            0.883   11.843   0.909
FCN [7]               0.826   12.778   0.864
U-NET [15]            0.862   12.214   0.897
Chang's Method [16]   0.869   9.857    0.904

Table 3. Performance comparison of the MYO MRI segmentation on the test set.
Method                DSC     HD       SEN
Our Method            0.890   10.663   0.911
FCN [7]               0.841   12.316   0.876
U-NET [15]            0.882   10.972   0.903
Chang's Method [16]   0.879   10.857   0.904

4 Conclusion

To address the challenge of accurately segmenting the LV, RV and MYO simultaneously, the paper proposes a modified FCN that combines multi-layer feature representations. A region loss function is introduced to minimize the difference in the posterior probabilities for the target region between the MRI segmentation result and the ground truth, which helps the network achieve a satisfying performance. Our experimental results on the ACDC dataset demonstrate that the proposed method achieves good agreement with the manual MRI segmentation results for LV, RV and MYO and outperforms the compared approaches in terms of the three criteria, with the exception of the HD for the RV (Table 2). We believe our method can provide a reliable quantitative evaluation of cardiac function.

Acknowledgement

The authors acknowledge the support of the National Natural Science Foundation of China (61572231).

References

[1] Caroline Petitjean, Jean-Nicolas Dacher, 2011. A review of segmentation methods in short axis cardiac MR images. Medical Image Analysis, 15(2), pp. 169-184.
[2] P. Peng, K. Lekadir, A. Gooya, 2016. A review of heart chamber segmentation for structural and functional analysis using cardiac magnetic resonance imaging. Magnetic Resonance Materials in Physics, Biology and Medicine, 29(2), pp. 155-195.
[3] Noel C. F. Codella, J. W. Weinsaft, M. D. Cham, M. Janik, M. R. Prince, and Y. Wang, 2008. Left ventricle: automated segmentation by using myocardial effusion threshold reduction and intravoxel computation at MR imaging. International Journal of Medical Radiology, 248(3), pp. 1004-1012.
[4] Chamchai Pluempitiwiriyawej, José M. F. Moura, Yi-Jen Lin Wu, and Chien Ho, 2005.
STACS: new active contour scheme for cardiac MR image segmentation. IEEE Transactions on Medical Imaging, 24(5), pp. 593-603.
[5] Tuan Anh Ngo, Zhi Lu, and Gustavo Carneiro, 2017. Combining deep learning and level set for the automated segmentation of the left ventricle of the heart from cardiac cine magnetic resonance. Medical Image Analysis, 35, pp. 159-171.
[6] M. Avendi, Arash Kheradvar, Hamid Jafarkhani, 2016. A combined deep-learning and deformable-model approach to fully automatic segmentation of the left ventricle in cardiac MRI. Medical Image Analysis, 30, pp. 108-119.
[7] Jonathan Long, Evan Shelhamer, Trevor Darrell, 2014. Fully convolutional networks for semantic segmentation. IEEE Transactions on Pattern Analysis & Machine Intelligence, 39(4), pp. 640-651.
[8] Gongning Luo, Ran An, Kuanquan Wang, Suyu Dong, and Henggui Zhang. "A deep learning network for right ventricle segmentation in short-axis MRI." In 2016 Computing in Cardiology Conference (CinC), pp. 485-488, IEEE, 2016.
[9] Mahendra Khened, Varghese Alex, and Ganapathy Krishnamurthi. "Densely connected fully convolutional network for short-axis cardiac cine MR image segmentation and heart diagnosis using random forest." International Conference on Medical Image Computing and Computer-Assisted Intervention, pp. 140-151, Quebec, Canada, 2017.
[10] Jay Patravali, Shubham Jain, Sasank Chilamkurthy. "2D-3D fully convolutional neural networks for cardiac MR segmentation." International Conference on Medical Image Computing and Computer-Assisted Intervention, pp. 130-139, Quebec, Canada, 2017.
[11] Marc-Michel Rohé, Maxime Sermesant, Xavier Pennec. "Automatic multi-atlas segmentation of myocardium with SVF-Net." International Conference on Medical Image Computing and Computer-Assisted Intervention, pp. 170-177, Quebec, Canada, 2017.
[12] Yeonggul Jang, Yoonmi Hong, Seongmin Ha, Sekeun Kim, and Hyuk-Jae Chang. "Automatic segmentation of LV and RV in cardiac MRI."
In International Workshop on Statistical Atlases and Computational Models of the Heart, pp. 161-169, 2017.
[13] Hao Chen, Xiaojuan Qi, Jie-Zhi Cheng, and Pheng-Ann Heng. "Deep contextual networks for neuronal structure segmentation." In Thirtieth AAAI Conference on Artificial Intelligence, pp. 1167-1173, 2016.
[14] Olivier Bernard, Alain Lalande, Clement Zotti, Frederick Cervenansky, Xin Yang, Pheng-Ann Heng, et al., 2018. Deep learning techniques for automatic MRI cardiac multi-structures segmentation and diagnosis: Is the problem solved? IEEE Transactions on Medical Imaging, 37(11), pp. 2514-2525.
[15] Olaf Ronneberger, Philipp Fischer, and Thomas Brox. "U-Net: Convolutional networks for biomedical image segmentation." In International Conference on Medical Image Computing and Computer-Assisted Intervention, pp. 234-241, 2015.
[16] Yakun Chang, Baoyu Song, Cheolkon Jung, and Liyu Huang. "Automatic segmentation and cardiopathy classification in cardiac MRI images based on deep neural networks." In 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 1020-1024, IEEE, 2018.

Xinyu Yang received his BSc degree in Communication Engineering from the University of Jinan, China, in 2005. He is a master's student majoring in computer technology at the Shandong Provincial Key Laboratory of Network Based Intelligent Computing, University of Jinan. His current research interests include biomedical image/signal processing and deep learning.

Yuan Zhang is currently a Professor with the College of Electronics and Information Engineering, Southwest University, China. Dr. Zhang's research interests include wearable sensing for smart health, machine learning for auxiliary diagnosis, and biomedical big data analysis. As the first author or corresponding author he has published more than 60 peer-reviewed papers in international journals and conference proceedings, a book chapter, and nine patents.
He is a member of the IEEE EMBS Wearable Biomedical Sensors and Systems Technical Committee. He is an associate editor for IEEE Reviews in Biomedical Engineering and IEEE Access. He has served as a Leading Guest Editor for six special issues of IEEE, Elsevier, Springer and InderScience publications. He is a senior member of IEEE and ACM.

Yingming Sun is an Associate Professor at the University of Jinan (UJN). He is currently the director of the Engineering Consulting Research Center of UJN, an international FIDIC certified consulting engineer and a nationally registered consulting engineer. His research work meets the needs of the Information Science Department and focuses on practical applications. Professor Sun's primary research interests include big data information processing, network applications, and engineering consulting. His publications have won the first prize of the Tenth Excellent Academic Paper Award of the Shandong Electrical Engineering Association and the second prize of the Provincial Engineering Consulting Association "Shandong Province Excellent Engineering Consulting Achievement". He has edited two textbooks and published over 20 academic papers.

Anton Kos received his Ph.D. degree in electrical engineering from the University of Ljubljana, Slovenia, in 2006. He is an assistant professor at the Faculty of Electrical Engineering, University of Ljubljana. He is a member of the Laboratory of Information Technologies at the Department of Communication and Information Technologies. His teaching and research work includes communication networks and protocols, quality of service, dataflow computing and applications, biofeedback systems and applications, signal processing, and information systems. He is the author or coauthor of more than thirty papers that have appeared in international engineering journals and of more than fifty papers presented at international conferences. He is a senior member of IEEE and since 2018 he has been the IEEE Slovenia ComSoc chapter chair.