A novel denoising algorithm for medical images based on non-convex non-local similarity adaptive regularization

Sparse representation is a powerful statistical image modelling technique that has been successfully applied to image denoising. For a given patch, a non-convex non-local similarity adaptive method is adopted for the sparse representation of images. First, an autoregressive model is used to perform dictionary learning from sample patch datasets. Second, the sparse representation of the image introduces non-convex non-local self-similarity as the regularization term. To make better use of the sparse regularization method for image denoising, the parameters used in this study are estimated with adaptive methods. The resulting model is more efficient and accurate than the K-means singular value decomposition (KSVD) algorithm, the group-sparsity total variation (GSTV) algorithm, the adaptive sparse domain selection (ASDS) algorithm, feed-forward denoising convolutional neural networks (DnCNN), the fast and flexible convolutional neural network denoising method (FFDNet), and the operator-splitting algorithm for minimizing the Euler elastica functional (OSEEF). Image noise-reduction experiments confirm that, with the adaptive regularization method, the results in peak signal-to-noise ratio (PSNR) and visual quality are better than those of the other algorithms.

local sparse region. In this study, we introduce two adaptive regularization terms [20] into the sparse representation. First, we learn from a set of samples and establish an autoregressive model [21,22]. The local structure of the image is adaptively adjusted by selecting, for each given patch, the best-fitting autoregressive model [23]. Second, non-local self-similarity is introduced as another regularization term. To recover the image better, an adaptive method is used to obtain the sparse regularization parameters. A large number of image denoising experiments show that the new adaptive sparse domain selection and adaptive regularization method is superior to many advanced algorithms in PSNR and visual perception. On the other hand, experimental studies show that non-convex optimization methods for denoising have high computational complexity, but their accuracy is often higher than that of convex optimization methods [24]. We therefore seek a more efficient algorithm that exploits the high correlation between the sparse coefficients and the non-convex optimization. Since each singular value in the optimization problem has its own physical meaning, singular values should be treated differently in any practical problem [25]. In this paper, a non-convex non-local denoising algorithm is proposed. The low-rank surrogate model based on the log(|⋅| + ε) function can adaptively assign weights to different singular values [26]. Many experimental studies show that the denoising results of the non-convex optimization model are more accurate than those of the convex optimization model, although the algorithm is slightly more complex and needs more computation time [27].
In this paper, we propose an adaptive sparse domain selection (ASDS) scheme for sparse representation. A compact set of sub-dictionaries is established by learning from high-quality image patches. The patches of the sample images are divided into many clusters, each cluster consisting of patches with similar patterns. A sub-dictionary is then learned for each cluster, forming a compact set of sub-dictionaries. Specifically, we use principal component analysis (PCA) as the dictionary-learning technique for each sub-dictionary. To encode an image patch, the sub-dictionary most relevant to the given patch is selected. Because a given patch can be better represented by an adaptively selected sub-dictionary, the image can be reconstructed more accurately than with a universal dictionary, as we show in the experiments.
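As an illustration, the clustering-plus-PCA construction of sub-dictionaries can be sketched in a few lines. This is a minimal sketch and not the paper's implementation: the function name `learn_sub_dictionaries`, the simple k-means loop, and the parameter defaults are all assumptions made for illustration.

```python
import numpy as np

def learn_sub_dictionaries(patches, K=3, iters=10, seed=0):
    """Cluster patch vectors with a simple k-means, then learn one
    PCA sub-dictionary per cluster (illustrative sketch)."""
    rng = np.random.default_rng(seed)
    centroids = patches[rng.choice(len(patches), K, replace=False)]
    for _ in range(iters):
        # assign each patch to its nearest centroid
        d = np.linalg.norm(patches[:, None, :] - centroids[None], axis=2)
        labels = d.argmin(axis=1)
        for k in range(K):
            if np.any(labels == k):
                centroids[k] = patches[labels == k].mean(axis=0)
    sub_dicts = []
    for k in range(K):
        cluster = patches[labels == k] - centroids[k]
        # PCA basis of the cluster: right singular vectors of the
        # centred cluster matrix (columns = principal directions)
        _, _, vt = np.linalg.svd(cluster, full_matrices=False)
        sub_dicts.append(vt.T)
    return centroids, sub_dicts
```

Each returned sub-dictionary is an orthonormal basis adapted to one cluster of patch patterns, which is the compact structure the ASDS scheme relies on.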
We propose a piecewise autoregressive model [28] that uses pre-trained data sets to describe the structural features of local image regions. Depending on the given local patch, one or more autoregressive models are adaptively chosen to regularize the solution space. In addition, because an image often contains many repetitive structures, it is very helpful to introduce a non-local (NL) self-similarity constraint as another regularization term to preserve sharp edges and suppress noise [29][30][31].
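A piecewise AR model of this kind reduces, for each patch class, to an ordinary least-squares fit of each pixel against its causal neighbours. The following sketch fits a single AR model to one image; the name `fit_ar_model` and the four-neighbour support are illustrative choices, not the paper's exact configuration.

```python
import numpy as np

def fit_ar_model(image, order_offsets=((0, -1), (-1, 0), (-1, -1), (-1, 1))):
    """Fit a linear autoregressive model x[i,j] ~ sum_k a_k * x[i+di, j+dj]
    over all interior pixels by least squares (illustrative sketch)."""
    H, W = image.shape
    rows, cols = np.mgrid[1:H - 1, 1:W - 1]
    targets = image[rows, cols].ravel()
    # one column of predictors per neighbour offset
    preds = np.column_stack([
        image[rows + di, cols + dj].ravel() for di, dj in order_offsets
    ])
    coeffs, *_ = np.linalg.lstsq(preds, targets, rcond=None)
    return coeffs
```

In the piecewise setting, one such coefficient vector would be trained per cluster of patches, and the best-fitting model for a given patch then acts as the local regularizer.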
We introduce ASDS and adaptive regularization (AR) [23] into the sparse-representation-based image restoration (IR) framework, and propose an effective iterative shrinkage (IS) algorithm to solve the minimization problem [32]; the local sparsity of the image is estimated adaptively to adjust the sparse regularization.

GENERAL IMAGE DENOISING MODEL
The goal of image denoising is to reconstruct a high-quality (denoised) image x from a low-quality image y that contains noise [33][34][35]. Image denoising is typically an ill-posed inverse problem, which can be modelled as

y = x + v,    (1)

where x is the noise-free image, y is the noisy image, and v is additive noise. Because of the ill-posedness of image denoising, we adopt the l2-norm constraint, i.e., to solve Equation (1) for x,

x̂ = arg min_x ‖y − x‖₂²,    (2)

where x̂ is the denoised image. However, the solution to Equation (2) is usually not unique. In this study, a priori knowledge of medical images is used to regularize the image denoising problem. The most commonly used regularization model is the total variation (TV) model [31,32], shown below:

x̂ = arg min_x { ‖y − x‖₂² + λ|∇x|₁ },    (3)

where |∇x|₁ is the l1-norm of the first derivative of the image, and λ > 0 is a constant. A medical image can be sparsely represented using an atomic dictionary [33], such as the discrete cosine transform (DCT) dictionary. Using such a dictionary Φ, a representative coefficient vector α can be found such that x ≈ Φα as the approximation error approaches zero (‖x − Φα‖₂ → 0). According to the a priori sparsity, x can be estimated from y by solving the following l0-norm minimization problem:

α̂ = arg min_α { ‖y − Φα‖₂² + λ‖α‖₀ },    (4)

where the l0-norm counts the number of nonzero elements of the vector α. Once α̂ is determined, x̂ can be computed as x̂ = Φα̂. The l0-norm minimization problem is a typical non-deterministic polynomial-time (NP)-hard problem. Thus, the l0-norm is usually replaced by the l1-norm in solving the sparse coding problem:

α̂ = arg min_α { ‖y − Φα‖₂² + λ‖α‖₁ }.    (5)

Recent studies have shown that a weighted l1 sparse regularization term can achieve better image restoration [34]. This is because there are many repetitive substructures within an image; a non-local (NL) [29] self-similarity constrained regularization term is very effective in preserving edges and suppressing noise during image restoration.
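For intuition, when the dictionary Φ is orthonormal (e.g. the DCT basis), the l1 problem of Equation (5) has a closed-form solution: soft-threshold the analysis coefficients Φᵀy at λ/2. The sketch below illustrates this; the function names are mine, not the paper's.

```python
import numpy as np

def dct_matrix(n):
    """Orthonormal DCT-II basis as an n x n matrix (rows are atoms)."""
    k = np.arange(n)
    Phi = np.cos(np.pi * (2 * k[None, :] + 1) * k[:, None] / (2 * n))
    Phi[0] *= 1 / np.sqrt(2)
    return Phi * np.sqrt(2.0 / n)

def soft(v, t):
    """Soft-thresholding (shrinkage) operator."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def sparse_denoise(y, lam):
    """Closed-form solution of min ||y - Phi a||_2^2 + lam * ||a||_1
    for an orthonormal dictionary: shrink the analysis coefficients."""
    Phi = dct_matrix(len(y))
    a = soft(Phi @ y, lam / 2.0)   # analysis, then shrinkage
    return Phi.T @ a, a            # synthesis: x_hat = Phi^T a
```

For a learned, overcomplete dictionary the closed form no longer applies, which is why iterative shrinkage algorithms are used instead.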
For example, let x be a signal to be coded, Φ a given dictionary, and α the sparsely coded coefficient vector, that is, the majority of the coefficients are almost zero, so that x ≈ Φα. If the sparsity, or the l0-norm of α, can be measured, that is, the number of non-zero elements of α can be found, the sparse coding problem can be expressed as

α̂ = arg min_α ‖α‖₀  s.t.  x ≈ Φα.    (6)
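Besides the l1 relaxation, the NP-hard l0 problem is often approximated greedily. A common greedy solver is orthogonal matching pursuit (OMP); the sketch below is a textbook version for illustration, not the solver used in this paper.

```python
import numpy as np

def omp(Phi, y, sparsity):
    """Greedy orthogonal matching pursuit: approximate the l0 problem
    min ||a||_0 s.t. y ~ Phi a by selecting `sparsity` atoms."""
    residual = y.copy()
    support = []
    for _ in range(sparsity):
        # pick the atom most correlated with the current residual
        j = int(np.argmax(np.abs(Phi.T @ residual)))
        if j not in support:
            support.append(j)
        # re-fit the coefficients on the chosen support by least squares
        coef, *_ = np.linalg.lstsq(Phi[:, support], y, rcond=None)
        residual = y - Phi[:, support] @ coef
    a = np.zeros(Phi.shape[1])
    a[support] = coef
    return a
```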

Self-adaptive selection of sub-dictionaries
For each subset S_k of the dictionary Φ_k, the centroid μ_k of the cluster C_k associated with S_k is calculated. In sparsity-based image denoising schemes, a self-adaptive sparse domain is generated for each self-adaptive sub-dictionary of x [1]. If x̂ represents an estimate of x and x̂_i represents the i-th patch of x̂, then the best sub-dictionary index k_i for representing x̂_i can be chosen by comparing the high-pass filtered patch x̂ᵢʰ [35] with the cluster centroids μ_k, as shown in Equation (7):

k_i = arg min_k ‖x̂ᵢʰ − μ_k‖₂.    (7)

Let U = [μ₁, μ₂, …, μ_K] be the matrix containing all centroids. The covariance matrix of U can be obtained using the singular value decomposition (SVD) method, and the principal component analysis (PCA) transformation matrix Φ_c of U then follows [36,30]. The distance between x̂ᵢʰ and μ_k in the subspace given by Φ_c can be calculated as

k_i = arg min_k ‖Φ_c x̂ᵢʰ − Φ_c μ_k‖₂.    (8)

Using Equations (7) and (8) improves the stability of the selected self-adaptive dictionary. By solving the l0-norm regularized minimization problem (Equation (4)) for α̂, we obtain a corrected estimate of the denoised image as x̂ = Φ ∘ α̂. This process is iterated until the estimate x̂ converges. The final output is the denoised image x̂.
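The selection rule of Equations (7) and (8) amounts to a nearest-centroid search in the PCA-projected high-pass domain. A minimal sketch follows, with mean removal standing in for the high-pass filtering step; the function name and this simplification are mine.

```python
import numpy as np

def select_sub_dictionary(patch, centroids, pca_matrix):
    """Pick the sub-dictionary index k minimising the distance between
    the (crudely) high-pass-filtered patch and the cluster centroids,
    measured in the PCA subspace (sketch of Equations (7)-(8))."""
    # crude high-pass stand-in: remove the patch mean
    hp = patch - patch.mean()
    # project all centroid differences with the PCA matrix
    diffs = pca_matrix @ (centroids - hp).T
    return int(np.argmin(np.linalg.norm(diffs, axis=0)))
```

Comparing in the PCA subspace discards directions in which the centroids barely differ, which is what makes the selection more stable than a raw pixel-space distance.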

Self-adaptive re-weighted sparse regularization
In Equation (5), λ is the coefficient of the l1-norm sparse regularization term ‖α‖₁. The local sparsity and the weighted l1 sparsity of the image are estimated self-adaptively by solving the following weighted l1 sparse regularization minimization problem [37]:

α̂ = arg min_α { ‖y − Φα‖₂² + ∑ᵢ ∑ⱼ λ_{i,j} |α_{i,j}| },    (9)

where α_{i,j} is the coefficient associated with the j-th atom of the dictionary Φ_{k_i}, and λ_{i,j} is the weight of coefficient α_{i,j}.
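A standard way to realize such adaptive weights, in the spirit of re-weighted l1 minimization, is to alternate between computing λ_{i,j} = 1/(|α̂_{i,j}| + ε) and a weighted soft-thresholding step. The sketch below applies this to a plain coefficient vector; it illustrates the weighting idea only, not the full objective of Equation (9), and the function name is mine.

```python
import numpy as np

def soft(v, t):
    """Element-wise soft-thresholding operator."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def reweighted_l1(coeffs, lam=0.2, eps=1e-2, iters=3):
    """Iteratively re-weighted shrinkage: large coefficients receive
    small weights (penalised less), small ones receive large weights
    and are driven to zero (illustrative sketch)."""
    a = coeffs.copy()
    for _ in range(iters):
        w = 1.0 / (np.abs(a) + eps)   # adaptive weights lambda_{i,j}
        a = soft(coeffs, lam * w)     # weighted soft threshold
    return a
```

The effect is that the penalty approximates the l0-norm more closely than a flat l1 penalty: significant coefficients are preserved almost unshrunk, while small ones are suppressed.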

Spatial self-adaptive regularization
Self-adaptive sub-dictionaries are usually used to encode a given image patch. In this study, we use a self-adaptive regularization model to constrain the input image patches. Based on this model, image local correlation and non-local similarity are used for image denoising. The selection of the self-adaptive regularization model for each patch x_i is the same as the selection of the sub-dictionary for x_i. The output x̂ᵢʰ is computed once the estimate x̂_i of x_i is available, and is then assigned directly to the patch x_i. By adding this constraint to the weighted l1-norm sparse regularization minimization problem (Equation (9)), we obtain a new objective function, Equation (10) [23], where γ is the coefficient of the self-adaptive regularization term. Minimizing this sum yields a convex optimization problem that is easy to solve [27]. However, since each singular value in the optimization problem has its own physical meaning, singular values should be treated differently in any practical problem [26]. The convex formulation treats all singular values equally, which greatly limits the capability and flexibility of the algorithm. This article presents a non-convex non-local denoising algorithm. The surrogate model based on the log(|⋅| + ε) function can adaptively assign weights to different singular values [25,26]. Many experimental studies show that the reconstruction results of the non-convex optimization model are more accurate than those of the convex optimization model, although the algorithm is slightly more complex and requires more computation time [25]. Instead of solving Equation (10), our aim becomes solving the optimization problem of Equation (11); with the log surrogate, Equation (11) reduces to the form of Equation (12), whose low-rank regularization term is

∑_{j=1}^{n₀} log(σ_j(L) + ε),

where n₀ = min{N, n} and σ_j(L) represents the j-th singular value of L. For simplicity, we will write σ_j for the j-th singular value of L.
Although ∑_{j=1}^{n₀} log(σ_j + ε) is non-convex, a local minimization method can be used effectively to solve the non-convex problem. The singular values of L can be easily obtained according to [26], which reduces Equation (12) to the form of Equation (13).

FIGURE 1 Four sets of high-quality images are used to train sub-dictionaries and AR models. The top row shows the Shepp-Logan MRI images that are used to train sub-dictionaries. The second row shows the brain MRI images. The third row shows the low-resolution (LR) cerebrovascular MRI images. The fourth row shows the high-resolution (HR) chest CT images

For convenience of expression, we rewrite the third term using the identity matrix I; Equation (13) can then be rewritten as Equation (14). Any image-denoising algorithm based on the adaptive regularization model uses the local statistical information in each image patch. On the other hand, there are many repetitive patches in a medical image, and this non-local spatial redundancy is very helpful in improving the quality of the denoised images. The result is a re-weighted l1-norm minimization problem, which can be effectively solved by the iterative shrinkage algorithm [19,38].
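The effect of the log surrogate on the singular values can be illustrated with a single weighted singular-value-thresholding step, in which each σ_j is shrunk in proportion to the gradient 1/(σ_j + ε) of the log penalty, so large singular values are penalized less than small ones. This is an illustrative sketch of the weighting principle under my own parameter names, not the paper's full iterative solver.

```python
import numpy as np

def log_weighted_svt(L, tau=1.0, eps=1e-2):
    """One weighted singular-value-thresholding step for the non-convex
    surrogate sum_j log(sigma_j + eps): each sigma_j is shrunk by
    tau / (sigma_j + eps), the gradient of the log penalty (sketch)."""
    U, s, Vt = np.linalg.svd(L, full_matrices=False)
    w = 1.0 / (s + eps)                    # adaptive per-value weights
    s_new = np.maximum(s - tau * w, 0.0)   # weighted shrinkage
    return (U * s_new) @ Vt
```

Because the weight grows as σ_j shrinks, small (noise-dominated) singular values are thresholded to zero while dominant structural components survive almost intact, which is exactly the differentiated treatment of singular values motivated above.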

Experimental results and analysis
In this study, different types of images are used to train the dictionaries. The microstructure of an image can be represented by a small number of patches, and the sparse coding scheme conforms to the human visual system: a small number of basis functions from an overcomplete set is used to encode an image. Therefore, a dictionary suitable for natural images can be trained from a few training images.
To illustrate the stability of the model with respect to the training data set, we used four different sets of training images in the experiments; each set has five high-quality images, as shown in Figure 1, where each row represents a set of five high-quality images. The four sets of training images have quite different structures. We randomly select a set of image patches of 7 × 7 pixels from the training images. The newly developed algorithm is compared with the KSVD algorithm [32], the GSTV algorithm [20], the ASDS algorithm [23], the DnCNN algorithm [39], the FFDNet algorithm [40] and the OSEEF algorithm [18] in the denoising of medical images. The non-local low-rank regularization method allows us to efficiently use similar patches of a sparse group and to minimize the non-convexity of the low-rank model, providing an effective method to recover images from the noisy patches.
Because of the particularity of the human visual system, we can use a sparse coding scheme to represent images; patches with rich edge and texture characteristics can therefore be trained from an image. In the following denoising experiments, we used the Shepp-Logan (SL) image and images with added normally distributed random noise of mean 0 and standard deviations (SD) of 15, 20 and 30, respectively, to investigate the sensitivity of the new algorithm's performance to the noise level.

FIGURE 2 Denoised images by the KSVD algorithm [32], the GSTV algorithm [20], the ASDS algorithm [23], the DnCNN algorithm [39], the FFDNet algorithm [40], the OSEEF algorithm [18] and the present new algorithm, respectively, of the corresponding noisy images in column (b)

FIGURE 3 Same as Figure 2 but the image denoising was performed on the high-resolution (HR) brain MRI images with different levels of random noise

In the denoising algorithm, we empirically set the two regularization coefficients to 0.079 and 0.144, and set λ_{i,j} = 1/(|α̂_{i,j}| + ε), where α̂_{i,j} is the estimate of α_{i,j} and ε is a small constant.
To evaluate the effectiveness of the proposed algorithm in image denoising, we first compare, in Figures 2-5, the denoising results of different images with different levels of random noise obtained by the proposed method and by other publicly available algorithms. Figure 2 shows the comparison of the present new algorithm with other algorithms in denoising Shepp-Logan images with normally distributed random noise of different levels. Images in column (b), from top to bottom, show the noisy images with normally distributed random noise of mean 0 and standard deviations of 15, 20 and 30, respectively, added to the original images shown in column (a). The images of the first row (from the top) of panels (c)-(g) are the denoised images by the KSVD algorithm, the GSTV algorithm, the ASDS algorithm, the DnCNN algorithm, the FFDNet algorithm, the OSEEF algorithm [18] and the present new algorithm, respectively. From these images we can see that the present new algorithm performed better than the KSVD, GSTV, ASDS, DnCNN, FFDNet and OSEEF algorithms. Moreover, the image denoised by the present new algorithm has no obvious edge blurring, while those by the other algorithms show visible edge blurring. The images denoised by the KSVD and GSTV methods have less serrated effect than those by the other algorithms. The image denoised by the ASDS algorithm loses information in heterogeneous areas with more surface features and textures. The images denoised by the DnCNN, FFDNet and OSEEF algorithms are relatively blurry.

FIGURE 5 Intercomparison of different algorithms in denoising high-resolution (HR) chest CT images with different levels of random noise
The present new algorithm has an intrinsic noise suppression capacity, so areas of high-frequency change may be suppressed. However, the difference image in Figure 2 shows that this suppression is not obvious. Comparing the magnified parts (lower-left corner inset in each image) of the denoised results, we can see that the present new algorithm retains more image detail. Figure 3 shows the comparison of the present new algorithm with other algorithms in denoising high-resolution (HR) brain MRI images with normally distributed random noise of different levels. The KSVD and GSTV algorithms show larger differences from the original image than the other algorithms. The image denoised by the ASDS method has palpable edge blurring, and more details are lost in regions with many edges.
The image denoised by the GSTV algorithm shows an obvious serrated effect. The image reconstructed by the ASDS algorithm shows data loss in areas that have more surface features and edges. The DnCNN, FFDNet and OSEEF algorithms produce over-smoothed edges and textures. The image denoised by the present new algorithm shows overall better visual quality and fidelity to the original image. Figure 4 shows the comparison of the present new algorithm with other algorithms in denoising low-resolution (LR) cerebrovascular MRI images with normally distributed random noise of different levels. The image denoised by the KSVD algorithm has the largest difference from the original image among the algorithms, followed by those of the GSTV and ASDS algorithms; the least difference from the original image was obtained by the present new algorithm. The KSVD result has substantial edge blurring and more data loss in regions with many edges; compared to the original image, more details were lost in the denoised image. The image denoised by the ASDS algorithm shows a loss of fine surface structures and edges, and also shows edge blurring. The images denoised by the DnCNN and FFDNet algorithms show unexpected stenosis of blood vessels, and the OSEEF algorithm leads to some undesired artefacts. In the image reconstructed by the present new algorithm, the partially enlarged area (lower-left corner) shows a good visual effect, and details of edges and non-smooth areas are well retained. Figure 5 shows the comparison of the present new algorithm with other algorithms in denoising high-resolution (HR) chest CT images with normally distributed random noise of different levels. The denoised images show that the result of the KSVD algorithm deviates most from the original image, while that of the present new algorithm deviates least.
For the GSTV and ASDS algorithms, the partially enlarged parts (inset at the lower-left corner of each image) of the denoised images show more obvious edge blurring. The image reconstructed by the GSTV algorithm shows an obvious serrated effect. The image denoised by the ASDS algorithm also shows blurring in inhomogeneous areas. The images denoised by the DnCNN and FFDNet algorithms show a loss of local information. For the present new algorithm, the denoised image has good visual quality, no visible suppression of details, and well-preserved object boundaries.
To quantify the effectiveness of the proposed algorithm in image denoising, we used two quantitative picture quality indices (PQI) for performance evaluation: the peak signal-to-noise ratio (PSNR) and the structural similarity (SSIM) [41]. The values of PSNR and SSIM from the denoising experiments shown in Figures 2-5 are summarized in Table 1. From Table 1 and Figures 2-5 we can see that the new method produces a better visual effect. Compared to the original images, which are assumed to be noise free, the edges of objects were found to be well preserved in the restored images. All these results (higher PSNR and SSIM values, better visual effects and edge preservation) indicate that the newly proposed method has better denoising performance, in terms of effectiveness and accuracy, than the KSVD, GSTV, ASDS, DnCNN, FFDNet and OSEEF algorithms.
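Both indices can be computed directly from their definitions. The sketch below implements PSNR and a single-window (global) SSIM; the standard SSIM index averages the same expression over local windows, so `ssim_global` is a deliberate simplification for illustration.

```python
import numpy as np

def psnr(x, y, peak=255.0):
    """Peak signal-to-noise ratio in dB between two images."""
    mse = np.mean((x.astype(float) - y.astype(float)) ** 2)
    return 10.0 * np.log10(peak ** 2 / mse)

def ssim_global(x, y, peak=255.0):
    """Single-window (global) SSIM; the usual index averages this
    expression over local windows, so this is a simplification."""
    c1, c2 = (0.01 * peak) ** 2, (0.03 * peak) ** 2
    mx, my = x.mean(), y.mean()
    vx, vy = x.var(), y.var()
    cov = ((x - mx) * (y - my)).mean()
    return ((2 * mx * my + c1) * (2 * cov + c2)) / \
           ((mx ** 2 + my ** 2 + c1) * (vx + vy + c2))
```

Higher PSNR indicates a smaller mean squared error against the reference, while SSIM close to 1 indicates preserved luminance, contrast, and structure.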
An important element of the new algorithm is the selection of the size of the image patches.

FIGURE 6
Intercomparison of different algorithms in denoising low-radiation-dose CT real-time medical images. Column (a) shows the low-radiation-dose CT real-time medical images. Columns (b)-(f) show the denoised images using the ASDS algorithm, the KSVD algorithm, the GSTV algorithm, the OSEEF algorithm and the present new algorithm

To evaluate the effect of patch size on the image restoration, we tested different patch sizes during the training of the sub-dictionaries; patch sizes of 3 × 3, 5 × 5, 7 × 7 and 9 × 9 were tested. The experimental results show that a patch size of 7 × 7 gives better image restoration. Figure 6 shows the comparison of the present new algorithm with other denoising algorithms, namely ASDS, KSVD, GSTV and OSEEF, in denoising low-radiation-dose CT real-time medical images. The images denoised by the KSVD and GSTV methods have less serrated effect than those by the other algorithms. The image denoised by the ASDS algorithm loses information in heterogeneous areas with more surface features and textures. The image denoised by the OSEEF algorithm is relatively blurry. Our algorithm retains more details of the medical information, well preserves the boundaries of objects in the image, and does not produce a gradient effect.

DISCUSSION
The non-convex non-local similarity adaptive regularization method depends on the adjustment of the weight parameters between the data fidelity term and the regularization term, but it can obtain very good recovery results. Experiments using KSVD, GSTV, ASDS, DnCNN, FFDNet, OSEEF and the new algorithm proposed in this paper were performed on synthetic and real MRI images and on low-radiation-dose CT real-time medical images. The results show that the new denoising method achieves excellent results with very small errors for MRI images and CT real-time medical images, and the proposed method outperforms the other algorithms. Generally, errors in the restored MRI images mostly occur at the transition zones between different categories of brain tissue. However, even in these cases, the pixels recovered in the transition zones by the new method maintain high precision. This is very important, because most applications of MRI imaging require high fidelity in the transition zones so that different brain tissues can be clearly distinguished. Brain MRI tensor data can thus be accurately recovered. For medical images, real-time recovery, and thus fast computation, is often required. To improve computation speed, the proposed method can be implemented on GPUs for high performance. Because of the inherent parallelism of the algorithm, a parallel implementation can reduce the time of the recovery process.

CONCLUSIONS
In this paper, we present a new image denoising method based on sparse representation with adaptive sparse domain selection and adaptive regularization. To find the best sparse domain for an image, we choose, for each local patch, an adaptive dictionary trained from a data set of high-quality sample patches. To further improve the quality of the denoised image, we introduce two regularization terms into the ASDS-based image denoising scheme; the non-local similarity of the image is merged through the regularization term and used to exploit the image's non-local redundancy. A large number of image denoising experiments show that the newly developed method effectively preserves image details and is more efficient and accurate than the KSVD, GSTV, ASDS, DnCNN, FFDNet and OSEEF algorithms in terms of PSNR, SSIM, visual quality, and edge preservation.