Review of automated segmentation approaches for knee images

Knee disorders are common among the human population. Knee osteoarthritis (OA) is the most widespread knee joint disorder, which may require surgical treatment. The detection and diagnosis of knee joint disorders from medical images demand enormous human effort and time. The development of a computer-aided diagnosis (CAD) system can notably reduce the burden on medical experts and limit the intra-observer and inter-observer variations. To this end, the highly challenging research problem of knee image segmentation has received frequent attention in past years, as it can be applied efficiently in the development of a CAD system. Knee image segmentation is a challenging task owing to the image contrasts, intensity variations, shape irregularities, and the presence of thin cartilage structures. Therefore, this paper presents a literature review of automated segmentation approaches, mainly focused on the segmentation of knee cartilage and bone, with respect to the underlying technical aspects, the datasets used, and the performance reported. The paper also traces the growth from classical segmentation approaches towards deep learning approaches in knee image segmentation. Owing to the varying quality and complexity of different knee image datasets, this paper abstains from a rigorous comparative evaluation of image segmentation approaches.

and inexpensive as compared to MRI. CT is well suited for fracture analysis [8]. Due to the involvement of ionising radiation in CT scan, MRI is preferred over CT in daily routine [6]. For the segmentation of knee images, MRI is an extensively used imaging modality due to its multi-planar capability and better soft-tissue contrast resolution [10].
Thresholding is amongst the most traditional and fastest intensity-based image segmentation methods [11,12]. Each pixel is classified by comparing its value with a cut-off value. The image segmentation approaches based on partial differential equations (PDEs), such as snakes and level sets, are highly useful in solving topology problems and in computing and analysing the motion of an interface in either two or three dimensions [13]. The model-based image segmentation approaches are useful when a priori knowledge of the shape and appearance of the objects in the image is available. A variety of model-based approaches such as deformable models, active appearance models (AAMs), active shape models (ASMs), and statistical shape models (SSMs) have been exploited for the segmentation of medical images in past years [14]. Yet another approach for image segmentation is graph cut. The minimum cut algorithm is exploited for segmenting the image into background and foreground pixels. The algorithm runs in time nearly linear in the number of graph edges and preserves the information in low-variability image areas, whereas details in high-variability image areas are ignored [15].
With the increasing popularity of machine learning, a considerable contribution of deep learning can be seen in medical image segmentation, especially in knee image segmentation. A number of deep learning architectures have been presented so far in the literature for accurately segmenting the knee cartilage tissues and bones. Convolutional neural networks (CNNs) are the primarily used deep learning architectures in the medical domain [16]. In addition, to overcome the drawbacks of some segmentation models, a combination of two or more segmentation approaches is also used. The hybrid approaches are discussed within the six segmentation categories presented in this paper, and no separate section has been devoted to them. Therefore, the paper presents a review of classical knee image segmentation approaches and the growth towards deep learning-based methods.
The rest of the article is organised as follows. The motivation for this work is presented in Section 2. The related work and the contribution of this study are discussed in Section 3. The methodology followed in conducting the review is detailed in Section 4. Section 5 provides an overview of the human knee joint anatomy. Further, in Section 6, a detailed discussion of the two famous knee image datasets utilised for knee image segmentation is carried out. Various evaluation metrics utilised for testing the performance of the automated knee image segmentation framework are explored in Section 7. Section 8 provides a literature review of the automated knee image segmentation approaches. Further, in Section 9, the advantages and disadvantages of using each knee image segmentation approach are summarised with a comparison between different segmentation approaches. Finally, Section 10 presents the conclusion of the paper.

MOTIVATION
Knee joint disorders are common among the human population. Knee OA is the predominant cause of chronic disability, which leads to the degeneration of articular cartilage. The diagnosis of OA from medical images is a time-consuming and effort-demanding task. Knee cartilage segmentation contributes greatly to the assessment of OA progression. Moreover, bone and cartilage segmentations are required for the surgical planning of knee joint disorders and for 3D modelling of the knee to predict knee joint kinematics [1]. The results obtained by manual segmentation are reliable for identifying the structures needed in the diagnosis of knee disorders. ITK-SNAP, 3D Slicer, and Myrian Studio are some of the open-source applications used for the manual and automated segmentation of different body structures [23]. An expert with the underlying knowledge of knee anatomy can easily outline or fill the ROI on each slice in a 2D stack of knee images using medical image segmentation tools such as ITK-SNAP, as depicted in Figure 1. However, the manual segmentation of knee bones and cartilages often leads to intra-observer and inter-observer variations. It is also a very tedious task, which makes it impractical for daily clinical use [4]. Moreover, manual segmentations are not reproducible. Therefore, to reduce the burden on medical experts and the variability of results, it is desirable to develop a CAD system for automatically segmenting the knee images. Automated knee image segmentation is a difficult task due to varying intensity levels for a single tissue class, the presence of small cartilage structures and image artifacts, low tissue contrast, and shape irregularity [20]. In the literature, various segmentation methods based on thresholding, active contour models, graph-cut algorithms, multi-atlas, and machine learning have been proposed specifically for knee joint images. Therefore, it becomes essential to review the existing knee image segmentation work explaining the

RELATED WORK AND OUR CONTRIBUTION
There exist several review articles on the topic of knee segmentation, listed in Table 1. In [11], a review of knee bone segmentation is presented, with a classification based on the level of user interaction and with machine learning-based segmentation techniques excluded. Similar to [11], Zhang et al. [17] lack a discussion of machine learning-based knee segmentation and a detailed quantitative comparison. Kubicek et al. [19] and Kumar et al. [20] reviewed knee cartilage segmentation by classifying the methods into manual, semiautomatic, and automatic. Ebrahimkhani et al. [21] covered the automated knee cartilage segmentation methods published until 2018 and categorised them based on state-of-the-art approaches, providing a quantitative comparison and results. Most of the reviews to date focus on knee articular cartilage [19,21,24] and meniscus segmentation [18,22]. However, knee bone segmentation is also essential for the quantitative assessment of the degree of fracture and for determining the state of an injury [25]. Therefore, the contribution of this work is as follows:
a) Instead of focusing only on knee bone or only on cartilage segmentation, this study covers both the knee structures.
b) This study provides a broader classification of state-of-the-art image segmentation approaches and presents the increased employment of deep learning-based segmentation methods for knee image segmentation.
c) This work presents various aspects of knee segmentation approaches proposed in the literature, such as the datasets used, the ROI, quantitative results obtained, limitations and advantages.
d) This work also provides a quantitative comparison of the performance of machine learning approaches. It also presents the details of the most recent deep learning-based segmentation architectures proposed to date in a tabular manner.

REVIEW METHODOLOGY
A notable amount of research has been published on the subject of knee image segmentation, which requires a well-planned approach to review the most relevant articles in the subject area. This review is focused on the evolution from conventional knee image segmentation methods towards deep learning-based knee image segmentation approaches. The scholarly databases searched for relevant research articles on the topic of knee image segmentation included IEEE Xplore, ScienceDirect, Springer, Wiley Online Library, PubMed, and Google Scholar. The query string 'automated knee segmentation' was included in almost all the searches. Further, after the classification of image segmentation approaches, each segmentation approach was searched with respect to knee images using the query strings 'thresholding', 'snakes', 'deformable models', 'active contours', 'level-set', 'graph cut', 'atlas', 'active shape models', 'statistical shape models', 'active appearance models', 'machine learning', and 'deep learning'. The search results returned both relevant and irrelevant articles. The irrelevant research articles were excluded manually based on the title and abstract. A full-text reading was done before finally including an article as relevant to the topic. For this review paper, the research articles related to automated knee image segmentation published from 2000 up to mid-2020 were included. The research articles mainly related to the segmentation of knee bones and cartilages were included. Moreover, only the research articles published in journals and conferences were included. The research articles proposing an image segmentation approach for body organs other than the knee were excluded. The research articles not providing the necessary technical details were excluded.
The important information like segmentation approach used, datasets used, performance reported, advantages and disadvantages is presented in a tabular form, and each approach used for knee image segmentation is explained in brief.

KNEE JOINT ANATOMY
It is essential to have prior knowledge of the anatomy of the knee joint to better understand the problem of knee joint image segmentation. The knee joint is a complex joint in the human body due to its bony anatomy, the ligamentous structures, and the muscles acting dynamically on the kneecap [26]. The knee joint is made up of a connective tissue called articular cartilage and three bones, namely, femur, patella, and tibia, as shown in Figure 2.

Articular cartilage
Articular cartilage is present in the diarthrodial joints, which provides a lubricated and smooth surface for articulation and facilitates the transmission of load with a low frictional coefficient. As the articular cartilage is subjected to the harsh biomechanical environment, the capacity of intrinsic repair and healing is limited. Owing to the complex and distinct anatomy of articular cartilage, the treatment and repair of cartilage becomes hard for the patient and the expert [27]. There are three types of articular cartilages present in the knee joint, which are femoral cartilage, patellar cartilage, and tibial cartilage. The degeneration of the cartilages may lead to OA, which may require surgical or non-surgical treatment. The segmentation of articular cartilage is a challenging task due to its dependency upon multiple parameters such as tissue structure, image acquisition, and imaging protocols used [20].

Femur
The femur, also called the thigh bone, is the strongest and the longest bone present in the human skeleton. The femur has a wider distal end forming a double condyle (separated by an intercondylar notch), which articulates with the patella and tibia. The other end of the femur articulates with the acetabulum.

Fibula
The fibula is a thin, lateral, and long bone present in the lower half of the leg. It runs parallel to the shin bone, supporting the lower leg muscles and helping in the stabilisation of the ankle. The fibula forms a joint with the tibia at both the proximal and distal ends, called the tibiofibular joint.

Patella
The patella, also known as the kneecap, is considered the largest sesamoid bone present inside the human body [28]. A sesamoid bone is a small bony nodule that is embedded within a muscle or a tendon. Generally, the shape of the patella is triangular, flat, and curved. The patella engages only with the distal end of the femur. The joint formed by the patella and femur is called the patellofemoral joint. The articular surface of the patella is much smaller than the femur's articular surface. Therefore, during the movement of the knee, the contact surface between the femur and patella varies considerably [29]. Excessive load on the knee joint may result in the dislocation of the patella.

Tibia
Tibia, also called shin bone, is present in the lower half of the leg. Two condyles are formed at the proximal end of tibia, which articulate with the femoral condyles. At the distal end, tibia articulates with the anklebone. The proximal and distal ends of tibia also articulate with another bone, called fibula. Fibula has no articulations with femur and patella.

DATASETS USED
Medical imaging data has a significant contribution in the development of any CAD system. Therefore, the segmentation of knee images requires a huge set of data for training, validation, and testing of the CAD system. The researchers have relied majorly on two publicly available knee image datasets for their work.

Osteoarthritis initiative
Osteoarthritis initiative (OAI) is a longitudinal cohort study that provides access to a repository of knee osteoarthritis data and images. The data from OAI is available for download and includes the measures, the OAI study design, Digital Imaging and Communications in Medicine (DICOM) images, and clinical data. Various quantitative image assessments are also available, which can help in tracking the progression from the early stages of knee OA to clinically significant disease [31]. No annotations for the segmentation of knee images are provided by OAI [1]. Therefore, medical experts need to first label the OAI dataset manually for testing the performance of the automated segmentation frameworks.

EVALUATION METRICS
Being a crucial image processing procedure in medical image analysis, image segmentation needs to be precise for the detection and diagnosis of abnormalities. One of the principal challenges in image segmentation is the choice of correct evaluation metrics [32]. A number of metrics have been proposed so far to test the segmentation accuracy of different automated frameworks. For image segmentation, the most commonly used evaluation metrics have been divided into three categories as discussed below [33].

Distance-based metrics
For measuring the dissimilarities in contours of the automated and manual image segmentation results, the most popularly used metrics are distance-based metrics. In the context of knee image segmentation, the following distance-based metrics have been frequently used.

Average symmetric surface distance
Average symmetric surface distance (AvgD) is used for the evaluation of surface overlapping between the manual and automated segmentation maps/contours. It is generally measured in millimeters. A smaller value of AvgD represents high surface overlapping, and vice versa [34]. The formula for AvgD calculation is given in (1).
where (.) and ‖.‖ represent the segmentation boundary and Euclidean distance, respectively. S and R represent the automated and manual segmentation maps, respectively.
Note: The grand challenge ended in the year 2018. Therefore, the SKI10 data is no longer available for download.

Root mean square distance
Root mean square distance (RMSD) is computed by squaring the distances, averaging them, and taking the square root of the average [35]. The RMSD value is calculated as per (2). A lower value of RMSD indicates good performance of an automated segmentation algorithm.
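The two distance-based metrics described above can be illustrated with a minimal sketch. It assumes that the boundaries of the automated (S) and manual (R) segmentations are already available as arrays of point coordinates in millimetres, and it uses a common symmetric definition in which distances are pooled in both directions; the exact normalisation in (1) and (2) may differ. The function names are hypothetical.

```python
import numpy as np

def nearest_distances(a, b):
    """For each point in a, the Euclidean distance to its closest point in b."""
    diff = a[:, None, :] - b[None, :, :]            # all pairwise differences
    return np.sqrt((diff ** 2).sum(axis=2)).min(axis=1)

def avg_surface_distance(s_pts, r_pts):
    """Symmetric average surface distance (assumed pooled two-way form)."""
    d_sr = nearest_distances(s_pts, r_pts)
    d_rs = nearest_distances(r_pts, s_pts)
    return (d_sr.sum() + d_rs.sum()) / (len(d_sr) + len(d_rs))

def rms_surface_distance(s_pts, r_pts):
    """Symmetric root mean square surface distance (same pooling assumption)."""
    d_sr = nearest_distances(s_pts, r_pts)
    d_rs = nearest_distances(r_pts, s_pts)
    mean_sq = ((d_sr ** 2).sum() + (d_rs ** 2).sum()) / (len(d_sr) + len(d_rs))
    return np.sqrt(mean_sq)

# Toy boundaries: the automated contour is the manual one shifted by 0.5 mm.
r_pts = np.array([[0.0, 0.0], [1.0, 0.0], [2.0, 0.0]])
s_pts = r_pts + np.array([0.0, 0.5])

avg = avg_surface_distance(s_pts, r_pts)
rms = rms_surface_distance(s_pts, r_pts)
```

Because RMSD squares each distance before averaging, it penalises a few large boundary errors more strongly than AvgD does; for the uniform shift in the toy example the two values coincide.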

Overlap-based metrics
The overlap-based metrics are obtained using four entities from the confusion matrix: true positive (TP), false positive (FP), false negative (FN) and true negative (TN) [33], as depicted in Figure 3 with respect to the femoral bone segmentation. The true positive value is the number of pixels segmented correctly as foreground. The false positive value is the count of pixels falsely classified as foreground. The false negative value is the count of pixels falsely classified as background. The true negative value represents the correctly classified background pixels. The most commonly used overlap-based metrics include the Dice similarity coefficient, Jaccard similarity coefficient (JSC), precision, sensitivity, specificity, accuracy, and F-score.

Dice similarity coefficient
The Dice similarity coefficient (DSC) is a segmentation metric proposed by Dice [36], which measures the spatial overlap between the automated segmentation and the ground truth. A DSC value of 1 indicates complete overlap, meaning that the automated result agrees perfectly with the ground truth. A DSC value near zero indicates poor performance [32]. The DSC value is calculated as per Equation (3), where S represents the automated segmentation result and R represents the ground truth segmentation.

Jaccard similarity coefficient
The JSC is the ratio of the intersection to the union of the manual and automated segmentation results [33]. Considering S and R as the automated and ground truth segmentation results, respectively, the JSC is given by equation (4).
A JSC value of zero implies no overlap, and a value of one reflects complete overlap.
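As a minimal sketch of the two overlap measures just described, the following computes DSC and JSC for binary masks S and R using their standard set-based definitions, DSC = 2|S ∩ R| / (|S| + |R|) and JSC = |S ∩ R| / |S ∪ R|. The function names and the convention of returning 1.0 when both masks are empty are assumptions made for illustration.

```python
import numpy as np

def dice(s, r):
    """Dice similarity coefficient of two binary masks."""
    s, r = s.astype(bool), r.astype(bool)
    denom = s.sum() + r.sum()
    # Assumed convention: two empty masks count as a perfect match.
    return 2.0 * np.logical_and(s, r).sum() / denom if denom else 1.0

def jaccard(s, r):
    """Jaccard similarity coefficient of two binary masks."""
    s, r = s.astype(bool), r.astype(bool)
    union = np.logical_or(s, r).sum()
    return np.logical_and(s, r).sum() / union if union else 1.0

# Toy masks: the manual mask R has 4 pixels, the automated mask S has 6,
# and S fully contains R.
r_mask = np.zeros((4, 4), dtype=np.uint8)
r_mask[1:3, 1:3] = 1
s_mask = np.zeros((4, 4), dtype=np.uint8)
s_mask[1:3, 1:4] = 1

dsc = dice(s_mask, r_mask)      # 2 * 4 / (6 + 4)
jsc = jaccard(s_mask, r_mask)   # 4 / 6
```

The two coefficients are monotonically related through JSC = DSC / (2 − DSC), so they rank segmentation results identically even though their numerical values differ.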

Precision
Precision is the proportion of the pixels labelled as foreground by the automated method that are true positives, as described by (5). A higher value of precision indicates better performance of an image segmentation method.

Sensitivity
Sensitivity or recall (R) or true positive rate (TPR) provides the measure of the number of positive pixels/voxels present in the expert segmentation and also recognised as positive on application of an automated segmentation algorithm [33]. Sensitivity is calculated as per (6).
where FN and TP represent false negative and true positive, respectively. A low number of false negatives results in higher sensitivity.

Specificity
Specificity or true negative rate (TNR), provides the measure of the number of negative pixels/voxels present in the expert segmentation and also recognised as negative by the segmentation algorithm [33]. The formula for specificity calculation is given in (7).
where TN and FP represent true negative and false positive, respectively. As per Figure 3, specificity covers all the areas other than the femur bone. A low number of false positives results in higher specificity.

Accuracy
The ratio of the correctly classified pixels to the total number of classified pixels is defined as the accuracy and can be calculated using (8).
Accuracy generally measures how well a semi-automated or automated segmentation approach performs in correctly identifying the desired region and excluding the rest. DSC and accuracy can be used as alternatives for testing the performance of a segmentation method. Since the accuracy calculation also counts the true negatives, which can inflate the score when the background is much larger than the structure of interest, researchers prefer DSC over accuracy for the evaluation of their knee image segmentation approaches.

F-score
The harmonic mean of precision (P) and recall (R) represents the F-score for an image segmentation method, as depicted by (9).
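The confusion-matrix-based metrics of this subsection can be sketched as follows, using their standard definitions in terms of TP, FP, FN and TN. The helper names are hypothetical, and the sketch assumes that no denominator is zero (that is, both masks contain foreground and background pixels).

```python
import numpy as np

def confusion_counts(s, r):
    """Count TP, FP, FN, TN for an automated mask s against a manual mask r."""
    s, r = s.astype(bool), r.astype(bool)
    tp = int(np.sum(s & r))     # foreground in both
    fp = int(np.sum(s & ~r))    # foreground only in the automated result
    fn = int(np.sum(~s & r))    # foreground missed by the automated result
    tn = int(np.sum(~s & ~r))   # background in both
    return tp, fp, fn, tn

def overlap_metrics(tp, fp, fn, tn):
    """Standard overlap metrics; assumes all denominators are non-zero."""
    precision = tp / (tp + fp)
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    f_score = 2 * precision * sensitivity / (precision + sensitivity)
    return {"precision": precision, "sensitivity": sensitivity,
            "specificity": specificity, "accuracy": accuracy,
            "f_score": f_score}

# Toy masks: R has 4 foreground pixels, S has 6 and fully contains R.
r_mask = np.zeros((4, 4), dtype=np.uint8)
r_mask[1:3, 1:3] = 1
s_mask = np.zeros((4, 4), dtype=np.uint8)
s_mask[1:3, 1:4] = 1

counts = confusion_counts(s_mask, r_mask)
metrics = overlap_metrics(*counts)
```

In this toy case the accuracy (0.875) exceeds the F-score (0.8) because the ten correctly classified background pixels enter only the accuracy. For binary masks the F-score coincides with the DSC.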

Volume-based metrics
In certain medical image segmentation tasks where the volume of an object is of importance for treatment planning, the volume-based metrics contribute to testing the performance of segmentation methods [37]. Volume overlap error and volume difference are the two frequently utilised metrics to evaluate the knee image segmentation results.

Volume overlap error
Volume overlap error (VOE) is calculated using (10), where S represents the set of voxels obtained from an automated segmentation algorithm and R represents the set of voxels present in the manual segmentation. A smaller value of VOE indicates accurate segmentation results. A VOE value of 100% means no overlap between the expert and automated segmentation results [34].

Volume difference
Volume difference (VD) is generally used to obtain an estimation of the size differences between the segmented tissues [34].
The formula for calculation of VD is given in (11).
Among all the metrics, DSC is the most popular metric used to test the performance of knee image segmentation methods. Therefore, the studies discussed in this paper are compared on the basis of DSC. If DSC values are not available, the distance-based and volume-based evaluation metrics are taken into consideration.
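The volume-based metrics of this section can be sketched under two assumptions: VOE is taken as 100 × (1 − |S ∩ R| / |S ∪ R|), which is consistent with the statement that a value of 100% means no overlap, and VD is taken as the signed relative difference 100 × (|S| − |R|) / |R|. Some studies report the absolute value of VD instead, and the function names are hypothetical.

```python
import numpy as np

def volume_overlap_error(s, r):
    """VOE in percent: 0 for identical masks, 100 for disjoint masks."""
    s, r = s.astype(bool), r.astype(bool)
    inter = np.logical_and(s, r).sum()
    union = np.logical_or(s, r).sum()
    return 100.0 * (1.0 - inter / union)

def volume_difference(s, r):
    """Signed VD in percent (assumed form); positive means over-segmentation."""
    s, r = s.astype(bool), r.astype(bool)
    return 100.0 * (s.sum() - r.sum()) / r.sum()

# Toy volumes: R has 8 voxels, S has 12 and fully contains R.
r_vol = np.zeros((4, 4, 2), dtype=np.uint8)
r_vol[1:3, 1:3, :] = 1
s_vol = np.zeros((4, 4, 2), dtype=np.uint8)
s_vol[1:3, 1:4, :] = 1

voe = volume_overlap_error(s_vol, r_vol)   # 100 * (1 - 8 / 12)
vd = volume_difference(s_vol, r_vol)       # 100 * (12 - 8) / 8
```

A VD of zero does not by itself imply a correct segmentation, since two masks of equal volume can still be misplaced; this is one reason VD is usually reported together with an overlap measure.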

KNEE IMAGE SEGMENTATION APPROACHES
Knee image segmentation is a complex and challenging task because of the intensity inhomogeneities and low contrast of the ROI to be segmented [5]. In this paper, the knee image segmentation approaches have been broadly divided into six categories: (i) thresholding based, (ii) partial differential equation based, (iii) graph based, (iv) atlas based, (v) model based and (vi) machine learning based approaches. The broad categories are further divided into sub-categories as depicted in Figure 4. A number of hybrid segmentation approaches have also been proposed in the literature. The hybrid segmentation approaches are presented within the six categories listed, and no separate section has been devoted to them.

Thresholding-based Segmentation
Thresholding is a classical intensity-based image segmentation approach and among the fastest. There are mainly two thresholding approaches: global thresholding and local thresholding [41]. Global thresholding assumes that a given image has a bimodal histogram. The image can be segmented into foreground and background by comparing the pixel intensity values with a threshold value t. Equation (12) represents a function f(x,y) depicting the global threshold. A binary segmented image is produced as the output. The major drawback of global thresholding is that a single threshold often cannot produce a good segmentation output. To overcome this issue, the local thresholding approach was introduced, which segments the image by splitting it into multiple sub-images and finding a threshold for every sub-image [42].
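The difference between the two thresholding strategies can be illustrated with a minimal sketch. It assumes that pixels brighter than the threshold are labelled foreground and, for the local variant, that each square sub-image uses its own mean intensity as the threshold; both choices and the function names are illustrative assumptions rather than the rule of any specific study reviewed here.

```python
import numpy as np

def global_threshold(image, t):
    """Label pixels with intensity above t as foreground (1), others as 0."""
    return (image > t).astype(np.uint8)

def local_threshold(image, block):
    """Threshold each block x block sub-image at its own mean (assumed rule)."""
    out = np.zeros(image.shape, dtype=np.uint8)
    for i in range(0, image.shape[0], block):
        for j in range(0, image.shape[1], block):
            tile = image[i:i + block, j:j + block]
            out[i:i + block, j:j + block] = tile > tile.mean()
    return out

# Toy image with uneven illumination: the right half is brighter overall,
# but each half contains a darker and a brighter column.
image = np.tile(np.array([10.0, 50.0, 110.0, 150.0]), (4, 1))

global_mask = global_threshold(image, 80.0)
local_mask = local_threshold(image, 2)
```

With a single cut-off of 80, the whole right half is labelled foreground, whereas the block-wise thresholds separate the darker and brighter columns within each half. This is the situation in which a single threshold fails and local thresholding helps.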
Kang et al. [38] developed an automated framework for three-dimensional (3D) segmentation of the knee, proximal femur and skull from volumetric CT images using a region-growing algorithm [43] with local (adaptive) thresholding. The soft tissue and bones were segmented using a mixture of global and local thresholding. The segmented output could have unclosed boundaries, which were closed by morphological operations with a spherical structuring element. The output of the first two steps was erroneous owing to its dependency on the threshold values and the size of the structuring element. Therefore, to preserve the smaller structures of bone and obtain the final result, the authors performed an anatomically oriented boundary adjustment on the obtained segmentation results. A precision error of less than 2% was observed, which indicated a good segmentation performance.
Lee and Chung [39] proposed a technique to segment the femur and patella from knee MRIs by integrating thresholding and edge detection-based approaches. To minimise the time complexity, the authors adopted a moment preserving thresholding approach [44] to obtain the bone outline. The ROI was estimated using the bone outline information. A wavelet enhancement scheme was incorporated for the improvement of contrast around the bone edges. The detected edges were transformed into the region-based format by using the flow fill after Laplacian of Gaussian (FLoG) algorithm. Finally, the onion growing method was combined with the result of FLoG to thicken the eroded bone images pixel-by-pixel using morphological dilation, and the image with the segmented femur and patella was produced as output.
Li et al. [40] implemented a bilateral thresholding approach to segment the 3D MRI volume of the knee joint. The bones were segmented based on their intensity range. Lower and upper thresholds were selected to locate the bone intensity interval. The intensity histograms were used to compute the threshold values. Further, for smoother boundaries, an intra-image contrast analysis was performed, and the missing information was recovered by applying the inter-image repairing process. Table 2 presents a comparative analysis of the approaches discussed above. A small application of global thresholding-based segmentation is depicted in Figure 5, presenting the original cross-sectional knee MRI and the obtained segmentation results of the femur and patella. Knee MR images have a low signal-to-noise ratio, which results in a loss of image information and therefore in low segmentation accuracy when a thresholding approach is used [11]. A more robust segmentation approach dependent upon the shape

Partial differential equation-based segmentation
PDEs find their applications in the area of applied mathematics, computer vision, and image processing. In 2D and 3D medical imagery, the shape recovery of various anatomical structures plays a significant role, which assists the experts in medical treatment. Shape recovery is a difficult task to achieve in the context of medical images due to the variations in shape, complexity of different structures, and image artifacts. The PDE-based segmentation has allowed creating a fast, accurate, and robust segmentation system [49]. Two primary PDE-based segmentation techniques include snakes or active contours and level-set evolution.

Snakes
Snakes or active contour model is a prominent automated image segmentation technique where the shape boundaries in an image are extracted using internal and external forces which are derived on the basis of image characteristics. The user specifies the initial contour position, and the curve for the snake as defined in (13) evolves in a way, such that the energy function (presented in (14)) is minimised [50].
The first additive term of the integral represents the internal energy of the active contour, which restricts the movement of the curve through the parameters α and β, representing elasticity and stiffness, respectively.
The other additive term of the integral represents the external energy, which helps in moving the curve towards significant image features such as the edges of the femur and patella in a knee MRI [50]. The external energy is represented in (15), where I(x,y) represents a grey-level image.
G represents a 2D Gaussian filter, with ∇ as the gradient operator and σ as the standard deviation. To improve the edge map and reduce the noise, the image I is subjected to the filter. As a result, the gradient operator gives high values in the regions which are closer to the edge boundaries.
Lorigo et al. [45] introduced an automated segmentation method for identifying the femur and tibia from human knee MRIs using active contours. The method uses the curve evolution criteria specified in (13) and (14) and incorporates the texture information of the images. An image-dependent balloon force allowed the outward flow of the curve towards the boundary. An initial contour was specified by the user within the area to be segmented. The algorithm was considered to have converged when the curve had not evolved over a certain number of iterations.
Similar to the previous approach, using prior expert knowledge in combination with active contour models, Lynch et al. [46] presented a framework for cartilage segmentation of the OA knee from 3D volumes of MR images. Six initialisation points were marked on a selected slice to define a cubic spline lying within the articular cartilage. The spline curve evolved such that the cost/energy function was minimised. The spline curves were propagated from one slice to another in a 3D volume to segment the articular cartilage of the femur. The authors claimed better reproducibility and less human interaction as compared to region-growing segmentation.
Sun et al. [48] used an active contour model for femur and patella segmentation to diagnose patellar dislocation. The CT images were initially subjected to contrast enhancement. The automated angle and distance measurements were performed after applying a region-based active contour segmentation technique. DSC values of 96.9% and 95.7% were obtained for the femur and patella, respectively.
Although active contour models produce accurate segmentation results, they do not perform well for non-convex objects and are highly sensitive to the contour initialisation. Moreover, the topology changes are not handled well by active contours. Therefore, the level-set methods were introduced to overcome the gaps in active contour models.
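The edge-seeking external energy used by snakes, described earlier in this subsection, can be illustrated with a minimal sketch. It assumes the widely used form E_ext(x, y) = −|∇(G_σ * I(x, y))|², which may differ in detail from (15), and it implements the Gaussian smoothing with separable 1D convolutions and zero padding at the image border. The function names and the toy image are hypothetical.

```python
import numpy as np

def gaussian_kernel(sigma):
    """Normalised 1D Gaussian kernel truncated at about three sigma."""
    radius = int(3 * sigma + 0.5)
    x = np.arange(-radius, radius + 1)
    k = np.exp(-(x ** 2) / (2.0 * sigma ** 2))
    return k / k.sum()

def smooth(image, sigma):
    """Separable Gaussian smoothing; borders are implicitly zero padded."""
    k = gaussian_kernel(sigma)
    rows = np.apply_along_axis(np.convolve, 1, image, k, mode="same")
    return np.apply_along_axis(np.convolve, 0, rows, k, mode="same")

def external_energy(image, sigma):
    """Assumed snake external energy: minus the squared gradient magnitude."""
    gy, gx = np.gradient(smooth(image, sigma))
    return -(gx ** 2 + gy ** 2)

# Toy image: dark left half, bright right half, one vertical edge at column 8.
image = np.zeros((16, 16))
image[:, 8:] = 1.0

energy = external_energy(image, sigma=1.0)
```

The energy is most negative along the vertical edge and zero in the flat dark region, so a contour that minimises it is drawn towards the edge. The zero padding also creates artificial edges at the bright image border, which a practical implementation would avoid by using a different padding mode.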

Level set
The level-set algorithm was designed by Osher and Sethian in 1988 [51]. Level-set methods are highly useful in solving topology problems and in computing and analysing the motion of an interface in either two or three dimensions [13], tasks that the active contour model handles less efficiently. The ultimate motive of level-set evolution is to determine the motion of an interface or boundary Γ, bounding a region Ω and possessing a velocity field v, which provides the speed of the interface in the normal direction. The velocity may be dependent upon time, the shape of the boundary, position, external physics, and image characteristics such as image intensities and gradients [13,52]. Let Γ be a closed surface in ℝ². Γ is viewed as the zero level set of a function φ(x, t = 0) from ℝ² to ℝ, with φ(x, t = 0) = ±d, where d is the distance between x and Γ. The positive or negative sign indicates whether the point x lies to the exterior or interior of the initial contour Γ, respectively [53]. Therefore, the curve evolution equation is given as per (16).
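The initialisation and evolution just described can be sketched as follows, writing φ for the level-set function. The sketch assumes the common form of the evolution equation, ∂φ/∂t + v|∇φ| = 0, which may differ in notation from (16), a constant speed v instead of an image-dependent one, and a simple explicit update with central differences rather than the upwind schemes used in practice. The function names are hypothetical.

```python
import numpy as np

def signed_distance_circle(shape, centre, radius):
    """phi(x, 0) = +d outside and -d inside a circular initial contour."""
    yy, xx = np.mgrid[:shape[0], :shape[1]]
    return np.sqrt((yy - centre[0]) ** 2 + (xx - centre[1]) ** 2) - radius

def evolve(phi, speed, dt, steps):
    """Explicit steps of phi_t + speed * |grad phi| = 0 (assumed form)."""
    for _ in range(steps):
        gy, gx = np.gradient(phi)
        phi = phi - dt * speed * np.sqrt(gx ** 2 + gy ** 2)
    return phi

# Initial contour: circle of radius 10 pixels centred in a 64 x 64 grid.
phi0 = signed_distance_circle((64, 64), (32, 32), 10.0)

# A positive speed moves the interface outwards along its normal.
phi1 = evolve(phi0, speed=1.0, dt=0.5, steps=10)

area0 = int((phi0 < 0).sum())   # pixels inside the initial contour
area1 = int((phi1 < 0).sum())   # pixels inside the evolved contour
```

The contour itself is never tracked explicitly: it is recovered at any time as the zero level set of φ, which is why splitting and merging of regions require no special handling.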
For the purpose of image segmentation, the design of the velocity function v is dependent upon the information available from an image. As the level-set methods can easily detect the variation in anatomical structures across different subjects over time, they proved to be highly efficient for the segmentation of knee cartilage tissues and bones.
A novel segmentation approach was introduced by Ahn et al. [47] for knee cartilages using a region-based level-set method, where an initial contour was specified by deploying the spatial fuzzy C-means (SFCM) algorithm. As the region-based segmentation model is unable to locate the segments in a non-homogeneous image, the authors used the localising active contours proposed by Lankton et al. [54] to obtain accurate segmentation results. Template data were created using 20 normal subjects acquired from the OAI database, which helped to find the initial contour using SFCM. Finally, the localising active contours were implemented to yield the segmentation of femoral, patellar and tibial cartilage.
A level-set-based fully automated technique for the segmentation of subchondral bone was adopted by Gandhamal et al. [24]. The OAI MRI dataset was subjected to contrast enhancement using S-curve transformation. The seed point for segmentation was automatically detected using the 3D multiedge overlapping scheme. The well-known distance regularised level-set evolution (DRLSE) technique [55] was deployed for the extraction of bony regions. The boundary leakages were corrected using the newly proposed boundary displacement technique, in which a leakage was identified if the boundary distance difference between successive slices was greater than a specific threshold value.
Chen et al. [5] proposed a workflow for the automated segmentation of the tibia and femur from knee MRIs. For the segmentation of the trabecular boundary and the removal of inhomogeneities, a 3D local intensity clustering-based level-set method proposed by Li et al. [56] was exploited. To obtain a smoother trabecular boundary, an intensity line along the normal vector of the interface was generated. A number of boundary candidates were constructed using different sets of neighbouring intensity lines. The candidate nearest to the rough boundary of trabecular bone and with the least variance was selected as the smoothed boundary. The framework was capable of handling field inhomogeneity as well as correctly segmenting the trabecular and cortical bone. An application of DRLSE is presented in Figure 6, where the segmentation boundaries of the patella and femur are obtained after the initialisation of contours. The main drawback of level-set methods is the extra effort required to construct appropriate velocities to advance the level-set function. However, the results produced are highly accurate, even for complex problems.
A comparative analysis of PDE-based segmentation methods is provided in Table 3.

Graph-based Segmentation
The graph-cut approach has gained tremendous attention in the field of image segmentation since it utilises both the regional and the boundary information of an image [59]. Felzenszwalb and Huttenlocher [15] introduced a segmentation technique based on the graph-cut approach. Consider an undirected graph G = (V, E). In a graph-based segmentation S, each pixel of an image is represented by a vertex v ∈ V, which is connected to other pixels via edges (v_i, v_j) ∈ E, where the weight of an edge represents the dissimilarity between the pixels. The weights can depend upon image gradients or intensity values. An s-t graph is a directed graph containing a source node s and a terminal node t. A cut c(s, t) in a graph G is obtained by removing an edge set E_cut such that, after removal, there is no route from s to t [60]. The min-cut segments an image into background and foreground pixels. The method is highly adaptable to variations in the neighbouring regions within an image as compared to classical segmentation methods such as thresholding. Therefore, the method has been utilised by many researchers to efficiently segment the structures present in knee images. Shim et al. [3] used a semi-automated graph-cut-based approach to segment the knee cartilages. The technique was divided into three steps: (i) selection of a seed point in the anatomic region to be segmented, (ii) propagation of seeds to neighbouring regions, and (iii) automated segmentation using the graph-cut algorithm. In the case of erroneous segmentation results, the seed selection and further computation were performed multiple times. Two expert radiologists performed the manual segmentations, which were compared with the semi-automated segmentation results. A higher DSC value was obtained for the graph-cut-based method.
Ababneh et al. [57] also used the graph-cut scheme for the reinforcement of a content-based, fully automated knee segmentation from MRI. The authors used a two-pass disjoint block discovery to distinguish between the background and bony regions in MRI by utilising the features extracted from the training dataset. The graph-cut segmentation technique was designed using the classified blocks. Further, the max-flow algorithm was applied, which generated a minimum graph cut to produce the initial segmentation results. To smooth the segmentation results, content-based refinements were applied along with a few morphological operations. The major advantages of the proposed method included no user interaction and high efficiency in distinguishing between bones and similar structures.
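The relation between a maximum flow and a minimum s-t cut can be made concrete with a small sketch. The code below uses the Edmonds-Karp algorithm on a dense capacity matrix, which is one textbook max-flow method and not necessarily the one used in the cited works. The four-pixel example and all edge weights are hypothetical: pixels 1 to 4 are linked to the source (foreground affinity) and to the terminal (background affinity), and neighbouring pixels are linked by weights that are large when the pixels are similar.

```python
from collections import deque

def min_cut(capacity, source, sink):
    """Edmonds-Karp max-flow. Returns (flow value, nodes on the source side)."""
    n = len(capacity)
    residual = [row[:] for row in capacity]
    total = 0
    while True:
        # Breadth-first search for an augmenting path in the residual graph.
        parent = [-1] * n
        parent[source] = source
        queue = deque([source])
        while queue and parent[sink] == -1:
            u = queue.popleft()
            for v in range(n):
                if parent[v] == -1 and residual[u][v] > 0:
                    parent[v] = u
                    queue.append(v)
        if parent[sink] == -1:
            break  # no path left: the current flow is maximal
        # Find the bottleneck capacity along the path, then push that flow.
        bottleneck = float("inf")
        v = sink
        while v != source:
            bottleneck = min(bottleneck, residual[parent[v]][v])
            v = parent[v]
        v = sink
        while v != source:
            u = parent[v]
            residual[u][v] -= bottleneck
            residual[v][u] += bottleneck
            v = u
        total += bottleneck
    # Nodes reached by the last (failed) search form the source side of the cut.
    return total, {v for v in range(n) if parent[v] != -1}

S, T = 0, 5  # source and terminal nodes; nodes 1..4 are pixels
capacity = [[0] * 6 for _ in range(6)]
# Hypothetical terminal weights: (source -> pixel, pixel -> terminal).
for p, (from_s, to_t) in enumerate([(1, 9), (1, 9), (9, 1), (9, 1)], start=1):
    capacity[S][p] = from_s
    capacity[p][T] = to_t
# Hypothetical neighbour weights: the pair (2, 3) straddles an intensity edge.
for p, q, w in [(1, 2, 8), (2, 3, 1), (3, 4, 8)]:
    capacity[p][q] = capacity[q][p] = w

flow, source_side = min_cut(capacity, S, T)
foreground = sorted(p - 1 for p in source_side if p not in (S, T))
print(flow, foreground)
```

The cut severs the weak link between the second and third pixels, so the last two pixels are labelled as foreground.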
A method to segment four knee bones, namely femur, fibula, patella and tibia, from CT scans, was proposed by Wu et al. for the planning of orthopaedic knee surgery [58]. First, the concept of marginal space learning (MSL) was used for the pose estimation using the basic concepts of translation, orientation, and scaling. The larger search space for pose estimation was decomposed into marginal space inference. After estimating the pose of a given knee volume, SSMs were used to detect the bone boundaries. For the refined segmentation of each bone, the graph-cut method was used, and each bone was separately segmented using a multi-layer graph-cut approach for removing the possible overlapping error.
A demonstration of the graph-cut segmentation technique on a sagittal knee MRI is presented in Figure 7. The drawback of graph-cut-based segmentation is the boundary leakage problem caused by varying grey-level intensities and imaging artifacts in MR images. A comparative analysis of the graph-based knee image segmentation approaches is presented in Table 4.

Atlas-based segmentation
Atlas-based methods have an advantage over other image segmentation methods in that they can segment an image without any pre-defined relationship between pixel and region intensities. The method performs well if the differences between the objects can be captured by a spatial relationship. To use atlas information for segmentation, the target image must be registered with the reference image (atlas) to compensate for differences in the orientation, position, and size of the target and reference images [64]. In clinical practice, atlas-based methods have proved highly convenient as they can be readily applied to measure the shape of an object and to detect morphological differences among a cohort [65]. Therefore, the method can be employed in knee image segmentation for the detection of knee joint-related disorders and diseases. Tamez-Pena et al. [61] proposed a multi-atlas-based knee segmentation approach. Six MRI datasets from OAI were manually segmented by experts, out of which five were randomly chosen in a leave-one-out experiment as reference atlases for a fully automated multi-atlas-based segmentation technique. The five selected segmentations were fused into a single segmentation result using the idea of fuzzy membership. The method was validated by running the algorithm over 48 knee MRI datasets from OAI. The approach performed well for precise segmentation of knee cartilage tissue and bone because the initial atlases were created manually and accurately by the experts. An automated method to segment the knee cartilages using a multi-atlas-based method, similar to [61], was introduced by Shan et al. [62]. Initially, the bones were segmented from the knee MRIs using multi-atlas-based registration, as accurate bone segmentation plays a vital role in achieving satisfactory segmentation of cartilage.
The bone segmentation was followed by an alignment step, which aligned the constructed atlas with the provided query image and acquired the local priors for tibial cartilage and femoral cartilage. The features extracted using the propagated atlases of tibial as well as femoral cartilage were passed as feature vectors to the k-nearest neighbour (kNN) classifier. Finally, after obtaining the atlas priors and the data likelihoods of being femoral or tibial cartilage, a three-label (0 = femoral cartilage, 1 = background, 2 = tibial cartilage) segmentation was performed.
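The fusion of several registered atlas segmentations into one result, as used in multi-atlas approaches such as [61], can be sketched as follows. This is a simplified illustration that assumes the atlases are already registered to the target and that the fuzzy membership of a pixel is simply the fraction of atlases labelling it as foreground; the function name, the masks, and the 0.5 threshold are assumptions, not the fusion rule of the cited work.

```python
def fuse_labels(atlas_masks, threshold=0.5):
    """Average registered binary masks into a membership map, then threshold."""
    n = len(atlas_masks)
    rows, cols = len(atlas_masks[0]), len(atlas_masks[0][0])
    membership = [[sum(mask[r][c] for mask in atlas_masks) / n
                   for c in range(cols)] for r in range(rows)]
    fused = [[1 if membership[r][c] > threshold else 0
              for c in range(cols)] for r in range(rows)]
    return membership, fused

# Five hypothetical atlas segmentations of a 2 x 3 image.
atlas_masks = [
    [[1, 1, 0], [0, 1, 0]],
    [[1, 1, 0], [0, 0, 0]],
    [[1, 0, 0], [0, 1, 0]],
    [[1, 1, 1], [0, 1, 0]],
    [[0, 1, 0], [0, 1, 1]],
]
membership, fused = fuse_labels(atlas_masks)
print(fused)
```

Pixels on which most atlases agree survive the fusion, while labels supported by a single atlas are suppressed.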
Dam et al. [63] presented an automated segmentation technique for low-field and high-field knee MRIs, aimed at the quantification of knee images. The quantification of knee cartilage, menisci, and bone is considered a highly challenging and complex task. The proposed method used multi-atlas registration to align the provided query scan with the training scans. After registration, only the regions covering the structures of interest, specified by a simple rectangular ROI, were learned. The features for voxel classification were extracted using an approach similar to [66]. To keep the computations practical and limit the memory required for training, the voxels were sampled based on their sampling densities within each structure and the background. A set of features was selected by applying a forward feature selection approach [67] and provided as input to the kNN classifier for structure-wise classification. The sample-expand method was used for classifying voxels by randomly defining a seed point. If the seed point voxel lay within the structure, the region was expanded to segment the structure. The proposed scheme therefore relied on the manual segmentations performed during training and made it possible to segment various complex knee structures.
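The kNN voxel classification used in [62] and [63] can be illustrated with a minimal sketch. The two-dimensional feature vectors (an intensity value and a normalised position), the labels, and the choice of k = 3 are hypothetical; the cited works use much richer feature sets and efficient neighbour search.

```python
import math
from collections import Counter

def knn_predict(train_features, train_labels, query, k=3):
    """Label `query` by majority vote among its k nearest training samples."""
    order = sorted(range(len(train_features)),
                   key=lambda i: math.dist(train_features[i], query))
    votes = Counter(train_labels[i] for i in order[:k])
    return votes.most_common(1)[0][0]

# Hypothetical training voxels: (intensity, normalised position).
train_features = [(0.90, 0.20), (0.85, 0.25), (0.80, 0.30),
                  (0.20, 0.70), (0.15, 0.80), (0.10, 0.75)]
train_labels = ["cartilage"] * 3 + ["background"] * 3

print(knn_predict(train_features, train_labels, (0.82, 0.28)))
print(knn_predict(train_features, train_labels, (0.18, 0.72)))
```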
The atlas construction consumes a lot of time whenever there is any complex non-rigid registration involved or any iterative procedure is incorporated into it. Table 5 presents a comparative analysis of the atlas-based segmentation techniques discussed.

Model-based segmentation
Medical image artifacts and lack of contrast make it challenging to detect the boundary of an object using local information. However, experts in the medical domain are still capable of delineating the object boundary as they are aware of how the object shape is supposed to look. Similarly, model-based segmentation algorithms incorporate prior knowledge of the shape as well as the appearance of objects to extract the structures of interest. These methods are divided into two stages: (i) initialisation of the location or appearance of the model and (ii) optimisation of the shape and appearance so that the results match the ground truth segmentations [14]. A 3D shape can be represented using a vector X containing all k coordinate points, given by (17).
All the shape samples used for training are first aligned in a common coordinate frame. Principal component analysis (PCA) is then applied for the dimensionality reduction of shape vectors [68]. The mean template shape is obtained by averaging all n shape samples, as per (18). The covariance matrix is computed over the n samples, as given by (19). Finally, the eigenvalues and eigenvectors of the covariance matrix are computed, discarding the  [69].
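The construction of a PCA shape model from aligned shape vectors can be sketched as below. Equations (17) to (19) are not reproduced in this section, so the sketch makes its own assumptions: shapes are flattened landmark vectors, the covariance is normalised by n - 1, and modes are retained until 95% of the variance is explained. The function name, the `variance_kept` parameter, and the synthetic square-shaped data are hypothetical.

```python
import numpy as np

def build_shape_model(shapes, variance_kept=0.95):
    """Return (mean shape, retained modes, their variances) from aligned shapes.

    shapes: (n, d) array with one flattened landmark vector per row.
    """
    shapes = np.asarray(shapes, dtype=float)
    mean_shape = shapes.mean(axis=0)
    centred = shapes - mean_shape
    covariance = centred.T @ centred / (len(shapes) - 1)  # assumed n - 1
    eigvals, eigvecs = np.linalg.eigh(covariance)  # returned in ascending order
    order = np.argsort(eigvals)[::-1]
    eigvals = np.clip(eigvals[order], 0.0, None)   # remove tiny negative noise
    eigvecs = eigvecs[:, order]
    cumulative = np.cumsum(eigvals) / eigvals.sum()
    n_modes = int(np.searchsorted(cumulative, variance_kept) + 1)
    return mean_shape, eigvecs[:, :n_modes], eigvals[:n_modes]

# Synthetic training set: a unit square (4 landmarks, 2D) that only scales
# about its centre, so a single mode of variation is expected.
base = np.array([0, 0, 1, 0, 1, 1, 0, 1], dtype=float)
direction = np.array([-1, -1, 1, -1, 1, 1, -1, 1], dtype=float) / np.sqrt(8)
shapes = [base + t * direction for t in (-2, -1, 0, 1, 2)]

mean_shape, modes, variances = build_shape_model(shapes)
print(modes.shape, variances)
```

A new shape can then be approximated as the mean shape plus a weighted sum of the retained modes, which is the representation that SSMs, ASMs and AAMs build upon.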
The model-based segmentation approaches can further be divided into three categories on the basis of prior knowledge: SSMs, ASMs and AAMs.

Statistical shape models
SSMs can be easily applied to shapes of arbitrary topology using PCA [68]. A set of objects possessing complex shapes is described by a set of vectors, which uniquely determines the shape of the objects and is well suited for statistical analysis. For all the shape parameters considered, the correspondence between shape features and vector components needs to be consistent. The main motive behind SSMs is to create a template shape and fit it to all the objects that need to be analysed [70]. Since the knee bone shape can be easily represented using a set of vectors, SSMs can prove beneficial in knee image segmentation. Seim et al. [71] adopted an SSM-based segmentation approach to segment the tibia and femur. Using SSMs, the femoral and tibial bones were reconstructed and used for the generation of new SSMs for the femur and tibia. Finally, the SSM features were adapted through histogram thresholding to obtain the segmented femur and tibia.
The concept of prior shape models and level-set method has been employed by Pang et al. [72] for accurately segmenting knee MRIs. The prior shape models were built using training data consisting of the whole shape of femur and tibia manually identified by clinical experts. Two level-set functions were used for modelling the shapes of tibia and femur. Due to the directional edge force, the contours were pushed in the right position, and the shapes of femur and tibia were preserved by prior shape models resulting in a good performance when compared with other segmentation techniques.
The shape constraints in SSMs result in higher robustness, but they also limit the accuracy of the segmentation results [14].

Active shape models
The SSMs are based upon the shape of objects; however, the shape of anatomical structures can vary over time and among individuals. Therefore, to add some degree of variability in the shape of objects, the ASMs were devised by Cootes and Taylor in 1994 [77]. The variability in the shape of knee bones in different subjects demands a more robust segmentation technique such as ASMs for accuracy. Fripp et al. [73] presented a novel framework for automating the segmentation process of bone-cartilage interface (BCI) and bones from MRIs using 3D ASMs. The 3D ASM was divided into two components: (i) SSM and (ii) matching criteria. The points in 3D volume were moved in the direction which was normal to the surface of a nearby match. The SSMs provided the required shape and pose estimation for deformed objects. The bones were segmented using 3D ASM initialisation by incorporation of affine registration to an atlas. Finally, the extraction of BCI was done by employing the prior knowledge and image information embedded in 3D ASMs.
Another ASM-based approach automated the segmentation of the patella from X-ray images [74]. Initially, the shape model of the patella was created using PCA from the landmark points available in the training data. Afterwards, an edge tracing strategy was designed by first locating a seed point as close as possible to the boundary of the patella. Finally, a dual-optimisation procedure was deployed to obtain the initial pose estimate using genetic algorithms and to refine the estimated shape and pose using ASMs.
Wu and Mahfouz [76] implemented a robust X-ray image segmentation technique to efficiently extract the proximal tibia and distal femur using spectral clustering and ASM. Owing to the varying intensities of the femur and tibia in X-rays, the traditional ASM alone was not sufficiently accurate. Therefore, a spectral clustering technique based on the eigenvectors and eigenvalues of an affinity matrix was used for noise removal from the X-ray images. The resultant images were finally subjected to ASM-based segmentation. SSMs and ASMs produce segmentation results only on the basis of the object shapes in an image. In contrast, a more robust method is to use additional statistical information about the object intensity in an image.

Active appearance models
AAMs are an extension of SSMs and ASMs that incorporates the texture information of the object. For building an AAM, an annotated training set of images is required in which each example has corresponding points marked. To align the sets of points, Procrustes analysis [78] is applied, and an SSM is built. The points in each training image are warped to match the mean shape. For building a texture model, eigen analysis is performed. Finally, to generate the AAM, the correlation between texture and shape is learned [79]. AAMs can produce good results for knee image segmentation as knee images are prone to large variability in the texture of different structures. Vincent et al. [75] presented an automated segmentation model of the knee joint using AAMs. Initially, the manual segmentations of the femur, tibia, femoral cartilage, and tibial cartilage were carried out by an expert segmenter. For the creation of a statistical appearance model, the landmark points on the bone surface were obtained using the minimum description length approach to groupwise image registration (MDL-GIR) [79]. A set of deformations was chosen using MDL-GIR such that all the images were registered together as efficiently as possible. A reference mean image and a set of deformations mapping the mean image to each query image were produced as output. AAMs need an initial estimate of a few model parameters such as rotation, position and scaling to produce the final segmentation results.
While deploying AAMs for medical image segmentation, the major challenge is the requirement of a considerable amount of data for model construction. For 3D models, the 3D texture information representation leads to large equation systems. Therefore, the texture data needs to be scaled down for faster computation [14]. Table 6 presents a comparative analysis of model-based knee image segmentation techniques developed so far in the literature.

Machine learning-based segmentation
Over the past few years, the role of machine learning in medical applications has grown considerably [84][85][86]. With an increasingly ageing population, current staffing levels would not be able to keep pace. The advent of high-speed computers at modest prices has made it possible to handle a large number of patients using automated systems based on machine learning. Machine learning systems are slowly replacing the manual tasks carried out by medical experts, with higher accuracy and reduced human effort. In the field of orthopaedics, machine learning has brought a revolutionary change by assisting knee surgeries and diagnosing knee OA [87]. The evolution of deep learning architectures has made it possible to dig deeper into medical image features and produce accurate results. Based on the machine learning techniques used in the literature, the knee image segmentation approaches are categorised into: (i) classical machine learning and (ii) deep learning.

Classical machine learning
Classical machine learning algorithms require human experts to extract a set of features that is then used to train the model. The major conventional machine learning models include linear regression, support vector machine (SVM), kNN, decision trees, and random forests. Several of these models have been used by researchers for segmenting knee images. An articular cartilage segmentation approach was developed by Folkesson et al. [66] by classifying the voxels of knee MRIs using the kNN classifier. Two binary classifiers were used to separate the tibial and femoral cartilages from the background. The combined binary classifiers performed better than the direct application of a kNN classifier. The features based on the intensity, position and geometry of cartilages were selected using forward and backward feature selection. In forward feature selection, the feature set was initially empty. The search space was expanded by adding a single feature at a time according to the output of a criterion function. In backward feature selection, the least significant features were removed using the criterion function. The proposed method was claimed to have performed best for cartilage segmentation.
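Forward feature selection as described above can be sketched as a greedy loop. The criterion function here is a hypothetical stand-in (a fixed usefulness score per feature minus a small penalty per selected feature); in practice it would be something like the validation accuracy of a classifier trained on the candidate subset. The function and variable names are assumptions.

```python
def forward_selection(n_features, criterion, max_features=None):
    """Start from an empty set and repeatedly add the single feature that
    most improves `criterion(subset)`; stop when no addition improves it."""
    limit = n_features if max_features is None else min(max_features, n_features)
    selected = []
    best_score = float("-inf")
    while len(selected) < limit:
        candidates = [f for f in range(n_features) if f not in selected]
        score, feature = max((criterion(selected + [f]), f) for f in candidates)
        if score <= best_score:
            break
        selected.append(feature)
        best_score = score
    return selected, best_score

# Hypothetical criterion: per-feature usefulness minus a complexity penalty.
usefulness = {0: 0.30, 1: 0.05, 2: 0.20, 3: 0.00}

def toy_criterion(subset):
    return sum(usefulness[f] for f in subset) - 0.04 * len(subset)

selected, score = forward_selection(4, toy_criterion)
print(selected)
```

Backward selection follows the mirror-image procedure: it starts from the full feature set and removes the feature whose removal degrades the criterion the least.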
Yin et al. [80] used AdaBoost and random forest classifiers for the segmentation of various surfaces and objects, that is, knee bones and cartilages, from 3D MRIs. The AdaBoost classifier identified the volume of interest (VOI) for the femur, patella and tibia. A set of the 500 most significant features was extracted. The random forest classifier was fed a set of training data to identify the bone shapes. The VOIs extracted using AdaBoost were given as input to a random forest classifier to extract the mean bone shapes. The AdaBoost classifier, trained on a dataset of nine MRIs, helped in classifying cartilage and non-cartilage regions. The regional properties of cartilage were learned by the random forest classifier using the same nine MRIs. The final validations were performed on a test set.
Another knee bone segmentation approach proposed by Lindner et al. [81] used random forest (RF) regression voting for automatic shape model matching. Provided with the initial pose estimation of an object, the ROI of an image was resampled into a reference frame. The area surrounding every feature point was searched. The relevant features were extracted at each position. The features extracted were utilised by RF regressor to vote for the optimal location in an array which collected the 2D histogram of votes.
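The vote accumulation step of regression voting can be illustrated as follows. In the approach of [81], the offsets would be predicted by a trained random forest regressor from image features; in this sketch they are hard-coded hypothetical values, and the function name and grid size are likewise assumptions.

```python
def accumulate_votes(shape, patch_positions, predicted_offsets):
    """Each sampled patch votes for (its position + its predicted offset).
    Returns the 2D vote array and the location with the most votes."""
    rows, cols = shape
    votes = [[0] * cols for _ in range(rows)]
    for (r, c), (dr, dc) in zip(patch_positions, predicted_offsets):
        vr, vc = r + dr, c + dc
        if 0 <= vr < rows and 0 <= vc < cols:  # ignore votes outside the array
            votes[vr][vc] += 1
    peak = max((votes[r][c], (r, c))
               for r in range(rows) for c in range(cols))[1]
    return votes, peak

# Hypothetical patches and offsets; four of the five agree on point (2, 2).
positions = [(0, 0), (1, 3), (4, 4), (2, 0), (3, 3)]
offsets = [(2, 2), (1, -1), (-2, -2), (0, 2), (0, 0)]
votes, peak = accumulate_votes((5, 5), positions, offsets)
print(peak, votes[2][2])
```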
Similar to [66], Prasoon et al. [82] proposed a technique for the segmentation of femoral cartilage by performing two-stage voxel classification. Initially, the ROIs were extracted from the available MRI scans to reduce the number of voxels. A total of 178 features extracted by Folkesson et al. [66] were used for the two-stage segmentation process. Stage one used the kNN classifier to achieve minimum false positives. An SVM classifier with a Gaussian kernel was used at stage two with all 178 extracted features. The proposed approach achieved good results, showing the robustness of the two-stage classifier.
Pang et al. [83] adopted an automated cartilage segmentation method based on pattern recognition. The knee MRIs were used to first identify BCI using the Canny edge detector. SVM was used for the identification of edges to differentiate between different classes of cartilages. For the initial segmentation of cartilages, the Bayesian minimum error classification method was used. Finally, the broken edges were processed using morphological operations such as thinning, spur removal, and edge connection for the final segmentation of cartilage.
The extraction of features often consumes a lot of time and demands enormous human effort. Therefore, a more convenient approach is to use a deep learning framework for feature extraction. Table 7 presents a comparative analysis of classical machine learning-based knee image segmentation techniques.

Deep learning
Deep learning has evolved at a swift pace over the past few years in the area of medical image processing. Deep learning has shown tremendous performance, especially in the area of image segmentation. The well-known deep CNN architectures used for image segmentation include U-Net [99], SegNet [100], VGG16 [101], etc. CNN architectures contain multiple hidden layers for automatic feature extraction. CNN building blocks can be divided into four parts: (i) convolutional layer, (ii) pooling layer, (iii) fully connected layer, and (iv) activation function. The convolutional layer is responsible for the extraction of significant features using a small array known as the kernel. The kernel convolves over the entire input map. The step size with which the kernel moves across the input map is called the stride. The output of the convolution operation is fed to a non-linear activation function, such as the rectified linear unit (ReLU). After the ReLU operation, the results are fed to the pooling layer. The pooling layer reduces the dimensionality within a plane to lower the number of learnable parameters. The output of the final pooling or convolution layer is flattened into a 1D numerical array connected to one or more fully connected (dense) layers. Every link from input to output has learnable parameters, which are further used for the classification or segmentation task. The final output is obtained by applying a non-linear function to the output of a fully connected layer. The complete architecture of a CNN is presented in Figure 8. A CNN needs to learn the kernels and weights such that the difference between the ground truth labels and the predicted output is minimised. For learning these entities, the network is trained using optimisation algorithms such as gradient descent, Adam, RMSprop, etc. Training the network requires the tuning of hyperparameters such as the learning rate and the choice of a loss function.
The learning rate determines by what amount the weights in the network need to be updated to obtain accurate results. The loss function evaluates the difference between the prediction and the ground truth. The cross-entropy loss function is a common choice for image classification [86]. Deep learning is highly useful for the problem of knee image segmentation. Over the years, many researchers have taken up the challenge of knee image segmentation and provided better solutions using deep learning architectures as compared to classical image segmentation methods. Prasoon et al. [88] provided a segmentation technique for knee cartilage based on 2D CNNs. The feature set extracted in [66] was utilised in the method. Each voxel in a 3D volume lies on three orthogonal planes. For each plane, a 2D patch centred on the voxel was extracted. Each 2D patch was associated with a CNN. The CNNs were connected only at the output layer. A softmax classifier was applied to the joint output of the three CNNs to obtain the segmentation of tibial cartilage.
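The CNN building blocks described above (convolution with a stride, ReLU activation, and pooling) can be sketched in a few lines of plain Python. This is a didactic illustration only: the function names, the 6 x 6 test image, and the 3 x 3 edge-detecting kernel are assumptions, padding is omitted, and real networks learn their kernels and rely on optimised libraries.

```python
def conv2d(image, kernel, stride=1):
    """Valid (unpadded) sliding-window product of a 2D image with a square
    kernel; the window moves `stride` pixels at a time."""
    k = len(kernel)
    out_rows = (len(image) - k) // stride + 1
    out_cols = (len(image[0]) - k) // stride + 1
    return [[sum(image[r * stride + i][c * stride + j] * kernel[i][j]
                 for i in range(k) for j in range(k))
             for c in range(out_cols)] for r in range(out_rows)]

def relu(feature_map):
    """Rectified linear unit: negative responses are set to zero."""
    return [[max(0, v) for v in row] for row in feature_map]

def max_pool(feature_map, size=2):
    """Non-overlapping max pooling, reducing each dimension by `size`."""
    rows, cols = len(feature_map) // size, len(feature_map[0]) // size
    return [[max(feature_map[r * size + i][c * size + j]
                 for i in range(size) for j in range(size))
             for c in range(cols)] for r in range(rows)]

# 6 x 6 image with a vertical edge between columns 2 and 3.
image = [[0, 0, 0, 1, 1, 1] for _ in range(6)]
# Hand-crafted kernel responding to left-to-right intensity increases.
kernel = [[-1, 0, 1]] * 3

features = relu(conv2d(image, kernel, stride=1))   # 4 x 4 feature map
pooled = max_pool(features, size=2)                # 2 x 2 after pooling
flattened = [v for row in pooled for v in row]     # input to a dense layer
print(features[0], pooled, flattened)
```

With a stride of 2, the same convolution yields a 2 x 2 map instead of a 4 x 4 map, which shows how the stride controls the spatial size of the output.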
An approach based on a deep CNN and 3D deformable models was adopted by Liu et al. [89] for the segmentation of knee tissues. The popular SegNet CNN was used for performing pixel-wise high-resolution tissue classification. The 3D deformable models were applied to preserve the overall shape and maintain a smooth surface of musculoskeletal structures. The automated segmentation technique was tested on the publicly available SKI10 dataset.
An extension of the approach in [89] was presented by Zhou et al. [34] for the knee joint. The output of the CNN was fed to a fully connected 3D conditional random field (CRF) to regularise the contextual relationship between voxels belonging to the same tissue class and to different tissue classes. The result was fed as input to the 3D deformable model to regularise the segmentation results.
Liu [90] adopted an automated method for joint segmentation using adversarial networks and CNNs. As the medical image datasets are available in different tissue contrasts, it often becomes challenging to annotate data for every image sequence manually. For reducing the human effort needed for manual segmentation, the author used the cycle-consistent generative adversarial network (CycleGAN) for translation of images between MRI datasets possessing different tissue contrasts. For the segmentation of bone and cartilage, a method abbreviated as SUSAN (Segmenting Unannotated image Structure using Adversarial Network) was proposed. Two clinical MRI datasets have been acquired from the Department of Radiology of Wisconsin Institutes for Medical Research in Madison for testing the system. A publicly available annotated knee MRI dataset was also used.
A fully automated segmentation system for cartilage lesion detection was devised by Liu et al. [91] based on deep learning. A convolutional encoder-decoder (CED) network was initially deployed for segmenting the knee bone and cartilage. For the detection of structural abnormalities in the segmented cartilage, another CNN classification network was deployed. The authors claimed a greater intra-observer agreement as compared to the intra-observer agreement of medical experts for the proposed method.
Norman et al. [92] adopted a state-of-the-art CNN for automated knee cartilage and meniscus segmentation. The U-Net CNN architecture was chosen as it required fewer trainable parameters than SegNet. The cross-entropy loss function was employed to compute the training loss between the automated segmentation and the ground truth. A 70/20/10 split was used for training, validation, and testing, respectively. The model produced precise and accurate results using the underlying U-Net architecture.
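The cross-entropy loss between a predicted segmentation and the ground truth can be illustrated with a minimal per-pixel sketch. It assumes that the network outputs a probability for every class at every pixel (for example, after a softmax) and that the loss is averaged over pixels; the function name, the `eps` guard, and the example values are hypothetical and do not reproduce the training setup of [92].

```python
import math

def pixelwise_cross_entropy(probabilities, labels, eps=1e-12):
    """Mean cross-entropy over pixels.

    probabilities: one list of class probabilities per pixel.
    labels: the ground-truth class index of each pixel.
    """
    total = 0.0
    for probs, label in zip(probabilities, labels):
        total -= math.log(max(probs[label], eps))  # eps avoids log(0)
    return total / len(labels)

# Three pixels, two classes (0 = background, 1 = cartilage).
predicted = [[0.9, 0.1], [0.2, 0.8], [0.6, 0.4]]
truth = [0, 1, 1]
loss = pixelwise_cross_entropy(predicted, truth)
print(round(loss, 4))
```

The third pixel, where the network assigns only 0.4 to the correct class, contributes most of the loss, which is what drives the weight updates during training.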
Pröve et al. [93] introduced an automated technique for the segmentation of the knee from 3D MRIs for age assessment. In the initial stage, the selected datasets were normalised to bring all image intensities to a similar scale. As the dataset was not large enough for applying a deep learning algorithm, image augmentation was applied to the available data. Ground truth was generated using a semi-automated region-growing technique, which was corrected manually in case of erroneous results. For segmentation, an architecture resembling U-Net was built.
Panfilov et al. [94] used a GAN model based on U-Net for cartilage and meniscus segmentation. The U-Net architecture was modified to work as a generator that performed slice-wise segmentation of 3D-DESS MRIs. The discriminator model consisted of five convolutional layers alternating with four leaky ReLU layers. Adam optimisers were utilised for training the generator and discriminator models. The results of the proposed method showed that unannotated data can help to boost the generalisation of the segmentation task.
Ambellan et al. [95] devised an algorithm for the automated segmentation of cartilage and bone from available knee MRIs. The concept of 3D SSM and CNN has been exploited for the segmentation of pathological knee bones. The SSMs and CNNs were trained by utilising the data from OAI and SKI10 challenge. To create the initial segmentation masks of femur and tibia, 2D CNNs were used. The results were regularised by fitting SSMs to the obtained masks. Afterwards, the 3D CNNs were employed to segment the sub-volumes of MRI. The output was enhanced by SSM post-processing. In the end, the femoral and tibial cartilages were segmented using 3D CNNs, and the results were compared with [63].
A holistically nested network (HNN) was presented by Cheng et al. [96] for segmenting patellofemoral MRIs. The HNN architecture was similar to U-Net but without the expansion path, which reduces the GPU resource requirement and limits the hyperparameter space required for testing and training. Using MRI data alone produced irregular segmentation results. Therefore, both MRI and coherence-enhanced diffusion (CED) [102] images were fed to the network with the ground truth masks. The segmentation maps produced using MRI and CED data at each convolution layer were fused to produce the final output. The proposed network provided remarkable results for femur and patella segmentation.
A semi-supervised learning-based model was presented by Burton et al. [97] for the segmentation of knee MRI. Similar to [88], 2D CNNs and triplanar 2D CNNs based on U-Net were used for segmentation along with 3D CNNs. Monte Carlo patch sampling was applied to 3D CNNs to access the patches containing important context information, which increased the segmentation accuracy. The 3D CNNs outperformed both 2D and triplanar CNNs. The overall DSC of 98.9% was obtained for the segmentation of six knee structures.
Gaj et al. [98] presented a variant of GANs, the conditional GAN (CGAN), for automated knee cartilage segmentation. The U-Net model was used as the generator network to produce the automated segmentation. The discriminator model backpropagates the error obtained on comparing the manual and automated segmentation results to the generator until the discriminator is unable to differentiate between the ground truth and the automated segmentation. The proposed architecture performed slightly better than [95]. Table 8 gives a comparative analysis of the deep learning techniques proposed so far. The details of the different deep learning architectures used for knee image segmentation are presented in Table 9. From Figure 9, it is observed that Burton et al. [97] provided the best results for femur segmentation, whereas Ambellan et al. [95] and Cheng et al. [96] gave the best results for tibia and patella segmentation, respectively. Also, very few machine learning-based approaches focused on patella bone segmentation. For cartilage segmentation, Gaj et al. [98] obtained remarkable segmentation performance, as depicted in Figure 10. It is evident from Table 9 that U-Net is the most frequently used basic deep learning architecture, with variations in the number of downsampling and upsampling units and learning parameters. The comparison charts in Figures 9 and 10 provide a visualisation of the performance of machine learning-based knee bone and cartilage segmentation techniques, respectively, based on DSC. With the availability of high-end computational resources, deep learning algorithms have shown their effectiveness in segmentation compared with the previously used conventional machine learning techniques.
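Since the comparisons above are based on the Dice similarity coefficient (DSC), a minimal sketch of its computation for binary masks is given here, using the standard definition DSC = 2|A ∩ B| / (|A| + |B|). The function name, the flattened mask representation, and the convention of returning 1.0 when both masks are empty are assumptions.

```python
def dice_coefficient(prediction, ground_truth):
    """DSC = 2 * |A intersect B| / (|A| + |B|) for flat binary (0/1) masks."""
    intersection = sum(p and g for p, g in zip(prediction, ground_truth))
    total = sum(prediction) + sum(ground_truth)
    # Assumed convention: two empty masks agree perfectly.
    return 1.0 if total == 0 else 2.0 * intersection / total

# Hypothetical automated and manual masks of six pixels each.
automated = [1, 1, 1, 0, 0, 0]
manual = [0, 1, 1, 1, 0, 0]
print(round(dice_coefficient(automated, manual), 3))
```

A DSC of 1 indicates perfect overlap and 0 indicates none, so values reported as percentages in the literature correspond to this quantity multiplied by 100.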

DISCUSSION
A lot of research has been presented on the challenging task of knee image segmentation. The 'SKI10' challenge gained much attention during the time it was open for submission of results. The two most popular knee image datasets used for research are SKI10 and the data from OAI. DSC is the most frequently used evaluation metric for testing the performance of image segmentation approaches. In this paper, the knee image segmentation methods proposed so far are classified into six categories. Thresholding-based image segmentation is the oldest and fastest technique. Low accuracy is reported with thresholding due to image inhomogeneities and artifacts. A more robust approach is to deploy deformable models/active contours or level-set methods, which use internal forces to control the movement of a curve and external forces to drive the curve towards the boundary of an object. The main drawback of the PDE-based methods is the requirement of contour initialisation. Graph-based segmentation is another approach used by various researchers for knee image segmentation. The method performed well compared to thresholding. Boundary leakage is the issue observed in graph-based segmentation due to varying grey-level intensities. The atlas-based method segments images based on the spatial relationship between different objects. Atlas construction is a time-consuming task if complex non-rigid registration is involved. Another popular method is model-based image segmentation, where the properties of shape and appearance are taken into consideration. The most frequently used image segmentation approach is machine learning, which produces results comparable to the PDE-based methods. The classical machine learning algorithms demand a lot of human effort in feature extraction.
Deep learning methods have resolved the issue by automatically extracting the features. The major issue with deep learning methods is the requirement of a substantial amount of annotated medical data, which limits the use of supervised learning techniques. A more practical approach is to use GANs to increase the size of the training data in unsupervised and semi-supervised settings [103]. Another important issue addressed by GANs is image-to-image translation. When segmenting medical data from different image sequences, expert annotations are needed for each image sequence, which is itself a time-consuming and laborious task. GANs such as CycleGAN [104] can translate a given image sequence into another image sequence for which annotations are available, thereby saving labelling and training time [105]. Researchers have also observed that, with deep learning-based architectures, the final segmentation results lack spatial consistency and the segmented boundaries are not regular.
Therefore, to overcome the problem, various classical segmentation algorithms such as SSMs, ASMs, and deformable models have been incorporated with the deep learning models. The strengths and weaknesses of different image segmentation approaches are presented in Table 10.

Table 10. Strengths and weaknesses of different image segmentation approaches

| Approach | Strengths | Weaknesses |
| --- | --- | --- |
| Thresholding based | Fast and easy to use | Spatial information is not considered and sensitive to noise |
| PDE based | Robust and highly accurate | Active contours are sensitive to contour initialisation; level-set method demands the construction of appropriate velocities to advance the function |
| Graph based | Both the regional and boundary information is utilised | Boundary leakage due to varying image intensities |
| Atlas based | Easily detect morphological differences within a cohort | Time consuming due to complex registration steps involved |
| Model based | Robust due to prior incorporation of knowledge of shape and appearance | Shape constraints limit the accuracy; requires large medical dataset for construction of model |
| Machine learning based | Reduced human effort in feature extraction | Requires extensive medical dataset and time for training the model |
In comparison, the deep learning methods combined with some classical image segmentation techniques also performed well. For knee bone segmentation, Ambellan et al. [95] and Burton et al. [97] performed remarkably well. The combination of CNN and SSM produced regularised segmentation masks for knee bones in [95]. The best overall cartilage segmentation results were presented by Gaj et al. [98] using CGANs. Also, in the literature, many different combinations, such as grey-level S-curve transformation and level-set method [24], atlas and kNN classifier [62, 63], SSMs and graph-based optimisation [71], and Canny edge detector and SVM [83], were implemented and performed moderately well. The PDE-based methods [5, 24, 47] and model-based methods [71-73, 75, 76] also gave good results for knee cartilage tissue and bone segmentation before the use of deep learning methods. The atlas-based methods [61-63] performed quite well. The thresholding-based segmentation methods perform poorly for images with varying intensities. It is observed that hybrid models using deep learning methods [95] performed well compared to the classical hybrid models [24, 62, 63, 83]. Moreover, with the evolution of deep learning, knee image segmentation has achieved remarkable performance compared to classical segmentation approaches. As per the literature, automated knee bone segmentations are comparable with manual expert annotations, whereas knee cartilage segmentation still has much scope for improvement.
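The remark above that thresholding performs poorly on images with varying intensities can be illustrated with a small sketch. It assumes a single row of ten pixels, hypothetical tissue intensities of 50 (background) and 90 (object), a cut-off of 70, and a linear intensity drift of 10 grey levels per pixel standing in for image inhomogeneity; none of these values come from the reviewed studies.

```python
def threshold(pixels, cutoff):
    """Global thresholding: label a pixel 1 if its value exceeds the cut-off."""
    return [1 if value > cutoff else 0 for value in pixels]


def count_errors(pred, truth):
    """Number of pixels whose predicted label differs from the reference."""
    return sum(1 for p, t in zip(pred, truth) if p != t)


# Hypothetical reference labels for one image row (1 = object).
truth = [0, 0, 1, 1, 0, 0, 0, 1, 1, 0]

# Homogeneous row: background = 50, object = 90 (assumed values).
uniform = [90 if t else 50 for t in truth]

# Same row with an assumed linear intensity drift of +10 per pixel.
biased = [value + 10 * i for i, value in enumerate(uniform)]

cutoff = 70
errors_uniform = count_errors(threshold(uniform, cutoff), truth)
errors_biased = count_errors(threshold(biased, cutoff), truth)

# With the drift, the brightest background pixel exceeds the darkest
# object pixel, so no single global cut-off can separate the classes.
background = [v for v, t in zip(biased, truth) if not t]
objects = [v for v, t in zip(biased, truth) if t]
separable = max(background) < min(objects)

print(errors_uniform, errors_biased, separable)
```

In this toy row the fixed cut-off labels every pixel correctly when intensities are homogeneous, but mislabels four background pixels once the drift is added, which is consistent with the weakness listed for thresholding in Table 10.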

CONCLUSION
Knee image segmentation is a challenging task for the automated diagnosis of knee disorders due to the presence of intensity inhomogeneities, low contrast, and image artifacts. The image segmentation techniques proposed so far in the literature have achieved significant results over time. Before the era of machine learning, the PDE-based and model-based segmentation methods performed exceptionally well. With the advancements in technology, the use of deep learning-based segmentation methods has increased. Even though deep learning methods demand a large amount of data, time, and high-end computational resources for training, they perform well when trained with optimal learning parameters. Moreover, the segmentation performance improves when they are used in conjunction with a suitable classical segmentation approach. Knee bone segmentation has attained remarkable performance over the years, but there is still scope for improvement in knee cartilage segmentation. Therefore, knee image segmentation remains an open research area.