Shape retrieval by using multi-scale angle-based representation and dynamic label propagation

To improve the robustness and discrimination power of the triangle-area representation, a novel shape matching method based on multi-scale angle representation is proposed in this study. By analysing the configurations of different sample points from each shape contour, shape descriptors are constructed by using space angles at different scale levels. With the proposed shape representation, the multi-scale information of shape contours is efficiently described, and dynamic programming is further used to determine the correspondence between samples from different shapes and calculate the shape distance in the feature matching step. Moreover, to improve the shape retrieval results based on pairwise shape distances, dynamic label propagation is introduced as the post-processing step. Unlike previous distance learning methods that learn the database manifold implicitly, the authors' method explicitly retrieves related objects on the shortest paths from near to far, so that the underlying structure can be effectively captured. Tested on different shape databases, the proposed method outperforms many other methods, and it can be applied to visual data processing and understanding in the internet of things.


Introduction
Shape retrieval is one of the most important issues in computer vision, and it plays an important role in image recognition [1], object retrieval [2,3], bioinformatics [4], medical imaging [5], the visual internet of things [6], etc. Given a query shape, the most similar shapes in a data set are obtained based on a certain distance measure, and the retrieval results can be shown as a ranking of the dissimilarities between the query object and the others. In general, the goal of shape retrieval is achieved by analysing the pairwise shape distances calculated by a pairwise shape matching method, and the basic principle is that the more similar two elements are, the smaller the measured distance is. In recent years, contextual distance learning has been introduced into shape matching as a post-processing procedure, and the shape retrieval accuracies can be effectively improved as a result [7][8][9].
The pairwise shape matching method mainly consists of two steps: shape feature extraction and shape feature matching [10]. According to the origination of shape descriptors, shape matching methods can be classified into two categories: the contour-based shape matching method and the region-based shape matching method [11]. For the contour-based shape matching method, one effective solution is to treat each shape as a finite set of points sampled from the contour, and then the procedure of shape feature matching is employed to determine the correspondence between different sets. Many contour-based shape matching methods are proposed in previous research studies and achieve desirable matching results and recognition accuracies [12][13][14][15]. In general, a certain contour-based descriptor contains both local and global information, and this guideline is widely adopted in the process of feature extraction. However, for most previous methods, the effects of local and global information are not treated differently, and there is no effective rule to balance the functions between them.
In this paper, under the framework of the contour-based shape matching method, a novel shape descriptor named multi-scale angle representation (MAR) is proposed. In our method, each sample point is viewed as the reference, and the space angles at different scales are calculated and used to construct the signatures. The proposed descriptor is closely related to the triangle-area representation (TAR) [16,17]. In some cases, TAR might ignore obvious differences between dissimilar objects, and it cannot deal with small deformations for paired similar shapes. Compared to TAR, the proposed descriptor provides strong discriminative power, and it is robust to small deformations and noise. As the order of the sample points on each shape contour is already known, the dynamic programming (DP) algorithm is introduced to determine the correspondence between sample points from different shapes.
Moreover, to improve shape retrieval results by utilising the intrinsic data manifold structure, we further introduce a novel shape retrieval scheme: dynamic label propagation (DLP), which is derived from the process of label propagation (LP) within a semi-supervised learning framework. Instead of focusing on searching the complete geodesic paths iteratively, the proposed approach only retrieves and labels a small number of shapes within a certain range after several iterations, and the complete geodesic paths can be gradually obtained by repeating this process. As physical devices deployed in the internet of things collect a large number of images, image processing and understanding become an important optimisation challenge, and the methods proposed in this paper provide an effective strategy for image classification and retrieval.
The structure of the rest of this paper is as follows: we briefly discuss related work in Section 2. Section 3 reviews the shape matching method based on the TAR, which is closely related to the proposed method. Section 4 gives a detailed analysis of MAR, and the corresponding feature matching framework is discussed in Section 5. In Section 6, DLP is introduced to improve pairwise shape retrieval results. Finally, experimental results on several well-known shape databases are shown to illustrate the effectiveness of our proposed method.

Related work
For the contour-based shape matching method, the shape representation constructed from sample points on the shape contour has many advantages, such as rapid feature extraction and convenient shape comparison. Shape context (SC), proposed by Belongie et al. [18], is a descriptor based on the contour point set. Given bins uniformly generated from the log-polar space, a histogram is constructed for each point to describe the relative distribution of the remaining samples. With this structure of shape representation, the distribution of samples near the reference point can be described precisely, and the global information obtained from faraway points is roughly reflected. In the procedure of shape feature matching, the pairwise shape distance is evaluated by using the Hungarian method. Great efforts have been made by previous researchers, and many descriptors on the basis of SC have been further developed, e.g. generalised SC [19], histogram of orientation SC [20], and fuzzy SC [21]. With the correspondences between sample points from different shapes determined by DP based on SC, a robust symbolic representation is proposed by Daliri and Torre [22] to evaluate the pairwise shape matching cost, and it achieves desirable shape retrieval results. In [12], a global shape descriptor based on the multi-scale integration of angular pattern (AP) and binary AP, which are intrinsically invariant to scale and rotation, is introduced to solve the problem of pairwise comparison.
To provide more effective descriptions of shapes, region information is introduced into the construction of contour-based shape matching methods. Inner-distance SC (IDSC) [23] is introduced by replacing the Euclidean distance with the inner distance, and the resulting descriptor is robust to articulation and able to capture the part structure. In [24], aspect SC is proposed by introducing geodesic distances in the aspect space to generate a more effective descriptor for complex shapes, and it provides a reasonable balance of two competing factors: deformability and discriminability. As the shape interiors play an essential role in object recognition, a perceptually motivated variant of SC named solid SC is proposed by Premachandran and Kakarala [25] to capture interior properties. Alajlan et al. [16] utilised triangle areas in building the shape descriptor, and proposed TAR to solve the problem of shape matching. With TAR, the global and local information of each shape can be effectively described, and, in many cases, it is insensitive to noise and tiny deformations. In [17], with the introduced TAR, dynamic space warping (DSW) is further used to determine the correspondence between samples from different shapes, and the accurate shape distance can be calculated.
Compared with pairwise shape matching methods, which have been studied for a long time, context-sensitive similarities have only recently received extensive attention [26][27][28]. LP is an extreme case of semi-supervised learning introduced by Yang et al. [29] and Bai et al. [30] to improve shape retrieval results. Without explicitly selecting the shortest paths, LP suffers from some disadvantages such as redundant context information and noisy shape elements. For this reason, Wang et al. [31] proposed the shortest path propagation by capturing related objects on the shortest paths explicitly.
In recent years, the unsupervised framework is directly utilised to improve context-sensitive similarities [8,26,32]. In [32], a global similarity measure named self-smoothing operator based on the self-induced smoothing kernel is introduced, and the structure of the similarity metric obtained from pairwise distances can be preserved effectively. Tensor product graph (TPG) [26] is introduced by constructing the tensor product of the graph obtained from the affinity matrix with itself, and the proposed framework can be applied for not only shape recognition but also image retrieval and image segmentation. An extremely efficient algorithm based on a novel feature vector named sparse contextual activation (SCA) is proposed for visual re-ranking in [27], and local consistency enhancement is further introduced to improve the performance.

Triangle-area representation
For the TAR signature, each shape is represented by the point set $P = \{p_i : 1 \le i \le \lambda\}$ with a finite number of samples drawn uniformly from the shape boundary, where $p_i = (x_i, y_i)$. Given three sample points $p_i$, $p_{i-h}$, and $p_{i+h}$, $\mathrm{TAR}(i, h)$ is defined as the signed area of the triangle formed by these points. With the sample points on the contour traversed in the clockwise direction, three typical configurations (convex, concave, and straight line) correspond to three types of values of $\mathrm{TAR}(i, h)$. Fig. 1 illustrates the three kinds of triangle areas, where the sample points $i$, $j$, and $k$ represent the convex, concave, and straight-line cases, respectively. As the triangle area becomes larger with the increase of the scale $h$, $\mathrm{TAR}(i, h)$ for $p_i \in P$ is normalised at each scale to balance the effects of the signatures, and the triangle-area descriptor of $p_i \in P$ collects the normalised values over the scales. Fig. 2 shows an example of the signatures based on the triangle areas obtained from the shape contour in Fig. 1. According to the definition of the TAR signature, both the global and the local shape information are introduced into the feature, which is not only generally robust to noise but also good at describing fine details.
Given two sample points $p_i \in P$ and $q_j \in Q$ from two different shapes, the matching cost $D_{ij} \equiv D(i, j)$ is computed from their triangle-area descriptors. With the extracted descriptors, DSW is further used to analyse the correspondence between sample points from different shape contours, and then the shape distance can be determined.
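The TAR signature described above can be illustrated with a minimal sketch. It assumes the standard cross-product form of the signed triangle area and a per-scale normalisation by the largest magnitude; the function name `tar_signature` and the normalisation rule are illustrative assumptions, not taken from [16,17]. Which sign corresponds to a convex corner depends on the traversal direction and the orientation of the coordinate axes.

```python
import numpy as np

def tar_signature(points, num_scales):
    """Signed triangle areas for every sample point and scale.

    points: (n, 2) array of contour samples in traversal order.
    Returns an (n, num_scales) array; column h-1 holds scale h.
    """
    pts = np.asarray(points, dtype=float)
    sig = np.empty((len(pts), num_scales))
    for h in range(1, num_scales + 1):
        prev_pts = np.roll(pts, h, axis=0)    # p_{i-h}, wrapping around the closed contour
        next_pts = np.roll(pts, -h, axis=0)   # p_{i+h}
        a = pts - prev_pts
        b = next_pts - pts
        # Half the 2D cross product is the signed area of (p_{i-h}, p_i, p_{i+h}).
        area = 0.5 * (a[:, 0] * b[:, 1] - a[:, 1] * b[:, 0])
        # Assumed normalisation: divide by the largest magnitude at this scale.
        peak = np.max(np.abs(area))
        sig[:, h - 1] = area / peak if peak > 0 else area
    return sig
```

For a unit square sampled at its four corners, every corner receives the same normalised value at scale 1, and reversing the traversal direction flips the sign.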

Multi-scale angle-based representation
With TAR, in most cases, the shape information can be effectively described. However, TAR is sensitive to small deformations of shapes, and incorrect correspondences and inaccurate costs might arise when matching similar shapes. On the other hand, given two shapes from different categories, in some cases, the cost between the TARs of dissimilar sample points is very small, and then the two shapes might be classified into the same class. Fig. 3 shows an example of matching two similar shapes, and Fig. 4 shows another example of matching two dissimilar shapes. As shown in Fig. 3, according to the TAR-based shape matching method, $p_i \in P$ in Fig. 3a and $q_j \in Q$ in Fig. 3b are regarded as a convex point and a concave point, respectively, and then $\mathrm{TAR}(i, h)$ and $\mathrm{TAR}(j, h)$ take a large positive value and a large negative value, respectively. Therefore, the corresponding matching cost $D_{ij}$ adversely affects the final shape matching cost, and the relationships between the corresponding points on similar shapes cannot be exactly reflected. Meanwhile, Figs. 4a and b show the TARs of a sharply convex area and a sharply concave area from dissimilar shapes, respectively. As $\mathrm{TAR}(i, h)$ and $\mathrm{TAR}(j, h)$ take small positive and negative values, respectively, the resulting small $D(i, j)$ might lead to incorrect shape matching results.
With each shape represented by the point set $P$ with $\lambda$ sample points, the points are sampled uniformly from the shape boundary, where $p_i = (x_i, y_i)$. Given three sample points $p_i$, $p_{i-h}$, and $p_{i+h}$, the space angle $\theta(i, h)$ formed at $p_i$ by these points is used as the signature, where $1 \le i \le \lambda$, $1 \le h \le H$, and $H = (\lambda/2) - 1$. Fig. 5 shows an example of the constructed space angles under different scales, and, given $p_i$, $p_j$, and $p_k$, $\theta(i, h)$, $\theta(j, h)$, and $\theta(k, h)$ correspond to the examples of convex, concave, and straight-line areas, respectively. As the differences between various shapes are sensitive to changes of $\theta(i, h)$ around $0$, $\pi$, or $2\pi$, we apply a sine function to obtain the signature $\mathrm{MAR}(i, h)$, where $p_i \in P$ and $\alpha$ is introduced as a hyper-parameter. Then, the matching cost $C^h_{i,j}$ is defined at each scale $h$, where $1 \le i \le \lambda$, $1 \le j \le \lambda$, and $1 \le h \le H$. As with previous effective shape descriptors, such as SC and IDSC, the discriminative power of the signature highly depends on the particular spatial structures, and the nearby samples play much more critical roles in the construction of descriptors than the points far away. With this idea, the exponential function $e^{-\alpha h}$ is introduced to generate the weights of $C^h_{i,j}$ for different scales. The weights of the signatures at small scales obtained from adjacent samples are thereby increased, and the impacts of the signatures generated by the points far away from the reference are reduced. Therefore, given $p_i \in P$ and $q_j \in Q$, the matching cost $C_{ij} \equiv C(i, j)$ in (10) combines the per-scale costs under these weights. As $h$ increases, the value of $\theta(i, h)$ gradually approaches $0$, and the corresponding signatures are likely to lead to incorrect matching results. Fig. 6 shows an example of MAR with the weights from different scales obtained from the shape in Fig. 5.
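The space angles and the exponential scale weights can be sketched as follows. This is an illustration under stated assumptions, not the authors' formulation: the angle is measured at $p_i$ from the ray towards $p_{i-h}$ to the ray towards $p_{i+h}$ and mapped to $[0, 2\pi)$, the sine-based signature is not reproduced because its exact form is not given in the text, and the weighted sum in `combine_scale_costs` (a hypothetical helper) omits any normalisation that (10) may contain. With a clockwise traversal and the y axis pointing up, a convex corner gives an angle below $\pi$ and a straight segment gives $\pi$.

```python
import numpy as np

def space_angles(points, num_scales):
    """Angle at p_i between the rays to p_{i-h} and p_{i+h}, in [0, 2*pi)."""
    pts = np.asarray(points, dtype=float)
    angles = np.empty((len(pts), num_scales))
    for h in range(1, num_scales + 1):
        u = np.roll(pts, h, axis=0) - pts     # ray from p_i to p_{i-h}
        v = np.roll(pts, -h, axis=0) - pts    # ray from p_i to p_{i+h}
        cross = u[:, 0] * v[:, 1] - u[:, 1] * v[:, 0]
        dot = np.sum(u * v, axis=1)
        # arctan2 gives a signed angle in [-pi, pi]; the modulo maps it to [0, 2*pi).
        angles[:, h - 1] = np.mod(np.arctan2(cross, dot), 2.0 * np.pi)
    return angles

def scale_weights(num_scales, alpha):
    """Weights e^{-alpha*h} for h = 1..num_scales."""
    h = np.arange(1, num_scales + 1)
    return np.exp(-alpha * h)

def combine_scale_costs(per_scale_cost, alpha):
    """Weighted sum over the last axis of an (n, n, H) array of per-scale costs."""
    w = scale_weights(per_scale_cost.shape[-1], alpha)
    return per_scale_cost @ w
```

Because the weights decay with $h$, a cost difference at scale 1 contributes more to the combined cost than the same difference at a large scale.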

Shape feature matching
With the shape descriptors extracted from different shapes, DP is further employed to determine the correspondence of the samples from the paired shapes. Given two shapes $P$ and $Q$, a distance matrix $C = (C_{ij})$ of size $\lambda \times \lambda$ can be obtained according to (10). A desirable matching path between $P$ and $Q$ is then found with the DP algorithm. The minimum distance of the corresponding points is expressed as $C_{DP}(i, j)$. After the elements of $C_{DP}$ are initialised, they are updated from the cost matrix $C$ for $2 \le i \le \lambda$ and $\max\{2, i - \omega\} \le j \le \min\{\lambda, i + \omega\}$. The hyper-parameter $\Delta$, a penalty threshold, is determined by the matching cost between descriptors. The hyper-parameter $\omega$ is set to a small value, which is closely related to the differences between paired shapes. By introducing $\omega$ as the constraint condition, the correspondence of sample points from different shapes can be effectively determined, and the matching time decreases accordingly. Then, the shape distance between $P$ and $Q$ can be expressed as $C_{DP}(\lambda, \lambda)$. As the proposed descriptor is not rotation invariant, this problem is solved by a circular shift of $C$ before calculating $C_{DP}$. For the $g$th circular shift, the shifted matrix $C_g$ is defined with the offset $n_g = g \times \lceil \lambda / G \rceil$, where $1 \le g \le G$, and the rotation number $G$ is set to 10 empirically. Given $C^{PQ}_g(\lambda, \lambda)$ obtained from the $g$th circular shift, the improved shape distance in (14) is $\mathrm{dis}(P, Q) = \min_{1 \le g \le G} C^{PQ}_g(\lambda, \lambda)$. Furthermore, in order to eliminate the influence of the flipping of shapes, we fix the query shape $P$ during the shape matching process, and match $P$ with $Q$ and $Q'$, respectively, where $Q'$ is the flipped shape of $Q$. The minimum matching result is selected as the shape distance, and (14) is updated accordingly to take the minimum over both $Q$ and $Q'$.
To enhance the discrimination ability of the proposed descriptor, the shape complexity $\mu$ of each shape contour is introduced to improve pairwise shape distances, and it is defined as the mean difference of the maximum cost of the signature for the same sample at different scales. Then, the final shape distance $D_{PQ} \equiv D(P, Q)$ between $P$ and $Q$ is defined from the matching distance and the shape complexities $\mu_P$ and $\mu_Q$ of $P$ and $Q$, respectively. The hyper-parameter $\beta$ is introduced to eliminate the effects of shape complexities with small values, and we have $\beta = 1$ in all the experiments.
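The band-constrained DP and the minimum over circular shifts can be illustrated with a generic sketch. The recurrence below is an assumption, since the initialisation and update rules of $C_{DP}$ are not reproduced in the text: diagonal steps pay only the cell cost, while horizontal and vertical steps additionally pay `delta`. The names `banded_dp_distance` and `rotation_min_distance` are hypothetical, indices are 0-based, and the shift is applied to the columns of the cost matrix.

```python
import numpy as np

def banded_dp_distance(cost, omega, delta):
    """Accumulated cost of the cheapest monotone path from (0, 0) to (n-1, n-1).

    Only cells with |i - j| <= omega are visited. A diagonal step pays the
    cell cost; a horizontal or vertical step additionally pays `delta`
    (an assumed penalty rule, not taken from the source).
    """
    n = cost.shape[0]
    acc = np.full((n, n), np.inf)
    acc[0, 0] = cost[0, 0]
    for i in range(n):
        for j in range(max(0, i - omega), min(n, i + omega + 1)):
            if i == 0 and j == 0:
                continue
            best = np.inf
            if i > 0 and j > 0:
                best = min(best, acc[i - 1, j - 1])
            if i > 0:
                best = min(best, acc[i - 1, j] + delta)
            if j > 0:
                best = min(best, acc[i, j - 1] + delta)
            acc[i, j] = best + cost[i, j]
    return acc[n - 1, n - 1]

def rotation_min_distance(cost, omega, delta, num_shifts):
    """Smallest banded DP distance over evenly spaced circular column shifts."""
    n = cost.shape[0]
    step = int(np.ceil(n / num_shifts))       # offset n_g = g * ceil(n / G)
    return min(
        banded_dp_distance(np.roll(cost, -g * step, axis=1), omega, delta)
        for g in range(1, num_shifts + 1)
    )
```

The band limits the work per shift to roughly $\lambda(2\omega + 1)$ cells instead of $\lambda^2$, which is why a small $\omega$ shortens the matching time. Handling flipped shapes would repeat the same call on the cost matrix of the flipped contour and keep the smaller value.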

Dynamic LP
In this section, to improve the retrieval results obtained from pairwise shape distances, we introduce a shape retrieval scheme named DLP within a semi-supervised framework. The proposed method, which is closely related to LP, provides a diffusion process based on a time-variant state space $S_t$, which is constructed on the basis of the affinity matrix $W$ to search for related objects in the regions near the labelled objects. Given the distance matrix $D = (D_{ij})$ for a database with $N$ shapes, the similarity $\mathrm{sim}_{ij}$ is defined with a bandwidth $\sigma_{ij}$. The choice of $\sigma_{ij}$ depends on an adaptive kernel size determined by $\mathrm{knn}_d(i)$, the set of $K_d$ nearest neighbours (NNs) based on the pairwise distances. Given a shape database with a large number of objects, redundant connections between elements from different categories may cause incorrect retrieval results, and we therefore introduce the affinity matrix $W = (W_{ij})$ based on mutual $k$-NN, where $\mathrm{knn}_s(j)$ is the set of $K_s$ NNs based on $\mathrm{sim}_{ij}$. For iteration $t$, we have the state space $S_t$ with $N_t$ elements divided into two subsets: $U_t$ with unlabelled samples and $R_t$ with labelled samples, and each state corresponds to a sample from the database. For each $R_t$ with labelled objects, we are mainly concerned with a small number of shapes within the local areas adjacent to the labelled objects, and $S_t$ is defined accordingly. With $W_t$ as the sub-matrix of $W$ corresponding to $S_t$, the transition matrix $P_t = (p^t_{ij})$, obtained by row-wise normalisation of $W_t$, is partitioned into the blocks $P^t_{UU} = (p^t_{ii'})$, $P^t_{UR} = (p^t_{ij'})$, $P^t_{RU} = (p^t_{ji'})$, and $P^t_{RR} = (p^t_{jj'})$, with $i \in U$, $i' \in U$, $j \in R$, and $j' \in R$.
As it takes a period of time to propagate the information from labelled samples to unlabelled samples, we introduce a group of internal loops to determine the similarities in each iteration $t$. With a function $f : X \to [0, 1]$, $f^t_l(i)$ denotes the similarity between $i \in S_t$ and the subset $R_t$ in loop $l$, and we set $f^t_0(i) = 0$ for $i \in U_t$ and $f^t_l(j) = 1$ for $j \in R_t$, where $0 \le l \le L$. $f^t_l(i)$ is updated as in Algorithm 1 (see Fig. 7). After $L$ loops of updating $f^t_l(i)$, we have $L_{t+1} = L_t + \mathrm{knn}(f^t_L)$ as the extended subset with labelled objects, where $\mathrm{knn}(f^t_L)$ denotes the $K_f$ NNs of $R_t$ based on $f^t_L(i)$, and the final retrieval results are obtained from $L_T$ after $T$ iterations. In some cases, the graph corresponding to $W$ obtained from (21) is composed of a set of connected components with subsets $\{S_k : 1 \le k \le K\}$, and, given the queries from one or several components, the related objects in the other components cannot receive information from the query objects. To solve this problem, with $G_t$ as the union of the components containing $R_t$, a temporary $W' = (W'_{ij})$ is constructed, where $P'_{kt} = (p'_{ij})$ is a sub-matrix of $P'$ obtained from a fully connected graph of the database based on the similarities, with $i \in S_k$ and $j \in C_t$, and $S_k \in \{S_i : \ldots\}$. Given $N \in S$ as the query and $N_Q$ as the number of objects expected to be retrieved, the proposed method is summarised as the pseudo-code of Algorithm 1.
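A minimal sketch of the graph construction is given here under explicit assumptions, because the defining equations are not reproduced in the text. The Gaussian kernel with $\sigma_{ij}$ equal to $\alpha$ times the mean $K_d$-NN distance of $i$ and $j$ follows a common choice in the LP literature and is assumed rather than confirmed for this paper, and the helper names are hypothetical. The whole database is normalised at once instead of the sub-matrix $W_t$.

```python
import numpy as np

def adaptive_similarity(dist, k_d, alpha):
    """Gaussian similarity with a per-pair bandwidth (assumed form).

    sigma_ij = alpha * mean of the distances from i and from j to their
    k_d nearest neighbours; sim_ij = exp(-dist_ij**2 / sigma_ij**2).
    """
    # Sort each row; column 0 is the zero self-distance, so it is skipped.
    knn_mean = np.sort(dist, axis=1)[:, 1:k_d + 1].mean(axis=1)
    sigma = alpha * 0.5 * (knn_mean[:, None] + knn_mean[None, :])
    return np.exp(-(dist ** 2) / (sigma ** 2))

def mutual_knn_affinity(sim, k_s):
    """Keep sim_ij only when i and j are among each other's k_s most similar."""
    n = sim.shape[0]
    masked = sim.copy()
    np.fill_diagonal(masked, -np.inf)          # exclude self-matches from the ranking
    order = np.argsort(-masked, axis=1)[:, :k_s]
    is_nn = np.zeros((n, n), dtype=bool)
    is_nn[np.arange(n)[:, None], order] = True
    mutual = is_nn & is_nn.T
    return np.where(mutual, sim, 0.0)

def row_normalise(w):
    """Row-stochastic transition matrix; all-zero rows are left as zeros."""
    sums = w.sum(axis=1, keepdims=True)
    return np.divide(w, sums, out=np.zeros_like(w), where=sums > 0)
```

Requiring the neighbour relation to hold in both directions removes one-sided links, which is one way to suppress the redundant cross-category connections mentioned above.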
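The iterative growth of the labelled set can be sketched as below. This is a simplified illustration, not Algorithm 1 itself, which is not reproduced in the text: the inner update $f \leftarrow Pf$ with labelled entries clamped to 1 is the classical LP rule and is assumed here, the diffusion runs over the whole database rather than the time-variant state space $S_t$, and the handling of disconnected components via $W'$ is omitted. The function name and argument names are hypothetical.

```python
import numpy as np

def dynamic_label_propagation(p, query, num_retrieve, k_f=5, num_loops=200):
    """Grow the labelled set outward from `query`, k_f samples per iteration.

    p: (n, n) row-stochastic transition matrix.
    Returns indices in the order they were labelled (query first).
    """
    n = p.shape[0]
    labelled = [query]
    while len(labelled) < min(num_retrieve, n):
        is_labelled = np.zeros(n, dtype=bool)
        is_labelled[labelled] = True
        f = is_labelled.astype(float)          # f = 1 on labelled, 0 elsewhere
        for _ in range(num_loops):
            f = p @ f                          # diffuse one step
            f[is_labelled] = 1.0               # clamp labelled samples
        f[is_labelled] = -np.inf               # never re-select labelled samples
        room = min(k_f, num_retrieve - len(labelled), n - len(labelled))
        newest = np.argsort(-f, kind="stable")[:room]
        labelled.extend(int(i) for i in newest)
    return labelled
```

On a chain-like graph, the samples are labelled in order of their graph distance from the query, which mirrors the near-to-far retrieval described above. In this sketch, samples that the query cannot reach keep a score of zero and are returned last in index order.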

Experimental results
To evaluate the effectiveness of the proposed shape retrieval method, experiments are carried out on both artificial and real-world shape databases in this section, and the shape retrieval results of the proposed method and other methods are compared and discussed. The retrieval score is introduced to evaluate the shape retrieval performance. With each shape viewed as the query and compared with all shapes, the number of objects from the same class as the query among the $M$ most similar candidates is counted, including the self-match, where $M$ denotes the number of samples in each category. The retrieval score is defined as the ratio of the number of correctly retrieved objects to the total number of related objects.
In the following experiments, each shape contour is represented by $\lambda = 100$ sample points, which are used to construct the shape descriptors. As given in (10), we set $H = 25$ and $\alpha = 0.25$ when calculating the matching cost between each pair of sample points. In the DP process, the penalty threshold $\Delta$ is set to 1.5, and the hyper-parameter $\omega$ is set to 20. In the post-processing step, the separate parameter $\alpha$ in (19) is set to 0.4 for generating the similarities from the shape distances. For the process of constructing the affinity matrix $W$, the hyper-parameters $K_d$ and $K_s$ are closely related to the size of the database, and they are set to different values for different experimental databases. The number of internal loops $L$ is set to 200, and $K_f = 5$ unlabelled samples are added to $R_t$ in each iteration.
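The retrieval score described above can be computed directly from a pairwise distance matrix. The following sketch assumes equally sized classes and breaks ties by index order; the function name `retrieval_score` and its `top_k` argument are illustrative. Setting `top_k` to the class size $M$ gives the score as defined above.

```python
import numpy as np

def retrieval_score(dist, labels, top_k):
    """Fraction of same-class shapes found among each query's top_k matches.

    dist: (n, n) pairwise distance matrix (smaller means more similar).
    labels: length-n class labels; classes are assumed to be equally sized.
    The self-match is counted, as in the protocol described above.
    """
    labels = np.asarray(labels)
    nearest = np.argsort(dist, axis=1, kind="stable")[:, :top_k]
    hits = labels[nearest] == labels[:, None]
    class_size = np.count_nonzero(labels == labels[0])
    return hits.sum() / (class_size * len(labels))
```

A `top_k` larger than the class size makes the count more lenient, since a same-class shape then only needs to appear somewhere in a longer candidate list.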

Kimia-99 shape database
In this subsection, the proposed methods are tested on the Kimia-99 database [33], a small artificial database consisting of 99 images from nine different classes; all the images from this database are shown in Fig. 8. With each shape viewed as the query and compared with all the others, the retrieval results on this database are summarised as the number of correct matches among the top 1 to top 10 closest matches, and the best possible value for each is 99.
As the size of the Kimia-99 database is small, with $K_d = 5$, we have $K_s = N - 1 = 98$ to construct a fully connected graph to generate $W$. Table 1 gives the shape retrieval results for different shape matching methods on the Kimia-99 database. As shown in Table 1, among the pairwise shape retrieval approaches, the proposed shape matching method performs comparably to the best-known algorithms. With the proposed pairwise shape matching algorithm taken as the baseline, the proposed DLP achieves a perfect retrieval score of 100% as the post-processing step.

MPEG-7 shape database
The widely used MPEG-7 database [36] consists of 1400 silhouette images from 70 different categories. This database contains a variety of objects including both natural and artificial targets, and Fig. 9 shows the examples from different classes of this database. There exist large variations in the shapes from each class in the MPEG-7 database, which is a huge challenge for object retrieval.
To evaluate the effects of different approaches on the MPEG-7 database, the bull's eye score, which is the most commonly used indicator for this database, is introduced in the experiments. With each shape regarded as the query, the number of shapes from the same class in the top 40 most similar candidates is counted, and the score is defined as the ratio of the number of members from the same class to the best possible number (which is 20 × 1400).
In this experiment, $K_d$ and $K_s$ are set to 20 and 12, respectively. The retrieval results based on the bull's eye score for the MPEG-7 database are listed in Table 2. According to the results on the MPEG-7 database, the proposed pairwise shape matching method achieves a bull's eye score of 88.94%, and a bull's eye score of 94.09% is obtained by using the proposed DLP as the post-processing step of shape matching. Therefore, the proposed pairwise shape matching algorithm performs comparably with the best-known algorithms. With DLP, our algorithm also achieves a highly competitive retrieval result, raising the baseline bull's eye score by more than 5 percentage points.

Tari-1000 shape database
We test the proposed method on the Tari-1000 database [42] in this subsection, and it contains 1000 shapes divided into 50 classes of 20 silhouettes each. This database has more articulation changes within each class than the MPEG-7 database [43], and Fig. 10 shows examples of shapes from this database.
In this experiment, $K_d$ is set to 10 and $K_s$ is set to 12. The retrieval results of different methods on the Tari-1000 database are listed in Table 3. According to the retrieval results, the proposed method achieves more desirable pairwise shape matching results. Also, for the distance learning step, the proposed retrieval algorithm achieves a score of 99.41%, which is superior to other methods based on the pairwise shape distances.
In previous classification experiments, this database is divided into training samples and testing samples. However, in this subsection, the database viewed as a whole is used for testing the retrieval performance of different methods. The retrieval score is used to evaluate the shape retrieval performance in this subsection, and the 75 most similar candidates are counted, including the self-match. For this experiment, $K_d = 12$ and $K_s = 18$ are selected for generating the affinity matrix. Table 4 gives the shape retrieval results of the proposed methods and other retrieval methods. The retrieval score obtained by using the proposed pairwise shape matching method is 80.08%, and a retrieval score of 94.38% is obtained by introducing DLP as the post-processing step. The baseline retrieval score is thus improved by more than 14 percentage points. As can be seen, our proposed method performs better than the other selected methods in this experiment.

One-hundred plant species leaves data set from UCI
We employ another real-world leaf data set downloaded from the UCI Machine Learning Repository in this subsection. This database, first provided in [45], consists of 100 varieties of leaves, with 16 examples collected for each variety. Fig. 12 shows the binary images from some of the classes in this database, and all the binary images can be downloaded from the website of the UCI data sets. It is used here as a database for shape retrieval, rather than as a classification database as in previous research studies. It is worth noting that this database consists of a wide set of classes with a low number of examples each. Thus, in this experiment, the difficulty is to find the within-class objects rather than the very similar objects from other classes.
By counting the 16 most similar candidates, including the self-match, the retrieval score is used to evaluate the shape retrieval performance of different methods. $K_d = 12$ and $K_s = 12$ are selected for generating the affinity matrix for this experiment. Table 5 gives the shape retrieval results of the proposed method and other shape retrieval methods. As shown in the table, the proposed pairwise shape matching method achieves a retrieval score of 66.73%, and, with the post-processing step introduced, we have a retrieval score of 84.66%, which raises the original retrieval score by nearly 18 percentage points.

Table 3 Comparison of results for different algorithms tested on the Tari-1000 database.

Conclusion
In this paper, a novel pairwise shape matching method based on MAR is proposed, and, compared to the TAR method, the proposed descriptor not only has strong discriminative power but also is robust to noises and small deformations. By introducing the DP algorithm, the correspondence between sample points from different shapes can be effectively determined, and shape matching results can be calculated accurately. To improve the pairwise shape matching results, we further propose DLP as the post-processing step, and the manifold structure of the database can be effectively captured within the semi-supervised learning framework. Experimental results on artificial and real-world databases demonstrate that the proposed method performs better than the other methods.