Extraction of compact boundary normalisation based geometric descriptors for affine invariant shape retrieval

Shape recognition and retrieval on non-rigid objects is a complex task that can be performed effectively using a set of compact shape descriptors. This paper presents a new technique for generating normalised contour points from shape silhouettes: the object contour is identified from the image, and the object area normalisation (OAN) method is then used to partition the object into equal-area segments with respect to the shape centroid. These contour points are used to derive six descriptors: compact centroid distance (CCD), central angle (ANG), normalised points distance (NPD), centroid distance ratio (CDR), angular pattern descriptor (APD) and multi-triangle area representation (MTAR). Each descriptor is a 1D shape feature vector which preserves both contour and region information of the shapes. The performance of the proposed descriptors is evaluated on the MPEG-7 Part-A, Part-B and multi-view curve dataset images. The experiments assess the proposed shape descriptors' robustness to affine transformations and their image retrieval performance. A comparative study has also been carried out to evaluate our approach against other state-of-the-art approaches. The results show that the image retrieval rate of the OAN approach is significantly better than that of other existing shape descriptors.


INTRODUCTION
Images carry a rich amount of visual information and communicate a subject more easily than other existing forms of data such as symbols, text and voice. Natural and artificial images from consumer digital cameras, social media, websites and application-specific fields such as medical, industrial, satellite and remote sensing imaging are some of the sources of image data. There is currently a demand for efficient and effective image analysis techniques that can keep pace with the rapid growth of image databases. Researchers in image analysis are therefore keenly interested in tasks such as recognising specific objects in images and searching for similar objects in image collections. In both cases, image features such as colour, texture, shape, spatial location and interest points are extracted to describe the objects in images [1-3].
Image retrieval research is nowadays a part of computer vision, which aims to make machine vision tasks resemble the human visual system. In computer vision systems, the shapes of objects are significantly important, and about 71% of image retrieval is performed using shape features, since shapes provide strong evidence for identifying the objects in an image [4]. This paper deals with regular and arbitrarily shaped objects, as object shapes may undergo projective or geometrical transformations due to camera movement or object motion in real-world image analysis. These viewpoint changes are simulated in experiments using affine transformations [5,6]. The general affine transformations, namely translation, rotation, scaling and shearing, are used to model viewpoint changes in shape recognition applications [7]. Image analysis techniques with affine invariance support are significantly beneficial in automating computer vision and pattern recognition applications such as image retrieval, object recognition, industrial inspection, robot navigation, scene classification, pose detection and face recognition [2].

FIGURE 1 Overall framework of the proposed OAN shape descriptors for image retrieval task
In order to carry out shape matching of image objects effectively, boundary or contour normalisation techniques are widely used, as they save time and space in the feature extraction and similarity matching processes [8-16]. The normalised or sampled contour points are representative points used to represent the shapes in a compact manner. Conventional shape matching and retrieval applications perform equal distance normalisation (EDN) on the shape contours [4, 8-13]. In addition to EDN, the boundary can also be normalised by accounting for the enclosed area of the shape, which captures more complete shape information [6, 14-17]. The shape descriptors proposed in this work exploit an enclosed-area-based boundary normalisation called object area normalisation (OAN), which can be implemented using either triangle-area (TA) or sector-area (SA) approaches [6,14]. The pixel coordinates of the normalised contour points generated by these two methods may vary for the same shape; however, each method preserves its own uniqueness for every shape in both original and affine-transformed images. Among boundary normalisation methods, the OAN approach performs well and captures both region and contour information effectively. The schematic diagram of the proposed work is shown in Figure 1.
The shape feature extraction process is applied to the input probe image online and to the dataset images offline. Prior to feature extraction, several preprocessing steps are applied: first, a minimum bounding box is fitted to cover the object shape, ignoring background details beyond this box; next, all shapes are resized to a uniform height of 256 rows while preserving the column aspect ratio; finally, holes present inside the shape are removed. This makes the shapes suitable for further processing. The feature extraction process involves contour extraction, contour normalisation and generation of the OAN descriptors. These descriptors are 1D shape feature vectors which preserve contour and region information of the shapes. Finally, the image retrieval process searches for shapes relevant to a query image based on a criterion defined by similarity matching functions. The dynamic programming based HopDSW algorithm is used in this paper for the similarity matching task [15].
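The preprocessing steps described above can be sketched as follows, assuming the shape arrives as a binary NumPy mask. The function name `preprocess`, the flood-fill hole removal and the nearest-neighbour resize are illustrative choices, not the paper's exact implementation.

```python
import numpy as np

def preprocess(mask, target_rows=256):
    """Crop the shape to its minimum bounding box, fill interior holes,
    and resize to a fixed number of rows while keeping the aspect ratio."""
    rows = np.any(mask, axis=1)
    cols = np.any(mask, axis=0)
    r0, r1 = np.where(rows)[0][[0, -1]]
    c0, c1 = np.where(cols)[0][[0, -1]]
    crop = mask[r0:r1 + 1, c0:c1 + 1].astype(bool)

    # Hole removal: flood-fill the background from the image border;
    # any non-shape pixel never reached is an interior hole.
    h, w = crop.shape
    outside = np.zeros((h, w), dtype=bool)
    stack = [(r, c) for r in range(h) for c in (0, w - 1) if not crop[r, c]]
    stack += [(r, c) for c in range(w) for r in (0, h - 1) if not crop[r, c]]
    while stack:
        r, c = stack.pop()
        if outside[r, c] or crop[r, c]:
            continue
        outside[r, c] = True
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            rr, cc = r + dr, c + dc
            if 0 <= rr < h and 0 <= cc < w and not outside[rr, cc] and not crop[rr, cc]:
                stack.append((rr, cc))
    filled = crop | ~outside

    # Nearest-neighbour resize to target_rows, preserving the column aspect ratio.
    target_cols = max(1, round(w * target_rows / h))
    ri = (np.arange(target_rows) * h / target_rows).astype(int)
    ci = (np.arange(target_cols) * w / target_cols).astype(int)
    return filled[np.ix_(ri, ci)]
```

In a real pipeline the flood fill would typically be replaced by a library routine (e.g. a morphological hole-filling operation), but the sketch keeps the dependencies to NumPy only.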
The present work aims at deriving six contour-based geometric shape features, namely compact centroid distance (CCD), central angle (ANG), normalised points distance (NPD), centroid distance ratio (CDR), angular pattern descriptor (APD) and multi-triangle area representation (MTAR), from the normalised contour points; these features are compact and affine invariant. The rest of the paper is organized as follows: Section 2 gives a detailed literature review; Section 3 elaborates the working concepts of OAN and shape descriptor extraction; Section 4 presents experimental results in terms of affine invariance tests and image retrieval tests; Section 5 summarizes the work along with future research directions.

RELATED WORKS
Shape feature extraction is a crucial step in image analysis and has gained much attention because of its wide use in computer vision applications. The primary classification of shape descriptors is based on whether they use region or contour information [16,18,19]. Among the various types of shape feature extraction, the concept of centroid distance plays a significant role in the literature, because it is easy to calculate and simple to represent [12, 20-23]. Many shape description methods have adopted contour normalisation for compact representation and fast matching of features [4,8,9,11,13, 24-26]. Shape descriptors with affine invariance support have been the focus of many authors [24, 27-31]. Recently, owing to the availability of large computational resources, many authors have used feature fusion and multi-scale representation of shapes, which consequently increase overall performance [2,10,17, 32-36].
The multi-resolution rotation, scaling, translation (RST) invariant shape descriptor designed by Attalla and Siy is based on polygon approximation [20]. Their approach extracts three features from the equidistant normalised points, namely centroid distance, chord-to-centre angle and segment chord-to-arc ratio. The contour points distribution histogram (CPDH) is an RST-invariant global shape descriptor which extracts object features in polar coordinates [21]. A minimum circumscribed circle is formed in the polar coordinate system and divided into segments of equal internal angle with respect to the shape centroid; the number of boundary points falling in each partition is distributed into histogram bins for representation. Shi et al. developed a space-symmetry-based contour descriptor; it is based on a circumscribed circle algorithm which computes the distance between the two farthest points from the centroid to form a central axis [22]. Revollo et al. proposed an approach to analyse bilateral and radial symmetry based shape descriptors, extracted from the sequence of contour points with respect to the shape centroid, for automatic shape recognition and classification of diatoms [23]. Later, Yang et al. proposed a triangular centroid distances (TCDs) shape descriptor, extracted from equidistant normalised contour points, for 2D non-rigid partial shape matching [12].
The classical curvature scale space (CSS) descriptor, proposed by Abbasi et al., normalises the contour points by an equal arc-length parameter, and a Gaussian function is then applied to produce curvature zero-crossing points [8]. The limitations of the basic CSS descriptor are the problem of shallow concavities and the normalisation of rotation and starting point. Giannekou et al. used affine invariant curve normalisation to overcome the rotation and starting point issues [24]. The shallow convexity/concavity issue has been successfully addressed by several other researchers [4,9]. In particular, the multi-scale convexity concavity (MCC) descriptor determines convexity/concavity information of all normalised contour points at different scale levels [9]. The hierarchical string cut (HSC) descriptor performs multiple-level curve normalisation using uniform sampling; strings are drawn between the normalised contour points to represent the shapes [11]. Pedrosa et al. developed a shape salience descriptor, which normalises the contour by high-curvature points [25]. In this technique, each normalised point is represented by its angular position relative to the centroid and its curvature value. The beam angle statistics (BAS) descriptor captures the topological structure of the shape using 1D BAS moment functions [13]. The BAS function computes the angle between a point and two adjacent equidistant points, i.e. a pair of bearings on the shape boundary. Similarly, the online-to-offline (O2O) method by Zheng et al. learns the topological structure of the shape using sequences of similar shapes for fast 2D shape retrieval [26].
An affine invariant shape descriptor by Mao et al. computes the azimuth angle and the centroid distance ratio of opposite contour points to the barycentre of the shape [27]. Avrithis et al. developed an affine invariant curve normalisation (CN) approach to align curves of original and affine-transformed images for object shape representation, classification and retrieval [28]. They used a curve orthogonalisation approach on affine-transformed shapes to reduce the effects of translation, scale and shear transformations. The affine-invariant curve descriptor (AICD) developed by Fu et al. performs curve segmentation with respect to contour zero-crossing points [29]. Lakehal et al. used an affine length to parameterise the contour curve into 'n' zero-crossing points [30]. An invariant multi-scale shape descriptor by Yang et al. samples the contour to extract various features using an adaptive discrete contour evolution technique [31].
In recent reports, feature fusion and multi-scale shape description methods have been widely used as they capture the maximum information of shapes [2,4,10,17, 32-36]. An affine invariant feature fusion method, introduced by Yu et al., performs parallel extraction of geometric features and two-stage double biologically inspired transformation (DBIT) features from images [17]. This method uses the Pearson correlation distance in a weighted fusion strategy to combine the geometric and DBIT features for object recognition tasks. The significance of feature fusion is demonstrated by Abro et al., who tested two fusion strategies, direct concatenation and discriminant correlation analysis, on four shape features: Fourier descriptors, hierarchical centroids, shape context and shape moments [32]. Wei et al. represent complex shapes using V4 local features (position, scale, orientation and shape curve features) extracted from curve segments of the shape contours [33]. Their approach combines groups of V4 features, and self-organizing map (SOM) based learning is applied for the object detection task.
The angular pattern and binary angular pattern (AP & BAP) descriptors are multi-scale feature extraction techniques which extract angular information from equidistant normalised contour points [10]; combined with a sequential backward selection algorithm, these descriptors are capable of improving image retrieval performance [34]. The triangle area representation (TAR) descriptor proposed by Alajlan et al. uses uniform sampling at multiple scales to find convexity/concavity information [4]. The TAR signature is also used by Mouine et al. for leaf image classification [35]. Hu et al. developed a common base triangle area (CBTA) descriptor which uses a series of base-fixed and vertex-varied triangles to represent shape objects [2]. Recently, Zheng et al. extracted a shape histogram based multi-scale Fourier descriptor using group features (MSFDGF-SH) from coarser contours of the shapes for 2D shape matching applications [36].

OBJECT AREA NORMALISED COMPACT SHAPE DESCRIPTORS

Object area normalisation
The OAN is a boundary normalisation technique which produces normalised boundary points to represent shapes. Initially, preprocessing techniques such as background removal, zero padding, resizing and hole removal are performed, and then the contour is extracted by a boundary tracing algorithm. The output of this step is a sequence of continuous contour pixels. Finally, the OAN process is applied on the continuous contour to produce normalised contour points {P_j, j ∈ {0, 1, 2, …, N−1}}, where N is the total number of normalised contour points. The step-by-step procedure of OAN is summarized in Algorithm 1 and Figure 2. A pictorial illustration of contour extraction and contour normalisation on the 'bell' shape is shown in Figure 3.

FIGURE 2 Steps involved in OAN-based contour normalisation
Algorithm 1 (OAN):
1. Extract the continuous contour points Γ_i of the shape by boundary tracing.
2. Compute the total area of the shape (S), centroid (G), and average part area (S_part) values.
3. Find the sector area between the three points (Γ_i, G, P_j) and compare it with S_part.
4. Repeat step 3 along the contour, emitting a normalised point P_j each time the accumulated area reaches S_part, until N points are obtained.
The OAN process involves computing the total area (S) and average part area (S_part) of the given shape object using Equations (1) and (2); the average part area is S_part = S/N, where N is the number of normalised contour points. The sector area between the three points (Γ, G, P) in Figure 3(c) is calculated by Equation (3).
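The steps above can be sketched as a short NumPy routine. It assumes the contour is a closed sequence of (x, y) points, computes S with the shoelace formula, takes G as the mean of the contour points, and emits a normalised point at the first contour pixel where the accumulated sector area reaches S_part; the paper's exact centroid definition and sub-pixel handling may differ.

```python
import numpy as np

def oan_normalise(contour, n_points=64):
    """Sector-area OAN sketch: pick N contour points that divide the
    enclosed area into N (approximately) equal parts about the centroid G."""
    pts = np.asarray(contour, dtype=float)
    x, y = pts[:, 0], pts[:, 1]
    # Total enclosed area S via the shoelace formula (cf. Eq. 1).
    S = 0.5 * abs((x * np.roll(y, -1) - np.roll(x, -1) * y).sum())
    S_part = S / n_points                 # average part area (cf. Eq. 2)
    G = pts.mean(axis=0)                  # centroid (assumed: contour mean)

    normalised = [pts[0]]
    acc = 0.0
    for i in range(len(pts)):
        a = pts[i] - G
        b = pts[(i + 1) % len(pts)] - G
        # Sector (triangle fan) area swept about G (cf. Eq. 3).
        acc += 0.5 * abs(a[0] * b[1] - a[1] * b[0])
        if acc >= S_part and len(normalised) < n_points:
            normalised.append(pts[(i + 1) % len(pts)])
            acc -= S_part
    return np.array(normalised), G
```

Applied to a densely sampled circle, the routine returns 64 points that remain on the circle and are evenly spread by swept area.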

Shape descriptors based on OAN
In this section, a set of six features is extracted from the OAN contour: CCD, ANG, NPD, CDR, APD and MTAR. The extraction of these features from the 'bell' shape is illustrated in Figures 4(a-c) and 5(a-c).

CCD descriptor
This is a powerful shape descriptor derived from the OAN contour. The Euclidean distance metric is used to compute the CCD values between the centroid G and the points P_j, as shown in Equation (4).
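A minimal sketch of the CCD computation. Scaling by the mean distance is an assumption added here for scale invariance; the paper's Equation (4) may normalise differently.

```python
import numpy as np

def ccd(points, centroid):
    """Compact centroid distance: Euclidean distance from the centroid G
    to every normalised contour point P_j (cf. Eq. 4), divided by the
    mean distance -- an assumed normalisation for scale invariance."""
    d = np.linalg.norm(np.asarray(points, dtype=float) - np.asarray(centroid, dtype=float), axis=1)
    return d / d.mean()
```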

ANG descriptor
The ANG feature extracts the central angle between two adjacent normalised contour points (P_j, P_j+1) with respect to the centroid G. The ANG feature extraction is illustrated in Figure 4(b). The ANG descriptor is a 1D feature vector computed using Equation (5).
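The central angle can be sketched as the angle between the vectors from G to P_j and from G to P_j+1; whether Equation (5) uses degrees or radians is an assumption here.

```python
import numpy as np

def ang(points, centroid):
    """Central angle (in degrees) between adjacent normalised points
    (P_j, P_j+1) with respect to the centroid G (cf. Eq. 5)."""
    v = np.asarray(points, dtype=float) - np.asarray(centroid, dtype=float)
    w = np.roll(v, -1, axis=0)  # vectors to the next point, wrapping around
    cos = (v * w).sum(axis=1) / (np.linalg.norm(v, axis=1) * np.linalg.norm(w, axis=1))
    return np.degrees(np.arccos(np.clip(cos, -1.0, 1.0)))
```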

NPD descriptor
The NPD descriptor captures the straight-line distance Dist(P_j, P_j+1) between adjacent normalised contour points using the Euclidean distance metric. The NPD feature extraction is illustrated in Figure 4(c).
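A one-line sketch of the NPD feature, treating the normalised points as a closed sequence (the last point pairs with the first):

```python
import numpy as np

def npd(points):
    """Normalised points distance: straight-line Euclidean distance
    Dist(P_j, P_j+1) between adjacent normalised contour points."""
    p = np.asarray(points, dtype=float)
    return np.linalg.norm(np.roll(p, -1, axis=0) - p, axis=1)
```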

CDR descriptor
The centroid distance ratios of opposite normalised contour points are calculated and stored in a CDR array of size N/2, where N = 64 in the present experiments. The CDR feature extraction is illustrated in Figure 5(a).
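A sketch of the CDR feature pairing each P_j with its diametrically opposite point P_j+N/2. Taking the smaller distance over the larger one is an assumption that keeps the ratio independent of ordering; the paper's exact formulation may differ.

```python
import numpy as np

def cdr(points, centroid):
    """Centroid distance ratio of opposite normalised contour points
    P_j and P_{j+N/2}, yielding N/2 values in (0, 1]."""
    d = np.linalg.norm(np.asarray(points, dtype=float) - np.asarray(centroid, dtype=float), axis=1)
    half = len(d) // 2
    a, b = d[:half], d[half:2 * half]
    # min/max keeps the ratio the same whichever point of the pair comes first.
    return np.minimum(a, b) / np.maximum(a, b)
```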

AP descriptor
The AP descriptor was proposed by Hu et al. [10]. The APD feature extraction from the N normalised contour points is illustrated in Figure 5(b). The AP descriptor extracted from the point P_j is calculated using Equation (8).
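As an illustration only, an angular-pattern-style feature can be sketched as the interior angle at each P_j formed with its k-th neighbours on either side; the exact formulation of Equation (8) and of the AP descriptor of Hu et al. [10] is not reproduced here and may differ.

```python
import numpy as np

def ap(points, k=1):
    """Illustrative angular-pattern feature: the angle (in degrees) at
    each P_j between the vectors towards P_{j-k} and P_{j+k}.  This is a
    stand-in sketch, not the exact AP descriptor of [10]."""
    p = np.asarray(points, dtype=float)
    u = np.roll(p, k, axis=0) - p    # vector towards P_{j-k}
    v = np.roll(p, -k, axis=0) - p   # vector towards P_{j+k}
    cos = (u * v).sum(axis=1) / (np.linalg.norm(u, axis=1) * np.linalg.norm(v, axis=1))
    return np.degrees(np.arccos(np.clip(cos, -1.0, 1.0)))
```

Varying k over several values yields the multi-scale character attributed to the AP/BAP family.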

MTAR descriptor
The MTAR shape descriptor is a multi-scale version of TAR signature developed by Alajlan et al. [4]. The original MTAR signature is extracted from equidistant normalised boundary points.
In the present work, the MTAR signature is extracted from OAN contour for better image retrieval results.
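The TAR/MTAR computation can be sketched as follows: at scale t, TAR is the signed area of the triangle (P_{j-t}, P_j, P_{j+t}), and MTAR stacks TAR over several scales. The particular scale set (1, 2, 4) is an illustrative choice, not the paper's.

```python
import numpy as np

def tar(points, scale=1):
    """Triangle area representation at one scale t: signed area of the
    triangle (P_{j-t}, P_j, P_{j+t}); the sign encodes convexity/concavity."""
    p = np.asarray(points, dtype=float)
    a = np.roll(p, scale, axis=0)    # P_{j-t}
    c = np.roll(p, -scale, axis=0)   # P_{j+t}
    return 0.5 * ((a[:, 0] - p[:, 0]) * (c[:, 1] - p[:, 1])
                  - (c[:, 0] - p[:, 0]) * (a[:, 1] - p[:, 1]))

def mtar(points, scales=(1, 2, 4)):
    """Multi-scale TAR: one TAR row per scale (scale set is illustrative)."""
    return np.stack([tar(points, t) for t in scales])
```

For collinear points the triangle degenerates and the TAR value is zero, which is how flat contour segments show up in the signature.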

EXPERIMENTAL RESULTS
This section presents the experiments performed to demonstrate the effectiveness of the OAN descriptors, concentrating on two aspects: robustness to affine transformations and image retrieval rates. In the present experiments, the shape contour is normalised with N = 64 and the starting point (P_0) is selected as the farthest contour point from the shape centroid. This section first describes the datasets used in the experiments, and then elaborates the performance of the OAN descriptors in terms of affine invariant image retrieval and shape similarity retrieval.

Image datasets
The performance of the OAN shape descriptors is tested on two benchmark datasets and three customized datasets using the 'Matlab' platform. The images in these datasets are binary images with single closed-contour objects. Figure 6 shows sample shapes from the MPEG-7 CE Shape-1 Part-B dataset and the multi-view curve dataset (MCD). The MPEG-7 Core Experiment CE Shape-1 Part-B dataset and the MCD dataset are benchmark datasets used to test shape similarity retrieval and affine invariance, respectively. The three customized datasets are (i) the MPEG-7 Part-A1 dataset (for scale invariance), (ii) the MPEG-7 Part-A2 dataset (for rotation invariance), and (iii) the MPEG-7 Part-A3 dataset (for shear invariance). The MPEG-7 Part-A1 and Part-A2 datasets are created based on the MPEG-7 content description standard. The purpose of the MPEG-7 Part-A datasets is to evaluate the performance of shape descriptors under viewpoint change [4,5,13]. The MPEG-7 CE Shape-1 Part-B dataset is a well-known benchmark dataset defined by ISO/IEC, created from natural and man-made objects with rigid and non-rigid deformations. Latecki et al. described this dataset for the image retrieval task [5]. The total number of images (T) in this dataset is 1400, distributed into 70 classes (C) with 20 images (I) in each class. The MCD dataset is a specialized dataset derived from MPEG-7 Part-B images to facilitate affine invariance testing [7]. It takes 40 different classes of images from the MPEG-7 Part-B dataset; from each class, seven affine-transformed images and their mirrored versions are generated, giving 14 images per class and 40 × 14 = 560 images in total.

Affine invariant image retrieval
This section demonstrates the robustness of the OAN descriptors to affine transformations such as scaling, rotation and shearing, along with the image retrieval task. The image retrieval process searches for shapes relevant to a query image based on the criterion defined by the HopDSW algorithm [15]. In this subsection, affine invariant image retrieval is performed on four datasets: MPEG-7 CE Shape-1 Part-A1, A2, A3 and the MCD dataset. Each of the 420 images in these datasets is used as a query, and its retrieval count is taken into account when calculating the overall retrieval performance of the OAN descriptors on the Part-A datasets. The retrieval performance of the six OAN descriptors using the TA and SA approaches on the MPEG-7 Part-A1, A2 and A3 datasets is summarized in Table 1.

MPEG-7 Part-A datasets
As shown in Table 1, the SA-based OAN descriptors produce higher retrieval rates than the TA-based OAN descriptors. Among the individual signatures, the CCD descriptor gives the highest retrieval rates. The retrieval rates of the other individual features (e.g. ANG, CDR, APD) are relatively low; for this reason, feature integration of the SA-based OAN descriptors is performed to increase image retrieval performance. Integrating the CCD, ANG, NPD, CDR, APD and MTAR signatures produces retrieval rates of 98.33%, 100% and 96.90% on the MPEG-7 Part-A1, A2 and A3 datasets, respectively. Table 2 compares the OAN descriptor with existing shape descriptors reported in the literature for the MPEG-7 Part-A1 and A2 datasets [4,13].

MCD dataset
The MCD dataset was created by Zuliani et al. to model the perspective distortions that occur on real-world objects [7]. It consists of 560 images created as follows. Initially, 40 images are randomly selected from the MPEG-7 Part-B dataset and printed on white paper. These are then recorded by a digital camera from seven different points of view {central, left, right, top, bottom, top left, bottom right}. Thereafter, the contours of these images are extracted and their mirrored versions created to form the MCD dataset. Retrieval screenshots of the OAN descriptor on different shapes of the MCD dataset are shown in Figure 7. The retrieved images are shown along with their distances to the query image, and differently sized images are aligned to a uniform size for convenient display. In Figure 7, the query images, retrieval results, and the precision (retrieval count) over the top 10 results are shown in the first, second and third columns, respectively. For all query inputs, the precision count obtained is above 8 out of the top 10 retrievals. The retrieval accuracy of the OAN descriptor and other existing shape signatures on the MCD dataset is shown in Table 3, from which it is evident that the retrieval accuracy of the OAN descriptor is reasonably good, demonstrating its ability to perform affine invariant image retrieval.
In summary, the experiments on the MPEG-7 Part-A1, A2, A3 and MCD datasets illustrate that the OAN descriptors effectively support affine invariance. They also make clear that the CCD descriptor performs better than the other OAN features, and that integrating the different features extracted from the OAN contour significantly improves retrieval performance.

Shape similarity retrieval
The shape similarity retrieval experiment measures the image retrieval performance of the OAN descriptors on the MPEG-7 CE Shape-1 Part-B dataset, which contains non-rigid object shapes [5]. The performance measurement criterion for this dataset is the bulls eye retrieval (BER) rate. The BER count for an image is the number of correct matches among the top 40 retrieval results. The BER rate of the whole dataset is calculated by summing the BER counts of all 1400 images and dividing this value by the maximum possible retrieval count, i.e. 20 × 1400 = 28,000. According to Latecki et al., a 100% BER rate is not achievable due to inter-class similarity and intra-class variation [5]. The BER rate is calculated using the following Equation (9):

BER (%) = (Σ_{i=1}^{1400} SI_i / 28,000) × 100, (9)

where SI_i is the number of similar (same-class) images among the top 40 results for query image i.
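The BER computation can be sketched directly from this definition: for each query, count the same-class images among the top-40 retrievals, sum over all queries, and divide by the maximum possible count. The function and parameter names below are illustrative.

```python
def bulls_eye_rate(ranked_labels, query_labels, top=40, per_class=20):
    """Bulls-eye retrieval (BER) rate, cf. Eq. (9): for every query,
    count correct-class matches in its top `top` results; divide the
    grand total by the maximum possible count (per_class x #queries)."""
    hits = sum(
        sum(1 for lbl in ranking[:top] if lbl == q)
        for ranking, q in zip(ranked_labels, query_labels)
    )
    return 100.0 * hits / (per_class * len(query_labels))
```

For MPEG-7 Part-B this is called with 1400 rankings, top=40 and per_class=20, so the denominator is the 28,000 mentioned above.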
The BER rate comparison between the basic centroid distance (CD) function and the CCD descriptor on the MPEG-7 CE Shape-1 Part-B dataset was reported earlier [16]. Similarly, the image retrieval performance of the CCD feature extracted from TA- and SA-based OAN contours is compared in Figure 8. In the bar chart, the x-axis represents image class names and the y-axis represents the BER rate percentage; the green and yellow bars denote the retrieval performance of the CCD feature extracted from the TA- and SA-based OAN contours, respectively. The comparison graph shows that SA-based OAN produces a higher BER rate than TA-based OAN for most of the input images. As with the CCD descriptor, the image retrieval task is performed using the other OAN features. The BER rates of the individual TA- and SA-based OAN features are computed for all images in the MPEG-7 CE Shape-1 Part-B dataset, and the comparison results are shown in Table 4.
The experimental results in Figure 8 and Table 4 show that the proposed SA-based OAN descriptors {CCD, ANG, NPD, CDR, APD and MTAR} produce higher retrieval rates than the TA-based OAN approach. From this point onwards, all experiments are conducted on SA-based OAN features. Image retrieval screenshots of the individual OAN features on the 'teddy' shape are shown in Figures 9 and 10. As defined in Equation (9), the BER rate of each individual OAN feature is calculated by counting the similar images (SI) among the top 40 retrieved results. From the figures, the BER rates obtained for the CCD, ANG, NPD, CDR, APD and MTAR descriptors are 100%, 100%, 95%, 90%, 90% and 100%, respectively, for the query shape 'teddy'.
The BER rates given in Table 4 for the MPEG-7 CE Shape-1 Part-B dataset using the CCD and other individual OAN features indicate that they are not sufficient on their own for effective image retrieval. Hence, integration of the OAN features is performed, and the resulting BER rates are shown in Table 5. Initially, the individual OAN descriptors are integrated with the CCD descriptor and the BER rates calculated; then all OAN features are integrated, producing a BER rate of 84.52%.
Apart from the six OAN features, nine global features described in [3,4,8,11] are extracted from all 1400 shapes: rectangularity, eccentricity, circularity, solidity, perimeter ratio of convex hull to shape (CVX), equivalent diameter (ED), discontinuity angle irregularity (DAI), length irregularity (LI) and sharpness. The next task is to integrate these nine global features into the OAN descriptor set. Before integration, the retrieval performance of the nine global features is tested separately; their BER rates are shown in Table 6.
The BER rates of these nine features are negligible, so these global signatures alone are not sufficient for image retrieval applications. Therefore, such signatures are typically integrated with other shape signatures to increase retrieval performance [4,8,17,26]. In this work, the nine global features are integrated with the OAN descriptors, resulting in an improved BER rate; the results are presented in Table 7. The retrieval performance increases gradually as more features are added: after integrating all OAN features (OAN_ALL) with the nine global features, a considerable improvement in performance is achieved compared with the results shown in Tables 4-6.
To demonstrate the effectiveness of the proposed OAN descriptors, their BER rate is compared with state-of-the-art shape descriptors. The BER rate comparison between different shape descriptors in the literature and the proposed OAN is shown in Table 8. From this, it is evident that the OAN descriptor produces a good BER rate, i.e. 89.74%, compared with many well-known shape descriptors in the literature. Several shape descriptors [17,19,26] have claimed BER rates above 90%, but only by combining their approaches with more complex existing methods, which incurs additional complexity. Finally, the time complexity is analysed in two aspects: feature extraction time and similarity matching time. The feature extraction time complexity of the proposed OAN descriptors is linear, i.e. O(n), depending purely on the contour normalisation value N. In contrast, the time complexities of many existing approaches are higher: for example, TAR [4], MCC [9], BAS [13] and IDSC [37] are O(n^2), while AICD [29] is O(n^3). Moreover, the time complexity of the similarity matching algorithm 'HopDSW' used in this work is O(n log n) [15]. Hence, the proposed SA-based OAN shape descriptor set is more compact, faster and more suitable for affine invariant and shape similarity based image retrieval applications.

CONCLUSION AND FUTURE WORK
In the current work, we have presented a set of affine invariant shape descriptors derived from the OAN-normalised contour for image retrieval and shape matching applications. The OAN boundary normalisation uses the sector area approach, which effectively accounts for the curve information of the shapes. The OAN descriptors (CCD, ANG, NPD, CDR, APD and MTAR) are fast, compact and robust to affine transformations, and they deliver strong image retrieval performance. Among the six OAN descriptors, the CCD is the most influential. Moreover, nine global shape features were extracted and integrated with the OAN descriptors to increase image retrieval performance, and the 'HopDSW' algorithm was used for the similarity matching task. The performance of the OAN descriptors was validated on the MPEG-7 CE Shape-1 Part-A1, Part-A2, Part-A3, Part-B and MCD datasets; the proposed approach achieved a BER rate of about 89.74% on the MPEG-7 CE Shape-1 Part-B dataset. These results indicate that the OAN descriptors have the potential to support real-time image retrieval and shape matching applications that require affine invariance and high retrieval performance. Future work includes the automatic selection of the optimal normalisation value N and the right choice of starting point in contour tracing to further improve performance. It is also observed that the OAN descriptors work well for rigid and non-rigid shapes, but performance degrades on shapes with complex interiors; such shapes could be handled in future by extracting OAN descriptors in multi-scale representations and integrating them for better performance.