Finger vein recognition algorithm under reduced field of view

Algorithms for finger vein recognition are generally designed based on greyscale images containing vein distributions, but greyscale inhomogeneities and non-venous texture structures often adversely affect the recognition results. Moreover, the performance of such algorithms when the field of view is reduced has not been studied. The aim of this paper is therefore to propose an algorithm based on binary images, so as to minimize the interference of non-venous factors in the identification process. We use the features from accelerated segment test (FAST) algorithm to detect feature points, and use gradient histograms to describe these feature points in a vectorized manner. In addition, we propose the concept of a circular matching neighbourhood, and select matching feature point pairs in this area. The Euclidean distance and the number of correct matching pairs are then considered together to increase the recognition rate of the algorithm. The algorithm was tested on the SDU-MLA, FV-USM and MMCBNU-6000 databases. The results show that the algorithm performs well both before and after the field of view is reduced. Therefore, this paper not only provides a new idea for finger vein recognition, but also has practical application value for the miniaturization of finger vein acquisition devices.


INTRODUCTION
Traditional finger vein recognition algorithms mainly include methods based on minutiae and feature points [1], local patterns [2,3], and texture networks [4,5]. The first type of method makes the finger vein texture as prominent as possible through image pre-processing, then extracts the required feature points, and finally carries out matching and recognition based on these feature points. This type of method combines the advantages of the latter two approaches to a certain extent.
For example, the Cui Jing team of the National University of Defense Technology of China used an improved Harris algorithm to detect the intersections in vein texture, with non-maximum suppression for corner filtering, thereby improving the effectiveness of corner detection in vein texture [6]. The Yusuke Matsuda team of Hitachi, Japan, used Gabor filters to find feature points in the non-linear parts of the vein texture, and constructed a finger deformation model to reduce the impact of deformation and rotation during recognition [7]. The Meng Xianjing team of Shandong University of Finance and Economics used the grey unevenness of finger vein images to enhance vein detail and the scale-invariant feature transform (SIFT) to detect feature points; their published method achieved good results on the finger vein database of the Hong Kong Polytechnic University [8]. In recent years, convolutional neural network algorithms [9,10] have also been applied to finger vein recognition, obtaining acceptable recognition rates on different databases.
However, the above methods process the image into a greyscale image as the source image for detecting feature points. Although denoising and image enhancement algorithms are adopted to make the vein texture more prominent, some irregular shadows and non-venous features still appear in the image, which affects the subsequent feature point detection.
In addition, the above methods take the intersection points of the vein texture, or pixels with larger curvature values in a certain direction, as feature points; after screening, the feature points available for matching may be insufficient, thereby increasing the probability of mismatching.
More importantly, the performance of these algorithms when the field of view is reduced has not been studied. Finger vein recognition is less widely used than fingerprint recognition, one reason being that vein recognition needs a large observation field, which makes the vein acquisition device difficult to miniaturize [11]. Therefore, it is of practical significance to study finger vein recognition under a reduced field of view.
Aiming at these issues, we propose a finger vein recognition algorithm based on a binary vein image, and use threshold segmentation to minimize the interference of non-venous factors. We then use the features from accelerated segment test (FAST) algorithm [12,13] to detect pixels on the edge of the vein texture and describe them as high-dimensional vectors for subsequent matching. In the matching process, we build a circular neighbourhood around each feature point and judge the matching quality in this neighbourhood, so as to reduce the number of wrong matching pairs. Finally, the number of correct matching pairs and the average Euclidean distance are considered together to measure the similarity between images. We used the SDU-MLA [14], FV-USM [15] and MMCBNU-6000 [16] databases for performance testing. The results show that our proposed algorithm performs well both before and after the field of view is reduced. The algorithm is therefore helpful for the miniaturization of finger vein acquisition devices, and thus for expanding the application range of finger vein recognition.

ALGORITHM DESIGN

Image pre-processing
The principle of finger vein image acquisition is based on the difference in the degree of absorption of near-infrared light by the venous blood vessels relative to the bones and skin, thereby highlighting the finger veins in the image. However, in the actual acquisition process, the intensity of the light has a great influence on the image: if the light is too strong, a large bright spot appears in the image; if the light is too weak, the vein texture and background become confused. In addition, individual conditions differ; the thickness of the finger and the surface skin greatly influence the acquired image. Therefore, the original vein image requires pre-processing such as region of interest (ROI) extraction, denoising, enhancement, and segmentation.

Finger contour ROI extraction
The Sobel operator [17] is commonly used for edge detection in computer vision. Since the edges of the finger contour are horizontally distributed, we use the longitudinal convolution kernel of the Sobel operator to roughly extract the finger vein region. The edge map extracted by the Sobel operator contains some non-edge parts, which would interfere with the subsequent ROI extraction, so non-maximum suppression along the gradient direction is performed to delete non-edge points. An original finger vein image from the Shandong University public database (SDU-MLA) is shown in Figure 1(a), and its processing result in Figure 1(b). After the above processing, a clearer finger edge contour is obtained, but it still contains some non-finger edges, so a linear fit to the detected edge pixels is required. To eliminate the influence of noise as much as possible, the random sample consensus (RANSAC) algorithm [18] is used for the linear fitting, and the upper and lower contours of the finger are fitted separately. Based on these contour lines, ROI extraction is performed on the original image; the result is shown in Figure 1(c). The next steps are image enhancement and threshold segmentation.
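The RANSAC line-fitting step above can be sketched as follows. This is a minimal illustration, not the paper's implementation: the function name `ransac_line`, the iteration count, and the inlier tolerance (in pixels) are illustrative choices.

```python
import numpy as np

def ransac_line(points, n_iter=200, inlier_tol=2.0, seed=0):
    """Fit a line y = a*x + b to 2-D edge points with RANSAC.

    Repeatedly samples two points, hypothesises a line, counts inliers
    within inlier_tol, and finally refits on the largest consensus set.
    """
    rng = np.random.default_rng(seed)
    pts = np.asarray(points, dtype=float)
    best_inliers = np.zeros(len(pts), dtype=bool)
    for _ in range(n_iter):
        i, j = rng.choice(len(pts), size=2, replace=False)
        (x1, y1), (x2, y2) = pts[i], pts[j]
        if x1 == x2:                       # skip vertical sample pairs
            continue
        a = (y2 - y1) / (x2 - x1)
        b = y1 - a * x1
        # Perpendicular distance of every point to the candidate line
        dist = np.abs(a * pts[:, 0] - pts[:, 1] + b) / np.sqrt(a * a + 1)
        inliers = dist < inlier_tol
        if inliers.sum() > best_inliers.sum():
            best_inliers = inliers
    # Refit on the consensus set with least squares
    a, b = np.polyfit(pts[best_inliers, 0], pts[best_inliers, 1], 1)
    return a, b

# Synthetic edge pixels along y = 0.1*x + 5, plus two gross outliers
xs = np.arange(0, 100, 2, dtype=float)
pts = np.c_[xs, 0.1 * xs + 5]
pts = np.vstack([pts, [[10, 60], [80, 0]]])
a, b = ransac_line(pts)
```

Because the consensus step discards the two outliers, the refit recovers the underlying contour line despite the noise, which is exactly why RANSAC is preferred over a plain least-squares fit here.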

Image enhancement
It can be seen from the ROI image that the vein texture is relatively blurred and its contrast with the background is low. If binarization were performed directly, much vein information would be lost, or background would be mistakenly classified as vein structure, seriously affecting the subsequent matching results, so the image must first be enhanced. Image enhancement generally comprises spatial- and frequency-domain enhancement; we combine both to enhance the finger vein image.
In the spatial domain, we first use contrast limited adaptive histogram equalization (CLAHE) [19] to enhance the contrast of the vein texture. As shown in Figure 2(a), the finger vein texture is enhanced, but it is still not well separated from the background, so the image requires further enhancement.
In the frequency domain, we use a two-dimensional Gabor filter [20] to extract the frequency and texture orientation information of the image. The two-dimensional Gabor filter has five parameters: wavelength (λ), orientation (θ), phase offset (ψ), aspect ratio (γ), and the standard deviation of the Gaussian envelope (σ), which is determined by the bandwidth. Because the adjusted image size is 180 × 80 and the finger vein texture is densely distributed, the Gabor kernel should not be too large; in this paper, its size is set to 7 × 7. For such a small kernel, fewer parallel stripes should be kept, so we set the parameters γ and σ, which affect the number of stripes, to 0.5. For the 7 × 7 kernel, experimental results show that the wavelength and phase offset have little influence on the processing result, so we set ψ = 0 and λ = 2.
Due to the irregular distribution of the finger vein texture, the orientation angle (θ) should not be set to a fixed value. To sample all texture orientations, we construct Gabor filters in eight directions, from 0 to 7π/8 at intervals of π/8, and filter the image with each to obtain eight responses. The eight responses are then compared pixel by pixel, retaining the maximum response of the texture part. The enhancement result is shown in Figure 2(b).
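The eight-orientation Gabor enhancement can be sketched as below. The parameter values (7 × 7 kernel, λ = 2, γ = 0.5, σ = 0.5, ψ = 0, orientations 0 to 7π/8) follow the text; the toy image and the brute-force sliding-window filtering are illustrative simplifications, not the paper's code.

```python
import numpy as np

def gabor_kernel(ksize=7, sigma=0.5, theta=0.0, lambd=2.0, gamma=0.5, psi=0.0):
    """Real part of a 2-D Gabor kernel: Gaussian envelope times cosine."""
    half = ksize // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1].astype(float)
    xr = x * np.cos(theta) + y * np.sin(theta)      # rotate coordinates
    yr = -x * np.sin(theta) + y * np.cos(theta)
    return np.exp(-(xr**2 + (gamma * yr)**2) / (2 * sigma**2)) \
        * np.cos(2 * np.pi * xr / lambd + psi)

def gabor_max_response(img):
    """Filter with 8 orientations (0 to 7*pi/8) and keep the per-pixel maximum."""
    h, w = img.shape
    pad = np.pad(img.astype(float), 3, mode='edge')
    best = np.full((h, w), -np.inf)
    for k in range(8):
        kern = gabor_kernel(theta=k * np.pi / 8)
        resp = np.empty((h, w))
        for i in range(h):
            for j in range(w):
                resp[i, j] = np.sum(pad[i:i + 7, j:j + 7] * kern)
        best = np.maximum(best, resp)
    return best

img = np.zeros((20, 20))
img[:, 10] = 1.0            # a vertical "vein" line on dark background
out = gabor_max_response(img)
```

The vertically oriented filter responds most strongly along the line, so taking the per-pixel maximum over all eight responses keeps each texture segment at its best-matching orientation.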

Image binarization
After image enhancement, the texture features are highlighted more clearly. However, to extract the entire texture structure and eliminate the influence of background and noise, threshold segmentation must be performed. We use the NiBlack algorithm [21], a dynamic local thresholding method that effectively finds an appropriate threshold for each region. The binarized segmentation result is shown in Figure 3. This concludes the image pre-processing, yielding a binary image with prominent vein texture.
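NiBlack thresholding computes, for each pixel, a local threshold T = m + k·s from the mean m and standard deviation s of a surrounding window. A minimal sketch follows; the window size w and weight k are typical illustrative values, not the paper's settings.

```python
import numpy as np

def niblack_binarize(img, w=15, k=-0.2):
    """NiBlack local threshold: T = mean + k * std over a w x w window.

    Veins absorb near-infrared light and appear dark, so pixels darker
    than the local threshold are marked as vein (1), the rest as 0.
    """
    img = img.astype(float)
    pad = w // 2
    p = np.pad(img, pad, mode='reflect')
    h, wd = img.shape
    out = np.zeros((h, wd), dtype=np.uint8)
    for i in range(h):
        for j in range(wd):
            win = p[i:i + w, j:j + w]
            t = win.mean() + k * win.std()
            out[i, j] = 1 if img[i, j] < t else 0
    return out

img = np.full((30, 30), 200.0)
img[10:20, 10:20] = 50.0     # a dark "vein" patch on a bright background
bw = niblack_binarize(img)
```

Because the threshold adapts to each window, the method tolerates the slow illumination gradients that defeat a single global threshold on vein images.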

Feature point detection
The premise of a finger vein recognition method based on minutiae and feature points is to find suitable feature points. The FAST algorithm [12,13] judges whether a candidate point is a feature point by comparing the pixel values on a circle around it.
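The FAST segment test can be sketched as follows: a pixel is a corner if at least n contiguous pixels on the 16-pixel Bresenham circle of radius 3 around it are all brighter than the centre plus a threshold t, or all darker than the centre minus t. Here t and n are typical choices for illustration, not values taken from the paper.

```python
import numpy as np

# Offsets (dx, dy) of the 16-pixel Bresenham circle of radius 3 used by FAST
CIRCLE = [(0, -3), (1, -3), (2, -2), (3, -1), (3, 0), (3, 1), (2, 2), (1, 3),
          (0, 3), (-1, 3), (-2, 2), (-3, 1), (-3, 0), (-3, -1), (-2, -2), (-1, -3)]

def is_fast_corner(img, y, x, t=20, n=12):
    """FAST segment test at (y, x): True if >= n contiguous circle pixels
    are all brighter than centre + t or all darker than centre - t."""
    c = int(img[y, x])
    ring = [int(img[y + dy, x + dx]) for dx, dy in CIRCLE]
    for sign in (1, -1):                 # check brighter arc, then darker arc
        flags = [sign * (v - c) > t for v in ring]
        run = 0
        # Walk the ring twice so arcs that wrap past the start are counted
        for f in flags + flags:
            run = run + 1 if f else 0
            if run >= n:
                return True
    return False

img = np.zeros((16, 16), dtype=np.uint8)
img[6:9, 6:9] = 255                      # a small bright blob
blob = is_fast_corner(img, 7, 7)         # blob centre: all 16 ring pixels darker
flat = is_fast_corner(img, 3, 3)         # uniform background: no contiguous arc
```

In practice non-maximum suppression is applied afterwards so that only the strongest response in each neighbourhood is kept.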

Feature point description
The feature points detected by the FAST algorithm only contain location information and cannot be directly used for matching. They must be vectorized according to a certain rule. In this paper, we first calculate the gradient value and direction in the neighbourhood of each feature point [22], and then construct a gradient histogram to describe the feature point. In the feature point detection process, the judging range is a circle with a radius of seven pixels; to keep the neighbourhood size the same during detection and description, the gradient values and directions are still calculated within a neighbourhood of diameter seven. Let the coordinates of pixel P be (x, y). The gradient magnitude M(x, y) and direction θ(x, y) are calculated as in formulas (1) and (2):

M(x, y) = √{[L(x + 1, y) − L(x − 1, y)]² + [L(x, y + 1) − L(x, y − 1)]²}  (1)

θ(x, y) = arctan{[L(x, y + 1) − L(x, y − 1)] / [L(x + 1, y) − L(x − 1, y)]}  (2)

where L represents the Gaussian scale space, obtained by convolving the image with the Gaussian function G(x, y, σ). Here σ = 1.17, because the response of the Gaussian filter lies mainly within 6σ and the neighbourhood used in feature point description has a diameter of 7.
After the gradients are calculated, a histogram is used to accumulate the gradient directions and magnitudes of the pixels in the neighbourhood of the feature point. The horizontal axis of this direction histogram is the gradient direction angle, and the vertical axis is the accumulated gradient magnitude; the peak of the histogram represents the main direction of the feature point. To maintain rotation invariance of the descriptor, the coordinate axes are rotated about the feature point by the angle θ (the main direction of the feature point), that is, the coordinate axes are aligned with the main direction, as shown in Figure 5.
After rotation, we take a 16 × 16 window centred on the feature point and divide it into 16 small blocks of 4 × 4 pixels. We again construct a direction histogram, but with 45-degree bins, so that each small block carries gradient intensity information in eight directions. Each feature point therefore finally yields a 128-dimensional (4 × 4 × 8) feature vector.
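The 128-dimensional description step can be sketched as below: a 16 × 16 window around the point is split into 4 × 4 cells, and each cell accumulates gradient magnitude into eight 45-degree orientation bins. This is a simplified illustration of the scheme described above: rotation to the main orientation and Gaussian weighting are omitted for brevity, and the final normalisation is an assumption.

```python
import numpy as np

def describe_point(img, y, x):
    """Build a 128-D gradient-histogram descriptor for the point (y, x)."""
    img = img.astype(float)
    win = img[y - 8:y + 8, x - 8:x + 8]          # 16 x 16 window
    # Central-difference gradients inside the window
    gx = np.zeros_like(win)
    gy = np.zeros_like(win)
    gx[:, 1:-1] = win[:, 2:] - win[:, :-2]
    gy[1:-1, :] = win[2:, :] - win[:-2, :]
    mag = np.hypot(gx, gy)
    ang = np.degrees(np.arctan2(gy, gx)) % 360   # direction in [0, 360)
    desc = np.zeros((4, 4, 8))                   # 4x4 cells, 8 bins of 45 deg
    for i in range(16):
        for j in range(16):
            b = int(ang[i, j] // 45) % 8
            desc[i // 4, j // 4, b] += mag[i, j]
    desc = desc.ravel()                          # 4*4*8 = 128 dimensions
    n = np.linalg.norm(desc)
    return desc / n if n > 0 else desc           # normalise for matching

img = np.zeros((32, 32))
img[:, 16:] = 1.0                                # vertical step edge
v = describe_point(img, 16, 16)
```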

Matching algorithm design
In the traditional feature point matching process, the similarity between two feature points is generally measured by the Euclidean distance, and the similarity of two images by the number n of feature points successfully matched between them. Since similar feature points exist in the vein binary image, judging a match only by the minimum Euclidean distance between feature points produces a large number of mismatches. In this paper, we improve the matching method to reduce the number of mismatched pairs, and combine the average Euclidean distance with the number of successfully matched feature points to construct a new matching distance for similarity evaluation. Let d_ij represent the Euclidean distance between the ith feature point in the test image and the jth feature point in the database image, as in formula (3):

d_ij = ‖v_i − v_j‖  (3)

where v_i and v_j are the descriptor vectors, i ∈ [1, N₁], j ∈ [1, N₂], and N₁ and N₂ are the numbers of feature points in the test and database images, respectively. For the feature point i(x₁, y₁) in the test image, a circular neighbourhood O_i centred at the same coordinate (x₁, y₁) with radius r is constructed in the database image. If there are m feature points inside this neighbourhood, we calculate the Euclidean distances between feature point i and these m feature points; the minimum, d_ip, is the local optimal match. We then calculate the Euclidean distances to all (N₂ − m) feature points outside the neighbourhood O_i; the minimum, d_iq, is the global optimal match. If d_ip ≤ d_iq, feature point i in the test image and feature point p in the database image are considered a correct matching pair. If d_ip > d_iq, feature point i in the test image is considered to have no corresponding matching point in the database image.
Since the ROI of each finger vein image has been accurately extracted during pre-processing, for finger vein images of the same class (images of the same finger), the coordinates of a matched feature point pair should lie in a similar area even if a certain degree of translation or rotation occurs during acquisition. For finger vein images of different classes (images of different fingers), the local optimal match is usually not the global optimal match, so the circular-neighbourhood matching method minimizes the number n of matching pairs between images of different classes. Moreover, even if some local optimal match between images of different classes happens to be the global optimal match, the corresponding average Euclidean distance will be greater than that between images of the same class. Therefore, the average Euclidean distance of all correct matching pairs between the test image and the training image is included in the similarity evaluation. The matching distance between two images is defined as formula (4):

D = (1 − n/N) × (1/n) Σᵢ₌₁ⁿ d(i)  (4)

where n is the number of correct matching pairs, N is the total number of feature points in the test image, and d(i) is the Euclidean distance of the ith correct matching pair. The first factor, (1 − n/N), weights the average Euclidean distance to describe the matching degree between the two images: the higher the degree of matching (i.e. the greater the number of correct matching pairs), the smaller the weight, and hence the smaller the final matching distance. After the test image has been compared against all training images in the database, the training image with the minimum matching distance is taken as the matching result.
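The circular-neighbourhood matching and the combined matching distance described above can be sketched as follows. The toy data, the function name `match_distance`, and the 2-D descriptors (standing in for the 128-D vectors, for brevity) are all hypothetical illustrations.

```python
import numpy as np

def match_distance(test_pts, test_desc, db_pts, db_desc, r=24):
    """Circular-neighbourhood matching with the combined distance
    D = (1 - n/N) * (1/n) * sum(d_i).

    test_pts/db_pts are (K, 2) arrays of feature-point coordinates,
    test_desc/db_desc the corresponding descriptor matrices.
    Returns np.inf when no correct matching pair is found.
    """
    N = len(test_pts)
    dists = []
    for i in range(N):
        # Descriptor distances from test point i to every database point
        d = np.linalg.norm(db_desc - test_desc[i], axis=1)
        # Split database points into the circular neighbourhood and the rest
        near = np.linalg.norm(db_pts - test_pts[i], axis=1) <= r
        if not near.any() or near.all():
            continue
        d_ip = d[near].min()       # local optimum inside the neighbourhood
        d_iq = d[~near].min()      # global optimum outside it
        if d_ip <= d_iq:           # accept as a correct matching pair
            dists.append(d_ip)
    n = len(dists)
    if n == 0:
        return np.inf
    return (1 - n / N) * (sum(dists) / n)

# Hypothetical toy data: three test points, three database points
test_pts = np.array([[10.0, 10.0], [50.0, 40.0], [200.0, 5.0]])
test_desc = np.array([[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]])
db_pts = np.array([[12.0, 11.0], [52.0, 39.0], [100.0, 70.0]])
db_desc = np.array([[0.9, 0.1], [0.1, 0.9], [0.5, 0.5]])
D = match_distance(test_pts, test_desc, db_pts, db_desc)
```

Here the first two test points find a local optimum that beats the global optimum and are accepted; the third has no database point within r = 24 and contributes nothing, so n = 2, N = 3 and D = (1 − 2/3) times the average accepted distance.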

EXPERIMENTS AND RESULTS
We first evaluated our proposed method on the SDU-MLA database, which contains finger vein information of 106 people; the index, middle, and ring fingers of both hands were collected for each person, with six images per finger. The database therefore contains 636 fingers and 3816 images. We used the method described in Section 2 to process all images into binarized images, and normalized the image size to 180 × 80 pixels for testing.

Selection of neighbourhood radius r
In the matching step, constructing a circular neighbourhood with a feature point as the centre and r as the radius reduces the number of mismatching pairs as much as possible. The size of r determines the search range of the local optimal match for each feature point in the test image. A smaller r means a lower tolerance for correct matches, leaving too few candidate feature points in the training image for matching: if the intra-class training image corresponding to the test image has a certain degree of translation, the matching algorithm cannot find enough correct matching pairs, affecting the final result. Although a larger r increases the allowable translation, it includes too many candidate feature points, causing some wrong candidates to meet the neighbourhood requirement, which also affects the matching results. We select an appropriate r by evaluating the effect of different values of r on the final correct recognition rate. For the 636 fingers in the SDU-MLA dataset, the third image of each category was selected as a test sample and the other five images as training samples, giving 636 images in the test set and 3180 images in the training set. We set r to 5% of the image height as both the starting radius and the incremental step size, i.e. r = 4, 8, 12, …, 40. Figure 6 shows the rank-1 recognition rates corresponding to the different values of r. When r = 24 the recognition rate reaches its peak, and it decreases to some extent when r is greater or smaller. Therefore, the following matching experiments are based on r = 24.

Performance test of identification mode and verification mode
In the identification mode, we assume that the source of a finger vein image is unknown, that is, we do not know to whom the finger vein image belongs, and the identity must be determined by the recognition algorithm. We randomly selected one image from each finger's group as the test image and used the remaining five images as training images; correspondingly, there are 636 images in the test set and 3180 images in the training set. The test was repeated 10 times, with the recognition rates shown in Table 1. The average rank-1 recognition rate and the average lowest rank at which 100% recognition is reached are shown in Table 2.
The cumulative matching curve (CMC) [23] is shown in Figure 7. The experimental results in the identification mode confirm the reliability of the proposed method.
In the verification mode, to save computing time, we selected 400 different fingers from the database for testing. Each image was subjected to inter-class matching with the six images of each of the other 399 fingers, hence 5,745,600 (400 × 6 × 399 × 6) inter-class matching distances were obtained. At the same time, intra-class matching distances were calculated between images of the same finger. The intra- and inter-class matching distances were normalized to between 0 and 1 before the calculations. Figure 8 shows the receiver operating characteristic (ROC) curve of our method, with an equal error rate (EER) of approximately 0.82%. Figure 9 shows the distribution of intra- and inter-class matching distances for 1000 samples randomly selected from the intra- and inter-class matching results. It can be seen from Figures 8 and 9 that the image processing and matching algorithm of our proposed method is able to distinguish intra- and inter-class images.

Comparison of results in reduced view area
Reducing the view area reduces the amount of finger vein information acquired and the number of detected feature points, which affects the recognition rate and EER. However, a reduced view area means that the physical distance between the lens and the finger in the acquisition device can be reduced, thereby shrinking the device and facilitating its application in more fields. It is therefore of practical significance to study the influence of view area size on the recognition rate and EER.
We crop the image to simulate a reduced field of view. Specifically, we subtract 5% of the original image's height and width at each step, down to a 40% reduction. To ensure consistent image sizes across all experiments, the images were normalized to 180 × 80 after cropping (as shown in Figure 10). As shown in Figure 11, the feature points before and after the reduction of the field of view remain evenly distributed along the edges of the vein network, but their number is significantly reduced (see Table 3). The leave-one-out method [24] eliminates the influence of how the test and training sets are divided on the recognition result; thus, to ensure the consistency of the datasets in the comparison tests, the training and test sets were divided by this method. The recognition rates after the field of view is reduced to different sizes are shown in Figure 12, and the EER comparison results in the verification mode in Figure 13. The results show that when the field of view area is reduced by 36% (a 20% reduction in both length and width), the recognition performance is almost unaffected, which proves the effectiveness of our proposed method under a reduced field of view.
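The crop-and-normalize simulation of a reduced field of view can be sketched as follows. The centred crop and the nearest-neighbour resizing are assumptions for illustration; the paper only states that images are cropped and then normalized back to 180 × 80.

```python
import numpy as np

def simulate_reduced_fov(img, shrink=0.20, out_h=80, out_w=180):
    """Crop shrink*height and shrink*width from the image (centred) to
    simulate a reduced field of view, then resize back to out_h x out_w
    by nearest-neighbour sampling so all experiments share one size."""
    h, w = img.shape
    dh, dw = int(h * shrink / 2), int(w * shrink / 2)
    crop = img[dh:h - dh, dw:w - dw]
    ch, cw = crop.shape
    # Nearest-neighbour index maps back to the target size
    rows = np.arange(out_h) * ch // out_h
    cols = np.arange(out_w) * cw // out_w
    return crop[np.ix_(rows, cols)]

# A 80 x 180 test image with a unique value per pixel
img = np.arange(80 * 180).reshape(80, 180).astype(float)
small = simulate_reduced_fov(img, shrink=0.20)
```

A 20% reduction in both length and width removes 1 − 0.8 × 0.8 = 36% of the view area, which is the reduction at which the text reports almost unchanged recognition performance.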

Comparison with similar methods
The effectiveness of our proposed method in the identification and verification modes was demonstrated in Section 3.2, and Section 3.3 showed that it performs well even when the field of view is reduced to a certain extent. We now compare the performance of our method with similar methods, specifically the FAST+BRIEF-based method [25] and the traditional SIFT-based method [26], on the same dataset. Table 4 compares the recognition rates of the different methods when the field of view is reduced by different proportions; the test and training sets were again divided by the leave-one-out method. The data in Table 4 show that our method achieves the highest recognition rate at each reduction ratio. The corresponding EER comparison is given in Table 5. The data show that when the length and width are reduced by 20%, the EER does not change significantly (0.82% to 0.89%), and when they are reduced by 40%, the EER still stays below 2% (1.89%). Figure 14 shows the ROC curves for the different methods and fields of view. Our proposed method achieves the best results in terms of both recognition rate and EER, and thus has significant advantages over the other two methods.

Testing on other databases
To verify the robustness of the algorithm, similarly to the previous section, we calculated the recognition rate and EER of our algorithm on the FV-USM [15] and MMCBNU-6000 [16] databases. The recognition rates under different size reductions are shown in Table 6. At the original size, our algorithm also achieved good recognition rates on the two databases, reaching 0.991 and 0.971, respectively. When the length and width of the image were reduced by 20% (the field of view reduced by 36%), the recognition rate decreased only slightly (to 0.977 and 0.942, respectively). The ROC curves under different sizes for these two databases are shown in Figures 15 and 16. The curves show that when the length and width are reduced by 20%, the EER can still be maintained at a relatively good level. However, when the length and width are reduced by 40% (the field of view reduced by 64%), the recognition rate decreases obviously. This is because the lighting conditions of the vein acquisition devices and the finger positions during shooting differ between databases, and when the image is cropped too much, some key vein information is lost. Overall, the proposed algorithm still obtains good results on other databases when the field of view is reduced by 36%, which means the algorithm has a certain robustness.

SUMMARY
Traditional algorithms for finger vein recognition are generally designed based on greyscale images containing vein distributions, but greyscale inhomogeneities and non-venous texture structures often adversely affect the recognition results. We therefore propose a finger vein recognition algorithm based on a binary image containing the vein distribution. We use the FAST algorithm with non-maximum suppression to detect feature points and describe them in a vectorized manner. Furthermore, we perform feature point matching within a circular neighbourhood and examine the matching quality there. By combining the average Euclidean distance and the number of correct matching pairs between two images, a new matching distance is calculated to measure their similarity and obtain the matching result. We tested the algorithm on the SDU-MLA, FV-USM and MMCBNU-6000 databases; the results show that it achieves a good recognition rate and EER even when the field of view is reduced by 36%. Therefore, this paper not only provides a new idea for finger vein recognition based on binary images, but also has practical significance and application value for the miniaturization of finger vein acquisition devices and the expansion of their applications.