Epipolar geometric association distance: A new cost function for passive sensor data association

The traditional way of passive sensor data association relies on the estimation of candidate target position, and hence suffers from severe performance degradation due to the estimation error and the so ‐ called ghosts. The proposed new cost function is based on epipolar geometry theory that uses fundamental matrix to measure the correlation be-tween sensor measurements directly without estimation of candidate target position. This approach has two superiorities: (1) it declines the accuracy loss caused by a large number of trigonometric operations, and (2) it greatly immunes to the impact of ghosts. A comparison with three traditional cost functions via numerical experiments demonstrates that the proposed cost function significantly improves the accuracy of data association, especially when the number of targets is more and the sensor distribution is closer.


| INTRODUCTION
Passive sensor system is widely used in air traffic control, space surveillance and many other fields [1]. Compared to active sensor, it can provide only line-of-sight (LOS) measurements (azimuth and elevation) of each target. A single stationary passive sensor cannot fully localize a target; to do this, it needs at least two measurements from different sensors or from the same moving sensor at different locations. Therefore, in multisensor and multi-target scenarios, data association arises as a central problem, that is, to determine the measurement sets for each target or classify the measurements as many sets corresponding to the likely existing targets before the targets have been declared. There are two major challenges about it. A reasonable cost function is hard to be formulated due to the effect of ghost which may result in an incorrect association. The other challenge is to find an efficient algorithm to obtain the optimal solution, which result in a multi-dimensional (S-D) assignment problem that is proven NP hard when dimension ≥3 (where dimension represents the number of sensors) [2,3].
Many researchers have investigated these problems and proposed some solutions. Pattipati et al. [4] formulated the cost function as a joint-likelihood of the candidate association, and proposed Lagrangian relaxation algorithm to solve the S-D (S = 3) assignment problem. The cost function is computed involving the estimation of target position. For each candidate association, a corresponding target position is estimated using the least squares method. Then, the measurements of each sensor are regenerated by the estimated target position. The likelihood between the newly estimated measurements and the original measurements are taken as the cost function, and it will be discussed in detail in Section 2. Lagrangian relaxation algorithm solved the 3-D assignment problem as a series of 2-D assignment problems, by relaxing the constraint and using Lagrange multipliers. Then the Auction algorithm [5,6] is used to solve these 2-D assignment problems. Compared with other approaches, this framework performs better in accuracy and stability, and some researchers have investigated it deeply and tried to improve it. Yeddanapudi et al. [7] extended the algorithm to multi-dimensional case and proposed S-D assignment algorithm. Popp et al. [8] proposed m-best S-D assignment algorithm to find out the most probabilistic association relationship in the first m optimal solutions, and enhance the robustness of the algorithm. Ouyang et al. [9,10] pointed out that the cost function which Pattipati et al. proposed does not take the effect of the target position estimation error into account. So, the cost function was modified by integrating the least squares estimation errors. Liu Hang et al. [11] indicated that the smaller the distance between LOS, the probability that the measurement comes from the same object is higher. Cao Xiaowen et al. [12] modified the cost function on the assumption that some targets are real ones, which contributes to the fast convergence of Lagrangian relaxation. Bishop et al. [13] proposed the maximum number of sensors needed to eliminate ghost points is Pþ1, where P is the number of targets. Chummun et al. [14] proposed a fast data association technique based on clustering. Klein et al. [15][16][17] derived a method that can associate the passive sensor data when the measurements from multiple sensors are asynchronous and the positions of one or more passive sensors are unknown.
As Pattipati pointed out, ghosts are the major problem which affects the accuracy of data association. In ideal scenario, the LOSs of true measurement-target associations will intersect at the target position. However, due to the estimation error (caused by limited sensor resolution and measurement error), the corresponding LOSs may not intersect, and other LOSs may be very close or even intersect, resulting in a ghost. In this situation, the cost between genuine target (correct association) and ghost may very small. Moreover, there exist some ghosts whose costs are much lower than those of genuine targets [4]. And the more the target number, the more the ghost number. Over and above this, when the sensor distribution is closer, ghosts are also more likely to be created. Thus the optimal solution of the cost function may involve ghosts with very low costs, which results in an incorrect association. The method of Ouyang et al. [9,10] and Liu Hang et al. [11] improved the performance of cost function in some scenarios, but they are not essentially different from the method of Pattipati. They all need to use LOS to estimate the target position during the cost calculation. Uncorrelated LOSs may still intersect, so the problem of ghosts still exists.
This paper proposed a new cost function based on epipolar geometry theory that measures the data association with the fundamental matrix. There is an important property that if x and x' are the measurements of two sensors originated by the same target, respectively, and F is the fundamental matrix between the two sensors, then x' T Fx = 0. This motivated us to formulate the proposed cost function, it neither need the estimation of the target position, nor the calculation of LOSs, which can just calculate the cost directly with the measurements. It is shown in this paper that when there are three sensors, the proposed cost function is greatly immune to the effect of ghosts. The remainder of this paper is organized as follows. Section 2 briefly introduces the problem formulation proposed by Pattipati et al. [4] and its modification by Ouyang et al. [10] and Liu Hang et al. [11]. Section 3 introduces the method proposed in this paper. In Section 4, a comparison of our approach to other three is introduced in detail. Section 5 is conclusions.

| Problem formulation
The problem formulation follows the work of Pattipati et al. [4]. A typical scenario is shown in Figure 1. Assume there are N targets in the surveillance region, which are described by their Cartesian coordinates T j = (x j ,y j ,z j ). The number of passive sensors in the network is supposed to be 3. Sensor s with position S s = (x s ,y s ,z s ) can measure azimuth β sj and elevation α sj of target j. Let θ sj = [β sj ,α sj ], and with false alarms considered, the noisy measurement z si can be denoted by The measurement noise between different sensors is assumed independent of each other. And v s is the measurement noise of sensor s, and is statistical subject to a normal distribution N (0,σ s 2 ). The probability density function of the spurious measurement is assumed to be uniform, and is given by p ws = 1/Φ s , where Φ s represents the field-of-view of sensor s. The objective of the association algorithm is to determine the correlation relationship between the measurements belonging to different sensors. A candidate correlation relationship can be represented by a 3-tuple Ζ i 1 i 2 i 3 = {Ζ si s } 3 s=1 , which means the i 1 th measurement of sensor 1, the i 2 th measurement of sensor 2 and the i 3 th measurement of sensor 3 corresponding to the same target. For each 3-tuple, an association cost is assigned, which can be denoted by where θ̂s j is the measurement generated by the estimated candidate target position corresponding to the 3-tuple Ζ i1i2i3 , and this will be introduced in detail in Section 2.2. p Ds represents the detection probability of sensor s. When miss detection is considered, the dummy measurement z s0 is added, representing the possible missed target. u (i s ) is an indicator function that if i s = 0, u (i s ) = 1; otherwise u(i s ) = 0. It is assumed that a genuine target should be detected by at least two sensors; otherwise, it is judged as a spurious target. So the 3-tuples like Ζ i100 ,Ζ 0i20 and Ζ 00i3 are all judged as spurious targets, and the corresponding cost is defined by c i100 = c 0i20 = c 00i3 = 0.
The association result is the combination of N 3-tuples which cost minimal and satisfy the following constraints: 1. Each 3-tuple represents a genuine target or spurious target. 2. Each target can only originate one 3-tuple, that is, Then the data association problem of the three sensors is converted to a 3-D assignment problem: Note that, for the S-D assignment problem, if S ≥ 3, the problem NP hard, and there is no known polynomial time algorithm exists [2,3]. Pattipati et al. [4] proposed to use Lagrangian relaxation algorithm to obtain the sub-optimal feasible solution, and this method is proved effective by simulations. Because this paper mainly focuses on the modification of cost function, more details of Lagrangian relaxation algorithm can refer to [4].
Note that, in detection and tracking system, β sj and α sj cannot be obtained directly; they are calculated as follows. As shown in Figure 2, the result of detection algorithm is the 2-D coordinates (u, v) in image plane coordinate system. And its elevation and azimuth (ϑ, φ) can be obtained easily by basic pinhole model [13]. However (ϑ, φ) are not (β, α), the former is in the sensor coordinate system (as shown in Figure 2), and the latter is in the world coordinate system (as shown in Figure 1). Obvious (β, α) is determined by the direction of LOS. As schematized in Figure 2, the direction of LOS can be obtained by rotating the principal axis (z-axis) of sensor, whose direction vector in world coordinate is supposed known. First, rotate the principal axis φ around the z-axis, and then rotate it ϑ around the ϑ-axis. The rotation matrix R(r,ϕ) is as Equation (10), where r = (r x , r y , r z ) is the rotation axis, ϕ is the rotation angle.
cos ϕ þ r x 2 ð1 À cos ϕÞ r x r y ð1 À cos ϕÞ À r z sin ϕ r x r z ð1 À cos ϕÞ þ r y sin ϕ r y r x ð1 À cos ϕÞ þ r z sin ϕ cos ϕ þ r y 2 ð1 À cos ϕÞ r y r z ða À cos ϕÞ À r x sin ϕ r z r x ð1 À cos ϕÞ À r y sin ϕ r z r y ð1 À cos ϕÞ þ r x sin ϕ cos ϕ þ r z 2 ð1 À cos ϕÞ -103 Therefore, the relationship between (u,v) and (β,α) can be summarized as follows: α ¼ a tan 0 B @ d z ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffiffi À d y where ResW and ResH represent the width and height of the resolution of image plane whose unit is pixel. F is the focal length of the sensor, and w, h are the real size of the image plane whose unit is the same as f. Vector d presents the direction of the measurement in the world coordinates, and vector d x¡axis , d y¡axis and d z¡axis correspond to the direction of x-axis, y-axis and z-axis of the sensor. In this subsection, the traditional way of calculating cost function is introduced in detail. It can be seen that, there are a large number of trigonometric operations involved. Because of the precision of the computer, there are truncation errors in trigonometric operations. The more trigonometric operations involved the more truncation errors added. Thus, the estimated target position is not so accurate, and the genuine targets cost higher than they should be. This makes it harder to distinguish genuine target from ghost, and affects the association accuracy. In following two subsections, we briefly introduce the modification of cost function by other researchers.

| Cost function with estimation error considered
Ouyang et al. [9,10] pointed out that the candidate target position is estimated using the least square method in Pattipati proposed cost function, but the cost function does not reflect the effect of the estimation error. So they take the target estimation error σ 2 i 1 i 2 i 3 as the scale factor to modify the cost function. So Equation (2) can be modified as As can be seen from the comparison between Equations (2) and (16), their difference is only the scale factor σ 2 i 1 i 2 i 3 . The greater the target position estimation error, the less likely it is to be a genuine target; so the cost of this candidate association is increased. On the contrary, the smaller the σ 2 i 1 i 2 i 3 is, the more likely it is to be a genuine target, so the cost of this candidate association is decreased.

| Cost function based on LOS distance
Liu Hang et al. [11] showed that the distance between LOSs can reflect association probability between measurements. In ideal scenario with no measurement error, the correctly associated LOS intersects at the target position, and the distance between LOSs is 0. Because of the limited resolution and measurement error, the distance between LOSs is always bigger than 0. Therefore, the smaller the distance between LOSs, the higher the association probability between measurements. If LOS sis represents the line determined by the position of sensor s and the direction of z sis , the cost function can be modified as where d i1i2 represents the distance between the line LOS 1i1 and LOS 2i2 , as shown in Figure 3.

| Epipolar geometry and the fundamental matrix
Epipolar geometry reflects the intrinsic projective geometry between two sensors. It only depends on the internal parameters and the relative pose of the sensors, and is independent of targets structure. The fundamental matrix is a 3�3 matrix of rank 2, which encapsulates the intrinsic data association relationship between sensors. According to the theory of projective geometry [18], the parameter model of the sensor's projection matrix (or camera matrix) can be denoted as P=KR K is called the camera calibration matrix, and the camera can be an infrared sensor or other optical passive sensor. The parameters contained in K are called the internal camera parameters, which are determined by focal length, the coordinates of the principal point in the image plane etc. R is a 3�3 rotation matrix representing the orientation of the camera coordinate. I represents the identity matrix. C represents the coordinates of the camera centre in the world coordinate. Xu et al. [19] introduced that the fundamental matrix between two sensors can be denoted in term of the two sensor's projection matrices, that is, F = [e 0 ] � P 0 P þ , where e 0 represents epipole. It is generated by the centre point of the first camera (whose camera matrix is P) projected to the image plane of the second camera (whose camera matrix is P ' ). P þ is the pseudo-inverse of P, that is, PP ' = I. And, [ e' ] � represents skew-symmetric matrix of e 0 . It is defined as follow, assume e 0 = (e 1 0 , e 2 0 , e 3 0 ) T is a 3-vector, then its corresponding skew-symmetric matrix is ½e 0 � � ¼ The most important property of the fundamental matrix is as follow. If a 3-D point Χ is imaged as x in the first view, and x 0 in the second view (i.e., PX = x, P 0 X = x 0 ), then the image points satisfy the relation Figure 4 shows the relationship of these points. Line LOS s 1 x is defined by s 1 and x, and line LOS s 2 x0 is defined by s 2 and x 0 . If x and x 0 are associated, then LOS s 1 x and LOS s 2 x0 are coplanar. This is a necessary condition for correlated data. The fundamental matrix can measure the degree of the coplanar relationship, so we are motivated to modify it as a cost function.

| Cost function based on epipolar geometry association distance
Assume the homogeneous coordinate of z s 1 i 1 , z s 2 i 2 is 1Þ, as shown in Figure 5. where F 12 is the fundamental matrix between sensor 1 and sensor 2, and |⋅| represents the absolute value. Equation (21) can be understood as a kind of distance that reflects the degree of association between two measurements. We define the distance as the epipolar geometry association distance (EGAD), whose more general definition will be given later. Ideally, if two measurements are associated, their EGAD is 0. Conversely, the weaker the correlation between the two measurements, the bigger their EGAD is. When there are three sensors, one target could originate three measurements (one measurement for each sensor). Ideally, sum up the EGAD of any two of them equal 0. For example, assume x s 3 i 3 also corresponding to the same target as x s 1 i 1 and x s 2 i 2 , the following equation holds: Motivated by Equations (21) and (22), the proposed cost function based on EGAD is formulated as follows: x s k i k ¼ ðu s k i k ; v s k i k ; 1Þ represents the homogeneous coordinate of the i k th measurement of sensor s k in the image plane. The relationship between (u,v) and z=(β,α) is summarized in Equations (11)- (15). P j and P k present the camera matrix of sensor j and k, respectively. The computation of F jk has been introduced in Section 3.1. And the data association problem can be reformulated by submit Equation (23) into Equation (3), which is denoted by subject to equation (4). The advantage of this cost function is that it gives a way of characterizing the association between measurements without using LOS to estimate the target position. This is very important for image-based sensor. In the image-based target detection and tracking system, the detection result is the 2-D coordinates of the target in the image plane. To calculate the target LOS, one need to translate the 2-D coordinates to the sensor coordinate and then to the world coordinate as discussed in Section 2.2. This process involves a lot of trigonometric calculations. Truncation errors will be introduced in the process of triangulation, and a large number of trigonometric calculations will accumulate these errors. The accumulated errors affect the accuracy of target position estimation, thus affect the association accuracy. However, using EGAD as the cost function, the result of target detection can be directly used as the input of the cost function, which avoids the estimation of the target position, avoids the loss of accuracy and improves the association accuracy.

| Ghost elimination
As discussed above, the coplanar measurements are associated. However, when the coplanar measurements are more than one pair, association results are ambiguous, and there will be socalled ghosts. In this section, we explain why the cost function proposed in this paper is immune to the effect of ghost. Figure 6 illustrates the typical scenario of ghost where the red dot represents the measurements of targets. The plane determined by the sensor centre and measurements is illustrated by light green trapezium. The essence of EGAD is the degree of coplanar. As can be seen from Figure 6, the genuine targets and ghosts can originate the same measurements. And the four measurements are all coplanar, the EGAD of them all equal 0. Therefore, this results in an ambiguity. In this case, it is necessary to use another sensor to resolve the ambiguous problem. Figure 7 shows the scenario with three sensors, where x i 0 represents the measurement of the target, the upper corner represents the index of the sensor, and the lower corner represents the index of the measurements. As can be seen in Figure 7, x 1 、x 2 、x 1 0 、x 2 0 are coplanar on π 12 , which is defined by s 1 , s 2 and x 1 . As discussed above, |x 1 However, which one of x 1 0 and x 2 0 is associated with x 1 cannot be determined (so does x 2 ). With the addition of sensor 3, the association can be determined by plane π 23 (defined by s 2 , s 3 and In the same way, s 2 , s 3 and x 2 0 can define another plane, which is coplanar with x 2 ″. So, Similarly, the association determined by sensor 1 and sensor 3 is � x 00 Combining Equations (27)-(30), associations determined are In Equation (31), x 1 maps back to itself through three associations, thus there is no ambiguity. That is to say, the ambiguous problem in Equation (27) is resolved. The authors call this circular association, which is defined as follows.
Definition: If a measurement can map back to itself after multiple maps, this is a circular association.
The case in Figure 7 shows that, the measurements of three non collinear sensors could establish circular associations. So, the proposed cost function is greatly immune to the effect of ghosts.

| Simulation
The main work of this paper is improving the cost function; so, the purpose of the experiment is to evaluate the performance of different cost functions in association accuracy. Therefore, we use different cost functions in the same algorithm framework to analyse association accuracy in terms of the increasing number of targets and the closing distribution of sensors. The algorithm framework is shown in Figure 8.
Step 1. Measurement lists are obtained by detection algorithm.
Step 2. Calculate the cost function.
Step 3. Convert minimizing the cost function to a 3-D assignment problem.
Step 4. Use Lagrangian relaxation algorithm to transform the 3-D assignment problem to a series of 2-D assignment problems.
Step 5. Use auction algorithm to solve these 2-D assignment problems.
Because the experiment mainly compares the performance of the cost function, we adopted an idealized situation in which  [9,10], Liu Hang et al. [11] and the cost function proposed in this paper. Lagrangian relaxation algorithm and Auction algorithm are based on the methods proposed by Pattipati et al. [4] and Adrian et al. [5,6].
The input of the algorithm is randomly generated by the following approach. As shown in Figure 9, the positions of the three sensors are determined by (r,θ,φ). In which r is generated randomly in range [260,290]. To test the effect of the sensor distribution on the association accuracy, θ and φ are randomly generated in range [0,π] [π/6,5π/6] [π/3,2π/3] [5π/12,7π/12] respectively, that is the sensor distribution is getting closer and closer. Assume the surveillance region is the origin of the world coordinate system, so the sensor's principal axis points to the origin.
The target position is randomly generated in a sphere with a radius of 20 and the centre of the sphere is the origin of the world coordinate system. In order to compare the performance of the cost function in term of target number, the number of targets increases from 3 to 50.
The input image of the algorithm is generated by OpenGL, with a resolution of 800 £ 600, FOV (field of view) = 45, NEAR (this parameter can be understood as the focal length) = 0.1. The target detection algorithm use 'regionprops' function in Matlab R2016a. The detection results are the 2-D coordinates of the target in the input image. Due to the resolution of the image, there is an error (noise) between the detection result and the genuine position. By statistics, the mean of error is almost 0 and the variance is 0.025.
Furthermore, in order to compare the complexity between different algorithms, the average time of computation process is counted. The experiment is simulated on a computer with Intel Core i5 CPU (1.8 GHz), 8G RAM, and algorithm is programmed by Matlab R2016a.

| Association Accuracy
100 Monte Carlo runs were made for each case. The simulation results are shown in Table 1, where Method Ⅰ-Ⅳ represent the association algorithm employ the cost function proposed by Pattipati et al. [4], Ouyang et al. [9,10], Liu Hang et al. [11] and the cost function proposed in this paper, respectively.

| Time consumption
The running time of passive data association algorithm is mainly consumed in the construction of the cost matrix. When the number of sensors is 3, the cost matrix is 3-D, including N 3 elements. Therefore, the cost function needs to be run N 3 times in one data association process. The average time consumption of the four discussed cost functions, averaged over 100 runs, is presented in Table 2. CF Ⅰ-Ⅳ represent the cost function proposed by Pattipati et al. [4], Ouyang et al. [9,10], Liu Hang et al. [11] and the cost function proposed in this paper, respectively.

| Discussion
As can be seen from Table 1, Methods Ⅰ-Ⅲ are very sensitive to target number and sensor distribution. For each table, the spatial distribution of the sensors is in the same range. With the increase of the targets number from 3 to 50, the association accuracy shows a downward trend. As the sensors are distributed closer and closer from (a) to (d), the association accuracy also shows a downward trend. This is due to the fact that as the number of targets increases, the number of LOS is also more and more. The uncorrelated LOS may very close, resulting in ghosts. With the spatial distribution of the sensor getting closer, the possibility of ghost generation increasing. And there exist some ghosts whose costs are much lower than those of genuine targets, which make them much harder to be distinguished by the cost function. Even if the global optimal solution is obtained by Lagrangian relaxation algorithm, due to the effect of ghost, the global optimal solution may not be the correct association result. This has important implications for association accuracy.
However, the simulation results show that the performance of the proposed cost function is greatly immune to the effect of sensor distribution and targets number. This is because the cost function proposed in this paper measures the correlation between measurements according to the epipolar geometric relation, and does not need to use LOS to estimate the target position, which greatly avoids the possibility of ghost appearance. As described in Section 3.3, when there are multiple measurements satisfying the epipolar geometric association, the ambiguity can be eliminated by circular association between three sensors. Therefore, the cost function proposed in this paper performs better.
It should be pointed out that when the number of targets exceeds 20, the association accuracy of the proposed method is not perfect. We analysed the incorrect association results and find that they are mainly caused by two reasons. (1) The distance between targets is very close. When the targets number exceeds 20, the targets density is rather high, and the distance between some targets in the imaging plane is very small or even only 1 pixel. (2) The distance between sensors is very close. In this experiment, the sensor positions are generated randomly in a certain region, and sometimes they may be very close. The above two reasons may lead to a much closed association cost -109 between some pair of 3-tuples, thus result in an incorrect association result. Nevertheless, compared with other cost functions, the association accuracy of the proposed cost function is still excellent. Furthermore, Table 2 demonstrates that the average time consumption of the proposed cost function is less than the other three traditional cost functions. This is another superiority of it.

| CONCLUSIONS
In order to improve the passive sensor data association accuracy, a new cost function is proposed based on epipolar geometry theory, and the correlation between data is measured by the fundamental matrix between sensors. Compared with the traditional cost function, this cost function has two superiorities. (1) It can directly measure the correlation between measurements, and does not need to estimate the target position by LOS, which greatly avoids the probability of ghost appearance and ensures the association accuracy. (2) When calculating the cost function, the detected 2-D coordinates can be directly used as the input of the cost function, which avoids a large number of trigonometric operations in the calculation of LOS, thus avoiding the accuracy loss caused by truncation error and ensuring the association accuracy. Experiments of comparison with the other three cost functions demonstrate that the proposed cost function is greatly immune to the effect of sensor distribution and targets number.