Novel system model-based fault location approach using dynamic search technique

Impedance-based fault location (IBFL) approaches are the most commonly used fault location methods in digital relays. However, each IBFL approach is designed specifically for a line or network configuration and thus is not universal. For example, complex line configurations such as mutually coupled lines and three-terminal lines have to employ individual IBFL algorithms derived specifically for them. Furthermore, they suffer from several sources of error, such as non-homogeneous systems and CT saturation. Hence, this paper presents a novel fault location approach that utilises a system model to overcome these limitations. The proposed model-based fault location (MBFL) approach estimates the fault location by identifying the closest match between the actual fault scenario and various anticipated fault scenarios obtained using the system model. It uses a dynamic search technique to implement the MBFL efficiently. A key highlight of the proposed approach is identifying the location of a fault on a neighbouring line using limited measurements, as few as only the through fault current flowing in a neighbouring line. The advantages of the approach and its practical applicability have been demonstrated by implementing it in complex network configurations as well as on field data.


INTRODUCTION
Transmission and distribution lines are crucial components of any power system. They form the backbone of the present-day interconnected power system and have enabled the transfer of power to loads spread across vast regions. However, for a wide variety of reasons such as natural events, equipment failure, physical accidents, and misoperation, short-circuit faults on overhead lines are unavoidable [1,2]. Transmission lines can span long distances, ranging from a few tens of miles to hundreds of miles. A distribution system, on the other hand, can be a complex network consisting of several taps along its downstream path. Hence, it is essential to clear the fault and restore the line to normal operation. Dispatching the maintenance crew directly to the fault location saves a tremendous amount of resources in the restoration process, which has led to the development of several fault location approaches. Though fault location has been a subject of research interest for decades, only a few algorithms have been used in practice due to practical considerations and limitations of these approaches. For several decades, impedance-based fault location (IBFL) methods have been used in commercial industrial relays [3][4][5][6]. Each IBFL approach has its own input data requirements and is designed specifically for a line configuration. For example, most of the commonly used IBFL algorithms are valid only for simple two-terminal lines. Each complex line configuration, such as a three-terminal line, mutually coupled line, series compensated line, or parallel transmission line, has to employ a separate IBFL algorithm derived specifically for it [6][7][8][9][10][11]. Furthermore, they assume the line impedances to be homogeneous. Hence, these algorithms provide accurate fault location estimates only when their specific requirements and assumptions have been met. Some of the common factors that affect IBFL methods and their sources of error are discussed in [12,13].
Recently, apart from IBFL approaches, traveling-wave (TW) fault location methods are being used in relays as well [3,14,15]. These methods require highly accurate timing information and waveforms recorded at a very high sampling frequency (on the order of hundreds of kHz to MHz) [9]. As a result, they need appropriate sensors and devices with very high sampling frequency, large storage, and processing capability, which makes the approach very expensive to implement. In addition, [6] discusses some factors that affect the accuracy of traveling-wave based approaches. Though other methods have been proposed based on frequency-domain analysis or wide-area fault locating techniques using synchrophasor measurements, they are not yet in common commercial use. Hence, there is a need for a new fault location approach that suits any power system configuration and has fewer restrictions and measured data requirements. Furthermore, an approach that does not require the installation of additional hardware such as high-frequency sampling devices or phasor measurement units will be more economical to implement.
The power system model and information regarding the system are usually underutilised when performing fault location. IBFL and many other fault location approaches commonly use only the line parameters of the faulted line and the source impedances connected to it. Valuable information about the system surrounding the faulted line, though known, is not exploited. For example, through fault current is the fault current flowing through a line feeding an external fault (a fault present elsewhere on the system and not on that line). These through fault current measurements can be used to identify the location of a fault on a neighbouring line using the system model. Loss of accurate measurements, caused by a variety of factors and events, hampers the fault location process. A common cause is current transformer (CT) saturation, and several works in the literature report CT saturation observed in event reports [16,17]. Improper or missing connections to measurement devices also lead to loss of input data [18,19]. Besides, some relays may not have the ability to record measurements or may not be triggered by the fault [20]. Wide-area fault location algorithms have been proposed to tackle scenarios where the fault recording devices at the terminals of a line fail to record the fault waveform [21]. However, these algorithms usually require measurements from multiple locations, which severely limits their practical application to systems with wide-area monitoring capabilities. Hence, identifying the location of a fault on a neighbouring line using a minimal number of measurements (as few as the currents from only one terminal of a neighbouring line) is a unique and valuable benefit. Furthermore, this capability can be used as a second-step verification of the fault location estimated using other methods.
Fault resistance is a critical factor that affects the fault currents and voltages measured by a relay. Though most approaches aim to negate the effect of fault resistance when estimating the fault location, it is beneficial to identify the fault resistance value because it offers insight into the root cause of the fault [22,23]. Furthermore, knowledge of the fault resistance value enables the use of several other fault analysis algorithms that require it in their calculations, and allows the fault scenario to be replicated in a power system model simulation.
Current and voltage measurements extracted from event reports recorded by relays or monitoring devices are used to identify the location of a fault. The power system parameters and network configuration affect these measured quantities and the relationship between them is used to identify the fault location. Engineers use the power system model for a variety of purposes such as to perform protection coordination, system power flow, and stability analysis. Hence, electric utilities commonly have their power system modeled in a power system simulation software such as those presented in [24,25]. Furthermore, several approaches have been developed to extract various power system parameters to form, update, and verify the power system model [26][27][28]. Thus, system model based fault location techniques are promising fault location approaches that can overcome the limitations of IBFL and TW based methods.
A form of model-based fault location (MBFL) approach matches the actual fault currents recorded by a relay or fault recorder with fault currents of simulated fault scenarios. Though similar ideas have been presented in [29,30], they need to be further explored with rigorous analysis to uncover the full potential of the approach. Furthermore, an effective implementation method as well as a study on critical factors that affect these approaches are required.
Based on the above background and motivation, the objective of this paper is to present a fault location method that overcomes the limitations of IBFL approaches and provides the novel benefit of identifying the location of a fault on a neighbouring line using a limited number of measurements. The challenges in identifying the fault location on multi-terminal lines and on neighbouring lines are analysed and their input data requirements are deduced. Furthermore, critical factors such as the effect of remote source voltage and remote source impedance on the fault location estimates obtained using the proposed method are studied. The paper also presents the successful application of the proposed method in complex network configurations as well as to a real-world fault event to demonstrate its practical applicability and superior performance over conventionally used fault location methods.
The major contribution is the novel fault location method that utilises the system model to identify the location and fault resistance of a fault on the line monitored by the relay as well as on neighbouring lines that are outside the primary protection zone of the relay. The paper is organised as follows. Section 2 introduces MBFL and Section 3 presents the proposed approach and a demonstration of the implementation and application of the approach. Section 4 discusses the challenges involved in identifying the location of a fault on multi-terminal lines and neighbouring lines, and extends the application of the proposed method to these cases. Section 5 studies the effect of remote voltage source and system parameters on the proposed approach. Section 6 presents comparisons with IBFL methods and highlights the benefits of the approach.

MODEL-BASED FAULT LOCATION
The underlying concept of a model-based fault location approach is to simulate a variety of fault scenarios using the power system model and identify fault scenarios that closely match the measured fault currents recorded by the relay to estimate the fault location. An overview of the proposed fault location approach is presented in Figure 1. When a fault occurs on the system, the electric utility operators receive fault event reports recorded by relays or fault recorders. The first step is the data preprocessing step where fault current phasors and other information such as fault type are extracted from the fault event reports. The next step is to determine whether the fault is on the line monitored by the relay or on a neighbouring line. This process is discussed in Section 3.1.
Once the fault is determined to either be on the line monitored by the relay or on a neighbouring line, the fault location and fault resistance are estimated using the steps presented in Section 3.2. The approach is implemented using two software programs. A control and processing software (CPS) is used to preprocess the event report and perform analysis and calculations. A power system simulation software (PS) is used to simulate fault scenarios and feed the results to the CPS. Though electric utilities contain large circuit models of a vast region of the transmission network, the fault location analysis focuses only on a small region surrounding the relay recording the measurements. Hence, fault scenario simulations in this approach are carried out using a reduced equivalent circuit (REC) rather than the large full circuit model to speed up the fault analysis process and reduce the memory and computation required. The REC will typically include the line monitored by the relay and neighbouring lines up to one or two buses away from the monitored line.
The proposed method uses fundamental frequency fault current phasors extracted from the steady-state fault portion of the event report as its input. The phasors, represented in rectangular coordinates, are used for all the calculations and analysis. Using phasors rather than only the fault current magnitude provides additional information regarding the underlying system impedance during the fault. The process of extracting phasors for post-fault analysis is well known [31,32]. The fault type is usually available from the event report. It can also be determined using the fault current and/or voltage measurements [4,21,33]. Though fault type identification is a common process, it needs to be performed accurately because only fault scenarios of the detected fault type will be simulated in the proposed dynamic search algorithm. The event report will also contain information regarding which relay recorded it. This information can be extracted in the data preprocessing step to identify the appropriate REC to be used for analysis from a library of RECs for various relay locations.
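As an illustration of the preprocessing step, the sketch below estimates a fundamental-frequency phasor from one cycle of samples using a full-cycle discrete Fourier transform and returns it in rectangular form. This is a common textbook estimator and is an assumption here; the paper only cites [31,32] for the extraction procedure. The function name, the number of samples per cycle, and the synthetic waveform are hypothetical.

```python
import numpy as np

def fundamental_phasor(samples):
    """Full-cycle DFT estimate of the fundamental phasor (peak amplitude).

    `samples` must hold exactly one cycle of the steady-state fault waveform.
    """
    x = np.asarray(samples, dtype=float)
    n = np.arange(x.size)
    # Correlate with one cycle of a complex exponential at the fundamental.
    return (2.0 / x.size) * np.sum(x * np.exp(-2j * np.pi * n / x.size))

# Synthetic check: 32 samples per cycle, 100 A peak, 30 degree phase angle.
N = 32
t = np.arange(N)
wave = 100.0 * np.cos(2 * np.pi * t / N + np.deg2rad(30.0))
phasor = fundamental_phasor(wave)
print(phasor.real, phasor.imag)  # rectangular coordinates, as used by the method
```

A real event report would first need the steady-state fault window to be selected, which is outside the scope of this sketch.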

PROPOSED FAULT LOCATION APPROACH
The proposed approach is presented along with a demonstration. The test circuit used for this purpose is the same transmission network as that used in [30]. The circuit contains complex network configurations such as three-terminal lines and parallel lines with mutual coupling. Figure 2 presents the REC formed from a large transmission network. It shows the equivalent sources and actual lines present but does not show the equivalent line impedances that were introduced into the system during the circuit reduction process. The circuit models used in this paper are classical short-circuit models and do not consider load currents. This is a reasonable assumption because the loads' current levels in a transmission system are negligible in comparison to the fault current levels. As a result, the fault currents in the non-faulted phases are negligible or zero in the examples presented in this paper.
In this paper, OpenDSS was used as the PS to simulate fault scenarios on the REC. MATLAB was used as the CPS to drive OpenDSS to perform data preprocessing as well as post-fault analysis. To demonstrate the approach, a fault was created in the large transmission network model and the observed fault current phasors were used as inputs to the proposed approach as shown in Figure 3. In real-world applications, the fault current phasors can be extracted from event reports as illustrated in [31] and be used in the proposed algorithm implemented inside the protection device such as a relay or in a computer at the substation.
Consider a single line-to-ground (SLG) fault in phase A on a two-terminal line (Line 4) as shown in Figure 2. The fault current in one of the faulted phases for each set of boundary fault scenarios is fitted to a cubic polynomial shown in (1). A cubic polynomial was identified to be sufficient to capture the fault current trend in each of the four cases. Any curve fitting approach, such as the method of least squares, can be used. The b component is the real part of the fault current and the a component is the imaginary part of the fault current.

Each set of the above listed fault scenarios consists of the extreme or boundary fault cases: faults at the beginning and far end of the line, and faults with minimum and maximum fault resistance values. Hence, a fault anywhere on the line with any fault resistance value will lie within the region delimited by these boundary equations; this is proved in the appendix. As a result, when the measured fault current lies within the region delimited by these boundary equations, the fault is determined to be on the monitored line. Otherwise, it is located on a neighbouring line. This process is demonstrated using Figure 5.

Once the fault is determined to be on the monitored line, the fault location and fault resistance are estimated by performing fault analysis on the monitored line. If the fault is determined to lie outside the monitored line, there are two ways to proceed, as discussed in Section 4.3. Either fault analysis is performed on each neighbouring line and the result from the appropriate faulted line is chosen with the assistance of a protection engineer, or an available additional input measurement is used to estimate the fault location independently.
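The region test described above can be sketched as follows. This is a simplified illustration, not the paper's implementation: instead of fitting cubic polynomials to the four boundary curves, it samples each boundary densely and applies a ray-casting point-in-polygon test. The one-terminal circuit, its per-unit values, and all function names are hypothetical.

```python
import numpy as np

# Hypothetical one-terminal system in per unit (not the paper's test circuit).
V_S = 1.0 + 0.0j          # equivalent source voltage
Z_S = 0.02 + 0.20j        # source impedance
Z_Y = 0.05 + 0.50j        # impedance of the monitored line

def fault_current(d, r_f):
    """Simplified fault current for a fault at d (pu of line) with resistance r_f."""
    return V_S / (Z_S + d * Z_Y + r_f)

def boundary_polygon(d_lim=(0.01, 0.99), r_lim=(0.0, 0.5), n=50):
    """Trace the four boundary curves in the (real, imag) plane as one closed loop."""
    d = np.linspace(*d_lim, n)
    r = np.linspace(*r_lim, n)
    loop = np.concatenate([
        fault_current(d, r_lim[0]),        # minimum fault resistance, d increasing
        fault_current(d_lim[1], r),        # far end of line, resistance increasing
        fault_current(d[::-1], r_lim[1]),  # maximum fault resistance, d decreasing
        fault_current(d_lim[0], r[::-1]),  # beginning of line, resistance decreasing
    ])
    return np.column_stack([loop.real, loop.imag])

def inside(point, polygon):
    """Ray-casting point-in-polygon test."""
    x, y = point
    hit = False
    j = len(polygon) - 1
    for i in range(len(polygon)):
        xi, yi = polygon[i]
        xj, yj = polygon[j]
        if (yi > y) != (yj > y) and x < (xj - xi) * (y - yi) / (yj - yi) + xi:
            hit = not hit
        j = i
    return hit

poly = boundary_polygon()
on_line = fault_current(0.5, 0.1)     # fault on the monitored line
beyond = fault_current(1.5, 0.1)      # electrically beyond the remote bus
print(inside((on_line.real, on_line.imag), poly))
print(inside((beyond.real, beyond.imag), poly))
```

In practice the boundary points would come from the initial set of simulated scenarios on the REC rather than from a closed-form expression.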

Stage 2: Estimating the fault location and fault resistance
The fault location and fault resistance estimation process is presented in Figure 6. It involves three steps. The first step is determining the fault simulation scenarios and is presented in Section 3.2.1. The next step is to simulate the fault cases and identify the simulation scenarios closest to the actual fault scenario, as explained in Section 3.2.2. These two steps are repeated until a stopping criterion for the simulation search is met. Finally, the fault location and fault resistance are estimated using the weighted mean method discussed in Section 3.2.3.

Determining simulation scenarios
The simulation scenario set defines all fault scenarios to be simulated by the PS for each iteration of the search algorithm. The fault location (FL) and fault resistance (FR) of each simulated fault scenario are determined in this step. The set of simulation scenarios spans the ranges of values to be explored for fault location and fault resistance. For each fault location value in the fault location set ($L_{set}$), all the fault resistance values in the fault resistance set ($R_{set}$) are simulated. There are two parameters for both FL and FR that determine their simulation set: a) the incremental step size, and b) the range of values. First, a default initial set of fault scenarios is simulated to be used in the first iteration of the k-nearest neighbour (kNN) search (discussed in Section 3.2.2). This simulation set explores the entire range of possible fault location (between 0.01 pu and 0.99 pu of the line length) and fault resistance (between 0 Ω and 50 Ω) values with a very large step size. The subsequent iterations narrow down to the actual fault location and employ smaller step sizes, as discussed later in this subsection. The initial fault location set is $L^{0}_{set}$ = {0.01, 0.33, 0.66, 0.99} pu and the initial fault resistance set is $R^{0}_{set}$ = {0, 5, 12.5, 25, 50} Ω. A variable quantity such as fault location or fault resistance is denoted by a capital letter, the subscript following the variable provides details on the variable, and the superscript refers to the iteration number. This initial set of simulated fault current data is also used to determine whether the fault lies on the monitored line, as discussed in Section 3.1.
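A minimal sketch of how the initial scenario set can be enumerated, using the initial FL and FR values quoted above. The variable names are hypothetical; each tuple would be passed to the PS as one fault simulation.

```python
from itertools import product

L_SET_0 = [0.01, 0.33, 0.66, 0.99]      # fault locations, pu of line length
R_SET_0 = [0.0, 5.0, 12.5, 25.0, 50.0]  # fault resistances, ohms

# Every fault location is paired with every fault resistance.
scenarios = list(product(L_SET_0, R_SET_0))
print(len(scenarios))   # 4 x 5 = 20 scenarios for the initial iteration
print(scenarios[:3])
```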
The initial simulation set is used to run the first search iteration. The results from each iteration of the kNN search are used to determine the step size and range of FL and FR values for the next set of simulation scenarios.
Step size
In the step size equations (2) and (3), max refers to the largest value and min refers to the smallest value among the results from the search algorithm discussed in Section 3.2.2. For example, $L^{n}_{max}$ refers to the fault scenario with the largest FL value among the closest fault scenarios chosen by the search algorithm in the nth iteration. N is the maximum number of iterations the approach is programmed to run. In the various fault scenarios tested by the authors, five iterations were sufficient to accurately locate the fault. Hence, N was set to five in this paper. The initial step sizes before the first iteration are taken as $L^{1}_{step}$ = 0.33 pu and $R^{1}_{step}$ = 5 Ω. On analysing a variety of circuits and fault types, a suitable value for both L and R was identified as 0.05 pu. The step size gradually decreases with the number of iterations in (2) and (3) to obtain a more local search within the identified range of FL and FR values in the subsequent iterations.
After calculating $L^{n+1}_{step}$ and $R^{n+1}_{step}$, if $L^{n+1}_{step} = L^{n}_{step}$ and $R^{n+1}_{step} = R^{n}_{step}$, then the step sizes are adjusted. This ensures that the search keeps progressing and prevents the approach from getting stuck with the same set of scenarios in a few rare cases.

Range
The range of values to be simulated is defined using the start and the end values obtained using the following equations.
The start and end values of each successive iteration are one step beyond the minimum and maximum values of their respective quantities among the search results for the nth iteration, unless they exceed the boundary values. However, when $L^{n}_{max} = L^{n}_{min}$, two steps beyond the lower and upper limits are chosen for the start and end values, respectively.
The fault location set ($L_{set}$) and fault resistance set ($R_{set}$) are then created from these start values, end values, and step sizes, as given in (9) and (10). The CPS drives the PS to simulate the specific fault scenarios identified in the fault location and fault resistance sets. The total number of scenarios simulated in each iteration is the number of elements in $L^{n}_{set}$ multiplied by the number of elements in $R^{n}_{set}$.
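The range and set construction described in this subsection can be sketched as below. Several details are assumptions: the step size is taken as a given input (the step size equations are not reproduced here), the same step is used for the margin and for spacing the set, and the end value is always kept in the set. Function names and the example inputs are hypothetical.

```python
import math

L_BOUNDS = (0.01, 0.99)   # pu of line length
R_BOUNDS = (0.0, 50.0)    # ohms

def next_range(v_min, v_max, step, bounds):
    """Start and end values for the next iteration, clipped to the boundary values."""
    # One step beyond the kNN extremes, or two steps when they coincide.
    margin = 2 * step if v_min == v_max else step
    start = max(v_min - margin, bounds[0])
    end = min(v_max + margin, bounds[1])
    return start, end

def build_set(start, end, step):
    """Evenly spaced values from start towards end; the end value is always included."""
    count = int(math.floor((end - start) / step + 1e-9)) + 1
    values = [start + i * step for i in range(count)]
    if end - values[-1] > 1e-9:
        values.append(end)   # assumption: keep the end value when the step does not land on it
    return values

# Illustrative inputs: kNN results spanning 0.33-0.99 pu and a next step size of 0.165 pu.
l_start, l_end = next_range(0.33, 0.99, 0.165, L_BOUNDS)
l_set = build_set(l_start, l_end, 0.165)
print([round(v, 3) for v in l_set])
```

With these illustrative inputs the upper end is clipped at the 0.99 pu boundary, so the set only extends downward from the kNN results.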

Identifying closest simulation scenarios
The data obtained from simulating the scenarios in the fault location and fault resistance sets are now used to narrow down the search space and obtain fault scenarios that closely represent the actual fault scenario. This step uses the k-nearest neighbour (kNN) search to determine the region to zoom into and focus the search on in the next iteration. Given a set of points S in a metric space P and a query point q ∈ P, the kNN search finds the k points in S closest to q. Here, the set of points S is the simulated fault currents and the query point q is the actual measured fault current. In this paper, Euclidean distance is used as the distance metric for the kNN search. The approach uses the fault currents of all three phases observed at the relay location, represented in rectangular coordinates. Hence, the query point and the metric space are 6-dimensional ($\mathbb{R}^6$). In short, the kNN algorithm identifies the k scenarios in the search space that are closest to the actual fault scenario.
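The paper's implementation uses MATLAB's knnsearch; the NumPy sketch below shows an equivalent brute-force search in the 6-dimensional space of real and imaginary parts. The function names and the current values are hypothetical and chosen only so the distances are easy to check by hand.

```python
import numpy as np

def to_features(currents):
    """Map three-phase complex phasors (..., 3) to real vectors (..., 6)."""
    c = np.asarray(currents, dtype=complex)
    return np.concatenate([c.real, c.imag], axis=-1)

def knn_scenarios(simulated, measured, k=4):
    """Indices and Euclidean distances of the k simulated scenarios nearest to the measurement."""
    dist = np.linalg.norm(to_features(simulated) - to_features(measured), axis=1)
    order = np.argsort(dist, kind="stable")[:k]
    return order, dist[order]

# Hypothetical phase A, B, C currents (amperes) for five simulated scenarios.
simulated = np.array([
    [100 + 0j, 0, 0],
    [200 + 0j, 0, 0],
    [300 + 0j, 0, 0],
    [400 + 0j, 0, 0],
    [1000 + 0j, 0, 0],
])
measured = np.array([250 + 0j, 0, 0])
idx, dist = knn_scenarios(simulated, measured)
print(idx, dist)
```

A brute-force search is adequate here because each iteration simulates only a few tens of scenarios.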
The smallest and the largest fault location ($L^{n}_{min}$ and $L^{n}_{max}$) and fault resistance ($R^{n}_{min}$ and $R^{n}_{max}$) values among the k simulation scenarios identified by the kNN search algorithm are used to determine the step size and range of values to be simulated in the next iteration (discussed in Section 3.2.1). A very small k aggressively narrows down the search space, thus speeding up the fault location process. However, it can also misdirect the search so that it does not converge to the actual fault location. A larger k, on the other hand, narrows down the search space gradually. Though it slows down the process, it is less likely to misdirect the search in the subsequent iterations. Hence, a suitable value of k needs to be chosen. A k value of four worked well for all the scenarios tested by the authors and is used in all the demonstrations presented in this paper.
As the proposed search algorithm progresses, it moves closer to the actual fault location. This can be observed from the decrease in the Euclidean distance (D) between the simulated fault current vector and the measured fault current vector in the simulation scenarios identified by the kNN search algorithm. The search simulations are stopped when either of the following two criteria is met. The first is when D of at least one of the simulation scenarios identified by the kNN search algorithm is less than a set threshold. The threshold was determined by observational analysis and set at 50 in this paper. The other criterion is when the maximum number of allowed iterations (N) is reached.
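The two stopping criteria can be expressed as a single check, sketched below with the threshold of 50 and N = 5 stated in the text as defaults. The function name and the example distances are hypothetical.

```python
def should_stop(distances, iteration, threshold=50.0, max_iterations=5):
    """True when any kNN distance is below the threshold or the iteration limit is reached."""
    return min(distances) < threshold or iteration >= max_iterations

print(should_stop([420.0, 510.0, 630.0, 700.0], iteration=1))
print(should_stop([38.0, 61.0, 75.0, 90.0], iteration=4))
print(should_stop([120.0, 150.0, 180.0, 200.0], iteration=5))
```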

Determining fault location and fault resistance
A weighted mean approach is finally used to estimate the fault location and fault resistance using the closely representative simulation scenarios determined by the kNN search performed in the previous steps. The weights (W ) are assigned using (11).
where p represents the index of the k nearest neighbours identified at the last iteration of the dynamic simulation search. In (11), a larger weight is assigned when the simulated fault scenario is closer to the actual fault scenario (that is, when the distance D is smaller).
Let $FL_{kNN}$ and $FR_{kNN}$ represent the fault locations and the fault resistances of the simulated fault scenarios identified by the kNN search in its final iteration, when the search is stopped. Then, the final estimated fault location and fault resistance of the given fault are calculated using (12) and (13), respectively.
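A weighted mean of this kind can be sketched as follows. The exact weight formula of (11) is not reproduced here, so the sketch assumes normalised inverse-distance weights, which satisfy the stated property that a smaller D gives a larger weight. The function name and the example values are hypothetical.

```python
import numpy as np

def weighted_estimate(distances, fl_knn, fr_knn):
    """Inverse-distance weighted mean of the k nearest simulated scenarios."""
    d = np.asarray(distances, dtype=float)
    fl = np.asarray(fl_knn, dtype=float)
    fr = np.asarray(fr_knn, dtype=float)
    if np.any(d == 0.0):
        # An exact match dominates; avoid dividing by zero.
        exact = d == 0.0
        return float(fl[exact].mean()), float(fr[exact].mean())
    w = 1.0 / d
    w /= w.sum()              # normalise so the weights sum to one
    return float(w @ fl), float(w @ fr)

fl_est, fr_est = weighted_estimate(
    distances=[10.0, 10.0, 20.0, 20.0],
    fl_knn=[0.40, 0.40, 0.50, 0.50],      # pu
    fr_knn=[2.0, 3.0, 2.0, 3.0],          # ohms
)
print(fl_est, fr_est)
```

The two closer scenarios each receive twice the weight of the two farther ones, so the location estimate is pulled towards 0.40 pu.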
As only the fault scenarios that are relevant to the actual fault scenario are simulated, rather than sweeping through the entire line with various fault resistance values, this search method is called the dynamic search (DS) technique in this paper. Table 1 shows the outcome of each iteration of the fault location process described in Stage 2 for the example fault scenario. The kNN search was implemented using the knnsearch function of the Statistics and Machine Learning Toolbox in MATLAB.

TABLE 1
Step-by-step illustration of the fault location and fault resistance estimation using the dynamic search technique (kNN search with k = 4)

The fault location and fault resistance step sizes ($L^{2}_{step}$ and $R^{2}_{step}$, respectively) are calculated using (2) and (3). The next set of fault simulation scenarios is determined from these values as shown in (9) and (10) in Section 3.2.1.

Demonstrating stage 2-The fault location and fault resistance estimation process
Observe that the fault location and fault resistance values simulated and analysed are narrowed down in the second iteration. The $L^{2}_{set}$ is between 0.165 pu and 0.99 pu, compared to 0.01 pu and 0.99 pu in the $L^{1}_{set}$. Similarly, the $R^{2}_{set}$ is between 0 Ω and 6.25 Ω, compared to 0 Ω and 50 Ω in the $R^{1}_{set}$. Hence, the simulation scenarios performed in successive iterations are dynamically determined based on the kNN search such that they converge to the actual fault location, rather than simulating a fixed set of fault scenarios along the entire line length.
The above process is repeated until the simulation scenarios are close to the actual fault location, that is, until either D of any simulation scenario is less than 50 or the maximum number of iterations (N = 5) is reached, as discussed in Section 3.2.2. It can be observed that D decreases with every iteration of the kNN search, showing that the simulation scenarios are converging to the actual fault scenario. At the end of the 4th iteration, D of one of the simulation scenarios from the kNN search was less than 50. Then (12) and (13) were applied to estimate the fault location and fault resistance values, respectively. Table 2 presents the observed fault currents in each phase and the fault location and fault resistance estimated using the proposed method. The errors in fault location and fault resistance are calculated using (14) and (15), respectively.
\[ \text{Fault Location Error (pu)} = \left| FL_{Actual} - FL_{Estimated} \right| \tag{14} \]
\[ \text{Fault Resistance Error } (\Omega) = \left| FR_{Actual} - FR_{Estimated} \right| \tag{15} \]
The proposed approach was able to accurately identify the fault location and fault resistance, as shown in Table 2.

IMPLEMENTING THE APPROACH TO ESTIMATE FAULT LOCATION ON NEIGHBOURING LINES
This section will expand the application of the approach beyond faults on the monitored line. The proposed method can independently identify the fault location accurately using only measurements from one terminal of a line when the measured fault current is unique for a given fault location. This section first shows that the fault current measured in a one-terminal line is unique for a given source voltage for every fault location and fault resistance value. Then, it explains the challenges in obtaining the accurate location of a fault in a two-terminal line and a fault on a neighbouring line using data from only one relay location. Finally, the additional data requirements and incorporation of this available supplemental data into the approach proposed in Section 3 to obtain fault location are discussed.

FIGURE 7
Simplified circuit representation of a one-terminal line

Analysing fault current in one-terminal transmission line
This subsection demonstrates that when the Thevenin impedance to the fault point from the relay location is equal for different fault scenarios, the uniqueness of the relation between fault current and fault location is lost. The notations used for the analysis equations and figures in this paper are as follows. The variable quantities are denoted using capital letters, that is, current, voltage, and impedance are represented by $I$, $V$, and $Z$, respectively. The subscript refers to the component being measured, such as Bus 1 and Line y. The superscript corresponds to the sequence component or other additional information. For example, $Z^{1}_{y}$ refers to the positive-sequence impedance of Line y.
Consider the one-terminal transmission line shown in Figure 7. The reduced system has an equivalent source voltage of $V_S$, a source impedance of $Z_S$, a line impedance of $Z_y$, and a relay monitoring Line y near Bus 1. The system experiences a fault at F located at a distance $d$ pu from Bus 1 with a fault resistance $R_F$. Currents and voltages in the following equations are calculated according to the fault type as demonstrated in [12].
where $Z_{Relay}$ is the impedance seen by the relay. IBFL methods are based on (17). In (17), $V_1$ and $I_{1-F}$ are measured by the relay and there are two unknowns: $R_F$ and $d$. Symmetrical components are commonly used for fault analysis because they decouple a complex three-phase network into three balanced sequence networks, enabling each sequence network to be analysed independently. The different sequence circuits are connected together based on the fault type as explained in [34].
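A relation of the following form, consistent with the description above and stated here as an assumption rather than as the paper's exact expressions for (16) and (17), links the relay measurements to the two unknowns. The symbol $I_F$ for the total current through the fault resistance follows its later use in the two-terminal case.

```latex
V_{1} = d\,Z_{y}\,I_{1-F} + R_{F}\,I_{F},
\qquad
Z_{Relay} = \frac{V_{1}}{I_{1-F}} = d\,Z_{y} + R_{F}\,\frac{I_{F}}{I_{1-F}}
```

For the one-terminal line of Figure 7 there is no remote infeed, so $I_F = I_{1-F}$ and the relay sees $d\,Z_y + R_F$.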
The fault current is a function of the total Thevenin impedance to the fault. For example, in a single line-to-ground fault, the fault current is given by (19), where $Z_{Th}$ is the Thevenin impedance to the fault from the equivalent source (whose prefault voltage is $V^{1\text{-}pre}_{S1}$) and comprises the source impedance ($Z_S$) and the line impedance ($Z_y$).
Fault impedance is typically resistive. Changing the fault resistance affects only the real part of the denominator of (19), whereas varying the fault location ($d$) changes both the real and imaginary parts of the denominator of (19). Hence, for a given prefault voltage, the fault current is unique for each fault location and fault resistance value. Conversely, the prefault voltage and fault current are sufficient to determine the fault location and fault resistance. The above characteristic is visualised in Figure 8 using Line 6 from the demonstration circuit (shown in Figure 2) as an example.

Furthermore, when the fault currents and prefault voltages are known, the fault voltage measurements at the bus do not provide any additional information regarding the fault, as shown by (20)-(22). This is because the voltage at the bus during a fault can be calculated using the prefault bus voltage, fault current, and system impedance parameters. For example, the voltage at Bus 1 measured by relay R in Figure 7 can be calculated using the following equations [35].
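The uniqueness argument can be checked numerically with the standard SLG relation, in which the fault current equals three times the prefault voltage divided by the sum of the zero-, positive-, and negative-sequence Thevenin impedances plus three times the fault resistance. The sketch below assumes equal positive- and negative-sequence impedances; all impedance values and function names are hypothetical and do not correspond to Line 6 of the paper.

```python
# Hypothetical sequence impedances in ohms (not the paper's Line 6 data).
V_PRE = 66395.0 + 0.0j                     # prefault phase voltage, volts
Z1_S, Z0_S = 1.0 + 10.0j, 2.0 + 15.0j      # source positive/zero sequence
Z1_Y, Z0_Y = 4.0 + 40.0j, 12.0 + 120.0j    # line positive/zero sequence

def slg_current(d, r_f):
    """SLG fault current, assuming equal positive and negative sequence impedances."""
    z_src = Z0_S + 2 * Z1_S
    z_line = Z0_Y + 2 * Z1_Y
    return 3 * V_PRE / (z_src + d * z_line + 3 * r_f)

def invert(i_f):
    """Recover (d, r_f) from the fault current and prefault voltage."""
    z_total = 3 * V_PRE / i_f
    z_src = Z0_S + 2 * Z1_S
    z_line = Z0_Y + 2 * Z1_Y
    d = (z_total.imag - z_src.imag) / z_line.imag      # only d moves the reactance
    r_f = (z_total.real - z_src.real - d * z_line.real) / 3
    return d, r_f

d_est, r_est = invert(slg_current(0.37, 4.2))
print(round(d_est, 6), round(r_est, 6))
```

Because the fault resistance only shifts the real part of the total impedance, the reactance fixes $d$ and the remaining resistance fixes $R_F$, which is why a single complex current is enough in the one-terminal case.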

Challenges in identifying the location of a fault on a two-terminal line or a fault on a neighbouring line
Identifying the location of a fault on neighbouring lines is very challenging using IBFL methods because new sets of equations have to be derived for every different network configuration. In addition, there are more unknowns than available equations, thus requiring assumptions to discount the effect of some variables (resulting in more sources of errors) or seek additional measurements. These are demonstrated in this subsection using fault scenarios in three different circuits.
Consider the two-terminal line shown in Figure 9 as Case 1. Though (16) and (17) remain the same, an additional unknown apart from $R_F$ and $d$ is introduced into the equation: $I_F$ is now an unknown because of the remote source contribution ($I_{2-F}$), as shown in (23). As a result, there are more unknowns than available equations. The fault current $I_{2-F}$ can be calculated when the remote source ($V_{S2}$) parameters are known. Hence, when the MBFL approach is used with an accurate system model, the remote source fault current contribution is a known parameter and the fault location can be estimated precisely. Further analysis of the remote source and its effect on the fault location estimate is discussed in Section 5.

Now, consider Case 2, where a reduced circuit contains two lines connected to the same terminal of a two-terminal line, as shown in Figure 10. As discussed in Section 1, there can be a variety of reasons, such as CT saturation, that result in erroneous measurements from relay R2. As a result, the measurements from relay R2 cannot be used to identify the fault location accurately. The through fault current flowing in Line y, monitored by relay R1, will be used to identify the location of the fault on the neighbouring line (Line $x_1$). Identifying the location of a fault present after the tap point on a three-terminal line is similar to this circuit.
The following equations derive the impedance observed by relay R1 during a fault to illustrate the loss of uniqueness in the relation between through fault current and fault location.

FIGURE 11
Case 3: Simplified circuit representation to demonstrate the identification of the location of a fault located two buses away
Comparing (28) with the expression obtained for a two-terminal line (17), an additional unknown variable, $I_{4-2}$, is introduced into the equation. Furthermore, a very similar set of equations can be derived for identifying the location of a fault on Line $x_2$ using measurements from relay R1, as shown in (29).
As a result, the fault current observed at relay R1 for a fault on Line $x_2$ can be similar to the fault current observed for a fault on Line $x_1$. Hence, it is challenging to identify the location of a fault on Line $x_1$ using measurements from R1 instead of R2. Additional information (discussed in Section 4.3) is required to identify the faulted line and the location of the fault on a neighbouring line independently.
The circuit shown in Figure 11 (Case 3) presents a more complicated fault location scenario, where fault current measurements from relay R1 are used to identify the location of a fault on a line two buses away (Line $x_3$). Equation (30) presents the impedance observed by relay R1 for a fault on Line $x_3$, derived in the same way as (28).
Equation (30) has two more unknown variables (I  and $I_{3-2}$) than (17), or one more than (28). The presence of additional fault current contributors between the relay and the fault introduces further unknowns in the impedance equation. Besides, as in the previous case, faults at different locations on the REC can cause the same through fault current to be measured at relay R1.
In Case 2, some additional information from Line $x_1$ or Line $x_2$ is required to uniquely identify the fault location using the proposed method. A current or voltage measurement from either line is sufficient to determine the faulted line. Even a simple indication from a device such as a fault circuit indicator (FCI) is sufficient in place of a current or voltage measurement. FCIs are relatively inexpensive devices that provide an indication of the fault path [36]. There are other ways to determine the faulted line; for example, it can be identified using the circuit breaker status communicated to the relay. If the faulted neighbouring line is known beforehand, the dynamic search can be executed directly on that line to obtain the fault location. In situations where no additional current or voltage measurement or FCI device is available, the approach can estimate the possible fault location on each neighbouring line. The electric utility protection engineer can then use local knowledge about the fault, such as customer reports or phone calls about a power outage or relay trip reports, to pick the fault location result from the appropriate line.

Additional data requirements for complex fault location scenarios
To identify the fault location independently, without external assistance such as from a protection engineer, a minimum of n − 2 unique measurements is required for every bus with n connections (n > 2). Each of the n − 2 measurements must provide unique information, that is, each additional measurement must come from a different line other than the monitored line. The kNN search described in Section 3.2.2 is modified so that the feature space includes the additional current or voltage measurements. Hence, the dimension of the metric space P, the set of points S, and the query point q is a multiple of 6 that depends on the number of additional measurements used. In Case 2, assume an additional measurement, current $I_{4-2}$, is available. Once the approach detects that the fault is not on the monitored line using the steps discussed in Section 3.1, it can explore possible fault locations on each of the neighbouring lines. The kNN search criteria now include the fault currents from relay R1 as well as current $I_{4-2}$. Faults simulated on Line $x_1$ will have an $I_{4-2}$ current closer to the measured $I_{4-2}$ current than fault scenarios simulated on Line $x_2$. Hence, the approach is able to identify the faulted line and the fault location independently. Furthermore, the supplemental data can be multiplied by a scalar to give it additional weight when it is incorporated into the kNN search.
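The extended feature space can be sketched as follows. This is a minimal illustration with k = 1, assuming three-phase relay currents and one additional three-phase current, each flattened into rectangular coordinates (6 + 6 = 12 dimensions). The scenario records, the weight value, and all numbers are hypothetical and are not taken from the paper.

```python
import math


def to_features(phasors, weight=1.0):
    """Flatten complex phasors into (real, imag) pairs, scaled by a weight."""
    feats = []
    for p in phasors:
        feats.extend((weight * p.real, weight * p.imag))
    return feats


def nearest_scenario(scenarios, query):
    """Return the simulated scenario closest to the query (Euclidean distance)."""
    return min(scenarios, key=lambda s: math.dist(s["features"], query))


EXTRA_WEIGHT = 2.0  # assumed scalar that emphasises the supplemental measurement

# Two simulated faults that produce the same through fault current at the relay
# but different currents at the additional measurement point.
relay_currents = [0j, 2 - 1j, -2 + 1j]
simulated = [
    {"line": "x1", "d": 0.4, "extra": [0j, 1.0 - 0.5j, -1.0 + 0.5j]},
    {"line": "x2", "d": 0.7, "extra": [0j, -0.4 + 0.2j, 0.4 - 0.2j]},
]
for s in simulated:
    s["features"] = to_features(relay_currents) + to_features(s["extra"], EXTRA_WEIGHT)

measured_extra = [0j, 0.9 - 0.45j, -0.9 + 0.45j]
query = to_features(relay_currents) + to_features(measured_extra, EXTRA_WEIGHT)
best = nearest_scenario(simulated, query)
```

With the relay currents alone the two scenarios would be tied; the additional six dimensions break the tie in favour of the scenario whose simulated extra current is closest to the measured one.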
Consider a line-to-line (LL) fault involving phases B and C on Line 6 in the circuit shown in Figure 2. Assume accurate measurements from the relays monitoring Line 6 are unavailable; hence, through fault current measurements from a relay monitoring Line 5 near Bus 153 are used to identify the fault location. Table 3 shows possible fault locations on Line 1 and Line 6 calculated by the proposed approach using the through fault current. Though the approach accurately identified a possible fault location on Line 6, it needs either additional information to identify the faulted line or an electric utility protection engineer to apply local knowledge about the fault to choose the appropriate fault location result.
Assume a current measurement from a relay monitoring Line 1 near Bus 40 is available during the fault event. The kNN search now includes three-phase currents from relay monitoring Line 5 near Bus 153 and three-phase currents from relay monitoring Line 1 near Bus 40 in rectangular coordinates. The lower half of Table 3 shows the fault location identified by the approach using the additional available data. It is able to accurately identify the faulted line and the fault location with the help of the additional current measurement. Similarly, if an FCI device mounted on Line 1 indicates no fault present on this line, the approach can be programmed to look for possible fault locations only on Line 6.

SENSITIVITY ANALYSIS ON REMOTE VOLTAGE SOURCE
Variations in the remote source voltage and impedance affect the current contribution from the local end, and this, in turn, affects the fault location estimate. This analysis is vital to understand the impact of the variation of remote source voltage on the measured local current in cases where the system model is not updated to reflect the exact remote source voltages. For example, a decrease in the local fault current as a result of an increase in remote source voltage that was not reflected in the circuit model can result in overestimating the fault location, that is, estimating the fault location to be farther than the actual fault location.
Consider a generic two-terminal line shown in Figure 9. $V_{S1}$ is considered the local source, and the effect of the remote source voltage ($V_{S2}$) on the locally measured current ($I_{S1}$) is analysed. The locally measured current in the faulted phase is expressed in terms of symmetrical components in (31).
A system having balanced voltage sources results in sources only in the positive-sequence network [37]. The various sequence circuits are connected with each other at the fault point. Hence, the voltage at fault point is a key parameter that affects the local current measured.
Partially differentiating (31) with respect to the positive-sequence voltage at the fault point ($V^{1}_{F}$):
The circuit impedances in the negative- and zero-sequence circuits are fixed values and are connected either in series or in parallel with the positive-sequence circuit, depending on the fault type. As the negative- and zero-sequence circuits do not have voltage sources, $\partial I^{0}_{S1}/\partial V^{1}_{F}$ and $\partial I^{2}_{S1}/\partial V^{1}_{F}$ in (32) are constants. Hence, we focus the sensitivity analysis on the positive-sequence circuit.
FIGURE 12
Positive-sequence circuit of the two-terminal line shown in Figure 9
Analysing the positive-sequence circuit of the two-terminal line shown in Figure 12:
In the scenario where the remote voltage source ($V^{1}_{S2}$) is varied, an increase in $V^{1}_{S2}$ causes an increase in $I^{1}_{S2}$ and $V^{1}_{F}$. Partially differentiating (33) with respect to $V^{1}_{F}$:
From (35), an increase in $V^{1}_{F}$ causes a decrease in $I^{1}_{S1}$. Furthermore, when the system has a larger local source impedance ($Z^{1}_{S1}$), that is, a weaker local source, the impact of a variation in $V^{1}_{F}$ on $I^{1}_{S1}$ is smaller: the same variation in $V^{1}_{F}$ results in a smaller change in $I^{1}_{S1}$. To study the impact of a change in $V^{1}_{S2}$ on $V^{1}_{F}$:
By Ohm's law, $\partial I^{1}_{S2}/\partial V^{1}_{S2}$ is inversely proportional to the net total circuit impedance, or proportional to the net total circuit admittance, viewed from $V^{1}_{S2}$. Though the net total circuit impedance depends on how the sequence networks are connected, any increase in the total circuit impedance leads to a decrease in $\partial I^{1}_{S2}/\partial V^{1}_{S2}$. As a result, in fault scenarios with higher fault resistance values, the term in (36) that depends on $\partial I^{1}_{S2}/\partial V^{1}_{S2}$ becomes smaller, causing $|\partial V^{1}_{F}/\partial V^{1}_{S2}|$ to increase. Hence, $I^{1}_{S1}$ becomes more sensitive to variations in $V^{1}_{S2}$ in fault scenarios with higher fault resistance. An increase in the remote source impedance ($Z^{1}_{S2}$), that is, a weaker remote source, brings this term closer to one, as $Z^{1}_{S2}$ forms a larger portion of the net total circuit impedance as viewed from $V^{1}_{S2}$. Consequently, $|\partial V^{1}_{F}/\partial V^{1}_{S2}|$ becomes smaller, implying that $V^{1}_{F}$ varies less with changes in $V^{1}_{S2}$. Hence, $I^{1}_{S1}$ becomes less sensitive to variations in $V^{1}_{S2}$. This is expected because a weaker remote source has a smaller impact on the measured local current.
When a fault is farther down the line from the relay monitoring the line, the per-unit fault distance (d) is larger. In (36), a decrease in 1 − d results in a decrease in the term that depends on $\partial I^{1}_{S2}/\partial V^{1}_{S2}$, causing $|\partial V^{1}_{F}/\partial V^{1}_{S2}|$ to increase. This implies that a variation in $V^{1}_{S2}$ has a larger impact on $V^{1}_{F}$ as the fault moves closer to the remote end of the line. However, according to (35), $|\partial I^{1}_{S1}/\partial V^{1}_{F}|$ decreases with an increase in d. This means that $I^{1}_{S1}$ is less sensitive to variations in $V^{1}_{F}$, as expected, because the fault is much farther away from the local source. Therefore, the impact of a variation in d is influenced by both (35) and (36), depending on the fault scenario.
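The qualitative statements above can be traced through a short worked derivation. This is a sketch under stated assumptions, not a reproduction of (33)-(36): it assumes the positive-sequence circuit of Figure 12 consists of a local branch with impedance $Z^{1}_{S1} + d\,Z^{1}_{L}$ and a remote branch with impedance $Z^{1}_{S2} + (1-d)\,Z^{1}_{L}$ meeting at the fault point, with both source currents directed towards the fault. The symbol $Z^{1}_{L}$ for the positive-sequence line impedance is introduced here for illustration.

```latex
% Local branch (Kirchhoff's voltage law from the local source to the fault point)
\[
I^{1}_{S1} = \frac{V^{1}_{S1} - V^{1}_{F}}{Z^{1}_{S1} + d\,Z^{1}_{L}}
\quad\Longrightarrow\quad
\frac{\partial I^{1}_{S1}}{\partial V^{1}_{F}} = -\frac{1}{Z^{1}_{S1} + d\,Z^{1}_{L}}
\]

% Remote branch (Kirchhoff's voltage law from the remote source to the fault point)
\[
V^{1}_{F} = V^{1}_{S2} - I^{1}_{S2}\,\bigl(Z^{1}_{S2} + (1-d)\,Z^{1}_{L}\bigr)
\quad\Longrightarrow\quad
\frac{\partial V^{1}_{F}}{\partial V^{1}_{S2}}
  = 1 - \frac{\partial I^{1}_{S2}}{\partial V^{1}_{S2}}\,\bigl(Z^{1}_{S2} + (1-d)\,Z^{1}_{L}\bigr)
\]
```

Under these assumptions, a larger $Z^{1}_{S1}$ or a larger d reduces the magnitude of the first derivative, while a larger fault resistance or a smaller 1 − d shrinks the subtracted term in the second derivative, which matches the trends described in the text.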
The impact of each circuit parameter on $I^{1}_{S1}$ discussed above is summarised in Table 4. The first line indicates the parameter varied while the rest of the system and fault conditions are kept constant. The following lines present the gist of the analysis. The last line states whether the variation of $I^{1}_{S1}$ increases or decreases with the change in the parameter indicated on the first line. A similar analysis can be performed for the negative- and zero-sequence circuits as well.
For all shunt fault scenarios, a change in $V^{1}_{S2}$ affects $I^{1}_{S1}$. However, a peculiar fault scenario is a bolted three-phase-to-ground fault (LLLG fault). An LLLG fault involves only the positive-sequence circuit, and a bolted fault results in $V^{1}_{F}$ being fixed at zero. In this case, a change in $V^{1}_{S2}$ does not affect $V^{1}_{F}$ and, in turn, does not affect $I^{1}_{S1}$ according to (33). Hence, a change in the remote source voltage ($V^{1}_{S2}$) does not affect the fault location estimate obtained using the local source current measurement ($I^{1}_{S1}$).
The above characteristic behaviour of the remote voltage source on the locally measured fault current can be extended to three-terminal lines as well as to cases where the fault is located on a neighbouring line. The effect of remote voltage sources on the fault current observed at the local end diminishes because of the multiple impedances present both between the local source and the fault and between the remote source and the local measurement. These fault scenarios are similar to having additional impedances in the local source branch as well as the remote source branch in Figure 12. As seen in Table 4, an increase in $Z^{1}_{S1}$ or $Z^{1}_{S2}$ reduces the influence of $V^{1}_{S2}$ on $I^{1}_{S1}$.
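These trends can be checked numerically. The following is a minimal sketch, assuming a three-phase fault through a resistance r_f so that only the positive-sequence circuit is involved, with the fault-point voltage obtained from a single nodal equation. The function names and all per-unit values are hypothetical, and the trends are shown only for these values.

```python
def local_current(v_s1, v_s2, z_s1, z_s2, z_line, d, r_f):
    """Positive-sequence local current for a three-phase fault through r_f."""
    z1 = z_s1 + d * z_line          # local branch impedance up to the fault
    z2 = z_s2 + (1 - d) * z_line    # remote branch impedance up to the fault
    if r_f == 0:
        v_f = 0j                    # bolted fault pins the fault-point voltage at zero
    else:
        # Nodal equation at the fault point with three branches: z1, z2 and r_f.
        v_f = (v_s1 / z1 + v_s2 / z2) / (1 / z1 + 1 / z2 + 1 / r_f)
    return (v_s1 - v_f) / z1


def sensitivity(r_f, z_s2=0.1j, step=0.02):
    """Change in local current magnitude of response to a small remote-voltage step."""
    base = dict(v_s1=1 + 0j, z_s1=0.1j, z_s2=z_s2,
                z_line=0.05 + 0.5j, d=0.5, r_f=r_f)
    i_before = local_current(v_s2=1 + 0j, **base)
    i_after = local_current(v_s2=1 + step + 0j, **base)
    return abs(i_after - i_before)


bolted = sensitivity(r_f=0.0)
low_resistance = sensitivity(r_f=0.05)
high_resistance = sensitivity(r_f=0.5)
weak_remote = sensitivity(r_f=0.5, z_s2=0.5j)
```

For these values the bolted fault shows no dependence on the remote voltage, the higher fault resistance gives the larger sensitivity, and the weaker remote source gives the smaller one, in line with the summary in Table 4.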

Tackling variations in remote voltage source
An essential requirement for the proposed approach to work successfully is having an accurate system model. This includes the equivalent source voltages in the RECs. The currents and voltages in the system during the fault depend on the equivalent source voltages as seen earlier in this section. Each of the new unknowns introduced by the presence of additional fault current contributors between the relay and the fault in the complex circuits presented in Section 4.2 will no longer be unknowns in the proposed approach as they are directly related to the prefault voltage of the remote buses.
A crucial yet reasonable assumption made in the circuit models is that equivalent source voltages are taken as 1 pu if their exact value is unknown. This is a rational assumption because transmission line voltages are strictly required to operate within their ratings as stipulated in national electric grid codes [38]. Distribution systems are also required to operate within specific voltage limits (such as between 0.95 pu and 1.05 pu [39]). Hence, the voltages typically do not deviate by more than 5-10% from their rated levels.
Fault recording devices are usually set up to record a few cycles of prefault data [40]. In a two-terminal line, the prefault voltage and prefault current information can be used to calculate remote source voltage according to (38).
where superscript n refers to the nth symmetrical component.
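The calculation can be sketched as a single application of Kirchhoff's voltage law per sequence network. This is an assumed form rather than a reproduction of (38): it takes the prefault voltage and current as measured at the local bus, with the current flowing from the local bus towards the remote source. The function name and the per-unit values are hypothetical.

```python
import cmath
import math


def remote_source_voltage(v_pre, i_pre, z_line, z_remote_src):
    """Remote source voltage for one sequence network from local prefault phasors.

    Assumes i_pre flows from the local bus towards the remote source.
    """
    return v_pre - i_pre * (z_line + z_remote_src)


# Synthetic check: build prefault phasors from a known remote source voltage.
V_S1 = cmath.rect(1.0, math.radians(8.0))
V_S2_TRUE = cmath.rect(0.98, 0.0)
Z_S1, Z_LINE, Z_S2 = 0.1j, 0.05 + 0.5j, 0.15j

i_pre = (V_S1 - V_S2_TRUE) / (Z_S1 + Z_LINE + Z_S2)
v_pre = V_S1 - i_pre * Z_S1          # what the relay at the local bus would record
v_s2_est = remote_source_voltage(v_pre, i_pre, Z_LINE, Z_S2)
```

The estimate depends on the line and remote source impedances in the model, so its accuracy follows the accuracy of those parameters.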
The system model can be updated after the data preprocessing step in Figure 1 to reflect the latest remote source voltage. As the source impedance and line impedance parameters of the two-terminal line are known, the method can identify the fault location exactly without making any assumptions.
An approach to circumvent assumptions regarding prefault voltages is to modify the implementation of the proposed framework. The electric utility engineer can apply the approach for a range of remote source voltage values and obtain the span of possible fault locations. For example, in the two-terminal line case shown in Figure 9, the electric utility engineer can run the approach for $V_{S2\text{-pre}}$ = {0.98, 0.99, 1.0, 1.01, 1.02} pu and obtain a fault location estimate for each of the five values. This can provide further information to the maintenance crew to swiftly locate the fault.
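The sweep over remote source voltages can be sketched as follows. This is a simplified illustration and not the paper's dynamic search: it assumes a three-phase fault on a two-terminal line modelled by its positive-sequence circuit, and it uses a brute-force grid over fault location and fault resistance as a stand-in for the search. All names and per-unit values are hypothetical.

```python
def model_current(v_s2, d, r_f, v_s1=1 + 0j, z_s1=0.1j, z_s2=0.1j,
                  z_line=0.05 + 0.5j):
    """Local fault current predicted by the assumed two-terminal model."""
    z1 = z_s1 + d * z_line
    z2 = z_s2 + (1 - d) * z_line
    v_f = (v_s1 / z1 + v_s2 / z2) / (1 / z1 + 1 / z2 + 1 / r_f)
    return (v_s1 - v_f) / z1


def locate(i_meas, v_s2, d_grid, r_grid):
    """Grid point (d, r_f) whose simulated current is closest to the measurement."""
    candidates = ((d, r) for d in d_grid for r in r_grid)
    return min(candidates,
               key=lambda c: abs(model_current(v_s2, c[0], c[1]) - i_meas))


def sweep(i_meas, v_s2_values):
    """Fault location estimate for each assumed remote source voltage."""
    d_grid = [k / 200 for k in range(1, 200)]   # 0.005 .. 0.995 pu
    r_grid = [k / 100 for k in range(1, 51)]    # 0.01 .. 0.50 pu
    return {v: locate(i_meas, v, d_grid, r_grid)[0] for v in v_s2_values}


# Synthetic measurement generated with a remote source voltage of 1.0 pu.
i_measured = model_current(1.0 + 0j, d=0.6, r_f=0.2)
estimates = sweep(i_measured, [0.98, 0.99, 1.0, 1.01, 1.02])
span = (min(estimates.values()), max(estimates.values()))
```

The resulting span gives the range of fault locations consistent with the assumed uncertainty in the remote source voltage.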
With the rapid increase in communication capability in recent times, it is no longer a difficult task to know the system voltage at various locations at regular intervals. An ideal implementation of this approach would involve regularly updating the system model to reflect real-time conditions using state-estimation techniques on frequently measured system information [20].

COMPARISON WITH TRADITIONAL FAULT LOCATION METHODS
IBFL methods have been the benchmark for fault location and have been used in commercial relays for decades [3][4][5]. Hence, this paper compares the fault location estimates from the proposed approach with those from commonly used IBFL approaches, such as the Simple Reactance, Takagi, and Modified Takagi algorithms explained in [12], for complex network configurations.
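For orientation, the Simple Reactance and Takagi estimates can be sketched in their common textbook form. The sketch assumes a three-phase fault, so the phase voltage and current at the relay are used directly; single-line-to-ground faults would additionally need zero-sequence current compensation, which is omitted here. The two-terminal test system is hypothetical and deliberately homogeneous, and the formulas are not taken from this paper, which defers to [12].

```python
import cmath
import math


def simple_reactance(v, i, z_line):
    """Per-unit distance from the apparent reactance seen at the relay."""
    return (v / i).imag / z_line.imag


def takagi(v, i, i_pre, z_line):
    """Takagi estimate using the superimposed (fault minus prefault) current."""
    d_i = i - i_pre
    return (v * d_i.conjugate()).imag / (z_line * i * d_i.conjugate()).imag


# Hypothetical homogeneous system: all impedances share the same angle.
Z_LINE = 0.05 + 0.5j
Z_S1 = 0.2 * Z_LINE
Z_S2 = 0.4 * Z_LINE
V_S1 = cmath.rect(1.0, math.radians(10.0))   # load flows from the local end
V_S2 = 1.0 + 0j
D_TRUE, R_F = 0.6, 0.2

i_pre = (V_S1 - V_S2) / (Z_S1 + Z_LINE + Z_S2)
z1 = Z_S1 + D_TRUE * Z_LINE
z2 = Z_S2 + (1 - D_TRUE) * Z_LINE
v_f = (V_S1 / z1 + V_S2 / z2) / (1 / z1 + 1 / z2 + 1 / R_F)
i_relay = (V_S1 - v_f) / z1
v_relay = V_S1 - i_relay * Z_S1

d_reactance = simple_reactance(v_relay, i_relay, Z_LINE)
d_takagi = takagi(v_relay, i_relay, i_pre, Z_LINE)
```

In this homogeneous case the Takagi estimate recovers the fault location, while the Simple Reactance estimate is shifted by the combination of fault resistance, load, and remote infeed. Neither formula accounts for mutual coupling or a tapped third terminal.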

Complex network configurations
The first complex network configuration presented is lines with mutual coupling between them. Lines 2 and 3 in the test circuit shown in Figure 2 are mutually coupled. Consider a single-line-to-ground (SLG) fault on phase C at a distance of 0.35 pu from Bus 153 on Line 3 with a fault resistance of 5.8 Ω. Table 5 presents the fault currents observed by a relay monitoring Line 3 at Bus 153 and the fault location estimates obtained using the proposed approach (dynamic search) and various IBFL methods. The presence of mutual coupling between the lines affects the zero-sequence fault circuit [41]. As a result, the fault location estimates obtained using the Simple Reactance, Takagi, or Modified Takagi methods are less accurate, as they do not account for this factor in their calculation. Hence, special IBFL algorithms derived exclusively for mutually coupled lines, such as those presented in [9], have to be used to get accurate fault location results. The fault location estimated using the proposed method is very close to the actual fault location. The method also identified the fault resistance value accurately; otherwise, additional impedance-based approaches would be required to estimate the fault resistance.
The next complex network configuration is a three-terminal line. The lines between Buses 153, 40, and 2876 in the test circuit shown in Figure 2 can be visualised as a three-terminal line with Bus 156 as the tap point. In this scenario, measurements from the relay monitoring one end of the line are used to identify the location of a fault beyond the tap point from the relay's perspective. This scenario is similar to identifying the location of a fault on a neighbouring line. The fault is a line-to-line (LL) fault involving phases B and C at a location of 0.485 pu on Line 6 from Bus 156 with a fault resistance of 1.1 Ω. Table 6 presents the fault currents measured by the relay monitoring Line 5 at Bus 153 and the fault location estimates obtained using the proposed approach and various IBFL methods. The proposed approach was able to identify the fault location accurately, whereas the IBFL methods estimated the fault location to be beyond the line length. This is because of the current contribution from the remote terminal at the tap point, which affects the fault current measured by the local relay and hence the fault location estimate. In this scenario as well, separate IBFL algorithms developed for three-terminal lines, such as those presented in [6,7,9], have to be used to get accurate fault location results.

Application to field data
The fault event analysed is a line-to-line (LL) fault involving phases A and B on a sub-transmission line, caused by lightning and a failed arrester. The line connects Station A to Station B and is 9.1 miles long. It is protected by a recloser. Figure 13 shows the current and voltage waveforms recorded at Station A during the recloser action while the fault was still present in the system. The fault was at a distance of 5.33 miles (0.586 pu) from Station A.
The fundamental frequency fault current phasors were extracted using fast Fourier transform (FFT) after the DC offset had decayed. A one cycle window starting at 5.44 cycles was used for this purpose. The fault current phasors and the estimated fault location using the proposed approach as well as the IBFL methods are presented in Table 7. The fault location estimate using the proposed approach (Dynamic Search) is highly accurate and closer to the actual fault location than the IBFL approaches.
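The phasor extraction step can be sketched with a one-cycle, single-bin discrete Fourier transform, which gives the same fundamental component as the corresponding FFT bin. The sketch assumes 16 samples per cycle and a window that starts after the DC offset has decayed; the function name, the start index, and the synthetic waveform are hypothetical.

```python
import cmath
import math


def fundamental_phasor(samples, start, samples_per_cycle=16):
    """Peak-value fundamental phasor from one cycle of samples.

    The phase angle is referenced to the first sample of the window.
    """
    n = samples_per_cycle
    window = samples[start:start + n]
    if len(window) < n:
        raise ValueError("window extends past the end of the record")
    acc = sum(x * cmath.exp(-2j * math.pi * k / n) for k, x in enumerate(window))
    return 2 * acc / n


# Synthetic record: 5 A peak fundamental at 0.3 rad plus a third harmonic.
N = 16
record = [5.0 * math.cos(2 * math.pi * k / N + 0.3)
          + 0.5 * math.cos(3 * 2 * math.pi * k / N)
          for k in range(3 * N)]
phasor = fundamental_phasor(record, start=N)
```

A full-cycle window rejects integer harmonics, which is why the third harmonic in the synthetic record does not disturb the fundamental estimate.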
The authors extensively tested the proposed approach for several fault scenarios, including various fault types, fault locations, and fault resistance values. A few selected cases were presented in this section to demonstrate the benefits of the approach. The fault location scenarios shown in Section 6.1 illustrate the need for special IBFL algorithms for each complex network configuration, such as three-terminal lines and mutually coupled lines. On the other hand, the proposed approach can handle these complex network configurations without any modifications, highlighting its versatility. Furthermore, the practical applicability and feasibility of using the proposed approach in real-world fault scenarios were shown in Section 6.2.

Advantages over various fault location methods
Impedance-based and traveling-wave-based fault location approaches are the two most common fault location approaches used in relays, as discussed in Section 1. A unique highlight of the proposed method is identifying the fault location on neighbouring lines using limited data. The benefits of the proposed model-based fault location approach over impedance-based fault location methods have already been shown in [30,42] as well as in Sections 4 and 6.1.
The primary benefit of this approach lies in being able to use the system model available to the electric utility. Because of the sensitivity of the proposed MBFL approach to system parameters, it can take into account all the factors that affect the fault current and overcome some of the limitations of IBFL approaches. Hence, it does not require separate algorithms for each complex network configuration, and the same method can be used without much modification.
Traveling-wave fault location methods require waveforms to be recorded at a very high sampling frequency, as discussed in Section 1. Hence, they cannot be used on reports from existing relays in the system that do not record at such a high sampling frequency. For example, the event report in the fault scenario presented in Section 6.2 was recorded at just 16 samples per cycle, so a traveling-wave fault location approach cannot be applied to it. The proposed approach does not impose any new hardware requirements (such as a high-sampling-frequency device or new installation of phasor measurement units (PMUs) across the transmission network). Hence, it provides additional benefits with no significant extra cost.
There are some model-based fault location approaches in the literature such as the search and find (S&F) approach in [30] and machine learning-based approaches such as artificial neural networks (ANN) [43, 44] that use the system model but they have their issues and limitations as well. The S&F approach suffers from a loss of precision based on the number of cases simulated as it is not possible to simulate fault scenarios at all possible fault locations with all possible fault resistance values. In addition, the number of cases that can be simulated at a given time depends on the computational capability and available memory of the computing device.
A common issue with ANNs is that they have to be trained again when there is a change in the system network configuration. Furthermore, the fault location estimate will be affected by local and remote voltage levels if they are not taken into consideration while developing the ANNs. A key limitation of most ANN-based techniques is the black-box nature of the approach, which limits its credibility and reliability. Furthermore, training ANNs is relatively hard and complex, both conceptually and computationally, for electrical engineers who are less familiar with the underlying mathematics. These are some of the key reasons why they are not yet commonly used by electric utilities for fault location.
On the other hand, the proposed approach is transparent, enabling electric utility engineers to understand the underlying calculations and the convergence of the simulated fault scenarios towards the actual fault location. Furthermore, any change in the system network configuration only requires the REC to be updated to reflect the modification. There is no need to simulate a large number of scenarios again or to train new ANNs. The dynamic search approach further reduces the setup and run time, as it simulates far fewer fault scenarios than the S&F MBFL technique. The approach is successful even when the fault event lies outside the simulated set of fault scenarios. It also offers the flexibility to incorporate additional data as and when available, as discussed in Section 4.3, to increase the fault location accuracy as well as to function independently without any aid from an engineer. The proposed approach can also be extended to distribution systems, similar to [42], using fault circuit indicators.

CONCLUSION
A novel fault location algorithm that overcomes several limitations of conventional fault location methods is presented in this paper. A dynamic search method is proposed to perform the search process of a model-based fault location (MBFL) approach more effectively and efficiently. A key highlight of the proposed approach is identifying the location and resistance of a fault on a neighbouring line with limited measurements. The approach is universal and advantageous to implement in the presence of complex network configurations, such as three-terminal lines and lines with mutual coupling between them. The approach derives its benefits from information about the surrounding system contained in the system circuit model. Furthermore, the practical applicability of the approach was demonstrated by implementing it on field data. This paper also analyses a critical factor that affects the accuracy of the proposed method, the remote source voltage. The approach is flexible and can incorporate additional measurements as and when available. These can be used to identify the faulted line in scenarios where the fault is on a neighbouring line and to increase the accuracy of the fault location estimate.

APPENDIX A
This section provides proof that the measured fault current of a fault anywhere on the line, with any fault resistance value, lies within the region delimited by the boundary equations presented in Section 3.1. Consider an SLG fault on phase A of the two-terminal line shown in Figure 7. The current on the faulted phase ($I^{A}_{F}$) is