Accurate partial discharge localisation using a multi-deep neural network model trained with a novel virtual measurement method

The time difference of arrival (TDOA)-based localisation method has been used extensively for partial discharge (PD) diagnosis despite being time-inefficient and sensitive to time delay estimation. Most contemporary work tries to overcome these problems with data-driven approaches and/or statistical simulation methods. When the two are used together, the statistical simulation-based methods support the data-driven approaches by supplying them with large amounts of data during the testing phase. The present work instead introduces a novel training phase strategy for a multi-deep neural network model (MDNNM). In this method, 'N' randomly generated PD sources in 3-D space are obtained statistically through the virtual measurement method (VMM). The time delays amongst sensors, for the ultra-high frequency signals received from these 'N' PDs, are used to train the MDNNM. This enhances the model's PD detection ability, because the time delays obtained in this way expose the model to the measurement error beforehand; consequently, the model learns to predict the PD coordinates accurately. When the MDNNM trained with this novel VMM is applied in practice, the experimental results show that a location accuracy of 1° can be obtained for a system time difference error of up to 10 ns.


INTRODUCTION
Partial discharge (PD) detection has been studied extensively by researchers, as PD signals indicate the health status of electrical equipment [1,2]; knowing this status is very useful for protecting the system's reliability and safety against anticipated permanent faults and/or failures [3-5]. To keep the power system healthy and avoid malfunctions, accurate PD localisation is needed [6,7]. For this purpose, ultra-high frequency (UHF)-based methods have been acclaimed by researchers [8] despite being time-consuming and sensitive to time delay estimation. The anti-interference property and stable propagation speed of electromagnetic waves made these methods a subject of research interest [9]. UHF-based methods include angle of arrival, time of arrival, time difference of arrival (TDOA), and received signal strength indicator [10,11]. Amongst these, the TDOA-based method has received far more attention, owing to its accuracy in PD source localisation. Nevertheless, this method demands nonlinear equations to be solved using iterative methods [12,13], 2-D and 3-D cases, involving three different types of antenna arrays, and the overall results regarding fast and accurate PD detection were encouraging for future endeavours. Additional studies, including varying the error and changing the radius of the measurement array, made this work multifaceted. However, being a data-driven approach, this method demanded an enormous amount of data.
To overcome this heavy data demand, the studies in [21,22] expanded the work and presented an approach based on a virtual measurement method (VMM). This approach required merely a single set of measurement data, from which multiple sets of time delay data were generated by adding a random error (drawn from a uniform distribution) to the measured delay dataset; multiple delay datasets were thus obtained through statistical simulation. This method focused primarily on the accuracy of the PD detection results, unlike the previously mentioned DNN approach [9], whose main consideration was speedy PD detection. A hybrid approach was therefore required that could combine the strengths of these approaches while overcoming their weaknesses. This was made possible by the study in [23], which used VMM to ensure that enormous amounts of data were available to a multi-DNN model (MDNNM) during its testing phase. A single set of time delay equations amongst the sensors was needed, which was then expanded into 'N' sets of time delay equations simply by adding a measurement error conforming to either a uniform or a normal (Gaussian) distribution. The results were better than those of the previous works in terms of both accuracy and speed of PD detection. Although this approach gave admirable results even for large measurement errors, some amendments, reported in this work, were made to further reduce the average error of the predicted PD coordinates (r, θ, Ф).
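The virtual measurement idea described above can be illustrated with a minimal sketch. It assumes that each of the six delays receives an independent uniform error in [-T, T]; the function name `virtual_delay_sets`, the measured values, and the bound of 5 ns are hypothetical and are not taken from [21-23].

```python
import numpy as np

def virtual_delay_sets(measured_ns, max_error_ns, n_sets, seed=0):
    """Expand one measured delay set into n_sets virtual delay sets.

    ASSUMPTION: each delay gets an independent error drawn uniformly from
    [-max_error_ns, +max_error_ns]; the cited works may differ in detail.
    """
    rng = np.random.default_rng(seed)
    measured = np.asarray(measured_ns, dtype=float)
    errors = rng.uniform(-max_error_ns, max_error_ns,
                         size=(n_sets, measured.size))
    return measured + errors  # shape: (n_sets, number of delays)

# Hypothetical measured delays in ns for six sensor pairs (illustrative only).
measured = [-80.0, -35.0, -25.0, 20.0, 55.0, -60.0]
virtual = virtual_delay_sets(measured, max_error_ns=5.0, n_sets=1000)
print(virtual.shape)         # one row per virtual measurement
print(virtual.mean(axis=0))  # column means stay close to the measured set
```

Each row of `virtual` can then be treated as if it were a separate measurement, which is what allows a single recording to supply many delay sets.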
The present work introduces a novelty to the training phase of [23] by applying VMM while training the multi-DNN model, instead of during the testing phase. In this method, 'N' randomly generated PD sources in 3-D space are obtained statistically through VMM. This is done by adding an error, obeying either a Gaussian or a uniform distribution, to the randomly generated PD source. The time delays amongst sensors, for the UHF signals received from these 'N' PDs, are used to train the MDNNM, which ultimately outputs the three averaged PD coordinates. When tested, the model trained in this way showed a noticeable improvement over the state-of-the-art methods. The training time undoubtedly increases, but the prediction accuracy increases manifold in return.
The remainder of this study is organised as follows: Section 2 presents the mathematical modelling-related aspects of the 3-D PD localisation problem. Section 3 explains in detail the training and testing phases of the PD localisation algorithm. Section 4 includes the implementation of the proposed methodology in various manners and subsequently discusses the results and compares them with the state-of-the-art work. Finally, the work ends with a conclusion in Section 5.

MATHEMATICAL MODELLING FOR THE LOCALISATION METHOD
The 3-D modelling is more complex than the 2-D modelling because, unlike the 2-D case, it involves different antenna array configurations. Three different types have been presented in [9], and they are considered in this work as well. For the 3-D case, we have used four sensors with six TDOA measurements between them. Before proceeding to the modelling, the following subsection briefly presents the 3-D array configurations.

3D array configurations
The three types of array configuration are as follows: uniform circular array (UCA) (T1), Y-shaped array (T2), and right triangle pyramid-shaped array (T3). The rectangular coordinates of all four sensors in each of these three types are listed in Table 1. The corresponding spherical coordinates can easily be obtained with simple formulas. All antenna arrays, including the UCA (which is generally studied in 2-D), are arranged on a spherical surface of radius R. The antennas are linearly and vertically polarised and omnidirectional, with an identical response. Further detail regarding these configurations can be found in [9].
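The conversion from rectangular to (r, θ, Ф) coordinates mentioned above can be sketched as follows. The angle convention is an assumption, since the text does not state it: θ is taken as the azimuth in the x-y plane and Ф as the polar angle measured from the +z axis. The function name and the sample point are hypothetical and do not come from Table 1.

```python
import math

def rect_to_spherical(x, y, z):
    """Return (r, theta_deg, phi_deg) for the point (x, y, z).

    ASSUMPTION: theta is the azimuth in [0, 360) and phi is the polar angle
    from the +z axis in [0, 180]; an elevation-angle convention would give
    phi_elevation = 90 - phi.
    """
    r = math.sqrt(x * x + y * y + z * z)
    theta = math.degrees(math.atan2(y, x)) % 360.0
    # Clamp the ratio to guard against rounding slightly outside [-1, 1].
    ratio = max(-1.0, min(1.0, z / r)) if r > 0 else 1.0
    phi = math.degrees(math.acos(ratio))
    return r, theta, phi

# Hypothetical sensor on a sphere of radius R = 10 m (not a Table 1 entry).
print(rect_to_spherical(0.0, 10.0, 0.0))
```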

3D modelling for the PD localisation
For 3-D modelling, all the sensors, along with the PD source, have three coordinates: (x, y, z). To replicate the coordinates of the 3-D PD source, (r, θ, Ф) are first randomly generated. The limits chosen for these coordinates are: r ∈ (0 m, 60 m), θ ∈ (0°, 360°), Ф ∈ (0°, 90°). Equation (1) is used to convert these spherical coordinates to rectangular coordinates (X, Y, Z). The distance d_i between each sensor (x_i, y_i, z_i) and the PD source (X, Y, Z) is obtained through Equation (2). The corresponding time of arrival t_i is then obtained from d_i by the relation t_i = d_i/c, where c is the speed of light. In Equation (3), t_ij0 = t_i − t_j expresses the difference of real arrival times between any two sensors S_i and S_j.
The TDOA equations given in Equation (3) do not fully represent the practical scenario unless a deviation caused by the system's measurement error is included in them. This error exists because of the mismatch between actual and measured PD values. The measurement time error between the PD source and sensor S_i is expressed by ε_i; likewise, for sensor S_j, the measurement time error is ε_j. Therefore, the measured difference between the times of arrival at sensors S_i and S_j is t_ij0 + ε_ij instead of just t_ij0, where ε_ij = ε_i − ε_j. This measurement-related deviation is affected by factors such as the sensing ability of the UHF sensors, the system's sampling rate, and the measuring accuracy of the arrival time [9].
To investigate thoroughly how the PD source location is affected by the error in TDOA, the error-based deviation ε_ij is deliberately varied, so that the relation becomes t_ij0 + ε_ij. The TDOA equations after adding these deviations are given by Equation (4). It is worth mentioning that ε_ij has an upper bound T such that ε_ij ∈ [−T, T], and that it obeys either a Gaussian or a uniform distribution. The normal distribution is generally preferred, as many naturally occurring phenomena approximately follow it. However, the uniform distribution is also studied for comparison.
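For reference, the relations referred to as Equations (1) to (4) can be written out as below. This is a reconstruction from the definitions in the text rather than the authors' typeset equations. Equations (2) to (4) follow directly from the stated definitions, whereas the angle convention in Equation (1) is an assumption: Ф is taken here as the polar angle from the +z axis, and with an elevation-angle convention sin Ф and cos Ф would be interchanged.

```latex
% Eq. (1): conversion of (r, \theta, \Phi) to rectangular coordinates.
% ASSUMPTION: \Phi is the polar angle measured from the +z axis.
\begin{align}
X &= r\sin\Phi\cos\theta, \qquad Y = r\sin\Phi\sin\theta, \qquad Z = r\cos\Phi \tag{1}\\
% Eq. (2): distance between sensor S_i at (x_i, y_i, z_i) and the PD source.
d_i &= \sqrt{(X - x_i)^2 + (Y - y_i)^2 + (Z - z_i)^2} \tag{2}\\
% Eq. (3): ideal TDOA, using t_i = d_i / c.
t_{ij0} &= t_i - t_j = \frac{d_i - d_j}{c} \tag{3}\\
% Eq. (4): TDOA including the bounded measurement deviation.
t_{ij} &= t_{ij0} + \varepsilon_{ij}, \qquad \varepsilon_{ij} = \varepsilon_i - \varepsilon_j, \qquad \varepsilon_{ij} \in [-T, T] \tag{4}
\end{align}
```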

Data generation for the MDNNM

Training phase
Some state-of-the-art methods, especially [23], which incorporates the VMM, concentrated on improving DNN-based methods during their testing phases; this work instead focuses on refining the training phase strategy. For this purpose, VMM is applied during the training phase. In [23], a single set of time delay equations was used to statistically generate a desired number 'n' of sets of time delay equations, with the added error obeying a standard distribution (normal or uniform). In this work, a single PD source is first randomly generated; afterwards, the desired number 'n' of PD sources is obtained statistically. This is done by adding an error-based deviation (which follows a standard distribution, normal or uniform) to the initially generated single PD source. Figure 1 illustrates the methodology for the generation of the 'n' PD sources.
The corresponding time delay sequences are then generated, as explained further in Figure 3. Once the time delay sequences are obtained, no error is added to them during the training phase before they are fed to the MDNNM. The expectation is that the model learns beforehand to cope with the error-laden time delay sequences it will meet during the testing phase (also shown in Figure 3), provided it has been trained on 'n' PD sources that deviate from one another according to a standard distribution. The datasets generated in this way are fed to the multi-DNN model for its training, and the model outputs the PD source coordinates.
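The training-data generation described above can be sketched as follows. Several details are assumptions made only for illustration: the deviation is applied to (r, θ, Ф) with hypothetical spreads, out-of-range values are clipped or wrapped to the stated limits, Ф is treated as the polar angle from the +z axis, and the four sensor positions are placeholders rather than the Table 1 arrays.

```python
import numpy as np

C_M_PER_NS = 0.299792458  # speed of light in metres per nanosecond
# 0-based sensor index pairs for t_12, t_43, t_14, t_23, t_24, t_13.
PAIRS = [(0, 1), (3, 2), (0, 3), (1, 2), (1, 3), (0, 2)]

def spherical_to_rect(sph):
    """Rows of (r, theta_deg, phi_deg) -> rows of (X, Y, Z).
    ASSUMPTION: phi is the polar angle from the +z axis."""
    r = sph[:, 0]
    theta, phi = np.radians(sph[:, 1]), np.radians(sph[:, 2])
    return np.column_stack([r * np.sin(phi) * np.cos(theta),
                            r * np.sin(phi) * np.sin(theta),
                            r * np.cos(phi)])

def ideal_tdoa_ns(sources_xyz, sensors_xyz):
    """Error-free TDOAs in ns, one row per source, one column per pair."""
    dist = np.linalg.norm(sources_xyz[:, None, :] - sensors_xyz[None, :, :],
                          axis=2)
    toa = dist / C_M_PER_NS
    return np.column_stack([toa[:, i] - toa[:, j] for i, j in PAIRS])

def virtual_sources(seed_sph, spread, n, dist="normal", rng=None):
    """Generate n PD sources by perturbing one randomly generated source."""
    if rng is None:
        rng = np.random.default_rng(0)
    spread = np.asarray(spread, dtype=float)
    if dist == "normal":
        dev = rng.normal(0.0, spread, size=(n, 3))
    elif dist == "uniform":
        dev = rng.uniform(-spread, spread, size=(n, 3))
    else:
        raise ValueError("dist must be 'normal' or 'uniform'")
    sph = np.asarray(seed_sph, dtype=float) + dev
    # ASSUMPTION: keep sources inside the stated coordinate limits.
    sph[:, 0] = np.clip(sph[:, 0], 0.0, 60.0)
    sph[:, 1] = np.mod(sph[:, 1], 360.0)
    sph[:, 2] = np.clip(sph[:, 2], 0.0, 90.0)
    return sph

# Placeholder sensor positions in metres (hypothetical, not Table 1).
sensors = np.array([[0.0, 0.0, 0.0], [10.0, 0.0, 0.0],
                    [0.0, 10.0, 0.0], [0.0, 0.0, 10.0]])
rng = np.random.default_rng(42)
seed_source = np.array([rng.uniform(0, 60), rng.uniform(0, 360),
                        rng.uniform(0, 90)])
targets = virtual_sources(seed_source, spread=[2.0, 5.0, 5.0], n=1000, rng=rng)
features = ideal_tdoa_ns(spherical_to_rect(targets), sensors)  # no error added
```

Here `features` and `targets` would play the role of the network inputs and outputs during training; in line with the text, the error enters only through the spread of the sources and not through the delays themselves.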

Testing phase
For the testing phase, 'n' PD sources, as given in Figure 2, are randomly generated and no error is included. Further, unlike [23], VMM is not used during the testing phase while obtaining the time delay sequences. However, an additional error is applied to the time delay sequences before they are fed to the pre-trained DNN model. This is done to replicate the practical scenario, in which the time delay sequences arriving at the sensors contain errors and the measurement system also introduces a certain amount of error. This is elaborated in Figure 3.

Training and testing of the MDNNM
Once the data are available, the model is first trained; this is done for all three array designs discussed earlier. Afterwards, all the pre-trained models are tested and results are obtained. The complete strategy for both phases is given in Figure 4. The accuracy and loss curves for the three trained models are shown in Figures 5-7, respectively. The type 2 array shows the best performance, whereas type 1 shows the worst. This is confirmed by the comparison in the results section.
The working strategy of any one of these pre-trained models is demonstrated in Figure 8, which shows that the predicted outputs depend directly on the input sequences of time delays. It is therefore necessary to obtain the time delay sequences carefully, which is why we have tested these models for varying error values. The pre-trained model that still predicts the test PDs well after large simulated errors are introduced is regarded as the most robust one.

RESULTS AND ANALYSIS
To test the efficiency and robustness of the MDNNM trained with a novel VMM (TNVMM), a multifaceted study has been carried out in this section. Comparisons have been made with two previously presented methodologies for PD localisation, namely (1) MDNNM [9] and (2) MDNNM based on a VMM (MDNNM-VMM) [23]. First, the measurement error value is changed and its effects are examined. The averaged error results for all 3-D coordinates (r, θ, Ф), together with the calculated distance (d), have been compared so as to explore the performance for each coordinate separately. Different antenna array designs are considered as well.
For a smooth comparative analysis, the varying error has been applied to just two time delays, namely t_12 and t_43. The type 2 array has mostly been considered owing to its robust design, as observed in Figure 6 and as suggested in [9,23]. In the graphical illustrations in this section, the methods proposed in [9] and [23] are denoted DNN and VMM_old, respectively, whereas the MDNNM-TNVMM proposed in this study is denoted VMM_new.

Effect of varying simulated error on the PD coordinates
In a real measurement scenario, there are system-related errors and measurement time errors; to replicate them in a simulation, an additional random error value must be added to the ideal time delay sequences. This random error value is varied and its effects are observed in order to determine the extent to which accurate PD detection remains possible. Figures 9-11 show that, for all the PD coordinates, the proposed MDNNM-TNVMM method performs better than the other two methods. Figure 12 confirms this collectively for the averaged distance between the test and predicted PD locations. It is worth noting that even for a large error value of 24 ns, the proposed method can predict the PDs within a 5 m range, whereas for error values below 10 ns, the prediction accuracy is within a 2 m range. The r and θ coordinates are predicted within 6 m and 6°, respectively, for a simulated error of up to 24 ns. For a detailed comparison, see Table 2.
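The averaged errors Δr, Δθ, ΔФ, and Δd discussed above could be computed along the following lines. This is a sketch under assumptions: the errors are taken as mean absolute differences, the azimuth difference is wrapped so that 350° and 10° differ by 20°, and Ф is treated as the polar angle when converting to rectangular coordinates for Δd. The function names and sample values are hypothetical.

```python
import numpy as np

def _to_rect(sph):
    # ASSUMPTION: phi (third column) is the polar angle from the +z axis.
    r = sph[:, 0]
    theta, phi = np.radians(sph[:, 1]), np.radians(sph[:, 2])
    return np.column_stack([r * np.sin(phi) * np.cos(theta),
                            r * np.sin(phi) * np.sin(theta),
                            r * np.cos(phi)])

def averaged_errors(true_sph, pred_sph):
    """Mean absolute errors between test and predicted (r, theta, phi) rows."""
    true_sph = np.asarray(true_sph, dtype=float)
    pred_sph = np.asarray(pred_sph, dtype=float)
    d_r = np.abs(true_sph[:, 0] - pred_sph[:, 0])
    # Wrap the azimuth difference into [-180, 180) before taking its magnitude.
    d_theta = np.abs((true_sph[:, 1] - pred_sph[:, 1] + 180.0) % 360.0 - 180.0)
    d_phi = np.abs(true_sph[:, 2] - pred_sph[:, 2])
    d_dist = np.linalg.norm(_to_rect(true_sph) - _to_rect(pred_sph), axis=1)
    return {"delta_r": d_r.mean(), "delta_theta": d_theta.mean(),
            "delta_phi": d_phi.mean(), "delta_d": d_dist.mean()}

# Hypothetical test and predicted sources: (r in m, theta in deg, phi in deg).
true_pd = [[10.0, 350.0, 45.0], [20.0, 90.0, 30.0]]
pred_pd = [[12.0, 10.0, 40.0], [19.0, 85.0, 33.0]]
errors = averaged_errors(true_pd, pred_pd)
print(errors)
```

Wrapping matters only for θ, whose stated range spans a full turn; without it, predictions near 0° and 360° would appear far worse than they are.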

Effect of antenna array's radius on the location accuracy
This section investigates in detail whether the size of the antenna array matters for PD localisation. First, a separate training session is carried out for each value of the antenna array's spherical radius, ranging from 10 to 60 m. Then, for varying simulated error values, the effects on each coordinate are examined separately for the different trained models. Figure 13 shows how increasing the array's radius decreases Δr. It can be seen from the figure that the type 2 array can predict the r coordinate to within 2 m provided the size of the antenna array is 30 m or above. For values above 30 m, no significant improvement is seen; therefore, this array size seems suitable for PD localisation. Figure 14 has been examined in a different manner so as to gain deeper insight into PD localisation: the error distribution is considered so that a suitable antenna array size can be found for the different simulated error values. Although the mean error values of the predicted outcome for R = 20 m, 50 m, and 60 m are approximately the same for error values ranging from 0 to 24 ns, the error distribution shows that R = 50 m and 60 m have less variation. This indicates that these sizes, especially 50 m, are suitable if cost is neglected.
As far as ΔФ (see Figure 15) is concerned, the behaviour is not decisive and no useful finding could be drawn from it. Further, it is already well known that θ is the most preferred choice for PD localisation, followed by r, whereas Ф is not considered on its own for PD localisation. Figure 16 shows the overall performance in terms of the distance difference between the test and predicted values. It can be seen that a size of R = 30 m is suitable for PD localisation if both the accuracy of the PD prediction and the cost-effectiveness of the proposed method are taken into account.

Effect of distribution error type on the location accuracy
Either a uniform or a normal distribution has been used in [9,21-23] to represent the simulated error; therefore, we have also considered both distributions and carried out a comparative study between them with reference to PD localisation. During the training phase of the MDNNM-TNVMM, when statistically generating the desired number 'n' of PDs from a single randomly generated PD source, a deviation (i.e. a random error) obeying one of the two distributions has to be added. After training the model with each of the two distributions, we performed simulations to identify which distribution trains the model better for PD prediction. Figure 17 compares the parameters predicted by the two trained models; the normal (Gaussian) distribution-based model clearly performs better than the other model, so it should be preferred. Nevertheless, the difference in the final averaged distance is not large, so either of the two distributions can work well for detecting PD sources. Notably, for a large error value of 24 ns, Δd is below 5 m for the normal distribution and below 6 m for the uniform distribution.

Effect of different array designs on the location accuracy
On the basis of the comparative analysis after the training phase of this work and of some recent studies [9,23], we considered the type 2 array for our work. However, in order to test the performance of our model during its application, we   Table 2. Figures 18-21 show odd behaviour for the two models other than type 2. For both small error values and large simulated error values such as 24 ns, the behaviour of the type 3 array is very unusual: the averaged deviations remain approximately the same throughout, which is not possible in practice. Large deviations are sometimes observed for smaller error values rather than for larger ones, which is again abnormal. Similarly, the type 1 design does not follow an increasing trend and performs even worse. Figure 20, which shows ΔФ, illustrates this unusual trend clearly: even for a nil simulated error value (possible only statistically, not practically), the averaged deviation between test and predicted PDs is close to 10°. This shows that both these array designs are not well suited to the PD detection problem. Finally, Figure 21 endorses the previously stated finding that type 2 is the best design for PD localisation studies.
Note: The subscript VMMnew denotes the multi-deep neural network model trained with a novel virtual measurement method (MDNNM-TNVMM) proposed in this study, VMMold denotes the method proposed in [23], and DNN denotes the method proposed in [9].

Localisation experiment
To verify the effectiveness of the proposed MDNNM-TNVMM, this section presents an application case in which a localisation experiment is performed with the proposed VMM. Before the test, all four antennas need to be put together for time difference calibration, as shown in Figure 22. Figure 23 shows an oscilloscope acquiring the PD signals received at the four antennas when a PD signal is generated by an electrostatic discharge gun. Data are recorded in a cylindrical coordinate system, in which the coordinates of the antennas are S_1 (0, 0°, 0.97), S_2 (10, −60°, 0.97), S_3 (10, 60°, 0.97), and S_4 (10, 180°, 0.97); they are arranged as shown in Figure 24. The sampling rate is Once the PD signal is captured, the estimated TDOA dataset is obtained as t_12, t_43, t_14, t_23, t_24, and t_13. The localisation algorithm is then implemented with N = 1000. It has been observed that, for varying T_m values, the calculated radial distance 'ρ' and angle coordinate 'θ' remain stable, and the predicted values meet the requirements for accurate PD detection and localisation in a substation. The localisation algorithm has been applied to the estimated TDOA sets for PD pulses generated at different cylindrical coordinates within a range of 30 m.
After capturing the PD signal for a PD pulse generated at cylindrical coordinates (17 m, 152°, 1.8 m), the estimated TDOA dataset is (t_12, t_43, t_14, t_23, t_24, t_13) = (−79.6, −35.2, −24.8, 19.6, 54.8, −60) ns. Inputting this dataset to the pre-trained model after applying VMM for different T_m values, and subsequently predicting the PD location, yields the results given in Table 4. Similarly, for a PD pulse generated at cylindrical coordinates (26 m, 152°, 1.8 m), the estimated TDOA dataset is (t_12, t_43, t_14, t_23, t_24, t_13) = (−71.3, −53.9, −17.4, 10, 53.9, −71.2) ns. The predictions corresponding to these TDOA values are presented in Table 5. Another case study is given in Table 6, for which the TDOA set was (t_12, t_43, t_14, t_23  It is evident from these results that, within a range of 30 m, the proposed method can predict the PD location to within 3 m in radial distance and 1° in angular coordinate. Moreover, for values of T_m below 10 ns, the predicted results are even more reliable for PD localisation. Compared with [22], the MDNNM-TNVMM in this study behaves robustly for values of T_m greater than 10 ns, and the error in the time delay dataset does not affect its performance very much. Additionally, a practical scenario for PD detection and localisation should be a 3-D case study, unlike that in [22]. This makes the localisation method more reliable for application and for further extension to PD detection in substations and so forth.
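Because each t_ij is a difference of arrival times, ideal TDOAs satisfy loop relations such as t_12 + t_23 = t_13. The sketch below evaluates these loop residuals for the two complete TDOA sets quoted above. The function name is hypothetical, and the check is offered only as a sanity test on measured delay sets, not as part of the authors' algorithm.

```python
def closure_residuals_ns(t12, t43, t14, t23, t24, t13):
    """Residuals (ns) of three loop relations that ideal TDOAs satisfy.

    With t_ij = t_i - t_j, every residual is zero for error-free delays.
    """
    return (
        t12 + t23 - t13,  # sensors 1-2-3: t_12 + t_23 = t_13
        t12 + t24 - t14,  # sensors 1-2-4: t_12 + t_24 = t_14
        t13 - t14 - t43,  # sensors 1-3-4: t_13 - t_14 = t_43
    )

# TDOA sets (ns) as quoted in the text, ordered t12, t43, t14, t23, t24, t13.
set_17m = (-79.6, -35.2, -24.8, 19.6, 54.8, -60.0)
set_26m = (-71.3, -53.9, -17.4, 10.0, 53.9, -71.2)

res_17m = closure_residuals_ns(*set_17m)
res_26m = closure_residuals_ns(*set_26m)
print([round(r, 3) for r in res_17m])
print([round(r, 3) for r in res_26m])
```

For the first set all three residuals vanish to within rounding, while the second set leaves about 9.9 ns on the 1-2-3 loop. A non-zero residual of this kind can arise when each pair is estimated independently, and it gives a rough indication of the measurement error present in a delay set.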

CONCLUSION
The present work proposes an MDNNM-TNVMM for accurate PD localisation. The important conclusions drawn from the multifaceted study are as follows:
1. Applying the statistical simulation-based VMM during the training phase, instead of during the testing phase, results in improved and accurate PD localisation.
2. MDNNM-TNVMM shows its robustness and efficiency for large simulated error values when it is compared with a multi-DNN approach and with another approach based on a multi-DNN using a VMM during its testing, that is, MDNNM-VMM. For a measurement error of 24 ns, the proposed approach reduces the averaged errors Δr, Δθ, ΔФ, and Δd by 42.14%, 29.36%, 42.12%, and 54.63%, respectively, in comparison with the multi-DNN approach. Similarly, in comparison with MDNNM-VMM, the proposed approach reduces Δr, Δθ, ΔФ, and Δd by 21.55%, 26.45%, 20.05%, and 29.32%, respectively.
3. With an antenna array size of 18 m, for 500 PD test cases and a simulated error of up to 24 ns, the averaged errors between the test and predicted PD sources for the coordinates (r, θ, Ф) have been found to be approximately within (5 m, 6°, 8°), respectively. The corresponding distance Δd for this case is within a 5 m range.
4. The Y-shaped antenna array design has been found to be the most suitable choice for PD localisation, and the optimal size for this antenna array has been found to be R = 30 m when both accuracy and cost-effectiveness are considered. However, the prediction error reduces for all the PD coordinates when the size of the array is increased.
5. The experimental results show that the virtual measurement-based localisation algorithm can achieve a location accuracy of 1°, within a range of 30 m, for an estimated time difference value of up to 10 ns.

FIGURE 24 Antenna arrangement for the PD localisation experiment