Failure investigation and asset management of combined measuring instrument transformers

Réseau de Transport d'Électricité

Abstract

Asset management of instrument transformers (ITs) has gained momentum in recent years due to their large population in the network and the potential adverse impacts of their increasing number of failures. This study investigates the failure cause of a family of combined measuring ITs via temperature profile simulations and consideration of moisture migration behaviour in an oil‐impregnated paper insulation system. The temperature difference experienced by an IT, particularly throughout summer, together with a relative saturation hysteresis phenomenon could have caused a lower dielectric strength that led to failures. The study also offers insights into factors influencing the probability of failures through analysis of the lifetime data using both graphical survival function plots and a statistical Cox model. In‐service age, operating voltage level and environment appear to have an influence on failure rate.


| INTRODUCTION
Instrument transformers (ITs) function by providing proportional secondary voltage or current signals or both that are in turn essential for protection, metering and control of electricity [1]. In recent years, concerns have been raised on the lack of focus on long term asset management of ITs in spite of the large IT populations that are in-service [1][2][3][4].
This issue is particularly pressing because IT failures can cause not only operational errors but also more severe outcomes, for instance explosions of porcelain-enclosed units that can damage surrounding equipment and harm personnel, as well as a loss of electricity supply [1,5]. Furthermore, there has been a notable number of IT failures in recent years [1,2,4,5], prompting more attention on proactive asset management of ITs [1][2][3][4].
Traditionally, condition monitoring and asset management techniques can be categorised into bottom-up and top-down approaches [6,7].
The bottom-up approach is centred on condition assessment of individual units through regular measurements and tests for condition indicators with subsequent quantification or categorisation of its condition (health index) acting as the cornerstone for asset management decisions [7].
Pertaining to that, the application of machine learning algorithms has in recent years complemented and supplemented expert knowledge, physical understanding of degradation processes and post-mortem experiences in aiding the decision of whether to run, repair, refurbish or replace specific individual units [7][8][9][10][11][12][13].
The top-down approach, on the other hand, focuses on understanding the generic characteristics of a given population of units for projection and planning purposes [6][7][8][14]. If sufficient condition indicative data are available, they can be used to indirectly identify potential failures which can then be used as a part of lifetime data and subsequent population lifetime modelling [9,10]. Through application of statistical distributions, such as the widely used Weibull distribution, lifetime modelling can then offer a prediction of future failures or necessary replacements [8,11,12].
Condition data gathering (such as oil moisture and acidity measurements) has historically been absent or less developed for ITs if compared with power transformers. This is particularly the case for the ITs to be discussed in this study. Nevertheless, simulation studies can be performed to inform generic population characteristics which can then be used to guide a top-down approach to the management of ITs. A reasonably larger number of historical IT failures (as ITs are monophasic), if compared with power transformers, can also facilitate a top-down approach through lifetime data modelling.
In this study, ITs of interest are combined measuring units, capable of measuring both voltage and current within the same unit. They are made by the same manufacturer and of the same design family, ensuring data homogeneity. They are rated at 72 and 100 kV, but operated at 63 and 90 kV respectively in the Réseau de Transport d'Electricité (RTE) network. These units are also categorised by F, P and W pollution indices, where F units are generally used in the least polluted environment and W units in the most polluted environment (e.g. with greater moisture or saline pollution).
This study aims to align simulation studies, literature review on insulation system moisture dynamics and statistical lifetime data modelling for identifying the probable causes and influencing factors for IT failures. Asset management suggestions will also be discussed.

| DATABASE FEATURE
The study is conducted on a family of paper insulated, mineral oil filled combined measuring ITs. A total of 1600 unique entries (updated as at 31 December 2017) were recorded; out of which 153 are failures (the remaining 1447 are survivals). Figure 1 depicts the in-service age distributions of the 1600 units. For failures, in-service age is the actual lifetime, that is, difference between failure date and manufacture date. As for survivals, in-service age is the difference between 31 December 2017 and manufacture date (right-censored data). It can be seen from Figure 1 that most units (both survivals and failures) are distributed around the age of 21 years.
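The in-service age definition above can be sketched as follows. This is a minimal illustration, assuming each record holds a manufacture date and an optional failure date; the function name and the two sample records are hypothetical and are not taken from the study database.

```python
from datetime import date

CENSOR_DATE = date(2017, 12, 31)  # database cut-off date stated in the text


def in_service_age_years(manufactured, failed=None, censor_date=CENSOR_DATE):
    """Return (age_in_years, event).

    event is 1 for a failure (age is the actual lifetime) and
    0 for a survival (age is right-censored at the cut-off date).
    """
    end = failed if failed is not None else censor_date
    age = (end - manufactured).days / 365.25
    return age, int(failed is not None)


# Hypothetical records for illustration only.
records = [
    (date(1995, 6, 1), date(2016, 7, 31)),  # unit that failed
    (date(1996, 12, 31), None),             # unit still in service at cut-off
]
ages = [in_service_age_years(m, f) for m, f in records]
```

The (age, event) pairs produced this way are the usual input format for right-censored lifetime analysis.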

| Temperature profile simulation
Besides premature failures due to faults, transformer life expectancy is generally determined by the temperature of the hottest point in the winding, commonly deemed as hot spot temperature [13]. To understand the temperature distribution in a transformer, computational fluid dynamics modelling was performed through COMSOL using a time-dependent nonisothermal flow study. Figure 2 illustrates the model built in COMSOL. Due to confidentiality issues, the exact design is not to be disclosed. Nevertheless, in general, the combined measuring units to be studied are composed of a current transformer (CT) at the top and a magnetic voltage transformer (VT) at the bottom.
The VT is rated at 150 VA of class 0.5 (measurement) and 3P (protection), thermal limiting output of 700 VA with a transformation ratio of 63,000/√3 to 100/√3 V for the 63 kV unit and 90,000/√3 to 100/√3 V for the 90 kV unit [15,16]. As for the CT, it is rated at 50 VA of class 0.5 (measurement) and 5P 20 (protection) with a transformation ratio of 500 to 5 [15,16]. Note that the only difference in terms of the nameplate rating between the 63 and 90 kV units is the VT transformation ratio.
Heat generation and heat dissipation are key aspects in determining the temperature profile. Table 1 shows the COMSOL simulation inputs evaluated based on the manufacturer documents and literature [3][15][16][17][18][19]. The values reflect estimations based on extreme conditions. For ease of simulation, only the oil domain is considered (solid parts removed with heat fluxes injected into the oil domain).
For this model, the heat generation consists of copper loss, iron (core) loss and dielectric loss. CT copper losses were evaluated based on a primary current of 500 A though the current observed throughout a year is about 300 A. VT copper losses were evaluated based on the rated burden. On the other hand for core losses, they were evaluated based on knowing the relationship between operating magnetic flux densities (0.2 T for CTs and 1.2 T for VTs) and the corresponding core losses [18,19]. As for dielectric losses, a dielectric dissipation factor (tangent delta) of 2.9% was used to represent aged insulation.
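To give a feel for how conservative the 500 A assumption is, the short sketch below scales the copper loss with the square of the current. It assumes a constant winding resistance (the temperature dependence of resistance is ignored), and the function name is illustrative.

```python
def copper_loss_scale(actual_current_a, reference_current_a):
    """Ratio of I^2 R loss at the actual current to that at the reference
    current, assuming the same winding resistance in both cases."""
    return (actual_current_a / reference_current_a) ** 2


# Typical observed current (about 300 A) versus the simulated 500 A.
scale = copper_loss_scale(300.0, 500.0)  # 0.36
```

Under this assumption, the CT copper loss at the typically observed current is roughly 36% of the value used in the simulation.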
Although not a heat generating source, solar radiation will also be considered as it adds to the temperature rise of the model. Heat fluxes of 0 W/m², 500 W/m² and 1000 W/m² were used to represent night time, cloudy/early morning/evening and clear sunny conditions respectively [20]. These heat fluxes were assigned either to all external surfaces (except the base) or to a vertical half of the surfaces, the latter crudely representing the incidence angle or the presence of sun sheds.
In terms of heat dissipation, it will be facilitated by the oil natural convection or circulation within the unit itself (due to buoyancy forces) and will also include convection and radiation from the unit to the surrounding environment. Figure 3 shows the temperature profiles of a 90 kV, F pollution index unit. Ambient temperature used is 20°C. Without solar radiation (Figure 3a, night time), the hot spot temperature due to copper, iron and dielectric losses could be around 28°C. On the contrary, with the maximum solar radiation (Figure 3b, clear sunny day) on all surfaces, the hot spot temperature could be about 66.8°C. Table 2 summarises the simulated maximum temperatures across voltage level, environment and solar radiation variations. Similar observations are made for the different pollution indices (F: least polluted, W: most polluted environment). The maximum difference in temperatures simulated is 1.2°C. The arrangement of porcelain fins (number and thickness) improves the electrical creepage distance for usage in different environments, but it might not noticeably affect the thermal performance.
As for the voltage level of 63 and 90 kV, similar findings are also observed as the max difference in temperatures simulated is 1.7°C. This small difference seen is most likely due to the height of the porcelain portion where 63 kV units are 200 mm shorter than the 90 kV ones. It is clear the intensity and surface area under solar radiation play a major role in causing the variations in Table 2.
More interestingly, even though the absolute temperature of the transformer could reach around 68.5°C, this temperature could still be insufficient to cause severe thermal ageing of the paper insulation, which is known to be one of the crucial lifetime limiting factors.
Based on a hot spot temperature reference of 98°C for non-thermally upgraded paper and with every 6°C increase doubling the ageing rate [21], Equation (1) shows that relative ageing rate, k at 68.5°C is indeed low.
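Equation (1) is not reproduced in this text. The sketch below assumes the form implied by the stated reference temperature and doubling rule, that is, k = 2^((T − 98)/6) with T in °C; the function and parameter names are illustrative.

```python
def relative_ageing_rate(hot_spot_c, reference_c=98.0, doubling_step_c=6.0):
    """Relative ageing rate k, assuming the rate doubles for every
    `doubling_step_c` degrees above the reference hot spot temperature."""
    return 2.0 ** ((hot_spot_c - reference_c) / doubling_step_c)


k = relative_ageing_rate(68.5)  # roughly 0.033
```

Under this assumed form, insulation at 68.5°C would age at roughly one thirtieth of the reference rate, which is consistent with the low ageing rate described above.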
It is acknowledged that different conditions can still change the ageing rate, but by evaluating Equation (1), the estimated lifetime should be long given the low ageing rate (≪ 1). This implies that insulation thermal ageing is not the cause of failure based on the hot spot temperature most likely experienced by the IT.

| Distribution of time of failure
Figure 4 shows the distribution of the failures with respect to time of failure, categorised into summer months and other months of the year for the entire population. It can be seen that most of the failures occurred in the evening, at night or early morning, particularly during summer. Note that the same pattern of the distribution of the time of failure can also be observed for subpopulations categorised based on operating voltage levels and pollution indices.
Knowing that during summer, ambient temperature can be high and with more sunlight exposure, the temperature difference experienced by a transformer throughout 24 h of a day could be influential. Figure 5 illustrates the 24-h profile of ambient temperature in blue, on a typical summer day (31 July) as aggregated from 1989 to 2015 from various weather stations.
Taking the temperature rises from the simulation results in Table 2 and focussing on the highest rise possible (case of 63 kV, W, all surfaces), the red curve in Figure 5 estimates an indicative maximum oil temperature profile. This is done by first taking the difference between the maximum temperatures and the ambient temperature of 20°C in Table 2, before adding it to the actual aggregated ambient temperature.
It is assumed that at the times of 10:00 AM, 11:00 AM, 7:00 PM and 8:00 PM, the solar radiation is at 500 W/m². On the other hand, 1000 W/m² of solar radiation is assumed for the duration between 12:00 PM and 6:00 PM. Other than those two durations, solar radiation is assumed to be 0 W/m².
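The construction of the indicative oil temperature profile can be sketched as below. The hourly solar schedule follows the assumptions just stated; the ambient profile and the temperature rises per flux level are hypothetical placeholders, not the aggregated weather data or the Table 2 values.

```python
def solar_flux_w_per_m2(hour):
    """Assumed solar heat flux for an hour of day (0-23), per the stated schedule."""
    if 12 <= hour <= 18:
        return 1000.0
    if hour in (10, 11, 19, 20):
        return 500.0
    return 0.0


def indicative_oil_temperature(ambient_by_hour, rise_by_flux):
    """Add the simulated rise (maximum temperature minus the 20 degC reference
    ambient) for each hour's solar flux to that hour's ambient temperature."""
    return [ambient_by_hour[h] + rise_by_flux[solar_flux_w_per_m2(h)]
            for h in range(24)]


# Hypothetical inputs for illustration only.
ambient = [17.0 if h < 8 or h > 21 else 26.0 for h in range(24)]
rise = {0.0: 8.0, 500.0: 28.0, 1000.0: 48.5}  # placeholder rises in degC
oil = indicative_oil_temperature(ambient, rise)
day_night_swing = max(oil) - min(oil)
```

With real inputs, `day_night_swing` would correspond to the gap between the peak and the trough of the red curve in Figure 5.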
It is clear a big temperature difference exists throughout the 24 h. Note that 500 A (maximum current through the CT) was used in the simulations for estimating the extreme CT copper losses. Besides a peak-valley feature of a typical summer's day load curve [22,23], the evening/night/early morning demand in France (RTE network) in summer is likely to be smaller. This means the left and right regions of the red curve in Figure 5 are most likely lower in reality. Thus, the day/night temperature difference in a typical summer's day is most likely even greater.

| Relative saturation hysteresis
With temperature known to affect oil properties, the temperature difference could culminate in a hysteresis phenomenon affecting the relative saturation (RS) of the oil. This RS hysteresis phenomenon has been documented in [24,25] and will be discussed by referring to Figure 6.
At any oil temperature, T_Oil (in kelvin), the moisture saturation of the oil, M_Saturation, is expressed by Equation (2). RS is simply the ratio of the absolute moisture in oil, M_Oil, to the moisture saturation, M_Saturation, as seen in Equation (3).
At a certain temperature, moisture saturation will always be the same (given the same ageing condition and the type of oil), but the RS could be different as the absolute moisture in oil could vary. This causes the hysteresis phenomenon as noted in [24][25][26].
When the oil temperature increases, moisture residing in paper migrates more into the oil [27], thereby increasing the absolute moisture in oil. Nevertheless, because of the increasing moisture saturation with the higher oil temperature, the RS of the oil can still be low. Generally, this situation is analogous to a transformer during day time operation. Depending on the rate of moisture migration into the oil and the increase in moisture saturation of the oil, the RS of the oil can even drop if the increase in oil moisture saturation is greater. Interestingly, when oil temperature then decreases (night time operation), the moisture in oil is unable to migrate back into the paper as quickly as the reverse process. This, together with the decrease in oil moisture saturation due to a lower temperature, will cause the oil RS to be high.
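The temperature effect on RS can be illustrated numerically. Equations (2) and (3) are not reproduced in this text, so the sketch below assumes the commonly used form log10(M_Saturation) = A − B/T_Oil with coefficients often quoted for mineral oil (A = 7.0895, B = 1567); these coefficients and the 20 ppm moisture content are assumptions for illustration, not values from this study.

```python
def moisture_saturation_ppm(t_oil_k, a=7.0895, b=1567.0):
    """Moisture saturation of oil in ppm at temperature t_oil_k (kelvin).
    The coefficients a and b are assumed, commonly quoted mineral oil values."""
    return 10.0 ** (a - b / t_oil_k)


def relative_saturation_pct(m_oil_ppm, t_oil_k):
    """RS as the ratio of absolute moisture in oil to moisture saturation, in %."""
    return 100.0 * m_oil_ppm / moisture_saturation_ppm(t_oil_k)


# Same hypothetical absolute moisture (20 ppm) at day and night oil temperatures.
rs_day = relative_saturation_pct(20.0, 273.15 + 60.0)
rs_night = relative_saturation_pct(20.0, 273.15 + 20.0)
```

Under these assumptions, the same absolute moisture gives a several times higher RS at 20°C than at 60°C. The sketch holds M_Oil fixed, so it only captures the saturation side of the hysteresis; the lag in moisture returning to the paper is what keeps M_Oil elevated at night.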
Such a hysteresis process progressively leads to higher absolute moisture and a higher RS in oil particularly after a sustained period (a few days) of high temperature difference [24,25]. The resulting high oil RS implies a lower oil breakdown voltage (lower dielectric strength) [28], that is, higher tendency of dielectric failure. This temperature dependent RS hysteresis is thus most likely the reason behind the failures.

| Potential source of moisture
The RS hysteresis mentioned requires moisture to be present in the system. It is true that oil-paper insulation ageing can contribute to moisture, but judging by the temperature profiles simulated based on the worst case condition in Section 3.1, oil-paper insulation ageing is unlikely to be the dominant contributor.
Moisture could actually be due to an ingress from surrounding environment, facilitated by poorer sealing performance of rubber or thermoplastic polymeric materials [3]. A reduction in sealing performance could in turn be attributed to chemical and physical ageing of these polymeric materials [29][30][31].
Chemical ageing of polymers involves chain scission and cross linking of polymeric molecules [29,30]. The ageing mechanisms include oxidation (thermal and photo oxidations) as well as hydrolysis, with temperature playing a role in both the activation and acceleration of the ageing mechanisms [29,30].
Physical ageing occurs below the glass transition temperature of a polymer [31]. This ageing mechanism targets the amorphous portion of the material and happens due to the free volume present, culminating in a gradual volume contraction of the material to tend towards its equilibrium free volume [31,32]. Note physical ageing does not change the chemical structure of the material [31,32].
Both chemical and physical ageing mechanisms render a polymeric material to be more glass-like (stiffer, more brittle and with decreased damping) [31,32].

| ASSET MANAGEMENT IMPLICATIONS
In order to prevent potentially large number of unexpected failures, proactive asset management needs to be considered. Unlike condition assessment for power transformers, the lack of parameters (such as moisture measurement) that are directly linked to the state of the ITs has prompted the need for an alternative. Thus, potential indirect influencing factors will be used to aid decisions on proactive replacement. The indirect influencing factors available for the family of combined measuring ITs being studied are in-service age, operating voltage level and environment.
As indicated from Section 3.4, in-service age plays a role as over time chemical and physical ageing processes of sealing materials reduce the transformer lifetime. On the other hand, operating voltage level provides a proxy towards the level of dielectric stresses experienced by a transformer. As for environment, it provides information on the potential severity of ageing or likelihood of moisture ingress.
Note that all combined measuring ITs in this study are from the same design family of the same manufacturer. They are rated at 72 and 100 kV, but operated at 63 and 90 kV respectively in the network. These units are also categorised by F, P and W pollution indices, which are used to represent the environmental factor. Table 3 shows the number of survivals (in Su) and failures (in Fa) corresponding to subpopulations categorised by operating voltage level and environment. These additional fields of information will be used in studies investigating the influence of factors on failure probability.

| Survival function plots
With the lifetime data, survival functions will be plotted to study the reliability of these combined measuring ITs. A nonparametric, Kaplan-Meier (KM) approach will be used. This KM estimate or product limit estimate is often used for estimating the survival (reliability) function when dealing with right-censored lifetime data. Furthermore, it can be used without the need for specifying an underlying probability distribution [33]. Equation (4) shows the KM survival function, S_KM [33],
where t is time (transformer in-service age), n_i refers to the number of survivals just prior to (at the start of) a certain time t_i and d_i denotes the number of failures at that time t_i. On examining Equation (4), the KM survival function is essentially the product of the proportions of units surviving each time period t_i, having survived the preceding time periods leading up to that time period t_i [34].

The corresponding pointwise 95% confidence interval (CI) of the KM survival function, S_KM, which is based on a statistical significance level, α, of 5% or 0.05 (confidence level 1 − α), is shown in Equation (5) [33], where z_(1−α/2) represents the 100(1 − α/2) percentage point of the standard normal distribution and is equal to 1.96 for α of 5% or 0.05. The variance of the KM survival function, V_SKM, is evaluated by Greenwood's formula [33] as in Equation (6).

Figure 7 illustrates the KM survival function plots and 95% CIs plotted with transformer in-service age, done using MATLAB. Note that the KM plots for the whole population are shown in black in all subfigures.
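The KM estimate and its Greenwood-based interval can be sketched in a few lines. The study used MATLAB; the Python below is only an illustrative equivalent. It assumes the standard product-limit form (the running product of (n_i − d_i)/n_i), Greenwood's variance (S_KM² times the running sum of d_i/(n_i(n_i − d_i))) and a plain normal-approximation interval; the source does not show which interval form was used, and the clipping to [0, 1] and the sample data are additions for illustration.

```python
import math


def kaplan_meier(ages, events, z=1.96):
    """Return a list of (t_i, S_KM, lower, upper) at each distinct failure age.

    ages   -- in-service ages (failure age, or censoring age for survivals)
    events -- 1 for a failure, 0 for a right-censored survival
    """
    failure_times = sorted({t for t, e in zip(ages, events) if e == 1})
    survival, greenwood_sum, rows = 1.0, 0.0, []
    for t_i in failure_times:
        n_i = sum(1 for t in ages if t >= t_i)  # units at risk just before t_i
        d_i = sum(1 for t, e in zip(ages, events) if t == t_i and e == 1)
        survival *= (n_i - d_i) / n_i
        if n_i > d_i:  # the Greenwood term is undefined once nobody survives
            greenwood_sum += d_i / (n_i * (n_i - d_i))
        half_width = z * math.sqrt(survival ** 2 * greenwood_sum)
        rows.append((t_i, survival,
                     max(0.0, survival - half_width),
                     min(1.0, survival + half_width)))
    return rows


# Hypothetical ages (years) and event flags for illustration only.
rows = kaplan_meier([5, 8, 8, 12, 15, 20], [1, 0, 1, 1, 0, 0])
```

Each row gives a step of the survival curve; censored units lower n_i at later times without producing a step themselves.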
The whole population can be partitioned into 90 and 63 kV subpopulations (operating voltage level) with Figure 7a depicting the corresponding KM survival function plots. On the other hand, Figure 7b shows the plots for F, P and W subpopulations (environment).
Lastly, the population can also be further divided based on both operating voltage level and environment to account for potential interaction (confounding effect) [35]. Figure 7c shows the plots for these stratified subpopulations.
In general, the reliability decreases with in-service age, indicating an ageing influence on the probability of failures. This is seen not just for the KM survival function plot for the whole population but for all subpopulations too.
An interesting observation from Figure 7a is that the reliability of the 63 kV units appears to be higher and to decrease more slowly if compared with that of the 90 kV units. As a supplement, the log-rank hypothesis test yielded a chi-square statistic of 104.2499, with a p-value of 1.7835 × 10⁻²⁴ (below the statistical significance level of 0.05), meaning the two subpopulations do statistically exhibit different survival functions.
In addition, the graphical plots in Figure 7b along with the log-rank hypothesis test results indicate that while generally the different subpopulations show a decreasing reliability with in-service age, the decrease is more pronounced and faster for units in more polluted areas (W > P > F). For instance, units located in coastal areas have been found to be more vulnerable to failures. The distinctions within operating voltage level subpopulations and within environment subpopulations are also observed in Figure 7c. The log-rank test statistic for the voltage level stratified subpopulations is 75.8216, resulting in a p-value of 3.4320 × 10⁻¹⁷ (<0.05), which indicates the 90 and 63 kV subpopulations drawn from the same environment stratum are different. Similarly, for environment stratified subpopulations, the log-rank test statistic is 35.5947, resulting in a p-value of 2.4295 × 10⁻⁹ (<0.05), again indicating that the F, P and W subpopulations drawn from the same voltage level stratum are different.
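For readers who wish to reproduce this kind of comparison, a two-sample log-rank test can be sketched as below. It is a minimal illustration using the standard hypergeometric variance and a chi-square reference distribution with 1 degree of freedom (assumed here for the two-group voltage comparison); the toy data are hypothetical and the tool used by the authors for this test is not stated.

```python
import math


def chi_square_p_value_1dof(chi_square):
    """Upper-tail p-value of a chi-square statistic with 1 degree of freedom."""
    return math.erfc(math.sqrt(chi_square / 2.0))


def logrank_two_groups(ages_a, events_a, ages_b, events_b):
    """Two-sample log-rank test; returns (chi_square, p_value).

    Requires at least one failure time at which both groups are still at risk.
    """
    pooled = ([(t, e, 0) for t, e in zip(ages_a, events_a)]
              + [(t, e, 1) for t, e in zip(ages_b, events_b)])
    failure_times = sorted({t for t, e, _ in pooled if e == 1})
    observed_minus_expected, variance = 0.0, 0.0
    for t_i in failure_times:
        n = sum(1 for t, _, _ in pooled if t >= t_i)              # at risk, both groups
        n_a = sum(1 for t, _, g in pooled if g == 0 and t >= t_i)  # at risk, group A
        d = sum(1 for t, e, _ in pooled if e == 1 and t == t_i)    # failures, both groups
        d_a = sum(1 for t, e, g in pooled if g == 0 and e == 1 and t == t_i)
        observed_minus_expected += d_a - d * n_a / n
        if n > 1:
            variance += d * (n_a / n) * (1 - n_a / n) * (n - d) / (n - 1)
    chi_square = observed_minus_expected ** 2 / variance
    return chi_square, chi_square_p_value_1dof(chi_square)


# Hypothetical toy data: group A fails early, group B fails late.
chi2, p = logrank_two_groups([1, 2, 3], [1, 1, 1], [4, 5, 6], [1, 1, 1])

# p-value implied by the reported statistic for the 63 kV versus 90 kV comparison.
p_reported_stat = chi_square_p_value_1dof(104.2499)
```

Evaluating `chi_square_p_value_1dof(104.2499)` should give a value on the order of 10⁻²⁴, in line with the p-value quoted above.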
The graphical and statistical observations suggest no potential interaction exists between the two factors (operating voltage and pollution index). More importantly, all three subfigures in Figure 7 indicate the influence of age, operating voltage level and environment on failures of this family of combined measuring transformers.

| Cox model analysis
Prior analyses have revealed an influence of age, operating voltage level and environment on failures. These analyses performed on the whole population, subpopulations of voltage level and environment, plus the stratified subpopulations can be further supplemented with a Cox model approach [36].
Cox model, a semi-parametric model, has the advantage of being capable of incorporating multiple factors of interest (covariates), catering for both continuous and ordinal covariates as well as providing the extra layer of information in the form of quantitative magnitudes of the difference between the survival experiences [34,35].
Since its inception, the Cox model has been widely used, particularly in the medical field, reliability studies and more recently in power systems [6,[37][38][39]. The Cox model is shown in Equation (7) [40,41], where t is time, h is the hazard (failure rate), h_base is the baseline hazard and X are matrices of covariates that can be time-independent, X (p × 1), and time-dependent, X(t) (q × 1). Also, β (1 × p) and γ (1 × q) are regression parameters, whereas p and q are the number of time-independent and time-dependent covariates respectively. In this study, for the purpose of identifying the factors (covariates) that contribute to failures, only time-independent covariates will be considered (constant hazard ratios or same rates of the changes in hazards). In addition, the focus is on the relative importance of covariates on the hazard and hence h_base can be neglected. This baseline hazard in any case cancels out when evaluating the partial likelihood (PL) used for assessing the effects of the covariates. This PL is shown in Equation (8), where it is computed based on a series of data points (both failures and right-censored survivals) over n distinct time periods [40,41].
From Equation (8), the regression parameters, β, are then obtained through the maximum likelihood estimation (MLE) approach [42], which can be more easily accomplished through expressing Equation (8) in its natural logarithmic form. All these will be accomplished through the use of a specialised statistical program, SPSS.
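Equations (7) and (8) are not reproduced in this text. For reference, the standard forms consistent with the symbols defined above would read as follows; the risk-set notation R(t_i), meaning the units still in service just before time t_i, and the per-unit covariate subscripts are assumptions introduced here, and ties are ignored in this form of the PL.

```latex
% Cox proportional hazards model (assumed standard form of Equation (7))
h(t) = h_{base}(t)\,\exp\bigl(\beta X + \gamma X(t)\bigr)

% Partial likelihood with time-independent covariates only
% (assumed standard form of Equation (8)); i runs over the n failure times
PL(\beta) = \prod_{i=1}^{n}
    \frac{\exp(\beta X_i)}{\sum_{j \in R(t_i)} \exp(\beta X_j)}
```

Because h_base(t_i) appears in both the numerator and every term of the denominator, it cancels out, which is why the baseline hazard does not need to be specified to estimate β.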
To also quantify the influence of in-service age (not just voltage level and environment), the time input for the Cox model approach will be calendar year. Table 4 shows the essence of Cox model results, with in-service age as a continuous factor (covariate) and both voltage level and environment as ordinal covariates.
As all the p-values are less than 0.05 (standard significance level), all the covariates have a statistically significant effect on the hazard. The β values, on the other hand, are more informative if interpreted as Exp(β), but the sign of a β value quickly indicates either an increasing (positive sign) or decreasing (negative sign) effect of a particular level of a factor on the hazard.
For the interpretation of Exp(β), note that for in-service age, which is taken as a continuous covariate, Exp(β) is the predicted multiplicative change in the hazard per unit increase in in-service age (the hazard is multiplied by 1.0888, that is, about 8.9% higher, for each additional year of in-service age).
As for the ordinal covariates, the results are expressed with respect to a reference group. For instance, the hazard of 63 kV units is 0.3521 times that of 90 kV units. This implies 90 kV units have a higher hazard. Similarly, units with pollution indices of P and W have hazards that are 2.0880 times and 6.1472 times, respectively, that of F units.
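As a worked illustration of how these hazard ratios combine, the sketch below multiplies the quoted Exp(β) values. It assumes proportional hazards with no interaction terms, so that the ratios are multiplicative; the function name and the example comparisons are illustrative.

```python
import math

# Exp(beta) values quoted in the text.
HR_PER_YEAR = 1.0888
HR_63KV_VS_90KV = 0.3521
HR_POLLUTION_VS_F = {"F": 1.0, "P": 2.0880, "W": 6.1472}


def relative_hazard(age_difference_years, is_63kv, pollution_index):
    """Hazard relative to a 90 kV, F-index unit of the reference age,
    assuming the covariate effects multiply (no interaction terms)."""
    ratio = HR_PER_YEAR ** age_difference_years
    if is_63kv:
        ratio *= HR_63KV_VS_90KV
    return ratio * HR_POLLUTION_VS_F[pollution_index]


beta_age = math.log(HR_PER_YEAR)                    # about 0.085 per year
ten_years_older = relative_hazard(10, False, "F")   # about 2.34
w_90_vs_f_63 = (relative_hazard(0, False, "W")
                / relative_hazard(0, True, "F"))    # about 17.5
```

Under these assumptions, a unit ten years older has a little over twice the hazard, and a 90 kV unit in a W environment has a hazard roughly 17 times that of a 63 kV unit of the same age in an F environment.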
In brief, besides quantifying the influence of factors on failure rate, Cox model results match with the prior graphical survival function plots. A decreasing reliability is observed for units that have been in-service longer, operating at a higher voltage level and in a more challenging environment (high moisture or saline pollution).

| Asset management discussions
In light of the lack of direct condition assessment information on this family of combined measuring ITs, it is recommended that proactive replacement target units that are older, operating at a higher voltage level and present in harsher environments.
Considering the influence of in-service age on reliability seen from the survival plots and the Cox model, ageing of polymeric sealing materials could be more dominant than the ageing of oil-paper insulation, as discussed in Sections 3.1 and 3.4. The embrittlement of the polymeric materials over time could lead to more probable moisture ingress.
To offer insights into when proactive replacement should be considered, Figure 8 shows hazard function plots. As observed from Figure 8, with an increasing number of failures over the years, the uncertainty of the resulting hazard function reduces, hence allowing a better forecast. More importantly, the wear-out stage of a 'bathtub curve' becomes progressively more evident. These findings aid in estimating a useful life for this family of combined measuring units, which could be about 22 years considering the knee point of the hazard function plots.
Particularly in an environment with higher levels of pollution, the ageing could be worse [29]. In addition, with the units located in regions with greater pollution (e.g. closer to the coast), there could be greater moisture ingress. This moisture ingress, together with the cyclic temperature difference experienced, could then result in relative saturation hysteresis (described in Section 3.3) that reduces the oil dielectric strength, thereby increasing the probability of failures. This could be the reason why environment (F, P and W pollution indices) has an influence on failures.
With a reduction in oil dielectric strength, a unit that is operating at a higher voltage could be more susceptible to failure, particularly if the design stress of the higher voltage unit is higher. This could be the reason behind the influence of operating voltage level (90 and 63 kV) on failures.

| CONCLUSION
Through simulation of temperature profiles and consideration of moisture behaviour in an oil-paper insulation system, the investigation on a family of combined measuring ITs indicated that the failures of such units are most likely attributed to a relative saturation hysteresis phenomenon. This is caused firstly by the presence of moisture which can be more pronounced for units in high moisture/polluted environment and for aged transformers where aged polymeric thermoplastic materials facilitate more moisture ingress. The hysteresis phenomenon is secondly influenced by the temperature difference experienced by a transformer throughout 24 h, especially in summer (high ambient temperature in general). Typically at night, the relative saturation of the oil could reach high levels that lower the dielectric strength of the oil. This culminates in an increased probability of dielectric failure, particularly for units operating at higher voltages.
In the absence of parameters that are directly linked to the condition (e.g. moisture level), lifetime data were analysed through survival function plots and statistical Cox model analysis to assess the influence of indirect influencing factors on failures. Analysis revealed the influence of age, environment and operating voltage level.
It is recommended that proactive replacement of this family of combined measuring ITs can be done by prioritising older units (in-service greater than 22 years) that are operating at higher voltages and in a harsher/more polluted environment.