Parameter identification and uncertainty quantification of a non‐linear pump‐turbine governing system based on the differential evolution adaptive Metropolis algorithm

Correspondence Tian Peng, College of Automation, Huaiyin Institute of Technology, Huaian 223003, China. Email: husthydropt@126.com Abstract Pumped storage units (PSUs) are now widely used for energy storage. However, the uncertainty of the identification results of the pump-turbine governing system (PTGS) caused by the random observation noises and the lack of prior knowledge remains an unaddressed issue. In recent years, the differential evolution adaptive Metropolis algorithm (DREAM) based on the Bayesian theory has been extensively used for parameter estimation and uncertainty analysis, but its application to the uncertainty analysis of PTGS has been rare. This study systematically evaluates the applicability of DREAM in the parameter identification and uncertainty quantification of PTGS. A real PSU in China has been employed as a case study for numerical experiments. Four groups of control experiments with different proportions of observation noises and different prior search spaces have been constructed in this study. It can be concluded from this study that: (a) accurate point identification results and effective uncertainty quantification can be obtained simultaneously using DREAM, (b) a lower proportion of observation noises can enhance the efficiency and effectiveness of the DREAM algorithm when applying to PTGS simulation. The DREAM method can narrow the prior search space of PTGS parameters effectively and thus help the hydropower engineers to make decisions.


INTRODUCTION
Pumped storage units (PSUs) are important energy storage technologies that can effectively alleviate the impact of intermittent fluctuating energy sources such as wind and solar on the power system [1,2]. They are widely used to improve the power grid's ability to absorb clean energy and thus solve the safety and stability problem brought by the integration of renewable energy generation systems [3]. With the growing market of clean and renewable energy and the increasing requirements for safe and stable operation of the power grid [4], the construction of PSUs is increasingly prominent [5]. In order to model the behaviours of different modules of PSU, the pump-turbine governing system (PTGS) has a set of parameters. However, it is difficult to obtain the actual value of the PSU parameters directly from on-site measurements, so it is becoming more and more important to identify the PTGS parameters to model the PSU accurately [6].
In the past 20 years, great progress has been made in parameter identification theory for PTGS [7,8]. The identification methods of PTGS is becoming more and more mature. However, it should be pointed out that due to the limitations of the on-site test conditions and researchers' cognition of the inherent complexity and non-linearity of PSUs [9], the uncertainty of the parameter identification results will inevitably occur. A typical flowchart of the parameter identification process is shown in Figure 1. As shown in Figure 1, the parameter identification results of PTGS are affected by the fuzziness of prior knowledge, the randomness of measurement signal noise, the incompleteness of the identification model, and the influences of different identification algorithms, thus the parameter identification of PTGS is likely to be inaccurate or even wrong [2,[10][11][12].
Traditional parameter identification methods and strategies based on optimisation theory belong to single-point estimation from the statistical point of view [13][14][15]  various intelligent evolutionary algorithms have been employed for the parameter identification of PTGS or hydraulic turbinegoverning system (HTGS), such as gravitational search algorithm and its improved versions [10,12,16,17], improved backtracking search algorithm (IBSA) [11], improved ant lion optimisation algorithm [18] and so forth. Only one theoretical optimal solution can be obtained, and the uncertainty of the identification results cannot be evaluated [19]. With the development of the control and stability analysis theory, point parameter identification results can no longer meet the requirements of theoretical research and industrial application. Parametric uncertainty analysis methods that can obtain the probability distribution and confidence interval of the model parameters have become more and more important [20].
A number of approaches have been developed for parametric uncertainty analysis [20][21][22][23], among which the Bayesian theory has been extensively used for parameter estimation and uncertainty analysis in recent years [24,25]. The probability inversion method based on Bayesian theory sets the unknown parameters as random variables, and the probability density distribution function is used to describe the uncertainty of the results. The Markov chain Monte Carlo (MCMC) is widely used to sample the search space of the model parameters randomly, and the probability description of the posterior distribution of the model parameters is obtained according to the likelihood values and the prior information of the parameter values [19].
The traversal efficiency and estimated performance of the MCMC method mainly depend on the way the prior distribution is constructed and the parameter updated strategy of the Markov chain. The differential evolution adaptive metropolis (DREAM) is a fast and efficient MCMC algorithm that combines the multi-chain parallelism idea and the differential evolution (DE) algorithm. It has been widely used in the research of parameter inversion and quantitative uncertainty evaluation in hydrology [26][27][28], geophysics [29,30], agronomy [19] and other fields. Zheng and Han [26] applied the DREAM algorithm to access the uncertainty of the parameters of watershed-scale water quality models. Sheng et al. [19] first used the DREAM method to estimate the parameters of the maize module of the Agricultural Productions Systems sIMulator model (APSIMmaize). Results indicate that the DREAM method performed well in theoretical and real-world evaluation. Zahmatkesh et al. [27] used the DREAM method to investigate the model parameters and input uncertainty of the rainfall-runoff process in urban areas. Results turn out that the developed method can provide reliable tools for rainfall and runoff modelling.
Many studies have applied the DREAM method to the parametric uncertainty analysis of complex non-linear models, but very few of the DREAM analyses addressed the PTGS parameter identification problem. In order to quantitatively evaluate the uncertainty of the parameter identification results caused by the random observation noises and the lack of prior knowledge, a Bayesian uncertainty analysis framework for parameter identification of PTGS is established in this study. To the best of the authors' knowledge, it is the first time to investigate the application of the DREAM algorithm in the parameter identification and uncertainty analysis of PTGS. The DREAM method can not only obtain the point identification results of PTGS parameters but also quantify the uncertainty of the identification results. The posterior probability density function of the parameters to be identified is sampled and traversed using the DREAM method. The posterior probability distribution including all the information of the parameter set is obtained, which can intuitively express the uncertainty of the identification results and thus provide a new idea for the improvement and development of the existing parameter identification strategies for PTGS.
The rest of this study is arranged as follows: Section 2 introduces the theoretical background involved in this study, Section 3 presents the detailed parametric uncertainty analysis process of PTGS based on DREAM, Section 4 gives a real-world case study with experiments and results analysis, and Section 5 presents the conclusions of this study.

Model of the PTGS
PTGS is designed as a modular system with governor, electrohydraulic servomechanism, penstock, pump-turbine and generator modules [11,31]. The governor module outputs the control signals with certain regulation characteristics according to the unit frequency feedback, signal setting and operation instructions. The electrohydraulic servomechanism module is used to amplify the control signal of the governor [32]. The penstock module describes the dynamic process of the penstock pressure and the water head of the pump-turbine. The pump-turbine module is the core module of PSU, which is the key to realise the efficient conversion of hydropower energy. The generator module operates as a generator under the condition of a water turbine and as a motor under the condition of the water pump. The framework of the non-linear simulation model of PTGS is given in Figure 2. PTGS is the terminal actuator and core control system for automatic generation control of PSU. As shown in Figure 2, the generator of PTGS is designed with the given speed x c and the unit speed x as input signals. After the non-linear transformation of the governor module and the amplification of the electrohydraulic servomechanism module, the guide vane opening y is obtained. The guide vane opening y is then sent into the penstock and pump-turbine modules to obtain the mathematical relationship between the unit flow (or moment) and the guide vane opening, rotating speed and working water head. The unit speed is obtained finally after the generator module. The definition of the PTGS parameters regarding different modules are as follows: (a) Governor module: The proportional gain K p , the integral gain K i , the differential gain K d ; (b) electrohydraulic servomechanism module: The main relay T y and the inertial time constant of the main pressure valve T yl ; (c) penstock module: The flow inertia time constant T w , the head-loss coefficient f; the water-hammer pressure wave time constant T r ; (d) generator module: The inertial time constant of the generator T a and the adjusting coefficient e n . The PTGS parameters to be identified include K p , K i , K d , T y , T yl , T w , T a and e n [11].

DREAM
DREAM proposed by Vrugt et al. [33] is an MCMC sampling method with high-solving efficiency. Based on an adaptive random sampling of subspaces, useless chain removal technology, cross and mutation strategies, the DREAM algorithm can quickly converge to a stable distribution state. The DREAM algorithm has good sampling efficiency and estimation accuracy for high-dimensional decision spaces with complex multi-peaks. Therefore, it is widely used for uncertainty quantitative assessment in various fields, such as hydrology, geology, environment and so forth. For a better understanding of the DREAM algorithm, the Bayesian theory and the MCMC method are first introduced in this study, and then the implementation process of DREAM algorithm is introduced.

The Bayesian theory
The Bayesian theory is proposed by British mathematician Thomas Bayes as a statistical inference method and is an important branch of mathematical-statistical theory. The Bayesian theory treats the unknown parameters as random variables and describes their uncertainties in the form of probability distributions. The core of the Bayesian theory is to first calculate the posterior distribution of unknown parameters based on the prior distribution of the parameters when the sample is unknown and the likelihood function of the parameters when the sample is known, and then calculate or infer the unknown parameters through the posterior distribution. Compared with the classical statistical theory methods, the Bayesian inference can make full use of the information of sample sampling and prior distribution to obtain an estimate with a smaller error.
Assuming that x 1 , x 2 , … , x n represents the sample data, the theoretical framework of Bayesian inference is where p( |x 1 , x 2 , … , x n ) represents the posterior distribution of ; l (x 1 , x 2 , … , x n | ) represents the likelihood function of when the sample x 1 , x 2 , … , x n is known; p( ) represents the prior distribution of when the sample is unknown, ∫ p( )p(y| )d represents the normalisation constant.

The MCMC sampling
In most cases, the integral of the denominator of Equation (1) cannot be calculated in a closed form, such that the posterior distribution cannot be solved analytically. Therefore, the Monte Carlo algorithm based on stochastic simulation, such as the MCMC sampling, is usually used to calculate the posterior distribution of model parameters.
Suppose that X 0 , X 1 , X 2 , X 3 , … is a Markov chain in which the variables are randomly generated with Markov property. Given any X t , the variable of the next moment X t +1 is only related to X t and has nothing to do with X 0 , X 1 , X 2 , … , X t −1 . The mathematical expression of the Markov chain can be expressed as follows: where and ′ the states at time t and t + 1, respectively;S represents the state-space of the Markov chain. The Markov chains can be represented by the state-space S and the transition kernel p( * , * ) between the states. Assuming that the Markov chain is in a state at the current moment, and the transfer kernel p( , ′ ) represents the probability of it being in state ′ at the next moment, after a finite number of iterations, the Markov chain finally converges to the stationary distribution ( ) that satisfies the following equation: where ( ) represents the stationary distribution of the transfer kernelp( , ′ ). The MCMC method first generates several parallel Markov chains with a stationary distribution ( ) and then samples the stationary distribution ( ). By updating the sample information iteratively, the Markov chain can effectively search the feasible region of the model parameters and finally converge to the high-probability density region, that is, the maximum posterior estimation. The key to the MCMC method is to construct an appropriate transfer kernel.

Process of the DREAM algorithm
The DREAM algorithm employs the adaptive Metropolis (AM) sampling to construct the transfer kernel of the Markov chain. It combines the Markov chain based on AM sampling with the DE algorithm to form an algorithm with strong global convergence ability [33]. The specific steps of the DREAM algorithm are as follows: Step 1: The posterior probability density of each parameter is initialised. Assuming that the number of unknown parameters to be identified is n, N initial individuals { i , i = 1, 2, … , N } are first generated according to the prior distribution of the unknown parameters randomly. And then the posterior probability density of the N initial individuals is calculated, and the iteration process of the N parallel Markov chains is carried out.
Step 2: Mutation. For the jth Markov chain, mutation operation is performed on each individual of the tth iteration according to Equation (4): where j,t , j = 1, 2, … , N represents the individual of the tth generation of the jth Markov chain; v j,t +1 denotes the mutated individual; e and are two random numbers taken from the n-dimensional uniform distribution U n (−b, b) and the n-dimensional normal distributionN n (0, b * ), respectively, b represents the width less than the target distribution; denotes the number of the Markov chains employed to generate the mutated individuals; r 1 (p), r 2 (q) ∈ {1, 2, … , N } and r 1 (p) ≠ r 2 (q) ≠ j, p = 1, 2, … , , q = 1, 2, … , ; n is the number of parameters to be identified; ( , n) denotes the jump rate and generally taken as 2.38∕ Step 3: Cross. The cross operation of the DREAM algorithm is used to determine whether to update the ith component of the current individual. The process of the cross operation is as follows: where p c denotes the crossover probability, u denotes the random number in [0,1]. The ith component of the current individual is retained ifu ≤ 1 − p c , otherwise the ith component of the current individual is updated.
Step 4: The posterior probability of v i j,t +1 and the reception probability is calculated according to Equation (6): where ( j,t , v j,t +1 ) represents the receiving probability; p(v j,t +1 |y) represents the posterior probability of v j,t +1 ; p( j,t |y) represents the posterior probability of j,t .
Step 5: Determine whether to accept the cross-mutated individual according to the reception probability. If ( j,t , v j,t +1 ) ≥ u, accept the new individual, otherwise let v j,t +1 = j,t and proceed to the next iteration.
Step 6: The useless chains according to the inter-quartilerange (IQR) statistical test is removed [33].
Step 7: If S R < 1.2, the Markov chain converges to the posterior probability distribution and stops iteration; otherwise, steps 2-6 are to be repeated. S R represents the scale-reduction-scoring factor of which the mathematical expression is as follows [34]: where k denotes the number of the iterations, s represents the number of parallel chains used for evaluation, B∕k denotes the variance of the average value of all Markov chains,ū denotes the average value of u i , and W denotes the average value of the variance s i of the s parallel chains. The flowchart of the DREAM method is given in Figure 3.

Performance evaluation indices
After the posterior parameter distributions for each parameter are obtained using the DREAM method, the posterior mean and variance are calculated to evaluate the parameter identification results of PTGS using DREAM. The posterior mean and variance are calculated as follows: where N denotes the number of chains and n denotes the number of maximum iterations, that is, the length of each chain. Moreover, the parameter accuracy, that is, the absolute relative error (ARE) between the identified value (̄) and the system value ( ) is calculated for each parameter [11,19] as

PARAMETRIC UNCERTAINTY ANALYSIS PROCESS OF PTGS BASED ON DREAM
The PTGS is a high-dimensional and strong non-linear system. It cannot be expressed analytically and the posterior probability density distribution of the system parameters cannot be directly derived. For Bayesian parametric uncertainty analysis, the parameters to be identified are considered as a set of random variables, and the MCMC method is used to sample the search space of the model parameters randomly. This study introduces the DREAM algorithm that combines multi-chain parallelism and differential evolution. It has high-sampling efficiency and estimation accuracy. The parametric uncertainty analysis steps based on DREAM for PTGS are as follows: Step 1: The observation data points of the PTGS system are collected. Because of the fuzziness of the prior information of the parameters to be identified, this study selects the uniform distribution on search space of the parameters as the prior distribution.
Step 2: For the n-dimensional parameters set to be identi-fied̂, N initialised Markov chains are generated and the parameter decision space is randomly tranversed. Step 3: The N candidate samples are sent to the non-linear model of PTGS and the model output is recorded, and then the posterior probability of the N candidate samples is calculated.
Step 4: Cross and mutate steps are performed according to Equations (4) to (6), and then whether to accept the sample according to the acceptance probability a and remove the useless chains according to the IQR value are determined.
Step 5: If the convergence criteria are reached, then we proceed to the next step, otherwise we skip to step 3.
Step 6: The calculated Markov chain is saved and the posterior probability density distribution of each parameter is calculated.

EXPERIMENTS AND RESULTS ANALYSIS
To verify the effectiveness of the parametric uncertainty analysis method of PTGS based on DREAM, a real pumped storage unit (PSU) located in the Bailian River pumped storage in China [5] is simulated. The type of the pump-turbine model is HLN-LJ-550 [7]. A group of parameters = [K p , K i , K d , T y , T yl , T w , T a , e n ] is set as the real value of PTGS. The PTGS simulates the real PSU at a 50 s time step. The sampling frequency is set to 100 Hz.

Parameters setting
The main parameters of the DREAM algorithm are set as follows: The number of the parameters to be identified is eight; the number of the Markov chains is 16; the number of max-imum iterations is 1000. The prior distribution of the eight PTGS parameters is set as the uniform distribution on the search space. This study sets the search space of the eight PTGS parameters as [50%, 150%] of the system value. The eight PTGS parameters and their pre-defined prior uniform distributions are given in Table 1. The search space of the eight PTGS parameters is chosen according to the published literatures [11,16]. In order to test the influences of the environmental interference, such as the operation of water machine and water flow, the electromagnetic interference of the generator, and the measurement noise on the parameter identification results of PTGS, different proportions of noises are added to the outputs of the governor [11]. The parameter identification experiments are designed under different signal-to-noise ratio (SNR). The SNR is set as 90 and 20 dB for the parameter identification experiments, respectively. This study chooses 90 and 20 dB as the noise scenarios because they represent two typical noise scenarios, namely, high definition and low definition of noises, respectively.

4.2
Results and discussions

Analysis of scaled-reduction score curves
The influence of the observation data uncertainty on the parameter identification results is analysed in this section. The scaledreduction sore curves of each parameter of the PTGS system using the DREAM method under 90 and 20 dB noise scenarios are illustrated in Figures 4(a) and (b), respectively. Ty stands for T y in the following figures of this study for better exhibition, and so do the other seven symbols.
From Figure 4, it is known that the DREAM method tends to converge to a stationary posterior distribution finally for all the eight PTGS parameters. As shown in Figure 4, it is noted that both the scaled-reduction score curves for the two scenarios have a sharp decline in the first 4000 model evaluations. The scaled-reduction score turns to remain unchanged when it comes to the 6000th model evaluation. The convergence speed of scenario 20 dB is a little faster than that of scenario 90 dB.

Analysis of posterior probability distribution curves
The posterior probability distribution curves of the eight PTGS parameters estimated using DREAM under scenarios 90 and 20 dB are illustrated in Figures 5 and 6, respectively. The red vertical line in the posterior probability distribution curve represents the system value of the parameters. The shapes of the posterior probability distribution curves indicate the degree of uncertainty for the parameter identification results. Flat and wide distributions indicate more uncertainty for the parameters, while sharp and narrow ones indicate less uncertainty [19].
From Figure 5, it can be found that: (a) The marginal distribution intervals are relatively small compared with the predefined search range with uniform distribution, which indicates that the eight PTGS parameters are reasonably well identifiable; (b) the posterior probability distribution curves of the eight parameters are different from each other, which indicates different degrees of parameter uncertainty for different parameters; (c) the posterior probability density distribution of the eight parameters is relatively concentrated. The maximum posterior estimations of the eight parameters are relatively close to the system value, which demonstrates the effectiveness of the DREAM method in the parameter identification of PTGS; (d) the marginal distribution intervals of K p , K i , T a and e n are very close to the system value and generally fluctuates around the system value, which indicates that the uncertainty of the parameter estimation for these parameters using DREAM is relatively small.
From the comparison analysis of Figures 5 and 6, it can be found that the uncertainty degree of the parameter estimation results under scenario 20 dB is much larger than that under 90 dB, which demonstrates that the uncertainty degree will become lager along with the increase in the observation noises, and the width of the parameter marginal distribution interval generally increases accordingly. It is worth noting that the K d , T y and T yl are irregularly distributed, and the posterior probability density distribution is relatively flat under scenario 20 dB, which indicates that different values of the parameters may obtain the same fitness value, that is, the uncertainty of these parameters increases significantly with the increasing of noises. The uncertainty of the identification results of K p and T a is relatively small under scenario 20 dB. The increasing trend of the uncertainty is not obvious along with the increasing of the noises. It should be emphasized that the maximum posterior estimation of K d and T yl deviates greatly from the system value, indicating that the non-linear characteristics of the separate modules of PTGS can limit the parameter identification accuracy of PTGS.
Through the above analysis, it is found that the observation data noise is an important factor that causes the uncertainty of the parameters. It should be pointed out that the above experiments set the search space of the parameters to be identified as [50%, 150%] of the system value. However, it may not be possible to accurately obtain the search space in practical engineering applications. In addition, with the increase of the unit operation life and the degradation of the equipment performance, the parameters may also shift. Therefore, the uncertainty of the prior search space of the parameters is also worth further research and discussion. In order to simulate the influ-ence of the prior knowledge on the parameter uncertainty, this study adjusts the search space of the decision variable to [10%, 300%] of the system value. The noise scenarios are still set to 90 and 20 dB, and the posterior probability distribution curves of PTGS obtained by the DREAM method are shown in Figures 7  and 8 for scenarios 90 and 20 dB, respectively.
Compare Figures 5 with 7, it can be found that with the expansion of the prior search space, the width of the posterior probability density distribution interval of the parameters increases accordingly, and the maximum posterior estimates of K d , T y and T yl gradually deviate from the system value. It can also be found from the comprehensive analysis of Figures 5-8 that the parameter uncertainty degree increases significantly under the dual effects of the expansion of the prior search space and the noise interference of the observed signal. Although the parameter uncertainty increases with the increase of the prior search space and the noise interference of the observed signal, the posterior probability density distribution of the parameters obtained by the DREAM method is still relatively accurate and compact. That is to say, the uncertainty analysis results can narrow the search space of the decision variables. Based on the contracted search space, refined parameter identification of PTGS can be further researched using intelligent algorithms.

Statistical performance evaluation
The posterior mean (mean) and the maximum posterior estimation (max) can be regarded as the identified value of the PTGS parameters. Therefore, five indices that include the max value, the ARE between max and the system value (ARE map ), the mean value, the ARE between mean and the system value (ARE mean ), and the standard deviation (SD) of the posterior  Tables 2 and 3, respectively. To evaluate the performance of the point identification results of the DREAM algorithm, the point identification results of the IBSA algorithm [11] have also been provided in Table 4. By comparing the results of Tables 2 with 4, it can be found that the mean and max of K p , K i , T w , T a and e n are similar to the point identification results based on the IBSA algorithm. However, the identification results of T y , T yl and T w are worse than those of the IBSA algorithm. From Figure 5, it is obvious that the posterior probability distribution curves of T y , T yl and T w are relatively wide and contain rich information, so the results are not as accurate and direct as the identification results based on IBSA.
It can be concluded from Tables 2 to 4 that: (a) The mean and max of most PTGS parameters obtained by DREAM are similar to the point identification results obtained by the intelligent

CONCLUSIONS
Parameter identification of PTGS has received a lot of attention in recent years to model the PSU accurately. However, recent parameter identification methods are mostly based on optimisation theory, which can only obtain a set of optimal parameters and cannot quantify the uncertainty of the parameters. This study employs an MCMC algorithm based on Bayesian theory, that is, the DREAM method, for efficient parameter identification and accurate uncertainty quantification for PTGS. A real PSU in China is simulated and the PTGS parameters are estimated using the DREAM method. The main results and conclusions of this study are summarized as follows: (a) The marginal distribution intervals are relatively small compared with the prior search range with uniform distribution and the posterior probability density distribution of the eight parameters is relatively concentrated, which indicates that the eight PTGS parameters are reasonably well identifiable; (b) the maximum posterior estimations of the eight parameters are relatively close to the system value. The posterior mean and the maximum posterior estimation of most PTGS parameters obtained by DREAM are similar to the point identification results obtained by intelligent evolutionary algorithm [11], which demonstrates the effectiveness of the DREAM method; (c) the uncertainty degree will become lager along with the increase of the observation noises, and the width of the parameter marginal distribution interval generally increases accordingly; (d) the uncertainty of the prior search space of the parameters has a great influence on the parameter identification accuracy of PTGS. The DREAM algorithm can narrow the prior search space of PTGS parameters effectively and thus help the hydropower engineers to make decisions. After all, it can be concluded from the results and conclusions of this study that the DREAM method can be employed as an effective tool for parameter identification and accurate uncertainty quantification of PTGS. On the one hand, the DREAM algorithm can identify the parameters of PTGS with unknown prior search space and noise; on the other hand, it can also reduce the search space of PTGS parameters. The above research framework of parameter identification and uncertainty analysis not only brings new ideas and methods for hydropower research but also has certain enlightenment significance for the parameter estimation and uncertainty analysis of complex non-linear systems in other industries. Further research will be focused on applying more parameter uncertainty methods on PTGS and compare the performances of different uncertainty analysis methods for parameter identification of PTGS on realworld simulations.