Advanced method for short-term wind power prediction with multiple observation points using extreme learning machines

Abstract: This paper presents an advanced approach to short-term wind power prediction based on artificial intelligence techniques. High-quality wind power prediction is essential for power system planning, operation, and control. A new approach has therefore been developed to improve the quality and reliability of the calculated results by integrating an advanced time series processing method with the extreme learning machine technique. Historical records are taken from numerical weather information and from multiple observation points close to real wind farm sites in Australia. In the first stage, the developed model predicts the wind speed; the wind power and capacity factor are then calculated from the wind power-speed curve for each observation site. Artificial neural network, fuzzy logic (adaptive neuro-fuzzy inference system), and support vector machine models are used for model verification and validation. The developed model is tested using real wind measurements from the Bureau of Meteorology at 15 selected weather stations located near real wind farm sites in Australia. The results and performance indicators, e.g. root mean square error and mean absolute error, are compared with the Khalid, persistence, and Grey predictor models for validation and verification. The proposed model is found to be more efficient and more accurate for wind power estimation and prediction than the conventional methods and models considered, which in turn improves power system performance and reduces economic impacts.


Introduction
The wind power industry has grown extensively over recent years owing to environmental and sustainability targets, the availability of wind resources, and cost reductions. Offshore and onshore wind farms have been constructed in increasing numbers [1][2][3][4]. As a result, the world's energy portfolio has become larger, and wind turbines have become larger and more expensive. The Australian Energy Market Operator forecasts 8.88 GW of additional wind generation in the National Electricity Market (NEM) by 2020 [5]. This results in around 11.5 GW of total installed NEM generation capacity. The key factors to support this extra amount are reducing the power system inertia and reducing the fault level. The world's wind power resources are tremendous and can cover a large share of global electricity consumption [6]. This has already happened in numerous regions of the world with significant installed wind power generation capacities [3]. Wind power has grown continuously in Europe, the USA, Canada, and Australia [6][7][8][9], owing to the large installed capacity of the generators used. Denmark has been a pioneer in developing commercial wind power since the 1970s [10], and today a substantial share of wind turbines around the world are produced by Danish manufacturers such as Vestas. Wind power provided 33% of Denmark's total energy consumption in 2013 and 41% of Denmark's electricity consumption in the first half of 2014. Furthermore, Denmark plans to meet 100% of its energy needs with renewable resources by 2050. Wind power provides a clean source of energy without carbon dioxide emissions for future power generation and smart grids, and it can be a valuable supplement to conventional energy resources such as fossil fuels.
However, wind power fluctuates in space and time because of the uncertain nature of the wind, while wind shear and tower shadow effects also cause periodic fluctuations [11]. These may lead to severe forced oscillations when the frequencies of the periodic variations approach the natural oscillation frequencies of the connected power network. The cost-effective integration of wind power into the NEM is therefore a significant challenge [12][13][14][15]. Accurate wind power prediction (WPP) is one of the most critical aspects of wind power integration and control, and it can improve power system operation [16,17]. It helps decision makers to schedule more efficient and economical power generators to meet the power demand of NEM utility users. Furthermore, short-term WPP enhances power system security and stability and increases the reliability of wind power integration and unit commitment. It reduces the reserve required to cover the power demand, allows the dispatcher to optimise generation, and reduces the production cost. By contrast, high wind power forecasting errors affect the decisions made by the power system operator and cause serious power system problems. WPP has therefore been studied by many researchers over many years [18][19][20][21][22][23][24][25][26][27]. However, current forecasting methods still produce considerable errors and take a long time to generate a prediction. Several methods, techniques, and strategies have been developed to overcome the wind power forecasting problems, whether as stand-alone or hybrid models, such as statistical models, the persistence method (PM), the Grey method, the Khalid method, and artificial intelligence (AI) techniques [23][28][29][30][31][32][33][34]. The persistence wind power forecast assumes that the wind power at a particular future time will be the same as it is when the forecast is made [35,36].
In the Grey model, which has been widely applied in many prediction applications, the first-order accumulated generating operation series is generated, and then the Grey dynamic model is formulated [37]. In the Khalid model, the wind speed and wind direction are predicted in the first stage based on a wind speed-direction coupling technique, and the predicted wind speed is then converted to predicted wind power using the wind power-speed curve (PC) transform [31]. A variety of other techniques have been developed for WPP using linear and non-linear models, including the autoregressive moving average method [38,39], integrated neural network models [40][41][42][43][44], support vector machines (SVM) and wavelet SVM methods [28][45][46][47], and fuzzy logic models [48,49]. Hybrid methods for short-term wind power forecasting have also been proposed in recent years [40], such as applications of SVMs and fuzzy systems for WPP, in which a fuzzy clustering algorithm is first used to classify the different wind speed patterns, and support vector regression is then optimised to make the wind power forecast [50]. Moreover, an ensemble short-term wind power forecasting model has been developed based on artificial neural networks (ANNs) and Gaussian processes (GP), where the integrated ANN and GP improved the prediction accuracy [41,51]. Furthermore, probabilistic methods are widely used to solve wind power forecasting problems, such as the application of neural networks for wind power generation prediction [52,53], the temporally local GP method [54], and fuzzy systems and the SVM method [55]. Extreme learning machines (ELMs) have been widely developed for probabilistic WPP interval problems [56][57][58][59] and for electricity market price prediction problems [60,61]. ELMs are also widely used for short-term wind speed and power forecasting [62][63][64] because the ELM, as a gradient-free method, is a powerful tool for modelling complex and non-linear systems.
ELM offers high accuracy and very fast processing, which in turn overcomes the computationally expensive learning algorithms of conventional ANNs. A higher WPP error leads to inefficiency in power system operation and to economic impacts in the electricity market: forecast errors result in either wasted energy or insufficient energy to meet the demand and spinning reserve requirements, and in higher costs for energy not served. Different methods and models use different quantitative indices for evaluating wind power prediction performance and accuracy, such as root mean square error (RMSE), mean absolute error (MAE), and mean absolute percentage error, while others normalise these indices by the installed wind power capacity of the case study. Advanced time series processing has recently been applied to WPP to reduce the prediction errors caused by the volatility and non-linearity of wind power [65]. In that approach, the wind power is decomposed into components with different frequencies by ensemble empirical mode decomposition (EEMD), a chaotic time series analysis and a multiscale singular spectrum analysis (MSSSA) are applied for further data manipulation, and the least squares SVM method (LSSVM) is used for WPP. Table 1 lists the performance evaluation indices, normalised RMSE (NRMSE) and normalised MAE (NMAE), for short-term 1 h ahead WPP based on different methods and models such as PM, radial basis function neural networks (RBFNN), LSSVM, EEMD-LSSVM, and MSSSA-LSSVM [65]; it shows that a higher accuracy can be obtained by further processing of the wind power time series.
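The error indices named above can be sketched in a few lines of Python. This is a minimal illustration, assuming the normalised indices are expressed as a percentage of the installed capacity; the function names and the `installed_capacity` parameter are hypothetical and are not taken from the paper.

```python
import numpy as np

def rmse(actual, predicted):
    """Root mean square error between two equal-length series."""
    err = np.asarray(actual, dtype=float) - np.asarray(predicted, dtype=float)
    return float(np.sqrt(np.mean(err ** 2)))

def mae(actual, predicted):
    """Mean absolute error between two equal-length series."""
    err = np.asarray(actual, dtype=float) - np.asarray(predicted, dtype=float)
    return float(np.mean(np.abs(err)))

def normalised(metric_value, installed_capacity):
    """Express an error index as a percentage of installed capacity (assumed convention)."""
    return 100.0 * metric_value / installed_capacity

# Toy example: power values in MW for a hypothetical 4 MW site.
actual = [1.0, 2.0, 3.0]
predicted = [1.0, 2.0, 5.0]
print(rmse(actual, predicted), mae(actual, predicted))
print(normalised(mae(actual, predicted), installed_capacity=4.0))
```

Normalising by capacity makes errors comparable between sites of different size, which is presumably why some studies report NRMSE and NMAE instead of the raw indices.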
In this paper, multiple observation points based on 15 selected weather stations close to real wind farms in Australia are studied over the period from 2011 to 2014. Australia is a continent that experiences various climate conditions, as shown in Fig. 1 [66], where the temperature and air density variations are characterised by periodic fluctuations. The wind speed, on the contrary, is intermittent and volatile owing to its uncertainty. The wind speed is therefore decomposed into a set of intrinsic mode functions (IMFs) by the so-called sifting process. Each IMF is then further processed using a sample entropy (SamEn) based complexity measure; SamEn and approximate entropy (ApEn) are complexity-measure-based techniques used to quantify the amount of regularity and the unpredictability of fluctuations in a time series. The main contributions and novelty of this paper are as follows: it considers the intermittent nature of wind power, examines the characteristics of wind speed at different Australian weather station locations, and studies the wind speed components during both the decomposition and prediction stages. Integrating an advanced time series processing method, the complete EEMD with adaptive noise (CEEMDAN) technique, with ELM and ApEn improved the WPP accuracy, minimised the accumulated errors and the computational time, and reduced the forecasting uncertainty. The approach also takes into account the temporal and spatial variations of wind speed.
This paper is organised as follows. Section 2 presents CEEMDAN together with the EEMD and EMD based methods. Section 3 describes ApEn and SamEn as complexity-measure-based methods for data reconstruction. Section 4 gives further information about the ELM-based method, while Section 5 develops an advanced model based on CEEMDAN-ELM for WPP and capacity factor estimation. Discussions, analysis, and simulations are presented in Section 6, and finally, Section 7 concludes the paper.

Complete EEMD with an adaptive noise
CEEMDAN is a well-developed method that can be used to overcome the problems of EEMD [67][68][69]. It decreases the complexity of calculations by demanding less than one-half of the sifting process. It also overcomes significant drawbacks of the EMD method [70,71], such as the end effect and mode mixing, where mode mixing is defined as either a single IMF consisting of components of widely disparate scales, or a component of a similar scale residing in different IMFs. In the CEEMDAN decomposition, the first residue is calculated as

$$r_1[n] = x[n] - \mathrm{IMF}_1[n],$$

where $\mathrm{IMF}_1[n]$ is determined in the same way as in EEMD. The first EMD mode is then computed over an ensemble of $r_1[n]$ plus different realisations of a given noise, and $\mathrm{IMF}_2[n]$ is obtained by averaging. The next residue is

$$r_2[n] = r_1[n] - \mathrm{IMF}_2[n].$$

This process continues with the remaining modes until the stopping rule is reached. Let the operator $E_j(\cdot)$, for a given signal, produce the jth mode found by EMD, let $w^{i}$ be white noise with distribution $N(0, 1)$, let $\varepsilon_k$ be the noise amplitude at stage k, and let $x[n]$ be the target data. The decomposition procedure is then as follows.

Step 1: Decompose by EMD the I realisations $x[n] + \varepsilon_0 w^{i}[n]$ to determine their first modes, and compute the first IMF as

$$\mathrm{IMF}_1[n] = \frac{1}{I} \sum_{i=1}^{I} \mathrm{IMF}_1^{i}[n].$$

Step 2: Obtain the first residue from the first stage as

$$r_1[n] = x[n] - \mathrm{IMF}_1[n].$$

Step 3: Decompose the realisations $r_1[n] + \varepsilon_1 E_1(w^{i}[n])$, $i = 1, \ldots, I$, until their first EMD mode is obtained, and define the next mode as

$$\mathrm{IMF}_2[n] = \frac{1}{I} \sum_{i=1}^{I} E_1\big(r_1[n] + \varepsilon_1 E_1(w^{i}[n])\big).$$

Step 4: For $k = 2, \ldots, K$, determine the kth residue as

$$r_k[n] = r_{k-1}[n] - \mathrm{IMF}_k[n].$$

Step 5: Decompose the realisations $r_k[n] + \varepsilon_k E_k(w^{i}[n])$, $i = 1, \ldots, I$, until their first EMD mode is obtained, and define the (k + 1)th mode as

$$\mathrm{IMF}_{k+1}[n] = \frac{1}{I} \sum_{i=1}^{I} E_1\big(r_k[n] + \varepsilon_k E_k(w^{i}[n])\big).$$

Step 6: Go to step 4 for the next k. Steps 4-6 are repeated until the residue can no longer be decomposed.

The final residue then satisfies

$$R[n] = x[n] - \sum_{k=1}^{K} \mathrm{IMF}_k[n],$$

where K is the total number of modes for the signal $x[n]$, which can therefore be expressed as

$$x[n] = \sum_{k=1}^{K} \mathrm{IMF}_k[n] + R[n].$$

This equation completes the decomposition and gives an exact reconstruction of the original data. Moreover, the error caused by the added white noise in the decomposition (the difference between the correct decomposition of the original time series and the result of the ensemble procedure) is given by

$$\varepsilon_n = \frac{\varepsilon}{\sqrt{N}},$$

where N is the number of ensembles, $\varepsilon$ is the magnitude of the added white noise, and $\varepsilon_n$ is the final standard deviation of the error. The noise $w^{i}(t)$ can be calculated using $w^{i}(t) = \varepsilon \cdot \mathrm{noise}(t)$. Comparing the EEMD and EMD modes, the only difference is that EEMD requires averaging over the realisations to obtain each IMF, whereas EMD does not. The given wind time series are decomposed by EMD, EEMD, and CEEMDAN as shown in Fig. 2, with the number of realisations, the noise standard deviation, and the maximum number of iterations set to 150, 0.3, and 16,000, respectively.
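The role of ensemble averaging can be illustrated numerically. The sketch below does not perform EMD or CEEMDAN; it only checks, under the assumption that the residual noise standard deviation equals the noise magnitude divided by the square root of the number of realisations, how much white noise survives averaging. The settings of 150 realisations and a noise standard deviation of 0.3 follow the text, while the function name, sample count, and seed are hypothetical.

```python
import numpy as np

def residual_noise_std(n_realisations, noise_std, n_samples=4000, seed=0):
    """Standard deviation of white noise left after averaging an ensemble.

    Each row is one independent white-noise realisation scaled by noise_std;
    averaging down the rows mimics the ensemble mean used in EEMD/CEEMDAN.
    """
    rng = np.random.default_rng(seed)
    noise = noise_std * rng.standard_normal((n_realisations, n_samples))
    return float(noise.mean(axis=0).std())

measured = residual_noise_std(n_realisations=150, noise_std=0.3)
expected = 0.3 / np.sqrt(150)  # assumed relation: magnitude / sqrt(ensemble size)
print(f"measured {measured:.4f}, expected {expected:.4f}")
```

The measured value should sit close to the expected one, which suggests why a moderately large ensemble leaves little of the injected noise in the averaged modes.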

Approximate entropy
ApEn is a complexity-measure-based technique used for quantifying the amount of regularity in a time series [72,73]. It is widely used for assessing time series signals and for diagnosing disease. It is computed from a data stream of length N, a window size m, and a distance measure used for comparison. ApEn is designed to work with data samples, depends heavily on the record length of the time series [74,75], and lacks relative consistency. SamEn is an improvement on ApEn whose substantial advantage is its independence of data length. It does not require any assumption about the stationarity of the data [76] and is a simple tool for measuring the complexity of a time series. Different systems can be distinguished by adjusting the parameters of SamEn(m, r, N): for a given embedding dimension m, tolerance r, and number of data points N, SamEn is defined as the negative logarithm of the conditional probability that two sequences that are similar for m data points remain similar at the next point, with self-matches excluded. For data samples $\{x_1, x_2, \ldots, x_N\}$, the SamEn algorithm forms vectors of window size m, $x_m(i) = [x_i, x_{i+1}, \ldots, x_{i+m-1}]$, $i = 1, 2, \ldots, N - m + 1$, in $\mathbb{R}^m$.
The distance between two vectors is the maximum absolute difference between their scalar components:

$$d_m[x_m(i), x_m(j)] = \max_{k = 0, \ldots, m-1} |x_{i+k} - x_{j+k}|,$$

where $d_m$ is the distance between the window vectors $x_m(i)$ and $x_m(j)$. For each $x_m(i)$, $B_i$ is the number of j for which the distance between $x_m(i)$ and $x_m(j)$ is less than or equal to the predefined tolerance r, which serves as a noise filter:

$$B_i^m(r) = \frac{B_i}{N - m - 1}.$$

Similarly, $A_i$ is the number of $x_{m+1}(j)$ within r of $x_{m+1}(i)$, where $1 \le j \le N - m$ and $j \ne i$; it is calculated by increasing the window dimension to m + 1:

$$A_i^m(r) = \frac{A_i}{N - m - 1}.$$

Define $B^m(r)$ and $A^m(r)$ as

$$B^m(r) = \frac{1}{N - m} \sum_{i=1}^{N-m} B_i^m(r), \qquad A^m(r) = \frac{1}{N - m} \sum_{i=1}^{N-m} A_i^m(r),$$

where $B^m(r)$ and $A^m(r)$ are the probabilities that two sequences match for m and m + 1 points, respectively. The SamEn for a finite data length N is then

$$\mathrm{SamEn}(m, r, N) = -\ln \frac{A^m(r)}{B^m(r)}.$$

Since $A^m(r)$ is never larger than $B^m(r)$, the calculated value of SamEn is zero or positive. The best values of the window size m are 1 or 2, while the tolerance r lies between 0.1 and 0.25 times the standard deviation [25]. In this study, m is set to 2 and r is set to 0.2 times the standard deviation of the wind speed data series. Based on the theory of CEEMDAN, EEMD, and EMD, the first IMF carries the highest-frequency content of the original time series, and the SamEn value becomes smaller as the IMF order increases. The calculated IMFs are reconstructed into noise, cyclic, and trend components based on the grouping rule of Fig. 3a, with λ set to 0.3 for CEEMDAN, EEMD, and EMD. The SamEn values of the IMFs decomposed by CEEMDAN, EEMD, and EMD for different weather stations within the NSW and SA regions are shown in Figs. 3b and c, where the SamEn values decrease with increasing order of the intrinsic mode functions.
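The sample entropy calculation described above can be sketched as follows. This is a minimal, unoptimised illustration, assuming the common convention of using the same N − m templates for both window lengths and counting each pair once with self-matches excluded; the function name and the `r_factor` parameter are hypothetical. The defaults m = 2 and r = 0.2 times the standard deviation follow the text.

```python
import numpy as np

def sample_entropy(x, m=2, r_factor=0.2):
    """Sample entropy of a 1-D series with tolerance r = r_factor * std(x)."""
    x = np.asarray(x, dtype=float)
    n = x.size
    r = r_factor * x.std()

    def count_matches(length):
        # Build N - m templates of the requested length (m or m + 1).
        templates = np.array([x[i:i + length] for i in range(n - m)])
        count = 0
        for i in range(len(templates) - 1):
            # Chebyshev (maximum absolute) distance to all later templates,
            # so each pair is counted once and self-matches are excluded.
            dist = np.max(np.abs(templates[i + 1:] - templates[i]), axis=1)
            count += int(np.sum(dist <= r))
        return count

    b = count_matches(m)      # matches for m points
    a = count_matches(m + 1)  # matches that persist for m + 1 points
    if a == 0 or b == 0:
        return float("inf")   # undefined when no matches are found
    return float(-np.log(a / b))

# A regular signal should score lower than white noise.
rng = np.random.default_rng(0)
sine = np.sin(np.linspace(0.0, 20.0 * np.pi, 500))
noise = rng.standard_normal(500)
print(sample_entropy(sine), sample_entropy(noise))
```

In the setting of the text, a function like this would be applied to each IMF, with high-entropy IMFs grouped as noise and low-entropy IMFs grouped as trend according to the threshold rule of Fig. 3a.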

Extreme learning machines
Applications of AI techniques such as ANN, fuzzy logic, ELM, and SVM in power system modelling and load forecasting are an area of growing interest [77][78][79]. ELM has advantages over other AI methods for power system operation and control owing to its higher accuracy, gradient-free training, and very high computational speed [79]. An ELM-based method is developed here for short-term wind speed prediction based on the obtained noise, trend, and cyclic components, where the value at the next point ahead is predicted by the ELM. The variations of these components are well-behaved and more predictable. ELM is an algorithm for training a single-hidden-layer feed-forward network. It randomly generates all input weights and bias parameters, and then analytically obtains the output weights using a simple matrix computation [80,81]. For N distinct samples $(x_j, t_j)_{j=1}^{N}$, where the input $x_j \in \mathbb{R}^n$ and the target $t_j \in \mathbb{R}^m$, an ELM constructed with K hidden nodes and activation function $g(x)$ can approximate a non-linear function with uncertain nature. The ELM can be modelled as

$$\sum_{i=1}^{K} \beta_i \, g(w_i \cdot x_j + b_i) = t_j, \quad j = 1, \ldots, N,$$

which is written compactly as $H\beta = T$, where $w_i = [w_{i1}, w_{i2}, \ldots, w_{in}]^T$ is the weight vector connecting the ith hidden neuron and the input nodes, $b_i$ is the bias of the ith hidden neuron, H is the hidden layer output matrix, $\beta$ is the output weight matrix, and T is the matrix of targets. It has been shown that an ELM with randomly chosen input weights and hidden layer biases can exactly learn N distinct observations [80]. The matrix H remains unchanged once the random values have been assigned.
Training an ELM therefore corresponds to finding the least-squares solution of $H\beta = T$, which can be expressed as

$$\beta = H^{\dagger} T,$$

where $H^{\dagger}$ is the Moore-Penrose generalised inverse of the matrix H. Compared with conventional gradient-based ANNs, ELM avoids many difficulties, such as local minima and high computational burdens, and it can achieve good generalisation performance with increased learning speed. To improve generalisation performance and make the solution more robust, a regularisation term C can be added, for example in the commonly used form

$$\beta = \left(\frac{I}{C} + H^T H\right)^{-1} H^T T.$$
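The training procedure described above can be sketched with NumPy. This is a minimal illustration under stated assumptions, not the authors' implementation: the class name `SimpleELM`, the uniform weight range, the default value of C, and the ridge-style form of the regularised solution are all hypothetical choices. The sigmoid activation and the default of 250 hidden neurons follow the configuration mentioned elsewhere in the paper.

```python
import numpy as np

class SimpleELM:
    """Single-hidden-layer feed-forward network trained as an ELM (sketch)."""

    def __init__(self, n_hidden=250, c=1e3, seed=0):
        self.n_hidden = n_hidden  # number of hidden nodes K
        self.c = c                # regularisation term C (assumed value)
        self.seed = seed

    def _hidden(self, x):
        # Sigmoid activation applied to the randomly projected inputs: matrix H.
        return 1.0 / (1.0 + np.exp(-(x @ self.w + self.b)))

    def fit(self, x, t):
        rng = np.random.default_rng(self.seed)
        # Input weights and biases are drawn once and never updated.
        self.w = rng.uniform(-1.0, 1.0, size=(x.shape[1], self.n_hidden))
        self.b = rng.uniform(-1.0, 1.0, size=self.n_hidden)
        h = self._hidden(x)
        # Regularised least squares: beta = (I/C + H^T H)^-1 H^T T.
        gram = np.eye(self.n_hidden) / self.c + h.T @ h
        self.beta = np.linalg.solve(gram, h.T @ t)
        return self

    def predict(self, x):
        return self._hidden(x) @ self.beta

# Toy usage: fit a sine curve (not wind data).
x = np.linspace(-3.0, 3.0, 200).reshape(-1, 1)
t = np.sin(x).ravel()
model = SimpleELM(n_hidden=50, c=1e6).fit(x, t)
train_rmse = float(np.sqrt(np.mean((model.predict(x) - t) ** 2)))
print(train_rmse)
```

Because the only trained quantity is the output weight vector, fitting reduces to one linear solve, which is the source of the short training times attributed to ELM in the text.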

Proposed method
The proposed WPP system is developed based on multiple observation points taken from different weather stations close to real wind farms in Australia, as shown in Fig. 4. The prediction steps are as follows: recording the wind data, decomposing the data into IMFs, grouping based on the SamEn strategy, developing the ELM, and finally WPP and capacity factor estimation. The focus of this study is to improve the wind prediction at a given wind farm location using wind speed measurements and to compare the proposed method with other approaches and models for accuracy, verification, and validation. The main objective is to propose a complete wind power predictor that is capable of dealing with uncertainty, intermittency, and fluctuations. An accurate prediction system can support system operators in scheduling efficient, reliable, and economical energy production to meet the demand of utility customers. Moreover, short-term power prediction enhances power system security, reliability, and stability. The proposed method has two stages. In the first stage, the wind time series are decomposed using advanced time series processing techniques such as CEEMDAN, EEMD, and EMD. The ApEn is calculated for each IMF, and the IMFs are then grouped (data reconstruction) into three components, a noise component, a cyclic component, and a trend component, using the grouping method described in Fig. 3a. Decompositions of the time series by EEMD and EMD are used for accuracy comparison. Advanced time series processing methods are combined with AI techniques to predict the wind speed and temperature variations, and the predicted wind speed and temperature values are then used to calculate the air density and wind power for each location.
In the second stage, the PC with a rated wind power generation of 3.6 MW is used to calculate the predicted short-term wind power and then to estimate the capacity factor associated with each weather station under study, each of which is close to a real wind farm location in Australia. Different model configurations, network structures, and training algorithms have been tested to find the optimal model. As a result, an ELM network with a single hidden layer of 250 sigmoid neurons is used in this paper. The RMSE and the CPU time are used to assess the model accuracy and the training time, respectively. In addition to the ELM model, three models are built using ANN, fuzzy logic, and SVM to verify the output results. A feed-forward back-propagation ANN is developed with input and output layers and two sigmoid 'tansig' hidden layers with 20 and 40 neurons, respectively. The model is trained with the Levenberg-Marquardt training algorithm. A Sugeno-type fuzzy logic system is developed using the neuro-fuzzy designer with three 'trimf' membership functions, and is trained using the adaptive neuro-fuzzy inference system hybrid learning algorithm. Furthermore, an SVM model is built using the radial basis function (RBF) kernel and is trained using the sequential minimal optimisation technique. Finally, the persistence, Grey, and Khalid models are used for comparative studies, model validation, and verification.

Discussions, analysis, and results
Wind power is proportional to the cube of the wind speed. Thus, any small change in the wind speed leads to a significant change in the wind power. The PC is developed based on a rated value of 3.6 MW, with cut-in, corner, and cut-out wind speeds of 4, 14, and 25 m/s, respectively. The air density and the wind power per unit area at different temperature and wind speed values can be determined using the air density and wind power equations:

$$\rho = \frac{p_{atm}}{R_{air} T}, \qquad \frac{P}{A} = \frac{1}{2} \rho v_i^3,$$

where $\rho$ is the air density (kg/m$^3$), $p_{atm}$ is the atmospheric pressure in pascals (101 × 10$^3$ Pa), $R_{air}$ is the gas constant of air (287 J/kg K), T is the average air temperature (K) for the weather station, and $v_i^3$ is the cube of the ith wind speed value (m/s). The model is tested using the data obtained from the Bureau of Meteorology for different locations close to real wind farms in Australia, and compared with other models. Tables 2 and 3 show the accuracy (%) of the wind [...] Fig. 7 shows the estimated wind power and capacity factor in stage 2. Based on the tested results, the approach added reasonable improvements in WPP. However, further improvements can be obtained by further processing of the wind speed IMFs and by shortening the wind speed prediction horizon/time scale, e.g. to 5 min.
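The second-stage calculation can be sketched as follows. This is a minimal illustration, assuming the ideal gas law for air density, a power density of one half of the density times the cube of the wind speed, and a cubic ramp between the cut-in and corner speeds; the shape of the ramp is an assumption, since the text gives only the rated power and the three characteristic speeds. All function names are hypothetical.

```python
import numpy as np

def air_density(temp_k, pressure_pa=101e3, r_air=287.0):
    """Air density in kg/m^3 from the ideal gas law (temperature in kelvin)."""
    return pressure_pa / (r_air * temp_k)

def power_density(v, temp_k):
    """Wind power per unit area in W/m^2 for wind speed v in m/s."""
    return 0.5 * air_density(temp_k) * np.asarray(v, dtype=float) ** 3

def turbine_power(v, rated_mw=3.6, cut_in=4.0, corner=14.0, cut_out=25.0):
    """Power curve in MW; the cubic ramp below the corner speed is assumed."""
    v = np.asarray(v, dtype=float)
    ramp = rated_mw * (v ** 3 - cut_in ** 3) / (corner ** 3 - cut_in ** 3)
    p = np.where((v >= cut_in) & (v < corner), ramp, 0.0)
    # Rated output between the corner and cut-out speeds, zero elsewhere.
    return np.where((v >= corner) & (v <= cut_out), rated_mw, p)

def capacity_factor(v, rated_mw=3.6):
    """Mean output over the series divided by the rated power."""
    return float(np.mean(turbine_power(v, rated_mw)) / rated_mw)

speeds = np.array([3.0, 8.0, 14.0, 20.0, 26.0])  # hypothetical predicted speeds
print(turbine_power(speeds))
print(capacity_factor(speeds))
```

In the workflow described in the text, the predicted wind speed series from stage 1 would take the place of `speeds`, giving one capacity factor estimate per weather station.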
Comparing the statistical results, analysis, and simulations, the figures and tables show that decomposing the wind speed time series into three components (cyclic, trend, and noise) gives much better predictions than using the original wind speed time series. CEEMDAN performs much better than EEMD and EMD because it increases the number of realisations of the IMFs, which gives a higher correlation coefficient and a more robust decomposition. ELM, as a gradient-free algorithm, performs much better than conventional neural networks, e.g. those with tapped delay, which need a very long time for training. CEEMDAN-ELM has higher accuracy, less noise, and lower training times: the average training accuracy of ELM-CEEMDAN for the noise, cyclic, and trend components is 0.5, 0.02, and 0.155, respectively, while the average training time [...] 389', respectively. The ANN has a higher training time and higher accuracy than the SVM and fuzzy system models. Based on these results, the CEEMDAN-based ELM method is superior to conventional ANN, SVM, and fuzzy systems for short-term wind speed prediction and WPP. The SamEn complexity measure is also a useful tool for assessing the wind speed time series. Furthermore, the CEEMDAN-based ELM has higher accuracy than the persistence, Grey, and Khalid methods: the MAE of the proposed ELM method is 0.33% in stage 1 and 0.56% in stage 2, while the MAE of the persistence, Grey, and Khalid methods is 6.35, 2.94, and 2.27% in stage 1 and 4.07, 4.05, and 2.87% in stage 2, respectively. This indicates the effectiveness of the proposed ELM model for solving short-term WPP problems, and it can be recommended for practical applications. Further processing of the wind speed IMFs is under study.

Conclusion
The intermittent, volatile, and uncertain nature of wind power creates significant challenges for both the economic operation and the control of power systems. An accurate WPP system therefore enhances power system operation and decision making. In this paper, an advanced approach has been developed for short-term WPP by integrating advanced time series processing methods with AI techniques such as ELMs, ANNs, fuzzy systems, and SVMs. Real wind data have been collected and assessed for several geographical locations close to real wind farms in Australia. In the first stage, the forecasting system predicts the wind speed based on multiple observations from weather stations close to the wind farm regions, and in the second stage the wind power is predicted based on the PC. The estimated capacity factor for each weather station is of high importance when assessing renewable energy resources. Different model structures and training algorithms were developed to obtain the optimum construction and configuration for all models under study (ELM, ANN, fuzzy system, and SVM), based on the available wind data resources. Decomposing the wind speed time series by CEEMDAN, grouping the result into three components, and using a separate intelligent forecaster for each component is more efficient than building a prediction system on the original wind speed signal. ELM is a very efficient and fast processing method for dealing with complex relationships. CEEMDAN was integrated with ELM for WPP. In addition, EEMD and EMD were integrated with ANN, fuzzy logic, and SVMs for validation and verification purposes.
The proposed ELM method has been compared with other prediction techniques such as the Khalid method, the Grey method, and PM, and the performance indices MAE and RMSE indicate that the proposed ELM method is more accurate and efficient for short-term WPP problems. Consequently, based on the statistical analysis and simulation results, the proposed approach gives a clear improvement in the final WPP performance, and it can be recommended for practical wind power applications.