Weather biased optimal delta model for short-term load forecast

In the deregulated Indian electricity market, where power demand and availability vary remarkably, demand variations are often driven by unprecedented weather conditions and technological evolution. To maintain grid security and discipline, both of which carry financial implications, an equilibrium between electricity supply and demand must be established. A model that anticipates these variations and adapts readily to such changes is the need of the hour. For this purpose, this study proposes an algorithm suited to the day-ahead load forecast. The variables selected for the forecast are one-day-lagged demand statistics, seasonality trend, weather, and calendar variables. The proposed algorithm outperforms the existing benchmark model when evaluated through statistical performance metrics, namely mean absolute percentage error, mean absolute error, root-mean-square error, and coefficient of variation. The performance of the proposed methodology at the seasonal level is analysed and validated through uncertainty analysis with one post-sample year for the state of Delhi, India. The model is compatible with prevalent grid regulations and is expected to hold good under the weather and demand variations anticipated in the future.


Introduction
Worldwide, energy consumption is increasing at 1.52% due to globalisation, climatic changes, and technological advancements, whereas Indian electricity demand has risen by 10% annually since 2000 [1,2]. Such an exponential increase in demand can result in a mismatch between demand and availability, which may lead to several disturbances at the grid level. Despite any eventuality, maintaining grid stability is an obligation of the power management authority of any country. In India, this role is played by the National Load Despatch Center, the Regional Load Despatch Centers, and the State Load Despatch Centers at national, regional, and state levels, respectively [3,4]. In 2019, the energy requirement for India was 1,274,595 million units (MU) and the energy availability was 1,267,526 MU, a shortage of 0.6%, whereas the peak demand was 177,022 MW and the peak met was 175,528 MW, a shortage of 0.8%. Fig. 1 shows the trend in power shortages over the past decade in terms of energy and peak power supply [5]. Effective power management through procurement and disposal of power requires good anticipation and planning, which can be improved with well-equipped tools capable of anticipating upcoming loads. Load forecasting plays a valuable role in planning and control operations to improve the power quality and stability of the electricity grid. Load forecasts are normally classified into four categories: very short-term, short-term, medium-term, and long-term [6][7][8]. A very short-term forecast covers real time to a few minutes ahead and helps to estimate intra-day scheduling of power for the distribution companies and contingency analysis for the security of the system [6]. A short-term forecast covers half an hour to the upcoming day and is useful for the allocation of spinning reserve, operational planning, unit commitment, and maintenance scheduling [9][10][11][12].
A medium-term forecast covers a few days to a few weeks and is useful for seasonal planning such as peak summer and winter [7]. A long-term forecast predicts load patterns from a few months to a few years ahead, which provides a better assessment of future generation growth in advance [13]. One of the best ways to reduce the mismatch between demand and availability is to closely analyse weather and demand trends and to frequently optimise the hyperparameters for an effective forecast.
Various methods have been developed for load forecasting over time. Traditional models such as autoregressive (AR) and AR moving average were favoured initially due to their easy deployment for forecast generation [14]. They fell short mainly because they were developed using linear functions, whereas electricity demand data follows a non-linear trend. For this reason, additive and multiplicative model-based approaches were preferred, in which load is correlated with factors such as time, weather, calendar-based events, special occasions, and gross domestic product growth [15]. Time series-based models such as AR integrated moving average [16][17][18] and the non-linear AR method with exogenous inputs [19] provide flexibility to analyse and predict load growth under different scenarios using the least possible weather data. These forecasting methods proved useful for cities with a linear and predictable load pattern, but where the load pattern was non-linear the errors increased significantly. Recently, a number of artificial intelligence (AI)-based load forecasting algorithms have been developed using fuzzy logic [9,20], artificial neural networks [20][21][22][23][24], and AI feed-forward neural networks [17,25,26]. A semiparametric approach based on the generalised additive model (GAM) theory has been suggested for the prediction of demand at the grid level for varying time horizons [27,28]. The bootstrap method was responsive to weather variations, but it required 48 different models each day to estimate demand on a half-hourly basis, which increased the computational burden. Multi-linear regression has been used for demand prediction in MS Excel, thus avoiding the complexity of incorporating various costly software packages [29,30].
However, weather conditions (WCs) such as rain, mist, haze, fog, and so on were not used there for forecasting. Thus, a number of research gaps need to be addressed to improve forecast performance. For instance, most of the available techniques use ambient temperature as a parameter to forecast demand, whereas an accurate calculation of real feel improves forecast quality. Another observation is that existing techniques have not used WCs intensively, even though a few conditions such as lightning and thunderstorms may have a major impact on demand. Thus, numerically optimising the impact of WCs can improve the results appreciably.
To fill these research gaps, this paper proposes an algorithm for short-term load forecasting that uses the real feel index and numerically optimised WCs (such as rain, mist, haze, fog etc.). The performance of the forecast has been analysed using statistical and uncertainty analyses. The outcomes of the proposed model are then compared with a benchmark GAM.
The rest of this paper is organised as follows: Section 2 describes the load consumption pattern; Section 3 explains the need for load forecasting; Section 4 specifies the proposed algorithm; Section 5 investigates the proposed model through statistical analysis, uncertainty analysis, and various performance metrics. Finally, the conclusion is drawn in Section 6.

Load consumption pattern of Delhi
The National Capital Territory of Delhi is a metropolitan region located in the northern region of India, with the current population of 19.6 million [31]. Its geographical coordinates are 28.7041°N, 77.1025°E. As per the facts laid down by the Delhi Electricity Regulatory Commission, the consumers are categorised based upon the percentage share of energy drawn, as shown in Fig. 2.
Delhi exhibits three seasons, namely summer, winter, and the transition months between them. The average peak demand varies from 5400 MW in summer to 3700 MW in winter, while the average off-peak demand varies from 3600 MW in summer to 1400 MW in winter. The summer season begins in May and extends to September, with the average temperature varying from 25 to 45°C. Fig. 3 shows the average hour-wise monthly electricity consumption pattern of Delhi during the summer of 2017. During summer, Delhi exhibits peak consumption for 16-17 h, while the remaining 7-8 h show an off-peak consumption trend. The drop in power consumption during night hours occurs because many cooling devices such as air conditioners are switched off. The base load during the summer period is around 3200 MW, which is mostly industrial load. The peak load is dominantly commercial load that operates during day hours.
The average hour-wise monthly electricity consumption pattern of Delhi for the winter of 2017 is shown in Fig. 4. November to March show an identical demand pattern, which is significantly lower than in summer. The average temperature varies from 5 to 22°C during the winter season. During the winter months, Delhi has nearly 15 peak-demand hours and 9 off-peak hours, so the off-peak window is longer than in summer. The peak and off-peak hours in winter also differ in magnitude from those in summer: the drop in load demand is ∼30% during peak hours and 50% during off-peak hours compared to the summer months. The main reason behind this drop is a massive reduction in the use of cooling devices from November to March. The load demand during these months drops by ∼1500 MW.
There are two transition months: April, the transition from winter to summer, and October, the transition from summer to winter. These two months demonstrate an identical behaviour that is entirely different from the summer and winter months. They differ in the peak/off-peak hour consumption patterns of the load. During these months, load demand increases continually from morning to evening and then drops during night hours. These two transition months illustrate identical electricity demand patterns, as shown in Fig. 5.

Necessity of load forecasting
To understand the need for a better load forecast model, the authors attended several workshops with various distribution companies operating in the state of Delhi, where experiences of real-time operations were gathered. Furthermore, the authors referred to various load generation balance reports [4,32] released by the Central Electricity Authority, India. Based on these knowledge-sharing exercises, the gathered data has been analysed, and the divergence between demand and availability for the years 2008-2017 is illustrated in Fig. 6.
This divergence leads to financial losses for distribution companies, which are passed on to consumers through elevated electricity bills. In the year 2009-10, demand exceeded availability by nearly 8.6%, whereas the scenario was quite different in the year 2010-11, when power availability was nearly 12.2% higher than demand. The closest match between demand and availability, a gap of 0.1%, was observed in the year 2014-15.
Short-term load forecast implies forecasting of power demand on a day-ahead and week-ahead basis. The information about time-of-day (TOD) and day-of-week (DOW) is extracted through the seasonality trend. As shown in Fig. 7, a slight dip in demand is observed when moving from a weekday towards the weekend, and a rise relative to the weekend is observed when moving from the weekend back towards a weekday. These variations are extracted by comparing any D-day with its D-2 day. The trend is a vector of length 24 (hours of a day) × 7 (days of a week) = 168 data points (for hourly load data of the respective month).
These data points modify the forecast of any month in accordance with the same month from the training set of the previous year or years. For example, to forecast demand for July 2018, the trend is extracted from July 2017, and so on. In a broad sense, this trend varies month-wise and is presumed not to recur before the respective month arrives the next year. As a prerequisite, this trend is extracted as multipliers, termed the 'α-factor' in this paper. The permissible range of the α-factor is 0.92 < α < 1.08, meaning that demand during the respective time blocks of a particular day may vary by ±8% compared to its D-2 day. Variation beyond 8% is neither permissible nor frequent, and is hence trimmed off as an outlier.
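The extraction of the α-factor can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: it assumes that each multiplier is the ratio of the load on day D to the load on day D-2 at the same hour, that ratios outside 0.92 < α < 1.08 are dropped, and that the surviving ratios are averaged per (day-of-week, hour) slot. The function name `alpha_factors` and the `first_weekday` convention are hypothetical.

```python
from collections import defaultdict

ALPHA_MIN, ALPHA_MAX = 0.92, 1.08  # permissible range quoted in the text


def alpha_factors(hourly_load, first_weekday=0):
    """Return {(day_of_week, hour): mean D / D-2 load ratio}.

    hourly_load   : flat list of MW values, 24 per day, starting at hour 0.
    first_weekday : weekday index (0-6) of the first day (assumed convention).
    """
    ratios = defaultdict(list)
    n_days = len(hourly_load) // 24
    for day in range(2, n_days):
        for hour in range(24):
            ref = hourly_load[(day - 2) * 24 + hour]
            if ref <= 0:
                continue  # skip missing or invalid reference readings
            ratio = hourly_load[day * 24 + hour] / ref
            # Assumption: variation beyond +/-8% is discarded as an outlier.
            if ALPHA_MIN < ratio < ALPHA_MAX:
                ratios[((first_weekday + day) % 7, hour)].append(ratio)
    return {slot: sum(vals) / len(vals) for slot, vals in ratios.items()}
```

With at least nine consecutive days of clean hourly data, every one of the 24 × 7 = 168 slots receives a multiplier; slots whose ratios are all trimmed are simply absent from the result.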
The components used from weather data are temperature, humidity, wind speed (WS), and weather conditions (WC). In some cases, the heat index (HI), also known as real feel, can be used instead of temperature and humidity separately, since it indeed plays a role in power demand modifications [33][34][35]. HI comes into the picture only if the temperature is ≥27°C and the humidity is ≥40%; otherwise, HI is the same as the temperature. The WS is another leading weather factor for load variation. WCs such as mist, thunderstorm, rain etc. are assigned multiplication factors through trend analysis. The optimised factors, as per their environmental effect, are extracted and periodically reviewed for best fit by hyperparameter tuning. Thus, in this study, a combined effect of demand and weather is incorporated to forecast demand on a day-ahead basis.
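The threshold rule for HI can be sketched as follows. The text does not state which heat index formula is used, so this example assumes the Rothfusz regression published by the US National Weather Service, which is defined in degrees Fahrenheit and is intended for roughly the same temperature and humidity range. The function name `heat_index_c` is hypothetical, and the further low- and high-humidity adjustments that the NWS applies are omitted.

```python
def heat_index_c(temp_c, rel_humidity):
    """Real-feel temperature in deg C for a given temperature and % humidity."""
    # Rule stated in the text: below the thresholds, HI equals temperature.
    if temp_c < 27.0 or rel_humidity < 40.0:
        return temp_c
    t = temp_c * 9.0 / 5.0 + 32.0  # the regression works in deg F
    r = rel_humidity
    # Assumed formula: Rothfusz regression (not confirmed by the paper).
    hi_f = (-42.379 + 2.04901523 * t + 10.14333127 * r
            - 0.22475541 * t * r - 6.83783e-3 * t * t
            - 5.481717e-2 * r * r + 1.22874e-3 * t * t * r
            + 8.5282e-4 * t * r * r - 1.99e-6 * t * t * r * r)
    return (hi_f - 32.0) * 5.0 / 9.0
```

For a humid summer afternoon of 32°C and 70% humidity this gives a real feel of roughly 40°C, which illustrates why HI can track cooling demand better than the dry-bulb temperature alone.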
The performance of the proposed forecasting method has been analysed statistically using well-established performance metrics, and uncertainty analysis is performed to account for variation in the different parameters. For any analyst, these metrics play a great role in producing a dependable set of observations, which may further help in focusing on the possibility of reducing error. The following is a brief description of the metrics selected for this study.

Mean absolute percentage error (MAPE): also known as mean absolute percentage deviation, it is one of the metrics generally used to evaluate the performance of a forecast [34,36]. It is quite useful in model evaluation, especially in regression problems. Consider an observed time series y(i) and its predicted counterpart ȳ(i) for i = 1, ..., N. MAPE is calculated as

$$\mathrm{MAPE} = \frac{100}{N}\sum_{i=1}^{N}\left|\frac{y(i)-\bar{y}(i)}{y(i)}\right|$$

Coefficient of variation (CV): also known as relative standard deviation, it is the ratio of the standard deviation to the mean, computed over the prediction errors [37]. With σ and μ denoting that standard deviation and mean, it is measured using the formula

$$\mathrm{CV} = \frac{\sigma}{\mu}$$

Mean absolute error (MAE): it is a measure of the difference between two variables of the same nature [18,38] and is defined as follows:

$$\mathrm{MAE}(y,\bar{y}) = \frac{1}{N}\sum_{i=1}^{N}\left|y(i)-\bar{y}(i)\right|$$

Root mean square error (RMSE): it is a frequently used metric to evaluate the differences between the actual set of data and the predicted values [39]. It is defined as follows:

$$\mathrm{RMSE} = \sqrt{\frac{1}{N}\sum_{i=1}^{N}\left(y(i)-\bar{y}(i)\right)^{2}}$$

Standard deviation (σ): consider a set of data where {y_1, y_2, …, y_n} are the observed values of the items, ȳ is the mean of the observations [40] (not the prediction, as it was above), and n is the number of observations in the data set. Then

$$\sigma = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(y_{i}-\bar{y}\right)^{2}}$$

Correlation: consider a set of data on variables x and y for n individuals, with x̄ as the mean and σ_x as the standard deviation of the x-values, and ȳ as the mean and σ_y as the standard deviation of the y-values [9]. Then the correlation (c) between x and y is given as

$$c = \frac{1}{n}\sum_{i=1}^{n}\left(\frac{x_{i}-\bar{x}}{\sigma_{x}}\right)\left(\frac{y_{i}-\bar{y}}{\sigma_{y}}\right)$$

Skewness: the asymmetry of a distribution is measured using the skewness metric [41] proposed by Karl Pearson, as follows:

$$SK_{p} = \frac{\text{mean}-\text{mode}}{\sigma}$$

where SK_p is Karl Pearson's coefficient of skewness and σ is the standard deviation.

The performance of the proposed model is evaluated through these performance metrics and compared with the benchmark method (i.e. GAM) using the actual demand data gathered for the state of Delhi, India, for two years, 2017 and 2018. The forecast has been validated as an out-of-sample test on the 2018 data.
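The error metrics described above can be computed with a few lines of standard-library Python. This is a minimal sketch using the textbook definitions of MAPE, MAE, RMSE, and the Pearson correlation (population form); the function names are illustrative, and the sample values in the usage note are made up for demonstration rather than taken from the Delhi data.

```python
import math


def mape(actual, forecast):
    """Mean absolute percentage error, in percent."""
    n = len(actual)
    return 100.0 * sum(abs(a - f) / abs(a) for a, f in zip(actual, forecast)) / n


def mae(actual, forecast):
    """Mean absolute error, in the units of the data (e.g. MW)."""
    return sum(abs(a - f) for a, f in zip(actual, forecast)) / len(actual)


def rmse(actual, forecast):
    """Root mean square error, in the units of the data."""
    n = len(actual)
    return math.sqrt(sum((a - f) ** 2 for a, f in zip(actual, forecast)) / n)


def correlation(x, y):
    """Pearson correlation using population standard deviations."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sx = math.sqrt(sum((v - mx) ** 2 for v in x) / n)
    sy = math.sqrt(sum((v - my) ** 2 for v in y) / n)
    return sum((a - mx) * (b - my) for a, b in zip(x, y)) / (n * sx * sy)
```

For example, with hypothetical actual loads of [100, 200, 400] MW and forecasts of [110, 190, 400] MW, `mape` returns about 5%, `mae` about 6.67 MW, and `rmse` about 8.16 MW. RMSE exceeds MAE because squaring penalises the larger deviations more heavily.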

Proposed algorithm for short-term load forecasting
To carry out an improved-quality load forecast, it became necessary to formulate an enriched algorithm that makes use of weather biasing for reliable load forecasting. Weather biasing provides a better assessment of the extent to which a change in weather can contribute to a change in the demand patterns of a state. Basic exploratory data analyses were conducted on the electricity demand and weather data to find the correlation of the former with variations in the latter. The model is built around human perception of the weather, which plays a significant role in load variation. Weather variations have been found to be a major contributor to demand variations. After learning the correlations of demand data vis-à-vis weather variations, the day-ahead forecast is made using the delta (Δ) of various weather factors. Load variations are associated with the delta of weather parameters and other prevalent conditions by assigning multiplicative factors designated as alpha (α), beta (β), and gamma (γ). These factors are periodically reviewed so that the forecast incorporates any unanticipated changes encountered. Table 1 illustrates the impacts affecting the demand pattern along with the variables needed to model them and the time horizon from which these variables are extracted.
For this purpose, the algorithm can be formulated as the minimisation of an error function f, where

$$f = \text{sum of absolute errors} = \sum \left|\text{forecast} - \text{actual}\right| \quad (9)$$

Since the absolute-value function is not differentiable everywhere, f is replaced by the sum of squared errors

$$f = \sum \left(\text{forecast} - \text{actual}\right)^{2} \quad (10)$$

The main purpose here is to minimise the deviation between the actual load and the forecasted load: the smaller the deviation, the better the forecast. A seasonality trend needs to be extracted. For this, α (seasonality trend) is computed as TOD and DOW multipliers in a vector of length 24 × 7 = 168 (for hourly load data), and β (the weather components' multiplication factors) is computed as a vector of length 5 holding one multiplier per weather parameter (i.e. HI_1, HI_2, WS, WCs, and unknown errors). The function f over the seasonality trend (α) and the weather multiplication factors (β) is then given by (11), where i denotes the hourly data points for all 7 days of a week. Equation (12a) applies when ΔHI_i > 0 (i.e. HI is expected to rise on the upcoming day with respect to the previous day), and the equation is modified with respect to HI as in (12b) when ΔHI_i < 0 (i.e. HI is expected to dip on the upcoming day with respect to the previous day). β_1 and β_2 are the multiplication factors for incorporating the positive and negative impact of ΔHI_i. The principle used to solve for β_1 in the upcoming set of equations is also used to solve for β_2. ΔHI_i, ΔWS_i, and ΔWC_i are the changes in HI, WS, and WCs, with a vector multiplier β assigned to these components to include their impact in the load forecast. Here γ_1 and γ_2 are the exponents assigned to β of the respective heat indices and WS. The HI and WS factors produce a non-linear response in the load pattern: a per-unit variation in HI varies the load in a retarded non-linear manner, whereas a per-unit variation in WS has an accelerated impact on the load.
Keeping in view the severity of these two vital factors, the numerically optimised value of γ_1 is kept at 0.57 (0.50-0.65 is the range decided after trend analysis of load pattern variation with respect to HI variation per °C), and γ_2, on similar grounds of analysis, is assigned an impact factor of 1.15 (with 1.10-1.30 expected to be the optimum range). The values 0.57 and 1.15 represent that load demand (in MW) is going to reduce by 57% with per unit variation in HI and increase by 15% with per unit variation in WS. These optimal values assist in capturing the best impact of the HI and WS. As shown in Fig. 8a, the impact of ΔHI is increased by 15% and the remainder is then clipped off as outliers, represented by dots.
The ΔHI with modification used for forecast gives better results than ΔHI used without modification. A similar reduced impact of ΔWS with modification is compared to ΔWS without modification in Fig. 8b.
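The difference between a retarded and an accelerated non-linear response can be illustrated with a sign-preserving power transform. This sketch assumes that the exponents act on the magnitude of the weather delta; the extracted text does not make clear whether the exponent is applied to the delta, to β, or to their product, so the placement here is an assumption, and `powered_delta` is a hypothetical helper name.

```python
import math

GAMMA_HI = 0.57  # exponent quoted for the heat index
GAMMA_WS = 1.15  # exponent quoted for wind speed


def powered_delta(delta, gamma):
    """Sign-preserving power transform: sign(delta) * |delta| ** gamma."""
    return math.copysign(abs(delta) ** gamma, delta)


if __name__ == "__main__":
    # Compare both responses against a linear response for growing deltas.
    for d in (1, 2, 4, 8):
        hi = powered_delta(d, GAMMA_HI)
        ws = powered_delta(d, GAMMA_WS)
        print(f"delta={d}: HI response={hi:.2f}, WS response={ws:.2f}")
```

For any delta larger than one unit, an exponent below 1 yields a response smaller than the delta itself (each additional unit matters less), while an exponent above 1 yields a response larger than the delta (each additional unit matters more). Negative deltas keep their sign, so rises and dips can still be weighted by separate multipliers such as β_1 and β_2.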
The function in (11) optimises the variables α and β based on the previous week's data, extracted through the technique of forecast minus actuals. Extending this to a duration of 4 weeks yields (13). Optimal values of the seasonality trend (α) and the vector multipliers for the weather components (β) can be derived by finding the partial derivatives of (13) with respect to each variable (α, β). Partial differentiation of (13) with respect to α results in (14). From (14), it is possible to extract the seasonality (α) using the demand and weather data for a particular month from the previous year, as given in (15). Similarly, β can be obtained by partially differentiating (13) with respect to β, which gives (16). The sub-equations of (16) are used to carry out a parallel non-linear regression-based optimisation, so that the β extracted for the different weather factors is the best fit for the data used to train the model. The less re-optimisation is needed for upcoming days, the better the model fit. Equation (16) is differentiated after substituting (12a) or (12b) in it.
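The principle of setting a partial derivative of the squared-error function to zero can be shown on a deliberately simplified case. The sketch below assumes a one-parameter multiplicative model, forecast = α × reference load, for a single time slot observed over four weeks; this is a toy stand-in chosen for illustration, not the function in (13), and the names `best_multiplier` and `sse` are hypothetical. Setting d/dα of Σ(α·r_k − a_k)² to zero gives α = Σ r_k·a_k / Σ r_k².

```python
def sse(alpha, reference, actual):
    """Sum of squared errors for the toy model forecast = alpha * reference."""
    return sum((alpha * r - a) ** 2 for r, a in zip(reference, actual))


def best_multiplier(reference, actual):
    """Closed-form minimiser of sse: sum(r * a) / sum(r * r)."""
    numerator = sum(r * a for r, a in zip(reference, actual))
    denominator = sum(r * r for r in reference)
    return numerator / denominator


if __name__ == "__main__":
    # Hypothetical loads (MW) for one time slot over four consecutive weeks.
    reference = [4000.0, 4100.0, 4200.0, 4300.0]
    actual = [4120.0, 4200.0, 4350.0, 4400.0]
    alpha = best_multiplier(reference, actual)
    print(f"alpha = {alpha:.4f}, SSE = {sse(alpha, reference, actual):.1f}")
```

A closed form exists here only because the toy model is linear in α. With exponents and sign-dependent terms the stationarity conditions become non-linear, which is consistent with the paper's use of iterative non-linear programming techniques for the full problem.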
Hence, with a proper step-by-step protocol of extracting weather and demand data, seasonality trend, and using an efficient tool for optimisation of α and β factors, the best possible load forecast can be achieved with the least quantum of deviations.

Forecast results and discussion
The weather biasing technique formulated for short-term load forecasting effectively incorporates the impact of various weather components at an hourly interval. Their individual impact is estimated, and the multiplication factors (α, β, and γ) are assigned to produce the best possible forecast for the upcoming day. These multiplication factors are numerically optimised through a set of repeated iterations. Without weather biasing, precise anticipation is not possible because the weather is a critical factor, and neglecting it may result in a significant gap between demand and availability. A weather biasing strategy therefore provides a way to incorporate the impact of weather components into a new load forecasting algorithm, which is hence named the weather biased algorithm, as discussed in the previous section. The impact of the weather biased algorithm is discussed below. Using the various components of weather and demand data of Delhi for the years 2017 and 2018, the demand forecast has been carried out for the year 2018 as an out-of-sample test. The model is validated on different seasons. The performance of the proposed load forecasting algorithm is analysed using various error metrics and compared with the existing GAM theory-based method [28,41]. The variations in the summer, winter, and transition months are predicted using the proposed model and compared with a generalised additive model, as shown in Figs. 9-11, respectively. Table 2 compares the performance of the proposed algorithm by season, a segregation chosen to exhibit the proper functioning of the forecast model in the diverse climatic conditions of Delhi. If the error analysis is not convincing, the model must be re-evaluated.
However, it has been observed that the proposed model justifies its prudence in all aspects on being evaluated through four different performance metrics.
The deviations in the forecast as seen in Table 2 are due to multiple reasons, a few of which may be shifting weather patterns leading to a considerable demand pattern shift in adjacent years.
Another perspective used to analyse the performance is to visualise the uncertainties across the actual demand and the forecast computed using the proposed model, and then to compare them with the output of the benchmark GAM. For this purpose, the aspects chosen were standard deviation (σ) and skewness (SK_p) [42,43]. Fig. 12 shows the uncertainty analysis carried out on the different months of the year 2018.
The standard deviation and skewness for the forecast results of the proposed model were quite similar to the actual demand, while the performance of the benchmark model (GAM) was seen to degrade during the transitional period. As a post-facto analysis, the correlation was tested over the actual demand data of 2017 and 2018 and a comparison was drawn against weather during the same intervals. As shown in Fig. 13, the weather profile deeply affected demand. The months of transition, especially April, had the least correlation in both the profiles (actual demand and weather) for 2017 and 2018. Another sudden dip was noticed in the weather of October. Although this dip was less than the former, still it brought an identical kink in the demand profile. Figs. 12 and 13 further verified the performance of the proposed model from uncertainty analysis point of view and clearly showed how the proposed model outperformed the benchmark model GAM.
The marginal difference between the actual demand and the demand forecasted by the proposed method arises from the unpredictability of the WCs, which can affect forecast accuracy, for example through a large difference between the day-ahead forecasted weather and the actual weather recorded by the meteorological department. The effect of climate change leads to a shift in the seasons, which is also responsible for extreme weather events. For example, thunderstorms of varying intensities and durations lead to the imposition of power cuts to safeguard equipment and appliances. Apart from the aforementioned factors, there are still multiple points in Delhi state where remote terminal units are either not installed or go out of service due to power failure; for the affected time intervals, the last recorded measurement is displayed until the equipment is manually refreshed.

Conclusion
In this paper, a short-term load forecasting algorithm based on the weather biased technique has been proposed. While working on demand forecasting, emphasis has been placed on day-to-day variations in demand and temperature patterns. These modalities are intended to provide prudence in real-time operations and to support the development of a universal model. The developed model incorporates the impact of WCs with numerically optimised factors. An intensive study has been performed on various factors that have a direct impact on load demand forecasting. Due cognisance has been given to the suitability of the model with the relevant regulations of the Indian power market. Several levels of iterative techniques have been used in consonance with mathematical methods such as the generalised reduced gradient non-linear method, the simplex method, and other non-linear programming techniques. The proposed algorithm has outperformed the benchmark GAM across all seasons of the year, as shown through various performance metrics and through statistical and uncertainty analyses. The accuracy of the proposed algorithm can be further enhanced if more authentic demand data and better-quality weather data can be obtained from the respective sources.