Stochastic production simulation for generating capacity reliability evaluation in power systems with high renewable penetration

There are increasing challenges in the power industry to plan a system that provides adequate generation capacity to sustain the load when the renewable energy penetration level is extremely high or 100%. This study proposes a stochastic production cost simulation method to evaluate the generating capacity reliability in power systems with high renewable penetration. In comparison with conventional approaches, such as planning reserve margin and probabilistic assessment, the proposed method can consider hourly chronological characteristics of the system operation, which is signiﬁcant in the planning process for integrating new resources, such as storage, demand response, and renewables. The method was tested on the modiﬁed ISO New England system based on two scenarios: 45% renewable penetration and 100% clean energy. In addition, several sensitivity analyses were conducted following each scenario. Simulation results indicated that the proposed method can be used to quantify the reliability value of the system under various renewable penetration levels. When the system is not reliable, this method can be used to determine additional capacity required to ensure system reliability.


INTRODUCTION
Generating capacity reliability, also referred to as generation resource adequacy (RA), is a term that is frequently used by utilities to evaluate the availability of sufficient resources to meet the system demand under all but the gravest scenarios. Lack of generation capacity may have severe implications for the system, including load shedding, extreme price spikes, and rolling blackouts. An example of insufficient generating capacity that had a severe impact was the California Electricity Crisis that occurred in [2000][2001]. Subsequently, the Californian legislature has acquired load serving entities to maintain adequate physical generating capacity (including electrical demand response) to meet the load requirements, such as peak demand and reserves. Over the past decade, power systems worldwide have been transforming to accommodate cleaner energy resources, embracing a more variable resource fleet in the system. Under these circumstances, several countries and regions have set up ambitious renewable energy targets. United States, California has committed to achieving 100% clean energy by 2045 in the Senate Bill 100 [1]. Under this law, 60% of the power purchased by utilities in California must come from renewable resources by 2030, and the additional 40% should come from "zero-carbon" resources by 2045. In Hawaii, oil accounted for more than two thirds of generating capacity in 2018; however, the state has committed to generating 100% of its power from renewable energy by 2045 [2].
The increasing renewable penetration level will substantially alter the operation and planning of the electrical grid over the next several decades. A challenge faced by the current power industry is to ensure RA alongside the changing generating mix and supply. Renewable energy resources, such as wind and solar, are naturally variable and stochastic [3,4]. Due to historical reasons, numerous power systems in the world are still using planning reserve margin (PRM) [5] to evaluate the generating capacity reliability. This method may be effective in traditional power systems with a limited amount of renewable energy resources. However, in future power systems where the renewable penetration is high, using this method may lead to inaccurate results because the PRM solely considers the peak system demand of a year without considering the detailed hourto-hour operational difficulties of a system. In several situations, the system may have sufficient capacity margin, but is still not able to meet the demand due to operational constraints, such as insufficient ramps to meet the net load variability. There is an increasing need in the power industry to use more advanced methods, such as stochastic production simulation for generating capacity reliability evaluation.
Recently, the topic of ensuring power system generating capacity reliability considering the impacts of renewable energy resources has been receiving attention from numerous publications. A method to assess the adequacy of generating systems containing wind power by considering wind speed correlation was proposed [6]. The adequacy of generating systems considering the impact of the operational strategies of storage and hydro energy resources were described in [7], where Markov chain was used to model the system operational reliability. Reference [8] studied the impact of heterogeneous expansion of renewable energy on RA in interdependent electricity markets. However, none of these studies focus on the development of a stochastic production simulation framework to evaluate the generating capacity reliability under extremely high or 100% variable energy resource (VER) penetration with a realistic dataset. Reference [9] developed a Monte Carlo simulation method to demonstrate the impact of uncertain parameters affecting the reserve margin of a system, which was a conventional approach for RA modelling. Reference [10] developed a modelling framework to investigate the impact of high penetration levels of VERs and different market designs on achieving RA. A probabilistic methodology for integrated reliability evaluation considering RA and dynamic security assessment in a unified framework was proposed in [11]; however, they did not adopt a stochastic production cost simulation framework, and thus may have underestimated the impact caused by the uncertainty of VERs.
The contributions of this paper include four key highlights. Firstly, the detailed formulations of the stochastic production cost simulation model are presented. Secondly, a practical method for the modelling of random variables to generate simulation scenarios is introduced. Further, the proposed method was implemented on the modified dataset of an actual power system in the United States. Subsequently, several extreme scenarios with 100% renewable energy penetration levels were investigated. The simulation results presented in this paper should serve as a useful guideline for researchers, system planners, and policymakers to design 100% renewable energy power systems in the future.
The rest of this paper is organized as follows. Section 2 introduces two traditional generating capacity reliability evaluation methods. Section 3 details the stochastic production cost simulation method. Section 4 describes the modified ISO New England test system and dataset used for simulation. Section 5 presents the case study results. Finally, Section 6 concludes this work.

Planning reserve margin approach
PRM is a widely used metric for RA in the planning process, which is calculated as the percentage by which the installed capacity exceeds peak demand. A typical PRM study would only need to perform a simple calculation. However, the PRM method is still used in utilities. In California, the public utilities commission requires load serving entities to hold sufficient capacity to meet their peak load with a 15% reserve margin for at least a year ahead [12]. At the midcontinent independent system operator, a reference PRM of 17.1% was used for the 2018-2019 planning year; in other words, the total installed capacity had to be 17.1% higher than the peak demand of the system [13]. The PRM is a relatively old metric that was developed in the 1940s, but is still widely used. With the ongoing development of the power industry, the drawbacks of this capacity-based metric have emerged, including (i) inaccurate assessment of the performance of energy limited resources, for example, hydro capacity with limited water resources; (ii) it does not account for the forced outage rates (FORs) of generators; (iii) it does not consider the operational characteristics of emerging technologies, such as energy storage and demand response resources. To overcome these drawbacks, the probabilistic assessment approach was proposed for power system generating capacity reliability evaluation.

Probabilistic assessment approach
During the initial stages of its development, the probabilistic assessment approach was induced by the impact on RA evaluation due to different failure rates of generation resources. Two otherwise identical systems with different unit failure rates cannot achieve the same reliability level. The most popular metrics used for RA evaluation include the loss of load expectation (LOLE) and loss of load probability (LOLP). LOLP is the probability that the demand will exceed the capacity during time horizon T, that is, where D is the system demand, C T is the installed capacity of the system, and F D is the cumulative distribution function (CDF) of demand. LOLE is the expected number of time units that demand exceeds the generation capacity. The relationship between LOLE and LOLP is LOLE = LOLP × T where T is the time horizon, which is typically 1 year. Both LOLP and LOLE are generating reliability metrics from a system level. Another important metric, the effective loadcarrying capacity (ELCC), applies to individual generators. It is usually used to calculate the capacity value of a resource. The ELCC of a generator is defined as the measure of additional load that the system can supply with the generator without changing the system reliability [14,15].
The probabilistic assessment approach is an improvement to the PRM method. However, it has limitations. Firstly, although this approach models the uncertainty of the system, it assumes that the uncertainty models are precise. For example, in Equation (1), the CDF of demand is calculated from the load duration curve, which requires accurate load forecast and unit FORs. Secondly, to model the FOR, a two-stage outage model of units is typically used. The outage probabilities are convolved with the load duration curve to calculate the LOLP [16]. When the system has numerous units and/or the units have several derated states, the convolution method is computationally intensive. Further, the probabilistic assessment method disregards hourly chronological characteristics of the system operation, which is becoming substantial with the vast deployment of new generation technologies, such as batteries, distributed generation resources (DERs), and demand response resources. To overcome these drawbacks, a stochastic production cost model (PCM) was proposed and has prevailed in the industry for generating capacity reliability evaluation over the past decade. The next section outlines the details of the stochastic PCM-based method.

3.1
Overview of production cost model The PCM was developed into an hourly chronological unit commitment and economic dispatch simulation. It captures the costs of operating a fleet of generators, with the objective of minimizing costs, while simultaneously enforcing a wide variety of operating constraints. The advantages of the PCM method include: (i) simulating all the hours in a year and not just the peak hours, thereby providing more details of the system operation; (ii) enabling complete utilization of the load, solar, and wind forecasts. With the development of advanced forecasting techniques, system planners are currently more confident in making decisions for future systems. The forecasts are captured in the PCM as they are input to it; (iii) modelling complicated system conditions, such as tie-line flow with neighbourhood systems, transmission network congestions, and large-scale integration of behind-the-meter DERs. However, the PCM method has a few disadvantages, including: (i) it may require a significant amount of simulation time when the system is large; (ii) the simulation results significantly depend on the quality of the input data, thereby necessitating a data validation process [17]. Figure 1 shows the typical input and output information of the PCM. At the centre of the model are the supposed security constrained unit commitment (SCUC) and economic dispatch (SCED) algorithms. The input data to the PCM includes the physical parameters of the generators, the economic data, FORs, operating requirements, network data, and time-series data for wind, solar, load, and reserve requirements. The output information of the PCM includes the system operation costs, locational marginal pricing, commitment and dispatch results of

Production cost model formulations
As illustrated in Section 3.1, the SCUC and SCED algorithms are key models for production cost simulation. Usually the SCUC model is operated regularly to determine the commitment status of resources. Then, the SCED model is operated for specific hours (or subhours if the resolution is less than 1 h) for obtaining the dispatch and system prices of the resources [18][19][20]. In this section, the details of the SCUC model are presented; the SCED model can easily be obtained by fixing the binary variable in the SCUC model. The SCUC model can be formulated as follows. The objective function is: where t is the time horizon, g is the generator index, k is the reserve category, y is the unit start up binary variable, p is the unit generation level, r is the cleared reserve quantity, C SU g,t is the start up cost of unit g at time t, C P g,t (⋅) is the energy generation cost function, and C R g,t is the reserve cost of unit g at time t. In this study, three types of reserves are modelled, including regulation (REG), spinning (SPIN), and nonspinning (NSPIN) reserves; thus, k ∈ {REG, SPIN, NSPIN }. system t is the load balance violation of the system at time t. It is a slack variable used to quantify the balancing violations. The value of lost load is usually set high (e.g. $5000/MWh). The term OtherPenalties represents additional penalty functions, such as transmission line limit violations, energy storage capacity violations, and reserve violations. The objective function in Equation (1) minimizes the overall system operational costs plus the operational penalties in the study horizon T. Subject to the following constraints: The hourly power balance constraints: where n is node, load n,t is the load connected to node n at time t, and loss represents the transmission loss. Hourly transmission limit constraints: where l represents the transmission line, F P l,t (⋅) is the DC power flow equation on the transmission line, and F l,t is the power flow limit on the transmission line. The load n,t is constant except for demand response resources.
Regulation reserve requirement: where R REG is the regulation reserve requirement of the system. Regulation plus spinning reserves requirement: where R SPIN is the spinning reserve requirement of the system. Operating reserves requirement: where R NSPIN is the nonspinning reserve requirement of the system.
Resources capacity limit and ramping constraints: where u g,t is a binary variable to demonstrate the commitment status of generator g at time t. P g,t and P g,t are the maximum and minimum output limits of generator g at time t, respectively. Ramp UP g,t and Ramp DOWN g,t are the upward and downward ramp rates of generator g at time t, respectively. State-of-charge (SOC) constraints for energy storage resources (ESRs) [21]: where i is the index of ESRs, I D is the resolution of the simulation, is the charging efficiency, P ESR i,t is the cleared quantity of generation when the ESR is in generation mode, L ESR i,t is cleared quantity of load when the ESR is in generation mode, SOC Initial i is the initial state-of-charge for ESR i, and SOC Max i is the maximum state-of-charge for ESR i.
Other constraints to the PCM include the minimum up and down times of the generation resources, unit commitment logic, and maximum energy per day for energy-limiting resources. However, to maintain the conciseness of this study, these constraints have not been highlighted. Readers can refer to ref. [22] for the details of those constraints.
The stochastic production cost simulation process demonstrated in Figure 2 is based on the PCM represented in Equations (1)- (12). If there is no balancing violation in the system, the variable system t in (1) is zero. Conversely, the number of hours with system t ≠ 0 in the simulation horizon is summed to obtain the loss of load hours (LOLH). Accordingly, there are two methods based on LOLH, as illustrated in Section 3.4, to calculate the LOLE of the system.

Modelling of random variables
In RA studies, random variables include load, wind generation, solar generation, and the FORs of resources. A challenging aspect of applying the stochastic production cost simulation is to generate a set of discrete scenarios that could represent the stochastic processes of the random variables. Usually, this is achieved by using a stochastic procedure to generate a scenario tree. In this study, we used the mean reversion stochastic process (MRSP) [23,24] to model the stochastics of load, wind, and solar. The historical data of several years were used to estimate the parameters of the MRSP equations. Once the parameters were determined, a set of different scenarios can be generated through the stochastic process. The details of the calculation process are shown in the appendix. The modelling of the FOR is different from the above process. First, the FORs for each generation unit were obtained from historical data. Then, the outage hours were generated randomly and independently through the simulation horizon. Consequently, the percentage outage of hours was close to the FORs of the resources.

Stochastic production simulation process
The stochastic production simulation process is shown in Figure 2. A total of N (usually no less than 100) scenarios were constructed to represent the stochastics of the random variables. The production cost simulation runs N times by iterating all the scenarios. Note that the way the scenarios are generated can have a significant impact on the results. To mitigate this issue, we can simulate a sufficiently large number (e.g. tens of thousands) of scenarios. However, this would require a significant amount of computational power. The purpose of this study is not to guide the planning process in the decisionmaking of real power systems, but to demonstrate the feasibility of the proposed model with a limited number of simulation scenarios, while obtaining insightful conclusions from the simulation results. Hence, only 100 scenarios were simulated for the stochastic PCM.
The process shown in Figure 2 is called the Monte Carlo simulation, from which the LOLE index can be calculated. At the centre of the process in Figure 2 is the PCM developed in Section 3.2. All the simulation cases in Section 5 employ the algorithm shown in Figure 2 to evaluate the generating capacity reliability. The North American electric reliability corporation (NERC) has established the supposed "1-day-in-10-year" LOLE principle for power systems in North America [25]. This LOLE criteria is based on days/year instead of hours/year, primarily because it originated from the PRM method, where the peak load was calculated for each day. However, the PCM is usually run with a 1 h resolution, and 1 year horizon. Thus, it is necessary to convert the days/year principle of the NERC to the equivalent hours/year number obtained from the PCM results. Generally, there are two ways to achieve this: • Method 1: Convert the 1-day-in-10-year to x-h-in- 10-year where x is the number of hours with lost load. Billinton et al. [26] showed that the 1-day-in-10-year principle was equivalent to 7-h-in-10-year, or 0.7-h-in-1-year. Consequently, if a total of N scenarios is simulated, the maximum number of hours that contains lost load is 0.7 × N to meet the NERC requirement. If the total LOLH is less than 0.7 × N, the system is reliable. Otherwise, the system is unreliable. • Method 2: Adhere to the days/year principle but count the days (instead of hours) with lost load from the production cost simulation results. Under this assumption, a day with load loss for 1 h and that for 24 h are equivalent, both of which are counted as one day of lost load. If a total of N scenarios is simulated, the maximum number of days that have loss of load is 0.1 × N to meet the NERC requirement.

Base model description
In this study, we performed simulations on a 76-unit 8-zone system, which was originally developed by the author in [27,28] based on the structural attributes and data from ISO New England (ISO-NE). The ISO-NE is an independent nonprofit regional transmission organization. The ISO-NE energy region is divided into eight load zones namely, Connecticut, Maine, New Hampshire, Rhode Island, Vermont, Northeastern Massachusetts/Boston, Southeastern Massachusetts, and Western/Central Massachusetts. Simplified transmission modelling was adopted to reduce computation time as this study is majorly focused on evaluating the generating capacity reliability. After the development of the ISO-NE 8-zone system, the generation resource mix in the ISO-NE system had some changes [29], including: (i) more natural gas generators were added to the generation fleet; (ii) a few coal units were retired; and (iii) more renewable energy resources were built. To reflect the changes in (i) and (ii), we further calibrated the existing thermal generation data in [27] and [28] to match up to the 2019 generation mix of ISO-NE, where the capacity of coal units was reduced to 900 MW, and the capacity of natural gas units was increased to 15,900 MW. The total installed capacity in 2019 was 31,209 MW, with natural gas being the dominant resource. To reflect the change in (iii), the wind and solar capacity was increased to 1400 MW and 589 MW, respectively. The installed capacity by resource type in 2019 is shown in Table 1.
In the base model, the FORs for thermal generators was obtained from the NERC generating availability data system (GADs) [30]. GADS is recognized as a valuable source of reliability information for total unit and major equipment groups and is widely used by industry analysts in numerous applications. The unit FORs vary by the type and size of generation. Additionally, system operating reserves are considered in the base model. The requirement for the regulation reserve is 300 MW. The requirements for the spinning and nonspinning reserves are both set to 3.5% of the load each hour.

Parameter estimation for random variables
As shown in Section 3.3, the random variables are modelled with the MRSP in Equation (A.1). The historical data for load, solar, and wind from 2017-2019, which can be downloaded from the ISO-NE website [31], were used to estimate the parameters. A similar method to reference [24] was adopted for generating the stochastic values of load, solar, and wind in multiple scenarios. The features for load, solar, and wind are different. Load has both repetitive daily and weekly profiles. However, solar only has repetitive daily profiles. Wind generally does not have repetitive profiles, as the wind output on two consecutive days may significantly vary. Some techniques were used to capture these features. Firstly, to capture the weekly repetitiveness of the load, the 2017-2019 historical data were aligned with weekly patterns. For example, the first day of 2019 was a Tuesday. Hence, for the 2017 and 2018 historical data, the first day was chosen as the first Tuesday of the corresponding years.
Thereafter, to capture the daily repetitiveness of the load and solar, hourly ratios of the historical data relative to the benchmark year were used. For example, if the year 2019 was used as a benchmark, then the hourly load and solar ratios can be derived as follows: where  (Wind Y,m,d,h ∕̂), whereî s the calculated parameter for wind in Table 2. Therefore, the stochastic wind output for target year Y, month m, day d, and hour h is (Ŵind Y,m,d,h × Wind Y,m,d,h ∕̂). Again, this procedure was repeated N times to generate N different scenarios. Figure 3 shows 20 stochastic load profiles and the deterministic load profile (shown as the thick dark curve) over three consecutive days of the peak load week.

Case study scenarios
The above model was designed to be tested on two different scenarios: a high renewable penetration scenario (Scenario 1) and 100% clean energy scenario (Scenario 2). The generation capacity data for the first scenario was obtained from the ISO-NE capacity plan for year 2029 [29], where the capacity of the renewable penetration level was 45%. The second scenario was developed based on Scenario 1 by increasing the wind and solar capacity to a certain level to achieve the clean energy penetration level of approximately 100%. The details of the two scenarios are discussed below.

Scenario 1: A 2029 45% renewables case
In the ISO-NE, the state renewable portfolio standard (RPS) requirements promote the development of renewable energy resources to serve the retail load using renewable energy. The generation queue was developed in the ISO-NE region to meet the requirements. Wind generation dominates new resource proposals and solar power ranks second in the generation queue, thereby taking the total wind and solar capacity in 2029 to 15,656 MW and 6711 MW, respectively, as listed in Table 3.
Notably, only utility-scale solar photovoltaic (PV) is included in the solar capacity. The PV that is connected to local distribution utilities, or the behind-the-meter PV, is not included in the solar capacity. Another important change to the generation mix of ISO-NE in 2029 is the retirement of conventional resources, including coal, oil, and nuclear. Compared to 2019, 2200 MW of oil capacity, the entire coal capacity, and 600 MW of the natural gas capacity will be exhausted by 2029. The third change is the increase in ESRs. As shown in [29], the capacity of ESRs would be 2400 MW in 2029. The generation capacities by resource type for Scenario 1 are shown in Table 3. The capacity of wind and solar accounts for 45% of the total system capacity.

Scenario 2: A 100% clean energy case
We observed if the LOLE reliability standard can be fulfilled if the system held 100 percent clean energy. Till now, a scenario excluding all the thermal units was created (i.e. Scenario 2). The generation capacity mix of this scenario is presented in Table 3. All the natural gas, oil, coal, and nuclear units would be exhausted. The ESR capacity remained 2400 MW (more ESR capacity will be added in the subsequent section to conduct sensitivity studies). The solar and wind capacity was increased to 2.5 times of that in Scenario 1, taking the solar and wind capacity to 16,662 MW and 38,220 MW, respectively. The total installed capacity in Scenario 2 was 59,562 MW, all of which comprised clean energy. The simulation results for both scenarios will be demonstrated in the next section. The method developed in Section 3 will be used to identify whether the LOLE reliability standard is satisfied. If the reliability standard is satisfied, the thermal generation capacity will be decreased continuously in the scenario until the LOLE value reaches the 0.1-day-per-year reliability standard. The purpose is to observe the least possible thermal generation capacity to make the system reliable. However, if the reliability standard is not satisfied, the extent of additional capacity required to make it reliable will be estimated.

SIMULATION RESULTS
In this section, the stochastic production simulation results for the two scenarios in Section 4.3 are presented. In addition, under each scenario, a few additional cases were studied for conducting sensitivity analysis. The optimization problems were solved using CPLEX [32]. A small mixed-integer programming gap, that is, 0.01%, was adopted for all the simulations to increase the accuracy of the solution.

Results for 45% renewables case
The stochastic production simulation process in Section 3.4 was performed on Scenario 1. To generate the stochastic profiles for the random variables load, wind, and solar, we needed to first obtain their deterministic profiles. However, to the best of our knowledge, these profiles could not be obtained from public literature. Therefore, the authors developed deterministic profiles based on the data in the 2019 base case, where the chronological load, wind, and solar profiles could be obtained from historical data. From [29], it was shown that the peak value of net load (i.e. actual load minus behind-the-meter DERs) in 2029 was near to that in 2019 due to the increase of behind-themeter DERs. Accordingly, we used the 2019 historical load profile as the deterministic load profile in Scenario 1. In addition, compared to the 2019 base case in Table 1, the solar and wind capacity in Scenario 1 increased 11.4 and 11.2 times, respectively. Therefore, we multiplied the solar (wind) data in each hour of 2019 by 11.4 (11.2) to obtain the deterministic solar (wind) profile in Scenario 1. After building the deterministic profiles for load, solar, and wind, the method described in Section 4.2 was used to create 100 (i.e. N = 100) stochastic profiles for each variable. Figure 3 shows the first 20 stochastic load profiles developed with this method. Next, the stochastic production cost simulation was iterated 100 times, each running an annual 8760 h production cost simulation. The reliability metrics were calculated after 100 iterations were completed. In this study, Method 1 in Section 3.4 was adopted for evaluating the LOLE metric. In the total 876,000 h of the 100 scenarios, unserved load was found only in 21 h. Thus, the LOLE was 0.21 h per year (or 21 h per 100 years). Based on Method 1, the equivalent LOLE by day was 0.03 day-per-year. This is smaller than the 0.1 day-per-year NERC standard. Consequently, we concluded that the Scenario 1 case was reliable.
The above results reveal that additional thermal units could be retired, while still meeting the NERC reliability standard. We conducted some sensitivity studies to explore the extent of thermal capacity retirement so that the LOLE in Scenario 1 would be 0.1 day-per-year. Firstly, the case below was studied: • Scenario 1.1: retiring 1500 MW of oil capacity from Scenario 1, while capacities of other resources remained.
We started with retiring oil resources because they are generally the most expensive to operate and have the highest emission rates. After running 100 iterations with the PCM, unserved load existed in 96 h. The LOLE of the system using Method 1 was calculated as 96 ÷ 70 × 0.1, which was 0.14-day-pear-year. This was larger than the 0.1-day-per-year NERC standard, thereby dissatisfying the reliability standard, which implied that we were over-retiring the capacity of the oil resource. We could continue to increase the oil capacity to satisfy the reliability standard.
However, there was an approach to accelerate this process. Based on the simulation results of an unreliable case such as the one in Scenario 1.1, we could estimate the quantity of required capacity to make the system reliable. The procedure is described as follows. Firstly, plot the loss of load duration curve, as shown in Figure 4. The x-axis is the 96 h that have unserved load. The y-axis is the MW quantity of the unserved load in each hour. The duration curve was obtained by sorting the MW quantities of the 96 h in decreasing order. Secondly, calculate the maximum  Figure 1, the corresponding y-axis value on hour 70 was 140 MW, which implies that approximately 140 MW of additional capacity would be required to make the case in Scenario 1.1 reliable. A summary of the results for Scenario 1 and Scenario 1.1 is shown in Table 4. Note that the 140-MW additional capacity was an ideal number, which implies that when 140 MW capacity was added to the system, and when all the capacity was able to show up in the hours with unserved load, the system would become reliable. A second assumption is typically difficult to guarantee for two reasons: (i) units have FORs; thus, it is likely that they have outages during those hours. (ii) Even when the units do not have outages in those hours, they may still not be available due to other constraints, such as minimum down time and ramp rate limitations. Therefore, the actual quantity of additional capacity to make the case in Scenario 1.1 reliable should be larger than 140 MW. We conducted more sensitivity studies by increasing the oil capacity in incremental steps. Accordingly, the system became reliable when 300 MW of additional oil capacity was added to the case in Scenario 1.1.

Results for 100% clean energy case
The proposed stochastic production cost simulation process was then conducted on Scenario 2 for a 100% clean energy case. Although the total capacity was 59,562 MW, much higher than the deterministic peak load which was 23,988 MW, the system could still be unreliable because the wind and solar had low capacity factors and may not have been able to serve the load when needed. This was confirmed from the stochastic production cost simulation results, which showed that the unserved load existed in approximately 45% of the total hours in the 100 simulations. The LOLE of the system in Scenario 2 would be a prohibitive value. Consequently, we conducted the following sensitivity studies to reduce the LOLE of the system.  Test results showed that the number of hours that had unserved load accounted for approximately 40% of the total hours in the 100 iterations. The system was still very unreliable because energy storage is a passive type of resource, which means it can only absorb or discharge energy generated by other resources in the grid. In Scenario 2.1, the ESRs did not have sufficient stored energy to meet the load in approximately 40% of the total hours.
In Scenario 2.2, 15,300 MW of natural gas capacity was added to Scenario 2. The system no longer comprised 100% clean energy. However, the system may have 100% net clean energy if the interchange with neighbourhood networks is considered. Our simulation results showed that 74% of the total energy was supplied by renewables, 21% by natural gas units, and 5% by demand response and imports. The curtailment of wind and solar accounted for 24% of the total energy. It can be observed that the quantity of renewables curtailment was close to the energy generation from natural gas units. If the curtailed wind and solar could be exported to neighbouring grids, the net energy generated by clean units (including renewables, demand response, ESR, and pumped storage) was nearly 100%. The LOLE of the system in Scenario 2.2 was 1.75 days per year, as shown in Table 5. The system was still unreliable. Using the method demonstrated in Figure 4, 4362 MW of "perfect" capacity would be needed to make the system reliable. Figure 5 shows the number of hours with loss of load in each month. Most of the lost load events occurred in the summer (July and August) and winter (January and December), when system demand was high. Figure 6 shows the number of hours with lost load by hour-of-day. Most of the lost load events occurred in late afternoon and early evening when the load was high and solar energy diminished to zero. Hour 19 saw the largest occurrence of load loss events.
In Scenario 2.3, an additional 5000 MW of 4-h batteries were added to Scenario 2.2. The purpose was to see if batteries could help improve the system reliability level. As shown in Table 5, the LOLE was 0.42 day-per-year, which was still larger than the NERC requirement. Hence, the system in Scenario 2.3 was not reliable. The system would need 2872 MW of "perfect" capacity to become reliable. By comparing the results between Scenarios 2.2 and 2.3, it was found that both the LOLE and required capacity were reduced. This illustrated that batteries did help improve the reliability level of the system. Finally, we observed if long-duration batteries, such as 8-h duration, could be more beneficial to the system. Scenario 2.4 was designed to achieve this. Simulation results are listed in Table 5. Comparing the results between Scenarios 2.3 and 2.4, it was found that the reliability benefits owing to longduration storage batteries were minimal because most of the loss of load events in Scenarios 2.3 and 2.4 occurred in summer when the demand was high; however, wind and solar generation was extremely low in some of the 100 iterations. For example, for a rainy day, the solar generation in the system could be zero, although its installed capacity was 16,442 MW. In the evening, when solar generation was zero, there existed hours in which wind generation was extremely low. In these cases, all the energy would have been used to serve the load, instead of charging the batteries. The system cannot depend on batteries to store energy and release it when needed due to insufficient energy to recharge the batteries. Hence, long-duration batteries did not significantly relieve the loss of loads during those hours.

CONCLUSION
This paper presents a stochastic production cost simulation method to evaluate the generating capacity reliability in power systems with high renewable penetration. Compared to conventional approaches, such as the PRM and probabilistic assessment, the proposed method was able to consider the hourly chronological characteristics of the system operations, which is significant in the planning process for integrating new resources, such as storage, demand response, and renewables.
The stochastic model was first tested using two scenarios: a 45% renewables case and a 100% clean energy case (both percentages being capacity orientated). In addition, a few sensitivity studies were conducted for each scenario. The conclusions from the studies can be summarized as follows: • The system was reliable in the 45% renewables scenario. In addition, the system could further retire 1200 MW of capacity from oil units, while still satisfying the NERC reliability criteria. • In the 100% clean energy case, where all thermal units were retired and the capacity from wind and solar was approximately 55 GW, the system was unreliable with a large LOLE value. • When the system had 100% clean energy resources, it was extremely difficult or at least prohibitively costly (e.g. building excessive ESRs) to satisfy the reliability standard. Thermal generation resources might be needed to ensure RA with moderate costs. • By re-examining the concept of 100% clean energy, results from the simulation of this study indicate that the 100% clean energy concept does not necessarily mean the system has no thermal resources. By using the "net energy" concept, if the energy generated by thermal generators can be mitigated by the clean energy exported to neighbouring grids during other hours when the system is generating excessive clean energy, the "net energy" from renewables could still be 100%. • In the 100% clean energy scenario, adding only a moderate quantity of storage capacity would not make the system reliable. This is because storage is a passive type of generation resource. When there is insufficient residual energy to charge it, it cannot store sufficient energy to meet the unserved loads. • In the case when the system had 100% clean energy, adding long-duration batteries (e.g. 8 h) only slightly increased the reliability level of the system compared to adding shortduration batteries (e.g. 4 h).
Future work on evaluating the system reliability with an increasing penetration of renewable energy would include the combined simulation of RA and dynamic stability to make the model more practical. In addition, the reliability of distribution systems is another noteworthy topic to be investigated.