Scalable design methods for online data‐driven wide‐area control of power systems

Funding information National Science Foundation, Grant/Award Number: ECS-1711004 Abstract A novel online system-identification-based control design for damping inter-area oscillations in power systems using supplementary wide-area excitation control is presented. The identification is based on a partly reduced-order and a partly full-order model of the grid. Assuming the grid to be divided into coherent areas with phasor measurement units located across the grid, the coherent states of the generators are averaged while the noncoherent states, such as the internal states of each power system stabilizer, are not. A hybrid state vector consisting of these averaged and non-averaged states is computed, and used for identifying a hybrid linear time-invariant state-space model. A linear quadratic regulator based on this hybrid model is, thereafter, designed. The reduced-order part of the model is found to save significant amount of online time for both learning and control, while the full-order part serves to enhance closed-loop stability. N4SID with randomized singular value decomposition (rSVD) is used for making the identification loop fast. The reduced-dimensional controller is finally implemented using a broadcast control strategy. The design is extended to non-linear models of power systems using Carleman bilinearization. Results for linear and bilinear control designs are validated using the IEEE 68-bus power system model.


INTRODUCTION
Wide-area control of power systems using Synchrophasors has seen a suite of control algorithms developed so far [1][2][3][4]. The majority of these algorithms, however, are model based and are performed offline. With the growing levels of uncertainty in grid operations such as constantly changing loads, wind and solar penetration and changes in grid topology, system operators currently are inclining more towards model-free and online control approaches, but the greatest challenge standing in the way is the gigantic size and spatial scale of any real-world grid. As shown in [5], using online phasor measurement units (PMU) measurements to identify an entire grid model, and employing that model to take online control actions is practically impossible as the identification itself can take hours to complete even with the the most advanced computational facilities.
To circumvent this scalability issue of learning-based control for large and complex power systems, in this paper we propose to identify a severely reduced-dimensional grid model by mak- ing use of time-scale separation property of the grid dynamics, using high-resolution and high-fidelity PMU measurements. This property is commonly seen in almost all practical grid models, either through coherency of the synchronous generators [6], or through the sparsity of the grid topology, or through controller time constants. Based on this identified reduced-order model, we design a reduced-dimensional linear quadratic regulator (LQR), which is implemented through an inverse projectionbased broadcast control strategy. The challenge, however, arises from the fact that not all states of a generator exhibit timescale separation, and, thus, cannot be reduced to an aggregate state. For example, from [6] we know that the electromechanical (swing) states and excitation system states exhibit coherency, and can be expressed as aggregate states by means of simple inertia-weighted averaging. The states of the power system stabilizer (PSS), however, do not exhibit this property. Thus, ideally speaking, the PSS states cannot be expressed as an aggregate state. In fact, forcing the reduced-order model to contain an aggregate PSS state can result in a controller which when IET Gener. Transm. Distrib. 2021;15:2085-2100.
wileyonlinelibrary.com/iet-gtd 2085 implemented via the broadcast rule can destabilize the original full-order grid model. In summary, designing a full-order controller based on a full-order grid model will result in optimal performance but will take an unacceptably long learning time, while on the other hand, designing a broadcast controller based on an exact reduced-order grid model will take significantly lesser amount of learning time but may cause closed-loop instability. We propose an approach that is midway between these two extreme scenarios. Assuming the grid to be divided into coherent areas, we average the coherent (swing and excitation) states of the generators, but do not average the non-coherent PSS states. A hybrid state vector consisting of a mixture of these averaged and non-averaged states is computed following any disturbance, and used for identifying a hybrid state-space model, followed by a LQR design based on this hybrid model. The reduced-order part of the model saves significant amount of online time for both learning and control, while retaining the full-order PSS states is found to enhance closed-loop stability. Singular value decomposition (SVD) and least squares are used in the internal steps of the identification. N4SID with randomized singular value decomposition (rSVD) is used for making the identification fast [7]. Results are validated using the 9-bus Kundur power system model [8] and the IEEE 68-bus power system model. The second half of the paper extends this reduced-order learning-based control design to non-linear models of power systems. Identifying the linear model and implementing the linear state-feedback controller may not suffice when the system may have non-linear loads, or may be too close to voltage instability, for example, where it exhibits significant amount of non-linearity in its dynamics. Recent papers such as [9,10] have considered control design for non-linear power system models using Koopman mode analysis, while [11] has designed approximate controllers using Carleman linearization. However, these are all model-based control designs, and do not consider any model reduction operation. In Section 6 of this paper we will show how the idea of learning a reduced-order model that we developed for the linear case can be extended to the Carleman linearized model, and subsequently develop a stepby-step iterative approach for designing a bilinear quadratic controller for the identified non-linear model containing both first-and second-order terms. The controller is validated using the IEEE 68-bus system with several non-conforming nonlinear loads to show its superior performance compared to the linear controller.
While learning has been used in Synchrophasor applications on several occasions, majority of them are for open loop problems such as mode estimation [12], parameter estimation [13,14], disturbance monitoring [15], etc. Very little work exists on how learning can be used for control, and that too taking the bottleneck of scalability into account. Recent works such as [16] have integrated model-based methods with datadriven methods for addressing frequency regulation, but not for LQR-based damping control. Other papers such as [17] have used direct adaptive control methods for learning real-time controllers using reinforcement learning. This approach, however, has two main drawbacks. First, during the learning phase the con-FIGURE 1 Synchronous generator with excitation system troller may cause the system to behave erratically. Second, these designs rest on the idealistic assumption that a stabilizing controller can be found for the system model in order to start the iterative process for online control. Our proposed indirect approach does not have either of these drawbacks.
Our main contributions can be summarized as follows: 1. System identification of a hybrid model of a power system. The term 'hybrid' refers to the fact that the model is composed of both reduced-order and full-order dynamics. 2. Design of a reduced-dimensional LQR-based wide-area damping controller using the hybrid model, followed by a broadcasting technique for implementation. 3. Extension of the approach to non-linear Carleman models of power systems under high load stress.
The overall approach is termed as model free as we identify the reduced-order model only from measurement data.
The rest of the paper is organized as follows. Section 2 describes the problem formulation and identifying the cluster of generators for the reduced-order model. Section 3 presents the online algorithm for identification of the reduced-order linear time-invariant (LTI) system followed by its LQR control design in Section 4. Section 5 introduces in brief the concept of Carleman bilinearization for non-linear systems and design of quadratic controllers for such models. Section 6 implements the idea to power system taking hints from the actual model equations of a non-linear power system. Section 7 presents the simulations and Section 8 concludes the paper.

Model description
Consider a power system with n synchronous generators. Every generator is modeled by a standard sixth-order dynamic model, as shown in Figure 1, with states as phase angle i , rotor speed i , transient emf due to field-flux linkage E qi , transient emf due to field-flux linkage in q-axis E di , sub-transient emf due to flux linkage in d-axis damper 1di , and sub-transient emf due to flux linkage in q-axis damper 2qi . The detailed non-linear dynamic equations can be found in [8]. For control applications, the stator transients are assumed to be much faster compared to swing dynamics, and follow from a standard set of algebraic equations. For the excitation system, we consider the FIGURE 2 Implementation of learning and control scheme IEEE-DC1A type exciter [18] with first-order dynamics of field voltage E fdi , and a speed input PSS with third-order dynamics given by Details of these models and the corresponding notations can be found in [8], and are skipped here for brevity. The system model is linearized about the load flow solution. Let us define the small-signal generator states as 6 , the exciter state as x E i = ΔE fdi ∈ ℝ, and the three PSS states as x pss i = [x (1) pss i x (2) pss i x (3) pss i ] T ∈ ℝ 3 for the ith generator, i = 1, … , n. The Kron-reduced model is written aṡ where A ∈ ℝ (7n+3n p )×(7n+3n p ) , B ∈ ℝ (7n+3n p )×n are the state matrix and input matrix, and and u(t ) ∈ ℝ n is the exciter input. n p is the total number of generators with PSS (as not all generators may have PSS). The symbol [ ; ] denotes a column vector of its arguments. We assume that the matrices A and B are unknown, but the pair (A, B) is stabilizable. PMUs are assumed to be located at geometrically observable set of buses such that the voltage and current phasors at every generator bus are computable from these measurements. Details of such PMU deployment can be found in [19]. A decentralized Kalman filter (KF) based on [20] is assumed to be present at every generator, which using these local bus phasors generates an estimate of all states of that generator, as shown in Figure 2. For implementing the KFs we will assume that the generator parameters are known individually to each generator owner, and that the uncertainty in the system model arises from variations in the grid-level parameters such as topology and loads, all of which collectively make A and B in (2) unknown. Since the KF estimates of x(t ) are highly accurate, we will ignore the error in the estimates throughout the paper, and implicitly assume that we can measure x(t ) that will be needed for both identification and control.

Problem statement
The goal is to design an LQR-based damping control input u(t ). For this, one would first need to identify A and B. Considering that the order of the system n can be in the hundreds, this identification, however, will take an unacceptably long amount of time. We, therefore, propose to identify a reduced-order model, as presented next.

Model reduction using coherency
We assume that the power system is divided into r coherent areas. Two generators G i and G j are said to be coherent if for a certain disturbance the generators swing together [6], that is, their rotor angles satisfy |Δ i − Δ j | < for every time instant, where 0 < ≪ 1 is a small specified tolerance. A smaller value of chosen tolerance gives a tighter coherent grouping. This definition can be used to impart a two time-scale separation in the system dynamics whereby (2) can be equivalently written asẋ following a similarity transformation col(x f , x s ) = Tx, where T is an invertible transformation matrix (for its detailed construction please see [6]), x f are the fast states, and x s are the slow states. A 11 through A 22 , and B 1 and B 2 are matrices of appropriate dimensions. Assuming = 0, the quasi-steady state of x f can be approximated as x f ≈ −A −1 11 A 12 x s − A −1 11 B 1 u, following which the slow dynamics in (7) can be written aṡ (8) where A red and B red are reduced-order matrices.
Our approach will be to identify (A red , B red ) using measurements of x s (t ), assuming that the identity of the coherent areas is known, and thereafter design an LQR controller based on this tuple. The identity of the coherent groups usually do not change very frequently, and therefore a nominal knowledge base at the ISO can be used to identify the generator indices belonging to each coherent area. If the ISO intends to update its coherency map then several offline data-driven techniques such as spectral clustering reported in the recent literature [21] can be used.

2.2.1
Aggregation of states Consider the power system to be divided into r < n coherent areas { 1 ,  2 , ..,  r }. The generator identities in each area are assumed to be known by the ISO. Following the coherency theory as in [6], we define the aggregate states as the area-wise average of the six generator states (x G i ) and one exciter state (x E i ). The seven aggregate slow state variables for  j are written as where n  j is the number of generators in area j and M i is the inertia of generator i. The symbol i ∈  j means generator i in the j th coherent area. Our strategy will be to identify a reduceddimensional grid model based on these seven aggregate states obtained from each area. The identification will be done in a centralized way at the ISO. We will not consider aggregation of any of the PSS states because they are not physical states, and more importantly, they do not exhibit coherency. This is illustrated in Figure 3 where the first PSS state of generators 1 and 2, respectively, denoted as x (1) pss 1 and x (1) pss 2 , for the IEEE 9-bus twoarea power system model are clearly seen to be non-coherent. The 7r-dimensional state-space model which approximates the FIGURE 3 Non-coherent PSS states of IEEE 9-bus system slow dynamics of (2) and is to be identified from the measurements of the 7r aggregate states following from (9) is written asẋ where, A red ∈ ℝ 7r×7r , and B ′ ∈ ℝ 7r×n . We will design u as where u red ∈ ℝ r is a reduced-dimensional control input to be designed using LQR, and P B is a broadcast matrix of the form The broadcast strategy indicates that all generators in area  j will receive the same control input (the j th element of u red ), but proportionally weighted according to its inertia relative to the total inertia of that area. We then write (10) aṡ In the identification stage, as seen from (13), we ignore the effect of PSS states in the slow dynamics as they have a low participation factor on the slow poles. However, for the control stage, we will shortly see that ignoring the PSS states during control design may cause closed-loop instability. We will, therefore, develop an augmentation method to include the PSS model in A red .

Implementation architecture
Before proceeding to identification, it is important to discuss the implementation strategy of how the ISO may receive the aggregate states. As mentioned earlier, we assume that the n generators are divided into r coherent areas. The set of coherent areas is denoted by  = { 1 , … ,  r }. We also assume that the grid is divided into number of regional transmission operators (RTOs), denoted by the set  = { 1 , … ,  }. We discuss the following two cases. Case 1: Consider the case when every coherent area  i is assumed to lie completely within a single RTO  j , that is, no coherent area belongs to more than one RTO. Let  ( j ) denote the subset of  that belongs to the region under RTO  j . In this case, the KF of each generator would send its states to the control center of the corresponding RTO. The aggregator or the phasor data concentrator (PDC) located at this control center would aggregate the states x  i , ∀ i ∈  ( j ) following (9). Thereafter, the RTO  j would send all the aggregates states x  i (such that  i belongs to RTO  j ) to the ISO. This case is illustrated in Figure 2.
Case 2: Consider the case when a coherent area  i may belong to more than one RTO. Denote the set of regions consisting of this coherent area  i as  (i ) . Each RTO knows the identity of the generators belonging to every coherent area in its own region. However, it may not know the global coherency structure. Therefore, each RTO  j ∈  (i ) sends the aggregate states for the generators belonging to its own coherent area as x  j  i according to (9). The ISO, on the other hand, knows the global coherency structure. Thus, once the ISO receives the aggregate states for  i from each RTO in  (i ) , it will compute the aggregate state as Once the area-wise aggregate states x  i are calculated, the slow state vector x s is constructed as in (10), and used for identification as explained next.

Identification of A red and B red
Following a small-signal disturbance, we first input a lowamplitude persistently exciting Gaussian noise to the exciter input u i of every ith generator, i = 1, … , n. Generators belonging to the same coherent area are excited with the same noise input, that is, from which we have u red,k = u  k , k = 1, 2, .., r. Using this u red and the estimated (or measured) aggregate states x s as in (10), we apply the subspace identification method N4SID [22] to identify A red and B red in (13). Traditional N4SID, however, suffers from a considerably large computational time complexity, which in our case would be (r 3 N 3 ), where N is the number of measurement samples of x s used for the identification. The computational time increases drastically with increasing number of samples and areas. Therefore, we instead propose the use of rSVD-based N4SID which has a computational complexity of (r 3 N log 2 N ), and therefore reduces the ID time significantly. Traditional N4SID would involve the formation of Hankel matrix H 0 and H 1 as where h i are the Hankel parameters and can be calculated from the measurement samples of x s and u red as in [22]. Since we are using all states as our output, the output matrix C red ∈ ℝ 7r×7r is the identity matrix. Once the Hankel matrices are computed, a rSVD is performed following [23] as where  r and  r are the extended observability and controllability matrices. H 1 can be further decomposed as where A r is a similarity transformation of A red as N4SID only preserves the input-output relationship and not the exact matrix structures. From (17) it follows that Similarly, one can compute B r and C r as in [22]. Note that these matrices maybe in the transformed subspace. To get the estimates in the subspace we are interested in, the following transformation is performed based on the knowledge of the output matrix being the identity matrix: Results comparing the ID time between regular N4SID and randomized N4SID will be presented in Section 7. An important point to keep in mind is that the trade-off for using randomized N4SID is that the randomization brings in error in the 2: Low-amplitude, persistently-exciting Gaussian random noise is injected to the excitation input of each generator following (15).
3: Generator states in each area are transmitted in real time to its corresponding aggregator; the aggregator computes the aggregate state vector x s as in (9), and sends it to the ISO as shown in Figure 2 over a chosen interval of time.
5: ISO computes A red and B red following (19).
Outputs:Â red ,B red computation of the SVD, which is given by where, E||H 0 − (U ΣV ) r || 2 is the expectation of the error norm over a randomly drawn Gaussian matrix, (U ΣV ) r is the SVD using the randomized algorithm, r+1 (H 0 ) denotes the largest discarded singular value of H 0 , e is Euler's number, and is an oversampling parameter. We refer the reader to [23] for the details of these notations. In general, the error norm decreases by increasing the oversampling, as shown in Figure 4 for the 9-bus power system model. The reduced-order system identification is summarized in Algorithm 1.

Augmentation of PSS model
The next step is to augment the identified reduced-order model with the known PSS model. Ignoring the PSS states in the reduced-order model can easily lead to an unstable closed-loop system due to conflicting oscillatory modes [24], as will be seen shortly in Section 7.1. Structurally speaking, the PSS model can be written aṡ where * denotes the known parameters (gains/time constants) of the PSS. It is important to note that the PSS model depends only on its own internal states and the generator speed. Assuming that the PSS states evolve over a relatively slower time-scale by which the fast modes of the generator speed are well damped out, we approximate the speed of the ith generator by the aggregate speed Δ A j of its area  j . In that case, the ith PSS model can be approximated aṡ We assume A pss i and B pss i to be known for all the n p PSSs. Also, note that only the exciter state (x E ) depends on the PSS states. Its dynamics can be structurally expressed aṡ where f i (x G i , x e i ), following the model shown in [8], is a linear function of the exciter state and the corresponding generator states. For area j , the aggregate exciter dynamics (x 7  j ) can, therefore, analytically be written aṡ The contribution of f  j (x  j , u  j ) is already accounted for in A red and B red , as identified in Section 3.1. To account for the first term in the RHS of (24) (i.e. the coupling of the exciter state with the PSS states), we define our augmented state vector x aug ∈ ℝ (7r+3n p ) as where x s follows from (13) and n p is the number of generators equipped with PSS. The augmented model is written aṡ whereÂ aug ∈ ℝ (7r+3n p )×(7r+3n p ) can be constructed from the identified A red and the known matrices A pss i , B pss i , and E pss i from (22) and (24).B aug ∈ ℝ (7r+3n p )×r is constructed asB aug = [B T red 0] T . For example,Â aug for the 9-bus system, assuming equal generator inertias for simplicity of writing, can be written as Note that the dimension of the augmented modelÂ aug which is (7r + 3n p ) × (7r + 3n p ) is in between that of the full-order model (7n + 3n p ) × (7n + 3n p ) and of the exact reduced-order model 10r × 10r (i.e. the model that considers PSS states as aggregates). For the 9-bus system, for example, these dimensions are, respectively, 26 × 26 for the augmented model, 40 × 40 for the fullorder model, and 20 × 20 for the exact reduced-order model. Thus, identifying the augmented model may take more time than the exact reduced-order model, but it provides a guarantee for closed-loop stability than the latter.

Controller design
Using the augmented matricesÂ aug andB aug we next design a reduced-dimensional LQR controller to facilitate inter-area power oscillation damping. The LQR problem is described as designing a state-feedback gain K r ∈ ℝ r×(7r+3n p ) such that the control input u red = −K r x aug minimizes subject to the plant dynamics in (26). The weight matrix Q 1 can be chosen as a block-diagonal matrix, where each block corresponds to each type of augmented state. The entries for the phase angles (Δ  i ) in the augmented state vector can be chosen to have a Laplacian structure [3]. The remaining blocks are chosen as diagonal matrices, with preferably higher weights for the augmented generator velocities than for excitation and PSS states as the former has higher influence on the damping factor. R ∈ ℝ r×r is typically chosen as a diagonal matrix with lower weights associated with the control inputs of areas that are closer to the source of the disturbance. For example, in our simulations we choose Q 1 for the IEEE 68-bus systems FIGURE 5 68-bus power system divided into five coherent areas as: The choice of R is dictated by the fault location. With the fault location shown in Figure 5, the weights corresponding to u red (1), u red (2), and u red (3) are chosen to be at least at 10% of that for the other inputs: The LQR controller follows from the solution of the algebraic Riccati equation (ARE) in the form of K r = R −1BT aug P. In general, LQR has a worst-case computational complexity of (n 3 ), where n is the order of the system. For the reduced-dimensional design, the complexity will therefore be only (r 3 ).

Broadcast control
Once the reduced-order feedback controller K r has been designed, there can be two potential architectures for implementing the final control signal, as follows: Architecture 1: In the first case, the ISO calculates the reduced-order control input as u red = −K r x aug ∈ ℝ r , and thereafter the full-order control input as where P B follows from (12). Each element of u(t ) is broadcast to the corresponding generator at every sampling time instant t . This architecture is, thus, equivalent to a centralized controller. Architecture 2: An alternate and perhaps more useful strategy can be that the ISO computes the full-dimensional feedback control matrix K as where V represents the averaging matrix such that x aug (t ) = Vx(t ), and does a one-time broadcast of the block-rows of K to the respective generators. Once each generator receives its control gain K i (which is the ith block-row of K ) it enables allto-all communication with the other generators to implement its control input as u i (t ) = −K i x(t ). This architecture pertains to a distributed implementation. The sparsity of K in this case will decide the structure of the wide-area communication network. An important point in the implementation of the broadcast control is to account for the communication delays. For architecture 1, we assume that the ISO is moderately distant from all the generators, and therefore the communication delay for broadcasting the control input will be almost same for every generator. Referring to [29], in our simulations we will set the upper bound of this delay as max = 300 ms. For architecture 2, on the other hand, we consider two kinds of delays [29], namely, intra-area delay s between generators belonging to the same coherent area whose typical value lies between 30 to 50 ms, and inter-area delay b between generators across different coherent areas whose value usually lies between 200 to 300 ms.

Motivation
So far we have identified small-signal models followed by linear control design. However, under certain operating conditions such as in the presence of non-linear non-conforming loads, or under high load stress, linear damping controllers may fail to guarantee closed-loop stability of the power system. For example, it was shown in [25] that when the system is stressed enough to be close to its voltage instability point, it becomes critical to consider higher-order models. The results in [26] showed that designing a linear controller for such stressed systems may potentially destabilize the system in case a contingency occurs. We cite a similar example for the IEEE 68-bus model. Figure 6 shows that when the load stress for this model is pushed to a limit where the power flow fails to converge, the reducedorder learning-based linear controller of Section 4 destabilizes the closed-loop system. Motivated by this fact, we, therefore, next propose to extend our control design to nonlinear dynamics. We use Carleman bilinearization [9,10] for this purpose followed by an iterative technique to develop optimal quadratic controllers for bilinear approximation of a power system model.

About bilinearization technique
A non-linear system of the forṁ where f (x) = col( f 1 (x), f 2 (x), … , f n (x)), x ∈ ℝ n , can be approximated up to the second-order terms in the Taylor series expansion of the non-linear function aṡ Here, n denotes the order of any general non-linear system, and is not related to the number of generators as used in Section 2. The state x = x − x o is the deviation of the actual state x from an equilibrium x o , A 11 ∈ ℝ n×n is the linear state matrix, and A 12 ∈ ℝ n×k is the quadratic state matrix, k = n(n + 1)∕2, and ⊗ denotes the Kronecker product. Analytically, we have evaluated at the operating point x o . Carleman linearization of (35) yields the bilinear model where We write (37) compactly as [ẋ where, with a slight abuse of notation, x 2 denotes x ⊗ x.

Bilinear LQR control
The LQR objective function for the bilinear model (40) is given by where Q 2 is defined as and R > 0. The goal is to design u = −K 11 x − K 12 x 2 to minimize the objective function (41) for the closed-loop systeṁ Since B c is a function of the state x, we use an iterative method to converge to an optimal controller (modified from [27]). First, at iteration 0, the optimal input u (0) = −K (0) 11 x is computed for the linear system (A 11 , B 1 ) using LQR weights (Q 1 , R). Once we have K (44) Next, we write the closed-loop system aṡ The new control input u (1) is then designed as the LQR solution of the system (A (1) cl , B cl ) with weights (Q 2 , R). In general, for any iteration index k, A (k) cl is calculated for K (k) 22 given by (46) Following this, the new input u (k) is computed for (A (k) cl , B cl ) till K 11 converges, that is, where 0 < ≪ 1 is a chosen threshold.

Deriving bilinear slow subsystem
Applying the transformation T = col(G , F ) as in Section 2.2 (for exact construction of G and F please see [6]), the fast and slow dynamics for (35) can be written as from where we obtaiṅ From properties of Kronecker product, we can write which transforms (49) tȯ where ≪ 1 is the coherency constant, and Γ (.) and Υ (.) are appropriately sized matrices constructed from G , F , A 1 , A 2 , T , and B 1 . Substituting = 0, the quasi-steady state of x f follows as (52) Neglecting the third-and fourth-order terms, the slow subsystem can be written aṡ where Ψ (.) are appropriately sized matrices formed from Γ and Υ. In practice, if the coherency is strong then the magnitude of the fast variable x f is sufficiently small (after the fast transient is over), and so its Kronecker product with x s or itself can be neglected compared to the most dominant term, which in this case is (x s ⊗ x s ). This is numerically verified for the IEEE 68-bus model in Figure 7. This figure shows the Euclidean and of x s (t ) ⊗ x f (t ) using the measurements of the generator phase angles. Note that the Kronecker product is computed not over time, but over the variables; for example, the dimension of x s (t i ) ⊗ x s (t i ) at any given time instant t i is ℝ r (r+1)∕2 . The norm of this vector is plotted in Figure 7 for all time samples t i till 4 s. The same applies for . It is clearly seen that x s (t ) ⊗ x s (t ) is the most dominant among the three Kronecker products. With this approximation and ignoring the terms involving the Kronecker product with the fast variable x f , (53) is rewritten aṡ We next describe how the matrices in the RHS of (54) can be identified using measurements of x s .

Carleman linearized system identification
For consistency of notations, we rewrite (54) aṡ where, A 11,red ∈ ℝ 7r×7r is the linear state matrix, A 12,red ∈ ℝ 7r×k , k = 7r (7r + 1)∕2. Furthermore, the power system model equations are non-linear functions of the phase angles i (in terms of sine and cosine functions of the difference of phase angles ( i − j )). The acceleratioṅ(t ) is also a quadratic function of the other states. However, as will be numerically verified shortly in the simulation section, the magnitudes of these quadratic states iṅare negligible, as a result of which for all practical purposes it is sufficient to consider only the Kronecker products involving the aggregate rotor angles Δ  i . Applying this approximation, we rewrite (55) aṡ where ) .
Treating x 2 as a pseudo-input we write (56) aṡ Using a similar approach as Algorithm 1, the ISO can employ measurements of x s and x 2 to estimateÂ 11,red andB using the rSVD-based N4SID discussed in Section 3.1.Â 12,red andB red are extracted fromB by partitioning it according to (57). Also, similar to (40) The above steps are summarized in Algorithm 2.

Augmentation of PSS model
After running Algorithm 2, we next augment the identified model with the known PSS model as in Section 3.2. The PSS model is augmented with the identified matricesÂ 11,red andB red in a similar manner as it was augmented toÂ red and B red . We ALGORITHM 2 Reduced-order quadratic Carleman linearized system identification 1: Disturbance (such as faults, etc.) occurs and is detected by the ISO.
2: A low-amplitude, persistently exciting Gaussian random noise is injected to the excitation input of each generator following (15).
3: The aggregators compute their respective aggregate state vectors x s and x 2 , and sends them to the ISO, as shown in Figure 2.
Outputs:Â 11,red ,Â 12,red ,Â 21,red ,B red ,B 2,red ALGORITHM 3 Bilinear quadratic controller 1: ISO identifiesÂ aug ,Â 12z,red ,Â 21,red ,B aug andB 2,red using Algorithm 2. Outputs: K 11 , K 12 refer to these augmented matrices as A aug and B aug . The matrix A 12,red is appropriately padded with extra zeroes (denote it bŷ A 12z,red ) to match the height of A aug . The final augmented Carleman linearized model is written as Note that (60) is not an LTI system as the input matrix is a function of x s (t ) making it time variant. We next discuss how the iterative LQR control design presented in Section 5.3 can be used to account for the PSS augmentation.

Reduced-order bilinear quadratic control design
We apply the iterative LQR design discussed in Section 5.3 to achieve our control objective for (60). The weight matrices Q 1 and R are chosen as in Section 4.1. Algorithm 3 summarizes the steps of this design.
The dimension of the reduced-order model is 7r + r (r + 1)∕2, and therefore the computational complexity of Algorithm 3 is (r 6 ). For example, for the IEEE 68-bus 16-machine test system, Algorithm 3 is found to converge within two iterations and an average clock time of 0.11 s, making this algorithm feasible for online implementation. After the ISO calculates the linear and bilinear gains K 11 and K 22 , the reduced-order control input is designed as Finally, this reduced-order control input u red ∈ ℝ r is broadcast to the excitation input of all generators u ∈ ℝ n using (32).

SIMULATION RESULTS
We next test our algorithms on two IEEE prototype models.
All simulations are carried out in a core i5 computer running Windows 10 operating system at 3.20 GHz, and with 16 GB RAM. The non-linear simulations for both are carried out using the Power System Toolbox (PST) [28] in MATLAB 2018a. The routine s_simu was used as the main non-linear simulation program for the complete simulation time. Suitable modifications were made to the exciter (actuator) routine mexc_sig to input the identification noise and state-feedback control as shown in Figure 8. The identification and broadcast control routines were executed in parallel and interfaced with s_simu to evaluate the performance of the controller. Figure 8 shows the timing diagram for the whole process starting from injection of noise, to identification and control. T 0 is the time when a contingency occurs and is detected. Measurement data is captured from T 0 to T 1 , the system is identified from T 1 to T 2 . T 2 to T 3 is the time required for the LQR control design. T 4 is the settling time. In the figure, solid black arrows indicate flow of data, and the thin blue arrows indicate actuation signals (such as the exciter input).

Two-Area, 4-machine Kundur system
We first illustrate the importance of augmenting the PSS model using the Kundur 9-bus model. The closed-loop eigenvalues for this model with the broadcast controller (13) are shown in Figure 9. It can be seen that if PSS states are ignored then the closed loop is unstable. Augmenting the PSS states, on the other hand, ensures closed-loop stability as well as the desired damping of the inter-area oscillation modes. Figure 10 shows how the reduced-order controller stabilizes the unstable open-loop system when a fault occurs on the line connecting buses 3 and 101 in Figure 2. T 0 is the time at which the fault occurs, T 1 is the time till which the measurement samples of x s and u are collected (as defined in (10)), T 2 is the time at which the identification of A red and B red is complete. The reduced-dimensional wide-area controller is applied at T 3 via broadcasting, and the system is stabilized and the oscillations are damped by T 4 = 8.4 s. The entire exercise, starting from identification to control, is done using only data without knowing any model parameter other than the PSS gains.

Linear controller on IEEE 68-bus, 5-area system
The 68-bus model is divided into five known coherent areas as marked in Figure 5 with PMUs placed on all generator buses. PSS is assumed to be present only at generators 1, 3, 8, 9, 13, and 14. The open-loop system is driven close to instability (Figure 11a) for a three-phase fault on the line connecting buses 55 and 56. Table 1 summarizes the result of the ID and control. The first three rows show that at least 4.5 s of data gathering time is needed to identify an accurate enough model that can produce a stable closed-loop system. Use of r-SVD drastically improves the ID time compared to standard N4SID, as seen from columns 2 through 4. Fidelity of the identification is quantified by a metric called final prediction error (FPE) following [7], which is defined as where e(i ) ∶= x aug (i ) −x aug (i ) is the estimation error vector for the ith sample,x aug being the value of the augmented state vector estimated from the identified model, d = n y + n u , n y being the number of outputs and n u the number of inputs. For this example, n y = 7r and n u = r, and therefore d = 8r. A lower FPE indicates a more accurately estimated model. Increasing the initial amount of information for the ID by increasing the data gathering time improves the FPE, but after a certain threshold (T 1 = 8 s), as seen from the last two rows of the table, the initial information does not impact the FPE in any notable way.
The closed-loop temporal responses of the first three generator frequencies are shown in Figure 11a-d. Figure 11b shows the case when data is gathered for 4.5 s. The control in this case is applied at approximately 5 s, and the oscillations are damped within 7 s starting from then. Figure 11c shows the case when data is gathered for 8 s, leading to a more accurate model and a more effective controller which damps the oscillations in only 4 s. The total time taken for damping (ID + control + damping time), however, for both Figure 11b,c, are almost same, which is 12 s starting from t = 0. Considering this fact, the former looks like a more feasible choice as it triggers the control faster than the latter, preventing the open-loop unstable oscillations to persist for longer. Figure 11d shows the case where adding more data samples or increasing the data gathering time before starting ID does not necessarily lead to a better controller. The right compartment of Table 1 shows the figures when the fullorder model is identified. A least 8 s of initial data gathering time is needed, given the larger size of the parameter space that needs to be identified. The ID and control design takes 17.4 s in total. Our reduced-order controller, in contrast, takes only 8.5 s. This time-saving property becomes more beneficial as the  Table 1, and choose the model where the FPE starts plateauing. We also make a comparison of the learning-based controller with the ideal LQR controller (i.e. the LQR controller if the system model was perfectly known) implemented just after the fault at T 0 in Figure 12. The figure shows the rotor speed of generator 2, indicating that the reduced-order controller performs sub-optimally compared to the ideal LQR controller. However, it stabilizes and damps the oscillations more effectively compared to when the LQR controller is designed based on a nominal system model.

7.2.1
Effect of delays Next, we study the effects of communication delays on the closed-loop response for the two architectures reported in Section 4.2. In the first architecture, that is, for centralized control, we assume the delay from the generators to the ISO, and ISO to the generators to be same. As shown in Figure 13a total delay of 200-700 ms is simulated until the system is destabilized. For a delay of 200 ms, the percentage degradation in the performance objective J is 1.55%, while increasing the delay to 500 ms increases the degradation to only 1.7%. The closed loop remains stable up to a delay of approximately 650 ms, which shows the resilience of our learning-based controller as in reality this delay will be much lesser.
For architecture 2, we consider two kinds of delays according to Section 4. For intra-area generators, we consider random delays varying between 30 and 50 ms. For inter-area generators, we consider delays between 200 and 300 ms. The closedloop response of the generator speeds with and without delays is shown in Figure 14. The results testify the resilience of our

Comparison with robust controller
We next show that the adaptive nature of our proposed controller is more advantageous in fighting against model uncertainty than a conventional model-based robust wide-area controller. For this, the nominal model of the 68-bus system is perturbed by stretching the loads at buses {18, 36, 39, 45, 52, 55, 67, 68} to their maximum limit. A robust H ∞ controller is designed assuming that the model can vary between the nominal model and this extremity. The controller is designed using the D-K iteration routine in the robust control toolbox in Matlab. Figure 15 shows the simulation results comparing the H ∞ controller to the learning-based controller. Figure 15a shows that both controllers perform comparably well in damping the major oscillation lobes, but the learning-based controller    Figure 15b shows the time response of the control input to generator 2. The H ∞ controller comes into action immediately after the fault from t = 0, while the learning controller requires 8 s for system identification, and is actuated at t = 8 s. We can see that the H ∞ controller requires a significantly higher control effort as it is designed for the worst-case scenario while our controller learns the system as it is from the measured data and thereby requires a much smaller control effort. Figure 16 shows a case when the system is significantly deviated from the nominal model such that it no longer falls within the robustness margin of the H ∞ controller. We can see that in this case the H ∞ controller fails to stabilize the system as expected, whereas our learning-based controller is still able to damp the oscillations successfully.

Results for bilinear quadratic control
We next test our reduced-order bilinear controller on the IEEE 68-bus, 16-machine test system. We add non-linear loads to FIGURE 17 Correlation hypothesis testing buses 25, 39, 42, 49, 62, and 64, and increase their consumption levels so that the load flow solution is close to its failing value. This is done using the load_con matrix in the 68-bus system data file of PST. For the second-order Kronecker terms, we only consider the interaction between and ⊗ . From [8], we can see thaṫcontains quadratic non-linearity with respect to the other states. However, we hypothesize thaṫis weakly correlated to the other cross-terms except for ⊗ . This hypothesis is proved by performing a statistical analysis as shown in Figure 17. This includes calculating the correlation coefficient oḟ with respect to the Kronecker terms, and finding the so-called the p-value that indicates the hypothesis probability of being uncorrelated. A p-value lesser than 0.05 indicates that the variables are correlated. Details of this hypothesis testing method and calculating them can be found in [30]. We can see from Figure 17 that the correlation coefficient oḟwith ⊗ given by ( ⊗ ,̇) is significantly higher compared to those with other relevant cross-terms. The hypothesis thaṫis only correlated to ⊗ is validated in Figure 17b, where the p-value is found to be only 0.018. The p-values for the other terms are higher, so we ignore their contribution while performing the Carleman linearized system identification. This reduces the dimension of A 12 ∈ ℝ 7r×k from k = 7r (7r + 1) 2 to k = r (r + 1) 2 . For the 68bus model, this means a reduction from a size of k = 630 to k = 15, which frees up significant computational time without noticeable loss of accuracy in the identification and the control.  We simulate the same non-linear fault as shown in Figure 8. The bilinear quadratic wide-area controller designed in Section 6 is learned till 7.4 s and implemented at 7.5 s. From Figure 18, we can see that the bilinear controller successfully stabilizes and damps the system whereas the linear controller drives the system towards instability. Table 2 shows the FPEs using Algorithm 1 (FPE 1 ) for the linear system identification comparing it to Algorithm 2 (FPE 2 ) for the bilinear system. We can see that for the test system with the non-linear loads, Algorithm 1 fails to identify an accurate linear model compared to the bilinear system identification using Algorithm 2. As we saw in Section 6, a highly accurate model with low FPE is required to design a stabilizing quadratic controller. It becomes implausible to design a stabilizing controller using the identified linear model as the FPE cannot fit better than a value of 10 −39 . For the bilinear system identification, it performs poorly for lesser number of samples due to underfitting. However, as T 1 increases, the FPE decreases significantly making it feasible for designing a stabilizing bilinear quadratic controller. From Figure 18 we can see that this controller successfully stabilizes and damps the oscillations within 15 s.

CONCLUSION AND FUTURE WORK
In this paper, we presented a model-free learning-based broadcast control design for damping inter-area oscillations in a power systems without knowing its dynamic model. We used coherency theory and numerical methods such as randomized (rSVD) subspace identification (N4SID) for identifying a reduced-order power system model using real-time PMU measurements. This reduced-order model is augmented with a known set of full-order PSS models to achieve closed-loop stability. We also extended the design to non-linear models where we identified a second-order Carleman linearized model, and designed a bilinear quadratic controller based on that. Results are illustrated by non-linear simulations of the IEEE 9-bus and 68-bus power system models. Robustness of the controller to wide-area communication delays and model uncertainties are also demonstrated through simulations. Our future work will be to evaluate how the proposed approximation-based identification and control extend to non-linear renewable-integrated grid models.