Volume 19, Issue 1 e13187
ORIGINAL RESEARCH
Open Access

Dynamic event-triggered-based adaptive frequency control of microgrids under cyber-attacks via adaptive dynamic programming

Zemeng Mi

School of Information Science and Engineering, Northeastern University, Shenyang, Liaoning, China

Contribution: Formal analysis, Investigation, Software, Writing - original draft

Hanguang Su

Corresponding Author

School of Information Science and Engineering, Northeastern University, Shenyang, Liaoning, China

Correspondence

Hanguang Su, School of Information Science and Engineering, Northeastern University, Shenyang, Liaoning, China.

Email: [email protected]

Contribution: Investigation, Visualization

Qiuye Sun

School of Information Science and Engineering, Northeastern University, Shenyang, Liaoning, China

Contribution: Investigation, Supervision

Yuliang Cai

School of Information Science and Engineering, Northeastern University, Shenyang, Liaoning, China

Contribution: Investigation, Resources, Visualization

Zhongyang Ming

School of Information Science and Engineering, Northeastern University, Shenyang, Liaoning, China

Contribution: Investigation, Supervision

First published: 04 January 2025

Abstract

The increasing penetration of renewable energy sources (RES) and the development of cyber-physical microgrids (CPMs) place greater demands on the frequency control of microgrids. The common approach to frequency control is to control the micro-turbine to compensate for frequency deviations, with energy storage systems serving as an auxiliary means. This article proposes an online adaptive frequency control method that controls the governor and the energy storage to realize frequency recovery of the microgrid, subject to unknown external disturbances caused by the wind turbine, the power load and false data injection (FDI) attacks. First, the non-zero sum (NZS) games of the considered system are modeled, with the unknown disturbances taken into account. To estimate the unknown disturbances, a disturbance observer (DOB) for the microgrid system is introduced. Then, on the basis of the DOB estimates, a disturbance compensation input is derived to offset the interference of the unknown disturbances. Meanwhile, the adaptive dynamic programming (ADP) approach is employed to derive the adaptive optimal control input for the NZS games of the microgrid system. In addition, dynamic event-triggered (DET) control is introduced to reduce the occupation of computing resources. Using Lyapunov's method, the stability of the closed-loop system and the convergence of the weight estimates, the disturbance estimates and the system state are guaranteed. The effectiveness of the proposed method is verified by simulation results.

1 INTRODUCTION

In recent years, the increasing integration of renewable energy sources into power systems has given rise to the concept of microgrids [1]. Such systems consist of distributed energy resources such as wind turbine generators, energy storage devices, and photovoltaic systems, and can operate independently or in conjunction with the main grid [2-4]. As the proportion of renewable energy in microgrids grows, the inertia of the system is reduced, making it increasingly crucial to enhance system resilience [5, 6].

Meanwhile, with the application of information and communication technologies being widespread [7-9], the interaction between energy flow and information flow has become increasingly frequent, leading to the development of cyber-physical microgrids (CPMs). However, CPMs are more vulnerable to cyber attacks, which can lead to frequency fluctuation and, consequently, undermine the stability of microgrids [10].

Controlling the micro-turbine is a common approach to smoothing frequency fluctuations. However, the micro-turbine may experience mechanical wear during the frequency control process, which can affect the robustness of frequency control [11]. Therefore, combining micro-turbines with energy storage systems (ESSs) can better regulate frequency fluctuations in microgrids [12]. Over the past few years, various energy storage systems have been studied [13, 14]. They have great potential for enhancing the overall stability of microgrid frequency due to their rapid response, precise control, and high flexibility.

Many control methods have been applied to stabilize the frequency. Proportional-integral (PI) control has been widely adopted due to its simplicity and effectiveness [15]. However, it can perform poorly under dynamic conditions and cannot effectively handle system nonlinearity and disturbances. Furthermore, many robust frequency control methods, such as sliding mode control [16], fuzzy logic control [17], and $H_{\infty}$ control [18], have been investigated. Different from the above control methods, adaptive dynamic programming (ADP) originates from dynamic programming and can adapt to changing environments and system dynamics [19, 20]. Due to its ability to handle complex, nonlinear, and time-varying systems, ADP has been increasingly applied in power systems [21, 22]. ADP approaches have already shown good performance in frequency control [12], but current ADP approaches for frequency control primarily focus on a single player. By exploring the Nash equilibrium of multiple players, the players can collaborate with each other to achieve global optimization [23].

Currently, CPMs face various cyber-attack threats, such as denial-of-service attacks, delay attacks, and false data injection (FDI) attacks. Among them, FDI attacks on the actuators or the sensors can disrupt the commands transmitted through cyber channels, potentially causing unexpected frequency fluctuations [24, 25]. At the same time, microgrid systems are subject to numerous external disturbances that are either unmeasurable or difficult to control. Attempting to measure or manage these disturbances would require additional sensors and controllers, which could increase system complexity and cost and reduce system reliability. To address these challenges, the disturbance observer (DOB) has been widely researched and applied in different areas [26, 27]. In recent years, the DOB has also been combined with sliding mode control to smooth frequency fluctuations in microgrids [28].

Static event-triggered (SET) control has been applied to intelligent frequency control and other power system fields. Reference [29] proposed a novel event-triggered control architecture for load frequency control with supplementary adaptive dynamic programming. Similarly, the hybrid policy-based reinforcement learning strategy proposed in [30] integrates event-driven mechanisms, further improving the adaptability of energy management. Static event-triggered control keeps the control signals unchanged between adjacent triggering instants using zero-order holders. This method conserves computing and transmission resources by reducing the frequency of control updates, making it applicable to systems with limited bandwidth and processing capabilities. Building on static event-triggered control, dynamic event-triggered (DET) control is derived in [31] by introducing a dynamic variable to adjust the triggering threshold. Similarly, the DET-based distributed cooperative energy management approach developed in [32] addresses the computational and communication constraints in multi-energy systems. Consequently, introducing the DET mechanism can have a significant positive impact on frequency control.

Currently, no research has utilized DET-based adaptive optimal control to address frequency control issues while considering unknown external disturbances. Therefore, the NZS games of the considered system are modeled in this work, taking into account the unknown external disturbances. Based on this, we propose a dynamic event-triggered ADP control scheme with the help of a DOB, which solves the optimal frequency control problem with one player's input constrained, while offsetting the interference of the disturbances by introducing a compensation input. The main contributions of this article are as follows:
  • 1) The microgrid optimal frequency control problem with ESS, taking into account the unknown external disturbance, is transformed into an NZS games problem with the help of the DOB. Then the adaptive critic design method is utilized to obtain the approximate Nash equilibrium solutions.
  • 2) For the first time, a dynamic event-triggered optimal control method, combining ADP and a disturbance observer, is used to solve the non-zero sum games. The provided dynamic triggering rule guarantees the stability of the system.
  • 3) Unlike traditional time-triggered methods, a novel DET method is utilized as an alternative to the event-triggered method for the optimal frequency control problem. By introducing a mathematically equivalent filter structure, fewer events are triggered, so that the communication and computational burdens are reduced.

The remainder of this article is organized as follows. In Section 2, the mathematical model of microgrid system is reconstructed and a DOB is designed. Based on frequency dynamics, a DET-based adaptive dynamic programming control scheme is proposed in Section 3, where the stability proof is also given. Section 4 demonstrates the effectiveness of the proposed method through the simulation results. Lastly, the conclusion is presented in Section 5.

2 MATHEMATICAL MODEL OF MICROGRID SYSTEM

2.1 Microgrid system with unknown disturbance

In this section, the microgrid system consists of a micro-turbine, a governor, an energy storage system, a wind turbine and a power load. The system structure is presented in Figure 1. Similar to the description in [33], the transfer functions are formulated as $G_{t} = 1/(T_{t}s+1)$, $G_{g} = 1/(T_{g}s+1)$, $G_{p}=K_{p}/(T_{p}s+1)$, where $T_{t}$, $T_{g}$, $T_{p}$ represent the time constants of the micro-turbine, the governor and the system inertia, respectively. $K_{p}$ represents the gain coefficient of the power system, $R$ represents the speed regulation coefficient and $K_{E}$ represents the integral gain. The FDI attacks occurring on the actuator channel are also modeled [34]; they can disrupt the transmission of the control commands of the ESS. As energy storage systems are often connected through IoT (Internet of Things)-based architectures, they are more exposed to cyber threats than the micro-turbine and the governor, which have limited dependency on cyber and communication infrastructure. Therefore, this article only discusses FDI attacks that interfere with the control input of the ESS.

Figure 1. Structure diagram of the cyber-physical microgrid.

In order to propose the adaptive dynamic programming-based controller with the help of the disturbance observer, the frequency dynamics can be described by:
$$\begin{align} \Delta \dot{f}(t)&= \frac{K_{p}}{T_{p}}\Delta P_{SG}(t)-\frac{K_{p}}{T_{p}}\Delta P_{L}(t)+\frac{K_{p}}{T_{p}}\Delta P_{WF}(t)\nonumber \\ &\quad -\frac{1}{T_{p}}\Delta f(t)+\frac{K_{p}}{T_{p}}\left(u_{1}(t)+d_{FDI}(t)\right), \nonumber \\ \Delta \dot{P}_{SG}(t)&= -\frac{1}{T_{t}}\Delta P_{SG}(t)+\frac{1}{T_{t}}\Delta X_{G}(t), \nonumber \\ \Delta \dot{X}_{G}(t)&= -\frac{1}{T_{g}}\Delta E(t)-\frac{1}{RT_{g}}\Delta f(t)-\frac{1}{T_{g}}\Delta X_{G}(t)\nonumber \\ &\quad +\frac{1}{T_{g}}u_{2}(t), \nonumber \\ \Delta \dot{E}(t)&= K_{E}\Delta f(t). \end{align}$$ (1)
Define $x(t)=[\Delta f(t) \ \Delta P_{SG}(t) \ \Delta X_{G}(t)\ \Delta E(t)]^{T}$ as the state vector, whose components represent the frequency deviation, the turbine output, the governor valve position and the incremental change in integral control, respectively. $d_{1}(t)=\Delta P_{WF}(t)-\Delta P_{L}(t)$ is the unknown disturbance caused by the wind turbine and the load change, where $\Delta P_{WF}(t)$ and $\Delta P_{L}(t)$ represent the wind turbine disturbance and the load change disturbance, respectively. $d_{2}(t)=d_{FDI}(t)$ represents the FDI attacks launched on the actuator channel, which can interfere with the control input of the ESS. The control inputs $u_{1}(t)$ and $u_{2}(t)$ represent the control of the ESS and the control of the governor, respectively. The system state equation is then given by
$$\begin{align} \dot{x}(t)&=f(x)+g_{u1}\left(u_1(t)+d_{2}(t)\right)+g_{u2}u_2(t)+g_{d1}d_{1}(t)\nonumber \\ &=f(x)+g_{u1}u_1(t)+g_{u2}u_2(t)+g_{d}d(t), \end{align}$$ (2)
where
$$\begin{align} f(x)=Ax(t)=\begin{bmatrix} -\dfrac{1}{T_p}&\dfrac{K_p}{T_p}&0&0\\[12pt] 0&-\dfrac{1}{T_t}&\dfrac{1}{T_t}&0\\[12pt] -\dfrac{1}{RT_g}&0&-\dfrac{1}{T_g}&-\dfrac{1}{T_g}\\[12pt] K_E&0&0&0 \end{bmatrix}x(t), \end{align}$$ (3)
$$\begin{align} g_{u1}=g_{d1}=\begin{bmatrix} \dfrac{K_p}{T_p}\\[12pt] 0\\[3pt] 0\\[3pt] 0 \end{bmatrix},\quad g_{u2}=\begin{bmatrix} 0\\[3pt] 0\\[3pt] \dfrac{1}{T_g}\\[12pt] 0 \end{bmatrix},\quad g_{d}=\begin{bmatrix} \dfrac{K_p}{T_p}\\[12pt] 0\\[3pt] 0\\[3pt] 0 \end{bmatrix}, \end{align}$$ (4)
$$\begin{align} d(t)=d_{1}(t)+d_{2}(t). \end{align}$$ (5)
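
As a rough illustration of (2)-(5), the sketch below assembles the model matrices in plain Python and advances the state by one forward-Euler step. The numerical parameter values, the step size, and the helper names `build_model` and `euler_step` are assumptions made only for this example; they are not taken from this article.

```python
def build_model(Tp, Tt, Tg, Kp, R, KE):
    """Return (A, g_u1, g_u2, g_d) as laid out in (3) and (4)."""
    A = [
        [-1.0 / Tp, Kp / Tp, 0.0, 0.0],
        [0.0, -1.0 / Tt, 1.0 / Tt, 0.0],
        [-1.0 / (R * Tg), 0.0, -1.0 / Tg, -1.0 / Tg],
        [KE, 0.0, 0.0, 0.0],
    ]
    g_u1 = [Kp / Tp, 0.0, 0.0, 0.0]
    g_u2 = [0.0, 0.0, 1.0 / Tg, 0.0]
    g_d = list(g_u1)  # the lumped disturbance enters through the same channel as u1
    return A, g_u1, g_u2, g_d


def euler_step(x, u1, u2, d, model, dt):
    """One forward-Euler step of x_dot = A x + g_u1 u1 + g_u2 u2 + g_d d."""
    A, g_u1, g_u2, g_d = model
    x_next = []
    for i in range(4):
        ax = sum(A[i][j] * x[j] for j in range(4))
        dx = ax + g_u1[i] * u1 + g_u2[i] * u2 + g_d[i] * d
        x_next.append(x[i] + dt * dx)
    return x_next


# Hypothetical parameter values, chosen only to make the example run.
model = build_model(Tp=10.0, Tt=5.0, Tg=0.2, Kp=1.0, R=0.5, KE=0.5)
x0 = [0.0, 0.0, 0.0, 0.0]  # [delta_f, delta_P_SG, delta_X_G, delta_E]
x1 = euler_step(x0, u1=0.0, u2=0.0, d=0.1, model=model, dt=0.01)
print(x1)
```

Starting from rest, only the frequency deviation moves in the first step, because the lumped disturbance $d(t)$ enters solely through the first row of $g_{d}$.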

2.2 Problem statement

In this section, we design an intelligent frequency control algorithm regulated by the ADP method for the NZS games of system (1) with unknown disturbances. The impact of these unknown disturbances is offset by the disturbance compensation input, which ensures the stability and reliability of the frequency control at the same time. To provide a clearer understanding of the proposed intelligent frequency control method, the overall framework of the composite control scheme is shown in Figure 1. It can be observed that the composite control input contains two parts: one is the adaptive optimal input obtained by ADP for the NZS games of system (1), and the other is the disturbance compensation input based on the estimate of the disturbance observer. The composite robust control offsets the impact of unknown disturbances in the microgrid system and simultaneously minimizes the value function.

Inspired by the excellent achievements in [35], the following disturbance observer is designed to estimate the disturbance of system (1). Then, based on the estimated disturbance, a disturbance compensation input is introduced to offset the interference of the unknown disturbance. The DOB is designed as
$$\begin{align} {\begin{cases} \hat{d}=b+p(x)\\[6pt] \displaystyle \dot{b}=-q{\left\lbrace g_d(x)b+g_d(x)p(x)+f(x)+\sum \limits _{j=1}^{2}g_{uj}(x)u_j(x)\right\rbrace}, \end{cases}} \end{align}$$ (6)
where $\hat{d}\in R$ is the estimate of the external disturbance, $p(x)\in R$ is a function to be designed and $q=\partial p(x)/\partial x$. $b \in R$ is an intermediate variable introduced to avoid computing the derivative of the state.
The disturbance observer error is defined as
$$\begin{align} \tilde{d}=d-\hat{d}. \end{align}$$ (7)

The following assumption is needed in the subsequent analysis; similar assumptions can also be found in [36, 37].

Assumption 1. The disturbance in system (1) is slowly time-varying and bounded, that is, $\lim \limits _{t\rightarrow \infty }\dot{d}(t)=0$. $g_d(x)$ is also bounded, that is, $||g_{d}(x)||\le \bar{\mathcal {G}}_{d}$ with a positive constant $\bar{\mathcal {G}}_{d}$.

Next, the following theorem shows that the disturbance observer error is asymptotically stable.

Theorem 1. If $\theta (x)=qg_{d}$ is a positive definite matrix and Assumption 1 is satisfied, the disturbance observer error $\tilde{d}$ is asymptotically stable.

Proof. Select the Lyapunov candidate as $\Xi = \frac{1}{2} \tilde{d}^{T} \tilde{d}$. With $\dot{d}$ neglected according to Assumption 1, the corresponding derivative of $\Xi$ is

$$\begin{align} \dot{\Xi }&=- \tilde{d}^{T}{\left(\dot{b}+\dot{p}(x)\right)}\nonumber \\ &=- \tilde{d}^{T}{\left(-qg_{d}b-q{\left(g_{d}p+f+\sum _{j=1}^{2}g_{uj}(x)u_{j}(x)\right)}+q\dot{x}\right)}. \end{align}$$ (8)

According to system (1), we have
$$\begin{align} \dot{\Xi }&=- \tilde{d}^{T}{\left(-qg_{d}b-qg_{d}p-qf-q\sum _{j=1}^{2}g_{uj}(x)u_{j}(x)\right)}\nonumber \\ &\quad -\tilde{d}^{T}q{\left(f+\sum _{j=1}^{2}g_{uj}(x)u_{j}(x)+g_{d}d\right)} \nonumber \\ &=-\tilde{d}^{T}qg_{d}\tilde{d} \nonumber \\ &\le -\lambda _{min}(\theta)\Vert \tilde{d}\Vert ^{2}. \end{align}$$ (9)
According to Lyapunov theory, $\tilde{d}$ is asymptotically stable, which indicates that the DOB can asymptotically approximate the unknown disturbance. This conclusion provides a foundation for offsetting the interference caused by wind power fluctuations, load changes, and FDI attacks.
$\Box$
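
To make the observer (6) more concrete, the following sketch simulates it with forward Euler for a constant disturbance. It assumes the linear design function $p(x)=c\,\Delta f$ with a hypothetical gain $c=50$, so that $q=[c\ 0\ 0\ 0]$ and $\theta = qg_{d}=cK_{p}/T_{p}>0$ as Theorem 1 requires. The parameter values, the choice of $p(x)$, and the helper names are illustrative assumptions, not the settings used in this article.

```python
# Hypothetical parameter values, chosen only for illustration.
TP, TT, TG, KP, R_DROOP, KE = 10.0, 5.0, 0.2, 1.0, 0.5, 0.5
A = [
    [-1.0 / TP, KP / TP, 0.0, 0.0],
    [0.0, -1.0 / TT, 1.0 / TT, 0.0],
    [-1.0 / (R_DROOP * TG), 0.0, -1.0 / TG, -1.0 / TG],
    [KE, 0.0, 0.0, 0.0],
]
G_U1 = [KP / TP, 0.0, 0.0, 0.0]
G_U2 = [0.0, 0.0, 1.0 / TG, 0.0]
G_D = [KP / TP, 0.0, 0.0, 0.0]

C_GAIN = 50.0                    # assumed observer design gain
Q_ROW = [C_GAIN, 0.0, 0.0, 0.0]  # q = dp/dx for p(x) = C_GAIN * x[0]


def p(x):
    """Assumed design function p(x), linear in the frequency deviation."""
    return C_GAIN * x[0]


def model_rhs(x, u1, u2, d):
    """Right-hand side of (2): f(x) + g_u1*u1 + g_u2*u2 + g_d*d."""
    return [
        sum(A[i][j] * x[j] for j in range(4))
        + G_U1[i] * u1 + G_U2[i] * u2 + G_D[i] * d
        for i in range(4)
    ]


def b_dot(b, x, u1, u2):
    """Observer state derivative from (6), using d_hat = b + p(x)."""
    inner = model_rhs(x, u1, u2, b + p(x))
    return -sum(qi * vi for qi, vi in zip(Q_ROW, inner))


def simulate_dob(d_true, dt=1e-3, steps=3000):
    """Run plant and observer together; return the final estimate d_hat."""
    x = [0.0, 0.0, 0.0, 0.0]
    b = -p(x)  # start with d_hat(0) = 0
    for _ in range(steps):
        dx = model_rhs(x, 0.0, 0.0, d_true)  # plant sees the true disturbance
        db = b_dot(b, x, 0.0, 0.0)           # observer sees only x and the inputs
        x = [xi + dt * dxi for xi, dxi in zip(x, dx)]
        b += dt * db
    return b + p(x)


d_hat_final = simulate_dob(d_true=0.1)
print("estimation error:", abs(0.1 - d_hat_final))
```

With this choice of $p(x)$ the estimation error obeys $\dot{\tilde{d}}=-\theta \tilde{d}$ for a constant disturbance, so a larger gain $c$ speeds up convergence; in practice it would presumably also amplify measurement noise in $\Delta f$.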
Therefore, based on the estimation of the disturbance observer and the system state equation (2), the composite control inputs of system (1) are formulated as:
$$\begin{align} u_{1}(x)&=u_{r1}(x)-\hat{d}=u_{r1}(x)+u_{d1}(x), \end{align}$$ (10)
$$\begin{align} u_{2}(x)&=u_{r2}(x), \end{align}$$ (11)
where $u_{d1}(x)$ is the disturbance compensation input, and $u_{r1}(x)$ and $u_{r2}(x)$ are the ADP-based optimal control inputs of the energy storage and the governor, respectively.

3 DYNAMIC EVENT-TRIGGERED BASED ADAPTIVE DYNAMIC PROGRAMMING APPROACH FOR THE NZS GAMES

In this section, a DET-based adaptive dynamic programming control scheme with the help of disturbance observer for the NZS games of system (1) is proposed, and the structure diagram of the closed-loop system is given in Figure 1. The NZS games of system (1) are addressed by utilizing the adaptive critic design approach under the DET mechanism. By introducing a non-quadratic function into the performance index, the robust stabilization problem is transformed into a constrained optimal control problem. Finally, the stability of the closed-loop system is verified.

3.1 Triggering adaptive controller design

The governor's control signal, that is, $u_{2}(x)$, should be constrained to prevent excessive wear on the governor, since such wear could reduce system reliability. Therefore, a non-quadratic function is introduced into the performance index, and the utility function of the ith player is defined as $v_i(x,u_{r1},u_{r2})$. The performance index of the ith player can be described as:
$$\begin{align} J_i{\left(x(0)\right)} &= \int _0^\infty {\left(x^T Q_i x + \mathcal {W}(u_{ri}) \right)} d\tau \nonumber \\ &\stackrel{\Delta }{=} \int _{0}^{\infty } v_i(x, u_{r1},u_{r2}) d\tau, \end{align}$$ (12)
where $i\in \lbrace 1,2\rbrace$, and $Q_{1}\in R^{4\times 4}$ and $Q_{2}\in R^{4\times 4}$ are symmetric positive-definite matrices. Inspired by [38], $\mathcal {W}(u_{r1})$ and $\mathcal {W}(u_{r2})$ are chosen as:
$$\begin{align} \mathcal {W}(u_{r1})=u_{r1}^{T}R_{11}u_{r1}+2\int _{0}^{u_{r2}}\Theta ^{-T}(\Gamma _{2}^{-1}s)R_{12}ds,\end{align}$$ (13)
$$\begin{align} \mathcal {W}(u_{r2})=u_{r1}^{T}R_{21}u_{r1}+2\int _{0}^{u_{r2}}\Theta ^{-T}(\Gamma _{2}^{-1}s)R_{22}ds, \end{align}$$ (14)
where $\Theta (\cdot)=\tanh (\cdot)$ is adopted, and $\Gamma _{2}$ is the upper bound of the control input $u_{r2}$. Since the overall composite control input $u_{2}$ equals the ADP-based control input $u_{r2}$ and contains no disturbance compensation term, the upper bound of $u_{r2}$ coincides with the upper bound of $u_{2}$. Therefore, the control signal of the governor can be constrained. The ith player's value function is defined as
$$\begin{align} V_{i}{\left(x(t)\right)}=\int _{t}^{\infty }(x^{T}Q_{i}x+\mathcal {W}(u_{ri}))d\tau . \end{align}$$ (15)
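
For a scalar input, the non-quadratic integral in (13) and (14) has the closed form $2R\left[u\tanh ^{-1}(u/\Gamma )+\tfrac{\Gamma }{2}\ln (1-u^{2}/\Gamma ^{2})\right]$, which follows from integrating $\tanh ^{-1}$ by parts. The sketch below checks this closed form against a midpoint-rule quadrature. The function names and the numerical values of $u$, $\Gamma$ and $R$ are assumptions for illustration only.

```python
import math


def input_penalty_closed_form(u, gamma, r):
    """Closed form of 2 * integral_0^u atanh(s / gamma) * r ds for |u| < gamma."""
    v = u / gamma
    return 2.0 * r * (u * math.atanh(v) + 0.5 * gamma * math.log(1.0 - v * v))


def input_penalty_quadrature(u, gamma, r, n=20000):
    """Midpoint-rule approximation of the same integral."""
    h = u / n
    total = 0.0
    for k in range(n):
        s = (k + 0.5) * h
        total += math.atanh(s / gamma)
    return 2.0 * r * total * h


# Hypothetical values: bound gamma = 0.5, weight r = 1.0, input u = 0.3.
closed = input_penalty_closed_form(0.3, 0.5, 1.0)
numeric = input_penalty_quadrature(0.3, 0.5, 1.0)
print(closed, numeric)
```

The penalty is non-negative, even in $u$, and grows steeply as $|u|$ approaches $\Gamma$, which is what discourages the governor command from approaching its bound.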

Definition 1. ([39, 40]) The strategy set $\lbrace u_{r1}^{*},u_{r2}^{*}\rbrace$ is a Nash equilibrium strategy set if the following inequalities

$$\begin{align} V_{1}(u_{r1}^{*},u_{r2}^{*})\le V_{1}(u_{r1}^{*},u_{r2}),\end{align}$$ (16)
$$\begin{align} V_{2}(u_{r1}^{*},u_{r2}^{*})\le V_{2}(u_{r1},u_{r2}^{*}), \end{align}$$ (17)
are satisfied for any admissible control policies.

Next, the Hamiltonian of the ith player with associated admissible control inputs is defined as
$$\begin{align} H_{i}=&\;x^{T}Q_{i}x+\mathcal {W}(u_{ri}) \nonumber \\ &+(\nabla V_{i})^{T}{\left(f(x)+\sum _{j=1}^{2}g_{uj}(x)u_{j}(x)+g_{d}(x)d\right)}, \end{align}$$ (18)
where $j\in \lbrace 1,2\rbrace$ and $\nabla V_{i} \in R^4$ is calculated by $\nabla V_i\triangleq \partial V_i/\partial x$. Thus, the optimal value function
$$\begin{align} V_{i}^{*}{\left(x(t)\right)}=\min _{u_{ri}}{\left\lbrace \int _{t}^{\infty }(x^{T}Q_{i}x+\mathcal {W}(u_{ri}))d\tau \right\rbrace}, \end{align}$$ (19)
satisfies the HJ equation
$$\begin{align} \min _{u_{ri}}H_{i}(x,u_{r1},u_{r2},V_{i}^{*})=0. \end{align}$$ (20)
Based on the stationarity condition, the associated individual optimal ADP-based control policies are obtained as
$$\begin{align} u_{r1}^{*}=-\frac{1}{2}R_{11}^{-1}g_{u1}^{T}(x)\nabla V_{1}^{*}, \end{align}$$ (21)
$$\begin{align} u_{r2}^{*}=- \Gamma _{2}\Theta {\left(\frac{1}{2}R_{22}^{-1}g_{u2}^{T}(x)\nabla V_{2}^{*}\right)}. \end{align}$$ (22)
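
For readers who want the intermediate step, the following sketch shows how (21) and (22) follow from setting the partial derivatives of the Hamiltonian (18) to zero. It assumes scalar inputs $u_{r1}$ and $u_{r2}$, uses $u_{1}=u_{r1}-\hat{d}$ and $u_{2}=u_{r2}$ from (10) and (11), and treats $\hat{d}$ as independent of $u_{r1}$; these simplifications are assumptions made here for illustration.

$$\begin{align} \frac{\partial H_{1}}{\partial u_{r1}}&=2R_{11}u_{r1}+g_{u1}^{T}(x)\nabla V_{1}^{*}=0 \;\Rightarrow \; u_{r1}^{*}=-\frac{1}{2}R_{11}^{-1}g_{u1}^{T}(x)\nabla V_{1}^{*}, \nonumber \\ \frac{\partial H_{2}}{\partial u_{r2}}&=2R_{22}\Theta ^{-1}(\Gamma _{2}^{-1}u_{r2})+g_{u2}^{T}(x)\nabla V_{2}^{*}=0 \;\Rightarrow \; u_{r2}^{*}=-\Gamma _{2}\Theta {\left(\frac{1}{2}R_{22}^{-1}g_{u2}^{T}(x)\nabla V_{2}^{*}\right)}. \nonumber \end{align}$$

The last implication uses the fact that $\tanh$ is an odd function. Since $|\tanh (\cdot)|<1$, the resulting policy satisfies $|u_{r2}^{*}|<\Gamma _{2}$, which is how the non-quadratic term enforces the governor input bound.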
Because the HJ equation is difficult to solve directly, and owing to the universal approximation property of critic neural networks (NNs), we approximate the optimal solution $V_i^*$ by
$$\begin{align} V_{i}^{*}=W_{i}^{*T}\sigma _{i}(x)+\varepsilon _{i}, \end{align}$$ (23)
where $W_{i}^{*}\in R^{L}$ is the ideal weight vector, $\sigma _{i}\in R^{L}$ is the activation function, $L$ is the number of hidden layer neurons, and $\varepsilon _{i} \in R$ is the approximation error.
Since the ideal weight of the critic NNs is unknown, the optimal solution cannot be obtained directly. Therefore, we reformulate the function (23) by using the critic NNs
$$\begin{align} \hat{V}_{i}=\hat{W}_{i}^{T}\sigma _{i}(x), \end{align}$$ (24)
where $\hat{W}_{i}$ is the estimate of the ideal weight vector $W_{i}^{*}$. Then, taking the partial derivative of $V_{i}^*$, that is, $\nabla V_i^*=\partial V_i^*/\partial x$, we have
$$\begin{align} \nabla V_i^*=(\nabla \sigma _i(x))^TW_i^*+\nabla \varepsilon _i. \end{align}$$ (25)
Similarly, we have
$$\begin{align} \nabla \hat{V}_{i}=(\nabla \sigma _{i}(x))^{T}\hat{W}_{i}. \end{align}$$ (26)
Substituting (25) into (21) and (22), the ADP-based optimal control policies can be derived as
$$\begin{align} u_{r1}^{*}&=- \frac{1}{2}R_{11}^{-1}g_{u1}^{T}(x){\left((\nabla \sigma _{1}(x))^{T}W_{1}^{*}+\nabla \varepsilon _{1}\right)},\end{align}$$ (27)
$$\begin{align} u_{r2}^{*}&=- \Gamma _{2}\Theta {\left(\frac{1}{2}R_{22}^{-1}g_{u2}^{T}(x){\left((\nabla \sigma _{2}(x))^{T}W_{2}^{*}+\nabla \varepsilon _{2}\right)}\right)}. \end{align}$$ (28)

Define $\left\lbrace z_{k}\right\rbrace _{k=0}^{\infty }$ as a monotonically increasing sequence, which represents the set of triggering instants. Here $z_{k}$ denotes the $k$-th sampling time, and each $z_{k}$ is a positive constant. The sampled state remains unchanged between two consecutive sampling instants.

Using a zero-order holder (ZOH), piecewise continuous control signals are obtained, and the optimal control policies are transformed into the following form
$$\begin{align} \check{u}_{r1}(t)&=u_{r1}(\check{x}_{k}(t),t),\quad z_{k}\le t<z_{k+1},\end{align}$$ (29)
$$\begin{align} \check{u}_{r2}(t)&=u_{r2}(\check{x}_{k}(t),t),\quad z_{k}\le t<z_{k+1}, \end{align}$$ (30)
where $\check{x}_{k}(t)=x(z_{k})$, $z_{k}\le t<z_{k+1}$.
Then, the measurement error can be defined as $\pi _{l}=x(t)-\check{x}_{k}(t), z_{k} \le t<z_{k+1}$, where $\check{x}_{k}(t)$ is the sampled state and $x(t)$ is the real-time state. Substituting (26) into (21) and (22), the event-triggered approximate control policies become
$$\begin{align} \check{u}_{r1}&=- \frac{1}{2}R_{11}^{-1}g_{u1}^{T}(\check{x}_{k})(\nabla \sigma _{1}(\check{x}_{k}))^{T}\hat{W}_{1}(z_{k}), \end{align}$$ (31)
$$\begin{align} \check{u}_{r2}&=- \Gamma _{2}\Theta {\left(\frac{1}{2}R_{22}^{-1}g_{u2}^{T}(\check{x}_{k})(\nabla \sigma _{2}(\check{x}_{k}))^{T}\hat{W}_{2}(z_{k})\right)}. \end{align}$$ (32)
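
The event-triggered approximate policies (31) and (32) can be evaluated as in the sketch below. It assumes a hypothetical critic basis $\sigma _{i}(x)=[x_{1}^{2}\ x_{2}^{2}\ x_{3}^{2}\ x_{4}^{2}]^{T}$, scalar weights $R_{11}=R_{22}=1$, a bound $\Gamma _{2}=0.5$, and illustrative input vectors; none of these values is taken from this article.

```python
import math

GAMMA_2 = 0.5                  # assumed bound on the governor input
R11, R22 = 1.0, 1.0            # assumed scalar input weights
G_U1 = [0.1, 0.0, 0.0, 0.0]    # Kp/Tp with hypothetical Kp = 1, Tp = 10
G_U2 = [0.0, 0.0, 5.0, 0.0]    # 1/Tg with hypothetical Tg = 0.2


def grad_sigma(x):
    """Jacobian (L x 4) of the assumed basis sigma(x) = [x1^2, x2^2, x3^2, x4^2]."""
    return [[2.0 * x[i] if i == j else 0.0 for j in range(4)] for i in range(4)]


def grad_value(x, w_hat):
    """Approximate value gradient (grad sigma)^T w_hat from (26)."""
    jac = grad_sigma(x)
    return [sum(jac[l][j] * w_hat[l] for l in range(len(w_hat))) for j in range(4)]


def event_triggered_policies(x_sampled, w1_hat, w2_hat):
    """Evaluate (31) and (32) at the last sampled state and sampled weights."""
    gv1 = grad_value(x_sampled, w1_hat)
    gv2 = grad_value(x_sampled, w2_hat)
    u_r1 = -0.5 / R11 * sum(g * v for g, v in zip(G_U1, gv1))
    u_r2 = -GAMMA_2 * math.tanh(0.5 / R22 * sum(g * v for g, v in zip(G_U2, gv2)))
    return u_r1, u_r2


x_k = [0.2, 0.1, -0.3, 0.05]   # hypothetical sampled state x(z_k)
u_r1, u_r2 = event_triggered_policies(x_k, [1.0] * 4, [1.0] * 4)
print(u_r1, u_r2)
```

Between triggering instants both outputs are simply held constant, and the `tanh` saturation keeps the governor command within $\pm \Gamma _{2}$ regardless of how large the critic weights become.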
The event-triggered optimal control policies can then be derived as
$$\begin{align} \check{u}_{r1}^{*}&=- \frac{1}{2}R_{11}^{-1}g_{u1}^{T}(\check{x}_{k}){\left((\nabla \sigma _{1}(\check{x}_{k}))^{T}W_{1}^{*}+\nabla \varepsilon _{1}\right)},\end{align}$$ (33)
$$\begin{align} \check{u}_{r2}^{*}&=- \Gamma _{2}\Theta {\left(\frac{1}{2}R_{22}^{-1}g_{u2}^{T}(\check{x}_{k}){\left((\nabla \sigma _{2}(\check{x}_{k}))^{T}W_{2}^{*}+\nabla \varepsilon _{2}\right)}\right)}. \end{align}$$ (34)
Considering the above equations, the Hamiltonian residual error can be defined as
$$\begin{align} &H_{i}{\left(x,\check{u}_{r1},\check{u}_{r2},\hat{W}_{i}\right)} \nonumber \\ &\quad =x^{T}Q_{i}x+\mathcal {W}(\check{u}_{ri})+\hat{W}_{i}^{T}\nabla \sigma _{i}\dot{x} \nonumber \\ &\quad =x^{T}Q_{i}x+\mathcal {W}(\check{u}_{ri})+\hat{W}_{i}^{T}\hbar _{i}\nonumber \\ &\quad \stackrel{\Delta }{=} e_{i} . \end{align}$$ (35)
With $\hbar _{i}=\nabla \sigma _{i}\dot{x}$ as in (35), we introduce an auxiliary term as
$$\begin{align} &H_{i}(x,\check{u}_{r1},\check{u}_{r2},W_{i}^{*})\nonumber \\ &\quad =x^{T}Q_{i}x+\mathcal {W}(\check{u}_{ri})+W_{i}^{*T}\hbar _{i}\nonumber \\ &\quad \stackrel{\Delta }=\varepsilon _{H_{i}}. \end{align}$$ (36)
To minimize the residual error $e_{i}$, $\hat{W}_{i}$ is updated so as to minimize the squared residual error $\kappa \stackrel{\Delta }{=}\sum _{i=1}^{2}\kappa _{i}=\frac{1}{2}\sum _{i=1}^{2}e_{i}^{2}$. Consequently, by using the gradient descent approach [41], the weight tuning laws can be calculated by
$$\begin{align} \dot{\hat{W}}_{i}&=- \iota _{i}\frac{1}{{\left(\hbar _{i}^{T}\hbar _{i}+1\right)}^{2}}\frac{\partial \kappa }{\partial \hat{W}_{i}}=- \iota _{i}\frac{1}{{\left(\hbar _{i}^{T}\hbar _{i}+1\right)}^{2}}\frac{\partial \kappa _{i}}{\partial \hat{W}_{i}} \nonumber \\ &=- \iota _{i}\frac{\hbar _{i}e_{i}}{{\left(\hbar _{i}^{T}\hbar _{i}+1\right)}^{2}}=-\frac{\iota _{i}\hbar _{i}\varepsilon _{H_{i}}}{{\left(\hbar _{i}^{T}\hbar _{i}+1\right)}^{2}}+\frac{\iota _{i}\hbar _{i}\hbar _{i}^{T}\tilde{W}_{i}}{{\left(\hbar _{i}^{T}\hbar _{i}+1\right)}^{2}}, \end{align}$$ (37)
where $\tilde{W}_{i}=W_{i}^{*}-\hat{W}_{i}$, and the learning rate $\iota _{i}$ is a positive constant to be designed. Since the ideal weight is a constant vector, we can conclude that $\dot{\tilde{W}}_{i}=- \dot{\hat{W}}_{i}$. According to (37), we have
$$\begin{align} \dot{\tilde{W}}_{i}=- \dot{\hat{W}}_{i}=\frac{\iota _{i}\hbar _{i}\varepsilon _{H_{i}}}{(\hbar _{i}^{T}\hbar _{i}+1)^{2}}-\frac{\iota _{i}\hbar _{i}\hbar _{i}^{T}\tilde{W}_{i}}{(\hbar _{i}^{T}\hbar _{i}+1)^{2}}. \end{align}$$ (38)
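
A single forward-Euler step of the normalized gradient-descent law (37) might look like the sketch below. The function name `critic_step`, the step size, the learning rate, and the sample values of $\hbar _{i}$ and of the running cost $x^{T}Q_{i}x+\mathcal {W}(\check{u}_{ri})$ are hypothetical and serve only to illustrate the update.

```python
def critic_step(w_hat, hbar, running_cost, iota, dt):
    """One Euler step of (37).

    running_cost stands for x^T Q_i x + W(u_ri); the residual of (35) is
    e_i = running_cost + w_hat^T hbar.
    Returns the updated weights and the residual before the update.
    """
    e = running_cost + sum(w * h for w, h in zip(w_hat, hbar))
    norm = (sum(h * h for h in hbar) + 1.0) ** 2   # (hbar^T hbar + 1)^2
    w_new = [w - dt * iota * h * e / norm for w, h in zip(w_hat, hbar)]
    return w_new, e


# Hypothetical values for a critic with L = 3 neurons.
w0 = [0.5, -0.2, 0.1]
hbar = [1.0, 2.0, -1.0]
cost = 0.3
w1, e0 = critic_step(w0, hbar, cost, iota=2.0, dt=0.1)
_, e1 = critic_step(w1, hbar, cost, iota=2.0, dt=0.1)
print(e0, e1)
```

For a fixed $\hbar _{i}$ the residual shrinks by the factor $1-\Delta t\,\iota _{i}\hbar _{i}^{T}\hbar _{i}/(\hbar _{i}^{T}\hbar _{i}+1)^{2}$ per step; the normalization keeps this factor bounded even when $\hbar _{i}$ is large.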

3.2 Static event triggering mechanism and stability analysis

According to the previous work [38, 42, 43], the following assumptions are needed for the stability analysis.

Assumption 2. The control law is Lipschitz continuous, that is, $||u_{rj}^{*}(x)-u_{rj}^{*}(\check{x}_{k})||^{2}\le \varpi _{j}||x-\check{x}_{k}||^{2}=\varpi _{j}||\pi _{l}||^{2}$ with $z_{k}\le t<z_{k+1}$, where $\varpi _{j}>0$.

Assumption 3. For any $i,j\in \lbrace 1,2\rbrace$, the input coefficient matrix $g_{uj}(x)$, the ideal critic weight $W_{i}^{*}$, the gradient of the activation function $\nabla \sigma _{i}$ and the gradient of the NN approximation error $\nabla \varepsilon _{i}$ are bounded, that is,

$$\begin{equation*} ||g_{uj}(x)||\le \bar{\mathcal {G}}_{uj},\quad ||W_{i}^{*}||\le c_{\bar{W}_{i}},\quad ||\nabla \sigma _{i}||\le \overline{\sigma }_{i},\quad ||\nabla \varepsilon _{i}||\le \overline{\varepsilon }_{i}, \end{equation*}$$
where $\bar{\mathcal {G}}_{uj}$, $c_{\bar{W}_{i}}$, $\overline{\sigma }_{i}$, $\overline{\varepsilon }_{i}$ are all positive constants.

Assumption 4.According to the persistence of excitation condition, for any player, the signal ̲ i $ \underline{\hbar }_i$ is persistently excited, so the inequality is satisfied:

H i I L × L t t + T ̲ i ̲ i T d τ , $$\begin{align} \mathcal {H}_{i}I_{L\times L}\le \int _t^{t+T}\underline{\hbar }_i\underline{\hbar }_i^Td\tau, \end{align}$$ (39)
where ̲ i = i i T i + 1 $\underline{\hbar }_{i}=\frac{\hbar _{i}}{\hbar _{i}^{T}\hbar _{i}+1}$ , and H i $\mathcal {H}_{i}$ , T $ T$ are positive constants. Furthermore, assume that 0 < H i λ min ( ̲ i ̲ i T ) $0<\mathcal {H}_{i}\le \lambda _{\min }(\underline{\hbar }_{i}\underline{\hbar }_{i}^{T})$ .
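The persistence of excitation condition (39) can be checked on recorded data. The sketch below is illustrative only: it approximates the integral in (39) by a Riemann sum over one window and returns the smallest eigenvalue, which plays the role of $\mathcal{H}_i$. The function name `pe_level` and the two test signals are assumptions introduced here, not part of the original method.

```python
import numpy as np

def pe_level(hbar_samples, dt):
    """Smallest eigenvalue of the windowed integral in (39).

    hbar_samples: array of shape (N, L) holding samples of hbar_i on [t, t + T].
    """
    h = np.asarray(hbar_samples, dtype=float)
    # Normalised regressor: hbar / (hbar^T hbar + 1)
    h_norm = h / (np.sum(h * h, axis=1, keepdims=True) + 1.0)
    gram = dt * h_norm.T @ h_norm  # Riemann-sum approximation of the integral
    return float(np.linalg.eigvalsh(gram)[0])

dt = 0.01
t = np.arange(0.0, 2.0 * np.pi, dt)
exciting = np.stack([np.sin(t), np.cos(t)], axis=1)            # spans both directions
not_exciting = np.stack([np.sin(t), 2.0 * np.sin(t)], axis=1)  # stays on one line
print(pe_level(exciting, dt), pe_level(not_exciting, dt))
```

A strictly positive value indicates that the window excites every direction of the weight space; a value near zero, as for the second signal, means some weight combination cannot be identified from that window.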

Assumption 5. The function $\Theta (\cdot)=\tanh (\cdot)$ is Lipschitz continuous, which satisfies $\Vert \Theta (\mathcal {I}_1)-\Theta (\mathcal {I}_2)\Vert \le \mathcal {P}_\Theta \Vert \mathcal {I}_1-\mathcal {I}_2\Vert$. Here $\mathcal {P}_\Theta$ is a positive constant and $\mathcal {I}_1, \mathcal {I}_2 \in \mathbb {R}$.

Theorem 2. Consider the NZS games of system (1) with unknown external disturbance, and suppose that Assumptions 1–5 hold and that the disturbance observer (6), the event-triggered approximate optimal control laws (31), (32) and the weight update law (37) are used. If the event-triggered rule

$$\begin{align} ||\pi _{l}||\le \sqrt {\frac{\alpha _{c}\sum _{i=1}^2\lambda _{min}(Q_i)}{\overline{\mathcal {G}}^2\sum _{j=1}^2\varpi _j}}||x||\stackrel{\Delta }{=}Z_{T}, \end{align}$$ (40)
holds, where the parameter $\alpha _{c}\in (0,1)$, then the weight estimation errors $\tilde{W}$, the system state $x$ and the disturbance observer estimation error $\tilde{d}$ are uniformly ultimately bounded (UUB).

Proof. Select the Lyapunov function candidate as $\mathcal {P}=\mu _{1}\mathcal {P}_{1}+\mu _{2}\mathcal {P}_{2}+\mu _{3}\mathcal {P}_{3}+\mu _{4}\mathcal {P}_{4}$, with $\mathcal {P}_{1}=\sum _{i=1}^{2}V_{i}^{*}(x)$, $\mathcal {P}_{2}=\sum _{i=1}^{2}V_{i}^{*}(\check{x}_{k})+\sum _{i=1}^{2}\tilde{W}_{i}^{T}(z_{k})\tilde{W}_{i}(z_{k})+\frac{1}{2}\tilde{d}^{T}(z_{k})\tilde{d}(z_{k})$, $\mathcal {P}_{3}=\sum _{i=1}^{2}\tilde{W}_{i}^{T}\tilde{W}_{i}$, $\mathcal {P}_{4} =\frac{1}{2}\tilde{d}^{T}\tilde{d}$.

Case 1. When $z_{k} \le t < z_{k+1}$, the time derivative of $\mathcal {P}_1$ is

$$\begin{align} \dot{\mathcal {P}}_{1}=&\, \sum _{i=1}^{2} {\left(\nabla V_{i}^{*}\right)}^{T}f(x)\nonumber \\ &+\sum _{i=1}^{2} {\left(\nabla V_{i}^{*}\right)}^{T}\Big(\sum _{j=1}^{2}g_{uj}(x)\check{u}_{j}(\check{x}_{k})+g_{d}(x)d\Big). \end{align}$$ (41)
Due to Equation (20), the following equation holds:
$$\begin{align} &\sum _{i=1}^{2}{\left(\nabla V_{i}^{*}\right)}^{T}f(x)\nonumber \\ &\quad =\, -\sum _{i=1}^{2} {\left(x^{T}Q_{i}x+\mathcal {W}(u_{ri})\right)}\nonumber \\ &\qquad -\sum _{i=1}^{2}{\left(\nabla V_{i}^{*}\right)}^{T}\Big(\sum _{j=1}^{2} g_{uj}(x)u_{j}^{*}(x)+g_{d}(x)d\Big). \end{align}$$ (42)
Substituting (42) into (41) yields (43). We define $[g_{u1}(x),g_{u2}(x)][g_{u1}(x),g_{u2}(x)]^T=\mathcal {G}$, which is assumed to be bounded as $\Vert \mathcal {G}\Vert \le \bar{\mathcal {G}}$.
$$\begin{align} \dot{\mathcal {P}}_{1}=&\, \sum _{i=1}^{2} {\left(- x^{T}Q_{i}x-\mathcal {W}(u_{ri})\right)}+\sum _{i=1}^{2} {\left(\nabla V_{i}^{*}\right)}^{T}{\left(\sum _{j=1}^{2}g_{uj}(x)(\check{u}_{rj}(\check{x}_{k})-u_{rj}^{*}(x))\right)}\nonumber \\ =&\, \sum _{i=1}^{2} {\left(- x^{T}Q_{i}x-\mathcal {W}(u_{ri})\right)}+[\nabla V_{1}^{*}(x)+\nabla V_{2}^{*}(x)]^{T}\sum _{j=1}^{2}g_{uj}(x)(\check{u}_{rj}(\check{x}_{k})-u_{rj}^{*}(x)) \nonumber \\ \le &\, \sum _{i=1}^{2} {\left(- x^{T}Q_{i}x-\mathcal {W}(u_{ri})\right)}+\frac{1}{2}[\nabla V_{1}^{*}(x)+\nabla V_{2}^{*}(x)]^{T}[\nabla V_{1}^{*}(x)+\nabla V_{2}^{*}(x)] \nonumber \\ &+\frac{1}{2}\begin{bmatrix} \check{u}_{r1}(\check{x}_{k})-u_{r1}^{*}(x)\\ \check{u}_{r2}(\check{x}_{k})-u_{r2}^{*}(x) \end{bmatrix}^{T}[g_{u1}(x),g_{u2}(x)]^{T}[g_{u1}(x),g_{u2}(x)]\begin{bmatrix} \check{u}_{r1}(\check{x}_{k})-u_{r1}^{*}(x)\\ \check{u}_{r2}(\check{x}_{k})-u_{r2}^{*}(x) \end{bmatrix}\nonumber \\ \le &\, \sum _{i=1}^{2} (- x^{T}Q_{i}x)+\sum _{i=1}^{2} \Vert (\nabla \sigma _{i}(x))^{T}W_{i}^{*}+\nabla \varepsilon _{i}\Vert ^{2}+\frac{1}{2}\bar{\mathcal {G}}^{2}\sum _{j=1}^{2} \Vert \check{u}_{rj}(\check{x}_{k})-u_{rj}^{*}(x)\Vert ^{2} \nonumber \\ \le &\, \sum _{i=1}^{2} (- x^{T}Q_{i}x)+2\sum _{i=1}^{2} \bar{\sigma }_{i}^{2}c_{\bar{W}_{i}}^{2}+2\sum _{i=1}^{2} \bar{\varepsilon }_{i}^{2}+\frac{1}{2}\bar{\mathcal {G}}^{2}\sum _{j=1}^{2} \Vert \check{u}_{rj}(\check{x}_{k})-u_{rj}^{*}(x)\Vert ^{2}. \end{align}$$ (43)

By applying Young's inequality, we can derive

$$\begin{align} &\Vert \check{u}_{r1}(\check{x}_{k})-u_{r1}^{*}(x)\Vert ^{2} \nonumber \\ &\quad = \Vert (\check{u}_{r1}(\check{x}_{k})-\check{u}_{r1}^{*}(\check{x}_{k}))+(\check{u}_{r1}^{*}(\check{x}_{k})-u_{r1}^{*}(x))\Vert ^{2} \nonumber \\ &\quad \le 2\Vert \check{u}_{r1}(\check{x}_{k})-\check{u}_{r1}^{*}(\check{x}_{k})\Vert ^{2} +2\Vert \check{u}_{r1}^{*}(\check{x}_{k})-u_{r1}^{*}(x)\Vert ^{2} \nonumber \\ &\quad \le \Vert R_{11}^{-1}\Vert ^{2}\overline{\mathcal {G}}_{u1}^{2}(\overline{\sigma }_{1}^{2}\Vert \tilde{W}_{1}\Vert ^{2}+\overline{\varepsilon }_{1}^{2}) +2\varpi _{1}||\pi _{l}||^{2}, \end{align}$$ (44)
$$\begin{align} &\Vert \check{u}_{r2}(\check{x}_{k})-u_{r2}^{*}(x)\Vert ^{2} \nonumber \\ &\quad = \Vert (\check{u}_{r2}(\check{x}_{k})-\check{u}_{r2}^{*}(\check{x}_{k}))+(\check{u}_{r2}^{*}(\check{x}_{k})-u_{r2}^{*}(x))\Vert ^{2} \nonumber \\ &\quad \le 2\Vert \check{u}_{r2}(\check{x}_{k})-\check{u}_{r2}^{*}(\check{x}_{k})\Vert ^{2} +2\Vert \check{u}_{r2}^{*}(\check{x}_{k})-u_{r2}^{*}(x)\Vert ^{2} \nonumber \\ &\quad \le \Vert \Gamma _{2}\Vert ^{2}\mathcal {P}_{\Theta }^{2}\overline{\mathcal {G}}_{u2}^{2}\Vert R_{22}^{-1}\Vert ^{2}(\overline{\sigma }_{2}^{2}\Vert \tilde{W}_{2}\Vert ^{2}+\overline{\varepsilon }_{2}^{2}) +2\varpi _{2}||\pi _{l}||^{2}. \end{align}$$ (45)
In deriving (44) and (45), the following bounds are used:
$$\begin{align} &\Vert \check{u}_{r1}(\check{x}_{k})-\check{u}_{r1}^{*}(\check{x}_{k})\Vert ^{2} \nonumber \\ &\quad = \bigg \Vert \frac{1}{2}R_{11}^{-1}g_{u1}^{T}(\check{x}_{k})(\nabla \sigma _{1}(\check{x}_{k}))^{T}\hat{W}_{1}(z_{k})\nonumber \\ &\qquad -\;\frac{1}{2}R_{11}^{-1}g_{u1}^{T}(\check{x}_{k}){\left((\nabla \sigma _{1}(\check{x}_{k}))^{T}W_{1}^{*}(z_{k})+\nabla \varepsilon _{1}\right)}\bigg \Vert ^{2} \nonumber \\ &\quad \le \frac{1}{4}\Big \Vert R_{11}^{-1}g_{u1}^{T}(\check{x}_{k})(\nabla \sigma _{1}(\check{x}_{k}))^{T}\hat{W}_{1}(z_{k}) \nonumber \\ &\qquad -\;R_{11}^{-1}g_{u1}^{T}(\check{x}_{k}){\left((\nabla \sigma _{1}(\check{x}_{k}))^{T}W_{1}^{*}(z_{k})+\nabla \varepsilon _{1}\right)}\Big \Vert ^{2} \nonumber \\ &\quad \le \frac{1}{4}\Vert R_{11}^{-1}\Vert ^{2}\Vert g_{u1}^{T}(\check{x}_{k})\Vert ^{2}\Vert -(\nabla \sigma _{1}(\check{x}_{k}))^{T}\tilde{W}_{1}(z_{k})-\nabla \varepsilon _{1}\Vert ^{2} \nonumber \\ &\quad \le \frac{1}{2}\Vert R_{11}^{-1}\Vert ^{2}\overline{\mathcal {G}}_{u1}^{2}(\overline{\sigma }_{1}^{2}\Vert \tilde{W}_{1}\Vert ^{2}+\bar{\varepsilon }_{1}^{2}), \end{align}$$ (46)
$$\begin{align} &\Vert \check{u}_{r2}(\check{x}_{k})-\check{u}_{r2}^{*}(\check{x}_{k})\Vert ^{2} \nonumber \\ &\quad = \bigg \Vert \Gamma _{2}\Theta {\left(\frac{1}{2}R_{22}^{-1}g_{u2}^{T}(\check{x}_{k})(\nabla \sigma _{2}(\check{x}_{k}))^{T}\hat{W}_{2}(z_{k})\right)}\nonumber \\ &\qquad -\;\Gamma _{2}\Theta {\left(\frac{1}{2}R_{22}^{-1}g_{u2}^{T}(\check{x}_{k}){\left((\nabla \sigma _{2}(\check{x}_{k}))^{T}W_{2}^{*}(z_{k})+\nabla \varepsilon _{2}\right)}\right)}\bigg \Vert ^{2} \nonumber \\ &\quad \le \frac{1}{4}\Vert \Gamma _{2}\Vert ^{2} \mathcal {P}_{\Theta }^{2}\Big \Vert R_{22}^{-1}g_{u2}^{T}(\check{x}_{k})(\nabla \sigma _{2}(\check{x}_{k}))^{T}\hat{W}_{2}(z_{k}) \nonumber \\ &\qquad -\;R_{22}^{-1}g_{u2}^{T}(\check{x}_{k}){\left((\nabla \sigma _{2}(\check{x}_{k}))^{T}W_{2}^{*}(z_{k})+\nabla \varepsilon _{2}\right)}\Big \Vert ^{2} \nonumber \\ &\quad \le \frac{1}{4}\Vert \Gamma _{2}\Vert ^{2}\mathcal {P}_{\Theta }^{2} \Vert g_{u2}^{T}(\check{x}_{k})\Vert ^{2}\Vert R_{22}^{-1}\Vert ^{2}\Vert -(\nabla \sigma _{2}(\check{x}_{k}))^{T}\tilde{W}_{2}(z_{k})-\nabla \varepsilon _{2}\Vert ^{2} \nonumber \\ &\quad \le \frac{1}{2}\Vert R_{22}^{-1}\Vert ^{2}\Vert \Gamma _{2}\Vert ^{2}\mathcal {P}_{\Theta }^{2}\overline{\mathcal {G}}_{u2}^{2}(\overline{\sigma }_{2}^{2}\Vert \tilde{W}_{2}\Vert ^{2}+\bar{\varepsilon }_{2}^{2}). \end{align}$$ (47)

Substituting (44)–(47) into (43) yields

$$\begin{align} \dot{\mathcal {P}}_{1}\le &\, \sum _{i=1}^{2} (- x^{T}Q_{i}x)+2\sum _{i=1}^{2} \bar{\sigma }_{i}^{2}c_{\bar{W}_{i}}^{2}+2\sum _{i=1}^{2} \overline{\varepsilon }_{i}^{2} \nonumber \\ &+\frac{1}{2}\bar{\mathcal {G}}^{2}\Vert \Gamma _{2}\Vert ^{2}\mathcal {P}_{\Theta }^{2}\bar{\mathcal {G}}_{u2}^{2}\Vert R_{22}^{-1}\Vert ^{2}(\bar{\sigma }_{2}^{2}\Vert \tilde{W}_{2}\Vert ^{2}+\overline{\varepsilon }_{2}^{2})\nonumber \\ &+\frac{1}{2}\bar{\mathcal {G}}^{2}\Vert R_{11}^{-1}\Vert ^{2}\bar{\mathcal {G}}_{u1}^{2}(\bar{\sigma }_{1}^{2}\Vert \tilde{W}_{1}\Vert ^{2}+\overline{\varepsilon }_{1}^{2})+\bar{\mathcal {G}}^{2}\sum _{j=1}^{2} \varpi _{j}||\pi _{l}||^{2}. \end{align}$$ (48)

Next, we analyze the remaining terms of the Lyapunov function. The time derivative of $\mathcal {P}_{3}$ can be deduced as

$$\begin{align} \dot{\mathcal {P}}_{3}&=2\sum _{i=1}^{2} \tilde{W}_{i}^{T}\dot{\tilde{W}}_{i}=2\sum _{i=1}^{2} {\left(\frac{\iota _{i}\tilde{W}_{i}^{T}\hbar _{i}\varepsilon _{H_{i}}}{(\hbar _{i}^{T}\hbar _{i}+1)^{2}}-\frac{\iota _{i}\tilde{W}_{i}^{T}\hbar _{i}\hbar _{i}^{T}\tilde{W}_{i}}{(\hbar _{i}^{T}\hbar _{i}+1)^{2}}\right)} \nonumber \\ &\le \sum _{i=1}^{2} \iota _{i}\frac{\varepsilon _{H_{i}}^{2}-\tilde{W}_{i}^{T}\hbar _{i}\hbar _{i}^{T}\tilde{W}_{i}}{{\left(\hbar _{i}^{T}\hbar _{i}+1\right)}^{2}}\le -\sum _{i=1}^{2} \iota _{i}{\left(\tilde{W}_{i}^{T}\underline{\hbar }_{i}\underline{\hbar }_{i}^{T}\tilde{W}_{i}-\overline{\varepsilon }_{H_{i}}^{2}\right)} \nonumber \\ &\le -\sum _{i=1}^{2} \iota _{i}\lambda _{min}(\underline{\hbar }_{i}\underline{\hbar }_{i}^{T})||\tilde{W}_{i}||^{2}+F \nonumber \\ &\le -\sum _{i=1}^{2}\iota _{i}\mathcal {H}_{i}||\tilde{W}_{i}||^{2}+F, \end{align}$$ (49)
where $F=\sum _{i=1}^{2}\iota _{i}\overline{\varepsilon }_{H_{i}}^{2}$. As the function $\mathcal {P}_2$ remains unchanged during Case 1, its time derivative equals zero. The time derivative of $\mathcal {P}_4$ is given by (9).
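The first inequality in (49) rests on Young's inequality $2ab\le a^{2}+b^{2}$ with $a=\varepsilon _{H_{i}}$ and $b=\hbar _{i}^{T}\tilde{W}_{i}$. Written out for one player, as a reading aid rather than part of the original derivation:

$$\begin{align*} 2\tilde{W}_{i}^{T}\hbar _{i}\varepsilon _{H_{i}}-2\tilde{W}_{i}^{T}\hbar _{i}\hbar _{i}^{T}\tilde{W}_{i} &\le \varepsilon _{H_{i}}^{2}+\tilde{W}_{i}^{T}\hbar _{i}\hbar _{i}^{T}\tilde{W}_{i}-2\tilde{W}_{i}^{T}\hbar _{i}\hbar _{i}^{T}\tilde{W}_{i} \\ &=\varepsilon _{H_{i}}^{2}-\tilde{W}_{i}^{T}\hbar _{i}\hbar _{i}^{T}\tilde{W}_{i}. \end{align*}$$

Multiplying by $\iota _{i}/(\hbar _{i}^{T}\hbar _{i}+1)^{2}$ gives the summand in the second line of (49). The next step uses $(\hbar _{i}^{T}\hbar _{i}+1)^{2}\ge 1$ together with a bound $|\varepsilon _{H_{i}}|\le \overline{\varepsilon }_{H_{i}}$, which is assumed here to have been introduced with the residual error earlier in the paper.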

Combining the derivatives of the individual terms of the Lyapunov function, we obtain the following expression:

$$\begin{align} \dot{\mathcal {P}}\le &\, \mu _{1}\bigg (\sum _{i=1}^{2} (- x^{T}Q_{i}x)+2\sum _{i=1}^{2} \bar{\sigma }_{i}^{2}c_{\bar{W}_{i}}^{2}+2\sum _{i=1}^{2} \bar{\varepsilon }_{i}^{2} \nonumber \\ &+\frac{1}{2}\bar{\mathcal {G}}^{2}\Vert \Gamma _{2}\Vert ^{2}\mathcal {P}_{\Theta }^{2}\bar{\mathcal {G}}_{u2}^{2}\Vert R_{22}^{-1}\Vert ^{2}(\bar{\sigma }_{2}^{2}\Vert \tilde{W}_{2}\Vert ^{2}+\bar{\varepsilon }_{2}^{2})\nonumber \\ &+\frac{1}{2}\bar{\mathcal {G}}^{2}\Vert R_{11}^{-1}\Vert ^{2}\bar{\mathcal {G}}_{u1}^{2}(\bar{\sigma }_{1}^{2}\Vert \tilde{W}_{1}\Vert ^{2}+\bar{\varepsilon }_{1}^{2})+\bar{\mathcal {G}}^{2}\sum _{j=1}^{2} \varpi _{j}||\pi _{l}||^{2}\bigg )\nonumber \\ &+\mu _{3}\bigg (-\sum _{i=1}^{2}\iota _{i}\mathcal {H}_{i}||\tilde{W}_{i}||^{2}+F\bigg )-\mu _4\lambda _{min}(\theta)\Vert \tilde{d}\Vert ^2. \end{align}$$ (50)
Furthermore, we have
$$\begin{align} \dot{\mathcal {P}}\le &\, \mu _{1}\bigg (-\sum _{i=1}^{2}(1-\alpha _{c})\lambda _{\min }(Q_{i})||x||^{2}+\frac{1}{2}\bar{\mathcal {G}}^{2}\Vert \Gamma _{2}\Vert ^{2}\mathcal {P}_{\Theta }^{2}\bar{\mathcal {G}}_{u2}^{2}\Vert R_{22}^{-1}\Vert ^{2}\bar{\sigma }_{2}^{2}\Vert \tilde{W}_{2}\Vert ^{2}\nonumber \\ &+\frac{1}{2}\bar{\mathcal {G}}^{2}\Vert R_{11}^{-1}\Vert ^{2}\bar{\mathcal {G}}_{u1}^{2}\bar{\sigma }_{1}^{2}\Vert \tilde{W}_{1}\Vert ^{2}\bigg )-\mu _{3}\sum _{i=1}^{2}\iota _{i}\mathcal {H}_{i}||\tilde{W}_{i}||^{2}-\mu _{4}\lambda _{min}(\theta)||\tilde{d}||^{2}\nonumber \\ &+\mu _{1}\bigg (-\alpha _{c}\sum _{i=1}^{2}\lambda _{\min }(Q_{i})||x||^{2}+\overline{\mathcal {G}}^{2}\sum _{j=1}^{2}\varpi _{j}||\pi _{l}||^{2}\bigg )+\varsigma, \end{align}$$ (51)
where
$$\begin{align} \varsigma =&\, \mu _{1}\bigg (2\sum _{i=1}^{2} \bar{\sigma }_{i}^{2}c_{\bar{W}_{i}}^{2}+2\sum _{i=1}^{2} \bar{\varepsilon }_{i}^{2}+\frac{1}{2}\bar{\mathcal {G}}^{2}\Vert \Gamma _{2}\Vert ^{2}\mathcal {P}_{\Theta }^{2}\bar{\mathcal {G}}_{u2}^{2}\Vert R_{22}^{-1}\Vert ^{2}\bar{\varepsilon }_{2}^{2} \nonumber \\ &+\frac{1}{2}\bar{\mathcal {G}}^{2}\Vert R_{11}^{-1}\Vert ^{2}\bar{\mathcal {G}}_{u1}^{2}\bar{\varepsilon }_{1}^{2}\bigg )+\mu _{3}F>0. \end{align}$$ (52)
Therefore, under the condition (40), when the inequality
$$\begin{align} \Vert \tilde{W}_{1}\Vert >\sqrt {\frac{2\varsigma }{\bar{\mathcal {G}}^{2}\Vert \Gamma _{2}\Vert ^{2}\mathcal {P}_{\Theta }^{2}\bar{\mathcal {G}}_{u2}^{2}\Vert R_{22}^{-1}\Vert ^{2}\bar{\sigma }_{2}^{2}+2\mu _{3}\iota _{1}\mathcal {H}_{1}}}=\xi _{\tilde{W}_{1}} \end{align}$$ (53)
or
$$\begin{align} \Vert \tilde{W}_{2}\Vert > \sqrt {\frac{2\varsigma }{\bar{\mathcal {G}}^{2}\Vert R_{11}^{-1}\Vert ^{2}\bar{\mathcal {G}}_{u1}^{2}\bar{\sigma }_{1}^{2} + 2\mu _{3}\iota _{2}\mathcal {H}_{2}}} = \xi _{\tilde{W}_{2}} \end{align}$$ (54)
or
$$\begin{align} \Vert x\Vert > \sqrt {\frac{\varsigma }{\mu _{1}(1-\alpha _{c})\sum _{i=1}^{2}\lambda _{min}(Q_{i})}}=\xi _{x} \end{align}$$ (55)
or
$$\begin{align} \Vert \tilde{d}\Vert >\sqrt {\frac{\varsigma }{\mu _{4}\lambda _{min}(\theta)}}=\xi _{\tilde{d}} \end{align}$$ (56)
is satisfied, the conclusion $\dot{\mathcal {P}}<0$ can be drawn, which indicates that the weight estimation errors $\tilde{W}$, the system state $x$ and the disturbance observer estimation error $\tilde{d}$ are UUB.

Case 2. When $t=z_{k+1}$, we can obtain that

$$\begin{align} \Delta \mathcal {P}_{2}=&\, \sum _{i=1}^{2} V_{i}^{*}(\check{x}(z_{k+1}))-\sum _{i=1}^{2} V_{i}^{*}(\check{x}(z_{k}))\nonumber \\ &+\sum _{i=1}^{2} \tilde{W}_{i}^{T}(z_{k+1})\tilde{W}_{i}(z_{k+1})-\sum _{i=1}^{2}\tilde{W}_{i}^{T}(z_{k})\tilde{W}_{i}(z_{k})\nonumber \\ &+\frac{1}{2}\tilde{d}^{T}(z_{k+1})\tilde{d}(z_{k+1})-\frac{1}{2}\tilde{d}^{T}(z_{k})\tilde{d}(z_{k})\nonumber \\ =&\, (\mathcal {P}_{1}(z_{k+1})+\mathcal {P}_{3}(z_{k+1})+\mathcal {P}_{4}(z_{k+1}))\nonumber \\ &-(\mathcal {P}_{1}(z_{k})+\mathcal {P}_{3}(z_{k})+\mathcal {P}_{4}(z_{k})). \end{align}$$ (57)

According to the analysis of Case 1, the function $\mathcal {P}_{1}(t)+\mathcal {P}_{3}(t) +\mathcal {P}_{4}(t)$ is non-increasing during the time interval. Hence, when the conditions (40), (53), (54), (55), (56) hold, $\Delta \mathcal {P}_{2}\le 0$. As $\mathcal {P}_{1}$, $\mathcal {P}_{3}$ and $\mathcal {P}_{4}$ are all time-continuous functions, $\Delta \mathcal {P}=\mu _{2}\Delta \mathcal {P}_{2}\le 0$.

Theorem 2 has been proven.

So far, the entire framework of the DET-based adaptive dynamic programming approach for the NZS games has been established. As shown in Figure 1, by comparing the measurement error with the triggered condition, it is determined whether an event is triggered. Then, if the event is triggered, the critic NNs approximate the solutions of the coupled HJ equations by substituting in the sampled state. Meanwhile, the optimal strategy is updated based on this approximation, and finally, the ZOH maintains the continuity of the control signal.
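The compare-then-hold logic described above can be sketched on a recorded trajectory. This is an illustration under stated assumptions rather than the authors' code: the threshold follows the static rule (40), `x_held` plays the role of the sampled state kept by the ZOH, the bound `g_bar` is a hypothetical value, and the decaying toy trajectory stands in for the closed-loop microgrid state. The values of $\alpha_c$, $\varpi_j$ and $\lambda_{\min}(Q_i)$ are taken from Example 1.

```python
import numpy as np

def static_threshold(x, alpha_c, q_min_eigs, g_bar, varpis):
    """Threshold Z_T of the static rule (40)."""
    ratio = alpha_c * sum(q_min_eigs) / (g_bar ** 2 * sum(varpis))
    return np.sqrt(ratio) * np.linalg.norm(x)

def run_static_trigger(states, alpha_c=0.75, q_min_eigs=(2.0, 1.0),
                       g_bar=1.0, varpis=(1.0, 1.0)):
    """Return the sample indices at which an event fires."""
    x_held = np.asarray(states[0], dtype=float)  # state held by the ZOH
    events = [0]
    for k in range(1, len(states)):
        x = np.asarray(states[k], dtype=float)
        gap = np.linalg.norm(x - x_held)  # measurement error ||pi_l||
        if gap > static_threshold(x, alpha_c, q_min_eigs, g_bar, varpis):
            x_held = x  # event: resample; the control law would be recomputed here
            events.append(k)
    return events

# Toy trajectory (assumption): exponential decay from the Example 1 initial state.
x0 = np.array([0.5, -0.5, 0.5, -0.5])
trajectory = [x0 * np.exp(-0.01 * k) for k in range(1000)]
events = run_static_trigger(trajectory)
print(len(events), "events over", len(trajectory), "samples")
```

Between events the controller sees only `x_held`, so the number of control updates equals the number of events rather than the number of samples.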

3.3 Dynamic event triggering mechanism and stability analysis

In order to adjust the triggering threshold based on the circumstances, this section introduces a dynamic variable $\chi _x$ to generate a dynamic event-trigger rule. The dynamic variable needs to satisfy the following formulas:
$$\begin{align} \dot{\chi }_{x}&=- \varrho \chi _{x}+{\left\lbrace \alpha _{c}\sum _{i=1}^{2} \lambda _{min}(Q_{i})\Vert x\Vert ^{2}-\overline{\mathcal {G}}^{2}\sum _{j=1}^{2} \varpi _{j}\Vert \pi _{l}\Vert ^{2}\right\rbrace},\nonumber \\ \chi _x^0&=\chi _x(0)\ge 0, \end{align}$$ (58)
where the parameter $\varrho >0$ represents the filtering coefficient.
The dynamic variable is generated by the above filter structure, which can dynamically adjust the threshold. Since the DET strategy ensures that the value of $\chi _{x}$ remains non-negative, it relaxes the stability condition, thereby increasing the trigger interval. The DET rule can be defined as:
$$\begin{align} z_{k+1}=&\; \inf \bigg \lbrace t\in R_{0}^{+}\,\Big |\,t>z_{k}\cap \Big [\chi _{x}(t)+\beta \Big (\alpha _{c}\sum _{i=1}^{2}\lambda _{min}(Q_{i})\Vert x\Vert ^{2}\nonumber \\ &-\;\overline{\mathcal {G}}^{2}\sum _{j=1}^{2}\varpi _{j}\Vert \pi _{l}\Vert ^{2}\Big )\le 0\Big ]\bigg \rbrace, \end{align}$$ (59)
where $\beta >0$. When $\beta \rightarrow 0$, the dynamic event triggering rule (59) becomes the static event triggering rule (40).
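A discrete-time sketch of the filter (58) and the rule (59) is given below. It is illustrative only: the state trajectory is a toy exponential decay, `g_bar` is a hypothetical bound, forward Euler replaces the continuous-time filter, and the parameter defaults ($\alpha_c=0.75$, $\varrho=0.5$, $\beta=0.1$, $\chi_x^0=0$, $\varpi_1+\varpi_2=2$, $\sum_i\lambda_{\min}(Q_i)=3$) are those reported for Example 1.

```python
import numpy as np

def det_gap(x, pi_l, alpha_c, q_eig_sum, g_bar, varpi_sum):
    """Bracketed term shared by (58) and (59)."""
    return alpha_c * q_eig_sum * np.dot(x, x) - g_bar ** 2 * varpi_sum * np.dot(pi_l, pi_l)

def run_dynamic_trigger(states, dt, alpha_c=0.75, q_eig_sum=3.0, g_bar=1.0,
                        varpi_sum=2.0, rho=0.5, beta=0.1, chi0=0.0):
    """Return (event indices, history of the dynamic variable chi_x)."""
    x_held = np.asarray(states[0], dtype=float)
    chi = chi0
    events, chi_log = [0], [chi]
    for k in range(1, len(states)):
        x = np.asarray(states[k], dtype=float)
        gap = det_gap(x, x - x_held, alpha_c, q_eig_sum, g_bar, varpi_sum)
        if chi + beta * gap <= 0.0:  # rule (59)
            x_held = x               # event: the measurement error resets to zero
            events.append(k)
            gap = det_gap(x, np.zeros_like(x), alpha_c, q_eig_sum, g_bar, varpi_sum)
        chi = chi + dt * (-rho * chi + gap)  # Euler step of the filter (58)
        chi_log.append(chi)
    return events, chi_log

x0 = np.array([0.5, -0.5, 0.5, -0.5])
trajectory = [x0 * np.exp(-0.01 * k) for k in range(1000)]
events, chi_log = run_dynamic_trigger(trajectory, dt=0.01)
print(len(events), "events; min chi =", min(chi_log))
```

In this sketch an event requires the bracketed term to be negative enough to exhaust the stored value of $\chi_x$, which is how the dynamic variable postpones triggering relative to a rule based on the bracketed term alone.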

Lemma 1. For the system (1), the internal dynamic variable $\chi _x$ always remains non-negative during the DET control process.

Proof. Note that, based on the employed DET rule (59), the following inequality

$$\begin{align} \chi _{x}(t)+\beta \Big (\alpha _{c}\sum _{i=1}^{2} \lambda _{min}(Q_{i})\Vert x\Vert ^{2}-\bar{\mathcal {G}}^{2}\sum _{j=1}^{2}\varpi _{j}\Vert \pi _{l}\Vert ^{2}\Big )\ge 0, \end{align}$$ (60)
is satisfied between triggering instants. Then, if $\beta \ne 0$, by combining (58) and (60), we have
$$\begin{align} \dot{\chi }_{x}(t)+\varrho \chi _{x}(t)\ge - \frac{\chi _{x}(t)}{\beta },\quad \chi _{x}^{0}\ge 0. \end{align}$$ (61)

According to the comparison lemma, we can get the following relation:
$$\begin{align} \chi _x(t)\ge \chi _x^0e^{-{\left(\varrho +\frac{1}{\beta }\right)}t},\quad t\in [0,z_{\infty }). \end{align}$$ (62)
Therefore, $\chi _{x}(t)\ge 0$ is proven. $\Box$

Theorem 3. Consider the NZS games of system (1) with unknown external disturbance, and suppose that Assumptions 1–5 hold and that the disturbance observer (6), the event-triggered approximate optimal control laws (31), (32) and the weight update law (37) are used. If the dynamic event triggering rule (59) is adopted, the system state $x$, the weight estimation errors $\tilde{W}$ and the disturbance observer estimation error $\tilde{d}$ are UUB.

Proof. Select the following Lyapunov function

$$\begin{align} \mathcal {P}_{\chi }=\mathcal {P}+\mu _{1}\chi _{x}(t). \end{align}$$ (63)

The time derivative of $\mathcal {P}_{\chi }$ can be obtained as

$$\begin{align} \dot{\mathcal {P}}_{\chi }\le &\, \mu _{1}\bigg (-\sum _{i=1}^{2}(1-\alpha _{c})\lambda _{\min }(Q_{i})||x||^{2}+\frac{1}{2}\bar{\mathcal {G}}^{2}\Vert \Gamma _{2}\Vert ^{2}\mathcal {P}_{\Theta }^{2}\bar{\mathcal {G}}_{u2}^{2}\Vert R_{22}^{-1}\Vert ^{2}\bar{\sigma }_{2}^{2}\Vert \tilde{W}_{2}\Vert ^{2}\nonumber \\ &+\frac{1}{2}\bar{\mathcal {G}}^{2}\Vert R_{11}^{-1}\Vert ^{2}\bar{\mathcal {G}}_{u1}^{2}\bar{\sigma }_{1}^{2}\Vert \tilde{W}_{1}\Vert ^{2}\bigg )-\mu _{3}\sum _{i=1}^{2}\iota _{i}\mathcal {H}_{i}||\tilde{W}_{i}||^{2}-\mu _{4}\lambda _{min}(\theta)||\tilde{d}||^{2}\nonumber \\ &+\mu _{1}\bigg (-\alpha _{c}\sum _{i=1}^{2}\lambda _{\min }(Q_{i})||x||^{2}+\overline{\mathcal {G}}^{2}\sum _{j=1}^{2}\varpi _{j}||\pi _{l}||^{2}\bigg )+\varsigma \nonumber \\ &+\mu _{1}\bigg (\alpha _{c}\sum _{i=1}^{2}\lambda _{min}(Q_{i})\Vert x\Vert ^{2}-\sum _{j=1}^{2} \overline{\mathcal {G}}^{2}\varpi _{j}\Vert \pi _{l}\Vert ^{2}\bigg )-\varrho \chi _{x}. \end{align}$$ (64)
Based on Lemma 1 and $\varrho >0$, if the conditions (53), (54), (55), (56) hold, we can have
$$\begin{align} \dot{\mathcal {P}}_{\chi }\le -\varrho \chi _{x}<0. \end{align}$$ (65)

The proof is thus completed. $\Box$

4 SIMULATION

In this section, the frequency controller is designed according to the proposed method and tested in two simulation examples.

Example 1. Inspired by the excellent work [11], the power constraint of the ESS was set as $u_{2max}=\Gamma _{2}=0.1$. The parameters of the microgrid [12] were chosen as $T_{g}=0.3$, $K_{p}=5$, $T_{p}=10$, $T_{t}=0.3$, $R=2$, $K_{E}=1$.

The optimal cost functions of system (1) were defined with parameters $Q_{1}=2I_{4\times 4}$, $Q_{2}=I_{4\times 4}$, $R_{11}=2$, $R_{12}=1$, $R_{21}=1$ and $R_{22}=3$. The activation functions of the NNs of the ESS and governor were both given as $[\cos x_1,\cos x_2,\sec x_3,\sec x_4]^T$. The initial values of the critic NN weights were set as $\hat{W}_{1}=[-0.2,0.1,0.5,-0.3]^T$ and $\hat{W}_{2}=[0.2,-0.3,-0.7,0.5]^T$, respectively. The learning rates were chosen as $\iota _{1}=\iota _{2}=5$.
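The critic structure stated above can be written down directly. The sketch below is a minimal illustration, assuming the usual linear-in-weights critic $\hat{V}_i(x)=\hat{W}_i^T\sigma(x)$; the function names are introduced here for illustration. It evaluates the activation vector, its Jacobian, and the gradient term $(\nabla \sigma)^T\hat{W}_i$ that appears in (46) and (47).

```python
import numpy as np

def sigma(x):
    """Activation vector [cos x1, cos x2, sec x3, sec x4]^T."""
    x1, x2, x3, x4 = x
    return np.array([np.cos(x1), np.cos(x2), 1.0 / np.cos(x3), 1.0 / np.cos(x4)])

def grad_sigma(x):
    """Jacobian of sigma; diagonal because each basis function uses one state."""
    x1, x2, x3, x4 = x
    return np.diag([-np.sin(x1), -np.sin(x2),
                    np.tan(x3) / np.cos(x3), np.tan(x4) / np.cos(x4)])

def critic_value(w_hat, x):
    """Approximate value function: w_hat^T sigma(x)."""
    return float(w_hat @ sigma(x))

def critic_gradient(w_hat, x):
    """Gradient of the approximate value function: (grad sigma)^T w_hat."""
    return grad_sigma(x).T @ w_hat

x_init = np.array([0.5, -0.5, 0.5, -0.5])    # initial state of Example 1
w1_init = np.array([-0.2, 0.1, 0.5, -0.3])   # initial critic weights of player 1
print(critic_value(w1_init, x_init), critic_gradient(w1_init, x_init))
```

Note that $\sec x$ is unbounded near $x=\pm \pi /2$, so this basis is only meaningful while $x_3$ and $x_4$ stay well inside that range.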

For the dynamic event-triggering condition (59), the parameters were chosen as $\alpha _{c}=0.75$, $\varpi _{1}=\varpi _{2}=1$, $\chi _{x}^{0}=0$, $\varrho =0.5$ and $\beta =0.1$. In order to achieve the disturbance observation, the initial value of the intermediate variable $b$ was set to 0 and $p(x)$ was designed as $10x_{1}$. Thus $q=10$. Starting with the initial value $x(0)=[0.5,-0.5,0.5,-0.5]^T$, the simulation time step was set to 0.01 s. The unknown external disturbance was added to the MG system for the first 30 s, and the dynamic event-triggered method was implemented for 150 s. The estimation of the disturbance is shown in Figure 4.

Running the algorithm yields the learning results displayed in Figures 2, 3 and 4. The system states, the proposed DET control policies and the approximate Hamiltonian functions are presented in Figure 2. It can be seen that the system state is close to zero at $t=100$ s, which indicates that the method is effective. It can be seen from Figure 2c that the control input amplitude of the governor is less than 0.1, which meets the control constraints.

Figure 2. The evolution of (a) the system state $x$; (b) the control policy of energy storage; (c) the control policy of governor; (d) the approximate Hamiltonian functions.
Figure 3. (a) The critic weight $\hat{W}_{1}$; (b) the critic weight $\hat{W}_{2}$; (c) the cumulative number of the events; (d) the triggering condition.
Figure 4. Disturbance estimation in MG.

The training curves of the critic NN weights are shown in Figure 3. It can be seen that the converged weights are $\hat{W}_{1}=[-5.0138,-7.5439,-2.2697,0.6379]^T$ and $\hat{W}_{2}=[-2.9396,-4.0187,-1.4412,0.4921]^T$, respectively. The triggering process is also provided in Figure 3. In Figure 3c, the cumulative number of triggers for the dynamic event-triggering method is compared with the number of samples for the time-triggering method, indicating that the dynamic event-triggering method saves computing and communication resources. Figure 3d shows the evolution of the triggering condition, which illustrates how $\Vert \pi _{l}\Vert$ and $Z_{T}$ change according to the condition (40).

The simulation result in Figure 4 illustrates that the disturbance observer efficiently estimates the external disturbance, making it possible to offset the interference of the disturbance. Notably, during the first 30 s, when the disturbance is applied, the designed DOB successfully approximates the disturbance. Afterward, the system states gradually converge.

Example 2. Unlike the damped oscillating disturbance applied in Example 1, a step disturbance was added for the first 50 s.

The parameters of the system were chosen as $T_{g}=0.5$, $K_{p}=1$, $T_{p}=10$, $T_{t}=0.2$, $R=2$, $K_{E}=1$. The DET condition parameters were set as $\alpha _{c}=0.75$, $\varpi _{1}=\varpi _{2}=3$, $\chi _{x}^{0}=0$, $\varrho =5$ and $\beta =0.1$. To approximate the external unknown disturbance, $p(x)$ was changed to $20x_{1}$; accordingly, $q=20$. The other parameters were the same as in Example 1. The designed method was implemented for 200 s.

The running results of the designed algorithm are shown in Figures 4–6. The system states, the DET control policies and the approximate Hamiltonian functions are presented in Figure 5. It can be seen that the system state is close to zero at 150 s, which indicates that the method is effective. The control input of the governor is constrained to 0.1, which satisfies the input constraint.

Figure 5. The evolution of (a) the system state $x$; (b) the control policy of energy storage; (c) the control policy of governor; (d) the approximate Hamiltonian functions.
Figure 6. (a) The critic weight $\hat{W}_{1}$; (b) the critic weight $\hat{W}_{2}$; (c) the cumulative number of the events; (d) the triggering condition.

According to the simulation results, we can obtain the estimation of the ideal weight as W ̂ 1 = [ 5.3000 , 7.4635 , 2.0990 , 0.0655 ] T $\hat{W}_{1}=[-5.3000,-7.4635,-2.0990, -0.0655]^T$ , W ̂ 2 = [ 3.1135 , 3.9453 , 1.3859 , 0.1391 ] T $\hat{W}_{2}=[-3.1135,-3.9453,-1.3859,0.1391]^T$ . The triggering process is shown in Figure 6. And in Figure 6c, the cumulative number of triggers for the DET method and the number of samples for the time triggering method are compared, indicating that the dynamic event-triggering method can indeed save system computing and communication resources. Figure 6d shows the evolution of the triggering condition, which illustrates how π l $ \Vert \pi _{l}\Vert $ and Z T $ Z_{T}$ change according to the condition (40).

The simulation results in Figure 4 illustrate that the disturbance observer can successfully approximate both the oscillating disturbance and the step disturbance, which demonstrates its good estimation performance. The convergence of the system state to zero also indicates that the proposed algorithm, the DET-based ADP approach combined with a disturbance observer, can use the disturbance estimate to offset the disturbance, thereby addressing the optimal frequency control problem.
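To make the role of $p(x)$ and $q$ concrete, the following is a minimal sketch of a standard disturbance observer of the form $\hat{d}=z+p(x)$ on a toy scalar plant. It assumes $p(x)=qx$ with $q=20$, matching the values quoted for this example, but the plant $\dot{x}=-x+u+d$, the test disturbance, and the function names are hypothetical, and the observer structure may differ from the DOB defined earlier in the article.

```python
import math


def disturbance(t):
    """Hypothetical test signal: a slow oscillation plus a step at t = 5 s."""
    return 0.2 * math.sin(0.5 * t) + (0.5 if t >= 5.0 else 0.0)


def run_observer(t_end=10.0, dt=1e-3, q=20.0):
    """Toy plant x' = -x + u + d with observer d_hat = z + p(x), p(x) = q * x.

    With z' = -q * (-x + u + d_hat), the estimate obeys
    d_hat' = q * (d - d_hat), so the error decays at rate q.
    """
    x, u = 0.0, 0.0
    z = -q * x                              # start with d_hat = 0
    t = 0.0
    for _ in range(int(round(t_end / dt))):
        d_hat = z + q * x
        known = -x + u                      # known part of the plant dynamics
        z_dot = -q * (known + d_hat)        # observer internal state
        x_dot = known + disturbance(t)      # true plant; d is not measured
        z += dt * z_dot
        x += dt * x_dot
        t += dt
    return z + q * x, disturbance(t)


d_hat_final, d_true_final = run_observer()
print(f"estimate: {d_hat_final:.4f}, true disturbance: {d_true_final:.4f}")
```

A larger $q$ shortens the estimation transient after the step and reduces the lag on the oscillating component, at the cost of a stiffer observer.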

5 CONCLUSION

For the optimal frequency control problem, we studied a class of cyber-physical microgrids subject to unknown disturbances from wind turbines, FDI attacks and load changes. The composite control input consists of two parts: one is the adaptive optimal input for the NZS game of the microgrid system, and the other is the disturbance compensation input derived from the estimate provided by the disturbance observer. The composite robust control effectively eliminates the impact of unknown disturbances in the microgrid system while simultaneously minimizing the value function. Meanwhile, the system stability can be guaranteed. Simulation examples were used to demonstrate the effectiveness of the presented algorithm.

It is expected that future work could further enhance this framework by integrating more advanced FDI detection and mitigation strategies and addressing challenges such as communication delays and network uncertainties. Such advancements would improve the robustness and adaptability of frequency control in microgrid environments.

AUTHOR CONTRIBUTIONS

Zemeng Mi: Formal analysis; investigation; software; writing—original draft. Hanguang Su: Investigation; visualization. Qiuye Sun: Investigation; supervision. Yuliang Cai: Investigation; resources; visualization. Zhongyang Ming: Investigation; supervision.

ACKNOWLEDGEMENTS

This work was supported by the National Natural Science Foundation of China (Nos. 62373091, 62103087, 62203311 & U22A2055), the China Postdoctoral Science Foundation (Nos. 2024T170112 & 2021M690567), the National Key R&D Program of China under Grant 2018YFA0702200, the Fundamental Research Funds for the Central Universities (Nos. N2104016 & N2304009), the Natural Science Foundation of Liaoning Province (No. 2023-MSBA-082), the China Academy of Engineering Institute of Land Cooperation Consulting Project (2023-DFZD-60, 2023-DFZD-60-03) and the Key Laboratory of Integrated Energy Optimization and Secure Operation of Liaoning Province.

CONFLICT OF INTEREST STATEMENT

The authors declare no conflict of interest.

DATA AVAILABILITY STATEMENT

No data was used for the research described in the article.