Data-driven approach to iterative learning control via convex optimisation

: A new data-driven iterative learning control methodology is presented which uses the frequency response data of a system in order to avoid the problem of unmodelled dynamics associated with low-order parametric models. A convex optimisation problem is formulated to design the learning filters such that the convergence criterion is minimised. Since the frequency response data of the system is used in obtaining these filters, robustness is ensured by eliminating the uncertainty in the modelling process. The effectiveness of the method is illustrated by considering a case study where the proposed design scheme is applied to a power converter control system for a specific accelerator requirement at CERN.


Introduction
The increasing performance demands of today's modern systems have created challenging tasks for the control systems engineer. To simplify the controller design process, these systems are approximated with low-order models (which reduces the effort required to properly synthesise a controller). This approximation, however, can lead to stability and performance problems since these low-order models are subject to model uncertainty. Datadriven methods are used as a solution to this problem where controllers are synthesised simply by using time-domain or frequency-domain data. A survey on the differences associated with model-based control and data-driven control has been addressed in [1,2]. For linear time-invariant systems, the parametric uncertainties and the unmodelled dynamics associated with the data-driven scheme are irrelevant, and the only source of uncertainty is the measurement process. In addition to the problem of unmodeled dynamics is the issue of tracking in high performance systems. With general feedback control, a controller can be designed to achieve sufficient robustness to system uncertainties. However, feedback control suffers from a lag in transient tracking since the controller has to react to inputs and disturbances.
The classical model-reference adaptive control [3] may be considered as the first data-driven attempt to solve the model reference problem in an online manner. The Iterative-Feedback-Tuning (IFT) method [4] has also gained interest in the scientific community where the main goal is to obtain unbiased gradient estimates and optimise for time-domain performance. In this method, a non-convex optimisation problem is solved where multiple experiments are performed to obtain the controller parameters. In general, stability is not guaranteed with this method. Some works which devised robust stability conditions for the IFT method are asserted in [5,6], and recent applications of robust IFT controller design methods have been addressed in [7,8]. Virtual reference feedback tuning (VRFT) [9] is another offline one-shot method which minimises the (filtered) ℋ 2 norm of the difference between a desired reference model and the achieved closed-loop system. In this method, a controller is computed based on the measured plant input when fed by a 'virtual' error. Recent developments and extensions using the VRFT technique for singleinput-single-output (SISO) systems [10,11] and multiple-inputmultiple-output systems [12,13] have also been studied. More recently, a frequency-domain data-driven approach in minimising the ℋ ∞ norm of weighted sensitivity functions has been proposed in [14,15]. In this method, a convex approximation of the ℋ ∞ criterion is formulated where it is shown that the global solution to the ℋ ∞ problem is obtained as the controller order increases (while guaranteeing the stability of the closed-loop system). This method has been extended in [16] in order to obtain a local solution for the ℋ ∞ data-driven problem for fixed-structure low-order controllers. This method, however, requires the solution to a nonconvex problem. Thus the authors in [17] proposed an iterative method for obtaining a local solution for fixed-structure controllers (in a data-driven setting) by solving a set of consecutive convex problems; this method can consider both ℋ ∞ and ℋ 2 problems.
Although the data-driven methodology alleviates the problem of unmodelled dynamics, feedback controllers can still produce inadequate tracking performance for high bandwidth applications. Iterative learning control (ILC) (which was first proposed in [18]) seeks to address the tracking problem by implementing a learning algorithm where the desired performance is achieved by incorporating past error information into the control structure. The control signal (or reference signal in the case of closed-loop control) is modified to attain a desired specification; in this paper, systems that perform the same operations (i.e. repetitive tracking specifications) under the same operating conditions are considered. ILC designs have been constructed in a frequency-domain datadriven framework in several works. In [19], a two-step approach is proposed for using the frequency response of a system to satisfy the ILC stability and convergence criterion. In this method, the Nyquist plot of the plant is used where the learning filter and Qfilter are tuned such that the stability bound is satisfied. The authors in [20] proposed a manual loop-shaping method using an approximation of the plant model using frequency-domain data. Several works have realised ℋ ∞ methods for optimising the ILC convergence rate (given a desired performance specification). The authors in [21] implement a model-matching problem to find the ILC learning function for a given performance filter (i.e. the Qfilter). Here, the model-matching problem was expressed as a linear fractional transformation (where standard ℋ ∞ tools were used to solve the problem). The authors in [22] presented a robust ILC scheme for uncertain systems using the Youla parameterisation and the μ-synthesis approach. In [23], the bounded real lemma was employed to achieve ℋ ∞ performance in an ILC framework for iteration-varying loads. All of these methods based on the ℋ ∞ approach, however, require a plant model for proper controller synthesis. In [24], a frequency-domain approach was developed to maximise a weighted cost function which attempts to minimise the ILC convergence criterion; this method, however, solves a nonconvex problem and does not guarantee the stability of the learning filters; stability of the filters can only be verified a-posteriori. Another frequency-domain approach was used in [25] to minimise the ℋ ∞ norm of the ILC performance criterion (where no model is required and can ensure the performance for systems with multimodel uncertainties); this method, however, is limited to linearly parameterised learning functions and does not guarantee the stability of the learning filter. A more recent approach to datadriven ILC was taken in [26] where the authors implement a tensor decomposition method in the time-domain to obtain norm-optimal results with respect to the errors caused by external disturbances; Lautenschlager et al. [26], however, do not address the problem of minimising the ILC convergence rate while guaranteeing the stability of the ILC algorithm.
In this work, a new data-driven framework is presented for optimising the learning filters in the ILC algorithm. The error convergence rate depends on the learning filter (which should ideally invert the plant dynamics); thus a method is proposed to design the ILC learning filter such that it would approximate the plant inverse without having the plant model (i.e. directly by means of the frequency response of the system). By using the frequency response measurement data to design the learning filter, the model uncertainties are irrelevant (since all of the dynamics are captured in the measurement). This is a distinctive advantage compared to alternative robust ILC frameworks as measurement uncertainty can be made smaller than the ones typically related to unmodelled dynamics. A two-step method for attaining optimal closed-loop performance in a data-driven setting will be presented. In the first step, the frequency response of a plant is acquired (through an experiment) in order to capture all of the system dynamics and design a one-degree-of-freedom controller. In the second step, a novel approach is formulated to optimise the ILC convergence rate in the ℋ ∞ sense (while guaranteeing the stability of the ILC structure). A convex optimisation problem is proposed in order to obtain a (rational) stable learning function and simultaneously optimise the ILC convergence rate whilst eliminating the problem of unmodelled dynamics that unavoidably affect the modelling process; the advantage of this method is that optimal performance and stability can be obtained with much less conservatism with respect to model-based approaches. Some of the ideas presented in [14] are used in formulating the necessary and sufficient conditions for satisfying the convergence criterion. The methods presented in this work are implemented for a specific high performance power converter application at CERN. Currently, the load (i.e. a particle accelerator magnet) is often approximated as a simple first-order system (series RL circuit) even though it has been shown in [27] that both the real and imaginary parts of its impedance are frequency dependent (where its behaviour is also affected by the vacuum chamber of the magnet, which makes the series RL even less accurate). Therefore, it is appropriate to consider a data-driven based design for the power converter control system. The paper is organised as follows. The main results from previous works regarding data-driven controller design (with ℋ ∞ performance) and ILC formulation in a closed-loop structure are recalled in Section 2. The main methodological results and theoretical bases are addressed in Section 3 where some necessary and sufficient conditions are derived for satisfying the ILC convergence criteria. With these conditions, a convex optimisation problem is then formulated in order to optimise the learning functions (for a given Q-filter) in a data-driven context. A simulation example is studied in Section 4 which considers the design of a learning filter given a set of uncertain non-minimum phase systems. Section 5 is dedicated to a case study concerning a power converter control system for a specific accelerator requirement at CERN. Finally, the concluding remarks are given in Section 6.

Preliminaries
In order to avoid the risk of any confusion, the notation employed in this paper will now be defined. Let the set RH ∞ represent the family of all stable, proper, real-rational transfer functions with bounded infinity norms. Given a real-valued signal x(k) with a time index k, the symbol q will denote the forward time-shift operator such that qx(k) ≡ x(k + 1). The (one-sided) z-transformation of a signal x(k) is X(z) = ∑ k = 0 ∞ x(k)z −k , and can be obtained by substituting the shift operator q with z; the frequency response of a system in the z-domain is obtained by setting z = e jω for ω ∈ [ − π, π]. Finally, ℜ{ ⋅ } will represent the real part of the argument.

Design of controller K with ℋ ∞ performance
In this subsection, a method for designing a feedback controller in a data-driven setting is recalled. This controller is designed to achieve ℋ ∞ performance and will be used for stabilising the closed-loop system while simultaneously ensuring sufficient robustness to uncertainties with adequate disturbance rejection. In the next subsection, the ILC methodology is discussed for ensuring the proper tracking performance (using a closed-loop structure) [Note that to implement the proposed ILC strategy, the only requirement is that the closed-loop system is stable. A data-driven controller design methodology for obtaining a stabilising controller is used in order to emphasise the data-driven framework of the paper.]. Suppose that the plant of a SISO linear time-invariant system is represented as are called the coprime factors of G(z) over RH ∞ [28]. The frequency response function (FRF) of such a system is given by the following equation: G(e jω ) = N(e jω )M −1 (e jω ), ω ∈ Ω (1) where Ω ≜ [ − π, π]. Note that for stable systems, we assume M(e jω ) = 1 and so N(e jω ) is available by spectral analysis from a set of data. For unstable systems, the frequency response of the coprime factors can be identified from a closed-loop experiment with a stabilising controller [14]. In this case, N(e jω ) is the FRF between the reference signal and the measured output, while M(e jω ) is the FRF between the reference signal and the plant input. Given these definitions, it is evident that N(e jω )M −1 (e jω ) represents the frequency response of the plant model. In general, a set G can be formulated to represent a plant model containing p FRF models For simplicity, one model from the set G will be considered, and the subscript i will be omitted. However, in general, the design procedures outlined in this paper can be applied to the multi-model case. Fig. 1 shows the control structure that will be used in this paper. The ILC structure will be discussed in the next section; the remaining portions of this subsection will consider the design of the controller K such that closed-loop stability and ℋ ∞ performance is achieved (in a data-driven setting). The closed-loop controller can be structured as K(z) = K n (z)K d −1 (z), where K n (z), K d (z) ∈ RH ∞ . The functions K n (z) and K d (z) each represent linearly parameterised polynomials in the z-domain, i.e. K n (z, ρ) = k n 0 + k n 1 z −1 + ⋯ + k n n z −n n (3) where k n i and k d i are the controller parameters and {n n , d n } are the orders of the polynomials K n and K d , respectively. The vector of controller parameters ρ is defined as where ρ ∈ ℝ n with n = n n + d n + 1.
To attain ℋ ∞ performance for a particular sensitivity function, one can consider minimising where W v is a proper and stable weight with bounded infinity norm, and S v (ρ) is a sensitivity function of interest. For this paper, S 1 (ρ) will represent the sensitivity function from reference input r to the output y (i.e. S 1 (ρ) = GK(ρ)[1 + GK(ρ)] −1 ), while S 2 (ρ) will represent the sensitivity function from the output disturbance d to y Suppose that it is desired to minimise the nominal performance ) in order to attain sufficient robustness or adequate disturbance rejection. Minimising the nominal performance condition is equivalent to minimising Γ ∈ ℝ + with the constraint ∥ W 2 S 2 (ρ) ∥ ∞ < Γ (which is an epigraph representation of the optimisation criterion). In [14], it is shown that satisfying the nominal performance condition is equivalent to By minimising Γ with the above constraint, a quasi-convex optimisation problem can be formulated (where the optimum is obtained by implementing a bisection algorithm). The authors in [14] also show that as the controller order increases, the solution to the convex problem converges to the global optimal solution of the ℋ ∞ problem.

ILC formulation
The ILC structure that is considered in this paper is shown in Fig.  1, which is a modified version of the serial ILC architecture (where the reference input of the closed-loop system is modified instead of the plant control signal). For SISO linear time-invariant systems, the output of the closed-loop control system can be expressed as follows: where l is the ILC iteration sequence, y l (k) is the output, y d (k) is the desired output, and r l (k) is the reference signal that is modified by the ILC algorithm in order to acquire the desired output response. The transfer operators S 1 (q) and S 2 (q) are assumed to be stable. Let us denote the error of the ILC closed-loop system as e l (k) ≜ y d (k) − y l (k); the first-order ILC algorithm can then be expressed as where Q(q) and L(q) are known as the Q-filter and learning filter, respectively. By combining the z-transforms of (7) and (8), one can arrive to the following condition: Given this transformed result, a sufficient condition for stability can be obtained by invoking that Q(z)[1 − L(z)S 1 (z)] be a contraction mapping. The following theorem is recalled for ensuring the stability of the ILC algorithm [29,30].
then the system in (7) using the update equation in (8) is asymptotically stable. The result in Theorem 1 can be used to express the stability condition in the frequency domain, as follows: where γ = 1. S 1 (e jω ) can be easily obtained by spectral analysis of measured data. Suppose that S 1 (e jω ) is obtained using the Fourier transform of the reference input signal r(k) and output signal y(k); and k s is the number of data. Satisfying the stability condition also implies monotonic convergence of the error, which is recalled in the following lemma [21,29]: Lemma 1: If the ILC system and the update equation in (8) with iteration trials of infinite time duration satisfies for all l ∈ {1, 2, …}. As stated in [31], when Q and L are causal functions, then satisfying (13) also implies ∥ e ∞ − e l + 1 ∥ 2 < λ ∥ e ∞ − e l ∥ 2 for all l ∈ {1, 2, …} for the ILC system with a finite duration. Note that the convergence condition in (13) is equivalent to the stability condition in (11); thus satisfying the stability condition (11) ensures both stability and monotonic convergence (which is independent of the iteration duration).
In a model-based setting, the learning filter can be selected as L(z) = S 1 −1 (z) (assuming that S 1 (z) is minimum phase); this will ensure that the ILC algorithm converges after one iteration. According to [32], there exists a bounded non-causal signal that produces the desired output for non-minimum phase systems. Therefore, this implies that there exists a non-causal learning filter that can invert the dynamics of a non-minimum phase system. The non-causality of L(z) is not a problem since the ILC algorithm is run offline and because the ILC structure uses information from previous iteration signals, which are stored in memory. However, when S 1 (z) is non-minimum phase, this inversion is not possible as the learning filter then becomes unstable; in this case, there are methods in the literature that implement stable inversion approaches to build the learning function (see [33][34][35]). A stable inversion is necessary due to the fact that unstable filters can produce control actions that grow exponentially over time where, even over a finite duration, can become undesirably large (which can excite certain non-linearities and even damage system components [36]). In either case, a model is required to build a learning function that best approximates the inverse of the plant dynamics. Due to the unmodelled dynamics inherent in system models, a Q-filter can then be used to ensure robustness to uncertainties; the introduction of a filter Q(z) ≠ 1 (or a Q-filter with a sufficiently low bandwidth), however, will cause performance degradations for systems requiring very high performance.
convergence criterion for optimal ILC performance. For notation purposes, the dependency in ω will be neglected, and will only be reiterated when deemed necessary.
In a data-driven setting, the uncertainty from the modelling process is removed and replaced with measurement uncertainty, which, to some extent, can be made smaller and smaller by means of more accurate instruments (and smaller than the modelling uncertainty). The selection of a Q-filter can therefore be made without much conservatism. Given the results of the convergence criterion in Lemma 1, it is logical to consider minimising λ instead of simply satisfying the inequality (since a smaller λ would ensure a faster convergence rate). Minimising λ is equivalent to minimising γ such that ∥ Q[1 − LS 1 ] ∥ ∞ < γ (which is an epigraph representation of the optimisation problem). Thus an optimisation problem can be formulated to design a learning filter L (given a desired Q) such that ∥ Q[1 − LS 1 ] ∥ ∞ is minimised. By the condition in (11), it must also be ensured that Consider the filter L parameterised as where L n (α, z) is stable and possibly non-causal and L d (α, z) ∈ RH ∞ . The functions L n (α, z) and L d (α, z) are linearly are the vectors of the filter parameters and ϕ n (z), ϕ d (z) are vectors of stable orthogonal basis functions given as follows: with −1 < ξ < 1 and β d ≤ β n . Note that when β n = β d , L(α, z) is proper and causal. With this formulation, the constraint in (11) can then be expressed as follows: The condition in (16) can be graphically interpreted as a disk in the complex plane (centred at L d (α) with a radius of x r (γ, α)) that does not include the origin for any frequency point in Ω. Fig. 2 displays a graphical interpretation of this condition for a given ω. This geometrical construction will be used to prove the following lemma: is the frequency response of a bounded analytic function outside the unit circle. Then (11) is met if and only if there exists a stable proper rational transfer function F(z) that satisfies (17) Proof: The proof of this lemma follows from the proofs presented in [37]. From Fig. 2, it is clear that (11) is satisfied if and only if the disk of radius x r (γ, α) centred at L d (α) does not include the origin for all ω ∈ Ω (as stated in (16)). This is equivalent to the existence of a line passing through origin that does not intersect the disk. Therefore, at every given frequency, ω, there exists a complex number f (e jω ) that can rotate the disk such that it lays inside the right hand side of the imaginary axis. Hence, we have Since f (e jω ) = f (e jω ) e jθ f , then the above condition can be expressed as However, (19) is satisfied if and only if In [37], it is shown that f (e jω ) can be approximated arbitrarily well by the frequency response of a stable transfer function F(z) if and only if is analytic outside the unit circle for all θ ∈ [0 , 2π[ and for all γ + > γ. However, L d −1 (α) is stable because of the stability of L.
Additionally, by decreasing γ + from infinity to γ, the poles of Z move continuously with γ + . Therefore, Z is not analytic outside the unit circle if and only if Z −1 (e jω ) = 0 for a given frequency, which is not the case because the disk shown in Fig. 2 does not include the origin. □ The set of all learning filters that meet the ILC stability condition is asserted in the following theorem.
Theorem 2: Given the frequency response function of a stable system S 1 and a filter Q, then the following statements are equivalent: (a) There exists a stable filter L(α) where (23) Proof: (b ⇒ a) ℜ{L d (α)} > 0 signifies that the Nyquist plot of L d (α) will not encircle the origin ∀ω. However, note that By the positive real Lemma [38], L d −1 (α) is strictly positive real if and only if • No poles of L d −1 (α) lie in z > 1.
• The poles of L d −1 (α) on the unit circle are simple. Therefore, the positive real constraint implies that L d −1 (α) is Hurwitz (and thus L(α) is stable). On the other hand, note that L d (α) ≥ ℜ{L d (α)} for all ω ∈ Ω. From the condition in Statement (b), this implies that x r (γ, α) < L d (α) , ∀ω ∈ Ω, which leads to and consequently to (22) in Statement (a).
(a ⇒ b) Assume that L(α 0 ) = L n (α 0 )L d −1 (α 0 ) satisfies Statement (a) but not Statement (b). Then, according to Lemma 2, there exists a stable proper rational transfer function F(z) such that Therefore, there exists a higher order filter with L n (α) = L n (α 0 )F and L d (α) = L d (α 0 )F with such that Statement (b) holds. □ The necessary and sufficient condition in Theorem 2 can be used to ensure the stability and performance of the ILC algorithm (while guaranteeing the stability of the learning filter). The results of this theorem can also be used to ensure other performance constraints (such as controller output constraints in order to avoid saturation). To develop this constraint, the condition relating the converged plant input signal as a function of the desired reference signal is needed; from Fig. 1 and (8), it can be shown that where S 3 = KS 2 and u is the input to the plant G. Given an input weight function W 3 , a constraint to limit the controller output can be expressed as Corollary 1: Suppose that the stability and performance condition satisfies ∥ Q(1 − L(α)S 1 ) ∥ ∞ ≤ γ (with γ ∈ [0, 1)). Then the learning filter L(α) = L n (α)L d −1 (α) is stable and the input constraint in (27) is satisfied if where x u (γ, α) = (1 − γ) −1 W 3 QL n (α)S 3 . Proof: By the results of Theorem 2, ℜ{L d (α)} > 0, ∀ω ∈ Ω implies that L(α) is stable. Since L d (α) ≥ ℜ{L d (α)} for all ω ∈ Ω, then (28) leads to x u (γ, α) < L d (α) , ∀ω ∈ Ω, which is equivalent to Since Q(1 − L(α)S 1 ) < γ for all ω ∈ Ω, and noting that then it is evident that From (29), it can then be deduced from the above inequality that which leads directly to the condition in (27). □

Semi-definite programming
Based on the results of Theorem 2, an optimisation problem for obtaining the admissible values of α and γ can be formulated as follows: This problem is quasi-convex [Note that adding constraint (28) in the optimisation problem (32) does not change the quasi-convexity of the problem since (1 − γ) is non-negative.] and is known as a semi-infinite programming problem (since there are a finite number of optimisation variables and an infinite amount of constraints with respect to ω). The problem can be converted to a semi-definite programming problem by predefining a frequency grid Ω η = {ω 1 , ω 2 , …, ω η } a-priori and solving the problem with a finite amount of constraints (which can be solved efficiently with solvers that are readily available). The frequency points may be equally spaced, logarithmically spaced or chosen based on some information about the frequency response of S 1 (where more frequency points can be used around the resonance frequencies and closed-loop bandwidth). The optimal choice of the frequency points is an open problem. However, the complexity of the optimisation algorithm grows linearly with the number of frequency points, and so it can be chosen large enough. Another approach in selecting this grid is to use a randomised method where the frequency grid is selected such that the constraints are satisfied within a given probability level (see [39,40]). The solution to this quasi-convex problem can be obtained by implementing a bisection algorithm where convergence to the global optimum is obtained within a given tolerance level. The maximum value of the bisection variable γ max = 1 − ϵ (with ϵ being an arbitrarily small positive constant) will ensure that any feasible solution to the problem will produce a realisable learning filter. Note that the basis functions for the learning filter stated in (15) have only one parameter to be selected (ξ). For low-order filters, this choice may have a significant effect on the ILC performance. However, for high-order filters, the choice of the basis function is not important. For a given ILC sequence with a finite amount of samples, the lowest value of the optimal solution γ ⋆ for that problem is obtained when β n is selected to be equal to the number of samples in an ILC iteration (for any arbitrarily low value of β d ). However, very high order filters may cause computational problems (in both the implementation of the filter and in solving the optimisation problem). Thus the following steps can be performed for designing a low-order learning filter that will produce a feasible solution to the optimisation problem: • Experimentally obtain the FRF of S 1 by means of spectral analysis. • Select and fix an arbitrarily low value for β d and select any ξ ∈ ( − 1, 1).
• Choose a value for β n such that β n > β d and solve the problem in (32). One can start with β n = β d + 1.
• If the problem is feasible, stop. Otherwise, increase the value of β n and return to the previous step.
If a faster ILC convergence rate is desired (i.e. a desired lower value of γ ⋆ ), then β d and β n can always be increased until the desired performance is met.
Remark 1: In order to mitigate the phase shift of the Q-filter, the adjoint filter Q * is considered such that Q * Q = Q 2 is implemented. This filtering will cause the condition in (11) to change as follows: sup ω ∈ Ω Q(e jω ) 2 1 − L(e jω )S 1 (e jω ) < γ .
Consequently, the constraints in (28) and (32) will need to consider Q 2 instead of Q . The MATLAB filtfilt function can then be used to implement the Q-filter in the learning algorithm and eliminate the phase shift. Remark 2: The update equation in (8) can be extended to include information from previous iterations, i.e.
The reference signal for the next trial is calculated from the reference signals and the errors in the n o previous iterations. This represents an n o th order ILC algorithm. In this case, the proposed method can still be used to design learning filters for the n o previous iterations; the optimisation problem for this case can be formulated as follows:

Simulation example
Consider the following non-minimum phase open-loop system sampled at T s = 10 −3 s with multi-model uncertainty where δ is an uncertain parameter that belongs in the set δ ∈ {0.8δ n , δ n , 1.2δ n } (i.e. the nominal value δ n = 1.15 can vary by ±20%). Remark 3: It is imperative to note that these models are simply used to obtain the frequency response functions of the perturbed plants. The actual controller synthesis does not rely on these parametric models.
The proposed controller design method can be used to guarantee the performance and stability for all of the models in this uncertain set. The plant G i for i = 1, 2, 3 will be used to denote the plant model associated with the uncertain parameter δ i in the set δ. Consequently, the sensitivity function S v i for i = 1, 2, 3 will be used in the same manner. First, a controller K was designed by minimising ∥ W 2 S 2 i (ρ) ∥ ∞ in order to stabilise the closed-loop system; the convex approximation in (6)  According to [41], a scenario approach can be used to select the frequency grid in the optimisation problem where the number of (random) frequency points selected must satisfy the following condition: where the violation and confidence parameters are denoted as ϵ and σ, respectively. For a 4th-order controller with violation and confidence parameters set to 0.05, η ≥ 359; thus 400 points were randomly selected from a frequency grid in the interval [10 −2 , π/T s ] rad s −1 . Minimising Γ with the constraint in (6) with this selected grid (while considering all of the models G i ) produces an optimal solution Γ ⋆ = 2.438 (where a bisection algorithm was used to solve the convex problem with Γ max = 10, Γ min = 0, and a tolerance of 10 −4 ).
With the closed-loop system now stabilised for all models in the uncertain set, the proposed ILC scheme was then used to attain the desired tracking performance. A smoothed delayed step response was selected as the desired output signal, i.e.
where f d = 50. It was desired to compare the proposed method with the optimisation method developed in [25]. In [25], the Qfilter and learning filter are each linearly parameterised with noncausal functions; to convexify the performance condition (11), the authors defined a new linearly parameterised function L = QL. After solving the optimisation problem, the learning function L was then obtained as L /Q. This substitution, however, does not guarantee the stability of the learning filter L and can cause implementation problems.
Remark 4: Note that in [25], it was not clear how to shape the Q-filter with the performance filter W p . One could include an additional constraint such as ∥ Q − Q d ∥ < γ (which is convex, where Q d is the desired FRF of the Q-filter) to shape Q; however, a very large order Q-filter was needed to ensure that Q matches well with Q d (particularly with the uncertain multi-model problem considered in this example). Thus to have a fair comparison with the method presented in this paper, the Q-filter was fixed in the proposed method and in [25] while optimising for the linearly parameterised learning filter L.
The first design case will consider minimising the performance condition given a low-pass Q-filter with a bandwidth of 300 Hz (i.e. selecting Q(z) = (1 − a)(z − a) −1 with a = e −T s /τ and τ = (600π) −1 s). For a fair comparison, the orders of the learning filters with β n = 4 were selected to be equal (with the proposed method using β d = 1 and ξ = 0). The proposed optimisation algorithm (32) and the method in [25] were used to ensure the stability and performance for all the models S 1 i (using the same frequency grid selected for the feedback controller design). Fig. 3 displays the 2-norm of the error for each uncertain model as a function of the iteration sequence (with the associated optimal value of the cost function γ). It can be observed that the proposed method achieves both a faster convergence rate and a lower error. The reason for this is due to the fact that the proposed approach can simultaneously optimise for the numerator and denominator parameters of the learning filter (which leads to a better approximation of the plant inverse), whereas the method in [25]  can only consider linearly parameterised filters (and thus limits the achieved performance for a given filter order). Fig. 4 shows the FRF's of the performance criterion Q(1 − LS 1 i ) for i = 1, 2, 3 (which confirms the faster convergence rate of the proposed approach).

Plant input constraint
The second design case will consider adding a plant input constraint to the multi-model performance problem. The same lowpass Q-filter used from the previous design (with a bandwidth of 300 Hz) was also used for this design. The input constraint filter W 3 was selected to constrain the input signal at high frequencies ( f > 70 Hz). The magnitude plots of these filters are shown in Fig.  5.
The proposed optimisation problem in (32) was solved with the additional input constraint in (28); again, for a fair comparison, the learning filter with β n = 4 was selected to be equal using the proposed approach and the approach in [25] (with the proposed method using β d = 1 and ξ = 0). The resulting normed errors for each model S 1 i are shown in Fig. 6. Again, compared with the method in [25], the proposed method achieves both a faster convergence rate and a lower error. However, due to the added input constraint, the convergence rate is lower and the error is slightly larger than the case where the input constraint was neglected.

Case study
The framework of the system discussed in this case study is part of the CERN Large Hadron Collider (LHC) Injector Upgrade Project [42], which is implemented to mitigate space-charge effects and to increase the beam brightness in order to fulfil the needs of the High Luminosity LHC [43]. The Q-STRIPs are circuits powering additional windings of the focusing/defocusing quadrupole magnets of the Proton Synchrotron Booster accelerator. Beam dynamics is a very vast and complicated subject as the motion of relativistic and non-relativistic particles in electromagnetic fields is complex and non-linear. The function of magnets in a particle accelerator can be thought as being equivalent to optical functions: a dipole magnet will bend the beam, whereas a quadrupole magnet will focus the beam (in one plane, while defocusing it in the transverse plane). Higher order optical corrections are implemented by means of higher order n-pole magnets: sextupoles, octupoles etc. [44]. The higher the energy of the beam, the greater the integrated magnetic field (or its spatial first, second and higher order gradients) needs to be. The Q-STRIP circuits are used to adjust each ring of the PS booster in terms of gradient strength. The quality of the optics is determined by the quality of the magnets and by the precision of the current supplying them. In fact, the magnetic field is controlled by applying controlled currents in the magnets (which are measured with high precision devices); the power converter control system is used for this purpose. Its performance is therefore of the utmost importance for the trajectory control of particle beams in any accelerator.
Power converters can be seen as systems comprised of three main subsystems: (i) a power source (usually a voltage source) (ii) a measuring system and (iii) a controller unit. The current is usually measured with a particularly accurate current transducer called a direct current-current transformer (DCCT) [45]. The current measurement signal is fed back to a digital controller unit that implements the digital control algorithm and that usually includes a high-precision analogue-to-digital converter (ADC) [46].

CANCUN for Q-STRIP
The general configuration of the CERN power converter control system is depicted in Fig. 7. The control loop consists of a magnet (i.e. the load), a voltage source V s , low-pass anti-aliasing analogue and digital filters (ALPF, DLPF), a digital-to-analogue converter (DAC), and an ADC. The DAC (which is optional as the voltage source could have a direct digital input) and ADCs are integrated in the control unit labelled as the function generator controller (FGC, [47]) whose main function is to execute the control algorithm; it also implements all the diagnostics and communication functions with higher layers of the control architecture up to the accelerators control rooms. The DLPF may also include a decimator to reduce the sampling rate of the signal. The COMM block represents the delay associated with the communications link.
The controller implemented within the FGC is a two-degree-offreedom controller known as the RST controller (see [15,48]). In the RST structure, the polynomials R,S and T are each formulated as FIR filters (where under certain conditions, the tracking and regulation requirements can be set independently from each other). To transform the structure into the one-degree-of-freedom representation shown in Fig. 7, the polynomials R and T are set to be equal to each other; in this manner, K n in (3) would be equivalent to R and K d in (4) would be equivalent to S.
The experimental test setup consists of a CERN AC-DC Narrow Converter (CANCUN), a dummy load and a proprietary software diagnostics tool: The CANCUN power converter, depicted in Fig. 8, is comprised of three main parts: • a voltage source based on a full-bridge phase shifted topology followed by a four quadrant switching stage to allow four quadrant operation; • two high precision current sensors (DCCTs, two are used for redundancy purposes) which are able to measure DC or pulsed current at the required precision; • a digital controller (FGC3) which implements the digital control loop together with CERN designed control and diagnostics electronics.
The ratings of the CANCUN for the Q-STRIP application is ±100 A and ±30 V. The dummy load is (ideally) an RL-series load whose characteristics match those of the Q-STRIP windings. The software diagnostics tool interfaces with the main digital controller module, the FGC3 [47], and is able to acquire the relevant signals up to a sampling rate of 10 × 10 3 samples per second. The acquired signals are: • the reference current and voltage; • the measured current and voltage; • the current error: difference between measured current signal and the (optionally delayed) reference current signal.

Controller design
For this application, there are two requirements to be satisfied: (i) obtain a modulus margin m d ∈ ℝ + of at least 0.5 to guarantee robustness; (ii) guarantee that the current error, in parts-per-million (ppm), always remains within the bounds defined in Fig. 9 for the specific reference current shown. The ppm units always refer to the rating of the current for the power converter, which is 100 A in this case. Requirement (i) is achieved by means of the data-driven ℋ ∞ method discussed in Section 2 (i.e. design a controller to stabilise the closed-loop system), while requirement (ii) is accomplished by implementing the proposed ILC methodology in the closed-loop form.
[All optimisation problems in this case study were solved using a computer having the following hardware specifications: Intel-Core i7, 3.4 GHz CPU, 8 GB RAM. The optimisation algorithms were run using MATLAB version (R2017a) on a Windows 7 platform (64-bit) in conjunction with the YALMIP interface [49].]

Synthesis for closed-loop stability:
An open-loop experiment was performed to obtain the FRF of the open-loop system in order to design a stabilising controller K. For this purpose, a pseudo-random-binary-sequence (PRBS) signal was used as the input voltage reference in order to capture its dynamics. The PRBS clock period was set to T cl = 100 μs. A 14-bit PRBS signal (which corresponds to a period with a length of 16,383 samples) was used; with a signal of length 16 resolution is limited to 8192 points. Fig. 10 (top, in blue) shows a portion of the input and output signals acquired from the open-loop identification experiment. The FRF of the process was then obtained as G(e jω ) = ℱ{i(t)}/ℱ{v(t)} (where ℱ{ ⋅ } denotes the Fourier transform of the argument); Fig. 11 shows the gain and phase Bode plots of G(e jω ).
The closed-loop regulation sampling time was set to T s = 500 μs (i.e. 5 times longer than the clock period used for the PRBS); for this reason, the frequency points above 1 kHz in G(e jω ) were not needed for synthesis. The following optimisation problem was considered for attaining the desired specifications: with m d = 0.5. The last constraint in this problem ensures that the controller K itself is also stable (which is a requirement set within the proprietary software of the CERN controller [15,45,50] The optimisation problem as stated in (37) is not convex and the solution does not guarantee the closed-loop stability. Thus the convex method described in Section 2 was used to convexify the problem in (37) whilst guaranteeing the closed-loop stability, which was then solved in the SDP form over a randomly chosen frequency grid. A 5th-order controller containing integral action was designed. For a 5th-order controller (i.e. n n = d n = 5) with violation and confidence parameters set to 0.05, η ≥ 415 (from (36)); thus 500 points were randomly selected from the grid produced for the PRBS experiment. The bisection algorithm was used to solve the convex problem with Γ max = 2.5, Γ min = 0, and a tolerance of 10 −4 ; with these parameters, the optimisation time was 78.6 s and the optimal solution obtained was Γ ⋆ = 1.138. Another PRBS experiment was performed (with a different 14-bit PRBS sequence depicted, in red, in Fig. 10) to measure the closed-loop FRF whose Bode diagrams are shown in Fig. 11 (red curves). It can be observed that the desired closed-loop bandwidth has been achieved (while also guaranteeing the desired modulus margin for robustness).

Synthesis for tracking using ILC:
With the desired robustness specifications satisfied for the closed-loop system, the proposed ILC methodology was then used for attaining the desired tracking performance. A Q-filter Q = 1 was selected in order to obtain the best possible tracking performance. A non-causl learning filter L was selected with β d = 2 and β n = 8 (where ξ was set to 0 for simplicity). From the available frequency points obtained from the PRBS (closed-loop) identification experiment, 500 points were again randomly selected from this grid using the methods described in [41] (with violation and confidence parameters both set to 0.05, as before). The optimisation problem in (32) was then solved in the SDP form; the bisection algorithm was used to solve the convex problem with γ max = 1, γ min = 0, and a tolerance of 10 −4 ; with these parameters, the optimisation time was 49.6 s and the optimal solution obtained was γ ⋆ = 0.652. Fig. 12 shows the dynamics of the learning filter when the problem in (32) was solved for various orders of β n (for a fixed β d ).
It can be observed that as the order of the learning filter increases, the dynamics of the filter converges to the dynamics of the inverse of the closed-loop process; thus error convergence can be obtained with less ILC iterations by increasing the order of the learning filter.

Experimental results with ILC:
With the learning filter L obtained from the optimisation problem, the update equation in (8) was then used (in an offline manner) to generate the closed-loop reference signal that would minimise the error (given the desired reference signal shown in Fig. 9). The process was iterated eight times; Fig. 13 shows the resulting output and error (in ppm) after the 8th iteration. It can be observed that the error is well within the desired bounds and that the main source of error seems to arise from the system noise. In [48], the same case study was considered, but presenting a different data-driven control  methodology. In [48], a two-degree-of-freedom controller running with a 10 kS/s sampling rate was proposed and proven to also achieve the required precision. However, the magnitude of the peak error was approximately 500 ppm whereas the peak error shown in Fig. 9 is much smaller. Furthermore, the error in [48] was correlated to the dynamics of the reference signal (being larger during its fastest transitions) whereas with the proposed ILC controller, the error dynamics seems completely uncorrelated to the dynamics of the reference where the error is dominated by the overall system noise. It must be also noted that the tracking error has been significantly reduced (while still maintaining sufficient robustness margins for closed-loop stability) by using a controller sampling rate five times smaller than what was reported in [48], which is an important advantage in terms of computational resources required.

Conclusion
A new data-driven method for computing ILC learning filters has been presented. A frequency-domain approach has been used in order to avoid the problem of unmodelled dynamics associated with parametric models. In the proposed approach, a convex optimisation problem has been formulated in order to minimise the ℋ ∞ norm the of ILC stability and convergence criterion. The solution to this problem ensures the stability of the learning filter whilst providing the fastest convergence rate (for a given order of the filter). The proposed method has been applied to a case study at CERN, where it was shown that the data-driven ILC methodology eliminates the need of a system model and can achieve the desired tracking specifications with much less conservatism. Furthermore, it has also been shown that better performance (with respect to the tracking error) can be achieved with a substantially slower controller, which is a non-secondary asset of the proposed methodology. For future research, it will be desired to characterise the uncertainty of the FRF measurement and optimise the ILC convergence rate while explicitly taking it into account.

Acknowledgments
The authors thank Louis De Mallac from the Technology Department at CERN for his contribution to this project. Louis provided technical support for this project and was the individual responsible for programming the DSP within the power converter control system.