Dynamic online joint energy management and sampling rate control in energy harvesting aided IoT network

Energy harvesting (EH) aided Internet of Things (IoT) networks are a promising paradigm to liberate IoT networks from energy deficiency. Dynamic energy and traffic scheduling in such a scenario is challenging due to the temporal correlation of energy constraints and the delay requirements of IoT applications. In this paper, joint energy management and sampling rate control are studied to explore the tradeoff between network utility and delay performance while maintaining the energy causality constraint. Taking into account the dynamic characteristics of the EH process, channel fading and traffic arrivals, a stochastic optimisation problem is formulated to maximise the network utility. Leveraging the Lyapunov optimisation approach, combined with the idea of weight perturbation, a framework is proposed to decompose the stochastic problem into several deterministic sub-problems that can be solved separately. Based on the framework, an online resource allocation algorithm is developed to achieve two major goals: first, balancing energy consumption and energy harvesting to stabilise the data and energy queues; second, deriving the utility-delay tradeoff by adjusting the control parameter. The stability of the data buffer and energy buffer in the proposed network is theoretically verified through performance analysis.


INTRODUCTION
The Internet of Things (IoT), as an emerging paradigm, provides ubiquitous intelligence and pervasive interconnections to a vast number of sensing devices. The resulting IoT applications have been widely deployed in our daily life [1][2][3]. To support these applications, green energy supply and usage play an essential role in improving overall network performance. Energy harvesting (EH) is considered an efficient way to achieve long-term and self-sustainable operation for networks [4][5][6][7].
Integrating EH technology into IoT networks gives rise to a merged architecture, the energy harvesting aided IoT network (EHIN), which can enable and prolong the lifetime of IoT networks. Although more energy-efficient, EHINs still face several challenges. First, the time-varying and unreliable EH process makes energy management challenging. In this paper, we focus on the stochastic model based long-term performance of IoT networks. Hence, the difficulty is imposed by the resultant temporal correlation of energy allocation, that is, depleting a sensor's battery at a rate faster than the replenishment rate leads to sensor failure. To avoid this, "energy causality" constraints must be imposed, and problems involving such constraints are usually formulated as dynamic programming (DP) problems [8]. However, DP typically suffers from the curse of dimensionality and requires substantial statistical knowledge of the EH process. Second, improving the data collection efficiency of data-gathering applications is a crucial issue, which can be reflected by the predefined network utility in IoT networks. For instance, the more sampled data is injected into the network, the higher the network utility is. However, blindly maximising the sampling rate may saturate the data queue backlog of IoT devices and thus lead to high latency, which is unacceptable for applications with a strict bound on delay [9]. Hence, it is crucial to improve the network utility while maintaining the delay for latency-constrained IoT applications.
This is an open access article under the terms of the Creative Commons Attribution License, which permits use, distribution and reproduction in any medium, provided the original work is properly cited. © 2021 The Authors. IET Communications published by John Wiley & Sons Ltd on behalf of The Institution of Engineering and Technology
To address the above-mentioned challenges, we study a utility-delay tradeoff problem in an EHIN, while considering multiple stochastic processes, including energy supply, traffic arrival and channel state. Specifically, we first propose a joint energy management and sampling rate control framework to facilitate the design of an online algorithm. The framework is developed with the aid of the Lyapunov optimisation approach to achieve a close-to-optimal network utility while guaranteeing the stability of queue backlogs. Then, based on the framework, we propose an online algorithm, named the joint energy management and sampling rate control (EMSC) algorithm, which makes decisions at each slot without any a priori knowledge of the stochastic processes. The major contributions of this paper are summarised as follows.
• To maintain the "energy causality" constraint, we transform it using the Lyapunov optimisation approach combined with the idea of weight perturbation. We construct a modified Lyapunov function embedded with a weight, and carefully perturb the weight used for decision making, so as to "push" the target energy queue levels towards a certain value and ensure that the energy queues always have enough energy for transmission.
• To strike a balance between network utility and delay, we incorporate sampling rate control into our resource allocation framework. The effectiveness of sampling rate control is reflected by a control parameter: by adjusting this parameter alone, the proposed scheme can derive the utility-delay tradeoff on demand.
• The proposed algorithm is shown to achieve an $[O(1/V), O(V)]$ tradeoff between utility optimality and delay performance for any $V > 0$. We also compute the required battery capacity. Furthermore, simulation results show the superior performance of the proposed algorithm in guaranteeing delay.
The rest of this paper is organised as follows. Section 2 overviews the related works in the existing literature. Section 3 states the network model and the problem formulation. The resource allocation framework and the EMSC algorithm are devised in Section 4. Section 5 analyses the optimality and stability of the proposed algorithm. Simulation results are presented in Section 6, and Section 7 concludes this paper.

RELATED WORK
Extensive works have investigated resource allocation in EHINs, such as sensor networks and machine-type communication (MTC) networks, based on a deterministic optimisation framework [10][11][12][13]. Reference [10] developed a utility-optimal algorithm for a multihop EH-aided cognitive radio sensor network, which jointly controls the sampling rate and channel access. In [11], sensors schedule their actions among spectrum sensing, data transmission and EH. By jointly optimising the duration of each action, the energy efficiency was maximised under the energy causality constraint. Reference [12] studied energy-efficient resource allocation for an MTC-enabled cellular network with EH via joint power control and time allocation. In particular, the authors considered a nonlinear EH model which leads to a nonsmooth objective function and nonsmooth constraints. In [13], the authors aimed to optimise data gathering in a rechargeable sensor network under the assumption that the EH rate of each sensor can be estimated with high accuracy. Due to the unpredictability and dynamics of the EH process, the above deterministic optimisation frameworks may not capture the characteristics of the EH process. Thus, resource allocation based on stochastic optimisation frameworks has drawn great attention in EHINs. With only the statistics and causal knowledge of the EH state, the EH process is modelled by a Markov chain to capture the dynamics over time in [14,15]. Reference [14] looked at the problem of maximising the time-average data transmission rate for cognitive radio nodes which are powered by a wireless beacon with EH capability. The authors in [15] developed a balanced policy to adapt the transmission probability to the EH state, so that energy harvesting and consumption can be balanced. References [16,17] explored the statistics of the EH process by a machine learning approach.
Reference [16] adapted the transmission scheme to the unknown EH process via a learning approach and proposed a transmitted data maximisation algorithm. Without any distribution or future information of the EH process, reference [17] proposed a suboptimal power control scheme in the online setting. Using tools from the Lyapunov technique, the policy developed in [18] adapts its power actions to the instantaneous EH state in each slot, while maintaining perpetual operation of the sensor network. As a common feature, the works in [14][15][16][17][18] ignore the stochastic dynamics of the traffic arrivals under the simplified assumption that the traffic arrival rate is inside the network capacity region. Thus, the above resource allocation schemes neglect the delay of critical data, which is an important metric for the quality of service of IoT applications.
In summary, the above-mentioned works either consider a deterministic optimisation framework, in which the distribution information of all the stochastic processes is known in advance, or fail to meet the delay requirements of delay-sensitive IoT applications. Therefore, they cannot be applied to our stochastic optimisation problem and can hardly depict the utility-delay tradeoff, which is exactly the objective of our work.

SYSTEM MODEL
IoT applications rely on a massive number of IoT devices (IoTDs) that generate small data packets. These packets are mostly transmitted in the uplink direction, toward a control center for data storage and processing [19]. Due to this uplink-centered nature, we consider the uplink scenario in an EHIN consisting of an access point (AP) and $N$ IoTDs indexed by $\mathcal{N} = \{1, 2, \dots, N\}$. The concerned EHIN operates in discrete time with normalised time slots $t \in \mathcal{T} = \{1, 2, \dots, T\}$. The spectrum bandwidth is equally divided into $K$ orthogonal subchannels, denoted as $\mathcal{K} = \{1, \dots, k, \dots, K\}$, and each IoTD transmits data over its allocated orthogonal subchannel.

Subchannel assignment
To schedule the transmission, the AP needs to allocate subchannels to IoTDs. We consider the scenario in which the number of IoTDs is no less than that of subchannels. $\alpha_{n,k}(t)$ is defined as the subchannel assignment indicator of IoTD $n$ on the $k$th subchannel at time slot $t$, that is, $\alpha_{n,k}(t)$ is equal to 1 if subchannel $k$ is allocated to IoTD $n$ and 0 otherwise. To avoid interference among IoTDs, each subchannel can be allocated to at most one IoTD at each slot, namely
$$\sum_{n \in \mathcal{N}} \alpha_{n,k}(t) \le 1, \quad \forall k \in \mathcal{K}. \tag{1}$$
In addition, since each IoTD can be served by at most one subchannel at each slot, we have
$$\sum_{k \in \mathcal{K}} \alpha_{n,k}(t) \le 1, \quad \forall n \in \mathcal{N}. \tag{2}$$
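The two assignment constraints can be checked mechanically for a single slot. The following is a minimal sketch, assuming the indicators are stored as an N x K list of lists; the function name `is_valid_assignment` and this data layout are illustrative assumptions, not part of the paper.

```python
from typing import List


def is_valid_assignment(alpha: List[List[int]]) -> bool:
    """Check the per-slot subchannel assignment constraints.

    alpha[n][k] is 1 if subchannel k is allocated to IoTD n and 0 otherwise
    (hypothetical N x K list-of-lists layout).
    """
    if not alpha:
        return True
    num_subchannels = len(alpha[0])
    # Every row must have the same width and contain only binary entries.
    for row in alpha:
        if len(row) != num_subchannels or any(x not in (0, 1) for x in row):
            return False
    # Each IoTD is served by at most one subchannel (row sums <= 1).
    if any(sum(row) > 1 for row in alpha):
        return False
    # Each subchannel serves at most one IoTD (column sums <= 1).
    for k in range(num_subchannels):
        if sum(row[k] for row in alpha) > 1:
            return False
    return True
```

For instance, with three IoTDs and two subchannels, `[[1, 0], [0, 1], [0, 0]]` passes, whereas giving subchannel 0 to two IoTDs does not.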

Admission control and data queue dynamics
In the considered EHIN, a separate data queue is maintained for each IoTD to temporarily backlog the arrived data. Let $Q(t) = (Q_1(t), Q_2(t), \dots, Q_N(t))$ represent the data queue backlogs of all IoTDs at time slot $t$, and let $a_n(t)$ denote the amount of random data arrivals destined for IoTD $n$ at slot $t$. Assume that $a_n(t), \forall n \in \mathcal{N}$, are independent and identically distributed (i.i.d.) over time slots according to a general distribution and are independent with respect to $n$. Furthermore, there exists a certain peak amount of traffic arrivals $a_n^{\max}$ satisfying $a_n(t) \le a_n^{\max}$. In practice, the statistics of $a_n(t)$ are usually unknown to IoTD $n$, and the achievable capacity region is difficult to estimate, that is, the arrival rate may fall outside the network capacity region. In this situation, data queues cannot be stabilised without an admission control mechanism to limit the amount of data that is admitted.
As illustrated in Figure 1, each IoTD first decides, in every slot, whether the arrived data is admitted or not. If not, it directly drops the data. Let $r_n(t)$ and $b_n(t)$, respectively, represent the amount of admitted data and dropped data at IoTD $n$ at slot $t$. Clearly, $r_n(t)$ cannot exceed the amount of traffic arrivals, that is,
$$r_n(t) \le a_n(t), \quad \forall n \in \mathcal{N}. \tag{3}$$

FIGURE 1 Considered EHIN
Denote $\mu_n(t)$ as the data transmission rate of IoTD $n$ at slot $t$. The amount of data that IoTD $n$ can transmit over channel $k$ is determined by two factors. One is the data queue occupancy, that is, IoTD $n$ can only transmit the data available in its queue, which can be expressed as
$$\mu_n(t) \le Q_n(t), \quad \forall n \in \mathcal{N}. \tag{4}$$
The other is the channel capacity $c_{n,k}(t)$, that is, if channel $k$ is allocated to IoTD $n$, the data transmission rate is bounded by
$$\mu_n(t) \le \sum_{k \in \mathcal{K}} \alpha_{n,k}(t)\, c_{n,k}(t), \quad \forall n \in \mathcal{N}. \tag{5}$$
Besides, to depict the time-varying characteristics of channel fading, we assume that $c_{n,k}(t)$ varies randomly over time slots in an i.i.d. fashion and is bounded by $c_{n,k}(t) \le c_{\max}$. Based on the aforementioned analysis, the data queue length varies over time slots with the admitted data $r_n(t)$ as input and the service process $\mu_n(t)$ as output, as follows:
$$Q_n(t+1) = Q_n(t) - \mu_n(t) + r_n(t). \tag{6}$$
To model the impact of joint sampling rate control and energy management on the average delay and the achieved network utility, the definition of network stability is given: the network is stable if
$$\limsup_{T \to \infty} \frac{1}{T} \sum_{t=1}^{T} \sum_{n \in \mathcal{N}} \mathbb{E}\{Q_n(t)\} < \infty. \tag{7}$$
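The one-slot data queue update described above can be sketched as follows. This is an illustrative sketch under stated assumptions: the helper name `serve_and_admit` is hypothetical, and the sketch simply transmits at the largest rate allowed by both the queue occupancy and the capacity of the allocated subchannel, which is one feasible choice rather than the paper's scheduling rule.

```python
from typing import Tuple


def serve_and_admit(queue: float, admitted: float, capacity: float,
                    assigned: bool) -> Tuple[float, float]:
    """One-slot data queue update for a single IoTD.

    Returns (transmitted, next_queue). All names are illustrative.
    """
    if queue < 0 or admitted < 0 or capacity < 0:
        raise ValueError("queue, admitted and capacity must be non-negative")
    # Without an allocated subchannel, nothing can be transmitted.
    rate_limit = capacity if assigned else 0.0
    # The rate is limited by the data available in the queue
    # and by the capacity of the allocated subchannel.
    transmitted = min(queue, rate_limit)
    # Served data leaves the queue, admitted data enters it.
    next_queue = queue - transmitted + admitted
    return transmitted, next_queue
```

For example, a backlog of 5 units with capacity 3 and 2 admitted units leaves a backlog of 4 units for the next slot.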

Energy supply and energy queue dynamics
The EH process is characterised by the energy supply rate $\eta_n(t)$, which denotes the amount of energy that can be harvested by IoTD $n$ at time slot $t$. We model $\eta_n(t)$ as i.i.d. over slots and upper bounded by $\eta_{\max}$. Further, to capture the time-varying and unreliable characteristics of the EH process, we assume that IoTDs have no a priori knowledge of the energy supply rate, which is the case in practice when the statistical knowledge of the EH process is not available.
We denote by $e_n(t)$ the amount of energy harvested by IoTD $n$ at time slot $t$, which is bounded by
$$0 \le e_n(t) \le \eta_n(t), \quad \forall n \in \mathcal{N}.$$
The energy consumption of data sensing is assumed to be a linear function of the sampling rate $r_n(t)$ [20] and is denoted by $P^{S} r_n(t)$. We assume that all IoTDs use the same transmission power $P^{T}$ over each allocated subchannel for data transmission. Thus, the overall energy consumed by IoTD $n$ is
$$P_n^{\mathrm{total}}(t) = P^{S} r_n(t) + P^{T} \sum_{k \in \mathcal{K}} \alpha_{n,k}(t).$$
Besides, we use $P_{\max}$ to denote the upper bound of $P_n^{\mathrm{total}}(t)$, which is expressed as $P_{\max} = P^{S} r_{\max} + P^{T}$.
We define $E(t) = (E_1(t), E_2(t), \dots, E_N(t))$ as the energy queue sizes of the IoTDs. From the discussion above, the dynamics of $E_n(t)$ across time slots can be expressed as
$$E_n(t+1) = E_n(t) - P_n^{\mathrm{total}}(t) + e_n(t).$$
Note that in a given time slot $t$, the "energy causality" constraint must be met, that is, the energy consumed is no more than the energy stored:
$$P_n^{\mathrm{total}}(t) \le E_n(t), \quad \forall n \in \mathcal{N}.$$
Furthermore, the energy charging is bounded by the battery capacity, which is denoted by $\Omega_n$. We assume the battery capacity is the same for all IoTDs, so the subscript $n$ is omitted here for simplicity. Thus, we have
$$E_n(t) + e_n(t) \le \Omega, \quad \forall n \in \mathcal{N}.$$
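To make the energy queue update and the "energy causality" check concrete, here is a minimal sketch for one IoTD and one slot. It assumes that consumption is the linear sensing cost plus the transmit power per allocated subchannel, and that harvested energy is clipped so the battery never exceeds its capacity; the function and parameter names are hypothetical.

```python
def energy_step(stored: float, harvested: float, sensing_cost: float,
                sampled: float, tx_power: float, num_assigned: int,
                capacity: float) -> float:
    """One-slot energy queue update for a single IoTD (illustrative names).

    Raises ValueError when the planned consumption would violate
    energy causality, i.e. exceed the energy currently stored.
    """
    # Sensing cost is linear in the sampling rate; each allocated
    # subchannel costs one unit of transmit power.
    consumed = sensing_cost * sampled + tx_power * num_assigned
    if consumed > stored:
        raise ValueError("energy causality violated")
    # Charging is limited by the remaining room in the battery
    # (assumption of this sketch: excess harvested energy is discarded).
    charged = min(harvested, capacity - stored)
    return stored - consumed + charged
```

With 5 units stored, a capacity of 6, a consumption of 2 and 2 units available to harvest, only 1 unit can be charged, so the next level is 4.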

Problem formulation
Based on the aforementioned models, we formulate a stochastic optimisation problem to maximise the time average aggregate network utility subject to the constraints mentioned above.
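We define the utility function $U(r_n(t))$ as a function of the sampling rate $r_n(t)$ and assume it to be strictly concave [21]. Then, the time average aggregate network utility can be written as
$$\bar{U} = \lim_{T \to \infty} \frac{1}{T} \sum_{t=1}^{T} \sum_{n \in \mathcal{N}} \mathbb{E}\left[ U(r_n(t)) \right].$$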
To simplify the presentation, we collect the variables to be optimised in slot $t$, namely $(r(t), \mu(t), e(t), \alpha(t))$, into a single decision set. The aggregate network utility can be maximised by optimising this set in a stochastic optimisation problem that maximises the time average utility subject to the constraints above. From this formulation we can see that, on the one hand, prior knowledge of the statistical information of the multiple stochastic processes is hardly available, which prevents the EHIN from obtaining the maximum network utility under an offline framework; on the other hand, the "energy causality" constraint greatly complicates the design of an efficient scheduling algorithm, because the current power allocation strategy may cause an energy outage in the future and thus affect future decisions. Taking the above challenges into consideration, we design our EMSC algorithm in the following section.

Design principle of EMSC
The above two challenges inspire us to deal with the stochastic optimisation via the Lyapunov optimisation framework. On the one hand, Lyapunov optimisation does not require prior knowledge of the statistical information of the system randomness. On the other hand, the "energy causality" constraint can be removed by leveraging the Lyapunov optimisation approach combined with weight perturbation. Before presenting the algorithm, we give the following design flow according to the Lyapunov optimisation framework. We first construct the modified drift function to relax the "energy causality" constraint and the queue stability constraints; then the optimisation objective is mapped into a penalty function. By minimising the "drift plus penalty" term, Lyapunov optimisation can minimise the queue length while obtaining a close-to-optimal utility. At this point, the original stochastic optimisation problem can be decomposed into three separate deterministic subproblems, which can be solved independently of one another. Detailed design steps are given in the following subsections.

Problem transformation
To relax the "energy causality" constraint, we employ the weighted perturbation based Lyapunov optimisation technique. We first define the weighted perturbation parameter as the battery capacity $\Omega$. The concatenated vector $Z(t) = [Q(t), E(t)]$ is used to denote the network state, which captures the queue backlogs. Then, a perturbed Lyapunov function is defined as
$$L(Z(t)) = \frac{1}{2} \sum_{n \in \mathcal{N}} Q_n(t)^2 + \frac{1}{2} \sum_{n \in \mathcal{N}} \left( E_n(t) - \Omega \right)^2.$$
The intuition behind the use of the weighted perturbation parameter $\Omega$ is that, by keeping the Lyapunov function value small, we "push" the value of $E_n(t)$ towards $\Omega$. Thus, by choosing $\Omega$ properly, we can ensure that there is always enough energy stored in the energy queue for sensing and transmission. Meanwhile, when minimising $L(Z(t))$, we push the data queue backlog towards zero, which is equivalent to satisfying the network stability constraint (7). Then, the one-slot conditional Lyapunov drift can be defined as
$$\Delta(Z(t)) = \mathbb{E}\left\{ L(Z(t+1)) - L(Z(t)) \mid Z(t) \right\},$$
where the expectation is taken with respect to the random stochastic processes and the control actions.
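The "push towards the perturbation parameter" intuition can be illustrated numerically. The sketch below assumes the common quadratic form of a perturbed Lyapunov function, namely half the sum of squared data backlogs plus half the sum of squared deviations of the energy levels from the perturbation parameter; this form and the name `perturbed_lyapunov` are assumptions made for illustration.

```python
from typing import Sequence


def perturbed_lyapunov(data_queues: Sequence[float],
                       energy_queues: Sequence[float],
                       omega: float) -> float:
    """Quadratic perturbed Lyapunov function (assumed form).

    Small values mean short data queues and energy levels close to omega.
    """
    data_term = 0.5 * sum(q * q for q in data_queues)
    energy_term = 0.5 * sum((e - omega) ** 2 for e in energy_queues)
    return data_term + energy_term
```

The function is zero only when all data queues are empty and all energy queues sit exactly at `omega`, and it grows as an energy level drifts away from `omega`, which is why minimising it keeps energy available for sensing and transmission.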
To incorporate the objective function into the Lyapunov drift, we map the objective function to an appropriate penalty. Instead of greedily minimising the Lyapunov drift, actions are taken to minimise the drift-minus-utility $\Delta_V(Z(t)) = \Delta(Z(t)) - V\,\mathbb{E}\{U \mid Z(t)\}$, where $V$ is a non-negative control parameter that provides a tradeoff between queue backlog and network utility maximisation. Since the time average queue length is proportional to the queuing delay according to Little's law, the control parameter $V$ provides a guideline to balance the network utility and delay. Then, we have the following lemma regarding the drift-minus-utility.

Lemma 1. For any optimisation decision made on slot t , and all possible values of Z(t ) the value of drift-minus-utility is bounded by
where $\sum_{n,k}$ is the simplification of $\sum_{n \in \mathcal{N}} \sum_{k \in \mathcal{K}}$. At this point, the design principle behind the EMSC algorithm is to minimise the RHS of (17) in a per-slot manner, according to the principle of opportunistically minimising an expectation.

Scheduling decisions
In this subsection, we present our EMSC algorithm to address the above problem. Since the decision variables $e(t)$, $r(t)$ and $(\alpha(t), \mu(t))$ can be decoupled from each other and are independent of the current backlog $Z(t)$, the minimisation problem can be separated into three subproblems as follows.

Energy management
The energy management subproblem is given in (18). Clearly, the solution of (18) has an on-off structure.
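An on-off harvesting rule of this kind can be sketched as follows. The sketch assumes the usual consequence of a quadratic perturbed Lyapunov function: the per-slot objective is linear in the harvested energy with coefficient (stored energy minus the perturbation parameter), so the minimiser harvests everything on offer when the battery is below that parameter and nothing otherwise. The threshold rule and the name `harvest_decision` are assumptions, not the paper's stated solution.

```python
def harvest_decision(stored: float, supply: float, omega: float) -> float:
    """On-off harvesting rule (assumed form).

    stored: current energy queue level
    supply: energy available for harvesting in this slot
    omega:  perturbation parameter (battery capacity in the text)
    """
    if supply < 0:
        raise ValueError("supply must be non-negative")
    # "On": the battery is below the target level, so harvest all of it.
    if stored < omega:
        return supply
    # "Off": the battery is at or above the target level, harvest nothing.
    return 0.0
```

The decision depends only on the current energy level, so no statistics of the EH process are needed, which matches the online nature of the algorithm.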

Sampling rate control
The admission control subproblem determines the sampling rate $r_n(t)$ subject to constraint (3). Since the utility function $U(r_n(t))$ is a concave function and (3) is a linear constraint, the admission control problem is a convex optimisation problem. We can derive the optimal solution based on convex optimisation theory: the stationary point obtained through $(U')^{-1}$, the inverse of the first derivative of $U(\cdot)$, is capped by the arrival amount $a_n(t)$.
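For the logarithmic utility used later in the simulations, $U(r) = \log(1 + r)$, the inverse derivative has the closed form $(U')^{-1}(x) = 1/x - 1$, which allows a small worked sketch. The per-slot weight below, data backlog plus sensing cost times the energy deficit relative to the perturbation parameter, is an assumption derived from a quadratic perturbed Lyapunov function; the paper's exact weight is not reproduced here, and all names are hypothetical.

```python
def sampling_rate(data_queue: float, energy_queue: float, omega: float,
                  sensing_cost: float, v: float, arrivals: float) -> float:
    """Admitted amount for U(r) = log(1 + r) under an assumed weight.

    Minimises -v * log(1 + r) + weight * r over 0 <= r <= arrivals.
    """
    if v <= 0 or arrivals < 0:
        raise ValueError("v must be positive and arrivals non-negative")
    # Assumed weight: backlog pressure plus the energy price of sensing.
    weight = data_queue + sensing_cost * (omega - energy_queue)
    if weight <= 0.0:
        # The objective is decreasing in r, so admit everything.
        return arrivals
    # Stationary point: v / (1 + r) = weight  =>  r = v / weight - 1.
    unconstrained = v / weight - 1.0
    # Project onto the feasible interval [0, arrivals].
    return min(max(unconstrained, 0.0), arrivals)
```

A long data queue drives the admitted amount to zero, while a larger control parameter `v` admits more data, which is the mechanism behind the utility-delay tradeoff.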

Channel and transmit rate allocation
The objective function of the channel and transmit rate allocation (CTRA) subproblem consists of the product of an integer variable $\alpha(t)$ and a continuous variable $\mu(t)$, which makes CTRA a nonconvex mixed integer programming problem. To facilitate the design of a tractable resource allocation solution, we adjust the data queue length and modify the objective function of CTRA. Instead of solving the original CTRA, we transform it into the modified CTRA (m-CTRA) problem to find a suboptimal solution.
Here, we define the modified $Q_n(t)$ as
$$Q_n^{\mathrm{mod}}(t) = \left[ Q_n(t) - c_{\max} \right]^{+}.$$
Now, we can replace $Q_n(t)$ in the objective function of CTRA with $Q_n^{\mathrm{mod}}(t)$. Note that if we can remove the transmit rate variable $\mu(t)$ and relax the constraints related to it, the m-CTRA becomes a one-to-one matching problem. Because of this, we transform the m-CTRA into an integer problem by the following two steps. First, we show that the objective function of m-CTRA is minimised if sensors transmit data at full capacity on their assigned channels. Second, we replace $\mu_n(t)$ by the channel capacity $c_{n,k}(t)$. Thus, the continuous variable can be removed from the objective function and the m-CTRA is transformed into a one-to-one matching problem. The detailed process is shown in the following.
(1) In the following lemmas, we show that a channel k is assigned to sensor n if and only if it has a sufficient amount of data to transmit (i.e. constraint (4) can be satisfied) and it transmits the data at full channel capacity.

Lemma 2.
The channel $k$ is assigned to sensor $n$ only when the data queue occupancy is larger than the channel capacity, that is, $Q_n(t) > c_{n,k}(t)$.
Proof. Suppose that channel $k$ is assigned to sensor $n$, that is, $\alpha^*_{n,k}(t) = 1$. Note that our objective is to minimise the objective function of m-CTRA. Under the condition that $\sum_{k \in \mathcal{K}} \alpha^*_{n,k}(t) = 1$, $m_{n,k}(t)$ must be negative. With the expression of $m_{n,k}(t)$, we can further derive that $Q_n(t) > c_{n,k}(t)$. □

Lemma 3.
If channel $k$ is assigned to sensor $n$ at time slot $t$, it must transmit at full channel capacity, which can be expressed as $\mu_n^*(t) = \sum_{k \in \mathcal{K}} \alpha^*_{n,k}(t)\, c_{n,k}(t)$.
Proof. We first consider the case in which no channel is allocated to sensor $n$, that is, $\alpha_{n,k}(t) = 0, \forall k \in \mathcal{K}$. Combining constraint (5) with the non-negativity of the transmission rate, we can derive $\mu_n^*(t) = 0$. Then, we consider the case in which channel $k$ is assigned to sensor $n$. We prove that $\mu_n^*(t) = \sum_{k \in \mathcal{K}} \alpha^*_{n,k}(t)\, c_{n,k}(t)$ is the optimal solution to the m-CTRA. From the expression of $m_{n,k}(t)$, we can observe that $m_{n,k}(t)$ decreases as $\mu_n(t)$ increases. Hence, we choose $\mu_n(t)$ as large as possible to minimise $m_{n,k}(t)$. Recall that $\mu_n(t)$ is restricted by the channel capacity and the queue backlog. Further, from Lemma 2, we can see that constraint (4) is clearly satisfied. Thus, $\mu_n(t)$ is only bounded by the channel capacity constraint, and we choose $\mu_n^*(t) = c_{n,k}(t)$ as the optimal solution to m-CTRA. This completes the proof. □
(2) According to Lemma 2 and Lemma 3, we can replace the transmission rate by the channel capacity and rewrite the m-CTRA as a channel allocation subproblem, which is a one-to-one matching problem and can be solved by the adaptive Hungarian algorithm.
Based on the above analysis, the whole process of the EMSC algorithm is summarised in Algorithm 1. The EMSC algorithm solves the energy management, sampling rate control and channel and transmit rate allocation subproblems in sequence. Then, it updates the backlog of all queues for the procedures in the following time slot.
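To illustrate what the one-to-one matching computes, the sketch below solves a tiny instance by exhaustive search. Note that this swaps in brute force for the Hungarian algorithm named in the text, purely for readability; it is only practical for very small numbers of devices and channels. The cost matrix `weights` stands in for the per-pair terms of the m-CTRA objective, whose exact expression is not reproduced here, and the function name is hypothetical.

```python
from itertools import permutations
from typing import Dict, List, Tuple


def best_matching(weights: List[List[float]]) -> Tuple[Dict[int, int], float]:
    """Exhaustive one-to-one channel matching (illustrative stand-in).

    weights[n][k] is the assumed cost of giving channel k to device n.
    Returns ({channel: device}, total_cost). A channel is left unused
    when assigning it would not lower the cost (non-negative weight).
    Assumes at least as many devices as channels.
    """
    num_devices = len(weights)
    num_channels = len(weights[0]) if weights else 0
    best_assign: Dict[int, int] = {}
    best_cost = 0.0
    # devices[k] is the candidate device for channel k; permutations
    # guarantee that no device receives more than one channel.
    for devices in permutations(range(num_devices), num_channels):
        assign = {k: n for k, n in enumerate(devices) if weights[n][k] < 0}
        cost = sum(weights[n][k] for k, n in assign.items())
        if cost < best_cost:
            best_assign, best_cost = assign, cost
    return best_assign, best_cost
```

For three devices and two channels with costs `[[-3, -1], [-2, -4], [1, 2]]`, the search pairs channel 0 with device 0 and channel 1 with device 1 for a total cost of -7, and the third device, whose costs are positive, is never assigned.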

ALGORITHM 1 EMSC algorithm
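Input: Observe the queue state $Z(t)$, traffic arrival $a(t)$ and energy supply rate $\eta(t)$.
1: Solve the energy management, sampling rate control and channel and transmit rate allocation subproblems in sequence.
2: Update $Q(t)$ and $E(t)$ according to their respective queue dynamics.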

Application and implementation architecture of the EMSC algorithm
Following the EMSC algorithm, we propose an implementation architecture for its application in practical systems, as shown in the accompanying figure. Moreover, some descriptions are given as follows to show the favourable properties of the EMSC algorithm and to provide engineering guidelines. The computational complexity of the proposed algorithm is characterised in terms of $N$ and $K$ [22]. The complexity of the proposed algorithm increases linearly with $N$ and $K$, and is far lower than that of algorithms designed based on a Markov decision process (MDP), whose computational complexity increases exponentially with $N$ and $K$ [23].
Next, we provide an application scenario of the proposed EMSC algorithm in a transportation system. Various sensors and automated devices are installed on vehicles to collect task-oriented sensory data. The sensory data is further transmitted to the roadside units for data analysis. For delay-sensitive tasks, the amount of admitted traffic and service traffic should be jointly controlled to guarantee the queueing delay, which is the major issue the EMSC algorithm focuses on. Moreover, it is desirable to equip these sensors with green energy harvesting to realise their self-sustainability. Under this scenario, the design principle of the energy management policy can also follow that of the EMSC algorithm.

PERFORMANCE ANALYSIS
Here, the performance bounds of the proposed EMSC algorithm are analysed mathematically. We first state the performance in Theorem 1 and then provide some intuitive observations behind it.

Theorem 1.
The EMSC algorithm with any V > 0 has the following properties.
(a) Let $U^*$ and $U^{\mathrm{Alg}}$ denote the theoretical optimum value of the original problem and the network utility achieved by the proposed algorithm, respectively. Then $U^{\mathrm{Alg}}$ is within $O(1/V)$ of $U^*$.

(b) The time-average data queue length is upper bounded by $Q_{\max} = \beta_U V + r_{\max}$, where $\beta_U$ is the maximum derivative of $U(r_n(t))$.

(c) With a battery capacity Θ given by
the "energy causality" constraint will be guaranteed.

Proof. See Appendix A.2. □
To understand the results in Theorem 1, we provide the following intuitive observations.
1. Part (a) shows that the achieved network utility increases with $V$, and that $U^{\mathrm{Alg}}$ can be made arbitrarily close to the optimal value by setting a sufficiently large value of $V$. Moreover, note that in Section 3 we transform the CTRA subproblem into the modified CTRA subproblem. The performance loss caused by the transformation is captured by $\tilde{B}$, whereas $B$ is the performance gap if we do not perform the transformation.
2. Part (b) gives the upper bound of the data queues and implies that the time-average data queue length increases linearly with the value of $V$. Combining this with Part (a), we can see that the EMSC algorithm achieves an $[O(1/V), O(V)]$ utility-delay tradeoff. Such a quantitative result for the utility-delay tradeoff is useful for practical implementations.
3. Part (c) provides an explicit characterisation of the required energy capacity. Replacing $Q_{\max}$ in Equation (28) with Equation (29), we observe that the battery capacity depends only on known parameters and is easy to determine. Furthermore, we see that the battery capacity is deterministically upper bounded by a constant of size $O(V)$. Combined with Part (a), an explicit capacity requirement for energy storage is provided according to the desired utility performance.
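The tradeoff can be tabulated with a few lines of arithmetic. The sketch below uses the queue bound stated in Theorem 1(b), the maximum derivative of the utility times $V$ plus $r_{\max}$, and models the utility gap as a constant divided by $V$. The constant `gap_constant` is a hypothetical placeholder for the bound's numerator, and the function name is illustrative.

```python
from typing import List, Sequence, Tuple


def tradeoff_table(v_values: Sequence[float], gap_constant: float,
                   max_utility_slope: float,
                   r_max: float) -> List[Tuple[float, float, float]]:
    """Rows of (V, utility gap bound, data queue bound).

    gap_constant is a hypothetical constant; the queue bound follows
    Q_max = max_utility_slope * V + r_max from Theorem 1(b).
    """
    rows = []
    for v in v_values:
        if v <= 0:
            raise ValueError("V must be positive")
        utility_gap = gap_constant / v                # shrinks like O(1/V)
        queue_bound = max_utility_slope * v + r_max   # grows like O(V)
        rows.append((v, utility_gap, queue_bound))
    return rows
```

For example, with a slope of 1 and $r_{\max} = 2$, raising $V$ from 1 to 100 cuts the gap bound by a factor of 100 while the queue bound grows from 3 to 102.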

SIMULATION RESULTS
In this section, we verify the analysis and evaluate the performance of the proposed algorithm via MATLAB simulations. We use a network topology consisting of 15 IoTDs and an AP, in which the IoTDs are randomly distributed in a circular area with a radius of 30 meters and the AP is located at the center of this area. The IoTDs transmit data to the fusion node over $K = 4$ channels. Similar to [24], the utility function is given by $U(r_n(t)) = \log(1 + r_n(t))$, and its maximum derivative is $\beta_U = 1$. The energy consumption rate of data sensing is $P^{S} = 0.1$ Watt/bit and the transmit power is set to $P^{T} = 1$ Watt. The energy supply rate is uniformly distributed in $[0, \eta_{\max}]$, where $\eta_{\max} = 2$ Joule. The channel fading coefficient is expressed as $h_{n,k}(t) = d_n^{-\kappa}\, \tilde{h}_{n,k}(t)$, where $d_n$ is the distance between IoTD $n$ and the AP, and $\kappa$ denotes the path loss exponent and equals 4. Besides, $\tilde{h}_{n,k}(t)$ is the small-scale fading coefficient, which is uniformly distributed in $(0.9, 1.1)$ and i.i.d. across time slots [25]. Correspondingly, the instantaneous channel capacity is modelled as $c_{n,k}(t) = \log\left(1 + \frac{P^{T} \cdot h_{n,k}(t)}{N_0}\right)$, where $N_0$ denotes the noise power and equals $10^{-5}$ Watt. The traffic arrivals follow a Poisson distribution, and the mean traffic arrival rate is given by $\lambda_n = \lambda$. Note that each point of the following curves is obtained by averaging over $10^5$ runs.
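The channel model above can be sampled with a short sketch, given here in Python rather than the MATLAB used for the reported results. It assumes the natural logarithm and assumes that the capacity uses the full fading coefficient, that is, path loss with exponent 4 multiplied by a small-scale term drawn uniformly between 0.9 and 1.1. The function name and its parameters are hypothetical.

```python
import math
import random
from typing import Optional


def channel_capacity(distance: float, tx_power: float = 1.0,
                     noise_power: float = 1e-5, path_loss_exp: float = 4.0,
                     rng: Optional[random.Random] = None) -> float:
    """Sample one instantaneous channel capacity (assumed model).

    distance must be positive; rng allows reproducible draws.
    """
    if distance <= 0:
        raise ValueError("distance must be positive")
    rng = rng or random.Random()
    # Small-scale fading: uniform between 0.9 and 1.1.
    small_scale = rng.uniform(0.9, 1.1)
    # Overall gain: distance-based path loss times small-scale fading.
    gain = distance ** (-path_loss_exp) * small_scale
    # Capacity with natural logarithm (assumption of this sketch).
    return math.log(1.0 + tx_power * gain / noise_power)
```

At a distance of 10 meters the signal-to-noise ratio lies between about 9 and 11, so the sampled capacity falls roughly between 2.30 and 2.49.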

6.1 Network utility and queue dynamics
Figure 3 displays the evolution of network utility with the value of the control parameter $V$. Since a larger $V$ means that the controller places more emphasis on network utility maximisation than on the control of queue backlog, the network utility grows linearly as the value of $V$ increases in the startup phase. Then, the utility improvement starts to diminish with an excessive increase of $V$. This shows that a larger $V$ aggravates the congestion of the data queue and thereby limits the sampling rate. Finally, the utility gradually converges to the optimal value, which validates part (a) in Theorem 1. Figure 4 demonstrates the data queue stability with different values of $V$. It can be seen that the data queue backlog increases in the startup phase and eventually converges to its time average value. Furthermore, the upper bound of the data queue backlog increases with the value of $V$. Since the data queue backlog is proportional to delay according to Little's law, this indicates that the delay grows linearly, as $O(V)$. Figures 3 and 4 together indicate that there is a tradeoff between utility and delay, which can be quantitatively depicted by $[O(1/V), O(V)]$.
Similar to the data queue, the energy queue backlog increases in the early phase and then converges to its time average value, as illustrated in Figure 5. It can be seen that the time average energy queue backlog is deterministically upper bounded by the battery capacity. Furthermore, a larger $V$ increases the time average value, which indicates that the EMSC algorithm requires energy storage devices of size $O(V)$. This validates part (c) in Theorem 1.

6.2 Impacts of traffic arrival rate on system performance
Here, throughput refers to the maximum amount of admissible uplink traffic that can be stabilised in the system. Its time average value, denoted by Thr, is of interest. From Figure 6, we see that the throughput under the proposed EMSC algorithm equals the traffic arrival rate at first, then grows to its maximum and remains unchanged as the traffic arrival rate increases. More specifically, Figure 7 indicates that the maximum traffic arrival rate that the system can stably carry is 0.04, 0.08 and 0.16, respectively, under different values of the control parameter $V$. This is because the system can work stably for small traffic arrival rates, but has to start admission control for large traffic arrival rates. Figure 8 shows that the data queue length increases slowly with the traffic arrival rate and that the rate of increase falls as the traffic arrival rate grows. This follows from the fact that the EMSC algorithm adapts to the traffic arrivals owing to the queue stability constraint in (7) and starts admission control in heavy traffic states.

Performance comparison
FIGURE 4 Energy queue dynamics and battery capacity with different V
Here, we compare the EMSC algorithm with a baseline algorithm, which makes greedy decisions to maximise the network utility. In each slot, the greedy algorithm chooses the scheduling actions that offer the most obvious and immediate utility. Considering the channel and transmit rate allocation subproblem, the greedy algorithm first arranges the data queues in descending order with respect to their lengths. Then, it assigns the channel with the maximum capacity to the longest data queue for data transmission and transmits at the channel capacity. Finally, that data queue is removed from the arrangement. The above steps are repeated until all the channels are assigned.
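The channel assignment step of the greedy baseline can be sketched as follows. It assumes per-device, per-channel capacities stored as a list of lists and interprets "the channel with the maximum capacity" as the best remaining channel for the device currently being served; the function name and data layout are illustrative.

```python
from typing import Dict, List


def greedy_assign(queues: List[float],
                  capacity: List[List[float]]) -> Dict[int, int]:
    """Greedy channel assignment of the baseline (illustrative sketch).

    queues[n] is the backlog of device n; capacity[n][k] is the capacity
    of channel k for device n. Returns {device: channel}.
    """
    num_channels = len(capacity[0]) if capacity else 0
    free = set(range(num_channels))
    # Arrange the data queues in descending order of length.
    order = sorted(range(len(queues)), key=lambda n: queues[n], reverse=True)
    assignment: Dict[int, int] = {}
    for n in order:
        if not free:
            break  # all channels are assigned
        # Give the longest remaining queue its best free channel.
        best = max(free, key=lambda k: capacity[n][k])
        assignment[n] = best
        free.remove(best)
    return assignment
```

With backlogs `[3, 9, 5]` and two channels, device 1 (longest queue) picks first, device 2 receives the remaining channel, and device 0 is left unserved in this slot.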
In the following, we divide the network operation time into five time blocks. To validate the efficacy of the EMSC algorithm, we compare the utility of the greedy algorithm and the EMSC algorithm in each time block. As shown in Figure 9, the utility of the greedy algorithm is high in the first time block, then decreases to a stable value in the following time blocks. The reason is that the greedy algorithm is more concerned with the immediate utility, so it admits as much data as possible to maximise the utility at the beginning of operation. However, its channel allocation and transmit rate control actions cannot efficiently clear the backlog in the data queue to accommodate more incoming data, so its utility decreases drastically in the following time blocks. By contrast, although the utility achieved by the EMSC algorithm is relatively lower at the beginning, it gradually increases to a stable value higher than that of the greedy algorithm.
Further, we compare the total data queue length of the two algorithms to show the superior performance of the EMSC algorithm on delay. Figure 10 shows that the total data queue length in the EMSC algorithm is much lower than that in the greedy algorithm. Since the time-average data queue length is proportional to delay, this indicates that the EMSC algorithm outperforms the greedy algorithm in terms of delay.
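The link between queue length and delay used here is Little's law: for a stable queue, the time-average delay equals the time-average backlog divided by the time-average arrival rate. A minimal sketch of that conversion is given below; the function name `average_delay` and the sample values are hypothetical, not taken from the simulations.

```python
def average_delay(queue_samples, arrival_rate):
    """Estimate the time-average delay with Little's law.

    queue_samples: backlog observed in each slot (hypothetical trace).
    arrival_rate:  time-average admitted arrival rate per slot.
    """
    if not queue_samples:
        raise ValueError("queue_samples must not be empty")
    if arrival_rate <= 0:
        raise ValueError("arrival_rate must be positive")
    mean_backlog = sum(queue_samples) / len(queue_samples)
    # Little's law: mean delay = mean backlog / mean arrival rate.
    return mean_backlog / arrival_rate


if __name__ == "__main__":
    print(average_delay([2, 4, 6], 0.5))
```

For a fixed arrival rate, a lower average backlog therefore translates directly into a lower average delay.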

CONCLUSION
This paper has focused on the joint energy management and sampling rate control problem in an EHIN. Based on the Lyapunov optimisation method, combined with the idea of weight perturbation, the original stochastic problem has been decomposed into three deterministic sub-problems. By solving each sub-problem, we have obtained the EMSC algorithm. An $[O(1/V), O(V)]$ utility-delay tradeoff has been derived for the proposed algorithm and verified by both performance analysis and numerical simulations. Further, the simulation results have shown the necessity of sampling rate control and have validated the advantages of the proposed algorithm in terms of utility and delay.
We can obtain (A.1) by squaring both sides of (6). Multiplying both sides of (A.1) by 1/2 and defining the constant $B$ as one half of the sum of the two squared rate bounds, the second of which is $r_{\max}^2$, we obtain (A.2). Using a similar approach, we obtain (A.3). Taking conditional expectations of (A.2)-(A.3) and summing over all $n$ gives a bound on $\Delta(Z(t))$. Adding the penalty term to both sides proves the result.
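The squaring step can be checked numerically for a generic queue. The sketch below assumes the standard update $Q(t+1) = \max(Q(t) - b(t), 0) + a(t)$ with service $b(t)$ and arrivals $a(t)$; this form and the names `served`, `arrived` and `drift_bound_holds` are assumptions made for illustration and may differ from the exact form of (6).

```python
import random


def drift_bound_holds(q, served, arrived, served_max, arrived_max):
    """Check 0.5*(Q(t+1)^2 - Q(t)^2) <= B - Q(t)*(served - arrived).

    Assumed queue update: Q(t+1) = max(Q(t) - served, 0) + arrived,
    with B = 0.5 * (served_max^2 + arrived_max^2).
    """
    q_next = max(q - served, 0.0) + arrived
    drift = 0.5 * (q_next ** 2 - q ** 2)
    b_const = 0.5 * (served_max ** 2 + arrived_max ** 2)
    # Small tolerance guards against floating-point rounding only.
    return drift <= b_const - q * (served - arrived) + 1e-9


if __name__ == "__main__":
    rng = random.Random(0)
    served_max, arrived_max = 3.0, 2.0
    ok = all(
        drift_bound_holds(rng.uniform(0.0, 50.0),
                          rng.uniform(0.0, served_max),
                          rng.uniform(0.0, arrived_max),
                          served_max, arrived_max)
        for _ in range(10000)
    )
    print(ok)
```

The constant term depends only on the maximum rates, which is why it can be pulled out of the expectation when the per-queue bounds are summed.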

A.2 Proof of Theorem 1
A.2.1 Proof of part (a)
We prove part (a) by comparing the Lyapunov drift with that of any other stationary and randomized algorithm, denoted by ALT. According to [26], for any positive constant, there exists an i.i.d. algorithm ALT that satisfies a set of conditions on its resulting values, where $r^{\mathrm{ALT}}(t)$, $e^{\mathrm{ALT}}(t)$, $P_n^{\mathrm{total,ALT}}(t)$ and the remaining ALT-superscripted control variables are the values obtained under the ALT algorithm, and the two coefficients appearing in those conditions are constant scalars.
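We first show that the suboptimal algorithm approximately minimizes the RHS of (17). The solution procedure shows that the suboptimal algorithm exactly minimizes a related function $D(t)$, in which the data queue length is replaced by $Q_n^{\mathrm{mod}}(t)$, that is, $Q_n(t)$ reduced by a maximum-rate constant and truncated at zero by $[\cdot]^+$. Let $\tilde{D}(t)$, defined in (A.8), denote the function within the expectation on the RHS of the drift-minus-utility bound (17). Comparing $\tilde{D}(t)$ with $D(t)$ shows that the two are related through a double sum over the $N$ IoTDs and $K$ channels. Here we use the superscript SUB to denote the suboptimal algorithm. Since the suboptimal algorithm minimizes $D(t)$, we have $D^{\mathrm{SUB}}(t) \le D^{\mathrm{ALT}}(t)$. Since the double sum is non-negative and at most $NK$ times the squared maximum-rate constant, $\tilde{D}^{\mathrm{SUB}}(t)$ is at most $\tilde{D}^{\mathrm{ALT}}(t)$ plus that bound. That is, the value of $\tilde{D}(t)$ under the suboptimal algorithm is no greater than that under any other alternative policy plus a constant. Using the definition of $\tilde{D}(t)$, (17) can now be rewritten in terms of $\tilde{D}^{\mathrm{SUB}}(t)$.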

A.2.2 Proof of part (b)
We prove Equation (28) by induction, where $U'(0)$ denotes the maximum derivative of the utility function $U$.
1. First, consider the case in which the n-th IoTD does not collect any information from the monitoring area. Then $Q_n(t+1) \le Q_n(t) \le V U'(0) + r_{\max}$ clearly holds.
2. Next, consider the case in which the n-th IoTD collects information with the sampling rate $r_n^*(t)$ given in Equation (21). According to Equation (21), $V U'(r_n^*(t))$ equals $Q_n(t)$ minus a positive multiple of $E_n(t) - \Omega$. Because the term $E_n(t) - \Omega$ is negative, we have $Q_n(t) \le V U'(r_n^*(t))$. Since $V U'(r_n^*(t)) \le V U'(0)$, it follows that $Q_n(t) \le V U'(0)$. Furthermore, $r_n(t)$ is upper bounded by the maximum sampling rate $r_{\max}$, so $Q_n(t+1) \le Q_n(t) + r_{\max} \le V U'(0) + r_{\max}$ is satisfied.
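The backlog bound can be illustrated with a small simulation. This is a sketch under assumptions that are not taken from the paper: the utility is $U(r) = \log(1 + r)$, so that $U'(0) = 1$, the non-negative energy-related term of Equation (21) is modelled by a hypothetical random `energy_price`, and the service process is uniform random. The names `sampling_rate` and `simulate_max_backlog` are hypothetical.

```python
import random


def sampling_rate(q, v, r_max, energy_price=0.0):
    """Maximise V*log(1 + r) - (q + energy_price)*r over 0 <= r <= r_max.

    Assumption: U(r) = log(1 + r); energy_price >= 0 stands in for the
    energy-queue term of the sampling rule.
    """
    price = q + energy_price
    if price <= 0:
        return r_max
    # Stationary point of the concave objective, clipped to [0, r_max].
    return min(max(v / price - 1.0, 0.0), r_max)


def simulate_max_backlog(v, r_max, slots=5000, seed=0):
    """Return the largest backlog observed over the simulated horizon."""
    rng = random.Random(seed)
    q, q_peak = 0.0, 0.0
    for _ in range(slots):
        r = sampling_rate(q, v, r_max, energy_price=rng.uniform(0.0, 2.0))
        served = rng.uniform(0.0, 1.0)  # hypothetical service process
        q = max(q - served, 0.0) + r
        q_peak = max(q_peak, q)
    return q_peak


if __name__ == "__main__":
    v, r_max = 10.0, 2.0
    peak = simulate_max_backlog(v, r_max)
    # With U'(0) = 1 the bound reads V * U'(0) + r_max.
    print(peak <= v * 1.0 + r_max)
```

Once the backlog exceeds $V U'(0)$, the sampling rule returns zero, so the queue can overshoot that level by at most one slot of sampling, which is the content of the two induction cases.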

A.2.3 Proof of part (c)
Here we first derive a condition on $\Omega$ under which an IoTD does not sense any data if its available energy is less than the maximum energy consumption. Since the utility function $U(r_n(t))$ is concave, its derivative $U'(r_n(t))$ is non-increasing in $r_n(t)$. Combining this with the expression for the sampling rate in Equation (21), the IoTD does not sense any data, i.e., the sampling rate is zero, if condition (A.16) holds. Note that the maximum derivative of $U(r_n(t))$ equals $U'(0)$, so that $(U')^{-1}(U'(0)) = 0$. Substituting this into (A.16) and recalling that $U'$ is non-increasing, we obtain (A.17). Rearranging (A.17) yields a lower bound on $\Omega$ consisting of $E_n(t)$ plus a term proportional to $V U'(0) - Q_n(t)$, where the proportionality constant is the reciprocal of the coefficient of $E_n(t) - \Omega$ in Equation (21). Since $Q_n(t) \ge 0$, it suffices that $\Omega$ is at least $E_n(t)$ plus the same term evaluated at $Q_n(t) = 0$. To ensure that the sensor cannot sense any data whenever $E_n(t) < P_{\max}$, $E_n(t)$ in this bound is replaced by $P_{\max}$.
Next, we derive a second bound on $\Omega$ by requiring that no channel can be assigned to the n-th IoTD if its available energy is less than the maximum energy consumption. From the objective function of the channel allocation subproblem, no channel is allocated to the n-th IoTD if the corresponding condition on $\Omega$ holds. Since $Q_n^{\mathrm{mod}}(t) \le Q_{\max}$ and the per-channel rate term is upper bounded by its maximum, the bound on $\Omega$ in Equation (A.19) is satisfied whenever $\Omega$ is at least $E_n(t)$ plus the product of $Q_{\max}$ and that maximum rate, divided by the coefficient with subscript $T$. Finally, we choose $\Omega$ as the maximum of the two bounds, which gives (A.20).