A low complexity user scheduling algorithm aimed for the maximum number of active users in NOMA system

User scheduling algorithms in non-orthogonal multiple access (NOMA) have attracted much attention as a means of improving the performance of communication systems. Here, a low-complexity user scheduling algorithm is proposed that maximises the number of active users in a single NOMA cluster while ensuring each user's individual minimum rate requirement. With the maximisation of the number of active users as a precondition, a maximum sum-rate strategy is further integrated into the algorithm to form a multi-round user scheduling algorithm. Moreover, a computational-complexity reduction algorithm is introduced and proved to be applicable under practical conditions. Simulation results show that, compared with existing user scheduling algorithms, the proposed algorithms achieve the maximum number of active users while ensuring their individual minimum rate requirements, and significantly improve the sum-rate among the cases in which the maximum number of active users is achieved. Furthermore, the proposed multi-round and computational-complexity reduction user scheduling algorithms require lower computational complexity than the exhaustive search.


INTRODUCTION
Non-orthogonal multiple access (NOMA), which superimposes the signals of multiple users over the same spectrum resource at the base station (BS) via power-domain or code-domain division based on the users' respective channel gain differences, has gained significant attention for the fifth generation (5G) and beyond communication systems [1][2][3]. At each receiver, successive interference cancellation (SIC) is applied to retrieve the user's own signal from the power-superimposed signal of all users in the cluster [4]. In NOMA systems, power resource allocation plays an important role and has been studied for performance evaluation and improvement in terms of sum-rate maximisation, energy-efficiency maximisation, outage probability minimisation, and optimal user pairing, some of which are listed below.
maximisation problem of multiple users in one NOMA cluster is convex. For both uplink and downlink NOMA, a sum-rate maximisation problem in a cell is studied in [9], in which the user clustering (i.e. grouping users into multiple clusters) and the power allocations in NOMA clusters are optimised under transmission power constraints, minimum rate requirements of the users, and SIC constraints.

Energy-efficiency maximisation
Energy efficiency is another important performance metric in communication systems, and resource allocation to maximise the energy efficiency of NOMA systems is studied in [10][11][12][13][14][15][16]. In [10][11][12], three communication modes, namely device-to-device, machine-to-machine, and relay, are considered to maximise the energy efficiency in NOMA systems. The authors in [13] investigate the energy-efficient optimisation problem with a dual-connectivity mode, using a rate-splitting method to transform the original non-convex problem into a convex one. An energy-efficient joint allocation of power and bandwidth in NOMA systems is proposed in [14] using the equivalent difference-of-convex (DC) functions method. To maximise the energy efficiency, a power allocation strategy subject to the individual minimum rate requirements is proposed in [15] by decoupling the optimisation problem into two concatenated sub-problems. The authors in [16] study a low-complexity energy-efficiency maximisation problem by decoupling the power allocation and sub-channel assignment in NOMA systems.

User scheduling schemes
Furthermore, user scheduling has a significant impact on the performance of NOMA systems. In [17], a clustering and power allocation approach is proposed that splits the users with high channel gains into different clusters to achieve sum-rate maximisation. In [9], by classifying the users into two classes, a sub-optimal user clustering algorithm with low computational complexity is proposed for downlink and uplink transmission, respectively. Considering the additional complexity introduced by SIC at the receiver, each NOMA cluster is restricted to only two users in several works [18][19][20]. In [18], optimal user pairing for multiple NOMA clusters is studied, and a computational-complexity reduced user pairing scheme is proposed to achieve sum-rate maximisation. A novel low-complexity sub-optimal user scheduling algorithm is proposed in [19] to maximise the system energy efficiency with imperfect channel state information (CSI). The authors in [20] propose a matching algorithm in the downlink NOMA network to optimise the two-user pairing and the power allocation between the weak users located at the edge of the cell and the strong users located close to the BS.

Motivation and contributions
In the aforementioned works, an implicit assumption is made that all users are active at the same time in NOMA systems. However, this is not always true for internet-of-things (IoT) networks, which connect a very large number of devices with various data rate requirements or limited resources. On the other hand, the more devices are active, the more accurate and comprehensive the information that can be obtained in IoT networks. However, according to the NOMA principle and the various minimum rate requirements among multiple users in a single NOMA cluster, the user scheduling result changes dynamically; it could be obtained by exhaustive search, but only with impractical computational complexity. Therefore, when the BS cannot support the QoS of all users, how to choose the active users among multiple users with various minimum rate requirements at low computational complexity is still an open problem in NOMA systems, and its optimal solution is not yet known.
Motivated by the above works on NOMA systems and by the questionable assumption that all users are active, we focus on developing a practical user scheduling solution for multiple users in a single NOMA cluster. The contributions are outlined as follows:
• In the single NOMA cluster, a user scheduling problem is formulated such that the number of active users is maximised under the constraints of the individual minimum rate requirements of the users and the total transmission power.
• Due to the combinatorial nature of the formulated mixed integer non-linear programming problem, we propose a multi-round user scheduling scheme, which obtains the maximum number of active users by dividing the whole power range into multiple intervals according to the minimal power requirements of all combinations of active users, while ensuring their individual minimum rate requirements.
• Among the multiple optimal solutions of the original problem, which arise from its integer programming nature, a sum-rate improvement scheme is further integrated into the proposed multi-round user scheduling scheme by specifically choosing the active users.
• Based on the solution attained by the proposed multi-round user scheduling scheme, a practical computational-complexity reduced user scheduling algorithm (CCRUSA) is further derived and proved.

Paper organisation
The remainder of the paper is organised as follows. Section 2 describes the system model and problem formulation. Section 3 proposes a multi-round user scheduling algorithm (MRUSA) and a CCRUSA, followed by an analysis of the computational complexity of the proposed algorithms. Section 4 evaluates the performance of the proposed algorithms based on simulation results in terms of the number of active users, the sum-rate, the individual rates, and the computational complexity. Section 5 concludes the paper.

System model
Without loss of generality, we assume that there are $M$ single-antenna users, denoted by $\mathcal{M} = \{m\}_{m=1}^{M}$, which intend to be active in a downlink single-NOMA-cluster system. The channel gain between the single-antenna BS and user $m$ is denoted by $h_m$, and the channel gains are sorted in ascending order, that is,

$$ |h_1|^2 \le |h_2|^2 \le \cdots \le |h_M|^2 . $$

The BS transmits a superimposed signal for the clustered users as

$$ x = \sum_{m=1}^{M} \sqrt{p_m}\, x_m , \tag{1} $$

where $x_m$ and $p_m$, $\forall m$, denote the signal and the allocated power for user $m$, respectively. The received signal at user $m$ can be obtained as

$$ y_m = h_m x + n_m , \tag{2} $$

where $n_m$ denotes the additive white noise at user $m$ with variance $\sigma^2$.
In downlink NOMA, user $m$ retrieves its own signal from the received power-superimposed signal of all users belonging to the same cluster by implementing SIC. The achievable rate of user $m$ can be given by

$$ R_m = \log_2\!\left(1 + \frac{p_m |h_m|^2}{|h_m|^2 \sum_{i=m+1}^{M} p_i + \sigma^2}\right), \quad \forall m \in \mathcal{M}. \tag{3} $$
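As a rough illustration of the rate model described above, the following sketch evaluates per-user rates under the standard downlink power-domain NOMA assumption: users are indexed by ascending channel gain, each user removes the signals of weaker users via SIC, and treats the signals of stronger users as interference. The function name `achievable_rates`, the interpretation of `gains[m]` as a linear power gain, and the example numbers are illustrative assumptions, not the paper's own notation or code.

```python
import math


def achievable_rates(powers, gains, noise_power):
    """Per-user rates in b/s/Hz for users sorted by ascending channel gain.

    powers[m] is the power allocated to user m, gains[m] is the (assumed)
    linear power gain of user m, and noise_power is the noise variance.
    After SIC, user m only sees interference from stronger users i > m.
    """
    rates = []
    for m, (p_m, g_m) in enumerate(zip(powers, gains)):
        interference = g_m * sum(powers[m + 1:])  # residual after SIC
        sinr = p_m * g_m / (interference + noise_power)
        rates.append(math.log2(1.0 + sinr))
    return rates


# Two users with equal gains: the weaker-indexed user gets more power
# to compensate for the interference from the stronger-indexed one.
example_rates = achievable_rates([2.0, 1.0], [1.0, 1.0], 1.0)
```

In this toy case both users reach 1 b/s/Hz, which shows how the user decoded first needs extra power to tolerate the interference of the user decoded later.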

Problem formulation
Our objective is to obtain the maximum number of active users via user scheduling. Thus, it is necessary to define an indicator vector $\boldsymbol{\alpha} = (\alpha_1, \ldots, \alpha_M)^T$ to represent the user activation in the single NOMA cluster, whose element is given by

$$ \alpha_m = \begin{cases} 1, & \text{if user } m \text{ is active}, \\ 0, & \text{otherwise}. \end{cases} \tag{4} $$

Then, the optimisation problem is formulated as

$$ \max_{\boldsymbol{\alpha},\, \mathbf{p}} \ \sum_{m=1}^{M} \alpha_m \quad \text{s.t.} \quad \sum_{m=1}^{M} \alpha_m p_m \le P_{\max}, \quad R_m \ge \alpha_m \bar{R}_m, \ \forall m \in \mathcal{M}, \tag{5} $$

where $\bar{R}_m$ is the individual minimum rate required by user $m$, $P_{\max}$ is the maximum transmission power for the single NOMA cluster, and $\mathbf{p} = (p_1, \ldots, p_M)^T$ is the allocated power vector. The objective function in (5) maximises the number of active users, which is the primary motivation of this work. In (5), the first constraint implies that the sum-power allocated to all the active users cannot exceed $P_{\max}$. The second constraint states that, for a given $\boldsymbol{\alpha}$, every active user should achieve a rate no less than its individual minimum rate requirement.
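For a fixed activation pattern, the two constraints can be checked by computing the smallest sum-power that meets every active user's minimum rate. The sketch below assumes the standard downlink NOMA rate with SIC in ascending channel gain order, under which each active user $m$ needs $p_m = (2^{\bar{R}_m} - 1)\,(\sum_{i>m} p_i + \sigma^2/|h_m|^2)$, so the powers can be accumulated from the strongest active user to the weakest. The names `min_total_power` and `is_feasible` are hypothetical helpers introduced only for illustration.

```python
def min_total_power(active, gains, min_rates, noise_power):
    """Smallest sum-power meeting the minimum rate of every active user.

    `active` lists user indices. Users are processed from the strongest to
    the weakest, because a user only sees interference from stronger users.
    """
    total = 0.0  # power already assigned to stronger active users
    for m in sorted(active, key=lambda u: gains[u], reverse=True):
        gamma = 2.0 ** min_rates[m] - 1.0  # SINR needed for the minimum rate
        total += gamma * (total + noise_power / gains[m])
    return total


def is_feasible(active, gains, min_rates, noise_power, p_max):
    """True if the active set can be served within the power budget."""
    return min_total_power(active, gains, min_rates, noise_power) <= p_max


# Two users, unit gains, unit noise, 1 b/s/Hz each: 1 + 2 = 3 power units.
needed = min_total_power([0, 1], [1.0, 1.0], [1.0, 1.0], 1.0)
```

An activation pattern is then a feasible point of the scheduling problem exactly when this minimum does not exceed the power budget, which is the test the later sketches reuse.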

SOLUTION OF THE OPTIMISATION PROBLEM
The problem in (5) is a mixed integer non-linear programming problem, whose optimal solution can be obtained by an exhaustive search. Since the exhaustive search has to evaluate every combination of users, the optimal user scheduling solution is easy to obtain only when the number of users is small. When the number of users grows, the computational complexity of the exhaustive search becomes intolerable.
To reduce the search time, in this section we propose a multi-round user scheduling scheme under general conditions, and then specify a computational-complexity reduced user scheduling scheme for some practical cases. Finally, based on the analysis of the computational complexity of the algorithms, a hybrid user scheduling algorithm is introduced.
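To make the cost of the exhaustive baseline concrete, the following sketch enumerates every non-empty subset of users and keeps the largest feasible active sets. It is only an illustration: the feasibility test assumes the standard downlink NOMA rate with SIC in ascending channel gain order, and `min_total_power` and `exhaustive_search` are hypothetical helper names.

```python
from itertools import combinations


def min_total_power(active, gains, min_rates, noise_power):
    """Smallest sum-power meeting all minimum rates (assumed rate model)."""
    total = 0.0
    for m in sorted(active, key=lambda u: gains[u], reverse=True):
        gamma = 2.0 ** min_rates[m] - 1.0
        total += gamma * (total + noise_power / gains[m])
    return total


def exhaustive_search(gains, min_rates, noise_power, p_max):
    """Check all subsets; return (max active count, list of such subsets)."""
    users = range(len(gains))
    best_size, best_sets = 0, [()]
    for size in range(1, len(gains) + 1):
        for active in combinations(users, size):
            if min_total_power(active, gains, min_rates, noise_power) <= p_max:
                if size > best_size:  # first feasible set of a larger size
                    best_size, best_sets = size, []
                best_sets.append(active)
    return best_size, best_sets


# Three users with gains 1, 4, 16 and a budget that cannot serve all three.
size, sets = exhaustive_search([1.0, 4.0, 16.0], [1.0, 1.0, 1.0], 1.0, 1.0)
```

The loop visits all $2^M - 1$ non-empty subsets regardless of the outcome, which is the cost the multi-round scheme is designed to avoid.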

Multi-round user scheduling algorithm
To obtain the optimal solution of the user scheduling problem, a two-stage decision method is proposed. In Stage 1, the feasible points of the user scheduling problem are determined. To further achieve a better transmission rate, Stage 2 is then carried out.
Stage 1: Decision for the feasible points of the user scheduling problem.
The feasible points of the user scheduling problem are given by the following lemma.

Lemma 1. Assuming that all the $C_M^N$ combinations of inactive users are indexed in descending order of $P^{\mathcal{N}}_{\max}$, the feasible points of the user scheduling problem (5), that is, the optimal candidates of inactive users, are shown in Table 1, where $\mathcal{N}$ is the general expression of the inactive $N$-user set, and $P^{\mathcal{N}}_{\max}$ is given by (6).
Proof. Please refer to Appendix 1. □
It should be noted that the result in Table 1 presumes a condition on $P_{\max}$ that may be satisfied in some special cases, in which the BS can remove the user set $\mathcal{N}$ to maintain the individual minimum rate requirements of the active users. To achieve the designed objective, the BS prefers to remove as few users as possible, so that the maximum number of active users is achieved.
Remark 2. In practice, due to the hardware constraints on SIC, the number of accessible users is restricted. Thus, if the number of accessible users in a single NOMA cluster is limited to $Q = M - N$, the feasible points of the user scheduling problem given in Table 1 still hold by setting P = ∞ and removing the first $N$ rows of the table.
Stage 2: Decision for the inactive users.
As seen in Stage 1, there may still be more than one feasible point. Therefore, extra effort is needed to achieve better performance. The optimal inactive users can finally be obtained by the following lemma, in which the selection rule is given by (7) and the associated optimal sum-rate by (8).
Proof. Please refer to Appendix 2. □
As is well known, the considered user scheduling problem can be solved by exhaustive search, but with intolerable computational complexity. To alleviate the computational burden, the following analysis leads to a user scheduling scheme implemented by Algorithm 1.
The objective of the considered user scheduling problem is to find the maximum number of active users, so that if a feasible condition can support more active users, the conditions with fewer active users need not be evaluated. For this reason, a multi-round user scheduling scheme is designed to search for the optimal user scheduling solution via the outer and inner multi-round mechanisms given as follows.
1) The outer multi-round mechanism: It can be seen from Table 1 that the whole range of $P_{\max}$ is decomposed into $M + 1$ feasible conditions to obtain the optimal number of inactive users. However, there are $2^M$ feasible conditions for the candidates of inactive users. Thus, instead of obtaining the final solution directly, first determining the optimal number of inactive users yields lower computational complexity.

Algorithm 1 The proposed multi-round user scheduling algorithm
An outer multi-round mechanism is conducted iteratively from step 1 to step 12 in Algorithm 1 to search for the minimum number of inactive users.
2) The inner multi-round mechanism: After obtaining $N^*$, the corresponding feasible condition needs to be further decomposed into $C_M^{N^*}$ subintervals, as seen from Table 1. Thus, an inner multi-round mechanism is conducted iteratively from step 5 to step 10 in Algorithm 1 to obtain the feasible points of problem (5).
Finally, among the multiple feasible points, the user scheduling solution that has the maximal number of active users together with the better sum-rate performance is selected in step 7 according to (7).
It can be concluded that MRUSA not only obtains the feasible points of the original problem via Stage 1, but also seeks an improved sum-rate via Stage 2. Thus, the obtained user scheduling solution is optimal in terms of the number of active users and attains a higher transmission rate among such solutions.
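The two-stage, multi-round idea can be sketched as follows. This is not the paper's Algorithm 1, whose steps are not reproduced here; it is a minimal illustration under two stated assumptions: the standard downlink NOMA rate with SIC in ascending gain order, and a sum-rate rule in which every active user except the strongest is held at its minimum rate while the strongest takes the remaining power. The names `min_total_power`, `max_sum_rate`, and `mrusa` are hypothetical.

```python
import math
from itertools import combinations


def min_total_power(active, gains, min_rates, noise_power):
    """Smallest sum-power meeting all minimum rates (assumed rate model)."""
    total = 0.0
    for m in sorted(active, key=lambda u: gains[u], reverse=True):
        gamma = 2.0 ** min_rates[m] - 1.0
        total += gamma * (total + noise_power / gains[m])
    return total


def max_sum_rate(active, gains, min_rates, noise_power, p_max):
    """Sum-rate when all but the strongest active user sit at minimum rate."""
    if not active:
        return 0.0
    order = sorted(active, key=lambda u: gains[u])  # weakest first
    remaining, total_rate = p_max, 0.0
    for m in order[:-1]:
        gamma = 2.0 ** min_rates[m] - 1.0
        # Power that gives user m exactly its minimum rate when the rest of
        # `remaining` goes to stronger users and acts as interference.
        p_m = gamma * (remaining + noise_power / gains[m]) / (1.0 + gamma)
        remaining -= p_m
        total_rate += min_rates[m]
    strongest = order[-1]
    return total_rate + math.log2(1.0 + remaining * gains[strongest] / noise_power)


def mrusa(gains, min_rates, noise_power, p_max):
    """Return (inactive users, sum-rate) with the fewest inactive users."""
    users = range(len(gains))
    for n_inactive in range(len(gains) + 1):  # outer rounds: N = 0, 1, ...
        feasible = []
        for inactive in combinations(users, n_inactive):  # inner rounds
            active = [u for u in users if u not in inactive]
            if min_total_power(active, gains, min_rates, noise_power) <= p_max:
                rate = max_sum_rate(active, gains, min_rates, noise_power, p_max)
                feasible.append((rate, inactive))
        if feasible:  # stop at the first N that admits a feasible point
            best_rate, best_inactive = max(feasible)
            return best_inactive, best_rate
    return tuple(users), 0.0


inactive, rate = mrusa([1.0, 4.0, 16.0], [1.0, 1.0, 1.0], 1.0, 1.2)
```

In this example two single-user removals are feasible, and the sketch keeps the one with the larger sum-rate, mirroring the role of Stage 2.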

Computational-complexity reduced user scheduling algorithm
Obviously, the more users participate in the user scheduling scheme, the greater the computational complexity. To further alleviate this computational burden when a very large number of users are present in a single NOMA cluster, a CCRUSA is introduced under some practical conditions, as shown in the following theorem.

Theorem 1. When $\bar{R}_1 \ge \cdots \ge \bar{R}_M$, the optimal inactive users set is given by $\mathcal{N}^* = \{m\}_{m=1}^{N^*}$.
Proof. Please refer to Appendix 3. □

Algorithm 2 The proposed computational-complexity reduced user scheduling algorithm
Remark 3. Theorem 1 establishes the proposed user scheduling scheme for the specific case $\bar{R}_1 \ge \cdots \ge \bar{R}_M$, which includes the more practical case $\bar{R}_1 = \cdots = \bar{R}_M$. In practice, the proposed computational-complexity reduced user scheduling scheme can be implemented by Algorithm 2.
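A reduced-complexity scheme of this kind might look like the sketch below. It is not the paper's Algorithm 2; it assumes, as the proof outline in Appendix 3 suggests, that when the minimum rates are non-increasing in the user index (users sorted by ascending channel gain), deactivating the weakest users requires the least power, so only $M + 1$ candidate sets need to be checked. The rate model and the helper names `min_total_power` and `ccrusa` are likewise assumptions.

```python
def min_total_power(active, gains, min_rates, noise_power):
    """Smallest sum-power meeting all minimum rates (assumed rate model)."""
    total = 0.0
    for m in sorted(active, key=lambda u: gains[u], reverse=True):
        gamma = 2.0 ** min_rates[m] - 1.0
        total += gamma * (total + noise_power / gains[m])
    return total


def ccrusa(gains, min_rates, noise_power, p_max):
    """Deactivate the weakest users one by one until the rest are feasible.

    Assumes `gains` is sorted in ascending order and `min_rates` is
    non-increasing, which is the condition stated for the reduced scheme.
    """
    m_total = len(gains)
    for n_inactive in range(m_total + 1):
        active = range(n_inactive, m_total)  # keep the strongest users
        if min_total_power(active, gains, min_rates, noise_power) <= p_max:
            return list(range(n_inactive))
    return list(range(m_total))


removed = ccrusa([1.0, 4.0, 16.0], [1.0, 1.0, 1.0], 1.0, 1.0)
```

Because no subsets are enumerated and no sum-rates are compared, the number of feasibility checks grows linearly with the number of users.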

Computational-complexity analysis
The computational complexities of the proposed MRUSA and CCRUSA are analysed for given $h_i$, $\bar{R}_i$, $\forall i \in \mathcal{M}$, and $P_{\max}$, and compared with that of the exhaustive search. To evaluate the complexity, the numbers of floating-point operations (flops) in the algorithms are counted [21]. For the sake of brevity, the computational complexity is denoted by $\mathcal{F}$, and the flops count of each step in each algorithm is given as follows. For a given $N$, $C_M^N$ combinations of user cases need to be calculated, in which the flops for each combination are denoted by $\mathcal{F}^1_N$. Further, assume that the optimal number of inactive users is $N^*$, $\forall N^* \in [0, M]$. The candidates of inactive users are then the $C_M^{N^*} - k^* + 1$ feasible sets of size $N^*$, and the flops for each of these candidate cases in the sum-rate improvement strategy are denoted by $\mathcal{F}^2_{N^*}$. Then, based on the obtained $\mathcal{F}^1_N$ and $\mathcal{F}^2_{N^*}$, the computational-complexity evaluation of the compared algorithms in terms of flops is summarised as follows.

1) Exhaustive search (ES):
In the exhaustive search, all the combinations of user scheduling should be enumerated. Thus, $\sum_{N=0}^{M-1} C_M^N$ combinations of user cases need to be calculated to obtain the candidates of inactive users, which requires $\sum_{N=0}^{M-1} C_M^N \mathcal{F}^1_N$ flops. In the sum-rate improvement strategy, additional flops are needed for the $C_M^{N^*} - k^* + 1$ candidates, which are evaluated to obtain the final user scheduling solution.
Therefore, the total flops of the exhaustive search are counted as

$$ \mathcal{F}_{\mathrm{ES}} = \sum_{N=0}^{M-1} C_M^N \mathcal{F}^1_N + \left(C_M^{N^*} - k^* + 1\right) \mathcal{F}^2_{N^*} . $$

2) Multi-round user scheduling algorithm (MRUSA):
In contrast to the exhaustive search, the proposed MRUSA uses the multi-round mechanism to reach the optimal solution.
Specifically, according to Algorithm 1, determining the candidates of inactive users needs $\sum_{N=0}^{N^*} C_M^N \mathcal{F}^1_N$ flops. The flops count in the sum-rate improvement strategy is equal to that in the exhaustive search.
Therefore, the total flops of the MRUSA are counted as

$$ \mathcal{F}_{\mathrm{MRUSA}} = \sum_{N=0}^{N^*} C_M^N \mathcal{F}^1_N + \left(C_M^{N^*} - k^* + 1\right) \mathcal{F}^2_{N^*} . $$

The flops counts of the compared algorithms are summarised in Table 2.
It can be seen that the proposed MRUSA has lower complexity than the ES when the optimal number of inactive users is not too large, and that the CCRUSA has the lowest complexity.
Furthermore, the numerical results of the flops versus $M$ are listed in Table 3. They show that the two proposed user scheduling algorithms achieve a significant reduction in computational complexity compared with the ES, and that the savings become more significant as the number of considered users grows.
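The gap between the two searches can be illustrated by counting only the user combinations each one evaluates, using the summation limits stated above. The per-combination flops are not reproduced here, so the sketch below counts combinations rather than flops; the function names are hypothetical.

```python
from math import comb


def es_combinations(m_users):
    """Combinations checked by the exhaustive search: N = 0, ..., M - 1."""
    return sum(comb(m_users, n) for n in range(m_users))


def mrusa_combinations(m_users, n_star):
    """Combinations checked by the multi-round search: N = 0, ..., N*."""
    return sum(comb(m_users, n) for n in range(n_star + 1))


# With M = 10 users and N* = 2 inactive users in the optimal solution.
es_count = es_combinations(10)
mr_count = mrusa_combinations(10, 2)
```

For $M = 10$ and $N^* = 2$ the multi-round search evaluates 56 combinations against 1023 for the exhaustive search; the advantage shrinks as $N^*$ approaches $M$.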

Algorithm 3 The hybrid user scheduling algorithm
These observations will be further verified by numerical simulations in Section 4.
Since MRUSA and CCRUSA achieve the same user scheduling solution whenever both are applicable, and CCRUSA outperforms MRUSA in terms of computational complexity, CCRUSA has a higher priority than MRUSA in practice. We further propose a hybrid user scheduling algorithm in Algorithm 3 to describe the trade-off between the two proposed algorithms.
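The selection logic of such a hybrid scheme could be as simple as the following sketch. The steps of Algorithm 3 are not reproduced here, so this is an assumption: the reduced scheme is chosen whenever the minimum rates are non-increasing in the user index (users sorted by ascending channel gain), which is the condition given in Remark 3, and the multi-round scheme is used otherwise. The function name `choose_algorithm` is hypothetical.

```python
def choose_algorithm(min_rates):
    """Pick the scheduler name for users sorted by ascending channel gain.

    Returns "CCRUSA" when min_rates[0] >= min_rates[1] >= ... holds
    (equal rates included), and "MRUSA" otherwise.
    """
    pairs = zip(min_rates, min_rates[1:])
    non_increasing = all(a >= b for a, b in pairs)
    return "CCRUSA" if non_increasing else "MRUSA"


choice_equal = choose_algorithm([3.0, 3.0, 3.0, 3.0])
choice_mixed = choose_algorithm([1.0, 3.0, 2.0])
```

The check costs only $M - 1$ comparisons, so it adds negligible overhead before either scheduler runs.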
In the implementation of the proposed schemes, the CSI overhead needs to be discussed. At the beginning of user scheduling, the BS needs to collect the CSI of all users on their sub-channels in the single NOMA cluster. Hence, the more users wish to access the single NOMA cluster, the more CSI overhead the system incurs. However, due to the additional computation caused by SIC at each user receiver, the number of accessible users has to be limited in each single NOMA cluster [6]. Therefore, the CSI overhead can be controlled by the BS in practice.

SIMULATION AND ANALYSIS
In this section, we evaluate the performance of the proposed user scheduling algorithms in a single-NOMA-cluster network by numerical simulations, where the BS is located at the cell centre and the users are randomly distributed within a circle of radius 300 m. The large-scale path loss is $L(d) = 37 + 30\log(d)$, where $d$ is in metres. The noise power is assumed to be $N = -104$ dBm. The individual minimum rate requirements are i.i.d. and uniformly distributed as $\bar{R}_m \sim \mathcal{U}(1, 7)$, for $m = 1, \ldots, M$.
Figure 1 depicts the cumulative distribution function (CDF) of the number of active users under the proposed user scheduling algorithm, labelled Proposed-USA, and the existing sum-rate-enhancing user scheduling algorithm in [9], labelled Existing-USA, both of which provide useful insights into user scheduling in a single NOMA cluster. For the sake of brevity, 10 users are considered in this simulation, which is repeated 10,000 times to obtain stable statistics. It is clearly observed that Proposed-USA supports more active users than Existing-USA; for example, for the $P_{\max} = 30$ dBm case, the numbers of active users under Existing-USA and Proposed-USA are concentrated around 3.3 and 5.7, respectively. In addition, the number of active users increases with $P_{\max}$, revealing the effectiveness of the mechanism in protecting the active users' QoS.

FIGURE 1 The CDF of the number of active users (curves: Proposed-USA and Existing-USA [9], each with $P_{\max}$ = 30 dBm and 20 dBm)

Figure 2 illustrates the sum-rate of the single NOMA cluster achieved by the proposed optimal user scheduling algorithm (Proposed-OUSA) and the random optimal user scheduling algorithm (Random-OUSA). It is clearly observed that, under the precondition of maximising the number of active users, Proposed-OUSA performs better than Random-OUSA in terms of the sum-rate at any given cluster power, except when there is only one candidate set of inactive users, in which case the two algorithms perform equally. For the same $N^*$, the performance gap between the two algorithms increases with the number of candidate sets of inactive users. The performance improvement of the proposed algorithm results from Stage 2 in Section 3.1, which further improves the sum-rate of the NOMA system. Moreover, at each changing point of the number of inactive users $N^*$, the sum-rate increases with $N^*$, because the proposed scheme unloads some users at that point, so that the remaining active users are allocated more resources and obtain higher transmission rates.

FIGURE 2 Sum-rate of the single NOMA cluster versus $P_{\max}$ (curves: Proposed-OUSA and Random-OUSA)

Figure 3 depicts the achieved individual rate versus the user minimum rate requirement in a four-user NOMA cluster. The sum-power is allocated to the users to meet their initial minimum rate requirements, which are set as, for example, $\bar{R}_1 = \bar{R}_2 = \bar{R}_3 = \bar{R}_4 = 3$ b/s/Hz in the simulations. The distances of the four users to the BS are set to 100, 80, 60 and 40 m, respectively. As illustrated in Figure 3(a)-(c), when the variable minimum rate requirement, denoted by $\bar{R}$, is less than 3 b/s/Hz, the achieved rate of user 4 decreases with $\bar{R}$, since more power is allocated to the other users.
Moreover, as illustrated in each panel of Figure 3, when $\bar{R}$ exceeds 3 b/s/Hz, the BS cannot support all four users, and user 1, which has the lowest channel gain, becomes inactive first to achieve an improved sum-rate, which appears as the rising rate of user 4. Further, when $\bar{R}$ for a user is large enough, there is insufficient sum-power to be allocated to that user (user 1, user 2, user 3 and user 4 in (a), (b), (c), and (d), respectively), and hence the proposed user scheduling scheme offloads the corresponding user and re-activates user 1 to sustain the maximum number of active users. It is worth noting that the proposed user scheduling scheme can offload any user to achieve the maximum number of active users, which reveals the fairness of the mechanism.
Figure 4 depicts the number of active users and the sum-rate versus $P_{\max}$ when the number of accessible users in a single NOMA cluster is limited to $Q$. It can be seen from Figure 4(a) that, by adding the mechanism for the hardware constraints on SIC proposed in Remark 2, the number of active users is restricted no matter how large $P_{\max}$ is. Thus, the error propagation in NOMA can be controlled in the proposed user scheduling algorithm. From Figure 4(b), since more power is allocated to the user with the higher channel gain when $Q$ is small, the sum-rate with a lower $Q$ is larger than that with a higher one once the restriction takes effect. However, the gap becomes smaller, because allocating more power to the user with the higher channel gain also raises the interference to the other users in the same NOMA cluster.
Finally, the computational complexities of the ES and the proposed MRUSA and CCRUSA are compared in Figure 5. It can be seen that, for any given number of users $M$, CCRUSA requires the fewest flops among the three algorithms, and the gaps between CCRUSA and the others increase significantly with the number of users in the single NOMA cluster. It is also worth noting that MRUSA achieves a significant reduction in computational complexity compared with ES when $M$ is large, implying that MRUSA is more applicable than ES in multi-user scenarios. Additionally, the difference in flops between ES and MRUSA decreases as the optimal number of inactive users increases, which results from the multi-round mechanism in MRUSA. Moreover, since CCRUSA obtains the final user scheduling solution without performing the sum-rate comparison, its computational complexity is the same for equal $N^*$.
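The simulation setup described at the start of this section can be approximated by the following sketch, which draws one random scenario. Several details are assumptions made only for illustration: the path loss is read as a dB value with a base-10 logarithm, distances are clipped to at least 1 m, no small-scale fading is applied, and the function name `draw_scenario` and its parameters are hypothetical.

```python
import math
import random


def draw_scenario(num_users, radius_m=300.0, noise_dbm=-104.0, seed=None):
    """Draw channel gains, minimum rates and noise power for one run.

    Returns (gains, min_rates, noise_mw) with users sorted by ascending
    channel gain. Gains are linear; noise power is in milliwatts.
    """
    rng = random.Random(seed)
    users = []
    for _ in range(num_users):
        d = radius_m * math.sqrt(rng.random())  # uniform over the disc area
        d = max(d, 1.0)                          # assumed guard against log(0)
        path_loss_db = 37.0 + 30.0 * math.log10(d)
        gain = 10.0 ** (-path_loss_db / 10.0)
        min_rate = rng.uniform(1.0, 7.0)         # b/s/Hz
        users.append((gain, min_rate))
    users.sort(key=lambda u: u[0])               # ascending channel gain
    gains = [g for g, _ in users]
    min_rates = [r for _, r in users]
    noise_mw = 10.0 ** (noise_dbm / 10.0)
    return gains, min_rates, noise_mw


gains, min_rates, noise_mw = draw_scenario(10, seed=1)
```

With these units, a power budget given in dBm would be converted to milliwatts, for example `10 ** (30 / 10)` for 30 dBm, before being compared with the required power.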

CONCLUSIONS
Here, we have investigated user scheduling for maximising the number of active users while ensuring their individual minimum rate requirements in a single NOMA cluster. An MRUSA is first proposed for general practical environments. To further reduce the computational complexity, a CCRUSA is then introduced under some practical conditions. Simulation results show that the proposed user scheduling algorithms achieve the maximum number of active users and significantly improve the network sum-rate when the maximum number of active users is achieved. In addition, the proposed user scheduling algorithm is fair and can adapt to the hardware constraints on SIC. Moreover, the number of flops required by the proposed computational-complexity reduced user scheduling algorithm is much lower than that of the proposed optimal user scheduling algorithm.
, the lower bound of $P_{\max}$ for this case can be obtained similarly to (14). The feasible condition can be obtained similarly to that of problem (15), and is thus omitted.
It can be concluded that the feasible condition P is the upper bound of $P_{\max}$ for problem (18). Now, the whole range of $P_{\max}$ has been decomposed into $2^M$ feasible conditions, and for each condition the feasible points of problem (5), that is, the optimal candidates of inactive users, are listed in Table 1.
The proof of the lemma is now complete.

A.2 Appendix 2
From Lemma 1, the feasible condition $P^{\mathcal{N}^{k}}_{\max} \le P_{\max} < P^{\mathcal{N}^{k-1}}_{\max}$ leads to multiple feasible points of the original problem, for example, $\mathcal{N}^{k}, \ldots, \mathcal{N}^{C_M^N}$. According to formula (16) in [8], the maximum sum-rate is obtained once the active users are given. Without loss of generality, we assume that in $\mathcal{N}^{i}$ the $n$ users with the highest channel gains are inactive, user $(M - n)$ is active, and the other active users are arbitrarily placed.
Thus, for $\forall n \in \{0, 1, \ldots, N\}$ and $\forall i \in \{k, \ldots, C_M^N\}$, the optimal sum-rate and the associated active user $J_i$ can be obtained, and $J_i$ can be rewritten accordingly. By combining (20) with (21), $R^{\mathcal{N}^{i}}_{n}$ can be given as in (8). It can be seen from (8) that, for a given $P_{\max}$, different feasible points of user scheduling result in different optimal sum-rates, while any of them, taken as the scheduled inactive users, yields the same maximum number of active users.
To pursue better performance in terms of transmission rate, the feasible point with the maximum sum-rate should be chosen as the final set of inactive users, as formulated by (7).
The proof of the lemma is now complete.

A.3 Appendix 3
To demonstrate Theorem 1, we first prove $P^{\{m\}_{m=1}^{N}}_{\max} = \min\{P^{\mathcal{N}}_{\max}\}$. Without loss of generality, the $N$ inactive users $\mathcal{N}$ can be denoted by $\{L_m\}_{m=1}^{N}$, which follows a general relationship. Start with the variable defined in (A.12). Then, substituting (6) into (A.12) and manipulating it yields $\Delta\mathbb{P}$ (see (24)), in which every component of the fourth term is non-negative. Therefore, $\Delta\mathbb{P}^{(1)}$, a reduced version of $\Delta\mathbb{P}$ (see (25)), is constructed, with $\Delta\mathbb{P} \ge \Delta\mathbb{P}^{(1)}$.
In the same way, the first two terms in $\Delta\mathbb{P}^{(1)}$ are checked to be non-negative, which leads to $\Delta\mathbb{P}^{(2)}$, a reduced version of $\Delta\mathbb{P}^{(1)}$, that is, $\Delta\mathbb{P}^{(1)} \ge \Delta\mathbb{P}^{(2)}$, in which the expression of $\Delta\mathbb{P}^{(2)}$ is omitted due to space limitations. After $L_N - L_{(K+1)} + 1$ such operations, the final expression is obtained, which establishes $P^{\{m\}_{m=1}^{N}}_{\max} = \min\{P^{\mathcal{N}}_{\max}\}$. It can be seen that the first three terms in $\Xi^{(1)}$ are non-negative, resulting in $\Xi^{(2)}$, a reduced version of $\Xi^{(1)}$, that is, $\Xi^{(1)} \ge \Xi^{(2)}$, where $\Xi^{(2)}$ is shown in (30).
Similarly, the first term in $\Xi^{(2)}$ is also non-negative. Thus, $\Xi^{(3)}$, a reduced version of $\Xi^{(2)}$, is constructed with $\Xi^{(2)} \ge \Xi^{(3)}$. The expression of $\Xi^{(3)}$ is omitted due to space limitations. Consequently, after $N - n - K$ such operations, we can obtain the following expression as where