A two-stage method for bus passenger load prediction using automatic passenger counting data

In high-frequency transit, providing real-time crowding information (RTCI) is a potential way to promote passenger satisfaction and reduce negative crowding externalities, by assisting passengers in choosing less crowded vehicles. To make RTCI convincing and reliable, it is necessary to provide predictive RTCI, in which bus passenger load (BPL) prediction is the primary problem. This paper proposes a novel two-stage BPL prediction method using automatic passenger counting (APC) data. The first stage is to predict short-term passenger flows at stops based on an adaptive Kalman filter approach. Using the outputs from the first stage as well as other variables directly from APC data, the second stage is to predict BPL based on a support vector regression algorithm. Several methods from the existing literature are used as benchmarks to test the relative performance of the proposed method. An empirical study on bus line 1 in Suzhou, China shows that the proposed method outperforms all the benchmarks, and shows significant superiority over other methods for stops with sharp increases in BPL and for multi-step ahead prediction. This study contributes to the limited literature on BPL prediction and lays the foundation for providing accurate and reliable predictive RTCI in the future.

a feasible and effective way to improve the passenger travel experience in high-frequency transit. With the help of RTCI, passengers can make better-informed decisions about whether to board a bus based on their tolerance for crowding, waiting time, etc. Our previous study [4] indicates that providing RTCI for crowded bus routes would not only equalize crowding among bus trips but also help prevent bus bunching, because passengers shifting from crowded vehicles to less crowded vehicles act as a kind of holding tool.
From this point of view, RTCI is expected to be welcomed by, and helpful for, both passengers and transit agencies. However, RTCI that presents only current bus crowding conditions is sometimes not enough to support passengers in making correct decisions. For instance, a bus may currently have available seats but become full or crowded by the time it arrives at the target stop after serving several stops; passengers at the target stop may then wait longer for this bus and still not get a seat. In this case, passengers would feel that RTCI is not trustworthy, and the effect of RTCI would be greatly reduced. Predictive RTCI that forecasts the bus crowding condition at the target stop is therefore more reliable for passengers. Providing predictive RTCI generally requires that passenger loads be predicted several stops ahead of the current bus locations. This paper focuses on the bus passenger load (BPL) prediction problem (illustrated in Figure 1).
FIGURE 1 An illustration of the bus passenger load prediction problem in high-frequency transit

Literature review
Short-term passenger flow prediction has been a classical and popular research topic over the years. Generally, the approaches can be classified into parametric and non-parametric methods [5]. Among parametric methods, regression analysis [6], the autoregressive integrated moving average (ARIMA) model [7], and the generalized autoregressive conditional heteroscedasticity (GARCH) model [8] have been applied successfully. Non-parametric methods include artificial neural network (ANN) [9], support vector machine (SVM) [10], Kalman filter [11], and random forest [12]. The fundamental idea of these models is to construct a nonlinear relationship between input and output variables without a priori knowledge. In recent years, multi-pattern models combined with time series or neural networks have attracted the attention of many scholars [13][14][15]. Gong, Fei [16] proposed a framework to predict the number of waiting passengers at a bus stop, which has similarities with BPL prediction. In the framework, the numbers of arriving and departing passengers are predicted separately. The number of waiting passengers is then calculated, and a Kalman filter approach is developed to minimize the estimation inaccuracies.
There are far fewer studies on passenger load prediction in public transit, and most of them focus on rail transit. For instance, Noursalehi [17] proposed a framework for urban rail crowding prediction and information provision based on automatic fare collection (AFC) data. The framework conducted data-driven prediction of origin-destination (O-D) passenger flows using random forests and boosted regression trees, with on-line simulation of supply and demand interactions. Khomchuk, Tuladhar [18] proposed a Bayesian approach to predicting the passenger loads in individual train cars downstream of their current locations based on APC data, in which passenger O-D patterns are assumed to follow a Poisson prior distribution with parameters estimated from historical data. Pasini, Khouadjia [19] and Hu, Chiu [20] both applied a long short-term memory (LSTM) neural network to predict train loads based on temporal features (e.g., day-of-week, time-of-day) and load measurements from the previous train in the current day. Jenelius [21] proposed a prediction framework that uses historical and real-time APC data to predict individual train car loads and crowding levels based on several regression models (stepwise linear regression, lasso regression and boosted regression tree ensembles). An empirical study on a metro line in Stockholm, Sweden shows that accurate RTCI can be provided long before the trains arrive, and that the three regression models perform similarly.
Passenger load prediction in bus systems is considered more difficult than in rail transit for two reasons. First, the ridership flows in bus systems, including boarding, alighting, and onboard passenger numbers, are generally much smaller than those in rail transit and are therefore more volatile, which makes it difficult to capture the flow patterns [11]. Second, in contrast to the regular headways in rail transit, the running conditions for buses are unstable and easily affected by the external environment [22].
A few studies have focused on the BPL prediction problem in recent years. Zhang, Shen [23] built a framework to predict passenger load using AFC data, where the first step involves identifying the historical day most similar to the current day and predicting downstream loads based on observed loads from that historical day. In the second step, the passenger loads are updated by combining real-time and historical data in an extended Kalman filter. Since the AFC data only record passengers' boarding stops, the prediction was based on estimated passenger loads, and no ground truth data were available to validate the prediction performance. He, Yan [24] conducted BPL prediction to realize predictive air-conditioner control on electric buses. Using manually collected load data, they employed a relatively simple prediction scheme that assumes a relationship between loads at target stops and loads at previous stops. Three methods were adopted for this scheme: Monte Carlo, radial basis function neural network (RBF-NN), and Markov chain. Simulation results show that all three methods, when integrated within a predictive air-conditioner control framework, achieve more stable temperature performance and lower energy consumption. Notably, Jenelius [25] applied a framework similar to that of Jenelius [21] to BPL prediction using a lasso regression model. Although the framework in Jenelius [25] was extended to incorporate real-time automatic vehicle location (AVL) data, the results show that the prediction accuracy for the bus system is far behind that for trains, indicating that predicting passenger loads in bus systems is a more challenging task.
FIGURE 2 The framework of the proposed two-stage BPL prediction method

Summary
Predicting passenger loads in high-frequency bus systems is a challenging task due to the significant variability of passenger flows at stops and of bus running conditions. Across the passenger load prediction problems for both rail transit and bus transit, previous studies provide valuable knowledge about the ability to predict passenger loads with different prediction techniques. However, several research gaps can be identified. First, most existing studies turn passenger load prediction into a time series problem and only use historical or current load data as predictors [19,20,23,24]. More automatically collected data in bus systems, such as AVL data and APC data, and deeper mining methods based on these datasets are considered useful for improving BPL prediction performance. Second, few studies have tackled the BPL prediction problem in an empirical context; thus, the relative performance of existing methods has not been tested convincingly.
In this paper, we propose a two-stage BPL prediction method using APC data (including bus arrival time information as well). The first stage is to predict short-term passenger flows at stops based on an adaptive Kalman filter approach. Using the outputs from the first stage as well as other variables, the second stage is to predict BPL based on an SVR algorithm. The main contributions of this paper include:
1. A novel two-stage BPL prediction method using APC data is proposed. By dividing the prediction task into two stages, the proposed method can better extract passenger flow patterns from APC data and thus build a more effective model to predict BPL.
2. A performance comparison between the proposed method and several state-of-the-art methods is conducted on a real bus line. The results show that the proposed two-stage method outperforms all the benchmarks, particularly for multi-step ahead prediction.
3. The performance of the adaptive Kalman filter approach on passenger flow prediction at bus stops is presented, and the effectiveness of the SVR algorithm in the proposed method is analysed.
The remainder of this paper is organized as follows. Section 2 presents the two-stage BPL prediction method in detail. Section 3 introduces the case study, including the information regarding the studied bus line and APC data, and the benchmarks for comparison. Section 4 presents the performance of different methods in both single-step and multi-step ahead predictions. Section 5 concludes the work and elaborates on future directions.

METHODOLOGY
The framework of the proposed two-stage BPL prediction method is shown in Figure 2. The method is based on real-time APC data. In the preliminary stage, the APC data are turned into boarding, alighting, and section passenger flows within fixed time intervals at each stop through a homogenization step. In stage 1, we forecast boarding, alighting and section passenger flows at subsequent time steps by an adaptive Kalman filter approach. In stage 2, for every single trip, the predictive passenger flow predictors and real-time APC predictors are combined to predict the passenger loads at the subsequent stops using support vector regression. In summary, stage 1 is passenger flow prediction at stop level and stage 2 is BPL prediction at vehicle level. In the following text, we use the term boarding/alighting number/flow to represent boarding/alighting passenger number/flow for simplicity.
To turn the series of boarding passenger numbers $\{b_{m,s}\}$ of buses $m \in M$ into passenger flows within fixed time intervals at stop $s$, a homogenization step is conducted as follows. First, the passenger number $b_{m,s}$ is uniformly allocated across the headway between bus $m$ and the previous bus $m-1$, so that each passenger $i$ among the $b_{m,s}$ passengers receives a virtual timestamp $\tau^{i}_{m,s}$ within that headway. The boarding flow at stop $s$ is then calculated by aggregating the virtual timestamps into fixed time intervals. An interval of $I_0 = 15$ min is adopted in this study. Let $\Omega_s$ be the set of virtual timestamps $\tau^{i}_{m,s}$ for all passengers $i$ and all buses $m \in M$ at stop $s$. The boarding flow $f^{b}_{s,t}$ of stop $s$ at time step $t$ is the number of virtual timestamps in $\Omega_s$ that fall within the $t$-th interval. The alighting flow $f^{a}_{s}$ and section flow $f^{l}_{s}$ at any stop $s$ are obtained in the same way by replacing $b_{m,s}$ with $a_{m,s}$ and $l_{m,s}$, respectively. Figure 3 illustrates the homogenization step in a more intuitive way.
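The homogenization step can be illustrated with a minimal Python sketch. The function names, the use of minutes as the time unit, the exact even spacing of timestamps (passenger $i$ placed at fraction $i/b_{m,s}$ of the headway), and the convention that time step $t$ covers the half-open interval from $I_0(t-1)$ to $I_0 t$ are all assumptions made for illustration, not details confirmed by the text.

```python
import math
from collections import Counter


def virtual_timestamps(prev_arrival, arrival, count):
    """Spread `count` passengers evenly over the headway (prev_arrival, arrival].

    Assumption: passenger i (1-based) is placed at fraction i/count of the headway.
    """
    headway = arrival - prev_arrival
    return [prev_arrival + headway * i / count for i in range(1, count + 1)]


def flows_per_interval(arrivals, counts, interval=15.0):
    """Aggregate virtual timestamps into fixed intervals.

    arrivals: arrival times (minutes) of consecutive buses at one stop.
    counts:   passengers counted on each of those buses at that stop.
    Returns {time_step: passengers}, where step t covers
    (interval * (t - 1), interval * t].
    The first bus has no preceding headway here and is skipped.
    """
    flows = Counter()
    for prev, cur, n in zip(arrivals, arrivals[1:], counts[1:]):
        for ts in virtual_timestamps(prev, cur, n):
            flows[math.ceil(ts / interval)] += 1
    return dict(flows)


# Hypothetical data: four buses arriving 10 min apart.
example = flows_per_interval([0, 10, 20, 30], [0, 4, 2, 6])
```

The same function applies to alighting counts and passenger loads by passing those series as `counts`.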

Passenger flow prediction based on adaptive Kalman filter
Let $f_{s,t}$ generally denote the (boarding, alighting, or section) flow of stop $s$ at time step $t$. This subsection aims to predict the passenger flows in the subsequent time steps $\{f_{s,t}, f_{s,t+1}, \dots\}$ given the current flow series $\{f_{s,t-1}, f_{s,t-2}, \dots\}$ and historical flow patterns, based on an adaptive Kalman filter. The adaptive Kalman filter approach has been successfully applied in traffic forecasting and shows improved adaptability when traffic or passenger flow is highly volatile [11]. In this paper, the boarding, alighting, and section flows at bus stops fluctuate strongly, so the adaptive Kalman filter approach is considered appropriate for predicting them. In this subsection, we fix the target stop $s$ and omit the index for simplicity of notation.
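Consider the following state-space model. We assume that the state transition equation mapping the passenger flow $f$ from $t-1$ to $t$ is given by:
$$f_t = f_{t-1} + \Delta_t + w_t \tag{4}$$
The observation equation is given by:
$$y_t = f_t + \nu_t \tag{5}$$
where $\Delta_t$ is a priori knowledge that can be obtained from historical flow patterns; $y_t$ is the measurement value of $f_t$; and $w_t \sim N(q_t, Q_t)$ and $\nu_t \sim N(0, R_t)$ are the process noise and observation noise, respectively.
The conventional Kalman filter algorithm can be divided into two phases, as described in Equations (6)-(10).
Predict phase: state propagation and prior state estimation error covariance estimation.
$$\hat{f}_{t|t-1} = \hat{f}_{t-1|t-1} + \Delta_t + q_t \tag{6}$$
$$P_{t|t-1} = P_{t-1|t-1} + Q_t \tag{7}$$
Update phase: Kalman gain computation, posterior state estimation, and posterior state estimation error covariance estimation.
$$K_t = P_{t|t-1}\left(P_{t|t-1} + R_t\right)^{-1} \tag{8}$$
$$\hat{f}_{t|t} = \hat{f}_{t|t-1} + K_t\left(y_t - \hat{f}_{t|t-1}\right) \tag{9}$$
$$P_{t|t} = \left(1 - K_t\right)P_{t|t-1} \tag{10}$$
where $\hat{f}_{t|t-1}$ and $P_{t|t-1}$ are the prior (predicted) estimate of passenger flow at time step $t$ and its error covariance; $\hat{f}_{t|t}$ and $P_{t|t}$ are the posterior (updated) estimate of passenger flow at time step $t$ and its error covariance; and $K_t$ is the Kalman gain at time step $t$.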
For heteroscedastic passenger flow series, an adaptation mechanism that updates the parameters of the process noise and observation noise is preferred; the resulting filter is known as the adaptive Kalman filter [26]. In this study, the variance $R_t$ of the observation noise and the mean $q_t$ and variance $Q_t$ of the process noise are estimated by using a memory of observation errors and state estimation errors, as described in Equation (11).
In Equation (11), the observation error and the state estimation error at time step $t$ are retained over a prescribed memory size of the adaptive Kalman filter recursion, and $r$ and $q$ are unbiased estimates of the means of the observation errors and the state estimation errors, respectively.
For multi-step ahead prediction, since there is no measurement to update the posterior state, we assume the posterior estimation of flow is the same as the prior estimation in prediction time steps, while the parameters of process noises and observation noises stop updating. Using the adaptive Kalman filter approach, we can obtain the predicted boarding, alighting and section flows in the subsequent time steps for any stop.
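The recursion described in this subsection can be sketched in Python for a scalar flow series. This is a simplified illustration under stated assumptions, not the paper's exact Equation (11): the noise statistics are re-estimated here as plain sample means and variances of the last `memory` innovations and state corrections, and the function name, initial covariances, and variance floor are hypothetical choices.

```python
from statistics import mean, variance


def adaptive_kalman_forecast(measurements, deltas, memory=4,
                             p0=1.0, q0=1.0, r0=1.0):
    """One-step-ahead forecasts for f_t = f_{t-1} + delta_t + w_t, y_t = f_t + v_t.

    measurements[t] is the observed flow y_t; deltas[t] is the historical
    change from step t-1 to t (deltas[0] is unused).
    Returns the prior estimates for steps 1..len(measurements)-1.
    """
    f_post, p_post = measurements[0], p0
    q_mean, q_var, r_var = 0.0, q0, r0
    innovations, corrections, forecasts = [], [], []

    for y, delta in zip(measurements[1:], deltas[1:]):
        # Predict phase: propagate the state and its error covariance.
        f_prior = f_post + delta + q_mean
        p_prior = p_post + q_var
        forecasts.append(f_prior)

        # Update phase: gain, posterior state, posterior covariance.
        gain = p_prior / (p_prior + r_var)
        innovation = y - f_prior
        f_post = f_prior + gain * innovation
        p_post = (1.0 - gain) * p_prior

        # Adaptation (simplified assumption): windowed sample statistics.
        innovations = (innovations + [innovation])[-memory:]
        corrections = (corrections + [gain * innovation])[-memory:]
        if len(innovations) >= 2:
            r_var = max(variance(innovations), 1e-6)
            q_mean = mean(corrections)
            q_var = max(variance(corrections), 1e-6)

    return forecasts


# Hypothetical series that follows the historical pattern exactly.
print(adaptive_kalman_forecast([5, 7, 10, 14], [0, 2, 3, 4]))
```

For multi-step ahead prediction, the same predict phase would simply be repeated without the update and adaptation steps, in line with the treatment described above.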

Bus passenger load prediction based on support vector regression

Predictors
Let $h_{m,s}$ denote the headway of bus $m$ from the preceding bus at stop $s$; that is:
$$h_{m,s} = T_{m,s} - T_{m-1,s} \tag{12}$$
where $T_{m,s}$ is the arrival time of bus $m$ at stop $s$. The corresponding time window is expressed as $\Gamma_{m,s} = (T_{m-1,s}, T_{m,s}]$. When predicting the passenger load of bus $m$ at stop $s$ from stop $s-1$, we assume the headway remains unchanged from stop $s-1$ to stop $s$; thus, the estimated time window $\Gamma_{m,s|s-1}$ is obtained by applying the headway $h_{m,s-1}$ observed at stop $s-1$ to stop $s$ (Equation (13)). If $\Gamma_{m,s|s-1}$ falls entirely within a 15-min interval $(I_0(t^*-1), I_0 t^*]$, the predicted passenger flow $\hat{f}_s(\Gamma_{m,s|s-1})$ on $\Gamma_{m,s|s-1}$ takes the exact value of $\hat{f}_{s,t^*}$. Otherwise, $\Gamma_{m,s|s-1}$ spans two consecutive intervals, and $\hat{f}_s(\Gamma_{m,s|s-1})$ takes the weighted average of the two consecutive predicted passenger flows according to their shares in $\Gamma_{m,s|s-1}$:
$$\hat{f}_s(\Gamma_{m,s|s-1}) = \alpha\,\hat{f}_{s,t^*} + (1-\alpha)\,\hat{f}_{s,t^*-1} \tag{14}$$
where $t^*$ is the upper-bound time step of $\Gamma_{m,s|s-1}$, and $\alpha$ is the proportion of $\Gamma_{m,s|s-1}$ that time step $t^*$ occupies. Through Equations (13) and (14), we can obtain the predicted boarding, alighting and section flows on given time windows. The predictive passenger flow predictor $F_{m,s|s-1}$ used for BPL prediction collects these three predicted flows (Equation (15)).
The real-time APC data are also important for BPL prediction. The current load $l_{m,s-1}$ and current headway $h_{m,s-1}$ are directly related to the passenger load at the next stop. Moreover, the passenger loads and headways of the target bus at the previous two stops may reveal variation trends in passenger loads or headways, and are selected as explanatory variables as well. The real-time APC predictor $APC_{m,s|s-1}$ is expressed as:
$$APC_{m,s|s-1} = \left[l_{m,s-1},\, l_{m,s-2},\, l_{m,s-3},\, h_{m,s-1},\, h_{m,s-2},\, h_{m,s-3}\right] \tag{16}$$
Combining $F_{m,s|s-1}$ and $APC_{m,s|s-1}$ and omitting the bus index $m$ for simplicity of notation, the single-step ahead BPL prediction model aims to generalize a relationship of the following form:
$$\hat{l}_{s|s-1} = g\left(F_{s|s-1},\, APC_{s|s-1}\right) \tag{17}$$
where $\hat{l}_{s|s-1}$ is the predicted passenger load at stop $s$ based on stop $s-1$.
For multi-step ahead prediction, namely predicting stop $s$ from stop $s_0$ ($s_0 = s-2, s-3, \dots$), the predicted stops (i.e., stops $s_0+1, \dots, s$) are regarded as one whole stop. The predicted passenger flow of the whole stop, denoted as $\hat{f}_{s_0 s}$, is the average of the predicted flows of these individual stops. The multi-step ahead BPL prediction model then takes a form analogous to Equation (17), with the averaged flows and the real-time APC data available at stop $s_0$ as explanatory variables.
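The window-based flow lookup described above (exact value when the window sits inside one 15-min interval, weighted average when it spans two) can be sketched as follows. The function name, the dictionary layout of the predicted flows, and the half-open interval convention are assumptions for illustration; the sketch also assumes the window is no longer than one interval, which matches a 5-8 min headway.

```python
import math


def flow_on_window(pred_flows, start, end, interval=15.0):
    """Predicted flow on the time window (start, end], in minutes.

    pred_flows[t] is the predicted flow for time step t, which is assumed
    to cover (interval * (t - 1), interval * t].
    Assumes end - start <= interval, so at most two steps are involved.
    """
    t_star = math.ceil(end / interval)      # upper-bound time step
    lower = interval * (t_star - 1)         # start of step t_star
    if start >= lower:
        # Window lies entirely inside step t_star.
        return pred_flows[t_star]
    # Share of the window that falls in step t_star.
    alpha = (end - lower) / (end - start)
    return alpha * pred_flows[t_star] + (1.0 - alpha) * pred_flows[t_star - 1]


# Hypothetical predicted boarding flows for two consecutive steps.
flows = {1: 10.0, 2: 20.0}
inside = flow_on_window(flows, 16.0, 24.0)    # fully in step 2
spanning = flow_on_window(flows, 10.0, 20.0)  # half in step 1, half in step 2
```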

Support vector regression
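Support vector regression (SVR) is applied to develop the BPL prediction model. SVR is the counterpart of SVM for regression problems. SVR can capture complex input-output relationships in non-linear systems by mapping the input vectors into a high-dimensional space. The objective of SVR is to minimize the L2-norm of the coefficient vector with slack variables involved, i.e.:
$$\min_{\omega,\, b,\, \xi,\, \xi^*} \;\; \frac{1}{2}\lVert\omega\rVert^2 + C\sum_{i}\left(\xi_i + \xi_i^*\right) \tag{20}$$
subject to:
$$y_i - \omega^{\top}\phi(x_i) - b \le \varepsilon + \xi_i, \qquad \omega^{\top}\phi(x_i) + b - y_i \le \varepsilon + \xi_i^*, \qquad \xi_i,\ \xi_i^* \ge 0 \tag{21}$$
where $\omega$ is the coefficient vector, $b$ is the bias term, $\phi(\cdot)$ is the mapping into the high-dimensional space, $C$ is the regularization constant, $\varepsilon$ is the tube size, $\xi_i$ and $\xi_i^*$ are slack variables, $y_i$ is the dependent variable, i.e. the passenger load $l_s$ at the target stop, and $x_i$ is the vector of explanatory variables in Equation (17).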
By solving the dual of Equation (20) and introducing the kernel function, the primal problem is turned into a linearly constrained quadratic programming problem, indicating that the solution of SVR is always unique and globally optimal [27]. A main drawback of SVR is its long training time when dealing with large datasets and samples with many features. However, this issue is not influential in our prediction framework since the dataset and sample features are relatively small. These fea-
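To make the ingredients of SVR concrete, the sketch below shows the RBF kernel, the $\varepsilon$-insensitive loss implied by the tube size, and the kernel expansion used for prediction once the dual problem has been solved. It does not train a model: the support vectors, dual coefficients, and bias in the usage lines are made-up values, and the kernel parameterisation $\exp(-\gamma\lVert x - z\rVert^2)$ is one common convention assumed here.

```python
import math


def rbf_kernel(x, z, gamma):
    """RBF kernel exp(-gamma * ||x - z||^2) between two feature vectors."""
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(x, z)))


def epsilon_insensitive_loss(y_true, y_pred, epsilon):
    """Errors inside the tube of half-width epsilon cost nothing."""
    return max(0.0, abs(y_true - y_pred) - epsilon)


def svr_predict(x, support_vectors, dual_coefs, bias, gamma):
    """Kernel expansion: sum_i coef_i * K(sv_i, x) + bias."""
    return sum(c * rbf_kernel(sv, x, gamma)
               for sv, c in zip(support_vectors, dual_coefs)) + bias


# Hypothetical fitted quantities, for illustration only.
svs = [[1.0, 2.0]]
coefs = [2.0]
pred = svr_predict([1.0, 2.0], svs, coefs, bias=1.0, gamma=0.5)
```

In practice a library implementation would be used for training; the point here is only how $C$, the tube size, and the kernel scale enter the model.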

Bus line and APC data
The proposed two-stage BPL prediction framework is applied to bus line 1 in Suzhou, China. The geographic layout of line 1 is shown in Figure 4. The studied direction is from north to south. Line 1 is 11.2 km long and has 22 stops in the studied direction. The typical run time from start to end in one direction is around 45 min. Line 1 provides a frequent service; the departure interval is 5 min in peak periods and 8 min in off-peak periods. The fleet serving line 1 consists of medium-sized buses with 22 seats, all of which are equipped with the APC system. The APC system records the number of boarding passengers, the number of alighting passengers, and the bus arrival time for each bus trip at each stop. APC data from weekdays between 30 July and 14 August 2018 are used in this study. After data cleaning and preprocessing, we obtained 1804 bus trips over 12 days for the studied direction. Figure 5(a) plots the average passenger load, number of boarding passengers, and number of alighting passengers among all bus trips at each stop. As shown in Figure 5(a), the segments from stop 5 to stop 7 (denoted as segment 1) and from stop 13 to stop 15 (denoted as segment 2) have more boarding or alighting passengers than any other stops except the two terminal stops; the variations in passenger load on these two segments are pronounced as well. Therefore, segment 1 and segment 2 are selected as the studied segments, since it is more challenging to predict BPL on them; segment 1 represents a boarding-dominant segment, and segment 2 represents an alighting-dominant segment. Figure 5(b) plots the passenger load variation with time at stop 7. The 10th percentile, median, and 90th percentile passenger loads are calculated among all bus trips in the dataset within each 30-min interval. Figure 5(b) indicates that there is significant variability in passenger loads between different bus trips. It also shows that there is an obvious evening peak in passenger load on line 1.

Model fitting
Data from the first two weeks (30 July-10 August, 10 days) are used as the historical/training set to derive passenger flow patterns and to train the prediction model. Data from the last two days (13, 14 August) are used as the test set.
In the stage of passenger flow prediction, we calculate the average passenger flows $\bar{f}_{s,t}$ based on the historical set. Then, $\Delta_t$ in Equation (4) is obtained by calculating the first-order difference of the flow series $\{\dots, \bar{f}_{s,t-1}, \bar{f}_{s,t}, \dots\}$ for each stop $s$. We apply the adaptive Kalman filter approach to predict the boarding, alighting and section flows for stops 3-7 and stops 11-15 in both the historical set and the test set. The memory size in the adaptive Kalman filter recursion is set to 4.
In the stage of BPL prediction, for any target stop, each bus trip is turned into a sample. The samples in the first two weeks are used as training data, and the samples in the last two days are used as test data. Thus, there are 1501 samples in the training set and 303 samples in the test set for each target stop. The RBF kernel function is selected for the SVR algorithm in this study. A fivefold cross-validation approach is used to train the model. Three parameters need to be determined when using RBF kernels in SVR, namely the regularization constant $C$, the tube size $\varepsilon$, and the scale parameter $\gamma$. A grid search is used to select the optimal $C$, $\varepsilon$ and $\gamma$.

Benchmarks
Four benchmarks are used to test the relative performance of the proposed framework: a linear model based on predicted flows, a one-step forecast, and two BPL prediction models from the existing literature [23,25].

Linear model based on predicted flows
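In contrast to the SVR algorithm, the linear model in this study assumes a specific functional form between the dependent variable and the explanatory variables in Equation (17). The predicted number of boarding passengers $\hat{b}_{s|s-1}$ is estimated by:
$$\hat{b}_{s|s-1} = \hat{f}^{b}_{s}\,\frac{h_{s-1}}{I_0} \tag{22}$$
where $\hat{f}^{b}_{s}$ is the predicted boarding flow on the estimated time window, $h_{s-1}$ is the current headway, and $I_0 = 15$ min is the length of the time intervals. For the predicted number of alighting passengers $\hat{a}_{s|s-1}$, it is assumed that the proportional relationship between $a_s$ and $l_{s-1}$ is approximately equal to that between $\hat{f}^{a}_{s,t}$ and $\hat{f}^{l}_{s-1,t}$; thus, $\hat{a}_{s|s-1}$ is given by:
$$\hat{a}_{s|s-1} = l_{s-1}\,\frac{\hat{f}^{a}_{s,t}}{\hat{f}^{l}_{s-1,t}} \tag{23}$$
In total, the predicted passenger load given by the linear model is expressed as:
$$\hat{l}_{s|s-1} = l_{s-1} + \hat{b}_{s|s-1} - \hat{a}_{s|s-1} \tag{24}$$
The reason for introducing the linear model as a benchmark is to investigate the effects of the SVR algorithm and the auxiliary variables ($l_{s-2}$, $l_{s-3}$, $h_{s-2}$, $h_{s-3}$) in Equation (17).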
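One reading of the linear benchmark described above can be sketched in a few lines of Python. The function and argument names are hypothetical, and the guard against a zero section flow is an added assumption not discussed in the text.

```python
def linear_load_forecast(load_prev, headway_prev, board_flow,
                         alight_flow, section_flow_prev, interval=15.0):
    """Load at stop s predicted from stop s-1 by a flow-balance rule.

    load_prev:         observed load l_{s-1} when leaving stop s-1
    headway_prev:      observed headway h_{s-1} in minutes
    board_flow:        predicted boarding flow at stop s (per interval)
    alight_flow:       predicted alighting flow at stop s (per interval)
    section_flow_prev: predicted section flow at stop s-1 (per interval)
    """
    # Boardings: flow per interval scaled to the length of the headway.
    boardings = board_flow * headway_prev / interval
    # Alightings: same share of the load as alighting flow is of section flow.
    share = alight_flow / section_flow_prev if section_flow_prev > 0 else 0.0
    alightings = load_prev * share
    return load_prev + boardings - alightings


# Hypothetical inputs: 20 passengers on board, 5-min headway.
forecast = linear_load_forecast(20, 5.0, 12.0, 6.0, 60.0)
```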

One-step forecast
The one-step forecast takes the future load $\hat{l}_s$ to be the observed value $l_{s-1}$ at the previous step. It is one of the simplest prediction approaches and serves as a baseline for comparison.

Two-step extended Kalman filter (2S-EKF) model
Zhang, Shen [23] proposed a 2S-EKF model for the BPL prediction problem. The first step is to search the historical data to find the passenger load matrix $L^{hist}$ that is most similar to the current load matrix $L$. The similarity $S$ is defined in Equation (25), which uses the element-wise product operator $\odot$.
Then, we obtain a passenger load sequence $\{u^*_1, u^*_2, \dots, u^*_s\}$ from the most similar historical matrix $L^{hist,*}$. The second step is to predict the passenger load using an extended Kalman filter, in which the state transition function includes Gaussian white noise $w_s$ (see [23] for the full formulation).

Lasso regression
Jenelius [25] proposed a BPL prediction model based on both real-time and historical APC data using lasso regression. The vector of potential predictors $x_{m,s|s-1}$ consists of three predictors, which are based on historical load data, real-time AVL data, and real-time APC data, respectively. The prediction model is assumed to be linear in its coefficients. A lasso regression is then applied for variable selection and parameter estimation by minimizing:
$$\sum_{n}\Big(l_n - \beta_0 - \sum_{i}\beta_i x_{n,i}\Big)^2 + \lambda\sum_{i}\lvert\beta_i\rvert \tag{28}$$
where $\beta_0$ and $\beta_i$ are parameters to be estimated, and $\lambda$ is a regularization coefficient that penalizes large parameter values. The predicted passenger load is calculated from the estimated $\beta_0$ and $\beta_i$. The linear model, one-step forecast, and lasso regression can easily be extended to multi-step ahead prediction.
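As a small illustration of the lasso criterion, the following sketch evaluates the penalized objective for a given coefficient vector. It only computes the quantity being minimized and does not perform the minimization; the function name and the toy data are hypothetical.

```python
def lasso_objective(X, y, beta0, beta, lam):
    """Sum of squared residuals plus an L1 penalty on the slope coefficients.

    X: list of predictor vectors, y: list of observed loads,
    beta0: intercept (not penalized), beta: slope coefficients,
    lam: regularization coefficient.
    """
    squared_error = sum(
        (yi - beta0 - sum(b * x for b, x in zip(beta, xi))) ** 2
        for xi, yi in zip(X, y)
    )
    penalty = lam * sum(abs(b) for b in beta)
    return squared_error + penalty


# Toy data that the coefficients below fit exactly, so only the penalty remains.
value = lasso_objective([[1.0, 2.0], [3.0, 4.0]], [5.0, 11.0],
                        beta0=0.0, beta=[1.0, 2.0], lam=0.5)
```

A larger `lam` makes non-zero coefficients more costly, which is what drives the variable selection mentioned above.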

Performance evaluation
The prediction results are evaluated in terms of two indices: mean absolute error (MAE) and root mean square error (RMSE). RMSE gives a relatively high weight to large errors and is always at least as large as MAE. The greater the difference between them, the greater the variance in the individual errors in the sample set [28]. The MAE and RMSE of $N$ samples are computed as follows:
$$\mathrm{MAE} = \frac{1}{N}\sum_{n=1}^{N}\left|\hat{l}_n - l_n\right| \tag{29}$$
$$\mathrm{RMSE} = \sqrt{\frac{1}{N}\sum_{n=1}^{N}\left(\hat{l}_n - l_n\right)^2} \tag{30}$$
where $\hat{l}_n$ and $l_n$ are the predicted and actual passenger loads of sample $n$, respectively. Table 1 presents the performance of passenger flow prediction on the test set (13, 14 August). As shown in Table 1, the prediction performance for alighting flow is better than that for boarding flow on segment 1, but the opposite holds on segment 2, since segment 1 has more boarding passengers while segment 2 has more alighting passengers. The prediction performance for section flow is not as good as that for boarding and alighting flow, partly because the section flow has much larger values. Figure 6 also plots the predicted flow values versus ground truth data for stop 7 on 14 August as an example. It shows that the passenger flows are quite unstable over time, especially the section flow. Even so, the predicted values fit the ground truth well, indicating the effectiveness of the adaptive Kalman filter-based prediction model.
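The two error indices used in this subsection can be computed with a few lines of Python. The function names and the sample values are illustrative only and are not taken from the study's data.

```python
import math


def mae(predicted, actual):
    """Mean absolute error between predicted and actual loads."""
    return sum(abs(p - a) for p, a in zip(predicted, actual)) / len(actual)


def rmse(predicted, actual):
    """Root mean square error between predicted and actual loads."""
    return math.sqrt(
        sum((p - a) ** 2 for p, a in zip(predicted, actual)) / len(actual)
    )


# Hypothetical loads: absolute errors are 2, 0 and 4 passengers.
predicted_loads = [10, 12, 20]
actual_loads = [12, 12, 16]
print(mae(predicted_loads, actual_loads), rmse(predicted_loads, actual_loads))
```

Because the single error of 4 passengers is squared, RMSE exceeds MAE in this example, which illustrates the extra weight RMSE places on large deviations.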

Single-step ahead passenger load prediction
In single-step ahead passenger load prediction, we forecast the load at the target stop from the immediately preceding stop. The performances of the proposed method and the four benchmarks on the two studied segments are displayed in Table 2. The detailed prediction results for each stop are given in the Appendix. Table 2 shows that the proposed method outperforms all the benchmarks on both segments 1 and 2. The linear model performs close to the proposed method, and the lasso regression is also acceptable for BPL prediction. However, the 2S-EKF model yields very poor prediction results, even much worse than those of the one-step forecast. A larger historical dataset may help to improve the performance of the 2S-EKF model, since the first step of the model is to find the most similar load sequence in history. Nonetheless, the 2S-EKF model is in essence based only on load data and does not consider other important explanatory variables (e.g. headway, boarding, and alighting passenger numbers). Thus, it is more likely that the 2S-EKF model is not capable of predicting BPL in high-frequency transit; it may be more feasible for low-frequency routes, on which the headways have much less variability.
Comparing the proposed method with the lasso regression shows that the advantages of the proposed method are more evident on segment 1. Specifically, the proposed method shows its superiority at stops with many boarding passengers, as demonstrated in Table A1 in the Appendix. The reason is probably that the prediction of boarding numbers relies heavily on the predicted boarding flow in Equation (17), whereas the predicted alighting flow is less important for the prediction of alighting numbers. Figure 7 plots the loads predicted by the proposed method versus the actual loads on the two segments. Red dashed lines in Figure 7 indicate the number of seats, which is 22 for the studied line. Figure 7 shows that most actual loads are smaller than 22, indicating that seats are available on line 1 in most cases. Generally, the predicted loads fit the actual loads well. For loads larger than 22, the deviations are still in an acceptable range.

Multi-step ahead passenger load prediction
In multi-step ahead prediction, we forecast the load at target stop $s$ from the preceding stop $s-k$, where $k$ is the number of steps. The performances of the proposed method and three benchmarks on the two studied segments are displayed in Table 3.
Here, we consider two-step ahead and three-step ahead. The 2S-EKF model is omitted since it cannot provide competitive results. The detailed prediction results for each stop are given in Appendix as well.
Overall, the proposed method still outperforms all the benchmarks on both segments. The differences between the performances of the proposed method and the linear model grow as the prediction step increases. This indicates that the auxiliary variables ($l_{s-2}$, $l_{s-3}$, $h_{s-2}$, $h_{s-3}$) and the non-linear relationship captured by SVR are beneficial to BPL prediction, especially for multi-step ahead prediction, in which the variation trends of loads or headways are important information. In addition, the superiority of the proposed method over the lasso regression also becomes more apparent as the prediction step increases. This is probably because the lasso regression does not include an explanatory variable for future passenger flow and assumes a linear relationship with the explanatory variables, which limits its ability to predict sharp increases in passenger load. As before, the advantages of the proposed method on segment 2 are not as evident as those on segment 1. Figure 8 plots the three-step ahead loads predicted by the proposed method versus the actual loads on the two segments. Generally, Figure 8(b) shows a lower scatter around the diagonal than Figure 8(a), which is consistent with the results in Table 3. Interestingly, there is an overall trend on both segments 1 and 2 that the load on crowded runs is slightly underestimated while the load on the least crowded runs is slightly overestimated. A similar phenomenon also appears in the experiments in Jenelius [25], indicating a strong differentiation in BPL variation trends among different bus trips. Even so, Figure 8 shows that the proposed method can effectively predict bus loads several stops ahead of the current locations in most cases.

CONCLUSION
In high-frequency transit, providing real-time crowding information (RTCI) can help passengers choose less crowded vehicles and reduce negative crowding externalities, which would be attractive for both passengers and transit agencies. Providing timely and effective RTCI generally requires that the bus passenger load (BPL) be predicted several stops ahead of the current bus locations. This paper contributes to the limited literature on the BPL prediction problem by formulating a two-stage prediction method based on APC data. The first stage is short-term passenger flow prediction at stop level, in which boarding, alighting, and section flows at stops are predicted by applying the adaptive Kalman filter approach. The second stage is BPL prediction at vehicle level, in which the predicted flows obtained from the first stage, as well as other predictors taken directly from APC data, act as the explanatory variables to predict final passenger loads using the SVR algorithm. Four benchmarks are used to test the relative performance of the proposed method, including a linear model based on predicted flows, a one-step forecast, and two methods from the existing literature (the 2S-EKF model and lasso regression). Two segments on bus line 1 in Suzhou, China are selected to test the performances of these prediction methods. The results show that the proposed method generally achieves the best performance among all the methods. The comparison with the linear model indicates that the SVR algorithm in the proposed method can effectively capture the non-linear relationship between bus loads and explanatory variables, particularly improving the performance in multi-step ahead prediction. The lasso regression also performs well in general, but the proposed method noticeably outperforms it for stops with sharp increases in BPL and for multi-step ahead prediction. In contrast, another existing method, the 2S-EKF model, may not be appropriate for BPL prediction in high-frequency transit, given its poor performance in the results.
It should be noted that the prediction model in this paper is trained using APC data collected without RTCI. The variation patterns of BPL may be somewhat different if RTCI is actually provided, since the boarding behaviour of passengers may be affected by RTCI, but the prediction method proposed in this paper is considered still applicable if the model is trained using APC data collected with RTCI. Since RTCI may influence passengers' boarding choices between two consecutive buses on the same route or on different routes, adding the passenger loads of the preceding and following buses on the same route and on other routes to the current prediction framework seems reasonable and worth trying in order to improve the prediction performance.
For future work, we will move forward to bus crowding level prediction based on the findings in this study. Another interesting topic is the release strategy of RTCI to disseminate the information to customers in a reliable and useful way.