Freight trip generation to buildings under construction: a comparative analysis with linear regression and generalised linear regression

Received: 13 November 2018. Accepted for publication: 7 June 2020. Published: 15 December 2020. Area editor: Helena Beatriz Cybis.

ABSTRACT
Estimating the number of trips generated by a company is an essential part of the process of freight demand modelling. In this context, the current study examines freight trip generation to buildings under construction (BUC) using generalised linear regression and linear regression, through a case study of Belo Horizonte. The main contributions of this paper are related to the models to estimate freight trips to BUC, the verification of the linearity assumptions of the linear models and the comparison of different modelling techniques for the freight trip generation models. Verification of the linearity assumptions confirmed the reliability of the results of the linear regression models. Results indicate that the models with the best accuracy in predicting freight trips to the BUC are the linear models that use the area as an explanatory variable.


INTRODUCTION
Urban freight transport (UFT) supports the urban lifestyle because people need access to consumer goods (such as food, medicines, clothing and other products) and services (such as waste collection). However, UFT operation directly impacts both local and regional economies (Ferreira and Silva, 2016). Despite this, local authorities are not able to establish measures that minimise the effects of freight transport in the urban environment, since they do not know the dynamics of freight flow in their territory.
In order to develop effective public policies that minimise the operational and environmental impacts of UFT, the number of trips generated by activities in a city must be known and understood by transport planners (Gonçalves et al., 2012; Oliveira et al., 2016). This is particularly important given that these trips can harm road traffic in the immediate surroundings and, in some cases, hinder the accessibility of the region, aggravate vehicle and pedestrian safety conditions (Oliveira et al., 2016), and increase pollutant emissions and fuel consumption. To address such a situation, freight trip generation models (FTGM) estimate the number of trips produced and attracted, based on variables that reflect characteristics of the region or of the phenomenon, considering the dynamics of the urban space.
For Brazil, the literature presents FTGM for pubs and restaurants (Campos and Melo, 2004; Silva and Waisman, 2007; Oliveira et al., 2016), supermarkets (Campos and Melo, 2002; Gasparini et al., 2010; Oliveira et al., 2016; Reis et al., 2018), shopping centres (Gasparini et al., 2010), and retail sites, including those specialising in clothing, food, construction materials and fuel (Campos and Melo, 2004; Oliveira et al., 2018). Against such a backdrop, this paper presents freight trip generation models for buildings under construction (BUC), which have not been examined in the Brazilian context.
The construction sector is economically important to countries around the world. For example, in 2017, the share of civil construction in GDP was 4.4% in Brazil (CBIC, 2018), 5% in the European Union (EuroStat, 2018) and 4.3% in the United States (Statista, 2018). This sector generates a significant number of trips to guarantee production throughout a project's construction process, and it does not always consider the internalisation of loading and unloading operations. Also, BUC can occur in any location within a town, which adds a layer of complexity compared with supplying other sectors. Thus, the development of FTGM for BUC was seen as a timely research opportunity.
Although BUC have transitory movements and the attractiveness of freight trips exists only during the construction time frame, the impact of these movements is significant, especially when the location of BUC takes place in dense and congested regions. Also, many cities offer temporary unloading areas in front of the BUC. The decision on the size of unloading areas is arbitrary due to the lack of knowledge of transportation analysts about the construction process and the generation of trips. Thus, this paper contributes to the literature about freight trip generation models in BUC.
In this way, the paper presents freight trip generation models in BUC through a case study of Belo Horizonte. The following hypotheses were considered: (i) FTGM in BUC is a linear phenomenon; (ii) the linearity assumptions are fundamental to evaluating the accuracy of the model; (iii) cross-validation allows us to identify the predictive capacity of the model; and (iv) although generalised linear regression is the technique usually applied, linear regression provides models with better predictive capability. Thus, from this set of hypotheses, the paper seeks to contribute to the field in three ways: (i) the development of FTGMs in BUC; (ii) the use of linearity assumptions and a cross-validation technique to evaluate the accuracy and predictive capacity of the model, respectively; and (iii) the comparison of different trip generation modelling techniques.

FREIGHT TRIP GENERATION MODELLING
A systematic literature review (SLR) was carried out to identify the state of the art of FTGMs based on the review protocol proposed by Wee and Banister (2016). The SLR consisted of three steps. In the first step, data sources and keywords were identified. Science Direct, Web of Knowledge, Scopus (Elsevier) and TRID (Transport Research International Documentation) were used as database sources. "Freight trip" and "freight trip generation models" were used as keywords (note: keywords in Portuguese and English were considered). In the second step, abstracts were read, and papers were selected according to the following approaches: FTGM (the specific theme of this study), freight demand management (related to freight trip generation) and urban goods distribution (due to the relationship between urban development patterns and freight trips). In the third step, papers not related to the theme were excluded, and those identified by the snowball strategy were included.
The results of the SLR highlight the contribution of this paper, as FTGM in BUC are not considered in the literature. Also, it is unusual to use linearity assumptions and cross-validation to identify the predictive capacity of the model, as is proposed in the current study. Accordingly, this paper contributes to the literature in two ways: concerning the method, it used techniques to assess the accuracy of the prediction of the model, and, concerning the phenomenon, it addressed a topic that had not yet been studied in the literature.

RESEARCH APPROACH
The research approach used to determine the freight trip generation models for buildings under construction was based on Oliveira et al. (2016) and Campos et al. (2012), as described in the next sections.

Definition of objective, scope and area of study
FTGM contributes to urban freight transportation planning (Oliveira et al., 2016); here, the objective is to develop estimation models for BUC (residential or commercial).
The study scope is based on trips generated to BUC in the study area of Belo Horizonte, MG, Brazil. A building under construction passes through five main stages during the construction process: foundation, structure construction, brickwork and interior rough-in, coating and finishing. Depending on the type of constructive structure of the building (concrete, structural masonry or mixed structure), a different number of trips can be generated using many types of freight vehicles. Dump trucks are typically used in the foundation stage, cement mixer trucks are used in the structure construction stage and flatbed and box trucks are used in other phases of construction.
Regardless of the stage and type of construction, BUC attract freight vehicles in order to supply goods for the construction project. Moreover, BUC also produce trips to remove construction waste. Although it is possible to analyse these trips together, in this paper the produced and attracted trips were analysed separately, since the BUC generated a different number of trips in each phase of construction. Another motivation for considering these phenomena individually is the scarcity of models to explain them. It is important to mention that regardless of whether the trip is attracted or produced, at some point an empty trip (where the vehicle is empty) will be produced by the BUC, but these empty trips were not analysed in this study.

Definition of dependent and explanatory variables
In freight transportation, an estimate of the trips generated can be made in terms of the number of trips or the amount of cargo transported. In this paper, the dependent variable is the number of freight vehicles attracted (i.e. deliveries of goods to BUC) and produced (i.e. collection of waste from the BUC) per week. The time unit 'week' was chosen because the planning of a BUC is on a weekly basis. In this sense, it is logical to use the same time unit in the freight trip generation modelling.
Area and number of employees were considered as explanatory variables since they are the classical explanatory variables in FTGM (Alho and Silva, 2017; Oliveira et al., 2018). Additionally, the number of floors and the number of units were considered as alternative explanatory variables in the modelling.

Identification of buildings under construction in the area of study and data collection
BUC were identified through the Approved Building Projects Report (Belo Horizonte, 2018), which identifies BUC projects approved by the Belo Horizonte municipality. This report provides the location of the BUC in Belo Horizonte.
A questionnaire was designed to obtain the data. The structure of the questionnaire is presented in Table 1. Data were collected considering the stage of construction, since the number of employees is related to the services performed and varies according to the construction phase. Engineers responsible for new projects in a construction company in Belo Horizonte were interviewed to validate the questionnaire. The questionnaire was answered by engineers or construction managers involved in the routine of the building under construction.

Sample
Data from the Project Report approved by the Belo Horizonte municipality were used to define the sample. This report presents information about the projects that requested a building permit from January 2017 to December 2017. We identified 604 BUC in Belo Horizonte in 2017, and the data collection was planned considering the number of BUC in the nine administrative regions in Belo Horizonte. The sample size estimated was 83 BUC, with a 95% confidence level and 10% margin of error.
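The reported sample size can be checked with a short calculation. The sketch below is written in Python for illustration (the study itself used R) and assumes Cochran's formula with a finite population correction and maximum variability (p = 0.5). The source does not state which formula was applied, so the formula, the function name and the default arguments are assumptions.

```python
import math
from statistics import NormalDist


def finite_population_sample_size(population, confidence=0.95, margin=0.10, p=0.5):
    """Sample size from Cochran's formula with finite population correction.

    The formula and p = 0.5 are assumptions; the paper only reports the result.
    """
    # Two-sided critical value of the standard normal distribution.
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    numerator = population * z ** 2 * p * (1 - p)
    denominator = (population - 1) * margin ** 2 + z ** 2 * p * (1 - p)
    return math.ceil(numerator / denominator)


# 604 BUC identified in Belo Horizonte in 2017 (value taken from the text).
sample_size = finite_population_sample_size(604)
print(sample_size)
```

Under these assumptions the function returns 83 for a population of 604, which agrees with the sample size reported above.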

Analysis of the correlation
The analysis of the correlation between variables determines the strength of the relationship between two paired observations (Stevenson, 1981). Pearson's correlation coefficient was used to analyse the correlation between the dependent variable and the explanatory variables. If the independent and dependent variables are correlated, a statistically significant model is more likely to be obtained.
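As an illustration of this step, the following minimal Python sketch computes Pearson's correlation coefficient from its definition. The data are hypothetical values invented for the example, not observations from the survey, and the function name is an assumption.

```python
import math


def pearson_r(x, y):
    """Pearson's correlation coefficient between two paired samples."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical data: built area (m2) and weekly deliveries for six sites.
area = [800, 1200, 1500, 2300, 3100, 4000]
trips = [3, 4, 6, 7, 10, 12]
r = pearson_r(area, trips)
print(round(r, 3))
```

A value of r close to 1 or -1 suggests that the explanatory variable is a reasonable candidate for the regression; a value near 0 suggests it is not.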

Modelling of freight trip generation
Linear regression and generalised linear regression (GLM) were used as modelling techniques. According to Maia (2017), regression analysis consists of obtaining an equation that tries to explain the variation of the dependent variable by the variation of the independent variable. Washington et al. (2010) and Maia (2017) present the details of the linear regression technique.
GLM is used when linear regression is inadequate (i.e. the dependent variable is asymmetric, represents count data or is binary). McCullagh and Nelder (1989) developed the GLM, incorporating exponential family distributions into the regression adjustment. For FTGM, the Poisson distribution is indicated for the regression adjustment (Washington et al., 2010). The log-likelihood function is used to estimate the parameters.
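For reference, a Poisson regression with a single explanatory variable can be written as follows. This is a sketch that assumes the canonical log link, which the text does not state explicitly; here y_i denotes the weekly number of freight vehicles at site i, x_i the explanatory variable (for example, area) and β_0, β_1 the coefficients estimated by maximising the log-likelihood.

```latex
\begin{align}
  Y_i &\sim \mathrm{Poisson}(\mu_i), \qquad \log \mu_i = \beta_0 + \beta_1 x_i, \\
  \ell(\beta_0, \beta_1) &= \sum_{i=1}^{n} \left( y_i \log \mu_i - \mu_i - \log y_i! \right).
\end{align}
```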
The unbiasedness and minimum variance of the coefficient estimates were verified through the t-test (Maia, 2017). The null hypothesis is that the coefficient is equal to zero; the coefficient is considered significant for the model when the p-value of the t-test is below 0.05. The coefficient of determination (R²) was used to identify the proportion of the variability of the dependent variable that is explained by the independent variable of the model. Analysis of variance (ANOVA, F-test) was used to verify whether the model contributes to explaining the dependent variable: the null hypothesis that it does not is rejected if the p-value ≤ α.
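The quantities behind these tests can be sketched for a one-variable linear model. The Python code below is illustrative only: the function name and the data are hypothetical, and it stops at the test statistics, because converting them to p-values requires the t and F distributions (available in R or in a statistics library).

```python
import math


def simple_ols(x, y):
    """Fit y = b0 + b1 * x by least squares and return basic fit statistics."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    b1 = sxy / sxx
    b0 = mean_y - b1 * mean_x
    # Residual and total sums of squares.
    sse = sum((b - (b0 + b1 * a)) ** 2 for a, b in zip(x, y))
    sst = sum((b - mean_y) ** 2 for b in y)
    r2 = 1 - sse / sst
    # Standard error of the slope and t statistic for H0: b1 = 0.
    se_b1 = math.sqrt(sse / (n - 2) / sxx)
    t_b1 = b1 / se_b1
    # ANOVA F statistic with 1 and n - 2 degrees of freedom.
    f_stat = (sst - sse) / (sse / (n - 2))
    return {"b0": b0, "b1": b1, "r2": r2, "t_b1": t_b1, "f": f_stat}


# Hypothetical data: built area (m2) and weekly deliveries for six sites.
area = [800, 1200, 1500, 2300, 3100, 4000]
trips = [3, 4, 6, 7, 10, 12]
fit = simple_ols(area, trips)
print(fit)
```

With a single explanatory variable the F statistic equals the square of the slope's t statistic, so the two tests lead to the same decision in that case.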
The linearity assumptions (linearity, mean of errors equal to zero, homoscedasticity, absence of autocorrelation between errors and normality of residuals) were tested to assess the accuracy of the model (Washington et al., 2010). A residuals vs fitted plot was used to evaluate linearity; homoscedasticity was verified by the Breusch-Pagan test (Breusch and Pagan, 1979). The null hypothesis is that the model is homoscedastic, i.e. there is constant variance in the residuals; it is not rejected at the 5% significance level when P[χ²] > 0.05 (Maia, 2017). The normality of the residuals was verified by the Shapiro-Wilk test, whose null hypothesis is that the sample comes from a normal distribution; it is rejected if P[W_calculated] < P[W_α], that is, if the p-value of the test is below the significance level.
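The homoscedasticity check can be sketched for the one-variable case. The Python code below assumes the studentised (Koenker) form of the Breusch-Pagan statistic, LM = n·R², where R² comes from an auxiliary regression of the squared residuals on the explanatory variable. The source only names the test, so the variant, the function names and the data are assumptions.

```python
import math


def _fit_line(x, y):
    """Least-squares intercept and slope for y = b0 + b1 * x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    b1 = sum((a - mx) * (b - my) for a, b in zip(x, y)) / sum((a - mx) ** 2 for a in x)
    return my - b1 * mx, b1


def breusch_pagan(x, y):
    """Studentised Breusch-Pagan test with one explanatory variable.

    Returns the LM statistic and its p-value (chi-square, 1 d.f.).
    """
    n = len(x)
    b0, b1 = _fit_line(x, y)
    sq_resid = [(b - (b0 + b1 * a)) ** 2 for a, b in zip(x, y)]
    # Auxiliary regression of the squared residuals on x.
    a0, a1 = _fit_line(x, sq_resid)
    mean_u = sum(sq_resid) / n
    sst = sum((u - mean_u) ** 2 for u in sq_resid)
    sse = sum((u - (a0 + a1 * a)) ** 2 for a, u in zip(x, sq_resid))
    lm = max(n * (1 - sse / sst), 0.0)
    # Survival function of a chi-square with 1 d.f.: erfc(sqrt(lm / 2)).
    p_value = math.erfc(math.sqrt(lm / 2))
    return lm, p_value


# Hypothetical data: built area (m2) and weekly deliveries for six sites.
area = [800, 1200, 1500, 2300, 3100, 4000]
trips = [3, 4, 6, 7, 10, 12]
lm_stat, p_val = breusch_pagan(area, trips)
print(lm_stat, p_val)
```

A p-value above 0.05 means that the hypothesis of constant residual variance is not rejected, which is the condition described above.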
The model is considered accurate if it meets all linearity assumptions. The predictive capacity of the model was evaluated by the chi-square test, the root mean square error (RMSE) and the cross-validation test (Hyndman, 2006; Arlot and Celisse, 2010). The leave-one-out cross-validation (LOOCV) process was used; for more details, including the procedure associated with LOOCV, see Arlot and Celisse (2010).
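The LOOCV procedure can be illustrated with a minimal Python sketch for a one-variable linear model: each observation is left out in turn, the model is refitted on the remaining observations, and the prediction error for the omitted observation is recorded. The function names and the data are hypothetical.

```python
import math


def fit_line(x, y):
    """Least-squares intercept and slope for y = b0 + b1 * x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    b1 = sum((a - mx) * (b - my) for a, b in zip(x, y)) / sum((a - mx) ** 2 for a in x)
    return my - b1 * mx, b1


def loocv_rmse(x, y):
    """Leave-one-out cross-validation RMSE for a one-variable linear model."""
    squared_errors = []
    for i in range(len(x)):
        # Refit the model without observation i, then predict observation i.
        x_train = x[:i] + x[i + 1:]
        y_train = y[:i] + y[i + 1:]
        b0, b1 = fit_line(x_train, y_train)
        squared_errors.append((y[i] - (b0 + b1 * x[i])) ** 2)
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Hypothetical data: built area (m2) and weekly deliveries for six sites.
area = [800, 1200, 1500, 2300, 3100, 4000]
trips = [3, 4, 6, 7, 10, 12]
cv_error = loocv_rmse(area, trips)
print(cv_error)
```

Because every prediction is made for an observation the model has not seen, the resulting RMSE is a measure of predictive capacity rather than of in-sample fit.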
TRANSPORTES | ISSN: 2237-1346
Concerning the GLM, the hat-value identifies the leverage values in the model. Estimation of the GLM coefficients was evaluated by the Pearson chi-square test (χ²), which is used to determine if there is a significant difference between the expected values in a predictive model and the values observed in the data set. The Akaike information criterion (AIC) was used to classify the models (Akaike, 1974). Alternatively, Hurvich and Tsai (1989) proposed the corrected AIC (AICc), which is indicated for small samples with a normal distribution, as being a more suitable criterion to select the model (Davison, 2001). Burnham and Anderson (2002) recommend using AIC to select models when the number of observations is at least 40 times greater than the number of parameters. The models were estimated using the software R version 3.4.4, released on March 15, 2018, with the following packages: RSQ (Zhang, 2018), metrics (Frasco, 2018), lmtest (Hothorn et al., 2018), ISLR (James et al., 2017) and boot (Canty and Ripley, 2017).
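For reference, the two criteria are usually written as below, where k is the number of estimated parameters, n the number of observations and L̂ the maximised likelihood of the model; these symbols are introduced here for illustration. Lower values indicate a preferable model, and the correction term in AICc vanishes as n grows relative to k.

```latex
\begin{align}
  \mathrm{AIC} &= 2k - 2\ln \hat{L}, \\
  \mathrm{AICc} &= \mathrm{AIC} + \frac{2k(k+1)}{n-k-1}.
\end{align}
```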

RESULTS
Ninety BUC were randomly selected from the Report of Belo Horizonte Municipality. In the field survey, 36 BUC were not located and were replaced by other BUC in the same region. Data from 105 BUC located in Belo Horizonte were obtained. The data collection occurred between August 2018 and November 2018.

Characterization of the buildings under construction
Of the 105 BUC, 79.0% of the projects are of concrete frame network, 16.2% of structural masonry and 4.8% use mixed structure, i.e. part in reinforced concrete and part in structural masonry. Residential projects are the majority (87.6%), 3.8% are buildings exclusively for commercial use and 8.6% are mixed buildings (commercial use on the ground floor and residential units on other floors). Also, 17% were in the foundation phase, 19% in the structure construction stage, 21.9% in the brickwork and interior rough-in stage, 23.8% in the coating phase and 18.1% in the finishing phase.
Regarding deliveries, 15.2% occur on Monday, 18.1% on Tuesday, 23.8% on Wednesday and 4.8% on Thursday. Some of the interviewees (38.1%) could not single out a day with more frequent deliveries. Deliveries mainly occur between 7:00 and 10:00 (51.4%); 16.2% occur between 10:00 and 12:00, 3.8% between 12:00 and 14:00 and 4.8% between 14:00 and 17:00, and 23.8% of respondents did not know when the deliveries occur.
For unloading operations, 78.1% of the vehicles park on the street, in front of the BUC, and 10.5% park in a temporary unloading area. Also, in 11.4% of the BUC, the vehicles park at the construction site, mainly in the foundation phase. According to the Transport and Traffic Company of Belo Horizonte (BHTRANS), only five BUC requested a temporary unloading area in 2017.

Freight trip generation models
Data were segregated according to the type of the constructive structure of the building (concrete, structural masonry and mixed) and the construction stage for the development of the models. A minimum sample of five BUC was considered for modelling. In this way, it was possible to obtain equations for all stages considering the BUC in concrete. It was also possible to obtain equations for structural masonry for the brickwork and interior rough-in stage and for the coating stage.

FTGM by linear regression
We developed 110 models using linear regression: 55 for attraction and 55 for production. Of these, only nine models were validated by statistical tests and linearity assumptions. These models are presented in Table 5. Regarding the attraction models, freight vehicle attraction models for BUC in concrete, independent of the construction stage, were obtained. Also, model LM6 has predictive capacity according to the cross-validation result. This model is a linear freight vehicle attraction model for any constructive structure or stage of BUC. Regarding freight vehicle production, area is the explanatory variable in all validated models. Also, it was possible to obtain a general model for structural masonry buildings under construction (LM9).

Generalised linear freight vehicle generation models
We developed 80 models using GLM: 40 for attraction and 40 for production. Of these, only five were validated by the statistical tests and are presented in Table 6.
Considering the Akaike information criterion (AIC), the generalised linear models do not present better predictive power when compared to linear models. For example, model GLM1 has the same variables as LM1 and the AIC (GLM1) is 78.07, while the AIC (LM1) is 53.4. Comparing these results, we concluded that LM1 is more suitable to estimate the number of freight vehicles attracted by buildings under construction. Similar results were obtained comparing the models GLM2 and LM4. Also, three models (GLM3, GLM4 and GLM5) were validated using the number of employees as the independent variable, while we did not obtain a linear regression model using this variable. Also, the values of c-hat do not indicate overdispersion of the data, being smaller than one for all the models.
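The dispersion statistic can be illustrated as follows. The Python sketch assumes the common definition of c-hat as the Pearson chi-square statistic divided by the residual degrees of freedom of a Poisson model; the source does not define it, and the function name and the values below are hypothetical.

```python
def dispersion_c_hat(observed, fitted, n_params):
    """Pearson chi-square divided by residual degrees of freedom (Poisson model)."""
    # For a Poisson model the variance equals the fitted mean.
    pearson_chi2 = sum((y - mu) ** 2 / mu for y, mu in zip(observed, fitted))
    return pearson_chi2 / (len(observed) - n_params)


# Hypothetical observed weekly trips and fitted means from a two-parameter model.
observed = [3, 4, 6, 7, 10, 12]
fitted = [3.2, 4.1, 5.5, 7.4, 9.6, 12.2]
c_hat = dispersion_c_hat(observed, fitted, n_params=2)
print(c_hat)
```

Values well above one would point to overdispersion relative to the Poisson assumption; values below one, as reported for the validated models, do not.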

Discussion of results
We obtained 15 valid models with the data obtained for Belo Horizonte, with accuracy and predictive capacity, including a freight vehicle attraction model (LM6) that applies regardless of the construction type and the construction stage. Thus, it was possible to estimate FTGM in BUC.
In general, the models using number of units or floors as explanatory variables obtained the worst adjustment. These variables could influence the number of trips because they are related to the volume of goods necessary to build each floor. However, they did not contribute to explaining the phenomenon analysed in this paper.
Regarding the number of employees, three (from 15) models were estimated using this explanatory variable (GLM3, GLM4 and GLM5), all using the GLM technique. This result invites reflection on the location of BUC and the technology used in construction. Analysing the data, on average, the BUC have 13.16 employees (standard deviation = 3.05; minimum = 4; maximum = 38). Thus, it is possible to assume that with the advent of technology, fewer employees are necessary. Thus, this variable needs to be used with parsimony to explain the FTGM in BUC.
Regarding the techniques, more models were obtained using linear regression than GLM. Thus, although the literature suggests the use of GLM for count data, the results demonstrate that it is possible to obtain linear models that meet linearity assumptions with accuracy and with predictive capacity. Also, the results indicate that the FTGM in BUC is a linear phenomenon.
Also, the results show that an analysis focused on the coefficient of determination (R²) is not enough to make a conclusion about the efficiency, accuracy and predictive capacity of the model. As an example, if the accuracy of the models presented in this paper were evaluated looking only at R², 30% (equivalent to 3 models) of them would be discarded. However, despite the low coefficient of determination, the models explain the phenomenon under study, taking into account the linearity assumptions. Consequently, other tests are essential to making a conclusion about the accuracy and predictive capacity of the model, such as those used in this study. Therefore, linearity assumptions are fundamental to evaluating the accuracy of the model. In the same manner, cross-validation allows for the identification of the predictive capacity of the model.

CONCLUSION
This paper presented freight vehicle generation models for buildings under construction. Data from BUC in Belo Horizonte were considered for the modelling. Models were estimated using linear regression and GLM.
Data were obtained from interviews in 105 BUC. We estimated 190 equations (110 using linear regression and 80 using GLM) and obtained 15 models that were statistically valid. Considering the Akaike information criterion (AIC), the models obtained by linear regression were more suitable to estimate freight vehicle generation. Also, FTGM to BUC is a linear phenomenon in the Belo Horizonte case.
Among the variables used to explain the phenomenon in BUC, the best results were obtained using the area and employees. The number of floors and units (related to the characteristics of the sector) did not present statistical significance in the estimations.
Finally, linearity assumptions and cross-validation reduce the number of models obtained. However, the models present accuracy and predictive capacity. Thus, it is possible to conclude that the linearity assumptions are fundamental to evaluating the accuracy of the model. Also, it is important to verify the predictive capacity of the model using the cross-validation test.
For future studies, it is suggested that construction time be included as an explanatory variable in the modelling. This variable is influenced by the stage of construction and influences the storage area and the number of trips. Also, it is suggested to include the number of units and floors as explanatory variables to confirm that they did not contribute to freight trip generation in BUC. Still, it is suggested to develop similar analyses in other cities to compare the results, including more extensive data collection efforts. Another suggestion is to carry out temporal data collection to analyse the influence of time on freight trip demand. This analysis could be interesting when associated with the explanatory variables: while the area of business is one variable that does not change over time, it is possible that the number of employees could change over time and thus could influence the results of the modelling.