The effect of different aggregations of severity levels of crashes with pedestrians in urban areas

Promoting a safer road environment for pedestrians requires an understanding of the risk factors associated with the injuries suffered by these users when involved in crashes. Injury levels as recorded in police reports may be subject to bias and errors, especially in adjacent, non-extreme injury categories. The aim of this study is to investigate the impact of different severity classification configurations on identifying factors related to crashes involving pedestrians in urban areas. Multinomial logit models were estimated using crash records from the city of Fortaleza between the years 2017 and 2019. The results indicated that the combination of some severity levels can lead to different significant variables and, thus, depending on the specification of the response variable, the influence of important risk factors may end up being ignored in the model. Among the analyzed factors, the age of pedestrians, the day of the week, the time of the crash and the type of road remained significant for the different configurations of severity levels. In addition, the model with three severity categories (mild/moderate, severe, and fatal) presented the best performance in terms of model fit. It was observed from this model that factors such as


INTRODUCTION
The new paradigm of road safety, based on the Safe Systems approach and Vision Zero, considers that humans make mistakes and are vulnerable to injury in the event of a crash. In this sense, the road system must be designed so that human error does not lead to a severe or fatal outcome. These approaches bring a broader perspective on the causes behind fatalities and severe traffic injuries and also encompass the concept of shared responsibility, where governments, the private sector, and society share the responsibility for a safe transportation system (Belin et al., 2012; Welle et al., 2018).
Safe Systems and Vision Zero take into account people's susceptibility to severe injuries in traffic and within the road environment, where pedestrians are considered the most vulnerable users. Shinar (2017) highlights two points that increase the probability of more severe injuries to pedestrians involved in a crash: first, they are fully exposed without any protective barrier; and second, they have a significantly lower mass than motor vehicles. Moreover, some aspects that distinguish pedestrians from other users, such as a less regulated and more variable traffic movement, make their protection more difficult. Bhat et al. (2017) emphasize that efforts to prioritize these users need to be coordinated with strategies that increase their safety. This, in turn, requires an understanding of the risk factors associated with pedestrian injuries in crashes, to allow the identification of unsafe configurations of urban space and help define appropriate countermeasures for transportation policies. In the literature, these factors are commonly categorized into groups such as characteristics of the pedestrians and drivers, vehicle attributes, road characteristics, environmental conditions, and crash aspects, among others (Mannering et al., 2016; Sun et al., 2019; Li and Fan, 2019; Zafri et al., 2020).
Several researchers have developed modeling efforts aimed at estimating the degree of severity as a qualitative response variable (Kwigizile et al., 2011; Rothman et al., 2012; Dong et al., 2019). According to Savolainen et al. (2011), the dependent variable in these models is typically either a binary response (injury or non-injury) or a multiple response (fatal, severe, mild, or unharmed). This type of data is modeled primarily with logistic regression models. In these cases, there is no temporal aggregation of crashes (annual, monthly, etc.), as in road safety performance functions, for example; instead, the analysis is disaggregated and more explanatory in nature, relating the contributing factors to the severity of a crash, given that it has occurred.
Although different modeling methodologies are available in the literature to examine how crash severity relates to various influencing factors, little is known about how the adopted severity classification configuration affects the outcome of the modeling and the understanding of the influence of the risk factors. Since crash data is extracted from different sources, such as police reports or hospital records, it is possible that there are divergences among these pieces of information (Tsui et al., 2009). In addition, there is still a difficulty in reporting minor crashes (without victims). Thus, it is common to find studies that use different aggregation configurations of severity levels, especially of the intermediate categories between unharmed and fatal (Clifton et al., 2009; Tay et al., 2011; Jang et al., 2013).
In view of the above, the main objective of this work is to evaluate different severity classification configurations for modeling the severity of crashes with pedestrians and to identify factors that influence the severity of these crashes by using multinomial models.
There are several ways to represent the severity levels of injuries related to crashes. However, the definitions are not uniform among countries or even within administrative regions of a given nation, which indicates a lack of consistency and makes it difficult to compare studies. The characterization of severity adopted by the police is usually based on fixed criteria, such as the level of bodily injury and length of stay in the hospital (for example, a crash can be considered fatal if a victim dies within 30 days of the crash) (Imprialou and Quddus, 2019). One of the best-known classifications is the KABCO classification, developed by the National Safety Council (NSC) in the United States and applied by police officers in crash reports. Another internationally accepted classification standard for measuring injury severity is the Abbreviated Injury Scale (AIS), developed by the Association for the Advancement of Automotive Medicine. The maximum score on the AIS scale, called MAIS, coded for each body region, is often used as an aggregate measure of injury severity. Despite this more realistic classification based on medical criteria, official statistics on injury severity from crashes in many countries are based only on the evaluation of the police officer or traffic warden at the scene of the crash or on information passed on to these officers shortly after the crash, with the exception of fatalities (Working group on severe road traffic casualties, 2010; Ferreira et al., 2017).
Some studies have investigated inconsistencies between the classification adopted by police officers and the classification used by health professionals. Tsui et al. (2009) found that only 5% of injuries reported as severe by Hong Kong police officers were considered severe in the hospital. Ferreira et al. (2015) compared the police classification of crash severity with the MAIS classification used in hospitals in the cities of Porto and Vila Nova de Gaia, in Portugal. The authors observed a tendency of police officers to overestimate injury severity: a notable proportion of severe injuries reported by police are in fact minor injuries. Similar results were found in other studies (McDonald et al., 2009; Broughton et al., 2010; Bhatti et al., 2011; Burdet et al., 2015; Couto et al., 2016). Inconsistencies in these classifications may generate some bias in the results of research investigating risk factors related to specific severity levels.
It is common to find studies where the authors choose to aggregate some levels of classification, especially intermediate and adjacent levels (Clifton et al., 2009; Abay, 2013; Pour-Rouholamin and Zhou, 2016; Uddin and Ahmed, 2018). Clifton et al. (2009), using data from the city of Baltimore (USA), which applies the KABCO classification, regrouped the five categories into three levels: fatality, injury, and no injury. Abay (2013), investigating crashes involving pedestrians and a single motor vehicle in Denmark, regrouped the original four categories of the database into just three.
In general, most of these studies justify the aggregation of severity levels by the very low number of observations in a given category. The authors state that grouping the categories aims to improve the results of the models and believe that merging adjacent injury severity levels does not substantially affect the inferences, as long as these adjacent injury categories are similar. However, they do not make clear in what aspects these models could be improved, nor do they present a comparative study with models using disaggregated levels (Mannering and Bhat, 2014).

Factors Influencing the Severity of Crashes with Pedestrians
Due to vulnerability to injuries, severity may differ according to age, with more severe injuries mainly associated with children and the elderly (Abay, 2013; Eluru et al., 2008). Female pedestrians are associated with higher severity levels than male pedestrians (Lee and Abdel-Aty, 2005). Lee and Abdel-Aty (2005) and Eluru et al. (2008) evaluated the effects of traffic control type on crash severity levels and found that injury levels increase in the absence of traffic controls, including pedestrian traffic lights. Fan (2019), Aziz et al. (2013), Eluru et al. (2008) and Sze and Wong (2007) examined the impact of vehicle type and road speed limits and found that heavy vehicles as well as higher speed limits are related to more severe pedestrian injuries.
Several studies have investigated the relationship between speed and severity of pedestrian injury and have indicated that more severe injuries are associated with higher impact speeds. The risk of severe injury or pedestrian death increases exponentially with speed (Garder, 2004; Rosén et al., 2011; Hussain et al., 2019). Even when speed is not the main cause of a crash, it is highly related to the severity of injuries, because the speed of the vehicle at the time of impact generally plays a critical role in the energy transferred to the victim. Despite its importance, it is difficult to know the actual speed of vehicles at the time of the occurrence, and therefore proxy variables, such as the posted speed limit or road classification, are commonly employed (Jang et al., 2013; Li et al., 2016).
In the context of the Safe System approach, an appropriate speed limit is one that considers traffic safety as the main objective; it should adequately account for the adjacent land use mix as well as the different road users with their limitations and vulnerability. For Clifton et al. (2009), arterial roads and highways are designed for higher speeds and volumes of vehicles and tend to be less safe for pedestrians. However, in areas with high population and commercial density, a large number of crashes is expected, although injuries may be less severe due to congestion and the low vehicle speeds imposed by the urban characteristics of the environment. Other aspects of urban space can also have a significant influence on the severity of crashes with pedestrians, especially on a micro scale, such as the number of lanes, which increases the crossing distance and consequently pedestrian exposure (Aziz et al., 2013; Pour-Rouholamin and Zhou, 2016).
In general, the severity of traffic injuries is analyzed by using categorical models. Due to the ordered nature of injury severity, models such as ordered logit/probit are commonly employed (Jang et al., 2013; Donmez and Liu, 2014; Chen et al., 2020; Liu and Fan, 2020). However, this approach assumes that the variables have the same impact (in the value and sign of the parameter) on all levels of injury severity. Consequently, there is a restriction on how the explanatory variables affect the probabilities of the outcome. In addition, ordered models are more susceptible to underreporting in injury data, resulting in biased or inconsistent parameter estimates (Washington et al., 2003; Savolainen et al., 2011).
Models that do not consider the ordered nature of injuries, such as the multinomial logit, are also frequent in the analysis of crash severity (Tay et al., 2011; Manner and Wünsch-Ziegler, 2013; Chen and Fan, 2019; Casado-Sanz et al., 2019; Salum et al., 2019; Vajari et al., 2020). Although they do not consider the order of injury severity outcomes, these models are not affected by the restrictions imposed by ordered models. In general, each approach has assumptions and restrictions that have implications for the inferences of the models; however, the superiority of one approach over another may depend on the available data.
The choice of the most appropriate model type, as well as the selection of factors related to the severity of pedestrian injury, depends greatly on the circumstances of the study site, the data set used for the analysis, and the objectives of the research. For Wang et al. (2013), even in developed countries or cities, differences in road infrastructure, traffic conditions, and behavioral patterns of pedestrians and drivers may result in a different set of significant factors associated with the severity of pedestrian injuries. In this study, data from the Brazilian city of Fortaleza are used to investigate the risk factors associated with the severity of pedestrian injuries, since the necessary data were accessible for this city.

CRASH DATA AND URBAN ROAD ATTRIBUTES
Data on crashes with pedestrians were collected from the Fortaleza Crash Information System (SIAT/FOR) for the years 2017 to 2019. This data consists of individual records with personal information of the victims, characteristics of the involved vehicles, and aspects related to the crash. In order to determine the variables related to the urban structure, a georeferenced database of traffic lights and speed enforcement cameras provided by the municipal traffic management agency (AMC) was used.
In the crash dataset, observations with missing information in the fields related to the investigated explanatory variables were eliminated from the initial sample (N = 4,658), resulting in a sample of 2,660 observations. Regarding the severity of the injuries, SIAT/FOR uses five categories: unharmed (no apparent injury), mild (possible injury), moderate (evident injury), severe (disabling injury), and fatal (killed). In the SIAT/FOR dataset, unharmed and mild crash outcomes are assigned by the municipal officer responsible for the initial response. However, in more severe crashes, the injury category is obtained from the report of the regulation center of the Mobile Emergency Care Service (SAMU), which answers the calls about the occurrence and dispatches the teams to provide assistance. The regulation center classifies the severity of the occurrence based on the need for urgent care and according to the victim's state of health during first aid.
The unharmed category was eliminated from the dataset because it presented only one record for this period. This is quite common in crash databases: crashes without injured victims are less likely to be reported, because the responsible agencies are usually not called to provide assistance. Moreover, it is unlikely that a crash with pedestrians will result in an unharmed victim, which also reduces the number of records in this category.
Some variables were obtained from the georeferenced database using a buffer with a radius of 100 meters around each observed crash. The variables were divided into four groups: pedestrian characteristics, driver characteristics, crash conditions and road network attributes. Table 1 presents the selected variables, together with their relative frequencies.
In terms of the injury outcome, four categorical models were estimated from the data sample: i) in Model 01, the dependent variable has four categories, using the original database classification: mild, moderate, severe, and fatal injuries; ii) in Model 02, mild and moderate injuries were grouped into a single category due to their similarity in terms of consequence, which may generate divergences in the classification between these two levels by the agents in charge; iii) in Model 03, severe and fatal injuries were grouped into one category, based on the principles of Safe Systems and Vision Zero, which aim to reduce severe and fatal injuries in traffic; iv) in Model 04, moderate and severe injuries were grouped into a single category. Table 02 presents the frequency of severity levels in each configuration.
Table 1 (excerpt). Selected variables and their relative frequencies.
Group        Variable      Coding (relative frequency)
Pedestrian   [...]         1 - Pedestrian between 31 and 60 years old (52%); 0 - Other
Pedestrian   Age_M60       1 - Pedestrian above 60 years old (20%); 0 - Other
Driver       Gender_D      1 - Male (84%); 0 - Female (16%)
Driver       Age_D18_30    * Driver between 18 and 30 years old (45%)
Driver       Age_D31_60    1 - Driver between 31 and 60 years old (52%); 0 - Other
Driver       Age_D60       1 - Driver above 60 years old (3%); 0 - Other
Crash        [...]
Agresti (2006) recommends checking the number of possible predictors for a model when the categories of the response variable are imbalanced. Peduzzi et al. (1996) suggest that the model should contain no more than n/10 parameters, where n represents the number of observations in the category with the lowest frequency, to avoid overestimated and underestimated variances and therefore poor coverage of confidence intervals based on Wald tests. Thus, in Model 02, which has the greatest imbalance in the frequency of the categories, the number of predictors should not exceed 22, a limit that is respected, as will be discussed in the next sections.
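The n/10 rule of thumb described above can be checked with a few lines of Python. This is only an illustrative sketch: the function name `max_predictors` and the category counts are hypothetical placeholders, not the frequencies reported in Table 02.

```python
def max_predictors(category_counts, events_per_parameter=10):
    """Upper bound on model parameters under the n/10 rule of thumb.

    n is the number of observations in the least frequent category
    of the response variable.
    """
    if not category_counts:
        raise ValueError("category_counts must not be empty")
    smallest = min(category_counts.values())
    return smallest // events_per_parameter


# Hypothetical counts for a three-level response (NOT the study's data).
example_counts = {"mild_moderate": 2000, "severe": 440, "fatal": 220}
limit = max_predictors(example_counts)
print(limit)
```

With these assumed counts the least frequent category has 220 records, so the sketch returns a limit of 22 parameters.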

DEVELOPMENT OF CATEGORICAL MODELS AND GOODNESS-OF-FIT MEASURES
In this study, unordered multinomial logit models were used because they allow for more flexible variable effects, since they do not impose a monotonic effect on the dependent variable, as traditional ordered models do (Abay, 2013; Savolainen et al., 2011; Washington et al., 2003). In the Multinomial Logit (MNL) model, the general structure used to estimate severity starts with the definition of a linear function S that determines the result of injury i for observation n as:
$S_{in} = \beta_i X_{in} + \varepsilon_{in}$ (1)
where $\beta_i$ is a vector of estimated parameters, $X_{in}$ is a vector of observable characteristics that affect the severity of the injury sustained by observation n, and $\varepsilon_{in}$ is an error term that accounts for unobserved effects (Washington et al., 2003). The probability for each severity level is given by Equation 2, assuming that $\varepsilon_{in}$ is independently and identically distributed with a type 1 extreme value distribution. When an MNL is estimated, one injury level is used as the comparison group and therefore its coefficients are set to zero. In this study, the first category of each model was used as the reference category.
$P_n(i) = \frac{\exp(\beta_i X_{in})}{\sum_{\forall I} \exp(\beta_I X_{In})}$ (2)
Two tests were performed to assess the quality of the models: the model significance test and the Hosmer-Lemeshow test. The Akaike Information Criterion (AIC) and McFadden's R² were applied to provide additional information regarding the fit of the models developed for the four severity classification settings. In addition, the Hausman-McFadden test was used to assess the property of Independence of Irrelevant Alternatives (IIA), on which multinomial logit models rely.
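As a numerical illustration of the MNL probability in Equation 2, the following Python sketch computes the probability of each severity level for a single observation. The coefficient values and variable names are hypothetical and chosen only for illustration; they are not the estimates of this study. The reference category carries all-zero coefficients, as described above.

```python
import math


def mnl_probabilities(betas, x):
    """Return the MNL probability of each severity level for covariates x.

    betas maps each level to its coefficient list; the reference level
    carries all-zero coefficients.
    """
    utilities = {
        level: sum(b * xi for b, xi in zip(coefs, x))
        for level, coefs in betas.items()
    }
    # Subtract the largest utility for numerical stability;
    # the probability ratios are unchanged.
    u_max = max(utilities.values())
    exp_u = {level: math.exp(u - u_max) for level, u in utilities.items()}
    total = sum(exp_u.values())
    return {level: e / total for level, e in exp_u.items()}


# Hypothetical coefficients, ordered as [intercept, heavy_vehicle, night].
betas = {
    "mild_moderate": [0.0, 0.0, 0.0],  # reference category
    "severe": [-1.0, 0.6, 0.3],
    "fatal": [-2.5, 1.2, 0.5],
}
probs = mnl_probabilities(betas, [1.0, 1.0, 0.0])
print(probs)
```

The returned probabilities sum to one, and setting every covariate to zero yields equal probabilities across levels, which is a quick sanity check on the implementation.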
Besides identifying an arrangement that improves the model fit, the interest in this comparison is to evaluate whether possible problems related to the classification used in the database influence potential explanatory variables as well as their related effects for a specific severity level. For this purpose, the significance of the variables in each model and the estimated parameters will also be evaluated. Table 3 presents the estimated coefficients and goodness-of-fit indicators for the four proposed crash severity configurations. The likelihood tests indicate that the explanatory variables improve the fit of the models.

RESULTS AND DISCUSSIONS
According to the Hosmer-Lemeshow test, there is no evidence to reject the null hypothesis (evidence of good fit) for the first two estimated models. Model 03 and Model 04, on the other hand, presented a p-value lower than the established significance level, rejecting the null hypothesis of a good model fit to the observed data. Aggregating the categories in these last two models appears to introduce additional bias into the model fit. The differences in the consequences of injuries at these levels for pedestrians, and the fact that models with these levels kept separate show a better fit, reinforce that the influence of risk factors on these two levels should be analyzed separately.
Model 01 and Model 03 presented similar AIC levels. Model 02, in turn, which differs by only one predictor variable from Model 03 and by two variables from Model 01 and Model 04, presented a much smaller AIC. This shows that the difference is due not only to the penalty imposed by the measure (AIC penalizes models with a higher number of parameters), but also to a better fit of Model 02.
McFadden's R² can be used to compare performance between models, with values closer to 1 associated with model superiority. It is important to note that its values tend to be considerably lower than those expected for the R² of linear regression. Values between 0.2 and 0.4 may represent an excellent fit (Ortúzar and Willumsen, 2011). Among the four models investigated, Model 01, Model 02 and Model 04 presented values that indicate a good fit, with Model 02 presenting the highest R². It is also worth noting that R² tends to increase with the number of predictor variables, and Model 02, even with the smallest number of explanatory variables, still showed a better performance.
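The two fit measures discussed above follow directly from the log-likelihood of a fitted model. The sketch below uses the standard definitions, AIC = 2k - 2 ln L and McFadden's R² = 1 - ln L(model) / ln L(null), where the null model contains only intercepts. The function names and the numeric inputs are hypothetical and serve only to illustrate the calculation.

```python
import math


def aic(log_likelihood, n_params):
    """Akaike Information Criterion: 2k - 2 ln L (lower is better)."""
    return 2 * n_params - 2 * log_likelihood


def mcfadden_r2(ll_model, ll_null):
    """McFadden's pseudo R-squared: 1 - ln L(model) / ln L(null)."""
    return 1.0 - ll_model / ll_null


def null_log_likelihood(category_counts):
    """Log-likelihood of an intercept-only multinomial model.

    Each observation is assigned the sample share of its category.
    """
    total = sum(category_counts)
    return sum(n * math.log(n / total) for n in category_counts if n > 0)


# Hypothetical values, not taken from Table 3.
ll_null = null_log_likelihood([2000, 440, 220])
ll_model = 0.75 * ll_null  # assumed fitted log-likelihood
print(aic(ll_model, n_params=20), mcfadden_r2(ll_model, ll_null))
```

In this assumed example the fitted log-likelihood is 25% closer to zero than the null one, which corresponds to a McFadden's R² of 0.25.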
In order to verify whether the IIA assumption is valid for the observed data, the Hausman-McFadden test was performed. Each of the four models was compared to a model estimated with a subset of the categories of the dependent variable. The results suggested that the IIA assumption was met.
The results in Table 3 provide useful insights for comparing the influence of different variables on different severity categories. Regarding the significance of the variables, it was noted that the significant parameters for the same category in the different models remained similar both in magnitude and in sign. This can be seen for the Gender and Weekend variables in the fatal category in Model 01, Model 02 and Model 04. In addition, most variables were significant in at least one severity category in the four estimated models.
On the other hand, some variables are no longer significant once severity levels are grouped, as is the case for Gender_D, which was significant for the fatal category in the first two models and not significant in Models 3 and 4. In this sense, it can be said that there is a differentiation between male and female drivers in crashes that result in fatal victims, which does not occur in the other severity categories. When the categories were joined, the model was not able to detect this distinction by driver gender in the injury outcome. The same occurred with the presence of traffic lights (P_Traf_Lights). In this case, it is possible to observe a distinction between places with and without traffic lights in crashes with severe pedestrian injuries. However, the model fails to capture this difference in fatal injuries and, by grouping these two levels, the variable ends up losing its significance.
In the multinomial modeling approach, it is usually not possible to identify the real magnitude of the effect of a given factor directly from its estimated coefficient. Since all explanatory variables are categorical, magnitude can be assessed by the pseudoelasticity of probability, that is, the percentage change in probability when a variable is changed from 0 to 1 while keeping the remaining variables constant. The pseudoelasticity was used to investigate the impact of aggregating the severity categories. Table 4 shows the pseudoelasticities of the two models that presented the best fit according to the AIC.
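The pseudoelasticity described above can be sketched in Python as the percentage change in the MNL probability of a severity level when one indicator variable is switched from 0 to 1. This is a minimal illustration evaluated at a single covariate profile, which is an assumption of the sketch; the model, the coefficient values and the function names are hypothetical and do not reproduce Table 4.

```python
import math


def mnl_probs(betas, x):
    """MNL probabilities for one observation (reference level has zero coefficients)."""
    exp_u = {
        level: math.exp(sum(b * xi for b, xi in zip(coefs, x)))
        for level, coefs in betas.items()
    }
    total = sum(exp_u.values())
    return {level: e / total for level, e in exp_u.items()}


def pseudo_elasticity(betas, x, index, level):
    """Percentage change in P(level) when indicator x[index] goes from 0 to 1."""
    x0, x1 = list(x), list(x)
    x0[index], x1[index] = 0.0, 1.0
    p0 = mnl_probs(betas, x0)[level]
    p1 = mnl_probs(betas, x1)[level]
    return 100.0 * (p1 - p0) / p0


# Hypothetical two-level model, coefficients ordered as [intercept, heavy_vehicle].
betas = {"non_fatal": [0.0, 0.0], "fatal": [0.0, math.log(3.0)]}
change = pseudo_elasticity(betas, [1.0, 0.0], index=1, level="fatal")
print(round(change, 1))
```

In this assumed example the probability of the fatal level rises from 0.50 to 0.75 when the indicator is switched on, that is, a pseudoelasticity of 50%.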
Table 4 shows that, for the same variables in the 'Fatal' category, all Model 4 variables showed a reduction in the chance of a pedestrian fatality compared to Model 2. The Age_31_60 variable was an exception, with a slight increase in the chance of fatal injuries.
In a pedestrian crash involving a heavy vehicle, Model 2 indicates a 269% increase in the chance of fatal pedestrian injuries, while Model 4 indicates a 21% increase. For pedestrian crashes that occur on expressways, the increase in the chance of fatal injuries is 292% in Model 2 and 208% in Model 4. Therefore, aggregations of the response variable can cause key variables to be ignored and change the estimated influence of the risk factors that remain significant, resulting in the adoption of insufficient policies for pedestrian safety.
Given these limitations in the specification of the response variable, it is important to establish a criterion that allows the adoption of the best model for the analysis of risk factors. Taking into consideration the tests performed and the metrics considered for the comparison among the four models, Model 02 showed the best performance. Therefore, the analysis of the effect of risk factors associated with the severity of pedestrian injuries is based on this model.
Regarding pedestrian characteristics, the elderly are more likely to suffer more severe injuries when involved in a crash. This is also demonstrated in the works of Abay (2013), Jang et al. (2013), Sun et al. (2019), Li and Fan (2019) and Batouli et al. (2020). Crashes with older pedestrians can be influenced by errors in judging gaps in traffic, due to their lower speed of movement and their difficulty in seeing vehicles and correctly judging their speeds. Although age is not a causal factor in itself, it is associated with a decline in cognitive ability and a weakening of the body as a result of possible health problems or the natural aging process, which makes this pedestrian group much more vulnerable to injuries. When compared to pedestrians aged 16 to 30, victims aged over 60 are almost nine times more likely to suffer a fatal injury.
As for the gender of the pedestrian, the highest risk is associated with men. One hypothesis for this result is their tendency toward riskier behavior, especially in relation to the gaps accepted for crossing (Rosenbloom, 2009; Dutta and Vasudevan, 2017; Torres et al., 2020). Male drivers are also associated with a higher probability of being involved in crashes with more severe pedestrian injuries. Eluru, Bhat and Hensher (2008) and Kim et al. (2010) found similar results. The authors point out that these results may be related to the riskier behavior of this group of drivers, such as adopting higher speeds, compared to female drivers.
The day of the week and the period of the day (day or night) of the vehicle-pedestrian collision were both significant and revealed that there are greater chances of severe or fatal victims in crashes occurring on weekends and in the evening. In Fortaleza's urban area, these findings are likely associated with the considerably lower traffic flow, which leads to higher chances of events where the vehicle speed was simply too high for the pedestrian's body, even though in some of them it was below or close to the posted speed limit (up to 40 km/h on local and collector streets, up to 60 km/h on arterials in the city). In addition, these high-energy events can also result, at least in part, from inappropriate speed choices made by drivers under the influence of substances, as well as from late responses caused either by their impaired condition or by the lower visibility in the evening. Similar results were found in the works of Jang et al. (2013), Chen and Fan (2019), Zafri et al. (2020) and Batouli et al. (2020).
When a pedestrian is run over, the injuries are the result of the transfer of energy to the human body. The speed and mass of the vehicle at the moment of impact are determining factors in the amount of energy to be absorbed by the pedestrian. Regarding the type of vehicle, heavy vehicles, which include trucks, pickups, and buses, significantly influence the severity of pedestrian injuries. This type of vehicle is about twice as likely to cause a severe injury and three times as likely to cause a fatal pedestrian injury. In addition to the greater mass, the frontal height and design of heavy vehicles can partly determine the energy to be absorbed by the pedestrian, especially in relation to the concentration of force on more vulnerable body areas.
The results indicate that the presence of traffic lights near the crash site reduces the chances of severe injuries to pedestrians by about 24%. Aziz, Ukkusuri and Hasan (2013) and Sze and Wong (2007) obtained similar results. The authors attribute these results to the clearer indication of priority in locations with traffic control, which leads to greater caution from both drivers and pedestrians.
The road functional class is also generally used as a proxy variable for speed at the scene. Although the Brazilian Traffic Code establishes a limit of 80 km/h for urban roads, in Fortaleza speed limits are 60 km/h for express and arterial streets, 40 km/h for collector streets and 30 km/h for local streets. According to Table 4, both arterial (R_Art) and express (R_Exp) roads are associated with a greater risk of severe and fatal injuries to pedestrians. The positive sign of the estimated parameters for both variables in Table 3 indicates an increase in the chances of severe and fatal injuries for pedestrians in relation to mild and moderate injuries, which is attributed mainly to the higher speeds reached by drivers on these roads. However, it is important to note that on these roads the presence of heavy vehicles, especially cargo vehicles, is also more common. The combination of higher speeds and heavy vehicles can result in a high risk of injury to pedestrians.
The probabilities for each severity level were calculated for the base scenario and for the three most frequent scenarios in the sample. Table 5 presents the analyzed scenarios, and Table 6 shows that the probability of a fatal pedestrian injury is low even in these more frequent scenarios. However, comparing these scenarios, a crash involving a male pedestrian on the weekend, during the night, with a male driver, has the highest probability of a fatal injury to the pedestrian.

FINAL REMARKS
This work evaluated the effect of different aggregations of severity levels in the modeling of pedestrian crash severity. The study was motivated by: i) inconsistencies between the classifications adopted by police officers and those used by health professionals to record the injury severity of crash victims; ii) the fact that many studies justify aggregating severity levels by the small number of observations in some categories, claiming that grouping improves model results, without presenting a comparison with models using disaggregated levels.
Variables related to the pedestrian, the moment of the crash and aspects of the urban structure were investigated. In addition, four different severity classification configurations were evaluated in the modeling of these risk factors. For this purpose, multinomial logit models were estimated using a sample of 2,660 observations of crashes with pedestrians collected from the Fortaleza Crash Information System (SIAT/FOR) for the years 2017 to 2019. The comparative analysis of the models showed that the variables had different effects on the severity levels. Thus, the combination of some severity levels may result in different significant variables. Only the variable Time was significant in all the severity levels, while Gender_D and P_Traf_Lights were significant in only two of the four tested models. Therefore, depending on the severity categories, the model can ignore important variables or change the pseudoelasticity significantly, which would misrepresent the real effect of the risk factors. Using the best fit as the criterion for selecting a model to assess risk factors, the results showed that the model with the mild and moderate levels grouped into a single category, and the severe and fatal levels kept as separate categories, improved the model fit, unlike the model that grouped the severe and fatal levels into a single category.
Possible limitations related to incorrect classification between the mild and moderate levels can be overcome by merging these two levels without loss in the quality of the model. This study then used the best-fitting model to evaluate the effect of risk factors on injury severity, covering pedestrian and driver characteristics, road network properties, and aspects associated with crash conditions.
In relation to the type of modeling used, multinomial logit models relax some restrictions of traditional ordered models, but do not consider the ordered nature of injury severity. Furthermore, some of the many factors that affect the severity of crashes are not observable, or the necessary data may not be available to the analyst. If these unobserved factors are correlated with the observed factors, biased parameters will be estimated and incorrect inferences can be made. In future studies, it is recommended to explore statistical techniques that account for possible unobserved heterogeneity in the effect of risk factors and allow greater flexibility of the estimated parameters, such as random parameters or latent class models (Savolainen et al., 2011; Mannering and Bhat, 2014). These techniques allow a better fit of the model to the data and a more in-depth analysis of the explanatory variables.