
Pattern detection in the vehicular activity of bus rapid transit systems

  • Jaspe U. Martínez-González ,

    Contributed equally to this work with: Jaspe U. Martínez-González, Alejandro P. Riascos, José L. Mateos

    Roles Data curation, Formal analysis, Investigation, Methodology, Visualization, Writing – original draft, Writing – review & editing

    jaspe@estudiantes.fisica.unam.mx

    Affiliation Instituto de Física, Universidad Nacional Autónoma de México, Ciudad Universitaria, Ciudad de México, México

  • Alejandro P. Riascos ,

    Contributed equally to this work with: Jaspe U. Martínez-González, Alejandro P. Riascos, José L. Mateos

    Roles Conceptualization, Data curation, Formal analysis, Investigation, Methodology, Project administration, Validation, Visualization, Writing – original draft, Writing – review & editing

    Affiliation Departamento de Física, Universidad Nacional de Colombia, Bogotá, Colombia

  • José L. Mateos

    Contributed equally to this work with: Jaspe U. Martínez-González, Alejandro P. Riascos, José L. Mateos

    Roles Investigation, Methodology, Writing – original draft, Writing – review & editing

    Affiliations Instituto de Física, Universidad Nacional Autónoma de México, Ciudad Universitaria, Ciudad de México, México, Centro de Ciencias de la Complejidad, Universidad Nacional Autónoma de México, Ciudad de México, México

Abstract

In this paper, we explore different methods to detect patterns in the activity of bus rapid transit (BRT) systems focusing on two aspects of transit: infrastructure and the movement of vehicles. To this end, we analyze records of velocity and position of each active vehicle in nine BRT systems located in the Americas. We detect collective patterns that characterize each BRT system obtained from the statistical analysis of velocities in the entire system (global scale) and at specific zones (local scale). We analyze the velocity records at the local scale applying the Kullback-Leibler divergence to compare the vehicular activity between zones. This information is organized in a similarity matrix that can be represented as a network of zones. The resulting structure for each system is analyzed using network science methods. In particular, by implementing community detection algorithms on networks, we obtain different groups of zones characterized by similarities in the movement of vehicles. Our findings show that the representation of the dataset with information of vehicles as a network is a useful tool to characterize at different scales the activity of BRT systems when geolocalized records of vehicular movement are available. This general approach can be implemented in the analysis of other public transportation systems.

Introduction

In the last decades, the study of cities has emerged from diverse scientific disciplines [1–5]. This integration is facilitated by the understanding of cities as complex systems [6, 7], offering a rich framework for the application of mathematical methods in urban studies [1, 3]. Furthermore, the recent availability of data records for diverse aspects of daily urban life has enabled the detailed characterization of human movement, allowing the identification of behavioral patterns at different scales [8–13]. In particular, one crucial aspect of urban dynamics is public transport, which is intertwined with key urban issues such as the optimization of existing infrastructure [14, 15], disease spread [16], social disparities [17, 18], economic inequalities [19] and segregation [20], among many others [1, 3, 7].

The complexity of urban mobility constitutes a challenge. However, when elements of this complexity are represented as a network, it becomes more feasible to discern its properties. This approach has proven effective in studying various other complex systems [21, 22]. For instance, by abstracting mobility infrastructure as a network [23–25], it is possible to characterize the physical connectivity of these systems. Similarly, the representation of human activity datasets as a network [26, 27] allows us to delve into deeper properties at different scales. In this manner, the methods of network science serve as a valuable bridge between available data and recently developed physical models, incorporating concepts such as phase transitions and percolation [28–30]. Furthermore, other network properties, like community structure, can reveal patterns in vehicular activity at different scales [11].

Building on these insights, we implement different tools of network science to analyze nine bus rapid transit (BRT) systems. The term BRT refers to a bus-based system in which vehicles travel over exclusive lanes and which fulfills other characteristics, such as stop infrastructure with entry systems based on electronic cards or a schedule of arrival times at the stops [31, 32]. BRT systems appeared in the 1970s [31, 33] and, nowadays, almost 200 systems exist worldwide [32]. This urban transit system is considered an alternative to other transport infrastructure like metro or light rail, due to its lower costs of construction, operation, and maintenance [33, 34], while meeting the transportation needs of urban residents [35]. Our focus on BRT systems is motivated by the reduced number of prior studies investigating their network infrastructure and the characterization of their vehicular activity.

In this research, we explore BRT systems using network science in two aspects. First, we examine the networks associated with the infrastructure of these systems to determine features related to their connectivity. In the second part, we explore networks associated with vehicular activity within the systems. For the analysis of the movement of vehicles, we initially consider 135 468 825 data records, each including a timestamp, position, and velocity of an active vehicle in one of the nine BRT systems. Using this information we track vehicular activity at both the global level, covering the entire system, and at a local level within specific zones. For the latter, we divide each system into polygons considering the position of stations. Different statistical methods including the Kullback-Leibler divergence are implemented to compare the vehicular activity between every pair of polygons in each system. The outcomes are organized into similarity matrices that provide information to generate networks where each node is a segment of the system and links are associated with similar activity of vehicles at the local level. We apply a community detection algorithm to identify the community structure within similarity networks; this particular organization of the network allows a multiscale characterization of the movement of vehicles in the systems explored. The results show that the methods of network science are an important tool to characterize the activity of BRT systems when geolocalized records of vehicular movement are available. The approach explored is general and can be useful in the study of other public transportation systems.

Materials and methods

Data description

In this section, we describe the datasets of BRT systems considered in our research. We start with the information in two websites [36, 37] and other sources on the internet to collect links to useful databases. We explore 187 BRT systems worldwide using the following procedure. First, we identify cities having open access to the data of transportation systems; the protocol of the required data is called “Real Time” because the information is updated within a certain period so that the datasets reflect the current state of the transportation systems. In this information, we check whether the available data contain records of the bus rapid transit system. Then, once the data are identified, it is necessary to check whether the geographical coordinates (latitude and longitude) and the velocity of each active vehicle in the system are included in the available records. From this exploration, we identify nine BRT systems in the Americas providing the information to be analyzed. The BRT systems considered in our research are located in the cities of Louisville (Kentucky, USA), Austin (Texas, USA), Nashville (Tennessee, USA), San Antonio (Texas, USA), Maui (Hawaii, USA), Brampton (Ontario, Canada), Rio de Janeiro (Brazil), Hartford (Connecticut, USA) and Mexico City (Mexico). The data collection and analysis in this study comply with the terms and conditions of the data source.

We download the data from each official source, considering their particular operational characteristics including the refresh time of the data packages and the service schedule. The datasets were processed into a human-readable format and saved in comma-separated value files at the end of each day. The information was downloaded from 2022 May 20th to 2023 May 19th. One particular case is the BRT system in Rio de Janeiro, whose datasets were downloaded from a Google Cloud project where the authorities of the city store its open data [38]; these records include the activity of vehicles from 2022 May 1st to 2023 April 30th.

In Table 1, we summarize the information considered in this research for nine BRT systems. We present the city where the system is located, the name of the BRT system as well as the link where the data was obtained. We also include the number of days in which it was possible to download at least one record of the geographic coordinates, along with the velocity v of vehicles in the system for which 0 km/h < v ≤ 90 km/h and the total number of these records. The last column shows the total number of active vehicles for which we have at least one record over the download period. The values presented in the table give an idea of the massive amount of data that is available when considering the global activity of a BRT system. The continuous monitoring of vehicles presents a challenge for the analysis of this information. In this research, we explore several methods for studying vehicle activity both globally and in specific zones of the city. This approach allows us to classify and detect patterns in the movement of vehicles at different scales.
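The summary quantities described above can be illustrated with a minimal sketch. The record layout (a tuple of date, vehicle identifier and velocity in km/h) and the function name `summarize` are hypothetical, and counting days and vehicles only over records with 0 km/h < v ≤ 90 km/h is a simplifying assumption rather than the procedure used to build Table 1.

```python
def summarize(records, v_max=90.0):
    """Count days, valid records and vehicles in a list of GPS records.

    records: iterable of (date, vehicle_id, velocity_kmh) tuples
             (hypothetical layout, for illustration only).
    """
    # Keep only records with 0 < v <= v_max (km/h).
    valid = [(d, vid, v) for d, vid, v in records if 0.0 < v <= v_max]
    return {
        "days": len({d for d, _, _ in valid}),          # days with at least one valid record
        "records": len(valid),                          # number of valid records
        "vehicles": len({vid for _, vid, _ in valid}),  # vehicles with at least one valid record
    }


# Synthetic records, not data from the study.
records = [
    ("2022-05-20", "bus1", 30.0),
    ("2022-05-20", "bus2", 0.0),    # stopped vehicle, excluded
    ("2022-05-21", "bus1", 95.0),   # above 90 km/h, excluded
    ("2022-05-21", "bus3", 45.5),
]
summary = summarize(records)
```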

Table 1. Datasets considered for nine BRT systems.

Information about the city where each system is located, the name of the system, the link where the data is available, the number of days considered in the dataset, the total number of records with velocities 0 km/h < v ≤ 90 km/h and the total number of active vehicles.

https://doi.org/10.1371/journal.pone.0312541.t001

Infrastructure networks

An important characteristic of transportation networks is their connectivity, due to the need in urban transport to reach as many points of interest as possible. A first approach to understanding the connectivity of BRT systems is to focus on end and transfer stations. Transfer stations show how the lines of a system are connected, while end stations determine the direction that passengers take once they enter the system. From this point of view, it is possible to represent a transportation system as an undirected graph, where the end and transfer stations are nodes and the edges are the paths between them; we refer to this graph describing a BRT system as its infrastructure network. This particular representation was introduced by Derrible for the analysis of metro systems [23]. In Fig 1, we show the infrastructure networks obtained for the nine BRT systems considered in this research.
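As an illustration of this representation, the following minimal sketch builds an infrastructure network from a description of lines as ordered lists of stations. The function name `infrastructure_network`, the input layout and the station labels are hypothetical; the sketch assumes that a transfer station is any station shared by two or more lines and does not treat special cases such as circular lines.

```python
from collections import Counter

import networkx as nx


def infrastructure_network(lines):
    """Build a graph whose nodes are end and transfer stations.

    lines: dict mapping a line name to its ordered list of station names
           (hypothetical layout, for illustration only).
    """
    # Number of lines that serve each station.
    counts = Counter(s for stations in lines.values() for s in set(stations))
    G = nx.Graph()
    for stations in lines.values():
        last = len(stations) - 1
        # Keep the two end stations and every station shared with another line.
        kept = [s for k, s in enumerate(stations)
                if k in (0, last) or counts[s] > 1]
        G.add_nodes_from(kept)
        # Connect consecutive kept stations along the line.
        G.add_edges_from(zip(kept, kept[1:]))
    return G


# Two synthetic lines crossing at the transfer station "x".
lines = {
    "L1": ["a1", "a2", "x", "a3", "a4"],
    "L2": ["b1", "x", "b2"],
}
G = infrastructure_network(lines)
single = infrastructure_network({"L": ["s1", "s2", "s3"]})
```

With a single line, the sketch returns only the two end stations joined by one edge, which matches the two-node case described for a one-line system.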

Fig 1. Infrastructure networks of nine BRT systems.

Network representation of BRT systems considered in this research sorted by the number of nodes.

https://doi.org/10.1371/journal.pone.0312541.g001

The representation of BRT systems as networks allows a first analysis of their infrastructure with different well-known quantities and properties studied by network science [22]. The first two quantities that describe the systems are the number of nodes N and the number of edges $\mathcal{E}$. Furthermore, an essential quantity in the study of a network is the number of nodes connected to a node i, that is, the degree $k_i$. In infrastructure networks, a higher degree $k_i$ reveals that several lines intersect at the station i. A global view of this feature on undirected networks is given by the average degree 〈k〉 that satisfies $\langle k \rangle = \frac{1}{N}\sum_{i=1}^{N} k_i = \frac{2\mathcal{E}}{N}$. In addition, we can characterize how well-connected a transportation system is by counting the number of changes between lines a user has to make in order to reach a particular destination. If the number of changes is low, the system approaches an optimal case. This feature is represented in its graph by the number of edges $l_{ij}$ of the shortest path connecting a pair of nodes i, j. Globally, the entire network is characterized by the average length of the shortest path 〈l〉, defined by [47]

$$\langle l \rangle = \frac{1}{N(N-1)} \sum_{i \neq j} l_{ij}. \tag{1}$$

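Another common quantity that describes the structure of a network is the average clustering coefficient 〈C〉, defined as [47]

$$\langle C \rangle = \frac{1}{N} \sum_{i=1}^{N} C_i, \tag{2}$$

where $C_i = 0$ if $k_i = 1$ and

$$C_i = \frac{2\,\Delta_i}{k_i (k_i - 1)} \tag{3}$$

for $k_i \geq 2$, with $\Delta_i$ denoting the number of triangles that include node i. Then, 〈C〉 is a measure of the proportion of triangles in the network.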
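The quantities introduced above can be evaluated with standard tools. The following is a minimal sketch using networkx on a synthetic four-node graph (a triangle with one pendant node); the graph and its labels are illustrative and do not correspond to any of the systems studied.

```python
import networkx as nx

# Synthetic infrastructure network: nodes stand for end/transfer stations.
G = nx.Graph()
G.add_edges_from([("A", "B"), ("B", "C"), ("C", "A"), ("C", "D")])

N = G.number_of_nodes()
E = G.number_of_edges()

# Average degree of an undirected graph: <k> = 2E / N.
avg_degree = 2 * E / N

# Average shortest-path length over all pairs of distinct nodes, as in Eq (1).
# The graph must be connected for this call.
avg_path = nx.average_shortest_path_length(G)

# Average clustering coefficient, as in Eq (2); nodes with degree
# below 2 contribute C_i = 0.
avg_clustering = nx.average_clustering(G)
```

For this graph, the pendant node D has degree one and therefore contributes zero to the average clustering, while nodes A and B each belong to a closed triangle.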

Having described some common quantities to analyze networks, let us apply these measures to the infrastructure networks shown in Fig 1. Our findings are reported in Table 2. In the first three columns, we present general information: the city where the BRT system is located and the number of lines and stations in the data download period. The next five columns include information about the infrastructure networks describing each system. We show the number of nodes N, the total number of edges $\mathcal{E}$, the average degree 〈k〉, the average length of the shortest path 〈l〉 and the average clustering coefficient 〈C〉. All the information reported in Table 2 is sorted using the number of nodes N of the infrastructure networks, from small BRT systems such as Louisville’s Dixie Rapid to larger ones such as Mexico City’s Metrobus system. In particular, the BRT in Louisville has only two nodes because the system is composed of one line and, hence, only the end stations are considered as nodes.

Table 2. Properties of the infrastructure networks of nine BRT systems.

Information about the city where each system is located, the number of lines, and the number of stations. Also included are the size of the infrastructure network N, the number of edges $\mathcal{E}$, the average degree 〈k〉, the average length of the shortest path 〈l〉 as defined in Eq (1) and the average clustering coefficient 〈C〉 in Eq (2).

https://doi.org/10.1371/journal.pone.0312541.t002

Regarding the average degrees 〈k〉, their values lie in the range 1 ≤ 〈k〉 < 3. This result shows a particular characteristic of infrastructure networks that contrasts with complex networks found in the study of human mobility (for example in taxis [27], bike sharing systems [26] or airports [48]). In addition, the global connectivity of a transportation system is characterized by the number of line changes that users must make to reach a particular destination. If the number of changes is low, the system approaches an ideal case. This feature of a network is quantified by the average shortest path, and the results show that 1 ≤ 〈l〉 < 4. Therefore, in the BRT systems explored, passengers need on average a low number of transfers between lines to reach a destination. Finally, concerning the average clustering coefficient 〈C〉, our findings show that most of the systems have null values. For example, in Fig 1, the infrastructure networks associated with Louisville, Austin, Nashville, and San Antonio are graphs with no cycles, so 〈C〉 = 0. In the case of the system in Rio de Janeiro, the structure has one cycle with five nodes, which also produces 〈C〉 = 0. Only Brampton, Hartford, Maui, and Mexico City have nodes that form triangles within the network. In general, the existence of cycles of different sizes reveals alternative routes that a passenger can choose in order to reach the final station. This is an important feature to consider in urban transportation systems; a recent work on networks of metro systems has shown that the robustness of infrastructure networks increases when multiple paths in the system can connect two nodes [49].

Global activity of vehicles in BRT systems

In addition to the infrastructure of BRT systems, it is important to characterize the movement of vehicles. Given the specific features of these systems (exclusive lanes, velocity limits, and stops at designated bus stations), it is crucial to understand whether these particularities in the vehicular activity lead to the emergence of patterns at different scales in urban areas. In this section, we statistically analyze the velocity records of all the active vehicles in the nine BRT systems described in Table 1, for which we have the geographical coordinates and the speed of all the vehicles. The analysis of the velocity of vehicles accumulated over a certain period gives information about behavior patterns over the entire system.

In Fig 2, we present the probability density ρ(v) of the velocities v for the BRT systems for each month covered by our study. The nine systems are sorted alphabetically using the name of the cities where they are located, starting from Austin and ending with San Antonio. All the results for ρ(v) are obtained using bin counts with size Δv = 2 km/h; this bin size is maintained for all the systems. The color of each ρ(v) represents the number of the month analyzed, starting from May 2022 and ending in May 2023 (the values of the 13 months are encoded in the colorbar; in particular, May 2022 corresponds to 1 and May 2023 to 13).
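A probability density of this kind can be estimated from raw velocity records with a normalized histogram. The sketch below is a minimal illustration using NumPy; the function name `velocity_density` and the synthetic speeds are assumptions introduced for the example and are not part of the study.

```python
import numpy as np


def velocity_density(velocities, v_max=90.0, dv=2.0):
    """Return bin edges and the density rho(v) for records with 0 < v <= v_max."""
    v = np.asarray(velocities, dtype=float)
    v = v[(v > 0.0) & (v <= v_max)]          # discard stopped vehicles and outliers
    edges = np.arange(0.0, v_max + dv, dv)   # 0, 2, ..., 90 km/h
    # density=True normalizes the counts so that sum(rho * dv) = 1.
    # NumPy bins are half-open [a, b), except the last one, which is closed.
    rho, _ = np.histogram(v, bins=edges, density=True)
    return edges, rho


# Synthetic speeds in km/h, for illustration only.
rng = np.random.default_rng(0)
speeds = rng.normal(loc=35.0, scale=12.0, size=10_000)
edges, rho = velocity_density(speeds)
```

With Δv = 2 km/h and a maximum of 90 km/h, the density is described by 45 bins, and the same edges can be reused for every month or system so that the resulting curves are directly comparable.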

Fig 2. Monthly probability densities ρ(v) of velocities v at the global level.

Monthly probability densities ρ(v) as a function of the velocity v in the systems in (a) Austin, (b) Brampton, (c) Hartford, (d) Louisville, (e) Maui, (f) Mexico City, (g) Nashville, (h) Rio de Janeiro and (i) San Antonio. Each ρ(v) is colored according to the month analyzed from May 2022 to May 2023 (13 months codified in the colorbar in panel (a)). The values of ρ(v) are obtained using bin counts in the interval 0 km/h < v ≤ 90 km/h with size Δv = 2 km/h.

https://doi.org/10.1371/journal.pone.0312541.g002

The results in Fig 2 show that the distributions ρ(v) are associated with completely discrete, half-discrete, or continuous values of v. For example, some distributions contain completely discrete values, which is the case of the systems in Hartford (c) and Nashville (g), with relative maxima at specific values of v. This situation may be due to limitations in the GPS devices on the vehicles, which only record specific values of velocity. There are also systems with half-discrete distributions, that is, with a combination of data that include discrete and continuous values of the velocities. This is the situation for the systems in Austin (a), Brampton (b), Louisville (d), Maui (e), and San Antonio (i). Moreover, the results show two cases with continuous records of v: Mexico City (f) and Rio de Janeiro (h), the two largest systems considered in our study. In addition, the distributions ρ(v) of each system look similar from month to month. This feature is maintained regardless of the temporal window considered, weekly or daily, showing that statistically the global vehicular activity of bus rapid transit systems undergoes no major changes through time. Finally, the curves for ρ(v) show a relative maximum of frequencies in the range 20 km/h < v < 60 km/h, after which the probability quickly declines, so that records with v > 80 km/h are very rare. This result is explained by the speed limits of cities and the safety policies of public transportation systems.

Local activity of vehicles: From data to similarity networks

In the previous section, we explored the probability distributions ρ(v) of velocity at the global level in each system; the results show no major differences over time. In this section, we explore the data at the local level, partitioning all the systems into polygons or segments and analyzing the records within each of them. Each segment is described by an elongated rectangle defining a geographical zone delimited by two consecutive stations on the same line, with no crossing with another line. In more complex cases we define general polygons that consist of the fusion of several overlapping segments. The segmentation of a BRT system allows us to analyze the vehicular activity at the local level with a finer resolution, which gives detailed knowledge of vehicular activity.
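To make the idea of an elongated rectangular segment concrete, the following minimal sketch tests whether a position falls inside the rectangle whose long axis joins two consecutive stations. The function name `in_segment` and the parameter `half_width` are hypothetical, and the sketch assumes planar coordinates (for instance, positions already projected to metres) and two distinct stations; it is not the segmentation procedure used in the study.

```python
import numpy as np


def in_segment(point, station_a, station_b, half_width):
    """True if `point` lies in the rectangle whose long axis joins two stations.

    All positions are (x, y) pairs in planar coordinates (an assumption);
    `half_width` is a hypothetical parameter setting the rectangle width.
    """
    p, a, b = (np.asarray(x, dtype=float) for x in (point, station_a, station_b))
    axis = b - a
    length = float(np.linalg.norm(axis))
    unit = axis / length
    rel = p - a
    along = float(np.dot(rel, unit))                        # position along the axis
    across = abs(float(unit[0] * rel[1] - unit[1] * rel[0]))  # distance to the axis
    return bool(0.0 <= along <= length and across <= half_width)


inside = in_segment((5.0, 0.5), (0.0, 0.0), (10.0, 0.0), half_width=1.0)
too_far = in_segment((5.0, 2.0), (0.0, 0.0), (10.0, 0.0), half_width=1.0)
beyond = in_segment((11.0, 0.0), (0.0, 0.0), (10.0, 0.0), half_width=1.0)
```

Records for which the test returns True for a given pair of stations would be the ones contributing to the local statistics of that segment.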

In this way, if a BRT system is divided into $\mathcal{N}$ segments, the datasets of vehicular activity allow us to generate the set of probability densities $\rho_i(v)$ of the velocity v at the local level for each segment $i = 1, 2, \ldots, \mathcal{N}$. Once we obtain the distributions $\rho_i(v)$ for a specific period of time, the results can be compared using the Kullback-Leibler divergence to measure the similarity between a pair of probability densities $\rho_i(v)$ and $\rho_j(v)$, defined as [11, 50]

$$D_{\mathrm{KL}}(\rho_i \,\|\, \rho_j) = \int_{0}^{v_{\max}} \rho_i(v) \log\left[\frac{\rho_i(v)}{\rho_j(v)}\right] dv, \tag{4}$$

where $v_{\max}$ = 90 km/h is the maximum speed considered in the analysis.
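For binned densities, the Kullback-Leibler divergence of Eq (4) can be approximated by a sum over bins. The sketch below is one possible discretization; the function name `kl_divergence` and the small constant `eps`, added to avoid logarithms of zero in empty bins, are assumptions of this example and are not taken from the source.

```python
import numpy as np


def kl_divergence(rho_i, rho_j, dv=2.0, eps=1e-12):
    """Discrete Kullback-Leibler divergence between two binned densities.

    rho_i, rho_j: densities evaluated on the same bins of width dv.
    eps: hypothetical smoothing constant that keeps empty bins finite.
    """
    p = np.asarray(rho_i, dtype=float) * dv   # densities -> bin probabilities
    q = np.asarray(rho_j, dtype=float) * dv
    p = (p + eps) / np.sum(p + eps)           # smooth and renormalize
    q = (q + eps) / np.sum(q + eps)
    return float(np.sum(p * np.log(p / q)))


# Two synthetic densities on two bins of width 2 km/h.
rho_a = [0.25, 0.25]     # bin probabilities 0.5, 0.5
rho_b = [0.125, 0.375]   # bin probabilities 0.25, 0.75
d_ab = kl_divergence(rho_a, rho_b)
d_ba = kl_divergence(rho_b, rho_a)
```

In this example the two directed values differ, which illustrates that the divergence is not symmetric under the exchange of the two densities.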

By definition, $D_{\mathrm{KL}}(\rho_i \,\|\, \rho_j)$ is not equivalent to $D_{\mathrm{KL}}(\rho_j \,\|\, \rho_i)$; it is therefore useful to consider the symmetric measure [11]

$$\mathcal{D}_{ij} = D_{\mathrm{KL}}(\rho_i \,\|\, \rho_j) + D_{\mathrm{KL}}(\rho_j \,\|\, \rho_i). \tag{5}$$

Then, through the use of $\mathcal{D}_{ij}$ in Eq (5), all the values generated from the comparison of the vehicular activity for all the segments are collected in a symmetric matrix with elements $\mathcal{D}_{ij}$. In particular, if $\mathcal{D}_{ij}$ is close to zero, the vehicular activity in segments i and j is similar, whereas the opposite case is represented by a large value. Hence, we refer to the array with elements $\mathcal{D}_{ij}$ as the similarity matrix, because it quantifies the similarity between the vehicular activity in all the polygons in a BRT system.
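A similarity matrix of this type can be assembled by evaluating the symmetric measure for every pair of segments. The sketch below assumes that the symmetric measure of Eq (5) is the sum of the two directed divergences, which is one common choice and should be checked against the original definition; the function names and the synthetic probability vectors are illustrative.

```python
import numpy as np


def kl(p, q):
    """Directed Kullback-Leibler divergence between strictly positive vectors."""
    return float(np.sum(p * np.log(p / q)))


def similarity_matrix(P):
    """Symmetric matrix of pairwise distances between rows of P.

    P: array of shape (n_segments, n_bins); each row is a strictly
       positive probability vector for one segment.
    Assumption: the symmetric measure is the sum of both directed values.
    """
    n = P.shape[0]
    D = np.zeros((n, n))
    for i in range(n):
        for j in range(i + 1, n):
            D[i, j] = D[j, i] = kl(P[i], P[j]) + kl(P[j], P[i])
    return D


# Three synthetic segments; the first two have identical activity.
P = np.array([[0.5, 0.3, 0.2],
              [0.5, 0.3, 0.2],
              [0.1, 0.2, 0.7]])
D = similarity_matrix(P)
```

Segments with identical distributions give a zero entry, and the entry grows as the two velocity distributions become more different.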

Furthermore, it is possible to apply the methods of network science to analyze the similarity matrix of each BRT system. To this end, we define a network of size $\mathcal{N}$ in which nodes represent segments and edges the similarity in the movement of vehicles. In this way, two nodes i, j are connected if the respective segments have similar probability densities $\rho_i(v)$ and $\rho_j(v)$. To generate this structure, it is necessary to define what is considered sufficiently similar using a threshold value H. If two segments have a value $\mathcal{D}_{ij}$ in Eq (5) lower than or equal to H, then these segments are considered similar. All this information defines a similarity network for each value H. The respective adjacency matrix of the network is denoted as A(H), with elements i, j given by [11, 13]

$$A_{ij}(H) = \begin{cases} 1 & \text{if } \mathcal{D}_{ij} \leq H, \\ 0 & \text{otherwise.} \end{cases} \tag{6}$$

Additionally, we require $A_{ii}(H) = 0$ for $i = 1, 2, \ldots, \mathcal{N}$, to avoid self loops. From the symmetry of the distance $\mathcal{D}_{ij}$ follows the symmetry of A(H), defining an undirected network.
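The thresholding step of Eq (6) amounts to a comparison of each matrix entry with H followed by the removal of the diagonal. A minimal sketch with NumPy is shown below; the function name `adjacency_from_similarity` and the values of the synthetic matrix are illustrative.

```python
import numpy as np


def adjacency_from_similarity(D, H):
    """Binary adjacency matrix: 1 where the distance is at most H, no self loops."""
    A = (np.asarray(D, dtype=float) <= H).astype(int)
    np.fill_diagonal(A, 0)   # enforce A_ii = 0
    return A


# Synthetic symmetric similarity matrix for three segments.
D = np.array([[0.00, 0.05, 0.40],
              [0.05, 0.00, 0.08],
              [0.40, 0.08, 0.00]])
A = adjacency_from_similarity(D, H=0.1)
```

In this example the threshold H = 0.1 links the first and second segments and the second and third segments, while the first and third remain unconnected because their distance exceeds H.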

In Fig 3 we illustrate the ideas and concepts presented in this section. We consider the simplest BRT system studied, which is located in the city of Louisville, Kentucky. In Fig 3(a) we show the segmentation of this system into polygons, each of them colored with the scale shown in the upper colorbar. The polygons are indexed according to the location, from south to north, of their respective geographical centers. Here, it is possible to see the single line in the city of Louisville. In Fig 3(b) we plot the corresponding probability density of velocities ρi(v) for each polygon i for the entire dataset. The values are obtained using bins with size Δv = 2 km/h. The color of each distribution ρi(v) corresponds to the color map shown in panel (a). In this image, we can see some characteristics of the different zones of the system, for example, those whose activity differs most from the mean distribution of the system, which is plotted with a dashed line and corresponds to the analysis of the entire dataset. In contrast with the global analysis shown in Fig 2(d), the differences between the probability densities ρi(v) are more noticeable at the local scale.

Fig 3. Vehicular activity at the local level for the BRT system in Louisville.

(a) Segmentation of the system with polygons. (b) Probability density ρi(v) for each polygon considering all the velocity records of vehicles in the interval 0 km/h < v ≤ 90 km/h in each segment (bin counts are obtained using the size Δv = 2 km/h). The dashed curve represents the probability density ρ(v) for the entire system. (c) Similarity matrix formed with the elements $\mathcal{D}_{ij}$ for all pairs of segments i, j; the values of the elements are encoded in the colorbar. (d) Adjacency matrix A(H) describing a similarity network obtained after applying a threshold value H = 0.1 on (c); binary entries are encoded in white for 0 and black for 1.

https://doi.org/10.1371/journal.pone.0312541.g003

In Fig 3(c) we depict the similarity matrix with dimension $\mathcal{N} \times \mathcal{N}$, where $\mathcal{N}$ is the number of polygons in the BRT system in the city of Louisville. The elements $\mathcal{D}_{ij}$ are calculated using Eq (5); hence, we obtain a symmetric matrix whose values are represented by the respective colorbar. In Fig 3(d), we show the adjacency matrix A(H) obtained from the values $\mathcal{D}_{ij}$ using the threshold value H = 0.1. The results shown in Fig 3 illustrate the methodology that can be implemented to express as a similarity network the massive dataset with geographic coordinates of vehicles in a BRT system and their velocity information. We use the parameter H as a threshold to transform the similarity matrices into undirected networks, establishing a criterion for determining whether the vehicular activity within two given segments can be considered similar or not. The study of all the possible similarity networks generated by varying H, from isolated nodes to a fully connected graph, allows the identification of similarities at different scales using community detection algorithms; these results are presented in the next section for the nine BRT systems considered in this study.

Results and discussion

Similarity matrices

Having discussed the general methods to map the data of the movement of vehicles to a network, in this section we apply these tools to analyze the datasets from our nine BRT systems. This approach allows us to identify properties common to this type of transportation system and features that characterize each system individually. In Fig 4, we present the similarity matrices for all systems, sorted by their respective number of segments (polygons) $\mathcal{N}$; this is the number of nodes of the similarity network. In that order, the systems are Louisville, Austin, San Antonio, Brampton, Rio de Janeiro, Maui, Nashville, Hartford and Mexico City. The respective similarity matrices are obtained through the comparison of the vehicular activity in each segment, with values $\mathcal{D}_{ij}$ obtained using Eq (5) and encoded in the colorbar. Using $\mathcal{D}_{ij}$, we can apply a threshold value H to transform similarity matrices into adjacency matrices and, therefore, obtain the networks associated with each system. In the following, we apply different network science methods to identify properties of collective patterns that arise at different scales in BRT systems.

Fig 4. Similarity matrices for nine BRT systems.

Matrices are sorted by the number of segments $\mathcal{N}$ considered in each system. The value of each entry $\mathcal{D}_{ij}$ is obtained using Eq (5) and encoded in the colorbar.

https://doi.org/10.1371/journal.pone.0312541.g004

Properties of similarity networks describing BRT systems.

A central concept in network theory is the connected component. In particular, an undirected network is considered connected if there is a path between every pair of nodes i, j; otherwise, it is classified as a disconnected network. In a disconnected network, two or more connected components exist; these are connected subgraphs of the network [21]. In complex systems represented as networks, this concept helps to identify groups of elements with no interaction or relation to other groups. Furthermore, examining the relationship between the sizes of connected components is crucial, as it relates to percolation in networks [30, 51] and, consequently, to phase transitions in the system [30, 47, 51, 52]. Percolation can be observed in networks where the number of links grows in direct relation to a certain parameter. A specific case of connected components is the giant component, also known as the Largest Connected Component (LCC). A criterion for determining the percolation point involves comparing the sizes of the LCC and the Second Largest Connected Component (SLCC). Initially, both connected components grow together until the SLCC reaches its maximum size. Subsequently, the SLCC decreases rapidly, while the LCC undergoes a sudden growth, leading to an explosion of connectivity. At this critical point, percolation occurs.

We search for connected components in the networks associated with vehicular movement to identify groups of polygons where activity can be considered similar. In particular, we focus on the LCC. In Fig 5 we show some features of the LCC for each value of H in the range [0, 1]. In Fig 5(a) we plot the fraction of nodes νLCC in the LCC with respect to the total number of segments $\mathcal{N}$; the values obtained are presented as a function of H. The results exhibit an interval of values of H with a rapid growth of νLCC.

Fig 5. LCC size of similarity networks for different values of H.

(a) Fraction of nodes νLCC in the LCC as a function of the value H for the nine BRT systems. The dots indicate the value Hc where percolation is detected. (b) LCC for the value Hc indicated in each panel (the systems are sorted according to the values of Hc).

https://doi.org/10.1371/journal.pone.0312541.g005

The abrupt change in the characteristics of the systems is closely associated with a phase transition and, in networks, this behavior is characteristic of percolation. In Fig 5(a) we highlight with dots the value Hc where percolation occurs for each network. This result is obtained by comparing the sizes of the LCC and the SLCC. Notably, the Nashville system exhibits the fastest growth in similarity, indicating that for a low value of H a significant portion of the system displays similar vehicular activity. In Fig 5(b) we show the graphs of the networks when the percolation point is reached; the systems are sorted using the corresponding threshold Hc. The color of the nodes corresponds to the color of the curves in Fig 5(a).
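One way to locate the percolation point numerically is to sweep the threshold H and record the sizes of the two largest connected components. The sketch below takes Hc as the threshold at which the second-largest component attains its maximum size, which is one possible reading of the criterion described above; the function names and the synthetic similarity matrix are assumptions of this example.

```python
import networkx as nx
import numpy as np


def component_sizes(D, H):
    """Sizes of the largest and second-largest connected components at threshold H."""
    A = (D <= H).astype(int)
    np.fill_diagonal(A, 0)
    G = nx.from_numpy_array(A)
    sizes = sorted((len(c) for c in nx.connected_components(G)), reverse=True)
    sizes.append(0)   # guarantees a second entry when the graph is connected
    return sizes[0], sizes[1]


def percolation_threshold(D, thresholds):
    """Threshold at which the second-largest component is largest (first maximum)."""
    slcc = [component_sizes(D, H)[1] for H in thresholds]
    return thresholds[int(np.argmax(slcc))]


# Synthetic matrix: two groups of three segments, tight within and far between.
D = np.full((6, 6), 0.5)
D[:3, :3] = 0.1
D[3:, 3:] = 0.2
np.fill_diagonal(D, 0.0)

thresholds = [0.05, 0.1, 0.2, 0.5]
H_c = percolation_threshold(D, thresholds)
```

In this synthetic case the second group forms at H = 0.2, where both components have three nodes, and the two groups merge into a single component at H = 0.5, so the sketch reports H = 0.2 as the point just before the jump in the largest component.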

Community structure methods.

Once a network is defined, it is possible to identify groups of nodes highly connected among themselves and poorly connected with nodes outside. This structure is crucial because it shows similar characteristics in a group of elements in the system represented by a network. This feature is commonly referred to as a community structure, and it constitutes a fundamental topic in network science [53]. Over the last two decades, this field has proven successful in studying various complex systems [54]. Different algorithms have been developed to detect communities, addressing various types of networks. Nevertheless, the definition of community is quite vague, and this is one of the reasons for the variety of methods to detect them [21]. Some of the most popular algorithms are those based on modularity [53].

In the following, we explore the community structure in all the LCCs to determine whether there are groups of polygons with similar vehicular activity. We use the Clauset-Newman-Moore greedy modularity maximization [55], implemented in the library networkx in Python [56]. In Fig 6, we plot the total probability density of velocities of the two communities found within the LCC for the H value indicated in each plot. The panels represent cities in alphabetical order: (a) Austin, (b) Brampton, (c) Hartford, (d) Louisville, (e) Maui, (f) Mexico City, (g) Nashville, (h) Rio de Janeiro, and (i) San Antonio. The results consider velocities in the range 0 < v ≤ 90 km/h and a bin size Δv = 2 km/h. The main feature observed in Fig 6 across all systems is the presence of a low-velocity community (represented with thick continuous black lines) and a high-velocity community (represented with thin red lines). Thus, using the threshold H, it is possible to categorize polygons based on their velocities, exclusively utilizing the data generated by the vehicles in BRT systems. A similar analysis was performed using a threshold H that generates three communities.
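The community detection step can be sketched with the networkx function `greedy_modularity_communities`, which implements the Clauset-Newman-Moore greedy modularity maximization. The graph below is a synthetic similarity network (two triangles joined by one edge, plus an isolated node) introduced only for illustration; it does not correspond to any of the systems studied.

```python
import networkx as nx
from networkx.algorithms.community import greedy_modularity_communities

# Synthetic similarity network: nodes stand for segments.
G = nx.Graph()
G.add_edges_from([(0, 1), (1, 2), (0, 2),    # first group of similar segments
                  (3, 4), (4, 5), (3, 5),    # second group
                  (2, 3)])                   # single link between the groups
G.add_node(6)                                # segment with no similar neighbor

# Restrict the analysis to the largest connected component (LCC).
lcc_nodes = max(nx.connected_components(G), key=len)
lcc = G.subgraph(lcc_nodes).copy()

# Greedy modularity maximization on the LCC.
communities = greedy_modularity_communities(lcc)
groups = sorted(sorted(c) for c in communities)
```

The velocity records of all segments in each detected group can then be pooled to obtain one probability density per community.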

Fig 6. Vehicular activity in a classification with two communities.

Probability density ρ(v) for the segmentation of the systems in: (a) Austin, (b) Brampton, (c) Hartford, (d) Louisville, (e) Maui, (f) Mexico City, (g) Nashville, (h) Rio de Janeiro and (i) San Antonio, considering v in the range 0 < v ≤ 90 km/h with a bin size Δv = 2 km/h. The results for the low-velocity community are presented in black and those for the high-velocity community in red. The threshold H is indicated in each panel.

https://doi.org/10.1371/journal.pone.0312541.g006
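A probability density of this kind can be estimated from the pooled velocity records of each community with a normalized histogram. The sketch below is a minimal example using NumPy; the function name and the synthetic velocity samples are hypothetical, and NumPy's bins are closed on the left, which differs slightly at the bin edges from an interval convention 0 < v ≤ 90 km/h.

```python
import numpy as np


def velocity_density(velocities, v_max=90.0, dv=2.0):
    """Estimate rho(v) for 0 < v <= v_max using bins of width dv (km/h)."""
    v = np.asarray(velocities, dtype=float)
    v = v[(v > 0) & (v <= v_max)]            # discard stopped vehicles and outliers
    edges = np.arange(0.0, v_max + dv, dv)   # 0, 2, ..., 90
    rho, edges = np.histogram(v, bins=edges, density=True)
    centers = 0.5 * (edges[:-1] + edges[1:])
    return centers, rho


# Synthetic velocity records (km/h) standing in for two communities.
rng = np.random.default_rng(0)
slow = rng.normal(15.0, 5.0, size=5000)
fast = rng.normal(40.0, 10.0, size=5000)

centers, rho_slow = velocity_density(slow)
_, rho_fast = velocity_density(fast)
print(centers[np.argmax(rho_slow)], centers[np.argmax(rho_fast)])
```

With `density=True`, the histogram is normalized so that the sum of ρ(v)Δv over all bins equals one, which makes densities of communities with different numbers of records directly comparable.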

The properties of the analysis with two and three communities are presented in Table 3. The table reports the names of the cities in alphabetical order, the value of H used to form two communities, and the value of H used to form three communities. It also presents the number of segments that form each community, as well as the values of 〈k〉 (the average number of segments in the same community whose vehicular activity is similar to that of a given segment), the average shortest path 〈l〉 describing the global connectivity of the community, and the global clustering coefficient 〈C〉. The last two columns report characteristics of vehicular activity within the group of segments in each community: the average velocity 〈v〉 and the standard deviation σv.

Table 3. Properties of the communities in the LCC.

The presented features correspond to the cases where two and three communities are formed. The first column lists the names of the cities analyzed. Then, we show the value of H used to generate the network, the communities detected, and the number of segments in each group. The next columns show the average degree 〈k〉, the average shortest path 〈l〉, and the average clustering 〈C〉. The last two columns include the average velocity 〈v〉 of vehicles in the entire community and the respective standard deviation σv.

https://doi.org/10.1371/journal.pone.0312541.t003
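Quantities of the type listed in Table 3 can be computed for the subgraph induced by each community. The following sketch uses networkx and NumPy; the function name, the dictionary keys, and the toy inputs are hypothetical, and the choices of `nx.average_clustering` for 〈C〉 and of the population standard deviation for σv are assumptions (for instance, `nx.transitivity` is an alternative definition of global clustering).

```python
import networkx as nx
import numpy as np


def community_properties(G, nodes, velocities=None):
    """Summarize the subgraph induced by one community (assumed non-empty)."""
    sub = G.subgraph(nodes)
    n = sub.number_of_nodes()
    props = {
        "segments": n,
        "k_mean": 2.0 * sub.number_of_edges() / n,   # average degree
        "C_mean": nx.average_clustering(sub),
        # The average shortest path is only defined for a connected subgraph.
        "l_mean": (nx.average_shortest_path_length(sub)
                   if nx.is_connected(sub) else float("nan")),
    }
    if velocities is not None:
        v = np.asarray(velocities, dtype=float)      # pooled records (km/h)
        props["v_mean"] = float(v.mean())
        props["v_std"] = float(v.std())              # population std (ddof=0)
    return props


# Toy example: a fully connected community of four segments.
G = nx.complete_graph(4)
props = community_properties(G, [0, 1, 2, 3], velocities=[10.0, 20.0, 30.0])
print(props)
```

For a fully connected community, every segment is similar to all the others, so 〈k〉 equals the number of segments minus one, while 〈l〉 and 〈C〉 are both equal to one.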

Our findings also show that by reducing the value of H (i.e., by increasing the requirement to consider two segments as similar), the nodes within the LCC reorganize into new communities. In particular, the values of 〈v〉 in Table 3 show a classification with two communities, one with low velocity and one with high velocity, whereas the results with three communities reveal a classification that also includes a medium-velocity community, so that the three groups have low, medium, and high average velocities, respectively. The notable exception to this classification is the system in Louisville, where segments are organized into only two communities. Moreover, the velocity classification is more pronounced in the systems of Hartford, Maui, and Rio de Janeiro, where 〈v〉 in the high-velocity community is almost double that in the low-velocity community. The value of H used for the segmentation of the system into three communities generates groups with a certain balance in their respective sizes, but we observe that some systems have two strong communities that gather the majority of nodes. This behavior is present in Hartford and Maui, where the LCC has one community with a small number of nodes in comparison with the two larger communities. Turning to the network properties associated with each community, these results provide insight into the similarity in vehicular activity. For example, we highlight the values of Louisville, where we obtain two fully connected graphs; thus, for this specific H, all the segments in the same community have similar vehicular activity. For the other systems, the magnitudes of 〈k〉 and 〈C〉 decrease as H is reduced, whereas 〈l〉 increases.
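The dependence of the community structure on H can be explored by scanning a set of thresholds and recording, for each one, the size of the LCC and the number of communities detected in it. The sketch below is a minimal example under stated assumptions: the function name, the random toy matrix, the grid of thresholds, and the linking rule (divergence at most H) are all hypothetical.

```python
import networkx as nx
import numpy as np
from networkx.algorithms.community import greedy_modularity_communities


def scan_thresholds(divergence, thresholds):
    """Return (H, LCC size, number of communities in the LCC) for each H."""
    D = np.asarray(divergence, dtype=float)
    rows = []
    for H in thresholds:
        A = (D <= H).astype(int)     # assumed rule: similar if divergence <= H
        np.fill_diagonal(A, 0)
        G = nx.from_numpy_array(A)
        lcc = G.subgraph(max(nx.connected_components(G), key=len))
        if lcc.number_of_edges() > 0:
            n_comm = len(greedy_modularity_communities(lcc))
        else:
            n_comm = 1               # a single isolated node
        rows.append((float(H), lcc.number_of_nodes(), n_comm))
    return rows


# Toy symmetric matrix with entries in [0, 1) standing in for divergences.
rng = np.random.default_rng(1)
M = rng.random((8, 8))
D = 0.5 * (M + M.T)
np.fill_diagonal(D, 0.0)

rows = scan_thresholds(D, np.linspace(0.0, 1.0, 6))
for H, lcc_size, n_comm in rows:
    print(f"H={H:.1f}  LCC size={lcc_size}  communities={n_comm}")
```

A scan of this type makes it possible to select values of H for which the LCC splits into a desired number of communities, for example two or three.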

The results in this section show that representing geolocalized information of vehicles in transportation systems as a similarity network of geographical zones allows the classification of the information at different scales, depending on the degree of similarity controlled by the threshold H. The approach is general and applicable to similar information from diverse transportation modes (for example, taxis, bicycles, and metro) and is not limited to the analysis of velocities, since additional information such as the state of traffic can be incorporated in the definition of the similarity. In the particular case of BRT systems, we can conclude that the method implemented is effective in classifying segments based on the velocities and positions of vehicles.

Conclusions

We study bus rapid transit (BRT) systems located across the Americas, focusing on two aspects: infrastructure and vehicular activity. In the first part of the research, we adapted a method used to study connectivity in metro systems [23] to explore properties of BRT systems. Our findings show that the infrastructure of BRT systems exhibits unique characteristics that contrast with other complex networks observed in human mobility such as taxi services, bike-sharing systems, or airports. In the second part, we analyze millions of records of active vehicles in BRT systems, including velocity and position (latitude and longitude). We explore vehicular activity across the entire system to outline activity patterns. Furthermore, we refine our analysis by examining the movement of vehicles after dividing the systems into segments (polygons). We compare the distributions of velocities within each system, identifying differences by applying the symmetrized Kullback-Leibler divergence to the probability densities of velocity records for every pair of polygons. These results are organized as a similarity matrix for each system. We apply a threshold value H to transform the similarity matrices into undirected networks. This approach allows us to establish a criterion for determining whether the vehicular activity within two given segments can be considered similar or not.
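The comparison between pairs of polygons can be sketched as follows. This is a minimal example, not the implementation used in the study: the function names and toy histograms are hypothetical, the symmetrization is taken here as the sum of the two directed divergences (an average is another common choice), a small constant `eps` is added to avoid logarithms of zero in empty bins, and two polygons are linked when their divergence is at most H.

```python
import numpy as np


def symmetrized_kl(p, q, eps=1e-12):
    """Symmetrized Kullback-Leibler divergence between two histograms."""
    p = np.asarray(p, dtype=float) + eps   # eps regularizes empty bins (assumption)
    q = np.asarray(q, dtype=float) + eps
    p /= p.sum()
    q /= q.sum()
    return float(np.sum(p * np.log(p / q)) + np.sum(q * np.log(q / p)))


def divergence_matrix(histograms):
    """Pairwise symmetrized divergences for a list of velocity histograms."""
    n = len(histograms)
    D = np.zeros((n, n))
    for i in range(n):
        for j in range(i + 1, n):
            D[i, j] = D[j, i] = symmetrized_kl(histograms[i], histograms[j])
    return D


def adjacency_from_threshold(D, H):
    """Undirected adjacency matrix: 1 if divergence <= H, without self-loops."""
    A = (np.asarray(D) <= H).astype(int)
    np.fill_diagonal(A, 0)
    return A


# Toy velocity histograms (counts per bin) for three polygons.
histograms = [
    [10, 30, 40, 15, 5],   # polygon 0
    [12, 28, 38, 16, 6],   # polygon 1, similar to polygon 0
    [2, 5, 13, 40, 40],    # polygon 2, shifted to higher velocities
]
D = divergence_matrix(histograms)
A = adjacency_from_threshold(D, H=0.1)
print(np.round(D, 3))
print(A)
```

In this toy case only polygons 0 and 1 are linked, since their velocity histograms are close, whereas polygon 2 remains isolated for the chosen H.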

After obtaining the adjacency matrices describing similarity networks, we proceed to study the properties of vehicular activity through its network representation. Initially, we analyze the size of the LCC for different values of the threshold H. We apply community detection algorithms to the LCC to identify similarities among the polygons. Our focus lies on analyzing cases where two or three communities are formed. As a result, we conclude that we can classify the polygons of the systems into groups characterized by velocities: high and low velocity for a classification with two communities; and high, medium, and low velocity for a segmentation of the BRT systems considering three communities. Additionally, we examine the network properties of each community to characterize the connectivity within them. For instance, by quantifying the average degree in the graph of each community, we can infer the average number of polygons that exhibit similar vehicular activity within a given community. The study of the similarity networks generated by varying H, from isolated nodes to a fully connected graph, allows the identification of patterns at different scales using community detection algorithms.

Our findings show that the methods of network science explored are a useful tool to characterize BRT systems at different scales using geolocalized records of vehicular activity. The study is focused on nine BRT systems in the Americas, where detailed records of vehicular activity are available. Expanding the scope to include systems from other continents could provide a more comprehensive understanding of the activity patterns of vehicles in BRT systems. The methods implemented lead to the unsupervised detection of regions with similar characteristics considering the velocities and positions of vehicles in BRT systems. Other studies can incorporate the statistical analysis of different quantities of interest, for example, schedule adherence of the vehicles, carbon emissions, or users' accessibility to stations. The classification of this information at different temporal scales is useful to deepen the understanding of urban transportation systems and, in collaboration with other professionals like transit specialists and urban planners, to design practical improvements that optimize operational aspects like efficiency and connectivity at the scale of specific regions or the entire system. This approach is general, can be used in the study of other public transportation systems, and paves the way for useful applications in the scientific study of urban areas.

References

  1. Batty M. The new science of cities. MIT press; 2013.
  2. Barthelemy M. The structure and dynamics of cities. Cambridge University Press; 2016.
  3. Barthelemy M. The statistical physics of cities. Nat Rev Phys. 2019;1(6):406–415.
  4. Verbavatz V, Barthelemy M. The growth equation of cities. Nature. 2020;587(7834):397–401. pmid:33208958
  5. Melikov P, Kho JA, Fighiera V, Alhasoun F, Audiffred J, Mateos JL, et al. Characterizing Urban Mobility Patterns: A Case Study of Mexico City. In: Shi W, Goodchild MF, Batty M, Kwan MP, Zhang A, editors. Urban Informatics. Springer The Urban Book Series. Springer Nature Singapore; 2021. p. 153–170.
  6. Bettencourt LMA. Introduction to Urban Science: Evidence and Theory of Cities as Complex Systems. The MIT Press; 2021.
  7. Rybski D, González MC. Cities as complex systems—Collection overview. PLOS ONE. 2022;17(2):e0262964. pmid:35213566
  8. González MC, Hidalgo CA, Barabási AL. Understanding individual human mobility patterns. Nature. 2008;453(7196):779–782. pmid:18528393
  9. Alessandretti L, Sapiezynski P, Sekara V, Lehmann S, Baronchelli A. Evidence for a conserved quantity in human mobility. Nat Hum Behav. 2018;2(7):485–491. pmid:31097800
  10. Alessandretti L, Aslak U, Lehmann S. The scales of human mobility. Nature. 2020;587(7834):402–407. pmid:33208961
  11. Martínez-González JU, Riascos AP. Activity of vehicles in the bus rapid transit system Metrobús in Mexico City. Sci Rep. 2022;12(1):98. pmid:34997045
  12. Pappalardo L, Manley E, Sekara V, Alessandretti L. Future directions in human mobility science. Nat Comput Sci. 2023;3(7):588–600. pmid:38177737
  13. Betancourt F, Riascos AP, Mateos JL. Temporal visitation patterns of points of interest in cities on a planetary scale: a network science and machine learning approach. Sci Rep. 2023;13(1):4890. pmid:36966183
  14. Pérez-Méndez D, Gershenson C, Lárraga ME, Mateos JL. Modeling adaptive reversible lanes: A cellular automata approach. PLOS ONE. 2021;16(1):e0244326. pmid:33395415
  15. Patwardhan S, Barthelemy M, Erkol S, Fortunato S, Radicchi F. Symmetry breaking in optimal transport networks. Nature Communications. 2024;15(1):3758. pmid:38704371
  16. Malik O, Gong B, Moussawi A, Korniss G, Szymanski BK. Modelling epidemic spread in cities using public transportation as a proxy for generalized mobility trends. Sci Rep. 2022;12(1):6372. pmid:35430595
  17. Barbosa H, Hazarie S, Dickinson B, Bassolas A, Frank A, Kautz H, et al. Uncovering the socioeconomic facets of human mobility. Sci Rep. 2021;11(1):8616. pmid:33883580
  18. Nilforoshan H, Looi W, Pierson E, Villanueva B, Fishman N, Chen Y, et al. Human mobility networks reveal increased segregation in large cities. Nature. 2023;624(7992):586–592. pmid:38030732
  19. Moro E, Calacci D, Dong X, Pentland A. Mobility patterns are associated with experienced income segregation in large US cities. Nat Commun. 2021;12(1):4633. pmid:34330916
  20. Neira M, Molinero C, Marshall S, Arcaute E. Urban segregation on multilayered transport networks: a random walk approach. Sci Rep. 2024;14(1):8370. pmid:38600261
  21. Newman M. Networks. Oxford University Press; 2018.
  22. Barabási AL. Network science. Cambridge: Cambridge University Press; 2016.
  23. Derrible S. Network Centrality of Metro Systems. PLOS ONE. 2012;7(7):e40575. pmid:22792373
  24. Boeing G. Urban spatial order: street network orientation, configuration, and entropy. Appl Netw Sci. 2019;4(1):67.
  25. Louf R, Roth C, Barthelemy M. Scaling in Transportation Networks. PLOS ONE. 2014;9(7):e102007. pmid:25029528
  26. Loaiza-Monsalve D, Riascos AP. Human mobility in bike-sharing systems: Structure of local and non-local dynamics. PLOS ONE. 2019;14(3):e0213106. pmid:30840674
  27. Riascos AP, Mateos JL. Networks and long-range mobility in cities: A study of more than one billion taxi trips in New York City. Sci Rep. 2020;10(1):4022. pmid:32132592
  28. Lampo A, Borge-Holthoefer J, Gómez S, Solé-Ribalta A. Multiple abrupt phase transitions in urban transport congestion. Phys Rev Res. 2021;3:013267.
  29. Olmos LE, Çolak S, Shafiei S, Saberi M, González MC. Macroscopic dynamics and the collapse of urban traffic. PNAS. 2018;115(50):12654–12661. pmid:30530677
  30. Li D, Fu B, Wang Y, Lu G, Berezin Y, Stanley HE, et al. Percolation transition in dynamical traffic network with evolving critical bottlenecks. PNAS. 2015;112(3):669–672. pmid:25552558
  31. Wirasinghe SC, Kattan L, Rahman MM, Hubbell J, Thilakaratne R, Anowar S. Bus rapid transit—a review. Int J Urban Sci. 2013;17(1):1–31.
  32. Ko J, Kim D, Etezady A. Determinants of Bus Rapid Transit Ridership: System-Level Analysis. J Urban Plan Dev. 2019;145(2):04019004.
  33. Trubia S, Severino A, Curto S, Arena F, Pau G. On BRT Spread around the World: Analysis of Some Particular Cities. Infrastructures. 2020;5(88).
  34. Shah SAR, Shahzad M, Ahmad N, Zamad A, Hussan S, Aslam MA, et al. Performance Evaluation of Bus Rapid Transit System: A Comparative Analysis of Alternative Approaches for Energy Efficient Eco-Friendly Public Transport System. Energies. 2020;13(1377).
  35. Basso LJ, Feres F, Silva HE. The efficiency of bus rapid transit (BRT) systems: A dynamic congestion approach. Transp Res B Methodol. 2019;127:47–71.
  36. TransitFeed, https://www.transit.land/
  37. TransitLand, https://transitfeeds.co
  38. BRT Río (Rio de Janeiro), https://www.data.rio/documents/transporte-rodovi%C3%A1rio-hist%C3%B3rico-de-gps-do-brt/about
  39. Dixie Rapid (Louisville), http://gtfsrealtime.ridetarc.org/realtime/Vehicle/VehiclePositions.pb
  40. Metro Rapid (Austin), https://data.texas.gov/download/eiei-9rpf/application%2Foctet-stream
  41. BRT Lite (Nashville), http://transitdata.nashvillemta.org/TMGTFSRealTimeWebService/vehicle/vehiclepositions.pb
  42. VIA Prímo (San Antonio), http://gtfs.viainfo.net/vehicle/vehiclepositions.pb
  43. Maui Bus (Maui), https://mauibus.org/gtfs-rt/vehiclepositions
  44. Züm (Brampton), https://nextride.brampton.ca:81/API/VehiclePositions?format=gtfs.proto
  45. CT Fastrak (Hartford), https://s3.amazonaws.com/cttransit-realtime-prod/vehiclepositions.pb
  46. Metrobús (Mexico City), http://app.citi-mb.mx/GTFS-RT/vehiculosPosicion
  47. Barrat A, Barthélemy M, Vespignani A. Dynamical Processes on Complex Networks. Cambridge: Cambridge University Press; 2008.
  48. Guimerà R, Mossa S, Turtschi A, Amaral LAN. The worldwide air transportation network: Anomalous centrality, community structure, and cities’ global roles. PNAS. 2005;102(22):7794–7799. pmid:15911778
  49. Eraso-Hernandez LK, Riascos AP, Michelitsch TM, Wang-Michelitsch J. Evolution of transport under cumulative damage in metro systems. Int J Mod Phys C. 2024;35(04):2450037.
  50. Kullback S, Leibler RA. On Information and Sufficiency. Ann Math Stat. 1951;22(1):79–86.
  51. Bradde S, Bianconi G. Percolation transition and distribution of connected components in generalized random network ensembles. J Phys A: Math Theor. 2009;42(19):195007.
  52. Ambühl L, Menendez M, González MC. Understanding congestion propagation by combining percolation theory with the macroscopic fundamental diagram. Commun Phys. 2023;6(1):26. pmid:38665407
  53. Cimini G, Squartini T, Saracco F, Garlaschelli D, Gabrielli A, Caldarelli G. The statistical physics of real-world networks. Nat Rev Phys. 2019;1(1):58–71.
  54. Fortunato S, Newman MEJ. 20 years of network community detection. Nat Phys. 2022;18(8):848–850.
  55. Clauset A, Newman MEJ, Moore C. Finding community structure in very large networks. Phys Rev E. 2004;70:066111. pmid:15697438
  56. Hagberg AA, Schult DA, Swart PJ. Exploring Network Structure, Dynamics, and Function using NetworkX. In: Varoquaux G, Vaught T, Millman J, editors. Proceedings of the 7th Python in Science Conference. Pasadena, CA USA; 2008. p. 11–15.