Tuesday, May 30, 2017

Blessing in Disguise

Over the past week, Public Transit Analytics replaced a major component of the Score Generator software. One of the differentiating features of the Score Generator is the care that it gives to accurately determining what is reachable by walking in a public transit journey. When the virtual rider starts their journey or departs a transit vehicle, the stops that they can reach are not just those located in some radius of the point. Therefore, when Public Transit Analytics replaced the Score Generator's source of walking distance measurements, the decision to do so was made with utmost care. It is one with considerable implications, and some surprising benefits.

Before exploring these implications, it is useful to understand the process that the Score Generator uses to determine which destinations are in reach by walking. Recall that a Point Utility score is computed with a center point and a duration. When first run, the Score Generator builds two sets of points. The origin set contains the center point and every transit stop. The destination set contains every transit stop and points corresponding to the center of every Sector. The Score Generator calculates the straight-line distance between each origin and each destination. It then uses a walking speed estimate to convert these distances into times. Since these times are never longer than the actual walking time between the points, and no walking time can be greater than the duration, it retains only the measurements that are under that duration. These are called the candidate distance measurements.

Then, as the virtual rider journeys through the transit network, it checks how much time it has left to continue and considers all the candidate distance estimates for its current location that are less than or equal to the remaining duration. It takes these candidates and uses a software component called the Distance Client to get accurate walking directions to each destination. These directions include a total walking time, which provides the final filter on which destinations are reachable.

Prior to last week, the Distance Client relied on Google's Distance Matrix API to get walking times and distances. Google makes the Distance Matrix API available as a pay-per-use web service with no minimums and no upfront costs. For that reason, its presence made it possible to develop the Score Generator without needing considerable domain knowledge in mapping and wayfinding. Unfortunately, Google imposes a 100,000 request per day maximum for its standard pricing plan. In attempting to calculate a full Network Utility for a 10,000 Sector map of Seattle, it became clear that somewhere around four million distance measurements would be needed. Thus, acquiring the data would take around 40 days. Though premium service plans do exist, they are intended for usage patterns that involve a large number of continuous requests, not occasional periods of very heavy use.

Thus, current Utility maps from Public Transit Analytics use map data from OpenStreetMap with a locally-running instance of GraphHopper providing directions. This new solution maintains the most important factors of walking distance measurements: paths that account for the street layout and times that account for slowdowns and speedups from going up and down hills. However, to say the solutions yield identical maps would not be accurate. Compare the previous version of the Outbound Point Utility map from the Public Transit Analytics office (interactive version) with the current one (interactive version).
Overall, the current walking distance measurement source results in many more destinations being treated as reachable more often.  Based on several spot checks, variation appears to principally come from walking speed differences rather than routing differences. The open source nature of GraphHopper's navigation software gives some insight as to why. It appears that for most scenarios, it uses a walking speed of five kilometers per hour. This is a very typical speed used to model the preferred human walking speed. Google's speed selection algorithm is unknown, but appears to be slower on average.

That changing walking distance computation considerably alters Utility is an important observation. Both GraphHopper and Google intend to model what can be reached by some hypothetical average human being. An individual request to each service will probably result in only a minute or two of time variation. However, the aggregate impact when computing Point Utility is considerable. If so much variation exists between two systems trying to measure the same thing, even more variation must exist among the many riders of transit systems who, for a variety of reasons, may not behave anything like the average.

So far, the Public Transit Analytics blog has focused on how using the Score Generator makes transit networks more useful. For the network to also be just, it is necessary to ask for whom the network is being made more useful. If an individual's ability to move around the world is sufficiently different than the assumptions that walking distance calculations make, changes that make the network more useful for the average person may seriously degrade the network for that individual. Much like how using real schedule data instead of average transfer times results in a more genuine model of a transit network, a truly accurate model must also have the ability to capture a broad range of pedestrian abilities and preferences, not just some average.

Fortunately, using walking measurements based on open source solutions like OpenStreetMap and GraphHopper enables exactly that; the maps and software can be modified extensively. Transit planners working with Public Transit Analytics can opt for Utility maps that show their transit network from the perspectives of individual transit riders, accounting for factors ranging from mobility impairments, to age, to preferences. Though the need to change the walking distance measurement source was borne out of technical necessity, it has in fact been a blessing in disguise. By helping ensure that the unique needs of individuals are not lost amidst optimizing for averages, it has helped Public Transit Analytics further commit to its core tenet of justice.

Saturday, May 20, 2017

Foundations of Evaluating Public Transit Networks, Part 6: Similar, Rather than Different

Last week in the Foundations of Evaluating Public Transit Networks series, I highlighted the fact that shifting the center of a Point Utility computation a small amount can have a substantial impact on the reachability map and the corresponding score. This presented the problem of how to measure the utility of a whole network when individual points may tell very different stories about how useful the network is. This post describes one strategy for solving this problem by focusing on how Point Utility measurements from different centers are similar rather than how they are different, and using techniques from the discipline of information theory to define this similarity in a rigorous way.

To ease this discussion, this post makes use of the following terminology. Network Utility (NU) is the hypothetical measurement of how useful a transit network is using Public Transit Analytics's definition of useful. In theory, it would be calculated by running a series of full-day Point Utility computations centered at each Sector of the service area. As explained in the last post, this is a very difficult measurement to make, owing to the large number of Sectors from which Point Utility calculations must be run. A related concept is Cumulative Point Utility (CPU). Rather than aggregating the results from every Sector, CPU is the result of generating the Point Utilities at some sample of the Sectors' centers and combining these. CPU has the benefit of being flexible in the amount of computation required; there are no restrictions on how many or few samples make up a CPU calculation. There is, of course, the tradeoff that fewer samples will produce results further away from the true Network Utility.

Information theory comes in handy in establishing what "further away" means. Mutual information is an information theoretic technique that measures how much information one random variable indicates about another. It uses the joint and marginal probability distributions of two random variables. Thus, to measure how far away one Point Utility measurement is from another, it must be possible to model a Point Utility as a probability distribution. The diagram below indicates how this can be done.

Consider the simplified Point Utility maps above. Recall that in a Point Utility map, each Sector is assigned a shade of green depending on how often the Sector can be reached. Each shade of green corresponds to a number, one through nine, referred to as the "bin value". Treat each Sector as an observation of a random variable. The value of that observation is the bin value of the Sector. Sectors that cannot be reached are given a bin value of zero. Looking over a whole Point Utility map, one can count the number of Sectors with each reachability bin value. These sums can be divided by the total number of Sectors, creating a probability distribution for what the bin value of an arbitrary Sector would be on that map.

When considering two different Point Utility maps, it is then possible to construct a joint probability distribution. To do this, choose a Sector on the first map and find the same Sector on the second map. Keep a count of each pair of observations; for example Sector one has a bin value of 9 in the first map and a bin value of 5 in the second map, so the pair (X=9, Y=5) is recorded. Continue this process for each Sector, obtaining a total of each observation. Dividing these by the number of Sectors yields the joint probability distribution for the bin values of the two Point Utility maps.

With the joint distribution, and the marginal distributions that can be computed from it, it is possible to compute the mutual information of two Utility measurements of any type. This calculation can be used in two ways. If the Network Utility has been computed at great cost, the mutual information can be used to determine how accurately a Cumulative Point Utility approximates the Network Utility. Once a sufficiently accurate CPU has been found, changes to the transportation network can be measured using CPU rather than NU, saving computational resources and money.

Furthermore, even if the Network Utility is unknown, mutual information sheds light on whether the Cumulative Point Utility is approaching it. Consider building a series of Cumulative Point Utilities by selecting an initial Sector at random, adding new random Sectors, and computing a new CPU for after each new Sector is added. After each CPU is measured, the mutual information between it and the previous CPU is measured. Initially the mutual information will decrease: mutual information between any two Point Utilities is relatively high because both PUs indicate that the plurality of Sectors are unreachable and thus the maps are similar. As more PUs are combined into the CPU, fewer points are outright unreachable, and thus the maps will be less similar from the perspective of mutual information. As more PU measurements are added, it is expected that the rate of decrease will slow or reverse, as the data added from a PU is capable of being inferred from the previous CPU. In a more concrete sense, this means that one can learn something about what is reachable at a point by considering the points around it. After all, that point has similar transit stops within its walking range. Eventually, the random points that were selected comprise nearly all the information about the transit network, even though they are not an exhaustive collection of every location on the map.

Public Transit Analytics is actively working through some of the practical concerns of using Cumulative Point Utility. It is unclear how many samples would be sufficient and what properties of a transit network may result in alterations to the sufficiency criteria. Nonetheless, the approach of using mutual information appears to be a promising one. Its ultimate promise is a way of measuring an entire transit network in an unbiased way not seen in other models available to planners today.

Monday, May 8, 2017

Foundations of Evaluating Public Transit Networks, Part 5: Complex and Volatile, Again

Previously in the Foundations of Evaluating Public Transit Networks series I explained how Public Transit Analytics built Point Utility from Qualified Point Utility. Doing so necessitated examining the ways that Qualified Point Utility failed to account for how useful a rider would find a transit network. In that case, the "Qualified" nature of Qualified Point Utility, the fact that it is calculated at a single time, was scrutinized. The upcoming posts in the Foundations of Evaluating Public Transit Networks series follow the same pattern, but this time with the "Point" nature of Point Utility.

Just as looking at a transit network at a single time of day fails to consider that riders want to reach destinations throughout the day, a measurement fixed at a single point fails to capture the fact that riders exist throughout the transit network. Making transit network changes that optimize a single Point Utility may result in better transit for some riders. The Point Utility, though, says nothing about potential degradation to riders not near the selected point.

Thus, it is critical to determine how representative a single Point Utility measurement is of the overall network. To get a rough, intuitive sense of this, the Score Generator calculated the 30-minute Outbound Point Utility of the center of the Sector that contains the Public Transit Analytics office on January 25, 2016. Then, it calculated the same Point Utility but for the center of all of the adjacent Sectors. The results are depicted below, with an interactive version available as well. (The interactive version may take some time to fully load).

Not surprisingly, a variety of Point Utility scores were obtained by shifting the starting location. In this example, the most extreme shift of distance is approximately 650 meters (2,000 feet or one-third of a mile) and resulted in a score change of 14. A difference of one in a point utility score is usually non-trivial (in this 10,000-Sector map, every increment of Point Utility means that 14,400 more pairs of times and Sectors have become reachable). As such, it would be difficult to claim that the Point Utility Score at the center reflects the utility of the network on the whole; it may not even account for how useful the network is in the adjacent sectors.

Whereas Qualified Point Utility could be converted to Point Utility by merely considering the former at every minute of the day, it is more complicated to analogously consider the Point Utility at every Sector. In this map, there is an order of magnitude difference between the number of Sectors and the number of times of day. If a transit agency requires capturing a larger service area or using finer Sector granularity, the difference in order of magnitude could easily grow further. This could increase the time taken to generate a map, as well as its monetary cost, to unacceptable levels. In subsequent posts I will explore some techniques that may cut down this time and cost, and eventually surmount the problem of objectively measuring how useful a transit network is.