Greta Timaite, James Hulse and Robin Lovelace
OpenStreetMap (OSM) data has the potential to facilitate a bottom-up approach to transport planning, which is essential for localized, data-driven policy interventions. Given this, the OpenInfra project is exploring the potential of OSM data in transport research with a focus on active travel. The exploration showed that missing data currently limits the applicability of OSM data. Nevertheless, we argue that the potential and relevance of OSM data can be demonstrated by recategorizing OSM data to provide more actionable insights to policy-makers. This, in turn, could encourage the uptake of open data, leading to more transparent, reproducible, and participatory transport planning.
One of the key domains in which OpenStreetMap (OSM) data has been utilized is transport research. OSM has been used in agent-based transport simulation and in routing, including cycling, walking, wheeling, and blind pedestrian routing. Another application of OSM data is in transport infrastructure planning. Nelson et al. argue that OSM has the potential to become a primary source of data on infrastructure across the globe.
Despite OSM’s potential to become a primary source of data on infrastructure, its potential in active travel infrastructure planning is yet to be realized. One possible reason for this lag is the perceived unreliability of open-access crowdsourced data. The quality of OSM has been examined extensively, and data completeness plays a significant role in that work because, it is argued, mappers are not coordinated to guarantee systematic coverage. To address this issue, Barrington-Leigh and Millard-Ball assessed OSM road completeness and found that globally over 80% of roads are mapped. Their assessment, however, focused on roads designed for motor traffic and thus excluded other modes of transport. This gap has been partially addressed by Ferster et al., who examined and compared OSM cycling infrastructure in Canada. They did not, however, consider the infrastructure from the perspective of accessibility. Moreover, there seems to be no equivalent study using OSM data in the context of pedestrian infrastructure planning.
Nevertheless, open-access crowdsourced data, such as OSM, can support an increasing need for local evidence to inform transport policies. This is important in the context of the UK, where a shift from provision for motorised modes towards more sustainable active modes of travel, such as walking, wheeling, and cycling, is taking place. The importance of localizing interventions to meet the needs of local communities has been outlined in both policy and academic papers. A potential way to engage citizens in decision-making is to encourage “produsage” – a model in which citizens both produce and use data.
Acknowledging the potential of OSM to boost citizen participation, the OpenInfra project, run at the University of Leeds (UK), aims to address the gap in the literature regarding the potential of OpenStreetMap in transport research. The project started by examining the existing OSM tags relevant to active travel infrastructure in England, with a focus on West Yorkshire, Greater Manchester, Greater London, and Merseyside. The data was queried using osmextract, an R package, and explored using an exploratory data analysis (EDA) approach. Reproducible code containing all the figures discussed here can be found on GitHub: https://github.com/udsleeds/openinfra/tree/main/sotm2022
Given the extensive use of OSM data in transport research, it is not surprising that OSM provides a comprehensive active travel network. However, the type of infrastructure that is present is rarely specified (e.g. is it a cycle lane or a cycle track?). For instance, cycleways and footways constitute about 1/3 of all the mapped highways on which one can legally walk, wheel or cycle, but only a few percent of the cycleways and footways have tags detailing their type. The data gets even scarcer in the context of accessible infrastructure planning. For example, there is a lot of missing information on the presence and type of kerbs – a street element that might make the movement of a wheelchair user more challenging.
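The kind of tag-completeness check described above can be sketched in a few lines. This is a minimal illustration in Python rather than the project's R code. The sample ways are made up, and the helper names `highway_counts` and `tag_coverage` are hypothetical. Only the OSM keys (`highway`, `footway`, `segregated`, `kerb`) are real tagging conventions.

```python
from collections import Counter

# Hypothetical sample: each OSM way is represented as a dict of tag -> value.
ways = [
    {"highway": "footway", "footway": "sidewalk"},
    {"highway": "footway"},
    {"highway": "cycleway"},
    {"highway": "cycleway", "segregated": "yes"},
    {"highway": "residential", "kerb": "flush"},
    {"highway": "residential"},
]


def highway_counts(ways):
    """Count ways per value of the `highway` tag."""
    return Counter(w.get("highway", "unknown") for w in ways)


def tag_coverage(ways, key):
    """Return the share of ways that carry any value for `key` (0.0 if empty)."""
    if not ways:
        return 0.0
    return sum(1 for w in ways if w.get(key) is not None) / len(ways)


# Share of footways that have a refining `footway=*` tag, and share of all
# ways with any `kerb=*` information.
footways = [w for w in ways if w.get("highway") == "footway"]
print(highway_counts(ways))
print(tag_coverage(footways, "footway"))
print(tag_coverage(ways, "kerb"))
```

Run over a real regional extract, the same two helpers would give the proportion of each highway type and the share of ways carrying a given detail tag, which is the kind of figure summarised in the paragraph above.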
The missing data currently limits the use of OSM data in active travel planning; however, this does not mean that the use of OSM data should be dismissed. Following Nelson et al.’s argument that it is important to make crowdsourced data more actionable, we decided to recategorize OSM data based on Inclusive Mobility (IM), a guide that outlines best practices for creating inclusive pedestrian infrastructure in the UK. For this, a function has been written (documentation can be found here: https://udsleeds.github.io/openinfra/articles/im_get.html). It takes an OSM dataframe, recategorizes its tags based on the definitions outlined in the guide, and returns an OSM dataframe with new columns to use in further analysis. However, the function simplifies the IM guide for two reasons. The first is definitional discrepancies. For instance, the guide defines footways as “pavements adjacent to roads”, yet this is not easily extracted from OSM, in which highway=footway is a generic tag and there is often no further refinement (e.g., sidewalk=*) to determine whether it is a pavement adjacent to a road. The second is the set of values that can be assigned. For example, the guide identifies six tactile paving surfaces, but OSM focuses on the presence or absence of tactile paving, thus limiting how much information can be extracted from the data.
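To make the idea of recategorization concrete, the sketch below shows the general shape of such a function in Python. It is not the project's function and does not reproduce its rules. The mapping rules, the category labels, and the output column names `im_footway` and `im_tactile` are all illustrative assumptions. The OSM keys `highway`, `footway`, `sidewalk`, and `tactile_paving` are real.

```python
# Assumed set of `sidewalk=*` values that indicate a pavement exists on a road.
SIDEWALK_PRESENT = {"both", "left", "right", "yes"}


def classify_footway(tags):
    """Assign an illustrative, IM-inspired footway category to one way."""
    if tags.get("highway") == "footway":
        if tags.get("footway") == "sidewalk":
            return "footway"  # separately mapped pavement next to a road
        # Generic highway=footway with no refinement: adjacency is unknown.
        return "ambiguous"
    if tags.get("sidewalk") in SIDEWALK_PRESENT:
        return "implied footway"  # pavement recorded as a tag on the road
    return "no"


def classify_tactile_paving(tags):
    """Reduce `tactile_paving=*` to presence, absence, other, or unknown."""
    value = tags.get("tactile_paving")
    if value is None:
        return "unknown"
    if value in ("yes", "no"):
        return value
    return "other"


def recategorise(ways):
    """Return copies of the input rows with the two new columns added."""
    return [
        {
            **tags,
            "im_footway": classify_footway(tags),
            "im_tactile": classify_tactile_paving(tags),
        }
        for tags in ways
    ]


rows = recategorise([
    {"highway": "footway", "footway": "sidewalk", "tactile_paving": "yes"},
    {"highway": "footway"},
    {"highway": "residential", "sidewalk": "both"},
])
for row in rows:
    print(row["im_footway"], row["im_tactile"])
```

The "ambiguous" category reflects the definitional discrepancy noted above: a bare highway=footway cannot be assigned to the guide's definition of a footway without further tags.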
One potential application of the IM function could be to explore the existence and geographic distribution of accessibility indicators, such as the presence of a flush kerb. Yet more interesting results can be produced by using recategorised OSM data in conjunction with other datasets that would help to improve the understanding of the accessibility of streets. As an illustration of this, the open-access Leeds Central Council Footfall data was used. We reasoned that the locations at which footfall data were collected are heavily used by pedestrians, thus demonstrating the need to ensure inclusive spaces. Five unique streets were identified, which resulted in 35 linestrings in OSM. Then, a basic index of accessibility, ranging from 0 to 5, was created. For example, if a linestring is classified as a footway, footpath, or implied footway based on the IM guide, it receives 1, otherwise 0. If a flush kerb is mapped, it receives 1; otherwise (e.g., not flush or NA) it receives 0. Finally, the values were added to produce a final index. The highest index score observed is 2 (19 linestrings), while the rest scored 1. This example does not necessarily show that the streets are inaccessible, because the missing data make it hard to reach a fair judgement (e.g., in this case not a single linestring has data on kerbs). However, we would argue that this is a space for OSM to produce more readily actionable insights regarding transport infrastructure, especially if joined with other (open) datasets that would help to overcome some of its current data limitations.
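The additive index can be sketched as a sum of binary indicators. The Python example below implements only the two indicators spelled out in the text (footway classification and flush kerb). The remaining components of the 0 to 5 index are not described here, so they are left out rather than guessed. The column name `im_footway`, the function names, and the sample rows are hypothetical.

```python
# Categories that count towards the footway indicator, as named in the text.
FOOTWAY_CLASSES = {"footway", "footpath", "implied footway"}


def footway_score(row):
    """1 if the linestring is a footway, footpath, or implied footway."""
    return 1 if row.get("im_footway") in FOOTWAY_CLASSES else 0


def flush_kerb_score(row):
    """1 if a flush kerb is mapped; 0 for any other value or missing data."""
    return 1 if row.get("kerb") == "flush" else 0


# Only the two indicators described in the text; more can be appended.
INDICATORS = [footway_score, flush_kerb_score]


def accessibility_index(row, indicators=INDICATORS):
    """Sum the binary indicator scores for one linestring."""
    return sum(indicator(row) for indicator in indicators)


# Hypothetical linestrings after recategorisation.
linestrings = [
    {"im_footway": "footway", "kerb": "flush"},
    {"im_footway": "implied footway"},          # no kerb data
    {"im_footway": "no", "kerb": "raised"},
]
print([accessibility_index(row) for row in linestrings])
```

Because missing kerb data scores 0, a low index can reflect unmapped features rather than inaccessible streets, which is the caveat raised in the paragraph above.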
The next steps of the OpenInfra project focus on scaling up. The goal is to produce ‘OSM transport infrastructure data packs’ for transport authorities in England to support the uptake of open-access data, such as OSM, in transport planning. We believe that the use of open-access data could make transport planning more transparent, reproducible, and participatory, which, in turn, would support the uptake of sustainable modes of travel. OSM specifically has the potential to provide localized insights on the existing transport infrastructure and to facilitate more inclusive and accessible transport planning.