Community Interactions in OSM Editing
Dipto Sarkar and Jennings Anderson
We examine interactions between Corporate and Non-Corporate Editors as reflected in co-editing patterns in OSM data. We apply Social Network Analysis to 12 networks generated from four locations at three timepoints, and our results show vibrant co-production of OSM data. All groups of editors interact with one another, but Corporate Editors tend to interact with each other at a higher rate. The seniority of editors and their interaction patterns also differ between Corporate and Non-Corporate Editors.
OpenStreetMap (OSM) data is produced by a vibrant online community of mappers. More specifically, OSM data produsers are individuals with diverse motivations, methods of data contribution, and usage (Budhathoki & Haythornthwaite, 2013; Coleman et al., 2009). Thus, OSM contributors have been aptly described as a community of communities (Solis, 2017). In recent years, corporate editing teams have introduced a new dynamic into the discussion of communities in OSM: editing teams hired by corporations such as Apple, Facebook, Microsoft, and Uber are capable of contributing thousands of changesets a day (Anderson et al., 2019; Anderson & Sarkar, 2020). Additionally, corporate editors (CEs) tend to focus their editing on particular types of map features. These two attributes of corporate editing can lead to CEs breaking off into a siloed group of their own, with little or no interaction with the rest of the editors on the map.
Previous research on the OSM community using similar methods showed that collaboration between editors was limited, with most objects being edited only a few times (Mooney & Corcoran, 2012). Senior editors in particular perform a majority of the mapping work on their own, but do interact with others through co-editing (Mooney & Corcoran, 2014). Since these studies were performed, the OSM community has grown significantly, and community dynamics have evolved to include more individual and organized participation (e.g., by CEs).
Here, we use a data-driven approach to characterize the interactions between CEs and the rest of the OSM community. We define interactions through editing patterns. That is, we construct a network of interactions where each node represents an editor, and two nodes are connected if the corresponding editors have edited the same map object. If the mapper represented by node A edits an object last edited by the mapper represented by node B, then an edge connecting these nodes exists and is directed from A to B.
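The edge rule above can be illustrated with a minimal sketch. The input format (a mapping from object id to the chronological list of editors of its versions), the function name `build_coedit_edges`, the example usernames, and the choice to skip consecutive edits by the same mapper are all assumptions made for illustration; they are not taken from the OSM-Interactions tilesets or from the authors' pipeline.

```python
from collections import Counter


def build_coedit_edges(histories):
    """Count directed co-editing edges.

    histories: dict mapping an object id to the list of editor names of its
    successive versions, oldest first (hypothetical input format).
    Returns a Counter keyed by (editor, previous_editor).
    """
    edges = Counter()
    for editors in histories.values():
        # Pair each version's editor with the editor of the previous version.
        for previous, current in zip(editors, editors[1:]):
            if current != previous:  # assumption: self-edits are not interactions
                edges[(current, previous)] += 1  # edge directed from A to B
    return edges


# Hypothetical edit histories for two map objects.
histories = {
    "way/1": ["alice", "bob", "alice"],
    "way/2": ["carol", "carol", "bob"],
}
edges = build_coedit_edges(histories)
for (source, target), weight in sorted(edges.items()):
    print(f"{source} -> {target} (weight {weight})")
```

Storing edge weights in a `Counter` keeps repeated interactions between the same pair of mappers, which could later be dropped if only the presence of an edge matters.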
We utilized the OSM-Interactions tilesets to construct these networks (Anderson, 2020). These vector tiles contain the editing history of all highway and building objects at zoom level 14. They include minor changes to the geometry of objects in which only nodes are moved, but the parent way is left untouched. In this way, we are capturing the complete history of map objects in OSM, as opposed to just changes to the basic OSM elements (primarily nodes or ways).
In keeping with the objects that are primarily edited by CEs, we focused only on highway and building objects when constructing the network. Each node is further annotated with a binary category representing whether the mapper is a CE or not. We classify a mapper as a CE by comparing usernames in the network to the lists of usernames disclosed on a corporation’s OSM wiki or GitHub page.
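The binary annotation described above amounts to a set-membership test. The sketch below assumes the disclosed lists have already been collected into a dictionary; the company names, usernames, and exact-match comparison are hypothetical placeholders, not data or rules from the study.

```python
def annotate_editors(editors, disclosed):
    """Label each editor as a corporate editor (True) or not (False).

    editors: iterable of usernames appearing in the network.
    disclosed: dict mapping a company name to the set of usernames it has
    publicly listed (hypothetical structure).
    """
    # Merge all disclosed lists into a single lookup set.
    corporate_names = set().union(*disclosed.values())
    # Assumption: usernames are compared by exact, case-sensitive match.
    return {name: name in corporate_names for name in editors}


# Hypothetical disclosed lists and network nodes.
disclosed = {
    "ExampleCorp": {"ec_mapper_1", "ec_mapper_2"},
    "OtherCorp": {"oc_editor"},
}
is_ce = annotate_editors(["ec_mapper_1", "alice", "oc_editor"], disclosed)
print(is_ce)
```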
We focus on four locations: Egypt, Jamaica, Thailand, and Singapore. We create networks for each of these locations at three timepoints (2015, 2017, and 2019) to characterize changes over time. Thus, we constructed and analyzed 12 networks. The locations were chosen because each has a different group of CEs active.
Across all networks, the Largest Connected Component (LC) accounted for 93.6% of all nodes, highlighting significant interaction among all mappers. Within the LC, the growth rate of CE nodes exceeded that of non-CE nodes by a ratio of 11:1 between 2015 and 2019. However, both types of editors (CE and non-CE) have comparable in-degrees and out-degrees in each place, indicating that they edit other people’s work and have their work edited at a similar rate. In terms of who edits whose work, CEs edit other CEs’ work most often, but interaction between CEs and non-CEs has also grown over time, keeping the network connected. With regard to the age of mappers (calculated from their enrollment date in OSM) and the volume of edits they perform, younger mappers in both groups tend to edit others' work at a higher rate than senior mappers, but there is more variation in these statistics for non-CE mappers. This finding is contrary to the previous research on editing interaction patterns mentioned above. Additionally, characterizing the time between edits shows that edits made by CEs persist for a slightly shorter duration than edits made by non-CEs, primarily because other CEs edit the same object soon after.
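Two of the statistics reported above, the share of nodes in the largest connected component and the breakdown of who edits whose work, can be computed as in the following sketch. It assumes that connectivity ignores edge direction (weak connectivity), which the text does not specify, and it runs on a small made-up network; the function names and data are illustrative only and do not reproduce the study's figures.

```python
from collections import Counter, defaultdict, deque


def largest_component_share(nodes, edges):
    """Fraction of nodes in the largest component, ignoring edge direction."""
    neighbours = defaultdict(set)
    for source, target in edges:
        neighbours[source].add(target)
        neighbours[target].add(source)
    seen, largest = set(), 0
    for start in nodes:
        if start in seen:
            continue
        # Breadth-first search over one component.
        seen.add(start)
        queue, size = deque([start]), 0
        while queue:
            node = queue.popleft()
            size += 1
            for nxt in neighbours[node]:
                if nxt not in seen:
                    seen.add(nxt)
                    queue.append(nxt)
        largest = max(largest, size)
    return largest / len(nodes)


def mixing_counts(edges, is_ce):
    """Count edges by (group of editing mapper, group of previous mapper)."""
    def group(name):
        return "CE" if is_ce[name] else "non-CE"
    return Counter((group(source), group(target)) for source, target in edges)


# Made-up network: "e" has no interactions with anyone.
nodes = ["a", "b", "c", "d", "e"]
edges = [("a", "b"), ("b", "c"), ("c", "a"), ("d", "a")]
is_ce = {"a": True, "b": True, "c": False, "d": False, "e": False}

share = largest_component_share(nodes, edges)
mixing = mixing_counts(edges, is_ce)
print(share)   # 0.8
print(dict(mixing))
```

Comparing the four cells of the mixing table across timepoints is one way to see whether cross-group editing grows alongside within-group editing.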
In conclusion, the editing networks highlight the vibrancy of data co-production. Volunteer editors and CEs are interacting with each other's edits to produce the map. The per-group interaction is nuanced and shows unique editing patterns that warrant further investigation. During the timespan of this study, the CE community grew faster than the non-CE community, but whether this pattern will hold over time and whether other locations exhibit the same pattern requires more research.