Preparation and analysis of crowdsourced GPS bicycling data : a study of Skåne, Sweden

Corney, Matthew LU (2017) In Student thesis series INES NGEM01 20162
Dept of Physical Geography and Ecosystem Science
Abstract
Despite the growing volume of available transportation data and the efforts of many cities to increase cycling levels, there remains a lack of data on where people cycle. GPS trajectories have been used in cycling studies for several years, and more recently large, crowdsourced datasets of GPS-recorded cycle trips have become available and of interest to transportation and planning departments. However, there is limited research on how representative these new crowdsourced data sources are of the general cycling population.
This study uses GPS trip data from a GPS data crowdsourcing project called the Bike Data Project. It prepares a dataset of GPS trips for the city of Lund, Sweden, and matches the trips to a street network dataset. The study creates cycle counts based on the GPS trajectories for locations throughout the study area and compares these to manual counts made at the same locations. No correlation was found between the counts, suggesting that the GPS trajectory dataset is not a reliable representation of cycle trips within the study area, likely due to a lack of unique users within the dataset. The study also fills a gap in the current literature by quantitatively testing the accuracy of different map matching algorithms for matching cycling GPS data to the street network, comparing their output to ground truth trips of the actual routes taken; this is the first example of such a test for cycle data. It is found that in certain situations in dense networks the matching cannot be relied upon to give the correct link, which could have implications for studies looking to quantify the percentage of cycling taking place on different road infrastructure types.
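As an illustration of the kind of ground truth comparison described above, the following is a minimal sketch in Python, not the method used in the thesis. It assumes each route is available as a list of street network link IDs; the function name `link_match_accuracy` and the link IDs are hypothetical.

```python
from typing import Sequence


def link_match_accuracy(matched: Sequence[str], truth: Sequence[str]) -> float:
    """Return the share of ground truth links that the map matcher also selected.

    Both arguments are lists of link IDs (hypothetical identifiers for
    street network segments). Order and repeats are ignored in this sketch.
    """
    truth_links = set(truth)
    if not truth_links:
        raise ValueError("ground truth route has no links")
    return len(truth_links & set(matched)) / len(truth_links)


# Hypothetical example: the matcher picks a parallel link (b8) instead of
# the link actually ridden (b7), as can happen in a dense network.
truth_route = ["a1", "a2", "b7", "c3"]
matched_route = ["a1", "a2", "b8", "c3"]

print(f"{link_match_accuracy(matched_route, truth_route):.2f}")  # 0.75
```

A score below 1.0 for a trip indicates that at least one link ridden in reality was not selected by the matcher, which is the situation the abstract reports for dense networks.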
Popular Abstract
More and more geographic data is being generated by members of the general public. This is because more people have access to devices, such as smartphones, that can quickly and easily record data, and increasing numbers of people are uploading this data to the web.
This study looks at data on cycling trips in Scania, in southern Sweden, recorded by the public using a specially designed smartphone application called ‘The Bike Data Project’. Collecting data in this way has several advantages over the traditional method of manually counting bicyclists at certain locations. For example, because the location of the bicyclist is collected along their entire route every few seconds, rather than perhaps once or twice as in traditional counting, much more information can be gathered. This extra information means that this kind of cycling data can be used to investigate a large number of things.
However, there are still a number of potential problems and questions to be answered. For example, a lot of work must go into preparing the data: a detailed representation of the cycle network must be prepared, and each cyclist’s route matched to this network.
This work prepares a large dataset of cycling trips and tests a number of methods of data preparation. The study also compares the dataset to a more traditional cycle count dataset of the same area for the same year, and looks at which questions can be answered by these emerging crowdsourced datasets that cannot be answered by traditional sources, and vice versa. This kind of work is important because the more these sources are prepared and tested, the quicker they will be to use, and the more trust we can have in the results.
This is a rapidly emerging area, with many major cities across the world, such as London, Montreal and many cities across the USA, acquiring publicly generated GPS datasets of cycle trips with the same goal: better understanding and improvement of the cycling experience.
author
Corney, Matthew LU
supervisor
organization
course
NGEM01 20162
year
type
H2 - Master's Degree (Two Years)
subject
keywords
geomatics, bike data project, map matching, bicycle data, crowdsourced cycle data, Physical Geography and Ecosystem Analysis, GIS
publication/series
Student thesis series INES
report number
411
language
English
id
8904581
date added to LUP
2017-03-22 15:58:01
date last changed
2017-03-22 15:58:01
@misc{8904581,
  abstract     = {Despite the growing volume of available transportation data and the efforts of many cities to increase cycling levels, there remains a lack of data on where people cycle. GPS trajectories have been used in cycling studies for several years, and more recently large, crowdsourced datasets of GPS-recorded cycle trips have become available and of interest to transportation and planning departments. However, there is limited research on how representative these new crowdsourced data sources are of the general cycling population.
This study uses GPS trip data from a GPS data crowdsourcing project called the Bike Data Project. It prepares a dataset of GPS trips for the city of Lund, Sweden, and matches the trips to a street network dataset. The study creates cycle counts based on the GPS trajectories for locations throughout the study area and compares these to manual counts made at the same locations. No correlation was found between the counts, suggesting that the GPS trajectory dataset is not a reliable representation of cycle trips within the study area, likely due to a lack of unique users within the dataset. The study also fills a gap in the current literature by quantitatively testing the accuracy of different map matching algorithms for matching cycling GPS data to the street network, comparing their output to ground truth trips of the actual routes taken; this is the first example of such a test for cycle data. It is found that in certain situations in dense networks the matching cannot be relied upon to give the correct link, which could have implications for studies looking to quantify the percentage of cycling taking place on different road infrastructure types.},
  author       = {Corney, Matthew},
  keyword      = {geomatics,bike data project,map matching,bicycle data,crowdsourced cycle data,Physical Geography and Ecosystem Analysis,GIS},
  language     = {eng},
  note         = {Student Paper},
  series       = {Student thesis series INES},
  title        = {Preparation and analysis of crowdsourced GPS bicycling data : a study of Skåne, Sweden},
  year         = {2017},
}