Machine Learning for Urban Computing
A. A. Salah, B. Aydoğdu
Abstract: Increasing urbanization rates and population growth in urban areas have brought complex problems to be solved in various fields such as urban planning, energy, health, safety, and transportation. Urban computing (UC) is the name of the process that involves the collection, integration, and analysis of heterogeneous urban data coming from various types of sensors in the city, with the purpose of tackling urban problems (Zheng et al., 2014). Within this framework, the first step of UC is to collect data through sensors such as mobile phones, surveillance cameras, vehicles with sensors, or social media applications. The second step is to store, clean, and index the spatiotemporal data through different data management techniques, preparing it for data analytics. The third step is to apply different analysis approaches to solve urban tasks related to different problems. Machine learning methods are some of the most powerful analysis approaches for this purpose, because they can help model arbitrarily complex relationships between measurements and target variables, and they can adapt to dynamically changing environments, which is very important in UC, as cities are constantly in motion. Machine learning (ML) is a subfield of artificial intelligence (AI) that combines computer science and statistics with the objective of optimizing a performance criterion by learning models from data or from past experience (Alpaydın, 2020). The capabilities of ML have sharply increased in recent decades due to the increased ability of computers to store and process large volumes of data, as well as due to theoretical advances. Especially in domains where human expertise is unable to crystallise the path between data sources and the performance measures, ML approaches offer new ways of model creation. In UC, data sources are rich and heterogeneous.
From an ML perspective, using heterogeneous sources of data as input to models is challenging, as most of the existing paradigms are designed to process one type of data. Additionally, urban data acquisition brings its own challenges. Urban data may come from sensors installed on infrastructure elements, from public transportation systems, from satellites, from household surveys, from governmental institutions, and even from people who act as sensors (an approach called citizen science), either pro-actively supplying information by reporting issues, or by using mobile phones or applications whose data can be read to infer many things about the city and the way people use it. These data sources have different bias and variance characteristics; each sensor comes with its own data gaps (e.g. mobile phone usage may exclude young children, satellite coverage cannot include what goes on inside large buildings, public transport figures ignore other means of transport, etc.), and each data source is sampled at a different rate. These are all challenges in building unifying models. Apart from detecting patterns in urban usage, ML methods can be used for detecting outliers, such as anomalies in the daily life of the city. For example, mobile phone activity at a certain location can be monitored to automatically detect unusual patterns (Gündoğdu et al., 2017). Combining ML with computer vision allows us to detect problems in urban settings visually, and to classify areas of the city in terms of safety, wealth, accessibility, etc. (Santani et al., 2015). Another important use is the prediction of urban activity based on past data, such as predicting queues in gas stations, which can be used for improving waiting times across the city (Zheng et al., 2013). ML can also help with knowledge extraction, providing interpretable insights into data collected from urban sensing.
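As a rough illustration of the outlier-detection use case mentioned above, the following minimal Python sketch flags unusual hourly activity counts at a single location. It is not the method of Gündoğdu et al. (2017); it assumes a simple robust baseline (the median and median absolute deviation for each hour of the day), and the function name `robust_anomalies`, the threshold of 3.5, and the synthetic `counts` series are all hypothetical choices made for this example.

```python
from statistics import median


def robust_anomalies(counts, period=24, threshold=3.5):
    """Return indices of counts that deviate strongly from the typical
    value observed in the same slot of the period (e.g. same hour of day).

    Assumption: a per-slot median/MAD baseline is a reasonable notion of
    "usual" activity; real urban data may need a richer model.
    """
    anomalies = []
    for slot in range(period):
        values = counts[slot::period]          # all observations for this hour
        if len(values) < 3:
            continue                           # too few samples for a baseline
        med = median(values)
        mad = median(abs(v - med) for v in values)
        # 1.4826 rescales the MAD to match the standard deviation under
        # normality; fall back to 1.0 if the MAD is zero.
        scale = 1.4826 * mad if mad > 0 else 1.0
        for k, v in enumerate(values):
            if abs(v - med) / scale > threshold:
                anomalies.append(slot + k * period)
    return sorted(anomalies)


# Synthetic data (hypothetical): 7 days of hourly counts with a daytime
# plateau, small deterministic jitter, and one injected spike.
counts = []
for day in range(7):
    for hour in range(24):
        base = 150 if 8 <= hour <= 20 else 100
        counts.append(base + (day % 3) - 1)
counts[4 * 24 + 22] = 400                      # unusual activity: day 4, 22:00

print(robust_anomalies(counts))                # [118]
```

Comparing each observation only with the same hour on other days keeps the ordinary daily rhythm of the city from being reported as an anomaly, which is one simple way to account for the fact that cities are constantly in motion.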
Environmental Science, Engineering, Computer Science