Predictive mapping of urban air pollution using Apache Spark on a Hadoop cluster

Asgari, Marjan; Farnaghi, Mahdi; Ghaemi, Zeinab

Asgari, Marjan; Farnaghi, Mahdi and Ghaemi, Zeinab (2017). 2017 International Conference on Cloud and Big Data Computing, ICCBDC 2017, p. 89-93

Abstract: Air pollution is one of the major environmental problems in industrial and populated cities. Predictive mapping of urban air pollution and sharing the generated maps with the public and city officials have positive impacts on society and the environment. This article presents a solution based on distributed processing concepts to generate a predictive map of air pollution for the next 24 hours. Apache Hadoop has been utilized as the underlying framework to form a cluster of processing machines. In order to improve the processing speed and provide the required machine learning functionalities, Apache Spark has been employed on the Hadoop cluster. The solution enables us to efficiently predict air quality classes at the monitoring stations of Tehran, the capital of Iran, for the next 24 hours. Using the inverse distance weighting (IDW) method, the predictive map of air quality classes is then generated for the whole city. The results showed that the proposed approach can achieve a reasonable speed in processing big spatial data along with horizontal scalability.
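The abstract names inverse distance weighting (IDW) as the step that turns per-station predictions into a city-wide map. The following is a minimal sketch of that general technique, not the authors' implementation: the station coordinates, class values, grid, the `power` exponent, and the function names `idw_interpolate` and `idw_grid` are all hypothetical, and rounding the interpolated value to the nearest class is an assumption, since the record does not say how classes are handled.

```python
import math


def idw_interpolate(stations, x, y, power=2.0):
    """Estimate a value at (x, y) from (sx, sy, value) samples using IDW.

    Each station contributes with weight 1 / distance**power, so nearby
    stations dominate the estimate. `power` is an assumed parameter.
    """
    weighted_sum = 0.0
    weight_total = 0.0
    for sx, sy, value in stations:
        dist = math.hypot(x - sx, y - sy)
        if dist == 0.0:
            # The query point coincides with a station: use its value directly.
            return float(value)
        weight = 1.0 / dist ** power
        weighted_sum += weight * value
        weight_total += weight
    return weighted_sum / weight_total


def idw_grid(stations, xs, ys, power=2.0):
    """Interpolate over a regular grid; rows follow ys, columns follow xs."""
    return [[idw_interpolate(stations, x, y, power) for x in xs] for y in ys]


# Hypothetical stations: (x, y, predicted air quality class as an integer).
stations = [(0.0, 0.0, 1), (10.0, 0.0, 3), (0.0, 10.0, 2)]
grid = idw_grid(stations, xs=[0.0, 5.0, 10.0], ys=[0.0, 5.0, 10.0])

# Assumption: map each interpolated value back to the nearest class.
class_map = [[round(v) for v in row] for row in grid]
```

Because every weight is positive, each interpolated value stays between the smallest and largest station value, which keeps the rounded result inside the range of predicted classes.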

Please use this url to cite or link to this publication: https://lup.lub.lu.se/record/4f26b580-9623-467a-ade5-942f74c5e126

author

Asgari, Marjan; Farnaghi, Mahdi and Ghaemi, Zeinab

publishing date

2017-09-17

type

Chapter in Book/Report/Conference proceeding

publication status

published

subject

keywords

Air pollution, Big spatial data, Distributed processing, Hadoop, Predictive mapping, Spark

host publication

2017 International Conference on Cloud and Big Data Computing, ICCBDC 2017

pages

5 pages

publisher

Association for Computing Machinery (ACM)

conference name

2017 International Conference on Cloud and Big Data Computing, ICCBDC 2017

conference location

London, United Kingdom

conference dates

2017-09-17 - 2017-09-19

external identifiers

scopus:85045762585

ISBN

9781450353434

DOI

10.1145/3141128.3141131

language

English

LU publication?

no

id

4f26b580-9623-467a-ade5-942f74c5e126

date added to LUP

2019-03-19 15:47:37

date last changed

2025-10-14 10:42:28

@inproceedings{4f26b580-9623-467a-ade5-942f74c5e126,
  abstract     = {{Air pollution is one of the major environmental problems in industrial and populated cities. Predictive mapping of urban air pollution and sharing the generated maps with the public and city officials have positive impacts on society and the environment. This article presents a solution based on distributed processing concepts to generate a predictive map of air pollution for the next 24 hours. Apache Hadoop has been utilized as the underlying framework to form a cluster of processing machines. In order to improve the processing speed and provide the required machine learning functionalities, Apache Spark has been employed on the Hadoop cluster. The solution enables us to efficiently predict air quality classes at the monitoring stations of Tehran, the capital of Iran, for the next 24 hours. Using the inverse distance weighting (IDW) method, the predictive map of air quality classes is then generated for the whole city. The results showed that the proposed approach can achieve a reasonable speed in processing big spatial data along with horizontal scalability.}},
  author       = {{Asgari, Marjan and Farnaghi, Mahdi and Ghaemi, Zeinab}},
  booktitle    = {{2017 International Conference on Cloud and Big Data Computing, ICCBDC 2017}},
  isbn         = {{9781450353434}},
  keywords     = {{Air pollution; Big spatial data; Distributed processing; Hadoop; Predictive mapping; Spark}},
  language     = {{eng}},
  month        = {{09}},
  pages        = {{89--93}},
  publisher    = {{Association for Computing Machinery (ACM)}},
  title        = {{Predictive mapping of urban air pollution using Apache Spark on a Hadoop cluster}},
  url          = {{http://dx.doi.org/10.1145/3141128.3141131}},
  doi          = {{10.1145/3141128.3141131}},
  year         = {{2017}},
}
