Urban Air

Using a diversity of big data to infer and predict fine-grained air quality throughout a city, and ultimately to tackle air pollution.


Many countries suffer from air pollution. Many cities have built a few air quality monitoring stations to inform people of urban air quality every hour. However, because it is influenced by multiple complex factors, urban air quality is highly skewed within a city: it varies significantly by location and changes over time differently in different places. Thus, we do not know the air quality at a location without a monitoring station. Nor do we know what the air quality at a place will be tomorrow, let alone the root cause of the air pollution.

This project aims to infer the current fine-grained air quality throughout a city and to forecast future air quality at each monitoring station. We also expect to identify the root causes of air pollution. For example, what proportion of the PM2.5 in the environment is derived from vehicular emissions? What is the spatio-temporal causal interaction between the air pollution of different cities?

Led by Dr. Yu Zheng, Urban Air is also a sub-project of Urban Computing, which is a research theme that aims to tackle big challenges in cities by using big data.


The research has been made publicly available through a "cloud + client" framework, in which the cloud continuously collects real-time data, such as meteorological data and air quality data. A user can access the air quality information through a mobile client or a web client.


(WPhone-En) (Chinese Mobile Apps) Website:


Step 1: Infer Fine-Grained Air Quality

The first step of this project is to infer the real-time, fine-grained air quality at an arbitrary location by using two kinds of data. One is the real-time and historical air quality data from existing monitoring stations. The other is five additional data sources observed in a city: meteorological data, traffic, human mobility, POIs, and road network data. We propose a semi-supervised learning approach based on a co-training framework that consists of two separate classifiers. One is a spatial classifier based on an artificial neural network (ANN), which takes spatially related features (e.g., the density of POIs and the length of highways) as input to model the spatial correlation between the air quality of different locations. The other is a temporal classifier based on a linear-chain conditional random field (CRF), which uses temporally related features (e.g., traffic and meteorology) to model the temporal dependency of air quality at a location. Read the related publications for more details.
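The co-training idea can be illustrated with a minimal sketch. It is not the published model: both the ANN and the CRF are replaced here by a simple nearest-centroid classifier, and the function names, the confidence threshold, and the toy feature values are all assumptions made for illustration. What it does show is the core loop: each view (spatial features, temporal features) trains on the currently labeled locations, labels the unlabeled locations it is confident about, and hands the enlarged labeled set to the other view.

```python
import math

def centroids(rows, labels):
    """Mean feature vector per class (a stand-in for a trained classifier)."""
    sums, counts = {}, {}
    for x, y in zip(rows, labels):
        acc = sums.setdefault(y, [0.0] * len(x))
        for i, v in enumerate(x):
            acc[i] += v
        counts[y] = counts.get(y, 0) + 1
    return {y: [v / counts[y] for v in acc] for y, acc in sums.items()}

def predict(model, x):
    """Return (label, confidence); confidence is a softmax over negative distances."""
    scores = {y: math.exp(-math.dist(x, c)) for y, c in model.items()}
    best = max(scores, key=scores.get)
    return best, scores[best] / sum(scores.values())

def co_train(spatial, temporal, labels, rounds=5, threshold=0.7):
    """labels[i] is a class name or None (unlabeled). Returns the filled-in labels."""
    labels = list(labels)
    for _ in range(rounds):
        added = False
        for view in (spatial, temporal):
            # Retrain on everything labeled so far, including the other view's additions.
            known = [i for i, y in enumerate(labels) if y is not None]
            if not known:
                raise ValueError("co-training needs at least one labeled example")
            model = centroids([view[i] for i in known], [labels[i] for i in known])
            for i, y in enumerate(labels):
                if y is None:
                    guess, conf = predict(model, view[i])
                    if conf >= threshold:  # only confident guesses join the labeled set
                        labels[i] = guess
                        added = True
        if not added:
            break
    return labels

# Toy data (hypothetical): two monitored locations and three unmonitored ones.
spatial = [[0.0], [10.0], [1.0], [9.0], [5.0]]    # e.g. a scaled POI density
temporal = [[0.0], [10.0], [2.0], [8.0], [9.0]]   # e.g. a scaled traffic level
labels = ["good", "unhealthy", None, None, None]
print(co_train(spatial, temporal, labels))
```

In this toy data the last location is ambiguous in the spatial view (equally far from both class centroids), so only the temporal view labels it, and it can do so only after the spatial view has enlarged the labeled set.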


[1] Yu Zheng, Furui Liu, Hsun-Ping Hsieh. U-Air: When Urban Air Quality Inference Meets Big Data. In Proceedings of the 19th SIGKDD Conference on Knowledge Discovery and Data Mining (KDD 2013). (Data) (Website) (Mobile App) (Video)

[2] Yu Zheng, Xuxu Chen, Qiwei Jin, Yubiao Chen, Xiangyun Qu, Xin Liu, Eric Chang, Wei-Ying Ma, Yong Rui, Weiwei Sun. A Cloud-Based Knowledge Discovery System for Monitoring Fine-Grained Air Quality. MSR-TR-2014-40.

A dataset has been released for research purposes: download the data.


Step 2: Forecast Air Quality at Each Station

The second step is to forecast the fine-grained air quality over the next 48 hours. Specifically, for each of the first 6 hours, we predict a real-valued AQI for each kind of air pollutant at each station. For the next 7-12, 12-24, and 24-48 hours, we predict the min-max range of the AQI over the corresponding time interval. Our predictive model comprises four major components: 1) a linear regression-based temporal predictor that models the local factors of air quality, 2) a neural network-based spatial predictor that models the global factors, 3) a dynamic aggregator that combines the predictions of the spatial and temporal predictors according to the meteorological data, and 4) an inflection predictor that captures sudden changes in air quality.
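Two parts of this description lend themselves to a short sketch: the aggregator that blends the two predictors, and the output format (hourly values first, then min-max ranges). The code below is illustrative only. The wind-speed weighting rule and its thresholds are hypothetical and are not the aggregator described in the paper; the function names are invented; and the interval boundaries are assumed to be hours 7-12, 13-24, and 25-48, since the text leaves the shared endpoints open. It also assumes, purely for illustration, that an hourly prediction is available for all 48 hours.

```python
def blend(temporal_pred, spatial_pred, wind_speed, calm=1.0, windy=6.0):
    """Weighted average of the temporal (local) and spatial (global) predictions.

    Hypothetical rule: the stronger the wind (m/s), the more weight the spatial
    predictor receives, on the assumption that pollutants are then more likely
    to arrive from elsewhere.
    """
    w = min(1.0, max(0.0, (wind_speed - calm) / (windy - calm)))
    return (1.0 - w) * temporal_pred + w * spatial_pred

def summarize(hourly_aqi):
    """Turn 48 hourly AQI predictions into hourly values plus min-max ranges."""
    if len(hourly_aqi) != 48:
        raise ValueError("expected 48 hourly predictions")
    report = {"1-6": list(hourly_aqi[:6])}  # real-valued AQI for each of the first 6 hours
    for name, start, end in (("7-12", 6, 12), ("13-24", 12, 24), ("25-48", 24, 48)):
        window = hourly_aqi[start:end]
        report[name] = (min(window), max(window))  # (min, max) over the interval
    return report

# Toy usage with made-up numbers.
print(blend(temporal_pred=100.0, spatial_pred=200.0, wind_speed=3.5))
print(summarize(list(range(100, 148))))
```

Reporting a range rather than a point value for the later intervals reflects the fact that uncertainty grows with the forecast horizon.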


[1] Yu Zheng, Xiuwen Yi, Ming Li, Ruiyuan Li, Zhangqing Shan, Eric Chang, Tianrui Li. Forecasting Fine-Grained Air Quality Based on Big Data. In Proceedings of the 21st SIGKDD Conference on Knowledge Discovery and Data Mining (KDD 2015).

A portion of the data used in the research has been released here.


Step 3: Suggest Locations for Monitoring Stations

Given a limited budget to build a few additional air quality monitoring stations, where should they be placed? The research addresses this problem from the perspective of maximizing the accuracy and stability of air quality inference.


[1] Hsun-Ping Hsieh*, Shou-De Lin, Yu Zheng. Inferring Air Quality for Station Location Recommendation Based on Big Data. In Proceedings of the 21st SIGKDD Conference on Knowledge Discovery and Data Mining (KDD 2015).


Step 4: Identify the Root Cause of Air Pollution

1) Study the correlation between vehicular emissions and air quality.

2) Identify the spatio-temporal causality between air pollutants of different cities.

Idea: Find co-evolving patterns in air quality data from different stations, and then apply causality models to these patterns for root cause discovery.
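As a rough illustration of this idea, the sketch below checks whether hour-to-hour changes at one station tend to be followed by similar changes at another station a few hours later. It uses plain lagged Pearson correlation, which is a much simpler technique than the pattern mining and Gaussian Bayesian models in the publications listed here, and a high lagged correlation is only a hint worth examining with a proper causality model, not evidence of causation. The function names and the readings are hypothetical.

```python
import math

def pearson(a, b):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    if va == 0 or vb == 0:
        return 0.0
    return cov / math.sqrt(va * vb)

def changes(series):
    """Hour-to-hour differences, so that we compare how the readings evolve."""
    return [b - a for a, b in zip(series, series[1:])]

def best_lag(source, target, max_lag=6):
    """Return (lag, correlation) for the lag in hours at which changes in
    `source` best match later changes in `target`."""
    ds, dt = changes(source), changes(target)
    best = (0, pearson(ds, dt))
    for lag in range(1, max_lag + 1):
        r = pearson(ds[:-lag], dt[lag:])  # source now vs. target `lag` hours later
        if r > best[1]:
            best = (lag, r)
    return best

# Made-up hourly PM2.5 readings: the second station repeats the first 2 hours later.
upwind = [10, 12, 20, 35, 30, 22, 15, 12, 11, 10, 14, 25]
downwind = [10, 10] + upwind[:-2]
print(best_lag(upwind, downwind))
```

Working on differences rather than raw readings keeps a shared slow trend, such as a citywide daily cycle, from dominating the correlation.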


[1] Chao Zhang*, Yu Zheng, Xiuli Ma, Jiawei Han. Assembler: Efficient Discovery of Spatial Coevolving Patterns in Massive Geosensory Data. In Proceedings of the 21st SIGKDD Conference on Knowledge Discovery and Data Mining (KDD 2015).

[2] Julie Yixuan Zhu, Yu Zheng, Xiuwen Yi, Victor O.K. Li. A Gaussian Bayesian Model to Identify Spatiotemporal Causalities for Air Pollution Based on Urban Big Data. The International Workshop on Smart Cities, in conjunction with INFOCOM 2016.

3) Suggest locations for building additional monitoring stations.


Step 5: Study the Impact of Air Pollution on People's Health




We appreciate our partners from Microsoft Product Teams who have been working closely with us on this project.

Specifically, Jacky Hsu, Qinying Liao, and their team from the C+E division contributed the YourWeather app. (WPhone-CN; Android-CN; iOS)

We also appreciate our partners, such as Stella Ye and Sandy Qi (from Bing), who made Urban Air available on Bing Maps.

A number of interns have worked with us on the Urban Air project. We may not be able to list all of them here.

Yubiao Chen, Xuxu Chen, Hsun-Ping Hsieh, Furui Liu, Zhenni Feng, Zhangqing Shan, Ruiyuan Li, Xiuwen Yi.