Interpretable Machine Learning for Smart Cities
by Wei Zhang, Xue Feng, King Jet Tseng and Yonggang Wen
In recent years, machine learning (ML) has shown great potential and proven its effectiveness in sectors such as manufacturing and financial technology. Unlike traditional methodologies, the key innovation of ML lies in its capability to extract insights and knowledge from massive amounts of data. Data plays an important part in smart city applications: the city-scale deployment of sensors, actuators, and smart objects generates a huge amount of data in real time. Given such diverse and voluminous data, ML can perform data analytics for system modeling, trend prediction, and intelligence generation [1].
ML provides a wide spectrum of algorithms and technologies, each with its own design philosophy and analysis mechanism. An algorithm that suits one application may not perform comparably well in another. Many research works have investigated the effectiveness of ML algorithms for different smart city applications, e.g., solar energy generation [2], building energy modeling [3], and thermal comfort analysis [4]. Despite these efforts, some of the latest ML technologies and advancements are yet to be explored thoroughly in the smart city domain.
Nowadays, ML is mostly used to achieve high accuracy in smart city modeling and prediction, e.g., accurate subway arrival time forecasting and energy usage modeling, and the progress is promising. However, we normally do not have a thorough understanding of the rationale and mechanism behind the ML-generated models and solutions. Interpretable ML [5] addresses this gap and offers a wide range of benefits. Firstly, the importance of different input features can be evaluated, and the features with the highest impact can be selected for further analysis and optimization. Secondly, the derived ML models can be debugged and inspected based on the interpretation, which helps avoid irrational reasoning caused by biased datasets, unethical decisions, and discrimination against certain groups. In addition, new mechanisms and insights can be drawn from the interpretation and fed back into the design and development phases to further improve ML performance. The benefits are significant, and existing methods are available to start with. In the following sections, we discuss the two major groups of these methods.
Figure 1: The system architecture of interpretable ML for smart cities. An ML model is fitted to the smart city data, and the fitted model can be used for modeling and prediction. The ML model serves as the input to the interpretable ML methods, which generate the explanation and interpretation of the model.
Interpretable ML Models
An intuitive option is to use interpretable models, generated by ML algorithms such as linear regression and decision trees. These models map the input space to the output space through equations or rules, which are interpretable in nature. For example, a linear regression model may describe household energy usage as a linear function f(x) = ax + b, where x is the number of occupants and a and b are the coefficients. The model tells us that with one more occupant, the modeled energy usage is expected to increase by a. Another example is a decision tree model, which can yield rules such as: the likelihood of a traffic jam increases if it is raining during peak hours. Other interpretable ML models include the naïve Bayes classifier and k-nearest neighbors.
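As a minimal sketch of the two examples above, the following Python snippet fits f(x) = ax + b by ordinary least squares and encodes a rule of the kind a shallow decision tree might produce. The data values, the function names fit_line and traffic_jam_risk, and the risk labels are all hypothetical and chosen only for illustration; they do not come from any real smart city dataset.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = a*x + b for a single input feature."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    a = sxy / sxx            # slope: change in output per unit of input
    b = mean_y - a * mean_x  # intercept: modeled output at x = 0
    return a, b


def traffic_jam_risk(is_raining, is_peak_hour):
    """Hand-written rules in the style of a small decision tree (hypothetical)."""
    if is_peak_hour:
        return "high" if is_raining else "medium"
    return "medium" if is_raining else "low"


# Made-up data: occupants per household and daily energy usage in kWh.
occupants = [1, 2, 3, 4, 5]
energy_kwh = [5.0, 7.0, 9.0, 11.0, 13.0]

a, b = fit_line(occupants, energy_kwh)
print(f"f(x) = {a:.1f} * x + {b:.1f}")   # slope a is the per-occupant increase
print(traffic_jam_risk(is_raining=True, is_peak_hour=True))
```

Both models can be read directly: the coefficient a is the expected increase in energy usage per additional occupant, and each branch of the rule function is a statement a domain expert can check.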
Figure 2: The trade-off between interpretability and accuracy. ML models with good interpretability often fail to model or predict accurately, while well-performing models such as neural networks have poor interpretability.
Model-Agnostic Methods
While the above ML models are interpretable, they rest on simple assumptions, such as a linear relationship between inputs and outputs, and often fail to capture the complex systems found in smart city applications. Advanced ML algorithms are better suited to complex cases but offer poor interpretability. One example is the neural network, which in theory can approximate a very wide class of functions to arbitrary accuracy, yet is well known to be a black box in nature. To add interpretability to such algorithms, model-agnostic methods treat the model as a black box and investigate how its outputs react to changes in its inputs.
One model-agnostic method is the partial dependence plot (PDP) [6]. The method evaluates the marginal effect of each input on the model output by plotting how the average output changes as the input value increases or decreases. In a smart building setting, if the PDPs show that humidity drives a more dramatic increase in air conditioning energy than temperature does, building managers should try reducing the humidity to an acceptably low level instead of only adjusting the temperature. Other model-agnostic methods include feature interaction, feature importance, and local surrogate models; their design philosophy is similar to that of PDP.
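The computation behind a PDP can be sketched in a few lines of Python. The snippet below is an illustration under stated assumptions, not a reference implementation: black_box is a made-up stand-in for a trained model, its coefficients are invented, and the observations are hypothetical. For each grid value, one input is overwritten in every observation and the model outputs are averaged.

```python
def black_box(temperature_c, humidity_pct):
    """Hypothetical stand-in for a trained black-box model.

    Returns a made-up air conditioning energy estimate in kWh.
    """
    return 0.3 * temperature_c + 0.002 * humidity_pct ** 2


def partial_dependence(model, rows, feature_index, grid):
    """Average model output when one feature is forced to each grid value."""
    curve = []
    for value in grid:
        total = 0.0
        for row in rows:
            modified = list(row)
            modified[feature_index] = value  # overwrite only the probed feature
            total += model(*modified)
        curve.append(total / len(rows))
    return curve


# Made-up observations: (temperature in deg C, relative humidity in %).
observations = [(24, 40), (26, 60), (28, 80), (30, 50)]

temperature_curve = partial_dependence(black_box, observations, 0, [24, 28, 32])
humidity_curve = partial_dependence(black_box, observations, 1, [40, 60, 80])

# Spread of each curve: a rough measure of how strongly the input moves the output.
temperature_range = max(temperature_curve) - min(temperature_curve)
humidity_range = max(humidity_curve) - min(humidity_curve)

print("temperature PDP:", [round(v, 2) for v in temperature_curve])
print("humidity PDP:", [round(v, 2) for v in humidity_curve])
print("humidity dominates:", humidity_range > temperature_range)
```

With these invented coefficients, the humidity curve spans a wider range than the temperature curve, which is the kind of evidence the building manager in the example would act on. In practice the curves are computed on a fitted model over a finer grid and then plotted.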
Conclusion
In this article, we introduced interpretable ML for smart cities and briefly reviewed two groups of available methods. The first group consists of interpretable models derived by ML algorithms such as linear regression and decision trees. These models offer good interpretability but often perform poorly because of their simplicity. The second group consists of model-agnostic methods, which target more advanced ML algorithms such as neural networks and analyze model behavior by feeding the model different inputs.
The adoption of ML in smart cities is still at an early stage, and ML itself is evolving fast. Moving forward, there is ample room for improvement in ML-based smart city applications. This article introduced a new dimension, interpretability, for using ML in smart cities. Further efforts may also cover other ML dimensions, such as knowledge embedding and transfer learning for smart cities.
References
[1] Mohammadi M, Al-Fuqaha A. Enabling cognitive smart cities using big data and machine learning: Approaches and challenges. IEEE Communications Magazine. 2018 Feb 13;56(2):94-101.
[2] Sharma N, Sharma P, Irwin D, Shenoy P. Predicting solar generation from weather forecasts using machine learning. In 2011 IEEE International Conference on Smart Grid Communications (SmartGridComm) 2011 Oct 17 (pp. 528-533). IEEE.
[3] Marino DL, Amarasinghe K, Manic M. Building energy load forecasting using deep neural networks. In IECON 2016-42nd Annual Conference of the IEEE Industrial Electronics Society 2016 Oct 23 (pp. 7046-7051). IEEE.
[4] Zhang W, Hu W, Wen Y. Thermal comfort modeling for smart buildings: A fine-grained deep learning approach. IEEE Internet of Things Journal. 2018 Sep 20;6(2):2540-9.
[5] Holzinger A. From machine learning to explainable AI. In 2018 World Symposium on Digital Intelligence for Systems and Machines (DISA) 2018 Aug 23 (pp. 55-66). IEEE.
[6] Friedman JH. Greedy function approximation: a gradient boosting machine. Annals of Statistics. 2001 Oct 1:1189-232.
Wei Zhang (S’10–M’16) received the Ph.D. degree in computer science from the School of Computer Science and Engineering, Nanyang Technological University (NTU), Singapore, in 2015. He is currently an Assistant Professor with the Information and Communications Technology Cluster, Singapore Institute of Technology (SIT). Before joining SIT, he was a Scientist at the Agency for Science, Technology, and Research (A*STAR), Singapore. From 2015 to 2017, he was a Research Fellow at Nanyang Technological University (NTU), Singapore. His current research interests include energy optimization for information and communication technology, as well as smart buildings and smart cities.
Xue Feng obtained her PhD degree in Electrical and Electronic Engineering from Nanyang Technological University in June 2016. She joined the Land Transport Authority of Singapore and worked in its e-mobility office from September 2015 to June 2017. Since September 2017, she has been working as an Assistant Professor at the Singapore Institute of Technology. Her research interests include microgrid operation and planning, data analytics applications in power systems, and data-driven energy storage system modelling, planning and control.
King-Jet Tseng was born in Singapore and studied at Gan Eng Seng School and Hwa Chong Junior College before receiving his B.Eng. (First Class) and M.Eng. from the National University of Singapore, and his Ph.D. from Cambridge University in the United Kingdom. He has more than 25 years of academic, research, industrial and professional experience in electrical power and energy systems. Currently, he is the Professor and Director of Electrical Power Engineering at the Singapore Institute of Technology. He has made important contributions to interior permanent magnet synchronous machines, resilient grid interconnection of distributed energy resources, dynamic voltage compensation, arc modeling for transient simulations and energy storage systems in utility applications.
Yonggang Wen (S’99–M’08–SM’14–F’20) is the Professor of Computer Science and Engineering at Nanyang Technological University (NTU), Singapore. He has also served as the Associate Dean (Research) at College of Engineering at NTU Singapore since 2018. He received his PhD degree in Electrical Engineering and Computer Science (minor in Western Literature) from Massachusetts Institute of Technology (MIT), Cambridge, USA, in 2008. His research interests include cloud computing, green data center, distributed machine learning, blockchain, big data analytics, multimedia network and mobile computing. He is a Fellow of IEEE.
8 March 2020