Scientists train artificial neural networks to detect real-time releases of toxins
2022.03.08 9:20 - Piotr Spinalski
When a release of hazardous substances is detected, the priority is to locate the source quickly and precisely and to predict the direction in which the substance will spread. The dispersion models currently in use require very large computing resources. They could, however, be replaced by models based on artificial neural networks (ANNs), which would allow contamination to be monitored in real time. Scientists from the Department of Complex Systems at NCBJ are taking part in research into the feasibility of such models.
For several years, the MANHAZ Hazard Analysis Center has been working on algorithms that locate the source of contamination from the concentrations of the released substance recorded by a network of detectors. The main task of the emergency response teams operating in every city is to respond quickly to threats to people and the environment, and reaction time is the primary determinant of whether a response succeeds or fails. Chemicals are now used in most industries, so the transport and storage of toxic materials carry a constant risk of a release into the atmosphere and of contamination. Situations in which sensors located throughout a city report a non-zero concentration of a hazardous substance whose source is unknown are a major challenge. In such cases, the system must be able to estimate the most likely location of the contamination source in real time, based solely on concentration data from the sensor network.
Algorithms for this task fall into two categories. The first comprises algorithms based on the backward approach, i.e. analysis that starts from the final stage of the problem; these are intended for open areas or continental-scale problems. The second comprises algorithms that sample the parameters of a suitable dispersion model (such as the source location) and select the set that gives the smallest difference between the model output and the actual concentration measurements made by the detector network. In other words, a sampling algorithm searches for the optimal dispersion-model parameters by comparing the model results with the detected contamination. However efficient the parameter-scanning algorithm is, each reconstruction requires many runs of the model. Reconstruction in urban areas, which is of primary interest to the researchers, requires advanced dispersion models that account for the turbulence of the wind field around buildings. Computational Fluid Dynamics (CFD) models are the most reliable and accurate, but they are very computationally demanding. The dispersion model has to be run tens of thousands of times to find the most likely source of contamination, so it must be fast to be usable in a real-time emergency-response system. If, for example, a single dispersion calculation for an urbanized area takes 10 minutes on average, then even 10,000 sequential runs would take about 69 days, far too long for a full reconstruction in an emergency.
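The second, sampling-based category can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: `toy_dispersion` is a hypothetical stand-in for a real dispersion model (it simply lets concentration decay with squared distance and ignores wind and buildings), plain uniform random search stands in for whatever sampling algorithm a real system would use, and the sensor grid and source position are invented.

```python
import math
import random


def toy_dispersion(source, sensors, release_rate=1.0):
    """Hypothetical stand-in for a dispersion model (NOT a real one):
    concentration decays with the squared distance from the source."""
    sx, sy = source
    return [release_rate / (1.0 + (x - sx) ** 2 + (y - sy) ** 2)
            for x, y in sensors]


def misfit(predicted, measured):
    """Sum of squared differences between model output and measurements."""
    return sum((p - m) ** 2 for p, m in zip(predicted, measured))


def reconstruct_source(sensors, measured, n_samples, bounds, seed=0):
    """Sample candidate source locations uniformly and keep the one whose
    model output is closest to the measurements. One model run per sample."""
    rng = random.Random(seed)
    (xmin, xmax), (ymin, ymax) = bounds
    best, best_err = None, math.inf
    for _ in range(n_samples):
        candidate = (rng.uniform(xmin, xmax), rng.uniform(ymin, ymax))
        err = misfit(toy_dispersion(candidate, sensors), measured)
        if err < best_err:
            best, best_err = candidate, err
    return best, best_err


# Invented setup: a 6 x 6 sensor grid and a source hidden from the algorithm.
sensors = [(x, y) for x in range(0, 11, 2) for y in range(0, 11, 2)]
true_source = (3.0, 7.0)
measured = toy_dispersion(true_source, sensors)

estimate, err = reconstruct_source(
    sensors, measured, n_samples=20_000, bounds=((0, 10), (0, 10))
)
print("estimated source:", estimate, "misfit:", err)
```

The loop calls the forward model once per candidate, which is why the cost of a single model run dominates the total reconstruction time.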
The solution that Anna Wawrzyńczak-Szaban, Ph.D., of the MANHAZ Hazard Analysis Center at NCBJ is working on, in cooperation with the Institute of Informatics at UPH in Siedlce, is to use an artificial neural network in place of the dispersion model in the reconstruction system for urbanized areas. The aim is for the network to simulate the airborne transport of pollutants in urban areas effectively. If this succeeds, the ANN can act as the dispersion model in a system that locates the source of contamination in real time. "The main advantage of an ANN is its very short response time," says Dr. Anna Wawrzyńczak-Szaban. "Obviously, the ANN must be trained for a fixed city topology, under real meteorological conditions, using an appropriate and validated dispersion model. This process requires many simulations, which serve as training data sets for the ANN. The training process is computationally expensive, but once trained, the method would be a quick tool for estimating point concentrations for a given pollution source."
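To make the idea of an ANN estimating a point concentration concrete, here is a minimal sketch of the forward pass of a small fully connected network with a configurable number of layers, neurons and hidden activation. It is not the network from the study: the input features, the layer sizes and the softplus output (chosen here only to keep the predicted concentration non-negative) are all assumptions, and the weights are random, so the output is meaningless until the network is trained on dispersion-model simulations.

```python
import math
import random

ACTIVATIONS = {
    "relu": lambda z: max(0.0, z),
    "tanh": math.tanh,
    # Numerically stable softplus: log(1 + exp(z)), always positive.
    "softplus": lambda z: math.log1p(math.exp(-abs(z))) + max(z, 0.0),
}


def init_mlp(layer_sizes, seed=0):
    """Create random weights and zero biases for consecutive layer pairs."""
    rng = random.Random(seed)
    layers = []
    for n_in, n_out in zip(layer_sizes, layer_sizes[1:]):
        scale = 1.0 / math.sqrt(n_in)
        weights = [[rng.uniform(-scale, scale) for _ in range(n_in)]
                   for _ in range(n_out)]
        biases = [0.0] * n_out
        layers.append((weights, biases))
    return layers


def forward(layers, features, hidden_activation="tanh"):
    """Propagate one input vector through the network."""
    act = ACTIVATIONS[hidden_activation]
    values = list(features)
    last = len(layers) - 1
    for i, (weights, biases) in enumerate(layers):
        pre = [sum(w * v for w, v in zip(row, values)) + b
               for row, b in zip(weights, biases)]
        # Assumption: softplus on the output layer keeps concentration >= 0.
        values = [ACTIVATIONS["softplus"](z) if i == last else act(z)
                  for z in pre]
    return values


# Hypothetical input vector: source x, source y, sensor x, sensor y,
# wind speed, wind direction (rad), time since release (min).
features = [3.0, 7.0, 4.0, 6.0, 2.5, 0.8, 10.0]
net = init_mlp([len(features), 16, 16, 1])  # two hidden layers of 16 neurons
concentration = forward(net, features)[0]
print("untrained network output:", concentration)
```

Once the weights are fixed, a forward pass is only a handful of multiply-add operations per neuron, which is where the short response time comes from.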
A study published by the scientists 1) presents the results of training a neural network on data describing the spread of airborne toxins in central London, using the test domain of the DAPPLE field experiment 2). The ANN training data were generated with the Quick Urban & Industrial Complex (QUIC) dispersion model. "We tested different ANN structures, i.e. different numbers of layers and neurons and different activation functions. The tests confirmed that the trained ANN can simulate the turbulent transport of airborne toxins in a highly urbanized area sufficiently well," explains Dr. Anna Wawrzyńczak-Szaban. "In addition, we showed that using an ANN can reduce the response time of the reconstruction system. The ANN presented in the paper needed 3 s to estimate thirty-minute gas concentrations at 196,000 sensor points. For the QUIC model, the time was estimated at no less than 300 s, which gives a 100-fold speed-up of the calculations. Taking this into account, the reconstruction time in a real emergency can be short, so the source of contamination can be located quickly."
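The timing figures quoted above can be put side by side in a few lines of arithmetic. The 3 s, 300 s and 196,000-point values come from the quotation; the run count of 10,000 is an assumption (the lower end of "tens of thousands"), and the totals assume that every run evaluates the whole domain sequentially, which a real system might avoid by evaluating only at detector locations.

```python
SENSOR_POINTS = 196_000   # points evaluated per run (from the quoted study)
ANN_SECONDS = 3.0         # ANN time per run (from the quoted study)
QUIC_SECONDS = 300.0      # lower bound for QUIC per run (from the quoted study)
ASSUMED_RUNS = 10_000     # ASSUMPTION: model runs needed for one reconstruction


def total_hours(n_runs, seconds_per_run):
    """Wall-clock hours for n_runs sequential model evaluations."""
    return n_runs * seconds_per_run / 3600.0


speedup = QUIC_SECONDS / ANN_SECONDS                   # 100.0
ann_points_per_second = SENSOR_POINTS / ANN_SECONDS
quic_points_per_second = SENSOR_POINTS / QUIC_SECONDS

quic_total_h = total_hours(ASSUMED_RUNS, QUIC_SECONDS)
ann_total_h = total_hours(ASSUMED_RUNS, ANN_SECONDS)

print(f"speed-up: {speedup:.0f}x")
print(f"ANN throughput:  {ann_points_per_second:,.0f} points/s")
print(f"QUIC throughput: {quic_points_per_second:,.0f} points/s")
print(f"{ASSUMED_RUNS} runs: QUIC {quic_total_h:.1f} h, ANN {ann_total_h:.1f} h")
```

Because the speed-up applies to every run, the ratio between the two totals stays at 100 whatever run count is assumed.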
In the course of the research, it turned out that giving the ANN complete information during training can create computational difficulties. For example, in a single simulation of the dispersion of airborne toxins in an urban area, up to 90% of the sensor readings may be zero. The ANN's target data then consist of only a small share of positive values and a large majority of zeros. As a result, the ANN focuses on what is most common, the zeros, and fails to fit the values that actually matter for the problem. "When you include zero concentration values in your training data, there are several questions to face: How do you take the zeros into account? How do you scale a given interval to 'hide' the zeros? Should you include zeros at all? Should their number be limited?" emphasizes Dr. Wawrzyńczak-Szaban.
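One of the options raised in these questions, limiting the number of zeros, can be sketched as follows. This is only an illustration of that single option, not the answer reached in the study: the helper names, the synthetic data (90% zero targets) and the 1:1 cap are all assumptions.

```python
import random


def zero_fraction(targets):
    """Share of training targets that are exactly zero."""
    return sum(1 for t in targets if t == 0.0) / len(targets)


def limit_zeros(samples, max_zero_ratio, seed=0):
    """Keep every positive-target sample and at most
    max_zero_ratio * (number of positive samples) zero-target samples."""
    rng = random.Random(seed)
    positives = [s for s in samples if s[1] > 0.0]
    zeros = [s for s in samples if s[1] == 0.0]
    keep = min(len(zeros), int(max_zero_ratio * len(positives)))
    kept = positives + rng.sample(zeros, keep)
    rng.shuffle(kept)  # avoid feeding all positives first during training
    return kept


# Synthetic (features, concentration) pairs: 900 zero and 100 positive targets.
rng = random.Random(1)
samples = (
    [((rng.random(), rng.random()), 0.0) for _ in range(900)]
    + [((rng.random(), rng.random()), rng.uniform(0.1, 5.0)) for _ in range(100)]
)

balanced = limit_zeros(samples, max_zero_ratio=1.0)

print("zeros before:", zero_fraction([t for _, t in samples]))
print("zeros after: ", zero_fraction([t for _, t in balanced]))
```

Capping the zeros changes what the network sees most often, at the price of discarding information about where the plume is absent, which is exactly the trade-off the questions above point to.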
The results of the analyses were presented at the ICMSQUARE 2021 conference and published in: M. Berendt-Marchel, A. Wawrzyńczak, "Does the Zero Carry Essential Information for Artificial Neural Network learning to simulate the contaminant transport in Urban Areas?", Journal of Physics: Conference Series 2090 (1), 012027. https://iopscience.iop.org/article/10.1088/1742-6596/2090/1/012027
1) Wawrzyńczak, A., & Berendt-Marchel, M. (2020). Computation of the Airborne Contaminant Transport in Urban Area by the Artificial Neural Network. Computational Science – ICCS 2020: 20th International Conference, Amsterdam, The Netherlands, June 3–5, 2020, Proceedings, Part II, 12138, 401–413. https://doi.org/10.1007/978-3-030-50417-5_30
2) Wood et al.: Dispersion experiments in central London: the 2007 DAPPLE project. Bulletin of the American Meteorological Society 90(7), 955–970 (2009).