Jéssica Sena de Souza
Jéssica Sena is a PhD student in Computer Science at Universidade Federal de Minas Gerais (UFMG), where she also received her B.Sc. and M.Sc. degrees in Information Systems and Computer Science, respectively. She is currently a researcher at the Smart Sense Laboratory (SENSE/UFMG). Her research interests include computer vision, smart surveillance and machine learning applications, with a focus on visual and sensor-based pattern recognition.
Sensor-based Human Activity Recognition (sensor-based HAR) provides valuable knowledge to many areas, such as medicine, the military and security. Recently, wearable devices have gained ground as a relevant source of data because they make data capture easy, are used by a massive number of people, and are comfortable and convenient to wear. In addition, the large number of sensors present in these devices provides complementary data, since each sensor captures different information. However, two issues arise: the heterogeneity of the data coming from multiple sensors and the temporal nature of the sensor data. We believe that handling these issues correctly can yield valuable information. To handle the first issue, we propose to process each sensor separately, learning the features of each sensor and performing the classification before fusing with the other sensors. To exploit the second, we use an approach that extracts patterns at multiple temporal scales of the data. This is convenient since the data are already a temporal sequence, and the multiple scales extracted provide meaningful information regarding the activities performed by the users. We extract the multiple temporal scales using an ensemble of Deep Convolutional Neural Networks (DCNNs), in which each DCNN uses a convolutional kernel with a different height. Since the rows of the sensor data correspond to samples captured over time, each kernel height corresponds to a temporal scale from which we can extract patterns. Consequently, our approach is able to extract both simple movement patterns, such as a wrist twist when picking up a spoon, and complex movements, such as the human gait. This multimodal and multi-temporal approach outperforms previous state-of-the-art works on seven important datasets using two different protocols.
We also demonstrate that our proposed set of kernels improves sensor-based HAR in another multi-kernel approach, the widely employed Inception network.
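
The two ideas described above (one branch per sensor with late fusion, and one kernel height per temporal scale) can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the weights are random rather than trained, each "network" is reduced to a single convolution followed by ReLU and global max pooling, and the kernel heights, the sensor names, the number of classes, the choice of a kernel that spans all sensor axes, and the averaging used for fusion are all assumptions made for illustration.

```python
import numpy as np


def conv_valid(window, kernel):
    """Slide a (height, channels) kernel along the time axis of a (time, channels) window."""
    t, c = window.shape
    h, kc = kernel.shape
    assert kc == c and h <= t
    # Assumption: the kernel spans every channel, so it only moves along time (rows).
    return np.array([np.sum(window[i:i + h] * kernel) for i in range(t - h + 1)])


def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()


def branch_scores(window, kernel_heights, n_classes, rng):
    """Per-sensor branch: one tiny stand-in 'network' per kernel height (temporal scale)."""
    scores = []
    for h in kernel_heights:
        kernel = rng.standard_normal((h, window.shape[1]))  # random, untrained weights
        feature = np.maximum(conv_valid(window, kernel), 0.0).max()  # ReLU + global max pool
        w = rng.standard_normal(n_classes)  # stand-in for a trained classifier layer
        scores.append(softmax(w * feature))
    # Ensemble over temporal scales: average the class scores (assumed combination rule).
    return np.mean(scores, axis=0)


def late_fusion(per_sensor_scores):
    """Fuse sensors after classification by averaging their class scores (assumed rule)."""
    return np.mean(per_sensor_scores, axis=0)


rng = np.random.default_rng(0)
# Hypothetical input: 100 time steps (rows) of 3-axis data for each sensor.
sensors = {
    "accelerometer": rng.standard_normal((100, 3)),
    "gyroscope": rng.standard_normal((100, 3)),
}
kernel_heights = [2, 5, 12]  # hypothetical short, medium and long temporal scales
n_classes = 4                # hypothetical number of activities

fused = late_fusion(
    [branch_scores(w, kernel_heights, n_classes, rng) for w in sensors.values()]
)
print(fused)  # one fused score per activity class
```

A kernel of height 2 only sees two consecutive samples and so can respond to brief movements, while a kernel of height 12 covers a longer stretch of the window, which is the intuition behind associating each kernel height with a temporal scale.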