2020 |
Júnior, Carlos Antônio Caetano Motion-Based Representations for Activity Recognition PhD Thesis Universidade Federal de Minas Gerais, 2020. Abstract | BibTeX | Tags: Activity Recognition, convolutional neural networks (CNNs), optical flow, spatiotemporal information, temporal stream @phdthesis{CarlosCaetano:2020:PhD, title = {Motion-Based Representations for Activity Recognition}, author = {Carlos Antônio Caetano Júnior}, year = {2020}, date = {2020-01-27}, school = {Universidade Federal de Minas Gerais}, abstract = {In this dissertation we propose four different representations based on motion information for activity recognition. The first is a spatiotemporal local feature descriptor that extracts a robust set of statistical measures to describe motion patterns. This descriptor measures meaningful properties of co-occurrence matrices and captures local space-time characteristics of the motion through the neighboring optical flow magnitude and orientation. The second is a novel compact mid-level representation based on co-occurrence matrices of codewords. This representation expresses the distribution of the features at a given offset over feature codewords from a pre-computed codebook and encodes global structures in various local region-based features. The third is a novel temporal stream for two-stream convolutional networks that employs images computed from the optical flow magnitude and orientation to learn the motion in a richer manner. The method applies simple non-linear transformations on the vertical and horizontal components of the optical flow to generate input images for the temporal stream. Finally, the fourth is a novel skeleton image representation to be used as input to convolutional neural networks (CNNs). The proposed approach encodes the temporal dynamics by explicitly computing the magnitude and orientation values of the skeleton joints. Moreover, the representation has the advantage of combining the use of reference joints and a tree-structured skeleton, incorporating different spatial relationships between the joints and preserving important spatial relations. The experimental evaluations carried out on challenging well-known activity recognition datasets (KTH, UCF Sports, HMDB51, UCF101, NTU RGB+D 60 and NTU RGB+D 120) demonstrated that the proposed representations achieved better or similar accuracy results in comparison to the state of the art, indicating the suitability of our approaches as video representations.}, keywords = {Activity Recognition, convolutional neural networks (CNNs), optical flow, spatiotemporal information, temporal stream}, pubstate = {published}, tppubtype = {phdthesis} } |
2019 |
de Prates, Raphael Felipe Carvalho Matching People Across Surveillance Cameras PhD Thesis Universidade Federal de Minas Gerais, 2019. Abstract | BibTeX | Tags: Computer vision, Person Re-Identification, Smart Surveillance @phdthesis{RaphaelPrates:2020:PhD, title = {Matching People Across Surveillance Cameras}, author = {Raphael Felipe Carvalho de Prates}, year = {2019}, date = {2019-03-29}, school = {Universidade Federal de Minas Gerais}, abstract = {The number of surveillance camera networks is increasing as a consequence of the escalation of security concerns. The large amount of data collected demands intelligent surveillance systems to extract information that is useful to security officers. In order to achieve this goal, these systems must be able to correlate information captured by different surveillance cameras. In this scenario, person re-identification is of central importance in establishing a global identity for individuals captured by different cameras using only visual appearance. However, this is a challenging task, since the same person when captured by different cameras undergoes a drastic change of appearance as a consequence of variations in point of view, illumination and pose. Recent work addresses person re-identification by proposing robust visual descriptors or cross-view matching functions, which are functions that learn to match images from different cameras. However, most of these works are impaired by problems such as ambiguity among individuals, scalability, and the reduced number of labeled images in the training set. In this thesis, we address the problem of matching individuals between cameras in order to tackle the aforementioned problems and, therefore, obtain better results. Specifically, we propose two directions: the learning of subspaces and models of indirect identification. The first learns a common subspace that is scalable with respect to the number of cameras and robust in relation to the amount of training images available. The second matches probe and gallery images indirectly by computing their similarities with training samples. Experimental results validate both approaches in the person re-identification problem, considering both a single pair of cameras and more realistic situations with multiple cameras.}, keywords = {Computer vision, Person Re-Identification, Smart Surveillance}, pubstate = {published}, tppubtype = {phdthesis} } |
Jordao, Artur; Kloss, Ricardo; Yamada, Fernando; Schwartz, William Robson Pruning Deep Networks using Partial Least Squares Inproceedings British Machine Vision Conference (BMVC) Workshops, pp. 1-9, 2019. @inproceedings{Jordao:2019:BMVC, title = {Pruning Deep Networks using Partial Least Squares}, author = {Artur Jordao and Ricardo Kloss and Fernando Yamada and William Robson Schwartz}, url = {http://www.dcc.ufmg.br/~william/papers/paper_2019_BMVCW_Jordao.pdf}, year = {2019}, date = {2019-01-01}, booktitle = {British Machine Vision Conference (BMVC) Workshops}, pages = {1-9}, keywords = {}, pubstate = {published}, tppubtype = {inproceedings} } |
de Prates, Raphael Felipe Carvalho; Schwartz, William Robson Kernel cross-view collaborative representation based classification for person re-identification Journal Article Journal of Visual Communication and Image Representation, 58 (1), pp. 304-315, 2019. Links | BibTeX | Tags: Kernel collaborative representation based classification, Person Re-Identification @article{Prates:2019:JVCI, title = {Kernel cross-view collaborative representation based classification for person re-identification}, author = {Raphael Felipe Carvalho de Prates and William Robson Schwartz}, doi = {https://doi.org/10.1016/j.jvcir.2018.12.003}, year = {2019}, date = {2019-01-01}, journal = {Journal of Visual Communication and Image Representation}, volume = {58}, number = {1}, pages = {304-315}, keywords = {Kernel collaborative representation based classification, Person Re-Identification}, pubstate = {published}, tppubtype = {article} } |
Caetano, Carlos; de Melo, Victor H C; Brémond, François; dos Santos, Jefersson A; Schwartz, William Robson Magnitude-Orientation Stream network and depth information applied to activity recognition Journal Article Journal of Visual Communication and Image Representation, 63, pp. 102596, 2019, ISSN: 1047-3203. Abstract | Links | BibTeX | Tags: @article{CAETANO2019102596, title = {Magnitude-Orientation Stream network and depth information applied to activity recognition}, author = {Carlos Caetano and Victor H C de Melo and François Brémond and Jefersson A dos Santos and William Robson Schwartz}, url = {http://www.sciencedirect.com/science/article/pii/S1047320319302172}, doi = {https://doi.org/10.1016/j.jvcir.2019.102596}, issn = {1047-3203}, year = {2019}, date = {2019-01-01}, journal = {Journal of Visual Communication and Image Representation}, volume = {63}, pages = {102596}, abstract = {The temporal component of videos provides an important clue for activity recognition, as a number of activities can be reliably recognized based on the motion information. In view of that, this work proposes a novel temporal stream for two-stream convolutional networks based on images computed from the optical flow magnitude and orientation, named Magnitude-Orientation Stream (MOS), to learn the motion in a richer manner. Our method applies simple non-linear transformations on the vertical and horizontal components of the optical flow to generate input images for the temporal stream. Moreover, we also employ depth information as a weighting scheme on the magnitude information to compensate for the distance of the subjects performing the activity to the camera. Experimental results, carried out on two well-known datasets (UCF101 and NTU), demonstrate that using our proposed temporal stream as input to existing neural network architectures can improve their performance for activity recognition. Results demonstrate that our temporal stream provides complementary information able to improve the classical two-stream methods, indicating the suitability of our approach to be used as a temporal video representation.}, keywords = {}, pubstate = {published}, tppubtype = {article} } |
Goncalves, Gabriel Resende; Diniz, Matheus Alves; Laroca, Rayson; Menotti, David; Schwartz, William Robson Multi-Task Learning for Low-Resolution License Plate Recognition Inproceedings Iberoamerican Congress on Pattern Recognition (CIARP), pp. 1-10, 2019. @inproceedings{Goncalves:2019:CIARP, title = {Multi-Task Learning for Low-Resolution License Plate Recognition}, author = {Gabriel Resende Goncalves and Matheus Alves Diniz and Rayson Laroca and David Menotti and William Robson Schwartz}, url = {http://www.dcc.ufmg.br/~william/papers/paper_2019_CIARP_Goncalves.pdf}, year = {2019}, date = {2019-01-01}, booktitle = {Iberoamerican Congress on Pattern Recognition (CIARP)}, pages = {1-10}, keywords = {}, pubstate = {published}, tppubtype = {inproceedings} } |
Vareto, Rafael Henrique; Diniz, Matheus Alves; Schwartz, William Robson Face Spoofing Detection on Low-Power Devices using Embeddings with Spatial and Frequency-based Descriptors Inproceedings Iberoamerican Congress on Pattern Recognition (CIARP), pp. 1-10, 2019. @inproceedings{Vareto:2019:CIARP, title = {Face Spoofing Detection on Low-Power Devices using Embeddings with Spatial and Frequency-based Descriptors}, author = {Rafael Henrique Vareto and Matheus Alves Diniz and William Robson Schwartz}, url = {http://www.dcc.ufmg.br/~william/papers/paper_2019_CIARP_Vareto.pdf}, year = {2019}, date = {2019-01-01}, booktitle = {Iberoamerican Congress on Pattern Recognition (CIARP)}, pages = {1-10}, keywords = {}, pubstate = {published}, tppubtype = {inproceedings} } |
de Lima, Vitor Cezar; Schwartz, William Robson Gait Recognition using Pose Estimation and Signal Processing Inproceedings Iberoamerican Congress on Pattern Recognition (CIARP), pp. 1-10, 2019. @inproceedings{Lima:2019:CIARP, title = {Gait Recognition using Pose Estimation and Signal Processing}, author = {Vitor Cezar de Lima and William Robson Schwartz}, url = {http://www.dcc.ufmg.br/~william/papers/paper_2019_CIARP_Vitor.pdf}, year = {2019}, date = {2019-01-01}, booktitle = {Iberoamerican Congress on Pattern Recognition (CIARP)}, pages = {1-10}, keywords = {}, pubstate = {published}, tppubtype = {inproceedings} } |
Bastos, Igor; de Melo, Victor Hugo Cunha; Schwartz, William Robson Multi-Loss Recurrent Residual Networks for Gesture Detection and Recognition Inproceedings Conference on Graphics, Patterns and Images (SIBGRAPI), pp. 1-8, 2019. @inproceedings{Bastos:2019:SIBGRAPIb, title = {Multi-Loss Recurrent Residual Networks for Gesture Detection and Recognition}, author = {Igor Bastos and Victor Hugo Cunha de Melo and William Robson Schwartz}, url = {http://www.dcc.ufmg.br/~william/papers/paper_2019_SIBGRAPI_Bastos.pdf}, year = {2019}, date = {2019-01-01}, booktitle = {Conference on Graphics, Patterns and Images (SIBGRAPI)}, pages = {1-8}, keywords = {}, pubstate = {published}, tppubtype = {inproceedings} } |
Mendes, G; Paiva, Jose Gustavo; Schwartz, William Robson Point-placement techniques and Temporal Self-similarity Maps for Visual Analysis of Surveillance Videos Inproceedings International Conference Information Visualisation, pp. 1-8, 2019. BibTeX | Tags: @inproceedings{Mendes:2019:ICIV, title = {Point-placement techniques and Temporal Self-similarity Maps for Visual Analysis of Surveillance Videos}, author = {G Mendes and Jose Gustavo Paiva and William Robson Schwartz}, year = {2019}, date = {2019-01-01}, booktitle = {International Conference Information Visualisation}, pages = {1-8}, keywords = {}, pubstate = {published}, tppubtype = {inproceedings} } |
Caetano, Carlos; Bremond, Francois; Schwartz, William Robson Skeleton Image Representation for 3D Action Recognition based on Tree Structure and Reference Joints Inproceedings Conference on Graphics, Patterns and Images (SIBGRAPI), pp. 1-8, 2019. @inproceedings{Caetano:2019:SIBGRAPIb, title = {Skeleton Image Representation for 3D Action Recognition based on Tree Structure and Reference Joints}, author = {Carlos Caetano and Francois Bremond and William Robson Schwartz}, url = {http://www.dcc.ufmg.br/~william/papers/paper_2019_SIBGRAPI_Caetano.pdf}, year = {2019}, date = {2019-01-01}, booktitle = {Conference on Graphics, Patterns and Images (SIBGRAPI)}, pages = {1-8}, keywords = {}, pubstate = {published}, tppubtype = {inproceedings} } |
Caetano, Carlos; Souza, Jessica; Bremond, Francois; Santos, Jefersson; Schwartz, William Robson SkeleMotion: A New Representation of Skeleton Joint Sequences based on Motion Information for 3D Action Recognition Inproceedings 16th International Conference on Advanced Video and Signal-based Surveillance (AVSS), pp. 1-6, 2019. @inproceedings{Caetano:2019:AVSSb, title = {SkeleMotion: A New Representation of Skeleton Joint Sequences based on Motion Information for 3D Action Recognition}, author = {Carlos Caetano and Jessica Souza and Francois Bremond and Jefersson Santos and William Robson Schwartz}, url = {http://www.dcc.ufmg.br/~william/papers/paper_2019_AVSS_Caetano.pdf}, year = {2019}, date = {2019-01-01}, booktitle = {16th International Conference on Advanced Video and Signal-based Surveillance (AVSS)}, pages = {1-6}, keywords = {}, pubstate = {published}, tppubtype = {inproceedings} } |
2018 |
Sena, Jessica Human Activity Recognition based on Wearable Sensors using Multiscale DCNN Ensemble Masters Thesis Federal University of Minas Gerais, 2018. Abstract | BibTeX | Tags: CNN ensemble, Human Activity Recognition, Multimodal Data, Multiscale temporal data @mastersthesis{Sena:2018:MSc, title = {Human Activity Recognition based on Wearable Sensors using Multiscale DCNN Ensemble}, author = {Jessica Sena}, year = {2018}, date = {2018-10-18}, school = {Federal University of Minas Gerais}, abstract = {Sensor-based Human Activity Recognition (sensor-based HAR) provides valuable knowledge to many areas, such as the medical, military and security domains. Recently, wearable devices have gained space as a relevant source of data due to the ease of data capture, the massive number of people who use these devices and the comfort and convenience of the device. In addition, the large number of sensors present in these devices provides complementary data, as each sensor provides different information. However, there are two issues: heterogeneity between the data from multiple sensors and the temporal nature of the sensor data. We believe that mitigating these issues might provide valuable information if we handle the data correctly. To handle the first issue, we propose to process each sensor separately, learning the features of each sensor and performing the classification before fusing with the other sensors. To exploit the second issue, we use an approach to extract patterns at multiple temporal scales of the data. This is convenient since the data are already a temporal sequence and the multiple scales extracted provide meaningful information regarding the activities performed by the users. We extract multiple temporal scales using an ensemble of Deep Convolutional Neural Networks (DCNN). In this ensemble, we use a convolutional kernel of a different height for each DCNN. Considering that the number of rows in the sensor data reflects the data captured over time, each kernel height reflects a temporal scale from which we can extract patterns. Consequently, our approach is able to extract both simple movement patterns, such as a wrist twist when picking up a spoon, and complex movements, such as the human gait. This multimodal and multi-temporal approach outperforms previous state-of-the-art works on seven important datasets using two different protocols. We also demonstrate that the use of our proposed set of kernels improves sensor-based HAR in another multi-kernel approach, the widely employed Inception network.}, keywords = {CNN ensemble, Human Activity Recognition, Multimodal Data, Multiscale temporal data}, pubstate = {published}, tppubtype = {mastersthesis} } |
de Melo, Victor Hugo Cunha; Santos, Jesimon Barreto; Junior, Carlos Antonio Caetano; Sena, Jessica; Penatti, Otavio A B; Schwartz, William Robson Object-based Temporal Segment Relational Network for Activity Recognition Inproceedings Conference on Graphics, Patterns and Images (SIBGRAPI), pp. 1-8, 2018. BibTeX | Tags: Activity Recognition, DeepEyes, HAR-HEALTH, Relational Reasoning, Spatial Pyramid @inproceedings{DeMelo:2018:SIBGRAPI, title = {Object-based Temporal Segment Relational Network for Activity Recognition}, author = {Victor Hugo Cunha de Melo and Jesimon Barreto Santos and Carlos Antonio Caetano Junior and Jessica Sena and Otavio A B Penatti and William Robson Schwartz}, year = {2018}, date = {2018-09-21}, booktitle = {Conference on Graphics, Patterns and Images (SIBGRAPI)}, pages = {1-8}, keywords = {Activity Recognition, DeepEyes, HAR-HEALTH, Relational Reasoning, Spatial Pyramid}, pubstate = {published}, tppubtype = {inproceedings} } |
Sena, Jessica; Santos, Jesimon Barreto; Schwartz, William Robson Multiscale DCNN Ensemble Applied to Human Activity Recognition Based on Wearable Sensors Inproceedings 26th European Signal Processing Conference (EUSIPCO 2018), pp. 1-5, 2018. Links | BibTeX | Tags: Activity Recognition Based on Wearable Sensors, Deep Learning, DeepEyes, HAR-HEALTH, Human Activity Recognition, Multimodal Data, Wearable Sensors @inproceedings{Sena:2018:EUSIPCO, title = {Multiscale DCNN Ensemble Applied to Human Activity Recognition Based on Wearable Sensors}, author = {Jessica Sena and Jesimon Barreto Santos and William Robson Schwartz}, url = {http://smartsenselab.dcc.ufmg.br/wp-content/uploads/2019/02/PID5428933.pdf}, year = {2018}, date = {2018-09-06}, booktitle = {26th European Signal Processing Conference (EUSIPCO 2018)}, pages = {1-5}, keywords = {Activity Recognition Based on Wearable Sensors, Deep Learning, DeepEyes, HAR-HEALTH, Human Activity Recognition, Multimodal Data, Wearable Sensors}, pubstate = {published}, tppubtype = {inproceedings} } |
Gonçalves, Gabriel Resende; Diniz, Matheus Alves; Laroca, Rayson; Menotti, David; Schwartz, William Robson Real-time Automatic License Plate Recognition Through Deep Multi-Task Networks Inproceedings Conference on Graphics, Patterns and Images (SIBGRAPI), pp. 1-8, 2018. Links | BibTeX | Tags: Automatic License Plate Recognition, Deep Learning, DeepEyes, GigaFrames, Multi-Task Learning, Sense-ALPR @inproceedings{Goncalves:2018:SIBGRAPI, title = {Real-time Automatic License Plate Recognition Through Deep Multi-Task Networks}, author = {Gabriel Resende Gonçalves and Matheus Alves Diniz and Rayson Laroca and David Menotti and William Robson Schwartz}, url = {http://smartsenselab.dcc.ufmg.br/wp-content/uploads/2019/02/paper.pdf}, year = {2018}, date = {2018-09-04}, booktitle = {Conference on Graphics, Patterns and Images (SIBGRAPI)}, pages = {1-8}, keywords = {Automatic License Plate Recognition, Deep Learning, DeepEyes, GigaFrames, Multi-Task Learning, Sense-ALPR}, pubstate = {published}, tppubtype = {inproceedings} } |
Colque, Rensso Victor Hugo Mora Robust Approaches for Anomaly Detection Applied to Video Surveillance Tese PhD Federal University of Minas Gerais, 2018. Resumo | BibTeX | Tags: Processamento digital de Imagens} pubstate = {published, Visão por computador @phdthesis{Mora:2018:PhD, title = {Robust Approaches for Anomaly Detection Applied to Video Surveillance}, author = {Rensso Victor Hugo Mora Colque}, year = {2018}, date = {2018-08-24}, school = {Federal University of Minas Gerais}, abstract = {Modeling human behavior and activity patterns for detection of anomalous events has attracted significant research interest in recent years, particularly among the video surveillance community. An anomalous event might be characterized by the deviation from the normal or usual, but not necessarily in an undesirable manner. One of the main challenges of detecting such events is the difficulty to create models due to their unpredictability and their dependency on the context of the scene. Anomalous events detection or anomaly recognition for surveillance videos is a very hard problem. Since anomalous events depend on the characteristic or the context of a specific scene. Although many contexts could be similar, the events that can be considered anomalous are also infinity, i.e., cannot be learned beforehand. In this dissertation, we propose three approaches to detect anomalous patterns in surveillance video sequences. In the first approach, we present an approach based on a handcrafted feature descriptor that employs general concepts, such as orientation, velocity, and entropy to build a descriptor for spatiotemporal regions. With this histogram, we can compare them and detect anomalies in video sequences. 
The main advantage of this approach is its simplicity and promising results that will be show in the experimental results, where our descriptors had well performance in famous dataset as UCSD and Subway, reaching comparative results with the estate of the art, specially in UCSD peds2 view. This results show that this model fits well in scenes with crowds. In the second proposal, we develop an approach based on human-object interactions. This approach explores the scene context to determine normal patterns and finally detect whether some video segment contains a possible anomalous event. To validate this approach we proposed a novel dataset which contains anomalies based on the human object interactions, the results are promising, however, this approach must be extended to be robust to more situations and environments. In the third approach, we propose a novel method based on semantic information of people movement. While, most studies focus in information extracted from spatiotemporal regions, our approach detects anomalies based on human trajectory. The results show that our model is suitable to detect anomalies in environments where trajectory of the people could be extracted. The main difference among the proposed approaches is the source to describe the events in the scene. The first method intends to represent the scene from spatiotemporal regions, the second uses the human-object interactions and the third uses the people trajectory. 
Each approach is oriented to certain anomaly types, having advantages and disadvantages according to the inherit limitation of the source and to the subjective of normal and anomaly event definition in a determinate context.}, keywords = {Processamento digital de Imagens} pubstate = {published, Visão por computador}, pubstate = {published}, tppubtype = {phdthesis} } Modeling human behavior and activity patterns for detection of anomalous events has attracted significant research interest in recent years, particularly among the video surveillance community. An anomalous event might be characterized by the deviation from the normal or usual, but not necessarily in an undesirable manner. One of the main challenges of detecting such events is the difficulty to create models due to their unpredictability and their dependency on the context of the scene. Anomalous events detection or anomaly recognition for surveillance videos is a very hard problem. Since anomalous events depend on the characteristic or the context of a specific scene. Although many contexts could be similar, the events that can be considered anomalous are also infinity, i.e., cannot be learned beforehand. In this dissertation, we propose three approaches to detect anomalous patterns in surveillance video sequences. In the first approach, we present an approach based on a handcrafted feature descriptor that employs general concepts, such as orientation, velocity, and entropy to build a descriptor for spatiotemporal regions. With this histogram, we can compare them and detect anomalies in video sequences. The main advantage of this approach is its simplicity and promising results that will be show in the experimental results, where our descriptors had well performance in famous dataset as UCSD and Subway, reaching comparative results with the estate of the art, specially in UCSD peds2 view. This results show that this model fits well in scenes with crowds. 
In the second proposal, we develop an approach based on human-object interactions. This approach explores the scene context to determine normal patterns and finally detect whether some video segment contains a possible anomalous event. To validate this approach, we propose a novel dataset which contains anomalies based on human-object interactions. The results are promising; however, this approach must be extended to be robust to more situations and environments. In the third approach, we propose a novel method based on semantic information of people's movement. While most studies focus on information extracted from spatiotemporal regions, our approach detects anomalies based on human trajectories. The results show that our model is suitable for detecting anomalies in environments where people's trajectories can be extracted. The main difference among the proposed approaches is the source used to describe the events in the scene. The first method represents the scene from spatiotemporal regions, the second uses human-object interactions and the third uses people's trajectories. Each approach is oriented to certain anomaly types, having advantages and disadvantages according to the inherent limitations of the source and to the subjectivity of defining normal and anomalous events in a given context. |
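The first approach's spatiotemporal descriptor can be illustrated with a short sketch. This is not the dissertation's exact formulation: the bin counts, the normalization, and the L1 nearest-neighbor anomaly score are assumptions made for the example; only the ingredients named in the abstract (optical-flow orientation, magnitude/velocity, and an entropy term per spatiotemporal region) come from the text.

```python
import numpy as np

def region_descriptor(flow_x, flow_y, n_orient=8, n_mag=4, mag_max=10.0):
    """Orientation x magnitude histogram of optical flow for one
    spatiotemporal region, plus an entropy term (illustrative sketch)."""
    mag = np.hypot(flow_x, flow_y).ravel()
    ang = np.arctan2(flow_y, flow_x).ravel()          # angles in [-pi, pi]
    o_bin = ((ang + np.pi) / (2 * np.pi) * n_orient).astype(int) % n_orient
    m_bin = np.clip((mag / mag_max * n_mag).astype(int), 0, n_mag - 1)
    hist = np.zeros(n_orient * n_mag)
    np.add.at(hist, o_bin * n_mag + m_bin, 1.0)       # joint 2-D histogram
    p = hist / max(hist.sum(), 1.0)                   # normalize to sum 1
    entropy = -np.sum(p[p > 0] * np.log2(p[p > 0]))   # motion "disorder"
    return np.append(p, entropy)

def anomaly_score(desc, normal_descs):
    # L1 distance to the closest descriptor seen in normal training video
    return min(np.abs(desc - d).sum() for d in normal_descs)
```

In use, descriptors would be collected from normal training footage, and a test region would be flagged when its `anomaly_score` exceeds a threshold.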
de Prates, Raphael Felipe Carvalho; Schwartz, William Robson Kernel multiblock partial least squares for a scalable and multicamera person reidentification system Journal Article Journal of Electronic Imaging, pp. 1-33, 2018. Links | BibTeX | Tags: Kernel Partial Least Squares, Partial Least Squares, Person Re-Identification @article{Prates:2018:JEI, title = {Kernel multiblock partial least squares for a scalable and multicamera person reidentification system}, author = {Raphael Felipe Carvalho de Prates and William Robson Schwartz}, url = {http://smartsenselab.dcc.ufmg.br/wp-content/uploads/2019/02/article.pdf}, year = {2018}, date = {2018-06-25}, journal = {Journal of Electronic Imaging}, pages = {1-33}, keywords = {Kernel Partial Least Squares, Partial Least Squares, Person Re-Identification}, pubstate = {published}, tppubtype = {article} } |
Kloss, Ricardo Barbosa Boosted Projections and Low Cost Transfer Learning Applied to Smart Surveillance Masters Thesis Federal University of Minas Gerais, 2018. Resumo | Links | BibTeX | Tags: Computer vision, Deep Learning, Forensics, Machine Learning, Surveillance @mastersthesis{Kloss2018, title = {Boosted Projections and Low Cost Transfer Learning Applied to Smart Surveillance}, author = {Ricardo Barbosa Kloss}, url = {https://www.dropbox.com/s/dpkkv16zkardxdx/main.pdf?dl=0}, year = {2018}, date = {2018-02-01}, school = {Federal University of Minas Gerais}, abstract = {Computer vision is an important area concerned with understanding the world through images. It can be used for biometrics, by verifying whether a given face belongs to a certain identity, for looking for crime perpetrators in an airport blacklist, for human-machine interaction, and other goals. Since 2012, deep learning methods have become ubiquitous in computer vision, achieving breakthroughs and making it possible for machines, for instance, to perform face verification with human-level skill. This work tackles two computer vision problems and is divided in two parts. In one we explore deep learning methods for the task of face verification, and in the other for the task of dimensionality reduction. Both tasks have great importance in the fields of machine learning and computer vision. We focus on their application in smart surveillance. Dimensionality reduction helps alleviate problems that suffer from very high dimensionality, which can make it hard to learn classifiers. This work presents a novel method for tackling this problem, referred to as Boosted Projection. It relies on the use of several projection models based on Principal Component Analysis or Partial Least Squares to build a more compact and richer data representation. 
Our experimental results demonstrate that the proposed approach outperforms many baselines and provides better results than the original partial least squares dimensionality reduction technique. In the second part of this work, regarding face verification, we explored a simple and cheap technique to extract deep features and reuse a pre-learned model. It is a transfer learning technique that involves no fine-tuning of the model to the new domain. Namely, we explore the correlation of depth and scale in deep models and look for the layer/scale that yields the best results for the new domain; we also explore metrics for the verification task, using locally connected convolutions to learn distance metrics. Our face verification experiments use a model pre-trained on face identification and adapt it to the face verification task with different data, but still in the face domain. We achieve 96.65% mean accuracy on the Labeled Faces in the Wild dataset and 93.12% mean accuracy on the YouTube Faces dataset, which are state-of-the-art results.}, keywords = {Computer vision, Deep Learning, Forensics, Machine Learning, Surveillance}, pubstate = {published}, tppubtype = {mastersthesis} } Computer vision is an important area concerned with understanding the world through images. It can be used for biometrics, by verifying whether a given face belongs to a certain identity, for looking for crime perpetrators in an airport blacklist, for human-machine interaction, and other goals. Since 2012, deep learning methods have become ubiquitous in computer vision, achieving breakthroughs and making it possible for machines, for instance, to perform face verification with human-level skill. This work tackles two computer vision problems and is divided in two parts. In one we explore deep learning methods for the task of face verification, and in the other for the task of dimensionality reduction. Both tasks have great importance in the fields of machine learning and computer vision. 
We focus on their application in smart surveillance. Dimensionality reduction helps alleviate problems that suffer from very high dimensionality, which can make it hard to learn classifiers. This work presents a novel method for tackling this problem, referred to as Boosted Projection. It relies on the use of several projection models based on Principal Component Analysis or Partial Least Squares to build a more compact and richer data representation. Our experimental results demonstrate that the proposed approach outperforms many baselines and provides better results than the original partial least squares dimensionality reduction technique. In the second part of this work, regarding face verification, we explored a simple and cheap technique to extract deep features and reuse a pre-learned model. It is a transfer learning technique that involves no fine-tuning of the model to the new domain. Namely, we explore the correlation of depth and scale in deep models and look for the layer/scale that yields the best results for the new domain; we also explore metrics for the verification task, using locally connected convolutions to learn distance metrics. Our face verification experiments use a model pre-trained on face identification and adapt it to the face verification task with different data, but still in the face domain. We achieve 96.65% mean accuracy on the Labeled Faces in the Wild dataset and 93.12% mean accuracy on the YouTube Faces dataset, which are state-of-the-art results. |
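The Boosted Projection idea can be sketched briefly. This is an illustrative variant, not the thesis' exact algorithm: it assumes PCA stages fitted sequentially on reconstruction residuals, with the per-stage projections concatenated into one compact representation (the thesis also supports Partial Least Squares stages, which need labels).

```python
import numpy as np

def pca_fit(X, k):
    """Top-k principal directions of centered data, via SVD."""
    mu = X.mean(axis=0)
    _, _, vt = np.linalg.svd(X - mu, full_matrices=False)
    return mu, vt[:k].T                       # mean (d,), basis (d, k)

def boosted_projection_fit(X, k=2, stages=3):
    """Fit a sequence of PCA models, each one on the reconstruction
    residual of the previous stage (illustrative boosting scheme)."""
    models, resid = [], X.astype(float)
    for _ in range(stages):
        mu, W = pca_fit(resid, k)
        models.append((mu, W))
        Z = (resid - mu) @ W                  # stage scores
        resid = resid - (mu + Z @ W.T)        # what this stage missed

    return models

def boosted_projection_transform(models, X):
    """Concatenate the stage projections into one representation."""
    feats, resid = [], X.astype(float)
    for mu, W in models:
        Z = (resid - mu) @ W
        feats.append(Z)
        resid = resid - (mu + Z @ W.T)
    return np.hstack(feats)                   # shape (n, k * stages)
```

The concatenated scores form the richer, more compact representation that a downstream classifier would consume.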
Junior, Carlos Antonio Caetano; dos Santos, Jefersson A; Schwartz, William Robson Statistical Measures from Co-occurrence of Codewords for Action Recognition Inproceedings VISAPP 2018 - International Conference on Computer Vision Theory and Applications, pp. 1-8, 2018. Links | BibTeX | Tags: Action Recognition, Activity Recognition, DeepEyes, GigaFrames, Spatiotemporal Features @inproceedings{Caetano:2018:VISAPP, title = {Statistical Measures from Co-occurrence of Codewords for Action Recognition}, author = {Carlos Antonio Caetano Junior and Jefersson A dos Santos and William Robson Schwartz}, url = {http://smartsenselab.dcc.ufmg.br/wp-content/uploads/2019/02/VISAPP_2018_CarlosCaetano.pdf}, year = {2018}, date = {2018-01-27}, booktitle = {VISAPP 2018 - International Conference on Computer Vision Theory and Applications}, pages = {1-8}, keywords = {Action Recognition, Activity Recognition, DeepEyes, GigaFrames, Spatiotemporal Features}, pubstate = {published}, tppubtype = {inproceedings} } |
da Silva, Samira Santos Aggregating Partial Least Squares Models for Open-set Face Identification Masters Thesis Federal University of Minas Gerais, 2018. Resumo | Links | BibTeX | Tags: Face Identification, Open-set Face Recognition, Partial Least Squares @mastersthesis{Silva:2018:MSc, title = {Aggregating Partial Least Squares Models for Open-set Face Identification}, author = {Samira Santos da Silva}, url = {http://smartsenselab.dcc.ufmg.br/wp-content/uploads/2020/04/dissertation_versaocomfichacatalografica.pdf}, year = {2018}, date = {2018-01-16}, school = {Federal University of Minas Gerais}, abstract = {Face identification is an important task in computer vision and has a myriad of applications, such as surveillance, forensics and human-computer interaction. In the past few years, several methods have been proposed to solve the face identification task in closed-set scenarios, that is, methods that assume every probe image necessarily matches a gallery individual. However, in real-world applications, one might want to determine the identity of an unknown face in open-set scenarios. In this work, we propose a novel method to perform open-set face identification by aggregating Partial Least Squares models using the one-against-all protocol in a simple but fast way. The model outputs are combined into a response histogram, which has a highlighted bin if the probe face belongs to a gallery individual and is balanced otherwise. Evaluation is performed on four datasets: FRGCv1, FG-NET, PubFig and PubFig83. Results show significant improvement when compared to state-of-the-art approaches, regardless of the challenges posed by the different datasets.}, keywords = {Face Identification, Open-set Face Recognition, Partial Least Squares}, pubstate = {published}, tppubtype = {mastersthesis} } Face identification is an important task in computer vision and has a myriad of applications, such as surveillance, forensics and human-computer interaction. 
In the past few years, several methods have been proposed to solve the face identification task in closed-set scenarios, that is, methods that assume every probe image necessarily matches a gallery individual. However, in real-world applications, one might want to determine the identity of an unknown face in open-set scenarios. In this work, we propose a novel method to perform open-set face identification by aggregating Partial Least Squares models using the one-against-all protocol in a simple but fast way. The model outputs are combined into a response histogram, which has a highlighted bin if the probe face belongs to a gallery individual and is balanced otherwise. Evaluation is performed on four datasets: FRGCv1, FG-NET, PubFig and PubFig83. Results show significant improvement when compared to state-of-the-art approaches, regardless of the challenges posed by the different datasets. |
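The aggregation scheme in the abstract above can be sketched as follows. Partial Least Squares regression itself is replaced here by a ridge least-squares stand-in to keep the example dependency-free, and the peak-versus-balance decision rule with its threshold is an illustrative assumption, not the dissertation's exact criterion.

```python
import numpy as np

def fit_one_vs_all(X, y, lam=1e-2):
    """One-against-all linear regressors (ridge stand-in for PLS):
    one weight vector per gallery identity, target +1 / -1."""
    ids = np.unique(y)
    d = X.shape[1]
    A = X.T @ X + lam * np.eye(d)             # shared normal matrix
    models = {}
    for i in ids:
        t = np.where(y == i, 1.0, -1.0)
        models[i] = np.linalg.solve(A, X.T @ t)
    return models

def response_histogram(models, x):
    ids = sorted(models)
    return ids, np.array([models[i] @ x for i in ids])

def is_gallery_subject(models, x, ratio=2.0):
    """Open-set rule sketched from the abstract: a gallery probe should
    produce one highlighted bin; an unknown probe, a balanced histogram."""
    _, r = response_histogram(models, x)
    r = r - r.min() + 1e-9                    # shift to positive support
    return r.max() / r.mean() > ratio
```

For a known probe, the model of its own identity dominates the histogram, so the max-to-mean ratio is large; for an unknown face, responses spread out and the ratio stays near 1.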
Laroca, R; Severo, E; Zanlorensi, L A; Oliveira, L S; Gonçalves, G R; Schwartz, W R; Menotti, D A Robust Real-Time Automatic License Plate Recognition Based on the YOLO Detector Inproceedings 2018 International Joint Conference on Neural Networks (IJCNN), pp. 1-10, 2018. @inproceedings{Laroca:2018:IJCNN, title = {A Robust Real-Time Automatic License Plate Recognition Based on the YOLO Detector}, author = {R Laroca and E Severo and L A Zanlorensi and L S Oliveira and G R Gonçalves and W R Schwartz and D Menotti}, url = {http://www.dcc.ufmg.br/~william/papers/paper_2018_IJCNN_Laroca.pdf}, doi = {10.1109/IJCNN.2018.8489629}, year = {2018}, date = {2018-01-01}, booktitle = {2018 International Joint Conference on Neural Networks (IJCNN)}, pages = {1-10}, keywords = {}, pubstate = {published}, tppubtype = {inproceedings} } |
Ferreira, Gabriel Sousa; Padua, Flavio Luis Cardeal; Schwartz, William Robson; Rodrigues, Marco Tulio Alves Nolasco Mapping Sports Interest with Social Network Inproceedings XV Encontro Nacional de Inteligencia Artificial e Computacional (ENIAC), pp. 1-12, 2018. @inproceedings{Ferreira:2018:ENIAC, title = {Mapping Sports Interest with Social Network}, author = {Gabriel Sousa Ferreira and Flavio Luis Cardeal Padua and William Robson Schwartz and Marco Tulio Alves Nolasco Rodrigues}, url = {http://www.dcc.ufmg.br/~william/papers/paper_2018_ENIAC.pdf}, year = {2018}, date = {2018-01-01}, booktitle = {XV Encontro Nacional de Inteligencia Artificial e Computacional (ENIAC)}, pages = {1-12}, keywords = {}, pubstate = {published}, tppubtype = {inproceedings} } |
Kloss, Ricardo Barbosa; Jordao, Artur; Schwartz, William Robson Face Verification Strategies for Employing Deep Models Inproceedings 13th IEEE International Conference on Automatic Face & Gesture Recognition, pp. 258-262, 2018. Links | BibTeX | Tags: Artificial Neural Networks, Face Verification, GigaFrames, Metric Learning, Transfer Learning @inproceedings{Kloss:2018:FG, title = {Face Verification Strategies for Employing Deep Models}, author = {Ricardo Barbosa Kloss and Artur Jordao and William Robson Schwartz}, url = {http://smartsenselab.dcc.ufmg.br/wp-content/uploads/2019/02/Face-Verification-Strategies-for-Employing-Deep-Models.pdf}, year = {2018}, date = {2018-01-01}, booktitle = {13th IEEE International Conference on Automatic Face & Gesture Recognition}, pages = {258-262}, keywords = {Artificial Neural Networks, Face Verification, GigaFrames, Metric Learning, Transfer Learning}, pubstate = {published}, tppubtype = {inproceedings} } |
Jordao, Artur; Junior, Antonio Carlos Nazare; Sena, Jessica; Schwartz, William Robson Human Activity Recognition based on Wearable Sensor Data: A Benchmark Journal Article arXiv, pp. 1-12, 2018. Links | BibTeX | Tags: Activity Recognition Based on Wearable Sensors, Benchmark on Activity Recognition based on Wearable Sensors, Wearable Sensors @article{Jordao:2018:arXiv, title = {Human Activity Recognition based on Wearable Sensor Data: A Benchmark}, author = {Artur Jordao and Antonio Carlos Nazare Junior and Jessica Sena and William Robson Schwartz}, url = {http://smartsenselab.dcc.ufmg.br/wp-content/uploads/2019/03/Human-Activity-Recognition-based-on-Wearable-A-Benchmark.pdf}, year = {2018}, date = {2018-01-01}, journal = {arXiv}, pages = {1-12}, keywords = {Activity Recognition Based on Wearable Sensors, Benchmark on Activity Recognition based on Wearable Sensors, Wearable Sensors}, pubstate = {published}, tppubtype = {article} } |
Jordao, Artur; Kloss, Ricardo Barbosa; Schwartz, William Robson Latent hypernet: Exploring all Layers from Convolutional Neural Networks Inproceedings IEEE International Joint Conference on Neural Networks (IJCNN), pp. 1-7, 2018. Links | BibTeX | Tags: Activity Recognition Based on Wearable Sensors, DeepEyes, GigaFrames, Partial Least Squares, Wearable Sensors @inproceedings{Jordao:2018b:IJCNN, title = {Latent hypernet: Exploring all Layers from Convolutional Neural Networks}, author = {Artur Jordao and Ricardo Barbosa Kloss and William Robson Schwartz}, url = {http://smartsenselab.dcc.ufmg.br/wp-content/uploads/2019/02/Latent-HyperNet-Exploring-the-Layers.pdf}, year = {2018}, date = {2018-01-01}, booktitle = {IEEE International Joint Conference on Neural Networks (IJCNN)}, pages = {1-7}, keywords = {Activity Recognition Based on Wearable Sensors, DeepEyes, GigaFrames, Partial Least Squares, Wearable Sensors}, pubstate = {published}, tppubtype = {inproceedings} } |
Jordao, Artur; Torres, Leonardo Antônio Borges; Schwartz, William Robson Novel Approaches to Human Activity Recognition based on Accelerometer Data Journal Article Signal, Image and Video Processing, 12 (7), pp. 1387–1394, 2018. Links | BibTeX | Tags: Activity Recognition Based on Wearable Sensors, HAR-HEALTH, Wearable Sensors @article{Jordao:2018:SIVP, title = {Novel Approaches to Human Activity Recognition based on Accelerometer Data}, author = {Artur Jordao and Leonardo Antônio Borges Torres and William Robson Schwartz}, url = {http://smartsenselab.dcc.ufmg.br/wp-content/uploads/2019/02/Novel-Approaches-to-Human-Activity-Recognition-based-on.pdf}, year = {2018}, date = {2018-01-01}, journal = {Signal, Image and Video Processing}, volume = {12}, number = {7}, pages = {1387–1394}, keywords = {Activity Recognition Based on Wearable Sensors, HAR-HEALTH, Wearable Sensors}, pubstate = {published}, tppubtype = {article} } |
Colque, Rensso Victor Hugo Mora; Junior, Carlos Antonio Caetano; de Melo, Victor Hugo Cunha; Chavez, Guillermo Camara; Schwartz, William Robson Novel Anomalous Event Detection based on Human-object Interactions Inproceedings VISAPP 2018 - International Conference on Computer Vision Theory and Applications, pp. 1-8, 2018. Links | BibTeX | Tags: Anomalous Event Detection, Contextual Information, DeepEyes, GigaFrames, Human-Object Interaction @inproceedings{Colque:2018:VISAPP, title = {Novel Anomalous Event Detection based on Human-object Interactions}, author = {Rensso Victor Hugo Mora Colque and Carlos Antonio Caetano Junior and Victor Hugo Cunha de Melo and Guillermo Camara Chavez and William Robson Schwartz}, url = {http://smartsenselab.dcc.ufmg.br/wp-content/uploads/2019/02/VISAPP_2018_92.pdf}, year = {2018}, date = {2018-01-01}, booktitle = {VISAPP 2018 - International Conference on Computer Vision Theory and Applications}, pages = {1-8}, keywords = {Anomalous Event Detection, Contextual Information, DeepEyes, GigaFrames, Human-Object Interaction}, pubstate = {published}, tppubtype = {inproceedings} } |
Bastos, Igor Leonardo Oliveira; de Melo, Victor Hugo Cunha; Gonçalves, Gabriel Resende; Schwartz, William Robson MORA: A Generative Approach to Extract Spatiotemporal Information Applied to Gesture Recognition Inproceedings 15th International Conference on Advanced Video and Signal-based Surveillance (AVSS), pp. 1-6, 2018. Links | BibTeX | Tags: Autoencoders, DeepEyes, Gesture Recognition, GigaFrames, Recurrent Models @inproceedings{Bastos:2018:AVSS, title = {MORA: A Generative Approach to Extract Spatiotemporal Information Applied to Gesture Recognition}, author = {Igor Leonardo Oliveira Bastos and Victor Hugo Cunha de Melo and Gabriel Resende Gonçalves and William Robson Schwartz}, url = {http://smartsenselab.dcc.ufmg.br/wp-content/uploads/2019/02/MORA_.pdf}, year = {2018}, date = {2018-01-01}, booktitle = {15th International Conference on Advanced Video and Signal-based Surveillance (AVSS)}, pages = {1-6}, keywords = {Autoencoders, DeepEyes, Gesture Recognition, GigaFrames, Recurrent Models}, pubstate = {published}, tppubtype = {inproceedings} } |
Kloss, Ricardo Barbosa; Jordao, Artur; Schwartz, William Robson Boosted Projection: An Ensemble of Transformation Models Inproceedings 22nd Iberoamerican Congress on Pattern Recognition (CIARP), pp. 331-338, 2018. Links | BibTeX | Tags: Computer vision, DeepEyes, Dimensionality Reduction, Ensemble Partial Least Squares, GigaFrames, Machine Learning @inproceedings{Kloss:2018:CIARP, title = {Boosted Projection: An Ensemble of Transformation Models}, author = {Ricardo Barbosa Kloss and Artur Jordao and William Robson Schwartz}, url = {http://smartsenselab.dcc.ufmg.br/wp-content/uploads/2019/02/Boosted-Projection-An-Ensemble-of-Transformation-Models.pdf}, year = {2018}, date = {2018-01-01}, booktitle = {22nd Iberoamerican Congress on Pattern Recognition (CIARP)}, pages = {331-338}, keywords = {Computer vision, DeepEyes, Dimensionality Reduction, Ensemble Partial Least Squares, GigaFrames, Machine Learning}, pubstate = {published}, tppubtype = {inproceedings} } |
Reis, Renan Oliveira; Dias, Igor Henrique; Schwartz, William Robson Neural network control for active cameras using master-slave setup Inproceedings International Conference on Advanced Video and Signal-based Surveillance (AVSS), pp. 1-6, 2018. Links | BibTeX | Tags: Active Camera, DeepEyes, GigaFrames, Neural network control for active cameras using master-slave setup, SMS @inproceedings{Reis:2018:AVSS, title = {Neural network control for active cameras using master-slave setup}, author = {Renan Oliveira Reis and Igor Henrique Dias and William Robson Schwartz}, url = {http://smartsenselab.dcc.ufmg.br/wp-content/uploads/2019/02/renan_avss_2018-1-1.pdf}, year = {2018}, date = {2018-01-01}, booktitle = {International Conference on Advanced Video and Signal-based Surveillance (AVSS)}, pages = {1-6}, keywords = {Active Camera, DeepEyes, GigaFrames, Neural network control for active cameras using master-slave setup, SMS}, pubstate = {published}, tppubtype = {inproceedings} } |
Junior, Antonio Carlos Nazare; de Costa, Filipe Oliveira; Schwartz, William Robson Content-Based Multi-Camera Video Alignment using Accelerometer Data Inproceedings Advanced Video and Signal Based Surveillance (AVSS), 2018 15th IEEE International Conference on, pp. 1-6, 2018. Links | BibTeX | Tags: Camera Synchronization, DeepEyes, GigaFrames, SensorCap, Sensors, SMS @inproceedings{Nazare:2018:AVSS, title = {Content-Based Multi-Camera Video Alignment using Accelerometer Data}, author = {Antonio Carlos Nazare Junior and Filipe Oliveira de Costa and William Robson Schwartz}, url = {http://smartsenselab.dcc.ufmg.br/wp-content/uploads/2019/02/2018_avss_svsync_camera_ready.pdf}, year = {2018}, date = {2018-01-01}, booktitle = {Advanced Video and Signal Based Surveillance (AVSS), 2018 15th IEEE International Conference on}, pages = {1-6}, keywords = {Camera Synchronization, DeepEyes, GigaFrames, SensorCap, Sensors, SMS}, pubstate = {published}, tppubtype = {inproceedings} } |
Jordao, Artur; Kloss, Ricardo; Yamada, Fernando; Schwartz, William Robson Pruning Deep Neural Networks using Partial Least Squares Journal Article ArXiv e-prints, 2018. Links | BibTeX | Tags: DeepEyes, GigaFrames, Neural Networks Optimization @article{Jordao:2018:arXivb, title = {Pruning Deep Neural Networks using Partial Least Squares}, author = {Artur Jordao and Ricardo Kloss and Fernando Yamada and William Robson Schwartz}, url = {http://smartsenselab.dcc.ufmg.br/wp-content/uploads/2019/03/1810.07610.pdf}, year = {2018}, date = {2018-01-01}, journal = {ArXiv e-prints}, keywords = {DeepEyes, GigaFrames, Neural Networks Optimization}, pubstate = {published}, tppubtype = {article} } |
Prates, Raphael; Schwartz, William Robson Kernel Multiblock Partial Least Squares for a Scalable and Multicamera Person Reidentification System Journal Article Journal of Electronic Imaging, 27 (3), pp. 1-33, 2018. @article{Prates:2018:JEIb, title = {Kernel Multiblock Partial Least Squares for a Scalable and Multicamera Person Reidentification System}, author = {Raphael Prates and William Robson Schwartz}, url = {http://www.dcc.ufmg.br/~william/papers/paper_2018_JEI.pdf}, doi = {10.1117/1.JEI.27.3.033041}, year = {2018}, date = {2018-01-01}, journal = {Journal of Electronic Imaging}, volume = {27}, number = {3}, pages = {1-33}, keywords = {}, pubstate = {published}, tppubtype = {article} } |
Sales, Anderson Luis Cavalcanti; Vareto, Rafael Henrique; Schwartz, William Robson; Chavez, Guillermo Camara Single-Shot Person Re-Identification Combining Similarity Metrics and Support Vectors Inproceedings Conference on Graphic, Patterns and Images (SIBGRAPI), pp. 1-8, 2018. @inproceedings{Sales:2018:SIBGRAPI, title = {Single-Shot Person Re-Identification Combining Similarity Metrics and Support Vectors}, author = {Anderson Luis Cavalcanti Sales and Rafael Henrique Vareto and William Robson Schwartz and Guillermo Camara Chavez}, url = {http://www.dcc.ufmg.br/~william/papers/paper_2018_SIBGRAPI_Sales.pdf}, year = {2018}, date = {2018-01-01}, booktitle = {Conference on Graphic, Patterns and Images (SIBGRAPI)}, pages = {1-8}, keywords = {}, pubstate = {published}, tppubtype = {inproceedings} } |
2017 |
Bastos, Igor Leonardo Oliveira; Soares, Larissa Rocha; Schwartz, William Robson Pyramidal Zernike Over Time: A spatiotemporal feature descriptor based on Zernike Moments Inproceedings Iberoamerican Congress on Pattern Recognition (CIARP 2017), pp. 77-85, 2017. Links | BibTeX | Tags: Activity Recognition, DeepEyes, Feature Extraction, GigaFrames, Zernike Moments @inproceedings{Bastos:2017:CIARP, title = {Pyramidal Zernike Over Time: A spatiotemporal feature descriptor based on Zernike Moments}, author = {Igor Leonardo Oliveira Bastos and Larissa Rocha Soares and William Robson Schwartz}, url = {http://smartsenselab.dcc.ufmg.br/wp-content/uploads/2019/02/PZOT_camera_ready.pdf}, year = {2017}, date = {2017-11-07}, booktitle = {Iberoamerican Congress on Pattern Recognition (CIARP 2017)}, pages = {77-85}, keywords = {Activity Recognition, DeepEyes, Feature Extraction, GigaFrames, Zernike Moments}, pubstate = {published}, tppubtype = {inproceedings} } |
Vareto, Rafael Henrique Face Recognition based on a Collection of Binary Classifiers Masters Thesis Federal University of Minas Gerais, 2017. Resumo | BibTeX | Tags: Artificial Neural Network, Face Verification, Machine Learning, Open-set Face Identification, Partial Least Squares, Support Vector Machine, Surveillance @mastersthesis{Vareto:2017:MSc, title = {Face Recognition based on a Collection of Binary Classifiers}, author = {Rafael Henrique Vareto}, year = {2017}, date = {2017-10-16}, school = {Federal University of Minas Gerais}, abstract = {Face Recognition is one of the most relevant problems in computer vision, considering its importance to areas such as surveillance, forensics and psychology. In fact, a real-world recognition system has to cope with several unseen individuals and determine whether a given face image is associated with a subject registered in a gallery of known individuals, or whether two given faces represent the same identity. In this work, we not only combine hashing functions, embeddings of classifiers and response value histograms to estimate whether probe samples belong to the gallery set, but also extract relational features that model the relation between pairs of faces to determine whether they are from the same person. Both proposed methods are evaluated on five datasets: FRGCv1, LFW, PubFig, PubFig83 and CNN VGGFace. Results are promising and show that our method remains effective for both open-set face identification and verification tasks regardless of the dataset difficulty.}, keywords = {Artificial Neural Network, Face Verification, Machine Learning, Open-set Face Identification, Partial Least Squares, Support Vector Machine, Surveillance}, pubstate = {published}, tppubtype = {mastersthesis} } Face Recognition is one of the most relevant problems in computer vision, considering its importance to areas such as surveillance, forensics and psychology. 
In fact, a real-world recognition system has to cope with several unseen individuals and determine whether a given face image is associated with a subject registered in a gallery of known individuals, or whether two given faces represent the same identity. In this work, we not only combine hashing functions, embeddings of classifiers and response value histograms to estimate whether probe samples belong to the gallery set, but also extract relational features that model the relation between pairs of faces to determine whether they are from the same person. Both proposed methods are evaluated on five datasets: FRGCv1, LFW, PubFig, PubFig83 and CNN VGGFace. Results are promising and show that our method remains effective for both open-set face identification and verification tasks regardless of the dataset difficulty. |
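The combination of hashing functions, classifier embeddings and response histograms described in this thesis abstract can be sketched roughly as follows. The random identity splits, the ridge least-squares stand-in for the PLS/SVM classifiers, and the vote weighting are all assumptions made for illustration, not the thesis' exact formulation.

```python
import numpy as np

def fit_hashing_models(X, y, n_models=30, lam=1e-2, seed=0):
    """Each 'hashing' model learns a random binary split of the gallery
    identities, using a ridge regressor as a stand-in classifier."""
    rng = np.random.default_rng(seed)
    ids = np.unique(y)
    d = X.shape[1]
    A = X.T @ X + lam * np.eye(d)
    models = []
    for _ in range(n_models):
        pos = set(rng.choice(ids, size=len(ids) // 2, replace=False))
        t = np.where(np.isin(y, list(pos)), 1.0, -1.0)
        models.append((pos, np.linalg.solve(A, X.T @ t)))
    return ids, models

def vote_histogram(ids, models, x):
    """Each model's response votes for every identity on its predicted
    side of the split, weighted by the response magnitude."""
    votes = dict.fromkeys(ids, 0.0)
    for pos, w in models:
        r = w @ x
        group = pos if r > 0 else set(ids) - pos
        for i in group:
            votes[i] += abs(r)
    return votes
```

A probe from the gallery collects a vote from every model (its identity is always on the predicted side), while other identities share a side only about half the time, so the correct bin stands out.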
Vareto, Rafael Henrique; da Silva, Samira Santos; de Costa, Filipe Oliveira; Schwartz, William Robson Towards Open-Set Face Recognition using Hashing Functions Inproceedings International Joint Conference on Biometrics (IJCB), pp. 1-8, 2017. Links | BibTeX | Tags: Face Recognition, Open-Set Classification @inproceedings{Vareto:2017:IJCB, title = {Towards Open-Set Face Recognition using Hashing Functions}, author = {Rafael Henrique Vareto and Samira Santos da Silva and Filipe Oliveira de Costa and William Robson Schwartz}, url = {http://smartsenselab.dcc.ufmg.br/wp-content/uploads/2019/02/paper_2017_IJCB.pdf http://lena.dcc.ufmg.br/wordpress/towards-open-set-face-recognition-using-hashing-functions/}, year = {2017}, date = {2017-01-01}, booktitle = {International Joint Conference on Biometrics (IJCB)}, pages = {1-8}, keywords = {Face Recognition, Open-Set Classification}, pubstate = {published}, tppubtype = {inproceedings} } |
Colque, Rensso Victor Hugo Mora; Junior, Carlos Antonio Caetano; de Andrade, Matheus Toledo Lustosa; Schwartz, William Robson Histograms of Optical Flow Orientation and Magnitude and Entropy to Detect Anomalous Events in Videos Journal Article IEEE Transactions on Circuits and Systems for Video Technology, 27 (3), pp. 673-682, 2017. Links | BibTeX | Tags: Anomalous Event Detection, DeepEyes, Feature Extraction, GigaFrames @article{Colque:2016:TCSVT, title = {Histograms of Optical Flow Orientation and Magnitude and Entropy to Detect Anomalous Events in Videos}, author = {Rensso Victor Hugo Mora Colque and Carlos Antonio Caetano Junior and Matheus Toledo Lustosa de Andrade and William Robson Schwartz}, url = {http://dx.doi.org/10.1109/TCSVT.2016.2637778}, year = {2017}, date = {2017-01-01}, journal = {IEEE Transactions on Circuits and Systems for Video Technology}, volume = {27}, number = {3}, pages = {673-682}, keywords = {Anomalous Event Detection, DeepEyes, Feature Extraction, GigaFrames}, pubstate = {published}, tppubtype = {article} } |
Bastos, Igor Leonardo Oliveira; Schwartz, William Robson Assigning Relative Importance to Scene Elements Inproceedings Conference on Graphics, Patterns and Images (SIBGRAPI), pp. 1-8, 2017. Links | BibTeX | Tags: Assigning Relative Importance to Scene Elements, Importance Annotation for VIP and UIUC Pascal Sentence Datasets, Importance Assignment @inproceedings{Bastos:2017:SIBGRAPI, title = {Assigning Relative Importance to Scene Elements}, author = {Igor Leonardo Oliveira Bastos and William Robson Schwartz}, url = {http://smartsenselab.dcc.ufmg.br/wp-content/uploads/2019/02/paper_2017_SIBGRAPI_Bastos.pdf}, year = {2017}, date = {2017-01-01}, booktitle = {Conference on Graphics, Patterns and Images (SIBGRAPI)}, pages = {1-8}, keywords = {Assigning Relative Importance to Scene Elements, Importance Annotation for VIP and UIUC Pascal Sentence Datasets, Importance Assignment}, pubstate = {published}, tppubtype = {inproceedings} } |
Junior, Carlos Antonio Caetano; de Melo, Victor Hugo Cunha; dos Santos, Jefersson Alex; Schwartz, William Robson Activity Recognition based on a Magnitude-Orientation Stream Network Inproceedings Conference on Graphics, Patterns and Images (SIBGRAPI), pp. 1-8, 2017. Links | BibTeX | Tags: Activity Recognition, Deep Learning, GigaFrames @inproceedings{Caetano:2017:SIBGRAPI, title = {Activity Recognition based on a Magnitude-Orientation Stream Network}, author = {Carlos Antonio Caetano Junior and Victor Hugo Cunha de Melo and Jefersson Alex dos Santos and William Robson Schwartz}, url = {http://smartsenselab.dcc.ufmg.br/wp-content/uploads/2019/02/paper_2017_SIBGRAPI_Caetano.pdf}, year = {2017}, date = {2017-01-01}, booktitle = {Conference on Graphics, Patterns and Images (SIBGRAPI)}, pages = {1-8}, keywords = {Activity Recognition, Deep Learning, GigaFrames}, pubstate = {published}, tppubtype = {inproceedings} } |
Vareto, Rafael Henrique; da Silva, Samira Santos; de Costa, Filipe Oliveira; Schwartz, William Robson Face Verification based on Relational Disparity Features and Partial Least Squares Models Inproceedings Conference on Graphics, Patterns and Images (SIBGRAPI), pp. 1-8, 2017. Links | BibTeX | Tags: Face Recognition, Face Verification @inproceedings{Vareto:2017:SIBGRAPI, title = {Face Verification based on Relational Disparity Features and Partial Least Squares Models}, author = {Rafael Henrique Vareto and Samira Santos da Silva and Filipe Oliveira de Costa and William Robson Schwartz}, url = {http://smartsenselab.dcc.ufmg.br/wp-content/uploads/2019/02/paper_2017_SIBGRAPI_Vareto.pdf http://lena.dcc.ufmg.br/wordpress/face-verification-based-relational-disparity-features-partial-least-squares-models/}, year = {2017}, date = {2017-01-01}, booktitle = {Conference on Graphics, Patterns and Images (SIBGRAPI)}, pages = {1-8}, keywords = {Face Recognition, Face Verification}, pubstate = {published}, tppubtype = {inproceedings} } |
2016 |
de Prates, Raphael Felipe Carvalho; Schwartz, William Robson Kernel Hierarchical PCA for Person Re-Identification Inproceedings IAPR International Conference on Pattern Recognition (ICPR), 2016. Abstract | Links | BibTeX | Tags: Kernel Hierarchical PCA, Kernel Partial Least Squares, Partial Least Squares, Person Re-Identification @inproceedings{Prates2016ICPR, title = {Kernel Hierarchical PCA for Person Re-Identification}, author = {Raphael Felipe Carvalho de Prates and William Robson Schwartz}, url = {http://smartsenselab.dcc.ufmg.br/wp-content/uploads/2019/02/Kernel_HPCA_final.pdf}, year = {2016}, date = {2016-12-13}, booktitle = {IAPR International Conference on Pattern Recognition (ICPR)}, abstract = {Person re-identification (Re-ID) maintains a global identity for an individual as he moves through a large area covered by multiple cameras. Re-ID enables multi-camera monitoring of individual activity, which is critical for surveillance systems. However, low-resolution images combined with different poses, illumination conditions and camera viewpoints make person Re-ID a challenging problem. To reach a higher matching performance, state-of-the-art methods map the data to a nonlinear feature space where they learn a cross-view matching function using training data. Kernel PCA is a statistical method that learns a common subspace capturing most of the variability of the samples with a small number of basis vectors. However, Kernel PCA disregards the fact that images were captured by distinct cameras, a critical issue in person Re-ID. Differently, Hierarchical PCA (HPCA) captures a consensus projection between multiblock data (e.g., two camera views), but it is a linear model. Therefore, we propose Kernel Hierarchical PCA (Kernel HPCA) to tackle camera transition and dimensionality reduction in a single framework. To the best of our knowledge, this is the first work to propose a kernel extension of the multiblock HPCA method. Experimental results demonstrate that Kernel HPCA reaches matching performance comparable with state-of-the-art nonlinear subspace learning methods on the PRID450S and VIPeR datasets. Furthermore, Kernel HPCA achieves a better trade-off between matching performance and dimensionality, requiring significantly lower subspace dimensions.}, keywords = {Kernel Hierarchical PCA, Kernel Partial Least Squares, Partial Least Squares, Person Re-Identification}, pubstate = {published}, tppubtype = {inproceedings} } |
Jordao, Artur; Sena, Jessica; Schwartz, William Robson A Late Fusion Approach to Combine Multiple Pedestrian Detectors Inproceedings IAPR International Conference on Pattern Recognition (ICPR), pp. 1-6, 2016. Links | BibTeX | Tags: DeepEyes, Featured Publication, GigaFrames, Pedestrian Detection @inproceedings{Correia:2016:ICPR, title = {A Late Fusion Approach to Combine Multiple Pedestrian Detectors}, author = {Artur Jordao and Jessica Sena and William Robson Schwartz}, url = {http://smartsenselab.dcc.ufmg.br/wp-content/uploads/2019/02/A-Late-Fusion-Approach-to-Combine-Multiple.pdf}, year = {2016}, date = {2016-12-13}, booktitle = {IAPR International Conference on Pattern Recognition (ICPR)}, pages = {1-6}, keywords = {DeepEyes, Featured Publication, GigaFrames, Pedestrian Detection}, pubstate = {published}, tppubtype = {inproceedings} } |
Junior, Carlos Antonio Caetano; dos Santos, Jefersson A; Schwartz, William Robson Optical Flow Co-occurrence Matrices: A Novel Spatiotemporal Feature Descriptor Inproceedings IAPR International Conference on Pattern Recognition (ICPR), pp. 1-6, 2016. Links | BibTeX | Tags: Activity Recognition, Descriptor, Feature Extraction, Featured Publication, OFCM, VER+ @inproceedings{Caetano:2016:ICPR, title = {Optical Flow Co-occurrence Matrices: A Novel Spatiotemporal Feature Descriptor}, author = {Carlos Antonio Caetano Junior and Jefersson A dos Santos and William Robson Schwartz}, url = {http://smartsenselab.dcc.ufmg.br/wp-content/uploads/2019/02/ICPR2016.pdf}, year = {2016}, date = {2016-12-13}, booktitle = {IAPR International Conference on Pattern Recognition (ICPR)}, pages = {1-6}, keywords = {Activity Recognition, Descriptor, Feature Extraction, Featured Publication, OFCM, VER+}, pubstate = {published}, tppubtype = {inproceedings} } |
Gonçalves, Gabriel Resende; Menotti, David; Schwartz, William Robson License Plate Recognition based on Temporal Redundancy Inproceedings IEEE International Conference on Intelligent Transportation Systems (ITSC), pp. 1-5, 2016. Links | BibTeX | Tags: Automatic License Plate Recognition, DeepEyes, GigaFrames @inproceedings{Goncalves:2016:ITSC, title = {License Plate Recognition based on Temporal Redundancy}, author = {Gabriel Resende Gonçalves and David Menotti and William Robson Schwartz}, url = {http://smartsenselab.dcc.ufmg.br/wp-content/uploads/2019/02/paper_2016_ITSC.pdf}, year = {2016}, date = {2016-11-04}, booktitle = {IEEE International Conference on Intelligent Transportation Systems (ITSC)}, pages = {1-5}, keywords = {Automatic License Plate Recognition, DeepEyes, GigaFrames}, pubstate = {published}, tppubtype = {inproceedings} } |
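The entry above exploits temporal redundancy by combining per-frame recognition results. One plausible form of such fusion (a hypothetical per-character majority vote across frames, not necessarily the paper's exact scheme) is:

```python
from collections import Counter

def fuse_plate_readings(readings):
    """Fuse per-frame license-plate OCR strings by per-character majority vote.

    Assumes every frame produced a reading of the same length (same plate layout).
    """
    assert readings and all(len(r) == len(readings[0]) for r in readings)
    fused = []
    for chars in zip(*readings):  # iterate over character positions across frames
        fused.append(Counter(chars).most_common(1)[0][0])
    return "".join(fused)
```

For example, three noisy frame readings `"ABC1234"`, `"A8C1234"`, `"ABC1Z34"` fuse to `"ABC1234"`, since each per-frame error is outvoted at its position.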
Gonçalves, Gabriel Resende; da Silva, Sirlene Pio Gomes; Menotti, David; Schwartz, William Robson Benchmark for License Plate Character Segmentation Journal Article Journal of Electronic Imaging, 25 (5), pp. 1-5, 2016, ISSN: 1017-9909. Links | BibTeX | Tags: Automatic License Plate Recognition, Benchmark, Character Segmentation, DeepEyes, Featured Publication, GigaFrames, Jaccard Coefficient, Novel Dataset, Sense SegPlate @article{2016:JEI:Gabriel, title = {Benchmark for License Plate Character Segmentation}, author = {Gabriel Resende Gonçalves and Sirlene Pio Gomes da Silva and David Menotti and William Robson Schwartz}, url = {http://smartsenselab.dcc.ufmg.br/wp-content/uploads/2019/02/JEI-2016-Benchmark.pdf}, issn = {1017-9909}, year = {2016}, date = {2016-10-24}, journal = {Journal of Electronic Imaging}, volume = {25}, number = {5}, pages = {1-5}, keywords = {Automatic License Plate Recognition, Benchmark, Character Segmentation, DeepEyes, Featured Publication, GigaFrames, Jaccard Coefficient, Novel Dataset, Sense SegPlate}, pubstate = {published}, tppubtype = {article} } |
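The benchmark above is tagged with the Jaccard coefficient, which for two axis-aligned character boxes is the familiar intersection-over-union. A minimal sketch, using `(x1, y1, x2, y2)` corner boxes:

```python
def jaccard(box_a, box_b):
    """Jaccard coefficient (intersection over union) of two (x1, y1, x2, y2) boxes."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    # Overlap extents are clamped at zero when the boxes are disjoint
    iw = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    ih = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = iw * ih
    area_a = (ax2 - ax1) * (ay2 - ay1)
    area_b = (bx2 - bx1) * (by2 - by1)
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0
```

Two 2x2 boxes offset by one unit horizontally overlap in a 1x2 strip, giving 2 / 6 = 1/3.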
Vareto, Rafael Henrique; de Costa, Filipe Oliveira; Schwartz, William Robson Face Identification in Large Galleries Inproceedings Workshop on Face Processing Applications, pp. 1-4, 2016. Links | BibTeX | Tags: DeepEyes, Face Identification, Face Recognition, GigaFrames, VER+ @inproceedings{Vareto:2016:WFPA, title = {Face Identification in Large Galleries}, author = {Rafael Henrique Vareto and Filipe Oliveira de Costa and William Robson Schwartz}, url = {http://smartsenselab.dcc.ufmg.br/wp-content/uploads/2019/02/paper_2016_WFPA.pdf}, year = {2016}, date = {2016-10-02}, booktitle = {Workshop on Face Processing Applications}, pages = {1-4}, keywords = {DeepEyes, Face Identification, Face Recognition, GigaFrames, VER+}, pubstate = {published}, tppubtype = {inproceedings} } |
de Prates, Raphael Felipe Carvalho; Dutra, Cristianne Rodrigues Santos; Schwartz, William Robson Predominant Color Name Indexing Structure for Person Re-Identification Inproceedings IEEE International Conference on Image Processing (ICIP), 2016. Abstract | Links | BibTeX | Tags: GigaFrames, Person Re-Identification, VER+ @inproceedings{Prates2016ICIP, title = {Predominant Color Name Indexing Structure for Person Re-Identification}, author = {Raphael Felipe Carvalho de Prates and Cristianne Rodrigues Santos Dutra and William Robson Schwartz}, url = {http://smartsenselab.dcc.ufmg.br/wp-content/uploads/2019/02/paper_2016_ICIP_Prates.pdf}, year = {2016}, date = {2016-09-25}, booktitle = {IEEE International Conference on Image Processing (ICIP)}, abstract = {The automation of surveillance systems is important to allow real-time analysis of critical events, crime investigation and prevention. A crucial step in surveillance systems is person re-identification (Re-ID), which aims at maintaining the identity of agents in non-overlapping camera networks. Most works in the literature compare a test sample against the entire gallery, restricting scalability. We address this problem employing multiple indexing lists obtained by color name descriptors extracted from part-based models using our proposed Predominant Color Name (PCN) indexing structure. PCN is a flexible indexing structure that relates features to gallery images without the need of labelled training images and can be integrated with existing supervised and unsupervised person Re-ID frameworks. Experimental results demonstrate that the proposed approach outperforms indexation based on unsupervised clustering methods such as k-means and c-means. Furthermore, PCN reduces the computational effort with minimal performance degradation. For instance, when indexing 50% and 75% of the gallery images, we observed a reduction in AUC of 0.01 and 0.08, respectively, when compared to indexing the entire gallery.}, keywords = {GigaFrames, Person Re-Identification, VER+}, pubstate = {published}, tppubtype = {inproceedings} } |
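The PCN structure described above indexes gallery images by their predominant color name so that a probe visits only a fraction of the gallery. A minimal sketch, with hypothetical color-name descriptors (dictionaries of color-name weights) standing in for the real part-based features:

```python
from collections import defaultdict

def build_index(gallery):
    """Build inverted lists keyed by each gallery image's predominant color name.

    gallery: dict mapping image_id -> {color_name: weight}.
    """
    index = defaultdict(list)
    for image_id, colors in gallery.items():
        predominant = max(colors, key=colors.get)
        index[predominant].append(image_id)
    return index

def candidates(index, probe_colors, top_k_colors=2):
    """Retrieve candidates from the lists of the probe's strongest color names only."""
    ranked = sorted(probe_colors, key=probe_colors.get, reverse=True)
    out = []
    for name in ranked[:top_k_colors]:
        out.extend(index.get(name, []))
    return out
```

The retrieved candidate list is then handed to whatever Re-ID matcher is in use; only the indexing shortcut is illustrated here, not the paper's descriptors or matching stage.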
de Prates, Raphael Felipe Carvalho; Oliveira, Marina Santos; Schwartz, William Robson Kernel Partial Least Squares for Person Re-Identification Inproceedings IEEE International Conference on Advanced Video and Signal-Based Surveillance (AVSS), 2016. Abstract | Links | BibTeX | Tags: DeepEyes, Featured Publication, GigaFrames, HAR-HEALTH, Kernel Partial Least Squares, Kernel Partial Least Squares for Person Re-Identification, Person Re-Identification @inproceedings{Prates2016AVSS, title = {Kernel Partial Least Squares for Person Re-Identification}, author = {Raphael Felipe Carvalho de Prates and Marina Santos Oliveira and William Robson Schwartz}, url = {http://smartsenselab.dcc.ufmg.br/wp-content/uploads/2019/02/egpaper_for_DoubleBlindReview.pdf}, year = {2016}, date = {2016-09-25}, booktitle = {IEEE International Conference on Advanced Video and Signal-Based Surveillance (AVSS)}, abstract = {Person re-identification (Re-ID) keeps the same identity for a person as he moves through an area with non-overlapping surveillance cameras. Re-ID is a challenging task due to appearance changes caused by different camera viewpoints, occlusion and illumination conditions. While robust and discriminative descriptors are obtained by combining texture, shape and color features in a high-dimensional representation, achieving both accuracy and efficiency demands dimensionality reduction methods. In this paper, we propose variations of Kernel Partial Least Squares (KPLS) that simultaneously reduce the dimensionality and increase the discriminative power. The Cross-View KPLS (X-KPLS) and KPLS Mode A capture cross-view discriminative information and are successful for both unsupervised and supervised Re-ID. Experimental results demonstrate that X-KPLS presents equal or higher matching results when compared to other methods in the literature on PRID450S.}, keywords = {DeepEyes, Featured Publication, GigaFrames, HAR-HEALTH, Kernel Partial Least Squares, Kernel Partial Least Squares for Person Re-Identification, Person Re-Identification}, pubstate = {published}, tppubtype = {inproceedings} } |
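The cross-view idea behind X-KPLS can be illustrated in a much simplified linear form: learn a mapping from camera-A features to camera-B features on training pairs, then match a probe by nearest neighbor in the mapped space. This ridge-regression sketch is a stand-in for intuition only, not the authors' KPLS:

```python
import numpy as np

def fit_cross_view_map(Xa, Xb, lam=1e-2):
    """Ridge-regularized least-squares map W such that Xa @ W approximates Xb.

    Xa, Xb: (n, d) paired feature matrices from cameras A and B.
    """
    d = Xa.shape[1]
    return np.linalg.solve(Xa.T @ Xa + lam * np.eye(d), Xa.T @ Xb)

def match(W, probe_a, gallery_b):
    """Project a camera-A probe into camera-B space; return the best gallery index."""
    proj = probe_a @ W
    dists = np.linalg.norm(gallery_b - proj, axis=1)
    return int(np.argmin(dists))
```

KPLS additionally works in a kernel-induced nonlinear space and extracts discriminative latent components; this linear map only conveys the cross-view matching-function concept from the abstract.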
2020 |
Carlos Antônio Caetano Júnior Motion-Based Representations for Activity Recognition PhD Thesis Universidade Federal de Minas Gerais, 2020. Abstract | BibTeX | Tags: Activity Recognition, convolutional neural networks (CNNs), optical flow, spatiotemporal information, temporal stream @phdthesis{CarlosCaetano:2020:PhD, title = {Motion-Based Representations for Activity Recognition}, author = {Carlos Antônio Caetano Júnior}, year = {2020}, date = {2020-01-27}, school = {Universidade Federal de Minas Gerais}, abstract = {In this dissertation we propose four different representations based on motion information for activity recognition. The first is a spatiotemporal local feature descriptor that extracts a robust set of statistical measures to describe motion patterns. This descriptor measures meaningful properties of co-occurrence matrices and captures local space-time characteristics of the motion through the neighboring optical flow magnitude and orientation. The second is a compact novel mid-level representation based on co-occurrence matrices of codewords. This representation expresses the distribution of the features at a given offset over feature codewords from a pre-computed codebook and encodes global structures in various local region-based features. The third is a novel temporal stream for two-stream convolutional networks that employs images computed from the optical flow magnitude and orientation to learn the motion in a better and richer manner. The method applies simple non-linear transformations on the vertical and horizontal components of the optical flow to generate input images for the temporal stream. Finally, the fourth is a novel skeleton image representation to be used as input to convolutional neural networks (CNNs). The proposed approach encodes the temporal dynamics by explicitly computing the magnitude and orientation values of the skeleton joints. Moreover, the representation has the advantage of combining the use of reference joints and a tree-structured skeleton, incorporating different spatial relationships between the joints and preserving important spatial relations. The experimental evaluations carried out on challenging well-known activity recognition datasets (KTH, UCF Sports, HMDB51, UCF101, NTU RGB+D 60 and NTU RGB+D 120) demonstrated that the proposed representations achieve better or similar accuracy in comparison to the state of the art, indicating the suitability of our approaches as video representations.}, keywords = {Activity Recognition, convolutional neural networks (CNNs), optical flow, spatiotemporal information, temporal stream}, pubstate = {published}, tppubtype = {phdthesis} } |
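The temporal-stream input described in the abstract turns the horizontal and vertical optical-flow components into magnitude and orientation images. A simple stand-in sketch (the thesis may use different non-linearities; the rescaling to 8-bit CNN inputs is an assumption):

```python
import numpy as np

def magnitude_orientation_images(u, v):
    """Turn horizontal (u) and vertical (v) optical-flow fields into 8-bit images.

    Magnitude is the per-pixel Euclidean norm; orientation is atan2(v, u),
    each rescaled to the [0, 255] range expected as a CNN input channel.
    """
    mag = np.sqrt(u**2 + v**2)
    ori = np.arctan2(v, u)  # angle in (-pi, pi]
    mag_img = 255.0 * mag / max(mag.max(), 1e-12)
    ori_img = 255.0 * (ori + np.pi) / (2.0 * np.pi)
    return mag_img.astype(np.uint8), ori_img.astype(np.uint8)
```

Stacking such per-frame image pairs over a short window gives the temporal-stream input volume analogous to the stacked-flow inputs of classical two-stream networks.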
2019 |
Raphael Felipe Carvalho de Prates Matching People Across Surveillance Cameras PhD Thesis Universidade Federal de Minas Gerais, 2019. Abstract | BibTeX | Tags: Computer vision, Person Re-Identification, Smart Surveillance @phdthesis{RaphaelPrates:2020:PhD, title = {Matching People Across Surveillance Cameras}, author = {Raphael Felipe Carvalho de Prates}, year = {2019}, date = {2019-03-29}, school = {Universidade Federal de Minas Gerais}, abstract = {The number of surveillance camera networks is increasing as a consequence of escalating security concerns. The large amount of data collected demands intelligent surveillance systems to extract information that is useful to security officers. In order to achieve this goal, such a system must be able to correlate information captured by different surveillance cameras. In this scenario, person re-identification is of central importance in establishing a global identity for individuals captured by different cameras using only visual appearance. However, this is a challenging task, since the same person captured by different cameras undergoes drastic changes of appearance as a consequence of variations in viewpoint, illumination and pose. Recent work addresses person re-identification by proposing robust visual descriptors or cross-view matching functions, which are functions that learn to match images from different cameras. However, most of these works are impaired by problems such as ambiguity among individuals, scalability, and the reduced number of labeled images in the training set. In this thesis, we address the problem of matching individuals between cameras in order to tackle the aforementioned problems and, therefore, obtain better results. Specifically, we propose two directions: the learning of subspaces and models of indirect identification. The first learns a common subspace that is scalable with respect to the number of cameras and robust to the amount of training images available. The second matches probe and gallery images indirectly by computing their similarities with training samples. Experimental results validate both approaches on the person re-identification problem, considering both a single pair of cameras and more realistic situations with multiple cameras.}, keywords = {Computer vision, Person Re-Identification, Smart Surveillance}, pubstate = {published}, tppubtype = {phdthesis} } |
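The indirect-identification direction mentioned above, matching probe and gallery images through their similarities with training samples, can be sketched as follows. The RBF similarity and its bandwidth are assumptions for illustration:

```python
import numpy as np

def indirect_descriptor(x, train, gamma=0.5):
    """Describe an image by its vector of RBF similarities to the training samples."""
    d2 = np.sum((train - x)**2, axis=1)
    return np.exp(-gamma * d2)

def indirect_match(probe, gallery, train):
    """Match the probe to the gallery entry with the closest similarity profile.

    Probe and gallery are never compared directly; both are re-expressed
    relative to the shared training set first.
    """
    p = indirect_descriptor(probe, train)
    profiles = np.array([indirect_descriptor(g, train) for g in gallery])
    return int(np.argmin(np.linalg.norm(profiles - p, axis=1)))
```

The appeal of this scheme, as the abstract suggests, is that the training set acts as a camera-independent frame of reference for the comparison.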
Artur Jordao; Ricardo Kloss; Fernando Yamada; William Robson Schwartz Pruning Deep Networks using Partial Least Squares Inproceedings British Machine Vision Conference (BMVC) Workshops, pp. 1-9, 2019. @inproceedings{Jordao:2019:BMVC, title = {Pruning Deep Networks using Partial Least Squares}, author = {Artur Jordao and Ricardo Kloss and Fernando Yamada and William Robson Schwartz}, url = {http://www.dcc.ufmg.br/~william/papers/paper_2019_BMVCW_Jordao.pdf}, year = {2019}, date = {2019-01-01}, booktitle = {British Machine Vision Conference (BMVC) Workshops}, pages = {1-9}, keywords = {}, pubstate = {published}, tppubtype = {inproceedings} } |
Raphael Felipe Carvalho de Prates; William Robson Schwartz Kernel cross-view collaborative representation based classification for person re-identification Journal Article Journal of Visual Communication and Image Representation, 58 (1), pp. 304-315, 2019. Links | BibTeX | Tags: Kernel collaborative representation based classification, Person Re-Identification @article{Prates:2019:JVCI, title = {Kernel cross-view collaborative representation based classification for person re-identification}, author = {Raphael Felipe Carvalho de Prates and William Robson Schwartz}, doi = {https://doi.org/10.1016/j.jvcir.2018.12.003}, year = {2019}, date = {2019-01-01}, journal = {Journal of Visual Communication and Image Representation}, volume = {58}, number = {1}, pages = {304-315}, keywords = {Kernel collaborative representation based classification, Person Re-Identification}, pubstate = {published}, tppubtype = {article} } |
Carlos Caetano; Victor H C de Melo; François Brémond; Jefersson A dos Santos; William Robson Schwartz Magnitude-Orientation Stream network and depth information applied to activity recognition Journal Article Journal of Visual Communication and Image Representation, 63 , pp. 102596, 2019, ISSN: 1047-3203. Abstract | Links | BibTeX | Tags: @article{CAETANO2019102596, title = {Magnitude-Orientation Stream network and depth information applied to activity recognition}, author = {Carlos Caetano and Victor H C de Melo and François Brémond and Jefersson A dos Santos and William Robson Schwartz}, url = {http://www.sciencedirect.com/science/article/pii/S1047320319302172}, doi = {https://doi.org/10.1016/j.jvcir.2019.102596}, issn = {1047-3203}, year = {2019}, date = {2019-01-01}, journal = {Journal of Visual Communication and Image Representation}, volume = {63}, pages = {102596}, abstract = {The temporal component of videos provides an important clue for activity recognition, as a number of activities can be reliably recognized based on motion information. In view of that, this work proposes a novel temporal stream for two-stream convolutional networks based on images computed from the optical flow magnitude and orientation, named Magnitude-Orientation Stream (MOS), to learn the motion in a better and richer manner. Our method applies simple non-linear transformations on the vertical and horizontal components of the optical flow to generate input images for the temporal stream. Moreover, we also employ depth information as a weighting scheme on the magnitude information to compensate for the distance of the subjects performing the activity to the camera. Experimental results, carried out on two well-known datasets (UCF101 and NTU), demonstrate that using our proposed temporal stream as input to existing neural network architectures can improve their performance for activity recognition. Results demonstrate that our temporal stream provides complementary information able to improve the classical two-stream methods, indicating the suitability of our approach to be used as a temporal video representation.}, keywords = {}, pubstate = {published}, tppubtype = {article} } |
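The depth weighting described in the abstract might look like the following sketch: flow magnitude is scaled by per-pixel depth relative to a reference, so far-away subjects, whose apparent motion is smaller, are compensated. The reference-depth normalization is an assumption; the paper's exact scheme may differ:

```python
import numpy as np

def depth_weighted_magnitude(u, v, depth, ref_depth=1.0):
    """Scale per-pixel optical-flow magnitude by depth relative to a reference.

    u, v: horizontal/vertical flow components; depth: per-pixel depth map.
    Subjects farther from the camera produce smaller apparent motion, so their
    magnitudes are multiplied by depth / ref_depth to compensate.
    """
    mag = np.sqrt(u**2 + v**2)
    return mag * (depth / ref_depth)
```

The weighted magnitude would then replace the raw magnitude channel in the MOS input images.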
Gabriel Resende Gonçalves; Matheus Alves Diniz; Rayson Laroca; David Menotti; William Robson Schwartz Multi-Task Learning for Low-Resolution License Plate Recognition Inproceedings Iberoamerican Congress on Pattern Recognition (CIARP), pp. 1-10, 2019. @inproceedings{Goncalves:2019:CIARP, title = {Multi-Task Learning for Low-Resolution License Plate Recognition}, author = {Gabriel Resende Gonçalves and Matheus Alves Diniz and Rayson Laroca and David Menotti and William Robson Schwartz}, url = {http://www.dcc.ufmg.br/~william/papers/paper_2019_CIARP_Goncalves.pdf}, year = {2019}, date = {2019-01-01}, booktitle = {Iberoamerican Congress on Pattern Recognition (CIARP)}, pages = {1-10}, keywords = {}, pubstate = {published}, tppubtype = {inproceedings} } |
Rafael Henrique Vareto; Matheus Alves Diniz; William Robson Schwartz Face Spoofing Detection on Low-Power Devices using Embeddings with Spatial and Frequency-based Descriptors Inproceedings Iberoamerican Congress on Pattern Recognition (CIARP), pp. 1-10, 2019. @inproceedings{Vareto:2019:CIARP, title = {Face Spoofing Detection on Low-Power Devices using Embeddings with Spatial and Frequency-based Descriptors}, author = {Rafael Henrique Vareto and Matheus Alves Diniz and William Robson Schwartz}, url = {http://www.dcc.ufmg.br/~william/papers/paper_2019_CIARP_Vareto.pdf}, year = {2019}, date = {2019-01-01}, booktitle = {Iberoamerican Congress on Pattern Recognition (CIARP)}, pages = {1-10}, keywords = {}, pubstate = {published}, tppubtype = {inproceedings} } |
Vitor Cezar de Lima; William Robson Schwartz Gait Recognition using Pose Estimation and Signal Processing Inproceedings Iberoamerican Congress on Pattern Recognition (CIARP), pp. 1-10, 2019. @inproceedings{Lima:2019:CIARP, title = {Gait Recognition using Pose Estimation and Signal Processing}, author = {Vitor Cezar de Lima and William Robson Schwartz}, url = {http://www.dcc.ufmg.br/~william/papers/paper_2019_CIARP_Vitor.pdf}, year = {2019}, date = {2019-01-01}, booktitle = {Iberoamerican Congress on Pattern Recognition (CIARP)}, pages = {1-10}, keywords = {}, pubstate = {published}, tppubtype = {inproceedings} } |
Igor Bastos; Victor Hugo Cunha de Melo; William Robson Schwartz Multi-Loss Recurrent Residual Networks for Gesture Detection and Recognition Inproceedings Conference on Graphics, Patterns and Images (SIBGRAPI), pp. 1-8, 2019. @inproceedings{Bastos:2019:SIBGRAPIb, title = {Multi-Loss Recurrent Residual Networks for Gesture Detection and Recognition}, author = {Igor Bastos and Victor Hugo Cunha de Melo and William Robson Schwartz}, url = {http://www.dcc.ufmg.br/~william/papers/paper_2019_SIBGRAPI_Bastos.pdf}, year = {2019}, date = {2019-01-01}, booktitle = {Conference on Graphics, Patterns and Images (SIBGRAPI)}, pages = {1-8}, keywords = {}, pubstate = {published}, tppubtype = {inproceedings} } |
G Mendes; Jose Gustavo Paiva; William Robson Schwartz Point-placement techniques and Temporal Self-similarity Maps for Visual Analysis of Surveillance Videos Inproceedings International Conference on Information Visualisation, pp. 1-8, 2019. BibTeX | Tags: @inproceedings{Mendes:2019:ICIV, title = {Point-placement techniques and Temporal Self-similarity Maps for Visual Analysis of Surveillance Videos}, author = {G Mendes and Jose Gustavo Paiva and William Robson Schwartz}, year = {2019}, date = {2019-01-01}, booktitle = {International Conference on Information Visualisation}, pages = {1-8}, keywords = {}, pubstate = {published}, tppubtype = {inproceedings} } |
Carlos Caetano; Francois Bremond; William Robson Schwartz Skeleton Image Representation for 3D Action Recognition based on Tree Structure and Reference Joints Inproceedings Conference on Graphic, Patterns and Images (SIBGRAPI), pp. 1-8, 2019. @inproceedings{Caetano:2019:SIBGRAPIb, title = {Skeleton Image Representation for 3D Action Recognition based on Tree Structure and Reference Joints}, author = {Carlos Caetano and Francois Bremond and William Robson Schwartz}, url = {http://www.dcc.ufmg.br/~william/papers/paper_2019_SIBGRAPI_Caetano.pdf}, year = {2019}, date = {2019-01-01}, booktitle = {Conference on Graphic, Patterns and Images (SIBGRAPI)}, pages = {1-8}, keywords = {}, pubstate = {published}, tppubtype = {inproceedings} } |
Carlos Caetano; Jessica Souza; Francois Bremond; Jefersson Santos; William Robson Schwartz SkeleMotion: A New Representation of Skeleton Joint Sequences based on Motion Information for 3D Action Recognition Inproceedings 16th International Conference on Advanced Video and Signal-based Surveillance (AVSS), pp. 1-6, 2019. @inproceedings{Caetano:2019:AVSSb, title = {SkeleMotion: A New Representation of Skeleton Joint Sequences based on Motion Information for 3D Action Recognition}, author = {Carlos Caetano and Jessica Souza and Francois Bremond and Jefersson Santos and William Robson Schwartz}, url = {http://www.dcc.ufmg.br/~william/papers/paper_2019_AVSS_Caetano.pdf}, year = {2019}, date = {2019-01-01}, booktitle = {16th International Conference on Advanced Video and Signal-based Surveillance (AVSS)}, pages = {1-6}, keywords = {}, pubstate = {published}, tppubtype = {inproceedings} } |
2018 |
Jessica Sena Human Activity Recognition based on Wearable Sensors using Multiscale DCNN Ensemble Masters Thesis Federal University of Minas Gerais, 2018. Abstract | BibTeX | Tags: CNN ensemble, Human Activity Recognition, Multimodal Data, Multiscale temporal data @mastersthesis{Sena:2018:MSc, title = {Human Activity Recognition based on Wearable Sensors using Multiscale DCNN Ensemble}, author = {Jessica Sena}, year = {2018}, date = {2018-10-18}, school = {Federal University of Minas Gerais}, abstract = {Sensor-based Human Activity Recognition (sensor-based HAR) provides valuable knowledge to many areas, such as medicine, the military and security. Recently, wearable devices have gained space as a relevant source of data due to the ease of data capture, the massive number of people who use these devices, and the comfort and convenience of the devices. In addition, the large number of sensors present in these devices provides complementary data, as each sensor provides different information. However, there are two issues: the heterogeneity of the data from multiple sensors and the temporal nature of the sensor data. We believe that mitigating these issues might provide valuable information if we handle the data correctly. To handle the first issue, we propose to process each sensor separately, learning the features of each sensor and performing the classification before fusing with the other sensors. To exploit the second issue, we use an approach that extracts patterns at multiple temporal scales of the data. This is convenient since the data are already a temporal sequence, and the multiple scales extracted provide meaningful information regarding the activities performed by the users. We extract multiple temporal scales using an ensemble of Deep Convolutional Neural Networks (DCNNs). In this ensemble, we use a convolutional kernel with a different height for each DCNN. Considering that the number of rows in the sensor data reflects the data captured over time, each kernel height reflects a temporal scale from which we can extract patterns. Consequently, our approach is able to extract both simple movement patterns, such as a wrist twist when picking up a spoon, and complex movements, such as the human gait. This multimodal and multi-temporal approach outperforms previous state-of-the-art works on seven important datasets using two different protocols. We also demonstrate that the use of our proposed set of kernels improves sensor-based HAR in another multi-kernel approach, the widely employed Inception network.}, keywords = {CNN ensemble, Human Activity Recognition, Multimodal Data, Multiscale temporal data}, pubstate = {published}, tppubtype = {mastersthesis} } Sensor-based Human Activity Recognition (sensor-based HAR) provides valuable knowledge to many areas, such as medicine, the military and security. Recently, wearable devices have gained space as a relevant source of data due to the ease of data capture, the massive number of people who use these devices, and the comfort and convenience of the devices. In addition, the large number of sensors present in these devices provides complementary data, as each sensor provides different information. However, there are two issues: the heterogeneity of the data from multiple sensors and the temporal nature of the sensor data. We believe that mitigating these issues might provide valuable information if we handle the data correctly. To handle the first issue, we propose to process each sensor separately, learning the features of each sensor and performing the classification before fusing with the other sensors. To exploit the second issue, we use an approach that extracts patterns at multiple temporal scales of the data. This is convenient since the data are already a temporal sequence, and the multiple scales extracted provide meaningful information regarding the activities performed by the users. We extract multiple temporal scales using an ensemble of Deep Convolutional Neural Networks (DCNNs). In this ensemble, we use a convolutional kernel with a different height for each DCNN. Considering that the number of rows in the sensor data reflects the data captured over time, each kernel height reflects a temporal scale from which we can extract patterns. Consequently, our approach is able to extract both simple movement patterns, such as a wrist twist when picking up a spoon, and complex movements, such as the human gait. This multimodal and multi-temporal approach outperforms previous state-of-the-art works on seven important datasets using two different protocols. We also demonstrate that the use of our proposed set of kernels improves sensor-based HAR in another multi-kernel approach, the widely employed Inception network. |
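The core idea of the abstract above — one convolutional kernel height per ensemble branch, each height spanning a different number of time steps of the sensor window — can be sketched minimally. This is a toy illustration under stated assumptions: the real branches are trained DCNNs, whereas here the kernel weights are random and the function name is illustrative, not from the thesis.

```python
import numpy as np

def multiscale_features(window, kernel_heights=(3, 5, 7)):
    """Toy sketch of the multiscale-kernel idea: each branch convolves the
    sensor window (time steps x channels) with a kernel spanning h time
    steps and max-pools over time, so each kernel height captures motion
    patterns at a different temporal scale."""
    rng = np.random.default_rng(0)
    t, c = window.shape
    feats = []
    for h in kernel_heights:
        kernel = rng.standard_normal((h, c))            # kernel covering h time steps
        responses = np.array([np.sum(window[i:i + h] * kernel)
                              for i in range(t - h + 1)])  # valid temporal convolution
        feats.append(responses.max())                   # temporal max-pooling
    return np.array(feats)

# A fake 32-step, 3-axis accelerometer window.
window = np.tile(np.sin(np.linspace(0, 6, 32))[:, None], (1, 3))
scales = multiscale_features(window)                    # one feature per kernel height
```

Short kernels would respond to brief movements (the wrist twist of the abstract), while taller kernels integrate longer patterns such as gait cycles.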
Victor Hugo Cunha de Melo; Jesimon Barreto Santos; Carlos Antonio Caetano Junior; Jessica Sena; Otavio A B Penatti; William Robson Schwartz Object-based Temporal Segment Relational Network for Activity Recognition Inproceedings Conference on Graphic, Patterns and Images (SIBGRAPI), pp. 1-8, 2018. BibTeX | Tags: Activity Recognition, DeepEyes, HAR-HEALTH, Relational Reasoning, Spatial Pyramid @inproceedings{DeMelo:2018:SIBGRAPI, title = {Object-based Temporal Segment Relational Network for Activity Recognition}, author = {Victor Hugo Cunha de Melo and Jesimon Barreto Santos and Carlos Antonio Caetano Junior and Jessica Sena and Otavio A B Penatti and William Robson Schwartz}, year = {2018}, date = {2018-09-21}, booktitle = {Conference on Graphic, Patterns and Images (SIBGRAPI)}, pages = {1-8}, keywords = {Activity Recognition, DeepEyes, HAR-HEALTH, Relational Reasoning, Spatial Pyramid}, pubstate = {published}, tppubtype = {inproceedings} } |
Jessica Sena; Jesimon Barreto Santos; William Robson Schwartz Multiscale DCNN Ensemble Applied to Human Activity Recognition Based on Wearable Sensors Inproceedings 26th European Signal Processing Conference (EUSIPCO 2018), pp. 1-5, 2018. Links | BibTeX | Tags: Activity Recognition Based on Wearable Sensors, Deep Learning, DeepEyes, HAR-HEALTH, Human Activity Recognition, Multimodal Data, Wearable Sensors @inproceedings{Sena:2018:EUSIPCO, title = {Multiscale DCNN Ensemble Applied to Human Activity Recognition Based on Wearable Sensors}, author = {Jessica Sena and Jesimon Barreto Santos and William Robson Schwartz}, url = {http://smartsenselab.dcc.ufmg.br/wp-content/uploads/2019/02/PID5428933.pdf}, year = {2018}, date = {2018-09-06}, booktitle = {26th European Signal Processing Conference (EUSIPCO 2018)}, pages = {1-5}, keywords = {Activity Recognition Based on Wearable Sensors, Deep Learning, DeepEyes, HAR-HEALTH, Human Activity Recognition, Multimodal Data, Wearable Sensors}, pubstate = {published}, tppubtype = {inproceedings} } |
Gabriel Resende Gonçalves; Matheus Alves Diniz; Rayson Laroca; David Menotti; William Robson Schwartz Real-time Automatic License Plate Recognition Through Deep Multi-Task Networks Inproceedings Conference on Graphic, Patterns and Images (SIBGRAPI), pp. 1-8, 2018. Links | BibTeX | Tags: Automatic License Plate Recognition, Deep Learning, DeepEyes, GigaFrames, Multi-Task Learning, Sense-ALPR @inproceedings{Goncalves:2018:SIBGRAPI, title = {Real-time Automatic License Plate Recognition Through Deep Multi-Task Networks}, author = {Gabriel Resende Gonçalves and Matheus Alves Diniz and Rayson Laroca and David Menotti and William Robson Schwartz}, url = {http://smartsenselab.dcc.ufmg.br/wp-content/uploads/2019/02/paper.pdf}, year = {2018}, date = {2018-09-04}, booktitle = {Conference on Graphic, Patterns and Images (SIBGRAPI)}, pages = {1-8}, keywords = {Automatic License Plate Recognition, Deep Learning, DeepEyes, GigaFrames, Multi-Task Learning, Sense-ALPR}, pubstate = {published}, tppubtype = {inproceedings} } |
Rensso Victor Hugo Mora Colque Robust Approaches for Anomaly Detection Applied to Video Surveillance PhD Thesis Federal University of Minas Gerais, 2018. Abstract | BibTeX | Tags: Digital Image Processing, Computer Vision @phdthesis{Mora:2018:PhD, title = {Robust Approaches for Anomaly Detection Applied to Video Surveillance}, author = {Rensso Victor Hugo Mora Colque}, year = {2018}, date = {2018-08-24}, school = {Federal University of Minas Gerais}, abstract = {Modeling human behavior and activity patterns for the detection of anomalous events has attracted significant research interest in recent years, particularly among the video surveillance community. An anomalous event might be characterized by its deviation from the normal or usual, but not necessarily in an undesirable manner. One of the main challenges of detecting such events is the difficulty of creating models due to their unpredictability and their dependency on the context of the scene. Anomalous event detection for surveillance videos is a very hard problem, since anomalous events depend on the characteristics and context of a specific scene. Although many contexts may be similar, the events that can be considered anomalous are infinite, i.e., they cannot all be learned beforehand. In this dissertation, we propose three approaches to detect anomalous patterns in surveillance video sequences. In the first approach, we present a handcrafted feature descriptor that employs general concepts, such as orientation, velocity, and entropy, to describe spatiotemporal regions. With this histogram-based descriptor, we can compare regions and detect anomalies in video sequences. The main advantages of this approach are its simplicity and the promising results shown in the experimental evaluation, where our descriptor performed well on well-known datasets such as UCSD and Subway, reaching results comparable to the state of the art, especially on the UCSD Peds2 view. These results show that this model fits well in crowded scenes. In the second approach, we develop a method based on human-object interactions. This approach explores the scene context to determine normal patterns and, finally, to detect whether a video segment contains a possible anomalous event. To validate this approach we propose a novel dataset that contains anomalies based on human-object interactions; the results are promising, although the approach must be extended to be robust to more situations and environments. In the third approach, we propose a novel method based on semantic information about people's movement. While most studies focus on information extracted from spatiotemporal regions, our approach detects anomalies based on human trajectories. The results show that our model is suitable for detecting anomalies in environments where the trajectories of people can be extracted. The main difference among the proposed approaches is the source used to describe the events in the scene: the first method represents the scene through spatiotemporal regions, the second uses human-object interactions, and the third uses people's trajectories. Each approach is oriented toward certain anomaly types, with advantages and disadvantages according to the inherent limitations of its source and to the subjectivity of defining normal and anomalous events in a given context.}, keywords = {Digital Image Processing, Computer Vision}, pubstate = {published}, tppubtype = {phdthesis} } Modeling human behavior and activity patterns for the detection of anomalous events has attracted significant research interest in recent years, particularly among the video surveillance community. An anomalous event might be characterized by its deviation from the normal or usual, but not necessarily in an undesirable manner. One of the main challenges of detecting such events is the difficulty of creating models due to their unpredictability and their dependency on the context of the scene. Anomalous event detection for surveillance videos is a very hard problem, since anomalous events depend on the characteristics and context of a specific scene. Although many contexts may be similar, the events that can be considered anomalous are infinite, i.e., they cannot all be learned beforehand. In this dissertation, we propose three approaches to detect anomalous patterns in surveillance video sequences. In the first approach, we present a handcrafted feature descriptor that employs general concepts, such as orientation, velocity, and entropy, to describe spatiotemporal regions. With this histogram-based descriptor, we can compare regions and detect anomalies in video sequences. The main advantages of this approach are its simplicity and the promising results shown in the experimental evaluation, where our descriptor performed well on well-known datasets such as UCSD and Subway, reaching results comparable to the state of the art, especially on the UCSD Peds2 view. These results show that this model fits well in crowded scenes. In the second approach, we develop a method based on human-object interactions. This approach explores the scene context to determine normal patterns and, finally, to detect whether a video segment contains a possible anomalous event. To validate this approach we propose a novel dataset that contains anomalies based on human-object interactions; the results are promising, although the approach must be extended to be robust to more situations and environments. In the third approach, we propose a novel method based on semantic information about people's movement. While most studies focus on information extracted from spatiotemporal regions, our approach detects anomalies based on human trajectories. The results show that our model is suitable for detecting anomalies in environments where the trajectories of people can be extracted. The main difference among the proposed approaches is the source used to describe the events in the scene: the first method represents the scene through spatiotemporal regions, the second uses human-object interactions, and the third uses people's trajectories. Each approach is oriented toward certain anomaly types, with advantages and disadvantages according to the inherent limitations of its source and to the subjectivity of defining normal and anomalous events in a given context. |
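The first approach described in the abstract above builds histogram descriptors of spatiotemporal regions from concepts such as orientation, velocity and entropy. A minimal sketch of that kind of descriptor follows; the binning scheme, bin counts and function name are illustrative assumptions, not the exact formulation of the thesis.

```python
import numpy as np

def region_descriptor(flow, n_orient=8, n_mag=4):
    """Hedged sketch: for one spatiotemporal region, bin optical-flow
    vectors jointly by orientation and magnitude (a proxy for velocity),
    normalize, and append the histogram's entropy as a measure of how
    disordered the motion in the region is."""
    dx, dy = flow[..., 0].ravel(), flow[..., 1].ravel()
    mag = np.hypot(dx, dy)                              # flow magnitude
    ang = np.mod(np.arctan2(dy, dx), 2 * np.pi)         # flow orientation in [0, 2pi)
    o_bin = np.minimum((ang / (2 * np.pi) * n_orient).astype(int), n_orient - 1)
    m_bin = np.minimum((mag / (mag.max() + 1e-9) * n_mag).astype(int), n_mag - 1)
    hist = np.zeros(n_orient * n_mag)
    np.add.at(hist, o_bin * n_mag + m_bin, 1.0)         # joint orientation-magnitude histogram
    p = hist / hist.sum()
    nz = p[p > 0]
    entropy = -np.sum(nz * np.log2(nz))                 # uncertainty of the motion pattern
    return np.append(p, entropy)

flow = np.random.default_rng(1).standard_normal((16, 16, 2))  # toy 16x16 flow field
d = region_descriptor(flow)
```

Regions whose descriptors are far from those observed during training would then be flagged as candidate anomalies.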
Raphael Felipe Carvalho de Prates; William Robson Schwartz Kernel multiblock partial least squares for a scalable and multicamera person reidentification system Journal Article Journal of Electronic Imaging, pp. 1-33, 2018. Links | BibTeX | Tags: Kernel Partial Least Squares, Partial Least Squares, Person Re-Identification @article{Prates:2018:JEI, title = {Kernel multiblock partial least squares for a scalable and multicamera person reidentification system}, author = {Raphael Felipe Carvalho de Prates and William Robson Schwartz}, url = {http://smartsenselab.dcc.ufmg.br/wp-content/uploads/2019/02/article.pdf}, year = {2018}, date = {2018-06-25}, journal = {Journal of Electronic Imaging}, pages = {1-33}, keywords = {Kernel Partial Least Squares, Partial Least Squares, Person Re-Identification}, pubstate = {published}, tppubtype = {article} } |
Ricardo Barbosa Kloss Boosted Projections and Low Cost Transfer Learning Applied to Smart Surveillance Masters Thesis Federal University of Minas Gerais, 2018. Abstract | Links | BibTeX | Tags: Computer vision, Deep Learning, Forensics, Machine Learning, Surveillance @mastersthesis{Kloss2018, title = {Boosted Projections and Low Cost Transfer Learning Applied to Smart Surveillance}, author = {Ricardo Barbosa Kloss}, url = {https://www.dropbox.com/s/dpkkv16zkardxdx/main.pdf?dl=0}, year = {2018}, date = {2018-02-01}, school = {Federal University of Minas Gerais}, abstract = {Computer vision is an important area concerned with understanding the world through images. It can be used for biometrics, by verifying whether a given face belongs to a certain identity; to look for crime perpetrators on an airport blacklist; in human-machine interaction; and for other goals. Since 2012, deep learning methods have become ubiquitous in computer vision, achieving breakthroughs and making it possible for machines, for instance, to perform face verification with human-level skill. This work tackles two computer vision problems and is divided into two parts. In one we explore deep learning methods for the task of face verification, and in the other the task of dimensionality reduction. Both tasks are of great importance in the fields of machine learning and computer vision. We focus on their application in smart surveillance. Dimensionality reduction helps alleviate problems that usually suffer from very high dimensionality, which can make it hard to learn classifiers. This work presents a novel method for tackling this problem, referred to as Boosted Projection. It relies on the use of several projection models, based on Principal Component Analysis or Partial Least Squares, to build a more compact and richer data representation. Our experimental results demonstrate that the proposed approach outperforms many baselines and provides better results than the original partial least squares dimensionality reduction technique. In the second part of this work, regarding face verification, we explore a simple and cheap technique to extract deep features and reuse a pre-learned model. The technique is a form of transfer learning that involves no fine-tuning of the model to the new domain. Namely, we explore the correlation of depth and scale in deep models and look for the layer/scale that yields the best results for the new domain; we also explore metrics for the verification task, using locally connected convolutions to learn distance metrics. Our face verification experiments use a model pre-trained on face identification and adapt it to the face verification task with different data, but still in the face domain. We achieve 96.65% mean accuracy on the Labeled Faces in the Wild dataset and 93.12% mean accuracy on the YouTube Faces dataset, which are state-of-the-art results.}, keywords = {Computer vision, Deep Learning, Forensics, Machine Learning, Surveillance}, pubstate = {published}, tppubtype = {mastersthesis} } Computer vision is an important area concerned with understanding the world through images. It can be used for biometrics, by verifying whether a given face belongs to a certain identity; to look for crime perpetrators on an airport blacklist; in human-machine interaction; and for other goals. Since 2012, deep learning methods have become ubiquitous in computer vision, achieving breakthroughs and making it possible for machines, for instance, to perform face verification with human-level skill. This work tackles two computer vision problems and is divided into two parts. In one we explore deep learning methods for the task of face verification, and in the other the task of dimensionality reduction. Both tasks are of great importance in the fields of machine learning and computer vision. We focus on their application in smart surveillance. Dimensionality reduction helps alleviate problems that usually suffer from very high dimensionality, which can make it hard to learn classifiers. This work presents a novel method for tackling this problem, referred to as Boosted Projection. It relies on the use of several projection models, based on Principal Component Analysis or Partial Least Squares, to build a more compact and richer data representation. Our experimental results demonstrate that the proposed approach outperforms many baselines and provides better results than the original partial least squares dimensionality reduction technique. In the second part of this work, regarding face verification, we explore a simple and cheap technique to extract deep features and reuse a pre-learned model. The technique is a form of transfer learning that involves no fine-tuning of the model to the new domain. Namely, we explore the correlation of depth and scale in deep models and look for the layer/scale that yields the best results for the new domain; we also explore metrics for the verification task, using locally connected convolutions to learn distance metrics. Our face verification experiments use a model pre-trained on face identification and adapt it to the face verification task with different data, but still in the face domain. We achieve 96.65% mean accuracy on the Labeled Faces in the Wild dataset and 93.12% mean accuracy on the YouTube Faces dataset, which are state-of-the-art results. |
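The Boosted Projection idea above — several PCA- or PLS-based projection models combined into one richer representation — can be sketched in numpy. The residual-fitting scheme below (each model fits what the previous projections left unexplained) is an assumption made for illustration, as is the PCA-only choice; the thesis also allows PLS-based projections.

```python
import numpy as np

def boosted_projection(X, n_models=3, n_comp=2):
    """Minimal sketch: fit a sequence of PCA projections, each on the
    reconstruction residual of the previous one, and concatenate all
    projected features into a single compact representation."""
    feats = []
    resid = X - X.mean(axis=0)                 # centred data for the first model
    for _ in range(n_models):
        _, _, vt = np.linalg.svd(resid, full_matrices=False)  # PCA via SVD of the residual
        w = vt[:n_comp].T                      # top projection directions
        proj = resid @ w
        feats.append(proj)
        resid = resid - proj @ w.T             # remove what this model explained
    return np.hstack(feats)

X = np.random.default_rng(2).standard_normal((50, 10))  # toy 10-dimensional features
Z = boosted_projection(X)                               # 50 samples x (3 models * 2 comps)
```

The concatenated output then replaces the raw high-dimensional features when training a downstream classifier.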
Carlos Antonio Caetano Junior; Jefersson A dos Santos; William Robson Schwartz Statistical Measures from Co-occurrence of Codewords for Action Recognition Inproceedings VISAPP 2018 - International Conference on Computer Vision Theory and Applications, pp. 1-8, 2018. Links | BibTeX | Tags: Action Recognition, Activity Recognition, DeepEyes, GigaFrames, Spatiotemporal Features @inproceedings{Caetano:2018:VISAPP, title = {Statistical Measures from Co-occurrence of Codewords for Action Recognition}, author = {Carlos Antonio Caetano Junior and Jefersson A dos Santos and William Robson Schwartz}, url = {http://smartsenselab.dcc.ufmg.br/wp-content/uploads/2019/02/VISAPP_2018_CarlosCaetano.pdf}, year = {2018}, date = {2018-01-27}, booktitle = {VISAPP 2018 - International Conference on Computer Vision Theory and Applications}, pages = {1-8}, keywords = {Action Recognition, Activity Recognition, DeepEyes, GigaFrames, Spatiotemporal Features}, pubstate = {published}, tppubtype = {inproceedings} } |
Samira Santos da Silva Aggregating Partial Least Squares Models for Open-set Face Identification Masters Thesis Federal University of Minas Gerais, 2018. Abstract | Links | BibTeX | Tags: Face Identification, Open-set Face Recognition, Partial Least Squares @mastersthesis{Silva:2018:MSc, title = {Aggregating Partial Least Squares Models for Open-set Face Identification}, author = {Samira Santos da Silva}, url = {http://smartsenselab.dcc.ufmg.br/wp-content/uploads/2020/04/dissertation_versaocomfichacatalografica.pdf}, year = {2018}, date = {2018-01-16}, school = {Federal University of Minas Gerais}, abstract = {Face identification is an important task in computer vision and has a myriad of applications, such as surveillance, forensics and human-computer interaction. In the past few years, several methods have been proposed to solve the face identification task in closed-set scenarios, that is, methods that assume every probe image necessarily matches a gallery individual. However, in real-world applications, one might want to determine the identity of an unknown face in open-set scenarios. In this work, we propose a novel method to perform open-set face identification by aggregating Partial Least Squares models using the one-against-all protocol in a simple but fast way. The model outputs are combined into a response histogram, which is balanced if the probe face belongs to a gallery individual or has a highlighted bin otherwise. Evaluation is performed on four datasets: FRGCv1, FG-NET, Pubfig and Pubfig83. Results show significant improvements when compared to state-of-the-art approaches, regardless of the challenges posed by the different datasets.}, keywords = {Face Identification, Open-set Face Recognition, Partial Least Squares}, pubstate = {published}, tppubtype = {mastersthesis} } Face identification is an important task in computer vision and has a myriad of applications, such as surveillance, forensics and human-computer interaction. In the past few years, several methods have been proposed to solve the face identification task in closed-set scenarios, that is, methods that assume every probe image necessarily matches a gallery individual. However, in real-world applications, one might want to determine the identity of an unknown face in open-set scenarios. In this work, we propose a novel method to perform open-set face identification by aggregating Partial Least Squares models using the one-against-all protocol in a simple but fast way. The model outputs are combined into a response histogram, which is balanced if the probe face belongs to a gallery individual or has a highlighted bin otherwise. Evaluation is performed on four datasets: FRGCv1, FG-NET, Pubfig and Pubfig83. Results show significant improvements when compared to state-of-the-art approaches, regardless of the challenges posed by the different datasets. |
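The one-against-all aggregation described in the abstract above can be sketched as follows. The thesis uses Partial Least Squares regressors; here plain least-squares regression stands in so the sketch stays numpy-only, and the function names and open-set decision details are illustrative assumptions.

```python
import numpy as np

def fit_one_vs_all(X, y):
    """Hedged sketch: one regression model per gallery subject, trained
    one-against-all (+1 for that subject's samples, -1 for everyone else).
    Ordinary least squares stands in for the PLS models of the thesis."""
    Xb = np.hstack([X, np.ones((X.shape[0], 1))])      # add bias column
    models = {}
    for subj in np.unique(y):
        target = np.where(y == subj, 1.0, -1.0)
        w, *_ = np.linalg.lstsq(Xb, target, rcond=None)
        models[subj] = w
    return models

def response_histogram(models, probe):
    """One bin per gallery model; the shape of this histogram (one bin
    standing out vs. a balanced profile) drives the open-set decision."""
    pb = np.append(probe, 1.0)
    return np.array([pb @ w for w in models.values()])

rng = np.random.default_rng(3)
means = rng.standard_normal((3, 20)) * 3               # well-separated gallery subjects
X = np.repeat(means, 20, axis=0) + rng.standard_normal((60, 20))
y = np.repeat(np.arange(3), 20)
models = fit_one_vs_all(X, y)
hist = response_histogram(models, X[0])                # probe from subject 0
```

For a probe from the gallery, the model of the matching subject responds much more strongly than the others, which is the signal the aggregation exploits.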
R Laroca; E Severo; L A Zanlorensi; L S Oliveira; G R Gonçalves; W R Schwartz; D Menotti A Robust Real-Time Automatic License Plate Recognition Based on the YOLO Detector Inproceedings 2018 International Joint Conference on Neural Networks (IJCNN), pp. 1-10, 2018. @inproceedings{Laroca:2018:IJCNN, title = {A Robust Real-Time Automatic License Plate Recognition Based on the YOLO Detector}, author = {R Laroca and E Severo and L A Zanlorensi and L S Oliveira and G R Gonçalves and W R Schwartz and D Menotti}, url = {http://www.dcc.ufmg.br/~william/papers/paper_2018_IJCNN_Laroca.pdf}, doi = {10.1109/IJCNN.2018.8489629}, year = {2018}, date = {2018-01-01}, booktitle = {2018 International Joint Conference on Neural Networks (IJCNN)}, pages = {1-10}, keywords = {}, pubstate = {published}, tppubtype = {inproceedings} } |
Gabriel Sousa Ferreira; Flavio Luis Cardeal Padua; William Robson Schwartz; Marco Tulio Alves Nolasco Rodrigues Mapping Sports Interest with Social Network Inproceedings XV Encontro Nacional de Inteligencia Artificial e Computacional (ENIAC), pp. 1-12, 2018. @inproceedings{Ferreira:2018:ENIAC, title = {Mapping Sports Interest with Social Network}, author = {Gabriel Sousa Ferreira and Flavio Luis Cardeal Padua and William Robson Schwartz and Marco Tulio Alves Nolasco Rodrigues}, url = {http://www.dcc.ufmg.br/~william/papers/paper_2018_ENIAC.pdf}, year = {2018}, date = {2018-01-01}, booktitle = {XV Encontro Nacional de Inteligencia Artificial e Computacional (ENIAC)}, pages = {1-12}, keywords = {}, pubstate = {published}, tppubtype = {inproceedings} } |
Ricardo Barbosa Kloss; Artur Jordao; William Robson Schwartz Face Verification Strategies for Employing Deep Models Inproceedings 13th IEEE International Conference on Automatic Face & Gesture Recognition, pp. 258-262, 2018. Links | BibTeX | Tags: Artificial Neural Networks, Face Verification, GigaFrames, Metric Learning, Transfer Learning @inproceedings{Kloss:2018:FG, title = {Face Verification Strategies for Employing Deep Models}, author = {Ricardo Barbosa Kloss and Artur Jordao and William Robson Schwartz}, url = {http://smartsenselab.dcc.ufmg.br/wp-content/uploads/2019/02/Face-Verification-Strategies-for-Employing-Deep-Models.pdf}, year = {2018}, date = {2018-01-01}, booktitle = {13th IEEE International Conference on Automatic Face & Gesture Recognition}, pages = {258-262}, keywords = {Artificial Neural Networks, Face Verification, GigaFrames, Metric Learning, Transfer Learning}, pubstate = {published}, tppubtype = {inproceedings} } |
Artur Jordao; Antonio Carlos Nazare Junior; Jessica Sena; William Robson Schwartz Human Activity Recognition based on Wearable Sensor Data: A Benchmark Journal Article arXiv, pp. 1-12, 2018. Links | BibTeX | Tags: Activity Recognition Based on Wearable Sensors, Benchmark on Activity Recognition based on Wearable Sensors, Wearable Sensors @article{Jordao:2018:arXiv, title = {Human Activity Recognition based on Wearable Sensor Data: A Benchmark}, author = {Artur Jordao and Antonio Carlos Nazare Junior and Jessica Sena and William Robson Schwartz}, url = {http://smartsenselab.dcc.ufmg.br/wp-content/uploads/2019/03/Human-Activity-Recognition-based-on-Wearable-A-Benchmark.pdf}, year = {2018}, date = {2018-01-01}, journal = {arXiv}, pages = {1-12}, keywords = {Activity Recognition Based on Wearable Sensors, Benchmark on Activity Recognition based on Wearable Sensors, Wearable Sensors}, pubstate = {published}, tppubtype = {article} } |
Artur Jordao; Ricardo Barbosa Kloss; William Robson Schwartz Latent hypernet: Exploring all Layers from Convolutional Neural Networks Inproceedings IEEE International Joint Conference on Neural Networks (IJCNN), pp. 1-7, 2018. Links | BibTeX | Tags: Activity Recognition Based on Wearable Sensors, DeepEyes, GigaFrames, Partial Least Squares, Wearable Sensors @inproceedings{Jordao:2018b:IJCNN, title = {Latent hypernet: Exploring all Layers from Convolutional Neural Networks}, author = {Artur Jordao and Ricardo Barbosa Kloss and William Robson Schwartz}, url = {http://smartsenselab.dcc.ufmg.br/wp-content/uploads/2019/02/Latent-HyperNet-Exploring-the-Layers.pdf}, year = {2018}, date = {2018-01-01}, booktitle = {IEEE International Joint Conference on Neural Networks (IJCNN)}, pages = {1-7}, keywords = {Activity Recognition Based on Wearable Sensors, DeepEyes, GigaFrames, Partial Least Squares, Wearable Sensors}, pubstate = {published}, tppubtype = {inproceedings} } |
Artur Jordao; Leonardo Antônio Borges Torres; William Robson Schwartz Novel Approaches to Human Activity Recognition based on Accelerometer Data Journal Article Signal, Image and Video Processing, 12 (7), pp. 1387–1394, 2018. Links | BibTeX | Tags: Activity Recognition Based on Wearable Sensors, HAR-HEALTH, Wearable Sensors @article{Jordao:2018:SIVP, title = {Novel Approaches to Human Activity Recognition based on Accelerometer Data}, author = {Artur Jordao and Leonardo Antônio Borges Torres and William Robson Schwartz}, url = {http://smartsenselab.dcc.ufmg.br/wp-content/uploads/2019/02/Novel-Approaches-to-Human-Activity-Recognition-based-on.pdf}, year = {2018}, date = {2018-01-01}, journal = {Signal, Image and Video Processing}, volume = {12}, number = {7}, pages = {1387–1394}, keywords = {Activity Recognition Based on Wearable Sensors, HAR-HEALTH, Wearable Sensors}, pubstate = {published}, tppubtype = {article} } |
Rensso Victor Hugo Mora Colque; Carlos Antonio Caetano Junior; Victor Hugo Cunha de Melo; Guillermo Camara Chavez; William Robson Schwartz Novel Anomalous Event Detection based on Human-object Interactions Inproceedings VISAPP 2018 - International Conference on Computer Vision Theory and Applications, pp. 1-8, 2018. Links | BibTeX | Tags: Anomalous Event Detection, Contextual Information, DeepEyes, GigaFrames, Human-Object Interaction @inproceedings{Colque:2018:VISAPP, title = {Novel Anomalous Event Detection based on Human-object Interactions}, author = {Rensso Victor Hugo Mora Colque and Carlos Antonio Caetano Junior and Victor Hugo Cunha de Melo and Guillermo Camara Chavez and William Robson Schwartz}, url = {http://smartsenselab.dcc.ufmg.br/wp-content/uploads/2019/02/VISAPP_2018_92.pdf}, year = {2018}, date = {2018-01-01}, booktitle = {VISAPP 2018 - International Conference on Computer Vision Theory and Applications}, pages = {1-8}, keywords = {Anomalous Event Detection, Contextual Information, DeepEyes, GigaFrames, Human-Object Interaction}, pubstate = {published}, tppubtype = {inproceedings} } |
@inproceedings{Bastos:2018:AVSS,
  title     = {MORA: A Generative Approach to Extract Spatiotemporal Information Applied to Gesture Recognition},
  author    = {Igor Leonardo Oliveira Bastos and Victor Hugo Cunha de Melo and Gabriel Resende Gonçalves and William Robson Schwartz},
  url       = {http://smartsenselab.dcc.ufmg.br/wp-content/uploads/2019/02/MORA_.pdf},
  year      = {2018},
  date      = {2018-01-01},
  booktitle = {15th International Conference on Advanced Video and Signal-based Surveillance (AVSS)},
  pages     = {1-6},
  keywords  = {Autoencoders, DeepEyes, Gesture Recognition, GigaFrames, Recurrent Models},
}
@inproceedings{Kloss:2018:CIARP,
  title     = {Boosted Projection: An Ensemble of Transformation Models},
  author    = {Ricardo Barbosa Kloss and Artur Jordao and William Robson Schwartz},
  url       = {http://smartsenselab.dcc.ufmg.br/wp-content/uploads/2019/02/Boosted-Projection-An-Ensemble-of-Transformation-Models.pdf},
  year      = {2018},
  date      = {2018-01-01},
  booktitle = {22nd Iberoamerican Congress on Pattern Recognition (CIARP)},
  pages     = {331-338},
  keywords  = {Computer vision, DeepEyes, Dimensionality Reduction, Ensemble Partial Least Squares, GigaFrames, Machine Learning},
}
@inproceedings{Reis:2018:AVSS,
  title     = {Neural network control for active cameras using master-slave setup},
  author    = {Renan Oliveira Reis and Igor Henrique Dias and William Robson Schwartz},
  url       = {http://smartsenselab.dcc.ufmg.br/wp-content/uploads/2019/02/renan_avss_2018-1-1.pdf},
  year      = {2018},
  date      = {2018-01-01},
  booktitle = {International Conference on Advanced Video and Signal-based Surveillance (AVSS)},
  pages     = {1-6},
  keywords  = {Active Camera, DeepEyes, GigaFrames, Neural network control for active cameras using master-slave setup, SMS},
}
@inproceedings{Nazare:2018:AVSS,
  title     = {Content-Based Multi-Camera Video Alignment using Accelerometer Data},
  author    = {Antonio Carlos Nazare Junior and Filipe Oliveira de Costa and William Robson Schwartz},
  url       = {http://smartsenselab.dcc.ufmg.br/wp-content/uploads/2019/02/2018_avss_svsync_camera_ready.pdf},
  year      = {2018},
  date      = {2018-01-01},
  booktitle = {15th IEEE International Conference on Advanced Video and Signal Based Surveillance (AVSS)},
  pages     = {1-6},
  keywords  = {Camera Synchronization, DeepEyes, GigaFrames, SensorCap, Sensors, SMS},
}
@article{Jordao:2018:arXivb,
  title    = {Pruning Deep Neural Networks using Partial Least Squares},
  author   = {Artur Jordao and Ricardo Kloss and Fernando Yamada and William Robson Schwartz},
  url      = {http://smartsenselab.dcc.ufmg.br/wp-content/uploads/2019/03/1810.07610.pdf},
  year     = {2018},
  date     = {2018-01-01},
  journal  = {ArXiv e-prints},
  keywords = {DeepEyes, GigaFrames, Neural Networks Optimization},
}
@article{Prates:2018:JEIb,
  title   = {Kernel Multiblock Partial Least Squares for a Scalable and Multicamera Person Reidentification System},
  author  = {Raphael Prates and William Robson Schwartz},
  url     = {http://www.dcc.ufmg.br/~william/papers/paper_2018_JEI.pdf},
  doi     = {10.1117/1.JEI.27.3.033041},
  year    = {2018},
  date    = {2018-01-01},
  journal = {Journal of Electronic Imaging},
  volume  = {27},
  number  = {3},
  pages   = {1-33},
}
@inproceedings{Sales:2018:SIBGRAPI,
  title     = {Single-Shot Person Re-Identification Combining Similarity Metrics and Support Vectors},
  author    = {Anderson Luis Cavalcanti Sales and Rafael Henrique Vareto and William Robson Schwartz and Guillermo Camara Chavez},
  url       = {http://www.dcc.ufmg.br/~william/papers/paper_2018_SIBGRAPI_Sales.pdf},
  year      = {2018},
  date      = {2018-01-01},
  booktitle = {Conference on Graphics, Patterns and Images (SIBGRAPI)},
  pages     = {1-8},
}
% 2017
@inproceedings{Bastos:2017:CIARP,
  title     = {Pyramidal Zernike Over Time: A spatiotemporal feature descriptor based on Zernike Moments},
  author    = {Igor Leonardo Oliveira Bastos and Larissa Rocha Soares and William Robson Schwartz},
  url       = {http://smartsenselab.dcc.ufmg.br/wp-content/uploads/2019/02/PZOT_camera_ready.pdf},
  year      = {2017},
  date      = {2017-11-07},
  booktitle = {Iberoamerican Congress on Pattern Recognition (CIARP 2017)},
  pages     = {77-85},
  keywords  = {Activity Recognition, DeepEyes, Feature Extraction, GigaFrames, Zernike Moments},
}
@mastersthesis{Vareto:2017:MSc,
  title    = {Face Recognition based on a Collection of Binary Classifiers},
  author   = {Rafael Henrique Vareto},
  year     = {2017},
  date     = {2017-10-16},
  school   = {Federal University of Minas Gerais},
  abstract = {Face Recognition is one of the most relevant problems in computer vision as we consider its importance to areas such as surveillance, forensics and psychology. In fact, a real-world recognition system has to cope with several unseen individuals and determine either if a given face image is associated with a subject registered in a gallery of known individuals or if two given faces represent equivalent identities. In this work, not only we combine hashing functions, embedding of classifiers and response value histograms to estimate when probe samples belong to the gallery set, but we also extract relational features to model the relation between pair of faces to determine whether they are from the same person. Both proposed methods are evaluated on five datasets: FRGCv1, LFW, PubFig, PubFig83 and CNN VGGFace. Results are promising and show that our method continues effective for both open-set face identification and verification tasks regardless of the dataset difficulty.},
  keywords = {Artificial Neural Network, Face Verification, Machine Learning, Open-set Face Identification, Partial Least Squares, Support Vector Machine, Surveillance},
}
@inproceedings{Vareto:2017:IJCB,
  title     = {Towards Open-Set Face Recognition using Hashing Functions},
  author    = {Rafael Henrique Vareto and Samira Santos da Silva and Filipe Oliveira de Costa and William Robson Schwartz},
  url       = {http://smartsenselab.dcc.ufmg.br/wp-content/uploads/2019/02/paper_2017_IJCB.pdf http://lena.dcc.ufmg.br/wordpress/towards-open-set-face-recognition-using-hashing-functions/},
  year      = {2017},
  date      = {2017-01-01},
  booktitle = {International Joint Conference on Biometrics (IJCB)},
  pages     = {1-8},
  keywords  = {Face Recognition, Open-Set Classification},
}
@article{Colque:2016:TCSVT,
  title    = {Histograms of Optical Flow Orientation and Magnitude and Entropy to Detect Anomalous Events in Videos},
  author   = {Rensso Victor Hugo Mora Colque and Carlos Antonio Caetano Junior and Matheus Toledo Lustosa de Andrade and William Robson Schwartz},
  url      = {http://dx.doi.org/10.1109/TCSVT.2016.2637778},
  year     = {2017},
  date     = {2017-01-01},
  journal  = {IEEE Transactions on Circuits and Systems for Video Technology},
  volume   = {27},
  number   = {3},
  pages    = {673-682},
  keywords = {Anomalous Event Detection, DeepEyes, Feature Extraction, GigaFrames},
}
@inproceedings{Bastos:2017:SIBGRAPI,
  title     = {Assigning Relative Importance to Scene Elements},
  author    = {Igor Leonardo Oliveira Bastos and William Robson Schwartz},
  url       = {http://smartsenselab.dcc.ufmg.br/wp-content/uploads/2019/02/paper_2017_SIBGRAPI_Bastos.pdf},
  year      = {2017},
  date      = {2017-01-01},
  booktitle = {Conference on Graphics, Patterns and Images (SIBGRAPI)},
  pages     = {1-8},
  keywords  = {Assigning Relative Importance to Scene Elements, Importance Annotation for VIP and UIUC Pascal Sentence Datasets, Importance Assignment},
}
@inproceedings{Caetano:2017:SIBGRAPI,
  title     = {Activity Recognition based on a Magnitude-Orientation Stream Network},
  author    = {Carlos Antonio Caetano Junior and Victor Hugo Cunha de Melo and Jefersson Alex dos Santos and William Robson Schwartz},
  url       = {http://smartsenselab.dcc.ufmg.br/wp-content/uploads/2019/02/paper_2017_SIBGRAPI_Caetano.pdf},
  year      = {2017},
  date      = {2017-01-01},
  booktitle = {Conference on Graphics, Patterns and Images (SIBGRAPI)},
  pages     = {1-8},
  keywords  = {Activity Recognition, Deep Learning, GigaFrames},
}
@inproceedings{Vareto:2017:SIBGRAPI,
  title     = {Face Verification based on Relational Disparity Features and Partial Least Squares Models},
  author    = {Rafael Henrique Vareto and Samira Santos da Silva and Filipe Oliveira de Costa and William Robson Schwartz},
  url       = {http://smartsenselab.dcc.ufmg.br/wp-content/uploads/2019/02/paper_2017_SIBGRAPI_Vareto.pdf http://lena.dcc.ufmg.br/wordpress/face-verification-based-relational-disparity-features-partial-least-squares-models/},
  year      = {2017},
  date      = {2017-01-01},
  booktitle = {Conference on Graphics, Patterns and Images (SIBGRAPI)},
  pages     = {1-8},
  keywords  = {Face Recognition, Face Verification},
}
% 2016
@inproceedings{Prates2016ICPR,
  title     = {Kernel Hierarchical PCA for Person Re-Identification},
  author    = {Raphael Felipe Carvalho de Prates and William Robson Schwartz},
  url       = {http://smartsenselab.dcc.ufmg.br/wp-content/uploads/2019/02/Kernel_HPCA_final.pdf},
  year      = {2016},
  date      = {2016-12-13},
  booktitle = {IAPR International Conference on Pattern Recognition (ICPR)},
  abstract  = {Person re-identification (Re-ID) maintains a global identity for an individual while he moves along a large area covered by multiple cameras. Re-ID enables a multi-camera monitoring of individual activity that is critical for surveillance systems. However, the low-resolution images combined with the different poses, illumination conditions and camera viewpoints make person Re-ID a challenging problem. To reach a higher matching performance, state-of-the-art methods map the data to a nonlinear feature space where they learn a cross-view matching function using training data. Kernel PCA is a statistical method that learns a common subspace that captures most of the variability of samples using a small number of vector basis. However, Kernel PCA disregards that images were captured by distinct cameras, a critical problem in person ReID. Differently, Hierarchical PCA (HPCA) captures a consensus projection between multiblock data (e.g., two camera views), but it is a linear model. Therefore, we propose the Kernel Hierarchical PCA (Kernel HPCA) to tackle camera transition and dimensionality reduction in a unique framework. To the best of our knowledge, this is the first work to propose a kernel extension to the multiblock HPCA method. Experimental results demonstrate that Kernel HPCA reaches a matching performance comparable with state-of-the-art nonlinear subspace learning methods at PRID450S and VIPeR datasets. Furthermore, Kernel HPCA reaches a better combination of subspace learning and dimensionality requiring significantly lower subspace dimensions.},
  keywords  = {Kernel Hierarchical PCA, Kernel Partial Least Squares, Partial Least Squares, Person Re-Identification},
}
@inproceedings{Correia:2016:ICPR,
  title     = {A Late Fusion Approach to Combine Multiple Pedestrian Detectors},
  author    = {Artur Jordao and Jessica Sena and William Robson Schwartz},
  url       = {http://smartsenselab.dcc.ufmg.br/wp-content/uploads/2019/02/A-Late-Fusion-Approach-to-Combine-Multiple.pdf},
  year      = {2016},
  date      = {2016-12-13},
  booktitle = {IAPR International Conference on Pattern Recognition (ICPR)},
  pages     = {1-6},
  keywords  = {DeepEyes, Featured Publication, GigaFrames, Pedestrian Detection},
}
@inproceedings{Caetano:2016:ICPR,
  title     = {Optical Flow Co-occurrence Matrices: A Novel Spatiotemporal Feature Descriptor},
  author    = {Carlos Antonio Caetano Junior and Jefersson A dos Santos and William Robson Schwartz},
  url       = {http://smartsenselab.dcc.ufmg.br/wp-content/uploads/2019/02/ICPR2016.pdf},
  year      = {2016},
  date      = {2016-12-13},
  booktitle = {IAPR International Conference on Pattern Recognition (ICPR)},
  pages     = {1-6},
  keywords  = {Activity Recognition, Descriptor, Feature Extraction, Featured Publication, OFCM, VER+},
}
@inproceedings{Goncalves:2016:ITSC,
  title     = {License Plate Recognition based on Temporal Redundancy},
  author    = {Gabriel Resende Gonçalves and David Menotti and William Robson Schwartz},
  url       = {http://smartsenselab.dcc.ufmg.br/wp-content/uploads/2019/02/paper_2016_ITSC.pdf},
  year      = {2016},
  date      = {2016-11-04},
  booktitle = {IEEE International Conference on Intelligent Transportation Systems (ITSC)},
  pages     = {1-5},
  keywords  = {Automatic License Plate Recognition, DeepEyes, GigaFrames},
}
@article{2016:JEI:Gabriel,
  title    = {Benchmark for License Plate Character Segmentation},
  author   = {Gabriel Resende Gonçalves and Sirlene Pio Gomes da Silva and David Menotti and William Robson Schwartz},
  url      = {http://smartsenselab.dcc.ufmg.br/wp-content/uploads/2019/02/JEI-2016-Benchmark.pdf},
  issn     = {1017-9909},
  year     = {2016},
  date     = {2016-10-24},
  journal  = {Journal of Electronic Imaging},
  volume   = {25},
  number   = {5},
  pages    = {1-5},
  keywords = {Automatic License Plate Recognition, Benchmark, Character Segmentation, DeepEyes, Featured Publication, GigaFrames, Jaccard Coefficient, Novel Dataset, Sense SegPlate},
}
@inproceedings{Vareto:2016:WFPA,
  title     = {Face Identification in Large Galleries},
  author    = {Rafael Henrique Vareto and Filipe Oliveira de Costa and William Robson Schwartz},
  url       = {http://smartsenselab.dcc.ufmg.br/wp-content/uploads/2019/02/paper_2016_WFPA.pdf},
  year      = {2016},
  date      = {2016-10-02},
  booktitle = {Workshop on Face Processing Applications},
  pages     = {1-4},
  keywords  = {DeepEyes, Face Identification, Face Recognition, GigaFrames, VER+},
}
@inproceedings{Prates2016ICIP,
  title     = {Predominant Color Name Indexing Structure for Person Re-Identification},
  author    = {Raphael Felipe Carvalho de Prates and Cristianne Rodrigues Santos Dutra and William Robson Schwartz},
  url       = {http://smartsenselab.dcc.ufmg.br/wp-content/uploads/2019/02/paper_2016_ICIP_Prates.pdf},
  year      = {2016},
  date      = {2016-09-25},
  booktitle = {IEEE International Conference on Image Processing (ICIP)},
  abstract  = {The automation of surveillance systems is important to allow real-time analysis of critical events, crime investigation and prevention. A crucial step in the surveillance systems is the person re-identification (Re-ID) which aims at maintaining the identity of agents in non-overlapping camera networks. Most of the works in literature compare a test sample against the entire gallery, restricting the scalability. We address this problem employing multiple indexing lists obtained by color name descriptors extracted from part-based models using our proposed Predominant Color Name (PCN) indexing structure. PCN is a flexible indexing structure that relates features to gallery images without the need of labelled training images and can be integrated with existing supervised and unsupervised person Re-ID frameworks. Experimental results demonstrate that the proposed approach outperforms indexation based on unsupervised clustering methods such as k-means and c-means. Furthermore, PCN reduces the computational efforts with a minimum performance degradation. For instance, when indexing 50% and 75% of the gallery images, we observed a reduction in AUC curve of 0.01 and 0.08, respectively, when compared to indexing the entire gallery.},
  keywords  = {GigaFrames, Person Re-Identification, VER+},
}
@inproceedings{Prates2016AVSS,
  title     = {Kernel Partial Least Squares for Person Re-Identification},
  author    = {Raphael Felipe Carvalho de Prates and Marina Santos Oliveira and William Robson Schwartz},
  url       = {http://smartsenselab.dcc.ufmg.br/wp-content/uploads/2019/02/egpaper_for_DoubleBlindReview.pdf},
  year      = {2016},
  date      = {2016-09-25},
  booktitle = {IEEE International Conference on Advanced Video and Signal-Based Surveillance (AVSS)},
  abstract  = {Person re-identification (Re-ID) keeps the same identity for a person as he moves along an area with non-overlapping surveillance cameras. Re-ID is a challenging task due to appearance changes caused by different camera viewpoints, occlusion and illumination conditions. While robust and discriminative descriptors are obtained combining texture, shape and color features in a high-dimensional representation, the achievement of accuracy and efficiency demands dimensionality reduction methods. In this paper, we propose variations of Kernel Partial Least Squares (KPLS) that simultaneously reduce the dimensionality and increase the discriminative power. The Cross-View KPLS (X-KPLS) and KPLS Mode A capture cross-view discriminative information and are successful for unsupervised and supervised Re-ID. Experimental results demonstrate that X-KPLS presents equal or higher matching results when compared to other methods in literature at PRID450S.},
  keywords  = {DeepEyes, Featured Publication, GigaFrames, HAR-HEALTH, Kernel Partial Least Squares, Kernel Partial Least Squares for Person Re-Identification, Person Re-Identification},
}