Smart Surveillance
Computer vision problems applied to visual surveillance have been studied for several years with the aim of finding accurate and efficient solutions, which are required to run surveillance systems in real environments. The main goal of such systems is to analyze the scene, focusing on the detection and recognition of suspicious activities performed by humans, so that security personnel can pay closer attention to these preselected activities. To accomplish that, several problems have to be solved first, for instance background subtraction, person detection, tracking and re-identification, face recognition, and action recognition. Even though each of these problems has been researched over the past decades, they are rarely considered in sequence; each one is usually solved individually. However, in real surveillance scenarios, the aforementioned problems have to be solved in sequence considering only videos as the input.
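To make the idea of solving these problems in sequence from video alone more concrete, the following is a minimal sketch in which each problem is a stage that reads and extends a shared per-frame state. The names `SceneState`, `run_pipeline`, and the two placeholder stages are assumptions made for illustration only; the thresholding and bounding-box logic are toy stand-ins, not the methods used in the publications listed below.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Iterable, List

# A grayscale frame as rows of pixel intensities (hypothetical toy format).
Frame = List[List[int]]


@dataclass
class SceneState:
    """Results accumulated for one frame as it moves through the stages."""
    frame: Frame
    results: Dict[str, object] = field(default_factory=dict)


Stage = Callable[[SceneState], None]


def background_subtraction(state: SceneState) -> None:
    # Toy stand-in: mark pixels brighter than a fixed threshold as foreground.
    state.results["foreground"] = [
        [1 if pixel > 128 else 0 for pixel in row] for row in state.frame
    ]


def person_detection(state: SceneState) -> None:
    # Toy stand-in: one bounding box around all foreground pixels.
    # Depends on the output of the previous stage, as in a real sequence.
    mask = state.results["foreground"]
    coords = [(r, c) for r, row in enumerate(mask) for c, v in enumerate(row) if v]
    if coords:
        rows = [r for r, _ in coords]
        cols = [c for _, c in coords]
        state.results["detections"] = [(min(rows), min(cols), max(rows), max(cols))]
    else:
        state.results["detections"] = []


def run_pipeline(frames: Iterable[Frame], stages: List[Stage]) -> List[SceneState]:
    """Run every stage, in order, on every frame; video is the only input."""
    outputs = []
    for frame in frames:
        state = SceneState(frame)
        for stage in stages:
            stage(state)
        outputs.append(state)
    return outputs


video = [[[0, 200], [0, 0]]]  # a single 2x2 frame
states = run_pipeline(video, [background_subtraction, person_detection])
print(states[0].results["detections"])
```

Further stages (tracking, re-identification, face and action recognition) would be appended to the list in the same way, each consuming what earlier stages stored in `results`.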
Related Publications
Raphael Felipe Carvalho de Prates Matching People Across Surveillance Cameras PhD Thesis Universidade Federal de Minas Gerais, 2019. @phdthesis{RaphaelPrates:2020:PhD, title = {Matching People Across Surveillance Cameras}, author = {Raphael Felipe Carvalho de Prates}, year = {2019}, date = {2019-03-29}, school = {Universidade Federal de Minas Gerais}, abstract = {The number of surveillance camera networks is increasing as a consequence of escalating security concerns. The large amount of data collected demands intelligent surveillance systems to extract information that is useful to security officers. In order to achieve this goal, such a system must be able to correlate information captured by different surveillance cameras. In this scenario, re-identification of people is of central importance in establishing a global identity for individuals captured by different cameras using only visual appearance. However, this is a challenging task, since the same person, when captured by different cameras, undergoes a drastic change of appearance as a consequence of variations in point of view, illumination and pose. Recent work addresses person re-identification by proposing robust visual descriptors or cross-view matching functions, which are functions that learn to match images from different cameras. However, most of these works are impaired by problems such as ambiguity among individuals, scalability, and a reduced number of labeled images in the training set. In this thesis, we address the problem of matching individuals between cameras in order to tackle the aforementioned problems and, therefore, obtain better results. Specifically, we propose two directions: the learning of subspaces and the models of indirect identification. The first learns a common subspace that is scalable with respect to the number of cameras and robust in relation to the amount of training images available. 
In the second, we match probe and gallery images indirectly by computing their similarities with training samples. Experimental results validate both approaches in the person re-identification problem, considering both a single pair of cameras and more realistic situations with multiple cameras.}, keywords = {}, pubstate = {published}, tppubtype = {phdthesis} } |
Antonio Carlos Nazare Junior; William Robson Schwartz A scalable and flexible framework for smart video surveillance Journal Article In: Computer Vision and Image Understanding, 144 (C), pp. 258–275, 2016. @article{Nazare:2016:CVIU, title = {A scalable and flexible framework for smart video surveillance}, author = {Antonio Carlos Nazare Junior and William Robson Schwartz}, url = {http://smartsenselab.dcc.ufmg.br/wp-content/uploads/2019/02/paper_2016_CVIU.pdf}, year = {2016}, date = {2016-01-01}, journal = {Computer Vision and Image Understanding}, volume = {144}, number = {C}, pages = {258--275}, keywords = {}, pubstate = {published}, tppubtype = {article} } |
Antonio Carlos Nazare Junior A Scalable and Versatile Framework for Smart Video Surveillance Masters Thesis Federal University of Minas Gerais, 2014. @mastersthesis{Nazare:2014:MSc, title = {A Scalable and Versatile Framework for Smart Video Surveillance}, author = {Antonio Carlos Nazare Junior}, url = {http://smartsenselab.dcc.ufmg.br/wp-content/uploads/2019/03/dissertation_2014_Antonio-1.pdf}, year = {2014}, date = {2014-09-05}, school = {Federal University of Minas Gerais}, abstract = {The availability of surveillance cameras placed in public locations has increased vastly in recent years, providing a safe environment for people at the cost of a huge amount of visual data collected. Such data are mostly processed manually, a task which is labor intensive and prone to errors. Therefore, automatic approaches must be employed to enable the processing of the data, so that human operators only need to reason about selected portions. Computer vision problems in the domain of visual surveillance have been studied for several years with the aim of finding accurate and efficient solutions, which are required to allow the execution of surveillance systems in real environments. The main goal of such systems is to analyze the scene, focusing on the detection and recognition of suspicious activities performed by humans, so that the security staff can pay closer attention to these preselected activities. However, these systems are rarely tackled in a scalable manner. Before developing a full surveillance system, several problems have to be solved first, for instance: background subtraction, person detection, tracking and re-identification, face recognition, and action recognition. Even though each of these problems has been researched in the past decades, they are rarely considered in sequence. Each one is usually solved individually. 
However, in a real surveillance scenario, the aforementioned problems have to be solved in sequence considering only videos as the input. Aiming at evaluating approaches in more realistic scenarios, this work proposes a framework called Smart Surveillance Framework (SSF), which allows researchers to implement their solutions to the above problems as a sequence of processing modules that communicate through a shared memory. The SSF is a C++ library built to provide important features for a surveillance system, such as automatic scene understanding, scalability, real-time operation, multi-sensor environment, usage of low-cost standard components, runtime re-configuration, and communication control.}, keywords = {}, pubstate = {published}, tppubtype = {mastersthesis} } |
Antonio Carlos Nazare Junior; Cassio Elias Santos dos Junior; Renato Ferreira; William Robson Schwartz Smart Surveillance Framework: A Versatile Tool for Video Analysis Inproceedings In: IEEE Winter Conference on Applications of Computer Vision, pp. 753–760, 2014. @inproceedings{wacv2014smart, title = {Smart Surveillance Framework: A Versatile Tool for Video Analysis}, author = {Antonio Carlos Nazare Junior and Cassio Elias Santos dos Junior and Renato Ferreira and William Robson Schwartz}, url = {http://smartsenselab.dcc.ufmg.br/wp-content/uploads/2019/02/2014-Smart-Surveillance-Framework-A-Versatile-Tool-for-Video-Analysis.pdf}, year = {2014}, date = {2014-01-01}, booktitle = {IEEE Winter Conference on Applications of Computer Vision}, pages = {753--760}, keywords = {}, pubstate = {published}, tppubtype = {inproceedings} } |
Antonio Carlos Nazare Junior; Renato Ferreira; William Robson Schwartz Scalable Feature Extraction for Visual Surveillance Inproceedings In: Iberoamerican Congress on Pattern Recognition (CIARP), pp. 375–382, Springer International Publishing, 2014. @inproceedings{Nazare:2014:CIARP, title = {Scalable Feature Extraction for Visual Surveillance}, author = {Antonio Carlos Nazare Junior and Renato Ferreira and William Robson Schwartz}, url = {http://smartsenselab.dcc.ufmg.br/wp-content/uploads/2019/02/paper_2014_CIARP_Antonio.pdf}, year = {2014}, date = {2014-01-01}, booktitle = {Iberoamerican Congress on Pattern Recognition (CIARP)}, volume = {8827}, pages = {375--382}, publisher = {Springer International Publishing}, series = {Lecture Notes in Computer Science}, keywords = {}, pubstate = {published}, tppubtype = {inproceedings} } |
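The PhD thesis abstract in the list above mentions matching probe and gallery images indirectly, through their similarities with training samples. The sketch below illustrates one possible reading of that idea and is not the thesis's actual model: it assumes feature vectors are already extracted, assumes the training set holds the same identities seen by both cameras in the same order, and uses cosine similarity as a stand-in measure. All function names are hypothetical.

```python
import math
from typing import List, Sequence

Vector = Sequence[float]


def cosine(a: Vector, b: Vector) -> float:
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def similarity_profile(x: Vector, training: Sequence[Vector]) -> List[float]:
    """Describe an image by how similar it is to each training sample."""
    return [cosine(x, t) for t in training]


def rank_gallery(
    probe: Vector,
    gallery: Sequence[Vector],
    train_probe_cam: Sequence[Vector],
    train_gallery_cam: Sequence[Vector],
) -> List[int]:
    """Return gallery indices ordered from best to worst indirect match.

    Assumption: train_probe_cam[i] and train_gallery_cam[i] show the same
    training identity, so the two profiles are comparable entry by entry.
    """
    probe_profile = similarity_profile(probe, train_probe_cam)
    scores = [
        cosine(probe_profile, similarity_profile(g, train_gallery_cam))
        for g in gallery
    ]
    return sorted(range(len(gallery)), key=lambda i: scores[i], reverse=True)


# Toy data: the second camera swaps the two feature dimensions, so comparing
# raw features across cameras would pick the wrong gallery image.
train_cam_a = [[1.0, 0.0], [0.0, 1.0]]
train_cam_b = [[0.0, 1.0], [1.0, 0.0]]
probe = [1.0, 0.1]
gallery = [[1.0, 0.1], [0.1, 1.0]]
ranking = rank_gallery(probe, gallery, train_cam_a, train_cam_b)
print(ranking)
```

In this toy setup the probe is never compared with gallery features directly; each image is compared only with training samples from its own camera, which is what lets the ranking survive the change of appearance between cameras.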