Smart Surveillance
Computer vision problems applied to visual surveillance have been studied for several years with the goal of finding accurate and efficient solutions, which are required to run surveillance systems in real environments. The main goal of such systems is to analyze the scene, focusing on the detection and recognition of suspicious activities performed by humans, so that security staff can pay closer attention to these preselected activities. To achieve this, several problems must be solved first, for example background subtraction, person detection, tracking and re-identification, face recognition, and action recognition. Although each of these problems has been researched over the past decades, they are rarely considered in sequence; each is usually solved individually. In real surveillance scenarios, however, these problems must be solved in sequence, with only the videos as input.
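As a rough illustration of what "solved in sequence, with only the videos as input" can mean in practice, the sketch below chains three toy stages so that each one consumes what the previous one produced. Everything in it is an assumption made for illustration: the stage functions, the `State` dictionary, the threshold of 30, and the idea that the first frame is an empty background are hypothetical and are not taken from any of the publications listed here.

```python
from typing import Callable, Dict, List

Frame = List[List[int]]    # toy grayscale frame: rows of pixel intensities
State = Dict[str, object]  # results accumulated by the stages (hypothetical layout)


def subtract_background(state: State) -> State:
    """Mark pixels that differ from the first frame (assumed to be empty background)."""
    frames = state["frames"]
    background = frames[0]
    state["masks"] = [
        [[abs(p - b) > 30 for p, b in zip(row, bg_row)]  # 30 is an arbitrary threshold
         for row, bg_row in zip(frame, background)]
        for frame in frames
    ]
    return state


def detect_blobs(state: State) -> State:
    """Reduce each foreground mask to one bounding box (x0, y0, x1, y1), or None."""
    boxes = []
    for mask in state["masks"]:
        points = [(x, y) for y, row in enumerate(mask) for x, on in enumerate(row) if on]
        if points:
            xs, ys = zip(*points)
            boxes.append((min(xs), min(ys), max(xs), max(ys)))
        else:
            boxes.append(None)
    state["boxes"] = boxes
    return state


def track(state: State) -> State:
    """Link the per-frame boxes into a trajectory of box centers."""
    state["trajectory"] = [
        ((box[0] + box[2]) / 2, (box[1] + box[3]) / 2)
        for box in state["boxes"]
        if box is not None
    ]
    return state


# The order matters: each stage reads keys written by the stage before it.
PIPELINE: List[Callable[[State], State]] = [subtract_background, detect_blobs, track]


def run_pipeline(frames: List[Frame]) -> State:
    """Run every stage in order; the raw frames are the only external input."""
    state: State = {"frames": frames}
    for stage in PIPELINE:
        state = stage(state)
    return state
```

Calling `run_pipeline(frames)` on a short list of frames returns the accumulated state, including the `trajectory` key written by the last stage. The point of the sketch is only the dependency structure: an error in an early stage propagates to every later one, which is why evaluating each problem in isolation can be misleading.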
Related Publications
Raphael Felipe Carvalho de Prates: Matching People Across Surveillance Cameras. PhD Thesis, Universidade Federal de Minas Gerais, 2019.
@phdthesis{RaphaelPrates:2020:PhD,
  title = {Matching People Across Surveillance Cameras},
  author = {Raphael Felipe Carvalho de Prates},
  year = {2019},
  date = {2019-03-29},
  school = {Universidade Federal de Minas Gerais},
  abstract = {The number of surveillance camera networks is increasing as a consequence of the escalation of security concerns. The large amount of data collected demands intelligent surveillance systems to extract information that is useful to security officers. In order to achieve this goal, such a system must be able to correlate information captured by different surveillance cameras. In this scenario, re-identification of people is of central importance in establishing a global identity for individuals captured by different cameras using only visual appearance. However, this is a challenging task, since the same person, when captured by different cameras, undergoes a drastic change of appearance as a consequence of the variations in point of view, illumination and pose. Recent work addresses person re-identification by proposing robust visual descriptors or cross-view matching functions, which are functions that learn to match images from different cameras. However, most of these works are impaired by problems such as ambiguity among individuals, scalability, and a reduced number of labeled images in the training set. In this thesis, we address the problem of matching individuals between cameras in order to tackle the aforementioned problems and, therefore, obtain better results. Specifically, we propose two directions: the learning of subspaces and the models of indirect identification. The first learns a common subspace that is scalable with respect to the number of cameras and robust in relation to the amount of training images available. In the second, we match probe and gallery images indirectly by computing their similarities with training samples. Experimental results validate both approaches in the person re-identification problem, considering both a single pair of cameras and more realistic situations with multiple cameras.},
  keywords = {},
  pubstate = {published},
  tppubtype = {phdthesis}
}
Antonio Carlos Nazare Junior; William Robson Schwartz: A scalable and flexible framework for smart video surveillance. Journal Article. In: Computer Vision and Image Understanding, 144 (C), pp. 258–275, 2016.
@article{Nazare:2016:CVIU,
  title = {A scalable and flexible framework for smart video surveillance},
  author = {Antonio Carlos Nazare Junior and William Robson Schwartz},
  url = {http://smartsenselab.dcc.ufmg.br/wp-content/uploads/2019/02/paper_2016_CVIU.pdf},
  year = {2016},
  date = {2016-01-01},
  journal = {Computer Vision and Image Understanding},
  volume = {144},
  number = {C},
  pages = {258--275},
  keywords = {},
  pubstate = {published},
  tppubtype = {article}
}
Antonio Carlos Nazare Junior: A Scalable and Versatile Framework for Smart Video Surveillance. Masters Thesis, Federal University of Minas Gerais, 2014.
@mastersthesis{Nazare:2014:MSc,
  title = {A Scalable and Versatile Framework for Smart Video Surveillance},
  author = {Antonio Carlos Nazare Junior},
  url = {http://smartsenselab.dcc.ufmg.br/wp-content/uploads/2019/03/dissertation_2014_Antonio-1.pdf},
  year = {2014},
  date = {2014-09-05},
  school = {Federal University of Minas Gerais},
  abstract = {The availability of surveillance cameras placed in public locations has increased vastly in recent years, providing a safe environment for people at the cost of a huge amount of visual data collected. Such data are mostly processed manually, a task which is labor intensive and prone to errors. Therefore, automatic approaches must be employed to enable the processing of the data, so that human operators only need to reason about selected portions. Computer vision problems applied to visual surveillance have been studied for several years, aiming at accurate and efficient solutions, which are required to allow the execution of surveillance systems in real environments. The main goal of such systems is to analyze the scene, focusing on the detection and recognition of suspicious activities performed by humans, so that the security staff can pay closer attention to these preselected activities. However, these systems are rarely tackled in a scalable manner. Before developing a full surveillance system, several problems have to be solved first, for instance: background subtraction, person detection, tracking and re-identification, face recognition, and action recognition. Even though each of these problems has been researched in the past decades, they are hardly ever considered in a sequence. Each one is usually solved individually. However, in a real surveillance scenario, the aforementioned problems have to be solved in sequence, considering only videos as the input. Aiming at evaluating approaches in more realistic scenarios, this work proposes a framework called Smart Surveillance Framework (SSF), which allows researchers to implement their solutions to the above problems as a sequence of processing modules that communicate through a shared memory. The SSF is a C++ library built to provide important features for a surveillance system, such as automatic scene understanding, scalability, real-time operation, multi-sensor environment, usage of low-cost standard components, runtime re-configuration, and communication control.},
  keywords = {},
  pubstate = {published},
  tppubtype = {mastersthesis}
}
Antonio Carlos Nazare Junior; Cassio Elias Santos dos Junior; Renato Ferreira; William Robson Schwartz: Smart Surveillance Framework: A Versatile Tool for Video Analysis. Inproceedings. In: IEEE Winter Conference on Applications of Computer Vision, pp. 753–760, 2014.
@inproceedings{wacv2014smart,
  title = {Smart Surveillance Framework: A Versatile Tool for Video Analysis},
  author = {Antonio Carlos Nazare Junior and Cassio Elias Santos dos Junior and Renato Ferreira and William Robson Schwartz},
  url = {http://smartsenselab.dcc.ufmg.br/wp-content/uploads/2019/02/2014-Smart-Surveillance-Framework-A-Versatile-Tool-for-Video-Analysis.pdf},
  year = {2014},
  date = {2014-01-01},
  booktitle = {IEEE Winter Conference on Applications of Computer Vision},
  pages = {753--760},
  keywords = {},
  pubstate = {published},
  tppubtype = {inproceedings}
}
Antonio Carlos Nazare Junior; Renato Ferreira; William Robson Schwartz: Scalable Feature Extraction for Visual Surveillance. Inproceedings. In: Iberoamerican Congress on Pattern Recognition (CIARP), pp. 375–382, Springer International Publishing, 2014.
@inproceedings{Nazare:2014:CIARP,
  title = {Scalable Feature Extraction for Visual Surveillance},
  author = {Antonio Carlos Nazare Junior and Renato Ferreira and William Robson Schwartz},
  url = {http://smartsenselab.dcc.ufmg.br/wp-content/uploads/2019/02/paper_2014_CIARP_Antonio.pdf},
  year = {2014},
  date = {2014-01-01},
  booktitle = {Iberoamerican Congress on Pattern Recognition (CIARP)},
  volume = {8827},
  pages = {375--382},
  publisher = {Springer International Publishing},
  series = {Lecture Notes in Computer Science},
  keywords = {},
  pubstate = {published},
  tppubtype = {inproceedings}
}