Computer vision problems applied to visual surveillance have been studied for several years with the aim of finding accurate and efficient solutions, which are required to allow the execution of surveillance systems in real environments. The main goal of such systems is to analyze the scene, focusing on the detection and recognition of suspicious activities performed by humans, so that security personnel can pay closer attention to these preselected activities. To accomplish that, several problems have to be solved first, for instance background subtraction, person detection, tracking and re-identification, face recognition, and action recognition. Even though each of these problems has been researched in the past decades, they are rarely considered in sequence; each one is usually solved individually. However, in real surveillance scenarios, the aforementioned problems have to be solved in sequence, considering only videos as the input.
Raphael Felipe Carvalho de Prates
Matching People Across Surveillance Cameras PhD Thesis
Universidade Federal de Minas Gerais, 2019.
The number of surveillance camera networks is increasing as a consequence of escalating security concerns. The large amount of data collected demands intelligent surveillance systems that extract information useful to security officers. To achieve this goal, such a system must be able to correlate information captured by different surveillance cameras. In this scenario, person re-identification is of central importance in establishing a global identity for individuals captured by different cameras using only visual appearance. However, this is a challenging task, since the same person captured by different cameras undergoes a drastic change of appearance as a consequence of variations in point of view, illumination, and pose. Recent work addresses person re-identification by proposing robust visual descriptors or cross-view matching functions, which are functions that learn to match images from different cameras. However, most of these works are impaired by problems such as ambiguity among individuals, scalability, and a reduced number of labeled images in the training set. In this thesis, we study the matching of individuals between cameras in order to address the aforementioned problems and, therefore, obtain better results. Specifically, we propose two directions: the learning of subspaces and models of indirect identification. The first learns a common subspace that is scalable with respect to the number of cameras and robust to the amount of training images available. In the second, we match probe and gallery images indirectly by computing their similarities with training samples. Experimental results validate both approaches on the person re-identification problem, considering both a single pair of cameras and more realistic situations with multiple cameras.
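The idea of indirect matching mentioned in the abstract above can be illustrated with a small sketch. This is not the model proposed in the thesis; it only shows the general principle under several assumptions: features are toy 2-D vectors, similarity is plain cosine similarity, and the training set contains the same identities seen by both cameras in the same order. All function and variable names are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def similarity_profile(feature, train_features):
    """Describe an image by its similarity to each training sample from its own camera."""
    return [cosine(feature, t) for t in train_features]


def rank_gallery(probe, gallery, train_cam_a, train_cam_b):
    """Return gallery indices ordered from best to worst indirect match.

    Assumption: train_cam_a[i] and train_cam_b[i] show the same training identity.
    """
    probe_profile = similarity_profile(probe, train_cam_a)
    scores = []
    for idx, g in enumerate(gallery):
        gallery_profile = similarity_profile(g, train_cam_b)
        # Probe and gallery are never compared directly, only through their profiles.
        scores.append((cosine(probe_profile, gallery_profile), idx))
    scores.sort(key=lambda s: s[0], reverse=True)
    return [idx for _, idx in scores]


# Toy data: camera B is assumed to swap the two feature axes relative to camera A.
train_cam_a = [[1.0, 0.0], [0.0, 1.0]]
train_cam_b = [[0.0, 1.0], [1.0, 0.0]]
probe = [0.9, 0.1]                    # seen by camera A, resembles training identity 0
gallery = [[0.1, 0.9], [0.9, 0.1]]    # seen by camera B; index 0 is the true match
ranking = rank_gallery(probe, gallery, train_cam_a, train_cam_b)
```

In this toy setup a direct comparison would prefer gallery image 1, whose raw feature vector equals the probe's, while the indirect comparison ranks gallery image 0 first because both images relate to the training identities in the same way.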
Antonio Carlos Nazare Junior; William Robson Schwartz
A scalable and flexible framework for smart video surveillance Journal Article
In: Computer Vision and Image Understanding, 144 (C), pp. 258–275, 2016.
Antonio Carlos Nazare Junior
Federal University of Minas Gerais, 2014.
The availability of surveillance cameras placed in public locations has increased vastly in recent years, providing a safe environment for people at the cost of a huge amount of visual data collected. Such data are mostly processed manually, a task which is labor intensive and prone to errors. Therefore, automatic approaches must be employed to enable the processing of the data, so that human operators only need to reason about selected portions.
Computer vision techniques for visual surveillance have been developed for several years with the aim of finding accurate and efficient solutions, which are required to allow the execution of surveillance systems in real environments. The main goal of such systems is to analyze the scene, focusing on the detection and recognition of suspicious activities performed by humans, so that the security staff can pay closer attention to these preselected activities. However, these systems are rarely tackled in a scalable manner.
Before developing a full surveillance system, several problems have to be solved first, for instance: background subtraction, person detection, tracking and re-identification, face recognition, and action recognition. Even though each of these problems has been researched in the past decades, they are rarely considered in sequence. Each one is usually solved individually. However, in a real surveillance scenario, the aforementioned problems have to be solved in sequence, considering only videos as the input.
To evaluate approaches in more realistic scenarios, this work proposes a framework called the Smart Surveillance Framework (SSF), which allows researchers to implement their solutions to the above problems as a sequence of processing modules that communicate through a shared memory.
The SSF is a C++ library built to provide important features for a surveillance system, such as automatic scene understanding, scalability, real-time operation, multi-sensor environment, usage of low-cost standard components, runtime re-configuration, and communication control.
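The architecture described above, a sequence of processing modules that communicate only through a shared memory, can be sketched in a few lines. The sketch is in Python for brevity and does not reproduce the SSF API, which is a C++ library not shown here; the class names, stream names, and the toy "detector" and "alert" steps are all hypothetical, and the pipeline runs in a single thread.

```python
class SharedMemory:
    """Hypothetical shared memory: named streams that modules append to and read from."""

    def __init__(self):
        self._streams = {}

    def write(self, stream, item):
        self._streams.setdefault(stream, []).append(item)

    def read(self, stream):
        # Return a copy so that readers cannot modify the stored stream.
        return list(self._streams.get(stream, []))


class Module:
    """A processing step that reads one stream and writes its results to another."""

    def __init__(self, source, target, func):
        self.source = source
        self.target = target
        self.func = func

    def run(self, memory):
        for item in memory.read(self.source):
            result = self.func(item)
            if result is not None:  # None means "nothing to report for this item"
                memory.write(self.target, result)


def run_pipeline(memory, modules):
    """Run the modules in order; they never call each other directly."""
    for module in modules:
        module.run(memory)


def detect(frame):
    """Toy detector: keep only frames that contain at least one person."""
    if frame["people"] > 0:
        return {"frame": frame["id"], "count": frame["people"]}
    return None


def describe(detection):
    """Toy alert step: turn a detection into a message for an operator."""
    return "frame {}: {} person(s)".format(detection["frame"], detection["count"])


memory = SharedMemory()
for frame_id, people in enumerate([0, 2, 1]):
    memory.write("frames", {"id": frame_id, "people": people})

run_pipeline(memory, [
    Module("frames", "detections", detect),
    Module("detections", "alerts", describe),
])
alerts = memory.read("alerts")
```

Because each module knows only the names of its input and output streams, a step can be replaced or inserted without changing the others, which is the property that makes this style suitable for comparing alternative solutions to one stage of the pipeline.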
Antonio Carlos Nazare Junior; Cassio Elias Santos dos Junior; Renato Ferreira; William Robson Schwartz
In: IEEE Winter Conference on Applications of Computer Vision, pp. 753–760, 2014.
Antonio Carlos Nazare Junior; Renato Ferreira; William Robson Schwartz
Scalable Feature Extraction for Visual Surveillance Inproceedings
In: Iberoamerican Congress on Pattern Recognition (CIARP), pp. 375–382, Springer International Publishing, 2014.