THESES AND DISSERTATIONS
2019
Raphael Felipe Carvalho de Prates. Matching People Across Surveillance Cameras. PhD Thesis, Universidade Federal de Minas Gerais, 2019.

@phdthesis{RaphaelPrates:2020:PhD,
  title = {Matching People Across Surveillance Cameras},
  author = {Raphael Felipe Carvalho de Prates},
  year = {2019},
  date = {2019-03-29},
  school = {Universidade Federal de Minas Gerais},
  abstract = {The number of surveillance camera networks is increasing as a consequence of escalating security concerns. The large amount of data collected demands intelligent surveillance systems that extract information useful to security officers. To achieve this goal, such a system must be able to correlate information captured by different surveillance cameras. In this scenario, person re-identification is of central importance in establishing a global identity for individuals captured by different cameras using only visual appearance. However, this is a challenging task, since the same person captured by different cameras undergoes a drastic change of appearance as a consequence of variations in viewpoint, illumination, and pose. Recent work addresses person re-identification by proposing robust visual descriptors or cross-view matching functions, which are functions that learn to match images from different cameras. However, most of these works are impaired by problems such as ambiguity among individuals, scalability, and a reduced number of labeled images in the training set. In this thesis, we address the problem of matching individuals between cameras in a way that tackles the aforementioned problems and, therefore, obtains better results. Specifically, we propose two directions: the learning of subspaces and the models of indirect identification. The first learns a common subspace that is scalable with respect to the number of cameras and robust with respect to the number of training images available. In the second, we match probe and gallery images indirectly by computing their similarities with training samples. Experimental results validate both approaches on the person re-identification problem, considering both a single pair of cameras and more realistic situations with multiple cameras.},
  keywords = {Computer vision, Person Re-Identification, Smart Surveillance},
  pubstate = {published},
  tppubtype = {phdthesis}
}
2014
Antonio Carlos Nazare Junior. A Scalable and Versatile Framework for Smart Video Surveillance. Master's Thesis, Federal University of Minas Gerais, 2014.

@mastersthesis{Nazare:2014:MSc,
  title = {A Scalable and Versatile Framework for Smart Video Surveillance},
  author = {Antonio Carlos Nazare Junior},
  url = {http://smartsenselab.dcc.ufmg.br/wp-content/uploads/2019/03/dissertation_2014_Antonio-1.pdf},
  year = {2014},
  date = {2014-09-05},
  school = {Federal University of Minas Gerais},
  abstract = {The availability of surveillance cameras placed in public locations has increased vastly in recent years, providing a safe environment for people at the cost of a huge amount of collected visual data. Such data are mostly processed manually, a task which is labor intensive and prone to errors. Therefore, automatic approaches must be employed to enable the processing of the data, so that human operators only need to reason about selected portions. Computer vision approaches to problems in the domain of visual surveillance have been developed for several years, aiming at the accurate and efficient solutions required to run surveillance systems in real environments. The main goal of such systems is to analyze the scene, focusing on the detection and recognition of suspicious activities performed by humans, so that the security staff can pay closer attention to these preselected activities. However, these systems are rarely tackled in a scalable manner. Before developing a full surveillance system, several problems have to be solved first, for instance: background subtraction, person detection, tracking and re-identification, face recognition, and action recognition. Even though each of these problems has been researched in the past decades, they are hardly considered in a sequence. Each one is usually solved individually. However, in a real surveillance scenario, the aforementioned problems have to be solved in sequence considering only videos as the input. Aiming at evaluating approaches in more realistic scenarios, this work proposes a framework called Smart Surveillance Framework (SSF), which allows researchers to implement their solutions to the above problems as a sequence of processing modules that communicate through a shared memory. The SSF is a C++ library built to provide important features for a surveillance system, such as automatic scene understanding, scalability, real-time operation, multi-sensor environment, usage of low-cost standard components, runtime re-configuration, and communication control.},
  keywords = {ARDOP, Smart Surveillance, Surveillance Systems, VER+, Video Surveillance},
  pubstate = {published},
  tppubtype = {mastersthesis}
}