The SWAX Benchmark
The Sense Wax Attack (SWAX) Database comprises images of real persons and their corresponding realistic wax sculptures. As far as we know, the proposed benchmark was the only database containing both real and wax-modeled persons up to the publication date of the scientific paper. SWAX was designed to investigate the problem in which a face medium is presented to a system that must determine whether it is a bona fide (authentic) or a counterfeit (attack) sample.
Data
The proposed SWAX dataset consists of genuine and counterfeit samples for all available subjects. As illustrated in the figures below (photos of real persons and wax dummies, respectively), the dataset contains labeled photographs of characters, celebrities and public figures, chosen from a list of waxworks obtained from a famous chain of wax museums. More precisely, this work aims at investigating face spoofing detection in realistic scenarios, where there is little control over image acquisition. To enable fair algorithm comparisons, we provide four protocols for developing and evaluating algorithms using the SWAX benchmark.
SWAX is compiled from unconstrained online resources and consists of images and videos of characters, celebrities and public figures of whom wax figures have been sculpted. The database contains 33 female and 22 male individuals. It consists of 1,812 images and 110 videos of 55 people/figures. More precisely, each subject has at least 20 authentic still images and a minimum of 10 counterfeit images. All videos and still pictures were manually captured under uncontrolled scenarios, with uncooperative individuals and distinct camera viewpoints.
Important:
For development purposes, the proposed dataset adopts a leave-one-out cross-validation strategy in an attempt to avoid unfair algorithm biases and expose overfitting.
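As a rough illustration of a leave-one-out strategy, the sketch below holds out one partition at a time for testing and trains on the rest. The text above does not say which unit is left out (a subject, a fold, or something else), so the fold names, the file names and the helper `leave_one_out_folds` are all hypothetical; the splits distributed with the dataset take precedence.

```python
from typing import Dict, Iterator, List, Tuple


def leave_one_out_folds(
    samples_by_fold: Dict[str, List[str]],
) -> Iterator[Tuple[str, List[str], List[str]]]:
    """Yield (held_out_fold, train_samples, test_samples), one per fold."""
    for held_out in sorted(samples_by_fold):
        # The held-out fold is used for testing only.
        test = list(samples_by_fold[held_out])
        # Every other fold contributes to training.
        train = [
            sample
            for fold, items in sorted(samples_by_fold.items())
            if fold != held_out
            for sample in items
        ]
        yield held_out, train, test


# Hypothetical fold and file names, for illustration only.
folds = {
    "fold_a": ["a_real.jpg", "a_wax.jpg"],
    "fold_b": ["b_real.jpg", "b_wax.jpg"],
    "fold_c": ["c_real.jpg", "c_wax.jpg"],
}
splits = list(leave_one_out_folds(folds))
```

Each sample appears in exactly one test set, and no test sample is ever seen during training in the same round, which is what makes overfitting visible.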
Data Request:
To download the dataset, please read this agreement carefully, fill it out and send it back to one of the suggested e-mail addresses. The license agreement MUST be reviewed and signed by the individual or entity authorized to make legal commitments on behalf of the institution or corporation (e.g., Department or Administrative Head or similar). We cannot accept agreements signed by students or faculty members.
Protocols
The SWAX benchmark specifies evaluation guidelines and restrictions for the proper use of the dataset.
Protocol 01: Unsupervised, with additional data
One-class classification can be defined as a special case of unsupervised learning where only the class comprising authentic face pictures is well characterized by training data instances. It implies that counterfeit pictures are not known at training time but may emerge at test time. Thus, the following conditions shall apply:
- Procedures claimed to be unsupervised cannot make use of presentation attack samples at the training stage, being restricted to authentic samples only.
- There should be no parameters carrying bona fide or counterfeit labels, nor any other relevant information such as file names or unique identifiers.
- Approaches can use neither prior information about the number of samples included in each class nor the label distribution in the training and test sets.
- Supplementary samples from outside the SWAX database are allowed provided they depict bona fide individuals only; they must not include hand-labeled data or information indicating whether pictures are authentic or presentation attacks.
- A decision threshold must not be set in view of test results, since doing so means the threshold is being chosen in a supervised manner.
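The conditions above can be sketched as a one-class pipeline that is fitted on bona fide training samples only and whose threshold comes from training scores, never from test results. The centroid-distance model is only a stand-in chosen for brevity, not a method prescribed by the benchmark, and the feature vectors, the 0.95 quantile and all function names are assumptions made for illustration.

```python
import math
from typing import List, Sequence


def fit_centroid(train_bona_fide: Sequence[Sequence[float]]) -> List[float]:
    """Mean feature vector of the bona fide training samples."""
    n = len(train_bona_fide)
    dim = len(train_bona_fide[0])
    return [sum(x[d] for x in train_bona_fide) / n for d in range(dim)]


def anomaly_score(x: Sequence[float], centroid: Sequence[float]) -> float:
    """Euclidean distance to the centroid; larger means less like training data."""
    return math.dist(x, centroid)


def threshold_from_training(train_scores: Sequence[float], quantile: float = 0.95) -> float:
    """Pick the threshold from training scores only (no test data involved)."""
    ordered = sorted(train_scores)
    index = min(len(ordered) - 1, int(quantile * len(ordered)))
    return ordered[index]


# Hypothetical 2-D features extracted from bona fide training pictures.
train = [[0.0, 0.0], [0.1, 0.0], [0.0, 0.1], [0.1, 0.1]]
centroid = fit_centroid(train)
threshold = threshold_from_training([anomaly_score(x, centroid) for x in train])


def is_attack(x: Sequence[float]) -> bool:
    """Flag a sample as an attack when it lies farther away than the threshold."""
    return anomaly_score(x, centroid) > threshold
```

Because the threshold is fixed before any test sample is scored, the test set plays no role in model selection, which is the point of the last condition.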
Protocol 02: Restricted, without additional data
This protocol permits information indicating whether a picture is authentic or an attack. However, it prohibits any type of annotation or data from outside the SWAX database, including supplementary picture samples, external tools such as facial landmark detectors and alignment methods learned on separate pictures, and feature extractors trained on other data sources. The authentic/attack training labels provided in protocol 02 may therefore be used along with external algorithms, provided that they satisfy the requirements below:
- Algorithms under this protocol are not allowed to use supplementary data, either to identify presentation attacks or perform any kind of picture pre-processing.
- Validation and test sets are exclusive to their own purposes and cannot be employed to learn auxiliary methods.
- Researchers cannot rely on supplementary labeled data, such as manual face segmentation or facial landmark annotation.
Protocol 03: Unrestricted, with no-label additional data
Unlike the first and second protocols, which restrict investigators and developers to unlabeled authentic outside data or to SWAX training data alone, respectively, protocol 03 permits the use of additional data sources in the interest of improving an algorithm’s precision. The third protocol is subject to the following conditions:
- Outside data must not include individuals present in the SWAX database.
- External pictures cannot carry information indicating whether they are genuine or fraudulent samples.
- Additional pictures can be labeled with keypoints or segments for the sake of designing pre-processing algorithms and, as a result, enhance the overall performance of spoofing detection methods.
- Non-SWAX annotations cannot include information that may reveal whether face pictures are authentic or attacks.
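One way to apply these conditions is to screen external data before it enters a training pipeline. The sketch below is a minimal example under assumptions: the `ExternalSample` record, its field names and the identity strings are hypothetical, and rejecting any sample that carries a bona fide/attack label (rather than stripping the label) is simply a conservative reading of the rules.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Set, Tuple


@dataclass
class ExternalSample:
    path: str
    identity: str
    # "bona_fide" or "attack" when such a label exists; None otherwise.
    liveness_label: Optional[str] = None
    # Keypoint annotations are permitted under protocol 03.
    keypoints: List[Tuple[float, float]] = field(default_factory=list)


def admissible_for_protocol_03(sample: ExternalSample, swax_identities: Set[str]) -> bool:
    """Return True when an external sample satisfies the protocol 03 conditions."""
    if sample.identity in swax_identities:
        return False  # the identity already appears in SWAX
    if sample.liveness_label is not None:
        return False  # carries genuine/fraudulent information
    return True


# Hypothetical identities and paths, for illustration only.
swax_identities = {"subject_x"}
candidates = [
    ExternalSample("ext/1.jpg", "subject_x"),
    ExternalSample("ext/2.jpg", "subject_y", liveness_label="attack"),
    ExternalSample("ext/3.jpg", "subject_z", keypoints=[(10.0, 12.5)]),
]
usable = [s for s in candidates if admissible_for_protocol_03(s, swax_identities)]
```

Only the third candidate survives: it depicts a non-SWAX identity, has no liveness label, and its keypoints are the kind of annotation the protocol allows for pre-processing.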
Protocol 04: Totally Unrestricted
The totally unrestricted setting is the most permissive protocol, as it admits outside datasets, external feature extractors and other methods that have been built on independent visual data, as long as they adhere to the following requirements:
- Supplementary genuine and fraudulent pictures must depict identities that are not present in the SWAX database.
- External face samples may include annotated keypoints, attributes, segments as well as information carrying bona fide or counterfeit labels.
Protocol 04 admits cross-dataset experiments in the interest of assessing the generalization capability of algorithms and increasing their performance.
Consequently, researchers may leave out SWAX samples and use only external data in the training stage, but they must evaluate their approaches on the provided testing splits.
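The four protocols can be summarized as a small rule table that an experiment script checks before training. This is a sketch of one reading of the text, not an official specification: the field names and the `violations` helper are hypothetical, and the assumption that protocol 03 allows SWAX authentic/attack labels in training is inferred from its "unrestricted" name rather than stated outright.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class ProtocolRules:
    name: str
    attack_labels_in_training: bool  # may training use bona fide/attack labels?
    external_images: bool            # may pictures from outside SWAX be used?
    external_annotations: bool       # keypoints, segments, landmark tools
    external_liveness_labels: bool   # bona fide/attack labels on outside data


# Finer conditions are not encoded here: protocol 01 admits external images of
# bona fide individuals only, and protocols 03 and 04 require external
# identities that do not appear in SWAX.
PROTOCOLS: Dict[str, ProtocolRules] = {
    "01": ProtocolRules("Unsupervised, with additional data", False, True, False, False),
    "02": ProtocolRules("Restricted, without additional data", True, False, False, False),
    "03": ProtocolRules("Unrestricted, with no-label additional data", True, True, True, False),
    "04": ProtocolRules("Totally Unrestricted", True, True, True, True),
}


def violations(protocol_id: str, **used: bool) -> List[str]:
    """List the resources in use that the chosen protocol does not permit."""
    rules = PROTOCOLS[protocol_id]
    return [
        resource
        for resource, in_use in sorted(used.items())
        if in_use and not getattr(rules, resource)
    ]


report = violations("02", attack_labels_in_training=True, external_images=True)
```

Here `report` contains only `external_images`, since protocol 02 permits SWAX labels but no outside data. Under every protocol, evaluation still has to run on the provided testing splits.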