1Binghamton University, 2Intel Corporation
We present a novel approach to detect synthetic content in portrait videos, as a preventive solution for the emerging threat of deep fakes. In other words, we introduce a deep fake detector. We observe that detectors that blindly utilize deep learning are not effective in catching fake content, as generative models produce formidably realistic results. Our key assertion is that biological signals hidden in portrait videos can be used as an implicit descriptor of authenticity, because they are neither spatially nor temporally preserved in fake content. To prove and exploit this assertion, we first exhibit several signal transformations for the pairwise separation problem, achieving 99.39% accuracy. Second, we utilize those findings to formulate a generalized classifier for fake content by analyzing the proposed signal transformations and corresponding feature sets. Third, we generate novel signal maps and employ a CNN to improve our traditional classifier for detecting synthetic content. Lastly, we release an "in the wild" dataset of fake portrait videos collected as part of our evaluation process. We evaluate FakeCatcher on both the Face Forensics dataset and our new Deep Fakes Dataset, achieving 96% and 91.07% accuracy, respectively. In addition, our approach produces a significantly higher detection rate than baselines, and does not depend on the source, generator, or properties of the fake content. We also analyze signals from various facial regions, with varying segment durations, and under several dimensionality reduction techniques.
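To give a flavor of what "biological signals hidden in portrait videos" means in practice, the sketch below extracts a simple remote photoplethysmography (rPPG) signal as the mean green-channel intensity over a fixed facial region, then reads off the dominant pulse frequency with an FFT. This is an illustrative assumption, not the paper's actual pipeline: the ROI, the green-channel heuristic, and the synthetic test frames are all hypothetical stand-ins chosen for a self-contained demo.

```python
import numpy as np

def extract_green_ppg(frames, roi):
    """Per-frame mean green-channel intensity over a fixed face ROI.

    frames: uint8 RGB array of shape (T, H, W, 3)
    roi: (top, bottom, left, right) bounding a skin region (e.g. cheek)
    Returns a zero-mean 1-D signal with one sample per frame.
    """
    t, b, l, r = roi
    patch = frames[:, t:b, l:r, 1].astype(np.float64)  # green channel only
    sig = patch.mean(axis=(1, 2))                      # spatial average per frame
    return sig - sig.mean()                            # remove the DC component

# Synthetic demo: 90 frames at 30 fps with a 1 Hz (60 bpm) "pulse"
# modulating the green channel of otherwise-constant frames.
fps, T = 30.0, 90
tt = np.arange(T) / fps
frames = np.full((T, 64, 64, 3), 128, dtype=np.uint8)
pulse = np.rint(3 * np.sin(2 * np.pi * 1.0 * tt)).astype(np.int16)
frames[..., 1] = np.clip(128 + pulse[:, None, None], 0, 255).astype(np.uint8)

sig = extract_green_ppg(frames, roi=(16, 48, 16, 48))

# The dominant FFT frequency recovers the simulated heart rate.
freqs = np.fft.rfftfreq(T, d=1 / fps)
peak_hz = freqs[np.argmax(np.abs(np.fft.rfft(sig)))]
```

The intuition behind the paper's assertion is that a generator synthesizing a face has no reason to reproduce this subtle, spatio-temporally coherent color variation, so signals extracted this way behave differently on real and fake videos.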
To assess the generalizability of our solution against deep fakes, we evaluate our approach on everyday deepfake samples. For this purpose, we collected and curated a dataset of "in the wild" portrait videos, called the Deep Fakes Dataset. The videos in our dataset are diverse real-world samples in terms of source generative model, resolution, compression, illumination, aspect ratio, frame rate, motion, pose, cosmetics, occlusion, content, and context. They originate from various sources such as news articles, forums, apps, and research presentations, totaling 142 videos, 32 minutes, and 17 GB. Synthetic videos are matched with their original counterparts when possible. The visuals below demonstrate a small subset of our dataset. High accuracy on the Deep Fakes Dataset substantiates that FakeCatcher is robust to all of the aforementioned variations found in deepfakes in the wild. The dataset is publicly released for academic use.
Deep Fakes Dataset is released under the Deep Fakes Academic Use License Agreement.
To download the Deep Fakes Dataset, please fill out the following Google form after reading and agreeing to our License Agreement. Upon acceptance of your request, the download link will be sent to the provided e-mail address. For any questions or feedback, please e-mail Umur Ciftci, Ilke Demir, or Lijun Yin.