Deepfakes are interesting, entertaining and deeply concerning for security officials. As the technology behind them continues to improve, researchers and social media giants are reacting quickly to keep pace with it.
Recently, CNBC featured the work of MSU’s Xiaoming Liu and researchers from Facebook, who developed a model that pulls back the curtain on who is creating deepfakes.
Artificial intelligence researchers at Facebook and Michigan State University say they have developed a new piece of software that can reveal where so-called deepfakes have come from. Deepfakes are videos that have been digitally altered in some way with AI. They’ve become increasingly realistic in recent years, making it harder for humans to determine what’s real on the internet, and indeed Facebook, and what’s not.
The Facebook researchers claim that their AI software, announced on Wednesday, can be trained to establish if a piece of media is a deepfake or not from a still image or a single video frame. Not only that, they say the software can also identify the AI that was used to create the deepfake in the first place, no matter how novel the technique.
Tal Hassner, an applied research lead at Facebook, told CNBC that it’s possible to train AI software “to look at the photo and tell you with a reasonable degree of accuracy what is the design of the AI model that generated that photo.” The research comes after MSU realized last year that it’s possible to determine what model of camera was used to take a specific photo; Hassner said that Facebook’s work with MSU builds on this.
CNBC, June 16
There’s a lot to learn about deepfakes, the concerns about the technology and what can be done to ensure it doesn’t create havoc or confusion in elections or any other form of communications. If you are a journalist looking to know more, our experts are here to help.
Xiaoming Liu is an MSU Foundation Professor and an expert in computer vision, machine learning, and biometrics, especially face-related analysis. Dr. Liu is available to speak to media about deepfake technology; simply click on his icon now to arrange an interview today.
Biography
Xiaoming Liu earned his Ph.D. degree in Electrical and Computer Engineering from Carnegie Mellon University in 2004. He received a B.E. degree from Beijing Information Technology Institute, China, and an M.E. degree from Zhejiang University, China, in 1997 and 2000 respectively, both in Computer Science. Prior to joining MSU, he was a research scientist at the Computer Vision Laboratory of GE Global Research. His research interests include computer vision, pattern recognition, machine learning, biometrics, and human-computer interfaces.
Industry Expertise
Computer Software
Computer Hardware
Biotechnology
Areas of Expertise
Human Computer Interfaces
Deepfake Detection
Pattern Recognition
Machine Learning
Computer Vision
Biometrics
Accomplishments
Best Poster Award, 26th British Machine Vision Conference (BMVC)
2015, as co-author
Invited Participant, Microsoft Research Faculty Summit
2017
Withrow Distinguished Scholar–Junior Award
2018, established by the Withrow family to recognize faculty of the MSU College of Engineering who have demonstrated excellence in scholarly activities
Best Oral Paper Award
2019, for the paper “UGLLI Face Alignment: Estimating Uncertainty with Gaussian Log-Likelihood Loss” at the First Workshop on Statistical Deep Learning in Computer Vision (SDLCV)
Finalist of the CVPR 2019 Best Paper Award
2019, for students’ work of “Deep Tree Learning for Zero-shot Face Anti-Spoofing”
Fellow of International Association for Pattern Recognition (IAPR)
2020, for contributions to face and video analysis
MSU Foundation Professor
2021
Education
Beijing Information Technology Institute
B.A.
Computer Science and Engineering
1997
Carnegie Mellon University
Ph.D.
Electrical and Computer Engineering
2004
Zhejiang University
M.S.
Computer Science and Engineering
2000
Affiliations
IEEE Transactions on Biometrics, Behavior, and Identity Science (T-BIOM) Special Issue on Trustworthy Biometrics : Guest Editor, 2020 - 2022
Corresponding Expert of Frontiers of Information Technology & Electronic Engineering : Guest Editor, 2019 - 2022
Engineering Journal Special Issue on Artificial Intelligence 2021 : Guest Editor, 2021
Pattern Recognition Letters Special Issue on Biometric Presentation Attacks: handcrafted features versus deep learning approaches : Guest Editor, 2019
ACM Transactions on Multimedia Computing, Communications, and Applications (TOMM) Special Issue on Face Analysis for Applications : Guest Editor, 2018 - 2019
Machine Vision and Applications Special Issue on 2018 IEEE Winter Conference on Applications of Computer Vision : Guest Editor, 2018
International Journal of Computer Vision Special Issue on Deep Learning for Face Analysis : Guest Editor, 2017 - 2018
MSU, Facebook develop research model to fight deepfakes
MSU Today online
2021-06-16
“Our method will facilitate deepfake detection and tracing in real-world settings where the deepfake image itself is often the only information detectors have to work with,” said Xiaoming Liu, MSU Foundation Professor of computer science. “It’s important to go beyond current methods of image attribution because a deepfake could be created using a generative model that the current detector has not seen during its training.”
Facebook scientists say they can now tell where deepfakes have come from
CNBC online
2021-06-16
Schick questioned whether Facebook’s tool would work on the latter, adding that “there can never be a one size fits all detector.” But Xiaoming Liu, Facebook’s collaborator at Michigan State, said the work has “been evaluated and validated on both cases of deepfakes.” Liu added that the “performance might be lower” in cases where the manipulation only happens in a very small area.
Facebook says it’s made a big leap forward in detecting deepfakes
Fortune online
2021-06-16
Hassner says the research took inspiration from prior work by a Michigan State computer scientist who collaborated on the project, Xiaoming Liu. Liu had studied the subtle differences between images taken with different brands and kinds of digital cameras. He built machine-learning systems that could analyze images and determine, with a high degree of accuracy, the type of camera used to take that particular picture.
Biometric smart cards and civic digital identity apps to redefine wallets
Biometric Update online
2020-11-07
Michigan State University biometrics researcher Sixue Gong explains a method for de-biasing facial recognition described in a research paper written with Xiaoming Liu and Anil Jain in an interview with Biometric Update. The idea is one of several promising attempts to move beyond improving training dataset balance to address the problem, which Sixue says is necessary.
Method for facial recognition bias reduction with adversarial network shows promise
Biometric Update online
2020-11-02
A paper jointly written by Sixue Gong, Xiaoming Liu and Anil K. Jain, all of Michigan State University, ‘Jointly de-biasing face recognition and demographic attribute estimation,’ was presented at the European Conference on Computer Vision (ECCV) 2020.
“Cameras can see diverse scenes,” he explained, “so our research objects range from human faces and bodies, to urban scenes, plants, and medical imaging. Recent interests also include 3D perception in autonomous driving and defending against various digital image manipulations, such as deepfake.”
Monocular Vision-based 3D Perception for Autonomous Driving
General Motors Research and Development Center, Warren MI Virtual
2020-08-04
Autonomous Sensing: from 3D Object Detection to Biometric Recognition
Army Research Laboratory, Adelphi, MD Virtual
2020-10-27
3D Perception for Autonomous Driving: Research and Education
Southern University of Science & Technology, Shenzhen, China Virtual
2020-11-13
Monocular Video-based 3D Perception for Autonomous Driving
7th Tech.AD USA conference 2020, Detroit MI Virtual
2020-11-17
On the Accuracy, Vulnerability, and Biasness of Face Recognition
The 15th Chinese Conference on Biometrics Recognition (CCBR), Shanghai, China Virtual
2021-09-11
Research Focus
Areas :
Computer Vision, Pattern Recognition, Image and Video Processing, Machine Learning, Human Computer Interface, Medical Image Analysis, Multimedia Retrieval.
Patents
Disentangled representation learning generative adversarial network for pose-invariant face recognition
US20200265219A1
2020-08-20
A system and method for identifying a subject using imaging are provided. In some aspects, the method includes receiving an image depicting a subject to be identified, and applying a trained Disentangled Representation learning-Generative Adversarial Network (DR-GAN) to the image to generate an identity representation of the subject, wherein the DR-GAN comprises a discriminator and a generator having at least one of an encoder and a decoder. The method also includes identifying the subject using the identity representation, and generating a report indicative of the subject identified.
Visual analytics system for convolutional neural network based classifiers
US10984054B2
2021-04-20
A visual analytics method and system is disclosed for visualizing an operation of an image classification model having at least one convolutional neural network layer. The image classification model classifies sample images into one of a predefined set of possible classes. The visual analytics method determines a unified ordering of the predefined set of possible classes based on a similarity hierarchy such that classes that are similar to one another are clustered together in the unified ordering. The visual analytics method displays various graphical depictions, including a class hierarchy viewer, a confusion matrix, and a response map. In each case, the elements of the graphical depictions are arranged in accordance with the unified ordering. Using the method, a user is better able to understand the training process of the model, diagnose the separation power of the different feature detectors of the model, and improve the architecture of the model.
SCH: INT: Collaborative Research: Unobtrusive sensing and motivational feedback for family wellness
National Science Foundation
2019-08-01
Principal Investigator
Face manipulation detection
Facebook
2020-06-01
Principal Investigator
“Intelligent Diagnosis for Machine and Human-Centric Adversaries,” DARPA Reverse Engineering of Deceptions (RED) program
Northeastern University
2020-11-01
Principal Investigator
Journal Articles
Fully Understanding Generic Objects: Modeling, Segmentation, and Reconstruction
arXiv preprint
Feng Liu, Luan Tran, Xiaoming Liu
2021
Inferring 3D structure of a generic object from a 2D image is a long-standing objective of computer vision. Conventional approaches either learn completely from CAD-generated synthetic data, which have difficulty in inference from real images, or generate 2.5D depth image via intrinsic decomposition, which is limited compared to the full 3D reconstruction. One fundamental challenge lies in how to leverage numerous real 2D images without any 3D ground truth. To address this issue, we take an alternative approach with semi-supervised learning. That is, for a 2D image of a generic object, we decompose it into latent representations of category, shape and albedo, lighting and camera projection matrix, decode the representations to segmented 3D shape and albedo respectively, and fuse these components to render an image well approximating the input image. Using a category-adaptive 3D joint occupancy field (JOF), we show that the complete shape and albedo modeling enables us to leverage real 2D images in both modeling and model fitting. The effectiveness of our approach is demonstrated through superior 3D reconstruction from a single image, being either synthetic or real, and shape segmentation.
Unified Detection of Digital and Physical Face Attacks
arXiv preprint
Debayan Deb, Xiaoming Liu, Anil K Jain
2021
State-of-the-art defense mechanisms against face attacks achieve near-perfect accuracies within one of three attack categories, namely adversarial, digital manipulation, or physical spoofs. However, they fail to generalize well when tested across all three categories. Poor generalization can be attributed to learning incoherent attacks jointly. To overcome this shortcoming, we propose a unified attack detection framework, namely UniFAD, that can automatically cluster 25 coherent attack types belonging to the three categories. Using a multi-task learning framework along with k-means clustering, UniFAD learns joint representations for coherent attacks, while uncorrelated attack types are learned separately. The proposed UniFAD outperforms prevailing defense methods and their fusion with an overall TDR = 94.73% @ 0.2% FDR on a large fake face dataset consisting of 341K bona fide images and 448K attack images of 25 types across all 3 categories. The proposed method can detect an attack within 3 milliseconds on an Nvidia 2080Ti. UniFAD can also identify the attack types and categories with 75.81% and 97.37% accuracies, respectively.
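For readers unfamiliar with the "TDR @ FDR" figure quoted in the abstract above, the following is a minimal sketch of how such an operating point is commonly computed. It assumes the usual convention that TDR is the fraction of attack images flagged and FDR is the fraction of bona fide images wrongly flagged; the function name, the score convention (higher means more attack-like) and the toy scores are illustrative assumptions, not taken from the paper.

```python
import math


def tdr_at_fdr(bona_fide_scores, attack_scores, target_fdr=0.002):
    """Return (TDR, achieved FDR, threshold) at a fixed false detection rate.

    Assumes higher scores mean "more likely an attack" (an illustrative
    convention, not the paper's stated one).
    """
    bona = sorted(bona_fide_scores, reverse=True)
    # Number of bona fide samples we may wrongly flag without exceeding the target.
    allowed = math.floor(target_fdr * len(bona))
    # Flag only scores strictly above the (allowed + 1)-th highest bona fide score.
    threshold = bona[allowed] if allowed < len(bona) else float("-inf")
    false_detections = sum(s > threshold for s in bona)
    true_detections = sum(s > threshold for s in attack_scores)
    return (
        true_detections / len(attack_scores),
        false_detections / len(bona),
        threshold,
    )


# Toy example with made-up scores and a deliberately loose 25% FDR target.
tdr, fdr, thr = tdr_at_fdr([0.1, 0.2, 0.3, 0.9], [0.2, 0.5, 0.8, 0.95], target_fdr=0.25)
print(tdr, fdr, thr)
```

Fixing the false detection rate first and then reading off the detection rate makes results comparable across detectors whose raw scores are on different scales.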
Depth Completion with Twin Surface Extrapolation at Occlusion Boundaries
arXiv preprint
Saif Imran, Xiaoming Liu, Daniel Morris
2021
Depth completion starts from a sparse set of known depth values and estimates the unknown depths for the remaining image pixels. Most methods model this as depth interpolation and erroneously interpolate depth pixels into the empty space between spatially distinct objects, resulting in depth-smearing across occlusion boundaries. Here we propose a multi-hypothesis depth representation that explicitly models both foreground and background depths in the difficult occlusion-boundary regions. Our method can be thought of as performing twin-surface extrapolation, rather than interpolation, in these regions. Next, our method fuses these extrapolated surfaces into a single depth image leveraging the image data. Key to our method is the use of an asymmetric loss function that operates on a novel twin-surface representation. This enables us to train a network to simultaneously do surface extrapolation and surface fusion. We characterize our loss function and compare with other common losses. Finally, we validate our method on three different datasets: KITTI, an outdoor real-world dataset; NYU2, an indoor real-world depth dataset; and Virtual KITTI, a photo-realistic synthetic dataset with dense ground truth, and demonstrate improvement over the state of the art.
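The abstract above does not spell out its loss, so the following is only an illustrative sketch of why an asymmetric loss can separate foreground from background depth at an occlusion boundary. It uses a generic asymmetric linear penalty; the function names, the value of gamma and the toy depths are assumptions and should not be read as the paper's formulation.

```python
def asymmetric_linear_loss(error, gamma=2.0):
    """Penalise positive errors with slope gamma and negative errors with slope 1/gamma."""
    return max(gamma * error, -error / gamma)


def best_constant_depth(targets, candidates, reflect=False, gamma=2.0):
    """Pick the candidate depth that minimises the mean asymmetric loss.

    With reflect=False over-estimates are costly, so the minimiser is pulled
    toward the nearer (foreground) depths; reflect=True flips the penalty and
    pulls it toward the farther (background) depths.
    """
    sign = -1.0 if reflect else 1.0

    def mean_loss(depth):
        return sum(asymmetric_linear_loss(sign * (depth - t), gamma) for t in targets) / len(targets)

    return min(candidates, key=mean_loss)


# Toy pixel neighbourhood straddling an occlusion boundary: two near and two far samples.
depths = [2.0, 2.0, 10.0, 10.0]
candidates = [float(d) for d in range(2, 11)]
foreground = best_constant_depth(depths, candidates)
background = best_constant_depth(depths, candidates, reflect=True)
print(foreground, background)
```

A symmetric loss on the same toy data would be indifferent between the two surfaces or settle between them, which is the depth-smearing behaviour the abstract describes.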
Riggable 3D Face Reconstruction via In-Network Optimization
arXiv preprint
Ziqian Bai, Zhaopeng Cui, Xiaoming Liu, Ping Tan
2021
This paper presents a method for riggable 3D face reconstruction from monocular images, which jointly estimates a personalized face rig and per-image parameters including expressions, poses, and illuminations. To achieve this goal, we design an end-to-end trainable network embedded with a differentiable in-network optimization. The network first parameterizes the face rig as a compact latent code with a neural decoder, and then estimates the latent code as well as per-image parameters via a learnable optimization. By estimating a personalized face rig, our method goes beyond static reconstructions and enables downstream applications such as video retargeting. In-network optimization explicitly enforces constraints derived from first principles, and thus introduces additional priors compared with regression-based methods. Finally, data-driven priors from deep learning are utilized to constrain the ill-posed monocular setting and ease the optimization difficulty. Experiments demonstrate that our method achieves SOTA reconstruction accuracy, reasonable robustness and generalization ability, and supports standard face rig applications.
While radar and video data can be readily fused at the detection level, fusing them at the pixel level is potentially more beneficial. This is also more challenging, in part due to the sparsity of radar, but also because automotive radar beams are much wider than a typical pixel and there is a large baseline between camera and radar, which results in poor association between radar pixels and color pixels. A consequence is that depth completion methods designed for LiDAR and video fare poorly for radar and video. Here we propose a radar-to-pixel association stage which learns a mapping from radar returns to pixels. This mapping also serves to densify radar returns. Using this as a first stage, followed by a more traditional depth completion method, we are able to achieve image-guided depth completion with radar and video. We demonstrate performance superior to camera and radar alone on the nuScenes dataset. Our source code is available at this https URL.