TITLE:
HMM-Based Photo-Realistic Talking Face Synthesis Using Facial Expression Parameter Mapping with Deep Neural Networks
AUTHORS:
Kazuki Sato, Takashi Nose, Akinori Ito
KEYWORDS:
Visual-Speech Synthesis, Talking Head, Hidden Markov Models (HMMs), Deep Neural Networks (DNNs), Facial Expression Parameter
JOURNAL NAME:
Journal of Computer and Communications, Vol.5 No.10, August 23, 2017
ABSTRACT: This paper proposes a technique for synthesizing a pixel-based photo-realistic talking face animation using two-step synthesis with HMMs and DNNs. We introduce facial expression parameters as an intermediate representation that corresponds well to both the input contexts and the output pixel data of face images. The sequences of the facial expression parameters are modeled using context-dependent HMMs with static and dynamic features. The mapping from the expression parameters to the target pixel images is trained using DNNs. We examine the amount of training data required for the HMMs and DNNs, and compare the performance of the proposed technique with the conventional PCA-based technique through objective and subjective evaluation experiments.
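To illustrate what "static and dynamic features" can mean for a parameter sequence, the following is a minimal sketch that appends first-order delta features to a sequence of facial expression parameters. The abstract does not state how the dynamic features are computed, so the function name `append_delta`, the regression window (-0.5, 0, 0.5), and the edge padding at the sequence boundaries are assumptions chosen for illustration, not the authors' configuration.

```python
import numpy as np

def append_delta(static, window=(-0.5, 0.0, 0.5)):
    """Append first-order dynamic (delta) features to a static sequence.

    static: array of shape (T, D), one D-dimensional parameter vector per frame.
    window: assumed three-tap regression window applied to frames t-1, t, t+1.
    Returns an array of shape (T, 2 * D): [static, delta].
    """
    static = np.asarray(static, dtype=float)
    # Repeat the first and last frames so every frame has both neighbours
    # (an assumed boundary treatment).
    padded = np.pad(static, ((1, 1), (0, 0)), mode="edge")
    delta = (window[0] * padded[:-2]
             + window[1] * padded[1:-1]
             + window[2] * padded[2:])
    return np.hstack([static, delta])

if __name__ == "__main__":
    # Hypothetical one-dimensional parameter trajectory over four frames.
    params = np.array([[0.0], [1.0], [2.0], [3.0]])
    print(append_delta(params))
```

Each output row holds the static parameters followed by their deltas; observation vectors of this kind are what a context-dependent HMM would model in the first synthesis step.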