MSc thesis project

Individual project - spring 2021

Per-actor Based Optimization for Semantic-preserving Facial Rig Generation Using Sample Data

Summary

This MSc thesis project concluded my studies at Linköping University. I conducted the project at the VFX company Goodbye Kansas in Stockholm, where I was part of the pipeline department as an R&D intern.

Motivated by the need to combine recent research and technology on automatic facial rig generation with the artistic requirements and the use of digital humans in film production pipelines, this thesis project presents a scalable blendshape optimization framework that fits within a VFX pipeline, remains stable across varied kinds of usage, and makes the workflow of creating facial rigs more efficient.

The framework generates per-actor facial rigs adapted towards sample data while ensuring that the semantics of the input rig are preserved in the process. Built around a reusable generic model, gradient-based deformations, user-driven regularization terms, rigid alignment, and the option to split blendshapes into symmetrical halves, the proposed framework provides a stable algorithm that can be applied to any target blendshape. It also serves as a basis for investigating and evaluating parameters and solutions related to automatic facial rig generation and optimization.

Technical Overview

The aim of this thesis project was to investigate how to automatically generate a personalized facial rig from a generic, reusable facial rig together with sample data, such that the result is adapted towards the target sample data while the semantics of the generic rig remain. The project further investigated how such a workflow influences the visual result, the correctness, and the processing time.

The automatic framework outputs a facial rig of approximately 400 different blendshapes in roughly an hour for a high-resolution facial mesh of over 60,000 vertices, when run with 39 base shapes for optimization and the remaining shapes built from the established hierarchy of the generic facial rig.
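For reference, a delta blendshape rig evaluates as the neutral mesh plus a weighted sum of per-shape vertex offsets. A minimal sketch of that evaluation (the array shapes and names are illustrative, not taken from the thesis code):

    import numpy as np

    def evaluate_rig(neutral, deltas, weights):
        """neutral: (V, 3), deltas: (n_shapes, V, 3), weights: (n_shapes,)."""
        # Weighted sum of the per-shape offsets added on top of the neutral mesh.
        return neutral + np.tensordot(weights, deltas, axes=1)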

Core Technologies

Implementation

The implemented framework consists of two stages, an estimation stage and a refinement stage, as shown in the conceptual overview below.

A conceptual overview of the implemented framework.
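To make the two stages concrete, below is a much-simplified, vertex-space sketch of an estimate/refine alternation in the spirit of the example-based facial rigging formulation referenced further down: weights are estimated per sample with fixed blendshapes, and the blendshape deltas are then refined with a regularizer pulling them towards a prior such as deformation-transferred shapes. The actual thesis formulation works on deformation gradients and uses user-driven regularization terms; all names and parameters here are illustrative.

    import numpy as np
    from scipy.optimize import nnls

    def estimate_weights(deltas, samples, neutral):
        """Per-sample non-negative weight estimation with fixed blendshape deltas."""
        B = deltas.reshape(len(deltas), -1).T                    # (3V, n_shapes)
        return np.array([nnls(B, (s - neutral).ravel())[0] for s in samples])

    def refine_deltas(weights, samples, neutral, prior, alpha=1.0):
        """Ridge-regularized least squares: fit the samples while pulling the
        deltas towards the prior shapes."""
        D = np.stack([(s - neutral).ravel() for s in samples])   # (n_samples, 3V)
        P = prior.reshape(len(prior), -1)                        # (n_shapes, 3V)
        A = weights.T @ weights + alpha * np.eye(weights.shape[1])
        rhs = weights.T @ D + alpha * P
        return np.linalg.solve(A, rhs).reshape(prior.shape)

    def fit_rig(samples, neutral, prior, iters=3):
        """Alternate weight estimation and delta refinement for a few iterations."""
        deltas = prior.copy()
        for _ in range(iters):
            weights = estimate_weights(deltas, samples, neutral)
            deltas = refine_deltas(weights, samples, neutral, prior)
        return deltas, weights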

The implementation was written in Python with dependencies on Numpy, Scipy, Quadprog and Trimesh (used only for I/O). Other technologies used were Maya, Matplotlib (for visualizations) and Nix (for packaging the framework).
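One natural place for a quadratic-programming solver such as Quadprog is box-constrained weight estimation, min ||Bw - d||^2 subject to 0 <= w <= 1. The sketch below only illustrates that kind of use and is not the exact formulation from the thesis:

    import numpy as np
    import quadprog

    def solve_weights(B, d, ridge=1e-6):
        """B: (3V, n_shapes) stacked blendshape deltas, d: (3V,) target offsets."""
        n = B.shape[1]
        G = B.T @ B + ridge * np.eye(n)          # small ridge keeps G positive definite
        a = B.T @ d
        # quadprog solves min 1/2 w^T G w - a^T w  s.t.  C^T w >= b
        C = np.hstack([np.eye(n), -np.eye(n)])   # w >= 0 and -w >= -1
        b = np.concatenate([np.zeros(n), -np.ones(n)])
        return quadprog.solve_qp(G, a, C, b)[0]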


Deformation transfer was implemented as a standalone package for further use within the production pipeline. The implementation follows the descriptions by Sumner and Popović [1] and Sumner [2] and is limited to meshes of the same topology.
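As an illustration of the deformation-gradient formulation for meshes with shared topology, the sketch below computes per-triangle source deformation gradients and solves a sparse least-squares system for the target vertices. It is a compact, unoptimized sketch loosely following Sumner and Popović [1], not the standalone package from the thesis:

    import numpy as np
    from scipy.sparse import lil_matrix
    from scipy.sparse.linalg import lsqr

    def local_frame(verts, tri):
        """3x3 frame [v2-v1, v3-v1, v4-v1], with v4 offset along the scaled normal."""
        v1, v2, v3 = verts[tri[0]], verts[tri[1]], verts[tri[2]]
        e1, e2 = v2 - v1, v3 - v1
        n = np.cross(e1, e2)
        return np.column_stack([e1, e2, n / np.sqrt(np.linalg.norm(n))])

    def transfer(src_rest, src_def, tgt_rest, faces):
        """Solve for target vertices whose per-triangle deformation gradients
        match those of the source pair (identical topology assumed)."""
        n_v, n_f = len(tgt_rest), len(faces)
        # Unknowns: the target vertices plus one auxiliary fourth vertex per triangle.
        A = lil_matrix((9 * n_f + 3, 3 * (n_v + n_f)))
        b = np.zeros(9 * n_f + 3)
        for j, tri in enumerate(faces):
            S = local_frame(src_def, tri) @ np.linalg.inv(local_frame(src_rest, tri))
            Winv = np.linalg.inv(local_frame(tgt_rest, tri))
            verts = [tri[0], tri[1], tri[2], n_v + j]     # v1, v2, v3 and auxiliary v4
            for r in range(3):
                for c in range(3):
                    row = 9 * j + 3 * r + c
                    # Entry (r, c) of [x2-x1, x3-x1, x4-x1] @ Winv == S
                    for k in range(3):
                        A[row, 3 * verts[k + 1] + r] += Winv[k, c]
                        A[row, 3 * verts[0] + r] -= Winv[k, c]
                    b[row] = S[r, c]
        # Pin the first vertex to remove the global-translation nullspace.
        for r in range(3):
            A[9 * n_f + r, r] = 1.0
            b[9 * n_f + r] = tgt_rest[0, r]
        x = lsqr(A.tocsr(), b)[0]
        return x[:3 * n_v].reshape(n_v, 3)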

The overall optimization formulation builds upon example-based facial rigging (EBFR) as proposed by Li, Weise and Pauly [3] and was extended into a fully automatic framework using findings from Ma et al. [4] for blendshape weight estimation, with head motion estimation and alignment performed in a separate step similar to the method by Seol, Ma and Lewis [5]. The optional iterative update of the refinement stage makes the solution scalable and offers research prospects for investigating the influence of optimization parameters and for adapting the optimization loop to different target blendshapes.
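The separate head-motion estimation and alignment step can be pictured as a rigid Procrustes fit between corresponding points, solved in closed form with the standard Kabsch/SVD construction. This is a generic sketch, not the exact procedure of Seol, Ma and Lewis [5]:

    import numpy as np

    def rigid_align(P, Q):
        """Rotation R and translation t minimizing sum ||R @ P_i + t - Q_i||^2."""
        cp, cq = P.mean(axis=0), Q.mean(axis=0)
        H = (P - cp).T @ (Q - cq)                # 3x3 cross-covariance
        U, _, Vt = np.linalg.svd(H)
        d = np.sign(np.linalg.det(Vt.T @ U.T))   # guard against reflections
        R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
        t = cq - R @ cp
        return R, t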

Results

Below is a small showcase of the results of applying the automatic framework to a target facial rig. For more images and explanations, see the full thesis report.

The result from generic input to optimized expression.

The affected area for a number of expressions in the optimized facial rig compared to the result of pure deformation transfer can be seen in the visualization below.

The reproduction of a scanned expression by the framework can be seen below. It shows that the framework performs well when using techniques such as shape matching, splitting of blendshapes into symmetrical halves, and stationary constraints.

The result of the reproduction of a complex scanned expression by the framework.
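Splitting a blendshape into symmetrical halves can be sketched as weighting the delta with a smooth falloff across the sagittal plane so that the two halves sum back to the original shape. The falloff width and the left/right convention below are illustrative, not the exact choices made in the thesis:

    import numpy as np

    def split_symmetric(delta, rest, falloff=0.01):
        """Return two half-deltas whose sum reproduces the input delta.
        delta, rest: (V, 3); the rest x-coordinate measures distance from the midline."""
        x = rest[:, 0]
        w = np.clip(0.5 + x / (2.0 * falloff), 0.0, 1.0)   # 1 on one side, 0 on the other
        return delta * w[:, None], delta * (1.0 - w)[:, None]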

Important literature

To learn more about the project, the full thesis can be found at DiVA.
