TL;DR: Given a single reference portrait image, PERSE generates an
animatable, personalized 3D avatar with disentangled and editable
control over various facial attributes.
Overview
PERSE consists of two main components and an application.
First, we generate a 2D monocular synthetic dataset from a single
image using our video diffusion model, portrait-CHAMP. The synthetic
dataset preserves the identity of the input image while varying the
edited attributes; each edit forms a 2D monocular video with the same
head pose and facial expression.
Next, we train a 3D Gaussian Splatting (3DGS) avatar on the 2D
synthetic dataset. The avatar model is conditioned on a latent space,
allowing disentangled control over attributes.
Finally, the PERSE avatar model enables attribute transfer from
in-the-wild 2D images, making it possible to interpolate between the
pretrained latent space and the in-the-wild attributes.
Results
Synthetic Dataset
We generate a synthetic dataset of almost a thousand attribute-edited
videos, using our method for attribute-edited portrait image
generation and portrait-CHAMP for animated portrait video generation.
Unseen Pose Rendering
We show renderings of the PERSE avatar model in unseen poses.
Interpolation Between Two Latent Codes
Figure panels (each interpolating from sample A to sample B): Beard,
Hair, Hat, Eyebrows, Mouth.
Interpolating between two latent codes of the PERSE avatar model
yields realistic, high-quality avatars whose attributes change
smoothly from one endpoint to the other.
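As a rough illustration of what interpolating between two latent codes means, the sketch below blends two vectors linearly and produces an evenly spaced path between them. It assumes the codes are plain fixed-length vectors and that the blend is linear; the page does not state the interpolation scheme PERSE uses, and the names `lerp_latents`, `interpolation_path`, `beard_a`, and `beard_b` are hypothetical.

```python
from typing import List, Sequence


def lerp_latents(z_a: Sequence[float], z_b: Sequence[float], t: float) -> List[float]:
    """Linearly blend two latent codes; t=0 returns z_a, t=1 returns z_b."""
    if len(z_a) != len(z_b):
        raise ValueError("latent codes must have the same dimension")
    return [(1.0 - t) * a + t * b for a, b in zip(z_a, z_b)]


def interpolation_path(
    z_a: Sequence[float], z_b: Sequence[float], steps: int
) -> List[List[float]]:
    """Return `steps` evenly spaced codes from z_a to z_b, endpoints included."""
    if steps < 2:
        raise ValueError("steps must be at least 2")
    return [lerp_latents(z_a, z_b, i / (steps - 1)) for i in range(steps)]


# Hypothetical 4-dimensional codes standing in for two beard styles.
# Real latent codes would come from the trained avatar model.
beard_a = [0.0, 1.0, -2.0, 4.0]
beard_b = [2.0, 1.0, 2.0, 0.0]

# Five codes from A to B; each would condition one rendered avatar.
path = interpolation_path(beard_a, beard_b, steps=5)
```

Each code in `path` would be passed to the avatar model in place of a single attribute code, so the rendered frames sweep from sample A to sample B.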
Facial Attribute Transfer from In-The-Wild Image
We present the results of transferring facial attributes from
in-the-wild images. As shown here, the transferred attributes are
well rigged to the avatar and can be rendered in novel poses.
Transferred Rendering in Unseen Pose
We present renderings in unseen poses of avatars with facial
attributes transferred from in-the-wild images.
Interpolation Between Pretrained Latent Code and Novel Latent Code
Figure panels (interpolation endpoints): In-The-Wild Image, Latent
Space Sample Image.
We demonstrate that interpolating between a transferred in-the-wild
attribute and a sample from the pretrained latent space also produces
natural, high-quality avatars.