Mobile Animatronics Telepresence System

Current research projects

The Previous System

Inhabitor Station

In the inhabitor station, the user’s head pose (especially the orientation) must be tracked continuously to control the avatar head on the mobile avatar side. Currently, the user has to wear a helmet that carries optical trackers for acquiring the inhabitor’s head position and orientation, and a camera for capturing frontal face imagery, as shown in Fig. 1.

Figure 1: User in inhabitor station wearing a helmet with optical trackers and camera.

Having to wear a helmet or a hat is neither natural nor convenient for the user. In addition, the user’s 3D face model was generated with the third-party software FaceWorx by moving control points in photographs showing the frontal and profile views of the face [1, 2]. This procedure requires manual identification of distinctive facial features, so modeling a user’s face is very time-consuming.

One approach to solving these problems is to use vision-based methods (a single camera, multiple cameras, or a Kinect) to estimate the head pose and to obtain the 3D face model for projection on the avatar side.

Mobile Avatar

Compared with the previous avatar, the current avatar adopts rear projection instead of front projection, and the projector is rigidly fixed to the face-shaped projection surface.

Currently, the projected image is aligned with the face-shaped surface manually (by adjusting the size, position, and rotation of the projected image), which is not user-friendly. Some misalignment between the projected image and the face-shaped surface still remains, distorting the appearance of the avatar, as shown in Fig. 2(a) and (b). When the inhabitor speaks or changes expression, this misalignment becomes more noticeable. Moreover, due to inter-reflection and specular reflection, the appearance of the projected avatar face is not homogeneous; Fig. 2(c) shows some of these errors (especially the sparkling spot in the eye region).

Figure 2: Some imperfections of the projected avatar face: (a) and (b) misalignment; (c) inhomogeneous appearance due to inter-reflection or specular reflection.

Potential improvements on the avatar side include:
** Automatic alignment of projected face image and face-shaped surface.
** Compensation of the errors caused by inter-reflection and specular reflection to obtain a more realistic face projection, possibly using feedback from a camera.
** Automatic adjustment for the misalignment caused by expression variations of the inhabitor.
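
As an illustration of the first item, the manual adjustment of size, position, and rotation could in principle be replaced by a least-squares fit of a 2D similarity transform between matching landmarks. The sketch below is only a minimal example under stated assumptions: it presumes that a few landmark positions (for example eye corners) are already known both in the rendered face image and in projector coordinates on the face-shaped surface, and the function names fit_similarity and apply_similarity are hypothetical, not part of the existing system.

```python
import cmath
import math


def fit_similarity(src, dst):
    """Least-squares 2D similarity transform mapping src points onto dst.

    src, dst: lists of matching (x, y) tuples (assumed landmark pairs).
    Returns (scale, rotation in degrees, (tx, ty)).
    """
    if len(src) != len(dst) or len(src) < 2:
        raise ValueError("need at least two matching point pairs")
    # Treat each 2D point as a complex number: q ~ a * p + b,
    # where a encodes scale and rotation and b encodes translation.
    p = [complex(x, y) for x, y in src]
    q = [complex(x, y) for x, y in dst]
    p_mean = sum(p) / len(p)
    q_mean = sum(q) / len(q)
    num = sum((qi - q_mean) * (pi - p_mean).conjugate() for pi, qi in zip(p, q))
    den = sum(abs(pi - p_mean) ** 2 for pi in p)
    if den == 0:
        raise ValueError("source points must not all coincide")
    a = num / den
    b = q_mean - a * p_mean
    return abs(a), math.degrees(cmath.phase(a)), (b.real, b.imag)


def apply_similarity(scale, angle_deg, shift, point):
    """Apply the fitted transform to a single (x, y) point."""
    a = cmath.rect(scale, math.radians(angle_deg))
    z = a * complex(*point) + complex(*shift)
    return (z.real, z.imag)
```

A similarity transform covers exactly the three quantities that are tuned by hand today. Residual misalignment caused by the curved surface or by expression changes would need a richer model than this sketch provides.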

My Current Research Projects

Unencumbered Head Pose and Body Posture Estimation

The emergence of low-cost 3D sensors (e.g., Kinect) makes it possible to achieve an acceptable quality of 3D capture and pose estimation of a human head and body for many applications, without encumbering the user with sensors or markers. It would be useful to explore the use of such devices for real-time head pose estimation for local head control of the NTU prototype Physical-Virtual Avatar (PVA), without regard for the appearance (imagery) of the head. In addition, the same technology could be used for remote body control of the UNC RoboThespian RT-3. This will require understanding and exploring both the body capture and the RT-3 control. Finally, real-time head pose information, combined with camera imagery, could be used for face modeling and deformation.
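
To make the idea of marker-free head control concrete, the following minimal sketch derives yaw and pitch from three 3D facial landmarks and clamps them to a motion range. Everything specific here is an assumption made for illustration: the landmarks (the two eyes and the mouth) are presumed to come from some depth-sensor face tracker, the sensor frame is taken to have x to the right, y up, and z pointing away from the sensor, and the angle limits and function names are hypothetical rather than properties of the PVA or the RT-3.

```python
import math


def _sub(a, b):
    return tuple(x - y for x, y in zip(a, b))


def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def _normalize(v):
    n = math.sqrt(sum(x * x for x in v))
    if n == 0:
        raise ValueError("degenerate landmark configuration")
    return tuple(x / n for x in v)


def head_yaw_pitch(eye_left, eye_right, mouth):
    """Estimate head yaw and pitch (degrees) from three 3D landmarks.

    eye_left / eye_right are the eyes at smaller / larger x in the sensor
    frame. A face looking straight at the sensor yields (0, 0).
    """
    x_axis = _normalize(_sub(eye_right, eye_left))
    eye_mid = tuple((a + b) / 2 for a, b in zip(eye_left, eye_right))
    up = _sub(eye_mid, mouth)
    # Face normal pointing from the face toward the sensor (-z when frontal).
    normal = _normalize(_cross(up, x_axis))
    yaw = math.degrees(math.atan2(normal[0], -normal[2]))
    pitch = math.degrees(math.atan2(normal[1], math.hypot(normal[0], normal[2])))
    return yaw, pitch


def clamp_to_limits(yaw, pitch, yaw_limit=60.0, pitch_limit=30.0):
    """Clamp angles to an assumed (hypothetical) avatar head range."""
    yaw = max(-yaw_limit, min(yaw_limit, yaw))
    pitch = max(-pitch_limit, min(pitch_limit, pitch))
    return yaw, pitch
```

With this convention, a positive yaw means the face is turned toward +x in the sensor frame and a positive pitch means the face is tilted upward. Roll is ignored in this sketch, although it could be read from the direction of the eye-to-eye axis.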

Dynamic Face Modeling and Expression Deformation

While modern depth sensors are noteworthy in many ways, obtaining an accurate real-time dynamic 3D face model remains a challenging problem. In general, the quality of a single frame is not sufficient to generate reasonable 3D face models, and there is little or no temporal coherence (filtering or fusion) between frames. It would be useful to explore the use of the Kinect or other sensors to build up a parametric model of the human head, with evolving dynamic textures, that could be rendered onto the PVA using the head pose information. One important factor is temporal coherence. The geometry of the model (the parameters) should evolve in a way that simultaneously affords a stable base head model and yet allows for shape changes due to facial expressions. This might be accomplished by using repeated poses and depth information, accumulating and refining the model over time. A general model of the human head, perhaps a parametric model, may be employed as prior knowledge to simplify the problem.
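
One simple way to picture the trade-off between a stable base model and responsive expressions is a pair of exponential filters running at different rates. The sketch below is an assumption-laden illustration, not a proposed method: it presumes the head model is a flat list of numeric parameters fitted independently in every frame, and the class name TemporalShapeFilter and the two rate values are hypothetical.

```python
class TemporalShapeFilter:
    """Two-rate exponential filter over per-frame model parameters.

    A slowly adapting base captures the stable head shape, while a
    quickly adapting offset follows short-term changes such as expressions.
    """

    def __init__(self, base_rate=0.02, expr_rate=0.5):
        self.base_rate = base_rate  # small: base changes slowly
        self.expr_rate = expr_rate  # large: offset reacts quickly
        self.base = None
        self.expr = None

    def update(self, observed):
        """Fuse one noisy per-frame parameter vector; return the estimate."""
        if self.base is None:
            # The first frame initialises the base; the offset starts at zero.
            self.base = list(observed)
            self.expr = [0.0] * len(observed)
        else:
            self.base = [b + self.base_rate * (o - b)
                         for b, o in zip(self.base, observed)]
        # Whatever the base does not explain is treated as expression.
        residual = [o - b for o, b in zip(observed, self.base)]
        self.expr = [e + self.expr_rate * (r - e)
                     for e, r in zip(self.expr, residual)]
        return [b + e for b, e in zip(self.base, self.expr)]
```

A constant input leaves the estimate unchanged, whereas a sudden change is followed mostly by the fast offset and only gradually absorbed into the base. Accumulating depth data over repeated poses, as suggested above, would require a more principled fusion scheme than this per-parameter filter.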

Direct Face Mapping to Avatar Head

The current approach used to dynamically map a real human face to the face of the PVA depends on a full 3D model of the real human head, a full 3D model of the PVA head, and very precise (in space and time) dynamic head tracking via a head-worn marker system. One of the dominant goals of the project is to unencumber the user while simultaneously improving the quality (clarity, responsiveness, and stability) of the face imagery. Several methods for more direct mapping of the face should be explored in an attempt to avoid the need for a human head model and a head-worn device. For example, it might be possible to use a Kinect or some other means to dynamically synthesize an image like the one a head-worn camera would capture. If a relatively stable dynamic face image could be synthesized, then a separate process could attempt to continually fit that 2D image onto a static “image” corresponding to the PVA facial feature locations in the projector image. (In the case where the projector is affixed to the PVA head, the locations of the PVA facial features are fixed in the image.)

Photometric Issues

Errors in appearance (color and/or luminance) arise as a result of light being projected onto an opaque surface, or through a translucent head material. The sources of error can include inter-reflection, specular highlights, and interior (to the head material) diffusion and scattering of light. It should be possible to model and calibrate for some of these error sources, potentially using a camera, and then to add a post-rendering correction that adapts the luminance and color throughout.
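
A very simplified form of such a post-rendering correction is a per-pixel linear compensation calibrated from two camera captures. The sketch below rests on strong assumptions that are not part of the project description: the camera and projector are pixel-registered, responses are linearized (no gamma), each pixel is treated independently so inter-reflection is only captured as a static offset, and the function names are hypothetical.

```python
def compensate(desired, black, white, eps=1e-6):
    """Compute projector inputs in [0, 1] that aim to produce `desired`.

    desired: target luminance per pixel as seen by the camera.
    black:   camera capture while the projector shows input 0.
    white:   camera capture while the projector shows input 1.
    Assumed per-pixel model: observed = black + (white - black) * input.
    """
    out = []
    for d, c0, c1 in zip(desired, black, white):
        span = c1 - c0
        if span < eps:
            # The pixel does not respond to the projector; nothing to correct.
            out.append(0.0)
            continue
        value = (d - c0) / span
        # Targets outside the reachable range are clipped.
        out.append(min(1.0, max(0.0, value)))
    return out


def predict(inputs, black, white):
    """Predict the camera observation for given inputs under the same model."""
    return [c0 + (c1 - c0) * v for v, c0, c1 in zip(inputs, black, white)]
```

Pixels whose target lies below the black level or above the white level cannot be corrected and are clipped, which is one reason a real system would probably also need to adapt the overall target brightness.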


[1] Peter Lincoln, Greg Welch, Andrew Nashel, Andrei State, Adrian Ilie, and Henry Fuchs. Animatronic shader lamps avatars. Virtual Reality, 15(2-3):225–238, 2011.
[2] Peter Lincoln, Greg Welch, and Henry Fuchs. Continual surface-based multi-projector blending for moving objects. In Proc. of IEEE Virtual Reality, pages 115–118, 2011.
