Microsoft Kinect is an amazing device with state-of-the-art body-tracking capabilities. The latest SDK version, however, adds a touch of great new magic: face tracking.

Utilizing the infrared and color streams, the Kinect sensor can accurately track thousands of facial points. Microsoft has actually implemented two APIs for tracking a face: Face Basics and Face HD. The first one only provides limited capabilities in the 2D space. Face HD, though, includes a ton of hidden goodies for tracking a face in the 3D space.

Vitruvius is featured on the official Microsoft Kinect website and on Channel9.

After reading this article, you'll understand how face tracking works, how to access the face points in the 3D space, and how to display them in the 2D space.

Introducing the Face class

Even though the native Kinect API is powerful, it’s quite messy. Microsoft only exposes a huge array of face vertices, along with a big enumeration. So, while building Vitruvius, we re-imagined the whole face tracking experience, from a purely developer perspective.

Please meet the all-new Face class. The Face class contains all of the information you'd ever wish to know. Using the Face class, accessing properties like the nose, eyes, jaw, forehead, cheeks, or chin is now a matter of one line of code.

Vitruvius extends the native APIs and exposes a single, unified, powerful interface.

Let me show you how.

Accessing an HD Face object using C#

Just like every Kinect stream, HD Face has its own frame type. To properly use a Face frame, you need to include the following namespace:

using LightBuzz.Vitruvius;

After that, you need to subscribe to the FrameArrived event, just like you'd do for the Color, Depth, Infrared, or Body streams. If you are using Unity, you do not need to subscribe to the event; simply check the frame readers in your Update method.
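
For reference, here's a minimal Unity-style polling sketch. It assumes a _faceReader field initialized exactly as in the snippet below; AcquireLatestFrame() returns null when no new frame is available, so the null check is required.

void Update()
{
    if (_faceReader != null)
    {
        // Poll the reader instead of handling FrameArrived events.
        using (var frame = _faceReader.AcquireLatestFrame())
        {
            if (frame != null && frame.IsFaceTracked)
            {
                Face face = frame.Face(); // Vitruvius extension method
            }
        }
    }
}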

Once you've done these trivial tasks, you can simply call the Face() extension method:

// Private members

private KinectSensor _sensor = null;
private BodyFrameSource _bodySource = null;
private BodyFrameReader _bodyReader = null;
private HighDefinitionFaceFrameSource _faceSource = null;
private HighDefinitionFaceFrameReader _faceReader = null;

// Initialization

_sensor = KinectSensor.GetDefault();

if (_sensor != null)
{
    _bodySource = _sensor.BodyFrameSource;
    _bodyReader = _bodySource.OpenReader();
    _bodyReader.FrameArrived += BodyReader_FrameArrived;

    _faceSource = new HighDefinitionFaceFrameSource(_sensor);
    _faceReader = _faceSource.OpenReader();
    _faceReader.FrameArrived += FaceReader_FrameArrived;

    _sensor.Open();
}

// Event handlers

private void BodyReader_FrameArrived(object sender,
BodyFrameArrivedEventArgs args)
{
    using (var frame = args.FrameReference.AcquireFrame())
    {
        if (frame != null)
        {
            Body body = frame.Bodies().Closest();

            if (!_faceSource.IsTrackingIdValid)
            {
                if (body != null)
                {
                    _faceSource.TrackingId = body.TrackingId;
                }
            }
        }
    }
}

private void FaceReader_FrameArrived(object sender,
HighDefinitionFaceFrameArrivedEventArgs args)
{
    using (var frame = args.FrameReference.AcquireFrame())
    {
        if (frame != null && frame.IsFaceTracked)
        {
            Face face = frame.Face();
        }
    }
}
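
One housekeeping detail the snippets above omit: when your app shuts down, it's good practice to dispose the readers and close the sensor. Here's a minimal sketch; the handler name Window_Closing is just an example hook, so wire it to whatever closing event your UI framework provides.

// Cleanup: dispose the readers and close the sensor when the app exits.
private void Window_Closing(object sender, System.ComponentModel.CancelEventArgs e)
{
    if (_bodyReader != null) _bodyReader.Dispose();
    if (_faceReader != null) _faceReader.Dispose();
    if (_sensor != null) _sensor.Close();
}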

Accessing the HD Face properties

Let's now get to the fun part. Every facial characteristic is a property of the Face class, and every facial point is expressed as a CameraSpacePoint (X, Y, and Z values, measured in meters).

var eyeLeft = face.EyeLeft;
var eyeRight = face.EyeRight;
var cheekLeft = face.CheekLeft;
var cheekRight = face.CheekRight;
var nose = face.Nose;
var mouth = face.Mouth;
var chin = face.Chin;
var forehead = face.Forehead;

Insanely easy, right? Here's the result:

[Image: Vitruvius Kinect HD Face points]
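
Since every point is a CameraSpacePoint measured in meters, you can do real-world 3D math on these properties directly. As a quick sketch, here's how you could measure the distance between the eyes, using the same Length() extension that appears in the comments below:

// CameraSpacePoints are expressed in meters, so Length() returns
// a real-world distance between the two points.
var eyeDistance = face.EyeLeft.Length(face.EyeRight);

System.Diagnostics.Debug.WriteLine("Distance between the eyes: " + eyeDistance + " m");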

Even if you need to access the entire collection of vertices (more than 1000 points), Vitruvius has got you covered:

var vertices = face.Vertices;

Yeah, the result is really creepy. Don't try it at home alone!

[Image: Vitruvius Kinect HD Face vertices]

Accessing the HD Face methods

What if you need to know more about the points that form each facial feature? For example, what if you need to get the outline of the eyes or the mouth? For such purposes, the Face class includes the following methods:

var eyeLeftOutline = face.EyeLeftPoints();
var eyeRightOutline = face.EyeRightPoints();

This is the result of drawing the contour of each eye, after calling the EyeLeftPoints() and EyeRightPoints() methods:

[Image: Vitruvius Kinect HD Face eye outlines]

Converting from 3D to 2D coordinates

Accessing the HD Face features as a list of CameraSpacePoints gives you all the information about the 3D coordinates. To display those points in the 2D screen space, though, you’ll need to convert the 3D coordinates into 2D coordinates.

Kinect uses the CoordinateMapper class to convert between coordinates from different spaces. The Color space is an array of 1920×1080 pixels. The Depth & Infrared space is an array of 512×424 pixels. Vitruvius simplifies the coordinate mapping process with the handy ToPoint() method.

Converting from the 3D space to the Color space:

var nosePoint = face.Nose.ToPoint(Visualization.Color);

Converting from the 3D space to the Depth or Infrared space:

var nosePoint = face.Nose.ToPoint(Visualization.Depth);

Using the coordinate mapping process, you can now display the points on screen using Unity or XAML.
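
As a rough sketch, here's how you could draw the left-eye outline on a XAML Canvas. It assumes a Canvas named canvas in your page; for simplicity it creates new ellipses on every call, whereas a real app would create them once and only update their positions (as Duane does in the comments below).

// Rough sketch: map each 3D point of the left-eye outline to
// Color-space pixels and draw it as a small ellipse.
// Assumes <Canvas x:Name="canvas" /> in the XAML markup.
foreach (CameraSpacePoint vertex in face.EyeLeftPoints())
{
    PointF point = vertex.ToPoint(Visualization.Color);

    var ellipse = new Ellipse
    {
        Width = 2.0,
        Height = 2.0,
        Fill = new SolidColorBrush(Colors.Red)
    };

    Canvas.SetLeft(ellipse, point.X - ellipse.Width / 2.0);
    Canvas.SetTop(ellipse, point.Y - ellipse.Height / 2.0);

    canvas.Children.Add(ellipse);
}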

Supported platforms

Vitruvius HD Face supports the following platforms and frameworks:

  • Unity3D
  • WPF / .NET 4.5+
  • Windows Store

Frequently Asked Questions

Finally, let me shed some light on a few topics almost every Kinect developer needs to know about.

1) What is the optimal distance from the sensor?

The sensor can accurately track a face between 40 centimeters and 2 meters. For best results, the optimal distance is 60-90cm.

2) What is the optimal rotation angle?

Kinect face tracking works best when you are facing the sensor directly (en face). However, the tracking algorithm remains pretty decent even if you rotate your head up to about 50 degrees to the left or right. Face tracking won't work if your head is rotated, say, 90 degrees to one side.

3) What about the lighting?

As mentioned above, Kinect face tracking relies strongly on the Color stream. As a result, the room should have a decent amount of lighting. Also, avoid pointing laser beams directly at the sensor.

4) Can I find the documentation online?

Definitely. You can check the documentation online.

So, this is it! How are you planning to use the HD Face capabilities in your Kinect apps? Let me know in the comments below.

‘Till the next time, keep Kinecting!

Get Vitruvius

As you can see, Vitruvius helps innovative companies create Kinect apps fast. Vitruvius simplifies Kinect development, so you can focus on what's really important: your app, your research, and your customers. Why not give it a try?

Author: Vangos Pterneas

Vangos Pterneas is a Microsoft Most Valuable Professional in the Kinect technology. He helps companies from all over the world grow their revenue by creating profitable software products. Vangos is the owner of LightBuzz Software agency and author of The Dark Art Of Freelancing.


Comments (10)

  • Hannnan says:

    Excuse me, Mr. Pterneas,

    As I told you, I am trying to write a simple program that identifies whether an arm is up or down, a hand is near the mouth or not, the head is up or down, and whether the fingers hold a spoon or food.

    I started with "hand near to mouth or not" (I have used HD Face).

    Can you check the code and tell me if my idea is right?

    using WindowsPreview.Kinect;
    using Microsoft.Kinect.Face;
    using LightBuzz.Vitruvius;

    // The Blank Page item template is documented at http://go.microsoft.com/fwlink/?LinkId=234238

    namespace Kinect2FaceHD_WinRT
    {
        /// <summary>
        /// An empty page that can be used on its own or navigated to within a Frame.
        /// </summary>
        public sealed partial class MainPage : Page
        {
            private KinectSensor _sensor = null;
            private BodyFrameSource _bodySource = null;
            private BodyFrameReader _bodyReader = null;
            private HighDefinitionFaceFrameSource _faceSource = null;
            private HighDefinitionFaceFrameReader _faceReader = null;

            public MainPage()
            {
                InitializeComponent();

                _sensor = KinectSensor.GetDefault();

                if (_sensor != null)
                {
                    _bodySource = _sensor.BodyFrameSource;
                    _bodyReader = _bodySource.OpenReader();
                    _bodyReader.FrameArrived += BodyReader_FrameArrived;

                    _faceSource = new HighDefinitionFaceFrameSource(_sensor);
                    _faceReader = _faceSource.OpenReader();
                    _faceReader.FrameArrived += FaceReader_FrameArrived;

                    _sensor.Open();
                }
            }

            private void BodyReader_FrameArrived(object sender, BodyFrameArrivedEventArgs e)
            {
                using (var frame = e.FrameReference.AcquireFrame())
                {
                    if (frame != null)
                    {
                        Body body = frame.Bodies().Closest();

                        if (!_faceSource.IsTrackingIdValid)
                        {
                            if (body != null)
                            {
                                _faceSource.TrackingId = body.TrackingId;
                            }
                        }
                    }
                }
            }

            private void FaceReader_FrameArrived(object sender, HighDefinitionFaceFrameArrivedEventArgs e)
            {
                using (var frame = e.FrameReference.AcquireFrame())
                {
                    if (frame != null && frame.IsFaceTracked)
                    {
                        Face face = frame.Face();
                        var mouth = face.Mouth;
                        var nose = face.Nose;
                        var neck = face.Neck;
                        var hand = body.Joints[JointType.HandRight].Position;
                        var distance = mouth.Length(hand);

                        if (distance < 0.1)
                        {
                            // display the result on screen (how??) the hand is close to the mouth
                        }
                    }
                }
            }

    One last thing: how can I display the result on screen? What is the simplest way to use a Canvas?

    • Hi Hannan. You need to declare the body object as a private variable in your class. To display the results, use a XAML Canvas or a TextBlock.


      using WindowsPreview.Kinect;
      using Microsoft.Kinect.Face;
      using LightBuzz.Vitruvius;

      namespace Kinect2FaceHD_WinRT
      {
          public sealed partial class MainPage : Page
          {
              private KinectSensor _sensor = null;
              private BodyFrameSource _bodySource = null;
              private BodyFrameReader _bodyReader = null;
              private HighDefinitionFaceFrameSource _faceSource = null;
              private HighDefinitionFaceFrameReader _faceReader = null;

              private Body body = null;

              public MainPage()
              {
                  InitializeComponent();

                  _sensor = KinectSensor.GetDefault();

                  if (_sensor != null)
                  {
                      _bodySource = _sensor.BodyFrameSource;
                      _bodyReader = _bodySource.OpenReader();
                      _bodyReader.FrameArrived += BodyReader_FrameArrived;

                      _faceSource = new HighDefinitionFaceFrameSource(_sensor);
                      _faceReader = _faceSource.OpenReader();
                      _faceReader.FrameArrived += FaceReader_FrameArrived;

                      _sensor.Open();
                  }
              }

              private void BodyReader_FrameArrived(object sender, BodyFrameArrivedEventArgs e)
              {
                  using (var frame = e.FrameReference.AcquireFrame())
                  {
                      if (frame != null)
                      {
                          body = frame.Bodies().Closest();

                          if (!_faceSource.IsTrackingIdValid)
                          {
                              if (body != null)
                              {
                                  _faceSource.TrackingId = body.TrackingId;
                              }
                          }
                      }
                  }
              }

              private void FaceReader_FrameArrived(object sender, HighDefinitionFaceFrameArrivedEventArgs e)
              {
                  using (var frame = e.FrameReference.AcquireFrame())
                  {
                      if (frame != null && frame.IsFaceTracked)
                      {
                          var face = frame.Face();
                          var mouth = face.Mouth;
                          var nose = face.Nose;
                          var neck = face.Neck;
                          var hand = body.Joints[JointType.HandRight].Position;
                          var distance = mouth.Length(hand);

                          if (distance < 0.1) // ---> Experiment with this value.
                          {
                              System.Diagnostics.Debug.WriteLine("Hand close to mouth");
                              // Display the results in a TextBlock, Canvas, or any other visual element, based on your UI.
                          }
                      }
                  }
              }
          }
      }

  • Ajay says:

    We are using Kinect v2, and we want an outline for the upper and lower lip. Right now, we have picked the outline points, added them to a list, and we are using this list. Is this the best way, or is there another way to achieve it?

  • JD says:

    I was just wondering how I would go about saving or exporting the face points to a modeling program such as 3ds Max.

  • hanan says:

    In this part, I am facing this error:

    // Set the high definition face source.
    _faceSource = new HighDefinitionFaceFrameSource(_sensor);
    _faceReader = _faceSource.OpenReader();
    _faceReader.FrameArrived += FaceReader_FrameArrived;

    The error is: "The name 'FaceReader_FrameArrived' does not exist in the current context."

    • You need to add a method named FaceReader_FrameArrived to your class. For example:


      private void FaceReader_FrameArrived(object sender,
          HighDefinitionFaceFrameArrivedEventArgs args)
      {
          using (var frame = args.FrameReference.AcquireFrame())
          {
              if (frame != null && frame.IsFaceTracked)
              {
                  Face face = frame.Face();
              }
          }
      }

  • Duane Carey says:

    Hi Pterneas,
    I used the following to create the outline of the left eye, but all I get are 4 points: left-eye top center, bottom center, left side, and right side. Why don't I get the points as in your picture? Visual Studio only shows a count of 4 returning from the call "var eyeLeftOutline = face.EyeLeftPoints();".

    Here is the code I used:

    // Display eye outline points.
    var eyeLeftOutline = face.EyeLeftPoints();
    Ellipse ellipse;

    // Display all face points.
    if (_ellipses.Count == 0)
    {
        for (int index = 0; index < eyeLeftOutline.Count; index++)
        {
            ellipse = new Ellipse
            {
                Width = 1.5,
                Height = 1.5,
                Fill = new SolidColorBrush(Colors.Pink)
            };

            _ellipses.Add(ellipse);
            canvas.Children.Add(ellipse);
        }
    }

    for (int index = 0; index < eyeLeftOutline.Count; index++)
    {
        ellipse = _ellipses[index];

        CameraSpacePoint vertex = eyeLeftOutline[index];
        PointF point = vertex.ToPoint(Visualization.Infrared);

        Canvas.SetLeft(ellipse, point.X - ellipse.Width / 2.0);
        Canvas.SetTop(ellipse, point.Y - ellipse.Height / 2.0);
    }

    • Hello Duane. The Free version includes the Vertices property, as well as the primary (most common) face points. The Academic and Premium versions also include methods that allow you to get all of the available points of a particular face part. For example, the methods LeftEye() and LeftEyebrow() would give you all of the available points of the corresponding face parts.
