Face Recognition: a constantly evolving technology

Face recognition refers to technology capable of identifying subjects in images or videos. It is a non-invasive biometric system, and the techniques behind it have changed enormously over the years.

During the 1990s, traditional methods relied on handcrafted features such as texture and edge descriptors. Gabor filters, Local Binary Patterns (LBP), Histogram of Oriented Gradients (HOG) and Scale-Invariant Feature Transform (SIFT) are some examples. These served as the basis for more complex representations, built by encoding and transforming the features with techniques such as Principal Component Analysis (PCA) or Linear Discriminant Analysis (LDA). Variations in illumination, pose, or expression had to be handled through the careful design of these descriptors and transformations.
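As a rough illustration of that classical pipeline, the sketch below extracts a HOG descriptor with scikit-image and compresses it with PCA from scikit-learn. The 128x128 crop size and the 64-dimensional subspace are illustrative assumptions, not values taken from any of the original studies.

```python
# Minimal sketch of a classical pipeline: handcrafted HOG features
# compressed with PCA. Parameters are illustrative choices only.
import numpy as np
from skimage.feature import hog
from sklearn.decomposition import PCA

def hog_descriptor(face_gray):
    """Extract a HOG descriptor from a grayscale, roughly aligned face crop (e.g. 128x128)."""
    return hog(
        face_gray,
        orientations=9,
        pixels_per_cell=(8, 8),
        cells_per_block=(2, 2),
        feature_vector=True,
    )

def compress_descriptors(faces, n_components=64):
    """faces: array of shape (n_faces, 128, 128). Returns compact feature vectors and the fitted PCA."""
    descriptors = np.array([hog_descriptor(f) for f in faces])
    # n_components is an illustrative dimensionality; it must not exceed the number of face images
    pca = PCA(n_components=n_components)
    return pca.fit_transform(descriptors), pca
```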

However, no single technique could handle every scenario. One of the best results of that era was reported in the study “Blessing of dimensionality: High-dimensional feature and its efficient compression for face verification”, which reached about 95% accuracy on the Labeled Faces in the Wild (LFW) database. This indicates that the existing methods were unable to extract a face representation invariant to real-world variations.

How does facial recognition work today?

In recent years, traditional methods have been replaced by approaches based on deep learning, built primarily on Convolutional Neural Networks (CNNs). The main advantage of deep learning methods is that they can “learn”, from large databases, the best features to represent the data, that is, to describe the faces.
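To make the idea concrete, the sketch below defines a deliberately small convolutional network that maps a face crop to a fixed-length feature vector (embedding). It is only illustrative: the layer sizes, the 112x112 input and the 128-dimensional output are assumptions for the sketch, not the architecture of DeepFace or any other published model, and a real system would train such a network on millions of face images.

```python
# Illustrative CNN that maps a 3x112x112 face crop to a 128-d embedding.
# Layer sizes are arbitrary choices for this sketch, not a published model.
import torch
import torch.nn as nn

class FaceEmbeddingNet(nn.Module):
    def __init__(self, embedding_dim=128):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 32, kernel_size=3, padding=1), nn.ReLU(),
            nn.MaxPool2d(2),                      # 112 -> 56
            nn.Conv2d(32, 64, kernel_size=3, padding=1), nn.ReLU(),
            nn.MaxPool2d(2),                      # 56 -> 28
            nn.Conv2d(64, 128, kernel_size=3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),              # global pooling -> 128 x 1 x 1
        )
        self.fc = nn.Linear(128, embedding_dim)

    def forward(self, x):
        x = self.features(x).flatten(1)
        emb = self.fc(x)
        # L2-normalize so faces can later be compared with cosine similarity
        return nn.functional.normalize(emb, dim=1)

# Example: one random "face" tensor -> one 128-d embedding
net = FaceEmbeddingNet()
embedding = net(torch.randn(1, 3, 112, 112))
print(embedding.shape)  # torch.Size([1, 128])
```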

An example of this is the DeepFace network, which in 2014 achieved state-of-the-art performance on the famous LFW database. By training a 9-layer model on 4 million face images, it approached human performance in an unconstrained scenario (DeepFace: 97.35% vs. humans: 97.53%). Inspired by this work, research shifted toward deep learning methods, which reached 99.8% accuracy on LFW within just three years.

Facial recognition systems are usually made up of the following stages:

  1. Face detection: A query image is fed into the system. A detector locates the face in the query image and returns the coordinates of its bounding box (a sketch of this step and the next appears after this list).
  2. Face alignment: Its goal is to scale and crop the image in the same way for all faces, using a set of reference points (landmarks).
  3. Face representation: The pixels of the face image are transformed into a compact and discriminative representation, that is, a feature vector. This representation can be obtained with classical methods or with deep learning models. Ideally, all face images of the same subject should map to similar feature vectors.
  4. Face matching: The face images of registered individuals make up a database called the gallery, where each face is stored as a feature vector. Most methods compute the similarity between the query image's feature vector and the gallery vectors, using the cosine distance or the L2 distance. The gallery face with the smallest distance indicates which individual the query face belongs to (see the matching sketch after this list).
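For steps 1 and 2, the sketch below uses OpenCV's bundled Haar cascade detector and a simple crop-and-resize as a stand-in for alignment. This is a simplification: production systems typically use CNN-based detectors and landmark-based alignment, and the 112x112 output size is just an assumed convention.

```python
# Sketch of face detection and crude alignment (steps 1 and 2) using
# OpenCV's bundled Haar cascade. Real systems usually rely on CNN-based
# detectors and landmark-based alignment instead.
import cv2

detector = cv2.CascadeClassifier(
    cv2.data.haarcascades + "haarcascade_frontalface_default.xml"
)

def detect_and_crop(image_bgr, size=(112, 112)):
    """Return a list of cropped, resized face images found in the input image."""
    gray = cv2.cvtColor(image_bgr, cv2.COLOR_BGR2GRAY)
    boxes = detector.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=5)
    faces = []
    for (x, y, w, h) in boxes:
        crop = image_bgr[y:y + h, x:x + w]
        faces.append(cv2.resize(crop, size))   # same scale and crop for every face
    return faces
```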
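For step 4, matching reduces to a nearest-neighbour search over the gallery's feature vectors. The sketch below assumes the embeddings are already L2-normalized (so cosine similarity is a plain dot product); the gallery structure and the threshold value are placeholders for illustration, not part of any particular library.

```python
# Sketch of face matching (step 4): compare a query embedding against the
# gallery with cosine similarity. Embeddings are assumed L2-normalized.
import numpy as np

def match(query_embedding, gallery, threshold=0.5):
    """gallery: dict mapping identity name -> L2-normalized feature vector.
    Returns the best-matching identity, or None if no gallery face is close enough."""
    names = list(gallery.keys())
    vectors = np.stack([gallery[n] for n in names])   # shape (n_subjects, d)
    similarities = vectors @ query_embedding          # cosine similarity per subject
    best = int(np.argmax(similarities))
    if similarities[best] < threshold:                # illustrative acceptance threshold
        return None
    return names[best]
```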