Designing Custom Face Detection & Recognition Software
Over the last few months, one of our teams has been designing a solution that employs biometric identification and works with real-time video camera streams. The system workflow had to comprise three main stages: detection of a face, near-instant recognition, and a follow-up action (e.g. granting access to authorized personnel, or sending an alert otherwise).
This is the research project we called Big Brother. Its all-seeing eye uses computer vision to analyze the frames of a stream and detect faces. See how it works in the video below.
The potential of this solution can be fully realized in your own custom product, so feel free to contact us with your ideas. Meanwhile, we'll reveal a few technical details about Big Brother's workflow and components.
Face detection and recognition workflow
The entire process is triggered by a camera application, which is installed on a device that's connected to a camera. This application was written in Golang as a local console app for Ubuntu and Raspbian. Upon launch, it must be configured using a JSON config file with Local Camera ID and Camera Reader type.
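To illustrate the configuration step, here is a minimal sketch of how such a JSON file could be loaded in Go. The key names (localCameraId, cameraReaderType) and the example values are assumptions made for illustration; the article only states that the config carries a Local Camera ID and a Camera Reader type.

```go
package main

import (
	"bytes"
	"encoding/json"
	"errors"
	"fmt"
)

// CameraConfig mirrors the two settings named in the article.
// The JSON key names are hypothetical.
type CameraConfig struct {
	LocalCameraID    string `json:"localCameraId"`
	CameraReaderType string `json:"cameraReaderType"`
}

// ParseCameraConfig decodes the config, rejects unknown keys,
// and checks that both settings are present.
func ParseCameraConfig(data []byte) (CameraConfig, error) {
	var cfg CameraConfig
	dec := json.NewDecoder(bytes.NewReader(data))
	dec.DisallowUnknownFields()
	if err := dec.Decode(&cfg); err != nil {
		return cfg, fmt.Errorf("decode camera config: %w", err)
	}
	if cfg.LocalCameraID == "" {
		return cfg, errors.New("localCameraId must be set")
	}
	if cfg.CameraReaderType == "" {
		return cfg, errors.New("cameraReaderType must be set")
	}
	return cfg, nil
}
```

Failing fast on a missing or misspelled key is a design choice here: a console app that starts with a half-empty config is harder to diagnose than one that refuses to start.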
The camera app detects faces in the stream using a deep neural network and computer vision. During the research, we found that the two optimal ways to implement detection were Caffe face tracking and TensorFlow object detection models. Both of these were included in the OpenCV library and worked well when we tested them.
When a face is captured, the image is cropped and sent to the back end via an HTTP form-data request. The back end API saves the image to the local file system and adds a record with a personID to the Detection Log.
The back end has a background worker that finds records with "classified = false" and uses Dlib to calculate the 128-dimensional descriptor vector of face features. Whenever a vector is calculated, it is compared with the existing entries (multiple reference face images) by calculating Euclidean distance to each feature vector of each Person in the database, finding a match, if any.
In the code, each point index is predefined by Dlib and corresponds to a specific part of the face.
If the Euclidean distance to a known person is less than 0.6, the worker writes that person's personID to the Detection Log record and marks it as classified. Otherwise, it creates a new Person of unknown type and writes the new personID to the log.
Images of an unidentified person can be sent to the responsible manager as notifications; in our case, we implemented chatbots in messengers. The manager can then grant access remotely or take other action.
Note: our case showed that the simplest alert chatbots required 2-5 days for implementation. We made two examples with Microsoft Bot Framework and Python-based Errbot.
Afterwards, these records can be managed via Admin Panel, which stores photos with IDs in the database. A database of employee pictures can be prepared and entered into the system beforehand.
Note: the database used in our case includes nearly 200 entries. Everything works in real time, and recognition happens instantly. But what's the potential of this solution if we need to scale? What if we need dozens of cameras? What if databases must comprise over 10,000 entries? Naturally, this will affect the speed of recognition on the back end. This is where our team came up with a solution – parallelization. We can set up a load balancer and build several Web workers for simultaneous work. Each of them can take a chunk of the entire database to search for matches and provide faster results.
To sum things up, here is the structure of Big Brother and the technologies we applied.
Note: the back end utilizes Golang and MongoDB Collections to store employee data. All API requests are based on RESTful API. This system can be tested on regular workstations.
Revealing the potential of Big Brother
This solution can store and manage large amounts of data. With this data and well-defined business needs at hand, Data Science models could be created and trained to deliver business insights, for example to analyze customer behavior and predict demand in retail. We already have experience applying Data Science to demand forecasting, which you can check in our ERP development case study.
Note: Big Brother is not limited to face recognition. We can develop solutions that detect and identify other objects. This is a semantic segmentation task in Computer Vision: depending on the segmentation classes (object type, amount, accuracy, etc.), we can create and train a model that recognizes different objects.
Anti-spoofing measures to boost security of face recognition
Face recognition can be enhanced with auxiliary security measures. After all, a facial image is the easiest biometric identifier to obtain, far easier than a fingerprint or a retina scan. This makes additional anti-spoofing measures vital, so that no one can gain access simply by showing a picture of a face.
Here we can apply special algorithms based on a combination of data science, computer vision and deep learning:
1. Detection of eye movement: for example, a person blinks 15-30 times per minute. Thus we can check face liveness.
2. Texture analysis: we can extract hand-crafted features that capture the difference between a real face and a fake one, which typically shows less fine detail along with noise and other image artifacts. With these features at hand, we can apply support-vector machines for binary classification.
3. Active camera flash: it helps distinguish a real face, with its complex geometry, from an image on a flat surface.
4. Challenge-response protocol: give the user instructions (e.g. wink, smile, say something, or turn their head) and, again, check whether the face is real.
5. These methods can be combined. For example, we can use both challenge-response and texture analysis to counteract both photo and video spoofing.
The Big Brother project allowed us to work out the algorithms and mechanisms that will enable faster development of custom solutions for face-based authentication. If you have any questions, feel free to contact us.
May 02, 2019