AI Specialist Course Case Study - Conclusions
- Author: Jesús Herman
After weeks of work, I have finally reached the conclusion of my practical facial recognition project, part of the artificial intelligence specialist course. During the process I documented the technical challenges, the solutions I implemented, and the project's evolution in two previous articles. In this last stage, however, I decided to focus entirely on finishing the project, which led me to stop writing daily articles. Below is the presentation I prepared for the case study.
Case Study Presentation
Project Objectives
The main objective of this project is to develop a real-time facial recognition application capable of detecting, identifying, and registering faces that appear in front of a camera. The application must support an efficient web implementation through modern technologies such as Flask and Docker, so that it can be easily deployed on any infrastructure, including Kubernetes clusters.
Data Used
The data used to train the model consists of face images stored in a specific directory (images/). These images are processed with the face_recognition library to obtain their facial encodings. Each image must contain a single face, and the images must be in .jpg or .png format. The data also includes the facial encodings and their associated names, which are saved in a pickle file (encodings.pkl).
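As an illustration of the stored data, the sketch below loads the saved file; note that the exact layout of encodings.pkl (a dict with "encodings" and "names" keys) is an assumption for this example, not necessarily the project's exact format.

```python
import pickle

# Assumed layout of encodings.pkl: a dict holding the 128-dimensional
# facial encodings and the names associated with them.
with open("encodings.pkl", "rb") as f:
    data = pickle.load(f)

print(len(data["encodings"]))  # number of known faces
print(data["names"])           # e.g. ["alice", "bob", ...]
```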
Models Used
The facial recognition model is based on the face_recognition library, which implements facial recognition with Convolutional Neural Networks (CNNs). The library builds on dlib for face detection and facial encoding extraction; the extracted encodings are then compared with the previously stored encodings of known faces to identify individuals. The following models from the library are used:
- Face detection model: Locates faces in the frame using a CNN (dlib's CNN-based detector; a lighter HOG-based detector is also available).
- Facial encoding model: Transforms each detected face into a 128-dimensional feature vector that can be compared with other faces.
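For reference, this is roughly how the two models are invoked through the face_recognition API; the image path is hypothetical, and model="cnn" explicitly selects dlib's CNN detector ("hog" is the library default).

```python
import face_recognition

frame = face_recognition.load_image_file("images/example.jpg")  # hypothetical path

# Face detection model: returns one (top, right, bottom, left) box per face.
boxes = face_recognition.face_locations(frame, model="cnn")

# Facial encoding model: one 128-dimensional vector per detected face,
# ready to be compared against the known encodings.
encodings = face_recognition.face_encodings(frame, known_face_locations=boxes)
```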
Model Training
Model training in this project is quite simple, since the model itself is pre-trained. What is "trained" here is the set of facial encodings of the known faces. For each face image, facial features are extracted with the face_recognition model; these encodings are associated with the persons' names and stored in a file for later use during real-time identification. The training process includes the following steps (a minimal sketch of the pipeline follows the list):
- Loading the face images.
- Image preprocessing:
  - Conversion to RGB: This step is crucial because some images may come in other color formats, such as CMYK or grayscale, while the facial recognition model requires RGB input.
  - Conversion to a numpy array: This format is easier to handle for the face detection and recognition operations performed by the face_recognition library.
  - Image validation: Each image is checked to be of type uint8 with three color channels (corresponding to RGB). Images that do not meet these conditions are skipped.
- Extracting the facial encodings.
- Saving the encodings and names in a file so they can be used for identification.
- Checking that there are no duplicate encodings.
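A minimal sketch of this pipeline, assuming that each person's name is taken from the image file name, that Pillow is used for the RGB conversion, and that encodings.pkl stores a dict with "encodings" and "names" keys; the project's actual script may differ in the details.

```python
import os
import pickle
import numpy as np
from PIL import Image
import face_recognition

IMAGE_DIR = "images"
known_encodings, known_names = [], []

for filename in os.listdir(IMAGE_DIR):
    if not filename.lower().endswith((".jpg", ".png")):
        continue

    # Preprocessing: force RGB (handles grayscale/CMYK) and convert to a numpy array.
    image = Image.open(os.path.join(IMAGE_DIR, filename)).convert("RGB")
    array = np.array(image)

    # Validation: the model expects uint8 arrays with three color channels.
    if array.dtype != np.uint8 or array.ndim != 3 or array.shape[2] != 3:
        continue

    # Each training image is expected to contain exactly one face.
    encodings = face_recognition.face_encodings(array)
    if len(encodings) != 1:
        continue
    encoding = encodings[0]

    # Skip encodings that are already registered (duplicate check).
    if any(np.array_equal(encoding, known) for known in known_encodings):
        continue

    known_encodings.append(encoding)
    known_names.append(os.path.splitext(filename)[0])  # name taken from the file name

with open("encodings.pkl", "wb") as f:
    pickle.dump({"encodings": known_encodings, "names": known_names}, f)
```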

Evaluation and Performance Metrics
The model is evaluated by observing how accurately the system recognizes faces in real time. Some important metrics that can be considered to evaluate performance are:
- Accuracy: The system's ability to correctly identify known faces.
- False positive/negative rate: It is important to monitor whether the system mistakenly identifies people who are not in the database (false positives) or fails to recognize a person who should be identified (false negatives). In my specific case this has not been an issue, since I have only tested the model with 4 images and it has not failed. Traditional cross-validation is not required here, as the pre-trained model is used primarily for feature extraction.
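To illustrate where matches (and therefore false positives/negatives) come from, the sketch below uses the library's comparison utilities; the test image path is hypothetical, the pickle layout is the one assumed earlier, and tolerance=0.6 is simply the library's default threshold.

```python
import pickle
import face_recognition

with open("encodings.pkl", "rb") as f:
    data = pickle.load(f)  # assumed {"encodings": [...], "names": [...]} layout

test_image = face_recognition.load_image_file("images/test_face.jpg")  # hypothetical path

for encoding in face_recognition.face_encodings(test_image):
    # One True/False per known face; tolerance=0.6 is the library default.
    matches = face_recognition.compare_faces(data["encodings"], encoding, tolerance=0.6)
    distances = face_recognition.face_distance(data["encodings"], encoding)

    if any(matches):
        best = int(distances.argmin())
        print("Identified:", data["names"][best], "distance:", distances[best])
    else:
        # Would count as a false negative if the person is actually known.
        print("Unknown face")
```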
Ethical and Responsible AI Aspects
- Privacy: The collection and use of biometric data, such as faces, must be carried out with the explicit consent of the people involved.
- Bias: It is important that the system is fair and accurate for all people, regardless of their race, gender, or other factors.
- Transparency: It must be clear how the collected data is used and what the limitations of the system are.
- Data Security: Facial images and encodings must be stored securely, protecting users' personal information.
Implementation and Deployment
The application has been implemented in Python, using Flask to create a web server that processes images in real time. The facial recognition model, together with the locally generated encodings, is integrated into the server. Deployment is facilitated by Docker, which packages the application in a container and ensures it runs consistently across different environments. Furthermore, the application can be deployed on Kubernetes to scale across a cluster and manage multiple instances efficiently.
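A minimal sketch of how such a Flask endpoint could look; the /recognize route, the "image" form field, and the JSON response format are assumptions for illustration, not the project's actual API.

```python
import pickle
import numpy as np
from flask import Flask, request, jsonify
from PIL import Image
import face_recognition

app = Flask(__name__)

with open("encodings.pkl", "rb") as f:
    KNOWN = pickle.load(f)  # assumed {"encodings": [...], "names": [...]} layout

@app.route("/recognize", methods=["POST"])  # hypothetical route name
def recognize():
    # The client is assumed to send the frame as a multipart file field named "image".
    frame = np.array(Image.open(request.files["image"].stream).convert("RGB"))

    names = []
    for encoding in face_recognition.face_encodings(frame):
        matches = face_recognition.compare_faces(KNOWN["encodings"], encoding)
        distances = face_recognition.face_distance(KNOWN["encodings"], encoding)
        names.append(KNOWN["names"][int(distances.argmin())] if any(matches) else "unknown")

    return jsonify({"faces": names})

if __name__ == "__main__":
    app.run(host="0.0.0.0", port=5000)
```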
Impact and Benefits
The impact of this project is significant in areas where the rapid identification of people is required. Some benefits:
- Security: Can be used in surveillance systems to identify people in real-time.
- Access Control: Can be applied in access control systems, where only authorized people can enter.
- Scalability: Thanks to the use of Docker and Kubernetes, the application can be scaled for use in multiple locations with high efficiency.
Challenges and Limitations
- Image Quality: If the images are low resolution or poorly lit, the model may have difficulty detecting and recognizing faces. I ran into problems even after applying image preprocessing.
- Accuracy: Although in my case, with only 4 photographs, the model has been accurate, accuracy may decrease in real environments.
- Computational Cost: I experienced performance issues, especially in detection, despite using a computer with high processing power and a good graphics card, and even with low-resolution video.
Future Improvements / Next Steps
- Improve accuracy: Include more data to improve model accuracy and reduce false positives and negatives.
- Performance optimization: Implement optimization techniques to improve the speed of real-time image processing (see the sketch after this list).
- Expand AI use: Include more advanced analysis such as emotion detection or face tracking.
- Improve data security: Implement better practices for encrypting facial encodings and improving privacy management.
- Deploy in cloud environments: Integrate deployment with cloud platforms such as AWS, Google Cloud, or Azure for better performance and scalability.
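As an example of the performance optimization mentioned above, a common technique with face_recognition is to run detection on a downscaled copy of each frame and to process only every Nth frame. The sketch below assumes an OpenCV capture loop, which is not necessarily how the project reads the camera.

```python
import cv2
import face_recognition

video = cv2.VideoCapture(0)  # hypothetical camera source
frame_count = 0

while True:
    ok, frame = video.read()
    if not ok:
        break

    frame_count += 1
    if frame_count % 3 != 0:  # process only every 3rd frame
        continue

    # Detect on a quarter-size copy to cut the computational cost,
    # then scale the boxes back up for display or logging.
    small = cv2.resize(frame, (0, 0), fx=0.25, fy=0.25)
    rgb_small = cv2.cvtColor(small, cv2.COLOR_BGR2RGB)
    boxes = face_recognition.face_locations(rgb_small)
    boxes = [(t * 4, r * 4, b * 4, l * 4) for (t, r, b, l) in boxes]
    print(boxes)

video.release()
```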
Demonstration of the Interface in Operation
Google Sheets Document
Link to the Google Sheets document: (https://docs.google.com/spreadsheets/d/1hh5TTBP-mkEjpn4RhzDrvrUvHvLNrtUXIvIQxjpIKAU/edit?usp=sharing)
References
- Project repository on GitHub, with instructions for the development and deployment of the application: (https://github.com/jhmarina/app-reconocimiento-facial)
- Articles on this blog detailing the development process:
Conclusion
Throughout the course we focused on model creation and the implementation of various AI techniques. In this specific project, however, it was not necessary to build a model from scratch, as I used a pre-trained facial recognition model. Although the model was ready for use, developing the project allowed me to apply key concepts learned during the course, especially regarding data preprocessing, and image preprocessing in particular.

My goal was to combine the theory learned in the course with my experience in product development, focusing on creating a functional solution that not only had an AI basis but could also be deployed and managed efficiently in a production environment. In summary, although I did not implement advanced modeling techniques in this project, I did use many of the concepts learned during the course to address the technical challenges and achieve my goal of creating a complete and practical application within the scope of web development and Kubernetes deployment.