Create offline facial recognition with an accuracy of 99.38% in Python and Node.js

This is the story of how I created a free, offline, real-time, open source application that helps event organizers admit only invited people, using face recognition or a QR code.

If you can't wait and want to go straight to the code, here is my repository.

So yes, face recognition is only part of the application, and the most difficult part. So pour yourself some coffee and enjoy my story (I tried).

Deep learning projects are usually written in Python, not in Node.js. The reason is that Python has many more libraries for efficient computation, for example NumPy, pandas, TensorFlow, and so on. And the gap is big enough. I know Node.js, well, a little, and I wanted to use it in this project to master it better while I was busy with machine learning.

It all started during an online competition that included a question about working with the AzureML API (facial expression detection). I saw that there was also a face recognition API, but with some limitations. In addition, you first have to upload an image to the service, which then computes the result and sends it back. Too slow for me. I wanted to play with it, but at that time the service was not available in my country. So I want to thank the developers, who gave me the idea to make something of my own. Until then, I had only studied what others had already done. I also had some doubts about security, and I needed a fallback option. In addition, I had read an article by my boss about learning a new language. And I thought it was a great opportunity to apply my knowledge in a completely new project.

Well, and then what? I plugged the power cord into my MacBook because I knew it would take a long time, and took my buddy Google along on a ridiculous search for the best resources to start the project with. It turned out that people had already asked online whether what I had in mind was possible, but there were no sensible answers. At the time, I was alone with this task. Then I came across an excellent series of posts by Adam Geitgey. I used to read his blog, but then somehow got busy with other things. This time I found a wonderful article of his and learned that he had created the face_recognition package in Python. I downloaded it and tested it. Well, it did not go as smoothly as I would have liked.
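
For context, the face_recognition package describes each face as a 128-dimensional encoding and treats two faces as a match when the Euclidean distance between their encodings is within a tolerance (0.6 by default). Below is a minimal pure-Python sketch of that comparison. The names `face_distance` and `best_match` here are illustrative helpers, and the 3-dimensional vectors are made up to keep the example short; real encodings come from the library.

```python
import math


def face_distance(a, b):
    """Euclidean distance between two encodings of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def best_match(known, candidate, tolerance=0.6):
    """Return the name of the closest known encoding within tolerance, else None."""
    name, encoding = min(
        known.items(), key=lambda item: face_distance(item[1], candidate)
    )
    return name if face_distance(encoding, candidate) <= tolerance else None


# Made-up encodings for illustration only.
known = {"alice": [0.1, 0.2, 0.3], "bob": [0.9, 0.8, 0.7]}
match = best_match(known, [0.12, 0.25, 0.28])   # close to alice
stranger = best_match(known, [5.0, 5.0, 5.0])   # far from everyone
print(match, stranger)
```

A lower tolerance makes matching stricter, which matters when the goal is to admit only invited people.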

Installing dlib was the hard part. I do not know why, but the installation failed at the very last stage. I spent many hours trying to figure out the reason; I didn't even go to the gym that day.

I was on the verge of abandoning the project. But then I found out that the cause was a path conflict with the Anaconda Python distribution, or something like that. I was still learning the ecosystem, so I had to decide whether to keep Anaconda and abandon the project, or get rid of the "snake". In the end, I removed Anaconda completely and spent the day uninstalling the different versions of Python that various packages had pulled in, leaving only the system version. Then, using Homebrew, I installed Python 3 properly, installed dlib, and by the end of the day I was able to run it.

There was a new problem: how to integrate the library with Node.js? Again I faced a dilemma: learn a Python framework like Django or Flask to continue working on the project, or get involved in a potentially endless task that might turn out to be unsolvable. Then a phrase I had once read surfaced in my head:

The rule of mathematics: if it looks simple, then you are doing it wrong.

That phrase, together with my boss's article, inspired me, so I decided to continue with Node.js, which I knew a little from a few web projects.

So I again began to look for ways to integrate Python and Node.js. I learned about child processes in Node.js. But my situation looked like this:

I read the documentation, which had scant examples. I read blogs, and it was the same everywhere. But this time I intended to complete the project at any cost (read: with an internship as motivation). Since the internship season was approaching, I needed to finish the preparations as soon as possible. In addition, I needed enough time to create another project like this next summer, again on my own, with the help of Google alone. If you are a little familiar with machine learning, this will sound familiar:

If the learning rate is too low, then reaching the optimum will take a lot of time, or you will get stuck at a local minimum. If the rate is too high, you can overshoot. But if the rate is right, or adjusted depending on the conditions, the algorithm will quickly find the optimal point.
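
The effect of the learning rate can be seen on a toy problem. This is a minimal sketch, assuming plain gradient descent on f(x) = x², whose gradient is 2x and whose minimum is at x = 0; the function name `gradient_descent` and the specific rates are chosen only for illustration. This function has no local minima, so the sketch shows only the "too slow" and "overshoot" cases.

```python
def gradient_descent(learning_rate, start=10.0, steps=50):
    """Minimize f(x) = x**2 (gradient 2*x) and return the final x."""
    x = start
    for _ in range(steps):
        x -= learning_rate * 2 * x
    return x


too_low = gradient_descent(0.001)   # barely moves away from the start
good = gradient_descent(0.1)        # ends up very close to the minimum at 0
too_high = gradient_descent(1.1)    # each step overshoots further; it diverges

print(too_low, good, too_high)
```

With the same number of steps, only the middle rate ends up near the minimum.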

For me, as for the algorithm, the internship was the right learning rate. So I had to finish the project.

So I worked hard, sometimes until 5:45 in the morning. It was an amazing time, and I made a lot of stupid mistakes. I did not refresh the tab after changing the code on the server. I changed the code on the client several times but did not refresh the window. I do not know why; perhaps I just wanted to sleep too much, at a comfortable 22 degrees Celsius in my cozy bed. There were amazing moments, like searching Stack Overflow for a logic error that did not exist, which I later fixed by simply refreshing the tab.

Finally, I got Python and Node.js to be friends.
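
One common way to connect the two is for Node.js to start a Python script as a child process (for example with `child_process.spawn`) and exchange one JSON message per line over stdin and stdout. Below is a minimal sketch of the Python half of such a bridge. It is not necessarily how the repository does it: the message format, the `handle` function and its placeholder reply are all assumptions, and each message is assumed to be a JSON object.

```python
import io
import json
import sys


def handle(message):
    """Hypothetical handler: echo the request id with a placeholder status."""
    return {"id": message.get("id"), "status": "ok"}


def serve(stdin=sys.stdin, stdout=sys.stdout):
    """Read one JSON object per line and answer with one JSON line each."""
    for line in stdin:
        line = line.strip()
        if not line:
            continue
        try:
            reply = handle(json.loads(line))
        except json.JSONDecodeError:
            reply = {"status": "error", "reason": "invalid JSON"}
        stdout.write(json.dumps(reply) + "\n")
        stdout.flush()  # flush so the parent process sees the reply right away


# Demo with in-memory streams instead of a real parent process.
demo_in = io.StringIO('{"id": 1}\nnot json\n')
demo_out = io.StringIO()
serve(demo_in, demo_out)
print(demo_out.getvalue(), end="")
```

When the script is started by Node.js, calling `serve()` with no arguments reads from the real stdin. Flushing after every reply matters because output written to a pipe is otherwise buffered, and the parent may wait a long time for an answer.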

After that incident with AzureML, I was set on creating a fully offline web application that could do everything planned without any Internet connection. I had to find alternative APIs or write them myself. As you probably know, in computer science this is the usual order of things: time is traded against space. So I tried to minimize the time that certain operations take. Sending a photo / video to a cloud service takes time and bandwidth, which means I had to use more local space instead. Although cloud services are much more convenient, I like writing a lot of code when I am doing something that interests me. You can see this for yourself by reading the installation instructions in my repository.

I found many packages and tried them one by one: I would put one to work, sometimes drop it and look again. For example, I needed QR code recognition as a fallback in case face recognition failed. There are many packages for generating codes, but not many for scanning them. In the end I found the Instascan package. I had trouble with it. The thing is that I used OpenCV3 for face recognition in Python, and scanned QR codes with the same camera (the laptop has only one). I needed a small video frame for scanning, but that also changed the size of the video used for face recognition. Well, that didn't seem like a problem: I could stop the recognition process, scan the QR code, and start recognition again. Pretty simple, huh? But if you read carefully, you noticed two points:

If it looks simple, then you are doing it wrong.
I'm trying to minimize time (the main reason not to wait for the API to become available).

So in order to solve this problem, I had to study the code of the packages and the processes they perform.

Library parsing

Yes, I took on this task three times: I dug into libraries to deal with all sorts of difficulties. Well, difficulties for me at least.

I studied how the face_recognition package works in order to adapt it to my own purposes. Along the way I also helped others.
I studied Instascan a little to understand how it works with the camera, and how to work with a camera in a web application in general. There are many cases that need to be handled: what if the user somehow stops the camera, for example by clicking outside a modal window or closing it altogether? I changed the code and ran it many times, finding several bugs each time. Once my Mac almost ran out of memory and froze for a few seconds. After repeated attempts I finally succeeded, but then I found yet another bug.
This time the bug was in the modal window of the Materialize framework: a callback was not firing. I googled and rummaged through GitHub and Stack Overflow, but did not find a solution. I tracked down the code responsible for the bug and tried to figure it out, running it several times with console.log() statements to understand what was happening, getting closer and closer to the bug and isolating the code piece by piece (I felt like a hacker cracking a password). I heard that the forms in this framework are not too good either, so I'll play with them in another web application.

Event ++

This is an amazing bug caused by jQuery. The number of events increased with each click: 1, 2, 3, 4, 5, 6, 7, 8 ... My real-time notifications completely filled the right column in which they were displayed. It turned out that it had to do with using on() instead of click() in jQuery, and with registering the socket.on() events outside the handler.
Finally, after a long struggle, the server part began to work well with the client.
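
A count that grows 1, 2, 3 ... with each click typically means that a listener is being attached again on every click. Below is a minimal Python sketch of that pattern, using a hypothetical `Emitter` class in place of jQuery and the socket. Whether this matches the exact cause in the project is an assumption.

```python
class Emitter:
    """Tiny hypothetical stand-in for an event source such as a socket."""

    def __init__(self):
        self.handlers = []

    def on(self, handler):
        self.handlers.append(handler)

    def emit(self, payload):
        for handler in self.handlers:
            handler(payload)


def count_notifications(register_inside_click, clicks=3):
    """Return how many notifications each click produces."""
    socket = Emitter()
    shown = []
    if not register_inside_click:
        socket.on(shown.append)      # registered once, up front
    counts = []
    for _ in range(clicks):
        if register_inside_click:
            socket.on(shown.append)  # bug: every click adds another listener
        before = len(shown)
        socket.emit("visitor arrived")
        counts.append(len(shown) - before)
    return counts


buggy = count_notifications(True)
fixed = count_notifications(False)
print(buggy, fixed)  # [1, 2, 3] [1, 1, 1]
```

Registering the listener once, outside the click handler, keeps each event at exactly one notification.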

I thought everything was ready. But then an idea came to me: what if I added database support so that the user could perform CRUD operations? I wanted to leave it up to users which to use: SQL or NoSQL. I thought that I would add this support and, who knows, maybe make money on it (it is easy to get your head in the clouds after a little success). Just a little more, and I would have full functionality for automatically registering incoming / outgoing visitors (face recognition, QR code scanning, no API restrictions, three-step authentication). But:

I learned everything from the open-source community.
I do not think I could sell my product by knocking on the doors of different companies. I am too lazy for that. I would rather develop a few more products like this one.
I pictured a scene from the film "The Social Network" in which Mark wrote an application and released it for free, even though he had good purchase offers (I think from Microsoft), and here I am, so full of myself, with a small web application that I didn't even write from scratch.

I tried to integrate MongoDB because I was a bit familiar with it, and besides, I hadn't studied SQL in college yet. That will only come next semester. So I left the implementation of this feature for another project, perhaps a fork or a port of this one.
In general, I developed, integrated, and spent several days studying, using, debugging, and so on.

Finally, I saw this:

Frontend, backend and database work in unison. In the center, of course, is the server part (combining the power of Python and Node.js). You can solve other problems too, for example training a model, because in my Python process I was able to integrate OpenCV3 (this requires installing binaries), face_recognition, numpy, and pandas with a dataset, and to save the result in .csv format. So if you have the right hardware, you can do something completely different based on my project.

I leave it to you to decide who in the gif is the frontend and who is the database.

Signing.Off();
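
As an illustration of saving results as .csv from a Python process, here is a minimal sketch using only the standard library. The column names (`name`, `method`, `timestamp`) and the sample rows are hypothetical and are not taken from the repository.

```python
import csv
import io


def write_results(rows, out):
    """Write name/method/timestamp records as CSV with a header row."""
    writer = csv.DictWriter(out, fieldnames=["name", "method", "timestamp"])
    writer.writeheader()
    writer.writerows(rows)


# In-memory buffer for the demo; a real run would pass an open file instead.
buffer = io.StringIO()
write_results(
    [
        {"name": "guest-1", "method": "face", "timestamp": "2017-08-01T10:00:00"},
        {"name": "guest-2", "method": "qr", "timestamp": "2017-08-01T10:02:30"},
    ],
    buffer,
)
print(buffer.getvalue(), end="")
```

To write to disk, open the file with `newline=""` as the csv module documentation recommends, and pass the file object as `out`.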

* * *
Link to the project.

Why did I write this text, even though my codebase is not that great? So what? Well, for many of you this is just a few hundred lines of code, but for me, integrating all the parts, studying systematically, refreshing existing knowledge, and fixing one codebase after another in this (for me) open-ended project adds up to a task that no one had solved yet (machine learning in Python and Node.js). Well, maybe I didn't search well enough. In any case, for me it is a big project. I hope someone will find it useful. I also wrote this post to relive the moments of frustration and fleeting happiness when something broke or worked. That is life.

Source: https://habr.com/ru/post/334716/
