
Image Processing: Tensorflow Object Detection API

The last few years have been a real revolution in deep neural networks: new architectures keep appearing, development frameworks keep improving, and hardware for experiments can be obtained completely free of charge - for example, through the Google Colaboratory project. If you are interested in how to apply pre-trained models from the Tensorflow Object Detection API repository to your own problem using the power of Colaboratory, read on.

If you do not want to read the article, you can go straight to the notebook in the repository.

A top GPU - for everyone


For training neural networks on large amounts of data it is better to use a GPU: training and inference are faster than on a CPU thanks to efficient parallelization across thousands of cores. Until recently you could get a GPU for your computations, for example, by renting Amazon cloud instances. But why pay for something you can get for free? Google Colaboratory provides access to a Tesla K80 card.

Access to the card is enabled through the menu Edit -> Notebook settings -> Hardware accelerator:
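
After switching the hardware accelerator, it is worth checking that the runtime actually sees the GPU before starting long experiments. This check is not in the original notebook; a minimal sketch:

# Quick sanity check (my own addition, not from the repository code):
# verify that Tensorflow sees the GPU before starting long runs.
# !nvidia-smi also works and shows the card model and memory.
import tensorflow as tf

device_name = tf.test.gpu_device_name()
if device_name:
    print('GPU found:', device_name)   # e.g. '/device:GPU:0'
else:
    print('No GPU - check Edit -> Notebook settings -> Hardware accelerator')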


With this card you can quickly run experiments with neural networks - for example, take the excellent NLP course from the DeepPavlov team. The only drawback of the service is that the card is yours for only 12 hours, so intermediate results need to be saved somewhere if you want to use them later.

In this article I will show how to train a model using Colaboratory and save it for future use. There will be little code in the article - all the code that trains the model and interacts with Google Drive (to save intermediate results) is available in my repository.

Go!

The DeepFashion dataset


For the experiments I will use the DeepFashion dataset - 800k images of garments.



The images are annotated with tags, and the garments in each photo are marked with bounding boxes. We will teach the neural network to detect clothes in a photo: draw a bounding box and assign one of three classes: upper-body, lower-body, or full-body.
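
In the TF Object Detection API these classes end up in a label map file. The repository's preparation scripts generate it automatically, but just for orientation, a three-class label map looks roughly like this (the exact ids, names and file location are determined by the preparation code, not by this sketch):

item {
  id: 1
  name: 'upper-body'
}
item {
  id: 2
  name: 'lower-body'
}
item {
  id: 3
  name: 'full-body'
}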

Data preparation


First, copy the Category and Attribute Prediction Benchmark directory of DeepFashion to the root of your Google Drive. We will use Google Drive heavily as file storage: copy data from it and upload working results (for example, model checkpoints) back to it.

First, let's copy the repository with the code and install the dependencies:

!rm -r TFFashionDetection
!git clone https://github.com/Dju999/TFFashionDetection.git

!pip install lxml
!pip install -U -q PyDrive
!pip install tqdm

You also need to create an auxiliary object for working with the Google Drive file system.

from TFFashionDetection.utils.colab_fs import GoogleColabFS

fs = GoogleColabFS()

For more information, see the utils.colab_fs.py file in the repository.
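
The helper hides the PyDrive plumbing. I have not reproduced the repository code here, but a minimal sketch of how such a Drive upload helper can be built in Colab with PyDrive (the name upload_to_drive is my own illustration, not the repository's API) might look like this:

# Hedged sketch of a Drive upload helper; the real implementation is in
# TFFashionDetection/utils/colab_fs.py and may differ in details.
from google.colab import auth
from oauth2client.client import GoogleCredentials
from pydrive.auth import GoogleAuth
from pydrive.drive import GoogleDrive

auth.authenticate_user()                      # interactive OAuth prompt in Colab
gauth = GoogleAuth()
gauth.credentials = GoogleCredentials.get_application_default()
drive = GoogleDrive(gauth)

def upload_to_drive(local_path, title):
    """Upload a local file to the root of Google Drive."""
    f = drive.CreateFile({'title': title})
    f.SetContentFile(local_path)
    f.Upload()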

Now you need to download the DeepFashion data:

 !python3 /content/TFFashionDetection/utils/dataset_download.py 

There are three directories in the dataset.


Our task is to prepare this data for feeding to the neural network: describe the files in a special format and split them into train and test sets.

Object detection with Tensorflow


In 2017 Google released the Object Detection API - a set of models and tools for object detection in images. The repository contains many scripts for preparing training data, training models, and visualizing results - for example, drawing bounding boxes.

The code below installs the TF Object Detection API from the github repository into the Google Colaboratory environment.

!cd /content; git clone https://github.com/tensorflow/models.git
# Install the Object Detection API dependencies, see
# https://github.com/tensorflow/models/blob/master/research/object_detection/g3doc/installation.md
!apt-get install protobuf-compiler python-pil python-lxml python-tk
!pip install Cython
!cd /content; git clone https://github.com/cocodataset/cocoapi.git; cd cocoapi/PythonAPI; make; cp -r pycocotools /content/models/research/
!cd /content/models/research; protoc object_detection/protos/*.proto --python_out=.
# Sanity check: run the model builder tests to verify the installation
!cd /content/models/research; export PYTHONPATH=$PYTHONPATH:`pwd`:`pwd`/slim; python object_detection/builders/model_builder_test.py

Now you need to prepare the data. The Img directory has a complex subdirectory structure, where each clothing category has its own folder. The code below copies all the photos into one directory and also builds a description of each file in the form of a tf.train.Example - this is what the detection model will be trained on.
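
The exact records written by DataPreparator live in the repository, but for intuition, a single image with one bounding box is encoded roughly like this (field names follow the standard TF Object Detection API convention; the helper below is my own illustration, not the repository's code):

import tensorflow as tf

def make_example(encoded_jpeg, width, height, box, class_text, class_id):
    """Encode one image with one bounding box as a tf.train.Example."""
    xmin, ymin, xmax, ymax = box  # coordinates normalized to [0, 1]
    feature = {
        'image/encoded': tf.train.Feature(bytes_list=tf.train.BytesList(value=[encoded_jpeg])),
        'image/format': tf.train.Feature(bytes_list=tf.train.BytesList(value=[b'jpeg'])),
        'image/width': tf.train.Feature(int64_list=tf.train.Int64List(value=[width])),
        'image/height': tf.train.Feature(int64_list=tf.train.Int64List(value=[height])),
        'image/object/bbox/xmin': tf.train.Feature(float_list=tf.train.FloatList(value=[xmin])),
        'image/object/bbox/ymin': tf.train.Feature(float_list=tf.train.FloatList(value=[ymin])),
        'image/object/bbox/xmax': tf.train.Feature(float_list=tf.train.FloatList(value=[xmax])),
        'image/object/bbox/ymax': tf.train.Feature(float_list=tf.train.FloatList(value=[ymax])),
        'image/object/class/text': tf.train.Feature(bytes_list=tf.train.BytesList(value=[class_text.encode()])),
        'image/object/class/label': tf.train.Feature(int64_list=tf.train.Int64List(value=[class_id])),
    }
    return tf.train.Example(features=tf.train.Features(feature=feature))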

The code uses the model name ssd_mobilenet_v2_coco_2018_03_29 - you can find a suitable model in the Detection Zoo. You can pick a different model, but then you will need to rewrite the file /content/data_dir/tf_api.config.
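
The write_config helper patches that pipeline file for you. If you swap in another model, the fields you usually have to adjust look roughly like this (the paths and values below are illustrative, not the exact ones the script produces):

model {
  ssd {
    num_classes: 3   # upper-body, lower-body, full-body
    ...
  }
}
train_config {
  fine_tune_checkpoint: "/content/ssd_mobilenet_v2_coco_2018_03_29/model.ckpt"
  ...
}
train_input_reader {
  tf_record_input_reader {
    input_path: "/content/data_dir/train.record"        # illustrative path
  }
  label_map_path: "/content/data_dir/label_map.pbtxt"   # illustrative path
}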

import sys
import os
import numpy as np

API_PATH = os.path.join('/content', 'models/research')
sys.path.append(API_PATH)

DETECTOR_PATH = os.path.join('/content', 'TFFashionDetection')
sys.path.append(DETECTOR_PATH)

from TFFashionDetection.data_preparator import DataPreparator
from TFFashionDetection.utils.ssd_config import write_config

data_preparator = DataPreparator()
data_preparator.build()

write_config('ssd_mobilenet_v2_coco_2018_03_29')

After the cell finishes, you can download the pre-trained model and start training. A frozen inference graph from the Object Detection model zoo will let you train your detector quickly. The path to the directory with the model graph has to be passed to the training script /object_detection/train.py.

# Download the pre-trained model from the Detection Zoo
!python /content/TFFashionDetection/utils/download_tf_zoo_model.py --name ssd_mobilenet_v2_coco_2018_03_29 --dir /content
# Start training the detector
!export PYTHONPATH=$PYTHONPATH:/content/models/research/slim:/content/models/research/;python /content/models/research/object_detection/train.py --logtostderr --pipeline_config_path=/content/data_dir/tf_api.config --train_dir=/content/data_dir/checkpoints

If everything is done correctly, you will see the logs scroll by and the loss decrease with each iteration. When you see that the loss function has stopped decreasing, you can stop training. The model graph is saved in the /content/data_dir/checkpoints directory - it needs to be preserved for further experiments. The model only has to be trained once; after that the resulting graph is used for inference.
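
Before moving on, it is handy to see which checkpoint is the newest - its prefix is what the export script further below expects. A small check of my own, not from the original notebook:

!ls -lt /content/data_dir/checkpoints | head

import tensorflow as tf
# Prints the prefix of the newest checkpoint in the train_dir used above,
# e.g. /content/data_dir/checkpoints/model.ckpt-2108
print(tf.train.latest_checkpoint('/content/data_dir/checkpoints'))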

When the model is trained, save it to Google Drive:

!cd /content/data_dir; zip -r checkpoint_save_20180514.zip checkpoints/*

import os

fs = GoogleColabFS()
file_name = os.path.join('/content/data_dir', 'checkpoint_save_20180514.zip')
fs.load_to_drive(file_name)

You can download it back from Google Drive in the same way and export the inference graph:

import os

fs.load_file_from_drive('/content', 'checkpoint_save_20180514.zip')
fs.unzip_file('/content', 'checkpoint_save_20180514.zip')

!mkdir /content/deep_detection_model
# Export the trained checkpoint as a frozen inference graph
!export PYTHONPATH=$PYTHONPATH:/content/models/research/slim:/content/models/research/;python /content/models/research/object_detection/export_inference_graph.py --input_type image_tensor --pipeline_config_path=/content/data_dir/tf_api.config --trained_checkpoint_prefix=/content/checkpoints/model.ckpt-2108 --output_directory inference_graph
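
The export step produces frozen_inference_graph.pb in the inference_graph directory. The notebook itself uses helper classes from the repository for inference, but for reference, here is a hedged sketch of the usual TF 1.x way to run such a frozen graph directly (tensor names follow the export_inference_graph.py convention; the file paths are illustrative assumptions):

import numpy as np
import tensorflow as tf
from PIL import Image

FROZEN_GRAPH = '/content/inference_graph/frozen_inference_graph.pb'  # assumed export path

# Load the frozen graph into a fresh tf.Graph (TF 1.x style).
detection_graph = tf.Graph()
with detection_graph.as_default():
    graph_def = tf.GraphDef()
    with tf.gfile.GFile(FROZEN_GRAPH, 'rb') as f:
        graph_def.ParseFromString(f.read())
    tf.import_graph_def(graph_def, name='')

with detection_graph.as_default(), tf.Session() as sess:
    image = np.array(Image.open('/content/some_photo.jpg'))  # illustrative path
    # Standard output tensors produced by export_inference_graph.py
    boxes, scores, classes = sess.run(
        ['detection_boxes:0', 'detection_scores:0', 'detection_classes:0'],
        feed_dict={'image_tensor:0': image[np.newaxis, ...]})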

For example, select a random photo and feed it to our network for detection:

import sys
import os
import numpy as np
import matplotlib.pyplot as plt
plt.switch_backend('agg')

sys.path.append(os.path.join('/content', 'models/research'))
from object_detection.utils import visualization_utils as vis_util
from PIL import Image as Pil_image

%matplotlib inline

# Bounding box predicted by the detector (object_detector comes from the repository code)
boxes = np.array([object_detector.img_detections[3]['category_box']])

def load_image_into_numpy_array(image):
    (im_width, im_height) = image.size
    return np.array(image.getdata()).reshape((im_height, im_width, 3)).astype(np.uint8)

# Load the test photo (file_path points to the image that was fed to the detector)
image = Pil_image.open(file_path)
image_np = load_image_into_numpy_array(image)

# Draw the bounding boxes on the image
vis_util.draw_bounding_boxes_on_image_array(image_np, boxes)

# Save the result to a file
result_file_path = os.path.join('/content', 'test.png')
vis_util.save_image_array_as_png(image_np, result_file_path)

# Display the result in the notebook
from IPython.display import Image
Image(result_file_path)

The detection result: a lower_body garment.



Conclusion


The TF Object Detection API is a great technology that lets you use state-of-the-art network architectures in your own models. And Google Colaboratory is an excellent platform for experiments that lets you train networks on powerful hardware. The code from the article is available here.

Source: https://habr.com/ru/post/358146/

