
Bird Detection with Azure ML Workbench

Have you ever thought about how many demanding tasks biologists face? They need to analyze huge amounts of information to track population dynamics, identify rare species, and assess their impact. Below, we describe a project for detecting red-legged kittiwakes in photos taken by surveillance cameras. You will learn about data labeling, training the model in Azure Machine Learning Workbench with the Microsoft Cognitive Toolkit (CNTK) and TensorFlow, and deploying a prediction web service.



This article is a translation of the material Bird Detection with Azure ML Workbench.

Data


The video below is provided by Abram Fleischman (San Jose State University) and Conservation Metrics, Inc. It captures the natural habitat of the red-legged kittiwake, the bird species for which a detection tool is needed. Using various gear, including climbing equipment, biologists install cameras on the cliffs to take pictures both during the day and at night.


These photographs were used to train the model, and the images were labeled with the Visual Object Tagging Tool (VOTT). Labeling took about 20 hours, during which roughly 12,000 bounding boxes were annotated.



The labeled data is available in the repository on GitHub.

Where the data came from


This data was collected by Dr. Rachel Orben (Oregon State University), Abram Fleischman (San Jose State University), and Conservation Metrics, Inc. as part of a larger project to study the early breeding period of the red-legged kittiwake, assess the impact of food availability, and analyze the non-breeding period in the Bering Sea (Alaska).

Object detection


You can learn more about object detection techniques in the blog post about convolutional neural networks (CNN). Faster R-CNN (Faster Region-based Convolutional Neural Network) is a relatively new approach (the first paper on the method was published in 2015). It is widely used by the machine learning community and is now implemented in the most popular deep neural network (DNN) frameworks, including PyTorch, CNTK, TensorFlow, Caffe, and others.



In this article, we look at using the Faster R-CNN algorithm for object detection in both the CNTK and TensorFlow frameworks.

Azure Machine Learning Workbench


We used the recently announced Azure Machine Learning Workbench platform to train the model and create prediction web services. It is a set of analytical tools that lets data scientists prepare data, run machine learning experiments, and deploy models in the cloud (see the documentation in the Installation and Setup section).

Since we had to work with images, we used the MNIST handwritten digit classification templates for CNTK and TensorFlow as a starting point for our experiments.



As a rule, deep neural network (DNN) training is most efficient on a graphics processing unit (GPU), which significantly speeds up many matrix operations. To train the models, we deployed GPU-enabled Data Science Virtual Machines and used the remote Docker execution environment available in Azure ML Workbench (see the “Details” section and additional information on compute targets).



Azure ML records the results of each job (experiment) in a run history. Since the experiments used various combinations of model parameters, this turned out to be quite useful: the built-in visualization tools help choose the model with the best performance. Note that you need to instrument your training/evaluation code with the Azure ML logging API in order to track the metrics you care about (for example, classification accuracy).
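For example, a minimal sketch of such instrumentation (the metric name and value here are purely illustrative) looks like this:

 from azureml.logging import get_azureml_logger

 run_logger = get_azureml_logger()
 # The logged value shows up in the Workbench run history for this experiment
 run_logger.log("accuracy", 0.91)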

Image Labeling and Export


We used the VOTT utility (available for Windows and macOS) to label the data and export it in CNTK and Pascal VOC formats for CNTK and TensorFlow, respectively.

The tool provides a convenient annotation interface for marking regions of interest in images and video. To use it, collect the images into a folder, launch VOTT, point it at the image dataset, and proceed to label the regions.



When finished, click Object Detection, then Export Tags to export the annotations for CNTK and TensorFlow.



For TensorFlow, the export format is Pascal VOC, so we converted the data into TFRecords for use in training and evaluation. More details are given below.

CNTK Bird Detection Training


As indicated in the previous section, we used the popular Faster R-CNN algorithm in the bird detection model. In this section, we will focus on two aspects of our approach:

  1. Using Azure ML Workbench to run training on remote VMs.
  2. Tuning hyperparameters through Azure ML Workbench.

Using the Azure ML Workbench for training on a remote VM


Hyperparameter tuning is a key stage in building production-ready machine learning (or deep learning) models, once the first draft of the model has shown promising results. The challenge here is to run it efficiently and to simplify the process with Azure ML Workbench.

Tuning the parameters requires a large number of training runs, which usually take a very long time. One approach is to train on a powerful local machine or cluster. Our approach, however, is to train in the cloud using Docker containers on remote (virtual) machines. The main advantage is that we can now spin up as many containers as needed to tune parameters in parallel. Following the Azure ML documentation, you must register each virtual machine as a compute target for the experiment. Please note that there are restrictions on password characters; for example, using the "*" symbol in the password will cause an error.

az ml computetarget attach --name "my_dsvm" --address "my_dsvm_ip_address" --username "my_name" --password "my_password" --type remotedocker 

After the command is executed, the files myvm.compute and myvm.runconfig are created in the aml_config folder. Since our task benefits from a GPU machine, the following changes are required:

In myvm.compute


 baseDockerImage: microsoft/mmlspark:plus-gpu-0.7.91
 nvidiaDocker: true

In myvm.runconfig


 EnvironmentVariables:
   "STORAGE_ACCOUNT_NAME":
   "STORAGE_ACCOUNT_KEY":
 Framework: Python
 PrepareEnvironment: true

We used Azure storage to keep the training data, pre-trained models, and model checkpoints. The storage credentials are passed in as EnvironmentVariables. Make sure the required packages are listed in conda_dependencies.yml.
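Inside the training script, these credentials can then be read back from the environment; a minimal sketch (the variable names match the runconfig section above):

 import os

 # Credentials injected through the runconfig EnvironmentVariables section
 storage_account_name = os.environ["STORAGE_ACCOUNT_NAME"]
 storage_account_key = os.environ["STORAGE_ACCOUNT_KEY"]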

Now we can execute the command to start preparing the machine.

 az ml experiment prepare -c myvm

Then we will train the object detection model.

 az ml experiment submit -c myvm Detection/FasterRCNN/run_faster_rcnn.py
 ...
 Evaluating Faster R-CNN model for 53 images.
 Number of rois before non-maximum suppression: 8099
 Number of rois after non-maximum suppression: 1871
 AP for Kittiwake = 0.7544
 Mean AP = 0.7544

Configure hyperparameters through Azure ML Workbench


Using Azure ML Workbench, you can easily record hyperparameters and other performance metrics while running multiple containers in parallel (for more information, see the “Logging” section of the documentation).

The first thing to try is different pre-trained base models. At the time of this writing, the CNTK Faster R-CNN API supported two base models: AlexNet and VGG16. We use these pre-trained models to extract image features. Although these base models were trained on other datasets, such as ImageNet, low- and mid-level image features are largely the same across applications and can therefore be reused. This phenomenon is known as “transfer learning.”

AlexNet has five convolutional (CONV) layers, while VGG16 has thirteen. VGG16 has about 138 million trainable parameters, more than twice as many as AlexNet; we used VGG16 as the base model here. The following VGG16 hyperparameters were tuned for the best performance on the evaluation set.

In Detection/FasterRCNN/FasterRCNN_config.py:

 # Learning parameters
 __C.CNTK.L2_REG_WEIGHT = 0.0005
 __C.CNTK.MOMENTUM_PER_MB = 0.9
 # The learning rate multiplier for all bias weights
 __C.CNTK.BIAS_LR_MULT = 2.0

In Detection/utils/configs/VGG16_config.py:

 __C.MODEL.E2E_LR_FACTOR = 1.0
 __C.MODEL.RPN_LR_FACTOR = 1.0
 __C.MODEL.FRCN_LR_FACTOR = 1.0

Azure ML Workbench greatly simplifies the visualization and comparison of different configurations of parameters.



mAP using the base model VGG16


 Evaluating Faster R-CNN model for 53 images.
 Number of rois before non-maximum suppression: 6998
 Number of rois after non-maximum suppression: 2240
 AP for Kittiwake = 0.8204
 Mean AP = 0.8204

For implementation instructions, see the GitHub repository.

Training a bird detection model with TensorFlow


Google recently released a powerful set of object detection APIs. We used their animal recognition tutorial on the Google Cloud Machine Learning Engine as inspiration for training a kittiwake detection model on Azure ML Workbench. The TensorFlow Object Detection API includes many models pre-trained on the COCO dataset. In our experiments, we used ResNet-101 (a deep residual network with 101 layers) as the base model and started from the configuration of the animal recognition example to set up object detection training.

This repository contains the scripts used to train object detection models with Azure ML Workbench and TensorFlow.

Training preparation


Step 1. Prepare the data in TFRecords format, which is required by the TensorFlow Object Detection API. This requires converting the standard output of the VOTT tool. See the generic create_pascal_tf_record.py converter for details.

 python create_pascal_tf_record.py --label_map_path=/data/pascal_label_map.pbtxt --data_dir=/data/ --output_path=/data/out/pascal_train.record --set=train

 python create_pascal_tf_record.py --label_map_path=/data/pascal_label_map.pbtxt --data_dir=/data/ --output_path=/data/out/pascal_val.record --set=val
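For reference, here is a simplified sketch of the tf.train.Example that such a converter builds for each image; the field names follow the TensorFlow Object Detection API conventions, and the helper function shown here is illustrative rather than taken from the converter itself:

 import tensorflow as tf
 from object_detection.utils import dataset_util

 def make_tf_example(encoded_jpeg, width, height, boxes, class_ids):
     # boxes are normalized [xmin, ymin, xmax, ymax]; class_ids map to pascal_label_map.pbtxt
     return tf.train.Example(features=tf.train.Features(feature={
         'image/encoded': dataset_util.bytes_feature(encoded_jpeg),
         'image/format': dataset_util.bytes_feature(b'jpeg'),
         'image/width': dataset_util.int64_feature(width),
         'image/height': dataset_util.int64_feature(height),
         'image/object/bbox/xmin': dataset_util.float_list_feature([b[0] for b in boxes]),
         'image/object/bbox/ymin': dataset_util.float_list_feature([b[1] for b in boxes]),
         'image/object/bbox/xmax': dataset_util.float_list_feature([b[2] for b in boxes]),
         'image/object/bbox/ymax': dataset_util.float_list_feature([b[3] for b in boxes]),
         'image/object/class/label': dataset_util.int64_list_feature(class_ids),
     }))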

Step 2. Package the TensorFlow Object Detection and Slim code for installation into the Docker image used for the experiments. Below are the steps from the TensorFlow Object Detection documentation:

 # From tensorflow/models/research/
 python setup.py sdist
 (cd slim && python setup.py sdist)

Then move the created tar files to a location accessible during experimentation (for example, blob storage) and reference them in conda_dependencies.yaml for the experiment.

 dependencies:
   - python=3.5.2
   - tensorflow-gpu
   - pip:
     # ... more dependencies here ...
     # TF Object Detection
     - https://olgalidata.blob.core.windows.net/tfobj/object_detection-0.1_3.tar.gz
     - https://olgalidata.blob.core.windows.net/tfobj/slim-0.1.tar.gz

Step 3. In the experiment script, add the import:

 from object_detection.train import main as training_module 

Then call the training procedure in your code via the training_module(_) function.
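A minimal sketch of what that call can look like, assuming the standard Object Detection API flags (such as --train_dir and --pipeline_config_path) are passed on the command line as in the upstream train.py:

 import tensorflow as tf
 from object_detection.train import main as training_module

 if __name__ == '__main__':
     # tf.app.run parses the command-line flags and then invokes the training entry point
     tf.app.run(main=training_module)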

The training and evaluation process


The TensorFlow Object Detection API expects training and evaluation (checking the current performance of the model) to be launched as two separate command-line invocations. When running several experiments, it is useful to run evaluation periodically (for example, every 100 iterations) to track the model’s ability to recognize objects in unseen data.

For the TensorFlow Object Detection API, we added train_eval.py, which demonstrates an approach to continuous training and evaluation.

  print("Total number of training steps {}".format(train_config.num_steps)) print("Evaluation will run every {} steps".format(FLAGS.eval_every_n_steps)) train_config.num_steps = current_step while current_step <= total_num_steps: print("Training steps # {0}".format(current_step)) trainer.train(create_input_dict_fn, model_fn, train_config, master, task, FLAGS.num_clones, worker_replicas, FLAGS.clone_on_cpu, ps_tasks, worker_job_name, is_chief, FLAGS.train_dir) tf.reset_default_graph() evaluate_step() tf.reset_default_graph() current_step = current_step + FLAGS.eval_every_n_steps train_config.num_steps = current_step 

To tune several model hyperparameters and assess their impact, we split the data into training, validation (tuning), and test sets: 160, 54, and 55 images, respectively.

Comparing Runs


The TensorFlow object detection framework exposes many parameter settings, allowing you to choose the best configuration for a specific dataset.

In this exercise, we perform several runs and see which one gives the best model performance. As the target metric, we use object detection accuracy, usually measured as mAP (mean Average Precision). In each run, we use azureml.logging to record the maximum mAP and the training iteration at which it was reached. In addition, we plot mAP against iterations and save the chart to the outputs folder for display in Azure ML Workbench.

TensorBoard event integration with Azure ML Workbench


TensorBoard is a powerful tool for debugging and visualizing deep neural networks (DNNs). The TensorFlow Object Detection API already emits summary metrics for precision. In this project, we integrated the TensorFlow events used by TensorBoard for visualization with Azure ML Workbench.

 <span class="pl-k">from</span> tensorboard.backend.event_processing <span class="pl-k">import</span> event_accumulator <span class="pl-k">from</span> azureml.logging <span class="pl-k">import</span> get_azureml_logger ea = event_accumulator.EventAccumulator(eval_path, ...) df = pd.DataFrame(ea.Scalars('Precision/mAP@0.5IOU')) max_vals = df.loc[df["value"].idxmax()] #Plot chart of how mAP changers as training progresses fig = plt.figure(figsize=(6, 5), dpi=75) plt.plot(df["step"], df["value"]) plt.plot(max_vals["step"], max_vals["value"], "g+", mew=2, ms=10) fig.savefig("./outputs/mAP.png", bbox_inches='tight') # Log to AML Workbench best mAP of the run with corresponding iteration N run_logger = get_azureml_logger() run_logger.log("max_mAP", max_vals["value"]) run_logger.log("max_mAP_interation#", max_vals["step"]) 

For details, see the results_logger.py code.

Here is an analysis of several training runs conducted using the Azure ML Workbench experiment framework.

Run 1 uses stochastic gradient descent with data augmentation disabled (for an overview of gradient optimization algorithms, see this blog post).

The execution log in Azure ML Workbench provides details on each run:



In this case, the maximum mAP value of 93.37% was reached at approximately 3,500 iterations. After that, the model starts to overfit the training data and performance on the test set begins to drop.

Run 2 uses the Adam optimizer. All other settings are the same.



Here, an mAP of 93.6% is reached much faster than in run 1. However, overfitting also sets in much earlier, as accuracy on the evaluation set drops rapidly.

Run 3 adds data augmentation to the training configuration. For subsequent runs, we keep the Adam optimizer.

 data_augmentation_options {
   random_horizontal_flip {}
 }



Random horizontal flipping of images improved mAP from 93.6% in run 2 to 94.2%. It also takes more iterations before the model starts to overfit.

Run 4 contains more data augmentation parameters.

 data_augmentation_options {
   random_horizontal_flip {}
   random_pixel_value_scale {}
   random_crop_image {}
 }

The following are interesting results:



Although the mAP value is not the highest (91.1%), there is no overfitting even after 7,000 iterations. It would make sense to keep training this model to see whether mAP can be increased further.

Here is a brief overview of the training runs in Azure ML Workbench:



Azure ML Workbench also lets users compare runs side by side (below are runs 1, 3, and 4):



In addition, we can plot the evaluation results for selected image(s) and use them when comparing runs. The TensorBoard events already contain all the necessary data.
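A hedged sketch of how these per-image evaluation summaries can be pulled out of the TensorBoard event files (the eval_path value is a hypothetical example; the Object Detection API writes image summaries with the predicted boxes drawn on the evaluated examples):

 from tensorboard.backend.event_processing import event_accumulator

 eval_path = "./eval"  # directory with the evaluation event files (hypothetical path)

 # Keep all image summaries, not just the most recent ones
 ea = event_accumulator.EventAccumulator(
     eval_path, size_guidance={event_accumulator.IMAGES: 0})
 ea.Reload()

 # Each image summary holds an encoded PNG with the predicted boxes drawn on it
 for tag in ea.Tags()['images']:
     for image_event in ea.Images(tag):
         out_name = "./outputs/{}_{}.png".format(tag.replace('/', '_'), image_event.step)
         with open(out_name, "wb") as f:
             f.write(image_event.encoded_image_string)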



In summary, ResNet-based object detection achieves good results even on small datasets. Azure ML Workbench provides useful infrastructure and a single place for running experiments and comparing results.

Deploying a scoring web service


Once the object detection and classification model performs well enough, we move on to deploying it as a hosted web service so that it can be connected to the bird monitoring application. We’ll show how to do this with the built-in Azure ML tools and how to perform a custom deployment.

Web services using Azure ML CLI


Azure ML provides extensive support for operationalizing the model on local computers or the Azure cloud platform.

Install Azure ML CLI


Before deploying the model as a web service, SSH into the VM you are using:

 ssh <username>@<vm-ip-address>

In this example, we use an Azure Data Science Virtual Machine, which already has the Azure CLI installed. If you use another VM, install the Azure CLI with:

 pip install azure-cli
 pip install azure-cli-ml

Log in using:

 az login 

Environment preparation


First, register the environment provider with:

 az provider register -n Microsoft.MachineLearningCompute 

When deploying the web service locally, you first need to set up the environment:

 az ml env setup -l [Azure region, eg eastus2] -n [environment name] -g [resource group] 

This step will create a resource group, storage account, Azure Container Registry (ACR), and Application Insights account.

Set up the environment as shown:

 az ml env set -n [environment name] -g [resource group] 

Create a model management account:

 az ml account modelmanagement create -l [Azure region, eg eastus2] -n [your account name] -g [resource group name] --sku-instances [number of instances, eg 1] --sku-name [Pricing tier for example S1] 

Now you can begin to deploy the model! You can create a service with:

 az ml service create realtime --model-file [model file/folder path] -f [scoring file eg score.py] -n [your service name] -r [runtime for the Docker container eg spark-py or python] -c [conda dependencies file for additional python packages] 

Please note that nvidia-docker is not currently available for prediction. Be sure to update the Conda dependencies to remove any GPU-related references, such as tensorflow-gpu.
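For reference, the scoring file passed with -f typically follows an init()/run() convention: init() loads the model once when the container starts, and run() handles each scoring request. A hypothetical minimal sketch (the model loading here is only a placeholder, not the real detector):

 import json

 model = None

 def init():
     # Called once when the service starts: load the trained model here
     global model
     # Placeholder: in the real service this would restore the trained detector
     model = lambda filename: [{"label": "Kittiwake", "score": "0.963",
                                "box": [246, 414, 285, 466]}]

 def run(raw_input):
     # Called for every request; raw_input is the JSON payload sent to the service
     payload = json.loads(raw_input)
     detections = model(payload.get("filename", ""))
     return json.dumps(detections)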

After deploying the service, you can view information about how to use the web service with:

 az ml service usage realtime -i [your service name] 

For example, you can test a service using the curl command:

 curl -X POST -H "Content-Type:application/json" --data !! YOUR DATA HERE !! http://127.0.0.1:32769/score 

Alternative deployment of the scoring web service


Another way to deploy a prediction web service is to run your own instance of the Sanic web server. Sanic is a Flask-like web server for Python 3.5+. Below we use it together with the trained CNTK Faster R-CNN model to build a prediction web service.

Let's see how to build the web service with Sanic. In the main script (app.py), we create the Sanic application and the model server, then define the API routes so that clients can send images for prediction via HTTP POST requests.

 import sys
 from sanic import Sanic
 from sanic.config import Config
 from sanic.response import text, json
 # Server is defined in the project and wraps the trained CNTK model

 app = Sanic(__name__)
 Config.KEEP_ALIVE = False

 server = Server()
 server.set_model()

 @app.route('/')
 async def test(request):
     return text(server.server_running())

 @app.route('/predict', methods=["POST",])
 def post_json(request):
     return json(server.predict(request))

 app.run(host='0.0.0.0', port=80)
 print('exiting...')
 sys.exit(0)

With the web service in place, the next step is the prediction code itself.

In predict.py, the trained model evaluates the image, the results are filtered, and the detections are returned as JSON:

 regressed_rois, cls_probs = evaluate_single_image(eval_model, img_path, cfg)
 bboxes, labels, scores = filter_results(regressed_rois, cls_probs, cfg)

The JSON response is a list of detected objects, each with a label, a confidence score, and a bounding box:

 [{"label": "Kittiwake", "score": "0.963", "box": [246, 414, 285, 466]},...] 

Now that we have a model and a web service wrapping it, we can deploy them. We package the application as a Docker image; first, change to the project directory:

 cd CNTK_faster-rcnn/Detection 

Then create a Dockerfile so that a Docker image can be built for the application:

 FROM hsienting/dl_az

 COPY ./ /app
 ADD run.sh /app/
 RUN chmod +x /app/run.sh

 ENV STORAGE_ACCOUNT_NAME
 ENV STORAGE_ACCOUNT_KEY
 ENV AZUREML_NATIVE_SHARE_DIRECTORY /cmcntk
 ENV TESTIMAGESCONTAINER data

 EXPOSE 80

 ENTRYPOINT ["/app/run.sh"]

Build the Docker image with:

 docker build -t cmcntk . 

Now run the cmcntk Docker image. We mount the host directory containing the CNTK model data to /cmcntk inside the container (the path set as AZUREML_NATIVE_SHARE_DIRECTORY in the Dockerfile) and map port 80 of the host to port 80 of the cmcntk container:

 docker run -v /:/cmcntk -p 80:80 -it cmcntk:latest 

You can now test the web service with curl:

 curl -X POST http://localhost/predict -H 'content-type: application/json' -d '{"filename": ""}' 
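The same call can be made from Python with the requests library; a minimal sketch (the file name here is a hypothetical example that the service is expected to find in its mounted data container):

 import requests

 response = requests.post(
     "http://localhost/predict",
     json={"filename": "example.jpg"},  # hypothetical image name
     timeout=60,
 )
 print(response.json())  # e.g. [{"label": "Kittiwake", "score": "0.963", "box": [...]}]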


The prediction service is now running. What's next? How do client applications access its API? Below we describe how we exposed the service for Conservation Metrics and built a demo client.

Problem


Calling the prediction web service directly from a client application runs into several issues, including cross-origin (CORS) restrictions in the browser and the need to control access to the endpoint.

Solution


We put the prediction API behind Azure API Management, which controls access to the API and lets us configure a CORS policy for it.

Azure API Management


To create the service:
  1. Go to https://ms.portal.azure.com/#create/hub
  2. Search for and select «API management».
  3. Create the API Management service.

Configuration:



API:



Once the service is deployed, add an API to it. Click «+Add API», then select «Blank API». Fill in the API settings: Web service URL is the address of the prediction web service, API URL suffix forms the public URL of the API, and Products determines which API product(s) the API is published under.

Then open the API's «Inbound processing» section and switch to «Code View» to add a CORS policy with the allowed origin URLs:

 <cors>
     <allowed-origins>
         <origin>*</origin>
     </allowed-origins>
 </cors>

Now, using the configured API URL suffix, clients can call the API through API Management instead of hitting the web service directly.

Requests now go through Azure API Management, which handles access control and CORS for the API.


The demo client application works as follows:

  1. Uploads the image to / reads it from Azure storage.
  2. Calls the prediction API through Azure API Management.
  3. Renders the image with the returned bounding boxes.

The demo application code is available on GitHub.

Accessing the API


To «unlock» the API, each request must include a subscription key in the Ocp-Apim-Subscription-Key header. Subscription keys are tied to the API products under which the API is published.

To get a subscription key:

  1. Open the «Publisher Portal» of the API Management service.
  2. Find the subscription for the product the API is published under.
  3. Copy the subscription key and pass it with every request, as shown below.

Then pass the Ocp-Apim-Subscription-Key header when calling the API from the client code:

 <span class="hljs-keyword">export</span> <span class="hljs-keyword">async</span> <span class="hljs-function"><span class="hljs-keyword">function</span> <span class="hljs-title">cntk</span>(<span class="hljs-params">filename</span>) </span>{ <span class="hljs-keyword">return</span> fetch(<span class="hljs-string">'/tensorflow/'</span>, { method: <span class="hljs-string">'post'</span>, headers: { Accept: <span class="hljs-string">'application/json'</span>, <span class="hljs-string">'Content-Type'</span>: <span class="hljs-string">'application/json'</span>, <span class="hljs-string">'Cache-Control'</span>: <span class="hljs-string">'no-cache'</span>, <span class="hljs-string">'Ocp-Apim-Trace'</span>: <span class="hljs-string">'true'</span>, <span class="hljs-string">'Ocp-Apim-Subscription-Key'</span>: , }, body: <span class="hljs-built_in">JSON</span>.stringify({ filename, }), }) } 


Let's visualize the results. The demo page renders the image on an HTML canvas and draws the predicted bounding boxes on top of it:

 <span class="xml"><span class="hljs-tag"><<span class="hljs-title">body</span>></span> <span class="hljs-tag"><<span class="hljs-title">canvas</span> <span class="hljs-attribute">id</span>=<span class="hljs-value">'myCanvas'</span>></span><span class="hljs-tag"></<span class="hljs-title">canvas</span>></span> <span class="hljs-tag"><<span class="hljs-title">script</span>></span><span class="javascript"> <span class="hljs-keyword">const</span> imageUrl = <span class="hljs-string">"some image URL"</span>; cntk(imageUrl).then(labels => { <span class="hljs-keyword">const</span> canvas = <span class="hljs-built_in">document</span>.getElementById(<span class="hljs-string">'myCanvas'</span>) <span class="hljs-keyword">const</span> image = <span class="hljs-built_in">document</span>.createElement(<span class="hljs-string">'img'</span>); image.setAttribute(<span class="hljs-string">'crossOrigin'</span>, <span class="hljs-string">'Anonymous'</span>); image.onload = () => { <span class="hljs-keyword">if</span> (canvas) { <span class="hljs-keyword">const</span> canvasWidth = <span class="hljs-number">850</span>; <span class="hljs-keyword">const</span> scale = canvasWidth / image.width; <span class="hljs-keyword">const</span> canvasHeight = image.height * scale; canvas.width = canvasWidth; canvas.height = canvasHeight; <span class="hljs-keyword">const</span> ctx = canvas.getContext(<span class="hljs-string">'2d'</span>); <span class="hljs-comment">// render image on convas and draw the square labels</span> ctx.drawImage(image, <span class="hljs-number">0</span>, <span class="hljs-number">0</span>, canvasWidth, canvasHeight); ctx.lineWidth = <span class="hljs-number">5</span>; labels.forEach((label) => { ctx.strokeStyle = label.color || <span class="hljs-string">'black'</span>; ctx.strokeRect(label.x, label.y, label.width, label.height); }); } }; image.src = imageUrl; }); </span><span class="hljs-tag"></<span class="hljs-title">script</span>></span> <span class="hljs-tag"></<span class="hljs-title">body</span>></span> <span class="hljs-tag"></<span class="hljs-title">html</span>></span> </span> 

As a result, we get the image with the detected birds marked:



The full demo code is available on GitHub.

Conclusion


In this article, we covered data labeling with VOTT, training Faster R-CNN bird detection models with CNTK and TensorFlow in Azure ML Workbench, comparing training runs, and deploying the resulting model as a prediction web service.


We remind you that you can try Azure for free.

Source: https://habr.com/ru/post/342056/

