
AI, practical course. Project planning



This article will focus on:


Some tasks are presented with a focus on group work. If you are working alone, you can skip these sections or complete the corresponding tasks yourself, combining several roles.

As an educational project for this series of articles, we chose an application that recognizes emotions in uploaded images using an image processing algorithm (emotion recognition), creates musical accompaniment that suits the recognized emotions, and then edits a video combining the images and the music.

Project design


Designing a project begins with defining its scope, that is, what you will create and how. This removes ambiguity during development.

For example:


You need to answer each of these questions before you move on to the implementation phase. You also need a methodology that helps you formulate your own questions about the future application, so that you do not miss any important aspects.

Project Analysis Methods


We recommend using these three simple but effective techniques for analyzing a project and formulating tasks.


Analysis of video editing applications


Applying hierarchical decomposition to our project, we see that the video editing application, like any other application involving user interaction, can be divided into:

  1. the interface part, i.e. the component that interacts directly with the user;
  2. the software and hardware part, i.e. the component where the interesting AI processing takes place and which, in turn, can be divided into:
    • the emotion recognition component;
    • the music creation component.

Across the interface part and the software and hardware part, there are three main components: the user interface (blue), emotion recognition (orange), and the creation of musical accompaniment (green), as shown in Fig. 1.


Fig. 1. Scheme of the application for video editing.
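The decomposition above can also be written down as a small tree, which makes it easy to print and to check how deep it goes. This is only a sketch: the `Component` class and the helper names are assumptions made for illustration, not part of the project.

```python
from dataclasses import dataclass, field


@dataclass
class Component:
    """One node of the project decomposition hierarchy."""
    name: str
    children: list["Component"] = field(default_factory=list)


def depth(node: Component) -> int:
    """Number of levels in the subtree rooted at `node`."""
    return 1 + max((depth(child) for child in node.children), default=0)


def outline(node: Component, indent: int = 0) -> list[str]:
    """Render the hierarchy as indented lines, one per component."""
    lines = ["  " * indent + node.name]
    for child in node.children:
        lines.extend(outline(child, indent + 1))
    return lines


app = Component("video editing application", [
    Component("user interface"),
    Component("software and hardware part", [
        Component("emotion recognition"),
        Component("music creation"),
    ]),
])

print("\n".join(outline(app)))
```

Keeping the hierarchy as data lets a team extend it as the decomposition is refined.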

User interface


The user interacts with this component, using buttons to upload images, start the video editing process, and share or download the result. Although the user interface is an important element, in this series of educational materials we will focus on intelligent information processing in the software and hardware part using new technologies from Intel.

Experience the benefits of an AI-based video editing app.

Step 1. Recognizing emotions


Step 2. Video editing


Step 3. Sharing


Emotion recognition by hardware and software (image processing)


This component is responsible for assigning emotions to images. Every intelligent data-processing component involves two processes:

  1. the process of training the component, i.e. the training phase;
  2. the process of applying the component, i.e. the application and testing phase.

Each of these processes is associated with a number of standard AI tasks.

During the training phase, you supply the component with labeled data (inputs with known answers) and teach the machine the "rule" for assigning the appropriate answers to input data from the same set or distribution. The tasks of the training phase include:

  1. define input and output data for each stage of data processing;
  2. find or create a machine learning dataset that matches specific input and output data;
  3. train a machine learning model;
  4. evaluate the machine learning model.

During the application and testing phase, the trained model is used to predict answers for previously unseen objects drawn from the same population or distribution. Tasks include:

  1. define input and output data for each stage of data processing;
  2. deploy a machine learning model.
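The two phases can be illustrated with a deliberately tiny example. This is a sketch under loud assumptions: the two-dimensional feature vectors are invented, and a nearest-centroid classifier stands in for the real emotion recognition model, which is not specified at this point in the article.

```python
from collections import defaultdict
from math import dist

# Toy labeled data: (feature vector, emotion label). The features are made up
# for illustration; a real model would work on images.
TRAIN = [
    ((0.9, 0.8), "joy"), ((0.8, 0.9), "joy"),
    ((0.1, 0.2), "sadness"), ((0.2, 0.1), "sadness"),
]
TEST = [((0.85, 0.75), "joy"), ((0.15, 0.25), "sadness")]


def train(samples):
    """Training phase: learn one centroid per label from labeled samples."""
    grouped = defaultdict(list)
    for features, label in samples:
        grouped[label].append(features)
    return {
        label: tuple(sum(axis) / len(axis) for axis in zip(*vectors))
        for label, vectors in grouped.items()
    }


def predict(model, features):
    """Application phase: assign the label of the nearest centroid."""
    return min(model, key=lambda label: dist(model[label], features))


def evaluate(model, samples):
    """Accuracy: share of held-out samples whose label is predicted correctly."""
    hits = sum(predict(model, x) == y for x, y in samples)
    return hits / len(samples)


model = train(TRAIN)
accuracy = evaluate(model, TEST)
print(predict(model, (0.7, 0.9)), accuracy)
```

The point is the shape of the workflow: `train` sees inputs together with answers, while `predict` and `evaluate` only ever apply the learned rule to data the model has not seen.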

In a broader context, the tasks and steps typical of almost every AI project can be described using the CRISP-DM methodology. Within this methodology, a data mining or AI project is represented as a cycle of six phases, as shown in Fig. 2. Each phase includes various tasks and subtasks, for example data labeling, model evaluation, and feature engineering. The process is cyclic because any real intelligent system can always be improved.


Fig. 2. Relationships between the different phases of CRISP-DM.

Moving from theory to practice, let's formulate the input and output data for training and testing the emotion recognition model.

We have six different emotions: anxiety, sadness, joy, calm, determination and fear.


Training


Testing


Next, you need to see how the project fits into the big picture, and apply system analysis methods to eliminate inconsistencies:

  1. The user will upload multiple images. Therefore, we must decide how, from the emotions recognized for each individual image, to get a single emotional mood for the entire video.
  2. Images will be accompanied by music. Therefore, we must understand how a melody will be created on the basis of a particular emotion.
  3. The content of the images must be synchronized with the music, and the transitions between the images must be smooth so that the user can enjoy watching.

To begin, we will create the music for one image. We will return to several images after we consider the component of creating musical accompaniment and understand how to integrate it with the component of emotion recognition.

Creation of musical accompaniment by software and hardware


This component produces a melody in response to an emotion. To connect the music creation component with the image processing component, we need a link between the emotion code and the audio signal:

  1. Take a random well-known song from a previously created database of famous songs.
  2. Adjust the tempo, scale, and rhythm of the melody to match the emotion using a simple script.
  3. Start the machine-learning-based music creation process, using the base melody tuned to the emotion as input.

The music creation process should complete the emotion-tuned base melody by choosing the most natural-sounding next note, having learned from a large corpus of songs.
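Step 2 above, the modulation script, might look roughly like the following sketch. Everything specific here is an assumption made for illustration: the melody representation (MIDI pitch plus duration in beats), the `EMOTION_SETTINGS` values, and the decision to express an emotion only through tempo and a switch from major to the parallel natural minor. The note list is an approximate transcription of the opening of "Jingle Bells".

```python
# Hypothetical per-emotion settings; the values are illustrative, not tuned.
EMOTION_SETTINGS = {
    "joy": {"tempo_bpm": 140, "mode": "major"},
    "calm": {"tempo_bpm": 80, "mode": "major"},
    "fear": {"tempo_bpm": 100, "mode": "minor"},
}

# Semitone offsets above the tonic that are lowered by one semitone when a
# major melody is moved to the parallel natural minor (3rd, 6th, 7th degrees).
MAJOR_TO_MINOR = {4, 9, 11}


def modulate(melody, emotion, tonic=60):
    """Return (notes, tempo_bpm) for a major-key melody tuned to `emotion`.

    `melody` is a list of (midi_pitch, duration_in_beats) pairs.
    """
    settings = EMOTION_SETTINGS[emotion]
    notes = []
    for pitch, beats in melody:
        if settings["mode"] == "minor" and (pitch - tonic) % 12 in MAJOR_TO_MINOR:
            pitch -= 1
        notes.append((pitch, beats))
    return notes, settings["tempo_bpm"]


def duration_seconds(notes, tempo_bpm):
    """Total playing time: one beat lasts 60 / tempo_bpm seconds."""
    return sum(beats for _, beats in notes) * 60 / tempo_bpm


# Opening of "Jingle Bells" in C major: E E E, E E E, E G C D E.
JINGLE_BELLS = [(64, 1), (64, 1), (64, 2), (64, 1), (64, 1), (64, 2),
                (64, 1), (67, 1), (60, 1.5), (62, 0.5), (64, 4)]

fear_notes, fear_tempo = modulate(JINGLE_BELLS, "fear")
print(fear_notes[:3], fear_tempo)
```

The modulated notes and tempo would then be handed to the music creation model as the seed melody to continue.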

As in the case of the emotion recognition component, we need to define the input and output data for learning and testing the music creation model.

Training

Testing


Given fixed APIs for the music creation component, we need to implement the emotion-dependent modulation script and complete all the AI tasks related to creating the musical accompaniment, including finding a dataset to train the music creation model and finding base melodies. We will discuss these aspects in later articles. For our educational project we will use the following:


Component connection


Options for combining images and melodies are presented below with their own advantages and disadvantages.



Option A: One base tune is modulated according to the prevailing emotion for all images.
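The article does not specify how the prevailing emotion is chosen. One simple possibility, shown here purely as an assumption, is a majority vote over the emotions recognized for the individual images, with ties going to the emotion that appears first.

```python
from collections import Counter


def prevailing_emotion(emotions):
    """Most frequent emotion; ties go to the emotion seen first."""
    if not emotions:
        raise ValueError("at least one recognized emotion is required")
    # most_common keeps first-encountered order among equal counts.
    return Counter(emotions).most_common(1)[0][0]


print(prevailing_emotion(["joy", "fear", "joy", "calm"]))
```

A weighted vote, for example by recognition confidence, would be a natural refinement if the emotion recognition API exposes such scores.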




Option B: One base tune is modulated separately for each emotion found on the images.




Option C: Different base tunes are modulated separately for each emotion found in the images.


For example, suppose we have three images with different emotions (joy, calm, and fear). We use "Jingle Bells" as the base song and BachBot to create the music. The result is three songs based on three versions of "Jingle Bells" (the first for joy, the second for calm, and the third for fear). Each incoming image is processed separately by calling the emotion recognition API, and one base song is used for all images. The base song is modulated for each emotion, and each image then gets the version of the song modulated to fit its emotion.
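The flow in this example can be sketched as follows. The function `assign_tracks`, the stand-in `fake_modulate`, and the file names are hypothetical; in the real project the callable would wrap the modulation script and the music creation model.

```python
def assign_tracks(image_emotions, base_song, modulate):
    """Map each image to a version of `base_song` modulated for its emotion.

    `image_emotions` maps image names to recognized emotions. `modulate` is
    any callable (song, emotion) -> track; each emotion is modulated once.
    """
    versions = {}
    assignment = {}
    for image, emotion in image_emotions.items():
        if emotion not in versions:
            versions[emotion] = modulate(base_song, emotion)
        assignment[image] = versions[emotion]
    return assignment


def fake_modulate(song, emotion):
    """Stand-in for the modulation script and the music creation model."""
    return f"{song} ({emotion} version)"


tracks = assign_tracks(
    {"img1.jpg": "joy", "img2.jpg": "calm", "img3.jpg": "fear", "img4.jpg": "joy"},
    "Jingle Bells",
    fake_modulate,
)
print(tracks)
```

Caching one version per emotion means that two images sharing an emotion reuse the same track instead of triggering music creation twice.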

For more information, see:


General recommendations for project management


To decompose a project, do the following:

  1. determine the structure of the upper level (up to three levels of decomposition);
  2. talk to each team member separately (this is usually done by the system architect) to:
    • learn about the capabilities and requirements of the respective components;
    • define official API requirements;
    • refine the project decomposition hierarchy. In this case, the smallest integration details will be revealed at an early stage of the project, thereby reducing the amount of work and improvements at the integration stage.

Below are general guidelines that will help you draw up your project plan.


In subsequent articles, we will consider in detail all the tasks for our educational project.

Checklist of decomposition elements and resources for an AI project


Below is a list of typical tasks for AI projects. Use it as a template for your project.

  1. Formulate a business problem by defining input and output data, for example, objects and labels or target variables.
  2. Analyze the data.
    • Take a sample of the data.
    • Perform exploratory data analysis.
  3. Clean the data.
    • Delete duplicate data, extraneous data, etc.
    • Normalize feature values.
  4. Create a test model to try out the evaluation methodology.
  5. Develop a machine learning model.
  6. Prepare a machine learning dataset.
    • Collect source data.
    • Find a suitable data set.
    • Configure the storage infrastructure.
  7. If labels are missing, annotate the source data.
    • Write annotation guidelines.
    • Perform the annotation process.
    • Check the quality of the annotation.
  8. Train a machine learning model.
    • Select a library.
      • Perform a comparative analysis of existing libraries.
      • Install and configure the most appropriate library.
    • Select and configure the infrastructure for machine learning.
      • Perform a comparative analysis of private and public cloud environments and performance technologies.
      • Perform resource planning to achieve your machine learning goals.
    • Select an algorithm.
    • Create a prototype of the algorithm.
    • Refine your model by tuning hyperparameters and adding improvements specific to the subject area.
  9. Perform a model evaluation.
  10. Deploy the model.
    • Determine the nature of service level agreements (SLAs) for machine learning APIs.
    • Wrap the machine learning model in an API (package it in a container or reimplement it in a more efficient programming language).
    • Test the API under load to see if it complies with the terms of service level agreements (SLA).
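The last item, checking an API against its SLA, can be sketched as a percentile latency check. This is an illustration under assumptions: the helper names and thresholds are invented, requests are sent sequentially rather than under real concurrent load, and the nearest-rank percentile is just one common definition.

```python
import time


def percentile(values, pct):
    """Nearest-rank percentile for an integer `pct` between 1 and 100."""
    ordered = sorted(values)
    rank = max(1, (pct * len(ordered) + 99) // 100)  # integer ceiling
    return ordered[rank - 1]


def measure_latencies(handler, requests):
    """Call `handler` once per request and record wall-clock seconds."""
    latencies = []
    for request in requests:
        start = time.perf_counter()
        handler(request)
        latencies.append(time.perf_counter() - start)
    return latencies


def meets_sla(latencies, pct, limit_seconds):
    """True if the pct-th percentile latency is within the agreed limit."""
    return percentile(latencies, pct) <= limit_seconds


# Stand-in for a call to the deployed model API.
measured = measure_latencies(lambda request: sum(range(1000)), range(50))
print(len(measured), meets_sla(measured, 95, 1.0))
```

A realistic load test would issue concurrent requests with a dedicated tool, but the pass or fail criterion can stay the same.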

Conclusion


In this article, we looked at three popular system analysis methods and applied them to the project of creating a video editing application. Having performed a hierarchical decomposition, we identified three main components of the application: the user interface, emotion recognition, and the creation of musical accompaniment. We analyzed in detail the AI components for emotion recognition and musical accompaniment creation, including the definition of input and output data, training and testing, and the integration of the components. Finally, we gave recommendations on project planning and shared a set of standard AI tasks based on the CRISP-DM methodology, suitable for any AI project.

Source: https://habr.com/ru/post/358506/

