Determining watermelon ripeness with Keras: the full cycle, from idea to app on Google Play

How it all began

It all started with the Apple App Store: I found an app there that determines the ripeness of a watermelon. The app is... strange. Consider, for instance, its suggestion to knock on the watermelon not with your knuckles but... with the phone! Nevertheless, I wanted to repeat this achievement on the more familiar Android platform.

Selection of tools

Our task can be solved in several ways, and to be honest, I had to make an effort not to go the “simple” way, that is, to take the Fourier transform, wavelets and a signal editor. However, I wanted to gain experience with neural networks, so let the network analyze the data.

Keras was chosen as the library for creating and training neural networks. It is a high-level layer on top of TensorFlow and Theano. In general, if you are just starting to work with deep learning networks, you will not find a better tool. On the one hand, Keras is a powerful tool optimized for speed, memory and hardware (yes, it can run on GPUs and GPU clusters). On the other hand, everything that can be “hidden” from the user is hidden, so you don’t have to wrestle with connecting neural network layers, for example. Very convenient.

Both Keras and neural networks in general require knowledge of Python. This language, like a snake, has wrapped itself around... sorry, that one hurts. In short, you cannot get anywhere in modern deep learning without Python. Fortunately, Python can be learned in two weeks, a month at most.

For Python you will need a few more libraries, but these are trifles, at least if you have already coped with Python itself. You will need a (very superficial) acquaintance with NumPy, PyPlot, and possibly a couple of other libraries, from which we will take literally a couple of functions. It is not difficult. Honestly.

Finally, I note that we will not need the GPU clusters mentioned above: our task is solved just fine on an ordinary CPU. Slowly, but not critically so.

Work plan

First you need to create a neural network, in Python and Keras, under Ubuntu. You can use an Ubuntu emulator. You can also do it under Windows, but the extra time spent would be enough for you to learn the aforementioned Ubuntu and then work under it.

The next step is to write the app. I plan to do this in Java for Android. It will be a prototype, in the sense that it will have a user interface but no neural network yet.

What is the point of writing a "dummy", you ask? Here it is: any data-analysis task sooner or later comes down to obtaining data for training. Indeed, how many watermelons need to be tapped and tasted so that the neural network can build a reliable model on this data? A hundred? More?

Here our app will help us: we upload it to Google Play, hand it out to (okay, force it on, twisting arms) all our friends unlucky enough to own an Android phone, and the data begins to trickle in... and, by the way, where to?

The next step is to write a server program that receives data from our Android client. True, this server program is very simple; I finished it in about twenty minutes. Nevertheless, it is a separate stage.

Finally, there is enough data. We train the neural network.

We port the neural network to Java and release an update of our program.

Profit! Well, actually, no. The app was free. Only experience and a few bruises.

Creating a neural network

Working with audio (and tapping on a watermelon is, of course, audio) means either recurrent neural networks or the so-called one-dimensional convolutional networks. Moreover, convolutional networks have recently been taking the lead, displacing recurrent ones. The idea of a convolutional network is that a window slides along the data array (the "sound intensity vs. time" graph), and instead of analyzing hundreds of thousands of samples at once, we work only with what falls into the window. The following layers combine and analyze the results of this layer.

To make it clearer, imagine that you need to find a seagull in a photo of a seascape. You scan the picture: the “window” of your attention moves along imaginary rows and columns in search of a white check mark. This is how a 2D convolutional network works. A one-dimensional network scans along a single coordinate, which is the optimal choice when we are dealing with an audio signal.

I note, however, that you need not limit yourself to 1D networks. As an exercise, I plotted the sound and analyzed the resulting bitmap as a picture using a 2D convolutional network. To my surprise, the result was no worse than with the "raw one-dimensional" data.
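
To make the sliding window concrete, here is a minimal pure-Python sketch of what a single 1D convolutional filter computes. The function name conv1d_valid, the toy signal and the toy kernel are made up for illustration; Keras does the same thing with many learned filters at once, and much faster.

```python
def conv1d_valid(signal, kernel, stride=1):
    """Slide `kernel` along `signal`; return one dot product per window position.

    No padding is added ('valid' mode), so the window never leaves the signal.
    """
    k = len(kernel)
    out = []
    for start in range(0, len(signal) - k + 1, stride):
        window = signal[start:start + k]
        out.append(sum(w * x for w, x in zip(kernel, window)))
    return out


# Toy "recording": silence, a short tap, silence again (illustrative values).
signal = [0, 0, 1, 3, 1, 0, 0, 0]
# Toy filter that responds most strongly to a peak flanked by smaller values.
kernel = [1, 2, 1]

print(conv1d_valid(signal, kernel))            # one value per window position
print(conv1d_valid(signal, kernel, stride=3))  # a larger step gives fewer values
```

The response is largest where the window sits exactly over the tap, which is the sense in which a filter "finds" a pattern. A stride greater than one, as in the first layer of the network in this article, simply skips window positions.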

The network used had the following structure:

    from keras.models import Sequential
    from keras.layers import (Activation, Conv1D, Dense, Dropout, Flatten,
                              MaxPooling1D)

    # nSampleSize (samples per recording) and nNumOfOutputs (2: sweetness
    # and ripeness) are defined elsewhere in the training script.
    model = Sequential()
    # First layer: 32 wide filters (512 samples each), sliding with a step of 3.
    model.add(Conv1D(filters=32, kernel_size=512, strides=3,
                     padding='valid', use_bias=False,
                     input_shape=(nSampleSize, 1), name='c1d',
                     activation='relu'))
    # No-op: 'c1d' already applies ReLU.
    model.add(Activation('relu'))
    model.add(MaxPooling1D(pool_size=2))
    model.add(Conv1D(32, 3))
    model.add(Activation('relu'))
    model.add(MaxPooling1D(pool_size=2))
    model.add(Conv1D(64, 3))
    model.add(Activation('relu'))
    model.add(MaxPooling1D(pool_size=2))
    model.add(Flatten())
    model.add(Dense(64))
    model.add(Activation('relu'))
    model.add(Dropout(0.5))
    model.add(Dense(nNumOfOutputs))
    # Sigmoid keeps the outputs in [0, 1], the same range as the scaled labels.
    model.add(Activation('sigmoid'))
    model.compile(loss='mean_squared_error', optimizer='adam',
                  metrics=['accuracy'])

This network has two outputs (it predicts two values): sweetness and ripeness. Sweetness is 0 (unsweetened), 1 (normal) or 2 (excellent); ripeness is, respectively, 0 (too hard), 1 (just right) or 2 (overripe, like cotton wool with sand).

The scores for the samples are assigned by a person; exactly how, we'll discuss in the section on the Android app. The task of the neural network is to predict what score a person would give a given watermelon (from a recording of the tapping).
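
It can help to trace how the layers above shrink the signal. The sketch below applies the standard output-length formulas: floor((n - k) / s) + 1 for a 'valid' convolution with kernel size k and stride s, and floor(n / 2) for pooling with pool size 2. The input length is an assumption: the article does not give nSampleSize, so 5 seconds of audio at a hypothetical 44,100 Hz sample rate is used purely for illustration.

```python
def conv_out_len(n, kernel_size, stride=1):
    """Output length of a 1D convolution with padding='valid'."""
    return (n - kernel_size) // stride + 1


def pool_out_len(n, pool_size=2):
    """Output length of 1D max pooling whose stride equals the pool size."""
    return n // pool_size


def trace_lengths(n):
    """Return (layer, length) pairs for the conv/pool stack of the network."""
    steps = [("input", n)]
    n = conv_out_len(n, 512, stride=3)
    steps.append(("c1d", n))
    n = pool_out_len(n)
    steps.append(("pool 1", n))
    n = conv_out_len(n, 3)
    steps.append(("conv 2", n))
    n = pool_out_len(n)
    steps.append(("pool 2", n))
    n = conv_out_len(n, 3)
    steps.append(("conv 3", n))
    n = pool_out_len(n)
    steps.append(("pool 3", n))
    return steps


# Assumption for illustration only: 5 s of audio at 44,100 Hz.
for name, length in trace_lengths(5 * 44100):
    print(f"{name:>7}: {length}")
```

The Flatten layer then sees the last length multiplied by 64 (the number of filters in the last convolution), which is what determines the size of the first Dense layer.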

Writing a program

I have already mentioned that the app should come out in two versions. The first, preliminary one honestly warns the user that its predictions are complete nonsense. But it lets the user record a knock on a watermelon, rate the taste of that watermelon and send it all over the Internet to the author of the app. That is, the first version simply collects data.

Here is the app's page on Google Play; of course, the app is free.

What it does:

1. Press the button with the microphone and the recording begins. You have five seconds to knock on the watermelon three times: tap-tap-tap. The button with the watermelon makes a "prediction"; we won't touch it yet.

Note: if Google Play still has the old version, recording and prediction are combined in the watermelon button, and there is no microphone button.

2. The saved file is temporary and will be overwritten the next time you press the record button. This lets you repeat the tapping if someone talks over it (you can’t imagine how hard it is to make the people around you shut up for five seconds!), or if water is running, dishes are clinking, or the neighbor is drilling...

But now the watermelon is chosen and bought. You brought it home, recorded the sound and cut it open. Now you are ready to rate its taste. Select the Save tab.

On this tab we see two combo boxes for rating: sweetness and ripeness (translation of the interface is underway). Once you have set the ratings, click Save.

Attention! Save can be clicked only once, so set the ratings first. When you press the button, the sound file is renamed, and it will no longer be overwritten by the next recording.

3. Finally, having recorded (and therefore eaten) about a dozen watermelons, you return from the dacha, where you had no Internet. Now you have Internet. Open the Submit tab and press the button. The package (with a dozen watermelons) goes to the developer's server.

Writing a server program

Everything is simple here, so I'd rather give the full code of the script. The program “catches” files, gives them unique names and puts them in a directory accessible only to the site owner.

    <?php
    if (isset($_FILES['file']) && is_uploaded_file($_FILES['file']['tmp_name'])) {
        $uploads_dir = './melonaire/';
        $tmp_name = $_FILES['file']['tmp_name'];
        // date('u') always yields 000000, so a hash of the timestamp repeats
        // within one second; random bytes (PHP 7+) keep the names unique.
        $filename = bin2hex(random_bytes(16));
        move_uploaded_file($tmp_name, $uploads_dir . $filename);
    } else {
        echo "File not uploaded successfully.";
    }
    ?>

Neural network training

The data are split into training and test sets, 70 and 30 percent respectively. The neural network converges. There are no surprises here; however, a word for beginners: do not forget to normalize the input data, it will save you a lot of nerves. Something like this:

    for file_name in os.listdir(path):
        nSweetness, nRipeness, arr_loaded = loadData(file_name)
        # Scale each recording so that its peak absolute amplitude is 1
        arr_data.append(arr_loaded / max(abs(arr_loaded)))
        # 2 is the highest combo-box index (options 0, 1, 2),
        # so the labels end up in the range [0, 1]
        arr_labels.append([nSweetness / 2.0, nRipeness / 2.0])
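
A minimal pure-Python sketch of the two preparation steps described above: a shuffled 70/30 split and peak normalization. The helper names (normalize_peak, split_train_test), the fixed seed and the guard for an all-zero recording are my assumptions for illustration, not the author's code.

```python
import random


def normalize_peak(samples):
    """Scale a recording so that its largest absolute value becomes 1.0."""
    peak = max(abs(x) for x in samples)
    if peak == 0:
        # A silent recording cannot be scaled; return it unchanged.
        return list(samples)
    return [x / peak for x in samples]


def split_train_test(items, train_fraction=0.7, seed=0):
    """Shuffle a copy of `items` and cut it into training and test parts."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    cut = round(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


# Toy recordings with very different loudness (illustrative values).
recordings = [[0, 200, -400, 100], [5, -10, 5, 0], [0, 0, 0, 0]]
normalized = [normalize_peak(r) for r in recordings]
print(normalized[0])

train, test = split_train_test(range(10))
print(len(train), len(test))
```

After normalization, a quiet tap and a loud tap occupy the same amplitude range, so the network does not have to learn to ignore how hard or how close to the microphone someone knocked.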

Porting the neural network

There are several ways to port a network from Python to Java. Recently Google has made this process more convenient, so when you read tutorials, make sure they are not outdated. Here's how I did it:

    import os
    import sys

    import tensorflow as tf
    from keras import backend as K
    from keras.models import load_model

    # Note: this script uses the TensorFlow 1.x graph API.


    def print_graph_nodes(filename):
        """Print graph nodes whose names mark input, output and learning phase."""
        g = tf.GraphDef()
        with open(filename, 'rb') as f:
            g.ParseFromString(f.read())
        print()
        print(filename)
        print("=======================INPUT=========================")
        print([n for n in g.node if n.name.find('input') != -1])
        print("=======================OUTPUT========================")
        print([n for n in g.node if n.name.find('output') != -1])
        print("===================KERAS_LEARNING=====================")
        print([n for n in g.node if n.name.find('keras_learning_phase') != -1])
        print("======================================================")
        print()


    def get_script_path():
        return os.path.dirname(os.path.realpath(sys.argv[0]))


    def keras_to_tensorflow(keras_model, output_dir, model_name,
                            out_prefix="output_", log_tensorboard=True):
        """Freeze a Keras model into a TensorFlow .pb graph file."""
        if not os.path.exists(output_dir):
            os.mkdir(output_dir)

        # Give every model output a predictable node name: output_1, output_2, ...
        out_nodes = []
        for i in range(len(keras_model.outputs)):
            out_nodes.append(out_prefix + str(i + 1))
            tf.identity(keras_model.outputs[i], out_prefix + str(i + 1))

        sess = K.get_session()

        from tensorflow.python.framework import graph_util, graph_io
        init_graph = sess.graph.as_graph_def()
        # Replace variables with constants so the graph file is self-contained.
        main_graph = graph_util.convert_variables_to_constants(
            sess, init_graph, out_nodes)
        graph_io.write_graph(main_graph, output_dir, name=model_name,
                             as_text=False)

        if log_tensorboard:
            from tensorflow.python.tools import import_pb_to_tensorboard
            import_pb_to_tensorboard.import_to_tensorboard(
                os.path.join(output_dir, model_name), output_dir)


    model = load_model(get_script_path() + "/models/model.h5")

    # Run once to produce converted.pb, then comment out again:
    # keras_to_tensorflow(model, output_dir=get_script_path() + "/models",
    #                     model_name="converted.pb")

    print_graph_nodes(get_script_path() + "/models/converted.pb")

Pay attention to the last line: in the Java code you will need to specify the names of the network's input and output. This “print” prints exactly those.

So, we put the resulting converted.pb file into the assets directory of the Android Studio project, connect the tensorflowinferenceinterface library, and that's all.

That's it. When I did this for the first time, I expected it to be difficult, but... it worked on the first attempt.

Here is the call to the neural network from Java code:

    protected Void doInBackground(Void... params) {
        try {
            // Pass the input to TensorFlow; dims are (batch, samples, channels)
            tf.feed(INPUT_NAME, m_arrInput,
                    1,                  // batch size
                    m_arrInput.length,  // number of samples
                    1);                 // channels
            // Compute predictions
            tf.run(new String[]{OUTPUT_NAME});
            // Copy the output into the predictions array
            tf.fetch(OUTPUT_NAME, m_arrPrediction);
        } catch (Exception e) {
            // Report the failure instead of silently discarding it
            e.printStackTrace();
        }
        return null;
    }

Here m_arrPrediction is an array of two elements containing, ta-da, our prediction, normalized from zero to one.
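
Since the training labels were obtained by dividing the 0..2 grades by 2.0, the two outputs can be mapped back to grades by multiplying by 2 and rounding. The sketch below shows that mapping in Python, for consistency with the training code (in the app the same arithmetic would be done in Java). The function names, the clipping, the output order [sweetness, ripeness] (taken from the label order in the training loop) and the paraphrased grade names are an illustrative reading, not the author's code.

```python
# Grade names paraphrase the descriptions of the two network outputs.
SWEETNESS = ["unsweetened", "normal", "excellent"]
RIPENESS = ["too hard", "just right", "overripe"]


def to_grade(value):
    """Map a network output in [0, 1] back to a grade 0, 1 or 2."""
    clipped = min(max(value, 0.0), 1.0)
    # Adding 0.5 and truncating rounds halves up (round() would round 0.5 to 0).
    return int(clipped * 2 + 0.5)


def describe(prediction):
    """Turn [sweetness, ripeness] outputs into human-readable grades."""
    sweetness, ripeness = (to_grade(v) for v in prediction)
    return SWEETNESS[sweetness], RIPENESS[ripeness]


# Hypothetical network output, for illustration only.
print(describe([0.91, 0.48]))
```

Rounding to the nearest grade is only one possible choice; the raw values could also be shown to the user as a rough confidence.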

Conclusion

Here, it seems, I should thank you for your attention and express the hope that it was interesting. Instead, I'll note that the version on Google Play is the first version of the app. The second is completely ready, but there is too little data. So, if you like watermelons, please install the app on your Android phone. The more data you send, the better the second version will work...

Of course, it will be free.

Good luck, and yes: thank you for your attention. I hope it was interesting.

Important update: a new version has been released with improved analysis. Thanks to everyone who sent watermelons, and please send more!

Source: https://habr.com/ru/post/424099/