interviewer : Hello, can I get you coffee or anything? Do you need a break?
me : No, I think I've had enough coffee already!
interviewer : Great, great. How do you feel about writing code on the whiteboard?
me : I just write the code!
interviewer : ...
me : That was a joke.
interviewer : OK, so do you know the fizz buzz problem?
me : ...
interviewer : Is that a yes or a no?
me : It's more of an "I can't believe you're asking me that."
interviewer : OK, so I need you to print the numbers from 1 to 100, except that if a number is divisible by 3 you print "fizz", if it is divisible by 5 you print "buzz", and if it is divisible by 15 you print "fizzbuzz".
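(For reference, the conventional solution, presumably the one the interviewer expects, is just a loop and a few modulo checks:)

for i in range(1, 101):
    if i % 15 == 0:
        print("fizzbuzz")
    elif i % 3 == 0:
        print("fizz")
    elif i % 5 == 0:
        print("buzz")
    else:
        print(i)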
me : I know the problem.
interviewer : Great. Candidates who can't solve it don't do well here.
me : ...
interviewer : Here is a marker and an eraser.
me : [thinks for a couple of minutes]
interviewer : Do you need help getting started?
me : No, no, I'm fine. Let's start with a couple of standard imports:
import numpy as np
import tensorflow as tf
interviewer : Um, you do understand that the problem is fizz buzz, right?
me : Yes. Let's discuss the model. I'm thinking a simple multilayer perceptron with one hidden layer should work here.
interviewer : A perceptron?
me : Or a neural network, whatever you prefer to call it. We want a number to come in and the correct "fizz buzz" representation of that number to come out. In particular, we need to turn each input into a vector of "activations". One simple way is to convert it to binary.
interviewer : Binary representation?
me : Yeah, you know, ones and zeros? Something like this:
def binary_encode(i, num_digits):
    return np.array([i >> d & 1 for d in range(num_digits)])
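(A quick check of the bit order, assuming the definition above; it encodes the least significant bit first:)

binary_encode(9, 4)    # array([1, 0, 0, 1]): 9 = 1 + 8, lowest bit first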
interviewer : [stares at the board for a minute]
me : And our output will be a one-hot encoding of the fizz buzz representation of the number, where the first position means "print the number as is", the second means "fizz", and so on.
def fizz_buzz_encode(i):
    if   i % 15 == 0: return np.array([0, 0, 0, 1])
    elif i % 5  == 0: return np.array([0, 0, 1, 0])
    elif i % 3  == 0: return np.array([0, 1, 0, 0])
    else:             return np.array([1, 0, 0, 0])
interviewer : OK, I think that's probably enough.
me : You're right, that's enough setup. Now we need to generate some training data. It would be cheating to train on the numbers from 1 to 100, so let's train on all the remaining numbers up to 1024:
NUM_DIGITS = 10
trX = np.array([binary_encode(i, NUM_DIGITS) for i in range(101, 2 ** NUM_DIGITS)])
trY = np.array([fizz_buzz_encode(i) for i in range(101, 2 ** NUM_DIGITS)])
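(A quick sanity check, assuming the snippet above: the training set covers the 923 numbers from 101 to 1023.)

print(trX.shape, trY.shape)    # (923, 10) (923, 4)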
interviewer : ...
me : Now we need to set the model up in tensorflow. Off the top of my head, I'm not sure how wide the hidden layer should be, maybe 10?
interviewer : ...
me : Yeah, 100 is probably better. We can always change it later:
NUM_HIDDEN = 100
We'll need an input variable with width NUM_DIGITS, and an output variable with width 4:
X = tf.placeholder("float", [None, NUM_DIGITS])
Y = tf.placeholder("float", [None, 4])
interviewer : How far are you planning to take this?
me : Oh, just two layers deep - one hidden layer and one output layer. Let's use randomly initialized weights for our neurons:
def init_weights(shape):
    return tf.Variable(tf.random_normal(shape, stddev=0.01))

w_h = init_weights([NUM_DIGITS, NUM_HIDDEN])
w_o = init_weights([NUM_HIDDEN, 4])
And we are ready to define our model. As I said earlier, one hidden layer, and let's use, well, I don’t know, ReLU activation:
def model(X, w_h, w_o):
    h = tf.nn.relu(tf.matmul(X, w_h))
    return tf.matmul(h, w_o)
We can use softmax cross-entropy as our cost function and try to minimize it:
py_x = model(X, w_h, w_o)

cost = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(logits=py_x, labels=Y))
train_op = tf.train.GradientDescentOptimizer(0.05).minimize(cost)
interviewer : ...
me : And, of course, the prediction will just be the largest output:
predict_op = tf.argmax(py_x, 1)
interviewer : Before you get too carried away, the problem you were supposed to solve was generating fizz buzz for the numbers from 1 to 100.
me : Oh, good point, predict_op will return a number from 0 to 3, but we want "fizz buzz" output:
def fizz_buzz(i, prediction):
    return [str(i), "fizz", "buzz", "fizzbuzz"][prediction]
interviewer : ...
me : Now we're ready to train the model. Let's grab a tensorflow session and initialize the variables:
with tf.Session() as sess:
    tf.initialize_all_variables().run()
Let's run, say, 1000 training epochs?
interviewer : ...
me : Yeah, maybe that won't be enough - let's make it 10,000 just to be safe.
Also, our training data comes in sequential order, which I don't like, so let's shuffle it on each iteration:
for epoch in range(10000):
    p = np.random.permutation(range(len(trX)))
    trX, trY = trX[p], trY[p]
And each epoch we'll train in batches of, I don't know, let's say 128 inputs.
BATCH_SIZE = 128
So each training pass will look something like this:
for start in range(0, len(trX), BATCH_SIZE):
    end = start + BATCH_SIZE
    sess.run(train_op, feed_dict={X: trX[start:end], Y: trY[start:end]})
and then we can print the accuracy on the training data, because why not?
print(epoch, np.mean(np.argmax(trY, axis=1) == sess.run(predict_op, feed_dict={X: trX, Y: trY})))
interviewer : Are you serious?
me : Yeah, I find it really useful to watch how the training accuracy evolves.
interviewer : ...
me : So, once the model is trained, it's fizz buzz time. Our input will just be the binary encodings of the numbers from 1 to 100:
numbers = np.arange(1, 101)
# binary_encode works elementwise over the whole array, giving shape (NUM_DIGITS, 100),
# so transpose to get one row per number.
teX = np.transpose(binary_encode(numbers, NUM_DIGITS))
And then our output is just the fizz_buzz function applied to the model's output:
teY = sess.run(predict_op, feed_dict={X: teX})
output = np.vectorize(fizz_buzz)(numbers, teY)
print(output)
interviewer : ...
me : And that's your fizz buzz!
interviewer : That's enough, really. We'll be in touch.
me : "We'll be in touch", that sounds promising.
interviewer : ...
I did not get the job. But I did try actually running this code (code on GitHub), and it turned out that it gets some of the output wrong! Thanks a lot, machine learning!
In [185]: output
Out[185]:
array(['1', '2', 'fizz', '4', 'buzz', 'fizz', '7', '8', 'fizz', 'buzz',
       '11', 'fizz', '13', '14', 'fizzbuzz', '16', '17', 'fizz', '19', 'buzz',
       '21', '22', '23', 'fizz', 'buzz', '26', 'fizz', '28', '29', 'fizzbuzz',
       '31', 'fizz', 'fizz', '34', 'buzz', 'fizz', '37', '38', 'fizz', 'buzz',
       '41', '42', '43', '44', 'fizzbuzz', '46', '47', 'fizz', '49', 'buzz',
       'fizz', '52', 'fizz', 'fizz', 'buzz', '56', 'fizz', '58', '59', 'fizzbuzz',
       '61', '62', 'fizz', '64', 'buzz', 'fizz', '67', '68', '69', 'buzz',
       '71', 'fizz', '73', '74', 'fizzbuzz', '76', '77', 'fizz', '79', 'buzz',
       '81', '82', '83', '84', 'buzz', '86', '87', '88', '89', 'fizzbuzz',
       '91', '92', '93', '94', 'buzz', 'fizz', '97', '98', 'fizz', 'fizz'],
      dtype='<U8')
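Just for reference, a quick way to count how many of the 100 answers actually match the real fizz buzz (true_fizz_buzz here is just a helper, not part of the code above):

def true_fizz_buzz(i):
    # The boring, correct answer for a single number.
    if i % 15 == 0: return "fizzbuzz"
    elif i % 5 == 0: return "buzz"
    elif i % 3 == 0: return "fizz"
    else: return str(i)

expected = [true_fizz_buzz(i) for i in range(1, 101)]
wrong = [(i, got, want) for i, (got, want) in enumerate(zip(output, expected), start=1) if got != want]
print(len(wrong), "mistakes out of 100")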
I guess I need a deeper neural network.
Source: https://habr.com/ru/post/301536/