Once on Habr I came across an article about a word search algorithm for the game Balda:
habrahabr.ru/post/207734
I myself am the author of the Robot Balda 2 solver, which over the years has become popular among many online Balda players. I would like to share my experience and describe one unique algorithm for Balda that nobody has used yet.
A few words about that article as a whole. My solver searches for words by the same general approach, through prefix trees. But instead of two trees I have a single one, which contains a "separator" symbol; after the separator the remaining part of the word is stored reversed.
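Roughly, such a trie could be built like this. This is only a minimal sketch of how I would lay it out: the separator symbol, the terminal marker and the exact split convention (the tail of the word stored forward, the head stored reversed) are illustrative assumptions, not the actual code of the solver.

```python
SEP = "#"   # separator between the forward part and the reversed prefix
END = "$"   # terminal marker: the full word is stored under this key

def build_split_trie(words):
    """Single trie replacing the usual pair of 'prefix' and 'reversed' trees.

    Every word is inserted once per possible starting position i:
        key = word[i:] + SEP + reversed(word[:i])
    A search starting from some cell of the board can then walk the tail of
    the word forward, cross the separator, and walk the head backwards."""
    root = {}
    for word in words:
        for i in range(len(word)):
            key = word[i:] + SEP + word[:i][::-1]
            node = root
            for ch in key:
                node = node.setdefault(ch, {})
            node[END] = word
    return root
```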
It is also possible to enable a more complex prefix tree ("turbo mode") with a "dummy" (wildcard) symbol. In this case the terminal node stores all the letters that can be placed into the dummy cell. For example, we have walked the path К * Т and reached a terminal node. It contains two letters, "О" and "И": under "О" there is a link to the word КОТ (cat), and under "И" to the word КИТ (whale). As a result, the dummy lets you avoid iterating over all 32 letters of the alphabet at every empty cell on each iteration. On the other hand, it increases the size of the prefix dictionary roughly 5 times. In its pure form this gives about a 4x speedup (if memory serves), but besides searching for words I also spend time on analysis, so the total speedup is only about 1.5x.
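A sketch of what such a "turbo" trie might look like. Again, this only illustrates the idea: the DUMMY symbol, the dictionary-based nodes and the terminal layout are my assumptions, and the combination with the separator trie above is omitted for brevity.

```python
DUMMY = "*"  # wildcard standing for the one new letter placed this move
END = "$"    # terminal node: maps the concrete dummy letter to the word

def build_turbo_trie(words):
    """For every word, insert one variant per position with that position
    replaced by DUMMY. The terminal node maps the letter hidden behind the
    dummy to the full word, so an empty cell never has to try all 32 letters."""
    root = {}
    for word in words:
        for j in range(len(word)):
            key = word[:j] + DUMMY + word[j + 1:]
            node = root
            for ch in key:
                node = node.setdefault(ch, {})
            node.setdefault(END, {})[word[j]] = word
    return root

# build_turbo_trie(["КОТ", "КИТ"]) produces the path К -> * -> Т whose
# terminal node is {"О": "КОТ", "И": "КИТ"}, exactly the example above.
```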
It is best to make the dummy symbol the first symbol (the root node) of the prefix tree and to always start the word search from the dummy (even if your tree has no dummy symbol at all). In that case the search is much faster and the tree does not grow 5 times. However, I deliberately made the search start from the letters already placed on the board. This is far from efficient, because it increases the number of recursive passes, and you have to filter out duplicates that inevitably appear at every cell the word passes through. But this approach (searching from non-empty cells) was chosen on purpose, so that one tricky optimization can be applied, which I will describe at the end. With it, the duplicate problem disappears, the speed loss is compensated (search even becomes faster), and a new property appears: stability of the search time.
Many may ask: why faster at all, when even the simplest and slowest algorithms find words in milliseconds? Such questions can only come from programmers who have never tried to play against the "champions". Their programs think several moves ahead. That means a program must not only search for words, but also apply tactics, and the most universal tactic for all logic games is minimax; it is well described in plenty of articles. Using minimax means making a lot of virtual moves. For example, if each player finds on average 100 words per turn, then looking through all the ways the game can develop 4 moves ahead takes 100^(4-1) = 1,000,000 word searches. If your program can find all words in 1 ms, it will analyze the game 4 moves ahead in about 16 minutes! And you usually get 2 minutes to think about a move. Now you understand why you need a very fast search. My program can analyze the game 8-10 moves ahead in a few seconds.
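The back-of-the-envelope estimate above is easy to reproduce (the helper name and the 1 ms figure are just the numbers from this paragraph, not measurements of my solver):

```python
def lookahead_minutes(words_per_move, moves_ahead, search_ms=1.0):
    """Rough cost of a full game-tree lookahead: one word search per node."""
    searches = words_per_move ** (moves_ahead - 1)
    return searches * search_ms / 1000 / 60

print(lookahead_minutes(100, 4))  # ~16.7 minutes for 1,000,000 searches at 1 ms each
```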
The greatest acceleration comes from pruning words. Put simply, only the longest words become branches of the search tree (approximately; in reality my word selection is a bit more involved). Practice shows that in the opening and middle of the game it is extremely rare for a shorter word to later make up its point difference with a longer one, let alone bring in more points: the opponent has too many ways to "win back", and the probability of trapping him is too low. So it makes no sense to waste time on short words. When the end of the game is near and few moves remain, the program widens the list of analyzed words; 4-6 moves before the end of the game absolutely all words take part in the search. Near the end it genuinely makes sense to look at short words, although even then it is rare that playing a word of 2-3 letters is more profitable than one of 4-5.
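A sketch of this kind of move filtering. The threshold of "4-6 moves before the end" and the rule "keep only the longest words" are the simplified version; as noted above, the real selection is a bit more involved.

```python
def select_candidates(found_words, empty_cells, widen_threshold=5):
    """Keep only the longest found words as branches of the move tree,
    except near the end of the game, when every word is considered."""
    if not found_words:
        return []
    if empty_cells <= widen_threshold:      # endgame: analyze absolutely all words
        return list(found_words)
    best = max(len(w) for w in found_words)
    return [w for w in found_words if len(w) == best]
```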
The second significant acceleration comes from alpha-beta pruning.
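For completeness, here is what a minimax search with alpha-beta pruning over virtual moves might look like. This is a generic textbook skeleton, not my actual code: find_words, apply_word, undo_word and evaluate are hypothetical placeholders.

```python
def negamax(board, depth, alpha, beta, player):
    """Minimax in negamax form with alpha-beta pruning over virtual moves."""
    moves = find_words(board)                    # candidate words, already filtered
    if depth == 0 or not moves:
        return evaluate(board, player)           # e.g. player's points minus opponent's
    best = float("-inf")
    for word, placement in moves:
        apply_word(board, word, placement, player)
        score = -negamax(board, depth - 1, -beta, -alpha, -player)
        undo_word(board, word, placement, player)
        best = max(best, score)
        alpha = max(alpha, score)
        if alpha >= beta:                        # cut-off: the opponent will avoid this line
            break
    return best
```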
So far I have mentioned standard algorithms that you can always look up and study. Now I will describe my own "invention". It concerns optimizing the word search while the tree of virtual moves is being built. Look: this is how the move tree in Balda is usually built by brute force:
1. Find all the words.
2. Enumerate all the words, putting each word in turn on the playing field.
3. Recurse to step 1, or return from the recursion if we have gone too deep.
And now, here is how I do it:
1. Find all the words.
2. Enumerate all the words, putting each word in turn on the playing field.
3. Find all the words, but in a different way! We copy into the search result all the words that were found on the previous move. From this list we exclude the words that can no longer be composed, namely all words whose new letter would go into the very cell where the letter of the last played word was placed (that cell is now occupied, so those words can never be completed). Then we add new words to the list. For that we do not search the whole playing field, but only look for words that pass through the cell just occupied by the last move. After all, the only thing that has changed since the previous move is that one new letter appeared on the board, so any new word must pass through that letter. Words that do not pass through it were already found earlier and are already in the list.
What we get as a result: instead of launching a recursive word search from every cell of the playing field, we launch it from a single cell, no matter how large the field is! Even if the playing field had a million cells, new words would always be "searched for" from just one cell.
4. Recurse to step 2, or return from the recursion if we have gone too deep.
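A sketch of step 3, the incremental word-list update. The data layout here is deliberately naive, and find_words_through_cell is a hypothetical helper; the real solver keeps all of this in the paged structure described below.

```python
def update_found_words(previous_words, board, last_cell):
    """Incrementally rebuild the list of playable words after a move.

    previous_words: list of (word, new_letter_cell) found on the previous move
    last_cell:      the cell just occupied by the letter of the last played word
    """
    # 1. Words whose new letter wanted the now-occupied cell can no longer be played.
    still_playable = [(w, cell) for (w, cell) in previous_words if cell != last_cell]

    # 2. Any genuinely new word must pass through the new letter,
    #    so we only search paths that go through last_cell.
    new_words = find_words_through_cell(board, last_cell)

    return still_playable + new_words
```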
In reality my list of words is not literally copied every time, and words are not physically deleted. All of this is done by switching a pointer to "pages" in a multidimensional array whose dimensions are roughly "cell address", "move depth", and the IDs of the words found in a particular cell at a particular depth. So when we roll back one level in the tree, we do not need to restore the list of words found in that node; we simply decrement the page pointer. When, on the contrary, we go deeper, the pointer is incremented and the newly found words are written into a new page. Thus, somewhere around the 10th virtual move, to know all the words available on that move we just combine the words from the 10 previously searched "pages"; together they form the same list as an ordinary full search would produce. But honestly, I no longer remember all the technical details of that multidimensional array.
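Since I no longer remember the exact layout, here is just one possible way to organize such a page stack, purely as an illustration of the idea:

```python
class WordPages:
    """Per-depth pages of found words: going deeper pushes a page, rolling back pops it."""
    def __init__(self):
        self.pages = []                  # pages[d] = words first found at depth d

    def descend(self, new_words):
        self.pages.append(new_words)     # one level deeper: store only the new words

    def roll_back(self):
        self.pages.pop()                 # nothing to restore, just drop the page

    def current_words(self, blocked_cells):
        """All words available at the current depth: the union of every page,
        minus words whose new-letter cell has since been occupied."""
        return [(w, cell)
                for page in self.pages
                for (w, cell) in page
                if cell not in blocked_cells]
```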
Oddly enough, my "incremental search" did not give a stunning acceleration by itself. What it gave is stability of the search time. The usual search method slows down noticeably by the middle of the game, while mine simply does not notice that "load": as stated above, no matter how many letters are on the playing field, each virtual move searches for words through one cell only, not through all of them. This property is especially valuable on large playing fields.
In conclusion, I want to say that minimax analysis depends heavily on the accuracy of the dictionary. So what matters is not the size of the dictionary, but how well it matches the dictionary of the game portal.