πŸ“œ ⬆️ ⬇️

Building crosswords using the Wolfram Language (Mathematica)


Translation of Michael Trott's post, β€œ Constructing Crossword Arrays Faster ”.
Download the translation in the form of a Mathematica document that contains all the code used in the article here .

In chapter 6 of my book Mathematica GuideBook for Programming , as an example of working with lists, I discussed how to build an array that represents a crossword puzzle. Although this example was good for demonstrating advanced work with lists, however, the use of lists is not the best way to build an array of crossword puzzles. The complexity of adding a new word to an array with n - 1 words already placed was for this algorithm ConstructingCrosswordArrays_1.png Thus, the total difficulty of creating an array of crossword of n words became equal to ConstructingCrosswordArrays_2.png .

Over the past few years, some Mathematica users have asked me if a faster algorithm can be built. The answer is yes, you can. If we use hashing methods, we can quickly and at the same time check whether it is possible to use some element of the array and, therefore, we can reduce the overall complexity of the algorithm with ConstructingCrosswordArrays_3.png before ConstructingCrosswordArrays_4.png that for crosswords of thousands of words will give a big difference in time spent on calculations. This algorithm is implemented in this article. When we place individual letters of a word in a certain rectangular table, many different situations need to be considered. As a result, the article contains more than usual amount of procedural code. Although some definitions of functions are somewhat long, thanks to the comments between the steps of the calculations and the decision branches, the code should be fairly simple to read and understand.

In order for the article to be self-sufficient, we will not use any algorithms implemented in the GuideBook .
')
Here is the statement of the problem that we will solve:

Suppose we are given a list of words as strings. Our task is to insert them into a rectangular table consisting of squares so that each word is placed horizontally (read from left to right) or vertically (read from top to bottom), and the words must be connected in squares containing the same for them letters.

Here is an example that uses the names of scientists (citizens of the USSR and Russia) who have become Nobel laureates. (In the rest of the article, we will use words that are directly related to Mathematica as examples.)

ConstructingCrosswordArrays_5.png

To improve the readability of the crossword, we also require that any two horizontal or any two vertical words be separated from each other by at least one empty square. In addition to this, no horizontal word should begin or end in a square that is adjacent to a square occupied by a letter of a vertical word, and, similarly, no vertical word should begin or end in a square that is adjacent to a square occupied by a letter of a horizontal word . (In other words, the squares adjacent to the left and right of the horizontal word and the squares adjacent the top and bottom of the vertical word must remain empty.) However, we allow words to begin or end in squares occupied by another word.

We will place all the words in a square table (not limited in size), built from separate squares and we will track the contents of each square.

Function

positionAWord [characterList, k, {i0, j0}, hv]

arranges the characterList character list in such a way that the kth element of the list will be located in the square { i0 , j0 } and the word characterList will be located according to the value of the parameter h v (horizontally or vertically, which will be marked with "" or "", respectively) . We will use the construction of the form C [data] to set the current status of the square, while the following situations are possible:


All this information will be associated with status values ​​down values. At the same time, status [{i, j}] will calculate the current state of the square centered at the point {i, j}. At the output, it will produce an object of the form C [...] or remain in a non-computed form, if the square has not yet been used in any way or is not a neighbor for some of the words already in the table.

We associate the value of this function with the current content of the square by means of the expression status [square] = value . This will allow us to fix the search time for the status of each of the necessary squares by calculating the expression status [square] .

The positionAWord function sets the values ​​of the squares in which the characters of the word are placed, and also adds information about the potential use of adjacent squares in the future. For example, squares located directly above or below a horizontal word can only be used for vertical words in the future.

The optional last argument of the positionAWord function is used to create possible side effects, such as, say, monitoring the order of adding words, which we will use below. In the implementation of this function, we had to consider a large number of different situations - this is easily seen in the comments to the code below.

In [1]: =

ConstructingCrosswordArrays_6.png

As an example, we place the word Mathematica , starting from the square {1,1} horizontally.

In [2]: =

ConstructingCrosswordArrays_7.png

Below is a list of the squares that have been marked.

In [3]: =

ConstructingCrosswordArrays_8.png

Out [3] =

ConstructingCrosswordArrays_9.png

In order to see the build process live, let's create a function makeCWPGridGraphics , which will color the squares according to their current status.

In [4]: ​​=

ConstructingCrosswordArrays_10.png

The graph below shows the current status of the squares adjacent to the symbols already arranged in squares.

In [5]: =

ConstructingCrosswordArrays_11.png

Out [5] =

ConstructingCrosswordArrays_12.png

Function

positioningAWordIsPossibleQ [characterList, k, {i0, j0}, hv]

checks whether the list of characters characterList can be arranged in such a way that the kth element of the list is located in the square {i0, j0}, and the word characterList is located according to the value of the hv parameter (horizontally "" or vertically ""). In order to do this, we need to check that each of the required squares is either available or has the required symbol already in place. Also, adjacent squares must either be empty or must have letters from a word that is already perpendicular. The auxiliary function whileTrueLoops is an iterative function that finishes its work as soon as the first incompatible square is found using the Throw - Catch technique. In the future, we will always assume that a new word is placed only with the help of the positionAWord function and only when the positioningAWordIsPossibleQ function returns the value True.

In [6]: =

ConstructingCrosswordArrays_13.gif

In [8]: =

ConstructingCrosswordArrays_14.png

If we have already placed the word Mathematica horizontally, we can arrange another copy of the word Mathematica vertically.

In [9]: =

ConstructingCrosswordArrays_15.png

Out [9] =

ConstructingCrosswordArrays_16.png

But we cannot horizontally place the word Mathematica in the same position.

In [10]: =

ConstructingCrosswordArrays_17.png

Out [10] =

ConstructingCrosswordArrays_18.png

In general, it would be possible to place another copy along the one already located, starting with the letter β€œt” in the position {3, 3}.

In [11]: =

ConstructingCrosswordArrays_19.png

Out [11] =

ConstructingCrosswordArrays_20.png

Of course, we could put another copy of the word Mathematica in the middle, say, make it vertical, passing through the second letter β€œm” of the horizontal copy of the word Mathematica .

In [12]: =

ConstructingCrosswordArrays_21.png

Out [12] =

ConstructingCrosswordArrays_22.png

In [13]: =

ConstructingCrosswordArrays_23.png

Using the code below, we can visually display the current state of our table.

In [14]: =

ConstructingCrosswordArrays_24.png

Out [14] =

ConstructingCrosswordArrays_25.png

Function

findAPossiblePositionAndPositionAWord [characterList]

looks for a possible location of the characters characterList and then places them. The function returns either the coordinates of the added character, or $ Failed, in case there is no way to add the word. We also provided this function with several fairly obvious options that, among other things, allow us to place words as tightly as possible, or vice versa, as freely as possible. If the value of the AttachmentPosition option is made equal to Inner , then the function tends to add each new word inside the already placed ones, Outer value - on the contrary, if the RandomSample value is used , then the position of the new word is chosen randomly.

In [15]: =

ConstructingCrosswordArrays_26.png

In [16]: =

ConstructingCrosswordArrays_27.png

In [17]: =

ConstructingCrosswordArrays_28.png

Finally, we will create the function makeCWPGrid [status] , which will construct, in fact, a table of letters based on the down values ​​of the status function.

In [18]: =

ConstructingCrosswordArrays_29.png

The graph below shows the state of all the already added squares after the word Mathematica was added 6 times to the original. Gray squares correspond to words already placed; red - must remain empty; green - new words can be added through them. Arrows indicate in which direction you can add new words.

In [19]: =

ConstructingCrosswordArrays_30.gif

Out [22] =

ConstructingCrosswordArrays_31.png

In [23]: =

ConstructingCrosswordArrays_32.gif

The graph below shows some of the steps involved in building an array of crossword puzzles. If you carefully study the device of squares in each of the figures (for each of the options), it becomes much more understandable how the algorithms presented above are arranged.

In [25]: =

ConstructingCrosswordArrays_33.png

Out [25] =

ConstructingCrosswordArrays_34.png

Using the Manipulate function, we can easily create an interactive version of the algorithm that will allow us to follow its steps clearly and simply. The following interactive version of the algorithm allows you to follow its 12 first steps for different values ​​of options.

In [26]: =

ConstructingCrosswordArrays_35.png

Out [26] =

ConstructingCrosswordArrays_36.png

Now let's consider the complexity of the implemented method. Due to the fact that the search for the current state of a square is almost constant (independent of the number of squares already specified), we expect the linear complexity of O ( n ) depending on the number of words. As a test, we will find the placement of the 500 words Mathematica . To visualize the calculation process, we will use the Monitor function.

In [27]: =

ConstructingCrosswordArrays_37.gif

Now we will draw the measured values ​​of the elapsed time. The graph on the right shows the average time to add a specified number of words. It grows linearly with the number of words you need to add.

In [30]: =

ConstructingCrosswordArrays_38.png

Out [30] =

ConstructingCrosswordArrays_39.png

Thus, the total time of forming an array of n wordword crossword is of order complexity ConstructingCrosswordArrays_40.png . The graph shown below clearly demonstrates this (black - elapsed time, red - found model of the form ConstructingCrosswordArrays_41.png ).

In [31]: =

ConstructingCrosswordArrays_42.png

Out [32] =

ConstructingCrosswordArrays_43.png

Below is the resulting crossword array. We use a small font size to display it here in full.

In [33]: =

ConstructingCrosswordArrays_44.png

Out [33] =

ConstructingCrosswordArrays_45.png

Due to the much higher speed of this approach, based on hashing for a large number of words (say, thousands), we can now very quickly create a crossword puzzle, say, of 250 words. In this case, as a list of these words, we take 250 selected randomly embedded functions of the Wolfram Language ( Mathematica system).

In [34]: =

ConstructingCrosswordArrays_46.gif

Out [38] =

ConstructingCrosswordArrays_47.png

In [39]: =

ConstructingCrosswordArrays_48.png

Out [39] =

ConstructingCrosswordArrays_49.png

The graph shown below displays the order in which the words were located on a plane (in the table). Red, high columns correspond to words that were added at the beginning, blue, low β€œturrets” - added later.

In [40]: =

ConstructingCrosswordArrays_50.png

Out [40] =

ConstructingCrosswordArrays_51.png

Now we will build an array of crossword, consisting of the names of the functions (and their options) associated with various visualizations in the Wolfram Language.

In [41]: =

ConstructingCrosswordArrays_52.gif

Out [42] =

ConstructingCrosswordArrays_53.png

In [43]: =

ConstructingCrosswordArrays_54.gif

In [47]: =

ConstructingCrosswordArrays_55.png

Out [47] =

ConstructingCrosswordArrays_56.png

Now create a potential function f for the position function AWord . This function will be used to determine which particular word already belongs to one or another character.

In [48]: =

ConstructingCrosswordArrays_57.gif

Now let's create some variation of the function makeCWPGrid , which will use the information associated with the function f to color the words.

In [53]: =

ConstructingCrosswordArrays_58.png

The result is presented in the table below. The color of the letters standing at the intersection of words is mixed.

In [54]: =

ConstructingCrosswordArrays_59.png

Out [54] =

ConstructingCrosswordArrays_60.png

We will finish the article by creating a single function CrossWordConstructionCached , which serves to create a crossword puzzle based on the list of words that is given to it at the entrance.

In [55]: =

ConstructingCrosswordArrays_61.png

The constructed algorithm can work with any characters. As an example, there are two ways to make a crossword puzzle from all three-digit prime numbers.

In [56]: =

ConstructingCrosswordArrays_62.png

Out [56] =

ConstructingCrosswordArrays_63.png

Or you can create a crossword puzzle of the names of all the Nobel laureates (citizens of Russia and the USSR):

In [57]: =

ConstructingCrosswordArrays_64.gif

Out [58] =

ConstructingCrosswordArrays_65.png

Resources for learning Wolfram Language (Mathematica) in Russian: habrahabr.ru/post/244451

Source: https://habr.com/ru/post/249141/


All Articles