Next, in search of palindromes 3

After the seemingly good result obtained in the previous part turned out to be only a “local maximum”, I gave up the puzzle for a while. I recall the condition:

“The decimal number 585 is 1001001001 in binary. It is palindromic in both bases. Find n-th palindromic number ". Or, in Russian: “The decimal number 585 in binary number system looks like 1001001001. It is a palindrome in both number systems. Find the nth palindrome like. ”

But the existence of a much faster, with a fundamentally different computational complexity, the algorithm did not give me rest, and in the end I returned to its analysis .

In the end, the algorithm was not so complicated, but, in my opinion, very beautiful.

The original explanation and implementation uses a prefix tree, but in my opinion, a slightly deeper understanding of the algorithm's mathematics allows us to get by with simpler and faster structures. I think it is best to disassemble the algorithm by example.

We will search for binary palindromes among decimal palindromes of width 8 . The source palindromes are exactly 9000 : from 10,000001 to 99999999.
')
Now take 2 sets of numbers:

And [ 90 ] = { 10 0000 01 , 11 0000 11 , 12 0000 21 , ..., 98 0000 89 , 99 0000 99 }
B [ 100 ] = { 00 0000 00 , 00 0110 00 , 00 0220 00 , ..., 00 9889 00 , 00 9999 00 }

If we describe these sets formally, then set A is a subset of decimal palindromes of width 8 , in which the average 4 digits are zeroes, and the set of B consists of 0 , the set of decimal palindromes of width 2 multiplied by 10 ³ and the set of decimal palindromes of width 4 multiplied by 10 ² .

And if informally, then the set A is all possible “edges”, and the set B is all possible “middles” of decimal palindromes of width 8 . It is obvious that the set of all decimal palindromes of width 8 is the set of all pairwise sums of elements of the sets A and B.

for each (a in A) { for each (b in B) { palindrome = a + b; } }

Further, for brevity, I will use “a” instead of “element of set A”, and “b” instead of “element of set B”.

Now we translate the elements of both sets into a binary number system:

A

10000001 100110 001001011010 000001
11000011 101001 111101100011 001011
12000021 101101 110001101100 010101
...
98000089 101110 101110101110011 011001
99000099 101111 001101001111100 100011

00000000 000000
00011000 10101011 111000
00022000 101010111 110000
...
00988900 11110001011011 100100
00999900 11110100000111 011100

I singled out 6 upper and lower discharges for all a, and 6 lower discharges for all b. Width 6 was not chosen randomly - this is the maximum degree of 2 , not exceeding the width of the “edges” A = 10 ² .

Now let's take each a, and see what all palindromes have in common, formed from it by adding b. What they have in common is that they are all in the interval between a itself and the next element A.

For example, for a = 10000001, all decimal palindromes formed from it {10000001, 10011001, 10022001, ..., 10988901, 10999901} are in the interval [10000001, 11000011).

This means that all decimal palindromes formed from a = 10000001 can begin only with the following 6 binary digits:

100110 - the first binary digits a = 10000001
100111
101,000
101001 - the first binary digits a = 11000011

Therefore, in order to be binary palindromes, all these decimal palindromes can only end with the inverse permutation of the initial binary digits:

100110 -> 011001
100111 -> 111001
101000 -> 000101
101001 -> 100101

And considering that a = 10000001 itself ends in binary numbers 000001, out of all possible b we are only interested in those that end in binary digits:

011001 - 000001 = 011000
111001 - 000001 = 111000
000101 - 000001 = 000100
100101 - 000001 = 100100

Only for such b it is necessary to check whether 10000001 + b is a binary palindrome.

Using this approach, the algorithm for finding palindromes in the bases of BASE ₀ , BASE ₁ in the interval [1, N] can be described as follows:

For each width W of [1, the number of digits N in BASE ₀ ]
Generate sets A and B using BASE ₀ . “Edge” width AW _A = O (W / 4), W _A ≥ W _B
Translate elements A and B to BASE ₁
Sort items B by recycle bin by last digit in BASE ₁
For each a of A
For each X from the set of possible initial digits a + b
For each b ending in (X - end digits a)
Check whether a + b is a palindrome in BASE ₁

Unfortunately, I have already forgotten how to strictly prove the computational complexity of algorithms, I will give only a few considerations:

The sets A and B contain O (N ^1/4 ) elements.
For each a, on average, there is at most BASE ₀ variants X.
(We initially choose the width of the initial digits of interest to us in BASE ₁ so that the resulting number is not more than BASE ₀ ^{W _A} , and max (A) is greater than min (A) in BASE ₀ times.)
For each X, on average, no more than BASE ₁ different b is checked.
(X is distributed evenly and almost does not correlate with the final numbers a in BASE _1. Each basket with b is chosen uniformly, and there are no more than ₁ times less such baskets in BASE than b.)
Check for a palindrome still takes O (log (N)) .

In general, I assume that the computational complexity of the algorithm is O (N ^1/4 * log (N) * BASE ₀ * BASE ₁ ) .

The first implementation of this algorithm was quite predictably a couple of orders of magnitude faster, and after spending a little more time on optimization, I brought the calculation time to just over 0.01 seconds, more than 1000 times faster than the previous algorithm. This result finally satisfied me, but quite naturally there was a desire to test it on numbers larger than 2 ⁶⁰ . By this time, I had already discovered the sequence of interest in the Encyclopedia of Integral Sequences. The list of double palindromes had 122 members, the largest of which consisted of 131 bits. It was impressive, but also indirectly indicated that no one had yet come up with an algorithm of logarithmic complexity.

The call was serious - there was no doubt that the people who received such a result invested a lot of effort into the development of the algorithm and were also certainly ready to spend days and weeks of computer time for a subsequent search. Therefore, I decided to at least approximately estimate the time that my algorithm needs to repeat:

2 ^131/2 ⁶⁰ = 2 ⁷¹

2 ^{71 * 1/4} < 2 ¹⁸ = 262144

Given the 0.01 seconds needed for the limit of 2 ⁶⁰ , it turned out that my algorithm had to cope with this task in about 1 hour! I felt some kind of trick, but the challenge was already accepted.

To be continued.

Source: https://habr.com/ru/post/272659/

All Articles

Next, in search of palindromes 3

More articles: