📜 ⬆️ ⬇️

Packaging of Boolean variables for storage and search in the database

Faced with the need to store forms in the database, most of which were “ticks”, that is, in fact, Boolean variables. I didn't want to look for another method to create thirty INT fields under thirty ticks.

And I found - a bitmap.
More precisely, it is not exactly a bitmap, it is a representation of it in the whole type.
Did so:
- All “ticks” numbered from the unit like this:
 <input type = "checkbox" name = "variable []" value = "1">
 <input type = "checkbox" name = "variable []" value = "2">
 <input type = "checkbox" name = "variable []" value = "3">
 <input type = "checkbox" name = "variable []" value = "3">

- Next - made two functions: packing an array into a “bitmap”:

  function binary_array_pack ($ array) {
	 $ ret = 0;

 // Punchy pun =)
	 if (! is_array ($ array))
	     $ array = array ($ array);	
		
	 foreach ($ array as $ val) {
		 if ($ val> 32 || $ val <1)
			 throw new Exception ("Values ​​must not exceed 32 and be less than 1");
		 $ ret | = bindec ('1'.str_repeat (' 0 ', $ val-1));
	 }
	 return $ ret;
 }

 function binary_array_unpack ($ val) {
	 $ ret = array ();
	 for ($ i = 0; $ i <32; $ i ++) {
		 if ($ val & bindec (str_repeat ('0', 31- $ i). '1'.str_repeat (' 0 ', $ i))) {
			 $ ret [] = $ i;
		 }
	 }
	 return $ ret;
 }

')
So we get the following:
- The user noted, for example, the first, second and fourth items
- We received the array $ variable = array (1,2,4);
- Next, we fed this array to the binary_array_pack function ($ variable)
- The function returned an integer, which in binary representation looks like this - 1011, and in decimal it is 11

Actually, we have that the i-1 st binary digit indicates the presence of the i-th "tick". In this case, we have zero, first and third digits, which means the presence of the first, second and fourth tick.

Now this number can be written in one base field.

The question is - how now to do a sample of these fields?
The answer is very simple. Bitwise operations that are supported (as far as I know) by all DBMS will help us.

For example:
- The user has selected the second tick for the search
- We packed it into our “bit array”, got the number 2.
- Made a query SELECT * FROM t WHERE field & 2 != 0
- That is, we did a bitwise multiplication and, for example, the number 11, which is written in binary representation as 1011 will give us the number 0010 when multiplied by 0010. If the second tick was not recorded in the field, then this operation will give 0. The same the most will be if there is not a single match from the multiple choice.

For other types of search, you can use other bitwise operations.

This is how you can reduce the number of fields in the database without sacrificing functionality.

As for the performance of such a sample, there is no exact data [for now] (tests will follow), although IMHO, bitwise operations are native for the processor and must fly.

These functions only reflect the essence of the approach and, of course, do not claim to be the final implementation. All notes and comments are welcome.

The main limitation of this approach is a maximum of 64 “ticks” (BIGINT), and with this implementation, a maximum of 32 “ticks” (maximum for the php bindec function).

UPD
Thanks to all commented. Comments all on the case (fortunately).
Total: This approach can be used to fix multiple choices (for example, it is not known in advance how many checkboxes a user can put, so it’s not clear how many fields to drive), but using it will result in fullscan, which is, of course, much worse than normal sampling across the fields. However, this is deduced from the theory, no experiments were put =)

Source: https://habr.com/ru/post/37131/


All Articles