
64-bit integers in MongoDB

In my PHP project I needed to store 64-bit integers in the database. I found only one article on the topic, but it is very detailed (in places perhaps too detailed) and explains all the subtleties. I decided to publish a translation on Habr in case someone else runs into a similar problem.



The current project I'm working on is based on MongoDB, a bridge between key-value stores and traditional RDBMSs. Users in this project are identified by their Facebook UserID, which is a 64-bit integer. Unfortunately, the MongoDB driver for PHP only supported 32-bit integers, which caused problems with new Facebook users: their new, longer UserIDs were truncated to 32 bits, so the application did not work correctly.
For internal document storage, MongoDB uses a format called BSON (Binary JSON). BSON has two integer types: a 32-bit signed integer called INT and a 64-bit signed integer called LONG. The MongoDB driver documentation for PHP says (or said, depending on when you read this) that only 32-bit integers are supported, since "PHP does not support 8-byte integers." This is not quite true. PHP's integer type holds 64-bit values on platforms where the C long type is 64 bits wide. That is any 64-bit platform (provided PHP is compiled for a 64-bit architecture) except Windows, where the C long type is always 32 bits.
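
The difference between the two BSON integer types comes down to range. As a minimal sketch (the helper name fitsInInt32 is hypothetical and not part of the driver, and the type declarations assume PHP 7 or later), the following checks whether a PHP integer would fit into a signed 32-bit INT:

```php
<?php
// Hypothetical helper, not part of the MongoDB driver:
// does this PHP integer fit into a signed 32-bit BSON INT?
function fitsInInt32(int $value): bool
{
    return $value >= -2147483648 && $value <= 2147483647;
}

var_dump(PHP_INT_SIZE);                  // 8 on builds with 64-bit integers, 4 otherwise
var_dump(fitsInInt32(1234567));          // bool(true)
var_dump(fitsInInt32(1234567890123456)); // bool(false) on a 64-bit build
```

Values for which this returns false are the ones that need a BSON LONG.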

Each time an integer was transferred from PHP to MongoDB, the driver used only the least significant 32 bits to store the number in the document. The example below shows what was happening (on a 64-bit platform):

    <?php
    $m = new Mongo();
    $c = $m->selectCollection('test', 'inttest');
    $c->remove(array());

    $c->insert(array('number' => 1234567890123456));

    $r = $c->findOne();
    var_dump($r['number']);
    ?>


This prints:

    int(1015724736)


In binary form:

    1234567890123456 = 100011000101101010100111100100010101011101011000000
          1015724736 =                      111100100010101011101011000000
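
The truncation can be reproduced without a database. This is only a sketch of the arithmetic, not the driver's actual code: the helper truncateToInt32 is hypothetical, and it assumes a 64-bit PHP build (PHP 7 or later for the type declarations).

```php
<?php
// Hypothetical helper: mimic a 64-bit value collapsing to a signed 32-bit INT.
function truncateToInt32(int $value): int
{
    // Keep only the least significant 32 bits.
    $low = $value & 0xFFFFFFFF;

    // Reinterpret bit 31 as the sign bit, as a signed 32-bit integer would.
    return $low >= 0x80000000 ? $low - 0x100000000 : $low;
}

var_dump(truncateToInt32(1234567890123456)); // int(1015724736)
```

The result matches the value read back from MongoDB in the example above.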


Truncating data is obviously not a good idea. To solve this problem, we could simply let the driver pass PHP's native integer type straight through to MongoDB. But rather than change the driver's default behaviour, which could break running applications, I added a new setting, mongo.native_long. With mongo.native_long enabled, the same script produces a different result:

    <?php
    ini_set('mongo.native_long', 1);
    $c->insert(array('number' => 1234567890123456));

    $r = $c->findOne();
    var_dump($r['number']);
    ?>


This script will show:

    int(1234567890123456)


On 64-bit platforms, the mongo.native_long setting lets you store 64-bit integers in MongoDB. The value is then stored as a BSON LONG rather than the BSON INT used when the setting is off. The setting also changes how BSON LONG values are read back from MongoDB. Without mongo.native_long, the driver converts every BSON LONG to a PHP float, which loses precision. The following example shows this:

    <?php
    ini_set('mongo.native_long', 1);
    $c->insert(array('number' => 12345678901234567));

    ini_set('mongo.native_long', 0);
    $r = $c->findOne();
    var_dump($r['number']);
    ?>


This script will show:

    float(1.2345678901235E+16)
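
The loss is a property of floating-point numbers rather than of MongoDB. A PHP float is a double with a 53-bit mantissa, so not every integer above 2^53 can be represented exactly. A small sketch, assuming a 64-bit PHP build, makes the rounding visible:

```php
<?php
// Assumes a 64-bit PHP build, where this literal is still an int.
$original = 12345678901234567;

// Round-trip through float: the value is rounded to the nearest
// representable double before being cast back to int.
$roundTripped = (int) (float) $original;

var_dump($roundTripped === $original); // bool(false)
var_dump($roundTripped);               // int(12345678901234568)
```

The last digit changes, which is enough to make a user ID point at the wrong record.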


On 32-bit platforms, the mongo.native_long setting changes nothing when saving integers to MongoDB: the number is stored as a BSON INT, as before. However, reading a BSON LONG from MongoDB with the setting enabled on a 32-bit platform throws a MongoCursorException, warning you that the value cannot be read without loss of precision:

  MongoCursorException: Can not natively represent the long 1234567890123456 on this platform 


If the setting is disabled, a BSON LONG is converted to a PHP float, which preserves backward compatibility with the previous driver behaviour.

Although mongo.native_long lets you use 64-bit numbers on 64-bit platforms, on 32-bit platforms it does nothing beyond protecting against data loss when reading BSON LONG values, and even then only by throwing an exception.

As part of making 64-bit numbers work reliably between PHP and MongoDB, I also added two new classes: MongoInt32 and MongoInt64. Both are simple wrappers around the string representation of a number. They are created like this:

    <?php
    $int32 = new MongoInt32("32091231");
    $int64 = new MongoInt64("1234567980123456");
    ?>


You can use these objects in ordinary insert and update operations just like normal numbers:

    <?php
    $m = new Mongo();
    $c = $m->selectCollection('test', 'inttest');
    $c->remove(array());

    $c->insert(array(
        'int32' => new MongoInt32("1234567890"),
        'int64' => new MongoInt64("12345678901234567"),
    ));

    $r = $c->findOne();
    var_dump($r['int32']);
    var_dump($r['int64']);
    ?>


Output:

    int(1234567890)
    float(1.2345678901235E+16)


As the example shows, nothing has changed in how values are read from the database: a BSON INT is still returned as an integer, and a BSON LONG as a float. If we enable mongo.native_long, a BSON LONG saved through the MongoInt64 class is returned as a PHP integer on 64-bit platforms, while on 32-bit platforms we get a MongoCursorException.

To get 64-bit numbers back from MongoDB on 32-bit platforms, I added another setting, mongo.long_as_object. On any platform, it makes the driver return a BSON LONG as a MongoInt64 object. The following script shows this:

    <?php
    $m = new Mongo();
    $c = $m->selectCollection('test', 'inttest');
    $c->remove(array());

    $c->insert(array(
        'int64' => new MongoInt64("12345678901234567"),
    ));

    ini_set('mongo.long_as_object', 1);
    $r = $c->findOne();
    var_dump($r['int64']);
    echo $r['int64'], "\n";
    echo $r['int64']->value, "\n";
    ?>


Script output:

    object(MongoInt64)#7 (1) {
      ["value"]=>
      string(17) "12345678901234567"
    }
    12345678901234567
    12345678901234567


The MongoInt32 and MongoInt64 classes implement the __toString() method so that their values can be printed with echo. Their values are available only as strings. Note that MongoDB is type-sensitive and will not treat a number contained in a string as a number. This script shows it (on a 64-bit platform):

    <?php
    ini_set('mongo.native_long', 1);
    $m = new Mongo();
    $c = $m->selectCollection('test', 'inttest');
    $c->remove(array());

    $nr = "12345678901234567";
    $c->insert(array('int64' => new MongoInt64($nr)));

    // $nr is a string here
    $r = $c->findOne(array('int64' => $nr));
    var_dump($r['int64']);

    $r = $c->findOne(array('int64' => (int) $nr));
    var_dump($r['int64']);
    ?>


Output:

    NULL
    int(12345678901234567)
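
Because MongoInt64 exposes its value only as a string, code that must also run on 32-bit platforms cannot simply cast that value to an integer before comparing it. One possible approach, sketched here with a hypothetical helper (compareDecimalStrings is not part of the driver, and it handles only non-negative decimal strings), is to compare the digit strings directly:

```php
<?php
// Hypothetical helper, not part of the MongoDB driver: compare two
// non-negative decimal integer strings without converting to int or float.
// Returns -1, 0 or 1, like the spaceship operator (PHP 7 or later).
function compareDecimalStrings(string $a, string $b): int
{
    // Leading zeros do not change the value.
    $a = ltrim($a, '0');
    $b = ltrim($b, '0');

    // A longer digit string is always the larger number.
    if (strlen($a) !== strlen($b)) {
        return strlen($a) <=> strlen($b);
    }

    // Same length: byte-wise comparison matches numeric order.
    return strcmp($a, $b) <=> 0;
}

var_dump(compareDecimalStrings("12345678901234567", "12345678901234568")); // int(-1)
```

For arithmetic on such strings an arbitrary-precision extension would be needed; that is outside the scope of this sketch.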


The following tables show how the various number conversions work depending on which settings are enabled:

PHP -> MongoDB on 32-bit platforms

| PHP value | native_long = 0 | native_long = 1 |
|---|---|---|
| 1234567 | INT (1234567) | INT (1234567) |
| 123456789012 | FLOAT (123456789012) | FLOAT (123456789012) |
| MongoInt32 ("1234567") | INT (1234567) | INT (1234567) |
| MongoInt64 ("123456789012") | LONG (123456789012) | LONG (123456789012) |


PHP -> MongoDB on 64-bit platforms

| PHP value | native_long = 0 | native_long = 1 |
|---|---|---|
| 1234567 | INT (1234567) | LONG (1234567) |
| 123456789012 | garbage (truncated to 32 bits) | LONG (123456789012) |
| MongoInt32 ("1234567") | INT (1234567) | INT (1234567) |
| MongoInt64 ("123456789012") | LONG (123456789012) | LONG (123456789012) |


MongoDB -> PHP on 32-bit platforms

| In MongoDB | long_as_object = 0, native_long = 0 | long_as_object = 0, native_long = 1 | long_as_object = 1 |
|---|---|---|---|
| INT (1234567) | int (1234567) | int (1234567) | int (1234567) |
| LONG (123456789012) | float (123456789012) | MongoCursorException | MongoInt64 ("123456789012") |


MongoDB -> PHP on 64-bit platforms

| In MongoDB | long_as_object = 0, native_long = 0 | long_as_object = 0, native_long = 1 | long_as_object = 1 |
|---|---|---|---|
| INT (1234567) | int (1234567) | int (1234567) | int (1234567) |
| LONG (123456789012) | float (123456789012) | int (123456789012) | MongoInt64 ("123456789012") |
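
The tables suggest that mongo.native_long only helps where PHP integers are 64 bits wide, while mongo.long_as_object works everywhere. A minimal sketch of choosing between them at runtime follows. The helper chooseLongSetting is hypothetical; only the two setting names come from the article, and PHP_INT_SIZE is the standard constant giving the integer width in bytes.

```php
<?php
// Hypothetical helper: pick the driver setting that keeps 64-bit values intact.
// PHP_INT_SIZE is 8 on builds with 64-bit integers and 4 otherwise.
function chooseLongSetting(int $intSize = PHP_INT_SIZE): string
{
    return $intSize >= 8 ? 'mongo.native_long' : 'mongo.long_as_object';
}

echo chooseLongSetting(), "\n";
```

The returned name could then be passed to ini_set() with a value of 1 before any data is read, in the same way the examples above enable the settings.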


Conclusion

As we have seen, supporting 64-bit integers in PHP with MongoDB is not trivial. My recommendation is to use mongo.native_long = 1 if your code runs only on 64-bit platforms. In that case, every integer you write to the database comes back as an integer in its original form, even if it is 64 bits wide.

If you have to support 32-bit platforms (this includes 64-bit PHP builds for Windows!), you cannot use PHP's native integer type to store 64-bit numbers. You have to use the MongoInt64 class, which means working with string representations of numbers. Also keep in mind that the MongoDB console treats all numbers as floating-point values and cannot display 64-bit integers; it shows them as floats instead. Do not modify such numbers in the console, because doing so changes their type.

For example, after running the script:

    <?php
    $m = new Mongo();
    $c = $m->selectCollection('test', 'inttest');
    $c->remove(array());

    $c->insert(array('int64' => new MongoInt64("123456789012345678")));


the MongoDB console (mongo) will behave like this:

    $ mongo
    MongoDB shell version: 1.4.4
    url: test
    connecting to: test
    type "help" for help
    > use test
    switched to db test
    > db.inttest.find()
    {"_id": ObjectId("4c5ea6d59a14ce1319000000"), "int64": {"floatApprox": 123456789012345680, "top": 28744523, "bottom": 2788225870}}
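
The top and bottom fields in this output appear to be the high and low 32-bit halves of the stored value; the arithmetic below is consistent with that reading, although the article itself does not spell it out. As a sketch, assuming a 64-bit PHP build and using a hypothetical helper named combineHalves:

```php
<?php
// Hypothetical helper: rebuild a 64-bit integer from two 32-bit halves.
// Requires a 64-bit PHP build (PHP_INT_SIZE === 8).
function combineHalves(int $top, int $bottom): int
{
    // Shift the high half into place and merge in the low 32 bits.
    return ($top << 32) | ($bottom & 0xFFFFFFFF);
}

var_dump(combineHalves(28744523, 2788225870)); // int(123456789012345678)
```

The exact value is therefore still stored; only the floatApprox field shown by the console is rounded.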


Of course, when reading data through a driver that supports 64-bit integers, you will get the correct result:

    ini_set('mongo.long_as_object', 1);
    $r = $c->findOne();
    var_dump($r['int64']);
    ?>


This prints:

    object(MongoInt64)#7 (1) {
      ["value"]=>
      string(18) "123456789012345678"
    }


The functionality described in this article is part of the mongo 1.0.9 release, which is available through PECL with the pecl install mongo command.
Good luck with your 64-bit integers!

P.S. This is my first translation, so please don't judge it too harshly :)

Source: https://habr.com/ru/post/117155/

