HBase + Thrift + PHP

For historical reasons there are few articles about HBase and Thrift, and fewer still about how to make them work together with a PHP client. Let's close this gap and go from installing HBase to reading simple data from HBase in a PHP client.

Introduction

For those who do not know what HBase and Thrift are, here is a brief description:

HBase is an open-source, non-relational, distributed database modeled after Google's BigTable and written in Java. It is developed as part of the Hadoop project (part of the Apache Software Foundation) and runs on top of HDFS (Hadoop Distributed File System), providing capabilities similar to BigTable. In other words, it offers a fault-tolerant way to store large volumes of sparse data. More information about HBase is available on Wikipedia.

Thrift is an interface definition language used to describe and create services in various programming languages. It serves as an RPC framework and was originally developed at Facebook. More information about Thrift is available on Wikipedia.
Installation

So, let's install both. We start with Thrift. Everything will be built from sources taken from the official sites. So, Thrift:
$ wget apache.strygunov.com//thrift/0.6.1/thrift-0.6.1.tar.gz
$ tar xfz thrift-0.6.1.tar.gz
$ cd thrift-0.6.1/

$ ./configure
$ make
$ make install
$ cd ..

When ./configure finishes, it prints a summary showing which languages Thrift will support on your system:
Building C++ Library ......... : no
Building C (GLib) Library .... : no
Building Java Library ........ : no
Building C# Library .......... : no
Building Python Library ...... : yes
Building Ruby Library ........ : no
Building Haskell Library ..... : no
Building Perl Library ........ : no
Building PHP Library ......... : yes
Building Erlang Library ...... : yes

After make and make install, Thrift is installed and ready to use. Now on to HBase. Things are much simpler here: all we need to do is download the distribution, unpack it and fix the config, and HBase is ready for a test launch.
$ wget apache.infocom.ua//hbase/hbase-0.90.3/hbase-0.90.3.tar.gz
$ tar xfz hbase-0.90.3.tar.gz

Now you need to edit the config and add the path to the folder where the database will be stored:
$ vim hbase-0.90.3/conf/hbase-site.xml

<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
<property>
<name>hbase.rootdir</name>
<value>file:///path/to/folder/for/hbase</value>
</property>
</configuration>
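
If you would rather script this step than edit the file by hand, the same config can be written from the shell. This is only a minimal sketch: the function name write_hbase_site is a made-up helper, not part of HBase, the data folder must be an absolute path, and any existing file at the target path is overwritten.

```shell
# write_hbase_site DATA_DIR CONF_FILE
# Hypothetical helper: writes a minimal hbase-site.xml whose hbase.rootdir
# points at DATA_DIR (an absolute path) on the local filesystem.
# Overwrites CONF_FILE if it already exists.
write_hbase_site() {
    data_dir=$1
    conf_file=$2
    cat > "$conf_file" <<EOF
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
  <property>
    <name>hbase.rootdir</name>
    <value>file://$data_dir</value>
  </property>
</configuration>
EOF
}
```

With the function defined in the current shell, the call for this article would be:
$ write_hbase_site /path/to/folder/for/hbase hbase-0.90.3/conf/hbase-site.xml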

After that you can run HBase:
$ ./hbase-0.90.3/bin/start-hbase.sh


HBase Thrift Generation and Testing

Generating the Thrift client for HBase is very simple; just run the following command:
$ thrift --gen php hbase-0.90.3/src/main/resources/org/apache/hadoop/hbase/thrift/Hbase.thrift

After running this command, you will have a gen-php folder containing the generated PHP client for HBase. These scripts now need to be moved into the folder that holds the PHP library for working with Thrift. After make install, that library should have been copied automatically into the PHP folder. For this example we assume that this is /usr/lib/php; create a packages folder inside it and put the contents of the gen-php folder there. As you can see, everything is simple.
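
The copy step described above might look like the following sketch. The helper name install_generated_client is hypothetical, /usr/lib/php is only the example location assumed in this article, and writing there will usually require root privileges.

```shell
# install_generated_client GEN_DIR THRIFT_ROOT
# Hypothetical helper: copies the contents of the generated gen-php folder
# into THRIFT_ROOT/packages, creating that folder if it does not exist.
install_generated_client() {
    gen_dir=$1
    thrift_root=$2
    if [ ! -d "$gen_dir" ]; then
        echo "no such directory: $gen_dir" >&2
        return 1
    fi
    # "$gen_dir"/. copies the folder's contents rather than the folder itself
    mkdir -p "$thrift_root/packages" &&
        cp -R "$gen_dir"/. "$thrift_root/packages/"
}
```

For the layout assumed in this article the call would be:
$ install_generated_client gen-php /usr/lib/php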

Now let's put some test data into HBase. To do this, first start the HBase shell:

$ ./hbase-0.90.3/bin/hbase shell
HBase Shell; enter 'help<RETURN>' for list of supported commands.
Type "exit<RETURN>" to leave the HBase Shell
Version: 0.90.0, r1001068, Fri Sep 24 13:55:42 PDT 2010

hbase(main):001:0> create 'blogposts', 'post', 'image'
0 row(s) in 1.2200 seconds

hbase(main):002:0> put 'blogposts', 'post1', 'post:title', 'Hello World'
hbase(main):003:0> put 'blogposts', 'post1', 'post:author', 'The Author'
hbase(main):004:0> put 'blogposts', 'post1', 'post:body', 'This is a blog post'
hbase(main):005:0> put 'blogposts', 'post1', 'image:header', 'image1.jpg'
hbase(main):006:0> put 'blogposts', 'post1', 'image:bodyimage', 'image2.jpg'

hbase(main):007:0> get 'blogposts', 'post1'

COLUMN CELL
image:bodyimage timestamp=1229953133260, value=image2.jpg
image:header timestamp=1229953110419, value=image1.jpg
post:author timestamp=1229953071910, value=The Author
post:body timestamp=1229953072029, value=This is a blog post
post:title timestamp=1229953071791, value=Hello World

hbase(main):008:0> exit


So, what have we just done? First, create 'blogposts', 'post', 'image' created a table named blogposts with two column families, post and image. Next, with the put 'blogposts', 'post1', 'post:title', '...' commands we created a single row with a set of values. Finally, we used get to check that the data is in the table and left the shell.
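
The same put commands can also be generated instead of typed. The sketch below only prints HBase shell commands and does not talk to HBase itself. The helper name hbase_puts is hypothetical, and it assumes that no argument contains a single quote. Feeding its output to ./hbase-0.90.3/bin/hbase shell on standard input is an assumption you should verify on your version.

```shell
# hbase_puts TABLE ROW COLUMN=VALUE...
# Hypothetical helper: prints one HBase shell "put" command per
# COLUMN=VALUE argument, where COLUMN has the form family:qualifier.
# Assumes that no argument contains a single quote.
hbase_puts() {
    table=$1
    row=$2
    shift 2
    for pair in "$@"; do
        column=${pair%%=*}   # text before the first "="
        value=${pair#*=}     # text after the first "="
        printf "put '%s', '%s', '%s', '%s'\n" "$table" "$row" "$column" "$value"
    done
}

hbase_puts blogposts post1 'post:title=Hello World' 'image:header=image1.jpg'
```

The call at the end prints two put lines, one for post:title and one for image:header, in the same form as the commands entered by hand above.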

So, everything is ready to start the Thrift server and create a demo client in PHP. First we start the Thrift server, without which nothing will work:
$ ./hbase-0.90.3/bin/hbase thrift start
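
Before running a client it can help to confirm that the Thrift server is accepting connections. The following is a rough sketch, not something HBase provides: port_open and wait_for_port are hypothetical helpers, the check relies on bash's /dev/tcp redirection (so it needs bash rather than plain sh), and port 9090 is simply the port the PHP client in this article connects to.

```shell
# port_open HOST PORT
# Hypothetical helper: succeeds if a TCP connection to HOST:PORT can be
# opened. Uses bash's /dev/tcp redirection, so it must run under bash.
port_open() {
    (exec 3<>"/dev/tcp/$1/$2") 2>/dev/null
}

# wait_for_port HOST PORT TRIES
# Tries up to TRIES times, one second apart, until the port accepts
# connections; returns non-zero if it never does.
wait_for_port() {
    tries=$3
    while [ "$tries" -gt 0 ]; do
        port_open "$1" "$2" && return 0
        tries=$((tries - 1))
        sleep 1
    done
    return 1
}
```

For example, to wait up to 30 seconds for the server started above:
$ wait_for_port localhost 9090 30 && echo "Thrift server is up"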

Great, now it remains to create a small test client in PHP:
<?php
// Load the Thrift library
$GLOBALS['THRIFT_ROOT'] = '/usr/lib/php';
require_once( $GLOBALS['THRIFT_ROOT'].'/Thrift.php' );
require_once( $GLOBALS['THRIFT_ROOT'].'/transport/TSocket.php' );
require_once( $GLOBALS['THRIFT_ROOT'].'/transport/TBufferedTransport.php' );
require_once( $GLOBALS['THRIFT_ROOT'].'/protocol/TBinaryProtocol.php' );

// Load the generated HBase client
require_once( $GLOBALS['THRIFT_ROOT'].'/packages/Hbase/Hbase.php' );

// Connect to the Thrift server
$socket = new TSocket( 'localhost', 9090 );
$socket->setSendTimeout( 10000 );
$socket->setRecvTimeout( 20000 );
$transport = new TBufferedTransport( $socket );
$protocol = new TBinaryProtocol( $transport );
$client = new HbaseClient( $protocol );
$transport->open();

// Read some data from HBase
print_r($client->getTableNames());
print_r($client->getColumnDescriptors( 'blogposts' ));
print_r($client->getRow( 'blogposts', 'post1' ));

$transport->close();
?>

That's all you need to get started with PHP and HBase.

Source: https://habr.com/ru/post/123560/

