GitHub, a website, and automatically building a test site from the latest source code

This article explains how to automatically fetch the latest source code from the main branch of your repository and deploy a project from it on shared hosting. I should note that I got acquainted with GitHub and Git only yesterday, so this article may seem trivial to an experienced web programmer. I hope it will help those who are just starting out as web programmers.

Introduction

I have a small website on shared hosting. It has no full shell access, and scripts are restricted in what they can do. For example, I cannot use PHP's system function or file_get_contents. After I created a repository on GitHub, learned a little about working with changes and updated the source code, it was time to think about what to do next. I wanted to see my changes in action while the main site kept working.

Of the scripting languages available to me, I only know PHP, so the choice of language made itself. I understood that my script had to receive update notifications from GitHub somehow and download the source code. I decided to create a subdomain, development.mysite.com, and publish the latest version of the source code there. In addition, I have a site configuration file with database passwords, which I did not publish on GitHub. This file must be added to the downloaded sources for the site to work.
Thus, the whole process can be divided into the following steps:

figure out how to get notifications from GitHub;
download the source code;
make the necessary changes to it.

Notifications from GitHub

Everything is quite simple here. GitHub supports hooks: we register the address of our script under Post-Receive URLs, and that's it. The script will be called on every change to the main branch of the repository. You can read more about this on the GitHub site (in English). In my script I do not process the information about the latest commit.
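If the hook data is ever needed, for example to ignore pushes that do not touch the main branch, a check along the following lines might work. This is only a sketch: the function name is_push_to_branch is made up for illustration, and it assumes the hook delivers a POST form field named payload that holds JSON with a ref key such as refs/heads/master.

```php
<?php
// Hypothetical helper (not part of the original script): returns true when
// the POSTed hook data describes a push to the given branch.
// Assumption: a form field "payload" holds JSON with a "ref" key.
function is_push_to_branch(array $post, $branch = 'master')
{
    if (!isset($post['payload']))
    {
        return false;
    }
    $payload = json_decode($post['payload'], true);
    if (!is_array($payload) || !isset($payload['ref']))
    {
        return false;
    }
    return $payload['ref'] === 'refs/heads/' . $branch;
}

// Possible usage at the top of the hook script:
// if (!is_push_to_branch($_POST)) { exit; }
```

Keeping the check in a small function makes it easy to try out with hand-written arrays before pointing a real hook at the script.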

Downloading the source code

There are two options for downloading the code:

use the GitHub API;
download an archived version of the main branch.

GitHub API

Using the API you can get information about the latest commit. It contains the tree SHA. This identifier lets you walk through the list and contents of all project files step by step.

An example of using the GitHub API from PHP is described on David Walsh's blog. Let's take some useful functions from there and add our own. Let's start writing the script. First of all, the settings:

<?php
/* static settings */
$user = '<github_username>';
$repo = '<github_reponame>';
$user_repo = $user . '/' . $repo;
$tree_base_url = "http://github.com/api/v2/json/tree/show/" . $user_repo;
// path on the server where your repository will go
$stage_dir = $_SERVER['DOCUMENT_ROOT'] . dirname($_SERVER['SCRIPT_NAME']);
?>

A copy of the source code will be created in the directory where our script is located. Next comes the function that retrieves the data at a given address, borrowed from David:

<?php
/* fetches the content at the given URL */
function get_content_from_github($url)
{
    $ch = curl_init();
    curl_setopt($ch, CURLOPT_URL, $url);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
    curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, 1);
    echo "Getting: {$url}";
    $content = curl_exec($ch);
    curl_close($ch);
    return $content;
}
?>

Then comes the function that finds the Tree SHA and starts downloading files:

<?php
function get_repo_json()
{
    global $user, $repo, $user_repo, $tree_base_url, $stage_dir;
    $json = array();
    $list_commits_url = 'http://github.com/api/v2/json/commits/list/' . $user_repo . '/master';
    echo "Master branch url: {$list_commits_url}\n<br>";
    $json['commit'] = json_decode(get_content_from_github($list_commits_url), true);
    // get the SHA of the latest tree
    $tree_sha = $json['commit']['commits'][0]['tree'];
    echo "Tree sha: {$tree_sha}\n<br>";
    $cont_str = $tree_base_url . "/{$tree_sha}";
    $base = json_decode(get_content_from_github($cont_str), true);
    // output the project structure
    echo "<pre>";
    get_repo($base['tree'], 0, $stage_dir);
    echo "</pre>";
}
?>

This function calls get_repo, which recursively walks through all the directories of the project.

<?php
// $level is the nesting depth, used only to indent the printed listing
function get_repo($objects, $level, $current_dir)
{
    global $tree_base_url, $user_repo;
    chdir($current_dir);
    foreach ($objects as $object)
    {
        $type = $object['type'];
        $sha = $object['sha'];
        $name = $object['name'];
        // add padding
        echo str_pad("", $level, "\t");
        echo $name . "\n";
        if (strcmp($type, "tree") == 0)
        {
            // a subdirectory: create it if needed and descend into it
            if (!is_dir($name))
            {
                mkdir($name);
            }
            $new_dir = $current_dir . '/' . $name;
            $tree = $tree_base_url . '/' . $sha;
            $new_objects = json_decode(get_content_from_github($tree), true);
            get_repo($new_objects['tree'], $level + 1, $new_dir);
            // change current directory back
            chdir($current_dir);
        }
        else
        {
            // get file content
            $blob_url = "http://github.com/api/v2/json/blob/show/" . $user_repo . "/" . $sha;
            $data = get_content_from_github($blob_url);
            $filename = $current_dir . '/' . $name;
            file_put_contents($filename, $data);
        }
    }
}
?>

Note that the blob request returns the file contents directly, without any additional information, so there is no need to call json_decode as with the other API calls.

Results of getting source code via API

This script has two significant drawbacks:

it works quite slowly;
I never managed to download the entire project: cURL times out and the whole process stops before even a third of the source files have been downloaded.

In addition, the very idea of downloading a project file by file seems conceptually wrong.
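One possible contributor to the timeouts is the one-second CURLOPT_CONNECTTIMEOUT used in get_content_from_github, although this is a guess rather than a confirmed diagnosis. The sketch below shows a variant with explicit limits and error reporting. The name fetch_with_timeouts and the 10 and 60 second values are assumptions chosen for illustration.

```php
<?php
// Hypothetical variant of get_content_from_github() with explicit time limits.
// The default values are illustrative assumptions, not tuned numbers.
function fetch_with_timeouts($url, $connect_timeout = 10, $total_timeout = 60)
{
    $ch = curl_init();
    curl_setopt($ch, CURLOPT_URL, $url);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
    // maximum number of seconds to wait while connecting
    curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, $connect_timeout);
    // maximum number of seconds allowed for the whole request
    curl_setopt($ch, CURLOPT_TIMEOUT, $total_timeout);
    $content = curl_exec($ch);
    if ($content === false)
    {
        // report the failure instead of silently saving an empty file
        echo "cURL error " . curl_errno($ch) . ": " . curl_error($ch) . "\n";
    }
    curl_close($ch);
    return $content;
}
```

Even with longer limits, one request per file stays slow, and the script can still run into the execution time limit of the hosting.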

Archive with the main branch of the project

After clicking around the GitHub interface, I discovered that you can download an archive of the source files. This approach is much better! The archive is available as either Zip or Tar. I chose Zip because it is easier to unpack on shared hosting. Let's look at the script.

<?php
$download = true;
$unzip = true;
$move = true;
$stage_dir = $_SERVER['DOCUMENT_ROOT'] . dirname($_SERVER['SCRIPT_NAME']);
$filepath = $stage_dir . '/' . 'master.zip';
echo "<pre>";
?>

The variables $download, $unzip and $move control the program flow and let you disable parts of it. They are handy for debugging: for example, if the archive has already been downloaded but not unpacked, there is no point in downloading it again.

<?php
if ($download)
{
    $url = "http://github.com/<your_github_username>/<your_github_repo_name>/zipball/master";
    $ch = curl_init();
    curl_setopt($ch, CURLOPT_URL, $url);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
    curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, 1);
    echo "Getting: {$url}\n";
    $content = curl_exec($ch);
    echo "Got \"{$content}\"\n";
    curl_close($ch);
    // the response is a small HTML page; its first link is the redirect target
    $dom = new DOMDocument();
    @$dom->loadHTML($content);
    $xpath = new DOMXPath($dom);
    $hrefs = $xpath->evaluate("/html/body//a");
    $href = $hrefs->item(0);
    $zipurl = $href->getAttribute('href');
    echo "Zip url: {$zipurl}\n";
    $data = http_get_file($zipurl);
    // http_get_file returns a URL instead of the body when it meets another redirect
    if (strpos($data, "http://") === 0)
    {
        $data = http_get_file($data);
    }
    file_put_contents($filepath, $data);
}
?>

GitHub issues several redirects, which for some reason cURL does not follow. So we use cURL to find the address of the first redirect and then try to follow it. We get another redirect address and finally reach the coveted archive. The http_get_file function used in the code above:

<?php
function http_get_file($url)
{
    $url_stuff = parse_url($url);
    $port = isset($url_stuff['port']) ? $url_stuff['port'] : 80;
    $path = $url_stuff['path'];
    // strip the stray trailing underscore, if any
    $last = $path[strlen($path) - 1];
    if (strcmp($last, "_") == 0)
    {
        $path = substr_replace($path, "", -1);
    }
    $fp = fsockopen($url_stuff['host'], $port);
    $query = 'GET ' . $path . " HTTP/1.0\r\n";
    $query .= 'Host: ' . $url_stuff['host'];
    $query .= "\r\n\r\n";
    fwrite($fp, $query);
    // read the whole response, headers included
    $buffer = '';
    while ($line = fread($fp, 1024))
    {
        $buffer .= $line;
    }
    fclose($fp);
    // on a redirect, return the new address instead of the body
    if (preg_match('/^Location: (.+?)$/m', $buffer, $matches))
    {
        return $matches[1];
    }
    preg_match('/Content-Length: ([0-9]+)/', $buffer, $parts);
    return substr($buffer, -$parts[1]);
}
?>

I ran into some strange behavior: after parse_url runs, an underscore is appended to the file name in the last address, the one pointing to the actual archive.
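A plausible explanation for the underscore, offered as an assumption rather than something verified on the original server: HTTP header lines end with "\r\n", so the value captured from the Location header keeps a trailing "\r", and parse_url replaces control characters with an underscore. The sketch below trims the captured value before parsing it. The function name extract_redirect_location and the example.com address are made up for illustration.

```php
<?php
// Hypothetical helper: pulls the redirect target out of a raw HTTP response
// and strips surrounding whitespace, including a trailing "\r".
function extract_redirect_location($raw_response)
{
    if (preg_match('/^Location:\s*(.+)$/mi', $raw_response, $matches))
    {
        return trim($matches[1]);
    }
    return null;
}

// Made-up response used only to demonstrate the helper.
$response = "HTTP/1.0 302 Found\r\nLocation: http://example.com/master.zip\r\n\r\n";
$location = extract_redirect_location($response);
$parts = parse_url($location);
echo $parts['path'], "\n"; // the path no longer ends with an underscore
```

Where the hosting permits it, enabling CURLOPT_FOLLOWLOCATION on the cURL handle lets cURL follow redirects itself. The download code above never sets that option.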

Unpack the archive:

<?php
if ($unzip)
{
    echo "Uncompressing archive ...\n";
    $zip = new ZipArchive;
    $res = $zip->open($filepath);
    if ($res === TRUE)
    {
        $zip->extractTo($stage_dir);
        $zip->close();
        echo "Done!\n";
    }
    else
    {
        echo "Failed\n";
        exit(1);
    }
}
?>

Inside the archive there is a folder whose name consists of the user name, the repository name and part of the commit's SHA. The project files are inside this folder; in my case, it is a code folder. The next step is to move the code folder up one level, which is needed for the subdomain to be served correctly. The subdomain is configured to point, for example, at the /public_html/development folder, while the archive is unpacked into /public_html/development/<user>_<repo>_<sha>/<files>.

<?php
if ($move)
{
    $files = scandir($stage_dir);
    // the unpacked folder name starts with the user name
    $match_array = preg_grep('/^<user_name>/', $files);
    if (!empty($match_array))
    {
        // remove the old directory, if any
        delete_directory("code");
        $dir_name = current($match_array);
        $rep_dir = $dir_name . "/code";
        echo "Try to move {$rep_dir} to code\n";
        rename($rep_dir, "code");
        rmdir($dir_name);
        echo "Done moving files\n";
    }
}

// recursively deletes a directory together with its contents
function delete_directory($dirname)
{
    if (!is_dir($dirname))
        return false;
    $dir_handle = opendir($dirname);
    if (!$dir_handle)
        return false;
    while (($file = readdir($dir_handle)) !== false)
    {
        if ($file != "." && $file != "..")
        {
            if (!is_dir($dirname . "/" . $file))
                unlink($dirname . "/" . $file);
            else
                delete_directory($dirname . '/' . $file);
        }
    }
    closedir($dir_handle);
    rmdir($dirname);
    return true;
}
?>
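The lookup of the unpacked folder can also be isolated into a small helper, which makes it easy to check without touching the file system. This is a sketch only: find_extracted_dir is a hypothetical name, and the directory names in the example are invented.

```php
<?php
// Hypothetical helper: returns the first entry whose name starts with the
// given prefix, or null when nothing matches.
function find_extracted_dir(array $entries, $prefix)
{
    // preg_quote keeps special characters in the prefix from acting as regex syntax
    $pattern = '/^' . preg_quote($prefix, '/') . '/';
    $matches = preg_grep($pattern, $entries);
    return $matches ? reset($matches) : null;
}

// Invented names; in the script $entries would come from scandir($stage_dir).
$entries = array('.', '..', 'master.zip', 'someuser_somerepo_abc1234');
echo find_extracted_dir($entries, 'someuser_'), "\n";
```

Returning null for "not found" gives the caller an explicit case to handle, because preg_grep yields an array even when nothing matches.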

And finally, I copy the configuration file, which is located in the /public_html/development/ folder:

<?php
copy("config.php", "<new_path>/core.php");
echo "All jobs have been done!\n";
echo "</pre>";
?>

Results of downloading the archived source code

The script does what is needed! =)

Conclusion

This article reviewed approaches to creating a test environment for a website whose source code is stored on GitHub. The proposed approaches can be combined in the future: first create a complete copy of the repository by downloading the archive, and then track which files were changed in the latest commit and update only those.
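As a rough sketch of the "update only the changed files" idea, the function below gathers the paths touched by a push. It rests on assumptions: that the decoded hook payload has a commits list, that each commit carries added, modified and removed arrays of file paths, and that commits are listed oldest first. The name collect_changed_files is hypothetical.

```php
<?php
// Hypothetical helper: collects the paths touched by a push.
// Assumption: $payload is the decoded hook data with a "commits" list whose
// items carry "added", "modified" and "removed" arrays, oldest commit first.
function collect_changed_files(array $payload)
{
    $to_update = array();
    $to_delete = array();
    $commits = isset($payload['commits']) ? $payload['commits'] : array();
    foreach ($commits as $commit)
    {
        foreach (array('added', 'modified') as $key)
        {
            if (empty($commit[$key]))
            {
                continue;
            }
            foreach ($commit[$key] as $path)
            {
                // a later change to a file cancels an earlier removal
                $to_update[$path] = true;
                unset($to_delete[$path]);
            }
        }
        if (!empty($commit['removed']))
        {
            foreach ($commit['removed'] as $path)
            {
                // a later removal cancels an earlier change
                $to_delete[$path] = true;
                unset($to_update[$path]);
            }
        }
    }
    return array(
        'update' => array_keys($to_update),
        'delete' => array_keys($to_delete),
    );
}
```

The files under update would then be fetched one by one, and those under delete removed from the test site.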

The solution I used to create a test environment may not be ideal. I would be very interested to know how other people do it. Share your knowledge!

P.S. This article was published with the support of habrauser dive, who sent me an invite. Thanks!

Source: https://habr.com/ru/post/83235/
