
Monitoring git clone and git push events on a local GitLab server


Sometimes you want to monitor a local Git server to see who (name from LDAP) clones or pushes which project, and from where (IP address).

After studying the documentation, it became clear that this functionality is not available out of the box, or rather it is, but only in the paid edition of GitLab. Below is my experience of implementing such monitoring.

My recipe does not claim to be universal, but I hope it will be a useful starting point for many.

Description of my configuration:
We have “GitLab Community Edition 10.0.3” installed. Users are authenticated via LDAP, and they clone and push over SSH using the single account 'git'. In essence, this is the standard configuration for a more or less large company.

For each git clone and git push, messages appear in the '/var/log/auth.log' file stating that the user 'git' logged in to the system with a particular 'fingerprint':
cat auth.log
Oct 17 12:41:11 GitLab-test sshd[25931]: pam_unix(sshd:auth): authentication failure; logname= uid=0 euid=0 tty=ssh ruser= rhost=192.168.111.24 user=git
Oct 17 12:41:13 GitLab-test sshd[25931]: Accepted publickey for git from 192.168.111.24 port 55268 ssh2: RSA 63:3b:ca:8d:23:2f:d2:0c:40:ce:4d:2e:b1:2e:5f:7c
Oct 17 12:41:13 GitLab-test sshd[25931]: pam_unix(sshd:session): session opened for user git by (uid=0)
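
The "Accepted publickey" line is the one that carries both the IP address and the fingerprint. As a minimal sketch (the names `ACCEPTED_RE` and `parse_accepted_line` are made up for illustration and are not part of the script below), it could be parsed like this, assuming the colon-separated RSA fingerprint format shown in the sample:

```python
import re

# Pattern for sshd "Accepted publickey" lines in the format shown above.
# Assumption: the fingerprint is logged as colon-separated hex after "RSA".
ACCEPTED_RE = re.compile(
    r"sshd\[(?P<pid>\d+)\]: Accepted publickey for (?P<user>\S+) "
    r"from (?P<ip>\S+) port (?P<port>\d+) ssh2: RSA (?P<fingerprint>[0-9a-f:]+)"
)


def parse_accepted_line(line):
    """Return a dict with pid, user, ip, port and fingerprint, or None."""
    match = ACCEPTED_RE.search(line)
    return match.groupdict() if match else None


sample = (
    "Oct 17 12:41:13 GitLab-test sshd[25931]: Accepted publickey for git "
    "from 192.168.111.24 port 55268 ssh2: RSA "
    "63:3b:ca:8d:23:2f:d2:0c:40:ce:4d:2e:b1:2e:5f:7c"
)
info = parse_accepted_line(sample)
print(info["ip"], info["fingerprint"])
```

Newer OpenSSH releases log SHA256 fingerprints by default, so on such systems the pattern would probably need adjusting.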

Then a message appears in the '/var/log/gitlab/gitlab-shell/gitlab-shell.log' file stating that a user with a particular 'key' has cloned or pushed a particular project.

cat gitlab-shell.log
I, [2017-10-19T16:17:32.006429 #1115] INFO -- : POST http://127.0.0.1:8080/api/v4/internal/allowed 0.02417
I, [2017-10-19T16:17:32.006954 #1115] INFO -- : gitlab-shell: executing git command <git-upload-pack /var/opt/gitlab/git-data/repositories/tech/ansible-server.git> for user with key key-1030.
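
To show which parts of this line matter, here is a small sketch that extracts the action, the project and the key id. The names `SHELL_RE`, `ACTIONS` and `parse_shell_line` are illustrative assumptions, and the pattern is a simplified variant of the one used in the full script further down:

```python
import re

# Matches the "executing git command" line from gitlab-shell.log shown above.
SHELL_RE = re.compile(
    r"executing git command <(?P<command>\S+) \S*repositories/(?P<project>\S+)\.git> "
    r"for user with key key-(?P<key_id>\d+)\."
)

# git-upload-pack serves clone, fetch and pull; it is reported as 'clone' here.
ACTIONS = {"git-upload-pack": "clone", "git-receive-pack": "push"}


def parse_shell_line(line):
    """Return a dict with action, project and key_id, or None if no match."""
    match = SHELL_RE.search(line)
    if match is None:
        return None
    command = match.group("command")
    return {
        "action": ACTIONS.get(command, command),
        "project": match.group("project"),
        "key_id": int(match.group("key_id")),
    }


sample = (
    "I, [2017-10-19T16:17:32.006954 #1115] INFO -- : gitlab-shell: executing git "
    "command <git-upload-pack /var/opt/gitlab/git-data/repositories/tech/"
    "ansible-server.git> for user with key key-1030."
)
result = parse_shell_line(sample)
print(result)
```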

Finally, a message appears in the '/var/log/auth.log' file that the user 'git' has logged out:

cat auth.log
Oct 17 12:41:13 GitLab-test sshd[25944]: Received disconnect from 192.168.111.24: 11: disconnected by user
Oct 17 12:41:13 GitLab-test sshd[25931]: pam_unix(sshd:session): session closed for user git

After examining all the tables in the database, it became clear that the mysterious 'key' from the GitLab logs can be used to find the user's actual name and his 'fingerprint'.

Since the lines appear in the logs in strict order, the most recent entry in '/var/log/auth.log' with the 'fingerprint' we need contains the user's IP address. Even if messages from different users are interleaved, the lookup still works, because the 'fingerprint' to 'IP address' match is searched from the end of the file.
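
The "search from the end" idea can be sketched in a few lines. The helper names and the two extra log lines (IP 192.168.111.99 and the second fingerprint) are invented for the example; the full script below achieves the same thing with `grep ... | tail -n 1`:

```python
def last_line_with(lines, needle):
    """Return the last line that contains needle, or None if there is none."""
    for line in reversed(lines):
        if needle in line:
            return line
    return None


def last_line_in_file(path, needle):
    """Same lookup for a log file; reads the whole file into memory."""
    with open(path, errors="replace") as handle:
        return last_line_with(handle.readlines(), needle)


fingerprint = "63:3b:ca:8d:23:2f:d2:0c:40:ce:4d:2e:b1:2e:5f:7c"
auth_log = [
    # Hypothetical older login with the same key from another address.
    "Oct 17 12:40:02 GitLab-test sshd[25800]: Accepted publickey for git "
    "from 192.168.111.99 port 55100 ssh2: RSA " + fingerprint,
    # The entry from the article.
    "Oct 17 12:41:13 GitLab-test sshd[25931]: Accepted publickey for git "
    "from 192.168.111.24 port 55268 ssh2: RSA " + fingerprint,
    # Hypothetical login of a different user (different fingerprint).
    "Oct 17 12:41:14 GitLab-test sshd[25950]: Accepted publickey for git "
    "from 192.168.111.30 port 55300 ssh2: RSA "
    "aa:bb:cc:dd:ee:ff:00:11:22:33:44:55:66:77:88:99",
]
latest = last_line_with(auth_log, fingerprint)
print(latest)
```

Reading the whole file is fine for a sketch; for a large auth.log, streaming it or relying on grep as the script does is likely the better choice.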

The username is stored in the 'identities' table under the 'user_id', which in turn can be found in the 'keys' table by the 'key' that we see in the 'gitlab-shell.log' file.
 SELECT extern_uid FROM identities WHERE user_id = (SELECT user_id FROM keys WHERE id = 1030); 

The 'fingerprint' is found in the 'keys' table by the same 'key':
 SELECT fingerprint FROM keys WHERE id = 1030; 

SQL query to find the real username:
 SELECT extern_uid FROM identities WHERE user_id = (SELECT user_id FROM keys WHERE id = 1030); 
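
To try the two lookups without touching a real GitLab database, the sketch below uses an in-memory SQLite database as a stand-in for PostgreSQL. The table contents (user_id 7, the `jdoe` LDAP DN) and the function name `lookup_key` are invented for the example, and only the columns mentioned above are modelled:

```python
import re
import sqlite3

# In-memory stand-in for the two GitLab tables; only the columns used here.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE keys (id INTEGER PRIMARY KEY, user_id INTEGER, fingerprint TEXT);
    CREATE TABLE identities (user_id INTEGER, extern_uid TEXT);
    INSERT INTO keys VALUES
        (1030, 7, '63:3b:ca:8d:23:2f:d2:0c:40:ce:4d:2e:b1:2e:5f:7c');
    INSERT INTO identities VALUES
        (7, 'uid=jdoe,ou=people,dc=example,dc=com');
""")


def lookup_key(connection, key_id):
    """Return (username, fingerprint) for a key id, or None if unknown."""
    row = connection.execute(
        "SELECT i.extern_uid, k.fingerprint "
        "FROM keys k JOIN identities i ON i.user_id = k.user_id "
        "WHERE k.id = ?",
        (key_id,),
    ).fetchone()
    if row is None:
        return None
    extern_uid, fingerprint = row
    # Pull the login out of the LDAP DN; fall back to the raw value.
    match = re.search(r"uid=([-\w.]+)", extern_uid)
    username = match.group(1) if match else extern_uid
    return username, fingerprint


print(lookup_key(conn, 1030))
```

With psycopg2 the same parameterized query would use `%s` instead of `?` as the placeholder.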

After wrestling with this data, I came up with an algorithm for finding the username, the IP address, the project and the action performed:
1. follow new lines in the gitlab-shell log with tail -f,
2. as soon as a line matches the regular expression, go to the database and look up the username and the fingerprint by the 'key' obtained,
3. using the 'fingerprint', search 'auth.log' for the user's IP address and take the most recent entry.
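
Step 1 is the only part that has to run continuously. The script below delegates it to the python-tail module, but the idea can be sketched with the standard library alone. The function `follow` and its `max_idle_polls` parameter are assumptions made for this illustration (the latter exists only so that the demo terminates):

```python
import io
import time


def follow(handle, poll_interval=1.0, max_idle_polls=None):
    """Yield lines appended to an open file, polling like `tail -f`.

    max_idle_polls is a hypothetical escape hatch for demos and tests:
    stop after that many consecutive polls without new data.
    None means follow forever.
    """
    idle = 0
    while max_idle_polls is None or idle < max_idle_polls:
        line = handle.readline()
        if line:
            idle = 0
            yield line.rstrip("\n")
        else:
            idle += 1
            time.sleep(poll_interval)


# Demo on an in-memory buffer instead of a real log file.
buffer = io.StringIO("first line\nsecond line\n")
lines = list(follow(buffer, poll_interval=0, max_idle_polls=1))
print(lines)
```

For a real log one would probably call `handle.seek(0, 2)` first to skip the existing content. This sketch does not handle log rotation or partially written lines.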
The script is written in Python (may Python fans forgive me).
This is my first experience of writing anything in Python. I would appreciate constructive criticism and recommendations.

The script needs the following libraries and modules:
python-tail: to track new lines in the 'gitlab-shell.log' file
psycopg2: to work with PostgreSQL
gelfHandler: to send messages to the Graylog server

The script turned out to be quite large: it can pull the PostgreSQL connection settings from the gitlab.rb configuration file, check that the SSH log and gitlab-shell.log exist, write the result to a file and send it to the Graylog server, and write error messages to a file and to the console. Any of these options can be turned on or off.
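
Reading the connection settings from gitlab.rb can be illustrated separately. The sketch below is a pure-Python alternative to the grep pipeline used in the script. The names `DB_SETTING_RE` and `read_db_settings` and the sample file content are assumptions, while the `db_*` keys are the ones the script looks for:

```python
import re

# One uncommented "gitlab_rails['db_...'] = value" assignment per line.
DB_SETTING_RE = re.compile(
    r"""^\s*gitlab_rails\['db_(?P<name>\w+)'\]\s*=\s*(?P<value>.+?)\s*$"""
)


def read_db_settings(text):
    """Collect the db_* settings from gitlab.rb text into a dict of strings."""
    settings = {}
    for line in text.splitlines():
        match = DB_SETTING_RE.match(line)
        if match:
            # Strip surrounding quotes; numeric values stay as strings.
            settings[match.group("name")] = match.group("value").strip("\"'")
    return settings


# Hypothetical gitlab.rb fragment; commented-out lines are ignored.
sample = """
# gitlab_rails['db_adapter'] = "postgresql"
gitlab_rails['db_database'] = "gitlab"
gitlab_rails['db_username'] = "python_user"
gitlab_rails['db_host'] = "127.0.0.1"
gitlab_rails['db_port'] = 5432
"""
settings = read_db_settings(sample)
print(settings)
```

Trailing comments on the same line as a setting are not handled here.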

Script
#!/usr/bin/python
# -*- coding: utf-8 -*-
"""
Algorithm:
1. Follow '/var/log/gitlab/gitlab-shell/gitlab-shell.log' and pick out the lines that contain a 'key'.
2. Query the database for the username and the 'fingerprint' that belong to this 'key'.
3. Search "/var/log/auth.log" for the IP address in the most recent entry with this 'fingerprint'.

Sample line from gitlab-shell.log:
I, [2017-10-17T12:19:56.526131 #21521] INFO -- : gitlab-shell: executing git command <git-receive-pack /var/opt/gitlab/git-data/repositories/web/markets.git> for user with key key-11.

SQL query to find the 'fingerprint':
SELECT fingerprint FROM keys WHERE id = 11;
SQL query to find the real username:
SELECT extern_uid FROM identities WHERE user_id = (SELECT user_id FROM keys WHERE id = 11);
"""

# Libraries and modules
from gelfHandler import GelfHandler
import logging
import psycopg2
import re
import os
import tail
import subprocess
import sys
import datetime

time = str(datetime.datetime.now())

# Debug output: 'on' enables it, an empty string disables it
debug = 'on'
#debug = ''

# 'on' reports the full path to the GIT repository, 'off' only the project name
#pach_git_all = 'on'
pach_git_all = 'off'

# Paths to the log files:
log_file_GuiLab = '/var/log/gitlab/gitlab-shell/gitlab-shell.log'
log_file_SSH = '/var/log/auth.log'

# SQL connection settings. If the path to gitlab.rb is set, the settings are
# read from that file; otherwise the values below are used.
gitlab_rb = '/etc/gitlab/gitlab.rb'
#gitlab_rb = ''
sql_db = 'gitlab'
sql_user = 'python_user'
sql_password = 'qwer123'
sql_host = '127.0.0.1'
sql_port = '5432'

# Output to the Graylog server: 'yes' enables it, an empty string disables it
out_GrayLog = 'yes'
#out_GrayLog = ''
logger = logging.getLogger()
gelfHandler = GelfHandler(
    host='192.168.250.145',
    port=6514,
    protocol='UDP',
    facility='Python_parsing_log_file_GitLab'
)
logger.addHandler(gelfHandler)

# File to write the results to (an empty string disables it)
out_log_file_name = '/home/viktor/pars_log_GitLab.log'
#out_log_file_name = ''


# Report a missing log file to the console and to the output file, then exit
def funk_error_file(log_file):
    print "File", log_file, "not found!!!"
    if out_log_file_name:
        out_log_file = open(out_log_file_name, 'a')
        out_log_file.write("%s File %s not found!!!\n" % (time, log_file))
        out_log_file.close()
    sys.exit()


# Callback for tail: called for every new line in "gitlab-shell.log"
def funk_pars_gitlab_shell(string_gitlab_shell):
    # Default values:
    action = 'action'
    project = 'project'
    user_key = 0
    username = 'username'
    time_ssh = 'time_ssh'
    host_name = 'host_name'
    id_ssh_log_message = 0
    usr_name_git = 'usr_name_git'
    ip_address = 'ip_address'
    fingerprint_log_ssh = 'fingerprint_log_ssh'
    fingerprint = 'fingerprint'
    time_git = 'time_git'
    id_git_log_message = 'id_git_log_message'

    # Pick the regular expression: full repository path or project name only
    if pach_git_all == 'on':
        regexp_string_gitlab_shell = re.findall(r'^.*\[([^ ]+)\.[\d]+\s\#([^ ]+)\]\s.*<([^ ]*)\s([^ ]*)>\s{1,}for user with key\s{1,}key-([^ ]*)\.$', string_gitlab_shell)
    elif pach_git_all == 'off':
        regexp_string_gitlab_shell = re.findall(r'^.*\[([^ ]+)\.[\d]+\s\#([^ ]+)\]\s.*<([^ ]*)\s.+repositories/([^ ]*)\.git>\s{1,}for user with key\s{1,}key-([^ ]*)\.$', string_gitlab_shell)

    # Unpack the matched groups
    for arr_string__gitlab_shell in regexp_string_gitlab_shell:
        time_git = arr_string__gitlab_shell[0]            # timestamp from the GIT log
        id_git_log_message = arr_string__gitlab_shell[1]  # process id from the GIT log
        action = arr_string__gitlab_shell[2]              # git command, i.e. clone or push
        project = arr_string__gitlab_shell[3]             # project (or full path)
        user_key = arr_string__gitlab_shell[4]            # id of the user's SSH key

    # 'action' keeps its default value unless the line matched the pattern
    if action != 'action':
        # Connect to the database
        connect = psycopg2.connect(database=sql_db, user=sql_user, password=sql_password, host=sql_host, port=sql_port)
        curs = connect.cursor()

        # Find the user_id by the key id, then the username by the user_id.
        # The key id is passed as a query parameter rather than formatted
        # into the SQL string.
        sql_string_find_username = """SELECT extern_uid FROM identities WHERE user_id = (SELECT user_id FROM keys WHERE id = %s);"""
        curs.execute(sql_string_find_username, (user_key,))
        string_external_uid = curs.fetchall()

        # Find the fingerprint by the key id
        sql_string_find_fingerprint = """SELECT fingerprint FROM keys WHERE id = %s;"""
        curs.execute(sql_string_find_fingerprint, (user_key,))
        sql_string_fingerprint = curs.fetchall()

        # Close the connection
        connect.close()

        # Extract the values from the SQL results
        for fingerprint_array in sql_string_fingerprint:
            fingerprint = fingerprint_array[0]
        for username_array in string_external_uid:
            found_username = re.findall(r'^.*uid=([-\w\._]+)', username_array[0])
            if found_username:
                username = found_username[0]

        # Find the most recent line in auth.log that contains the fingerprint from SQL
        command = """grep '%s' %s | tail -n 1""" % (fingerprint, log_file_SSH)
        p = subprocess.Popen(command, shell=True, stdout=subprocess.PIPE, stderr=subprocess.STDOUT)
        str_line = ''  # stays empty if grep finds nothing
        for line in p.stdout.readlines():
            str_line = str(line)
        retval = p.wait()

        # Extract the time, host name, sshd process id, SSH user, IP address and fingerprint from the auth.log line
        arr_reg_exp_ssh_info = re.findall(r'^([\w*\s\d\:\s]+)\s([^ ]+)\ssshd\[(\d+)\]:\sAccepted publickey for\s([^ ]+)\sfrom\s([^ ]+)\sport\s[\s\w]+:\sRSA\s([\w\:]+)$', str_line)
        for data_ssh_info in arr_reg_exp_ssh_info:
            time_ssh = data_ssh_info[0]
            host_name = data_ssh_info[1]
            id_ssh_log_message = data_ssh_info[2]
            usr_name_git = data_ssh_info[3]
            ip_address = data_ssh_info[4]
            fingerprint_log_ssh = data_ssh_info[5]

        # Replace the git command in 'action' with a readable name
        if action == 'git-receive-pack':
            action = 'push'
        elif action == 'git-upload-pack':
            action = 'clone'

        # Debug output to the console
        if debug:
            print '----'
            print ' ', 'time_ssh', '\t\t', time_ssh
            print ' ', 'time_git', '\t\t', time_git
            print ' ', 'username', '\t\t', username
            print ' ', 'action', '\t\t', action
            print ' ', 'project', '\t\t', project
            print ' ', 'ip_address', '\t\t', ip_address
            print ' ', 'user_key', '\t\t', user_key
            print ' ', 'host_name', '\t\t', host_name
            print ' ', 'id_ssh_log_message', '\t', id_ssh_log_message
            print ' ', 'id_git_log_message', '\t', id_git_log_message
            print ' ', 'usr_name_git', '\t', usr_name_git
            print ' ', 'fingerprint_ssh', '\t', fingerprint_log_ssh
            print ' ', 'fingerprint_sql', '\t', fingerprint
            print '----'
            print '\n'

        # Send the record to the Graylog server
        if out_GrayLog:
            logger.warning(
                'Now message',
                extra={'gelf_props': {
                    'title_time_ssh': time_ssh,
                    'title_time_git': time_git,
                    'title_username': username,
                    'title_action': action,
                    'title_project': project,
                    'title_ip_address': ip_address,
                    'title_user_key': user_key,
                    'title_host_name': host_name,
                    'title_id_ssh_log_message': id_ssh_log_message,
                    'title_id_git_log_message': id_git_log_message,
                    'title_fingerprint': fingerprint
                }})

        # Append the record to the output file as {name:value} pairs
        if out_log_file_name:
            fields = [
                ('time_ssh', time_ssh),
                ('time_git', time_git),
                ('username', username),
                ('action', action),
                ('project', project),
                ('ip_address', ip_address),
                ('user_key', user_key),
                ('host_name', host_name),
                ('id_ssh_log_message', id_ssh_log_message),
                ('id_git_log_message', id_git_log_message),
                ('fingerprint', fingerprint),
            ]
            out_log_file = open(out_log_file_name, 'a')
            out_log_file.write(''.join('{%s:%s}' % field for field in fields) + '\n')
            out_log_file.close()


# Check that both log files exist; if one is missing, report it and exit
if os.path.exists(log_file_GuiLab):
    if os.path.exists(log_file_SSH):
        if gitlab_rb:
            # Read the database settings from gitlab.rb
            command_searh_db = """grep "gitlab_rails\['db_" %s|tr -s '\\n' '\ '""" % gitlab_rb
            p = subprocess.Popen(command_searh_db, shell=True, stdout=subprocess.PIPE, stderr=subprocess.STDOUT)
            str_db_line = ''  # stays empty if grep finds nothing
            for line in p.stdout.readlines():
                str_db_line = str(line)
            retval = p.wait()
            arr_db_con = re.findall(r'^.*db_database\'\]\s\=\s\"([^ ]+)\".*db_username\'\]\s\=\s\"([^ ]+)\".*db_password\'\]\s\=\s\"([^ ]+)\".*db_host\'\]\s\=\s\"([^ ]+)\".*db_port\'\]\s\=\s([^ ]+)\s.*$', str_db_line)
            for data_db_info in arr_db_con:
                sql_db = data_db_info[0]
                sql_user = data_db_info[1]
                sql_password = data_db_info[2]
                sql_host = data_db_info[3]
                sql_port = data_db_info[4]

        # In debug mode, print the connection settings in use
        if debug:
            print '\n'
            print 'debug ==> on'
            print 'Start', (os.path.basename(__file__))
            print '\n'
            print 'sql_db ==> ', sql_db
            print 'sql_user ==> ', sql_user
            print 'sql_password ==> ', sql_password
            print 'sql_host ==> ', sql_host
            print 'sql_port ==> ', sql_port
            print '----'

        # Start following the log
        t = tail.Tail(log_file_GuiLab)                # file to follow
        t.register_callback(funk_pars_gitlab_shell)   # called for every new line
        t.follow(s=1)                                 # poll once per second
    else:
        funk_error_file(log_file_SSH)
else:
    funk_error_file(log_file_GuiLab)


And, as a logical conclusion, a service unit for systemd:

cat /lib/systemd/system/pars_log_GitLab.service
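[Unit]
Description=Python parsing_log_file_GitLab
# Start after these units
#After=network.target postgresql.service
# Required units
#Requires=postgresql.service
# Desired units
#Wants=postgresql.service

[Service]
# Service type
Type=simple
# Restart policy
Restart=always
# Path to the PID file
PIDFile=/var/run/appname/appname.pid
# Working directory
#WorkingDirectory=/home/username/appname
# User and group to run the service as
User=root
Group=root
# Run the ExecStartPre commands with root privileges
#PermissionsStartOnly=true
# ExecStartPre: commands to run before the service starts
#ExecStartPre=-/usr/bin/mkdir -p /var/run/appname
#ExecStartPre=/usr/bin/chown -R app_user:app_user_group /var/run/appname
# Start command (no trailing '&': systemd starts the process itself, without a shell)
ExecStart=/usr/bin/python2 /usr/bin/pars_log_GitLab.py
# Timeout in seconds
TimeoutSec=300

[Install]
WantedBy=multi-user.target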


Now the script can be controlled by the following commands:
systemctl start pars_log_GitLab.service
systemctl status pars_log_GitLab.service
systemctl stop pars_log_GitLab.service

Do not forget to turn off debug before launch.

Thank you all for your attention!

Source: https://habr.com/ru/post/340768/

