
Bareos: tapes, Hyper-V and more

This is a post about life after "Getting Started with Bareos", and about the things that took the longest to dig out of its rather comprehensive manual.



If you have already run test jobs in a sandbox and can talk to Bareos via bconsole, read on.

Situation


We are an ordinary organization, not an IT company and not a hosting provider. We back up virtual machines from Hyper-V clusters, files from file servers, database dumps, and assorted other small things.

Why Bareos


Because a Windows client is available. As you know, Bareos is a dramatic fork of Bacula, a well-deserved and proven product. But by the time we were choosing, Bacula had closed the source code (and even the binaries) of its fd for Windows, so no. Veeam is good, but it costs as much as a season seat at FC Zenit's stadium. There was DPM, but no matter how much Antoine from Microsoft technical support and I fought with it, love never blossomed.

Installation


Installation has been described many times. What the Director is and how it works with the other daemons can be read, for example, here. I will only note that it is highly desirable for dir and sd to be the same version; the fd version matters less. As for autochangers, it feels like version 16 or higher is the way to go.

Basic setup


Out of DPM habit I wanted to create one big job and cram as much as possible into it. It turned out that small jobs are more convenient: if a run fails, a small job reruns faster and squeezes through the spooler better (more on that later). Job size does not affect restores, except that a job with hundreds of thousands of files can be slow at the file-selection stage.

Hyper-V


Virtual machines (VMs) run on Hyper-V clusters. Within a cluster the fd settings are identical on all nodes, and the hostname for all of them is the cluster name. The Director also lists the cluster as a client, with the cluster address. A VM can move to another cluster volume, so instead of a fixed path we point the FileSet at a script:

 FileSet {
   # one FileSet per virtual machine
   Name = "VM_lamachine-fs"
   Include {
     # the list of paths is produced by a script on the client (here the cluster "example.com")
     File = "\\|C:/Windows/System32/WindowsPowerShell/v1.0/powershell.exe -file c:/cmd/search-vm.ps1 -machine lamachine.example.com"
     Options {
       # compress on the client
       Compression = LZO
       # skip the .bin files under "Virtual Machines"
       RegexFile = ".*/Virtual Machines/.*.bin"
       Exclude = yes
     }
   }
 }

And here is c:\cmd\search-vm.ps1, which returns the paths for the given machine:

 Param(
     [string]$level,
     [string]$machine = "NOEXISTENTVM.example.com"
 )
 Import-Module failoverclusters

 $backuppath = @()
 $Cluster = Get-Cluster

 # find the cluster resource for the requested VM and extract its VmID
 $ClusterMachines = @()
 $ClusterMachines += Get-ClusterResource -Cluster $Cluster |
     where { $_.ResourceType -like "Virtual Machine" } |
     where { $_.Name -like "*$machine" } |
     select -Property OwnerNode, Name, @{
         Name = "VmID"
         Expression = { (Get-ClusterParameter -Cluster $Cluster -InputObject $_ |
             where { $_.Name -eq "VmID" } | select -Property Value).Value }
     }

 if ($ClusterMachines.Count -eq 0) {
     "NO MACHINES"
     exit 2
 }

 # collect the VM configuration path and the directories of all its virtual disks
 foreach ($ClusterMachine in $ClusterMachines) {
     $VM = Get-VM -ComputerName $ClusterMachine.OwnerNode -Id $ClusterMachine.VmID
     $path = $VM.Path.Replace('\','/')
     $backuppath += $path
     foreach ($HardDrive in $VM.HardDrives) {
         $drivepath = $HardDrive.Path | Split-Path -Parent
         $drivepath = $drivepath.Replace('\','/')
         if ($drivepath -notin $backuppath) {
             $backuppath += $drivepath
         }
     }
 }
 $backuppath
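
As mentioned above, the Director sees the whole cluster as one client at the cluster address. A minimal sketch of such a Client resource in bareos-dir.conf (the name, address, password and retention values here are made up for illustration):

 Client {
   Name = hvcluster-fd                 # hypothetical client name for the cluster
   Address = hvcluster.example.com     # the cluster address, not an individual node
   Password = "changeme"               # must match the Director entry in the nodes' bareos-fd.conf
   File Retention = 30 days
   Job Retention = 3 months
   AutoPrune = yes
 }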

Before the backup a snapshot (checkpoint) is created and after the backup it is deleted; for this there are a couple of borrowed, roughly adapted scripts.

Creation:

 #Copyright disclaimer:
 # Copyright (C) 2015, ITHierarchy Inc (www.ithierarchy.com). ALl rights reserverd.
 # This program is free software: you can redistribute it and/or modify
 # it under the terms of the GNU General Public License as published by
 # the Free Software Foundation, either version 3 of the License, or
 # (at your option) any later version.
 #
 # This program is distributed in the hope that it will be useful,
 # but WITHOUT ANY WARRANTY; without even the implied warranty of
 # MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
 # GNU General Public License for more details.
 #
 # You should have received a copy of the GNU General Public License
 # along with this program. If not, see <http://www.gnu.org/licenses/>.
 Param(
     [string]$level,
     [string]$machine = "noexist.example.com",
     #[string]$prefix = "",
     [int]$DayOfWeekForFullBackup = 2
 )
 Import-Module failoverclusters

 "Processing $machine via $env:computername"

 $dow = [int]$(get-date).DayOfWeek
 if ($dow -eq $DayOfWeekForFullBackup){ $prefix="Weekly" }
 $DateStamp=$(((get-date)).ToString("yyyyMMddTHHmmss"))
 if ($level -eq "Full"){$Backup=" Bacula -*"}Else{$Backup=" Bacula -$level*"}

 #$HyperVPath="C:\Hyper-V" #Set path to your Hyper-V Machines to be backed up
 #Sort out Actual Volume path to VM
 #$VMDrive=$HyperVPath.Substring(0,1)
 #$volume=Get-Volume $VMDrive
 #$TrueHyperVPath=$($HyperVPath.Replace("$($VMDrive):\",$($Volume.path)))

 #Get List of VMs
 $Cluster = Get-Cluster
 # let's initialize it like array (for simplier size check)
 $ClusterMachines = @()
 $ClusterMachines += Get-ClusterResource -Cluster $Cluster |
     where { $_.ResourceType -like "Virtual Machine" } |
     where { $_.Name -like "*$machine" } |
     select -Property OwnerNode, Name, @{
         Name = "VmID"
         Expression = { (Get-ClusterParameter -Cluster $Cluster -InputObject $_ |
             where { $_.Name -eq "VmID" } | select -Property Value).Value }
     }

 if ($ClusterMachines.count -gt 1){
     "Ambiguous machine name"
     exit 2
 }
 if ($ClusterMachines.count -ne 1){
     "Machine not found: absent, not in failover cluster or something"
     exit 2
 }

 foreach ($ClusterMachine in $ClusterMachines){
     $VM = Get-VM -ComputerName $ClusterMachine.OwnerNode -Id $ClusterMachine.VmID
     write-host "Working on VM $($vm.Name) @ '$($vm.Path)'"
     $CurrentSnapShots = $VM | Get-VMSnapshot
     foreach ($SnapShot in $CurrentSnapShots){
         if ($SnapShot.Name -like ("$($prefix)Backup*")){
             write-host "Removing VM Checkpoint '$($SnapShot.Name)'"
             $SnapShot | Remove-VMSnapshot # -ComputerName $ClusterMachine.OwnerNode
             $LoopCount=0
             do {
                 Write-host "Waiting for snapshot '$($SnapShot.name)' to delete..."
                 Start-Sleep -s 10
                 $LoopCount=$LoopCount+1
             } while ($VM.Status -eq "Merging disks" -and $LoopCount -lt 30)
         }
     }
     $label = "$($prefix)Backup-$level-$DateStamp"
     write-host "Creating Checkpoint $label ($($VM.Name))"
     $VM | Checkpoint-VM -SnapshotName $label
 }

Removal:

 #Copyright disclaimer:
 # Copyright (C) 2015, ITHierarchy Inc (www.ithierarchy.com). ALl rights reserverd.
 # This program is free software: you can redistribute it and/or modify
 # it under the terms of the GNU General Public License as published by
 # the Free Software Foundation, either version 3 of the License, or
 # (at your option) any later version.
 #
 # This program is distributed in the hope that it will be useful,
 # but WITHOUT ANY WARRANTY; without even the implied warranty of
 # MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
 # GNU General Public License for more details.
 #
 # You should have received a copy of the GNU General Public License
 # along with this program. If not, see <http://www.gnu.org/licenses/>.
 Param(
     [string]$level,
     [string]$machine = "noexist.example.com",
     [string]$vmmserver = "vldvmm.example.com"
 )
 Import-Module failoverclusters

 $Cluster = Get-Cluster
 $ClusterMachines = Get-ClusterResource -Cluster $Cluster |
     where { $_.ResourceType -like "Virtual Machine" } |
     where { $_.Name -like "*$machine" } |
     select -Property OwnerNode, Name, @{
         Name = "VmID"
         Expression = { (Get-ClusterParameter -Cluster $Cluster -InputObject $_ |
             where { $_.Name -eq "VmID" } | select -Property Value).Value }
     }

 # FIXME foreach
 foreach ($ClusterMachine in $ClusterMachines){
     $VM = Get-VM -ComputerName $ClusterMachine.OwnerNode -Id $ClusterMachine.VmID
     write-host "Working on VM $($vm.Name) @ '$($vm.Path)'"
     $CurrentSnapShots = $VM | Get-VMSnapshot
     foreach ($SnapShot in $CurrentSnapShots){
         if ($SnapShot.Name -like ("Backup*")){
             write-host "Removing VM Checkpoint '$($SnapShot.Name)'"
             $SnapShot | Remove-VMSnapshot # -ComputerName $ClusterMachine.OwnerNode
             $LoopCount=0
             do {
                 Write-host "Waiting for snapshot '$($SnapShot.name)' to delete..."
                 Start-Sleep -s 10
                 $LoopCount=$LoopCount+1
             } while ($VM.Status -eq "Merging disks" -and $LoopCount -lt 30)
         }
     }
 }


You can also choose not to delete the snapshot; then incremental backups become possible (the script has the beginnings of this functionality, see DayOfWeekForFullBackup).
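
The post does not show how these scripts are attached to the job. One plausible way, sketched here with made-up script and resource names, is the Client Run Before Job / Client Run After Job directives of the Job resource, with %l substituting the job level:

 Job {
   Name = "VM_lamachine"               # hypothetical
   JobDefs = "SundayTape"
   Client = hvcluster-fd
   FileSet = "VM_lamachine-fs"
   # create the checkpoint before the backup and remove it afterwards (script names are assumptions)
   Client Run Before Job = "C:/Windows/System32/WindowsPowerShell/v1.0/powershell.exe -file c:/cmd/create-checkpoint.ps1 -level %l -machine lamachine.example.com"
   Client Run After Job = "C:/Windows/System32/WindowsPowerShell/v1.0/powershell.exe -file c:/cmd/remove-checkpoint.ps1 -level %l -machine lamachine.example.com"
 }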

Tapes


We use tape libraries: a two-unit (2U) box with tape cartridges, one or two drives and an autochanger robot. Bacula, and by inheritance Bareos, gets along very well with tapes (better than with HDDs). What confused me: in one library Bareos discovered two autochangers, which should not happen. It turned out the device had been programmatically split into two "logical libraries" back in the days of fighting DPM. Go to the device's admin panel and disable the unnecessary one, and the system sees the correct number of changers: one. The devices can be listed with ls /dev/tape/by-id/ ; the entry with the "-nst" suffix is the tape drive, the one without it is the autochanger robot.
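
For reference, a rough sketch of how such a library could be described in bareos-sd.conf; the device paths and names here are illustrative, not taken from the post:

 Autochanger {
   Name = mylittlestorage
   Device = Drive-0
   Changer Device = /dev/tape/by-id/scsi-CHANGERID        # the entry without "-nst": the robot
   Changer Command = "/usr/lib/bareos/scripts/mtx-changer %c %o %S %a %d"
 }
 Device {
   Name = Drive-0
   Media Type = LTO
   Archive Device = /dev/tape/by-id/scsi-DRIVEID-nst      # the entry with "-nst": the drive
   AutoChanger = yes
   Automatic Mount = yes
   Removable Media = yes
   Random Access = no
 }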

Regarding using two drives to write to one pool (a set of volumes) in parallel: that would reduce the write time, but I did not do it. The backups fit into the chosen window anyway, while tape consumption could grow. But if anyone wants parallel writing, don't forget to set Prefer Mounted Volumes = No. Writing to two different pools at once works without problems or overhead.
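
For completeness, that directive sits in the Job resource; a tiny sketch with a hypothetical job name:

 Job {
   Name = "SomeParallelTapeJob"        # hypothetical
   ...
   Prefer Mounted Volumes = No
 }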

Scratch pools. I want to draw attention to them: it is from them that Bareos takes tapes to add to the pools it is about to write to. Bareos will not add an unknown tape to a working pool on its own. Therefore all new tapes go into the Scratch pool:

 label barcodes storage=mylittlestorage slot=1 pool=Scratch 

You can add tapes directly to a working pool instead of Scratch, but with several pools it is not always possible to predict how many tapes each of them will need. So let Bareos take them as the need arises.
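
A sketch of how the pools can be tied together (the working pool name and retention are illustrative): the working pool pulls fresh volumes from Scratch and sends recycled ones back there:

 Pool {
   Name = Scratch
   Pool Type = Backup
 }
 Pool {
   Name = WeeklyTape                   # hypothetical working pool
   Pool Type = Backup
   Storage = mylittlestorage
   Scratch Pool = Scratch              # take new volumes from here when needed
   Recycle Pool = Scratch              # and return expired, recycled volumes back
   Volume Retention = 2 months
   AutoPrune = yes
   Recycle = yes
 }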

Block and file sizes should be increased; this has a beneficial effect on tape speed. The manual essentially says that with LTO-4 and newer drives you should not be afraid to raise them considerably. So don't be shy.
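
Both knobs live in the sd Device resource. The values below are only an illustration of "noticeably bigger than the defaults"; keep in mind that the block size should be settled before volumes are written, since a volume is read with the block size it was written with:

 Device {
   Name = Drive-0
   ...
   Maximum Block Size = 1048576        # 1 MiB instead of the ~64K default
   Maximum File Size = 50G             # fewer file marks on the tape
 }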

So that Bareos is not tempted to write something onto a tape that is already sitting in a remote safe, it is better to change its status from Append to Used before taking it out:

 update volume=KYF389L6 volstatus=Used 

The same command can change other properties of a tape, for example move it to another pool:

 update volume=KYF389L6 pool=YetAnotherPool 

A tape is easy to steal (well, easier than an IBM DS8800), so encrypting the data is highly desirable. You can do this with the drive itself, but I prefer software solutions as more versatile and flexible. Just don't forget.
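
A minimal sketch of client-side data encryption in bareos-fd.conf; the paths are made up, and the master key is the thing you really must not forget to keep safe (only its public part goes on the clients, the private part stays offline):

 FileDaemon {
   Name = somehost-fd
   ...
   PKI Signatures = Yes
   PKI Encryption = Yes
   PKI Keypair = "/etc/bareos/somehost-fd.pem"   # this client's certificate and private key
   PKI Master Key = "/etc/bareos/master.cert"    # public master key only
 }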

It happens that Bareos has once written a label to a tape but no longer has a record of that tape in the database. Labeling it a second time does not work ("error: already labeled"); there is an add command, but in my case it caused problems after which the tape could not be used at all. So this one-liner was born (run in bash on the sd server, with bareos-sd itself stopped):

 mtx -f /dev/sg10 load 25 && mt -f /dev/st0 rewind && mt -f /dev/st0 weof && mt -f /dev/st0 rewind && mtx -f /dev/sg10 unload 

If the drive refuses to unload the tape, then:

 mt -f /dev/st0 offline 

Spooling


This is when data is written first to an SSD (or at least a fast HDD) and only then to tape. Compared with writing straight to tape, the drive's working time goes down (it writes faster from the spool), and so does the total run time when there are many jobs. With a single job the drive time still goes down, but the total time to complete the job increases.

To make it work, first specify the location and size of the spool on the sd side:

 Device {
   Name = Drive-0
   ...
   # how many jobs may use this device at once
   Maximum Concurrent Jobs = 20
   Spool Directory = /mnt/backup/spool
   Maximum Spool Size = 1950 G
   Maximum Job Spool Size = 1200 G
 }

and then enable it for specific jobs:

 JobDefs {
   Name = "SundayTape"
   ...
   Spool Data = Yes
 }

I keep this directive in the template for "tape" jobs; for "disk" jobs spooling is practically useless.

The principles are:


The spool size and the number of concurrent jobs depend on the total number of jobs, their sizes, the read speed (which may be limited by the network or by the spool itself), the tape write speed, the ratios of sizes and speeds across jobs, and human wishes (get results sooner, wear the drive less, or use fewer tapes). All of this could be tied together with hellish math, but I recommend tuning the spool by feel, because even the simplified rules are not simple:


The number of concurrently running jobs is limited in many places; search the documentation for "Concurrent Jobs =". I found it convenient to set a generously large number everywhere and then cap it at the required value on the specific device (sd Device).
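
A sketch of that idea with illustrative numbers: generous limits in the Director-side resources, the real cap on the sd Device:

 # bareos-dir.conf
 Director {
   Name = mydir
   ...
   Maximum Concurrent Jobs = 100
 }
 Storage {
   Name = mylittlestorage
   ...
   Maximum Concurrent Jobs = 100
 }

 # bareos-sd.conf
 Device {
   Name = Drive-0
   ...
   Maximum Concurrent Jobs = 20        # the limit that actually bites
 }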

About files


A Linux habit: crawl into the guts, pry things open and poke around. I wanted the same from Bareos file volumes, so that the right volume could be found and restored from even with the Director down. To that end I tried creating a new file for each job, with the job name in the file name. I then had to delete obsolete volumes with a cron script, and also make sure that every volume had a job in the database and vice versa. Bareos quickly started sprouting eerie crutches, so it was decided to abandon human-readable naming and use meaningless names plus Recycle (purging and reusing a file for another job). You cannot get anywhere without the Director anyway: if you lose the backup server, it is the first thing you restore.

IBM, by the way, recommends storing one job per file, and so far I agree with them.
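
A sketch of a disk pool along these lines (names and retention are illustrative): automatically labeled volumes with meaningless names, one job per file, recycled when expired:

 Pool {
   Name = FilePool                     # hypothetical
   Pool Type = Backup
   Label Format = "File-"              # generated names like File-0001, nothing human-readable
   Maximum Volume Jobs = 1             # one job per file volume
   Volume Retention = 1 month
   AutoPrune = yes
   Recycle = yes                       # purge and reuse the file for another job
 }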

Monitoring


The built-in reporting facilities are a bit^W^W also had to be supplemented with scripts. The most used one returns the status of the last run of every job.

 #!/bin/bash
 RED='\033[0;31m'
 NC='\033[0m' # No Color
 GREEN='\033[0;32m'
 YELLOW='\033[0;33m'
 JOBS=`su - postgres -c "psql -d bareos -c \"WITH summary AS (
     SELECT name, jobstatus, jobid,
            ROW_NUMBER() OVER(PARTITION BY name ORDER BY starttime DESC) AS rk
     FROM job p
     WHERE starttime > current_date - INTERVAL '5 days')
   SELECT s.* FROM summary s WHERE s.rk=1;\"" | grep "1$" | sed 's/ //g'`
 #echo "$JOBS"
 for job in $JOBS; do
   jobstatus=`echo $job | cut -d '|' -f2`
   jobname=`echo $job | cut -d '|' -f1`
   jobid=$(echo $job | cut -d '|' -f3)
   if [ "$jobstatus" == "R" ]; then
     printf "%-30s" "$jobname ($jobid)"
     echo -e "$YELLOW running$NC ($jobstatus)"
   elif [ "$jobstatus" == "W" ]; then
     printf "%-30s" "$jobname ($jobid)"
     echo -e "$YELLOW warning$NC ($jobstatus)"
   elif [ "$jobstatus" == "T" ]; then
     if [[ $1 == "printall" ]]; then
       printf "%-30s" "$jobname ($jobid)"
       echo -e "$GREEN OK$NC ($jobstatus)"
     fi
   else
     printf "%-30s" "$jobname ($jobid)"
     echo -e "$RED failed$NC ($jobstatus)"
   fi
 done

The original version of the script also checked whether encryption actually happened, but that turned out to be unnecessary: if there are problems with encryption the job ends with a fatal error, which is visible from the status anyway.

There is a version for Zabbix:
 #!/bin/bash
 # returns a numeric status of the last run of job $1 for Zabbix;
 # jobfiles/jobbytes are selected too so the "empty job" check below actually works
 JOBS=`su - postgres -c "psql -d bareos -c \"
   SELECT name, starttime, jobfiles, jobbytes, jobstatus
   FROM job p
   WHERE starttime > current_date - INTERVAL '62 days' AND name = '$1'
   ORDER BY starttime DESC LIMIT 1;\"" | sed 's/ //g' | grep "|.$"`
 for job in $JOBS; do
   jobname=`echo $job | cut -d '|' -f1`
   jobfiles=`echo $job | cut -d '|' -f3`
   jobbytes=`echo $job | cut -d '|' -f4`
   jobstatus=`echo $job | cut -d '|' -f5`
   if [ "$jobstatus" == "E" ] || [ "$jobstatus" == "f" ]; then
     #echo "Job $jobname failed ($jobstatus)."
     echo "3"
     exit
   elif [ "$jobstatus" == "W" ]; then
     #echo "Job $jobname with warning ($jobstatus)."
     echo "1"
     exit
   elif [ "$jobstatus" != "T" ] && [ "$jobstatus" != "R" ]; then
     #echo "Job $jobname not ok ($jobstatus)."
     echo "2"
     exit
   elif [ "$jobfiles" == 0 ] || [ "$jobbytes" == 0 ]; then
     #echo "Job $jobname is empty."
     echo "4"
     exit
   else
     echo "0"
     exit
   fi
 done

The discovery script (LLD) is tied to the bconsole output format and can easily break, but so far it works. The JSON is also assembled by hand, but for now that works too.

 #!/bin/bash
 FIRST=true
 JOBS=$(echo "show jobs" | bconsole | grep "^ *Name = \|Enabled = no" | sed 'N;/\n Enabled = no/d;P;D' | grep -v -e "-test\"$" | cut -d'=' -f 2 | grep -o "[a-zA-Z0-9_-]*")
 echo '{ "data": ['
 for job in $JOBS; do
   if [ "$FIRST" = false ]; then
     echo -n ","
   fi
   FIRST=false
   echo ""
   echo " {"
   echo "  \"{#JOBNAME}\": \"$job\""
   echo -n " }"
 done
 echo ' ] }'


I also love stacked graphs in Zabbix, for example the amount of tape occupied by different jobs:



It is evident that it is time to split the blue job into several smaller ones.

Nice little things



Future achievements



Ask questions: the software is decent, though it has a character of its own, and I want to contribute to its adoption.

And do not forget to check your backups.

Source: https://habr.com/ru/post/275259/

