
Do-it-yourself Time Machine backup

Like it or not, the risk of losing files rises noticeably over the New Year holidays. This trouble did not pass me by. As you can easily guess, I picked the wrong disk when formatting and... yes, everything I had acquired through backbreaking labor was destroyed in an instant.



Mourning my software collection and my archive of scans, I started thinking about backups. And... I came to the conclusion that what I really needed did not exist. More precisely, it does exist, of course, but it is either expensive or does not work the way I would like.



Having finished torturing Google with "make it nice for me" queries, I decided to act like a true Unix person, albeit one working on Windows. Namely: don't show off; the simpler, the better.



Then I remembered the macOS presentation where Time Machine was demonstrated. If you think about it, being able to reach any file as it was on any given day is very convenient. But... if you make full copies as archives, no disk is large enough to store them all. Then I turned to incremental backups: the first time you make a full archive, and after that you archive only what has changed.


And... I rejected this option too, as inconvenient for my case. First, I need to be able to delete arbitrary "days". Second, I rename my files very often and from time to time move them from directory to directory. That, to my surprise, ruled out almost all the backup candidates: the software simply looked at the file name and the modification date and... that's it. As a result, the backup swelled.



So, two ideas saved me:

First, the name of a file does not matter; its content does. So several hashes can be taken from each file, and that signature identifies the file with sufficient accuracy. In my case I limited myself to the MD5 sum and the file size. The choice is certainly debatable, but for scans it is quite enough.

Second, if a file has not changed, or only its name has changed, there is no need to copy the whole file; it is enough to make a hard link to it, since NTFS fortunately supports them.
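The content signature described above can be sketched in PHP roughly as follows. This is a minimal illustration rather than the exact code of the script; the helper name `file_signature` is hypothetical, and it assumes the signature is simply the MD5 digest concatenated with the file size.

```php
<?php
// Hypothetical helper: build a content signature from the MD5 digest and the size.
// Returns null if the file cannot be read.
function file_signature(string $path): ?string
{
    $md5 = md5_file($path);
    $size = filesize($path);
    if ($md5 === false || $size === false) {
        return null;
    }
    return $md5 . $size;
}

// Two files with different names but identical content get the same signature.
$a = tempnam(sys_get_temp_dir(), 'sig');
$b = tempnam(sys_get_temp_dir(), 'sig');
file_put_contents($a, 'scanned page');
file_put_contents($b, 'scanned page');

var_dump(file_signature($a) === file_signature($b)); // bool(true)

unlink($a);
unlink($b);
```

Because the file name never enters the signature, a renamed or moved file is still recognized as already backed up.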



In case anyone does not know, the command:

fsutil hardlink create <new file name> <existing file name>

creates a real hard link on Windows.
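PHP also has a built-in `link()` function that creates hard links, so the idea can be tried without calling fsutil. The sketch below is an illustration under assumptions: the helper `link_or_copy` and the temporary file names are hypothetical, and the fallback to a plain copy is an added assumption (hard links only work within a single volume). It also shows why deleting an arbitrary "day" is safe: the data stays reachable through the remaining name.

```php
<?php
// Hypothetical helper: try to hard-link $existing under the name $newName,
// fall back to a plain copy if linking fails (e.g. different volumes).
// Returns 'link' or 'copy' to say what was done, or null if both failed.
function link_or_copy(string $existing, string $newName): ?string
{
    if (@link($existing, $newName)) {
        return 'link';
    }
    return copy($existing, $newName) ? 'copy' : null;
}

$dir = sys_get_temp_dir() . '/hl_' . uniqid();
mkdir($dir);
$old = $dir . '/day1.txt';
$new = $dir . '/day2.txt';
file_put_contents($old, 'same bytes');

link_or_copy($old, $new);
unlink($old);                       // delete the older "day"
echo file_get_contents($new), "\n"; // prints "same bytes"

unlink($new);
rmdir($dir);
```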



In the end I got a simple algorithm, which I implemented, without much ado, as a console PHP script. The backup now runs when the portable disk is connected to the computer or, if the disk is already connected, once a day.



And here is the backup script itself.

<?php
// Initialization
$dir = array();     // source directories, filled in by conf.php
$hah = array();     // signature => path, from previous backups
$hah_new = array(); // signature => path, for this backup
$file = array();    // files to back up
$copy = 0;
$link = 0;
include 'conf.php';
$date = date('Y-m-d');

// Do nothing if today's backup already exists
if (is_dir($date)) {
    exit("Backup already exists\n");
}

// Load the signature databases of all previous backups
foreach (glob('*', GLOB_ONLYDIR) as $v) {
    if (is_file($v.'/hah.db')) {
        $hah = array_merge($hah, unserialize(file_get_contents($v.'/hah.db')));
    }
}

// Create the top-level directories: <date>/<drive letter>/<path>
foreach ($dir as $v) {
    $x = explode('/', $v);
    array_unshift($x, $date);
    $x[1] = substr($x[1], 0, 1);
    foreach ($x as $k => $part) {
        $y = implode('/', array_slice($x, 0, $k + 1));
        if (!is_dir($y)) {
            mkdir($y);
        }
    }
}

// Walk the source trees: mirror the directories and collect the files
while ($n = array_pop($dir)) {
    if (!is_dir($date.'/'.substr($n, 0, 1).'/'.substr($n, 3))) {
        mkdir($date.'/'.substr($n, 0, 1).'/'.substr($n, 3));
    }
    $dir = array_merge($dir, glob($n.'/*', GLOB_ONLYDIR));
    $file = array_merge($file, array_diff(glob($n.'/*'), glob($n.'/*', GLOB_ONLYDIR)));
}

// Hard-link files whose content is already backed up, copy the rest
foreach ($file as $k => $v) {
    $x = md5_file($v).filesize($v);
    if (!$x) {
        continue;
    }
    $f = $date.'/'.substr($v, 0, 1).'/'.substr($v, 3);
    if (isset($hah[$x])) {
        exec('fsutil hardlink create "'.$f.'" "'.$hah[$x].'"');
        $hah_new[$x] = $f;
        $link++;
    } else {
        copy($v, $f);
        $hah_new[$x] = $f;
        $copy++;
    }
    print ceil($k * 100 / count($file))."%\r";
}
print "\nLink: ".$link."\n";
print "Copy: ".$copy."\n";

// Save the signature database of this backup
file_put_contents($date.'/hah.db', serialize($hah_new));
exit;




Its configuration (conf.php):

<?php
date_default_timezone_set('Asia/Novosibirsk');
// Directories to back up
$dir[] = 'c:/scan';
$dir[] = 'c:/web';
$dir[] = 'c:/gohsrf';
$dir[] = 'q:';




And the batch file that launches it:

@echo off
cls
php backup.php
pause

Source: https://habr.com/ru/post/135798/


