
Good afternoon, Geektimes!
Some time ago, several interesting questions about how information is stored on hard drives came up on Toster, which made me want to dig a little deeper, so I did some research.
Some of this information has already appeared on Habr, but not all of it, and some things I simply could not find on the Russian-language Internet, so I decided to share what I found with the community.
About demagnetization of data on the disk.
Under normal living conditions (no abrupt changes in temperature, humidity, or pressure, and no shocks), the magnetized surface of a disk can store information for several decades. This is hard to guarantee, since real long-term industrial tests have not been conducted, and the tests that are carried out usually just substitute exposure to an aggressive environment for the passage of time.
But most agree that the magnetic field degrades at a rate of about 1% per year.
At the same time, this does not mean that after 50 years half of the disk will become unreadable: field degradation is not the same as failure, and the sensitivity of the read heads and the accuracy of the positioning mechanism play a large role here.
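A quick sanity check of that figure (a rough sketch, assuming the 1% loss compounds year over year, which the article does not actually specify):

```python
# If the field loses ~1% per year and the losses compound,
# even after 50 years roughly 60% of the magnetization remains.
for years in (10, 20, 50):
    print(f"{years} years: {0.99 ** years:.0%} of the field remains")
```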
Even within a single batch from a good manufacturer, the platters that come off the line differ slightly from one another, and each drive is carefully calibrated at the factory. Recalibration at home is not possible.
Over time it may outwardly seem that the magnetic recording has deteriorated, but in the overwhelming majority of cases the worsening of reads is caused by mechanical degradation of materials, which affects both positioning accuracy and head sensitivity.
If data that is important to you can no longer be read from an old hard disk, it is most likely a matter of degraded mechanics or electronics, and it can still be read by companies that specialize in data recovery: the drive is disassembled, the platters are removed and mounted in a separate device, and the data is then read from them directly.
Even if the mechanics and electronics are completely dead, the platters themselves, and the information on them, remain readable.
There are plenty of people with old disks lying in a drawer that can easily be read after 15 or even 20 years (I am one of them, by the way). And it also happens that a disk refuses to spin up just after its warranty period has expired.
So in modern disks it is the electronics and mechanics that fail first, connectors break, the interface standard may even become obsolete, but demagnetization of the data is hardly ever the main cause.
To this we can add that the low-level track and sector markings written by the manufacturer, which the user cannot rewrite by standard means, would be the first candidates for demagnetization. True, the field strength of these marks is much higher, which is noticeable under a microscope, but nothing lasts forever.
The conclusion from this section: there is no reason to overwrite the information on a disk just to "refresh" the magnetic recording.
It is much more important to protect the drive from aggressive external influences; the most elementary measure is to fasten it more securely to reduce vibration. Turning the disk on and off changes its temperature, so the material expands and contracts. This is one of the important reasons why fast HDDs live shorter lives than slow drives from the "green" series, which see a much smaller temperature swing. But do not forget that if the disk is not hot to the touch, that does not mean the metal has not expanded: every on-off cycle accelerates material degradation, it is just smaller in the "cold" disks.
If your computer regularly falls asleep and wakes up several times a day while powered from the mains, it makes sense to increase the idle time before the disk spins down when on mains power. Modern hard drives in idle mode consume only a couple of watts.
About sectors
A sector is not exactly 512 bytes. It is an area in which 512 bytes are allocated for user data. There is also service information: a low-level mark of the beginning and end of the sector, as well as an error-correction block, which usually follows the user data. Plus the unallocated space between sectors (the gap).
Sector marks are written by the manufacturer during so-called low-level formatting. In ancient times it could be done by the user from the BIOS, but it has long been unavailable through standard means. The amount of service data may vary depending on how the disk firmware is optimized, but it is commonly estimated that a sector together with its service data takes about 577 bytes. Plus the gap.
More precisely, that was the case before.
In 2007 an increase in sector size was proposed, and after the approval and standardization procedures, all disks released since 2011 are formatted with sectors of 4,096 bytes of user data (roughly 4,211 bytes including service data): the so-called Advanced Format.
Addressing became simpler, since for the same capacity there are now eight times fewer sectors; performance grew thanks to simpler calculations and larger blocks; and the efficiency of disk space utilization increased noticeably. By how much? Let's read the next paragraph.
ECC data block
In 512-byte sectors, the ECC block occupied 50 bytes. In 4,096-byte sectors, the ECC block grew to 100 bytes, but the number of sectors decreased, so in fact ECC now takes four times less space (100 bytes per 4,096 bytes of data versus 400 bytes per 8 * 512 bytes).
In addition, the correction algorithm works more efficiently on a longer chain of data, so space was saved and efficiency increased at the same time. According to various estimates, ECC computation became 5-10% faster, so the disk controller is less loaded and can spend time on other things. This indirectly improves overall read/write performance.
One of the main advantages, of course, is the space saving.
In total, with a smaller volume allocated to ECC blocks and a smaller overall number of sectors (fewer gaps, fewer marks, fewer sector addresses to keep track of), the space available for user data grew by more than 10%!
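To make the arithmetic concrete, here is a minimal Python sketch that reproduces the figures quoted above (50 and 100 bytes of ECC, 577 and 4,211 bytes per sector including service data); the inter-sector gaps are not counted, so treat the output as a rough estimate rather than exact vendor numbers.

```python
# Rough comparison of legacy 512-byte sectors vs Advanced Format 4K sectors,
# using the per-sector figures quoted in the article (gaps are ignored).
USER_512, TOTAL_512, ECC_512 = 512, 577, 50       # legacy sector
USER_4K,  TOTAL_4K,  ECC_4K  = 4096, 4211, 100    # Advanced Format sector

# ECC bytes needed to protect 4,096 bytes of user data
ecc_legacy = ECC_512 * (USER_4K // USER_512)      # 8 sectors * 50 = 400 bytes
ecc_af = ECC_4K                                   # 100 bytes
print(f"ECC per 4 KiB of data: {ecc_legacy} B (512) vs {ecc_af} B (4K)")

# Share of on-disk bytes actually available to the user
eff_legacy = USER_512 / TOTAL_512
eff_af = USER_4K / TOTAL_4K
print(f"Format efficiency: {eff_legacy:.1%} (512) vs {eff_af:.1%} (4K)")
print(f"Capacity gain: {eff_af / eff_legacy - 1:.1%}")
```

With these numbers alone the gain comes out to roughly 9-10%; the saved gaps and track marks push it past the 10% mentioned above.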
There is another small plus to large sectors. In the event of a surface defect, a large area gets marked as bad right away; marking off megabytes of bad surface in 512-byte sectors takes many times longer than in 4-KB ones.
In addition, the unreadable part gets marked off with more margin: just as when we cut a rotten or wormy piece out of a tasty apple we cut away some of the good flesh along with it, on a hard disk it is better not to mark off the bad area right up against its edge.
But of course, it is better to get rid of disks with bad blocks altogether.
The only exception is logical bad blocks. They are related precisely to the ECC: when for various reasons (a power cut, a firmware glitch, lunar storms...) the power suddenly went off and the ECC ended up incorrect, the disk controller will treat such a sector as failed. These can be fixed by rescanning the bad sectors; there are plenty of utilities for this now, starting with the famous Victoria.
About the virtual 512-byte sector
The logo with "512e" means that the disk itself is already 4kb-sector, but it works in the emulation mode of virtual 512 byte sectors.
The logo with "4Kn" says that the disk supports 4k native interface, such disks are on sale since 2014.
Many still-popular operating systems (I am talking here about Windows 7 and Windows Vista) do not support 4K disks natively.
However, old disks work fine with them, and new disks present an interface with virtual 512-byte sectors.
Virtual 512-byte sectors should be kept in mind when you are benchmarking 512e drives, or when running tests on an outdated OS.
For example, writing random 512-byte sectors under such conditions turns into "read 4 KB, modify, write 4 KB", which will produce an otherwise inexplicable drop in speed on the graph. At the same time, linear read and write speeds will show normal figures.
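A toy model of what the 512e emulation does conceptually (the class and counters here are purely illustrative, not the actual firmware logic): any logical 512-byte write that does not cover a whole physical 4K sector forces the drive into a read-modify-write of that sector.

```python
# Toy model of a 512e drive: logical 512-byte sectors on top of physical 4K ones.
PHYS, LOG = 4096, 512

class Disk512e:
    def __init__(self, phys_sectors: int):
        self.sectors = [bytearray(PHYS) for _ in range(phys_sectors)]
        self.phys_reads = self.phys_writes = 0

    def write_logical(self, lba512: int, data: bytes) -> None:
        assert len(data) == LOG
        phys_idx, offset = divmod(lba512 * LOG, PHYS)
        # Read-modify-write: the whole 4K sector is read, patched and rewritten.
        sector = self.sectors[phys_idx]
        self.phys_reads += 1
        sector[offset:offset + LOG] = data
        self.phys_writes += 1

disk = Disk512e(phys_sectors=1024)
for i in range(64):                      # 64 scattered 512-byte writes
    disk.write_logical((i * 17) % 8192, b"x" * LOG)
print(disk.phys_reads, "extra sector reads just to serve", disk.phys_writes, "writes")
```

A real drive can coalesce eight consecutive logical writes into one full-sector write, which is exactly why the linear test does not suffer; the model above deliberately ignores that.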
Windows supports 4K disks natively starting with Windows 8 and Windows Server 2012.
About Cluster Straddling.
This applies to disks that operate in 512e emulation mode (and there are still a lot of them around).
Take such a disk, divide it into partitions and format it with default settings. The standard NTFS cluster is 4 kilobytes. The HFS+ (or ext4) block is usually also 4 kilobytes. And the physical sector of the disk is 4 kilobytes as well. A very convenient size (even the x86 memory page is 4 KB).
But when partitioning a 512e disk, it may turn out that a partition starts not at the beginning of a physical 4K sector but with an offset of 512 bytes.
As a result, a 4-kilobyte cluster/block ends up lying across two 4-kilobyte physical sectors of the hard disk.
Every time such a cluster is read, the hard disk (because of the logic of its operation) has to read two whole sectors. Writes do not go smoothly either.
This problem is solved by various alignment utilities, such as WD Align Tool or HGST Align Tool for Windows 7 and above.
You only need to apply them AFTER you split the disk into partitions: the utility will check that the partition boundaries coincide with the start of a physical 4-KB sector, and move them if necessary. After that, you can work without a drop in performance.
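The check itself is trivial; here is a minimal sketch for a Linux machine, where the starting offset of a partition (in 512-byte units) is exposed in sysfs. The device and partition names are only examples.

```python
# Check whether a partition starts on a 4 KiB physical-sector boundary.
from pathlib import Path

PHYS_SECTOR = 4096  # physical sector size of a 512e / 4Kn drive

def is_aligned(disk: str, part: str) -> bool:
    # /sys/block/<disk>/<part>/start holds the start offset in 512-byte units
    start_512 = int(Path(f"/sys/block/{disk}/{part}/start").read_text())
    return (start_512 * 512) % PHYS_SECTOR == 0

# Example (hypothetical device names):
# print(is_aligned("sda", "sda1"))
```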
Where is information read faster: at the beginning or at the end of the disk?
On hard drives, the first sectors are on the outer edge of the platter, and the last ones are on the inner edge.
In the beginning the number of sectors per track was the same everywhere, but that was so long ago that hardly anyone remembers it. Now the tracks closer to the beginning of the disk (the outer edge) contain more sectors.
So the linear speed of writing and reading data located at the beginning of the disk is much higher. Exact numbers depend on the particular disk, but in percentage terms the difference between the outermost and innermost tracks can reach 200% and even a bit more (!)
The number of sectors per track is specified not for each track individually but for a zone into which several tracks are grouped, so the difference in speed is visible not between two adjacent tracks but between the two outermost zones, and it decreases gradually toward the middle of the disk. In addition, one can say empirically that there are more "fast" sectors on the disk, simply because more of them fit on the outer part of the platter.
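If you want to see this on your own drive, a rough sketch along these lines will do (Linux, reading the raw block device, so it needs root; the device path is just an example, and OS caching plus background activity will add some noise):

```python
# Compare sequential read speed at the outer (start) and inner (end) part of a disk.
import os, time

DEV = "/dev/sdb"          # example device, point it at an idle disk
CHUNK = 1 << 20           # read in 1 MiB pieces
TOTAL = 256 << 20         # read 256 MiB per measured region

def read_speed(offset: int) -> float:
    """Sequential read speed in MiB/s starting at the given byte offset."""
    with open(DEV, "rb", buffering=0) as f:
        f.seek(offset)
        start = time.monotonic()
        read = 0
        while read < TOTAL:
            chunk = f.read(min(CHUNK, TOTAL - read))
            if not chunk:
                break
            read += len(chunk)
        return (read / (1 << 20)) / (time.monotonic() - start)

fd = os.open(DEV, os.O_RDONLY)
disk_size = os.lseek(fd, 0, os.SEEK_END)
os.close(fd)
print("outer tracks:", round(read_speed(0)), "MiB/s")
print("inner tracks:", round(read_speed(disk_size - TOTAL)), "MiB/s")
```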
How to store?
Compared with CDs, DVDs and flash drives: CDs and flash drives clearly lose in terms of how long they hold data. DVDs could argue, but everything is ambiguous: you need high-quality discs, a good drive, and recording below maximum speed, and even then the data may eventually stop being readable. Besides, 4.5 or even 9 GB on a DVD is not that much, and the convenience leaves much to be desired. And you can write it only once; DVD-RW is not even worth considering for long-term storage.
In my time I burned over 5,000 CDs/DVDs and tested how they read back. Of course, read quality and durability depended on the quality of the disc, but even Verbatim, once one of the reference brands for CD-R 650, turned out rather mediocre on DVD. And any batch could contain an unlucky disc.
If you look at Blu-ray discs, the cost of a burner and the discs is such that it is almost equivalent, if not cheaper, to simply buy a new hard disk after 5 years and copy the data onto it.
At the moment, inexpensive ways to store personal data mainly come down to the following:
* If there is not too much data and your Internet connection allows it, you can keep it in the cloud, or better in two different independent clouds, encrypting the data beforehand with a script or an archiver. Here I will put in a word for WinRAR, which, besides password-protected archiving, can also add ECC of its own. You make the archive a certain percentage larger, but gain the ability to recover data from any damaged spot inside the archive, within that percentage. You can even split the archive into volumes and create the recovery volume as a separate file (see the sketch after this list). Back in the day I actively used this with floppy disks, when a whole floppy could simply refuse to read in another floppy drive.
* A removable HDD, but I recommend replacing the media with a newer one every 3-5 years, trying not to go too far past the warranty period. You can simply buy a SATA/USB adapter, upgrade the system disk to a faster or more capacious one, and hand the old disk over to backups.
* Buy an inexpensive home NAS with RAID and set up an ordinary simple mirror. This option is noticeably more expensive than the previous two, but if one of the disks fails, you only need to swap the broken disk for a new one, and the RAID controller itself attaches the new disk to the array and fills it with data. That is, nothing has to be reconfigured, and nothing has to be hunted down and restored from assorted backups. You just replace the disk and that's it. A NAS also consumes very little power, so it can be left on all the time and all backup processes can be automated.
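For the WinRAR option mentioned in the first item, a minimal sketch of the command-line invocation, wrapped in Python (it assumes the rar executable is installed and on PATH; the file names, password, volume size and percentage are placeholders, so check the switches against your WinRAR version):

```python
# Sketch: make a password-protected RAR archive with a recovery record
# before uploading it to the cloud. The paths and password are placeholders.
import subprocess

subprocess.run(
    [
        "rar", "a",        # a = add files to an archive
        "-hpMySecret",     # encrypt both file data and headers with a password
        "-rr5%",           # add a ~5% recovery record (the built-in ECC)
        "-v1g",            # optionally split into 1 GB volumes...
        "-rv1",            # ...and create a separate recovery volume
        "backup.rar",
        "photos/", "documents/",
    ],
    check=True,
)
```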
UPD:
DaemonGloom recommends the excellent WD My Cloud Mirror device, which costs almost the same as the bare hard drives plus a small premium for the case/controller:
"At current prices - a 2x4TB device gives $ 100 overpayment, 2x6TB - $ 80."
Personally, I back up everything important to a second disk and periodically upload the archives to an external USB drive by hand.
That gives me a) a working copy, b) a daily archive on the second disk, and c) a roughly monthly archive on an external, disconnected disk. But lately I have been seriously thinking about a NAS.
And how do you store your data?