Why I verify written data, or the story of one investigation

A recent Habr article about differences between supposedly byte-for-byte identical files brought back, from the depths of my memory (and my mailbox), a small piece of my correspondence with one of the engineers who was then responsible for the MPG drive line at Fujitsu. The correspondence was originally in English; for readers' convenience I give it in translation under the cut.

Honorable Sir,
In 2001 I already wrote to you about problems with my Fujitsu MPG3409AH disk. Now I am facing another problem - a much worse one, I fear. May I still contact you? If not, please tell me whom I should contact instead.

Monday, July 29, 2002, 8:57:37 AM, you wrote:
gffc> What problem do you have with your disk?

Let's start with the bureaucratic details:

Model: MPG3409AH
Serial Number: VLxxxxxxxxCF (August 2001)
Firmware revision: A9
The problem is this: from time to time a single bit in the data read from the disk changes from 1 to 0 - but only if, while the data is being read, data is also being transferred between the HDD on the primary channel and the CD-ROM on the secondary channel.

The position of the bit within the byte is always the same: xxxx1xxx turns into xxxx0xxx, always at an offset of the form XXXXX02E. This happens roughly once per 50 megabytes read or written, but otherwise completely at random.

For example:

File offset   Expected value   Read value
26002E        5A               52
C2D02E        8C               84
28002E        99               91
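
As a quick check of the pattern described, the three sample rows can be XORed to see which bit changed. This is only an illustrative sketch in Python; the helper name `describe_flip` is hypothetical and not part of the original correspondence.

```python
# Sample rows from the table: (file offset, expected byte, byte actually read)
samples = [
    (0x26002E, 0x5A, 0x52),
    (0xC2D02E, 0x8C, 0x84),
    (0x28002E, 0x99, 0x91),
]

def describe_flip(offset, expected, actual):
    """Summarise how `actual` differs from `expected` at `offset`."""
    diff = expected ^ actual  # bits that differ between the two values
    return {
        "offset_low_12_bits": offset & 0xFFF,
        "flipped_mask": diff,
        # exactly one bit set in the difference
        "single_bit": diff != 0 and diff & (diff - 1) == 0,
        # every differing bit was 1 in the expected value, so it went 1 -> 0
        "one_to_zero": (expected & diff) == diff,
    }

results = [describe_flip(*row) for row in samples]
for (offset, _, _), info in zip(samples, results):
    print(f"{offset:08X}: mask={info['flipped_mask']:08b} "
          f"low12={info['offset_low_12_bits']:03X} "
          f"single_bit={info['single_bit']} one_to_zero={info['one_to_zero']}")
```

For all three rows the XOR mask is 0x08 (bit 3) and the low 12 bits of the offset are 0x02E, which matches the xxxx1xxx to xxxx0xxx description.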

I first noticed the problem when copying a zip file from a CD to the hard drive: the file opened normally from the CD, but not from the hard drive, and a file comparison showed that one bit had been reset in exactly this way, from 1 to 0. Then, comparing a 130-megabyte file on another freshly burned CD with its original on the hard drive, I found that the copies sometimes match and sometimes do not (!!!). When I requested a byte-by-byte listing of the discrepancies, I got a similar result: the data read from the hard drive turned out to be corrupted from time to time. A byte that was corrupted on one read attempt turned out to be correct on the next, and vice versa.
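
A byte-by-byte listing of discrepancies of the kind mentioned here could be produced along the following lines. This is a minimal sketch in Python, not the tool actually used at the time; the function name `list_discrepancies` and the chunk size are assumptions made for illustration.

```python
def list_discrepancies(path_a, path_b, chunk_size=1 << 20):
    """Yield (offset, byte_a, byte_b) for every position where the files differ.

    If one file is shorter, a final (offset, None, None) marks where it ends.
    """
    offset = 0
    with open(path_a, "rb") as fa, open(path_b, "rb") as fb:
        while True:
            a = fa.read(chunk_size)
            b = fb.read(chunk_size)
            if not a and not b:
                break  # both files exhausted
            if a != b:
                for i, (x, y) in enumerate(zip(a, b)):
                    if x != y:
                        yield offset + i, x, y
                if len(a) != len(b):
                    # one file ended early; report where and stop
                    yield offset + min(len(a), len(b)), None, None
                    break
            offset += len(a)
```

Running the comparison several times over the same pair of files and comparing the lists is what exposes an intermittent fault: with stable media the list is identical on every pass, whereas here the damaged offsets changed from one attempt to the next.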

At first I blamed the memory module in my computer. I installed ECC memory and added a fan to it for good measure - to no avail. Then I suspected the CD-ROM and began comparing the file on the CD with the file on the hard drive and with a copy of the file on another computer. The comparison over the network always succeeded; the comparison with the hard drive did not. Next I suspected the hard disk controller (i845D chipset). I moved the hard drive to a computer with an older motherboard (a two-year-old Dell, so the chipset is certainly different, and so is the CD-ROM drive) - and the errors when comparing the hard drive against the CD were reproduced there too.

To me, this looks at first glance like a faulty cell in the hard drive's internal cache. However, there is one thing I cannot understand: why can I not reproduce the problem when copying from a master hard drive to a slave hard drive on the same controller, or from the problematic hard drive to itself? Perhaps because in that case the data flows half as fast as when copying between the primary and secondary IDE controllers?

My first problem, which I brought to you a couple of years ago, also raises suspicion: a Fujitsu drive of the earlier MPD series would hang at random, so badly that the RESET button did not help and only powering off did, at the moment a CD-ROM on the secondary IDE controller was being accessed.

If it were not for this additional but necessary condition (data exchange between the hard drive and the CD-ROM), I would not be so puzzled. Have you encountered such behavior?

Let me restate the set of necessary and sufficient conditions for the problem to occur:
  1. The hard drive under test must be set as Master on the primary IDE channel;
  2. The CD-ROM must be set as Master on the secondary IDE channel;
  3. The hard drive under test and the CD-ROM must both be actively transferring data at the same time.

Factors that do not affect reproduction (the problem still occurs):
  • after replacing the CD-ROM drive
  • after replacing the RAM
  • after replacing the power supply
  • after replacing the motherboard
  • on another computer

At the same time, if at least one of the required conditions is changed, the bug no longer reproduces:
  • if you move the hard drive to the secondary controller and the CD-ROM to the primary controller;
  • if you set the hard drive (or the CD-ROM) as Slave;
  • if the CD-ROM is idle while data is being read from the hard drive.

Thank you in advance for your attention.
Two months later I received a new hard drive from Fujitsu with an updated firmware revision. It worked fine ...
And the moral of this story is this: if I had not been in the habit of comparing copied files against their originals bit by bit, nobody would have noticed anything for a long time ...
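
For readers who want to adopt the same habit, a copy followed by a content comparison can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than the procedure used in 2002: the function name `copy_and_verify` is hypothetical, and the standard-library `shutil` and `filecmp` modules stand in for whatever copy and compare tools you prefer.

```python
import filecmp
import shutil

def copy_and_verify(src, dst):
    """Copy src to dst, then compare the two files by content.

    Returns True if the contents match, False otherwise.
    """
    shutil.copyfile(src, dst)
    # filecmp caches earlier results; clear the cache so that a repeated
    # check really re-reads both files instead of reusing an old answer.
    filecmp.clear_cache()
    # shallow=False compares file contents rather than only os.stat() data.
    return filecmp.cmp(src, dst, shallow=False)
```

One caveat: a read-back performed immediately after the copy may be served from the operating system's cache rather than from the disk itself, so a check like this is most convincing when it is repeated later or on a file larger than the available memory.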

Source: https://habr.com/ru/post/274235/

