What to do when RAM fails. Medical history and treatment methods
RAM is a part of the system that rarely fails. Yet spontaneous reboots with or without a BSOD, crashes in games and applications, and incorrect results in heavy computational software can all be symptoms of memory problems. In fact, such problems occur fairly often and are mostly the result of incorrect configuration by the user, although hardware faults cannot be ruled out. In this article we look at current memory modules for desktop systems, describe the problems that can occur in their operation and the reasons behind them, and help with diagnostics. Why do memory malfunctions occur? What should you do, and what should you avoid? In answering these questions we will not overload newcomers: everything is explained in plain language.
What does a memory module consist of?
From the point of view of circuit design, RAM is a very simple device compared with the other electronic components of the system, fans aside (some of which have a simple controller that implements PWM control). What components is a module assembled from?
The chips themselves are the key elements that determine the speed of the memory.
SPD (Serial Presence Detect) is a separate chip containing information about a specific module.
The key is a notch in the printed circuit board that prevents a module from being installed in a board that does not support that memory type.
The circuit board itself.
Various kinds of SMD components located on the printed circuit board.
Of course, this list is far from complete, but it is enough for the memory to work. What else might there be? Most often, heatsinks. They help cool high-frequency chips, which often (though not always) run at a higher voltage, and they also help when the user overclocks the memory.
Some will say this is just marketing. In some cases it is, but not with HyperX: the 4000 MHz Predator modules easily heat their heatsinks to 43 degrees, as we found in our article about them. Overheating, by the way, is also covered below.
Next comes lighting. Some manufacturers fit single-color LEDs, others full RGB, with customization through switches on the modules themselves, plug-in cables, or motherboard software.
HyperX engineers, for example, went further and placed infrared sensors on the modules, which are used to fully synchronize the lighting.
We will not go into detail here, since this article is about something else and we have covered these modules before; if you are interested, see the video below and our earlier article on the subject.
What must be cannot be avoided
When you choose budget memory from a little-known manufacturer, you are buying a pig in a poke: such modules may have been assembled “on the knee in Uncle Liao’s basement” by people who have never heard of quality control. In other words, problems may show up the first time you power on. Kingston's ValueRAM memory, of course, is not in that category, even though its prices are close to the minimum. Given the previous section, some users may argue that the more components a module has, the higher the chance that one of them fails. Logically, that cannot be refuted. But HyperX is confident enough in its products (the Predator RGB modules in particular) to cover them with a lifetime warranty! So what can actually fail? We leave LEDs and similar design elements out of consideration.
Damage to the memory cells.
Each memory chip contains a huge number of such cells, to which a huge amount of information is written and from which it is read. When data is written to a damaged cell, it is corrupted, which causes the system or an application to malfunction.
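To make the effect of a damaged cell concrete, here is a minimal Python sketch. It is only a simulation: the `FaultyMemory` class and the `checkerboard_test` function are hypothetical names invented for illustration, and the "memory" is an ordinary byte array with one bit that cannot hold a 1. It does not reflect how any real diagnostic tool is implemented; it only shows why pattern tests write complementary values and read them back.

```python
class FaultyMemory:
    """Simulated RAM (hypothetical): a byte array with an optional stuck-at-0 bit."""

    def __init__(self, size, stuck_addr=None, stuck_bit=0):
        self.cells = bytearray(size)
        self.stuck_addr = stuck_addr
        self.stuck_mask = 1 << stuck_bit

    def write(self, addr, value):
        if addr == self.stuck_addr:
            # The damaged bit can never store a 1, so the data is silently corrupted.
            value &= ~self.stuck_mask & 0xFF
        self.cells[addr] = value

    def read(self, addr):
        return self.cells[addr]


def checkerboard_test(mem, size):
    """Write alternating 0x55/0xAA, verify, then repeat with the pattern inverted.

    Returns a sorted list of addresses whose read-back value differed.
    """
    bad = set()
    for even_value, odd_value in ((0x55, 0xAA), (0xAA, 0x55)):
        for addr in range(size):
            mem.write(addr, even_value if addr % 2 == 0 else odd_value)
        for addr in range(size):
            expected = even_value if addr % 2 == 0 else odd_value
            if mem.read(addr) != expected:
                bad.add(addr)
    return sorted(bad)


healthy = FaultyMemory(64)
damaged = FaultyMemory(64, stuck_addr=10, stuck_bit=1)
print(checkerboard_test(healthy, 64))
print(checkerboard_test(damaged, 64))
```

The second pass with the inverted pattern matters: in this simulation, bit 1 of address 10 is written as 0 during the first pass, so the fault only shows up once the complementary value is written.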
Overclocking, wrong timings and voltage.
Every one of us has tried, or wants to try, overclocking memory. Not every platform allows the memory frequency to be raised, but even if you have bought a motherboard that supports overclocking, you may run into problems along the way. These days memory overclocking depends not only on the chips themselves but also on the memory controller built into the processor and on the trace routing on the motherboard, although the latter two affect overclocking less than the memory chips used. The higher you push the clock frequency of the modules, the more likely errors become. With timings it is the opposite: lowering them can make the memory unstable. Raising the memory voltage can improve the stability of overclocked memory, but it also leads to more heat, a shorter overall lifespan, and the possibility of failure at any moment. In general, if the system is unstable, the first step is to return all settings to their factory defaults.
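A small worked example may help explain why frequency and timings pull in opposite directions. Timings are counted in clock cycles, so the real time the chips get to respond depends on both numbers. The sketch below converts a CAS latency (CL) value into nanoseconds, using the fact that DDR memory performs two transfers per clock. The function name and the kit figures are illustrative assumptions, not measurements or specifications of any particular product.

```python
def cas_latency_ns(data_rate_mts, cl):
    """Approximate CAS latency in nanoseconds for DDR memory.

    data_rate_mts: effective data rate in MT/s (the "MHz" figure on the box).
    cl: CAS latency in clock cycles.
    The real clock is half the data rate, so one cycle lasts 2000 / data_rate ns.
    """
    return cl * 2000 / data_rate_mts


# Hypothetical kits, chosen only to illustrate the arithmetic.
for rate, cl in ((2400, 17), (3200, 16), (4000, 19)):
    print(f"DDR4-{rate} CL{cl}: {cas_latency_ns(rate, cl):.2f} ns")
```

Raising the frequency at the same CL, or lowering the CL at the same frequency, both shrink the time window in nanoseconds, which is why either change can push the chips past what they can do reliably.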
Overheating.
Yes, high memory temperatures can also affect system stability. So when you choose a high-frequency kit, take care of its cooling: at a minimum, the modules should have heatsinks. The same applies to low-frequency modules if you plan to overclock them. Do you want to install a fast memory kit in a workstation that relies on it for computation? Do you not believe that modern DDR4 with an operating voltage of 1.2 V can get very hot? Take a look! The temperature of the chips on modules without heatsinks reaches almost 85 degrees, which is the limit for most chips. Impressive, isn't it?
Mechanical damage.
One careless movement is enough to damage a memory module: chip off a memory chip or the SPD, or crack the traces in the PCB. With some kinds of damage the memory can still work, but with critical errors. The damaged SPD chip shown in the photo below, for example, made the module completely inoperable. As for heatsinks, they reduce the probability of mechanical damage to almost zero, unless, of course, you spill tea or coffee on the module...
Other sources of memory problems, where the memory itself is not to blame.
It should be said separately that memory can become unstable for reasons other than those described above. The problem may lie in the processor or the motherboard. In modern processors the memory controller is part of the processor itself, and it can misbehave for various reasons, especially under overclocking. Sometimes a “dead” memory channel, for example, will not come back to life even after the settings are reset to nominal, and in that case replacing the module will change nothing. Physical damage to the processor socket or the motherboard (bending or other external or internal stress) can also cause memory to work incorrectly. That is why we keep urging you to check all components separately before you go out and buy a new memory kit, which may turn out to be a waste of money. Kingston, for its part, offers a configurator that makes it easy to find suitable memory modules for a specific system. You can find it at https://www.kingston.com/ru/memory/searchoptions .
Be careful ...
Few people know that three letters can simplify the choice of system components: QVL. The abbreviation stands for Qualified Vendors List, in other words a compatibility list. It contains the components with which the motherboard manufacturer has tested its product and for which it guarantees correct operation. For obvious reasons, no manufacturer can test hundreds of products, but every self-respecting one publishes a fairly extensive list of, in our case, memory models.
Blue screens of death, freezes and reboots: the culprit is definitely ...
What is the minimum set of electronic components in a PC, laptop, or all-in-one? A motherboard, processor, drive, power supply, and RAM. All of these components are interconnected, so if one of them is unstable, the whole system fails. The most reliable way to diagnose a fault is to test each component in a different system. By elimination we can then identify the “weakest link” and replace it. But another system is not always available for this. For example, not every friend of yours will have a motherboard suitable for checking modules rated at 4000 MHz or so. Let's say the problem has been found and it lies in the memory. We tested the module several times in different slots and on a couple of motherboards, and it started working stably. Magic? As they say in the Marvel universe, magic is just technology that has not been studied yet, and in our case the secret is very simple. The contacts on memory modules oxidize over time, which prevents them from working correctly; when you remove and reinsert a module several times, the contacts get slightly polished, after which everything works normally again. In fact, contact oxidation is the most common cause of memory malfunctions (and not only memory), so make it a rule: if you have any problems with the platform, arm yourself with an ordinary office eraser and gently wipe the contacts on both sides. This applies precisely to cases where problems appear while the memory runs at its nominal settings after months or years of trouble-free operation.
If the eraser did not help
What to do next? If the system suffers catastrophic failures, the only option is to check the components on a known-good platform. If you suspect memory that is running at its nominal settings, you can run several tests. There is free and paid software for this; some of it runs under Windows or Linux, and some from DOS or even UEFI.
Let's start by assuming that every user has Windows 7 or newer. Oddly enough, the built-in Windows memory test is quite effective and is able to detect errors. It can be launched in two ways. From the Start menu:
Or through Win + R (the tool's executable is mdsched.exe):
Either way, the result is the same:
If the Basic or Standard tests did not reveal any errors, you should definitely run the test in “Extended” mode, which includes the tests from the previous modes and adds MATS+, Stride38, WSCHCKR, WStride-6, CHCKR4, WCHCKR3, ERAND, Stride6 and CHCKR8.
You can view the results in Event Viewer under “Windows Logs”, “System”. If there are many events, the easiest way to find the entry is to search (Ctrl + F) for MemoryDiagnostics-Results.
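If you prefer the command line to scrolling through Event Viewer, the same entries can be queried with the standard Windows `wevtutil` utility. The Python sketch below only builds and runs such a query. The helper names are hypothetical, and the full provider name `Microsoft-Windows-MemoryDiagnostics-Results` is an assumption based on the source name mentioned above, so check it in Event Viewer on your own system if the query returns nothing.

```python
import platform
import subprocess

# Assumed full provider name; Event Viewer shows the source as MemoryDiagnostics-Results.
PROVIDER = "Microsoft-Windows-MemoryDiagnostics-Results"


def build_query_command(count=5):
    """Build a wevtutil command that reads the newest `count` matching System events."""
    xpath = f"*[System[Provider[@Name='{PROVIDER}']]]"
    return [
        "wevtutil", "qe", "System",  # query events from the System log
        f"/q:{xpath}",               # filter by event provider
        "/f:text",                   # human-readable output
        "/rd:true",                  # newest events first
        f"/c:{count}",               # limit the number of events
    ]


def read_results(count=5):
    """Run the query on Windows and return its text output; return None elsewhere."""
    if platform.system() != "Windows":
        return None
    done = subprocess.run(build_query_command(count),
                          capture_output=True, text=True, check=False)
    return done.stdout


if __name__ == "__main__":
    print(read_results() or "This sketch only queries the event log on Windows.")
```

Depending on system settings, reading the System log may require an elevated command prompt.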
To check the memory, it is recommended to use programs that run before the OS boots. That way we can test as much of the memory as possible, which increases the chance of detecting errors if there are any. A very common program is MemTest86. It exists in two versions: one for legacy BIOS systems and one for UEFI-compatible platforms. For the latter the program is paid, although there is a free edition with limited functionality. If you are interested, a comparison table of the editions is available on the developer's official website: https://www.memtest86.com/features.htm .
This program is the best tool for finding memory errors. It has enough settings and presents the results clearly. How long should you test the memory? The longer the better when the probability of an error is low. If a memory chip is clearly faulty, the result will not be long in coming.
There is also MemTest for Windows. You can use it too, but it is less meaningful, because it does not test the memory occupied by the OS and by programs running in the background.
Because this program is not new, enthusiasts (mostly in Asia) write additional front ends for it, so that several copies can be launched conveniently and quickly to test a large amount of memory.
Unfortunately, updated versions of these front ends are usually available only in Chinese.
But our local enthusiasts write their own software too. A good example is TestMem5 by Serj.
Linpack could also be added to the list of tests, but it puts a full load on the processor as well, which risks overheating it, especially if AVX instructions are used. Besides, it is not really suited to memory testing; it is better for heating up the processor to study how effective the cooling system is, and for looking at the numbers. In general, this is not a benchmark for home use; it has a completely different purpose.
Quick solution to all problems
Unfortunately, there is no such thing, unless you have a thick wallet that lets you hand your PC over for diagnostics and repair. And even then, money will not make it quick, unless you simply buy a set of new components. To answer the questions posed at the very beginning of the article: system failures related to RAM can have several causes, and not all of them lie in the memory modules themselves; the processor and the motherboard can also be to blame. As for the memory itself, overclocking in any form affects stability, and a module can be physically killed by accident, through static electricity or a careless movement of the hand. If you have ruled out the motherboard and processor, made sure temperatures are in order, removed any overclock, and tested the modules in another system, and they still produce errors, you will have to go to the warranty department or, if the warranty period has expired, buy new modules. Only a few users will be able to fix the problem themselves: that requires finding the faulty chip and replacing it with a new one and, if necessary, correcting the SPD. Difficult, but possible. And do not forget about the eraser; perhaps the problem can be solved very quickly :)
For more information about HyperX and Kingston products, please visit the company websites.