The strange crashes of my Ryzen desktop

On Wednesday my Ryzen 3 2200G started to crash frequently, and I have the feeling that it will take time to get it repaired. So for the moment I have to rely on my backup, a 12-year-old HP Elitebook 8460p. My desktop has the following devices:

2019 Ryzen 3 2200G (4C4T; 3.7GHz OC); 16GB DDR4 (Geil 3000MHz); 512GB Silicon Power nvme SSD (3400/2300MB/s); 2TB Seagate HDD (192 MB/s) cached by a 128GB Silicon Power sata-SSD; ASRock B450 HDV Rel 4.0 motherboard and an Xtech desktop/power supply combo (600W).

The software is Ubuntu 22.04 with Linux 5.19.0-40, VirtualBox 7.0.6 and OpenZFS 2.1.5, and it runs from the nvme-SSD.

On Wednesday the system started to crash while running Firefox. The screen went black and the system tried to recover/reboot, producing hundreds of useless error messages.
If I power off the system and reboot, it works again for a few minutes without issues and then the same type of crash repeats. Sometimes it crashed when running or starting a VM, but the surest way to get a crash within 5 minutes is to run YouTube in Firefox.

Software-wise, I have run the current releases for months without any issue; however, during or after the weekend Linux was upgraded to 5.19.0-40. To eliminate that cause I rebooted with 5.19.0-38, and also booted from another partition on my HDD with the latest 5.15.0-?? version, but in both cases the crashes kept occurring. So it is NOT a software issue.

I ran the memory test program and it did not report any errors. I could run it for hours, far longer than the ~15 minutes the system otherwise survived, and it stayed clean. I re-seated the memory and swapped the location of the sticks, but the crashes kept appearing.

I re-seated the nvme-SSD, but that did not change anything. I ran an OpenZFS scrub on the 492GB partition and it did not find any errors, so that part is fine. The problem also occurs when running from the HDD, so the nvme-SSD is not causing the issue.
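For anyone who wants to check a scrub result from a script rather than by eye, here is a minimal sketch in Python. It assumes the `scan:` line that `zpool status` prints after a finished scrub looks like the sample in the test string below; the pool name `tank` and the function names are made up for illustration, so adjust them to your own setup.

```python
import re
import subprocess

# Matches the summary line of a finished scrub, for example:
#   scan: scrub repaired 0B in 00:21:07 with 0 errors on Sun Apr  9 00:45:08 2023
# The exact wording is an assumption based on typical OpenZFS 2.x output.
SCAN_RE = re.compile(r"scan:\s+scrub repaired (\S+) in (.+?) with (\d+) errors")


def parse_scrub_result(status_text):
    """Return repaired amount, duration and error count, or None if no
    finished-scrub line is present in the given `zpool status` text."""
    match = SCAN_RE.search(status_text)
    if match is None:
        return None
    return {
        "repaired": match.group(1),
        "duration": match.group(2),
        "errors": int(match.group(3)),
    }


def scrub_result_for_pool(pool):
    """Run `zpool status <pool>` and parse its output (hypothetical helper)."""
    output = subprocess.run(
        ["zpool", "status", pool], capture_output=True, text=True, check=True
    ).stdout
    return parse_scrub_result(output)
```

Calling `scrub_result_for_pool("tank")` on a machine with such a pool returns a small dictionary. A clean scrub only says the data on disk matches its checksums; it says nothing about the power supply or motherboard.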

I never power off the system, but the electricity company powers off my house 3 to 30 times per week :slight_smile: Instead of powering off the system, I suspend it after 30 minutes of inactivity. To protect my computers I have used an Avtek 1200W Surge Protector since 2012. The power supply has thus been in use 24/7 for 4 years, which I noticed because I had to replace its fan.

In order of plausibility I expect that one of the following items causes the issue:

  • Xtech power supply, bought locally in 2019 for DOP 800, roughly $15.
  • ASRock B450 HDV Rel 4.0 motherboard, bought for $60 from Newegg in 2019
  • Ryzen 3 2200G, bought for $97 from Newegg in 2019.

I live in Santiago de los Caballeros. The one shop here with a professional repair department concentrates on selling off-lease computers, so they have no experience with Ryzen yet. The other shops don’t really sell Ryzen PCs; currently they sell 10th gen Intel.
I think I will ask the first one to have a look and to try another power supply. The second step would be to do the upgrade to a Ryzen 5 5600G now. The last possibility is to buy another B450 motherboard.

Any other suggestion?

I had a somewhat similar issue four or five months ago. My desktop would hard freeze once or twice a day. Firefox seemed to be involved in some way as well. I ran memtest overnight. I ran diagnostics on the drives in Disks. I suspected that it was the BIOS/UEFI firmware, which had never received an update since I built the PC five or more years earlier. Fortunately, I dual boot Windows 10 with Fedora, so it was relatively simple to update the motherboard firmware. That stopped the hard freezes.

If you don’t know what firmware your motherboard uses, CPU-X will give you that information and it’s available in Linux.
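If you would rather not install anything, the same information is usually exposed by the kernel under `/sys/class/dmi/id`. The following is a minimal Python sketch, assuming those sysfs files are present and readable on your system; the helper names are made up for illustration.

```python
from pathlib import Path

# Standard Linux sysfs location for DMI/SMBIOS identification strings.
DMI_DIR = Path("/sys/class/dmi/id")

FIELDS = ("board_vendor", "board_name", "bios_vendor", "bios_version", "bios_date")


def read_dmi_field(name, base=DMI_DIR):
    """Return the stripped content of one DMI file, or None if unreadable."""
    try:
        return (Path(base) / name).read_text().strip()
    except OSError:
        return None


def firmware_summary(base=DMI_DIR):
    """Collect board and firmware identification fields into a dict."""
    return {field: read_dmi_field(field, base) for field in FIELDS}


# Print whatever is available on this machine.
for field, value in firmware_summary().items():
    print(f"{field}: {value or 'unavailable'}")
```

The `bios_version` value is the one to compare against the release list on the motherboard vendor's download page.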

Maybe, but last year I asked ASRock which release they suggested for an upgrade to a Ryzen 5 5600G. I upgraded the motherboard firmware to release 4.50, as advised by ASRock. I have also downloaded a newer release, 4.80, and there are five even newer releases (4.86 Beta; 4.90; 7.20 Beta; 7.40 and 10.01). I have a Raven Ridge CPU, and ASRock advises against updating the firmware from release 3.20 onwards for Raven Ridge CPUs, so I have already passed that level. I will download the last two.

I’m a little bit afraid to upgrade the firmware on a doubtful PC now. I will first try the power supply, which is risk-free and relatively cheap, and afterwards I could try the firmware upgrade.

If you’re already past the recommended firmware you probably shouldn’t go further. If anything it would be better to roll back to 3.20 after doing some research to learn if anyone else has had the same problem with going beyond 3.20 with your hardware for your OS.

I replaced the power supply with a $22 power supply (600W) of the same brand. I do not have much choice locally and I wanted to start by replacing the cheapest of the suspected components. I brought the desktop back to life in 3 steps, using a day for testing each step:

  1. First I ran everything from the HDD (2TB; 192MB/s) and rolled back the VMs that had problems.
  2. I added the sata-SSD for the caching (128GB; 530MB/s and NO disk-head movements!).
  3. I added the nvme-SSD and ran and tested the VMs that crashed a week ago. I also booted the host OS from the nvme-SSD without any problems.

Contrary to what I originally thought, it had nothing to do with the nvme-SSD. I got that impression because the system kept crashing on certain VMs on the nvme-SSD. But the problem was my 4-year-old $18 Xtech power supply (500W), which I had used 24/7 for 4 years. It just switched off during work, thus corrupting some of the most used VMs on the nvme-SSD.

The new power supply with 20% more power should be good for another 4+ years, because I will try to get hibernation working and, when not hibernating, I will power off the PC. No more 24/7 duty for a cheap power supply :slight_smile:

I noticed that my L2ARC is much more effective in caching real data and code, it is now skipping the caching of zeroes :slight_smile:
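One way to put a number on L2ARC effectiveness is to read the counters that OpenZFS on Linux publishes in `/proc/spl/kstat/zfs/arcstats`. Below is a minimal Python sketch, assuming the usual three-column `name type data` layout and the `l2_hits` / `l2_misses` counters; the function names are made up for illustration.

```python
from pathlib import Path

# Kernel statistics file published by OpenZFS on Linux.
ARCSTATS = Path("/proc/spl/kstat/zfs/arcstats")


def parse_arcstats(text):
    """Parse 'name type data' rows into a dict of integer counters.
    Header rows are skipped because their third column is not a number
    or because they do not have exactly three columns."""
    stats = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 3 and parts[2].isdigit():
            stats[parts[0]] = int(parts[2])
    return stats


def l2_hit_ratio(stats):
    """Fraction of L2ARC lookups that were hits, or None if there were none."""
    hits = stats.get("l2_hits", 0)
    misses = stats.get("l2_misses", 0)
    total = hits + misses
    return None if total == 0 else hits / total


# Report the ratio if the statistics file exists on this machine.
try:
    ratio = l2_hit_ratio(parse_arcstats(ARCSTATS.read_text()))
    print("L2ARC hit ratio:", "n/a" if ratio is None else f"{ratio:.1%}")
except OSError:
    print("arcstats not available on this system")
```

The ratio only tells how often reads that missed the ARC were served from the L2ARC device; it does not show what kind of data is cached, so it is a rough indicator at best.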