New hardware woes and frustrations (Fixed?)


Post by Ardouos » 2020-06-24 15:29

inxi here:
https://pastebin.com/9BGRGZ7A

I recently bought new hardware, as my ageing setup was due an upgrade. The old hardware was:
CPU - i5 2500k
RAM - 8 GB 1333 MHz - 2x 4 GB sticks
Mobo - P8P67 LE


The new hardware is:
CPU - AMD Ryzen 5 3600
RAM - 16 GB 2133 MHz - 2x 8 GB sticks
Mobo - ASUS Prime B450-PLUS

I also installed the amd64-microcode package to support the Ryzen CPU.

Everything else was the same.

The installation of the components went well and Debian booted right up, which was a good outcome.

The main issue was that games would freeze, crash or not even load. For example:

# Xonotic would freeze at random
# CSGO would crash after around 30 seconds to 1 minute
# Overload would crash after around 30 seconds to 1 minute

I tried later kernels and backported Nvidia drivers, which did not resolve anything. I even tried Debian unstable temporarily, to check whether older packages were the issue, as the hardware is reasonably new. That was to no avail: I still had the same problems. The graphics card had worked fine on my older hardware.

I then installed Ubuntu 20.04 just to see if the issue was Debian-related. Whilst the system was definitely more stable, some games still would not “run right”. For example, Overload still crashed while trying to load, as described above.

Another frustration was that when I ran the games from the terminal, there was no error message whatsoever to explain why they were freezing or crashing.

Frustrated after working on the issue for hours, I decided to go to bed, as it was getting pretty late.


The next day I was looking into troubleshooting the issue some more before I went to see the in-laws. I decided to start looking into system logs… specifically /var/log/syslog and dmesg.



The system logs no errors until I run the games; then I start getting these errors in both dmesg and syslog:

https://pastebin.com/DYnx7uGi
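
For anyone wanting to sift their own logs the same way, here is a rough Python sketch. It assumes the driver errors show up as "NVRM: Xid" lines, which is the usual shape of Nvidia driver messages in the kernel log. The function names are my own and the pattern may need adjusting for other driver versions, so treat it as a starting point rather than a finished tool.

```python
import re
from collections import Counter

# Assumed line shape (not copied from the paste above):
#   ... kernel: NVRM: Xid (PCI:0000:01:00): 31, Ch 00000010, ...
XID_RE = re.compile(r"NVRM: Xid \([^)]*\): (\d+)")


def count_xid_codes(lines):
    """Return a Counter mapping each Xid error code to how often it appears."""
    counts = Counter()
    for line in lines:
        match = XID_RE.search(line)
        if match:
            counts[int(match.group(1))] += 1
    return counts


def scan_log(path="/var/log/syslog"):
    """Count Xid codes in a log file; reading it may need root or adm rights."""
    with open(path, errors="replace") as handle:
        return count_xid_codes(handle)
```

Calling scan_log() and printing the sorted items gives a quick per-code tally, which can then be looked up in Nvidia's error code table.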

Based on that, you would think that it was a graphics card issue? I did.

I found a table of error codes on one of Nvidia’s websites. Here is a cut-down version showing the errors that were appearing for me:
https://imgur.com/a/ngMLpSh

After some research, I found that people who had the same issue were given the following explanations:

# Bad drivers
# Faulty hardware
# Unsupported kernel

And probably others which I cannot remember right now…

I remember digging some more and seeing a comment where someone mentioned an issue with dual-channel memory and suggested simply removing a module.

So I removed one of the 8 GB RAM modules from my PC, and everything now seems to be running as smooth as butter! I did some testing today, swapping the DIMMs around, and the issue does seem to follow one particular DIMM.

Another weird thing was that I ran a memory test on my machine and it found no faults with either DIMM.

I have requested a new kit and hopefully that will be the last of it…

In any case, can anyone decode what is going on with the errors in my previous pastebin? All of the errors point towards a driver error, as that cause is ticked for all three in the table… But might it have been memory corruption causing the games to fall over?
Last edited by Ardouos on 2020-06-24 19:17, edited 1 time in total.

Re: New hardware woes - RANT (Fixed?)

Post by LE_746F6D617A7A69 » 2020-06-24 16:02

First check your BIOS version - you need at least v1201 for this CPU.
Bill Gates: "(...) In my case, I went to the garbage cans at the Computer Science Center and I fished out listings of their operating system."
The_full_story and Nothing_have_changed

Re: New hardware woes - RANT (Fixed?)

Post by Ardouos » 2020-06-24 16:13

LE_746F6D617A7A69 wrote: First check your BIOS version - you need at least v1201 for this CPU.

BIOS version is 2008, the latest you can get currently.

Re: New hardware woes - RANT (Fixed?)

Post by LE_746F6D617A7A69 » 2020-06-24 17:43

This definitely looks like a hardware/firmware problem.

First things first: you're using an old PSU from the previous setup -> in such a situation you should measure the voltages (no, the HW monitor in the BIOS does not show the real voltages). Unstable voltage can cause astonishing HW failures.

Assuming that the PSU voltages are stable:

Are the RAM modules identical?
Have you enabled XMP RAM profiles in the BIOS? - if so, try disabling them.
If the system becomes stable with XMP=off, it means that the BIOS or the SPD config in the RAM modules is broken -> replacing the modules with the same model will not help.
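
One possible way to check whether the modules are identical from within Linux is to compare what each DIMM reports through DMI. The Python sketch below is only an illustration: it assumes the dmidecode tool is installed and run as root, that its output uses the usual "Memory Device" blocks with "Size", "Speed" and "Part Number" fields, and the helper names are made up for this example.

```python
import subprocess

FIELDS = ("Size", "Speed", "Manufacturer", "Part Number")


def parse_memory_devices(text):
    """Split `dmidecode --type memory` output into one dict per populated DIMM."""
    devices, current = [], None
    for raw in text.splitlines():
        line = raw.strip()
        if line == "Memory Device":
            current = {}
            devices.append(current)
        elif not line:
            current = None  # a blank line ends the current block
        elif current is not None and ":" in line:
            key, _, value = line.partition(":")
            if key in FIELDS:
                current[key] = value.strip()
    # Empty slots report "No Module Installed" as their size; skip them.
    return [d for d in devices
            if d and d.get("Size", "").lower() != "no module installed"]


def modules_match(devices):
    """True when every populated DIMM reports the same part number and speed."""
    return len({(d.get("Part Number"), d.get("Speed")) for d in devices}) <= 1


def read_dmidecode():
    """Run dmidecode (needs root) and return its memory-related output."""
    result = subprocess.run(["dmidecode", "--type", "memory"],
                            capture_output=True, text=True, check=True)
    return result.stdout
```

With root rights, modules_match(parse_memory_devices(read_dmidecode())) returning False would suggest a mixed kit. Matching part numbers do not rule out a faulty module, of course.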

It can also be an incompatibility between the BIOSes of the MB and the GFX card -> sometimes moving the card to a different PCI-E slot can "magically" fix such problems (IRQ routing)
It can also be an Nvidia driver problem - have you tried Nouveau? (btw. 750Ti works better with nouveau if you allow changing of pstates/re-clocking)

Re: New hardware woes - RANT (Fixed?)

Post by Ardouos » 2020-06-25 14:56

LE_746F6D617A7A69 wrote: This definitely looks like a hardware/firmware problem.

First things first: you're using an old PSU from the previous setup -> in such a situation you should measure the voltages (no, the HW monitor in the BIOS does not show the real voltages). Unstable voltage can cause astonishing HW failures.

Assuming that the PSU voltages are stable:


I will make a note of that. I don't own a multimeter or anything else to test voltages with, though. I might get a new power supply soon anyway, as I don't want to chance it knackering my hardware.

LE_746F6D617A7A69 wrote: Are the RAM modules identical?

Yes, they are.

LE_746F6D617A7A69 wrote: Have you enabled XMP RAM profiles in the BIOS? - if so, try disabling them.
If the system becomes stable with XMP=off, it means that the BIOS or the SPD config in the RAM modules is broken -> replacing the modules with the same model will not help.

I have not done any overclocking or tweaking of the BIOS / UEFI. I looked for XMP and AMP / X-AMP and there was no such option available. From looking online, I would have to manually enable D.O.C.P Standard on the motherboard to use it, and that is not enabled.

LE_746F6D617A7A69 wrote: It can also be an incompatibility between the BIOSes of the MB and the GFX card -> sometimes moving the card to a different PCI-E slot can "magically" fix such problems (IRQ routing)

Surprisingly, I did actually try that. Sadly, it did not help in any way.

LE_746F6D617A7A69 wrote: It can also be an Nvidia driver problem - have you tried Nouveau? (btw. 750Ti works better with nouveau if you allow changing of pstates/re-clocking)

I have not tried the nouveau driver yet; I could have given it a go.

Just an update: I got replacement RAM from Amazon for free, as I reported the original kit as a faulty part. I have just been testing and so far so good!

