mce: [Hardware Error]: CPU 0: Machine Check: 0 Bank 4:
-
- Posts: 932
- Joined: 2020-05-03 14:16
- Has thanked: 7 times
- Been thanked: 65 times
Re: mce: [Hardware Error]: CPU 0: Machine Check: 0 Bank 4:
Thermal throttling means insufficient cooling - this is a fact, not a guess.
Bill Gates: "(...) In my case, I went to the garbage cans at the Computer Science Center and I fished out listings of their operating system."
The_full_story and Nothing_have_changed
-
- Global Moderator
- Posts: 2638
- Joined: 2018-06-20 15:16
- Location: Colorado
- Has thanked: 41 times
- Been thanked: 192 times
Re: mce: [Hardware Error]: CPU 0: Machine Check: 0 Bank 4:
GeNe64 wrote: Any tips?
...get your money back.
First off, that isn't exactly server grade hardware.
Most server grade stuff would never throttle due to core temp. The socket temp would be the trigger in a proper setup.
Re: mce: [Hardware Error]: CPU 0: Machine Check: 0 Bank 4:
CwF wrote:
GeNe64 wrote: Any tips?
...get your money back.
First off, that isn't exactly server grade hardware.
Most server grade stuff would never throttle due to core temp. The socket temp would be the trigger in a proper setup.

+1
Re: mce: [Hardware Error]: CPU 0: Machine Check: 0 Bank 4:
Finally, I've solved the issue by adding intel_idle.max_cstate=1 to the kernel command line in /etc/default/grub:
Code:
GRUB_CMDLINE_LINUX_DEFAULT="consoleblank=0 intel_idle.max_cstate=1"
then regenerating the GRUB configuration:
Code:
# update-grub
and rebooting.
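After the reboot it may be worth confirming that the parameter actually reached the running kernel. The following is only a minimal sketch: `has_kernel_param` is a hypothetical helper (not part of GRUB or the kernel tooling), and the sysfs file is assumed to be present only when the intel_idle driver is in use.

```shell
#!/bin/sh
# Hypothetical helper: succeeds if the kernel command line ($2)
# contains the exact parameter ($1) as a whole word.
has_kernel_param() {
    case " $2 " in
        *" $1 "*) return 0 ;;
        *) return 1 ;;
    esac
}

# /proc/cmdline holds the command line of the running kernel.
cmdline=$(cat /proc/cmdline 2>/dev/null || true)

if has_kernel_param "intel_idle.max_cstate=1" "$cmdline"; then
    echo "intel_idle.max_cstate=1 is on the running kernel command line"
else
    echo "intel_idle.max_cstate=1 is NOT on the running kernel command line"
fi

# Assumption: the driver exposes its current limit here when it is loaded.
param_file=/sys/module/intel_idle/parameters/max_cstate
if [ -r "$param_file" ]; then
    echo "intel_idle max_cstate currently: $(cat "$param_file")"
fi
```

If the first check fails, the usual suspects are a skipped update-grub run or an edit made to a different GRUB variable than the one the boot entry uses.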
Re: mce: [Hardware Error]: CPU 0: Machine Check: 0 Bank 4:
GeNe64 wrote: Finally, I've solved the issue by adding intel_idle.max_cstate=1
This makes no sense at all.
In your previous post you said that the connection was lost after a series of warnings saying that the critical temperature had been reached. In practice this means that the CPU had nearly melted down (max Tjunction is 100 °C, and the threshold is 95 °C).
The kernel parameter intel_idle.max_cstate=1 limits the CPU to the C1 idle state, which disables the deeper power-saving states - how could it help in this situation?
Yes, there was a problem with hard lockups on Bay Trail CPUs, where this option was used as a workaround - but this is not the case here.
I think that some other factors could have come into play here - e.g. someone has "fixed" a problem with the air conditioning system by opening all the doors and windows in that not-so-cold room
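To tell whether the machine is really running near those limits, the temperatures can be read from the standard Linux thermal sysfs interface. This is just a rough sketch: `classify_temp` is a hypothetical helper, the 95 °C and 100 °C defaults are simply the figures quoted in this post for this particular CPU, and which thermal zones exist depends on the hardware.

```shell
#!/bin/sh
# Hypothetical helper: classify a temperature given in millidegrees Celsius.
# Defaults (95 / 100) are the threshold and Tjunction max quoted in this thread.
classify_temp() {
    temp_c=$(( $1 / 1000 ))
    threshold_c=${2:-95}
    tjmax_c=${3:-100}
    if [ "$temp_c" -ge "$tjmax_c" ]; then
        echo "at-or-above-tjmax"
    elif [ "$temp_c" -ge "$threshold_c" ]; then
        echo "critical"
    else
        echo "ok"
    fi
}

# Thermal zones report their temperature in millidegrees Celsius.
for zone in /sys/class/thermal/thermal_zone*; do
    [ -r "$zone/temp" ] || continue
    raw=$(cat "$zone/temp" 2>/dev/null) || continue
    # Skip empty or non-numeric readings.
    case "$raw" in
        ''|*[!0-9]*) continue ;;
    esac
    zone_type=$(cat "$zone/type" 2>/dev/null || echo unknown)
    echo "$zone_type: $(( raw / 1000 )) C -> $(classify_temp "$raw")"
done
```

Running this in a loop while the server is under load would show whether the critical-temperature warnings line up with the crashes or are a separate problem.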
Re: mce: [Hardware Error]: CPU 0: Machine Check: 0 Bank 4:
LE_746F6D617A7A69 wrote:
This makes no sense at all.
In your previous post you said that the connection was lost after a series of warnings saying that the critical temperature had been reached. In practice this means that the CPU had nearly melted down (max Tjunction is 100 °C, and the threshold is 95 °C).
The kernel parameter intel_idle.max_cstate=1 limits the CPU to the C1 idle state, which disables the deeper power-saving states - how could it help in this situation?
Yes, there was a problem with hard lockups on Bay Trail CPUs, where this option was used as a workaround - but this is not the case here.
I think that some other factors could have come into play here - e.g. someone has "fixed" a problem with the air conditioning system by opening all the doors and windows in that not-so-cold room

Yep, that's the problem. The bug is very strange and is described here: https://forum.proxmox.com/threads/rando ... 597/page-3
Nothing useful shows up in the logs, yet the server crashes all the time.
I tried to link the odd messages in the logs (temp, mce, etc.) to the crashes, but couldn't work it out that way.
It's a bug of Intel CPUs that can be fixed by adding intel_idle.max_cstate=1
Re: mce: [Hardware Error]: CPU 0: Machine Check: 0 Bank 4:
GeNe64 wrote: It's a bug of Intel CPUs that can be fixed by adding intel_idle.max_cstate=1
So it happened again?!
Anyway, it's good to know...