Random system freezing

Linux Kernel, Network, and Services configuration.
undesign
Posts: 108
Joined: 2015-05-27 09:03
Has thanked: 8 times
Been thanked: 8 times

#1 Post by undesign »

I have a Dell Studio 1749 laptop with the following configuration:
System: Host: dell Kernel: 4.9.0-6-amd64 x86_64 (64 bit gcc: 6.3.0) Desktop: LXDE (Openbox 3.6.1)
Distro: Debian GNU/Linux 9 (stretch)
Machine: Device: portable System: Dell product: Studio 1749 serial: 70J0SM1
BIOS: Dell v: A08 date: 03/24/2011
Battery BAT1: charge: 29.8 Wh 99.9% condition: 29.8/57.7 Wh (52%) model: Simplo DELL U150P05L status: Full
CPU: Dual core Intel Core i7 M 620 (-HT-MCP-) cache: 4096 KB
flags: (lm nx sse sse2 sse3 sse4_1 sse4_2 ssse3 vmx) bmips: 10639
clock speeds: max: 2667 MHz 1: 1333 MHz 2: 1199 MHz 3: 1333 MHz 4: 1199 MHz
Graphics: Card: Advanced Micro Devices [AMD/ATI] Madison [Mobility Radeon HD 5650/5750 / 6530M/6550M]
bus-ID: 02:00.0
Display Server: X.org 1.19.2 driver: N/A tty size: 266x65 Advanced Data: N/A for root
Network: Card-1: Intel Centrino Advanced-N 6205 [Taylor Peak] driver: iwlwifi bus-ID: 08:00.0
IF: wlp8s0 state: down mac: d2:59:7f:4d:1c:20
Card-2: Realtek RTL8111/8168/8411 PCI Express Gigabit Ethernet Controller
driver: r8169 v: 2.3LK-NAPI port: 5000 bus-ID: 20:00.0
IF: enp32s0 state: up speed: 1000 Mbps duplex: full mac: 5c:26:0a:06:53:72
Drives: HDD Total Size: 2512.5GB (37.7% used)
ID-1: model: ST2000LM003_HN
ID-2: model: Samsung_SSD_850
Info: Processes: 212 Uptime: 59 min Memory: 1344.8/7848.6MB Init: systemd runlevel: 5 Gcc sys: 6.3.0
Client: Shell (bash 4.4.121) inxi: 2.3.5
The laptop freezes at random, with no apparent trigger such as CPU, I/O or memory load. It can run perfectly for 3 days or freeze 3 times in 10 minutes.
When it freezes, it doesn't respond to anything, including Alt+SysRq+B (to reboot). The only way out is to power it off with the power button.
Absolutely nothing is written to the logs (/var/log/messages and dmesg).
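
For reference, a freeze that ignores Alt+SysRq+B is only meaningful if the reboot function of the magic SysRq key is enabled in the first place. A minimal sketch for checking that, assuming the bitmask semantics described in the kernel's SysRq documentation (1 enables every function, bit 128 allows reboot/poweroff). The function name sysrq_allows_reboot is made up for this example:

```shell
# Succeeds if the given (or current) kernel.sysrq value permits SysRq-B.
# Usage: sysrq_allows_reboot [MASK]
sysrq_allows_reboot() {
    # Read the live value only when no mask was passed in.
    mask=${1:-$(cat /proc/sys/kernel/sysrq)}
    [ "$mask" -eq 1 ] && return 0      # 1 means "all functions enabled"
    [ $(( mask & 128 )) -ne 0 ]        # bit 128: reboot/poweroff allowed
}
```

Example: `sysrq_allows_reboot && echo "SysRq-B is enabled"`. If it is enabled and the key combination still does nothing, the kernel itself is probably no longer servicing keyboard input.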

It doesn't look like a heat issue, if the sensors are to be believed, and the laptop doesn't get hot anyway:
radeon-pci-0200
Adapter: PCI adapter
temp1: +48.0°C (crit = +120.0°C, hyst = +90.0°C)

coretemp-isa-0000
Adapter: ISA adapter
Core 0: +46.0°C (high = +95.0°C, crit = +105.0°C)
Core 2: +48.0°C (high = +95.0°C, crit = +105.0°C)

dell_smm-virtual-0
Adapter: Virtual device
Processor Fan: 0 RPM
CPU: +47.0°C
GPU: +59.0°C
The last temperature could be wrong: it shows that value right from boot, and it is hard to believe that the GPU reaches 59 degrees Celsius within a few seconds of power-on, especially with a powerful cooler underneath the laptop.
Also, the radeon-pci-0200 sensor reports 48 degrees, which is probably closer to reality.
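
To keep an eye on which sensor reports the highest value over time, something like the following sketch could be used. It assumes the `label: +NN.N°C` layout shown in the sensors output above and only looks at the first temperature on each line, so the high/crit limits in parentheses are ignored. The name max_temp is made up for this example:

```shell
# Print the label and value of the hottest reading found on stdin.
# Usage: sensors | max_temp
max_temp() {
    awk -F': *' '
        # Only lines whose value starts with a signed number are readings;
        # "Adapter: ..." and "Processor Fan: 0 RPM" are skipped.
        $2 ~ /^[+-][0-9]+(\.[0-9]+)?/ {
            v = $2
            sub(/^\+/, "", v)      # drop the leading plus sign
            t = v + 0              # numeric prefix, e.g. "48.0°C (crit..." -> 48
            if (!seen || t > max) { max = t; label = $1; seen = 1 }
        }
        END { if (seen) printf "%s %.1f\n", label, max }
    '
}
```

Fed the output above, it prints `GPU 59.0`, i.e. the dell_smm reading that looks suspicious.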

The CPU settings are:
# cpupower frequency-info
analyzing CPU 0:
driver: acpi-cpufreq
CPUs which run at the same hardware frequency: 0
CPUs which need to have their frequency coordinated by software: 0
maximum transition latency: 10.0 us
hardware limits: 1.20 GHz - 2.67 GHz
available frequency steps: 2.67 GHz, 2.67 GHz, 2.53 GHz, 2.40 GHz, 2.27 GHz, 2.13 GHz, 2.00 GHz, 1.87 GHz, 1.73 GHz, 1.60 GHz, 1.47 GHz, 1.33 GHz, 1.20 GHz
available cpufreq governors: ondemand performance schedutil
current policy: frequency should be within 1.20 GHz and 2.67 GHz.
The governor "ondemand" may decide which speed to use
within this range.
current CPU frequency: 1.20 GHz (asserted by call to hardware)
boost state support:
Supported: no
Active: no
3067 MHz max turbo 2 active cores
3333 MHz max turbo 1 active cores
After some investigation, the solution was to add intel_idle.max_cstate=1 to the kernel command line in /etc/default/grub:
GRUB_CMDLINE_LINUX_DEFAULT="quiet intel_idle.max_cstate=1"
Apparently there is an issue with first-generation i7 CPUs where the kernel is not able to change the CPU frequency properly.
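
Spelled out as commands, the change could look like the sketch below. The helper add_kernel_param is hypothetical (it is not part of GRUB). It assumes a single double-quoted GRUB_CMDLINE_LINUX_DEFAULT line, as in Debian's stock file, and a parameter that contains no `|`, `&` or backslash characters:

```shell
# Append PARAM to GRUB_CMDLINE_LINUX_DEFAULT unless it is already listed.
# Usage: add_kernel_param PARAM [FILE]
add_kernel_param() {
    param=$1
    file=${2:-/etc/default/grub}
    # Current contents between the double quotes (empty if the line is missing).
    current=$(sed -n 's/^GRUB_CMDLINE_LINUX_DEFAULT="\(.*\)"$/\1/p' "$file")
    case " $current " in
        *" $param "*) return 0 ;;      # already present, nothing to do
    esac
    new=$(printf '%s %s' "$current" "$param" | sed 's/^ *//')
    tmp=$(mktemp) || return 1
    # Rewrite via a temporary file, then copy back to keep owner and mode.
    sed "s|^GRUB_CMDLINE_LINUX_DEFAULT=\".*\"\$|GRUB_CMDLINE_LINUX_DEFAULT=\"$new\"|" "$file" > "$tmp" \
        && cat "$tmp" > "$file"
    rm -f "$tmp"
}
```

As root: `add_kernel_param intel_idle.max_cstate=1`, then `update-grub` and reboot. Afterwards `cat /proc/cmdline` should list the parameter, and `cat /sys/module/intel_idle/parameters/max_cstate` should print 1 if the intel_idle driver is in use.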

It has now been running in this configuration for one week. I ran some stress tests and the laptop performed flawlessly, no matter the load. Before that, I also ran memtest86+ and the internal hardware tests, and no hardware issues were detected. There was no such problem with Debian 8.
But now the suspend and hibernate functions do not work anymore. The computer does suspend (to RAM or to disk), but it is not able to resume: I only see a blank screen with no mouse cursor. I have to press Ctrl-Alt-F2, log in on the console and reboot from there.

Any ideas?


Re: Random system freezing

#2 Post by undesign »

I found the cause: it is fancontrol & pwmconfig, when they set the CPU fan to run at full speed from the beginning (or whenever the CPU fan reaches maximum speed).
If I configure the CPU fan to start at half speed (on Dell there are 3 options: 0, half speed, full speed), then it is fine, unless the fan eventually reaches full speed, which is normally rare.

Question: suspend/hibernate no longer work because the CPU C-states are restricted (intel_idle.max_cstate=1), right?
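
For context: intel_idle.max_cstate limits the processor idle states (C-states), which in ACPI terms are separate from the system sleep states used by suspend (S3) and hibernate (S4), so it may be worth testing resume with and without the parameter rather than assuming a link. A small sketch for checking which idle states the kernel currently exposes, assuming the standard cpuidle sysfs layout. The name show_cstates is made up for this example:

```shell
# List the idle states the kernel exposes for one CPU.
# Usage: show_cstates [CPUIDLE_DIR]
show_cstates() {
    base=${1:-/sys/devices/system/cpu/cpu0/cpuidle}
    for d in "$base"/state*; do
        # Skip the unexpanded glob if there are no state directories.
        [ -r "$d/name" ] || continue
        printf '%s %s\n' "${d##*/}" "$(cat "$d/name")"
    done
}
```

Running `show_cstates` once with the parameter and once without should show whether the deeper states really disappear. `cat /sys/devices/system/cpu/cpuidle/current_driver` tells which idle driver is active.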
