[Software] general advice on debugging system hangs

rayandrews
Posts: 115
Joined: 2014-01-31 21:32
Has thanked: 5 times
Been thanked: 1 time

#1 Post by rayandrews »

Since upgrading to D11 my system occasionally hangs. I'd been running D9 before that and it was bomb-proof reliable. Usually the system just freezes solid, though sometimes the mouse pointer will continue to move but be unable to do anything else -- clicks do nothing. Just sometimes I can Ctrl+Alt+F1 to a console, then kill Xorg and restart my desktop just fine, but not usually. These hangs might happen once every six or eight hours of on time. I'd been 'hoping' for it to get worse so that it might become more obvious what the problem is -- and usually a hardware issue will get worse over time of course. But it's not getting worse and I still have no idea if this is a software or hardware issue or what.

So, I'm wondering what might be possible as far as diagnosing this. Is there some sort of 'hang log' or something? Some recommended hardware tests? As I said, it's infrequent enough to be hard to track down, but frequent enough to be a nuisance. And of course I expect Debian Stable to be just that -- rock solid. Where to start?

CwF
Global Moderator
Posts: 2709
Joined: 2018-06-20 15:16
Location: Colorado
Has thanked: 41 times
Been thanked: 201 times

Re: [Software] general advice on debugging system hangs

#2 Post by CwF »

Sometimes it comes down to a guess, a hunch based on working with the hardware and being familiar with it. Even on stable, for example, rarely encountered GPU issues can crash the X server with no logs at all; it simply disappears. I've caught logs of GPU resets in a VM that survive, where the bare-metal test leaves no trace. So far amdgpu is susceptible to this issue in a way I cannot isolate, happening a few times a year on machines running 24/7. Radeon seems very stable. Nouveau is in between: the worst case is a locked desktop while the machine is still alive and the desktop restartable.

My worst mind-boggling scenario is using X2X to said amdgpu server and catching a GPU lock, then watching from the workstation while things terminate or evaporate, often accompanied by a flurry of net traffic (worrisome). Then, if my workstation mouse is on the guest desktop when the last flash occurs, it STEALS my mouse and keyboard focus until the crashed machine reboots... very weird.

Anyway, if it's not dust-bunny heat or failing hardware, I always suspect a rare GPU-related bug of unknown origin. If Debian 9 is demonstrably more stable, I'll double down on GPU!

kent_dorfman766
Posts: 540
Joined: 2022-12-16 06:34
Location: socialist states of america
Has thanked: 59 times
Been thanked: 70 times

Re: [Software] general advice on debugging system hangs

#3 Post by kent_dorfman766 »

https://en.wikipedia.org/wiki/Magic_SysRq_key

The above key combination can be a valuable debugging aid, but its use and subsequent diagnosis is quite advanced.
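
For reference, whether SysRq is usable at all depends on the kernel.sysrq setting. The sketch below decodes that value using the bitmask documented in the kernel's SysRq guide. The function name sysrq_decode and the short labels are made up for illustration, and the value 438 in the usage comment is what Debian kernels are generally reported to ship with, so check your own machine rather than trusting it.

```shell
# Decode a kernel.sysrq value into the SysRq functions it permits.
# With no argument, the current value is read from /proc/sys/kernel/sysrq.
# sysrq_decode and the labels below are illustrative names, not a standard tool.
sysrq_decode() {
    mask=${1:-$(cat /proc/sys/kernel/sysrq)}
    case "$mask" in
        0) echo "disabled"; return 0 ;;
        1) echo "all functions enabled"; return 0 ;;
    esac
    # Any other value is a bitmask; print the label of every bit that is set.
    while read -r bit name; do
        if [ $(( mask & bit )) -ne 0 ]; then
            echo "$name"
        fi
    done <<'EOF'
2 console-loglevel
4 keyboard-control
8 debugging-dumps
16 sync
32 remount-read-only
64 signal-processes
128 reboot-poweroff
256 nice-rt-tasks
EOF
    return 0
}

# Usage:
#   sysrq_decode        # decode the running kernel's setting
#   sysrq_decode 438    # decode a specific value (assumed Debian default)
```

If sync, remount and reboot are allowed, then on a frozen desktop Alt+SysRq+S, then U, then B gives a cleaner reboot than the power button, and whether the keys respond at all hints at whether the kernel is still alive.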

It is my considered opinion that the majority of these problems are related to X11 and all that entails: GPU drivers and convoluted desktop services. It is rare that a "text mode server" suffers these problems. There is also a whole slew of kernel build options that can help diagnose or mitigate poorly written drivers, but it is hit or miss whether Debian supplies a fully "debug enabled" kernel and I'm too lazy to look.

rayandrews
Posts: 115
Joined: 2014-01-31 21:32
Has thanked: 5 times
Been thanked: 1 time

Re: [Software] general advice on debugging system hangs

#4 Post by rayandrews »

Thanks gentlemen. Too bad, I boast about Debian stability. Sounds like this won't be easy to figure out.

sunrat
Administrator
Posts: 6494
Joined: 2006-08-29 09:12
Location: Melbourne, Australia
Has thanked: 118 times
Been thanked: 476 times

Re: [Software] general advice on debugging system hangs

#5 Post by sunrat »

Did you check for error messages with journalctl?
“ computer users can be divided into 2 categories:
Those who have lost data
...and those who have not lost data YET ”
Remember to BACKUP!

rayandrews
Posts: 115
Joined: 2014-01-31 21:32
Has thanked: 5 times
Been thanked: 1 time

Re: [Software] general advice on debugging system hangs

#6 Post by rayandrews »

Now there's a start. I keep forgetting that there's that systemd log. Any options to add to that, or just 'journalctl' virgin? So there's the chance that it will contain info on whatever caused a freeze?

BTW I agree with the answers above that the culprit is likely to be something to do with Xorg. Sometimes before hanging the video goes sorta crazy: screens start to look like -- what do you call those pixelated squares that look like a game of 'Conway's Life' that you point your smartphone at to order a pizza? -- looks like that. But if I'm fast enough, Ctrl+Alt+F1 goes to console and then I kill Xorg, restart (Xfce in my case) and it's all fine -- so that seems not to be a hardware issue or it would still be screwed up. What to do?

kent_dorfman766
Posts: 540
Joined: 2022-12-16 06:34
Location: socialist states of america
Has thanked: 59 times
Been thanked: 70 times

Re: [Software] general advice on debugging system hangs

#7 Post by kent_dorfman766 »

"sometimes before hanging the video goes sorta crazy"
This is often caused by rogue pointers corrupting either the video timing algorithms or the screen memory backing. Certain nvidia drivers are known to exhibit this problem. I could, with startling regularity, demonstrate it on my old Gateway laptop using the nvidia-340 drivers.

At the risk of starting a flame war, the Linux kernel design itself doesn't help matters. In places, it's just... ugly, and by that I mean it doesn't handle prioritized interrupt dispatching well, imho. It doesn't matter how busy a system is, mouse and keyboard events should be dispatched in a timely manner. That's not the case in Linux though.

rayandrews
Posts: 115
Joined: 2014-01-31 21:32
Has thanked: 5 times
Been thanked: 1 time

Re: [Software] general advice on debugging system hangs

#8 Post by rayandrews »

"Certain nvidia drivers are known to exhibit this problem."

Interesting! So my crazy screen is a known issue then. Very vaguely, I might dare to say that whereas I'd always used the nouveau driver, at one point I decided to 'upgrade' to the recommended nvidia driver, found it no improvement, *and* it used up half a bleeding gig of disk space! So I uninstalled it and went back to nouveau ... I think. But it could be that the trouble started then. Dunno, maybe it left something busted. And I'm pretty sure the hangs always happen when I'm using the mouse.
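
One way to settle the "I think" is to look at which GPU kernel modules are actually loaded. A minimal sketch follows; the function name loaded_gpu_modules is hypothetical, and the list of driver names it matches is an assumption covering the common cases rather than every GPU driver.

```shell
# Print GPU-related kernel modules that are currently loaded.
# Reads /proc/modules by default; another file can be passed as $1 for testing.
# The driver-name list is an assumption covering common cases only.
loaded_gpu_modules() {
    awk '$1 ~ /^(nouveau|nvidia|amdgpu|radeon|i915)/ { print $1 }' "${1:-/proc/modules}"
}

# Usage:
#   loaded_gpu_modules
#   lspci -k | grep -i -A 3 vga    # also reports "Kernel driver in use"
```

If both nouveau and nvidia modules show up, or neither does, that points at a half-finished driver switch.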

sunrat
Administrator
Posts: 6494
Joined: 2006-08-29 09:12
Location: Melbourne, Australia
Has thanked: 118 times
Been thanked: 476 times

Re: [Software] general advice on debugging system hangs

#9 Post by sunrat »

rayandrews wrote: 2023-01-17 16:42 Any options to add to that, or just 'journalctl' virgin? So there's the chance that it will contain info on whatever caused a freeze?

Code:

journalctl -p 3 -b -1
This will show errors from the previous boot: -p filters by priority, and -b selects the boot, where 0 is the current one and -1 the one before it. Change to -p 4 for warnings as well, or just omit the priority to see all journal messages.
man journalctl for more options.
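
A few variations on that command tend to be useful after a hang, sketched here as small shell functions. The function names are invented for this example; the journalctl options themselves (-b, -p, -k, -n, --no-pager, --list-boots) are standard. Looking at an earlier boot only works if the journal is stored persistently, i.e. /var/log/journal exists.

```shell
# Errors and worse from one boot.
# $1 = boot offset (default -1, the previous boot), $2 = priority (default 3, err).
boot_errors() {
    journalctl -b "${1:--1}" -p "${2:-3}" --no-pager
}

# Kernel messages only from that boot (where driver complaints would show up).
boot_kernel_log() {
    journalctl -k -b "${1:--1}" --no-pager
}

# The last lines written before the machine went down.
# $1 = boot offset (default -1), $2 = number of lines (default 50).
boot_tail() {
    journalctl -b "${1:--1}" -n "${2:-50}" --no-pager
}

# Usage:
#   journalctl --list-boots    # see which boots the journal still holds
#   boot_errors                # errors from the previous boot
#   boot_errors -2 4           # warnings and worse from two boots ago
#   boot_tail -1 100           # final 100 lines of the previous boot
```

A hard freeze often leaves nothing at all, so an empty result from boot_errors does not rule out a driver problem; the tail of the log at least shows what was running just before.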
"So I uninstalled it and went back to nouveau ... I think. But it could be that the trouble started then. Dunno, maybe it left something busted."
Check for remnants with

Code:

apt list --installed '*nvidia*'
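
Another way to look for remnants, sketched below, is to ask dpkg directly, which also reveals packages in the "rc" state (removed, but configuration files left behind). The helper names filter_nvidia and nvidia_remnants are hypothetical; dpkg-query -W -f and the ${db:Status-Abbrev} field are standard dpkg.

```shell
# Keep only "status package" lines whose package name mentions nvidia.
filter_nvidia() {
    awk '$2 ~ /nvidia/ { print $1, $2 }'
}

# List every nvidia-related package dpkg knows about, with its status.
# Status abbreviations: "ii" = installed, "rc" = removed but config files remain.
nvidia_remnants() {
    dpkg-query -W -f='${db:Status-Abbrev} ${Package}\n' | filter_nvidia
}

# Usage:
#   nvidia_remnants
```

Anything still listed can be removed with apt purge, but read the list of packages apt proposes before confirming.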

kent_dorfman766
Posts: 540
Joined: 2022-12-16 06:34
Location: socialist states of america
Has thanked: 59 times
Been thanked: 70 times

Re: [Software] general advice on debugging system hangs

#10 Post by kent_dorfman766 »

rayandrews wrote: 2023-01-17 20:48 So I uninstalled it and went back to nouveau
Nouveau is its own problem... I mean, reverse-engineered access to a proprietary GPU.
[sarcasm]what could possibly go wrong?[/sarcasm]

I've never cared for it and given the choice between nouveau and the old VESA driver I would more readily trust the old VESA driver, which only used standard video modes/timings and did direct screen memory updates to a planar display memory window...very slow (but reliable)

kent_dorfman766
Posts: 540
Joined: 2022-12-16 06:34
Location: socialist states of america
Has thanked: 59 times
Been thanked: 70 times

Re: [Software] general advice on debugging system hangs

#11 Post by kent_dorfman766 »

rayandrews wrote: 2023-01-17 20:48 And I'm pretty sure the hangs always happen when I'm using the mouse.
I experienced this exact problem for several years when I was still using Fedora. Upon a fresh power-on bootup I could count on mouse/screen problems within about 5 minutes of bootup. Usually, after a crash and reboot the system could stay running for a month without issue. It was so regularly periodic that I suspected a scheduled event was causing it, but I never found the cause. I only know that by switching to a later model nvidia GPU and subsequently a newer line of driver, the problem went away.

rayandrews
Posts: 115
Joined: 2014-01-31 21:32
Has thanked: 5 times
Been thanked: 1 time

Re: [Software] general advice on debugging system hangs

#12 Post by rayandrews »

"I only know that by switching to a later model nvidia GPU and subsequently a newer line of driver, the problem went away."

You guys are telling me more than I want to know about the defects in stuff that regular users aren't supposta know about ;-) Oh, well, maybe I'll just be trying to outlive the problem.

BTW:

apt list -i *nvidia*

... shows hundreds of lines of stuff, mostly '[installed,automatic]'. Do I need to purge that stuff? Sheesh, I'd rather reinstall, it's a big list.

kent_dorfman766
Posts: 540
Joined: 2022-12-16 06:34
Location: socialist states of america
Has thanked: 59 times
Been thanked: 70 times

Re: [Software] general advice on debugging system hangs

#13 Post by kent_dorfman766 »

rayandrews wrote: 2023-01-19 04:08 You guys are telling me more than I want to know about the defects in stuff that regular users aren't supposta know about ;-)
I would say that's a truism for Linux in general. I don't consider the platform to be for "regular users", at least not to my level of expectations of others. But then, I'm the kind of guy that will reverse engineer the electrical connections to the ECU (engine control unit) on my motorcycle so that I can intercept pesky signals that override what I think is an appropriate fuel/air mixture. YMMV (literally!)

rayandrews
Posts: 115
Joined: 2014-01-31 21:32
Has thanked: 5 times
Been thanked: 1 time

Re: [Software] general advice on debugging system hangs

#14 Post by rayandrews »

"I don't consider the platform to be for "regular users", at least not to my level of expectations of others."

Me, loathing Windows as I do, the move to Linux was like a personal Shawshank Redemption and, as I said, D9 was bomb-proof. Still, no doubt the learning curve is steep. BTW had another hang an hour ago, but this time all three of my screens started flashing back and forth, all white, then all normal. About four cycles, then hang.

CwF
Global Moderator
Posts: 2709
Joined: 2018-06-20 15:16
Location: Colorado
Has thanked: 41 times
Been thanked: 201 times

Re: [Software] general advice on debugging system hangs

#15 Post by CwF »

kent_dorfman766 wrote: 2023-01-19 09:23 I can intercept pesky signals that override what I think is an appropriate fuel/air mixture. YMMV (literally!)
So you spliced in a rheostat into the coolant temp to fatten things up? Bad boy!

Overall I have found Debian very stable. All OSes have, at minimum, minor issues.

The XPs I use to remap and flash those ECUs, TCMs, BCMs, etc. are also very stable on the right hardware.

kent_dorfman766
Posts: 540
Joined: 2022-12-16 06:34
Location: socialist states of america
Has thanked: 59 times
Been thanked: 70 times

Re: [Software] general advice on debugging system hangs

#16 Post by kent_dorfman766 »

"So you spliced in a rheostat into the coolant temp to fatten things up? Bad boy!"

I was more interested in O2 sensor output...but then I replaced the ECU with the racing model and they had it mapped correctly, so no need.
