2020-07-01 02:50:51

by 孙世龙 sunshilong

Subject: How do you investigate the cause of total hang? Is there some information that I should pay attention to in order to get some hint?

Hi, list

My x86 machine (Linux 4.19) sometimes hangs: it suddenly stops responding
to the mouse and the keyboard entirely.

How can I investigate why it hung? Is there any extra information I can
look at for a clue? Is there anything less drastic than powering off that
would still get some kind of reaction, even just a limited shell or a few
beeps, and might give a hint?

Thank you for your attention to this matter.


2020-07-01 06:09:34

by Cong Wang

Subject: Re: How do you investigate the cause of total hang? Is there some information that I should pay attention to in order to get some hint?

On Tue, Jun 30, 2020 at 7:49 PM 孙世龙 sunshilong <[email protected]> wrote:
>
> Hi, list
>
> My x86 machine(linux4.19) sometimes hangs, suddenly not responding in
> any way to the mouse or the keyboard.
>
> How can I investigate why it hung up? Is there extra information I can
> find for a clue? Is there anything less drastic than power-off to get
> some kind of action, if only some limited shell or just beeps,
> but might give a clue?
>

If the hang was a crash and you didn't get a chance to capture the last
kernel log, you can use kdump to collect it. The kernel log tells you what
kind of crash it was: a NULL pointer dereference, a kernel page fault, etc.
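
As a rough illustration of the kdump prerequisite, the sketch below checks
whether memory has been reserved for a crash kernel and whether one is
currently loaded. It assumes the usual interfaces, /proc/cmdline and
/sys/kernel/kexec_crash_loaded; the helper names (crashkernel_param,
read_text, kdump_status) are made up for this example and are not part of
any kdump tool.

```python
import re
from pathlib import Path


def crashkernel_param(cmdline):
    """Return the value of crashkernel= from a kernel command line, or None."""
    match = re.search(r"(?:^|\s)crashkernel=(\S+)", cmdline)
    return match.group(1) if match else None


def read_text(path):
    """Read a small procfs/sysfs file; return None if it cannot be read."""
    try:
        return Path(path).read_text().strip()
    except OSError:
        return None


def kdump_status():
    """Summarize whether kdump looks ready on the running system."""
    cmdline = read_text("/proc/cmdline") or ""
    return {
        # Memory reservation requested at boot, e.g. "256M"; None if absent.
        "crashkernel": crashkernel_param(cmdline),
        # True only if a crash kernel has been loaded via kexec.
        "crash_kernel_loaded": read_text("/sys/kernel/kexec_crash_loaded") == "1",
    }


if __name__ == "__main__":
    print(kdump_status())
```

If "crashkernel" is None or "crash_kernel_loaded" is False, a panic would
most likely not produce a dump, so that is worth fixing before waiting for
the next hang.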

If the hang was a hard lockup, you have to turn on the lockup detector,
and also kdump, to capture what the detector reports.
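
A minimal sketch of how the detector settings could be checked, assuming
the standard sysctl files under /proc/sys/kernel (nmi_watchdog,
watchdog_thresh, hardlockup_panic, softlockup_panic) are present on your
kernel build. The function names and the advice strings are illustrative
only. The idea is that kdump runs on a panic, so the detector has to be
configured to panic rather than just print a warning.

```python
from pathlib import Path

SYSCTL_DIR = Path("/proc/sys/kernel")
KNOBS = ("nmi_watchdog", "watchdog_thresh", "hardlockup_panic", "softlockup_panic")


def read_knobs(base=SYSCTL_DIR):
    """Read the lockup-detector sysctls; a missing or unreadable knob maps to None."""
    values = {}
    for name in KNOBS:
        try:
            values[name] = int((base / name).read_text().strip())
        except (OSError, ValueError):
            values[name] = None
    return values


def lockup_advice(knobs):
    """List the sysctl settings that are not yet set to 1."""
    advice = []
    # The NMI watchdog is what detects hard lockups on x86.
    if knobs.get("nmi_watchdog") != 1:
        advice.append("kernel.nmi_watchdog=1")
    # Panic on detection so that kdump gets a chance to run.
    if knobs.get("hardlockup_panic") != 1:
        advice.append("kernel.hardlockup_panic=1")
    if knobs.get("softlockup_panic") != 1:
        advice.append("kernel.softlockup_panic=1")
    return advice


if __name__ == "__main__":
    current = read_knobs()
    print("current:", current)
    print("consider setting:", lockup_advice(current))
```

The suggested values can be applied with sysctl or via /etc/sysctl.conf;
whether panicking on a soft lockup is acceptable depends on the machine.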

Thanks.

2020-07-03 06:10:32

by 孙世龙 sunshilong

Subject: Re: How do you investigate the cause of total hang? Is there some information that I should pay attention to in order to get some hint?

Hi, Cong Wang

Thank you for taking the time to respond to me.
Do you think the message (i.e. "RCU detect a stall on CPU 2") indicates
that there is a lockup?

Cong Wang <[email protected]> wrote on Wed, Jul 1, 2020 at 2:07 PM:


>
> On Tue, Jun 30, 2020 at 7:49 PM 孙世龙 sunshilong <[email protected]> wrote:
> >
> > Hi, list
> >
> > My x86 machine(linux4.19) sometimes hangs, suddenly not responding in
> > any way to the mouse or the keyboard.
> >
> > How can I investigate why it hung up? Is there extra information I can
> > find for a clue? Is there anything less drastic than power-off to get
> > some kind of action, if only some limited shell or just beeps,
> > but might give a clue?
> >
>
> If the hang was a crash which you didn't get a chance to capture the
> last kernel log, you can use kdump to collect them. The kernel log
> tells what kind of crash it is, a NULL pointer deref, a kernel page fault
> etc..
>
> If the hang was a hard lockup, you have to turn on lockup detector
> and also kdump to capture what the detector tells.
>
> Thanks.