2018-07-31 21:29:53

by Vineet Gupta

[permalink] [raw]
Subject: ARC show_regs() triggers preempt debug splat, lockdep

Hi Peter, Al,

Reaching out about a problem I understand, but not quite sure how to fix it.
Its the weird feeling of how was this working all along, if at all.

With print-fatal-signals enabled, there's CONFIG_DEBUG_PREEMPT splat all over,
even with a simple single threaded segv inducing program (console log below). This
originally came to light with a glibc test suite tst-tls3-malloc which is a
multi-threaded monster.

ARC show_regs() is a bit more fancy as it tries to print the executable path,
faulting vma name (in case it was a shared lib etc). This involves taking a bunch
of customary locks which seems to be tripping the debug infra.

The preemption disabling around show_regs() in core signal handling seem to have
been introduced back in 2009 by 3a9f84d354ce1 ("signals, debug: fix BUG: using
smp_processor_id() in preemptible code in print_fatal_signal()") and the fact it
it there still implies it is needed in general.

Possible solutions are to
(1) override this by re-enabling preemption in ARC show_regs()
(2) rip out all the mm access and hence locks from ARC show_regs()
...

Thx,
-Vineet

---------------------------->8---------------------------------
# ./segv # access invalid address 0x62345678
potentially unexpected fatal signal 11.
BUG: sleeping function called from invalid context at ../kernel/fork.c:979
in_atomic(): 1, irqs_disabled(): 0, pid: 57, name: segv
no locks held by segv/57.
Preemption disabled at:
[<8182f17e>] get_signal+0x4a6/0x7c4
CPU: 0 PID: 57 Comm: segv Not tainted 4.17.0+ #23

Stack Trace:
arc_unwind_core.constprop.1+0xd0/0xf4
__might_sleep+0x1f6/0x234
mmput+0x18/0xc0
show_regs+0x2c/0x36c
get_signal+0x4ac/0x7c4
do_signal+0x30/0x224
resume_user_mode_begin+0x90/0xd8

Path: /segv
CPU: 0 PID: 57 Comm: segv Tainted: G W 4.17.0+ #23

[ECR ]: 0x00050200 => Invalid Write @ 0x62345678 by insn @ 0x0001035c
[EFA ]: 0x62345678
[BLINK ]: 0x2003a53a
[ERET ]: 0x1035c
@off 0x1035c in [/segv]
VMA: 0x00010000 to 0x00012000
[STAT32]: 0x80080882 : IE U
BTA: 0x00010344 SP: 0x5fa8fdd0 FP: 0x5fa8fdd4
LPS: 0x20014464 LPE: 0x20014468 LPC: 0x00000000
...
---------------------------->8---------------------------------







2018-07-31 22:34:41

by Vineet Gupta

[permalink] [raw]
Subject: Re: ARC show_regs() triggers preempt debug splat, lockdep

On 07/31/2018 02:26 PM, Vineet Gupta wrote:
> Hi Peter, Al,
>
> Reaching out about a problem I understand, but not quite sure how to fix it.
> Its the weird feeling of how was this working all along, if at all.
>
> With print-fatal-signals enabled, there's CONFIG_DEBUG_PREEMPT splat all over,
> even with a simple single threaded segv inducing program (console log below). This
> originally came to light with a glibc test suite tst-tls3-malloc which is a
> multi-threaded monster.
>
> ARC show_regs() is a bit more fancy as it tries to print the executable path,
> faulting vma name (in case it was a shared lib etc). This involves taking a bunch
> of customary locks which seems to be tripping the debug infra.
>
> The preemption disabling around show_regs() in core signal handling seem to have
> been introduced back in 2009 by 3a9f84d354ce1 ("signals, debug: fix BUG: using
> smp_processor_id() in preemptible code in print_fatal_signal()") and the fact it
> it there still implies it is needed in general.
>
> Possible solutions are to
> (1) override this by re-enabling preemption in ARC show_regs()
> (2) rip out all the mm access and hence locks from ARC show_regs()
> ...

I investigated a bit more and it seems the story is more complicated and there are
2 distinct issues.

1. print-fatal-signals ENABLED: induces the show_regs() issue of __might_sleep()
with preemption_disabled(). This happens with simplest of programs

2. print-fatal-signals DISABLED: this causes glibc testsuite tst-tls3-malloc to
barf still, see below. This is a multi-threaded test where one thread is serving a
page fault, gets scheduled out and other thread observes the signal and decides to
exit (this is UP kernel BTW)


------------------->8------------------
# while true; do ./tst-tls3-malloc ; done
Didn't expect signal from child: got `Segmentation fault'
^C
============================================
WARNING: possible recursive locking detected
4.17.0+ #25 Not tainted
--------------------------------------------
tst-tls3-malloc/510 is trying to acquire lock:
606c7728 (&mm->mmap_sem){++++}, at: __might_fault+0x28/0x5c

but task is already holding lock:
606c7728 (&mm->mmap_sem){++++}, at: do_page_fault+0x9c/0x2a0

other info that might help us debug this:
Possible unsafe locking scenario:

CPU0
----
lock(&mm->mmap_sem);
lock(&mm->mmap_sem);

*** DEADLOCK ***

May be due to missing lock nesting notation

1 lock held by tst-tls3-malloc/510:
#0: 606c7728 (&mm->mmap_sem){++++}, at: do_page_fault+0x9c/0x2a0

stack backtrace:
CPU: 0 PID: 510 Comm: tst-tls3-malloc Not tainted 4.17.0+ #25

Stack Trace:
arc_unwind_core.constprop.1+0xd0/0xf4
__lock_acquire+0x586/0x142c
lock_acquire+0x36/0x4c
__might_fault+0x42/0x5c
exit_robust_list+0x40/0x19c
mm_release+0xce/0xf4
do_exit+0x554/0x780
do_group_exit+0x22/0x84
get_signal+0x196/0x79c
do_signal+0x30/0x224
resume_user_mode_begin+0x90/0xd8

Timed out: killed the child process
------------------->8------------------


2018-08-01 07:55:06

by Peter Zijlstra

[permalink] [raw]
Subject: Re: ARC show_regs() triggers preempt debug splat, lockdep

On Tue, Jul 31, 2018 at 02:26:32PM -0700, Vineet Gupta wrote:
> Hi Peter, Al,
>
> Reaching out about a problem I understand, but not quite sure how to fix it.
> Its the weird feeling of how was this working all along, if at all.
>
> With print-fatal-signals enabled, there's CONFIG_DEBUG_PREEMPT splat all over,
> even with a simple single threaded segv inducing program (console log below). This
> originally came to light with a glibc test suite tst-tls3-malloc which is a
> multi-threaded monster.
>
> ARC show_regs() is a bit more fancy as it tries to print the executable path,
> faulting vma name (in case it was a shared lib etc). This involves taking a bunch
> of customary locks which seems to be tripping the debug infra.

Right, so I think that that is a fairly dodgy thing to do. As shown in
your subsequent email, if a pagefault generates a signal we might
already be holding the mmap_sem.

The thing you could do is maybe use down_read_trylock() there.

diff --git a/arch/arc/kernel/troubleshoot.c b/arch/arc/kernel/troubleshoot.c
index 783b20354f8b..bb7bde11d2c8 100644
--- a/arch/arc/kernel/troubleshoot.c
+++ b/arch/arc/kernel/troubleshoot.c
@@ -92,7 +92,10 @@ static void show_faulting_vma(unsigned long address, char *buf)
/* can't use print_vma_addr() yet as it doesn't check for
* non-inclusive vma
*/
- down_read(&active_mm->mmap_sem);
+ if (!down_read_trylock(&active_mm->mmap_sem)) {
+ pr_info(" @Trylock failed\n");
+ return;
+ }
vma = find_vma(active_mm, address);

/* check against the find_vma( ) behaviour which returns the next VMA

2018-08-01 18:43:48

by Vineet Gupta

[permalink] [raw]
Subject: Re: ARC show_regs() triggers preempt debug splat, lockdep

On 08/01/2018 12:53 AM, Peter Zijlstra wrote:
> On Tue, Jul 31, 2018 at 02:26:32PM -0700, Vineet Gupta wrote:
>> Hi Peter, Al,
>>
>> Reaching out about a problem I understand, but not quite sure how to fix it.
>> Its the weird feeling of how was this working all along, if at all.
>>
>> With print-fatal-signals enabled, there's CONFIG_DEBUG_PREEMPT splat all over,
>> even with a simple single threaded segv inducing program (console log below). This
>> originally came to light with a glibc test suite tst-tls3-malloc which is a
>> multi-threaded monster.
>>
>> ARC show_regs() is a bit more fancy as it tries to print the executable path,
>> faulting vma name (in case it was a shared lib etc). This involves taking a bunch
>> of customary locks which seems to be tripping the debug infra.
> Right, so I think that that is a fairly dodgy thing to do. As shown in
> your subsequent email, if a pagefault generates a signal we might
> already be holding the mmap_sem.
>
> The thing you could do is maybe use down_read_trylock() there.
>
> diff --git a/arch/arc/kernel/troubleshoot.c b/arch/arc/kernel/troubleshoot.c
> index 783b20354f8b..bb7bde11d2c8 100644
> --- a/arch/arc/kernel/troubleshoot.c
> +++ b/arch/arc/kernel/troubleshoot.c
> @@ -92,7 +92,10 @@ static void show_faulting_vma(unsigned long address, char *buf)
> /* can't use print_vma_addr() yet as it doesn't check for
> * non-inclusive vma
> */
> - down_read(&active_mm->mmap_sem);
> + if (!down_read_trylock(&active_mm->mmap_sem)) {
> + pr_info(" @Trylock failed\n");
> + return;
> + }
> vma = find_vma(active_mm, address);
>
> /* check against the find_vma( ) behaviour which returns the next VMA

That's not the only issue here. We also call page allocator in show_regs code path
and that barfs as well with __might_sleep.

I think for us, it would make sense to re-enable preemption inside ARC show_regs()
- undoing the generic disable from get_signal()
The rationale for that in first place, per commit 3a9f84d354ce1 was some arch
(x86?) show_regs() calling smp_processor_id(). Arguable that could call the
raw_smp variant, but we don't want to needlessly bother rest of the world.

Do you see any pitfall with my proposal ?

-Vineet