The linux-next commit bf2c59fce407 ("sched/core: Fix illegal RCU from
offline CPUs") moved the assignment,

	idle->active_mm = &init_mm;

from idle_task_exit() into finish_cpu(), which results in a false
positive from the warning that was introduced by the commit 3eda69c92d47
("kernel/fork.c: detect early free of a live mm").

WARNING: CPU: 127 PID: 72976 at kernel/fork.c:697
__mmdrop+0x230/0x2c0
do_exit+0x424/0xfa0
Call Trace:
do_exit+0x424/0xfa0
do_group_exit+0x64/0xd0
sys_exit_group+0x24/0x30
system_call_exception+0x108/0x1d0
system_call_common+0xf0/0x278

Fixes: bf2c59fce407 ("sched/core: Fix illegal RCU from offline CPUs")
Signed-off-by: Qian Cai <[email protected]>
---
 kernel/fork.c | 1 -
 1 file changed, 1 deletion(-)

diff --git a/kernel/fork.c b/kernel/fork.c
index 142b23645d82..5334efd2a680 100644
--- a/kernel/fork.c
+++ b/kernel/fork.c
@@ -694,7 +694,6 @@ void __mmdrop(struct mm_struct *mm)
 {
 	BUG_ON(mm == &init_mm);
 	WARN_ON_ONCE(mm == current->mm);
-	WARN_ON_ONCE(mm == current->active_mm);
 	mm_free_pgd(mm);
 	destroy_context(mm);
 	mmu_notifier_subscriptions_destroy(mm);
-- 
2.21.0 (Apple Git-122.2)

On Thu, Jun 04, 2020 at 11:03:44AM -0400, Qian Cai wrote:
> The linux-next commit bf2c59fce407 ("sched/core: Fix illegal RCU from
> offline CPUs") delayed,
>
> idle->active_mm = &init_mm;
>
> into finish_cpu() instead of idle_task_exit() which results in a false
> positive warning that was originally designed in the commit 3eda69c92d47
> ("kernel/fork.c: detect early free of a live mm").
>
> WARNING: CPU: 127 PID: 72976 at kernel/fork.c:697
> __mmdrop+0x230/0x2c0
> do_exit+0x424/0xfa0
> Call Trace:
> do_exit+0x424/0xfa0
> do_group_exit+0x64/0xd0
> sys_exit_group+0x24/0x30
> system_call_exception+0x108/0x1d0
> system_call_common+0xf0/0x278
>
> Fixes: bf2c59fce407 ("sched/core: Fix illegal RCU from offline CPUs")
> Signed-off-by: Qian Cai <[email protected]>
Peter, Andrew, can you take a look at this trivial patch? The offending commit
is now in mainline, and a non-root user can trigger this warning quite
easily.

# git clone https://gitlab.com/cailca/linux-mm
# cd linux-mm; make
# ./random -x 0-100 -f (it switches to a non-root user.)

[14933.435035][T1176230] ------------[ cut here ]------------
[14933.435062][T1176230] WARNING: CPU: 25 PID: 1176230 at kernel/fork.c:679 __mmdrop+0x1e0/0x270
[14933.435086][T1176230] Modules linked in: vfio_pci vfio_virqfd vfio_iommu_spapr_tce vfio vfio_spapr_eeh loop kvm_hv kvm ip_tables x_tables sd_mod tg3 firmware_class ahci libahci libphy libata dm_mirror dm_region_hash dm_log dm_mod
[14933.435176][T1176230] CPU: 25 PID: 1176230 Comm: trinity-c25 Not tainted 5.8.0-rc6-next-20200721 #5
[14933.435200][T1176230] NIP: c0000000000c4560 LR: c0000000000d4fb4 CTR: 0000000000000000
[14933.435223][T1176230] REGS: c00000001520f910 TRAP: 0700 Not tainted (5.8.0-rc6-next-20200721)
[14933.435255][T1176230] MSR: 9000000000029033 <SF,HV,EE,ME,IR,DR,RI,LE> CR: 24002822 XER: 20040000
[14933.435294][T1176230] CFAR: c0000000000c43f4 IRQMASK: 0
[14933.435294][T1176230] GPR00: c0000000000d4fb4 c00000001520fba0 c000000005911500 c000000f6e227080
[14933.435294][T1176230] GPR04: 0000000000000001 fffffffffffff026 c0000007aa2a17e0 0000000000000001
[14933.435294][T1176230] GPR08: 0000000000000000 c000000f6e227080 0000000000000001 bfffffffffffffff
[14933.435294][T1176230] GPR12: 0000000000002000 c000000ffffeb780 0000000010037be0 00000000109ec8e0
[14933.435294][T1176230] GPR16: 00000000109eccb0 0000000010037ff0 0000000000000001 00007fff8d000000
[14933.435294][T1176230] GPR20: 00000000100380d0 00000000100380a8 000000001003a1d0 0000000010037f10
[14933.435294][T1176230] GPR24: 00007fff8df100e0 00000000ed54649a c000000f6e227128 c00000001520fcc8
[14933.435294][T1176230] GPR28: c000000fe7540890 c000000f6e227080 c000000fe7540080 c000000f6e227080
[14933.435470][T1176230] NIP [c0000000000c4560] __mmdrop+0x1e0/0x270
[14933.435493][T1176230] LR [c0000000000d4fb4] do_exit+0x424/0xee0
[14933.435514][T1176230] Call Trace:
[14933.435524][T1176230] [c00000001520fba0] [c00c0000000164c0] 0xc00c0000000164c0 (unreliable)
[14933.435550][T1176230] [c00000001520fc60] [c0000000000d4fb4] do_exit+0x424/0xee0
[14933.435575][T1176230] [c00000001520fd60] [c0000000000d5b2c] do_group_exit+0x5c/0xd0
[14933.435600][T1176230] [c00000001520fda0] [c0000000000d5bbc] sys_exit_group+0x1c/0x20
[14933.435626][T1176230] [c00000001520fdc0] [c00000000002c978] system_call_exception+0xf8/0x180
[14933.435654][T1176230] [c00000001520fe20] [c00000000000c9e8] system_call_common+0xe8/0x214
[14933.435686][T1176230] Instruction dump:
[14933.435706][T1176230] 480c35e9 60000000 4bffff68 60000000 7c832378 38800000 482a8d81 60000000
[14933.435750][T1176230] 4bfffee8 60000000 60000000 60000000 <0fe00000> 4bfffe94 60000000 60000000
[14933.435795][T1176230] irq event stamp: 370444
[14933.435817][T1176230] hardirqs last enabled at (370443): [<c0000000003b8964>] __slab_free+0x2e4/0x5e0
[14933.435868][T1176230] hardirqs last disabled at (370444): [<c00000000000964c>] program_check_common_virt+0x2bc/0x310
[14933.435913][T1176230] softirqs last enabled at (352366): [<c0000000009b0790>] __do_softirq+0x650/0x920
[14933.435948][T1176230] softirqs last disabled at (352359): [<c0000000000d70dc>] irq_exit+0x17c/0x1b0
[14933.435971][T1176230] ---[ end trace 5823e9bcf4dee099 ]---
On Thu, Jun 04, 2020 at 11:03:44AM -0400, Qian Cai wrote:
> The linux-next commit bf2c59fce407 ("sched/core: Fix illegal RCU from
> offline CPUs") delayed,
>
> idle->active_mm = &init_mm;
>
> into finish_cpu() instead of idle_task_exit() which results in a false
> positive warning that was originally designed in the commit 3eda69c92d47
> ("kernel/fork.c: detect early free of a live mm").
>
> WARNING: CPU: 127 PID: 72976 at kernel/fork.c:697
> __mmdrop+0x230/0x2c0
> do_exit+0x424/0xfa0
> Call Trace:
> do_exit+0x424/0xfa0
> do_group_exit+0x64/0xd0
> sys_exit_group+0x24/0x30
> system_call_exception+0x108/0x1d0
> system_call_common+0xf0/0x278
Please explain; because afaict this is a use-after-free.
The thing is __mmdrop() is going to actually free the mm, so then what
is finish_cpu()'s mmdrop() going to do?
->active_mm() should have a refcount on the mm.
On Wed, Jul 22, 2020 at 12:06:37PM +0200, [email protected] wrote:
> On Thu, Jun 04, 2020 at 11:03:44AM -0400, Qian Cai wrote:
> > The linux-next commit bf2c59fce407 ("sched/core: Fix illegal RCU from
> > offline CPUs") delayed,
> >
> > idle->active_mm = &init_mm;
> >
> > into finish_cpu() instead of idle_task_exit() which results in a false
> > positive warning that was originally designed in the commit 3eda69c92d47
> > ("kernel/fork.c: detect early free of a live mm").
> >
> > WARNING: CPU: 127 PID: 72976 at kernel/fork.c:697
> > __mmdrop+0x230/0x2c0
> > do_exit+0x424/0xfa0
> > Call Trace:
> > do_exit+0x424/0xfa0
> > do_group_exit+0x64/0xd0
> > sys_exit_group+0x24/0x30
> > system_call_exception+0x108/0x1d0
> > system_call_common+0xf0/0x278
>
> Please explain; because afaict this is a use-after-free.
>
> The thing is __mmdrop() is going to actually free the mm, so then what
> is finish_cpu()'s mmdrop() going to do?
>
> ->active_mm() should have a refcount on the mm.
Well, the refcount issue you mentioned already existed before bf2c59fce407 was
introduced, but it looked harmless because mmdrop() in finish_cpu()
will do,

	if (unlikely(atomic_dec_and_test(&mm->mm_count)))
		__mmdrop(mm);

where atomic_dec_and_test() sees the negative refcount and will not invoke
__mmdrop() again. It is not clear to me whether, once the CPU is offline, it
needs to care about its idle thread's mm_count at all. Even if this refcount
issue is eventually addressed, it could still hit this warning in finish_cpu()
without this patch.

On the other hand, if you look at the commit 3eda69c92d47, it is clear that
the assumption behind,

	WARN_ON_ONCE(mm == current->active_mm);

no longer holds due to bf2c59fce407. Thus, the patch fixes that discrepancy
first, and then I'll look at the mmdrop()/mmgrab() imbalance elsewhere.
On Wed, Jul 22, 2020 at 09:19:00AM -0400, Qian Cai wrote:
> On Wed, Jul 22, 2020 at 12:06:37PM +0200, [email protected] wrote:
> > On Thu, Jun 04, 2020 at 11:03:44AM -0400, Qian Cai wrote:
> > > The linux-next commit bf2c59fce407 ("sched/core: Fix illegal RCU from
> > > offline CPUs") delayed,
> > >
> > > idle->active_mm = &init_mm;
> > >
> > > into finish_cpu() instead of idle_task_exit() which results in a false
> > > positive warning that was originally designed in the commit 3eda69c92d47
> > > ("kernel/fork.c: detect early free of a live mm").
> > >
> > > WARNING: CPU: 127 PID: 72976 at kernel/fork.c:697
> > > __mmdrop+0x230/0x2c0
> > > do_exit+0x424/0xfa0
> > > Call Trace:
> > > do_exit+0x424/0xfa0
> > > do_group_exit+0x64/0xd0
> > > sys_exit_group+0x24/0x30
> > > system_call_exception+0x108/0x1d0
> > > system_call_common+0xf0/0x278
> >
> > Please explain; because afaict this is a use-after-free.
> >
> > The thing is __mmdrop() is going to actually free the mm, so then what
> > is finish_cpu()'s mmdrop() going to do?
> >
> > ->active_mm() should have a refcount on the mm.
>
> Well, the refcount issue you mentioned then happens all before bf2c59fce407 was
> introduced as well, but then it looks harmless because mmdrop() in finish_cpu()
> will do,
>
> if (unlikely(atomic_dec_and_test(&mm->mm_count)))
> __mmdrop(mm);
That's not harmless, that's a use-after-free. Those can cause memory
corruption bugs and the like at best. Who knows what's at the location
of mm->mm_count after we've already freed it.
> where that atomic_dec_and_test() see the negative refcount and will not involve
> __mmdrop() again. It is not clear to me that once the CPU is offline if it
> needs to care about its idle thread mm_count at all. Even if this refcount
> issue is finally addressed, it could hit this warning in finish_cpu() without
> this patch.
>
> On the other hand, if you look at the commit 3eda69c92d47, it is clearly that
> the assumption of,
>
> WARN_ON_ONCE(mm == current->active_mm);
>
> is totally gone due to bf2c59fce407. Thus, the patch is to fix that discrepancy
> first and then I'll look at that the imbalance mmdrop()/mmgrab() elsewhere.
No, you're talking nonsense. We must not free @mm when
'current->active_mm == mm', never.
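
The disagreement above can be illustrated with a small user-space model. This is only a sketch under stated assumptions: `struct model_mm`, `struct model_task`, `model_mmgrab()` and `model_mmdrop()` are hypothetical stand-ins written for this note, not kernel code, and the object is flagged as freed rather than actually released so the late access can be counted safely. It shows that a second drop on an already-freed object does not free it again, yet still writes to the freed memory, which is the use-after-free being pointed out.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical user-space stand-in for struct mm_struct; not kernel code. */
struct model_mm {
	atomic_int mm_count;	/* models mm->mm_count */
	bool freed;		/* set instead of really freeing the object */
	int touched_after_free;	/* accesses that would be use-after-free */
};

/* Hypothetical stand-in for the two task_struct fields discussed here. */
struct model_task {
	struct model_mm *mm;
	struct model_mm *active_mm;
};

static void model_mmgrab(struct model_mm *mm)
{
	assert(mm != NULL);
	atomic_fetch_add(&mm->mm_count, 1);
}

/* Returns true if this call dropped the last reference and "freed" mm. */
static bool model_mmdrop(struct model_mm *mm)
{
	assert(mm != NULL);
	if (mm->freed)
		mm->touched_after_free++;	/* the decrement below hits freed memory */
	/* fetch_sub returns the old value: old == 1 means the count reached 0 */
	if (atomic_fetch_sub(&mm->mm_count, 1) == 1) {
		mm->freed = true;
		return true;
	}
	return false;
}

/* The borrower of active_mm holds its own reference: no late access. */
static int scenario_balanced(void)
{
	struct model_mm mm;
	struct model_task idle = { .mm = NULL, .active_mm = &mm };

	atomic_init(&mm.mm_count, 1);	/* owner's reference */
	mm.freed = false;
	mm.touched_after_free = 0;

	model_mmgrab(idle.active_mm);	/* borrower's reference: 1 -> 2 */
	model_mmdrop(&mm);		/* owner exits: 2 -> 1, not freed */
	model_mmdrop(idle.active_mm);	/* borrower lets go: 1 -> 0, freed */
	return mm.touched_after_free;
}

/* The borrower never took a reference: the owner's drop frees a live mm. */
static int scenario_missing_ref(void)
{
	struct model_mm mm;
	struct model_task idle = { .mm = NULL, .active_mm = &mm };

	atomic_init(&mm.mm_count, 1);
	mm.freed = false;
	mm.touched_after_free = 0;

	model_mmdrop(&mm);		/* 1 -> 0: freed while idle.active_mm == &mm */
	model_mmdrop(idle.active_mm);	/* 0 -> -1: no second free, but a late write */
	return mm.touched_after_free;
}
```

In this model `scenario_balanced()` reports no late access, while `scenario_missing_ref()` reports one: the count going negative avoids a double free, but the decrement itself lands in memory that may already belong to someone else.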
On Wed, Jul 22, 2020 at 03:44:06PM +0200, Peter Zijlstra wrote:
> On Wed, Jul 22, 2020 at 09:19:00AM -0400, Qian Cai wrote:
> > On Wed, Jul 22, 2020 at 12:06:37PM +0200, [email protected] wrote:
> > > On Thu, Jun 04, 2020 at 11:03:44AM -0400, Qian Cai wrote:
> > > > The linux-next commit bf2c59fce407 ("sched/core: Fix illegal RCU from
> > > > offline CPUs") delayed,
> > > >
> > > > idle->active_mm = &init_mm;
> > > >
> > > > into finish_cpu() instead of idle_task_exit() which results in a false
> > > > positive warning that was originally designed in the commit 3eda69c92d47
> > > > ("kernel/fork.c: detect early free of a live mm").
> > > >
> > > > WARNING: CPU: 127 PID: 72976 at kernel/fork.c:697
> > > > __mmdrop+0x230/0x2c0
> > > > do_exit+0x424/0xfa0
> > > > Call Trace:
> > > > do_exit+0x424/0xfa0
> > > > do_group_exit+0x64/0xd0
> > > > sys_exit_group+0x24/0x30
> > > > system_call_exception+0x108/0x1d0
> > > > system_call_common+0xf0/0x278
> > >
> > > Please explain; because afaict this is a use-after-free.
> > >
> > > The thing is __mmdrop() is going to actually free the mm, so then what
> > > is finish_cpu()'s mmdrop() going to do?
> > >
> > > ->active_mm() should have a refcount on the mm.
> >
> > Well, the refcount issue you mentioned then happens all before bf2c59fce407 was
> > introduced as well, but then it looks harmless because mmdrop() in finish_cpu()
> > will do,
> >
> > if (unlikely(atomic_dec_and_test(&mm->mm_count)))
> > __mmdrop(mm);
>
> That's not harmless, that's a use-after-free. Those can cause memory
> corruption bugs and the like at best. Who knows what's at the location
> of mm->mm_count after we've already freed it.
>
> > where that atomic_dec_and_test() see the negative refcount and will not involve
> > __mmdrop() again. It is not clear to me that once the CPU is offline if it
> > needs to care about its idle thread mm_count at all. Even if this refcount
> > issue is finally addressed, it could hit this warning in finish_cpu() without
> > this patch.
> >
> > On the other hand, if you look at the commit 3eda69c92d47, it is clearly that
> > the assumption of,
> >
> > WARN_ON_ONCE(mm == current->active_mm);
> >
> > is totally gone due to bf2c59fce407. Thus, the patch is to fix that discrepancy
> > first and then I'll look at that the imbalance mmdrop()/mmgrab() elsewhere.
>
> No, you're talking nonsense. We must not free @mm when
> 'current->active_mm == mm', never.
Yes, you are right. It still triggers the report below on powerpc with today's
linux-next after fuzzing for a while (I saw it a few times on recent linux-next
before as well, but so far it is mostly reproducible on powerpc here). Any idea?

[12802.547809][T191552] BUG mm_struct (Tainted: G O ): Poison overwritten
[12802.547824][T191552] -----------------------------------------------------------------------------
[12802.547824][T191552]
[12802.547843][T191552] Disabling lock debugging due to kernel taint
[12802.547867][T191552] INFO: 0x000000000e2a54ec-0x000000000e2a54ec @offset=96464. First byte 0x6a instead of 0x6b
[12802.547889][T191552] INFO: Allocated in dup_mm+0x48/0x6d0 age=955 cpu=108 pid=191552
[12802.547915][T191552] __slab_alloc+0xa4/0xf0
[12802.547937][T191552] kmem_cache_alloc+0x314/0x4a0
[12802.547959][T191552] dup_mm+0x48/0x6d0
dup_mm at kernel/fork.c:1344
[12802.547978][T191552] copy_process+0x11bc/0x19a0
[12802.548010][T191552] kernel_clone+0x120/0xb80
[12802.548031][T191552] __do_sys_clone+0x88/0xd0
[12802.548055][T191552] system_call_exception+0xf8/0x1d0
[12802.548083][T191552] system_call_common+0xe8/0x218
[12802.548093][T191552] INFO: Freed in __mmdrop+0x144/0x250 age=942 cpu=69 pid=882503
[12802.548140][T191552] kmem_cache_free+0x47c/0x500
[12802.548161][T191552] __mmdrop+0x144/0x250
__mmdrop at kernel/fork.c:685
[12802.548170][T191552] do_exit+0x3f4/0xed0
[12802.548212][T191552] do_group_exit+0x5c/0xd0
[12802.548244][T191552] sys_exit_group+0x1c/0x20
[12802.548277][T191552] system_call_exception+0xf8/0x1d0
[12802.548309][T191552] system_call_common+0xe8/0x218
[12802.548342][T191552] INFO: Slab 0x0000000048df84af objects=64 used=64 fp=0x0000000000000000 flags=0x87fff8000010200
[12802.548379][T191552] INFO: Object 0x00000000583c5ba3 @offset=96384 fp=0x00000000681f5d04
[12802.548379][T191552]
[12802.548419][T191552] Redzone 000000004a1ea01e: bb bb bb bb bb bb bb bb bb bb bb bb bb bb bb bb ................
[12802.548445][T191552] Redzone 0000000037d12952: bb bb bb bb bb bb bb bb bb bb bb bb bb bb bb bb ................
[12802.548471][T191552] Redzone 000000008124eae0: bb bb bb bb bb bb bb bb bb bb bb bb bb bb bb bb ................
[12802.548511][T191552] Redzone 000000009b782382: bb bb bb bb bb bb bb bb bb bb bb bb bb bb bb bb ................
[12802.548559][T191552] Redzone 0000000005c781f2: bb bb bb bb bb bb bb bb bb bb bb bb bb bb bb bb ................
[12802.548608][T191552] Redzone 00000000f334982a: bb bb bb bb bb bb bb bb bb bb bb bb bb bb bb bb ................
[12802.548645][T191552] Redzone 0000000018372bc6: bb bb bb bb bb bb bb bb bb bb bb bb bb bb bb bb ................
[12802.548706][T191552] Redzone 00000000de34ccbe: bb bb bb bb bb bb bb bb bb bb bb bb bb bb bb bb ................
[12802.548755][T191552] Object 00000000583c5ba3: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b kkkkkkkkkkkkkkkk
[12802.548804][T191552] Object 000000007701f6eb: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b kkkkkkkkkkkkkkkk
[12802.548864][T191552] Object 00000000796c61b2: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b kkkkkkkkkkkkkkkk
[12802.548912][T191552] Object 00000000d5d3e0a7: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b kkkkkkkkkkkkkkkk
[12802.548960][T191552] Object 00000000be4c7347: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b kkkkkkkkkkkkkkkk
[12802.548997][T191552] Object 000000000e2a54ec: 6a 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b jkkkkkkkkkkkkkkk
[12802.549034][T191552] Object 000000005f2499ea: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b kkkkkkkkkkkkkkkk
[12802.549093][T191552] Object 000000007dfc6e96: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b kkkkkkkkkkkkkkkk
[12802.549120][T191552] Object 0000000033cbf36a: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b kkkkkkkkkkkkkkkk
[12802.549135][T191552] Object 00000000b62c5d59: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b kkkkkkkkkkkkkkkk
[12802.549172][T191552] Object 00000000fc047f4a: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b kkkkkkkkkkkkkkkk
[12802.549210][T191552] Object 00000000c28e582c: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b kkkkkkkkkkkkkkkk
[12802.549258][T191552] Object 0000000058ab5b6a: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b kkkkkkkkkkkkkkkk
[12802.549316][T191552] Object 000000005a56e917: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b kkkkkkkkkkkkkkkk
[12802.549364][T191552] Object 000000005a3db061: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b kkkkkkkkkkkkkkkk
[12802.549426][T191552] Object 00000000831930db: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b kkkkkkkkkkkkkkkk
[12802.549464][T191552] Object 00000000dfbae818: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b kkkkkkkkkkkkkkkk
[12802.549500][T191552] Object 000000007c1d0838: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b kkkkkkkkkkkkkkkk
[12802.549548][T191552] Object 0000000061011d8a: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b kkkkkkkkkkkkkkkk
[12802.549585][T191552] Object 000000000e949754: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b kkkkkkkkkkkkkkkk
[12802.549634][T191552] Object 000000006413f485: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b kkkkkkkkkkkkkkkk
[12802.549671][T191552] Object 00000000c2345eaa: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b kkkkkkkkkkkkkkkk
[12802.549718][T191552] Object 0000000092085813: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b kkkkkkkkkkkkkkkk
[12802.549755][T191552] Object 00000000bd1573c3: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b kkkkkkkkkkkkkkkk
[12802.549813][T191552] Object 00000000ea86aa44: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b kkkkkkkkkkkkkkkk
[12802.549862][T191552] Object 00000000f6c1034d: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b kkkkkkkkkkkkkkkk
[12802.549910][T191552] Object 000000001d90fa29: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b kkkkkkkkkkkkkkkk
[12802.549958][T191552] Object 000000001397fc70: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b kkkkkkkkkkkkkkkk
[12802.550016][T191552] Object 0000000073b0be2d: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b kkkkkkkkkkkkkkkk
[12802.550053][T191552] Object 00000000887c2ae9: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b kkkkkkkkkkkkkkkk
[12802.550101][T191552] Object 00000000b662d1ef: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b kkkkkkkkkkkkkkkk
[12802.550183][T191552] Object 000000000f9f4844: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b kkkkkkkkkkkkkkkk
[12802.550280][T191552] Object 0000000030f51915: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b kkkkkkkkkkkkkkkk
[12802.550406][T191552] Object 0000000055fe92a1: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b kkkkkkkkkkkkkkkk
[12802.550518][T191552] Object 0000000018acbccc: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b kkkkkkkkkkkkkkkk
[12802.550641][T191552] Object 0000000003bc1e0d: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b kkkkkkkkkkkkkkkk
[12802.550755][T191552] Object 000000002d3ab81e: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b kkkkkkkkkkkkkkkk
[12802.550879][T191552] Object 000000008e60297f: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b kkkkkkkkkkkkkkkk
[12802.551005][T191552] Object 00000000816738aa: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b kkkkkkkkkkkkkkkk
[12802.551104][T191552] Object 000000001418ad0f: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b kkkkkkkkkkkkkkkk
[12802.551226][T191552] Object 00000000f753b837: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b kkkkkkkkkkkkkkkk
[12802.551363][T191552] Object 000000003456e3f7: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b kkkkkkkkkkkkkkkk
[12802.551489][T191552] Object 000000006e6ba90f: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b kkkkkkkkkkkkkkkk
[12802.551609][T191552] Object 00000000731663e1: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b kkkkkkkkkkkkkkkk
[12802.551730][T191552] Object 00000000c3364461: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b kkkkkkkkkkkkkkkk
[12802.551854][T191552] Object 00000000eebcf88b: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b kkkkkkkkkkkkkkkk
[12802.551956][T191552] Object 000000004de29fa4: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b kkkkkkkkkkkkkkkk
[12802.552067][T191552] Object 000000005bd1967e: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b kkkkkkkkkkkkkkkk
[12802.552184][T191552] Object 00000000d8d1d981: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b kkkkkkkkkkkkkkkk
[12802.552321][T191552] Object 00000000fd01955d: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b kkkkkkkkkkkkkkkk
[12802.552447][T191552] Object 000000005aad9974: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b kkkkkkkkkkkkkkkk
[12802.552555][T191552] Object 000000007fa2efe4: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b kkkkkkkkkkkkkkkk
[12802.552653][T191552] Object 000000001e6bbc3d: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b kkkkkkkkkkkkkkkk
[12802.552782][T191552] Object 000000004e7b9320: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b kkkkkkkkkkkkkkkk
[12802.552913][T191552] Object 000000007660c732: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b kkkkkkkkkkkkkkkk
[12802.553024][T191552] Object 0000000005fe5824: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b kkkkkkkkkkkkkkkk
[12802.553125][T191552] Object 000000007072b5da: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b kkkkkkkkkkkkkkkk
[12802.553257][T191552] Object 00000000ce50558d: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b kkkkkkkkkkkkkkkk
[12802.553375][T191552] Object 00000000ee40426b: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b kkkkkkkkkkkkkkkk
[12802.553508][T191552] Object 00000000151dd063: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b kkkkkkkkkkkkkkkk
[12802.553588][T191552] Object 000000006dde4155: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b kkkkkkkkkkkkkkkk
[12802.553719][T191552] Object 00000000bba9c8b4: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b kkkkkkkkkkkkkkkk
[12802.553835][T191552] Object 0000000081fef250: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b kkkkkkkkkkkkkkkk
[12802.553952][T191552] Object 00000000db9d7aa0: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b kkkkkkkkkkkkkkkk
[12802.554078][T191552] Object 00000000513748d5: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b kkkkkkkkkkkkkkkk
[12802.554190][T191552] Object 000000001b7e4b57: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b kkkkkkkkkkkkkkkk
[12802.554313][T191552] Object 00000000969509b3: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b kkkkkkkkkkkkkkkk
[12802.554430][T191552] Object 00000000df85a9df: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b kkkkkkkkkkkkkkkk
[12802.554558][T191552] Object 00000000d526fda8: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b kkkkkkkkkkkkkkkk
[12802.554664][T191552] Object 000000008be58260: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b kkkkkkkkkkkkkkkk
[12802.554784][T191552] Object 000000006a8d52b0: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b kkkkkkkkkkkkkkkk
[12802.554911][T191552] Object 00000000ad1dfd55: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b kkkkkkkkkkkkkkkk
[12802.555005][T191552] Object 00000000873b52ea: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b kkkkkkkkkkkkkkkk
[12802.555134][T191552] Object 000000009716c879: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b kkkkkkkkkkkkkkkk
[12802.555249][T191552] Object 00000000eca252fd: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b kkkkkkkkkkkkkkkk
[12802.555371][T191552] Object 000000002d09f068: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b kkkkkkkkkkkkkkkk
[12802.555493][T191552] Object 0000000095d7f3b1: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b kkkkkkkkkkkkkkkk
[12802.555617][T191552] Object 000000009b66f877: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b kkkkkkkkkkkkkkkk
[12802.555728][T191552] Object 00000000d4d0da23: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b kkkkkkkkkkkkkkkk
[12802.555848][T191552] Object 00000000545dfae3: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b kkkkkkkkkkkkkkkk
[12802.555965][T191552] Object 00000000c686086a: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b kkkkkkkkkkkkkkkk
[12802.556093][T191552] Object 0000000076efef7b: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b kkkkkkkkkkkkkkkk
[12802.556213][T191552] Object 000000007642cc9f: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b kkkkkkkkkkkkkkkk
[12802.556313][T191552] Object 00000000a2c7182e: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b kkkkkkkkkkkkkkkk
[12802.556431][T191552] Object 00000000d8508993: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b kkkkkkkkkkkkkkkk
[12802.556570][T191552] Object 0000000007078b31: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b kkkkkkkkkkkkkkkk
[12802.556676][T191552] Object 000000002111128f: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b kkkkkkkkkkkkkkkk
[12802.556769][T191552] Object 0000000096a989ba: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b kkkkkkkkkkkkkkkk
[12802.556904][T191552] Object 00000000078fa309: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b kkkkkkkkkkkkkkkk
[12802.557025][T191552] Object 00000000b68d0e77: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b kkkkkkkkkkkkkkkk
[12802.557158][T191552] Object 00000000144b15b3: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b kkkkkkkkkkkkkkkk
[12802.557247][T191552] Object 00000000a806800d: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b kkkkkkkkkkkkkkkk
[12802.557351][T191552] Object 000000005edb4355: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b kkkkkkkkkkkkkkkk
[12802.557484][T191552] Object 0000000049aaca1e: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b kkkkkkkkkkkkkkkk
[12802.557600][T191552] Object 000000000eb0b7f9: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b kkkkkkkkkkkkkkkk
[12802.557727][T191552] Object 000000008fdb29be: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b a5 kkkkkkkkkkkkkkk.
[12802.557831][T191552] Redzone 00000000c5a61231: bb bb bb bb bb bb bb bb ........
[12802.557947][T191552] Padding 000000003163b13a: 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a ZZZZZZZZZZZZZZZZ
[12802.558076][T191552] Padding 0000000092412b1a: 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a ZZZZZZZZZZZZZZZZ
[12802.558187][T191552] Padding 00000000319fa8cb: 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a ZZZZZZZZZZZZZZZZ
[12802.558314][T191552] Padding 00000000963c7ce8: 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a ZZZZZZZZZZZZZZZZ
[12802.558432][T191552] CPU: 71 PID: 191552 Comm: trinity-main Tainted: G B O 5.9.0-rc4-next-20200908+ #1
[12802.558551][T191552] Call Trace:
[12802.558590][T191552] [c000201cb3427620] [c000000000701758] dump_stack+0xec/0x144 (unreliable)
[12802.558691][T191552] [c000201cb3427660] [c0000000003cb53c] print_trailer+0x278/0x2a0
[12802.558794][T191552] [c000201cb34276f0] [c0000000003c0d14] check_bytes_and_report+0x184/0x1b0
[12802.558900][T191552] [c000201cb34277a0] [c0000000003c1000] check_object+0x2c0/0x330
[12802.558990][T191552] [c000201cb3427800] [c0000000003c11ec] alloc_debug_processing+0x17c/0x1e0
[12802.559096][T191552] [c000201cb3427880] [c0000000003c5468] ___slab_alloc+0xb78/0xc60
[12802.559190][T191552] [c000201cb3427980] [c0000000003c55f4] __slab_alloc+0xa4/0xf0
[12802.559284][T191552] [c000201cb34279d0] [c0000000003c5954] kmem_cache_alloc+0x314/0x4a0
[12802.559362][T191552] [c000201cb3427a50] [c0000000000c4818] dup_mm+0x48/0x6d0
[12802.559445][T191552] [c000201cb3427b00] [c0000000000c665c] copy_process+0x11bc/0x19a0
[12802.559528][T191552] [c000201cb3427c20] [c0000000000c7210] kernel_clone+0x120/0xb80
[12802.559630][T191552] [c000201cb3427d00] [c0000000000c7cf8] __do_sys_clone+0x88/0xd0
[12802.559714][T191552] [c000201cb3427dc0] [c00000000002c748] system_call_exception+0xf8/0x1d0
[12802.559810][T191552] [c000201cb3427e20] [c00000000000d0a8] system_call_common+0xe8/0x218
[12802.559906][T191552] FIX mm_struct: Restoring 0x000000000e2a54ec-0x000000000e2a54ec=0x6b
[12802.559906][T191552]
[12802.560030][T191552] FIX mm_struct: Marking all objects used
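
For readers decoding the report above: the object dump has 96 rows of 16 bytes, the object starts at offset 96384, and the bad byte is at offset 96464, so the overwrite is 80 bytes into the freed object, where 0x6b became 0x6a. The sketch below reproduces that pattern in user space. It assumes a 32-bit little-endian counter at that offset and a single stray decrement after the free; the report itself only proves that one byte changed, so this is one consistent explanation, not a diagnosis. The helper names are made up for this note; the poison values 0x6b and 0xa5 are the ones visible in the dump.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define POISON_FREE 0x6b	/* fill byte of a freed object, as in the dump */
#define POISON_END  0xa5	/* last byte of a poisoned object, as in the dump */

/* Fill an object the way the dump shows a freed object being poisoned. */
static void poison_object(uint8_t *obj, size_t size)
{
	assert(size > 0);
	memset(obj, POISON_FREE, size - 1);
	obj[size - 1] = POISON_END;
}

/* Decrement a 32-bit little-endian counter stored at obj[offset]. */
static void stray_decrement_le32(uint8_t *obj, size_t offset)
{
	uint32_t v = (uint32_t)obj[offset] |
		     (uint32_t)obj[offset + 1] << 8 |
		     (uint32_t)obj[offset + 2] << 16 |
		     (uint32_t)obj[offset + 3] << 24;

	v--;
	obj[offset]     = (uint8_t)(v & 0xff);
	obj[offset + 1] = (uint8_t)((v >> 8) & 0xff);
	obj[offset + 2] = (uint8_t)((v >> 16) & 0xff);
	obj[offset + 3] = (uint8_t)((v >> 24) & 0xff);
}

/* Return the offset of the first byte that is not the expected poison, or -1. */
static long first_poison_mismatch(const uint8_t *obj, size_t size)
{
	for (size_t i = 0; i < size; i++) {
		uint8_t expected = (i == size - 1) ? POISON_END : POISON_FREE;

		if (obj[i] != expected)
			return (long)i;
	}
	return -1;
}

/* Poison a 1536-byte object, decrement at offset 80, and report the damage. */
static long demo_mismatch(uint8_t *bad_byte)
{
	uint8_t obj[96 * 16];
	long off;

	poison_object(obj, sizeof(obj));
	stray_decrement_le32(obj, 80);
	off = first_poison_mismatch(obj, sizeof(obj));
	if (off >= 0)
		*bad_byte = obj[off];
	return off;
}
```

Under those assumptions the first mismatch is at offset 80 with value 0x6a, matching the "First byte 0x6a instead of 0x6b" line; a decrement of a poisoned counter is exactly what a reference drop on a freed object would look like.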
On Tue, Sep 08, 2020 at 12:50:44PM -0400, Qian Cai wrote:
> > No, you're talking nonsense. We must not free @mm when
> > 'current->active_mm == mm', never.
>
> Yes, you are right. It still trigger this below on powerpc with today's
> linux-next by fuzzing for a while (saw a few times on recent linux-next before
> as well but so far mostly reproducible on powerpc here). Any idea?
If you can reliably reproduce this, the next thing is to trace mm_count
and figure out where it goes sideways. I suppose we're looking for an
'extra' decrement.
Mark tried this for a while but gave up because he couldn't reliably
reproduce.
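
As a rough illustration of what "looking for an extra decrement" means once a trace of reference changes is available, the sketch below replays a list of grab/drop events and reports the first drop that arrives when the count is already zero. Everything here is hypothetical: `struct ref_event`, `first_extra_drop()`, the site labels, and the sample trace are invented for this note and do not come from the report; in a real kernel the events would have to be collected by instrumenting mmgrab() and mmdrop(), for example with trace_printk().

```c
#include <assert.h>
#include <stddef.h>

/* One recorded change to a reference count; labels are hypothetical. */
struct ref_event {
	const char *site;	/* where the change happened */
	int delta;		/* +1 for a grab, -1 for a drop */
};

/*
 * Replay the events starting from 'initial' references and return the index
 * of the first drop issued while the count is already zero, or -1 if the
 * trace is balanced.
 */
static long first_extra_drop(const struct ref_event *ev, size_t n, int initial)
{
	int count = initial;

	assert(initial >= 0);
	for (size_t i = 0; i < n; i++) {
		if (ev[i].delta < 0 && count == 0)
			return (long)i;	/* this drop has no matching grab */
		count += ev[i].delta;
	}
	return -1;
}

/* Made-up trace: two grabs, three drops; the last drop is the extra one. */
static long demo_extra_drop(void)
{
	static const struct ref_event trace[] = {
		{ "site A: grab", +1 },
		{ "site B: grab", +1 },
		{ "site B: drop", -1 },
		{ "site A: drop", -1 },
		{ "site C: drop", -1 },
	};

	return first_extra_drop(trace, sizeof(trace) / sizeof(trace[0]), 0);
}
```

With the invented trace above, the replay stops at index 4, the drop from "site C" that has no matching grab. The hard part in practice is the one mentioned in the thread: getting a reliable reproducer so that such a trace can be captured at all.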