2017-03-02 19:46:06

by Dmitry Vyukov

Subject: cgroup: WARNING in cgroup_kill_sb

Hello,

The following program triggers WARNING in cgroup_kill_sb:
https://gist.githubusercontent.com/dvyukov/47a37d3b899ece1f57e512dc6c90bca6/raw/250894f3d6e2954eed01bac39e4c3b7ec59a9c31/gistfile1.txt


WARNING: CPU: 2 PID: 3092 at lib/percpu-refcount.c:317 percpu_ref_kill_and_confirm+0x3ff/0x4f0 lib/percpu-refcount.c:316
percpu_ref_kill_and_confirm called more than once on css_release!
Kernel panic - not syncing: panic_on_warn set ...

CPU: 2 PID: 3092 Comm: a.out Not tainted 4.10.0+ #260
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
Call Trace:
__dump_stack lib/dump_stack.c:15 [inline]
dump_stack+0x2ee/0x3ef lib/dump_stack.c:51
panic+0x1fb/0x412 kernel/panic.c:179
__warn+0x1c4/0x1e0 kernel/panic.c:540
warn_slowpath_fmt+0xc5/0x100 kernel/panic.c:563
percpu_ref_kill_and_confirm+0x3ff/0x4f0 lib/percpu-refcount.c:316
percpu_ref_kill include/linux/percpu-refcount.h:119 [inline]
cgroup_kill_sb+0x188/0x530 kernel/cgroup/cgroup.c:1833
deactivate_locked_super+0x88/0xd0 fs/super.c:309
deactivate_super+0x155/0x1b0 fs/super.c:340
cleanup_mnt+0xb2/0x160 fs/namespace.c:1112
__cleanup_mnt+0x16/0x20 fs/namespace.c:1119
task_work_run+0x18a/0x260 kernel/task_work.c:116
tracehook_notify_resume include/linux/tracehook.h:191 [inline]
exit_to_usermode_loop+0x23b/0x2a0 arch/x86/entry/common.c:160
prepare_exit_to_usermode arch/x86/entry/common.c:190 [inline]
syscall_return_slowpath+0x4d3/0x570 arch/x86/entry/common.c:259
entry_SYSCALL_64_fastpath+0xc0/0xc2
RIP: 0033:0x440b39
RSP: 002b:00007f3e8bd0cdb8 EFLAGS: 00000202 ORIG_RAX: 00000000000000a5
RAX: ffffffffffffffec RBX: 0000000000000000 RCX: 0000000000440b39
RDX: 00000000004a0f3b RSI: 00000000004a0f34 RDI: 00000000004a0f34
RBP: 00007f3e8bd0cdd0 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000005 R11: 0000000000000202 R12: 0000000000000000
R13: 0000000000000000 R14: 00007f3e8bd0d9c0 R15: 00007f3e8bd0d700

On commit 4977ab6e92e267afe9d8f78438c3db330ca8434c
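
For readers decoding the register dump above, here is a minimal sketch. It assumes the x86_64 syscall table, where number 165 (0xa5, the ORIG_RAX value) is mount(2), and the Linux convention that a failed syscall returns -errno in RAX. The helper names are hypothetical and are not part of the reproducer.

```c
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/*
 * Reinterpret a raw 64-bit register value as a signed syscall return.
 * The unsigned-to-signed conversion is implementation-defined in C but
 * yields the two's-complement value on common compilers.
 */
long long reg_to_signed(uint64_t reg)
{
	return (long long)(int64_t)reg;
}

/* Print a one-line summary of a syscall number and its raw return value. */
void describe_syscall(uint64_t orig_rax, uint64_t rax)
{
	long long ret = reg_to_signed(rax);

	printf("syscall nr %llu returned %lld",
	       (unsigned long long)orig_rax, ret);
	/* Linux reports errors as small negative values: -errno. */
	if (ret < 0 && ret >= -4095)
		printf(" (%s)", strerror((int)-ret));
	printf("\n");
}
```

Calling describe_syscall(0xa5, 0xffffffffffffffecULL) with the values from the dump reports syscall 165 returning -20, which is -ENOTDIR on Linux. So the task was on its way back from a failed mount when the deferred cleanup_mnt work ran.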


2017-03-06 21:55:37

by Tejun Heo

Subject: Re: cgroup: WARNING in cgroup_kill_sb

Hello, Dmitry.

Can you please see whether the following patch resolves the issue?
I'm a bit nervous about it ending up in circular dependency, but I
*think* it should be okay.

Thanks.

diff --git a/kernel/cgroup/cgroup.c b/kernel/cgroup/cgroup.c
index 0125589..9c40421 100644
--- a/kernel/cgroup/cgroup.c
+++ b/kernel/cgroup/cgroup.c
@@ -1820,6 +1820,8 @@ static void cgroup_kill_sb(struct super_block *sb)
 	struct kernfs_root *kf_root = kernfs_root_from_sb(sb);
 	struct cgroup_root *root = cgroup_root_from_kf(kf_root);

+	mutex_lock(&cgroup_mutex);
+
 	/*
 	 * If @root doesn't have any mounts or children, start killing it.
 	 * This prevents new mounts by disabling percpu_ref_tryget_live().
@@ -1834,6 +1836,8 @@ static void cgroup_kill_sb(struct super_block *sb)
 		percpu_ref_kill(&root->cgrp.self.refcnt);

 	kernfs_kill_sb(sb);
+
+	mutex_unlock(&cgroup_mutex);
 }

 struct file_system_type cgroup_fs_type = {

2017-03-07 09:12:42

by Dmitry Vyukov

Subject: Re: cgroup: WARNING in cgroup_kill_sb

On Mon, Mar 6, 2017 at 10:55 PM, Tejun Heo wrote:
> Hello, Dmitry.
>
> Can you please see whether the following patch resolves the issue?
> I'm a bit nervous about it ending up in circular dependency, but I
> *think* it should be okay.
>
> Thanks.
>
> diff --git a/kernel/cgroup/cgroup.c b/kernel/cgroup/cgroup.c
> index 0125589..9c40421 100644
> --- a/kernel/cgroup/cgroup.c
> +++ b/kernel/cgroup/cgroup.c
> @@ -1820,6 +1820,8 @@ static void cgroup_kill_sb(struct super_block *sb)
> struct kernfs_root *kf_root = kernfs_root_from_sb(sb);
> struct cgroup_root *root = cgroup_root_from_kf(kf_root);
>
> + mutex_lock(&cgroup_mutex);
> +
> /*
> * If @root doesn't have any mounts or children, start killing it.
> * This prevents new mounts by disabling percpu_ref_tryget_live().
> @@ -1834,6 +1836,8 @@ static void cgroup_kill_sb(struct super_block *sb)
> percpu_ref_kill(&root->cgrp.self.refcnt);
>
> kernfs_kill_sb(sb);
> +
> + mutex_unlock(&cgroup_mutex);
> }
>
> struct file_system_type cgroup_fs_type = {



No, still happens. Please run the repro.


[ 367.607496] ------------[ cut here ]------------
[ 367.608012] WARNING: CPU: 1 PID: 16161 at lib/percpu-refcount.c:317 percpu_ref_kill_and_confirm+0x3ff/0x500
[ 367.608019] percpu_ref_kill_and_confirm called more than once on css_release!
[ 367.608019] Kernel panic - not syncing: panic_on_warn set ...
[ 367.608019]
[ 367.608019] CPU: 1 PID: 16161 Comm: a.out Not tainted 4.11.0-rc1+ #311
[ 367.608019] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
[ 367.608019] Call Trace:
[ 367.608019] dump_stack+0x2fb/0x3fd
[ 367.608019] ? arch_local_irq_restore+0x53/0x53
[ 367.608019] ? vprintk_emit+0x566/0x770
[ 367.608019] ? console_unlock+0xf50/0xf50
[ 367.608019] ? kasan_check_write+0x14/0x20
[ 367.608019] ? sched_clock_local+0xe2/0x150
[ 367.608019] ? do_raw_spin_trylock+0x1a0/0x1a0
[ 367.608019] ? sched_clock_cpu+0x12e/0x170
[ 367.608019] ? memcpy+0x45/0x50
[ 367.608019] ? vprintk_emit+0x566/0x770
[ 367.608019] ? console_unlock+0xf50/0xf50
[ 367.608019] ? percpu_ref_kill_and_confirm+0xeb/0x500
[ 367.608019] ? check_noncircular+0x20/0x20
[ 367.608019] ? vprintk_default+0x28/0x30
[ 367.608019] ? vprintk_func+0x47/0x90
[ 367.608019] ? printk+0xc8/0xf9
[ 367.608019] ? load_image_and_restore+0x134/0x134
[ 367.608019] ? pointer+0xac0/0xac0
[ 367.608019] panic+0x20f/0x426
[ 367.608019] ? copy_mm+0x1219/0x1219
[ 367.608019] ? percpu_ref_kill_and_confirm+0x3ff/0x500
[ 367.608019] ? vprintk_default+0x28/0x30
[ 367.608019] ? percpu_ref_kill_and_confirm+0x3ff/0x500
[ 367.608019] __warn+0x1c4/0x1e0
[ 367.608019] warn_slowpath_fmt+0xc5/0x100
[ 367.608019] ? __warn+0x1e0/0x1e0
[ 367.608019] ? depot_save_stack+0x12c/0x480
[ 367.608019] ? css_free_rcu_fn+0x1d0/0x1d0
[ 367.608019] percpu_ref_kill_and_confirm+0x3ff/0x500
[ 367.608019] ? __percpu_ref_switch_mode+0x850/0x850
[ 367.608019] ? deactivate_super+0x173/0x1b0
[ 367.608019] ? cleanup_mnt+0xb2/0x160
[ 367.608019] ? __cleanup_mnt+0x16/0x20
[ 367.608019] ? task_work_run+0x197/0x260
[ 367.608019] ? exit_to_usermode_loop+0x23b/0x2a0
[ 367.608019] ? mark_held_locks+0xaf/0x100
[ 367.608019] ? kfree+0xed/0x250
[ 367.608019] ? trace_hardirqs_on_caller+0x545/0x6f0
[ 367.608019] ? mark_held_locks+0x100/0x100
[ 367.608019] ? lock_set_class+0xc00/0xc00
[ 367.608019] ? check_same_owner+0x320/0x320
[ 367.608019] cgroup_kill_sb+0x196/0x550
[ 367.608019] ? cgroup_migrate_add_task+0xc60/0xc60
[ 367.608019] ? unregister_shrinker+0x1c1/0x2c0
[ 367.608019] ? perf_trace_mm_vmscan_writepage+0x7a0/0x7a0
[ 367.608019] ? down_write+0x8c/0x120
[ 367.608019] ? down_read+0x150/0x150
[ 367.608019] deactivate_locked_super+0x99/0xe0
[ 367.608019] deactivate_super+0x173/0x1b0
[ 367.608019] ? mount_ns+0x190/0x190
[ 367.608019] ? dput.part.25+0x2a/0x7c0
[ 367.608019] ? dput.part.25+0x176/0x7c0
[ 367.608019] ? dput.part.25+0x2a/0x7c0
[ 367.608019] cleanup_mnt+0xb2/0x160
[ 367.608019] __cleanup_mnt+0x16/0x20
[ 367.608019] task_work_run+0x197/0x260
[ 367.608019] ? task_work_cancel+0x2f0/0x2f0
[ 367.608019] ? __unwind_start+0x380/0x380
[ 367.608019] ? entry_SYSCALL_64_fastpath+0x1f/0xc2
[ 367.608019] exit_to_usermode_loop+0x23b/0x2a0
[ 367.608019] ? trace_event_raw_event_sys_exit+0x270/0x270
[ 367.608019] ? __save_stack_trace+0x7e/0xd0
[ 367.608019] syscall_return_slowpath+0x4d3/0x570
[ 367.608019] ? prepare_exit_to_usermode+0x2e0/0x2e0
[ 367.608019] ? save_stack_trace+0x16/0x20
[ 367.608019] ? save_stack+0x43/0xd0
[ 367.608019] ? kasan_slab_free+0x6f/0xb0
[ 367.608019] ? kfree+0xd3/0x250
[ 367.608019] ? SyS_mount+0xcf/0x120
[ 367.608019] ? entry_SYSCALL_64_fastpath+0x1f/0xc2
[ 367.608019] ? mntput+0x66/0x90
[ 367.608019] ? check_noncircular+0x20/0x20
[ 367.608019] ? kfree+0xed/0x250
[ 367.608019] ? entry_SYSCALL_64_fastpath+0x93/0xc2
[ 367.608019] ? trace_hardirqs_on_caller+0x545/0x6f0
[ 367.608019] ? mark_held_locks+0x100/0x100
[ 367.608019] ? check_stack_object+0x140/0x140
[ 367.608019] ? check_stack_object+0x140/0x140
[ 367.608019] ? rcu_read_lock_sched_held+0x108/0x120
[ 367.608019] ? __kmalloc_track_caller+0x40a/0x6f0
[ 367.608019] ? SyS_mount+0xcf/0x120
[ 367.608019] ? trace_hardirqs_off+0xd/0x10
[ 367.608019] ? quarantine_put+0xea/0x190
[ 367.608019] ? SyS_mount+0xcf/0x120
[ 367.608019] ? trace_hardirqs_on_thunk+0x1a/0x1c
[ 367.608019] entry_SYSCALL_64_fastpath+0xc0/0xc2
[ 367.608019] RIP: 0033:0x440b39
[ 367.608019] RSP: 002b:00007f86c630adb8 EFLAGS: 00000202 ORIG_RAX: 00000000000000a5
[ 367.608019] RAX: ffffffffffffffec RBX: 0000000000000000 RCX: 0000000000440b39
[ 367.608019] RDX: 00000000004a0f3b RSI: 00000000004a0f34 RDI: 00000000004a0f34
[ 367.608019] RBP: 00007f86c630add0 R08: 0000000000000000 R09: 0000000000000000
[ 367.608019] R10: 0000000000000005 R11: 0000000000000202 R12: 0000000000000000
[ 367.608019] R13: 0000000000000000 R14: 00007f86c630b9c0 R15: 00007f86c630b700
[ 367.608019] Kernel Offset: disabled
[ 367.608019] Rebooting in 86400 seconds..
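
As a reading aid for the warning text, here is a small userspace model of a "kill once" reference. It is a sketch under stated assumptions, not the kernel's percpu_ref code: struct model_ref, MODEL_REF_DEAD and model_ref_kill() are hypothetical names. The model has no lock, because the check depends only on the ref's state. If two kill calls are made for the same ref, the second one warns no matter how the callers are ordered. The model does not identify which two paths kill the ref in the reported race.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdio.h>

#define MODEL_REF_DEAD 1u	/* hypothetical flag bit for this model */

struct model_ref {
	atomic_uint flags;
};

/*
 * Mark the ref dead. Returns true for the first kill and false, with a
 * warning on stderr, if the ref had already been killed.
 */
bool model_ref_kill(struct model_ref *ref, const char *name)
{
	unsigned int old = atomic_fetch_or(&ref->flags, MODEL_REF_DEAD);

	if (old & MODEL_REF_DEAD) {
		fprintf(stderr, "kill called more than once on %s!\n", name);
		return false;
	}
	return true;
}
```

In this model, a lock around the call site would decide which caller goes first but would not stop the second caller from reaching the kill.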

2017-03-07 19:47:48

by Tejun Heo

Subject: Re: cgroup: WARNING in cgroup_kill_sb

On Tue, Mar 07, 2017 at 10:11:59AM +0100, Dmitry Vyukov wrote:
> No, still happens. Please run the repro.

Oh, I did run it but it didn't trigger the failure here. Can you
please let me know the kernel version, config, system setup and usual
trigger duration?

Thanks.

--
tejun

2017-03-07 19:59:35

by Dmitry Vyukov

Subject: Re: cgroup: WARNING in cgroup_kill_sb

On Tue, Mar 7, 2017 at 8:45 PM, Tejun Heo wrote:
> On Tue, Mar 07, 2017 at 10:11:59AM +0100, Dmitry Vyukov wrote:
>> No, still happens. Please run the repro.
>
> Oh, I did run it but it didn't trigger the failure here. Can you
> please let me know the kernel version, config, system setup and usual
> trigger duration?

Kernel is c1ae3cfa0e89fa1a7ecc4c99031f5e9ae99d9201
.config is attached
I start qemu with:
qemu-system-x86_64 -hda wheezy.img -net
user,host=10.0.2.10,hostfwd=tcp::10022-:22 -net nic -nographic -kernel
arch/x86/boot/bzImage -append "kvm-intel.nested=1
kvm-intel.enable_unrestricted_guest=1 kvm-intel.ept=1
kvm-intel.flexpriority=1 kvm-intel.vpid=1
kvm-intel.emulate_invalid_guest_state=1 kvm-intel.eptad=1
kvm-intel.enable_shadow_vmcs=1 kvm-intel.pml=1
kvm-intel.enable_apicv=1 console=ttyS0 root=/dev/sda
earlyprintk=serial slub_debug=UZ vsyscall=native rodata=n oops=panic
panic_on_warn=1 panic=86400" -enable-kvm -pidfile vm_pid -m 2G -smp 4
-cpu host -usb -usbdevice mouse -usbdevice tablet -soundhw all

The crash was at [ 367.608012], so 5-10 minutes.
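
The arithmetic behind that estimate, as a minimal sketch: the bracketed prefix is seconds since boot, so 367.608012 s is roughly 6.1 minutes and is an upper bound on how long the repro itself ran. The helper name dmesg_seconds() is hypothetical.

```c
#include <stdio.h>

/*
 * Parse a dmesg-style "[ seconds.micros]" prefix and return the seconds
 * since boot, or a negative value if the prefix is malformed.
 */
double dmesg_seconds(const char *line)
{
	double secs;

	/* Whitespace in the format skips any padding inside the brackets. */
	if (sscanf(line, " [ %lf]", &secs) != 1)
		return -1.0;
	return secs;
}
```

Dividing the result by 60.0 gives the uptime in minutes at the moment the warning fired.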


Attachments:
.config (117.59 kB)

2017-03-09 09:25:54

by Zefan Li

Subject: Re: cgroup: WARNING in cgroup_kill_sb

On 2017/3/3 3:15, Dmitry Vyukov wrote:
> Hello,
>
> The following program triggers WARNING in cgroup_kill_sb:
> https://gist.githubusercontent.com/dvyukov/47a37d3b899ece1f57e512dc6c90bca6/raw/250894f3d6e2954eed01bac39e4c3b7ec59a9c31/gistfile1.txt
>
>
> WARNING: CPU: 2 PID: 3092 at lib/percpu-refcount.c:317
> percpu_ref_kill_and_confirm+0x3ff/0x4f0 lib/percpu-refcount.c:316
> percpu_ref_kill_and_confirm called more than once on css_release!
> Kernel panic - not syncing: panic_on_warn set ...
>
> CPU: 2 PID: 3092 Comm: a.out Not tainted 4.10.0+ #260
> Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
> Call Trace:
> __dump_stack lib/dump_stack.c:15 [inline]
> dump_stack+0x2ee/0x3ef lib/dump_stack.c:51
> panic+0x1fb/0x412 kernel/panic.c:179
> __warn+0x1c4/0x1e0 kernel/panic.c:540
> warn_slowpath_fmt+0xc5/0x100 kernel/panic.c:563
> percpu_ref_kill_and_confirm+0x3ff/0x4f0 lib/percpu-refcount.c:316
> percpu_ref_kill include/linux/percpu-refcount.h:119 [inline]
> cgroup_kill_sb+0x188/0x530 kernel/cgroup/cgroup.c:1833
> deactivate_locked_super+0x88/0xd0 fs/super.c:309
> deactivate_super+0x155/0x1b0 fs/super.c:340
> cleanup_mnt+0xb2/0x160 fs/namespace.c:1112
> __cleanup_mnt+0x16/0x20 fs/namespace.c:1119
> task_work_run+0x18a/0x260 kernel/task_work.c:116
> tracehook_notify_resume include/linux/tracehook.h:191 [inline]
> exit_to_usermode_loop+0x23b/0x2a0 arch/x86/entry/common.c:160
> prepare_exit_to_usermode arch/x86/entry/common.c:190 [inline]
> syscall_return_slowpath+0x4d3/0x570 arch/x86/entry/common.c:259
> entry_SYSCALL_64_fastpath+0xc0/0xc2
> RIP: 0033:0x440b39
> RSP: 002b:00007f3e8bd0cdb8 EFLAGS: 00000202 ORIG_RAX: 00000000000000a5
> RAX: ffffffffffffffec RBX: 0000000000000000 RCX: 0000000000440b39
> RDX: 00000000004a0f3b RSI: 00000000004a0f34 RDI: 00000000004a0f34
> RBP: 00007f3e8bd0cdd0 R08: 0000000000000000 R09: 0000000000000000
> R10: 0000000000000005 R11: 0000000000000202 R12: 0000000000000000
> R13: 0000000000000000 R14: 00007f3e8bd0d9c0 R15: 00007f3e8bd0d700
>
> On commit 4977ab6e92e267afe9d8f78438c3db330ca8434c
> .

Could you share your kernel config? I can't reproduce this bug.

2017-03-09 09:49:45

by Dmitry Vyukov

Subject: Re: cgroup: WARNING in cgroup_kill_sb

On Thu, Mar 9, 2017 at 10:24 AM, Zefan Li wrote:
> On 2017/3/3 3:15, Dmitry Vyukov wrote:
>> Hello,
>>
>> The following program triggers WARNING in cgroup_kill_sb:
>> https://gist.githubusercontent.com/dvyukov/47a37d3b899ece1f57e512dc6c90bca6/raw/250894f3d6e2954eed01bac39e4c3b7ec59a9c31/gistfile1.txt
>>
>>
>> WARNING: CPU: 2 PID: 3092 at lib/percpu-refcount.c:317
>> percpu_ref_kill_and_confirm+0x3ff/0x4f0 lib/percpu-refcount.c:316
>> percpu_ref_kill_and_confirm called more than once on css_release!
>> Kernel panic - not syncing: panic_on_warn set ...
>>
>> CPU: 2 PID: 3092 Comm: a.out Not tainted 4.10.0+ #260
>> Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
>> Call Trace:
>> __dump_stack lib/dump_stack.c:15 [inline]
>> dump_stack+0x2ee/0x3ef lib/dump_stack.c:51
>> panic+0x1fb/0x412 kernel/panic.c:179
>> __warn+0x1c4/0x1e0 kernel/panic.c:540
>> warn_slowpath_fmt+0xc5/0x100 kernel/panic.c:563
>> percpu_ref_kill_and_confirm+0x3ff/0x4f0 lib/percpu-refcount.c:316
>> percpu_ref_kill include/linux/percpu-refcount.h:119 [inline]
>> cgroup_kill_sb+0x188/0x530 kernel/cgroup/cgroup.c:1833
>> deactivate_locked_super+0x88/0xd0 fs/super.c:309
>> deactivate_super+0x155/0x1b0 fs/super.c:340
>> cleanup_mnt+0xb2/0x160 fs/namespace.c:1112
>> __cleanup_mnt+0x16/0x20 fs/namespace.c:1119
>> task_work_run+0x18a/0x260 kernel/task_work.c:116
>> tracehook_notify_resume include/linux/tracehook.h:191 [inline]
>> exit_to_usermode_loop+0x23b/0x2a0 arch/x86/entry/common.c:160
>> prepare_exit_to_usermode arch/x86/entry/common.c:190 [inline]
>> syscall_return_slowpath+0x4d3/0x570 arch/x86/entry/common.c:259
>> entry_SYSCALL_64_fastpath+0xc0/0xc2
>> RIP: 0033:0x440b39
>> RSP: 002b:00007f3e8bd0cdb8 EFLAGS: 00000202 ORIG_RAX: 00000000000000a5
>> RAX: ffffffffffffffec RBX: 0000000000000000 RCX: 0000000000440b39
>> RDX: 00000000004a0f3b RSI: 00000000004a0f34 RDI: 00000000004a0f34
>> RBP: 00007f3e8bd0cdd0 R08: 0000000000000000 R09: 0000000000000000
>> R10: 0000000000000005 R11: 0000000000000202 R12: 0000000000000000
>> R13: 0000000000000000 R14: 00007f3e8bd0d9c0 R15: 00007f3e8bd0d700
>>
>> On commit 4977ab6e92e267afe9d8f78438c3db330ca8434c
>> .
>
> could you share your kernel config? I can't reproduce this bug.

It's attached to the previous email.

Run the repro for longer; it's some kind of race.
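
One way to "run for longer" in a controlled fashion is to loop an attempt under a time budget. The sketch below is generic and makes assumptions: repro_fn, run_until_hit() and hit_on_nth_call() are hypothetical names, and the callback is a stand-in for whatever launches one iteration of the reproducer. With panic_on_warn=1, as in the setup described earlier in the thread, the VM would panic instead of the callback returning a hit.

```c
#include <time.h>

/* Hypothetical callback: returns nonzero once the failure is observed. */
typedef int (*repro_fn)(void *ctx);

/*
 * Call fn repeatedly until it reports a hit or budget_secs have elapsed.
 * Returns the number of attempts made; *hit is set to 1 on a hit, else 0.
 */
unsigned long run_until_hit(repro_fn fn, void *ctx, double budget_secs,
			    int *hit)
{
	time_t start = time(NULL);
	unsigned long attempts = 0;

	*hit = 0;
	while (difftime(time(NULL), start) < budget_secs) {
		attempts++;
		if (fn(ctx)) {
			*hit = 1;
			break;
		}
	}
	return attempts;
}

/* Stand-in for a real repro: reports a hit on the Nth call. */
int hit_on_nth_call(void *ctx)
{
	int *remaining = ctx;

	return --(*remaining) == 0;
}
```

For a race that takes minutes to trigger, a budget of 600 seconds or more would match the 5-10 minute figure reported earlier in the thread.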