Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1753275AbaKCT2Z (ORCPT ); Mon, 3 Nov 2014 14:28:25 -0500 Received: from forward3h.mail.yandex.net ([84.201.187.148]:36193 "EHLO forward3h.mail.yandex.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1751908AbaKCT2X (ORCPT ); Mon, 3 Nov 2014 14:28:23 -0500 From: Kirill Tkhai To: Fengguang Wu , Kirill Tkhai Cc: Ingo Molnar , LKP , "linux-kernel@vger.kernel.org" In-Reply-To: <20141103145332.GA5157@wfg-t540p.sh.intel.com> References: <20141103145332.GA5157@wfg-t540p.sh.intel.com> Subject: Re: [sched] [ INFO: suspicious RCU usage. ] MIME-Version: 1.0 Message-Id: <2127541415042898@web25h.yandex.ru> X-Mailer: Yamail [ http://yandex.ru ] 5.0 Date: Mon, 03 Nov 2014 22:28:18 +0300 Content-Transfer-Encoding: 8bit Content-Type: text/plain; charset=koi8-r Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Hi, the fix is already in PeterZ queue, it will go to mainline soon. 03.11.2014, 17:54, "Fengguang Wu" : > Greetings, > > 0day kernel testing robot got the below dmesg and the first bad commit is > > git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git master > commit eeb61e53ea19be0c4015b00b2e8b3b2185436f2b > Author: ????Kirill Tkhai > AuthorDate: Mon Oct 27 14:18:25 2014 +0400 > Commit: ????Ingo Molnar > CommitDate: Tue Oct 28 10:45:59 2014 +0100 > > ????sched: Fix race between task_group and sched_task_group > > ????The race may happen when somebody is changing task_group of a forking task. > ????Child's cgroup is the same as parent's after dup_task_struct() (there just > ????memory copying). Also, cfs_rq and rt_rq are the same as parent's. > > ????But if parent changes its task_group before it's called cgroup_post_fork(), > ????we do not reflect this situation on child. Child's cfs_rq and rt_rq remain > ????the same, while child's task_group changes in cgroup_post_fork(). > > ????To fix this we introduce fork() method, which calls sched_move_task() directly. > ????This function changes sched_task_group on appropriate (also its logic has > ????no problem with freshly created tasks, so we shouldn't introduce something > ????special; we are able just to use it). > > ????Possibly, this decides the Burke Libbey's problem: https://lkml.org/lkml/2014/10/24/456 > > ????Signed-off-by: Kirill Tkhai > ????Signed-off-by: Peter Zijlstra (Intel) > ????Cc: Linus Torvalds > ????Link: http://lkml.kernel.org/r/1414405105.19914.169.camel@tkhai > ????Signed-off-by: Ingo Molnar > > +------------------------------------------------+-----------+------------+---------------+ > | ???????????????????????????????????????????????| v3.18-rc2 | eeb61e53ea | next-20141031 | > +------------------------------------------------+-----------+------------+---------------+ > | boot_successes ????????????????????????????????| 60 ???????| 0 ?????????| 0 ????????????| > | boot_failures ?????????????????????????????????| 0 ????????| 20 ????????| 11 ???????????| > | INFO:suspicious_RCU_usage ?????????????????????| 0 ????????| 20 ????????| 11 ???????????| > | backtrace:do_fork ?????????????????????????????| 0 ????????| 20 ????????| 11 ???????????| > | WARNING:at_kernel/sched/core.c:__might_sleep() | 0 ????????| 0 ?????????| 11 ???????????| > | backtrace:rfcomm_run ??????????????????????????| 0 ????????| 0 ?????????| 11 ???????????| > | WARNING:CPU:#PID:#at/kbuild/src/smo ???????????| 0 ????????| 0 ?????????| 1 ????????????| > | BUG:kernel_boot_hang ??????????????????????????| 0 ????????| 0 ?????????| 1 ????????????| > +------------------------------------------------+-----------+------------+---------------+ > > [ ???0.086529] debug: unmapping init [mem 0xffffffff82607000-0xffffffff82609fff] > [ ???0.090115] > [ ???0.090782] =============================== > [ ???0.091842] [ INFO: suspicious RCU usage. ] > [ ???0.092000] 3.18.0-rc2-00001-geeb61e5 #7 Not tainted > [ ???0.092000] ------------------------------- > [ ???0.092000] kernel/sched/core.c:7408 suspicious rcu_dereference_check() usage! > [ ???0.092000] > [ ???0.092000] other info that might help us debug this: > [ ???0.092000] > [ ???0.092000] > [ ???0.092000] rcu_scheduler_active = 1, debug_locks = 0 > [ ???0.092000] 2 locks held by swapper/0/0: > [ ???0.092000] ?#0: ?(&p->pi_lock){......}, at: [] task_rq_lock+0x30/0xaa > [ ???0.092000] ?#1: ?(&rq->lock){-.....}, at: [] task_rq_lock+0x52/0xaa > [ ???0.092000] > [ ???0.092000] stack backtrace: > [ ???0.092000] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 3.18.0-rc2-00001-geeb61e5 #7 > [ ???0.092000] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011 > [ ???0.092000] ?0000000000000001 ffffffff819dbcb8 ffffffff815e2acd 000000000000182c > [ ???0.092000] ?ffffffff819f85c0 ffffffff819dbce8 ffffffff810d1774 ffff880013510000 > [ ???0.092000] ?ffff8800139de180 ffff880013528000 0000000000800300 ffffffff819dbd48 > [ ???0.092000] Call Trace: > [ ???0.092000] ?[] dump_stack+0x4e/0x68 > [ ???0.092000] ?[] lockdep_rcu_suspicious+0x107/0x110 > [ ???0.092000] ?[] sched_move_task+0xda/0x17b > [ ???0.092000] ?[] cpu_cgroup_fork+0x9/0xb > [ ???0.092000] ?[] cgroup_post_fork+0x9b/0xaa > [ ???0.092000] ?[] copy_process+0x17e5/0x1832 > [ ???0.092000] ?[] ? rest_init+0x136/0x136 > [ ???0.092000] ?[] do_fork+0x8a/0x243 > [ ???0.092000] ?[] kernel_thread+0x21/0x23 > [ ???0.092000] ?[] rest_init+0x1e/0x136 > [ ???0.092000] ?[] start_kernel+0x4ba/0x4c7 > [ ???0.092000] ?[] ? early_idt_handlers+0x120/0x120 > [ ???0.092000] ?[] x86_64_start_reservations+0x2a/0x2c > [ ???0.092000] ?[] x86_64_start_kernel+0x106/0x115 > [ ???0.096042] Getting VERSION: 1050014 > [ ???0.097061] Getting VERSION: 1050014 > [ ???0.098000] Getting ID: 0 > > git bisect start 4fbe40970dc154aaeeda0584aab8913fc073127b cac7f2429872d3733dc3f9915857b1691da2eb2f -- > git bisect good fd2e6bbbe5b2e8f6d8ea6a28b614ec5c1c8b5a62 ?# 23:14 ????20+ ?????0 ?Merge remote-tracking branch 'hwmon-staging/hwmon-next' > git bisect good ab6af24617af5f94df5185611f1eb39432d49292 ?# 01:33 ????20+ ?????0 ?Merge remote-tracking branch 'spi/for-next' > git bisect ?bad f5ef810eb87c283118797ea33778c5b3c937ad29 ?# 02:08 ?????0- ?????3 ?Merge remote-tracking branch 'staging/staging-next' > git bisect ?bad 50247251e8b6078b491b5fb54dc31c24867f5930 ?# 02:47 ?????0- ?????9 ?Merge remote-tracking branch 'rcu/rcu/next' > git bisect ?bad 6a91cd724c20352f8cdc2a73c0649bae71897ed8 ?# 03:27 ?????0- ?????7 ?Merge remote-tracking branch 'edac-amd/for-next' > git bisect ?bad d7a8d245f2ffb9adde1a7121e5ab86eed8d03584 ?# 03:39 ?????0- ?????4 ?Merge remote-tracking branch 'tip/auto-latest' > git bisect good 26c2d2b39128adba276d140eefa2745591b88536 ?# 09:46 ????20+ ?????0 ?i386/audit: stop scribbling on the stack frame > git bisect ?bad 3427445afd26bd2395f29241319283a93f362cd0 ?# 11:03 ?????0- ?????1 ?sched: Exclude cond_resched() from nested sleep test > git bisect ?bad 1a43a14a5bd9c32dbd7af35e35a5afa703944bcb ?# 11:39 ?????0- ?????2 ?sched: Fix schedule_tail() to disable preemption > git bisect ?bad f3a7e1a9c464a32ee186ab91388313c82e7ce018 ?# 12:38 ?????0- ?????1 ?sched/dl: Fix preemption checks > git bisect ?bad 1effd9f19324efb05fccc7421530e11a52db0278 ?# 12:55 ?????0- ?????1 ?sched/numa: Fix unsafe get_task_struct() in task_numa_assign() > git bisect ?bad 64be6f1f5f710f5995d41caf8a1767fe6d2b5a87 ?# 15:30 ?????0- ?????3 ?sched/deadline: Don't replenish from a !SCHED_DEADLINE entity > git bisect ?bad eeb61e53ea19be0c4015b00b2e8b3b2185436f2b ?# 17:18 ?????0- ?????1 ?sched: Fix race between task_group and sched_task_group > # first bad commit: [eeb61e53ea19be0c4015b00b2e8b3b2185436f2b] sched: Fix race between task_group and sched_task_group > git bisect good cac7f2429872d3733dc3f9915857b1691da2eb2f ?# 17:31 ????60+ ?????0 ?Linux 3.18-rc2 > git bisect ?bad 4fbe40970dc154aaeeda0584aab8913fc073127b ?# 17:31 ?????0- ????11 ?Add linux-next specific files for 20141031 > git bisect ?bad 0df1f2487d2f0d04703f142813d53615d62a1da4 ?# 20:17 ?????0- ?????3 ?Linux 3.18-rc3 > git bisect ?bad a641d0e16b552f986e8c4a8735e66a1443a769cf ?# 21:44 ?????0- ?????3 ?Add linux-next specific files for 20141103 > > This script may reproduce the error. > > ---------------------------------------------------------------------------- > #!/bin/bash > > kernel=$1 > initrd=yocto-minimal-x86_64.cgz > > wget --no-clobber https://github.com/fengguang/reproduce-kernel-bug/raw/master/initrd/$initrd > > kvm=( > ????????qemu-system-x86_64 > ????????-enable-kvm > ????????-cpu Haswell,+smep,+smap > ????????-kernel $kernel > ????????-initrd $initrd > ????????-m 320 > ????????-smp 1 > ????????-net nic,vlan=1,model=e1000 > ????????-net user,vlan=1 > ????????-boot order=nc > ????????-no-reboot > ????????-watchdog i6300esb > ????????-rtc base=localtime > ????????-serial stdio > ????????-display none > ????????-monitor null > ) > > append=( > ????????hung_task_panic=1 > ????????earlyprintk=ttyS0,115200 > ????????debug > ????????apic=debug > ????????sysrq_always_enabled > ????????rcupdate.rcu_cpu_stall_timeout=100 > ????????panic=-1 > ????????softlockup_panic=1 > ????????nmi_watchdog=panic > ????????oops=panic > ????????load_ramdisk=2 > ????????prompt_ramdisk=0 > ????????console=ttyS0,115200 > ????????console=tty0 > ????????vga=normal > ????????root=/dev/ram0 > ????????rw > ????????drbd.minor_count=8 > ) > > "${kvm[@]}" --append "${append[*]}" > ---------------------------------------------------------------------------- > > Thanks, > Fengguang > , > > _______________________________________________ > LKP mailing list > LKP@linux.intel.com -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/