Message-ID: <1422869606.19790.2.camel@tkhai>
Subject: Re: [sched] [ INFO: suspicious RCU usage. ]
From: Kirill Tkhai
To: Fengguang Wu
CC: Ingo Molnar, LKP
Date: Mon, 2 Feb 2015 12:33:26 +0300
In-Reply-To: <20150202042235.GB23438@wfg-t540p.sh.intel.com>
Organization: Parallels
X-Mailing-List: linux-kernel@vger.kernel.org

Hi, Fengguang,

We already fixed that in f7b8a47da17c "sched: Remove lockdep check in sched_move_task()".

On Sun, 2015-02-01 at 20:22 -0800, Fengguang Wu wrote:
> Hi Kirill,
>
> FYI the RCU warning shows up since this commit, however there are
> several other BUGs, so it's not necessarily a problem of this patch.
>
> commit eeb61e53ea19be0c4015b00b2e8b3b2185436f2b
> Author:     Kirill Tkhai
> AuthorDate: Mon Oct 27 14:18:25 2014 +0400
> Commit:     Ingo Molnar
> CommitDate: Tue Oct 28 10:45:59 2014 +0100
>
>     sched: Fix race between task_group and sched_task_group
>
>     The race may happen when somebody is changing task_group of a forking task.
>     Child's cgroup is the same as parent's after dup_task_struct() (there just
>     memory copying). Also, cfs_rq and rt_rq are the same as parent's.
>
>     But if parent changes its task_group before it's called cgroup_post_fork(),
>     we do not reflect this situation on child. Child's cfs_rq and rt_rq remain
>     the same, while child's task_group changes in cgroup_post_fork().
>
>     To fix this we introduce fork() method, which calls sched_move_task() directly.
>     This function changes sched_task_group on appropriate (also its logic has
>     no problem with freshly created tasks, so we shouldn't introduce something
>     special; we are able just to use it).
>
>     Possibly, this decides the Burke Libbey's problem: https://lkml.org/lkml/2014/10/24/456
>
>     Signed-off-by: Kirill Tkhai
>     Signed-off-by: Peter Zijlstra (Intel)
>     Cc: Linus Torvalds
>     Link: http://lkml.kernel.org/r/1414405105.19914.169.camel@tkhai
>     Signed-off-by: Ingo Molnar
>
> ===================================================
> PARENT COMMIT NOT CLEAN. LOOK OUT FOR WRONG BISECT!
> ===================================================
>
> Attached dmesg for the parent commit, too, to help confirm whether it is a noise error.
>
> +--------------------------------------------------------+------------+------------+------------+
> |                                                        | xfs-for-li | eeb61e53ea | 28ff344e1d |
> +--------------------------------------------------------+------------+------------+------------+
> | boot_successes                                         | 0          | 0          | 0          |
> | boot_failures                                          | 60         | 20         | 11         |
> | WARNING:at_include/linux/kref.h:#kobject_get()         | 60         | 20         | 11         |
> | WARNING:at_fs/kernfs/dir.c:#kernfs_remove_by_name_ns() | 1          | 1          | 1          |
> | BUG:unable_to_handle_kernel                            | 59         | 20         | 11         |
> | Oops                                                   | 59         | 20         | 11         |
> | RIP:string                                             | 1          | 1          | 1          |
> | Kernel_panic-not_syncing:Fatal_exception               | 59         | 20         | 11         |
> | backtrace:platform_driver_unregister                   | 60         | 20         | 11         |
> | backtrace:init_mac80211_hwsim                          | 60         | 20         | 11         |
> | backtrace:kernel_init_freeable                         | 60         | 20         | 11         |
> | backtrace:bus_register                                 | 1          | 1          | 1          |
> | backtrace:i2o_driver_init                              | 1          | 1          | 1          |
> | backtrace:i2o_iop_init                                 | 1          | 1          | 1          |
> | INFO:trying_to_register_non-static_key                 | 59         | 18         | 10         |
> | RIP:mutex_lock_nested                                  | 58         | 18         | 10         |
> | INFO:rcu_sched_detected_stalls_on_CPUs/tasks           | 1          |            |            |
> | INFO:task_blocked_for_more_than#seconds                | 1          |            |            |
> | INFO:lockdep_is_turned_off                             | 1          |            |            |
> | RIP:native_apic_mem_write                              | 1          |            |            |
> | RIP:rep_nop                                            | 1          |            |            |
> | Kernel_panic-not_syncing:hung_task:blocked_tasks       | 1          |            |            |
> | backtrace:ring_buffer_producer_thread                  | 1          |            |            |
> | backtrace:watchdog                                     | 1          |            |            |
> | INFO:suspicious_RCU_usage                              | 0          | 20         | 11         |
> | WARNING:at_kernel/locking/lockdep.c:#__bfs()           | 0          | 1          |            |
> | RIP:__bfs                                              | 0          | 1          |            |
> | backtrace:do_fork                                      | 0          | 20         | 11         |
> +--------------------------------------------------------+------------+------------+------------+
>
> [    0.014295] debug: unmapping init [mem 0xffffffff82820000-0xffffffff82823fff]
> [    0.017254]
> [    0.017433] ===============================
> [    0.017885] [ INFO: suspicious RCU usage. ]
> [    0.018000] 3.18.0-rc2-00001-geeb61e5 #6 Not tainted
> [    0.018000] -------------------------------
> [    0.018000] kernel/sched/core.c:7408 suspicious rcu_dereference_check() usage!
> [    0.018000]
> [    0.018000] other info that might help us debug this:
> [    0.018000]
> [    0.018000]
> [    0.018000] rcu_scheduler_active = 1, debug_locks = 1
> [    0.018000] 2 locks held by swapper/0/0:
> [    0.018000]  #0:  (&p->pi_lock){......}, at: [] task_rq_lock+0x38/0xe7
> [    0.018000]  #1:  (&rq->lock){-.....}, at: [] task_rq_lock+0x53/0xe7
> [    0.018000]
> [    0.018000] stack backtrace:
> [    0.018000] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 3.18.0-rc2-00001-geeb61e5 #6
> [    0.018000] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
> [    0.018000]  0000000000000001 ffffffff82403cc8 ffffffff81b9e83f ffffffff82425580
> [    0.018000]  0000000000000001 ffffffff82403cf8 ffffffff810f3d12 ffff880012068000
> [    0.018000]  ffff880012422040 0000000000000000 0000000000000000 ffffffff82403d58
> [    0.018000] Call Trace:
> [    0.018000]  [] dump_stack+0x51/0xaa
> [    0.018000]  [] lockdep_rcu_suspicious+0xfa/0x103
> [    0.018000]  [] sched_move_task+0xe5/0x153
> [    0.018000]  [] cpu_cgroup_fork+0x9/0xb
> [    0.018000]  [] cgroup_post_fork+0x9b/0xaa
> [    0.018000]  [] copy_process+0x178a/0x18bf
> [    0.018000]  [] ? rest_init+0x130/0x130
> [    0.018000]  [] ? vm_unmap_aliases+0x196/0x1a5
> [    0.018000]  [] do_fork+0xec/0x3e1
> [    0.018000]  [] ? rest_init+0x130/0x130
> [    0.018000]  [] kernel_thread+0x21/0x23
> [    0.018000]  [] rest_init+0x1e/0x130
> [    0.018000]  [] start_kernel+0x433/0x440
> [    0.018000]  [] ? early_idt_handlers+0x120/0x120
> [    0.018000]  [] x86_64_start_reservations+0x2a/0x2c
> [    0.018000]  [] x86_64_start_kernel+0x13b/0x14a
> [    0.018079] Getting VERSION: 1050014
> [    0.019012] Getting VERSION: 1050014
> [    0.019403] Getting ID: 0
>
> git bisect start 5e62863aea463c9bf24538315327d0922be35d64 v3.17 --
> git bisect good f207671f402b9aa7367ee21dbab2e450d16c9d7e # 01:43 20+ 20 staging: unisys: refactor ULTRA_DIAG_CHANNEL_PROTOCOL
> git bisect good 7c0496d19159df8b5a537154194a75dcdfca873a # 01:50 20+ 20 staging: vt6655: mac80211 conversion: s_vFillRTSHead convert to using struct ieee80211_hdr
> git bisect bad 33192fa1d173b91ba98c29cb8ab414c76ae3ff0b # 01:53 0- 20 staging: unisys: refactor VISORCHIPSET_BUS_INFO
> git bisect good 146761058f63d8530d150bef00c1d3daaa53dbba # 01:58 20+ 20 staging: vt6655: struct vnt_private remove dead members
> git bisect good 037a449ac71fc5c70d2b0af28ce6792065daae55 # 02:04 20+ 20 staging: rtl8188eu: os_dep: Remove useless cast on void pointer
> git bisect good 425ebd0aa35dcfc5fe2449ef3e1194e422f6f9b4 # 02:14 20+ 20 staging: unisys: virtpci: Add a blank line after the definition of driver_private
> git bisect good 9056be30542bfff51190bdda67088f319cf4c9f5 # 02:20 20+ 20 Staging: lustre: lnet: lnet: Better cookie gen
> git bisect bad 2c683cde06c1d57a9ecbacc484540e259e7255fa # 02:26 0- 20 staging: unisys: refactor CRASH_OBJ_TYPE
> git bisect bad 37951dabd899587afc31ddd508b9cdddc9c24524 # 02:33 0- 20 staging: unisys: get rid of VISORCHIPSET_STATE typedef
> git bisect bad 7cbe010a5ea728d7c4440b11a1a3997faca0e46d # 02:40 0- 20 Merge 3.18-rc3 into staging-next
> git bisect good f7e87a44ef60ad379e39b45437604141453bf0ec # 02:48 20+ 20 Merge tag 'media/v3.18-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media
> git bisect bad 89453379aaf0608253124057df6cd8ac63948135 # 02:56 0- 20 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
> git bisect bad 19e0d5f16ad9759d92f5739b53acdebfe22fa2a2 # 03:04 0- 20 Merge branch 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
> git bisect good 94712927d06c19b785bae92db07ef84b89f156c8 # 03:11 20+ 20 Merge tag 'sound-3.18-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound
> git bisect good aea4869f68b5869afbb308bfb7d777d725df8900 # 03:14 20+ 20 Merge branch 'core-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
> git bisect good 5656b408ff2696551c0f572689edcad3113e3a32 # 03:17 20+ 20 Merge branch 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
> git bisect good d1cd1210834649ce1ca6bafe5ac25d2f40331343 # 03:31 20+ 20 x86, pageattr: Prevent overflow in slow_virt_to_phys() for X86_PAE
> git bisect bad 2847c90e1b3ae95379af24894fc4f98e7f2fd705 # 03:34 0- 20 sched/fair: Care divide error in update_task_scan_period()
> git bisect bad 64be6f1f5f710f5995d41caf8a1767fe6d2b5a87 # 03:37 0- 20 sched/deadline: Don't replenish from a !SCHED_DEADLINE entity
> git bisect bad eeb61e53ea19be0c4015b00b2e8b3b2185436f2b # 03:39 0- 20 sched: Fix race between task_group and sched_task_group
> # first bad commit: [eeb61e53ea19be0c4015b00b2e8b3b2185436f2b] sched: Fix race between task_group and sched_task_group
> git bisect good cac7f2429872d3733dc3f9915857b1691da2eb2f # 03:45 60+ 60 Linux 3.18-rc2
> # extra tests with DEBUG_INFO
> git bisect good eeb61e53ea19be0c4015b00b2e8b3b2185436f2b # 03:52 60+ 49 sched: Fix race between task_group and sched_task_group
> # extra tests on HEAD of next/master
> git bisect good 827e3bdf1bb2401c1a1e5586eb3977d228d298b2 # 03:59 60+ 0 Add linux-next specific files for 20150130
> # extra tests on tree/branch linus/master
> git bisect bad 788807d7ca3821b5ea835a588a52d55631c17e18 # 04:01 58- 3 Merge branch 'i2c/for-current' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux
> # extra tests with first bad commit reverted
> # extra tests on tree/branch linus/master
> git bisect bad 788807d7ca3821b5ea835a588a52d55631c17e18 # 04:03 0- 3 Merge branch 'i2c/for-current' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux
> # extra tests on tree/branch next/master
> git bisect bad 827e3bdf1bb2401c1a1e5586eb3977d228d298b2 # 04:07 295- 40 Add linux-next specific files for 20150130
>
>
> This script may reproduce the error.
>
> ----------------------------------------------------------------------------
> #!/bin/bash
>
> kernel=$1
> initrd=quantal-core-x86_64.cgz
>
> wget --no-clobber https://github.com/fengguang/reproduce-kernel-bug/raw/master/initrd/$initrd
>
> kvm=(
>         qemu-system-x86_64
>         -enable-kvm
>         -cpu Haswell,+smep,+smap
>         -kernel $kernel
>         -initrd $initrd
>         -m 320
>         -smp 2
>         -net nic,vlan=1,model=e1000
>         -net user,vlan=1
>         -boot order=nc
>         -no-reboot
>         -watchdog i6300esb
>         -rtc base=localtime
>         -serial stdio
>         -display none
>         -monitor null
> )
>
> append=(
>         hung_task_panic=1
>         earlyprintk=ttyS0,115200
>         debug
>         apic=debug
>         sysrq_always_enabled
>         rcupdate.rcu_cpu_stall_timeout=100
>         panic=-1
>         softlockup_panic=1
>         nmi_watchdog=panic
>         oops=panic
>         load_ramdisk=2
>         prompt_ramdisk=0
>         console=ttyS0,115200
>         console=tty0
>         vga=normal
>         root=/dev/ram0
>         rw
>         drbd.minor_count=8
> )
>
> "${kvm[@]}" --append "${append[*]}"
> ----------------------------------------------------------------------------

Regards,
Kirill Tkhai