Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1755227AbbBCKBw (ORCPT ); Tue, 3 Feb 2015 05:01:52 -0500 Received: from mailout1.w1.samsung.com ([210.118.77.11]:28041 "EHLO mailout1.w1.samsung.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1752180AbbBCKBr (ORCPT ); Tue, 3 Feb 2015 05:01:47 -0500 X-AuditID: cbfec7f4-b7f126d000001e9a-7b-54d09bf7a7c6 Message-id: <1422957702.17540.1.camel@AMDC1943> Subject: Re: [rcu] [ INFO: suspicious RCU usage. ] From: Krzysztof Kozlowski To: Fengguang Wu Cc: "Paul E. McKenney" , LKP , linux-kernel@vger.kernel.org Date: Tue, 03 Feb 2015 11:01:42 +0100 In-reply-to: <20150201025922.GA16820@wfg-t540p.sh.intel.com> References: <20150201025922.GA16820@wfg-t540p.sh.intel.com> Content-type: text/plain; charset=UTF-8 X-Mailer: Evolution 3.10.4-0ubuntu2 MIME-version: 1.0 Content-transfer-encoding: 7bit X-Brightmail-Tracker: H4sIAAAAAAAAA+NgFlrKLMWRmVeSWpSXmKPExsVy+t/xa7rfZ18IMZjSqmXx/vl6ZovLu+aw Waw83s5q8Xbzd1YHFo/7e9k9Fu95yeTx4NBmFo/Pm+QCWKK4bFJSczLLUov07RK4MjbN+MRc 8Ca/omn6AtYGxiVRXYwcHBICJhIndvp3MXICmWISF+6tZ+ti5OIQEljKKHGxq5sdJCEk8JlR 4t9adxCbV0BfYtW/qywgtrCAkcTKd1OZQGw2AWOJzcuXsIHYIgKaEm+BjgCxmQVSJZZv2wdm swioSsxatIcZxOYUsJE48XseM8gNQgLWEhc/WUCUq0tMmreIGeI0ZYnGfjeIrYISPybfg5oo L7F5zVvmCYwCs5B0zEJSNgtJ2QJG5lWMoqmlyQXFSem5hnrFibnFpXnpesn5uZsYIUH7ZQfj 4mNWhxgFOBiVeHhXGF0IEWJNLCuuzD3EKMHBrCTCO7ETKMSbklhZlVqUH19UmpNafIiRiYNT qoGxUbT1icIusTs3nRpy1LhZfm5K38oiunbT5ml3Ol2WLWtQvHho+rOAFVGz5m9NKom943on 7APL5O3xwrsWqXxtPTpZXi0mwcVpd5zz9mdianseXXf8uGxPWVAW2/mX79v5z/xxLTt4mo/l fmNddMRF0cDy5z+tY7bcz5FWXDuP7fnCCTe3SC2WUGIpzkg01GIuKk4EAABHZW84AgAA Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Length: 14293 Lines: 273 On sob, 2015-01-31 at 18:59 -0800, Fengguang Wu wrote: > Greetings, > > 0day kernel testing robot got the below dmesg and the first bad commit is > > git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu.git revert-c418b8035fac0cc7d242e5de126cec1006a34bed-dd2b39be8eee9d175c7842c30e405a5cbe50095a On next-20150203 I hit similar error on ARM/Exynos4412 (Trats2 board) while suspending to RAM: [ 30.986262] PM: Syncing filesystems ... done. [ 30.994661] PM: Preparing system for mem sleep [ 31.002064] Freezing user space processes ... (elapsed 0.002 seconds) done. [ 31.008629] Freezing remaining freezable tasks ... (elapsed 0.001 seconds) done. [ 31.016325] PM: Entering mem sleep [ 31.016338] Suspending console(s) (use no_console_suspend to debug) [ 31.051009] random: nonblocking pool is initialized [ 31.085811] wake enabled for irq 102 [ 31.086964] wake enabled for irq 123 [ 31.086972] wake enabled for irq 124 [ 31.090409] PM: suspend of devices complete after 59.684 msecs [ 31.090524] CAM_ISP_CORE_1.2V: No configuration [ 31.090534] VMEM_VDDF_3.0V: No configuration [ 31.090543] VCC_SUB_2.0V: No configuration [ 31.090552] VCC_SUB_1.35V: No configuration [ 31.090562] VMEM_1.2V_AP: No configuration [ 31.090587] MOTOR_VCC_3.0V: No configuration [ 31.090596] LCD_VCC_3.3V: No configuration [ 31.090605] TSP_VDD_1.8V: No configuration [ 31.090614] TSP_AVDD_3.3V: No configuration [ 31.090623] VMEM_VDD_2.8V: No configuration [ 31.090631] VTF_2.8V: No configuration [ 31.090640] VDDQ_PRE_1.8V: No configuration [ 31.090649] VT_CAM_1.8V: No configuration [ 31.090658] CAM_ISP_SEN_IO_1.8V: No configuration [ 31.090667] CAM_SENSOR_CORE_1.2V: No configuration [ 31.090677] VHSIC_1.8V: No configuration [ 31.090685] VHSIC_1.0V: No configuration [ 31.090694] VABB2_1.95V: No configuration [ 31.090703] NFC_AVDD_1.8V: No configuration [ 31.090712] VUOTG_3.0V: No configuration [ 31.090721] VABB1_1.95V: No configuration [ 31.090730] VMIPI_1.8V: No configuration [ 31.090739] CAM_ISP_MIPI_1.2V: No configuration [ 31.090747] VMIPI_1.0V: No configuration [ 31.090756] VPLL_1.0V_AP: No configuration [ 31.090765] VMPLL_1.0V_AP: No configuration [ 31.090773] VCC_1.8V_IO: No configuration [ 31.090782] VCC_2.8V_AP: No configuration [ 31.090791] VCC_1.8V_AP: No configuration [ 31.090800] VM1M2_1.2V_AP: No configuration [ 31.090809] VALIVE_1.0V_AP: No configuration [ 31.100297] PM: late suspend of devices complete after 9.445 msecs [ 31.108891] PM: noirq suspend of devices complete after 8.577 msecs [ 31.109052] Disabling non-boot CPUs ... [ 31.113921] [ 31.113925] =============================== [ 31.113928] [ INFO: suspicious RCU usage. ] [ 31.113935] 3.19.0-rc7-next-20150203 #1914 Not tainted [ 31.113938] ------------------------------- [ 31.113943] kernel/sched/fair.c:4740 suspicious rcu_dereference_check() usage! [ 31.113946] [ 31.113946] other info that might help us debug this: [ 31.113946] [ 31.113952] [ 31.113952] RCU used illegally from offline CPU! [ 31.113952] rcu_scheduler_active = 1, debug_locks = 0 [ 31.113957] 3 locks held by swapper/1/0: [ 31.113988] #0: ((cpu_died).wait.lock){......}, at: [] complete+0x14/0x44 [ 31.114012] #1: (&p->pi_lock){-.-.-.}, at: [] try_to_wake_up+0x28/0x300 [ 31.114035] #2: (rcu_read_lock){......}, at: [] select_task_rq_fair+0x5c/0xa04 [ 31.114038] [ 31.114038] stack backtrace: [ 31.114046] CPU: 1 PID: 0 Comm: swapper/1 Not tainted 3.19.0-rc7-next-20150203 #1914 [ 31.114050] Hardware name: SAMSUNG EXYNOS (Flattened Device Tree) [ 31.114076] [] (unwind_backtrace) from [] (show_stack+0x10/0x14) [ 31.114091] [] (show_stack) from [] (dump_stack+0x70/0xbc) [ 31.114105] [] (dump_stack) from [] (select_task_rq_fair+0x6e0/0xa04) [ 31.114118] [] (select_task_rq_fair) from [] (try_to_wake_up+0xd4/0x300) [ 31.114129] [] (try_to_wake_up) from [] (__wake_up_common+0x4c/0x80) [ 31.114140] [] (__wake_up_common) from [] (__wake_up_locked+0x14/0x1c) [ 31.114150] [] (__wake_up_locked) from [] (complete+0x34/0x44) [ 31.114167] [] (complete) from [] (cpu_die+0x24/0x84) [ 31.114179] [] (cpu_die) from [] (cpu_startup_entry+0x328/0x358) [ 31.114189] [] (cpu_startup_entry) from [<40008784>] (0x40008784) [ 31.114226] CPU1: shutdown [ 31.132479] CPU2: shutdown [ 31.146815] CPU3: shutdown [ 31.160767] Enabling non-boot CPUs ... [ 31.175645] CPU1 is up [ 31.191120] CPU2 is up [ 31.206650] CPU3 is up [ 31.206922] s3c-i2c 13860000.i2c: slave address 0x10 [ 31.206935] s3c-i2c 13860000.i2c: bus frequency set to 390 KHz [ 31.206952] s3c-i2c 13890000.i2c: slave address 0x10 [ 31.206962] s3c-i2c 13890000.i2c: bus frequency set to 390 KHz [ 31.206978] s3c-i2c 138d0000.i2c: slave address 0x10 [ 31.206987] s3c-i2c 138d0000.i2c: bus frequency set to 97 KHz [ 31.209201] PM: noirq resume of devices complete after 2.539 msecs [ 31.212202] PM: early resume of devices complete after 2.812 msecs [ 31.229844] Failed to resume regulators from suspend (-22) [ 31.230915] wake disabled for irq 123 [ 31.230923] wake disabled for irq 124 [ 31.232003] wake disabled for irq 102 [ 31.259950] max77686_rtc_tm_to_data: MAX77686 RTC cannot handle the year 1970.Assume it's 2000. [ 31.298929] mmc_host mmc1: Bus speed (slot 0) = 50000000Hz (slot req 400000Hz, actual 396825HZ div = 63) [ 31.526729] mmc_host mmc1: Bus speed (slot 0) = 50000000Hz (slot req 52000000Hz, actual 50000000HZ div = 0) [ 31.526976] mmc_host mmc1: Bus speed (slot 0) = 100000000Hz (slot req 52000000Hz, actual 50000000HZ div = 1) [ 31.527207] PM: resume of devices complete after 297.352 msecs [ 31.985665] PM: Finishing wakeup. [ 31.988959] Restarting tasks ... done. root@target:~# Best regards, Krzysztof > > commit dd2b39be8eee9d175c7842c30e405a5cbe50095a > Author: Paul E. McKenney > AuthorDate: Wed Jan 28 14:42:09 2015 -0800 > Commit: Paul E. McKenney > CommitDate: Fri Jan 30 12:59:22 2015 -0800 > > rcu: Handle outgoing CPUs on exit from idle loop > > This commit informs RCU of an outgoing CPU just before that CPU invokes > arch_cpu_idle_dead() during its last pass through the idle loop (via a > new CPU_DYING_IDLE notifier value). This change means that RCU need not > deal with outgoing CPUs passing through the scheduler after informing > RCU that they are no longer online. Note that removing the CPU from > the rcu_node ->qsmaskinit bit masks is done at CPU_DYING_IDLE time, > and orphaning callbacks is still done at CPU_DEAD time, the reason being > that at CPU_DEAD time we have another CPU that can adopt them. > > Signed-off-by: Paul E. McKenney > > +-------------------------------------+------------+------------+------------+ > | | 8c50d7da91 | dd2b39be8e | d586642522 | > +-------------------------------------+------------+------------+------------+ > | boot_successes | 198 | 11 | 51 | > | boot_failures | 0 | 55 | 15 | > | INFO:suspicious_RCU_usage | 0 | 55 | 15 | > | RCU_used_illegally_from_offline_CPU | 0 | 55 | 15 | > | backtrace:cpu_startup_entry | 0 | 55 | 15 | > | BUG:kernel_test_hang | 0 | 0 | 4 | > +-------------------------------------+------------+------------+------------+ > > [ 15.244825] numa_remove_cpu cpu 0 node 0: mask now 1 > [ 15.246713] > [ 15.246917] =============================== > [ 15.247424] [ INFO: suspicious RCU usage. ] > [ 15.247964] 3.19.0-rc1-gdd2b39b #10 Not tainted > [ 15.248531] ------------------------------- > [ 15.248586] include/trace/events/rcu.h:35 suspicious rcu_dereference_check() usage! > [ 15.248586] > [ 15.248586] other info that might help us debug this: > [ 15.248586] > [ 15.248586] > [ 15.248586] RCU used illegally from offline CPU! > [ 15.248586] rcu_scheduler_active = 1, debug_locks = 0 > [ 15.248586] no locks held by swapper/0/0. > [ 15.248586] > [ 15.248586] stack backtrace: > [ 15.248586] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 3.19.0-rc1-gdd2b39b #10 > [ 15.248586] 0000000000000001 ffffffff81e03e08 ffffffff8171b89b 0000000000000000 > [ 15.248586] ffffffff81e0e580 ffffffff81e03e38 ffffffff810efec2 ffffffff81e4b140 > [ 15.248586] ffffffff81c77ba0 0000000000000002 ffffffff81e11e98 ffffffff81e03e58 > [ 15.248586] Call Trace: > [ 15.248586] [] dump_stack+0x7f/0xa7 > [ 15.248586] [] lockdep_rcu_suspicious+0x107/0x110 > [ 15.248586] [] trace_rcu_utilization+0x127/0x133 > [ 15.248586] [] rcu_cpu_notify+0x527/0x53b > [ 15.248586] [] cpu_startup_entry+0x1dc/0x4ea > [ 15.248586] [] rest_init+0x159/0x15f > [ 15.248586] [] start_kernel+0x565/0x572 > [ 15.248586] [] ? early_idt_handlers+0x120/0x120 > [ 15.248586] [] x86_64_start_reservations+0x41/0x43 > [ 15.248586] [] x86_64_start_kernel+0x13d/0x14c > [ 15.265151] CPU 0 is now offline > [ 15.265941] debug: unmapping init [mem 0xffffffff82365000-0xffffffff82539fff] > [ 15.266726] Write protecting the kernel read-only data: 14336k > > git bisect start d58664252218cfefb19709d597ff0c5d93688203 26bc420b59a38e4e6685a73345a0def461136dce -- > git bisect bad 19f7d9c2f948a4c5c7742adb16fe00920f35f302 # 13:29 23- 6 Merge 'jtkirshe-net-next/i40e-queue' into devel-roam-smoke-201501311226 > git bisect bad 2c86978183cc365003e2d6949052a30865ef8b89 # 13:33 34- 32 Merge 'wsa/i2c/for-next' into devel-roam-smoke-201501311226 > git bisect good 1ffdd3662d27b1d4d59d51bbcc104b200be63d6a # 13:37 66+ 0 Merge 'pci/pci/virtualization' into devel-roam-smoke-201501311226 > git bisect bad 0ce6ea6707a3d5ae5bfdbdc4d16ebc86cff77f5f # 13:43 32- 22 Merge 'rcu/rcu/next' into devel-roam-smoke-201501311226 > git bisect good 53805a9f2fa76294af534fb7e9f96d43f1d820eb # 13:52 66+ 0 Merge 'iio/testing' into devel-roam-smoke-201501311226 > git bisect good 78e691f4ae2d5edea0199ca802bb505b9cdced88 # 14:01 66+ 0 Merge branches 'doc.2015.01.07a', 'fixes.2015.01.15a', 'preempt.2015.01.06a', 'srcu.2015.01.06a', 'stall.2015.01.16a' and 'torture.2015.01.11a' into HEAD > git bisect good 17366dc8dc49858ba931c4120d8de494e388d93e # 14:05 66+ 0 documentation: Update rcutree.kthread_prio for grace-period kthread use > git bisect good 569c1500e44189136c8a9f4b5e39f0055e422b0d # 14:14 66+ 0 documentation: Update based on on-demand vmstat workers > git bisect good 14fefdb410cf48327c972ce91deb5e98edc8671f # 14:18 66+ 0 rcu: Eliminate ->onoff_mutex from rcu_node structure > git bisect bad dd2b39be8eee9d175c7842c30e405a5cbe50095a # 14:29 11- 55 rcu: Handle outgoing CPUs on exit from idle loop > git bisect good 8c50d7da9124a9f1e92e13996a0a148b2431390d # 14:34 66+ 0 cpu: Make CPU-offline idle-loop transition point more precise > # first bad commit: [dd2b39be8eee9d175c7842c30e405a5cbe50095a] rcu: Handle outgoing CPUs on exit from idle loop > git bisect good 8c50d7da9124a9f1e92e13996a0a148b2431390d # 14:37 198+ 0 cpu: Make CPU-offline idle-loop transition point more precise > # extra tests with DEBUG_INFO > git bisect good dd2b39be8eee9d175c7842c30e405a5cbe50095a # 15:35 198+ 198 rcu: Handle outgoing CPUs on exit from idle loop > # extra tests on HEAD of linux-devel/devel-roam-smoke-201501311226 > git bisect bad d58664252218cfefb19709d597ff0c5d93688203 # 15:35 0- 15 0day head guard for 'devel-roam-smoke-201501311226' > # extra tests on tree/branch rcu/rcu/next > git bisect bad c418b8035fac0cc7d242e5de126cec1006a34bed # 15:52 47- 21 cpu: Stop newly offlined CPU from using RCU readers > # extra tests with first bad commit reverted > # extra tests on tree/branch linus/master > git bisect good 2141fd018156db0f29efb384f4d99ead23b48f18 # 16:04 198+ 0 Merge tag 'char-misc-3.19-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc > # extra tests on tree/branch next/master > git bisect good 827e3bdf1bb2401c1a1e5586eb3977d228d298b2 # 16:12 198+ 0 Add linux-next specific files for 20150130 > > > This script may reproduce the error. > > ---------------------------------------------------------------------------- > #!/bin/bash > > kernel=$1 > initrd=quantal-core-x86_64.cgz > > wget --no-clobber https://github.com/fengguang/reproduce-kernel-bug/raw/master/initrd/$initrd > > kvm=( > qemu-system-x86_64 > -cpu kvm64 > -enable-kvm > -kernel $kernel > -initrd $initrd > -m 320 > -smp 2 > -net nic,vlan=1,model=e1000 > -net user,vlan=1 > -boot order=nc > -no-reboot > -watchdog i6300esb > -rtc base=localtime > -serial stdio > -display none > -monitor null > ) > > append=( > hung_task_panic=1 > earlyprintk=ttyS0,115200 > debug > apic=debug > sysrq_always_enabled > rcupdate.rcu_cpu_stall_timeout=100 > panic=-1 > softlockup_panic=1 > nmi_watchdog=panic > oops=panic > load_ramdisk=2 > prompt_ramdisk=0 > console=ttyS0,115200 > console=tty0 > vga=normal > root=/dev/ram0 > rw > drbd.minor_count=8 > ) > > "${kvm[@]}" --append "${append[*]}" > ---------------------------------------------------------------------------- > > Thanks, > Fengguang > _______________________________________________ > LKP mailing list > LKP@linux.intel.com -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/