Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1755411AbaJDD2G (ORCPT ); Fri, 3 Oct 2014 23:28:06 -0400 Received: from g4t3425.houston.hp.com ([15.201.208.53]:31859 "EHLO g4t3425.houston.hp.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1754875AbaJDD2D (ORCPT ); Fri, 3 Oct 2014 23:28:03 -0400 Message-ID: <542F693E.5070200@hp.com> Date: Fri, 03 Oct 2014 23:27:58 -0400 From: Waiman Long User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:10.0.12) Gecko/20130109 Thunderbird/10.0.12 MIME-Version: 1.0 To: Fengguang Wu CC: Dave Hansen , Ingo Molnar , LKML , lkp@01.org, "linux-btrfs@vger.kernel.org" , Chris Mason Subject: Re: [x86, locking/rwlocks, btrfs] INFO: rcu_sched self-detected stall on CPU References: <20141003133351.GC32634@wfg-t540p.sh.intel.com> In-Reply-To: <20141003133351.GC32634@wfg-t540p.sh.intel.com> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On 10/03/2014 09:33 AM, Fengguang Wu wrote: > Hi Waiman, > > FYI, we noticed the below changes on commit > > bd01ec1a13f9a327950c8e3080096446c7804753 ("x86, locking/rwlocks: Enable qrwlocks on x86") > > +----------------------------------------------+------------+------------+ > | | 70af2f8a4f | bd01ec1a13 | > +----------------------------------------------+------------+------------+ > | boot_successes | 3 | 2 | > | boot_failures | 7 | 13 | > | BUG:kernel_test_crashed | 7 | 8 | > | INFO:rcu_sched_self-detected_stall_on_CPU | 0 | 4 | > | RIP:intel_idle | 0 | 4 | > | RIP:queue_write_lock_slowpath | 0 | 4 | > | RIP:queue_read_lock_slowpath | 0 | 4 | > | RIP:sys_imageblit_sysimgblt | 0 | 2 | > | RIP:default_send_IPI_mask_sequence_phys | 0 | 1 | > | RIP:memcpy | 0 | 1 | > | RIP:delay_tsc | 0 | 4 | > | backtrace:cpu_startup_entry | 0 | 3 | > | backtrace:do_fsync | 0 | 4 | > | backtrace:SyS_fsync | 0 | 4 | > | backtrace:normal_work_helper | 0 | 1 | > | backtrace:vfs_write | 0 | 3 | > | backtrace:SyS_write | 0 | 3 | > | backtrace:do_sys_open | 0 | 4 | > | backtrace:SyS_open | 0 | 4 | > | backtrace:flush_to_ldisc | 0 | 1 | > | RIP:cpu_startup_entry | 0 | 1 | > | RIP:native_read_tsc | 0 | 2 | > | RIP:rcu_eqs_exit_common | 0 | 1 | > | INFO:rcu_sched_detected_stalls_on_CPUs/tasks | 0 | 1 | > +----------------------------------------------+------------+------------+ > > > run: /lkp/lkp/src/monitors/wrapper sched_debug {"interval"=>"10"} > run: /usr/bin/time -v -o /lkp/lkp/src/tmp/time /lkp/lkp/src/tests/wrapper fsmark {"filesize"=>"9B", "test_size"=>"400M", "sync_method"=>"fsyncBeforeClose", "nr_directories"=>"16d", "nr_files_per_directory"=>"256fpd"} > run: /lkp/lkp/src/monitors/wrapper pmeter {} > [ 125.656200] INFO: rcu_sched self-detected stall on CPU > [ 125.657199] INFO: rcu_sched self-detected stall on CPUINFO: rcu_sched self-detected stall on CPUINFO: rcu_sched self-detected stall on CPUINFO: rcu_sched self-detected stall on CPU { { { { 0 9 7 14} } } } (t=100000 jiffies g=1792 c=1791 q=0) > [ 125.657218] (t=100000 jiffies g=1792 c=1791 q=0) > [ 125.657219] (t=100000 jiffies g=1792 c=1791 q=0) > [ 125.657221] (t=100000 jiffies g=1792 c=1791 q=0) > [ 125.657222] sending NMI to all CPUs: > [ 125.657224] NMI backtrace for cpu 0 > [ 125.657227] CPU: 0 PID: 3025 Comm: fs_mark Not tainted 3.16.0 #1 > [ 125.657227] Hardware name: Intel Corporation S5520UR/S5520UR, BIOS S5500.86B.01.00.0050.050620101605 05/06/2010 > [ 125.657228] task: ffff88007ef58000 ti: ffff88007ef54000 task.ti: ffff88007ef54000 > [ 125.657229] RIP: 0010:[] [] native_read_tsc+0x6/0x20 > [ 125.657236] RSP: 0018:ffff88036fc03d20 EFLAGS: 00000002 > [ 125.657237] RAX: 000000003f172acf RBX: 000000003f172ab0 RCX: 0000000000000028 > [ 125.657238] RDX: 00000000000014e5 RSI: 0000000000000018 RDI: 000000000004773a > [ 125.657238] RBP: ffff88036fc03d20 R08: ffffffff81da2200 R09: 0000000000000092 > [ 125.657239] R10: 000014e53edc9480 R11: 0000000000080000 R12: 000000000004773a > [ 125.657239] R13: 0000000000000000 R14: 0000000000000002 R15: 0000000000000001 > [ 125.657241] FS: 0000000001ee0880(0063) GS:ffff88036fc00000(0000) knlGS:0000000000000000 > [ 125.657241] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b > [ 125.657242] CR2: 000000000061c000 CR3: 000000007ef3c000 CR4: 00000000000007f0 > [ 125.657243] Stack: > [ 125.657243] ffff88036fc03d48 ffffffff813f85e3 0000000000001000 00000000000003e9 > [ 125.657244] 0000000000000400 ffff88036fc03d58 ffffffff813f8538 ffff88036fc03d78 > [ 125.657246] ffffffff81046d1a 000000000000b032 ffffffff81da2200 ffff88036fc03dc0 > [ 125.657247] Call Trace: > [ 125.657247] [] delay_tsc+0x43/0x90 > [ 125.657253] [] __const_udelay+0x28/0x30 > [ 125.657254] [] native_safe_apic_wait_icr_idle+0x2a/0x60 > [ 125.657257] [] default_send_IPI_mask_sequence_phys+0xd0/0x100 > [ 125.657259] [] physflat_send_IPI_mask+0xe/0x10 > [ 125.657261] [] arch_trigger_all_cpu_backtrace+0x134/0x150 > [ 125.657262] [] rcu_check_callbacks+0x4c9/0x740 > [ 125.657265] [] update_process_times+0x47/0x70 > [ 125.657267] [] tick_sched_handle.isra.18+0x25/0x60 > [ 125.657269] [] tick_sched_timer+0x41/0x60 > [ 125.657270] [] __run_hrtimer+0x77/0x1f0 > [ 125.657274] [] ? tick_sched_do_timer+0x60/0x60 > [ 125.657275] [] hrtimer_interrupt+0xff/0x240 > [ 125.657277] [] local_apic_timer_interrupt+0x3a/0x60 > [ 125.657278] [] smp_apic_timer_interrupt+0x3f/0x50 > [ 125.657282] [] apic_timer_interrupt+0x6d/0x80 > [ 125.657285] [] ? queue_read_lock_slowpath+0x111/0x130 > [ 125.657289] [] _raw_read_lock+0x29/0x30 > [ 125.657292] [] btrfs_clear_lock_blocking_rw+0x4c/0x100 [btrfs] > [ 125.657307] [] btrfs_clear_path_blocking+0x41/0x80 [btrfs] > [ 125.657312] [] btrfs_search_forward+0x203/0x330 [btrfs] > [ 125.657318] [] btrfs_log_inode+0x27b/0x920 [btrfs] > [ 125.657327] [] ? find_get_pages_tag+0x110/0x1a0 > [ 125.657330] [] ? check_parent_dirs_for_sync+0xfd/0x140 [btrfs] > [ 125.657339] [] btrfs_log_inode_parent+0x1d6/0x540 [btrfs] > [ 125.657349] [] btrfs_log_dentry_safe+0x42/0x60 [btrfs] > [ 125.657358] [] btrfs_sync_file+0x194/0x310 [btrfs] > [ 125.657367] [] do_fsync+0x51/0x80 > [ 125.657369] [] SyS_fsync+0x10/0x20 > [ 125.657370] [] system_call_fastpath+0x16/0x1b > [ 125.657372] Code: 65 ff 0c 25 e0 c7 00 00 e9 37 ff ff ff 0f 1f 84 00 00 00 00 00 66 66 66 66 90 55 48 89 e5 5d c3 0f 1f 44 00 00 55 48 89 e5 0f 31<48> c1 e2 20 89 c0 48 09 d0 5d c3 66 66 66 66 66 66 2e 0f 1f 84 > [ 125.657423] NMI backtrace for cpu 1 > [ 125.657425] CPU: 1 PID: 0 Comm: swapper/1 Not tainted 3.16.0 #1 > [ 125.657426] Hardware name: Intel Corporation S5520UR/S5520UR, BIOS S5500.86B.01.00.0050.050620101605 05/06/2010 > [ 125.657427] task: ffff880366eb1d80 ti: ffff880366eb8000 task.ti: ffff880366eb8000 > [ 125.657428] RIP: 0010:[] [] intel_idle+0xd7/0x160 > [ 125.657433] RSP: 0018:ffff880366ebbe50 EFLAGS: 00000046 > [ 125.657434] RAX: 0000000000000020 RBX: 0000000000000008 RCX: 0000000000000001 > [ 125.657435] RDX: 0000000000000000 RSI: 0000000000000046 RDI: 0000000000000046 > [ 125.657436] RBP: ffff880366ebbe78 R08: 00000000000ec708 R09: 0000000000000002 > [ 125.657436] R10: 000014e53eafed88 R11: 00000000000006f6 R12: 0000000000000020 > [ 125.657437] R13: 0000000000000003 R14: 0000000000000004 R15: 0000000000000004 > [ 125.657438] FS: 0000000000000000(0000) GS:ffff8801e9c00000(0000) knlGS:0000000000000000 > [ 125.657439] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b > [ 125.657440] CR2: 00000000006e7c24 CR3: 0000000001c21000 CR4: 00000000000007e0 > [ 125.657441] Stack: > [ 125.657441] 00000001810dc3db ffff8801e9c1c900 ffffffff81d05740 0000001d0cdfc5fe > [ 125.657443] 0000000000000004 ffff880366ebbea8 ffffffff816c1f9a ffffffff81da50c0 > [ 125.657444] ffff8801e9c1c900 ffffffff81d05740 0000000008004000 ffff880366ebbeb8 > [ 125.657446] Call Trace: > [ 125.657447] [] cpuidle_enter_state+0x3a/0xc0 > [ 125.657451] [] cpuidle_enter+0x17/0x20 > [ 125.657452] [] cpu_idle_loop+0x317/0x460 > [ 125.657456] [] cpu_startup_entry+0x13/0x20 > [ 125.657458] [] start_secondary+0x1ff/0x290 > [ 125.657463] Code: c8 00 00 48 89 d1 48 2d c8 3f 00 00 0f 01 c8 65 48 8b 04 25 08 c8 00 00 48 8b 80 38 c0 ff ff a8 08 75 08 b1 01 4c 89 e0 0f 01 c9<65> 48 8b 04 25 08 c8 00 00 f0 80 a0 3a c0 ff ff df 0f ae f0 65 > [ 125.657492] NMI backtrace for cpu 2 > [ 125.657494] CPU: 2 PID: 3003 Comm: fs_mark Not tainted 3.16.0 #1 > [ 125.657494] Hardware name: Intel Corporation S5520UR/S5520UR, BIOS S5500.86B.01.00.0050.050620101605 05/06/2010 > [ 125.657495] task: ffff8801e6f75880 ti: ffff8801e75f0000 task.ti: ffff8801e75f0000 > [ 125.657496] RIP: 0010:[] [] queue_write_lock_slowpath+0xf8/0x120 > [ 125.657499] RSP: 0018:ffff8801e75f37f0 EFLAGS: 00000212 > [ 125.657500] RAX: 0000000000006c64 RBX: ffff880359dc3e10 RCX: 00000000000082c8 > [ 125.657501] RDX: 00000000000082d0 RSI: 00000000000082d0 RDI: ffff880359dc3e84 > [ 125.657501] RBP: ffff8801e75f37f0 R08: ffff880359dc3e80 R09: 0000000000000206 > [ 125.657502] R10: 00000000000039ba R11: 00000000000003a6 R12: ffff880359dc3ea0 > [ 125.657503] R13: ffff880359dc3e88 R14: ffff8801e75f3828 R15: ffff880359dc3e80 > [ 125.657504] FS: 0000000001ee0880(0063) GS:ffff88036fc20000(0000) knlGS:0000000000000000 > [ 125.657505] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b > [ 125.657506] CR2: 00007f39f56165c8 CR3: 00000001dd750000 CR4: 00000000000007e0 > [ 125.657506] Stack: > [ 125.657507] ffff8801e75f3800 ffffffff8184ed99 ffff8801e75f3860 ffffffffa0280aa9 > [ 125.657508] ffffffff00000000 ffff8801e6f75880 ffffffff810b9400 ffff8801e75f3828 > [ 125.657510] ffff8801e75f3828 ffff880359dc3e10 ffff88035571a000 ffff88035571a000 > [ 125.657512] Call Trace: > [ 125.657512] [] _raw_write_lock+0x29/0x30 > [ 125.657515] [] btrfs_tree_lock+0xc9/0x1e0 [btrfs] > [ 125.657528] [] ? abort_exclusive_wait+0xb0/0xb0 > [ 125.657530] [] btrfs_lock_root_node+0x3b/0x50 [btrfs] > [ 125.657537] [] btrfs_search_slot+0x939/0xa40 [btrfs] > [ 125.657543] [] ? btrfs_drop_extent_cache+0x37d/0x440 [btrfs] > [ 125.657554] [] btrfs_lookup_file_extent+0x37/0x40 [btrfs] > [ 125.657563] [] __btrfs_drop_extents+0x16e/0xd80 [btrfs] > [ 125.657573] [] ? start_transaction+0x93/0x5a0 [btrfs] > [ 125.657583] [] cow_file_range_inline+0x192/0x590 [btrfs] > [ 125.657593] [] ? __set_extent_bit+0x493/0x580 [btrfs] > [ 125.657604] [] cow_file_range+0x2e8/0x410 [btrfs] > [ 125.657614] [] ? find_delalloc_range.constprop.44+0xab/0x1a0 [btrfs] > [ 125.657624] [] ? test_range_bit+0x109/0x180 [btrfs] > [ 125.657635] [] run_delalloc_range+0x314/0x340 [btrfs] > [ 125.657645] [] ? find_lock_delalloc_range.constprop.43+0x1e1/0x210 [btrfs] > [ 125.657656] [] writepage_delalloc.isra.32+0xec/0x160 [btrfs] > [ 125.657667] [] __extent_writepage+0xb6/0x310 [btrfs] > [ 125.657678] [] extent_write_cache_pages.isra.27.constprop.49+0x35a/0x410 [btrfs] > [ 125.657689] [] ? btrfs_file_write_iter+0x198/0x540 [btrfs] > [ 125.657700] [] extent_writepages+0x4d/0x70 [btrfs] > [ 125.657711] [] ? btrfs_writepage_end_io_hook+0x190/0x190 [btrfs] > [ 125.657720] [] ? new_sync_write+0x82/0xb0 > [ 125.657723] [] btrfs_writepages+0x28/0x30 [btrfs] > [ 125.657733] [] do_writepages+0x1e/0x30 > [ 125.657736] [] __filemap_fdatawrite_range+0x59/0x60 > [ 125.657738] [] filemap_fdatawrite_range+0x13/0x20 > [ 125.657740] [] btrfs_sync_file+0x4e/0x310 [btrfs] > [ 125.657750] [] do_fsync+0x51/0x80 > [ 125.657751] [] SyS_fsync+0x10/0x20 > [ 125.657753] [] system_call_fastpath+0x16/0x1b > [ 125.657755] Code: b7 f6 66 66 66 90 66 66 90 5d c3 83 e2 fe 0f b7 f2 b8 00 80 00 00 0f 1f 44 00 00 41 0f b7 48 04 66 39 ca 0f 84 33 ff ff ff f3 90<83> e8 01 75 eb 66 66 66 90 66 66 90 eb d8 89 d0 f0 41 0f b1 48 > [ 125.657774] NMI backtrace for cpu 3 > [ 125.657776] CPU: 3 PID: 0 Comm: swapper/3 Not tainted 3.16.0 #1 > [ 125.657777] Hardware name: Intel Corporation S5520UR/S5520UR, BIOS S5500.86B.01.00.0050.050620101605 05/06/2010 > [ 125.657778] task: ffff880366eb5880 ti: ffff880366f08000 task.ti: ffff880366f08000 > [ 125.657779] RIP: 0010:[] [] intel_idle+0xd7/0x160 > [ 125.657782] RSP: 0018:ffff880366f0be50 EFLAGS: 00000046 > [ 125.657783] RAX: 0000000000000020 RBX: 0000000000000008 RCX: 0000000000000001 > [ 125.657783] RDX: 0000000000000000 RSI: 0000000000000046 RDI: 0000000000000046 > [ 125.657784] RBP: ffff880366f0be78 R08: 0000000000129657 R09: 0000000000000002 > [ 125.657785] R10: 000014e53be57e08 R11: 00000000000185c2 R12: 0000000000000020 > [ 125.657786] R13: 0000000000000003 R14: 0000000000000004 R15: 0000000000000004 > [ 125.657787] FS: 0000000000000000(0000) GS:ffff8801e9c20000(0000) knlGS:0000000000000000 > [ 125.657788] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b > [ 125.657789] CR2: 000000000065600d CR3: 0000000001c21000 CR4: 00000000000007e0 > [ 125.657789] Stack: > [ 125.657789] 00000003810dc3db ffff8801e9c3c900 ffffffff81d05740 0000001d0becef49 > [ 125.657791] 0000000000000004 ffff880366f0bea8 ffffffff816c1f9a ffffffff81da50c0 > [ 125.657793] ffff8801e9c3c900 ffffffff81d05740 0000000008004000 ffff880366f0beb8 > [ 125.657794] Call Trace: > [ 125.657795] [] cpuidle_enter_state+0x3a/0xc0 > [ 125.657797] [] cpuidle_enter+0x17/0x20 > [ 125.657799] [] cpu_idle_loop+0x317/0x460 > [ 125.657801] [] cpu_startup_entry+0x13/0x20 > [ 125.657803] [] start_secondary+0x1ff/0x290 > [ 125.657806] Code: c8 00 00 48 89 d1 48 2d c8 3f 00 00 0f 01 c8 65 48 8b 04 25 08 c8 00 00 48 8b 80 38 c0 ff ff a8 08 75 08 b1 01 4c 89 e0 0f 01 c9<65> 48 8b 04 25 08 c8 00 00 f0 80 a0 3a c0 ff ff df 0f ae f0 65 > [ 125.657823] NMI backtrace for cpu 4 > [ 125.657825] CPU: 4 PID: 3023 Comm: fs_mark Not tainted 3.16.0 #1 > [ 125.657825] Hardware name: Intel Corporation S5520UR/S5520UR, BIOS S5500.86B.01.00.0050.050620101605 05/06/2010 > [ 125.657826] task: ffff88007eefbb00 ti: ffff88007ef1c000 task.ti: ffff88007ef1c000 > [ 125.657827] RIP: 0010:[] [] delay_tsc+0x60/0x90 > [ 125.657829] RSP: 0018:ffff88036fc43bd8 EFLAGS: 00000007 > [ 125.657830] RAX: 0000000000000268 RBX: 000000003f2c1278 RCX: 0000000000000000 > [ 125.657830] RDX: 000000003f2c14e0 RSI: 0000000000000005 RDI: 0000000000000b6f > [ 125.657831] RBP: ffff88036fc43bf0 R08: 0000000000000000 R09: 0000000000000046 > [ 125.657831] R10: 0000000000000000 R11: 0000000000000001 R12: 0000000000000b6f > [ 125.657832] R13: 0000000000000004 R14: ffffffff81ffc952 R15: 0000000000000039 > [ 125.657833] FS: 0000000001ee0880(0063) GS:ffff88036fc40000(0000) knlGS:0000000000000000 > [ 125.657833] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b > [ 125.657834] CR2: 00007fe629a1c018 CR3: 000000007ef05000 CR4: 00000000000007e0 > [ 125.657834] Stack: > [ 125.657834] ffffffff82138c80 000000000000270e 0000000000000020 ffff88036fc43c00 > [ 125.657836] ffffffff813f8538 ffff88036fc43c28 ffffffff814d02c0 ffffffff82138c80 > [ 125.657837] 0000000000000046 ffffffff81ffc979 ffff88036fc43c48 ffffffff814d04dc > [ 125.657838] Call Trace: > [ 125.657838] [] __const_udelay+0x28/0x30 > [ 125.657840] [] wait_for_xmitr+0x30/0xa0 > [ 125.657844] [] serial8250_console_putchar+0x1c/0x30 > [ 125.657845] [] ? serial8250_console_write+0x190/0x190 > [ 125.657846] [] uart_console_write+0x3c/0x70 > [ 125.657848] [] serial8250_console_write+0xce/0x190 > [ 125.657849] [] call_console_drivers.constprop.24+0xa3/0xf0 > [ 125.657852] [] console_unlock+0x1d9/0x420 > [ 125.657853] [] vprintk_emit+0x310/0x580 > [ 125.657854] [] printk+0x4f/0x57 > [ 125.657858] [] rcu_check_callbacks+0x412/0x740 > [ 125.657859] [] ? account_process_tick+0x62/0x170 > [ 125.657864] [] update_process_times+0x47/0x70 > [ 125.657865] [] tick_sched_handle.isra.18+0x25/0x60 > [ 125.657866] [] tick_sched_timer+0x41/0x60 > [ 125.657867] [] __run_hrtimer+0x77/0x1f0 > [ 125.657869] [] ? tick_sched_do_timer+0x60/0x60 > [ 125.657870] [] hrtimer_interrupt+0xff/0x240 > [ 125.657871] [] local_apic_timer_interrupt+0x3a/0x60 > [ 125.657872] [] smp_apic_timer_interrupt+0x3f/0x50 > [ 125.657874] [] apic_timer_interrupt+0x6d/0x80 > [ 125.657875] [] ? queue_read_lock_slowpath+0x100/0x130 > > Thanks, > Fengguang The btrfs filesystem had problem using qrwlock. This was a known btrfs problem in 3.16-rc1. The following patch by Chris should have fixed the problem: > commit ea4ebde02e08558b020c4b61bb9a4c0fcf63028e > Author: Chris Mason > Date: Thu Jun 19 14:16:52 2014 -0700 > > Btrfs: fix deadlocks with trylock on tree nodes Was that patch included in your test? -Longman -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/