Date: Thu, 08 Aug 2013 09:47:56 +0800
From: Lai Jiangshan
To: paulmck@linux.vnet.ibm.com
CC: Steven Rostedt, Peter Zijlstra, linux-kernel@vger.kernel.org, C.Emde@osadl.org
Subject: Re: [PATCH 0/8] rcu: Ensure rcu read site is deadlock-immunity
Message-ID: <5202F8CC.2020703@cn.fujitsu.com>
In-Reply-To: <20130808003635.GA9487@linux.vnet.ibm.com>
References: <1375871104-10688-1-git-send-email-laijs@cn.fujitsu.com> <20130807123827.GB4306@linux.vnet.ibm.com> <20130808003635.GA9487@linux.vnet.ibm.com>

On 08/08/2013 08:36 AM, Paul E. McKenney wrote:
> On Wed, Aug 07, 2013 at 05:38:27AM -0700, Paul E. McKenney wrote:
>> On Wed, Aug 07, 2013 at 06:24:56PM +0800, Lai Jiangshan wrote:
>>> Although all articles declare that the RCU read site is deadlock-immune,
>>> this is not true for rcu-preempt: it can deadlock if an RCU read-side
>>> critical section overlaps with a scheduler lock.
>>
>> The real rule is that if the scheduler does its outermost rcu_read_unlock()
>> with one of those locks held, it has to have avoided enabling preemption
>> through the entire RCU read-side critical section.
>>
>> That said, avoiding the need for this rule would be a good thing.
>>
>> How did you test this?  The rcutorture tests will not exercise this.
>> (Intentionally so, given that it can deadlock!)
>>
>>> ec433f0c, 10f39bb1 and 016a8d5b only partially solve it.  The RCU read
>>> site is still not deadlock-immune, and the problem described in 016a8d5b
>>> still exists (rcu_read_unlock_special() calls wake_up).
>>>
>>> The problem is fixed in patch 5.
>>
>> This is going to require some serious review and testing.  One requirement
>> is that RCU priority boosting not persist significantly beyond the
>> re-enabling of interrupts associated with the irq-disabled lock.  To do
>> otherwise breaks RCU priority boosting.  At first glance, the added
>> set_need_resched() might handle this, but that is part of the review
>> and testing required.
>>
>> Steven, would you and Carsten be willing to try this and see if it
>> helps with the issues you are seeing in -rt?  (My guess is "no", since
>> a deadlock would block forever rather than waking up after a couple
>> thousand seconds, but worth a try.)
>
> No joy from either Steven or Carsten on the -rt hangs.
>
> I pushed this to -rcu and ran tests.  I hit this in one of the
> configurations:
>
> [  393.641012] =================================
> [  393.641012] [ INFO: inconsistent lock state ]
> [  393.641012] 3.11.0-rc1+ #1 Not tainted
> [  393.641012] ---------------------------------
> [  393.641012] inconsistent {HARDIRQ-ON-W} -> {IN-HARDIRQ-W} usage.
> [  393.641012] rcu_torture_rea/697 [HC1[1]:SC0[0]:HE0:SE1] takes:
> [  393.641012]  (&lock->wait_lock){?.+...}, at: [] rt_mutex_unlock+0x53/0x100
> [  393.641012] {HARDIRQ-ON-W} state was registered at:
> [  393.641012]   [] __lock_acquire+0x651/0x1d40
> [  393.641012]   [] lock_acquire+0x95/0x210
> [  393.641012]   [] _raw_spin_lock+0x36/0x50
> [  393.641012]   [] rt_mutex_slowlock+0x39/0x170
> [  393.641012]   [] rt_mutex_lock+0x2a/0x30
> [  393.641012]   [] rcu_boost_kthread+0x173/0x800
> [  393.641012]   [] kthread+0xd6/0xe0
> [  393.641012]   [] ret_from_fork+0x7c/0xb0
> [  393.641012] irq event stamp: 96581116
> [  393.641012] hardirqs last  enabled at (96581115): [] restore_args+0x0/0x30
> [  393.641012] hardirqs last disabled at (96581116): [] apic_timer_interrupt+0x6a/0x80
> [  393.641012] softirqs last  enabled at (96576304): [] __do_softirq+0x174/0x470
> [  393.641012] softirqs last disabled at (96576275): [] irq_exit+0x96/0xc0
> [  393.641012]
> [  393.641012] other info that might help us debug this:
> [  393.641012]  Possible unsafe locking scenario:
> [  393.641012]
> [  393.641012]        CPU0
> [  393.641012]        ----
> [  393.641012]   lock(&lock->wait_lock);
> [  393.641012]   <Interrupt>
> [  393.641012]     lock(&lock->wait_lock);

Patch 2 causes it!  When I enumerated all the locks that can be
(chained-)nested inside rcu_read_unlock_special(), I did not notice
that rtmutex's lock->wait_lock is not taken with irqs disabled.

Two ways to fix it:
1) change rtmutex's lock->wait_lock so it is always taken with irqs disabled;
2) revert my patch 2.

> [  393.641012]
> [  393.641012]  *** DEADLOCK ***
> [  393.641012]
> [  393.641012] no locks held by rcu_torture_rea/697.
> [  393.641012]
> [  393.641012] stack backtrace:
> [  393.641012] CPU: 3 PID: 697 Comm: rcu_torture_rea Not tainted 3.11.0-rc1+ #1
> [  393.641012] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2007
> [  393.641012]  ffffffff8586fea0 ffff88001fcc3a78 ffffffff8187b4cb ffffffff8104a261
> [  393.641012]  ffff88001e1a20c0 ffff88001fcc3ad8 ffffffff818773e4 0000000000000000
> [  393.641012]  ffff880000000000 ffff880000000001 ffffffff81010a0a 0000000000000001
> [  393.641012] Call Trace:
> [  393.641012]  [] dump_stack+0x4f/0x84
> [  393.641012]  [] ? console_unlock+0x291/0x410
> [  393.641012]  [] print_usage_bug+0x1f5/0x206
> [  393.641012]  [] ? save_stack_trace+0x2a/0x50
> [  393.641012]  [] mark_lock+0x283/0x2e0
> [  393.641012]  [] ? print_irq_inversion_bug.part.40+0x1f0/0x1f0
> [  393.641012]  [] __lock_acquire+0x906/0x1d40
> [  393.641012]  [] ? __lock_acquire+0x2eb/0x1d40
> [  393.641012]  [] ? __lock_acquire+0x2eb/0x1d40
> [  393.641012]  [] lock_acquire+0x95/0x210
> [  393.641012]  [] ? rt_mutex_unlock+0x53/0x100
> [  393.641012]  [] _raw_spin_lock+0x36/0x50
> [  393.641012]  [] ? rt_mutex_unlock+0x53/0x100
> [  393.641012]  [] rt_mutex_unlock+0x53/0x100
> [  393.641012]  [] rcu_read_unlock_special+0x17a/0x2a0
> [  393.641012]  [] rcu_check_callbacks+0x313/0x950
> [  393.641012]  [] ? hrtimer_run_queues+0x1d/0x180
> [  393.641012]  [] ? trace_hardirqs_off+0xd/0x10
> [  393.641012]  [] update_process_times+0x43/0x80
> [  393.641012]  [] tick_sched_handle.isra.10+0x31/0x40
> [  393.641012]  [] tick_sched_timer+0x47/0x70
> [  393.641012]  [] __run_hrtimer+0x7c/0x490
> [  393.641012]  [] ? ktime_get_update_offsets+0x4d/0xe0
> [  393.641012]  [] ? tick_nohz_handler+0xa0/0xa0
> [  393.641012]  [] hrtimer_interrupt+0x107/0x260
> [  393.641012]  [] local_apic_timer_interrupt+0x33/0x60
> [  393.641012]  [] smp_apic_timer_interrupt+0x3e/0x60
> [  393.641012]  [] apic_timer_interrupt+0x6f/0x80
> [  393.641012]  [] ? rcu_scheduler_starting+0x60/0x60
> [  393.641012]  [] ? __rcu_read_unlock+0x91/0xa0
> [  393.641012]  [] rcu_torture_read_unlock+0x33/0x70
> [  393.641012]  [] rcu_torture_reader+0xe4/0x450
> [  393.641012]  [] ? rcu_torture_reader+0x450/0x450
> [  393.641012]  [] ? rcutorture_trace_dump+0x30/0x30
> [  393.641012]  [] kthread+0xd6/0xe0
> [  393.641012]  [] ? _raw_spin_unlock_irq+0x2b/0x60
> [  393.641012]  [] ? flush_kthread_worker+0x130/0x130
> [  393.641012]  [] ret_from_fork+0x7c/0xb0
> [  393.641012]  [] ? flush_kthread_worker+0x130/0x130
>
> I don't see this without your patches.
>
> .config attached.  The other configurations completed without errors.
> Short tests, 30 minutes per configuration.
>
> Thoughts?
>
>							Thanx, Paul