Message-ID: <5547C1DC.10802@redhat.com>
Date: Mon, 04 May 2015 15:00:44 -0400
From: Rik van Riel
To: Paolo Bonzini, Ingo Molnar, Andy Lutomirski
Cc: linux-kernel@vger.kernel.org, X86 ML, williams@redhat.com,
    Andrew Lutomirski, fweisbec@redhat.com, Peter Zijlstra,
    Heiko Carstens, Thomas Gleixner, Ingo Molnar,
    "Paul E. McKenney", Linus Torvalds
Subject: Re: question about RCU dynticks_nesting
References: <554399D1.6010405@redhat.com> <20150501155912.GA451@gmail.com>
    <20150501162109.GA1091@gmail.com> <5543A94B.3020108@redhat.com>
    <20150501163431.GB1327@gmail.com> <5543C05E.9040209@redhat.com>
    <20150501184025.GA2114@gmail.com> <5543CFE5.1030509@redhat.com>
    <20150502052733.GA9983@gmail.com> <55473B47.6080600@redhat.com>
    <55479749.7070608@redhat.com>
In-Reply-To: <55479749.7070608@redhat.com>
List-ID: X-Mailing-List: linux-kernel@vger.kernel.org

On 05/04/2015 11:59 AM, Rik van Riel wrote:
> However, currently the RCU code seems to use a much more
> complex counting scheme, with a different increment for
> kernel/task use, and irq use.
>
> This counter seems to be modeled on the task preempt_counter,
> where we do care about whether we are in task context, irq
> context, or softirq context.
>
> On the other hand, the RCU code only seems to care about
> whether or not a CPU is in an extended quiescent state,
> or is potentially in an RCU critical section.
>
> Paul, what is the reason for RCU using a complex counter,
> instead of a simple increment for each potential kernel/RCU
> entry, like rcu_read_lock() does with CONFIG_PREEMPT_RCU
> enabled?

Looking at the code for a while more, I have not found any
reason why the RCU dynticks counter is so complex.

The rdtp->dynticks atomic seems to be used as a serial number.
An even value means the CPU is in an RCU extended quiescent
state, an odd value means it is not. This test is used to
verify whether or not a CPU is in an RCU quiescent state;
presumably the atomic_add_return is used to add a full memory
barrier:

    (atomic_add_return(0, &rdtp->dynticks) & 0x1)

> In fact, would we be able to simply use tsk->rcu_read_lock_nesting
> as an indicator of whether or not we should bother waiting on that
> task or CPU when doing synchronize_rcu?

We seem to have two variants of __rcu_read_lock(). One
increments current->rcu_read_lock_nesting, the other calls
preempt_disable().

In the case of non-preemptible RCU, we could easily also
increment current->rcu_read_lock_nesting at the same time we
increment the preempt counter, and use that as the indicator
to test whether the CPU is in an extended RCU quiescent state.
That way there would be no extra overhead at syscall entry or
exit at all. The trick would be getting the preempt count and
the RCU read lock nesting count into the same cache line for
each task.

In the case of the preemptible RCU scheme, we would have to
examine the per-task state (under the runqueue lock) to get
the current task info of all CPUs, and in addition wait for
the blkd_tasks list to empty out when doing a
synchronize_rcu(). That does not appear to require special
per-cpu counters; examining the per-cpu rdp and the lists
inside it, with the rnp->lock held if doing any list
manipulation, looks like it would be enough.

However, the current code is a lot more complicated than that.
Am I overlooking something obvious, Paul?
Maybe something non-obvious?
:)

--
All rights reversed