Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S965115AbcKWOnR (ORCPT ); Wed, 23 Nov 2016 09:43:17 -0500 Received: from mx1.redhat.com ([209.132.183.28]:59616 "EHLO mx1.redhat.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S965075AbcKWOnO (ORCPT ); Wed, 23 Nov 2016 09:43:14 -0500 Date: Wed, 23 Nov 2016 15:43:06 +0100 From: Oleg Nesterov To: Davidlohr Bueso Cc: Peter Zijlstra , mingo@kernel.org, john.stultz@linaro.org, dimitrysh@google.com, linux-kernel@vger.kernel.org, Davidlohr Bueso Subject: Re: [PATCH 3/3] locking/percpu-rwsem: Avoid unnecessary writer wakeups Message-ID: <20161123144306.GA23738@redhat.com> References: <1479495277-9075-1-git-send-email-dave@stgolabs.net> <1479495277-9075-4-git-send-email-dave@stgolabs.net> <20161121122343.GA635@redhat.com> <20161121122935.GD3092@twins.programming.kicks-ass.net> <20161121124722.GA1459@redhat.com> <20161121150722.GA7951@redhat.com> <20161122035911.GA17027@linux-80c1.suse> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20161122035911.GA17027@linux-80c1.suse> User-Agent: Mutt/1.5.18 (2008-05-17) X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.5.16 (mx1.redhat.com [10.5.110.32]); Wed, 23 Nov 2016 14:43:14 +0000 (UTC) Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Length: 2052 Lines: 55 On 11/21, Davidlohr Bueso wrote: > > On Mon, 21 Nov 2016, Oleg Nesterov wrote: > >> On 11/21, Oleg Nesterov wrote: >>> >>> No, no, I meant that afaics both readers can see per_cpu_sum() != 0 and >>> thus the writer won't be woken up. Till the next down_read/up_read. >>> >>> Suppose that we have 2 CPU's, both counters == 1, both readers decrement. >>> its counter at the same time. >>> >>> READER_ON_CPU_0 READER_ON_CPU_1 >>> >>> --ctr_0; --ctr_1; >>> >>> if (ctr_0 + ctr_1) if (ctr_0 + ctr_1) >>> wakeup(); wakeup(); >>> >>> Why we can't miss a wakeup? > > But the patch is really: if (!(ctr_0 + ctr_1)). Of course, I meant if (ctr_0 + ctr_1 == 0). >> And in fact I am not sure this optimization makes sense... But it would be >> nice to avoid wake_up() when the writer sleeps in rcu_sync_enter(). Or this >> is the "slow mode" sem (cgroup_threadgroup_rwsem). > > Why do you think using per_cpu_sum() does not make sense? As mentioned in the > changelog it optimizes for incoming readers while the writer is doing sync_enter > and getting the regular rwsem. What am I missing? And this does make sense, but see below, >> I need to re-check, but what do you think about the change below? > > While optimizing for multiple writers (rcu_sync_enter) is certainly valid > (at least considering the cgroups rwsem you mention), No, it is not for multiple writers. rcu_sync_enter() is slow, the new readers can come and acquire/release this lock. And if it is a "slow mode" sem then every up() does wakeup which we want to eliminate. But after sem->readers_block is already true, I am not sure the additional per_cpu_sum() is a win (even if it was correct), the new readers can't come. Except __percpu_down_read()->__percpu_up_read() which we want to optimize too, but in this case we do not need per_cpu_sum() too. I'll try to make a patch this week... I had this optimization in mind from the very beginning, I event mentioned it during the last discussion, but never had time. Basically we should not inc if readers_block == T. Oleg.