Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1752103AbdINQEi convert rfc822-to-8bit (ORCPT ); Thu, 14 Sep 2017 12:04:38 -0400 Received: from mga02.intel.com ([134.134.136.20]:29393 "EHLO mga02.intel.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1751420AbdINQEd (ORCPT ); Thu, 14 Sep 2017 12:04:33 -0400 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.42,393,1500966000"; d="scan'208";a="151353922" From: "Reshetova, Elena" To: Peter Zijlstra CC: Thomas Gleixner , "linux-kernel@vger.kernel.org" , "linux-fsdevel@vger.kernel.org" , "gregkh@linuxfoundation.org" , "viro@zeniv.linux.org.uk" , "tj@kernel.org" , "mingo@redhat.com" , "hannes@cmpxchg.org" , "lizefan@huawei.com" , "acme@kernel.org" , "alexander.shishkin@linux.intel.com" , "eparis@redhat.com" , "akpm@linux-foundation.org" , "arnd@arndb.de" , "luto@kernel.org" , "keescook@chromium.org" , "dvhart@infradead.org" , "ebiederm@xmission.com" Subject: RE: [PATCH 14/15] futex: convert futex_pi_state.refcount to refcount_t Thread-Topic: [PATCH 14/15] futex: convert futex_pi_state.refcount to refcount_t Thread-Index: AQHTIYrzZIHADGwHSEu0EVkr0E9TAqKflyIAgAAhQgCAAAFJAIAAI4yQgAADB+CAAAkkgIAAGdsA///3mwCAABRlgIAASW8AgAQ1RNCAAAjqgIAQBuyw Date: Thu, 14 Sep 2017 16:02:47 +0000 Message-ID: <2236FBA76BA1254E88B949DDB74E612B6FF6A282@IRSMSX102.ger.corp.intel.com> References: <20170901093852.it4d4bxoy2lmojrk@hirez.programming.kicks-ass.net> <2236FBA76BA1254E88B949DDB74E612B6FF6347F@IRSMSX102.ger.corp.intel.com> <20170901123415.s3fxlyeyourz47av@hirez.programming.kicks-ass.net> <2236FBA76BA1254E88B949DDB74E612B6FF63506@IRSMSX102.ger.corp.intel.com> <20170901133644.jf57pwuaep6zirxz@hirez.programming.kicks-ass.net> <2236FBA76BA1254E88B949DDB74E612B6FF6369D@IRSMSX102.ger.corp.intel.com> <20170901191234.ghybbmpm73miwmkp@hirez.programming.kicks-ass.net> <2236FBA76BA1254E88B949DDB74E612B6FF63ED7@IRSMSX102.ger.corp.intel.com> <20170904120009.ah2qu3lbgdqdgz6i@hirez.programming.kicks-ass.net> In-Reply-To: <20170904120009.ah2qu3lbgdqdgz6i@hirez.programming.kicks-ass.net> Accept-Language: en-US Content-Language: en-US X-MS-Has-Attach: X-MS-TNEF-Correlator: dlp-product: dlpe-windows dlp-version: 11.0.0.116 dlp-reaction: no-action x-originating-ip: [163.33.239.181] Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: 8BIT MIME-Version: 1.0 Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Length: 3657 Lines: 94 Sorry for delayed reply. > On Mon, Sep 04, 2017 at 10:31:54AM +0000, Reshetova, Elena wrote: > > > > But can they make "fast" implementation on ARM that would give stronger > > > > memory guarantees? > > > > > > Whatever for? > > > > Well, maybe just by default when arch.-specific implementation is > > done. But I was just trying to speculate to understand. I will resend > > this one with new comment added. > > So the generic lib/refcount.c already has weak ordering. It doesn't make > sense for an arch specific implementation (on a weakly ordered machine) > to provide stronger guarantees (it would make things slower). Thank you for explaining this! Helps to understand a lot. > > The weaker ordering of the refcount_t primitives is sufficient if we're > talking pure refcounts. If for some reason code relies on stronger > ordering there _SHOULD_ be a comment with describing the additional > ordering requirements. > > But that's a fairly big 'should'. I can well imagine the comment not > being there. In fact, see below. > > > Still not sure if I need to resend the whole series with updated > > commits or break this up by individual patches further for the > > separate merges. > > I've yet to look at the ones targeted at subsystems I do, I'm forever > and terminally behind on review :/ > > I called out the issue on futex in particular because it is fairly > tricky code that. > > Now Thomas would like you to mention the fact that refcount_t doesn't > provide the exact same ordering as the atomic_t usages it replaces and > I think it would be good if you could hand-wave an argument on why the > futex code doesn't care. I think I can mention the ordering differences on all yet-to-be-merged patches to make sure maintainers are aware. The problem with concrete cases is that I don't usually have enough knowledge of code to understand for sure where it would matter or not. Previously I was even under impression that it should not matter at all for the variables that we are converting since they are classical refcounters, but your examples clearly show that it is not *always* the case (but I think it is the case for most of patches). So, I am a bit confused on how to approach this. Either just put a statement to all patches and rely that maintainers certainly know their code and can catch these things or do an analysis myself, but then I would need a bit of guidance on what is the reasonable heuristics on how check each refcounter. This goes really beyond my current kernel knowledge, but I am happy to learn if somebody points me to smth I can read/fill missing points. Best Regards, Elena. > > > Now, suppose we were to convert i_count to refcount_t (yes, I know, my > initial conversion wasn't well received), then we need to add > futex_get_inode() similar to futex_get_mm(). > > That is, smp_mb__{before,after}_atomic() works as expected and can be > used to fortify the implied barriers by refcount_t. > > --- > Subject: fs,inode: Add comment explaining additional ordering > > Add a note to ihold() to document the ordering futex relies upon. > > Signed-off-by: Peter Zijlstra (Intel) > --- > fs/inode.c | 4 ++++ > 1 file changed, 4 insertions(+) > > diff --git a/fs/inode.c b/fs/inode.c > index 50370599e371..17192ba92fef 100644 > --- a/fs/inode.c > +++ b/fs/inode.c > @@ -395,6 +395,10 @@ void __iget(struct inode *inode) > */ > void ihold(struct inode *inode) > { > + /* > + * Note: futex.c:get_futex_key_refs() relies on this function > + * implying an smp_mb(). > + */ > WARN_ON(atomic_inc_return(&inode->i_count) < 2); > } > EXPORT_SYMBOL(ihold);