Message-ID: <1429548382.7346.261.camel@edumazet-glaptop2.roam.corp.google.com>
Subject: Re: [RFC PATCH] fs: use a sequence counter instead of file_lock in fd_install
From: Eric Dumazet
To: Mateusz Guzik
Cc: Al Viro, Andrew Morton, "Paul E. McKenney", Yann Droneaud, Konstantin Khlebnikov, linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org
Date: Mon, 20 Apr 2015 09:46:22 -0700
In-Reply-To: <20150420134103.GB2513@mguzik>
References: <20150416121628.GA20615@mguzik> <1429307216.7346.255.camel@edumazet-glaptop2.roam.corp.google.com> <20150417221646.GA15589@mguzik> <20150417230252.GE889@ZenIV.linux.org.uk> <1429386098.7346.260.camel@edumazet-glaptop2.roam.corp.google.com> <20150420134103.GB2513@mguzik>

On Mon, 2015-04-20 at 15:41 +0200, Mateusz Guzik wrote:
> On Sat, Apr 18, 2015 at 12:41:38PM -0700, Eric Dumazet wrote:
> > On Sat, 2015-04-18 at 00:02 +0100, Al Viro wrote:
> > > On Sat, Apr 18, 2015 at 12:16:48AM +0200, Mateusz Guzik wrote:
> > > >
> > > > I would say this makes the use of seq counter impossible. Even if we
> > > > decided to fall back to a lock on retry, we cannot know what to do if
> > > > the slot is reserved - it very well could be that something called
> > > > close, and something else reserved the slot, so putting the file inside
> > > > could be really bad.
> > > > In fact we would be putting a file for which we
> > > > don't have a reference anymore.
> > > >
> > > > However, not all hope is lost and I still think we can speed things up.
> > > >
> > > > A locking primitive which only locks stuff for current cpu and has
> > > > another mode where it locks stuff for all cpus would do the trick just
> > > > fine. I'm not a linux guy, quick search suggests 'lglock' would do what
> > > > I want.
> > > >
> > > > table reallocation is an extremely rare operation, so this should be
> > > > fine. It would take the lock 'globally' for given table.
> > >
> > > It would also mean percpu_alloc() for each descriptor table...
> >
> > I would rather use an xchg() instead of rcu_assign_pointer()
> >
> > old = xchg(&fdt->fd[fd], file);
> > if (unlikely(old))
> >         filp_close(old, files);
> >
> > If threads are using close() on random fds, final result is not
> > guaranteed anyway.
> >
>
> Well I don't see how this could be used to fix the problem.
>
> If you are retrying and see NULL, you don't know whether your previous
> update was not picked up by memcpy OR the fd got closed, which also
> unreferenced the file you are installing. But you can't tell what
> happened.
>
> If you see non-NULL and what you found is not the file you are
> installing, you know the file was freed so you can't close the old file.
>
> One could try to introduce an invariant that files installed in a
> lockless manner have to start with refcount 1, you still can't infer
> anything from the fact that the counter is 1 when you retry (even if you
> take the lock). It could have been duped, or even sent over a unix
> socket and closed (although that would surely require a solid pause in
> execution) and who knows what else.
>
> In general I would say this approach is too hard to get right to be
> worthwhile given expected speedup.
>

Hey, that's because I really meant (during the week end)
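
The ambiguity raised in the quoted objection can be modelled in user space.
The sketch below is not kernel code and not a patch from this thread: struct
file, fd_slot and all helper names are hypothetical stand-ins, and C11
atomic_exchange() stands in for the kernel's xchg(). It only illustrates that
a retry which finds NULL in the slot cannot tell a missed table copy from a
concurrent close().

```c
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical stand-in for struct file; only a reference count is modelled. */
struct file {
    int refs;
};

/* Hypothetical stand-in for one entry of fdt->fd[]. */
typedef _Atomic(struct file *) fd_slot;

/* Models "old = xchg(&fdt->fd[fd], file)": store f, return the old occupant. */
struct file *install_xchg(fd_slot *slot, struct file *f)
{
    return atomic_exchange(slot, f);
}

/* Models close(): empty the slot and drop the reference the table held. */
void close_slot(fd_slot *slot)
{
    struct file *f = atomic_exchange(slot, NULL);

    if (f)
        f->refs--;
}

/*
 * Case A: the table was copied before the install became visible, so the
 * install landed in the old table only. Returns what a retry would observe
 * in the new table.
 */
struct file *retry_after_missed_copy(struct file *f)
{
    fd_slot old_tbl = NULL;
    fd_slot new_tbl = NULL;     /* copy taken while the slot was still empty */

    install_xchg(&old_tbl, f);
    return atomic_load(&new_tbl);
}

/*
 * Case B: the install was visible, then another thread closed the fd and
 * dropped the reference. Returns what a retry would observe in the table.
 */
struct file *retry_after_close(struct file *f)
{
    fd_slot tbl = NULL;

    install_xchg(&tbl, f);
    close_slot(&tbl);
    return atomic_load(&tbl);
}
```

With files that start at refs == 1, both helpers return NULL, yet only case B
has dropped the reference. The retrying thread sees the same value in both
cases, so it cannot know whether installing the file again is safe.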