Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1752391AbbDPWfk (ORCPT ); Thu, 16 Apr 2015 18:35:40 -0400 Received: from mx1.redhat.com ([209.132.183.28]:36855 "EHLO mx1.redhat.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1751277AbbDPWf3 (ORCPT ); Thu, 16 Apr 2015 18:35:29 -0400 Date: Fri, 17 Apr 2015 00:35:09 +0200 From: Mateusz Guzik To: Al Viro Cc: Andrew Morton , "Paul E. McKenney" , Yann Droneaud , Konstantin Khlebnikov , linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org Subject: Re: [RFC PATCH] fs: use a sequence counter instead of file_lock in fd_install Message-ID: <20150416223507.GC20615@mguzik> References: <20150416121628.GA20615@mguzik> <20150416180932.GW889@ZenIV.linux.org.uk> MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Disposition: inline In-Reply-To: <20150416180932.GW889@ZenIV.linux.org.uk> User-Agent: Mutt/1.5.23.1-rc1 (2014-03-12) Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Length: 2274 Lines: 54 On Thu, Apr 16, 2015 at 07:09:32PM +0100, Al Viro wrote: > On Thu, Apr 16, 2015 at 02:16:31PM +0200, Mateusz Guzik wrote: > > @@ -165,8 +165,10 @@ static int expand_fdtable(struct files_struct *files, int nr) > > cur_fdt = files_fdtable(files); > > if (nr >= cur_fdt->max_fds) { > > /* Continue as planned */ > > + write_seqcount_begin(&files->fdt_seqcount); > > copy_fdtable(new_fdt, cur_fdt); > > rcu_assign_pointer(files->fdt, new_fdt); > > + write_seqcount_end(&files->fdt_seqcount); > > if (cur_fdt != &files->fdtab) > > call_rcu(&cur_fdt->rcu, free_fdtable_rcu); > > Interesting. AFAICS, your test doesn't step anywhere near that path, > does it? So basically you never hit the retries during that... well, yeah. In fact for non-shared tables one could go a step further and just plop the pointer in, but I don't know if that makes much sense. Other processes inspecting the table could get away with a data dependency barrier. Closing would still take the lock, so you can only suddenly see filp installed, but never one going away. Now, as far as correctness goes, I think there is a bug in the patch (which does not invalidate the idea though). Chances are I got a fix as well. Benchmark prog is here: http://people.redhat.com/~mguzik/pipebench.c A modified version: http://people.redhat.com/~mguzik/fdi-fail.c Benchmark is just doing pipe + close in a loop in multiple threads. Modified version spawns threads, sleeps 100 ms and does dup(0, 300) to reallocate the table while other threads continue the work. This succesfully tested retries (along with cases where installed file got copied and was encountered during retry). However, I see sporadic close failures. I presume this is because of a missing read barrier after write_seqcount_begin. Adding a smp_mb() seems to solve the problem, but I could only test on 2 * 16 Intel(R) Xeon(R) CPU E5-2660 0 @ 2.20GHz. My memory barrier-fu is rather weak and I'm not that confident in my crap suspicion here. Thoughts? -- Mateusz Guzik -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/