Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1030478AbbD1QVF (ORCPT ); Tue, 28 Apr 2015 12:21:05 -0400 Received: from mail-ig0-f170.google.com ([209.85.213.170]:35166 "EHLO mail-ig0-f170.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1030289AbbD1QU7 (ORCPT ); Tue, 28 Apr 2015 12:20:59 -0400 Message-ID: <1430238055.28069.12.camel@edumazet-glaptop2.roam.corp.google.com> Subject: Re: [PATCH] fs/file.c: don't acquire files->file_lock in fd_install() From: Eric Dumazet To: Mateusz Guzik Cc: Al Viro , Andrew Morton , "Paul E. McKenney" , Yann Droneaud , Konstantin Khlebnikov , linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org Date: Tue, 28 Apr 2015 09:20:55 -0700 In-Reply-To: <20150427190515.GC2588@mguzik> References: <20150420130633.GA2513@mguzik> <20150420134326.GC2513@mguzik> <20150420151054.GD2513@mguzik> <1429550126.7346.268.camel@edumazet-glaptop2.roam.corp.google.com> <1429562991.7346.290.camel@edumazet-glaptop2.roam.corp.google.com> <1429639543.7346.329.camel@edumazet-glaptop2.roam.corp.google.com> <20150421200624.GA16097@mguzik> <20150421201201.GB16097@mguzik> <1429650413.18561.10.camel@edumazet-glaptop2.roam.corp.google.com> <1429678768.18561.64.camel@edumazet-glaptop2.roam.corp.google.com> <20150427190515.GC2588@mguzik> Content-Type: text/plain; charset="UTF-8" X-Mailer: Evolution 3.10.4-0ubuntu2 Mime-Version: 1.0 Content-Transfer-Encoding: 7bit Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Length: 2658 Lines: 81 On Mon, 2015-04-27 at 21:05 +0200, Mateusz Guzik wrote: > On Tue, Apr 21, 2015 at 09:59:28PM -0700, Eric Dumazet wrote: > > From: Eric Dumazet > > > > Mateusz Guzik reported : > > > > Currently obtaining a new file descriptor results in locking fdtable > > twice - once in order to reserve a slot and second time to fill it. > > > > Holding the spinlock in __fd_install() is needed in case a resize is > > done, or to prevent a resize. > > > > Mateusz provided an RFC patch and a micro benchmark : > > http://people.redhat.com/~mguzik/pipebench.c > > > > A resize is an unlikely operation in a process lifetime, > > as table size is at least doubled at every resize. > > > > We can use RCU instead of the spinlock : > > > > __fd_install() must wait if a resize is in progress. > > > > The resize must block new __fd_install() callers from starting, > > and wait that ongoing install are finished (synchronize_sched()) > > > > resize should be attempted by a single thread to not waste resources. > > > > rcu_sched variant is used, as __fd_install() and expand_fdtable() run > > from process context. > > > > It gives us a ~30% speedup using pipebench with 16 threads, and a ~10% > > speedup with another program using TCP sockets. > > > > Signed-off-by: Eric Dumazet > > Reported-by: Mateusz Guzik > > Reviewed-by: Mateusz Guzik > Tested-by: Mateusz Guzik > Thanks Mateusz I'll send a v2, replacing the READ_ONCE() by more appropriate rcu_dereference_sched() : > > new_fdt->max_fds = NR_OPEN_DEFAULT; > > @@ -553,11 +572,20 @@ void __fd_install(struct files_struct *files, unsigned int fd, > > struct file *file) > > { > > struct fdtable *fdt; > > - spin_lock(&files->file_lock); > > - fdt = files_fdtable(files); > > + > > + rcu_read_lock_sched(); > > + > > + while (unlikely(files->resize_in_progress)) { > > + rcu_read_unlock_sched(); > > + wait_event(files->resize_wait, !files->resize_in_progress); > > + rcu_read_lock_sched(); > > + } > > + /* coupled with smp_wmb() in expand_fdtable() */ > > + smp_rmb(); > > + fdt = READ_ONCE(files->fdt); This should be : fdt = rcu_dereference_sched(files->fdt); > > BUG_ON(fdt->fd[fd] != NULL); > > rcu_assign_pointer(fdt->fd[fd], file); > > - spin_unlock(&files->file_lock); > > + rcu_read_unlock_sched(); > > } > > -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/