Subject: Re: [RFC PATCH] fs: use a sequence counter instead of file_lock in fd_install
From: Eric Dumazet
To: Al Viro
Cc: Mateusz Guzik, Andrew Morton, "Paul E. McKenney", Yann Droneaud,
    Konstantin Khlebnikov, linux-fsdevel@vger.kernel.org,
    linux-kernel@vger.kernel.org
Date: Thu, 16 Apr 2015 13:55:39 -0700

On Thu, 2015-04-16 at 13:42 -0700, Eric Dumazet wrote:
> On Thu, 2015-04-16 at 19:09 +0100, Al Viro wrote:
> > On Thu, Apr 16, 2015 at 02:16:31PM +0200, Mateusz Guzik wrote:
> > > @@ -165,8 +165,10 @@ static int expand_fdtable(struct files_struct *files, int nr)
> > >  	cur_fdt = files_fdtable(files);
> > >  	if (nr >= cur_fdt->max_fds) {
> > >  		/* Continue as planned */
> > > +		write_seqcount_begin(&files->fdt_seqcount);
> > >  		copy_fdtable(new_fdt, cur_fdt);
> > >  		rcu_assign_pointer(files->fdt, new_fdt);
> > > +		write_seqcount_end(&files->fdt_seqcount);
> > >  		if (cur_fdt != &files->fdtab)
> > >  			call_rcu(&cur_fdt->rcu, free_fdtable_rcu);
> >
> > Interesting.  AFAICS, your test doesn't step anywhere near that path,
> > does it?  So basically you never hit the retries during that...
>
> Right, but then the table is almost never changed for a given process,
> as we only increase it by power-of-two steps.
>
> (So I scratch my initial comment, fdt_seqcount is really mostly read.)

I tested Mateusz's patch with my opensock program, mimicking a bit more
what a server does (having a lot of sockets): 24 threads running, doing
close(randomfd())/socket() calls like crazy.

Before patch:

# time ./opensock
real	0m10.863s
user	0m0.954s
sys	2m43.659s

After patch:

# time ./opensock
real	0m9.750s
user	0m0.804s
sys	2m18.034s

So this is an improvement for sure, but not a massive one.

perf record ./opensock ; perf report

    87.60%  opensock  [kernel.kallsyms]  [k] _raw_spin_lock
     1.57%  opensock  [kernel.kallsyms]  [k] find_next_zero_bit
     0.50%  opensock  [kernel.kallsyms]  [k] memset_erms
     0.44%  opensock  [kernel.kallsyms]  [k] __alloc_fd
     0.44%  opensock  [kernel.kallsyms]  [k] tcp_close
     0.43%  opensock  [kernel.kallsyms]  [k] get_empty_filp
     0.43%  opensock  [kernel.kallsyms]  [k] kmem_cache_free
     0.40%  opensock  [kernel.kallsyms]  [k] free_block
     0.34%  opensock  [kernel.kallsyms]  [k] __close_fd
     0.32%  opensock  [kernel.kallsyms]  [k] sk_alloc
     0.30%  opensock  [kernel.kallsyms]  [k] _raw_spin_lock_bh
     0.24%  opensock  [kernel.kallsyms]  [k] inet_csk_destroy_sock
     0.22%  opensock  [kernel.kallsyms]  [k] kmem_cache_alloc
     0.22%  opensock  opensock           [.] __pthread_disable_asynccancel
     0.21%  opensock  [kernel.kallsyms]  [k] lockref_put_return
     0.20%  opensock  [kernel.kallsyms]  [k] filp_close

perf record -g ./opensock ; perf report --stdio

    87.80%  opensock  [kernel.kallsyms]  [k] _raw_spin_lock
            |
            --- _raw_spin_lock
               |
               |--52.70%-- __close_fd
               |          sys_close
               |          system_call_fastpath
               |          __libc_close
               |          |
               |          |--98.97%-- 0x0
               |           --1.03%-- [...]
               |
               |--46.41%-- __alloc_fd
               |          get_unused_fd_flags
               |          sock_map_fd
               |          sys_socket
               |          system_call_fastpath
               |          __socket
               |          |
               |           --100.00%-- 0x0
                --0.89%-- [...]
     1.54%  opensock  [kernel.kallsyms]  [k] find_next_zero_bit