From: Andrew Morton
Subject: Re: [PATCH 010 of 11] knfsd: make rpc threads pools numa aware
Date: Mon, 31 Jul 2006 21:43:28 -0700
Message-ID: <20060731214328.5770f1a5.akpm@osdl.org>
References: <20060731103458.29040.patches@notabene>
	<1060731004234.29291@suse.de>
	<20060730211454.ccf803f3.akpm@osdl.org>
	<17613.35001.745409.144623@cse.unsw.edu.au>
	<1154320957.21040.1836.camel@hole.melbourne.sgi.com>
	<1154325296.21040.1850.camel@hole.melbourne.sgi.com>
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Cc: neilb@suse.de, nfs@lists.sourceforge.net, linux-kernel@vger.kernel.org
To: Greg Banks
In-Reply-To: <1154325296.21040.1850.camel@hole.melbourne.sgi.com>

On Mon, 31 Jul 2006 15:54:57 +1000
Greg Banks wrote:

> On Mon, 2006-07-31 at 14:42, Greg Banks wrote:
> > On Mon, 2006-07-31 at 14:36, Neil Brown wrote:
> > > On Sunday July 30, akpm@osdl.org wrote:
> > > > On Mon, 31 Jul 2006 10:42:34 +1000
> > > > NeilBrown wrote:
> > > >
> > > > > +static int
> > > > > +svc_pool_map_init_percpu(struct svc_pool_map *m)
> > > > > +{
> > > > > +	unsigned int maxpools = num_possible_cpus();
> > > > > +	unsigned int pidx = 0;
> > > > > +	unsigned int cpu;
> > > > > +	int err;
> > > > > +
> > > > > +
> > > >
> > > > That isn't right - it assumes that cpu_possible_map is not sparse.
> > > > If it is sparse, we allocate undersized pools and then overindex them.
> >
> > Umm, I think Andrew's right, num_possible_cpus() should be NR_CPUS.
>
> How about this version of the patch?  It replaces num_possible_cpus()
> with highest_possible_processor_id()+1 and similarly for nodes.
> --
>
> knfsd: Actually implement multiple pools.  On NUMA machines, allocate
> a svc_pool per NUMA node; on SMP a svc_pool per CPU; otherwise a single
> global pool.  Enqueue sockets on the svc_pool corresponding to the CPU
> on which the socket bh is run (i.e. the NIC interrupt CPU).  Threads
> have their cpu mask set to limit them to the CPUs in the svc_pool that
> owns them.
>
> This is the patch that allows an Altix to scale NFS traffic linearly
> beyond 4 CPUs and 4 NICs.
>
> Incorporates changes and feedback from Neil Brown, Trond Myklebust,
> Christoph Hellwig and Andrew Morton.
>

Something has gone rather wrong here.

> -	serv = __svc_create(prog, bufsize, shutdown, /*npools*/1);
> +	serv = __svc_create(prog, bufsize, shutdown, npools);

__svc_create() is:

__svc_create(struct svc_program *prog, unsigned int bufsize, int npools,
	   void (*shutdown)(struct svc_serv *serv))

so heaven knows what tree you're patching.

Incremental patches really are preferred.
So we can see what people are monkeying with ;)

After fixing the rejects and cleaning a few things up, your proposed change
amounts to:

--- a/net/sunrpc/svc.c~knfsd-make-rpc-threads-pools-numa-aware-fix
+++ a/net/sunrpc/svc.c
@@ -116,7 +116,7 @@ fail:
 static int
 svc_pool_map_init_percpu(struct svc_pool_map *m)
 {
-	unsigned int maxpools = num_possible_cpus();
+	unsigned int maxpools = highest_possible_processor_id() + 1;
 	unsigned int pidx = 0;
 	unsigned int cpu;
 	int err;
@@ -136,6 +136,18 @@ svc_pool_map_init_percpu(struct svc_pool
 	return pidx;
 };
 
+static int
+highest_possible_node_id(void)
+{
+	unsigned int node;
+	unsigned int highest = 0;
+
+	for_each_node(node)
+		highest = node;
+
+	return highest;
+}
+
 
 /*
  * Initialise the pool map for SVC_POOL_PERNODE mode.
@@ -144,7 +156,7 @@ svc_pool_map_init_percpu(struct svc_pool
 static int
 svc_pool_map_init_pernode(struct svc_pool_map *m)
 {
-	unsigned int maxpools = num_possible_nodes();
+	unsigned int maxpools = highest_possible_node_id() + 1;
 	unsigned int pidx = 0;
 	unsigned int node;
 	int err;
_

Which shouldn't have compiled, due to the missing forward declaration.  And
I'd be surprised if it worked very well with CONFIG_NUMA=n.  And it's naughty
to be sneaking general library functions into the sunrpc code anyway.

Please,

- Write a standalone patch which adds highest_possible_node_id() to
  lib/cpumask.c(?)

  Make sure it's inside #ifdef CONFIG_NUMA

  Remember to export it to modules.

  Provide a !CONFIG_NUMA version in include/linux/nodemask.h which just
  returns constant zero.

  Consider doing something more efficient than the for_each_node() loop.
  Although I'm not sure what that would be, given that we don't have
  find_last_bit().

- Provide an incremental patch against
  knfsd-make-rpc-threads-pools-numa-aware.patch which utilises
  highest_possible_node_id().

  A replacement patch will be grudgingly accepted, but I'll only go and
  turn it into an incremental one, so you can't hide ;)

- Test it real good.
  Modular, non-modular, NUMA, non-NUMA, !SMP.

Thanks.
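
A minimal userspace sketch of the sparse-map problem raised in this thread, and of one possible way to find the highest set id without visiting every bit. It is an illustration under stated assumptions, not kernel code: struct id_mask, MAX_IDS, mask_set(), mask_weight(), mask_highest() and self_check() are hypothetical stand-ins for a cpumask/nodemask and its helpers, and __builtin_popcountl()/__builtin_clzl() assume GCC or Clang.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Hypothetical userspace stand-in for a kernel cpumask/nodemask. */
#define BITS_PER_WORD (sizeof(unsigned long) * CHAR_BIT)
#define MAX_IDS 128
#define MASK_WORDS ((MAX_IDS + BITS_PER_WORD - 1) / BITS_PER_WORD)

struct id_mask {
	unsigned long bits[MASK_WORDS];
};

/* Mark one id as possible. The caller must keep id < MAX_IDS. */
static void mask_set(struct id_mask *m, unsigned int id)
{
	m->bits[id / BITS_PER_WORD] |= 1UL << (id % BITS_PER_WORD);
}

/* Count of set bits: the kind of value num_possible_cpus() reports. */
static unsigned int mask_weight(const struct id_mask *m)
{
	unsigned int weight = 0;
	size_t i;

	for (i = 0; i < MASK_WORDS; i++)
		weight += (unsigned int)__builtin_popcountl(m->bits[i]);
	return weight;
}

/*
 * Highest set id, scanning whole words from the top down.
 * Returns 0 for an empty mask, like the loop in the patch above,
 * which starts from highest = 0.
 */
static unsigned int mask_highest(const struct id_mask *m)
{
	size_t i = MASK_WORDS;

	while (i-- > 0) {
		unsigned long w = m->bits[i];

		if (w) {
			size_t top = BITS_PER_WORD - 1 - (size_t)__builtin_clzl(w);

			return (unsigned int)(i * BITS_PER_WORD + top);
		}
	}
	return 0;
}

/* Sparse example: ids 0, 2 and 5 are possible. */
static void self_check(void)
{
	struct id_mask m = { {0} };

	mask_set(&m, 0);
	mask_set(&m, 2);
	mask_set(&m, 5);

	/* Sizing by weight gives 3 slots, yet index 5 is valid. */
	assert(mask_weight(&m) == 3);
	/* Sizing by highest id + 1 gives 6 slots, enough for index 5. */
	assert(mask_highest(&m) + 1 == 6);
}
```

With ids 0, 2 and 5 possible, an array sized by the bit count has 3 entries while index 5 is still valid, which is the overindexing case described above; sizing by highest id + 1 gives 6 entries. The word-wise scan touches at most MASK_WORDS words rather than every bit, which is one candidate for the "more efficient than the for_each_node() loop" request.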