Date: Sun, 13 Mar 2011 18:14:19 +0100
From: Andi Kleen
To: Matthew Wilcox
Cc: Andi Kleen, linux-kernel@vger.kernel.org, linux-mm@kvack.org
Subject: Re: [REVIEW] NVM Express driver
Message-ID: <20110313171419.GL2499@one.firstfloor.org>
References: <20110303204749.GY3663@linux.intel.com> <20110312055146.GA4183@linux.intel.com>
In-Reply-To: <20110312055146.GA4183@linux.intel.com>

On Sat, Mar 12, 2011 at 12:51:46AM -0500, Matthew Wilcox wrote:
> Is there a good API to iterate through each socket, then each core in a
> socket, then each HT sibling? eg, if I have 20 queues and 2x6x2 CPUs,

Not for this particular order. And you have to handle hotplug in any
case anyway.

And whatever you do, don't add NR_CPUS arrays.

> I want to assign at least one queue to each core; some threads will get
> their own queues and others will have to share with their HT sibling.

Please write a generic library function for this if you do it.

> > > +	nprps = DIV_ROUND_UP(length, PAGE_SIZE);
> > > +	npages = DIV_ROUND_UP(8 * nprps, PAGE_SIZE);
> > > +	prps = kmalloc(sizeof(*prps) + sizeof(__le64 *) * npages, GFP_ATOMIC);
> > > +	prp_page = 0;
> > > +	if (nprps <= (256 / 8)) {
> > > +		pool = dev->prp_small_pool;
> > > +		prps->npages = 0;
> >
> > Unchecked GFP_ATOMIC allocation? That will oops soon.
> > Besides, GFP_ATOMIC is a very risky thing to do in a low-memory
> > situation, which can trigger writeouts.
>
> Ah yes, thank you. There are a few other places like this. Bizarrely,
> they've not oopsed during the xfstests runs.

You need suitable background load. If you run it under LTP, the harness
has support for background load.

For GFP_ATOMIC exhaustion you typically need something interrupt
intensive, like a lot of networking.

> My plan for this is, instead of using a mempool, to submit partial I/Os
> in the rare cases where a write cannot allocate memory. I have the
> design in my head, just not committed to code yet. The design also
> avoids allocating any memory in the driver for I/Os that do not cross
> a page boundary.

I forget the latest status, but there have been a lot of improvements in
dirty-page handling since that "no memory allocation on writeout" rule
was introduced. It may not be as big a problem as it used to be with
GFP_NOFS.

Copying linux-mm in case there are deep thoughts on this there.

Just using GFP_ATOMIC is definitely still a bad idea there.

-Andi
--
ak@linux.intel.com -- Speaking for myself only.
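
For illustration, here is a minimal sketch of the kind of topology walk
discussed above: spread a set of queues over the online CPUs so that each
physical core gets a queue and HT siblings share one when queues run short.
It is not from the driver; the helper name assign_queues_to_cpus() is made
up, topology_sibling_cpumask()/for_each_online_cpu() are the current
<linux/topology.h> and <linux/cpumask.h> interfaces, the map is sized with
nr_cpu_ids rather than a static NR_CPUS array, and CPU hotplug locking is
left out.

#include <linux/cpumask.h>
#include <linux/slab.h>
#include <linux/topology.h>

/*
 * Sketch only: give every physical core a queue index, letting HT
 * siblings share the same index, and wrap around when there are more
 * cores than queues.  Returns a map indexed by CPU number.
 */
static int *assign_queues_to_cpus(unsigned int nr_queues)
{
	unsigned int cpu, sibling, queue = 0;
	cpumask_var_t seen;
	int *queue_for_cpu;

	if (!nr_queues)
		return NULL;

	queue_for_cpu = kcalloc(nr_cpu_ids, sizeof(*queue_for_cpu), GFP_KERNEL);
	if (!queue_for_cpu)
		return NULL;
	if (!zalloc_cpumask_var(&seen, GFP_KERNEL)) {
		kfree(queue_for_cpu);
		return NULL;
	}

	for_each_online_cpu(cpu) {
		if (cpumask_test_cpu(cpu, seen))
			continue;	/* core already handled via a sibling */
		for_each_cpu(sibling, topology_sibling_cpumask(cpu)) {
			queue_for_cpu[sibling] = queue;
			cpumask_set_cpu(sibling, seen);
		}
		queue = (queue + 1) % nr_queues;
	}

	free_cpumask_var(seen);
	return queue_for_cpu;
}

Walking sockets before cores, as the quoted question asks, would
additionally group CPUs by topology_physical_package_id() before handing
out indices.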
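
On the unchecked allocation: a minimal sketch of the check being asked
for, reusing the names from the quoted hunk. The error handling (here
ERR_PTR(-ENOMEM) back to the caller) is an assumption, not what the
driver does.

	nprps = DIV_ROUND_UP(length, PAGE_SIZE);
	npages = DIV_ROUND_UP(8 * nprps, PAGE_SIZE);
	prps = kmalloc(sizeof(*prps) + sizeof(__le64 *) * npages, GFP_ATOMIC);
	if (!prps)
		return ERR_PTR(-ENOMEM);	/* GFP_ATOMIC can and does fail under load */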
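
For contrast with the "submit partial I/Os" plan quoted above, a sketch of
the mempool alternative it replaces: pre-reserve a few worst-case PRP-list
buffers so a writeout can always make progress when regular allocation
fails. The names and sizes are illustrative, not from the driver, and
mempool_alloc() only guarantees forward progress from process context.

#include <linux/mempool.h>

#define PRP_LIST_WORST_CASE	512	/* assumption: big enough for the largest I/O */

static mempool_t *prp_pool;

static int prp_pool_init(void)
{
	/* keep 4 buffers in reserve for when kmalloc() cannot make progress */
	prp_pool = mempool_create_kmalloc_pool(4, PRP_LIST_WORST_CASE);
	return prp_pool ? 0 : -ENOMEM;
}

static void *prp_list_alloc(void)
{
	/* GFP_NOIO: may sleep, but never recurses into the block layer */
	return mempool_alloc(prp_pool, GFP_NOIO);
}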