Date: Tue, 29 Mar 2016 10:44:37 -0700
From: Christoph Hellwig
To: Jens Axboe
Cc: Shaohua Li, Christoph Hellwig, linux-block@vger.kernel.org, linux-kernel@vger.kernel.org, Kernel-team@fb.com
Subject: Re: [PATCH 1/3] blk-mq: add an API to estimate hardware queue node
Message-ID: <20160329174437.GA451@infradead.org>
In-Reply-To: <56FAB243.5020804@fb.com>

On Tue, Mar 29, 2016 at 10:50:11AM -0600, Jens Axboe wrote:
> > This looks weird, shouldn't the cpu assignment be determined by block
> > core (blk-mq) because block core decides how to use the queue?
>
> I agree, that belongs in the blk-mq proper, the driver should just follow
> the rules outlined, not impose their own in this regard. It'll also help
> with irq affinity mappings, once we get that in.

It's not going to work that way, unfortunately. Lots of drivers simply
have no control over the underlying interrupts. Think of any RDMA storage
or other layered drivers - they get low-level queues from a layer they
don't control and need a block queue for each of them.
My plan is to make the block layer follow what the networking layer does:
get the low-level queues / MSI-X pairs and then use the infrastructure in
lib/cpu_rmap.c to figure out the number of queues and the queue placement
for them. The lower half of that is about to get rewritten by Thomas,
after discussion between him, me and a few others, to provide drivers
with nice APIs for spreading MSI-X vectors over CPUs or nodes.