From: Jianchao Wang
To: keith.busch@intel.com, axboe@fb.com, hch@lst.de, sagi@grimberg.me
Cc: ming.lei@redhat.com, linux-nvme@lists.infradead.org, linux-kernel@vger.kernel.org
Subject: [PATCH V3] nvme-pci: assign separate irq vectors for adminq and ioq1
Date: Tue, 13 Mar 2018 17:58:08 +0800
Message-Id: <1520935088-1343-1-git-send-email-jianchao.w.wang@oracle.com>
X-Mailer: git-send-email 2.7.4
X-Mailing-List: linux-kernel@vger.kernel.org

Currently, adminq and ioq1 share the same irq vector, whose affinity is
set to cpu0. If a system allows cpu0 to be offlined, the adminq will
not be able to work any more.

To fix this, assign separate irq vectors for adminq and ioq1. Set
.pre_vectors = 1 when allocating irq vectors, then assign the first one
to adminq, which will have an affinity cpumask covering all possible
cpus. On the other hand, if the controller has only legacy or
single-message MSI, we set up adminq and 1 ioq and let them share the
only irq vector.

Signed-off-by: Jianchao Wang
---
V2->V3
 - change changelog based on Ming's insights
 - some cleanup based on Andy's suggestions

V1->V2
 - add case to handle the scenario where there is only one irq vector
 - add nvme_ioq_vector to map ioq vector and qid

 drivers/nvme/host/pci.c | 30 +++++++++++++++++++++++-------
 1 file changed, 23 insertions(+), 7 deletions(-)

diff --git a/drivers/nvme/host/pci.c b/drivers/nvme/host/pci.c
index b6f43b7..47c33f4 100644
--- a/drivers/nvme/host/pci.c
+++ b/drivers/nvme/host/pci.c
@@ -84,6 +84,7 @@ struct nvme_dev {
 	struct dma_pool *prp_small_pool;
 	unsigned online_queues;
 	unsigned max_qid;
+	unsigned int num_vecs;
 	int q_depth;
 	u32 db_stride;
 	void __iomem *bar;
@@ -139,6 +140,17 @@ static inline struct nvme_dev *to_nvme_dev(struct nvme_ctrl *ctrl)
 	return container_of(ctrl, struct nvme_dev, ctrl);
 }
 
+static inline unsigned int nvme_ioq_vector(struct nvme_dev *dev,
+		unsigned int qid)
+{
+	/*
+	 * If controller has only legacy or single-message MSI, there will
+	 * be only 1 irq vector. At the moment, we setup adminq + 1 ioq
+	 * and let them share irq vector.
+	 */
+	return (dev->num_vecs == 1) ? 0 : qid;
+}
+
 /*
  * An NVM Express queue.  Each device has at least two (one for admin
  * commands and one for I/O commands).
@@ -1457,7 +1469,7 @@ static int nvme_create_queue(struct nvme_queue *nvmeq, int qid)
 		nvmeq->sq_cmds_io = dev->cmb + offset;
 	}
 
-	nvmeq->cq_vector = qid - 1;
+	nvmeq->cq_vector = nvme_ioq_vector(dev, qid);
 	result = adapter_alloc_cq(dev, qid, nvmeq);
 	if (result < 0)
 		goto release_vector;
@@ -1628,11 +1640,12 @@ static int nvme_create_io_queues(struct nvme_dev *dev)
 {
 	unsigned i, max;
 	int ret = 0;
+	int vec;
 
 	for (i = dev->ctrl.queue_count; i <= dev->max_qid; i++) {
-		/* vector == qid - 1, match nvme_create_queue */
+		vec = nvme_ioq_vector(dev, i);
 		if (nvme_alloc_queue(dev, i, dev->q_depth,
-		     pci_irq_get_node(to_pci_dev(dev->dev), i - 1))) {
+		     pci_irq_get_node(to_pci_dev(dev->dev), vec))) {
 			ret = -ENOMEM;
 			break;
 		}
@@ -1913,6 +1926,8 @@ static int nvme_setup_io_queues(struct nvme_dev *dev)
 	struct pci_dev *pdev = to_pci_dev(dev->dev);
 	int result, nr_io_queues;
 	unsigned long size;
+	struct irq_affinity affd = {.pre_vectors = 1};
+	int ret;
 
 	nr_io_queues = num_possible_cpus();
 	result = nvme_set_queue_count(&dev->ctrl, &nr_io_queues);
@@ -1949,11 +1964,12 @@ static int nvme_setup_io_queues(struct nvme_dev *dev)
 	 * setting up the full range we need.
 	 */
 	pci_free_irq_vectors(pdev);
-	nr_io_queues = pci_alloc_irq_vectors(pdev, 1, nr_io_queues,
-			PCI_IRQ_ALL_TYPES | PCI_IRQ_AFFINITY);
-	if (nr_io_queues <= 0)
+	ret = pci_alloc_irq_vectors_affinity(pdev, 1, (nr_io_queues + 1),
+			PCI_IRQ_ALL_TYPES | PCI_IRQ_AFFINITY, &affd);
+	if (ret <= 0)
 		return -EIO;
-	dev->max_qid = nr_io_queues;
+	dev->num_vecs = ret;
+	dev->max_qid = max(ret - 1, 1);
 
 	/*
 	 * Should investigate if there's a performance win from allocating
-- 
2.7.4
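
For anyone checking the vector arithmetic in the hunks above, here is a
minimal userspace sketch of the bookkeeping. It is not kernel code:
struct model_dev, model_setup() and model_ioq_vector() are hypothetical
names made up for illustration, and "granted" stands in for whatever
pci_alloc_irq_vectors_affinity() returned. It assumes, as the changelog
says, that the adminq takes the first vector (index 0).

```c
#include <assert.h>

/* Hypothetical stand-in for the two struct nvme_dev fields the patch uses. */
struct model_dev {
	unsigned int num_vecs;	/* irq vectors actually granted */
	unsigned int max_qid;	/* highest I/O queue id that will be created */
};

/*
 * Models the bookkeeping done after the vector allocation in
 * nvme_setup_io_queues(): one vector is reserved for the adminq, the
 * rest go to I/O queues, and at least one I/O queue always exists.
 */
static void model_setup(struct model_dev *dev, int granted)
{
	assert(granted >= 1);	/* the patch returns -EIO for ret <= 0 */
	dev->num_vecs = granted;
	dev->max_qid = (granted - 1 > 1) ? granted - 1 : 1;
}

/*
 * Models nvme_ioq_vector(): with a single vector every queue shares
 * vector 0, otherwise I/O queue qid uses vector qid and vector 0 is
 * left to the adminq.
 */
static unsigned int model_ioq_vector(const struct model_dev *dev,
				     unsigned int qid)
{
	return (dev->num_vecs == 1) ? 0 : qid;
}
```

With granted == 1 (legacy or single-message MSI) this yields max_qid 1
and ioq1 on vector 0, shared with the adminq. With granted == 5 it
yields max_qid 4 and ioq1..ioq4 on vectors 1..4, so no I/O queue lands
on the adminq's vector.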