Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
        id S1755972AbdCWO7L (ORCPT ); Thu, 23 Mar 2017 10:59:11 -0400
Received: from mx1.redhat.com ([209.132.183.28]:34616 "EHLO mx1.redhat.com"
        rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP
        id S1751444AbdCWO7J (ORCPT ); Thu, 23 Mar 2017 10:59:09 -0400
DMARC-Filter: OpenDMARC Filter v1.3.2 mx1.redhat.com CC33370027
Authentication-Results: ext-mx04.extmail.prod.ext.phx2.redhat.com; dmarc=none (p=none dis=none) header.from=redhat.com
Authentication-Results: ext-mx04.extmail.prod.ext.phx2.redhat.com; spf=pass smtp.mailfrom=rjones@redhat.com
DKIM-Filter: OpenDKIM Filter v2.11.0 mx1.redhat.com CC33370027
Date: Thu, 23 Mar 2017 14:59:03 +0000
From: "Richard W.M. Jones"
To: Thorsten Leemhuis
Cc: mst@redhat.com, hch@lst.de, virtio-dev@lists.oasis-open.org,
        Linux Kernel Mailing List
Subject: Re: Random guest crashes since 5c34d002dcc7 ("virtio_pci: use shared interrupts for virtqueues")
Message-ID: <20170323145903.GE30978@redhat.com>
References:
MIME-Version: 1.0
Content-Type: multipart/mixed; boundary="Bn2rw/3z4jIqBvZU"
Content-Disposition: inline
In-Reply-To:
User-Agent: Mutt/1.5.20 (2009-12-10)
X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.5.16 (mx1.redhat.com [10.5.110.28]); Thu, 23 Mar 2017 14:59:09 +0000 (UTC)
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org
Content-Length: 56027
Lines: 1713

--Bn2rw/3z4jIqBvZU
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline

On Thu, Mar 23, 2017 at 03:51:25PM +0100, Thorsten Leemhuis wrote:
> Hi Christoph! Hi Michael!
>
> (Mail roughly based on text from
> https://bugzilla.kernel.org/show_bug.cgi?id=194911 )
>
> I'm seeing random crashes during boot every few boot attempts when
> running Linux 4.11-rc/mainline in a Fedora 26 guest under a CentOS7 host
> (CPU: Intel(R) Pentium(R) CPU G3220) using KVM.
Sometimes when the guest
> actually booted the network did not work. To get some impressions of the
> crashes I got see this gallery:
> https://plus.google.com/+ThorstenLeemhuis/posts/FjyyGjNtrrG
>
> Richard W.M. Jones and Adam Williamson see the same problems. See above
> bug for details. It seems they ran into the problem in the past few
> days, so I assume it's still present in mainline (I'm travelling
> currently and haven't had time for proper tests since last last Friday
> (pre-rc3); but I thought it's time to get the problem to the lists).
>
> Long story short: Richard and I did bisections and we both found that
> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=07ec51480b5e
> ("virtio_pci: use shared interrupts for virtqueues") is the first bad
> commit. Any idea what might be wrong? Do you need more details from us
> to fix this?

Laura Abbott posted a kernel RPM which works for me. She has had to
revert quite a number of commits, which are detailed in this comment:

https://bugzilla.redhat.com/show_bug.cgi?id=1430297#c7

Her reverting patch is also attached.

Rich.

-- 
Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones
Read my programming and virtualization blog: http://rwmj.wordpress.com
virt-builder quickly builds VMs from scratch
http://libguestfs.org/virt-builder.1.html

--Bn2rw/3z4jIqBvZU
Content-Type: text/plain; charset=us-ascii
Content-Disposition: attachment; filename="virtio_revert.patch"

>From 4d3cba0be27b20516eb765c2913bce93e73fe30e Mon Sep 17 00:00:00 2001
From: Laura Abbott
Date: Wed, 22 Mar 2017 15:41:27 -0700
Subject: [PATCH] Revert a bunch of virtio commits

07ec51480b5e ("virtio_pci: use shared interrupts for virtqueues") is
linked to a bunch of issues. Unfortunately we can't just revert it by
itself. Revert it and dependency patches as well.

Revert "virtio: provide a method to get the IRQ affinity mask for a virtqueue"

This reverts commit bbaba479563910aaa51e59bb9027a09e396d3a3c.
Revert "virtio-console: avoid DMA from stack"

This reverts commit c4baad50297d84bde1a7ad45e50c73adae4a2192.

Revert "vhost: introduce O(1) vq metadata cache"

This reverts commit f889491380582b4ba2981cf0b0d7d6a40fb30ab7.

Conflicts:
	drivers/vhost/vhost.c

Revert "virtio_scsi: use virtio IRQ affinity"

This reverts commit 0d9f0a52c8b9f7a003fe1650b7d5fb8518efabe0.

Revert "virtio_blk: use virtio IRQ affinity"

This reverts commit ad71473d9c43725c917fc5a86d54ceb7001ee28c.

Revert "blk-mq: provide a default queue mapping for virtio device"

This reverts commit 73473427bb551686e4b68ecd99bfd27e6635286a.

Revert "virtio: allow drivers to request IRQ affinity when creating VQs"

This reverts commit fb5e31d970ce8b4941f03ed765d7dbefc39f22d9.

Revert "virtio_pci: simplify MSI-X setup"

This reverts commit 52a61516125fa9a21b3bdf4f90928308e2e5573f.

Revert "virtio_pci: don't duplicate the msix_enable flag in struct pci_dev"

This reverts commit 53a020c661741f3b87ad3ac6fa545088aaebac9b.

Revert "virtio_pci: use shared interrupts for virtqueues"

This reverts commit 07ec51480b5eb1233f8c1b0f5d7a7c8d1247c507.
---
 block/Kconfig                              |   5 -
 block/Makefile                             |   1 -
 block/blk-mq-virtio.c                      |  54 ------
 drivers/block/virtio_blk.c                 |  14 +-
 drivers/char/virtio_console.c              |  14 +-
 drivers/crypto/virtio/virtio_crypto_core.c |   2 +-
 drivers/gpu/drm/virtio/virtgpu_kms.c       |   2 +-
 drivers/misc/mic/vop/vop_main.c            |   2 +-
 drivers/net/caif/caif_virtio.c             |   3 +-
 drivers/net/virtio_net.c                   |   2 +-
 drivers/remoteproc/remoteproc_virtio.c     |   3 +-
 drivers/rpmsg/virtio_rpmsg_bus.c           |   2 +-
 drivers/s390/virtio/kvm_virtio.c           |   3 +-
 drivers/s390/virtio/virtio_ccw.c           |   3 +-
 drivers/scsi/virtio_scsi.c                 | 127 +++++++++++--
 drivers/vhost/vhost.c                      | 136 +++-----------
 drivers/vhost/vhost.h                      |   8 -
 drivers/virtio/virtio_balloon.c            |   3 +-
 drivers/virtio/virtio_input.c              |   3 +-
 drivers/virtio/virtio_mmio.c               |   3 +-
 drivers/virtio/virtio_pci_common.c         | 287 +++++++++++++++--------------
 drivers/virtio/virtio_pci_common.h         |  25 ++-
 drivers/virtio/virtio_pci_legacy.c         |   3 +-
 drivers/virtio/virtio_pci_modern.c         |  11 +-
 include/linux/blk-mq-virtio.h              |  10 -
 include/linux/cpuhotplug.h                 |   1 +
 include/linux/virtio_config.h              |  12 +-
 include/uapi/linux/virtio_pci.h            |   2 +-
 net/vmw_vsock/virtio_transport.c           |   3 +-
 29 files changed, 337 insertions(+), 407 deletions(-)
 delete mode 100644 block/blk-mq-virtio.c
 delete mode 100644 include/linux/blk-mq-virtio.h

diff --git a/block/Kconfig b/block/Kconfig
index e9f780f..a2a92e5 100644
--- a/block/Kconfig
+++ b/block/Kconfig
@@ -189,9 +189,4 @@ config BLK_MQ_PCI
 	depends on BLOCK && PCI
 	default y
 
-config BLK_MQ_VIRTIO
-	bool
-	depends on BLOCK && VIRTIO
-	default y
-
 source block/Kconfig.iosched
diff --git a/block/Makefile b/block/Makefile
index 081bb68..2ad7c30 100644
--- a/block/Makefile
+++ b/block/Makefile
@@ -25,7 +25,6 @@ obj-$(CONFIG_BLOCK_COMPAT) += compat_ioctl.o
 obj-$(CONFIG_BLK_CMDLINE_PARSER) += cmdline-parser.o
 obj-$(CONFIG_BLK_DEV_INTEGRITY) += bio-integrity.o blk-integrity.o t10-pi.o
 obj-$(CONFIG_BLK_MQ_PCI) += blk-mq-pci.o
-obj-$(CONFIG_BLK_MQ_VIRTIO) += blk-mq-virtio.o
 obj-$(CONFIG_BLK_DEV_ZONED) +=
blk-zoned.o obj-$(CONFIG_BLK_WBT) += blk-wbt.o obj-$(CONFIG_BLK_DEBUG_FS) += blk-mq-debugfs.o diff --git a/block/blk-mq-virtio.c b/block/blk-mq-virtio.c deleted file mode 100644 index c3afbca..0000000 --- a/block/blk-mq-virtio.c +++ /dev/null @@ -1,54 +0,0 @@ -/* - * Copyright (c) 2016 Christoph Hellwig. - * - * This program is free software; you can redistribute it and/or modify it - * under the terms and conditions of the GNU General Public License, - * version 2, as published by the Free Software Foundation. - * - * This program is distributed in the hope it will be useful, but WITHOUT - * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or - * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for - * more details. - */ -#include -#include -#include -#include -#include -#include "blk-mq.h" - -/** - * blk_mq_virtio_map_queues - provide a default queue mapping for virtio device - * @set: tagset to provide the mapping for - * @vdev: virtio device associated with @set. - * @first_vec: first interrupt vectors to use for queues (usually 0) - * - * This function assumes the virtio device @vdev has at least as many available - * interrupt vetors as @set has queues. It will then queuery the vector - * corresponding to each queue for it's affinity mask and built queue mapping - * that maps a queue to the CPUs that have irq affinity for the corresponding - * vector. 
- */ -int blk_mq_virtio_map_queues(struct blk_mq_tag_set *set, - struct virtio_device *vdev, int first_vec) -{ - const struct cpumask *mask; - unsigned int queue, cpu; - - if (!vdev->config->get_vq_affinity) - goto fallback; - - for (queue = 0; queue < set->nr_hw_queues; queue++) { - mask = vdev->config->get_vq_affinity(vdev, first_vec + queue); - if (!mask) - goto fallback; - - for_each_cpu(cpu, mask) - set->mq_map[cpu] = queue; - } - - return 0; -fallback: - return blk_mq_map_queues(set); -} -EXPORT_SYMBOL_GPL(blk_mq_virtio_map_queues); diff --git a/drivers/block/virtio_blk.c b/drivers/block/virtio_blk.c index 1d4c9f8..024b473 100644 --- a/drivers/block/virtio_blk.c +++ b/drivers/block/virtio_blk.c @@ -5,7 +5,6 @@ #include #include #include -#include #include #include #include @@ -13,7 +12,6 @@ #include #include #include -#include #include #define PART_BITS 4 @@ -428,7 +426,6 @@ static int init_vq(struct virtio_blk *vblk) struct virtqueue **vqs; unsigned short num_vqs; struct virtio_device *vdev = vblk->vdev; - struct irq_affinity desc = { 0, }; err = virtio_cread_feature(vdev, VIRTIO_BLK_F_MQ, struct virtio_blk_config, num_queues, @@ -455,8 +452,7 @@ static int init_vq(struct virtio_blk *vblk) } /* Discover virtqueues and write information to configuration. 
*/ - err = vdev->config->find_vqs(vdev, num_vqs, vqs, callbacks, names, - &desc); + err = vdev->config->find_vqs(vdev, num_vqs, vqs, callbacks, names); if (err) goto out; @@ -590,18 +586,10 @@ static int virtblk_init_request(void *data, struct request *rq, return 0; } -static int virtblk_map_queues(struct blk_mq_tag_set *set) -{ - struct virtio_blk *vblk = set->driver_data; - - return blk_mq_virtio_map_queues(set, vblk->vdev, 0); -} - static struct blk_mq_ops virtio_mq_ops = { .queue_rq = virtio_queue_rq, .complete = virtblk_request_done, .init_request = virtblk_init_request, - .map_queues = virtblk_map_queues, }; static unsigned int virtblk_queue_depth; diff --git a/drivers/char/virtio_console.c b/drivers/char/virtio_console.c index e9b7e0b..17857be 100644 --- a/drivers/char/virtio_console.c +++ b/drivers/char/virtio_console.c @@ -1136,8 +1136,6 @@ static int put_chars(u32 vtermno, const char *buf, int count) { struct port *port; struct scatterlist sg[1]; - void *data; - int ret; if (unlikely(early_put_chars)) return early_put_chars(vtermno, buf, count); @@ -1146,14 +1144,8 @@ static int put_chars(u32 vtermno, const char *buf, int count) if (!port) return -EPIPE; - data = kmemdup(buf, count, GFP_ATOMIC); - if (!data) - return -ENOMEM; - - sg_init_one(sg, data, count); - ret = __send_to_port(port, sg, 1, count, data, false); - kfree(data); - return ret; + sg_init_one(sg, buf, count); + return __send_to_port(port, sg, 1, count, (void *)buf, false); } /* @@ -1947,7 +1939,7 @@ static int init_vqs(struct ports_device *portdev) /* Find the queues. 
*/ err = portdev->vdev->config->find_vqs(portdev->vdev, nr_queues, vqs, io_callbacks, - (const char **)io_names, NULL); + (const char **)io_names); if (err) goto free; diff --git a/drivers/crypto/virtio/virtio_crypto_core.c b/drivers/crypto/virtio/virtio_crypto_core.c index 21472e4..b5b1533 100644 --- a/drivers/crypto/virtio/virtio_crypto_core.c +++ b/drivers/crypto/virtio/virtio_crypto_core.c @@ -120,7 +120,7 @@ static int virtcrypto_find_vqs(struct virtio_crypto *vi) } ret = vi->vdev->config->find_vqs(vi->vdev, total_vqs, vqs, callbacks, - names, NULL); + names); if (ret) goto err_find; diff --git a/drivers/gpu/drm/virtio/virtgpu_kms.c b/drivers/gpu/drm/virtio/virtgpu_kms.c index 4918668..30f989a 100644 --- a/drivers/gpu/drm/virtio/virtgpu_kms.c +++ b/drivers/gpu/drm/virtio/virtgpu_kms.c @@ -176,7 +176,7 @@ int virtio_gpu_driver_load(struct drm_device *dev, unsigned long flags) #endif ret = vgdev->vdev->config->find_vqs(vgdev->vdev, 2, vqs, - callbacks, names, NULL); + callbacks, names); if (ret) { DRM_ERROR("failed to find virt queues\n"); goto err_vqs; diff --git a/drivers/misc/mic/vop/vop_main.c b/drivers/misc/mic/vop/vop_main.c index c2e29d7..1a2b67f3 100644 --- a/drivers/misc/mic/vop/vop_main.c +++ b/drivers/misc/mic/vop/vop_main.c @@ -374,7 +374,7 @@ static struct virtqueue *vop_find_vq(struct virtio_device *dev, static int vop_find_vqs(struct virtio_device *dev, unsigned nvqs, struct virtqueue *vqs[], vq_callback_t *callbacks[], - const char * const names[], struct irq_affinity *desc) + const char * const names[]) { struct _vop_vdev *vdev = to_vopvdev(dev); struct vop_device *vpdev = vdev->vpdev; diff --git a/drivers/net/caif/caif_virtio.c b/drivers/net/caif/caif_virtio.c index bc0eb47..b306210 100644 --- a/drivers/net/caif/caif_virtio.c +++ b/drivers/net/caif/caif_virtio.c @@ -679,8 +679,7 @@ static int cfv_probe(struct virtio_device *vdev) goto err; /* Get the TX virtio ring. This is a "guest side vring". 
*/ - err = vdev->config->find_vqs(vdev, 1, &cfv->vq_tx, &vq_cbs, &names, - NULL); + err = vdev->config->find_vqs(vdev, 1, &cfv->vq_tx, &vq_cbs, &names); if (err) goto err; diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c index ea9890d..e9d7e2b 100644 --- a/drivers/net/virtio_net.c +++ b/drivers/net/virtio_net.c @@ -2080,7 +2080,7 @@ static int virtnet_find_vqs(struct virtnet_info *vi) } ret = vi->vdev->config->find_vqs(vi->vdev, total_vqs, vqs, callbacks, - names, NULL); + names); if (ret) goto err_find; diff --git a/drivers/remoteproc/remoteproc_virtio.c b/drivers/remoteproc/remoteproc_virtio.c index 0142cc3..364411f 100644 --- a/drivers/remoteproc/remoteproc_virtio.c +++ b/drivers/remoteproc/remoteproc_virtio.c @@ -137,8 +137,7 @@ static void rproc_virtio_del_vqs(struct virtio_device *vdev) static int rproc_virtio_find_vqs(struct virtio_device *vdev, unsigned int nvqs, struct virtqueue *vqs[], vq_callback_t *callbacks[], - const char * const names[], - struct irq_affinity *desc) + const char * const names[]) { int i, ret; diff --git a/drivers/rpmsg/virtio_rpmsg_bus.c b/drivers/rpmsg/virtio_rpmsg_bus.c index 5e66e08..3090b0d 100644 --- a/drivers/rpmsg/virtio_rpmsg_bus.c +++ b/drivers/rpmsg/virtio_rpmsg_bus.c @@ -869,7 +869,7 @@ static int rpmsg_probe(struct virtio_device *vdev) init_waitqueue_head(&vrp->sendq); /* We expect two virtqueues, rx and tx (and in this order) */ - err = vdev->config->find_vqs(vdev, 2, vqs, vq_cbs, names, NULL); + err = vdev->config->find_vqs(vdev, 2, vqs, vq_cbs, names); if (err) goto free_vrp; diff --git a/drivers/s390/virtio/kvm_virtio.c b/drivers/s390/virtio/kvm_virtio.c index 2ce0b3e..5e5c11f 100644 --- a/drivers/s390/virtio/kvm_virtio.c +++ b/drivers/s390/virtio/kvm_virtio.c @@ -255,8 +255,7 @@ static void kvm_del_vqs(struct virtio_device *vdev) static int kvm_find_vqs(struct virtio_device *vdev, unsigned nvqs, struct virtqueue *vqs[], vq_callback_t *callbacks[], - const char * const names[], - struct irq_affinity 
*desc) + const char * const names[]) { struct kvm_device *kdev = to_kvmdev(vdev); int i; diff --git a/drivers/s390/virtio/virtio_ccw.c b/drivers/s390/virtio/virtio_ccw.c index 0ed209f..648373c 100644 --- a/drivers/s390/virtio/virtio_ccw.c +++ b/drivers/s390/virtio/virtio_ccw.c @@ -628,8 +628,7 @@ static int virtio_ccw_register_adapter_ind(struct virtio_ccw_device *vcdev, static int virtio_ccw_find_vqs(struct virtio_device *vdev, unsigned nvqs, struct virtqueue *vqs[], vq_callback_t *callbacks[], - const char * const names[], - struct irq_affinity *desc) + const char * const names[]) { struct virtio_ccw_device *vcdev = to_vc_device(vdev); unsigned long *indicatorp = NULL; diff --git a/drivers/scsi/virtio_scsi.c b/drivers/scsi/virtio_scsi.c index 939c47d..c680d76 100644 --- a/drivers/scsi/virtio_scsi.c +++ b/drivers/scsi/virtio_scsi.c @@ -18,7 +18,6 @@ #include #include #include -#include #include #include #include @@ -30,7 +29,6 @@ #include #include #include -#include #define VIRTIO_SCSI_MEMPOOL_SZ 64 #define VIRTIO_SCSI_EVENT_LEN 8 @@ -110,6 +108,7 @@ struct virtio_scsi { bool affinity_hint_set; struct hlist_node node; + struct hlist_node node_dead; /* Protected by event_vq lock */ bool stop_events; @@ -119,6 +118,7 @@ struct virtio_scsi { struct virtio_scsi_vq req_vqs[]; }; +static enum cpuhp_state virtioscsi_online; static struct kmem_cache *virtscsi_cmd_cache; static mempool_t *virtscsi_cmd_pool; @@ -766,13 +766,6 @@ static void virtscsi_target_destroy(struct scsi_target *starget) kfree(tgt); } -static int virtscsi_map_queues(struct Scsi_Host *shost) -{ - struct virtio_scsi *vscsi = shost_priv(shost); - - return blk_mq_virtio_map_queues(&shost->tag_set, vscsi->vdev, 2); -} - static struct scsi_host_template virtscsi_host_template_single = { .module = THIS_MODULE, .name = "Virtio SCSI HBA", @@ -808,7 +801,6 @@ static struct scsi_host_template virtscsi_host_template_multi = { .use_clustering = ENABLE_CLUSTERING, .target_alloc = virtscsi_target_alloc, 
.target_destroy = virtscsi_target_destroy, - .map_queues = virtscsi_map_queues, .track_queue_depth = 1, }; @@ -825,6 +817,80 @@ static struct scsi_host_template virtscsi_host_template_multi = { virtio_cwrite(vdev, struct virtio_scsi_config, fld, &__val); \ } while(0) +static void __virtscsi_set_affinity(struct virtio_scsi *vscsi, bool affinity) +{ + int i; + int cpu; + + /* In multiqueue mode, when the number of cpu is equal + * to the number of request queues, we let the qeueues + * to be private to one cpu by setting the affinity hint + * to eliminate the contention. + */ + if ((vscsi->num_queues == 1 || + vscsi->num_queues != num_online_cpus()) && affinity) { + if (vscsi->affinity_hint_set) + affinity = false; + else + return; + } + + if (affinity) { + i = 0; + for_each_online_cpu(cpu) { + virtqueue_set_affinity(vscsi->req_vqs[i].vq, cpu); + i++; + } + + vscsi->affinity_hint_set = true; + } else { + for (i = 0; i < vscsi->num_queues; i++) { + if (!vscsi->req_vqs[i].vq) + continue; + + virtqueue_set_affinity(vscsi->req_vqs[i].vq, -1); + } + + vscsi->affinity_hint_set = false; + } +} + +static void virtscsi_set_affinity(struct virtio_scsi *vscsi, bool affinity) +{ + get_online_cpus(); + __virtscsi_set_affinity(vscsi, affinity); + put_online_cpus(); +} + +static int virtscsi_cpu_online(unsigned int cpu, struct hlist_node *node) +{ + struct virtio_scsi *vscsi = hlist_entry_safe(node, struct virtio_scsi, + node); + __virtscsi_set_affinity(vscsi, true); + return 0; +} + +static int virtscsi_cpu_notif_add(struct virtio_scsi *vi) +{ + int ret; + + ret = cpuhp_state_add_instance(virtioscsi_online, &vi->node); + if (ret) + return ret; + + ret = cpuhp_state_add_instance(CPUHP_VIRT_SCSI_DEAD, &vi->node_dead); + if (ret) + cpuhp_state_remove_instance(virtioscsi_online, &vi->node); + return ret; +} + +static void virtscsi_cpu_notif_remove(struct virtio_scsi *vi) +{ + cpuhp_state_remove_instance_nocalls(virtioscsi_online, &vi->node); + 
cpuhp_state_remove_instance_nocalls(CPUHP_VIRT_SCSI_DEAD, + &vi->node_dead); +} + static void virtscsi_init_vq(struct virtio_scsi_vq *virtscsi_vq, struct virtqueue *vq) { @@ -834,8 +900,14 @@ static void virtscsi_init_vq(struct virtio_scsi_vq *virtscsi_vq, static void virtscsi_remove_vqs(struct virtio_device *vdev) { + struct Scsi_Host *sh = virtio_scsi_host(vdev); + struct virtio_scsi *vscsi = shost_priv(sh); + + virtscsi_set_affinity(vscsi, false); + /* Stop all the virtqueues. */ vdev->config->reset(vdev); + vdev->config->del_vqs(vdev); } @@ -848,7 +920,6 @@ static int virtscsi_init(struct virtio_device *vdev, vq_callback_t **callbacks; const char **names; struct virtqueue **vqs; - struct irq_affinity desc = { .pre_vectors = 2 }; num_vqs = vscsi->num_queues + VIRTIO_SCSI_VQ_BASE; vqs = kmalloc(num_vqs * sizeof(struct virtqueue *), GFP_KERNEL); @@ -870,8 +941,7 @@ static int virtscsi_init(struct virtio_device *vdev, } /* Discover virtqueues and write information to configuration. */ - err = vdev->config->find_vqs(vdev, num_vqs, vqs, callbacks, names, - &desc); + err = vdev->config->find_vqs(vdev, num_vqs, vqs, callbacks, names); if (err) goto out; @@ -937,6 +1007,10 @@ static int virtscsi_probe(struct virtio_device *vdev) if (err) goto virtscsi_init_failed; + err = virtscsi_cpu_notif_add(vscsi); + if (err) + goto scsi_add_host_failed; + cmd_per_lun = virtscsi_config_get(vdev, cmd_per_lun) ?: 1; shost->cmd_per_lun = min_t(u32, cmd_per_lun, shost->can_queue); shost->max_sectors = virtscsi_config_get(vdev, max_sectors) ?: 0xFFFF; @@ -991,6 +1065,9 @@ static void virtscsi_remove(struct virtio_device *vdev) virtscsi_cancel_event_work(vscsi); scsi_remove_host(shost); + + virtscsi_cpu_notif_remove(vscsi); + virtscsi_remove_vqs(vdev); scsi_host_put(shost); } @@ -998,6 +1075,10 @@ static void virtscsi_remove(struct virtio_device *vdev) #ifdef CONFIG_PM_SLEEP static int virtscsi_freeze(struct virtio_device *vdev) { + struct Scsi_Host *sh = virtio_scsi_host(vdev); + struct 
virtio_scsi *vscsi = shost_priv(sh); + + virtscsi_cpu_notif_remove(vscsi); virtscsi_remove_vqs(vdev); return 0; } @@ -1012,6 +1093,11 @@ static int virtscsi_restore(struct virtio_device *vdev) if (err) return err; + err = virtscsi_cpu_notif_add(vscsi); + if (err) { + vdev->config->del_vqs(vdev); + return err; + } virtio_device_ready(vdev); if (virtio_has_feature(vdev, VIRTIO_SCSI_F_HOTPLUG)) @@ -1066,6 +1152,16 @@ static int __init init(void) pr_err("mempool_create() for virtscsi_cmd_pool failed\n"); goto error; } + ret = cpuhp_setup_state_multi(CPUHP_AP_ONLINE_DYN, + "scsi/virtio:online", + virtscsi_cpu_online, NULL); + if (ret < 0) + goto error; + virtioscsi_online = ret; + ret = cpuhp_setup_state_multi(CPUHP_VIRT_SCSI_DEAD, "scsi/virtio:dead", + NULL, virtscsi_cpu_online); + if (ret) + goto error; ret = register_virtio_driver(&virtio_scsi_driver); if (ret < 0) goto error; @@ -1081,12 +1177,17 @@ static int __init init(void) kmem_cache_destroy(virtscsi_cmd_cache); virtscsi_cmd_cache = NULL; } + if (virtioscsi_online) + cpuhp_remove_multi_state(virtioscsi_online); + cpuhp_remove_multi_state(CPUHP_VIRT_SCSI_DEAD); return ret; } static void __exit fini(void) { unregister_virtio_driver(&virtio_scsi_driver); + cpuhp_remove_multi_state(virtioscsi_online); + cpuhp_remove_multi_state(CPUHP_VIRT_SCSI_DEAD); mempool_destroy(virtscsi_cmd_pool); kmem_cache_destroy(virtscsi_cmd_cache); } diff --git a/drivers/vhost/vhost.c b/drivers/vhost/vhost.c index f0ba362..c323bce 100644 --- a/drivers/vhost/vhost.c +++ b/drivers/vhost/vhost.c @@ -284,22 +284,6 @@ void vhost_poll_queue(struct vhost_poll *poll) } EXPORT_SYMBOL_GPL(vhost_poll_queue); -static void __vhost_vq_meta_reset(struct vhost_virtqueue *vq) -{ - int j; - - for (j = 0; j < VHOST_NUM_ADDRS; j++) - vq->meta_iotlb[j] = NULL; -} - -static void vhost_vq_meta_reset(struct vhost_dev *d) -{ - int i; - - for (i = 0; i < d->nvqs; ++i) - __vhost_vq_meta_reset(d->vqs[i]); -} - static void vhost_vq_reset(struct vhost_dev *dev, struct 
vhost_virtqueue *vq) { @@ -330,7 +314,6 @@ static void vhost_vq_reset(struct vhost_dev *dev, vq->busyloop_timeout = 0; vq->umem = NULL; vq->iotlb = NULL; - __vhost_vq_meta_reset(vq); } static int vhost_worker(void *data) @@ -710,18 +693,6 @@ static int vq_memory_access_ok(void __user *log_base, struct vhost_umem *umem, return 1; } -static inline void __user *vhost_vq_meta_fetch(struct vhost_virtqueue *vq, - u64 addr, unsigned int size, - int type) -{ - const struct vhost_umem_node *node = vq->meta_iotlb[type]; - - if (!node) - return NULL; - - return (void *)(uintptr_t)(node->userspace_addr + addr - node->start); -} - /* Can we switch to this memory table? */ /* Caller should have device mutex but not vq mutex */ static int memory_access_ok(struct vhost_dev *d, struct vhost_umem *umem, @@ -764,14 +735,8 @@ static int vhost_copy_to_user(struct vhost_virtqueue *vq, void __user *to, * could be access through iotlb. So -EAGAIN should * not happen in this case. */ + /* TODO: more fast path */ struct iov_iter t; - void __user *uaddr = vhost_vq_meta_fetch(vq, - (u64)(uintptr_t)to, size, - VHOST_ADDR_DESC); - - if (uaddr) - return __copy_to_user(uaddr, from, size); - ret = translate_desc(vq, (u64)(uintptr_t)to, size, vq->iotlb_iov, ARRAY_SIZE(vq->iotlb_iov), VHOST_ACCESS_WO); @@ -799,14 +764,8 @@ static int vhost_copy_from_user(struct vhost_virtqueue *vq, void *to, * could be access through iotlb. So -EAGAIN should * not happen in this case. 
*/ - void __user *uaddr = vhost_vq_meta_fetch(vq, - (u64)(uintptr_t)from, size, - VHOST_ADDR_DESC); + /* TODO: more fast path */ struct iov_iter f; - - if (uaddr) - return __copy_from_user(to, uaddr, size); - ret = translate_desc(vq, (u64)(uintptr_t)from, size, vq->iotlb_iov, ARRAY_SIZE(vq->iotlb_iov), VHOST_ACCESS_RO); @@ -826,12 +785,17 @@ static int vhost_copy_from_user(struct vhost_virtqueue *vq, void *to, return ret; } -static void __user *__vhost_get_user_slow(struct vhost_virtqueue *vq, - void __user *addr, unsigned int size, - int type) +static void __user *__vhost_get_user(struct vhost_virtqueue *vq, + void __user *addr, unsigned size) { int ret; + /* This function should be called after iotlb + * prefetch, which means we're sure that vq + * could be access through iotlb. So -EAGAIN should + * not happen in this case. + */ + /* TODO: more fast path */ ret = translate_desc(vq, (u64)(uintptr_t)addr, size, vq->iotlb_iov, ARRAY_SIZE(vq->iotlb_iov), VHOST_ACCESS_RO); @@ -852,32 +816,14 @@ static void __user *__vhost_get_user_slow(struct vhost_virtqueue *vq, return vq->iotlb_iov[0].iov_base; } -/* This function should be called after iotlb - * prefetch, which means we're sure that vq - * could be access through iotlb. So -EAGAIN should - * not happen in this case. 
- */ -static inline void __user *__vhost_get_user(struct vhost_virtqueue *vq, - void *addr, unsigned int size, - int type) -{ - void __user *uaddr = vhost_vq_meta_fetch(vq, - (u64)(uintptr_t)addr, size, type); - if (uaddr) - return uaddr; - - return __vhost_get_user_slow(vq, addr, size, type); -} - -#define vhost_put_user(vq, x, ptr) \ +#define vhost_put_user(vq, x, ptr) \ ({ \ int ret = -EFAULT; \ if (!vq->iotlb) { \ ret = __put_user(x, ptr); \ } else { \ __typeof__(ptr) to = \ - (__typeof__(ptr)) __vhost_get_user(vq, ptr, \ - sizeof(*ptr), VHOST_ADDR_USED); \ + (__typeof__(ptr)) __vhost_get_user(vq, ptr, sizeof(*ptr)); \ if (to != NULL) \ ret = __put_user(x, to); \ else \ @@ -886,16 +832,14 @@ static inline void __user *__vhost_get_user(struct vhost_virtqueue *vq, ret; \ }) -#define vhost_get_user(vq, x, ptr, type) \ +#define vhost_get_user(vq, x, ptr) \ ({ \ int ret; \ if (!vq->iotlb) { \ ret = __get_user(x, ptr); \ } else { \ __typeof__(ptr) from = \ - (__typeof__(ptr)) __vhost_get_user(vq, ptr, \ - sizeof(*ptr), \ - type); \ + (__typeof__(ptr)) __vhost_get_user(vq, ptr, sizeof(*ptr)); \ if (from != NULL) \ ret = __get_user(x, from); \ else \ @@ -904,12 +848,6 @@ static inline void __user *__vhost_get_user(struct vhost_virtqueue *vq, ret; \ }) -#define vhost_get_avail(vq, x, ptr) \ - vhost_get_user(vq, x, ptr, VHOST_ADDR_AVAIL) - -#define vhost_get_used(vq, x, ptr) \ - vhost_get_user(vq, x, ptr, VHOST_ADDR_USED) - static void vhost_dev_lock_vqs(struct vhost_dev *d) { int i = 0; @@ -1015,7 +953,6 @@ static int vhost_process_iotlb_msg(struct vhost_dev *dev, ret = -EFAULT; break; } - vhost_vq_meta_reset(dev); if (vhost_new_umem_range(dev->iotlb, msg->iova, msg->size, msg->iova + msg->size - 1, msg->uaddr, msg->perm)) { @@ -1025,7 +962,6 @@ static int vhost_process_iotlb_msg(struct vhost_dev *dev, vhost_iotlb_notify_vq(dev, msg); break; case VHOST_IOTLB_INVALIDATE: - vhost_vq_meta_reset(dev); vhost_del_umem_range(dev->iotlb, msg->iova, msg->iova + msg->size - 1); 
break; @@ -1169,26 +1105,12 @@ static int vq_access_ok(struct vhost_virtqueue *vq, unsigned int num, sizeof *used + num * sizeof *used->ring + s); } -static void vhost_vq_meta_update(struct vhost_virtqueue *vq, - const struct vhost_umem_node *node, - int type) -{ - int access = (type == VHOST_ADDR_USED) ? - VHOST_ACCESS_WO : VHOST_ACCESS_RO; - - if (likely(node->perm & access)) - vq->meta_iotlb[type] = node; -} - static int iotlb_access_ok(struct vhost_virtqueue *vq, - int access, u64 addr, u64 len, int type) + int access, u64 addr, u64 len) { const struct vhost_umem_node *node; struct vhost_umem *umem = vq->iotlb; - u64 s = 0, size, orig_addr = addr; - - if (vhost_vq_meta_fetch(vq, addr, len, type)) - return true; + u64 s = 0, size; while (len > s) { node = vhost_umem_interval_tree_iter_first(&umem->umem_tree, @@ -1205,10 +1127,6 @@ static int iotlb_access_ok(struct vhost_virtqueue *vq, } size = node->size - addr + node->start; - - if (orig_addr == addr && size >= len) - vhost_vq_meta_update(vq, node, type); - s += size; addr += size; } @@ -1225,15 +1143,13 @@ int vq_iotlb_prefetch(struct vhost_virtqueue *vq) return 1; return iotlb_access_ok(vq, VHOST_ACCESS_RO, (u64)(uintptr_t)vq->desc, - num * sizeof(*vq->desc), VHOST_ADDR_DESC) && + num * sizeof *vq->desc) && iotlb_access_ok(vq, VHOST_ACCESS_RO, (u64)(uintptr_t)vq->avail, sizeof *vq->avail + - num * sizeof(*vq->avail->ring) + s, - VHOST_ADDR_AVAIL) && + num * sizeof *vq->avail->ring + s) && iotlb_access_ok(vq, VHOST_ACCESS_WO, (u64)(uintptr_t)vq->used, sizeof *vq->used + - num * sizeof(*vq->used->ring) + s, - VHOST_ADDR_USED); + num * sizeof *vq->used->ring + s); } EXPORT_SYMBOL_GPL(vq_iotlb_prefetch); @@ -1814,7 +1730,7 @@ int vhost_vq_init_access(struct vhost_virtqueue *vq) r = -EFAULT; goto err; } - r = vhost_get_used(vq, last_used_idx, &vq->used->idx); + r = vhost_get_user(vq, last_used_idx, &vq->used->idx); if (r) { vq_err(vq, "Can't access used idx at %p\n", &vq->used->idx); @@ -2018,7 +1934,7 @@ int 
vhost_get_vq_desc(struct vhost_virtqueue *vq,
 	last_avail_idx = vq->last_avail_idx;
 	if (vq->avail_idx == vq->last_avail_idx) {
-		if (unlikely(vhost_get_avail(vq, avail_idx, &vq->avail->idx))) {
+		if (unlikely(vhost_get_user(vq, avail_idx, &vq->avail->idx))) {
 			vq_err(vq, "Failed to access avail idx at %p\n",
 				&vq->avail->idx);
 			return -EFAULT;
@@ -2045,7 +1961,7 @@ int vhost_get_vq_desc(struct vhost_virtqueue *vq,
 	/* Grab the next descriptor number they're advertising, and increment
 	 * the index we've seen. */
-	if (unlikely(vhost_get_avail(vq, ring_head,
+	if (unlikely(vhost_get_user(vq, ring_head,
 		     &vq->avail->ring[last_avail_idx & (vq->num - 1)]))) {
 		vq_err(vq, "Failed to read head: idx %d address %p\n",
 		       last_avail_idx,
@@ -2261,7 +2177,7 @@ static bool vhost_notify(struct vhost_dev *dev, struct vhost_virtqueue *vq)
 	 * with the barrier that the Guest executes when enabling
 	 * interrupts. */
 	smp_mb();
-	if (vhost_get_avail(vq, flags, &vq->avail->flags)) {
+	if (vhost_get_user(vq, flags, &vq->avail->flags)) {
 		vq_err(vq, "Failed to get flags");
 		return true;
 	}
@@ -2288,7 +2204,7 @@ static bool vhost_notify(struct vhost_dev *dev, struct vhost_virtqueue *vq)
 	 * interrupts. */
 	smp_mb();
-	if (vhost_get_avail(vq, event, vhost_used_event(vq))) {
+	if (vhost_get_user(vq, event, vhost_used_event(vq))) {
 		vq_err(vq, "Failed to get used event idx");
 		return true;
 	}
@@ -2335,7 +2251,7 @@ bool vhost_vq_avail_empty(struct vhost_dev *dev, struct vhost_virtqueue *vq)
 	if (vq->avail_idx != vq->last_avail_idx)
 		return false;
-	r = vhost_get_avail(vq, avail_idx, &vq->avail->idx);
+	r = vhost_get_user(vq, avail_idx, &vq->avail->idx);
 	if (unlikely(r))
 		return false;
 	vq->avail_idx = vhost16_to_cpu(vq, avail_idx);
@@ -2371,7 +2287,7 @@ bool vhost_enable_notify(struct vhost_dev *dev, struct vhost_virtqueue *vq)
 	/* They could have slipped one in as we were doing that: make
 	 * sure it's written, then check again. */
 	smp_mb();
-	r = vhost_get_avail(vq, avail_idx, &vq->avail->idx);
+	r = vhost_get_user(vq, avail_idx, &vq->avail->idx);
 	if (r) {
 		vq_err(vq, "Failed to check avail idx at %p: %d\n",
 		       &vq->avail->idx, r);
diff --git a/drivers/vhost/vhost.h b/drivers/vhost/vhost.h
index f55671d..a9cbbb1 100644
--- a/drivers/vhost/vhost.h
+++ b/drivers/vhost/vhost.h
@@ -76,13 +76,6 @@ struct vhost_umem {
 	int numem;
 };
-enum vhost_uaddr_type {
-	VHOST_ADDR_DESC = 0,
-	VHOST_ADDR_AVAIL = 1,
-	VHOST_ADDR_USED = 2,
-	VHOST_NUM_ADDRS = 3,
-};
-
 /* The virtqueue structure describes a queue attached to a device. */
 struct vhost_virtqueue {
 	struct vhost_dev *dev;
@@ -93,7 +86,6 @@ struct vhost_virtqueue {
 	struct vring_desc __user *desc;
 	struct vring_avail __user *avail;
 	struct vring_used __user *used;
-	const struct vhost_umem_node *meta_iotlb[VHOST_NUM_ADDRS];
 	struct file *kick;
 	struct file *call;
 	struct file *error;
diff --git a/drivers/virtio/virtio_balloon.c b/drivers/virtio/virtio_balloon.c
index 4e11915..a610061 100644
--- a/drivers/virtio/virtio_balloon.c
+++ b/drivers/virtio/virtio_balloon.c
@@ -414,8 +414,7 @@ static int init_vqs(struct virtio_balloon *vb)
 	 * optionally stat. */
 	nvqs = virtio_has_feature(vb->vdev, VIRTIO_BALLOON_F_STATS_VQ) ? 3 : 2;
-	err = vb->vdev->config->find_vqs(vb->vdev, nvqs, vqs, callbacks, names,
-			NULL);
+	err = vb->vdev->config->find_vqs(vb->vdev, nvqs, vqs, callbacks, names);
 	if (err)
 		return err;
diff --git a/drivers/virtio/virtio_input.c b/drivers/virtio/virtio_input.c
index 79f1293..350a2a5 100644
--- a/drivers/virtio/virtio_input.c
+++ b/drivers/virtio/virtio_input.c
@@ -173,8 +173,7 @@ static int virtinput_init_vqs(struct virtio_input *vi)
 	static const char * const names[] = { "events", "status" };
 	int err;
-	err = vi->vdev->config->find_vqs(vi->vdev, 2, vqs, cbs, names,
-			NULL);
+	err = vi->vdev->config->find_vqs(vi->vdev, 2, vqs, cbs, names);
 	if (err)
 		return err;
 	vi->evt = vqs[0];
diff --git a/drivers/virtio/virtio_mmio.c b/drivers/virtio/virtio_mmio.c
index 78343b8..08357d7 100644
--- a/drivers/virtio/virtio_mmio.c
+++ b/drivers/virtio/virtio_mmio.c
@@ -446,8 +446,7 @@ static struct virtqueue *vm_setup_vq(struct virtio_device *vdev, unsigned index,
 static int vm_find_vqs(struct virtio_device *vdev, unsigned nvqs,
 		       struct virtqueue *vqs[],
 		       vq_callback_t *callbacks[],
-		       const char * const names[],
-		       struct irq_affinity *desc)
+		       const char * const names[])
 {
 	struct virtio_mmio_device *vm_dev = to_virtio_mmio_device(vdev);
 	unsigned int irq = platform_get_irq(vm_dev->pdev, 0);
diff --git a/drivers/virtio/virtio_pci_common.c b/drivers/virtio/virtio_pci_common.c
index df548a6..a3376731 100644
--- a/drivers/virtio/virtio_pci_common.c
+++ b/drivers/virtio/virtio_pci_common.c
@@ -33,8 +33,10 @@ void vp_synchronize_vectors(struct virtio_device *vdev)
 	struct virtio_pci_device *vp_dev = to_vp_device(vdev);
 	int i;
-	synchronize_irq(pci_irq_vector(vp_dev->pci_dev, 0));
-	for (i = 1; i < vp_dev->msix_vectors; i++)
+	if (vp_dev->intx_enabled)
+		synchronize_irq(vp_dev->pci_dev->irq);
+
+	for (i = 0; i < vp_dev->msix_vectors; ++i)
 		synchronize_irq(pci_irq_vector(vp_dev->pci_dev, i));
 }
@@ -97,10 +99,77 @@ static irqreturn_t vp_interrupt(int irq, void *opaque)
 	return vp_vring_interrupt(irq, opaque);
 }
-static void vp_remove_vqs(struct virtio_device *vdev)
+static int vp_request_msix_vectors(struct virtio_device *vdev, int nvectors,
+				   bool per_vq_vectors)
+{
+	struct virtio_pci_device *vp_dev = to_vp_device(vdev);
+	const char *name = dev_name(&vp_dev->vdev.dev);
+	unsigned i, v;
+	int err = -ENOMEM;
+
+	vp_dev->msix_vectors = nvectors;
+
+	vp_dev->msix_names = kmalloc(nvectors * sizeof *vp_dev->msix_names,
+				     GFP_KERNEL);
+	if (!vp_dev->msix_names)
+		goto error;
+	vp_dev->msix_affinity_masks
+		= kzalloc(nvectors * sizeof *vp_dev->msix_affinity_masks,
+			  GFP_KERNEL);
+	if (!vp_dev->msix_affinity_masks)
+		goto error;
+	for (i = 0; i < nvectors; ++i)
+		if (!alloc_cpumask_var(&vp_dev->msix_affinity_masks[i],
+					GFP_KERNEL))
+			goto error;
+
+	err = pci_alloc_irq_vectors(vp_dev->pci_dev, nvectors, nvectors,
+			PCI_IRQ_MSIX);
+	if (err < 0)
+		goto error;
+	vp_dev->msix_enabled = 1;
+
+	/* Set the vector used for configuration */
+	v = vp_dev->msix_used_vectors;
+	snprintf(vp_dev->msix_names[v], sizeof *vp_dev->msix_names,
+		 "%s-config", name);
+	err = request_irq(pci_irq_vector(vp_dev->pci_dev, v),
+			  vp_config_changed, 0, vp_dev->msix_names[v],
+			  vp_dev);
+	if (err)
+		goto error;
+	++vp_dev->msix_used_vectors;
+
+	v = vp_dev->config_vector(vp_dev, v);
+	/* Verify we had enough resources to assign the vector */
+	if (v == VIRTIO_MSI_NO_VECTOR) {
+		err = -EBUSY;
+		goto error;
+	}
+
+	if (!per_vq_vectors) {
+		/* Shared vector for all VQs */
+		v = vp_dev->msix_used_vectors;
+		snprintf(vp_dev->msix_names[v], sizeof *vp_dev->msix_names,
+			 "%s-virtqueues", name);
+		err = request_irq(pci_irq_vector(vp_dev->pci_dev, v),
+				  vp_vring_interrupt, 0, vp_dev->msix_names[v],
+				  vp_dev);
+		if (err)
+			goto error;
+		++vp_dev->msix_used_vectors;
+	}
+	return 0;
+error:
+	return err;
+}
+
+/* the config->del_vqs() implementation */
+void vp_del_vqs(struct virtio_device *vdev)
 {
 	struct virtio_pci_device *vp_dev = to_vp_device(vdev);
 	struct virtqueue *vq, *n;
+	int i;
 	list_for_each_entry_safe(vq, n, &vdev->vqs, list) {
 		if (vp_dev->msix_vector_map) {
@@ -112,170 +181,117 @@ static void vp_remove_vqs(struct virtio_device *vdev)
 		}
 		vp_dev->del_vq(vq);
 	}
-}
-/* the config->del_vqs() implementation */
-void vp_del_vqs(struct virtio_device *vdev)
-{
-	struct virtio_pci_device *vp_dev = to_vp_device(vdev);
-	int i;
-
-	if (WARN_ON_ONCE(list_empty_careful(&vdev->vqs)))
-		return;
+	if (vp_dev->intx_enabled) {
+		free_irq(vp_dev->pci_dev->irq, vp_dev);
+		vp_dev->intx_enabled = 0;
+	}
-	vp_remove_vqs(vdev);
+	for (i = 0; i < vp_dev->msix_used_vectors; ++i)
+		free_irq(pci_irq_vector(vp_dev->pci_dev, i), vp_dev);
-	if (vp_dev->pci_dev->msix_enabled) {
-		for (i = 0; i < vp_dev->msix_vectors; i++)
+	for (i = 0; i < vp_dev->msix_vectors; i++)
+		if (vp_dev->msix_affinity_masks[i])
 			free_cpumask_var(vp_dev->msix_affinity_masks[i]);
+	if (vp_dev->msix_enabled) {
 		/* Disable the vector used for configuration */
 		vp_dev->config_vector(vp_dev, VIRTIO_MSI_NO_VECTOR);
-		kfree(vp_dev->msix_affinity_masks);
-		kfree(vp_dev->msix_names);
-		kfree(vp_dev->msix_vector_map);
+		pci_free_irq_vectors(vp_dev->pci_dev);
+		vp_dev->msix_enabled = 0;
 	}
-	free_irq(pci_irq_vector(vp_dev->pci_dev, 0), vp_dev);
-	pci_free_irq_vectors(vp_dev->pci_dev);
+	vp_dev->msix_vectors = 0;
+	vp_dev->msix_used_vectors = 0;
+	kfree(vp_dev->msix_names);
+	vp_dev->msix_names = NULL;
+	kfree(vp_dev->msix_affinity_masks);
+	vp_dev->msix_affinity_masks = NULL;
+	kfree(vp_dev->msix_vector_map);
+	vp_dev->msix_vector_map = NULL;
 }
 static int vp_find_vqs_msix(struct virtio_device *vdev, unsigned nvqs,
-		struct virtqueue *vqs[], vq_callback_t *callbacks[],
-		const char * const names[], struct irq_affinity *desc)
+			      struct virtqueue *vqs[],
+			      vq_callback_t *callbacks[],
+			      const char * const names[],
+			      bool per_vq_vectors)
 {
 	struct virtio_pci_device *vp_dev = to_vp_device(vdev);
-	const char *name = dev_name(&vp_dev->vdev.dev);
-	int i, err = -ENOMEM, allocated_vectors, nvectors;
-	unsigned flags = PCI_IRQ_MSIX;
-	bool shared = false;
 	u16 msix_vec;
-
-	if (desc) {
-		flags |= PCI_IRQ_AFFINITY;
-		desc->pre_vectors++; /* virtio config vector */
-	}
-
-	nvectors = 1;
-	for (i = 0; i < nvqs; i++)
-		if (callbacks[i])
-			nvectors++;
-
-	/* Try one vector per queue first. */
-	err = pci_alloc_irq_vectors_affinity(vp_dev->pci_dev, nvectors,
-			nvectors, flags, desc);
-	if (err < 0) {
-		/* Fallback to one vector for config, one shared for queues. */
-		shared = true;
-		err = pci_alloc_irq_vectors(vp_dev->pci_dev, 2, 2,
-				PCI_IRQ_MSIX);
-		if (err < 0)
-			return err;
-	}
-	if (err < 0)
-		return err;
-
-	vp_dev->msix_vectors = nvectors;
-	vp_dev->msix_names = kmalloc_array(nvectors,
-			sizeof(*vp_dev->msix_names), GFP_KERNEL);
-	if (!vp_dev->msix_names)
-		goto out_free_irq_vectors;
-
-	vp_dev->msix_affinity_masks = kcalloc(nvectors,
-			sizeof(*vp_dev->msix_affinity_masks), GFP_KERNEL);
-	if (!vp_dev->msix_affinity_masks)
-		goto out_free_msix_names;
-
-	for (i = 0; i < nvectors; ++i) {
-		if (!alloc_cpumask_var(&vp_dev->msix_affinity_masks[i],
-				GFP_KERNEL))
-			goto out_free_msix_affinity_masks;
+	int i, err, nvectors, allocated_vectors;
+
+	if (per_vq_vectors) {
+		/* Best option: one for change interrupt, one per vq. */
+		nvectors = 1;
+		for (i = 0; i < nvqs; ++i)
+			if (callbacks[i])
+				++nvectors;
+	} else {
+		/* Second best: one for change, shared for all vqs. */
+		nvectors = 2;
 	}
-	/* Set the vector used for configuration */
-	snprintf(vp_dev->msix_names[0], sizeof(*vp_dev->msix_names),
-		 "%s-config", name);
-	err = request_irq(pci_irq_vector(vp_dev->pci_dev, 0), vp_config_changed,
-			0, vp_dev->msix_names[0], vp_dev);
+	err = vp_request_msix_vectors(vdev, nvectors, per_vq_vectors);
 	if (err)
-		goto out_free_msix_affinity_masks;
+		goto error_find;
-	/* Verify we had enough resources to assign the vector */
-	if (vp_dev->config_vector(vp_dev, 0) == VIRTIO_MSI_NO_VECTOR) {
-		err = -EBUSY;
-		goto out_free_config_irq;
+	if (per_vq_vectors) {
+		vp_dev->msix_vector_map = kmalloc_array(nvqs,
+				sizeof(*vp_dev->msix_vector_map), GFP_KERNEL);
+		if (!vp_dev->msix_vector_map)
+			goto error_find;
 	}
-	vp_dev->msix_vector_map = kmalloc_array(nvqs,
-			sizeof(*vp_dev->msix_vector_map), GFP_KERNEL);
-	if (!vp_dev->msix_vector_map)
-		goto out_disable_config_irq;
-
-	allocated_vectors = 1; /* vector 0 is the config interrupt */
+	allocated_vectors = vp_dev->msix_used_vectors;
 	for (i = 0; i < nvqs; ++i) {
 		if (!names[i]) {
 			vqs[i] = NULL;
 			continue;
 		}
-		if (callbacks[i])
-			msix_vec = allocated_vectors;
-		else
+		if (!callbacks[i])
 			msix_vec = VIRTIO_MSI_NO_VECTOR;
-
+		else if (per_vq_vectors)
+			msix_vec = allocated_vectors++;
+		else
+			msix_vec = VP_MSIX_VQ_VECTOR;
 		vqs[i] = vp_dev->setup_vq(vp_dev, i, callbacks[i], names[i],
 				msix_vec);
 		if (IS_ERR(vqs[i])) {
 			err = PTR_ERR(vqs[i]);
-			goto out_remove_vqs;
+			goto error_find;
 		}
+		if (!per_vq_vectors)
+			continue;
+
 		if (msix_vec == VIRTIO_MSI_NO_VECTOR) {
 			vp_dev->msix_vector_map[i] = VIRTIO_MSI_NO_VECTOR;
 			continue;
 		}
-		snprintf(vp_dev->msix_names[i + 1],
-			 sizeof(*vp_dev->msix_names), "%s-%s",
+		/* allocate per-vq irq if available and necessary */
+		snprintf(vp_dev->msix_names[msix_vec],
+			 sizeof *vp_dev->msix_names,
+			 "%s-%s",
 			 dev_name(&vp_dev->vdev.dev), names[i]);
 		err = request_irq(pci_irq_vector(vp_dev->pci_dev, msix_vec),
-				  vring_interrupt, IRQF_SHARED,
-				  vp_dev->msix_names[i + 1], vqs[i]);
+				  vring_interrupt, 0,
+				  vp_dev->msix_names[msix_vec],
+				  vqs[i]);
 		if (err) {
 			/* don't free this irq on error */
 			vp_dev->msix_vector_map[i] = VIRTIO_MSI_NO_VECTOR;
-			goto out_remove_vqs;
+			goto error_find;
 		}
 		vp_dev->msix_vector_map[i] = msix_vec;
-
-		/*
-		 * Use a different vector for each queue if they are available,
-		 * else share the same vector for all VQs.
-		 */
-		if (!shared)
-			allocated_vectors++;
 	}
-
 	return 0;
-out_remove_vqs:
-	vp_remove_vqs(vdev);
-	kfree(vp_dev->msix_vector_map);
-out_disable_config_irq:
-	vp_dev->config_vector(vp_dev, VIRTIO_MSI_NO_VECTOR);
-out_free_config_irq:
-	free_irq(pci_irq_vector(vp_dev->pci_dev, 0), vp_dev);
-out_free_msix_affinity_masks:
-	for (i = 0; i < nvectors; i++) {
-		if (vp_dev->msix_affinity_masks[i])
-			free_cpumask_var(vp_dev->msix_affinity_masks[i]);
-	}
-	kfree(vp_dev->msix_affinity_masks);
-out_free_msix_names:
-	kfree(vp_dev->msix_names);
-out_free_irq_vectors:
-	pci_free_irq_vectors(vp_dev->pci_dev);
+error_find:
+	vp_del_vqs(vdev);
 	return err;
 }
@@ -289,8 +305,9 @@ static int vp_find_vqs_intx(struct virtio_device *vdev, unsigned nvqs,
 	err = request_irq(vp_dev->pci_dev->irq, vp_interrupt, IRQF_SHARED,
 			dev_name(&vdev->dev), vp_dev);
 	if (err)
-		return err;
+		goto out_del_vqs;
+	vp_dev->intx_enabled = 1;
 	for (i = 0; i < nvqs; ++i) {
 		if (!names[i]) {
 			vqs[i] = NULL;
@@ -300,28 +317,33 @@ static int vp_find_vqs_intx(struct virtio_device *vdev, unsigned nvqs,
 				VIRTIO_MSI_NO_VECTOR);
 		if (IS_ERR(vqs[i])) {
 			err = PTR_ERR(vqs[i]);
-			goto out_remove_vqs;
+			goto out_del_vqs;
 		}
 	}
 	return 0;
-
-out_remove_vqs:
-	vp_remove_vqs(vdev);
-	free_irq(pci_irq_vector(vp_dev->pci_dev, 0), vp_dev);
+out_del_vqs:
+	vp_del_vqs(vdev);
 	return err;
 }
 /* the config->find_vqs() implementation */
 int vp_find_vqs(struct virtio_device *vdev, unsigned nvqs,
-		struct virtqueue *vqs[], vq_callback_t *callbacks[],
-		const char * const names[], struct irq_affinity *desc)
+		struct virtqueue *vqs[],
+		vq_callback_t *callbacks[],
+		const char * const names[])
 {
 	int err;
-	err = vp_find_vqs_msix(vdev, nvqs, vqs, callbacks, names, desc);
+	/* Try MSI-X with one vector per queue. */
+	err = vp_find_vqs_msix(vdev, nvqs, vqs, callbacks, names, true);
 	if (!err)
 		return 0;
+	/* Fallback: MSI-X with one vector for config, one shared for queues. */
+	err = vp_find_vqs_msix(vdev, nvqs, vqs, callbacks, names, false);
+	if (!err)
+		return 0;
+	/* Finally fall back to regular interrupts. */
 	return vp_find_vqs_intx(vdev, nvqs, vqs, callbacks, names);
 }
@@ -345,7 +367,7 @@ int vp_set_vq_affinity(struct virtqueue *vq, int cpu)
 	if (!vq->callback)
 		return -EINVAL;
-	if (vp_dev->pci_dev->msix_enabled) {
+	if (vp_dev->msix_enabled) {
 		int vec = vp_dev->msix_vector_map[vq->index];
 		struct cpumask *mask = vp_dev->msix_affinity_masks[vec];
 		unsigned int irq = pci_irq_vector(vp_dev->pci_dev, vec);
@@ -361,17 +383,6 @@ int vp_set_vq_affinity(struct virtqueue *vq, int cpu)
 	return 0;
 }
-const struct cpumask *vp_get_vq_affinity(struct virtio_device *vdev, int index)
-{
-	struct virtio_pci_device *vp_dev = to_vp_device(vdev);
-	unsigned int *map = vp_dev->msix_vector_map;
-
-	if (!map || map[index] == VIRTIO_MSI_NO_VECTOR)
-		return NULL;
-
-	return pci_irq_get_affinity(vp_dev->pci_dev, map[index]);
-}
-
 #ifdef CONFIG_PM_SLEEP
 static int virtio_pci_freeze(struct device *dev)
 {
diff --git a/drivers/virtio/virtio_pci_common.h b/drivers/virtio/virtio_pci_common.h
index ac8c9d7..2038887 100644
--- a/drivers/virtio/virtio_pci_common.h
+++ b/drivers/virtio/virtio_pci_common.h
@@ -64,12 +64,18 @@ struct virtio_pci_device {
 	/* the IO mapping for the PCI config space */
 	void __iomem *ioaddr;
+	/* MSI-X support */
+	int msix_enabled;
+	int intx_enabled;
 	cpumask_var_t *msix_affinity_masks;
 	/* Name strings for interrupts. This size should be enough,
 	 * and I'm too lazy to allocate each name separately. */
 	char (*msix_names)[256];
-	/* Total Number of MSI-X vectors (including per-VQ ones). */
-	int msix_vectors;
+	/* Number of available vectors */
+	unsigned msix_vectors;
+	/* Vectors allocated, excluding per-vq vectors if any */
+	unsigned msix_used_vectors;
+
 	/* Map of per-VQ MSI-X vectors, may be NULL */
 	unsigned *msix_vector_map;
@@ -83,6 +89,14 @@ struct virtio_pci_device {
 	u16 (*config_vector)(struct virtio_pci_device *vp_dev, u16 vector);
 };
+/* Constants for MSI-X */
+/* Use first vector for configuration changes, second and the rest for
+ * virtqueues Thus, we need at least 2 vectors for MSI. */
+enum {
+	VP_MSIX_CONFIG_VECTOR = 0,
+	VP_MSIX_VQ_VECTOR = 1,
+};
+
 /* Convert a generic virtio device to our structure */
 static struct virtio_pci_device *to_vp_device(struct virtio_device *vdev)
 {
@@ -97,8 +111,9 @@ bool vp_notify(struct virtqueue *vq);
 void vp_del_vqs(struct virtio_device *vdev);
 /* the config->find_vqs() implementation */
 int vp_find_vqs(struct virtio_device *vdev, unsigned nvqs,
-		struct virtqueue *vqs[], vq_callback_t *callbacks[],
-		const char * const names[], struct irq_affinity *desc);
+		struct virtqueue *vqs[],
+		vq_callback_t *callbacks[],
+		const char * const names[]);
 const char *vp_bus_name(struct virtio_device *vdev);
 /* Setup the affinity for a virtqueue:
@@ -108,8 +123,6 @@ const char *vp_bus_name(struct virtio_device *vdev);
  */
 int vp_set_vq_affinity(struct virtqueue *vq, int cpu);
-const struct cpumask *vp_get_vq_affinity(struct virtio_device *vdev, int index);
-
 #if IS_ENABLED(CONFIG_VIRTIO_PCI_LEGACY)
 int virtio_pci_legacy_probe(struct virtio_pci_device *);
 void virtio_pci_legacy_remove(struct virtio_pci_device *);
diff --git a/drivers/virtio/virtio_pci_legacy.c b/drivers/virtio/virtio_pci_legacy.c
index f7362c5..47292da 100644
--- a/drivers/virtio/virtio_pci_legacy.c
+++ b/drivers/virtio/virtio_pci_legacy.c
@@ -165,7 +165,7 @@ static void del_vq(struct virtqueue *vq)
 	iowrite16(vq->index, vp_dev->ioaddr + VIRTIO_PCI_QUEUE_SEL);
-	if (vp_dev->pci_dev->msix_enabled) {
+	if (vp_dev->msix_enabled) {
 		iowrite16(VIRTIO_MSI_NO_VECTOR,
 			  vp_dev->ioaddr + VIRTIO_MSI_QUEUE_VECTOR);
 		/* Flush the write out to device */
@@ -190,7 +190,6 @@ static const struct virtio_config_ops virtio_pci_config_ops = {
 	.finalize_features = vp_finalize_features,
 	.bus_name = vp_bus_name,
 	.set_vq_affinity = vp_set_vq_affinity,
-	.get_vq_affinity = vp_get_vq_affinity,
 };
 /* the PCI probing function */
diff --git a/drivers/virtio/virtio_pci_modern.c b/drivers/virtio/virtio_pci_modern.c
index 7bc3004..00e6fc1 100644
--- a/drivers/virtio/virtio_pci_modern.c
+++ b/drivers/virtio/virtio_pci_modern.c
@@ -384,12 +384,13 @@ static struct virtqueue *setup_vq(struct virtio_pci_device *vp_dev,
 }
 static int vp_modern_find_vqs(struct virtio_device *vdev, unsigned nvqs,
-		struct virtqueue *vqs[], vq_callback_t *callbacks[],
-		const char * const names[], struct irq_affinity *desc)
+			      struct virtqueue *vqs[],
+			      vq_callback_t *callbacks[],
+			      const char * const names[])
 {
 	struct virtio_pci_device *vp_dev = to_vp_device(vdev);
 	struct virtqueue *vq;
-	int rc = vp_find_vqs(vdev, nvqs, vqs, callbacks, names, desc);
+	int rc = vp_find_vqs(vdev, nvqs, vqs, callbacks, names);
 	if (rc)
 		return rc;
@@ -411,7 +412,7 @@ static void del_vq(struct virtqueue *vq)
 	vp_iowrite16(vq->index, &vp_dev->common->queue_select);
-	if (vp_dev->pci_dev->msix_enabled) {
+	if (vp_dev->msix_enabled) {
 		vp_iowrite16(VIRTIO_MSI_NO_VECTOR,
 			     &vp_dev->common->queue_msix_vector);
 		/* Flush the write out to device */
@@ -437,7 +438,6 @@ static const struct virtio_config_ops virtio_pci_config_nodev_ops = {
 	.finalize_features = vp_finalize_features,
 	.bus_name = vp_bus_name,
 	.set_vq_affinity = vp_set_vq_affinity,
-	.get_vq_affinity = vp_get_vq_affinity,
 };
 static const struct virtio_config_ops virtio_pci_config_ops = {
@@ -453,7 +453,6 @@ static const struct virtio_config_ops virtio_pci_config_ops = {
 	.finalize_features = vp_finalize_features,
 	.bus_name = vp_bus_name,
 	.set_vq_affinity = vp_set_vq_affinity,
-	.get_vq_affinity = vp_get_vq_affinity,
 };
 /**
diff --git a/include/linux/blk-mq-virtio.h b/include/linux/blk-mq-virtio.h
deleted file mode 100644
index b1ef6e1..0000000
--- a/include/linux/blk-mq-virtio.h
+++ /dev/null
@@ -1,10 +0,0 @@
-#ifndef _LINUX_BLK_MQ_VIRTIO_H
-#define _LINUX_BLK_MQ_VIRTIO_H
-
-struct blk_mq_tag_set;
-struct virtio_device;
-
-int blk_mq_virtio_map_queues(struct blk_mq_tag_set *set,
-		struct virtio_device *vdev, int first_vec);
-
-#endif /* _LINUX_BLK_MQ_VIRTIO_H */
diff --git a/include/linux/cpuhotplug.h b/include/linux/cpuhotplug.h
index 62d240e..bb790c4 100644
--- a/include/linux/cpuhotplug.h
+++ b/include/linux/cpuhotplug.h
@@ -26,6 +26,7 @@ enum cpuhp_state {
 	CPUHP_ARM_OMAP_WAKE_DEAD,
 	CPUHP_IRQ_POLL_DEAD,
 	CPUHP_BLOCK_SOFTIRQ_DEAD,
+	CPUHP_VIRT_SCSI_DEAD,
 	CPUHP_ACPI_CPUDRV_DEAD,
 	CPUHP_S390_PFAULT_DEAD,
 	CPUHP_BLK_MQ_DEAD,
diff --git a/include/linux/virtio_config.h b/include/linux/virtio_config.h
index 8355bab..26c155b 100644
--- a/include/linux/virtio_config.h
+++ b/include/linux/virtio_config.h
@@ -7,8 +7,6 @@
 #include
 #include
-struct irq_affinity;
-
 /**
  * virtio_config_ops - operations for configuring a virtio device
  * @get: read the value of a configuration field
@@ -58,7 +56,6 @@ struct irq_affinity;
  *	This returns a pointer to the bus name a la pci_name from which
  *	the caller can then copy.
  * @set_vq_affinity: set the affinity for a virtqueue.
- * @get_vq_affinity: get the affinity for a virtqueue (optional).
  */
 typedef void vq_callback_t(struct virtqueue *);
 struct virtio_config_ops {
@@ -71,15 +68,14 @@ struct virtio_config_ops {
 	void (*set_status)(struct virtio_device *vdev, u8 status);
 	void (*reset)(struct virtio_device *vdev);
 	int (*find_vqs)(struct virtio_device *, unsigned nvqs,
-			struct virtqueue *vqs[], vq_callback_t *callbacks[],
-			const char * const names[], struct irq_affinity *desc);
+			struct virtqueue *vqs[],
+			vq_callback_t *callbacks[],
+			const char * const names[]);
 	void (*del_vqs)(struct virtio_device *);
 	u64 (*get_features)(struct virtio_device *vdev);
 	int (*finalize_features)(struct virtio_device *vdev);
 	const char *(*bus_name)(struct virtio_device *vdev);
 	int (*set_vq_affinity)(struct virtqueue *vq, int cpu);
-	const struct cpumask *(*get_vq_affinity)(struct virtio_device *vdev,
-			int index);
 };
 /* If driver didn't advertise the feature, it will never appear. */
@@ -173,7 +169,7 @@ struct virtqueue *virtio_find_single_vq(struct virtio_device *vdev,
 	vq_callback_t *callbacks[] = { c };
 	const char *names[] = { n };
 	struct virtqueue *vq;
-	int err = vdev->config->find_vqs(vdev, 1, &vq, callbacks, names, NULL);
+	int err = vdev->config->find_vqs(vdev, 1, &vq, callbacks, names);
 	if (err < 0)
 		return ERR_PTR(err);
 	return vq;
diff --git a/include/uapi/linux/virtio_pci.h b/include/uapi/linux/virtio_pci.h
index 15b4385..90007a1 100644
--- a/include/uapi/linux/virtio_pci.h
+++ b/include/uapi/linux/virtio_pci.h
@@ -79,7 +79,7 @@
  * configuration space */
 #define VIRTIO_PCI_CONFIG_OFF(msix_enabled)	((msix_enabled) ? 24 : 20)
 /* Deprecated: please use VIRTIO_PCI_CONFIG_OFF instead */
-#define VIRTIO_PCI_CONFIG(dev)	VIRTIO_PCI_CONFIG_OFF((dev)->pci_dev->msix_enabled)
+#define VIRTIO_PCI_CONFIG(dev)	VIRTIO_PCI_CONFIG_OFF((dev)->msix_enabled)
 /* Virtio ABI version, this must match exactly */
 #define VIRTIO_PCI_ABI_VERSION		0
diff --git a/net/vmw_vsock/virtio_transport.c b/net/vmw_vsock/virtio_transport.c
index 9d24c0e..6788264 100644
--- a/net/vmw_vsock/virtio_transport.c
+++ b/net/vmw_vsock/virtio_transport.c
@@ -532,8 +532,7 @@ static int virtio_vsock_probe(struct virtio_device *vdev)
 	vsock->vdev = vdev;
 	ret = vsock->vdev->config->find_vqs(vsock->vdev, VSOCK_VQ_MAX,
-					    vsock->vqs, callbacks, names,
-					    NULL);
+					    vsock->vqs, callbacks, names);
 	if (ret < 0)
 		goto out;
-- 
2.7.4

--Bn2rw/3z4jIqBvZU--