Received: by 2002:a05:6a10:22f:0:0:0:0 with SMTP id 15csp516856pxk; Thu, 3 Sep 2020 06:06:05 -0700 (PDT) X-Google-Smtp-Source: ABdhPJzVcF95Bjd7RFPzQLofWm9oW6p0N7cBCQE4Iqrly4Dr+Dn0WDnnQyS3LDnl8wANRQI8Qsp5 X-Received: by 2002:a17:906:2686:: with SMTP id t6mr1985278ejc.406.1599138364929; Thu, 03 Sep 2020 06:06:04 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1599138364; cv=none; d=google.com; s=arc-20160816; b=p0bd9epRkngFIu1yt9n7uWdoE0Ri+WYOfoEDivhQhsmuX3cw5uOt756HgH2Nm/wYT3 v6+aLGhRBdS1hCItmW2LRoP5bmEMucsVdImt+Ni3IurHCnZqe0XFnp6GgVGbFCwM8/S5 /r2ikQKPsptGEHh6gzwY+3sR0voSu4saaBKjptcYyRLG++e6BvqoCzPMqr39ysk9ohm5 qVBAlfwbIhUQ2KbrsTV0sH7LtaQVBEQe7leVwgmbv2oJBzFyfzmRtwP7Zn/4E+T9I0NK D3ngNh/Vn57IuP1zpF/yixqdrxB2hFpP7loSiqdyDYJCTTvVG5Khhj1en3+lUTaMipcE asJg== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:sender:content-transfer-encoding:mime-version :references:in-reply-to:message-id:date:subject:cc:to:from :ironport-sdr:ironport-sdr; bh=pPoK6c29dXuWYITW5bO5sZ/XRDLP9qdZnmi37pokJrc=; b=D7lDD0i4teOPysmKZT/lgUdlPR8+vxkaxsU2UYyVH6TFBBen7wkOIQ6Jqr0IcCNzcB EC40pKB9uwUMWP/d60Au0uvyd5XcDaapBmVd9Sj8dtMEZsji3EFjXp1HKJozCcmV0Uel rhi3oschSt0Z9Xj2zuWM8vGak/J3Hbz41W076qiZ8F7KWjrbDMoy0K6iGSvEc7AXtNMG hKXIhT1phCZCx6aMQZE8SbihNm/W6ubfXOVriSmw9j6GlcSBqraAUX6IjYbGPFwfyzTw adkZsNhWoMLenKQ4bP8wdplDRCDIW8RvEFFF5yU9jDPV8g+Tk8Dpz2rOO52ZY2uHQ9oq 9g4Q== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=intel.com Return-Path: Received: from vger.kernel.org (vger.kernel.org. [23.128.96.18]) by mx.google.com with ESMTP id i12si1558528edy.604.2020.09.03.06.05.40; Thu, 03 Sep 2020 06:06:04 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) client-ip=23.128.96.18; Authentication-Results: mx.google.com; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=intel.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1728975AbgICNAa (ORCPT + 99 others); Thu, 3 Sep 2020 09:00:30 -0400 Received: from mga09.intel.com ([134.134.136.24]:15200 "EHLO mga09.intel.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1728970AbgICMqG (ORCPT ); Thu, 3 Sep 2020 08:46:06 -0400 IronPort-SDR: vl2FFEdSesjYSXjTTy8FLPrlTs0aGhUlAePD1LfdLrImuNoPC3R078sR3A8zsssrv6UNk+5jj2 3qDSPN+ia+pQ== X-IronPort-AV: E=McAfee;i="6000,8403,9732"; a="158549508" X-IronPort-AV: E=Sophos;i="5.76,386,1592895600"; d="scan'208";a="158549508" X-Amp-Result: SKIPPED(no attachment in message) X-Amp-File-Uploaded: False Received: from orsmga005.jf.intel.com ([10.7.209.41]) by orsmga102.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 03 Sep 2020 05:43:55 -0700 IronPort-SDR: DHOVffJCmlL9Lk6lAZIA/b/JYLLzq4+vkdWXAVnrkBi/kEJ0p2BaTWwoSItEeZGYciS6zTc0hG Qq+4y2ol3eqg== X-IronPort-AV: E=Sophos;i="5.76,386,1592895600"; d="scan'208";a="478039742" Received: from shsi6026.sh.intel.com (HELO localhost) ([10.239.147.135]) by orsmga005-auth.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 03 Sep 2020 05:43:50 -0700 From: shuo.a.liu@intel.com To: linux-kernel@vger.kernel.org, x86@kernel.org Cc: Greg Kroah-Hartman , "H . Peter Anvin" , Thomas Gleixner , Ingo Molnar , Borislav Petkov , Sean Christopherson , Yu Wang , Reinette Chatre , Shuo Liu , Zhi Wang , Zhenyu Wang Subject: [PATCH v2 16/17] virt: acrn: Introduce irqfd Date: Thu, 3 Sep 2020 20:42:00 +0800 Message-Id: <20200903124201.17275-17-shuo.a.liu@intel.com> X-Mailer: git-send-email 2.28.0 In-Reply-To: <20200903124201.17275-1-shuo.a.liu@intel.com> References: <20200903124201.17275-1-shuo.a.liu@intel.com> MIME-Version: 1.0 Content-Transfer-Encoding: 8bit Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org From: Shuo Liu irqfd is a mechanism to inject a specific interrupt to a User VM using a decoupled eventfd mechanism. Vhost is a kernel-level virtio server which uses eventfd for interrupt injection. To support vhost on ACRN, irqfd is introduced in HSM. HSM provides ioctls to associate a virtual Message Signaled Interrupt (MSI) with an eventfd. The corresponding virtual MSI will be injected into a User VM once the eventfd got signal. Signed-off-by: Shuo Liu Reviewed-by: Zhi Wang Reviewed-by: Reinette Chatre Cc: Zhi Wang Cc: Zhenyu Wang Cc: Yu Wang Cc: Reinette Chatre Cc: Greg Kroah-Hartman --- drivers/virt/acrn/Makefile | 2 +- drivers/virt/acrn/acrn_drv.h | 10 ++ drivers/virt/acrn/hsm.c | 7 ++ drivers/virt/acrn/irqfd.c | 236 +++++++++++++++++++++++++++++++++++ drivers/virt/acrn/vm.c | 3 + include/uapi/linux/acrn.h | 15 +++ 6 files changed, 272 insertions(+), 1 deletion(-) create mode 100644 drivers/virt/acrn/irqfd.c diff --git a/drivers/virt/acrn/Makefile b/drivers/virt/acrn/Makefile index 755b583b32ca..08ce641dcfa1 100644 --- a/drivers/virt/acrn/Makefile +++ b/drivers/virt/acrn/Makefile @@ -1,3 +1,3 @@ # SPDX-License-Identifier: GPL-2.0 obj-$(CONFIG_ACRN_HSM) := acrn.o -acrn-y := hsm.o vm.o mm.o ioreq.o ioeventfd.o +acrn-y := hsm.o vm.o mm.o ioreq.o ioeventfd.o irqfd.o diff --git a/drivers/virt/acrn/acrn_drv.h b/drivers/virt/acrn/acrn_drv.h index 4cb8a61d4ccd..3495f529574a 100644 --- a/drivers/virt/acrn/acrn_drv.h +++ b/drivers/virt/acrn/acrn_drv.h @@ -159,6 +159,9 @@ extern rwlock_t acrn_vm_list_lock; * @ioeventfds_lock: Lock to protect ioeventfds list * @ioeventfds: List to link all hsm_ioeventfd * @ioeventfd_client: I/O client for ioeventfds of the VM + * @irqfds_lock: Lock to protect irqfds list + * @irqfds: List to link all hsm_irqfd + * @irqfd_wq: Workqueue for irqfd async shutdown */ struct acrn_vm { struct device *dev; @@ -179,6 +182,9 @@ struct acrn_vm { struct mutex ioeventfds_lock; struct list_head ioeventfds; struct acrn_ioreq_client *ioeventfd_client; + struct mutex irqfds_lock; + struct list_head irqfds; + struct workqueue_struct *irqfd_wq; }; struct acrn_vm *acrn_vm_create(struct acrn_vm *vm, @@ -215,4 +221,8 @@ int acrn_ioeventfd_init(struct acrn_vm *vm); int acrn_ioeventfd_config(struct acrn_vm *vm, struct acrn_ioeventfd *args); void acrn_ioeventfd_deinit(struct acrn_vm *vm); +int acrn_irqfd_init(struct acrn_vm *vm); +int acrn_irqfd_config(struct acrn_vm *vm, struct acrn_irqfd *args); +void acrn_irqfd_deinit(struct acrn_vm *vm); + #endif /* __ACRN_HSM_DRV_H */ diff --git a/drivers/virt/acrn/hsm.c b/drivers/virt/acrn/hsm.c index 7290c67fca15..f5bb7ae34001 100644 --- a/drivers/virt/acrn/hsm.c +++ b/drivers/virt/acrn/hsm.c @@ -118,6 +118,7 @@ static long acrn_dev_ioctl(struct file *filp, unsigned int cmd, struct acrn_vm_memmap memmap; struct acrn_msi_entry *msi; struct acrn_pcidev *pcidev; + struct acrn_irqfd irqfd; struct page *page; u64 cstate_cmd; int ret = 0; @@ -307,6 +308,12 @@ static long acrn_dev_ioctl(struct file *filp, unsigned int cmd, ret = acrn_ioeventfd_config(vm, &ioeventfd); break; + case ACRN_IOCTL_IRQFD: + if (copy_from_user(&irqfd, (void __user *)ioctl_param, + sizeof(irqfd))) + return -EFAULT; + ret = acrn_irqfd_config(vm, &irqfd); + break; default: dev_warn(dev, "Unknown IOCTL 0x%x!\n", cmd); ret = -ENOTTY; diff --git a/drivers/virt/acrn/irqfd.c b/drivers/virt/acrn/irqfd.c new file mode 100644 index 000000000000..0538464ca851 --- /dev/null +++ b/drivers/virt/acrn/irqfd.c @@ -0,0 +1,236 @@ +// SPDX-License-Identifier: GPL-2.0 +/* + * ACRN HSM irqfd: use eventfd objects to inject virtual interrupts + * + * Copyright (C) 2020 Intel Corporation. All rights reserved. + * + * Authors: + * Shuo Liu + * Yakui Zhao + */ +#define pr_fmt(fmt) "acrn: " fmt + +#include +#include +#include +#include + +#include "acrn_drv.h" + +static LIST_HEAD(acrn_irqfd_clients); +static DEFINE_MUTEX(acrn_irqfds_mutex); + +/** + * struct hsm_irqfd - Properties of HSM irqfd + * @vm: Associated VM pointer + * @wait: Entry of wait-queue + * @shutdown: Async shutdown work + * @eventfd: Associated eventfd + * @list: Entry within &acrn_vm.irqfds of irqfds of a VM + * @pt: Structure for select/poll on the associated eventfd + * @msi: MSI data + */ +struct hsm_irqfd { + struct acrn_vm *vm; + wait_queue_entry_t wait; + struct work_struct shutdown; + struct eventfd_ctx *eventfd; + struct list_head list; + poll_table pt; + struct acrn_msi_entry msi; +}; + +static void acrn_irqfd_inject(struct hsm_irqfd *irqfd) +{ + struct acrn_vm *vm = irqfd->vm; + + acrn_msi_inject(vm, irqfd->msi.msi_addr, + irqfd->msi.msi_data); +} + +static void hsm_irqfd_shutdown(struct hsm_irqfd *irqfd) +{ + u64 cnt; + + lockdep_assert_held(&irqfd->vm->irqfds_lock); + + /* remove from wait queue */ + list_del_init(&irqfd->list); + eventfd_ctx_remove_wait_queue(irqfd->eventfd, &irqfd->wait, &cnt); + eventfd_ctx_put(irqfd->eventfd); + kfree(irqfd); +} + +static void hsm_irqfd_shutdown_work(struct work_struct *work) +{ + struct hsm_irqfd *irqfd; + struct acrn_vm *vm; + + irqfd = container_of(work, struct hsm_irqfd, shutdown); + vm = irqfd->vm; + mutex_lock(&vm->irqfds_lock); + if (!list_empty(&irqfd->list)) + hsm_irqfd_shutdown(irqfd); + mutex_unlock(&vm->irqfds_lock); +} + +/* Called with wqh->lock held and interrupts disabled */ +static int hsm_irqfd_wakeup(wait_queue_entry_t *wait, unsigned int mode, + int sync, void *key) +{ + unsigned long poll_bits = (unsigned long)key; + struct hsm_irqfd *irqfd; + struct acrn_vm *vm; + + irqfd = container_of(wait, struct hsm_irqfd, wait); + vm = irqfd->vm; + if (poll_bits & POLLIN) + /* An event has been signaled, inject an interrupt */ + acrn_irqfd_inject(irqfd); + + if (poll_bits & POLLHUP) + /* Do shutdown work in thread to hold wqh->lock */ + queue_work(vm->irqfd_wq, &irqfd->shutdown); + + return 0; +} + +static void hsm_irqfd_poll_func(struct file *file, wait_queue_head_t *wqh, + poll_table *pt) +{ + struct hsm_irqfd *irqfd; + + irqfd = container_of(pt, struct hsm_irqfd, pt); + add_wait_queue(wqh, &irqfd->wait); +} + +/* + * Assign an eventfd to a VM and create a HSM irqfd associated with the + * eventfd. The properties of the HSM irqfd are built from a &struct + * acrn_irqfd. + */ +static int acrn_irqfd_assign(struct acrn_vm *vm, struct acrn_irqfd *args) +{ + struct eventfd_ctx *eventfd = NULL; + struct hsm_irqfd *irqfd, *tmp; + unsigned int events; + struct fd f; + int ret = 0; + + irqfd = kzalloc(sizeof(*irqfd), GFP_KERNEL); + if (!irqfd) + return -ENOMEM; + + irqfd->vm = vm; + memcpy(&irqfd->msi, &args->msi, sizeof(args->msi)); + INIT_LIST_HEAD(&irqfd->list); + INIT_WORK(&irqfd->shutdown, hsm_irqfd_shutdown_work); + + f = fdget(args->fd); + if (!f.file) { + ret = -EBADF; + goto out; + } + + eventfd = eventfd_ctx_fileget(f.file); + if (IS_ERR(eventfd)) { + ret = PTR_ERR(eventfd); + goto fail; + } + + irqfd->eventfd = eventfd; + + /* + * Install custom wake-up handling to be notified whenever underlying + * eventfd is signaled. + */ + init_waitqueue_func_entry(&irqfd->wait, hsm_irqfd_wakeup); + init_poll_funcptr(&irqfd->pt, hsm_irqfd_poll_func); + + mutex_lock(&vm->irqfds_lock); + list_for_each_entry(tmp, &vm->irqfds, list) { + if (irqfd->eventfd != tmp->eventfd) + continue; + ret = -EBUSY; + mutex_unlock(&vm->irqfds_lock); + goto fail; + } + list_add_tail(&irqfd->list, &vm->irqfds); + mutex_unlock(&vm->irqfds_lock); + + /* Check the pending event in this stage */ + events = f.file->f_op->poll(f.file, &irqfd->pt); + + if (events & POLLIN) + acrn_irqfd_inject(irqfd); + + fdput(f); + return 0; +fail: + if (eventfd && !IS_ERR(eventfd)) + eventfd_ctx_put(eventfd); + + fdput(f); +out: + kfree(irqfd); + return ret; +} + +static int acrn_irqfd_deassign(struct acrn_vm *vm, + struct acrn_irqfd *args) +{ + struct hsm_irqfd *irqfd, *tmp; + struct eventfd_ctx *eventfd; + + eventfd = eventfd_ctx_fdget(args->fd); + if (IS_ERR(eventfd)) + return PTR_ERR(eventfd); + + mutex_lock(&vm->irqfds_lock); + list_for_each_entry_safe(irqfd, tmp, &vm->irqfds, list) { + if (irqfd->eventfd == eventfd) { + hsm_irqfd_shutdown(irqfd); + break; + } + } + mutex_unlock(&vm->irqfds_lock); + eventfd_ctx_put(eventfd); + + return 0; +} + +int acrn_irqfd_config(struct acrn_vm *vm, struct acrn_irqfd *args) +{ + int ret; + + if (args->flags & ACRN_IRQFD_FLAG_DEASSIGN) + ret = acrn_irqfd_deassign(vm, args); + else + ret = acrn_irqfd_assign(vm, args); + + return ret; +} + +int acrn_irqfd_init(struct acrn_vm *vm) +{ + INIT_LIST_HEAD(&vm->irqfds); + mutex_init(&vm->irqfds_lock); + vm->irqfd_wq = alloc_workqueue("acrn_irqfd-%u", 0, 0, vm->vmid); + if (!vm->irqfd_wq) + return -ENOMEM; + + dev_dbg(vm->dev, "VM %u irqfd init.\n", vm->vmid); + return 0; +} + +void acrn_irqfd_deinit(struct acrn_vm *vm) +{ + struct hsm_irqfd *irqfd, *next; + + dev_dbg(vm->dev, "VM %u irqfd deinit.\n", vm->vmid); + destroy_workqueue(vm->irqfd_wq); + mutex_lock(&vm->irqfds_lock); + list_for_each_entry_safe(irqfd, next, &vm->irqfds, list) + hsm_irqfd_shutdown(irqfd); + mutex_unlock(&vm->irqfds_lock); +} diff --git a/drivers/virt/acrn/vm.c b/drivers/virt/acrn/vm.c index 53d504e26803..bd3dab5eee92 100644 --- a/drivers/virt/acrn/vm.c +++ b/drivers/virt/acrn/vm.c @@ -48,6 +48,7 @@ struct acrn_vm *acrn_vm_create(struct acrn_vm *vm, write_unlock_bh(&acrn_vm_list_lock); acrn_ioeventfd_init(vm); + acrn_irqfd_init(vm); dev_dbg(vm->dev, "VM %u created.\n", vm->vmid); return vm; } @@ -66,7 +67,9 @@ int acrn_vm_destroy(struct acrn_vm *vm) write_unlock_bh(&acrn_vm_list_lock); acrn_ioeventfd_deinit(vm); + acrn_irqfd_deinit(vm); acrn_ioreq_deinit(vm); + if (vm->monitor_page) { put_page(vm->monitor_page); vm->monitor_page = NULL; diff --git a/include/uapi/linux/acrn.h b/include/uapi/linux/acrn.h index 7a99124c7d4d..75a687838a43 100644 --- a/include/uapi/linux/acrn.h +++ b/include/uapi/linux/acrn.h @@ -411,6 +411,19 @@ struct acrn_ioeventfd { __u64 data; }; +#define ACRN_IRQFD_FLAG_DEASSIGN 0x01 +/** + * struct acrn_irqfd - Data to operate a &struct hsm_irqfd + * @fd: The fd of eventfd associated with a hsm_irqfd + * @flags: Logical-OR of ACRN_IRQFD_FLAG_* + * @msi: Info of MSI associated with the irqfd + */ +struct acrn_irqfd { + __s32 fd; + __u32 flags; + struct acrn_msi_entry msi; +}; + /* The ioctl type, documented in ioctl-number.rst */ #define ACRN_IOCTL_TYPE 0xA2 @@ -467,5 +480,7 @@ struct acrn_ioeventfd { #define ACRN_IOCTL_IOEVENTFD \ _IOW(ACRN_IOCTL_TYPE, 0x70, struct acrn_ioeventfd) +#define ACRN_IOCTL_IRQFD \ + _IOW(ACRN_IOCTL_TYPE, 0x71, struct acrn_irqfd) #endif /* _UAPI_ACRN_H */ -- 2.28.0