Received: by 2002:ab2:3319:0:b0:1ef:7a0f:c32d with SMTP id i25csp887228lqc; Fri, 8 Mar 2024 15:07:53 -0800 (PST) X-Forwarded-Encrypted: i=3; AJvYcCUofHAM8x90TNtvBw7Lzk7SAShzABa6SPIJf+/CtNGSXW1KNBIRc5rNqocwqH63DqGBSJBZc4ySKRfa5yHVLe6/8/wLswChr7oPa9oPEg== X-Google-Smtp-Source: AGHT+IERcLSvQRGhc6NNLQo5BtpCiH5u1rHEg9qui8pNN7rd9oaZpK81c5KoGCDmUMkL/WiTsRf4 X-Received: by 2002:a05:620a:f92:b0:788:3fe3:13db with SMTP id b18-20020a05620a0f9200b007883fe313dbmr460168qkn.60.1709939273627; Fri, 08 Mar 2024 15:07:53 -0800 (PST) ARC-Seal: i=2; a=rsa-sha256; t=1709939273; cv=pass; d=google.com; s=arc-20160816; b=VMKbQkaKfxQ3huOCpKuXPnap/x7MypmCF9dxnxC5oSTpPvoZDo/EMwUgyVi5WnFKng F+Zl3lEMXix8zuQz7le7RQ/LEHPkTICkB+3rSHh1mqDSzJ2p2Tgeoqg8kfBxRJL9I4kO hlor/6nlaQm0nxClzguJX2r7JaouG6Qc5JaJ8qI6DAh+j/tuBHyNAcUZOFmXjJo1LrPU 2UWdMldPFguwRsd7UI02Dd3AWtJx8t45tw2ZFz71Tolq04mGkP8HZAaKDbpNVv7WyYye bJ+WbVnim3UEAz3k+o47QL3S7KDkFA8VnGI08/keNPBelRPFeWjRTdlBN6lrQOyHEIzm itPw== ARC-Message-Signature: i=2; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=content-transfer-encoding:mime-version:list-unsubscribe :list-subscribe:list-id:precedence:references:in-reply-to:message-id :date:subject:cc:to:from:dkim-signature; bh=T30nVL2WzB2fz1DAo6eXAKzxiL2l64osi6FTP+vjx0w=; fh=AON1tP1R2s0GAvYSrU1I/R6N81mWhzuaPDd8Cwaumc8=; b=B2YkdT+W4Y6GO4m443YTmuFp93UAy/IFaszDHyQ9+/jYqhm5fgWf5PMj694h4PAzC3 dWd4gDJ8ofgH7dFnNR+l7yrWa1JDhhjWWOS7GLf2df/dtcE+W8iBaXggavuqahM3ZgrE //nUBhLf42GTNcbK9Y/aAxQFR8cAxh4gLSg0RhGlAyqr2L+qxlLRmcyyYucRd8xU394V +7GOVxoO7kg/I6PzUrfBE0R46IN+ORNZWYTKrINRnDLtKX8gmkfPWEz3rgaiNNq5KYph MV8AreqPjGybBP57QbCvINcn2UcA9o3I8M5mB4iCJzsyMqrusuYBHJKGL74ki0Hje4VJ k3Pg==; dara=google.com ARC-Authentication-Results: i=2; mx.google.com; dkim=pass header.i=@redhat.com header.s=mimecast20190719 header.b="WM3eiY/T"; arc=pass (i=1 spf=pass spfdomain=redhat.com dkim=pass dkdomain=redhat.com dmarc=pass fromdomain=redhat.com); spf=pass (google.com: domain of linux-kernel+bounces-97681-linux.lists.archive=gmail.com@vger.kernel.org designates 147.75.199.223 as permitted sender) smtp.mailfrom="linux-kernel+bounces-97681-linux.lists.archive=gmail.com@vger.kernel.org"; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=redhat.com Return-Path: Received: from ny.mirrors.kernel.org (ny.mirrors.kernel.org. [147.75.199.223]) by mx.google.com with ESMTPS id a10-20020a05620a066a00b007883e7e3cd3si487455qkh.3.2024.03.08.15.07.53 for (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Fri, 08 Mar 2024 15:07:53 -0800 (PST) Received-SPF: pass (google.com: domain of linux-kernel+bounces-97681-linux.lists.archive=gmail.com@vger.kernel.org designates 147.75.199.223 as permitted sender) client-ip=147.75.199.223; Authentication-Results: mx.google.com; dkim=pass header.i=@redhat.com header.s=mimecast20190719 header.b="WM3eiY/T"; arc=pass (i=1 spf=pass spfdomain=redhat.com dkim=pass dkdomain=redhat.com dmarc=pass fromdomain=redhat.com); spf=pass (google.com: domain of linux-kernel+bounces-97681-linux.lists.archive=gmail.com@vger.kernel.org designates 147.75.199.223 as permitted sender) smtp.mailfrom="linux-kernel+bounces-97681-linux.lists.archive=gmail.com@vger.kernel.org"; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=redhat.com Received: from smtp.subspace.kernel.org (wormhole.subspace.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ny.mirrors.kernel.org (Postfix) with ESMTPS id 502601C20BD4 for ; Fri, 8 Mar 2024 23:07:53 +0000 (UTC) Received: from localhost.localdomain (localhost.localdomain [127.0.0.1]) by smtp.subspace.kernel.org (Postfix) with ESMTP id 1053A53373; Fri, 8 Mar 2024 23:06:19 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; dkim=pass (1024-bit key) header.d=redhat.com header.i=@redhat.com header.b="WM3eiY/T" Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.133.124]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id F210048787 for ; Fri, 8 Mar 2024 23:06:15 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=170.10.133.124 ARC-Seal:i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1709939177; cv=none; b=Lp3Br6k2CZwIsmgFd4e5Z7clLg4g/N7L+bqRq5vhvCJRqKkRLdpRSrG9PdvLuKDl6fToc2syQN6xNWkHKoIn2N8yOJ0Rxvcb+sZiMVgxg4YCX5ADxOsLQPfANiTwDMmc5fiAnldGdgrIxucFaFGVrCM2emqYbYSY04zK4AtdKcU= ARC-Message-Signature:i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1709939177; c=relaxed/simple; bh=IgRfhOHk13Mv0ASTofy/83LFTItxFaSl3Db+woVfbPI=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=GikI/d6fVe8jiBFvWx8p0UY95ON3fvmQzYdY8uw5kTG6eKY9yQjL3RB0aG76IJ5BI2Qg2hx+eeZgnd+C85IdomxIUmHFdcidbFDq1kk5LhYVdBjtUOqE6gL0lMYK3Oy7mjDp9tCzu8HvRPa1xI82J+Nz5cJKyxxEz5O2YO2vU0U= ARC-Authentication-Results:i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=redhat.com; spf=pass smtp.mailfrom=redhat.com; dkim=pass (1024-bit key) header.d=redhat.com header.i=@redhat.com header.b=WM3eiY/T; arc=none smtp.client-ip=170.10.133.124 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=redhat.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=redhat.com DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1709939174; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=T30nVL2WzB2fz1DAo6eXAKzxiL2l64osi6FTP+vjx0w=; b=WM3eiY/TLiptFQoYpL5lUinzVWeT1mvqd3nT11w4jJGm7jqpjdOzLNYEB1CiKXTIBkFsQU 6paIjuXpO5t3BF7Z3uYq/QjABtvVppPPds4jLWP+bsBl1qi/lD73Gvzr3CfP0GASA0tkJ8 SvJC5+CttOGr6cTOYJsz8VBcXl0m3zU= Received: from mimecast-mx02.redhat.com (mx-ext.redhat.com [66.187.233.73]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.3, cipher=TLS_AES_256_GCM_SHA384) id us-mta-648-VsIc0bXfMm6eAF6PUsRH1A-1; Fri, 08 Mar 2024 18:06:13 -0500 X-MC-Unique: VsIc0bXfMm6eAF6PUsRH1A-1 Received: from smtp.corp.redhat.com (int-mx01.intmail.prod.int.rdu2.redhat.com [10.11.54.1]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256) (No client certificate requested) by mimecast-mx02.redhat.com (Postfix) with ESMTPS id 0ABFB29AA387; Fri, 8 Mar 2024 23:06:13 +0000 (UTC) Received: from omen.home.shazbot.org (unknown [10.22.8.4]) by smtp.corp.redhat.com (Postfix) with ESMTP id 0130347CF; Fri, 8 Mar 2024 23:06:11 +0000 (UTC) From: Alex Williamson To: alex.williamson@redhat.com Cc: kvm@vger.kernel.org, eric.auger@redhat.com, clg@redhat.com, reinette.chatre@intel.com, linux-kernel@vger.kernel.org, kevin.tian@intel.com, stable@vger.kernel.org Subject: [PATCH v2 6/7] vfio/platform: Create persistent IRQ handlers Date: Fri, 8 Mar 2024 16:05:27 -0700 Message-ID: <20240308230557.805580-7-alex.williamson@redhat.com> In-Reply-To: <20240308230557.805580-1-alex.williamson@redhat.com> References: <20240308230557.805580-1-alex.williamson@redhat.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Scanned-By: MIMEDefang 3.4.1 on 10.11.54.1 The vfio-platform SET_IRQS ioctl currently allows loopback triggering of an interrupt before a signaling eventfd has been configured by the user, which thereby allows a NULL pointer dereference. Rather than register the IRQ relative to a valid trigger, register all IRQs in a disabled state in the device open path. This allows mask operations on the IRQ to nest within the overall enable state governed by a valid eventfd signal. This decouples @masked, protected by the @locked spinlock from @trigger, protected via the @igate mutex. In doing so, it's guaranteed that changes to @trigger cannot race the IRQ handlers because the IRQ handler is synchronously disabled before modifying the trigger, and loopback triggering of the IRQ via ioctl is safe due to serialization with trigger changes via igate. For compatibility, request_irq() failures are maintained to be local to the SET_IRQS ioctl rather than a fatal error in the open device path. This allows, for example, a userspace driver with polling mode support to continue to work regardless of moving the request_irq() call site. This necessarily blocks all SET_IRQS access to the failed index. Cc: Eric Auger Cc: stable@vger.kernel.org Fixes: 57f972e2b341 ("vfio/platform: trigger an interrupt via eventfd") Signed-off-by: Alex Williamson --- drivers/vfio/platform/vfio_platform_irq.c | 100 +++++++++++++++------- 1 file changed, 68 insertions(+), 32 deletions(-) diff --git a/drivers/vfio/platform/vfio_platform_irq.c b/drivers/vfio/platform/vfio_platform_irq.c index e5dcada9e86c..ef41ecef83af 100644 --- a/drivers/vfio/platform/vfio_platform_irq.c +++ b/drivers/vfio/platform/vfio_platform_irq.c @@ -136,6 +136,16 @@ static int vfio_platform_set_irq_unmask(struct vfio_platform_device *vdev, return 0; } +/* + * The trigger eventfd is guaranteed valid in the interrupt path + * and protected by the igate mutex when triggered via ioctl. + */ +static void vfio_send_eventfd(struct vfio_platform_irq *irq_ctx) +{ + if (likely(irq_ctx->trigger)) + eventfd_signal(irq_ctx->trigger); +} + static irqreturn_t vfio_automasked_irq_handler(int irq, void *dev_id) { struct vfio_platform_irq *irq_ctx = dev_id; @@ -155,7 +165,7 @@ static irqreturn_t vfio_automasked_irq_handler(int irq, void *dev_id) spin_unlock_irqrestore(&irq_ctx->lock, flags); if (ret == IRQ_HANDLED) - eventfd_signal(irq_ctx->trigger); + vfio_send_eventfd(irq_ctx); return ret; } @@ -164,52 +174,40 @@ static irqreturn_t vfio_irq_handler(int irq, void *dev_id) { struct vfio_platform_irq *irq_ctx = dev_id; - eventfd_signal(irq_ctx->trigger); + vfio_send_eventfd(irq_ctx); return IRQ_HANDLED; } static int vfio_set_trigger(struct vfio_platform_device *vdev, int index, - int fd, irq_handler_t handler) + int fd) { struct vfio_platform_irq *irq = &vdev->irqs[index]; struct eventfd_ctx *trigger; - int ret; if (irq->trigger) { - irq_clear_status_flags(irq->hwirq, IRQ_NOAUTOEN); - free_irq(irq->hwirq, irq); - kfree(irq->name); + disable_irq(irq->hwirq); eventfd_ctx_put(irq->trigger); irq->trigger = NULL; } if (fd < 0) /* Disable only */ return 0; - irq->name = kasprintf(GFP_KERNEL_ACCOUNT, "vfio-irq[%d](%s)", - irq->hwirq, vdev->name); - if (!irq->name) - return -ENOMEM; trigger = eventfd_ctx_fdget(fd); - if (IS_ERR(trigger)) { - kfree(irq->name); + if (IS_ERR(trigger)) return PTR_ERR(trigger); - } irq->trigger = trigger; - irq_set_status_flags(irq->hwirq, IRQ_NOAUTOEN); - ret = request_irq(irq->hwirq, handler, 0, irq->name, irq); - if (ret) { - kfree(irq->name); - eventfd_ctx_put(trigger); - irq->trigger = NULL; - return ret; - } - - if (!irq->masked) - enable_irq(irq->hwirq); + /* + * irq->masked effectively provides nested disables within the overall + * enable relative to trigger. Specifically request_irq() is called + * with NO_AUTOEN, therefore the IRQ is initially disabled. The user + * may only further disable the IRQ with a MASK operations because + * irq->masked is initially false. + */ + enable_irq(irq->hwirq); return 0; } @@ -228,7 +226,7 @@ static int vfio_platform_set_irq_trigger(struct vfio_platform_device *vdev, handler = vfio_irq_handler; if (!count && (flags & VFIO_IRQ_SET_DATA_NONE)) - return vfio_set_trigger(vdev, index, -1, handler); + return vfio_set_trigger(vdev, index, -1); if (start != 0 || count != 1) return -EINVAL; @@ -236,7 +234,7 @@ static int vfio_platform_set_irq_trigger(struct vfio_platform_device *vdev, if (flags & VFIO_IRQ_SET_DATA_EVENTFD) { int32_t fd = *(int32_t *)data; - return vfio_set_trigger(vdev, index, fd, handler); + return vfio_set_trigger(vdev, index, fd); } if (flags & VFIO_IRQ_SET_DATA_NONE) { @@ -260,6 +258,14 @@ int vfio_platform_set_irqs_ioctl(struct vfio_platform_device *vdev, unsigned start, unsigned count, uint32_t flags, void *data) = NULL; + /* + * For compatibility, errors from request_irq() are local to the + * SET_IRQS path and reflected in the name pointer. This allows, + * for example, polling mode fallback for an exclusive IRQ failure. + */ + if (IS_ERR(vdev->irqs[index].name)) + return PTR_ERR(vdev->irqs[index].name); + switch (flags & VFIO_IRQ_SET_ACTION_TYPE_MASK) { case VFIO_IRQ_SET_ACTION_MASK: func = vfio_platform_set_irq_mask; @@ -280,7 +286,7 @@ int vfio_platform_set_irqs_ioctl(struct vfio_platform_device *vdev, int vfio_platform_irq_init(struct vfio_platform_device *vdev) { - int cnt = 0, i; + int cnt = 0, i, ret = 0; while (vdev->get_irq(vdev, cnt) >= 0) cnt++; @@ -292,29 +298,54 @@ int vfio_platform_irq_init(struct vfio_platform_device *vdev) for (i = 0; i < cnt; i++) { int hwirq = vdev->get_irq(vdev, i); + irq_handler_t handler = vfio_irq_handler; - if (hwirq < 0) + if (hwirq < 0) { + ret = -EINVAL; goto err; + } spin_lock_init(&vdev->irqs[i].lock); vdev->irqs[i].flags = VFIO_IRQ_INFO_EVENTFD; - if (irq_get_trigger_type(hwirq) & IRQ_TYPE_LEVEL_MASK) + if (irq_get_trigger_type(hwirq) & IRQ_TYPE_LEVEL_MASK) { vdev->irqs[i].flags |= VFIO_IRQ_INFO_MASKABLE | VFIO_IRQ_INFO_AUTOMASKED; + handler = vfio_automasked_irq_handler; + } vdev->irqs[i].count = 1; vdev->irqs[i].hwirq = hwirq; vdev->irqs[i].masked = false; + vdev->irqs[i].name = kasprintf(GFP_KERNEL_ACCOUNT, + "vfio-irq[%d](%s)", hwirq, + vdev->name); + if (!vdev->irqs[i].name) { + ret = -ENOMEM; + goto err; + } + + ret = request_irq(hwirq, handler, IRQF_NO_AUTOEN, + vdev->irqs[i].name, &vdev->irqs[i]); + if (ret) { + kfree(vdev->irqs[i].name); + vdev->irqs[i].name = ERR_PTR(ret); + } } vdev->num_irqs = cnt; return 0; err: + for (--i; i >= 0; i--) { + if (!IS_ERR(vdev->irqs[i].name)) { + free_irq(vdev->irqs[i].hwirq, &vdev->irqs[i]); + kfree(vdev->irqs[i].name); + } + } kfree(vdev->irqs); - return -EINVAL; + return ret; } void vfio_platform_irq_cleanup(struct vfio_platform_device *vdev) @@ -324,7 +355,12 @@ void vfio_platform_irq_cleanup(struct vfio_platform_device *vdev) for (i = 0; i < vdev->num_irqs; i++) { vfio_virqfd_disable(&vdev->irqs[i].mask); vfio_virqfd_disable(&vdev->irqs[i].unmask); - vfio_set_trigger(vdev, i, -1, NULL); + if (!IS_ERR(vdev->irqs[i].name)) { + free_irq(vdev->irqs[i].hwirq, &vdev->irqs[i]); + if (vdev->irqs[i].trigger) + eventfd_ctx_put(vdev->irqs[i].trigger); + kfree(vdev->irqs[i].name); + } } vdev->num_irqs = 0; -- 2.44.0