From: Greg Kroah-Hartman
To: linux-kernel@vger.kernel.org
Cc: Greg Kroah-Hartman, stable@vger.kernel.org, Alex Williamson, Greg Rose
Subject: [PATCH 4.4 71/97] vfio-pci: Virtualize PCIe & AF FLR
Date: Sun, 22 Apr 2018 15:53:49 +0200
Message-Id: <20180422135309.141199435@linuxfoundation.org>
In-Reply-To: <20180422135304.577223025@linuxfoundation.org>
References: <20180422135304.577223025@linuxfoundation.org>

4.4-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Alex Williamson

commit ddf9dc0eb5314d6dac8b19b1cc37c739c6896e7e upstream.

We use a BAR restore trick to try to detect when a user has performed
a device reset, possibly through FLR or other backdoors, to put things
back into a working state.  This is important for backdoor resets, but
we can actually just virtualize the "front door" resets provided via
PCIe and AF FLR.  Set these bits as virtualized + writable, allowing
the default write to set them in vconfig, then we can simply check the
bit, perform an FLR of our own, and clear the bit.  We don't actually
have the granularity in PCI to specify the type of reset we want to
do, but generally devices don't implement both PCIe and AF FLR and
we'll favor these over other types of reset, so we should generally
line up.  We do test whether the device provides the requested FLR type
to stay consistent with hardware capabilities though.

This seems to fix several instances of devices getting into bad states
with userspace drivers, like dpdk, running inside a VM.

Signed-off-by: Alex Williamson
Reviewed-by: Greg Rose
Signed-off-by: Greg Kroah-Hartman
---
 drivers/vfio/pci/vfio_pci_config.c |   82 ++++++++++++++++++++++++++++++++++---
 1 file changed, 77 insertions(+), 5 deletions(-)

--- a/drivers/vfio/pci/vfio_pci_config.c
+++ b/drivers/vfio/pci/vfio_pci_config.c
@@ -752,6 +752,40 @@ static int __init init_pci_cap_pcix_perm
 	return 0;
 }
 
+static int vfio_exp_config_write(struct vfio_pci_device *vdev, int pos,
+				 int count, struct perm_bits *perm,
+				 int offset, __le32 val)
+{
+	__le16 *ctrl = (__le16 *)(vdev->vconfig + pos -
+				  offset + PCI_EXP_DEVCTL);
+
+	count = vfio_default_config_write(vdev, pos, count, perm, offset, val);
+	if (count < 0)
+		return count;
+
+	/*
+	 * The FLR bit is virtualized, if set and the device supports PCIe
+	 * FLR, issue a reset_function.  Regardless, clear the bit, the spec
+	 * requires it to be always read as zero.  NB, reset_function might
+	 * not use a PCIe FLR, we don't have that level of granularity.
+	 */
+	if (*ctrl & cpu_to_le16(PCI_EXP_DEVCTL_BCR_FLR)) {
+		u32 cap;
+		int ret;
+
+		*ctrl &= ~cpu_to_le16(PCI_EXP_DEVCTL_BCR_FLR);
+
+		ret = pci_user_read_config_dword(vdev->pdev,
+						 pos - offset + PCI_EXP_DEVCAP,
+						 &cap);
+
+		if (!ret && (cap & PCI_EXP_DEVCAP_FLR))
+			pci_try_reset_function(vdev->pdev);
+	}
+
+	return count;
+}
+
 /* Permissions for PCI Express capability */
 static int __init init_pci_cap_exp_perm(struct perm_bits *perm)
 {
@@ -759,26 +793,64 @@ static int __init init_pci_cap_exp_perm(
 	if (alloc_perm_bits(perm, PCI_CAP_EXP_ENDPOINT_SIZEOF_V2))
 		return -ENOMEM;
 
+	perm->writefn = vfio_exp_config_write;
+
 	p_setb(perm, PCI_CAP_LIST_NEXT, (u8)ALL_VIRT, NO_WRITE);
 
 	/*
-	 * Allow writes to device control fields (includes FLR!)
-	 * but not to devctl_phantom which could confuse IOMMU
-	 * or to the ARI bit in devctl2 which is set at probe time
+	 * Allow writes to device control fields, except devctl_phantom,
+	 * which could confuse IOMMU, and the ARI bit in devctl2, which
+	 * is set at probe time.  FLR gets virtualized via our writefn.
 	 */
-	p_setw(perm, PCI_EXP_DEVCTL, NO_VIRT, ~PCI_EXP_DEVCTL_PHANTOM);
+	p_setw(perm, PCI_EXP_DEVCTL,
+	       PCI_EXP_DEVCTL_BCR_FLR, ~PCI_EXP_DEVCTL_PHANTOM);
 	p_setw(perm, PCI_EXP_DEVCTL2, NO_VIRT, ~PCI_EXP_DEVCTL2_ARI);
 	return 0;
 }
 
+static int vfio_af_config_write(struct vfio_pci_device *vdev, int pos,
+				int count, struct perm_bits *perm,
+				int offset, __le32 val)
+{
+	u8 *ctrl = vdev->vconfig + pos - offset + PCI_AF_CTRL;
+
+	count = vfio_default_config_write(vdev, pos, count, perm, offset, val);
+	if (count < 0)
+		return count;
+
+	/*
+	 * The FLR bit is virtualized, if set and the device supports AF
+	 * FLR, issue a reset_function.  Regardless, clear the bit, the spec
+	 * requires it to be always read as zero.  NB, reset_function might
+	 * not use an AF FLR, we don't have that level of granularity.
+	 */
+	if (*ctrl & PCI_AF_CTRL_FLR) {
+		u8 cap;
+		int ret;
+
+		*ctrl &= ~PCI_AF_CTRL_FLR;
+
+		ret = pci_user_read_config_byte(vdev->pdev,
+						pos - offset + PCI_AF_CAP,
+						&cap);
+
+		if (!ret && (cap & PCI_AF_CAP_FLR) && (cap & PCI_AF_CAP_TP))
+			pci_try_reset_function(vdev->pdev);
+	}
+
+	return count;
+}
+
 /* Permissions for Advanced Function capability */
 static int __init init_pci_cap_af_perm(struct perm_bits *perm)
 {
 	if (alloc_perm_bits(perm, pci_cap_length[PCI_CAP_ID_AF]))
 		return -ENOMEM;
 
+	perm->writefn = vfio_af_config_write;
+
 	p_setb(perm, PCI_CAP_LIST_NEXT, (u8)ALL_VIRT, NO_WRITE);
-	p_setb(perm, PCI_AF_CTRL, NO_VIRT, PCI_AF_CTRL_FLR);
+	p_setb(perm, PCI_AF_CTRL, PCI_AF_CTRL_FLR, PCI_AF_CTRL_FLR);
 	return 0;
 }
 
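
The write path described in the commit message (mark the FLR bit as
virtualized + writable, let the default write land it in vconfig, then
check the bit, reset, and clear it) can be modeled outside the kernel.
What follows is only a minimal userspace sketch, not the driver code:
struct fake_dev, model_default_write() and model_exp_write() are
hypothetical names, the reset is replaced by a counter, and the split
between virtualized and pass-through bits in the default write is a
simplified assumption about how vfio_default_config_write() behaves.

```c
#include <assert.h>
#include <stdint.h>

/* Bit positions in the PCIe Device Control register. */
#define DEVCTL_PHANTOM 0x0200u	/* corresponds to PCI_EXP_DEVCTL_PHANTOM */
#define DEVCTL_FLR     0x8000u	/* corresponds to PCI_EXP_DEVCTL_BCR_FLR */

/* Hypothetical stand-in for a single 16-bit register of one device. */
struct fake_dev {
	uint16_t vconfig;	/* virtual copy that the user reads back */
	uint16_t hw;		/* stand-in for the physical register */
	int supports_flr;	/* stand-in for the DEVCAP FLR capability bit */
	int resets;		/* number of function resets issued */
};

/* The same masks the patch passes to p_setw() for PCI_EXP_DEVCTL. */
static const uint16_t virt_mask = DEVCTL_FLR;
static const uint16_t write_mask = (uint16_t)~DEVCTL_PHANTOM;

/*
 * Simplified assumption: writable bits that are virtualized only reach
 * vconfig, writable bits that are not virtualized reach the hardware.
 */
static void model_default_write(struct fake_dev *d, uint16_t val)
{
	uint16_t v = (uint16_t)(write_mask & virt_mask);
	uint16_t p = (uint16_t)(write_mask & ~virt_mask);

	d->vconfig = (uint16_t)((d->vconfig & ~v) | (val & v));
	d->hw = (uint16_t)((d->hw & ~p) | (val & p));
}

/* Models the order of operations in the new write handler. */
static void model_exp_write(struct fake_dev *d, uint16_t val)
{
	assert(d != 0);

	model_default_write(d, val);

	if (d->vconfig & DEVCTL_FLR) {
		/* Clear first: the bit must always read back as zero. */
		d->vconfig &= (uint16_t)~DEVCTL_FLR;

		/* Counter stands in for pci_try_reset_function(). */
		if (d->supports_flr)
			d->resets++;
	}
}
```

In this model, writing 0x8001 to a device that advertises FLR bumps the
reset counter once, leaves the FLR bit clear in vconfig, and passes only
bit 0 through to the hardware stand-in.  The same write to a device
without FLR support clears the bit and issues no reset.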