Received: by 2002:a05:6a10:6006:0:0:0:0 with SMTP id w6csp664695pxa; Thu, 27 Aug 2020 12:19:36 -0700 (PDT) X-Google-Smtp-Source: ABdhPJy9BwX56ZqUk1ZD04ej7wBLnKCj3QCcElzI9btvb6Lo85fVFWZFC2ygb9YR8RTG/+Ptpmbk X-Received: by 2002:aa7:d284:: with SMTP id w4mr17255759edq.258.1598555976602; Thu, 27 Aug 2020 12:19:36 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1598555976; cv=none; d=google.com; s=arc-20160816; b=U1gEeuYCl6EYmmlhiPPS0cjp4a0nLsUSNL6YdCWtIvHPPNEXcXF0XcH020exhi14pe AiIcSNL0z9WhPeq/+j5twfZEeUE0LV+/EE5CnA8KwUlEjVCUSuZu/564nH7NJNDb+XC2 x4uoZK5q/JERGwxP8R7PfH/ZH68683UXPCilZpyNUMfqrYm0M1jkKPOU1co/Sba6yOVS chJZdMgfy+UYD6vLYq2uR5xWvx2tDWVppAZRB/qjnSbRrO5e5tNYFy7nT7BZogNS+54E U5Ve6QlIMF7cuLqXm8B10VBYcSRw+Shl9e6nmlvJAs3eDTaS6gUJkCB07eGWj61oKkAS hoWA== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:sender:content-transfer-encoding:mime-version :organization:references:in-reply-to:message-id:subject:cc:to:from :date:dkim-signature; bh=IxPAieUngr9DBwwiu/oaORGN+QbkwF+ahdjNmb5dNbE=; b=mPRQDiS+LSzSyB6o6FymM4QKZvLna/TXk4ljFxZs0SyQb2DI8NOzxSxkLWiRKL1mlg 4h9Gg7B9DQG19w5iTZSaGmYJ/NTukjgq6Y/DULeyob3zxF1PGdQ+cv5xz5xyUnvOVtbG kQXw1Sv/YawMS18ULJSR6cpuBNY77+bfATXOWbomz5aRtCW9F6ZpFmu850lV/FzXqg75 +E1bnStElJCw72iZCUlBgAqCHNDwd8xeNymLblyRMAk95Qpk1o5mRQ6TDj+/1hoXK+7Q 2YK/+BtE0wdHCBtoCb18jQVVimKSat5FGCpEmsGQC3RsotlaKFfrhd0xBpSiSatopnBm P23Q== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@redhat.com header.s=mimecast20190719 header.b=JL2pPvej; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=redhat.com Return-Path: Received: from vger.kernel.org (vger.kernel.org. [23.128.96.18]) by mx.google.com with ESMTP id s18si1898280ejf.52.2020.08.27.12.19.13; Thu, 27 Aug 2020 12:19:36 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) client-ip=23.128.96.18; Authentication-Results: mx.google.com; dkim=pass header.i=@redhat.com header.s=mimecast20190719 header.b=JL2pPvej; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=redhat.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1727020AbgH0TR6 (ORCPT + 99 others); Thu, 27 Aug 2020 15:17:58 -0400 Received: from us-smtp-delivery-124.mimecast.com ([216.205.24.124]:29553 "EHLO us-smtp-delivery-124.mimecast.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726246AbgH0TR5 (ORCPT ); Thu, 27 Aug 2020 15:17:57 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1598555875; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=IxPAieUngr9DBwwiu/oaORGN+QbkwF+ahdjNmb5dNbE=; b=JL2pPvejL/9AvDfRq9K3p8AQDNvPEmf/EBoWs0mfLdg7J+DB1+PHU0FUN8HjkNVJS7+gRl jh8GPWqXJ654LTSJlicRHl3c0hvWE6wbu2K3pe87BF2QMSn4+aI2m9EuvlxojPcdLFQU9S 0CvJAqm2gnOFMp2czhVQ7/1upaL55tU= Received: from mimecast-mx01.redhat.com (mimecast-mx01.redhat.com [209.132.183.4]) (Using TLS) by relay.mimecast.com with ESMTP id us-mta-359-YRCxWWLPN621bX7wXhJMjg-1; Thu, 27 Aug 2020 15:17:53 -0400 X-MC-Unique: YRCxWWLPN621bX7wXhJMjg-1 Received: from smtp.corp.redhat.com (int-mx01.intmail.prod.int.phx2.redhat.com [10.5.11.11]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx01.redhat.com (Postfix) with ESMTPS id 0C53364086; Thu, 27 Aug 2020 19:17:51 +0000 (UTC) Received: from x1.home (ovpn-112-71.phx2.redhat.com [10.3.112.71]) by smtp.corp.redhat.com (Postfix) with ESMTP id 3C77B702FF; Thu, 27 Aug 2020 19:17:49 +0000 (UTC) Date: Thu, 27 Aug 2020 13:17:48 -0600 From: Alex Williamson To: Bjorn Helgaas Cc: Matthew Rosato , bhelgaas@google.com, schnelle@linux.ibm.com, pmorel@linux.ibm.com, mpe@ellerman.id.au, oohall@gmail.com, linux-s390@vger.kernel.org, linux-kernel@vger.kernel.org, kvm@vger.kernel.org, linux-pci@vger.kernel.org Subject: Re: [PATCH v3] PCI: Introduce flag for detached virtual functions Message-ID: <20200827131748.46b3f8bc@x1.home> In-Reply-To: <20200827183138.GA1929779@bjorn-Precision-5520> References: <1597333243-29483-2-git-send-email-mjrosato@linux.ibm.com> <20200827183138.GA1929779@bjorn-Precision-5520> Organization: Red Hat MIME-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit X-Scanned-By: MIMEDefang 2.79 on 10.5.11.11 Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Thu, 27 Aug 2020 13:31:38 -0500 Bjorn Helgaas wrote: > Re the subject line, this patch does a lot more than just "introduce a > flag"; AFAICT it actually enables important VFIO functionality, e.g., > something like: > > vfio/pci: Enable MMIO access for s390 detached VFs > > On Thu, Aug 13, 2020 at 11:40:43AM -0400, Matthew Rosato wrote: > > s390x has the notion of providing VFs to the kernel in a manner > > where the associated PF is inaccessible other than via firmware. > > These are not treated as typical VFs and access to them is emulated > > by underlying firmware which can still access the PF. After > > the referened commit however these detached VFs were no longer able > > to work with vfio-pci as the firmware does not provide emulation of > > the PCI_COMMAND_MEMORY bit. In this case, let's explicitly recognize > > these detached VFs so that vfio-pci can allow memory access to > > them again. > > Out of curiosity, in what sense is the PF inaccessible? Is it > *impossible* for Linux to access the PF, or is it just not enumerated > by clp_list_pci() so Linux doesn't know about it? > > VFs do not implement PCI_COMMAND, so I guess "firmware does not > provide emulation of PCI_COMMAND_MEMORY" means something like "we > can't access the PF so we can't enable/disable PCI_COMMAND_MEMORY"? > > s/referened/referenced/ > > > Fixes: abafbc551fdd ("vfio-pci: Invalidate mmaps and block MMIO access on disabled memory") > > Signed-off-by: Matthew Rosato > > --- > > arch/s390/pci/pci_bus.c | 13 +++++++++++++ > > drivers/vfio/pci/vfio_pci_config.c | 8 ++++---- > > include/linux/pci.h | 4 ++++ > > 3 files changed, 21 insertions(+), 4 deletions(-) > > > > diff --git a/arch/s390/pci/pci_bus.c b/arch/s390/pci/pci_bus.c > > index 642a993..1b33076 100644 > > --- a/arch/s390/pci/pci_bus.c > > +++ b/arch/s390/pci/pci_bus.c > > @@ -184,6 +184,19 @@ static inline int zpci_bus_setup_virtfn(struct zpci_bus *zbus, > > } > > #endif > > > > +void pcibios_bus_add_device(struct pci_dev *pdev) > > +{ > > + struct zpci_dev *zdev = to_zpci(pdev); > > + > > + /* > > + * If we have a VF on a non-multifunction bus, it must be a VF that is > > + * detached from its parent PF. We rely on firmware emulation to > > + * provide underlying PF details. > > What exactly does "multifunction bus" mean? I'm familiar with > multi-function *devices*, but not multi-function buses. > > > + */ > > + if (zdev->vfn && !zdev->zbus->multifunction) > > + pdev->detached_vf = 1; > > +} > > + > > static int zpci_bus_add_device(struct zpci_bus *zbus, struct zpci_dev *zdev) > > { > > struct pci_bus *bus; > > diff --git a/drivers/vfio/pci/vfio_pci_config.c b/drivers/vfio/pci/vfio_pci_config.c > > index d98843f..98f93d1 100644 > > --- a/drivers/vfio/pci/vfio_pci_config.c > > +++ b/drivers/vfio/pci/vfio_pci_config.c > > @@ -406,7 +406,7 @@ bool __vfio_pci_memory_enabled(struct vfio_pci_device *vdev) > > * PF SR-IOV capability, there's therefore no need to trigger > > * faults based on the virtual value. > > */ > > - return pdev->is_virtfn || (cmd & PCI_COMMAND_MEMORY); > > + return dev_is_vf(&pdev->dev) || (cmd & PCI_COMMAND_MEMORY); > > I'm not super keen on the idea of having two subtly different ways of > identifying VFs. I think that will be confusing. This seems to be > the critical line, so whatever we do here, it will be out of the > ordinary and probably deserves a little comment. > > If Linux doesn't see the PF, does pci_physfn(VF) return NULL, i.e., is > VF->physfn NULL? FWIW, pci_physfn() never returns NULL, it returns the provided pdev if is_virtfn is not set. This proposal wouldn't change that return value. AIUI pci_physfn(), the caller needs to test that the returned device is different from the provided device if there's really code that wants to traverse to the PF. My interpretation of what's happening here is that we're a guest running on a bare metal hypervisor (I assume z/VM) and we're assigned a VF that appears on this non-multifunction bus, but the hypervisor doesn't provide emulation of all of the non-implemented config space features of a VF, the memory enable bit being relevant for this fix. We're therefore trying to detect this VF nature of the device, which gets a bit messy since a VF implies a PF on bare metal. The PF would be owned by the hypervisor and not accessible to us. An alternative idea we tossed around, that might still be a possibility, is using dev_flags to describe the specific missing feature, for example something about the command register memory bit being hardwired to zero but always enabled (assuming the PF SR-IOV MSE bit is not cleared). Thanks, Alex