From: Alexander Duyck
Date: Fri, 20 Apr 2018 07:56:14 -0700
Subject: Re: [virtio-dev] [pci PATCH v7 2/5] virtio_pci: Add support for unmanaged SR-IOV on virtio_pci devices
To: "Michael S. Tsirkin"
Cc: "Daly, Dan", Bjorn Helgaas, "Duyck, Alexander H",
 linux-pci@vger.kernel.org, virtio-dev@lists.oasis-open.org,
 kvm@vger.kernel.org, Netdev, LKML, linux-nvme@lists.infradead.org,
 Keith Busch, netanel@amazon.com, Don Dutile, Maximilian Heyne,
 "Wang, Liang-min", "Rustad, Mark D", David Woodhouse,
 Christoph Hellwig, dwmw@amazon.co.uk
In-Reply-To: <20180420030640-mutt-send-email-mst@kernel.org>
References: <20180315183449.3102.64791.stgit@localhost.localdomain>
 <20180315184132.3102.90947.stgit@localhost.localdomain>
 <20180316183042-mutt-send-email-mst@kernel.org>
 <20180403161151-mutt-send-email-mst@kernel.org>
 <20180403212503-mutt-send-email-mst@kernel.org>
 <20180420030640-mutt-send-email-mst@kernel.org>
X-Mailing-List: linux-kernel@vger.kernel.org

On Thu, Apr 19, 2018 at 5:40 PM, Michael S. Tsirkin wrote:
> On Tue, Apr 03, 2018 at 12:06:03PM -0700, Alexander Duyck wrote:
>> On Tue, Apr 3, 2018 at 11:27 AM, Michael S. Tsirkin wrote:
>> > On Tue, Apr 03, 2018 at 10:32:00AM -0700, Alexander Duyck wrote:
>> >> On Tue, Apr 3, 2018 at 6:12 AM, Michael S. Tsirkin wrote:
>> >> > On Fri, Mar 16, 2018 at 09:40:34AM -0700, Alexander Duyck wrote:
>> >> >> On Fri, Mar 16, 2018 at 9:34 AM, Michael S. Tsirkin wrote:
>> >> >> > On Thu, Mar 15, 2018 at 11:42:41AM -0700, Alexander Duyck wrote:
>> >> >> >> From: Alexander Duyck
>> >> >> >>
>> >> >> >> Hardware-realized virtio_pci devices can implement SR-IOV, so this
>> >> >> >> patch enables its use. The device in question is an upcoming Intel
>> >> >> >> NIC that implements both a virtio_net PF and virtio_net VFs. These
>> >> >> >> are hardware realizations of what has up to now been a software
>> >> >> >> interface.
>> >> >> >>
>> >> >> >> The device in question has the following 4-part PCI IDs:
>> >> >> >>
>> >> >> >> PF: vendor: 1af4 device: 1041 subvendor: 8086 subdevice: 15fe
>> >> >> >> VF: vendor: 1af4 device: 1041 subvendor: 8086 subdevice: 05fe
>> >> >> >>
>> >> >> >> The patch currently needs no check for device ID, because the callback
>> >> >> >> will never be made for devices that do not assert the capability or
>> >> >> >> when run on a platform incapable of SR-IOV.
>> >> >> >>
>> >> >> >> One reason for this patch is because the hardware requires the
>> >> >> >> vendor ID of a VF to be the same as the vendor ID of the PF that
>> >> >> >> created it. So it seemed logical to simply have a fully-functioning
>> >> >> >> virtio_net PF create the VFs. This patch makes that possible.
>> >> >> >>
>> >> >> >> Reviewed-by: Christoph Hellwig
>> >> >> >> Signed-off-by: Mark Rustad
>> >> >> >> Signed-off-by: Alexander Duyck
>> >> >> >
>> >> >> > So if and when virtio PFs can manage the VFs, then we can
>> >> >> > add a feature bit for that?
>> >> >> > Seems reasonable.
>> >> >>
>> >> >> Yes. If nothing else you may not even need a feature bit depending on
>> >> >> how things go.
>> >> >
>> >> > OTOH if the interface is changed in an incompatible way,
>> >> > and old Linux will attempt to drive the new device
>> >> > since there is no check.
>> >> >
>> >> > I think we should add a feature bit right away.
>> >>
>> >> I'm not sure why you would need a feature bit. The capability is
>> >> controlled via PCI configuration space. If it is present the device
>> >> has the capability. If it is not then it does not.
>> >>
>> >> Basically if the PCI configuration space is not present then the sysfs
>> >> entries will not be spawned and nothing will attempt to use this
>> >> function.
>> >>
>> >> - Alex
>> >
>> > It's about compatibility with older guests which ignore the
>> > capability.
>> >
>> > The feature is thus helpful so host knows whether guest supports VFs.
>>
>> The thing is if the capability is ignored then the feature isn't used.
>> So for SR-IOV it isn't an uncommon thing for there to be drivers for
>> the PF floating around that do not support SR-IOV. In such cases
>> SR-IOV just isn't used while the hardware could support it.
>
> Right but how come there are VF drivers but PF driver does not
> know about these?

I'm not sure what you mean here. The VF and PF drivers are the same
driver. The only difference is that the PF has the extra SR-IOV
configuration space.

What this code is meant to enable is a form of SR-IOV where the VFs
are essentially pre-allocated resources. So, for example, in our case
the MMIO space is identical for a PF versus any of the VFs. There are
no special controls in place that would allow the PF to manipulate any
of the resources belonging to the VFs.

> And are there PF drivers that intentionally do not enable SR-IOV
> because it's known to be broken in some way?

In the Virtio case, are there any devices right now that support
SR-IOV? For now this is just an add-on bit to a function that is
already emulating Virtio in hardware.

> Case in point I do think virtio want to limit this
> depending on a feature bit on general principles
> (the principle being that all extensions have feature bits).

This part has me kind of scratching my head. In our setup the "PF" is
really nothing more than a "VF" with the SR-IOV configuration space
attached to it. There are already examples of similar designs for NVMe
and the Amazon ENA devices. Giving the "PF" any functionality in MMIO
space that controls SR-IOV kind of defeats the whole point of allowing
this function in the first place. Basically the PF isn't really
controlling things; the kernel is.

> There are security implications here - we previously relied on
> whitelisting after all.

Yes and no. The original patch set had issues as you could have a PF
assigned to user space and the VFs managed by the host. When I changed
things so that the function had to be in a kernel driver, that issue
went away.

> Wouldn't it be safer to be a bit more careful and update the
> actual PF drivers? It's just one line per driver, but it
> can be done with an ack by driver maintainer.
> If/once we find out all drivers do have it, we can then
> change the default.

I have no clue what you are talking about here. This is the more
careful approach. Are you sure you are reviewing v7 of the patches?

My understanding is that no paravirtual interfaces currently expose
SR-IOV. What we are looking at is hardware that will want to emulate
Virtio, specifically virtio_net, in the future, and as a part of that
the PF ends up emulating it as well. What we would need to watch for
going forward is that any device that enables SR-IOV support would
also need to provide a 4-tuple ID, so that if something goes wrong with
it we could disable SR-IOV on the device via a PCI quirk later.

>> I would think in the case of virtio it would be the same kind of
>> thing. Basically if SR-IOV is supported by the host then the
>> capability would be present. If SR-IOV is supported by the guest then
>> it would make use of the capability to spawn VFs. If either the
>> capability isn't present, or the driver doesn't use it then you won't
>> be able to spawn VFs in the guest.
>
>> Maybe I am missing something. Do you support dynamically changing the
>> PCI configuration space for Virtio devices based on the presence of
>> feature bits provided by the guest?
>
> No. The point is that IMHO at least virtio - in absence of feature bit -
> to ignore VFs rather than assume they are safe to drive
> in an unmanaged way.
>
>> Also are you saying this patch set should wait on the feature bit to
>> be added, or are you talking about doing this as some sort of
>> follow-up?
>>
>> - Alex
>
> I think for virtio it should include the feature bit, yes.
> Adding feature bit is very easy - post a patch to the virtio TC mailing
> list, wait about a week to give people time to respond (two weeks if it
> is around holidays and such).

The problem is we are talking about hardware/FPGA, not software.
Adding a feature bit means going back and updating RTL. The software
side of things is easy; re-validating things after a hardware/FPGA
change, not so much.

If this is a hard requirement I may just drop the virtio patch, push
what I have, and leave it to Mark/Dan to deal with the necessary RTL
and code changes needed to support Virtio, as I don't expect the
turnaround to be as easy as just a patch.

Thanks.

- Alex
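
For reference, the gating argued over in this thread can be written down as a small model. This is only a rough Python sketch, not kernel code. The `PciId`, `Function` and `can_spawn_vfs` names are made up for illustration, and `quirk_disabled` is a hypothetical stand-in for a PCI quirk table. It assumes only what was said above: VFs can be spawned when the SR-IOV capability is present in PCI configuration space and the driver makes use of it, and a device could later be excluded by its 4-part ID.

```python
from dataclasses import dataclass
from typing import FrozenSet, NamedTuple


class PciId(NamedTuple):
    """4-part PCI ID: vendor, device, subsystem vendor, subsystem device."""
    vendor: int
    device: int
    subvendor: int
    subdevice: int


# IDs quoted in the patch description for the upcoming Intel NIC.
PF_ID = PciId(0x1AF4, 0x1041, 0x8086, 0x15FE)
VF_ID = PciId(0x1AF4, 0x1041, 0x8086, 0x05FE)


@dataclass(frozen=True)
class Function:
    """Hypothetical model of one PCI function (not a kernel structure)."""
    pci_id: PciId
    has_sriov_capability: bool  # SR-IOV capability present in config space
    driver_uses_sriov: bool     # bound driver makes use of the capability


def can_spawn_vfs(fn: Function,
                  quirk_disabled: FrozenSet[PciId] = frozenset()) -> bool:
    """Return True only if every gate mentioned in the thread is open.

    quirk_disabled is an assumed, illustrative set of 4-part IDs for which
    SR-IOV has been turned off after the fact.
    """
    if fn.pci_id in quirk_disabled:
        return False  # device was excluded by its full 4-part ID
    if not fn.has_sriov_capability:
        return False  # no capability in config space, nothing to use
    # An older driver that ignores the capability simply never spawns VFs.
    return fn.driver_uses_sriov
```

For example, `can_spawn_vfs(Function(PF_ID, True, False))` models an older driver on capable hardware: the capability is there but goes unused. Note also that the PF and VF IDs above share vendor and device and differ only in the subsystem device ID, which is why the model keys the exclusion set on all four parts.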