Subject: Re: [PATCH v3 0/8] vfio/mdev: IOMMU aware mediated device
To: Lu Baolu, Joerg Roedel, David Woodhouse, Alex Williamson, Kirti Wankhede
References: <20181012051632.26064-1-baolu.lu@linux.intel.com>
CC: Jean-Philippe Brucker
From: Xu Zaibo
Message-ID: <5BC1AC09.1060507@huawei.com>
Date: Sat, 13 Oct 2018 16:25:45 +0800
In-Reply-To: <20181012051632.26064-1-baolu.lu@linux.intel.com>
X-Mailing-List: linux-kernel@vger.kernel.org

Hi,

On
2018/10/12 13:16, Lu Baolu wrote:
> Hi,
>
> The Mediated Device is a framework for fine-grained physical device
> sharing across the isolated domains. Currently the mdev framework
> is designed to be independent of the platform IOMMU support. As a
> result, the DMA isolation relies on the mdev parent device in a
> vendor specific way.
>
> There are several cases where a mediated device could be protected
> and isolated by the platform IOMMU. For example, Intel vt-d rev3.0
> [1] introduces a new translation mode called 'scalable mode', which
> enables PASID-granular translations. The vt-d scalable mode is the
> key ingredient for Scalable I/O Virtualization [2] [3] which allows
> sharing a device in minimal possible granularity (ADI - Assignable
> Device Interface).
>
> A mediated device backed by an ADI could be protected and isolated
> by the IOMMU since 1) the parent device supports tagging a unique
> PASID to all DMA traffic out of the mediated device; and 2) the DMA
> translation unit (IOMMU) supports the PASID granular translation.
> We can apply IOMMU protection and isolation to this kind of device
> just as we do with an assignable PCI device.
>
> In order to distinguish the IOMMU-capable mediated devices from those
> which still need to rely on parent devices, this patch set adds two
> new members in struct mdev_device.
>
> * iommu_device
>   - This, if set, indicates that the mediated device could
>     be fully isolated and protected by IOMMU via attaching
>     an iommu domain to this device. If empty, it indicates
>     using vendor defined isolation.
>
> * iommu_domain
>   - This is a place holder for an iommu domain. A domain
>     could be stored here for later use once it has been
>     attached to the iommu_device of this mdev.
>
> The helpers below are added to set and get the above iommu device
> and iommu domain pointers in the mdev core implementation.
>
> * mdev_set/get_iommu_device(dev, iommu_device)
>   - Set or get the iommu device which represents this mdev
>     in IOMMU's device scope. Drivers don't need to set the
>     iommu device if they use vendor defined isolation.
>
> * mdev_set/get_iommu_domain(domain)
>   - An iommu domain which has been attached to the iommu
>     device in order to protect and isolate the mediated
>     device will be kept in the mdev data structure and
>     could be retrieved later.
>
> The mdev parent device driver could opt-in that the mdev could be
> fully isolated and protected by the IOMMU when the mdev is being
> created by invoking mdev_set_iommu_device() in its @create().

I don't follow this part: how does my parent device driver get an
iommu_device when it creates a mediated device? And why not reuse the
mdev's own device instead of adding a new device here?

Thanks,
Zaibo

>
> In the vfio_iommu_type1_attach_group(), a domain allocated through
> iommu_domain_alloc() will be attached to the mdev iommu device if
> an iommu device has been set. Otherwise, the dummy external domain
> will be used and all the DMA isolation and protection are routed to
> the parent driver as a result.
>
> On the IOMMU side, a basic requirement is allowing multiple
> domains to be attached to a PCI device if the device advertises the
> capability and the IOMMU hardware supports finer granularity
> translations than the normal PCI Source ID based translation.
>
> As a result, a PCI device could work in two modes: normal mode
> and auxiliary mode. In the normal mode, a PCI device could be
> isolated in the Source ID granularity; the PCI device itself could
> be assigned to a user application by attaching a single domain
> to it. In the auxiliary mode, a PCI device could be isolated in
> finer granularity, hence subsets of the device could be assigned
> to different user level applications by attaching a different domain
> to each subset.
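
A rough userspace sketch of the opt-in and attach paths described above may help frame the question. Everything here is a stand-in: the struct layouts, the attach_group() helper, the enum isolation values and the exact helper signatures are assumptions made for illustration, not the kernel code from the series. The sketch also assumes the parent driver passes its own physical device as the iommu device, which the cover letter does not state.

```c
#include <assert.h>
#include <stddef.h>

/* Userspace stand-ins for kernel types; not the real definitions. */
struct device { const char *name; };
struct iommu_domain { int id; };

/* Models only the two members the cover letter adds to struct mdev_device. */
struct mdev_device {
    struct device dev;
    struct device *iommu_device;       /* NULL means vendor defined isolation */
    struct iommu_domain *iommu_domain; /* domain kept for later retrieval */
};

/* Hypothetical signature, modelled on "mdev_set/get_iommu_device(dev, iommu_device)". */
static int mdev_set_iommu_device(struct mdev_device *mdev,
                                 struct device *iommu_device)
{
    assert(mdev != NULL);
    mdev->iommu_device = iommu_device;
    return 0;
}

static struct device *mdev_get_iommu_device(const struct mdev_device *mdev)
{
    assert(mdev != NULL);
    return mdev->iommu_device;
}

/* Illustrative result type; not part of the series. */
enum isolation { ISOLATION_VENDOR, ISOLATION_IOMMU };

/*
 * Mirrors the decision described for vfio_iommu_type1_attach_group():
 * use the IOMMU domain if an iommu device was set, otherwise fall back
 * to the parent driver's own isolation.
 */
static enum isolation attach_group(struct mdev_device *mdev,
                                   struct iommu_domain *domain)
{
    if (mdev_get_iommu_device(mdev) == NULL)
        return ISOLATION_VENDOR;   /* dummy external domain path */
    mdev->iommu_domain = domain;   /* remember the attached domain */
    return ISOLATION_IOMMU;
}
```

In this model, calling mdev_set_iommu_device() before attach_group() yields ISOLATION_IOMMU and stores the domain; skipping it yields ISOLATION_VENDOR and leaves iommu_domain unset.
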
>
> The device driver is able to switch between the above two modes with
> the interfaces below:
>
> * iommu_get_dev_attr(dev, IOMMU_DEV_ATTR_AUXD_CAPABILITY)
>   - Represents the ability of supporting multiple domains
>     per device.
>
> * iommu_set_dev_attr(dev, IOMMU_DEV_ATTR_AUXD_ENABLE)
>   - Enable the multiple domains capability for the device
>     referenced by @dev.
>
> * iommu_set_dev_attr(dev, IOMMU_DEV_ATTR_AUXD_DISABLE)
>   - Disable the multiple domains capability for the device
>     referenced by @dev.
>
> * iommu_domain_get_attr(domain, DOMAIN_ATTR_AUXD_ID)
>   - Return ID used for finer-granularity DMA translation.
>
> The existing interfaces for attaching/detaching domains remain the
> same as before. The different behaviors between the normal mode
> and the auxiliary mode are handled in the vendor specific iommu
> drivers.
>
> For ease of discussion, we sometimes say 'a domain in
> auxiliary mode' or simply 'an auxiliary domain' when a domain is
> attached to a device for finer granularity translations. But we need
> to keep in mind that this doesn't mean there is a different domain
> type. The same domain could be bound to a device for Source ID based
> translation, and bound to another device for finer granularity
> translation at the same time.
>
> This patch series extends both IOMMU and vfio components to support
> mdev device passing through when it could be isolated and protected
> by the IOMMU units. The first part of this series (PATCH 1/08~5/08)
> adds the interfaces and implementation of the multiple domains per
> device. The second part (PATCH 6/08~8/08) adds the iommu device
> attribute to each mdev, determines isolation type according to the
> existence of an iommu device when attaching group in vfio type1 iommu
> module, and attaches the domain to iommu aware mediated devices.
>
> This patch series depends on a patch set posted here [4] for discussion
> which added scalable mode support in Intel IOMMU driver.
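
The mode-switching interfaces quoted above can be modelled in userspace as a query-then-enable sequence. This is only a sketch under stated assumptions: struct device is a two-flag stand-in, the bool *out parameter of iommu_get_dev_attr() and the -1 error returns are invented for the example, and enable_aux_mode() is a hypothetical helper that is not part of the series.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Attribute names taken from the cover letter; the enum itself is a stand-in. */
enum iommu_dev_attr {
    IOMMU_DEV_ATTR_AUXD_CAPABILITY,
    IOMMU_DEV_ATTR_AUXD_ENABLE,
    IOMMU_DEV_ATTR_AUXD_DISABLE,
};

/* Userspace model of a device: only the two flags this sketch needs. */
struct device {
    bool auxd_capable;  /* hardware and IOMMU support multiple domains */
    bool auxd_enabled;  /* auxiliary mode currently switched on */
};

/* Assumed signature: the out parameter is not shown in the cover letter. */
static int iommu_get_dev_attr(struct device *dev, enum iommu_dev_attr attr,
                              bool *out)
{
    assert(dev != NULL && out != NULL);
    if (attr != IOMMU_DEV_ATTR_AUXD_CAPABILITY)
        return -1;                 /* only the capability can be queried */
    *out = dev->auxd_capable;
    return 0;
}

static int iommu_set_dev_attr(struct device *dev, enum iommu_dev_attr attr)
{
    assert(dev != NULL);
    switch (attr) {
    case IOMMU_DEV_ATTR_AUXD_ENABLE:
        if (!dev->auxd_capable)
            return -1;             /* refuse on devices without the capability */
        dev->auxd_enabled = true;
        return 0;
    case IOMMU_DEV_ATTR_AUXD_DISABLE:
        dev->auxd_enabled = false;
        return 0;
    default:
        return -1;                 /* the capability attribute is read-only */
    }
}

/* Hypothetical driver-side helper: query the capability, then enable it. */
static bool enable_aux_mode(struct device *dev)
{
    bool capable = false;

    if (iommu_get_dev_attr(dev, IOMMU_DEV_ATTR_AUXD_CAPABILITY, &capable) != 0)
        return false;
    if (!capable)
        return false;
    return iommu_set_dev_attr(dev, IOMMU_DEV_ATTR_AUXD_ENABLE) == 0;
}
```

A driver in this model would call enable_aux_mode() once at setup and fall back to normal mode if it returns false.
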
>
> References:
> [1] https://software.intel.com/en-us/download/intel-virtualization-technology-for-directed-io-architecture-specification
> [2] https://software.intel.com/en-us/download/intel-scalable-io-virtualization-technical-specification
> [3] https://schd.ws/hosted_files/lc32018/00/LC3-SIOV-final.pdf
> [4] https://lkml.org/lkml/2018/10/7/54
>
> Best regards,
> Lu Baolu
>
> Change log:
>   v2->v3:
>   - Remove domain type enum and use a pointer on mdev_device instead.
>   - Add a generic interface for getting/setting per device iommu
>     attributes, and use it to query the aux domain capability and
>     to enable and disable aux domains.
>   - Reuse iommu_domain_get_attr() to retrieve the id in an aux domain.
>   - We discussed the impact of the default domain implementation
>     on reusing iommu_at(de)tach_device() interfaces. We agreed
>     that reusing iommu_at(de)tach_device() interfaces is the right
>     direction and we could tweak the code to remove the impact.
>       https://www.spinics.net/lists/kvm/msg175285.html
>   - Removed the RFC tag since no objections were received.
>   - This patch has been submitted separately.
>       https://www.spinics.net/lists/kvm/msg173936.html
>
>   v1->v2:
>   - Rewrite the patches with the concept of auxiliary domains.
>
> Lu Baolu (8):
>   iommu: Add APIs for multiple domains per device
>   iommu/vt-d: Add multiple domains per device query
>   iommu/vt-d: Enable/disable multiple domains per device
>   iommu/vt-d: Attach/detach domains in auxiliary mode
>   iommu/vt-d: Return ID associated with an auxiliary domain
>   vfio/mdev: Add iommu place holders in mdev_device
>   vfio/type1: Add domain at(de)taching group helpers
>   vfio/type1: Handle different mdev isolation type
>
>  drivers/iommu/intel-iommu.c      | 249 ++++++++++++++++++++++++++++++-
>  drivers/iommu/iommu.c            |  25 ++++
>  drivers/vfio/mdev/mdev_core.c    |  36 +++++
>  drivers/vfio/mdev/mdev_private.h |   2 +
>  drivers/vfio/vfio_iommu_type1.c  | 146 ++++++++++++++++--
>  include/linux/intel-iommu.h      |  11 ++
>  include/linux/iommu.h            |  33 ++++
>  include/linux/mdev.h             |  23 +++
>  8 files changed, 509 insertions(+), 16 deletions(-)
>