Subject: Re: [RFC PATCH v2 00/10] vfio/mdev: IOMMU aware mediated device
From: Lu Baolu
To: Jean-Philippe Brucker, Joerg Roedel, David Woodhouse, Alex Williamson, Kirti Wankhede
Cc: baolu.lu@linux.intel.com, kevin.tian@intel.com, ashok.raj@intel.com, tiwei.bie@intel.com, sanjay.k.kumar@intel.com, iommu@lists.linux-foundation.org, linux-kernel@vger.kernel.org, yi.y.sun@intel.com, jacob.jun.pan@intel.com, kvm@vger.kernel.org
Date: Wed, 12 Sep 2018 10:42:52 +0800
References: <20180830040922.30426-1-baolu.lu@linux.intel.com> <380dc154-5d72-0085-2056-fa466789e1ab@arm.com>
In-Reply-To: <380dc154-5d72-0085-2056-fa466789e1ab@arm.com>
X-Mailing-List: linux-kernel@vger.kernel.org

Hi,

On 09/11/2018 12:22 AM, Jean-Philippe Brucker wrote:
> Hi,
>
> On 30/08/2018 05:09, Lu Baolu wrote:
>> Below APIs are introduced in the IOMMU glue for device drivers to use
>> the finer granularity translation.
>>
>> * iommu_capable(IOMMU_CAP_AUX_DOMAIN)
>>   - Represents the ability for supporting multiple domains per device
>>     (a.k.a. finer granularity translations) of the IOMMU hardware.
>
> iommu_capable() cannot represent hardware capabilities, we need
> something else for systems with multiple IOMMUs that have different
> caps. How about iommu_domain_get_attr on the device's domain instead?

A domain is not a good choice for a per-IOMMU capability query: a domain
might be attached to devices that belong to different IOMMUs.

How about an API that takes the device structure as its parameter? A
device always belongs to one specific IOMMU. This API is meant to be used
by the device driver.

>
>> * iommu_en(dis)able_aux_domain(struct device *dev)
>>   - Enable/disable the multiple domains capability for a device
>>     referenced by @dev.
>>
>> * iommu_auxiliary_id(struct iommu_domain *domain)
>>   - Return the index value used for finer-granularity DMA translation.
>>     The specific device driver needs to feed the hardware with this
>>     value, so that hardware device could issue the DMA transaction with
>>     this value tagged.
>
> This could also reuse iommu_domain_get_attr.
>
>
> More generally I'm having trouble understanding how auxiliary domains
> will be used. So VFIO allocates PASIDs like this:

As I wrote in the cover letter, "auxiliary domain" is just a name to ease
discussion.
It actually has no special meaning (we think of a domain as an isolation
boundary that the IOMMU can use to isolate the DMA transactions of a PCI
device, or of a part of it).

So drivers like vfio should see no difference when they use an auxiliary
domain. The auxiliary domain is not visible outside the iommu driver.

>
> * iommu_enable_aux_domain(parent_dev)
> * iommu_domain_alloc() -> dom1
> * iommu_domain_alloc() -> dom2
> * iommu_attach_device(dom1, parent_dev)
>   -> dom1 gets PASID #1
> * iommu_attach_device(dom2, parent_dev)
>   -> dom2 gets PASID #2
>
> Then I'm not sure about the next steps, when userspace does
> VFIO_IOMMU_MAP_DMA or VFIO_IOMMU_BIND on an mdev's container. Is the
> following use accurate?
>
> For the single translation level:
> * iommu_map(dom1, ...) updates first-level/second-level pgtables for
>   PASID #1
> * iommu_map(dom2, ...) updates first-level/second-level pgtables for
>   PASID #2
>
> Nested translation:
> * iommu_map(dom1, ...) updates second-level pgtables for PASID #1
> * iommu_bind_table(dom1, ...) binds first-level pgtables, provided by
>   the guest, for PASID #1
> * iommu_map(dom2, ...) updates second-level pgtables for PASID #2
> * iommu_bind_table(dom2, ...) binds first-level pgtables for PASID #2
>
>
> I'm trying to understand how to implement this with SMMU and other

This is proposed for architectures that support finer-granularity
second-level translation, with no impact on architectures that only
support Source ID or a similar granularity.

> IOMMUs. It's not a clean fit since we have a single domain to hold the
> second-level pgtables.

Do you mind explaining why a domain holds multiple second-level
pgtables? Shouldn't that be multiple domains?

> Then again, the nested case probably doesn't
> matter for us - we might as well assign the parent directly, since all
> mdevs have the same second-level and can only be assigned to the same VM.
>
>
> Also, can non-VFIO device drivers use auxiliary domains to do map/unmap
> on PASIDs?
> They are asking to do that and I'm proposing the private
> PASID thing, but since aux domains provide a similar feature we should
> probably converge somehow.

Yes, any non-VFIO device driver could use an aux domain as well. The use
model is:

	iommu_enable_aux_domain(dev)
		-- enable aux domain support for this device

	iommu_domain_alloc(dev)
		-- allocate an iommu domain

	iommu_attach_device(domain, dev)
		-- attach the domain to the device

	iommu_auxiliary_id(domain)
		-- retrieve the PASID used by this domain

The device driver then calls

	iommu_map(domain, ...)

writes the PASID to the hardware register, and starts DMA.

Best regards,
Lu Baolu
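
P.S. For illustration only, the use model above can be exercised with a
small userspace mock. This is a sketch under stated assumptions, not kernel
code: every iommu_* function below is a stand-in that only borrows the name
proposed in this thread, the struct layouts and the per-device PASID counter
are invented, and example_setup_aux_dma() is a hypothetical driver helper.
It assumes that attaching the domain is the step that assigns the PASID.

```c
/*
 * Userspace mock of the proposed use model. These are NOT the kernel
 * implementations: the iommu_* functions are stand-ins that reuse the
 * names from this thread, and the struct layouts are invented.
 */
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

struct device {
	int aux_enabled;	/* set by iommu_enable_aux_domain() */
	int last_pasid;		/* hypothetical per-device PASID counter */
	int hw_pasid_reg;	/* stands in for the device's PASID register */
};

struct iommu_domain {
	int pasid;		/* 0 until the domain is attached */
	size_t mapped_bytes;	/* total size handed to iommu_map() */
};

static int iommu_enable_aux_domain(struct device *dev)
{
	dev->aux_enabled = 1;
	return 0;
}

/* The mail writes iommu_domain_alloc(dev); the argument is unused here. */
static struct iommu_domain *iommu_domain_alloc(struct device *dev)
{
	(void)dev;
	return calloc(1, sizeof(struct iommu_domain));
}

static void iommu_domain_free(struct iommu_domain *domain)
{
	free(domain);
}

/* Assumption: attach assigns the PASID and fails if aux is not enabled. */
static int iommu_attach_device(struct iommu_domain *domain, struct device *dev)
{
	if (!dev->aux_enabled)
		return -1;
	domain->pasid = ++dev->last_pasid;
	return 0;
}

static int iommu_auxiliary_id(struct iommu_domain *domain)
{
	return domain->pasid;
}

/* Mock mapping: only records how many bytes were mapped. */
static int iommu_map(struct iommu_domain *domain, unsigned long iova,
		     unsigned long paddr, size_t size, int prot)
{
	(void)iova;
	(void)paddr;
	(void)prot;
	domain->mapped_bytes += size;
	return 0;
}

/* Hypothetical driver path: returns the PASID, or a negative value. */
static int example_setup_aux_dma(struct device *dev, struct iommu_domain **out)
{
	struct iommu_domain *domain;
	int pasid;

	assert(dev && out);

	if (iommu_enable_aux_domain(dev))
		return -1;

	domain = iommu_domain_alloc(dev);
	if (!domain)
		return -1;

	if (iommu_attach_device(domain, dev)) {
		iommu_domain_free(domain);
		return -1;
	}

	pasid = iommu_auxiliary_id(domain);

	/* Map one 4 KiB page; the addresses are arbitrary example values. */
	if (iommu_map(domain, 0x1000, 0x80000000UL, 4096, 0)) {
		iommu_domain_free(domain);
		return -1;
	}

	dev->hw_pasid_reg = pasid;	/* program the device, then start DMA */
	*out = domain;
	return pasid;
}
```

Calling example_setup_aux_dma() twice on the same mock device yields two
distinct domains with different PASIDs, which matches the dom1/dom2 picture
in the quoted text. The caller releases each domain with
iommu_domain_free().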