Date: Wed, 5 May 2021 13:04:46 -0700
From: Jacob Pan
To: Jason Gunthorpe
Cc: "Tian, Kevin", Alex Williamson, "Liu, Yi L", Auger Eric,
    Jean-Philippe Brucker, LKML, Joerg Roedel, Lu Baolu,
    David Woodhouse, iommu@lists.linux-foundation.org,
    cgroups@vger.kernel.org, Tejun Heo, Li Zefan, Johannes Weiner,
    Jonathan Corbet, "Raj, Ashok", "Wu, Hao", "Jiang, Dave",
    jacob.jun.pan@linux.intel.com
Subject: Re: [PATCH V4 05/18] iommu/ioasid: Redefine IOASID set and allocation APIs
Message-ID: <20210505130446.3ee2fccd@jacob-builder>
In-Reply-To: <20210505180023.GJ1370958@nvidia.com>
References: <20210423114944.GF1370958@nvidia.com>
 <20210426123817.GQ1370958@nvidia.com>
 <20210504084148.4f61d0b5@jacob-builder>
 <20210504180050.GB1370958@nvidia.com>
 <20210504151154.02908c63@jacob-builder>
 <20210504231530.GE1370958@nvidia.com>
 <20210505102259.044cafdf@jacob-builder>
 <20210505180023.GJ1370958@nvidia.com>

Hi Jason,

On Wed, 5 May 2021 15:00:23 -0300, Jason Gunthorpe wrote:

> On Wed, May 05, 2021 at 10:22:59AM -0700, Jacob Pan wrote:
>
> > Global and pluggable are for slightly separate reasons.
> > - We need global PASID on VT-d in that we need to support shared
> >   workqueues (SWQ). E.g. one SWQ can be wrapped into two mdevs, then
> >   assigned to two VMs. Each VM uses its private guest PASID to submit
> >   work, but each guest PASID must be translated to a global
> >   (system-wide) host PASID to avoid conflict. Also, since PASID table
> >   storage is per PF, if two mdevs of the same PF are assigned to
> >   different VMs, the PASIDs must be unique.
>
> From a protocol perspective each RID has a unique PASID table, and
> RIDs can have overlapping PASIDs.

True; per RID or per PF, as I was referring to.

> Since your SWQ is connected to a single RID the requirement that
> PASIDs are unique to the RID ensures they are sufficiently unique.

True, but one process can submit work to multiple mdevs from different
RIDs/PFs. One process uses one PASID, and the PASID translation table is
per VM, so the same PASID has to appear in the PASID table of each of
those RIDs. For example:

VM1 has two mdevs: mdev1 and mdev2. mdev1's parent is RID1, and mdev2's
parent is RID2. Guest process A allocates PASID_A and binds it to both
mdev1 and mdev2, so PASID_A must be present in the PASID tables of both
RID1 and RID2. If the allocator were per RID, it would not be possible
to ensure PASID_A is available on both RIDs. Right?

Sorry, I missed this point in my earlier explanation.

> If the IOMMU driver has additional restrictions then it should raise
> the PASID table up higher in the hierarchy than at the RID.

That higher level in the hierarchy is global, right? I am a little
concerned about expanding PASID table sharing from a security
perspective, even though VMs already share a PASID table for mdevs.

> I think what you are trying to explain is that the Intel vIOMMU has a
> single PASID address space shared globally by the vCPU because ENQCMD
> uses the global vGPU translation table.

Yes, the PASID translation table is per VM, i.e. global in terms of the
guest. That, combined with the fact that two mdevs from different RIDs
can be used by the same guest process/PASID, requires a global PASID.

> That is fine, but all this stuff should be inside the Intel vIOMMU
> driver not made into a global resource of the entire iommu subsystem.

The Intel vIOMMU has to use the generic uAPI to allocate PASIDs, so the
generic code needs to offer this option. A minimal sketch of what I mean
is below.
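Illustration only, not the actual ioasid.c code: a single system-wide
space (here just an IDA) behind the generic allocation path guarantees
that whatever value comes back is free on every RID.
pasid_alloc_global()/pasid_free_global() are made-up names for this
example.

#include <linux/gfp.h>
#include <linux/idr.h>
#include <linux/ioasid.h>	/* ioasid_t, INVALID_IOASID */

static DEFINE_IDA(system_pasid_ida);	/* one space shared by all RIDs */

static ioasid_t pasid_alloc_global(ioasid_t min, ioasid_t max)
{
	int id = ida_alloc_range(&system_pasid_ida, min, max, GFP_KERNEL);

	return id < 0 ? INVALID_IOASID : (ioasid_t)id;
}

static void pasid_free_global(ioasid_t pasid)
{
	ida_free(&system_pasid_ida, pasid);
}

With one IDA per RID instead, the RID1 and RID2 allocations would be
independent, so nothing would stop RID2 from having already handed out
PASID_A to someone else.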
I guess you are saying we should also have a per-RID allocation option
in addition to the global one?

> Systems that work this way just cannot have multiple iommu drivers
> competing for PASID.

Sorry, I am not following. There would not be mixed IOMMU drivers on one
platform, so I must have missed your point. Could you explain a little?

> > - The pluggable allocator is to support the option where the guest
> >   PASIDs are allocated by the hypervisor.
>
> And if the hypervisor allocates the PASID then again the specific
> vIOMMU itself is concerned with this and it has nothing to do with
> global behavior of the iommu subsystem.
>
> > For ARM, since the guest owns the per-device PASID table, there is
> > no need to allocate PASIDs from the host or the hypervisor. Without
> > SWQ, there is no need for a global PASID/SSID either. So PASID being
> > global for ARM is for simplicity in the case of host PASID/SSID.
>
> It isn't clear how ARM can support PASID and mdev but that is an
> unrelated issue..

AFAIK, the current SMMU device assignment is per RID, since there is
only one set of stage-2 page tables per RID, not per PASID. This is
equivalent to the older VT-d spec prior to scalable mode. Eric/Jean,
can you help?

> Jason

Thanks,

Jacob
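P.S. To make "pluggable" concrete, here is a rough sketch of what the
custom allocator hook looks like from a vIOMMU driver, using the
existing ioasid_register_allocator() interface.
hypervisor_request_pasid()/hypervisor_release_pasid() are hypothetical
stand-ins for whatever channel (e.g. the VT-d virtual command
registers) carries the request to the host.

#include <linux/ioasid.h>

/* Hypothetical helpers, standing in for the real host channel. */
extern ioasid_t hypervisor_request_pasid(ioasid_t min, ioasid_t max);
extern void hypervisor_release_pasid(ioasid_t pasid);

static ioasid_t hv_pasid_alloc(ioasid_t min, ioasid_t max, void *data)
{
	/* Forward the allocation request to the host. */
	return hypervisor_request_pasid(min, max);
}

static void hv_pasid_free(ioasid_t ioasid, void *data)
{
	hypervisor_release_pasid(ioasid);
}

static struct ioasid_allocator_ops hv_pasid_allocator = {
	.alloc = hv_pasid_alloc,
	.free  = hv_pasid_free,
};

/* Called from the vIOMMU driver's probe/init path. */
static int hv_pasid_allocator_init(void)
{
	return ioasid_register_allocator(&hv_pasid_allocator);
}

Once such an allocator is registered, ioasid_alloc() routes through
these ops instead of the default allocator, so guest PASIDs come from
the hypervisor without the core code having to know about it.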