Date: Wed, 24 Mar 2021 10:02:46 -0700
From: Jacob Pan
To: Jean-Philippe Brucker
Cc: Jason Gunthorpe, LKML, Joerg Roedel, Lu Baolu, David Woodhouse, iommu@lists.linux-foundation.org, cgroups@vger.kernel.org, Tejun Heo, Li Zefan, Johannes Weiner, Jean-Philippe Brucker, Alex Williamson, Eric Auger, Jonathan Corbet, Raj Ashok, "Tian, Kevin", Yi
Liu, Wu Hao, Dave Jiang, jacob.jun.pan@linux.intel.com
Subject: Re: [PATCH V4 05/18] iommu/ioasid: Redefine IOASID set and allocation APIs

Hi Jean-Philippe,

On Mon, 22 Mar 2021 10:24:00 +0100, Jean-Philippe Brucker wrote:

> On Fri, Mar 19, 2021 at 11:22:21AM -0700, Jacob Pan wrote:
> > Hi Jason,
> >
> > On Fri, 19 Mar 2021 10:54:32 -0300, Jason Gunthorpe wrote:
> > > On Fri, Mar 19, 2021 at 02:41:32PM +0100, Jean-Philippe Brucker wrote:
> > > > On Fri, Mar 19, 2021 at 09:46:45AM -0300, Jason Gunthorpe wrote:
> > > > > On Fri, Mar 19, 2021 at 10:58:41AM +0100, Jean-Philippe Brucker wrote:
> > > > > > Although there is no use for it at the moment (only two upstream
> > > > > > users and it looks like amdkfd always uses current too), I quite
> > > > > > like the client-server model where the privileged process does
> > > > > > bind() and programs the hardware queue on behalf of the client
> > > > > > process.
> > > > >
> > > > > This creates a lot of complexity: how does process A get a secure
> > > > > reference to B? How does it access the memory in B to set up the
> > > > > HW?
> > > >
> > > > mm_access() for example, and passing addresses via IPC
> > >
> > > I'd rather the source process establish its own PASID and then pass
> > > the rights to use it to some other process via FD passing than try to
> > > go the other way. There are lots of security questions with something
> > > like mm_access.
> > >
> >
> > Thank you all for the input, it sounds like we are OK to remove mm
> > argument from iommu_sva_bind_device() and iommu_sva_alloc_pasid() for
> > now?
>
> Fine by me. By the way the IDXD currently misuses the bind API for
> supervisor PASID, and the drvdata parameter isn't otherwise used. This
> would be a good occasion to clean both. The new bind prototype could be:
>
> struct iommu_sva *iommu_sva_bind_device(struct device *dev, int flags)
>
Yes, we really just hijacked drvdata as flags; it would be cleaner to use
flags explicitly.

> And a flag IOMMU_SVA_BIND_SUPERVISOR (not that I plan to implement it in
> the SMMU, but I think we need to clean the current usage)
>
You mean move #define SVM_FLAG_SUPERVISOR_MODE out of Intel code to be a
generic flag in iommu-sva-lib.h called IOMMU_SVA_BIND_SUPERVISOR?

I agree if that is the proposal.

> >
> > Let me try to summarize PASID allocation as below:
> >
> > Interfaces   | Usage         | Limit  | bind¹ | User visible
> > --------------------------------------------------------------------
> > /dev/ioasid² | G-SVA/IOVA    | cgroup | No    | Yes
> > --------------------------------------------------------------------
> > char dev³    | SVA           | cgroup | Yes   | No
> > --------------------------------------------------------------------
> > iommu driver | default PASID | no     | No    | No
>
> Is this PASID #0?
>
True for the native case, but not limited to PASID #0 for the guest case.
E.g. for mdev assignment with guest IOVA, the guest PASID would be #0, but
the host aux domain default PASID can be non-zero. Here I meant to include
both cases.

> > --------------------------------------------------------------------
> > kernel       | super SVA     | no     | yes   | No
> > --------------------------------------------------------------------
>
> Also wondering about device driver allocating auxiliary domains for their
> private use, to do iommu_map/unmap on private PASIDs (a clean replacement
> for super SVA, for example). Would that go through the same path as
> /dev/ioasid and use the cgroup of current task?
>
For the in-kernel private use, I don't think we should restrict based on
cgroup, since there is no affinity to user processes. I also think the
PASID allocation should just use the kernel API instead of /dev/ioasid. Why
would user space need to know the actual PASID # for device private
domains? Maybe I missed your idea?

> Thanks,
> Jean
>
> >
> > ¹ Allocated during SVA bind
> > ² PASIDs allocated via /dev/ioasid are not bound to any mm. But their
> >   ownership is assigned to the process that does the allocation.
> > ³ Includes uacce and other private device driver char devs such as idxd
> >
> > Currently, the proposed /dev/ioasid interface does not map an individual
> > PASID to an FD. The FD is at the ioasid_set granularity and bound to
> > the current mm. We could extend the IOCTLs to cover the individual
> > PASID-FD passing case when use cases arise. Would this work?
> >
> > Thanks,
> >
> > Jacob

Thanks,

Jacob
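
P.S. To make the flags proposal above concrete, here is a minimal user-space
sketch of how the new bind prototype could behave. It is only an
illustration, not kernel code: struct device and struct iommu_sva are
stand-in definitions, the flag value (1 << 0) is an assumption, and the
body of the function is a mock. Only the prototype and the name
IOMMU_SVA_BIND_SUPERVISOR come from the discussion.

```c
#include <stdlib.h>

/* Assumed value; the thread only proposes the flag name. */
#define IOMMU_SVA_BIND_SUPERVISOR (1 << 0)

/* Stand-ins for the kernel structures, reduced to what the sketch needs. */
struct device {
	const char *name;
};

struct iommu_sva {
	struct device *dev;
	int supervisor;	/* non-zero when bound with a supervisor PASID */
};

/*
 * Mock of the proposed prototype: no mm argument and no drvdata, the
 * bind mode is selected by explicit flags. Unknown flag bits are
 * rejected. The real kernel function would return an ERR_PTR() value
 * on failure; this mock returns NULL instead.
 */
struct iommu_sva *iommu_sva_bind_device(struct device *dev, int flags)
{
	struct iommu_sva *handle;

	if (!dev || (flags & ~IOMMU_SVA_BIND_SUPERVISOR))
		return NULL;

	handle = calloc(1, sizeof(*handle));
	if (!handle)
		return NULL;

	handle->dev = dev;
	handle->supervisor = !!(flags & IOMMU_SVA_BIND_SUPERVISOR);
	return handle;
}

/* Mock unbind: releases the handle allocated by the mock bind. */
void iommu_sva_unbind_device(struct iommu_sva *handle)
{
	free(handle);
}
```

With this shape, a user SVA caller would pass flags = 0 and a driver such as
idxd would pass IOMMU_SVA_BIND_SUPERVISOR for its supervisor PASID, instead
of overloading drvdata.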