Date: Tue, 11 May 2021 09:14:52 -0700
From: Jacob Pan
To: Jason Gunthorpe
Cc: LKML, iommu@lists.linux-foundation.org, Joerg Roedel, Lu Baolu,
 Jean-Philippe Brucker, Christoph Hellwig, Yi Liu, Raj Ashok,
 "Tian, Kevin", Dave Jiang, wangzhou1@hisilicon.com,
 zhangfei.gao@linaro.org, vkoul@kernel.org,
 jacob.jun.pan@linux.intel.com, David Woodhouse
Subject: Re: [PATCH v4 1/2] iommu/sva: Tighten SVA bind API with explicit flags
Message-ID: <20210511091452.721e9a03@jacob-builder>
In-Reply-To: <20210511114848.GK1002214@nvidia.com>
References: <1620653108-44901-1-git-send-email-jacob.jun.pan@linux.intel.com>
 <1620653108-44901-2-git-send-email-jacob.jun.pan@linux.intel.com>
 <20210510233749.GG1002214@nvidia.com>
 <20210510203145.086835cc@jacob-builder>
 <20210511114848.GK1002214@nvidia.com>
Organization: OTC
X-Mailing-List: linux-kernel@vger.kernel.org

Hi Jason,

On Tue, 11 May 2021 08:48:48 -0300, Jason Gunthorpe wrote:

> On Mon, May 10, 2021 at 08:31:45PM -0700, Jacob Pan wrote:
> > Hi Jason,
> >
> > On Mon, 10 May 2021 20:37:49 -0300, Jason Gunthorpe wrote:
> > > On Mon, May 10, 2021 at 06:25:07AM -0700, Jacob Pan wrote:
> > >
> > > > +/*
> > > > + * The IOMMU_SVA_BIND_SUPERVISOR flag requests a PASID which can be used only
> > > > + * for access to kernel addresses. No IOTLB flushes are automatically done
> > > > + * for kernel mappings; it is valid only for access to the kernel's static
> > > > + * 1:1 mapping of physical memory — not to vmalloc or even module mappings.
> > > > + * A future API addition may permit the use of such ranges, by means of an
> > > > + * explicit IOTLB flush call (akin to the DMA API's unmap method).
> > > > + *
> > > > + * It is unlikely that we will ever hook into flush_tlb_kernel_range() to
> > > > + * do such IOTLB flushes automatically.
> > > > + */
> > > > +#define IOMMU_SVA_BIND_SUPERVISOR BIT(0)
> > >
> > > Huh? That isn't really SVA, can you call it something saner please?
> > >
> > This is shared kernel virtual address, I am following the SVA lib naming
> > since this is where the flag will be used. Why this is not SVA?
> > Kernel virtual address is still virtual address. Is it due to direct map?
>
> As the above explains it doesn't actually synchronize the kernel's
> address space it just shoves the direct map into the IOMMU.
>
There is no duplicated kernel direct map in IOMMU.

> I suppose a different IOMMU implementation might point the PASID directly
> at the kernel's page table and avoid those limitations - but since
> that isn't portable it seems irrelevant.
>
This is what we are doing here. We allocate a supervisor PASID and put the
kernel page table (init_mm pgd) in this PASID entry.

> Since the only thing it really maps is the direct map I would just
> call it direct_map, or all physical or something.
>
Good idea. It makes things clear to the callers. They must only use direct
map memory for DMA.

> How does this interact with the DMA APIs?
DMA API would use RID2PASID (PASID 0), so it is separated by PASIDs.

> How do you get CPU cache
> flushing/etc into PASID operations that don't trigger IOMMU updates?
>
Sorry, I am not following. This is used for direct map only.

> Honestly, I'm not convinced we should have "kernel SVA" at all.. Why
> does IDXD use normal DMA on the RID for kernel controlled accesses?
>
Using SVA simplifies the work submission, there is no need to do map/unmap.
Just bind PASID with init_mm, then submit work directly either with ENQCMDS
(supervisor version of ENQCMD) to a shared workqueue or put the supervisor
PASID in the descriptor for dedicated workqueue.

> > > Is it really a PASID that always has all of physical memory mapped
> > > into it? Sounds dangerous. What is it for?
> >
> > Yes. It is to bind DMA request w/ PASID with init_mm/init_top_pgt. Per
> > PCIe spec PASID TLP prefix has "Privileged Mode Requested" bit. VT-d
> > supports this with "Privileged-mode-Requested (PR) flag (to distinguish
> > user versus supervisor access)". Each PASID entry has a SRE (Supervisor
> > Request Enable) bit.
>
> The PR flag is only needed if the underlying IOMMU is directly
> processing the CPU page tables. For cases where the IOMMU is using its
> own page table format and has its own copies the PR flag shouldn't be
> used.
>
We are doing the former case. There are no IOMMU page tables for the direct
map.

> Jason

Thanks,

Jacob
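
The direct-map-only rule discussed above can be illustrated with a rough
userspace sketch. This is not kernel code and not part of the patch. Only
the IOMMU_SVA_BIND_SUPERVISOR name and its BIT(0) value come from the quoted
hunk; struct direct_map, SVA_BIND_KNOWN_FLAGS and the three helper functions
are hypothetical names made up for illustration, and the direct map is
assumed to be a single contiguous range.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Flag name and value as in the quoted patch; all else is hypothetical. */
#define BIT(n)                    (1UL << (n))
#define IOMMU_SVA_BIND_SUPERVISOR BIT(0)
#define SVA_BIND_KNOWN_FLAGS      IOMMU_SVA_BIND_SUPERVISOR /* hypothetical mask */

/* Hypothetical model of the kernel's 1:1 mapping as one contiguous range. */
struct direct_map {
	uintptr_t base;
	size_t size;
};

/* Reject any flag bit this sketch does not know about. */
static bool sva_bind_flags_valid(unsigned long flags)
{
	return (flags & ~SVA_BIND_KNOWN_FLAGS) == 0;
}

/*
 * True only if [addr, addr + len) lies entirely inside the direct map.
 * The comparison is arranged so that addr + len is never computed and
 * therefore cannot overflow.
 */
static bool buffer_in_direct_map(const struct direct_map *dm,
				 uintptr_t addr, size_t len)
{
	size_t off;

	if (len == 0 || addr < dm->base)
		return false;
	off = addr - dm->base;
	return off < dm->size && len <= dm->size - off;
}

/*
 * A buffer is acceptable for DMA through the supervisor PASID only if the
 * flags are valid, the supervisor bit is set, and the whole buffer sits in
 * the direct map (so no vmalloc or module addresses).
 */
static bool supervisor_dma_ok(const struct direct_map *dm, unsigned long flags,
			      uintptr_t addr, size_t len)
{
	return sva_bind_flags_valid(flags) &&
	       (flags & IOMMU_SVA_BIND_SUPERVISOR) != 0 &&
	       buffer_in_direct_map(dm, addr, len);
}
```

With a hypothetical map of base 0x1000 and size 0x1000, a 0x100-byte buffer
at 0x1800 passes supervisor_dma_ok() when the supervisor flag is set, while
a buffer starting below the base or running past the end is rejected. The
sketch only checks ranges; it says nothing about IOTLB flushing, which the
quoted comment leaves to a possible future API.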