Subject: Re: [PATCH v3 09/16] iommu: Introduce guest PASID bind function
From: Jean-Philippe Brucker
To: Jacob Pan
Cc: "iommu@lists.linux-foundation.org", LKML, Joerg Roedel,
 David Woodhouse, Eric Auger, Alex Williamson, "Tian, Kevin",
 Raj Ashok, Andriy Shevchenko
Date: Wed, 22 May 2019 16:05:53 +0100
Message-ID: <37d1eee7-92c1-7e07-b73d-7af82fdb1652@arm.com>
In-Reply-To: <20190521155029.0ab0a462@jacob-builder>
References: <1556922737-76313-1-git-send-email-jacob.jun.pan@linux.intel.com>
 <1556922737-76313-10-git-send-email-jacob.jun.pan@linux.intel.com>
 <20190516091429.6d06f7e1@jacob-builder>
 <20190520122241.0db13f14@jacob-builder>
 <7bf71437-d75b-c4f7-d705-fcd71fc75060@arm.com>
 <20190521155029.0ab0a462@jacob-builder>
X-Mailing-List: linux-kernel@vger.kernel.org

On 21/05/2019 23:50, Jacob Pan wrote:
>>> /**
>>>  * struct gpasid_bind_data - Information about device and guest
>>> PASID binding
>>>  * @version:    Version of this data structure
>>>  * @format:     PASID table entry format
>>>  * @flags:      Additional information on guest bind request
>>>  * @gpgd:       Guest page directory base of the guest mm to bind
>>>  * @hpasid:     Process address space ID used for the guest mm
>>> in host IOMMU
>>>  * @gpasid:     Process address space ID used for the guest mm
>>> in guest IOMMU
>>
>> Trying to understand the full flow:
>> * @gpasid is the one allocated by the guest using a virtual command.
>> The guest writes @gpgd into the virtual PASID table at index @gpasid,
>> then sends an invalidate command to QEMU.
> yes
>> * QEMU issues a gpasid_bind ioctl (on the mdev or its container?).
>> VFIO forwards. The IOMMU driver installs @gpgd into the PASID table
>> using @hpasid, which is associated with the auxiliary domain.
>>
>> But why do we need the @hpasid field here? Does userspace know about
>> it at all, and does VFIO need to pass it to the IOMMU driver?
>>
> We need to support two guest-host PASID mappings through this API. Idea
> comes from Kevin & Yi.
> 1. identity mapping between host and guest PASID
> 2. guest owns its own pasid space
>
> For option 1, which will plan to support first in this series. There is
> no need for gpasid field since gpasid=hpasid. Guest allocates PASID
> using virtual command interface which gets a host PASID. Then PASID
> cache invalidation in the guest will result in bind_gpasid(), @gpasid is
> not valid in the bind data (indicated by the IOMMU_SVA_GPASID_VAL flag).
>
> For option 2, guest still uses virtual command to allocate guest pasid,
> but this time QEMU does the allocation for gpasid, at the same time
> QEMU will allocate a host pasid then maintain a G->H PASID lookup.
> When guest invalidate its PASID cache with GPASID, QEMU will find the
> match host PASID then pass both gpasid and hpasid down to the host IOMMU
> driver.
> Host IOMMU driver will store the gpgd at the hpasid entry but keep
> track of the gpasid->hpasid mapping. Host will never program gpasid in
> the IOMMU HW. Host IOMMU driver provides G->H PASID translation for PF
> device drivers that emulates mdev config space, i.e. virtual device
> composition module
> (https://events.linuxfoundation.org/wp-content/uploads/2017/12/Hardware-Assisted-Mediated-Pass-Through-with-VFIO-Kevin-Tian-Intel.pdf).
>
> These two options is a per VM choice. Hopefully the two diagrams below
> can help to explain. I will put them in the next patch headers.

Thanks for the explanation, makes sense to me now. So the host kernel
needs to know G->H because the guest may write GPASID into the config
space emulated by the host device driver, and the device driver then
retrieves the HPASID via an iommu_ops callback? But the device driver
keeps track of aux domains, so isn't HPASID retrievable with
aux_get_pasid() already?

>
> Option 1. Identity G-H PASID mapping diagram.
>
>   .-------------.  .---------------------------.
>   |   vIOMMU    |  | Guest process mm, FL only |
>   |             |  '---------------------------'
>   .----------------/
>   | PASID Entry |--- PASID cache flush -
>   '-------------'\                        |
>   |             | \                       |
>   |             |  \                      |
>   '-------------'   \________________     |
>                      GPASID = HPASID      |
> Guest                  ^      ^           |
> ------| Shadow |-------| VCMD |-----------|------------
>       v        v       |      |           |
> QEMU                   v      v           |
> ------------------------------------------|------------
> Host        HPASID = ioasid_alloc()       |
>                    |                      v
>                    |           sva_bind_gpasid(HPASID)
>                    |
>   .-------------.  |   .----------------------.
>   |   pIOMMU    |  |   | Bind FL for GVA-GPA  |
>   |             |  |  /'----------------------'
>   .----------------' |
>   | PASID Entry |    V (Nested xlate)
>   '----------------..---------------------.
>   |             |   |Set SL to GPA-HPA    |
>   |             |   '---------------------'
>   '-------------'
>
>
>
> Option 2. Non-identity G-H PASID mapping diagram.
>
>   .-------------.  .---------------------------.
>   |   vIOMMU    |  | Guest process mm, FL only |
>   |             |  '---------------------------'
>   .----------------/
>   | PASID Entry |--- PASID cache flush -
>   '-------------'\                        |      .-------------.
>   |             | \                       |      |Guest driver |
>   |             |  \                      |      |writes GPASID|
>   '-------------'   \________________     |      '-------------'
>                        GPASID             |             |
> Guest                  ^      ^           |             |
> ------| Shadow |-------| VCMD |-----------|------------ |
>       v        v       |      |           |             |
> QEMU                   v      v           |             |
>           GPASID = qemu_gpasid_alloc()    |             |
>           keep G->H PASID lookup          |             |
>                    ^                      v             |
>                    |              lookup G->H PASID     |
> -------------------|----------------------|------------ |
> Host        HPASID = ioasid_alloc()       |             |
>                    |                      v             |
>                    |      sva_bind_gpasid(HPASID,GPASID)|
>                    |      keep H-G PASID lookup         |
>                    |                                     \ .-------------------.
>   .-------------.  |   .----------------------.           \|       VDCM        |
>   |   pIOMMU    |  |   | Bind FL for GVA-GPA  |            | H = lookup(GPASID)|
>   |             |  |  /'----------------------'            | write H to dev    |
>   .----------------' |                                     '-------------------'
>   | PASID Entry |    V (Nested xlate)
>   '----------------..---------------------.
>   |             |   |Set SL to GPA-HPA    |
>   |             |   '---------------------'
>   '-------------'
> There is also implications in G-H pasid lookup for PRQ, that would be
> in the later series.
>
>>>  * @addr_width:    Guest address width. Paging mode can also be
>>> derived.
>>
>> What does the last sentence mean? @addr_width should probably be in
>> @vtd if it provides implicit information.
>>
> Derive 4 or 5 level paging mode from the address width. It can be in
> @vtd but i thought this can be generic.

Yes I think it's generic enough.
It may be worth stating that this is the *virtual* address width, and
removing or clarifying what the paging mode is (the sentence could be
confusing on Arm, as we have different page granules which cannot be
derived from the address width).

Thanks,
Jean
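
For readers following the option 2 flow, here is a minimal C sketch of
the G->H PASID bookkeeping described in the quoted text. It is not the
code from the patch series: the sketch_* names, the fixed-size table and
the flag value are all made up for illustration. The only thing taken
from the thread is the rule that a flag in the style of
IOMMU_SVA_GPASID_VAL marks @gpasid as valid, and that without it the
mapping is the identity (gpasid = hpasid).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical flag value: set when the caller supplies a valid gpasid. */
#define SKETCH_GPASID_VAL     (1u << 0)
#define SKETCH_MAX_BINDS      16
#define SKETCH_INVALID_PASID  UINT32_MAX

/* One guest bind: gpgd is installed at hpasid, gpasid is only tracked. */
struct sketch_bind {
    bool     used;
    uint32_t hpasid;  /* PASID programmed in the host IOMMU */
    uint32_t gpasid;  /* PASID seen by the guest, never programmed in HW */
    uint64_t gpgd;    /* guest page directory base */
};

/* Must be zero-initialised before first use. */
struct sketch_bind_table {
    struct sketch_bind slots[SKETCH_MAX_BINDS];
};

/* Record a bind. Returns 0 on success, -1 if the table is full. */
static int sketch_bind_gpasid(struct sketch_bind_table *t, uint32_t flags,
                              uint32_t hpasid, uint32_t gpasid, uint64_t gpgd)
{
    assert(t != NULL);

    for (size_t i = 0; i < SKETCH_MAX_BINDS; i++) {
        struct sketch_bind *b = &t->slots[i];

        if (b->used)
            continue;
        b->used = true;
        b->hpasid = hpasid;
        /* Option 1 (flag clear): identity mapping, ignore the gpasid argument. */
        b->gpasid = (flags & SKETCH_GPASID_VAL) ? gpasid : hpasid;
        b->gpgd = gpgd;
        return 0;
    }
    return -1;
}

/* G->H lookup, as a device driver emulating config space would need. */
static uint32_t sketch_g2h(const struct sketch_bind_table *t, uint32_t gpasid)
{
    assert(t != NULL);

    for (size_t i = 0; i < SKETCH_MAX_BINDS; i++) {
        if (t->slots[i].used && t->slots[i].gpasid == gpasid)
            return t->slots[i].hpasid;
    }
    return SKETCH_INVALID_PASID;
}
```

With a zero-initialised table, binding hpasid=100, gpasid=5 with the
flag set makes sketch_g2h(5) return 100. Binding hpasid=42 without the
flag stores the identity mapping, so sketch_g2h(42) returns 42. Locking,
unbind and duplicate detection are left out of this sketch.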