Received: by 2002:a05:6a10:8c0a:0:0:0:0 with SMTP id go10csp3306417pxb; Mon, 25 Jan 2021 12:14:24 -0800 (PST) X-Google-Smtp-Source: ABdhPJxRwBQCFuu0RPmefZRx+yKTaI0dmA2wkqxSyQQtJSKaFupQVALgF5SFcAaxQGrVzovhln1F X-Received: by 2002:a17:906:f6d8:: with SMTP id jo24mr1390828ejb.213.1611605664074; Mon, 25 Jan 2021 12:14:24 -0800 (PST) ARC-Seal: i=1; a=rsa-sha256; t=1611605664; cv=none; d=google.com; s=arc-20160816; b=AaI42SbFndOxFz0QPsvZweZJ+n4VHm9xF0Gc3Lsz7ZRKYdSDS1AHvKv7eOQ7Lupweo 8SbwQ83YE9AM2ILrSSQzkTvejL9B2Isv4JmImLjrDEPbsvV0ID02KqtJvx9WyIwI1g+G e3grcrQFJFVP22TR/mK31Ep8ZZ38s4TNLwSiX7iyfvmuR4dbli2XFUl94fJbh98UxTpk wkHKSk8pvzP8BjS5TIYDS9N/OB9osuTi78L9/Zwy9jkKyJXIeMAKot4+vJ+4awBZcDqs lRrh+21cw1h0DTmmrCFtOcdyxYt1dqcYVk4jt23j6QVb8M6jA+rfmC1wZ2UuNURskElk umqg== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:content-language :in-reply-to:mime-version:user-agent:date:message-id:from:references :cc:to:subject; bh=rLAaR8soH5tBfGf21Vy8H1W9zwjMnJ0RYngXVHkh1TQ=; b=hwHOmDWbXBDoWzzcWKSBO0d3VH3zUew+BaA99O64RP1z1VurdzH8ihwC9TvoYJgO7Y FGG7yQ+o0KWN+/5v/WUMntfyOnoQXkO/mWTu8NI2BRQOkG6rX8EUcxy/E84jOzlMSWyN O+rTI53FgE7DMRzGFBr4srD+hcOpPV6YEcst/gcHqD+1xd9c2u8fMuDrhvvLmRSGiZmr P2eAj3v4mqk2jJ4bpDfVS2/rr3yuAKscMWWooG4cPL8V4vvtHdwrv/H5GWgbAliZl4eS u5D0nSHH26pkBUBjRR9wFuiWfrzachbytBlcToyrwoyYzGKBsRpAS0DhQga+zSyM6CaR aq5g== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=arm.com Return-Path: Received: from vger.kernel.org (vger.kernel.org. [23.128.96.18]) by mx.google.com with ESMTP id ho19si6082637ejc.674.2021.01.25.12.13.57; Mon, 25 Jan 2021 12:14:24 -0800 (PST) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) client-ip=23.128.96.18; Authentication-Results: mx.google.com; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=arm.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1732262AbhAYULf (ORCPT + 99 others); Mon, 25 Jan 2021 15:11:35 -0500 Received: from foss.arm.com ([217.140.110.172]:59032 "EHLO foss.arm.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1732246AbhAYTzr (ORCPT ); Mon, 25 Jan 2021 14:55:47 -0500 Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.121.207.14]) by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id 5EF4431B; Mon, 25 Jan 2021 11:55:00 -0800 (PST) Received: from [10.57.46.25] (unknown [10.57.46.25]) by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPSA id C47863F66E; Mon, 25 Jan 2021 11:54:58 -0800 (PST) Subject: Re: [PATCH 1/1] iommu/arm-smmu-v3: add support for BBML To: "Leizhen (ThunderTown)" , Will Deacon Cc: Joerg Roedel , Jean-Philippe Brucker , Jonathan Cameron , linux-arm-kernel , iommu , linux-kernel , Yang Yingliang References: <20201126034230.777-1-thunder.leizhen@huawei.com> <20210122125132.GB24102@willie-the-truck> <34a9c164-389d-30cd-11a3-8796eb7bca93@arm.com> <3985308c-87be-fa12-0760-55e5ab29c47d@huawei.com> From: Robin Murphy Message-ID: <3192f888-206b-6bc4-5fe1-fbc4316f316f@arm.com> Date: Mon, 25 Jan 2021 19:54:52 +0000 User-Agent: Mozilla/5.0 (Windows NT 10.0; rv:78.0) Gecko/20100101 Thunderbird/78.6.1 MIME-Version: 1.0 In-Reply-To: <3985308c-87be-fa12-0760-55e5ab29c47d@huawei.com> Content-Type: text/plain; charset=utf-8; format=flowed Content-Language: en-GB Content-Transfer-Encoding: 8bit Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On 2021-01-23 10:20, Leizhen (ThunderTown) wrote: > > > On 2021/1/22 21:00, Robin Murphy wrote: >> On 2021-01-22 12:51, Will Deacon wrote: >>> On Thu, Nov 26, 2020 at 11:42:30AM +0800, Zhen Lei wrote: >>>> When changing from a set of pages/smaller blocks to a larger block for an >>>> address, the software should follow the sequence of BBML processing. >>>> >>>> When changing from a block to a set of pages/smaller blocks for an >>>> address, there's no need to use nT bit. If an address in the large block >>>> is accessed before page table switching, the TLB caches the large block >>>> mapping. After the page table is switched and before TLB invalidation >>>> finished, new access requests are still based on large block mapping. >>>> After the block or page is invalidated, the system reads the small block >>>> or page mapping from the memory; If the address in the large block is not >>>> accessed before page table switching, the TLB has no cache. After the >>>> page table is switched, a new access is initiated to read the small block >>>> or page mapping from the memory. >>>> >>>> Signed-off-by: Zhen Lei >>>> --- >>>>   drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c |  2 + >>>>   drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h |  2 + >>>>   drivers/iommu/io-pgtable-arm.c              | 46 ++++++++++++++++----- >>>>   include/linux/io-pgtable.h                  |  1 + >>>>   4 files changed, 40 insertions(+), 11 deletions(-) >>>> >>>> diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c >>>> index e634bbe60573..14a1a11565fb 100644 >>>> --- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c >>>> +++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c >>>> @@ -1977,6 +1977,7 @@ static int arm_smmu_domain_finalise(struct iommu_domain *domain, >>>>           .coherent_walk    = smmu->features & ARM_SMMU_FEAT_COHERENCY, >>>>           .tlb        = &arm_smmu_flush_ops, >>>>           .iommu_dev    = smmu->dev, >>>> +        .bbml        = smmu->bbml, >>>>       }; >>>>         if (smmu_domain->non_strict) >>>> @@ -3291,6 +3292,7 @@ static int arm_smmu_device_hw_probe(struct arm_smmu_device *smmu) >>>>         /* IDR3 */ >>>>       reg = readl_relaxed(smmu->base + ARM_SMMU_IDR3); >>>> +    smmu->bbml = FIELD_GET(IDR3_BBML, reg); >>>>       if (FIELD_GET(IDR3_RIL, reg)) >>>>           smmu->features |= ARM_SMMU_FEAT_RANGE_INV; >>>>   diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h >>>> index d4b7f40ccb02..aa7eb460fa09 100644 >>>> --- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h >>>> +++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h >>>> @@ -51,6 +51,7 @@ >>>>   #define IDR1_SIDSIZE            GENMASK(5, 0) >>>>     #define ARM_SMMU_IDR3            0xc >>>> +#define IDR3_BBML            GENMASK(12, 11) >>>>   #define IDR3_RIL            (1 << 10) >>>>     #define ARM_SMMU_IDR5            0x14 >>>> @@ -617,6 +618,7 @@ struct arm_smmu_device { >>>>         int                gerr_irq; >>>>       int                combined_irq; >>>> +    int                bbml; >>>>         unsigned long            ias; /* IPA */ >>>>       unsigned long            oas; /* PA */ >>>> diff --git a/drivers/iommu/io-pgtable-arm.c b/drivers/iommu/io-pgtable-arm.c >>>> index a7a9bc08dcd1..341581337ad0 100644 >>>> --- a/drivers/iommu/io-pgtable-arm.c >>>> +++ b/drivers/iommu/io-pgtable-arm.c >>>> @@ -72,6 +72,7 @@ >>>>     #define ARM_LPAE_PTE_NSTABLE        (((arm_lpae_iopte)1) << 63) >>>>   #define ARM_LPAE_PTE_XN            (((arm_lpae_iopte)3) << 53) >>>> +#define ARM_LPAE_PTE_nT            (((arm_lpae_iopte)1) << 16) >>>>   #define ARM_LPAE_PTE_AF            (((arm_lpae_iopte)1) << 10) >>>>   #define ARM_LPAE_PTE_SH_NS        (((arm_lpae_iopte)0) << 8) >>>>   #define ARM_LPAE_PTE_SH_OS        (((arm_lpae_iopte)2) << 8) >>>> @@ -255,7 +256,7 @@ static size_t __arm_lpae_unmap(struct arm_lpae_io_pgtable *data, >>>>     static void __arm_lpae_init_pte(struct arm_lpae_io_pgtable *data, >>>>                   phys_addr_t paddr, arm_lpae_iopte prot, >>>> -                int lvl, arm_lpae_iopte *ptep) >>>> +                int lvl, arm_lpae_iopte *ptep, arm_lpae_iopte nT) >>>>   { >>>>       arm_lpae_iopte pte = prot; >>>>   @@ -265,37 +266,60 @@ static void __arm_lpae_init_pte(struct arm_lpae_io_pgtable *data, >>>>           pte |= ARM_LPAE_PTE_TYPE_BLOCK; >>>>         pte |= paddr_to_iopte(paddr, data); >>>> +    pte |= nT; >>>>         __arm_lpae_set_pte(ptep, pte, &data->iop.cfg); >>>>   } >>>>   +static void __arm_lpae_free_pgtable(struct arm_lpae_io_pgtable *data, int lvl, >>>> +                    arm_lpae_iopte *ptep); >>>>   static int arm_lpae_init_pte(struct arm_lpae_io_pgtable *data, >>>>                    unsigned long iova, phys_addr_t paddr, >>>>                    arm_lpae_iopte prot, int lvl, >>>>                    arm_lpae_iopte *ptep) >>>>   { >>>>       arm_lpae_iopte pte = *ptep; >>>> +    struct io_pgtable_cfg *cfg = &data->iop.cfg; >>>>         if (iopte_leaf(pte, lvl, data->iop.fmt)) { >>>>           /* We require an unmap first */ >>>>           WARN_ON(!selftest_running); >>>>           return -EEXIST; >>>>       } else if (iopte_type(pte, lvl) == ARM_LPAE_PTE_TYPE_TABLE) { >>>> -        /* >>>> -         * We need to unmap and free the old table before >>>> -         * overwriting it with a block entry. >>>> -         */ >>>>           arm_lpae_iopte *tblp; >>>> +        struct io_pgtable *iop = &data->iop; >>>>           size_t sz = ARM_LPAE_BLOCK_SIZE(lvl, data); >>>>   -        tblp = ptep - ARM_LPAE_LVL_IDX(iova, lvl, data); >>>> -        if (__arm_lpae_unmap(data, NULL, iova, sz, lvl, tblp) != sz) { >>>> -            WARN_ON(1); >>>> -            return -EINVAL; >>>> +        switch (cfg->bbml) { >>>> +        case 0: >>>> +            /* >>>> +             * We need to unmap and free the old table before >>>> +             * overwriting it with a block entry. >>>> +             */ >>>> +            tblp = ptep - ARM_LPAE_LVL_IDX(iova, lvl, data); >>>> +            if (__arm_lpae_unmap(data, NULL, iova, sz, lvl, tblp) != sz) { >>>> +                WARN_ON(1); >>>> +                return -EINVAL; >>>> +            } >>>> +            break; >>>> +        case 1: >>>> +            __arm_lpae_init_pte(data, paddr, prot, lvl, ptep, ARM_LPAE_PTE_nT); >>>> + >>>> +            io_pgtable_tlb_flush_walk(iop, iova, sz, ARM_LPAE_GRANULE(data)); >>>> +            tblp = iopte_deref(pte, data); >>>> +            __arm_lpae_free_pgtable(data, lvl + 1, tblp); >>>> +            break; >>>> +        case 2: >>>> +            __arm_lpae_init_pte(data, paddr, prot, lvl, ptep, 0); >>>> + >>>> +            io_pgtable_tlb_flush_walk(iop, iova, sz, ARM_LPAE_GRANULE(data)); >>>> +            tblp = iopte_deref(pte, data); >>>> +            __arm_lpae_free_pgtable(data, lvl + 1, tblp); >>>> +            return 0; >>> >>> Sorry, but I really don't understand what you're trying to do here. The old >>> code uses BBM for the table -> block path so we don't need anything extra >>> here. The dodgy case is when we unmap part of a block, and end up installing >>> a table via arm_lpae_split_blk_unmap(). We can't use BBM there because there >>> could be ongoing DMA to parts of the block mapping that we want to remain in >>> place. >>> >>> Are you seeing a problem in practice? >> >> Right, I was under the assumption that we could ignore BBML because we should never have a legitimate reason to split blocks. I'm certainly not keen on piling any more complexity into split_blk_unmap, because the IOMMU API clearly doesn't have a well-defined behaviour for that case anyway - some other drivers will just unmap the entire block, and IIRC there was a hint somewhere in VFIO that it might actually expect that behaviour. > > KVM has this usage scenario. As far as I'm aware KVM doesn't go anywhere near the IOMMU API, does it? If you mean VFIO, the comment starting at line 1110 of vfio_iommu_type1.c disagrees, especially the part that explicitly says "Unmap requests must fully cover previous mappings." Robin.