From: Zhang Zekun
Subject: [PATCH] iommu/arm-smmu-v3: Add a threshold to avoid potential soft lockup
Date: Mon, 15 Jan 2024 19:40:40 +0800
Message-ID: <20240115114040.6279-1-zhangzekun11@huawei.com>
X-Mailer: git-send-email 2.17.1
X-Mailing-List: linux-kernel@vger.kernel.org

Commit d5afb4b47e13 ("iommu/arm-smmu-v3: Fix soft lockup triggered by
arm_smmu_mm_invalidate_range") fixed a soft lockup in the SVA case, but
the code paths from iommu_unmap() and the DMA APIs remain unfixed and
can still trigger a soft lockup.

When the cmdq is busy and has little space left for batch-submitting
commands, and the size passed to __arm_smmu_tlb_inv_range() is large
(1G in this case), the following soft lockup is triggered:

WARN: soft lockup - CPU#71 stuck for 12s! [qemu-kvm:1303]
..
Call trace:
 dump_backtrace+0x0/0x200
 show_stack+0x20/0x30
 dump_stack+0xf0/0x138
 watchdog_print_info+0x48/0x54
 watchdog_process_before_softlockup+0x9c/0xa0
 watchdog_timer_fn+0x1ac/0x2f0
 __run_hrtimer+0x98/0x2b4
 __hrtimer_run_queues+0xc0/0x13c
 hrtimer_interrupt+0x150/0x3e4
 arch_timer_handler_phys+0x3c/0x50
 handle_percpu_devid_irq+0x90/0x1f4
 __handle_domain_irq+0x84/0xfc
 gic_handle_irq+0x88/0x2b0
 el1_irq+0xb8/0x140
 arm_smmu_cmdq_issue_cmdlist+0x184/0x5f4
 __arm_smmu_tlb_inv_range+0x114/0x22c
 arm_smmu_tlb_inv_walk+0x88/0x120
 __arm_lpae_unmap+0x188/0x2c0
 __arm_lpae_unmap+0x104/0x2c0
 arm_lpae_unmap+0x68/0x80
 arm_smmu_unmap+0x24/0x40
 __iommu_unmap+0xd8/0x210
 iommu_unmap+0x44/0x9c
..

The basic idea is to use the actual granule size, instead of the
PAGE_SIZE used in the SVA scenario, to calculate a threshold. When an
SMMU without ARM_SMMU_FEAT_RANGE_INV needs to invalidate a TLB range
that reaches the threshold, invalidate the TLB at ASID or VMID
granularity instead. The calculation is similar to that of
'bits_per_level' when allocating the io-pgtable, and can also be
applied to the existing threshold in the SVA scenario.

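As a rough illustration of the threshold arithmetic described above, the
following userspace sketch (not part of the patch) mirrors
CMDQ_MAX_TLBI_OPS(granule) and the fallback check. The sketch_* names are
hypothetical, and sketch_ilog2() is only a stand-in for the kernel's
ilog2() that assumes a power-of-two argument.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the kernel's ilog2(); powers of two only. */
static unsigned int sketch_ilog2(size_t v)
{
	unsigned int n = 0;

	while (v >>= 1)
		n++;
	return n;
}

/* Same arithmetic as CMDQ_MAX_TLBI_OPS(granule) in this patch. */
static size_t sketch_max_tlbi_ops(size_t granule)
{
	/* A granule below 8 bytes would make the shift count negative. */
	assert(granule >= 8 && (granule & (granule - 1)) == 0);
	return (size_t)1 << (sketch_ilog2(granule) - 3);
}

/* Smallest range, in bytes, that falls back to a per-ASID/VMID invalidation. */
static size_t sketch_inv_threshold(size_t granule)
{
	return sketch_max_tlbi_ops(granule) * granule;
}

/*
 * Mirrors the condition added to arm_smmu_tlb_inv_walk() and
 * arm_smmu_iotlb_sync(): true means "invalidate the whole context".
 */
static bool sketch_use_context_inv(bool has_range_inv, size_t size,
				   size_t granule)
{
	return !has_range_inv && size >= sketch_inv_threshold(granule);
}
```

With a 4K granule this gives 512 commands, i.e. a 2M threshold, so the 1G
unmap from the trace above would take the per-ASID/VMID path on an SMMU
without range invalidation. 16K and 64K granules give 32M and 512M
respectively.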

Also, update the comment to refer to "MAX_DVM_OPS" instead of
"MAX_TLBI_OPS", since the macro was renamed in commit ec1c3b9ff160
("arm64: tlbflush: Rename MAX_TLBI_OPS").

Signed-off-by: Zhang Zekun
---
 .../iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c   | 11 +--------
 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c   | 23 +++++++++++++++----
 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h   | 10 ++++++++
 3 files changed, 30 insertions(+), 14 deletions(-)

diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c
index 05722121f00e..164a218a4d41 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c
@@ -203,15 +203,6 @@ static void arm_smmu_free_shared_cd(struct arm_smmu_ctx_desc *cd)
 	}
 }
 
-/*
- * Cloned from the MAX_TLBI_OPS in arch/arm64/include/asm/tlbflush.h, this
- * is used as a threshold to replace per-page TLBI commands to issue in the
- * command queue with an address-space TLBI command, when SMMU w/o a range
- * invalidation feature handles too many per-page TLBI commands, which will
- * otherwise result in a soft lockup.
- */
-#define CMDQ_MAX_TLBI_OPS		(1 << (PAGE_SHIFT - 3))
-
 static void arm_smmu_mm_arch_invalidate_secondary_tlbs(struct mmu_notifier *mn,
 						struct mm_struct *mm,
 						unsigned long start,
@@ -228,7 +219,7 @@ static void arm_smmu_mm_arch_invalidate_secondary_tlbs(struct mmu_notifier *mn,
 	 */
 	size = end - start;
 	if (!(smmu_domain->smmu->features & ARM_SMMU_FEAT_RANGE_INV)) {
-		if (size >= CMDQ_MAX_TLBI_OPS * PAGE_SIZE)
+		if (size >= CMDQ_MAX_TLBI_OPS(PAGE_SIZE) * PAGE_SIZE)
 			size = 0;
 	} else {
 		if (size == ULONG_MAX)
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
index 0ffb1cf17e0b..cecccba17511 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
@@ -1997,6 +1997,14 @@ static void arm_smmu_tlb_inv_page_nosync(struct iommu_iotlb_gather *gather,
 static void arm_smmu_tlb_inv_walk(unsigned long iova, size_t size,
 				  size_t granule, void *cookie)
 {
+	struct arm_smmu_domain *smmu_domain = cookie;
+	struct arm_smmu_device *smmu = smmu_domain->smmu;
+
+	if (!(smmu->features & ARM_SMMU_FEAT_RANGE_INV) &&
+	    size >= CMDQ_MAX_TLBI_OPS(granule) * granule) {
+		arm_smmu_tlb_inv_context(cookie);
+		return;
+	}
 	arm_smmu_tlb_inv_range_domain(iova, size, granule, false, cookie);
 }
 
@@ -2502,13 +2510,20 @@ static void arm_smmu_iotlb_sync(struct iommu_domain *domain,
 				struct iommu_iotlb_gather *gather)
 {
 	struct arm_smmu_domain *smmu_domain = to_smmu_domain(domain);
+	struct arm_smmu_device *smmu = smmu_domain->smmu;
+	size_t size = gather->end - gather->start + 1;
+	size_t granule = gather->pgsize;
 
-	if (!gather->pgsize)
+	if (!granule)
 		return;
 
-	arm_smmu_tlb_inv_range_domain(gather->start,
-				      gather->end - gather->start + 1,
-				      gather->pgsize, true, smmu_domain);
+	if (!(smmu->features & ARM_SMMU_FEAT_RANGE_INV) &&
+	    size >= CMDQ_MAX_TLBI_OPS(granule) * granule) {
+		arm_smmu_tlb_inv_context(smmu_domain);
+		return;
+	}
+	arm_smmu_tlb_inv_range_domain(gather->start, size,
+				      granule, true, smmu_domain);
 }
 
 static phys_addr_t
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
index 65fb388d5173..a9a7376c0437 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
@@ -431,6 +431,16 @@ struct arm_smmu_ste {
 #define MSI_IOVA_BASE			0x8000000
 #define MSI_IOVA_LENGTH			0x100000
 
+/*
+ * Similar to MAX_DVM_OPS in arch/arm64/include/asm/tlbflush.h, this is used
+ * as a threshold to replace per-page TLBI commands to issue in the command
+ * queue with an address-space TLBI command, when SMMU w/o a range invalidation
+ * feature handles too many per-page TLBI commands, which will otherwise result
+ * in a soft lockup.
+ */
+
+#define CMDQ_MAX_TLBI_OPS(granule)	(1 << (ilog2(granule) - 3))
+
 enum pri_resp {
 	PRI_RESP_DENY = 0,
 	PRI_RESP_FAIL = 1,
-- 
2.17.1