Message-ID: <1570628307.19130.53.camel@mhfsdcap03>
Subject: Re: [PATCH] iommu/mediatek: Move the tlb_sync into tlb_flush
From: Yong Wu
To: Tomasz Figa
CC: Matthias Brugger, Joerg Roedel, Will Deacon, Evan Green, Robin Murphy,
    "moderated list:ARM/Mediatek SoC support", srv_heupstream,
    Linux Kernel Mailing List,
    "list@263.net:IOMMU DRIVERS , Joerg Roedel ,", "Nicolas Boichat"
Date: Wed, 9 Oct 2019 21:38:27 +0800
References: <1569822142-14303-1-git-send-email-yong.wu@mediatek.com>
    <1570522162.19130.38.camel@mhfsdcap03>
X-Mailing-List: linux-kernel@vger.kernel.org

On Wed, 2019-10-09 at 16:56 +0900, Tomasz Figa wrote:
> On Tue, Oct 8, 2019 at 5:09 PM Yong Wu wrote:
> >
> > Hi Tomasz,
> >
> > Sorry for the late reply.
> >
> > On Wed, 2019-10-02 at 14:18 +0900, Tomasz Figa wrote:
> > > Hi Yong,
> > >
> > > On Mon, Sep 30, 2019 at 2:42 PM Yong Wu wrote:
> > > >
> > > > The commit 4d689b619445 ("iommu/io-pgtable-arm-v7s: Convert to IOMMU API
> > > > TLB sync") helped move the tlb_sync of unmap from v7s into the iommu
> > > > framework. It added a new function "mtk_iommu_iotlb_sync", but that
> > > > function lacks the dom->pgtlock, so the variable "tlb_flush_active"
> > > > may be changed unexpectedly, and we see this warning log randomly:
> > > >
> > >
> > > Thanks for the patch! Please see my comments inline.
> > >
> > > > mtk-iommu 10205000.iommu: Partial TLB flush timed out, falling back to
> > > > full flush
> > > >
> > > > To fix this issue, we can take dom->pgtlock in "mtk_iommu_iotlb_sync".
> > > > While checking this issue, we also found that __arm_v7s_unmap calls
> > > > io_pgtable_tlb_add_flush consecutively for a supersection/largepage,
> > > > which is also potentially unsafe for us. There is no tlb flush queue in
> > > > the MediaTek M4U HW. The HW always expects tlb_flush/tlb_sync one by one.
> > > > Since v7s doesn't always guarantee that sequence, in this patch I move
> > > > the tlb_sync into tlb_flush (also rename the function, deleting "_nosync").
> > > > We also don't care whether it is a leaf, so rearrange the callback
> > > > functions. Also, the tlb flush/sync is already finished in v7s, so
> > > > iotlb_sync and iotlb_sync_all are unnecessary.
> > >
> > > Performance-wise, we could do much better. Instead of synchronously
> > > syncing at the end of mtk_iommu_tlb_add_flush(), we could sync at the
> > > beginning, if there was any previous flush still pending. We would
> > > also have to keep the .iotlb_sync() callback, to take care of waiting
> > > for the last flush. That would allow better pipelining with CPU in
> > > cases like this:
> > >
> > > for (all pages in range) {
> > >         change page table();
> > >         flush();
> > > }
> > >
> > > "change page table()" could execute while the IOMMU is flushing the
> > > previous change.
> >
> > Do you mean adding a new tlb_sync before tlb_flush_no_sync, like below:
> >
> > mtk_iommu_tlb_add_flush_nosync {
> > +       mtk_iommu_tlb_sync();
> >         tlb_flush_no_sync();
> >         data->tlb_flush_active = true;
> > }
> >
> > mtk_iommu_tlb_sync {
> >         if (!data->tlb_flush_active)
> >                 return;
> >         tlb_sync();
> >         data->tlb_flush_active = false;
> > }
> >
> > This way looks like it improves the flow, but the flow is not the root
> > cause of this issue. The problem is that "data->tlb_flush_active" may be
> > changed from mtk_iommu_iotlb_sync, which doesn't take dom->pgtlock.
>
> That was not the only problem with existing code. Existing code also
> assumed that add_flush and sync always go in pairs, but that's not
> true.

Yes. That is why I make every tlb_flush immediately followed by a
tlb_sync, to make sure they always go in pairs.

>
> My suggestion is to fix the locking in the driver and keep the sync
> deferred as much as possible, so that performance is not degraded. I

I really didn't get this timeout warning log in the previous kernel (many
tlb_flush calls followed by one tlb_sync), but deferring the sync is not
recommended by our DE, so I would still like to fix the sequence in this
patch by putting them together.

> changed my mind, though. I think we would need to make more changes to
> the driver to make it implement the flushing efficiently, so let's go
> with the current simple approach for now and improve incrementally.
>
> >
> > Currently the synchronisation of the tlb_flush/tlb_sync flow is
> > controlled by the variable "data->tlb_flush_active".
> >
> > In this patch, putting the tlb_flush/tlb_sync together makes
> > the flow simpler:
> > a) We don't need the sensitive variable "tlb_flush_active".
> > b) mtk_iommu_iotlb_sync is removed, so no lock needs to be added to it.
> > c) tlb_flush_walk/tlb_flush_leaf are simplified.
> > Is it ok?
> >
>
> Okay, let's do so as a first step to fix the issue. Then we can
> optimize in follow up patches.

Thanks for the confirmation. I have sent a quick v2.

> > > >
> > > > Besides, there are two minor changes:
> > > > a) Use writel for the register F_MMU_INV_RANGE which is for triggering the
> > > > HW work. We expect all the settings (iova_start/iova_end...) to have
> > > > already been finished before F_MMU_INV_RANGE.
> > > > b) Reduce the tlb timeout value from 100000us to 1000us. The original
> > > > value is so long that it affects the multimedia performance.
> > >
> > > By definition, timeout is something that should not normally happen.
> > > Too long timeout affecting multimedia performance would suggest that
> > > the timeout was actually happening, which is the core problem, not the
> > > length of the timeout. Could you provide more details on this?
> >
> > As described above, this issue is because there is no dom->pgtlock in
> > mtk_iommu_iotlb_sync. I have verified that the issue disappears
> > after adding the lock to it.
> >
> > Although the issue is fixed after this patch, I would still like to
> > reduce the timeout value in case some error happens in the future. 100ms
> > is unnecessary for us. It is a minor improvement rather than a fix for
> > the issue. I will use a new patch for it.
> >
>
> Okay, makes sense.
>
> Best regards,
> Tomasz
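
For reference, the two orderings discussed in this thread can be compared with a small user-space model. This is only a sketch under stated assumptions, not the driver code: struct fake_m4u, hw_issue(), hw_wait(), unmap_simple() and unmap_pipelined() are hypothetical names, the hardware is modelled as holding at most one outstanding invalidation (no flush queue), and the dom->pgtlock locking is left out entirely. The "overlapped" counter records how many page-table updates ran while an invalidation was still pending, which is the pipelining effect Tomasz describes.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of an IOMMU with no TLB flush queue. */
struct fake_m4u {
	bool pending;            /* invalidation issued, not yet waited for */
	unsigned int flushes;    /* invalidations issued */
	unsigned int syncs;      /* completions actually waited for */
	unsigned int overlapped; /* page-table updates done while pending */
};

/* Issue one range invalidation; the HW must be idle (one by one). */
static void hw_issue(struct fake_m4u *hw)
{
	assert(!hw->pending);
	hw->pending = true;
	hw->flushes++;
}

/* Wait for the outstanding invalidation, if there is one. */
static void hw_wait(struct fake_m4u *hw)
{
	if (!hw->pending)
		return;
	hw->pending = false;
	hw->syncs++;
}

/* CPU-side page-table update; note whether it overlaps a pending flush. */
static void change_page_table(struct fake_m4u *hw)
{
	if (hw->pending)
		hw->overlapped++;
}

/* Simple approach from the patch: every flush is followed by its sync. */
struct fake_m4u unmap_simple(unsigned int pages)
{
	struct fake_m4u hw = { 0 };

	for (unsigned int i = 0; i < pages; i++) {
		change_page_table(&hw);
		hw_issue(&hw);
		hw_wait(&hw);
	}
	return hw;
}

/* Deferred approach: sync the previous flush first, then issue the next. */
struct fake_m4u unmap_pipelined(unsigned int pages)
{
	struct fake_m4u hw = { 0 };

	for (unsigned int i = 0; i < pages; i++) {
		change_page_table(&hw);
		hw_wait(&hw);   /* wait for the previous flush, if any */
		hw_issue(&hw);
	}
	hw_wait(&hw);           /* role of the retained .iotlb_sync() callback */
	return hw;
}
```

In this model both variants issue the same number of flushes and syncs and never have two invalidations outstanding; the only difference is that the deferred variant lets page-table updates overlap with a pending flush, at the cost of needing a final sync and correct locking around the pending state.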