Received: by 2002:ac0:a5a7:0:0:0:0:0 with SMTP id m36-v6csp689225imm; Thu, 26 Jul 2018 10:14:56 -0700 (PDT) X-Google-Smtp-Source: AAOMgpcJQK801LauUdM3OETfPzeNOL9NDvgOjvet3zuEk14J+wWwsvaG5wJwbYA5AmN0gjd/0+0H X-Received: by 2002:a17:902:a60b:: with SMTP id u11-v6mr2731447plq.158.1532625296370; Thu, 26 Jul 2018 10:14:56 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1532625296; cv=none; d=google.com; s=arc-20160816; b=mZf7iReYjDmD4SYUspNKqF8CvkxpxJwBlOL6Oc5RdHjch0cRav2w/xbh/OzylfucIy 4l0plgo6MlAT5gGzyWvz7fqsGS+ZkRBBdLAUYx6ZDNyIia13mTyqb+u+azggpSQDE1MH Ma7xg4JVlfqzSH3lSQGTgITynO/asj8a9DSkUCYCyW4/nnUqoR4pdCu0Sst8zUfSHmPA GLuGxpyBxWdTWK3QI9z/mddPhT2pg/j4A7MLJwzuCHybENnuxBsKCH97Z4Ap2nZT+3eO /F9TzrQS76o9of7vAfrD2BhEqHuY94x/rswuFmcsrrf9jbUcBaFHNEPXkFQ5jzD76XwO hgKg== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:sender:content-transfer-encoding :content-language:in-reply-to:mime-version:user-agent:date :message-id:from:references:cc:to:subject:arc-authentication-results; bh=vkJ6jiXuwR2+z5PopWZ+juZSAv3o38C1SGE2TCDwbk4=; b=JERZFKniKOFqHJj9sSjLJTiZAcGSmkOjEzFck2clxSswT2k12EHs6SNftwOd8oFGMk 7bUzI2iYlSSAbov03UzF1pmWYkq0agyhxmnVE/NCA+9/Mxyj/mtF+0dnVdEokGZWOkt+ SkZTph9DEqnxX0BhbWYA1kQ0wxzFmD7Ve2OKhgpwieUtZcYPqzH1DoULefycmYuXxHT+ FTx9IU8JD0SVppGZEwQra//clyaaxN4FZ+jx2AUAmuQfu/+7nGy9CkLidMLEmKQJ4nKo v+lbYXP2/NlVJ1Ei/SEEUNG0iv59HFiTuoXzuTb2lGCOVvPwXVmNze9mqJPYbsC2nKmP UdMA== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Return-Path: Received: from vger.kernel.org (vger.kernel.org. [209.132.180.67]) by mx.google.com with ESMTP id n11-v6si1814200pgu.649.2018.07.26.10.14.29; Thu, 26 Jul 2018 10:14:56 -0700 (PDT) Received-SPF: pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) client-ip=209.132.180.67; Authentication-Results: mx.google.com; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1731766AbeGZQsT (ORCPT + 99 others); Thu, 26 Jul 2018 12:48:19 -0400 Received: from usa-sjc-mx-foss1.foss.arm.com ([217.140.101.70]:58236 "EHLO foss.arm.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1730179AbeGZQsT (ORCPT ); Thu, 26 Jul 2018 12:48:19 -0400 Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.72.51.249]) by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id 58ED515A2; Thu, 26 Jul 2018 08:30:57 -0700 (PDT) Received: from [10.4.12.131] (e110467-lin.emea.arm.com [10.4.12.131]) by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPSA id C187B3F575; Thu, 26 Jul 2018 08:30:54 -0700 (PDT) Subject: Re: [PATCH v13 1/4] iommu/arm-smmu: Add pm_runtime/sleep ops To: Vivek Gautam Cc: Joerg Roedel , robh+dt , "Rafael J. Wysocki" , Will Deacon , "list@263.net:IOMMU DRIVERS" , Joerg Roedel , iommu@lists.linux-foundation.org, "open list:OPEN FIRMWARE AND FLATTENED DEVICE TREE BINDINGS" , open list , Mark Rutland , Linux PM , sboyd@kernel.org, Lukas Wunner , linux-arm-msm , freedreno References: <20180719101539.6104-1-vivek.gautam@codeaurora.org> <20180719101539.6104-2-vivek.gautam@codeaurora.org> From: Robin Murphy Message-ID: <401d8509-b9ad-913b-334e-f4ac853472e3@arm.com> Date: Thu, 26 Jul 2018 16:30:53 +0100 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:52.0) Gecko/20100101 Thunderbird/52.9.1 MIME-Version: 1.0 In-Reply-To: Content-Type: text/plain; charset=utf-8; format=flowed Content-Language: en-US Content-Transfer-Encoding: 7bit Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On 26/07/18 08:12, Vivek Gautam wrote: > On Wed, Jul 25, 2018 at 11:46 PM, Vivek Gautam > wrote: >> On Tue, Jul 24, 2018 at 8:51 PM, Robin Murphy wrote: >>> On 19/07/18 11:15, Vivek Gautam wrote: >>>> >>>> From: Sricharan R >>>> >>>> The smmu needs to be functional only when the respective >>>> master's using it are active. The device_link feature >>>> helps to track such functional dependencies, so that the >>>> iommu gets powered when the master device enables itself >>>> using pm_runtime. So by adapting the smmu driver for >>>> runtime pm, above said dependency can be addressed. >>>> >>>> This patch adds the pm runtime/sleep callbacks to the >>>> driver and also the functions to parse the smmu clocks >>>> from DT and enable them in resume/suspend. >>>> >>>> Also, while we enable the runtime pm add a pm sleep suspend >>>> callback that pushes devices to low power state by turning >>>> the clocks off in a system sleep. >>>> Also add corresponding clock enable path in resume callback. >>>> >>>> Signed-off-by: Sricharan R >>>> Signed-off-by: Archit Taneja >>>> [vivek: rework for clock and pm ops] >>>> Signed-off-by: Vivek Gautam >>>> Reviewed-by: Tomasz Figa >>>> --- >>>> >>>> Changes since v12: >>>> - Added pm sleep .suspend callback. This disables the clocks. >>>> - Added corresponding change to enable clocks in .resume >>>> pm sleep callback. >>>> >>>> drivers/iommu/arm-smmu.c | 75 >>>> ++++++++++++++++++++++++++++++++++++++++++++++-- >>>> 1 file changed, 73 insertions(+), 2 deletions(-) >>>> >>>> diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c >>>> index c73cfce1ccc0..9138a6fffe04 100644 >>>> --- a/drivers/iommu/arm-smmu.c >>>> +++ b/drivers/iommu/arm-smmu.c > > [snip] > >>>> platform_device *pdev) >>>> arm_smmu_device_remove(pdev); >>>> } >>>> +static int __maybe_unused arm_smmu_runtime_resume(struct device *dev) >>>> +{ >>>> + struct arm_smmu_device *smmu = dev_get_drvdata(dev); >>>> + >>>> + return clk_bulk_enable(smmu->num_clks, smmu->clks); >>> >>> >>> If there's a power domain being automatically switched by genpd then we need >>> a reset here because we may have lost state entirely. Since I remembered the >>> otherwise-useless GPU SMMU on Juno is in a separate power domain, I gave it >>> a poking via sysfs with some debug stuff to dump sCR0 in these callbacks, >>> and the problem is clear: >>> >>> ... >>> [ 4.625551] arm-smmu 2b400000.iommu: genpd_runtime_suspend() >>> [ 4.631163] arm-smmu 2b400000.iommu: arm_smmu_runtime_suspend: 0x00201936 >>> [ 4.637897] arm-smmu 2b400000.iommu: suspend latency exceeded, 6733980 ns >>> [ 21.566983] arm-smmu 2b400000.iommu: genpd_runtime_resume() >>> [ 21.584796] arm-smmu 2b400000.iommu: arm_smmu_runtime_resume: 0x00220101 >>> [ 21.591452] arm-smmu 2b400000.iommu: resume latency exceeded, 6658020 ns >>> ... >> >> Qualcomm SoCs have retention enabled for SMMU registers so they don't >> lose state. >> ... >> [ 256.013367] arm-smmu b40000.arm,smmu: arm_smmu_runtime_suspend >> SCR0 = 0x201e36 >> [ 256.013367] >> [ 256.019160] arm-smmu b40000.arm,smmu: arm_smmu_runtime_resume >> SCR0 = 0x201e36 >> [ 256.019160] >> [ 256.027368] arm-smmu b40000.arm,smmu: arm_smmu_runtime_suspend >> SCR0 = 0x201e36 >> [ 256.027368] >> [ 256.036786] arm-smmu b40000.arm,smmu: arm_smmu_runtime_resume >> SCR0 = 0x201e36 >> ... >> >> However after adding arm_smmu_device_reset() in runtime_resume() I observe >> some performance degradation when kill an instance of 'kmscube' and >> start it again. >> The launch time with arm_smmu_device_reset() in runtime_resume() change is >> more. >> Could this be because of frequent TLB invalidation and sync? Probably. Plus the reset procedure is a big chunk of MMIO accesses, which for a non-trivial SMMU configuration probably isn't negligible in itself. Unfortunately, unless you know for absolute certain that you don't need to do that, you do. > Some more information that i gathered. > On Qcom SoCs besides the registers retention, TCU invalidates TLB cache on > a CX power collapse exit, which is the system wide suspend case. > The arm-smmu software is not aware of this CX power collapse / > auto-invalidation. > > So wouldn't doing an explicit TLB invalidations during runtime resume be > detrimental to performance? Indeed it would be, but resuming with TLBs full of random valid-looking junk is even more so. > I have one more doubt here - > We do runtime power cycle around arm_smmu_map/unmap() too. > Now during map/unmap we selectively do TLB maintenance (either > tlb_sync or tlb_add_flush). > But with runtime pm we want to do TLBIALL*. Is that a problem? It's technically redundant to do both, true, but as we've covered in previous rounds of discussion it's very difficult to know *which* one is sufficient at any given time, so in order to make progress for now I think we have to settle with doing both. Robin.