Date: Mon, 10 Jun 2024 15:42:53 +0530
X-Mailing-List: linux-kernel@vger.kernel.org
Subject: Re: [PATCH v9 3/5] iommu/arm-smmu: introduction of ACTLR for custom prefetcher settings
From: Bibek Kumar Patro
To: Rob Clark
CC: Dmitry Baryshkov, Konrad Dybcio
References: <20240123144543.9405-1-quic_bibekkum@quicinc.com>
 <20240123144543.9405-4-quic_bibekkum@quicinc.com>
 <51b2bd40-888d-4ee4-956f-c5239c5be9e9@linaro.org>
 <0a867cd1-8d99-495e-ae7e-a097fc9c00e9@quicinc.com>
 <7140cdb8-eda4-4dcd-b5e3-c4acdd01befb@linaro.org>
 <9992067e-51c5-4a55-8d66-55a102a001b6@quicinc.com>

On 6/6/2024 3:43 AM, Rob Clark wrote:
> On Wed, Jun 5, 2024 at 3:52 AM Bibek Kumar Patro wrote:
>>
>> On 6/5/2024 12:19 AM, Rob Clark wrote:
>>> On Thu, May 30, 2024 at 2:22 AM Bibek Kumar Patro wrote:
>>>>
>>>> On 5/28/2024 9:38 PM, Rob Clark wrote:
>>>>> On Tue, May 28, 2024 at 6:06 AM Dmitry Baryshkov wrote:
>>>>>>
>>>>>> On Tue, May 28, 2024 at 02:59:51PM +0200, Konrad Dybcio wrote:
>>>>>>>
>>>>>>> On 5/15/24 15:59, Bibek Kumar Patro wrote:
>>>>>>>>
>>>>>>>> On 5/10/2024 6:32 PM, Konrad Dybcio wrote:
>>>>>>>>> On 10.05.2024 2:52 PM, Bibek Kumar Patro wrote:
>>>>>>>>>>
>>>>>>>>>> On 5/1/2024 12:30 AM, Rob Clark wrote:
>>>>>>>>>>> On Tue, Jan 23, 2024 at 7:00 AM Bibek Kumar Patro wrote:
>>>>>>>>>>>>
>>>>>>>>>>>> Currently in Qualcomm SoCs the default prefetch is set to 1 which allows
>>>>>>>>>>>> the TLB to fetch just the next page table. MMU-500 features ACTLR
>>>>>>>>>>>> register which is implementation defined and is used for Qualcomm SoCs
>>>>>>>>>>>> to have a custom prefetch setting enabling TLB to prefetch the next set
>>>>>>>>>>>> of page tables accordingly allowing for faster translations.
>>>>>>>>>>>>
>>>>>>>>>>>> ACTLR value is unique for each SMR (Stream matching register) and stored
>>>>>>>>>>>> in a pre-populated table. This value is set to the register during
>>>>>>>>>>>> context bank initialisation.
>>>>>>>>>>>>
>>>>>>>>>>>> Signed-off-by: Bibek Kumar Patro
>>>>>>>>>>>> ---
>>>>>>>>>
>>>>>>>>> [...]
>>>>>>>>>
>>>>>>>>>>>> +
>>>>>>>>>>>> +       for_each_cfg_sme(cfg, fwspec, j, idx) {
>>>>>>>>>>>> +               smr = &smmu->smrs[idx];
>>>>>>>>>>>> +               if (smr_is_subset(smr, id, mask)) {
>>>>>>>>>>>> +                       arm_smmu_cb_write(smmu, cbndx, ARM_SMMU_CB_ACTLR,
>>>>>>>>>>>> +                                       actlrcfg[i].actlr);
>>>>>>>>>>>
>>>>>>>>>>> So, this makes ACTLR look like kind of a FIFO. But I'm looking at
>>>>>>>>>>> downstream kgsl's PRR thing (which we'll need to implement vulkan
>>>>>>>>>>> sparse residency), and it appears to be wanting to set BIT(5) in ACTLR
>>>>>>>>>>> to enable PRR.
>>>>>>>>>>>
>>>>>>>>>>>         val = KGSL_IOMMU_GET_CTX_REG(ctx, KGSL_IOMMU_CTX_ACTLR);
>>>>>>>>>>>         val |= FIELD_PREP(KGSL_IOMMU_ACTLR_PRR_ENABLE, 1);
>>>>>>>>>>>         KGSL_IOMMU_SET_CTX_REG(ctx, KGSL_IOMMU_CTX_ACTLR, val);
>>>>>>>>>>>
>>>>>>>>>>> Any idea how this works? And does it need to be done before or after
>>>>>>>>>>> the ACTLR programming done in this patch?
>>>>>>>>>>>
>>>>>>>>>>> BR,
>>>>>>>>>>> -R
>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Hi Rob,
>>>>>>>>>>
>>>>>>>>>> Can you please help provide some more clarification on the FIFO part?
>>>>>>>>>> By FIFO are you referring to the storing of ACTLR data in the table?
>>>>>>>>>>
>>>>>>>>>> Thanks for pointing to the downstream implementation of kgsl driver for
>>>>>>>>>> the PRR bit. Since kgsl driver is already handling this PRR bit's
>>>>>>>>>> setting, this makes setting the PRR BIT(5) by SMMU driver redundant.
>>>>>>>>>
>>>>>>>>> The kgsl driver is not present upstream.
>>>>>>>>>
>>>>>>>>
>>>>>>>> Right kgsl is not present upstream, it would be better to avoid
>>>>>>>> configuring the PRR bit and can be handled by kgsl directly in downstream.
>>>>>>>
>>>>>>> No!
>>>>>>> Upstream is not a dumping ground to reduce your technical debt.
>>>>>>>
>>>>>>> There is no kgsl driver upstream, so this ought to be handled here, in
>>>>>>> the iommu driver (as poking at hardware A from driver B is usually not good
>>>>>>> practice).
>>>>>>
>>>>>> I'd second the request here. If another driver has to control the
>>>>>> behaviour of another driver, please add corresponding API for that.
>>>>>
>>>>> We have adreno_smmu_priv for this purpose ;-)
>>>>>
>>>>
>>>> Thanks Rob for pointing to this private interface structure between smmu
>>>> and gpu. I think it's similar to what you're trying to implement here
>>>> https://lore.kernel.org/all/CAF6AEGtm-KweFdMFvahH1pWmpOq7dW_p0Xe_13aHGWt0jSbg8w@mail.gmail.com/#t
>>>> I can add an api "set_actlr_prr()" with smmu_domain cookie, page pointer
>>>> as two parameters. This api then can be used by drm/msm driver to carry
>>>> out the prr implementation by simply calling this.
>>>> Would this be okay Rob, Konrad, Dmitry?
>>>> Let me know if you have any other suggestions in mind as well regarding
>>>> parameters and placement.
>>>
>>> Hey Bibek, quick question.. is ACTLR preserved across a suspend/resume
>>> cycle? Or does it need to be reprogrammed on resume? And same
>>> question for these two PRR related regs:
>>>
>>>         /* Global SMMU register offsets */
>>>         #define KGSL_IOMMU_PRR_CFG_LADDR        0x6008
>>>         #define KGSL_IOMMU_PRR_CFG_UADDR        0x600c
>>>
>>> (ie. high/low 32b of the PRR page)
>>>
>>
>> Hey Rob, In suspend/resume, the register space power rails are not in
>> disabled state, so it won't go back to reset values and should retain
>> its value. Only in hibernation cycle the registers' value would get reset.
>>
>> So the hi/low address bit register for PRR page would also retain its
>> value along with the ACTLR registers.
>>
>>> I was starting to type up a patch to add PRR configuration, but
>>> depending on whether it interacts with suspend/resume, it might be
>>> better for arm-smmu-qcom.c to just always enable and configure PRR
>>> (including allocating a page to have an address to program into
>>> PRR_CFG_LADDR/UADDR), and instead add an interface to return the PRR
>>> page? I think there is no harm in unconditionally configuring PRR for
>>> gpu smmu.
>>
>> Sounds okay though since this would not interact with suspend/resume path.
>> But I think, suppose in case this page would have some other references
>> as well before configuring the address to the registers for PRR
>> configuration, then GPU would be dependent on arm-smmu-qcom for this page.
>> So instead an endpoint api in arm-smmu-qcom.c can receive just the
>> page-address, and bit set status from drm/msm driver and can set/reset
>> the bit along with any page-address they want?
>> It would mean the interface will be smmu's, but the choice of
>> configuration data to the registers will be still with gpu.
>>
>> I wrote up a small patch with this implementation, would you like to
>> review that?
>> Will send it in this v11 series as new patch.
>
> I think if there is no suspend/resume interaction, we should go back
> to the original idea of page allocation in drm/msm.
>
> Basically, I think the pros and cons are:
>
> allocate in arm-smmu
>   pro: easy to sequence programming with suspend/resume
>   con: there isn't a convenient place to free the page on driver unload
>
> allocate in drm/msm:
>   pro: easy place to free the page in teardown
>   con: harder to sequence with s/r
>
> But if ACTLR and PRR_CFG_LADDR/UADDR are retained, then the con isn't
> actually an issue ;-)
>

Sounds right, also in this case the ownership of the page stays with
drm/msm which might also make it easy to handle the page for them.

> Anyways, I can type that patch..
> the rest of drm/msm and userspace
> changes (vm_bind + sparse) to get to the point where I can use PRR are
> a somewhat bigger task so it will take me a while to get to the point
> where I can test any smmu patches.
>

Sure Rob, got it. Previously in v11 I sent a patch adding an
adreno-smmu-priv api with a similar "page allocation in drm" design as you
explained above. Does that approach look okay?
If it's okay, can I add you in a Suggested-by tag in that patch?

Thanks & regards,
Bibek

> BR,
> -R
>
>
>> Thanks & regards,
>> Bibek
>>
>>>
>>> BR,
>>> -R
>>>
>>>> Thanks & regards,
>>>> Bibek
>>>>
>>>>> BR,
>>>>> -R
>>>>>
>>>>>>>
>>>>>>>>
>>>>>>>>>> Thanks for bringing up this point.
>>>>>>>>>> I will send v10 patch series removing this BIT(5) setting from the ACTLR
>>>>>>>>>> table.
>>>>>>>>>
>>>>>>>>> I think it's generally saner to configure the SMMU from the SMMU driver..
>>>>>>>>
>>>>>>>> Yes, agree on this. But since PRR bit is not directly related to SMMU
>>>>>>>> configuration so I think it would be better to remove this PRR bit
>>>>>>>> setting from SMMU driver based on my understanding.
>>>>>>>
>>>>>>> Why is it not related? We still don't know what it does.
>>>>>>>
>>>>>>> Konrad
>>>>>>
>>>>>> --
>>>>>> With best wishes
>>>>>> Dmitry
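
For readers following the PRR discussion, the register sequence described in the thread (program the low/high 32 bits of the PRR page address, then read-modify-write BIT(5) of ACTLR) can be sketched in plain userspace C. This is only an illustration under stated assumptions, not the patch that was posted: struct fake_smmu and both helper names are hypothetical, and the registers are modelled as ordinary struct fields rather than MMIO. Only the BIT(5) position and the LADDR/UADDR split come from the thread.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* BIT(5) of ACTLR enables PRR, as quoted in the thread. */
#define ACTLR_PRR_ENABLE (1u << 5)

/*
 * Hypothetical stand-in for the hardware: real code would use MMIO
 * accessors. The thread gives 0x6008 and 0x600c as the downstream
 * offsets of the two PRR_CFG registers.
 */
struct fake_smmu {
	uint32_t cb_actlr;      /* context-bank ACTLR */
	uint32_t prr_cfg_laddr; /* low 32 bits of the PRR page address */
	uint32_t prr_cfg_uaddr; /* high 32 bits of the PRR page address */
};

/* Split a 64-bit page address across the two 32-bit registers. */
static void fake_set_prr_addr(struct fake_smmu *smmu, uint64_t page_addr)
{
	assert(smmu);
	smmu->prr_cfg_laddr = (uint32_t)(page_addr & 0xffffffffu);
	smmu->prr_cfg_uaddr = (uint32_t)(page_addr >> 32);
}

/* Read-modify-write so the other ACTLR bits (prefetch settings) survive. */
static void fake_set_prr_bit(struct fake_smmu *smmu, bool enable)
{
	uint32_t val;

	assert(smmu);
	val = smmu->cb_actlr;
	if (enable)
		val |= ACTLR_PRR_ENABLE;
	else
		val &= ~ACTLR_PRR_ENABLE;
	smmu->cb_actlr = val;
}
```

The read-modify-write is the relevant design point for Rob's ordering question: a helper written this way leaves whatever prefetch value was written to ACTLR earlier untouched, whereas a plain write of the table value afterwards would clear the PRR bit again.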