Subject: Re: [RESEND PATCH 2/2] iommu/iova: allocate iova_rcache->depot dynamicly
From: "zhangzekun (A)"
To: John Garry
CC: Robin Murphy
Date: Sat, 12 Aug 2023 15:18:50 +0800
Message-ID: <36924a4d-e62c-68d5-3cb0-375b7fe1d5c0@huawei.com>

On 2023/8/11 22:14, John Garry wrote:
> On 11/08/2023 14:02, Zhang Zekun wrote:
>> In a fio test with 4k, read, and allowed cpus set to 0-255, we observe
>> a performance decrease of IOPS. The normal IOPS
>
> What do you mean by normal IOPS? Describe this "normal" scenario.
>

Hi, John

The reason I think 1980k is normal is that I have tested the
iova_rcache hit rate for all iova sizes, and the average cache hit rate
can reach around 99% (it varies with different workloads and iova
sizes), so I think iova_rcache behaves well under our test workloads.
Besides, the IOPS behaves as expected, which has been acked by our test
group.

>> can reach up to 1980k, but we can only
>> get about 1600k.
>>
>> abnormal IOPS:
>> Jobs: 12 (f=12): [R(12)][99.3%][r=6220MiB/s][r=1592k IOPS][eta 00m:12s]
>> Jobs: 12 (f=12): [R(12)][99.4%][r=6215MiB/s][r=1591k IOPS][eta 00m:11s]
>> Jobs: 12 (f=12): [R(12)][99.4%][r=6335MiB/s][r=1622k IOPS][eta 00m:10s]
>> Jobs: 12 (f=12): [R(12)][99.5%][r=6194MiB/s][r=1586k IOPS][eta 00m:09s]
>> Jobs: 12 (f=12): [R(12)][99.6%][r=6173MiB/s][r=1580k IOPS][eta 00m:08s]
>> Jobs: 12 (f=12): [R(12)][99.6%][r=5984MiB/s][r=1532k IOPS][eta 00m:07s]
>> Jobs: 12 (f=12): [R(12)][99.7%][r=6374MiB/s][r=1632k IOPS][eta 00m:06s]
>> Jobs: 12 (f=12): [R(12)][99.7%][r=6343MiB/s][r=1624k IOPS][eta 00m:05s]
>>
>> normal IOPS:
>> Jobs: 12 (f=12): [R(12)][99.3%][r=7736MiB/s][r=1980k IOPS][eta 00m:12s]
>> Jobs: 12 (f=12): [R(12)][99.4%][r=7744MiB/s][r=1982k IOPS][eta 00m:11s]
>> Jobs: 12 (f=12): [R(12)][99.4%][r=7737MiB/s][r=1981k IOPS][eta 00m:10s]
>> Jobs: 12 (f=12): [R(12)][99.5%][r=7735MiB/s][r=1980k IOPS][eta 00m:09s]
>> Jobs: 12 (f=12): [R(12)][99.6%][r=7741MiB/s][r=1982k IOPS][eta 00m:08s]
>> Jobs: 12 (f=12): [R(12)][99.6%][r=7740MiB/s][r=1982k IOPS][eta 00m:07s]
>> Jobs: 12 (f=12): [R(12)][99.7%][r=7736MiB/s][r=1981k IOPS][eta 00m:06s]
>> Jobs: 12 (f=12): [R(12)][99.7%][r=7736MiB/s][r=1980k IOPS][eta 00m:05s]
>>
>> The current struct iova_rcache has an iova_cpu_rcache for every cpu,
>> and these iova_cpu_rcaches use a common buffer, iova_rcache->depot, to
>> exchange iovas among themselves. A machine with 256 cpus will have 256
>> iova_cpu_rcaches and 1 iova_rcache->depot per iova_domain. However,
>> the max size of iova_rcache->depot is fixed to MAX_GLOBAL_MAGS, which
>> equals 32, and can't grow with the number of cpus, which can cause
>> problems under some conditions.
>>
>> Some drivers will only free iovas in their irq callback functions. The
>> driver in this case has 16 thread irqs to free iovas, but these irq
>> callback functions will only free iovas on 16 specific cpus (cpu{0,16,
>> 32,...,240}). A thread irq whose smp affinity is 0-15 will only free
>> iovas on cpu 0. However, the driver will alloc iovas on all cpus
>> (cpu{0-255}), so cpus with no free iovas in their local cpu_rcache
>> need to get free iovas from iova_rcache->depot. The current max size
>> of iova_rcache->depot is 32, which seems too small for 256 users (16
>> cpus put iovas into iova_rcache->depot and 240 cpus try to get iovas
>> from it). Letting iova_rcache->depot grow with the number of possible
>> cpus makes the decrease in IOPS disappear.
>
> Doesn't it take a long time for all the depots to fill for you? From
> the description, this sounds like the hisilicon SAS controller which
> you are experimenting with, and I found there that it took a long time
> for the depots to fill in high-throughput testing.
>
> Thanks,
> John

Yes, it will take more time to fill rcache->depot, but we don't need to
fill all the depot slots before using the iovas in them. We can use
iovas from rcache->depot as soon as the local cpu_rcache is empty and
there are iova magazines in rcache->depot, which means
"rcache->depot_size > 0". A larger depot size helps cache more of the
iova magazines freed in __iova_rcache_insert(), which means more
potential memory cost for iova_rcache, but it should not introduce
performance issues.
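
For reference, the structures in question look roughly like this
(paraphrased from drivers/iommu/iova.c around v6.4/v6.5; exact
constants and field layout may differ between kernel versions):

#define IOVA_MAG_SIZE 127       /* pfns cached per magazine */
#define MAX_GLOBAL_MAGS 32      /* fixed depot capacity */

struct iova_magazine {
        unsigned long size;
        unsigned long pfns[IOVA_MAG_SIZE];
};

struct iova_cpu_rcache {
        spinlock_t lock;
        struct iova_magazine *loaded;
        struct iova_magazine *prev;
};

struct iova_rcache {
        spinlock_t lock;
        unsigned long depot_size;
        struct iova_magazine *depot[MAX_GLOBAL_MAGS];
        struct iova_cpu_rcache __percpu *cpu_rcaches;
};

Every cpu gets two private magazines, while all cpus of the domain
share the single 32-slot depot array guarded by rcache->lock.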
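
The consumption path I mean is the fallback in __iova_rcache_get(),
abridged below (same kernel versions as above): when both per-cpu
magazines are empty, a full magazine is pulled from the depot as soon
as depot_size > 0, so the depot never has to be full before it can
serve allocations.

        spin_lock(&rcache->lock);
        if (rcache->depot_size > 0) {
                /* swap the empty loaded magazine for a full one */
                iova_magazine_free(cpu_rcache->loaded);
                cpu_rcache->loaded = rcache->depot[--rcache->depot_size];
                has_pfn = true;
        }
        spin_unlock(&rcache->lock);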
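
As for the sizing change itself, the idea is simply to replace the
compile-time MAX_GLOBAL_MAGS bound with one derived from the
possible-cpu count at iova_domain init time. A minimal sketch (the
helper name and the exact scaling rule here are illustrative, not the
patch as posted):

/* illustrative only: scale the depot with the cpu count, but never
 * below the old fixed bound of 32 magazines */
static unsigned long iova_depot_max_mags(void)
{
        return max_t(unsigned long, MAX_GLOBAL_MAGS, num_possible_cpus());
}

The depot would then be allocated with kcalloc(iova_depot_max_mags(),
sizeof(*rcache->depot), GFP_KERNEL) rather than embedded as a
fixed-size array in struct iova_rcache.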