Subject: Re: [PATCH] sched: topology: make cache topology separate from cpu topology
To: 王擎, Vincent Guittot
CC: Catalin Marinas, Will Deacon, Sudeep Holla, Greg Kroah-Hartman,
 "Rafael J. Wysocki", Ingo Molnar, Peter Zijlstra, Juri Lelli,
 Dietmar Eggemann, Steven Rostedt, Ben Segall, Mel Gorman,
 Daniel Bristot de Oliveira, linux-arm-kernel@lists.infradead.org,
 linux-kernel@vger.kernel.org
From: Yicong Yang
Date: Sat, 2 Apr 2022 18:29:00 +0800
References: <1646917125-20038-1-git-send-email-wangqing@vivo.com>
List-ID: linux-kernel@vger.kernel.org

Hi Qing,

On 2022/4/2 17:34, 王擎 wrote:
>>>>>>>>>> On Thu, 10 Mar 2022 at 13:59, Qing Wang wrote:
>>>>>>>>>>>
>>>>>>>>>>> From: Wang Qing
>>>>>>>>>>>
>>>>>>>>>>> On some architectures (e.g. ARM64), caches are implemented as below:
>>>>>>>>>>> cluster:                ****** cluster 0 *****    ****** cluster 1 *****
>>>>>>>>>>> core:                    0    1    2    3          4    5    6    7
>>>>>>>>> (add cache level 1)       c0   c1   c2   c3         c4   c5   c6   c7
>>>>>>>>>>> cache(Level n):         **cache0**  **cache1**    **cache2**  **cache3**
>>>>>>>>> (add cache level 3)       ************* share level 3 cache **************
>>>>>>>>>>> sd_llc_id(current):      0    0    0    0          4    4    4    4
>>>>>>>>>>> sd_llc_id(should be):    0    0    2    2          4    4    6    6
>>>>>>>>>>>
>>>>>>>>> Here, n is always 2 on ARM64, but other values are also possible.
>>>>>>>>> core[0,1] form a complex (ARMv9) which shares the L2 cache; core[2,3]
>>>>>>>>> is the same.
>>>>>>>>>
>>>>>>>>>>> Caches and cpus have different topologies. This causes cpus_share_cache()
>>>>>>>>>>> to return the wrong value, which will affect the CPU load balance.
>>>>>>>>>>>
>>>>>>>>>> What does your current scheduler topology look like?
>>>>>>>>>>
>>>>>>>>>> For CPU 0 to 3, do you have the below?
>>>>>>>>>> DIE [0     -     3] [4-7]
>>>>>>>>>> MC  [0] [1] [2] [3]
>>>>>>>>>
>>>>>>>>> The current scheduler topology is consistent with the CPU topology:
>>>>>>>>> DIE [0-7]
>>>>>>>>> MC  [0-3] [4-7]  (SD_SHARE_PKG_RESOURCES)
>>>>>>>>> Most Android phones have this topology.
>>>>>>>>>>
>>>>>>>>>> But you would like something like below for cpu 0-1 instead?
>>>>>>>>>> DIE [0     -     3] [4-7]
>>>>>>>>>> CLS [0 - 1] [2 - 3]
>>>>>>>>>> MC  [0] [1]
>>>>>>>>>>
>>>>>>>>>> with SD_SHARE_PKG_RESOURCES only set to MC level?
>>>>>>>>>
>>>>>>>>> We don't change the current scheduler topology, but the
>>>>>>>>> cache topology should be separated like below:
>>>>>>>>
>>>>>>>> The scheduler topology is not only the cpu topology but a mix of cpu and
>>>>>>>> cache/memory topology.
>>>>>>>>
>>>>>>>>> [0-7]                    (shared level 3 cache)
>>>>>>>>> [0-1] [2-3] [4-5] [6-7]  (shared level 2 cache)
>>>>>>>>
>>>>>>>> So you don't bother with the intermediate cluster level, which is even
>>>>>>>> simpler. You have to modify the generic arch topology so that
>>>>>>>> cpu_coregroup_mask returns the correct cpu mask directly.
>>>>>>>>
>>>>>>>> You will notice a llc_sibling field that is currently used by ACPI but
>>>>>>>> not DT to return the llc cpu mask.
>>>>>>>>
>>>>>>> cpu_topology[].llc_sibling describes the last level cache of the whole
>>>>>>> system, not within the sched_domain.
>>>>>>>
>>>>>>> In the above cache topology, llc_sibling is 0xff ([0-7]); it describes
>>>>>>
>>>>>> If llc_sibling were 0xff ([0-7]) on your system, you would have only one
>>>>>> level: MC [0-7]
>>>>>
>>>>> Sorry, but I don't get it. Why would llc_sibling being 0xff ([0-7]) mean
>>>>> MC [0-7]? In our system (Android), llc_sibling is indeed 0xff ([0-7]),
>>>>> because they share the llc (L3), but we also have two levels:
>>>>> DIE [0-7]
>>>>> MC  [0-3] [4-7]
>>>>> It makes sense: [0-3] are little cores, [4-7] are big cores, so tasks only
>>>>> migrate up when misfit. We won't change it.
>>>>>
>>>>>>
>>>>>>> the L3 cache sibling, but sd_llc_id describes the maximum shared cache
>>>>>>> in the sd, which should be [0-1] instead of [0-3].
>>>>>>
>>>>>> sd_llc_id describes the last sched_domain with SD_SHARE_PKG_RESOURCES.
>>>>>> If you want llc to be [0-3], make sure that the
>>>>>> sched_domain_topology_level array returns the correct cpumask with
>>>>>> this flag.
>>>>>
>>>>> Actually, we want sd_llc to be [0-1] [2-3], but if the MC domain doesn't have
>>>>
>>>> sd_llc_id refers to a scheduler domain, but your patch breaks this. So
>>>> if you want an llc that reflects this topo: [0-1] [2-3], you must
>>>> provide a sched_domain level with this topo.
>>>
>>> Maybe we should add a shared-cache level (SC), like what CLS does:
>>>
>>> DIE [0-7]                    (shared level 3 cache, SD_SHARE_PKG_RESOURCES)
>>> MC  [0-3] [4-7]              (not SD_SHARE_PKG_RESOURCES)
>>> CLS (if necessary)
>>> SC  [0-1] [2-3] [4-5] [6-7]  (shared level 2 cache, SD_SHARE_PKG_RESOURCES)
>>> SMT (if necessary)
>>>
>>> SC means a couple of CPUs which are placed closely together by sharing
>>> mid-level caches, but not closely enough to be a cluster.
>>
>> What you name SC above looks the same as CLS, which should not be mixed up
>> with Arm cluster terminology.
>
> Do you mean the cluster is equal to the shared cache rather than containing it?
> SC just means shared cache; it does not form a cluster, and a CLS can contain
> many SCs.

The cluster is a topology level above the CPUs but under the LLC. On Kunpeng 920
the cpus in a CLS share the L3 tag, and on Intel's Jacobsville the cpus in a CLS
share the L2 [1]. It seems you're using a DT based system. I think parsing of the
cluster level is not yet supported on DT, so you cannot see it. Otherwise, with
the right cpu topology reported, you would have a CLS level in which the cpus
share the L2 cache, just like Jacobsville.

[1] https://lore.kernel.org/all/20210924085104.44806-4-21cnbao@gmail.com/

> If, as you said, SC looks the same as CLS, should we rename CLS to SC to
> avoid confusion?
>
> Thanks,
> Wang
>