From: Vincent Guittot
Date: Tue, 15 Mar 2022 17:51:41 +0100
Subject: Re: [PATCH] sched: topology: make cache topology separate from cpu topology
To: 王擎 (Wang Qing)
Cc: Catalin Marinas, Will Deacon, Sudeep Holla, Greg Kroah-Hartman,
 "Rafael J. Wysocki", Ingo Molnar, Peter Zijlstra, Juri Lelli,
 Dietmar Eggemann, Steven Rostedt, Ben Segall, Mel Gorman,
 Daniel Bristot de Oliveira, linux-arm-kernel@lists.infradead.org,
 linux-kernel@vger.kernel.org

On Tue, 15 Mar 2022 at 02:58, 王擎 wrote:
>
> >> >> >> >On Thu, 10 Mar 2022 at 13:59, Qing Wang wrote:
> >> >> >> >>
> >> >> >> >> From: Wang Qing
> >> >> >> >>
> >> >> >> >> On some architectures (e.g. ARM64), caches are implemented as below:
> >> >> >> >> cluster:               ****** cluster 0 ******   ****** cluster 1 ******
> >> >> >> >> core:                   0     1     2     3       4     5     6     7
> >> >> >> (add cache level 1)       c0    c1    c2    c3      c4    c5    c6    c7
> >> >> >> >> cache(Level n):        **cache0**  **cache1**   **cache2**  **cache3**
> >> >> >> (add cache level 3)      ************* share level 3 cache *************
> >> >> >> >> sd_llc_id(current):     0     0     0     0       4     4     4     4
> >> >> >> >> sd_llc_id(should be):   0     0     2     2       4     4     6     6
> >> >> >> >>
> >> >> >> Here, n is always 2 on ARM64, but other values are also possible.
> >> >> >> core[0,1] form a complex (ARMv9) which shares an L2 cache; core[2,3] is the same.
> >> >> >>
> >> >> >> >> Caches and cpus have different topologies; this causes cpus_share_cache()
> >> >> >> >> to return the wrong value, which will affect the CPU load balance.
> >> >> >> >>
> >> >> >> >What does your current scheduler topology look like?
> >> >> >> >
> >> >> >> >For CPU 0 to 3, do you have the below ?
> >> >> >> >DIE [0 - 3] [4-7]
> >> >> >> >MC  [0] [1] [2] [3]
> >> >> >>
> >> >> >> The current scheduler topology is consistent with the CPU topology:
> >> >> >> DIE [0-7]
> >> >> >> MC  [0-3] [4-7] (SD_SHARE_PKG_RESOURCES)
> >> >> >> Most Android phones have this topology.
> >> >> >> >
> >> >> >> >But you would like something like below for cpu 0-1 instead ?
> >> >> >> >DIE [0 - 3] [4-7]
> >> >> >> >CLS [0 - 1] [2 - 3]
> >> >> >> >MC  [0] [1]
> >> >> >> >
> >> >> >> >with SD_SHARE_PKG_RESOURCES only set to MC level ?
> >> >> >>
> >> >> >> We don't change the current scheduler topology, but the
> >> >> >> cache topology should be separated like below:
> >> >> >
> >> >> >The scheduler topology is not only the cpu topology but a mix of the
> >> >> >cpu and cache/memory topologies.
> >> >> >
> >> >> >> [0-7] (shared level 3 cache)
> >> >> >> [0-1] [2-3] [4-5] [6-7] (shared level 2 cache)
> >> >> >
> >> >> >So you don't bother with the intermediate cluster level, which is even simpler:
> >> >> >you have to modify the generic arch topology so that cpu_coregroup_mask
> >> >> >returns the correct cpu mask directly.
> >> >> >
> >> >> >You will notice an llc_sibling field that is currently used by acpi but
> >> >> >not DT to return the llc cpu mask.
> >> >> >
> >> >> cpu_topology[].llc_sibling describes the last level cache of the whole
> >> >> system, not within a sched_domain.
> >> >>
> >> >> In the above cache topology, llc_sibling is 0xff ([0-7]); it describes
> >> >
> >> >If llc_sibling was 0xff ([0-7]) on your system, you would have only one level:
> >> >MC [0-7]
> >>
> >> Sorry, but I don't get it. Why does llc_sibling being 0xff ([0-7]) mean MC [0-7]?
> >> In our system (Android), llc_sibling is indeed 0xff ([0-7]) because all
> >> CPUs share the llc (L3), but we also have two levels:
> >> DIE [0-7]
> >> MC  [0-3] [4-7]
> >> It makes sense: [0-3] are little cores, [4-7] are big cores, so we only
> >> up-migrate when misfit. We won't change it.
> >>
> >> >
> >> >> the L3 cache sibling, but sd_llc_id describes the maximum shared cache
> >> >> in the sd, which should be [0-1] instead of [0-3].
> >> >
> >> >sd_llc_id describes the last sched_domain with SD_SHARE_PKG_RESOURCES.
> >> >If you want the llc to be [0-3], make sure that the
> >> >sched_domain_topology_level array returns the correct cpumask with
> >> >this flag.
> >>
> >> Actually, we want sd_llc to be [0-1] [2-3], but if the MC domain doesn't have
> >
> >sd_llc_id refers to a scheduler domain, but your patch breaks this, so
> >if you want an llc that reflects this topology, [0-1] [2-3], you must
> >provide a sched_domain level with this topology.
>
> Maybe we should add a shared-cache level (SC), like what CLS does:
>
> DIE [0-7] (shared level 3 cache, SD_SHARE_PKG_RESOURCES)
> MC  [0-3] [4-7] (not SD_SHARE_PKG_RESOURCES)
> CLS (if necessary)
> SC  [0-1] [2-3] [4-5] [6-7] (shared level 2 cache, SD_SHARE_PKG_RESOURCES)
> SMT (if necessary)
>
> SC means a couple of CPUs which are placed closely together by sharing
> mid-level caches, but not closely enough to be a cluster.

What you name SC above looks the same as CLS, which should not be mixed up
with Arm cluster terminology.

> >
> >Side question: why don't you want the llc to be the L3 one ?
>
> Yes, we should set SD_SHARE_PKG_RESOURCES on DIE, but we also want to
> represent the mid-level caches to improve throughput.

So your topology (from a scheduler PoV) should be, for cpu0:
DIE [0 - 3] [4 - 7] (SD_SHARE_PKG_RESOURCES)
MC  [0 - 1] [2 - 3] (SD_SHARE_PKG_RESOURCES)
CLS [0] [1]         (SD_SHARE_PKG_RESOURCES)

So the llc will be the DIE level, and load balancing will spread tasks
across the different groups of cpus. And regarding EAS, it will look at the
DIE level for task placement.

And of course, this doesn't need any change in the scheduler, only in
arch_topology.c to return the correct cpumask for your system.

> Thanks,
> Wang
>
> >
> >> SD_SHARE_PKG_RESOURCES flag, the sd_llc will be [0] [1] [2] [3]. It's not true.
> >
> >The only entry point for describing the scheduler domains is the
> >sched_domain_topology_level array, which provides some cpumasks and some
> >associated flags. By default, SD_SHARE_PKG_RESOURCES is set for the
> >scheduler MC level, which implies that the cpus share their cache. If
> >this is not the case for your system, you should either remove this
> >flag or update the cpumask to reflect which CPUs really share their
> >caches.
> >
> >> So we must separate sd_llc from the sd topology, or the demand cannot be
> >> met in any case under the existing mechanism.
> >
> >There is a default array with DIE, MC, CLS and SMT levels, with
> >SD_SHARE_PKG_RESOURCES set up to MC, which is considered to be the LLC,
> >but a different array than the default one can be provided with
> >set_sched_topology().
> >
> >Thanks,
> >Vincent
> >
> >> Thanks,
> >> Wang
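On the DT side Vincent mentions (llc_sibling is currently filled in from acpi but not from DT), the cache layout under discussion could be described to firmware parsers with the standard cache-node properties. A sketch only: node names and the reduced two-CPU structure are illustrative, not taken from the patch.

```dts
/* Illustrative DT fragment: per-complex L2 caches feeding one shared L3,
 * using the standard next-level-cache / cache-level bindings. */
cpus {
        cpu0: cpu@0 {
                device_type = "cpu";
                next-level-cache = <&l2_0>;   /* L2 shared with cpu1 */
        };
        cpu1: cpu@100 {
                device_type = "cpu";
                next-level-cache = <&l2_0>;
        };
        /* cpu2/cpu3 -> &l2_1, cpu4/cpu5 -> &l2_2, cpu6/cpu7 -> &l2_3 */

        l2_0: l2-cache0 {
                compatible = "cache";
                cache-level = <2>;
                cache-unified;
                next-level-cache = <&l3>;     /* all complexes share L3 */
        };

        l3: l3-cache {
                compatible = "cache";
                cache-level = <3>;
                cache-unified;
        };
};
```

Arch code that parsed such nodes could then build the [0-1][2-3][4-5][6-7] L2 grouping directly, rather than inferring it from the cpu topology.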