Date: Thu, 30 Jun 2022 18:46:08 +0800
From: Abel Wu
To: Chen Yu
Cc: Peter Zijlstra, Mel Gorman, Vincent Guittot, Josh Don, Tim Chen,
    K Prateek Nayak, Gautham R. Shenoy, linux-kernel@vger.kernel.org
Subject: Re: [PATCH v4 6/7] sched/fair: skip busy cores in SIS search
In-Reply-To: <20220630041645.GA9253@chenyu5-mobl1>
References: <20220619120451.95251-1-wuyun.abel@bytedance.com>
    <20220619120451.95251-7-wuyun.abel@bytedance.com>
    <20220621181442.GA37168@chenyu5-mobl1>
    <543d55e1-fad8-3df3-8bae-d79c0c8d8340@bytedance.com>
    <20220624033032.GA14945@chenyu5-mobl1>
    <3e4d2594-f678-b77a-4883-0b893daf19f6@bytedance.com>
    <2d18453d-9c9b-b57b-1616-d4a9229abd5a@bytedance.com>
    <20220630041645.GA9253@chenyu5-mobl1>

On 6/30/22 12:16 PM, Chen Yu wrote:
> On Tue, Jun 28, 2022 at 03:58:55PM +0800, Abel Wu wrote:
>>
>> On 6/27/22 6:13 PM, Abel Wu wrote:
>> There does not seem to be much difference, except in the hackbench
>> pipe test at certain groups (30~110).
> OK, a smaller LLC domain does not seem to make much difference, which
> might suggest that by leveraging the load balance code path, the
> read/write to the LLC shared mask might not be the bottleneck. I have a
> vague impression that during Aubrey's work on cpumask searching for idle
> CPUs[1], there was a concern that updating the shared mask in a large LLC
> introduced cache contention and performance degradation. Maybe we
> can find that regressed test case to verify.
> [1] https://lore.kernel.org/all/1615872606-56087-1-git-send-email-aubrey.li@intel.com/

I just went through Aubrey's v1-v11 patches and didn't find any
particular tests other than hackbench/tbench/uperf. Please let me know
if I missed something, thanks!

>> I intend to provide better scalability by applying the filter,
>> which will be enabled when:
>>
>>   - The LLC is large enough that simply traversing becomes
>>     insufficient, and/or
>>
>>   - The LLC is loaded enough that unoccupied CPUs are a minority.
>>
>> But it would be very nice if a more fine-grained pattern works well
>> so we can drop the above constraints.
>>
> We can first try to push a simple version, and later optimize it.
> One concern about v4 is that we changed the logic of v3, which recorded
> the overloaded CPUs, while v4 tracks unoccupied CPUs. An overloaded CPU is
> more "stable" because there is more than one running task on that runqueue.
> It is more likely to remain "occupied" for a while. That is to say,
> nr_task = 1, 2, 3... will all be regarded as occupied, while only nr_task = 0
> is unoccupied. The former would bring fewer false negatives/positives.

Yes, I like the 'overloaded mask' too, but the downside is the extra
cpumask ops needed in the SIS path (the added cpumask_andnot).
Besides, in this patch the 'overloaded mask' is also unstable because
the state is maintained at core level rather than per-cpu. Some
more thoughts are in the cover letter.

>
> So far I have tested hackbench/schbench/netperf on top of Peter's sched/core
> branch, with SIS_UTIL enabled. Overall it looks good, and netperf shows an
> especially significant improvement when the load approaches overload (which
> is aligned with your comment above). I'll re-run netperf for several cycles
> to check the standard deviation. I'm also curious about v3's performance
> because it tracks overloaded CPUs, so I'll also test v3 with small
> modifications.

Thanks very much for reviewing and testing.

Abel
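
For readers following the thread, the trade-off between the two masks can be
sketched in userspace C. This is only a rough model under stated assumptions,
not code from the patch: the LLC is modelled as a 64-bit word with one bit per
CPU, and llc_mask_t, build_overloaded_mask(), build_unoccupied_mask(),
candidates_v3() and candidates_v4() are hypothetical names. The v3-style path
needs an and-not step (standing in for the added cpumask_andnot mentioned
above), while the v4-style path is assumed to need only an and.

```c
#include <stdint.h>

/* Hypothetical model: one bit per CPU, LLC of at most 64 CPUs. */
typedef uint64_t llc_mask_t;

/* v3-style tracking: set a bit for every CPU with more than one task. */
static llc_mask_t build_overloaded_mask(const unsigned int *nr_running,
                                        int nr_cpus)
{
    llc_mask_t mask = 0;

    for (int cpu = 0; cpu < nr_cpus; cpu++)
        if (nr_running[cpu] > 1)
            mask |= (llc_mask_t)1 << cpu;
    return mask;
}

/* v4-style tracking: set a bit for every CPU with no task at all. */
static llc_mask_t build_unoccupied_mask(const unsigned int *nr_running,
                                        int nr_cpus)
{
    llc_mask_t mask = 0;

    for (int cpu = 0; cpu < nr_cpus; cpu++)
        if (nr_running[cpu] == 0)
            mask |= (llc_mask_t)1 << cpu;
    return mask;
}

/* v3-style search set: allowed CPUs minus overloaded ones (and-not). */
static llc_mask_t candidates_v3(llc_mask_t allowed, llc_mask_t overloaded)
{
    return allowed & ~overloaded;
}

/* v4-style search set: allowed CPUs that are marked unoccupied (and). */
static llc_mask_t candidates_v4(llc_mask_t allowed, llc_mask_t unoccupied)
{
    return allowed & unoccupied;
}
```

With nr_running = {0, 1, 2, 3}, the v3-style set still contains CPU 1 (one
task, occupied but not overloaded), so the search must still check it for
idleness. The v4-style set contains only CPU 0, but its bit changes every time
a CPU moves between zero and one task, which is the stability concern raised
above.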