Date: Thu, 24 Feb 2022 16:47:11 +0000
From: Mel Gorman
To: Abel Wu
Cc: Ben Segall, Daniel Bristot de Oliveira, Dietmar Eggemann,
    Ingo Molnar, Juri Lelli, Peter Zijlstra, Steven Rostedt,
    Vincent Guittot, linux-kernel@vger.kernel.org
Subject: Re: [RFC PATCH 0/5] introduce sched-idle balancing
Message-ID: <20220224164711.GA4363@suse.de>
In-Reply-To: <20220217154403.6497-1-wuyun.abel@bytedance.com>

On Thu, Feb 17, 2022 at 11:43:56PM +0800, Abel Wu wrote:
> Current load balancing is mainly based on cpu capacity
> and task util, which makes sense in the POV of overall
> throughput. While there still might be some improvement
> can be done by reducing number of overloaded cfs rqs if
> sched-idle or idle rq exists.
>
> An CFS runqueue is considered overloaded when there are
> more than one pullable non-idle tasks on it (since sched-
> idle cpus are treated as idle cpus). And idle tasks are
> counted towards rq->cfs.idle_h_nr_running, that is either
> assigned SCHED_IDLE policy or placed under idle cgroups.
>

It's not clear how your tests evaluated the balancing of SCHED_IDLE tasks
versus the existing idle balancing and isolated that impact. I suspect
the tests may have primarily measured the effect of the SIS filter.

> So in short, the goal of the sched-idle balancing is to
> let the *non-idle tasks* make full use of cpu resources.
> To achieve that, we mainly do two things:
>
>   - pull non-idle tasks for sched-idle or idle rqs
>     from the overloaded ones, and
>
>   - prevent pulling the last non-idle task in an rq
>
> The mask of overloaded cpus is updated in periodic tick
> and the idle path at the LLC domain basis. This cpumask
> will also be used in SIS as a filter, improving idle cpu
> searching.
>

As the overloaded mask may be updated on each idle, it could be a
significant source of cache misses between CPUs sharing the domain for
workloads that rapidly idle, so there should be data on whether cache
misses are increased heavily. It also potentially delays the CPU reaching
idle, but it may not be by much.

The filter may be out of date. It takes up to one tick to detect an
overloaded CPU and for the filter to have a positive impact. As a CPU is
not guaranteed to enter idle if there is at least one CPU-bound task, it
may also be up to 1 tick before the mask is cleared. I'm not sure this is
a serious problem though, as SIS would not pick the CPU with the CPU-bound
task anyway.

At minimum, the filter should be split out and considered first as it is
the most likely reason why a performance difference was measured. It has
some oddities, such as nr_overloaded really being a boolean and, as it's
under the rq lock, it's not clear why it's atomic.
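
To make the search-depth question concrete, here is a toy userspace
model of a per-LLC overloaded mask used as a filter when scanning for an
idle CPU. It is only a sketch under stated assumptions: it is not the
code from the series and not the kernel's SIS path. All names (toy_rq,
toy_update_mask, toy_select_idle, TOY_NR_CPUS) are hypothetical, the
scan is a plain linear walk from CPU 0, and "pullable" is ignored.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_NR_CPUS 8	/* hypothetical LLC size, must be <= 32 */

/* Hypothetical per-CPU state; this is not the kernel's struct rq. */
struct toy_rq {
	unsigned int nr_non_idle;	/* runnable non-idle tasks */
};

/* Overloaded: more than one non-idle task (pullability is ignored). */
bool toy_overloaded(const struct toy_rq *rq)
{
	return rq->nr_non_idle > 1;
}

/* Rebuild the LLC-wide filter, as a tick or idle-entry update might. */
uint32_t toy_update_mask(const struct toy_rq *rqs)
{
	uint32_t mask = 0;

	for (int cpu = 0; cpu < TOY_NR_CPUS; cpu++)
		if (toy_overloaded(&rqs[cpu]))
			mask |= UINT32_C(1) << cpu;
	return mask;
}

/*
 * Linear scan for a CPU with no non-idle tasks, skipping every CPU set
 * in @filter. *depth reports how many runqueues were actually inspected.
 * Returns the CPU number, or -1 if none was found.
 */
int toy_select_idle(const struct toy_rq *rqs, uint32_t filter, int *depth)
{
	assert(depth != NULL);
	*depth = 0;

	for (int cpu = 0; cpu < TOY_NR_CPUS; cpu++) {
		if (filter & (UINT32_C(1) << cpu))
			continue;	/* filtered out without touching the rq */
		(*depth)++;
		if (rqs[cpu].nr_non_idle == 0)
			return cpu;
	}
	return -1;
}
```

With non-idle counts of {2, 3, 2, 1, 2, 0, 1, 0}, the unfiltered scan
(filter of 0) inspects six runqueues before finding CPU 5, while the
filtered scan inspects two. If CPU 0 then drops to one task and the mask
is not refreshed, the stale bit still leads to CPU 5, which is the sense
in which a stale bit for a CPU that is busy anyway is mostly harmless.
A depth counter of this kind, collected in the real search path, is the
sort of evidence the changelog could report.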

The changelog would ideally contain some comment on the impact to cache
misses, if any, and some sort of proof that SIS search depth is reduced,
for which
https://lore.kernel.org/lkml/20210726102247.21437-2-mgorman@techsingularity.net/
may be some help.

At that point, compare the idle task balancing on top to isolate how much
it improves things, if any, and identify why existing balancing is
insufficient. Split out the can_migrate_task change beforehand in case it
is the main source of difference as opposed to the new balancing
mechanism.

-- 
Mel Gorman
SUSE Labs