Date: Wed, 4 Oct 2023 20:15:10 +0200
From: Ingo Molnar
To: Julia Lawall
Cc: Peter Zijlstra, Ingo Molnar, Vincent Guittot, Dietmar Eggemann,
    Mel Gorman, linux-kernel@vger.kernel.org
Subject: Re: EEVDF and NUMA balancing

* Julia Lawall wrote:

>
>
> On Wed, 4 Oct 2023, Peter Zijlstra wrote:
>
> > On Wed, Oct 04, 2023 at 02:01:26PM +0200, Julia Lawall wrote:
> > >
> > >
> > > On Tue, 3 Oct 2023, Peter Zijlstra wrote:
> > >
> > > > On Tue, Oct 03, 2023 at 10:25:08PM +0200, Julia Lawall wrote:
> > > > > Is it expected that the commit e8f331bcc270 should have an impact on the
> > > > > frequency of NUMA balancing?
> > > >
> > > > Definitely not expected. The only effect of that commit was supposed to
> > > > be the runqueue order of tasks. I'll go stare at it in the morning --
> > > > definitely too late for critical thinking atm.
> > >
> > > Maybe it's just randomly making a bad situation worse rather than directly
> > > introducing a problem. There is a high standard deviation in the
> > > performance. Here are some results with hyperfine. The general trends
> > > are reproducible.
> >
> > OK. I'm still busy trying to bring a 4 socket machine up-to-date...
> > gawd I hate the boot times on those machines :/
> >
> > But yeah, I was thinking similar things, I really can't spot an obvious
> > fail in that commit.
> >
> > I'll go have a poke once the darn machine is willing to submit :-)
>
> I tried a two-socket machine, but in 50 runs the problem doesn't show up.
>
> The commit e8f331bcc270 starts with
>
> -       if (sched_feat(PLACE_LAG) && cfs_rq->nr_running > 1) {
> +       if (sched_feat(PLACE_LAG) && cfs_rq->nr_running) {
>
> This seemed like a big change - cfs_rq->nr_running > 1 should be rarely
> true in ua, while cfs_rq->nr_running should always be true. Adding back
> the > 1 and simply replacing the test by 0 both had no effect, though.

BTW, in terms of statistical reliability, one of the biggest ...
stochastic elements of scheduler balancing is wakeup-preemption - which
you can turn off via:

   echo NO_WAKEUP_PREEMPTION > /debug/sched/features

or:

   echo NO_WAKEUP_PREEMPTION > /sys/kernel/debug/sched/features

If you can measure a performance regression with WAKEUP_PREEMPTION turned
off in *both* kernels, there's likely a material change (regression) in the
quality of NUMA load-balancing.

If it goes away or changes dramatically with WAKEUP_PREEMPTION off, then
I'd pin this effect to EEVDF causing timing changes that are subtly
shifting NUMA & SMP balancing decisions past some critical threshold that
is detrimental to this particular workload.

( Obviously both are regressions we care about - but doing this test would
  help categorize the nature of the regression. )

Thanks,

	Ingo
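
[ Editorial sketch, not part of the original mail: the test proposed above
  could be scripted roughly as follows. The helper names
  set_wakeup_preemption and classify_regression and the FEATURES variable
  are hypothetical, introduced only for illustration. The sketch assumes
  the features file sits at one of the two paths quoted in the mail and
  that the script runs as root. ]

```shell
#!/bin/sh
# Assumption: debugfs is mounted at /sys/kernel/debug. Override FEATURES
# (e.g. FEATURES=/debug/sched/features) if it is mounted elsewhere.
FEATURES="${FEATURES:-/sys/kernel/debug/sched/features}"

# Hypothetical helper: turn wakeup preemption on or off by writing the
# feature name (with a NO_ prefix to disable it) to the features file.
set_wakeup_preemption() {
    case "$1" in
        on)  echo WAKEUP_PREEMPTION > "$FEATURES" ;;
        off) echo NO_WAKEUP_PREEMPTION > "$FEATURES" ;;
        *)   echo "usage: set_wakeup_preemption on|off" >&2; return 1 ;;
    esac
}

# Hypothetical helper: encode the two-way reading from the mail.
# Argument: "yes" if the regression is still measurable with
# WAKEUP_PREEMPTION off in both kernels, anything else otherwise.
classify_regression() {
    if [ "$1" = yes ]; then
        echo "likely a material change in NUMA load-balancing quality"
    else
        echo "likely EEVDF timing changes shifting NUMA/SMP balancing decisions"
    fi
}
```

[ One possible use: boot each kernel in turn, call
  "set_wakeup_preemption off", rerun the benchmark (for instance under
  hyperfine, as in the earlier measurements), and pass the outcome to
  classify_regression. ]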