Date: Wed, 11 Oct 2023 12:53:29 +0200
From: Peter Zijlstra
To: Ankit Jain
Cc: yury.norov@gmail.com, andriy.shevchenko@linux.intel.com,
    linux@rasmusvillemoes.dk, qyousef@layalina.io, pjt@google.com,
    joshdon@google.com, bristot@redhat.com, vschneid@redhat.com,
    linux-kernel@vger.kernel.org, namit@vmware.com, amakhalov@vmware.com,
    srinidhir@vmware.com, vsirnapalli@vmware.com, vbrahmajosyula@vmware.com,
    akaher@vmware.com, srivatsa@csail.mit.edu
Subject: Re: [PATCH RFC] cpumask: Randomly distribute the tasks within affinity mask
Message-ID: <20231011105329.GA17066@noisy.programming.kicks-ass.net>
In-Reply-To: <20231011071925.761590-1-ankitja@vmware.com>

On Wed, Oct 11, 2023 at 12:49:25PM +0530, Ankit Jain wrote:
> commit 46a87b3851f0 ("sched/core: Distribute tasks within affinity masks")
> and commit 14e292f8d453 ("sched,rt: Use cpumask_any*_distribute()")
> introduced the logic to distribute the tasks at initial wakeup on cpus
> where load balancing works poorly or disabled at all (isolated cpus).
>
> There are cases in which the distribution of tasks
> that are spawned on isolcpus does not happen properly.
> In production deployment, initial wakeup of tasks spawn from
> housekeeping cpus to isolcpus[nohz_full cpu] happens on first cpu
> within isolcpus range instead of distributed across isolcpus.
>
> Usage of distribute_cpu_mask_prev from one processes group,
> will clobber previous value of another or other groups and vice-versa.
>
> When housekeeping cpus spawn multiple child tasks to wakeup on
> isolcpus[nohz_full cpu], using cpusets.cpus/sched_setaffinity(),
> distribution is currently performed based on per-cpu
> distribute_cpu_mask_prev counter.
> At the same time, on housekeeping cpus there are percpu
> bounded timers interrupt/rcu threads and other system/user tasks
> would be running with affinity as housekeeping cpus. In a real-life
> environment, housekeeping cpus are much fewer and are too much loaded.
> So, distribute_cpu_mask_prev value from these tasks impacts
> the offset value for the tasks spawning to wakeup on isolcpus and
> thus most of the tasks end up waking up on first cpu within the
> isolcpus set.
>
> Steps to reproduce:
> Kernel cmdline parameters:
> isolcpus=2-5 skew_tick=1 nohz=on nohz_full=2-5
> rcu_nocbs=2-5 rcu_nocb_poll idle=poll irqaffinity=0-1
>
> * pid=$(echo $$)
> * taskset -pc 0 $pid
> * cat loop-normal.c
> int main(void)
> {
>         while (1)
>                 ;
>         return 0;
> }
> * gcc -o loop-normal loop-normal.c
> * for i in {1..50}; do ./loop-normal & done
> * pids=$(ps -a | grep loop-normal | cut -d' ' -f5)
> * for i in $pids; do taskset -pc 2-5 $i ; done
>
> Expected output:
> * All 50 “loop-normal” tasks should wake up on cpu2-5
> equally distributed.
> * ps -eLo cpuid,pid,tid,ppid,cls,psr,cls,cmd | grep "^ [2345]"
>
> Actual output:
> * All 50 “loop-normal” tasks got woken up on cpu2 only

Your expectation is wrong. Things work as advertised.

Also, isolcpus is crap.
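
For readers following the quoted description, the cursor interaction it
describes can be sketched in user space. This is a simplified model and not
the kernel's code: pick_distribute(), prev_cpu, NR_CPUS_MODEL and the 8-CPU
bitmask layout are hypothetical names chosen for illustration, with prev_cpu
standing in for one calling CPU's distribute_cpu_mask_prev value.

```c
#define NR_CPUS_MODEL 8 /* hypothetical machine size for this model */

/*
 * Hypothetical stand-in for the distribute_cpu_mask_prev value of the one
 * CPU that issues every pick. All picks share it, whatever mask they pass.
 */
int prev_cpu;

/*
 * Return the first CPU set in 'mask' strictly after prev_cpu, wrapping
 * around, and remember it as the new cursor. Returns -1 for an empty mask.
 */
int pick_distribute(unsigned int mask)
{
	for (int i = 1; i <= NR_CPUS_MODEL; i++) {
		int cpu = (prev_cpu + i) % NR_CPUS_MODEL;

		if (mask & (1u << cpu)) {
			prev_cpu = cpu;
			return cpu;
		}
	}
	return -1;
}
```

In this model, with prev_cpu starting at 0, four consecutive picks from the
mask 0x3c (cpus 2-5) return 2, 3, 4, 5. If each of those picks is preceded
by a pick from 0x03 (cpus 0-1), the shared cursor is pulled back into the
0-1 range every time and all four picks from 0x3c return 2.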