Date: Mon, 25 Sep 2023 10:27:36 +0200
From: Ingo Molnar
To: Sebastian Andrzej Siewior
Cc: Valentin Schneider, linux-kernel@vger.kernel.org, Daniel Bristot de Oliveira, Dietmar Eggemann, Ingo Molnar, Juri Lelli, Peter Zijlstra, Steven Rostedt, Vincent Guittot, Thomas Gleixner
Subject: Re: [PATCH] sched/rt: Make rt_rq->pushable_tasks updates drive rto_mask
References: <20230811112044.3302588-1-vschneid@redhat.com> <20230815142121.MoZplZUr@linutronix.de>
In-Reply-To: <20230815142121.MoZplZUr@linutronix.de>

* Sebastian Andrzej Siewior wrote:

> On 2023-08-11 12:20:44 [+0100], Valentin Schneider wrote:
> > Sebastian noted that the rto_push_work IRQ work can be queued for a CPU
> > that has an empty pushable_tasks list, which means nothing useful will be
> > done in the IPI other than queue the work for the next CPU on the rto_mask.
> >
> > rto_push_irq_work_func() only operates on tasks in the pushable_tasks list,
> > but the conditions for that irq_work to be queued (and for a CPU to be
> > added to the rto_mask) rely on rq_rt->nr_migratory instead.
> >
> > nr_migratory is increased whenever an RT task entity is enqueued and it has
> > nr_cpus_allowed > 1. Unlike the pushable_tasks list, nr_migratory includes a
> > rt_rq's current task. This means a rt_rq can have a migratible current, N
> > non-migratible queued tasks, and be flagged as overloaded / have its CPU
> > set in the rto_mask, despite having an empty pushable_tasks list.
> >
> > Make an rt_rq's overload logic be driven by {enqueue,dequeue}_pushable_task().
> > Since rt_rq->{rt_nr_migratory,rt_nr_total} become unused, remove them.
> >
> > Note that the case where the current task is pushed away to make way for a
> > migration-disabled task remains unchanged: the migration-disabled task has
> > to be in the pushable_tasks list in the first place, which means it has
> > nr_cpus_allowed > 1.
> >
> > Link: http://lore.kernel.org/r/20230801152648._y603AS_@linutronix.de
> > Reported-by: Sebastian Andrzej Siewior
> > Signed-off-by: Valentin Schneider
> > ---
> > This is lightly tested, this looks to be working OK but I don't have nor am
> > I aware of a test case for RT balancing, I suppose we want something that
> > asserts we always run the N highest prio tasks for N CPUs, with a small
> > margin for migrations?
>
> I don't see the storm of IPIs I saw before. So as far that goes:
>    Tested-by: Sebastian Andrzej Siewior

I've applied Valentin's initial fix to tip:sched/core, for an eventual
v6.7 merge, as it addresses the IPI storm bug. Let me know if merging this
is not desirable for some reason.

> What I still observe is:
> - CPU0 is idle. CPU0 gets a task assigned from CPU1. That task receives
>   a wakeup. CPU0 returns from idle and schedules the task.
>   pull_rt_task() on CPU1 and sometimes on other CPU observe this, too.
>   CPU1 sends irq_work to CPU0 while at the time rto_next_cpu() sees that
>   has_pushable_tasks() return 0. That bit was cleared earlier (as per
>   tracing).
>
> - CPU0 is idle. CPU0 gets a task assigned from CPU1. The task on CPU0 is
>   woken up without an IPI (yay). But then pull_rt_task() decides that
>   send irq_work and has_pushable_tasks() said that is has tasks left
>   so….
>   Now: rto_push_irq_work_func() run once once on CPU0, does nothing,
>   rto_next_cpu() return CPU0 again and enqueues itself again on CPU0.
>   Usually after the second or third round the scheduler on CPU0 makes
>   enough progress to remove the task/ clear the CPU from mask.

Just curious, any progress on solving this?

Thanks,

	Ingo
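
For readers following the quoted changelog, the mismatch between a counter-based overload check and the pushable_tasks list can be illustrated with a toy model. This is only a sketch in Python, not kernel code: `Task`, `ToyRtRq` and both `overloaded_by_*` helpers are hypothetical names, and the `nr_total() > 1` term in the counter-based check is an assumption about how the counters were combined.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Task:
    name: str
    nr_cpus_allowed: int  # 1 means the task is pinned to a single CPU


@dataclass
class ToyRtRq:
    """Toy stand-in for an rt_rq; not the kernel data structure."""
    current: Optional[Task] = None
    queued: List[Task] = field(default_factory=list)

    def nr_total(self) -> int:
        # Counts every RT task on the runqueue, including the running one.
        return len(self.queued) + (1 if self.current else 0)

    def nr_migratory(self) -> int:
        # Counts tasks with nr_cpus_allowed > 1, including the running one.
        tasks = self.queued + ([self.current] if self.current else [])
        return sum(1 for t in tasks if t.nr_cpus_allowed > 1)

    def pushable_tasks(self) -> List[Task]:
        # Only queued (not running) tasks that may run elsewhere are pushable.
        return [t for t in self.queued if t.nr_cpus_allowed > 1]

    def overloaded_by_counters(self) -> bool:
        # Assumed shape of the counter-based check the changelog describes.
        return self.nr_migratory() > 0 and self.nr_total() > 1

    def overloaded_by_pushable(self) -> bool:
        # Check driven by the pushable list, as the patch proposes.
        return len(self.pushable_tasks()) > 0


# A migratable current task plus two pinned queued tasks: the counters flag
# the runqueue as overloaded although there is nothing to push.
rq = ToyRtRq(current=Task("cur", 4), queued=[Task("a", 1), Task("b", 1)])
print(rq.overloaded_by_counters(), rq.overloaded_by_pushable())
```

In this model the counter-based check reports an overload while the pushable list is empty, which is the situation in which an IPI would find nothing to push.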