From: Schspa Shi
Date: Fri, 1 Jul 2022 20:18:10 +0800
Subject: Re: [PATCH v2] sched/rt: fix bad task migration for rt tasks
To: Valentin Schneider
Cc: mingo@redhat.com, peterz@infradead.org, juri.lelli@redhat.com,
    vincent.guittot@linaro.org, dietmar.eggemann@arm.com,
    rostedt@goodmis.org, Benjamin Segall, mgorman@suse.de,
    bristot@redhat.com, Linux Kernel Mailing List
References: <20220627154051.92599-1-schspa@gmail.com>
X-Mailing-List: linux-kernel@vger.kernel.org

Valentin Schneider writes:

> On 27/06/22 23:40, Schspa Shi wrote:
>> @@ -2115,6 +2115,15 @@ static int push_rt_task(struct rq *rq, bool pull)
>>  	if (WARN_ON(next_task == rq->curr))
>>  		return 0;
>>
>> +	/*
>> +	 * It is possible the task has running for a while, we need to check
>> +	 * task migration disable flag again. If task migration is disabled,
>> +	 * the retry code will retry to push the current running task on this
>> +	 * CPU away.
>> +	 */
>> +	if (unlikely(is_migration_disabled(next_task)))
>> +		goto retry;
>> +
>
> Can we ever hit this? The previous is_migration_disabled() check is in the
> same rq->lock segment.

Ahh, I'm sorry, I added this check in the wrong place. It should be in
front of deactivate_task(rq, next_task, 0);

Sorry for this mistake.

>
> AFAIA this doesn't fix the problem v1 was fixing, which is next_task can
> become migrate_disable() after push_rt_task() goes through
> find_lock_lowest_rq().
>

Something like the following should fix it.

		put_task_struct(next_task);
		next_task = task;
		goto retry;
	}

	if (unlikely(is_migration_disabled(next_task))) {
		put_task_struct(next_task);
		goto retry;
	}

	deactivate_task(rq, next_task, 0);

> For the task to still be in the pushable_tasks list after having made
> itself migration disabled, it must no longer be current, which means we
> enqueued a higher priority RT task, in which case we went through
> set_next_task_rt() so we did rt_queue_push_tasks().

The current task may not have a higher priority; a task of the same
priority may have preempted the migration-disabled task. In that case,
we still have the opportunity to let the migration-disabled task run
sooner by migrating the higher priority task to another CPU. This is
what commits 95158a89dd50 ("sched,rt: Use the full cpumask for
balancing") and 1beec5b55060 ("sched: Fix migrate_disable() vs rt/dl
balancing") do.

Considering this, the v1 patch is not the best solution, so I sent this
v2 patch (although the new check is misplaced in it).

Or can we ignore this small possibility?

>
> So I think what you had in v1 was actually what we needed.
>

Yes, v1 is the patch I have tested for a week; v2 has not been tested
for that long.

>> 	/* We might release rq lock */
>> 	get_task_struct(next_task);
>>
>> --
>> 2.24.3 (Apple Git-128)

-- 
Schspa Shi
BRs
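
For readers following the thread, the race under discussion (a flag that
is checked, then changes while the rq lock may be dropped, then is acted
on) can be modelled in userspace C. This is only a sketch under stated
assumptions: struct task, find_target_cpu(), push_task(), the race_hook
callback and disable_migration() are hypothetical stand-ins made up for
illustration, not kernel code, and the "lock drop" is simulated by
calling the hook.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace stand-in; this is not the kernel's task_struct. */
struct task {
	bool migration_disabled;
	int cpu;
};

/* Called inside the simulated lock-drop window to model a concurrent change. */
typedef void (*race_hook)(struct task *t);

/*
 * Models a helper that may drop and retake the run-queue lock, as
 * find_lock_lowest_rq() can. Anything checked before this call may be
 * stale by the time it returns.
 */
static int find_target_cpu(struct task *t, race_hook hook)
{
	if (hook)
		hook(t);	/* task state may change during the window */
	return 1;		/* pretend CPU 1 is the lowest-priority CPU */
}

/* Returns true if the task was moved, false if the push was abandoned. */
static bool push_task(struct task *t, race_hook hook, bool recheck)
{
	int target;

	assert(t != NULL);

	if (t->migration_disabled)	/* first check, before the window */
		return false;

	target = find_target_cpu(t, hook);

	/* Second check, after the window and just before the task is moved. */
	if (recheck && t->migration_disabled)
		return false;

	t->cpu = target;	/* stands in for deactivate/set_task_cpu/activate */
	return true;
}

/* Hook that models the task calling migrate_disable() during the window. */
static void disable_migration(struct task *t)
{
	t->migration_disabled = true;
}
```

With recheck set to false, push_task(&t, disable_migration, false) moves
a task that has become migration disabled, which is the bad migration
described in this thread. With recheck set to true, the push is
abandoned and t.cpu is left unchanged.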