From: Valentin Schneider
To: Peter Zijlstra
Cc: Oleksandr Natalenko, linux-rt-users@vger.kernel.org, linux-kernel@vger.kernel.org, tglx@linutronix.de, rostedt@goodmis.org
Subject: Re: WARNING at kernel/sched/core.c:2013 migration_cpu_stop+0x2e3/0x330
Date: Tue, 17 Nov 2020 11:29:18 +0000
In-reply-to: <20201117110620.GG3121378@hirez.programming.kicks-ass.net>
References: <20201117110620.GG3121378@hirez.programming.kicks-ass.net>
X-Mailing-List: linux-kernel@vger.kernel.org

On 17/11/20 11:06, Peter Zijlstra wrote:
> On Mon, Nov 16, 2020 at 10:00:14AM +0000, Valentin Schneider wrote:
>>
>> On 15/11/20 22:32, Oleksandr Natalenko wrote:
>> > Hi.
>> >
>> > I'm running v5.10-rc3-rt7 for some time, and I came across this splat in
>> > dmesg:
>> >
>> > ```
>> > [118769.951010] ------------[ cut here ]------------
>> > [118769.951013] WARNING: CPU: 19 PID: 146 at kernel/sched/core.c:2013
>>
>> Err, I didn't pick up on this back then, but isn't that check bogus? If the
>> task is enqueued elsewhere, it's valid for it not to be affined
>> 'here'. Also that is_migration_disabled() check within is_cpu_allowed()
>> makes me think this isn't the best thing to call on a remote task.
>>
>> ---
>> diff --git a/kernel/sched/core.c b/kernel/sched/core.c
>> index 1218f3ce1713..47d5b677585f 100644
>> --- a/kernel/sched/core.c
>> +++ b/kernel/sched/core.c
>> @@ -2010,7 +2010,7 @@ static int migration_cpu_stop(void *data)
>>  			 * valid again. Nothing to do.
>>  			 */
>>  			if (!pending) {
>> -				WARN_ON_ONCE(!is_cpu_allowed(p, cpu_of(rq)));
>> +				WARN_ON_ONCE(!cpumask_test_cpu(task_cpu(p), p->cpus_ptr));
>
> Ho humm.. bit of a mess that. I'm trying to figure out if we need that
> is_per_cpu_kthread() test here or not.
>
> I suppose not, what we want here is to ensure the CPU is in cpus_mask
> and not care about the whole hotplug mess.
>

That was my thought as well. On top of that, is_cpu_allowed(p) does a
p->migration_disabled read, which isn't so great in the remote case.

> Would it makes sense to replace both instances in migration_cpu_stop()
> with:
>
>   WARN_ON_ONCE(!cpumask_test_cpu(task_cpu(p), p->cpus_mask));
>
> ?

I guess so; I was trying to see if we could factorize this, but stopped
mid-swing as I'm really wary of shuffling too much of this code (even with
the help of TLA+; well, maybe *because* of it).
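
To make the difference between the two checks concrete, here is a rough
userspace sketch. It is a toy model, not kernel code: struct toy_task,
toy_mask_test_cpu(), check_stopper_cpu(), check_task_cpu() and toy_demo() are
all made-up names, a single 64-bit word stands in for struct cpumask, and the
migration_disabled / per-CPU kthread / hotplug parts of is_cpu_allowed() are
deliberately left out. It only illustrates that testing the stopper's CPU
against the task's mask can fail for a task that is legitimately enqueued
elsewhere, whereas testing the task's own CPU does not.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy stand-in for task_struct: only the fields this sketch needs. */
struct toy_task {
	uint64_t cpus_mask;	/* allowed CPUs, bit n set means CPU n is allowed */
	int cpu;		/* CPU the task is enqueued on, in the spirit of task_cpu(p) */
};

/* Hypothetical helper in the spirit of cpumask_test_cpu(), limited to 64 CPUs. */
bool toy_mask_test_cpu(int cpu, uint64_t mask)
{
	return cpu >= 0 && cpu < 64 && ((mask >> cpu) & 1u);
}

/* Old-style question: is the CPU running the stopper in the task's mask? */
bool check_stopper_cpu(const struct toy_task *p, int stopper_cpu)
{
	return toy_mask_test_cpu(stopper_cpu, p->cpus_mask);
}

/* Proposed-style question: is the CPU the task sits on in its own mask? */
bool check_task_cpu(const struct toy_task *p)
{
	return toy_mask_test_cpu(p->cpu, p->cpus_mask);
}

/* Task affined to CPU 2 only and enqueued there; the stopper runs on CPU 5. */
void toy_demo(void)
{
	struct toy_task p = { .cpus_mask = UINT64_C(1) << 2, .cpu = 2 };

	assert(!check_stopper_cpu(&p, 5));	/* old-style check would warn here */
	assert(check_task_cpu(&p));		/* proposed check stays quiet */
}
```

In this model the task is exactly where its affinity says it should be, so
only the second check reflects the state we actually care about.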