Subject: Re: [PATCH] sched: Use RCU-sched in core-scheduling balancing logic
To: paulmck@kernel.org, "Joel Fernandes (Google)"
Cc: linux-kernel@vger.kernel.org, vpillai, Aaron Lu, Aubrey Li, peterz@infradead.org, Ben Segall, Dietmar Eggemann, Ingo Molnar, Juri Lelli, Mel Gorman, Steven Rostedt, Vincent Guittot
References: <20200313232918.62303-1-joel@joelfernandes.org>
 <20200314003004.GI3199@paulmck-ThinkPad-P72>
From: "Li, Aubrey"
Date: Mon, 23 Mar 2020 14:58:18 +0800
In-Reply-To: <20200314003004.GI3199@paulmck-ThinkPad-P72>
X-Mailing-List: linux-kernel@vger.kernel.org

On 2020/3/14 8:30, Paul E. McKenney wrote:
> On Fri, Mar 13, 2020 at 07:29:18PM -0400, Joel Fernandes (Google) wrote:
>> rcu_read_unlock() can incur an infrequent deadlock in
>> sched_core_balance(). Fix this by using the RCU-sched flavor instead.
>>
>> This fixes the following spinlock recursion observed when testing the
>> core scheduling patches on PREEMPT=y kernel on ChromeOS:
>>
>> [ 14.998590] watchdog: BUG: soft lockup - CPU#0 stuck for 11s! [kworker/0:10:965]
>>
>
> The original could indeed deadlock, and this would avoid that deadlock.
> (The commit to solve this deadlock is sadly not yet in mainline.)
>
> Acked-by: Paul E. McKenney

I saw this in dmesg with this patch, is it expected?

Thanks,
-Aubrey

[ 117.000905] =============================
[ 117.000907] WARNING: suspicious RCU usage
[ 117.000911] 5.5.7+ #160 Not tainted
[ 117.000913] -----------------------------
[ 117.000916] kernel/sched/core.c:4747 suspicious rcu_dereference_check() usage!
[ 117.000918] other info that might help us debug this:
[ 117.000921] rcu_scheduler_active = 2, debug_locks = 1
[ 117.000923] 1 lock held by swapper/52/0:
[ 117.000925] #0: ffffffff82670960 (rcu_read_lock_sched){....}, at: sched_core_balance+0x5/0x700
[ 117.000937] stack backtrace:
[ 117.000940] CPU: 52 PID: 0 Comm: swapper/52 Kdump: loaded Not tainted 5.5.7+ #160
[ 117.000943] Hardware name: Intel Corporation S2600WFD/S2600WFD, BIOS SE5C620.86B.01.00.0412.020920172159 02/09/2017
[ 117.000945] Call Trace:
[ 117.000955] dump_stack+0x86/0xcb
[ 117.000962] sched_core_balance+0x634/0x700
[ 117.000982] __balance_callback+0x49/0xa0
[ 117.000990] __schedule+0x1416/0x1620
[ 117.001000] ? lockdep_hardirqs_off+0xa0/0xe0
[ 117.001005] ? _raw_spin_unlock_irqrestore+0x41/0x70
[ 117.001024] schedule_idle+0x28/0x40
[ 117.001030] do_idle+0x17e/0x2a0
[ 117.001041] cpu_startup_entry+0x19/0x20
[ 117.001048] start_secondary+0x16c/0x1c0
[ 117.001055] secondary_startup_64+0xa4/0xb0

>
>> ---
>>  kernel/sched/core.c | 4 ++--
>>  1 file changed, 2 insertions(+), 2 deletions(-)
>>
>> diff --git a/kernel/sched/core.c b/kernel/sched/core.c
>> index 3045bd50e249..037e8f2e2686 100644
>> --- a/kernel/sched/core.c
>> +++ b/kernel/sched/core.c
>> @@ -4735,7 +4735,7 @@ static void sched_core_balance(struct rq *rq)
>>  	struct sched_domain *sd;
>>  	int cpu = cpu_of(rq);
>>
>> -	rcu_read_lock();
>> +	rcu_read_lock_sched();
>>  	raw_spin_unlock_irq(rq_lockp(rq));
>>  	for_each_domain(cpu, sd) {
>>  		if (!(sd->flags & SD_LOAD_BALANCE))
>> @@ -4748,7 +4748,7 @@ static void sched_core_balance(struct rq *rq)
>>  			break;
>>  	}
>>  	raw_spin_lock_irq(rq_lockp(rq));
>> -	rcu_read_unlock();
>> +	rcu_read_unlock_sched();
>>  }
>>
>>  static DEFINE_PER_CPU(struct callback_head, core_balance_head);
>> --
>> 2.25.1.481.gfbce0eb801-goog
>>
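
As a reading aid for the trace above, here is a toy userspace model of one possible explanation for the warning. It is not kernel code, and every toy_* name is hypothetical. It assumes that in this tree for_each_domain() dereferences through rcu_dereference_check(), whose lockdep condition is "c || rcu_read_lock_held()", and that rcu_read_lock_sched() is tracked in a separate lockdep map from rcu_read_lock(). Under those assumptions, holding only the sched flavor (as the trace reports) would not satisfy the plain check.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-ins for lockdep's per-flavor tracking (hypothetical, not kernel code). */
struct toy_lockdep {
	int rcu_depth;       /* models nesting of rcu_read_lock() */
	int rcu_sched_depth; /* models nesting of rcu_read_lock_sched() */
};

static inline void toy_rcu_read_lock(struct toy_lockdep *ld)         { ld->rcu_depth++; }
static inline void toy_rcu_read_unlock(struct toy_lockdep *ld)       { ld->rcu_depth--; }
static inline void toy_rcu_read_lock_sched(struct toy_lockdep *ld)   { ld->rcu_sched_depth++; }
static inline void toy_rcu_read_unlock_sched(struct toy_lockdep *ld) { ld->rcu_sched_depth--; }

/* Models the condition of rcu_dereference_check(p, c): c || rcu_read_lock_held(). */
static inline bool toy_deref_check_ok(const struct toy_lockdep *ld, bool c)
{
	return c || ld->rcu_depth > 0;
}

/*
 * Models the condition of rcu_dereference_sched_check(p, c):
 * c || rcu_read_lock_sched_held(). Simplified: the real helper also
 * accounts for preemption being disabled, which this toy ignores.
 */
static inline bool toy_deref_sched_check_ok(const struct toy_lockdep *ld, bool c)
{
	return c || ld->rcu_sched_depth > 0;
}

/* Replays the locking of the patched sched_core_balance() against both checks. */
static inline void toy_replay_patched_path(void)
{
	struct toy_lockdep ld = { 0, 0 };

	toy_rcu_read_lock_sched(&ld);               /* what the patch takes */
	assert(!toy_deref_check_ok(&ld, false));    /* plain check would complain */
	assert(toy_deref_sched_check_ok(&ld, false)); /* sched-aware check would pass */
	toy_rcu_read_unlock_sched(&ld);
}
```

If this model matches the real tree, the splat would be a lockdep annotation mismatch rather than a new locking bug, but that needs confirming against the actual for_each_domain() definition in this kernel.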