Date: Fri, 13 Mar 2020 17:30:04 -0700
From: "Paul E.
 McKenney"
To: "Joel Fernandes (Google)"
Cc: linux-kernel@vger.kernel.org, vpillai, Aaron Lu, Aubrey Li,
 peterz@infradead.org, Ben Segall, Dietmar Eggemann, Ingo Molnar,
 Juri Lelli, Mel Gorman, Steven Rostedt, Vincent Guittot
Subject: Re: [PATCH] sched: Use RCU-sched in core-scheduling balancing logic
Message-ID: <20200314003004.GI3199@paulmck-ThinkPad-P72>
Reply-To: paulmck@kernel.org
References: <20200313232918.62303-1-joel@joelfernandes.org>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <20200313232918.62303-1-joel@joelfernandes.org>
User-Agent: Mutt/1.9.4 (2018-02-28)
Sender: linux-kernel-owner@vger.kernel.org
Precedence: bulk
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

On Fri, Mar 13, 2020 at 07:29:18PM -0400, Joel Fernandes (Google) wrote:
> rcu_read_unlock() can incur an infrequent deadlock in
> sched_core_balance(). Fix this by using the RCU-sched flavor instead.
>
> This fixes the following spinlock recursion observed when testing the
> core scheduling patches on PREEMPT=y kernel on ChromeOS:
>
> [ 3.240891] BUG: spinlock recursion on CPU#2, swapper/2/0
> [ 3.240900] lock: 0xffff9cd1eeb28e40, .magic: dead4ead, .owner: swapper/2/0, .owner_cpu: 2
> [ 3.240905] CPU: 2 PID: 0 Comm: swapper/2 Not tainted 5.4.22htcore #4
> [ 3.240908] Hardware name: Google Eve/Eve, BIOS Google_Eve.9584.174.0 05/29/2018
> [ 3.240910] Call Trace:
> [ 3.240919] dump_stack+0x97/0xdb
> [ 3.240924] ? spin_bug+0xa4/0xb1
> [ 3.240927] do_raw_spin_lock+0x79/0x98
> [ 3.240931] try_to_wake_up+0x367/0x61b
> [ 3.240935] rcu_read_unlock_special+0xde/0x169
> [ 3.240938] ? sched_core_balance+0xd9/0x11e
> [ 3.240941] __rcu_read_unlock+0x48/0x4a
> [ 3.240945] __balance_callback+0x50/0xa1
> [ 3.240949] __schedule+0x55a/0x61e
> [ 3.240952] schedule_idle+0x21/0x2d
> [ 3.240956] do_idle+0x1d5/0x1f8
> [ 3.240960] cpu_startup_entry+0x1d/0x1f
> [ 3.240964] start_secondary+0x159/0x174
> [ 3.240967] secondary_startup_64+0xa4/0xb0
> [ 14.998590] watchdog: BUG: soft lockup - CPU#0 stuck for 11s! [kworker/0:10:965]
>
> Cc: vpillai
> Cc: Aaron Lu
> Cc: Aubrey Li
> Cc: peterz@infradead.org
> Cc: paulmck@kernel.org
> Signed-off-by: Joel Fernandes (Google)

The original could indeed deadlock, and this would avoid that deadlock.
(The commit to solve this deadlock is sadly not yet in mainline.)

Acked-by: Paul E. McKenney

> ---
>  kernel/sched/core.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/kernel/sched/core.c b/kernel/sched/core.c
> index 3045bd50e249..037e8f2e2686 100644
> --- a/kernel/sched/core.c
> +++ b/kernel/sched/core.c
> @@ -4735,7 +4735,7 @@ static void sched_core_balance(struct rq *rq)
>  	struct sched_domain *sd;
>  	int cpu = cpu_of(rq);
>
> -	rcu_read_lock();
> +	rcu_read_lock_sched();
>  	raw_spin_unlock_irq(rq_lockp(rq));
>  	for_each_domain(cpu, sd) {
>  		if (!(sd->flags & SD_LOAD_BALANCE))
> @@ -4748,7 +4748,7 @@ static void sched_core_balance(struct rq *rq)
>  		break;
>  	}
>  	raw_spin_lock_irq(rq_lockp(rq));
> -	rcu_read_unlock();
> +	rcu_read_unlock_sched();
>  }
>
>  static DEFINE_PER_CPU(struct callback_head, core_balance_head);
> --
> 2.25.1.481.gfbce0eb801-goog
>
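
For readers following the trace, the lock ordering can be modelled in
user space. This is only a sketch under stated assumptions, not kernel
code: every toy_* name is hypothetical, the lock is a single-threaded
stand-in for the run-queue lock returned by rq_lockp(), and the "reader
exit needs a wakeup" step is a simplification of the
rcu_read_unlock_special() to try_to_wake_up() path in the quoted trace.

```c
#include <assert.h>
#include <stdbool.h>

#define NO_OWNER (-1)

/*
 * Hypothetical stand-in for a debug spinlock: it records its owner so
 * that a second acquisition by the same CPU is reported, not spun on.
 * Single-threaded model; contention from other CPUs is not modelled.
 */
struct toy_lock {
	int owner;	/* CPU number, or NO_OWNER when unlocked */
};

/* Returns false on recursion, the case the quoted log reports as a BUG. */
static bool toy_lock_acquire(struct toy_lock *l, int cpu)
{
	if (l->owner == cpu)
		return false;
	assert(l->owner == NO_OWNER);
	l->owner = cpu;
	return true;
}

static void toy_lock_release(struct toy_lock *l)
{
	assert(l->owner != NO_OWNER);
	l->owner = NO_OWNER;
}

/*
 * Assumed simplification: a reader exit that needs special handling
 * issues a wakeup, and the wakeup takes the run-queue lock.
 */
static bool toy_reader_exit_with_wakeup(struct toy_lock *rq_lock, int cpu)
{
	if (!toy_lock_acquire(rq_lock, cpu))
		return false;
	toy_lock_release(rq_lock);
	return true;
}

/*
 * Ordering before the patch. Entered and left with rq_lock held by cpu.
 * Returns true if no recursion was detected.
 */
static bool toy_balance_original(struct toy_lock *rq_lock, int cpu,
				 bool reader_was_preempted)
{
	toy_lock_release(rq_lock);		/* raw_spin_unlock_irq() */
	/* ... domain walk; the reader may be preempted here ... */
	if (!toy_lock_acquire(rq_lock, cpu))	/* raw_spin_lock_irq() */
		return false;
	if (reader_was_preempted)		/* rcu_read_unlock() slow path */
		return toy_reader_exit_with_wakeup(rq_lock, cpu);
	return true;
}

/*
 * Ordering after the patch. The RCU-sched reader runs with preemption
 * disabled, so this model has no slow path that takes rq_lock on exit.
 */
static bool toy_balance_patched(struct toy_lock *rq_lock, int cpu)
{
	toy_lock_release(rq_lock);		/* raw_spin_unlock_irq() */
	if (!toy_lock_acquire(rq_lock, cpu))	/* raw_spin_lock_irq() */
		return false;
	/* rcu_read_unlock_sched(): nothing here takes rq_lock */
	return true;
}
```

In this model toy_balance_original() reports recursion only when
reader_was_preempted is true, which is consistent with the deadlock
being infrequent. toy_balance_patched() has no such path.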