Date: Fri, 19 Jan 2018 10:03:53 -0500
From: Steven Rostedt
To: Pavan Kondeti
Cc: williams@redhat.com, Ingo Molnar, LKML, Peter Zijlstra, Thomas Gleixner, bristot@redhat.com, jkacur@redhat.com, efault@gmx.de, hpa@zytor.com, torvalds@linux-foundation.org, swood@redhat.com, linux-tip-commits@vger.kernel.org
Subject: Re: [tip:sched/core] sched/rt: Simplify the IPI based RT balancing logic
Message-ID: <20180119100353.7f9f5154@gandalf.local.home>
References: <20170424114732.1aac6dc4@gandalf.local.home>

On Fri, 19 Jan 2018 14:53:05 +0530
Pavan Kondeti wrote:

> I am seeing "spinlock already unlocked" BUG for rd->rto_lock on a 4.9
> stable kernel based system. This issue is observed only after
> inclusion of this patch. It appears to me that rq->rd can change
> between spinlock is acquired and released in rto_push_irq_work_func()
> IRQ work if hotplug is in progress. It was only reported couple of
> times during long stress testing. The issue can be easily reproduced
> if an artificial delay is introduced between lock and unlock of
> rto_lock. The rq->rd is changed under rq->lock, so we can protect this
> race with rq->lock. The below patch solved the problem. we are taking
> rq->lock in pull_rt_task()->tell_cpu_to_push(), so I extended the same
> here. Please let me know your thoughts on this.

Ah, so rq->rd can change. Interesting.

>
> diff --git a/kernel/sched/rt.c b/kernel/sched/rt.c
> index d863d39..478192b 100644
> --- a/kernel/sched/rt.c
> +++ b/kernel/sched/rt.c
> @@ -2284,6 +2284,7 @@ void rto_push_irq_work_func(struct irq_work *work)
>  		raw_spin_unlock(&rq->lock);
>  	}
>
> +	raw_spin_lock(&rq->lock);

What about just saving the rd then?

	struct root_domain *rd;

	rd = READ_ONCE(rq->rd);

then use that. Then we don't need to worry about it changing.

-- Steve

>  	raw_spin_lock(&rq->rd->rto_lock);
>
>  	/* Pass the IPI to the next rt overloaded queue */
> @@ -2291,11 +2292,10 @@ void rto_push_irq_work_func(struct irq_work *work)
>
>  	raw_spin_unlock(&rq->rd->rto_lock);
>
> -	if (cpu < 0)
> -		return;
> -
>  	/* Try the next RT overloaded CPU */
> -	irq_work_queue_on(&rq->rd->rto_push_work, cpu);
> +	if (cpu >= 0)
> +		irq_work_queue_on(&rq->rd->rto_push_work, cpu);
> +	raw_spin_unlock(&rq->lock);
>  }
>  #endif /* HAVE_RT_PUSH_IPI */
>
>
> Thanks,
> Pavan
>
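
To make the "save the rd" idea concrete, here is a rough userspace model of the lock/unlock mismatch. It is not kernel code and not the final patch. The toy lock, `toy_lock_acquire()`, `toy_lock_release()`, `push_work_reread()`, `push_work_snapshot()` and the `new_rd` parameter are all hypothetical names made up for illustration. A C11 `atomic_load()` stands in for `READ_ONCE()`, and the "hotplug" is simulated by re-pointing `rq->rd` while the lock is held. It only models which lock gets released; it says nothing about the lifetime of the root domain.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Toy lock (hypothetical): tracks whether it is held so that an
 * unbalanced unlock can be detected instead of silently ignored. */
struct toy_lock {
	bool held;
};

static void toy_lock_acquire(struct toy_lock *l)
{
	assert(!l->held);	/* this single-threaded model never nests */
	l->held = true;
}

/* Returns false for the analogue of "spinlock already unlocked". */
static bool toy_lock_release(struct toy_lock *l)
{
	if (!l->held)
		return false;
	l->held = false;
	return true;
}

struct root_domain {
	struct toy_lock rto_lock;
};

struct rq {
	_Atomic(struct root_domain *) rd;	/* can be re-pointed */
};

/* Dereferences rq->rd separately for lock and unlock, as in the
 * code being discussed. new_rd simulates hotplug changing rq->rd. */
static bool push_work_reread(struct rq *rq, struct root_domain *new_rd)
{
	struct root_domain *locked = atomic_load(&rq->rd);

	toy_lock_acquire(&locked->rto_lock);
	atomic_store(&rq->rd, new_rd);		/* simulated hotplug */

	/* Second read may see a different root domain. */
	struct root_domain *unlocked = atomic_load(&rq->rd);

	return toy_lock_release(&unlocked->rto_lock);
}

/* Reads rq->rd once and uses only that snapshot afterwards. */
static bool push_work_snapshot(struct rq *rq, struct root_domain *new_rd)
{
	struct root_domain *rd = atomic_load(&rq->rd);

	toy_lock_acquire(&rd->rto_lock);
	atomic_store(&rq->rd, new_rd);		/* simulated hotplug */

	/* Same pointer as above, so lock and unlock stay paired. */
	return toy_lock_release(&rd->rto_lock);
}
```

In this model the re-reading variant releases the lock of the new root domain, which was never taken, and leaves the old one held. The snapshot variant always releases the lock it acquired, whatever happens to `rq->rd` in between.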