From: Joel Fernandes
Date: Thu, 27 Jun 2019 12:47:24 -0400
Subject: Re: [RFC] Deadlock via recursive wakeup via RCU with threadirqs
To: "Paul E. McKenney"
Cc: Steven Rostedt, Sebastian Andrzej Siewior, rcu, LKML, Thomas Gleixner, Ingo Molnar, Peter Zijlstra, Josh Triplett, Mathieu Desnoyers, Lai Jiangshan
In-Reply-To: <20190627155506.GU26519@linux.ibm.com>
References: <20190626135447.y24mvfuid5fifwjc@linutronix.de> <20190626162558.GY26519@linux.ibm.com> <20190627142436.GD215968@google.com> <20190627103455.01014276@gandalf.local.home> <20190627153031.GA249127@google.com> <20190627155506.GU26519@linux.ibm.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On Thu, Jun 27, 2019 at 11:55 AM Paul E. McKenney wrote:
>
> On Thu, Jun 27, 2019 at 11:30:31AM -0400, Joel Fernandes wrote:
> > On Thu, Jun 27, 2019 at 10:34:55AM -0400, Steven Rostedt wrote:
> > > On Thu, 27 Jun 2019 10:24:36 -0400
> > > Joel Fernandes wrote:
> > >
> > > > > What am I missing here?
> > > >
> > > > This issue I think is
> > > >
> > > > (in normal process context)
> > > > spin_lock_irqsave(rq_lock); // which disables both preemption and interrupt
> > > >                             // but this was done in normal process context,
> > > >                             // not from IRQ handler
> > > > rcu_read_lock();
> > > >           <---------- IPI comes in and sets exp_hint
> > >
> > > How would an IPI come in here with interrupts disabled?
> > >
> > > -- Steve
> >
> > This is true, could it be rcu_read_unlock_special() got called for some
> > *other* reason other than the IPI then?
> >
> > Per Sebastian's stack trace of the recursive lock scenario, it is happening
> > during cpu_acct_charge() which is called with the rq_lock held.
> >
> > The only other reasons I know of to call rcu_read_unlock_special() are if
> > 1. the tick indicated that the CPU has to report a QS
> > 2. an IPI in the middle of the reader section for expedited GPs
> > 3. preemption in the middle of a preemptible RCU reader section
>
> 4. Some previous reader section was IPIed or preempted, but either
>    interrupts, softirqs, or preemption was disabled across the
>    rcu_read_unlock() of that previous reader section.

Hi Paul, I did not fully understand 4. The previous RCU reader section
could not have been IPI'ed or preempted if interrupts were disabled
across it. Also, if softirq/preempt was disabled across the previous
reader section, the previous reader could not have been preempted in
that case.

That leaves us with the only scenario where the previous reader was
IPI'ed while softirq/preempt was disabled across it. Is that what you
meant? But in this scenario, the previous reader should have set
exp_hint to false in its own rcu_read_unlock_special() invocation.
So I would think t->rcu_read_unlock_special should be 0 during the new
reader's invocation, and I do not see how rcu_read_unlock_special() can
be called because of a previous reader.

I'll borrow some of that confused color paint if you don't mind ;-)
And we should document this somewhere for future sanity preservation :-D

thanks,

 - Joel

> I -think- that this is what Sebastian is seeing.
>
> Thanx, Paul
>
> > 1. and 2. are not possible because interrupts are disabled, that's why the
> > wakeup_softirq even happened.
> > 3. is not possible because we are holding rq_lock in the RCU reader section.
> >
> > So I am at a bit of a loss how this can happen :-(
> >
> > Spurious call to rcu_read_unlock_special() may be when it should not have
> > been called?
> >
> > thanks,
> >
> > - Joel