From: Joel Fernandes
Date: Thu, 27 Jun 2019 11:37:10 -0400
Subject: Re: [RFC] Deadlock via recursive wakeup via RCU with threadirqs
To: Steven Rostedt
Cc: "Paul E. McKenney", Sebastian Andrzej Siewior, rcu, LKML, Thomas Gleixner, Ingo Molnar, Peter Zijlstra, Josh Triplett, Mathieu Desnoyers, Lai Jiangshan
In-Reply-To: <20190627153031.GA249127@google.com>
References: <20190626135447.y24mvfuid5fifwjc@linutronix.de> <20190626162558.GY26519@linux.ibm.com> <20190627142436.GD215968@google.com> <20190627103455.01014276@gandalf.local.home> <20190627153031.GA249127@google.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On Thu, Jun 27, 2019 at 11:30 AM Joel Fernandes wrote:
>
> On Thu, Jun 27, 2019 at 10:34:55AM -0400, Steven Rostedt wrote:
> > On Thu, 27 Jun 2019 10:24:36 -0400
> > Joel Fernandes wrote:
> >
> > > > What am I missing here?
> > >
> > > This issue I think is
> > >
> > > (in normal process context)
> > > spin_lock_irqsave(rq_lock); // which disables both preemption and interrupt
> > >                             // but this was done in normal process context,
> > >                             // not from IRQ handler
> > > rcu_read_lock();
> > >           <---------- IPI comes in and sets exp_hint
> >
> > How would an IPI come in here with interrupts disabled?
> >
> > -- Steve
>
> This is true, could it be rcu_read_unlock_special() got called for some
> *other* reason other than the IPI then?
>
> Per Sebastian's stack trace of the recursive lock scenario, it is happening
> during cpu_acct_charge() which is called with the rq_lock held.
>
> The only other reasons I know of to call rcu_read_unlock_special() are if
> 1. the tick indicated that the CPU has to report a QS
> 2. an IPI in the middle of the reader section for expedited GPs
> 3. preemption in the middle of a preemptible RCU reader section
>
> 1. and 2. are not possible because interrupts are disabled, that's why the
> wakeup_softirq even happened.
> 3. is not possible because we are holding rq_lock in the RCU reader section.
>
> So I am at a bit of a loss how this can happen :-(

Sebastian, it would be nice, if possible, to trace where
t->rcu_read_unlock_special is set in the scenario that ends up calling
rcu_read_unlock_special(), to give a clear idea of whether it was
really because of an IPI. I guess we could also add additional RCU
debug fields to task_struct (just for debugging) to see where the
unlock_special is set.

Is there a test to reproduce this, or do I just boot an Intel x86_64
machine with "threadirqs" and run into it?
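
[Editorial sketch, not part of the original message.] The debug-field idea
above could look roughly like the following. This is a userspace model
only, not kernel code: struct task_model, the SPECIAL_* bits,
SET_SPECIAL() and the model_*() functions are all hypothetical names made
up for illustration, and the real task_struct layout is not reproduced
here. The point is simply that every site that sets the "special" state
also records who set it, so the setter can be identified afterwards.

```c
#include <stdio.h>

/* Hypothetical stand-in for the relevant part of task_struct. */
struct task_model {
	unsigned int unlock_special;   /* models t->rcu_read_unlock_special */
	const char *special_set_func;  /* debug only: last function that set a bit */
	int special_set_line;          /* debug only: source line of that setter */
};

/* Hypothetical bits, one per reason listed in the message. */
enum {
	SPECIAL_NEED_QS  = 1u << 0,    /* 1. tick wants a quiescent state */
	SPECIAL_EXP_HINT = 1u << 1,    /* 2. IPI for an expedited grace period */
	SPECIAL_BLOCKED  = 1u << 2,    /* 3. preempted inside a reader section */
};

/* Set a bit and remember the call site that did it. */
#define SET_SPECIAL(t, bit)                          \
	do {                                         \
		(t)->unlock_special |= (bit);        \
		(t)->special_set_func = __func__;    \
		(t)->special_set_line = __LINE__;    \
	} while (0)

/* Models of the three setters; each just tags the task. */
void model_tick(struct task_model *t)    { SET_SPECIAL(t, SPECIAL_NEED_QS); }
void model_exp_ipi(struct task_model *t) { SET_SPECIAL(t, SPECIAL_EXP_HINT); }
void model_preempt(struct task_model *t) { SET_SPECIAL(t, SPECIAL_BLOCKED); }

/* What the unlock slow path could print when it finds the state set. */
void report_special(const struct task_model *t)
{
	if (t->unlock_special)
		printf("special=0x%x last set by %s:%d\n", t->unlock_special,
		       t->special_set_func, t->special_set_line);
}
```

Calling model_exp_ipi() on a zeroed struct task_model and then
report_special() prints the recorded setter. In a real kernel the
equivalent would more likely be a trace_printk() or a tracepoint at each
site that sets the field, which avoids growing task_struct.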