Received: by 2002:a25:f815:0:0:0:0:0 with SMTP id u21csp2341088ybd; Thu, 27 Jun 2019 10:44:37 -0700 (PDT) X-Google-Smtp-Source: APXvYqwUJPl9ootnFPnz306r1z6kwAATWq9mZ+iYZrRHecuthIQsCJOevb8McmUpLPlK3xYkLr21 X-Received: by 2002:a17:902:b594:: with SMTP id a20mr6195817pls.259.1561657477519; Thu, 27 Jun 2019 10:44:37 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1561657477; cv=none; d=google.com; s=arc-20160816; b=d8qGOcV/FuXPHh6NrXJ+dnTgjcodRX5eaN+MuWs+QbsQ1KwC7208Ayua0OwgoZyuCQ VkYuTy0aBYVaRLBwS7fU7eD5vsd9+WoYi1z1AiWSRbac74Ddfcq7Exab5x7NqA91CgBN kujKDl6RB6lEk1QMXx9sgj6cm+OndsgnG/73D4AZtQT/eyiTkrxF9D1JyymAo8pRqExw NDxS0ahXSTuBgDrZO41aTzQVFc1Js0e5ZjEUOFsFwvnVnPr+J8FtWok09y1zjskqzNik IMt2YMXvOOtLrv1D31OQJ+gSLrjkZ8tm+TLzGio9AhnYfWV1G9Vn/88PdFK/JSPkGbV5 vDkg== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:sender:cc:to:subject:message-id:date:from :in-reply-to:references:mime-version:dkim-signature; bh=GKpy0tHYE9nhG9KCIJ1rt25gfqZJRMTkcZ17Pqh1Sho=; b=Nzk/3+x63U18S/SGkL18jyOf6hjVtQCeO1DoGOJzDO+VKVphU/GEmvRq6gDEq2lVa1 lcoHOOj4plKhWp2VxFbY+WYP7T5lUFlCvsCxYKYBGhL9skvXsoyomxy+/zcaKTlM5436 Xd40VK4ikjPBBbaCujX2Ev2VzwrTseysVjRcMgE/ZuNOfUuJ88cU+0j+vyyt8U5i5kop r2re8+Hpbqyt+AHfiGWIbyk0lVBgeT2ltnvfnWQLAXktae+yMJhhuEKZ7DwFxErW3Uxe 90x44q+k5y/rxSDLCU0AZau7kPmPsBVYYyvkW+mkeo+YzC5tSoUNIOezsiJ7ZbhZhLqe PAtQ== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@joelfernandes.org header.s=google header.b=BMRUuHbe; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Return-Path: Received: from vger.kernel.org (vger.kernel.org. [209.132.180.67]) by mx.google.com with ESMTP id x9si2938437plo.98.2019.06.27.10.44.21; Thu, 27 Jun 2019 10:44:37 -0700 (PDT) Received-SPF: pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) client-ip=209.132.180.67; Authentication-Results: mx.google.com; dkim=pass header.i=@joelfernandes.org header.s=google header.b=BMRUuHbe; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1727190AbfF0RoK (ORCPT + 99 others); Thu, 27 Jun 2019 13:44:10 -0400 Received: from mail-lj1-f193.google.com ([209.85.208.193]:44159 "EHLO mail-lj1-f193.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726549AbfF0RoH (ORCPT ); Thu, 27 Jun 2019 13:44:07 -0400 Received: by mail-lj1-f193.google.com with SMTP id k18so3208576ljc.11 for ; Thu, 27 Jun 2019 10:44:06 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=joelfernandes.org; s=google; h=mime-version:references:in-reply-to:from:date:message-id:subject:to :cc; bh=GKpy0tHYE9nhG9KCIJ1rt25gfqZJRMTkcZ17Pqh1Sho=; b=BMRUuHbeHOgbvemMWjAUC3U4wv5WlJjNIvWKdmfOacm4DPudsDIucxbyW7uBf5gUEB lw4AofuFcjaj2oVQ/rvmCrpcOSy/612a5gzhX3sa5qcmm3l49t8sH4RZYglf3JXC15+8 bCtvZ1g3erZV0rqxGhHkoDdV7ACLqtot7EOI0= X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:mime-version:references:in-reply-to:from:date :message-id:subject:to:cc; bh=GKpy0tHYE9nhG9KCIJ1rt25gfqZJRMTkcZ17Pqh1Sho=; b=JQl7my+SDcIXAXj0DIk3wB2w2SIhv9TO8Klw3jk5m6RI7OddMHiDIGjWEjk2ogoKKm qcJipkATdgiZ9990cytxBB4gq9exFP0dlHlgG47NwLiRieYZB9Jz1Rsz/fyV+vPxehhw x6HoZVqcoI4RgcRy5LKdpI4NAV6Q5IiKEGvH/UAu5io03fcIXcGy+jdXJ1TU3raKEO9M Ln0jRg04JGLuZ+JW1DII7XIhIGEXKfdoV5jYDZ62VlU/3Uojz7nE9QfAscJJSNa2QGpA l68lnnUBIydIVZlNVT9jrYNvUBGBOuPNn8WohngX2st7qNk4PJkxchLsOpO+O9KAp2Jz NF7g== X-Gm-Message-State: APjAAAW5gOvtHLWIUzzAUwhGqQt4jqJOG4iboWdHHLOp+2Xu1hAjzuGU JZGL10U0IAxbvFwZ10PTPRs3vOTQt6GpZAgxQq2LqQ== X-Received: by 2002:a2e:654d:: with SMTP id z74mr3422261ljb.111.1561657445370; Thu, 27 Jun 2019 10:44:05 -0700 (PDT) MIME-Version: 1.0 References: <20190626135447.y24mvfuid5fifwjc@linutronix.de> <20190626162558.GY26519@linux.ibm.com> <20190627142436.GD215968@google.com> <20190627103455.01014276@gandalf.local.home> <20190627153031.GA249127@google.com> <20190627154011.vbje64x6auaknhx4@linutronix.de> In-Reply-To: <20190627154011.vbje64x6auaknhx4@linutronix.de> From: Joel Fernandes Date: Thu, 27 Jun 2019 13:43:54 -0400 Message-ID: Subject: Re: [RFC] Deadlock via recursive wakeup via RCU with threadirqs To: Sebastian Andrzej Siewior Cc: Steven Rostedt , "Paul E. McKenney" , rcu , LKML , Thomas Gleixner , Ingo Molnar , Peter Zijlstra , Josh Triplett , Mathieu Desnoyers , Lai Jiangshan Content-Type: text/plain; charset="UTF-8" Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Thu, Jun 27, 2019 at 11:40 AM Sebastian Andrzej Siewior wrote: > > On 2019-06-27 11:37:10 [-0400], Joel Fernandes wrote: > > Sebastian it would be nice if possible to trace where the > > t->rcu_read_unlock_special is set for this scenario of calling > > rcu_read_unlock_special, to give a clear idea about whether it was > > really because of an IPI. I guess we could also add additional RCU > > debug fields to task_struct (just for debugging) to see where there > > unlock_special is set. > > > > Is there a test to reproduce this, or do I just boot an intel x86_64 > > machine with "threadirqs" and run into it? > > Do you want to send me a patch or should I send you my kvm image which > triggers the bug on boot? I could reproduce this as well just booting Linus tree with threadirqs command line and running rcutorture. In 15 seconds or so it locks up... gdb backtrace shows the recursive lock: (gdb) bt #0 queued_spin_lock_slowpath (lock=0xffff88813af16b44, val=0) at kernel/locking/qspinlock.c:381 #1 0xffffffff81a75a0f in queued_spin_lock (lock=) at ./include/asm-generic/qspinlock.h:81 #2 do_raw_spin_lock_flags (flags=, lock=) at ./include/linux/spinlock.h:193 #3 __raw_spin_lock_irqsave (lock=) at ./include/linux/spinlock_api_smp.h:119 #4 _raw_spin_lock_irqsave (lock=0xffff88813af16b44) at kernel/locking/spinlock.c:159 #5 0xffffffff81092158 in try_to_wake_up (p=0xffff88813af16400, state=3, wake_flags=0) at kernel/sched/core.c:2000 #6 0xffffffff8109265c in wake_up_process (p=) at kernel/sched/core.c:2114 #7 0xffffffff8106b019 in wakeup_softirqd () at kernel/softirq.c:76 #8 0xffffffff8106b8ef in raise_softirq_irqoff (nr=) at kernel/softirq.c:437 #9 0xffffffff810d080d in rcu_read_unlock_special (t=) at kernel/rcu/tree_plugin.h:632 #10 0xffffffff810d0868 in __rcu_read_unlock () at kernel/rcu/tree_plugin.h:414 #11 0xffffffff810aa58e in rcu_read_unlock () at ./include/linux/rcupdate.h:646 #12 cpuacct_charge (tsk=, cputime=) at kernel/sched/cpuacct.c:352 #13 0xffffffff81097994 in cgroup_account_cputime (delta_exec=, task=) at ./include/linux/cgroup.h:764 #14 update_curr (cfs_rq=) at kernel/sched/fair.c:843 #15 0xffffffff8109a882 in enqueue_entity (flags=, se=, cfs_rq=) at kernel/sched/fair.c:3901 #16 enqueue_task_fair (rq=, p=, flags=9) at kernel/sched/fair.c:5194 #17 0xffffffff81091699 in enqueue_task (flags=, p=, rq=) at kernel/sched/core.c:774 #18 activate_task (rq=0xffff88813af16b44, p=0x0 , flags=) at kernel/sched/core.c:795 #19 0xffffffff81091a1a in ttwu_do_activate (rq=0xffff88813af16b44, p=0x0 , wake_flags=0, rf=) at kernel/sched/core.c:1737 #20 0xffffffff810924b3 in ttwu_queue (wake_flags=, cpu=, p=) at kernel/sched/core.c:1882 #21 try_to_wake_up (p=0xffff88813af16400, state=, wake_flags=0) at kernel/sched/core.c:2092 #22 0xffffffff8109265c in wake_up_process (p=) at kernel/sched/core.c:2114 #23 0xffffffff8106b019 in wakeup_softirqd () at kernel/softirq.c:76 #24 0xffffffff8106b843 in invoke_softirq () at kernel/softirq.c:383 #25 irq_exit () at kernel/softirq.c:413 #26 0xffffffff81c01f8e in exiting_irq () at ./arch/x86/include/asm/apic.h:536 > > Sebastian