Date: Mon, 1 Jul 2019 17:28:25 -0400
From: Steven Rostedt
To: Corey Minyard
Cc: Corey Minyard, Sebastian Andrzej Siewior, linux-rt-users@vger.kernel.org,
 linux-kernel@vger.kernel.org, Peter Zijlstra, tglx@linutronix.de
Subject: Re: [PATCH RT v2] Fix a lockup in wait_for_completion() and friends
Message-ID: <20190701172825.7d861e85@gandalf.local.home>
In-Reply-To: <20190701171333.37cc0567@gandalf.local.home>
References: <20190509193320.21105-1-minyard@acm.org>
 <20190510103318.6cieoifz27eph4n5@linutronix.de>
 <20190628214903.6f92a9ea@oasis.local.home>
 <20190701190949.GB4336@minyard.net>
 <20190701161840.1a53c9e4@gandalf.local.home>
 <20190701204325.GD5041@minyard.net>
 <20190701170602.2fdb35c2@gandalf.local.home>
 <20190701171333.37cc0567@gandalf.local.home>

On Mon, 1 Jul 2019 17:13:33 -0400
Steven Rostedt wrote:

> On Mon, 1 Jul 2019 17:06:02 -0400
> Steven Rostedt wrote:
>
> > On Mon, 1 Jul 2019 15:43:25 -0500
> > Corey Minyard wrote:
> >
> >
> > > I show that patch is already applied at
> > >
> > >     1921ea799b7dc561c97185538100271d88ee47db
> > >     sched/completion: Fix a lockup in wait_for_completion()
> > >
> > > git describe --contains 1921ea799b7dc561c97185538100271d88ee47db
> > > v4.19.37-rt20~1
> > >
> > > So I'm not sure what is going on.
> >
> > Bah, I'm replying to the wrong commit that I'm having issues with.
> >
> > I searched your name to find the patch that is of trouble, and picked
> > this one.
> >
> > I'll go find the problem patch, sorry for the noise on this one.
> >
>
> No, I did reply to the right email, but it wasn't the top patch I was
> having issues with. It was the patch I replied to:
>
> This change below that Sebastian marked as stable-rt is what is causing
> me an issue. Not the patch that started the thread.
>

In fact, my system doesn't boot with this commit in 5.0-rt.

If I revert 90e1b18eba2ae4a729 ("swait: Delete the task from after a
wakeup occured") the machine boots again.

Sebastian, I think that's a bad commit, please revert it.

Thanks!

-- Steve

> >
> > Now.. that will fix it, but I think it is also wrong.
> >
> > The problem being that it violates FIFO, something that might be more
> > important on -RT than elsewhere.
> >
> > The regular wait API seems confused/inconsistent when it uses
> > autoremove_wake_function and default_wake_function, which doesn't help,
> > but we can easily support this with swait -- the problematic thing is
> > the custom wake functions, we musn't do that.
> >
> > (also, mingo went and renamed a whole bunch of wait_* crap and didn't do
> > the same to swait_ so now its named all different :/)
> >
> > Something like the below perhaps.
> >
> > ---
> > diff --git a/include/linux/swait.h b/include/linux/swait.h
> > index 73e06e9986d4..f194437ae7d2 100644
> > --- a/include/linux/swait.h
> > +++ b/include/linux/swait.h
> > @@ -61,11 +61,13 @@ struct swait_queue_head {
> >  struct swait_queue {
> >  	struct task_struct	*task;
> >  	struct list_head	task_list;
> > +	unsigned int		remove;
> >  };
> >  
> >  #define __SWAITQUEUE_INITIALIZER(name) {				\
> >  	.task		= current,					\
> >  	.task_list	= LIST_HEAD_INIT((name).task_list),		\
> > +	.remove		= 1,						\
> >  }
> >  
> >  #define DECLARE_SWAITQUEUE(name)					\
> > diff --git a/kernel/sched/swait.c b/kernel/sched/swait.c
> > index e83a3f8449f6..86974ecbabfc 100644
> > --- a/kernel/sched/swait.c
> > +++ b/kernel/sched/swait.c
> > @@ -28,7 +28,8 @@ void swake_up_locked(struct swait_queue_head *q)
> >  
> >  	curr = list_first_entry(&q->task_list, typeof(*curr), task_list);
> >  	wake_up_process(curr->task);
> > -	list_del_init(&curr->task_list);
> > +	if (curr->remove)
> > +		list_del_init(&curr->task_list);
> >  }
> >  EXPORT_SYMBOL(swake_up_locked);
> >  
> > @@ -57,7 +58,8 @@ void swake_up_all(struct swait_queue_head *q)
> >  		curr = list_first_entry(&tmp, typeof(*curr), task_list);
> >  
> >  		wake_up_state(curr->task, TASK_NORMAL);
> > -		list_del_init(&curr->task_list);
> > +		if (curr->remove)
> > +			list_del_init(&curr->task_list);
> >  
> >  		if (list_empty(&tmp))
> >  			break;
>
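
For anyone following along, the wake-one logic in the quoted swait.c hunk can be modeled in user space. This is only a rough sketch, not kernel code: struct waiter, struct wait_head, enqueue() and wake_one() are hypothetical stand-ins for struct swait_queue, struct swait_queue_head and swake_up_locked(), and a singly linked list replaces list_head. It assumes nothing beyond what the diff shows: the head entry is woken, and it is unlinked only when its remove flag is set.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical user-space stand-in for struct swait_queue (not kernel code). */
struct waiter {
	const char *name;
	unsigned int remove;	/* mirrors the proposed .remove field */
	int woken;		/* number of wake-ups delivered to this waiter */
	struct waiter *next;
};

/* Hypothetical stand-in for struct swait_queue_head. */
struct wait_head {
	struct waiter *first;
};

/* Append at the tail so that wake-ups are handed out in FIFO order. */
static void enqueue(struct wait_head *q, struct waiter *w)
{
	struct waiter **pp = &q->first;

	assert(w != NULL);
	while (*pp)
		pp = &(*pp)->next;
	w->next = NULL;
	*pp = w;
}

/*
 * Model of swake_up_locked() with the quoted change applied: wake the
 * head entry, but unlink it only if its remove flag is set.
 * Returns the waiter that was woken, or NULL if the queue is empty.
 */
static struct waiter *wake_one(struct wait_head *q)
{
	struct waiter *curr = q->first;

	if (!curr)
		return NULL;

	curr->woken++;			/* stands in for wake_up_process() */
	if (curr->remove) {
		q->first = curr->next;	/* stands in for list_del_init() */
		curr->next = NULL;
	}
	return curr;
}
```

In this model, a waiter queued with remove == 0 stays at the head, so repeated wake_one() calls keep hitting it until it takes itself off the queue; waiters behind it are not reached in the meantime. Waiters with remove == 1 (the default in the quoted __SWAITQUEUE_INITIALIZER) behave as before.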