Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1751574AbdFZXaf (ORCPT ); Mon, 26 Jun 2017 19:30:35 -0400 Received: from mx2.suse.de ([195.135.220.15]:47873 "EHLO mx1.suse.de" rhost-flags-OK-OK-OK-FAIL) by vger.kernel.org with ESMTP id S1751426AbdFZXaf (ORCPT ); Mon, 26 Jun 2017 19:30:35 -0400 Date: Tue, 27 Jun 2017 01:30:30 +0200 From: "Luis R. Rodriguez" To: Linus Torvalds , Thomas Gleixner , Peter Zijlstra , DanielWagnerwagi@monom.org, Boqun Feng , Marcelo Tosatti , Paul Gortmaker , Ming Lei , yi1.li@linux.intel.com, takahiro.akashi@linaro.org Cc: Jakub Kicinski , "Luis R. Rodriguez" , Greg Kroah-Hartman , "Paul E. McKenney" , Davidlohr Bueso , Linux Kernel Mailing List , ebiederm@xmission.com, pmladek@suse.com, luto@kernel.org, keescook@chromium.org, dhowells@redhat.com, alan@linux.intel.com, tytso@mit.edu, oss-drivers@netronome.com Subject: Re: [PATCH] firmware: wake all waiters Message-ID: <20170626233030.GI21846@wotan.suse.de> References: <20170623233702.20564-1-jakub.kicinski@netronome.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: User-Agent: Mutt/1.6.0 (2016-04-01) Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Length: 2839 Lines: 69 On Mon, Jun 26, 2017 at 02:44:17PM -0700, Linus Torvalds wrote: > On Fri, Jun 23, 2017 at 4:37 PM, Jakub Kicinski > wrote: > > - swake_up(&fw_st->wq); > > + swake_up_all(&fw_st->wq); > > Why is that code using the braindamaed "swait" model in the first place? The conversion was done via commit 5b029624948d ("firmware: do not use fw_lock for fw_state protection"). This commit overlooked this "batched request" feature. > That's the real problem here - swait() is a very specialized > interface, and it does not make sense to use it here. > > Among all the simplifications it has is exactly the fact that it wakes > up only one thing, because it is *so* specialized. Not sure I follow, it can wake up all items in queue with swake_up_all(), no? At first I *thought* we don't add further items with __prepare_to_swait() but actually that if (list_empty(&wait->task_list)) check just checks if the item was already added to a wait list. So swait can queue more waiters, no? > But the *only* reason for swait is extreme memory issues and some very > special realtime issues, where it saves a couple of bytes in > structures that need close packing, and doesn't even use normal > spinlocks, so it saves a couple of cycles at wakeup/sleep because it > doesn't do a good job in general. I see. > The "avoid normal spinlocks" is because it is meant for code that is > *so* special that it needs the magical low-level raw spinlocks. > > I really have *no* idea why the firmware code uses that idiotic > special wait-queue. It has no reason to do so, except this comment > from the commit that added it: > > "We use also swait instead of wait because don't need all the additional > features wait provides." > > which is bogus, since it clearly just got the waiting wrong exactly > *because* swait is pretty damn bad and specialized. If indeed it cannot queue and wake all then surely this is buggered! > I think the two valid users are RCU (which needed it for RT), and kvm > (which also needed it for similar issues - it needs to be > non-preemptible). Got it. > I don't see any similar reason for the firmware loading, and all it > did was use an odd interface that resulted in this bug. >From my review when this was suggested it seemed to cover all grounds and an API that was more light weight seemed sensible. The swait documentation seems to note "One would recommend using this wait queue where possible." If its non suitable for multiple waits then indeed this needs fixing so this doesn't happen again. > Why is the firmware code being so damn odd on purpose? I've been asking myself that since I started reviewing the firmware code :) But in this case I believe the commit overlooked batched requests functionality and to use swake_up_all(). Luis