Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1752165AbdFZVo1 (ORCPT ); Mon, 26 Jun 2017 17:44:27 -0400 Received: from mail-io0-f180.google.com ([209.85.223.180]:35757 "EHLO mail-io0-f180.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1751439AbdFZVoS (ORCPT ); Mon, 26 Jun 2017 17:44:18 -0400 MIME-Version: 1.0 In-Reply-To: <20170623233702.20564-1-jakub.kicinski@netronome.com> References: <20170623233702.20564-1-jakub.kicinski@netronome.com> From: Linus Torvalds Date: Mon, 26 Jun 2017 14:44:17 -0700 X-Google-Sender-Auth: bgzHLSSmhZJzFb8iSEf3UoKorio Message-ID: Subject: Re: [PATCH] firmware: wake all waiters To: Jakub Kicinski Cc: "Luis R. Rodriguez" , Greg Kroah-Hartman , Linux Kernel Mailing List , oss-drivers@netronome.com Content-Type: text/plain; charset="UTF-8" Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Length: 1652 Lines: 42 On Fri, Jun 23, 2017 at 4:37 PM, Jakub Kicinski wrote: > - swake_up(&fw_st->wq); > + swake_up_all(&fw_st->wq); Why is that code using the braindamaed "swait" model in the first place? That's the real problem here - swait() is a very specialized interface, and it does not make sense to use it here. Among all the simplifications it has is exactly the fact that it wakes up only one thing, because it is *so* specialized. But the *only* reason for swait is extreme memory issues and some very special realtime issues, where it saves a couple of bytes in structures that need close packing, and doesn't even use normal spinlocks, so it saves a couple of cycles at wakeup/sleep because it doesn't do a good job in general. The "avoid normal spinlocks" is because it is meant for code that is *so* special that it needs the magical low-level raw spinlocks. I really have *no* idea why the firmware code uses that idiotic special wait-queue. It has no reason to do so, except this comment from the commit that added it: "We use also swait instead of wait because don't need all the additional features wait provides." which is bogus, since it clearly just got the waiting wrong exactly *because* swait is pretty damn bad and specialized. I think the two valid users are RCU (which needed it for RT), and kvm (which also needed it for similar issues - it needs to be non-preemptible). I don't see any similar reason for the firmware loading, and all it did was use an odd interface that resulted in this bug. Why is the firmware code being so damn odd on purpose? Linus