Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
        id S1751942AbdFZVXS (ORCPT ); Mon, 26 Jun 2017 17:23:18 -0400
Received: from mail.kernel.org ([198.145.29.99]:54912 "EHLO mail.kernel.org"
        rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP
        id S1751381AbdFZVXQ (ORCPT ); Mon, 26 Jun 2017 17:23:16 -0400
DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 176C622BCF
Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=kernel.org
Authentication-Results: mail.kernel.org; spf=none smtp.mailfrom=mcgrof@kernel.org
From: "Luis R. Rodriguez"
To: gregkh@linuxfoundation.org
Cc: jakub.kicinski@netronome.com, nbroeking@me.com, ming.lei@redhat.com,
        mfuzzey@parkeon.com, ebiederm@xmission.com, dmitry.torokhov@gmail.com,
        wagi@monom.org, dwmw2@infradead.org, jewalt@lgsinnovations.com,
        rafal@milecki.pl, arend.vanspriel@broadcom.com, rjw@rjwysocki.net,
        yi1.li@linux.intel.com, atull@kernel.org, moritz.fischer@ettus.com,
        pmladek@suse.com, johannes.berg@intel.com, emmanuel.grumbach@intel.com,
        luciano.coelho@intel.com, kvalo@codeaurora.org, luto@kernel.org,
        torvalds@linux-foundation.org, keescook@chromium.org,
        takahiro.akashi@linaro.org, dhowells@redhat.com, pjones@redhat.com,
        hdegoede@redhat.com, alan@linux.intel.com, tytso@mit.edu,
        paul.gortmaker@windriver.com, mtosatti@redhat.com,
        mawilcox@microsoft.com, linux-kernel@vger.kernel.org,
        "[4.10+]" , "Luis R . Rodriguez"
Subject: [PATCH v2] firmware: fix batched requests - wake all waiters
Date: Mon, 26 Jun 2017 14:23:12 -0700
Message-Id: <20170626212312.31958-1-mcgrof@kernel.org>
X-Mailer: git-send-email 2.11.0
In-Reply-To: <20170626212036.GE21846@wotan.suse.de>
References: <20170626212036.GE21846@wotan.suse.de>
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org
Content-Length: 3104
Lines: 73

From: Jakub Kicinski

The firmware cache mechanism serves two purposes, the secondary purpose
is not well documented nor understood.
This fixes a regression with the secondary purpose of the firmware
cache mechanism: batched requests.

The firmware cache is used for:

1) Addressing races with file lookups during the suspend/resume cycle
   by keeping firmware in memory during the cycle

2) Batched requests for the same file rely only on work from the first
   file lookup, which keeps the firmware in memory until the last
   release_firmware() is called

Batched requests *only* take effect if secondary requests come in prior
to the first user calling release_firmware(). The devres name used for
the internal firmware cache is used as a hint that other pending
requests are ongoing. The firmware buffer data is kept in memory until
the last user of the buffer calls release_firmware(), therefore
serializing requests and delaying the release until all requests are
done.

Batched requests wait for a wakeup or signal (we only accept SIGKILL
now) so we can rely on the first file fetch to write to the pending
secondary requests.

Commit 5b029624948d ("firmware: do not use fw_lock for fw_state
protection") ported the firmware API to use swait, and in doing so
failed to convert complete_all() to swake_up_all() -- it used
swake_up(), losing the ability for *some* batched requests to take
effect.

Without this fix it has been reported that plugging in two Intel 6260
Wifi cards on a system will end up enumerating the two devices only 50%
of the time [0]. The ported swake_up() should have actually handled the
case with two devices, however, *if more than two cards are used* the
swake_up() would not suffice.

This change is only part of the required fixes for batched requests.
Subsequent fixes will follow. This particular change should fix the
cases where more than three requests with the same firmware name are
used, otherwise batched requests will wait for MAX_SCHEDULE_TIMEOUT and
just time out eventually.

[0] https://bugzilla.kernel.org/show_bug.cgi?id=195477

Fixes: 5b029624948d ("firmware: do not use fw_lock for fw_state protection")
CC: [4.10+]
Cc: Ming Lei
Signed-off-by: Jakub Kicinski
[mcgrof: expanded on impact on commit log]
Signed-off-by: Luis R. Rodriguez
---

Greg,

I think it would make sense to queue this in after the signal stable
fixes [1].

[1] https://lkml.kernel.org/r/20170614222017.14653-1-mcgrof@kernel.org

 drivers/base/firmware_class.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/base/firmware_class.c b/drivers/base/firmware_class.c
index b9f907eedbf7..686381a621a0 100644
--- a/drivers/base/firmware_class.c
+++ b/drivers/base/firmware_class.c
@@ -148,7 +148,7 @@ static void __fw_state_set(struct fw_state *fw_st,
 	WRITE_ONCE(fw_st->status, status);
 
 	if (status == FW_STATUS_DONE || status == FW_STATUS_ABORTED)
-		swake_up(&fw_st->wq);
+		swake_up_all(&fw_st->wq);
 }
 
 #define fw_state_start(fw_st)					\
-- 
2.11.0
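
The commit message argues that waking a single waiter is not enough once
several batched requests sleep on the same firmware lookup. A minimal
user-space sketch of that argument follows. It is a toy model only: every
name in it (toy_waitqueue, toy_wait(), toy_wake_one(), toy_wake_all(),
toy_finish_lookup()) is hypothetical and is not part of the kernel's swait
API, and it ignores scheduling, signals and timeouts entirely.

```c
#include <assert.h>
#include <stddef.h>

/* Toy model only: hypothetical names, not the kernel's swait API. */
#define TOY_MAX_WAITERS 8

struct toy_waitqueue {
	int sleeping[TOY_MAX_WAITERS];	/* 1 while a batched request is blocked */
	size_t nr;			/* number of requests that have queued */
};

/* A secondary (batched) request queues itself and goes to sleep. */
void toy_wait(struct toy_waitqueue *wq)
{
	assert(wq->nr < TOY_MAX_WAITERS);
	wq->sleeping[wq->nr++] = 1;
}

/* Wake-one semantics: only the first sleeping waiter is released. */
void toy_wake_one(struct toy_waitqueue *wq)
{
	for (size_t i = 0; i < wq->nr; i++) {
		if (wq->sleeping[i]) {
			wq->sleeping[i] = 0;
			return;
		}
	}
}

/* Wake-all semantics: every sleeping waiter is released. */
void toy_wake_all(struct toy_waitqueue *wq)
{
	for (size_t i = 0; i < wq->nr; i++)
		wq->sleeping[i] = 0;
}

/* Count the requests that are still blocked. */
size_t toy_still_sleeping(const struct toy_waitqueue *wq)
{
	size_t count = 0;

	for (size_t i = 0; i < wq->nr; i++)
		count += wq->sleeping[i] ? 1 : 0;
	return count;
}

/*
 * One request performs the lookup while nr_secondary requests wait on it.
 * Returns how many waiters remain blocked after the lookup completes.
 */
size_t toy_finish_lookup(size_t nr_secondary, int wake_all)
{
	struct toy_waitqueue wq = { .nr = 0 };

	for (size_t i = 0; i < nr_secondary; i++)
		toy_wait(&wq);

	if (wake_all)
		toy_wake_all(&wq);
	else
		toy_wake_one(&wq);

	return toy_still_sleeping(&wq);
}
```

In this model toy_finish_lookup(1, 0) leaves no waiter blocked, which
matches the two-device case, while toy_finish_lookup(2, 0) leaves one
waiter blocked and toy_finish_lookup(2, 1) leaves none.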