Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1753533AbdF2PQz (ORCPT ); Thu, 29 Jun 2017 11:16:55 -0400 Received: from mail.linuxfoundation.org ([140.211.169.12]:59934 "EHLO mail.linuxfoundation.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1753293AbdF2PQn (ORCPT ); Thu, 29 Jun 2017 11:16:43 -0400 Date: Thu, 29 Jun 2017 17:16:41 +0200 From: Greg KH To: "Luis R. Rodriguez" Cc: jakub.kicinski@netronome.com, nbroeking@me.com, ming.lei@redhat.com, mfuzzey@parkeon.com, ebiederm@xmission.com, dmitry.torokhov@gmail.com, wagi@monom.org, dwmw2@infradead.org, jewalt@lgsinnovations.com, rafal@milecki.pl, arend.vanspriel@broadcom.com, rjw@rjwysocki.net, yi1.li@linux.intel.com, atull@kernel.org, moritz.fischer@ettus.com, pmladek@suse.com, johannes.berg@intel.com, emmanuel.grumbach@intel.com, luciano.coelho@intel.com, kvalo@codeaurora.org, luto@kernel.org, torvalds@linux-foundation.org, keescook@chromium.org, takahiro.akashi@linaro.org, dhowells@redhat.com, pjones@redhat.com, hdegoede@redhat.com, alan@linux.intel.com, tytso@mit.edu, paul.gortmaker@windriver.com, mtosatti@redhat.com, mawilcox@microsoft.com, linux-kernel@vger.kernel.org, "[4.10+]" Subject: Re: [PATCH v2] firmware: fix batched requests - wake all waiters Message-ID: <20170629151641.GC4880@kroah.com> References: <20170626212036.GE21846@wotan.suse.de> <20170626212312.31958-1-mcgrof@kernel.org> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20170626212312.31958-1-mcgrof@kernel.org> User-Agent: Mutt/1.8.3 (2017-05-23) Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Length: 2738 Lines: 59 On Mon, Jun 26, 2017 at 02:23:12PM -0700, Luis R. Rodriguez wrote: > From: Jakub Kicinski > > The firmware cache mechanism serves two purposes, the secondary purpose is > not well documented nor understood. This fixes a regression with the secondary > purpose of the firmware cache mechanism: batched requests. > > The firmware cache is used for: > > 1) Addressing races with file lookups during the suspend/resume cycle > by keeping firmware in memory during the cycle > > 2) Batched requests for the same file rely only on work from the first file > lookup, which keeps the firmware in memory until the last release_firmware() > is called > > Batched requests *only* take effect if secondary requests come in prior to the > first user calling release_firmware(). The devres name used for the internal > firmware cache is used as a hint other pending requests are ongoing, the > firmware buffer data is kept in memory until the last user of the buffer > calls release_firmware(), therefore serializing requests and delaying the > release until all requests are done. > > Batched requests wait for a wakup or signal (we only accept SIGKILL now) so we > can rely on the first file fetch to write to the pending secondary requests. > Commit 5b029624948d ("firmware: do not use fw_lock for fw_state protection") > ported the firmware API to use swait, and in doing so failed to convert > complete_all() to swake_up_all() -- it used swake_up(), loosing the ability > for *some* batched requests to take effect. > > Without this fix it has been reported plugging in two Intel 6260 Wifi cards > on a system will end up enumerating the two devices only 50% of the time > [0]. The ported swake_up() should have actually two devices, however, > *if more than two cards are used* the swake_up() would not suffice. This > change is only part of the required fixes for batched requests. Subsequent > fixes will follow. > > This particular change should fix the cases where more than three requests > with the same firmware name is used, otherwise batched requests will wait for > MAX_SCHEDULE_TIMEOUT and just timeout eventually. > > [0] https://bugzilla.kernel.org/show_bug.cgi?id=195477 > > Fixes: 5b029624948d ("firmware: do not use fw_lock for fw_state protection") > CC: [4.10+] > Cc: Ming Lei > Signed-off-by: Jakub Kicinski > [mcgrof: expanded on impact on commit log] > Signed-off-by: Luis R. Rodriguez > --- > > Greg, I think it would make sense to queue this in after the signal stable > fixes [1]. As I just dropped them, can you redo this based on Linus's tree now? thanks, greg k-h