Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1751510AbdFFTcI (ORCPT ); Tue, 6 Jun 2017 15:32:08 -0400 Received: from mga07.intel.com ([134.134.136.100]:9178 "EHLO mga07.intel.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1751239AbdFFTcC (ORCPT ); Tue, 6 Jun 2017 15:32:02 -0400 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.39,307,1493708400"; d="scan'208";a="111661611" Subject: Re: [PATCHv2 0/3] Enable no_cache flag to driver_data To: "Luis R. Rodriguez" References: <1495262819-981-1-git-send-email-yi1.li@linux.intel.com> <20170524190357.GB8951@wotan.suse.de> Cc: atull@kernel.org, gregkh@linuxfoundation.org, wagi@monom.org, dwmw2@infradead.org, rafal@milecki.pl, arend.vanspriel@broadcom.com, rjw@rjwysocki.net, moritz.fischer@ettus.com, pmladek@suse.com, johannes.berg@intel.com, emmanuel.grumbach@intel.com, luciano.coelho@intel.com, kvalo@codeaurora.org, luto@kernel.org, takahiro.akashi@linaro.org, dhowells@redhat.com, pjones@redhat.com, linux-kernel@vger.kernel.org, linux-fpga@vger.kernel.org From: "Li, Yi" Message-ID: Date: Tue, 6 Jun 2017 14:31:54 -0500 User-Agent: Mozilla/5.0 (Windows NT 6.3; WOW64; rv:45.0) Gecko/20100101 Thunderbird/45.8.0 MIME-Version: 1.0 In-Reply-To: <20170524190357.GB8951@wotan.suse.de> Content-Type: text/plain; charset=windows-1252; format=flowed Content-Transfer-Encoding: 7bit Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Length: 8692 Lines: 205 Hi Luis, On 5/24/2017 2:03 PM, Luis R. Rodriguez wrote: > On Sat, May 20, 2017 at 01:46:56AM -0500, yi1.li@linux.intel.com wrote: >> From: Yi Li >> >> Changes in v2: >> >> - Rebase to Luis R. Rodriguez's 20170501-driver-data-try2 >> branch >> - Expose DRIVER_DATA_REQ_NO_CACHE flag to public >> driver_data_req_params structure, so upper drivers can ask >> driver_data driver to bypass the internal caching mechanism. >> This will be used for streaming and other drivers maintains >> their own caching like iwlwifi. >> - Add self test cases. >> >> >> Yi Li (3): >> firmware_class: move NO_CACHE from private to driver_data_req_params >> iwlwifi: use DRIVER_DATA_REQ_NO_CACHE for driver_data >> test: add no_cache to driver_data load tester >> >> drivers/base/firmware_class.c | 16 ++++----- >> drivers/net/wireless/intel/iwlwifi/iwl-drv.c | 2 ++ >> include/linux/driver_data.h | 4 +++ >> lib/test_driver_data.c | 43 +++++++++++++++++++++++-- >> tools/testing/selftests/firmware/driver_data.sh | 36 +++++++++++++++++++++ >> 5 files changed, 89 insertions(+), 12 deletions(-) >> > Good stuff, this series is looking good and very easy to read ! Only thing > though -- the cache test is just setting up the cache and ensuring it gets set, > it doesn't really *test* the cache is functional. Can you devise a test which > does ensure the cache is functional ? > > We use the cache upon suspend to cache the firmware so that upon resume a > request will use that cache, to avoid the file lookup on disk. Doing a test > with qemu suspend + resume is possible but that requires having access to > the qemu monitor interface and doing something like this to trigger a wakeup: > > echo system_wakeup | socat - /dev/pts/7,raw,echo=0,crnl > > where /dev/pts/7 is assumed to be the PTY. This is rather complex for ksefltest, > so another option is to add a debug mode debugfs interface for firmware_class where > we can force-enable the cache mechanism without actually this being prompted by a > suspend/hibernate. Then we can just toggle this bit from a testing perspective and > excercise the caching mechanism. > > So if you look at fw_pm_notify() > > switch (mode) { > case PM_HIBERNATION_PREPARE: > case PM_SUSPEND_PREPARE: > case PM_RESTORE_PREPARE: > /* > * kill pending fallback requests with a custom fallback > * to avoid stalling suspend. > */ > kill_pending_fw_fallback_reqs(true); > device_cache_fw_images(); > disable_firmware(); > break; > I am studying at the firmware caching codes and have to say it's very complicated. :-( Here are some questions: 1. Since device_cache_fw_images invokes dev_cache_fw_image through dpm_for_each_dev, adding a debugfs driver to kick it can only cache firmware for those associated with devices which has PM enabled, which do not include the driver_data_test_device. Any suggestions? 2. Look into dev_cache_fw_image function, devres_for_each_res will walk through the firmware have been loaded before (through assign_firmware_buf -> fw_add_devm_name) and add to the todo list, eventually it will create the fw_names list. So in the test driver, we need to load the firmware once before calling the kick? dev_cache_fw_image(struct device *dev, void *data) { LIST_HEAD(todo); struct fw_cache_entry *fce; struct fw_cache_entry *fce_next; struct firmware_cache *fwc = &fw_cache; devres_for_each_res(dev, fw_name_devm_release, devm_name_match, &fw_cache, dev_create_fw_entry, &todo); list_for_each_entry_safe(fce, fce_next, &todo, list) { list_del(&fce->list); spin_lock(&fwc->name_lock); /* only one cache entry for one firmware */ if (!__fw_entry_found(fce->name)) { list_add(&fce->list, &fwc->fw_names); } else { free_fw_cache_entry(fce); ... } > kill_pending_fw_fallback_reqs(true), device_cache_fw_images(), and > disable_firmware() could be folder into a helper, then a debugfs interface > could kick that into action to put us cache mode as the > device_cache_fw_images() changes the cache state to FW_LOADER_START_CACHE, and > when this is done you'll notice that assign_firmware_buf() does: > > /* > * After caching firmware image is started, let it piggyback > * on request firmware. > */ > if (!driver_data_param_nocache(&data_params->priv_params) && > buf->fwc->state == FW_LOADER_START_CACHE) { > if (fw_cache_piggyback_on_request(buf->fw_id)) > kref_get(&buf->ref); > } > > Which adds an incoming request to the cache. The first request that adds this cache > entry would be triggered by device_cache_fw_images() after the cache state is enabled: > > static void device_cache_fw_images(void) > { > ... > fwc->state = FW_LOADER_START_CACHE; > dpm_for_each_dev(NULL, dev_cache_fw_image); > ... > } This only applies to the devices have PM enabled. > Subsequent requests then lookup for the cache through _request_firmware_prepare() > when fw_lookup_and_allocate_buf() is called. __fw_lookup_buf() really should be > renamed to something that reflects this is a cache lookup. In fact if you find > anything else that needs renaming to make it clear please feel free to send patches > for it. > > We want to test that when caching is enabled, the cache is actually used. > > Note that disable_firmware() above on the notifier *does* disable subsequent firmware > lookups but this is only *if* the lookup fails with _request_firmware_prepare(): > > _request_firmware(const struct firmware **firmware_p, const char *name, > struct driver_data_params *data_params, > struct device *device) > { > struct firmware *fw = NULL; > int ret; > > if (!firmware_p) > return -EINVAL; > > if (!name || name[0] == '\0') { > ret = -EINVAL; > goto out; > } > > ret = _request_firmware_prepare(&fw, name, device, data_params); > if (ret <= 0) /* error or already assigned */ > goto out; > > if (!firmware_enabled()) { > WARN(1, "firmware request while host is not available\n"); > ret = -EHOSTDOWN; > goto out; > } > ... > } > > The idea is that _request_firmware_prepare() will have picked up the cache and > enabled use of that while the infrastructure for disk lookups is disabled. The > caching effect is lifted later on the same notifier fw_pm_notify() and it also > schedules a clearing of the cached firmwares with device_uncache_fw_images_delay(). device_uncache_fw_images_delay() will be called for PM_POST_RESTORE, which means the hibernation restore got error. How about the successful restore case, calling driver will free its firmware_buf after loading? > To me this last part smells like a possible source of issue (not sure) if we might > suspend/resume a lot in short period of time, this theory could be tested by toggling > on/of this debugfs interface I suggested while having requests of different types > blast in. Agree, it might create an issue if the system is get into restore_prepare again before the device_uncache_fw_images_delay clear the cache, why we need the 10 * MSEC_PER_SEC delay? In theory, fwc->name_lock should protect the case though. > > This is the sort of testing which would really help here. > > Likewise, if you are extending functionality please consider ways to break it :) Understand the need to test the firmware caching part. For non-caching test case, will it be enough if we can test that the noncache setting will ban the firmware name be added to the fwc->fw_names list? > and test against it. Please think about these things carefully, its what will > change the stability for the better long term of our loader infrastructure. > > Luis > -- > To unsubscribe from this list: send the line "unsubscribe linux-fpga" in > the body of a message to majordomo@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html >