2017-06-06 19:32:08

by Li, Yi

[permalink] [raw]
Subject: Re: [PATCHv2 0/3] Enable no_cache flag to driver_data

Hi Luis,


On 5/24/2017 2:03 PM, Luis R. Rodriguez wrote:
> On Sat, May 20, 2017 at 01:46:56AM -0500, [email protected] wrote:
>> From: Yi Li <[email protected]>
>>
>> Changes in v2:
>>
>> - Rebase to Luis R. Rodriguez's 20170501-driver-data-try2
>> branch
>> - Expose DRIVER_DATA_REQ_NO_CACHE flag to public
>> driver_data_req_params structure, so upper drivers can ask
>> driver_data driver to bypass the internal caching mechanism.
>> This will be used for streaming and other drivers maintains
>> their own caching like iwlwifi.
>> - Add self test cases.
>>
>>
>> Yi Li (3):
>> firmware_class: move NO_CACHE from private to driver_data_req_params
>> iwlwifi: use DRIVER_DATA_REQ_NO_CACHE for driver_data
>> test: add no_cache to driver_data load tester
>>
>> drivers/base/firmware_class.c | 16 ++++-----
>> drivers/net/wireless/intel/iwlwifi/iwl-drv.c | 2 ++
>> include/linux/driver_data.h | 4 +++
>> lib/test_driver_data.c | 43 +++++++++++++++++++++++--
>> tools/testing/selftests/firmware/driver_data.sh | 36 +++++++++++++++++++++
>> 5 files changed, 89 insertions(+), 12 deletions(-)
>>
> Good stuff, this series is looking good and very easy to read ! Only thing
> though -- the cache test is just setting up the cache and ensuring it gets set,
> it doesn't really *test* the cache is functional. Can you devise a test which
> does ensure the cache is functional ?
>
> We use the cache upon suspend to cache the firmware so that upon resume a
> request will use that cache, to avoid the file lookup on disk. Doing a test
> with qemu suspend + resume is possible but that requires having access to
> the qemu monitor interface and doing something like this to trigger a wakeup:
>
> echo system_wakeup | socat - /dev/pts/7,raw,echo=0,crnl
>
> where /dev/pts/7 is assumed to be the PTY. This is rather complex for ksefltest,
> so another option is to add a debug mode debugfs interface for firmware_class where
> we can force-enable the cache mechanism without actually this being prompted by a
> suspend/hibernate. Then we can just toggle this bit from a testing perspective and
> excercise the caching mechanism.
>
> So if you look at fw_pm_notify()
>
> switch (mode) {
> case PM_HIBERNATION_PREPARE:
> case PM_SUSPEND_PREPARE:
> case PM_RESTORE_PREPARE:
> /*
> * kill pending fallback requests with a custom fallback
> * to avoid stalling suspend.
> */
> kill_pending_fw_fallback_reqs(true);
> device_cache_fw_images();
> disable_firmware();
> break;
>
I am studying at the firmware caching codes and have to say it's very
complicated. :-( Here are some questions:

1. Since device_cache_fw_images invokes dev_cache_fw_image through
dpm_for_each_dev, adding a debugfs driver to kick it can only cache
firmware for those associated with devices which has PM enabled, which
do not include the driver_data_test_device. Any suggestions?

2. Look into dev_cache_fw_image function, devres_for_each_res will walk through the firmware have been loaded before (through assign_firmware_buf -> fw_add_devm_name) and add to the todo list, eventually it will create the fw_names list. So in the test driver, we need to load the firmware once before calling the kick?
dev_cache_fw_image(struct device *dev, void *data)

{

LIST_HEAD(todo);

struct fw_cache_entry *fce;

struct fw_cache_entry *fce_next;

struct firmware_cache *fwc = &fw_cache;

devres_for_each_res(dev, fw_name_devm_release,

devm_name_match, &fw_cache,

dev_create_fw_entry, &todo);

list_for_each_entry_safe(fce, fce_next, &todo, list) {

list_del(&fce->list);

spin_lock(&fwc->name_lock);

/* only one cache entry for one firmware */

if (!__fw_entry_found(fce->name)) {

list_add(&fce->list, &fwc->fw_names);

} else {

free_fw_cache_entry(fce);

...
}
> kill_pending_fw_fallback_reqs(true), device_cache_fw_images(), and
> disable_firmware() could be folder into a helper, then a debugfs interface
> could kick that into action to put us cache mode as the
> device_cache_fw_images() changes the cache state to FW_LOADER_START_CACHE, and
> when this is done you'll notice that assign_firmware_buf() does:
>
> /*
> * After caching firmware image is started, let it piggyback
> * on request firmware.
> */
> if (!driver_data_param_nocache(&data_params->priv_params) &&
> buf->fwc->state == FW_LOADER_START_CACHE) {
> if (fw_cache_piggyback_on_request(buf->fw_id))
> kref_get(&buf->ref);
> }
>
> Which adds an incoming request to the cache. The first request that adds this cache
> entry would be triggered by device_cache_fw_images() after the cache state is enabled:
>
> static void device_cache_fw_images(void)
> {
> ...
> fwc->state = FW_LOADER_START_CACHE;
> dpm_for_each_dev(NULL, dev_cache_fw_image);
> ...
> }
This only applies to the devices have PM enabled.
> Subsequent requests then lookup for the cache through _request_firmware_prepare()
> when fw_lookup_and_allocate_buf() is called. __fw_lookup_buf() really should be
> renamed to something that reflects this is a cache lookup. In fact if you find
> anything else that needs renaming to make it clear please feel free to send patches
> for it.
>
> We want to test that when caching is enabled, the cache is actually used.
>
> Note that disable_firmware() above on the notifier *does* disable subsequent firmware
> lookups but this is only *if* the lookup fails with _request_firmware_prepare():
>
> _request_firmware(const struct firmware **firmware_p, const char *name,
> struct driver_data_params *data_params,
> struct device *device)
> {
> struct firmware *fw = NULL;
> int ret;
>
> if (!firmware_p)
> return -EINVAL;
>
> if (!name || name[0] == '\0') {
> ret = -EINVAL;
> goto out;
> }
>
> ret = _request_firmware_prepare(&fw, name, device, data_params);
> if (ret <= 0) /* error or already assigned */
> goto out;
>
> if (!firmware_enabled()) {
> WARN(1, "firmware request while host is not available\n");
> ret = -EHOSTDOWN;
> goto out;
> }
> ...
> }
>
> The idea is that _request_firmware_prepare() will have picked up the cache and
> enabled use of that while the infrastructure for disk lookups is disabled. The
> caching effect is lifted later on the same notifier fw_pm_notify() and it also
> schedules a clearing of the cached firmwares with device_uncache_fw_images_delay().

device_uncache_fw_images_delay() will be called for PM_POST_RESTORE, which means the hibernation restore got error. How about the successful restore case, calling driver will free its firmware_buf after loading?

> To me this last part smells like a possible source of issue (not sure) if we might
> suspend/resume a lot in short period of time, this theory could be tested by toggling
> on/of this debugfs interface I suggested while having requests of different types
> blast in.
Agree, it might create an issue if the system is get into
restore_prepare again before the device_uncache_fw_images_delay clear
the cache, why we need the 10 * MSEC_PER_SEC delay? In theory,
fwc->name_lock should protect the case though.
>
> This is the sort of testing which would really help here.
>
> Likewise, if you are extending functionality please consider ways to break it :)
Understand the need to test the firmware caching part. For non-caching
test case, will it be enough if we can test that the noncache setting
will ban the firmware name be added to the fwc->fw_names list?
> and test against it. Please think about these things carefully, its what will
> change the stability for the better long term of our loader infrastructure.
>
> Luis
> --
> To unsubscribe from this list: send the line "unsubscribe linux-fpga" in
> the body of a message to [email protected]
> More majordomo info at http://vger.kernel.org/majordomo-info.html
>


2017-06-07 17:59:26

by Luis Chamberlain

[permalink] [raw]
Subject: Re: [PATCHv2 0/3] Enable no_cache flag to driver_data

On Tue, Jun 06, 2017 at 02:31:54PM -0500, Li, Yi wrote:
> > We use the cache upon suspend to cache the firmware so that upon resume a
> > request will use that cache, to avoid the file lookup on disk. Doing a test
> > with qemu suspend + resume is possible but that requires having access to
> > the qemu monitor interface and doing something like this to trigger a wakeup:
> >
> > echo system_wakeup | socat - /dev/pts/7,raw,echo=0,crnl

BTW if it helps for testing:

https://github.com/mcgrof/kvm-boot

> I am studying at the firmware caching codes and have to say it's very
> complicated. :-( Here are some questions:

It is! For this reason I tried to document it, have you read this:

https://www.kernel.org/doc/html/latest/driver-api/firmware/firmware_cache.html

This is certainly missing the notes about how some of this is loosely re-used
for possible duplicate requests though, but I think I've mentioned this on
my review of your patches so far.

> 1. Since device_cache_fw_images invokes dev_cache_fw_image through
> dpm_for_each_dev, adding a debugfs driver to kick it can only cache firmware
> for those associated with devices which has PM enabled, which do not include
> the driver_data_test_device. Any suggestions?

Ah, consider adding PM to driver_data_test_device ?

May be worth updating the documentation bout this requirement too!

> 2. Look into dev_cache_fw_image function, devres_for_each_res will walk
> through the firmware have been loaded before (through assign_firmware_buf ->
> fw_add_devm_name) and add to the todo list, eventually it will create the
> fw_names list. So in the test driver, we need to load the firmware once
> before calling the kick?

Yes, makes sense, given its a firmware cache, it would only make sense to
cache firmware for requests already made.

An important note I made in the documentation which you did not mention
but should be important for your own sanity:

"The firmware devres entry is maintained throughout the lifetime of the device.
This means that even if you release_firmware() the firmware cache will still be
used on resume from suspend."

This means the cache will be tried even if the driver did use release_firmware()
after request_firmware().

> > static void device_cache_fw_images(void)
> > {
> > ...
> > fwc->state = FW_LOADER_START_CACHE;
> > dpm_for_each_dev(NULL, dev_cache_fw_image);
> > ...
> > }
> This only applies to the devices have PM enabled.

Makes sense!

>
> device_uncache_fw_images_delay() will be called for PM_POST_RESTORE, which
> means the hibernation restore got error. How about the successful restore
> case, calling driver will free its firmware_buf after loading?

As it is on my latest 20170605-driver-data [0] (but should be just as yours if you
used a recent branch of mine), fw_pm_notify() uses:

static int fw_pm_notify(struct notifier_block *notify_block,
unsigned long mode, void *unused)
{
switch (mode) {
...
case PM_POST_SUSPEND:
case PM_POST_HIBERNATION:
case PM_POST_RESTORE:
/*
* In case that system sleep failed and syscore_suspend is
* not called.
*/
mutex_lock(&fw_lock);
fw_cache.state = FW_LOADER_NO_CACHE;
mutex_unlock(&fw_lock);
enable_firmware();

device_uncache_fw_images_delay(10 * MSEC_PER_SEC);
break;
}
}

So it uses also PM_POST_RESTORE. I think that covers it all ? As it
is today we handle the firmware cache for all drivers, it should be
transparent to drivers.

[0] https://git.kernel.org/pub/scm/linux/kernel/git/mcgrof/linux-next.git/log/?h=20170605-driver-data

> > To me this last part smells like a possible source of issue (not sure) if we might
> > suspend/resume a lot in short period of time, this theory could be tested by toggling
> > on/of this debugfs interface I suggested while having requests of different types
> > blast in.
> Agree, it might create an issue if the system is get into restore_prepare
> again before the device_uncache_fw_images_delay clear the cache, why we need
> the 10 * MSEC_PER_SEC delay? In theory, fwc->name_lock should protect the
> case though.

One possible solution may be to cancel_delayed_work(&fw_cache.work)
or cancel_delayed_work_sync(&fw_cache.work) on suspend, and then
force it to run *right away* so we clear any pending old cache.

One performance issue with this is we are clearing some possible cache we are
about to re-create, so mod_delayed_work() seems more efficient -- provided we
have accounting notes to know no new firmware requests have trickled in since
the last cache build. This seems rather complicated so a first initial approach
to just kill all known cache before suspend seems easy and fair.

BTW such a fix might be a stable candidate if you find a real behavioural
issue with the kernel !

> > This is the sort of testing which would really help here.
> >
> > Likewise, if you are extending functionality please consider ways to break it :)
> Understand the need to test the firmware caching part. For non-caching test
> case, will it be enough if we can test that the noncache setting will ban
> the firmware name be added to the fwc->fw_names list?

Good question. Let's see... in the future a device might have two requests
with the same firmware, one with a cache enabled and one without it -- I suppose
for now we should perhaps disable a non-cache request if one already exists
with a cache request ? The code for that probably is not there... But other than
this corner case...

Yeah I think that's a reasonable test case. Such a code seems may need a debug
export symbol to allow drivers to query for this info, if so feel free to a
respective debug kconfig entry for it.

Luis

2017-06-07 21:00:54

by Li, Yi

[permalink] [raw]
Subject: Re: [PATCHv2 0/3] Enable no_cache flag to driver_data

On 6/7/2017 12:59 PM, Luis R. Rodriguez wrote:
> On Tue, Jun 06, 2017 at 02:31:54PM -0500, Li, Yi wrote:
>>> We use the cache upon suspend to cache the firmware so that upon resume a
>>> request will use that cache, to avoid the file lookup on disk. Doing a test
>>> with qemu suspend + resume is possible but that requires having access to
>>> the qemu monitor interface and doing something like this to trigger a wakeup:
>>>
>>> echo system_wakeup | socat - /dev/pts/7,raw,echo=0,crnl
>
> BTW if it helps for testing:
>
> https://github.com/mcgrof/kvm-boot

Thanks, will explore that.

>
>> I am studying at the firmware caching codes and have to say it's very
>> complicated. :-( Here are some questions:
>
> It is! For this reason I tried to document it, have you read this:
>
> https://www.kernel.org/doc/html/latest/driver-api/firmware/firmware_cache.html

Good info, thanks.

>
> This is certainly missing the notes about how some of this is loosely re-used
> for possible duplicate requests though, but I think I've mentioned this on
> my review of your patches so far.
>
>> 1. Since device_cache_fw_images invokes dev_cache_fw_image through
>> dpm_for_each_dev, adding a debugfs driver to kick it can only cache firmware
>> for those associated with devices which has PM enabled, which do not include
>> the driver_data_test_device. Any suggestions?
>
> Ah, consider adding PM to driver_data_test_device ?
>

That's what I am looking now. But per my understanding, the misc_device
does not have the hook to add PM support like those platform device
drivers do. Please let me know if there is a way to do that.

> May be worth updating the documentation bout this requirement too!
>
>> 2. Look into dev_cache_fw_image function, devres_for_each_res will walk
>> through the firmware have been loaded before (through assign_firmware_buf ->
>> fw_add_devm_name) and add to the todo list, eventually it will create the
>> fw_names list. So in the test driver, we need to load the firmware once
>> before calling the kick?
>
> Yes, makes sense, given its a firmware cache, it would only make sense to
> cache firmware for requests already made.
>
> An important note I made in the documentation which you did not mention
> but should be important for your own sanity:
>
> "The firmware devres entry is maintained throughout the lifetime of the device.
> This means that even if you release_firmware() the firmware cache will still be
> used on resume from suspend."
>
> This means the cache will be tried even if the driver did use release_firmware()
> after request_firmware().

Yes, it makes sense and good to know and it's explained in the
kernel.org link you pointed out earlier. The devres entry records all
the firmware have been loaded before, it will save restore prepare time
for all the drivers to mark their firmware "no cache" if there is no
need to reload the firmware during restore.

>
>>> static void device_cache_fw_images(void)
>>> {
>>> ...
>>> fwc->state = FW_LOADER_START_CACHE;
>>> dpm_for_each_dev(NULL, dev_cache_fw_image);
>>> ...
>>> }
>> This only applies to the devices have PM enabled.
>
> Makes sense!
>
>>
>> device_uncache_fw_images_delay() will be called for PM_POST_RESTORE, which
>> means the hibernation restore got error. How about the successful restore
>> case, calling driver will free its firmware_buf after loading?
>
> As it is on my latest 20170605-driver-data [0] (but should be just as yours if you
> used a recent branch of mine), fw_pm_notify() uses:
>
> static int fw_pm_notify(struct notifier_block *notify_block,
> unsigned long mode, void *unused)
> {
> switch (mode) {
> ...
> case PM_POST_SUSPEND:
> case PM_POST_HIBERNATION:
> case PM_POST_RESTORE:
> /*
> * In case that system sleep failed and syscore_suspend is
> * not called.
> */
> mutex_lock(&fw_lock);
> fw_cache.state = FW_LOADER_NO_CACHE;
> mutex_unlock(&fw_lock);
> enable_firmware();
>
> device_uncache_fw_images_delay(10 * MSEC_PER_SEC);
> break;
> }
> }
>
> So it uses also PM_POST_RESTORE. I think that covers it all ? As it
> is today we handle the firmware cache for all drivers, it should be
> transparent to drivers.

Ah, the comment of "sleep failed" mislead me. :)

>
> [0] https://git.kernel.org/pub/scm/linux/kernel/git/mcgrof/linux-next.git/log/?h=20170605-driver-data
>
>>> To me this last part smells like a possible source of issue (not sure) if we might
>>> suspend/resume a lot in short period of time, this theory could be tested by toggling
>>> on/of this debugfs interface I suggested while having requests of different types
>>> blast in.
>> Agree, it might create an issue if the system is get into restore_prepare
>> again before the device_uncache_fw_images_delay clear the cache, why we need
>> the 10 * MSEC_PER_SEC delay? In theory, fwc->name_lock should protect the
>> case though.
>
> One possible solution may be to cancel_delayed_work(&fw_cache.work)
> or cancel_delayed_work_sync(&fw_cache.work) on suspend, and then
> force it to run *right away* so we clear any pending old cache.
>
> One performance issue with this is we are clearing some possible cache we are
> about to re-create, so mod_delayed_work() seems more efficient -- provided we
> have accounting notes to know no new firmware requests have trickled in since
> the last cache build. This seems rather complicated so a first initial approach
> to just kill all known cache before suspend seems easy and fair.
>
> BTW such a fix might be a stable candidate if you find a real behavioural
> issue with the kernel !
>
>>> This is the sort of testing which would really help here.
>>>
>>> Likewise, if you are extending functionality please consider ways to break it :)
>> Understand the need to test the firmware caching part. For non-caching test
>> case, will it be enough if we can test that the noncache setting will ban
>> the firmware name be added to the fwc->fw_names list?
>
> Good question. Let's see... in the future a device might have two requests
> with the same firmware, one with a cache enabled and one without it -- I suppose
> for now we should perhaps disable a non-cache request if one already exists
> with a cache request ? The code for that probably is not there... But other than
> this corner case...
>
> Yeah I think that's a reasonable test case. Such a code seems may need a debug
> export symbol to allow drivers to query for this info, if so feel free to a
> respective debug kconfig entry for it.

Sounds good, thanks!

>
> Luis
> --
> To unsubscribe from this list: send the line "unsubscribe linux-fpga" in
> the body of a message to [email protected]
> More majordomo info at http://vger.kernel.org/majordomo-info.html
>

2017-06-07 23:02:10

by Luis Chamberlain

[permalink] [raw]
Subject: Re: [PATCHv2 0/3] Enable no_cache flag to driver_data

On Wed, Jun 07, 2017 at 04:00:42PM -0500, Li, Yi wrote:
> On 6/7/2017 12:59 PM, Luis R. Rodriguez wrote:
> > Ah, consider adding PM to driver_data_test_device ?
>
> That's what I am looking now. But per my understanding, the misc_device does
> not have the hook to add PM support like those platform device drivers do.
> Please let me know if there is a way to do that.

Ah, hrm. Yeah I'd have to look into it, why does misc devs not have support?

Otherwise you could see if you can just modify the test driver to be a platform
driver, these are easy to register, see. For a simple example see
platform_device_register_simple() use on net/wireless/reg.c.

> > > device_uncache_fw_images_delay() will be called for PM_POST_RESTORE, which
> > > means the hibernation restore got error. How about the successful restore
> > > case, calling driver will free its firmware_buf after loading?
> >
> > As it is on my latest 20170605-driver-data [0] (but should be just as yours if you
> > used a recent branch of mine), fw_pm_notify() uses:
> >
> > static int fw_pm_notify(struct notifier_block *notify_block,
> > unsigned long mode, void *unused)
> > {
> > switch (mode) {
> > ...
> > case PM_POST_SUSPEND:
> > case PM_POST_HIBERNATION:
> > case PM_POST_RESTORE:
> > /*
> > * In case that system sleep failed and syscore_suspend is
> > * not called.
> > */
> > mutex_lock(&fw_lock);
> > fw_cache.state = FW_LOADER_NO_CACHE;
> > mutex_unlock(&fw_lock);
> > enable_firmware();
> >
> > device_uncache_fw_images_delay(10 * MSEC_PER_SEC);
> > break;
> > }
> > }
> >
> > So it uses also PM_POST_RESTORE. I think that covers it all ? As it
> > is today we handle the firmware cache for all drivers, it should be
> > transparent to drivers.
>
> Ah, the comment of "sleep failed" mislead me. :)

Ah no that comment is about more of a brain fuck with the internals of suspend
on Linux, this may be very confusing. Its for this reason why I recently
submitted a patch which helps explain all this more, the patch was recently
merged by Greg KH, it expands on the documentation of fw_pm_notify() so let me
paste the diagram which is relevant form the documentation added:

/**
* fw_pm_notify - notifier for suspend/resume
* @notify_block: unused
* @mode: mode we are switching to
* @unused: unused
*
* Used to modify the firmware_class state as we move in between states.
* The firmware_class implements a firmware cache to enable device driver
* to fetch firmware upon resume before the root filesystem is ready. We
* disable API calls which do not use the built-in firmware or the firmware
* cache when we know these calls will not work.
*
* The inner logic behind all this is a bit complex so it is worth summarizing
* the kernel's own suspend/resume process with context and focus on how this
* can impact the firmware API.
*
* First a review on how we go to suspend::
*
* pm_suspend() --> enter_state() -->
* sys_sync()
* suspend_prepare() -->
* __pm_notifier_call_chain(PM_SUSPEND_PREPARE, ...);
* suspend_freeze_processes() -->
* freeze_processes() -->
* __usermodehelper_set_disable_depth(UMH_DISABLED);
* freeze all tasks ...
* freeze_kernel_threads()
* suspend_devices_and_enter() -->
* dpm_suspend_start() -->
* dpm_prepare()
* dpm_suspend()
* suspend_enter() -->
* platform_suspend_prepare()
* dpm_suspend_late()
* freeze_enter()
* syscore_suspend()
*
* When we resume we bail out of a loop from suspend_devices_and_enter() and
* unwind back out to the caller enter_state() where we were before as follows::
*
* enter_state() -->
* suspend_devices_and_enter() --> (bail from loop)
* dpm_resume_end() -->
* dpm_resume()
* dpm_complete()
* suspend_finish() -->
* suspend_thaw_processes() -->
* thaw_processes() -->
* __usermodehelper_set_disable_depth(UMH_FREEZING);
* thaw_workqueues();
* thaw all processes ...
* usermodehelper_enable();
* pm_notifier_call_chain(PM_POST_SUSPEND);
*
* fw_pm_notify() works through pm_notifier_call_chain().
*/

The key is that even if we fail at syscore_suspend() we will bail out of the
loop on suspend_devices_and_enter() and eventually get our notifier called
with PM_POST_SUSPEND, in the case of suspend/resume. The reason the
comment is there is that even though we have the notifier we also have
the fw_syscore_ops, and our fw_suspend(). If our fw_suspend() fails to
get called then this does not happen:

static int fw_suspend(void)
{
fw_cache.state = FW_LOADER_NO_CACHE;
return 0;
}

Luis