2023-10-05 14:06:43

by Mukesh Ojha

[permalink] [raw]
Subject: [PATCH v3] firmware_loader: Abort new fw load request once firmware core knows about reboot

There could be following scenario where there is a ongoing reboot
is going from processA which tries to call all the reboot notifier
callback and one of them is firmware reboot call which tries to
abort all the ongoing firmware userspace request under fw_lock but
there could be another processB which tries to do request firmware,
which came just after abort done from ProcessA and ask for userspace
to load the firmware and this can stop the ongoing reboot ProcessA
to stall for next 60s(default timeout) which may not be expected
behaviour everyone like to see, instead we should abort any firmware
load request which came once firmware knows about the reboot through
notification.

ProcessA ProcessB

kernel_restart_prepare
blocking_notifier_call_chain
fw_shutdown_notify
kill_pending_fw_fallback_reqs
__fw_load_abort
fw_state_aborted request_firmware
__fw_state_set firmware_fallback_sysfs
... fw_load_from_user_helper
.. ...
. ..
usermodehelper_read_trylock
fw_load_sysfs_fallback
fw_sysfs_wait_timeout
usermodehelper_disable
__usermodehelper_disable
down_write()

Signed-off-by: Mukesh Ojha <[email protected]>
---
Changes from v2: https://lore.kernel.org/lkml/[email protected]/
- Renamed the flag to fw_abort_load.

Changes from v1: https://lore.kernel.org/lkml/[email protected]/
- Renamed the flag to fw_load_abort.
- Kept the flag under fw_lock.
- Repharsed commit text.

drivers/base/firmware_loader/fallback.c | 6 +++++-
drivers/base/firmware_loader/firmware.h | 1 +
drivers/base/firmware_loader/main.c | 1 +
3 files changed, 7 insertions(+), 1 deletion(-)

diff --git a/drivers/base/firmware_loader/fallback.c b/drivers/base/firmware_loader/fallback.c
index bf68e3947814..a162020e98f2 100644
--- a/drivers/base/firmware_loader/fallback.c
+++ b/drivers/base/firmware_loader/fallback.c
@@ -57,6 +57,10 @@ void kill_pending_fw_fallback_reqs(bool only_kill_custom)
if (!fw_priv->need_uevent || !only_kill_custom)
__fw_load_abort(fw_priv);
}
+
+ if (!only_kill_custom)
+ fw_abort_load = true;
+
mutex_unlock(&fw_lock);
}

@@ -86,7 +90,7 @@ static int fw_load_sysfs_fallback(struct fw_sysfs *fw_sysfs, long timeout)
}

mutex_lock(&fw_lock);
- if (fw_state_is_aborted(fw_priv)) {
+ if (fw_abort_load || fw_state_is_aborted(fw_priv)) {
mutex_unlock(&fw_lock);
retval = -EINTR;
goto out;
diff --git a/drivers/base/firmware_loader/firmware.h b/drivers/base/firmware_loader/firmware.h
index bf549d6500d7..8b2549910f36 100644
--- a/drivers/base/firmware_loader/firmware.h
+++ b/drivers/base/firmware_loader/firmware.h
@@ -86,6 +86,7 @@ struct fw_priv {

extern struct mutex fw_lock;
extern struct firmware_cache fw_cache;
+extern bool fw_abort_load;

static inline bool __fw_state_check(struct fw_priv *fw_priv,
enum fw_status status)
diff --git a/drivers/base/firmware_loader/main.c b/drivers/base/firmware_loader/main.c
index b58c42f1b1ce..2658bdcdf1ee 100644
--- a/drivers/base/firmware_loader/main.c
+++ b/drivers/base/firmware_loader/main.c
@@ -93,6 +93,7 @@ static inline struct fw_priv *to_fw_priv(struct kref *ref)
DEFINE_MUTEX(fw_lock);

struct firmware_cache fw_cache;
+bool fw_abort_load;

void fw_state_init(struct fw_priv *fw_priv)
{
--
2.7.4


2023-10-19 19:06:59

by Luis Chamberlain

[permalink] [raw]
Subject: Re: [PATCH v3] firmware_loader: Abort new fw load request once firmware core knows about reboot

On Thu, Oct 05, 2023 at 10:37:00AM +0530, Mukesh Ojha wrote:
> There could be following scenario where there is a ongoing reboot
> is going from processA which tries to call all the reboot notifier
> callback and one of them is firmware reboot call which tries to
> abort all the ongoing firmware userspace request under fw_lock but
> there could be another processB which tries to do request firmware,
> which came just after abort done from ProcessA and ask for userspace
> to load the firmware and this can stop the ongoing reboot ProcessA
> to stall for next 60s(default timeout) which may not be expected
> behaviour everyone like to see, instead we should abort any firmware
> load request which came once firmware knows about the reboot through
> notification.
>
> ProcessA ProcessB
>
> kernel_restart_prepare
> blocking_notifier_call_chain
> fw_shutdown_notify
> kill_pending_fw_fallback_reqs
> __fw_load_abort
> fw_state_aborted request_firmware
> __fw_state_set firmware_fallback_sysfs
> ... fw_load_from_user_helper
> .. ...
> . ..
> usermodehelper_read_trylock
> fw_load_sysfs_fallback
> fw_sysfs_wait_timeout
> usermodehelper_disable
> __usermodehelper_disable
> down_write()
>
> Signed-off-by: Mukesh Ojha <[email protected]>
> ---
> Changes from v2: https://lore.kernel.org/lkml/[email protected]/
> - Renamed the flag to fw_abort_load.
>
> Changes from v1: https://lore.kernel.org/lkml/[email protected]/
> - Renamed the flag to fw_load_abort.
> - Kept the flag under fw_lock.
> - Repharsed commit text.
>
> drivers/base/firmware_loader/fallback.c | 6 +++++-
> drivers/base/firmware_loader/firmware.h | 1 +
> drivers/base/firmware_loader/main.c | 1 +
> 3 files changed, 7 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/base/firmware_loader/fallback.c b/drivers/base/firmware_loader/fallback.c
> index bf68e3947814..a162020e98f2 100644
> --- a/drivers/base/firmware_loader/fallback.c
> +++ b/drivers/base/firmware_loader/fallback.c
> @@ -57,6 +57,10 @@ void kill_pending_fw_fallback_reqs(bool only_kill_custom)

This routine uses a bool for only_kill_custom for when the events we
should kill ar ecustom or not. Piggy backing on it to assume that the
negative of value represents a shutdown is abusing the semantics
and muddies the waters. So to avoid that, just extend the arguments
to kill_pending_fw_fallback_reqs() for a new bool shutdown, that allows
the code to be clearer and the intent is kept clear.

> if (!fw_priv->need_uevent || !only_kill_custom)
> __fw_load_abort(fw_priv);
> }
> +
> + if (!only_kill_custom)
> + fw_abort_load = true;
> +
> mutex_unlock(&fw_lock);
> }
>
> @@ -86,7 +90,7 @@ static int fw_load_sysfs_fallback(struct fw_sysfs *fw_sysfs, long timeout)
> }
>
> mutex_lock(&fw_lock);
> - if (fw_state_is_aborted(fw_priv)) {
> + if (fw_abort_load || fw_state_is_aborted(fw_priv)) {

However, do we really need this ? Could we just use:

if (system_state == SYSTEM_HALT ||
system_state == SYSTEM_POWER_OFF ||
system_state == SYSTEM_RESTART ||
fw_state_is_aborted(fw_priv))

?

Luis

2023-10-23 12:41:58

by Mukesh Ojha

[permalink] [raw]
Subject: Re: [PATCH v3] firmware_loader: Abort new fw load request once firmware core knows about reboot



On 10/20/2023 12:36 AM, Luis Chamberlain wrote:
> On Thu, Oct 05, 2023 at 10:37:00AM +0530, Mukesh Ojha wrote:
>> There could be following scenario where there is a ongoing reboot
>> is going from processA which tries to call all the reboot notifier
>> callback and one of them is firmware reboot call which tries to
>> abort all the ongoing firmware userspace request under fw_lock but
>> there could be another processB which tries to do request firmware,
>> which came just after abort done from ProcessA and ask for userspace
>> to load the firmware and this can stop the ongoing reboot ProcessA
>> to stall for next 60s(default timeout) which may not be expected
>> behaviour everyone like to see, instead we should abort any firmware
>> load request which came once firmware knows about the reboot through
>> notification.
>>
>> ProcessA ProcessB
>>
>> kernel_restart_prepare
>> blocking_notifier_call_chain
>> fw_shutdown_notify
>> kill_pending_fw_fallback_reqs
>> __fw_load_abort
>> fw_state_aborted request_firmware
>> __fw_state_set firmware_fallback_sysfs
>> ... fw_load_from_user_helper
>> .. ...
>> . ..
>> usermodehelper_read_trylock
>> fw_load_sysfs_fallback
>> fw_sysfs_wait_timeout
>> usermodehelper_disable
>> __usermodehelper_disable
>> down_write()
>>
>> Signed-off-by: Mukesh Ojha <[email protected]>
>> ---
>> Changes from v2: https://lore.kernel.org/lkml/[email protected]/
>> - Renamed the flag to fw_abort_load.
>>
>> Changes from v1: https://lore.kernel.org/lkml/[email protected]/
>> - Renamed the flag to fw_load_abort.
>> - Kept the flag under fw_lock.
>> - Repharsed commit text.
>>
>> drivers/base/firmware_loader/fallback.c | 6 +++++-
>> drivers/base/firmware_loader/firmware.h | 1 +
>> drivers/base/firmware_loader/main.c | 1 +
>> 3 files changed, 7 insertions(+), 1 deletion(-)
>>
>> diff --git a/drivers/base/firmware_loader/fallback.c b/drivers/base/firmware_loader/fallback.c
>> index bf68e3947814..a162020e98f2 100644
>> --- a/drivers/base/firmware_loader/fallback.c
>> +++ b/drivers/base/firmware_loader/fallback.c
>> @@ -57,6 +57,10 @@ void kill_pending_fw_fallback_reqs(bool only_kill_custom)
>
> This routine uses a bool for only_kill_custom for when the events we
> should kill ar ecustom or not. Piggy backing on it to assume that the
> negative of value represents a shutdown is abusing the semantics
> and muddies the waters. So to avoid that, just extend the arguments
> to kill_pending_fw_fallback_reqs() for a new bool shutdown, that allows
> the code to be clearer and the intent is kept clear.

Let me understand what does 'only_kill_custom' do,

1. It gets called from reboot notifier where it's default value is
false. So, the intention here is to kill all the ongoing request along
with !buf->need_uevent ?

commit c4b768934be613fb882e4e4090946218d76c8e1b
Author: Luis R. Rodriguez <[email protected]>
Date: Tue May 2 01:31:03 2017 -0700

firmware: share fw fallback killing on reboot/suspend

We kill pending fallback requests on suspend and reboot,
the only difference is that on suspend we only kill custom
fallback requests. Provide a wrapper that lets us customize
the request with a flag.

This also lets us simplify the #ifdef'ery over the calls.

Signed-off-by: Luis R. Rodriguez <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

2. The second call is from fw_pm_notify() which is only
interested in aborting non-uevent calls.

And in this patch since, we are calling it from reboot which is the
first case, so piggybacking on the 'only_kill_custom' would be fine for
this patch. However, if you think we should rename this
'only_kill_custom' to something like its inverse 'kill_all' and reverse
the below check to be more meaningful now ?

if (kill_all || !fw_priv->need_uevent)

>
>> if (!fw_priv->need_uevent || !only_kill_custom)
>> __fw_load_abort(fw_priv);
>> }
>> +
>> + if (!only_kill_custom)
>> + fw_abort_load = true;
>> +
>> mutex_unlock(&fw_lock);
>> }
>>
>> @@ -86,7 +90,7 @@ static int fw_load_sysfs_fallback(struct fw_sysfs *fw_sysfs, long timeout)
>> }
>>
>> mutex_lock(&fw_lock);
>> - if (fw_state_is_aborted(fw_priv)) {
>> + if (fw_abort_load || fw_state_is_aborted(fw_priv)) {
>
> However, do we really need this ? Could we just use:
>
> if (system_state == SYSTEM_HALT ||
> system_state == SYSTEM_POWER_OFF ||
> system_state == SYSTEM_RESTART ||
> fw_state_is_aborted(fw_priv))
>
> ?

There is slight window where system_state is not yet set and other
reboot notifiers after firmware one are running with ongoing
request_firmware().

-Mukesh

>
> Luis

2023-10-23 16:15:31

by Luis Chamberlain

[permalink] [raw]
Subject: Re: [PATCH v3] firmware_loader: Abort new fw load request once firmware core knows about reboot

On Mon, Oct 23, 2023 at 06:11:18PM +0530, Mukesh Ojha wrote:
> However, if you think we should rename this
> 'only_kill_custom' to something like its inverse 'kill_all' and reverse the
> below check to be more meaningful now ?
>
> if (kill_all || !fw_priv->need_uevent)

This seems like a better approach to make the intent clear and avoid
future confusion.

Luis

2023-10-26 14:34:51

by Mukesh Ojha

[permalink] [raw]
Subject: Re: [PATCH v3] firmware_loader: Abort new fw load request once firmware core knows about reboot



On 10/23/2023 9:45 PM, Luis Chamberlain wrote:
> On Mon, Oct 23, 2023 at 06:11:18PM +0530, Mukesh Ojha wrote:
>> However, if you think we should rename this
>> 'only_kill_custom' to something like its inverse 'kill_all' and reverse the
>> below check to be more meaningful now ?
>>
>> if (kill_all || !fw_priv->need_uevent)
>
> This seems like a better approach to make the intent clear and avoid
> future confusion.

Thanks, have sent it here.

https://lore.kernel.org/lkml/[email protected]/

-Mukesh

>
> Luis