2022-11-30 16:57:23

by Ricardo Ribalda

[permalink] [raw]
Subject: [PATCH v7 0/2] ASoC: SOF: Fix deadlock when shutdown a frozen userspace

Since: 83bfc7e793b5 ("ASoC: SOF: core: unregister clients and machine drivers in .shutdown")
we wait for all the workloads to be completed during shutdown. This was done to
avoid a stall once the device is started again.

Unfortunately this has the side effect of stalling kexec(), if the userspace
is frozen. Let's handle that case.

To: Pierre-Louis Bossart <[email protected]>
To: Liam Girdwood <[email protected]>
To: Peter Ujfalusi <[email protected]>
To: Bard Liao <[email protected]>
To: Ranjani Sridharan <[email protected]>
To: Kai Vehmanen <[email protected]>
To: Daniel Baluta <[email protected]>
To: Mark Brown <[email protected]>
To: Jaroslav Kysela <[email protected]>
To: Takashi Iwai <[email protected]>
To: Eric Biederman <[email protected]>
To: Chromeos Kdump <[email protected]>
To: Steven Rostedt <[email protected]>
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Signed-off-by: Ricardo Ribalda <[email protected]>
---
Changes in v7:
- Fix commit message (Thanks Pierre-Louis).
- Link to v6: https://lore.kernel.org/r/[email protected]

Changes in v6:
- Check if we are in kexec with the userspace frozen.
- Link to v5: https://lore.kernel.org/r/[email protected]

Changes in v5:
- Edit subject prefix.
- Link to v4: https://lore.kernel.org/r/[email protected]

Changes in v4:
- Do not call snd_sof_machine_unregister from shutdown.
- Link to v3: https://lore.kernel.org/r/[email protected]

Changes in v3:
- Wrap pm_freezing in a function.
- Link to v2: https://lore.kernel.org/r/[email protected]

Changes in v2:
- Only use pm_freezing if CONFIG_FREEZER .
- Link to v1: https://lore.kernel.org/r/[email protected]

---
Ricardo Ribalda (2):
kexec: Introduce kexec_with_frozen_processes
ASoC: SOF: Fix deadlock when shutdown a frozen userspace

include/linux/kexec.h | 3 +++
kernel/kexec_core.c | 5 +++++
sound/soc/sof/core.c | 4 +++-
3 files changed, 11 insertions(+), 1 deletion(-)
---
base-commit: 4312098baf37ee17a8350725e6e0d0e8590252d4
change-id: 20221127-snd-freeze-1ee143228326

Best regards,
--
Ricardo Ribalda <[email protected]>


2022-11-30 16:57:28

by Ricardo Ribalda

[permalink] [raw]
Subject: [PATCH v7 2/2] ASoC: SOF: Fix deadlock when shutdown a frozen userspace

During kexec(), the userspace might frozen. Therefore we cannot wait
for it to complete.

During a kexec with frozen processe do not unregister the clients.

This fixes:

[ 84.943749] Freezing user space processes ... (elapsed 0.111 seconds) done.
[ 246.784446] INFO: task kexec-lite:5123 blocked for more than 122 seconds.
[ 246.819035] Call Trace:
[ 246.821782] <TASK>
[ 246.824186] __schedule+0x5f9/0x1263
[ 246.828231] schedule+0x87/0xc5
[ 246.831779] snd_card_disconnect_sync+0xb5/0x127
...
[ 246.889249] snd_sof_device_shutdown+0xb4/0x150
[ 246.899317] pci_device_shutdown+0x37/0x61
[ 246.903990] device_shutdown+0x14c/0x1d6
[ 246.908391] kernel_kexec+0x45/0xb9

And:

[ 246.893222] INFO: task kexec-lite:4891 blocked for more than 122 seconds.
[ 246.927709] Call Trace:
[ 246.930461] <TASK>
[ 246.932819] __schedule+0x5f9/0x1263
[ 246.936855] ? fsnotify_grab_connector+0x5c/0x70
[ 246.942045] schedule+0x87/0xc5
[ 246.945567] schedule_timeout+0x49/0xf3
[ 246.949877] wait_for_completion+0x86/0xe8
[ 246.954463] snd_card_free+0x68/0x89
...
[ 247.001080] platform_device_unregister+0x12/0x35

Cc: [email protected]
Fixes: 83bfc7e793b5 ("ASoC: SOF: core: unregister clients and machine drivers in .shutdown")
Signed-off-by: Ricardo Ribalda <[email protected]>
---
sound/soc/sof/core.c | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/sound/soc/sof/core.c b/sound/soc/sof/core.c
index 3e6141d03770..4301f347bb90 100644
--- a/sound/soc/sof/core.c
+++ b/sound/soc/sof/core.c
@@ -9,6 +9,7 @@
//

#include <linux/firmware.h>
+#include <linux/kexec.h>
#include <linux/module.h>
#include <sound/soc.h>
#include <sound/sof.h>
@@ -484,7 +485,8 @@ int snd_sof_device_shutdown(struct device *dev)
* make sure clients and machine driver(s) are unregistered to force
* all userspace devices to be closed prior to the DSP shutdown sequence
*/
- sof_unregister_clients(sdev);
+ if (!kexec_with_frozen_processes())
+ sof_unregister_clients(sdev);

snd_sof_machine_unregister(sdev, pdata);


--
2.38.1.584.g0f3c55d4c2-goog-b4-0.11.0-dev-696ae

2022-11-30 17:10:57

by Ricardo Ribalda

[permalink] [raw]
Subject: [PATCH v7 1/2] kexec: Introduce kexec_with_frozen_processes

Drivers running .shutdown() might want to wait for userspace to complete
before exiting.

If userspace is frozen and we are running kexec they will stall the
computer.

Add a way for them to figure out if they should just skip waiting for
userspace.

Cc: [email protected]
Fixes: 83bfc7e793b5 ("ASoC: SOF: core: unregister clients and machine drivers in .shutdown")
Signed-off-by: Ricardo Ribalda <[email protected]>
---
include/linux/kexec.h | 3 +++
kernel/kexec_core.c | 5 +++++
2 files changed, 8 insertions(+)

diff --git a/include/linux/kexec.h b/include/linux/kexec.h
index 41a686996aaa..c22711e0f7b5 100644
--- a/include/linux/kexec.h
+++ b/include/linux/kexec.h
@@ -426,6 +426,8 @@ extern int kexec_load_disabled;
/* flag to track if kexec reboot is in progress */
extern bool kexec_in_progress;

+bool kexec_with_frozen_processes(void);
+
int crash_shrink_memory(unsigned long new_size);
ssize_t crash_get_memory_size(void);

@@ -507,6 +509,7 @@ static inline void __crash_kexec(struct pt_regs *regs) { }
static inline void crash_kexec(struct pt_regs *regs) { }
static inline int kexec_should_crash(struct task_struct *p) { return 0; }
static inline int kexec_crash_loaded(void) { return 0; }
+static inline bool kexec_with_frozen_processes(void) { return false; }
#define kexec_in_progress false
#endif /* CONFIG_KEXEC_CORE */

diff --git a/kernel/kexec_core.c b/kernel/kexec_core.c
index ca2743f9c634..8bc8257ee7ca 100644
--- a/kernel/kexec_core.c
+++ b/kernel/kexec_core.c
@@ -54,6 +54,11 @@ note_buf_t __percpu *crash_notes;
/* Flag to indicate we are going to kexec a new kernel */
bool kexec_in_progress = false;

+bool kexec_with_frozen_processes(void)
+{
+ return kexec_in_progress && pm_freezing;
+}
+EXPORT_SYMBOL(kexec_with_frozen_processes);

/* Location of the reserved area for the crash kernel */
struct resource crashk_res = {

--
2.38.1.584.g0f3c55d4c2-goog-b4-0.11.0-dev-696ae

2022-12-01 05:34:39

by kernel test robot

[permalink] [raw]
Subject: Re: [PATCH v7 1/2] kexec: Introduce kexec_with_frozen_processes

Hi Ricardo,

I love your patch! Yet something to improve:

[auto build test ERROR on 4312098baf37ee17a8350725e6e0d0e8590252d4]

url: https://github.com/intel-lab-lkp/linux/commits/Ricardo-Ribalda/ASoC-SOF-Fix-deadlock-when-shutdown-a-frozen-userspace/20221201-003214
base: 4312098baf37ee17a8350725e6e0d0e8590252d4
patch link: https://lore.kernel.org/r/20221127-snd-freeze-v7-1-127c582f1ca4%40chromium.org
patch subject: [PATCH v7 1/2] kexec: Introduce kexec_with_frozen_processes
config: ia64-randconfig-r022-20221128
compiler: ia64-linux-gcc (GCC) 12.1.0
reproduce (this is a W=1 build):
wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
chmod +x ~/bin/make.cross
# https://github.com/intel-lab-lkp/linux/commit/ddb7195380bb431ff9f4505d5fcfbef756b80a3a
git remote add linux-review https://github.com/intel-lab-lkp/linux
git fetch --no-tags linux-review Ricardo-Ribalda/ASoC-SOF-Fix-deadlock-when-shutdown-a-frozen-userspace/20221201-003214
git checkout ddb7195380bb431ff9f4505d5fcfbef756b80a3a
# save the config file
mkdir build_dir && cp config build_dir/.config
COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-12.1.0 make.cross W=1 O=build_dir ARCH=ia64 SHELL=/bin/bash

If you fix the issue, kindly add following tag where applicable
| Reported-by: kernel test robot <[email protected]>

All errors (new ones prefixed by >>):

kernel/kexec_core.c: In function 'kexec_with_frozen_processes':
>> kernel/kexec_core.c:59:37: error: 'pm_freezing' undeclared (first use in this function); did you mean 'freezing'?
59 | return kexec_in_progress && pm_freezing;
| ^~~~~~~~~~~
| freezing
kernel/kexec_core.c:59:37: note: each undeclared identifier is reported only once for each function it appears in
kernel/kexec_core.c:60:1: error: control reaches end of non-void function [-Werror=return-type]
60 | }
| ^
cc1: some warnings being treated as errors


vim +59 kernel/kexec_core.c

56
57 bool kexec_with_frozen_processes(void)
58 {
> 59 return kexec_in_progress && pm_freezing;
60 }
61 EXPORT_SYMBOL(kexec_with_frozen_processes);
62

--
0-DAY CI Kernel Test Service
https://01.org/lkp


Attachments:
(No filename) (2.36 kB)
config (128.31 kB)
Download all attachments

2022-12-01 09:08:19

by Kai Vehmanen

[permalink] [raw]
Subject: Re: [PATCH v7 2/2] ASoC: SOF: Fix deadlock when shutdown a frozen userspace

Hi,

On Wed, 30 Nov 2022, Ricardo Ribalda wrote:

> During kexec(), the userspace might frozen. Therefore we cannot wait
> for it to complete.
[...]
> --- a/sound/soc/sof/core.c
> +++ b/sound/soc/sof/core.c
> @@ -9,6 +9,7 @@
> //
>
> #include <linux/firmware.h>
> +#include <linux/kexec.h>
> #include <linux/module.h>
> #include <sound/soc.h>
> #include <sound/sof.h>
> @@ -484,7 +485,8 @@ int snd_sof_device_shutdown(struct device *dev)
> * make sure clients and machine driver(s) are unregistered to force
> * all userspace devices to be closed prior to the DSP shutdown sequence
> */
> - sof_unregister_clients(sdev);
> + if (!kexec_with_frozen_processes())
> + sof_unregister_clients(sdev);
>
> snd_sof_machine_unregister(sdev, pdata);

I think the case you hit was specifically snd_card_disconnect_sync() that
gets called via snd_sof_machine_unregister(), right, so you'd have to skip
both sof_unregister_clients() and the machine_unregister().

Skipping ok might be an ok solution here. There's clearly a problem and we
cannot just drop these calls in the general case (when we are going to
S5), but in the specific case of kexec, this is probably safe. And I agree
one way or another this needs to be fixed. Pierre and others what do you
think?

Br, Kai

2022-12-01 21:12:53

by kernel test robot

[permalink] [raw]
Subject: Re: [PATCH v7 1/2] kexec: Introduce kexec_with_frozen_processes

Hi Ricardo,

I love your patch! Yet something to improve:

[auto build test ERROR on 4312098baf37ee17a8350725e6e0d0e8590252d4]

url: https://github.com/intel-lab-lkp/linux/commits/Ricardo-Ribalda/ASoC-SOF-Fix-deadlock-when-shutdown-a-frozen-userspace/20221201-003214
base: 4312098baf37ee17a8350725e6e0d0e8590252d4
patch link: https://lore.kernel.org/r/20221127-snd-freeze-v7-1-127c582f1ca4%40chromium.org
patch subject: [PATCH v7 1/2] kexec: Introduce kexec_with_frozen_processes
config: s390-randconfig-r024-20221128
compiler: clang version 16.0.0 (https://github.com/llvm/llvm-project 6e4cea55f0d1104408b26ac574566a0e4de48036)
reproduce (this is a W=1 build):
wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
chmod +x ~/bin/make.cross
# install s390 cross compiling tool for clang build
# apt-get install binutils-s390x-linux-gnu
# https://github.com/intel-lab-lkp/linux/commit/ddb7195380bb431ff9f4505d5fcfbef756b80a3a
git remote add linux-review https://github.com/intel-lab-lkp/linux
git fetch --no-tags linux-review Ricardo-Ribalda/ASoC-SOF-Fix-deadlock-when-shutdown-a-frozen-userspace/20221201-003214
git checkout ddb7195380bb431ff9f4505d5fcfbef756b80a3a
# save the config file
mkdir build_dir && cp config build_dir/.config
COMPILER_INSTALL_PATH=$HOME/0day COMPILER=clang make.cross W=1 O=build_dir ARCH=s390 SHELL=/bin/bash

If you fix the issue, kindly add following tag where applicable
| Reported-by: kernel test robot <[email protected]>

All error/warnings (new ones prefixed by >>):

In file included from kernel/kexec_core.c:14:
In file included from include/linux/kexec.h:19:
In file included from arch/s390/include/asm/io.h:75:
include/asm-generic/io.h:547:31: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
val = __raw_readb(PCI_IOBASE + addr);
~~~~~~~~~~ ^
include/asm-generic/io.h:560:61: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
val = __le16_to_cpu((__le16 __force)__raw_readw(PCI_IOBASE + addr));
~~~~~~~~~~ ^
include/uapi/linux/byteorder/big_endian.h:37:59: note: expanded from macro '__le16_to_cpu'
#define __le16_to_cpu(x) __swab16((__force __u16)(__le16)(x))
^
include/uapi/linux/swab.h:102:54: note: expanded from macro '__swab16'
#define __swab16(x) (__u16)__builtin_bswap16((__u16)(x))
^
In file included from kernel/kexec_core.c:14:
In file included from include/linux/kexec.h:19:
In file included from arch/s390/include/asm/io.h:75:
include/asm-generic/io.h:573:61: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
val = __le32_to_cpu((__le32 __force)__raw_readl(PCI_IOBASE + addr));
~~~~~~~~~~ ^
include/uapi/linux/byteorder/big_endian.h:35:59: note: expanded from macro '__le32_to_cpu'
#define __le32_to_cpu(x) __swab32((__force __u32)(__le32)(x))
^
include/uapi/linux/swab.h:115:54: note: expanded from macro '__swab32'
#define __swab32(x) (__u32)__builtin_bswap32((__u32)(x))
^
In file included from kernel/kexec_core.c:14:
In file included from include/linux/kexec.h:19:
In file included from arch/s390/include/asm/io.h:75:
include/asm-generic/io.h:584:33: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
__raw_writeb(value, PCI_IOBASE + addr);
~~~~~~~~~~ ^
include/asm-generic/io.h:594:59: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
__raw_writew((u16 __force)cpu_to_le16(value), PCI_IOBASE + addr);
~~~~~~~~~~ ^
include/asm-generic/io.h:604:59: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
__raw_writel((u32 __force)cpu_to_le32(value), PCI_IOBASE + addr);
~~~~~~~~~~ ^
include/asm-generic/io.h:692:20: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
readsb(PCI_IOBASE + addr, buffer, count);
~~~~~~~~~~ ^
include/asm-generic/io.h:700:20: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
readsw(PCI_IOBASE + addr, buffer, count);
~~~~~~~~~~ ^
include/asm-generic/io.h:708:20: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
readsl(PCI_IOBASE + addr, buffer, count);
~~~~~~~~~~ ^
include/asm-generic/io.h:717:21: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
writesb(PCI_IOBASE + addr, buffer, count);
~~~~~~~~~~ ^
include/asm-generic/io.h:726:21: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
writesw(PCI_IOBASE + addr, buffer, count);
~~~~~~~~~~ ^
include/asm-generic/io.h:735:21: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
writesl(PCI_IOBASE + addr, buffer, count);
~~~~~~~~~~ ^
>> kernel/kexec_core.c:59:30: error: use of undeclared identifier 'pm_freezing'; did you mean 'freezing'?
return kexec_in_progress && pm_freezing;
^~~~~~~~~~~
freezing
include/linux/freezer.h:75:20: note: 'freezing' declared here
static inline bool freezing(struct task_struct *p) { return false; }
^
>> kernel/kexec_core.c:59:30: warning: address of function 'freezing' will always evaluate to 'true' [-Wpointer-bool-conversion]
return kexec_in_progress && pm_freezing;
~~ ^~~~~~~~~~~
kernel/kexec_core.c:59:30: note: prefix with the address-of operator to silence this warning
return kexec_in_progress && pm_freezing;
^
&
13 warnings and 1 error generated.


vim +59 kernel/kexec_core.c

56
57 bool kexec_with_frozen_processes(void)
58 {
> 59 return kexec_in_progress && pm_freezing;
60 }
61 EXPORT_SYMBOL(kexec_with_frozen_processes);
62

--
0-DAY CI Kernel Test Service
https://01.org/lkp


Attachments:
(No filename) (7.09 kB)
config (168.15 kB)
Download all attachments