2020-04-10 08:27:13

by Lu, Brent

[permalink] [raw]
Subject: [PATCH] ASoC: Intel: sst: ipc command timeout

After sending an ipc command to DSP, the host waits for the reply message
which will be read from SST_IPCD register in sst_byt_irq_thread() to
complete the transaction. Sometimes the value read from SST_IPCD register
is still the reply message for previous command instead of the waiting
command so ipc command timeout happens.

In an experiment we read the same SST_IPCD register again when the defect
happens and found the value of second read is different from previous one
and is the correct reply message. It suggests the DSP is okay but the way
we read the register may be the cause.

Currently the driver is using memcpy_fromio() to read the value of 64-bit
registers. This function is based on __builtin_memcpy() call and depends
on the implementation of compiler. Since this issue happens right after
the toolchain switched from clang 10 to clang 11, we replace the register
read with two readl() function calls to avoid all optimization from
compiler's library.

Signed-off-by: Brent Lu <[email protected]>
---
sound/soc/intel/common/sst-dsp.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/sound/soc/intel/common/sst-dsp.c b/sound/soc/intel/common/sst-dsp.c
index ec66be2..12af7aa 100644
--- a/sound/soc/intel/common/sst-dsp.c
+++ b/sound/soc/intel/common/sst-dsp.c
@@ -42,7 +42,7 @@ u64 sst_shim32_read64(void __iomem *addr, u32 offset)
{
u64 val;

- memcpy_fromio(&val, addr + offset, sizeof(val));
+ sst_memcpy_fromio_32(NULL, &val, addr + offset, sizeof(val));
return val;
}
EXPORT_SYMBOL_GPL(sst_shim32_read64);
--
2.7.4


2020-04-14 16:54:47

by Lu, Brent

[permalink] [raw]
Subject: RE: [PATCH] ASoC: Intel: sst: ipc command timeout

>
> I have mixed feelings about this.
>
> One one hand, this looks simple enough.
>
> But on the other hand we have other users of memcpy_fromio(), including
> SOF drivers, so what are the odds we have the same problems in other
> places? Wouldn't it be safer to either change this function so that it's
> behavior is not ambiguous or compiler-dependent, or fix the compiler?
>

Hi Pierre and Amadeusz,

I have to admit that I didn't dig into clang's __builtin_memcpy to see what's
happening inside so I don't have direct evidence to say it's clang's problem.
What I know is kernel built by clang10 works fine but have this issue once
changed to clang11. At first I also suspect that it's a timing issue so I checked
the command transaction. The transaction is simple, host writes command
in SST_IPCX register, the DSP then writes reply in SST_IPCD register and
trigger an interrupt. Finally the irq thread sst_byt_irq_thread() reads the
SST_IPCD register to complete the transaction. I added some debug messages
to see if there is something wrong in the transaction but it all looks good.

I am also confused that why this only happens to BYT but not BDW since they
share the same register accessing code in sst-dsp.c. I checked the code and
realized that in BDW, the irq thread (hsw_irq_thread) performs 32-bit register
read instead of 64-bit in BYT platform. Therefore I change the code in BYT to
use two readl() calls and found the problem is gone. My best guess is it's
related to the implementation of __builtin_memcpy() but not sure it's the
timing or implementing cause this problem.


Regards,
Brent

2020-04-15 17:41:28

by Amadeusz Sławiński

[permalink] [raw]
Subject: Re: [PATCH] ASoC: Intel: sst: ipc command timeout



On 4/10/2020 10:18 AM, Brent Lu wrote:
> After sending an ipc command to DSP, the host waits for the reply message
> which will be read from SST_IPCD register in sst_byt_irq_thread() to
> complete the transaction. Sometimes the value read from SST_IPCD register
> is still the reply message for previous command instead of the waiting
> command so ipc command timeout happens.
>
> In an experiment we read the same SST_IPCD register again when the defect
> happens and found the value of second read is different from previous one
> and is the correct reply message. It suggests the DSP is okay but the way
> we read the register may be the cause.
>
> Currently the driver is using memcpy_fromio() to read the value of 64-bit
> registers. This function is based on __builtin_memcpy() call and depends
> on the implementation of compiler. Since this issue happens right after
> the toolchain switched from clang 10 to clang 11, we replace the register
> read with two readl() function calls to avoid all optimization from
> compiler's library.
>
> Signed-off-by: Brent Lu <[email protected]>
> ---
> sound/soc/intel/common/sst-dsp.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/sound/soc/intel/common/sst-dsp.c b/sound/soc/intel/common/sst-dsp.c
> index ec66be2..12af7aa 100644
> --- a/sound/soc/intel/common/sst-dsp.c
> +++ b/sound/soc/intel/common/sst-dsp.c
> @@ -42,7 +42,7 @@ u64 sst_shim32_read64(void __iomem *addr, u32 offset)
> {
> u64 val;
>
> - memcpy_fromio(&val, addr + offset, sizeof(val));
> + sst_memcpy_fromio_32(NULL, &val, addr + offset, sizeof(val));
> return val;
> }
> EXPORT_SYMBOL_GPL(sst_shim32_read64);
>

I'm not convinced that it is a correct fix. From your description it
seems to be a timing issue and you are just masking a problem, while
real problem is flow which allows to read data before confirming that
DSP finished processing.

2020-04-15 21:39:01

by Pierre-Louis Bossart

[permalink] [raw]
Subject: Re: [PATCH] ASoC: Intel: sst: ipc command timeout



On 4/10/20 3:18 AM, Brent Lu wrote:
> After sending an ipc command to DSP, the host waits for the reply message
> which will be read from SST_IPCD register in sst_byt_irq_thread() to
> complete the transaction. Sometimes the value read from SST_IPCD register
> is still the reply message for previous command instead of the waiting
> command so ipc command timeout happens.
>
> In an experiment we read the same SST_IPCD register again when the defect
> happens and found the value of second read is different from previous one
> and is the correct reply message. It suggests the DSP is okay but the way
> we read the register may be the cause.
>
> Currently the driver is using memcpy_fromio() to read the value of 64-bit
> registers. This function is based on __builtin_memcpy() call and depends
> on the implementation of compiler. Since this issue happens right after
> the toolchain switched from clang 10 to clang 11, we replace the register
> read with two readl() function calls to avoid all optimization from
> compiler's library.

I have mixed feelings about this.

One one hand, this looks simple enough.

But on the other hand we have other users of memcpy_fromio(), including
SOF drivers, so what are the odds we have the same problems in other
places? Wouldn't it be safer to either change this function so that it's
behavior is not ambiguous or compiler-dependent, or fix the compiler?

>
> Signed-off-by: Brent Lu <[email protected]>
> ---
> sound/soc/intel/common/sst-dsp.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/sound/soc/intel/common/sst-dsp.c b/sound/soc/intel/common/sst-dsp.c
> index ec66be2..12af7aa 100644
> --- a/sound/soc/intel/common/sst-dsp.c
> +++ b/sound/soc/intel/common/sst-dsp.c
> @@ -42,7 +42,7 @@ u64 sst_shim32_read64(void __iomem *addr, u32 offset)
> {
> u64 val;
>
> - memcpy_fromio(&val, addr + offset, sizeof(val));
> + sst_memcpy_fromio_32(NULL, &val, addr + offset, sizeof(val));
> return val;
> }
> EXPORT_SYMBOL_GPL(sst_shim32_read64);
>

2020-04-15 21:54:16

by Cezary Rojewski

[permalink] [raw]
Subject: Re: [PATCH] ASoC: Intel: sst: ipc command timeout

On 2020-04-14 18:20, Lu, Brent wrote:
>>
>> I have mixed feelings about this.
>>
>> One one hand, this looks simple enough.
>>
>> But on the other hand we have other users of memcpy_fromio(), including
>> SOF drivers, so what are the odds we have the same problems in other
>> places? Wouldn't it be safer to either change this function so that it's
>> behavior is not ambiguous or compiler-dependent, or fix the compiler?
>>
>
> Hi Pierre and Amadeusz,
>
> I have to admit that I didn't dig into clang's __builtin_memcpy to see what's
> happening inside so I don't have direct evidence to say it's clang's problem.
> What I know is kernel built by clang10 works fine but have this issue once
> changed to clang11. At first I also suspect that it's a timing issue so I checked
> the command transaction. The transaction is simple, host writes command
> in SST_IPCX register, the DSP then writes reply in SST_IPCD register and
> trigger an interrupt. Finally the irq thread sst_byt_irq_thread() reads the
> SST_IPCD register to complete the transaction. I added some debug messages
> to see if there is something wrong in the transaction but it all looks good.
>
> I am also confused that why this only happens to BYT but not BDW since they
> share the same register accessing code in sst-dsp.c. I checked the code and
> realized that in BDW, the irq thread (hsw_irq_thread) performs 32-bit register
> read instead of 64-bit in BYT platform. Therefore I change the code in BYT to
> use two readl() calls and found the problem is gone. My best guess is it's
> related to the implementation of __builtin_memcpy() but not sure it's the
> timing or implementing cause this problem.
>
>
> Regards,
> Brent
>

Regs width difference between BDW and BYT comes from specification. BDW
has IPC registers which are 32 wide. This fact ain't exactly the reason
to modify sst_shim32_read64.

I'm sharing Amadeo's point of view. Your change should slow down
execution a bit - but that might be just what handlers needed to make
everything work again. Debug prints also slow down the execution what
could have prevented you from spotting the real problem.
Let's ignore the memcpy stuff for a moment - could you focus on
elaborating the scenario where such issue occurs? Your initial commit
message also skips important bits such as platform used when reproducing
and so on, please add them.

Thanks,
Czarek

2020-04-21 16:19:33

by Lu, Brent

[permalink] [raw]
Subject: RE: [PATCH] ASoC: Intel: sst: ipc command timeout

>
> Regs width difference between BDW and BYT comes from specification. BDW
> has IPC registers which are 32 wide. This fact ain't exactly the reason to modify
> sst_shim32_read64.
>
> I'm sharing Amadeo's point of view. Your change should slow down execution a
> bit - but that might be just what handlers needed to make everything work again.
> Debug prints also slow down the execution what could have prevented you from
> spotting the real problem.
> Let's ignore the memcpy stuff for a moment - could you focus on elaborating
> the scenario where such issue occurs? Your initial commit message also skips
> important bits such as platform used when reproducing and so on, please add
> them.
>
> Thanks,
> Czarek

This issue is reported to happen on BYT Chrome book (rambi) on Chrome-v4.4
branch after changing toolchain from clang10 to clang11.

The reproduce step is simple. Just run arecord multiple times (<10) then you will
see error message for hw_params ioctl failure.

$ arecord -D hw:1,0 -f S16_LE -c 2 /dev/null
$ ctrl+c

The error message suggests the stream commit (IPC_IA_ALLOC_STREAM, 0x20) IPC
command failed due to timeout but the msg id field of ipcd read after the timeout is
also 0x20. It seems to me that the command is success but the host driver does not
know it for some reason.

2020-03-15T23:02:06.614151+00:00 ERR kernel: [ 852.023766] baytrail-pcm-audio baytrail-pcm-audio: ipc: --message timeout-- ipcx 0x2220 isr 0xf0020 ipcd 0x2220 imrx 0x0
2020-03-15T23:02:06.614209+00:00 ERR kernel: [ 852.023796] baytrail-pcm-audio baytrail-pcm-audio: ipc: error stream commit failed
2020-03-15T23:02:06.614218+00:00 ERR kernel: [ 852.023813] Baytrail Audio: PCM: failed stream commit -110
2020-03-15T23:02:06.614225+00:00 ERR kernel: [ 852.023832] baytrail-pcm-audio baytrail-pcm-audio: ASoC: baytrail-pcm-audio hw params failed: -110

I add some messages to make the log easier to read. You can check the gerrit link if you
are interested:
https://chromium-review.googlesource.com/c/chromiumos/third_party/kernel/+/2131507

In the test patch I read the SST_IPCD register in sst_byt_irq_thread() twice and found
the value could be different. The IPC_IA_FREE_STREAM(0x21) seems to be the last IPC
command sent in the pcm_release of previous arecord command.

[ 125.009724] sound pcmC1D0c: snd_pcm_release:
[ 125.009732] baytrail-pcm-audio baytrail-pcm-audio: ipc_tx_message: header 0x8000000000000221 wait 1
[ 125.009760] baytrail-pcm-audio baytrail-pcm-audio: byt_tx_msg: header 0x8000000000000226
[ 125.009803] baytrail-pcm-audio baytrail-pcm-audio: sst_byt_irq_thread: header 0x297
[ 125.009823] baytrail-pcm-audio baytrail-pcm-audio: byt_tx_msg: header 0x8000000000000221
[ 125.009872] baytrail-pcm-audio baytrail-pcm-audio: sst_byt_irq_thread: header 0x297
[ 125.010442] baytrail-pcm-audio baytrail-pcm-audio: sst_byt_irq_thread: header 0x80000000000c2226
[ 125.010456] baytrail-pcm-audio baytrail-pcm-audio: sst_ipc_reply_find_msg: not match, msg header 0x8000000000000221
[ 125.010511] baytrail-pcm-audio baytrail-pcm-audio: sst_byt_irq_thread: header 0x8000000000000221
[ 125.011355] baytrail-pcm-audio baytrail-pcm-audio: tx_wait_done: ipc success, header 0x8000000000000221
[ 125.380506] sound pcmC1D0c: snd_pcm_capture_open:
[ 125.380544] sound pcmC1D0c: snd_pcm_capture_ioctl1: INFO
[ 125.380554] sound pcmC1D0c: snd_pcm_capture_ioctl1: PVERSION
[ 125.380561] sound pcmC1D0c: snd_pcm_capture_ioctl1: TTSTAMP
[ 125.380581] sound pcmC1D0c: snd_pcm_mmap:
[ 125.380622] sound pcmC1D0c: snd_pcm_mmap:
[ 125.380648] sound pcmC1D0c: snd_pcm_capture_ioctl1: INFO
[ 125.380717] sound pcmC1D0c: snd_pcm_capture_ioctl1: HW_REFINE
[ 125.380737] sound pcmC1D0c: snd_pcm_capture_ioctl1: HW_REFINE
[ 125.380766] sound pcmC1D0c: snd_pcm_capture_ioctl1: HW_REFINE
[ 125.380779] sound pcmC1D0c: snd_pcm_capture_ioctl1: HW_REFINE
[ 125.380789] sound pcmC1D0c: snd_pcm_capture_ioctl1: HW_REFINE
[ 125.380799] sound pcmC1D0c: snd_pcm_capture_ioctl1: HW_REFINE
[ 125.380808] sound pcmC1D0c: snd_pcm_capture_ioctl1: HW_REFINE
[ 125.380815] sound pcmC1D0c: snd_pcm_capture_ioctl1: HW_PARAMS
[ 125.380839] baytrail-pcm-audio baytrail-pcm-audio: ipc_tx_message: header 0x8000000000642220 wait 1
[ 125.380854] baytrail-pcm-audio baytrail-pcm-audio: byt_tx_msg: header 0x8000000000642220
=> write IPC_IA_ALLOC_STREAM message (0x20) to SST_IPCX
[ 125.380931] baytrail-pcm-audio baytrail-pcm-audio: sst_byt_irq_thread: header 0x8000000000000221
=> read message reply from SST_IPCD and an IPC_IA_FREE_STREAM(0x21) is returned
[ 125.380942] baytrail-pcm-audio baytrail-pcm-audio: sst_byt_irq_thread: ipcd 0x8000000000602220
=> read SST_IPCD again, this time the message id is what we are expected (0x20)
[ 125.380948] baytrail-pcm-audio baytrail-pcm-audio: sst_ipc_reply_find_msg: not match, msg header 0x8000000000642220
[ 125.681171] baytrail-pcm-audio baytrail-pcm-audio: tx_wait_done: ipc timeout, header 0x8000000000642220
[ 125.681204] baytrail-pcm-audio baytrail-pcm-audio: ipc: --message timeout-- ipcx 0x2220 isr 0x50020 ipcd 0x2220 imrx 0x0
[ 125.681212] baytrail-pcm-audio baytrail-pcm-audio: ipc: error stream commit failed
[ 125.681217] Baytrail Audio: PCM: failed stream commit -110
[ 125.681223] baytrail-pcm-audio baytrail-pcm-audio: ASoC: baytrail-pcm-audio hw params failed: -110

Regards,
Brent

2020-04-22 21:26:26

by Nick Desaulniers

[permalink] [raw]
Subject: Re: [PATCH] ASoC: Intel: sst: ipc command timeout

On Tue, Apr 21, 2020 at 9:16 AM Lu, Brent <[email protected]> wrote:
>
> >
> > Regs width difference between BDW and BYT comes from specification. BDW
> > has IPC registers which are 32 wide. This fact ain't exactly the reason to modify
> > sst_shim32_read64.
> >
> > I'm sharing Amadeo's point of view. Your change should slow down execution a
> > bit - but that might be just what handlers needed to make everything work again.
> > Debug prints also slow down the execution what could have prevented you from
> > spotting the real problem.
> > Let's ignore the memcpy stuff for a moment - could you focus on elaborating
> > the scenario where such issue occurs? Your initial commit message also skips
> > important bits such as platform used when reproducing and so on, please add
> > them.

Splicing in Manoj, from CrOS' LLVM team, who has done a ton of work
for building Linux kernels with Clang. Manoj posted this on an
internal thread about this issue, that I think could help inform the
discussion upstream:
```
I saw the upstream discussion and would like to clarify a few things:

1. __builtin_memcpy makes clang emit a call to memcpy.
2. In Linux kernel: memcpy implementation is provided by Kernel, not clang.
3. clang 10 used to optimize trivial memcpy e.g. 64-bit memcpy was
optimized to two 32 bit reads and not calling memcpy at all.
4. clang 11 does not optimize trivial memcpy if the program provides
its own memcpy implementation i.e. it defers memcpy to the
implementation defined memcpy.
4.1: As kernel provides its own memcpy, this code here now calls into
kernel memcpy .
5. If the code now does not work as expected, this is because it was
relying on compiler to optimize the code to two 32 bit reads and not
calling memcpy. i.e. it was just working by chance.

So, replacing call to memcpy by an implementation that does the reads
in a loop is the right thing to get the previous behavior.
```
In particular, I'm curious if the readb/readl/readw macros
(arch/x86/include/asm/io.h) should be used here, which are explicitly
designed to prevent multiple reads or write from being combined into
memcpy()'s for MMIO? Though I have no idea if that even applies here,
and have only been paying attention to this bug from afar, so sorry if
the suggestion is just noise.

> >
> > Thanks,
> > Czarek
>
> This issue is reported to happen on BYT Chrome book (rambi) on Chrome-v4.4
> branch after changing toolchain from clang10 to clang11.
>
> The reproduce step is simple. Just run arecord multiple times (<10) then you will
> see error message for hw_params ioctl failure.
>
> $ arecord -D hw:1,0 -f S16_LE -c 2 /dev/null
> $ ctrl+c
>
> The error message suggests the stream commit (IPC_IA_ALLOC_STREAM, 0x20) IPC
> command failed due to timeout but the msg id field of ipcd read after the timeout is
> also 0x20. It seems to me that the command is success but the host driver does not
> know it for some reason.
>
> 2020-03-15T23:02:06.614151+00:00 ERR kernel: [ 852.023766] baytrail-pcm-audio baytrail-pcm-audio: ipc: --message timeout-- ipcx 0x2220 isr 0xf0020 ipcd 0x2220 imrx 0x0
> 2020-03-15T23:02:06.614209+00:00 ERR kernel: [ 852.023796] baytrail-pcm-audio baytrail-pcm-audio: ipc: error stream commit failed
> 2020-03-15T23:02:06.614218+00:00 ERR kernel: [ 852.023813] Baytrail Audio: PCM: failed stream commit -110
> 2020-03-15T23:02:06.614225+00:00 ERR kernel: [ 852.023832] baytrail-pcm-audio baytrail-pcm-audio: ASoC: baytrail-pcm-audio hw params failed: -110
>
> I add some messages to make the log easier to read. You can check the gerrit link if you
> are interested:
> https://chromium-review.googlesource.com/c/chromiumos/third_party/kernel/+/2131507
>
> In the test patch I read the SST_IPCD register in sst_byt_irq_thread() twice and found
> the value could be different. The IPC_IA_FREE_STREAM(0x21) seems to be the last IPC
> command sent in the pcm_release of previous arecord command.
>
> [ 125.009724] sound pcmC1D0c: snd_pcm_release:
> [ 125.009732] baytrail-pcm-audio baytrail-pcm-audio: ipc_tx_message: header 0x8000000000000221 wait 1
> [ 125.009760] baytrail-pcm-audio baytrail-pcm-audio: byt_tx_msg: header 0x8000000000000226
> [ 125.009803] baytrail-pcm-audio baytrail-pcm-audio: sst_byt_irq_thread: header 0x297
> [ 125.009823] baytrail-pcm-audio baytrail-pcm-audio: byt_tx_msg: header 0x8000000000000221
> [ 125.009872] baytrail-pcm-audio baytrail-pcm-audio: sst_byt_irq_thread: header 0x297
> [ 125.010442] baytrail-pcm-audio baytrail-pcm-audio: sst_byt_irq_thread: header 0x80000000000c2226
> [ 125.010456] baytrail-pcm-audio baytrail-pcm-audio: sst_ipc_reply_find_msg: not match, msg header 0x8000000000000221
> [ 125.010511] baytrail-pcm-audio baytrail-pcm-audio: sst_byt_irq_thread: header 0x8000000000000221
> [ 125.011355] baytrail-pcm-audio baytrail-pcm-audio: tx_wait_done: ipc success, header 0x8000000000000221
> [ 125.380506] sound pcmC1D0c: snd_pcm_capture_open:
> [ 125.380544] sound pcmC1D0c: snd_pcm_capture_ioctl1: INFO
> [ 125.380554] sound pcmC1D0c: snd_pcm_capture_ioctl1: PVERSION
> [ 125.380561] sound pcmC1D0c: snd_pcm_capture_ioctl1: TTSTAMP
> [ 125.380581] sound pcmC1D0c: snd_pcm_mmap:
> [ 125.380622] sound pcmC1D0c: snd_pcm_mmap:
> [ 125.380648] sound pcmC1D0c: snd_pcm_capture_ioctl1: INFO
> [ 125.380717] sound pcmC1D0c: snd_pcm_capture_ioctl1: HW_REFINE
> [ 125.380737] sound pcmC1D0c: snd_pcm_capture_ioctl1: HW_REFINE
> [ 125.380766] sound pcmC1D0c: snd_pcm_capture_ioctl1: HW_REFINE
> [ 125.380779] sound pcmC1D0c: snd_pcm_capture_ioctl1: HW_REFINE
> [ 125.380789] sound pcmC1D0c: snd_pcm_capture_ioctl1: HW_REFINE
> [ 125.380799] sound pcmC1D0c: snd_pcm_capture_ioctl1: HW_REFINE
> [ 125.380808] sound pcmC1D0c: snd_pcm_capture_ioctl1: HW_REFINE
> [ 125.380815] sound pcmC1D0c: snd_pcm_capture_ioctl1: HW_PARAMS
> [ 125.380839] baytrail-pcm-audio baytrail-pcm-audio: ipc_tx_message: header 0x8000000000642220 wait 1
> [ 125.380854] baytrail-pcm-audio baytrail-pcm-audio: byt_tx_msg: header 0x8000000000642220
> => write IPC_IA_ALLOC_STREAM message (0x20) to SST_IPCX
> [ 125.380931] baytrail-pcm-audio baytrail-pcm-audio: sst_byt_irq_thread: header 0x8000000000000221
> => read message reply from SST_IPCD and an IPC_IA_FREE_STREAM(0x21) is returned
> [ 125.380942] baytrail-pcm-audio baytrail-pcm-audio: sst_byt_irq_thread: ipcd 0x8000000000602220
> => read SST_IPCD again, this time the message id is what we are expected (0x20)
> [ 125.380948] baytrail-pcm-audio baytrail-pcm-audio: sst_ipc_reply_find_msg: not match, msg header 0x8000000000642220
> [ 125.681171] baytrail-pcm-audio baytrail-pcm-audio: tx_wait_done: ipc timeout, header 0x8000000000642220
> [ 125.681204] baytrail-pcm-audio baytrail-pcm-audio: ipc: --message timeout-- ipcx 0x2220 isr 0x50020 ipcd 0x2220 imrx 0x0
> [ 125.681212] baytrail-pcm-audio baytrail-pcm-audio: ipc: error stream commit failed
> [ 125.681217] Baytrail Audio: PCM: failed stream commit -110
> [ 125.681223] baytrail-pcm-audio baytrail-pcm-audio: ASoC: baytrail-pcm-audio hw params failed: -110
>
> Regards,
> Brent
>
> --
> You received this message because you are subscribed to the Google Groups "Clang Built Linux" group.
> To unsubscribe from this group and stop receiving emails from it, send an email to [email protected].
> To view this discussion on the web visit https://groups.google.com/d/msgid/clang-built-linux/BN6PR1101MB21328C54E66082227B9F497A97D50%40BN6PR1101MB2132.namprd11.prod.outlook.com.



--
Thanks,
~Nick Desaulniers

2020-04-23 08:38:06

by Amadeusz Sławiński

[permalink] [raw]
Subject: Re: [PATCH] ASoC: Intel: sst: ipc command timeout



On 4/21/2020 6:16 PM, Lu, Brent wrote:
>>
>> Regs width difference between BDW and BYT comes from specification. BDW
>> has IPC registers which are 32 wide. This fact ain't exactly the reason to modify
>> sst_shim32_read64.
>>
>> I'm sharing Amadeo's point of view. Your change should slow down execution a
>> bit - but that might be just what handlers needed to make everything work again.
>> Debug prints also slow down the execution what could have prevented you from
>> spotting the real problem.
>> Let's ignore the memcpy stuff for a moment - could you focus on elaborating
>> the scenario where such issue occurs? Your initial commit message also skips
>> important bits such as platform used when reproducing and so on, please add
>> them.
>>
>> Thanks,
>> Czarek
>
> This issue is reported to happen on BYT Chrome book (rambi) on Chrome-v4.4
> branch after changing toolchain from clang10 to clang11.
>
> The reproduce step is simple. Just run arecord multiple times (<10) then you will
> see error message for hw_params ioctl failure.
>
> $ arecord -D hw:1,0 -f S16_LE -c 2 /dev/null
> $ ctrl+c
>
> The error message suggests the stream commit (IPC_IA_ALLOC_STREAM, 0x20) IPC
> command failed due to timeout but the msg id field of ipcd read after the timeout is
> also 0x20. It seems to me that the command is success but the host driver does not
> know it for some reason.
>
> 2020-03-15T23:02:06.614151+00:00 ERR kernel: [ 852.023766] baytrail-pcm-audio baytrail-pcm-audio: ipc: --message timeout-- ipcx 0x2220 isr 0xf0020 ipcd 0x2220 imrx 0x0
> 2020-03-15T23:02:06.614209+00:00 ERR kernel: [ 852.023796] baytrail-pcm-audio baytrail-pcm-audio: ipc: error stream commit failed
> 2020-03-15T23:02:06.614218+00:00 ERR kernel: [ 852.023813] Baytrail Audio: PCM: failed stream commit -110
> 2020-03-15T23:02:06.614225+00:00 ERR kernel: [ 852.023832] baytrail-pcm-audio baytrail-pcm-audio: ASoC: baytrail-pcm-audio hw params failed: -110
>
> I add some messages to make the log easier to read. You can check the gerrit link if you
> are interested:
> https://chromium-review.googlesource.com/c/chromiumos/third_party/kernel/+/2131507
>
> In the test patch I read the SST_IPCD register in sst_byt_irq_thread() twice and found
> the value could be different. The IPC_IA_FREE_STREAM(0x21) seems to be the last IPC
> command sent in the pcm_release of previous arecord command.
>
> [ 125.009724] sound pcmC1D0c: snd_pcm_release:
> [ 125.009732] baytrail-pcm-audio baytrail-pcm-audio: ipc_tx_message: header 0x8000000000000221 wait 1
> [ 125.009760] baytrail-pcm-audio baytrail-pcm-audio: byt_tx_msg: header 0x8000000000000226
> [ 125.009803] baytrail-pcm-audio baytrail-pcm-audio: sst_byt_irq_thread: header 0x297
> [ 125.009823] baytrail-pcm-audio baytrail-pcm-audio: byt_tx_msg: header 0x8000000000000221
> [ 125.009872] baytrail-pcm-audio baytrail-pcm-audio: sst_byt_irq_thread: header 0x297
> [ 125.010442] baytrail-pcm-audio baytrail-pcm-audio: sst_byt_irq_thread: header 0x80000000000c2226
> [ 125.010456] baytrail-pcm-audio baytrail-pcm-audio: sst_ipc_reply_find_msg: not match, msg header 0x8000000000000221
> [ 125.010511] baytrail-pcm-audio baytrail-pcm-audio: sst_byt_irq_thread: header 0x8000000000000221
> [ 125.011355] baytrail-pcm-audio baytrail-pcm-audio: tx_wait_done: ipc success, header 0x8000000000000221
> [ 125.380506] sound pcmC1D0c: snd_pcm_capture_open:
> [ 125.380544] sound pcmC1D0c: snd_pcm_capture_ioctl1: INFO
> [ 125.380554] sound pcmC1D0c: snd_pcm_capture_ioctl1: PVERSION
> [ 125.380561] sound pcmC1D0c: snd_pcm_capture_ioctl1: TTSTAMP
> [ 125.380581] sound pcmC1D0c: snd_pcm_mmap:
> [ 125.380622] sound pcmC1D0c: snd_pcm_mmap:
> [ 125.380648] sound pcmC1D0c: snd_pcm_capture_ioctl1: INFO
> [ 125.380717] sound pcmC1D0c: snd_pcm_capture_ioctl1: HW_REFINE
> [ 125.380737] sound pcmC1D0c: snd_pcm_capture_ioctl1: HW_REFINE
> [ 125.380766] sound pcmC1D0c: snd_pcm_capture_ioctl1: HW_REFINE
> [ 125.380779] sound pcmC1D0c: snd_pcm_capture_ioctl1: HW_REFINE
> [ 125.380789] sound pcmC1D0c: snd_pcm_capture_ioctl1: HW_REFINE
> [ 125.380799] sound pcmC1D0c: snd_pcm_capture_ioctl1: HW_REFINE
> [ 125.380808] sound pcmC1D0c: snd_pcm_capture_ioctl1: HW_REFINE
> [ 125.380815] sound pcmC1D0c: snd_pcm_capture_ioctl1: HW_PARAMS
> [ 125.380839] baytrail-pcm-audio baytrail-pcm-audio: ipc_tx_message: header 0x8000000000642220 wait 1
> [ 125.380854] baytrail-pcm-audio baytrail-pcm-audio: byt_tx_msg: header 0x8000000000642220
> => write IPC_IA_ALLOC_STREAM message (0x20) to SST_IPCX
> [ 125.380931] baytrail-pcm-audio baytrail-pcm-audio: sst_byt_irq_thread: header 0x8000000000000221
> => read message reply from SST_IPCD and an IPC_IA_FREE_STREAM(0x21) is returned
> [ 125.380942] baytrail-pcm-audio baytrail-pcm-audio: sst_byt_irq_thread: ipcd 0x8000000000602220
> => read SST_IPCD again, this time the message id is what we are expected (0x20)
> [ 125.380948] baytrail-pcm-audio baytrail-pcm-audio: sst_ipc_reply_find_msg: not match, msg header 0x8000000000642220
> [ 125.681171] baytrail-pcm-audio baytrail-pcm-audio: tx_wait_done: ipc timeout, header 0x8000000000642220
> [ 125.681204] baytrail-pcm-audio baytrail-pcm-audio: ipc: --message timeout-- ipcx 0x2220 isr 0x50020 ipcd 0x2220 imrx 0x0
> [ 125.681212] baytrail-pcm-audio baytrail-pcm-audio: ipc: error stream commit failed
> [ 125.681217] Baytrail Audio: PCM: failed stream commit -110
> [ 125.681223] baytrail-pcm-audio baytrail-pcm-audio: ASoC: baytrail-pcm-audio hw params failed: -110
>
> Regards,
> Brent
>

I've looked at the code and byt_is_dsp_busy seems suspicious to me.
Can you check if following change fixes problem for you:

diff --git a/sound/soc/intel/baytrail/sst-baytrail-ipc.c
b/sound/soc/intel/baytrail/sst-baytrail-ipc.c
index 74274bd38f7a..34746fd871b0 100644
--- a/sound/soc/intel/baytrail/sst-baytrail-ipc.c
+++ b/sound/soc/intel/baytrail/sst-baytrail-ipc.c
@@ -666,8 +666,8 @@ static bool byt_is_dsp_busy(struct sst_dsp *dsp)
{
u64 ipcx;

- ipcx = sst_dsp_shim_read_unlocked(dsp, SST_IPCX);
- return (ipcx & (SST_IPCX_BUSY | SST_IPCX_DONE));
+ ipcx = sst_dsp_shim_read64_unlocked(dsp, SST_IPCX);
+ return (ipcx & (SST_BYT_IPCX_BUSY | SST_BYT_IPCX_DONE));
}

int sst_byt_dsp_init(struct device *dev, struct sst_pdata *pdata)

We seem to treat SST_IPCX as 32 bit register instead of 64 one, which
may explain wrong behaviour. (Specification says it is 64 bit register).

Thanks,
Amadeusz

2020-04-28 17:31:11

by Lu, Brent

[permalink] [raw]
Subject: RE: [PATCH] ASoC: Intel: sst: ipc command timeout

>
> I've looked at the code and byt_is_dsp_busy seems suspicious to me.
> Can you check if following change fixes problem for you:
>
> diff --git a/sound/soc/intel/baytrail/sst-baytrail-ipc.c
> b/sound/soc/intel/baytrail/sst-baytrail-ipc.c
> index 74274bd38f7a..34746fd871b0 100644
> --- a/sound/soc/intel/baytrail/sst-baytrail-ipc.c
> +++ b/sound/soc/intel/baytrail/sst-baytrail-ipc.c
> @@ -666,8 +666,8 @@ static bool byt_is_dsp_busy(struct sst_dsp *dsp)
> {
> u64 ipcx;
>
> - ipcx = sst_dsp_shim_read_unlocked(dsp, SST_IPCX);
> - return (ipcx & (SST_IPCX_BUSY | SST_IPCX_DONE));
> + ipcx = sst_dsp_shim_read64_unlocked(dsp, SST_IPCX);
> + return (ipcx & (SST_BYT_IPCX_BUSY | SST_BYT_IPCX_DONE));
> }
>
> int sst_byt_dsp_init(struct device *dev, struct sst_pdata *pdata)
>
> We seem to treat SST_IPCX as 32 bit register instead of 64 one, which may
> explain wrong behaviour. (Specification says it is 64 bit register).
>
> Thanks,
> Amadeusz

Hi Amadeusz,

The patch does not work but I managed to create a workaround just for
reference. Still don't know why first read in sst_byt_irq_thread returns
incorrect value.

diff --git a/sound/soc/intel/baytrail/sst-baytrail-ipc.c b/sound/soc/intel/baytrail/sst-baytrail-ipc.c
index 5bbaa667bec1..56c557cb722d 100644
--- a/sound/soc/intel/baytrail/sst-baytrail-ipc.c
+++ b/sound/soc/intel/baytrail/sst-baytrail-ipc.c
@@ -12,6 +12,7 @@
* more details.
*/

+#define DEBUG
#include <linux/types.h>
#include <linux/kernel.h>
#include <linux/list.h>
@@ -313,7 +314,7 @@ static irqreturn_t sst_byt_irq_thread(int irq, void *context)
struct sst_dsp *sst = (struct sst_dsp *) context;
struct sst_byt *byt = sst_dsp_get_thread_context(sst);
struct sst_generic_ipc *ipc = &byt->ipc;
- u64 header;
+ u64 header, mask, old, new;
unsigned long flags;

spin_lock_irqsave(&sst->spinlock, flags);
@@ -332,10 +333,32 @@ static irqreturn_t sst_byt_irq_thread(int irq, void *context)
* processed the message and can accept new. Clear data part
* of the header
*/
- sst_dsp_shim_update_bits64_unlocked(sst, SST_IPCD,
- SST_BYT_IPCD_DONE | SST_BYT_IPCD_BUSY |
- IPC_HEADER_DATA(IPC_HEADER_DATA_MASK),
- SST_BYT_IPCD_DONE);
+ /* inline the sst_dsp_shim_update_bits64_unlocked function */
+ mask = SST_BYT_IPCD_DONE | SST_BYT_IPCD_BUSY |
+ IPC_HEADER_DATA(IPC_HEADER_DATA_MASK);
+ old = sst_dsp_shim_read64_unlocked(sst, SST_IPCD);
+ new = (old & (~mask)) | (SST_BYT_IPCD_DONE & mask);
+
+ if (old != new)
+ sst_dsp_shim_write64_unlocked(sst, SST_IPCD, new);
+
+ /*
+ * workaround, give it a second chance if the SST_IPCD
+ * changes
+ */
+ if (old != header) {
+ dev_dbg(ipc->dev, "%s: header 0x%llx old 0x%llx\n",
+ __func__, header, old);
+ if (old & SST_BYT_IPCD_BUSY) {
+ if (old & IPC_NOTIFICATION) {
+ /* message from ADSP */
+ sst_byt_process_notification(byt, &flags);
+ } else {
+ /* reply from ADSP */
+ sst_byt_process_reply(byt, old);
+ }
+ }
+ }
/* unmask message request interrupts */
sst_dsp_shim_update_bits64_unlocked(sst, SST_IMRX,
SST_BYT_IMRX_REQUEST, 0);

2020-04-29 15:23:44

by Amadeusz Sławiński

[permalink] [raw]
Subject: Re: [PATCH] ASoC: Intel: sst: ipc command timeout



On 4/28/2020 7:29 PM, Lu, Brent wrote:
>>
>> I've looked at the code and byt_is_dsp_busy seems suspicious to me.
>> Can you check if following change fixes problem for you:(...)
>>
>> We seem to treat SST_IPCX as 32 bit register instead of 64 one, which may
>> explain wrong behaviour. (Specification says it is 64 bit register).
>>
>> Thanks,
>> Amadeusz
>
> Hi Amadeusz,
>
> The patch does not work but I managed to create a workaround just for
> reference. Still don't know why first read in sst_byt_irq_thread returns
> incorrect value.
>

Hi,
yes that seems bit weird. It is bit better as it does not modify common
code, but still... Maybe going back to your original idea of replacing
memcpy, try replacing it with readq? It should generate one instruction
read (although it is only for x64_64, for 32 bit kernel we would still
need to do something else).

Thanks,
Amadeusz

2020-04-30 15:40:00

by Lu, Brent

[permalink] [raw]
Subject: RE: [PATCH] ASoC: Intel: sst: ipc command timeout

>
> Hi,
> yes that seems bit weird. It is bit better as it does not modify common code,
> but still... Maybe going back to your original idea of replacing memcpy, try
> replacing it with readq? It should generate one instruction read (although it is
> only for x64_64, for 32 bit kernel we would still need to do something else).
>
> Thanks,
> Amadeusz

Hi,

I've compared the assembly to see if there is clue. Both kernels are using 64-bit
mov to read register and the only difference is optimized or not. Both
implementations are looking good to me. Currently I don't have answer why
slower kernel hits the problem while optimized one survived.

1. Old kernel. Code is optimized and not able to reproduce the issue on this kernel.

(gdb) disas sst_shim32_read64
Dump of assembler code for function sst_shim32_read64:
0x000000000000096c <+0>: call 0x971 <sst_shim32_read64+5>
=> call __fentry__
0x0000000000000971 <+5>: push rbp
0x0000000000000972 <+6>: mov rbp,rsp
0x0000000000000975 <+9>: mov eax,esi
0x0000000000000977 <+11>: mov rax,QWORD PTR [rdi+rax*1]
=> perform 64-bit mov
0x000000000000097b <+15>: pop rbp
0x000000000000097c <+16>: ret
End of assembler dump.

2. New kernel: obviously optimization is disabled and it calls memcpy to do the read operation.

(gdb) disas sst_shim32_read64
Dump of assembler code for function sst_shim32_read64:
0x00000000000009a8 <+0>: call 0x9ad <sst_shim32_read64+5>
=> call __fentry__
0x00000000000009ad <+5>: push rbp
0x00000000000009ae <+6>: mov rbp,rsp
0x00000000000009b1 <+9>: push rbx
0x00000000000009b2 <+10>: sub rsp,0x10
0x00000000000009b6 <+14>: mov rax,QWORD PTR gs:0x28
0x00000000000009bf <+23>: mov QWORD PTR [rbp-0x10],rax
0x00000000000009c3 <+27>: movabs rax,0xaaaaaaaaaaaaaaaa
0x00000000000009cd <+37>: lea rbx,[rbp-0x18]
0x00000000000009d1 <+41>: mov QWORD PTR [rbx],rax
0x00000000000009d4 <+44>: mov esi,esi
0x00000000000009d6 <+46>: add rsi,rdi
0x00000000000009d9 <+49>: mov edx,0x8
0x00000000000009de <+54>: mov rdi,rbx
0x00000000000009e1 <+57>: call 0x9e6 <sst_shim32_read64+62>
=> call memcpy

The memcpy is implemented in arch/x86/lib/memcpy_64.S

(gdb) disas memcpy
Dump of assembler code for function memcpy:
0xffffffff813519c0 <+0>: jmp 0xffffffff813519f0 <memcpy_orig>
=> jump to memcpy_orig function

X86_FEATURE_ERMS is disabled so it jumps to memcpy_orig

(gdb) disas memcpy_orig
Dump of assembler code for function memcpy_orig:
0xffffffff813519f0 <+0>: mov rax,rdi
0xffffffff813519f3 <+3>: cmp rdx,0x20
0xffffffff813519f7 <+7>: jb 0xffffffff81351a77 <memcpy_orig+135>
=> jump because our read size is 8
...
0xffffffff81351a77 <+135>: cmp edx,0x10
0xffffffff81351a7a <+138>: jb 0xffffffff81351aa0 <memcpy_orig+176>
=> jump because our read size is 8
...
0xffffffff81351aa0 <+176>: cmp edx,0x8
0xffffffff81351aa3 <+179>: jb 0xffffffff81351ac0 <memcpy_orig+208>
0xffffffff81351aa5 <+181>: mov r8,QWORD PTR [rsi]
0xffffffff81351aa8 <+184>: mov r9,QWORD PTR [rsi+rdx*1-0x8]
0xffffffff81351aad <+189>: mov QWORD PTR [rdi],r8
0xffffffff81351ab0 <+192>: mov QWORD PTR [rdi+rdx*1-0x8],r9
=> perform 64-bit mov twice over same address (rdx=0x8)
0xffffffff81351ab5 <+197>: ret

Regards,
Brent

2020-05-04 09:28:11

by Amadeusz Sławiński

[permalink] [raw]
Subject: Re: [PATCH] ASoC: Intel: sst: ipc command timeout



On 4/30/2020 5:38 PM, Lu, Brent wrote:
>>
>> Hi,
>> yes that seems bit weird. It is bit better as it does not modify common code,
>> but still... Maybe going back to your original idea of replacing memcpy, try
>> replacing it with readq? It should generate one instruction read (although it is
>> only for x64_64, for 32 bit kernel we would still need to do something else).
>>
>> Thanks,
>> Amadeusz
>
> Hi,
>
> I've compared the assembly to see if there is clue. Both kernels are using 64-bit
> mov to read register and the only difference is optimized or not. Both
> implementations are looking good to me. Currently I don't have answer why
> slower kernel hits the problem while optimized one survived.
>
> 1. Old kernel. Code is optimized and not able to reproduce the issue on this kernel.
>
> (gdb) disas sst_shim32_read64
> Dump of assembler code for function sst_shim32_read64:
> 0x000000000000096c <+0>: call 0x971 <sst_shim32_read64+5>
> => call __fentry__
> 0x0000000000000971 <+5>: push rbp
> 0x0000000000000972 <+6>: mov rbp,rsp
> 0x0000000000000975 <+9>: mov eax,esi
> 0x0000000000000977 <+11>: mov rax,QWORD PTR [rdi+rax*1]
> => perform 64-bit mov
> 0x000000000000097b <+15>: pop rbp
> 0x000000000000097c <+16>: ret
> End of assembler dump.
>

Hi,

That's why I would suggest trying with readq, it should also generate
one instruction read x86_64 platforms, I looked a bit more and there is
fallback to generate two 32 bit reads on 32bit platforms, so my previous
concern about having to write separate handling for those platforms was
unneeded. So I would recommend checking using it.

diff --git a/sound/soc/intel/common/sst-dsp.c
b/sound/soc/intel/common/sst-dsp.c
index ec66be269b69..e96f636387d9 100644
--- a/sound/soc/intel/common/sst-dsp.c
+++ b/sound/soc/intel/common/sst-dsp.c
@@ -34,16 +34,13 @@ EXPORT_SYMBOL_GPL(sst_shim32_read);

void sst_shim32_write64(void __iomem *addr, u32 offset, u64 value)
{
- memcpy_toio(addr + offset, &value, sizeof(value));
+ writeq(value, addr + offset);
}
EXPORT_SYMBOL_GPL(sst_shim32_write64);

u64 sst_shim32_read64(void __iomem *addr, u32 offset)
{
- u64 val;
-
- memcpy_fromio(&val, addr + offset, sizeof(val));
- return val;
+ return readq(addr + offset);
}
EXPORT_SYMBOL_GPL(sst_shim32_read64);


Thanks,
Amadeusz