2020-05-18 04:37:59

by Brent Lu

[permalink] [raw]
Subject: [PATCH v2] ALSA: pcm: fix incorrect hw_base increase

There is a corner case that ALSA keeps increasing the hw_ptr but DMA
already stop working/updating the position for a long time.

In following log we can see the position returned from DMA driver does
not move at all but the hw_ptr got increased at some point of time so
snd_pcm_avail() will return a large number which seems to be a buffer
underrun event from user space program point of view. The program
thinks there is space in the buffer and fill more data.

[ 418.510086] sound pcmC0D5p: pos 96 hw_ptr 96 appl_ptr 4096 avail 12368
[ 418.510149] sound pcmC0D5p: pos 96 hw_ptr 96 appl_ptr 6910 avail 9554
...
[ 418.681052] sound pcmC0D5p: pos 96 hw_ptr 96 appl_ptr 15102 avail 1362
[ 418.681130] sound pcmC0D5p: pos 96 hw_ptr 96 appl_ptr 16464 avail 0
[ 418.726515] sound pcmC0D5p: pos 96 hw_ptr 16464 appl_ptr 16464 avail 16368

This is because the hw_base will be increased by runtime->buffer_size
frames unconditionally if the hw_ptr is not updated for over half of
buffer time. As the hw_base increases, so does the hw_ptr increased
by the same number.

The avail value returned from snd_pcm_avail() could exceed the limit
(buffer_size) easily becase the hw_ptr itself got increased by same
buffer_size samples when the corner case happens. In following log,
the buffer_size is 16368 samples but the avail is 21810 samples so
CRAS server complains about it.

[ 418.851755] sound pcmC0D5p: pos 96 hw_ptr 16464 appl_ptr 27390 avail 5442
[ 418.926491] sound pcmC0D5p: pos 96 hw_ptr 32832 appl_ptr 27390 avail 21810

cras_server[1907]: pcm_avail returned frames larger than buf_size:
sof-glkda7219max: :0,5: 21810 > 16368

By updating runtime->hw_ptr_jiffies each time the HWSYNC is called,
the hw_base will keep the same when buffer stall happens at long as
the interval between each HWSYNC call is shorter than half of buffer
time.

Following is a log captured by a patched kernel. The hw_base/hw_ptr
value is fixed in this corner case and user space program should be
aware of the buffer stall and handle it.

[ 293.525543] sound pcmC0D5p: pos 96 hw_ptr 96 appl_ptr 4096 avail 12368
[ 293.525606] sound pcmC0D5p: pos 96 hw_ptr 96 appl_ptr 6880 avail 9584
[ 293.525975] sound pcmC0D5p: pos 96 hw_ptr 96 appl_ptr 10976 avail 5488
[ 293.611178] sound pcmC0D5p: pos 96 hw_ptr 96 appl_ptr 15072 avail 1392
[ 293.696429] sound pcmC0D5p: pos 96 hw_ptr 96 appl_ptr 16464 avail 0
...
[ 381.139517] sound pcmC0D5p: pos 96 hw_ptr 96 appl_ptr 16464 avail 0

Signed-off-by: Brent Lu <[email protected]>
---
sound/core/pcm_lib.c | 1 +
1 file changed, 1 insertion(+)

diff --git a/sound/core/pcm_lib.c b/sound/core/pcm_lib.c
index 872a852..d531e1b 100644
--- a/sound/core/pcm_lib.c
+++ b/sound/core/pcm_lib.c
@@ -433,6 +433,7 @@ static int snd_pcm_update_hw_ptr0(struct snd_pcm_substream *substream,

no_delta_check:
if (runtime->status->hw_ptr == new_hw_ptr) {
+ runtime->hw_ptr_jiffies = curr_jiffies;
update_audio_tstamp(substream, &curr_tstamp, &audio_tstamp);
return 0;
}
--
2.7.4


2020-05-18 07:58:09

by Takashi Iwai

[permalink] [raw]
Subject: Re: [PATCH v2] ALSA: pcm: fix incorrect hw_base increase

On Mon, 18 May 2020 06:30:38 +0200,
Brent Lu wrote:
>
> There is a corner case that ALSA keeps increasing the hw_ptr but DMA
> already stop working/updating the position for a long time.
>
> In following log we can see the position returned from DMA driver does
> not move at all but the hw_ptr got increased at some point of time so
> snd_pcm_avail() will return a large number which seems to be a buffer
> underrun event from user space program point of view. The program
> thinks there is space in the buffer and fill more data.
>
> [ 418.510086] sound pcmC0D5p: pos 96 hw_ptr 96 appl_ptr 4096 avail 12368
> [ 418.510149] sound pcmC0D5p: pos 96 hw_ptr 96 appl_ptr 6910 avail 9554
> ...
> [ 418.681052] sound pcmC0D5p: pos 96 hw_ptr 96 appl_ptr 15102 avail 1362
> [ 418.681130] sound pcmC0D5p: pos 96 hw_ptr 96 appl_ptr 16464 avail 0
> [ 418.726515] sound pcmC0D5p: pos 96 hw_ptr 16464 appl_ptr 16464 avail 16368
>
> This is because the hw_base will be increased by runtime->buffer_size
> frames unconditionally if the hw_ptr is not updated for over half of
> buffer time. As the hw_base increases, so does the hw_ptr increased
> by the same number.
>
> The avail value returned from snd_pcm_avail() could exceed the limit
> (buffer_size) easily becase the hw_ptr itself got increased by same
> buffer_size samples when the corner case happens. In following log,
> the buffer_size is 16368 samples but the avail is 21810 samples so
> CRAS server complains about it.
>
> [ 418.851755] sound pcmC0D5p: pos 96 hw_ptr 16464 appl_ptr 27390 avail 5442
> [ 418.926491] sound pcmC0D5p: pos 96 hw_ptr 32832 appl_ptr 27390 avail 21810
>
> cras_server[1907]: pcm_avail returned frames larger than buf_size:
> sof-glkda7219max: :0,5: 21810 > 16368
>
> By updating runtime->hw_ptr_jiffies each time the HWSYNC is called,
> the hw_base will keep the same when buffer stall happens at long as
> the interval between each HWSYNC call is shorter than half of buffer
> time.
>
> Following is a log captured by a patched kernel. The hw_base/hw_ptr
> value is fixed in this corner case and user space program should be
> aware of the buffer stall and handle it.
>
> [ 293.525543] sound pcmC0D5p: pos 96 hw_ptr 96 appl_ptr 4096 avail 12368
> [ 293.525606] sound pcmC0D5p: pos 96 hw_ptr 96 appl_ptr 6880 avail 9584
> [ 293.525975] sound pcmC0D5p: pos 96 hw_ptr 96 appl_ptr 10976 avail 5488
> [ 293.611178] sound pcmC0D5p: pos 96 hw_ptr 96 appl_ptr 15072 avail 1392
> [ 293.696429] sound pcmC0D5p: pos 96 hw_ptr 96 appl_ptr 16464 avail 0
> ...
> [ 381.139517] sound pcmC0D5p: pos 96 hw_ptr 96 appl_ptr 16464 avail 0
>
> Signed-off-by: Brent Lu <[email protected]>

Thanks, applied now with Reviewed-by tag from Jaroslav.
I also put Cc to stable, as it can fix the actual issues.


Takashi