2023-03-17 19:52:50

by John Keeping

Subject: [PATCH] ALSA: usb-audio: Fix recursive locking on XRUN

snd_usb_queue_pending_output_urbs() may be called from
snd_pcm_ops::ack() which means the PCM stream is locked.

For the normal case where the call back into the PCM core is via
prepare_output_urb() the "_under_stream_lock" variant of
snd_pcm_period_elapsed() is called, but when an error occurs and the
stream is stopped as XRUN then snd_pcm_stop_xrun() tries to recursively
lock the stream, which results in deadlock.

Follow the example of snd_pcm_period_elapsed() by adding
snd_pcm_stop_xrun_under_stream_lock() and use this when the PCM substream
lock is already held.

Signed-off-by: John Keeping <[email protected]>
---
include/sound/pcm.h | 1 +
sound/core/pcm_native.c | 28 ++++++++++++++++++++++++----
sound/usb/endpoint.c | 18 +++++++++++-------
3 files changed, 36 insertions(+), 11 deletions(-)

diff --git a/include/sound/pcm.h b/include/sound/pcm.h
index 27040b472a4f..98551907453a 100644
--- a/include/sound/pcm.h
+++ b/include/sound/pcm.h
@@ -571,6 +571,7 @@ int snd_pcm_status64(struct snd_pcm_substream *substream,
int snd_pcm_start(struct snd_pcm_substream *substream);
int snd_pcm_stop(struct snd_pcm_substream *substream, snd_pcm_state_t status);
int snd_pcm_drain_done(struct snd_pcm_substream *substream);
+int snd_pcm_stop_xrun_under_stream_lock(struct snd_pcm_substream *substream);
int snd_pcm_stop_xrun(struct snd_pcm_substream *substream);
#ifdef CONFIG_PM
int snd_pcm_suspend_all(struct snd_pcm *pcm);
diff --git a/sound/core/pcm_native.c b/sound/core/pcm_native.c
index 331380c2438b..617f5dc74df0 100644
--- a/sound/core/pcm_native.c
+++ b/sound/core/pcm_native.c
@@ -1559,24 +1559,44 @@ int snd_pcm_drain_done(struct snd_pcm_substream *substream)
SNDRV_PCM_STATE_SETUP);
}

+/**
+ * snd_pcm_stop_xrun_under_stream_lock - stop the running stream as XRUN under the lock of
+ * the PCM substream.
+ * @substream: the PCM substream instance
+ *
+ * This stops the given running substream (and all linked substreams) as XRUN.
+ * This function assumes that the substream lock is already held.
+ *
+ * Return: Zero if successful, or a negative error code.
+ */
+int snd_pcm_stop_xrun_under_stream_lock(struct snd_pcm_substream *substream)
+{
+ if (substream->runtime && snd_pcm_running(substream))
+ __snd_pcm_xrun(substream);
+
+ return 0;
+}
+
/**
* snd_pcm_stop_xrun - stop the running streams as XRUN
* @substream: the PCM substream instance
*
+ * This function is similar to ``snd_pcm_stop_xrun_under_stream_lock()`` except that it
+ * acquires the substream lock itself.
+ *
* This stops the given running substream (and all linked substreams) as XRUN.
- * Unlike snd_pcm_stop(), this function takes the substream lock by itself.
*
* Return: Zero if successful, or a negative error code.
*/
int snd_pcm_stop_xrun(struct snd_pcm_substream *substream)
{
unsigned long flags;
+ int ret;

snd_pcm_stream_lock_irqsave(substream, flags);
- if (substream->runtime && snd_pcm_running(substream))
- __snd_pcm_xrun(substream);
+ ret = snd_pcm_stop_xrun_under_stream_lock(substream);
snd_pcm_stream_unlock_irqrestore(substream, flags);
- return 0;
+ return ret;
}
EXPORT_SYMBOL_GPL(snd_pcm_stop_xrun);

diff --git a/sound/usb/endpoint.c b/sound/usb/endpoint.c
index 1e0af1179ca8..83a6b6d41374 100644
--- a/sound/usb/endpoint.c
+++ b/sound/usb/endpoint.c
@@ -400,13 +400,17 @@ static int prepare_inbound_urb(struct snd_usb_endpoint *ep,
}

/* notify an error as XRUN to the assigned PCM data substream */
-static void notify_xrun(struct snd_usb_endpoint *ep)
+static void notify_xrun(struct snd_usb_endpoint *ep, bool in_stream_lock)
{
struct snd_usb_substream *data_subs;

data_subs = READ_ONCE(ep->data_subs);
- if (data_subs && data_subs->pcm_substream)
- snd_pcm_stop_xrun(data_subs->pcm_substream);
+ if (data_subs && data_subs->pcm_substream) {
+ if (in_stream_lock)
+ snd_pcm_stop_xrun_under_stream_lock(data_subs->pcm_substream);
+ else
+ snd_pcm_stop_xrun(data_subs->pcm_substream);
+ }
}

static struct snd_usb_packet_info *
@@ -498,7 +502,7 @@ void snd_usb_queue_pending_output_urbs(struct snd_usb_endpoint *ep,
if (err == -EAGAIN)
push_back_to_ready_list(ep, ctx);
else
- notify_xrun(ep);
+ notify_xrun(ep, in_stream_lock);
return;
}

@@ -507,7 +511,7 @@ void snd_usb_queue_pending_output_urbs(struct snd_usb_endpoint *ep,
usb_audio_err(ep->chip,
"Unable to submit urb #%d: %d at %s\n",
ctx->index, err, __func__);
- notify_xrun(ep);
+ notify_xrun(ep, in_stream_lock);
return;
}

@@ -574,7 +578,7 @@ static void snd_complete_urb(struct urb *urb)
return;

usb_audio_err(ep->chip, "cannot submit urb (err = %d)\n", err);
- notify_xrun(ep);
+ notify_xrun(ep, false);

exit_clear:
clear_bit(ctx->index, &ep->active_mask);
@@ -1762,7 +1766,7 @@ static void snd_usb_handle_sync_urb(struct snd_usb_endpoint *ep,
usb_audio_err(ep->chip,
"next package FIFO overflow EP 0x%x\n",
ep->ep_num);
- notify_xrun(ep);
+ notify_xrun(ep, false);
return;
}

--
2.40.0
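
The locking split in the patch above can be modelled in userspace. This is only a minimal sketch, assuming a non-recursive lock represented by a boolean flag; every `demo_` name is hypothetical and none of them is a kernel API. Where a real spinlock would deadlock on recursive acquisition, the model trips an assertion instead.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a PCM substream; not the kernel structure. */
struct demo_stream {
	bool locked;	/* models the non-recursive stream lock */
	bool running;
	bool xrun;
};

static void demo_lock(struct demo_stream *s)
{
	/* A real spinlock would deadlock here; the model asserts instead. */
	assert(!s->locked);
	s->locked = true;
}

static void demo_unlock(struct demo_stream *s)
{
	assert(s->locked);
	s->locked = false;
}

/* "_under_stream_lock" style variant: the caller must hold the lock. */
static int demo_stop_xrun_under_lock(struct demo_stream *s)
{
	assert(s->locked);
	if (s->running) {
		s->running = false;
		s->xrun = true;
	}
	return 0;
}

/* Takes the lock itself, then delegates to the locked variant. */
static int demo_stop_xrun(struct demo_stream *s)
{
	int ret;

	demo_lock(s);
	ret = demo_stop_xrun_under_lock(s);
	demo_unlock(s);
	return ret;
}

/* Models an ack() path: the core holds the lock while the driver runs. */
static int demo_ack_path_error(struct demo_stream *s)
{
	int ret;

	demo_lock(s);
	/* Calling demo_stop_xrun() here would try to take the lock twice. */
	ret = demo_stop_xrun_under_lock(s);
	demo_unlock(s);
	return ret;
}
```

In this model, `demo_stop_xrun()` is the variant for contexts that do not hold the lock, while `demo_ack_path_error()` shows why a path that already holds it needs the locked variant.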



2023-03-18 00:20:19

by Takashi Sakamoto

Subject: Re: [PATCH] ALSA: usb-audio: Fix recursive locking on XRUN

Hi,

On Fri, Mar 17, 2023 at 07:51:27PM +0000, John Keeping wrote:
> snd_usb_queue_pending_output_urbs() may be called from
> snd_pcm_ops::ack() which means the PCM stream is locked.
>
> For the normal case where the call back into the PCM core is via
> prepare_output_urb() the "_under_stream_lock" variant of
> snd_pcm_period_elapsed() is called, but when an error occurs and the
> stream is stopped as XRUN then snd_pcm_xrun() tries to recursively lock
> the stream which results in deadlock.
>
> Follow the example of snd_pcm_period_elapsed() by adding
> snd_pcm_xrun_under_stream_lock() and use this when the PCM substream
> lock is already held.
>
> Signed-off-by: John Keeping <[email protected]>
> ---
> include/sound/pcm.h | 1 +
> sound/core/pcm_native.c | 28 ++++++++++++++++++++++++----
> sound/usb/endpoint.c | 18 +++++++++++-------
> 3 files changed, 36 insertions(+), 11 deletions(-)

The name of the added kernel API suggests that you modelled it on the
existing 'snd_pcm_period_elapsed_under_stream_lock()', which I added in
Linux v5.14.

In my opinion, unlike the period-elapsed API, an XRUN variant is not
strictly required in the ALSA PCM core, since PCM device drivers can
implement the .pointer callback as part of their PCM operations.
When the callback returns SNDRV_PCM_POS_XRUN, the ALSA PCM application is
notified of the XRUN as a result of any operation that moves hwptr
(e.g. SNDRV_PCM_IOCTL_HWSYNC).

Therefore I think it is possible to fix the issue without the proposed
kernel API. I can think of two scenarios:

1. Failure at tasklet for URB completion

This is softIRQ context. The stream lock is not held, so calling the
current XRUN API is fine.

2. Failure at PCM operation called by ALSA PCM application

This is process context. The stream lock is acquired before calling driver
code. When detecting any type of failure, driver code stores the state.
Then the .pointer callback should return SNDRV_PCM_IOCTL_HWSYNC, referring
to the state.

Of course, I'm not a developer of USB audio device drivers. I'm just a
developer of another type of packet-oriented driver (the IEC 61883-1/6
packet streaming engine for audio and music units on the IEEE 1394 bus),
so I do not understand every part of the USB driver. However, from my
experience with packet-oriented drivers, I have the above concern about
adding the new XRUN API.

I apologize if I am missing the point of your issue.


Regards

Takashi Sakamoto

2023-03-18 02:30:37

by kernel test robot

Subject: Re: [PATCH] ALSA: usb-audio: Fix recursive locking on XRUN

Hi John,

Thank you for the patch! Yet something to improve:

[auto build test ERROR on tiwai-sound/for-next]
[also build test ERROR on tiwai-sound/for-linus v6.3-rc2 next-20230317]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch#_base_tree_information]

url: https://github.com/intel-lab-lkp/linux/commits/John-Keeping/ALSA-usb-audio-Fix-recursive-locking-on-XRUN/20230318-035430
base: https://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound.git for-next
patch link: https://lore.kernel.org/r/20230317195128.3911155-1-john%40metanate.com
patch subject: [PATCH] ALSA: usb-audio: Fix recursive locking on XRUN
config: m68k-allmodconfig (https://download.01.org/0day-ci/archive/20230318/[email protected]/config)
compiler: m68k-linux-gcc (GCC) 12.1.0
reproduce (this is a W=1 build):
wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
chmod +x ~/bin/make.cross
# https://github.com/intel-lab-lkp/linux/commit/21bbf1266d22cbc0e1ec7c8d535738f66bbc9801
git remote add linux-review https://github.com/intel-lab-lkp/linux
git fetch --no-tags linux-review John-Keeping/ALSA-usb-audio-Fix-recursive-locking-on-XRUN/20230318-035430
git checkout 21bbf1266d22cbc0e1ec7c8d535738f66bbc9801
# save the config file
mkdir build_dir && cp config build_dir/.config
COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-12.1.0 make.cross W=1 O=build_dir ARCH=m68k olddefconfig
COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-12.1.0 make.cross W=1 O=build_dir ARCH=m68k SHELL=/bin/bash

If you fix the issue, kindly add following tag where applicable
| Reported-by: kernel test robot <[email protected]>
| Link: https://lore.kernel.org/oe-kbuild-all/[email protected]/

All errors (new ones prefixed by >>, old ones prefixed by <<):

>> ERROR: modpost: "snd_pcm_stop_xrun_under_stream_lock" [sound/usb/snd-usb-audio.ko] undefined!

--
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests

2023-03-18 05:46:41

by kernel test robot

Subject: Re: [PATCH] ALSA: usb-audio: Fix recursive locking on XRUN

Hi John,

Thank you for the patch! Yet something to improve:

[auto build test ERROR on tiwai-sound/for-next]
[also build test ERROR on tiwai-sound/for-linus linus/master v6.3-rc2 next-20230317]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch#_base_tree_information]

url: https://github.com/intel-lab-lkp/linux/commits/John-Keeping/ALSA-usb-audio-Fix-recursive-locking-on-XRUN/20230318-035430
base: https://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound.git for-next
patch link: https://lore.kernel.org/r/20230317195128.3911155-1-john%40metanate.com
patch subject: [PATCH] ALSA: usb-audio: Fix recursive locking on XRUN
config: x86_64-randconfig-a013-20230313 (https://download.01.org/0day-ci/archive/20230318/[email protected]/config)
compiler: clang version 14.0.6 (https://github.com/llvm/llvm-project f28c006a5895fc0e329fe15fead81e37457cb1d1)
reproduce (this is a W=1 build):
wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
chmod +x ~/bin/make.cross
# https://github.com/intel-lab-lkp/linux/commit/21bbf1266d22cbc0e1ec7c8d535738f66bbc9801
git remote add linux-review https://github.com/intel-lab-lkp/linux
git fetch --no-tags linux-review John-Keeping/ALSA-usb-audio-Fix-recursive-locking-on-XRUN/20230318-035430
git checkout 21bbf1266d22cbc0e1ec7c8d535738f66bbc9801
# save the config file
mkdir build_dir && cp config build_dir/.config
COMPILER_INSTALL_PATH=$HOME/0day COMPILER=clang make.cross W=1 O=build_dir ARCH=x86_64 olddefconfig
COMPILER_INSTALL_PATH=$HOME/0day COMPILER=clang make.cross W=1 O=build_dir ARCH=x86_64 SHELL=/bin/bash

If you fix the issue, kindly add following tag where applicable
| Reported-by: kernel test robot <[email protected]>
| Link: https://lore.kernel.org/oe-kbuild-all/[email protected]/

All errors (new ones prefixed by >>, old ones prefixed by <<):

>> ERROR: modpost: "snd_pcm_stop_xrun_under_stream_lock" [sound/usb/snd-usb-audio.ko] undefined!

--
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests

2023-03-18 10:59:39

by Takashi Sakamoto

Subject: Re: [PATCH] ALSA: usb-audio: Fix recursive locking on XRUN

On Sat, Mar 18, 2023 at 09:20:05AM +0900, Takashi Sakamoto wrote:
> 1. Failure at tasklet for URB completion
>
> It is softIRQ context. The stream lock is not acquired. It doesn't
> matter to call current XRUN API.
>
> 2. Failure at PCM operation called by ALSA PCM application
>
> It is process context. The stream lock is acquired before calling driver
> code. When detecting any type of failure, driver code stores the state.
> Then .pointer callback should return SNDRV_PCM_IOCTL_HWSYNC refering to
> the state.

Oops, I made a copy-and-paste mistake here... It should be
SNDRV_PCM_POS_XRUN instead of SNDRV_PCM_IOCTL_HWSYNC.


Regards

Takashi Sakamoto

2023-03-19 03:29:06

by Takashi Sakamoto

Subject: Re: [PATCH] ALSA: usb-audio: Fix recursive locking on XRUN

Hi,

On Sat, Mar 18, 2023 at 09:20:05AM +0900, Takashi Sakamoto wrote:
> On Fri, Mar 17, 2023 at 07:51:27PM +0000, John Keeping wrote:
> > snd_usb_queue_pending_output_urbs() may be called from
> > snd_pcm_ops::ack() which means the PCM stream is locked.
> >
> > For the normal case where the call back into the PCM core is via
> > prepare_output_urb() the "_under_stream_lock" variant of
> > snd_pcm_period_elapsed() is called, but when an error occurs and the
> > stream is stopped as XRUN then snd_pcm_xrun() tries to recursively lock
> > the stream which results in deadlock.
> >
> > Follow the example of snd_pcm_period_elapsed() by adding
> > snd_pcm_xrun_under_stream_lock() and use this when the PCM substream
> > lock is already held.
> >
> > Signed-off-by: John Keeping <[email protected]>
> > ---
> > include/sound/pcm.h | 1 +
> > sound/core/pcm_native.c | 28 ++++++++++++++++++++++++----
> > sound/usb/endpoint.c | 18 +++++++++++-------
> > 3 files changed, 36 insertions(+), 11 deletions(-)
>
> The name of added kernel API implies me that you refer to existent
> 'snd_pcm_period_elapsed_under_stream_lock()' which I added to Linux
> v5.14.
>
> In my opinion, unlike the version of period elapsed API, the version of
> XRUN API seems not to be necessarily required to ALSA PCM core, since PCM
> device drivers can implement .pointer callback in the part of PCM operation.
> When the callback returns SNDRV_PCM_POS_XRUN, ALSA PCM application get
> occurence of XRUN as a result of any operation relevant to hwptr movement
> (e.g. SNDRV_PCM_IOCTL_HWSYNC).
>
> Therefore I think it possible to fix the issue without the proposed
> kernel API. I can assume some scenario:
>
> 1. Failure at tasklet for URB completion
>
> It is softIRQ context. The stream lock is not acquired. It doesn't
> matter to call current XRUN API.
>
> 2. Failure at PCM operation called by ALSA PCM application
>
> It is process context. The stream lock is acquired before calling driver
> code. When detecting any type of failure, driver code stores the state.
> Then .pointer callback should return SNDRV_PCM_POS_XRUN referring to
> the state.

Although I am inexperienced with hacking the USB audio device class driver,
I am posting a patch that attempts to fix the recursive stream lock issue.
I apologize in advance, since the patch is not tested yet...

The 'in_xrun' member is newly added to 'struct snd_usb_substream'. When
any failure is detected, true is assigned to the member. The assignment
is expected to be done in both softIRQ context and process context with
the stream lock held, thus there is no need to take care of concurrent
access (e.g. by using WRITE_ONCE/READ_ONCE).

A typical ALSA PCM application periodically calls a PCM operation which
calls .pointer in driver code. As I described, returning SNDRV_PCM_POS_XRUN
makes the ALSA PCM core handle the XRUN state of the PCM substream at that
time.

The drawback of the patch is the delay of XRUN notification to the user
space application. In that respect, I think the new kernel API introduced
by your patch has an advantage.

The in_xrun member can be replaced with a kind of EP_STATE_
enumeration, e.g. EP_STATE_XRUN. In that case, we need some care so that
the state can be referred to from pcm.c.

For your information.

```
---
sound/usb/card.h | 1 +
sound/usb/endpoint.c | 18 +++++++++++-------
sound/usb/pcm.c | 3 ++-
3 files changed, 14 insertions(+), 8 deletions(-)

diff --git a/sound/usb/card.h b/sound/usb/card.h
index 6ec95b2edf86..cb07d5eb09ad 100644
--- a/sound/usb/card.h
+++ b/sound/usb/card.h
@@ -172,6 +172,7 @@ struct snd_usb_substream {
unsigned int hwptr_done; /* processed byte position in the buffer */
unsigned int transfer_done; /* processed frames since last period update */
unsigned int frame_limit; /* limits number of packets in URB */
+ bool in_xrun;

/* data and sync endpoints for this stream */
unsigned int ep_num; /* the endpoint number */
diff --git a/sound/usb/endpoint.c b/sound/usb/endpoint.c
index 1e0af1179ca8..41266c169404 100644
--- a/sound/usb/endpoint.c
+++ b/sound/usb/endpoint.c
@@ -400,13 +400,17 @@ static int prepare_inbound_urb(struct snd_usb_endpoint *ep,
}

/* notify an error as XRUN to the assigned PCM data substream */
-static void notify_xrun(struct snd_usb_endpoint *ep)
+static void notify_xrun(struct snd_usb_endpoint *ep, bool in_stream_lock)
{
struct snd_usb_substream *data_subs;

data_subs = READ_ONCE(ep->data_subs);
- if (data_subs && data_subs->pcm_substream)
- snd_pcm_stop_xrun(data_subs->pcm_substream);
+ if (data_subs && data_subs->pcm_substream && !data_subs->in_xrun) {
+ if (in_stream_lock)
+ data_subs->in_xrun = true;
+ else
+ snd_pcm_stop_xrun(data_subs->pcm_substream);
+ }
}

static struct snd_usb_packet_info *
@@ -498,7 +502,7 @@ void snd_usb_queue_pending_output_urbs(struct snd_usb_endpoint *ep,
if (err == -EAGAIN)
push_back_to_ready_list(ep, ctx);
else
- notify_xrun(ep);
+ notify_xrun(ep, in_stream_lock);
return;
}

@@ -507,7 +511,7 @@ void snd_usb_queue_pending_output_urbs(struct snd_usb_endpoint *ep,
usb_audio_err(ep->chip,
"Unable to submit urb #%d: %d at %s\n",
ctx->index, err, __func__);
- notify_xrun(ep);
+ notify_xrun(ep, in_stream_lock);
return;
}

@@ -574,7 +578,7 @@ static void snd_complete_urb(struct urb *urb)
return;

usb_audio_err(ep->chip, "cannot submit urb (err = %d)\n", err);
- notify_xrun(ep);
+ notify_xrun(ep, false);

exit_clear:
clear_bit(ctx->index, &ep->active_mask);
@@ -1762,7 +1766,7 @@ static void snd_usb_handle_sync_urb(struct snd_usb_endpoint *ep,
usb_audio_err(ep->chip,
"next package FIFO overflow EP 0x%x\n",
ep->ep_num);
- notify_xrun(ep);
+ notify_xrun(ep, false);
return;
}

diff --git a/sound/usb/pcm.c b/sound/usb/pcm.c
index d959da7a1afb..8889c81297db 100644
--- a/sound/usb/pcm.c
+++ b/sound/usb/pcm.c
@@ -75,7 +75,7 @@ static snd_pcm_uframes_t snd_usb_pcm_pointer(struct snd_pcm_substream *substream
struct snd_usb_substream *subs = runtime->private_data;
unsigned int hwptr_done;

- if (atomic_read(&subs->stream->chip->shutdown))
+ if (atomic_read(&subs->stream->chip->shutdown) || subs->in_xrun)
return SNDRV_PCM_POS_XRUN;
spin_lock(&subs->lock);
hwptr_done = subs->hwptr_done;
@@ -671,6 +671,7 @@ static int snd_usb_pcm_prepare(struct snd_pcm_substream *substream)
subs->transfer_done = 0;
subs->last_frame_number = 0;
subs->period_elapsed_pending = 0;
+ subs->in_xrun = false;
runtime->delay = 0;

subs->lowlatency_playback = lowlatency_playback_available(runtime, subs);
```


Takashi Sakamoto
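
The deferred-flag idea in the patch above can be modelled in userspace. This is a minimal sketch under stated assumptions: `DEMO_POS_XRUN` is a hypothetical sentinel standing in for SNDRV_PCM_POS_XRUN, the `demo_` functions are illustrative names rather than driver code, and the immediate stop is represented by a counter instead of a call into the PCM core.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical sentinel standing in for SNDRV_PCM_POS_XRUN. */
#define DEMO_POS_XRUN (-1L)

/* Hypothetical stand-in for the driver's per-substream state. */
struct demo_subs {
	bool in_xrun;		/* failure recorded while the lock was held */
	long hwptr_done;	/* last reported position */
	int stop_calls;		/* counts immediate stop requests */
};

/* Record the failure if the lock is held, otherwise stop right away. */
static void demo_notify_xrun(struct demo_subs *subs, bool in_stream_lock)
{
	if (subs->in_xrun)
		return;
	if (in_stream_lock)
		subs->in_xrun = true;	/* deferred until the next pointer call */
	else
		subs->stop_calls++;	/* models an immediate stop as XRUN */
}

/* Models the .pointer callback: report XRUN if a failure was recorded. */
static long demo_pointer(const struct demo_subs *subs)
{
	if (subs->in_xrun)
		return DEMO_POS_XRUN;
	return subs->hwptr_done;
}

/* Models the prepare step, which clears the recorded failure. */
static void demo_prepare(struct demo_subs *subs)
{
	subs->in_xrun = false;
	subs->hwptr_done = 0;
}
```

The sketch also makes the trade-off visible: after `demo_notify_xrun(subs, true)` nothing is reported until `demo_pointer()` is next called.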

2023-03-19 07:57:21

by Takashi Iwai

Subject: Re: [PATCH] ALSA: usb-audio: Fix recursive locking on XRUN

On Sun, 19 Mar 2023 04:28:53 +0100,
Takashi Sakamoto wrote:
>
> Hi,
>
> On Sat, Mar 18, 2023 at 09:20:05AM +0900, Takashi Sakamoto wrote:
> > On Fri, Mar 17, 2023 at 07:51:27PM +0000, John Keeping wrote:
> > > snd_usb_queue_pending_output_urbs() may be called from
> > > snd_pcm_ops::ack() which means the PCM stream is locked.
> > >
> > > For the normal case where the call back into the PCM core is via
> > > prepare_output_urb() the "_under_stream_lock" variant of
> > > snd_pcm_period_elapsed() is called, but when an error occurs and the
> > > stream is stopped as XRUN then snd_pcm_xrun() tries to recursively lock
> > > the stream which results in deadlock.
> > >
> > > Follow the example of snd_pcm_period_elapsed() by adding
> > > snd_pcm_xrun_under_stream_lock() and use this when the PCM substream
> > > lock is already held.
> > >
> > > Signed-off-by: John Keeping <[email protected]>
> > > ---
> > > include/sound/pcm.h | 1 +
> > > sound/core/pcm_native.c | 28 ++++++++++++++++++++++++----
> > > sound/usb/endpoint.c | 18 +++++++++++-------
> > > 3 files changed, 36 insertions(+), 11 deletions(-)
> >
> > The name of added kernel API implies me that you refer to existent
> > 'snd_pcm_period_elapsed_under_stream_lock()' which I added to Linux
> > v5.14.
> >
> > In my opinion, unlike the version of period elapsed API, the version of
> > XRUN API seems not to be necessarily required to ALSA PCM core, since PCM
> > device drivers can implement .pointer callback in the part of PCM operation.
> > When the callback returns SNDRV_PCM_POS_XRUN, ALSA PCM application get
> > occurence of XRUN as a result of any operation relevant to hwptr movement
> > (e.g. SNDRV_PCM_IOCTL_HWSYNC).
> >
> > Therefore I think it possible to fix the issue without the proposed
> > kernel API. I can assume some scenario:
> >
> > 1. Failure at tasklet for URB completion
> >
> > It is softIRQ context. The stream lock is not acquired. It doesn't
> > matter to call current XRUN API.
> >
> > 2. Failure at PCM operation called by ALSA PCM application
> >
> > It is process context. The stream lock is acquired before calling driver
> > code. When detecting any type of failure, driver code stores the state.
> > Then .pointer callback should return SNDRV_PCM_POS_XRUN referring to
> > the state.
>
> Although being inexperienced to hack driver for USB audio device class,
> I attempt to post the patch to fix the issue of recursive stream lock.
> I apologies in advance since the patch is not tested yet...
>
> The 'in_xrun' member is newly added to 'struct snd_usb_substream'. When
> detecting any failure, false is assigned to the member. The assignment
> is expected to be done in both softIRQ context, and process context with
> stream lock, thus no need to take care of cocurrent access (e.g. by usage
> of WRITE_ONCE/READ_ONCE).
>
> Typical ALSA PCM application periodically calls PCM operation which calls
> .pointer in driver code. As I described, returning SNDRV_PCM_POS_XRUN
> takes ALSA PCM core to handle XRUN state of PCM substream in the timing.
>
> The negative point of the patch is the delay of XRUN notification to user
> space application. In the point, I think the new kernel API introduced by
> your patch has advantage.
>
> The in_xrun member can be replaced with a kind of EP_STATE_
> enumerations; i.e. EP_STATE_XRUN. In the case, we need some care so that
> the state should be referred from pcm.c.

Thanks for the patch. That would work, but the shortcoming of
this implementation is that it does not stop the stream or report the
error immediately; it waits for the next pointer update instead.

It might be simpler if we perform the xrun handling on the caller
side, i.e. a change like below:

--- a/sound/core/pcm_lib.c
+++ b/sound/core/pcm_lib.c
@@ -2155,6 +2155,8 @@ int pcm_lib_apply_appl_ptr(struct snd_pcm_substream *substream,
ret = substream->ops->ack(substream);
if (ret < 0) {
runtime->control->appl_ptr = old_appl_ptr;
+ if (ret == -EPIPE)
+ __snd_pcm_xrun(substream);
return ret;
}
}

... and let the ack callback return -EPIPE for XRUN:

--- a/sound/usb/endpoint.c
+++ b/sound/usb/endpoint.c
@@ -455,8 +455,8 @@ static void push_back_to_ready_list(struct snd_usb_endpoint *ep,
* This function is used both for implicit feedback endpoints and in low-
* latency playback mode.
*/
-void snd_usb_queue_pending_output_urbs(struct snd_usb_endpoint *ep,
- bool in_stream_lock)
+int snd_usb_queue_pending_output_urbs(struct snd_usb_endpoint *ep,
+ bool in_stream_lock)
{
bool implicit_fb = snd_usb_endpoint_implicit_feedback_sink(ep);

@@ -480,7 +480,7 @@ void snd_usb_queue_pending_output_urbs(struct snd_usb_endpoint *ep,
spin_unlock_irqrestore(&ep->lock, flags);

if (ctx == NULL)
- return;
+ return 0;

/* copy over the length information */
if (implicit_fb) {
@@ -495,11 +495,11 @@ void snd_usb_queue_pending_output_urbs(struct snd_usb_endpoint *ep,
break;
if (err < 0) {
/* push back to ready list again for -EAGAIN */
- if (err == -EAGAIN)
+ if (err == -EAGAIN) {
push_back_to_ready_list(ep, ctx);
- else
- notify_xrun(ep);
- return;
+ return 0;
+ }
+ return -EPIPE;
}

err = usb_submit_urb(ctx->urb, GFP_ATOMIC);
@@ -507,8 +507,7 @@ void snd_usb_queue_pending_output_urbs(struct snd_usb_endpoint *ep,
usb_audio_err(ep->chip,
"Unable to submit urb #%d: %d at %s\n",
ctx->index, err, __func__);
- notify_xrun(ep);
- return;
+ return -EPIPE;
}

set_bit(ctx->index, &ep->active_mask);
--- a/sound/usb/endpoint.h
+++ b/sound/usb/endpoint.h
@@ -52,7 +52,7 @@ int snd_usb_endpoint_implicit_feedback_sink(struct snd_usb_endpoint *ep);
int snd_usb_endpoint_next_packet_size(struct snd_usb_endpoint *ep,
struct snd_urb_ctx *ctx, int idx,
unsigned int avail);
-void snd_usb_queue_pending_output_urbs(struct snd_usb_endpoint *ep,
- bool in_stream_lock);
+int snd_usb_queue_pending_output_urbs(struct snd_usb_endpoint *ep,
+ bool in_stream_lock);

#endif /* __USBAUDIO_ENDPOINT_H */
--- a/sound/usb/pcm.c
+++ b/sound/usb/pcm.c
@@ -1639,7 +1639,7 @@ static int snd_usb_pcm_playback_ack(struct snd_pcm_substream *substream)
* outputs here
*/
if (!ep->active_mask)
- snd_usb_queue_pending_output_urbs(ep, true);
+ return snd_usb_queue_pending_output_urbs(ep, true);
return 0;
}


thanks,

Takashi

2023-03-19 09:16:03

by Takashi Iwai

Subject: Re: [PATCH] ALSA: usb-audio: Fix recursive locking on XRUN

On Sun, 19 Mar 2023 08:57:03 +0100,
Takashi Iwai wrote:
>
> On Sun, 19 Mar 2023 04:28:53 +0100,
> Takashi Sakamoto wrote:
> >
> > Hi,
> >
> > On Sat, Mar 18, 2023 at 09:20:05AM +0900, Takashi Sakamoto wrote:
> > > On Fri, Mar 17, 2023 at 07:51:27PM +0000, John Keeping wrote:
> > > > snd_usb_queue_pending_output_urbs() may be called from
> > > > snd_pcm_ops::ack() which means the PCM stream is locked.
> > > >
> > > > For the normal case where the call back into the PCM core is via
> > > > prepare_output_urb() the "_under_stream_lock" variant of
> > > > snd_pcm_period_elapsed() is called, but when an error occurs and the
> > > > stream is stopped as XRUN then snd_pcm_xrun() tries to recursively lock
> > > > the stream which results in deadlock.
> > > >
> > > > Follow the example of snd_pcm_period_elapsed() by adding
> > > > snd_pcm_xrun_under_stream_lock() and use this when the PCM substream
> > > > lock is already held.
> > > >
> > > > Signed-off-by: John Keeping <[email protected]>
> > > > ---
> > > > include/sound/pcm.h | 1 +
> > > > sound/core/pcm_native.c | 28 ++++++++++++++++++++++++----
> > > > sound/usb/endpoint.c | 18 +++++++++++-------
> > > > 3 files changed, 36 insertions(+), 11 deletions(-)
> > >
> > > The name of added kernel API implies me that you refer to existent
> > > 'snd_pcm_period_elapsed_under_stream_lock()' which I added to Linux
> > > v5.14.
> > >
> > > In my opinion, unlike the version of period elapsed API, the version of
> > > XRUN API seems not to be necessarily required to ALSA PCM core, since PCM
> > > device drivers can implement .pointer callback in the part of PCM operation.
> > > When the callback returns SNDRV_PCM_POS_XRUN, ALSA PCM application get
> > > occurence of XRUN as a result of any operation relevant to hwptr movement
> > > (e.g. SNDRV_PCM_IOCTL_HWSYNC).
> > >
> > > Therefore I think it possible to fix the issue without the proposed
> > > kernel API. I can assume some scenario:
> > >
> > > 1. Failure at tasklet for URB completion
> > >
> > > It is softIRQ context. The stream lock is not acquired. It doesn't
> > > matter to call current XRUN API.
> > >
> > > 2. Failure at PCM operation called by ALSA PCM application
> > >
> > > It is process context. The stream lock is acquired before calling driver
> > > code. When detecting any type of failure, driver code stores the state.
> > > Then .pointer callback should return SNDRV_PCM_POS_XRUN referring to
> > > the state.
> >
> > Although being inexperienced to hack driver for USB audio device class,
> > I attempt to post the patch to fix the issue of recursive stream lock.
> > I apologies in advance since the patch is not tested yet...
> >
> > The 'in_xrun' member is newly added to 'struct snd_usb_substream'. When
> > detecting any failure, false is assigned to the member. The assignment
> > is expected to be done in both softIRQ context, and process context with
> > stream lock, thus no need to take care of cocurrent access (e.g. by usage
> > of WRITE_ONCE/READ_ONCE).
> >
> > Typical ALSA PCM application periodically calls PCM operation which calls
> > .pointer in driver code. As I described, returning SNDRV_PCM_POS_XRUN
> > takes ALSA PCM core to handle XRUN state of PCM substream in the timing.
> >
> > The negative point of the patch is the delay of XRUN notification to user
> > space application. In the point, I think the new kernel API introduced by
> > your patch has advantage.
> >
> > The in_xrun member can be replaced with a kind of EP_STATE_
> > enumerations; i.e. EP_STATE_XRUN. In the case, we need some care so that
> > the state should be referred from pcm.c.
>
> Thanks for the patch. That would work, but the shortcoming side of
> this implementation is that it misses stopping / reporting the error
> immediately but waiting for the next pointer update.
>
> It might be simpler if we perform the xrun handling in the caller
> side, i.e. a change like below:
>
> --- a/sound/core/pcm_lib.c
> +++ b/sound/core/pcm_lib.c
> @@ -2155,6 +2155,8 @@ int pcm_lib_apply_appl_ptr(struct snd_pcm_substream *substream,
> ret = substream->ops->ack(substream);
> if (ret < 0) {
> runtime->control->appl_ptr = old_appl_ptr;
> + if (ret == -EPIPE)
> + __snd_pcm_xrun(substream);
> return ret;
> }
> }
>
> ... and let the caller returning -EPIPE for XRUN:

and that misses the XRUN when the function is called without the
stream lock held. A revised version is below.


Takashi

-- 8< --
--- a/sound/core/pcm_lib.c
+++ b/sound/core/pcm_lib.c
@@ -2155,6 +2155,8 @@ int pcm_lib_apply_appl_ptr(struct snd_pcm_substream *substream,
ret = substream->ops->ack(substream);
if (ret < 0) {
runtime->control->appl_ptr = old_appl_ptr;
+ if (ret == -EPIPE)
+ __snd_pcm_xrun(substream);
return ret;
}
}
--- a/sound/usb/endpoint.c
+++ b/sound/usb/endpoint.c
@@ -455,8 +455,8 @@ static void push_back_to_ready_list(struct snd_usb_endpoint *ep,
* This function is used both for implicit feedback endpoints and in low-
* latency playback mode.
*/
-void snd_usb_queue_pending_output_urbs(struct snd_usb_endpoint *ep,
- bool in_stream_lock)
+int snd_usb_queue_pending_output_urbs(struct snd_usb_endpoint *ep,
+ bool in_stream_lock)
{
bool implicit_fb = snd_usb_endpoint_implicit_feedback_sink(ep);

@@ -480,7 +480,7 @@ void snd_usb_queue_pending_output_urbs(struct snd_usb_endpoint *ep,
spin_unlock_irqrestore(&ep->lock, flags);

if (ctx == NULL)
- return;
+ break;

/* copy over the length information */
if (implicit_fb) {
@@ -495,11 +495,14 @@ void snd_usb_queue_pending_output_urbs(struct snd_usb_endpoint *ep,
break;
if (err < 0) {
/* push back to ready list again for -EAGAIN */
- if (err == -EAGAIN)
+ if (err == -EAGAIN) {
push_back_to_ready_list(ep, ctx);
- else
+ break;
+ }
+
+ if (!in_stream_lock)
notify_xrun(ep);
- return;
+ return -EPIPE;
}

err = usb_submit_urb(ctx->urb, GFP_ATOMIC);
@@ -507,13 +510,16 @@ void snd_usb_queue_pending_output_urbs(struct snd_usb_endpoint *ep,
usb_audio_err(ep->chip,
"Unable to submit urb #%d: %d at %s\n",
ctx->index, err, __func__);
- notify_xrun(ep);
- return;
+ if (!in_stream_lock)
+ notify_xrun(ep);
+ return -EPIPE;
}

set_bit(ctx->index, &ep->active_mask);
atomic_inc(&ep->submitted_urbs);
}
+
+ return 0;
}

/*
--- a/sound/usb/endpoint.h
+++ b/sound/usb/endpoint.h
@@ -52,7 +52,7 @@ int snd_usb_endpoint_implicit_feedback_sink(struct snd_usb_endpoint *ep);
int snd_usb_endpoint_next_packet_size(struct snd_usb_endpoint *ep,
struct snd_urb_ctx *ctx, int idx,
unsigned int avail);
-void snd_usb_queue_pending_output_urbs(struct snd_usb_endpoint *ep,
- bool in_stream_lock);
+int snd_usb_queue_pending_output_urbs(struct snd_usb_endpoint *ep,
+ bool in_stream_lock);

#endif /* __USBAUDIO_ENDPOINT_H */
--- a/sound/usb/pcm.c
+++ b/sound/usb/pcm.c
@@ -1639,7 +1639,7 @@ static int snd_usb_pcm_playback_ack(struct snd_pcm_substream *substream)
* outputs here
*/
if (!ep->active_mask)
- snd_usb_queue_pending_output_urbs(ep, true);
+ return snd_usb_queue_pending_output_urbs(ep, true);
return 0;
}


2023-03-20 12:04:39

by John Keeping

Subject: Re: [PATCH] ALSA: usb-audio: Fix recursive locking on XRUN

On Sun, Mar 19, 2023 at 10:15:55AM +0100, Takashi Iwai wrote:
> On Sun, 19 Mar 2023 08:57:03 +0100,
> Takashi Iwai wrote:
> >
> > On Sun, 19 Mar 2023 04:28:53 +0100,
> > Takashi Sakamoto wrote:
> > >
> > > Hi,
> > >
> > > On Sat, Mar 18, 2023 at 09:20:05AM +0900, Takashi Sakamoto wrote:
> > > > On Fri, Mar 17, 2023 at 07:51:27PM +0000, John Keeping wrote:
> > > > > snd_usb_queue_pending_output_urbs() may be called from
> > > > > snd_pcm_ops::ack() which means the PCM stream is locked.
> > > > >
> > > > > For the normal case where the call back into the PCM core is via
> > > > > prepare_output_urb() the "_under_stream_lock" variant of
> > > > > snd_pcm_period_elapsed() is called, but when an error occurs and the
> > > > > stream is stopped as XRUN then snd_pcm_xrun() tries to recursively lock
> > > > > the stream which results in deadlock.
> > > > >
> > > > > Follow the example of snd_pcm_period_elapsed() by adding
> > > > > snd_pcm_xrun_under_stream_lock() and use this when the PCM substream
> > > > > lock is already held.
> > > > >
> > > > > Signed-off-by: John Keeping <[email protected]>
> > > > > ---
> > > > > include/sound/pcm.h | 1 +
> > > > > sound/core/pcm_native.c | 28 ++++++++++++++++++++++++----
> > > > > sound/usb/endpoint.c | 18 +++++++++++-------
> > > > > 3 files changed, 36 insertions(+), 11 deletions(-)
> > > >
> > > > The name of the added kernel API suggests to me that you are referring to
> > > > the existing 'snd_pcm_period_elapsed_under_stream_lock()' which I added in
> > > > Linux v5.14.
> > > >
> > > > In my opinion, unlike the period-elapsed API, an XRUN variant does not
> > > > seem to be strictly required in the ALSA PCM core, since PCM device
> > > > drivers can implement the .pointer callback as part of the PCM operations.
> > > > When the callback returns SNDRV_PCM_POS_XRUN, the ALSA PCM application
> > > > sees the occurrence of XRUN as a result of any operation relevant to hwptr
> > > > movement (e.g. SNDRV_PCM_IOCTL_HWSYNC).
> > > >
> > > > Therefore I think it is possible to fix the issue without the proposed
> > > > kernel API. I can think of two scenarios:
> > > >
> > > > 1. Failure at tasklet for URB completion
> > > >
> > > > It is softIRQ context. The stream lock is not acquired. It doesn't
> > > > matter to call current XRUN API.
> > > >
> > > > 2. Failure at PCM operation called by ALSA PCM application
> > > >
> > > > It is process context. The stream lock is acquired before calling driver
> > > > code. When detecting any type of failure, driver code stores the state.
> > > > Then the .pointer callback should return SNDRV_PCM_POS_XRUN, referring to
> > > > the state.
> > >
> > > Although I am inexperienced in hacking on the driver for the USB audio
> > > device class, I am attempting to post a patch to fix the recursive stream
> > > lock issue. I apologize in advance since the patch is not tested yet...
> > >
> > > The 'in_xrun' member is newly added to 'struct snd_usb_substream'. When
> > > detecting any failure, false is assigned to the member. The assignment
> > > is expected to be done in both softIRQ context, and process context with
> > > stream lock, thus no need to take care of concurrent access (e.g. by usage
> > > of WRITE_ONCE/READ_ONCE).
> > >
> > > A typical ALSA PCM application periodically calls a PCM operation which calls
> > > .pointer in driver code. As I described, returning SNDRV_PCM_POS_XRUN
> > > lets the ALSA PCM core handle the XRUN state of the PCM substream at that point.
> > >
> > > The downside of the patch is the delay of the XRUN notification to the user
> > > space application. In this respect, I think the new kernel API introduced by
> > > your patch has an advantage.
> > >
> > > The in_xrun member can be replaced with a kind of EP_STATE_
> > > enumerations; i.e. EP_STATE_XRUN. In the case, we need some care so that
> > > the state should be referred from pcm.c.
> >
> > Thanks for the patch. That would work, but the shortcoming of
> > this implementation is that it does not stop / report the error
> > immediately but waits for the next pointer update.
> >
> > It might be simpler if we perform the xrun handling in the caller
> > side, i.e. a change like below:
> >
> > --- a/sound/core/pcm_lib.c
> > +++ b/sound/core/pcm_lib.c
> > @@ -2155,6 +2155,8 @@ int pcm_lib_apply_appl_ptr(struct snd_pcm_substream *substream,
> > ret = substream->ops->ack(substream);
> > if (ret < 0) {
> > runtime->control->appl_ptr = old_appl_ptr;
> > + if (ret == -EPIPE)
> > + __snd_pcm_xrun(substream);
> > return ret;
> > }
> > }
> >
> > ... and let the caller return -EPIPE for XRUN:
>
> and that misses the XRUN in the case of non-stream-lock.
> A revised version is below.

Yes, it looks like this also solves the problem. If you roll this into
a proper patch feel free to add:

Tested-by: John Keeping <[email protected]>

>
> -- 8< --
> --- a/sound/core/pcm_lib.c
> +++ b/sound/core/pcm_lib.c
> @@ -2155,6 +2155,8 @@ int pcm_lib_apply_appl_ptr(struct snd_pcm_substream *substream,
> ret = substream->ops->ack(substream);
> if (ret < 0) {
> runtime->control->appl_ptr = old_appl_ptr;
> + if (ret == -EPIPE)
> + __snd_pcm_xrun(substream);
> return ret;
> }
> }
> --- a/sound/usb/endpoint.c
> +++ b/sound/usb/endpoint.c
> @@ -455,8 +455,8 @@ static void push_back_to_ready_list(struct snd_usb_endpoint *ep,
> * This function is used both for implicit feedback endpoints and in low-
> * latency playback mode.
> */
> -void snd_usb_queue_pending_output_urbs(struct snd_usb_endpoint *ep,
> - bool in_stream_lock)
> +int snd_usb_queue_pending_output_urbs(struct snd_usb_endpoint *ep,
> + bool in_stream_lock)
> {
> bool implicit_fb = snd_usb_endpoint_implicit_feedback_sink(ep);
>
> @@ -480,7 +480,7 @@ void snd_usb_queue_pending_output_urbs(struct snd_usb_endpoint *ep,
> spin_unlock_irqrestore(&ep->lock, flags);
>
> if (ctx == NULL)
> - return;
> + break;
>
> /* copy over the length information */
> if (implicit_fb) {
> @@ -495,11 +495,14 @@ void snd_usb_queue_pending_output_urbs(struct snd_usb_endpoint *ep,
> break;
> if (err < 0) {
> /* push back to ready list again for -EAGAIN */
> - if (err == -EAGAIN)
> + if (err == -EAGAIN) {
> push_back_to_ready_list(ep, ctx);
> - else
> + break;
> + }
> +
> + if (!in_stream_lock)
> notify_xrun(ep);
> - return;
> + return -EPIPE;
> }
>
> err = usb_submit_urb(ctx->urb, GFP_ATOMIC);
> @@ -507,13 +510,16 @@ void snd_usb_queue_pending_output_urbs(struct snd_usb_endpoint *ep,
> usb_audio_err(ep->chip,
> "Unable to submit urb #%d: %d at %s\n",
> ctx->index, err, __func__);
> - notify_xrun(ep);
> - return;
> + if (!in_stream_lock)
> + notify_xrun(ep);
> + return -EPIPE;
> }
>
> set_bit(ctx->index, &ep->active_mask);
> atomic_inc(&ep->submitted_urbs);
> }
> +
> + return 0;
> }
>
> /*
> --- a/sound/usb/endpoint.h
> +++ b/sound/usb/endpoint.h
> @@ -52,7 +52,7 @@ int snd_usb_endpoint_implicit_feedback_sink(struct snd_usb_endpoint *ep);
> int snd_usb_endpoint_next_packet_size(struct snd_usb_endpoint *ep,
> struct snd_urb_ctx *ctx, int idx,
> unsigned int avail);
> -void snd_usb_queue_pending_output_urbs(struct snd_usb_endpoint *ep,
> - bool in_stream_lock);
> +int snd_usb_queue_pending_output_urbs(struct snd_usb_endpoint *ep,
> + bool in_stream_lock);
>
> #endif /* __USBAUDIO_ENDPOINT_H */
> --- a/sound/usb/pcm.c
> +++ b/sound/usb/pcm.c
> @@ -1639,7 +1639,7 @@ static int snd_usb_pcm_playback_ack(struct snd_pcm_substream *substream)
> * outputs here
> */
> if (!ep->active_mask)
> - snd_usb_queue_pending_output_urbs(ep, true);
> + return snd_usb_queue_pending_output_urbs(ep, true);
> return 0;
> }
>

2023-03-20 14:28:14

by Takashi Iwai

Subject: Re: [PATCH] ALSA: usb-audio: Fix recursive locking on XRUN

On Mon, 20 Mar 2023 13:04:22 +0100,
John Keeping wrote:
>
> On Sun, Mar 19, 2023 at 10:15:55AM +0100, Takashi Iwai wrote:
> > On Sun, 19 Mar 2023 08:57:03 +0100,
> > Takashi Iwai wrote:
> > >
> > > On Sun, 19 Mar 2023 04:28:53 +0100,
> > > Takashi Sakamoto wrote:
> > > >
> > > > Hi,
> > > >
> > > > On Sat, Mar 18, 2023 at 09:20:05AM +0900, Takashi Sakamoto wrote:
> > > > > On Fri, Mar 17, 2023 at 07:51:27PM +0000, John Keeping wrote:
> > > > > > snd_usb_queue_pending_output_urbs() may be called from
> > > > > > snd_pcm_ops::ack() which means the PCM stream is locked.
> > > > > >
> > > > > > For the normal case where the call back into the PCM core is via
> > > > > > prepare_output_urb() the "_under_stream_lock" variant of
> > > > > > snd_pcm_period_elapsed() is called, but when an error occurs and the
> > > > > > stream is stopped as XRUN then snd_pcm_xrun() tries to recursively lock
> > > > > > the stream which results in deadlock.
> > > > > >
> > > > > > Follow the example of snd_pcm_period_elapsed() by adding
> > > > > > snd_pcm_xrun_under_stream_lock() and use this when the PCM substream
> > > > > > lock is already held.
> > > > > >
> > > > > > Signed-off-by: John Keeping <[email protected]>
> > > > > > ---
> > > > > > include/sound/pcm.h | 1 +
> > > > > > sound/core/pcm_native.c | 28 ++++++++++++++++++++++++----
> > > > > > sound/usb/endpoint.c | 18 +++++++++++-------
> > > > > > 3 files changed, 36 insertions(+), 11 deletions(-)
> > > > >
> > > > > The name of the added kernel API suggests to me that you are referring to
> > > > > the existing 'snd_pcm_period_elapsed_under_stream_lock()' which I added in
> > > > > Linux v5.14.
> > > > >
> > > > > In my opinion, unlike the period-elapsed API, an XRUN variant does not
> > > > > seem to be strictly required in the ALSA PCM core, since PCM device
> > > > > drivers can implement the .pointer callback as part of the PCM operations.
> > > > > When the callback returns SNDRV_PCM_POS_XRUN, the ALSA PCM application
> > > > > sees the occurrence of XRUN as a result of any operation relevant to hwptr
> > > > > movement (e.g. SNDRV_PCM_IOCTL_HWSYNC).
> > > > >
> > > > > Therefore I think it is possible to fix the issue without the proposed
> > > > > kernel API. I can think of two scenarios:
> > > > >
> > > > > 1. Failure at tasklet for URB completion
> > > > >
> > > > > It is softIRQ context. The stream lock is not acquired. It doesn't
> > > > > matter to call current XRUN API.
> > > > >
> > > > > 2. Failure at PCM operation called by ALSA PCM application
> > > > >
> > > > > It is process context. The stream lock is acquired before calling driver
> > > > > code. When detecting any type of failure, driver code stores the state.
> > > > > Then the .pointer callback should return SNDRV_PCM_POS_XRUN, referring to
> > > > > the state.
> > > >
> > > > Although I am inexperienced in hacking on the driver for the USB audio
> > > > device class, I am attempting to post a patch to fix the recursive stream
> > > > lock issue. I apologize in advance since the patch is not tested yet...
> > > >
> > > > The 'in_xrun' member is newly added to 'struct snd_usb_substream'. When
> > > > detecting any failure, false is assigned to the member. The assignment
> > > > is expected to be done in both softIRQ context, and process context with
> > > > stream lock, thus no need to take care of concurrent access (e.g. by usage
> > > > of WRITE_ONCE/READ_ONCE).
> > > >
> > > > A typical ALSA PCM application periodically calls a PCM operation which calls
> > > > .pointer in driver code. As I described, returning SNDRV_PCM_POS_XRUN
> > > > lets the ALSA PCM core handle the XRUN state of the PCM substream at that point.
> > > >
> > > > The downside of the patch is the delay of the XRUN notification to the user
> > > > space application. In this respect, I think the new kernel API introduced by
> > > > your patch has an advantage.
> > > >
> > > > The in_xrun member can be replaced with a kind of EP_STATE_
> > > > enumerations; i.e. EP_STATE_XRUN. In the case, we need some care so that
> > > > the state should be referred from pcm.c.
> > >
> > > Thanks for the patch. That would work, but the shortcoming of
> > > this implementation is that it does not stop / report the error
> > > immediately but waits for the next pointer update.
> > >
> > > It might be simpler if we perform the xrun handling in the caller
> > > side, i.e. a change like below:
> > >
> > > --- a/sound/core/pcm_lib.c
> > > +++ b/sound/core/pcm_lib.c
> > > @@ -2155,6 +2155,8 @@ int pcm_lib_apply_appl_ptr(struct snd_pcm_substream *substream,
> > > ret = substream->ops->ack(substream);
> > > if (ret < 0) {
> > > runtime->control->appl_ptr = old_appl_ptr;
> > > + if (ret == -EPIPE)
> > > + __snd_pcm_xrun(substream);
> > > return ret;
> > > }
> > > }
> > >
> > > ... and let the caller return -EPIPE for XRUN:
> >
> > and that misses the XRUN in the case of non-stream-lock.
> > A revised version is below.
>
> Yes, it looks like this also solves the problem. If you roll this into
> a proper patch feel free to add:
>
> Tested-by: John Keeping <[email protected]>

Thanks, then I'll submit a proper patch.


Takashi
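
For readers following the thread, the approach settled on above (the
callee returns -EPIPE when the stream lock is already held, and the
caller that owns the lock performs the XRUN handling) can be sketched
as a small user-space model. This is only an illustration, not kernel
code: struct fake_substream, stream_lock(), xrun_locked(),
xrun_unlocked(), queue_pending(), fake_ack() and apply_appl_ptr() are
all hypothetical stand-ins, the lock is a plain flag that asserts on
recursive acquisition instead of deadlocking, and apply_appl_ptr()
takes the lock itself for brevity, whereas the thread describes the
real function as being reached with the stream already locked.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for a PCM substream; not the kernel structure. */
struct fake_substream {
	bool locked;       /* models the non-recursive stream lock */
	bool xrun;         /* set once the stream is stopped as XRUN */
	bool submit_fails; /* forces the modelled URB submission to fail */
};

static void stream_lock(struct fake_substream *s)
{
	/* A real lock would deadlock on recursion; the model asserts. */
	assert(!s->locked);
	s->locked = true;
}

static void stream_unlock(struct fake_substream *s)
{
	assert(s->locked);
	s->locked = false;
}

/* Models the lock-free XRUN helper: caller must hold the lock. */
static void xrun_locked(struct fake_substream *s)
{
	assert(s->locked);
	s->xrun = true;
}

/* Models the locking XRUN path that takes the stream lock itself. */
static void xrun_unlocked(struct fake_substream *s)
{
	stream_lock(s);
	xrun_locked(s);
	stream_unlock(s);
}

/* Models the queueing function after the change: it reports failure
 * as -EPIPE and only notifies XRUN itself when the lock is not held. */
int queue_pending(struct fake_substream *s, bool in_stream_lock)
{
	if (s->submit_fails) {
		if (!in_stream_lock)
			xrun_unlocked(s);
		return -EPIPE;
	}
	return 0;
}

/* Models the ack callback, which runs under the stream lock. */
static int fake_ack(struct fake_substream *s)
{
	return queue_pending(s, true);
}

/* Models the caller side: it owns the lock, so it handles XRUN. */
int apply_appl_ptr(struct fake_substream *s)
{
	int ret;

	stream_lock(s);
	ret = fake_ack(s);
	if (ret == -EPIPE)
		xrun_locked(s);
	stream_unlock(s);
	return ret;
}
```

In this model, calling apply_appl_ptr() on a substream with
submit_fails set returns -EPIPE and marks the stream as XRUN without
ever taking the lock twice; calling queue_pending() with
in_stream_lock == false reaches the same state through the locking
path.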