2022-03-28 16:31:39

by Mohan Kumar

Subject: [PATCH] ALSA: hda: Avoid unsol event during RPM suspending

There is a corner case with unsol event handling while the codec is in
the runtime suspending state. When the codec runtime suspend call is
initiated, the codec->in_pm atomic variable is 0. The codec runtime
suspend function then calls snd_hdac_enter_pm(), which just increments
codec->in_pm. Consider an unsol event that arrives just after this step
and before snd_hdac_leave_pm() in the codec runtime suspend function.
The snd_hdac_power_up_pm() call in the unsol event flow, in
hdmi_present_sense_via_verbs(), would just increment codec->in_pm
without calling pm_runtime_get_sync().

The codec runtime suspend flow is already in progress while the unsol
event handler accesses codec verbs in parallel. Because neither path
waits for the other, the suspend flow can complete and switch off the
clocks before the unsol event handling finishes. This results in the
errors below:

[ 589.428020] tegra-hda 3510000.hda: azx_get_response timeout, switching to polling mode: last cmd=0x505f2f57
[ 589.428344] tegra-hda 3510000.hda: spurious response 0x80000074:0x5, last cmd=0x505f2f57
[ 589.428547] tegra-hda 3510000.hda: spurious response 0x80000065:0x5, last cmd=0x505f2f57

To avoid this, the unsol event flow should not perform any codec verb
related operations during RPM_SUSPENDING state.

Signed-off-by: Mohan Kumar <[email protected]>
---
sound/pci/hda/patch_hdmi.c | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/sound/pci/hda/patch_hdmi.c b/sound/pci/hda/patch_hdmi.c
index c85ed7bc121e..67870c8d84a5 100644
--- a/sound/pci/hda/patch_hdmi.c
+++ b/sound/pci/hda/patch_hdmi.c
@@ -1625,6 +1625,7 @@ static void hdmi_present_sense_via_verbs(struct hdmi_spec_per_pin *per_pin,
 	struct hda_codec *codec = per_pin->codec;
 	struct hdmi_spec *spec = codec->spec;
 	struct hdmi_eld *eld = &spec->temp_eld;
+	struct device *dev = hda_codec_dev(codec);
 	hda_nid_t pin_nid = per_pin->pin_nid;
 	int dev_id = per_pin->dev_id;
 	/*
@@ -1639,7 +1640,8 @@ static void hdmi_present_sense_via_verbs(struct hdmi_spec_per_pin *per_pin,
 	int ret;
 
 	ret = snd_hda_power_up_pm(codec);
-	if (ret < 0 && pm_runtime_suspended(hda_codec_dev(codec)))
+	if ((ret < 0 && pm_runtime_suspended(dev)) ||
+	    (dev->power.runtime_status == RPM_SUSPENDING))
 		goto out;
 
 	present = snd_hda_jack_pin_sense(codec, pin_nid, dev_id);
--
2.17.1
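
The race described in the commit message can be sketched in userspace. This is only a model, not kernel code: every model_* name is hypothetical, and it assumes, as the description states, that the power-up helper only bumps in_pm when it is already nonzero and otherwise takes the full resume path.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical userspace stand-in for the codec; not the kernel struct. */
struct model_codec {
	atomic_int in_pm;	/* models codec->in_pm */
	int get_sync_calls;	/* how often the full resume path was taken */
};

/* Increment only if the value is nonzero; returns true if it incremented. */
static bool model_inc_not_zero(atomic_int *v)
{
	int old = atomic_load(v);

	while (old != 0) {
		/* On failure, old is reloaded with the current value. */
		if (atomic_compare_exchange_weak(v, &old, old + 1))
			return true;
	}
	return false;
}

/* Model snd_hdac_enter_pm() / snd_hdac_leave_pm() in the suspend callback. */
static void model_enter_pm(struct model_codec *c)
{
	atomic_fetch_add(&c->in_pm, 1);
}

static void model_leave_pm(struct model_codec *c)
{
	atomic_fetch_sub(&c->in_pm, 1);
}

/* Models the power-up helper: the resume path is skipped while in_pm != 0. */
static int model_power_up_pm(struct model_codec *c)
{
	if (!model_inc_not_zero(&c->in_pm))
		c->get_sync_calls++;	/* stands in for pm_runtime_get_sync() */
	return 0;
}

/* Returns true when an unsol event arriving mid-suspend skipped the resume. */
static bool model_unsol_during_suspend(void)
{
	struct model_codec c = { .get_sync_calls = 0 };

	atomic_init(&c.in_pm, 0);
	model_enter_pm(&c);		/* runtime suspend begins: in_pm 0 -> 1 */
	model_power_up_pm(&c);		/* unsol event: in_pm 1 -> 2, no resume */
	assert(atomic_load(&c.in_pm) == 2);
	model_leave_pm(&c);		/* suspend finishes; clocks would go off */
	return c.get_sync_calls == 0;
}
```

In this model the unsol path never reaches the resume call, so nothing stops the suspend side from finishing while verbs are still in flight.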


2022-03-28 17:30:33

by Mohan Kumar

Subject: Re: [PATCH] ALSA: hda: Avoid unsol event during RPM suspending


On 3/28/2022 3:12 PM, Takashi Iwai wrote:
> On Mon, 28 Mar 2022 11:14:11 +0200,
> Mohan Kumar wrote:
>> There is a corner case with unsol event handling during codec runtime
>> suspending state. When the codec runtime suspend call initiated, the
>> codec->in_pm atomic variable would be 0, currently the codec runtime
>> suspend function calls snd_hdac_enter_pm() which will just increments
>> the codec->in_pm atomic variable. Consider unsol event happened just
>> after this step and before snd_hdac_leave_pm() in the codec runtime
>> suspend function. The snd_hdac_power_up_pm() in the unsol event
>> flow in hdmi_present_sense_via_verbs() function would just increment
>> the codec->in_pm atomic variable without calling pm_runtime_get_sync
>> function.
>>
>> As codec runtime suspend flow is already in progress and in parallel
>> unsol event is also accessing the codec verbs, as soon as codec
>> suspend flow completes and clocks are switched off before completing
>> the unsol event handling as both functions doesn't wait for each other.
>> This will result in below errors
>>
>> [ 589.428020] tegra-hda 3510000.hda: azx_get_response timeout, switching
>> to polling mode: last cmd=0x505f2f57
>> [ 589.428344] tegra-hda 3510000.hda: spurious response 0x80000074:0x5,
>> last cmd=0x505f2f57
>> [ 589.428547] tegra-hda 3510000.hda: spurious response 0x80000065:0x5,
>> last cmd=0x505f2f57
>>
>> To avoid this, the unsol event flow should not perform any codec verb
>> related operations during RPM_SUSPENDING state.
>>
>> Signed-off-by: Mohan Kumar <[email protected]>
> Thanks, that's a hairy problem...
>
> The logic sounds good, but can we check the PM state before calling
> snd_hda_power_up_pm()?

If I am not wrong, the exposed PM APIs report only the RPM_ACTIVE or
RPM_SUSPENDED status. I don't see anything that provides info on
RPM_SUSPENDING. We might need to know exactly this state to fix this issue.

>
> Takashi

2022-03-28 20:44:25

by kernel test robot

Subject: Re: [PATCH] ALSA: hda: Avoid unsol event during RPM suspending

Hi Mohan,

Thank you for the patch! Yet something to improve:

[auto build test ERROR on tiwai-sound/for-next]
[also build test ERROR on v5.17 next-20220328]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch]

url: https://github.com/intel-lab-lkp/linux/commits/Mohan-Kumar/ALSA-hda-Avoid-unsol-event-during-RPM-suspending/20220328-171517
base: https://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound.git for-next
config: s390-randconfig-c005-20220327 (https://download.01.org/0day-ci/archive/20220329/[email protected]/config)
compiler: clang version 15.0.0 (https://github.com/llvm/llvm-project 0f6d9501cf49ce02937099350d08f20c4af86f3d)
reproduce (this is a W=1 build):
wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
chmod +x ~/bin/make.cross
# install s390 cross compiling tool for clang build
# apt-get install binutils-s390x-linux-gnu
# https://github.com/intel-lab-lkp/linux/commit/80c4e21f5e97cd4b779806fa5da5bb7392e2874f
git remote add linux-review https://github.com/intel-lab-lkp/linux
git fetch --no-tags linux-review Mohan-Kumar/ALSA-hda-Avoid-unsol-event-during-RPM-suspending/20220328-171517
git checkout 80c4e21f5e97cd4b779806fa5da5bb7392e2874f
# save the config file to linux build tree
mkdir build_dir
COMPILER_INSTALL_PATH=$HOME/0day COMPILER=clang make.cross W=1 O=build_dir ARCH=s390 SHELL=/bin/bash sound/pci/hda/

If you fix the issue, kindly add following tag as appropriate
Reported-by: kernel test robot <[email protected]>

All errors (new ones prefixed by >>):

In file included from sound/pci/hda/patch_hdmi.c:21:
In file included from include/linux/pci.h:39:
In file included from include/linux/io.h:13:
In file included from arch/s390/include/asm/io.h:75:
include/asm-generic/io.h:464:31: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
val = __raw_readb(PCI_IOBASE + addr);
~~~~~~~~~~ ^
include/asm-generic/io.h:477:61: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
val = __le16_to_cpu((__le16 __force)__raw_readw(PCI_IOBASE + addr));
~~~~~~~~~~ ^
include/uapi/linux/byteorder/big_endian.h:37:59: note: expanded from macro '__le16_to_cpu'
#define __le16_to_cpu(x) __swab16((__force __u16)(__le16)(x))
^
include/uapi/linux/swab.h:102:54: note: expanded from macro '__swab16'
#define __swab16(x) (__u16)__builtin_bswap16((__u16)(x))
^
In file included from sound/pci/hda/patch_hdmi.c:21:
In file included from include/linux/pci.h:39:
In file included from include/linux/io.h:13:
In file included from arch/s390/include/asm/io.h:75:
include/asm-generic/io.h:490:61: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
val = __le32_to_cpu((__le32 __force)__raw_readl(PCI_IOBASE + addr));
~~~~~~~~~~ ^
include/uapi/linux/byteorder/big_endian.h:35:59: note: expanded from macro '__le32_to_cpu'
#define __le32_to_cpu(x) __swab32((__force __u32)(__le32)(x))
^
include/uapi/linux/swab.h:115:54: note: expanded from macro '__swab32'
#define __swab32(x) (__u32)__builtin_bswap32((__u32)(x))
^
In file included from sound/pci/hda/patch_hdmi.c:21:
In file included from include/linux/pci.h:39:
In file included from include/linux/io.h:13:
In file included from arch/s390/include/asm/io.h:75:
include/asm-generic/io.h:501:33: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
__raw_writeb(value, PCI_IOBASE + addr);
~~~~~~~~~~ ^
include/asm-generic/io.h:511:59: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
__raw_writew((u16 __force)cpu_to_le16(value), PCI_IOBASE + addr);
~~~~~~~~~~ ^
include/asm-generic/io.h:521:59: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
__raw_writel((u32 __force)cpu_to_le32(value), PCI_IOBASE + addr);
~~~~~~~~~~ ^
include/asm-generic/io.h:609:20: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
readsb(PCI_IOBASE + addr, buffer, count);
~~~~~~~~~~ ^
include/asm-generic/io.h:617:20: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
readsw(PCI_IOBASE + addr, buffer, count);
~~~~~~~~~~ ^
include/asm-generic/io.h:625:20: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
readsl(PCI_IOBASE + addr, buffer, count);
~~~~~~~~~~ ^
include/asm-generic/io.h:634:21: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
writesb(PCI_IOBASE + addr, buffer, count);
~~~~~~~~~~ ^
include/asm-generic/io.h:643:21: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
writesw(PCI_IOBASE + addr, buffer, count);
~~~~~~~~~~ ^
include/asm-generic/io.h:652:21: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
writesl(PCI_IOBASE + addr, buffer, count);
~~~~~~~~~~ ^
>> sound/pci/hda/patch_hdmi.c:1644:15: error: no member named 'runtime_status' in 'struct dev_pm_info'
(dev->power.runtime_status == RPM_SUSPENDING))
~~~~~~~~~~ ^
12 warnings and 1 error generated.


vim +1644 sound/pci/hda/patch_hdmi.c

1620
1621 /* update ELD and jack state via HD-audio verbs */
1622 static void hdmi_present_sense_via_verbs(struct hdmi_spec_per_pin *per_pin,
1623 int repoll)
1624 {
1625 struct hda_codec *codec = per_pin->codec;
1626 struct hdmi_spec *spec = codec->spec;
1627 struct hdmi_eld *eld = &spec->temp_eld;
1628 struct device *dev = hda_codec_dev(codec);
1629 hda_nid_t pin_nid = per_pin->pin_nid;
1630 int dev_id = per_pin->dev_id;
1631 /*
1632 * Always execute a GetPinSense verb here, even when called from
1633 * hdmi_intrinsic_event; for some NVIDIA HW, the unsolicited
1634 * response's PD bit is not the real PD value, but indicates that
1635 * the real PD value changed. An older version of the HD-audio
1636 * specification worked this way. Hence, we just ignore the data in
1637 * the unsolicited response to avoid custom WARs.
1638 */
1639 int present;
1640 int ret;
1641
1642 ret = snd_hda_power_up_pm(codec);
1643 if ((ret < 0 && pm_runtime_suspended(dev)) ||
> 1644 (dev->power.runtime_status == RPM_SUSPENDING))
1645 goto out;
1646
1647 present = snd_hda_jack_pin_sense(codec, pin_nid, dev_id);
1648
1649 mutex_lock(&per_pin->lock);
1650 eld->monitor_present = !!(present & AC_PINSENSE_PRESENCE);
1651 if (eld->monitor_present)
1652 eld->eld_valid = !!(present & AC_PINSENSE_ELDV);
1653 else
1654 eld->eld_valid = false;
1655
1656 codec_dbg(codec,
1657 "HDMI status: Codec=%d NID=0x%x Presence_Detect=%d ELD_Valid=%d\n",
1658 codec->addr, pin_nid, eld->monitor_present, eld->eld_valid);
1659
1660 if (eld->eld_valid) {
1661 if (spec->ops.pin_get_eld(codec, pin_nid, dev_id,
1662 eld->eld_buffer, &eld->eld_size) < 0)
1663 eld->eld_valid = false;
1664 }
1665
1666 update_eld(codec, per_pin, eld, repoll);
1667 mutex_unlock(&per_pin->lock);
1668 out:
1669 snd_hda_power_down_pm(codec);
1670 }
1671

--
0-DAY CI Kernel Test Service
https://01.org/lkp

2022-03-28 21:03:57

by kernel test robot

Subject: Re: [PATCH] ALSA: hda: Avoid unsol event during RPM suspending

Hi Mohan,

Thank you for the patch! Yet something to improve:

[auto build test ERROR on tiwai-sound/for-next]
[also build test ERROR on v5.17 next-20220328]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch]

url: https://github.com/intel-lab-lkp/linux/commits/Mohan-Kumar/ALSA-hda-Avoid-unsol-event-during-RPM-suspending/20220328-171517
base: https://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound.git for-next
config: alpha-buildonly-randconfig-r001-20220327 (https://download.01.org/0day-ci/archive/20220328/[email protected]/config)
compiler: alpha-linux-gcc (GCC) 11.2.0
reproduce (this is a W=1 build):
wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
chmod +x ~/bin/make.cross
# https://github.com/intel-lab-lkp/linux/commit/80c4e21f5e97cd4b779806fa5da5bb7392e2874f
git remote add linux-review https://github.com/intel-lab-lkp/linux
git fetch --no-tags linux-review Mohan-Kumar/ALSA-hda-Avoid-unsol-event-during-RPM-suspending/20220328-171517
git checkout 80c4e21f5e97cd4b779806fa5da5bb7392e2874f
# save the config file to linux build tree
mkdir build_dir
COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-11.2.0 make.cross O=build_dir ARCH=alpha SHELL=/bin/bash sound/pci/hda/

If you fix the issue, kindly add following tag as appropriate
Reported-by: kernel test robot <[email protected]>

All errors (new ones prefixed by >>):

sound/pci/hda/patch_hdmi.c: In function 'hdmi_present_sense_via_verbs':
>> sound/pci/hda/patch_hdmi.c:1644:28: error: 'struct dev_pm_info' has no member named 'runtime_status'
1644 | (dev->power.runtime_status == RPM_SUSPENDING))
| ^


vim +1644 sound/pci/hda/patch_hdmi.c

1620
1621 /* update ELD and jack state via HD-audio verbs */
1622 static void hdmi_present_sense_via_verbs(struct hdmi_spec_per_pin *per_pin,
1623 int repoll)
1624 {
1625 struct hda_codec *codec = per_pin->codec;
1626 struct hdmi_spec *spec = codec->spec;
1627 struct hdmi_eld *eld = &spec->temp_eld;
1628 struct device *dev = hda_codec_dev(codec);
1629 hda_nid_t pin_nid = per_pin->pin_nid;
1630 int dev_id = per_pin->dev_id;
1631 /*
1632 * Always execute a GetPinSense verb here, even when called from
1633 * hdmi_intrinsic_event; for some NVIDIA HW, the unsolicited
1634 * response's PD bit is not the real PD value, but indicates that
1635 * the real PD value changed. An older version of the HD-audio
1636 * specification worked this way. Hence, we just ignore the data in
1637 * the unsolicited response to avoid custom WARs.
1638 */
1639 int present;
1640 int ret;
1641
1642 ret = snd_hda_power_up_pm(codec);
1643 if ((ret < 0 && pm_runtime_suspended(dev)) ||
> 1644 (dev->power.runtime_status == RPM_SUSPENDING))
1645 goto out;
1646
1647 present = snd_hda_jack_pin_sense(codec, pin_nid, dev_id);
1648
1649 mutex_lock(&per_pin->lock);
1650 eld->monitor_present = !!(present & AC_PINSENSE_PRESENCE);
1651 if (eld->monitor_present)
1652 eld->eld_valid = !!(present & AC_PINSENSE_ELDV);
1653 else
1654 eld->eld_valid = false;
1655
1656 codec_dbg(codec,
1657 "HDMI status: Codec=%d NID=0x%x Presence_Detect=%d ELD_Valid=%d\n",
1658 codec->addr, pin_nid, eld->monitor_present, eld->eld_valid);
1659
1660 if (eld->eld_valid) {
1661 if (spec->ops.pin_get_eld(codec, pin_nid, dev_id,
1662 eld->eld_buffer, &eld->eld_size) < 0)
1663 eld->eld_valid = false;
1664 }
1665
1666 update_eld(codec, per_pin, eld, repoll);
1667 mutex_unlock(&per_pin->lock);
1668 out:
1669 snd_hda_power_down_pm(codec);
1670 }
1671

--
0-DAY CI Kernel Test Service
https://01.org/lkp

2022-03-28 21:10:21

by Takashi Iwai

Subject: Re: [PATCH] ALSA: hda: Avoid unsol event during RPM suspending

On Mon, 28 Mar 2022 15:51:17 +0200,
Mohan Kumar D wrote:
>
>
> On 3/28/2022 4:27 PM, Takashi Iwai wrote:
> > On Mon, 28 Mar 2022 12:19:03 +0200,
> > Mohan Kumar D wrote:
> >>
> >> On 3/28/2022 3:12 PM, Takashi Iwai wrote:
> >>> On Mon, 28 Mar 2022 11:14:11 +0200,
> >>> Mohan Kumar wrote:
> >>>> There is a corner case with unsol event handling during codec runtime
> >>>> suspending state. When the codec runtime suspend call initiated, the
> >>>> codec->in_pm atomic variable would be 0, currently the codec runtime
> >>>> suspend function calls snd_hdac_enter_pm() which will just increments
> >>>> the codec->in_pm atomic variable. Consider unsol event happened just
> >>>> after this step and before snd_hdac_leave_pm() in the codec runtime
> >>>> suspend function. The snd_hdac_power_up_pm() in the unsol event
> >>>> flow in hdmi_present_sense_via_verbs() function would just increment
> >>>> the codec->in_pm atomic variable without calling pm_runtime_get_sync
> >>>> function.
> >>>>
> >>>> As codec runtime suspend flow is already in progress and in parallel
> >>>> unsol event is also accessing the codec verbs, as soon as codec
> >>>> suspend flow completes and clocks are switched off before completing
> >>>> the unsol event handling as both functions doesn't wait for each other.
> >>>> This will result in below errors
> >>>>
> >>>> [ 589.428020] tegra-hda 3510000.hda: azx_get_response timeout, switching
> >>>> to polling mode: last cmd=0x505f2f57
> >>>> [ 589.428344] tegra-hda 3510000.hda: spurious response 0x80000074:0x5,
> >>>> last cmd=0x505f2f57
> >>>> [ 589.428547] tegra-hda 3510000.hda: spurious response 0x80000065:0x5,
> >>>> last cmd=0x505f2f57
> >>>>
> >>>> To avoid this, the unsol event flow should not perform any codec verb
> >>>> related operations during RPM_SUSPENDING state.
> >>>>
> >>>> Signed-off-by: Mohan Kumar <[email protected]>
> >>> Thanks, that's a hairy problem...
> >>>
> >>> The logic sounds good, but can we check the PM state before calling
> >>> snd_hda_power_up_pm()?
> >> If am not wrong, PM apis exposed either provide RPM_ACTIVE or
> >> RPM_SUSPENDED status. Don't see anything which provides info on
> >> RPM_SUSPENDING. We might need to exactly know this state to fix this
> >> issue.
> > Well, maybe my question wasn't clear. What I meant was that your
> > change below
> >
> >> ret = snd_hda_power_up_pm(codec);
> >> - if (ret < 0 && pm_runtime_suspended(hda_codec_dev(codec)))
> >> + if ((ret < 0 && pm_runtime_suspended(dev)) ||
> >> + (dev->power.runtime_status == RPM_SUSPENDING))
> >> goto out;
> > can be rather like:
> >
> >> + if (dev->power.runtime_status == RPM_SUSPENDING)
> >> + return;
> >> ret = snd_hda_power_up_pm(codec);
> >> if (ret < 0 && pm_runtime_suspended(hda_codec_dev(codec)))
> > so that it skips unneeded power up/down calls.
> >
> > Basically the state is set at drivers/base/power/runtime.c
> > rpm_suspend() just before calling the device's runtime_suspend
> > callback. So the state is supposed to be same before and after
> > snd_hda_power_up_pm() in that case.
> Thanks!, Make sense, will push the updated patch after testing with
> latest suggestion.

Thanks. Also don't forget to cover the case the test bot complained about:
the reference to power.runtime_status needs #ifdef CONFIG_PM.


Takashi
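
A minimal userspace sketch of the kind of guard meant here, not the patch itself. The model_* types are hypothetical stand-ins for the kernel's device PM structures, and CONFIG_PM is defined by hand only so the sketch builds; in the kernel it comes from Kconfig, and the field is absent when it is unset.

```c
#include <assert.h>
#include <stdbool.h>

/* Defined here only so the sketch builds; the kernel sets it via Kconfig. */
#define CONFIG_PM 1

/* Hypothetical stand-ins for the kernel's runtime PM status and PM info. */
enum model_rpm_status {
	MODEL_RPM_ACTIVE,
	MODEL_RPM_RESUMING,
	MODEL_RPM_SUSPENDED,
	MODEL_RPM_SUSPENDING,
};

struct model_pm_info {
#ifdef CONFIG_PM
	enum model_rpm_status runtime_status;	/* only exists with CONFIG_PM */
#else
	char unused;
#endif
};

struct model_device {
	struct model_pm_info power;
};

/* Touch runtime_status only where the field exists. */
static bool model_is_suspending(const struct model_device *dev)
{
#ifdef CONFIG_PM
	return dev->power.runtime_status == MODEL_RPM_SUSPENDING;
#else
	(void)dev;
	return false;	/* no runtime PM: never in the suspending state */
#endif
}

#ifdef CONFIG_PM
/* Small self-check: only the SUSPENDING state should report true. */
static void model_selfcheck(void)
{
	struct model_device dev = {
		.power = { .runtime_status = MODEL_RPM_SUSPENDED },
	};

	assert(!model_is_suspending(&dev));
	dev.power.runtime_status = MODEL_RPM_SUSPENDING;
	assert(model_is_suspending(&dev));
}
#endif
```

Without the guard, a config that builds without runtime PM fails exactly as in the bot reports, because the member does not exist there.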

2022-03-28 21:26:10

by Takashi Iwai

Subject: Re: [PATCH] ALSA: hda: Avoid unsol event during RPM suspending

On Mon, 28 Mar 2022 11:14:11 +0200,
Mohan Kumar wrote:
>
> There is a corner case with unsol event handling during codec runtime
> suspending state. When the codec runtime suspend call initiated, the
> codec->in_pm atomic variable would be 0, currently the codec runtime
> suspend function calls snd_hdac_enter_pm() which will just increments
> the codec->in_pm atomic variable. Consider unsol event happened just
> after this step and before snd_hdac_leave_pm() in the codec runtime
> suspend function. The snd_hdac_power_up_pm() in the unsol event
> flow in hdmi_present_sense_via_verbs() function would just increment
> the codec->in_pm atomic variable without calling pm_runtime_get_sync
> function.
>
> As codec runtime suspend flow is already in progress and in parallel
> unsol event is also accessing the codec verbs, as soon as codec
> suspend flow completes and clocks are switched off before completing
> the unsol event handling as both functions doesn't wait for each other.
> This will result in below errors
>
> [ 589.428020] tegra-hda 3510000.hda: azx_get_response timeout, switching
> to polling mode: last cmd=0x505f2f57
> [ 589.428344] tegra-hda 3510000.hda: spurious response 0x80000074:0x5,
> last cmd=0x505f2f57
> [ 589.428547] tegra-hda 3510000.hda: spurious response 0x80000065:0x5,
> last cmd=0x505f2f57
>
> To avoid this, the unsol event flow should not perform any codec verb
> related operations during RPM_SUSPENDING state.
>
> Signed-off-by: Mohan Kumar <[email protected]>

Thanks, that's a hairy problem...

The logic sounds good, but can we check the PM state before calling
snd_hda_power_up_pm()?


Takashi

2022-03-28 21:36:12

by Mohan Kumar

Subject: Re: [PATCH] ALSA: hda: Avoid unsol event during RPM suspending


On 3/28/2022 9:45 PM, Takashi Iwai wrote:
> On Mon, 28 Mar 2022 15:51:17 +0200,
> Mohan Kumar D wrote:
>>
>> On 3/28/2022 4:27 PM, Takashi Iwai wrote:
>>> On Mon, 28 Mar 2022 12:19:03 +0200,
>>> Mohan Kumar D wrote:
>>>> On 3/28/2022 3:12 PM, Takashi Iwai wrote:
>>>>> On Mon, 28 Mar 2022 11:14:11 +0200,
>>>>> Mohan Kumar wrote:
>>>>>> There is a corner case with unsol event handling during codec runtime
>>>>>> suspending state. When the codec runtime suspend call initiated, the
>>>>>> codec->in_pm atomic variable would be 0, currently the codec runtime
>>>>>> suspend function calls snd_hdac_enter_pm() which will just increments
>>>>>> the codec->in_pm atomic variable. Consider unsol event happened just
>>>>>> after this step and before snd_hdac_leave_pm() in the codec runtime
>>>>>> suspend function. The snd_hdac_power_up_pm() in the unsol event
>>>>>> flow in hdmi_present_sense_via_verbs() function would just increment
>>>>>> the codec->in_pm atomic variable without calling pm_runtime_get_sync
>>>>>> function.
>>>>>>
>>>>>> As codec runtime suspend flow is already in progress and in parallel
>>>>>> unsol event is also accessing the codec verbs, as soon as codec
>>>>>> suspend flow completes and clocks are switched off before completing
>>>>>> the unsol event handling as both functions doesn't wait for each other.
>>>>>> This will result in below errors
>>>>>>
>>>>>> [ 589.428020] tegra-hda 3510000.hda: azx_get_response timeout, switching
>>>>>> to polling mode: last cmd=0x505f2f57
>>>>>> [ 589.428344] tegra-hda 3510000.hda: spurious response 0x80000074:0x5,
>>>>>> last cmd=0x505f2f57
>>>>>> [ 589.428547] tegra-hda 3510000.hda: spurious response 0x80000065:0x5,
>>>>>> last cmd=0x505f2f57
>>>>>>
>>>>>> To avoid this, the unsol event flow should not perform any codec verb
>>>>>> related operations during RPM_SUSPENDING state.
>>>>>>
>>>>>> Signed-off-by: Mohan Kumar <[email protected]>
>>>>> Thanks, that's a hairy problem...
>>>>>
>>>>> The logic sounds good, but can we check the PM state before calling
>>>>> snd_hda_power_up_pm()?
>>>> If am not wrong, PM apis exposed either provide RPM_ACTIVE or
>>>> RPM_SUSPENDED status. Don't see anything which provides info on
>>>> RPM_SUSPENDING. We might need to exactly know this state to fix this
>>>> issue.
>>> Well, maybe my question wasn't clear. What I meant was that your
>>> change below
>>>
>>>> ret = snd_hda_power_up_pm(codec);
>>>> - if (ret < 0 && pm_runtime_suspended(hda_codec_dev(codec)))
>>>> + if ((ret < 0 && pm_runtime_suspended(dev)) ||
>>>> + (dev->power.runtime_status == RPM_SUSPENDING))
>>>> goto out;
>>> can be rather like:
>>>
>>>> + if (dev->power.runtime_status == RPM_SUSPENDING)
>>>> + return;
>>>> ret = snd_hda_power_up_pm(codec);
>>>> if (ret < 0 && pm_runtime_suspended(hda_codec_dev(codec)))
>>> so that it skips unneeded power up/down calls.
>>>
>>> Basically the state is set at drivers/base/power/runtime.c
>>> rpm_suspend() just before calling the device's runtime_suspend
>>> callback. So the state is supposed to be same before and after
>>> snd_hda_power_up_pm() in that case.
>> Thanks!, Make sense, will push the updated patch after testing with
>> latest suggestion.
> Thanks. Also don't forget to cover a case the test bot complained:
> the reference to power.runtime_status needs #ifdef CONFIG_PM.
Sure, will take care of that in the next patch update.
>
> Takashi

2022-03-28 21:41:37

by Takashi Iwai

Subject: Re: [PATCH] ALSA: hda: Avoid unsol event during RPM suspending

On Mon, 28 Mar 2022 12:19:03 +0200,
Mohan Kumar D wrote:
>
>
> On 3/28/2022 3:12 PM, Takashi Iwai wrote:
> > On Mon, 28 Mar 2022 11:14:11 +0200,
> > Mohan Kumar wrote:
> >> There is a corner case with unsol event handling during codec runtime
> >> suspending state. When the codec runtime suspend call initiated, the
> >> codec->in_pm atomic variable would be 0, currently the codec runtime
> >> suspend function calls snd_hdac_enter_pm() which will just increments
> >> the codec->in_pm atomic variable. Consider unsol event happened just
> >> after this step and before snd_hdac_leave_pm() in the codec runtime
> >> suspend function. The snd_hdac_power_up_pm() in the unsol event
> >> flow in hdmi_present_sense_via_verbs() function would just increment
> >> the codec->in_pm atomic variable without calling pm_runtime_get_sync
> >> function.
> >>
> >> As codec runtime suspend flow is already in progress and in parallel
> >> unsol event is also accessing the codec verbs, as soon as codec
> >> suspend flow completes and clocks are switched off before completing
> >> the unsol event handling as both functions doesn't wait for each other.
> >> This will result in below errors
> >>
> >> [ 589.428020] tegra-hda 3510000.hda: azx_get_response timeout, switching
> >> to polling mode: last cmd=0x505f2f57
> >> [ 589.428344] tegra-hda 3510000.hda: spurious response 0x80000074:0x5,
> >> last cmd=0x505f2f57
> >> [ 589.428547] tegra-hda 3510000.hda: spurious response 0x80000065:0x5,
> >> last cmd=0x505f2f57
> >>
> >> To avoid this, the unsol event flow should not perform any codec verb
> >> related operations during RPM_SUSPENDING state.
> >>
> >> Signed-off-by: Mohan Kumar <[email protected]>
> > Thanks, that's a hairy problem...
> >
> > The logic sounds good, but can we check the PM state before calling
> > snd_hda_power_up_pm()?
>
> If am not wrong, PM apis exposed either provide RPM_ACTIVE or
> RPM_SUSPENDED status. Don't see anything which provides info on
> RPM_SUSPENDING. We might need to exactly know this state to fix this
> issue.

Well, maybe my question wasn't clear. What I meant was that your
change below

> ret = snd_hda_power_up_pm(codec);
> - if (ret < 0 && pm_runtime_suspended(hda_codec_dev(codec)))
> + if ((ret < 0 && pm_runtime_suspended(dev)) ||
> + (dev->power.runtime_status == RPM_SUSPENDING))
> goto out;

can be rather like:

> + if (dev->power.runtime_status == RPM_SUSPENDING)
> + return;
> ret = snd_hda_power_up_pm(codec);
> if (ret < 0 && pm_runtime_suspended(hda_codec_dev(codec)))

so that it skips unneeded power up/down calls.

Basically the state is set in rpm_suspend() in
drivers/base/power/runtime.c, just before calling the device's
runtime_suspend callback. So the state is supposed to be the same before
and after snd_hda_power_up_pm() in that case.


thanks,

Takashi
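
The difference between the two orderings discussed above can be sketched in userspace. This is a model under stated assumptions, not the driver code: the model_* names and counters are hypothetical, and the power-up call is assumed to always succeed.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the runtime PM status values. */
enum model_rpm_status {
	MODEL_RPM_ACTIVE,
	MODEL_RPM_SUSPENDED,
	MODEL_RPM_SUSPENDING,
};

/* Hypothetical codec model that just counts what each path does. */
struct model_codec {
	enum model_rpm_status status;
	int power_up_calls;
	int power_down_calls;
	int verbs_sent;
};

static int model_power_up_pm(struct model_codec *c)
{
	c->power_up_calls++;
	return 0;	/* assumed to succeed in this sketch */
}

static void model_power_down_pm(struct model_codec *c)
{
	c->power_down_calls++;
	/* Every power-down must pair with an earlier power-up. */
	assert(c->power_down_calls <= c->power_up_calls);
}

/* Ordering from the posted patch: power up first, then test the state. */
static void model_sense_check_after(struct model_codec *c)
{
	int ret = model_power_up_pm(c);

	if ((ret < 0 && c->status == MODEL_RPM_SUSPENDED) ||
	    c->status == MODEL_RPM_SUSPENDING)
		goto out;
	c->verbs_sent++;	/* stands in for the pin sense verb */
out:
	model_power_down_pm(c);
}

/* Suggested ordering: bail out before any power up/down call. */
static void model_sense_check_before(struct model_codec *c)
{
	int ret;

	if (c->status == MODEL_RPM_SUSPENDING)
		return;
	ret = model_power_up_pm(c);
	if (ret < 0 && c->status == MODEL_RPM_SUSPENDED)
		goto out;
	c->verbs_sent++;	/* stands in for the pin sense verb */
out:
	model_power_down_pm(c);
}
```

Both variants send no verbs while suspending; the second one also avoids the paired power up/down calls in that case.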

2022-03-28 22:58:17

by Mohan Kumar

Subject: Re: [PATCH] ALSA: hda: Avoid unsol event during RPM suspending


On 3/28/2022 4:27 PM, Takashi Iwai wrote:
> On Mon, 28 Mar 2022 12:19:03 +0200,
> Mohan Kumar D wrote:
>>
>> On 3/28/2022 3:12 PM, Takashi Iwai wrote:
>>> On Mon, 28 Mar 2022 11:14:11 +0200,
>>> Mohan Kumar wrote:
>>>> There is a corner case with unsol event handling during codec runtime
>>>> suspending state. When the codec runtime suspend call initiated, the
>>>> codec->in_pm atomic variable would be 0, currently the codec runtime
>>>> suspend function calls snd_hdac_enter_pm() which will just increments
>>>> the codec->in_pm atomic variable. Consider unsol event happened just
>>>> after this step and before snd_hdac_leave_pm() in the codec runtime
>>>> suspend function. The snd_hdac_power_up_pm() in the unsol event
>>>> flow in hdmi_present_sense_via_verbs() function would just increment
>>>> the codec->in_pm atomic variable without calling pm_runtime_get_sync
>>>> function.
>>>>
>>>> As codec runtime suspend flow is already in progress and in parallel
>>>> unsol event is also accessing the codec verbs, as soon as codec
>>>> suspend flow completes and clocks are switched off before completing
>>>> the unsol event handling as both functions doesn't wait for each other.
>>>> This will result in below errors
>>>>
>>>> [ 589.428020] tegra-hda 3510000.hda: azx_get_response timeout, switching
>>>> to polling mode: last cmd=0x505f2f57
>>>> [ 589.428344] tegra-hda 3510000.hda: spurious response 0x80000074:0x5,
>>>> last cmd=0x505f2f57
>>>> [ 589.428547] tegra-hda 3510000.hda: spurious response 0x80000065:0x5,
>>>> last cmd=0x505f2f57
>>>>
>>>> To avoid this, the unsol event flow should not perform any codec verb
>>>> related operations during RPM_SUSPENDING state.
>>>>
>>>> Signed-off-by: Mohan Kumar <[email protected]>
>>> Thanks, that's a hairy problem...
>>>
>>> The logic sounds good, but can we check the PM state before calling
>>> snd_hda_power_up_pm()?
>> If am not wrong, PM apis exposed either provide RPM_ACTIVE or
>> RPM_SUSPENDED status. Don't see anything which provides info on
>> RPM_SUSPENDING. We might need to exactly know this state to fix this
>> issue.
> Well, maybe my question wasn't clear. What I meant was that your
> change below
>
>> ret = snd_hda_power_up_pm(codec);
>> - if (ret < 0 && pm_runtime_suspended(hda_codec_dev(codec)))
>> + if ((ret < 0 && pm_runtime_suspended(dev)) ||
>> + (dev->power.runtime_status == RPM_SUSPENDING))
>> goto out;
> can be rather like:
>
>> + if (dev->power.runtime_status == RPM_SUSPENDING)
>> + return;
>> ret = snd_hda_power_up_pm(codec);
>> if (ret < 0 && pm_runtime_suspended(hda_codec_dev(codec)))
> so that it skips unneeded power up/down calls.
>
> Basically the state is set at drivers/base/power/runtime.c
> rpm_suspend() just before calling the device's runtime_suspend
> callback. So the state is supposed to be same before and after
> snd_hda_power_up_pm() in that case.
Thanks! Makes sense, will push the updated patch after testing with
the latest suggestion.
>
> thanks,
>
> Takashi

2022-03-30 18:27:50

by Takashi Iwai

Subject: Re: [PATCH] ALSA: hda: Avoid unsol event during RPM suspending

On Tue, 29 Mar 2022 17:59:40 +0200,
Mohan Kumar wrote:
>
> There is a corner case with unsol event handling during codec runtime
> suspending state. When the codec runtime suspend call initiated, the
> codec->in_pm atomic variable would be 0, currently the codec runtime
> suspend function calls snd_hdac_enter_pm() which will just increments
> the codec->in_pm atomic variable. Consider unsol event happened just
> after this step and before snd_hdac_leave_pm() in the codec runtime
> suspend function. The snd_hdac_power_up_pm() in the unsol event
> flow in hdmi_present_sense_via_verbs() function would just increment
> the codec->in_pm atomic variable without calling pm_runtime_get_sync
> function.
>
> As codec runtime suspend flow is already in progress and in parallel
> unsol event is also accessing the codec verbs, as soon as codec
> suspend flow completes and clocks are switched off before completing
> the unsol event handling as both functions doesn't wait for each other.
> This will result in below errors
>
> [ 589.428020] tegra-hda 3510000.hda: azx_get_response timeout, switching
> to polling mode: last cmd=0x505f2f57
> [ 589.428344] tegra-hda 3510000.hda: spurious response 0x80000074:0x5,
> last cmd=0x505f2f57
> [ 589.428547] tegra-hda 3510000.hda: spurious response 0x80000065:0x5,
> last cmd=0x505f2f57
>
> To avoid this, the unsol event flow should not perform any codec verb
> related operations during RPM_SUSPENDING state.
>
> Signed-off-by: Mohan Kumar <[email protected]>

Thanks, applied now.


Takashi