The hci_suspend_notifier which was introduced last year, is causing
problems for uart attached btrtl devices. These devices may loose their
firmware and their baudrate setting over a suspend/resume.
Since we don't even know the baudrate after a suspend/resume recovering
from this is tricky. The driver solves this by treating these devices
the same as USB BT HCIs which drop of the bus during suspend.
Specifically the driver:
1. Simply unconditionally turns the device fully off during
system-suspend to save maximum power.
2. Calls device_reprobe() from a workqueue to fully re-init the device
from scratch on system-resume (unregistering the old HCI and
registering a new HCI).
This means that these devices do not benefit from the suspend / resume
handling work done by the hci_suspend_notifier. At best this unnecessarily
adds some time to the suspend/resume time.
But in practice this is actually causing problems:
1. These btrtl devices seem to not like the HCI_OP_WRITE_SCAN_ENABLE(
SCAN_DISABLED) request being send to them when entering the
BT_SUSPEND_CONFIGURE_WAKE state. The same request send on
BT_SUSPEND_DISCONNECT works fine, but the second one send (unnecessarily?)
from the BT_SUSPEND_CONFIGURE_WAKE transition causes the device to hang:
[ 573.497754] PM: suspend entry (s2idle)
[ 573.554615] Filesystems sync: 0.056 seconds
[ 575.837753] Bluetooth: hci0: Timed out waiting for suspend events
[ 575.837801] Bluetooth: hci0: Suspend timeout bit: 4
[ 575.837925] Bluetooth: hci0: Suspend notifier action (3) failed: -110
2. The PM_POST_SUSPEND / BT_RUNNING transition races with the
driver-unbinding done by the device_reprobe() work.
If the hci_suspend_notifier wins the race it is talking to a dead
device leading to the following errors being logged:
[ 598.686060] Bluetooth: hci0: Timed out waiting for suspend events
[ 598.686124] Bluetooth: hci0: Suspend timeout bit: 5
[ 598.686237] Bluetooth: hci0: Suspend notifier action (4) failed: -110
In both cases things still work, but the suspend-notifier is causing
these ugly errors getting logged and ut increase both the suspend- and
the resume-time by 2 seconds.
This commit works around these problems by disabling (unregistering)
the hci_suspend_notifier.
Note that any eventual hci_unregister_dev() will call
unregister_pm_notifier() a second time, this is fine it will
simply fail with -ENOENT and hci_unregister_dev() ignores the
return value.
Cc: Luiz Augusto von Dentz <[email protected]>
Cc: Vasily Khoruzhick <[email protected]>
Cc: Abhishek Pandit-Subedi <[email protected]>
Signed-off-by: Hans de Goede <[email protected]>
---
drivers/bluetooth/hci_h5.c | 8 ++++++++
1 file changed, 8 insertions(+)
diff --git a/drivers/bluetooth/hci_h5.c b/drivers/bluetooth/hci_h5.c
index 7be16a7f653b..acbcc676d6c2 100644
--- a/drivers/bluetooth/hci_h5.c
+++ b/drivers/bluetooth/hci_h5.c
@@ -14,6 +14,7 @@
#include <linux/of_device.h>
#include <linux/serdev.h>
#include <linux/skbuff.h>
+#include <linux/suspend.h>
#include <net/bluetooth/bluetooth.h>
#include <net/bluetooth/hci_core.h>
@@ -876,6 +877,13 @@ static int h5_btrtl_setup(struct h5 *h5)
bool flow_control;
int err;
+ /*
+ * Since h5_btrtl_resume() does a device_reprobe() the suspend handling
+ * done by the hci_suspend_notifier is not necessary; it actually causes
+ * delays and a bunch of errors to get logged, so disable it.
+ */
+ unregister_pm_notifier(&h5->hu->hdev->suspend_notifier);
+
btrtl_dev = btrtl_initialize(h5->hu->hdev, h5->id);
if (IS_ERR(btrtl_dev))
return PTR_ERR(btrtl_dev);
--
2.29.2
This is automated email and please do not reply to this email!
Dear submitter,
Thank you for submitting the patches to the linux bluetooth mailing list.
This is a CI test results with your patch series:
PW Link:https://patchwork.kernel.org/project/bluetooth/list/?series=423233
---Test result---
##############################
Test: CheckPatch - PASS
##############################
Test: CheckGitLint - PASS
##############################
Test: CheckBuildK - PASS
##############################
Test: CheckTestRunner: Setup - PASS
##############################
Test: CheckTestRunner: l2cap-tester - PASS
Total: 40, Passed: 34 (85.0%), Failed: 0, Not Run: 6
##############################
Test: CheckTestRunner: bnep-tester - PASS
Total: 1, Passed: 1 (100.0%), Failed: 0, Not Run: 0
##############################
Test: CheckTestRunner: mgmt-tester - PASS
Total: 416, Passed: 402 (96.6%), Failed: 0, Not Run: 14
##############################
Test: CheckTestRunner: rfcomm-tester - PASS
Total: 9, Passed: 9 (100.0%), Failed: 0, Not Run: 0
##############################
Test: CheckTestRunner: sco-tester - PASS
Total: 8, Passed: 8 (100.0%), Failed: 0, Not Run: 0
##############################
Test: CheckTestRunner: smp-tester - PASS
Total: 8, Passed: 8 (100.0%), Failed: 0, Not Run: 0
##############################
Test: CheckTestRunner: userchan-tester - PASS
Total: 3, Passed: 3 (100.0%), Failed: 0, Not Run: 0
---
Regards,
Linux Bluetooth
Hi Hans,
You can also implement hdev->prevent_wake to prevent configure wake
from running. hci_unregister_dev will also clear any tasks during
resume, unregister the pm notifier and cancel the suspend work so if
reprobe takes <2s (default suspend work timeout), you shouldn't see a
delay or any errors.
I'm not 100% confident that the reprobe worker and the pm notifier
won't race so I'm ok with this change as well. I would prefer if the
driver only set a quirk and the actual register/unregister was handled
entirely inside hci_core though.
Thanks,
Abhishek
On Wed, Jan 27, 2021 at 1:58 PM Hans de Goede <[email protected]> wrote:
>
> The hci_suspend_notifier which was introduced last year, is causing
> problems for uart attached btrtl devices. These devices may loose their
> firmware and their baudrate setting over a suspend/resume.
>
> Since we don't even know the baudrate after a suspend/resume recovering
> from this is tricky. The driver solves this by treating these devices
> the same as USB BT HCIs which drop of the bus during suspend.
>
> Specifically the driver:
> 1. Simply unconditionally turns the device fully off during
> system-suspend to save maximum power.
> 2. Calls device_reprobe() from a workqueue to fully re-init the device
> from scratch on system-resume (unregistering the old HCI and
> registering a new HCI).
>
> This means that these devices do not benefit from the suspend / resume
> handling work done by the hci_suspend_notifier. At best this unnecessarily
> adds some time to the suspend/resume time.
>
> But in practice this is actually causing problems:
>
> 1. These btrtl devices seem to not like the HCI_OP_WRITE_SCAN_ENABLE(
> SCAN_DISABLED) request being send to them when entering the
> BT_SUSPEND_CONFIGURE_WAKE state. The same request send on
> BT_SUSPEND_DISCONNECT works fine, but the second one send (unnecessarily?)
> from the BT_SUSPEND_CONFIGURE_WAKE transition causes the device to hang:
>
> [ 573.497754] PM: suspend entry (s2idle)
> [ 573.554615] Filesystems sync: 0.056 seconds
> [ 575.837753] Bluetooth: hci0: Timed out waiting for suspend events
> [ 575.837801] Bluetooth: hci0: Suspend timeout bit: 4
> [ 575.837925] Bluetooth: hci0: Suspend notifier action (3) failed: -110
>
> 2. The PM_POST_SUSPEND / BT_RUNNING transition races with the
> driver-unbinding done by the device_reprobe() work.
> If the hci_suspend_notifier wins the race it is talking to a dead
> device leading to the following errors being logged:
>
> [ 598.686060] Bluetooth: hci0: Timed out waiting for suspend events
> [ 598.686124] Bluetooth: hci0: Suspend timeout bit: 5
> [ 598.686237] Bluetooth: hci0: Suspend notifier action (4) failed: -110
>
> In both cases things still work, but the suspend-notifier is causing
> these ugly errors getting logged and ut increase both the suspend- and
> the resume-time by 2 seconds.
>
> This commit works around these problems by disabling (unregistering)
> the hci_suspend_notifier.
>
> Note that any eventual hci_unregister_dev() will call
> unregister_pm_notifier() a second time, this is fine it will
> simply fail with -ENOENT and hci_unregister_dev() ignores the
> return value.
>
> Cc: Luiz Augusto von Dentz <[email protected]>
> Cc: Vasily Khoruzhick <[email protected]>
> Cc: Abhishek Pandit-Subedi <[email protected]>
> Signed-off-by: Hans de Goede <[email protected]>
> ---
> drivers/bluetooth/hci_h5.c | 8 ++++++++
> 1 file changed, 8 insertions(+)
>
> diff --git a/drivers/bluetooth/hci_h5.c b/drivers/bluetooth/hci_h5.c
> index 7be16a7f653b..acbcc676d6c2 100644
> --- a/drivers/bluetooth/hci_h5.c
> +++ b/drivers/bluetooth/hci_h5.c
> @@ -14,6 +14,7 @@
> #include <linux/of_device.h>
> #include <linux/serdev.h>
> #include <linux/skbuff.h>
> +#include <linux/suspend.h>
>
> #include <net/bluetooth/bluetooth.h>
> #include <net/bluetooth/hci_core.h>
> @@ -876,6 +877,13 @@ static int h5_btrtl_setup(struct h5 *h5)
> bool flow_control;
> int err;
>
> + /*
> + * Since h5_btrtl_resume() does a device_reprobe() the suspend handling
> + * done by the hci_suspend_notifier is not necessary; it actually causes
> + * delays and a bunch of errors to get logged, so disable it.
> + */
> + unregister_pm_notifier(&h5->hu->hdev->suspend_notifier);
> +
> btrtl_dev = btrtl_initialize(h5->hu->hdev, h5->id);
> if (IS_ERR(btrtl_dev))
> return PTR_ERR(btrtl_dev);
> --
> 2.29.2
>
Hi,
On 1/28/21 12:58 AM, Abhishek Pandit-Subedi wrote:
> Hi Hans,
>
> You can also implement hdev->prevent_wake to prevent configure wake
> from running.
Yes I did consider that, but there still is an issue sometimes on
resume too and the BT_SUSPEND_DISCONNECT handling takes 200 ms which
is nice to get rid of too. So I opted to just disable the
hci_suspend_notifier().
> hci_unregister_dev will also clear any tasks during
> resume, unregister the pm notifier and cancel the suspend work so if
> reprobe takes <2s (default suspend work timeout), you shouldn't see a
> delay or any errors.
The RTL8723bs is the cheapest of cheap wifi/bt combo chip and is
often found on cheap/slow devices. I did see this:
[ 598.686060] Bluetooth: hci0: Timed out waiting for suspend events
[ 598.686124] Bluetooth: hci0: Suspend timeout bit: 5
[ 598.686237] Bluetooth: hci0: Suspend notifier action (4) failed: -110
At least once, it might just be that the 2 seconds elapsed before
the reprobe ran (or there might still be a race somewhere).
> I'm not 100% confident that the reprobe worker and the pm notifier
> won't race so I'm ok with this change as well. I would prefer if the> driver only set a quirk and the actual register/unregister was handled
> entirely inside hci_core though.
Ok, I did consider using a quirk for this myself, but I was not sure
if this warranted a core change since this is a somewhat unique
situation.
I'll respin the patch into a patch-set adding a quirk, thank you
for your feedback.
Regards,
Hans
> On Wed, Jan 27, 2021 at 1:58 PM Hans de Goede <[email protected]> wrote:
>>
>> The hci_suspend_notifier which was introduced last year, is causing
>> problems for uart attached btrtl devices. These devices may loose their
>> firmware and their baudrate setting over a suspend/resume.
>>
>> Since we don't even know the baudrate after a suspend/resume recovering
>> from this is tricky. The driver solves this by treating these devices
>> the same as USB BT HCIs which drop of the bus during suspend.
>>
>> Specifically the driver:
>> 1. Simply unconditionally turns the device fully off during
>> system-suspend to save maximum power.
>> 2. Calls device_reprobe() from a workqueue to fully re-init the device
>> from scratch on system-resume (unregistering the old HCI and
>> registering a new HCI).
>>
>> This means that these devices do not benefit from the suspend / resume
>> handling work done by the hci_suspend_notifier. At best this unnecessarily
>> adds some time to the suspend/resume time.
>>
>> But in practice this is actually causing problems:
>>
>> 1. These btrtl devices seem to not like the HCI_OP_WRITE_SCAN_ENABLE(
>> SCAN_DISABLED) request being send to them when entering the
>> BT_SUSPEND_CONFIGURE_WAKE state. The same request send on
>> BT_SUSPEND_DISCONNECT works fine, but the second one send (unnecessarily?)
>> from the BT_SUSPEND_CONFIGURE_WAKE transition causes the device to hang:
>>
>> [ 573.497754] PM: suspend entry (s2idle)
>> [ 573.554615] Filesystems sync: 0.056 seconds
>> [ 575.837753] Bluetooth: hci0: Timed out waiting for suspend events
>> [ 575.837801] Bluetooth: hci0: Suspend timeout bit: 4
>> [ 575.837925] Bluetooth: hci0: Suspend notifier action (3) failed: -110
>>
>> 2. The PM_POST_SUSPEND / BT_RUNNING transition races with the
>> driver-unbinding done by the device_reprobe() work.
>> If the hci_suspend_notifier wins the race it is talking to a dead
>> device leading to the following errors being logged:
>>
>> [ 598.686060] Bluetooth: hci0: Timed out waiting for suspend events
>> [ 598.686124] Bluetooth: hci0: Suspend timeout bit: 5
>> [ 598.686237] Bluetooth: hci0: Suspend notifier action (4) failed: -110
>>
>> In both cases things still work, but the suspend-notifier is causing
>> these ugly errors getting logged and ut increase both the suspend- and
>> the resume-time by 2 seconds.
>>
>> This commit works around these problems by disabling (unregistering)
>> the hci_suspend_notifier.
>>
>> Note that any eventual hci_unregister_dev() will call
>> unregister_pm_notifier() a second time, this is fine it will
>> simply fail with -ENOENT and hci_unregister_dev() ignores the
>> return value.
>>
>> Cc: Luiz Augusto von Dentz <[email protected]>
>> Cc: Vasily Khoruzhick <[email protected]>
>> Cc: Abhishek Pandit-Subedi <[email protected]>
>> Signed-off-by: Hans de Goede <[email protected]>
>> ---
>> drivers/bluetooth/hci_h5.c | 8 ++++++++
>> 1 file changed, 8 insertions(+)
>>
>> diff --git a/drivers/bluetooth/hci_h5.c b/drivers/bluetooth/hci_h5.c
>> index 7be16a7f653b..acbcc676d6c2 100644
>> --- a/drivers/bluetooth/hci_h5.c
>> +++ b/drivers/bluetooth/hci_h5.c
>> @@ -14,6 +14,7 @@
>> #include <linux/of_device.h>
>> #include <linux/serdev.h>
>> #include <linux/skbuff.h>
>> +#include <linux/suspend.h>
>>
>> #include <net/bluetooth/bluetooth.h>
>> #include <net/bluetooth/hci_core.h>
>> @@ -876,6 +877,13 @@ static int h5_btrtl_setup(struct h5 *h5)
>> bool flow_control;
>> int err;
>>
>> + /*
>> + * Since h5_btrtl_resume() does a device_reprobe() the suspend handling
>> + * done by the hci_suspend_notifier is not necessary; it actually causes
>> + * delays and a bunch of errors to get logged, so disable it.
>> + */
>> + unregister_pm_notifier(&h5->hu->hdev->suspend_notifier);
>> +
>> btrtl_dev = btrtl_initialize(h5->hu->hdev, h5->id);
>> if (IS_ERR(btrtl_dev))
>> return PTR_ERR(btrtl_dev);
>> --
>> 2.29.2
>>
>