2021-11-10 00:06:34

by Benjamin Berg

[permalink] [raw]
Subject: [PATCH 0/4] Cancel sync commands if a TX failure occurs

It was reported that userspace could hang for 10s after resuming due to
btusb hitting the internal timeout when sending the firmware.

In this case, the bluetooth dongle disappeared right after resume due to
the thinkpad_acpi rfkill being blocked. This causes the USB device to
disappear, however the bluetooth stack does not handle the
corresponding ENODEV errors and instead waits for a timeout to happen.

To avoid blocking everything for such a long time, the synchronous
command has to finish immediately after an ENODEV error occurs. This
requires further error handling, which this patchset adds by building
on top of the existing cancellation infrastructure for synchronous
commands.

Note that this just adds basic error handling in order to quickly abort
the initialization routine in case the device disappears at that time.
Additional error handling such as calling hci_reset_dev might be
sensible in some cases.

Benjamin Berg (4):
Bluetooth: Reset more state when cancelling a sync command
Bluetooth: Add new hci_tx_error function
Bluetooth: hci_core: Signal TX failure if sending a frame failed
Bluetooth: btusb: Signal URB errors as TX failure

drivers/bluetooth/btusb.c | 16 ++++++++++++----
include/net/bluetooth/hci_core.h | 1 +
net/bluetooth/hci_core.c | 9 +++++++++
net/bluetooth/hci_request.c | 5 +++++
4 files changed, 27 insertions(+), 4 deletions(-)

--
2.31.1


2021-11-10 00:06:37

by Benjamin Berg

[permalink] [raw]
Subject: [PATCH 2/4] Bluetooth: Add new hci_tx_error function

From: Benjamin Berg <[email protected]>

Currently this function only cancels any synchronous operation that
might be ongoing. Adding this function allows aborting synchronous
commands in case of low level TX/RX issues. A common example for this is
that the device has been removed.

Signed-off-by: Benjamin Berg <[email protected]>
---
include/net/bluetooth/hci_core.h | 1 +
net/bluetooth/hci_core.c | 7 +++++++
2 files changed, 8 insertions(+)

diff --git a/include/net/bluetooth/hci_core.h b/include/net/bluetooth/hci_core.h
index dd8840e70e25..542f5a37b9d0 100644
--- a/include/net/bluetooth/hci_core.h
+++ b/include/net/bluetooth/hci_core.h
@@ -1267,6 +1267,7 @@ void hci_release_dev(struct hci_dev *hdev);
int hci_suspend_dev(struct hci_dev *hdev);
int hci_resume_dev(struct hci_dev *hdev);
int hci_reset_dev(struct hci_dev *hdev);
+void hci_tx_error(struct hci_dev *hdev, int err);
int hci_recv_frame(struct hci_dev *hdev, struct sk_buff *skb);
int hci_recv_diag(struct hci_dev *hdev, struct sk_buff *skb);
__printf(2, 3) void hci_set_hw_info(struct hci_dev *hdev, const char *fmt, ...);
diff --git a/net/bluetooth/hci_core.c b/net/bluetooth/hci_core.c
index 8d33aa64846b..bbb35188e41f 100644
--- a/net/bluetooth/hci_core.c
+++ b/net/bluetooth/hci_core.c
@@ -4069,6 +4069,13 @@ int hci_reset_dev(struct hci_dev *hdev)
}
EXPORT_SYMBOL(hci_reset_dev);

+/* Reset HCI device */
+void hci_tx_error(struct hci_dev *hdev, int err)
+{
+ hci_req_sync_cancel(hdev, err);
+}
+EXPORT_SYMBOL(hci_tx_error);
+
/* Receive frame from HCI drivers */
int hci_recv_frame(struct hci_dev *hdev, struct sk_buff *skb)
{
--
2.31.1

2021-11-10 00:06:38

by Benjamin Berg

[permalink] [raw]
Subject: [PATCH 3/4] Bluetooth: hci_core: Signal TX failure if sending a frame failed

From: Benjamin Berg <[email protected]>

Call the hci_tx_error handler in case a frame cannot be send.

Signed-off-by: Benjamin Berg <[email protected]>
---
net/bluetooth/hci_core.c | 2 ++
1 file changed, 2 insertions(+)

diff --git a/net/bluetooth/hci_core.c b/net/bluetooth/hci_core.c
index bbb35188e41f..8664c2fbacdb 100644
--- a/net/bluetooth/hci_core.c
+++ b/net/bluetooth/hci_core.c
@@ -4200,6 +4200,8 @@ static void hci_send_frame(struct hci_dev *hdev, struct sk_buff *skb)
if (err < 0) {
bt_dev_err(hdev, "sending frame failed (%d)", err);
kfree_skb(skb);
+
+ hci_tx_error(hdev, -err);
}
}

--
2.31.1

2021-11-10 00:08:48

by Benjamin Berg

[permalink] [raw]
Subject: [PATCH 4/4] Bluetooth: btusb: Signal URB errors as TX failure

From: Benjamin Berg <[email protected]>

Call the TX failure handler when transmission of URBs fail. This is done
both for failures to send an URB and also when the interrupt URB used to
retrieve a response fails.

This approach is sufficient to quickly deal with certain errors such as
a device being disconnected while synchronous commands are done during
initialization.

Signed-off-by: Benjamin Berg <[email protected]>
---
drivers/bluetooth/btusb.c | 16 ++++++++++++----
1 file changed, 12 insertions(+), 4 deletions(-)

diff --git a/drivers/bluetooth/btusb.c b/drivers/bluetooth/btusb.c
index 75c83768c257..0c4fe89c6573 100644
--- a/drivers/bluetooth/btusb.c
+++ b/drivers/bluetooth/btusb.c
@@ -924,6 +924,8 @@ static void btusb_intr_complete(struct urb *urb)
if (err != -EPERM && err != -ENODEV)
bt_dev_err(hdev, "urb %p failed to resubmit (%d)",
urb, -err);
+ if (err != -EPERM)
+ hci_tx_error(hdev, -err);
usb_unanchor_urb(urb);
}
}
@@ -967,6 +969,8 @@ static int btusb_submit_intr_urb(struct hci_dev *hdev, gfp_t mem_flags)
if (err != -EPERM && err != -ENODEV)
bt_dev_err(hdev, "urb %p submission failed (%d)",
urb, -err);
+ if (err != -EPERM)
+ hci_tx_error(hdev, -err);
usb_unanchor_urb(urb);
}

@@ -1322,10 +1326,12 @@ static void btusb_tx_complete(struct urb *urb)
if (!test_bit(HCI_RUNNING, &hdev->flags))
goto done;

- if (!urb->status)
+ if (!urb->status) {
hdev->stat.byte_tx += urb->transfer_buffer_length;
- else
+ } else {
+ hci_tx_error(hdev, -urb->status);
hdev->stat.err_tx++;
+ }

done:
spin_lock_irqsave(&data->txlock, flags);
@@ -1348,10 +1354,12 @@ static void btusb_isoc_tx_complete(struct urb *urb)
if (!test_bit(HCI_RUNNING, &hdev->flags))
goto done;

- if (!urb->status)
+ if (!urb->status) {
hdev->stat.byte_tx += urb->transfer_buffer_length;
- else
+ } else {
+ hci_tx_error(hdev, -urb->status);
hdev->stat.err_tx++;
+ }

done:
kfree(urb->setup_packet);
--
2.31.1

2021-11-10 00:48:26

by Luiz Augusto von Dentz

[permalink] [raw]
Subject: Re: [PATCH 2/4] Bluetooth: Add new hci_tx_error function

Hi Benjamin,

On Tue, Nov 9, 2021 at 2:35 PM Benjamin Berg <[email protected]> wrote:
>
> From: Benjamin Berg <[email protected]>
>
> Currently this function only cancels any synchronous operation that
> might be ongoing. Adding this function allows aborting synchronous
> commands in case of low level TX/RX issues. A common example for this is
> that the device has been removed.
>
> Signed-off-by: Benjamin Berg <[email protected]>
> ---
> include/net/bluetooth/hci_core.h | 1 +
> net/bluetooth/hci_core.c | 7 +++++++
> 2 files changed, 8 insertions(+)
>
> diff --git a/include/net/bluetooth/hci_core.h b/include/net/bluetooth/hci_core.h
> index dd8840e70e25..542f5a37b9d0 100644
> --- a/include/net/bluetooth/hci_core.h
> +++ b/include/net/bluetooth/hci_core.h
> @@ -1267,6 +1267,7 @@ void hci_release_dev(struct hci_dev *hdev);
> int hci_suspend_dev(struct hci_dev *hdev);
> int hci_resume_dev(struct hci_dev *hdev);
> int hci_reset_dev(struct hci_dev *hdev);
> +void hci_tx_error(struct hci_dev *hdev, int err);
> int hci_recv_frame(struct hci_dev *hdev, struct sk_buff *skb);
> int hci_recv_diag(struct hci_dev *hdev, struct sk_buff *skb);
> __printf(2, 3) void hci_set_hw_info(struct hci_dev *hdev, const char *fmt, ...);
> diff --git a/net/bluetooth/hci_core.c b/net/bluetooth/hci_core.c
> index 8d33aa64846b..bbb35188e41f 100644
> --- a/net/bluetooth/hci_core.c
> +++ b/net/bluetooth/hci_core.c
> @@ -4069,6 +4069,13 @@ int hci_reset_dev(struct hci_dev *hdev)
> }
> EXPORT_SYMBOL(hci_reset_dev);
>
> +/* Reset HCI device */
> +void hci_tx_error(struct hci_dev *hdev, int err)
> +{
> + hci_req_sync_cancel(hdev, err);
> +}
> +EXPORT_SYMBOL(hci_tx_error);

I think we might be better off having such functionality exposed by
hci_sync.h since that should be accessible by driver nowadays, at
least that seems cleaner than having to introduce yet another public
function in hci_core.h just to interface with the likes of
hci_req_sync_cancel, that said we are also deprecating hci_request.h
in favor of hci_sync.h so we might as well move the likes of
hci_req_sync_cancel to hci_sync.h renaming it to hci_cmd_sync_cancel
so it is inline with naming we are using in hci_sync.h.

> /* Receive frame from HCI drivers */
> int hci_recv_frame(struct hci_dev *hdev, struct sk_buff *skb)
> {
> --
> 2.31.1
>


--
Luiz Augusto von Dentz

2021-11-10 00:48:44

by Luiz Augusto von Dentz

[permalink] [raw]
Subject: Re: [PATCH 3/4] Bluetooth: hci_core: Signal TX failure if sending a frame failed

Hi Benjamin,

On Tue, Nov 9, 2021 at 2:35 PM Benjamin Berg <[email protected]> wrote:
>
> From: Benjamin Berg <[email protected]>
>
> Call the hci_tx_error handler in case a frame cannot be send.
>
> Signed-off-by: Benjamin Berg <[email protected]>
> ---
> net/bluetooth/hci_core.c | 2 ++
> 1 file changed, 2 insertions(+)
>
> diff --git a/net/bluetooth/hci_core.c b/net/bluetooth/hci_core.c
> index bbb35188e41f..8664c2fbacdb 100644
> --- a/net/bluetooth/hci_core.c
> +++ b/net/bluetooth/hci_core.c
> @@ -4200,6 +4200,8 @@ static void hci_send_frame(struct hci_dev *hdev, struct sk_buff *skb)
> if (err < 0) {
> bt_dev_err(hdev, "sending frame failed (%d)", err);
> kfree_skb(skb);
> +
> + hci_tx_error(hdev, -err);

Either we do this here by calling directly hci_cmd_sync_cancel like I
suggested previously or perhaps we should have the error returned by
hci_send_frame otherwise the current thread still has to wait to get
the error from req_result which perhaps is not necessary if we already
got a proper error here just return it so the thread doesn't even need
to sleep.

> }
> }
>
> --
> 2.31.1
>


--
Luiz Augusto von Dentz

2021-11-10 00:49:09

by Luiz Augusto von Dentz

[permalink] [raw]
Subject: Re: [PATCH 4/4] Bluetooth: btusb: Signal URB errors as TX failure

Hi Benjamin,

On Tue, Nov 9, 2021 at 2:35 PM Benjamin Berg <[email protected]> wrote:
>
> From: Benjamin Berg <[email protected]>
>
> Call the TX failure handler when transmission of URBs fail. This is done
> both for failures to send an URB and also when the interrupt URB used to
> retrieve a response fails.
>
> This approach is sufficient to quickly deal with certain errors such as
> a device being disconnected while synchronous commands are done during
> initialization.
>
> Signed-off-by: Benjamin Berg <[email protected]>
> ---
> drivers/bluetooth/btusb.c | 16 ++++++++++++----
> 1 file changed, 12 insertions(+), 4 deletions(-)
>
> diff --git a/drivers/bluetooth/btusb.c b/drivers/bluetooth/btusb.c
> index 75c83768c257..0c4fe89c6573 100644
> --- a/drivers/bluetooth/btusb.c
> +++ b/drivers/bluetooth/btusb.c
> @@ -924,6 +924,8 @@ static void btusb_intr_complete(struct urb *urb)
> if (err != -EPERM && err != -ENODEV)
> bt_dev_err(hdev, "urb %p failed to resubmit (%d)",
> urb, -err);
> + if (err != -EPERM)
> + hci_tx_error(hdev, -err);
> usb_unanchor_urb(urb);
> }
> }
> @@ -967,6 +969,8 @@ static int btusb_submit_intr_urb(struct hci_dev *hdev, gfp_t mem_flags)
> if (err != -EPERM && err != -ENODEV)
> bt_dev_err(hdev, "urb %p submission failed (%d)",
> urb, -err);
> + if (err != -EPERM)
> + hci_tx_error(hdev, -err);
> usb_unanchor_urb(urb);
> }
>
> @@ -1322,10 +1326,12 @@ static void btusb_tx_complete(struct urb *urb)
> if (!test_bit(HCI_RUNNING, &hdev->flags))
> goto done;
>
> - if (!urb->status)
> + if (!urb->status) {
> hdev->stat.byte_tx += urb->transfer_buffer_length;
> - else
> + } else {
> + hci_tx_error(hdev, -urb->status);
> hdev->stat.err_tx++;
> + }

Looks like we are reusing the btusb_tx_complete for all endpoints but
the likes of hci_tx_error/hci_cmd_sync_cancel only applies to commands
(e.g: alloc_ctrl_urb), perhaps there is a way to detect if this is
actually a control urb or not so we can skip this for bulk transfers,
or actually if a bulk transfer fails we may actually need to resend
depending if the error is recoverable since the bulk transfers can
actually contain fragments rather than the entire packet, but I'd
leave that for another patch since it is probably not what you are
trying to fix in this set.

> done:
> spin_lock_irqsave(&data->txlock, flags);
> @@ -1348,10 +1354,12 @@ static void btusb_isoc_tx_complete(struct urb *urb)
> if (!test_bit(HCI_RUNNING, &hdev->flags))
> goto done;
>
> - if (!urb->status)
> + if (!urb->status) {
> hdev->stat.byte_tx += urb->transfer_buffer_length;
> - else
> + } else {
> + hci_tx_error(hdev, -urb->status);
> hdev->stat.err_tx++;
> + }
>
> done:
> kfree(urb->setup_packet);
> --
> 2.31.1
>


--
Luiz Augusto von Dentz

2021-11-10 08:50:29

by Benjamin Berg

[permalink] [raw]
Subject: Re: [PATCH 3/4] Bluetooth: hci_core: Signal TX failure if sending a frame failed

Hi Luiz,

On Tue, 2021-11-09 at 15:13 -0800, Luiz Augusto von Dentz wrote:
> Hi Benjamin,
>
> On Tue, Nov 9, 2021 at 2:35 PM Benjamin Berg <[email protected]> wrote:
> >
> > From: Benjamin Berg <[email protected]>
> >
> > Call the hci_tx_error handler in case a frame cannot be send.
> >
> > Signed-off-by: Benjamin Berg <[email protected]>
> > ---
> > net/bluetooth/hci_core.c | 2 ++
> > 1 file changed, 2 insertions(+)
> >
> > diff --git a/net/bluetooth/hci_core.c b/net/bluetooth/hci_core.c
> > index bbb35188e41f..8664c2fbacdb 100644
> > --- a/net/bluetooth/hci_core.c
> > +++ b/net/bluetooth/hci_core.c
> > @@ -4200,6 +4200,8 @@ static void hci_send_frame(struct hci_dev *hdev, struct sk_buff *skb)
> > if (err < 0) {
> > bt_dev_err(hdev, "sending frame failed (%d)", err);
> > kfree_skb(skb);
> > +
> > + hci_tx_error(hdev, -err);
>
> Either we do this here by calling directly hci_cmd_sync_cancel like I
> suggested previously or perhaps we should have the error returned by
> hci_send_frame otherwise the current thread still has to wait to get
> the error from req_result which perhaps is not necessary if we already
> got a proper error here just return it so the thread doesn't even need
> to sleep.

Yes, returning the error is a good idea.

I think that means doing the cancellation from hci_cmd_work. As I
understand it, the frame is sent from the main workqueue so we are not
avoiding the context switches for now. But that may well change in the
future.

Benjamin

> > }
> > }
> >
> > --
> > 2.31.1
> >
>
>


Attachments:
signature.asc (849.00 B)
This is a digitally signed message part