2022-08-08 18:06:56

by Abhishek Pandit-Subedi

[permalink] [raw]
Subject: [PATCH] Bluetooth: Ignore cmd_timeout with HCI_USER_CHANNEL

From: Abhishek Pandit-Subedi <[email protected]>

When using HCI_USER_CHANNEL, hci traffic is expected to be handled by
userspace entirely. However, it is still possible (and sometimes
desirable) to be able to send commands to the controller directly. In
such cases, the kernel command timeout shouldn't do any error handling.

Signed-off-by: Abhishek Pandit-Subedi <[email protected]>
---
This was tested by running a userchannel stack and sending commands via
hcitool cmd on an Intel AX200 controller. Without this change, each
command sent via hcitool would result in hci_cmd_timeout being called
and after 5 commands, the controller would reset.

Hcitool continues working here because it marks the socket as
promiscuous and gets a copy of all traffic while the socket is open (and
does filtering in userspace).

Tested on Chromebook with 5.4 kernel with patch (and applied cleanly on
bluetooth-next).

net/bluetooth/hci_core.c | 26 +++++++++++++++++---------
1 file changed, 17 insertions(+), 9 deletions(-)

diff --git a/net/bluetooth/hci_core.c b/net/bluetooth/hci_core.c
index b3a5a3cc9372..c9a15f6633f7 100644
--- a/net/bluetooth/hci_core.c
+++ b/net/bluetooth/hci_core.c
@@ -1481,17 +1481,25 @@ static void hci_cmd_timeout(struct work_struct *work)
struct hci_dev *hdev = container_of(work, struct hci_dev,
cmd_timer.work);

- if (hdev->sent_cmd) {
- struct hci_command_hdr *sent = (void *) hdev->sent_cmd->data;
- u16 opcode = __le16_to_cpu(sent->opcode);
+ /* Don't trigger the timeout behavior if it happens while we're in
+ * userchannel mode. Userspace is responsible for handling any command
+ * timeouts.
+ */
+ if (!(hci_dev_test_flag(hdev, HCI_USER_CHANNEL) &&
+ test_bit(HCI_UP, &hdev->flags))) {
+ if (hdev->sent_cmd) {
+ struct hci_command_hdr *sent =
+ (void *)hdev->sent_cmd->data;
+ u16 opcode = __le16_to_cpu(sent->opcode);

- bt_dev_err(hdev, "command 0x%4.4x tx timeout", opcode);
- } else {
- bt_dev_err(hdev, "command tx timeout");
- }
+ bt_dev_err(hdev, "command 0x%4.4x tx timeout", opcode);
+ } else {
+ bt_dev_err(hdev, "command tx timeout");
+ }

- if (hdev->cmd_timeout)
- hdev->cmd_timeout(hdev);
+ if (hdev->cmd_timeout)
+ hdev->cmd_timeout(hdev);
+ }

atomic_set(&hdev->cmd_cnt, 1);
queue_work(hdev->workqueue, &hdev->cmd_work);
--
2.37.1.559.g78731f0fdb-goog


2022-08-08 19:17:08

by bluez.test.bot

[permalink] [raw]
Subject: RE: Bluetooth: Ignore cmd_timeout with HCI_USER_CHANNEL

This is automated email and please do not reply to this email!

Dear submitter,

Thank you for submitting the patches to the linux bluetooth mailing list.
This is a CI test results with your patch series:
PW Link:https://patchwork.kernel.org/project/bluetooth/list/?series=666118

---Test result---

Test Summary:
CheckPatch PASS 1.29 seconds
GitLint PASS 0.76 seconds
SubjectPrefix PASS 0.57 seconds
BuildKernel PASS 43.76 seconds
BuildKernel32 PASS 37.79 seconds
Incremental Build with patchesPASS 52.27 seconds
TestRunner: Setup PASS 630.09 seconds
TestRunner: l2cap-tester PASS 21.10 seconds
TestRunner: bnep-tester PASS 8.49 seconds
TestRunner: mgmt-tester PASS 133.23 seconds
TestRunner: rfcomm-tester PASS 12.93 seconds
TestRunner: sco-tester PASS 12.33 seconds
TestRunner: smp-tester PASS 11.77 seconds
TestRunner: userchan-tester PASS 8.42 seconds



---
Regards,
Linux Bluetooth

2022-08-08 20:34:49

by Luiz Augusto von Dentz

[permalink] [raw]
Subject: Re: [PATCH] Bluetooth: Ignore cmd_timeout with HCI_USER_CHANNEL

Hi Abhishek,

On Mon, Aug 8, 2022 at 11:04 AM Abhishek Pandit-Subedi
<[email protected]> wrote:
>
> From: Abhishek Pandit-Subedi <[email protected]>
>
> When using HCI_USER_CHANNEL, hci traffic is expected to be handled by
> userspace entirely. However, it is still possible (and sometimes
> desirable) to be able to send commands to the controller directly. In
> such cases, the kernel command timeout shouldn't do any error handling.
>
> Signed-off-by: Abhishek Pandit-Subedi <[email protected]>
> ---
> This was tested by running a userchannel stack and sending commands via
> hcitool cmd on an Intel AX200 controller. Without this change, each
> command sent via hcitool would result in hci_cmd_timeout being called
> and after 5 commands, the controller would reset.
>
> Hcitool continues working here because it marks the socket as
> promiscuous and gets a copy of all traffic while the socket is open (and
> does filtering in userspace).

There is something not quite right here, if you have a controller
using user_channel (addr.hci_channel = HCI_CHANNEL_USER) it probably
shouldn't even accept to be opened again by the likes of hcitool which
uses HCI_CHANNEL_RAW as it can cause conflicts. If you really need a
test tool that does send the command while in HCI_CHANNEL_USER then it
must be send on that mode but I wouldn't do it with hcitool anyway as
that is deprecated and this exercise seem to revolve to a entire stack
on top of HCI_USER_CHANNEL then you shall use tools of that stack and
mix with BlueZ userspace tools.

> Tested on Chromebook with 5.4 kernel with patch (and applied cleanly on
> bluetooth-next).
>
> net/bluetooth/hci_core.c | 26 +++++++++++++++++---------
> 1 file changed, 17 insertions(+), 9 deletions(-)
>
> diff --git a/net/bluetooth/hci_core.c b/net/bluetooth/hci_core.c
> index b3a5a3cc9372..c9a15f6633f7 100644
> --- a/net/bluetooth/hci_core.c
> +++ b/net/bluetooth/hci_core.c
> @@ -1481,17 +1481,25 @@ static void hci_cmd_timeout(struct work_struct *work)
> struct hci_dev *hdev = container_of(work, struct hci_dev,
> cmd_timer.work);
>
> - if (hdev->sent_cmd) {
> - struct hci_command_hdr *sent = (void *) hdev->sent_cmd->data;
> - u16 opcode = __le16_to_cpu(sent->opcode);
> + /* Don't trigger the timeout behavior if it happens while we're in
> + * userchannel mode. Userspace is responsible for handling any command
> + * timeouts.
> + */
> + if (!(hci_dev_test_flag(hdev, HCI_USER_CHANNEL) &&
> + test_bit(HCI_UP, &hdev->flags))) {
> + if (hdev->sent_cmd) {
> + struct hci_command_hdr *sent =
> + (void *)hdev->sent_cmd->data;
> + u16 opcode = __le16_to_cpu(sent->opcode);
>
> - bt_dev_err(hdev, "command 0x%4.4x tx timeout", opcode);
> - } else {
> - bt_dev_err(hdev, "command tx timeout");
> - }
> + bt_dev_err(hdev, "command 0x%4.4x tx timeout", opcode);
> + } else {
> + bt_dev_err(hdev, "command tx timeout");
> + }
>
> - if (hdev->cmd_timeout)
> - hdev->cmd_timeout(hdev);
> + if (hdev->cmd_timeout)
> + hdev->cmd_timeout(hdev);
> + }

I wonder why hci_cmd_timeout is even active if the controller is in
HCI_USER_CHANNEL mode, that sounds like a bug already.

> atomic_set(&hdev->cmd_cnt, 1);
> queue_work(hdev->workqueue, &hdev->cmd_work);
> --
> 2.37.1.559.g78731f0fdb-goog
>


--
Luiz Augusto von Dentz

2022-08-08 21:20:05

by Abhishek Pandit-Subedi

[permalink] [raw]
Subject: Re: [PATCH] Bluetooth: Ignore cmd_timeout with HCI_USER_CHANNEL

On Mon, Aug 8, 2022 at 1:31 PM Luiz Augusto von Dentz
<[email protected]> wrote:
>
> Hi Abhishek,
>
> On Mon, Aug 8, 2022 at 11:04 AM Abhishek Pandit-Subedi
> <[email protected]> wrote:
> >
> > From: Abhishek Pandit-Subedi <[email protected]>
> >
> > When using HCI_USER_CHANNEL, hci traffic is expected to be handled by
> > userspace entirely. However, it is still possible (and sometimes
> > desirable) to be able to send commands to the controller directly. In
> > such cases, the kernel command timeout shouldn't do any error handling.
> >
> > Signed-off-by: Abhishek Pandit-Subedi <[email protected]>
> > ---
> > This was tested by running a userchannel stack and sending commands via
> > hcitool cmd on an Intel AX200 controller. Without this change, each
> > command sent via hcitool would result in hci_cmd_timeout being called
> > and after 5 commands, the controller would reset.
> >
> > Hcitool continues working here because it marks the socket as
> > promiscuous and gets a copy of all traffic while the socket is open (and
> > does filtering in userspace).
>
> There is something not quite right here, if you have a controller
> using user_channel (addr.hci_channel = HCI_CHANNEL_USER) it probably
> shouldn't even accept to be opened again by the likes of hcitool which
> uses HCI_CHANNEL_RAW as it can cause conflicts. If you really need a
> test tool that does send the command while in HCI_CHANNEL_USER then it
> must be send on that mode but I wouldn't do it with hcitool anyway as
> that is deprecated and this exercise seem to revolve to a entire stack
> on top of HCI_USER_CHANNEL then you shall use tools of that stack and
> mix with BlueZ userspace tools.

Our goal is eventually consistent with that aim (not having multiple
users to the socket when using HCI_CHANNEL_USER).

In the interim however, we have existing tooling that expects to be
able to write raw hci to the controller, get responses and not expect
any crashes (Intel Wireless Reporting Tools for example). hcitool is
just an easy test tool here and the real behavior being tested is RAW
channel injections not triggering the cmd timeout.

>
> > Tested on Chromebook with 5.4 kernel with patch (and applied cleanly on
> > bluetooth-next).
> >
> > net/bluetooth/hci_core.c | 26 +++++++++++++++++---------
> > 1 file changed, 17 insertions(+), 9 deletions(-)
> >
> > diff --git a/net/bluetooth/hci_core.c b/net/bluetooth/hci_core.c
> > index b3a5a3cc9372..c9a15f6633f7 100644
> > --- a/net/bluetooth/hci_core.c
> > +++ b/net/bluetooth/hci_core.c
> > @@ -1481,17 +1481,25 @@ static void hci_cmd_timeout(struct work_struct *work)
> > struct hci_dev *hdev = container_of(work, struct hci_dev,
> > cmd_timer.work);
> >
> > - if (hdev->sent_cmd) {
> > - struct hci_command_hdr *sent = (void *) hdev->sent_cmd->data;
> > - u16 opcode = __le16_to_cpu(sent->opcode);
> > + /* Don't trigger the timeout behavior if it happens while we're in
> > + * userchannel mode. Userspace is responsible for handling any command
> > + * timeouts.
> > + */
> > + if (!(hci_dev_test_flag(hdev, HCI_USER_CHANNEL) &&
> > + test_bit(HCI_UP, &hdev->flags))) {
> > + if (hdev->sent_cmd) {
> > + struct hci_command_hdr *sent =
> > + (void *)hdev->sent_cmd->data;
> > + u16 opcode = __le16_to_cpu(sent->opcode);
> >
> > - bt_dev_err(hdev, "command 0x%4.4x tx timeout", opcode);
> > - } else {
> > - bt_dev_err(hdev, "command tx timeout");
> > - }
> > + bt_dev_err(hdev, "command 0x%4.4x tx timeout", opcode);
> > + } else {
> > + bt_dev_err(hdev, "command tx timeout");
> > + }
> >
> > - if (hdev->cmd_timeout)
> > - hdev->cmd_timeout(hdev);
> > + if (hdev->cmd_timeout)
> > + hdev->cmd_timeout(hdev);
> > + }
>
> I wonder why hci_cmd_timeout is even active if the controller is in
> HCI_USER_CHANNEL mode, that sounds like a bug already.

This gets scheduled in hci_cmd_work. I tried not scheduling
hci_cmd_timeout in the first place but that caused the event stream to
hang (I think because subsequent tx work wasn't being scheduled). I
didn't dive very deep here and fix looked complex for a scenario that
we will migrate away from.

>
> > atomic_set(&hdev->cmd_cnt, 1);
> > queue_work(hdev->workqueue, &hdev->cmd_work);
> > --
> > 2.37.1.559.g78731f0fdb-goog
> >
>
>
> --
> Luiz Augusto von Dentz