Received: by 2002:a05:6902:102b:0:0:0:0 with SMTP id x11csp3344623ybt; Mon, 29 Jun 2020 23:48:57 -0700 (PDT) X-Google-Smtp-Source: ABdhPJzwO6Uk5k6jvv98UPeBOyrKaZdb/SH6peHOoBBO47PIAQ4r0mvmiWIbJfhL5c1+lZSbBYT+ X-Received: by 2002:a17:906:5f98:: with SMTP id a24mr16461965eju.241.1593499737321; Mon, 29 Jun 2020 23:48:57 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1593499737; cv=none; d=google.com; s=arc-20160816; b=WcRbm+7Zhi8UrflVhqYwrQbfa4rGnQQeVu/9voP714YdNxRtPRVd1x+7x1MqIW3Q1r 8zoEl8tq5/WVEz2r2urHiVfq6SGtYeaMMChR1uQE/baoUxKt4zSfrBg5ip/iD7e/hT7q CEgVYvO4Y39/V2o3m+LMd0wAFPAPDYmauglYkCkEAyYRdjTl32XpytznJfn5qpNbw2r5 PM46RwwHB7QS0piwf4+Cp3sCiPzHOye4OrVmXQVTmjJPN36sAJubE1mgbydMumBcRJuE 5fqIMko3z1KPG6rANMS7wisQPVQnyKCxCSLpJOoldqy6C4ASyKmjCOECjqHUvUb0sPVL F/UQ== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:sender:content-transfer-encoding:cc:to:subject :message-id:date:from:in-reply-to:references:mime-version :dkim-signature; bh=XLMDo1UsxYJMELotpDQ5hxq+PK7DJHzcb1+KFvP8aG0=; b=oqW8B4gD7X7j+azBeJwqHltqDGQ1eo2JQmdcYKxbpW3barEDYuFwRUumKhW7c38oC5 q5Bwy47eD6sROP1BOWk1hoSMG4Z7ziFrLdZyKKyvbvx+2y6Jn9UwFnVHPzjOiPP4orVO H+ZObJg+2wwfrbWUXKD3YQ/ByGGqJFmPHjc9FCl3Aq3Lhm3FrdAtrOQtHaA5AROU8eNl AzW1so8dt/vQi69YfPocUCwZimXhK/Rk0BgeCD1r1bPeobY61yU+WL7RVHsNaDIisJh7 1oKH42cikLxdc6rKTchXfzOGsjrHgvT8hFu/6JQOL6JOT73jFBfFwmtYr8rariWCrvHm 6ngg== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@google.com header.s=20161025 header.b=RW58M7OS; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=REJECT sp=REJECT dis=NONE) header.from=google.com Return-Path: Received: from vger.kernel.org (vger.kernel.org. [23.128.96.18]) by mx.google.com with ESMTP id b4si1100945edk.454.2020.06.29.23.48.34; Mon, 29 Jun 2020 23:48:57 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) client-ip=23.128.96.18; Authentication-Results: mx.google.com; dkim=pass header.i=@google.com header.s=20161025 header.b=RW58M7OS; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=REJECT sp=REJECT dis=NONE) header.from=google.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1730482AbgF3Gs2 (ORCPT + 99 others); Tue, 30 Jun 2020 02:48:28 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:38930 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1730386AbgF3Gs1 (ORCPT ); Tue, 30 Jun 2020 02:48:27 -0400 Received: from mail-wr1-x442.google.com (mail-wr1-x442.google.com [IPv6:2a00:1450:4864:20::442]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 2FD59C03E97A for ; Mon, 29 Jun 2020 23:48:27 -0700 (PDT) Received: by mail-wr1-x442.google.com with SMTP id f18so10863043wrs.0 for ; Mon, 29 Jun 2020 23:48:27 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20161025; h=mime-version:references:in-reply-to:from:date:message-id:subject:to :cc:content-transfer-encoding; bh=XLMDo1UsxYJMELotpDQ5hxq+PK7DJHzcb1+KFvP8aG0=; b=RW58M7OS7OlArVG0kvfUEakS2s5DQF2UyPiSnvoqoWFzkI9MOXX2LqKxVSAex0N/pd YSwfgrU3vEo3/E1r2s0cCuclbQWayqu6bo1pDLXfV10T/BTp69V/js6AY0dVpbZZb+FX +hr6o9+cjzIchZ1TF3glX0unPfcIQPTX6i/Be5GOFmxe+XP8YhbCpDwH/kQgdj9NQXMA 18zwOSPbADnMs4HbkNHkuY0xmLnqV1dVUkNeJD+rdHQYZq5C3fCmYvFmf9WOskxO56xJ ondtmjKbGKkc5SOb/sJnijswm2WRDh3K8xA+EuWG7uAVGTDeWPqkLlKv9V9JdX6x5cfu e7ug== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:mime-version:references:in-reply-to:from:date :message-id:subject:to:cc:content-transfer-encoding; bh=XLMDo1UsxYJMELotpDQ5hxq+PK7DJHzcb1+KFvP8aG0=; b=ACITYfxpMykb5sFwnunY+QJ34Wocj7f1qTtI6yEWnX/xuHqeP7iGTKxks6kubvafSn f+W6Ohg9th+rLl5Vzpg2wuyQK5x4Tm+mIs8ueaWafTeUNe6OvFTjTlDGWg4Rr9MTjExc cIgSt8RvaOJzFHtejVnND54mXXNSSkJJZVQsx6Me6GfnnSqC4aRdUyrIe8q1gDTylasT Q3h//lbURuVDh4Tbccmd8ps8FNpDuUyMMZlIg+/SybsV8lSW2pBP9Z6r/phQtR0KtYJC H3E1MAvxZLe4Z29oJ1CZg1GH9Z1CtqqYUDVCEwjk+9yAXLvwQ+cIg69yHh773ENipJyk 5qaw== X-Gm-Message-State: AOAM530bQSZl8iIqW9jNQSSqZjlyUoQmsZyMtLZJPoL26lcHoqbAYlkY wtxfnRd/qVrMdzl3gqRlB9YekVImYtQWVgedEJ078w== X-Received: by 2002:adf:828b:: with SMTP id 11mr21950031wrc.58.1593499705514; Mon, 29 Jun 2020 23:48:25 -0700 (PDT) MIME-Version: 1.0 References: <20200627105437.453053-1-apusaka@google.com> <20200627185320.RFC.v1.1.Icea550bb064a24b89f2217cf19e35b4480a31afd@changeid> <91CFE951-262A-4E83-8550-25445AE84B5A@holtmann.org> In-Reply-To: <91CFE951-262A-4E83-8550-25445AE84B5A@holtmann.org> From: Archie Pusaka Date: Tue, 30 Jun 2020 14:48:14 +0800 Message-ID: Subject: Re: [RFC PATCH v1 1/2] Bluetooth: queue ACL packets if no handle is found To: Marcel Holtmann Cc: linux-bluetooth , chromeos-bluetooth-upstreaming , Abhishek Pandit-Subedi , "David S. Miller" , Jakub Kicinski , Johan Hedberg , kernel list , netdev@vger.kernel.org Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Hi Marcel, On Mon, 29 Jun 2020 at 14:40, Marcel Holtmann wrote: > > Hi Archie, > > > There is a possibility that an ACL packet is received before we > > receive the HCI connect event for the corresponding handle. If this > > happens, we discard the ACL packet. > > > > Rather than just ignoring them, this patch provides a queue for > > incoming ACL packet without a handle. The queue is processed when > > receiving a HCI connection event. If 2 seconds elapsed without > > receiving the HCI connection event, assume something bad happened > > and discard the queued packet. > > > > Signed-off-by: Archie Pusaka > > Reviewed-by: Abhishek Pandit-Subedi > > so two things up front. I want to hide this behind a HCI_QUIRK_OUT_OF_ORD= ER_ACL that a transport driver has to set first. Frankly if this kind of ou= t-of-order happens on UART or SDIO transports, then something is obviously = going wrong. I have no plan to fix up after a fully serialized transport. > > Secondly, if a transport sets HCI_QUIRK_OUT_OF_ORDER_ACL, then I want thi= s off by default. You can enable it via an experimental setting. The reason= here is that we have to make it really hard and fail as often as possible = so that hardware manufactures and spec writers realize that something is fu= ndamentally broken here. > > I have no problem in running the code and complaining loudly in case the = quirk has been set. Just injecting the packets can only happen if bluetooth= d explicitly enabled it. Got it. > > > > > > --- > > > > include/net/bluetooth/hci_core.h | 8 +++ > > net/bluetooth/hci_core.c | 84 +++++++++++++++++++++++++++++--- > > net/bluetooth/hci_event.c | 2 + > > 3 files changed, 88 insertions(+), 6 deletions(-) > > > > diff --git a/include/net/bluetooth/hci_core.h b/include/net/bluetooth/h= ci_core.h > > index 836dc997ff94..b69ecdd0d15a 100644 > > --- a/include/net/bluetooth/hci_core.h > > +++ b/include/net/bluetooth/hci_core.h > > @@ -270,6 +270,9 @@ struct adv_monitor { > > /* Default authenticated payload timeout 30s */ > > #define DEFAULT_AUTH_PAYLOAD_TIMEOUT 0x0bb8 > > > > +/* Time to keep ACL packets without a corresponding handle queued (2s)= */ > > +#define PENDING_ACL_TIMEOUT msecs_to_jiffies(2000) > > + > > Do we have some btmon traces with timestamps. Isn=E2=80=99t a second enou= gh? Actually 2 seconds is an awful long time. When this happens in the test lab, the HCI connect event is about 0.002 second behind the first ACL packet. We can change this if required. > > > struct amp_assoc { > > __u16 len; > > __u16 offset; > > @@ -538,6 +541,9 @@ struct hci_dev { > > struct delayed_work rpa_expired; > > bdaddr_t rpa; > > > > + struct delayed_work remove_pending_acl; > > + struct sk_buff_head pending_acl_q; > > + > > can we name this ooo_q and move it to the other queues in this struct. Un= less we want to add a Kconfig option around it, we don=E2=80=99t need to ke= ep it here. Ack. > > > #if IS_ENABLED(CONFIG_BT_LEDS) > > struct led_trigger *power_led; > > #endif > > @@ -1773,6 +1779,8 @@ void hci_le_start_enc(struct hci_conn *conn, __le= 16 ediv, __le64 rand, > > void hci_copy_identity_address(struct hci_dev *hdev, bdaddr_t *bdaddr, > > u8 *bdaddr_type); > > > > +void hci_process_pending_acl(struct hci_dev *hdev, struct hci_conn *co= nn); > > + > > #define SCO_AIRMODE_MASK 0x0003 > > #define SCO_AIRMODE_CVSD 0x0000 > > #define SCO_AIRMODE_TRANSP 0x0003 > > diff --git a/net/bluetooth/hci_core.c b/net/bluetooth/hci_core.c > > index 7959b851cc63..30780242c267 100644 > > --- a/net/bluetooth/hci_core.c > > +++ b/net/bluetooth/hci_core.c > > @@ -1786,6 +1786,7 @@ int hci_dev_do_close(struct hci_dev *hdev) > > skb_queue_purge(&hdev->rx_q); > > skb_queue_purge(&hdev->cmd_q); > > skb_queue_purge(&hdev->raw_q); > > + skb_queue_purge(&hdev->pending_acl_q); > > > > /* Drop last sent command */ > > if (hdev->sent_cmd) { > > @@ -3518,6 +3519,78 @@ static int hci_suspend_notifier(struct notifier_= block *nb, unsigned long action, > > return NOTIFY_STOP; > > } > > > > +static void hci_add_pending_acl(struct hci_dev *hdev, struct sk_buff *= skb) > > +{ > > + skb_queue_tail(&hdev->pending_acl_q, skb); > > + > > + queue_delayed_work(hdev->workqueue, &hdev->remove_pending_acl, > > + PENDING_ACL_TIMEOUT); > > +} > > + > > +void hci_process_pending_acl(struct hci_dev *hdev, struct hci_conn *co= nn) > > +{ > > + struct sk_buff *skb, *tmp; > > + struct hci_acl_hdr *hdr; > > + u16 handle, flags; > > + bool reset_timer =3D false; > > + > > + skb_queue_walk_safe(&hdev->pending_acl_q, skb, tmp) { > > + hdr =3D (struct hci_acl_hdr *)skb->data; > > + handle =3D __le16_to_cpu(hdr->handle); > > + flags =3D hci_flags(handle); > > + handle =3D hci_handle(handle); > > + > > + if (handle !=3D conn->handle) > > + continue; > > + > > + __skb_unlink(skb, &hdev->pending_acl_q); > > + skb_pull(skb, HCI_ACL_HDR_SIZE); > > + > > + l2cap_recv_acldata(conn, skb, flags); > > + reset_timer =3D true; > > + } > > + > > + if (reset_timer) > > + mod_delayed_work(hdev->workqueue, &hdev->remove_pending_a= cl, > > + PENDING_ACL_TIMEOUT); > > +} > > + > > +/* Remove the oldest pending ACL, and all pending ACLs with the same h= andle */ > > +static void hci_remove_pending_acl(struct work_struct *work) > > +{ > > + struct hci_dev *hdev; > > + struct sk_buff *skb, *tmp; > > + struct hci_acl_hdr *hdr; > > + u16 handle, oldest_handle; > > + > > + hdev =3D container_of(work, struct hci_dev, remove_pending_acl.wo= rk); > > + skb =3D skb_dequeue(&hdev->pending_acl_q); > > + > > + if (!skb) > > + return; > > + > > + hdr =3D (struct hci_acl_hdr *)skb->data; > > + oldest_handle =3D hci_handle(__le16_to_cpu(hdr->handle)); > > + kfree_skb(skb); > > + > > + bt_dev_err(hdev, "ACL packet for unknown connection handle %d", > > + oldest_handle); > > + > > + skb_queue_walk_safe(&hdev->pending_acl_q, skb, tmp) { > > + hdr =3D (struct hci_acl_hdr *)skb->data; > > + handle =3D hci_handle(__le16_to_cpu(hdr->handle)); > > + > > + if (handle =3D=3D oldest_handle) { > > + __skb_unlink(skb, &hdev->pending_acl_q); > > + kfree_skb(skb); > > + } > > + } > > + > > + if (!skb_queue_empty(&hdev->pending_acl_q)) > > + queue_delayed_work(hdev->workqueue, &hdev->remove_pending= _acl, > > + PENDING_ACL_TIMEOUT); > > +} > > + > > So I am wondering if we make this too complicated. Since generally speaki= ng we can only have a single HCI connect complete anyway at a time. No matt= er if the controller serializes it for us or we do it for the controller. S= o hci_conn_add could just process the queue for packets with its handle and= then flush it. And it can flush it no matter what since whatever other pac= kets are in the queue, they can not be valid. > > That said, we wouldn=E2=80=99t even need to check the packet handles at a= ll. We just needed to flag them as already out-of-order queued once and han= d them back into the rx_q at the top. Then the would be processed as usual.= Already ooo packets would cause the same error as before if it is for a no= n-existing handle and others would end up being processed. > > For me this means we just need another queue to park the packets until hc= i_conn_add gets called. I might have missed something, but I am looking for= the least invasive option for this and least code duplication. I'm not aware of the fact that we can only have a single HCI connect complete event at any time. Is this also true even if two / more peripherals connect at the same time? I was under the impression that if we have device A and B both are connecting to us at the same time, we might receive the packets in this order: (1) ACL A (2) ACL B (3) HCI conn evt B (4) HCI conn evt A Hence the queue and the handle check. > > > /* Alloc HCI device */ > > struct hci_dev *hci_alloc_dev(void) > > { > > @@ -3610,10 +3683,12 @@ struct hci_dev *hci_alloc_dev(void) > > INIT_WORK(&hdev->suspend_prepare, hci_prepare_suspend); > > > > INIT_DELAYED_WORK(&hdev->power_off, hci_power_off); > > + INIT_DELAYED_WORK(&hdev->remove_pending_acl, hci_remove_pending_a= cl); > > > > skb_queue_head_init(&hdev->rx_q); > > skb_queue_head_init(&hdev->cmd_q); > > skb_queue_head_init(&hdev->raw_q); > > + skb_queue_head_init(&hdev->pending_acl_q); > > > > init_waitqueue_head(&hdev->req_wait_q); > > init_waitqueue_head(&hdev->suspend_wait_q); > > @@ -4662,8 +4737,6 @@ static void hci_acldata_packet(struct hci_dev *hd= ev, struct sk_buff *skb) > > struct hci_conn *conn; > > __u16 handle, flags; > > > > - skb_pull(skb, HCI_ACL_HDR_SIZE); > > - > > handle =3D __le16_to_cpu(hdr->handle); > > flags =3D hci_flags(handle); > > handle =3D hci_handle(handle); > > @@ -4678,17 +4751,16 @@ static void hci_acldata_packet(struct hci_dev *= hdev, struct sk_buff *skb) > > hci_dev_unlock(hdev); > > > > if (conn) { > > + skb_pull(skb, HCI_ACL_HDR_SIZE); > > + > > hci_conn_enter_active_mode(conn, BT_POWER_FORCE_ACTIVE_OF= F); > > > > /* Send to upper protocol */ > > l2cap_recv_acldata(conn, skb, flags); > > return; > > } else { > > - bt_dev_err(hdev, "ACL packet for unknown connection handl= e %d", > > - handle); > > + hci_add_pending_acl(hdev, skb); > > So here I want to keep being verbose. If no quirk is set, then this has t= o stay as an error. In case the quirk is set, then this should still warn t= hat we are queuing up a packet. It is not an expected behavior. Ack. > > > } > > - > > - kfree_skb(skb); > > } > > > > /* SCO data packet */ > > diff --git a/net/bluetooth/hci_event.c b/net/bluetooth/hci_event.c > > index e060fc9ebb18..108c6c102a6a 100644 > > --- a/net/bluetooth/hci_event.c > > +++ b/net/bluetooth/hci_event.c > > @@ -2627,6 +2627,8 @@ static void hci_conn_complete_evt(struct hci_dev = *hdev, struct sk_buff *skb) > > hci_send_cmd(hdev, HCI_OP_CHANGE_CONN_PTYPE, size= of(cp), > > &cp); > > } > > + > > + hci_process_pending_acl(hdev, conn); > > Can we just do this in hci_conn_add() when we create the connection objec= t? Yes, we can. > > > } else { > > conn->state =3D BT_CLOSED; > > if (conn->type =3D=3D ACL_LINK) > > -- > > 2.27.0.212.ge8ba1cc988-goog > > > > Regards > > Marcel > Thanks, Archie