MIME-Version: 1.0
Date: Mon, 14 Feb 2011 16:23:10 -0600
Subject: Re: HCI core error recovery.
From: Andrei Warkentin
To: linux-bluetooth@vger.kernel.org
Content-Type: text/plain; charset=ISO-8859-1
Sender: linux-bluetooth-owner@vger.kernel.org

On Sat, Feb 12, 2011 at 12:47 AM, Andrei Warkentin wrote:
> On Fri, Feb 11, 2011 at 5:07 PM, Andrei Warkentin wrote:
>> Dear List,
>>
>> I've run into an interesting problem. Excuse me in advance if this was
>> already covered here, or for my explanations, since I'm not too
>> familiar with overall flow within BlueZ or Bluetooth specifics...
>> We've had some hardware config issues that resulted in garbage/malformed
>> messages arriving via H4 into the HCI layer. We've since resolved
>> these, but it got me thinking. The issues would result in certain HCI
>> messages being missed, including occasionally disconnect events being
>> missed, and a subsequent connect event would result in a double add.
>>
>> I was thinking about how to fix at the very least the crash. The sysfs
>> object is created as a last step after getting a "connection
>> completed" HCI message, I think. What I am unsure about is if it's
>> safe to just ignore the add if there is already a sysfs entry...
>>
>> So I would think the HCI core needs some resiliency against
>> bad/malignant bluetooth controllers, and perform error
>> recovery/resynchronization. Perhaps maybe there is room for a virtual
>> hci controller that just injects various message types to see how well
>> the core can cope?
>>
>> Thanks in advance,
>> A
>
> To further explain the issue, here is what was happening -
>
> 0) A BT device is paired.
> 1) Host goes into sleep mode.
> 2) BT device turns off.
> 3) Host wakes up due to BT waking the host. Due to UART resume issues,
> HCI message corrupted. hci_disconn_complete_evt never gets called.
> 4) BT device turns on.
> 5) devref gets incremented in hci_conn_complete_evt, and is now 2.
> 6) BT device turns off. hci_disconn_complete_evt is called, conn hash
> is deleted, but sysfs entry not cleaned up since
> atomic_dec_and_test(&conn->devref) returns false (devref goes from 2
> to 1, not 0).
> 7) BT device turns on. sysfs add fails since it never was cleaned up.
>
> The attached patch takes care of that. I'm not too familiar with BlueZ
> (or bluetooth :-(), so I would like your feedback. In particular, I am
> unsure about sync connections.
> The primary issue overall is that HCI core doesn't handle HCI issues
> (whether caused by transport issues, or bad/malicious BT controller).
> I am curious if there are other ways to break the core.
>
> Thanks,
> A

Anyone?
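
In case it helps anyone follow steps 0-7 above, here is a small userspace
model of the reference counting. It is only a sketch under assumptions: the
model_* names are made up for illustration and are not kernel APIs, the
sysfs entry is reduced to a single flag, and C11 atomic_fetch_sub stands in
for the kernel's atomic_dec_and_test. It is not the attached patch.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical userspace model; these are not the kernel's structures. */
struct model_conn {
    atomic_int devref;   /* stands in for conn->devref */
};

struct model_sysfs {
    bool present;        /* true while the connection's sysfs name exists */
};

/* Take a device reference, as the "connection complete" path does. */
static void model_hold_device(struct model_conn *c)
{
    atomic_fetch_add(&c->devref, 1);
}

/* Register the sysfs name; fails if it is already there (double add). */
static int model_add_sysfs(struct model_sysfs *s)
{
    if (s->present)
        return -1;
    s->present = true;
    return 0;
}

/* Drop a reference; remove the sysfs name only when the count hits zero. */
static void model_put_device(struct model_conn *c, struct model_sysfs *s)
{
    /* fetch_sub returns the old value, so old == 1 means "now zero". */
    if (atomic_fetch_sub(&c->devref, 1) == 1)
        s->present = false;
}

/* Replays steps 0-7; returns the result of the sysfs add in step 7. */
int model_replay(bool lose_first_disconnect)
{
    struct model_conn conn;
    struct model_sysfs sysfs = { .present = false };

    atomic_init(&conn.devref, 0);

    model_hold_device(&conn);            /* step 0: devref = 1 */
    (void)model_add_sysfs(&sysfs);

    if (!lose_first_disconnect)          /* step 3: event may be lost */
        model_put_device(&conn, &sysfs);

    model_hold_device(&conn);            /* step 5: devref = 2 if lost */
    (void)model_add_sysfs(&sysfs);       /* double add if lost; ignored */

    model_put_device(&conn, &sysfs);     /* step 6: 2 -> 1 if lost */
    assert(atomic_load(&conn.devref) == (lose_first_disconnect ? 1 : 0));

    return model_add_sysfs(&sysfs);      /* step 7: device turns on again */
}
```

With the first disconnect lost, model_replay(true) ends with the step 7 add
failing; with it delivered, model_replay(false) succeeds. Whether ignoring
the duplicate add is safe in the real core is a separate question that this
model does not answer.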