2023-11-18 19:31:17

by Aditya Garg

Subject: Re: [REGRESSION] Bluetooth is not working on Macs with BCM4377 chip starting from kernel 6.5



> On 14-Nov-2023, at 3:14 PM, Hector Martin wrote:
>
> On 14/11/2023 18.03, Aditya Garg wrote:
>>
>>
>>> On 14-Nov-2023, at 1:28 PM, Hector Martin wrote:
>>>
>>> On 14/11/2023 15.59, Hector Martin wrote:
>>>> On 14/11/2023 15.23, Aditya Garg wrote:
>>>>>
>>>>>
>>>>>> On 14-Nov-2023, at 5:01 AM, Bagas Sanjaya wrote:
>>>>>>
>>>>>> On Mon, Nov 13, 2023 at 08:57:35PM +0000, Aditya Garg wrote:
>>>>>>> Starting from kernel 6.5, a regression in the kernel is causing Bluetooth to not work on T2 Macs with BCM4377 chip.
>>>>>>>
>>>>>>> Journalctl of kernel 6.4.8 which has Bluetooth working is given here: https://pastebin.com/u9U3kbFJ
>>>>>>>
>>>>>>> Journalctl of kernel 6.5.2, which has Bluetooth broken is given here: https://pastebin.com/aVHNFMRs
>>>>>>>
>>>>>>> Also, the bug hasn’t been fixed even in 6.6.1, as reported by users.
>>>>>>
>>>>>> Can you bisect this regression please?
>>>>>
>>>>> Since I don't have access to this hardware, it's not possible for me to bisect this regression. Let's hope someone is able to do so though.
>>>>
>>>> It's not a regression, it was always broken. I'm sending a patch.
>>>>
>>>> - Hector
>>>
>>> You are quite likely conflating two problems. The ubsan issue you quoted
>>> was always there and the patch I just sent fixes it, but it almost
>>> certainly always worked fine in practice without ubsan.
>>>
>>> The Bluetooth problem you are referring to is likely *specific to
>>> Bluetooth LE devices* and the regression was introduced by 288c90224e
>>> and fixed by 41e9cdea9c, which is also in 6.5.11 and 6.6.1.
>>>
>>> If Bluetooth is broken in *some other way* in 6.6.1 then we need a
>>> proper report or a bisect. Your logs don't show any issues other than
>>> the ubsan noise, which is not a regression.
>>>
>>> - Hector
>>>
>>
>> UBSAN noise seems to be fixed, Bluetooth not working though
>>
>> https://pastebin.com/HeVvMVk4
>>
>> I'll try setting .broken_le_coded = true,
>
> Now you have a probe timeout, which you didn't have before. That's a
> different problem.
>
> Please try this commit and see if it helps:
>
> https://github.com/AsahiLinux/linux/commit/8ec770b4f78fc14629705206e2db54d9d6439686
>
> If it's this then it's still not a regression, it's probably just random
> chance since I think the old timeout value was borderline for the older
> chips.
>
> - Hector
>


Hi,

I recently had a kernel tested with this patch applied and with .broken_le_coded = true set.
Here are the logs: https://pastebin.com/BpfJuJKY

Also, without .broken_le_coded = true, Bluetooth doesn't work, as described in my previous email.


2023-11-20 11:07:51

by Hector Martin

Subject: Re: [REGRESSION] Bluetooth is not working on Macs with BCM4377 chip starting from kernel 6.5



On 2023/11/19 4:31, Aditya Garg wrote:
>
>
>> On 14-Nov-2023, at 3:14 PM, Hector Martin wrote:
>>
>> On 14/11/2023 18.03, Aditya Garg wrote:
>>>
>>>
>>>> On 14-Nov-2023, at 1:28 PM, Hector Martin wrote:
>>>>
>>>> On 14/11/2023 15.59, Hector Martin wrote:
>>>>> On 14/11/2023 15.23, Aditya Garg wrote:
>>>>>>
>>>>>>
>>>>>>> On 14-Nov-2023, at 5:01 AM, Bagas Sanjaya wrote:
>>>>>>>
>>>>>>> On Mon, Nov 13, 2023 at 08:57:35PM +0000, Aditya Garg wrote:
>>>>>>>> Starting from kernel 6.5, a regression in the kernel is causing Bluetooth to not work on T2 Macs with BCM4377 chip.
>>>>>>>>
>>>>>>>> Journalctl of kernel 6.4.8 which has Bluetooth working is given here: https://pastebin.com/u9U3kbFJ
>>>>>>>>
>>>>>>>> Journalctl of kernel 6.5.2, which has Bluetooth broken is given here: https://pastebin.com/aVHNFMRs
>>>>>>>>
>>>>>>>> Also, the bug hasn’t been fixed even in 6.6.1, as reported by users.
>>>>>>>
>>>>>>> Can you bisect this regression please?
>>>>>>
>>>>>> Since I don't have access to this hardware, it's not possible for me to bisect this regression. Let's hope someone is able to do so though.
>>>>>
>>>>> It's not a regression, it was always broken. I'm sending a patch.
>>>>>
>>>>> - Hector
>>>>
>>>> You are quite likely conflating two problems. The ubsan issue you quoted
>>>> was always there and the patch I just sent fixes it, but it almost
>>>> certainly always worked fine in practice without ubsan.
>>>>
>>>> The Bluetooth problem you are referring to is likely *specific to
>>>> Bluetooth LE devices* and the regression was introduced by 288c90224e
>>>> and fixed by 41e9cdea9c, which is also in 6.5.11 and 6.6.1.
>>>>
>>>> If Bluetooth is broken in *some other way* in 6.6.1 then we need a
>>>> proper report or a bisect. Your logs don't show any issues other than
>>>> the ubsan noise, which is not a regression.
>>>>
>>>> - Hector
>>>>
>>>
>>> UBSAN noise seems to be fixed, Bluetooth not working though
>>>
>>> https://pastebin.com/HeVvMVk4
>>>
>>> I'll try setting .broken_le_coded = true,
>>
>> Now you have a probe timeout, which you didn't have before. That's a
>> different problem.
>>
>> Please try this commit and see if it helps:
>>
>> https://github.com/AsahiLinux/linux/commit/8ec770b4f78fc14629705206e2db54d9d6439686
>>
>> If it's this then it's still not a regression, it's probably just random
>> chance since I think the old timeout value was borderline for the older
>> chips.
>>
>> - Hector
>>
>
>
> Hi
>
> I recently had a kernel tested with this patch applied and with .broken_le_coded = true set.
> Here are the logs: https://pastebin.com/BpfJuJKY
>
> Also, without .broken_le_coded = true, Bluetooth doesn't work, as described in my previous email.

So are you saying everything works now? If not, what doesn't work?
"Bluetooth doesn't work" isn't useful information, especially in the
absence of any useful error messages. You can't just dump dmesg logs at
us, you have to *describe* what the problem is.

If broken_le_coded = true "fixed" it then "bluetooth doesn't work" was a
terrible bug report. What that quirk does is make *connecting/pairing to
Bluetooth LE devices* work. Non-BLE devices already worked, the
controller worked, scanning worked, etc. All that is useful information
if you want to get support for issues. We can't magically divine what's
wrong if you just send us a dmesg and say "it's broken". We need
detailed information about exactly what works and what doesn't (e.g. the
controller not showing up at all is VERY different from it showing up
but not finding your device). The only reason we guessed this here is
that this was a known issue that affected other chips. If we ever run
into a 4377-specific issue that only you can reproduce, "bluetooth
doesn't work" and no error logs really isn't going to get it fixed.

- Hector
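
[Illustration] The point above about what the quirk does can be sketched as a small user-space model in C. This is not the driver's actual code. It assumes the quirk works by making the host ignore a capability that the controller advertises but does not handle correctly. The struct, the chip names, the bit value, and both helper functions are hypothetical and exist only for this illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical quirk bit, standing in for a host-side quirk flag. */
#define QUIRK_BROKEN_LE_CODED (1u << 0)

/* Hypothetical per-chip description, loosely modeled on a driver's
 * hardware table where each chip sets flags such as
 * ".broken_le_coded = true". */
struct chip_hw {
    const char *name;
    bool broken_le_coded;
};

/* Made-up entries; the names do not refer to real hardware. */
static const struct chip_hw chips[] = {
    { .name = "chip-a", .broken_le_coded = true },
    { .name = "chip-b", .broken_le_coded = false },
};

/* Translate the per-chip flag into a quirk mask for the host stack. */
static unsigned int chip_quirks(const struct chip_hw *hw)
{
    unsigned int quirks = 0;

    assert(hw != NULL);
    if (hw->broken_le_coded)
        quirks |= QUIRK_BROKEN_LE_CODED;
    return quirks;
}

/* Assumption: the host relies on the capability only when the
 * controller advertises it and the quirk is not set. */
static bool use_le_coded(bool controller_advertises_coded,
                         unsigned int quirks)
{
    return controller_advertises_coded &&
           !(quirks & QUIRK_BROKEN_LE_CODED);
}
```

For example, use_le_coded(true, chip_quirks(&chips[0])) evaluates to false: the capability is advertised, but the quirk tells the host not to rely on it. In this model the flag changes that one decision only. Whether the controller shows up at all is decided elsewhere, which is why a report needs to say which stage fails.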

2023-11-21 11:42:51

by Aditya Garg

Subject: Re: [REGRESSION] Bluetooth is not working on Macs with BCM4377 chip starting from kernel 6.5



> On 20-Nov-2023, at 4:37 PM, Hector Martin wrote:
>
> 
>
>> On 2023/11/19 4:31, Aditya Garg wrote:
>>
>>
>>> On 14-Nov-2023, at 3:14 PM, Hector Martin wrote:
>>>
>>> On 14/11/2023 18.03, Aditya Garg wrote:
>>>>
>>>>
>>>>> On 14-Nov-2023, at 1:28 PM, Hector Martin wrote:
>>>>>
>>>>> On 14/11/2023 15.59, Hector Martin wrote:
>>>>>> On 14/11/2023 15.23, Aditya Garg wrote:
>>>>>>>
>>>>>>>
>>>>>>>> On 14-Nov-2023, at 5:01 AM, Bagas Sanjaya wrote:
>>>>>>>>
>>>>>>>> On Mon, Nov 13, 2023 at 08:57:35PM +0000, Aditya Garg wrote:
>>>>>>>>> Starting from kernel 6.5, a regression in the kernel is causing Bluetooth to not work on T2 Macs with BCM4377 chip.
>>>>>>>>>
>>>>>>>>> Journalctl of kernel 6.4.8 which has Bluetooth working is given here: https://pastebin.com/u9U3kbFJ
>>>>>>>>>
>>>>>>>>> Journalctl of kernel 6.5.2, which has Bluetooth broken is given here: https://pastebin.com/aVHNFMRs
>>>>>>>>>
>>>>>>>>> Also, the bug hasn’t been fixed even in 6.6.1, as reported by users.
>>>>>>>>
>>>>>>>> Can you bisect this regression please?
>>>>>>>
>>>>>>> Since I don't have access to this hardware, it's not possible for me to bisect this regression. Let's hope someone is able to do so though.
>>>>>>
>>>>>> It's not a regression, it was always broken. I'm sending a patch.
>>>>>>
>>>>>> - Hector
>>>>>
>>>>> You are quite likely conflating two problems. The ubsan issue you quoted
>>>>> was always there and the patch I just sent fixes it, but it almost
>>>>> certainly always worked fine in practice without ubsan.
>>>>>
>>>>> The Bluetooth problem you are referring to is likely *specific to
>>>>> Bluetooth LE devices* and the regression was introduced by 288c90224e
>>>>> and fixed by 41e9cdea9c, which is also in 6.5.11 and 6.6.1.
>>>>>
>>>>> If Bluetooth is broken in *some other way* in 6.6.1 then we need a
>>>>> proper report or a bisect. Your logs don't show any issues other than
>>>>> the ubsan noise, which is not a regression.
>>>>>
>>>>> - Hector
>>>>>
>>>>
>>>> UBSAN noise seems to be fixed, Bluetooth not working though
>>>>
>>>> https://pastebin.com/HeVvMVk4
>>>>
>>>> I'll try setting .broken_le_coded = true,
>>>
>>> Now you have a probe timeout, which you didn't have before. That's a
>>> different problem.
>>>
>>> Please try this commit and see if it helps:
>>>
>>> https://github.com/AsahiLinux/linux/commit/8ec770b4f78fc14629705206e2db54d9d6439686
>>>
>>> If it's this then it's still not a regression, it's probably just random
>>> chance since I think the old timeout value was borderline for the older
>>> chips.
>>>
>>> - Hector
>>>
>>
>>
>> Hi
>>
>> I recently had a kernel tested with this patch applied and with .broken_le_coded = true set.
>> Here are the logs: https://pastebin.com/BpfJuJKY
>>
>> Also, without .broken_le_coded = true, Bluetooth doesn't work, as described in my previous email.
>
> So are you saying everything works now? If not, what doesn't work?
> "Bluetooth doesn't work" isn't useful information, especially in the
> absence of any useful error messages. You can't just dump dmesg logs at
> us, you have to *describe* what the problem is.
>
My bad for not specifying that. The user reports that the Bluetooth device is not recognised at all.

Also, broken_le_coded = true did not "fix" it.

As for dmesg, the lack of any log messages about this issue is frustrating for me too, and bisecting seems to be the only option I can think of right now.

> If broken_le_coded = true "fixed" it then "bluetooth doesn't work" was a
> terrible bug report. What that quirk does is make *connecting/pairing to
> Bluetooth LE devices* work. Non-BLE devices already worked, the
> controller worked, scanning worked, etc. All that is useful information
> if you want to get support for issues. We can't magically divine what's
> wrong if you just send us a dmesg and say "it's broken". We need
> detailed information about exactly what works and what doesn't (e.g. the
> controller not showing up at all is VERY different from it showing up
> but not finding your device). The only reason we guessed this here is
> that this was a known issue that affected other chips. If we ever run
> into a 4377-specific issue that only you can reproduce, "bluetooth
> doesn't work" and no error logs really isn't going to get it fixed.
>
> - Hector