2023-10-11 11:47:30

by Phil Elwell

[permalink] [raw]
Subject: Re: On brcm80211 maintenance and support

On Wed, 11 Oct 2023 at 12:28, Hector Martin <[email protected]> wrote:
> I think we need to qualify this "no regressions" "rule". People regress
> our machines in mainline all the time. We catch it and get it fixed,
> sometimes in RCs, sometimes it goes all the way to stable and needs to
> get fixed there. We've had patches break everything from Bluetooth LE
> pairing to core memory management to the point we needed earlycon to
> diagnose it. That last one was a blatantly wrong patch to core Linux MM,
> it wasn't even something specific to our platform, but even that got
> past review (it just happened to break us specifically due to a
> coincidence).
>
> The review process doesn't magically avoid regressing things. "No
> regressions" is impossible without someone actually testing things.
> Claiming otherwise is dishonest. So if I end up as maintainer here I
> certainly am not going to promise "no regressions" for chips I can't
> test, without someone interested in those chips testing them. Of course
> I'll take regression fixes when they do happen and someone notices, but
> I can't know in advance until someone does.
>
> Consider a patch that changes some codepath in the driver. I can't know
> whether the original logic was always broken, or whether it worked on
> some chips, and whether the new logic works on those chips or will
> regress them, without testing. This is a regular occurrence with
> brcmfmac, due to the complexity and variability of the firmware
> interface. Often multiple versions of stuff are supported, or some
> structures can be extended, but sometimes they can't. It's a mess, and
> without firmware source code nor any official specs, there is no way to
> know exactly what is intended and what the backwards compatibility
> requirements are.
>
> The only way to avoid that is to gratuitously introduce version/chip
> gates for *every single change affecting behavior from the firmware
> POV*, which is a complete non-starter and would quickly yield a giant
> mess of spaghetti code. It's bad enough having to support explicit
> ABI-breaking changes in firmware, and having to deal with multiple
> versions of huge structures and convert between them. Trying to outright
> keep existing logic identical for "other chips" is just not going to
> happen, not without first having confirmation of what the requirements
> are from someone who has the required docs/source.
>
> I have a patch to enable WPA3 in Broadcom chipsets (yes, the driver is
> in such a sorry state it doesn't even support that yet). The current
> support attempt was added by a Cypress engineer and uses a completely
> different firmware mechanism. Is that supposed to actually work? Does it
> work currently? Is that the case for all Cypress firmwares? Or only
> some? Does the alternate mechanism we have for Broadcom chips work too?
> Only Cypress can answer those questions ahead of time, and they aren't
> (they ignored me last time I brought this up). So my current patch just
> replaces the mechanism with the known-working one for Broadcom chips.
>
> Next time I send a bunch of our downstream patches, I'm going to resend
> that WPA3 patch. And if it regresses Cypress WPA3 support, tough luck.
> If someone catches it (Phil?) and it turns out the existing support is
> the only way to do things on Cypress, I'll rework the patch to
> conditionally support both styles. But if nobody does, and nobody cares,
> then I'm sorry if I regress things, but there is no way to avoid it. We
> can't be gratuitously gating every single change just to hope to avoid
> regressions, without any confirmation of what is required. That just
> doesn't scale.
>
> Phil: Given Cypress' complete apathy towards this driver, if you want
> the Cypress chips in Raspberry Pi systems to continue working and catch
> potential problems ahead of time, it would be helpful if you can test
> the patches in this branch. This is our downstream brcmfmac patchset.
> Regardless of what ends up happening with the maintainer situation,
> giving this a whirl will be very helpful in catching problems with
> systems people actually use. It should be easy to bisect regressions too
> (we keep this rebased directly on recent kernels so you can apply it on
> top of your tree or whatever).
>
> https://github.com/asahilinux/linux/tree/bits/080-wifi

That sounds like a reasonable starting point - I'll take a look.

Phil