2022-03-31 02:42:19

by Krzysztof Kozlowski

[permalink] [raw]
Subject: Re: [PATCH] CHROMIUM: arm64: dts: qcom: Add sc7180-gelarshie

On 30/03/2022 11:09, Mars Chen wrote:
> Initial attempt at Gelarshie device tree.
>
> BUG=b:225756600
> TEST=emerge-strongbad chromeos-kernel-5_4
>
> Signed-off-by: Mars Chen <[email protected]>
> ---
> arch/arm64/boot/dts/qcom/Makefile | 1 +
> .../dts/qcom/sc7180-trogdor-gelarshie-r0.dts | 15 +
> .../dts/qcom/sc7180-trogdor-gelarshie.dtsi | 304 ++++++++++++++++++
> 3 files changed, 320 insertions(+)
> create mode 100644 arch/arm64/boot/dts/qcom/sc7180-trogdor-gelarshie-r0.dts
> create mode 100644 arch/arm64/boot/dts/qcom/sc7180-trogdor-gelarshie.dtsi
>
> diff --git a/arch/arm64/boot/dts/qcom/Makefile b/arch/arm64/boot/dts/qcom/Makefile
> index f9e6343acd03..cf8f88b065c3 100644
> --- a/arch/arm64/boot/dts/qcom/Makefile
> +++ b/arch/arm64/boot/dts/qcom/Makefile
> @@ -57,6 +57,7 @@ dtb-$(CONFIG_ARCH_QCOM) += sc7180-trogdor-coachz-r1.dtb
> dtb-$(CONFIG_ARCH_QCOM) += sc7180-trogdor-coachz-r1-lte.dtb
> dtb-$(CONFIG_ARCH_QCOM) += sc7180-trogdor-coachz-r3.dtb
> dtb-$(CONFIG_ARCH_QCOM) += sc7180-trogdor-coachz-r3-lte.dtb
> +dtb-$(CONFIG_ARCH_QCOM) += sc7180-trogdor-gelarshie-r0.dtb
> dtb-$(CONFIG_ARCH_QCOM) += sc7180-trogdor-homestar-r2.dtb
> dtb-$(CONFIG_ARCH_QCOM) += sc7180-trogdor-homestar-r3.dtb
> dtb-$(CONFIG_ARCH_QCOM) += sc7180-trogdor-homestar-r4.dtb
> diff --git a/arch/arm64/boot/dts/qcom/sc7180-trogdor-gelarshie-r0.dts b/arch/arm64/boot/dts/qcom/sc7180-trogdor-gelarshie-r0.dts
> new file mode 100644
> index 000000000000..027d6d563a5f
> --- /dev/null
> +++ b/arch/arm64/boot/dts/qcom/sc7180-trogdor-gelarshie-r0.dts
> @@ -0,0 +1,15 @@
> +// SPDX-License-Identifier: (GPL-2.0+ OR MIT)
> +/*
> + * Google Gelarshie board device tree source
> + *
> + * Copyright 2022 Google LLC.
> + */
> +
> +/dts-v1/;
> +
> +#include "sc7180-trogdor-gelarshie.dtsi"
> +
> +/ {
> + model = "Google Gelarshie (rev0+)";
> + compatible = "google,gelarshie", "qcom,sc7180";

Missing bindings. Please document the compatible.

Best regards,
Krzysztof


2022-04-14 07:50:51

by Doug Anderson

[permalink] [raw]
Subject: Re: [PATCH] CHROMIUM: arm64: dts: qcom: Add sc7180-gelarshie

Hi,

On Wed, Mar 30, 2022 at 10:25 AM Krzysztof Kozlowski
<[email protected]> wrote:
>
> On 30/03/2022 11:09, Mars Chen wrote:
> > Initial attempt at Gelarshie device tree.
> >
> > BUG=b:225756600
> > TEST=emerge-strongbad chromeos-kernel-5_4
> >
> > Signed-off-by: Mars Chen <[email protected]>
> > ---
> > arch/arm64/boot/dts/qcom/Makefile | 1 +
> > .../dts/qcom/sc7180-trogdor-gelarshie-r0.dts | 15 +
> > .../dts/qcom/sc7180-trogdor-gelarshie.dtsi | 304 ++++++++++++++++++
> > 3 files changed, 320 insertions(+)
> > create mode 100644 arch/arm64/boot/dts/qcom/sc7180-trogdor-gelarshie-r0.dts
> > create mode 100644 arch/arm64/boot/dts/qcom/sc7180-trogdor-gelarshie.dtsi
> >
> > diff --git a/arch/arm64/boot/dts/qcom/Makefile b/arch/arm64/boot/dts/qcom/Makefile
> > index f9e6343acd03..cf8f88b065c3 100644
> > --- a/arch/arm64/boot/dts/qcom/Makefile
> > +++ b/arch/arm64/boot/dts/qcom/Makefile
> > @@ -57,6 +57,7 @@ dtb-$(CONFIG_ARCH_QCOM) += sc7180-trogdor-coachz-r1.dtb
> > dtb-$(CONFIG_ARCH_QCOM) += sc7180-trogdor-coachz-r1-lte.dtb
> > dtb-$(CONFIG_ARCH_QCOM) += sc7180-trogdor-coachz-r3.dtb
> > dtb-$(CONFIG_ARCH_QCOM) += sc7180-trogdor-coachz-r3-lte.dtb
> > +dtb-$(CONFIG_ARCH_QCOM) += sc7180-trogdor-gelarshie-r0.dtb
> > dtb-$(CONFIG_ARCH_QCOM) += sc7180-trogdor-homestar-r2.dtb
> > dtb-$(CONFIG_ARCH_QCOM) += sc7180-trogdor-homestar-r3.dtb
> > dtb-$(CONFIG_ARCH_QCOM) += sc7180-trogdor-homestar-r4.dtb
> > diff --git a/arch/arm64/boot/dts/qcom/sc7180-trogdor-gelarshie-r0.dts b/arch/arm64/boot/dts/qcom/sc7180-trogdor-gelarshie-r0.dts
> > new file mode 100644
> > index 000000000000..027d6d563a5f
> > --- /dev/null
> > +++ b/arch/arm64/boot/dts/qcom/sc7180-trogdor-gelarshie-r0.dts
> > @@ -0,0 +1,15 @@
> > +// SPDX-License-Identifier: (GPL-2.0+ OR MIT)
> > +/*
> > + * Google Gelarshie board device tree source
> > + *
> > + * Copyright 2022 Google LLC.
> > + */
> > +
> > +/dts-v1/;
> > +
> > +#include "sc7180-trogdor-gelarshie.dtsi"
> > +
> > +/ {
> > + model = "Google Gelarshie (rev0+)";
> > + compatible = "google,gelarshie", "qcom,sc7180";
>
> Missing bindings. Please document the compatible.

I'm actually kinda curious: is there really a good reason for this? I
know I haven't been adding things to
`Documentation/devicetree/bindings/arm/qcom.yaml` for Qualcomm
Chromebooks. Ironically, it turns out that the script I typically use
to invoke checkpatch happens to have "--no-tree" as an argument and
that seems to disable this check. Doh!

That being said, though, I do wonder a little bit about the value of
enumerating the top-level compatible like this in a yaml file.
Certainly the yaml schema validation in general can be quite useful,
but this top-level listing seems pure overhead. I guess it makes some
tools happy, but other than that it seems to provide very little
value...

-Doug

2022-04-14 12:59:06

by Krzysztof Kozlowski

[permalink] [raw]
Subject: Re: [PATCH] CHROMIUM: arm64: dts: qcom: Add sc7180-gelarshie

On 13/04/2022 23:48, Doug Anderson wrote:
> I'm actually kinda curious: is there really a good reason for this? I
> know I haven't been adding things to
> `Documentation/devicetree/bindings/arm/qcom.yaml` for Qualcomm
> Chromebooks. Ironically, it turns out that the script I typically use
> to invoke checkpatch happens to have "--no-tree" as an argument and
> that seems to disable this check. Doh!
>
> That being said, though, I do wonder a little bit about the value of
> enumerating the top-level compatible like this in a yaml file.
> Certainly the yaml schema validation in general can be quite useful,
> but this top-level listing seems pure overhead. I guess it makes some
> tools happy, but other than that it seems to provide very little
> value...

If compatible is not part of ABI, it is allowed to change in whatever
shape one wishes. In such case, how can anyone (e.g. user-space)
identify the board? Model name? Also not part of ABI (not documented)...


Best regards,
Krzysztof

2022-04-16 02:31:13

by Doug Anderson

[permalink] [raw]
Subject: Re: [PATCH] CHROMIUM: arm64: dts: qcom: Add sc7180-gelarshie

Hi,

On Thu, Apr 14, 2022 at 12:10 AM Krzysztof Kozlowski
<[email protected]> wrote:
>
> On 13/04/2022 23:48, Doug Anderson wrote:
> > I'm actually kinda curious: is there really a good reason for this? I
> > know I haven't been adding things to
> > `Documentation/devicetree/bindings/arm/qcom.yaml` for Qualcomm
> > Chromebooks. Ironically, it turns out that the script I typically use
> > to invoke checkpatch happens to have "--no-tree" as an argument and
> > that seems to disable this check. Doh!
> >
> > That being said, though, I do wonder a little bit about the value of
> > enumerating the top-level compatible like this in a yaml file.
> > Certainly the yaml schema validation in general can be quite useful,
> > but this top-level listing seems pure overhead. I guess it makes some
> > tools happy, but other than that it seems to provide very little
> > value...
>
> If compatible is not part of ABI, it is allowed to change in whatever
> shape one wishes. In such case, how can anyone (e.g. user-space)
> identify the board? Model name? Also not part of ABI (not documented)...

Hmm, it is a good question. I guess one issue is that the way
Chromebooks interact with the bootloader it's not trivially easy to
enumerate what exactly the compatible will be in this hardcoded list.
It all has to do with the whole "revision" and "sku" scheme the
bootloader on ARM Chromebooks uses. For example, on one Chromebook I
have the bootloader prints out:

Compat preference: google,lazor-rev5-sku6 google,lazor-rev5
google,lazor-sku6 google,lazor

What that means is that:

1. The bootloader will first look for 'google,lazor-rev5-sku6'. If it
finds a dts with that compatible it will pick it.

2. The bootloader will then look for 'google,lazor-rev5'. If it finds
a dts with that compatible it will pick it.

3. The bootloader will then look for 'google,lazor-sku6'. If it finds
a dts with that compatible it will pick it.

4. Finally, the bootloader will look for 'google,lazor'.

There's a method to the madness. Among other things, this allows
revving the board revision for a change to the board even if it
_should_ be invisible to software. The rule is always that the
"newest" device tree that's in Linux is always listed _without_ a
board revision number. An example might help.

a) Assume '-rev5' is the newest revision available. In Linux this
would be the only dts that advertises "google,lazor" (with no -rev).
Previous dts file would advertise "-rev3" or "-rev4" or whatever.

b) We need to spin the board for something that should be invisible to
software. Just in case, HW guys change the board strappings to -rev6.
This works _seamlessly_ because the newest dts file always advertises
just "google,lazor"

c) We spin the board for something that's _not_ invisible. It will be
"-rev7". Now, we go back and add "-rev5" and "-rev6" to the old board
dts file and remove the advertisement for "google,lazor". We create a
new dts file for -rev7 that advertises "google,lazor".

Now we can certainly argue back and forth above the above scheme and
how it's terrible and/or great, but it definitely works pretty well
and it's what we've been doing for a while now. Before that we used to
proactively add a whole bunch of "future" revisions "just in case".
That was definitely worse and had the same problem that we'd have to
shuffle compatibles. See, for instance `rk3288-veyron-jerry.dts`.

One thing we _definitely_ don't want to do is to give HW _any_
incentive to make board spins _without_ changing the revision. HW
sometimes makes spins without first involving software and if it
doesn't boot because they updated the board ID then someone in China
will just put the old ID in and ship it off. That's bad.

--

But I guess this doesn't answer your question: how can userspace
identify what board this is running? I don't have an answer to that,
but I guess I'd say that the top-level "compatible" isn't really it.
If nothing else, I think just from the definition it's not guaranteed
to be right, is it? From the spec: "Specifies a list of platform
architectures with which this platform is compatible." The key thing
is "a list". If this can be a list of things then how can you use it
to uniquely identify what one board you're on? If all of the things
that are different between two boards are things that are probable
(eDP panels, USB devices, PCIe devices) then two very different boards
could have the exact same device tree, right? ...and you could have
one device tree that lists the compatible of both boards?

That all being said, I think that on Chrome OS the userspace tools
_do_ some amount of parsing of the compatible strings here. For
Chromebooks they leverage the fact that they understand the above
scheme and thus can figure things out. I think they also use things
like `/proc/device-tree/firmware/coreboot/board-id` and
`/proc/device-tree/firmware/coreboot/sku-id`. That doesn't seem to be
documented, though. :(

I guess the question is, though, why do you need to know what board you're on?

-Doug

2022-04-20 05:18:28

by Krzysztof Kozlowski

[permalink] [raw]
Subject: Re: [PATCH] CHROMIUM: arm64: dts: qcom: Add sc7180-gelarshie

On 14/04/2022 19:36, Doug Anderson wrote:
> Hi,
>
> On Thu, Apr 14, 2022 at 12:10 AM Krzysztof Kozlowski
> <[email protected]> wrote:
>>
>> On 13/04/2022 23:48, Doug Anderson wrote:
>>> I'm actually kinda curious: is there really a good reason for this? I
>>> know I haven't been adding things to
>>> `Documentation/devicetree/bindings/arm/qcom.yaml` for Qualcomm
>>> Chromebooks. Ironically, it turns out that the script I typically use
>>> to invoke checkpatch happens to have "--no-tree" as an argument and
>>> that seems to disable this check. Doh!
>>>
>>> That being said, though, I do wonder a little bit about the value of
>>> enumerating the top-level compatible like this in a yaml file.
>>> Certainly the yaml schema validation in general can be quite useful,
>>> but this top-level listing seems pure overhead. I guess it makes some
>>> tools happy, but other than that it seems to provide very little
>>> value...
>>
>> If compatible is not part of ABI, it is allowed to change in whatever
>> shape one wishes. In such case, how can anyone (e.g. user-space)
>> identify the board? Model name? Also not part of ABI (not documented)...
>
> Hmm, it is a good question. I guess one issue is that the way
> Chromebooks interact with the bootloader it's not trivially easy to
> enumerate what exactly the compatible will be in this hardcoded list.
> It all has to do with the whole "revision" and "sku" scheme the
> bootloader on ARM Chromebooks uses. For example, on one Chromebook I
> have the bootloader prints out:
>
> Compat preference: google,lazor-rev5-sku6 google,lazor-rev5
> google,lazor-sku6 google,lazor
>
> What that means is that:
>
> 1. The bootloader will first look for 'google,lazor-rev5-sku6'. If it
> finds a dts with that compatible it will pick it.
>
> 2. The bootloader will then look for 'google,lazor-rev5'. If it finds
> a dts with that compatible it will pick it.
>
> 3. The bootloader will then look for 'google,lazor-sku6'. If it finds
> a dts with that compatible it will pick it.
>
> 4. Finally, the bootloader will look for 'google,lazor'.
>
> There's a method to the madness. Among other things, this allows
> revving the board revision for a change to the board even if it
> _should_ be invisible to software. The rule is always that the
> "newest" device tree that's in Linux is always listed _without_ a
> board revision number. An example might help.
>
> a) Assume '-rev5' is the newest revision available. In Linux this
> would be the only dts that advertises "google,lazor" (with no -rev).
> Previous dts file would advertise "-rev3" or "-rev4" or whatever.
>
> b) We need to spin the board for something that should be invisible to
> software. Just in case, HW guys change the board strappings to -rev6.
> This works _seamlessly_ because the newest dts file always advertises
> just "google,lazor"
>
> c) We spin the board for something that's _not_ invisible. It will be
> "-rev7". Now, we go back and add "-rev5" and "-rev6" to the old board
> dts file and remove the advertisement for "google,lazor". We create a
> new dts file for -rev7 that advertises "google,lazor".

Except shuffling the compatibles in bindings, you are changing the
meaning of final "google,lazor" compatible. The bootloader works as
expected - from most specific (rev5-sku6) to most generic compatible
(google,lazor) but why do you need to advertise the latest rev as
"google,lazor"? Why the bootloader on latest rev (e.g. rev7) cannot bind
to rev7 compatible?

> Now we can certainly argue back and forth above the above scheme and
> how it's terrible and/or great, but it definitely works pretty well
> and it's what we've been doing for a while now. Before that we used to
> proactively add a whole bunch of "future" revisions "just in case".
> That was definitely worse and had the same problem that we'd have to
> shuffle compatibles. See, for instance `rk3288-veyron-jerry.dts`.
>
> One thing we _definitely_ don't want to do is to give HW _any_
> incentive to make board spins _without_ changing the revision. HW
> sometimes makes spins without first involving software and if it
> doesn't boot because they updated the board ID then someone in China
> will just put the old ID in and ship it off. That's bad.
>
> --
>
> But I guess this doesn't answer your question: how can userspace
> identify what board this is running? I don't have an answer to that,
> but I guess I'd say that the top-level "compatible" isn't really it.

It can, the same as bootloader, by looking at the most specific
compatible (rev7).

> If nothing else, I think just from the definition it's not guaranteed
> to be right, is it? From the spec: "Specifies a list of platform
> architectures with which this platform is compatible." The key thing
> is "a list". If this can be a list of things then how can you use it
> to uniquely identify what one board you're on?

The most specific compatible identifies or, like recently Rob confirmed
in case of Renesas, the list of compatibles:
https://lore.kernel.org/linux-devicetree/Yk2%[email protected]/

> If all of the things
> that are different between two boards are things that are probable
> (eDP panels, USB devices, PCIe devices) then two very different boards
> could have the exact same device tree, right? ...and you could have
> one device tree that lists the compatible of both boards?

What is the question here?

>
> That all being said, I think that on Chrome OS the userspace tools
> _do_ some amount of parsing of the compatible strings here. For
> Chromebooks they leverage the fact that they understand the above
> scheme and thus can figure things out. I think they also use things
> like `/proc/device-tree/firmware/coreboot/board-id` and
> `/proc/device-tree/firmware/coreboot/sku-id`. That doesn't seem to be
> documented, though. :(
>
> I guess the question is, though, why do you need to know what board you're on?

You might have (and I had in some previous job) single user-space
application working on different HW and responding slightly differently,
depending on the hardware it runs. Exactly the same as kernel behaves
differently, depending on DTB. The differences for example might be in
GPIOs or some other interfaces managed via user-space drivers, not in
presence of devices. Another example are system tests behaving
differently depending on the DUT, where again you run the tests in a
generic way so the DUT is autodetected based on board.

Of course you could say: different hardware, has different DTB, so
user-space should check entire tree, to figure out how to operate that
hardware. Yeah, that's one way, very complex and actually duplicating
kernel's work. Embedded apps are specialized, so it is much easier for
them to check board compatible and make assumptions on that.

Best regards,
Krzysztof

2022-04-20 20:44:53

by Doug Anderson

[permalink] [raw]
Subject: Re: [PATCH] CHROMIUM: arm64: dts: qcom: Add sc7180-gelarshie

Hi,

On Tue, Apr 19, 2022 at 8:47 AM Krzysztof Kozlowski
<[email protected]> wrote:
>
> On 14/04/2022 19:36, Doug Anderson wrote:
> > Hi,
> >
> > On Thu, Apr 14, 2022 at 12:10 AM Krzysztof Kozlowski
> > <[email protected]> wrote:
> >>
> >> On 13/04/2022 23:48, Doug Anderson wrote:
> >>> I'm actually kinda curious: is there really a good reason for this? I
> >>> know I haven't been adding things to
> >>> `Documentation/devicetree/bindings/arm/qcom.yaml` for Qualcomm
> >>> Chromebooks. Ironically, it turns out that the script I typically use
> >>> to invoke checkpatch happens to have "--no-tree" as an argument and
> >>> that seems to disable this check. Doh!
> >>>
> >>> That being said, though, I do wonder a little bit about the value of
> >>> enumerating the top-level compatible like this in a yaml file.
> >>> Certainly the yaml schema validation in general can be quite useful,
> >>> but this top-level listing seems pure overhead. I guess it makes some
> >>> tools happy, but other than that it seems to provide very little
> >>> value...
> >>
> >> If compatible is not part of ABI, it is allowed to change in whatever
> >> shape one wishes. In such case, how can anyone (e.g. user-space)
> >> identify the board? Model name? Also not part of ABI (not documented)...
> >
> > Hmm, it is a good question. I guess one issue is that the way
> > Chromebooks interact with the bootloader it's not trivially easy to
> > enumerate what exactly the compatible will be in this hardcoded list.
> > It all has to do with the whole "revision" and "sku" scheme the
> > bootloader on ARM Chromebooks uses. For example, on one Chromebook I
> > have the bootloader prints out:
> >
> > Compat preference: google,lazor-rev5-sku6 google,lazor-rev5
> > google,lazor-sku6 google,lazor
> >
> > What that means is that:
> >
> > 1. The bootloader will first look for 'google,lazor-rev5-sku6'. If it
> > finds a dts with that compatible it will pick it.
> >
> > 2. The bootloader will then look for 'google,lazor-rev5'. If it finds
> > a dts with that compatible it will pick it.
> >
> > 3. The bootloader will then look for 'google,lazor-sku6'. If it finds
> > a dts with that compatible it will pick it.
> >
> > 4. Finally, the bootloader will look for 'google,lazor'.
> >
> > There's a method to the madness. Among other things, this allows
> > revving the board revision for a change to the board even if it
> > _should_ be invisible to software. The rule is always that the
> > "newest" device tree that's in Linux is always listed _without_ a
> > board revision number. An example might help.
> >
> > a) Assume '-rev5' is the newest revision available. In Linux this
> > would be the only dts that advertises "google,lazor" (with no -rev).
> > Previous dts file would advertise "-rev3" or "-rev4" or whatever.
> >
> > b) We need to spin the board for something that should be invisible to
> > software. Just in case, HW guys change the board strappings to -rev6.
> > This works _seamlessly_ because the newest dts file always advertises
> > just "google,lazor"
> >
> > c) We spin the board for something that's _not_ invisible. It will be
> > "-rev7". Now, we go back and add "-rev5" and "-rev6" to the old board
> > dts file and remove the advertisement for "google,lazor". We create a
> > new dts file for -rev7 that advertises "google,lazor".
>
> Except shuffling the compatibles in bindings, you are changing the
> meaning of final "google,lazor" compatible. The bootloader works as
> expected - from most specific (rev5-sku6) to most generic compatible
> (google,lazor) but why do you need to advertise the latest rev as
> "google,lazor"? Why the bootloader on latest rev (e.g. rev7) cannot bind
> to rev7 compatible?

The problem really comes along when a board strapped as -rev8 comes
along that is a board spin (and thus a new revision) but "should" be
invisible to software. Since it should be invisible to software we
want it to boot without any software changes. As per my previous mail,
sometimes HW guys make these changes without first consulting software
(since it's invisible to SW!) and we want to make sure that they're
still going to strap as "-rev8".

So what happens with this -rev8 board? The bootloader will check and
it won't see any device tree that advertises "google,lazor-rev8",
right? If _all_ lazor revisions all include the "google,lazor"
compatible then the bootloader won't have any way to know which to
pick. The bootloader _doesn't_ have the smarts to know that "-rev7" is
closest to "-rev8". It'll just randomly pick one of the "google,lazor"
boards. :( This is why we only advertise "google,lazor" for the newest
device tree.

Yes, I agree it's not beautiful but it's what we ended up with. I
don't think we want to compromise on the ability to boot new revisions
without software changes because that will just incentivize people to
not increment the board revision. The only other option would be to
make the bootloader smart enough to pick the "next revision down" but
so far they haven't been willing to do that.


I guess the question, though, is what action should be taken. I guess
options are:

1. Say that the above requirement that new "invisible" HW revs can
boot w/ no software changes is not a worthy requirement. Personally, I
wouldn't accept this option.

2. Ignore. Don't try to document top level compatible for these devices.

3. Document the compatible and accept that it's going to shuffle around a lot.

4. Try again to get the bootloader to match earlier revisions as fallbacks.


> > Now we can certainly argue back and forth above the above scheme and
> > how it's terrible and/or great, but it definitely works pretty well
> > and it's what we've been doing for a while now. Before that we used to
> > proactively add a whole bunch of "future" revisions "just in case".
> > That was definitely worse and had the same problem that we'd have to
> > shuffle compatibles. See, for instance `rk3288-veyron-jerry.dts`.
> >
> > One thing we _definitely_ don't want to do is to give HW _any_
> > incentive to make board spins _without_ changing the revision. HW
> > sometimes makes spins without first involving software and if it
> > doesn't boot because they updated the board ID then someone in China
> > will just put the old ID in and ship it off. That's bad.
> >
> > --
> >
> > But I guess this doesn't answer your question: how can userspace
> > identify what board this is running? I don't have an answer to that,
> > but I guess I'd say that the top-level "compatible" isn't really it.
>
> It can, the same as bootloader, by looking at the most specific
> compatible (rev7).
>
> > If nothing else, I think just from the definition it's not guaranteed
> > to be right, is it? From the spec: "Specifies a list of platform
> > architectures with which this platform is compatible." The key thing
> > is "a list". If this can be a list of things then how can you use it
> > to uniquely identify what one board you're on?
>
> The most specific compatible identifies or, like recently Rob confirmed
> in case of Renesas, the list of compatibles:
> https://lore.kernel.org/linux-devicetree/Yk2%[email protected]/

I'm confused. If the device tree contains the compatibles:

"google,lazor-rev4", "google,lazor-rev3", "google,lazor", "qualcomm,sc7180"

You want to know what board you're on and you look at the compatible,
right? You'll decide that you're on a "google,lazor-rev4" which is the
most specific compatible. ...but you could have booted a
"google,lazor-rev3". How do you know?

Further, imagine a case where two different HW manufacturers take some
reference design and each build hardware that's identical except for
what's plugged into PCIe / USB / eDP ports. We could have a single
device tree for these, right? So you could imagine a device tree with
compatibles these compatibles to support the imaginary CompUTown
CompUBox and the TheBestStuff BestBox computers:

"computown,compubox", "thebeststuff,bestbox"

Now you boot up. How do you know if you're on a CompUBox or a BestBox?


> > That all being said, I think that on Chrome OS the userspace tools
> > _do_ some amount of parsing of the compatible strings here. For
> > Chromebooks they leverage the fact that they understand the above
> > scheme and thus can figure things out. I think they also use things
> > like `/proc/device-tree/firmware/coreboot/board-id` and
> > `/proc/device-tree/firmware/coreboot/sku-id`. That doesn't seem to be
> > documented, though. :(
> >
> > I guess the question is, though, why do you need to know what board you're on?
>
> You might have (and I had in some previous job) single user-space
> application working on different HW and responding slightly differently,
> depending on the hardware it runs. Exactly the same as kernel behaves
> differently, depending on DTB. The differences for example might be in
> GPIOs or some other interfaces managed via user-space drivers, not in
> presence of devices. Another example are system tests behaving
> differently depending on the DUT, where again you run the tests in a
> generic way so the DUT is autodetected based on board.
>
> Of course you could say: different hardware, has different DTB, so
> user-space should check entire tree, to figure out how to operate that
> hardware. Yeah, that's one way, very complex and actually duplicating
> kernel's work. Embedded apps are specialized, so it is much easier for
> them to check board compatible and make assumptions on that.

I mean, it's fine if you want to make assumptions for the hardware
that you work on.

-Doug

2022-05-03 23:22:47

by Krzysztof Kozłowski

[permalink] [raw]
Subject: Re: [PATCH] CHROMIUM: arm64: dts: qcom: Add sc7180-gelarshie

On Tue, 19 Apr 2022 at 18:55, Doug Anderson <[email protected]> wrote:

> > Except shuffling the compatibles in bindings, you are changing the
> > meaning of final "google,lazor" compatible. The bootloader works as
> > expected - from most specific (rev5-sku6) to most generic compatible
> > (google,lazor) but why do you need to advertise the latest rev as
> > "google,lazor"? Why the bootloader on latest rev (e.g. rev7) cannot bind
> > to rev7 compatible?
>
> The problem really comes along when a board strapped as -rev8 comes
> along that is a board spin (and thus a new revision) but "should" be
> invisible to software. Since it should be invisible to software we
> want it to boot without any software changes. As per my previous mail,
> sometimes HW guys make these changes without first consulting software
> (since it's invisible to SW!) and we want to make sure that they're
> still going to strap as "-rev8".

If you want to boot it without any SW changes, do not change the SW.
Do not change the DTB. If you admit that you want to change DTB, so
the SW, sure, change it and accept the outcome - you have a new
compatible. This new compatible can be or might be not compatible with
rev7. Up to you.

>
> So what happens with this -rev8 board? The bootloader will check and
> it won't see any device tree that advertises "google,lazor-rev8",
> right?

Your bootloader looks for a specific rev8, which is not compatible
with rev7 (or is it? I lost the point of your example), and you ship
it with a DTB which has rev7, but not rev8. You control both pieces -
bootloader and DTB. You cannot put incompatible pieces of firmware
(one behaving entirely different than other) and expect proper output.
This is why you also have bindings.

> If _all_ lazor revisions all include the "google,lazor"
> compatible then the bootloader won't have any way to know which to
> pick. The bootloader _doesn't_ have the smarts to know that "-rev7" is
> closest to "-rev8".

rev7 the next in the compatible list, isn't it? So bootloader picks up
the fallback...

> It'll just randomly pick one of the "google,lazor"
> boards. :( This is why we only advertise "google,lazor" for the newest
> device tree.
>
> Yes, I agree it's not beautiful but it's what we ended up with. I
> don't think we want to compromise on the ability to boot new revisions
> without software changes because that will just incentivize people to
> not increment the board revision. The only other option would be to
> make the bootloader smart enough to pick the "next revision down" but
> so far they haven't been willing to do that.

Just choose the fallback and follow Devicetree spec...

> I guess the question, though, is what action should be taken. I guess
> options are:
>
> 1. Say that the above requirement that new "invisible" HW revs can
> boot w/ no software changes is not a worthy requirement. Personally, I
> wouldn't accept this option.
>
> 2. Ignore. Don't try to document top level compatible for these devices.
>
> 3. Document the compatible and accept that it's going to shuffle around a lot.
>
> 4. Try again to get the bootloader to match earlier revisions as fallbacks.
>
>
> > > Now we can certainly argue back and forth above the above scheme and
> > > how it's terrible and/or great, but it definitely works pretty well
> > > and it's what we've been doing for a while now. Before that we used to
> > > proactively add a whole bunch of "future" revisions "just in case".
> > > That was definitely worse and had the same problem that we'd have to
> > > shuffle compatibles. See, for instance `rk3288-veyron-jerry.dts`.
> > >
> > > One thing we _definitely_ don't want to do is to give HW _any_
> > > incentive to make board spins _without_ changing the revision. HW
> > > sometimes makes spins without first involving software and if it
> > > doesn't boot because they updated the board ID then someone in China
> > > will just put the old ID in and ship it off. That's bad.
> > >
> > > --
> > >
> > > But I guess this doesn't answer your question: how can userspace
> > > identify what board this is running? I don't have an answer to that,
> > > but I guess I'd say that the top-level "compatible" isn't really it.
> >
> > It can, the same as bootloader, by looking at the most specific
> > compatible (rev7).
> >
> > > If nothing else, I think just from the definition it's not guaranteed
> > > to be right, is it? From the spec: "Specifies a list of platform
> > > architectures with which this platform is compatible." The key thing
> > > is "a list". If this can be a list of things then how can you use it
> > > to uniquely identify what one board you're on?
> >
> > The most specific compatible identifies or, like recently Rob confirmed
> > in case of Renesas, the list of compatibles:
> > https://lore.kernel.org/linux-devicetree/Yk2%[email protected]/
>
> I'm confused. If the device tree contains the compatibles:
>
> "google,lazor-rev4", "google,lazor-rev3", "google,lazor", "qualcomm,sc7180"
>
> You want to know what board you're on and you look at the compatible,
> right? You'll decide that you're on a "google,lazor-rev4" which is the
> most specific compatible. ...but you could have booted a
> "google,lazor-rev3". How do you know?

Applying the wrong DTB on the wrong device will always give you the
wrong answer. You can try too boot google,lazor-rev3 on x86 PC and it
does not make it a google,lazor-rev3...

Best regards,
Krzysztof

2022-05-03 23:46:02

by Doug Anderson

[permalink] [raw]
Subject: Re: [PATCH] CHROMIUM: arm64: dts: qcom: Add sc7180-gelarshie

Hi,

On Tue, May 3, 2022 at 8:54 AM Krzysztof Kozłowski
<[email protected]> wrote:
>
> On Tue, 19 Apr 2022 at 18:55, Doug Anderson <[email protected]> wrote:
>
> > > Except shuffling the compatibles in bindings, you are changing the
> > > meaning of final "google,lazor" compatible. The bootloader works as
> > > expected - from most specific (rev5-sku6) to most generic compatible
> > > (google,lazor) but why do you need to advertise the latest rev as
> > > "google,lazor"? Why the bootloader on latest rev (e.g. rev7) cannot bind
> > > to rev7 compatible?
> >
> > The problem really comes along when a board strapped as -rev8 comes
> > along that is a board spin (and thus a new revision) but "should" be
> > invisible to software. Since it should be invisible to software we
> > want it to boot without any software changes. As per my previous mail,
> > sometimes HW guys make these changes without first consulting software
> > (since it's invisible to SW!) and we want to make sure that they're
> > still going to strap as "-rev8".
>
> If you want to boot it without any SW changes, do not change the SW.
> Do not change the DTB. If you admit that you want to change DTB, so
> the SW, sure, change it and accept the outcome - you have a new
> compatible. This new compatible can be or might be not compatible with
> rev7. Up to you.
>
> >
> > So what happens with this -rev8 board? The bootloader will check and
> > it won't see any device tree that advertises "google,lazor-rev8",
> > right?
>
> Your bootloader looks for a specific rev8, which is not compatible
> with rev7 (or is it? I lost the point of your example)

Actually the whole point is that _we don't know_ if -rev7 and -rev8
are compatible.

Think of it this way. You've got component A on your board and you
power it up with 1.8 V. We run out of component A and we decide to
replace it with component B. The vendor promises that component B is a
drop-in replacement for component A. You boot up a few devices with
component B and everything looks good. You build a whole lot of
products.

Sometime down the line you start getting failure reports. It turns out
that products that have component B are sporadically failing in the
field. After talking to the vendor, they suggest that we need to power
component B with 1.85 V instead of 1.80 V. Luckily we can adjust the
voltage with the PMIC, but component A's vendor doesn't want you to
bump the voltage up to 1.85V.

Even though we originally thought that the two boards were 100%
compatible, it later turns out that they're not.

So as a general principle, if we make big changes to a product we
increment the board revision strappings even if we think it's
invisible to software. This can help us get out of sticky situations
in the future.


> and you ship
> it with a DTB which has rev7, but not rev8. You control both pieces -
> bootloader and DTB. You cannot put incompatible pieces of firmware
> (one behaving entirely different than other) and expect proper output.
> This is why you also have bindings.

...and by "you" in "*you* control both pieces" you mean some
collection of people spread across several companies and several
countries and who don't always communicate well with each other. If
they believe that a change should be invisible to software, folks
building the hardware in China don't always send me a heads up in
California, but I still want them to bump the revision number just in
case they messed up and we do need a software change down the road.


> > If _all_ lazor revisions all include the "google,lazor"
> > compatible then the bootloader won't have any way to know which to
> > pick. The bootloader _doesn't_ have the smarts to know that "-rev7" is
> > closest to "-rev8".
>
> rev7 the next in the compatible list, isn't it? So bootloader picks up
> the fallback...

No. The bootloader works like this (just looking at the revision
strappings and ignoring the SKU strappings):

1. Read board strappings and get and ID (like "8")

2. Look for "google,lazor-rev8".

3. If it's not there, look for "google,lazor"

4. If it's not there then that's bad.

...so "-rev7" is _not_ in the compatible list for "-rev8".


> > It'll just randomly pick one of the "google,lazor"
> > boards. :( This is why we only advertise "google,lazor" for the newest
> > device tree.
> >
> > Yes, I agree it's not beautiful but it's what we ended up with. I
> > don't think we want to compromise on the ability to boot new revisions
> > without software changes because that will just incentivize people to
> > not increment the board revision. The only other option would be to
> > make the bootloader smart enough to pick the "next revision down" but
> > so far they haven't been willing to do that.
>
> Just choose the fallback and follow Devicetree spec...

It does choose the fallback and follow the devicetree spec, but the
bootloader doesn't have rules to consider "-rev7" as a fallback for
"-rev8".


> > I guess the question, though, is what action should be taken. I guess
> > options are:
> >
> > 1. Say that the above requirement that new "invisible" HW revs can
> > boot w/ no software changes is not a worthy requirement. Personally, I
> > wouldn't accept this option.
> >
> > 2. Ignore. Don't try to document top level compatible for these devices.
> >
> > 3. Document the compatible and accept that it's going to shuffle around a lot.
> >
> > 4. Try again to get the bootloader to match earlier revisions as fallbacks.
> >
> >
> > > > Now we can certainly argue back and forth above the above scheme and
> > > > how it's terrible and/or great, but it definitely works pretty well
> > > > and it's what we've been doing for a while now. Before that we used to
> > > > proactively add a whole bunch of "future" revisions "just in case".
> > > > That was definitely worse and had the same problem that we'd have to
> > > > shuffle compatibles. See, for instance `rk3288-veyron-jerry.dts`.
> > > >
> > > > One thing we _definitely_ don't want to do is to give HW _any_
> > > > incentive to make board spins _without_ changing the revision. HW
> > > > sometimes makes spins without first involving software and if it
> > > > doesn't boot because they updated the board ID then someone in China
> > > > will just put the old ID in and ship it off. That's bad.
> > > >
> > > > --
> > > >
> > > > But I guess this doesn't answer your question: how can userspace
> > > > identify what board this is running? I don't have an answer to that,
> > > > but I guess I'd say that the top-level "compatible" isn't really it.
> > >
> > > It can, the same as bootloader, by looking at the most specific
> > > compatible (rev7).
> > >
> > > > If nothing else, I think just from the definition it's not guaranteed
> > > > to be right, is it? From the spec: "Specifies a list of platform
> > > > architectures with which this platform is compatible." The key thing
> > > > is "a list". If this can be a list of things then how can you use it
> > > > to uniquely identify what one board you're on?
> > >
> > > The most specific compatible identifies or, like recently Rob confirmed
> > > in case of Renesas, the list of compatibles:
> > > https://lore.kernel.org/linux-devicetree/Yk2%[email protected]/
> >
> > I'm confused. If the device tree contains the compatibles:
> >
> > "google,lazor-rev4", "google,lazor-rev3", "google,lazor", "qualcomm,sc7180"
> >
> > You want to know what board you're on and you look at the compatible,
> > right? You'll decide that you're on a "google,lazor-rev4" which is the
> > most specific compatible. ...but you could have booted a
> > "google,lazor-rev3". How do you know?
>
> Applying the wrong DTB on the wrong device will always give you the
> wrong answer. You can try too boot google,lazor-rev3 on x86 PC and it
> does not make it a google,lazor-rev3...

I don't understand what you're saying here. If a device tree has the compatible:

"google,lazor-rev4", "google,lazor-rev3", "google,lazor", "qualcomm,sc7180"

You wouldn't expect to boot it on an x86 PC, but you would expect to
boot it on either a "google,lazor-rev4" _or_ a "google,lazor-rev3".
Correct? Now, after we've booted software wants to look at the
compatible of the device tree that was booted. The most specific entry
in that device tree is "google,lazor-rev4". ...but we could have
booted it on a "google,lazor-rev3". How can you know?

-Doug

2022-05-04 13:07:01

by Krzysztof Kozlowski

[permalink] [raw]
Subject: Re: [PATCH] CHROMIUM: arm64: dts: qcom: Add sc7180-gelarshie

On 03/05/2022 18:13, Doug Anderson wrote:
> Hi,
>
> On Tue, May 3, 2022 at 8:54 AM Krzysztof Kozłowski
> <[email protected]> wrote:
>>
>> On Tue, 19 Apr 2022 at 18:55, Doug Anderson <[email protected]> wrote:
>>
>>>> Except shuffling the compatibles in bindings, you are changing the
>>>> meaning of final "google,lazor" compatible. The bootloader works as
>>>> expected - from most specific (rev5-sku6) to most generic compatible
>>>> (google,lazor) but why do you need to advertise the latest rev as
>>>> "google,lazor"? Why the bootloader on latest rev (e.g. rev7) cannot bind
>>>> to rev7 compatible?
>>>
>>> The problem really comes along when a board strapped as -rev8 comes
>>> along that is a board spin (and thus a new revision) but "should" be
>>> invisible to software. Since it should be invisible to software we
>>> want it to boot without any software changes. As per my previous mail,
>>> sometimes HW guys make these changes without first consulting software
>>> (since it's invisible to SW!) and we want to make sure that they're
>>> still going to strap as "-rev8".
>>
>> If you want to boot it without any SW changes, do not change the SW.
>> Do not change the DTB. If you admit that you want to change DTB, so
>> the SW, sure, change it and accept the outcome - you have a new
>> compatible. This new compatible can be or might be not compatible with
>> rev7. Up to you.
>>
>>>
>>> So what happens with this -rev8 board? The bootloader will check and
>>> it won't see any device tree that advertises "google,lazor-rev8",
>>> right?
>>
>> Your bootloader looks for a specific rev8, which is not compatible
>> with rev7 (or is it? I lost the point of your example)
>
> Actually the whole point is that _we don't know_ if -rev7 and -rev8
> are compatible.
>
> Think of it this way. You've got component A on your board and you
> power it up with 1.8 V. We run out of component A and we decide to
> replace it with component B. The vendor promises that component B is a
> drop-in replacement for component A. You boot up a few devices with
> component B and everything looks good. You build a whole lot of
> products.
>
> Sometime down the line you start getting failure reports. It turns out
> that products that have component B are sporadically failing in the
> field. After talking to the vendor, they suggest that we need to power
> component B with 1.85 V instead of 1.80 V. Luckily we can adjust the
> voltage with the PMIC, but component A's vendor doesn't want you to
> bump the voltage up to 1.85V.
>
> Even though we originally thought that the two boards were 100%
> compatible, it later turns out that they're not.
>
> So as a general principle, if we make big changes to a product we
> increment the board revision strappings even if we think it's
> invisible to software. This can help us get out of sticky situations
> in the future.

Then assume boards are not really compatible, bump rev to rev8 and ship
it. Bootloader will know it is rev8 and use it.

>
>
>> and you ship
>> it with a DTB which has rev7, but not rev8. You control both pieces -
>> bootloader and DTB. You cannot put incompatible pieces of firmware
>> (one behaving entirely different than other) and expect proper output.
>> This is why you also have bindings.
>
> ...and by "you" in "*you* control both pieces" you mean some
> collection of people spread across several companies and several
> countries and who don't always communicate well with each other. If
> they believe that a change should be invisible to software, folks
> building the hardware in China don't always send me a heads up in
> California, but I still want them to bump the revision number just in
> case they messed up and we do need a software change down the road.
>
>
>>> If _all_ lazor revisions all include the "google,lazor"
>>> compatible then the bootloader won't have any way to know which to
>>> pick. The bootloader _doesn't_ have the smarts to know that "-rev7" is
>>> closest to "-rev8".
>>
>> rev7 the next in the compatible list, isn't it? So bootloader picks up
>> the fallback...
>
> No. The bootloader works like this (just looking at the revision
> strappings and ignoring the SKU strappings):
>
> 1. Read board strappings and get and ID (like "8")
>
> 2. Look for "google,lazor-rev8".
>
> 3. If it's not there, look for "google,lazor"
>
> 4. If it's not there then that's bad.
>
> ...so "-rev7" is _not_ in the compatible list for "-rev8".

Everything looks fine then. You have a rev8 board, which is not
compatible with rev7, and bootloader looks for rev8. Finds it (since it
is physically there!), loads it.

You have a rev7 board so bootloader looks for rev7, finds it and loads it.

>
>
>>> It'll just randomly pick one of the "google,lazor"
>>> boards. :( This is why we only advertise "google,lazor" for the newest
>>> device tree.
>>>
>>> Yes, I agree it's not beautiful but it's what we ended up with. I
>>> don't think we want to compromise on the ability to boot new revisions
>>> without software changes because that will just incentivize people to
>>> not increment the board revision. The only other option would be to
>>> make the bootloader smart enough to pick the "next revision down" but
>>> so far they haven't been willing to do that.
>>
>> Just choose the fallback and follow Devicetree spec...
>
> It does choose the fallback and follow the devicetree spec, but the
> bootloader doesn't have rules to consider "-rev7" as a fallback for
> "-rev8".

Sure, let's skip fallbacks and assume everything is not compatible with
else.

>
>
>>> I guess the question, though, is what action should be taken. I guess
>>> options are:
>>>
>>> 1. Say that the above requirement that new "invisible" HW revs can
>>> boot w/ no software changes is not a worthy requirement. Personally, I
>>> wouldn't accept this option.
>>>
>>> 2. Ignore. Don't try to document top level compatible for these devices.
>>>
>>> 3. Document the compatible and accept that it's going to shuffle around a lot.
>>>
>>> 4. Try again to get the bootloader to match earlier revisions as fallbacks.
>>>
>>>
>>>>> Now we can certainly argue back and forth above the above scheme and
>>>>> how it's terrible and/or great, but it definitely works pretty well
>>>>> and it's what we've been doing for a while now. Before that we used to
>>>>> proactively add a whole bunch of "future" revisions "just in case".
>>>>> That was definitely worse and had the same problem that we'd have to
>>>>> shuffle compatibles. See, for instance `rk3288-veyron-jerry.dts`.
>>>>>
>>>>> One thing we _definitely_ don't want to do is to give HW _any_
>>>>> incentive to make board spins _without_ changing the revision. HW
>>>>> sometimes makes spins without first involving software and if it
>>>>> doesn't boot because they updated the board ID then someone in China
>>>>> will just put the old ID in and ship it off. That's bad.
>>>>>
>>>>> --
>>>>>
>>>>> But I guess this doesn't answer your question: how can userspace
>>>>> identify what board this is running? I don't have an answer to that,
>>>>> but I guess I'd say that the top-level "compatible" isn't really it.
>>>>
>>>> It can, the same as bootloader, by looking at the most specific
>>>> compatible (rev7).
>>>>
>>>>> If nothing else, I think just from the definition it's not guaranteed
>>>>> to be right, is it? From the spec: "Specifies a list of platform
>>>>> architectures with which this platform is compatible." The key thing
>>>>> is "a list". If this can be a list of things then how can you use it
>>>>> to uniquely identify what one board you're on?
>>>>
>>>> The most specific compatible identifies or, like recently Rob confirmed
>>>> in case of Renesas, the list of compatibles:
>>>> https://lore.kernel.org/linux-devicetree/Yk2%[email protected]/
>>>
>>> I'm confused. If the device tree contains the compatibles:
>>>
>>> "google,lazor-rev4", "google,lazor-rev3", "google,lazor", "qualcomm,sc7180"
>>>
>>> You want to know what board you're on and you look at the compatible,
>>> right? You'll decide that you're on a "google,lazor-rev4" which is the
>>> most specific compatible. ...but you could have booted a
>>> "google,lazor-rev3". How do you know?
>>
>> Applying the wrong DTB on the wrong device will always give you the
>> wrong answer. You can try too boot google,lazor-rev3 on x86 PC and it
>> does not make it a google,lazor-rev3...
>
> I don't understand what you're saying here. If a device tree has the compatible:
>
> "google,lazor-rev4", "google,lazor-rev3", "google,lazor", "qualcomm,sc7180"
>
> You wouldn't expect to boot it on an x86 PC, but you would expect to
> boot it on either a "google,lazor-rev4" _or_ a "google,lazor-rev3".

Yes, but booting it does not mean that the hardware is rev3 or rev4.
Booting it means only that we are running DTB on a compatible hardware.
The DTB determines what is accessible to user-space, not what *really*
the hardware is. The user-space (since we are going now to original
question) reads it and can understand that it is running on hardware
compatible with rev3 - either rev3 or rev4 - and act accordingly.

> Correct? Now, after we've booted software wants to look at the
> compatible of the device tree that was booted. The most specific entry
> in that device tree is "google,lazor-rev4". ...but we could have
> booted it on a "google,lazor-rev3". How can you know?

No, providing and loading a rev4 DTB on a rev3 board is not correct and
does not make any sense. rev3 boards are not compatible with rev4, it's
the other way. Not every fruit is an apple, but every apple is a fruit.
This is why I used that example - if you load rev4 DTB on rev3 hardware
then you have totally wrong booting process.


Best regards,
Krzysztof

2022-05-08 19:11:15

by Doug Anderson

[permalink] [raw]
Subject: Re: [PATCH] CHROMIUM: arm64: dts: qcom: Add sc7180-gelarshie

Hi,

On Wed, May 4, 2022 at 12:04 AM Krzysztof Kozlowski
<[email protected]> wrote:
>
> >>>> The most specific compatible identifies or, like recently Rob confirmed
> >>>> in case of Renesas, the list of compatibles:
> >>>> https://lore.kernel.org/linux-devicetree/Yk2%[email protected]/
> >>>
> >>> I'm confused. If the device tree contains the compatibles:
> >>>
> >>> "google,lazor-rev4", "google,lazor-rev3", "google,lazor", "qualcomm,sc7180"
> >>>
> >>> You want to know what board you're on and you look at the compatible,
> >>> right? You'll decide that you're on a "google,lazor-rev4" which is the
> >>> most specific compatible. ...but you could have booted a
> >>> "google,lazor-rev3". How do you know?
> >>
> >> Applying the wrong DTB on the wrong device will always give you the
> >> wrong answer. You can try too boot google,lazor-rev3 on x86 PC and it
> >> does not make it a google,lazor-rev3...
> >
> > I don't understand what you're saying here. If a device tree has the compatible:
> >
> > "google,lazor-rev4", "google,lazor-rev3", "google,lazor", "qualcomm,sc7180"
> >
> > You wouldn't expect to boot it on an x86 PC, but you would expect to
> > boot it on either a "google,lazor-rev4" _or_ a "google,lazor-rev3".
>
> Yes, but booting it does not mean that the hardware is rev3 or rev4.
> Booting it means only that we are running DTB on a compatible hardware.
> The DTB determines what is accessible to user-space, not what *really*
> the hardware is. The user-space (since we are going now to original
> question) reads it and can understand that it is running on hardware
> compatible with rev3 - either rev3 or rev4 - and act accordingly.
>
> > Correct? Now, after we've booted software wants to look at the
> > compatible of the device tree that was booted. The most specific entry
> > in that device tree is "google,lazor-rev4". ...but we could have
> > booted it on a "google,lazor-rev3". How can you know?
>
> No, providing and loading a rev4 DTB on a rev3 board is not correct and
> does not make any sense. rev3 boards are not compatible with rev4, it's
> the other way. Not every fruit is an apple, but every apple is a fruit.
> This is why I used that example - if you load rev4 DTB on rev3 hardware
> then you have totally wrong booting process.

I think this is the crux of the difference in opinion and there's no
reasonable way I'm aware of to do what you're asking. If -rev3 and
-rev4 are identical from a software point of view it would be silly
not to share a device tree for the two of them. The number of device
trees we'd have to land in the kernel tree would be multiplied by
several times and we'd have many that are identical except for this
compatible string. I see no benefit here and lots of downside.


-Doug

2022-05-09 11:28:04

by Julius Werner

[permalink] [raw]
Subject: Re: [PATCH] CHROMIUM: arm64: dts: qcom: Add sc7180-gelarshie

Trying to chime in here from the firmware development side of this
issue to help clarify what Doug is asking for. We have the fundamental
problem that we have several different board revisions that we _think_
should be 100% compatible for kernel and userspace, so for various
practical reasons (easier to maintain in the source, limiting kernel
image size by not having to bundle the same DTB multiple times under a
different name), we want to use the same DTB for all of them. Just
saying "well, you should treat each revision as incompatible to all
the others from the start then" doesn't scale (we have a lot of
revisions, and in the vast majority of cases they are just as
compatible as we initially expect them to be).

However, we also can't just exhaustively enumerate all revision
numbers that are compatible with this DTB, because we don't know in
advance how many we'll build. For again various practical reasons
(bundling, signing, testing), it would cost a lot of extra effort and
friction to rebuild a new kernel image just to add a new compatible
string to the list every time the factory updates the revision number.
An earlier hacky solution we had for this is to just define a bunch of
extra revision compatible strings in advance even though they don't
exist yet (e.g. you can still see this in
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/arch/arm/boot/dts/tegra124-nyan-blaze.dts
-- I believe there were only actually ever 3 or 4 Blaze revisions, the
other compatible strings defined there never existed as physical
boards). This is cumbersome and hacky and also doesn't really scale
(particularly when we need to add SKU as another dimension into the
string), so we needed to come up with something more manageable.

Our solution is to use "google,lazor" as the stand-in compatible
string for "all Lazor boards compatible with the latest revision".
When a compatibility break happens at some point in the Lazor series
(say between rev3 and rev4), we'll change the compatible string in the
old rev3 DTB to no longer include "google,lazor" but to instead list
all specific revision numbers ("google,lazor-rev0", ...,
"google-lazor-rev3") exhaustively (at this point we can do it, because
at this point we know we're not going to build any new boards
compatible with that old layout in the future). The new rev4 DTB will
then list "google,lazor" and again remain valid for all future
revisions we build, until the next compatibility break.

You are correct that this ends up changing the meaning of
"google,lazor" at some point, but it's really the only good way I can
see how to solve this within our practical constraints. I also don't
really see a practical concern with that, since our firmware matching
algorithm (and I believe this is the standard other bootloaders like
U-Boot also follow) will look for the more specific string with the
explicit revision number first, before falling back to the generic
one. So whenever we decide to change the meaning of the latter in the
kernel, we also make sure to provide matches for all the specific
revisions that previously used the generic match, so that they will
pick up those instead and there's no chance of them getting confused
by the change in meaning. While it is perhaps a bit unorthodox, I
think it is practical, safe, and I don't really see a practical
alternative for our use case.