2022-07-14 19:29:07

by Michal Suchánek

[permalink] [raw]
Subject: [PATCH 1/2] mtd: spi-nor: When a flash memory is missing do not report an error

It is normal that devices are designed with multiple types of storage,
and only some types of storage are present.

The kernel can handle this situation gracefully for many types of
storage devices such as mmc or ata but it reports and error when spi
flash is not present.

Only print a notice that the storage device is missing when no response
to the identify command is received.

Consider reply buffers with all bits set to the same value no response.

Signed-off-by: Michal Suchanek <[email protected]>
---
drivers/mtd/spi-nor/core.c | 25 +++++++++++++++++++++++--
1 file changed, 23 insertions(+), 2 deletions(-)

diff --git a/drivers/mtd/spi-nor/core.c b/drivers/mtd/spi-nor/core.c
index 502967c76c5f..6bab540171a4 100644
--- a/drivers/mtd/spi-nor/core.c
+++ b/drivers/mtd/spi-nor/core.c
@@ -1652,6 +1652,24 @@ static const struct flash_info *spi_nor_match_id(struct spi_nor *nor,
return NULL;
}

+static const bool buffer_uniform(const u8 *buffer, size_t length)
+{
+ bool all0;
+ size_t i;
+
+ for (all0 = true, i = 0; i < length; i++)
+ if (buffer[i] != 0) {
+ all0 = false;
+ break;
+ }
+ if (all0)
+ return true;
+ for (i = 0; i < length; i++)
+ if (buffer[i] != 0xff)
+ return false;
+ return true;
+}
+
static const struct flash_info *spi_nor_detect(struct spi_nor *nor)
{
const struct flash_info *info;
@@ -1666,8 +1684,11 @@ static const struct flash_info *spi_nor_detect(struct spi_nor *nor)

info = spi_nor_match_id(nor, id);
if (!info) {
- dev_err(nor->dev, "unrecognized JEDEC id bytes: %*ph\n",
- SPI_NOR_MAX_ID_LEN, id);
+ if (buffer_uniform(id, SPI_NOR_MAX_ID_LEN))
+ dev_info(nor->dev, "No flash memory detected.\n");
+ else
+ dev_err(nor->dev, "unrecognized JEDEC id bytes: %*ph\n",
+ SPI_NOR_MAX_ID_LEN, id);
return ERR_PTR(-ENODEV);
}
return info;
--
2.35.3


2022-07-14 20:01:41

by Michael Walle

[permalink] [raw]
Subject: Re: [PATCH 1/2] mtd: spi-nor: When a flash memory is missing do not report an error

Hi,

Am 2022-07-14 21:19, schrieb Michal Suchanek:
> It is normal that devices are designed with multiple types of storage,
> and only some types of storage are present.
>
> The kernel can handle this situation gracefully for many types of
> storage devices such as mmc or ata but it reports and error when spi
> flash is not present.
>
> Only print a notice that the storage device is missing when no response
> to the identify command is received.
>
> Consider reply buffers with all bits set to the same value no response.

I'm not sure you can compare SPI with ATA and MMC. I'm just speaking of
DT now, but there, for ATA and MMC you just describe the controller and
it will auto-detect the connected storage. Whereas with SPI you describe
both the controller and the flash. So I'd argue that your hardware
description is wrong if it describes a flash which is not present.

> Signed-off-by: Michal Suchanek <[email protected]>
> ---
> drivers/mtd/spi-nor/core.c | 25 +++++++++++++++++++++++--
> 1 file changed, 23 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/mtd/spi-nor/core.c b/drivers/mtd/spi-nor/core.c
> index 502967c76c5f..6bab540171a4 100644
> --- a/drivers/mtd/spi-nor/core.c
> +++ b/drivers/mtd/spi-nor/core.c
> @@ -1652,6 +1652,24 @@ static const struct flash_info
> *spi_nor_match_id(struct spi_nor *nor,
> return NULL;
> }
>
> +static const bool buffer_uniform(const u8 *buffer, size_t length)
> +{
> + bool all0;
> + size_t i;
> +
> + for (all0 = true, i = 0; i < length; i++)
> + if (buffer[i] != 0) {
> + all0 = false;
> + break;
> + }
> + if (all0)
> + return true;
> + for (i = 0; i < length; i++)
> + if (buffer[i] != 0xff)
> + return false;
> + return true;
> +}

That seems unnecessarily complex.
if (!memchr_inv(id, '\x00', SPI_NOR_MAX_ID_LEN) ||
!memchr_inv(id, '\xff', SPI_NOR_MAX_ID_LEN))

should be the same.

-michael

> +
> static const struct flash_info *spi_nor_detect(struct spi_nor *nor)
> {
> const struct flash_info *info;
> @@ -1666,8 +1684,11 @@ static const struct flash_info
> *spi_nor_detect(struct spi_nor *nor)
>
> info = spi_nor_match_id(nor, id);
> if (!info) {
> - dev_err(nor->dev, "unrecognized JEDEC id bytes: %*ph\n",
> - SPI_NOR_MAX_ID_LEN, id);
> + if (buffer_uniform(id, SPI_NOR_MAX_ID_LEN))
> + dev_info(nor->dev, "No flash memory detected.\n");
> + else
> + dev_err(nor->dev, "unrecognized JEDEC id bytes: %*ph\n",
> + SPI_NOR_MAX_ID_LEN, id);
> return ERR_PTR(-ENODEV);
> }
> return info;

2022-07-14 20:05:09

by Michal Suchánek

[permalink] [raw]
Subject: [PATCH resend 2/2] ARM: dts: sunxi: Enable optional SPI flash on Orange Pi Zero board

The flash is present on all new boards and users went out of their way
to add it on the old ones.

Enabling it makes a more reasonable default.

Signed-off-by: Michal Suchanek <[email protected]>
---
Link: https://lore.kernel.org/all/[email protected]/
arch/arm/boot/dts/sun8i-h2-plus-orangepi-zero.dts | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/arm/boot/dts/sun8i-h2-plus-orangepi-zero.dts b/arch/arm/boot/dts/sun8i-h2-plus-orangepi-zero.dts
index 3706216ffb40..4f782b1bdf27 100644
--- a/arch/arm/boot/dts/sun8i-h2-plus-orangepi-zero.dts
+++ b/arch/arm/boot/dts/sun8i-h2-plus-orangepi-zero.dts
@@ -163,8 +163,8 @@ &ohci1 {
};

&spi0 {
- /* Disable SPI NOR by default: it optional on Orange Pi Zero boards */
- status = "disabled";
+ /* Enable optional SPI NOR by default */
+ status = "okay";

flash@0 {
#address-cells = <1>;
--
2.35.3

2022-07-14 21:09:47

by Michal Suchánek

[permalink] [raw]
Subject: Re: [PATCH 1/2] mtd: spi-nor: When a flash memory is missing do not report an error

On Thu, Jul 14, 2022 at 09:41:48PM +0200, Michael Walle wrote:
> Hi,
>
> Am 2022-07-14 21:19, schrieb Michal Suchanek:
> > It is normal that devices are designed with multiple types of storage,
> > and only some types of storage are present.
> >
> > The kernel can handle this situation gracefully for many types of
> > storage devices such as mmc or ata but it reports and error when spi
> > flash is not present.
> >
> > Only print a notice that the storage device is missing when no response
> > to the identify command is received.
> >
> > Consider reply buffers with all bits set to the same value no response.
>
> I'm not sure you can compare SPI with ATA and MMC. I'm just speaking of
> DT now, but there, for ATA and MMC you just describe the controller and
> it will auto-detect the connected storage. Whereas with SPI you describe

Why does mmc assume storage and SDIO must be descibed? Why the special
casing?

> both the controller and the flash. So I'd argue that your hardware
> description is wrong if it describes a flash which is not present.

At any rate the situation is the same - the storage may be present
sometimes. I don't think assuming some kind of device by defualt is a
sound practice.

However, when the board is designed for a specific kind of device which
is not always present, and the kernel can detect the device, it is
perfectly fine to describe it.

The alternative is to not use the device at all, even when present,
which is kind of useless.

Thanks

Michal

>
> > Signed-off-by: Michal Suchanek <[email protected]>
> > ---
> > drivers/mtd/spi-nor/core.c | 25 +++++++++++++++++++++++--
> > 1 file changed, 23 insertions(+), 2 deletions(-)
> >
> > diff --git a/drivers/mtd/spi-nor/core.c b/drivers/mtd/spi-nor/core.c
> > index 502967c76c5f..6bab540171a4 100644
> > --- a/drivers/mtd/spi-nor/core.c
> > +++ b/drivers/mtd/spi-nor/core.c
> > @@ -1652,6 +1652,24 @@ static const struct flash_info
> > *spi_nor_match_id(struct spi_nor *nor,
> > return NULL;
> > }
> >
> > +static const bool buffer_uniform(const u8 *buffer, size_t length)
> > +{
> > + bool all0;
> > + size_t i;
> > +
> > + for (all0 = true, i = 0; i < length; i++)
> > + if (buffer[i] != 0) {
> > + all0 = false;
> > + break;
> > + }
> > + if (all0)
> > + return true;
> > + for (i = 0; i < length; i++)
> > + if (buffer[i] != 0xff)
> > + return false;
> > + return true;
> > +}
>
> That seems unnecessarily complex.
> if (!memchr_inv(id, '\x00', SPI_NOR_MAX_ID_LEN) ||
> !memchr_inv(id, '\xff', SPI_NOR_MAX_ID_LEN))
>
> should be the same.
>
> -michael
>
> > +
> > static const struct flash_info *spi_nor_detect(struct spi_nor *nor)
> > {
> > const struct flash_info *info;
> > @@ -1666,8 +1684,11 @@ static const struct flash_info
> > *spi_nor_detect(struct spi_nor *nor)
> >
> > info = spi_nor_match_id(nor, id);
> > if (!info) {
> > - dev_err(nor->dev, "unrecognized JEDEC id bytes: %*ph\n",
> > - SPI_NOR_MAX_ID_LEN, id);
> > + if (buffer_uniform(id, SPI_NOR_MAX_ID_LEN))
> > + dev_info(nor->dev, "No flash memory detected.\n");
> > + else
> > + dev_err(nor->dev, "unrecognized JEDEC id bytes: %*ph\n",
> > + SPI_NOR_MAX_ID_LEN, id);
> > return ERR_PTR(-ENODEV);
> > }
> > return info;

2022-07-14 21:53:56

by Michael Walle

[permalink] [raw]
Subject: Re: [PATCH 1/2] mtd: spi-nor: When a flash memory is missing do not report an error

Am 2022-07-14 22:55, schrieb Michal Suchánek:
> On Thu, Jul 14, 2022 at 09:41:48PM +0200, Michael Walle wrote:
>> Hi,
>>
>> Am 2022-07-14 21:19, schrieb Michal Suchanek:
>> > It is normal that devices are designed with multiple types of storage,
>> > and only some types of storage are present.
>> >
>> > The kernel can handle this situation gracefully for many types of
>> > storage devices such as mmc or ata but it reports and error when spi
>> > flash is not present.
>> >
>> > Only print a notice that the storage device is missing when no response
>> > to the identify command is received.
>> >
>> > Consider reply buffers with all bits set to the same value no response.
>>
>> I'm not sure you can compare SPI with ATA and MMC. I'm just speaking
>> of
>> DT now, but there, for ATA and MMC you just describe the controller
>> and
>> it will auto-detect the connected storage. Whereas with SPI you
>> describe
>
> Why does mmc assume storage and SDIO must be descibed? Why the special
> casing?

I can't follow you here. My SDIO wireless card just works in an SD
slot and doesn't have to be described.

>> both the controller and the flash. So I'd argue that your hardware
>> description is wrong if it describes a flash which is not present.
>
> At any rate the situation is the same - the storage may be present
> sometimes. I don't think assuming some kind of device by defualt is a
> sound practice.

Where is the assumption when the DT tells you there is a flash
on a specific chip select but actually there it isn't. Shouldn't
the DT then be fixed?

Maybe I don't understand your problem. What are you trying to
solve? I mean this just demotes an error to an info message.

> However, when the board is designed for a specific kind of device which
> is not always present, and the kernel can detect the device, it is
> perfectly fine to describe it.
>
> The alternative is to not use the device at all, even when present,
> which is kind of useless.

Or let the bootloader update your device tree and disable the device
if it's not there? Or load an overlay if it is there?

-michael

2022-07-14 23:06:32

by Michal Suchánek

[permalink] [raw]
Subject: Re: [PATCH 1/2] mtd: spi-nor: When a flash memory is missing do not report an error

On Thu, Jul 14, 2022 at 11:51:56PM +0200, Michael Walle wrote:
> Am 2022-07-14 22:55, schrieb Michal Such?nek:
> > On Thu, Jul 14, 2022 at 09:41:48PM +0200, Michael Walle wrote:
> > > Hi,
> > >
> > > Am 2022-07-14 21:19, schrieb Michal Suchanek:
> > > > It is normal that devices are designed with multiple types of storage,
> > > > and only some types of storage are present.
> > > >
> > > > The kernel can handle this situation gracefully for many types of
> > > > storage devices such as mmc or ata but it reports and error when spi
> > > > flash is not present.
> > > >
> > > > Only print a notice that the storage device is missing when no response
> > > > to the identify command is received.
> > > >
> > > > Consider reply buffers with all bits set to the same value no response.
> > >
> > > I'm not sure you can compare SPI with ATA and MMC. I'm just speaking
> > > of
> > > DT now, but there, for ATA and MMC you just describe the controller
> > > and
> > > it will auto-detect the connected storage. Whereas with SPI you
> > > describe
> >
> > Why does mmc assume storage and SDIO must be descibed? Why the special
> > casing?
>
> I can't follow you here. My SDIO wireless card just works in an SD
> slot and doesn't have to be described.
>
> > > both the controller and the flash. So I'd argue that your hardware
> > > description is wrong if it describes a flash which is not present.
> >
> > At any rate the situation is the same - the storage may be present
> > sometimes. I don't think assuming some kind of device by defualt is a
> > sound practice.
>
> Where is the assumption when the DT tells you there is a flash
> on a specific chip select but actually there it isn't. Shouldn't
> the DT then be fixed?

The DT says there isn't a flash on a specific chip select when there is.
Shouldn't that be fixed?

> Maybe I don't understand your problem. What are you trying to
> solve? I mean this just demotes an error to an info message.

Many boards provide multiple storage options - you get a PCB designed to
carry different kinds of storage, some may be socketed, some can be
soldered on in some production batches and not others.

The kernel can handle this for many kinds of storage but not SPI flash.

I don't see any reason why SPI flash should be a second class storage.

> > However, when the board is designed for a specific kind of device which
> > is not always present, and the kernel can detect the device, it is
> > perfectly fine to describe it.
> >
> > The alternative is to not use the device at all, even when present,
> > which is kind of useless.
>
> Or let the bootloader update your device tree and disable the device
> if it's not there?

But then it must be in the device tree?

And then people will complain that if the bootloader does not have this
feature then the kernel prints an error message?

> Or load an overlay if it is there?

Or maybe the kernel could just detect if the storage is present?

It's not like we don't have an identify command.

Thanks

Michal

2022-07-15 00:54:27

by kernel test robot

[permalink] [raw]
Subject: Re: [PATCH 1/2] mtd: spi-nor: When a flash memory is missing do not report an error

Hi Michal,

Thank you for the patch! Perhaps something to improve:

[auto build test WARNING on mtd/spi-nor/next]
[also build test WARNING on sunxi/sunxi/for-next krzk/for-next krzk-dt/for-next krzk-mem-ctrl/for-next linus/master v5.19-rc6 next-20220714]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch#_base_tree_information]

url: https://github.com/intel-lab-lkp/linux/commits/Michal-Suchanek/mtd-spi-nor-When-a-flash-memory-is-missing-do-not-report-an-error/20220715-032336
base: https://git.kernel.org/pub/scm/linux/kernel/git/mtd/linux.git spi-nor/next
config: x86_64-allyesconfig (https://download.01.org/0day-ci/archive/20220715/[email protected]/config)
compiler: gcc-11 (Debian 11.3.0-3) 11.3.0
reproduce (this is a W=1 build):
# https://github.com/intel-lab-lkp/linux/commit/d747dae258dd8d5220d38fae5fd09be703b0391b
git remote add linux-review https://github.com/intel-lab-lkp/linux
git fetch --no-tags linux-review Michal-Suchanek/mtd-spi-nor-When-a-flash-memory-is-missing-do-not-report-an-error/20220715-032336
git checkout d747dae258dd8d5220d38fae5fd09be703b0391b
# save the config file
mkdir build_dir && cp config build_dir/.config
make W=1 O=build_dir ARCH=x86_64 SHELL=/bin/bash drivers/mtd/spi-nor/

If you fix the issue, kindly add following tag where applicable
Reported-by: kernel test robot <[email protected]>

All warnings (new ones prefixed by >>):

>> drivers/mtd/spi-nor/core.c:1655:8: warning: type qualifiers ignored on function return type [-Wignored-qualifiers]
1655 | static const bool buffer_uniform(const u8 *buffer, size_t length)
| ^~~~~


vim +1655 drivers/mtd/spi-nor/core.c

1654
> 1655 static const bool buffer_uniform(const u8 *buffer, size_t length)
1656 {
1657 bool all0;
1658 size_t i;
1659
1660 for (all0 = true, i = 0; i < length; i++)
1661 if (buffer[i] != 0) {
1662 all0 = false;
1663 break;
1664 }
1665 if (all0)
1666 return true;
1667 for (i = 0; i < length; i++)
1668 if (buffer[i] != 0xff)
1669 return false;
1670 return true;
1671 }
1672

--
0-DAY CI Kernel Test Service
https://01.org/lkp

2022-07-15 09:27:56

by Pratyush Yadav

[permalink] [raw]
Subject: Re: [PATCH 1/2] mtd: spi-nor: When a flash memory is missing do not report an error

On 15/07/22 12:07AM, Michal Such?nek wrote:
> On Thu, Jul 14, 2022 at 11:51:56PM +0200, Michael Walle wrote:
> > Am 2022-07-14 22:55, schrieb Michal Such?nek:
> > > On Thu, Jul 14, 2022 at 09:41:48PM +0200, Michael Walle wrote:
> > > > Hi,
> > > >
> > > > Am 2022-07-14 21:19, schrieb Michal Suchanek:
> > > > > It is normal that devices are designed with multiple types of storage,
> > > > > and only some types of storage are present.
> > > > >
> > > > > The kernel can handle this situation gracefully for many types of
> > > > > storage devices such as mmc or ata but it reports and error when spi
> > > > > flash is not present.
> > > > >
> > > > > Only print a notice that the storage device is missing when no response
> > > > > to the identify command is received.
> > > > >
> > > > > Consider reply buffers with all bits set to the same value no response.
> > > >
> > > > I'm not sure you can compare SPI with ATA and MMC. I'm just speaking
> > > > of
> > > > DT now, but there, for ATA and MMC you just describe the controller
> > > > and
> > > > it will auto-detect the connected storage. Whereas with SPI you
> > > > describe
> > >
> > > Why does mmc assume storage and SDIO must be descibed? Why the special
> > > casing?
> >
> > I can't follow you here. My SDIO wireless card just works in an SD
> > slot and doesn't have to be described.
> >
> > > > both the controller and the flash. So I'd argue that your hardware
> > > > description is wrong if it describes a flash which is not present.
> > >
> > > At any rate the situation is the same - the storage may be present
> > > sometimes. I don't think assuming some kind of device by defualt is a
> > > sound practice.
> >
> > Where is the assumption when the DT tells you there is a flash
> > on a specific chip select but actually there it isn't. Shouldn't
> > the DT then be fixed?
>
> The DT says there isn't a flash on a specific chip select when there is.
> Shouldn't that be fixed?

If the board has a flash chip, then DT should describe it. The it does
not have one, then DT should not describe it.

So if DT says there isn't a flash on a specific CS when there is, then
DT should be fixed to describe the flash, and then we can probe it. You
both seem to be saying the same thing here, and I agree.

>
> > Maybe I don't understand your problem. What are you trying to
> > solve? I mean this just demotes an error to an info message.
>
> Many boards provide multiple storage options - you get a PCB designed to
> carry different kinds of storage, some may be socketed, some can be
> soldered on in some production batches and not others.

Usually for non-enumerable components you can plug in or out, you should
use DT overlays to add the correct nodes as needed. For example, CSI
cameras just plug into a slot on the board. So you can easily remove one
and add another. That is why we do not put the camera node in the board
dts, but instead apply it as an overlay from the bootloader.

>
> The kernel can handle this for many kinds of storage but not SPI flash.
>
> I don't see any reason why SPI flash should be a second class storage.
>
> > > However, when the board is designed for a specific kind of device which
> > > is not always present, and the kernel can detect the device, it is
> > > perfectly fine to describe it.
> > >
> > > The alternative is to not use the device at all, even when present,
> > > which is kind of useless.
> >
> > Or let the bootloader update your device tree and disable the device
> > if it's not there?
>
> But then it must be in the device tree?
>
> And then people will complain that if the bootloader does not have this
> feature then the kernel prints an error message?

Then add the feature to the bootloader? Adding a node in DT for a flash
that is not present is wrong.

>
> > Or load an overlay if it is there?
>
> Or maybe the kernel could just detect if the storage is present?

It can't. This is a non-enumerable bus, unlike USB or PCI. And there is
no way you can actually detect the presence or absence of a flash
reliably. For example, TI chips come with a flash that is capable of
8D-8D-8D mode. If the flash is handed to the kernel in 8D-8D-8D mode,
the read ID command will return all 0xff bytes since the kernel expacts
communication in 1S-1S-1S mode. With your patch you will say that no
flash memory is detected. In reality the flash is present but the kernel
just failed to properly detect it.

>
> It's not like we don't have an identify command.

We do, but it does not let us detect the _absence_ of a flash.

>
> Thanks
>
> Michal

--
Regards,
Pratyush Yadav
Texas Instruments Inc.

2022-07-15 12:25:42

by Andre Przywara

[permalink] [raw]
Subject: Re: [PATCH 1/2] mtd: spi-nor: When a flash memory is missing do not report an error

On Fri, 15 Jul 2022 00:07:44 +0200
Michal Suchánek <[email protected]> wrote:

Hi,

> On Thu, Jul 14, 2022 at 11:51:56PM +0200, Michael Walle wrote:
> > Am 2022-07-14 22:55, schrieb Michal Such�nek:
> > > On Thu, Jul 14, 2022 at 09:41:48PM +0200, Michael Walle wrote:
> > > > Hi,
> > > >
> > > > Am 2022-07-14 21:19, schrieb Michal Suchanek:
> > > > > It is normal that devices are designed with multiple types of storage,
> > > > > and only some types of storage are present.
> > > > >
> > > > > The kernel can handle this situation gracefully for many types of
> > > > > storage devices such as mmc or ata but it reports and error when spi
> > > > > flash is not present.
> > > > >
> > > > > Only print a notice that the storage device is missing when no response
> > > > > to the identify command is received.
> > > > >
> > > > > Consider reply buffers with all bits set to the same value no response.
> > > >
> > > > I'm not sure you can compare SPI with ATA and MMC. I'm just speaking
> > > > of
> > > > DT now, but there, for ATA and MMC you just describe the controller
> > > > and
> > > > it will auto-detect the connected storage. Whereas with SPI you
> > > > describe
> > >
> > > Why does mmc assume storage and SDIO must be descibed? Why the special
> > > casing?
> >
> > I can't follow you here. My SDIO wireless card just works in an SD
> > slot and doesn't have to be described.

I think the difference is that MMC (so also SDIO) is a discoverable bus,
whereas SPI is not.
It's conceptually dangerous to blindly probe for SPI chips, and the kernel
tries to stay out of guessing games, in general, and leaves that up to
firmware.

> > > > both the controller and the flash. So I'd argue that your hardware
> > > > description is wrong if it describes a flash which is not present.
> > >
> > > At any rate the situation is the same - the storage may be present
> > > sometimes. I don't think assuming some kind of device by defualt is a
> > > sound practice.
> >
> > Where is the assumption when the DT tells you there is a flash
> > on a specific chip select but actually there it isn't. Shouldn't
> > the DT then be fixed?
>
> The DT says there isn't a flash on a specific chip select when there is.
> Shouldn't that be fixed?
>
> > Maybe I don't understand your problem. What are you trying to
> > solve? I mean this just demotes an error to an info message.

The particular problem at hand is that on those cheap development boards
SPI flash is somewhat optional. The PCB often has the footprint for it, but
sometimes it is not populated, because the vendor wants to save pennies.

In this case (OrangePi Zero) there was no SPI chip soldered on the first
batches, but later boards are shipped with a flash chip. The footprint is
on every version, and I for instance soldered a chip on an early board.

> Many boards provide multiple storage options - you get a PCB designed to
> carry different kinds of storage, some may be socketed, some can be
> soldered on in some production batches and not others.
>
> The kernel can handle this for many kinds of storage but not SPI flash.
>
> I don't see any reason why SPI flash should be a second class storage.

See above, SPI is not discoverable, you need to know about the slave
devices.

> > > However, when the board is designed for a specific kind of device which
> > > is not always present, and the kernel can detect the device, it is
> > > perfectly fine to describe it.
> > >
> > > The alternative is to not use the device at all, even when present,
> > > which is kind of useless.
> >
> > Or let the bootloader update your device tree and disable the device
> > if it's not there?

Yes, this is what I was suggesting already: U-Boot can do the job, because
a U-Boot build is device specific, and we can take certain risks that the
generic and single-image kernel wants to avoid.
In this case we know that there is a SPI flash footprint, and it does no
harm in trying to check on CS0. So I was thinking about introducing a
U-Boot Kconfig variable to probe for and potentially disable the SPI flash
DT node. We would set this variable in defconfigs of boards with optional
SPI flash.

> But then it must be in the device tree?

However this indeed means that the SPI flash DT node must be in and enabled
in the DT, because we (try hard to) only use original Linux DT files, and
DTs must have been reviewed through the kernel ML first. The U-Boot driver
relies on the DT as well, so the official kernel DT copy would need to come
with that node enabled. Ideally U-Boot would disable it, if needed, and
the kernel error message would never appear.

> And then people will complain that if the bootloader does not have this
> feature then the kernel prints an error message?

This should not happen, if people follow the advice and use U-Boot's
device tree directly ($fdtcontroladdr) instead of loading some DTB from
somewhere. Then the U-Boot code (doing the check) and the DT (having
it enabled) should be in sync, and we don't see kernel error messages.

If it happens anyways (because people load some DTB), then it's a matter of
either "live with it" or "update your firmware".

Cheers,
Andre

> > Or load an overlay if it is there?
>
> Or maybe the kernel could just detect if the storage is present?
>
> It's not like we don't have an identify command.
>
> Thanks
>
> Michal
>

2022-07-16 02:47:53

by Samuel Holland

[permalink] [raw]
Subject: Re: [PATCH 1/2] mtd: spi-nor: When a flash memory is missing do not report an error

On 7/15/22 7:20 AM, Andre Przywara wrote:
>>>> However, when the board is designed for a specific kind of device which
>>>> is not always present, and the kernel can detect the device, it is
>>>> perfectly fine to describe it.
>>>>
>>>> The alternative is to not use the device at all, even when present,
>>>> which is kind of useless.
>>>
>>> Or let the bootloader update your device tree and disable the device
>>> if it's not there?
>
> Yes, this is what I was suggesting already: U-Boot can do the job, because
> a U-Boot build is device specific, and we can take certain risks that the
> generic and single-image kernel wants to avoid.
> In this case we know that there is a SPI flash footprint, and it does no
> harm in trying to check on CS0. So I was thinking about introducing a
> U-Boot Kconfig variable to probe for and potentially disable the SPI flash
> DT node. We would set this variable in defconfigs of boards with optional
> SPI flash.

To support the "does no harm" assertion: the Allwinner Boot ROM will probe for
NOR flash (and possibly SPI NAND) on SPI0 + CS0 if no bootable MMC device is
found. So the board designer must already assume that JEDEC Read ID commands
will be sent over that bus.

>> But then it must be in the device tree?
>
> However this indeed means that the SPI flash DT node must be in and enabled
> in the DT, because we (try hard to) only use original Linux DT files, and
> DTs must have been reviewed through the kernel ML first. The U-Boot driver
> relies on the DT as well, so the official kernel DT copy would need to come
> with that node enabled. Ideally U-Boot would disable it, if needed, and
> the kernel error message would never appear.

I don't think this works for newer SoCs where the Boot ROM supports both NOR and
SPI NAND. If the board is sold with the flash chip unpopulated, the user could
install either kind of chip. So we cannot statically assume which binding will
be used. We would need to add a node with the right compatible string after
probing for both in the bootloader.

Regards,
Samuel

2022-07-16 08:15:58

by Michal Suchánek

[permalink] [raw]
Subject: Re: [PATCH 1/2] mtd: spi-nor: When a flash memory is missing do not report an error

On Fri, Jul 15, 2022 at 01:20:06PM +0100, Andre Przywara wrote:
> On Fri, 15 Jul 2022 00:07:44 +0200
> Michal Such?nek <[email protected]> wrote:
>
> Hi,
>
> > On Thu, Jul 14, 2022 at 11:51:56PM +0200, Michael Walle wrote:
> > > Am 2022-07-14 22:55, schrieb Michal Such�nek:
> > > > On Thu, Jul 14, 2022 at 09:41:48PM +0200, Michael Walle wrote:
> > > > > Hi,
> > > > >
> > > > > Am 2022-07-14 21:19, schrieb Michal Suchanek:
> > > > > > It is normal that devices are designed with multiple types of storage,
> > > > > > and only some types of storage are present.
> > > > > >
> > > > > > The kernel can handle this situation gracefully for many types of
> > > > > > storage devices such as mmc or ata but it reports and error when spi
> > > > > > flash is not present.
> > > > > >
> > > > > > Only print a notice that the storage device is missing when no response
> > > > > > to the identify command is received.
> > > > > >
> > > > > > Consider reply buffers with all bits set to the same value no response.
> > > > >
> > > > > I'm not sure you can compare SPI with ATA and MMC. I'm just speaking
> > > > > of
> > > > > DT now, but there, for ATA and MMC you just describe the controller
> > > > > and
> > > > > it will auto-detect the connected storage. Whereas with SPI you
> > > > > describe
> > > >
> > > > Why does mmc assume storage and SDIO must be descibed? Why the special
> > > > casing?
> > >
> > > I can't follow you here. My SDIO wireless card just works in an SD
> > > slot and doesn't have to be described.
>
> I think the difference is that MMC (so also SDIO) is a discoverable bus,
> whereas SPI is not.
> It's conceptually dangerous to blindly probe for SPI chips, and the kernel
> tries to stay out of guessing games, in general, and leaves that up to
> firmware.

There is no guessing game involved. The SPI NOR is fully described in
the device tree. The only missing bit of information is if it is mounted
on this particular board. That can be brobed safely and reliably.

> > > > > both the controller and the flash. So I'd argue that your hardware
> > > > > description is wrong if it describes a flash which is not present.
> > > >
> > > > At any rate the situation is the same - the storage may be present
> > > > sometimes. I don't think assuming some kind of device by defualt is a
> > > > sound practice.
> > >
> > > Where is the assumption when the DT tells you there is a flash
> > > on a specific chip select but actually there it isn't. Shouldn't
> > > the DT then be fixed?
> >
> > The DT says there isn't a flash on a specific chip select when there is.
> > Shouldn't that be fixed?
> >
> > > Maybe I don't understand your problem. What are you trying to
> > > solve? I mean this just demotes an error to an info message.
>
> The particular problem at hand is that on those cheap development boards
> SPI flash is somewhat optional. The PCB often has the footprint for it, but
> sometimes it is not populated, because the vendor wants to save pennies.
>
> In this case (OrangePi Zero) there was no SPI chip soldered on the first
> batches, but later boards are shipped with a flash chip. The footprint is
> on every version, and I for instance soldered a chip on an early board.
>
> > Many boards provide multiple storage options - you get a PCB designed to
> > carry different kinds of storage, some may be socketed, some can be
> > soldered on in some production batches and not others.
> >
> > The kernel can handle this for many kinds of storage but not SPI flash.
> >
> > I don't see any reason why SPI flash should be a second class storage.
>
> See above, SPI is not discoverable, you need to know about the slave
> devices.

Which we do, from the device tree. Except the device is disabled in the
device tree so the kernel does not probe it.

>
> > > > However, when the board is designed for a specific kind of device which
> > > > is not always present, and the kernel can detect the device, it is
> > > > perfectly fine to describe it.
> > > >
> > > > The alternative is to not use the device at all, even when present,
> > > > which is kind of useless.
> > >
> > > Or let the bootloader update your device tree and disable the device
> > > if it's not there?
>
> Yes, this is what I was suggesting already: U-Boot can do the job, because
> a U-Boot build is device specific, and we can take certain risks that the
> generic and single-image kernel wants to avoid.

For some cases this may be warranted.

However, in this case no additional device-specific knowledge beyond
what is alrready specified in the device tree is needed.

A generic kernel can probe the device just fine.

> In this case we know that there is a SPI flash footprint, and it does no
> harm in trying to check on CS0. So I was thinking about introducing a
> U-Boot Kconfig variable to probe for and potentially disable the SPI flash
> DT node. We would set this variable in defconfigs of boards with optional
> SPI flash.
>
> > But then it must be in the device tree?
>
> However this indeed means that the SPI flash DT node must be in and enabled
> in the DT, because we (try hard to) only use original Linux DT files, and
> DTs must have been reviewed through the kernel ML first. The U-Boot driver
> relies on the DT as well, so the official kernel DT copy would need to come
> with that node enabled. Ideally U-Boot would disable it, if needed, and
> the kernel error message would never appear.

Yes, that's a good point - even if it's decided that the kernel will not
handle this, the device tree still needs to contain the node enabled for
the bootloader to handle the device, anyway.

Thanks

Michal

2022-07-16 08:23:02

by Michal Suchánek

[permalink] [raw]
Subject: Re: [PATCH 1/2] mtd: spi-nor: When a flash memory is missing do not report an error

On Fri, Jul 15, 2022 at 02:50:17PM +0530, Pratyush Yadav wrote:
> On 15/07/22 12:07AM, Michal Such?nek wrote:
> > On Thu, Jul 14, 2022 at 11:51:56PM +0200, Michael Walle wrote:
> > > Am 2022-07-14 22:55, schrieb Michal Such?nek:
> > > > On Thu, Jul 14, 2022 at 09:41:48PM +0200, Michael Walle wrote:
> > > > > Hi,
> > > > >
> > > > > Am 2022-07-14 21:19, schrieb Michal Suchanek:
> > > > > > It is normal that devices are designed with multiple types of storage,
> > > > > > and only some types of storage are present.
> > > > > >
> > > > > > The kernel can handle this situation gracefully for many types of
> > > > > > storage devices such as mmc or ata but it reports and error when spi
> > > > > > flash is not present.
> > > > > >
> > > > > > Only print a notice that the storage device is missing when no response
> > > > > > to the identify command is received.
> > > > > >
> > > > > > Consider reply buffers with all bits set to the same value no response.
> > > > >
> > > > > I'm not sure you can compare SPI with ATA and MMC. I'm just speaking
> > > > > of
> > > > > DT now, but there, for ATA and MMC you just describe the controller
> > > > > and
> > > > > it will auto-detect the connected storage. Whereas with SPI you
> > > > > describe
> > > >
> > > > Why does mmc assume storage and SDIO must be descibed? Why the special
> > > > casing?
> > >
> > > I can't follow you here. My SDIO wireless card just works in an SD
> > > slot and doesn't have to be described.
> > >
> > > > > both the controller and the flash. So I'd argue that your hardware
> > > > > description is wrong if it describes a flash which is not present.
> > > >
> > > > At any rate the situation is the same - the storage may be present
> > > > sometimes. I don't think assuming some kind of device by defualt is a
> > > > sound practice.
> > >
> > > Where is the assumption when the DT tells you there is a flash
> > > on a specific chip select but actually there it isn't. Shouldn't
> > > the DT then be fixed?
> >
> > The DT says there isn't a flash on a specific chip select when there is.
> > Shouldn't that be fixed?
>
> If the board has a flash chip, then DT should describe it. The it does
> not have one, then DT should not describe it.
>
> So if DT says there isn't a flash on a specific CS when there is, then
> DT should be fixed to describe the flash, and then we can probe it. You
> both seem to be saying the same thing here, and I agree.

The disagreement is about the situation when there is sometimes a flash
chip.

As of now the chip is described in the device tree but is disabled, and
there is no mechanism for enabling it.

If it were enabled the driver could probe it but it is not.

The only real argument against enabling it I head was that if it is
enabled and missing users would see the kernel printing an error nad
come here, wasting everyones time.

So here is a patch the makes the kernle not print an error when the chip
is missing, and limit the error only to the situation when a chip is
present but not recognized.

>
> >
> > > Maybe I don't understand your problem. What are you trying to
> > > solve? I mean this just demotes an error to an info message.
> >
> > Many boards provide multiple storage options - you get a PCB designed to
> > carry different kinds of storage, some may be socketed, some can be
> > soldered on in some production batches and not others.
>
> Usually for non-enumerable components you can plug in or out, you should
> use DT overlays to add the correct nodes as needed. For example, CSI
> cameras just plug into a slot on the board. So you can easily remove one
> and add another. That is why we do not put the camera node in the board
> dts, but instead apply it as an overlay from the bootloader.

However, here the device is already enumerated in the device tree, the
only missing bit of information is if the device is present.

Sure, device tree overlays are useful and the right tool for the job for
some jobs, but not this one.

Please don't fall into the thinking that when you have a hammer
everything looks like a nail.

>
> >
> > The kernel can handle this for many kinds of storage but not SPI flash.
> >
> > I don't see any reason why SPI flash should be a second class storage.
> >
> > > > However, when the board is designed for a specific kind of device which
> > > > is not always present, and the kernel can detect the device, it is
> > > > perfectly fine to describe it.
> > > >
> > > > The alternative is to not use the device at all, even when present,
> > > > which is kind of useless.
> > >
> > > Or let the bootloader update your device tree and disable the device
> > > if it's not there?
> >
> > But then it must be in the device tree?
> >
> > And then people will complain that if the bootloader does not have this
> > feature then the kernel prints an error message?
>
> Then add the feature to the bootloader? Adding a node in DT for a flash
> that is not present is wrong.

And how would the bootloader know that it shouild look for a flash if
it's not described in the device tree?

>
> >
> > > Or load an overlay if it is there?
> >
> > Or maybe the kernel could just detect if the storage is present?
>
> It can't. This is a non-enumerable bus, unlike USB or PCI. And there is
> no way you can actually detect the presence or absence of a flash
> reliably. For example, TI chips come with a flash that is capable of
> 8D-8D-8D mode. If the flash is handed to the kernel in 8D-8D-8D mode,
> the read ID command will return all 0xff bytes since the kernel expacts
> communication in 1S-1S-1S mode. With your patch you will say that no
> flash memory is detected. In reality the flash is present but the kernel
> just failed to properly detect it.

This is a strawman argument. If your flash chip starts up in 8D-8D-8D
and you don't tell the kernel to talk to it in 8D-8D-8D it can't talk to
it no matter what. The id command will fail all the same if the chip is
there and try to probe it's type - all 0xff is not a valid id. The
problem is not that you are probing the presence of the chip but that
you did not describe the chip to the kernel correctly.

Not to mention that the controller in question does not support any
advanced modes, anyway.

>
> >
> > It's not like we don't have an identify command.
>
> We do, but it does not let us detect the _absence_ of a flash.

Actually, it does.

Thanks

Michal

2022-07-16 09:34:20

by Michael Walle

[permalink] [raw]
Subject: Re: [PATCH 1/2] mtd: spi-nor: When a flash memory is missing do not report an error

Am 2022-07-16 10:20, schrieb Michal Suchánek:

>> So if DT says there isn't a flash on a specific CS when there is, then
>> DT should be fixed to describe the flash, and then we can probe it.
>> You
>> both seem to be saying the same thing here, and I agree.
>
> The disagreement is about the situation when there is sometimes a flash
> chip.

No. The disagreement is what should happen if the DT says there is
a device but there isn't. Which right now is an error and it should
stay that way. Your hardware description says there is a flash
but it cannot be probed, so it is an error. What about a board
which has an actual error and the flash isn't responding? You
trade one use case for another.

Also I've looked at the PHY subsystem and there, if a PHY is described
in the DT but isn't there, the following error will be printed:
dev_err(&mdio->dev, "MDIO device at address %d is missing.\n", addr);

And that is for a bus which can even be automatically be
probed/detected.

-michael

2022-07-16 09:42:49

by Michal Suchánek

[permalink] [raw]
Subject: Re: [PATCH 1/2] mtd: spi-nor: When a flash memory is missing do not report an error

On Sat, Jul 16, 2022 at 11:30:12AM +0200, Michael Walle wrote:
> Am 2022-07-16 10:20, schrieb Michal Such?nek:
>
> > > So if DT says there isn't a flash on a specific CS when there is, then
> > > DT should be fixed to describe the flash, and then we can probe it.
> > > You
> > > both seem to be saying the same thing here, and I agree.
> >
> > The disagreement is about the situation when there is sometimes a flash
> > chip.
>
> No. The disagreement is what should happen if the DT says there is
> a device but there isn't. Which right now is an error and it should
> stay that way. Your hardware description says there is a flash
> but it cannot be probed, so it is an error. What about a board
> which has an actual error and the flash isn't responding? You
> trade one use case for another.

And what if you have a SATA controller with a bad cable?

Or a bad connection to a mmc card?

Then the kernel also does not say there is an error and simply does not
see the device.

This is normal. Not all devices that can potentially exist do exist. It
is up to the user to decide if it's an error that the device is missing.

> Also I've looked at the PHY subsystem and there, if a PHY is described
> in the DT but isn't there, the following error will be printed:
> dev_err(&mdio->dev, "MDIO device at address %d is missing.\n", addr);
>
> And that is for a bus which can even be automatically be
> probed/detected.

If there is no use case for having a card with unpopulated PHY then it
makes sense.

Here we do have a use case so the comparison is moot.

Thanks

Michal

2022-07-16 10:18:12

by Michael Walle

[permalink] [raw]
Subject: Re: [PATCH 1/2] mtd: spi-nor: When a flash memory is missing do not report an error

Am 2022-07-16 11:38, schrieb Michal Suchánek:
> On Sat, Jul 16, 2022 at 11:30:12AM +0200, Michael Walle wrote:
>> Am 2022-07-16 10:20, schrieb Michal Suchánek:
>>
>> > > So if DT says there isn't a flash on a specific CS when there is, then
>> > > DT should be fixed to describe the flash, and then we can probe it.
>> > > You
>> > > both seem to be saying the same thing here, and I agree.
>> >
>> > The disagreement is about the situation when there is sometimes a flash
>> > chip.
>>
>> No. The disagreement is what should happen if the DT says there is
>> a device but there isn't. Which right now is an error and it should
>> stay that way. Your hardware description says there is a flash
>> but it cannot be probed, so it is an error. What about a board
>> which has an actual error and the flash isn't responding? You
>> trade one use case for another.
>
> And what if you have a SATA controller with a bad cable?
>
> Or a bad connection to a mmc card?

Again. You don't tell the kernel via the DT that there is an
MMC card nor that there is a SATA SDD. In contrast to SPI-NOR..


> Then the kernel also does not say there is an error and simply does not
> see the device.
>
> This is normal. Not all devices that can potentially exist do exist. It
> is up to the user to decide if it's an error that the device is
> missing.
>
>> Also I've looked at the PHY subsystem and there, if a PHY is described
>> in the DT but isn't there, the following error will be printed:
>> dev_err(&mdio->dev, "MDIO device at address %d is missing.\n",
>> addr);
>>
>> And that is for a bus which can even be automatically be
>> probed/detected.
>
> If there is no use case for having a card with unpopulated PHY then it
> makes sense.
>
> Here we do have a use case so the comparison is moot.

And what use case is that? You are just demoting an error
to an info. Apparently you just want to see that error
go away, there is no change in functionality.

-michael

2022-07-16 11:35:49

by Andre Przywara

[permalink] [raw]
Subject: Re: [PATCH 1/2] mtd: spi-nor: When a flash memory is missing do not report an error

On Sat, 16 Jul 2022 09:54:21 +0200
Michal Suchánek <[email protected]> wrote:

Hi,

> On Fri, Jul 15, 2022 at 01:20:06PM +0100, Andre Przywara wrote:
> > On Fri, 15 Jul 2022 00:07:44 +0200
> > Michal Such�nek <[email protected]> wrote:
> >
> > Hi,
> >
> > > On Thu, Jul 14, 2022 at 11:51:56PM +0200, Michael Walle wrote:
> > > > Am 2022-07-14 22:55, schrieb Michal Such�nek:
> > > > > On Thu, Jul 14, 2022 at 09:41:48PM +0200, Michael Walle wrote:
> > > > > > Hi,
> > > > > >
> > > > > > Am 2022-07-14 21:19, schrieb Michal Suchanek:
> > > > > > > It is normal that devices are designed with multiple types of storage,
> > > > > > > and only some types of storage are present.
> > > > > > >
> > > > > > > The kernel can handle this situation gracefully for many types of
> > > > > > > storage devices such as mmc or ata but it reports and error when spi
> > > > > > > flash is not present.
> > > > > > >
> > > > > > > Only print a notice that the storage device is missing when no response
> > > > > > > to the identify command is received.
> > > > > > >
> > > > > > > Consider reply buffers with all bits set to the same value no response.
> > > > > >
> > > > > > I'm not sure you can compare SPI with ATA and MMC. I'm just speaking
> > > > > > of
> > > > > > DT now, but there, for ATA and MMC you just describe the controller
> > > > > > and
> > > > > > it will auto-detect the connected storage. Whereas with SPI you
> > > > > > describe
> > > > >
> > > > > Why does mmc assume storage and SDIO must be descibed? Why the special
> > > > > casing?
> > > >
> > > > I can't follow you here. My SDIO wireless card just works in an SD
> > > > slot and doesn't have to be described.
> >
> > I think the difference is that MMC (so also SDIO) is a discoverable bus,
> > whereas SPI is not.
> > It's conceptually dangerous to blindly probe for SPI chips, and the kernel
> > tries to stay out of guessing games, in general, and leaves that up to
> > firmware.
>
> There is no guessing game involved. The SPI NOR is fully described in
> the device tree. The only missing bit of information is if it is mounted
> on this particular board. That can be brobed safely and reliably.

For this particular board: maybe. In general: no. I don't think the
kernel is the place to make those decisions. As Michael said: if the DT
explicitly says there is a SPI flash, and there isn't, it's an error.

> > > > > > both the controller and the flash. So I'd argue that your hardware
> > > > > > description is wrong if it describes a flash which is not present.
> > > > >
> > > > > At any rate the situation is the same - the storage may be present
> > > > > sometimes. I don't think assuming some kind of device by defualt is a
> > > > > sound practice.
> > > >
> > > > Where is the assumption when the DT tells you there is a flash
> > > > on a specific chip select but actually there it isn't. Shouldn't
> > > > the DT then be fixed?
> > >
> > > The DT says there isn't a flash on a specific chip select when there is.
> > > Shouldn't that be fixed?
> > >
> > > > Maybe I don't understand your problem. What are you trying to
> > > > solve? I mean this just demotes an error to an info message.
> >
> > The particular problem at hand is that on those cheap development boards
> > SPI flash is somewhat optional. The PCB often has the footprint for it, but
> > sometimes it is not populated, because the vendor wants to save pennies.
> >
> > In this case (OrangePi Zero) there was no SPI chip soldered on the first
> > batches, but later boards are shipped with a flash chip. The footprint is
> > on every version, and I for instance soldered a chip on an early board.
> >
> > > Many boards provide multiple storage options - you get a PCB designed to
> > > carry different kinds of storage, some may be socketed, some can be
> > > soldered on in some production batches and not others.
> > >
> > > The kernel can handle this for many kinds of storage but not SPI flash.
> > >
> > > I don't see any reason why SPI flash should be a second class storage.
> >
> > See above, SPI is not discoverable, you need to know about the slave
> > devices.
>
> Which we do, from the device tree. Except the device is disabled in the
> device tree so the kernel does not probe it.

That doesn't count, status = "disable" has the same effect as the node
removed, the kernel doesn't use it. It's disabled because it's broken,
or the board catches fires when it's accessed, or the SPI flash is
secure only, and the kernel receives an SError when accessing it and
panics. We don't care exactly why, the kernel just skips it.

In this case having the node in DT and marking it as disabled was a
concession to users, to allow simple enablement, like this:
=> fdt addr $fdtcontroladdr
=> fdt set /soc/spi status "okay"
(on the U-Boot prompt)

And it's actually a hint that U-Boot can do this automatically, only we
need it the other way around then ("okay" from the beginning, switching
to "disabled" if needed).

> > > > > However, when the board is designed for a specific kind of device which
> > > > > is not always present, and the kernel can detect the device, it is
> > > > > perfectly fine to describe it.
> > > > >
> > > > > The alternative is to not use the device at all, even when present,
> > > > > which is kind of useless.
> > > >
> > > > Or let the bootloader update your device tree and disable the device
> > > > if it's not there?
> >
> > Yes, this is what I was suggesting already: U-Boot can do the job, because
> > a U-Boot build is device specific, and we can take certain risks that the
> > generic and single-image kernel wants to avoid.
>
> For some cases this may be warranted.
>
> However, in this case no additional device-specific knowledge beyond
> what is alrready specified in the device tree is needed.
>
> A generic kernel can probe the device just fine.

The fact that sometimes there is a SPI flash and sometimes not, is a
pity device specific problem. The kernel does not have and does not
want to have any knowledge of this particular problem: we have the DT
to tell us exactly what devices are there. Firmware is encouraged to
tweak the DT, if needed.

> > In this case we know that there is a SPI flash footprint, and it does no
> > harm in trying to check on CS0. So I was thinking about introducing a
> > U-Boot Kconfig variable to probe for and potentially disable the SPI flash
> > DT node. We would set this variable in defconfigs of boards with optional
> > SPI flash.
> >
> > > But then it must be in the device tree?
> >
> > However this indeed means that the SPI flash DT node must be in and enabled
> > in the DT, because we (try hard to) only use original Linux DT files, and
> > DTs must have been reviewed through the kernel ML first. The U-Boot driver
> > relies on the DT as well, so the official kernel DT copy would need to come
> > with that node enabled. Ideally U-Boot would disable it, if needed, and
> > the kernel error message would never appear.
>
> Yes, that's a good point - even if it's decided that the kernel will not
> handle this, the device tree still needs to contain the node enabled for
> the bootloader to handle the device, anyway.

Yes, and I am happy to support that case when we send a patch to change
the DT in the kernel repo.
But I don't think we need an actual kernel patch to address this
problem.

Cheers,
Andre

P.S. I just see that status = "disabled" is in the wrong node, it
should be in the SPI slave node, as the controller and the SPI bus
itself are fine.

2022-07-16 12:02:07

by Andre Przywara

[permalink] [raw]
Subject: Re: [PATCH 1/2] mtd: spi-nor: When a flash memory is missing do not report an error

On Fri, 15 Jul 2022 21:28:57 -0500
Samuel Holland <[email protected]> wrote:

Hi,

> On 7/15/22 7:20 AM, Andre Przywara wrote:
> >>>> However, when the board is designed for a specific kind of device which
> >>>> is not always present, and the kernel can detect the device, it is
> >>>> perfectly fine to describe it.
> >>>>
> >>>> The alternative is to not use the device at all, even when present,
> >>>> which is kind of useless.
> >>>
> >>> Or let the bootloader update your device tree and disable the device
> >>> if it's not there?
> >
> > Yes, this is what I was suggesting already: U-Boot can do the job, because
> > a U-Boot build is device specific, and we can take certain risks that the
> > generic and single-image kernel wants to avoid.
> > In this case we know that there is a SPI flash footprint, and it does no
> > harm in trying to check on CS0. So I was thinking about introducing a
> > U-Boot Kconfig variable to probe for and potentially disable the SPI flash
> > DT node. We would set this variable in defconfigs of boards with optional
> > SPI flash.
>
> To support the "does no harm" assertion: the Allwinner Boot ROM will probe for
> NOR flash (and possibly SPI NAND) on SPI0 + CS0 if no bootable MMC device is
> found. So the board designer must already assume that JEDEC Read ID commands
> will be sent over that bus.
>
> >> But then it must be in the device tree?
> >
> > However this indeed means that the SPI flash DT node must be in and enabled
> > in the DT, because we (try hard to) only use original Linux DT files, and
> > DTs must have been reviewed through the kernel ML first. The U-Boot driver
> > relies on the DT as well, so the official kernel DT copy would need to come
> > with that node enabled. Ideally U-Boot would disable it, if needed, and
> > the kernel error message would never appear.
>
> I don't think this works for newer SoCs where the Boot ROM supports both NOR and
> SPI NAND. If the board is sold with the flash chip unpopulated, the user could
> install either kind of chip. So we cannot statically assume which binding will
> be used. We would need to add a node with the right compatible string after
> probing for both in the bootloader.

If a *user* decides to *change* the board, it's up to the user
to make sure the DT matches. Overlays are the typical answer, or people
change the DT before they build U-Boot. If someone decides to hack
their board, they have to take care of the respective DT description
hack as well.

This case here is about the *vendor* shipping different versions of the
board, which I think is a different case. Technically we now would need
two DTs: one with, one without the SPI flash node, and let the user
decide which one to include in U-Boot at build time, depending on which
version they have.

What I was suggesting is a U-Boot config switch, which would only be
enabled on those boards where we have this situation (OPi Zero):
That avoids dangerous situations (because we know it's safe *on this
particular board*), and avoids the hassle of shipping two firmware
versions.

Cheers,
Andre

2022-07-24 16:20:12

by Michal Suchánek

[permalink] [raw]
Subject: Re: [PATCH 1/2] mtd: spi-nor: When a flash memory is missing do not report an error

On Sat, Jul 16, 2022 at 11:44:48AM +0200, Michael Walle wrote:
> Am 2022-07-16 11:38, schrieb Michal Such?nek:
> > On Sat, Jul 16, 2022 at 11:30:12AM +0200, Michael Walle wrote:
> > > Am 2022-07-16 10:20, schrieb Michal Such?nek:
> > >
> > > > > So if DT says there isn't a flash on a specific CS when there is, then
> > > > > DT should be fixed to describe the flash, and then we can probe it.
> > > > > You
> > > > > both seem to be saying the same thing here, and I agree.
> > > >
> > > > The disagreement is about the situation when there is sometimes a flash
> > > > chip.
> > >
> > > No. The disagreement is what should happen if the DT says there is
> > > a device but there isn't. Which right now is an error and it should
> > > stay that way. Your hardware description says there is a flash
> > > but it cannot be probed, so it is an error. What about a board
> > > which has an actual error and the flash isn't responding? You
> > > trade one use case for another.
> >
> > And what if you have a SATA controller with a bad cable?
> >
> > Or a bad connection to a mmc card?
>
> Again. You don't tell the kernel via the DT that there is an
> MMC card nor that there is a SATA SDD. In contrast to SPI-NOR..

So what?

there is some firmware table that can enable/disable individual SATA
ports, and some BIOSes have the option to disable individual ports.

Sure, you could insist that users should disable unused ports, and
having enabled port with no disk is an error but it's unreasonable - the
users would have to flip the setting unnecessarily because the kernel
can probe the presence of the disk but insists on assuming there is
one on every port. So people would rightfully point out that it's not
inherently an error to have a SATA port but no disk attached.

What I want is merely the same level of service for SPI NOR - that the
user does not need to switch the devicetrees when the presence of a
flash chip can be probed.

You keep repeting that devicetree describes the hardware but that's not
what it does.

"A |spec|-compliant devicetree describes device information in a system
that cannot necessarily be dynamically detected by a client program."

So it does not describe 'hardware', it describes hardware properties
that cannot be probed.

Let's dissect the mmc situation in a bit more detail:

If the device tree describes a MMC bus it means that there are some
traces attached to some hardware pins, and that the pinmux should be
configured to expose a mmc peripherial on these pins.

The decision to use these pins for mmc is somewhat arbitrary. It must be
specified in the devicetree because it cannot be probed but on many
boards you could use the same pins for different hardware peripherial
such as an UART, and with bitbanging low-speed protocols the
possibilities are endless. Even if there is a SD socket that does not
really mean anything - there are many abuses of standard connectors for
non-standard purposes all around.

Nonetheless, if at least some boards come with a MMC device soldered to
the pins or the documentation advises to use a SD/SDIO card then the
pins should be configured as MMC, and it is not invalidated by the fact
that the pins could theoretically be abused for other purposes.

Also on some boards there is always a MMC device soldered, and lack of
the device could be considered an error. Nonetheless, on other boards
the device may be present only on some boards or socketed. The mmc
driver is driver for all of these boards so it has to deal with missing
device gracefully.

Sure, it means that for some boards that have damaged hardware no error
is reported. That's the result of drivers being general purpose, not
baord specific. The same way on some PC mainboards you could consider
missing SATA drive on a specific port an error but the driver is
generic, and should handle also boards that don't necessarily always
come with all ports populated.

There is also the possibility that you have a SDIO WiFi card that does
not have an eeprom with a MAC address, and an EEPROM is mounted
somewhere on the board to store it. In thios case although the MMC bus
is 'discoverable', and the presence and type of the SDIO card can be
probed all right the relationship to the eeprom cannot, and has to be
recorded in the device tree. Will missing the SDIO card cause an error?
I doubt it, MMC is the great 'discoverable' bus after all so here you
have it - a hardware is described, it's missing, and it's not an error.

So for SPI NOR we similarily have some traces that are connected to some
hardware pins, and given that the board comes with a SPI NOR at least
sometimes, or it is documented that these traces should be used for
mounting a SPI NOR that should be described in the device tree. It is
not invalidated by the fact that not all boards come with the SPI NOR or
that it comes in an optional separate bag with accessories or that the
pins coud be theoretically abused for something else.

Now because SPI bus is not 'discoverable' not only the arbitrary
decision to mux the SPI peripehrial on those pins needs to be recorded
but also the arbitrary decision to use the pins to acces a SPI NOR, and
not any other random SPI peripherial.

Again, the spi-nor driver is driver not only the boards that always come
with the flash soldered but also for the boards where it is mounted only
sometimes. It should deal gracefully with the flash missing as much as
it deals gracefully with different flash memory sizes, erase block
sizes, etc. It's a parameter that can be probed.

> > Then the kernel also does not say there is an error and simply does not
> > see the device.
> >
> > This is normal. Not all devices that can potentially exist do exist. It
> > is up to the user to decide if it's an error that the device is missing.
> >
> > > Also I've looked at the PHY subsystem and there, if a PHY is described
> > > in the DT but isn't there, the following error will be printed:
> > > dev_err(&mdio->dev, "MDIO device at address %d is missing.\n",
> > > addr);
> > >
> > > And that is for a bus which can even be automatically be
> > > probed/detected.
> >
> > If there is no use case for having a card with unpopulated PHY then it
> > makes sense.
> >
> > Here we do have a use case so the comparison is moot.
>
> And what use case is that? You are just demoting an error

Boards with optional SPI NOR device.

> to an info. Apparently you just want to see that error
> go away, there is no change in functionality.

Yes, it's a sign of a bad assumption on the part of the driver. Sure, it
might have been true in the past but with new hardware it's no longer
the case so the driver should adjust.

Currently the driver problem is worked around by being over-specific
in devicetree but ultimately the driver is broken and should be fixed.

Thanks

Michal

2022-07-24 18:41:51

by Michal Suchánek

[permalink] [raw]
Subject: Re: [PATCH 1/2] mtd: spi-nor: When a flash memory is missing do not report an error

On Sat, Jul 16, 2022 at 11:58:49AM +0100, Andre Przywara wrote:
> On Fri, 15 Jul 2022 21:28:57 -0500
> Samuel Holland <[email protected]> wrote:
>
> Hi,
>
> > On 7/15/22 7:20 AM, Andre Przywara wrote:
> > >>>> However, when the board is designed for a specific kind of device which
> > >>>> is not always present, and the kernel can detect the device, it is
> > >>>> perfectly fine to describe it.
> > >>>>
> > >>>> The alternative is to not use the device at all, even when present,
> > >>>> which is kind of useless.
> > >>>
> > >>> Or let the bootloader update your device tree and disable the device
> > >>> if it's not there?
> > >
> > > Yes, this is what I was suggesting already: U-Boot can do the job, because
> > > a U-Boot build is device specific, and we can take certain risks that the
> > > generic and single-image kernel wants to avoid.
> > > In this case we know that there is a SPI flash footprint, and it does no
> > > harm in trying to check on CS0. So I was thinking about introducing a
> > > U-Boot Kconfig variable to probe for and potentially disable the SPI flash
> > > DT node. We would set this variable in defconfigs of boards with optional
> > > SPI flash.
> >
> > To support the "does no harm" assertion: the Allwinner Boot ROM will probe for
> > NOR flash (and possibly SPI NAND) on SPI0 + CS0 if no bootable MMC device is
> > found. So the board designer must already assume that JEDEC Read ID commands
> > will be sent over that bus.
> >
> > >> But then it must be in the device tree?
> > >
> > > However this indeed means that the SPI flash DT node must be in and enabled
> > > in the DT, because we (try hard to) only use original Linux DT files, and
> > > DTs must have been reviewed through the kernel ML first. The U-Boot driver
> > > relies on the DT as well, so the official kernel DT copy would need to come
> > > with that node enabled. Ideally U-Boot would disable it, if needed, and
> > > the kernel error message would never appear.
> >
> > I don't think this works for newer SoCs where the Boot ROM supports both NOR and
> > SPI NAND. If the board is sold with the flash chip unpopulated, the user could
> > install either kind of chip. So we cannot statically assume which binding will
> > be used. We would need to add a node with the right compatible string after
> > probing for both in the bootloader.
>
> If a *user* decides to *change* the board, it's up to the user
> to make sure the DT matches. Overlays are the typical answer, or people
> change the DT before they build U-Boot. If someone decides to hack
> their board, they have to take care of the respective DT description
> hack as well.
>
> This case here is about the *vendor* shipping different versions of the
> board, which I think is a different case. Technically we now would need
> two DTs: one with, one without the SPI flash node, and let the user
> decide which one to include in U-Boot at build time, depending on which
> version they have.

Technically we don't. The DT is supposed to describe hardware properties
that cannot be probed. Probing presence of the SPI NOR is perfectly
possible so it suffices to record that the SPI CS is reserved for a SPI
NOR, and as much as we don't describe the size we don't need to describe
or assume the presence either.

> What I was suggesting is a U-Boot config switch, which would only be
> enabled on those boards where we have this situation (OPi Zero):
> That avoids dangerous situations (because we know it's safe *on this
> particular board*), and avoids the hassle of shipping two firmware
> versions.

Also if we really want to make u-boot probe the device (such as in cases
when there is choice of mutiple devices) DT also supports "reserved"
state which may be useful for makring the devices to probe when the DT
is loaded together with u-boot before passing it to Linux.

Thanks

Michal