2021-10-11 16:23:14

by Sean Nyekjaer

Subject: [PATCH 0/3] mtd: core: protect access to mtd devices while in suspend

Follow-up on discussion in https://lkml.org/lkml/2021/10/4/41

Changes since RFC v1/v2:
- added access protection for all device access hooks in mtd_info.
- added Suggested-by to [1/3] patch.
- removed references to commit ef347c0cfd61 ("mtd: rawnand: gpmi: Implement exec_op")
from the commit message, as commit 013e6292aaf5 ("mtd: rawnand: Simplify the locking")
is the one to blame.
- tested on a kernel with LOCKDEP enabled.

@Miquel: I haven't covered every ioctl; to me it looks like they don't have
direct device access.

One (small) issue is still present: gpmi_nand.c uses the rwsem before it's
initialized. It seems cumbersome to have every mtd/nand driver call
init_waitqueue_head() and init_rwsem(). Could we somehow move the call
to mtd_set_dev_defaults() before nand_create_bbt()?

---
nand: device found, Manufacturer ID: 0x98, Chip ID: 0xdc
nand: Toshiba NAND 512MiB 3,3V 8-bit
nand: 512 MiB, SLC, erase size: 256 KiB, page size: 4096, OOB size: 128
------------[ cut here ]------------
WARNING: CPU: 0 PID: 1 at kernel/locking/rwsem.c:1240 down_read+0x160/0x184
DEBUG_RWSEMS_WARN_ON(sem->magic != sem): count = 0x0, magic = 0x0, owner = 0x0, curr 0xc40b0000, list not empty
Modules linked in:
CPU: 0 PID: 1 Comm: swapper/0 Not tainted 5.15.0-rc3-00002-g7eab985fa8cd-dirty #163
Hardware name: Freescale i.MX6 Ultralite (Device Tree)
[<c010fa90>] (unwind_backtrace) from [<c010b8fc>] (show_stack+0x10/0x14)
[<c010b8fc>] (show_stack) from [<c0b34998>] (dump_stack_lvl+0x58/0x70)
[<c0b34998>] (dump_stack_lvl) from [<c0123884>] (__warn+0xd4/0x154)
[<c0123884>] (__warn) from [<c0b2e9ac>] (warn_slowpath_fmt+0x94/0xbc)
[<c0b2e9ac>] (warn_slowpath_fmt) from [<c0b428c8>] (down_read+0x160/0x184)
[<c0b428c8>] (down_read) from [<c06df42c>] (mtd_read_oob+0xc8/0x284)
[<c06df42c>] (mtd_read_oob) from [<c06f28cc>] (scan_block_fast+0x78/0xf4)
[<c06f28cc>] (scan_block_fast) from [<c06f3c14>] (search_bbt+0x268/0x304)
[<c06f3c14>] (search_bbt) from [<c06f3ef0>] (nand_create_bbt+0x240/0x6dc)
[<c06f3ef0>] (nand_create_bbt) from [<c06fa758>] (gpmi_nand_probe+0x568/0x6d0)
[<c06fa758>] (gpmi_nand_probe) from [<c068731c>] (platform_probe+0x58/0xb8)
[<c068731c>] (platform_probe) from [<c0684a48>] (really_probe.part.0+0x9c/0x32c)
[<c0684a48>] (really_probe.part.0) from [<c0684d78>] (__driver_probe_device+0xa0/0x138)
[<c0684d78>] (__driver_probe_device) from [<c0684e40>] (driver_probe_device+0x30/0x10c)
[<c0684e40>] (driver_probe_device) from [<c0685510>] (__driver_attach+0xb4/0x174)
[<c0685510>] (__driver_attach) from [<c0682b84>] (bus_for_each_dev+0x78/0xb8)
[<c0682b84>] (bus_for_each_dev) from [<c0683ea4>] (bus_add_driver+0xf0/0x1d8)
[<c0683ea4>] (bus_add_driver) from [<c0686270>] (driver_register+0x88/0x118)
[<c0686270>] (driver_register) from [<c0102108>] (do_one_initcall+0x74/0x378)
[<c0102108>] (do_one_initcall) from [<c1001304>] (kernel_init_freeable+0x2a4/0x348)
[<c1001304>] (kernel_init_freeable) from [<c0b3bc34>] (kernel_init+0x10/0x124)
[<c0b3bc34>] (kernel_init) from [<c010011c>] (ret_from_fork+0x14/0x38)
Exception stack(0xc40a7fb0 to 0xc40a7ff8)
7fa0: ???????? ???????? ???????? ????????
7fc0: ???????? ???????? ???????? ???????? ???????? ???????? ???????? ????????
7fe0: ???????? ???????? ???????? ???????? ???????? ????????
irq event stamp: 110936
hardirqs last  enabled at (110935): [<c0100b04>] __irq_svc+0x64/0x80
hardirqs last disabled at (110936): [<c0100af0>] __irq_svc+0x50/0x80
softirqs last  enabled at (110906): [<c093818c>] netlink_insert+0x60/0x8a4
softirqs last disabled at (110904): [<c08ac6d4>] release_sock+0x18/0x98
---[ end trace c70a1a8da23b619d ]---
Bad block table found at page 131008, version 0x01
Bad block table found at page 130944, version 0x01
---

Sean Nyekjaer (3):
mtd: core: protect access to MTD devices while in suspend
mtd: rawnand: remove suspended check
mtd: mtdconcat: add suspend lock handling

drivers/mtd/mtdconcat.c | 11 ++-
drivers/mtd/mtdcore.c | 133 +++++++++++++++++++++++++++----
drivers/mtd/nand/raw/nand_base.c | 52 +++---------
include/linux/mtd/mtd.h | 109 +++++++++++++++++++++----
include/linux/mtd/rawnand.h | 5 +-
5 files changed, 234 insertions(+), 76 deletions(-)

--
2.33.0
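
The warning in the log above comes from a debug check that compares the semaphore's magic field with its own address; the log shows magic = 0x0, i.e. a semaphore that was never initialized. A minimal userspace sketch of that kind of check follows. All fake_* names are hypothetical stand-ins, not the kernel's implementation, and the sketch returns an error where the kernel would only warn.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical userspace stand-in for a debug-enabled rw-semaphore. */
struct fake_rwsem {
	struct fake_rwsem *magic;	/* points to itself once initialized */
	int readers;			/* number of active readers */
};

static void fake_init_rwsem(struct fake_rwsem *sem)
{
	sem->magic = sem;
	sem->readers = 0;
}

/* Returns 0 on success, -1 if the semaphore was never initialized. */
static int fake_down_read(struct fake_rwsem *sem)
{
	if (sem->magic != sem)
		return -1;	/* the kernel would WARN here instead */
	sem->readers++;
	return 0;
}

static void fake_up_read(struct fake_rwsem *sem)
{
	assert(sem->magic == sem && sem->readers > 0);
	sem->readers--;
}
```

A zero-filled structure fails the check until fake_init_rwsem() has run, which is the situation the BBT scan hits when it reads from the device before the MTD device is registered.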


2021-10-11 16:23:20

by Sean Nyekjaer

Subject: [PATCH 2/3] mtd: rawnand: remove suspended check

Access is now protected in the upper MTD layer while MTD devices are suspended.

Fixes: 013e6292aaf5 ("mtd: rawnand: Simplify the locking")
Reviewed-by: Miquel Raynal <[email protected]>
Signed-off-by: Sean Nyekjaer <[email protected]>
---
drivers/mtd/nand/raw/nand_base.c | 52 ++++++++------------------------
include/linux/mtd/rawnand.h | 5 +--
2 files changed, 14 insertions(+), 43 deletions(-)

diff --git a/drivers/mtd/nand/raw/nand_base.c b/drivers/mtd/nand/raw/nand_base.c
index 3d6c6e880520..aa2874ae3c4a 100644
--- a/drivers/mtd/nand/raw/nand_base.c
+++ b/drivers/mtd/nand/raw/nand_base.c
@@ -332,19 +332,11 @@ static int nand_isbad_bbm(struct nand_chip *chip, loff_t ofs)
* @chip: NAND chip structure
*
* Lock the device and its controller for exclusive access
- *
- * Return: -EBUSY if the chip has been suspended, 0 otherwise
*/
-static int nand_get_device(struct nand_chip *chip)
+static void nand_get_device(struct nand_chip *chip)
{
mutex_lock(&chip->lock);
- if (chip->suspended) {
- mutex_unlock(&chip->lock);
- return -EBUSY;
- }
mutex_lock(&chip->controller->lock);
-
- return 0;
}

/**
@@ -573,10 +565,7 @@ static int nand_block_markbad_lowlevel(struct nand_chip *chip, loff_t ofs)
nand_erase_nand(chip, &einfo, 0);

/* Write bad block marker to OOB */
- ret = nand_get_device(chip);
- if (ret)
- return ret;
-
+ nand_get_device(chip);
ret = nand_markbad_bbm(chip, ofs);
nand_release_device(chip);
}
@@ -3756,9 +3745,7 @@ static int nand_read_oob(struct mtd_info *mtd, loff_t from,
ops->mode != MTD_OPS_RAW)
return -ENOTSUPP;

- ret = nand_get_device(chip);
- if (ret)
- return ret;
+ nand_get_device(chip);

if (!ops->datbuf)
ret = nand_do_read_oob(chip, from, ops);
@@ -4345,13 +4332,11 @@ static int nand_write_oob(struct mtd_info *mtd, loff_t to,
struct mtd_oob_ops *ops)
{
struct nand_chip *chip = mtd_to_nand(mtd);
- int ret;
+ int ret = 0;

ops->retlen = 0;

- ret = nand_get_device(chip);
- if (ret)
- return ret;
+ nand_get_device(chip);

switch (ops->mode) {
case MTD_OPS_PLACE_OOB:
@@ -4410,10 +4395,8 @@ int nand_erase_nand(struct nand_chip *chip, struct erase_info *instr,
if (nand_region_is_secured(chip, instr->addr, instr->len))
return -EIO;

- /* Grab the lock and see if the device is available */
- ret = nand_get_device(chip);
- if (ret)
- return ret;
+ /* Grab the lock */
+ nand_get_device(chip);

/* Shift to get first page */
page = (int)(instr->addr >> chip->page_shift);
@@ -4499,8 +4482,8 @@ static void nand_sync(struct mtd_info *mtd)

pr_debug("%s: called\n", __func__);

- /* Grab the lock and see if the device is available */
- WARN_ON(nand_get_device(chip));
+ /* Grab the lock */
+ nand_get_device(chip);
/* Release it and go back */
nand_release_device(chip);
}
@@ -4517,9 +4500,7 @@ static int nand_block_isbad(struct mtd_info *mtd, loff_t offs)
int ret;

/* Select the NAND device */
- ret = nand_get_device(chip);
- if (ret)
- return ret;
+ nand_get_device(chip);

nand_select_target(chip, chipnr);

@@ -4565,8 +4546,6 @@ static int nand_suspend(struct mtd_info *mtd)
mutex_lock(&chip->lock);
if (chip->ops.suspend)
ret = chip->ops.suspend(chip);
- if (!ret)
- chip->suspended = 1;
mutex_unlock(&chip->lock);

return ret;
@@ -4580,15 +4559,10 @@ static void nand_resume(struct mtd_info *mtd)
{
struct nand_chip *chip = mtd_to_nand(mtd);

+
mutex_lock(&chip->lock);
- if (chip->suspended) {
- if (chip->ops.resume)
- chip->ops.resume(chip);
- chip->suspended = 0;
- } else {
- pr_err("%s called for a chip which is not in suspended state\n",
- __func__);
- }
+ if (chip->ops.resume)
+ chip->ops.resume(chip);
mutex_unlock(&chip->lock);
}

diff --git a/include/linux/mtd/rawnand.h b/include/linux/mtd/rawnand.h
index b2f9dd3cbd69..1198a6548912 100644
--- a/include/linux/mtd/rawnand.h
+++ b/include/linux/mtd/rawnand.h
@@ -1237,9 +1237,7 @@ struct nand_secure_region {
* @pagecache.page: Page number currently in the cache. -1 means no page is
* currently cached
* @buf_align: Minimum buffer alignment required by a platform
- * @lock: Lock protecting the suspended field. Also used to serialize accesses
- * to the NAND device
- * @suspended: Set to 1 when the device is suspended, 0 when it's not
+ * @lock: Lock to serialize accesses to the NAND device
* @cur_cs: Currently selected target. -1 means no target selected, otherwise we
* should always have cur_cs >= 0 && cur_cs < nanddev_ntargets().
* NAND Controller drivers should not modify this value, but they're
@@ -1293,7 +1291,6 @@ struct nand_chip {

/* Internals */
struct mutex lock;
- unsigned int suspended : 1;
int cur_cs;
int read_retries;
struct nand_secure_region *secure_regions;
--
2.33.0
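
The layering this patch relies on can be sketched in userspace as follows. This is a simplified model under stated assumptions, not the code from patch 1/3 (which is not shown here): the fake_* names are hypothetical, the lock is reduced to a counter, and a suspended device returns -EBUSY where a real core might make the caller wait instead.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for the chip and its lock. */
struct fake_chip {
	int lock_depth;
};

/* Hypothetical stand-in for the MTD device owning the chip. */
struct fake_mtd {
	bool suspended;
	struct fake_chip chip;
};

/* Lower layer: only takes the lock, so it cannot fail any more. */
static void fake_get_device(struct fake_chip *chip)
{
	chip->lock_depth++;
}

static void fake_release_device(struct fake_chip *chip)
{
	chip->lock_depth--;
}

/* Upper layer: the only place that looks at the suspended state. */
static int fake_mtd_read(struct fake_mtd *mtd)
{
	if (mtd->suspended)
		return -EBUSY;	/* simplification: a real core might wait */

	fake_get_device(&mtd->chip);
	/* ... device access would happen here ... */
	fake_release_device(&mtd->chip);
	return 0;
}
```

Because the suspended check lives in one place above the driver, the lower-level helper needs no return value, which is what allows the callers in the diff above to drop their error handling.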

2021-10-11 16:36:54

by Boris Brezillon

Subject: Re: [PATCH 0/3] mtd: core: protect access to mtd devices while in suspend

On Mon, 11 Oct 2021 13:52:50 +0200
Sean Nyekjaer <[email protected]> wrote:

> Follow-up on discussion in https://lkml.org/lkml/2021/10/4/41
>
> Changes since RFC v1/v2:
> - added access protection for all device access hooks in mtd_info.
> - added Suggested-by to [1/3] patch.
> - removed references to commit ef347c0cfd61 ("mtd: rawnand: gpmi: Implement exec_op")
> from the commit message, as commit 013e6292aaf5 ("mtd: rawnand: Simplify the locking")
> is the one to blame.
> - tested on a kernel with LOCKDEP enabled.
>
> @Miquel: I haven't covered every ioctl; to me it looks like they don't have
> direct device access.
>
> One (small) issue is still present: gpmi_nand.c uses the rwsem before it's
> initialized. It seems cumbersome to have every mtd/nand driver call
> init_waitqueue_head() and init_rwsem(). Could we somehow move the call
> to mtd_set_dev_defaults() before nand_create_bbt()?

I have a nasty trick for that one, but I'm not sure Miquel will like it
(actually, I don't like it either, but it's so simple compared to the
other options we have that I'm tempted to go for this approach until
someone has time to invest in a cleaner solution :-)):

diff --git a/drivers/mtd/nand/raw/nand_base.c b/drivers/mtd/nand/raw/nand_base.c
index 3d6c6e880520..a9ac2d528a4d 100644
--- a/drivers/mtd/nand/raw/nand_base.c
+++ b/drivers/mtd/nand/raw/nand_base.c
@@ -6222,8 +6222,6 @@ static int nand_scan_tail(struct nand_chip *chip)
mtd->_sync = nand_sync;
mtd->_lock = nand_lock;
mtd->_unlock = nand_unlock;
- mtd->_suspend = nand_suspend;
- mtd->_resume = nand_resume;
mtd->_reboot = nand_shutdown;
mtd->_block_isreserved = nand_block_isreserved;
mtd->_block_isbad = nand_block_isbad;
@@ -6269,6 +6267,13 @@ static int nand_scan_tail(struct nand_chip *chip)
if (ret)
goto err_free_secure_regions;

+ /*
+ * Populate the suspend/resume hooks after the BBT has been scanned to
+ * avoid using the suspend lock and resume waitqueue which are only
+ * initialized when mtd_device_register() is called.
+ */
+ mtd->_suspend = nand_suspend;
+ mtd->_resume = nand_resume;
return 0;

err_free_secure_regions:



>
> ---
> nand: device found, Manufacturer ID: 0x98, Chip ID: 0xdc
> nand: Toshiba NAND 512MiB 3,3V 8-bit
> nand: 512 MiB, SLC, erase size: 256 KiB, page size: 4096, OOB size: 128
> ------------[ cut here ]------------
> WARNING: CPU: 0 PID: 1 at kernel/locking/rwsem.c:1240 down_read+0x160/0x184
> DEBUG_RWSEMS_WARN_ON(sem->magic != sem): count = 0x0, magic = 0x0, owner = 0x0, curr 0xc40b0000, list not empty
> Modules linked in:
> CPU: 0 PID: 1 Comm: swapper/0 Not tainted 5.15.0-rc3-00002-g7eab985fa8cd-dirty #163
> Hardware name: Freescale i.MX6 Ultralite (Device Tree)
> [<c010fa90>] (unwind_backtrace) from [<c010b8fc>] (show_stack+0x10/0x14)
> [<c010b8fc>] (show_stack) from [<c0b34998>] (dump_stack_lvl+0x58/0x70)
> [<c0b34998>] (dump_stack_lvl) from [<c0123884>] (__warn+0xd4/0x154)
> [<c0123884>] (__warn) from [<c0b2e9ac>] (warn_slowpath_fmt+0x94/0xbc)
> [<c0b2e9ac>] (warn_slowpath_fmt) from [<c0b428c8>] (down_read+0x160/0x184)
> [<c0b428c8>] (down_read) from [<c06df42c>] (mtd_read_oob+0xc8/0x284)
> [<c06df42c>] (mtd_read_oob) from [<c06f28cc>] (scan_block_fast+0x78/0xf4)
> [<c06f28cc>] (scan_block_fast) from [<c06f3c14>] (search_bbt+0x268/0x304)
> [<c06f3c14>] (search_bbt) from [<c06f3ef0>] (nand_create_bbt+0x240/0x6dc)
> [<c06f3ef0>] (nand_create_bbt) from [<c06fa758>] (gpmi_nand_probe+0x568/0x6d0)
> [<c06fa758>] (gpmi_nand_probe) from [<c068731c>] (platform_probe+0x58/0xb8)
> [<c068731c>] (platform_probe) from [<c0684a48>] (really_probe.part.0+0x9c/0x32c)
> [<c0684a48>] (really_probe.part.0) from [<c0684d78>] (__driver_probe_device+0xa0/0x138)
> [<c0684d78>] (__driver_probe_device) from [<c0684e40>] (driver_probe_device+0x30/0x10c)
> [<c0684e40>] (driver_probe_device) from [<c0685510>] (__driver_attach+0xb4/0x174)
> [<c0685510>] (__driver_attach) from [<c0682b84>] (bus_for_each_dev+0x78/0xb8)
> [<c0682b84>] (bus_for_each_dev) from [<c0683ea4>] (bus_add_driver+0xf0/0x1d8)
> [<c0683ea4>] (bus_add_driver) from [<c0686270>] (driver_register+0x88/0x118)
> [<c0686270>] (driver_register) from [<c0102108>] (do_one_initcall+0x74/0x378)
> [<c0102108>] (do_one_initcall) from [<c1001304>] (kernel_init_freeable+0x2a4/0x348)
> [<c1001304>] (kernel_init_freeable) from [<c0b3bc34>] (kernel_init+0x10/0x124)
> [<c0b3bc34>] (kernel_init) from [<c010011c>] (ret_from_fork+0x14/0x38)
> Exception stack(0xc40a7fb0 to 0xc40a7ff8)
> 7fa0: ???????? ???????? ???????? ????????
> 7fc0: ???????? ???????? ???????? ???????? ???????? ???????? ???????? ????????
> 7fe0: ???????? ???????? ???????? ???????? ???????? ????????
> irq event stamp: 110936
> hardirqs last  enabled at (110935): [<c0100b04>] __irq_svc+0x64/0x80
> hardirqs last disabled at (110936): [<c0100af0>] __irq_svc+0x50/0x80
> softirqs last  enabled at (110906): [<c093818c>] netlink_insert+0x60/0x8a4
> softirqs last disabled at (110904): [<c08ac6d4>] release_sock+0x18/0x98
> ---[ end trace c70a1a8da23b619d ]---
> Bad block table found at page 131008, version 0x01
> Bad block table found at page 130944, version 0x01
> ---
>
> Sean Nyekjaer (3):
> mtd: core: protect access to MTD devices while in suspend
> mtd: rawnand: remove suspended check
> mtd: mtdconcat: add suspend lock handling

Patch 3 (with the suggested changes) should be moved to the beginning of
the series to keep things bisectable.

>
> drivers/mtd/mtdconcat.c | 11 ++-
> drivers/mtd/mtdcore.c | 133 +++++++++++++++++++++++++++----
> drivers/mtd/nand/raw/nand_base.c | 52 +++---------
> include/linux/mtd/mtd.h | 109 +++++++++++++++++++++----
> include/linux/mtd/rawnand.h | 5 +-
> 5 files changed, 234 insertions(+), 76 deletions(-)
>

2021-10-15 12:47:36

by Miquel Raynal

Subject: Re: [PATCH 0/3] mtd: core: protect access to mtd devices while in suspend

Hi Sean,

[email protected] wrote on Mon, 11 Oct 2021 16:05:46 +0200:

> On Mon, 11 Oct 2021 13:52:50 +0200
> Sean Nyekjaer <[email protected]> wrote:
>
> > Follow-up on discussion in https://lkml.org/lkml/2021/10/4/41
> >
> > Changes since RFC v1/v2:
> > - added access protection for all device access hooks in mtd_info.
> > - added Suggested-by to [1/3] patch.
> > - removed references to commit ef347c0cfd61 ("mtd: rawnand: gpmi: Implement exec_op")
> > from the commit message, as commit 013e6292aaf5 ("mtd: rawnand: Simplify the locking")
> > is the one to blame.
> > - tested on a kernel with LOCKDEP enabled.
> >
> > @Miquel: I haven't covered every ioctl; to me it looks like they don't have
> > direct device access.

Yes indeed it looks like they re-use most of the mtdcore.c functions,
so it should be fine.

> > One (small) issue is still present: gpmi_nand.c uses the rwsem before it's
> > initialized. It seems cumbersome to have every mtd/nand driver call
> > init_waitqueue_head() and init_rwsem(). Could we somehow move the call
> > to mtd_set_dev_defaults() before nand_create_bbt()?
>
> I have a nasty trick for that one, but I'm not sure Miquel will like it
> (actually, I don't like it either, but it's so simple compared to the
> other options we have that I'm tempted to go for this approach until
> someone has time to invest in a cleaner solution :-)):
>
> diff --git a/drivers/mtd/nand/raw/nand_base.c b/drivers/mtd/nand/raw/nand_base.c
> index 3d6c6e880520..a9ac2d528a4d 100644
> --- a/drivers/mtd/nand/raw/nand_base.c
> +++ b/drivers/mtd/nand/raw/nand_base.c
> @@ -6222,8 +6222,6 @@ static int nand_scan_tail(struct nand_chip *chip)
> mtd->_sync = nand_sync;
> mtd->_lock = nand_lock;
> mtd->_unlock = nand_unlock;
> - mtd->_suspend = nand_suspend;
> - mtd->_resume = nand_resume;
> mtd->_reboot = nand_shutdown;
> mtd->_block_isreserved = nand_block_isreserved;
> mtd->_block_isbad = nand_block_isbad;
> @@ -6269,6 +6267,13 @@ static int nand_scan_tail(struct nand_chip *chip)
> if (ret)
> goto err_free_secure_regions;
>
> + /*
> + * Populate the suspend/resume hooks after the BBT has been scanned to
> + * avoid using the suspend lock and resume waitqueue which are only
> + * initialized when mtd_device_register() is called.
> + */
> + mtd->_suspend = nand_suspend;
> + mtd->_resume = nand_resume;
> return 0;

I'm fine with this as long as it is documented for now.

Thanks,
Miquèl

2021-10-19 18:10:37

by Sean Nyekjaer

Subject: Re: [PATCH 0/3] mtd: core: protect access to mtd devices while in suspend

On Fri, Oct 15, 2021 at 08:22:06AM +0200, Miquel Raynal wrote:
> Hi Sean,
>
> [email protected] wrote on Mon, 11 Oct 2021 16:05:46 +0200:
>
> > On Mon, 11 Oct 2021 13:52:50 +0200

[ ... ]

>
> > > One (small) issue is still present: gpmi_nand.c uses the rwsem before it's
> > > initialized. It seems cumbersome to have every mtd/nand driver call
> > > init_waitqueue_head() and init_rwsem(). Could we somehow move the call
> > > to mtd_set_dev_defaults() before nand_create_bbt()?
> >
> > I have a nasty trick for that one, but I'm not sure Miquel will like it
> > (actually, I don't like it either, but it's so simple compared to the
> > other options we have that I'm tempted to go for this approach until
> > someone has time to invest in a cleaner solution :-)):
> >
> > diff --git a/drivers/mtd/nand/raw/nand_base.c b/drivers/mtd/nand/raw/nand_base.c
> > index 3d6c6e880520..a9ac2d528a4d 100644
> > --- a/drivers/mtd/nand/raw/nand_base.c
> > +++ b/drivers/mtd/nand/raw/nand_base.c
> > @@ -6222,8 +6222,6 @@ static int nand_scan_tail(struct nand_chip *chip)
> > mtd->_sync = nand_sync;
> > mtd->_lock = nand_lock;
> > mtd->_unlock = nand_unlock;
> > - mtd->_suspend = nand_suspend;
> > - mtd->_resume = nand_resume;
> > mtd->_reboot = nand_shutdown;
> > mtd->_block_isreserved = nand_block_isreserved;
> > mtd->_block_isbad = nand_block_isbad;
> > @@ -6269,6 +6267,13 @@ static int nand_scan_tail(struct nand_chip *chip)
> > if (ret)
> > goto err_free_secure_regions;
> >
> > + /*
> > + * Populate the suspend/resume hooks after the BBT has been scanned to
> > + * avoid using the suspend lock and resume waitqueue which are only
> > + * initialized when mtd_device_register() is called.
> > + */
> > + mtd->_suspend = nand_suspend;
> > + mtd->_resume = nand_resume;
> > return 0;
>
> I'm fine with this as long as it is documented for now.
>

Hi Boris and Miquel,

gpmi-nand.c sets NAND_SKIP_BBTSCAN, so we won't get there and populate
the suspend/resume hooks :(
I guess there are other drivers that do the same thing...

I have looked through the history and tried git blame on gpmi-nand.c, but
I can't see why NAND_SKIP_BBTSCAN is set.

I have tested Boris' hack with this diff (and it seems to work):

diff --git a/drivers/mtd/nand/raw/gpmi-nand/gpmi-nand.c b/drivers/mtd/nand/raw/gpmi-nand/gpmi-nand.c
index 4d08e4ab5c1b..ec7f44d630b6 100644
--- a/drivers/mtd/nand/raw/gpmi-nand/gpmi-nand.c
+++ b/drivers/mtd/nand/raw/gpmi-nand/gpmi-nand.c
@@ -2073,8 +2073,6 @@ static int gpmi_nand_attach_chip(struct nand_chip *chip)
if (ret)
return ret;

- chip->options |= NAND_SKIP_BBTSCAN;
-
return 0;
}

@@ -2460,9 +2459,6 @@ static int gpmi_nand_init(struct gpmi_nand_data *this)
goto err_out;

ret = nand_boot_init(this);
- if (ret)
- goto err_nand_cleanup;
- ret = nand_create_bbt(chip);
if (ret)
goto err_nand_cleanup;


Am I missing something?

/Sean

2021-10-20 06:54:37

by Boris Brezillon

Subject: Re: [PATCH 0/3] mtd: core: protect access to mtd devices while in suspend

On Tue, 19 Oct 2021 20:08:00 +0200
Sean Nyekjaer <[email protected]> wrote:

> On Fri, Oct 15, 2021 at 08:22:06AM +0200, Miquel Raynal wrote:
> > Hi Sean,
> >
> > [email protected] wrote on Mon, 11 Oct 2021 16:05:46 +0200:
> >
> > > On Mon, 11 Oct 2021 13:52:50 +0200
>
> [ ... ]
>
> >
> > > > One (small) issue is still present: gpmi_nand.c uses the rwsem before it's
> > > > initialized. It seems cumbersome to have every mtd/nand driver call
> > > > init_waitqueue_head() and init_rwsem(). Could we somehow move the call
> > > > to mtd_set_dev_defaults() before nand_create_bbt()?
> > >
> > > I have a nasty trick for that one, but I'm not sure Miquel will like it
> > > (actually, I don't like it either, but it's so simple compared to the
> > > other options we have that I'm tempted to go for this approach until
> > > someone has time to invest in a cleaner solution :-)):
> > >
> > > diff --git a/drivers/mtd/nand/raw/nand_base.c b/drivers/mtd/nand/raw/nand_base.c
> > > index 3d6c6e880520..a9ac2d528a4d 100644
> > > --- a/drivers/mtd/nand/raw/nand_base.c
> > > +++ b/drivers/mtd/nand/raw/nand_base.c
> > > @@ -6222,8 +6222,6 @@ static int nand_scan_tail(struct nand_chip *chip)
> > > mtd->_sync = nand_sync;
> > > mtd->_lock = nand_lock;
> > > mtd->_unlock = nand_unlock;
> > > - mtd->_suspend = nand_suspend;
> > > - mtd->_resume = nand_resume;
> > > mtd->_reboot = nand_shutdown;
> > > mtd->_block_isreserved = nand_block_isreserved;
> > > mtd->_block_isbad = nand_block_isbad;
> > > @@ -6269,6 +6267,13 @@ static int nand_scan_tail(struct nand_chip *chip)
> > > if (ret)
> > > goto err_free_secure_regions;
> > >
> > > + /*
> > > + * Populate the suspend/resume hooks after the BBT has been scanned to
> > > + * avoid using the suspend lock and resume waitqueue which are only
> > > + * initialized when mtd_device_register() is called.
> > > + */
> > > + mtd->_suspend = nand_suspend;
> > > + mtd->_resume = nand_resume;
> > > return 0;
> >
> > I'm fine with this as long as it is documented for now.
> >
>
> Hi Boris and Miquel,
>
> gpmi-nand.c sets NAND_SKIP_BBTSCAN, so we won't get there and populate
> the suspend/resume hooks :(
> I guess there are other drivers that do the same thing...

This should fix the issue:

diff --git a/drivers/mtd/nand/raw/nand_base.c b/drivers/mtd/nand/raw/nand_base.c
index 3d6c6e880520..c92b17f66994 100644
--- a/drivers/mtd/nand/raw/nand_base.c
+++ b/drivers/mtd/nand/raw/nand_base.c
@@ -6222,8 +6222,6 @@ static int nand_scan_tail(struct nand_chip *chip)
mtd->_sync = nand_sync;
mtd->_lock = nand_lock;
mtd->_unlock = nand_unlock;
- mtd->_suspend = nand_suspend;
- mtd->_resume = nand_resume;
mtd->_reboot = nand_shutdown;
mtd->_block_isreserved = nand_block_isreserved;
mtd->_block_isbad = nand_block_isbad;
@@ -6262,13 +6260,21 @@ static int nand_scan_tail(struct nand_chip *chip)

/* Check, if we should skip the bad block table scan */
if (chip->options & NAND_SKIP_BBTSCAN)
- return 0;
+ goto out;

/* Build bad block table */
ret = nand_create_bbt(chip);
if (ret)
goto err_free_secure_regions;

+out:
+ /*
+ * Populate the suspend/resume hooks after the BBT has been scanned to
+ * avoid using the suspend lock and resume waitqueue which are only
+ * initialized when mtd_device_register() is called.
+ */
+ mtd->_suspend = nand_suspend;
+ mtd->_resume = nand_resume;
return 0;

err_free_secure_regions:

2021-10-20 07:04:28

by Boris Brezillon

Subject: Re: [PATCH 0/3] mtd: core: protect access to mtd devices while in suspend

On Wed, 20 Oct 2021 08:52:50 +0200
Boris Brezillon <[email protected]> wrote:

> On Tue, 19 Oct 2021 20:08:00 +0200
> Sean Nyekjaer <[email protected]> wrote:
>
> > On Fri, Oct 15, 2021 at 08:22:06AM +0200, Miquel Raynal wrote:
> > > Hi Sean,
> > >
> > > [email protected] wrote on Mon, 11 Oct 2021 16:05:46 +0200:
> > >
> > > > On Mon, 11 Oct 2021 13:52:50 +0200
> >
> > [ ... ]
> >
> > >
> > > > > One (small) issue is still present: gpmi_nand.c uses the rwsem before it's
> > > > > initialized. It seems cumbersome to have every mtd/nand driver call
> > > > > init_waitqueue_head() and init_rwsem(). Could we somehow move the call
> > > > > to mtd_set_dev_defaults() before nand_create_bbt()?
> > > >
> > > > I have a nasty trick for that one, but I'm not sure Miquel will like it
> > > > (actually, I don't like it either, but it's so simple compared to the
> > > > other options we have that I'm tempted to go for this approach until
> > > > someone has time to invest in a cleaner solution :-)):
> > > >
> > > > diff --git a/drivers/mtd/nand/raw/nand_base.c b/drivers/mtd/nand/raw/nand_base.c
> > > > index 3d6c6e880520..a9ac2d528a4d 100644
> > > > --- a/drivers/mtd/nand/raw/nand_base.c
> > > > +++ b/drivers/mtd/nand/raw/nand_base.c
> > > > @@ -6222,8 +6222,6 @@ static int nand_scan_tail(struct nand_chip *chip)
> > > > mtd->_sync = nand_sync;
> > > > mtd->_lock = nand_lock;
> > > > mtd->_unlock = nand_unlock;
> > > > - mtd->_suspend = nand_suspend;
> > > > - mtd->_resume = nand_resume;
> > > > mtd->_reboot = nand_shutdown;
> > > > mtd->_block_isreserved = nand_block_isreserved;
> > > > mtd->_block_isbad = nand_block_isbad;
> > > > @@ -6269,6 +6267,13 @@ static int nand_scan_tail(struct nand_chip *chip)
> > > > if (ret)
> > > > goto err_free_secure_regions;
> > > >
> > > > + /*
> > > > + * Populate the suspend/resume hooks after the BBT has been scanned to
> > > > + * avoid using the suspend lock and resume waitqueue which are only
> > > > + * initialized when mtd_device_register() is called.
> > > > + */
> > > > + mtd->_suspend = nand_suspend;
> > > > + mtd->_resume = nand_resume;
> > > > return 0;
> > >
> > > I'm fine with this as long as it is documented for now.
> > >
> >
> > Hi Boris and Miquel,
> >
> > gpmi-nand.c sets NAND_SKIP_BBTSCAN, so we won't get there and populate
> > the suspend/resume hooks :(
> > I guess there are other drivers that do the same thing...
>
> This should fix the issue:
>
> diff --git a/drivers/mtd/nand/raw/nand_base.c b/drivers/mtd/nand/raw/nand_base.c
> index 3d6c6e880520..c92b17f66994 100644
> --- a/drivers/mtd/nand/raw/nand_base.c
> +++ b/drivers/mtd/nand/raw/nand_base.c
> @@ -6222,8 +6222,6 @@ static int nand_scan_tail(struct nand_chip *chip)
> mtd->_sync = nand_sync;
> mtd->_lock = nand_lock;
> mtd->_unlock = nand_unlock;
> - mtd->_suspend = nand_suspend;
> - mtd->_resume = nand_resume;
> mtd->_reboot = nand_shutdown;
> mtd->_block_isreserved = nand_block_isreserved;
> mtd->_block_isbad = nand_block_isbad;
> @@ -6262,13 +6260,21 @@ static int nand_scan_tail(struct nand_chip *chip)
>
> /* Check, if we should skip the bad block table scan */
> if (chip->options & NAND_SKIP_BBTSCAN)
> - return 0;
> + goto out;
>
> /* Build bad block table */
> ret = nand_create_bbt(chip);
> if (ret)
> goto err_free_secure_regions;
>
> +out:
> + /*
> + * Populate the suspend/resume hooks after the BBT has been scanned to
> + * avoid using the suspend lock and resume waitqueue which are only
> + * initialized when mtd_device_register() is called.
> + */
> + mtd->_suspend = nand_suspend;
> + mtd->_resume = nand_resume;
> return 0;
>
> err_free_secure_regions:

Actually, this version is even cleaner:

diff --git a/drivers/mtd/nand/raw/nand_base.c b/drivers/mtd/nand/raw/nand_base.c
index 3d6c6e880520..98c39b7f6279 100644
--- a/drivers/mtd/nand/raw/nand_base.c
+++ b/drivers/mtd/nand/raw/nand_base.c
@@ -6222,8 +6222,6 @@ static int nand_scan_tail(struct nand_chip *chip)
mtd->_sync = nand_sync;
mtd->_lock = nand_lock;
mtd->_unlock = nand_unlock;
- mtd->_suspend = nand_suspend;
- mtd->_resume = nand_resume;
mtd->_reboot = nand_shutdown;
mtd->_block_isreserved = nand_block_isreserved;
mtd->_block_isbad = nand_block_isbad;
@@ -6261,14 +6259,20 @@ static int nand_scan_tail(struct nand_chip *chip)
goto err_free_interface_config;

/* Check, if we should skip the bad block table scan */
- if (chip->options & NAND_SKIP_BBTSCAN)
- return 0;
-
- /* Build bad block table */
- ret = nand_create_bbt(chip);
- if (ret)
- goto err_free_secure_regions;
+ if (!(chip->options & NAND_SKIP_BBTSCAN)) {
+ /* Build bad block table */
+ ret = nand_create_bbt(chip);
+ if (ret)
+ goto err_free_secure_regions;
+ }

+ /*
+ * Populate the suspend/resume hooks after the BBT has been scanned to
+ * avoid using the suspend lock and resume waitqueue which are only
+ * initialized when mtd_device_register() is called.
+ */
+ mtd->_suspend = nand_suspend;
+ mtd->_resume = nand_resume;
return 0;

err_free_secure_regions:
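
The intended control flow can be modelled in userspace as follows: the BBT scan runs unless the skip flag is set, and the suspend hook is populated afterwards on both paths. This is a sketch under assumptions, not the kernel code. The fake_* names and SKIP_BBTSCAN are hypothetical stand-ins, and the model assumes the core only touches the suspend lock when the suspend hook is present.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define SKIP_BBTSCAN 0x1	/* hypothetical stand-in for NAND_SKIP_BBTSCAN */

struct fake_mtd {
	int (*suspend)(struct fake_mtd *mtd);
	bool lock_ready;	/* would be set when the device is registered */
	int unsafe_lock_uses;	/* lock touched before it was initialized */
};

static int fake_suspend(struct fake_mtd *mtd)
{
	(void)mtd;
	return 0;
}

/* Assumed core behaviour: the lock is only used when the hook is set. */
static void fake_read(struct fake_mtd *mtd)
{
	if (mtd->suspend && !mtd->lock_ready)
		mtd->unsafe_lock_uses++;
}

/* The BBT scan reads from the device before registration. */
static void fake_create_bbt(struct fake_mtd *mtd)
{
	fake_read(mtd);
}

static int fake_scan_tail(struct fake_mtd *mtd, unsigned int options)
{
	mtd->suspend = NULL;

	/* Scan unless the driver asked to skip it. */
	if (!(options & SKIP_BBTSCAN))
		fake_create_bbt(mtd);

	/* Populated on both paths, after any scan has finished. */
	mtd->suspend = fake_suspend;
	return 0;
}
```

In this model the hook ends up set whether or not the scan is skipped, and the scan itself never touches the uninitialized lock.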

2021-10-20 07:13:45

by Miquel Raynal

Subject: Re: [PATCH 0/3] mtd: core: protect access to mtd devices while in suspend

Hi Boris,

[email protected] wrote on Wed, 20 Oct 2021 09:00:58 +0200:

> On Wed, 20 Oct 2021 08:52:50 +0200
> Boris Brezillon <[email protected]> wrote:
>
> > On Tue, 19 Oct 2021 20:08:00 +0200
> > Sean Nyekjaer <[email protected]> wrote:
> >
> > > On Fri, Oct 15, 2021 at 08:22:06AM +0200, Miquel Raynal wrote:
> > > > Hi Sean,
> > > >
> > > > [email protected] wrote on Mon, 11 Oct 2021 16:05:46 +0200:
> > > >
> > > > > On Mon, 11 Oct 2021 13:52:50 +0200
> > >
> > > [ ... ]
> > >
> > > >
> > > > > > One (small) issue is still present: gpmi_nand.c uses the rwsem before it's
> > > > > > initialized. It seems cumbersome to have every mtd/nand driver call
> > > > > > init_waitqueue_head() and init_rwsem(). Could we somehow move the call
> > > > > > to mtd_set_dev_defaults() before nand_create_bbt()?
> > > > >
> > > > > I have a nasty trick for that one, but I'm not sure Miquel will like it
> > > > > (actually, I don't like it either, but it's so simple compared to the
> > > > > other options we have that I'm tempted to go for this approach until
> > > > > someone has time to invest in a cleaner solution :-)):
> > > > >
> > > > > diff --git a/drivers/mtd/nand/raw/nand_base.c b/drivers/mtd/nand/raw/nand_base.c
> > > > > index 3d6c6e880520..a9ac2d528a4d 100644
> > > > > --- a/drivers/mtd/nand/raw/nand_base.c
> > > > > +++ b/drivers/mtd/nand/raw/nand_base.c
> > > > > @@ -6222,8 +6222,6 @@ static int nand_scan_tail(struct nand_chip *chip)
> > > > > mtd->_sync = nand_sync;
> > > > > mtd->_lock = nand_lock;
> > > > > mtd->_unlock = nand_unlock;
> > > > > - mtd->_suspend = nand_suspend;
> > > > > - mtd->_resume = nand_resume;
> > > > > mtd->_reboot = nand_shutdown;
> > > > > mtd->_block_isreserved = nand_block_isreserved;
> > > > > mtd->_block_isbad = nand_block_isbad;
> > > > > @@ -6269,6 +6267,13 @@ static int nand_scan_tail(struct nand_chip *chip)
> > > > > if (ret)
> > > > > goto err_free_secure_regions;
> > > > >
> > > > > + /*
> > > > > + * Populate the suspend/resume hooks after the BBT has been scanned to
> > > > > + * avoid using the suspend lock and resume waitqueue which are only
> > > > > + * initialized when mtd_device_register() is called.
> > > > > + */
> > > > > + mtd->_suspend = nand_suspend;
> > > > > + mtd->_resume = nand_resume;
> > > > > return 0;
> > > >
> > > > I'm fine with this as long as it is documented for now.
> > > >
> > >
> > > Hi Boris and Miquel,
> > >
> > > gpmi-nand.c sets NAND_SKIP_BBTSCAN, so we won't get there and populate
> > > the suspend/resume hooks :(
> > > I guess there are other drivers that do the same thing...
> >
> > This should fix the issue:
> >
> > diff --git a/drivers/mtd/nand/raw/nand_base.c b/drivers/mtd/nand/raw/nand_base.c
> > index 3d6c6e880520..c92b17f66994 100644
> > --- a/drivers/mtd/nand/raw/nand_base.c
> > +++ b/drivers/mtd/nand/raw/nand_base.c
> > @@ -6222,8 +6222,6 @@ static int nand_scan_tail(struct nand_chip *chip)
> > mtd->_sync = nand_sync;
> > mtd->_lock = nand_lock;
> > mtd->_unlock = nand_unlock;
> > - mtd->_suspend = nand_suspend;
> > - mtd->_resume = nand_resume;
> > mtd->_reboot = nand_shutdown;
> > mtd->_block_isreserved = nand_block_isreserved;
> > mtd->_block_isbad = nand_block_isbad;
> > @@ -6262,13 +6260,21 @@ static int nand_scan_tail(struct nand_chip *chip)
> >
> > /* Check, if we should skip the bad block table scan */
> > if (chip->options & NAND_SKIP_BBTSCAN)
> > - return 0;
> > + goto out;
> >
> > /* Build bad block table */
> > ret = nand_create_bbt(chip);
> > if (ret)
> > goto err_free_secure_regions;
> >
> > +out:
> > + /*
> > + * Populate the suspend/resume hooks after the BBT has been scanned to
> > + * avoid using the suspend lock and resume waitqueue which are only
> > + * initialized when mtd_device_register() is called.
> > + */
> > + mtd->_suspend = nand_suspend;
> > + mtd->_resume = nand_resume;
> > return 0;
> >
> > err_free_secure_regions:
>
> Actually, this version is even cleaner:
>
> diff --git a/drivers/mtd/nand/raw/nand_base.c b/drivers/mtd/nand/raw/nand_base.c
> index 3d6c6e880520..98c39b7f6279 100644
> --- a/drivers/mtd/nand/raw/nand_base.c
> +++ b/drivers/mtd/nand/raw/nand_base.c
> @@ -6222,8 +6222,6 @@ static int nand_scan_tail(struct nand_chip *chip)
> mtd->_sync = nand_sync;
> mtd->_lock = nand_lock;
> mtd->_unlock = nand_unlock;
> - mtd->_suspend = nand_suspend;
> - mtd->_resume = nand_resume;
> mtd->_reboot = nand_shutdown;
> mtd->_block_isreserved = nand_block_isreserved;
> mtd->_block_isbad = nand_block_isbad;
> @@ -6261,14 +6259,20 @@ static int nand_scan_tail(struct nand_chip *chip)
> goto err_free_interface_config;
>
> /* Check, if we should skip the bad block table scan */
> - if (chip->options & NAND_SKIP_BBTSCAN)
> - return 0;
> -
> - /* Build bad block table */
> - ret = nand_create_bbt(chip);
> - if (ret)
> - goto err_free_secure_regions;
> + if (!(chip->options & NAND_SKIP_BBTSCAN)) {
> + /* Build bad block table */
> + ret = nand_create_bbt(chip);
> + if (ret)
> + goto err_free_secure_regions;
> + }

Nice idea.

>
> + /*
> + * Populate the suspend/resume hooks after the BBT has been scanned to
> + * avoid using the suspend lock and resume waitqueue which are only
> + * initialized when mtd_device_register() is called.
> + */
> + mtd->_suspend = nand_suspend;
> + mtd->_resume = nand_resume;
> return 0;
>
> err_free_secure_regions:

Cheers,
Miquèl

2021-10-20 07:14:08

by Sean Nyekjaer

Subject: Re: [PATCH 0/3] mtd: core: protect access to mtd devices while in suspend

On Wed, Oct 20, 2021 at 09:00:58AM +0200, Boris Brezillon wrote:
> On Wed, 20 Oct 2021 08:52:50 +0200
> Boris Brezillon wrote:
>

[ ... ]
> > >
> > > Hi Boris and Miquel,
> > >
> > > gpmi-nand.c sets NAND_SKIP_BBTSCAN so we won't get there and populate
> > > the suspend/resume hooks :(
> > > I guess there are other drivers that do the same thing...
>
> Actually, this version is even cleaner:
>
> diff --git a/drivers/mtd/nand/raw/nand_base.c b/drivers/mtd/nand/raw/nand_base.c
> index 3d6c6e880520..98c39b7f6279 100644
> --- a/drivers/mtd/nand/raw/nand_base.c
> +++ b/drivers/mtd/nand/raw/nand_base.c
> @@ -6222,8 +6222,6 @@ static int nand_scan_tail(struct nand_chip *chip)
> mtd->_sync = nand_sync;
> mtd->_lock = nand_lock;
> mtd->_unlock = nand_unlock;
> - mtd->_suspend = nand_suspend;
> - mtd->_resume = nand_resume;
> mtd->_reboot = nand_shutdown;
> mtd->_block_isreserved = nand_block_isreserved;
> mtd->_block_isbad = nand_block_isbad;
> @@ -6261,14 +6259,20 @@ static int nand_scan_tail(struct nand_chip *chip)
> goto err_free_interface_config;
>
> /* Check, if we should skip the bad block table scan */
> - if (chip->options & NAND_SKIP_BBTSCAN)
> - return 0;
> -
> - /* Build bad block table */
> - ret = nand_create_bbt(chip);
> - if (ret)
> - goto err_free_secure_regions;
> + if (!(chip->options & NAND_SKIP_BBTSCAN)) {
> + /* Build bad block table */
> + ret = nand_create_bbt(chip);
> + if (ret)
> + goto err_free_secure_regions;
> + }
>
> + /*
> + * Populate the suspend/resume hooks after the BBT has been scanned to
> + * avoid using the suspend lock and resume waitqueue which are only
> + * initialized when mtd_device_register() is called.
> + */
> + mtd->_suspend = nand_suspend;
> + mtd->_resume = nand_resume;
> return 0;
>
> err_free_secure_regions:

Why do gpmi-nand.c and other drivers set NAND_SKIP_BBTSCAN and then
call nand_create_bbt() directly?

To me it looks like legacy leftover...

/Sean

2021-10-20 07:24:46

by Miquel Raynal

Subject: Re: [PATCH 0/3] mtd: core: protect access to mtd devices while in suspend

Hi Sean,

Sean Nyekjaer wrote on Wed, 20 Oct 2021 09:12:35 +0200:

> On Wed, Oct 20, 2021 at 09:00:58AM +0200, Boris Brezillon wrote:
> > On Wed, 20 Oct 2021 08:52:50 +0200
> > Boris Brezillon wrote:
> >
>
> [ ... ]
> > > >
> > > > Hi Boris and Miquel,
> > > >
> > > > gpmi-nand.c sets NAND_SKIP_BBTSCAN so we won't get there and populate
> > > > the suspend/resume hooks :(
> > > > I guess there are other drivers that do the same thing...
> >
> > Actually, this version is even cleaner:
> >
> > diff --git a/drivers/mtd/nand/raw/nand_base.c b/drivers/mtd/nand/raw/nand_base.c
> > index 3d6c6e880520..98c39b7f6279 100644
> > --- a/drivers/mtd/nand/raw/nand_base.c
> > +++ b/drivers/mtd/nand/raw/nand_base.c
> > @@ -6222,8 +6222,6 @@ static int nand_scan_tail(struct nand_chip *chip)
> > mtd->_sync = nand_sync;
> > mtd->_lock = nand_lock;
> > mtd->_unlock = nand_unlock;
> > - mtd->_suspend = nand_suspend;
> > - mtd->_resume = nand_resume;
> > mtd->_reboot = nand_shutdown;
> > mtd->_block_isreserved = nand_block_isreserved;
> > mtd->_block_isbad = nand_block_isbad;
> > @@ -6261,14 +6259,20 @@ static int nand_scan_tail(struct nand_chip *chip)
> > goto err_free_interface_config;
> >
> > /* Check, if we should skip the bad block table scan */
> > - if (chip->options & NAND_SKIP_BBTSCAN)
> > - return 0;
> > -
> > - /* Build bad block table */
> > - ret = nand_create_bbt(chip);
> > - if (ret)
> > - goto err_free_secure_regions;
> > + if (!(chip->options & NAND_SKIP_BBTSCAN)) {
> > + /* Build bad block table */
> > + ret = nand_create_bbt(chip);
> > + if (ret)
> > + goto err_free_secure_regions;
> > + }
> >
> > + /*
> > + * Populate the suspend/resume hooks after the BBT has been scanned to
> > + * avoid using the suspend lock and resume waitqueue which are only
> > + * initialized when mtd_device_register() is called.
> > + */
> > + mtd->_suspend = nand_suspend;
> > + mtd->_resume = nand_resume;
> > return 0;
> >
> > err_free_secure_regions:
>
> Why do gpmi-nand.c and other drivers set NAND_SKIP_BBTSCAN and then
> call nand_create_bbt() directly?
>
> To me it looks like legacy leftover...

It is...

885d71e5838f ("mtd: gpmi: fix the NULL pointer") added this flag to
avoid a NULL pointer dereference on chip->buffer->databuf on mx23,
which IIRC does not exist anymore.

I believe we could get rid of it. But anyway, I like the "cleanup"
proposed by Boris so IMHO you can add these two patches to the series.

Thanks,
Miquèl

2021-10-20 07:25:42

by Sean Nyekjaer

Subject: Re: [PATCH 0/3] mtd: core: protect access to mtd devices while in suspend

On Wed, Oct 20, 2021 at 09:12:28AM +0200, Miquel Raynal wrote:
> >
> > Actually, this version is even cleaner:
> >
> > diff --git a/drivers/mtd/nand/raw/nand_base.c b/drivers/mtd/nand/raw/nand_base.c
> > index 3d6c6e880520..98c39b7f6279 100644
> > --- a/drivers/mtd/nand/raw/nand_base.c
> > +++ b/drivers/mtd/nand/raw/nand_base.c
> > @@ -6222,8 +6222,6 @@ static int nand_scan_tail(struct nand_chip *chip)
> > mtd->_sync = nand_sync;
> > mtd->_lock = nand_lock;
> > mtd->_unlock = nand_unlock;
> > - mtd->_suspend = nand_suspend;
> > - mtd->_resume = nand_resume;
> > mtd->_reboot = nand_shutdown;
> > mtd->_block_isreserved = nand_block_isreserved;
> > mtd->_block_isbad = nand_block_isbad;
> > @@ -6261,14 +6259,20 @@ static int nand_scan_tail(struct nand_chip *chip)
> > goto err_free_interface_config;
> >
> > /* Check, if we should skip the bad block table scan */
> > - if (chip->options & NAND_SKIP_BBTSCAN)
> > - return 0;
> > -
> > - /* Build bad block table */
> > - ret = nand_create_bbt(chip);
> > - if (ret)
> > - goto err_free_secure_regions;
> > + if (!(chip->options & NAND_SKIP_BBTSCAN)) {
> > + /* Build bad block table */
> > + ret = nand_create_bbt(chip);
> > + if (ret)
> > + goto err_free_secure_regions;
> > + }
>
> Nice idea.
>
> >
> > + /*
> > + * Populate the suspend/resume hooks after the BBT has been scanned to
> > + * avoid using the suspend lock and resume waitqueue which are only
> > + * initialized when mtd_device_register() is called.
> > + */
> > + mtd->_suspend = nand_suspend;
> > + mtd->_resume = nand_resume;
> > return 0;
> >
> > err_free_secure_regions:

Could be a nice idea, but it doesn't work...
gpmi-nand.c calls nand_create_bbt() after this has run ;)

/Sean

2021-10-20 07:31:45

by Boris Brezillon

Subject: Re: [PATCH 0/3] mtd: core: protect access to mtd devices while in suspend

On Wed, 20 Oct 2021 09:12:35 +0200
Sean Nyekjaer wrote:

> On Wed, Oct 20, 2021 at 09:00:58AM +0200, Boris Brezillon wrote:
> > On Wed, 20 Oct 2021 08:52:50 +0200
> > Boris Brezillon wrote:
> >
>
> [ ... ]
> > > >
> > > > Hi Boris and Miquel,
> > > >
> > > > gpmi-nand.c sets NAND_SKIP_BBTSCAN so we won't get there and populate
> > > > the suspend/resume hooks :(
> > > > I guess there are other drivers that do the same thing...
> >
> > Actually, this version is even cleaner:
> >
> > diff --git a/drivers/mtd/nand/raw/nand_base.c b/drivers/mtd/nand/raw/nand_base.c
> > index 3d6c6e880520..98c39b7f6279 100644
> > --- a/drivers/mtd/nand/raw/nand_base.c
> > +++ b/drivers/mtd/nand/raw/nand_base.c
> > @@ -6222,8 +6222,6 @@ static int nand_scan_tail(struct nand_chip *chip)
> > mtd->_sync = nand_sync;
> > mtd->_lock = nand_lock;
> > mtd->_unlock = nand_unlock;
> > - mtd->_suspend = nand_suspend;
> > - mtd->_resume = nand_resume;
> > mtd->_reboot = nand_shutdown;
> > mtd->_block_isreserved = nand_block_isreserved;
> > mtd->_block_isbad = nand_block_isbad;
> > @@ -6261,14 +6259,20 @@ static int nand_scan_tail(struct nand_chip *chip)
> > goto err_free_interface_config;
> >
> > /* Check, if we should skip the bad block table scan */
> > - if (chip->options & NAND_SKIP_BBTSCAN)
> > - return 0;
> > -
> > - /* Build bad block table */
> > - ret = nand_create_bbt(chip);
> > - if (ret)
> > - goto err_free_secure_regions;
> > + if (!(chip->options & NAND_SKIP_BBTSCAN)) {
> > + /* Build bad block table */
> > + ret = nand_create_bbt(chip);
> > + if (ret)
> > + goto err_free_secure_regions;
> > + }
> >
> > + /*
> > + * Populate the suspend/resume hooks after the BBT has been scanned to
> > + * avoid using the suspend lock and resume waitqueue which are only
> > + * initialized when mtd_device_register() is called.
> > + */
> > + mtd->_suspend = nand_suspend;
> > + mtd->_resume = nand_resume;
> > return 0;
> >
> > err_free_secure_regions:
>
> Why do gpmi-nand.c and other drivers set NAND_SKIP_BBTSCAN and then
> call nand_create_bbt() directly?

Dunno, but there's a nand_boot_init() call between the nand_scan() and
nand_create_bbt() calls, so I guess it has to do with something done in
this function...

>
> To me it looks like legacy leftover...

If I were you, I wouldn't take the risk of changing that in the same
patch series. The suspend/resume changes are already quite invasive,
so let's try to keep it as small/simple as possible.
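
The ordering problem discussed above can be sketched in plain userspace C.
This is only a toy model under stated assumptions: struct toy_mtd and every
toy_*() function are hypothetical stand-ins, not kernel code, and the model
assumes (as the thread suggests) that the MTD access helpers only touch the
suspend lock when a suspend hook is set, and that the lock becomes valid
only at registration time.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct mtd_info: only what the sketch needs. */
struct toy_mtd {
	int (*_suspend)(struct toy_mtd *mtd);
	bool lock_ready;	/* set at registration, like the real lock init */
	int bad_lock_uses;	/* accesses that touched the lock too early */
};

static int toy_suspend(struct toy_mtd *mtd)
{
	(void)mtd;
	return 0;
}

/* Models an MTD access helper: the lock is only used if a hook is set. */
static void toy_read_oob(struct toy_mtd *mtd)
{
	if (mtd->_suspend && !mtd->lock_ready)
		mtd->bad_lock_uses++;
}

/* Models the BBT scan, which reads the device through the MTD helpers. */
static void toy_create_bbt(struct toy_mtd *mtd)
{
	toy_read_oob(mtd);
}

/* Models a scan tail that populates the hook only after the BBT scan. */
static void toy_scan_tail(struct toy_mtd *mtd, bool skip_bbtscan)
{
	if (!skip_bbtscan)
		toy_create_bbt(mtd);
	mtd->_suspend = toy_suspend;
}

/* Models registration, the point where the lock becomes usable. */
static void toy_register(struct toy_mtd *mtd)
{
	mtd->lock_ready = true;
}

/* The core scans the BBT itself: the hook is still NULL during the scan. */
static int toy_probe_core_scan(void)
{
	struct toy_mtd mtd = {0};

	toy_scan_tail(&mtd, false);
	toy_register(&mtd);
	return mtd.bad_lock_uses;
}

/* The driver skips the core scan and builds the BBT itself afterwards. */
static int toy_probe_driver_scan(void)
{
	struct toy_mtd mtd = {0};

	toy_scan_tail(&mtd, true);
	toy_create_bbt(&mtd);	/* hook already set, lock not yet ready */
	toy_register(&mtd);
	return mtd.bad_lock_uses;
}
```

In this model toy_probe_core_scan() never touches the lock early, while
toy_probe_driver_scan() does so once, which mirrors why deferring the hook
assignment inside the scan tail does not help a driver that builds the BBT
on its own later.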

2021-10-20 07:51:58

by Boris Brezillon

Subject: Re: [PATCH 0/3] mtd: core: protect access to mtd devices while in suspend

On Wed, 20 Oct 2021 09:23:52 +0200
Sean Nyekjaer wrote:

> On Wed, Oct 20, 2021 at 09:12:28AM +0200, Miquel Raynal wrote:
> > >
> > > Actually, this version is even cleaner:
> > >
> > > diff --git a/drivers/mtd/nand/raw/nand_base.c b/drivers/mtd/nand/raw/nand_base.c
> > > index 3d6c6e880520..98c39b7f6279 100644
> > > --- a/drivers/mtd/nand/raw/nand_base.c
> > > +++ b/drivers/mtd/nand/raw/nand_base.c
> > > @@ -6222,8 +6222,6 @@ static int nand_scan_tail(struct nand_chip *chip)
> > > mtd->_sync = nand_sync;
> > > mtd->_lock = nand_lock;
> > > mtd->_unlock = nand_unlock;
> > > - mtd->_suspend = nand_suspend;
> > > - mtd->_resume = nand_resume;
> > > mtd->_reboot = nand_shutdown;
> > > mtd->_block_isreserved = nand_block_isreserved;
> > > mtd->_block_isbad = nand_block_isbad;
> > > @@ -6261,14 +6259,20 @@ static int nand_scan_tail(struct nand_chip *chip)
> > > goto err_free_interface_config;
> > >
> > > /* Check, if we should skip the bad block table scan */
> > > - if (chip->options & NAND_SKIP_BBTSCAN)
> > > - return 0;
> > > -
> > > - /* Build bad block table */
> > > - ret = nand_create_bbt(chip);
> > > - if (ret)
> > > - goto err_free_secure_regions;
> > > + if (!(chip->options & NAND_SKIP_BBTSCAN)) {
> > > + /* Build bad block table */
> > > + ret = nand_create_bbt(chip);
> > > + if (ret)
> > > + goto err_free_secure_regions;
> > > + }
> >
> > Nice idea.
> >
> > >
> > > + /*
> > > + * Populate the suspend/resume hooks after the BBT has been scanned to
> > > + * avoid using the suspend lock and resume waitqueue which are only
> > > + * initialized when mtd_device_register() is called.
> > > + */
> > > + mtd->_suspend = nand_suspend;
> > > + mtd->_resume = nand_resume;
> > > return 0;
> > >
> > > err_free_secure_regions:
>
> Could be a nice idea, but it doesn't work...
> gpmi-nand.c calls nand_create_bbt() after this has run ;)

Er, indeed. Can you try with this instead:

diff --git a/drivers/mtd/nand/raw/nand_bbt.c b/drivers/mtd/nand/raw/nand_bbt.c
index b7ad030225f8..548647bd27a4 100644
--- a/drivers/mtd/nand/raw/nand_bbt.c
+++ b/drivers/mtd/nand/raw/nand_bbt.c
@@ -1397,8 +1397,28 @@ static int nand_create_badblock_pattern(struct nand_chip *this)
*/
int nand_create_bbt(struct nand_chip *this)
{
+ struct mtd_info *mtd = nand_to_mtd(this);
+ int (*suspend) (struct mtd_info *) = mtd->_suspend;
+ void (*resume) (struct mtd_info *) = mtd->_resume;
int ret;

+ /*
+ * The BBT scan logic uses the MTD helpers before the MTD layer had a
+ * chance to initialize the device, and that leads to issues when
+ * accessing the uninitialized suspend lock. Let's temporarily set the
+ * suspend/resume hooks to NULL to skip the lock acquire/release step.
+ *
+ * FIXME: This is an ugly hack, so please don't copy this pattern to
+ * other MTD implementations. The proper fix would be to implement a
+ * generic BBT scan logic at the NAND level that's not using any of the
+ * MTD helpers to access pages. We also might consider doing a two
+ * step initialization at the MTD level (mtd_device_init() +
+ * mtd_device_register()) so some of the fields are initialized
+ * early.
+ */
+ mtd->_suspend = NULL;
+ mtd->_resume = NULL;
+
/* Is a flash based bad block table requested? */
if (this->bbt_options & NAND_BBT_USE_FLASH) {
/* Use the default pattern descriptors */
@@ -1422,7 +1442,13 @@ int nand_create_bbt(struct nand_chip *this)
return ret;
}

- return nand_scan_bbt(this, this->badblock_pattern);
+ ret = nand_scan_bbt(this, this->badblock_pattern);
+
+ /* Restore the suspend/resume hooks. */
+ mtd->_suspend = suspend;
+ mtd->_resume = resume;
+
+ return ret;
}
EXPORT_SYMBOL(nand_create_bbt);
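
The save/clear/restore idea in the patch above can be tried outside the
kernel with a minimal sketch. Everything here is a hypothetical stand-in
(struct toy_mtd, toy_scan_bbt(), toy_create_bbt()); it only illustrates the
pattern of hiding the hooks for the duration of the scan and is not the
actual nand_bbt.c code.

```c
#include <stddef.h>

/* Hypothetical stand-in for struct mtd_info: hooks plus a test counter. */
struct toy_mtd {
	int (*_suspend)(struct toy_mtd *mtd);
	void (*_resume)(struct toy_mtd *mtd);
	int hooks_seen_during_scan;
};

static int toy_suspend(struct toy_mtd *mtd)
{
	(void)mtd;
	return 0;
}

static void toy_resume(struct toy_mtd *mtd)
{
	(void)mtd;
}

/* Stand-in for the scan: records whether any hook was visible while it ran. */
static int toy_scan_bbt(struct toy_mtd *mtd)
{
	if (mtd->_suspend || mtd->_resume)
		mtd->hooks_seen_during_scan++;
	return 0;
}

/* Save the hooks, clear them for the scan, then restore them afterwards. */
static int toy_create_bbt(struct toy_mtd *mtd)
{
	int (*suspend)(struct toy_mtd *) = mtd->_suspend;
	void (*resume)(struct toy_mtd *) = mtd->_resume;
	int ret;

	mtd->_suspend = NULL;
	mtd->_resume = NULL;

	ret = toy_scan_bbt(mtd);

	/* Single exit path, so the hooks come back even if the scan fails. */
	mtd->_suspend = suspend;
	mtd->_resume = resume;

	return ret;
}
```

The sketch funnels everything through one exit path so the hooks are
restored regardless of the scan result; whether the error paths need the
same care in the real function depends on how callers treat a failed scan.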