Allow settings for on-die ecc such that if on-die ECC is selected
don't error out but require ECC strap setting of zero
Signed-off-by: David Regan <[email protected]>
Reviewed-by: William Zhang <[email protected]>
---
Changes in v3: None
---
Changes in v2:
- Added to patch series
---
drivers/mtd/nand/raw/brcmnand/brcmnand.c | 14 ++++++++++----
1 file changed, 10 insertions(+), 4 deletions(-)
diff --git a/drivers/mtd/nand/raw/brcmnand/brcmnand.c b/drivers/mtd/nand/raw/brcmnand/brcmnand.c
index a4e311b6798c..42526f3250c9 100644
--- a/drivers/mtd/nand/raw/brcmnand/brcmnand.c
+++ b/drivers/mtd/nand/raw/brcmnand/brcmnand.c
@@ -2727,9 +2727,11 @@ static int brcmnand_setup_dev(struct brcmnand_host *host)
cfg->blk_adr_bytes = get_blk_adr_bytes(mtd->size, mtd->writesize);
if (chip->ecc.engine_type != NAND_ECC_ENGINE_TYPE_ON_HOST) {
- dev_err(ctrl->dev, "only HW ECC supported; selected: %d\n",
- chip->ecc.engine_type);
- return -EINVAL;
+ if (chip->ecc.strength) {
+ dev_err(ctrl->dev, "ERROR!!! HW ECC must be set to zero for non-hardware ECC; selected: %d\n",
+ chip->ecc.strength);
+ return -EINVAL;
+ }
}
if (chip->ecc.algo == NAND_ECC_ALGO_UNKNOWN) {
@@ -2797,7 +2799,11 @@ static int brcmnand_setup_dev(struct brcmnand_host *host)
if (ret)
return ret;
- brcmnand_set_ecc_enabled(host, 1);
+ if (chip->ecc.engine_type == NAND_ECC_ENGINE_TYPE_ON_DIE) {
+ dev_dbg(ctrl->dev, "Disable HW ECC for on-die ECC\n");
+ brcmnand_set_ecc_enabled(host, 0);
+ } else
+ brcmnand_set_ecc_enabled(host, 1);
brcmnand_print_cfg(host, msg, cfg);
dev_info(ctrl->dev, "detected %s\n", msg);
--
2.37.3
Hi David,
[email protected] wrote on Tue, 23 Jan 2024 19:04:58 -0800:
> Allow settings for on-die ecc such that if on-die ECC is selected
> don't error out but require ECC strap setting of zero
>
> Signed-off-by: David Regan <[email protected]>
> Reviewed-by: William Zhang <[email protected]>
> ---
> Changes in v3: None
> ---
> Changes in v2:
> - Added to patch series
> ---
> drivers/mtd/nand/raw/brcmnand/brcmnand.c | 14 ++++++++++----
> 1 file changed, 10 insertions(+), 4 deletions(-)
>
> diff --git a/drivers/mtd/nand/raw/brcmnand/brcmnand.c b/drivers/mtd/nand/raw/brcmnand/brcmnand.c
> index a4e311b6798c..42526f3250c9 100644
> --- a/drivers/mtd/nand/raw/brcmnand/brcmnand.c
> +++ b/drivers/mtd/nand/raw/brcmnand/brcmnand.c
> @@ -2727,9 +2727,11 @@ static int brcmnand_setup_dev(struct brcmnand_host *host)
> cfg->blk_adr_bytes = get_blk_adr_bytes(mtd->size, mtd->writesize);
>
> if (chip->ecc.engine_type != NAND_ECC_ENGINE_TYPE_ON_HOST) {
> - dev_err(ctrl->dev, "only HW ECC supported; selected: %d\n",
> - chip->ecc.engine_type);
> - return -EINVAL;
> + if (chip->ecc.strength) {
> + dev_err(ctrl->dev, "ERROR!!! HW ECC must be set to zero for non-hardware ECC; selected: %d\n",
> + chip->ecc.strength);
Can you use a more formal string? Also clarify it because I don't
really understand what it leads to.
> + return -EINVAL;
> + }
> }
>
> if (chip->ecc.algo == NAND_ECC_ALGO_UNKNOWN) {
> @@ -2797,7 +2799,11 @@ static int brcmnand_setup_dev(struct brcmnand_host *host)
> if (ret)
> return ret;
>
> - brcmnand_set_ecc_enabled(host, 1);
> + if (chip->ecc.engine_type == NAND_ECC_ENGINE_TYPE_ON_DIE) {
> + dev_dbg(ctrl->dev, "Disable HW ECC for on-die ECC\n");
Not needed.
> + brcmnand_set_ecc_enabled(host, 0);
> + } else
> + brcmnand_set_ecc_enabled(host, 1);
Style is wrong, but otherwise I think ECC should be kept disabled while
not in active use, so I am a bit surprised by this line.
>
> brcmnand_print_cfg(host, msg, cfg);
> dev_info(ctrl->dev, "detected %s\n", msg);
Thanks,
Miquèl
Hi Miquèl,
On Wed, Jan 24, 2024 at 9:40 AM Miquel Raynal <[email protected]> wrote:
>
> Hi David,
>
> [email protected] wrote on Tue, 23 Jan 2024 19:04:58 -0800:
>
> > Allow settings for on-die ecc such that if on-die ECC is selected
> > don't error out but require ECC strap setting of zero
> >
> > Signed-off-by: David Regan <[email protected]>
> > Reviewed-by: William Zhang <[email protected]>
> > ---
> > Changes in v3: None
> > ---
> > Changes in v2:
> > - Added to patch series
> > ---
> > drivers/mtd/nand/raw/brcmnand/brcmnand.c | 14 ++++++++++----
> > 1 file changed, 10 insertions(+), 4 deletions(-)
> >
> > diff --git a/drivers/mtd/nand/raw/brcmnand/brcmnand.c b/drivers/mtd/nand/raw/brcmnand/brcmnand.c
> > index a4e311b6798c..42526f3250c9 100644
> > --- a/drivers/mtd/nand/raw/brcmnand/brcmnand.c
> > +++ b/drivers/mtd/nand/raw/brcmnand/brcmnand.c
> > @@ -2727,9 +2727,11 @@ static int brcmnand_setup_dev(struct brcmnand_host *host)
> > cfg->blk_adr_bytes = get_blk_adr_bytes(mtd->size, mtd->writesize);
> >
> > if (chip->ecc.engine_type != NAND_ECC_ENGINE_TYPE_ON_HOST) {
> > - dev_err(ctrl->dev, "only HW ECC supported; selected: %d\n",
> > - chip->ecc.engine_type);
> > - return -EINVAL;
> > + if (chip->ecc.strength) {
> > + dev_err(ctrl->dev, "ERROR!!! HW ECC must be set to zero for non-hardware ECC; selected: %d\n",
> > + chip->ecc.strength);
>
> Can you use a more formal string? Also clarify it because I don't
> really understand what it leads to.
How about:
dev_err(ctrl->dev, "HW ECC set to %d, must be zero for on-die ECC\n",
>
> > + return -EINVAL;
> > + }
> > }
> >
> > if (chip->ecc.algo == NAND_ECC_ALGO_UNKNOWN) {
> > @@ -2797,7 +2799,11 @@ static int brcmnand_setup_dev(struct brcmnand_host *host)
> > if (ret)
> > return ret;
> >
> > - brcmnand_set_ecc_enabled(host, 1);
> > + if (chip->ecc.engine_type == NAND_ECC_ENGINE_TYPE_ON_DIE) {
> > + dev_dbg(ctrl->dev, "Disable HW ECC for on-die ECC\n");
>
> Not needed.
Will remove.
>
> > + brcmnand_set_ecc_enabled(host, 0);
> > + } else
> > + brcmnand_set_ecc_enabled(host, 1);
>
> Style is wrong, but otherwise I think ECC should be kept disabled while
> not in active use, so I am a bit surprised by this line.
This is a double check to turn on/off our hardware ECC.
>
> >
> > brcmnand_print_cfg(host, msg, cfg);
> > dev_info(ctrl->dev, "detected %s\n", msg);
>
>
> Thanks,
> Miquèl
Thanks!
-Dave
Hi David,
[email protected] wrote on Thu, 25 Jan 2024 11:47:46 -0800:
> Hi Miquèl,
>
> On Wed, Jan 24, 2024 at 9:40 AM Miquel Raynal <[email protected]> wrote:
> >
> > Hi David,
> >
> > [email protected] wrote on Tue, 23 Jan 2024 19:04:58 -0800:
> >
> > > Allow settings for on-die ecc such that if on-die ECC is selected
> > > don't error out but require ECC strap setting of zero
> > >
> > > Signed-off-by: David Regan <[email protected]>
> > > Reviewed-by: William Zhang <[email protected]>
> > > ---
> > > Changes in v3: None
> > > ---
> > > Changes in v2:
> > > - Added to patch series
> > > ---
> > > drivers/mtd/nand/raw/brcmnand/brcmnand.c | 14 ++++++++++----
> > > 1 file changed, 10 insertions(+), 4 deletions(-)
> > >
> > > diff --git a/drivers/mtd/nand/raw/brcmnand/brcmnand.c b/drivers/mtd/nand/raw/brcmnand/brcmnand.c
> > > index a4e311b6798c..42526f3250c9 100644
> > > --- a/drivers/mtd/nand/raw/brcmnand/brcmnand.c
> > > +++ b/drivers/mtd/nand/raw/brcmnand/brcmnand.c
> > > @@ -2727,9 +2727,11 @@ static int brcmnand_setup_dev(struct brcmnand_host *host)
> > > cfg->blk_adr_bytes = get_blk_adr_bytes(mtd->size, mtd->writesize);
> > >
> > > if (chip->ecc.engine_type != NAND_ECC_ENGINE_TYPE_ON_HOST) {
> > > - dev_err(ctrl->dev, "only HW ECC supported; selected: %d\n",
> > > - chip->ecc.engine_type);
> > > - return -EINVAL;
> > > + if (chip->ecc.strength) {
> > > + dev_err(ctrl->dev, "ERROR!!! HW ECC must be set to zero for non-hardware ECC; selected: %d\n",
> > > + chip->ecc.strength);
> >
> > Can you use a more formal string? Also clarify it because I don't
> > really understand what it leads to.
>
> How about:
>
> dev_err(ctrl->dev, "HW ECC set to %d, must be zero for on-die ECC\n",
Actually I am wondering how legitimate this is. Just don't enable the
on host ECC engine if it's not in use. No need to check the core's
choice.
>
> >
> > > + return -EINVAL;
> > > + }
> > > }
> > >
> > > if (chip->ecc.algo == NAND_ECC_ALGO_UNKNOWN) {
> > > @@ -2797,7 +2799,11 @@ static int brcmnand_setup_dev(struct brcmnand_host *host)
> > > if (ret)
> > > return ret;
> > >
> > > - brcmnand_set_ecc_enabled(host, 1);
> > > + if (chip->ecc.engine_type == NAND_ECC_ENGINE_TYPE_ON_DIE) {
> > > + dev_dbg(ctrl->dev, "Disable HW ECC for on-die ECC\n");
> >
> > Not needed.
>
> Will remove.
>
> >
> > > + brcmnand_set_ecc_enabled(host, 0);
> > > + } else
> > > + brcmnand_set_ecc_enabled(host, 1);
> >
> > Style is wrong, but otherwise I think ECC should be kept disabled while
> > not in active use, so I am a bit surprised by this line.
>
> This is a double check to turn on/off our hardware ECC.
I expect the engine to be always disabled. Enable it only when you
need (may require an additional patch before this one).
Thanks,
Miquèl
Hi Miquèl,
On Thu, Jan 25, 2024 at 10:19 PM Miquel Raynal
<[email protected]> wrote:
>
> Hi David,
>
> [email protected] wrote on Thu, 25 Jan 2024 11:47:46 -0800:
>
> > Hi Miquèl,
> >
> > On Wed, Jan 24, 2024 at 9:40 AM Miquel Raynal <[email protected]> wrote:
> > >
> > > Hi David,
> > >
> > > [email protected] wrote on Tue, 23 Jan 2024 19:04:58 -0800:
> > >
> > > > Allow settings for on-die ecc such that if on-die ECC is selected
> > > > don't error out but require ECC strap setting of zero
> > > >
> > > > Signed-off-by: David Regan <[email protected]>
> > > > Reviewed-by: William Zhang <[email protected]>
> > > > ---
> > > > Changes in v3: None
> > > > ---
> > > > Changes in v2:
> > > > - Added to patch series
> > > > ---
> > > > drivers/mtd/nand/raw/brcmnand/brcmnand.c | 14 ++++++++++----
> > > > 1 file changed, 10 insertions(+), 4 deletions(-)
> > > >
> > > > diff --git a/drivers/mtd/nand/raw/brcmnand/brcmnand.c b/drivers/mtd/nand/raw/brcmnand/brcmnand.c
> > > > index a4e311b6798c..42526f3250c9 100644
> > > > --- a/drivers/mtd/nand/raw/brcmnand/brcmnand.c
> > > > +++ b/drivers/mtd/nand/raw/brcmnand/brcmnand.c
> > > > @@ -2727,9 +2727,11 @@ static int brcmnand_setup_dev(struct brcmnand_host *host)
> > > > cfg->blk_adr_bytes = get_blk_adr_bytes(mtd->size, mtd->writesize);
> > > >
> > > > if (chip->ecc.engine_type != NAND_ECC_ENGINE_TYPE_ON_HOST) {
> > > > - dev_err(ctrl->dev, "only HW ECC supported; selected: %d\n",
> > > > - chip->ecc.engine_type);
> > > > - return -EINVAL;
> > > > + if (chip->ecc.strength) {
> > > > + dev_err(ctrl->dev, "ERROR!!! HW ECC must be set to zero for non-hardware ECC; selected: %d\n",
> > > > + chip->ecc.strength);
> > >
> > > Can you use a more formal string? Also clarify it because I don't
> > > really understand what it leads to.
> >
> > How about:
> >
> > dev_err(ctrl->dev, "HW ECC set to %d, must be zero for on-die ECC\n",
>
> Actually I am wondering how legitimate this is. Just don't enable the
> on host ECC engine if it's not in use. No need to check the core's
> choice.
Our chip ECC engine will either be on if it's needed or off if it's not.
Either I can do that in one place or put checks in before each
read/write to turn on/off the ECC engine, which seems a lot more
work and changes and possible issues/problems.
Turning it on/off as needed has not been explicitly tested and
could cause unforeseen consequences. This
is a minimal change which should have minimal impact.
>
> >
> > >
> > > > + return -EINVAL;
> > > > + }
> > > > }
> > > >
> > > > if (chip->ecc.algo == NAND_ECC_ALGO_UNKNOWN) {
> > > > @@ -2797,7 +2799,11 @@ static int brcmnand_setup_dev(struct brcmnand_host *host)
> > > > if (ret)
> > > > return ret;
> > > >
> > > > - brcmnand_set_ecc_enabled(host, 1);
> > > > + if (chip->ecc.engine_type == NAND_ECC_ENGINE_TYPE_ON_DIE) {
> > > > + dev_dbg(ctrl->dev, "Disable HW ECC for on-die ECC\n");
> > >
> > > Not needed.
> >
> > Will remove.
> >
> > >
> > > > + brcmnand_set_ecc_enabled(host, 0);
> > > > + } else
> > > > + brcmnand_set_ecc_enabled(host, 1);
> > >
> > > Style is wrong, but otherwise I think ECC should be kept disabled while
> > > not in active use, so I am a bit surprised by this line.
> >
> > This is a double check to turn on/off our hardware ECC.
>
> I expect the engine to be always disabled. Enable it only when you
> need (may require an additional patch before this one).
We are already turning on the ECC enable at this point,
this is just adding the option to turn it off if the NAND chip
itself will be doing the ECC instead of our controller.
>
> Thanks,
> Miquèl
Thanks!
-Dave
Hi David,
[email protected] wrote on Fri, 26 Jan 2024 11:57:39 -0800:
> Hi Miquèl,
>
> On Thu, Jan 25, 2024 at 10:19 PM Miquel Raynal
> <[email protected]> wrote:
> >
> > Hi David,
> >
> > [email protected] wrote on Thu, 25 Jan 2024 11:47:46 -0800:
> >
> > > Hi Miquèl,
> > >
> > > On Wed, Jan 24, 2024 at 9:40 AM Miquel Raynal <[email protected]> wrote:
> > > >
> > > > Hi David,
> > > >
> > > > [email protected] wrote on Tue, 23 Jan 2024 19:04:58 -0800:
> > > >
> > > > > Allow settings for on-die ecc such that if on-die ECC is selected
> > > > > don't error out but require ECC strap setting of zero
> > > > >
> > > > > Signed-off-by: David Regan <[email protected]>
> > > > > Reviewed-by: William Zhang <[email protected]>
> > > > > ---
> > > > > Changes in v3: None
> > > > > ---
> > > > > Changes in v2:
> > > > > - Added to patch series
> > > > > ---
> > > > > drivers/mtd/nand/raw/brcmnand/brcmnand.c | 14 ++++++++++----
> > > > > 1 file changed, 10 insertions(+), 4 deletions(-)
> > > > >
> > > > > diff --git a/drivers/mtd/nand/raw/brcmnand/brcmnand.c b/drivers/mtd/nand/raw/brcmnand/brcmnand.c
> > > > > index a4e311b6798c..42526f3250c9 100644
> > > > > --- a/drivers/mtd/nand/raw/brcmnand/brcmnand.c
> > > > > +++ b/drivers/mtd/nand/raw/brcmnand/brcmnand.c
> > > > > @@ -2727,9 +2727,11 @@ static int brcmnand_setup_dev(struct brcmnand_host *host)
> > > > > cfg->blk_adr_bytes = get_blk_adr_bytes(mtd->size, mtd->writesize);
> > > > >
> > > > > if (chip->ecc.engine_type != NAND_ECC_ENGINE_TYPE_ON_HOST) {
> > > > > - dev_err(ctrl->dev, "only HW ECC supported; selected: %d\n",
> > > > > - chip->ecc.engine_type);
> > > > > - return -EINVAL;
> > > > > + if (chip->ecc.strength) {
> > > > > + dev_err(ctrl->dev, "ERROR!!! HW ECC must be set to zero for non-hardware ECC; selected: %d\n",
> > > > > + chip->ecc.strength);
> > > >
> > > > Can you use a more formal string? Also clarify it because I don't
> > > > really understand what it leads to.
> > >
> > > How about:
> > >
> > > dev_err(ctrl->dev, "HW ECC set to %d, must be zero for on-die ECC\n",
> >
> > Actually I am wondering how legitimate this is. Just don't enable the
> > on host ECC engine if it's not in use. No need to check the core's
> > choice.
>
> Our chip ECC engine will either be on if it's needed or off if it's not.
> Either I can do that in one place or put checks in before each
> read/write to turn on/off the ECC engine, which seems a lot more
> work and changes and possible issues/problems.
> Turning it on/off as needed has not been explicitly tested and
> could cause unforeseen consequences. This
> is a minimal change which should have minimal impact.
>
> >
> > >
> > > >
> > > > > + return -EINVAL;
> > > > > + }
> > > > > }
> > > > >
> > > > > if (chip->ecc.algo == NAND_ECC_ALGO_UNKNOWN) {
> > > > > @@ -2797,7 +2799,11 @@ static int brcmnand_setup_dev(struct brcmnand_host *host)
> > > > > if (ret)
> > > > > return ret;
> > > > >
> > > > > - brcmnand_set_ecc_enabled(host, 1);
> > > > > + if (chip->ecc.engine_type == NAND_ECC_ENGINE_TYPE_ON_DIE) {
> > > > > + dev_dbg(ctrl->dev, "Disable HW ECC for on-die ECC\n");
> > > >
> > > > Not needed.
> > >
> > > Will remove.
> > >
> > > >
> > > > > + brcmnand_set_ecc_enabled(host, 0);
> > > > > + } else
> > > > > + brcmnand_set_ecc_enabled(host, 1);
> > > >
> > > > Style is wrong, but otherwise I think ECC should be kept disabled while
> > > > not in active use, so I am a bit surprised by this line.
> > >
> > > This is a double check to turn on/off our hardware ECC.
> >
> > I expect the engine to be always disabled. Enable it only when you
> > need (may require an additional patch before this one).
>
> We are already turning on the ECC enable at this point,
> this is just adding the option to turn it off if the NAND chip
> itself will be doing the ECC instead of our controller.
Sorry if I have not been clear.
This sequence:
- init
- enable hw ECC engine
Is broken.
It *cannot* work as any operation going through exec_op now may
perform page reads which should be unmodified by the ECC engine. You
driver *must* follow the following sequence:
- init and disable (or keep disabled) the hw ECC engine
- when you perform a page operation with correction you need to
- enable the engine
- perform the operation
- disable the engine
Thanks,
Miquèl
Hi William,
[email protected] wrote on Tue, 30 Jan 2024 00:11:32 -0800:
> Hi Miquel,
>
> On 1/29/24 02:52, Miquel Raynal wrote:
> > Hi David,
> >
> > [email protected] wrote on Fri, 26 Jan 2024 11:57:39 -0800:
> >
> >> Hi Miquèl,
> >>
> >> On Thu, Jan 25, 2024 at 10:19 PM Miquel Raynal
> >> <[email protected]> wrote:
> >>>
> >>> Hi David,
> >>>
> >>> [email protected] wrote on Thu, 25 Jan 2024 11:47:46 -0800:
> >>> >>>> Hi Miquèl,
> >>>>
> >>>> On Wed, Jan 24, 2024 at 9:40 AM Miquel Raynal <[email protected]> wrote:
> >>>>>
> >>>>> Hi David,
> >>>>>
> >>>>> [email protected] wrote on Tue, 23 Jan 2024 19:04:58 -0800:
> >>>>> >>>>>> Allow settings for on-die ecc such that if on-die ECC is selected
> >>>>>> don't error out but require ECC strap setting of zero
> >>>>>>
> >>>>>> Signed-off-by: David Regan <[email protected]>
> >>>>>> Reviewed-by: William Zhang <[email protected]>
> >>>>>> ---
> >>>>>> Changes in v3: None
> >>>>>> ---
> >>>>>> Changes in v2:
> >>>>>> - Added to patch series
> >>>>>> ---
> >>>>>> drivers/mtd/nand/raw/brcmnand/brcmnand.c | 14 ++++++++++----
> >>>>>> 1 file changed, 10 insertions(+), 4 deletions(-)
> >>>>>>
> >>>>>> diff --git a/drivers/mtd/nand/raw/brcmnand/brcmnand.c b/drivers/mtd/nand/raw/brcmnand/brcmnand.c
> >>>>>> index a4e311b6798c..42526f3250c9 100644
> >>>>>> --- a/drivers/mtd/nand/raw/brcmnand/brcmnand.c
> >>>>>> +++ b/drivers/mtd/nand/raw/brcmnand/brcmnand.c
> >>>>>> @@ -2727,9 +2727,11 @@ static int brcmnand_setup_dev(struct brcmnand_host *host)
> >>>>>> cfg->blk_adr_bytes = get_blk_adr_bytes(mtd->size, mtd->writesize);
> >>>>>>
> >>>>>> if (chip->ecc.engine_type != NAND_ECC_ENGINE_TYPE_ON_HOST) {
> >>>>>> - dev_err(ctrl->dev, "only HW ECC supported; selected: %d\n",
> >>>>>> - chip->ecc.engine_type);
> >>>>>> - return -EINVAL;
> >>>>>> + if (chip->ecc.strength) {
> >>>>>> + dev_err(ctrl->dev, "ERROR!!! HW ECC must be set to zero for non-hardware ECC; selected: %d\n",
> >>>>>> + chip->ecc.strength);
> >>>>>
> >>>>> Can you use a more formal string? Also clarify it because I don't
> >>>>> really understand what it leads to.
> >>>>
> >>>> How about:
> >>>>
> >>>> dev_err(ctrl->dev, "HW ECC set to %d, must be zero for on-die ECC\n",
> >>>
> >>> Actually I am wondering how legitimate this is. Just don't enable the
> >>> on host ECC engine if it's not in use. No need to check the core's
> >>> choice.
> >>
> >> Our chip ECC engine will either be on if it's needed or off if it's not.
> >> Either I can do that in one place or put checks in before each
> >> read/write to turn on/off the ECC engine, which seems a lot more
> >> work and changes and possible issues/problems.
> >> Turning it on/off as needed has not been explicitly tested and
> >> could cause unforeseen consequences. This
> >> is a minimal change which should have minimal impact.
> >>
> >>> >>>> >>>>> >>>>>> + return -EINVAL;
> >>>>>> + }
> >>>>>> }
> >>>>>>
> >>>>>> if (chip->ecc.algo == NAND_ECC_ALGO_UNKNOWN) {
> >>>>>> @@ -2797,7 +2799,11 @@ static int brcmnand_setup_dev(struct brcmnand_host *host)
> >>>>>> if (ret)
> >>>>>> return ret;
> >>>>>>
> >>>>>> - brcmnand_set_ecc_enabled(host, 1);
> >>>>>> + if (chip->ecc.engine_type == NAND_ECC_ENGINE_TYPE_ON_DIE) {
> >>>>>> + dev_dbg(ctrl->dev, "Disable HW ECC for on-die ECC\n");
> >>>>>
> >>>>> Not needed.
> >>>>
> >>>> Will remove.
> >>>> >>>>> >>>>>> + brcmnand_set_ecc_enabled(host, 0);
> >>>>>> + } else
> >>>>>> + brcmnand_set_ecc_enabled(host, 1);
> >>>>>
> >>>>> Style is wrong, but otherwise I think ECC should be kept disabled while
> >>>>> not in active use, so I am a bit surprised by this line.
> >>>>
> >>>> This is a double check to turn on/off our hardware ECC.
> >>>
> >>> I expect the engine to be always disabled. Enable it only when you
> >>> need (may require an additional patch before this one).
> >>
> >> We are already turning on the ECC enable at this point,
> >> this is just adding the option to turn it off if the NAND chip
> >> itself will be doing the ECC instead of our controller.
> >
> > Sorry if I have not been clear.
> >
> > This sequence:
> > - init
> > - enable hw ECC engine
> > Is broken.
> >
> ECC engine is not enabled for all the cases. Here we only intended to enable it for the nand chip that is set to use NAND_ECC_ENGINE_TYPE_ON_HOST. The logic here should better change to:
> if (chip->ecc.engine_type == NAND_ECC_ENGINE_TYPE_ON_HOST)
> brcmnand_set_ecc_enabled(host, 1);
> else
> brcmnand_set_ecc_enabled(host, 0);
>
> > It *cannot* work as any operation going through exec_op now may
> > perform page reads which should be unmodified by the ECC engine. You > driver *must* follow the following sequence:
> > - init and disable (or keep disabled) the hw ECC engine
> > - when you perform a page operation with correction you need to
> > - enable the engine
> > - perform the operation
> > - disable the engine
> > Maybe I am missing something here but are you saying the exec_op can have different ecc type for page read/write at run time on the same nand chip? I don't see the op instr structure has the ecc type field and thought it is only bind to the nand chip and won't change at run time. So looks to me the init time setting to the engine based on ecc.engine_type should be sufficient.
>
> What you described here can work for the hw.ecc read path (ecc.read_page = brcmnand_read_page) which always assumes ecc is enabled. Although it is probably not too bad with these two extra operation, it would be better if we don't have to add anything as our current code does. For the brcmnand_read_page_raw, we currently disable the engine and then re-enable it(but we need to fix it to only enable it with hw ecc engine type). So it is just opposite of you logic but works the same with no impact on the most performance critical path.
This is not "my" logic, this is the "core's" logic. I am saying: your
approach is broken because that is not how the API is supposed to work,
but it mostly works in the standard case.
Thanks,
Miquèl
Hi Miquel,
On 1/29/24 02:52, Miquel Raynal wrote:
> Hi David,
>
> [email protected] wrote on Fri, 26 Jan 2024 11:57:39 -0800:
>
>> Hi Miquèl,
>>
>> On Thu, Jan 25, 2024 at 10:19 PM Miquel Raynal
>> <[email protected]> wrote:
>>>
>>> Hi David,
>>>
>>> [email protected] wrote on Thu, 25 Jan 2024 11:47:46 -0800:
>>>
>>>> Hi Miquèl,
>>>>
>>>> On Wed, Jan 24, 2024 at 9:40 AM Miquel Raynal <[email protected]> wrote:
>>>>>
>>>>> Hi David,
>>>>>
>>>>> [email protected] wrote on Tue, 23 Jan 2024 19:04:58 -0800:
>>>>>
>>>>>> Allow settings for on-die ecc such that if on-die ECC is selected
>>>>>> don't error out but require ECC strap setting of zero
>>>>>>
>>>>>> Signed-off-by: David Regan <[email protected]>
>>>>>> Reviewed-by: William Zhang <[email protected]>
>>>>>> ---
>>>>>> Changes in v3: None
>>>>>> ---
>>>>>> Changes in v2:
>>>>>> - Added to patch series
>>>>>> ---
>>>>>> drivers/mtd/nand/raw/brcmnand/brcmnand.c | 14 ++++++++++----
>>>>>> 1 file changed, 10 insertions(+), 4 deletions(-)
>>>>>>
>>>>>> diff --git a/drivers/mtd/nand/raw/brcmnand/brcmnand.c b/drivers/mtd/nand/raw/brcmnand/brcmnand.c
>>>>>> index a4e311b6798c..42526f3250c9 100644
>>>>>> --- a/drivers/mtd/nand/raw/brcmnand/brcmnand.c
>>>>>> +++ b/drivers/mtd/nand/raw/brcmnand/brcmnand.c
>>>>>> @@ -2727,9 +2727,11 @@ static int brcmnand_setup_dev(struct brcmnand_host *host)
>>>>>> cfg->blk_adr_bytes = get_blk_adr_bytes(mtd->size, mtd->writesize);
>>>>>>
>>>>>> if (chip->ecc.engine_type != NAND_ECC_ENGINE_TYPE_ON_HOST) {
>>>>>> - dev_err(ctrl->dev, "only HW ECC supported; selected: %d\n",
>>>>>> - chip->ecc.engine_type);
>>>>>> - return -EINVAL;
>>>>>> + if (chip->ecc.strength) {
>>>>>> + dev_err(ctrl->dev, "ERROR!!! HW ECC must be set to zero for non-hardware ECC; selected: %d\n",
>>>>>> + chip->ecc.strength);
>>>>>
>>>>> Can you use a more formal string? Also clarify it because I don't
>>>>> really understand what it leads to.
>>>>
>>>> How about:
>>>>
>>>> dev_err(ctrl->dev, "HW ECC set to %d, must be zero for on-die ECC\n",
>>>
>>> Actually I am wondering how legitimate this is. Just don't enable the
>>> on host ECC engine if it's not in use. No need to check the core's
>>> choice.
>>
>> Our chip ECC engine will either be on if it's needed or off if it's not.
>> Either I can do that in one place or put checks in before each
>> read/write to turn on/off the ECC engine, which seems a lot more
>> work and changes and possible issues/problems.
>> Turning it on/off as needed has not been explicitly tested and
>> could cause unforeseen consequences. This
>> is a minimal change which should have minimal impact.
>>
>>>
>>>>
>>>>>
>>>>>> + return -EINVAL;
>>>>>> + }
>>>>>> }
>>>>>>
>>>>>> if (chip->ecc.algo == NAND_ECC_ALGO_UNKNOWN) {
>>>>>> @@ -2797,7 +2799,11 @@ static int brcmnand_setup_dev(struct brcmnand_host *host)
>>>>>> if (ret)
>>>>>> return ret;
>>>>>>
>>>>>> - brcmnand_set_ecc_enabled(host, 1);
>>>>>> + if (chip->ecc.engine_type == NAND_ECC_ENGINE_TYPE_ON_DIE) {
>>>>>> + dev_dbg(ctrl->dev, "Disable HW ECC for on-die ECC\n");
>>>>>
>>>>> Not needed.
>>>>
>>>> Will remove.
>>>>
>>>>>
>>>>>> + brcmnand_set_ecc_enabled(host, 0);
>>>>>> + } else
>>>>>> + brcmnand_set_ecc_enabled(host, 1);
>>>>>
>>>>> Style is wrong, but otherwise I think ECC should be kept disabled while
>>>>> not in active use, so I am a bit surprised by this line.
>>>>
>>>> This is a double check to turn on/off our hardware ECC.
>>>
>>> I expect the engine to be always disabled. Enable it only when you
>>> need (may require an additional patch before this one).
>>
>> We are already turning on the ECC enable at this point,
>> this is just adding the option to turn it off if the NAND chip
>> itself will be doing the ECC instead of our controller.
>
> Sorry if I have not been clear.
>
> This sequence:
> - init
> - enable hw ECC engine
> Is broken.
>
ECC engine is not enabled for all the cases. Here we only intended to
enable it for the nand chip that is set to use
NAND_ECC_ENGINE_TYPE_ON_HOST. The logic here should better change to:
if (chip->ecc.engine_type == NAND_ECC_ENGINE_TYPE_ON_HOST)
brcmnand_set_ecc_enabled(host, 1);
else
brcmnand_set_ecc_enabled(host, 0);
> It *cannot* work as any operation going through exec_op now may
> perform page reads which should be unmodified by the ECC engine. You > driver *must* follow the following sequence:
> - init and disable (or keep disabled) the hw ECC engine
> - when you perform a page operation with correction you need to
> - enable the engine
> - perform the operation
> - disable the engine
> Maybe I am missing something here but are you saying the exec_op can
have different ecc type for page read/write at run time on the same nand
chip? I don't see the op instr structure has the ecc type field and
thought it is only bind to the nand chip and won't change at run time.
So looks to me the init time setting to the engine based on
ecc.engine_type should be sufficient.
What you described here can work for the hw.ecc read path (ecc.read_page
= brcmnand_read_page) which always assumes ecc is enabled. Although it
is probably not too bad with these two extra operation, it would be
better if we don't have to add anything as our current code does. For
the brcmnand_read_page_raw, we currently disable the engine and then
re-enable it(but we need to fix it to only enable it with hw ecc engine
type). So it is just opposite of you logic but works the same with no
impact on the most performance critical path.
> Thanks,
> Miquèl
Hi David,
[email protected] wrote on Tue, 30 Jan 2024 07:26:02 -0800:
> Hi Miquel,
>
> On Tue, Jan 30, 2024 at 3:01 AM Miquel Raynal <[email protected]> wrote:
> >
> > Hi William,
> >
> > [email protected] wrote on Tue, 30 Jan 2024 00:11:32 -0800:
> >
> > > Hi Miquel,
> > >
> > > On 1/29/24 02:52, Miquel Raynal wrote:
> > > > Hi David,
> > > >
> > > > [email protected] wrote on Fri, 26 Jan 2024 11:57:39 -0800:
> > > >
> > > >> Hi Miquèl,
> > > >>
> > > >> On Thu, Jan 25, 2024 at 10:19 PM Miquel Raynal
> > > >> <[email protected]> wrote:
> > > >>>
> > > >>> Hi David,
> > > >>>
> > > >>> [email protected] wrote on Thu, 25 Jan 2024 11:47:46 -0800:
> > > >>> >>>> Hi Miquèl,
> > > >>>>
> > > >>>> On Wed, Jan 24, 2024 at 9:40 AM Miquel Raynal <[email protected]> wrote:
> > > >>>>>
> > > >>>>> Hi David,
> > > >>>>>
> > > >>>>> [email protected] wrote on Tue, 23 Jan 2024 19:04:58 -0800:
> > > >>>>> >>>>>> Allow settings for on-die ecc such that if on-die ECC is selected
> > > >>>>>> don't error out but require ECC strap setting of zero
> > > >>>>>>
> > > >>>>>> Signed-off-by: David Regan <[email protected]>
> > > >>>>>> Reviewed-by: William Zhang <[email protected]>
> > > >>>>>> ---
> > > >>>>>> Changes in v3: None
> > > >>>>>> ---
> > > >>>>>> Changes in v2:
> > > >>>>>> - Added to patch series
> > > >>>>>> ---
> > > >>>>>> drivers/mtd/nand/raw/brcmnand/brcmnand.c | 14 ++++++++++----
> > > >>>>>> 1 file changed, 10 insertions(+), 4 deletions(-)
> > > >>>>>>
> > > >>>>>> diff --git a/drivers/mtd/nand/raw/brcmnand/brcmnand.c b/drivers/mtd/nand/raw/brcmnand/brcmnand.c
> > > >>>>>> index a4e311b6798c..42526f3250c9 100644
> > > >>>>>> --- a/drivers/mtd/nand/raw/brcmnand/brcmnand.c
> > > >>>>>> +++ b/drivers/mtd/nand/raw/brcmnand/brcmnand.c
> > > >>>>>> @@ -2727,9 +2727,11 @@ static int brcmnand_setup_dev(struct brcmnand_host *host)
> > > >>>>>> cfg->blk_adr_bytes = get_blk_adr_bytes(mtd->size, mtd->writesize);
> > > >>>>>>
> > > >>>>>> if (chip->ecc.engine_type != NAND_ECC_ENGINE_TYPE_ON_HOST) {
> > > >>>>>> - dev_err(ctrl->dev, "only HW ECC supported; selected: %d\n",
> > > >>>>>> - chip->ecc.engine_type);
> > > >>>>>> - return -EINVAL;
> > > >>>>>> + if (chip->ecc.strength) {
> > > >>>>>> + dev_err(ctrl->dev, "ERROR!!! HW ECC must be set to zero for non-hardware ECC; selected: %d\n",
> > > >>>>>> + chip->ecc.strength);
> > > >>>>>
> > > >>>>> Can you use a more formal string? Also clarify it because I don't
> > > >>>>> really understand what it leads to.
> > > >>>>
> > > >>>> How about:
> > > >>>>
> > > >>>> dev_err(ctrl->dev, "HW ECC set to %d, must be zero for on-die ECC\n",
> > > >>>
> > > >>> Actually I am wondering how legitimate this is. Just don't enable the
> > > >>> on host ECC engine if it's not in use. No need to check the core's
> > > >>> choice.
> > > >>
> > > >> Our chip ECC engine will either be on if it's needed or off if it's not.
> > > >> Either I can do that in one place or put checks in before each
> > > >> read/write to turn on/off the ECC engine, which seems a lot more
> > > >> work and changes and possible issues/problems.
> > > >> Turning it on/off as needed has not been explicitly tested and
> > > >> could cause unforeseen consequences. This
> > > >> is a minimal change which should have minimal impact.
> > > >>
> > > >>> >>>> >>>>> >>>>>> + return -EINVAL;
> > > >>>>>> + }
> > > >>>>>> }
> > > >>>>>>
> > > >>>>>> if (chip->ecc.algo == NAND_ECC_ALGO_UNKNOWN) {
> > > >>>>>> @@ -2797,7 +2799,11 @@ static int brcmnand_setup_dev(struct brcmnand_host *host)
> > > >>>>>> if (ret)
> > > >>>>>> return ret;
> > > >>>>>>
> > > >>>>>> - brcmnand_set_ecc_enabled(host, 1);
> > > >>>>>> + if (chip->ecc.engine_type == NAND_ECC_ENGINE_TYPE_ON_DIE) {
> > > >>>>>> + dev_dbg(ctrl->dev, "Disable HW ECC for on-die ECC\n");
> > > >>>>>
> > > >>>>> Not needed.
> > > >>>>
> > > >>>> Will remove.
> > > >>>> >>>>> >>>>>> + brcmnand_set_ecc_enabled(host, 0);
> > > >>>>>> + } else
> > > >>>>>> + brcmnand_set_ecc_enabled(host, 1);
> > > >>>>>
> > > >>>>> Style is wrong, but otherwise I think ECC should be kept disabled while
> > > >>>>> not in active use, so I am a bit surprised by this line.
> > > >>>>
> > > >>>> This is a double check to turn on/off our hardware ECC.
> > > >>>
> > > >>> I expect the engine to be always disabled. Enable it only when you
> > > >>> need (may require an additional patch before this one).
> > > >>
> > > >> We are already turning on the ECC enable at this point,
> > > >> this is just adding the option to turn it off if the NAND chip
> > > >> itself will be doing the ECC instead of our controller.
> > > >
> > > > Sorry if I have not been clear.
> > > >
> > > > This sequence:
> > > > - init
> > > > - enable hw ECC engine
> > > > Is broken.
> > > >
> > > ECC engine is not enabled for all the cases. Here we only intended to enable it for the nand chip that is set to use NAND_ECC_ENGINE_TYPE_ON_HOST. The logic here should better change to:
> > > if (chip->ecc.engine_type == NAND_ECC_ENGINE_TYPE_ON_HOST)
> > > brcmnand_set_ecc_enabled(host, 1);
> > > else
> > > brcmnand_set_ecc_enabled(host, 0);
> > >
> > > > It *cannot* work as any operation going through exec_op now may
> > > > perform page reads which should be unmodified by the ECC engine. You > driver *must* follow the following sequence:
> > > > - init and disable (or keep disabled) the hw ECC engine
> > > > - when you perform a page operation with correction you need to
> > > > - enable the engine
> > > > - perform the operation
> > > > - disable the engine
> > > > Maybe I am missing something here but are you saying the exec_op can have different ecc type for page read/write at run time on the same nand chip? I don't see the op instr structure has the ecc type field and thought it is only bind to the nand chip and won't change at run time. So looks to me the init time setting to the engine based on ecc.engine_type should be sufficient.
> > >
> > > What you described here can work for the hw.ecc read path (ecc.read_page = brcmnand_read_page) which always assumes ecc is enabled. Although it is probably not too bad with these two extra operation, it would be better if we don't have to add anything as our current code does. For the brcmnand_read_page_raw, we currently disable the engine and then re-enable it(but we need to fix it to only enable it with hw ecc engine type). So it is just opposite of you logic but works the same with no impact on the most performance critical path.
> >
> > This is not "my" logic, this is the "core's" logic. I am saying: your
> > approach is broken because that is not how the API is supposed to work,
> > but it mostly works in the standard case.
>
> In the interest of minimizing register writes, would it be acceptable to
> enable/disable ECC at the beginning of a standard
> path transfer but not, after the transfer, turn off the ECC? This should not
> affect other standard path operations nor affect the exec_op path as those
> are low level transfers which our ECC engine would not touch and the NAND
> device driver should be responsible for turning on/off its own ECC.
Do you have legitimate concerns about this register write taking way
more time than I could expect? Because compared to the transfer of a
NAND page + tR/tPROG it should not be noticeable. I don't see how you
could even measure such impact actually, unless the register write does
way more than usual. I'm fine with the above idea if you show me it has
an interest.
Thanks,
Miquèl
Hi Miquel,
On Tue, Jan 30, 2024 at 3:01 AM Miquel Raynal <[email protected]> wrote:
>
> Hi William,
>
> [email protected] wrote on Tue, 30 Jan 2024 00:11:32 -0800:
>
> > Hi Miquel,
> >
> > On 1/29/24 02:52, Miquel Raynal wrote:
> > > Hi David,
> > >
> > > [email protected] wrote on Fri, 26 Jan 2024 11:57:39 -0800:
> > >
> > >> Hi Miquèl,
> > >>
> > >> On Thu, Jan 25, 2024 at 10:19 PM Miquel Raynal
> > >> <[email protected]> wrote:
> > >>>
> > >>> Hi David,
> > >>>
> > >>> [email protected] wrote on Thu, 25 Jan 2024 11:47:46 -0800:
> > >>> >>>> Hi Miquèl,
> > >>>>
> > >>>> On Wed, Jan 24, 2024 at 9:40 AM Miquel Raynal <[email protected]> wrote:
> > >>>>>
> > >>>>> Hi David,
> > >>>>>
> > >>>>> [email protected] wrote on Tue, 23 Jan 2024 19:04:58 -0800:
> > >>>>> >>>>>> Allow settings for on-die ecc such that if on-die ECC is selected
> > >>>>>> don't error out but require ECC strap setting of zero
> > >>>>>>
> > >>>>>> Signed-off-by: David Regan <[email protected]>
> > >>>>>> Reviewed-by: William Zhang <[email protected]>
> > >>>>>> ---
> > >>>>>> Changes in v3: None
> > >>>>>> ---
> > >>>>>> Changes in v2:
> > >>>>>> - Added to patch series
> > >>>>>> ---
> > >>>>>> drivers/mtd/nand/raw/brcmnand/brcmnand.c | 14 ++++++++++----
> > >>>>>> 1 file changed, 10 insertions(+), 4 deletions(-)
> > >>>>>>
> > >>>>>> diff --git a/drivers/mtd/nand/raw/brcmnand/brcmnand.c b/drivers/mtd/nand/raw/brcmnand/brcmnand.c
> > >>>>>> index a4e311b6798c..42526f3250c9 100644
> > >>>>>> --- a/drivers/mtd/nand/raw/brcmnand/brcmnand.c
> > >>>>>> +++ b/drivers/mtd/nand/raw/brcmnand/brcmnand.c
> > >>>>>> @@ -2727,9 +2727,11 @@ static int brcmnand_setup_dev(struct brcmnand_host *host)
> > >>>>>> cfg->blk_adr_bytes = get_blk_adr_bytes(mtd->size, mtd->writesize);
> > >>>>>>
> > >>>>>> if (chip->ecc.engine_type != NAND_ECC_ENGINE_TYPE_ON_HOST) {
> > >>>>>> - dev_err(ctrl->dev, "only HW ECC supported; selected: %d\n",
> > >>>>>> - chip->ecc.engine_type);
> > >>>>>> - return -EINVAL;
> > >>>>>> + if (chip->ecc.strength) {
> > >>>>>> + dev_err(ctrl->dev, "ERROR!!! HW ECC must be set to zero for non-hardware ECC; selected: %d\n",
> > >>>>>> + chip->ecc.strength);
> > >>>>>
> > >>>>> Can you use a more formal string? Also clarify it because I don't
> > >>>>> really understand what it leads to.
> > >>>>
> > >>>> How about:
> > >>>>
> > >>>> dev_err(ctrl->dev, "HW ECC set to %d, must be zero for on-die ECC\n",
> > >>>
> > >>> Actually I am wondering how legitimate this is. Just don't enable the
> > >>> on host ECC engine if it's not in use. No need to check the core's
> > >>> choice.
> > >>
> > >> Our chip ECC engine will either be on if it's needed or off if it's not.
> > >> Either I can do that in one place or put checks in before each
> > >> read/write to turn on/off the ECC engine, which seems a lot more
> > >> work and changes and possible issues/problems.
> > >> Turning it on/off as needed has not been explicitly tested and
> > >> could cause unforeseen consequences. This
> > >> is a minimal change which should have minimal impact.
> > >>
> > >>> >>>> >>>>> >>>>>> + return -EINVAL;
> > >>>>>> + }
> > >>>>>> }
> > >>>>>>
> > >>>>>> if (chip->ecc.algo == NAND_ECC_ALGO_UNKNOWN) {
> > >>>>>> @@ -2797,7 +2799,11 @@ static int brcmnand_setup_dev(struct brcmnand_host *host)
> > >>>>>> if (ret)
> > >>>>>> return ret;
> > >>>>>>
> > >>>>>> - brcmnand_set_ecc_enabled(host, 1);
> > >>>>>> + if (chip->ecc.engine_type == NAND_ECC_ENGINE_TYPE_ON_DIE) {
> > >>>>>> + dev_dbg(ctrl->dev, "Disable HW ECC for on-die ECC\n");
> > >>>>>
> > >>>>> Not needed.
> > >>>>
> > >>>> Will remove.
> > >>>> >>>>> >>>>>> + brcmnand_set_ecc_enabled(host, 0);
> > >>>>>> + } else
> > >>>>>> + brcmnand_set_ecc_enabled(host, 1);
> > >>>>>
> > >>>>> Style is wrong, but otherwise I think ECC should be kept disabled while
> > >>>>> not in active use, so I am a bit surprised by this line.
> > >>>>
> > >>>> This is a double check to turn on/off our hardware ECC.
> > >>>
> > >>> I expect the engine to be always disabled. Enable it only when you
> > >>> need (may require an additional patch before this one).
> > >>
> > >> We are already turning on the ECC enable at this point,
> > >> this is just adding the option to turn it off if the NAND chip
> > >> itself will be doing the ECC instead of our controller.
> > >
> > > Sorry if I have not been clear.
> > >
> > > This sequence:
> > > - init
> > > - enable hw ECC engine
> > > Is broken.
> > >
> > ECC engine is not enabled for all the cases. Here we only intended to enable it for the nand chip that is set to use NAND_ECC_ENGINE_TYPE_ON_HOST. The logic here should better change to:
> > if (chip->ecc.engine_type == NAND_ECC_ENGINE_TYPE_ON_HOST)
> > brcmnand_set_ecc_enabled(host, 1);
> > else
> > brcmnand_set_ecc_enabled(host, 0);
> >
> > > It *cannot* work as any operation going through exec_op now may
> > > perform page reads which should be unmodified by the ECC engine. You > driver *must* follow the following sequence:
> > > - init and disable (or keep disabled) the hw ECC engine
> > > - when you perform a page operation with correction you need to
> > > - enable the engine
> > > - perform the operation
> > > - disable the engine
> > > Maybe I am missing something here but are you saying the exec_op can have different ecc type for page read/write at run time on the same nand chip? I don't see the op instr structure has the ecc type field and thought it is only bind to the nand chip and won't change at run time. So looks to me the init time setting to the engine based on ecc.engine_type should be sufficient.
> >
> > What you described here can work for the hw.ecc read path (ecc.read_page = brcmnand_read_page) which always assumes ecc is enabled. Although it is probably not too bad with these two extra operation, it would be better if we don't have to add anything as our current code does. For the brcmnand_read_page_raw, we currently disable the engine and then re-enable it(but we need to fix it to only enable it with hw ecc engine type). So it is just opposite of you logic but works the same with no impact on the most performance critical path.
>
> This is not "my" logic, this is the "core's" logic. I am saying: your
> approach is broken because that is not how the API is supposed to work,
> but it mostly works in the standard case.
In the interest of minimizing register writes, would it be acceptable to
enable/disable ECC at the beginning of a standard
path transfer but not, after the transfer, turn off the ECC? This should not
affect other standard path operations nor affect the exec_op path as those
are low level transfers which our ECC engine would not touch and the NAND
device driver should be responsible for turning on/off its own ECC.
>
> Thanks,
> Miquèl
Thanks!
-Dave
Hi Miquel,
On 1/30/24 10:55, Miquel Raynal wrote:
> Hi David,
>
> [email protected] wrote on Tue, 30 Jan 2024 07:26:02 -0800:
>
>> Hi Miquel,
>>
>> On Tue, Jan 30, 2024 at 3:01 AM Miquel Raynal <[email protected]> wrote:
>>>
>>> Hi William,
>>>
>>> [email protected] wrote on Tue, 30 Jan 2024 00:11:32 -0800:
>>>
>>>> Hi Miquel,
>>>>
>>>> On 1/29/24 02:52, Miquel Raynal wrote:
>>>>> Hi David,
>>>>>
>>>>> [email protected] wrote on Fri, 26 Jan 2024 11:57:39 -0800:
>>>>>
>>>>>> Hi Miquèl,
>>>>>>
>>>>>> On Thu, Jan 25, 2024 at 10:19 PM Miquel Raynal
>>>>>> <[email protected]> wrote:
>>>>>>>
>>>>>>> Hi David,
>>>>>>>
>>>>>>> [email protected] wrote on Thu, 25 Jan 2024 11:47:46 -0800:
>>>>>>> >>>> Hi Miquèl,
>>>>>>>>
>>>>>>>> On Wed, Jan 24, 2024 at 9:40 AM Miquel Raynal <[email protected]> wrote:
>>>>>>>>>
>>>>>>>>> Hi David,
>>>>>>>>>
>>>>>>>>> [email protected] wrote on Tue, 23 Jan 2024 19:04:58 -0800:
>>>>>>>>> >>>>>> Allow settings for on-die ecc such that if on-die ECC is selected
>>>>>>>>>> don't error out but require ECC strap setting of zero
>>>>>>>>>>
>>>>>>>>>> Signed-off-by: David Regan <[email protected]>
>>>>>>>>>> Reviewed-by: William Zhang <[email protected]>
>>>>>>>>>> ---
>>>>>>>>>> Changes in v3: None
>>>>>>>>>> ---
>>>>>>>>>> Changes in v2:
>>>>>>>>>> - Added to patch series
>>>>>>>>>> ---
>>>>>>>>>> drivers/mtd/nand/raw/brcmnand/brcmnand.c | 14 ++++++++++----
>>>>>>>>>> 1 file changed, 10 insertions(+), 4 deletions(-)
>>>>>>>>>>
>>>>>>>>>> diff --git a/drivers/mtd/nand/raw/brcmnand/brcmnand.c b/drivers/mtd/nand/raw/brcmnand/brcmnand.c
>>>>>>>>>> index a4e311b6798c..42526f3250c9 100644
>>>>>>>>>> --- a/drivers/mtd/nand/raw/brcmnand/brcmnand.c
>>>>>>>>>> +++ b/drivers/mtd/nand/raw/brcmnand/brcmnand.c
>>>>>>>>>> @@ -2727,9 +2727,11 @@ static int brcmnand_setup_dev(struct brcmnand_host *host)
>>>>>>>>>> cfg->blk_adr_bytes = get_blk_adr_bytes(mtd->size, mtd->writesize);
>>>>>>>>>>
>>>>>>>>>> if (chip->ecc.engine_type != NAND_ECC_ENGINE_TYPE_ON_HOST) {
>>>>>>>>>> - dev_err(ctrl->dev, "only HW ECC supported; selected: %d\n",
>>>>>>>>>> - chip->ecc.engine_type);
>>>>>>>>>> - return -EINVAL;
>>>>>>>>>> + if (chip->ecc.strength) {
>>>>>>>>>> + dev_err(ctrl->dev, "ERROR!!! HW ECC must be set to zero for non-hardware ECC; selected: %d\n",
>>>>>>>>>> + chip->ecc.strength);
>>>>>>>>>
>>>>>>>>> Can you use a more formal string? Also clarify it because I don't
>>>>>>>>> really understand what it leads to.
>>>>>>>>
>>>>>>>> How about:
>>>>>>>>
>>>>>>>> dev_err(ctrl->dev, "HW ECC set to %d, must be zero for on-die ECC\n",
>>>>>>>
>>>>>>> Actually I am wondering how legitimate this is. Just don't enable the
>>>>>>> on host ECC engine if it's not in use. No need to check the core's
>>>>>>> choice.
>>>>>>
>>>>>> Our chip ECC engine will either be on if it's needed or off if it's not.
>>>>>> Either I can do that in one place or put checks in before each
>>>>>> read/write to turn on/off the ECC engine, which seems a lot more
>>>>>> work and changes and possible issues/problems.
>>>>>> Turning it on/off as needed has not been explicitly tested and
>>>>>> could cause unforeseen consequences. This
>>>>>> is a minimal change which should have minimal impact.
>>>>>>
>>>>>>> >>>> >>>>> >>>>>> + return -EINVAL;
>>>>>>>>>> + }
>>>>>>>>>> }
>>>>>>>>>>
>>>>>>>>>> if (chip->ecc.algo == NAND_ECC_ALGO_UNKNOWN) {
>>>>>>>>>> @@ -2797,7 +2799,11 @@ static int brcmnand_setup_dev(struct brcmnand_host *host)
>>>>>>>>>> if (ret)
>>>>>>>>>> return ret;
>>>>>>>>>>
>>>>>>>>>> - brcmnand_set_ecc_enabled(host, 1);
>>>>>>>>>> + if (chip->ecc.engine_type == NAND_ECC_ENGINE_TYPE_ON_DIE) {
>>>>>>>>>> + dev_dbg(ctrl->dev, "Disable HW ECC for on-die ECC\n");
>>>>>>>>>
>>>>>>>>> Not needed.
>>>>>>>>
>>>>>>>> Will remove.
>>>>>>>> >>>>> >>>>>> + brcmnand_set_ecc_enabled(host, 0);
>>>>>>>>>> + } else
>>>>>>>>>> + brcmnand_set_ecc_enabled(host, 1);
>>>>>>>>>
>>>>>>>>> Style is wrong, but otherwise I think ECC should be kept disabled while
>>>>>>>>> not in active use, so I am a bit surprised by this line.
>>>>>>>>
>>>>>>>> This is a double check to turn on/off our hardware ECC.
>>>>>>>
>>>>>>> I expect the engine to be always disabled. Enable it only when you
>>>>>>> need (may require an additional patch before this one).
>>>>>>
>>>>>> We are already turning on the ECC enable at this point,
>>>>>> this is just adding the option to turn it off if the NAND chip
>>>>>> itself will be doing the ECC instead of our controller.
>>>>>
>>>>> Sorry if I have not been clear.
>>>>>
>>>>> This sequence:
>>>>> - init
>>>>> - enable hw ECC engine
>>>>> Is broken.
>>>>>
>>>> ECC engine is not enabled for all the cases. Here we only intended to enable it for the nand chip that is set to use NAND_ECC_ENGINE_TYPE_ON_HOST. The logic here should better change to:
>>>> if (chip->ecc.engine_type == NAND_ECC_ENGINE_TYPE_ON_HOST)
>>>> brcmnand_set_ecc_enabled(host, 1);
>>>> else
>>>> brcmnand_set_ecc_enabled(host, 0);
>>>>
>>>>> It *cannot* work as any operation going through exec_op now may
>>>>> perform page reads which should be unmodified by the ECC engine. You > driver *must* follow the following sequence:
>>>>> - init and disable (or keep disabled) the hw ECC engine
>>>>> - when you perform a page operation with correction you need to
>>>>> - enable the engine
>>>>> - perform the operation
>>>>> - disable the engine
>>>>> Maybe I am missing something here but are you saying the exec_op can have different ecc type for page read/write at run time on the same nand chip? I don't see the op instr structure has the ecc type field and thought it is only bind to the nand chip and won't change at run time. So looks to me the init time setting to the engine based on ecc.engine_type should be sufficient.
>>>>
>>>> What you described here can work for the hw.ecc read path (ecc.read_page = brcmnand_read_page) which always assumes ecc is enabled. Although it is probably not too bad with these two extra operation, it would be better if we don't have to add anything as our current code does. For the brcmnand_read_page_raw, we currently disable the engine and then re-enable it(but we need to fix it to only enable it with hw ecc engine type). So it is just opposite of you logic but works the same with no impact on the most performance critical path.
>>>
>>> This is not "my" logic, this is the "core's" logic. I am saying: your
>>> approach is broken because that is not how the API is supposed to work,
>>> but it mostly works in the standard case.
>>
>> In the interest of minimizing register writes, would it be acceptable to
>> enable/disable ECC at the beginning of a standard
>> path transfer but not, after the transfer, turn off the ECC? This should not
>> affect other standard path operations nor affect the exec_op path as those
>> are low level transfers which our ECC engine would not touch and the NAND
>> device driver should be responsible for turning on/off its own ECC.
>
> Do you have legitimate concerns about this register write taking way
> more time than I could expect? Because compared to the transfer of a
> NAND page + tR/tPROG it should not be noticeable. I don't see how you
> could even measure such impact actually, unless the register write does
> way more than usual. I'm fine with the above idea if you show me it has
> an interest.
>
Dave did the mtd_speed test and we can see we get consistently ~35KB/s
slower with the extra enable and disable ecc engine calls in ecc read
page path.
With the change:
[ 28.148355] mtd_speedtest: page read speed is 9857 KiB/s
[ 31.754258] mtd_speedtest: 2 page read speed is 9865 KiB/s
Without the change
[ 56.444735] mtd_speedtest: page read speed is 9892 KiB/s
[ 60.042262] mtd_speedtest: 2 page read speed is 9897 KiB/s
Although it is only less than 1% drop, it is still something. I
understand the procedure you laid out above is the preferred way but
with our driver fully control the chip ecc read/write page, ecc
read_raw/write_raw page function and exec_op path, I don't see where it
may not work. What are the non-standard cases that can break it?
If the driver started with preferred procedure, we will never have this
conversation:) But unfortunately it is not the case and now if we want
to revert this but lose performance, I just don't feel like to swallow
this if there is no real world case that breaks this code.
> Thanks,
> Miquèl
Hi William,
> >>>>>>>> This is a double check to turn on/off our hardware ECC.
> >>>>>>>
> >>>>>>> I expect the engine to be always disabled. Enable it only when you
> >>>>>>> need (may require an additional patch before this one).
> >>>>>>
> >>>>>> We are already turning on the ECC enable at this point,
> >>>>>> this is just adding the option to turn it off if the NAND chip
> >>>>>> itself will be doing the ECC instead of our controller.
> >>>>>
> >>>>> Sorry if I have not been clear.
> >>>>>
> >>>>> This sequence:
> >>>>> - init
> >>>>> - enable hw ECC engine
> >>>>> Is broken.
> >>>>> >>>> ECC engine is not enabled for all the cases. Here we only intended to enable it for the nand chip that is set to use NAND_ECC_ENGINE_TYPE_ON_HOST. The logic here should better change to:
> >>>> if (chip->ecc.engine_type == NAND_ECC_ENGINE_TYPE_ON_HOST)
> >>>> brcmnand_set_ecc_enabled(host, 1);
> >>>> else
> >>>> brcmnand_set_ecc_enabled(host, 0);
> >>>> >>>>> It *cannot* work as any operation going through exec_op now may
> >>>>> perform page reads which should be unmodified by the ECC engine. You > driver *must* follow the following sequence:
> >>>>> - init and disable (or keep disabled) the hw ECC engine
> >>>>> - when you perform a page operation with correction you need to
> >>>>> - enable the engine
> >>>>> - perform the operation
> >>>>> - disable the engine
> >>>>> Maybe I am missing something here but are you saying the exec_op can have different ecc type for page read/write at run time on the same nand chip? I don't see the op instr structure has the ecc type field and thought it is only bind to the nand chip and won't change at run time. So looks to me the init time setting to the engine based on ecc.engine_type should be sufficient.
> >>>>
> >>>> What you described here can work for the hw.ecc read path (ecc.read_page = brcmnand_read_page) which always assumes ecc is enabled. Although it is probably not too bad with these two extra operation, it would be better if we don't have to add anything as our current code does. For the brcmnand_read_page_raw, we currently disable the engine and then re-enable it(but we need to fix it to only enable it with hw ecc engine type). So it is just opposite of you logic but works the same with no impact on the most performance critical path.
> >>>
> >>> This is not "my" logic, this is the "core's" logic. I am saying: your
> >>> approach is broken because that is not how the API is supposed to work,
> >>> but it mostly works in the standard case.
> >>
> >> In the interest of minimizing register writes, would it be acceptable to
> >> enable/disable ECC at the beginning of a standard
> >> path transfer but not, after the transfer, turn off the ECC? This should not
> >> affect other standard path operations nor affect the exec_op path as those
> >> are low level transfers which our ECC engine would not touch and the NAND
> >> device driver should be responsible for turning on/off its own ECC.
> >
> > Do you have legitimate concerns about this register write taking way
> > more time than I could expect? Because compared to the transfer of a
> > NAND page + tR/tPROG it should not be noticeable. I don't see how you
> > could even measure such impact actually, unless the register write does
> > way more than usual. I'm fine with the above idea if you show me it has
> > an interest.
> >
> Dave did the mtd_speed test and we can see we get consistently ~35KB/s slower with the extra enable and disable ecc engine calls in ecc read page path.
>
> With the change:
> [ 28.148355] mtd_speedtest: page read speed is 9857 KiB/s
> [ 31.754258] mtd_speedtest: 2 page read speed is 9865 KiB/s
> Without the change
> [ 56.444735] mtd_speedtest: page read speed is 9892 KiB/s
> [ 60.042262] mtd_speedtest: 2 page read speed is 9897 KiB/s
I believe if you repeat this 10 times you'll get totally different
results. I don't think this test on a non RT machine is precise enough
so that a unique 35kiB difference can be interpreted as being
significant.
> Although it is only less than 1% drop, it is still something. I understand the procedure you laid out above is the preferred way but with our driver fully control the chip ecc read/write page, ecc read_raw/write_raw page function and exec_op path, I don't see where it may not work.
I just told you, the exec_op path runs with ECC enabled. I don't know
how this controller works. Now if you don't care and are 100% sure this
is working and future proof, just keep it like this.
Cheers,
Miquèl
On 2/1/24 00:25, Miquel Raynal wrote:
> Hi William,
>
>>>>>>>>>> This is a double check to turn on/off our hardware ECC.
>>>>>>>>>
>>>>>>>>> I expect the engine to be always disabled. Enable it only when you
>>>>>>>>> need (may require an additional patch before this one).
>>>>>>>>
>>>>>>>> We are already turning on the ECC enable at this point,
>>>>>>>> this is just adding the option to turn it off if the NAND chip
>>>>>>>> itself will be doing the ECC instead of our controller.
>>>>>>>
>>>>>>> Sorry if I have not been clear.
>>>>>>>
>>>>>>> This sequence:
>>>>>>> - init
>>>>>>> - enable hw ECC engine
>>>>>>> Is broken.
>>>>>>> >>>> ECC engine is not enabled for all the cases. Here we only intended to enable it for the nand chip that is set to use NAND_ECC_ENGINE_TYPE_ON_HOST. The logic here should better change to:
>>>>>> if (chip->ecc.engine_type == NAND_ECC_ENGINE_TYPE_ON_HOST)
>>>>>> brcmnand_set_ecc_enabled(host, 1);
>>>>>> else
>>>>>> brcmnand_set_ecc_enabled(host, 0);
>>>>>> >>>>> It *cannot* work as any operation going through exec_op now may
>>>>>>> perform page reads which should be unmodified by the ECC engine. You > driver *must* follow the following sequence:
>>>>>>> - init and disable (or keep disabled) the hw ECC engine
>>>>>>> - when you perform a page operation with correction you need to
>>>>>>> - enable the engine
>>>>>>> - perform the operation
>>>>>>> - disable the engine
>>>>>>> Maybe I am missing something here but are you saying the exec_op can have different ecc type for page read/write at run time on the same nand chip? I don't see the op instr structure has the ecc type field and thought it is only bind to the nand chip and won't change at run time. So looks to me the init time setting to the engine based on ecc.engine_type should be sufficient.
>>>>>>
>>>>>> What you described here can work for the hw.ecc read path (ecc.read_page = brcmnand_read_page) which always assumes ecc is enabled. Although it is probably not too bad with these two extra operation, it would be better if we don't have to add anything as our current code does. For the brcmnand_read_page_raw, we currently disable the engine and then re-enable it(but we need to fix it to only enable it with hw ecc engine type). So it is just opposite of you logic but works the same with no impact on the most performance critical path.
>>>>>
>>>>> This is not "my" logic, this is the "core's" logic. I am saying: your
>>>>> approach is broken because that is not how the API is supposed to work,
>>>>> but it mostly works in the standard case.
>>>>
>>>> In the interest of minimizing register writes, would it be acceptable to
>>>> enable/disable ECC at the beginning of a standard
>>>> path transfer but not, after the transfer, turn off the ECC? This should not
>>>> affect other standard path operations nor affect the exec_op path as those
>>>> are low level transfers which our ECC engine would not touch and the NAND
>>>> device driver should be responsible for turning on/off its own ECC.
>>>
>>> Do you have legitimate concerns about this register write taking way
>>> more time than I could expect? Because compared to the transfer of a
>>> NAND page + tR/tPROG it should not be noticeable. I don't see how you
>>> could even measure such impact actually, unless the register write does
>>> way more than usual. I'm fine with the above idea if you show me it has
>>> an interest.
>>>
>> Dave did the mtd_speed test and we can see we get consistently ~35KB/s slower with the extra enable and disable ecc engine calls in ecc read page path.
>>
>> With the change:
>> [ 28.148355] mtd_speedtest: page read speed is 9857 KiB/s
>> [ 31.754258] mtd_speedtest: 2 page read speed is 9865 KiB/s
>> Without the change
>> [ 56.444735] mtd_speedtest: page read speed is 9892 KiB/s
>> [ 60.042262] mtd_speedtest: 2 page read speed is 9897 KiB/s
>
> I believe if you repeat this 10 times you'll get totally different
> results. I don't think this test on a non RT machine is precise enough
> so that a unique 35kiB difference can be interpreted as being
> significant.
>
No, it is very consistent. It is not RT system but a bare minimum setup
with just linux kernel, mtd/nand drivers and a shell. We don't run any
apps and extra drivers. So the system is pretty idle while we run mtd
speed test.
>> Although it is only less than 1% drop, it is still something. I understand the procedure you laid out above is the preferred way but with our driver fully control the chip ecc read/write page, ecc read_raw/write_raw page function and exec_op path, I don't see where it may not work.
>
> I just told you, the exec_op path runs with ECC enabled. I don't know
> how this controller works. Now if you don't care and are 100% sure this
> is working and future proof, just keep it like this.
>
This is something I don't get. The patch basically only enable the ecc
when NAND_ECC_ENGINE_TYPE_ON_HOST is set. But in this case exec_op does
not do page read/write. Even page read/write go through exec_op, we do
want the engine to do the ecc checking. So need to have it enabled.
Other op like parameter read does not go through controller's ecc engine
so it does not matter if ecc is on or not.
If NAND_ECC_ENGINE_TYPE_ON_DIE is set, this patch disable the controller
engine and let NAND chip do the work as it requires. So exec_op path
run with ecc disabled when page read/write also go through exec_op path.
I don't see any issue with this either.
I am not trying to make your job hard but I want to understand if there
is really any issue here.
> Cheers,
> Miquèl
Hi Miquèl,
On Thu, Feb 1, 2024 at 12:26 AM Miquel Raynal <[email protected]> wrote:
>
> Hi William,
>
> > >>>>>>>> This is a double check to turn on/off our hardware ECC.
> > >>>>>>>
> > >>>>>>> I expect the engine to be always disabled. Enable it only when you
> > >>>>>>> need (may require an additional patch before this one).
> > >>>>>>
> > >>>>>> We are already turning on the ECC enable at this point,
> > >>>>>> this is just adding the option to turn it off if the NAND chip
> > >>>>>> itself will be doing the ECC instead of our controller.
> > >>>>>
> > >>>>> Sorry if I have not been clear.
> > >>>>>
> > >>>>> This sequence:
> > >>>>> - init
> > >>>>> - enable hw ECC engine
> > >>>>> Is broken.
> > >>>>> >>>> ECC engine is not enabled for all the cases. Here we only intended to enable it for the nand chip that is set to use NAND_ECC_ENGINE_TYPE_ON_HOST. The logic here should better change to:
> > >>>> if (chip->ecc.engine_type == NAND_ECC_ENGINE_TYPE_ON_HOST)
> > >>>> brcmnand_set_ecc_enabled(host, 1);
> > >>>> else
> > >>>> brcmnand_set_ecc_enabled(host, 0);
> > >>>> >>>>> It *cannot* work as any operation going through exec_op now may
> > >>>>> perform page reads which should be unmodified by the ECC engine. You > driver *must* follow the following sequence:
> > >>>>> - init and disable (or keep disabled) the hw ECC engine
> > >>>>> - when you perform a page operation with correction you need to
> > >>>>> - enable the engine
> > >>>>> - perform the operation
> > >>>>> - disable the engine
> > >>>>> Maybe I am missing something here but are you saying the exec_op can have different ecc type for page read/write at run time on the same nand chip? I don't see the op instr structure has the ecc type field and thought it is only bind to the nand chip and won't change at run time. So looks to me the init time setting to the engine based on ecc.engine_type should be sufficient.
> > >>>>
> > >>>> What you described here can work for the hw.ecc read path (ecc.read_page = brcmnand_read_page) which always assumes ecc is enabled. Although it is probably not too bad with these two extra operation, it would be better if we don't have to add anything as our current code does. For the brcmnand_read_page_raw, we currently disable the engine and then re-enable it(but we need to fix it to only enable it with hw ecc engine type). So it is just opposite of you logic but works the same with no impact on the most performance critical path.
> > >>>
> > >>> This is not "my" logic, this is the "core's" logic. I am saying: your
> > >>> approach is broken because that is not how the API is supposed to work,
> > >>> but it mostly works in the standard case.
> > >>
> > >> In the interest of minimizing register writes, would it be acceptable to
> > >> enable/disable ECC at the beginning of a standard
> > >> path transfer but not, after the transfer, turn off the ECC? This should not
> > >> affect other standard path operations nor affect the exec_op path as those
> > >> are low level transfers which our ECC engine would not touch and the NAND
> > >> device driver should be responsible for turning on/off its own ECC.
> > >
> > > Do you have legitimate concerns about this register write taking way
> > > more time than I could expect? Because compared to the transfer of a
> > > NAND page + tR/tPROG it should not be noticeable. I don't see how you
> > > could even measure such impact actually, unless the register write does
> > > way more than usual. I'm fine with the above idea if you show me it has
> > > an interest.
> > >
> > Dave did the mtd_speed test and we can see we get consistently ~35KB/s slower with the extra enable and disable ecc engine calls in ecc read page path.
> >
> > With the change:
> > [ 28.148355] mtd_speedtest: page read speed is 9857 KiB/s
> > [ 31.754258] mtd_speedtest: 2 page read speed is 9865 KiB/s
> > Without the change
> > [ 56.444735] mtd_speedtest: page read speed is 9892 KiB/s
> > [ 60.042262] mtd_speedtest: 2 page read speed is 9897 KiB/s
>
> I believe if you repeat this 10 times you'll get totally different
> results. I don't think this test on a non RT machine is precise enough
> so that a unique 35kiB difference can be interpreted as being
> significant.
>
> > Although it is only less than 1% drop, it is still something. I understand the procedure you laid out above is the preferred way but with our driver fully control the chip ecc read/write page, ecc read_raw/write_raw page function and exec_op path, I don't see where it may not work.
>
> I just told you, the exec_op path runs with ECC enabled. I don't know
> how this controller works. Now if you don't care and are 100% sure this
> is working and future proof, just keep it like this.
Thank you for all your help, it appears this will need further rework
and testing. Since it's not critical to this series I will most likely
pull this patch and take all your good information to prepare for
a future update.
>
> Cheers,
> Miquèl
Thanks!
-Dave