If the qman driver (qman_ccsr) doesn't probe or fails to probe before
qman_portal, qm_ccsr_start will be either NULL or a stale pointer to an
unmapped page.

This leads to a crash when probing qman_portal as the init_pcfg function
calls qman_liodn_fixup that tries to read qman registers.

Assume that qman didn't probe when the pool mask is 0.

Signed-off-by: Alexandre Belloni <[email protected]>
---
drivers/soc/fsl/qbman/qman_portal.c | 2 ++
1 file changed, 2 insertions(+)
diff --git a/drivers/soc/fsl/qbman/qman_portal.c b/drivers/soc/fsl/qbman/qman_portal.c
index a120002b630e..4fc80d2c8feb 100644
--- a/drivers/soc/fsl/qbman/qman_portal.c
+++ b/drivers/soc/fsl/qbman/qman_portal.c
@@ -277,6 +277,8 @@ static int qman_portal_probe(struct platform_device *pdev)
 	}
 
 	pcfg->pools = qm_get_pools_sdqcr();
+	if (pcfg->pools == 0)
+		return -EPROBE_DEFER;
 
 	spin_lock(&qman_lock);
 	cpu = cpumask_next_zero(-1, &portal_cpus);
--
2.18.0
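
To illustrate why returning -EPROBE_DEFER helps with the ordering problem described above, here is a minimal userspace sketch of deferred probing. It is only a model: the demo_* names are hypothetical, the retry loop is a simplification of what the driver core does, and any nonzero probe result is treated as a deferral.

```c
#include <stdbool.h>

#ifndef EPROBE_DEFER
#define EPROBE_DEFER 517	/* kernel-internal value, absent from userspace errno.h */
#endif

/* Hypothetical flag: set once the "ccsr" side has probed successfully. */
static bool demo_ccsr_ready;

static int demo_ccsr_probe(void)
{
	demo_ccsr_ready = true;
	return 0;
}

/* The portal cannot work before ccsr, so it asks to be retried later. */
static int demo_portal_probe(void)
{
	return demo_ccsr_ready ? 0 : -EPROBE_DEFER;
}

typedef int (*demo_probe_fn)(void);

/*
 * Probe every device in list order and keep retrying the unbound ones
 * until a full pass binds nothing new. Returns the number of devices bound.
 */
static int demo_probe_all(demo_probe_fn *probes, bool *bound, int n)
{
	int total = 0;
	bool progress = true;

	while (progress) {
		progress = false;
		for (int i = 0; i < n; i++) {
			if (bound[i])
				continue;
			if (probes[i]() == 0) {
				bound[i] = true;
				total++;
				progress = true;
			}
		}
	}
	return total;
}
```

With the portal listed before ccsr, the first pass defers the portal and binds ccsr, and the second pass binds the portal.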
If the qman driver didn't probe, calling qman_alloc_fqid_range,
qman_alloc_pool_range or qman_alloc_cgrid_range (as done in dpaa_eth) will
pass a NULL pointer to gen_pool_alloc, leading to a NULL pointer
dereference.

Signed-off-by: Alexandre Belloni <[email protected]>
---
drivers/soc/fsl/qbman/qman.c | 3 +++
1 file changed, 3 insertions(+)
diff --git a/drivers/soc/fsl/qbman/qman.c b/drivers/soc/fsl/qbman/qman.c
index ecb22749df0b..8cc015183043 100644
--- a/drivers/soc/fsl/qbman/qman.c
+++ b/drivers/soc/fsl/qbman/qman.c
@@ -2729,6 +2729,9 @@ static int qman_alloc_range(struct gen_pool *p, u32 *result, u32 cnt)
 {
 	unsigned long addr;
 
+	if (!p)
+		return -ENODEV;
+
 	addr = gen_pool_alloc(p, cnt);
 	if (!addr)
 		return -ENOMEM;
--
2.18.0
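
The guard added by this patch can be tried outside the kernel with a small stand-in. The sketch below is an assumption-laden model, not the driver code: struct demo_pool and demo_pool_alloc are hypothetical replacements for struct gen_pool and gen_pool_alloc, and only the control flow of the NULL check mirrors the patch.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for the kernel's struct gen_pool. */
struct demo_pool {
	unsigned long next;	/* next free id; 0 is never handed out */
	unsigned long limit;	/* one past the last valid id */
};

/* Stand-in allocator: dereferences p and returns 0 when exhausted. */
static unsigned long demo_pool_alloc(struct demo_pool *p, unsigned int cnt)
{
	unsigned long addr;

	if (cnt == 0 || p->limit - p->next < cnt)
		return 0;
	addr = p->next;
	p->next += cnt;
	return addr;
}

/* Same shape as the patched function: reject a missing pool first. */
static int demo_alloc_range(struct demo_pool *p, unsigned int *result,
			    unsigned int cnt)
{
	unsigned long addr;

	if (!p)
		return -ENODEV;	/* the pool was never created */

	addr = demo_pool_alloc(p, cnt);
	if (!addr)
		return -ENOMEM;	/* the pool exists but is exhausted */

	*result = (unsigned int)addr;
	return 0;
}
```

Callers can then tell "the provider never probed" (-ENODEV) apart from "the pool is exhausted" (-ENOMEM) instead of crashing.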
On 8/23/2018 5:36 PM, Alexandre Belloni wrote:
> If the qman driver (qman_ccsr) doesn't probe or fails to probe before
> qman_portal, qm_ccsr_start will be either NULL or a stale pointer to an
> unmapped page.
>
> This leads to a crash when probing qman_portal as the init_pcfg function
> calls qman_liodn_fixup that tries to read qman registers.
>
> Assume that qman didn't probe when the pool mask is 0.
>
> Signed-off-by: Alexandre Belloni <[email protected]>
> ---
> drivers/soc/fsl/qbman/qman_portal.c | 2 ++
> 1 file changed, 2 insertions(+)
>
> diff --git a/drivers/soc/fsl/qbman/qman_portal.c b/drivers/soc/fsl/qbman/qman_portal.c
> index a120002b630e..4fc80d2c8feb 100644
> --- a/drivers/soc/fsl/qbman/qman_portal.c
> +++ b/drivers/soc/fsl/qbman/qman_portal.c
> @@ -277,6 +277,8 @@ static int qman_portal_probe(struct platform_device *pdev)
> }
>
> pcfg->pools = qm_get_pools_sdqcr();
> + if (pcfg->pools == 0)
> + return -EPROBE_DEFER;
>
> spin_lock(&qman_lock);
> cpu = cpumask_next_zero(-1, &portal_cpus);
Reviewed-by: Roy Pledge <[email protected]>
On 8/23/2018 5:36 PM, Alexandre Belloni wrote:
> If the qman driver didn't probe, calling qman_alloc_fqid_range,
> qman_alloc_pool_range or qman_alloc_cgrid_range (as done in dpaa_eth) will
> pass a NULL pointer to gen_pool_alloc, leading to a NULL pointer
> dereference.
>
> Signed-off-by: Alexandre Belloni <[email protected]>
> ---
> drivers/soc/fsl/qbman/qman.c | 3 +++
> 1 file changed, 3 insertions(+)
>
> diff --git a/drivers/soc/fsl/qbman/qman.c b/drivers/soc/fsl/qbman/qman.c
> index ecb22749df0b..8cc015183043 100644
> --- a/drivers/soc/fsl/qbman/qman.c
> +++ b/drivers/soc/fsl/qbman/qman.c
> @@ -2729,6 +2729,9 @@ static int qman_alloc_range(struct gen_pool *p, u32 *result, u32 cnt)
> {
> unsigned long addr;
>
> + if (!p)
> + return -ENODEV;
> +
> addr = gen_pool_alloc(p, cnt);
> if (!addr)
> return -ENOMEM;
Reviewed-by: Roy Pledge <[email protected]>
On Fri, Aug 24, 2018 at 9:54 AM Roy Pledge <[email protected]> wrote:
>
> On 8/23/2018 5:36 PM, Alexandre Belloni wrote:
> > If the qman driver (qman_ccsr) doesn't probe or fails to probe before
> > qman_portal, qm_ccsr_start will be either NULL or a stale pointer to an
> > unmapped page.
> >
> > This leads to a crash when probing qman_portal as the init_pcfg function
> > calls qman_liodn_fixup that tries to read qman registers.
> >
> > Assume that qman didn't probe when the pool mask is 0.
> >
> > Signed-off-by: Alexandre Belloni <[email protected]>
Series applied to the fix branch of soc/fsl.
Hi,
On Thu, Aug 23, 2018 at 11:36 PM Alexandre Belloni
<[email protected]> wrote:
>
> If the qman driver (qman_ccsr) doesn't probe or fails to probe before
> qman_portal, qm_ccsr_start will be either NULL or a stale pointer to an
> unmapped page.
>
> This leads to a crash when probing qman_portal as the init_pcfg function
> calls qman_liodn_fixup that tries to read qman registers.
>
> Assume that qman didn't probe when the pool mask is 0.
>
> Signed-off-by: Alexandre Belloni <[email protected]>
> ---
> drivers/soc/fsl/qbman/qman_portal.c | 2 ++
> 1 file changed, 2 insertions(+)
>
> diff --git a/drivers/soc/fsl/qbman/qman_portal.c b/drivers/soc/fsl/qbman/qman_portal.c
> index a120002b630e..4fc80d2c8feb 100644
> --- a/drivers/soc/fsl/qbman/qman_portal.c
> +++ b/drivers/soc/fsl/qbman/qman_portal.c
> @@ -277,6 +277,8 @@ static int qman_portal_probe(struct platform_device *pdev)
> }
>
> pcfg->pools = qm_get_pools_sdqcr();
> + if (pcfg->pools == 0)
> + return -EPROBE_DEFER;
This is quite late in the probe, after a bunch of resources have been claimed.
Note that the ioremaps above this are doing unwinds, and you'll end up
doing duplicate ioremaps if you come in and probe again.
You should probably unwind those allocations, or move them to devm_*
or do this check earlier in the function.
-Olof
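
The first option mentioned above, unwinding before returning the deferral, can be sketched in userspace as follows. This is a model under stated assumptions: demo_map and demo_unmap are hypothetical stand-ins for the mapping calls, and demo_pool_mask stands in for the value the driver reads from qm_get_pools_sdqcr().

```c
#ifndef EPROBE_DEFER
#define EPROBE_DEFER 517	/* kernel-internal value, absent from userspace errno.h */
#endif

static int demo_live_mappings;		/* outstanding fake mappings */
static unsigned int demo_pool_mask;	/* 0 until the "qman" side is ready */

static int demo_map(void)
{
	demo_live_mappings++;
	return 0;
}

static void demo_unmap(void)
{
	demo_live_mappings--;
}

static int demo_portal_probe(void)
{
	int err;

	err = demo_map();		/* first mapping */
	if (err)
		return err;

	err = demo_map();		/* second mapping */
	if (err)
		goto err_unmap_first;

	if (demo_pool_mask == 0) {
		/* Not ready yet: release everything before asking for a retry. */
		err = -EPROBE_DEFER;
		goto err_unmap_second;
	}

	return 0;

err_unmap_second:
	demo_unmap();
err_unmap_first:
	demo_unmap();
	return err;
}
```

Because every deferral path releases what it claimed, probing repeatedly leaves no duplicate mappings behind.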
On Tue, Sep 25, 2018 at 2:47 PM Olof Johansson <[email protected]> wrote:
>
> Hi,
>
>
> On Thu, Aug 23, 2018 at 11:36 PM Alexandre Belloni
> <[email protected]> wrote:
> >
> > If the qman driver (qman_ccsr) doesn't probe or fails to probe before
> > qman_portal, qm_ccsr_start will be either NULL or a stale pointer to an
> > unmapped page.
> >
> > This leads to a crash when probing qman_portal as the init_pcfg function
> > calls qman_liodn_fixup that tries to read qman registers.
> >
> > Assume that qman didn't probe when the pool mask is 0.
> >
> > Signed-off-by: Alexandre Belloni <[email protected]>
> > ---
> > drivers/soc/fsl/qbman/qman_portal.c | 2 ++
> > 1 file changed, 2 insertions(+)
> >
> > diff --git a/drivers/soc/fsl/qbman/qman_portal.c b/drivers/soc/fsl/qbman/qman_portal.c
> > index a120002b630e..4fc80d2c8feb 100644
> > --- a/drivers/soc/fsl/qbman/qman_portal.c
> > +++ b/drivers/soc/fsl/qbman/qman_portal.c
> > @@ -277,6 +277,8 @@ static int qman_portal_probe(struct platform_device *pdev)
> > }
> >
> > pcfg->pools = qm_get_pools_sdqcr();
> > + if (pcfg->pools == 0)
> > + return -EPROBE_DEFER;
>
> This is quite late in the probe, after a bunch of resources have been claimed.
>
> Note that the ioremaps above this are doing unwinds, and you'll end up
> doing duplicate ioremaps if you come in and probe again.
>
> You should probably unwind those allocations, or move them to devm_*
> or do this check earlier in the function.
Hi Roy,
Is there a more straightforward indicator of whether qman has been
probed, so that we can check it at the beginning of the probe?
Regards,
Leo
On 25/09/2018 21:45:56+0200, Olof Johansson wrote:
> Hi,
>
>
> On Thu, Aug 23, 2018 at 11:36 PM Alexandre Belloni
> <[email protected]> wrote:
> >
> > If the qman driver (qman_ccsr) doesn't probe or fails to probe before
> > qman_portal, qm_ccsr_start will be either NULL or a stale pointer to an
> > unmapped page.
> >
> > This leads to a crash when probing qman_portal as the init_pcfg function
> > calls qman_liodn_fixup that tries to read qman registers.
> >
> > Assume that qman didn't probe when the pool mask is 0.
> >
> > Signed-off-by: Alexandre Belloni <[email protected]>
> > ---
> > drivers/soc/fsl/qbman/qman_portal.c | 2 ++
> > 1 file changed, 2 insertions(+)
> >
> > diff --git a/drivers/soc/fsl/qbman/qman_portal.c b/drivers/soc/fsl/qbman/qman_portal.c
> > index a120002b630e..4fc80d2c8feb 100644
> > --- a/drivers/soc/fsl/qbman/qman_portal.c
> > +++ b/drivers/soc/fsl/qbman/qman_portal.c
> > @@ -277,6 +277,8 @@ static int qman_portal_probe(struct platform_device *pdev)
> > }
> >
> > pcfg->pools = qm_get_pools_sdqcr();
> > + if (pcfg->pools == 0)
> > + return -EPROBE_DEFER;
>
> This is quite late in the probe, after a bunch of resources have been claimed.
>
> Note that the ioremaps above this are doing unwinds, and you'll end up
> doing duplicate ioremaps if you come in and probe again.
>
> You should probably unwind those allocations, or move them to devm_*
> or do this check earlier in the function.
>
The actual chance of having that happen is quite small (this was coming
from a non-working DT) and I mainly wanted to avoid a crash so the
platform could still boot. I would think moving to devm_ would be the
right thing to do.
--
Alexandre Belloni, Bootlin
Embedded Linux and Kernel engineering
https://bootlin.com
On Wed, Sep 26, 2018 at 4:28 AM Alexandre Belloni
<[email protected]> wrote:
>
> On 25/09/2018 21:45:56+0200, Olof Johansson wrote:
> > Hi,
> >
> >
> > On Thu, Aug 23, 2018 at 11:36 PM Alexandre Belloni
> > <[email protected]> wrote:
> > >
> > > If the qman driver (qman_ccsr) doesn't probe or fails to probe before
> > > qman_portal, qm_ccsr_start will be either NULL or a stale pointer to an
> > > unmapped page.
> > >
> > > This leads to a crash when probing qman_portal as the init_pcfg function
> > > calls qman_liodn_fixup that tries to read qman registers.
> > >
> > > Assume that qman didn't probe when the pool mask is 0.
> > >
> > > Signed-off-by: Alexandre Belloni <[email protected]>
> > > ---
> > > drivers/soc/fsl/qbman/qman_portal.c | 2 ++
> > > 1 file changed, 2 insertions(+)
> > >
> > > diff --git a/drivers/soc/fsl/qbman/qman_portal.c b/drivers/soc/fsl/qbman/qman_portal.c
> > > index a120002b630e..4fc80d2c8feb 100644
> > > --- a/drivers/soc/fsl/qbman/qman_portal.c
> > > +++ b/drivers/soc/fsl/qbman/qman_portal.c
> > > @@ -277,6 +277,8 @@ static int qman_portal_probe(struct platform_device *pdev)
> > > }
> > >
> > > pcfg->pools = qm_get_pools_sdqcr();
> > > + if (pcfg->pools == 0)
> > > + return -EPROBE_DEFER;
> >
> > This is quite late in the probe, after a bunch of resources have been claimed.
> >
> > Note that the ioremaps above this are doing unwinds, and you'll end up
> > doing duplicate ioremaps if you come in and probe again.
> >
> > You should probably unwind those allocations, or move them to devm_*
> > or do this check earlier in the function.
> >
>
> The actual chance of having that happen is quite small (this was coming
> from a non-working DT) and I mainly wanted to avoid a crash so the
> platform could still boot. I would think moving to devm_ would be the
> right thing to do.
Even if it is not failing with the upstreamed device trees, it is
still good to harden the driver against possible issues. Moving to devm_
is definitely the right thing to do. But I also think checking whether
qman has already been probed should be the first step, before we start
allocating resources that would have to be rolled back later. We can
probably move the qm_get_pools_sdqcr() call to the beginning of the probe
to determine whether qman has been probed, as it doesn't seem to depend
on any of the setup done before it.
Regards,
Leo
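
The check-first ordering suggested above could look roughly like the sketch below. It is a userspace model only: demo_get_pools is a hypothetical stand-in for qm_get_pools_sdqcr(), and demo_claimed simply counts the resources a probe would have taken.

```c
#ifndef EPROBE_DEFER
#define EPROBE_DEFER 517	/* kernel-internal value, absent from userspace errno.h */
#endif

static unsigned int demo_pool_mask;	/* hypothetical: 0 until qman is ready */
static int demo_claimed;		/* resources claimed by the probe so far */

static unsigned int demo_get_pools(void)
{
	return demo_pool_mask;
}

static int demo_probe_check_first(void)
{
	unsigned int pools = demo_get_pools();

	/* Readiness is tested before anything is claimed, so there is nothing to undo. */
	if (pools == 0)
		return -EPROBE_DEFER;

	demo_claimed += 2;	/* stands in for the mappings done later in the probe */
	return 0;
}
```

A deferred probe returns before touching any resource, so no rollback path is needed for this case.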
On Wed, Sep 26, 2018 at 1:15 PM Li Yang <[email protected]> wrote:
>
> On Wed, Sep 26, 2018 at 4:28 AM Alexandre Belloni
> <[email protected]> wrote:
> >
> > On 25/09/2018 21:45:56+0200, Olof Johansson wrote:
> > > Hi,
> > >
> > >
> > > On Thu, Aug 23, 2018 at 11:36 PM Alexandre Belloni
> > > <[email protected]> wrote:
> > > >
> > > > If the qman driver (qman_ccsr) doesn't probe or fails to probe before
> > > > qman_portal, qm_ccsr_start will be either NULL or a stale pointer to an
> > > > unmapped page.
> > > >
> > > > This leads to a crash when probing qman_portal as the init_pcfg function
> > > > calls qman_liodn_fixup that tries to read qman registers.
> > > >
> > > > Assume that qman didn't probe when the pool mask is 0.
> > > >
> > > > Signed-off-by: Alexandre Belloni <[email protected]>
> > > > ---
> > > > drivers/soc/fsl/qbman/qman_portal.c | 2 ++
> > > > 1 file changed, 2 insertions(+)
> > > >
> > > > diff --git a/drivers/soc/fsl/qbman/qman_portal.c b/drivers/soc/fsl/qbman/qman_portal.c
> > > > index a120002b630e..4fc80d2c8feb 100644
> > > > --- a/drivers/soc/fsl/qbman/qman_portal.c
> > > > +++ b/drivers/soc/fsl/qbman/qman_portal.c
> > > > @@ -277,6 +277,8 @@ static int qman_portal_probe(struct platform_device *pdev)
> > > > }
> > > >
> > > > pcfg->pools = qm_get_pools_sdqcr();
> > > > + if (pcfg->pools == 0)
> > > > + return -EPROBE_DEFER;
> > >
> > > This is quite late in the probe, after a bunch of resources have been claimed.
> > >
> > > Note that the ioremaps above this are doing unwinds, and you'll end up
> > > doing duplicate ioremaps if you come in and probe again.
> > >
> > > You should probably unwind those allocations, or move them to devm_*
> > > or do this check earlier in the function.
> > >
> >
> > The actual chance of having that happen is quite small (this was coming
> > from a non-working DT) and I mainly wanted to avoid a crash so the
> > platform could still boot. I would think moving to devm_ would be the
> > right thing to do.
>
> Even if it is not failing with the upstreamed device trees, it is
> still good to harden the driver against possible issues. Moving to devm_
> is definitely the right thing to do. But I also think checking whether
> qman has already been probed should be the first step, before we start
> allocating resources that would have to be rolled back later. We can
> probably move the qm_get_pools_sdqcr() call to the beginning of the probe
> to determine whether qman has been probed, as it doesn't seem to depend
> on any of the setup done before it.
I just found out that Laurentiu also included the following patches in
his SMMU patch series (although they are not necessarily related to SMMU),
which fix the same problem. I think they are more straightforward and
can deal with the case where qman failed to probe. So we can take
these instead to fix this problem in 4.19.
https://patchwork.kernel.org/patch/10616021/
https://patchwork.kernel.org/patch/10616019/
https://patchwork.kernel.org/patch/10615971/
Regards,
Leo