2023-10-26 02:02:51

by Fang Xiang

[permalink] [raw]
Subject: [PATCH] irqchip/gic-v3-its: Fix the coherent issue in its_setup_baser() when shr = 0.

The table would not be flushed if the input parameter shr = 0 in its_setup_baser() and
it would cause a coherent problem.

Signed-off-by: Fang Xiang <[email protected]>
---
drivers/irqchip/irq-gic-v3-its.c | 8 +++++---
1 file changed, 5 insertions(+), 3 deletions(-)

diff --git a/drivers/irqchip/irq-gic-v3-its.c b/drivers/irqchip/irq-gic-v3-its.c
index 75a2dd550625..58a9f24ccfa7 100644
--- a/drivers/irqchip/irq-gic-v3-its.c
+++ b/drivers/irqchip/irq-gic-v3-its.c
@@ -2394,13 +2394,15 @@ static int its_setup_baser(struct its_node *its, struct its_baser *baser,
* non-cacheable as well.
*/
shr = tmp & GITS_BASER_SHAREABILITY_MASK;
- if (!shr) {
+ if (!shr)
cache = GITS_BASER_nC;
- gic_flush_dcache_to_poc(base, PAGE_ORDER_TO_SIZE(order));
- }
+
goto retry_baser;
}

+ if (!shr)
+ gic_flush_dcache_to_poc(base, PAGE_ORDER_TO_SIZE(order));
+
if (val != tmp) {
pr_err("ITS@%pa: %s doesn't stick: %llx %llx\n",
&its->phys_base, its_base_type_string[type],
--
2.34.1


2023-10-26 08:01:42

by Marc Zyngier

[permalink] [raw]
Subject: Re: [PATCH] irqchip/gic-v3-its: Fix the coherent issue in its_setup_baser() when shr = 0.

On Thu, 26 Oct 2023 03:01:16 +0100,
Fang Xiang <[email protected]> wrote:
>
> The table would not be flushed if the input parameter shr = 0 in
> its_setup_baser() and it would cause a coherent problem.

Would? Or does? I'm asking, as you have previously indicated that this
workaround was working for you.

Have you actually observed a problem? Or is that by inspection?

>
> Signed-off-by: Fang Xiang <[email protected]>
> ---
> drivers/irqchip/irq-gic-v3-its.c | 8 +++++---
> 1 file changed, 5 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/irqchip/irq-gic-v3-its.c b/drivers/irqchip/irq-gic-v3-its.c
> index 75a2dd550625..58a9f24ccfa7 100644
> --- a/drivers/irqchip/irq-gic-v3-its.c
> +++ b/drivers/irqchip/irq-gic-v3-its.c
> @@ -2394,13 +2394,15 @@ static int its_setup_baser(struct its_node *its, struct its_baser *baser,
> * non-cacheable as well.
> */
> shr = tmp & GITS_BASER_SHAREABILITY_MASK;
> - if (!shr) {
> + if (!shr)
> cache = GITS_BASER_nC;
> - gic_flush_dcache_to_poc(base, PAGE_ORDER_TO_SIZE(order));
> - }
> +
> goto retry_baser;
> }
>
> + if (!shr)
> + gic_flush_dcache_to_poc(base, PAGE_ORDER_TO_SIZE(order));
> +

This is wrong. You're doing the cache clean *after* the register has
been programmed with its final value, and the ITS is perfectly allowed
to prefetch anything it wants as soon as you program the register. The
clean must thus happen before the write. Yes, it was wrong before, but
you are now making it wrong always.

> if (val != tmp) {
> pr_err("ITS@%pa: %s doesn't stick: %llx %llx\n",
> &its->phys_base, its_base_type_string[type],

Overall, I think we need a slightly better fix. Since non-coherent
ITSs are quickly becoming the common case, we can save ourselves some
effort and hoist the quirked attributes early.

Does the hack below work for you?

M.

diff --git a/drivers/irqchip/irq-gic-v3-its.c b/drivers/irqchip/irq-gic-v3-its.c
index 75a2dd550625..d76d44ea2de1 100644
--- a/drivers/irqchip/irq-gic-v3-its.c
+++ b/drivers/irqchip/irq-gic-v3-its.c
@@ -2379,12 +2379,12 @@ static int its_setup_baser(struct its_node *its, struct its_baser *baser,
break;
}

+ if (!shr)
+ gic_flush_dcache_to_poc(base, PAGE_ORDER_TO_SIZE(order));
+
its_write_baser(its, baser, val);
tmp = baser->val;

- if (its->flags & ITS_FLAGS_FORCE_NON_SHAREABLE)
- tmp &= ~GITS_BASER_SHAREABILITY_MASK;
-
if ((val ^ tmp) & GITS_BASER_SHAREABILITY_MASK) {
/*
* Shareability didn't stick. Just use
@@ -2394,10 +2394,9 @@ static int its_setup_baser(struct its_node *its, struct its_baser *baser,
* non-cacheable as well.
*/
shr = tmp & GITS_BASER_SHAREABILITY_MASK;
- if (!shr) {
+ if (!shr)
cache = GITS_BASER_nC;
- gic_flush_dcache_to_poc(base, PAGE_ORDER_TO_SIZE(order));
- }
+
goto retry_baser;
}

@@ -2609,6 +2608,11 @@ static int its_alloc_tables(struct its_node *its)
/* erratum 24313: ignore memory access type */
cache = GITS_BASER_nCnB;

+ if (its->flags & ITS_FLAGS_FORCE_NON_SHAREABLE) {
+ cache = GITS_BASER_nC;
+ shr = 0;
+ }
+
for (i = 0; i < GITS_BASER_NR_REGS; i++) {
struct its_baser *baser = its->tables + i;
u64 val = its_read_baser(its, baser);

--
Without deviation from the norm, progress is not possible.

2023-10-26 08:56:36

by Fang Xiang

[permalink] [raw]
Subject: Re: [PATCH] irqchip/gic-v3-its: Fix the coherent issue in its_setup_baser() when shr = 0.

On Thu, Oct 26, 2023 at 09:01:17AM +0100, Marc Zyngier wrote:
> On Thu, 26 Oct 2023 03:01:16 +0100,
> Fang Xiang <[email protected]> wrote:
> >
> > The table would not be flushed if the input parameter shr = 0 in
> > its_setup_baser() and it would cause a coherent problem.
>
> Would? Or does? I'm asking, as you have previously indicated that this
> workaround was working for you.
>
> Have you actually observed a problem? Or is that by inspection?
>
I actually observed this problem on my device. GIC get a dirty table
because CPU did not flush the clean one to memory.
> >
> > Signed-off-by: Fang Xiang <[email protected]>
> > ---
> > drivers/irqchip/irq-gic-v3-its.c | 8 +++++---
> > 1 file changed, 5 insertions(+), 3 deletions(-)
> >
> > diff --git a/drivers/irqchip/irq-gic-v3-its.c b/drivers/irqchip/irq-gic-v3-its.c
> > index 75a2dd550625..58a9f24ccfa7 100644
> > --- a/drivers/irqchip/irq-gic-v3-its.c
> > +++ b/drivers/irqchip/irq-gic-v3-its.c
> > @@ -2394,13 +2394,15 @@ static int its_setup_baser(struct its_node *its, struct its_baser *baser,
> > * non-cacheable as well.
> > */
> > shr = tmp & GITS_BASER_SHAREABILITY_MASK;
> > - if (!shr) {
> > + if (!shr)
> > cache = GITS_BASER_nC;
> > - gic_flush_dcache_to_poc(base, PAGE_ORDER_TO_SIZE(order));
> > - }
> > +
> > goto retry_baser;
> > }
> >
> > + if (!shr)
> > + gic_flush_dcache_to_poc(base, PAGE_ORDER_TO_SIZE(order));
> > +
>
> This is wrong. You're doing the cache clean *after* the register has
> been programmed with its final value, and the ITS is perfectly allowed
> to prefetch anything it wants as soon as you program the register. The
> clean must thus happen before the write. Yes, it was wrong before, but
> you are now making it wrong always.
Sorry for that. But on my device, GIC would not read the table before
ITS enable(GITS_CTLR.Enabled == 1). When ITS is disabled, the prefetch
happens ever in other platforms?
>
> > if (val != tmp) {
> > pr_err("ITS@%pa: %s doesn't stick: %llx %llx\n",
> > &its->phys_base, its_base_type_string[type],
>
> Overall, I think we need a slightly better fix. Since non-coherent
> ITSs are quickly becoming the common case, we can save ourselves some
> effort and hoist the quirked attributes early.
>
> Does the hack below work for you?
>
> M.
>
> diff --git a/drivers/irqchip/irq-gic-v3-its.c b/drivers/irqchip/irq-gic-v3-its.c
> index 75a2dd550625..d76d44ea2de1 100644
> --- a/drivers/irqchip/irq-gic-v3-its.c
> +++ b/drivers/irqchip/irq-gic-v3-its.c
> @@ -2379,12 +2379,12 @@ static int its_setup_baser(struct its_node *its, struct its_baser *baser,
> break;
> }
>
> + if (!shr)
> + gic_flush_dcache_to_poc(base, PAGE_ORDER_TO_SIZE(order));
> +
> its_write_baser(its, baser, val);
> tmp = baser->val;
>
> - if (its->flags & ITS_FLAGS_FORCE_NON_SHAREABLE)
> - tmp &= ~GITS_BASER_SHAREABILITY_MASK;
> -
> if ((val ^ tmp) & GITS_BASER_SHAREABILITY_MASK) {
> /*
> * Shareability didn't stick. Just use
> @@ -2394,10 +2394,9 @@ static int its_setup_baser(struct its_node *its, struct its_baser *baser,
> * non-cacheable as well.
> */
> shr = tmp & GITS_BASER_SHAREABILITY_MASK;
> - if (!shr) {
> + if (!shr)
> cache = GITS_BASER_nC;
> - gic_flush_dcache_to_poc(base, PAGE_ORDER_TO_SIZE(order));
> - }
> +
> goto retry_baser;
> }
>
> @@ -2609,6 +2608,11 @@ static int its_alloc_tables(struct its_node *its)
> /* erratum 24313: ignore memory access type */
> cache = GITS_BASER_nCnB;
>
> + if (its->flags & ITS_FLAGS_FORCE_NON_SHAREABLE) {
> + cache = GITS_BASER_nC;
> + shr = 0;
> + }
> +
> for (i = 0; i < GITS_BASER_NR_REGS; i++) {
> struct its_baser *baser = its->tables + i;
> u64 val = its_read_baser(its, baser);
>
There maybe a risk in this patch above when non-shareable attibute indicated
by hardware, the table would not be flushed ever.
> --
> Without deviation from the norm, progress is not possible.

2023-10-26 09:48:33

by Marc Zyngier

[permalink] [raw]
Subject: Re: [PATCH] irqchip/gic-v3-its: Fix the coherent issue in its_setup_baser() when shr = 0.

On Thu, 26 Oct 2023 09:48:13 +0100,
Fang Xiang <[email protected]> wrote:
>
> On Thu, Oct 26, 2023 at 09:01:17AM +0100, Marc Zyngier wrote:
> > On Thu, 26 Oct 2023 03:01:16 +0100,
> > Fang Xiang <[email protected]> wrote:
> > >
> > > The table would not be flushed if the input parameter shr = 0 in
> > > its_setup_baser() and it would cause a coherent problem.
> >
> > Would? Or does? I'm asking, as you have previously indicated that this
> > workaround was working for you.
> >
> > Have you actually observed a problem? Or is that by inspection?
> >
> I actually observed this problem on my device. GIC get a dirty table
> because CPU did not flush the clean one to memory.

So how comes you previously reported that it was working for you?

> > >
> > > Signed-off-by: Fang Xiang <[email protected]>
> > > ---
> > > drivers/irqchip/irq-gic-v3-its.c | 8 +++++---
> > > 1 file changed, 5 insertions(+), 3 deletions(-)
> > >
> > > diff --git a/drivers/irqchip/irq-gic-v3-its.c b/drivers/irqchip/irq-gic-v3-its.c
> > > index 75a2dd550625..58a9f24ccfa7 100644
> > > --- a/drivers/irqchip/irq-gic-v3-its.c
> > > +++ b/drivers/irqchip/irq-gic-v3-its.c
> > > @@ -2394,13 +2394,15 @@ static int its_setup_baser(struct its_node *its, struct its_baser *baser,
> > > * non-cacheable as well.
> > > */
> > > shr = tmp & GITS_BASER_SHAREABILITY_MASK;
> > > - if (!shr) {
> > > + if (!shr)
> > > cache = GITS_BASER_nC;
> > > - gic_flush_dcache_to_poc(base, PAGE_ORDER_TO_SIZE(order));
> > > - }
> > > +
> > > goto retry_baser;
> > > }
> > >
> > > + if (!shr)
> > > + gic_flush_dcache_to_poc(base, PAGE_ORDER_TO_SIZE(order));
> > > +
> >
> > This is wrong. You're doing the cache clean *after* the register has
> > been programmed with its final value, and the ITS is perfectly allowed
> > to prefetch anything it wants as soon as you program the register. The
> > clean must thus happen before the write. Yes, it was wrong before, but
> > you are now making it wrong always.
> Sorry for that. But on my device, GIC would not read the table before
> ITS enable(GITS_CTLR.Enabled == 1). When ITS is disabled, the prefetch
> happens ever in other platforms?

GITS_CTLR.Enabled == 1 controls the *translation* (i.e. whether a
write to GITS_TRANSLATER gets processed or not). It doesn't say
anything of the tables that are pointed to by the ITS.

If you care to read the spec, you will find this (Arm IHI 0069H, page
5-95, "Software access to the private ITS tables"):

<quote>
* For a table that is pointed to by a GITS_BASER<n> register for which
GITS_BASER<n>.Valid == 1 and GITS_BASER<n>.Indirect == 0, behavior is
UNPREDICTABLE if the table is written by software.
</quote>

and a cache clean definitely counts as a write from the PoV of the
ITS. What your device does is pretty much irrelevant, as the
architecture allows any sort of access as soon as the Valid bit is
set.

> >
> > > if (val != tmp) {
> > > pr_err("ITS@%pa: %s doesn't stick: %llx %llx\n",
> > > &its->phys_base, its_base_type_string[type],
> >
> > Overall, I think we need a slightly better fix. Since non-coherent
> > ITSs are quickly becoming the common case, we can save ourselves some
> > effort and hoist the quirked attributes early.
> >
> > Does the hack below work for you?
> >
> > M.
> >
> > diff --git a/drivers/irqchip/irq-gic-v3-its.c b/drivers/irqchip/irq-gic-v3-its.c
> > index 75a2dd550625..d76d44ea2de1 100644
> > --- a/drivers/irqchip/irq-gic-v3-its.c
> > +++ b/drivers/irqchip/irq-gic-v3-its.c
> > @@ -2379,12 +2379,12 @@ static int its_setup_baser(struct its_node *its, struct its_baser *baser,
> > break;
> > }
> >
> > + if (!shr)
> > + gic_flush_dcache_to_poc(base, PAGE_ORDER_TO_SIZE(order));
> > +
> > its_write_baser(its, baser, val);
> > tmp = baser->val;
> >
> > - if (its->flags & ITS_FLAGS_FORCE_NON_SHAREABLE)
> > - tmp &= ~GITS_BASER_SHAREABILITY_MASK;
> > -
> > if ((val ^ tmp) & GITS_BASER_SHAREABILITY_MASK) {
> > /*
> > * Shareability didn't stick. Just use
> > @@ -2394,10 +2394,9 @@ static int its_setup_baser(struct its_node *its, struct its_baser *baser,
> > * non-cacheable as well.
> > */
> > shr = tmp & GITS_BASER_SHAREABILITY_MASK;
> > - if (!shr) {
> > + if (!shr)
> > cache = GITS_BASER_nC;
> > - gic_flush_dcache_to_poc(base, PAGE_ORDER_TO_SIZE(order));
> > - }
> > +
> > goto retry_baser;
> > }
> >
> > @@ -2609,6 +2608,11 @@ static int its_alloc_tables(struct its_node *its)
> > /* erratum 24313: ignore memory access type */
> > cache = GITS_BASER_nCnB;
> >
> > + if (its->flags & ITS_FLAGS_FORCE_NON_SHAREABLE) {
> > + cache = GITS_BASER_nC;
> > + shr = 0;
> > + }
> > +
> > for (i = 0; i < GITS_BASER_NR_REGS; i++) {
> > struct its_baser *baser = its->tables + i;
> > u64 val = its_read_baser(its, baser);
> >
> There maybe a risk in this patch above when non-shareable attibute indicated
> by hardware, the table would not be flushed ever.

How? If the HW rejects the shareability attribute, we set shr to 0 and
hit the retry path. At this point, we will clean the page to the PoC
before writing the register again.

What am I missing?

M.

--
Without deviation from the norm, progress is not possible.

2023-10-26 11:54:54

by Fang Xiang

[permalink] [raw]
Subject: Re: [PATCH] irqchip/gic-v3-its: Fix the coherent issue in its_setup_baser() when shr = 0.

On Thu, Oct 26, 2023 at 10:48:05AM +0100, Marc Zyngier wrote:
> On Thu, 26 Oct 2023 09:48:13 +0100,
> Fang Xiang <[email protected]> wrote:
> >
> > On Thu, Oct 26, 2023 at 09:01:17AM +0100, Marc Zyngier wrote:
> > > On Thu, 26 Oct 2023 03:01:16 +0100,
> > > Fang Xiang <[email protected]> wrote:
> > > >
> > > > The table would not be flushed if the input parameter shr = 0 in
> > > > its_setup_baser() and it would cause a coherent problem.
> > >
> > > Would? Or does? I'm asking, as you have previously indicated that this
> > > workaround was working for you.
> > >
> > > Have you actually observed a problem? Or is that by inspection?
> > >
> > I actually observed this problem on my device. GIC get a dirty table
> > because CPU did not flush the clean one to memory.
>
> So how comes you previously reported that it was working for you?
>

It is a complicated situation. Just the Virtual CPUs table is dirty on
my device, so the physical LPI works well.

> > > >
> > > > Signed-off-by: Fang Xiang <[email protected]>
> > > > ---
> > > > drivers/irqchip/irq-gic-v3-its.c | 8 +++++---
> > > > 1 file changed, 5 insertions(+), 3 deletions(-)
> > > >
> > > > diff --git a/drivers/irqchip/irq-gic-v3-its.c b/drivers/irqchip/irq-gic-v3-its.c
> > > > index 75a2dd550625..58a9f24ccfa7 100644
> > > > --- a/drivers/irqchip/irq-gic-v3-its.c
> > > > +++ b/drivers/irqchip/irq-gic-v3-its.c
> > > > @@ -2394,13 +2394,15 @@ static int its_setup_baser(struct its_node *its, struct its_baser *baser,
> > > > * non-cacheable as well.
> > > > */
> > > > shr = tmp & GITS_BASER_SHAREABILITY_MASK;
> > > > - if (!shr) {
> > > > + if (!shr)
> > > > cache = GITS_BASER_nC;
> > > > - gic_flush_dcache_to_poc(base, PAGE_ORDER_TO_SIZE(order));
> > > > - }
> > > > +
> > > > goto retry_baser;
> > > > }
> > > >
> > > > + if (!shr)
> > > > + gic_flush_dcache_to_poc(base, PAGE_ORDER_TO_SIZE(order));
> > > > +
> > >
> > > This is wrong. You're doing the cache clean *after* the register has
> > > been programmed with its final value, and the ITS is perfectly allowed
> > > to prefetch anything it wants as soon as you program the register. The
> > > clean must thus happen before the write. Yes, it was wrong before, but
> > > you are now making it wrong always.
> > Sorry for that. But on my device, GIC would not read the table before
> > ITS enable(GITS_CTLR.Enabled == 1). When ITS is disabled, the prefetch
> > happens ever in other platforms?
>
> GITS_CTLR.Enabled == 1 controls the *translation* (i.e. whether a
> write to GITS_TRANSLATER gets processed or not). It doesn't say
> anything of the tables that are pointed to by the ITS.
>
> If you care to read the spec, you will find this (Arm IHI 0069H, page
> 5-95, "Software access to the private ITS tables"):
>
> <quote>
> * For a table that is pointed to by a GITS_BASER<n> register for which
> GITS_BASER<n>.Valid == 1 and GITS_BASER<n>.Indirect == 0, behavior is
> UNPREDICTABLE if the table is written by software.
> </quote>
>
> and a cache clean definitely counts as a write from the PoV of the
> ITS. What your device does is pretty much irrelevant, as the
> architecture allows any sort of access as soon as the Valid bit is
> set.
>

Yes, I see it. The tables should be flushed before we write the GITS_BASER[n]
registers.

> > >
> > > > if (val != tmp) {
> > > > pr_err("ITS@%pa: %s doesn't stick: %llx %llx\n",
> > > > &its->phys_base, its_base_type_string[type],
> > >
> > > Overall, I think we need a slightly better fix. Since non-coherent
> > > ITSs are quickly becoming the common case, we can save ourselves some
> > > effort and hoist the quirked attributes early.
> > >
> > > Does the hack below work for you?
> > >
> > > M.
> > >
> > > diff --git a/drivers/irqchip/irq-gic-v3-its.c b/drivers/irqchip/irq-gic-v3-its.c
> > > index 75a2dd550625..d76d44ea2de1 100644
> > > --- a/drivers/irqchip/irq-gic-v3-its.c
> > > +++ b/drivers/irqchip/irq-gic-v3-its.c
> > > @@ -2379,12 +2379,12 @@ static int its_setup_baser(struct its_node *its, struct its_baser *baser,
> > > break;
> > > }
> > >
> > > + if (!shr)
> > > + gic_flush_dcache_to_poc(base, PAGE_ORDER_TO_SIZE(order));
> > > +
> > > its_write_baser(its, baser, val);
> > > tmp = baser->val;
> > >
> > > - if (its->flags & ITS_FLAGS_FORCE_NON_SHAREABLE)
> > > - tmp &= ~GITS_BASER_SHAREABILITY_MASK;
> > > -
> > > if ((val ^ tmp) & GITS_BASER_SHAREABILITY_MASK) {
> > > /*
> > > * Shareability didn't stick. Just use
> > > @@ -2394,10 +2394,9 @@ static int its_setup_baser(struct its_node *its, struct its_baser *baser,
> > > * non-cacheable as well.
> > > */
> > > shr = tmp & GITS_BASER_SHAREABILITY_MASK;
> > > - if (!shr) {
> > > + if (!shr)
> > > cache = GITS_BASER_nC;
> > > - gic_flush_dcache_to_poc(base, PAGE_ORDER_TO_SIZE(order));
> > > - }
> > > +
> > > goto retry_baser;
> > > }
> > >
> > > @@ -2609,6 +2608,11 @@ static int its_alloc_tables(struct its_node *its)
> > > /* erratum 24313: ignore memory access type */
> > > cache = GITS_BASER_nCnB;
> > >
> > > + if (its->flags & ITS_FLAGS_FORCE_NON_SHAREABLE) {
> > > + cache = GITS_BASER_nC;
> > > + shr = 0;
> > > + }
> > > +
> > > for (i = 0; i < GITS_BASER_NR_REGS; i++) {
> > > struct its_baser *baser = its->tables + i;
> > > u64 val = its_read_baser(its, baser);
> > >
> > There maybe a risk in this patch above when non-shareable attibute indicated
> > by hardware, the table would not be flushed ever.
>
> How? If the HW rejects the shareability attribute, we set shr to 0 and
> hit the retry path. At this point, we will clean the page to the PoC
> before writing the register again.
>
> What am I missing?
>

Sorry, It is a good fix. I will test it on my device and give a feedback
soon.

> M.
>
> --
> Without deviation from the norm, progress is not possible.

2023-11-27 14:29:34

by Slawomir Stepien

[permalink] [raw]
Subject: Re: [PATCH] irqchip/gic-v3-its: Fix the coherent issue in its_setup_baser() when shr = 0.

Hello Marc and Fang

(Maybe nobody cares anymore...but I've observed the problem and I'm glad it is now fixed, see below
my symptoms).

On paź 26, 2023 09:01, Marc Zyngier wrote:
> On Thu, 26 Oct 2023 03:01:16 +0100,
> Fang Xiang <[email protected]> wrote:
> >
> > The table would not be flushed if the input parameter shr = 0 in
> > its_setup_baser() and it would cause a coherent problem.
>
> Would? Or does? I'm asking, as you have previously indicated that this
> workaround was working for you.
>
> Have you actually observed a problem? Or is that by inspection?

On my Orange Pi 5 Plus (RK3588), I've seen (only during reboot):

[ 68.236523] ITS queue timeout (17856 17793)
[ 68.236893] ITS cmd its_build_discard_cmd failed
[ 69.697180] ITS queue timeout (17920 17793)
[ 69.697546] ITS cmd its_build_discard_cmd failed
[ 71.157769] ITS queue timeout (17984 17793)
[ 71.158136] ITS cmd its_build_discard_cmd failed
[ 72.618378] ITS queue timeout (18048 17793)
[ 72.618748] ITS cmd its_build_discard_cmd failed
[ 74.078961] ITS queue timeout (18112 17793)
[ 74.079327] ITS cmd its_build_discard_cmd failed
[ 75.539560] ITS queue timeout (18176 17793)
[ 75.539926] ITS cmd its_build_discard_cmd failed
[ 76.999319] ITS queue timeout (18240 17793)
[ 76.999685] ITS cmd its_build_inv_cmd failed
[ 78.458810] ITS queue timeout (18304 17793)
[ 78.459176] ITS cmd its_build_discard_cmd failed
[ 79.918328] ITS queue timeout (18368 17793)
[ 79.918694] ITS cmd its_build_inv_cmd failed
[ 81.378019] ITS queue timeout (18432 17793)
[ 81.378386] ITS cmd its_build_discard_cmd failed
[ 82.838738] ITS queue timeout (18496 17793)
[ 82.839105] ITS cmd its_build_discard_cmd failed
[ 84.299324] ITS queue timeout (18560 17793)
[ 84.299690] ITS cmd its_build_discard_cmd failed
[ 85.759906] ITS queue timeout (18624 17793)
[ 85.760273] ITS cmd its_build_discard_cmd failed
[ 87.220496] ITS queue timeout (18688 17793)
[ 87.220861] ITS cmd its_build_discard_cmd failed
[ 88.681075] ITS queue timeout (18752 17793)
[ 88.681441] ITS cmd its_build_discard_cmd failed
[ 90.141657] ITS queue timeout (18816 17793)
[ 90.142024] ITS cmd its_build_discard_cmd failed
[ 91.602233] ITS queue timeout (18880 17793)
[ 91.602599] ITS cmd its_build_discard_cmd failed
[ 93.062827] ITS queue timeout (18944 17793)
[ 93.063193] ITS cmd its_build_discard_cmd failed
[ 94.601533] ITS queue timeout (8992 8929)
[ 94.601885] ITS cmd its_build_discard_cmd failed
[ 96.062111] ITS queue timeout (9056 8929)
[ 96.062462] ITS cmd its_build_discard_cmd failed
[ 97.522652] ITS queue timeout (9120 8929)
[ 97.523005] ITS cmd its_build_discard_cmd failed
[ 98.983192] ITS queue timeout (9184 8929)
[ 98.983543] ITS cmd its_build_discard_cmd failed
[ 100.443753] ITS queue timeout (9248 8929)
[ 100.444104] ITS cmd its_build_discard_cmd failed
[ 101.904294] ITS queue timeout (9312 8929)
[ 101.904645] ITS cmd its_build_discard_cmd failed
[ 103.364847] ITS queue timeout (9376 8929)
[ 103.365198] ITS cmd its_build_discard_cmd failed
[ 104.825394] ITS queue timeout (9440 8929)
[ 104.825743] ITS cmd its_build_discard_cmd failed
[ 104.827172] reboot: Restarting system

This patch (the one that ended up in 6.7.0-rc3) fixed this issue for me.

> >
> > Signed-off-by: Fang Xiang <[email protected]>
> > ---
> > drivers/irqchip/irq-gic-v3-its.c | 8 +++++---
> > 1 file changed, 5 insertions(+), 3 deletions(-)
> >
> > diff --git a/drivers/irqchip/irq-gic-v3-its.c b/drivers/irqchip/irq-gic-v3-its.c
> > index 75a2dd550625..58a9f24ccfa7 100644
> > --- a/drivers/irqchip/irq-gic-v3-its.c
> > +++ b/drivers/irqchip/irq-gic-v3-its.c
> > @@ -2394,13 +2394,15 @@ static int its_setup_baser(struct its_node *its, struct its_baser *baser,
> > * non-cacheable as well.
> > */
> > shr = tmp & GITS_BASER_SHAREABILITY_MASK;
> > - if (!shr) {
> > + if (!shr)
> > cache = GITS_BASER_nC;
> > - gic_flush_dcache_to_poc(base, PAGE_ORDER_TO_SIZE(order));
> > - }
> > +
> > goto retry_baser;
> > }
> >
> > + if (!shr)
> > + gic_flush_dcache_to_poc(base, PAGE_ORDER_TO_SIZE(order));
> > +
>
> This is wrong. You're doing the cache clean *after* the register has
> been programmed with its final value, and the ITS is perfectly allowed
> to prefetch anything it wants as soon as you program the register. The
> clean must thus happen before the write. Yes, it was wrong before, but
> you are now making it wrong always.
>
> > if (val != tmp) {
> > pr_err("ITS@%pa: %s doesn't stick: %llx %llx\n",
> > &its->phys_base, its_base_type_string[type],
>
> Overall, I think we need a slightly better fix. Since non-coherent
> ITSs are quickly becoming the common case, we can save ourselves some
> effort and hoist the quirked attributes early.
>
> Does the hack below work for you?
>
> M.
>
> diff --git a/drivers/irqchip/irq-gic-v3-its.c b/drivers/irqchip/irq-gic-v3-its.c
> index 75a2dd550625..d76d44ea2de1 100644
> --- a/drivers/irqchip/irq-gic-v3-its.c
> +++ b/drivers/irqchip/irq-gic-v3-its.c
> @@ -2379,12 +2379,12 @@ static int its_setup_baser(struct its_node *its, struct its_baser *baser,
> break;
> }
>
> + if (!shr)
> + gic_flush_dcache_to_poc(base, PAGE_ORDER_TO_SIZE(order));
> +
> its_write_baser(its, baser, val);
> tmp = baser->val;
>
> - if (its->flags & ITS_FLAGS_FORCE_NON_SHAREABLE)
> - tmp &= ~GITS_BASER_SHAREABILITY_MASK;
> -
> if ((val ^ tmp) & GITS_BASER_SHAREABILITY_MASK) {
> /*
> * Shareability didn't stick. Just use
> @@ -2394,10 +2394,9 @@ static int its_setup_baser(struct its_node *its, struct its_baser *baser,
> * non-cacheable as well.
> */
> shr = tmp & GITS_BASER_SHAREABILITY_MASK;
> - if (!shr) {
> + if (!shr)
> cache = GITS_BASER_nC;
> - gic_flush_dcache_to_poc(base, PAGE_ORDER_TO_SIZE(order));
> - }
> +
> goto retry_baser;
> }
>
> @@ -2609,6 +2608,11 @@ static int its_alloc_tables(struct its_node *its)
> /* erratum 24313: ignore memory access type */
> cache = GITS_BASER_nCnB;
>
> + if (its->flags & ITS_FLAGS_FORCE_NON_SHAREABLE) {
> + cache = GITS_BASER_nC;
> + shr = 0;
> + }
> +
> for (i = 0; i < GITS_BASER_NR_REGS; i++) {
> struct its_baser *baser = its->tables + i;
> u64 val = its_read_baser(its, baser);

--
Slawomir Stepien