2023-02-14 23:34:38

by Florian Fainelli

[permalink] [raw]
Subject: [PATCH 0/3] Save/restore for GICv3

Hi all,

This patch series adds support for saving and restoring the GIC
distributor and re-distributor which was missing for platforms that
implement suspend states where the GIC loses power and therefore its
state.

The system that I have been testing this with has:

[ 0.000000] GICv3: GIC: Using split EOI/Deactivate mode
[ 0.000000] GICv3: 288 SPIs implemented
[ 0.000000] GICv3: 0 Extended SPIs implemented
[ 0.000000] Root IRQ handler: gic_handle_irq
[ 0.000000] GICv3: GICv3 features: 16 PPIs

So no support for extended PPIs or SPIs, hopefully the code is correct,
or close to.

Thanks!

Florian Fainelli (3):
irqchip/gic-v3: Use switch/case statements in gic_cpu_pm_notifier
irqchip/gic-v3: Propagate gic_cpu_pm_init() return code
irqchip/gic-v3: Save and restore distributor and re-distributor

drivers/irqchip/irq-gic-v3.c | 282 ++++++++++++++++++++++++++++-
include/linux/irqchip/arm-gic-v3.h | 4 +
2 files changed, 280 insertions(+), 6 deletions(-)

--
2.34.1



2023-02-14 23:34:41

by Florian Fainelli

[permalink] [raw]
Subject: [PATCH 1/3] irqchip/gic-v3: Use switch/case statements in gic_cpu_pm_notifier

No functional change, but facilitates adding new states in subsequent
changes.

Signed-off-by: Florian Fainelli <[email protected]>
---
drivers/irqchip/irq-gic-v3.c | 14 ++++++++++----
1 file changed, 10 insertions(+), 4 deletions(-)

diff --git a/drivers/irqchip/irq-gic-v3.c b/drivers/irqchip/irq-gic-v3.c
index bb57ab8bff6a..b60fadb7eb44 100644
--- a/drivers/irqchip/irq-gic-v3.c
+++ b/drivers/irqchip/irq-gic-v3.c
@@ -1374,14 +1374,20 @@ static int gic_retrigger(struct irq_data *data)
static int gic_cpu_pm_notifier(struct notifier_block *self,
unsigned long cmd, void *v)
{
- if (cmd == CPU_PM_EXIT) {
+ switch (cmd) {
+ case CPU_PM_ENTER:
+ if (gic_dist_security_disabled()) {
+ gic_write_grpen1(0);
+ gic_enable_redist(false);
+ }
+ break;
+ case CPU_PM_EXIT:
if (gic_dist_security_disabled())
gic_enable_redist(true);
gic_cpu_sys_reg_init();
- } else if (cmd == CPU_PM_ENTER && gic_dist_security_disabled()) {
- gic_write_grpen1(0);
- gic_enable_redist(false);
+ break;
}
+
return NOTIFY_OK;
}

--
2.34.1


2023-02-14 23:34:44

by Florian Fainelli

[permalink] [raw]
Subject: [PATCH 2/3] irqchip/gic-v3: Propagate gic_cpu_pm_init() return code

In preparation for allocating memory which may fail, propagate the
return code from gic_cpu_pm_init().

Signed-off-by: Florian Fainelli <[email protected]>
---
drivers/irqchip/irq-gic-v3.c | 10 ++++++++--
1 file changed, 8 insertions(+), 2 deletions(-)

diff --git a/drivers/irqchip/irq-gic-v3.c b/drivers/irqchip/irq-gic-v3.c
index b60fadb7eb44..48b0e9aba27c 100644
--- a/drivers/irqchip/irq-gic-v3.c
+++ b/drivers/irqchip/irq-gic-v3.c
@@ -1395,9 +1395,11 @@ static struct notifier_block gic_cpu_pm_notifier_block = {
.notifier_call = gic_cpu_pm_notifier,
};

-static void gic_cpu_pm_init(void)
+static int gic_cpu_pm_init(void)
{
cpu_pm_register_notifier(&gic_cpu_pm_notifier_block);
+
+ return 0;
}

#else
@@ -1891,7 +1893,9 @@ static int __init gic_init_bases(void __iomem *dist_base,
gic_dist_init();
gic_cpu_init();
gic_smp_init();
- gic_cpu_pm_init();
+ err = gic_cpu_pm_init();
+ if (err)
+ goto out_set_handle;

if (gic_dist_supports_lpis()) {
its_init(handle, &gic_data.rdists, gic_data.domain);
@@ -1906,6 +1910,8 @@ static int __init gic_init_bases(void __iomem *dist_base,

return 0;

+out_set_handle:
+ set_handle_irq(NULL);
out_free:
if (gic_data.domain)
irq_domain_remove(gic_data.domain);
--
2.34.1


2023-02-14 23:34:51

by Florian Fainelli

[permalink] [raw]
Subject: [PATCH 3/3] irqchip/gic-v3: Save and restore distributor and re-distributor

On platforms implementing Suspend to RAM where the GIC loses power, we
are not properly saving and restoring the GIC distributor and
re-distributor registers thus leading to the system resuming without any
functional interrupts.

Add support for saving and restoring the GIC distributor and
re-distributor in order to properly suspend and resume with a functional
system.

Signed-off-by: Florian Fainelli <[email protected]>
---
drivers/irqchip/irq-gic-v3.c | 258 +++++++++++++++++++++++++++++
include/linux/irqchip/arm-gic-v3.h | 4 +
2 files changed, 262 insertions(+)

diff --git a/drivers/irqchip/irq-gic-v3.c b/drivers/irqchip/irq-gic-v3.c
index 48b0e9aba27c..4caab61268d0 100644
--- a/drivers/irqchip/irq-gic-v3.c
+++ b/drivers/irqchip/irq-gic-v3.c
@@ -11,6 +11,7 @@
#include <linux/cpu_pm.h>
#include <linux/delay.h>
#include <linux/interrupt.h>
+#include <linux/io-64-nonatomic-lo-hi.h>
#include <linux/irqdomain.h>
#include <linux/kstrtox.h>
#include <linux/of.h>
@@ -57,6 +58,25 @@ struct gic_chip_data {
bool has_rss;
unsigned int ppi_nr;
struct partition_desc **ppi_descs;
+#ifdef CONFIG_CPU_PM
+ u32 *saved_spi_conf;
+ u64 *saved_spi_target;
+ u32 *saved_spi_enable;
+ u32 *saved_spi_active;
+
+ u32 *saved_espi_conf;
+ u64 *saved_espi_target;
+ u32 *saved_espi_enable;
+ u32 *saved_espi_active;
+
+ u32 saved_ppi_conf;
+ u32 saved_ppi_enable;
+ u32 saved_ppi_active;
+
+ u32 *saved_eppi_conf;
+ u32 *saved_eppi_enable;
+ u32 *saved_eppi_active;
+#endif
};

static struct gic_chip_data gic_data __read_mostly;
@@ -1371,6 +1391,143 @@ static int gic_retrigger(struct irq_data *data)
}

#ifdef CONFIG_CPU_PM
+static void gic_rdist_save(void)
+{
+ struct gic_chip_data *gic = &gic_data;
+ void __iomem *rbase = gic_data_rdist_sgi_base();
+ unsigned int i;
+
+ gic->saved_ppi_conf = readl_relaxed(rbase + GICR_ICFGR0 + 4);
+ gic->saved_ppi_enable = readl_relaxed(rbase + GICR_ISENABLER0);
+ gic->saved_ppi_active = readl_relaxed(rbase + GICR_ICACTIVER0);
+
+ for (i = 0; i < DIV_ROUND_UP(gic->ppi_nr - 16, 32); i++) {
+ gic->saved_eppi_conf[i] =
+ readl_relaxed(rbase + GICR_ICFGRnE + i * 4);
+ gic->saved_eppi_enable[i] =
+ readl_relaxed(rbase + GICR_ISENABLERnE + i * 4);
+ gic->saved_eppi_active[i] =
+ readl_relaxed(rbase + GICR_ICACTIVERnE + i * 4);
+ }
+}
+
+static void gic_dist_save(void)
+{
+ struct gic_chip_data *gic = &gic_data;
+ void __iomem *base = gic_data.dist_base;
+ unsigned int i;
+
+ /* Save the SPIs first */
+ for (i = 2; i < DIV_ROUND_UP(GIC_LINE_NR, 16); i++)
+ gic->saved_spi_conf[i] =
+ readl_relaxed(base + GICD_ICFGR + i * 4);
+
+ for (i = 32; i < GIC_LINE_NR; i++)
+ gic->saved_spi_target[i] =
+ readq_relaxed(base + GICD_IROUTER + i * 8);
+
+ for (i = 1; i < DIV_ROUND_UP(GIC_LINE_NR, 32); i++) {
+ gic->saved_spi_enable[i] =
+ readl_relaxed(base + GICD_ISENABLER + i * 4);
+ gic->saved_spi_active[i] =
+ readl_relaxed(base + GICD_ISACTIVER + i * 4);
+ }
+
+ /* Save the EPIs next */
+ for (i = 0; i < DIV_ROUND_UP(GIC_ESPI_NR, 16); i++)
+ gic->saved_espi_conf[i] =
+ readl_relaxed(base + GICD_ICFGRnE + i * 4);
+
+ for (i = 0; i < GIC_ESPI_NR; i++)
+ gic->saved_espi_target[i] =
+ readq_relaxed(base + GICD_IROUTERnE + i * 8);
+
+ for (i = 0; i < DIV_ROUND_UP(GIC_ESPI_NR, 32); i++) {
+ gic->saved_espi_enable[i] =
+ readl_relaxed(base + GICD_ISENABLERnE + i * 4);
+ gic->saved_espi_active[i] =
+ readl_relaxed(base + GICD_ISACTIVERnE + i * 4);
+ }
+}
+
+static void gic_rdist_restore(void)
+{
+ struct gic_chip_data *gic = &gic_data;
+ void __iomem *rbase = gic_data_rdist_sgi_base();
+ unsigned int i;
+
+ writel_relaxed(gic->saved_ppi_conf, rbase + GICR_ICFGR0 + 4);
+ writel_relaxed(gic->saved_ppi_enable, rbase + GICR_ISENABLER0);
+ writel_relaxed(gic->saved_ppi_active, rbase + GICR_ICACTIVER0);
+
+ for (i = 0; i < DIV_ROUND_UP(gic->ppi_nr - 16, 32); i++) {
+ writel_relaxed(gic->saved_eppi_conf[i],
+ rbase + GICR_ICFGRnE + i * 4);
+ writel_relaxed(gic->saved_eppi_enable[i],
+ rbase + GICR_ISENABLERnE + i * 4);
+ writel_relaxed(gic->saved_eppi_active[i],
+ rbase + GICR_ICACTIVERnE + i * 4);
+ }
+}
+
+static void gic_dist_restore(void)
+{
+ struct gic_chip_data *gic = &gic_data;
+ void __iomem *base = gic_data.dist_base;
+ unsigned int i;
+
+ /* Ensure distributor is disabled */
+ writel_relaxed(0, base + GICD_CTLR);
+ gic_dist_wait_for_rwp();
+
+ /* Configure SPIs as non-secure Group-1. */
+ for (i = 32; i < GIC_LINE_NR; i += 32)
+ writel_relaxed(~0, base + GICD_IGROUPR + i / 8);
+
+ /* Restore the SPIs */
+ for (i = 2; i < DIV_ROUND_UP(GIC_LINE_NR, 16); i++)
+ writel_relaxed(gic->saved_spi_conf[i],
+ base + GICD_ICFGR + i * 4);
+
+ for (i = 32; i < GIC_LINE_NR; i++)
+ writel_relaxed(gic->saved_spi_target[i],
+ base + GICD_IROUTER + i * 8);
+
+ for (i = 1; i < DIV_ROUND_UP(GIC_LINE_NR, 32); i++) {
+ writel_relaxed(gic->saved_spi_enable[i],
+ base + GICD_ISENABLER + i * 4);
+ writel_relaxed(gic->saved_spi_active[i],
+ base + GICD_ISACTIVER + i * 4);
+ }
+
+ /* Configure ESPIs as non-secure Group-1. */
+ for (i = 0; i < GIC_ESPI_NR; i += 32)
+ writel_relaxed(~0U, base + GICD_IGROUPRnE + i / 8);
+
+ /* Restore the ESPIs */
+ for (i = 0; i < DIV_ROUND_UP(GIC_ESPI_NR, 16); i++)
+ writel_relaxed(gic->saved_espi_conf[i],
+ base + GICD_ICFGRnE + i * 4);
+
+ for (i = 0; i < GIC_ESPI_NR; i++)
+ writeq_relaxed(gic->saved_espi_target[i],
+ base + GICD_IROUTERnE + i * 8);
+
+ for (i = 0; i < DIV_ROUND_UP(GIC_ESPI_NR, 32); i++) {
+ writel_relaxed(gic->saved_espi_enable[i],
+ base + GICD_ISENABLERnE + i * 4);
+ writel_relaxed(gic->saved_espi_active[i],
+ base + GICD_ISACTIVERnE + i * 4);
+ }
+
+ for (i = 0; i < GIC_ESPI_NR; i += 4)
+ writel_relaxed(GICD_INT_DEF_PRI_X4, base + GICD_IPRIORITYRnE + i);
+
+ /* Enable distributor with ARE, Group1 */
+ writel_relaxed(GICD_CTLR_ARE_NS | GICD_CTLR_ENABLE_G1A | GICD_CTLR_ENABLE_G1,
+ base + GICD_CTLR);
+}
+
static int gic_cpu_pm_notifier(struct notifier_block *self,
unsigned long cmd, void *v)
{
@@ -1380,12 +1537,20 @@ static int gic_cpu_pm_notifier(struct notifier_block *self,
gic_write_grpen1(0);
gic_enable_redist(false);
}
+ gic_rdist_save();
+ break;
+ case CPU_CLUSTER_PM_ENTER:
+ gic_dist_save();
break;
case CPU_PM_EXIT:
+ gic_rdist_restore();
if (gic_dist_security_disabled())
gic_enable_redist(true);
gic_cpu_sys_reg_init();
break;
+ case CPU_CLUSTER_PM_EXIT:
+ gic_dist_restore();
+ break;
}

return NOTIFY_OK;
@@ -1397,9 +1562,102 @@ static struct notifier_block gic_cpu_pm_notifier_block = {

static int gic_cpu_pm_init(void)
{
+ struct gic_chip_data *gic = &gic_data;
+ unsigned int spi_size = DIV_ROUND_UP(GIC_LINE_NR, 32);
+ unsigned int espi_size = DIV_ROUND_UP(GIC_ESPI_NR, 32);
+ unsigned int eppi_size = DIV_ROUND_UP(gic->ppi_nr - 16, 32);
+
+ gic->saved_spi_conf = kcalloc(DIV_ROUND_UP(GIC_LINE_NR, 16),
+ sizeof(*gic->saved_spi_conf),
+ GFP_KERNEL);
+ if (WARN_ON(!gic->saved_spi_conf))
+ return -ENOMEM;
+
+ gic->saved_spi_target = kcalloc(GIC_LINE_NR,
+ sizeof(*gic->saved_spi_target),
+ GFP_KERNEL);
+ if (WARN_ON(!gic->saved_spi_target))
+ goto out_free_spi_conf;
+
+ gic->saved_spi_enable = kcalloc(spi_size,
+ sizeof(*gic->saved_spi_enable),
+ GFP_KERNEL);
+ if (WARN_ON(!gic->saved_spi_enable))
+ goto out_free_spi_target;
+
+ gic->saved_spi_active = kcalloc(spi_size,
+ sizeof(*gic->saved_spi_active),
+ GFP_KERNEL);
+ if (WARN_ON(!gic->saved_spi_active))
+ goto out_free_spi_enable;
+
+ gic->saved_espi_conf = kcalloc(DIV_ROUND_UP(GIC_ESPI_NR, 16),
+ sizeof(*gic->saved_espi_conf),
+ GFP_KERNEL);
+ if (WARN_ON(!gic->saved_espi_conf))
+ goto out_free_spi_active;
+
+ gic->saved_espi_target = kcalloc(GIC_ESPI_NR,
+ sizeof(*gic->saved_espi_target),
+ GFP_KERNEL);
+ if (WARN_ON(!gic->saved_espi_target))
+ goto out_free_espi_conf;
+
+ gic->saved_espi_enable = kcalloc(espi_size,
+ sizeof(*gic->saved_espi_enable),
+ GFP_KERNEL);
+ if (WARN_ON(!gic->saved_espi_enable))
+ goto out_free_espi_target;
+
+ gic->saved_espi_active = kcalloc(espi_size,
+ sizeof(*gic->saved_espi_active),
+ GFP_KERNEL);
+ if (WARN_ON(!gic->saved_espi_active))
+ goto out_free_espi_enable;
+
+ gic->saved_eppi_conf = kcalloc(DIV_ROUND_UP(gic->ppi_nr - 16, 16),
+ sizeof(*gic->saved_eppi_conf),
+ GFP_KERNEL);
+ if (WARN_ON(!gic->saved_eppi_conf))
+ goto out_free_espi_active;
+
+ gic->saved_eppi_enable = kcalloc(eppi_size,
+ sizeof(*gic->saved_eppi_enable),
+ GFP_KERNEL);
+ if (WARN_ON(!gic->saved_eppi_enable))
+ goto out_free_eppi_conf;
+
+ gic->saved_eppi_active = kcalloc(eppi_size,
+ sizeof(*gic->saved_eppi_active),
+ GFP_KERNEL);
+ if (WARN_ON(!gic->saved_eppi_active))
+ goto out_free_eppi_enable;
+
cpu_pm_register_notifier(&gic_cpu_pm_notifier_block);

return 0;
+
+out_free_eppi_enable:
+ kfree(gic->saved_eppi_enable);
+out_free_eppi_conf:
+ kfree(gic->saved_eppi_conf);
+out_free_espi_active:
+ kfree(gic->saved_espi_active);
+out_free_espi_enable:
+ kfree(gic->saved_espi_enable);
+out_free_espi_target:
+ kfree(gic->saved_espi_target);
+out_free_espi_conf:
+ kfree(gic->saved_espi_conf);
+out_free_spi_active:
+ kfree(gic->saved_spi_active);
+out_free_spi_enable:
+ kfree(gic->saved_spi_enable);
+out_free_spi_target:
+ kfree(gic->saved_spi_target);
+out_free_spi_conf:
+ kfree(gic->saved_spi_conf);
+ return -ENOMEM;
}

#else
diff --git a/include/linux/irqchip/arm-gic-v3.h b/include/linux/irqchip/arm-gic-v3.h
index 728691365464..40483530cadd 100644
--- a/include/linux/irqchip/arm-gic-v3.h
+++ b/include/linux/irqchip/arm-gic-v3.h
@@ -229,13 +229,17 @@
*/
#define GICR_IGROUPR0 GICD_IGROUPR
#define GICR_ISENABLER0 GICD_ISENABLER
+#define GICR_ISENABLERnE GICD_ISENABLERnE
#define GICR_ICENABLER0 GICD_ICENABLER
#define GICR_ISPENDR0 GICD_ISPENDR
#define GICR_ICPENDR0 GICD_ICPENDR
#define GICR_ISACTIVER0 GICD_ISACTIVER
+#define GICR_ISACTIVERnE GICD_ISACTIVERnE
#define GICR_ICACTIVER0 GICD_ICACTIVER
+#define GICR_ICACTIVERnE GICD_ICACTIVERnE
#define GICR_IPRIORITYR0 GICD_IPRIORITYR
#define GICR_ICFGR0 GICD_ICFGR
+#define GICR_ICFGRnE GICD_ICFGRnE
#define GICR_IGRPMODR0 GICD_IGRPMODR
#define GICR_NSACR GICD_NSACR

--
2.34.1


2023-02-15 08:02:42

by Marc Zyngier

[permalink] [raw]
Subject: Re: [PATCH 3/3] irqchip/gic-v3: Save and restore distributor and re-distributor

On Tue, 14 Feb 2023 23:34:26 +0000,
Florian Fainelli <[email protected]> wrote:
>
> On platforms implementing Suspend to RAM where the GIC loses power, we
> are not properly saving and restoring the GIC distributor and
> re-distributor registers thus leading to the system resuming without any
> functional interrupts.

The real question is *why* we need any of this. On any decent system,
this is the firmware's job. It was *never* the OS GIC driver's job
the first place.

Importantly, the OS cannot save the full state: a large part of it is
only accessible via secure, and Linux doesn't run in secure mode. How
do you restore the group configuration, for example? Oh wait, you
don't even save it.

So unless you have a single security state system, this cannot
work. And apart from VMs (which by the way do not need any of this),
there is no GICv3-based system without EL3. If you know of one, please
let me know. And if it existed, then all the save/restore should
happen only when GICD_CTLR.DS==1.

To conclude, this patch doesn't do what it advertises, because it
*cannot* do it, by definition. The secure firmware is the only place
where this can be done.

M.

--
Without deviation from the norm, progress is not possible.

2023-02-15 12:11:00

by Sudeep Holla

[permalink] [raw]
Subject: Re: [PATCH 3/3] irqchip/gic-v3: Save and restore distributor and re-distributor

On Wed, Feb 15, 2023 at 08:02:20AM +0000, Marc Zyngier wrote:
> On Tue, 14 Feb 2023 23:34:26 +0000,
> Florian Fainelli <[email protected]> wrote:
> >
> > On platforms implementing Suspend to RAM where the GIC loses power, we
> > are not properly saving and restoring the GIC distributor and
> > re-distributor registers thus leading to the system resuming without any
> > functional interrupts.
>
> The real question is *why* we need any of this. On any decent system,
> this is the firmware's job. It was *never* the OS GIC driver's job
> the first place.
>

Completely agreed on the points you have made here, no disagreement.
However I would like to iterate some of the arguments/concerns the
firmware teams I have interacted in the past have made around this.
And this is while ago(couple of years) and they may have different
views. I am repeating them as I think it may be still valid on some
systems so that we can make some suggestions if we have here.

> Importantly, the OS cannot save the full state: a large part of it is
> only accessible via secure, and Linux doesn't run in secure mode. How
> do you restore the group configuration, for example? Oh wait, you
> don't even save it.
>

Agreed, we can't manage secure side configurations. But one of the concern
was about the large memory footprint to save the larger non-secure GIC
context in the smaller secure memory.

One of the suggestion at the time was to carve out a chunk of non-secure
memory and let the secure side use the same for context save and restore.
Not sure if this was tried out especially for the GIC. I may need to
chase that with the concerned teams.

Thanks Florian for starting this thread and sorry that I couldn't recollect
lots of the information when we chatted in the private about this. Marc
response triggered all the memory back.

> So unless you have a single security state system, this cannot
> work. And apart from VMs (which by the way do not need any of this),
> there is no GICv3-based system without EL3. If you know of one, please
> let me know. And if it existed, then all the save/restore should
> happen only when GICD_CTLR.DS==1.
>

Yes, now I remember the discussion we had probably almost 9-10 years
back when I first added the CPU PM notifiers for GICv3. I am sure we
would have discussed this at-least couple of times after that. Yet I
just got carried away by the fact that GICv2 does the save/restore and
this should also be possible. Sorry for that.

--
Regards,
Sudeep

2023-02-15 14:40:18

by Marc Zyngier

[permalink] [raw]
Subject: Re: [PATCH 3/3] irqchip/gic-v3: Save and restore distributor and re-distributor

On Wed, 15 Feb 2023 12:10:50 +0000,
Sudeep Holla <[email protected]> wrote:
>
> On Wed, Feb 15, 2023 at 08:02:20AM +0000, Marc Zyngier wrote:
> > On Tue, 14 Feb 2023 23:34:26 +0000,
> > Florian Fainelli <[email protected]> wrote:
> > >
> > > On platforms implementing Suspend to RAM where the GIC loses power, we
> > > are not properly saving and restoring the GIC distributor and
> > > re-distributor registers thus leading to the system resuming without any
> > > functional interrupts.
> >
> > The real question is *why* we need any of this. On any decent system,
> > this is the firmware's job. It was *never* the OS GIC driver's job
> > the first place.
> >
>
> Completely agreed on the points you have made here, no disagreement.
> However I would like to iterate some of the arguments/concerns the
> firmware teams I have interacted in the past have made around this.
> And this is while ago(couple of years) and they may have different
> views. I am repeating them as I think it may be still valid on some
> systems so that we can make some suggestions if we have here.
>
> > Importantly, the OS cannot save the full state: a large part of it is
> > only accessible via secure, and Linux doesn't run in secure mode. How
> > do you restore the group configuration, for example? Oh wait, you
> > don't even save it.
> >
>
> Agreed, we can't manage secure side configurations. But one of the concern
> was about the large memory footprint to save the larger non-secure GIC
> context in the smaller secure memory.
>
> One of the suggestion at the time was to carve out a chunk of non-secure
> memory and let the secure side use the same for context save and restore.
> Not sure if this was tried out especially for the GIC. I may need to
> chase that with the concerned teams.

The main issue is that you still need secure memory to save the secure
state, as leaving it in NS memory would be an interesting attack
vector! Other than that, I see no issue with FW carving out the memory
it needs to save/restore the NS state of the GIC.

Note that this isn't only the (re-)distributor(s) PPI/SPI registers.
The LPI setup must also be saved, and that includes all the ITS
registers. I'm surprised the FW folks are, all of a sudden,
discovering this requirements. It isn't like the GIC architecture is a
novelty, and they have had ample time to review the spec...

>
> Thanks Florian for starting this thread and sorry that I couldn't recollect
> lots of the information when we chatted in the private about this. Marc
> response triggered all the memory back.
>
> > So unless you have a single security state system, this cannot
> > work. And apart from VMs (which by the way do not need any of this),
> > there is no GICv3-based system without EL3. If you know of one, please
> > let me know. And if it existed, then all the save/restore should
> > happen only when GICD_CTLR.DS==1.
> >
>
> Yes, now I remember the discussion we had probably almost 9-10 years
> back when I first added the CPU PM notifiers for GICv3. I am sure we
> would have discussed this at-least couple of times after that. Yet I
> just got carried away by the fact that GICv2 does the save/restore and
> this should also be possible. Sorry for that.

GICv2 is just as fsck'd. It is just that we pretend it works for the
sake of 32bit that may run in secure mode. On a 64bit machine, or in a
NS setup, it is doomed for the same reasons. There really isn't any
substitute for secure firmware here.

M.

--
Without deviation from the norm, progress is not possible.

2023-02-15 15:10:59

by Sudeep Holla

[permalink] [raw]
Subject: Re: [PATCH 3/3] irqchip/gic-v3: Save and restore distributor and re-distributor

On Wed, Feb 15, 2023 at 02:40:04PM +0000, Marc Zyngier wrote:
> On Wed, 15 Feb 2023 12:10:50 +0000,
> Sudeep Holla <[email protected]> wrote:
> >
> > On Wed, Feb 15, 2023 at 08:02:20AM +0000, Marc Zyngier wrote:
> > > On Tue, 14 Feb 2023 23:34:26 +0000,
> > > Florian Fainelli <[email protected]> wrote:
> > > >
> > > > On platforms implementing Suspend to RAM where the GIC loses power, we
> > > > are not properly saving and restoring the GIC distributor and
> > > > re-distributor registers thus leading to the system resuming without any
> > > > functional interrupts.
> > >
> > > The real question is *why* we need any of this. On any decent system,
> > > this is the firmware's job. It was *never* the OS GIC driver's job
> > > the first place.
> > >
> >
> > Completely agreed on the points you have made here, no disagreement.
> > However I would like to iterate some of the arguments/concerns the
> > firmware teams I have interacted in the past have made around this.
> > And this is while ago(couple of years) and they may have different
> > views. I am repeating them as I think it may be still valid on some
> > systems so that we can make some suggestions if we have here.
> >
> > > Importantly, the OS cannot save the full state: a large part of it is
> > > only accessible via secure, and Linux doesn't run in secure mode. How
> > > do you restore the group configuration, for example? Oh wait, you
> > > don't even save it.
> > >
> >
> > Agreed, we can't manage secure side configurations. But one of the concern
> > was about the large memory footprint to save the larger non-secure GIC
> > context in the smaller secure memory.
> >
> > One of the suggestion at the time was to carve out a chunk of non-secure
> > memory and let the secure side use the same for context save and restore.
> > Not sure if this was tried out especially for the GIC. I may need to
> > chase that with the concerned teams.
>
> The main issue is that you still need secure memory to save the secure
> state, as leaving it in NS memory would be an interesting attack
> vector! Other than that, I see no issue with FW carving out the memory
> it needs to save/restore the NS state of the GIC.
>

Yes I meant NS memory for only NS state of GIC.

> Note that this isn't only the (re-)distributor(s) PPI/SPI registers.
> The LPI setup must also be saved, and that includes all the ITS
> registers. I'm surprised the FW folks are, all of a sudden,
> discovering this requirements. It isn't like the GIC architecture is a
> novelty, and they have had ample time to review the spec...
>

I understand your concern about late realisation ????.

Another issue in general I see with reference firmware stack(like
Trusted Firmware in this case) is that the requirements are driven from
the reference platforms which may not have this GIC save/restore
requirement as they are in always on domain and it is then made platform
specific problem in that project which may not be ideal and may result
in somewhat misleading indirectly other firmware developers using it.

Also remember some firmware folks asking about LPI context, I am not sure
if there was any work done in that area.

> >
> > Thanks Florian for starting this thread and sorry that I couldn't recollect
> > lots of the information when we chatted in the private about this. Marc
> > response triggered all the memory back.
> >
> > > So unless you have a single security state system, this cannot
> > > work. And apart from VMs (which by the way do not need any of this),
> > > there is no GICv3-based system without EL3. If you know of one, please
> > > let me know. And if it existed, then all the save/restore should
> > > happen only when GICD_CTLR.DS==1.
> > >
> >
> > Yes, now I remember the discussion we had probably almost 9-10 years
> > back when I first added the CPU PM notifiers for GICv3. I am sure we
> > would have discussed this at-least couple of times after that. Yet I
> > just got carried away by the fact that GICv2 does the save/restore and
> > this should also be possible. Sorry for that.
>
> GICv2 is just as fsck'd. It is just that we pretend it works for the
> sake of 32bit that may run in secure mode. On a 64bit machine, or in a
> NS setup, it is doomed for the same reasons. There really isn't any
> substitute for secure firmware here.

Fair enough and thanks for refreshing my memory on this.

Hi Florian,

I did little bit digging in the TF-A and found this.
plat_arm_gic_{save,resume}()in plat/arm/common/arm_gicv3.c which I assume
makes it platform specific code and hence not used on any other platform.
I also missed to see this earlier as I explicitly ignored the plat/ directory
assuming it is all platform specific code not shared across.

Not sure if the firmware on your platform is not using that or is it
different firmware altogether or may be TF-A forked before this change.
If it is missing anything, it would be good to get that fixed and look
at ways to generalise it.

--
Regards,
Sudeep

2023-02-15 17:27:43

by kernel test robot

[permalink] [raw]
Subject: Re: [PATCH 2/3] irqchip/gic-v3: Propagate gic_cpu_pm_init() return code

Hi Florian,

I love your patch! Yet something to improve:

[auto build test ERROR on tip/irq/core]
[also build test ERROR on soc/for-next linus/master v6.2-rc8 next-20230215]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch#_base_tree_information]

url: https://github.com/intel-lab-lkp/linux/commits/Florian-Fainelli/irqchip-gic-v3-Use-switch-case-statements-in-gic_cpu_pm_notifier/20230215-073628
patch link: https://lore.kernel.org/r/20230214233426.2994501-3-f.fainelli%40gmail.com
patch subject: [PATCH 2/3] irqchip/gic-v3: Propagate gic_cpu_pm_init() return code
config: arm64-randconfig-r011-20230213 (https://download.01.org/0day-ci/archive/20230216/[email protected]/config)
compiler: aarch64-linux-gcc (GCC) 12.1.0
reproduce (this is a W=1 build):
wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
chmod +x ~/bin/make.cross
# https://github.com/intel-lab-lkp/linux/commit/8657e4fd7d9714934c7660bc3693d9ad507679a0
git remote add linux-review https://github.com/intel-lab-lkp/linux
git fetch --no-tags linux-review Florian-Fainelli/irqchip-gic-v3-Use-switch-case-statements-in-gic_cpu_pm_notifier/20230215-073628
git checkout 8657e4fd7d9714934c7660bc3693d9ad507679a0
# save the config file
mkdir build_dir && cp config build_dir/.config
COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-12.1.0 make.cross W=1 O=build_dir ARCH=arm64 olddefconfig
COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-12.1.0 make.cross W=1 O=build_dir ARCH=arm64 SHELL=/bin/bash drivers/irqchip/

If you fix the issue, kindly add following tag where applicable
| Reported-by: kernel test robot <[email protected]>
| Link: https://lore.kernel.org/oe-kbuild-all/[email protected]/

All errors (new ones prefixed by >>):

drivers/irqchip/irq-gic-v3.c: In function 'gic_init_bases':
>> drivers/irqchip/irq-gic-v3.c:1896:13: error: void value not ignored as it ought to be
1896 | err = gic_cpu_pm_init();
| ^


vim +1896 drivers/irqchip/irq-gic-v3.c

1825
1826 static int __init gic_init_bases(void __iomem *dist_base,
1827 struct redist_region *rdist_regs,
1828 u32 nr_redist_regions,
1829 u64 redist_stride,
1830 struct fwnode_handle *handle)
1831 {
1832 u32 typer;
1833 int err;
1834
1835 if (!is_hyp_mode_available())
1836 static_branch_disable(&supports_deactivate_key);
1837
1838 if (static_branch_likely(&supports_deactivate_key))
1839 pr_info("GIC: Using split EOI/Deactivate mode\n");
1840
1841 gic_data.fwnode = handle;
1842 gic_data.dist_base = dist_base;
1843 gic_data.redist_regions = rdist_regs;
1844 gic_data.nr_redist_regions = nr_redist_regions;
1845 gic_data.redist_stride = redist_stride;
1846
1847 /*
1848 * Find out how many interrupts are supported.
1849 */
1850 typer = readl_relaxed(gic_data.dist_base + GICD_TYPER);
1851 gic_data.rdists.gicd_typer = typer;
1852
1853 gic_enable_quirks(readl_relaxed(gic_data.dist_base + GICD_IIDR),
1854 gic_quirks, &gic_data);
1855
1856 pr_info("%d SPIs implemented\n", GIC_LINE_NR - 32);
1857 pr_info("%d Extended SPIs implemented\n", GIC_ESPI_NR);
1858
1859 /*
1860 * ThunderX1 explodes on reading GICD_TYPER2, in violation of the
1861 * architecture spec (which says that reserved registers are RES0).
1862 */
1863 if (!(gic_data.flags & FLAGS_WORKAROUND_CAVIUM_ERRATUM_38539))
1864 gic_data.rdists.gicd_typer2 = readl_relaxed(gic_data.dist_base + GICD_TYPER2);
1865
1866 gic_data.domain = irq_domain_create_tree(handle, &gic_irq_domain_ops,
1867 &gic_data);
1868 gic_data.rdists.rdist = alloc_percpu(typeof(*gic_data.rdists.rdist));
1869 gic_data.rdists.has_rvpeid = true;
1870 gic_data.rdists.has_vlpis = true;
1871 gic_data.rdists.has_direct_lpi = true;
1872 gic_data.rdists.has_vpend_valid_dirty = true;
1873
1874 if (WARN_ON(!gic_data.domain) || WARN_ON(!gic_data.rdists.rdist)) {
1875 err = -ENOMEM;
1876 goto out_free;
1877 }
1878
1879 irq_domain_update_bus_token(gic_data.domain, DOMAIN_BUS_WIRED);
1880
1881 gic_data.has_rss = !!(typer & GICD_TYPER_RSS);
1882
1883 if (typer & GICD_TYPER_MBIS) {
1884 err = mbi_init(handle, gic_data.domain);
1885 if (err)
1886 pr_err("Failed to initialize MBIs\n");
1887 }
1888
1889 set_handle_irq(gic_handle_irq);
1890
1891 gic_update_rdist_properties();
1892
1893 gic_dist_init();
1894 gic_cpu_init();
1895 gic_smp_init();
> 1896 err = gic_cpu_pm_init();
1897 if (err)
1898 goto out_set_handle;
1899
1900 if (gic_dist_supports_lpis()) {
1901 its_init(handle, &gic_data.rdists, gic_data.domain);
1902 its_cpu_init();
1903 its_lpi_memreserve_init();
1904 } else {
1905 if (IS_ENABLED(CONFIG_ARM_GIC_V2M))
1906 gicv2m_init(handle, gic_data.domain);
1907 }
1908
1909 gic_enable_nmi_support();
1910
1911 return 0;
1912
1913 out_set_handle:
1914 set_handle_irq(NULL);
1915 out_free:
1916 if (gic_data.domain)
1917 irq_domain_remove(gic_data.domain);
1918 free_percpu(gic_data.rdists.rdist);
1919 return err;
1920 }
1921

--
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests

2023-02-15 18:09:26

by Florian Fainelli

[permalink] [raw]
Subject: Re: [PATCH 3/3] irqchip/gic-v3: Save and restore distributor and re-distributor

On 2/15/23 07:10, Sudeep Holla wrote:
> On Wed, Feb 15, 2023 at 02:40:04PM +0000, Marc Zyngier wrote:
>> On Wed, 15 Feb 2023 12:10:50 +0000,
>> Sudeep Holla <[email protected]> wrote:
>>>
>>> On Wed, Feb 15, 2023 at 08:02:20AM +0000, Marc Zyngier wrote:
>>>> On Tue, 14 Feb 2023 23:34:26 +0000,
>>>> Florian Fainelli <[email protected]> wrote:
>>>>>
>>>>> On platforms implementing Suspend to RAM where the GIC loses power, we
>>>>> are not properly saving and restoring the GIC distributor and
>>>>> re-distributor registers thus leading to the system resuming without any
>>>>> functional interrupts.
>>>>
>>>> The real question is *why* we need any of this. On any decent system,
>>>> this is the firmware's job. It was *never* the OS GIC driver's job
>>>> the first place.
>>>>
>>>
>>> Completely agreed on the points you have made here, no disagreement.
>>> However I would like to iterate some of the arguments/concerns the
>>> firmware teams I have interacted in the past have made around this.
>>> And this is while ago(couple of years) and they may have different
>>> views. I am repeating them as I think it may be still valid on some
>>> systems so that we can make some suggestions if we have here.
>>>
>>>> Importantly, the OS cannot save the full state: a large part of it is
>>>> only accessible via secure, and Linux doesn't run in secure mode. How
>>>> do you restore the group configuration, for example? Oh wait, you
>>>> don't even save it.
>>>>
>>>
>>> Agreed, we can't manage secure side configurations. But one of the concern
>>> was about the large memory footprint to save the larger non-secure GIC
>>> context in the smaller secure memory.
>>>
>>> One of the suggestion at the time was to carve out a chunk of non-secure
>>> memory and let the secure side use the same for context save and restore.
>>> Not sure if this was tried out especially for the GIC. I may need to
>>> chase that with the concerned teams.
>>
>> The main issue is that you still need secure memory to save the secure
>> state, as leaving it in NS memory would be an interesting attack
>> vector! Other than that, I see no issue with FW carving out the memory
>> it needs to save/restore the NS state of the GIC.
>>
>
> Yes I meant NS memory for only NS state of GIC.

The secure state of the GIC is being re-initialized coming out of
suspend to DRAM since the chip lost its state, in fact we configure it
the same as we did during cold boot and then the firmware goes on
re-initializing the various secure interrupts it uses.

The non-secure state was not dealt with by the firmware, which prompted
me to mimicking what the GICv2 driver does since there is an expectation
from Linux that interrupts will be saved/restored.

>
>> Note that this isn't only the (re-)distributor(s) PPI/SPI registers.
>> The LPI setup must also be saved, and that includes all the ITS
>> registers. I'm surprised the FW folks are, all of a sudden,
>> discovering this requirements. It isn't like the GIC architecture is a
>> novelty, and they have had ample time to review the spec...
>>
>
> I understand your concern about late realisation ????.
>
> Another issue in general I see with reference firmware stack(like
> Trusted Firmware in this case) is that the requirements are driven from
> the reference platforms which may not have this GIC save/restore
> requirement as they are in always on domain and it is then made platform
> specific problem in that project which may not be ideal and may result
> in somewhat misleading indirectly other firmware developers using it.

I suppose it is so obvious that saving/restoring must happen by trusted
firmware that there is no point in putting that in BIG CAPITAL WORDS for
people to know about it.

>
> Also remember some firmware folks asking about LPI context, I am not sure
> if there was any work done in that area.
>
>>>
>>> Thanks Florian for starting this thread and sorry that I couldn't recollect
>>> lots of the information when we chatted in the private about this. Marc
>>> response triggered all the memory back.
>>>
>>>> So unless you have a single security state system, this cannot
>>>> work. And apart from VMs (which by the way do not need any of this),
>>>> there is no GICv3-based system without EL3. If you know of one, please
>>>> let me know. And if it existed, then all the save/restore should
>>>> happen only when GICD_CTLR.DS==1.
>>>>
>>>
>>> Yes, now I remember the discussion we had probably almost 9-10 years
>>> back when I first added the CPU PM notifiers for GICv3. I am sure we
>>> would have discussed this at-least couple of times after that. Yet I
>>> just got carried away by the fact that GICv2 does the save/restore and
>>> this should also be possible. Sorry for that.
>>
>> GICv2 is just as fsck'd. It is just that we pretend it works for the
>> sake of 32bit that may run in secure mode. On a 64bit machine, or in a
>> NS setup, it is doomed for the same reasons. There really isn't any
>> substitute for secure firmware here.
>
> Fair enough and thanks for refreshing my memory on this.
>
> Hi Florian,
>
> I did little bit digging in the TF-A and found this.
> plat_arm_gic_{save,resume}()in plat/arm/common/arm_gicv3.c which I assume
> makes it platform specific code and hence not used on any other platform.
> I also missed to see this earlier as I explicitly ignored the plat/ directory
> assuming it is all platform specific code not shared across.
>
> Not sure if the firmware on your platform is not using that or is it
> different firmware altogether or may be TF-A forked before this change.
> If it is missing anything, it would be good to get that fixed and look
> at ways to generalise it.

As you may or may not remember we have our own ARM Trusted Firmware for
better of for worse, I will put the code to save and restore the
registers there. Thanks!
--
Florian


2023-02-16 08:31:21

by Marc Zyngier

[permalink] [raw]
Subject: Re: [PATCH 3/3] irqchip/gic-v3: Save and restore distributor and re-distributor

On Wed, 15 Feb 2023 15:10:48 +0000,
Sudeep Holla <[email protected]> wrote:
>
> On Wed, Feb 15, 2023 at 02:40:04PM +0000, Marc Zyngier wrote:
> > On Wed, 15 Feb 2023 12:10:50 +0000,
> > Sudeep Holla <[email protected]> wrote:
> > >
> > > On Wed, Feb 15, 2023 at 08:02:20AM +0000, Marc Zyngier wrote:
> > > > On Tue, 14 Feb 2023 23:34:26 +0000,
> > > > Florian Fainelli <[email protected]> wrote:
> > > > >
> > > > > On platforms implementing Suspend to RAM where the GIC loses power, we
> > > > > are not properly saving and restoring the GIC distributor and
> > > > > re-distributor registers thus leading to the system resuming without any
> > > > > functional interrupts.
> > > >
> > > > The real question is *why* we need any of this. On any decent system,
> > > > this is the firmware's job. It was *never* the OS GIC driver's job
> > > > the first place.
> > > >
> > >
> > > Completely agreed on the points you have made here, no disagreement.
> > > However I would like to iterate some of the arguments/concerns the
> > > firmware teams I have interacted in the past have made around this.
> > > And this is while ago(couple of years) and they may have different
> > > views. I am repeating them as I think it may be still valid on some
> > > systems so that we can make some suggestions if we have here.
> > >
> > > > Importantly, the OS cannot save the full state: a large part of it is
> > > > only accessible via secure, and Linux doesn't run in secure mode. How
> > > > do you restore the group configuration, for example? Oh wait, you
> > > > don't even save it.
> > > >
> > >
> > > Agreed, we can't manage secure side configurations. But one of the concern
> > > was about the large memory footprint to save the larger non-secure GIC
> > > context in the smaller secure memory.
> > >
> > > One of the suggestion at the time was to carve out a chunk of non-secure
> > > memory and let the secure side use the same for context save and restore.
> > > Not sure if this was tried out especially for the GIC. I may need to
> > > chase that with the concerned teams.
> >
> > The main issue is that you still need secure memory to save the secure
> > state, as leaving it in NS memory would be an interesting attack
> > vector! Other than that, I see no issue with FW carving out the memory
> > it needs to save/restore the NS state of the GIC.
> >
>
> Yes I meant NS memory for only NS state of GIC.
>
> > Note that this isn't only the (re-)distributor(s) PPI/SPI registers.
> > The LPI setup must also be saved, and that includes all the ITS
> > registers. I'm surprised the FW folks are, all of a sudden,
> > discovering this requirements. It isn't like the GIC architecture is a
> > novelty, and they have had ample time to review the spec...
> >
>
> I understand your concern about late realisation ????.
>
> Another issue in general I see with reference firmware stack(like
> Trusted Firmware in this case) is that the requirements are driven from
> the reference platforms which may not have this GIC save/restore
> requirement as they are in always on domain and it is then made platform
> specific problem in that project which may not be ideal and may result
> in somewhat misleading indirectly other firmware developers using
> it.

Yeah, that's the usual state of affair. Unrealistic platforms, no
insight (and more generally no interest) in the actual usage model.

Still, most people got it right, so I guess they must be reading the
spec. How comes this was never picked from contributions to TF-A?
Surely duplication of platform code should be a massive hint to the
firmware maintainers?

Thanks,

M.

--
Without deviation from the norm, progress is not possible.