2015-11-10 14:39:49

by Jon Hunter

[permalink] [raw]
Subject: [RFC PATCH 0/2] Add support for Tegra210 AGIC

The Tegra210 AGIC interrupt controller is a 2nd level interrupt controller
located in a separate power domain to the main GIC interrupt controller.
It can route interrupts to the main CPU cluster or an Audio DSP slave.

Ideally we would like to re-use the existing ARM GIC driver because the
AGIC is a GIC-400. However, in order to do so this requires a few
significant changes to the exisiting GIC driver for power management
reasons.

The purpose of this RFC is to get some feedback on the approach and see
if we can support the AGIC in this way or if a separate driver is
warranted for this device.

Please note that this RFC does not address the issue of interrupt routing.
Ideally I was thinking that having a mechanism/API to migrate an interrupt
from the CPU cluster to the DSP would make sense, however, I don't believe
this is supported today in the kernel. Would such an approach be acceptable
or if not, is there a better way to handle this?

Jon Hunter (2):
genirq: Add runtime resume/suspend support for IRQ chips
irqchip/gic: Add support for tegra AGIC interrupt controller

drivers/irqchip/irq-gic.c | 341 +++++++++++++++++++++++++++++++++++++++-------
include/linux/irq.h | 4 +
kernel/irq/internals.h | 17 +++
kernel/irq/manage.c | 7 +
4 files changed, 319 insertions(+), 50 deletions(-)

--
2.1.4


2015-11-10 14:40:50

by Jon Hunter

[permalink] [raw]
Subject: [RFC PATCH 1/2] genirq: Add runtime resume/suspend support for IRQ chips

Some IRQ chips may be located in a power domain outside of the CPU subsystem
and hence will require device specific runtime power management. Ideally,
rather than adding more functions to the irq_chip_ops function table, using
existing chip functions such as irq_startup/shutdown or
irq_request/release_resources() would be best. However, these existing chip
functions are called in the context of a spinlock which is not ideal for
power management operations that may take some time to power up a domain.

Two possible solutions are:
1. Move existing chip operators such as irq_request/release_resources()
outside of the spinlock and use these helpers.
2. Add new chip operators that are called outside of any spinlocks while
setting up and freeing an IRQ.

Not knowing whether we can safely move irq_request/release_resources() to
outside the spinlock (but hopefully this will solicit some feedback), add
new chip operators for runtime resuming and suspending of an IRQ chip.

With this change the irqchip clocks will be enabled/disabled on allocating
and freeing interrupts.

Signed-off-by: Jon Hunter <[email protected]>
---
include/linux/irq.h | 4 ++++
kernel/irq/internals.h | 17 +++++++++++++++++
kernel/irq/manage.c | 7 +++++++
3 files changed, 28 insertions(+)

diff --git a/include/linux/irq.h b/include/linux/irq.h
index 3c1c96786248..a29050cfb5f6 100644
--- a/include/linux/irq.h
+++ b/include/linux/irq.h
@@ -329,6 +329,8 @@ static inline irq_hw_number_t irqd_to_hwirq(struct irq_data *d)
* chip, when one or more interrupts are installed
* @irq_resume: function called from core code on resume once per chip,
* when one ore more interrupts are installed
+ * @irq_runtime_suspend:function called to runtime suspend a chip
+ * @irq_runtime_resume: function called to runtime resume a chip
* @irq_pm_shutdown: function called from core code on shutdown once per chip
* @irq_calc_mask: Optional function to set irq_data.mask for special cases
* @irq_print_chip: optional to print special chip info in show_interrupts
@@ -369,6 +371,8 @@ struct irq_chip {

void (*irq_suspend)(struct irq_data *data);
void (*irq_resume)(struct irq_data *data);
+ int (*irq_runtime_suspend)(struct irq_data *data);
+ int (*irq_runtime_resume)(struct irq_data *data);
void (*irq_pm_shutdown)(struct irq_data *data);

void (*irq_calc_mask)(struct irq_data *data);
diff --git a/kernel/irq/internals.h b/kernel/irq/internals.h
index 05c2188271b8..187e49369c16 100644
--- a/kernel/irq/internals.h
+++ b/kernel/irq/internals.h
@@ -125,6 +125,23 @@ static inline void chip_bus_sync_unlock(struct irq_desc *desc)
desc->irq_data.chip->irq_bus_sync_unlock(&desc->irq_data);
}

+/* Inline functions for support of irq chips that require runtime pm */
+static inline int chip_runtime_resume(struct irq_desc *desc)
+{
+ if (!desc->irq_data.chip->irq_runtime_resume)
+ return 0;
+
+ return desc->irq_data.chip->irq_runtime_resume(&desc->irq_data);
+}
+
+static inline int chip_runtime_suspend(struct irq_desc *desc)
+{
+ if (!desc->irq_data.chip->irq_runtime_suspend)
+ return 0;
+
+ return desc->irq_data.chip->irq_runtime_suspend(&desc->irq_data);
+}
+
#define _IRQ_DESC_CHECK (1 << 0)
#define _IRQ_DESC_PERCPU (1 << 1)

diff --git a/kernel/irq/manage.c b/kernel/irq/manage.c
index 0eebaeef317b..66e33df73140 100644
--- a/kernel/irq/manage.c
+++ b/kernel/irq/manage.c
@@ -1116,6 +1116,10 @@ __setup_irq(unsigned int irq, struct irq_desc *desc, struct irqaction *new)
if (!try_module_get(desc->owner))
return -ENODEV;

+ ret = chip_runtime_resume(desc);
+ if (ret < 0)
+ return ret;
+
new->irq = irq;

/*
@@ -1393,6 +1397,7 @@ out_thread:
put_task_struct(t);
}
out_mput:
+ chip_runtime_suspend(desc);
module_put(desc->owner);
return ret;
}
@@ -1506,6 +1511,7 @@ static struct irqaction *__free_irq(unsigned int irq, void *dev_id)
}
}

+ chip_runtime_suspend(desc);
module_put(desc->owner);
kfree(action->secondary);
return action;
@@ -1792,6 +1798,7 @@ static struct irqaction *__free_percpu_irq(unsigned int irq, void __percpu *dev_

unregister_handler_proc(irq, action);

+ chip_runtime_suspend(desc);
module_put(desc->owner);
return action;

--
2.1.4

2015-11-10 14:39:53

by Jon Hunter

[permalink] [raw]
Subject: [RFC PATCH 2/2] irqchip/gic: Add support for tegra AGIC interrupt controller

Add a driver for the Tegra-AGIC interrupt controller which is compatible
with the ARM GIC-400 interrupt controller.

The Tegra AGIC (Audio GIC) is part of the Audio Processing Engine (APE) on
Tegra210 and can route interrupts to either the GIC for the CPU subsystem
or the Audio DSP (ADSP) within the APE. The AGIC uses CPU interface 0 to
route interrupts to the CPU GIC and CPU interface 1 to route interrupts to
the ADSP.

The APE is located within its own power domain on the chip and so the
AGIC needs to manage both the power domain and its clocks. Commit
afbbd2338176 ("irqchip/gic: Document optional Clock and Power Domain
properties") adding clock and power-domain properties to the GIC binding
and so the aim would be to make use of these to handle power management
(however, this is very much dependent upon adding support for generic
PM domains for Tegra which is still a work-in-progress).

With the AGIC being located in a different power domain to the main CPU
cluster this means that:
1. The interrupt controller cannot be registered via IRQCHIP_DECLARE()
because it needs to be registered as a platform device so that the
generic PM domain core will ensure that the power domain is available
before probing.
2. The interrupt controller cannot be suspended/restored based upon
changes in the CPU power state and needs to use runtime-pm instead.

This is very much a work-in-progress and there are still a few items that
need to be resolved. These items are:
1. Currently the GIC platform driver assumes that the device is a non-root
GIC and hence has a parent interrupt. It also assumes that for non-root
GICs PPIs are not used and hence gic_cpu_save/restore is not used. This
can be changed.
2. Currently routing of interrupts to the ADSP for Tegra210 is not
supported by this driver. Although the ADSP on Tegra210 could also setup
the AGIC distributor having two independent subsystems configure the
distributor does not seem like a good idea. Given that the ADSP is a
slave and would be under the control of the kernel via its own driver,
it would seem best that only the kernel configures the distributors
routing of the interrupts. This could be achieved by adding a new genirq
API to migrate the interrupt. The GIC driver already has an API to
migrate all interrupts from one CPU interface to another (which I
understand is for a different reason), but having an generic API to
migrate an interrupt to another device could be useful (unless something
already exists that I have overlooked).
3. Once Linus W's patch, "irqchip/gic: assign irqchip dynamically" is
merged then the runtime resume/suspend function could be populated at
runtime for the chips that need it.

Please let me know if you have any thoughts/opinions on the above.

Signed-off-by: Jon Hunter <[email protected]>
---
drivers/irqchip/irq-gic.c | 341 +++++++++++++++++++++++++++++++++++++++-------
1 file changed, 291 insertions(+), 50 deletions(-)

diff --git a/drivers/irqchip/irq-gic.c b/drivers/irqchip/irq-gic.c
index 515c823c1c95..4e34ac453e30 100644
--- a/drivers/irqchip/irq-gic.c
+++ b/drivers/irqchip/irq-gic.c
@@ -26,6 +26,7 @@
#include <linux/module.h>
#include <linux/list.h>
#include <linux/smp.h>
+#include <linux/clk.h>
#include <linux/cpu.h>
#include <linux/cpu_pm.h>
#include <linux/cpumask.h>
@@ -37,6 +38,8 @@
#include <linux/irqdomain.h>
#include <linux/interrupt.h>
#include <linux/percpu.h>
+#include <linux/platform_device.h>
+#include <linux/pm_runtime.h>
#include <linux/slab.h>
#include <linux/irqchip.h>
#include <linux/irqchip/chained_irq.h>
@@ -69,15 +72,17 @@ union gic_base {
};

struct gic_chip_data {
+ struct device *dev;
+ struct clk *clk;
union gic_base dist_base;
union gic_base cpu_base;
#ifdef CONFIG_CPU_PM
- u32 saved_spi_enable[DIV_ROUND_UP(1020, 32)];
- u32 saved_spi_conf[DIV_ROUND_UP(1020, 16)];
- u32 saved_spi_target[DIV_ROUND_UP(1020, 4)];
u32 __percpu *saved_ppi_enable;
u32 __percpu *saved_ppi_conf;
#endif
+ u32 saved_spi_enable[DIV_ROUND_UP(1020, 32)];
+ u32 saved_spi_conf[DIV_ROUND_UP(1020, 16)];
+ u32 saved_spi_target[DIV_ROUND_UP(1020, 4)];
struct irq_domain *domain;
unsigned int gic_irqs;
#ifdef CONFIG_GIC_NON_BANKED
@@ -380,6 +385,36 @@ static void gic_handle_cascade_irq(struct irq_desc *desc)
chained_irq_exit(chip, desc);
}

+static int gic_runtime_resume(struct irq_data *d)
+{
+ struct gic_chip_data *gic = irq_data_get_irq_chip_data(d);
+ int ret;
+
+ if (!gic->dev)
+ return 0;
+
+ ret = pm_runtime_get_sync(gic->dev);
+ if (ret < 0)
+ return ret;
+
+ return 0;
+}
+
+static int gic_runtime_suspend(struct irq_data *d)
+{
+ struct gic_chip_data *gic = irq_data_get_irq_chip_data(d);
+ int ret;
+
+ if (!gic->dev)
+ return 0;
+
+ ret = pm_runtime_put(gic->dev);
+ if (ret < 0)
+ return ret;
+
+ return 0;
+}
+
static struct irq_chip gic_chip = {
.name = "GIC",
.irq_mask = gic_mask_irq,
@@ -391,6 +426,8 @@ static struct irq_chip gic_chip = {
#endif
.irq_get_irqchip_state = gic_irq_get_irqchip_state,
.irq_set_irqchip_state = gic_irq_set_irqchip_state,
+ .irq_runtime_resume = gic_runtime_resume,
+ .irq_runtime_suspend = gic_runtime_suspend,
.flags = IRQCHIP_SET_TYPE_MASKED |
IRQCHIP_SKIP_SET_WAKE |
IRQCHIP_MASK_ON_SUSPEND,
@@ -408,6 +445,8 @@ static struct irq_chip gic_eoimode1_chip = {
.irq_get_irqchip_state = gic_irq_get_irqchip_state,
.irq_set_irqchip_state = gic_irq_set_irqchip_state,
.irq_set_vcpu_affinity = gic_irq_set_vcpu_affinity,
+ .irq_runtime_resume = gic_runtime_resume,
+ .irq_runtime_suspend = gic_runtime_suspend,
.flags = IRQCHIP_SET_TYPE_MASKED |
IRQCHIP_SKIP_SET_WAKE |
IRQCHIP_MASK_ON_SUSPEND,
@@ -459,7 +498,7 @@ static void gic_cpu_if_up(struct gic_chip_data *gic)
}


-static void __init gic_dist_init(struct gic_chip_data *gic)
+static void gic_dist_init(struct gic_chip_data *gic)
{
unsigned int i;
u32 cpumask;
@@ -533,38 +572,36 @@ int gic_cpu_if_down(unsigned int gic_nr)
return 0;
}

-#ifdef CONFIG_CPU_PM
/*
* Saves the GIC distributor registers during suspend or idle. Must be called
* with interrupts disabled but before powering down the GIC. After calling
* this function, no interrupts will be delivered by the GIC, and another
* platform-specific wakeup source must be enabled.
*/
-static void gic_dist_save(unsigned int gic_nr)
+static void gic_dist_save(struct gic_chip_data *gic)
{
unsigned int gic_irqs;
void __iomem *dist_base;
int i;

- if (gic_nr >= MAX_GIC_NR)
- BUG();
+ BUG_ON(!gic);

- gic_irqs = gic_data[gic_nr].gic_irqs;
- dist_base = gic_data_dist_base(&gic_data[gic_nr]);
+ gic_irqs = gic->gic_irqs;
+ dist_base = gic_data_dist_base(gic);

if (!dist_base)
return;

for (i = 0; i < DIV_ROUND_UP(gic_irqs, 16); i++)
- gic_data[gic_nr].saved_spi_conf[i] =
+ gic->saved_spi_conf[i] =
readl_relaxed(dist_base + GIC_DIST_CONFIG + i * 4);

for (i = 0; i < DIV_ROUND_UP(gic_irqs, 4); i++)
- gic_data[gic_nr].saved_spi_target[i] =
+ gic->saved_spi_target[i] =
readl_relaxed(dist_base + GIC_DIST_TARGET + i * 4);

for (i = 0; i < DIV_ROUND_UP(gic_irqs, 32); i++)
- gic_data[gic_nr].saved_spi_enable[i] =
+ gic->saved_spi_enable[i] =
readl_relaxed(dist_base + GIC_DIST_ENABLE_SET + i * 4);
}

@@ -575,17 +612,16 @@ static void gic_dist_save(unsigned int gic_nr)
* handled normally, but any edge interrupts that occured will not be seen by
* the GIC and need to be handled by the platform-specific wakeup source.
*/
-static void gic_dist_restore(unsigned int gic_nr)
+static void gic_dist_restore(struct gic_chip_data *gic)
{
unsigned int gic_irqs;
unsigned int i;
void __iomem *dist_base;

- if (gic_nr >= MAX_GIC_NR)
- BUG();
+ BUG_ON(!gic);

- gic_irqs = gic_data[gic_nr].gic_irqs;
- dist_base = gic_data_dist_base(&gic_data[gic_nr]);
+ gic_irqs = gic->gic_irqs;
+ dist_base = gic_data_dist_base(gic);

if (!dist_base)
return;
@@ -593,7 +629,7 @@ static void gic_dist_restore(unsigned int gic_nr)
writel_relaxed(GICD_DISABLE, dist_base + GIC_DIST_CTRL);

for (i = 0; i < DIV_ROUND_UP(gic_irqs, 16); i++)
- writel_relaxed(gic_data[gic_nr].saved_spi_conf[i],
+ writel_relaxed(gic->saved_spi_conf[i],
dist_base + GIC_DIST_CONFIG + i * 4);

for (i = 0; i < DIV_ROUND_UP(gic_irqs, 4); i++)
@@ -601,16 +637,17 @@ static void gic_dist_restore(unsigned int gic_nr)
dist_base + GIC_DIST_PRI + i * 4);

for (i = 0; i < DIV_ROUND_UP(gic_irqs, 4); i++)
- writel_relaxed(gic_data[gic_nr].saved_spi_target[i],
+ writel_relaxed(gic->saved_spi_target[i],
dist_base + GIC_DIST_TARGET + i * 4);

for (i = 0; i < DIV_ROUND_UP(gic_irqs, 32); i++)
- writel_relaxed(gic_data[gic_nr].saved_spi_enable[i],
+ writel_relaxed(gic->saved_spi_enable[i],
dist_base + GIC_DIST_ENABLE_SET + i * 4);

writel_relaxed(GICD_ENABLE, dist_base + GIC_DIST_CTRL);
}

+#ifdef CONFIG_CPU_PM
static void gic_cpu_save(unsigned int gic_nr)
{
int i;
@@ -688,11 +725,11 @@ static int gic_notifier(struct notifier_block *self, unsigned long cmd, void *v)
gic_cpu_restore(i);
break;
case CPU_CLUSTER_PM_ENTER:
- gic_dist_save(i);
+ gic_dist_save(&gic_data[i]);
break;
case CPU_CLUSTER_PM_ENTER_FAILED:
case CPU_CLUSTER_PM_EXIT:
- gic_dist_restore(i);
+ gic_dist_restore(&gic_data[i]);
break;
}
}
@@ -998,19 +1035,15 @@ static const struct irq_domain_ops gic_irq_domain_ops = {
.unmap = gic_irq_domain_unmap,
};

-static void __init __gic_init_bases(unsigned int gic_nr, int irq_start,
- void __iomem *dist_base, void __iomem *cpu_base,
- u32 percpu_offset, struct fwnode_handle *handle)
+static int __gic_init_bases(struct gic_chip_data *gic, int irq_start,
+ void __iomem *dist_base, void __iomem *cpu_base,
+ u32 percpu_offset, struct fwnode_handle *handle)
{
irq_hw_number_t hwirq_base;
- struct gic_chip_data *gic;
int gic_irqs, irq_base, i;

- BUG_ON(gic_nr >= MAX_GIC_NR);
-
gic_check_cpu_features();

- gic = &gic_data[gic_nr];
#ifdef CONFIG_GIC_NON_BANKED
if (percpu_offset) { /* Frankein-GIC without banked registers... */
unsigned int cpu;
@@ -1021,7 +1054,7 @@ static void __init __gic_init_bases(unsigned int gic_nr, int irq_start,
!gic->cpu_base.percpu_base)) {
free_percpu(gic->dist_base.percpu_base);
free_percpu(gic->cpu_base.percpu_base);
- return;
+ return -ENOMEM;
}

for_each_possible_cpu(cpu) {
@@ -1063,7 +1096,7 @@ static void __init __gic_init_bases(unsigned int gic_nr, int irq_start,
* For primary GICs, skip over SGIs.
* For secondary GICs, skip over PPIs, too.
*/
- if (gic_nr == 0 && (irq_start & 31) > 0) {
+ if (gic == &gic_data[0] && (irq_start & 31) > 0) {
hwirq_base = 16;
if (irq_start != -1)
irq_start = (irq_start & ~31) + 16;
@@ -1086,9 +1119,9 @@ static void __init __gic_init_bases(unsigned int gic_nr, int irq_start,
}

if (WARN_ON(!gic->domain))
- return;
+ return -ENODEV;

- if (gic_nr == 0) {
+ if (gic == &gic_data[0]) {
/*
* Initialize the CPU interface map to all CPUs.
* It will be refined as each CPU probes its ID.
@@ -1107,18 +1140,29 @@ static void __init __gic_init_bases(unsigned int gic_nr, int irq_start,

gic_dist_init(gic);
gic_cpu_init(gic);
- gic_pm_init(gic);
+
+ return 0;
}

void __init gic_init(unsigned int gic_nr, int irq_start,
void __iomem *dist_base, void __iomem *cpu_base)
{
+ struct gic_chip_data *gic;
+
+ BUG_ON(gic_nr >= MAX_GIC_NR);
+
+ gic = &gic_data[gic_nr];
+
/*
* Non-DT/ACPI systems won't run a hypervisor, so let's not
* bother with these...
*/
static_key_slow_dec(&supports_deactivate);
- __gic_init_bases(gic_nr, irq_start, dist_base, cpu_base, 0, NULL);
+
+ if (__gic_init_bases(gic, irq_start, dist_base, cpu_base, 0, NULL))
+ return;
+
+ gic_pm_init(gic);
}

#ifdef CONFIG_OF
@@ -1162,21 +1206,30 @@ static bool gic_check_eoimode(struct device_node *node, void __iomem **base)
return true;
}

-static int __init
-gic_of_init(struct device_node *node, struct device_node *parent)
+static int gic_of_add(struct gic_chip_data *gic, struct device_node *node,
+ int irq)
{
void __iomem *cpu_base;
void __iomem *dist_base;
u32 percpu_offset;
- int irq;
+ int ret;

- if (WARN_ON(!node))
+ if (WARN_ON(!gic || !node))
return -ENODEV;

dist_base = of_iomap(node, 0);
- WARN(!dist_base, "unable to map gic dist registers\n");
+ if (!dist_base) {
+ WARN(1, "unable to map gic dist registers\n");
+ return -ENOMEM;
+ }

cpu_base = of_iomap(node, 1);
+ if (!cpu_base) {
+ WARN(1, "unable to map gic cpu registers\n");
+ ret = -ENOMEM;
+ goto err_map;
+ }
+
WARN(!cpu_base, "unable to map gic cpu registers\n");

/*
@@ -1189,20 +1242,51 @@ gic_of_init(struct device_node *node, struct device_node *parent)
if (of_property_read_u32(node, "cpu-offset", &percpu_offset))
percpu_offset = 0;

- __gic_init_bases(gic_cnt, -1, dist_base, cpu_base, percpu_offset,
- &node->fwnode);
- if (!gic_cnt)
+ ret = __gic_init_bases(gic, -1, dist_base, cpu_base, percpu_offset,
+ &node->fwnode);
+ if (ret)
+ goto err_gic;
+
+ if (gic == &gic_data[0])
gic_init_physaddr(node);

- if (parent) {
- irq = irq_of_parse_and_map(node, 0);
- gic_cascade_irq(gic_cnt, irq);
- }
+ if (irq)
+ irq_set_chained_handler_and_data(irq, gic_handle_cascade_irq,
+ gic);

if (IS_ENABLED(CONFIG_ARM_GIC_V2M))
gicv2m_of_init(node, gic_data[gic_cnt].domain);

+ return 0;
+
+err_gic:
+ iounmap(dist_base);
+err_map:
+ iounmap(cpu_base);
+
+ return ret;
+}
+
+static int __init
+gic_of_init(struct device_node *node, struct device_node *parent)
+{
+ struct gic_chip_data *gic;
+ int ret, irq = 0;
+
+ BUG_ON(gic_cnt >= MAX_GIC_NR);
+
+ gic = &gic_data[gic_cnt];
+
+ if (parent)
+ irq = irq_of_parse_and_map(node, 0);
+
+ ret = gic_of_add(gic, node, irq);
+ if (ret)
+ return ret;
+
+ gic_pm_init(gic);
gic_cnt++;
+
return 0;
}
IRQCHIP_DECLARE(gic_400, "arm,gic-400", gic_of_init);
@@ -1215,6 +1299,153 @@ IRQCHIP_DECLARE(msm_8660_qgic, "qcom,msm-8660-qgic", gic_of_init);
IRQCHIP_DECLARE(msm_qgic2, "qcom,msm-qgic2", gic_of_init);
IRQCHIP_DECLARE(pl390, "arm,pl390", gic_of_init);

+static int gic_runtime_resume_cb(struct device *dev)
+{
+ struct gic_chip_data *gic = dev_get_drvdata(dev);
+ int ret;
+
+ ret = clk_prepare_enable(gic->clk);
+ if (ret)
+ return ret;
+
+ gic_dist_restore(gic);
+
+ return 0;
+}
+
+static int gic_runtime_suspend_cb(struct device *dev)
+{
+ struct gic_chip_data *gic = dev_get_drvdata(dev);
+
+ gic_dist_save(gic);
+
+ clk_disable_unprepare(gic->clk);
+
+ return 0;
+}
+
+static int gic_probe(struct platform_device *pdev)
+{
+ struct device *dev = &pdev->dev;
+ struct gic_chip_data *gic;
+ int ret, irq;
+
+ if (dev->of_node == NULL)
+ return -EINVAL;
+
+ gic = devm_kzalloc(dev, sizeof(*gic), GFP_KERNEL);
+ if (!gic)
+ return -ENOMEM;
+
+ gic->clk = of_clk_get(dev->of_node, 0);
+ if (IS_ERR(gic->clk)) {
+ dev_err(dev, "clock not found\n");
+ return PTR_ERR(gic->clk);
+ }
+
+ gic->dev = dev;
+ platform_set_drvdata(pdev, gic);
+
+ pm_runtime_enable(dev);
+ if (pm_runtime_enabled(dev))
+ ret = pm_runtime_get_sync(dev);
+ else
+ ret = gic_runtime_resume_cb(dev);
+
+ if (ret < 0) {
+ pm_runtime_disable(dev);
+ goto err_rpm;
+ }
+
+ irq = irq_of_parse_and_map(dev->of_node, 0);
+ if (!irq) {
+ ret = -EINVAL;
+ goto err_irq;
+ }
+
+ ret = gic_of_add(gic, dev->of_node, irq);
+ if (ret)
+ goto err_gic;
+
+ pm_runtime_put(dev);
+
+ dev_info(dev, "GIC IRQ controller registered\n");
+
+ return 0;
+
+err_gic:
+ irq_dispose_mapping(irq);
+err_irq:
+ pm_runtime_disable(dev);
+ if (!pm_runtime_status_suspended(dev))
+ gic_runtime_suspend_cb(dev);
+err_rpm:
+ clk_put(gic->clk);
+
+ return ret;
+}
+
+static int gic_remove(struct platform_device *pdev)
+{
+ struct gic_chip_data *gic = platform_get_drvdata(pdev);
+ irq_hw_number_t hwirq;
+ void __iomem *base;
+
+ for (hwirq = 0; hwirq < gic->gic_irqs; hwirq++)
+ irq_dispose_mapping(irq_find_mapping(gic->domain, hwirq));
+
+ irq_domain_remove(gic->domain);
+
+ pm_runtime_disable(&pdev->dev);
+ if (!pm_runtime_status_suspended(&pdev->dev))
+ gic_runtime_suspend_cb(&pdev->dev);
+
+ base = gic_data_cpu_base(gic);
+ if (base)
+ iounmap(base);
+
+ base = gic_data_dist_base(gic);
+ if (base)
+ iounmap(base);
+
+ clk_put(gic->clk);
+
+ return 0;
+}
+
+#ifdef CONFIG_PM_SLEEP
+static int gic_suspend(struct device *dev)
+{
+ return pm_runtime_suspended(dev) == false;
+}
+#endif
+
+static const struct dev_pm_ops gic_pm_ops = {
+ SET_RUNTIME_PM_OPS(gic_runtime_suspend_cb,
+ gic_runtime_resume_cb, NULL)
+ SET_SYSTEM_SLEEP_PM_OPS(gic_suspend, NULL)
+};
+
+static const struct of_device_id gic_match[] = {
+ { .compatible = "nvidia,tegra210-agic", },
+ {},
+};
+MODULE_DEVICE_TABLE(of, gic_match);
+
+static struct platform_driver gic_driver = {
+ .probe = gic_probe,
+ .remove = gic_remove,
+ .driver = {
+ .name = "gic",
+ .of_match_table = gic_match,
+ .pm = &gic_pm_ops,
+ }
+};
+
+module_platform_driver(gic_driver);
+
+MODULE_DESCRIPTION("ARM GIC IRQ Controller");
+MODULE_LICENSE("GPL v2");
#endif

#ifdef CONFIG_ACPI
@@ -1279,7 +1510,7 @@ static int __init gic_v2_acpi_init(struct acpi_subtable_header *header,
struct acpi_madt_generic_distributor *dist;
void __iomem *cpu_base, *dist_base;
struct fwnode_handle *domain_handle;
- int count;
+ int count, ret;

/* Collect CPU base addresses */
count = acpi_table_parse_madt(ACPI_MADT_TYPE_GENERIC_INTERRUPT,
@@ -1322,7 +1553,17 @@ static int __init gic_v2_acpi_init(struct acpi_subtable_header *header,
return -ENOMEM;
}

- __gic_init_bases(0, -1, dist_base, cpu_base, 0, domain_handle);
+ ret = __gic_init_bases(&gic_data[0], -1, dist_base, cpu_base, 0,
+ domain_handle);
+ if (ret) {
+ pr_err("Failed to initialise GIC\n");
+ irq_domain_free_fwnode(domain_handle);
+ iounmap(cpu_base);
+ iounmap(dist_base);
+ return ret;
+ }
+
+ gic_pm_init(&gic_data[0]);

acpi_set_irq_model(ACPI_IRQ_MODEL_GIC, domain_handle);
return 0;
--
2.1.4

2015-11-10 15:27:30

by Thomas Gleixner

[permalink] [raw]
Subject: Re: [RFC PATCH 1/2] genirq: Add runtime resume/suspend support for IRQ chips

Jon,

On Tue, 10 Nov 2015, Jon Hunter wrote:
> void (*irq_suspend)(struct irq_data *data);
> void (*irq_resume)(struct irq_data *data);
> + int (*irq_runtime_suspend)(struct irq_data *data);
> + int (*irq_runtime_resume)(struct irq_data *data);
> void (*irq_pm_shutdown)(struct irq_data *data);

So this is the second patch within a few days which adds that just
with different names:

http://lkml.kernel.org/r/[email protected]

Can you folks please tell me which of the names is the correct one?

> +/* Inline functions for support of irq chips that require runtime pm */
> +static inline int chip_runtime_resume(struct irq_desc *desc)
> +{
> + if (!desc->irq_data.chip->irq_runtime_resume)
> + return 0;
> +
> + return desc->irq_data.chip->irq_runtime_resume(&desc->irq_data);
> +}
> +
> +static inline int chip_runtime_suspend(struct irq_desc *desc)
> +{
> + if (!desc->irq_data.chip->irq_runtime_suspend)
> + return 0;
> +
> + return desc->irq_data.chip->irq_runtime_suspend(&desc->irq_data);

We really don't need a return value for that one.

> +}
> +
> #define _IRQ_DESC_CHECK (1 << 0)
> #define _IRQ_DESC_PERCPU (1 << 1)
>
> diff --git a/kernel/irq/manage.c b/kernel/irq/manage.c
> index 0eebaeef317b..66e33df73140 100644
> --- a/kernel/irq/manage.c
> +++ b/kernel/irq/manage.c
> @@ -1116,6 +1116,10 @@ __setup_irq(unsigned int irq, struct irq_desc *desc, struct irqaction *new)
> if (!try_module_get(desc->owner))
> return -ENODEV;
>
> + ret = chip_runtime_resume(desc);
> + if (ret < 0)
> + return ret;

Leaks module ref count.

> +
> new->irq = irq;
>
> /*
> @@ -1393,6 +1397,7 @@ out_thread:
> put_task_struct(t);
> }
> out_mput:
> + chip_runtime_suspend(desc);
> module_put(desc->owner);
> return ret;
> }
> @@ -1506,6 +1511,7 @@ static struct irqaction *__free_irq(unsigned int irq, void *dev_id)
> }
> }
>
> + chip_runtime_suspend(desc);
> module_put(desc->owner);
> kfree(action->secondary);
> return action;
> @@ -1792,6 +1798,7 @@ static struct irqaction *__free_percpu_irq(unsigned int irq, void __percpu *dev_
>
> unregister_handler_proc(irq, action);
>
> + chip_runtime_suspend(desc);

Where is the corresponding call in request_percpu_irq() ?

Can you folks please agree on something which is correct and complete?

Thanks,

tglx

2015-11-10 15:58:34

by Jon Hunter

[permalink] [raw]
Subject: Re: [RFC PATCH 1/2] genirq: Add runtime resume/suspend support for IRQ chips

Hi Thomas,

On 10/11/15 15:26, Thomas Gleixner wrote:
> Jon,
>
> On Tue, 10 Nov 2015, Jon Hunter wrote:
>> void (*irq_suspend)(struct irq_data *data);
>> void (*irq_resume)(struct irq_data *data);
>> + int (*irq_runtime_suspend)(struct irq_data *data);
>> + int (*irq_runtime_resume)(struct irq_data *data);
>> void (*irq_pm_shutdown)(struct irq_data *data);
>
> So this is the second patch within a few days which adds that just
> with different names:
>
> http://lkml.kernel.org/r/[email protected]
>
> Can you folks please tell me which of the names is the correct one?

Sorry. I was unaware of that patch.

>> +/* Inline functions for support of irq chips that require runtime pm */
>> +static inline int chip_runtime_resume(struct irq_desc *desc)
>> +{
>> + if (!desc->irq_data.chip->irq_runtime_resume)
>> + return 0;
>> +
>> + return desc->irq_data.chip->irq_runtime_resume(&desc->irq_data);
>> +}
>> +
>> +static inline int chip_runtime_suspend(struct irq_desc *desc)
>> +{
>> + if (!desc->irq_data.chip->irq_runtime_suspend)
>> + return 0;
>> +
>> + return desc->irq_data.chip->irq_runtime_suspend(&desc->irq_data);
>
> We really don't need a return value for that one.

Ok.

>> +}
>> +
>> #define _IRQ_DESC_CHECK (1 << 0)
>> #define _IRQ_DESC_PERCPU (1 << 1)
>>
>> diff --git a/kernel/irq/manage.c b/kernel/irq/manage.c
>> index 0eebaeef317b..66e33df73140 100644
>> --- a/kernel/irq/manage.c
>> +++ b/kernel/irq/manage.c
>> @@ -1116,6 +1116,10 @@ __setup_irq(unsigned int irq, struct irq_desc *desc, struct irqaction *new)
>> if (!try_module_get(desc->owner))
>> return -ENODEV;
>>
>> + ret = chip_runtime_resume(desc);
>> + if (ret < 0)
>> + return ret;
>
> Leaks module ref count.

Ok.

>> +
>> new->irq = irq;
>>
>> /*
>> @@ -1393,6 +1397,7 @@ out_thread:
>> put_task_struct(t);
>> }
>> out_mput:
>> + chip_runtime_suspend(desc);
>> module_put(desc->owner);
>> return ret;
>> }
>> @@ -1506,6 +1511,7 @@ static struct irqaction *__free_irq(unsigned int irq, void *dev_id)
>> }
>> }
>>
>> + chip_runtime_suspend(desc);
>> module_put(desc->owner);
>> kfree(action->secondary);
>> return action;
>> @@ -1792,6 +1798,7 @@ static struct irqaction *__free_percpu_irq(unsigned int irq, void __percpu *dev_
>>
>> unregister_handler_proc(irq, action);
>>
>> + chip_runtime_suspend(desc);
>
> Where is the corresponding call in request_percpu_irq() ?

I was trying to simplify matters by placing the resume call in
__setup_irq() as opposed to requested_threaded_irq(). However, the would
mean the resume is inside the bus_lock and may be I should not assume
that I can sleep here.

> Can you folks please agree on something which is correct and complete?

Soren I am happy to defer to your patch and drop this. My only comment
would be what about the request_percpu_irq() path in your patch?

Cheers
Jon

2015-11-10 16:48:09

by Grygorii Strashko

[permalink] [raw]
Subject: Re: [RFC PATCH 1/2] genirq: Add runtime resume/suspend support for IRQ chips

Hi Jon,
On 11/10/2015 05:58 PM, Jon Hunter wrote:
> Hi Thomas,
>
> On 10/11/15 15:26, Thomas Gleixner wrote:
>> Jon,
>>
>> On Tue, 10 Nov 2015, Jon Hunter wrote:
>>> void (*irq_suspend)(struct irq_data *data);
>>> void (*irq_resume)(struct irq_data *data);
>>> + int (*irq_runtime_suspend)(struct irq_data *data);
>>> + int (*irq_runtime_resume)(struct irq_data *data);
>>> void (*irq_pm_shutdown)(struct irq_data *data);
>>
>> So this is the second patch within a few days which adds that just
>> with different names:
>>
>> http://lkml.kernel.org/r/[email protected]
>>
>> Can you folks please tell me which of the names is the correct one?
>
> Sorry. I was unaware of that patch.
>
>>> +/* Inline functions for support of irq chips that require runtime pm */
>>> +static inline int chip_runtime_resume(struct irq_desc *desc)
>>> +{
>>> + if (!desc->irq_data.chip->irq_runtime_resume)
>>> + return 0;
>>> +
>>> + return desc->irq_data.chip->irq_runtime_resume(&desc->irq_data);
>>> +}
>>> +
>>> +static inline int chip_runtime_suspend(struct irq_desc *desc)
>>> +{
>>> + if (!desc->irq_data.chip->irq_runtime_suspend)
>>> + return 0;
>>> +
>>> + return desc->irq_data.chip->irq_runtime_suspend(&desc->irq_data);
>>
>> We really don't need a return value for that one.
>
> Ok.
>
>>> +}
>>> +
>>> #define _IRQ_DESC_CHECK (1 << 0)
>>> #define _IRQ_DESC_PERCPU (1 << 1)
>>>
>>> diff --git a/kernel/irq/manage.c b/kernel/irq/manage.c
>>> index 0eebaeef317b..66e33df73140 100644
>>> --- a/kernel/irq/manage.c
>>> +++ b/kernel/irq/manage.c
>>> @@ -1116,6 +1116,10 @@ __setup_irq(unsigned int irq, struct irq_desc *desc, struct irqaction *new)
>>> if (!try_module_get(desc->owner))
>>> return -ENODEV;
>>>
>>> + ret = chip_runtime_resume(desc);
>>> + if (ret < 0)
>>> + return ret;
>>
>> Leaks module ref count.
>
> Ok.
>
>>> +
>>> new->irq = irq;
>>>
>>> /*
>>> @@ -1393,6 +1397,7 @@ out_thread:
>>> put_task_struct(t);
>>> }
>>> out_mput:
>>> + chip_runtime_suspend(desc);
>>> module_put(desc->owner);
>>> return ret;
>>> }
>>> @@ -1506,6 +1511,7 @@ static struct irqaction *__free_irq(unsigned int irq, void *dev_id)
>>> }
>>> }
>>>
>>> + chip_runtime_suspend(desc);
>>> module_put(desc->owner);
>>> kfree(action->secondary);
>>> return action;
>>> @@ -1792,6 +1798,7 @@ static struct irqaction *__free_percpu_irq(unsigned int irq, void __percpu *dev_
>>>
>>> unregister_handler_proc(irq, action);
>>>
>>> + chip_runtime_suspend(desc);
>>
>> Where is the corresponding call in request_percpu_irq() ?
>
> I was trying to simplify matters by placing the resume call in
> __setup_irq() as opposed to requested_threaded_irq(). However, the would
> mean the resume is inside the bus_lock and may be I should not assume
> that I can sleep here.
>
>> Can you folks please agree on something which is correct and complete?
>
> Soren I am happy to defer to your patch and drop this. My only comment
> would be what about the request_percpu_irq() path in your patch?
>

I have the same comment here as I asked Soren:
1) There are no restrictions to call irq set_irq_type() whenever,
as result HW can be accessed before request_x_irq()/__setup_irq().
And this is used quite widely now :(

For example, during OF boot:

[a] irq_create_of_mapping()
- irq_create_fwspec_mapping()
- irq_set_irq_type()

or
irq_set_irq_type(irq, IRQ_TYPE_LEVEL_HIGH);
irq_set_chained_handler(irq, mx31ads_expio_irq_handler);

or
irq_set_irq_type(alarm_irq, IRQ_TYPE_EDGE_BOTH);
err = devm_request_irq(&pdev->dev, alarm_irq, fan_alarm_irq_handler,
(there are ~200 occurrences of irq set_irq_type in Kernel)

2) if i'm not wrong, the same is valid for irq_set_irq_wake() and irq_set_affinity()

I'm not saying all these code is correct, but that what's now in kernel :(
I've tried to test Soren's patch with omap-gpio and immediately hit case [a] :.(


--
regards,
-grygorii

2015-11-10 18:07:19

by Lars-Peter Clausen

[permalink] [raw]
Subject: Re: [RFC PATCH 1/2] genirq: Add runtime resume/suspend support for IRQ chips

On 11/10/2015 05:47 PM, Grygorii Strashko wrote:
[...]
>> I was trying to simplify matters by placing the resume call in
>> __setup_irq() as opposed to requested_threaded_irq(). However, the would
>> mean the resume is inside the bus_lock and may be I should not assume
>> that I can sleep here.
>>
>>> Can you folks please agree on something which is correct and complete?
>>
>> Soren I am happy to defer to your patch and drop this. My only comment
>> would be what about the request_percpu_irq() path in your patch?
>>
>
> I have the same comment here as I asked Soren:
> 1) There are no restrictions to call irq set_irq_type() whenever,
> as result HW can be accessed before request_x_irq()/__setup_irq().
> And this is used quite widely now :(
>

Changing the configuration of a resource that is not owned seems to be
fairly broken. In the worst case this will overwrite the configuration that
was set by owner of the resource.

Especially those that call irq_set_irq_type() directly before request_irq(),
given that you supply the trigger type to request_irq() which will make sure
that there are no conflicts and the configure.

This is a bit like calling gpio_set_direction() before you call
gpio_request(), which will also have PM issues.

> For example, during OF boot:
>
> [a] irq_create_of_mapping()
> - irq_create_fwspec_mapping()
> - irq_set_irq_type()
>
> or
> irq_set_irq_type(irq, IRQ_TYPE_LEVEL_HIGH);
> irq_set_chained_handler(irq, mx31ads_expio_irq_handler);
>
> or
> irq_set_irq_type(alarm_irq, IRQ_TYPE_EDGE_BOTH);
> err = devm_request_irq(&pdev->dev, alarm_irq, fan_alarm_irq_handler,
> (there are ~200 occurrences of irq set_irq_type in Kernel)
>
> 2) if i'm not wrong, the same is valid for irq_set_irq_wake() and irq_set_affinity()
>
> I'm not saying all these code is correct, but that what's now in kernel :(
> I've tried to test Soren's patch with omap-gpio and immediately hit case [a] :.(

All functions for which are part of the public API and for which it is legal
to call them without calling request_irq() (or similar) first will need to
have pm_get()/pm_put().

2015-11-11 10:13:55

by Jon Hunter

[permalink] [raw]
Subject: Re: [RFC PATCH 1/2] genirq: Add runtime resume/suspend support for IRQ chips


On 10/11/15 18:07, Lars-Peter Clausen wrote:
> On 11/10/2015 05:47 PM, Grygorii Strashko wrote:
> [...]
>>> I was trying to simplify matters by placing the resume call in
>>> __setup_irq() as opposed to requested_threaded_irq(). However, the would
>>> mean the resume is inside the bus_lock and may be I should not assume
>>> that I can sleep here.
>>>
>>>> Can you folks please agree on something which is correct and complete?
>>>
>>> Soren I am happy to defer to your patch and drop this. My only comment
>>> would be what about the request_percpu_irq() path in your patch?
>>>
>>
>> I have the same comment here as I asked Soren:
>> 1) There are no restrictions to call irq set_irq_type() whenever,
>> as result HW can be accessed before request_x_irq()/__setup_irq().
>> And this is used quite widely now :(
>>
>
> Changing the configuration of a resource that is not owned seems to be
> fairly broken. In the worst case this will overwrite the configuration that
> was set by owner of the resource.
>
> Especially those that call irq_set_irq_type() directly before request_irq(),
> given that you supply the trigger type to request_irq() which will make sure
> that there are no conflicts and the configure.
>
> This is a bit like calling gpio_set_direction() before you call
> gpio_request(), which will also have PM issues.

Yes, I agree that this does sound a bit odd, but ...

>> For example, during OF boot:
>>
>> [a] irq_create_of_mapping()
>> - irq_create_fwspec_mapping()
>> - irq_set_irq_type()

The above means that if someone calls of_irq_get() (or
platform_get_irq()), before request_irq(), then this will call
irq_create_of_mapping() and hence, call irq_set_irq_type. So should
irq_create_fwspec_mapping() be setting the type in the first place? I
can see it is convenient to do it here.

>> or
>> irq_set_irq_type(irq, IRQ_TYPE_LEVEL_HIGH);
>> irq_set_chained_handler(irq, mx31ads_expio_irq_handler);
>>
>> or
>> irq_set_irq_type(alarm_irq, IRQ_TYPE_EDGE_BOTH);
>> err = devm_request_irq(&pdev->dev, alarm_irq, fan_alarm_irq_handler,
>> (there are ~200 occurrences of irq set_irq_type in Kernel)
>>
>> 2) if i'm not wrong, the same is valid for irq_set_irq_wake() and irq_set_affinity()
>>
>> I'm not saying all these code is correct, but that what's now in kernel :(
>> I've tried to test Soren's patch with omap-gpio and immediately hit case [a] :.(
>
> All functions for which are part of the public API and for which it is legal
> to call them without calling request_irq() (or similar) first will need to
> have pm_get()/pm_put().

Right. May be we can look at the various entry points to the chip
operators to get a feel for which public APIs need to be handled.

Cheers
Jon

2015-11-11 15:42:03

by Grygorii Strashko

[permalink] [raw]
Subject: Re: [RFC PATCH 1/2] genirq: Add runtime resume/suspend support for IRQ chips

On 11/11/2015 12:13 PM, Jon Hunter wrote:
>
> On 10/11/15 18:07, Lars-Peter Clausen wrote:
>> On 11/10/2015 05:47 PM, Grygorii Strashko wrote:
>> [...]
>>>> I was trying to simplify matters by placing the resume call in
>>>> __setup_irq() as opposed to requested_threaded_irq(). However, the would
>>>> mean the resume is inside the bus_lock and may be I should not assume
>>>> that I can sleep here.
>>>>
>>>>> Can you folks please agree on something which is correct and complete?
>>>>
>>>> Soren I am happy to defer to your patch and drop this. My only comment
>>>> would be what about the request_percpu_irq() path in your patch?
>>>>
>>>
>>> I have the same comment here as I asked Soren:
>>> 1) There are no restrictions to call irq set_irq_type() whenever,
>>> as result HW can be accessed before request_x_irq()/__setup_irq().
>>> And this is used quite widely now :(
>>>
>>
>> Changing the configuration of a resource that is not owned seems to be
>> fairly broken. In the worst case this will overwrite the configuration that
>> was set by owner of the resource.
>>
>> Especially those that call irq_set_irq_type() directly before request_irq(),
>> given that you supply the trigger type to request_irq() which will make sure
>> that there are no conflicts and the configure.
>>
>> This is a bit like calling gpio_set_direction() before you call
>> gpio_request(), which will also have PM issues.
>
> Yes, I agree that this does sound a bit odd, but ...
>
>>> For example, during OF boot:
>>>
>>> [a] irq_create_of_mapping()
>>> - irq_create_fwspec_mapping()
>>> - irq_set_irq_type()
>
> The above means that if someone calls of_irq_get() (or
> platform_get_irq()), before request_irq(), then this will call
> irq_create_of_mapping() and hence, call irq_set_irq_type. So should
> irq_create_fwspec_mapping() be setting the type in the first place? I
> can see it is convenient to do it here.

In general there is another option - save OF-flags and pass them to
__setup_irq() where they can be processed.

>
>>> or
[b]
>>> irq_set_irq_type(irq, IRQ_TYPE_LEVEL_HIGH);
>>> irq_set_chained_handler(irq, mx31ads_expio_irq_handler);

option: add "flag" parameter to irq_set_chained_handler

>>>
>>> or
[c]
>>> irq_set_irq_type(alarm_irq, IRQ_TYPE_EDGE_BOTH);
>>> err = devm_request_irq(&pdev->dev, alarm_irq, fan_alarm_irq_handler,
>>> (there are ~200 occurrences of irq set_irq_type in Kernel)
>>>
>>> 2) if i'm not wrong, the same is valid for irq_set_irq_wake() and irq_set_affinity()
>>>
>>> I'm not saying all these code is correct, but that what's now in kernel :(
>>> I've tried to test Soren's patch with omap-gpio and immediately hit case [a] :.(
>>
>> All functions for which are part of the public API and for which it is legal
>> to call them without calling request_irq() (or similar) first will need to
>> have pm_get()/pm_put().
>
> Right. May be we can look at the various entry points to the chip
> operators to get a feel for which public APIs need to be handled.


Seems yes. But we need to be very careful with this, some of functions could be
called recursively (nested), like:
[d]
static int pcf857x_irq_set_wake(struct irq_data *data, unsigned int on)
{
...
error = irq_set_irq_wake(gpio->irq_parent, on);


Personally, I have nothing against irq_pm_(get|put) :) and thought about similar things
when tried to solve the same problem for omap-gpio driver.
But :(, I have to fall back to irq_bus_lock/sync_unlock, because of [a,b,c] - all above
APIs surrounded by chip_bus_lock/chip_bus_sync_unlock. ([d] - I've not hit it just because
I was lucky).


--
regards,
-grygorii

2015-11-12 10:59:36

by Jon Hunter

[permalink] [raw]
Subject: Re: [RFC PATCH 1/2] genirq: Add runtime resume/suspend support for IRQ chips


On 11/11/15 15:41, Grygorii Strashko wrote:
> On 11/11/2015 12:13 PM, Jon Hunter wrote:
>>
>> On 10/11/15 18:07, Lars-Peter Clausen wrote:
>>> On 11/10/2015 05:47 PM, Grygorii Strashko wrote:
>>> [...]
>>>>> I was trying to simplify matters by placing the resume call in
>>>>> __setup_irq() as opposed to requested_threaded_irq(). However, the would
>>>>> mean the resume is inside the bus_lock and may be I should not assume
>>>>> that I can sleep here.
>>>>>
>>>>>> Can you folks please agree on something which is correct and complete?
>>>>>
>>>>> Soren I am happy to defer to your patch and drop this. My only comment
>>>>> would be what about the request_percpu_irq() path in your patch?
>>>>>
>>>>
>>>> I have the same comment here as I asked Soren:
>>>> 1) There are no restrictions to call irq set_irq_type() whenever,
>>>> as result HW can be accessed before request_x_irq()/__setup_irq().
>>>> And this is used quite widely now :(
>>>>
>>>
>>> Changing the configuration of a resource that is not owned seems to be
>>> fairly broken. In the worst case this will overwrite the configuration that
>>> was set by owner of the resource.
>>>
>>> Especially those that call irq_set_irq_type() directly before request_irq(),
>>> given that you supply the trigger type to request_irq() which will make sure
>>> that there are no conflicts and the configure.
>>>
>>> This is a bit like calling gpio_set_direction() before you call
>>> gpio_request(), which will also have PM issues.
>>
>> Yes, I agree that this does sound a bit odd, but ...
>>
>>>> For example, during OF boot:
>>>>
>>>> [a] irq_create_of_mapping()
>>>> - irq_create_fwspec_mapping()
>>>> - irq_set_irq_type()
>>
>> The above means that if someone calls of_irq_get() (or
>> platform_get_irq()), before request_irq(), then this will call
>> irq_create_of_mapping() and hence, call irq_set_irq_type. So should
>> irq_create_fwspec_mapping() be setting the type in the first place? I
>> can see it is convenient to do it here.
>
> In general there is another option - save OF-flags and pass them to
> __setup_irq() where they can be processed.

Right, we could look at doing something like this.

>>>> or
> [b]
>>>> irq_set_irq_type(irq, IRQ_TYPE_LEVEL_HIGH);
>>>> irq_set_chained_handler(irq, mx31ads_expio_irq_handler);
>
> option: add "flag" parameter to irq_set_chained_handler
>
>>>>
>>>> or
> [c]
>>>> irq_set_irq_type(alarm_irq, IRQ_TYPE_EDGE_BOTH);
>>>> err = devm_request_irq(&pdev->dev, alarm_irq, fan_alarm_irq_handler,
>>>> (there are ~200 occurrences of irq set_irq_type in Kernel)
>>>>
>>>> 2) if i'm not wrong, the same is valid for irq_set_irq_wake() and irq_set_affinity()
>>>>
>>>> I'm not saying all these code is correct, but that what's now in kernel :(
>>>> I've tried to test Soren's patch with omap-gpio and immediately hit case [a] :.(
>>>
>>> All functions for which are part of the public API and for which it is legal
>>> to call them without calling request_irq() (or similar) first will need to
>>> have pm_get()/pm_put().
>>
>> Right. May be we can look at the various entry points to the chip
>> operators to get a feel for which public APIs need to be handled.
>
>
> Seems yes. But we need to be very careful with this, some of functions could be
> called recursively (nested), like:
> [d]
> static int pcf857x_irq_set_wake(struct irq_data *data, unsigned int on)
> {
> ...
> error = irq_set_irq_wake(gpio->irq_parent, on);
>
>
> Personally, I have nothing against irq_pm_(get|put) :) and thought about similar things
> when tried to solve the same problem for omap-gpio driver.
> But :(, I have to fall back to irq_bus_lock/sync_unlock, because of [a,b,c] - all above
> APIs surrounded by chip_bus_lock/chip_bus_sync_unlock. ([d] - I've not hit it just because
> I was lucky).

I had a quick peek at the omap-gpio driver and I see that internally you
are using the gpio ref-count to manage RPM and use the bus-lock hooks to
invoke RPM.

This can definitely be complex when considering all the potential paths,
but I think that we need to a look at this from a chip-ops perspective
because only the chip knows if it is accessible or not. That said, what
we need to assess is:

1. Which chip-ops should ONLY be called after an IRQ has been allocated
(eg, enable/disable, mask/unmask, type, etc). These chip-ops should
not try to control the chip PM, but should possibly WARN if called
when the chip is not accessible.
2. For chip-ops that may be called without allocating an IRQ (eg.
bus_lock, irq_suspend, etc), can these be called from an atomic
context? If they might be called from an atomic context then these
are the chip-ops which will cause problems as we cannot guarantee
that all IRQ chips can support irq-safe RPM.

AFAICT, looking at the chip-ops, it appears to me that ones that should
be called without allocating a IRQ are:

@irq_cpu_online
@irq_cpu_offline
@irq_bus_lock
@irq_bus_sync_unlock
@irq_suspend
@irq_resume
@irq_pm_shutdown
@irq_calc_mask (not used by any chips?)
@irq_print_chip
@irq_get_irqchip_state? (not sure about this one)

Of the above the only one I can see that would be called in an atomic
context would be irq_get_irqchip_state() and I am not sure if that
should be called without allocating the IRQ first?

Bottom line is that think that the chip can protect itself from an
unexpected access. Yes setting the type needs to be sorted out, but
hopefully this is do-able.

Cheers
Jon

2015-11-12 13:20:18

by Lars-Peter Clausen

[permalink] [raw]
Subject: Re: [RFC PATCH 1/2] genirq: Add runtime resume/suspend support for IRQ chips

On 11/12/2015 11:59 AM, Jon Hunter wrote:
>
> On 11/11/15 15:41, Grygorii Strashko wrote:
>> On 11/11/2015 12:13 PM, Jon Hunter wrote:
>>>
>>> On 10/11/15 18:07, Lars-Peter Clausen wrote:
>>>> On 11/10/2015 05:47 PM, Grygorii Strashko wrote:
>>>> [...]
>>>>>> I was trying to simplify matters by placing the resume call in
>>>>>> __setup_irq() as opposed to requested_threaded_irq(). However, the would
>>>>>> mean the resume is inside the bus_lock and may be I should not assume
>>>>>> that I can sleep here.
>>>>>>
>>>>>>> Can you folks please agree on something which is correct and complete?
>>>>>>
>>>>>> Soren I am happy to defer to your patch and drop this. My only comment
>>>>>> would be what about the request_percpu_irq() path in your patch?
>>>>>>
>>>>>
>>>>> I have the same comment here as I asked Soren:
>>>>> 1) There are no restrictions to call irq set_irq_type() whenever,
>>>>> as result HW can be accessed before request_x_irq()/__setup_irq().
>>>>> And this is used quite widely now :(
>>>>>
>>>>
>>>> Changing the configuration of a resource that is not owned seems to be
>>>> fairly broken. In the worst case this will overwrite the configuration that
>>>> was set by owner of the resource.
>>>>
>>>> Especially those that call irq_set_irq_type() directly before request_irq(),
>>>> given that you supply the trigger type to request_irq() which will make sure
>>>> that there are no conflicts and the configure.
>>>>
>>>> This is a bit like calling gpio_set_direction() before you call
>>>> gpio_request(), which will also have PM issues.
>>>
>>> Yes, I agree that this does sound a bit odd, but ...
>>>
>>>>> For example, during OF boot:
>>>>>
>>>>> [a] irq_create_of_mapping()
>>>>> - irq_create_fwspec_mapping()
>>>>> - irq_set_irq_type()
>>>
>>> The above means that if someone calls of_irq_get() (or
>>> platform_get_irq()), before request_irq(), then this will call
>>> irq_create_of_mapping() and hence, call irq_set_irq_type. So should
>>> irq_create_fwspec_mapping() be setting the type in the first place? I
>>> can see it is convenient to do it here.
>>
>> In general there is another option - save OF-flags and pass them to
>> __setup_irq() where they can be processed.
>
> Right, we could look at doing something like this.
>
>>>>> or
>> [b]
>>>>> irq_set_irq_type(irq, IRQ_TYPE_LEVEL_HIGH);
>>>>> irq_set_chained_handler(irq, mx31ads_expio_irq_handler);
>>
>> option: add "flag" parameter to irq_set_chained_handler
>>
>>>>>
>>>>> or
>> [c]
>>>>> irq_set_irq_type(alarm_irq, IRQ_TYPE_EDGE_BOTH);
>>>>> err = devm_request_irq(&pdev->dev, alarm_irq, fan_alarm_irq_handler,
>>>>> (there are ~200 occurrences of irq set_irq_type in Kernel)
>>>>>
>>>>> 2) if i'm not wrong, the same is valid for irq_set_irq_wake() and irq_set_affinity()
>>>>>
>>>>> I'm not saying all these code is correct, but that what's now in kernel :(
>>>>> I've tried to test Soren's patch with omap-gpio and immediately hit case [a] :.(
>>>>
>>>> All functions for which are part of the public API and for which it is legal
>>>> to call them without calling request_irq() (or similar) first will need to
>>>> have pm_get()/pm_put().
>>>
>>> Right. May be we can look at the various entry points to the chip
>>> operators to get a feel for which public APIs need to be handled.
>>
>>
>> Seems yes. But we need to be very careful with this, some of functions could be
>> called recursively (nested), like:
>> [d]
>> static int pcf857x_irq_set_wake(struct irq_data *data, unsigned int on)
>> {
>> ...
>> error = irq_set_irq_wake(gpio->irq_parent, on);
>>
>>
>> Personally, I have nothing against irq_pm_(get|put) :) and thought about similar things
>> when tried to solve the same problem for omap-gpio driver.
>> But :(, I have to fall back to irq_bus_lock/sync_unlock, because of [a,b,c] - all above
>> APIs surrounded by chip_bus_lock/chip_bus_sync_unlock. ([d] - I've not hit it just because
>> I was lucky).
>
> I had a quick peek at the omap-gpio driver and I see that internally you
> are using the gpio ref-count to manage RPM and use the bus-lock hooks to
> invoke RPM.
>
> This can definitely be complex when considering all the potential paths,
> but I think that we need to a look at this from a chip-ops perspective
> because only the chip knows if it is accessible or not. That said, what
> we need to assess is:
>
> 1. Which chip-ops should ONLY be called after an IRQ has been allocated
> (eg, enable/disable, mask/unmask, type, etc). These chip-ops should
> not try to control the chip PM, but should possibly WARN if called
> when the chip is not accessible.
> 2. For chip-ops that may be called without allocating an IRQ (eg.
> bus_lock, irq_suspend, etc), can these be called from an atomic
> context? If they might be called from an atomic context then these
> are the chip-ops which will cause problems as we cannot guarantee
> that all IRQ chips can support irq-safe RPM.

They can't. chip_bus_lock() can sleep, so anything that locks the bus can't
be called from atomic context.

One easy way out might be to always call pm_get/pm_but from
bus_lock,/bus_unlock. This way the chip is guaranteed to be powered up when
accessed happens. In addition pm_get is called when the IRQ is request and
pm_put is called when the IRQ is release, this is to ensure the chip stays
powered when it is actively monitoring the IRQ lines.

2015-11-12 13:30:09

by Grygorii Strashko

[permalink] [raw]
Subject: Re: [RFC PATCH 1/2] genirq: Add runtime resume/suspend support for IRQ chips

On 11/12/2015 03:20 PM, Lars-Peter Clausen wrote:
> On 11/12/2015 11:59 AM, Jon Hunter wrote:
>>
>> On 11/11/15 15:41, Grygorii Strashko wrote:
>>> On 11/11/2015 12:13 PM, Jon Hunter wrote:
>>>>
>>>> On 10/11/15 18:07, Lars-Peter Clausen wrote:
>>>>> On 11/10/2015 05:47 PM, Grygorii Strashko wrote:
>>>>> [...]
>>>>>>> I was trying to simplify matters by placing the resume call in
>>>>>>> __setup_irq() as opposed to requested_threaded_irq(). However, the would
>>>>>>> mean the resume is inside the bus_lock and may be I should not assume
>>>>>>> that I can sleep here.
>>>>>>>
>>>>>>>> Can you folks please agree on something which is correct and complete?
>>>>>>>
>>>>>>> Soren I am happy to defer to your patch and drop this. My only comment
>>>>>>> would be what about the request_percpu_irq() path in your patch?
>>>>>>>
>>>>>>
>>>>>> I have the same comment here as I asked Soren:
>>>>>> 1) There are no restrictions to call irq set_irq_type() whenever,
>>>>>> as result HW can be accessed before request_x_irq()/__setup_irq().
>>>>>> And this is used quite widely now :(
>>>>>>
>>>>>
>>>>> Changing the configuration of a resource that is not owned seems to be
>>>>> fairly broken. In the worst case this will overwrite the configuration that
>>>>> was set by owner of the resource.
>>>>>
>>>>> Especially those that call irq_set_irq_type() directly before request_irq(),
>>>>> given that you supply the trigger type to request_irq() which will make sure
>>>>> that there are no conflicts and the configure.
>>>>>
>>>>> This is a bit like calling gpio_set_direction() before you call
>>>>> gpio_request(), which will also have PM issues.
>>>>
>>>> Yes, I agree that this does sound a bit odd, but ...
>>>>
>>>>>> For example, during OF boot:
>>>>>>
>>>>>> [a] irq_create_of_mapping()
>>>>>> - irq_create_fwspec_mapping()
>>>>>> - irq_set_irq_type()
>>>>
>>>> The above means that if someone calls of_irq_get() (or
>>>> platform_get_irq()), before request_irq(), then this will call
>>>> irq_create_of_mapping() and hence, call irq_set_irq_type. So should
>>>> irq_create_fwspec_mapping() be setting the type in the first place? I
>>>> can see it is convenient to do it here.
>>>
>>> In general there is another option - save OF-flags and pass them to
>>> __setup_irq() where they can be processed.
>>
>> Right, we could look at doing something like this.
>>
>>>>>> or
>>> [b]
>>>>>> irq_set_irq_type(irq, IRQ_TYPE_LEVEL_HIGH);
>>>>>> irq_set_chained_handler(irq, mx31ads_expio_irq_handler);
>>>
>>> option: add "flag" parameter to irq_set_chained_handler
>>>
>>>>>>
>>>>>> or
>>> [c]
>>>>>> irq_set_irq_type(alarm_irq, IRQ_TYPE_EDGE_BOTH);
>>>>>> err = devm_request_irq(&pdev->dev, alarm_irq, fan_alarm_irq_handler,
>>>>>> (there are ~200 occurrences of irq set_irq_type in Kernel)
>>>>>>
>>>>>> 2) if i'm not wrong, the same is valid for irq_set_irq_wake() and irq_set_affinity()
>>>>>>
>>>>>> I'm not saying all these code is correct, but that what's now in kernel :(
>>>>>> I've tried to test Soren's patch with omap-gpio and immediately hit case [a] :.(
>>>>>
>>>>> All functions for which are part of the public API and for which it is legal
>>>>> to call them without calling request_irq() (or similar) first will need to
>>>>> have pm_get()/pm_put().
>>>>
>>>> Right. May be we can look at the various entry points to the chip
>>>> operators to get a feel for which public APIs need to be handled.
>>>
>>>
>>> Seems yes. But we need to be very careful with this, some of functions could be
>>> called recursively (nested), like:
>>> [d]
>>> static int pcf857x_irq_set_wake(struct irq_data *data, unsigned int on)
>>> {
>>> ...
>>> error = irq_set_irq_wake(gpio->irq_parent, on);
>>>
>>>
>>> Personally, I have nothing against irq_pm_(get|put) :) and thought about similar things
>>> when tried to solve the same problem for omap-gpio driver.
>>> But :(, I have to fall back to irq_bus_lock/sync_unlock, because of [a,b,c] - all above
>>> APIs surrounded by chip_bus_lock/chip_bus_sync_unlock. ([d] - I've not hit it just because
>>> I was lucky).
>>
>> I had a quick peek at the omap-gpio driver and I see that internally you
>> are using the gpio ref-count to manage RPM and use the bus-lock hooks to
>> invoke RPM.
>>
>> This can definitely be complex when considering all the potential paths,
>> but I think that we need to a look at this from a chip-ops perspective
>> because only the chip knows if it is accessible or not. That said, what
>> we need to assess is:
>>
>> 1. Which chip-ops should ONLY be called after an IRQ has been allocated
>> (eg, enable/disable, mask/unmask, type, etc). These chip-ops should
>> not try to control the chip PM, but should possibly WARN if called
>> when the chip is not accessible.
>> 2. For chip-ops that may be called without allocating an IRQ (eg.
>> bus_lock, irq_suspend, etc), can these be called from an atomic
>> context? If they might be called from an atomic context then these
>> are the chip-ops which will cause problems as we cannot guarantee
>> that all IRQ chips can support irq-safe RPM.
>
> They can't. chip_bus_lock() can sleep, so anything that locks the bus can't
> be called from atomic context.
>
> One easy way out might be to always call pm_get/pm_but from
> bus_lock,/bus_unlock. This way the chip is guaranteed to be powered up when
> accessed happens. In addition pm_get is called when the IRQ is request and
> pm_put is called when the IRQ is release, this is to ensure the chip stays
> powered when it is actively monitoring the IRQ lines.
>

In general, this is simplest possible solution. More over, if irqchip will have
dev field PM runtime could be used directly instead of get/put.

but.. :( How about problem [d]?

--
regards,
-grygorii

2015-11-12 13:34:59

by Lars-Peter Clausen

[permalink] [raw]
Subject: Re: [RFC PATCH 1/2] genirq: Add runtime resume/suspend support for IRQ chips

On 11/12/2015 02:29 PM, Grygorii Strashko wrote:
> On 11/12/2015 03:20 PM, Lars-Peter Clausen wrote:
>> On 11/12/2015 11:59 AM, Jon Hunter wrote:
>>>
>>> On 11/11/15 15:41, Grygorii Strashko wrote:
>>>> On 11/11/2015 12:13 PM, Jon Hunter wrote:
>>>>>
>>>>> On 10/11/15 18:07, Lars-Peter Clausen wrote:
>>>>>> On 11/10/2015 05:47 PM, Grygorii Strashko wrote:
>>>>>> [...]
>>>>>>>> I was trying to simplify matters by placing the resume call in
>>>>>>>> __setup_irq() as opposed to requested_threaded_irq(). However, the would
>>>>>>>> mean the resume is inside the bus_lock and may be I should not assume
>>>>>>>> that I can sleep here.
>>>>>>>>
>>>>>>>>> Can you folks please agree on something which is correct and complete?
>>>>>>>>
>>>>>>>> Soren I am happy to defer to your patch and drop this. My only comment
>>>>>>>> would be what about the request_percpu_irq() path in your patch?
>>>>>>>>
>>>>>>>
>>>>>>> I have the same comment here as I asked Soren:
>>>>>>> 1) There are no restrictions to call irq set_irq_type() whenever,
>>>>>>> as result HW can be accessed before request_x_irq()/__setup_irq().
>>>>>>> And this is used quite widely now :(
>>>>>>>
>>>>>>
>>>>>> Changing the configuration of a resource that is not owned seems to be
>>>>>> fairly broken. In the worst case this will overwrite the configuration that
>>>>>> was set by owner of the resource.
>>>>>>
>>>>>> Especially those that call irq_set_irq_type() directly before request_irq(),
>>>>>> given that you supply the trigger type to request_irq() which will make sure
>>>>>> that there are no conflicts and the configure.
>>>>>>
>>>>>> This is a bit like calling gpio_set_direction() before you call
>>>>>> gpio_request(), which will also have PM issues.
>>>>>
>>>>> Yes, I agree that this does sound a bit odd, but ...
>>>>>
>>>>>>> For example, during OF boot:
>>>>>>>
>>>>>>> [a] irq_create_of_mapping()
>>>>>>> - irq_create_fwspec_mapping()
>>>>>>> - irq_set_irq_type()
>>>>>
>>>>> The above means that if someone calls of_irq_get() (or
>>>>> platform_get_irq()), before request_irq(), then this will call
>>>>> irq_create_of_mapping() and hence, call irq_set_irq_type. So should
>>>>> irq_create_fwspec_mapping() be setting the type in the first place? I
>>>>> can see it is convenient to do it here.
>>>>
>>>> In general there is another option - save OF-flags and pass them to
>>>> __setup_irq() where they can be processed.
>>>
>>> Right, we could look at doing something like this.
>>>
>>>>>>> or
>>>> [b]
>>>>>>> irq_set_irq_type(irq, IRQ_TYPE_LEVEL_HIGH);
>>>>>>> irq_set_chained_handler(irq, mx31ads_expio_irq_handler);
>>>>
>>>> option: add "flag" parameter to irq_set_chained_handler
>>>>
>>>>>>>
>>>>>>> or
>>>> [c]
>>>>>>> irq_set_irq_type(alarm_irq, IRQ_TYPE_EDGE_BOTH);
>>>>>>> err = devm_request_irq(&pdev->dev, alarm_irq, fan_alarm_irq_handler,
>>>>>>> (there are ~200 occurrences of irq set_irq_type in Kernel)
>>>>>>>
>>>>>>> 2) if i'm not wrong, the same is valid for irq_set_irq_wake() and irq_set_affinity()
>>>>>>>
>>>>>>> I'm not saying all these code is correct, but that what's now in kernel :(
>>>>>>> I've tried to test Soren's patch with omap-gpio and immediately hit case [a] :.(
>>>>>>
>>>>>> All functions for which are part of the public API and for which it is legal
>>>>>> to call them without calling request_irq() (or similar) first will need to
>>>>>> have pm_get()/pm_put().
>>>>>
>>>>> Right. May be we can look at the various entry points to the chip
>>>>> operators to get a feel for which public APIs need to be handled.
>>>>
>>>>
>>>> Seems yes. But we need to be very careful with this, some of functions could be
>>>> called recursively (nested), like:
>>>> [d]
>>>> static int pcf857x_irq_set_wake(struct irq_data *data, unsigned int on)
>>>> {
>>>> ...
>>>> error = irq_set_irq_wake(gpio->irq_parent, on);
>>>>
>>>>
>>>> Personally, I have nothing against irq_pm_(get|put) :) and thought about similar things
>>>> when tried to solve the same problem for omap-gpio driver.
>>>> But :(, I have to fall back to irq_bus_lock/sync_unlock, because of [a,b,c] - all above
>>>> APIs surrounded by chip_bus_lock/chip_bus_sync_unlock. ([d] - I've not hit it just because
>>>> I was lucky).
>>>
>>> I had a quick peek at the omap-gpio driver and I see that internally you
>>> are using the gpio ref-count to manage RPM and use the bus-lock hooks to
>>> invoke RPM.
>>>
>>> This can definitely be complex when considering all the potential paths,
>>> but I think that we need to a look at this from a chip-ops perspective
>>> because only the chip knows if it is accessible or not. That said, what
>>> we need to assess is:
>>>
>>> 1. Which chip-ops should ONLY be called after an IRQ has been allocated
>>> (eg, enable/disable, mask/unmask, type, etc). These chip-ops should
>>> not try to control the chip PM, but should possibly WARN if called
>>> when the chip is not accessible.
>>> 2. For chip-ops that may be called without allocating an IRQ (eg.
>>> bus_lock, irq_suspend, etc), can these be called from an atomic
>>> context? If they might be called from an atomic context then these
>>> are the chip-ops which will cause problems as we cannot guarantee
>>> that all IRQ chips can support irq-safe RPM.
>>
>> They can't. chip_bus_lock() can sleep, so anything that locks the bus can't
>> be called from atomic context.
>>
>> One easy way out might be to always call pm_get/pm_but from
>> bus_lock,/bus_unlock. This way the chip is guaranteed to be powered up when
>> accessed happens. In addition pm_get is called when the IRQ is request and
>> pm_put is called when the IRQ is release, this is to ensure the chip stays
>> powered when it is actively monitoring the IRQ lines.
>>
>
> In general, this is simplest possible solution. More over, if irqchip will have
> dev field PM runtime could be used directly instead of get/put.
>
> but.. :( How about problem [d]?
>

Can you explain why you thing this is a problem? I don't see the issue.

2015-11-12 13:35:37

by Jon Hunter

[permalink] [raw]
Subject: Re: [RFC PATCH 1/2] genirq: Add runtime resume/suspend support for IRQ chips


On 12/11/15 13:20, Lars-Peter Clausen wrote:
> On 11/12/2015 11:59 AM, Jon Hunter wrote:
>>
>> On 11/11/15 15:41, Grygorii Strashko wrote:
>>> On 11/11/2015 12:13 PM, Jon Hunter wrote:
>>>>
>>>> On 10/11/15 18:07, Lars-Peter Clausen wrote:
>>>>> On 11/10/2015 05:47 PM, Grygorii Strashko wrote:
>>>>> [...]
>>>>>>> I was trying to simplify matters by placing the resume call in
>>>>>>> __setup_irq() as opposed to requested_threaded_irq(). However, the would
>>>>>>> mean the resume is inside the bus_lock and may be I should not assume
>>>>>>> that I can sleep here.
>>>>>>>
>>>>>>>> Can you folks please agree on something which is correct and complete?
>>>>>>>
>>>>>>> Soren I am happy to defer to your patch and drop this. My only comment
>>>>>>> would be what about the request_percpu_irq() path in your patch?
>>>>>>>
>>>>>>
>>>>>> I have the same comment here as I asked Soren:
>>>>>> 1) There are no restrictions to call irq set_irq_type() whenever,
>>>>>> as result HW can be accessed before request_x_irq()/__setup_irq().
>>>>>> And this is used quite widely now :(
>>>>>>
>>>>>
>>>>> Changing the configuration of a resource that is not owned seems to be
>>>>> fairly broken. In the worst case this will overwrite the configuration that
>>>>> was set by owner of the resource.
>>>>>
>>>>> Especially those that call irq_set_irq_type() directly before request_irq(),
>>>>> given that you supply the trigger type to request_irq() which will make sure
>>>>> that there are no conflicts and the configure.
>>>>>
>>>>> This is a bit like calling gpio_set_direction() before you call
>>>>> gpio_request(), which will also have PM issues.
>>>>
>>>> Yes, I agree that this does sound a bit odd, but ...
>>>>
>>>>>> For example, during OF boot:
>>>>>>
>>>>>> [a] irq_create_of_mapping()
>>>>>> - irq_create_fwspec_mapping()
>>>>>> - irq_set_irq_type()
>>>>
>>>> The above means that if someone calls of_irq_get() (or
>>>> platform_get_irq()), before request_irq(), then this will call
>>>> irq_create_of_mapping() and hence, call irq_set_irq_type. So should
>>>> irq_create_fwspec_mapping() be setting the type in the first place? I
>>>> can see it is convenient to do it here.
>>>
>>> In general there is another option - save OF-flags and pass them to
>>> __setup_irq() where they can be processed.
>>
>> Right, we could look at doing something like this.
>>
>>>>>> or
>>> [b]
>>>>>> irq_set_irq_type(irq, IRQ_TYPE_LEVEL_HIGH);
>>>>>> irq_set_chained_handler(irq, mx31ads_expio_irq_handler);
>>>
>>> option: add "flag" parameter to irq_set_chained_handler
>>>
>>>>>>
>>>>>> or
>>> [c]
>>>>>> irq_set_irq_type(alarm_irq, IRQ_TYPE_EDGE_BOTH);
>>>>>> err = devm_request_irq(&pdev->dev, alarm_irq, fan_alarm_irq_handler,
>>>>>> (there are ~200 occurrences of irq set_irq_type in Kernel)
>>>>>>
>>>>>> 2) if i'm not wrong, the same is valid for irq_set_irq_wake() and irq_set_affinity()
>>>>>>
>>>>>> I'm not saying all these code is correct, but that what's now in kernel :(
>>>>>> I've tried to test Soren's patch with omap-gpio and immediately hit case [a] :.(
>>>>>
>>>>> All functions for which are part of the public API and for which it is legal
>>>>> to call them without calling request_irq() (or similar) first will need to
>>>>> have pm_get()/pm_put().
>>>>
>>>> Right. May be we can look at the various entry points to the chip
>>>> operators to get a feel for which public APIs need to be handled.
>>>
>>>
>>> Seems yes. But we need to be very careful with this, some of functions could be
>>> called recursively (nested), like:
>>> [d]
>>> static int pcf857x_irq_set_wake(struct irq_data *data, unsigned int on)
>>> {
>>> ...
>>> error = irq_set_irq_wake(gpio->irq_parent, on);
>>>
>>>
>>> Personally, I have nothing against irq_pm_(get|put) :) and thought about similar things
>>> when tried to solve the same problem for omap-gpio driver.
>>> But :(, I have to fall back to irq_bus_lock/sync_unlock, because of [a,b,c] - all above
>>> APIs surrounded by chip_bus_lock/chip_bus_sync_unlock. ([d] - I've not hit it just because
>>> I was lucky).
>>
>> I had a quick peek at the omap-gpio driver and I see that internally you
>> are using the gpio ref-count to manage RPM and use the bus-lock hooks to
>> invoke RPM.
>>
>> This can definitely be complex when considering all the potential paths,
>> but I think that we need to a look at this from a chip-ops perspective
>> because only the chip knows if it is accessible or not. That said, what
>> we need to assess is:
>>
>> 1. Which chip-ops should ONLY be called after an IRQ has been allocated
>> (eg, enable/disable, mask/unmask, type, etc). These chip-ops should
>> not try to control the chip PM, but should possibly WARN if called
>> when the chip is not accessible.
>> 2. For chip-ops that may be called without allocating an IRQ (eg.
>> bus_lock, irq_suspend, etc), can these be called from an atomic
>> context? If they might be called from an atomic context then these
>> are the chip-ops which will cause problems as we cannot guarantee
>> that all IRQ chips can support irq-safe RPM.
>
> They can't. chip_bus_lock() can sleep, so anything that locks the bus can't
> be called from atomic context.

Sorry, what can't? Yes I understand that we cannot call anything that
locks the bus from an atomic context.

> One easy way out might be to always call pm_get/pm_but from
> bus_lock,/bus_unlock. This way the chip is guaranteed to be powered up when
> accessed happens. In addition pm_get is called when the IRQ is request and
> pm_put is called when the IRQ is release, this is to ensure the chip stays
> powered when it is actively monitoring the IRQ lines.

Yes I had thought about that, but it is not quite that easy, because in
the case of request_irq() you don't want to pm_put() after the
bus_unlock(). However, the bus_lock/unlock() are good indicators of
different paths.

Cheers
Jon


2015-11-12 13:47:30

by Lars-Peter Clausen

[permalink] [raw]
Subject: Re: [RFC PATCH 1/2] genirq: Add runtime resume/suspend support for IRQ chips

On 11/12/2015 02:35 PM, Jon Hunter wrote:
>
> On 12/11/15 13:20, Lars-Peter Clausen wrote:
>> On 11/12/2015 11:59 AM, Jon Hunter wrote:
>>>
>>> On 11/11/15 15:41, Grygorii Strashko wrote:
>>>> On 11/11/2015 12:13 PM, Jon Hunter wrote:
>>>>>
>>>>> On 10/11/15 18:07, Lars-Peter Clausen wrote:
>>>>>> On 11/10/2015 05:47 PM, Grygorii Strashko wrote:
>>>>>> [...]
>>>>>>>> I was trying to simplify matters by placing the resume call in
>>>>>>>> __setup_irq() as opposed to requested_threaded_irq(). However, the would
>>>>>>>> mean the resume is inside the bus_lock and may be I should not assume
>>>>>>>> that I can sleep here.
>>>>>>>>
>>>>>>>>> Can you folks please agree on something which is correct and complete?
>>>>>>>>
>>>>>>>> Soren I am happy to defer to your patch and drop this. My only comment
>>>>>>>> would be what about the request_percpu_irq() path in your patch?
>>>>>>>>
>>>>>>>
>>>>>>> I have the same comment here as I asked Soren:
>>>>>>> 1) There are no restrictions to call irq set_irq_type() whenever,
>>>>>>> as result HW can be accessed before request_x_irq()/__setup_irq().
>>>>>>> And this is used quite widely now :(
>>>>>>>
>>>>>>
>>>>>> Changing the configuration of a resource that is not owned seems to be
>>>>>> fairly broken. In the worst case this will overwrite the configuration that
>>>>>> was set by owner of the resource.
>>>>>>
>>>>>> Especially those that call irq_set_irq_type() directly before request_irq(),
>>>>>> given that you supply the trigger type to request_irq() which will make sure
>>>>>> that there are no conflicts and the configure.
>>>>>>
>>>>>> This is a bit like calling gpio_set_direction() before you call
>>>>>> gpio_request(), which will also have PM issues.
>>>>>
>>>>> Yes, I agree that this does sound a bit odd, but ...
>>>>>
>>>>>>> For example, during OF boot:
>>>>>>>
>>>>>>> [a] irq_create_of_mapping()
>>>>>>> - irq_create_fwspec_mapping()
>>>>>>> - irq_set_irq_type()
>>>>>
>>>>> The above means that if someone calls of_irq_get() (or
>>>>> platform_get_irq()), before request_irq(), then this will call
>>>>> irq_create_of_mapping() and hence, call irq_set_irq_type. So should
>>>>> irq_create_fwspec_mapping() be setting the type in the first place? I
>>>>> can see it is convenient to do it here.
>>>>
>>>> In general there is another option - save OF-flags and pass them to
>>>> __setup_irq() where they can be processed.
>>>
>>> Right, we could look at doing something like this.
>>>
>>>>>>> or
>>>> [b]
>>>>>>> irq_set_irq_type(irq, IRQ_TYPE_LEVEL_HIGH);
>>>>>>> irq_set_chained_handler(irq, mx31ads_expio_irq_handler);
>>>>
>>>> option: add "flag" parameter to irq_set_chained_handler
>>>>
>>>>>>>
>>>>>>> or
>>>> [c]
>>>>>>> irq_set_irq_type(alarm_irq, IRQ_TYPE_EDGE_BOTH);
>>>>>>> err = devm_request_irq(&pdev->dev, alarm_irq, fan_alarm_irq_handler,
>>>>>>> (there are ~200 occurrences of irq set_irq_type in Kernel)
>>>>>>>
>>>>>>> 2) if i'm not wrong, the same is valid for irq_set_irq_wake() and irq_set_affinity()
>>>>>>>
>>>>>>> I'm not saying all these code is correct, but that what's now in kernel :(
>>>>>>> I've tried to test Soren's patch with omap-gpio and immediately hit case [a] :.(
>>>>>>
>>>>>> All functions for which are part of the public API and for which it is legal
>>>>>> to call them without calling request_irq() (or similar) first will need to
>>>>>> have pm_get()/pm_put().
>>>>>
>>>>> Right. May be we can look at the various entry points to the chip
>>>>> operators to get a feel for which public APIs need to be handled.
>>>>
>>>>
>>>> Seems yes. But we need to be very careful with this, some of functions could be
>>>> called recursively (nested), like:
>>>> [d]
>>>> static int pcf857x_irq_set_wake(struct irq_data *data, unsigned int on)
>>>> {
>>>> ...
>>>> error = irq_set_irq_wake(gpio->irq_parent, on);
>>>>
>>>>
>>>> Personally, I have nothing against irq_pm_(get|put) :) and thought about similar things
>>>> when tried to solve the same problem for omap-gpio driver.
>>>> But :(, I have to fall back to irq_bus_lock/sync_unlock, because of [a,b,c] - all above
>>>> APIs surrounded by chip_bus_lock/chip_bus_sync_unlock. ([d] - I've not hit it just because
>>>> I was lucky).
>>>
>>> I had a quick peek at the omap-gpio driver and I see that internally you
>>> are using the gpio ref-count to manage RPM and use the bus-lock hooks to
>>> invoke RPM.
>>>
>>> This can definitely be complex when considering all the potential paths,
>>> but I think that we need to a look at this from a chip-ops perspective
>>> because only the chip knows if it is accessible or not. That said, what
>>> we need to assess is:
>>>
>>> 1. Which chip-ops should ONLY be called after an IRQ has been allocated
>>> (eg, enable/disable, mask/unmask, type, etc). These chip-ops should
>>> not try to control the chip PM, but should possibly WARN if called
>>> when the chip is not accessible.
>>> 2. For chip-ops that may be called without allocating an IRQ (eg.
>>> bus_lock, irq_suspend, etc), can these be called from an atomic
>>> context? If they might be called from an atomic context then these
>>> are the chip-ops which will cause problems as we cannot guarantee
>>> that all IRQ chips can support irq-safe RPM.
>>
>> They can't. chip_bus_lock() can sleep, so anything that locks the bus can't
>> be called from atomic context.
>
> Sorry, what can't? Yes I understand that we cannot call anything that
> locks the bus from an atomic context.

They can't be called from atomic context. The chip_bus_lock() function may
sleep and you can't access the device without previously locking the bus.
Since the device only needs to be powered up when it is accessed its safe to
assume that the places where you need pm_get()/pm_put() are only called from
non-atomic context.

>
>> One easy way out might be to always call pm_get/pm_but from
>> bus_lock,/bus_unlock. This way the chip is guaranteed to be powered up when
>> accessed happens. In addition pm_get is called when the IRQ is request and
>> pm_put is called when the IRQ is release, this is to ensure the chip stays
>> powered when it is actively monitoring the IRQ lines.
>
> Yes I had thought about that, but it is not quite that easy, because in
> the case of request_irq() you don't want to pm_put() after the
> bus_unlock(). However, the bus_lock/unlock() are good indicators of
> different paths.

You'd call pm_get() twice in request_irq() once from bus_lock() and once
independently, that way you still have a reference even after the bus_unlock().

2015-11-12 14:02:33

by Jon Hunter

[permalink] [raw]
Subject: Re: [RFC PATCH 1/2] genirq: Add runtime resume/suspend support for IRQ chips


On 12/11/15 13:47, Lars-Peter Clausen wrote:
> On 11/12/2015 02:35 PM, Jon Hunter wrote:
>>
>> On 12/11/15 13:20, Lars-Peter Clausen wrote:
>>> On 11/12/2015 11:59 AM, Jon Hunter wrote:
>>>>
>>>> On 11/11/15 15:41, Grygorii Strashko wrote:
>>>>> On 11/11/2015 12:13 PM, Jon Hunter wrote:
>>>>>>
>>>>>> On 10/11/15 18:07, Lars-Peter Clausen wrote:
>>>>>>> On 11/10/2015 05:47 PM, Grygorii Strashko wrote:
>>>>>>> [...]
>>>>>>>>> I was trying to simplify matters by placing the resume call in
>>>>>>>>> __setup_irq() as opposed to requested_threaded_irq(). However, the would
>>>>>>>>> mean the resume is inside the bus_lock and may be I should not assume
>>>>>>>>> that I can sleep here.
>>>>>>>>>
>>>>>>>>>> Can you folks please agree on something which is correct and complete?
>>>>>>>>>
>>>>>>>>> Soren I am happy to defer to your patch and drop this. My only comment
>>>>>>>>> would be what about the request_percpu_irq() path in your patch?
>>>>>>>>>
>>>>>>>>
>>>>>>>> I have the same comment here as I asked Soren:
>>>>>>>> 1) There are no restrictions to call irq set_irq_type() whenever,
>>>>>>>> as result HW can be accessed before request_x_irq()/__setup_irq().
>>>>>>>> And this is used quite widely now :(
>>>>>>>>
>>>>>>>
>>>>>>> Changing the configuration of a resource that is not owned seems to be
>>>>>>> fairly broken. In the worst case this will overwrite the configuration that
>>>>>>> was set by owner of the resource.
>>>>>>>
>>>>>>> Especially those that call irq_set_irq_type() directly before request_irq(),
>>>>>>> given that you supply the trigger type to request_irq() which will make sure
>>>>>>> that there are no conflicts and the configure.
>>>>>>>
>>>>>>> This is a bit like calling gpio_set_direction() before you call
>>>>>>> gpio_request(), which will also have PM issues.
>>>>>>
>>>>>> Yes, I agree that this does sound a bit odd, but ...
>>>>>>
>>>>>>>> For example, during OF boot:
>>>>>>>>
>>>>>>>> [a] irq_create_of_mapping()
>>>>>>>> - irq_create_fwspec_mapping()
>>>>>>>> - irq_set_irq_type()
>>>>>>
>>>>>> The above means that if someone calls of_irq_get() (or
>>>>>> platform_get_irq()), before request_irq(), then this will call
>>>>>> irq_create_of_mapping() and hence, call irq_set_irq_type. So should
>>>>>> irq_create_fwspec_mapping() be setting the type in the first place? I
>>>>>> can see it is convenient to do it here.
>>>>>
>>>>> In general there is another option - save OF-flags and pass them to
>>>>> __setup_irq() where they can be processed.
>>>>
>>>> Right, we could look at doing something like this.
>>>>
>>>>>>>> or
>>>>> [b]
>>>>>>>> irq_set_irq_type(irq, IRQ_TYPE_LEVEL_HIGH);
>>>>>>>> irq_set_chained_handler(irq, mx31ads_expio_irq_handler);
>>>>>
>>>>> option: add "flag" parameter to irq_set_chained_handler
>>>>>
>>>>>>>>
>>>>>>>> or
>>>>> [c]
>>>>>>>> irq_set_irq_type(alarm_irq, IRQ_TYPE_EDGE_BOTH);
>>>>>>>> err = devm_request_irq(&pdev->dev, alarm_irq, fan_alarm_irq_handler,
>>>>>>>> (there are ~200 occurrences of irq set_irq_type in Kernel)
>>>>>>>>
>>>>>>>> 2) if i'm not wrong, the same is valid for irq_set_irq_wake() and irq_set_affinity()
>>>>>>>>
>>>>>>>> I'm not saying all these code is correct, but that what's now in kernel :(
>>>>>>>> I've tried to test Soren's patch with omap-gpio and immediately hit case [a] :.(
>>>>>>>
>>>>>>> All functions for which are part of the public API and for which it is legal
>>>>>>> to call them without calling request_irq() (or similar) first will need to
>>>>>>> have pm_get()/pm_put().
>>>>>>
>>>>>> Right. May be we can look at the various entry points to the chip
>>>>>> operators to get a feel for which public APIs need to be handled.
>>>>>
>>>>>
>>>>> Seems yes. But we need to be very careful with this, some of functions could be
>>>>> called recursively (nested), like:
>>>>> [d]
>>>>> static int pcf857x_irq_set_wake(struct irq_data *data, unsigned int on)
>>>>> {
>>>>> ...
>>>>> error = irq_set_irq_wake(gpio->irq_parent, on);
>>>>>
>>>>>
>>>>> Personally, I have nothing against irq_pm_(get|put) :) and thought about similar things
>>>>> when tried to solve the same problem for omap-gpio driver.
>>>>> But :(, I have to fall back to irq_bus_lock/sync_unlock, because of [a,b,c] - all above
>>>>> APIs surrounded by chip_bus_lock/chip_bus_sync_unlock. ([d] - I've not hit it just because
>>>>> I was lucky).
>>>>
>>>> I had a quick peek at the omap-gpio driver and I see that internally you
>>>> are using the gpio ref-count to manage RPM and use the bus-lock hooks to
>>>> invoke RPM.
>>>>
>>>> This can definitely be complex when considering all the potential paths,
>>>> but I think that we need to a look at this from a chip-ops perspective
>>>> because only the chip knows if it is accessible or not. That said, what
>>>> we need to assess is:
>>>>
>>>> 1. Which chip-ops should ONLY be called after an IRQ has been allocated
>>>> (eg, enable/disable, mask/unmask, type, etc). These chip-ops should
>>>> not try to control the chip PM, but should possibly WARN if called
>>>> when the chip is not accessible.
>>>> 2. For chip-ops that may be called without allocating an IRQ (eg.
>>>> bus_lock, irq_suspend, etc), can these be called from an atomic
>>>> context? If they might be called from an atomic context then these
>>>> are the chip-ops which will cause problems as we cannot guarantee
>>>> that all IRQ chips can support irq-safe RPM.
>>>
>>> They can't. chip_bus_lock() can sleep, so anything that locks the bus can't
>>> be called from atomic context.
>>
>> Sorry, what can't? Yes I understand that we cannot call anything that
>> locks the bus from an atomic context.
>
> They can't be called from atomic context. The chip_bus_lock() function may
> sleep and you can't access the device without previously locking the bus.
> Since the device only needs to be powered up when it is accessed its safe to
> assume that the places where you need pm_get()/pm_put() are only called from
> non-atomic context.

Right, absolutely.

>>> One easy way out might be to always call pm_get/pm_but from
>>> bus_lock,/bus_unlock. This way the chip is guaranteed to be powered up when
>>> accessed happens. In addition pm_get is called when the IRQ is request and
>>> pm_put is called when the IRQ is release, this is to ensure the chip stays
>>> powered when it is actively monitoring the IRQ lines.
>>
>> Yes I had thought about that, but it is not quite that easy, because in
>> the case of request_irq() you don't want to pm_put() after the
>> bus_unlock(). However, the bus_lock/unlock() are good indicators of
>> different paths.
>
> You'd call pm_get() twice in request_irq() once from bus_lock() and once
> independently, that way you still have a reference even after the bus_unlock().

Yes that is a possibility. However, there are places such as
show_interrupts() (kernel/irq/proc.c) and irq_gc_suspend() that do not
call bus_lock/unlock() which would need to be handled for PM. May be
these should also call bus_lock() as well?

Jon

2015-11-12 14:37:22

by Lars-Peter Clausen

[permalink] [raw]
Subject: Re: [RFC PATCH 1/2] genirq: Add runtime resume/suspend support for IRQ chips

On 11/12/2015 03:02 PM, Jon Hunter wrote:
[...]
>>>> One easy way out might be to always call pm_get/pm_but from
>>>> bus_lock,/bus_unlock. This way the chip is guaranteed to be powered up when
>>>> accessed happens. In addition pm_get is called when the IRQ is request and
>>>> pm_put is called when the IRQ is release, this is to ensure the chip stays
>>>> powered when it is actively monitoring the IRQ lines.
>>>
>>> Yes I had thought about that, but it is not quite that easy, because in
>>> the case of request_irq() you don't want to pm_put() after the
>>> bus_unlock(). However, the bus_lock/unlock() are good indicators of
>>> different paths.
>>
>> You'd call pm_get() twice in request_irq() once from bus_lock() and once
>> independently, that way you still have a reference even after the bus_unlock().
>
> Yes that is a possibility. However, there are places such as
> show_interrupts() (kernel/irq/proc.c) and irq_gc_suspend() that do not
> call bus_lock/unlock() which would need to be handled for PM. May be
> these should also call bus_lock() as well?

show_interrupts() only accesses software state, not hardware state, or does it?

suspend/resume is a bit tricky. It's kind of driver specific if it needs to
actually access the hardware or whether the state is already shadowed in
software. Maybe we can make this an exception for now and let drivers handle
this on their own.

2015-11-12 15:38:33

by Jon Hunter

[permalink] [raw]
Subject: Re: [RFC PATCH 1/2] genirq: Add runtime resume/suspend support for IRQ chips


On 12/11/15 14:37, Lars-Peter Clausen wrote:
> On 11/12/2015 03:02 PM, Jon Hunter wrote:
> [...]
>>>>> One easy way out might be to always call pm_get/pm_but from
>>>>> bus_lock,/bus_unlock. This way the chip is guaranteed to be powered up when
>>>>> accessed happens. In addition pm_get is called when the IRQ is request and
>>>>> pm_put is called when the IRQ is release, this is to ensure the chip stays
>>>>> powered when it is actively monitoring the IRQ lines.
>>>>
>>>> Yes I had thought about that, but it is not quite that easy, because in
>>>> the case of request_irq() you don't want to pm_put() after the
>>>> bus_unlock(). However, the bus_lock/unlock() are good indicators of
>>>> different paths.
>>>
>>> You'd call pm_get() twice in request_irq() once from bus_lock() and once
>>> independently, that way you still have a reference even after the bus_unlock().
>>
>> Yes that is a possibility. However, there are places such as
>> show_interrupts() (kernel/irq/proc.c) and irq_gc_suspend() that do not
>> call bus_lock/unlock() which would need to be handled for PM. May be
>> these should also call bus_lock() as well?
>
> show_interrupts() only accesses software state, not hardware state, or does it?

Good point. Today there only appears to be one user:

arch/powerpc/sysdev/fsl_msi.c: .irq_print_chip = fsl_msi_print_chip,

This one is purely software. However, it would be easy to handle the
show_interrupts case if needed.

> suspend/resume is a bit tricky. It's kind of driver specific if it needs to
> actually access the hardware or whether the state is already shadowed in
> software. Maybe we can make this an exception for now and let drivers handle
> this on their own.

Yes I would agree with you on that.

Cheers
Jon

2015-11-12 17:36:58

by Soren Brinkmann

[permalink] [raw]
Subject: Re: [RFC PATCH 1/2] genirq: Add runtime resume/suspend support for IRQ chips

Hi Jon,

On Tue, 2015-11-10 at 03:58PM +0000, Jon Hunter wrote:
> Hi Thomas,
>
> On 10/11/15 15:26, Thomas Gleixner wrote:
[...]
> > Can you folks please agree on something which is correct and complete?
>
> Soren I am happy to defer to your patch and drop this. My only comment
> would be what about the request_percpu_irq() path in your patch?

Sorry, I couldn't spent any time on this the last days. And now it seems
you are already a lot closer to a solution than the initial patches were.
I'm happy to step back from this patch set, but I'd appreciate if you
kept me in the loop.

Thanks,
Sören

2015-11-12 23:20:46

by Kevin Hilman

[permalink] [raw]
Subject: Re: [RFC PATCH 1/2] genirq: Add runtime resume/suspend support for IRQ chips

Jon Hunter <[email protected]> writes:

> Some IRQ chips may be located in a power domain outside of the CPU subsystem
> and hence will require device specific runtime power management. Ideally,
> rather than adding more functions to the irq_chip_ops function table, using
> existing chip functions such as irq_startup/shutdown or
> irq_request/release_resources() would be best. However, these existing chip
> functions are called in the context of a spinlock which is not ideal for
> power management operations that may take some time to power up a domain.
>
> Two possible solutions are:
> 1. Move existing chip operators such as irq_request/release_resources()
> outside of the spinlock and use these helpers.
> 2. Add new chip operators that are called outside of any spinlocks while
> setting up and freeing an IRQ.

> Not knowing whether we can safely move irq_request/release_resources() to
> outside the spinlock (but hopefully this will solicit some feedback), add
> new chip operators for runtime resuming and suspending of an IRQ chip.

I'm not quite seeing how this would connect to the actual hardware
power domain (presumabaly managed by genpd) and any other devices in
that domain (presumably managed by runtime PM.)

If all the RPM devices in the domain go idle, it will be powered off
independently of the status of the irqchip because the irqchip isn't
using RPM.

Is there a longer-term plan to handle the irqchips as a "normal" device
and use RPM? IMO, that approach would be helpful even for irqchips that
share power domains with CPUs, since there are efforts working towards
using genpd/RPM to manage CPUs/clusters.

Kevin

2015-11-13 09:01:50

by Jon Hunter

[permalink] [raw]
Subject: Re: [RFC PATCH 1/2] genirq: Add runtime resume/suspend support for IRQ chips


On 12/11/15 23:20, Kevin Hilman wrote:
> Jon Hunter <[email protected]> writes:
>
>> Some IRQ chips may be located in a power domain outside of the CPU subsystem
>> and hence will require device specific runtime power management. Ideally,
>> rather than adding more functions to the irq_chip_ops function table, using
>> existing chip functions such as irq_startup/shutdown or
>> irq_request/release_resources() would be best. However, these existing chip
>> functions are called in the context of a spinlock which is not ideal for
>> power management operations that may take some time to power up a domain.
>>
>> Two possible solutions are:
>> 1. Move existing chip operators such as irq_request/release_resources()
>> outside of the spinlock and use these helpers.
>> 2. Add new chip operators that are called outside of any spinlocks while
>> setting up and freeing an IRQ.
>
>> Not knowing whether we can safely move irq_request/release_resources() to
>> outside the spinlock (but hopefully this will solicit some feedback), add
>> new chip operators for runtime resuming and suspending of an IRQ chip.
>
> I'm not quite seeing how this would connect to the actual hardware
> power domain (presumabaly managed by genpd) and any other devices in
> that domain (presumably managed by runtime PM.)

So this patch is just providing some hooks that an irqchip can use to
perform any PM related operations. If you look at the 2nd patch in the
series you will see for the GIC that these helpers are used to call
pm_runtime_get/put() which would handle the power-domain.

> If all the RPM devices in the domain go idle, it will be powered off
> independently of the status of the irqchip because the irqchip isn't
> using RPM.

That's dependent on how the irqchip uses these helpers. If these helpers
invoke RPM then that will not be the case.

> Is there a longer-term plan to handle the irqchips as a "normal" device
> and use RPM? IMO, that approach would be helpful even for irqchips that
> share power domains with CPUs, since there are efforts working towards
> using genpd/RPM to manage CPUs/clusters.

That would ideal. However, the majority of irqchips today
create/register them with IRQCHIP_DECLARE() and not as "normal" devices.
Therefore, I was reluctant to add "struct device" to the irqchip
structure. However, if this is what you would prefer and Thomas is ok
with it, then that would be fine with me.

Cheers
Jon

2015-11-13 18:07:41

by Grygorii Strashko

[permalink] [raw]
Subject: Re: [RFC PATCH 1/2] genirq: Add runtime resume/suspend support for IRQ chips

On 11/12/2015 03:34 PM, Lars-Peter Clausen wrote:
> On 11/12/2015 02:29 PM, Grygorii Strashko wrote:
>> On 11/12/2015 03:20 PM, Lars-Peter Clausen wrote:
>>> On 11/12/2015 11:59 AM, Jon Hunter wrote:
>>>>
>>>> On 11/11/15 15:41, Grygorii Strashko wrote:
>>>>> On 11/11/2015 12:13 PM, Jon Hunter wrote:
>>>>>>
>>>>>> On 10/11/15 18:07, Lars-Peter Clausen wrote:
>>>>>>> On 11/10/2015 05:47 PM, Grygorii Strashko wrote:
>>>>>>> [...]
>>>>>>>>> I was trying to simplify matters by placing the resume call in
>>>>>>>>> __setup_irq() as opposed to requested_threaded_irq(). However, the would
>>>>>>>>> mean the resume is inside the bus_lock and may be I should not assume
>>>>>>>>> that I can sleep here.
>>>>>>>>>
>>>>>>>>>> Can you folks please agree on something which is correct and complete?
>>>>>>>>>
>>>>>>>>> Soren I am happy to defer to your patch and drop this. My only comment
>>>>>>>>> would be what about the request_percpu_irq() path in your patch?
>>>>>>>>>
>>>>>>>>
>>>>>>>> I have the same comment here as I asked Soren:
>>>>>>>> 1) There are no restrictions to call irq set_irq_type() whenever,
>>>>>>>> as result HW can be accessed before request_x_irq()/__setup_irq().
>>>>>>>> And this is used quite widely now :(
>>>>>>>>
>>>>>>>
>>>>>>> Changing the configuration of a resource that is not owned seems to be
>>>>>>> fairly broken. In the worst case this will overwrite the configuration that
>>>>>>> was set by owner of the resource.
>>>>>>>
>>>>>>> Especially those that call irq_set_irq_type() directly before request_irq(),
>>>>>>> given that you supply the trigger type to request_irq() which will make sure
>>>>>>> that there are no conflicts and the configure.
>>>>>>>
>>>>>>> This is a bit like calling gpio_set_direction() before you call
>>>>>>> gpio_request(), which will also have PM issues.
>>>>>>
>>>>>> Yes, I agree that this does sound a bit odd, but ...
>>>>>>
>>>>>>>> For example, during OF boot:
>>>>>>>>
>>>>>>>> [a] irq_create_of_mapping()
>>>>>>>> - irq_create_fwspec_mapping()
>>>>>>>> - irq_set_irq_type()
>>>>>>
>>>>>> The above means that if someone calls of_irq_get() (or
>>>>>> platform_get_irq()), before request_irq(), then this will call
>>>>>> irq_create_of_mapping() and hence, call irq_set_irq_type. So should
>>>>>> irq_create_fwspec_mapping() be setting the type in the first place? I
>>>>>> can see it is convenient to do it here.
>>>>>
>>>>> In general there is another option - save OF-flags and pass them to
>>>>> __setup_irq() where they can be processed.
>>>>
>>>> Right, we could look at doing something like this.
>>>>
>>>>>>>> or
>>>>> [b]
>>>>>>>> irq_set_irq_type(irq, IRQ_TYPE_LEVEL_HIGH);
>>>>>>>> irq_set_chained_handler(irq, mx31ads_expio_irq_handler);
>>>>>
>>>>> option: add "flag" parameter to irq_set_chained_handler
>>>>>
>>>>>>>>
>>>>>>>> or
>>>>> [c]
>>>>>>>> irq_set_irq_type(alarm_irq, IRQ_TYPE_EDGE_BOTH);
>>>>>>>> err = devm_request_irq(&pdev->dev, alarm_irq, fan_alarm_irq_handler,
>>>>>>>> (there are ~200 occurrences of irq set_irq_type in Kernel)
>>>>>>>>
>>>>>>>> 2) if i'm not wrong, the same is valid for irq_set_irq_wake() and irq_set_affinity()
>>>>>>>>
>>>>>>>> I'm not saying all these code is correct, but that what's now in kernel :(
>>>>>>>> I've tried to test Soren's patch with omap-gpio and immediately hit case [a] :.(
>>>>>>>
>>>>>>> All functions for which are part of the public API and for which it is legal
>>>>>>> to call them without calling request_irq() (or similar) first will need to
>>>>>>> have pm_get()/pm_put().
>>>>>>
>>>>>> Right. May be we can look at the various entry points to the chip
>>>>>> operators to get a feel for which public APIs need to be handled.
>>>>>
>>>>>
>>>>> Seems yes. But we need to be very careful with this, some of functions could be
>>>>> called recursively (nested), like:
>>>>> [d]
>>>>> static int pcf857x_irq_set_wake(struct irq_data *data, unsigned int on)
>>>>> {
>>>>> ...
>>>>> error = irq_set_irq_wake(gpio->irq_parent, on);
>>>>>
>>>>>
>>>>> Personally, I have nothing against irq_pm_(get|put) :) and thought about similar things
>>>>> when tried to solve the same problem for omap-gpio driver.
>>>>> But :(, I have to fall back to irq_bus_lock/sync_unlock, because of [a,b,c] - all above
>>>>> APIs surrounded by chip_bus_lock/chip_bus_sync_unlock. ([d] - I've not hit it just because
>>>>> I was lucky).
>>>>
>>>> I had a quick peek at the omap-gpio driver and I see that internally you
>>>> are using the gpio ref-count to manage RPM and use the bus-lock hooks to
>>>> invoke RPM.
>>>>
>>>> This can definitely be complex when considering all the potential paths,
>>>> but I think that we need to a look at this from a chip-ops perspective
>>>> because only the chip knows if it is accessible or not. That said, what
>>>> we need to assess is:
>>>>
>>>> 1. Which chip-ops should ONLY be called after an IRQ has been allocated
>>>> (eg, enable/disable, mask/unmask, type, etc). These chip-ops should
>>>> not try to control the chip PM, but should possibly WARN if called
>>>> when the chip is not accessible.
>>>> 2. For chip-ops that may be called without allocating an IRQ (eg.
>>>> bus_lock, irq_suspend, etc), can these be called from an atomic
>>>> context? If they might be called from an atomic context then these
>>>> are the chip-ops which will cause problems as we cannot guarantee
>>>> that all IRQ chips can support irq-safe RPM.
>>>
>>> They can't. chip_bus_lock() can sleep, so anything that locks the bus can't
>>> be called from atomic context.
>>>
>>> One easy way out might be to always call pm_get/pm_but from
>>> bus_lock,/bus_unlock. This way the chip is guaranteed to be powered up when
>>> accessed happens. In addition pm_get is called when the IRQ is request and
>>> pm_put is called when the IRQ is release, this is to ensure the chip stays
>>> powered when it is actively monitoring the IRQ lines.
>>>
>>
>> In general, this is simplest possible solution. More over, if irqchip will have
>> dev field PM runtime could be used directly instead of get/put.
>>
>> but.. :( How about problem [d]?
>>
>
> Can you explain why you thing this is a problem? I don't see the issue.
>

oh, Sorry missed your e-mail.

static int pcf857x_irq_set_wake(struct irq_data *data, unsigned int on)
{
...
error = irq_set_irq_wake(gpio->irq_parent, on);
|-irq_get_desc_buslock
|- chip_bus_lock
|- irq_pm_get
|- agic/zynq - pm_runtime_get_sync()
|- might_sleep_if(!(rpmflags & RPM_ASYNC) && !dev->power.irq_safe);
- same for put

As result, it will be mandatory to define irq_safe = true, but in this case it will be
impossible to use clk_prepare_enable()/unprepare() in PM runtime callbacks, for example.


Also, discussed approach will work for all only if it will be guaranteed that
irq_pm_get/put() will be called only once, like PM runtime API. Otherwise
it will be very hard to reuse them for chips which do not use PM runtime
for PM management - those chips will need to implement own counters to
avoid re-enabling/disabling of an already active/inactive devices.

So, decision has to be made - will irqchip PM management be PM runtime based only or not?
- if it will be PM runtime based only (agic/zynq):
- there are should be device always
- PM runtime API can be called where needed directly -> no new callbacks.

- if not:
- some additional sync/protection will need to be added to irqchip

In addition, we seems missed irq_set_chained_handler*() :( - It do not call
__setup_irq().

--
regards,
-grygorii

2015-11-13 20:02:41

by Thomas Gleixner

[permalink] [raw]
Subject: Re: [RFC PATCH 1/2] genirq: Add runtime resume/suspend support for IRQ chips

On Fri, 13 Nov 2015, Jon Hunter wrote:
> On 12/11/15 23:20, Kevin Hilman wrote:
> > If all the RPM devices in the domain go idle, it will be powered off
> > independently of the status of the irqchip because the irqchip isn't
> > using RPM.
>
> That's dependent on how the irqchip uses these helpers. If these helpers
> invoke RPM then that will not be the case.

You need a very proper description of how that domain is working. If
all devices are idle, it's not necessary correct to power down the
irqchip as is might serve other devices as well.

OTOH, if it can be powered down then all idle devices need to release
the irq they requested because request_irq() would hold a ref on the
power domain.

I have no idea how you can describe that proper.

> > Is there a longer-term plan to handle the irqchips as a "normal" device
> > and use RPM? IMO, that approach would be helpful even for irqchips that
> > share power domains with CPUs, since there are efforts working towards
> > using genpd/RPM to manage CPUs/clusters.
>
> That would ideal. However, the majority of irqchips today
> create/register them with IRQCHIP_DECLARE() and not as "normal" devices.
> Therefore, I was reluctant to add "struct device" to the irqchip
> structure. However, if this is what you would prefer and Thomas is ok
> with it, then that would be fine with me.

I have no objections against that, but how is the 'struct device'
going to be initialized?

Thanks,

tglx

2015-11-16 09:46:51

by Jon Hunter

[permalink] [raw]
Subject: Re: [RFC PATCH 1/2] genirq: Add runtime resume/suspend support for IRQ chips


On 13/11/15 20:01, Thomas Gleixner wrote:
> On Fri, 13 Nov 2015, Jon Hunter wrote:
>> On 12/11/15 23:20, Kevin Hilman wrote:
>>> If all the RPM devices in the domain go idle, it will be powered off
>>> independently of the status of the irqchip because the irqchip isn't
>>> using RPM.
>>
>> That's dependent on how the irqchip uses these helpers. If these helpers
>> invoke RPM then that will not be the case.
>
> You need a very proper description of how that domain is working. If
> all devices are idle, it's not necessary correct to power down the
> irqchip as is might serve other devices as well.

Agreed. The irqchip should only be powered down if there are no
interrupts in-use/requested. Runtime-pm will keep a reference count for
all requested IRQs.

> OTOH, if it can be powered down then all idle devices need to release
> the irq they requested because request_irq() would hold a ref on the
> power domain.

Yes.

> I have no idea how you can describe that proper.

Do you mean properly describe the interaction between runtime-pm and the
irqchip?

>>> Is there a longer-term plan to handle the irqchips as a "normal" device
>>> and use RPM? IMO, that approach would be helpful even for irqchips that
>>> share power domains with CPUs, since there are efforts working towards
>>> using genpd/RPM to manage CPUs/clusters.
>>
>> That would ideal. However, the majority of irqchips today
>> create/register them with IRQCHIP_DECLARE() and not as "normal" devices.
>> Therefore, I was reluctant to add "struct device" to the irqchip
>> structure. However, if this is what you would prefer and Thomas is ok
>> with it, then that would be fine with me.
>
> I have no objections against that, but how is the 'struct device'
> going to be initialized?

It would be initialised by the irqchip driver. However, it would be
optional. The genirq core could simply check to see if the chip->dev
member is initialised and if so enable runtime-pm.

Cheers
Jon

2015-11-16 09:49:44

by Geert Uytterhoeven

[permalink] [raw]
Subject: Re: [RFC PATCH 1/2] genirq: Add runtime resume/suspend support for IRQ chips

On Mon, Nov 16, 2015 at 10:46 AM, Jon Hunter <[email protected]> wrote:
> On 13/11/15 20:01, Thomas Gleixner wrote:
>> On Fri, 13 Nov 2015, Jon Hunter wrote:
>>> On 12/11/15 23:20, Kevin Hilman wrote:
>>>> If all the RPM devices in the domain go idle, it will be powered off
>>>> independently of the status of the irqchip because the irqchip isn't
>>>> using RPM.
>>>
>>> That's dependent on how the irqchip uses these helpers. If these helpers
>>> invoke RPM then that will not be the case.
>>
>> You need a very proper description of how that domain is working. If
>> all devices are idle, it's not necessary correct to power down the
>> irqchip as is might serve other devices as well.
>
> Agreed. The irqchip should only be powered down if there are no
> interrupts in-use/requested. Runtime-pm will keep a reference count for
> all requested IRQs.

That means the irqchip won't be powered down automatically when the
last user is powered down, unless all users release their irqs during
suspend.

Handling it automatically needs more bookkeeping than a simple reference
count.

Gr{oetje,eeting}s,

Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- [email protected]

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
-- Linus Torvalds

2015-11-16 10:34:39

by Jon Hunter

[permalink] [raw]
Subject: Re: [RFC PATCH 1/2] genirq: Add runtime resume/suspend support for IRQ chips


On 16/11/15 09:49, Geert Uytterhoeven wrote:
> On Mon, Nov 16, 2015 at 10:46 AM, Jon Hunter <[email protected]> wrote:
>> On 13/11/15 20:01, Thomas Gleixner wrote:
>>> On Fri, 13 Nov 2015, Jon Hunter wrote:
>>>> On 12/11/15 23:20, Kevin Hilman wrote:
>>>>> If all the RPM devices in the domain go idle, it will be powered off
>>>>> independently of the status of the irqchip because the irqchip isn't
>>>>> using RPM.
>>>>
>>>> That's dependent on how the irqchip uses these helpers. If these helpers
>>>> invoke RPM then that will not be the case.
>>>
>>> You need a very proper description of how that domain is working. If
>>> all devices are idle, it's not necessary correct to power down the
>>> irqchip as is might serve other devices as well.
>>
>> Agreed. The irqchip should only be powered down if there are no
>> interrupts in-use/requested. Runtime-pm will keep a reference count for
>> all requested IRQs.
>
> That means the irqchip won't be powered down automatically when the
> last user is powered down, unless all users release their irqs during
> suspend.

Right.

> Handling it automatically needs more bookkeeping than a simple reference
> count.

So what would you suggest? Adding a pm_runtime_register_irq() API that
would register an IRQ with the device that you want RPM to handle? Not
sure if there is a better/easier way to handle this.

Cheers
Jon


2015-11-16 10:48:46

by Geert Uytterhoeven

[permalink] [raw]
Subject: Re: [RFC PATCH 1/2] genirq: Add runtime resume/suspend support for IRQ chips

Hi Jon,

On Mon, Nov 16, 2015 at 11:34 AM, Jon Hunter <[email protected]> wrote:
> On 16/11/15 09:49, Geert Uytterhoeven wrote:
>> On Mon, Nov 16, 2015 at 10:46 AM, Jon Hunter <[email protected]> wrote:
>>> On 13/11/15 20:01, Thomas Gleixner wrote:
>>>> On Fri, 13 Nov 2015, Jon Hunter wrote:
>>>>> On 12/11/15 23:20, Kevin Hilman wrote:
>>>>>> If all the RPM devices in the domain go idle, it will be powered off
>>>>>> independently of the status of the irqchip because the irqchip isn't
>>>>>> using RPM.
>>>>>
>>>>> That's dependent on how the irqchip uses these helpers. If these helpers
>>>>> invoke RPM then that will not be the case.
>>>>
>>>> You need a very proper description of how that domain is working. If
>>>> all devices are idle, it's not necessary correct to power down the
>>>> irqchip as is might serve other devices as well.
>>>
>>> Agreed. The irqchip should only be powered down if there are no
>>> interrupts in-use/requested. Runtime-pm will keep a reference count for
>>> all requested IRQs.
>>
>> That means the irqchip won't be powered down automatically when the
>> last user is powered down, unless all users release their irqs during
>> suspend.
>
> Right.
>
>> Handling it automatically needs more bookkeeping than a simple reference
>> count.
>
> So what would you suggest? Adding a pm_runtime_register_irq() API that
> would register an IRQ with the device that you want RPM to handle? Not
> sure if there is a better/easier way to handle this.

The irqchip needs to keep track how many times request_irq() has been
called, cfr. your suggestion above.

On the other side, the system needs to keep track how many times request_irq()
has been called for each irqchip, so it can subtract those numbers from the
irqchip's counters during suspend of the device, and re-add them during resume.
So we need at least a "struct device *" parameter for request_irq().
devm_request_irq() already has that, but not all drivers use that.

However, I think this should be looked at into the context of "[RFD]
Functional dependencies between devices".
https://lwn.net/Articles/662205/
https://lkml.org/lkml/2015/10/27/388

There can be other dependencies than interrupts between devices.
All functions using dependencies need a "struct device *" parameter to
record information.

Gr{oetje,eeting}s,

Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- [email protected]

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
-- Linus Torvalds

2015-11-17 11:57:22

by Jon Hunter

[permalink] [raw]
Subject: Re: [RFC PATCH 1/2] genirq: Add runtime resume/suspend support for IRQ chips

Hi Geert,

On 16/11/15 10:48, Geert Uytterhoeven wrote:
> On Mon, Nov 16, 2015 at 11:34 AM, Jon Hunter <[email protected]> wrote:

[snip]

>>> Handling it automatically needs more bookkeeping than a simple reference
>>> count.
>>
>> So what would you suggest? Adding a pm_runtime_register_irq() API that
>> would register an IRQ with the device that you want RPM to handle? Not
>> sure if there is a better/easier way to handle this.
>
> The irqchip needs to keep track how many times request_irq() has been
> called, cfr. your suggestion above.
>
> On the other side, the system needs to keep track how many times request_irq()
> has been called for each irqchip, so it can subtract those numbers from the
> irqchip's counters during suspend of the device, and re-add them during resume.
> So we need at least a "struct device *" parameter for request_irq().
> devm_request_irq() already has that, but not all drivers use that.

Yes that would make sense. However, I am wondering if the
syscore suspend/resume operators could be used here to do something
like ...

pm_runtime_disable(dev);
if (!pm_runtime_status_suspended(dev))
chip->irq_runtime_suspend(data);

> However, I think this should be looked at into the context of "[RFD]
> Functional dependencies between devices".
> https://lwn.net/Articles/662205/
> https://lkml.org/lkml/2015/10/27/388
>
> There can be other dependencies than interrupts between devices.
> All functions using dependencies need a "struct device *" parameter to
> record information.

Yes I like the sound of that. That would be ideal. However, I am
guessing that that is a way off at the moment ...

Cheers
Jon