2015-11-11 08:26:42

by Thomas Petazzoni

[permalink] [raw]
Subject: Re: [PATCH 0/5] Fix regression introduced by set_irq_flags() removal

Thomas,

On Tue, 20 Oct 2015 21:23:29 +0200 (CEST), Thomas Gleixner wrote:

> > However, relying on IRQ_NOAUTOEN being cleared doesn't seem like the
> > right long term solution, which is why I implemented what I believe is
> > a (hopefully) better long term solution.
>
> Agreed.
>
> I'll go over the proposed solution tomorrow afternoon (I'm
> traveling/conferencing...)

Have you had the time to consider the proposed solution? For 4.3 we
implemented the quick work-around that consisted in clearing
IRQ_NOAUTOEN, but it's probably not a very good long-term solution.

Don't hesitate to let me know if you'd like to see some modifications
to the proposed approach, or if you have a totally different approach
in mind.

Thanks,

Thomas
--
Thomas Petazzoni, CTO, Free Electrons
Embedded Linux, Kernel and Android engineering
http://free-electrons.com


2015-11-13 20:12:07

by Thomas Gleixner

[permalink] [raw]
Subject: Re: [PATCH 0/5] Fix regression introduced by set_irq_flags() removal

Thomas,

On Wed, 11 Nov 2015, Thomas Petazzoni wrote:
> Have you had the time to consider the proposed solution? For 4.3 we
> implemented the quick work-around that consisted in clearing
> IRQ_NOAUTOEN, but it's probably not a very good long-term solution.
>
> Don't hesitate to let me know if you'd like to see some modifications
> to the proposed approach, or if you have a totally different approach
> in mind.

I'm not sure if we really need all that muck if we can just rely on
that flag. I don't see the extra value, but you might have something
in mind which does not jump into my face right now.

Thanks,

tglx

2015-12-04 11:03:33

by Thomas Petazzoni

[permalink] [raw]
Subject: Re: [PATCH 0/5] Fix regression introduced by set_irq_flags() removal

Thomas,

On Fri, 13 Nov 2015 15:11:16 -0500 (EST), Thomas Gleixner wrote:

> On Wed, 11 Nov 2015, Thomas Petazzoni wrote:
> > Have you had the time to consider the proposed solution? For 4.3 we
> > implemented the quick work-around that consisted in clearing
> > IRQ_NOAUTOEN, but it's probably not a very good long-term solution.
> >
> > Don't hesitate to let me know if you'd like to see some modifications
> > to the proposed approach, or if you have a totally different approach
> > in mind.
>
> I'm not sure if we really need all that muck if we can just rely on
> that flag. I don't see the extra value, but you might have something
> in mind which does not jump into my face right now.

Well, the problem is that IRQ_NOAUTOEN is a global flag, which is OK
for global interrupts, but not good for per-CPU interrupts, since you
don't have the information on a per-CPU basis of which interrupt was
enabled before suspend, and therefore should be re-enabled after resume.

Until now, we don't have the problem since the only per-CPU interrupt
we were using was the local timer interrupt, and the local timers on
secondary CPUs are switched off during suspend and re-enabled during
resume. So re-enabling the interrupt on the boot CPU on resume is
sufficient.

However, our network driver recently switched to using per-CPU
interrupts as well, and in this case, it is really important to be able
to re-enable the per-CPU interrupts and the appropriate CPUs at resume
time. Since our HW registers are made so that it is not possible to
read out at suspend time which interrupts are enabled, we have to ask
the Linux kernel at resume time which interrupts should be re-enabled
at the HW level. Which is what my more complicated series was doing.

Do you have other suggestions to allow us to know which per-CPU
interrupts should be re-enabled on the different CPUs at resume time ?

Thanks,

Thomas
--
Thomas Petazzoni, CTO, Free Electrons
Embedded Linux, Kernel and Android engineering
http://free-electrons.com

2015-12-05 17:25:23

by Thomas Gleixner

[permalink] [raw]
Subject: Re: [PATCH 0/5] Fix regression introduced by set_irq_flags() removal

Thomas,

On Fri, 4 Dec 2015, Thomas Petazzoni wrote:
> Well, the problem is that IRQ_NOAUTOEN is a global flag, which is OK
> for global interrupts, but not good for per-CPU interrupts, since you
> don't have the information on a per-CPU basis of which interrupt was
> enabled before suspend, and therefore should be re-enabled after resume.
>
> Until now, we don't have the problem since the only per-CPU interrupt
> we were using was the local timer interrupt, and the local timers on
> secondary CPUs are switched off during suspend and re-enabled during
> resume. So re-enabling the interrupt on the boot CPU on resume is
> sufficient.
>
> However, our network driver recently switched to using per-CPU
> interrupts as well, and in this case, it is really important to be able
> to re-enable the per-CPU interrupts and the appropriate CPUs at resume
> time. Since our HW registers are made so that it is not possible to
> read out at suspend time which interrupts are enabled, we have to ask
> the Linux kernel at resume time which interrupts should be re-enabled
> at the HW level. Which is what my more complicated series was doing.
>
> Do you have other suggestions to allow us to know which per-CPU
> interrupts should be re-enabled on the different CPUs at resume time ?

Ok. That makes sense. So I'm going to pick up the core change.

Thanks,

tglx

2015-12-06 09:29:10

by Thomas Gleixner

[permalink] [raw]
Subject: Re: [PATCH 0/5] Fix regression introduced by set_irq_flags() removal

Thomas,

On Sat, 5 Dec 2015, Thomas Gleixner wrote:
> On Fri, 4 Dec 2015, Thomas Petazzoni wrote:
> > Well, the problem is that IRQ_NOAUTOEN is a global flag, which is OK
> > for global interrupts, but not good for per-CPU interrupts, since you
> > don't have the information on a per-CPU basis of which interrupt was
> > enabled before suspend, and therefore should be re-enabled after resume.
> >
> > Until now, we don't have the problem since the only per-CPU interrupt
> > we were using was the local timer interrupt, and the local timers on
> > secondary CPUs are switched off during suspend and re-enabled during
> > resume. So re-enabling the interrupt on the boot CPU on resume is
> > sufficient.
> >
> > However, our network driver recently switched to using per-CPU
> > interrupts as well, and in this case, it is really important to be able
> > to re-enable the per-CPU interrupts and the appropriate CPUs at resume
> > time. Since our HW registers are made so that it is not possible to
> > read out at suspend time which interrupts are enabled, we have to ask
> > the Linux kernel at resume time which interrupts should be re-enabled
> > at the HW level. Which is what my more complicated series was doing.
> >
> > Do you have other suggestions to allow us to know which per-CPU
> > interrupts should be re-enabled on the different CPUs at resume time ?
>
> Ok. That makes sense. So I'm going to pick up the core change.

Second thoughts. That network driver example does not make sense.

You have a suspend/resume mechanism and a cpu hotplug machinery in
that driver, right? So that should be responsible for
disabling/enabling the per cpu interrupts. I don't think it's the
proper way to do that in the irq chip driver at some random point
during resume as you'd reenable interrupts on cpus which are not
online yet.

Thanks,

tglx



2015-12-08 08:58:39

by Thomas Petazzoni

[permalink] [raw]
Subject: Re: [PATCH 0/5] Fix regression introduced by set_irq_flags() removal

Hello Thomas,

On Sun, 6 Dec 2015 10:28:15 +0100 (CET), Thomas Gleixner wrote:

> Second thoughts. That network driver example does not make sense.
>
> You have a suspend/resume mechanism and a cpu hotplug machinery in
> that driver, right? So that should be responsible for
> disabling/enabling the per cpu interrupts. I don't think it's the
> proper way to do that in the irq chip driver at some random point
> during resume as you'd reenable interrupts on cpus which are not
> online yet.

The irqchip driver would re-enable the per-CPU interrupts in a CPU
notifier, so only when the secondary CPUs come online again after
resume.

When a device driver uses a normal (non per-CPU) interrupt, then it
doesn't have to take care of disabling the interrupt on suspend and
re-enabling the interrupt on resume at the interrupt controller level.
This is all transparently handled by the irqchip driver.

Why should the handling of per-CPU interrupts be different and require
explicit handling from each device driver rather than being
transparently handled by the irqchip driver ?

Best regards,

Thomas
--
Thomas Petazzoni, CTO, Free Electrons
Embedded Linux, Kernel and Android engineering
http://free-electrons.com

2015-12-08 10:55:04

by Thomas Gleixner

[permalink] [raw]
Subject: Re: [PATCH 0/5] Fix regression introduced by set_irq_flags() removal

On Tue, 8 Dec 2015, Thomas Petazzoni wrote:
> When a device driver uses a normal (non per-CPU) interrupt, then it
> doesn't have to take care of disabling the interrupt on suspend and
> re-enabling the interrupt on resume at the interrupt controller level.
> This is all transparently handled by the irqchip driver.
>
> Why should the handling of per-CPU interrupts be different and require
> explicit handling from each device driver rather than being
> transparently handled by the irqchip driver ?

Fair enough. Did not think about the boot cpu part.

Thanks,

tglx