2020-12-05 01:47:59

by Jerry Snitselaar

[permalink] [raw]
Subject: [PATCH v3 0/4] tpm_tis: Detect interrupt storms

This patchset is an attempt to try and catch tpm_tis devices that have
interrupt storm issues, disable the interrupt, and use polling. In
2016 the tpm_tis interrupt code was accidently disabled, and polling
was just being used. When we initially tried to enable interrupts
again there were some reports of systems being hit with interrupt
storms. It turned out that the ThinkPad T490s had misconfigured a gpio
pin being used for the interrupt. The problem is more widespread
though, with interrupt storms also being seen on other platforms and
different TPM vendors. With the L490 the system hangs at tpm_tis
initialization even with the detection code, so change the earlier
detection code that used dmi to look for the T490s to instead look for
the L490 and disable interrupts.

Since kstat_irqs needs to be exported to allow building of tpm_tis
as a module, I've included a patch to change the i915_pmu code to
use kstat_irqs where before it was using its own version. If this
isn't desired it can be dropped.

I've been testing this on top of James' proposed patchset which
re-enables interrupts for tpm_tis. With the patchsets applied
it detects the problem on the T490s and on the Ice Lake development
system where I found the issue. I have Lenovo verifying that the
dmi detection code will now detect the L490 and avoid the hang
it experiences. I'm also working on getting access to an L490
to see if I can figure out what the underlying issue is.



Changes from v2:
- Export kstat_irqs to allow building tpm_tis as a module.
- Change i915_pmu.c to use kstat_irqs instead of it's own
version count_interrupts.
- Change include from linux/kernel_stat.h to linux/irq.h.
- Change dmi checking code to now look for L490 instead of
T490s.

Changes from v1:
- drop tpm_tis specific workqueue and use just system_w.

Jerry Snitselaar (4):
irq: export kstat_irqs
drm/i915/pmu: Use kstat_irqs to get interrupt count
tpm_tis: Disable interrupts if interrupt storm detected
tpm_tis: Disable Interrupts on the ThinkPad L490


Cc: Thomas Gleixner <[email protected]>
Cc: Jani Nikula <[email protected]>
Cc: Rodrigo Vivi <[email protected]>
Cc: David Airlie <[email protected]>
Cc: Daniel Vetter <[email protected]>
Cc: [email protected]
Cc: [email protected]
Cc: Jarkko Sakkinen <[email protected]>
Cc: Jason Gunthorpe <[email protected]>
Cc: Peter Huewe <[email protected]>
Cc: James Bottomley <[email protected]>
Cc: Matthew Garrett <[email protected]>
Cc: Hans de Goede <[email protected]>
Cc: [email protected]

drivers/char/tpm/tpm_tis.c | 4 ++--
drivers/char/tpm/tpm_tis_core.c | 27 +++++++++++++++++++++++++++
drivers/char/tpm/tpm_tis_core.h | 2 ++
drivers/gpu/drm/i915/i915_pmu.c | 18 +-----------------
include/linux/irqdesc.h | 1 +
kernel/irq/irqdesc.c | 1 +
6 files changed, 34 insertions(+), 19 deletions(-)

--
2.27.0


2020-12-05 01:48:14

by Jerry Snitselaar

[permalink] [raw]
Subject: [PATCH v3 3/4] tpm_tis: Disable interrupts if interrupt storm detected

When enabling the interrupt code for the tpm_tis driver we have
noticed some systems have a bios issue causing an interrupt storm to
occur. The issue isn't limited to a single tpm or system manufacturer
so keeping a denylist of systems with the issue isn't optimal. Instead
try to detect the problem occurring, disable interrupts, and revert to
polling when it happens.

Cc: Jarkko Sakkinen <[email protected]>
Cc: Jason Gunthorpe <[email protected]>
Cc: Peter Huewe <[email protected]>
Cc: James Bottomley <[email protected]>
Cc: Matthew Garrett <[email protected]>
Cc: Hans de Goede <[email protected]>
Signed-off-by: Jerry Snitselaar <[email protected]>
---
v3: - Change include from linux/kernel_stat.h to linux/irq.h
v2: - drop tpm_tis specific workqueue and use just system_w

drivers/char/tpm/tpm_tis_core.c | 27 +++++++++++++++++++++++++++
drivers/char/tpm/tpm_tis_core.h | 2 ++
2 files changed, 29 insertions(+)

diff --git a/drivers/char/tpm/tpm_tis_core.c b/drivers/char/tpm/tpm_tis_core.c
index 92c51c6cfd1b..d817ff5664d1 100644
--- a/drivers/char/tpm/tpm_tis_core.c
+++ b/drivers/char/tpm/tpm_tis_core.c
@@ -24,6 +24,8 @@
#include <linux/wait.h>
#include <linux/acpi.h>
#include <linux/freezer.h>
+#include <linux/workqueue.h>
+#include <linux/irq.h>
#include "tpm.h"
#include "tpm_tis_core.h"

@@ -715,9 +717,23 @@ static irqreturn_t tis_int_handler(int dummy, void *dev_id)
{
struct tpm_chip *chip = dev_id;
struct tpm_tis_data *priv = dev_get_drvdata(&chip->dev);
+ static bool check_storm = true;
+ static unsigned int check_start;
u32 interrupt;
int i, rc;

+ if (unlikely(check_storm)) {
+ if (!check_start) {
+ check_start = jiffies_to_msecs(jiffies);
+ } else if ((kstat_irqs(priv->irq) > 1000) &&
+ (jiffies_to_msecs(jiffies) - check_start < 500)) {
+ check_storm = false;
+ schedule_work(&priv->storm_work);
+ } else if (jiffies_to_msecs(jiffies) - check_start >= 500) {
+ check_storm = false;
+ }
+ }
+
rc = tpm_tis_read32(priv, TPM_INT_STATUS(priv->locality), &interrupt);
if (rc < 0)
return IRQ_NONE;
@@ -943,6 +959,14 @@ static const struct tpm_class_ops tpm_tis = {
.clk_enable = tpm_tis_clkrun_enable,
};

+static void tpm_tis_storm_work(struct work_struct *work)
+{
+ struct tpm_tis_data *priv = container_of(work, struct tpm_tis_data, storm_work);
+
+ disable_interrupts(priv->chip);
+ dev_warn(&priv->chip->dev, "Interrupt storm detected, using polling.\n");
+}
+
int tpm_tis_core_init(struct device *dev, struct tpm_tis_data *priv, int irq,
const struct tpm_tis_phy_ops *phy_ops,
acpi_handle acpi_dev_handle)
@@ -959,6 +983,9 @@ int tpm_tis_core_init(struct device *dev, struct tpm_tis_data *priv, int irq,
if (IS_ERR(chip))
return PTR_ERR(chip);

+ priv->chip = chip;
+ INIT_WORK(&priv->storm_work, tpm_tis_storm_work);
+
#ifdef CONFIG_ACPI
chip->acpi_dev_handle = acpi_dev_handle;
#endif
diff --git a/drivers/char/tpm/tpm_tis_core.h b/drivers/char/tpm/tpm_tis_core.h
index 9b2d32a59f67..973297ee2e16 100644
--- a/drivers/char/tpm/tpm_tis_core.h
+++ b/drivers/char/tpm/tpm_tis_core.h
@@ -95,6 +95,8 @@ struct tpm_tis_data {
u16 clkrun_enabled;
wait_queue_head_t int_queue;
wait_queue_head_t read_queue;
+ struct work_struct storm_work;
+ struct tpm_chip *chip;
const struct tpm_tis_phy_ops *phy_ops;
unsigned short rng_quality;
};
--
2.27.0

2020-12-05 01:48:35

by Jerry Snitselaar

[permalink] [raw]
Subject: [PATCH v3 4/4] tpm_tis: Disable Interrupts on the ThinkPad L490

The interrupt storm detection code detects the issue on the ThinkPad
T490s, but the L490 still hangs at initialization. So swap out the
T490s for the L490 in the dmi check.

Cc: Jarkko Sakkinen <[email protected]>
Cc: Jason Gunthorpe <[email protected]>
Cc: Peter Huewe <[email protected]>
Cc: James Bottomley <[email protected]>
Cc: Matthew Garrett <[email protected]>
Cc: Hans de Goede <[email protected]>
Signed-off-by: Jerry Snitselaar <[email protected]>
---
drivers/char/tpm/tpm_tis.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/char/tpm/tpm_tis.c b/drivers/char/tpm/tpm_tis.c
index 4ed6e660273a..7322e0986a83 100644
--- a/drivers/char/tpm/tpm_tis.c
+++ b/drivers/char/tpm/tpm_tis.c
@@ -77,10 +77,10 @@ static int tpm_tis_disable_irq(const struct dmi_system_id *d)
static const struct dmi_system_id tpm_tis_dmi_table[] = {
{
.callback = tpm_tis_disable_irq,
- .ident = "ThinkPad T490s",
+ .ident = "ThinkPad L490",
.matches = {
DMI_MATCH(DMI_SYS_VENDOR, "LENOVO"),
- DMI_MATCH(DMI_PRODUCT_VERSION, "ThinkPad T490s"),
+ DMI_MATCH(DMI_PRODUCT_VERSION, "ThinkPad L490"),
},
},
{}
--
2.27.0

2020-12-05 01:48:42

by Jerry Snitselaar

[permalink] [raw]
Subject: [PATCH v3 1/4] irq: export kstat_irqs

To try and detect potential interrupt storms that
have been occurring with tpm_tis devices it was suggested
to use kstat_irqs() to get the number of interrupts.
Since tpm_tis can be built as a module it needs kstat_irqs
exported.

Reported-by: kernel test robot <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: Jarkko Sakkinen <[email protected]>
Cc: Jason Gunthorpe <[email protected]>
Cc: Peter Huewe <[email protected]>
Cc: James Bottomley <[email protected]>
Cc: Matthew Garrett <[email protected]>
Cc: Hans de Goede <[email protected]>
Signed-off-by: Jerry Snitselaar <[email protected]>
---
include/linux/irqdesc.h | 1 +
kernel/irq/irqdesc.c | 1 +
2 files changed, 2 insertions(+)

diff --git a/include/linux/irqdesc.h b/include/linux/irqdesc.h
index 5745491303e0..fff88c1f1ac6 100644
--- a/include/linux/irqdesc.h
+++ b/include/linux/irqdesc.h
@@ -153,6 +153,7 @@ static inline void generic_handle_irq_desc(struct irq_desc *desc)
}

int generic_handle_irq(unsigned int irq);
+unsigned int kstat_irqs(unsigned int irq);

#ifdef CONFIG_HANDLE_DOMAIN_IRQ
/*
diff --git a/kernel/irq/irqdesc.c b/kernel/irq/irqdesc.c
index 1a7723604399..12398ef1796b 100644
--- a/kernel/irq/irqdesc.c
+++ b/kernel/irq/irqdesc.c
@@ -1000,6 +1000,7 @@ unsigned int kstat_irqs(unsigned int irq)
sum += *per_cpu_ptr(desc->kstat_irqs, cpu);
return sum;
}
+EXPORT_SYMBOL_GPL(kstat_irqs);

/**
* kstat_irqs_usr - Get the statistics for an interrupt
--
2.27.0

2020-12-05 10:47:43

by Jarkko Sakkinen

[permalink] [raw]
Subject: Re: [PATCH v3 1/4] irq: export kstat_irqs

On Fri, Dec 04, 2020 at 06:43:37PM -0700, Jerry Snitselaar wrote:
> To try and detect potential interrupt storms that
> have been occurring with tpm_tis devices it was suggested
> to use kstat_irqs() to get the number of interrupts.
> Since tpm_tis can be built as a module it needs kstat_irqs
> exported.

I think you should also have a paragraph explicitly stating that
i915_pmu.c contains a duplicate of kstat_irqs() because it is not
exported as of today. It adds a lot more weight to this given that
there is already existing mainline usage (kind of).

>
> Reported-by: kernel test robot <[email protected]>

I'm not sure if this makes much sense.

> Cc: Thomas Gleixner <[email protected]>
> Cc: Jarkko Sakkinen <[email protected]>
> Cc: Jason Gunthorpe <[email protected]>
> Cc: Peter Huewe <[email protected]>
> Cc: James Bottomley <[email protected]>
> Cc: Matthew Garrett <[email protected]>
> Cc: Hans de Goede <[email protected]>
> Signed-off-by: Jerry Snitselaar <[email protected]>

/Jarkko

2020-12-06 16:43:31

by Thomas Gleixner

[permalink] [raw]
Subject: Re: [PATCH v3 1/4] irq: export kstat_irqs

On Sat, Dec 05 2020 at 12:39, Jarkko Sakkinen wrote:
> On Fri, Dec 04, 2020 at 06:43:37PM -0700, Jerry Snitselaar wrote:
>> To try and detect potential interrupt storms that
>> have been occurring with tpm_tis devices it was suggested
>> to use kstat_irqs() to get the number of interrupts.
>> Since tpm_tis can be built as a module it needs kstat_irqs
>> exported.
>
> I think you should also have a paragraph explicitly stating that
> i915_pmu.c contains a duplicate of kstat_irqs() because it is not
> exported as of today. It adds a lot more weight to this given that
> there is already existing mainline usage (kind of).

It's abusage and just the fact that it exists is not an argument by
itself.

Thanks,

tglx

2020-12-06 17:45:28

by James Bottomley

[permalink] [raw]
Subject: Re: [PATCH v3 1/4] irq: export kstat_irqs

On Sun, 2020-12-06 at 17:40 +0100, Thomas Gleixner wrote:
> On Sat, Dec 05 2020 at 12:39, Jarkko Sakkinen wrote:
> > On Fri, Dec 04, 2020 at 06:43:37PM -0700, Jerry Snitselaar wrote:
> > > To try and detect potential interrupt storms that
> > > have been occurring with tpm_tis devices it was suggested
> > > to use kstat_irqs() to get the number of interrupts.
> > > Since tpm_tis can be built as a module it needs kstat_irqs
> > > exported.
> >
> > I think you should also have a paragraph explicitly stating that
> > i915_pmu.c contains a duplicate of kstat_irqs() because it is not
> > exported as of today. It adds a lot more weight to this given that
> > there is already existing mainline usage (kind of).
>
> It's abusage and just the fact that it exists is not an argument by
> itself.

What we want is a count of the interrupts to see if we're having an
interrupt storm from the TPM device (some seem to be wired to fire the
interrupt even when there's no event to warrant it). Since
kstat_irqs_user() does the correct RCU locking, should we be using that
instead?

James


2020-12-06 17:59:38

by Thomas Gleixner

[permalink] [raw]
Subject: Re: [PATCH v3 1/4] irq: export kstat_irqs

Jerry,

On Fri, Dec 04 2020 at 18:43, Jerry Snitselaar wrote:

The proper prefix is 'genirq:' git log kernel/irq/irqdesc.c would have
told you.

> To try and detect potential interrupt storms that
> have been occurring with tpm_tis devices it was suggested
> to use kstat_irqs() to get the number of interrupts.
> Since tpm_tis can be built as a module it needs kstat_irqs
> exported.

I'm not really enthused about exporting this without making it at least
safe. Using it from an interrupt handler is obviously safe vs. concurrent
removal, but the next driver writer who thinks this is cool is going to
get it wrong for sure.

Though I still have to figure out what the advantage of invoking a
function which needs to do a radix tree lookup over a device local
counter is just to keep track of this.

I'll reply on the TPM part of this as well.

Thanks,

tglx

2020-12-06 19:34:36

by Thomas Gleixner

[permalink] [raw]
Subject: Re: [PATCH v3 1/4] irq: export kstat_irqs

On Sun, Dec 06 2020 at 09:40, James Bottomley wrote:
> On Sun, 2020-12-06 at 17:40 +0100, Thomas Gleixner wrote:
>> On Sat, Dec 05 2020 at 12:39, Jarkko Sakkinen wrote:
>> > On Fri, Dec 04, 2020 at 06:43:37PM -0700, Jerry Snitselaar wrote:
>> > > To try and detect potential interrupt storms that
>> > > have been occurring with tpm_tis devices it was suggested
>> > > to use kstat_irqs() to get the number of interrupts.
>> > > Since tpm_tis can be built as a module it needs kstat_irqs
>> > > exported.
>> >
>> > I think you should also have a paragraph explicitly stating that
>> > i915_pmu.c contains a duplicate of kstat_irqs() because it is not
>> > exported as of today. It adds a lot more weight to this given that
>> > there is already existing mainline usage (kind of).
>>
>> It's abusage and just the fact that it exists is not an argument by
>> itself.
>
> What we want is a count of the interrupts to see if we're having an
> interrupt storm from the TPM device (some seem to be wired to fire the
> interrupt even when there's no event to warrant it). Since
> kstat_irqs_user() does the correct RCU locking, should we be using that
> instead?

If we need to export it, yes. But I still have to understand the
value. See my other reply.

Thanks,

tglx

2020-12-06 19:34:36

by Thomas Gleixner

[permalink] [raw]
Subject: Re: [PATCH v3 3/4] tpm_tis: Disable interrupts if interrupt storm detected

Jerry,

On Fri, Dec 04 2020 at 18:43, Jerry Snitselaar wrote:
> @@ -715,9 +717,23 @@ static irqreturn_t tis_int_handler(int dummy, void *dev_id)
> {
> struct tpm_chip *chip = dev_id;
> struct tpm_tis_data *priv = dev_get_drvdata(&chip->dev);
> + static bool check_storm = true;
> + static unsigned int check_start;

So this assumes that there can't be two TPMs which is probably true, but
everything else in this driver has stuff in tpm_tis_data per device.

> u32 interrupt;
> int i, rc;
>
> + if (unlikely(check_storm)) {
> + if (!check_start) {
> + check_start = jiffies_to_msecs(jiffies);

Yuck. I had to read that twice to figure out that it's correct vs. the
truncation of the result to unsigned int. You can spare that conversion
by simply doing

unsigned long end_of_check = jiffies + HZ / 2;

and then the check becomes

time_before(jiffies, end_of_check)

> + } else if ((kstat_irqs(priv->irq) > 1000) &&
> + (jiffies_to_msecs(jiffies) - check_start < 500)) {

I assume you can't call disable_irq_nosync() here, but shouldn't this
shut up the interrupt at the TPM level right here?

> + check_storm = false;
> + schedule_work(&priv->storm_work);
> + } else if (jiffies_to_msecs(jiffies) - check_start >= 500) {
> + check_storm = false;
> + }
> + }

So back to kstat_irqs(). As this needs two extra variables anyway:

init()
priv->irq_check = 1;
priv->end_check = 0;

isr()
if (unlikely(priv->irq_check)) {
if (!priv->end_check) {
priv->end_check = jiffies + HZ / 2;
} else if (time_before(jiffies, priv->end_check)) {
if (priv->irq_check++ > 1000)
schedule_work(...);
} else {
priv->irq_check = 0;
}
}

Hmm? I still need to see an argument for an kstat_irqs() export being
superior.

Though I wonder whether such an infrastructure should be provided in the
irq core. Let me think about it.

Just as a side note. I was looking at tpm_tis_probe_irq_single() and
that function is leaking the interrupt request if any of the checks
afterwards fails, except for the final interrupt probe check which does
a cleanup. That means on fail before that the interrupt handler stays
requested up to the point where the module is removed. If that's a
shared interrupt and some other device is active on the same line, then
each interrupt from that device will call into the TPM code. Something
like the below is needed.

Also the X86 autoprobe mechanism is interesting:

if (IS_ENABLED(CONFIG_X86))
for (i = 3; i <= 15; i++)
if (!tpm_tis_probe_irq_single(chip, intmask, 0, i))
return;

The third argument is 'flags' which is handed to request_irq(). So that
won't ever be able to probe a shared interrupt. But if an interrupt
number > 0 is handed to tpm_tis_core_init() the interrupt is requested
with IRQF_SHARED. Same issue when the chip has an interrupt number in
the register. It's also requested exclusive which is pretty likely
to fail on ancient x86 machines.

The vast amount of comments didn't help to figure out what the reasoning
is.

Thanks,

tglx
---
drivers/char/tpm/tpm_tis_core.c | 14 +++++++++-----
1 file changed, 9 insertions(+), 5 deletions(-)

--- a/drivers/char/tpm/tpm_tis_core.c
+++ b/drivers/char/tpm/tpm_tis_core.c
@@ -782,26 +782,26 @@ static int tpm_tis_probe_irq_single(stru
rc = tpm_tis_read8(priv, TPM_INT_VECTOR(priv->locality),
&original_int_vec);
if (rc < 0)
- return rc;
+ goto fail;

rc = tpm_tis_write8(priv, TPM_INT_VECTOR(priv->locality), irq);
if (rc < 0)
- return rc;
+ goto fail;

rc = tpm_tis_read32(priv, TPM_INT_STATUS(priv->locality), &int_status);
if (rc < 0)
- return rc;
+ goto fail;

/* Clear all existing */
rc = tpm_tis_write32(priv, TPM_INT_STATUS(priv->locality), int_status);
if (rc < 0)
- return rc;
+ goto fail;

/* Turn on */
rc = tpm_tis_write32(priv, TPM_INT_ENABLE(priv->locality),
intmask | TPM_GLOBAL_INT_ENABLE);
if (rc < 0)
- return rc;
+ goto fail;

priv->irq_tested = false;

@@ -825,6 +825,10 @@ static int tpm_tis_probe_irq_single(stru
}

return 0;
+
+fail:
+ disable_interrupts(chip);
+ return rc;
}

/* Try to find the IRQ the TPM is using. This is for legacy x86 systems that

2020-12-06 21:51:42

by Jerry Snitselaar

[permalink] [raw]
Subject: Re: [PATCH v3 1/4] irq: export kstat_irqs


Thomas Gleixner @ 2020-12-06 10:54 MST:

> Jerry,
>
> On Fri, Dec 04 2020 at 18:43, Jerry Snitselaar wrote:
>
> The proper prefix is 'genirq:' git log kernel/irq/irqdesc.c would have
> told you.
>
>> To try and detect potential interrupt storms that
>> have been occurring with tpm_tis devices it was suggested
>> to use kstat_irqs() to get the number of interrupts.
>> Since tpm_tis can be built as a module it needs kstat_irqs
>> exported.
>
> I'm not really enthused about exporting this without making it at least
> safe. Using it from an interrupt handler is obviously safe vs. concurrent
> removal, but the next driver writer who thinks this is cool is going to
> get it wrong for sure.
>
> Though I still have to figure out what the advantage of invoking a
> function which needs to do a radix tree lookup over a device local
> counter is just to keep track of this.
>
> I'll reply on the TPM part of this as well.
>
> Thanks,
>
> tglx

I can rework it to use a device local counter.

2020-12-07 19:31:31

by Jason Gunthorpe

[permalink] [raw]
Subject: Re: [PATCH v3 3/4] tpm_tis: Disable interrupts if interrupt storm detected

On Sun, Dec 06, 2020 at 08:26:16PM +0100, Thomas Gleixner wrote:
> Just as a side note. I was looking at tpm_tis_probe_irq_single() and
> that function is leaking the interrupt request if any of the checks
> afterwards fails, except for the final interrupt probe check which does
> a cleanup. That means on fail before that the interrupt handler stays
> requested up to the point where the module is removed. If that's a
> shared interrupt and some other device is active on the same line, then
> each interrupt from that device will call into the TPM code. Something
> like the below is needed.
>
> Also the X86 autoprobe mechanism is interesting:
>
> if (IS_ENABLED(CONFIG_X86))
> for (i = 3; i <= 15; i++)
> if (!tpm_tis_probe_irq_single(chip, intmask, 0, i))
> return;
>
> The third argument is 'flags' which is handed to request_irq(). So that
> won't ever be able to probe a shared interrupt. But if an interrupt
> number > 0 is handed to tpm_tis_core_init() the interrupt is requested
> with IRQF_SHARED. Same issue when the chip has an interrupt number in
> the register. It's also requested exclusive which is pretty likely
> to fail on ancient x86 machines.

It is very likely none of this works any more, it has been repeatedly
reworked over the years and just left behind out of fear someone needs
it. I've thought it should be deleted for a while now.

I suppose the original logic was to try and probe without SHARED
because a probe would need exclusive access to the interrupt to tell
if the TPM was actually the source, not some other device.

It is all very old and very out of step with current thinking, IMHO. I
skeptical that TPM interrupts were ever valuable enough to deserve
this in the first place.

Jason

2020-12-07 20:02:59

by James Bottomley

[permalink] [raw]
Subject: Re: [PATCH v3 3/4] tpm_tis: Disable interrupts if interrupt storm detected

On Mon, 2020-12-07 at 15:28 -0400, Jason Gunthorpe wrote:
> On Sun, Dec 06, 2020 at 08:26:16PM +0100, Thomas Gleixner wrote:
> > Just as a side note. I was looking at tpm_tis_probe_irq_single()
> > and that function is leaking the interrupt request if any of the
> > checks afterwards fails, except for the final interrupt probe check
> > which does a cleanup. That means on fail before that the interrupt
> > handler stays requested up to the point where the module is
> > removed. If that's a shared interrupt and some other device is
> > active on the same line, then each interrupt from that device will
> > call into the TPM code. Something like the below is needed.
> >
> > Also the X86 autoprobe mechanism is interesting:
> >
> > if (IS_ENABLED(CONFIG_X86))
> > for (i = 3; i <= 15; i++)
> > if (!tpm_tis_probe_irq_single(chip, intmask, 0,
> > i))
> > return;
> >
> > The third argument is 'flags' which is handed to request_irq(). So
> > that won't ever be able to probe a shared interrupt. But if an
> > interrupt number > 0 is handed to tpm_tis_core_init() the interrupt
> > is requested with IRQF_SHARED. Same issue when the chip has an
> > interrupt number in the register. It's also requested exclusive
> > which is pretty likely to fail on ancient x86 machines.
>
> It is very likely none of this works any more, it has been repeatedly
> reworked over the years and just left behind out of fear someone
> needs it. I've thought it should be deleted for a while now.
>
> I suppose the original logic was to try and probe without SHARED
> because a probe would need exclusive access to the interrupt to tell
> if the TPM was actually the source, not some other device.
>
> It is all very old and very out of step with current thinking, IMHO.
> I skeptical that TPM interrupts were ever valuable enough to deserve
> this in the first place.

For what it's worth, I agree. Trying to probe all 15 ISA interrupts is
last millennium thinking we should completely avoid. If it's not
described in ACPI then you don't get an interrupt full stop.

James


2020-12-08 17:46:54

by Jarkko Sakkinen

[permalink] [raw]
Subject: Re: [PATCH v3 3/4] tpm_tis: Disable interrupts if interrupt storm detected

On Mon, Dec 07, 2020 at 03:28:03PM -0400, Jason Gunthorpe wrote:
> On Sun, Dec 06, 2020 at 08:26:16PM +0100, Thomas Gleixner wrote:
> > Just as a side note. I was looking at tpm_tis_probe_irq_single() and
> > that function is leaking the interrupt request if any of the checks
> > afterwards fails, except for the final interrupt probe check which does
> > a cleanup. That means on fail before that the interrupt handler stays
> > requested up to the point where the module is removed. If that's a
> > shared interrupt and some other device is active on the same line, then
> > each interrupt from that device will call into the TPM code. Something
> > like the below is needed.
> >
> > Also the X86 autoprobe mechanism is interesting:
> >
> > if (IS_ENABLED(CONFIG_X86))
> > for (i = 3; i <= 15; i++)
> > if (!tpm_tis_probe_irq_single(chip, intmask, 0, i))
> > return;
> >
> > The third argument is 'flags' which is handed to request_irq(). So that
> > won't ever be able to probe a shared interrupt. But if an interrupt
> > number > 0 is handed to tpm_tis_core_init() the interrupt is requested
> > with IRQF_SHARED. Same issue when the chip has an interrupt number in
> > the register. It's also requested exclusive which is pretty likely
> > to fail on ancient x86 machines.
>
> It is very likely none of this works any more, it has been repeatedly
> reworked over the years and just left behind out of fear someone needs
> it. I've thought it should be deleted for a while now.
>
> I suppose the original logic was to try and probe without SHARED
> because a probe would need exclusive access to the interrupt to tell
> if the TPM was actually the source, not some other device.
>
> It is all very old and very out of step with current thinking, IMHO. I
> skeptical that TPM interrupts were ever valuable enough to deserve
> this in the first place.
>
> Jason

+1 for removing it.

/Jarkko

2020-12-09 01:04:31

by Jarkko Sakkinen

[permalink] [raw]
Subject: Re: [PATCH v3 3/4] tpm_tis: Disable interrupts if interrupt storm detected

On Mon, Dec 07, 2020 at 11:58:44AM -0800, James Bottomley wrote:
> On Mon, 2020-12-07 at 15:28 -0400, Jason Gunthorpe wrote:
> > On Sun, Dec 06, 2020 at 08:26:16PM +0100, Thomas Gleixner wrote:
> > > Just as a side note. I was looking at tpm_tis_probe_irq_single()
> > > and that function is leaking the interrupt request if any of the
> > > checks afterwards fails, except for the final interrupt probe check
> > > which does a cleanup. That means on fail before that the interrupt
> > > handler stays requested up to the point where the module is
> > > removed. If that's a shared interrupt and some other device is
> > > active on the same line, then each interrupt from that device will
> > > call into the TPM code. Something like the below is needed.
> > >
> > > Also the X86 autoprobe mechanism is interesting:
> > >
> > > if (IS_ENABLED(CONFIG_X86))
> > > for (i = 3; i <= 15; i++)
> > > if (!tpm_tis_probe_irq_single(chip, intmask, 0,
> > > i))
> > > return;
> > >
> > > The third argument is 'flags' which is handed to request_irq(). So
> > > that won't ever be able to probe a shared interrupt. But if an
> > > interrupt number > 0 is handed to tpm_tis_core_init() the interrupt
> > > is requested with IRQF_SHARED. Same issue when the chip has an
> > > interrupt number in the register. It's also requested exclusive
> > > which is pretty likely to fail on ancient x86 machines.
> >
> > It is very likely none of this works any more, it has been repeatedly
> > reworked over the years and just left behind out of fear someone
> > needs it. I've thought it should be deleted for a while now.
> >
> > I suppose the original logic was to try and probe without SHARED
> > because a probe would need exclusive access to the interrupt to tell
> > if the TPM was actually the source, not some other device.
> >
> > It is all very old and very out of step with current thinking, IMHO.
> > I skeptical that TPM interrupts were ever valuable enough to deserve
> > this in the first place.
>
> For what it's worth, I agree. Trying to probe all 15 ISA interrupts is
> last millennium thinking we should completely avoid. If it's not
> described in ACPI then you don't get an interrupt full stop.
>
> James

Maybe you could add this as part of your patches?

/Jarkko