In tpm_tis_core_init(), tpm2_probe() will be called first, this
function will eventually call tpm_tis_send(), and then
tpm_tis_probe_irq_single() will detect whether the interrupt is
normal, mainly the installation interrupted, set `priv->irq_tested`
to false. The logic will eventually be executed to tpm_tis_send()
to trigger an interrupt.
There is currently such a scenario, which will cause the IRQ probe
code to never be executed, so that the TPM device is in polling
mode: after setting irq_tested to false, an interrupt occurs
between entering the ttpm_tis_send() function, and the interrupt
will be first set irq_tested to true will cause the IRQ probe code
to never be executed.
It seems that this interrupt comes from tpm2_probe(). Although the
interrupt has not been installed when tpm2_probe() is called, the
interrupt of tpm2_probe() is only received after IRQ detection.
This patch solves this issue by introducing a new variable, which
is only used in interrupts, and irq_tested only marks whether the
interrupt test has been completed.
Signed-off-by: Tianjia Zhang <[email protected]>
---
drivers/char/tpm/tpm_tis_core.c | 8 ++++----
drivers/char/tpm/tpm_tis_core.h | 1 +
2 files changed, 5 insertions(+), 4 deletions(-)
diff --git a/drivers/char/tpm/tpm_tis_core.c b/drivers/char/tpm/tpm_tis_core.c
index 92c51c6cfd1b..d7589b0b3e56 100644
--- a/drivers/char/tpm/tpm_tis_core.c
+++ b/drivers/char/tpm/tpm_tis_core.c
@@ -502,7 +502,7 @@ static int tpm_tis_send(struct tpm_chip *chip, u8 *buf, size_t len)
int rc, irq;
struct tpm_tis_data *priv = dev_get_drvdata(&chip->dev);
- if (!(chip->flags & TPM_CHIP_FLAG_IRQ) || priv->irq_tested)
+ if (priv->irq_tested)
return tpm_tis_send_main(chip, buf, len);
/* Verify receipt of the expected IRQ */
@@ -512,9 +512,9 @@ static int tpm_tis_send(struct tpm_chip *chip, u8 *buf, size_t len)
rc = tpm_tis_send_main(chip, buf, len);
priv->irq = irq;
chip->flags |= TPM_CHIP_FLAG_IRQ;
- if (!priv->irq_tested)
+ if (!priv->int_count)
tpm_msleep(1);
- if (!priv->irq_tested)
+ if (!priv->int_count)
disable_interrupts(chip);
priv->irq_tested = true;
return rc;
@@ -725,7 +725,7 @@ static irqreturn_t tis_int_handler(int dummy, void *dev_id)
if (interrupt == 0)
return IRQ_NONE;
- priv->irq_tested = true;
+ priv->int_count += 1;
if (interrupt & TPM_INTF_DATA_AVAIL_INT)
wake_up_interruptible(&priv->read_queue);
if (interrupt & TPM_INTF_LOCALITY_CHANGE_INT)
diff --git a/drivers/char/tpm/tpm_tis_core.h b/drivers/char/tpm/tpm_tis_core.h
index 9b2d32a59f67..c6845672f6f7 100644
--- a/drivers/char/tpm/tpm_tis_core.h
+++ b/drivers/char/tpm/tpm_tis_core.h
@@ -90,6 +90,7 @@ struct tpm_tis_data {
int locality;
int irq;
bool irq_tested;
+ unsigned int int_count;
unsigned int flags;
void __iomem *ilb_base_addr;
u16 clkrun_enabled;
--
2.19.1.3.ge56e4f7
On Wed, Jan 13, 2021 at 08:00:21PM +0800, Tianjia Zhang wrote:
> In tpm_tis_core_init(), tpm2_probe() will be called first, this
> function will eventually call tpm_tis_send(), and then
> tpm_tis_probe_irq_single() will detect whether the interrupt is
> normal, mainly the installation interrupted, set `priv->irq_tested`
> to false. The logic will eventually be executed to tpm_tis_send()
> to trigger an interrupt.
>
> There is currently such a scenario, which will cause the IRQ probe
> code to never be executed, so that the TPM device is in polling
> mode: after setting irq_tested to false, an interrupt occurs
> between entering the ttpm_tis_send() function, and the interrupt
> will be first set irq_tested to true will cause the IRQ probe code
> to never be executed.
Can you describe the scenario more detail?
> It seems that this interrupt comes from tpm2_probe(). Although the
> interrupt has not been installed when tpm2_probe() is called, the
> interrupt of tpm2_probe() is only received after IRQ detection.
>
> This patch solves this issue by introducing a new variable, which
> is only used in interrupts, and irq_tested only marks whether the
> interrupt test has been completed.
>
> Signed-off-by: Tianjia Zhang <[email protected]>
> ---
I'm not sure I understand this patch. TPM should be in polling
mode. This is also assumption before calling tpm_get_timeouts():
/* Before doing irq testing issue a command to the TPM in polling mode
* to make sure it works. May as well use that command to set the
* proper timeouts for the driver.
*/
if (tpm_get_timeouts(chip)) {
dev_err(dev, "Could not get TPM timeouts and durations\n");
rc = -ENODEV;
goto out_err;
}
/Jarkko
On 1/14/21 10:51 AM, Jarkko Sakkinen wrote:
> On Wed, Jan 13, 2021 at 08:00:21PM +0800, Tianjia Zhang wrote:
>> In tpm_tis_core_init(), tpm2_probe() will be called first, this
>> function will eventually call tpm_tis_send(), and then
>> tpm_tis_probe_irq_single() will detect whether the interrupt is
>> normal, mainly the installation interrupted, set `priv->irq_tested`
>> to false. The logic will eventually be executed to tpm_tis_send()
>> to trigger an interrupt.
>>
>> There is currently such a scenario, which will cause the IRQ probe
>> code to never be executed, so that the TPM device is in polling
>> mode: after setting irq_tested to false, an interrupt occurs
>> between entering the ttpm_tis_send() function, and the interrupt
>> will be first set irq_tested to true will cause the IRQ probe code
>> to never be executed.
>
> Can you describe the scenario more detail?
>
The problematic scenario we encountered is like this. The following
figure shows the execution flow of tpm_tis_core_init(). An interrupt
occurred before the IRQ probe. This interrupt was caused by
tpm2_probe(), but it was triggered before the IRQ probe was executed,
and the interrupt handler would set irq_tested to true, so the IRQ probe
code can never execute, that is, the code marked 2 in the figure will
never happen.
IRQ
tpm_tis_core_init()
tpm2_probe()
tpm_tis_send() -----------+
|
tpm_tis_probe_irq_single() |
|
devm_request_irq() | 1
priv->irq_tested = false |
tpm_tis_gen_interrupt() |
|
tpm_tis_send() |
irq_tested = true |
<------------------+
if (priv->irq_tested)
return tpm_tis_send_main()
/* probe IRQ */
tpm_tis_send_main() --------+
| 2
chip->flags |= FLAG_IRQ <-------+
priv->irq_tested = true
Best regards,
Tianjia
On Thu, Jan 14, 2021 at 12:12:16PM +0800, Tianjia Zhang wrote:
>
>
> On 1/14/21 10:51 AM, Jarkko Sakkinen wrote:
> > On Wed, Jan 13, 2021 at 08:00:21PM +0800, Tianjia Zhang wrote:
> > > In tpm_tis_core_init(), tpm2_probe() will be called first, this
> > > function will eventually call tpm_tis_send(), and then
> > > tpm_tis_probe_irq_single() will detect whether the interrupt is
> > > normal, mainly the installation interrupted, set `priv->irq_tested`
> > > to false. The logic will eventually be executed to tpm_tis_send()
> > > to trigger an interrupt.
> > >
> > > There is currently such a scenario, which will cause the IRQ probe
> > > code to never be executed, so that the TPM device is in polling
> > > mode: after setting irq_tested to false, an interrupt occurs
> > > between entering the ttpm_tis_send() function, and the interrupt
> > > will be first set irq_tested to true will cause the IRQ probe code
> > > to never be executed.
> >
> > Can you describe the scenario more detail?
> >
>
> The problematic scenario we encountered is like this. The following figure
> shows the execution flow of tpm_tis_core_init(). An interrupt occurred
> before the IRQ probe. This interrupt was caused by tpm2_probe(), but it was
> triggered before the IRQ probe was executed, and the interrupt handler would
> set irq_tested to true, so the IRQ probe code can never execute, that is,
> the code marked 2 in the figure will never happen.
TPM_INT_ENABLE is cleared on reset [*].
[*] Section 5.9.1
https://trustedcomputinggroup.org/resource/pc-client-work-group-pc-client-specific-tpm-interface-specification-tis/
/Jarkko
On 1/15/21 5:23 PM, Jarkko Sakkinen wrote:
> On Thu, Jan 14, 2021 at 12:12:16PM +0800, Tianjia Zhang wrote:
>>
>>
>> On 1/14/21 10:51 AM, Jarkko Sakkinen wrote:
>>> On Wed, Jan 13, 2021 at 08:00:21PM +0800, Tianjia Zhang wrote:
>>>> In tpm_tis_core_init(), tpm2_probe() will be called first, this
>>>> function will eventually call tpm_tis_send(), and then
>>>> tpm_tis_probe_irq_single() will detect whether the interrupt is
>>>> normal, mainly the installation interrupted, set `priv->irq_tested`
>>>> to false. The logic will eventually be executed to tpm_tis_send()
>>>> to trigger an interrupt.
>>>>
>>>> There is currently such a scenario, which will cause the IRQ probe
>>>> code to never be executed, so that the TPM device is in polling
>>>> mode: after setting irq_tested to false, an interrupt occurs
>>>> between entering the ttpm_tis_send() function, and the interrupt
>>>> will be first set irq_tested to true will cause the IRQ probe code
>>>> to never be executed.
>>>
>>> Can you describe the scenario more detail?
>>>
>>
>> The problematic scenario we encountered is like this. The following figure
>> shows the execution flow of tpm_tis_core_init(). An interrupt occurred
>> before the IRQ probe. This interrupt was caused by tpm2_probe(), but it was
>> triggered before the IRQ probe was executed, and the interrupt handler would
>> set irq_tested to true, so the IRQ probe code can never execute, that is,
>> the code marked 2 in the figure will never happen.
>
> TPM_INT_ENABLE is cleared on reset [*].
>
> [*] Section 5.9.1
> https://trustedcomputinggroup.org/resource/pc-client-work-group-pc-client-specific-tpm-interface-specification-tis/
>
> /Jarkko
>
Hi,
I got it, this seems to be a firmware issue. Thanks for your reply.
Best regards,
Tianjia
Greeting,
FYI, we noticed the following commit (built with gcc-9):
commit: cad8219df08a3961809468946a4d27eb3f4efb2c ("[PATCH] tpm/tpm_tis: Fix variable reset during IRQ probing")
url: https://github.com/0day-ci/linux/commits/Tianjia-Zhang/tpm-tpm_tis-Fix-variable-reset-during-IRQ-probing/20210113-200618
base: https://git.kernel.org/cgit/linux/kernel/git/gregkh/char-misc.git 09381eb16ad887e05bc2a9500261afaa5dc77cd3
in testcase: ltp
version: ltp-x86_64-14c1f76-1_20210101
with following parameters:
disk: 1HDD
fs: f2fs
test: fs_bind
ucode: 0xde
test-description: The LTP testsuite contains a collection of tools for testing the Linux kernel and related features.
test-url: http://linux-test-project.github.io/
on test machine: 8 threads Intel(R) Core(TM) i7-7700 CPU @ 3.60GHz with 32G memory
caused below changes (please refer to attached dmesg/kmsg for entire log/backtrace):
If you fix the issue, kindly add following tag
Reported-by: kernel test robot <[email protected]>
[ 11.505905] BUG: unable to handle page fault for address: ffffffffffffffe8
[ 11.506895] #PF: supervisor read access in kernel mode
[ 11.506895] #PF: error_code(0x0000) - not-present page
[ 11.506895] PGD 81c20d067 P4D 81c20d067 PUD 81c20f067 PMD 0
[ 11.506895] Oops: 0000 [#1] SMP PTI
[ 11.506895] CPU: 6 PID: 1 Comm: swapper/0 Tainted: G I 5.11.0-rc3-gcad8219df08a #1
[ 11.506895] Hardware name: Dell Inc. OptiPlex 7050/062KRH, BIOS 1.2.0 12/22/2016
[ 11.506895] RIP: 0010:prepare_to_wait_event (kbuild/src/consumer/include/linux/wait.h:177 kbuild/src/consumer/kernel/sched/wait.c:316)
[ 11.506895] Code: 00 01 75 64 49 8b 4c 24 08 4c 89 c6 48 8d 51 e8 49 39 c8 75 15 eb 18 48 8b 42 18 48 89 ce 48 8d 50 e8 49 39 c0 74 08 48 89 c1 <f6> 02 20 75 e8 48 8b 16 4c 89 f7 48 89 34 24 48 89 54 24 08 e8 fd
All code
========
0: 00 01 add %al,(%rcx)
2: 75 64 jne 0x68
4: 49 8b 4c 24 08 mov 0x8(%r12),%rcx
9: 4c 89 c6 mov %r8,%rsi
c: 48 8d 51 e8 lea -0x18(%rcx),%rdx
10: 49 39 c8 cmp %rcx,%r8
13: 75 15 jne 0x2a
15: eb 18 jmp 0x2f
17: 48 8b 42 18 mov 0x18(%rdx),%rax
1b: 48 89 ce mov %rcx,%rsi
1e: 48 8d 50 e8 lea -0x18(%rax),%rdx
22: 49 39 c0 cmp %rax,%r8
25: 74 08 je 0x2f
27: 48 89 c1 mov %rax,%rcx
2a:* f6 02 20 testb $0x20,(%rdx) <-- trapping instruction
2d: 75 e8 jne 0x17
2f: 48 8b 16 mov (%rsi),%rdx
32: 4c 89 f7 mov %r14,%rdi
35: 48 89 34 24 mov %rsi,(%rsp)
39: 48 89 54 24 08 mov %rdx,0x8(%rsp)
3e: e8 .byte 0xe8
3f: fd std
Code starting with the faulting instruction
===========================================
0: f6 02 20 testb $0x20,(%rdx)
3: 75 e8 jne 0xffffffffffffffed
5: 48 8b 16 mov (%rsi),%rdx
8: 4c 89 f7 mov %r14,%rdi
b: 48 89 34 24 mov %rsi,(%rsp)
f: 48 89 54 24 08 mov %rdx,0x8(%rsp)
14: e8 .byte 0xe8
15: fd std
[ 11.506895] RSP: 0000:ffffc900000339a8 EFLAGS: 00010086
[ 11.506895] RAX: ffffc90000033a28 RBX: 0000000000000001 RCX: 0000000000000000
[ 11.506895] RDX: ffffffffffffffe8 RSI: ffff888100fcf5a0 RDI: ffff888100fcf598
[ 11.506895] RBP: ffffc90000033a10 R08: ffff888100fcf5a0 R09: ffff88881ddaba98
[ 11.506895] R10: 00000000000001c9 R11: ffff88881dda9fc4 R12: ffff888100fcf598
[ 11.506895] R13: 0000000000000292 R14: ffffc90000033a28 R15: 0000000000000001
[ 11.506895] FS: 0000000000000000(0000) GS:ffff88881dd80000(0000) knlGS:0000000000000000
[ 11.506895] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 11.506895] CR2: ffffffffffffffe8 CR3: 000000081c20a001 CR4: 00000000003706e0
[ 11.506895] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 11.506895] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[ 11.506895] Call Trace:
[ 11.506895] ? tpm_tis_status (kbuild/src/consumer/drivers/char/tpm/tpm_tis_core.h:121 kbuild/src/consumer/drivers/char/tpm/tpm_tis_core.c:238)
[ 11.506895] wait_for_tpm_stat (kbuild/src/consumer/drivers/char/tpm/tpm_tis_core.c:68 (discriminator 15))
[ 11.506895] ? finish_wait (kbuild/src/consumer/kernel/sched/wait.c:403)
[ 11.506895] recv_data (kbuild/src/consumer/drivers/char/tpm/tpm_tis_core.c:299)
[ 11.506895] tpm_tis_recv (kbuild/src/consumer/drivers/char/tpm/tpm_tis_core.c:330)
[ 11.506895] tpm_transmit (kbuild/src/consumer/drivers/char/tpm/tpm-interface.c:126 kbuild/src/consumer/drivers/char/tpm/tpm-interface.c:173)
[ 11.506895] tpm_transmit_cmd (kbuild/src/consumer/drivers/char/tpm/tpm-interface.c:221)
[ 11.506895] tpm2_probe (kbuild/src/consumer/drivers/char/tpm/tpm2-cmd.c:498)
[ 11.506895] tpm_tis_core_init (kbuild/src/consumer/drivers/char/tpm/tpm_tis_core.c:1012)
[ 11.506895] ? __devm_ioremap_resource (kbuild/src/consumer/lib/devres.c:147)
[ 11.506895] tpm_tis_plat_probe (kbuild/src/consumer/drivers/char/tpm/tpm_tis.c:331)
[ 11.506895] platform_probe (kbuild/src/consumer/drivers/base/platform.c:1446)
[ 11.506895] really_probe (kbuild/src/consumer/drivers/base/dd.c:561)
[ 11.506895] driver_probe_device (kbuild/src/consumer/drivers/base/dd.c:745)
[ 11.506895] device_driver_attach (kbuild/src/consumer/drivers/base/dd.c:1020)
[ 11.506895] __driver_attach (kbuild/src/consumer/drivers/base/dd.c:1099)
[ 11.506895] ? device_driver_attach (kbuild/src/consumer/drivers/base/dd.c:1052)
[ 11.506895] ? device_driver_attach (kbuild/src/consumer/drivers/base/dd.c:1052)
[ 11.506895] bus_for_each_dev (kbuild/src/consumer/drivers/base/bus.c:305)
[ 11.506895] bus_add_driver (kbuild/src/consumer/drivers/base/bus.c:623)
[ 11.506895] driver_register (kbuild/src/consumer/drivers/base/driver.c:171)
[ 11.506895] ? tpm_init (kbuild/src/consumer/drivers/char/tpm/tpm_tis.c:390)
[ 11.506895] init_tis (kbuild/src/consumer/drivers/char/tpm/tpm_tis.c:397)
[ 11.506895] ? driver_register (kbuild/src/consumer/drivers/base/driver.c:182)
[ 11.506895] do_one_initcall (kbuild/src/consumer/init/main.c:1217)
[ 11.506895] kernel_init_freeable (kbuild/src/consumer/init/main.c:1289 kbuild/src/consumer/init/main.c:1306 kbuild/src/consumer/init/main.c:1326 kbuild/src/consumer/init/main.c:1527)
[ 11.506895] ? rest_init (kbuild/src/consumer/init/main.c:1412)
[ 11.506895] kernel_init (kbuild/src/consumer/init/main.c:1417)
[ 11.506895] ret_from_fork (kbuild/src/consumer/arch/x86/entry/entry_64.S:302)
[ 11.506895] Modules linked in:
[ 11.506895] CR2: ffffffffffffffe8
[ 11.506895] ---[ end trace 595ed6f1e30f2e0b ]---
[ 11.506895] RIP: 0010:prepare_to_wait_event (kbuild/src/consumer/include/linux/wait.h:177 kbuild/src/consumer/kernel/sched/wait.c:316)
[ 11.506895] Code: 00 01 75 64 49 8b 4c 24 08 4c 89 c6 48 8d 51 e8 49 39 c8 75 15 eb 18 48 8b 42 18 48 89 ce 48 8d 50 e8 49 39 c0 74 08 48 89 c1 <f6> 02 20 75 e8 48 8b 16 4c 89 f7 48 89 34 24 48 89 54 24 08 e8 fd
All code
========
0: 00 01 add %al,(%rcx)
2: 75 64 jne 0x68
4: 49 8b 4c 24 08 mov 0x8(%r12),%rcx
9: 4c 89 c6 mov %r8,%rsi
c: 48 8d 51 e8 lea -0x18(%rcx),%rdx
10: 49 39 c8 cmp %rcx,%r8
13: 75 15 jne 0x2a
15: eb 18 jmp 0x2f
17: 48 8b 42 18 mov 0x18(%rdx),%rax
1b: 48 89 ce mov %rcx,%rsi
1e: 48 8d 50 e8 lea -0x18(%rax),%rdx
22: 49 39 c0 cmp %rax,%r8
25: 74 08 je 0x2f
27: 48 89 c1 mov %rax,%rcx
2a:* f6 02 20 testb $0x20,(%rdx) <-- trapping instruction
2d: 75 e8 jne 0x17
2f: 48 8b 16 mov (%rsi),%rdx
32: 4c 89 f7 mov %r14,%rdi
35: 48 89 34 24 mov %rsi,(%rsp)
39: 48 89 54 24 08 mov %rdx,0x8(%rsp)
3e: e8 .byte 0xe8
3f: fd std
Code starting with the faulting instruction
===========================================
0: f6 02 20 testb $0x20,(%rdx)
3: 75 e8 jne 0xffffffffffffffed
5: 48 8b 16 mov (%rsi),%rdx
8: 4c 89 f7 mov %r14,%rdi
b: 48 89 34 24 mov %rsi,(%rsp)
f: 48 89 54 24 08 mov %rdx,0x8(%rsp)
14: e8 .byte 0xe8
15: fd std
To reproduce:
git clone https://github.com/intel/lkp-tests.git
cd lkp-tests
bin/lkp install job.yaml # job file is attached in this email
bin/lkp run job.yaml
Thanks,
Oliver Sang