This patch series enhances the support for the AMD's fTPM.
The AMD system assigned a command buffer and response buffer
independently in ACPI NVS region. ACPI NVS region allowed nothing to
assign a resource in it.
For supporting AMD's fTPM, I made a patch to enhance the code of command
and response buffer size calculation. I also made a patch to detect TPM
regions in ACPI NVS and work around it.
Seunghun Han (2):
tpm: tpm_crb: enhance command and response buffer size calculation
code
tpm: tpm_crb: enhance resource mapping mechanism for supporting AMD's
fTPM
drivers/char/tpm/tpm_crb.c | 69 ++++++++++++++++++++++++++++++++------
1 file changed, 59 insertions(+), 10 deletions(-)
--
2.21.0
I got an AMD system which had a Ryzen Threadripper 1950X and MSI
mainboard, and I had a problem with AMD's fTPM. My machine showed an error
message below, and the fTPM didn't work because of it.
[ 5.732084] tpm_crb MSFT0101:00: can't request region for resource
[mem 0x79b4f000-0x79b4ffff]
[ 5.732089] tpm_crb: probe of MSFT0101:00 failed with error -16
When I saw the iomem, I found two fTPM regions were in the ACPI NVS area.
The regions are below.
79a39000-79b6afff : ACPI Non-volatile Storage
79b4b000-79b4bfff : MSFT0101:00
79b4f000-79b4ffff : MSFT0101:00
After analyzing this issue, I found that crb_map_io() function called
devm_ioremap_resource() and it failed. The ACPI NVS didn't allow the TPM
CRB driver to assign a resource in it because a busy bit was set to
the ACPI NVS area.
To support AMD's fTPM, I added a function to check intersects between
the TPM region and ACPI NVS before it mapped the region. If some
intersects are detected, the function just calls devm_ioremap() for
a workaround. If there is no intersect, it calls devm_ioremap_resource().
Signed-off-by: Seunghun Han <[email protected]>
---
drivers/char/tpm/tpm_crb.c | 25 +++++++++++++++++++++++--
1 file changed, 23 insertions(+), 2 deletions(-)
diff --git a/drivers/char/tpm/tpm_crb.c b/drivers/char/tpm/tpm_crb.c
index 14f486c23af2..7b60cd594b92 100644
--- a/drivers/char/tpm/tpm_crb.c
+++ b/drivers/char/tpm/tpm_crb.c
@@ -450,6 +450,27 @@ static int crb_check_resource(struct acpi_resource *ares, void *data)
return 1;
}
+static void __iomem *crb_ioremap_resource(struct device *dev,
+ const struct resource *res)
+{
+ int rc;
+ resource_size_t size = res->end - res->start;
+
+ /* Broken BIOS assigns command and response buffers in ACPI NVS region.
+ * Check intersections between a resource and ACPI NVS for W/A.
+ */
+ rc = region_intersects(res->start, size, IORESOURCE_MEM |
+ IORESOURCE_BUSY, IORES_DESC_ACPI_NV_STORAGE);
+ if (rc != REGION_DISJOINT) {
+ dev_err(dev,
+ FW_BUG "Resource overlaps with a ACPI NVS. %pr\n",
+ res);
+ return devm_ioremap(dev, res->start, size);
+ }
+
+ return devm_ioremap_resource(dev, res);
+}
+
static void __iomem *crb_map_res(struct device *dev, struct crb_priv *priv,
struct resource *io_res, u64 start, u32 size)
{
@@ -464,7 +485,7 @@ static void __iomem *crb_map_res(struct device *dev, struct crb_priv *priv,
return (void __iomem *) ERR_PTR(-EINVAL);
if (!resource_contains(io_res, &new_res))
- return devm_ioremap_resource(dev, &new_res);
+ return crb_ioremap_resource(dev, &new_res);
return priv->iobase + (new_res.start - io_res->start);
}
@@ -536,7 +557,7 @@ static int crb_map_io(struct acpi_device *device, struct crb_priv *priv,
goto out_early;
}
- priv->iobase = devm_ioremap_resource(dev, &io_res);
+ priv->iobase = crb_ioremap_resource(dev, &io_res);
if (IS_ERR(priv->iobase)) {
ret = PTR_ERR(priv->iobase);
goto out_early;
--
2.21.0
The purpose of crb_fixup_cmd_size() function is to work around broken
BIOSes and get the trustable size between the ACPI region and register.
When the TPM has a command buffer and response buffer independently,
the crb_map_io() function calls crb_fixup_cmd_size() twice to calculate
each buffer size. However, the current implementation of it considers
one of two buffers.
To support independent command and response buffers, I changed
crb_check_resource() function for storing ACPI TPB regions to a list.
I also changed crb_fixup_cmd_size() to use the list for calculating each
buffer size.
Signed-off-by: Seunghun Han <[email protected]>
---
drivers/char/tpm/tpm_crb.c | 44 +++++++++++++++++++++++++++++++-------
1 file changed, 36 insertions(+), 8 deletions(-)
diff --git a/drivers/char/tpm/tpm_crb.c b/drivers/char/tpm/tpm_crb.c
index e59f1f91d7f3..14f486c23af2 100644
--- a/drivers/char/tpm/tpm_crb.c
+++ b/drivers/char/tpm/tpm_crb.c
@@ -442,6 +442,9 @@ static int crb_check_resource(struct acpi_resource *ares, void *data)
acpi_dev_resource_address_space(ares, &win)) {
*io_res = *res;
io_res->name = NULL;
+
+ /* Add this TPM CRB resource to the list */
+ return 0;
}
return 1;
@@ -471,7 +474,7 @@ static void __iomem *crb_map_res(struct device *dev, struct crb_priv *priv,
* region vs the registers. Trust the ACPI region. Such broken systems
* probably cannot send large TPM commands since the buffer will be truncated.
*/
-static u64 crb_fixup_cmd_size(struct device *dev, struct resource *io_res,
+static u64 __crb_fixup_cmd_size(struct device *dev, struct resource *io_res,
u64 start, u64 size)
{
if (io_res->start > start || io_res->end < start)
@@ -487,6 +490,26 @@ static u64 crb_fixup_cmd_size(struct device *dev, struct resource *io_res,
return io_res->end - start + 1;
}
+static u64 crb_fixup_cmd_size(struct device *dev, struct list_head *resources,
+ u64 start, u64 size)
+{
+ struct resource_entry *pos;
+ struct resource *cur_res;
+ u64 ret = size;
+
+ /* Check all TPM CRB resources with the start and size values */
+ resource_list_for_each_entry(pos, resources) {
+ cur_res = pos->res;
+
+ ret = __crb_fixup_cmd_size(dev, cur_res, start, size);
+ /* Broken BIOS is detected. Trust the ACPI region. */
+ if (ret < size)
+ break;
+ }
+
+ return ret;
+}
+
static int crb_map_io(struct acpi_device *device, struct crb_priv *priv,
struct acpi_table_tpm2 *buf)
{
@@ -506,16 +529,18 @@ static int crb_map_io(struct acpi_device *device, struct crb_priv *priv,
&io_res);
if (ret < 0)
return ret;
- acpi_dev_free_resource_list(&resources);
if (resource_type(&io_res) != IORESOURCE_MEM) {
dev_err(dev, FW_BUG "TPM2 ACPI table does not define a memory resource\n");
- return -EINVAL;
+ ret = -EINVAL;
+ goto out_early;
}
priv->iobase = devm_ioremap_resource(dev, &io_res);
- if (IS_ERR(priv->iobase))
- return PTR_ERR(priv->iobase);
+ if (IS_ERR(priv->iobase)) {
+ ret = PTR_ERR(priv->iobase);
+ goto out_early;
+ }
/* The ACPI IO region starts at the head area and continues to include
* the control area, as one nice sane region except for some older
@@ -532,7 +557,7 @@ static int crb_map_io(struct acpi_device *device, struct crb_priv *priv,
ret = __crb_request_locality(dev, priv, 0);
if (ret)
- return ret;
+ goto out_early;
priv->regs_t = crb_map_res(dev, priv, &io_res, buf->control_address,
sizeof(struct crb_regs_tail));
@@ -552,7 +577,7 @@ static int crb_map_io(struct acpi_device *device, struct crb_priv *priv,
pa_high = ioread32(&priv->regs_t->ctrl_cmd_pa_high);
pa_low = ioread32(&priv->regs_t->ctrl_cmd_pa_low);
cmd_pa = ((u64)pa_high << 32) | pa_low;
- cmd_size = crb_fixup_cmd_size(dev, &io_res, cmd_pa,
+ cmd_size = crb_fixup_cmd_size(dev, &resources, cmd_pa,
ioread32(&priv->regs_t->ctrl_cmd_size));
dev_dbg(dev, "cmd_hi = %X cmd_low = %X cmd_size %X\n",
@@ -566,7 +591,7 @@ static int crb_map_io(struct acpi_device *device, struct crb_priv *priv,
memcpy_fromio(&__rsp_pa, &priv->regs_t->ctrl_rsp_pa, 8);
rsp_pa = le64_to_cpu(__rsp_pa);
- rsp_size = crb_fixup_cmd_size(dev, &io_res, rsp_pa,
+ rsp_size = crb_fixup_cmd_size(dev, &resources, rsp_pa,
ioread32(&priv->regs_t->ctrl_rsp_size));
if (cmd_pa != rsp_pa) {
@@ -596,6 +621,9 @@ static int crb_map_io(struct acpi_device *device, struct crb_priv *priv,
__crb_relinquish_locality(dev, priv, 0);
+out_early:
+ acpi_dev_free_resource_list(&resources);
+
return ret;
}
--
2.21.0
On Fri, Aug 30, 2019 at 06:56:39PM +0900, Seunghun Han wrote:
> I got an AMD system which had a Ryzen Threadripper 1950X and MSI
> mainboard, and I had a problem with AMD's fTPM. My machine showed an error
> message below, and the fTPM didn't work because of it.
>
> [ 5.732084] tpm_crb MSFT0101:00: can't request region for resource
> [mem 0x79b4f000-0x79b4ffff]
> [ 5.732089] tpm_crb: probe of MSFT0101:00 failed with error -16
>
> When I saw the iomem, I found two fTPM regions were in the ACPI NVS area.
> The regions are below.
>
> 79a39000-79b6afff : ACPI Non-volatile Storage
> 79b4b000-79b4bfff : MSFT0101:00
> 79b4f000-79b4ffff : MSFT0101:00
>
> After analyzing this issue, I found that crb_map_io() function called
> devm_ioremap_resource() and it failed. The ACPI NVS didn't allow the TPM
> CRB driver to assign a resource in it because a busy bit was set to
> the ACPI NVS area.
>
> To support AMD's fTPM, I added a function to check intersects between
> the TPM region and ACPI NVS before it mapped the region. If some
> intersects are detected, the function just calls devm_ioremap() for
> a workaround. If there is no intersect, it calls devm_ioremap_resource().
>
> Signed-off-by: Seunghun Han <[email protected]>
> ---
> drivers/char/tpm/tpm_crb.c | 25 +++++++++++++++++++++++--
> 1 file changed, 23 insertions(+), 2 deletions(-)
This still seems to result in two drivers controlling the same
memory. Does this create bugs and races during resume?
Jason
>
> On Fri, Aug 30, 2019 at 06:56:39PM +0900, Seunghun Han wrote:
> > I got an AMD system which had a Ryzen Threadripper 1950X and MSI
> > mainboard, and I had a problem with AMD's fTPM. My machine showed an error
> > message below, and the fTPM didn't work because of it.
> >
> > [ 5.732084] tpm_crb MSFT0101:00: can't request region for resource
> > [mem 0x79b4f000-0x79b4ffff]
> > [ 5.732089] tpm_crb: probe of MSFT0101:00 failed with error -16
> >
> > When I saw the iomem, I found two fTPM regions were in the ACPI NVS area.
> > The regions are below.
> >
> > 79a39000-79b6afff : ACPI Non-volatile Storage
> > 79b4b000-79b4bfff : MSFT0101:00
> > 79b4f000-79b4ffff : MSFT0101:00
> >
> > After analyzing this issue, I found that crb_map_io() function called
> > devm_ioremap_resource() and it failed. The ACPI NVS didn't allow the TPM
> > CRB driver to assign a resource in it because a busy bit was set to
> > the ACPI NVS area.
> >
> > To support AMD's fTPM, I added a function to check intersects between
> > the TPM region and ACPI NVS before it mapped the region. If some
> > intersects are detected, the function just calls devm_ioremap() for
> > a workaround. If there is no intersect, it calls devm_ioremap_resource().
> >
> > Signed-off-by: Seunghun Han <[email protected]>
> > ---
> > drivers/char/tpm/tpm_crb.c | 25 +++++++++++++++++++++++--
> > 1 file changed, 23 insertions(+), 2 deletions(-)
>
> This still seems to result in two drivers controlling the same
> memory. Does this create bugs and races during resume?
>
> Jason
When I tested this patch in my machine, it seemed that ACPI NVS was
saved after TPM CRB driver sent "TPM2_Shutdown(STATE)" to the fTPM
while suspending. Then, ACPI NVS was restored while resuming.
After resuming, PCRs didn't change and TPM2 tools such as
tpm2_pcrlist, tpm2_extend, tpm2_getrandoms worked well.
So, according to my test result, it seems that the patch doesn't
create bugs and race during resume.
Seunghun
On Fri, Aug 30, 2019 at 10:54:59PM +0900, Seunghun Han wrote:
> When I tested this patch in my machine, it seemed that ACPI NVS was
> saved after TPM CRB driver sent "TPM2_Shutdown(STATE)" to the fTPM
> while suspending. Then, ACPI NVS was restored while resuming.
> After resuming, PCRs didn't change and TPM2 tools such as
> tpm2_pcrlist, tpm2_extend, tpm2_getrandoms worked well.
> So, according to my test result, it seems that the patch doesn't
> create bugs and race during resume.
I have a feeling that is shear luck of link time ordering and not guarenteed??
Jason
> From: [email protected] <linux-integrity-
> [email protected]> On Behalf Of Seunghun Han
> Sent: Friday, August 30, 2019 9:55 AM
> To: Jason Gunthorpe <[email protected]>
> Cc: Jarkko Sakkinen <[email protected]>; Peter Huewe
> <[email protected]>; open list:TPM DEVICE DRIVER <linux-
> [email protected]>; Linux Kernel Mailing List <linux-
> [email protected]>
> Subject: EXT: Re: [PATCH 2/2] tpm: tpm_crb: enhance resource mapping
> mechanism for supporting AMD's fTPM
>
> >
> > On Fri, Aug 30, 2019 at 06:56:39PM +0900, Seunghun Han wrote:
> > > I got an AMD system which had a Ryzen Threadripper 1950X and MSI
> > > mainboard, and I had a problem with AMD's fTPM. My machine showed
> an
> > > error message below, and the fTPM didn't work because of it.
> > >
> > > [ 5.732084] tpm_crb MSFT0101:00: can't request region for resource
> > > [mem 0x79b4f000-0x79b4ffff] [ 5.732089] tpm_crb: probe
> > > of MSFT0101:00 failed with error -16
> > >
> > > When I saw the iomem, I found two fTPM regions were in the ACPI NVS
> area.
> > > The regions are below.
> > >
> > > 79a39000-79b6afff : ACPI Non-volatile Storage
> > > 79b4b000-79b4bfff : MSFT0101:00
> > > 79b4f000-79b4ffff : MSFT0101:00
> > >
> > > After analyzing this issue, I found that crb_map_io() function
> > > called
> > > devm_ioremap_resource() and it failed. The ACPI NVS didn't allow the
> > > TPM CRB driver to assign a resource in it because a busy bit was set
> > > to the ACPI NVS area.
> > >
> > > To support AMD's fTPM, I added a function to check intersects
> > > between the TPM region and ACPI NVS before it mapped the region. If
> > > some intersects are detected, the function just calls devm_ioremap()
> > > for a workaround. If there is no intersect, it calls
> devm_ioremap_resource().
> > >
> > > Signed-off-by: Seunghun Han <[email protected]>
> > > ---
> > > drivers/char/tpm/tpm_crb.c | 25 +++++++++++++++++++++++--
> > > 1 file changed, 23 insertions(+), 2 deletions(-)
> >
> > This still seems to result in two drivers controlling the same memory.
> > Does this create bugs and races during resume?
> >
> > Jason
>
> When I tested this patch in my machine, it seemed that ACPI NVS was saved
> after TPM CRB driver sent "TPM2_Shutdown(STATE)" to the fTPM while
> suspending. Then, ACPI NVS was restored while resuming.
> After resuming, PCRs didn't change and TPM2 tools such as tpm2_pcrlist,
> tpm2_extend, tpm2_getrandoms worked well.
> So, according to my test result, it seems that the patch doesn't create bugs
> and race during resume.
>
> Seunghun
This was discussed earlier on the list.
The consensus was that, while safe now, this would be fragile, and subject to
unexpected changes in ACPI/NVS, and we really need to tell NVS to exclude the
regions for long term safety.
As separate issues, the patches do not work at all on some of my AMD systems.
First, you only force the remap if the overlap is with NVS, but I have systems
where the overlap is with other reserved regions. You should force the remap
regardless, but if it is NVS, grab the space back from NVS.
Second, the patch extends the wrong behavior of the current driver to both
buffer regions. If there is a conflict between what the device's control
register says, and what ACPI says, the existing driver explicitly "trusts the ACPI".
This is just wrong. The actual device will use the areas as defined by its
control registers regardless of what ACPI says. I talked to Microsoft, and
their driver trusts the control register values, and doesn't even look at the
ACPI values.
In practice, I have tested several systems in which the device registers show
The correct 4K buffers, but the driver instead trusts the ACPI values, which
list just 1K buffers. 1K buffers will not work for large requests, and the
device is going to read and write the 4K buffers regardless.
dave
>
> On Fri, Aug 30, 2019 at 10:54:59PM +0900, Seunghun Han wrote:
>
> > When I tested this patch in my machine, it seemed that ACPI NVS was
> > saved after TPM CRB driver sent "TPM2_Shutdown(STATE)" to the fTPM
> > while suspending. Then, ACPI NVS was restored while resuming.
> > After resuming, PCRs didn't change and TPM2 tools such as
> > tpm2_pcrlist, tpm2_extend, tpm2_getrandoms worked well.
> > So, according to my test result, it seems that the patch doesn't
> > create bugs and race during resume.
>
> I have a feeling that is shear luck of link time ordering and not guarenteed??
>
> Jason
No, it is guaranteed. As you know, suspend_nvs_save() is called by
acpi_pm_pre_suspend(), and it is called by
platform_suspend_prepare_noirq(). platform_suspend_prepare_noirq() is
also called by suspend_enter(), and it already suspends all devices
like TPM CRB driver before calling platform_suspend_prepare_noirq().
This means that the order is guaranteed and we don't need to worry about it.
Seunghun
>
> > From: [email protected] <linux-integrity-
> > [email protected]> On Behalf Of Seunghun Han
> > Sent: Friday, August 30, 2019 9:55 AM
> > To: Jason Gunthorpe <[email protected]>
> > Cc: Jarkko Sakkinen <[email protected]>; Peter Huewe
> > <[email protected]>; open list:TPM DEVICE DRIVER <linux-
> > [email protected]>; Linux Kernel Mailing List <linux-
> > [email protected]>
> > Subject: EXT: Re: [PATCH 2/2] tpm: tpm_crb: enhance resource mapping
> > mechanism for supporting AMD's fTPM
> >
> > >
> > > On Fri, Aug 30, 2019 at 06:56:39PM +0900, Seunghun Han wrote:
> > > > I got an AMD system which had a Ryzen Threadripper 1950X and MSI
> > > > mainboard, and I had a problem with AMD's fTPM. My machine showed
> > an
> > > > error message below, and the fTPM didn't work because of it.
> > > >
> > > > [ 5.732084] tpm_crb MSFT0101:00: can't request region for resource
> > > > [mem 0x79b4f000-0x79b4ffff] [ 5.732089] tpm_crb: probe
> > > > of MSFT0101:00 failed with error -16
> > > >
> > > > When I saw the iomem, I found two fTPM regions were in the ACPI NVS
> > area.
> > > > The regions are below.
> > > >
> > > > 79a39000-79b6afff : ACPI Non-volatile Storage
> > > > 79b4b000-79b4bfff : MSFT0101:00
> > > > 79b4f000-79b4ffff : MSFT0101:00
> > > >
> > > > After analyzing this issue, I found that crb_map_io() function
> > > > called
> > > > devm_ioremap_resource() and it failed. The ACPI NVS didn't allow the
> > > > TPM CRB driver to assign a resource in it because a busy bit was set
> > > > to the ACPI NVS area.
> > > >
> > > > To support AMD's fTPM, I added a function to check intersects
> > > > between the TPM region and ACPI NVS before it mapped the region. If
> > > > some intersects are detected, the function just calls devm_ioremap()
> > > > for a workaround. If there is no intersect, it calls
> > devm_ioremap_resource().
> > > >
> > > > Signed-off-by: Seunghun Han <[email protected]>
> > > > ---
> > > > drivers/char/tpm/tpm_crb.c | 25 +++++++++++++++++++++++--
> > > > 1 file changed, 23 insertions(+), 2 deletions(-)
> > >
> > > This still seems to result in two drivers controlling the same memory.
> > > Does this create bugs and races during resume?
> > >
> > > Jason
> >
> > When I tested this patch in my machine, it seemed that ACPI NVS was saved
> > after TPM CRB driver sent "TPM2_Shutdown(STATE)" to the fTPM while
> > suspending. Then, ACPI NVS was restored while resuming.
> > After resuming, PCRs didn't change and TPM2 tools such as tpm2_pcrlist,
> > tpm2_extend, tpm2_getrandoms worked well.
> > So, according to my test result, it seems that the patch doesn't create bugs
> > and race during resume.
> >
> > Seunghun
>
> This was discussed earlier on the list.
> The consensus was that, while safe now, this would be fragile, and subject to
> unexpected changes in ACPI/NVS, and we really need to tell NVS to exclude the
> regions for long term safety.
Thank you for your advice. We also discussed earlier and concluded
that checking and raw remapping are enough to work around this. The
link is here, https://lkml.org/lkml/2019/8/29/962 .
> As separate issues, the patches do not work at all on some of my AMD systems.
> First, you only force the remap if the overlap is with NVS, but I have systems
> where the overlap is with other reserved regions. You should force the remap
> regardless, but if it is NVS, grab the space back from NVS.
I didn't know about that. I just found the case from your thread that
AMD system assigned TPM region into the reserved area. However, as I
know, the reserved area has no busy bit so that TPM CRB driver could
assign buffer resources in it. Right? In my view, if you patched your
TPM driver with my patch series, then it could work. Would you explain
what happened in TPM CRB driver and reserved area?
>
> Second, the patch extends the wrong behavior of the current driver to both
> buffer regions. If there is a conflict between what the device's control
> register says, and what ACPI says, the existing driver explicitly "trusts the ACPI".
> This is just wrong. The actual device will use the areas as defined by its
> control registers regardless of what ACPI says. I talked to Microsoft, and
> their driver trusts the control register values, and doesn't even look at the
> ACPI values.
As you know, the original code trusts the ACPI table because of the
workaround for broken BIOS, and this code has worked well for a long
time. In my view, if we change this code to trust control register
value, we could make new problems and need a lot of time to check the
workaround. So, I want to trust the ACPI value now.
>
> In practice, I have tested several systems in which the device registers show
> The correct 4K buffers, but the driver instead trusts the ACPI values, which
> list just 1K buffers. 1K buffers will not work for large requests, and the
> device is going to read and write the 4K buffers regardless.
>
> dave
I know about that. However, the device driver is not going to read and
write 4K buffers if you patch your TPM driver with my patch series.
One of my patches has an enhancement feature that could calculate the
buffer size well. The TPM driver uses exactly 1K buffer for this case,
not 4K buffer, and it works.
Seunghun
> From: Seunghun Han <[email protected]>
> Sent: Friday, August 30, 2019 12:55 PM
> To: Safford, David (GE Global Research, US) <[email protected]>
> Cc: Jason Gunthorpe <[email protected]>; Jarkko Sakkinen
> <[email protected]>; Peter Huewe <[email protected]>;
> open list:TPM DEVICE DRIVER <[email protected]>; Linux
> Kernel Mailing List <[email protected]>
> Subject: EXT: Re: [PATCH 2/2] tpm: tpm_crb: enhance resource mapping
> mechanism for supporting AMD's fTPM
>
> >
> > > From: [email protected] <linux-integrity-
> > > [email protected]> On Behalf Of Seunghun Han
> > > Sent: Friday, August 30, 2019 9:55 AM
> > > To: Jason Gunthorpe <[email protected]>
> > > Cc: Jarkko Sakkinen <[email protected]>; Peter Huewe
> > > <[email protected]>; open list:TPM DEVICE DRIVER <linux-
> > > [email protected]>; Linux Kernel Mailing List <linux-
> > > [email protected]>
> > > Subject: EXT: Re: [PATCH 2/2] tpm: tpm_crb: enhance resource mapping
> > > mechanism for supporting AMD's fTPM
> > >
> > > >
> > > > On Fri, Aug 30, 2019 at 06:56:39PM +0900, Seunghun Han wrote:
> > > > > I got an AMD system which had a Ryzen Threadripper 1950X and MSI
> > > > > mainboard, and I had a problem with AMD's fTPM. My machine
> > > > > showed
> > > an
> > > > > error message below, and the fTPM didn't work because of it.
> > > > >
> > > > > [ 5.732084] tpm_crb MSFT0101:00: can't request region for resource
> > > > > [mem 0x79b4f000-0x79b4ffff] [ 5.732089] tpm_crb:
> > > > > probe of MSFT0101:00 failed with error -16
> > > > >
> > > > > When I saw the iomem, I found two fTPM regions were in the ACPI
> > > > > NVS
> > > area.
> > > > > The regions are below.
> > > > >
> > > > > 79a39000-79b6afff : ACPI Non-volatile Storage
> > > > > 79b4b000-79b4bfff : MSFT0101:00
> > > > > 79b4f000-79b4ffff : MSFT0101:00
> > > > >
> > > > > After analyzing this issue, I found that crb_map_io() function
> > > > > called
> > > > > devm_ioremap_resource() and it failed. The ACPI NVS didn't allow
> > > > > the TPM CRB driver to assign a resource in it because a busy bit
> > > > > was set to the ACPI NVS area.
> > > > >
> > > > > To support AMD's fTPM, I added a function to check intersects
> > > > > between the TPM region and ACPI NVS before it mapped the region.
> > > > > If some intersects are detected, the function just calls
> > > > > devm_ioremap() for a workaround. If there is no intersect, it
> > > > > calls
> > > devm_ioremap_resource().
> > > > >
> > > > > Signed-off-by: Seunghun Han <[email protected]>
> > > > > ---
> > > > > drivers/char/tpm/tpm_crb.c | 25 +++++++++++++++++++++++--
> > > > > 1 file changed, 23 insertions(+), 2 deletions(-)
> > > >
> > > > This still seems to result in two drivers controlling the same memory.
> > > > Does this create bugs and races during resume?
> > > >
> > > > Jason
> > >
> > > When I tested this patch in my machine, it seemed that ACPI NVS was
> > > saved after TPM CRB driver sent "TPM2_Shutdown(STATE)" to the fTPM
> > > while suspending. Then, ACPI NVS was restored while resuming.
> > > After resuming, PCRs didn't change and TPM2 tools such as
> > > tpm2_pcrlist, tpm2_extend, tpm2_getrandoms worked well.
> > > So, according to my test result, it seems that the patch doesn't
> > > create bugs and race during resume.
> > >
> > > Seunghun
> >
> > This was discussed earlier on the list.
> > The consensus was that, while safe now, this would be fragile, and
> > subject to unexpected changes in ACPI/NVS, and we really need to tell
> > NVS to exclude the regions for long term safety.
>
> Thank you for your advice. We also discussed earlier and concluded that
> checking and raw remapping are enough to work around this. The link is
> here, https://lkml.org/lkml/2019/8/29/962 .
I don't see Matthew Garrett's agreement on that thread.
> > As separate issues, the patches do not work at all on some of my AMD
> systems.
> > First, you only force the remap if the overlap is with NVS, but I have
> > systems where the overlap is with other reserved regions. You should
> > force the remap regardless, but if it is NVS, grab the space back from NVS.
>
> I didn't know about that. I just found the case from your thread that AMD
> system assigned TPM region into the reserved area. However, as I know, the
> reserved area has no busy bit so that TPM CRB driver could assign buffer
> resources in it. Right? In my view, if you patched your TPM driver with my
> patch series, then it could work. Would you explain what happened in TPM
> CRB driver and reserved area?
Good question. I'll try it out this weekend.
> > Second, the patch extends the wrong behavior of the current driver to
> > both buffer regions. If there is a conflict between what the device's
> > control register says, and what ACPI says, the existing driver explicitly
> "trusts the ACPI".
> > This is just wrong. The actual device will use the areas as defined by
> > its control registers regardless of what ACPI says. I talked to
> > Microsoft, and their driver trusts the control register values, and
> > doesn't even look at the ACPI values.
>
> As you know, the original code trusts the ACPI table because of the
> workaround for broken BIOS, and this code has worked well for a long time.
> In my view, if we change this code to trust control register value, we could
> make new problems and need a lot of time to check the workaround. So, I
> want to trust the ACPI value now.
I don't think the workaround has every really worked, other than the
Helpful firmware error it emits. I don't think anyone has tested the
workaround with large requests. The tpm_crb device itself is telling
us the buffers it is using. Why are we ignoring that and trusting the
known bad ACPI tables? Makes no sense to me.
> >
> > In practice, I have tested several systems in which the device
> > registers show The correct 4K buffers, but the driver instead trusts
> > the ACPI values, which list just 1K buffers. 1K buffers will not work
> > for large requests, and the device is going to read and write the 4K buffers
> regardless.
> >
> > dave
>
> I know about that. However, the device driver is not going to read and write
> 4K buffers if you patch your TPM driver with my patch series.
> One of my patches has an enhancement feature that could calculate the
> buffer size well. The TPM driver uses exactly 1K buffer for this case, not 4K
> buffer, and it works.
Have you tested large requests (well over 1K) to make sure?
dave
>
> Seunghun
Hi Seunghun,
Thank you for the patch! Perhaps something to improve:
[auto build test WARNING on jss-tpmdd/next]
[cannot apply to v5.3-rc6 next-20190830]
[if your patch is applied to the wrong git tree, please drop us a note to help improve the system]
url: https://github.com/0day-ci/linux/commits/Seunghun-Han/Enhance-support-for-the-AMD-s-fTPM/20190901-042313
base: git://git.infradead.org/users/jjs/linux-tpmdd next
If you fix the issue, kindly add following tag
Reported-by: kbuild test robot <[email protected]>
coccinelle warnings: (new ones prefixed by >>)
>> drivers/char/tpm/tpm_crb.c:457:29-32: WARNING: Suspicious code. resource_size is maybe missing with res
vim +457 drivers/char/tpm/tpm_crb.c
452
453 static void __iomem *crb_ioremap_resource(struct device *dev,
454 const struct resource *res)
455 {
456 int rc;
> 457 resource_size_t size = res->end - res->start;
458
459 /* Broken BIOS assigns command and response buffers in ACPI NVS region.
460 * Check intersections between a resource and ACPI NVS for W/A.
461 */
462 rc = region_intersects(res->start, size, IORESOURCE_MEM |
463 IORESOURCE_BUSY, IORES_DESC_ACPI_NV_STORAGE);
464 if (rc != REGION_DISJOINT) {
465 dev_err(dev,
466 FW_BUG "Resource overlaps with a ACPI NVS. %pr\n",
467 res);
468 return devm_ioremap(dev, res->start, size);
469 }
470
471 return devm_ioremap_resource(dev, res);
472 }
473
---
0-DAY kernel test infrastructure Open Source Technology Center
https://lists.01.org/pipermail/kbuild-all Intel Corporation
On Fri, Aug 30, 2019 at 05:58:39PM +0000, Safford, David (GE Global Research, US) wrote:
> > Thank you for your advice. We also discussed earlier and concluded that
> > checking and raw remapping are enough to work around this. The link is
> > here, https://lkml.org/lkml/2019/8/29/962 .
>
> I don't see Matthew Garrett's agreement on that thread.
No one has agreed on anything.
/Jarkko
>
> On Fri, Aug 30, 2019 at 05:58:39PM +0000, Safford, David (GE Global Research, US) wrote:
> > > Thank you for your advice. We also discussed earlier and concluded that
> > > checking and raw remapping are enough to work around this. The link is
> > > here, https://lkml.org/lkml/2019/8/29/962 .
> >
> > I don't see Matthew Garrett's agreement on that thread.
>
> No one has agreed on anything.
>
> /Jarkko
Jarkko,
you gave me good advice related to the NVS area and mapping like below.
"A function that gets region and then checks if NVS driver has matching
one and returns true/false based on that should be good enough. Then
you raw ioremap() in the TPM driver."
So, I made a patch on your advice and test it. According to my test
result, command and response buffers were saved and restored while
hibernation. And, there was no side-effect because they were just
buffers and hibernation didn't affect the control area of TPM CRB
driver. So, I think that saving and restoring buffers during sleep is
no problem. I also think your advice and solution are clear and good
to work around AMD's fTPM. I will attach my detailed test result soon.
Jarkko,
I have a question. Do you think this patch is not enough to handle
AMD's fTPM problem? If so, would you tell me about it? I will change
my patch.
Seunghun
>
> > From: Seunghun Han <[email protected]>
> > Sent: Friday, August 30, 2019 12:55 PM
> > To: Safford, David (GE Global Research, US) <[email protected]>
> > Cc: Jason Gunthorpe <[email protected]>; Jarkko Sakkinen
> > <[email protected]>; Peter Huewe <[email protected]>;
> > open list:TPM DEVICE DRIVER <[email protected]>; Linux
> > Kernel Mailing List <[email protected]>
> > Subject: EXT: Re: [PATCH 2/2] tpm: tpm_crb: enhance resource mapping
> > mechanism for supporting AMD's fTPM
> >
> > >
> > > > From: [email protected] <linux-integrity-
> > > > [email protected]> On Behalf Of Seunghun Han
> > > > Sent: Friday, August 30, 2019 9:55 AM
> > > > To: Jason Gunthorpe <[email protected]>
> > > > Cc: Jarkko Sakkinen <[email protected]>; Peter Huewe
> > > > <[email protected]>; open list:TPM DEVICE DRIVER <linux-
> > > > [email protected]>; Linux Kernel Mailing List <linux-
> > > > [email protected]>
> > > > Subject: EXT: Re: [PATCH 2/2] tpm: tpm_crb: enhance resource mapping
> > > > mechanism for supporting AMD's fTPM
> > > >
> > > > >
> > > > > On Fri, Aug 30, 2019 at 06:56:39PM +0900, Seunghun Han wrote:
> > > > > > I got an AMD system which had a Ryzen Threadripper 1950X and MSI
> > > > > > mainboard, and I had a problem with AMD's fTPM. My machine
> > > > > > showed
> > > > an
> > > > > > error message below, and the fTPM didn't work because of it.
> > > > > >
> > > > > > [ 5.732084] tpm_crb MSFT0101:00: can't request region for resource
> > > > > > [mem 0x79b4f000-0x79b4ffff] [ 5.732089] tpm_crb:
> > > > > > probe of MSFT0101:00 failed with error -16
> > > > > >
> > > > > > When I saw the iomem, I found two fTPM regions were in the ACPI
> > > > > > NVS
> > > > area.
> > > > > > The regions are below.
> > > > > >
> > > > > > 79a39000-79b6afff : ACPI Non-volatile Storage
> > > > > > 79b4b000-79b4bfff : MSFT0101:00
> > > > > > 79b4f000-79b4ffff : MSFT0101:00
> > > > > >
> > > > > > After analyzing this issue, I found that crb_map_io() function
> > > > > > called
> > > > > > devm_ioremap_resource() and it failed. The ACPI NVS didn't allow
> > > > > > the TPM CRB driver to assign a resource in it because a busy bit
> > > > > > was set to the ACPI NVS area.
> > > > > >
> > > > > > To support AMD's fTPM, I added a function to check intersects
> > > > > > between the TPM region and ACPI NVS before it mapped the region.
> > > > > > If some intersects are detected, the function just calls
> > > > > > devm_ioremap() for a workaround. If there is no intersect, it
> > > > > > calls
> > > > devm_ioremap_resource().
> > > > > >
> > > > > > Signed-off-by: Seunghun Han <[email protected]>
> > > > > > ---
> > > > > > drivers/char/tpm/tpm_crb.c | 25 +++++++++++++++++++++++--
> > > > > > 1 file changed, 23 insertions(+), 2 deletions(-)
> > > > >
> > > > > This still seems to result in two drivers controlling the same memory.
> > > > > Does this create bugs and races during resume?
> > > > >
> > > > > Jason
> > > >
> > > > When I tested this patch in my machine, it seemed that ACPI NVS was
> > > > saved after TPM CRB driver sent "TPM2_Shutdown(STATE)" to the fTPM
> > > > while suspending. Then, ACPI NVS was restored while resuming.
> > > > After resuming, PCRs didn't change and TPM2 tools such as
> > > > tpm2_pcrlist, tpm2_extend, tpm2_getrandoms worked well.
> > > > So, according to my test result, it seems that the patch doesn't
> > > > create bugs and race during resume.
> > > >
> > > > Seunghun
> > >
> > > This was discussed earlier on the list.
> > > The consensus was that, while safe now, this would be fragile, and
> > > subject to unexpected changes in ACPI/NVS, and we really need to tell
> > > NVS to exclude the regions for long term safety.
> >
> > Thank you for your advice. We also discussed earlier and concluded that
> > checking and raw remapping are enough to work around this. The link is
> > here, https://lkml.org/lkml/2019/8/29/962 .
>
> I don't see Matthew Garrett's agreement on that thread.
Thank you for your notification. I am sorry. I missed it and
misunderstood Jarkko's idea. So, I would like to invite Matthew
Garrett to this thread and attach my opinion on that. The problem is
that command and response buffers are in ACPI NVS area. ACPI NVS area
is saved and restored by drivers/acpi/nvs.c during hibernation, so
command and response buffers in ACPI NVS are also handled by nvs.c
file. However, TPM CRB driver uses the buffers to control a TPM
device, therefore, something may break.
I agree on that point. To remove uncertainty and find the solution,
I read the threads we discussed and did research about two points, 1)
the race condition and 2) the unexpected behavior of the TPM device.
1) The race condition concern comes from unknowing buffer access order
while hibernation.
If nvs.c and TPM CRB driver access the buffers concurrently, the race
condition occurs. Then, we can't know the contents of the buffers
deterministically, and it may occur the failure of TPM device.
However, hibernation_snapshot() function calls dpm_suspend() and
suspend_nvs_save() in order when the system enters into hibernation.
It also calls suspend_nvs_restore() and dpm_resume() in order when the
system exits from hibernation. So, no race condition occurs while
hibernation, and we always guarantee the contents of buffers as we
expect.
2) The unexpected behavior of the TPM device.
If nvs.c saves and restores the contents of the TPM CRB buffers while
hibernation, it may occur the unexpected behavior of the TPM device
because the buffers are used to control the TPM device. When the
system entered into hibernation, suspend_nvs_save() saved the command
and response buffers, and they had the last command and response data.
After exiting from hibernation, suspend_nvs_restore() restored the
last command and response data into the buffers and nothing happened.
I realized that they were just buffers. If we want to send a command
to the TPM device, we have to set the CRB_START_INVOKE bit to a
control_start register of a control area. The control area was not in
the ACPI NVS area, so it was not affected by nvs.c file. We can
guarantee the behavior of the TPM device.
Because of these two reasons, I agreed on Jarkko's idea in
https://lkml.org/lkml/2019/8/29/962 . It seems that removing or
changing regions described in the ACPI table is not natural after
setup. In my view, saving and restoring buffers was OK like other NVS
areas were expected because the buffers were in ACPI NVS area.
So, I made and sent this patch series. I would like to solve this
AMD's fTPM problem because I have been doing research on TPM and this
problem is critical for me (as you know fTPM doesn't work). If you
have any other concern or advice on the patch I made, please let me
know.
>
> > > As separate issues, the patches do not work at all on some of my AMD
> > systems.
> > > First, you only force the remap if the overlap is with NVS, but I have
> > > systems where the overlap is with other reserved regions. You should
> > > force the remap regardless, but if it is NVS, grab the space back from NVS.
> >
> > I didn't know about that. I just found the case from your thread that AMD
> > system assigned TPM region into the reserved area. However, as I know, the
> > reserved area has no busy bit so that TPM CRB driver could assign buffer
> > resources in it. Right? In my view, if you patched your TPM driver with my
> > patch series, then it could work. Would you explain what happened in TPM
> > CRB driver and reserved area?
>
> Good question. I'll try it out this weekend.
Thank you for your help.
>
> > > Second, the patch extends the wrong behavior of the current driver to
> > > both buffer regions. If there is a conflict between what the device's
> > > control register says, and what ACPI says, the existing driver explicitly
> > "trusts the ACPI".
> > > This is just wrong. The actual device will use the areas as defined by
> > > its control registers regardless of what ACPI says. I talked to
> > > Microsoft, and their driver trusts the control register values, and
> > > doesn't even look at the ACPI values.
> >
> > As you know, the original code trusts the ACPI table because of the
> > workaround for broken BIOS, and this code has worked well for a long time.
> > In my view, if we change this code to trust control register value, we could
> > make new problems and need a lot of time to check the workaround. So, I
> > want to trust the ACPI value now.
>
> I don't think the workaround has every really worked, other than the
> Helpful firmware error it emits. I don't think anyone has tested the
> workaround with large requests. The tpm_crb device itself is telling
> us the buffers it is using. Why are we ignoring that and trusting the
> known bad ACPI tables? Makes no sense to me.
>
Yes, you are right. However, crb_fixup_cmd_size() function have worked
for handling broken BIOS well. I meant if we changed the function to
trust control register, we may need to test all broken BIOS the
function covered. To solve AMD's fTPM case, supporting separated
command and response buffers was enough. If your concern is the buffer
size calculation and you would like to solve it, I would make another
patch after this one.
> > >
> > > In practice, I have tested several systems in which the device
> > > registers show The correct 4K buffers, but the driver instead trusts
> > > the ACPI values, which list just 1K buffers. 1K buffers will not work
> > > for large requests, and the device is going to read and write the 4K buffers
> > regardless.
> > >
> > > dave
> >
> > I know about that. However, the device driver is not going to read and write
> > 4K buffers if you patch your TPM driver with my patch series.
> > One of my patches has an enhancement feature that could calculate the
> > buffer size well. The TPM driver uses exactly 1K buffer for this case, not 4K
> > buffer, and it works.
>
> Have you tested large requests (well over 1K) to make sure?
First of all, I misunderstood your message.
I have to tell you about the buffer size exactly. The command and
response buffer sizes in ACPI table were 0x1000 and this was 4K, not
1K. The sizes in the control register were 0x4000 and this was 16K
(large buffer size), not 4K.
I have been using the TPM for my research and the typical cases like
creating public/private keys, encrypting/decrypting data,
sealing/unsealing a secrete, and getting random numbers are not over
4K buffer. So, as you know, I think the 4K buffer can handle the most
cases and the current implementation of crb_fixup_cmd_size() works
well. If you concern the specific case that uses over 4K buffer,
please let me know.
If you have more ideas and advice on this patch, please let me know. I
will improve my patch.
Seunghun
>
> dave
> >
> > Seunghun
>
> > From: Seunghun Han <[email protected]>
> > Sent: Friday, August 30, 2019 12:55 PM
> > To: Safford, David (GE Global Research, US) <[email protected]>
> > Cc: Jason Gunthorpe <[email protected]>; Jarkko Sakkinen
> > <[email protected]>; Peter Huewe <[email protected]>;
> > open list:TPM DEVICE DRIVER <[email protected]>; Linux
> > Kernel Mailing List <[email protected]>
> > Subject: EXT: Re: [PATCH 2/2] tpm: tpm_crb: enhance resource mapping
> > mechanism for supporting AMD's fTPM
> >
> > >
> > > > From: [email protected] <linux-integrity-
> > > > [email protected]> On Behalf Of Seunghun Han
> > > > Sent: Friday, August 30, 2019 9:55 AM
> > > > To: Jason Gunthorpe <[email protected]>
> > > > Cc: Jarkko Sakkinen <[email protected]>; Peter Huewe
> > > > <[email protected]>; open list:TPM DEVICE DRIVER <linux-
> > > > [email protected]>; Linux Kernel Mailing List <linux-
> > > > [email protected]>
> > > > Subject: EXT: Re: [PATCH 2/2] tpm: tpm_crb: enhance resource mapping
> > > > mechanism for supporting AMD's fTPM
> > > >
> > > > >
> > > > > On Fri, Aug 30, 2019 at 06:56:39PM +0900, Seunghun Han wrote:
> > > > > > I got an AMD system which had a Ryzen Threadripper 1950X and MSI
> > > > > > mainboard, and I had a problem with AMD's fTPM. My machine
> > > > > > showed
> > > > an
> > > > > > error message below, and the fTPM didn't work because of it.
> > > > > >
> > > > > > [ 5.732084] tpm_crb MSFT0101:00: can't request region for resource
> > > > > > [mem 0x79b4f000-0x79b4ffff] [ 5.732089] tpm_crb:
> > > > > > probe of MSFT0101:00 failed with error -16
> > > > > >
> > > > > > When I saw the iomem, I found two fTPM regions were in the ACPI
> > > > > > NVS
> > > > area.
> > > > > > The regions are below.
> > > > > >
> > > > > > 79a39000-79b6afff : ACPI Non-volatile Storage
> > > > > > 79b4b000-79b4bfff : MSFT0101:00
> > > > > > 79b4f000-79b4ffff : MSFT0101:00
> > > > > >
> > > > > > After analyzing this issue, I found that crb_map_io() function
> > > > > > called
> > > > > > devm_ioremap_resource() and it failed. The ACPI NVS didn't allow
> > > > > > the TPM CRB driver to assign a resource in it because a busy bit
> > > > > > was set to the ACPI NVS area.
> > > > > >
> > > > > > To support AMD's fTPM, I added a function to check intersects
> > > > > > between the TPM region and ACPI NVS before it mapped the region.
> > > > > > If some intersects are detected, the function just calls
> > > > > > devm_ioremap() for a workaround. If there is no intersect, it
> > > > > > calls
> > > > devm_ioremap_resource().
> > > > > >
> > > > > > Signed-off-by: Seunghun Han <[email protected]>
> > > > > > ---
> > > > > > drivers/char/tpm/tpm_crb.c | 25 +++++++++++++++++++++++--
> > > > > > 1 file changed, 23 insertions(+), 2 deletions(-)
> > > > >
> > > > > This still seems to result in two drivers controlling the same memory.
> > > > > Does this create bugs and races during resume?
> > > > >
> > > > > Jason
> > > >
> > > > When I tested this patch in my machine, it seemed that ACPI NVS was
> > > > saved after TPM CRB driver sent "TPM2_Shutdown(STATE)" to the fTPM
> > > > while suspending. Then, ACPI NVS was restored while resuming.
> > > > After resuming, PCRs didn't change and TPM2 tools such as
> > > > tpm2_pcrlist, tpm2_extend, tpm2_getrandoms worked well.
> > > > So, according to my test result, it seems that the patch doesn't
> > > > create bugs and race during resume.
> > > >
> > > > Seunghun
> > >
> > > This was discussed earlier on the list.
> > > The consensus was that, while safe now, this would be fragile, and
> > > subject to unexpected changes in ACPI/NVS, and we really need to tell
> > > NVS to exclude the regions for long term safety.
> >
> > Thank you for your advice. We also discussed earlier and concluded that
> > checking and raw remapping are enough to work around this. The link is
> > here, https://lkml.org/lkml/2019/8/29/962 .
>
> I don't see Matthew Garrett's agreement on that thread.
Thank you for your notification. I am sorry. I missed it and
misunderstood Jarkko's idea. So, I would like to invite Matthew
Garrett to this thread and attach my opinion on that. The problem is
that command and response buffers are in ACPI NVS area. ACPI NVS area
is saved and restored by drivers/acpi/nvs.c during hibernation, so
command and response buffers in ACPI NVS are also handled by nvs.c
file. However, TPM CRB driver uses the buffers to control a TPM
device, therefore, something may break.
I agree on that point. To remove uncertainty and find the solution,
I read the threads we discussed and did research about two points, 1)
the race condition and 2) the unexpected behavior of the TPM device.
1) The race condition concern comes from unknowing buffer access order
while hibernation.
If nvs.c and TPM CRB driver access the buffers concurrently, the race
condition occurs. Then, we can't know the contents of the buffers
deterministically, and it may occur the failure of TPM device.
However, hibernation_snapshot() function calls dpm_suspend() and
suspend_nvs_save() in order when the system enters into hibernation.
It also calls suspend_nvs_restore() and dpm_resume() in order when the
system exits from hibernation. So, no race condition occurs while
hibernation, and we always guarantee the contents of buffers as we
expect.
2) The unexpected behavior of the TPM device.
If nvs.c saves and restores the contents of the TPM CRB buffers while
hibernation, it may occur the unexpected behavior of the TPM device
because the buffers are used to control the TPM device. When the
system entered into hibernation, suspend_nvs_save() saved the command
and response buffers, and they had the last command and response data.
After exiting from hibernation, suspend_nvs_restore() restored the
last command and response data into the buffers and nothing happened.
I realized that they were just buffers. If we want to send a command
to the TPM device, we have to set the CRB_START_INVOKE bit to a
control_start register of a control area. The control area was not in
the ACPI NVS area, so it was not affected by nvs.c file. We can
guarantee the behavior of the TPM device.
Because of these two reasons, I agreed on Jarkko's idea in
https://lkml.org/lkml/2019/8/29/962 . It seems that removing or
changing regions described in the ACPI table is not natural after
setup. In my view, saving and restoring buffers was OK like other NVS
areas were expected because the buffers were in ACPI NVS area.
So, I made and sent this patch series. I would like to solve this
AMD's fTPM problem because I have been doing research on TPM and this
problem is critical for me (as you know fTPM doesn't work). If you
have any other concern or advice on the patch I made, please let me
know.
>
> > > As separate issues, the patches do not work at all on some of my AMD
> > systems.
> > > First, you only force the remap if the overlap is with NVS, but I have
> > > systems where the overlap is with other reserved regions. You should
> > > force the remap regardless, but if it is NVS, grab the space back from NVS.
> >
> > I didn't know about that. I just found the case from your thread that AMD
> > system assigned TPM region into the reserved area. However, as I know, the
> > reserved area has no busy bit so that TPM CRB driver could assign buffer
> > resources in it. Right? In my view, if you patched your TPM driver with my
> > patch series, then it could work. Would you explain what happened in TPM
> > CRB driver and reserved area?
>
> Good question. I'll try it out this weekend.
Thank you for your help.
>
> > > Second, the patch extends the wrong behavior of the current driver to
> > > both buffer regions. If there is a conflict between what the device's
> > > control register says, and what ACPI says, the existing driver explicitly
> > "trusts the ACPI".
> > > This is just wrong. The actual device will use the areas as defined by
> > > its control registers regardless of what ACPI says. I talked to
> > > Microsoft, and their driver trusts the control register values, and
> > > doesn't even look at the ACPI values.
> >
> > As you know, the original code trusts the ACPI table because of the
> > workaround for broken BIOS, and this code has worked well for a long time.
> > In my view, if we change this code to trust control register value, we could
> > make new problems and need a lot of time to check the workaround. So, I
> > want to trust the ACPI value now.
>
> I don't think the workaround has every really worked, other than the
> Helpful firmware error it emits. I don't think anyone has tested the
> workaround with large requests. The tpm_crb device itself is telling
> us the buffers it is using. Why are we ignoring that and trusting the
> known bad ACPI tables? Makes no sense to me.
>
Yes, you are right. However, crb_fixup_cmd_size() function have worked
for handling broken BIOS well. I meant if we changed the function to
trust control register, we may need to test all broken BIOS the
function covered. To solve AMD's fTPM case, supporting separated
command and response buffers was enough. If your concern is the buffer
size calculation and you would like to solve it, I would make another
patch after this one.
> > >
> > > In practice, I have tested several systems in which the device
> > > registers show The correct 4K buffers, but the driver instead trusts
> > > the ACPI values, which list just 1K buffers. 1K buffers will not work
> > > for large requests, and the device is going to read and write the 4K buffers
> > regardless.
> > >
> > > dave
> >
> > I know about that. However, the device driver is not going to read and write
> > 4K buffers if you patch your TPM driver with my patch series.
> > One of my patches has an enhancement feature that could calculate the
> > buffer size well. The TPM driver uses exactly 1K buffer for this case, not 4K
> > buffer, and it works.
>
> Have you tested large requests (well over 1K) to make sure?
First of all, I misunderstood your message.
I have to tell you about the buffer size exactly. The command and
response buffer sizes in ACPI table were 0x1000 and this was 4K, not
1K. The sizes in the control register were 0x4000 and this was 16K
(large buffer size), not 4K.
I have been using the TPM for my research and the typical cases like
creating public/private keys, encrypting/decrypting data,
sealing/unsealing a secrete, and getting random numbers are not over
4K buffer. So, as you know, I think the 4K buffer can handle the most
cases and the current implementation of crb_fixup_cmd_size() works
well. If you concern the specific case that uses over 4K buffer,
please let me know.
If you have more ideas and advice on this patch, please let me know. I
will improve my patch.
Seunghun
>
> dave
> >
> > Seunghun
> > > > First, you only force the remap if the overlap is with NVS, but I
> > > > have systems where the overlap is with other reserved regions. You
> > > > should force the remap regardless, but if it is NVS, grab the space back
> > > > from NVS.
> > >
> > > I didn't know about that. I just found the case from your thread
> > > that AMD system assigned TPM region into the reserved area. However,
> > > as I know, the reserved area has no busy bit so that TPM CRB driver
> > > could assign buffer resources in it. Right? In my view, if you
> > > patched your TPM driver with my patch series, then it could work.
> > > Would you explain what happened in TPM CRB driver and reserved area?
> >
> > Good question. I'll try it out this weekend.
>
> Thank you for your help.
I tried your patch out on my systems with a "reserved" but not "NVS"
region conflict, and you are correct - the region is not busy, and
the driver is able to map the buffers with your patch.
> First of all, I misunderstood your message.
> I have to tell you about the buffer size exactly. The command and response
> buffer sizes in ACPI table were 0x1000 and this was 4K, not 1K. The sizes in
> the control register were 0x4000 and this was 16K (large buffer size), not 4K.
> I have been using the TPM for my research and the typical cases like creating
> public/private keys, encrypting/decrypting data, sealing/unsealing a secrete,
> and getting random numbers are not over 4K buffer. So, as you know, I think
> the 4K buffer can handle the most cases and the current implementation of
> crb_fixup_cmd_size() works well. If you concern the specific case that uses
> over 4K buffer, please let me know.
I have read postings of some systems where ACPI says 1K, but in all of my cases
that I can test, you are correct that ACPI is saying 4K instead of the device's 16K.
I tried really hard, but couldn't send any valid requests over 4K, (I believe that's
actually the max by the spec), and therefore never saw any failures on my
systems. I think the driver behavior is wrong for those other cases, but perhaps
this should wait until someone can get access and do the testing.
So I'm happy with your patches, other than what is decided for the nvs driver
conflict. I'm testing them on some production systems, and have seen no other
issues.
dave
On Tue, 2019-09-03 at 07:42 +0900, Seunghun Han wrote:
> I have a question. Do you think this patch is not enough to handle
> AMD's fTPM problem? If so, would you tell me about it? I will change
> my patch.
I have no new feedback to give at this point and no absolutely time to
brainstorm new ideas.
You've now sent the same patch set version twice. The one that was
sent 8-30 has the same patch version and no change log so no action
taken from my part.
Please version your patch sets and keep the change log in the cover
letter.
/Jarkko
On Tue, 2019-09-03 at 18:56 +0900, Seunghun Han wrote:
> Thank you for your notification. I am sorry. I missed it and
> misunderstood Jarkko's idea. So, I would like to invite Matthew
> Garrett to this thread and attach my opinion on that. The problem is
> that command and response buffers are in ACPI NVS area. ACPI NVS area
> is saved and restored by drivers/acpi/nvs.c during hibernation, so
> command and response buffers in ACPI NVS are also handled by nvs.c
> file. However, TPM CRB driver uses the buffers to control a TPM
> device, therefore, something may break.
>
> I agree on that point. To remove uncertainty and find the solution,
> I read the threads we discussed and did research about two points, 1)
> the race condition and 2) the unexpected behavior of the TPM device.
>
> 1) The race condition concern comes from unknowing buffer access order
> while hibernation.
> If nvs.c and TPM CRB driver access the buffers concurrently, the race
> condition occurs. Then, we can't know the contents of the buffers
> deterministically, and it may occur the failure of TPM device.
> However, hibernation_snapshot() function calls dpm_suspend() and
> suspend_nvs_save() in order when the system enters into hibernation.
> It also calls suspend_nvs_restore() and dpm_resume() in order when the
> system exits from hibernation. So, no race condition occurs while
> hibernation, and we always guarantee the contents of buffers as we
> expect.
>
> 2) The unexpected behavior of the TPM device.
> If nvs.c saves and restores the contents of the TPM CRB buffers while
> hibernation, it may occur the unexpected behavior of the TPM device
> because the buffers are used to control the TPM device. When the
> system entered into hibernation, suspend_nvs_save() saved the command
> and response buffers, and they had the last command and response data.
> After exiting from hibernation, suspend_nvs_restore() restored the
> last command and response data into the buffers and nothing happened.
> I realized that they were just buffers. If we want to send a command
> to the TPM device, we have to set the CRB_START_INVOKE bit to a
> control_start register of a control area. The control area was not in
> the ACPI NVS area, so it was not affected by nvs.c file. We can
> guarantee the behavior of the TPM device.
>
> Because of these two reasons, I agreed on Jarkko's idea in
> https://lkml.org/lkml/2019/8/29/962 . It seems that removing or
> changing regions described in the ACPI table is not natural after
> setup. In my view, saving and restoring buffers was OK like other NVS
> areas were expected because the buffers were in ACPI NVS area.
>
> So, I made and sent this patch series. I would like to solve this
> AMD's fTPM problem because I have been doing research on TPM and this
> problem is critical for me (as you know fTPM doesn't work). If you
> have any other concern or advice on the patch I made, please let me
> know.
Please take time to edit your responses. Nobody will read that properly
because it is way too exhausting. A long prose only indicates unclear
thoughts in the end. If you know what you are doing, you can put things
into nutshell only in few senteces.
/Jarkko
>
> On Tue, 2019-09-03 at 07:42 +0900, Seunghun Han wrote:
> > I have a question. Do you think this patch is not enough to handle
> > AMD's fTPM problem? If so, would you tell me about it? I will change
> > my patch.
>
> I have no new feedback to give at this point and no absolutely time to
> brainstorm new ideas.
>
> You've now sent the same patch set version twice. The one that was
> sent 8-30 has the same patch version and no change log so no action
> taken from my part.
>
> Please version your patch sets and keep the change log in the cover
> letter.
>
> /Jarkko
>
Thank you for your advice. Then, is it enough that I change the point
the kbuild test robot told me below and resent the patch with
versioning?
coccinelle warnings: (new ones prefixed by >>)
>> drivers/char/tpm/tpm_crb.c:457:29-32: WARNING: Suspicious code. resource_size is maybe missing with res
vim +457 drivers/char/tpm/tpm_crb.c
452
453 static void __iomem *crb_ioremap_resource(struct device *dev,
454 const struct resource *res)
455 {
456 int rc;
> 457 resource_size_t size = res->end - res->start;
458
> I tried your patch out on my systems with a "reserved" but not "NVS"
> region conflict, and you are correct - the region is not busy, and
> the driver is able to map the buffers with your patch.
>
> > First of all, I misunderstood your message.
> > I have to tell you about the buffer size exactly. The command and response
> > buffer sizes in ACPI table were 0x1000 and this was 4K, not 1K. The sizes in
> > the control register were 0x4000 and this was 16K (large buffer size), not 4K.
> > I have been using the TPM for my research and the typical cases like creating
> > public/private keys, encrypting/decrypting data, sealing/unsealing a secrete,
> > and getting random numbers are not over 4K buffer. So, as you know, I think
> > the 4K buffer can handle the most cases and the current implementation of
> > crb_fixup_cmd_size() works well. If you concern the specific case that uses
> > over 4K buffer, please let me know.
>
> I have read postings of some systems where ACPI says 1K, but in all of my cases
> that I can test, you are correct that ACPI is saying 4K instead of the device's 16K.
> I tried really hard, but couldn't send any valid requests over 4K, (I believe that's
> actually the max by the spec), and therefore never saw any failures on my
> systems. I think the driver behavior is wrong for those other cases, but perhaps
> this should wait until someone can get access and do the testing.
>
> So I'm happy with your patches, other than what is decided for the nvs driver
> conflict. I'm testing them on some production systems, and have seen no other
> issues.
>
> dave
Thank you for your help and testing. I would like to make patch v2 to
change the point that kbuild robot told me.
If you don't mind, may I add "tested-by" tag to patch v2 with your
name and email address?
Seunghun
>
> On Tue, 2019-09-03 at 18:56 +0900, Seunghun Han wrote:
> > Thank you for your notification. I am sorry. I missed it and
> > misunderstood Jarkko's idea. So, I would like to invite Matthew
> > Garrett to this thread and attach my opinion on that. The problem is
> > that command and response buffers are in ACPI NVS area. ACPI NVS area
> > is saved and restored by drivers/acpi/nvs.c during hibernation, so
> > command and response buffers in ACPI NVS are also handled by nvs.c
> > file. However, TPM CRB driver uses the buffers to control a TPM
> > device, therefore, something may break.
> >
> > I agree on that point. To remove uncertainty and find the solution,
> > I read the threads we discussed and did research about two points, 1)
> > the race condition and 2) the unexpected behavior of the TPM device.
> >
> > 1) The race condition concern comes from unknowing buffer access order
> > while hibernation.
> > If nvs.c and TPM CRB driver access the buffers concurrently, the race
> > condition occurs. Then, we can't know the contents of the buffers
> > deterministically, and it may occur the failure of TPM device.
> > However, hibernation_snapshot() function calls dpm_suspend() and
> > suspend_nvs_save() in order when the system enters into hibernation.
> > It also calls suspend_nvs_restore() and dpm_resume() in order when the
> > system exits from hibernation. So, no race condition occurs while
> > hibernation, and we always guarantee the contents of buffers as we
> > expect.
> >
> > 2) The unexpected behavior of the TPM device.
> > If nvs.c saves and restores the contents of the TPM CRB buffers while
> > hibernation, it may occur the unexpected behavior of the TPM device
> > because the buffers are used to control the TPM device. When the
> > system entered into hibernation, suspend_nvs_save() saved the command
> > and response buffers, and they had the last command and response data.
> > After exiting from hibernation, suspend_nvs_restore() restored the
> > last command and response data into the buffers and nothing happened.
> > I realized that they were just buffers. If we want to send a command
> > to the TPM device, we have to set the CRB_START_INVOKE bit to a
> > control_start register of a control area. The control area was not in
> > the ACPI NVS area, so it was not affected by nvs.c file. We can
> > guarantee the behavior of the TPM device.
> >
> > Because of these two reasons, I agreed on Jarkko's idea in
> > https://lkml.org/lkml/2019/8/29/962 . It seems that removing or
> > changing regions described in the ACPI table is not natural after
> > setup. In my view, saving and restoring buffers was OK like other NVS
> > areas were expected because the buffers were in ACPI NVS area.
> >
> > So, I made and sent this patch series. I would like to solve this
> > AMD's fTPM problem because I have been doing research on TPM and this
> > problem is critical for me (as you know fTPM doesn't work). If you
> > have any other concern or advice on the patch I made, please let me
> > know.
>
> Please take time to edit your responses. Nobody will read that properly
> because it is way too exhausting. A long prose only indicates unclear
> thoughts in the end. If you know what you are doing, you can put things
> into nutshell only in few senteces.
>
> /Jarkko
>
I'm sorry about that. I would like to invite Matthew Garrett and
discuss ACPI NVS and command/response buffer mapping again. So, I want
to summarize my test result and explain my opinion on that. I think
the data and result are important to make a decision clearly.
According to my test results, it seems that intersects between ACPI
NVS and command/response buffers will not make a problem.
Additionally, according to Dave's test results, this patch series can
cover not only an intersection with ACPI NVS area but also an
intersection with the reserved area. Here is the link,
https://lkml.org/lkml/2019/9/3/481 . Considering these results, my
patch series can solve AMD's fTPM problems.
Matthew,
what do you think about test results and this patch? In my view, if
the command/response buffers are in ACPI NVS area, saving and
restoring the buffers are ok and couldn't break anything.
I would like to get some feedback from you.
Seunghun