From: "Dr. Greg Wettstein" <[email protected]>
Functionality of the xen-tpmfront driver was lost secondary to
the introduction of xenbus multi-page support in commit ccc9d90a9a8b
("xenbus_client: Extend interface to support multi-page ring").
In this commit a pointer to the shared page address was being
passed to the xenbus_grant_ring() function rather than the
address of the shared page itself. This resulted in a situation
where the driver would attach to the vtpm-stubdom but any attempt
to send a command to the stub domain would time out.
A diagnostic finding for this regression is the following error
message being generated when the xen-tpmfront driver probes for a
device:
<3>vtpm vtpm-0: tpm_transmit: tpm_send: error -62
<3>vtpm vtpm-0: A TPM error (-62) occurred attempting to determine the timeouts
This fix is relevant to all kernels from 4.1 forward, which is the
release in which multi-page xenbus support was introduced.
Daniel De Graaf formulated the fix by code inspection after the
regression point was located.
Fixes: ccc9d90a9a8b ("xenbus_client: Extend interface to support multi-page ring")
Signed-off-by: Dr. Greg Wettstein <[email protected]>
[boris: fixed commit message formatting, added Fixes tag]
Signed-off-by: Boris Ostrovsky <[email protected]>
Cc: [email protected] # v4.1+
---
drivers/char/tpm/xen-tpmfront.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/char/tpm/xen-tpmfront.c b/drivers/char/tpm/xen-tpmfront.c
index 911475d36800..b150f87f38f5 100644
--- a/drivers/char/tpm/xen-tpmfront.c
+++ b/drivers/char/tpm/xen-tpmfront.c
@@ -264,7 +264,7 @@ static int setup_ring(struct xenbus_device *dev, struct tpm_private *priv)
 		return -ENOMEM;
 	}
-	rv = xenbus_grant_ring(dev, &priv->shr, 1, &gref);
+	rv = xenbus_grant_ring(dev, priv->shr, 1, &gref);
 	if (rv < 0)
 		return rv;
--
2.17.1
On Thu, Sep 13, 2018 at 05:25:51PM -0400, Boris Ostrovsky wrote:
> From: "Dr. Greg Wettstein" <[email protected]>
>
> Functionality of the xen-tpmfront driver was lost secondary to
> the introduction of xenbus multi-page support in commit ccc9d90a9a8b
> ("xenbus_client: Extend interface to support multi-page ring").
>
> In this commit a pointer to the shared page address was being
> passed to the xenbus_grant_ring() function rather than the
> address of the shared page itself. This resulted in a situation
I'm sorry but I'm far from being an expert with Xen and this sentence
confuses me, so maybe you could open it up a bit.
For me "shared page address" and "address of the shared page" are
the same thing. What am I missing? I mean just different forms in
English to describe the exact same thing...
/Jarkko
On 9/16/18 3:25 PM, Jarkko Sakkinen wrote:
> On Thu, Sep 13, 2018 at 05:25:51PM -0400, Boris Ostrovsky wrote:
>> From: "Dr. Greg Wettstein" <[email protected]>
>>
>> Functionality of the xen-tpmfront driver was lost secondary to
>> the introduction of xenbus multi-page support in commit ccc9d90a9a8b
>> ("xenbus_client: Extend interface to support multi-page ring").
>>
>> In this commit a pointer to the shared page address was being
>> passed to the xenbus_grant_ring() function rather than the
>> address of the shared page itself. This resulted in a situation
> I'm sorry but I'm far from being an expert with Xen and this sentence
> confuses me, so maybe you could open it up a bit.
>
> For me "shared page address" and "address of the shared page" are
> the same thing. What am I missing? I mean just different forms in
> English to describe the exact same thing...
xenbus_grant_ring() takes as an argument the address of the ring shared
between two guests. What Greg was trying to describe was the fact that
the existing code instead passes the address of the location where this
address is stored (i.e. somewhat similar to the difference between a
pointer and a pointer to a pointer).
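A rough user-space sketch of that difference, in case it helps. Nothing
here is kernel code: struct demo_priv, demo_grant_ring() and the other
helpers are made-up stand-ins for struct tpm_private and
xenbus_grant_ring(), and the stand-in simply returns the address it was
handed so the two call shapes can be compared.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for struct tpm_private; only shr matters here. */
struct demo_priv {
	int other_state;
	void *shr;	/* holds the address of the shared page */
};

/* Hypothetical stand-in for xenbus_grant_ring(): returns the address it was given. */
static void *demo_grant_ring(void *vaddr)
{
	return vaddr;
}

/* True when addr points into the bytes of *priv itself. */
static bool points_into_priv(const struct demo_priv *priv, const void *addr)
{
	uintptr_t a = (uintptr_t)addr;
	uintptr_t start = (uintptr_t)priv;

	return a >= start && a < start + sizeof(*priv);
}

/* Buggy call shape: passes the address of the pointer field. */
static void *grant_buggy(struct demo_priv *priv)
{
	return demo_grant_ring(&priv->shr);
}

/* Fixed call shape: passes the pointer's value, i.e. the page address. */
static void *grant_fixed(struct demo_priv *priv)
{
	assert(priv->shr != NULL);
	return demo_grant_ring(priv->shr);
}
```

With priv.shr pointing at a page-sized buffer, grant_buggy(&priv) hands
over an address inside priv itself, while grant_fixed(&priv) hands over
the address of the buffer.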
Would this be better:
"In this commit a pointer to the location where the shared page address
is stored was being passed to the xenbus_grant_ring() function rather
than the address of the shared page itself."
Or please suggest a better alternative, I'll be happy to amend the
commit message.
Thanks.
-boris
On Mon, Sep 17, 2018 at 09:54:37AM -0400, Boris Ostrovsky wrote:
> On 9/16/18 3:25 PM, Jarkko Sakkinen wrote:
> > On Thu, Sep 13, 2018 at 05:25:51PM -0400, Boris Ostrovsky wrote:
> >> From: "Dr. Greg Wettstein" <[email protected]>
> >>
> >> Functionality of the xen-tpmfront driver was lost secondary to
> >> the introduction of xenbus multi-page support in commit ccc9d90a9a8b
> >> ("xenbus_client: Extend interface to support multi-page ring").
> >>
> >> In this commit a pointer to the shared page address was being
> >> passed to the xenbus_grant_ring() function rather than the
> >> address of the shared page itself. This resulted in a situation
> > I'm sorry but I'm far from being an expert with Xen and this sentence
> > confuses me, so maybe you could open it up a bit.
> >
> > For me "shared page address" and "address of the shared page" are
> > the same thing. What am I missing? I mean just different forms in
> > English to describe the exact same thing...
>
> xenbus_grant_ring() takes as an argument the address of the ring shared
> between two guests. What Greg was trying to describe was the fact that
> the existing code instead passes the address of the location where this
> address is stored (i.e. somewhat similar to the difference between a
> pointer and a pointer to a pointer).
Just to understand this bug better: why did the wrong version not
cause any undefined behavior? It sounds like a fatal bug. Does this
cause crashes?
> Would this be better:
>
> "In this commit a pointer to the location where the shared page address
> is stored was being passed to the xenbus_grant_ring() function rather
> than the address of the shared page itself."
Yes, definitely!
> Or please suggest a better alternative, I'll be happy to amend the
> commit message.
Thank you.
> Thanks.
> -boris
/Jarkko
On 9/17/18 5:19 PM, Jarkko Sakkinen wrote:
> On Mon, Sep 17, 2018 at 09:54:37AM -0400, Boris Ostrovsky wrote:
>> On 9/16/18 3:25 PM, Jarkko Sakkinen wrote:
>>> On Thu, Sep 13, 2018 at 05:25:51PM -0400, Boris Ostrovsky wrote:
>>>> From: "Dr. Greg Wettstein" <[email protected]>
>>>>
>>>> Functionality of the xen-tpmfront driver was lost secondary to
>>>> the introduction of xenbus multi-page support in commit ccc9d90a9a8b
>>>> ("xenbus_client: Extend interface to support multi-page ring").
>>>>
>>>> In this commit a pointer to the shared page address was being
>>>> passed to the xenbus_grant_ring() function rather than the
>>>> address of the shared page itself. This resulted in a situation
>>> I'm sorry but I'm far from being an expert with Xen and this sentence
>>> confuses me, so maybe you could open it up a bit.
>>>
>>> For me "shared page address" and "address of the shared page" are
>>> the same thing. What am I missing? I mean just different forms in
>>> English to describe the exact same thing...
>> xenbus_grant_ring() takes as an argument the address of the ring shared
>> between two guests. What Greg was trying to describe was the fact that
>> the existing code instead passes the address of the location where this
>> address is stored (i.e. somewhat similar to the difference between a
>> pointer and a pointer to a pointer).
> Just to understand this bug better: why did the wrong version not
> cause any undefined behavior? It sounds like a fatal bug. Does this
> cause crashes?
AFAIK, no, no crashes. I haven't tested this myself (and I believe
relatively few people use this functionality, which explains why this
has not been fixed for so long) but I don't think it will necessarily
crash. It's just that the frontend driver will be reading from the wrong
location, causing the TPM not to function properly.
Or maybe the frontend is writing, but then I believe the write would
occur into tpm_private, and so would corrupt it. But the protocol will
fail right after this anyway.
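To make the "wrong location" point a bit more concrete, here is a small
user-space sketch of the page arithmetic. It assumes a 4096-byte page and
that whatever address is handed over ends up sharing the whole page
containing it; DEMO_PAGE_SIZE and the demo_* helpers are made up for
illustration and are not kernel interfaces.

```c
#include <assert.h>
#include <stdint.h>

#define DEMO_PAGE_SIZE 4096u	/* assumed page size for this sketch */

/* Base address of the page that contains addr. */
static uintptr_t demo_page_base(uintptr_t addr)
{
	/* The mask trick below only works for power-of-two page sizes. */
	assert((DEMO_PAGE_SIZE & (DEMO_PAGE_SIZE - 1)) == 0);
	return addr & ~(uintptr_t)(DEMO_PAGE_SIZE - 1);
}

/* Offset of addr within its page. */
static uintptr_t demo_page_offset(uintptr_t addr)
{
	return addr & (uintptr_t)(DEMO_PAGE_SIZE - 1);
}

/* Nonzero when a and b fall inside the same page. */
static int demo_same_page(uintptr_t a, uintptr_t b)
{
	return demo_page_base(a) == demo_page_base(b);
}
```

Under those assumptions, demo_page_base() applied to the address of the
shr field gives the page that holds tpm_private, which in general is not
the page that shr points to. Both are valid memory, which would fit with
seeing a timeout rather than a crash.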
>
>> Would this be better:
>>
>> "In this commit a pointer to the location where the shared page address
>> is stored was being passed to the xenbus_grant_ring() function rather
>> than the address of the shared page itself."
> Yes, definitely!
OK, I'll send it shortly.
Thanks.
-boris
>
>> Or please suggest a better alternative, I'll be happy to amend the
>> commit message.
> Thank you.
>
>> Thanks.
>> -boris
> /Jarkko
On Tuesday, 18 September 2018, 01:25:29 EEST, Boris Ostrovsky wrote:
> On 9/17/18 5:19 PM, Jarkko Sakkinen wrote:
> > Just to understand this bug better: why did the wrong version not
> > cause any undefined behavior? It sounds like a fatal bug. Does this
> > cause crashes?
>
> AFAIK, no, no crashes. I haven't tested this myself (and I believe
> relatively few people use this functionality, which explains why this
> has not been fixed for so long) but I don't think it will necessarily
> crash. It's just that the frontend driver will be reading from the wrong
> location, causing the TPM not to function properly.
I bumped my head into this last week and spent most of the
week trying to figure out why the vtpm did not respond.
I finally found the email from the guy who discovered and fixed
it. I applied the fix and recompiled. Now it seems to be working
fine. The patch is surprisingly 2 years old!!
I will be very pleased to see it go into the
official kernel!
But no crash. Just a timeout when trying to communicate with
the vtpm-engine.
Best
Dag