2020-02-10 11:03:02

by Christian Borntraeger

Subject: Re: vhost changes (batched) in linux-next after 12/13 trigger random crashes in KVM guests after reboot



On 10.02.20 10:47, Eugenio Perez Martin wrote:
> Hi Christian.
>
> I'm not able to reproduce the failure with eccb852f1fe6bede630e2e4f1a121a81e34354ab commit. Could you add more data? Your configuration (libvirt or qemu line), and host's dmesg output if any?
>
> Thanks!

If it was not obvious, this is on s390x, a big endian system.


2020-02-11 09:37:29

by Eugenio Perez Martin

Subject: Re: vhost changes (batched) in linux-next after 12/13 trigger random crashes in KVM guests after reboot

On Mon, 2020-02-10 at 12:01 +0100, Christian Borntraeger wrote:
>
> On 10.02.20 10:47, Eugenio Perez Martin wrote:
> > Hi Christian.
> >
> > I'm not able to reproduce the failure with eccb852f1fe6bede630e2e4f1a121a81e34354ab commit. Could you add more data?
> > Your configuration (libvirt or qemu line), and host's dmesg output if any?
> >
> > Thanks!
>
> If it was not obvious, this is on s390x, a big endian system.
>

Hi Christian. Thank you very much for the hints.

Could we add some debug traces? Something like the inline patch should give us some clues.

Thanks!

From a8d65d5f0ae3d305443ee84b4842b7c712a1ac1d Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Eugenio=20P=C3=A9rez?= <[email protected]>
Date: Tue, 11 Feb 2020 10:29:01 +0100
Subject: [PATCH] Add some traces

---
drivers/vhost/vhost.c | 9 +++++++--
1 file changed, 7 insertions(+), 2 deletions(-)

diff --git a/drivers/vhost/vhost.c b/drivers/vhost/vhost.c
index b5a51b1f2e79..60c048eebe4d 100644
--- a/drivers/vhost/vhost.c
+++ b/drivers/vhost/vhost.c
@@ -2239,8 +2239,8 @@ static int fetch_buf(struct vhost_virtqueue *vq)
vq->avail_idx = vhost16_to_cpu(vq, avail_idx);

if (unlikely((u16)(vq->avail_idx - last_avail_idx) > vq->num)) {
- vq_err(vq, "Guest moved used index from %u to %u",
- last_avail_idx, vq->avail_idx);
+ vq_err(vq, "Guest moved vq %p used index from %u to %u",
+ vq, last_avail_idx, vq->avail_idx);
return -EFAULT;
}

@@ -2336,6 +2336,8 @@ static int fetch_descs(struct vhost_virtqueue *vq)
while (!ret && vq->ndescs <= vq->batch_descs)
ret = fetch_buf(vq);

+ pr_debug("[vq=%p][vq->ndescs=%d]", vq, vq->ndescs);
+
return vq->ndescs ? 0 : ret;
}

@@ -2416,6 +2418,8 @@ int vhost_get_vq_desc(struct vhost_virtqueue *vq,
}

vq->first_desc = i + 1;
+ pr_debug("[vq=%p][vq->ndescs=%d][vq->first_desc=%d]", vq, vq->ndescs,
+ vq->first_desc);

return ret;

@@ -2459,6 +2463,7 @@ static int __vhost_add_used_n(struct vhost_virtqueue *vq,

start = vq->last_used_idx & (vq->num - 1);
used = vq->used->ring + start;
+ pr_debug("[vq=%p][start=%d][count=%u]", vq, start, count);
if (vhost_put_used(vq, heads, start, count)) {
vq_err(vq, "Failed to write used");
return -EFAULT;
--
2.18.1

2020-02-11 09:57:26

by Christian Borntraeger

Subject: Re: vhost changes (batched) in linux-next after 12/13 trigger random crashes in KVM guests after reboot



On 11.02.20 10:33, Eugenio Pérez wrote:
> On Mon, 2020-02-10 at 12:01 +0100, Christian Borntraeger wrote:
>>
>> On 10.02.20 10:47, Eugenio Perez Martin wrote:
>>> Hi Christian.
>>>
>>> I'm not able to reproduce the failure with eccb852f1fe6bede630e2e4f1a121a81e34354ab commit. Could you add more data?
>>> Your configuration (libvirt or qemu line), and host's dmesg output if any?
>>>
>>> Thanks!
>>
>> If it was not obvious, this is on s390x, a big endian system.
>>
>
> Hi Christian. Thank you very much for the hints.
>
> Could we add some debug traces? Something like the inline patch should give us some clues.

Attached. This time the guest crashed (and there was no "Guest moved" message).


Attachments:
log (77.26 kB)

2020-02-11 10:08:34

by Christian Borntraeger

Subject: Re: vhost changes (batched) in linux-next after 12/13 trigger random crashes in KVM guests after reboot



On 11.02.20 10:56, Christian Borntraeger wrote:
>
>
> On 11.02.20 10:33, Eugenio Pérez wrote:
>> On Mon, 2020-02-10 at 12:01 +0100, Christian Borntraeger wrote:
>>>
>>> On 10.02.20 10:47, Eugenio Perez Martin wrote:
>>>> Hi Christian.
>>>>
>>>> I'm not able to reproduce the failure with eccb852f1fe6bede630e2e4f1a121a81e34354ab commit. Could you add more data?
>>>> Your configuration (libvirt or qemu line), and host's dmesg output if any?
>>>>
>>>> Thanks!
>>>
>>> If it was not obvious, this is on s390x, a big endian system.
>>>
>>
>> Hi Christian. Thank you very much for the hints.
>>
>> Could we add some debug traces? Something like the inline patch should give us some clues.
>
> attached. this time the guest crashed (and no guest moved message)
>

And here is a log with the "Guest moved vq" message after some reboots.


Attachments:
log2 (70.68 kB)

2020-02-11 13:06:37

by Eugenio Perez Martin

Subject: Re: vhost changes (batched) in linux-next after 12/13 trigger random crashes in KVM guests after reboot

On Mon, 2020-02-10 at 12:01 +0100, Christian Borntraeger wrote:
>
> On 10.02.20 10:47, Eugenio Perez Martin wrote:
> > Hi Christian.
> >
> > I'm not able to reproduce the failure with eccb852f1fe6bede630e2e4f1a121a81e34354ab commit. Could you add more data?
> > Your configuration (libvirt or qemu line), and host's dmesg output if any?
> >
> > Thanks!
>
> If it was not obvious, this is on s390x, a big endian system.
>

Hi Christian. Thank you very much for your fast responses.

Could you try this patch on top of eccb852f1fe6bede630e2e4f1a121a81e34354ab?

Thanks!

From 71d0f9108a18aa894cc0c0c1c7efbad39f465a27 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Eugenio=20P=C3=A9rez?= <
[email protected]>
Date: Tue, 11 Feb 2020 13:19:10 +0100
Subject: [PATCH] vhost: fix return value of vhost_get_vq_desc

Before the batch change, the return value was the chain's head. We need
to keep it that way or we will not be able to free a chain of descriptors.

Fixes: eccb852f1fe6 ("vhost: batching fetches")
---
drivers/vhost/vhost.c | 3 +--
1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/drivers/vhost/vhost.c b/drivers/vhost/vhost.c
index b5a51b1f2e79..fc422c3e5c08 100644
--- a/drivers/vhost/vhost.c
+++ b/drivers/vhost/vhost.c
@@ -2409,12 +2409,11 @@ int vhost_get_vq_desc(struct vhost_virtqueue *vq,
*out_num += ret;
}

- ret = desc->id;
-
if (!(desc->flags & VRING_DESC_F_NEXT))
break;
}

+ ret = vq->descs[vq->first_desc].id;
vq->first_desc = i + 1;

return ret;
--
2.18.1

2020-02-11 13:14:45

by Christian Borntraeger

Subject: Re: vhost changes (batched) in linux-next after 12/13 trigger random crashes in KVM guests after reboot



On 11.02.20 14:04, Eugenio Pérez wrote:
> On Mon, 2020-02-10 at 12:01 +0100, Christian Borntraeger wrote:
>>
>> On 10.02.20 10:47, Eugenio Perez Martin wrote:
>>> Hi Christian.
>>>
>>> I'm not able to reproduce the failure with eccb852f1fe6bede630e2e4f1a121a81e34354ab commit. Could you add more data?
>>> Your configuration (libvirt or qemu line), and host's dmesg output if any?
>>>
>>> Thanks!
>>
>> If it was not obvious, this is on s390x, a big endian system.
>>
>
> Hi Christian. Thank you very much for your fast responses.
>
> Could you try this patch on top of eccb852f1fe6bede630e2e4f1a121a81e34354ab?

I still get
[ 43.665145] Guest moved used index from 0 to 289
after some reboots.


>
> Thanks!
>
> From 71d0f9108a18aa894cc0c0c1c7efbad39f465a27 Mon Sep 17 00:00:00 2001
> From: =?UTF-8?q?Eugenio=20P=C3=A9rez?= <
> [email protected]>
> Date: Tue, 11 Feb 2020 13:19:10 +0100
> Subject: [PATCH] vhost: fix return value of vhost_get_vq_desc
>
> Before of the batch change, it was the chain's head. Need to keep that
> way or we will not be able to free a chain of descriptors.
>
> Fixes: eccb852f1fe6 ("vhost: batching fetches")
> ---
> drivers/vhost/vhost.c | 3 +--
> 1 file changed, 1 insertion(+), 2 deletions(-)
>
> diff --git a/drivers/vhost/vhost.c b/drivers/vhost/vhost.c
> index b5a51b1f2e79..fc422c3e5c08 100644
> --- a/drivers/vhost/vhost.c
> +++ b/drivers/vhost/vhost.c
> @@ -2409,12 +2409,11 @@ int vhost_get_vq_desc(struct vhost_virtqueue *vq,
> *out_num += ret;
> }
>
> - ret = desc->id;
> -
> if (!(desc->flags & VRING_DESC_F_NEXT))
> break;
> }
>
> + ret = vq->descs[vq->first_desc].id;
> vq->first_desc = i + 1;
>
> return ret;
>

2020-02-11 14:17:44

by Michael S. Tsirkin

Subject: Re: vhost changes (batched) in linux-next after 12/13 trigger random crashes in KVM guests after reboot

On Tue, Feb 11, 2020 at 02:04:54PM +0100, Eugenio Pérez wrote:
> On Mon, 2020-02-10 at 12:01 +0100, Christian Borntraeger wrote:
> >
> > On 10.02.20 10:47, Eugenio Perez Martin wrote:
> > > Hi Christian.
> > >
> > > I'm not able to reproduce the failure with eccb852f1fe6bede630e2e4f1a121a81e34354ab commit. Could you add more data?
> > > Your configuration (libvirt or qemu line), and host's dmesg output if any?
> > >
> > > Thanks!
> >
> > If it was not obvious, this is on s390x, a big endian system.
> >
>
> Hi Christian. Thank you very much for your fast responses.
>
> Could you try this patch on top of eccb852f1fe6bede630e2e4f1a121a81e34354ab?
>
> Thanks!
>
> >From 71d0f9108a18aa894cc0c0c1c7efbad39f465a27 Mon Sep 17 00:00:00 2001
> From: =?UTF-8?q?Eugenio=20P=C3=A9rez?= <
> [email protected]>
> Date: Tue, 11 Feb 2020 13:19:10 +0100
> Subject: [PATCH] vhost: fix return value of vhost_get_vq_desc
>
> Before of the batch change, it was the chain's head. Need to keep that
> way or we will not be able to free a chain of descriptors.

I think it's cleaner to have all descriptors in the chain
have the same id.

> Fixes: eccb852f1fe6 ("vhost: batching fetches")
> ---
> drivers/vhost/vhost.c | 3 +--
> 1 file changed, 1 insertion(+), 2 deletions(-)
>
> diff --git a/drivers/vhost/vhost.c b/drivers/vhost/vhost.c
> index b5a51b1f2e79..fc422c3e5c08 100644
> --- a/drivers/vhost/vhost.c
> +++ b/drivers/vhost/vhost.c
> @@ -2409,12 +2409,11 @@ int vhost_get_vq_desc(struct vhost_virtqueue *vq,
> *out_num += ret;
> }
>
> - ret = desc->id;
> -
> if (!(desc->flags & VRING_DESC_F_NEXT))
> break;
> }
>
> + ret = vq->descs[vq->first_desc].id;
> vq->first_desc = i + 1;
>
> return ret;
> --
> 2.18.1

2020-02-12 16:34:59

by Eugenio Perez Martin

Subject: Re: vhost changes (batched) in linux-next after 12/13 trigger random crashes in KVM guests after reboot

On Tue, 2020-02-11 at 14:13 +0100, Christian Borntraeger wrote:
>
> On 11.02.20 14:04, Eugenio Pérez wrote:
> > On Mon, 2020-02-10 at 12:01 +0100, Christian Borntraeger wrote:
> > > On 10.02.20 10:47, Eugenio Perez Martin wrote:
> > > > Hi Christian.
> > > >
> > > > I'm not able to reproduce the failure with eccb852f1fe6bede630e2e4f1a121a81e34354ab commit. Could you add more
> > > > data?
> > > > Your configuration (libvirt or qemu line), and host's dmesg output if any?
> > > >
> > > > Thanks!
> > >
> > > If it was not obvious, this is on s390x, a big endian system.
> > >
> >
> > Hi Christian. Thank you very much for your fast responses.
> >
> > Could you try this patch on top of eccb852f1fe6bede630e2e4f1a121a81e34354ab?
>
> I still get
> [ 43.665145] Guest moved used index from 0 to 289
> after some reboots.
>
>
> > Thanks!
> >
> > From 71d0f9108a18aa894cc0c0c1c7efbad39f465a27 Mon Sep 17 00:00:00 2001
> > From: =?UTF-8?q?Eugenio=20P=C3=A9rez?= <
> > [email protected]>
> > Date: Tue, 11 Feb 2020 13:19:10 +0100
> > Subject: [PATCH] vhost: fix return value of vhost_get_vq_desc
> >
> > Before of the batch change, it was the chain's head. Need to keep that
> > way or we will not be able to free a chain of descriptors.
> >
> > Fixes: eccb852f1fe6 ("vhost: batching fetches")
> > ---
> > drivers/vhost/vhost.c | 3 +--
> > 1 file changed, 1 insertion(+), 2 deletions(-)
> >
> > diff --git a/drivers/vhost/vhost.c b/drivers/vhost/vhost.c
> > index b5a51b1f2e79..fc422c3e5c08 100644
> > --- a/drivers/vhost/vhost.c
> > +++ b/drivers/vhost/vhost.c
> > @@ -2409,12 +2409,11 @@ int vhost_get_vq_desc(struct vhost_virtqueue *vq,
> > *out_num += ret;
> > }
> >
> > - ret = desc->id;
> > -
> > if (!(desc->flags & VRING_DESC_F_NEXT))
> > break;
> > }
> >
> > + ret = vq->descs[vq->first_desc].id;
> > vq->first_desc = i + 1;
> >
> > return ret;
> >

Sorry, still not able to reproduce the issue.

Could we try to disable all the vhost features?

Thanks!

diff --git a/drivers/vhost/vhost.h b/drivers/vhost/vhost.h
index 661088ae6dc7..08f6d2ccb697 100644
--- a/drivers/vhost/vhost.h
+++ b/drivers/vhost/vhost.h
@@ -250,11 +250,11 @@ int vhost_init_device_iotlb(struct vhost_dev *d, bool enabled);
} while (0)

enum {
- VHOST_FEATURES = (1ULL << VIRTIO_F_NOTIFY_ON_EMPTY) |
- (1ULL << VIRTIO_RING_F_INDIRECT_DESC) |
- (1ULL << VIRTIO_RING_F_EVENT_IDX) |
- (1ULL << VHOST_F_LOG_ALL) |
- (1ULL << VIRTIO_F_ANY_LAYOUT) |
+ VHOST_FEATURES = /* (1ULL << VIRTIO_F_NOTIFY_ON_EMPTY) | */
+ /* (1ULL << VIRTIO_RING_F_INDIRECT_DESC) | */
+ /* (1ULL << VIRTIO_RING_F_EVENT_IDX) | */
+ /* (1ULL << VHOST_F_LOG_ALL) | */
+ /* (1ULL << VIRTIO_F_ANY_LAYOUT) | */
(1ULL << VIRTIO_F_VERSION_1)
};

2020-02-13 09:31:03

by Christian Borntraeger

Subject: Re: vhost changes (batched) in linux-next after 12/13 trigger random crashes in KVM guests after reboot



On 12.02.20 17:34, Eugenio Pérez wrote:
> On Tue, 2020-02-11 at 14:13 +0100, Christian Borntraeger wrote:
>>
>> On 11.02.20 14:04, Eugenio Pérez wrote:
>>> On Mon, 2020-02-10 at 12:01 +0100, Christian Borntraeger wrote:
>>>> On 10.02.20 10:47, Eugenio Perez Martin wrote:
>>>>> Hi Christian.
>>>>>
>>>>> I'm not able to reproduce the failure with eccb852f1fe6bede630e2e4f1a121a81e34354ab commit. Could you add more
>>>>> data?
>>>>> Your configuration (libvirt or qemu line), and host's dmesg output if any?
>>>>>
>>>>> Thanks!
>>>>
>>>> If it was not obvious, this is on s390x, a big endian system.
>>>>
>>>
>>> Hi Christian. Thank you very much for your fast responses.
>>>
>>> Could you try this patch on top of eccb852f1fe6bede630e2e4f1a121a81e34354ab?
>>
>> I still get
>> [ 43.665145] Guest moved used index from 0 to 289
>> after some reboots.
>>
>>
>>> Thanks!
>>>
>>> From 71d0f9108a18aa894cc0c0c1c7efbad39f465a27 Mon Sep 17 00:00:00 2001
>>> From: =?UTF-8?q?Eugenio=20P=C3=A9rez?= <
>>> [email protected]>
>>> Date: Tue, 11 Feb 2020 13:19:10 +0100
>>> Subject: [PATCH] vhost: fix return value of vhost_get_vq_desc
>>>
>>> Before of the batch change, it was the chain's head. Need to keep that
>>> way or we will not be able to free a chain of descriptors.
>>>
>>> Fixes: eccb852f1fe6 ("vhost: batching fetches")
>>> ---
>>> drivers/vhost/vhost.c | 3 +--
>>> 1 file changed, 1 insertion(+), 2 deletions(-)
>>>
>>> diff --git a/drivers/vhost/vhost.c b/drivers/vhost/vhost.c
>>> index b5a51b1f2e79..fc422c3e5c08 100644
>>> --- a/drivers/vhost/vhost.c
>>> +++ b/drivers/vhost/vhost.c
>>> @@ -2409,12 +2409,11 @@ int vhost_get_vq_desc(struct vhost_virtqueue *vq,
>>> *out_num += ret;
>>> }
>>>
>>> - ret = desc->id;
>>> -
>>> if (!(desc->flags & VRING_DESC_F_NEXT))
>>> break;
>>> }
>>>
>>> + ret = vq->descs[vq->first_desc].id;
>>> vq->first_desc = i + 1;
>>>
>>> return ret;
>>>
>
> Sorry, still not able to reproduce the issue.
>
> Could we try to disable all the vhost features?
>
> Thanks!
>
> diff --git a/drivers/vhost/vhost.h b/drivers/vhost/vhost.h
> index 661088ae6dc7..08f6d2ccb697 100644
> --- a/drivers/vhost/vhost.h
> +++ b/drivers/vhost/vhost.h
> @@ -250,11 +250,11 @@ int vhost_init_device_iotlb(struct vhost_dev *d, bool enabled);
> } while (0)
>
> enum {
> - VHOST_FEATURES = (1ULL << VIRTIO_F_NOTIFY_ON_EMPTY) |
> - (1ULL << VIRTIO_RING_F_INDIRECT_DESC) |
> - (1ULL << VIRTIO_RING_F_EVENT_IDX) |
> - (1ULL << VHOST_F_LOG_ALL) |
> - (1ULL << VIRTIO_F_ANY_LAYOUT) |
> + VHOST_FEATURES = /* (1ULL << VIRTIO_F_NOTIFY_ON_EMPTY) | */
> + /* (1ULL << VIRTIO_RING_F_INDIRECT_DESC) | */
> + /* (1ULL << VIRTIO_RING_F_EVENT_IDX) | */
> + /* (1ULL << VHOST_F_LOG_ALL) | */
> + /* (1ULL << VIRTIO_F_ANY_LAYOUT) | */
> (1ULL << VIRTIO_F_VERSION_1)
> };
>


I still get guest crashes with this on top of eccb852f1fe6. (The patch did not
apply, I had to manually comment out these things)

2020-02-13 10:47:48

by Eugenio Perez Martin

Subject: Re: vhost changes (batched) in linux-next after 12/13 trigger random crashes in KVM guests after reboot

On Thu, 2020-02-13 at 10:30 +0100, Christian Borntraeger wrote:
>
> On 12.02.20 17:34, Eugenio Pérez wrote:
> > On Tue, 2020-02-11 at 14:13 +0100, Christian Borntraeger wrote:
> > > On 11.02.20 14:04, Eugenio Pérez wrote:
> > > > On Mon, 2020-02-10 at 12:01 +0100, Christian Borntraeger wrote:
> > > > > On 10.02.20 10:47, Eugenio Perez Martin wrote:
> > > > > > Hi Christian.
> > > > > >
> > > > > > I'm not able to reproduce the failure with eccb852f1fe6bede630e2e4f1a121a81e34354ab commit. Could you add
> > > > > > more
> > > > > > data?
> > > > > > Your configuration (libvirt or qemu line), and host's dmesg output if any?
> > > > > >
> > > > > > Thanks!
> > > > >
> > > > > If it was not obvious, this is on s390x, a big endian system.
> > > > >
> > > >
> > > > Hi Christian. Thank you very much for your fast responses.
> > > >
> > > > Could you try this patch on top of eccb852f1fe6bede630e2e4f1a121a81e34354ab?
> > >
> > > I still get
> > > [ 43.665145] Guest moved used index from 0 to 289
> > > after some reboots.
> > >
> > >
> > > > Thanks!
> > > >
> > > > From 71d0f9108a18aa894cc0c0c1c7efbad39f465a27 Mon Sep 17 00:00:00 2001
> > > > From: =?UTF-8?q?Eugenio=20P=C3=A9rez?= <
> > > > [email protected]>
> > > > Date: Tue, 11 Feb 2020 13:19:10 +0100
> > > > Subject: [PATCH] vhost: fix return value of vhost_get_vq_desc
> > > >
> > > > Before of the batch change, it was the chain's head. Need to keep that
> > > > way or we will not be able to free a chain of descriptors.
> > > >
> > > > Fixes: eccb852f1fe6 ("vhost: batching fetches")
> > > > ---
> > > > drivers/vhost/vhost.c | 3 +--
> > > > 1 file changed, 1 insertion(+), 2 deletions(-)
> > > >
> > > > diff --git a/drivers/vhost/vhost.c b/drivers/vhost/vhost.c
> > > > index b5a51b1f2e79..fc422c3e5c08 100644
> > > > --- a/drivers/vhost/vhost.c
> > > > +++ b/drivers/vhost/vhost.c
> > > > @@ -2409,12 +2409,11 @@ int vhost_get_vq_desc(struct vhost_virtqueue *vq,
> > > > *out_num += ret;
> > > > }
> > > >
> > > > - ret = desc->id;
> > > > -
> > > > if (!(desc->flags & VRING_DESC_F_NEXT))
> > > > break;
> > > > }
> > > >
> > > > + ret = vq->descs[vq->first_desc].id;
> > > > vq->first_desc = i + 1;
> > > >
> > > > return ret;
> > > >
> >
> > Sorry, still not able to reproduce the issue.
> >
> > Could we try to disable all the vhost features?
> >
> > Thanks!
> >
> > diff --git a/drivers/vhost/vhost.h b/drivers/vhost/vhost.h
> > index 661088ae6dc7..08f6d2ccb697 100644
> > --- a/drivers/vhost/vhost.h
> > +++ b/drivers/vhost/vhost.h
> > @@ -250,11 +250,11 @@ int vhost_init_device_iotlb(struct vhost_dev *d, bool enabled);
> > } while (0)
> >
> > enum {
> > - VHOST_FEATURES = (1ULL << VIRTIO_F_NOTIFY_ON_EMPTY) |
> > - (1ULL << VIRTIO_RING_F_INDIRECT_DESC) |
> > - (1ULL << VIRTIO_RING_F_EVENT_IDX) |
> > - (1ULL << VHOST_F_LOG_ALL) |
> > - (1ULL << VIRTIO_F_ANY_LAYOUT) |
> > + VHOST_FEATURES = /* (1ULL << VIRTIO_F_NOTIFY_ON_EMPTY) | */
> > + /* (1ULL << VIRTIO_RING_F_INDIRECT_DESC) | */
> > + /* (1ULL << VIRTIO_RING_F_EVENT_IDX) | */
> > + /* (1ULL << VHOST_F_LOG_ALL) | */
> > + /* (1ULL << VIRTIO_F_ANY_LAYOUT) | */
> > (1ULL << VIRTIO_F_VERSION_1)
> > };
> >
>
> I still get guest crashes with this on top of eccb852f1fe6. (The patch did not
> apply, I had to manually comment out these things)
>

Sorry about that; my copy-and-paste converted tabs to spaces.

Can we try tracing last_avail_idx with the attached patch? Can you also enable line and thread ID logging (dyndbg='+plt')?

Thanks!

From f7012e8b9db711b12d36e6e97411e7afa34bf768 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Eugenio=20P=C3=A9rez?= <[email protected]>
Date: Thu, 13 Feb 2020 11:26:06 +0100
Subject: [PATCH] vhost: disable all features and trace last_avail_idx

---
drivers/vhost/vhost.c | 13 +++++++++++--
drivers/vhost/vhost.h | 10 +++++-----
2 files changed, 16 insertions(+), 7 deletions(-)

diff --git a/drivers/vhost/vhost.c b/drivers/vhost/vhost.c
index fc422c3e5c08..021d70bed015 100644
--- a/drivers/vhost/vhost.c
+++ b/drivers/vhost/vhost.c
@@ -1645,6 +1645,9 @@ long vhost_vring_ioctl(struct vhost_dev *d, unsigned int ioctl, void __user *arg
vq->last_avail_idx = s.num;
/* Forget the cached index value. */
vq->avail_idx = vq->last_avail_idx;
+ pr_debug(
+ "VHOST_SET_VRING_BASE [vq=%p][vq->last_avail_idx=%u][vq->avail_idx=%u]",
+ vq, vq->last_avail_idx, vq->avail_idx);
break;
case VHOST_GET_VRING_BASE:
s.index = idx;
@@ -2239,8 +2242,8 @@ static int fetch_buf(struct vhost_virtqueue *vq)
vq->avail_idx = vhost16_to_cpu(vq, avail_idx);

if (unlikely((u16)(vq->avail_idx - last_avail_idx) > vq->num)) {
- vq_err(vq, "Guest moved used index from %u to %u",
- last_avail_idx, vq->avail_idx);
+ vq_err(vq, "Guest moved vq %p used index from %u to %u",
+ vq, last_avail_idx, vq->avail_idx);
return -EFAULT;
}

@@ -2316,6 +2319,9 @@ static int fetch_buf(struct vhost_virtqueue *vq)
BUG_ON(!(vq->used_flags & VRING_USED_F_NO_NOTIFY));

/* On success, increment avail index. */
+ pr_debug(
+ "[vq=%p][vq->last_avail_idx=%u][vq->avail_idx=%u][vq->ndescs=%d][vq->first_desc=%d]",
+ vq, vq->last_avail_idx, vq->avail_idx, vq->ndescs, vq->first_desc);
vq->last_avail_idx++;

return 0;
@@ -2431,6 +2437,9 @@ EXPORT_SYMBOL_GPL(vhost_get_vq_desc);
/* Reverse the effect of vhost_get_vq_desc. Useful for error handling. */
void vhost_discard_vq_desc(struct vhost_virtqueue *vq, int n)
{
+ pr_debug(
+ "DISCARD [vq=%p][vq->last_avail_idx=%u][vq->avail_idx=%u][n=%d]",
+ vq, vq->last_avail_idx, vq->avail_idx, n);
vq->last_avail_idx -= n;
}
EXPORT_SYMBOL_GPL(vhost_discard_vq_desc);
diff --git a/drivers/vhost/vhost.h b/drivers/vhost/vhost.h
index 661088ae6dc7..08f6d2ccb697 100644
--- a/drivers/vhost/vhost.h
+++ b/drivers/vhost/vhost.h
@@ -250,11 +250,11 @@ int vhost_init_device_iotlb(struct vhost_dev *d, bool enabled);
} while (0)

enum {
- VHOST_FEATURES = (1ULL << VIRTIO_F_NOTIFY_ON_EMPTY) |
- (1ULL << VIRTIO_RING_F_INDIRECT_DESC) |
- (1ULL << VIRTIO_RING_F_EVENT_IDX) |
- (1ULL << VHOST_F_LOG_ALL) |
- (1ULL << VIRTIO_F_ANY_LAYOUT) |
+ VHOST_FEATURES = /* (1ULL << VIRTIO_F_NOTIFY_ON_EMPTY) | */
+ /* (1ULL << VIRTIO_RING_F_INDIRECT_DESC) | */
+ /* (1ULL << VIRTIO_RING_F_EVENT_IDX) | */
+ /* (1ULL << VHOST_F_LOG_ALL) | */
+ /* (1ULL << VIRTIO_F_ANY_LAYOUT) | */
(1ULL << VIRTIO_F_VERSION_1)
};

--
2.18.1
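The dyndbg='+plt' request above maps onto the kernel's dynamic debug control file: p enables the pr_debug() call sites, l prefixes the source line, and t the thread ID. A typical way to enable it for the vhost modules (paths assume debugfs is mounted in the usual place):

```shell
# Enable pr_debug() output from the vhost modules, decorated with
# the source line (l) and thread ID (t), i.e. what dyndbg='+plt' asks for.
echo 'module vhost +plt' > /sys/kernel/debug/dynamic_debug/control
echo 'module vhost_net +plt' > /sys/kernel/debug/dynamic_debug/control

# Alternatively, as a module parameter at load time:
# modprobe vhost dyndbg=+plt
```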


2020-02-13 13:11:07

by Christian Borntraeger

Subject: Re: vhost changes (batched) in linux-next after 12/13 trigger random crashes in KVM guests after reboot



On 13.02.20 11:47, Eugenio Pérez wrote:
>
> Can we try tracing last_avail_idx with the attached patch? Can you enable also line and thread id (dyndbg='+plt')?

Attachments:
log (22.85 kB)

2020-02-13 16:31:32

by Eugenio Perez Martin

Subject: Re: vhost changes (batched) in linux-next after 12/13 trigger random crashes in KVM guests after reboot

Can we try with these traces?

From b793b4106085ab1970bdedb340e49f37843ed585 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Eugenio=20P=C3=A9rez?= <[email protected]>
Date: Thu, 13 Feb 2020 17:27:05 +0100
Subject: [PATCH] vhost: Add debug in ioctl calls

---
drivers/vhost/net.c | 20 +++++++++++++++++---
drivers/vhost/vhost.c | 16 ++++++++++++++--
2 files changed, 31 insertions(+), 5 deletions(-)

diff --git a/drivers/vhost/net.c b/drivers/vhost/net.c
index e158159671fa..e4d5f843f9c0 100644
--- a/drivers/vhost/net.c
+++ b/drivers/vhost/net.c
@@ -1505,10 +1505,13 @@ static long vhost_net_set_backend(struct vhost_net *n, unsigned index, int fd)

mutex_lock(&n->dev.mutex);
r = vhost_dev_check_owner(&n->dev);
- if (r)
+ if (r) {
+ pr_debug("vhost_dev_check_owner index=%u fd=%d rc r=%d", index, fd, r);
goto err;
+ }

if (index >= VHOST_NET_VQ_MAX) {
+ pr_debug("vhost_net_set_backend index=%u fd=%d MAX=%d", index, fd, VHOST_NET_VQ_MAX);
r = -ENOBUFS;
goto err;
}
@@ -1518,22 +1521,26 @@ static long vhost_net_set_backend(struct vhost_net *n, unsigned index, int fd)

/* Verify that ring has been setup correctly. */
if (!vhost_vq_access_ok(vq)) {
+ pr_debug("vhost_net_set_backend index=%u fd=%d !vhost_vq_access_ok", index, fd);
r = -EFAULT;
goto err_vq;
}
sock = get_socket(fd);
if (IS_ERR(sock)) {
r = PTR_ERR(sock);
+ pr_debug("vhost_net_set_backend index=%u fd=%d get_socket err r=%d", index, fd, r);
goto err_vq;
}

/* start polling new socket */
oldsock = vq->private_data;
if (sock != oldsock) {
+ pr_debug("sock=%p != oldsock=%p index=%u fd=%d vq=%p", sock, oldsock, index, fd, vq);
ubufs = vhost_net_ubuf_alloc(vq,
sock && vhost_sock_zcopy(sock));
if (IS_ERR(ubufs)) {
r = PTR_ERR(ubufs);
+ pr_debug("ubufs index=%u fd=%d err r=%d vq=%p", index, fd, r, vq);
goto err_ubufs;
}

@@ -1541,11 +1548,15 @@ static long vhost_net_set_backend(struct vhost_net *n, unsigned index, int fd)
vq->private_data = sock;
vhost_net_buf_unproduce(nvq);
r = vhost_vq_init_access(vq);
- if (r)
+ if (r) {
+ pr_debug("init_access index=%u fd=%d r=%d vq=%p", index, fd, r, vq);
goto err_used;
+ }
r = vhost_net_enable_vq(n, vq);
- if (r)
+ if (r) {
+ pr_debug("enable_vq index=%u fd=%d r=%d vq=%p", index, fd, r, vq);
goto err_used;
+ }
if (index == VHOST_NET_VQ_RX)
nvq->rx_ring = get_tap_ptr_ring(fd);

@@ -1559,6 +1570,8 @@ static long vhost_net_set_backend(struct vhost_net *n, unsigned index, int fd)

mutex_unlock(&vq->mutex);

+ pr_debug("sock=%p", sock);
+
if (oldubufs) {
vhost_net_ubuf_put_wait_and_free(oldubufs);
mutex_lock(&vq->mutex);
@@ -1710,6 +1723,7 @@ static long vhost_net_ioctl(struct file *f, unsigned int ioctl,

switch (ioctl) {
case VHOST_NET_SET_BACKEND:
+ pr_debug("VHOST_NET_SET_BACKEND");
if (copy_from_user(&backend, argp, sizeof backend))
return -EFAULT;
return vhost_net_set_backend(n, backend.index, backend.fd);
diff --git a/drivers/vhost/vhost.c b/drivers/vhost/vhost.c
index 021d70bed015..7f4848f9cec3 100644
--- a/drivers/vhost/vhost.c
+++ b/drivers/vhost/vhost.c
@@ -1642,18 +1642,30 @@ long vhost_vring_ioctl(struct vhost_dev *d, unsigned int ioctl, void __user *arg
r = -EINVAL;
break;
}
+
+ if (vq->last_avail_idx || vq->avail_idx) {
+ pr_debug(
+ "strange VHOST_SET_VRING_BASE [vq=%p][s.index=%u][s.num=%u]",
+ vq, s.index, s.num);
+ dump_stack();
+ r = 0;
+ break;
+ }
vq->last_avail_idx = s.num;
/* Forget the cached index value. */
vq->avail_idx = vq->last_avail_idx;
pr_debug(
- "VHOST_SET_VRING_BASE [vq=%p][vq->last_avail_idx=%u][vq->avail_idx=%u]",
- vq, vq->last_avail_idx, vq->avail_idx);
+ "VHOST_SET_VRING_BASE [vq=%p][vq->last_avail_idx=%u][vq->avail_idx=%u][s.index=%u][s.num=%u]",
+ vq, vq->last_avail_idx, vq->avail_idx, s.index, s.num);
break;
case VHOST_GET_VRING_BASE:
s.index = idx;
s.num = vq->last_avail_idx;
if (copy_to_user(argp, &s, sizeof s))
r = -EFAULT;
+ pr_debug(
+ "VHOST_GET_VRING_BASE [vq=%p][vq->last_avail_idx=%u][vq->avail_idx=%u][s.index=%u][s.num=%u]",
+ vq, vq->last_avail_idx, vq->avail_idx, s.index, s.num);
break;
case VHOST_SET_VRING_KICK:
if (copy_from_user(&f, argp, sizeof f)) {
--
2.18.1


2020-02-13 16:33:34

by Christian Borntraeger

Subject: Re: vhost changes (batched) in linux-next after 12/13 trigger random crashes in KVM guests after reboot



On 13.02.20 17:29, Eugenio Pérez wrote:
> Can we try with this traces?

This does not apply on eccb852f1fe6bede630e2e4f1a121a81e34354ab; can you double-check?

>
> From b793b4106085ab1970bdedb340e49f37843ed585 Mon Sep 17 00:00:00 2001
> From: =?UTF-8?q?Eugenio=20P=C3=A9rez?= <[email protected]>
> Date: Thu, 13 Feb 2020 17:27:05 +0100
> Subject: [PATCH] vhost: Add debug in ioctl calls
>
> ---
> drivers/vhost/net.c | 20 +++++++++++++++++---
> drivers/vhost/vhost.c | 16 ++++++++++++++--
> 2 files changed, 31 insertions(+), 5 deletions(-)
>
> diff --git a/drivers/vhost/net.c b/drivers/vhost/net.c
> index e158159671fa..e4d5f843f9c0 100644
> --- a/drivers/vhost/net.c
> +++ b/drivers/vhost/net.c
> @@ -1505,10 +1505,13 @@ static long vhost_net_set_backend(struct vhost_net *n, unsigned index, int fd)
>
> mutex_lock(&n->dev.mutex);
> r = vhost_dev_check_owner(&n->dev);
> - if (r)
> + if (r) {
> + pr_debug("vhost_dev_check_owner index=%u fd=%d rc r=%d", index, fd, r);
> goto err;
> + }
>
> if (index >= VHOST_NET_VQ_MAX) {
> + pr_debug("vhost_dev_check_owner index=%u fd=%d MAX=%d", index, fd, VHOST_NET_VQ_MAX);
> r = -ENOBUFS;
> goto err;
> }
> @@ -1518,22 +1521,26 @@ static long vhost_net_set_backend(struct vhost_net *n, unsigned index, int fd)
>
> /* Verify that ring has been setup correctly. */
> if (!vhost_vq_access_ok(vq)) {
> + pr_debug("vhost_net_set_backend index=%u fd=%d !vhost_vq_access_ok", index, fd);
> r = -EFAULT;
> goto err_vq;
> }
> sock = get_socket(fd);
> if (IS_ERR(sock)) {
> r = PTR_ERR(sock);
> + pr_debug("vhost_net_set_backend index=%u fd=%d get_socket err r=%d", index, fd, r);
> goto err_vq;
> }
>
> /* start polling new socket */
> oldsock = vq->private_data;
> if (sock != oldsock) {
> + pr_debug("sock=%p != oldsock=%p index=%u fd=%d vq=%p", sock, oldsock, index, fd, vq);
> ubufs = vhost_net_ubuf_alloc(vq,
> sock && vhost_sock_zcopy(sock));
> if (IS_ERR(ubufs)) {
> r = PTR_ERR(ubufs);
> + pr_debug("ubufs index=%u fd=%d err r=%d vq=%p", index, fd, r, vq);
> goto err_ubufs;
> }
>
> @@ -1541,11 +1548,15 @@ static long vhost_net_set_backend(struct vhost_net *n, unsigned index, int fd)
> vq->private_data = sock;
> vhost_net_buf_unproduce(nvq);
> r = vhost_vq_init_access(vq);
> - if (r)
> + if (r) {
> + pr_debug("init_access index=%u fd=%d r=%d vq=%p", index, fd, r, vq);
> goto err_used;
> + }
> r = vhost_net_enable_vq(n, vq);
> - if (r)
> + if (r) {
> + pr_debug("enable_vq index=%u fd=%d r=%d vq=%p", index, fd, r, vq);
> goto err_used;
> + }
> if (index == VHOST_NET_VQ_RX)
> nvq->rx_ring = get_tap_ptr_ring(fd);
>
> @@ -1559,6 +1570,8 @@ static long vhost_net_set_backend(struct vhost_net *n, unsigned index, int fd)
>
> mutex_unlock(&vq->mutex);
>
> + pr_debug("sock=%p", sock);
> +
> if (oldubufs) {
> vhost_net_ubuf_put_wait_and_free(oldubufs);
> mutex_lock(&vq->mutex);
> @@ -1710,6 +1723,7 @@ static long vhost_net_ioctl(struct file *f, unsigned int ioctl,
>
> switch (ioctl) {
> case VHOST_NET_SET_BACKEND:
> + pr_debug("VHOST_NET_SET_BACKEND");
> if (copy_from_user(&backend, argp, sizeof backend))
> return -EFAULT;
> return vhost_net_set_backend(n, backend.index, backend.fd);
> diff --git a/drivers/vhost/vhost.c b/drivers/vhost/vhost.c
> index 021d70bed015..7f4848f9cec3 100644
> --- a/drivers/vhost/vhost.c
> +++ b/drivers/vhost/vhost.c
> @@ -1642,18 +1642,30 @@ long vhost_vring_ioctl(struct vhost_dev *d, unsigned int ioctl, void __user *arg
> r = -EINVAL;
> break;
> }
> +
> + if (vq->last_avail_idx || vq->avail_idx) {
> + pr_debug(
> + "strange VHOST_SET_VRING_BASE [vq=%p][s.index=%u][s.num=%u]",
> + vq, s.index, s.num);
> + dump_stack();
> + r = 0;
> + break;
> + }
> vq->last_avail_idx = s.num;
> /* Forget the cached index value. */
> vq->avail_idx = vq->last_avail_idx;
> pr_debug(
> - "VHOST_SET_VRING_BASE [vq=%p][vq->last_avail_idx=%u][vq->avail_idx=%u]",
> - vq, vq->last_avail_idx, vq->avail_idx);
> + "VHOST_SET_VRING_BASE [vq=%p][vq->last_avail_idx=%u][vq->avail_idx=%u][s.index=%u][s.num=%u]",
> + vq, vq->last_avail_idx, vq->avail_idx, s.index, s.num);
> break;
> case VHOST_GET_VRING_BASE:
> s.index = idx;
> s.num = vq->last_avail_idx;
> if (copy_to_user(argp, &s, sizeof s))
> r = -EFAULT;
> + pr_debug(
> + "VHOST_GET_VRING_BASE [vq=%p][vq->last_avail_idx=%u][vq->avail_idx=%u][s.index=%u][s.num=%u]",
> + vq, vq->last_avail_idx, vq->avail_idx, s.index, s.num);
> break;
> case VHOST_SET_VRING_KICK:
> if (copy_from_user(&f, argp, sizeof f)) {
>

2020-02-14 07:07:45

by Eugenio Perez Martin

Subject: Re: vhost changes (batched) in linux-next after 12/13 trigger random crashes in KVM guests after reboot

Hi Christian.

Sorry, that was meant to be applied on top of the previous debug patch.

Here I inline the one meant to be applied directly on top of eccb852f1fe6bede630e2e4f1a121a81e34354ab.

Thanks!

From d978ace99e4844b49b794d768385db3d128a4cc0 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Eugenio=20P=C3=A9rez?= <[email protected]>
Date: Fri, 14 Feb 2020 08:02:26 +0100
Subject: [PATCH] vhost: disable all features and trace last_avail_idx and
ioctl calls

---
drivers/vhost/net.c | 20 +++++++++++++++++---
drivers/vhost/vhost.c | 25 +++++++++++++++++++++++--
drivers/vhost/vhost.h | 10 +++++-----
3 files changed, 45 insertions(+), 10 deletions(-)

diff --git a/drivers/vhost/net.c b/drivers/vhost/net.c
index e158159671fa..e4d5f843f9c0 100644
--- a/drivers/vhost/net.c
+++ b/drivers/vhost/net.c
@@ -1505,10 +1505,13 @@ static long vhost_net_set_backend(struct vhost_net *n, unsigned index, int fd)

mutex_lock(&n->dev.mutex);
r = vhost_dev_check_owner(&n->dev);
- if (r)
+ if (r) {
+ pr_debug("vhost_dev_check_owner index=%u fd=%d rc r=%d", index, fd, r);
goto err;
+ }

if (index >= VHOST_NET_VQ_MAX) {
+ pr_debug("vhost_dev_check_owner index=%u fd=%d MAX=%d", index, fd, VHOST_NET_VQ_MAX);
r = -ENOBUFS;
goto err;
}
@@ -1518,22 +1521,26 @@ static long vhost_net_set_backend(struct vhost_net *n, unsigned index, int fd)

/* Verify that ring has been setup correctly. */
if (!vhost_vq_access_ok(vq)) {
+ pr_debug("vhost_net_set_backend index=%u fd=%d !vhost_vq_access_ok", index, fd);
r = -EFAULT;
goto err_vq;
}
sock = get_socket(fd);
if (IS_ERR(sock)) {
r = PTR_ERR(sock);
+ pr_debug("vhost_net_set_backend index=%u fd=%d get_socket err r=%d", index, fd, r);
goto err_vq;
}

/* start polling new socket */
oldsock = vq->private_data;
if (sock != oldsock) {
+ pr_debug("sock=%p != oldsock=%p index=%u fd=%d vq=%p", sock, oldsock, index, fd, vq);
ubufs = vhost_net_ubuf_alloc(vq,
sock && vhost_sock_zcopy(sock));
if (IS_ERR(ubufs)) {
r = PTR_ERR(ubufs);
+ pr_debug("ubufs index=%u fd=%d err r=%d vq=%p", index, fd, r, vq);
goto err_ubufs;
}

@@ -1541,11 +1548,15 @@ static long vhost_net_set_backend(struct vhost_net *n, unsigned index, int fd)
vq->private_data = sock;
vhost_net_buf_unproduce(nvq);
r = vhost_vq_init_access(vq);
- if (r)
+ if (r) {
+ pr_debug("init_access index=%u fd=%d r=%d vq=%p", index, fd, r, vq);
goto err_used;
+ }
r = vhost_net_enable_vq(n, vq);
- if (r)
+ if (r) {
+ pr_debug("enable_vq index=%u fd=%d r=%d vq=%p", index, fd, r, vq);
goto err_used;
+ }
if (index == VHOST_NET_VQ_RX)
nvq->rx_ring = get_tap_ptr_ring(fd);

@@ -1559,6 +1570,8 @@ static long vhost_net_set_backend(struct vhost_net *n, unsigned index, int fd)

mutex_unlock(&vq->mutex);

+ pr_debug("sock=%p", sock);
+
if (oldubufs) {
vhost_net_ubuf_put_wait_and_free(oldubufs);
mutex_lock(&vq->mutex);
@@ -1710,6 +1723,7 @@ static long vhost_net_ioctl(struct file *f, unsigned int ioctl,

switch (ioctl) {
case VHOST_NET_SET_BACKEND:
+ pr_debug("VHOST_NET_SET_BACKEND");
if (copy_from_user(&backend, argp, sizeof backend))
return -EFAULT;
return vhost_net_set_backend(n, backend.index, backend.fd);
diff --git a/drivers/vhost/vhost.c b/drivers/vhost/vhost.c
index b5a51b1f2e79..ec25ba32fe81 100644
--- a/drivers/vhost/vhost.c
+++ b/drivers/vhost/vhost.c
@@ -1642,15 +1642,30 @@ long vhost_vring_ioctl(struct vhost_dev *d, unsigned int ioctl, void __user *arg
r = -EINVAL;
break;
}
+
+ if (vq->last_avail_idx || vq->avail_idx) {
+ pr_debug(
+ "strange VHOST_SET_VRING_BASE [vq=%p][s.index=%u][s.num=%u]",
+ vq, s.index, s.num);
+ dump_stack();
+ r = 0;
+ break;
+ }
vq->last_avail_idx = s.num;
/* Forget the cached index value. */
vq->avail_idx = vq->last_avail_idx;
+ pr_debug(
+ "VHOST_SET_VRING_BASE [vq=%p][vq->last_avail_idx=%u][vq->avail_idx=%u][s.index=%u][s.num=%u]",
+ vq, vq->last_avail_idx, vq->avail_idx, s.index, s.num);
break;
case VHOST_GET_VRING_BASE:
s.index = idx;
s.num = vq->last_avail_idx;
if (copy_to_user(argp, &s, sizeof s))
r = -EFAULT;
+ pr_debug(
+ "VHOST_GET_VRING_BASE [vq=%p][vq->last_avail_idx=%u][vq->avail_idx=%u][s.index=%u][s.num=%u]",
+ vq, vq->last_avail_idx, vq->avail_idx, s.index, s.num);
break;
case VHOST_SET_VRING_KICK:
if (copy_from_user(&f, argp, sizeof f)) {
@@ -2239,8 +2254,8 @@ static int fetch_buf(struct vhost_virtqueue *vq)
vq->avail_idx = vhost16_to_cpu(vq, avail_idx);

if (unlikely((u16)(vq->avail_idx - last_avail_idx) > vq->num)) {
- vq_err(vq, "Guest moved used index from %u to %u",
- last_avail_idx, vq->avail_idx);
+ vq_err(vq, "Guest moved vq %p used index from %u to %u",
+ vq, last_avail_idx, vq->avail_idx);
return -EFAULT;
}

@@ -2316,6 +2331,9 @@ static int fetch_buf(struct vhost_virtqueue *vq)
BUG_ON(!(vq->used_flags & VRING_USED_F_NO_NOTIFY));

/* On success, increment avail index. */
+ pr_debug(
+ "[vq=%p][vq->last_avail_idx=%u][vq->avail_idx=%u][vq->ndescs=%d][vq->first_desc=%d]",
+ vq, vq->last_avail_idx, vq->avail_idx, vq->ndescs, vq->first_desc);
vq->last_avail_idx++;

return 0;
@@ -2432,6 +2450,9 @@ EXPORT_SYMBOL_GPL(vhost_get_vq_desc);
/* Reverse the effect of vhost_get_vq_desc. Useful for error handling. */
void vhost_discard_vq_desc(struct vhost_virtqueue *vq, int n)
{
+ pr_debug(
+ "DISCARD [vq=%p][vq->last_avail_idx=%u][vq->avail_idx=%u][n=%d]",
+ vq, vq->last_avail_idx, vq->avail_idx, n);
vq->last_avail_idx -= n;
}
EXPORT_SYMBOL_GPL(vhost_discard_vq_desc);
diff --git a/drivers/vhost/vhost.h b/drivers/vhost/vhost.h
index 661088ae6dc7..08f6d2ccb697 100644
--- a/drivers/vhost/vhost.h
+++ b/drivers/vhost/vhost.h
@@ -250,11 +250,11 @@ int vhost_init_device_iotlb(struct vhost_dev *d, bool enabled);
} while (0)

enum {
- VHOST_FEATURES = (1ULL << VIRTIO_F_NOTIFY_ON_EMPTY) |
- (1ULL << VIRTIO_RING_F_INDIRECT_DESC) |
- (1ULL << VIRTIO_RING_F_EVENT_IDX) |
- (1ULL << VHOST_F_LOG_ALL) |
- (1ULL << VIRTIO_F_ANY_LAYOUT) |
+ VHOST_FEATURES = /* (1ULL << VIRTIO_F_NOTIFY_ON_EMPTY) | */
+ /* (1ULL << VIRTIO_RING_F_INDIRECT_DESC) | */
+ /* (1ULL << VIRTIO_RING_F_EVENT_IDX) | */
+ /* (1ULL << VHOST_F_LOG_ALL) | */
+ /* (1ULL << VIRTIO_F_ANY_LAYOUT) | */
(1ULL << VIRTIO_F_VERSION_1)
};

--
2.18.1


2020-02-14 07:35:39

by Christian Borntraeger

Subject: Re: vhost changes (batched) in linux-next after 12/13 trigger random crashes in KVM guests after reboot

I did
ping -c 20 -f ... ; reboot
twice

The ping after the first reboot showed .......E

this was on the host console

[ 55.951885] CPU: 34 PID: 1908 Comm: CPU 0/KVM Not tainted 5.5.0+ #21
[ 55.951891] Hardware name: IBM 3906 M04 704 (LPAR)
[ 55.951892] Call Trace:
[ 55.951902] [<0000001ede114132>] show_stack+0x8a/0xd0
[ 55.951906] [<0000001edeb0672a>] dump_stack+0x8a/0xb8
[ 55.951915] [<000003ff803736a6>] vhost_vring_ioctl+0x6fe/0x858 [vhost]
[ 55.951919] [<000003ff8042a608>] vhost_net_ioctl+0x510/0x570 [vhost_net]
[ 55.951924] [<0000001ede3c4dd8>] do_vfs_ioctl+0x430/0x6f8
[ 55.951926] [<0000001ede3c5124>] ksys_ioctl+0x84/0xb0
[ 55.951927] [<0000001ede3c51ba>] __s390x_sys_ioctl+0x2a/0x38
[ 55.951931] [<0000001edeb27f72>] system_call+0x2a6/0x2c8
[ 55.951949] CPU: 34 PID: 1908 Comm: CPU 0/KVM Not tainted 5.5.0+ #21
[ 55.951950] Hardware name: IBM 3906 M04 704 (LPAR)
[ 55.951951] Call Trace:
[ 55.951952] [<0000001ede114132>] show_stack+0x8a/0xd0
[ 55.951954] [<0000001edeb0672a>] dump_stack+0x8a/0xb8
[ 55.951956] [<000003ff803736a6>] vhost_vring_ioctl+0x6fe/0x858 [vhost]
[ 55.951958] [<000003ff8042a608>] vhost_net_ioctl+0x510/0x570 [vhost_net]
[ 55.951959] [<0000001ede3c4dd8>] do_vfs_ioctl+0x430/0x6f8
[ 55.951961] [<0000001ede3c5124>] ksys_ioctl+0x84/0xb0
[ 55.951962] [<0000001ede3c51ba>] __s390x_sys_ioctl+0x2a/0x38
[ 55.951964] [<0000001edeb27f72>] system_call+0x2a6/0x2c8
[ 55.951997] Guest moved vq 0000000063d896c6 used index from 44 to 0
[ 56.609831] unexpected descriptor format for RX: out 0, in 0
[ 86.540460] CPU: 6 PID: 1908 Comm: CPU 0/KVM Not tainted 5.5.0+ #21
[ 86.540464] Hardware name: IBM 3906 M04 704 (LPAR)
[ 86.540466] Call Trace:
[ 86.540473] [<0000001ede114132>] show_stack+0x8a/0xd0
[ 86.540477] [<0000001edeb0672a>] dump_stack+0x8a/0xb8
[ 86.540486] [<000003ff803736a6>] vhost_vring_ioctl+0x6fe/0x858 [vhost]
[ 86.540490] [<000003ff8042a608>] vhost_net_ioctl+0x510/0x570 [vhost_net]
[ 86.540494] [<0000001ede3c4dd8>] do_vfs_ioctl+0x430/0x6f8
[ 86.540496] [<0000001ede3c5124>] ksys_ioctl+0x84/0xb0
[ 86.540498] [<0000001ede3c51ba>] __s390x_sys_ioctl+0x2a/0x38
[ 86.540501] [<0000001edeb27f72>] system_call+0x2a6/0x2c8
[ 86.540524] CPU: 6 PID: 1908 Comm: CPU 0/KVM Not tainted 5.5.0+ #21
[ 86.540525] Hardware name: IBM 3906 M04 704 (LPAR)
[ 86.540526] Call Trace:
[ 86.540527] [<0000001ede114132>] show_stack+0x8a/0xd0
[ 86.540528] [<0000001edeb0672a>] dump_stack+0x8a/0xb8
[ 86.540531] [<000003ff803736a6>] vhost_vring_ioctl+0x6fe/0x858 [vhost]
[ 86.540532] [<000003ff8042a608>] vhost_net_ioctl+0x510/0x570 [vhost_net]
[ 86.540534] [<0000001ede3c4dd8>] do_vfs_ioctl+0x430/0x6f8
[ 86.540536] [<0000001ede3c5124>] ksys_ioctl+0x84/0xb0
[ 86.540537] [<0000001ede3c51ba>] __s390x_sys_ioctl+0x2a/0x38
[ 86.540538] [<0000001edeb27f72>] system_call+0x2a6/0x2c8
[ 86.540570] unexpected descriptor format for RX: out 0, in 0
[ 86.540577] Unexpected header len for TX: 0 expected 0


On 14.02.20 08:06, Eugenio Pérez wrote:
> Hi Christian.
>
> Sorry, that one was meant to be applied on top of the previous debug patch.
>
> Here I inline the one meant to be applied directly on top of commit eccb852f1fe6bede630e2e4f1a121a81e34354ab.
>
> Thanks!
>
> [...]

2020-02-14 07:41:50

by Eugenio Perez Martin

Subject: Re: vhost changes (batched) in linux-next after 12/13 trigger random crashes in KVM guests after reboot

Hi.

Were the vhost and vhost_net modules loaded with dyndbg='+plt'? All of
the other regular debug traces are missing from that output.

Thanks!

On Fri, Feb 14, 2020 at 8:34 AM Christian Borntraeger
<[email protected]> wrote:
> [...]

2020-02-14 07:45:28

by Christian Borntraeger

Subject: Re: vhost changes (batched) in linux-next after 12/13 trigger random crashes in KVM guests after reboot



On 14.02.20 08:40, Eugenio Perez Martin wrote:
> Hi.
>
> Were the vhost and vhost_net modules loaded with dyndbg='+plt'? All of
> the other regular debug traces are missing from that output.

I did

echo -n 'file drivers/vhost/vhost.c +plt' > control
and
echo -n 'file drivers/vhost/net.c +plt' > control

but apparently it did not work... me hates dynamic debug.
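For what it's worth, the control file usually has to be addressed by its full debugfs path, and a `module` selector sidesteps any ambiguity about how the source file name was recorded at build time. A hedged sketch of queries that should enable every pr_debug() site in both modules (assuming debugfs is mounted at the usual location and the modules were built with CONFIG_DYNAMIC_DEBUG):

```shell
# +p enables the print, +l adds the source line, +t adds the thread id
echo 'module vhost +plt'     > /sys/kernel/debug/dynamic_debug/control
echo 'module vhost_net +plt' > /sys/kernel/debug/dynamic_debug/control

# or enable at load time, before starting the guest:
modprobe vhost dyndbg=+plt
modprobe vhost_net dyndbg=+plt

# verify which callsites are actually enabled (flags column shows "=plt")
grep =plt /sys/kernel/debug/dynamic_debug/control
```

If the `file drivers/vhost/vhost.c` form was used from a directory where `control` did not resolve to the dynamic-debug control file, the writes would silently do nothing, which would explain the missing traces.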

>
> Thanks!
>
> On Fri, Feb 14, 2020 at 8:34 AM Christian Borntraeger
> <[email protected]> wrote:
>>
>> [...]
>>
>>
>> On 14.02.20 08:06, Eugenio Pérez wrote:
>>> Hi Christian.
>>>
>>> Sorry, that was meant to be applied over previous debug patch.
>>>
>>> Here I inline the one meant to be applied over eccb852f1fe6bede630e2e4f1a121a81e34354ab.
>>>
>>> Thanks!
>>>
>>> From d978ace99e4844b49b794d768385db3d128a4cc0 Mon Sep 17 00:00:00 2001
>>> From: =?UTF-8?q?Eugenio=20P=C3=A9rez?= <[email protected]>
>>> Date: Fri, 14 Feb 2020 08:02:26 +0100
>>> Subject: [PATCH] vhost: disable all features and trace last_avail_idx and
>>> ioctl calls
>>>
>>> ---
>>> drivers/vhost/net.c | 20 +++++++++++++++++---
>>> drivers/vhost/vhost.c | 25 +++++++++++++++++++++++--
>>> drivers/vhost/vhost.h | 10 +++++-----
>>> 3 files changed, 45 insertions(+), 10 deletions(-)
>>>
>>> diff --git a/drivers/vhost/net.c b/drivers/vhost/net.c
>>> index e158159671fa..e4d5f843f9c0 100644
>>> --- a/drivers/vhost/net.c
>>> +++ b/drivers/vhost/net.c
>>> @@ -1505,10 +1505,13 @@ static long vhost_net_set_backend(struct vhost_net *n, unsigned index, int fd)
>>>
>>> mutex_lock(&n->dev.mutex);
>>> r = vhost_dev_check_owner(&n->dev);
>>> - if (r)
>>> + if (r) {
>>> + pr_debug("vhost_dev_check_owner index=%u fd=%d rc r=%d", index, fd, r);
>>> goto err;
>>> + }
>>>
>>> if (index >= VHOST_NET_VQ_MAX) {
>>> + pr_debug("vhost_dev_check_owner index=%u fd=%d MAX=%d", index, fd, VHOST_NET_VQ_MAX);
>>> r = -ENOBUFS;
>>> goto err;
>>> }
>>> @@ -1518,22 +1521,26 @@ static long vhost_net_set_backend(struct vhost_net *n, unsigned index, int fd)
>>>
>>> /* Verify that ring has been setup correctly. */
>>> if (!vhost_vq_access_ok(vq)) {
>>> + pr_debug("vhost_net_set_backend index=%u fd=%d !vhost_vq_access_ok", index, fd);
>>> r = -EFAULT;
>>> goto err_vq;
>>> }
>>> sock = get_socket(fd);
>>> if (IS_ERR(sock)) {
>>> r = PTR_ERR(sock);
>>> + pr_debug("vhost_net_set_backend index=%u fd=%d get_socket err r=%d", index, fd, r);
>>> goto err_vq;
>>> }
>>>
>>> /* start polling new socket */
>>> oldsock = vq->private_data;
>>> if (sock != oldsock) {
>>> + pr_debug("sock=%p != oldsock=%p index=%u fd=%d vq=%p", sock, oldsock, index, fd, vq);
>>> ubufs = vhost_net_ubuf_alloc(vq,
>>> sock && vhost_sock_zcopy(sock));
>>> if (IS_ERR(ubufs)) {
>>> r = PTR_ERR(ubufs);
>>> + pr_debug("ubufs index=%u fd=%d err r=%d vq=%p", index, fd, r, vq);
>>> goto err_ubufs;
>>> }
>>>
>>> @@ -1541,11 +1548,15 @@ static long vhost_net_set_backend(struct vhost_net *n, unsigned index, int fd)
>>> vq->private_data = sock;
>>> vhost_net_buf_unproduce(nvq);
>>> r = vhost_vq_init_access(vq);
>>> - if (r)
>>> + if (r) {
>>> + pr_debug("init_access index=%u fd=%d r=%d vq=%p", index, fd, r, vq);
>>> goto err_used;
>>> + }
>>> r = vhost_net_enable_vq(n, vq);
>>> - if (r)
>>> + if (r) {
>>> + pr_debug("enable_vq index=%u fd=%d r=%d vq=%p", index, fd, r, vq);
>>> goto err_used;
>>> + }
>>> if (index == VHOST_NET_VQ_RX)
>>> nvq->rx_ring = get_tap_ptr_ring(fd);
>>>
>>> @@ -1559,6 +1570,8 @@ static long vhost_net_set_backend(struct vhost_net *n, unsigned index, int fd)
>>>
>>> mutex_unlock(&vq->mutex);
>>>
>>> + pr_debug("sock=%p", sock);
>>> +
>>> if (oldubufs) {
>>> vhost_net_ubuf_put_wait_and_free(oldubufs);
>>> mutex_lock(&vq->mutex);
>>> @@ -1710,6 +1723,7 @@ static long vhost_net_ioctl(struct file *f, unsigned int ioctl,
>>>
>>> switch (ioctl) {
>>> case VHOST_NET_SET_BACKEND:
>>> + pr_debug("VHOST_NET_SET_BACKEND");
>>> if (copy_from_user(&backend, argp, sizeof backend))
>>> return -EFAULT;
>>> return vhost_net_set_backend(n, backend.index, backend.fd);
>>> diff --git a/drivers/vhost/vhost.c b/drivers/vhost/vhost.c
>>> index b5a51b1f2e79..ec25ba32fe81 100644
>>> --- a/drivers/vhost/vhost.c
>>> +++ b/drivers/vhost/vhost.c
>>> @@ -1642,15 +1642,30 @@ long vhost_vring_ioctl(struct vhost_dev *d, unsigned int ioctl, void __user *arg
>>> r = -EINVAL;
>>> break;
>>> }
>>> +
>>> + if (vq->last_avail_idx || vq->avail_idx) {
>>> + pr_debug(
>>> + "strange VHOST_SET_VRING_BASE [vq=%p][s.index=%u][s.num=%u]",
>>> + vq, s.index, s.num);
>>> + dump_stack();
>>> + r = 0;
>>> + break;
>>> + }
>>> vq->last_avail_idx = s.num;
>>> /* Forget the cached index value. */
>>> vq->avail_idx = vq->last_avail_idx;
>>> + pr_debug(
>>> + "VHOST_SET_VRING_BASE [vq=%p][vq->last_avail_idx=%u][vq->avail_idx=%u][s.index=%u][s.num=%u]",
>>> + vq, vq->last_avail_idx, vq->avail_idx, s.index, s.num);
>>> break;
>>> case VHOST_GET_VRING_BASE:
>>> s.index = idx;
>>> s.num = vq->last_avail_idx;
>>> if (copy_to_user(argp, &s, sizeof s))
>>> r = -EFAULT;
>>> + pr_debug(
>>> + "VHOST_GET_VRING_BASE [vq=%p][vq->last_avail_idx=%u][vq->avail_idx=%u][s.index=%u][s.num=%u]",
>>> + vq, vq->last_avail_idx, vq->avail_idx, s.index, s.num);
>>> break;
>>> case VHOST_SET_VRING_KICK:
>>> if (copy_from_user(&f, argp, sizeof f)) {
>>> @@ -2239,8 +2254,8 @@ static int fetch_buf(struct vhost_virtqueue *vq)
>>> vq->avail_idx = vhost16_to_cpu(vq, avail_idx);
>>>
>>> if (unlikely((u16)(vq->avail_idx - last_avail_idx) > vq->num)) {
>>> - vq_err(vq, "Guest moved used index from %u to %u",
>>> - last_avail_idx, vq->avail_idx);
>>> + vq_err(vq, "Guest moved vq %p used index from %u to %u",
>>> + vq, last_avail_idx, vq->avail_idx);
>>> return -EFAULT;
>>> }
>>>
>>> @@ -2316,6 +2331,9 @@ static int fetch_buf(struct vhost_virtqueue *vq)
>>> BUG_ON(!(vq->used_flags & VRING_USED_F_NO_NOTIFY));
>>>
>>> /* On success, increment avail index. */
>>> + pr_debug(
>>> + "[vq=%p][vq->last_avail_idx=%u][vq->avail_idx=%u][vq->ndescs=%d][vq->first_desc=%d]",
>>> + vq, vq->last_avail_idx, vq->avail_idx, vq->ndescs, vq->first_desc);
>>> vq->last_avail_idx++;
>>>
>>> return 0;
>>> @@ -2432,6 +2450,9 @@ EXPORT_SYMBOL_GPL(vhost_get_vq_desc);
>>> /* Reverse the effect of vhost_get_vq_desc. Useful for error handling. */
>>> void vhost_discard_vq_desc(struct vhost_virtqueue *vq, int n)
>>> {
>>> + pr_debug(
>>> + "DISCARD [vq=%p][vq->last_avail_idx=%u][vq->avail_idx=%u][n=%d]",
>>> + vq, vq->last_avail_idx, vq->avail_idx, n);
>>> vq->last_avail_idx -= n;
>>> }
>>> EXPORT_SYMBOL_GPL(vhost_discard_vq_desc);
>>> diff --git a/drivers/vhost/vhost.h b/drivers/vhost/vhost.h
>>> index 661088ae6dc7..08f6d2ccb697 100644
>>> --- a/drivers/vhost/vhost.h
>>> +++ b/drivers/vhost/vhost.h
>>> @@ -250,11 +250,11 @@ int vhost_init_device_iotlb(struct vhost_dev *d, bool enabled);
>>> } while (0)
>>>
>>> enum {
>>> - VHOST_FEATURES = (1ULL << VIRTIO_F_NOTIFY_ON_EMPTY) |
>>> - (1ULL << VIRTIO_RING_F_INDIRECT_DESC) |
>>> - (1ULL << VIRTIO_RING_F_EVENT_IDX) |
>>> - (1ULL << VHOST_F_LOG_ALL) |
>>> - (1ULL << VIRTIO_F_ANY_LAYOUT) |
>>> + VHOST_FEATURES = /* (1ULL << VIRTIO_F_NOTIFY_ON_EMPTY) | */
>>> + /* (1ULL << VIRTIO_RING_F_INDIRECT_DESC) | */
>>> + /* (1ULL << VIRTIO_RING_F_EVENT_IDX) | */
>>> + /* (1ULL << VHOST_F_LOG_ALL) | */
>>> + /* (1ULL << VIRTIO_F_ANY_LAYOUT) | */
>>> (1ULL << VIRTIO_F_VERSION_1)
>>> };
>>>
>>>
>>
>

2020-02-14 07:49:46

by Christian Borntraeger

Subject: Re: vhost changes (batched) in linux-next after 12/13 trigger random crashes in KVM guests after reboot

repro


On 14.02.20 08:43, Christian Borntraeger wrote:
>
>
> On 14.02.20 08:40, Eugenio Perez Martin wrote:
>> Hi.
>>
>> Were the vhost and vhost_net modules loaded with dyndbg='+plt'? I miss
>> all the others regular debug traces on that one.
>
> I did
>
> echo -n 'file drivers/vhost/vhost.c +plt' > control
> and
> echo -n 'file drivers/vhost/net.c +plt' > control
>
> but apparently it did not work...me hates dynamic debug.
>
>>
>> Thanks!
>>
>> On Fri, Feb 14, 2020 at 8:34 AM Christian Borntraeger
>> <[email protected]> wrote:
>>>
>>> I did
>>> ping -c 20 -f ... ; reboot
>>> twice
>>>
>>> The ping after the first reboot showed .......E
>>>
>>> this was on the host console
>>>
>>> [ 55.951885] CPU: 34 PID: 1908 Comm: CPU 0/KVM Not tainted 5.5.0+ #21
>>> [ 55.951891] Hardware name: IBM 3906 M04 704 (LPAR)
>>> [ 55.951892] Call Trace:
>>> [ 55.951902] [<0000001ede114132>] show_stack+0x8a/0xd0
>>> [ 55.951906] [<0000001edeb0672a>] dump_stack+0x8a/0xb8
>>> [ 55.951915] [<000003ff803736a6>] vhost_vring_ioctl+0x6fe/0x858 [vhost]
>>> [ 55.951919] [<000003ff8042a608>] vhost_net_ioctl+0x510/0x570 [vhost_net]
>>> [ 55.951924] [<0000001ede3c4dd8>] do_vfs_ioctl+0x430/0x6f8
>>> [ 55.951926] [<0000001ede3c5124>] ksys_ioctl+0x84/0xb0
>>> [ 55.951927] [<0000001ede3c51ba>] __s390x_sys_ioctl+0x2a/0x38
>>> [ 55.951931] [<0000001edeb27f72>] system_call+0x2a6/0x2c8
>>> [ 55.951949] CPU: 34 PID: 1908 Comm: CPU 0/KVM Not tainted 5.5.0+ #21
>>> [ 55.951950] Hardware name: IBM 3906 M04 704 (LPAR)
>>> [ 55.951951] Call Trace:
>>> [ 55.951952] [<0000001ede114132>] show_stack+0x8a/0xd0
>>> [ 55.951954] [<0000001edeb0672a>] dump_stack+0x8a/0xb8
>>> [ 55.951956] [<000003ff803736a6>] vhost_vring_ioctl+0x6fe/0x858 [vhost]
>>> [ 55.951958] [<000003ff8042a608>] vhost_net_ioctl+0x510/0x570 [vhost_net]
>>> [ 55.951959] [<0000001ede3c4dd8>] do_vfs_ioctl+0x430/0x6f8
>>> [ 55.951961] [<0000001ede3c5124>] ksys_ioctl+0x84/0xb0
>>> [ 55.951962] [<0000001ede3c51ba>] __s390x_sys_ioctl+0x2a/0x38
>>> [ 55.951964] [<0000001edeb27f72>] system_call+0x2a6/0x2c8
>>> [ 55.951997] Guest moved vq 0000000063d896c6 used index from 44 to 0
>>> [ 56.609831] unexpected descriptor format for RX: out 0, in 0
>>> [ 86.540460] CPU: 6 PID: 1908 Comm: CPU 0/KVM Not tainted 5.5.0+ #21
>>> [ 86.540464] Hardware name: IBM 3906 M04 704 (LPAR)
>>> [ 86.540466] Call Trace:
>>> [ 86.540473] [<0000001ede114132>] show_stack+0x8a/0xd0
>>> [ 86.540477] [<0000001edeb0672a>] dump_stack+0x8a/0xb8
>>> [ 86.540486] [<000003ff803736a6>] vhost_vring_ioctl+0x6fe/0x858 [vhost]
>>> [ 86.540490] [<000003ff8042a608>] vhost_net_ioctl+0x510/0x570 [vhost_net]
>>> [ 86.540494] [<0000001ede3c4dd8>] do_vfs_ioctl+0x430/0x6f8
>>> [ 86.540496] [<0000001ede3c5124>] ksys_ioctl+0x84/0xb0
>>> [ 86.540498] [<0000001ede3c51ba>] __s390x_sys_ioctl+0x2a/0x38
>>> [ 86.540501] [<0000001edeb27f72>] system_call+0x2a6/0x2c8
>>> [ 86.540524] CPU: 6 PID: 1908 Comm: CPU 0/KVM Not tainted 5.5.0+ #21
>>> [ 86.540525] Hardware name: IBM 3906 M04 704 (LPAR)
>>> [ 86.540526] Call Trace:
>>> [ 86.540527] [<0000001ede114132>] show_stack+0x8a/0xd0
>>> [ 86.540528] [<0000001edeb0672a>] dump_stack+0x8a/0xb8
>>> [ 86.540531] [<000003ff803736a6>] vhost_vring_ioctl+0x6fe/0x858 [vhost]
>>> [ 86.540532] [<000003ff8042a608>] vhost_net_ioctl+0x510/0x570 [vhost_net]
>>> [ 86.540534] [<0000001ede3c4dd8>] do_vfs_ioctl+0x430/0x6f8
>>> [ 86.540536] [<0000001ede3c5124>] ksys_ioctl+0x84/0xb0
>>> [ 86.540537] [<0000001ede3c51ba>] __s390x_sys_ioctl+0x2a/0x38
>>> [ 86.540538] [<0000001edeb27f72>] system_call+0x2a6/0x2c8
>>> [ 86.540570] unexpected descriptor format for RX: out 0, in 0
>>> [ 86.540577] Unexpected header len for TX: 0 expected 0
>>>
>>>
>>> On 14.02.20 08:06, Eugenio Pérez wrote:
>>>> Hi Christian.
>>>>
>>>> Sorry, that was meant to be applied over previous debug patch.
>>>>
>>>> Here I inline the one meant to be applied over eccb852f1fe6bede630e2e4f1a121a81e34354ab.
>>>>
>>>> Thanks!
>>>>
>>>> From d978ace99e4844b49b794d768385db3d128a4cc0 Mon Sep 17 00:00:00 2001
>>>> From: =?UTF-8?q?Eugenio=20P=C3=A9rez?= <[email protected]>
>>>> Date: Fri, 14 Feb 2020 08:02:26 +0100
>>>> Subject: [PATCH] vhost: disable all features and trace last_avail_idx and
>>>> ioctl calls
>>>>
>>>> [...]
>>>
>>


Attachments:
log (32.69 kB)

2020-02-14 12:17:49

by Eugenio Perez Martin

Subject: Re: vhost changes (batched) in linux-next after 12/13 trigger random crashes in KVM guests after reboot

On Fri, 2020-02-14 at 08:47 +0100, Christian Borntraeger wrote:
> repro
>
>
> On 14.02.20 08:43, Christian Borntraeger wrote:
> >
> > On 14.02.20 08:40, Eugenio Perez Martin wrote:
> > > Hi.
> > >
> > > Were the vhost and vhost_net modules loaded with dyndbg='+plt'? I miss
> > > all the others regular debug traces on that one.
> >
> > I did
> >
> > echo -n 'file drivers/vhost/vhost.c +plt' > control
> > and
> > echo -n 'file drivers/vhost/net.c +plt' > control
> >
> > but apparently it did not work...me hates dynamic debug.

Sorry for making you use dyndbg, but it is the easiest way to obtain accurate line numbers, thread ids...

I usually load the modules with that option (modprobe vhost dyndbg=+pfmlt && modprobe vhost_net dyndbg=+pfmlt), so the
tracing is available from the very beginning.
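
As a concrete sketch of the runtime alternative: the dynamic-debug control file normally lives under debugfs, and writing a bare `control` path without that mount point may be why the earlier attempt did not work. The loop below is my own illustration, not a command from this thread, and assumes the usual debugfs mount:

```shell
# Sketch: turn on verbose dynamic debug for the vhost modules at runtime.
# Assumes debugfs is mounted at /sys/kernel/debug; adjust CONTROL otherwise.
CONTROL=/sys/kernel/debug/dynamic_debug/control

for f in drivers/vhost/vhost.c drivers/vhost/net.c; do
    cmd="file $f +pfmlt"   # p=print, f=function, m=module, l=line, t=thread id
    echo "$cmd"
    # Apply only when the control file is actually writable (root + debugfs).
    [ -w "$CONTROL" ] && echo -n "$cmd" > "$CONTROL"
done
```

Note that a control-file write only affects modules that are already loaded, which is why loading with the dyndbg= module parameter avoids missing early messages.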

> >
> > > Thanks!
> > >
> > > On Fri, Feb 14, 2020 at 8:34 AM Christian Borntraeger
> > > <[email protected]> wrote:
> > > > I did
> > > > ping -c 20 -f ... ; reboot
> > > > twice
> > > >
> > > > The ping after the first reboot showed .......E
> > > >
> > > > this was on the host console
> > > >
> > > > [...]
> > > >
> > > >
> > > > On 14.02.20 08:06, Eugenio Pérez wrote:
> > > > > Hi Christian.
> > > > >
> > > > > Sorry, that was meant to be applied over previous debug patch.
> > > > >
> > > > > Here I inline the one meant to be applied over eccb852f1fe6bede630e2e4f1a121a81e34354ab.
> > > > >
> > > > > Thanks!
> > > > >
> > > > > From d978ace99e4844b49b794d768385db3d128a4cc0 Mon Sep 17 00:00:00 2001
> > > > > From: =?UTF-8?q?Eugenio=20P=C3=A9rez?= <[email protected]>
> > > > > Date: Fri, 14 Feb 2020 08:02:26 +0100
> > > > > Subject: [PATCH] vhost: disable all features and trace last_avail_idx and
> > > > > ioctl calls
> > > > >
> > > > > [...]

Can you try the inlined patch over 52c36ce7f334 ("vhost: use batched version by default")? My intention is to check
whether the "strange VHOST_SET_VRING_BASE" line appears. In previous tests it appeared very quickly, but maybe it takes
some time to show up, or it no longer appears.

Thanks!

From 756a96e489688b2c04230580770aac17c8a46265 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Eugenio=20P=C3=A9rez?= <[email protected]>
Date: Fri, 14 Feb 2020 08:02:26 +0100
Subject: [PATCH] vhost: disable all features and trace last_avail_idx and
ioctl calls

---
drivers/vhost/net.c | 20 +++++++++++++++++---
drivers/vhost/vhost.c | 25 +++++++++++++++++++++++--
drivers/vhost/vhost.h | 10 +++++-----
3 files changed, 45 insertions(+), 10 deletions(-)

diff --git a/drivers/vhost/net.c b/drivers/vhost/net.c
index e158159671fa..e4d5f843f9c0 100644
--- a/drivers/vhost/net.c
+++ b/drivers/vhost/net.c
@@ -1505,10 +1505,13 @@ static long vhost_net_set_backend(struct vhost_net *n, unsigned index, int fd)

mutex_lock(&n->dev.mutex);
r = vhost_dev_check_owner(&n->dev);
- if (r)
+ if (r) {
+ pr_debug("vhost_dev_check_owner index=%u fd=%d rc r=%d", index, fd, r);
goto err;
+ }

if (index >= VHOST_NET_VQ_MAX) {
+ pr_debug("vhost_dev_check_owner index=%u fd=%d MAX=%d", index, fd, VHOST_NET_VQ_MAX);
r = -ENOBUFS;
goto err;
}
@@ -1518,22 +1521,26 @@ static long vhost_net_set_backend(struct vhost_net *n, unsigned index, int fd)

/* Verify that ring has been setup correctly. */
if (!vhost_vq_access_ok(vq)) {
+ pr_debug("vhost_net_set_backend index=%u fd=%d !vhost_vq_access_ok", index, fd);
r = -EFAULT;
goto err_vq;
}
sock = get_socket(fd);
if (IS_ERR(sock)) {
r = PTR_ERR(sock);
+ pr_debug("vhost_net_set_backend index=%u fd=%d get_socket err r=%d", index, fd, r);
goto err_vq;
}

/* start polling new socket */
oldsock = vq->private_data;
if (sock != oldsock) {
+ pr_debug("sock=%p != oldsock=%p index=%u fd=%d vq=%p", sock, oldsock, index, fd, vq);
ubufs = vhost_net_ubuf_alloc(vq,
sock && vhost_sock_zcopy(sock));
if (IS_ERR(ubufs)) {
r = PTR_ERR(ubufs);
+ pr_debug("ubufs index=%u fd=%d err r=%d vq=%p", index, fd, r, vq);
goto err_ubufs;
}

@@ -1541,11 +1548,15 @@ static long vhost_net_set_backend(struct vhost_net *n, unsigned index, int fd)
vq->private_data = sock;
vhost_net_buf_unproduce(nvq);
r = vhost_vq_init_access(vq);
- if (r)
+ if (r) {
+ pr_debug("init_access index=%u fd=%d r=%d vq=%p", index, fd, r, vq);
goto err_used;
+ }
r = vhost_net_enable_vq(n, vq);
- if (r)
+ if (r) {
+ pr_debug("enable_vq index=%u fd=%d r=%d vq=%p", index, fd, r, vq);
goto err_used;
+ }
if (index == VHOST_NET_VQ_RX)
nvq->rx_ring = get_tap_ptr_ring(fd);

@@ -1559,6 +1570,8 @@ static long vhost_net_set_backend(struct vhost_net *n, unsigned index, int fd)

mutex_unlock(&vq->mutex);

+ pr_debug("sock=%p", sock);
+
if (oldubufs) {
vhost_net_ubuf_put_wait_and_free(oldubufs);
mutex_lock(&vq->mutex);
@@ -1710,6 +1723,7 @@ static long vhost_net_ioctl(struct file *f, unsigned int ioctl,

switch (ioctl) {
case VHOST_NET_SET_BACKEND:
+ pr_debug("VHOST_NET_SET_BACKEND");
if (copy_from_user(&backend, argp, sizeof backend))
return -EFAULT;
return vhost_net_set_backend(n, backend.index, backend.fd);
diff --git a/drivers/vhost/vhost.c b/drivers/vhost/vhost.c
index 56c5253056ee..babc48c8a8c4 100644
--- a/drivers/vhost/vhost.c
+++ b/drivers/vhost/vhost.c
@@ -1640,15 +1640,30 @@ long vhost_vring_ioctl(struct vhost_dev *d, unsigned int ioctl, void __user *arg
r = -EINVAL;
break;
}
+
+ if (vq->last_avail_idx || vq->avail_idx) {
+ pr_debug(
+ "strange VHOST_SET_VRING_BASE [vq=%p][s.index=%u][s.num=%u]",
+ vq, s.index, s.num);
+ dump_stack();
+ r = 0;
+ break;
+ }
vq->last_avail_idx = s.num;
/* Forget the cached index value. */
vq->avail_idx = vq->last_avail_idx;
+ pr_debug(
+ "VHOST_SET_VRING_BASE [vq=%p][vq->last_avail_idx=%u][vq->avail_idx=%u][s.index=%u][s.num=%u]",
+ vq, vq->last_avail_idx, vq->avail_idx, s.index, s.num);
break;
case VHOST_GET_VRING_BASE:
s.index = idx;
s.num = vq->last_avail_idx;
if (copy_to_user(argp, &s, sizeof s))
r = -EFAULT;
+ pr_debug(
+ "VHOST_GET_VRING_BASE [vq=%p][vq->last_avail_idx=%u][vq->avail_idx=%u][s.index=%u][s.num=%u]",
+ vq, vq->last_avail_idx, vq->avail_idx, s.index, s.num);
break;
case VHOST_SET_VRING_KICK:
if (copy_from_user(&f, argp, sizeof f)) {
@@ -2233,8 +2248,8 @@ static int fetch_descs(struct vhost_virtqueue *vq)
vq->avail_idx = vhost16_to_cpu(vq, avail_idx);

if (unlikely((u16)(vq->avail_idx - last_avail_idx) > vq->num)) {
- vq_err(vq, "Guest moved used index from %u to %u",
- last_avail_idx, vq->avail_idx);
+ vq_err(vq, "Guest moved vq %p used index from %u to %u",
+ vq, last_avail_idx, vq->avail_idx);
return -EFAULT;
}

@@ -2310,6 +2325,9 @@ static int fetch_descs(struct vhost_virtqueue *vq)
BUG_ON(!(vq->used_flags & VRING_USED_F_NO_NOTIFY));

/* On success, increment avail index. */
+ pr_debug(
+ "[vq=%p][vq->last_avail_idx=%u][vq->avail_idx=%u][vq->ndescs=%d][vq->first_desc=%d]",
+ vq, vq->last_avail_idx, vq->avail_idx, vq->ndescs, vq->first_desc);
vq->last_avail_idx++;

return 0;
@@ -2403,6 +2421,9 @@ EXPORT_SYMBOL_GPL(vhost_get_vq_desc);
/* Reverse the effect of vhost_get_vq_desc. Useful for error handling. */
void vhost_discard_vq_desc(struct vhost_virtqueue *vq, int n)
{
+ pr_debug(
+ "DISCARD [vq=%p][vq->last_avail_idx=%u][vq->avail_idx=%u][n=%d]",
+ vq, vq->last_avail_idx, vq->avail_idx, n);
vq->last_avail_idx -= n;
}
EXPORT_SYMBOL_GPL(vhost_discard_vq_desc);
diff --git a/drivers/vhost/vhost.h b/drivers/vhost/vhost.h
index a0bcf8bffa43..2ce2d3a97c31 100644
--- a/drivers/vhost/vhost.h
+++ b/drivers/vhost/vhost.h
@@ -248,11 +248,11 @@ int vhost_init_device_iotlb(struct vhost_dev *d, bool enabled);
} while (0)

enum {
- VHOST_FEATURES = (1ULL << VIRTIO_F_NOTIFY_ON_EMPTY) |
- (1ULL << VIRTIO_RING_F_INDIRECT_DESC) |
- (1ULL << VIRTIO_RING_F_EVENT_IDX) |
- (1ULL << VHOST_F_LOG_ALL) |
- (1ULL << VIRTIO_F_ANY_LAYOUT) |
+ VHOST_FEATURES = /* (1ULL << VIRTIO_F_NOTIFY_ON_EMPTY) | */
+ /* (1ULL << VIRTIO_RING_F_INDIRECT_DESC) | */
+ /* (1ULL << VIRTIO_RING_F_EVENT_IDX) | */
+ /* (1ULL << VHOST_F_LOG_ALL) | */
+ /* (1ULL << VIRTIO_F_ANY_LAYOUT) | */
(1ULL << VIRTIO_F_VERSION_1)
};

--
2.18.1


2020-02-14 12:24:41

by Christian Borntraeger

Subject: Re: vhost changes (batched) in linux-next after 12/13 trigger random crashes in KVM guests after reboot



On 14.02.20 13:17, Eugenio Pérez wrote:
> Can you try the inlined patch over 52c36ce7f334 ("vhost: use batched version by default")? My intention is to check if
> "strange VHOST_SET_VRING_BASE" line appears. In previous tests, it appears very fast, but maybe it takes some time for
> it to appear, or it does not appear anymore.

LD [M] drivers/vhost/vhost_vsock.o
CC [M] drivers/vhost/vhost.o
In file included from ./include/linux/printk.h:331,
from ./include/linux/kernel.h:15,
from ./include/linux/list.h:9,
from ./include/linux/wait.h:7,
from ./include/linux/eventfd.h:13,
from drivers/vhost/vhost.c:13:
drivers/vhost/vhost.c: In function ‘fetch_descs’:
drivers/vhost/vhost.c:2330:56: error: ‘struct vhost_virtqueue’ has no member named ‘first_desc’
2330 | vq, vq->last_avail_idx, vq->avail_idx, vq->ndescs, vq->first_desc);
| ^~
./include/linux/dynamic_debug.h:125:15: note: in definition of macro ‘__dynamic_func_call’
125 | func(&id, ##__VA_ARGS__); \
| ^~~~~~~~~~~
./include/linux/dynamic_debug.h:153:2: note: in expansion of macro ‘_dynamic_func_call’
153 | _dynamic_func_call(fmt, __dynamic_pr_debug, \
| ^~~~~~~~~~~~~~~~~~
./include/linux/printk.h:335:2: note: in expansion of macro ‘dynamic_pr_debug’
335 | dynamic_pr_debug(fmt, ##__VA_ARGS__)
| ^~~~~~~~~~~~~~~~
drivers/vhost/vhost.c:2328:2: note: in expansion of macro ‘pr_debug’
2328 | pr_debug(
| ^~~~~~~~
make[2]: *** [scripts/Makefile.build:266: drivers/vhost/vhost.o] Error 1
make[1]: *** [scripts/Makefile.build:503: drivers/vhost] Error 2

2020-02-14 12:27:29

by Eugenio Perez Martin

Subject: Re: vhost changes (batched) in linux-next after 12/13 trigger random crashes in KVM guests after reboot

On Fri, 2020-02-14 at 13:22 +0100, Christian Borntraeger wrote:
>
> On 14.02.20 13:17, Eugenio Pérez wrote:
> > Can you try the inlined patch over 52c36ce7f334 ("vhost: use batched version by default")? My intention is to check
> > if
> > "strange VHOST_SET_VRING_BASE" line appears. In previous tests, it appears very fast, but maybe it takes some time
> > for
> > it to appear, or it does not appear anymore.
>
> LD [M] drivers/vhost/vhost_vsock.o
> CC [M] drivers/vhost/vhost.o
> In file included from ./include/linux/printk.h:331,
> from ./include/linux/kernel.h:15,
> from ./include/linux/list.h:9,
> from ./include/linux/wait.h:7,
> from ./include/linux/eventfd.h:13,
> from drivers/vhost/vhost.c:13:
> drivers/vhost/vhost.c: In function ‘fetch_descs’:
> drivers/vhost/vhost.c:2330:56: error: ‘struct vhost_virtqueue’ has no member named ‘first_desc’
> 2330 | vq, vq->last_avail_idx, vq->avail_idx, vq->ndescs, vq->first_desc);
> | ^~
> ./include/linux/dynamic_debug.h:125:15: note: in definition of macro ‘__dynamic_func_call’
> 125 | func(&id, ##__VA_ARGS__); \
> | ^~~~~~~~~~~
> ./include/linux/dynamic_debug.h:153:2: note: in expansion of macro ‘_dynamic_func_call’
> 153 | _dynamic_func_call(fmt, __dynamic_pr_debug, \
> | ^~~~~~~~~~~~~~~~~~
> ./include/linux/printk.h:335:2: note: in expansion of macro ‘dynamic_pr_debug’
> 335 | dynamic_pr_debug(fmt, ##__VA_ARGS__)
> | ^~~~~~~~~~~~~~~~
> drivers/vhost/vhost.c:2328:2: note: in expansion of macro ‘pr_debug’
> 2328 | pr_debug(
> | ^~~~~~~~
> make[2]: *** [scripts/Makefile.build:266: drivers/vhost/vhost.o] Error 1
> make[1]: *** [scripts/Makefile.build:503: drivers/vhost] Error 2
>

Sorry about that. Here is the right patch.

From 5d7b5304c163910936382d46561fc43eb770aad2 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Eugenio=20P=C3=A9rez?= <[email protected]>
Date: Fri, 14 Feb 2020 08:02:26 +0100
Subject: [PATCH] vhost: disable all features and trace last_avail_idx and
ioctl calls

---
drivers/vhost/net.c | 20 +++++++++++++++++---
drivers/vhost/vhost.c | 25 +++++++++++++++++++++++--
drivers/vhost/vhost.h | 10 +++++-----
3 files changed, 45 insertions(+), 10 deletions(-)

diff --git a/drivers/vhost/net.c b/drivers/vhost/net.c
index e158159671fa..e4d5f843f9c0 100644
--- a/drivers/vhost/net.c
+++ b/drivers/vhost/net.c
@@ -1505,10 +1505,13 @@ static long vhost_net_set_backend(struct vhost_net *n, unsigned index, int fd)

mutex_lock(&n->dev.mutex);
r = vhost_dev_check_owner(&n->dev);
- if (r)
+ if (r) {
+ pr_debug("vhost_dev_check_owner index=%u fd=%d rc r=%d", index, fd, r);
goto err;
+ }

if (index >= VHOST_NET_VQ_MAX) {
+ pr_debug("vhost_dev_check_owner index=%u fd=%d MAX=%d", index, fd, VHOST_NET_VQ_MAX);
r = -ENOBUFS;
goto err;
}
@@ -1518,22 +1521,26 @@ static long vhost_net_set_backend(struct vhost_net *n, unsigned index, int fd)

/* Verify that ring has been setup correctly. */
if (!vhost_vq_access_ok(vq)) {
+ pr_debug("vhost_net_set_backend index=%u fd=%d !vhost_vq_access_ok", index, fd);
r = -EFAULT;
goto err_vq;
}
sock = get_socket(fd);
if (IS_ERR(sock)) {
r = PTR_ERR(sock);
+ pr_debug("vhost_net_set_backend index=%u fd=%d get_socket err r=%d", index, fd, r);
goto err_vq;
}

/* start polling new socket */
oldsock = vq->private_data;
if (sock != oldsock) {
+ pr_debug("sock=%p != oldsock=%p index=%u fd=%d vq=%p", sock, oldsock, index, fd, vq);
ubufs = vhost_net_ubuf_alloc(vq,
sock && vhost_sock_zcopy(sock));
if (IS_ERR(ubufs)) {
r = PTR_ERR(ubufs);
+ pr_debug("ubufs index=%u fd=%d err r=%d vq=%p", index, fd, r, vq);
goto err_ubufs;
}

@@ -1541,11 +1548,15 @@ static long vhost_net_set_backend(struct vhost_net *n, unsigned index, int fd)
vq->private_data = sock;
vhost_net_buf_unproduce(nvq);
r = vhost_vq_init_access(vq);
- if (r)
+ if (r) {
+ pr_debug("init_access index=%u fd=%d r=%d vq=%p", index, fd, r, vq);
goto err_used;
+ }
r = vhost_net_enable_vq(n, vq);
- if (r)
+ if (r) {
+ pr_debug("enable_vq index=%u fd=%d r=%d vq=%p", index, fd, r, vq);
goto err_used;
+ }
if (index == VHOST_NET_VQ_RX)
nvq->rx_ring = get_tap_ptr_ring(fd);

@@ -1559,6 +1570,8 @@ static long vhost_net_set_backend(struct vhost_net *n, unsigned index, int fd)

mutex_unlock(&vq->mutex);

+ pr_debug("sock=%p", sock);
+
if (oldubufs) {
vhost_net_ubuf_put_wait_and_free(oldubufs);
mutex_lock(&vq->mutex);
@@ -1710,6 +1723,7 @@ static long vhost_net_ioctl(struct file *f, unsigned int ioctl,

switch (ioctl) {
case VHOST_NET_SET_BACKEND:
+ pr_debug("VHOST_NET_SET_BACKEND");
if (copy_from_user(&backend, argp, sizeof backend))
return -EFAULT;
return vhost_net_set_backend(n, backend.index, backend.fd);
diff --git a/drivers/vhost/vhost.c b/drivers/vhost/vhost.c
index 56c5253056ee..2e72bbeffac7 100644
--- a/drivers/vhost/vhost.c
+++ b/drivers/vhost/vhost.c
@@ -1640,15 +1640,30 @@ long vhost_vring_ioctl(struct vhost_dev *d, unsigned int ioctl, void __user *arg
r = -EINVAL;
break;
}
+
+ if (vq->last_avail_idx || vq->avail_idx) {
+ pr_debug(
+ "strange VHOST_SET_VRING_BASE [vq=%p][s.index=%u][s.num=%u]",
+ vq, s.index, s.num);
+ dump_stack();
+ r = 0;
+ break;
+ }
vq->last_avail_idx = s.num;
/* Forget the cached index value. */
vq->avail_idx = vq->last_avail_idx;
+ pr_debug(
+ "VHOST_SET_VRING_BASE [vq=%p][vq->last_avail_idx=%u][vq->avail_idx=%u][s.index=%u][s.num=%u]",
+ vq, vq->last_avail_idx, vq->avail_idx, s.index, s.num);
break;
case VHOST_GET_VRING_BASE:
s.index = idx;
s.num = vq->last_avail_idx;
if (copy_to_user(argp, &s, sizeof s))
r = -EFAULT;
+ pr_debug(
+ "VHOST_GET_VRING_BASE [vq=%p][vq->last_avail_idx=%u][vq->avail_idx=%u][s.index=%u][s.num=%u]",
+ vq, vq->last_avail_idx, vq->avail_idx, s.index, s.num);
break;
case VHOST_SET_VRING_KICK:
if (copy_from_user(&f, argp, sizeof f)) {
@@ -2233,8 +2248,8 @@ static int fetch_descs(struct vhost_virtqueue *vq)
vq->avail_idx = vhost16_to_cpu(vq, avail_idx);

if (unlikely((u16)(vq->avail_idx - last_avail_idx) > vq->num)) {
- vq_err(vq, "Guest moved used index from %u to %u",
- last_avail_idx, vq->avail_idx);
+ vq_err(vq, "Guest moved vq %p used index from %u to %u",
+ vq, last_avail_idx, vq->avail_idx);
return -EFAULT;
}

@@ -2310,6 +2325,9 @@ static int fetch_descs(struct vhost_virtqueue *vq)
BUG_ON(!(vq->used_flags & VRING_USED_F_NO_NOTIFY));

/* On success, increment avail index. */
+ pr_debug(
+ "[vq=%p][vq->last_avail_idx=%u][vq->avail_idx=%u][vq->ndescs=%d]",
+ vq, vq->last_avail_idx, vq->avail_idx, vq->ndescs);
vq->last_avail_idx++;

return 0;
@@ -2403,6 +2421,9 @@ EXPORT_SYMBOL_GPL(vhost_get_vq_desc);
/* Reverse the effect of vhost_get_vq_desc. Useful for error handling. */
void vhost_discard_vq_desc(struct vhost_virtqueue *vq, int n)
{
+ pr_debug(
+ "DISCARD [vq=%p][vq->last_avail_idx=%u][vq->avail_idx=%u][n=%d]",
+ vq, vq->last_avail_idx, vq->avail_idx, n);
vq->last_avail_idx -= n;
}
EXPORT_SYMBOL_GPL(vhost_discard_vq_desc);
diff --git a/drivers/vhost/vhost.h b/drivers/vhost/vhost.h
index a0bcf8bffa43..2ce2d3a97c31 100644
--- a/drivers/vhost/vhost.h
+++ b/drivers/vhost/vhost.h
@@ -248,11 +248,11 @@ int vhost_init_device_iotlb(struct vhost_dev *d, bool enabled);
} while (0)

enum {
- VHOST_FEATURES = (1ULL << VIRTIO_F_NOTIFY_ON_EMPTY) |
- (1ULL << VIRTIO_RING_F_INDIRECT_DESC) |
- (1ULL << VIRTIO_RING_F_EVENT_IDX) |
- (1ULL << VHOST_F_LOG_ALL) |
- (1ULL << VIRTIO_F_ANY_LAYOUT) |
+ VHOST_FEATURES = /* (1ULL << VIRTIO_F_NOTIFY_ON_EMPTY) | */
+ /* (1ULL << VIRTIO_RING_F_INDIRECT_DESC) | */
+ /* (1ULL << VIRTIO_RING_F_EVENT_IDX) | */
+ /* (1ULL << VHOST_F_LOG_ALL) | */
+ /* (1ULL << VIRTIO_F_ANY_LAYOUT) | */
(1ULL << VIRTIO_F_VERSION_1)
};

--
2.18.1


2020-02-14 12:35:43

by Christian Borntraeger

[permalink] [raw]
Subject: Re: vhost changes (batched) in linux-next after 12/13 trigger random crashes in KVM guests after reboot



On 14.02.20 13:26, Eugenio Pérez wrote:
> On Fri, 2020-02-14 at 13:22 +0100, Christian Borntraeger wrote:
>>
>> On 14.02.20 13:17, Eugenio Pérez wrote:
>>> Can you try the inlined patch over 52c36ce7f334 ("vhost: use batched version by default")? My intention is to check
>>> if
>>> "strange VHOST_SET_VRING_BASE" line appears. In previous tests, it appears very fast, but maybe it takes some time
>>> for
>>> it to appear, or it does not appear anymore.


yep it does:

[ 67.801012] [1917] vhost:vhost_vring_ioctl:1655: VHOST_SET_VRING_BASE [vq=0000000088199421][vq->last_avail_idx=0][vq->avail_idx=0][s.index=0][s.num=0]
[ 67.801018] [1917] vhost:vhost_vring_ioctl:1655: VHOST_SET_VRING_BASE [vq=00000000175f11ec][vq->last_avail_idx=0][vq->avail_idx=0][s.index=1][s.num=0]
[ 67.801026] [1917] vhost_net:vhost_net_ioctl:1726: VHOST_NET_SET_BACKEND
[ 67.801028] [1917] vhost_net:vhost_net_set_backend:1538: sock=0000000082d8d291 != oldsock=000000001ae027fd index=0 fd=39 vq=0000000088199421
[ 67.801032] [1917] vhost_net:vhost_net_set_backend:1573: sock=0000000082d8d291
[ 67.801033] [1917] vhost_net:vhost_net_ioctl:1726: VHOST_NET_SET_BACKEND
[ 67.801034] [1917] vhost_net:vhost_net_set_backend:1538: sock=0000000082d8d291 != oldsock=000000001ae027fd index=1 fd=39 vq=00000000175f11ec
[ 67.801035] [1917] vhost_net:vhost_net_set_backend:1573: sock=0000000082d8d291
[ 67.801037] [1915] vhost:vhost_discard_vq_desc:2424: DISCARD [vq=0000000088199421][vq->last_avail_idx=0][vq->avail_idx=0][n=0]
[ 68.648803] [1915] vhost:fetch_descs:2328: [vq=0000000088199421][vq->last_avail_idx=0][vq->avail_idx=256][vq->ndescs=1]
[ 68.648810] [1915] vhost:fetch_descs:2328: [vq=0000000088199421][vq->last_avail_idx=1][vq->avail_idx=256][vq->ndescs=1]
[ 68.648812] [1915] vhost:fetch_descs:2328: [vq=0000000088199421][vq->last_avail_idx=2][vq->avail_idx=256][vq->ndescs=1]
[ 68.648815] [1915] vhost:fetch_descs:2328: [vq=0000000088199421][vq->last_avail_idx=3][vq->avail_idx=256][vq->ndescs=1]
[ 68.648817] [1915] vhost:fetch_descs:2328: [vq=0000000088199421][vq->last_avail_idx=4][vq->avail_idx=256][vq->ndescs=1]
[ 68.648818] [1915] vhost:fetch_descs:2328: [vq=0000000088199421][vq->last_avail_idx=5][vq->avail_idx=256][vq->ndescs=1]
[ 68.648820] [1915] vhost:fetch_descs:2328: [vq=0000000088199421][vq->last_avail_idx=6][vq->avail_idx=256][vq->ndescs=1]
[ 68.648822] [1915] vhost:fetch_descs:2328: [vq=0000000088199421][vq->last_avail_idx=7][vq->avail_idx=256][vq->ndescs=1]
[ 68.648824] [1915] vhost:fetch_descs:2328: [vq=0000000088199421][vq->last_avail_idx=8][vq->avail_idx=256][vq->ndescs=1]
[ 68.648826] [1915] vhost:fetch_descs:2328: [vq=0000000088199421][vq->last_avail_idx=9][vq->avail_idx=256][vq->ndescs=1]
[ 68.648828] [1915] vhost:fetch_descs:2328: [vq=0000000088199421][vq->last_avail_idx=10][vq->avail_idx=256][vq->ndescs=1]
[ 68.648829] [1915] vhost:fetch_descs:2328: [vq=0000000088199421][vq->last_avail_idx=11][vq->avail_idx=256][vq->ndescs=1]
[ 68.648831] [1915] vhost:fetch_descs:2328: [vq=0000000088199421][vq->last_avail_idx=12][vq->avail_idx=256][vq->ndescs=1]
[ 68.648832] [1915] vhost:fetch_descs:2328: [vq=0000000088199421][vq->last_avail_idx=13][vq->avail_idx=256][vq->ndescs=1]
[ 68.648833] [1915] vhost:fetch_descs:2328: [vq=0000000088199421][vq->last_avail_idx=14][vq->avail_idx=256][vq->ndescs=1]
[ 68.670292] [1915] vhost:fetch_descs:2328: [vq=00000000175f11ec][vq->last_avail_idx=0][vq->avail_idx=1][vq->ndescs=1]
[ 68.687187] [1915] vhost:fetch_descs:2328: [vq=00000000175f11ec][vq->last_avail_idx=1][vq->avail_idx=2][vq->ndescs=1]
[ 68.687623] [1915] vhost:fetch_descs:2328: [vq=0000000088199421][vq->last_avail_idx=15][vq->avail_idx=256][vq->ndescs=1]
[ 68.687641] [1915] vhost:fetch_descs:2328: [vq=00000000175f11ec][vq->last_avail_idx=2][vq->avail_idx=4][vq->ndescs=1]
[ 68.687642] [1915] vhost:fetch_descs:2328: [vq=00000000175f11ec][vq->last_avail_idx=3][vq->avail_idx=4][vq->ndescs=1]
[ 68.690274] [1915] vhost:fetch_descs:2328: [vq=0000000088199421][vq->last_avail_idx=16][vq->avail_idx=256][vq->ndescs=1]
[ 68.690539] [1915] vhost:fetch_descs:2328: [vq=00000000175f11ec][vq->last_avail_idx=4][vq->avail_idx=5][vq->ndescs=1]
[ 68.715379] [1915] vhost:fetch_descs:2328: [vq=0000000088199421][vq->last_avail_idx=17][vq->avail_idx=256][vq->ndescs=1]
[ 68.800525] [1915] vhost:fetch_descs:2328: [vq=00000000175f11ec][vq->last_avail_idx=5][vq->avail_idx=6][vq->ndescs=1]
[ 68.890537] [1915] vhost:fetch_descs:2328: [vq=00000000175f11ec][vq->last_avail_idx=6][vq->avail_idx=7][vq->ndescs=1]
[ 68.900587] [1915] vhost:fetch_descs:2328: [vq=00000000175f11ec][vq->last_avail_idx=7][vq->avail_idx=8][vq->ndescs=1]
[ 68.916837] [1915] vhost:fetch_descs:2328: [vq=00000000175f11ec][vq->last_avail_idx=8][vq->avail_idx=9][vq->ndescs=2]
[ 68.928828] [1915] vhost:fetch_descs:2328: [vq=0000000088199421][vq->last_avail_idx=18][vq->avail_idx=256][vq->ndescs=1]
[ 69.090540] [1915] vhost:fetch_descs:2328: [vq=00000000175f11ec][vq->last_avail_idx=9][vq->avail_idx=10][vq->ndescs=1]
[ 69.119651] [1915] vhost:fetch_descs:2328: [vq=00000000175f11ec][vq->last_avail_idx=10][vq->avail_idx=11][vq->ndescs=2]
[ 69.132325] [1915] vhost:fetch_descs:2328: [vq=0000000088199421][vq->last_avail_idx=19][vq->avail_idx=256][vq->ndescs=1]
[ 69.323473] [1915] vhost:fetch_descs:2328: [vq=00000000175f11ec][vq->last_avail_idx=11][vq->avail_idx=12][vq->ndescs=2]
[ 69.354557] [1915] vhost:fetch_descs:2328: [vq=0000000088199421][vq->last_avail_idx=20][vq->avail_idx=256][vq->ndescs=1]
[ 69.442550] [1915] vhost:fetch_descs:2328: [vq=0000000088199421][vq->last_avail_idx=21][vq->avail_idx=256][vq->ndescs=1]
[ 69.523593] [1915] vhost:fetch_descs:2328: [vq=00000000175f11ec][vq->last_avail_idx=12][vq->avail_idx=13][vq->ndescs=2]
[ 69.557360] [1915] vhost:fetch_descs:2328: [vq=0000000088199421][vq->last_avail_idx=22][vq->avail_idx=256][vq->ndescs=1]
[ 69.980634] [1915] vhost:fetch_descs:2328: [vq=00000000175f11ec][vq->last_avail_idx=13][vq->avail_idx=14][vq->ndescs=1]
[ 69.981364] [1915] vhost:fetch_descs:2328: [vq=00000000175f11ec][vq->last_avail_idx=14][vq->avail_idx=15][vq->ndescs=1]
[ 70.010545] [1915] vhost:fetch_descs:2328: [vq=00000000175f11ec][vq->last_avail_idx=15][vq->avail_idx=16][vq->ndescs=1]
[ 70.161316] [1915] vhost:fetch_descs:2328: [vq=0000000088199421][vq->last_avail_idx=23][vq->avail_idx=256][vq->ndescs=1]
[ 70.177640] [1915] vhost:fetch_descs:2328: [vq=0000000088199421][vq->last_avail_idx=24][vq->avail_idx=256][vq->ndescs=1]
[ 70.280564] [1915] vhost:fetch_descs:2328: [vq=00000000175f11ec][vq->last_avail_idx=16][vq->avail_idx=17][vq->ndescs=1]
[ 70.670327] [1915] vhost:fetch_descs:2328: [vq=00000000175f11ec][vq->last_avail_idx=17][vq->avail_idx=18][vq->ndescs=1]
[ 70.932887] [1915] vhost:fetch_descs:2328: [vq=00000000175f11ec][vq->last_avail_idx=18][vq->avail_idx=19][vq->ndescs=2]
[ 70.940587] [1915] vhost:fetch_descs:2328: [vq=00000000175f11ec][vq->last_avail_idx=19][vq->avail_idx=20][vq->ndescs=1]
[ 70.947598] [1915] vhost:fetch_descs:2328: [vq=0000000088199421][vq->last_avail_idx=25][vq->avail_idx=256][vq->ndescs=1]
[ 71.070388] [1915] vhost:fetch_descs:2328: [vq=00000000175f11ec][vq->last_avail_idx=20][vq->avail_idx=21][vq->ndescs=1]
[ 71.070770] [1915] vhost:fetch_descs:2328: [vq=0000000088199421][vq->last_avail_idx=26][vq->avail_idx=256][vq->ndescs=1]
[ 71.070805] [1915] vhost:fetch_descs:2328: [vq=00000000175f11ec][vq->last_avail_idx=21][vq->avail_idx=22][vq->ndescs=1]
[ 71.070977] [1915] vhost:fetch_descs:2328: [vq=0000000088199421][vq->last_avail_idx=27][vq->avail_idx=256][vq->ndescs=1]
[ 71.071049] [1915] vhost:fetch_descs:2328: [vq=00000000175f11ec][vq->last_avail_idx=22][vq->avail_idx=23][vq->ndescs=1]
[ 71.071195] [1915] vhost:fetch_descs:2328: [vq=0000000088199421][vq->last_avail_idx=28][vq->avail_idx=256][vq->ndescs=1]
[ 71.071243] [1915] vhost:fetch_descs:2328: [vq=00000000175f11ec][vq->last_avail_idx=23][vq->avail_idx=24][vq->ndescs=1]
[ 71.071386] [1915] vhost:fetch_descs:2328: [vq=0000000088199421][vq->last_avail_idx=29][vq->avail_idx=256][vq->ndescs=1]
[ 71.071433] [1915] vhost:fetch_descs:2328: [vq=00000000175f11ec][vq->last_avail_idx=24][vq->avail_idx=25][vq->ndescs=1]
[ 71.071575] [1915] vhost:fetch_descs:2328: [vq=0000000088199421][vq->last_avail_idx=30][vq->avail_idx=256][vq->ndescs=1]
[ 71.071611] [1915] vhost:fetch_descs:2328: [vq=00000000175f11ec][vq->last_avail_idx=25][vq->avail_idx=26][vq->ndescs=1]
[ 71.071747] [1915] vhost:fetch_descs:2328: [vq=0000000088199421][vq->last_avail_idx=31][vq->avail_idx=256][vq->ndescs=1]
[ 71.071789] [1915] vhost:fetch_descs:2328: [vq=00000000175f11ec][vq->last_avail_idx=26][vq->avail_idx=27][vq->ndescs=1]
[ 71.071923] [1915] vhost:fetch_descs:2328: [vq=0000000088199421][vq->last_avail_idx=32][vq->avail_idx=256][vq->ndescs=1]
[ 71.071960] [1915] vhost:fetch_descs:2328: [vq=00000000175f11ec][vq->last_avail_idx=27][vq->avail_idx=28][vq->ndescs=1]
[ 71.072096] [1915] vhost:fetch_descs:2328: [vq=0000000088199421][vq->last_avail_idx=33][vq->avail_idx=256][vq->ndescs=1]
[ 71.072128] [1915] vhost:fetch_descs:2328: [vq=00000000175f11ec][vq->last_avail_idx=28][vq->avail_idx=29][vq->ndescs=1]
[ 71.072267] [1915] vhost:fetch_descs:2328: [vq=0000000088199421][vq->last_avail_idx=34][vq->avail_idx=256][vq->ndescs=1]
[ 71.072300] [1915] vhost:fetch_descs:2328: [vq=00000000175f11ec][vq->last_avail_idx=29][vq->avail_idx=30][vq->ndescs=1]
[ 71.072432] [1915] vhost:fetch_descs:2328: [vq=0000000088199421][vq->last_avail_idx=35][vq->avail_idx=256][vq->ndescs=1]
[ 71.072463] [1915] vhost:fetch_descs:2328: [vq=00000000175f11ec][vq->last_avail_idx=30][vq->avail_idx=31][vq->ndescs=1]
[ 71.072596] [1915] vhost:fetch_descs:2328: [vq=0000000088199421][vq->last_avail_idx=36][vq->avail_idx=256][vq->ndescs=1]
[ 71.072630] [1915] vhost:fetch_descs:2328: [vq=00000000175f11ec][vq->last_avail_idx=31][vq->avail_idx=32][vq->ndescs=1]
[ 71.072759] [1915] vhost:fetch_descs:2328: [vq=0000000088199421][vq->last_avail_idx=37][vq->avail_idx=256][vq->ndescs=1]
[ 71.072791] [1915] vhost:fetch_descs:2328: [vq=00000000175f11ec][vq->last_avail_idx=32][vq->avail_idx=33][vq->ndescs=1]
[ 71.072933] [1915] vhost:fetch_descs:2328: [vq=0000000088199421][vq->last_avail_idx=38][vq->avail_idx=256][vq->ndescs=1]
[ 71.073054] [1915] vhost:fetch_descs:2328: [vq=00000000175f11ec][vq->last_avail_idx=33][vq->avail_idx=34][vq->ndescs=1]
[ 71.073193] [1915] vhost:fetch_descs:2328: [vq=0000000088199421][vq->last_avail_idx=39][vq->avail_idx=256][vq->ndescs=1]
[ 71.073247] [1915] vhost:fetch_descs:2328: [vq=00000000175f11ec][vq->last_avail_idx=34][vq->avail_idx=35][vq->ndescs=1]
[ 71.073383] [1915] vhost:fetch_descs:2328: [vq=0000000088199421][vq->last_avail_idx=40][vq->avail_idx=256][vq->ndescs=1]
[ 71.073434] [1915] vhost:fetch_descs:2328: [vq=00000000175f11ec][vq->last_avail_idx=35][vq->avail_idx=36][vq->ndescs=1]
[ 71.073571] [1915] vhost:fetch_descs:2328: [vq=0000000088199421][vq->last_avail_idx=41][vq->avail_idx=256][vq->ndescs=1]
[ 71.073627] [1915] vhost:fetch_descs:2328: [vq=00000000175f11ec][vq->last_avail_idx=36][vq->avail_idx=37][vq->ndescs=1]
[ 71.073762] [1915] vhost:fetch_descs:2328: [vq=0000000088199421][vq->last_avail_idx=42][vq->avail_idx=256][vq->ndescs=1]
[ 71.073813] [1915] vhost:fetch_descs:2328: [vq=00000000175f11ec][vq->last_avail_idx=37][vq->avail_idx=38][vq->ndescs=1]
[ 71.073948] [1915] vhost:fetch_descs:2328: [vq=0000000088199421][vq->last_avail_idx=43][vq->avail_idx=256][vq->ndescs=1]
[ 71.073998] [1915] vhost:fetch_descs:2328: [vq=00000000175f11ec][vq->last_avail_idx=38][vq->avail_idx=39][vq->ndescs=1]
[ 71.074136] [1915] vhost:fetch_descs:2328: [vq=0000000088199421][vq->last_avail_idx=44][vq->avail_idx=256][vq->ndescs=1]
[ 71.074186] [1915] vhost:fetch_descs:2328: [vq=00000000175f11ec][vq->last_avail_idx=39][vq->avail_idx=40][vq->ndescs=1]
[ 71.074320] [1915] vhost:fetch_descs:2328: [vq=0000000088199421][vq->last_avail_idx=45][vq->avail_idx=256][vq->ndescs=1]
[ 71.074370] [1915] vhost:fetch_descs:2328: [vq=00000000175f11ec][vq->last_avail_idx=40][vq->avail_idx=41][vq->ndescs=1]
[ 71.074503] [1915] vhost:fetch_descs:2328: [vq=0000000088199421][vq->last_avail_idx=46][vq->avail_idx=256][vq->ndescs=1]
[ 72.344493] [1915] vhost:fetch_descs:2328: [vq=0000000088199421][vq->last_avail_idx=47][vq->avail_idx=256][vq->ndescs=1]
[ 72.553413] [1915] vhost:fetch_descs:2328: [vq=0000000088199421][vq->last_avail_idx=48][vq->avail_idx=256][vq->ndescs=1]
[ 73.522174] [1915] vhost:fetch_descs:2328: [vq=0000000088199421][vq->last_avail_idx=49][vq->avail_idx=256][vq->ndescs=1]
[ 73.705202] [1915] vhost:fetch_descs:2328: [vq=0000000088199421][vq->last_avail_idx=50][vq->avail_idx=256][vq->ndescs=1]
[ 73.705239] [1915] vhost:fetch_descs:2328: [vq=00000000175f11ec][vq->last_avail_idx=41][vq->avail_idx=42][vq->ndescs=1]
[ 73.994388] [1915] vhost:fetch_descs:2328: [vq=0000000088199421][vq->last_avail_idx=51][vq->avail_idx=256][vq->ndescs=1]
[ 74.208443] [1915] vhost:fetch_descs:2328: [vq=0000000088199421][vq->last_avail_idx=52][vq->avail_idx=256][vq->ndescs=1]
[ 74.433345] [1915] vhost:fetch_descs:2328: [vq=0000000088199421][vq->last_avail_idx=53][vq->avail_idx=256][vq->ndescs=1]
[ 74.594756] [1910] vhost_net:vhost_net_ioctl:1726: VHOST_NET_SET_BACKEND
[ 74.594761] [1910] vhost_net:vhost_net_set_backend:1538: sock=000000001ae027fd != oldsock=0000000082d8d291 index=0 fd=-1 vq=0000000088199421
[ 74.594762] [1910] vhost_net:vhost_net_set_backend:1573: sock=000000001ae027fd
[ 74.594803] [1910] vhost_net:vhost_net_ioctl:1726: VHOST_NET_SET_BACKEND
[ 74.594804] [1910] vhost_net:vhost_net_set_backend:1538: sock=000000001ae027fd != oldsock=0000000082d8d291 index=1 fd=-1 vq=00000000175f11ec
[ 74.594805] [1910] vhost_net:vhost_net_set_backend:1573: sock=000000001ae027fd
[ 74.594847] [1910] vhost:vhost_vring_ioctl:1664: VHOST_GET_VRING_BASE [vq=0000000088199421][vq->last_avail_idx=54][vq->avail_idx=256][s.index=0][s.num=54]
[ 74.594850] [1910] vhost:vhost_vring_ioctl:1664: VHOST_GET_VRING_BASE [vq=00000000175f11ec][vq->last_avail_idx=42][vq->avail_idx=42][s.index=1][s.num=42]
[ 77.003191] [1918] vhost:vhost_vring_ioctl:1645: strange VHOST_SET_VRING_BASE [vq=0000000088199421][s.index=0][s.num=0]
[ 77.003194] CPU: 62 PID: 1918 Comm: CPU 1/KVM Not tainted 5.5.0+ #22
[ 77.003197] Hardware name: IBM 3906 M04 704 (LPAR)
[ 77.003198] Call Trace:
[ 77.003207] [<0000000b8d93c132>] show_stack+0x8a/0xd0
[ 77.003211] [<0000000b8e32e72a>] dump_stack+0x8a/0xb8
[ 77.003224] [<000003ff803567ae>] vhost_vring_ioctl+0x6fe/0x858 [vhost]
[ 77.003228] [<000003ff8036c670>] vhost_net_ioctl+0x510/0x570 [vhost_net]
[ 77.003234] [<0000000b8dbecdd8>] do_vfs_ioctl+0x430/0x6f8
[ 77.003235] [<0000000b8dbed124>] ksys_ioctl+0x84/0xb0
[ 77.003237] [<0000000b8dbed1ba>] __s390x_sys_ioctl+0x2a/0x38
[ 77.003240] [<0000000b8e34ff72>] system_call+0x2a6/0x2c8
[ 77.003261] [1918] vhost:vhost_vring_ioctl:1645: strange VHOST_SET_VRING_BASE [vq=00000000175f11ec][s.index=1][s.num=0]
[ 77.003262] CPU: 62 PID: 1918 Comm: CPU 1/KVM Not tainted 5.5.0+ #22
[ 77.003263] Hardware name: IBM 3906 M04 704 (LPAR)
[ 77.003264] Call Trace:
[ 77.003266] [<0000000b8d93c132>] show_stack+0x8a/0xd0
[ 77.003267] [<0000000b8e32e72a>] dump_stack+0x8a/0xb8
[ 77.003270] [<000003ff803567ae>] vhost_vring_ioctl+0x6fe/0x858 [vhost]
[ 77.003271] [<000003ff8036c670>] vhost_net_ioctl+0x510/0x570 [vhost_net]
[ 77.003273] [<0000000b8dbecdd8>] do_vfs_ioctl+0x430/0x6f8
[ 77.003274] [<0000000b8dbed124>] ksys_ioctl+0x84/0xb0
[ 77.003276] [<0000000b8dbed1ba>] __s390x_sys_ioctl+0x2a/0x38
[ 77.003277] [<0000000b8e34ff72>] system_call+0x2a6/0x2c8
[ 77.003297] [1918] vhost_net:vhost_net_ioctl:1726: VHOST_NET_SET_BACKEND
[ 77.003300] [1918] vhost_net:vhost_net_set_backend:1538: sock=0000000082d8d291 != oldsock=000000001ae027fd index=0 fd=39 vq=0000000088199421
[ 77.003304] [1918] vhost_net:vhost_net_set_backend:1573: sock=0000000082d8d291
[ 77.003305] [1918] vhost_net:vhost_net_ioctl:1726: VHOST_NET_SET_BACKEND
[ 77.003306] [1918] vhost_net:vhost_net_set_backend:1538: sock=0000000082d8d291 != oldsock=000000001ae027fd index=1 fd=39 vq=00000000175f11ec
[ 77.003308] [1915] vhost:fetch_descs:2328: [vq=0000000088199421][vq->last_avail_idx=54][vq->avail_idx=256][vq->ndescs=1]
[ 77.003308] [1918] vhost_net:vhost_net_set_backend:1573: sock=0000000082d8d291
[ 77.003310] [1915] vhost_net:get_rx_bufs:1061: unexpected descriptor format for RX: out 0, in 0
[ 77.003312] [1915] vhost:vhost_discard_vq_desc:2424: DISCARD [vq=0000000088199421][vq->last_avail_idx=55][vq->avail_idx=256][n=0]

2020-03-27 08:06:02

by Eugenio Perez Martin

Subject: Re: vhost changes (batched) in linux-next after 12/13 trigger random crashes in KVM guests after reboot

On Tue, Feb 11, 2020 at 2:58 PM Michael S. Tsirkin <[email protected]> wrote:
>
> On Tue, Feb 11, 2020 at 02:04:54PM +0100, Eugenio Pérez wrote:
> > On Mon, 2020-02-10 at 12:01 +0100, Christian Borntraeger wrote:
> > >
> > > On 10.02.20 10:47, Eugenio Perez Martin wrote:
> > > > Hi Christian.
> > > >
> > > > I'm not able to reproduce the failure with eccb852f1fe6bede630e2e4f1a121a81e34354ab commit. Could you add more data?
> > > > Your configuration (libvirt or qemu line), and host's dmesg output if any?
> > > >
> > > > Thanks!
> > >
> > > If it was not obvious, this is on s390x, a big endian system.
> > >
> >
> > Hi Christian. Thank you very much for your fast responses.
> >
> > Could you try this patch on top of eccb852f1fe6bede630e2e4f1a121a81e34354ab?
> >
> > Thanks!
> >
> > >From 71d0f9108a18aa894cc0c0c1c7efbad39f465a27 Mon Sep 17 00:00:00 2001
> > From: =?UTF-8?q?Eugenio=20P=C3=A9rez?= <
> > [email protected]>
> > Date: Tue, 11 Feb 2020 13:19:10 +0100
> > Subject: [PATCH] vhost: fix return value of vhost_get_vq_desc
> >
> > Before the batch change, it was the chain's head. We need to keep it that
> > way or we will not be able to free a chain of descriptors.
>
> I think it's cleaner to have all descriptors in the chain
> have the same id.
>

Hi Michael.

I'm not sure I understood you correctly. All the descriptors in the
chain still have the same id; this just returns the first one instead
of updating it on every loop iteration. It can be left out of the final
version though; that's up to you.

> > Fixes: eccb852f1fe6 ("vhost: batching fetches")
> > ---
> > drivers/vhost/vhost.c | 3 +--
> > 1 file changed, 1 insertion(+), 2 deletions(-)
> >
> > diff --git a/drivers/vhost/vhost.c b/drivers/vhost/vhost.c
> > index b5a51b1f2e79..fc422c3e5c08 100644
> > --- a/drivers/vhost/vhost.c
> > +++ b/drivers/vhost/vhost.c
> > @@ -2409,12 +2409,11 @@ int vhost_get_vq_desc(struct vhost_virtqueue *vq,
> > *out_num += ret;
> > }
> >
> > - ret = desc->id;
> > -
> > if (!(desc->flags & VRING_DESC_F_NEXT))
> > break;
> > }
> >
> > + ret = vq->descs[vq->first_desc].id;
> > vq->first_desc = i + 1;
> >
> > return ret;
> > --
> > 2.18.1
>

2020-03-27 11:10:23

by Eugenio Perez Martin

Subject: Re: vhost changes (batched) in linux-next after 12/13 trigger random crashes in KVM guests after reboot

Hi Christian.

Sorry for the late response. Could we try this one on top of eccb852f1fe6bede630e2e4f1a121a81e34354ab, and see if you can
still reproduce the bug?

Apart from that, could you send me the backtraces from where qemu calls the vhost_kernel_set_vring_base and
vhost_kernel_get_vring_base functions?

Thank you very much!

diff --git a/drivers/vhost/net.c b/drivers/vhost/net.c
index e158159671fa..a1a4239512bb 100644
--- a/drivers/vhost/net.c
+++ b/drivers/vhost/net.c
@@ -1505,10 +1505,13 @@ static long vhost_net_set_backend(struct vhost_net *n, unsigned index, int fd)

mutex_lock(&n->dev.mutex);
r = vhost_dev_check_owner(&n->dev);
- if (r)
+ if (r) {
+ pr_debug("vhost_dev_check_owner index=%u fd=%d rc r=%d", index, fd, r);
goto err;
+ }

if (index >= VHOST_NET_VQ_MAX) {
+ pr_debug("vhost_dev_check_owner index=%u fd=%d MAX=%d", index, fd, VHOST_NET_VQ_MAX);
r = -ENOBUFS;
goto err;
}
@@ -1518,22 +1521,26 @@ static long vhost_net_set_backend(struct vhost_net *n, unsigned index, int fd)

/* Verify that ring has been setup correctly. */
if (!vhost_vq_access_ok(vq)) {
+ pr_debug("vhost_net_set_backend index=%u fd=%d !vhost_vq_access_ok", index, fd);
r = -EFAULT;
goto err_vq;
}
sock = get_socket(fd);
if (IS_ERR(sock)) {
r = PTR_ERR(sock);
+ pr_debug("vhost_net_set_backend index=%u fd=%d get_socket err r=%d", index, fd, r);
goto err_vq;
}

/* start polling new socket */
oldsock = vq->private_data;
if (sock != oldsock) {
+ pr_debug("sock=%p != oldsock=%p index=%u fd=%d vq=%p", sock, oldsock, index, fd, vq);
ubufs = vhost_net_ubuf_alloc(vq,
sock && vhost_sock_zcopy(sock));
if (IS_ERR(ubufs)) {
r = PTR_ERR(ubufs);
+ pr_debug("ubufs index=%u fd=%d err r=%d vq=%p", index, fd, r, vq);
goto err_ubufs;
}

@@ -1541,11 +1548,15 @@ static long vhost_net_set_backend(struct vhost_net *n, unsigned index, int fd)
vq->private_data = sock;
vhost_net_buf_unproduce(nvq);
r = vhost_vq_init_access(vq);
- if (r)
+ if (r) {
+ pr_debug("init_access index=%u fd=%d r=%d vq=%p", index, fd, r, vq);
goto err_used;
+ }
r = vhost_net_enable_vq(n, vq);
- if (r)
+ if (r) {
+ pr_debug("enable_vq index=%u fd=%d r=%d vq=%p", index, fd, r, vq);
goto err_used;
+ }
if (index == VHOST_NET_VQ_RX)
nvq->rx_ring = get_tap_ptr_ring(fd);

@@ -1559,6 +1570,8 @@ static long vhost_net_set_backend(struct vhost_net *n, unsigned index, int fd)

mutex_unlock(&vq->mutex);

+ pr_debug("sock=%p", sock);
+
if (oldubufs) {
vhost_net_ubuf_put_wait_and_free(oldubufs);
mutex_lock(&vq->mutex);
@@ -1712,6 +1725,9 @@ static long vhost_net_ioctl(struct file *f, unsigned int ioctl,
case VHOST_NET_SET_BACKEND:
if (copy_from_user(&backend, argp, sizeof backend))
return -EFAULT;
+ pr_debug("VHOST_NET_SET_BACKEND [b.index=%u][b.fd=%d]",
+ backend.index, backend.fd);
+ dump_stack();
return vhost_net_set_backend(n, backend.index, backend.fd);
case VHOST_GET_FEATURES:
features = VHOST_NET_FEATURES;
diff --git a/drivers/vhost/vhost.c b/drivers/vhost/vhost.c
index b5a51b1f2e79..9dd0bcae0b22 100644
--- a/drivers/vhost/vhost.c
+++ b/drivers/vhost/vhost.c
@@ -372,6 +372,11 @@ static int vhost_worker(void *data)
return 0;
}

+static int vhost_vq_num_batch_descs(struct vhost_virtqueue *vq)
+{
+ return vq->max_descs - UIO_MAXIOV;
+}
+
static void vhost_vq_free_iovecs(struct vhost_virtqueue *vq)
{
kfree(vq->descs);
@@ -394,7 +399,9 @@ static long vhost_dev_alloc_iovecs(struct vhost_dev *dev)
for (i = 0; i < dev->nvqs; ++i) {
vq = dev->vqs[i];
vq->max_descs = dev->iov_limit;
- vq->batch_descs = dev->iov_limit - UIO_MAXIOV;
+ if (vhost_vq_num_batch_descs(vq) < 0) {
+ return -EINVAL;
+ }
vq->descs = kmalloc_array(vq->max_descs,
sizeof(*vq->descs),
GFP_KERNEL);
@@ -1642,15 +1649,27 @@ long vhost_vring_ioctl(struct vhost_dev *d, unsigned int ioctl, void __user *arg
r = -EINVAL;
break;
}
+
+ pr_debug(
+ "VHOST_SET_VRING_BASE [vq=%p][s.index=%u][s.num=%u][vq->avail_idx=%d][vq->last_avail_idx=%d][vq->ndescs=%d][vq->first_desc=%d]",
+ vq, s.index, s.num, vq->avail_idx, vq->last_avail_idx,
+ vq->ndescs, vq->first_desc);
+ dump_stack();
vq->last_avail_idx = s.num;
/* Forget the cached index value. */
vq->avail_idx = vq->last_avail_idx;
+ vq->ndescs = vq->first_desc = 0;
break;
case VHOST_GET_VRING_BASE:
s.index = idx;
s.num = vq->last_avail_idx;
if (copy_to_user(argp, &s, sizeof s))
r = -EFAULT;
+ pr_debug(
+ "VHOST_GET_VRING_BASE [vq=%p][s.index=%u][s.num=%u][vq->avail_idx=%d][vq->last_avail_idx=%d][vq->ndescs=%d][vq->first_desc=%d]",
+ vq, s.index, s.num, vq->avail_idx, vq->last_avail_idx,
+ vq->ndescs, vq->first_desc);
+ dump_stack();
break;
case VHOST_SET_VRING_KICK:
if (copy_from_user(&f, argp, sizeof f)) {
@@ -2239,8 +2258,8 @@ static int fetch_buf(struct vhost_virtqueue *vq)
vq->avail_idx = vhost16_to_cpu(vq, avail_idx);

if (unlikely((u16)(vq->avail_idx - last_avail_idx) > vq->num)) {
- vq_err(vq, "Guest moved used index from %u to %u",
- last_avail_idx, vq->avail_idx);
+ vq_err(vq, "Guest moved vq %p used index from %u to %u",
+ vq, last_avail_idx, vq->avail_idx);
return -EFAULT;
}

@@ -2316,6 +2335,9 @@ static int fetch_buf(struct vhost_virtqueue *vq)
BUG_ON(!(vq->used_flags & VRING_USED_F_NO_NOTIFY));

/* On success, increment avail index. */
+ pr_debug(
+ "[vq=%p][vq->last_avail_idx=%u][vq->avail_idx=%u][vq->ndescs=%d][vq->first_desc=%d]",
+ vq, vq->last_avail_idx, vq->avail_idx, vq->ndescs, vq->first_desc);
vq->last_avail_idx++;

return 0;
@@ -2333,7 +2355,7 @@ static int fetch_descs(struct vhost_virtqueue *vq)
if (vq->ndescs)
return 0;

- while (!ret && vq->ndescs <= vq->batch_descs)
+ while (!ret && vq->ndescs <= vhost_vq_num_batch_descs(vq))
ret = fetch_buf(vq);

return vq->ndescs ? 0 : ret;
@@ -2432,6 +2454,9 @@ EXPORT_SYMBOL_GPL(vhost_get_vq_desc);
/* Reverse the effect of vhost_get_vq_desc. Useful for error handling. */
void vhost_discard_vq_desc(struct vhost_virtqueue *vq, int n)
{
+ pr_debug(
+ "DISCARD [vq=%p][vq->last_avail_idx=%u][vq->avail_idx=%u][n=%d]",
+ vq, vq->last_avail_idx, vq->avail_idx, n);
vq->last_avail_idx -= n;
}
EXPORT_SYMBOL_GPL(vhost_discard_vq_desc);
diff --git a/drivers/vhost/vhost.h b/drivers/vhost/vhost.h
index 661088ae6dc7..e648b9b997d4 100644
--- a/drivers/vhost/vhost.h
+++ b/drivers/vhost/vhost.h
@@ -102,7 +102,6 @@ struct vhost_virtqueue {
int ndescs;
int first_desc;
int max_descs;
- int batch_descs;

const struct vhost_umem_node *meta_iotlb[VHOST_NUM_ADDRS];
struct file *kick;
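The vhost.c/vhost.h hunks above replace the cached batch_descs field with an on-demand computation (max_descs - UIO_MAXIOV, rejected at setup if negative), and the discard path simply rewinds last_avail_idx, relying on 16-bit wraparound. A hedged standalone sketch of that bookkeeping, with illustrative `demo_*` names rather than the real kernel structures:

```c
#include <assert.h>

/* Illustrative stand-in for UIO_MAXIOV. */
#define DEMO_UIO_MAXIOV 1024

struct demo_vq {
	int max_descs;			/* dev->iov_limit at setup time */
	unsigned short last_avail_idx;	/* 16-bit, wraps naturally */
};

/* Derived on demand, as the patch does, instead of caching batch_descs.
 * A negative result means iov_limit was configured below UIO_MAXIOV,
 * which the setup path now rejects with -EINVAL. */
static int demo_num_batch_descs(const struct demo_vq *vq)
{
	return vq->max_descs - DEMO_UIO_MAXIOV;
}

/* Reverse the effect of consuming n buffers, as vhost_discard_vq_desc
 * does: plain subtraction on the 16-bit index, so rewinding across 0
 * wraps to the top of the range. */
static void demo_discard(struct demo_vq *vq, int n)
{
	vq->last_avail_idx -= n;
}
```

The wraparound is why the fetch path compares indices with `(u16)(vq->avail_idx - last_avail_idx) > vq->num` rather than a direct ordering test.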

2020-03-27 15:47:40

by Christian Borntraeger

Subject: Re: vhost changes (batched) in linux-next after 12/13 trigger random crashes in KVM guests after reboot



On 27.03.20 12:08, Eugenio Pérez wrote:
> Hi Christian.
>
> Sorry for the late response. Could we try this one over eccb852f1fe6bede630e2e4f1a121a81e34354ab, and see if you still
> can reproduce the bug?

Too much time has passed and too many things have changed on that system.
I have trouble reproducing this with either
eccb852f1fe6bede630e2e4f1a121a81e34354ab or 52c36ce7f334.
I will try to reproduce this again :-/
