2014-01-30 18:02:59

by Richard Yao

[permalink] [raw]
Subject: [PATCH] 9p/trans_virtio.c: Fix broken zero-copy on vmalloc() buffers (second submission)

This is my first real patch submission, which I sent out in December.
Unfortunately, I sent it to only a subset of the correct people. Time passed,
Greg Koah-Hartman was kind enough to give me some pointers and I am now sending
it back out to the correct people. My apologies to anyone who was bit by this
issue because it was not sent to the correct people sooner.

Richard Yao (1):
9p/trans_virtio.c: Fix broken zero-copy on vmalloc() buffers

net/9p/trans_virtio.c | 5 ++++-
1 file changed, 4 insertions(+), 1 deletion(-)

--
1.8.3.2


2014-01-30 18:03:12

by Richard Yao

[permalink] [raw]
Subject: [PATCH] 9p/trans_virtio.c: Fix broken zero-copy on vmalloc() buffers

The 9p-virtio transport does zero copy on things larger than 1024 bytes
in size. It accomplishes this by returning the physical addresses of
pages to the virtio-pci device. At present, the translation is usually a
bit shift.

However, that approach produces an invalid page address when we
read/write to vmalloc buffers, such as those used for Linux kernle
modules. This causes QEMU to die printing:

qemu-system-x86_64: virtio: trying to map MMIO memory

This patch enables 9p-virtio to correctly handle this case. This not
only enables us to load Linux kernel modules off virtfs, but also
enables ZFS file-based vdevs on virtfs to be used without killing QEMU.

Also, special thanks to both Avi Kivity and Alexander Graf for their
interpretation of QEMU backtraces. Without their guidence, tracking down
this bug would have taken much longer.

Signed-off-by: Richard Yao <[email protected]>
Acked-by: Alexander Graf <[email protected]>
Reviewed-by: Will Deacon <[email protected]>
---
net/9p/trans_virtio.c | 5 ++++-
1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/net/9p/trans_virtio.c b/net/9p/trans_virtio.c
index cd1e1ed..b2009bc 100644
--- a/net/9p/trans_virtio.c
+++ b/net/9p/trans_virtio.c
@@ -340,7 +340,10 @@ static int p9_get_mapped_pages(struct virtio_chan *chan,
int count = nr_pages;
while (nr_pages) {
s = rest_of_page(data);
- pages[index++] = kmap_to_page(data);
+ if (is_vmalloc_or_module_addr(data))
+ pages[index++] = vmalloc_to_page(data);
+ else
+ pages[index++] = kmap_to_page(data);
data += s;
nr_pages--;
}
--
1.8.3.2

2014-01-31 00:29:29

by David Miller

[permalink] [raw]
Subject: Re: [PATCH] 9p/trans_virtio.c: Fix broken zero-copy on vmalloc() buffers

From: Richard Yao <[email protected]>
Date: Thu, 30 Jan 2014 13:02:48 -0500

> The 9p-virtio transport does zero copy on things larger than 1024 bytes
> in size. It accomplishes this by returning the physical addresses of
> pages to the virtio-pci device. At present, the translation is usually a
> bit shift.
>
> However, that approach produces an invalid page address when we
> read/write to vmalloc buffers, such as those used for Linux kernle
> modules. This causes QEMU to die printing:
>
> qemu-system-x86_64: virtio: trying to map MMIO memory
>
> This patch enables 9p-virtio to correctly handle this case. This not
> only enables us to load Linux kernel modules off virtfs, but also
> enables ZFS file-based vdevs on virtfs to be used without killing QEMU.
>
> Also, special thanks to both Avi Kivity and Alexander Graf for their
> interpretation of QEMU backtraces. Without their guidence, tracking down
> this bug would have taken much longer.
>
> Signed-off-by: Richard Yao <[email protected]>
> Acked-by: Alexander Graf <[email protected]>
> Reviewed-by: Will Deacon <[email protected]>

Applied, thanks.

2014-01-31 00:44:54

by David Miller

[permalink] [raw]
Subject: Re: [PATCH] 9p/trans_virtio.c: Fix broken zero-copy on vmalloc() buffers

From: David Miller <[email protected]>
Date: Thu, 30 Jan 2014 16:29:26 -0800 (PST)

> From: Richard Yao <[email protected]>
> Date: Thu, 30 Jan 2014 13:02:48 -0500
>
>> The 9p-virtio transport does zero copy on things larger than 1024 bytes
>> in size. It accomplishes this by returning the physical addresses of
>> pages to the virtio-pci device. At present, the translation is usually a
>> bit shift.
>>
>> However, that approach produces an invalid page address when we
>> read/write to vmalloc buffers, such as those used for Linux kernle
>> modules. This causes QEMU to die printing:
>>
>> qemu-system-x86_64: virtio: trying to map MMIO memory
>>
>> This patch enables 9p-virtio to correctly handle this case. This not
>> only enables us to load Linux kernel modules off virtfs, but also
>> enables ZFS file-based vdevs on virtfs to be used without killing QEMU.
>>
>> Also, special thanks to both Avi Kivity and Alexander Graf for their
>> interpretation of QEMU backtraces. Without their guidence, tracking down
>> this bug would have taken much longer.
>>
>> Signed-off-by: Richard Yao <[email protected]>
>> Acked-by: Alexander Graf <[email protected]>
>> Reviewed-by: Will Deacon <[email protected]>
>
> Applied, thanks.

Actually I had to revert, is_vmalloc_or_malloc_addr() is not exported to
modules, so this change breaks the build.

2014-01-31 01:02:27

by Richard Yao

[permalink] [raw]
Subject: Re: [PATCH] 9p/trans_virtio.c: Fix broken zero-copy on vmalloc() buffers

On Jan 30, 2014, at 7:44 PM, David Miller <[email protected]> wrote:

> From: David Miller <[email protected]>
> Date: Thu, 30 Jan 2014 16:29:26 -0800 (PST)
>
>> From: Richard Yao <[email protected]>
>> Date: Thu, 30 Jan 2014 13:02:48 -0500
>>
>>> The 9p-virtio transport does zero copy on things larger than 1024 bytes
>>> in size. It accomplishes this by returning the physical addresses of
>>> pages to the virtio-pci device. At present, the translation is usually a
>>> bit shift.
>>>
>>> However, that approach produces an invalid page address when we
>>> read/write to vmalloc buffers, such as those used for Linux kernle
>>> modules. This causes QEMU to die printing:
>>>
>>> qemu-system-x86_64: virtio: trying to map MMIO memory
>>>
>>> This patch enables 9p-virtio to correctly handle this case. This not
>>> only enables us to load Linux kernel modules off virtfs, but also
>>> enables ZFS file-based vdevs on virtfs to be used without killing QEMU.
>>>
>>> Also, special thanks to both Avi Kivity and Alexander Graf for their
>>> interpretation of QEMU backtraces. Without their guidence, tracking down
>>> this bug would have taken much longer.
>>>
>>> Signed-off-by: Richard Yao <[email protected]>
>>> Acked-by: Alexander Graf <[email protected]>
>>> Reviewed-by: Will Deacon <[email protected]>
>>
>> Applied, thanks.
>
> Actually I had to revert, is_vmalloc_or_malloc_addr() is not exported to
> modules, so this change breaks the build.

Thanks for catching that. I had originally used is_vmalloc_addr() instead of is_vmalloc_or_malloc_addr(), but changed it after realizing this did not correct the problem on all architectures. The is_vmalloc_addr() lives in headers. I will send out a patch to get that symbol exported and resubmit this after it is merged.-