Recently, syzbot reported a bug titled "kernel BUG in page_table_check_clear" [1].
The root cause is that usbdev_mmap calls remap_pfn_range on kmalloc'ed
memory, which leads to type confusion between struct page and slab in
page_table_check. This series of patches fixes the usb side by avoiding
mapping slab pages into userspace, and fixes the mm side by enforcing
that all user-accessible pages are not slab pages. A more detailed
analysis and some discussion of how to fix the problem can also be found
in [1].
[1] https://lore.kernel.org/lkml/[email protected]/T/
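For reference, the mm-side invariant boils down to roughly the following
check when a user-accessible page table entry is installed. This is a
sketch only, with a made-up helper name; the real check is added to
mm/page_table_check.c by the last patch in this series:

#include <linux/bug.h>
#include <linux/mm.h>
#include <linux/page-flags.h>

/*
 * Sketch: a page that is about to become user-accessible must not be a
 * slab page, because slab memory is kernel-internal and must never be
 * mapped into userspace.
 */
static void assert_user_page_not_slab(struct page *page)
{
	/* Hypothetical helper; the series performs this kind of check
	 * where page_table_check records a user-accessible mapping.
	 */
	BUG_ON(PageSlab(page));
}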
Changes since v1:
* Fix inconsistent coding styles. (Alan Stern)
* Relax !DEVMEM requirements to EXCLUSIVE_SYSTEM_RAM, which is
equivalent to !DEVMEM || STRICT_DEVMEM. (David Hildenbrand)
* A few random tweaks in commit messages and code comments, none of
them major.
Link to v1:
https://lore.kernel.org/lkml/[email protected]/T/
Cc: Matthew Wilcox <[email protected]>
Cc: David Hildenbrand <[email protected]>
Cc: Pasha Tatashin <[email protected]>
Cc: Andrew Morton <[email protected]>
Cc: Alan Stern <[email protected]>
Cc: Christoph Hellwig <[email protected]>
Cc: Greg Kroah-Hartman <[email protected]>
Ruihan Li (4):
usb: usbfs: Enforce page requirements for mmap
usb: usbfs: Use consistent mmap functions
mm: page_table_check: Make it dependent on EXCLUSIVE_SYSTEM_RAM
mm: page_table_check: Ensure user pages are not slab pages
Documentation/mm/page_table_check.rst | 18 ++++++++++++
drivers/usb/core/buffer.c | 41 +++++++++++++++++++++++++++
drivers/usb/core/devio.c | 20 +++++++++----
include/linux/page-flags.h | 6 ++++
include/linux/usb/hcd.h | 5 ++++
mm/Kconfig.debug | 2 +-
mm/page_table_check.c | 6 ++++
7 files changed, 91 insertions(+), 7 deletions(-)
--
2.40.1
When hcd->localmem_pool is non-null, localmem_pool is used to allocate
DMA memory. In this case, the DMA address is properly returned (in
dma_handle), and dma_mmap_coherent should be used to map this memory
into userspace. However, the current implementation uses
remap_pfn_range, which is only suitable for mapping normal pages.
Instead of duplicating the logic of the memory allocation function, this
patch introduces a more robust solution: the type of the allocated
memory is determined by testing whether dma_handle has been properly
set. If dma_handle is valid, DMA pages were allocated and
dma_mmap_coherent should be used to map them. Otherwise, normal pages
were allocated and remap_pfn_range should be called. This ensures that
the correct mmap function is used consistently, independently of the
logic that decides which type of memory gets allocated.
Fixes: a0e710a7def4 ("USB: usbfs: fix mmap dma mismatch")
Cc: [email protected]
Signed-off-by: Ruihan Li <[email protected]>
---
drivers/usb/core/devio.c | 11 +++++++++--
1 file changed, 9 insertions(+), 2 deletions(-)
diff --git a/drivers/usb/core/devio.c b/drivers/usb/core/devio.c
index 3936ca7f7..fcf68818e 100644
--- a/drivers/usb/core/devio.c
+++ b/drivers/usb/core/devio.c
@@ -235,7 +235,7 @@ static int usbdev_mmap(struct file *file, struct vm_area_struct *vma)
size_t size = vma->vm_end - vma->vm_start;
void *mem;
unsigned long flags;
- dma_addr_t dma_handle;
+ dma_addr_t dma_handle = DMA_MAPPING_ERROR;
int ret;
ret = usbfs_increase_memory_usage(size + sizeof(struct usb_memory));
@@ -265,7 +265,14 @@ static int usbdev_mmap(struct file *file, struct vm_area_struct *vma)
usbm->vma_use_count = 1;
INIT_LIST_HEAD(&usbm->memlist);
- if (hcd->localmem_pool || !hcd_uses_dma(hcd)) {
+ /*
+ * In DMA-unavailable cases, hcd_buffer_alloc_pages allocates
+ * normal pages and assigns DMA_MAPPING_ERROR to dma_handle. Check
+ * whether we are in such cases, and then use remap_pfn_range (or
+ * dma_mmap_coherent) to map normal (or DMA) pages into the user
+ * space, respectively.
+ */
+ if (dma_handle == DMA_MAPPING_ERROR) {
if (remap_pfn_range(vma, vma->vm_start,
virt_to_phys(usbm->mem) >> PAGE_SHIFT,
size, vma->vm_page_prot) < 0) {
--
2.40.1
From: Ruihan Li
> Sent: 15 May 2023 14:10
>
> When hcd->localmem_pool is non-null, localmem_pool is used to allocate
> DMA memory. In this case, the DMA address is properly returned (in
> dma_handle), and dma_mmap_coherent should be used to map this memory
> into userspace. However, the current implementation uses
> remap_pfn_range, which is only suitable for mapping normal pages.
I've an (out of tree) driver that does the same.
Am I right in thinking that this does still work?
I can't change the driver to use dma_mmap_coherent() because it
doesn't let me mmap from a page offset within a 16k allocation.
In this case the memory area is an 8MB shared transfer area to an
FPGA PCIe target, sparsely filled with 16kB allocations (max 512 allocs).
The discontinuous physical memory blocks appear as logically
contiguous to both the FPGA logic and when mapped to userspace.
(But not to driver code.)
I don't really want to expose the 16k allocation size to userspace.
If we need more than 8MB then the allocation size would need
changing.
David
On Mon, May 15, 2023 at 04:07:01PM +0000, David Laight wrote:
>
> From: Ruihan Li
> > Sent: 15 May 2023 14:10
> >
> > When hcd->localmem_pool is non-null, localmem_pool is used to allocate
> > DMA memory. In this case, the DMA address is properly returned (in
> > dma_handle), and dma_mmap_coherent should be used to map this memory
> > into userspace. However, the current implementation uses
> > remap_pfn_range, which is only suitable for mapping normal pages.
>
> I've an (out of tree) driver that does the same.
> Am I right in thinking that this does still work?
Generally, it still works most of the time, but it can break in some
cases. I am going to quote commit 2bef9aed6f0e ("usb: usbfs: correct
kernel->user page attribute mismatch"), which introduced
dma_mmap_coherent in usbdev_mmap and explains why [1]:
On some architectures (e.g. arm64) requests for
IO coherent memory may use non-cachable attributes if
the relevant device isn't cache coherent. If these
pages are then remapped into userspace as cacheable,
they may not be coherent with the non-cacheable mappings.
[1] https://lore.kernel.org/all/[email protected]/
I think it means that if your driver deals with devices that aren't
cache-coherent on arm64, using remap_pfn_range directly may cause
problems. In other cases, you may need to check the arch-specific DMA
mmap operation and see whether it performs additional work that
remap_pfn_range does not (for the ARM example, arm_iommu_mmap_attrs
updates the vm_page_prot field to make the pages non-cacheable if the
device is not cache-coherent [2]).
[2] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/arch/arm/mm/dma-mapping.c?id=f1fcbaa18b28dec10281551dfe6ed3a3ed80e3d6#n1129
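To make this concrete, here is a rough, untested sketch of the two
options for a driver mmap handler. Everything named foo_* is a
placeholder, and dev_is_dma_coherent() is DMA-API-internal (not really
meant for drivers), so take the second branch as an illustration of the
extra work the arch mmap op does rather than as code to copy:

#include <linux/dma-mapping.h>
#include <linux/dma-map-ops.h>	/* dev_is_dma_coherent(); DMA-API internal */
#include <linux/fs.h>
#include <linux/mm.h>

struct foo_dev {			/* placeholder driver state */
	struct device *dev;
	void *cpu_addr;			/* from dma_alloc_coherent() ... */
	dma_addr_t dma_addr;
	phys_addr_t phys_addr;		/* ... or from a non-DMA allocator */
	size_t buf_size;
	bool buf_from_dma_api;
};

static int foo_mmap(struct file *file, struct vm_area_struct *vma)
{
	struct foo_dev *foo = file->private_data;
	size_t size = vma->vm_end - vma->vm_start;

	if (size > foo->buf_size)
		return -EINVAL;

	if (foo->buf_from_dma_api)
		/*
		 * Preferred: the DMA API picks page attributes that match
		 * the kernel-side coherent mapping (non-cacheable on arm64
		 * when the device is not cache-coherent).
		 */
		return dma_mmap_coherent(foo->dev, vma, foo->cpu_addr,
					 foo->dma_addr, size);

	/*
	 * Otherwise the attributes have to be fixed up by hand, which is
	 * roughly the extra work the arch-specific mmap op would do for a
	 * non-coherent device before remapping the pages.
	 */
	if (!dev_is_dma_coherent(foo->dev))
		vma->vm_page_prot = pgprot_noncached(vma->vm_page_prot);

	return remap_pfn_range(vma, vma->vm_start,
			       PHYS_PFN(foo->phys_addr) + vma->vm_pgoff,
			       size, vma->vm_page_prot);
}

Whether the manual fixup is sufficient for your device depends on the
architecture and the bus, so the DMA-API path is still the safer default.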
>
> I can't change the driver to use dma_mmap_coherent() because it
> doesn't let me mmap from a page offset within a 16k allocation.
>
> In this case the memory area is an 8MB shared transfer area to an
> FPGA PCIe target, sparsely filled with 16kB allocations (max 512 allocs).
> The discontinuous physical memory blocks appear as logically
> contiguous to both the FPGA logic and when mapped to userspace.
> (But not to driver code.)
>
> I don't really want to expose the 16k allocation size to userspace.
> If we need more than 8MB then the allocation size would need
> changing.
>
> David
>
Thanks,
Ruihan Li