Previously, drivers had their own way of mapping a range of
kernel pages/memory into a user vma, and this was done by
invoking vm_insert_page() within a loop.

As this pattern is common across different drivers, it can
be generalized by creating a new function and using it across
the drivers.

vm_insert_range() is the new API which will be used to map a
range of kernel memory/pages into a user vma.

This API was tested by Heiko for the Rockchip drm driver, on rk3188,
rk3288, rk3328 and rk3399 with graphics.

Signed-off-by: Souptick Joarder <[email protected]>
Reviewed-by: Matthew Wilcox <[email protected]>
Reviewed-by: Mike Rapoport <[email protected]>
Tested-by: Heiko Stuebner <[email protected]>
---
include/linux/mm.h | 2 ++
mm/memory.c | 38 ++++++++++++++++++++++++++++++++++++++
mm/nommu.c | 7 +++++++
3 files changed, 47 insertions(+)
diff --git a/include/linux/mm.h b/include/linux/mm.h
index fcf9cc9..2bc399f 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -2506,6 +2506,8 @@ unsigned long change_prot_numa(struct vm_area_struct *vma,
int remap_pfn_range(struct vm_area_struct *, unsigned long addr,
unsigned long pfn, unsigned long size, pgprot_t);
int vm_insert_page(struct vm_area_struct *, unsigned long addr, struct page *);
+int vm_insert_range(struct vm_area_struct *vma, unsigned long addr,
+ struct page **pages, unsigned long page_count);
vm_fault_t vmf_insert_pfn(struct vm_area_struct *vma, unsigned long addr,
unsigned long pfn);
vm_fault_t vmf_insert_pfn_prot(struct vm_area_struct *vma, unsigned long addr,
diff --git a/mm/memory.c b/mm/memory.c
index 15c417e..84ea46c 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -1478,6 +1478,44 @@ static int insert_page(struct vm_area_struct *vma, unsigned long addr,
}
/**
+ * vm_insert_range - insert a range of kernel pages into a user vma
+ * @vma: user vma to map to
+ * @addr: target user address of the first page in the range
+ * @pages: pointer to array of source kernel pages
+ * @page_count: number of pages to insert into the user vma
+ *
+ * This allows drivers to insert a range of kernel pages they've allocated
+ * into a user vma. This is a generic function which drivers can use
+ * rather than open-coding their own loop to map a range of kernel pages
+ * into a user vma.
+ *
+ * If we fail to insert any page into the vma, the function will return
+ * immediately leaving any previously-inserted pages present. Callers
+ * from the mmap handler may immediately return the error as their caller
+ * will destroy the vma, removing any successfully-inserted pages. Other
+ * callers should make their own arrangements for calling unmap_region().
+ *
+ * Context: Process context. Called by mmap handlers.
+ * Return: 0 on success, or a negative error code otherwise.
+ */
+int vm_insert_range(struct vm_area_struct *vma, unsigned long addr,
+ struct page **pages, unsigned long page_count)
+{
+ unsigned long uaddr = addr;
+ int ret = 0, i;
+
+ for (i = 0; i < page_count; i++) {
+ ret = vm_insert_page(vma, uaddr, pages[i]);
+ if (ret < 0)
+ return ret;
+ uaddr += PAGE_SIZE;
+ }
+
+ return ret;
+}
+EXPORT_SYMBOL(vm_insert_range);
+
+/**
* vm_insert_page - insert single page into user vma
* @vma: user vma to map to
* @addr: target user address of this page
diff --git a/mm/nommu.c b/mm/nommu.c
index 749276b..d6ef5c7 100644
--- a/mm/nommu.c
+++ b/mm/nommu.c
@@ -473,6 +473,13 @@ int vm_insert_page(struct vm_area_struct *vma, unsigned long addr,
}
EXPORT_SYMBOL(vm_insert_page);
+int vm_insert_range(struct vm_area_struct *vma, unsigned long addr,
+ struct page **pages, unsigned long page_count)
+{
+ return -EINVAL;
+}
+EXPORT_SYMBOL(vm_insert_range);
+
/*
* sys_brk() for the most part doesn't need the global kernel
* lock, except when an application is doing something nasty
--
1.9.1
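
As a rough illustration of the conversion this patch enables, the userspace sketch below contrasts an open-coded loop with a call to the helper, and shows the failure behaviour described in the kernel-doc comment (pages inserted before the failure stay in place). The struct definitions, the `inserted`/`fail_at` mock fields and the `demo_mmap_*` names are hypothetical stand-ins introduced here, not kernel code; only the loop in vm_insert_range() follows the patch.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define PAGE_SIZE 4096UL

/* Hypothetical stand-ins for the kernel types; they only model bookkeeping. */
struct page { int id; };
struct vm_area_struct {
	unsigned long vm_start, vm_end;
	unsigned long inserted;	/* mock state: pages mapped so far */
	unsigned long fail_at;	/* mock knob: insertion index that fails */
};

/* Mock of vm_insert_page(): records the insertion or fails on request. */
int vm_insert_page(struct vm_area_struct *vma, unsigned long addr,
		   struct page *page)
{
	assert(vma != NULL);
	if (!page || addr < vma->vm_start || addr >= vma->vm_end)
		return -EFAULT;
	if (vma->inserted == vma->fail_at)
		return -ENOMEM;
	vma->inserted++;
	return 0;
}

/* The helper, following the loop in the patch. */
int vm_insert_range(struct vm_area_struct *vma, unsigned long addr,
		    struct page **pages, unsigned long page_count)
{
	unsigned long uaddr = addr;
	unsigned long i;
	int ret = 0;

	for (i = 0; i < page_count; i++) {
		ret = vm_insert_page(vma, uaddr, pages[i]);
		if (ret < 0)
			return ret;	/* earlier pages stay inserted */
		uaddr += PAGE_SIZE;
	}

	return ret;
}

/* Before: the pattern a driver mmap handler would open-code. */
int demo_mmap_open_coded(struct vm_area_struct *vma, struct page **pages,
			 unsigned long n)
{
	unsigned long uaddr = vma->vm_start;
	unsigned long i;
	int ret;

	for (i = 0; i < n; i++) {
		ret = vm_insert_page(vma, uaddr, pages[i]);
		if (ret)
			return ret;
		uaddr += PAGE_SIZE;
	}

	return 0;
}

/* After: the same handler reduced to one call. */
int demo_mmap_converted(struct vm_area_struct *vma, struct page **pages,
			unsigned long n)
{
	return vm_insert_range(vma, vma->vm_start, pages, n);
}
```

In this model both variants return the same error and leave the same number of pages inserted, which is why an mmap handler can simply propagate the return value.
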
On Fri, 7 Dec 2018 00:09:45 +0530
Souptick Joarder <[email protected]> wrote:
> Previously drivers have their own way of mapping range of
> kernel pages/memory into user vma and this was done by
> invoking vm_insert_page() within a loop.
>
> As this pattern is common across different drivers, it can
> be generalized by creating a new function and use it across
> the drivers.
>
> vm_insert_range is the new API which will be used to map a
> range of kernel memory/pages to user vma.
>
> This API is tested by Heiko for Rockchip drm driver, on rk3188,
> rk3288, rk3328 and rk3399 with graphics.
>
> Signed-off-by: Souptick Joarder <[email protected]>
> Reviewed-by: Matthew Wilcox <[email protected]>
> Reviewed-by: Mike Rapoport <[email protected]>
> Tested-by: Heiko Stuebner <[email protected]>
Looks good to me.
Reviewed-by: Mauro Carvalho Chehab <[email protected]>
> ---
> include/linux/mm.h | 2 ++
> mm/memory.c | 38 ++++++++++++++++++++++++++++++++++++++
> mm/nommu.c | 7 +++++++
> 3 files changed, 47 insertions(+)
>
> diff --git a/include/linux/mm.h b/include/linux/mm.h
> index fcf9cc9..2bc399f 100644
> --- a/include/linux/mm.h
> +++ b/include/linux/mm.h
> @@ -2506,6 +2506,8 @@ unsigned long change_prot_numa(struct vm_area_struct *vma,
> int remap_pfn_range(struct vm_area_struct *, unsigned long addr,
> unsigned long pfn, unsigned long size, pgprot_t);
> int vm_insert_page(struct vm_area_struct *, unsigned long addr, struct page *);
> +int vm_insert_range(struct vm_area_struct *vma, unsigned long addr,
> + struct page **pages, unsigned long page_count);
> vm_fault_t vmf_insert_pfn(struct vm_area_struct *vma, unsigned long addr,
> unsigned long pfn);
> vm_fault_t vmf_insert_pfn_prot(struct vm_area_struct *vma, unsigned long addr,
> diff --git a/mm/memory.c b/mm/memory.c
> index 15c417e..84ea46c 100644
> --- a/mm/memory.c
> +++ b/mm/memory.c
> @@ -1478,6 +1478,44 @@ static int insert_page(struct vm_area_struct *vma, unsigned long addr,
> }
>
> /**
> + * vm_insert_range - insert range of kernel pages into user vma
> + * @vma: user vma to map to
> + * @addr: target user address of this page
> + * @pages: pointer to array of source kernel pages
> + * @page_count: number of pages need to insert into user vma
> + *
> + * This allows drivers to insert range of kernel pages they've allocated
> + * into a user vma. This is a generic function which drivers can use
> + * rather than using their own way of mapping range of kernel pages into
> + * user vma.
> + *
> + * If we fail to insert any page into the vma, the function will return
> + * immediately leaving any previously-inserted pages present. Callers
> + * from the mmap handler may immediately return the error as their caller
> + * will destroy the vma, removing any successfully-inserted pages. Other
> + * callers should make their own arrangements for calling unmap_region().
> + *
> + * Context: Process context. Called by mmap handlers.
> + * Return: 0 on success and error code otherwise
> + */
> +int vm_insert_range(struct vm_area_struct *vma, unsigned long addr,
> + struct page **pages, unsigned long page_count)
> +{
> + unsigned long uaddr = addr;
> + int ret = 0, i;
> +
> + for (i = 0; i < page_count; i++) {
> + ret = vm_insert_page(vma, uaddr, pages[i]);
> + if (ret < 0)
> + return ret;
> + uaddr += PAGE_SIZE;
> + }
> +
> + return ret;
> +}
> +EXPORT_SYMBOL(vm_insert_range);
> +
> +/**
> * vm_insert_page - insert single page into user vma
> * @vma: user vma to map to
> * @addr: target user address of this page
> diff --git a/mm/nommu.c b/mm/nommu.c
> index 749276b..d6ef5c7 100644
> --- a/mm/nommu.c
> +++ b/mm/nommu.c
> @@ -473,6 +473,13 @@ int vm_insert_page(struct vm_area_struct *vma, unsigned long addr,
> }
> EXPORT_SYMBOL(vm_insert_page);
>
> +int vm_insert_range(struct vm_area_struct *vma, unsigned long addr,
> + struct page **pages, unsigned long page_count)
> +{
> + return -EINVAL;
> +}
> +EXPORT_SYMBOL(vm_insert_range);
> +
> /*
> * sys_brk() for the most part doesn't need the global kernel
> * lock, except when an application is doing something nasty
Thanks,
Mauro
On 06/12/2018 18:39, Souptick Joarder wrote:
> Previously drivers have their own way of mapping range of
> kernel pages/memory into user vma and this was done by
> invoking vm_insert_page() within a loop.
>
> As this pattern is common across different drivers, it can
> be generalized by creating a new function and use it across
> the drivers.
>
> vm_insert_range is the new API which will be used to map a
> range of kernel memory/pages to user vma.
>
> This API is tested by Heiko for Rockchip drm driver, on rk3188,
> rk3288, rk3328 and rk3399 with graphics.
>
> Signed-off-by: Souptick Joarder <[email protected]>
> Reviewed-by: Matthew Wilcox <[email protected]>
> Reviewed-by: Mike Rapoport <[email protected]>
> Tested-by: Heiko Stuebner <[email protected]>
> ---
> include/linux/mm.h | 2 ++
> mm/memory.c | 38 ++++++++++++++++++++++++++++++++++++++
> mm/nommu.c | 7 +++++++
> 3 files changed, 47 insertions(+)
>
> diff --git a/include/linux/mm.h b/include/linux/mm.h
> index fcf9cc9..2bc399f 100644
> --- a/include/linux/mm.h
> +++ b/include/linux/mm.h
> @@ -2506,6 +2506,8 @@ unsigned long change_prot_numa(struct vm_area_struct *vma,
> int remap_pfn_range(struct vm_area_struct *, unsigned long addr,
> unsigned long pfn, unsigned long size, pgprot_t);
> int vm_insert_page(struct vm_area_struct *, unsigned long addr, struct page *);
> +int vm_insert_range(struct vm_area_struct *vma, unsigned long addr,
> + struct page **pages, unsigned long page_count);
> vm_fault_t vmf_insert_pfn(struct vm_area_struct *vma, unsigned long addr,
> unsigned long pfn);
> vm_fault_t vmf_insert_pfn_prot(struct vm_area_struct *vma, unsigned long addr,
> diff --git a/mm/memory.c b/mm/memory.c
> index 15c417e..84ea46c 100644
> --- a/mm/memory.c
> +++ b/mm/memory.c
> @@ -1478,6 +1478,44 @@ static int insert_page(struct vm_area_struct *vma, unsigned long addr,
> }
>
> /**
> + * vm_insert_range - insert range of kernel pages into user vma
> + * @vma: user vma to map to
> + * @addr: target user address of this page
> + * @pages: pointer to array of source kernel pages
> + * @page_count: number of pages need to insert into user vma
> + *
> + * This allows drivers to insert range of kernel pages they've allocated
> + * into a user vma. This is a generic function which drivers can use
> + * rather than using their own way of mapping range of kernel pages into
> + * user vma.
> + *
> + * If we fail to insert any page into the vma, the function will return
> + * immediately leaving any previously-inserted pages present. Callers
> + * from the mmap handler may immediately return the error as their caller
> + * will destroy the vma, removing any successfully-inserted pages. Other
> + * callers should make their own arrangements for calling unmap_region().
> + *
> + * Context: Process context. Called by mmap handlers.
> + * Return: 0 on success and error code otherwise
> + */
> +int vm_insert_range(struct vm_area_struct *vma, unsigned long addr,
> + struct page **pages, unsigned long page_count)
> +{
> + unsigned long uaddr = addr;
> + int ret = 0, i;
Some of the sites being replaced were effectively ensuring that vma and
pages were mutually compatible as an initial condition - would it be
worth adding something here for robustness, e.g.:
+ if (page_count != vma_pages(vma))
+ return -ENXIO;
?
(then you could also clean up a couple more places where you're not
already removing such checks)
Robin.
> +
> + for (i = 0; i < page_count; i++) {
> + ret = vm_insert_page(vma, uaddr, pages[i]);
> + if (ret < 0)
> + return ret;
> + uaddr += PAGE_SIZE;
> + }
> +
> + return ret;
> +}
> +EXPORT_SYMBOL(vm_insert_range);
> +
> +/**
> * vm_insert_page - insert single page into user vma
> * @vma: user vma to map to
> * @addr: target user address of this page
> diff --git a/mm/nommu.c b/mm/nommu.c
> index 749276b..d6ef5c7 100644
> --- a/mm/nommu.c
> +++ b/mm/nommu.c
> @@ -473,6 +473,13 @@ int vm_insert_page(struct vm_area_struct *vma, unsigned long addr,
> }
> EXPORT_SYMBOL(vm_insert_page);
>
> +int vm_insert_range(struct vm_area_struct *vma, unsigned long addr,
> + struct page **pages, unsigned long page_count)
> +{
> + return -EINVAL;
> +}
> +EXPORT_SYMBOL(vm_insert_range);
> +
> /*
> * sys_brk() for the most part doesn't need the global kernel
> * lock, except when an application is doing something nasty
>
On Fri, Dec 07, 2018 at 03:34:56PM +0000, Robin Murphy wrote:
> > +int vm_insert_range(struct vm_area_struct *vma, unsigned long addr,
> > + struct page **pages, unsigned long page_count)
> > +{
> > + unsigned long uaddr = addr;
> > + int ret = 0, i;
>
> Some of the sites being replaced were effectively ensuring that vma and
> pages were mutually compatible as an initial condition - would it be worth
> adding something here for robustness, e.g.:
>
> + if (page_count != vma_pages(vma))
> + return -ENXIO;
I think we want to allow this to be used to populate part of a VMA.
So perhaps:
if (page_count > vma_pages(vma))
return -ENXIO;
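
For illustration only, a minimal userspace sketch of how that bound could sit in front of the loop. The struct definitions, PAGE_SHIFT value, mock_inserted counter and the vm_insert_page() mock are simplified stand-ins (assumptions, not the kernel definitions), and the loop index is an unsigned long here rather than the int used in the patch, so this models only the control flow.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define PAGE_SHIFT 12
#define PAGE_SIZE (1UL << PAGE_SHIFT)

/* Simplified stand-ins for the kernel types (assumption, illustration only). */
struct page { int id; };
struct vm_area_struct { unsigned long vm_start, vm_end; };

/* Same arithmetic as the kernel's vma_pages() helper. */
unsigned long vma_pages(const struct vm_area_struct *vma)
{
	return (vma->vm_end - vma->vm_start) >> PAGE_SHIFT;
}

/* Mock state: counts insertions instead of touching page tables. */
unsigned long mock_inserted;

int vm_insert_page(struct vm_area_struct *vma, unsigned long addr,
		   struct page *page)
{
	assert(vma != NULL && page != NULL);
	if (addr < vma->vm_start || addr >= vma->vm_end)
		return -EFAULT;
	mock_inserted++;
	return 0;
}

int vm_insert_range(struct vm_area_struct *vma, unsigned long addr,
		    struct page **pages, unsigned long page_count)
{
	unsigned long uaddr = addr;
	unsigned long i;
	int ret = 0;

	/* Reject more pages than the vma holds; partial population is allowed. */
	if (page_count > vma_pages(vma))
		return -ENXIO;

	for (i = 0; i < page_count; i++) {
		ret = vm_insert_page(vma, uaddr, pages[i]);
		if (ret < 0)
			return ret;
		uaddr += PAGE_SIZE;
	}

	return ret;
}
```

With a two-page vma in this model, passing three pages is rejected with -ENXIO before anything is inserted, while passing one or two pages succeeds.
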
On Fri, Dec 7, 2018 at 10:41 PM Matthew Wilcox <[email protected]> wrote:
>
> On Fri, Dec 07, 2018 at 03:34:56PM +0000, Robin Murphy wrote:
> > > +int vm_insert_range(struct vm_area_struct *vma, unsigned long addr,
> > > + struct page **pages, unsigned long page_count)
> > > +{
> > > + unsigned long uaddr = addr;
> > > + int ret = 0, i;
> >
> > Some of the sites being replaced were effectively ensuring that vma and
> > pages were mutually compatible as an initial condition - would it be worth
> > adding something here for robustness, e.g.:
> >
> > + if (page_count != vma_pages(vma))
> > + return -ENXIO;
>
> I think we want to allow this to be used to populate part of a VMA.
> So perhaps:
>
> if (page_count > vma_pages(vma))
> return -ENXIO;
OK, this can be added.
I think Patch [2/9] is the only leftover place where this
check could be removed.
On 2018-12-07 7:28 pm, Souptick Joarder wrote:
> On Fri, Dec 7, 2018 at 10:41 PM Matthew Wilcox <[email protected]> wrote:
>>
>> On Fri, Dec 07, 2018 at 03:34:56PM +0000, Robin Murphy wrote:
>>>> +int vm_insert_range(struct vm_area_struct *vma, unsigned long addr,
>>>> + struct page **pages, unsigned long page_count)
>>>> +{
>>>> + unsigned long uaddr = addr;
>>>> + int ret = 0, i;
>>>
>>> Some of the sites being replaced were effectively ensuring that vma and
>>> pages were mutually compatible as an initial condition - would it be worth
>>> adding something here for robustness, e.g.:
>>>
>>> + if (page_count != vma_pages(vma))
>>> + return -ENXIO;
>>
>> I think we want to allow this to be used to populate part of a VMA.
>> So perhaps:
>>
>> if (page_count > vma_pages(vma))
>> return -ENXIO;
>
> Ok, This can be added.
>
> I think Patch [2/9] is the only leftover place where this
> check could be removed.
Right, 9/9 could also have relied on my stricter check here, but since
it's really testing whether it actually managed to allocate vma_pages()
worth of pages earlier, Matthew's more lenient version won't help for
that one. (Why privcmd_buf_mmap() doesn't clean up and return an error
as soon as that allocation loop fails, without taking the mutex under
which it still does a bunch more pointless work to only undo it again,
is a mind-boggling mystery, but that's not our problem here...)
Robin.
On Sat, Dec 8, 2018 at 2:40 AM Robin Murphy <[email protected]> wrote:
>
> On 2018-12-07 7:28 pm, Souptick Joarder wrote:
> > On Fri, Dec 7, 2018 at 10:41 PM Matthew Wilcox <[email protected]> wrote:
> >>
> >> On Fri, Dec 07, 2018 at 03:34:56PM +0000, Robin Murphy wrote:
> >>>> +int vm_insert_range(struct vm_area_struct *vma, unsigned long addr,
> >>>> + struct page **pages, unsigned long page_count)
> >>>> +{
> >>>> + unsigned long uaddr = addr;
> >>>> + int ret = 0, i;
> >>>
> >>> Some of the sites being replaced were effectively ensuring that vma and
> >>> pages were mutually compatible as an initial condition - would it be worth
> >>> adding something here for robustness, e.g.:
> >>>
> >>> + if (page_count != vma_pages(vma))
> >>> + return -ENXIO;
> >>
> >> I think we want to allow this to be used to populate part of a VMA.
> >> So perhaps:
> >>
> >> if (page_count > vma_pages(vma))
> >> return -ENXIO;
> >
> > Ok, This can be added.
> >
> > I think Patch [2/9] is the only leftover place where this
> > check could be removed.
>
> Right, 9/9 could also have relied on my stricter check here, but since
> it's really testing whether it actually managed to allocate vma_pages()
> worth of pages earlier, Matthew's more lenient version won't help for
> that one.
> (Why privcmd_buf_mmap() doesn't clean up and return an error
> as soon as that allocation loop fails, without taking the mutex under
> which it still does a bunch more pointless work to only undo it again,
> is a mind-boggling mystery, but that's not our problem here...)
I think some cleanup can be done here in a separate patch.