A series of fixes to simplify and improve the GUP interface with an eye to
providing groundwork to future improvements:-
* __access_remote_vm() and access_remote_vm() are functionally identical,
so make the former static such that in future we can potentially change
the external-facing implementation details of this function.
* Extend is_valid_gup_args() to cover the missing FOLL_TOUCH case, and
simplify things by defining INTERNAL_GUP_FLAGS to check against.
* Adjust __get_user_pages_locked() to explicitly treat a failure to pin any
pages as an error in all circumstances other than FOLL_NOWAIT being
specified, bringing it in line with the nommu implementation of this
function.
* (With many thanks to Arnd who suggested this in the first instance)
Update get_user_page_vma_remote() to explicitly only return a page or an
error, simplifying the interface and avoiding the questionable
IS_ERR_OR_NULL() pattern.
Lorenzo Stoakes (4):
mm: make __access_remote_vm() static
mm/gup: explicitly define and check internal GUP flags, disallow
FOLL_TOUCH
mm/gup: make failure to pin an error if FOLL_NOWAIT not specified
mm/gup: adapt get_user_page_vma_remote() to never return NULL
arch/arm64/kernel/mte.c | 4 ++--
include/linux/mm.h | 18 +++++++++++++-----
kernel/events/uprobes.c | 4 ++--
kernel/ptrace.c | 2 +-
mm/gup.c | 16 +++++++++++++---
mm/internal.h | 3 +++
mm/memory.c | 7 +++----
mm/nommu.c | 4 ++--
8 files changed, 39 insertions(+), 19 deletions(-)
--
2.42.0
access_remote_vm() passes through parameters to __access_remote_vm()
directly, so remove the __access_remote_vm() function from mm.h aand use
access_remote_vm() in the one caller that needs it (ptrace_access_vm()).
This allows future adjustments to the GUP-internal __access_remote_vm()
function while keeping the access_remote_vm() function stable.
Signed-off-by: Lorenzo Stoakes <[email protected]>
---
include/linux/mm.h | 2 --
kernel/ptrace.c | 2 +-
mm/memory.c | 4 ++--
mm/nommu.c | 4 ++--
4 files changed, 5 insertions(+), 7 deletions(-)
diff --git a/include/linux/mm.h b/include/linux/mm.h
index 52c40b3d0813..7b89f7bd420d 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -2415,8 +2415,6 @@ extern int access_process_vm(struct task_struct *tsk, unsigned long addr,
void *buf, int len, unsigned int gup_flags);
extern int access_remote_vm(struct mm_struct *mm, unsigned long addr,
void *buf, int len, unsigned int gup_flags);
-extern int __access_remote_vm(struct mm_struct *mm, unsigned long addr,
- void *buf, int len, unsigned int gup_flags);
long get_user_pages_remote(struct mm_struct *mm,
unsigned long start, unsigned long nr_pages,
diff --git a/kernel/ptrace.c b/kernel/ptrace.c
index 443057bee87c..d8b5e13a2229 100644
--- a/kernel/ptrace.c
+++ b/kernel/ptrace.c
@@ -59,7 +59,7 @@ int ptrace_access_vm(struct task_struct *tsk, unsigned long addr,
return 0;
}
- ret = __access_remote_vm(mm, addr, buf, len, gup_flags);
+ ret = access_remote_vm(mm, addr, buf, len, gup_flags);
mmput(mm);
return ret;
diff --git a/mm/memory.c b/mm/memory.c
index d4820802b01b..e2743aa95b56 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -5881,8 +5881,8 @@ EXPORT_SYMBOL_GPL(generic_access_phys);
/*
* Access another process' address space as given in mm.
*/
-int __access_remote_vm(struct mm_struct *mm, unsigned long addr, void *buf,
- int len, unsigned int gup_flags)
+static int __access_remote_vm(struct mm_struct *mm, unsigned long addr,
+ void *buf, int len, unsigned int gup_flags)
{
void *old_buf = buf;
int write = gup_flags & FOLL_WRITE;
diff --git a/mm/nommu.c b/mm/nommu.c
index 7f9e9e5a0e12..f9553579389b 100644
--- a/mm/nommu.c
+++ b/mm/nommu.c
@@ -1651,8 +1651,8 @@ vm_fault_t filemap_map_pages(struct vm_fault *vmf,
}
EXPORT_SYMBOL(filemap_map_pages);
-int __access_remote_vm(struct mm_struct *mm, unsigned long addr, void *buf,
- int len, unsigned int gup_flags)
+static int __access_remote_vm(struct mm_struct *mm, unsigned long addr,
+ void *buf, int len, unsigned int gup_flags)
{
struct vm_area_struct *vma;
int write = gup_flags & FOLL_WRITE;
--
2.42.0
Rather than open-coding a list of internal GUP flags in
is_valid_gup_args(), define which ones are internal.
In addition, we were not explicitly checking to see if the user passed in
FOLL_TOUCH somehow, this patch fixes that.
Signed-off-by: Lorenzo Stoakes <[email protected]>
---
mm/gup.c | 5 ++---
mm/internal.h | 3 +++
2 files changed, 5 insertions(+), 3 deletions(-)
diff --git a/mm/gup.c b/mm/gup.c
index 2f8a2d89fde1..b21b33d1787e 100644
--- a/mm/gup.c
+++ b/mm/gup.c
@@ -2227,12 +2227,11 @@ static bool is_valid_gup_args(struct page **pages, int *locked,
/*
* These flags not allowed to be specified externally to the gup
* interfaces:
- * - FOLL_PIN/FOLL_TRIED/FOLL_FAST_ONLY are internal only
+ * - FOLL_TOUCH/FOLL_PIN/FOLL_TRIED/FOLL_FAST_ONLY are internal only
* - FOLL_REMOTE is internal only and used on follow_page()
* - FOLL_UNLOCKABLE is internal only and used if locked is !NULL
*/
- if (WARN_ON_ONCE(gup_flags & (FOLL_PIN | FOLL_TRIED | FOLL_UNLOCKABLE |
- FOLL_REMOTE | FOLL_FAST_ONLY)))
+ if (WARN_ON_ONCE(gup_flags & INTERNAL_GUP_FLAGS))
return false;
gup_flags |= to_set;
diff --git a/mm/internal.h b/mm/internal.h
index 449891ad7fdb..499016c6b01d 100644
--- a/mm/internal.h
+++ b/mm/internal.h
@@ -1018,6 +1018,9 @@ enum {
FOLL_UNLOCKABLE = 1 << 21,
};
+#define INTERNAL_GUP_FLAGS (FOLL_TOUCH | FOLL_TRIED | FOLL_REMOTE | FOLL_PIN | \
+ FOLL_FAST_ONLY | FOLL_UNLOCKABLE)
+
/*
* Indicates for which pages that are write-protected in the page table,
* whether GUP has to trigger unsharing via FAULT_FLAG_UNSHARE such that the
--
2.42.0
get_user_pages_remote() will never return 0 except in the case of
FOLL_NOWAIT being specified, which we explicitly disallow.
This simplifies error handling for the caller and avoids the awkwardness of
dealing with both errors and failing to pin. Failing to pin here is an
error.
Suggested-by: Arnd Bergmann <[email protected]>
Signed-off-by: Lorenzo Stoakes <[email protected]>
---
arch/arm64/kernel/mte.c | 4 ++--
include/linux/mm.h | 16 +++++++++++++---
kernel/events/uprobes.c | 4 ++--
mm/memory.c | 3 +--
4 files changed, 18 insertions(+), 9 deletions(-)
diff --git a/arch/arm64/kernel/mte.c b/arch/arm64/kernel/mte.c
index 4edecaac8f91..8878b392df58 100644
--- a/arch/arm64/kernel/mte.c
+++ b/arch/arm64/kernel/mte.c
@@ -411,8 +411,8 @@ static int __access_remote_tags(struct mm_struct *mm, unsigned long addr,
struct page *page = get_user_page_vma_remote(mm, addr,
gup_flags, &vma);
- if (IS_ERR_OR_NULL(page)) {
- err = page == NULL ? -EIO : PTR_ERR(page);
+ if (IS_ERR(page)) {
+ err = PTR_ERR(page);
break;
}
diff --git a/include/linux/mm.h b/include/linux/mm.h
index 7b89f7bd420d..da9631683d38 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -2425,6 +2425,7 @@ long pin_user_pages_remote(struct mm_struct *mm,
unsigned int gup_flags, struct page **pages,
int *locked);
+/* Either retrieve a single VMA and page, or an error. */
static inline struct page *get_user_page_vma_remote(struct mm_struct *mm,
unsigned long addr,
int gup_flags,
@@ -2432,12 +2433,21 @@ static inline struct page *get_user_page_vma_remote(struct mm_struct *mm,
{
struct page *page;
struct vm_area_struct *vma;
- int got = get_user_pages_remote(mm, addr, 1, gup_flags, &page, NULL);
+ int got;
+
+ if (unlikely(gup_flags & FOLL_NOWAIT))
+ return ERR_PTR(-EINVAL);
+
+ got = get_user_pages_remote(mm, addr, 1, gup_flags, &page, NULL);
if (got < 0)
return ERR_PTR(got);
- if (got == 0)
- return NULL;
+
+ /*
+ * get_user_pages_remote() is guaranteed to not return 0 for
+ * non-FOLL_NOWAIT contexts, so this should never happen.
+ */
+ VM_WARN_ON(got == 0);
vma = vma_lookup(mm, addr);
if (WARN_ON_ONCE(!vma)) {
diff --git a/kernel/events/uprobes.c b/kernel/events/uprobes.c
index 3048589e2e85..435aac1d8c27 100644
--- a/kernel/events/uprobes.c
+++ b/kernel/events/uprobes.c
@@ -474,8 +474,8 @@ int uprobe_write_opcode(struct arch_uprobe *auprobe, struct mm_struct *mm,
gup_flags |= FOLL_SPLIT_PMD;
/* Read the page with vaddr into memory */
old_page = get_user_page_vma_remote(mm, vaddr, gup_flags, &vma);
- if (IS_ERR_OR_NULL(old_page))
- return old_page ? PTR_ERR(old_page) : 0;
+ if (IS_ERR(old_page))
+ return PTR_ERR(old_page);
ret = verify_opcode(old_page, vaddr, &opcode);
if (ret <= 0)
diff --git a/mm/memory.c b/mm/memory.c
index e2743aa95b56..f2eef3d1cf58 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -5905,7 +5905,7 @@ static int __access_remote_vm(struct mm_struct *mm, unsigned long addr,
struct page *page = get_user_page_vma_remote(mm, addr,
gup_flags, &vma);
- if (IS_ERR_OR_NULL(page)) {
+ if (IS_ERR(page)) {
/* We might need to expand the stack to access it */
vma = vma_lookup(mm, addr);
if (!vma) {
@@ -5919,7 +5919,6 @@ static int __access_remote_vm(struct mm_struct *mm, unsigned long addr,
continue;
}
-
/*
* Check if this is a VM_IO | VM_PFNMAP VMA, which
* we can access using slightly different code.
--
2.42.0
On Sun, Oct 01, 2023 at 05:00:05PM +0100, Lorenzo Stoakes wrote:
> get_user_pages_remote() will never return 0 except in the case of
> FOLL_NOWAIT being specified, which we explicitly disallow.
>
> This simplifies error handling for the caller and avoids the awkwardness of
> dealing with both errors and failing to pin. Failing to pin here is an
> error.
>
> Suggested-by: Arnd Bergmann <[email protected]>
> Signed-off-by: Lorenzo Stoakes <[email protected]>
> ---
> arch/arm64/kernel/mte.c | 4 ++--
For arm64:
Acked-by: Catalin Marinas <[email protected]>
On 01.10.23 18:00, Lorenzo Stoakes wrote:
> Rather than open-coding a list of internal GUP flags in
> is_valid_gup_args(), define which ones are internal.
>
> In addition, we were not explicitly checking to see if the user passed in
> FOLL_TOUCH somehow, this patch fixes that.
>
> Signed-off-by: Lorenzo Stoakes <[email protected]>
> ---
> mm/gup.c | 5 ++---
> mm/internal.h | 3 +++
> 2 files changed, 5 insertions(+), 3 deletions(-)
>
> diff --git a/mm/gup.c b/mm/gup.c
> index 2f8a2d89fde1..b21b33d1787e 100644
> --- a/mm/gup.c
> +++ b/mm/gup.c
> @@ -2227,12 +2227,11 @@ static bool is_valid_gup_args(struct page **pages, int *locked,
> /*
> * These flags not allowed to be specified externally to the gup
> * interfaces:
> - * - FOLL_PIN/FOLL_TRIED/FOLL_FAST_ONLY are internal only
> + * - FOLL_TOUCH/FOLL_PIN/FOLL_TRIED/FOLL_FAST_ONLY are internal only
> * - FOLL_REMOTE is internal only and used on follow_page()
> * - FOLL_UNLOCKABLE is internal only and used if locked is !NULL
> */
> - if (WARN_ON_ONCE(gup_flags & (FOLL_PIN | FOLL_TRIED | FOLL_UNLOCKABLE |
> - FOLL_REMOTE | FOLL_FAST_ONLY)))
> + if (WARN_ON_ONCE(gup_flags & INTERNAL_GUP_FLAGS))
> return false;
>
> gup_flags |= to_set;
> diff --git a/mm/internal.h b/mm/internal.h
> index 449891ad7fdb..499016c6b01d 100644
> --- a/mm/internal.h
> +++ b/mm/internal.h
> @@ -1018,6 +1018,9 @@ enum {
> FOLL_UNLOCKABLE = 1 << 21,
> };
>
> +#define INTERNAL_GUP_FLAGS (FOLL_TOUCH | FOLL_TRIED | FOLL_REMOTE | FOLL_PIN | \
> + FOLL_FAST_ONLY | FOLL_UNLOCKABLE)
> +
> /*
> * Indicates for which pages that are write-protected in the page table,
> * whether GUP has to trigger unsharing via FAULT_FLAG_UNSHARE such that the
Reviewed-by: David Hildenbrand <[email protected]>
--
Cheers,
David / dhildenb
On 01.10.23 18:00, Lorenzo Stoakes wrote:
> get_user_pages_remote() will never return 0 except in the case of
> FOLL_NOWAIT being specified, which we explicitly disallow.
>
> This simplifies error handling for the caller and avoids the awkwardness of
> dealing with both errors and failing to pin. Failing to pin here is an
> error.
>
> Suggested-by: Arnd Bergmann <[email protected]>
> Signed-off-by: Lorenzo Stoakes <[email protected]>
> ---
> arch/arm64/kernel/mte.c | 4 ++--
> include/linux/mm.h | 16 +++++++++++++---
> kernel/events/uprobes.c | 4 ++--
> mm/memory.c | 3 +--
> 4 files changed, 18 insertions(+), 9 deletions(-)
>
> diff --git a/arch/arm64/kernel/mte.c b/arch/arm64/kernel/mte.c
> index 4edecaac8f91..8878b392df58 100644
> --- a/arch/arm64/kernel/mte.c
> +++ b/arch/arm64/kernel/mte.c
> @@ -411,8 +411,8 @@ static int __access_remote_tags(struct mm_struct *mm, unsigned long addr,
> struct page *page = get_user_page_vma_remote(mm, addr,
> gup_flags, &vma);
>
> - if (IS_ERR_OR_NULL(page)) {
> - err = page == NULL ? -EIO : PTR_ERR(page);
> + if (IS_ERR(page)) {
> + err = PTR_ERR(page);
> break;
> }
>
> diff --git a/include/linux/mm.h b/include/linux/mm.h
> index 7b89f7bd420d..da9631683d38 100644
> --- a/include/linux/mm.h
> +++ b/include/linux/mm.h
> @@ -2425,6 +2425,7 @@ long pin_user_pages_remote(struct mm_struct *mm,
> unsigned int gup_flags, struct page **pages,
> int *locked);
>
> +/* Either retrieve a single VMA and page, or an error. */
> static inline struct page *get_user_page_vma_remote(struct mm_struct *mm,
> unsigned long addr,
> int gup_flags,
> @@ -2432,12 +2433,21 @@ static inline struct page *get_user_page_vma_remote(struct mm_struct *mm,
> {
> struct page *page;
> struct vm_area_struct *vma;
> - int got = get_user_pages_remote(mm, addr, 1, gup_flags, &page, NULL);
> + int got;
> +
> + if (unlikely(gup_flags & FOLL_NOWAIT))
> + return ERR_PTR(-EINVAL);
> +
Do we have any callers or do we want to make that official (document it)
and use WARN_ON_ONCE() instead?
> + got = get_user_pages_remote(mm, addr, 1, gup_flags, &page, NULL);
>
> if (got < 0)
> return ERR_PTR(got);
> - if (got == 0)
> - return NULL;
> +
> + /*
> + * get_user_pages_remote() is guaranteed to not return 0 for
> + * non-FOLL_NOWAIT contexts, so this should never happen.
> + */
> + VM_WARN_ON(got == 0);
You should probably just drop that. Not worth the comment + code and its
better checked inside get_user_pages_remote().
Ideally, just document that behavior for get_user_pages_remote() "Will
never return 0 without FOLL_NOWAIT."
--
Cheers,
David / dhildenb
On 01.10.23 18:00, Lorenzo Stoakes wrote:
> access_remote_vm() passes through parameters to __access_remote_vm()
> directly, so remove the __access_remote_vm() function from mm.h aand use
> access_remote_vm() in the one caller that needs it (ptrace_access_vm()).
>
Wondering why we even still have __access_remote_vm() in the first
place, if it really should just be access_remote_vm() directly.
Reviewed-by: David Hildenbrand <[email protected]>
--
Cheers,
David / dhildenb
On Mon, Oct 02, 2023 at 01:08:41PM +0200, David Hildenbrand wrote:
> On 01.10.23 18:00, Lorenzo Stoakes wrote:
> > get_user_pages_remote() will never return 0 except in the case of
> > FOLL_NOWAIT being specified, which we explicitly disallow.
> >
> > This simplifies error handling for the caller and avoids the awkwardness of
> > dealing with both errors and failing to pin. Failing to pin here is an
> > error.
> >
> > Suggested-by: Arnd Bergmann <[email protected]>
> > Signed-off-by: Lorenzo Stoakes <[email protected]>
> > ---
> > arch/arm64/kernel/mte.c | 4 ++--
> > include/linux/mm.h | 16 +++++++++++++---
> > kernel/events/uprobes.c | 4 ++--
> > mm/memory.c | 3 +--
> > 4 files changed, 18 insertions(+), 9 deletions(-)
> >
> > diff --git a/arch/arm64/kernel/mte.c b/arch/arm64/kernel/mte.c
> > index 4edecaac8f91..8878b392df58 100644
> > --- a/arch/arm64/kernel/mte.c
> > +++ b/arch/arm64/kernel/mte.c
> > @@ -411,8 +411,8 @@ static int __access_remote_tags(struct mm_struct *mm, unsigned long addr,
> > struct page *page = get_user_page_vma_remote(mm, addr,
> > gup_flags, &vma);
> > - if (IS_ERR_OR_NULL(page)) {
> > - err = page == NULL ? -EIO : PTR_ERR(page);
> > + if (IS_ERR(page)) {
> > + err = PTR_ERR(page);
> > break;
> > }
> > diff --git a/include/linux/mm.h b/include/linux/mm.h
> > index 7b89f7bd420d..da9631683d38 100644
> > --- a/include/linux/mm.h
> > +++ b/include/linux/mm.h
> > @@ -2425,6 +2425,7 @@ long pin_user_pages_remote(struct mm_struct *mm,
> > unsigned int gup_flags, struct page **pages,
> > int *locked);
> > +/* Either retrieve a single VMA and page, or an error. */
> > static inline struct page *get_user_page_vma_remote(struct mm_struct *mm,
> > unsigned long addr,
> > int gup_flags,
> > @@ -2432,12 +2433,21 @@ static inline struct page *get_user_page_vma_remote(struct mm_struct *mm,
> > {
> > struct page *page;
> > struct vm_area_struct *vma;
> > - int got = get_user_pages_remote(mm, addr, 1, gup_flags, &page, NULL);
> > + int got;
> > +
> > + if (unlikely(gup_flags & FOLL_NOWAIT))
> > + return ERR_PTR(-EINVAL);
> > +
>
> Do we have any callers or do we want to make that official (document it) and
> use WARN_ON_ONCE() instead?
We have no callers who want to do that, will WARN_ON_ONCE() and update
comment in v2.
>
> > + got = get_user_pages_remote(mm, addr, 1, gup_flags, &page, NULL);
> > if (got < 0)
> > return ERR_PTR(got);
> > - if (got == 0)
> > - return NULL;
> > +
> > + /*
> > + * get_user_pages_remote() is guaranteed to not return 0 for
> > + * non-FOLL_NOWAIT contexts, so this should never happen.
> > + */
> > + VM_WARN_ON(got == 0);
>
> You should probably just drop that. Not worth the comment + code and its
> better checked inside get_user_pages_remote().
>
> Ideally, just document that behavior for get_user_pages_remote() "Will never
> return 0 without FOLL_NOWAIT."
Well you'd need to put it at all the other callers of
__get_user_pages_locked() too :) so I think probably not worth doing that,
at least in this patch series.
In any case, we have explicitly added this check there to ensure that's not
possible so I think we're good, will just strip in v2.
>
> --
> Cheers,
>
> David / dhildenb
>
On Sun, Oct 01, 2023 at 05:00:02PM +0100, Lorenzo Stoakes wrote:
> access_remote_vm() passes through parameters to __access_remote_vm()
> directly, so remove the __access_remote_vm() function from mm.h aand use
> access_remote_vm() in the one caller that needs it (ptrace_access_vm()).
>
> This allows future adjustments to the GUP-internal __access_remote_vm()
> function while keeping the access_remote_vm() function stable.
>
> Signed-off-by: Lorenzo Stoakes <[email protected]>
> ---
> include/linux/mm.h | 2 --
> kernel/ptrace.c | 2 +-
> mm/memory.c | 4 ++--
> mm/nommu.c | 4 ++--
> 4 files changed, 5 insertions(+), 7 deletions(-)
Why do we even have two versions?
Reviewed-by: Jason Gunthorpe <[email protected]>
Jason
On Sun, Oct 01, 2023 at 05:00:03PM +0100, Lorenzo Stoakes wrote:
> Rather than open-coding a list of internal GUP flags in
> is_valid_gup_args(), define which ones are internal.
>
> In addition, we were not explicitly checking to see if the user passed in
> FOLL_TOUCH somehow, this patch fixes that.
>
> Signed-off-by: Lorenzo Stoakes <[email protected]>
> ---
> mm/gup.c | 5 ++---
> mm/internal.h | 3 +++
> 2 files changed, 5 insertions(+), 3 deletions(-)
Does gup_test still work? It uses FOLL_TOUCH?
Hmm. I guess it was broken for a while anyhow:
/* Just the flags we need, copied from mm.h: */
#define FOLL_WRITE 0x01 /* check pte is writable */
#define FOLL_TOUCH 0x02 /* mark page accessed */
Aside from that this seems OK
Reviewed-by: Jason Gunthorpe <[email protected]>
Jason
On Sun, Oct 01, 2023 at 05:00:05PM +0100, Lorenzo Stoakes wrote:
> get_user_pages_remote() will never return 0 except in the case of
> FOLL_NOWAIT being specified, which we explicitly disallow.
>
> This simplifies error handling for the caller and avoids the awkwardness of
> dealing with both errors and failing to pin. Failing to pin here is an
> error.
>
> Suggested-by: Arnd Bergmann <[email protected]>
> Signed-off-by: Lorenzo Stoakes <[email protected]>
> ---
> arch/arm64/kernel/mte.c | 4 ++--
> include/linux/mm.h | 16 +++++++++++++---
> kernel/events/uprobes.c | 4 ++--
> mm/memory.c | 3 +--
> 4 files changed, 18 insertions(+), 9 deletions(-)
Good riddance to IS_ERR_OR_NULL
Reviewed-by: Jason Gunthorpe <[email protected]>
Jason