On Wed, Apr 28, 2021 at 11:01:09AM -0700, Axel Rasmussen wrote:
> Consider the following sequence of events (described from the point of
> view of the commit that introduced the bug - see "Fixes:" below):
>
> 1. Userspace issues a UFFD ioctl, which ends up calling into
> shmem_mcopy_atomic_pte(). We successfully account the blocks, we
> shmem_alloc_page(), but then the copy_from_user() fails. We return
> -EFAULT. We don't release the page we allocated.
> 2. Our caller detects this error code, tries the copy_from_user() after
> dropping the mmap_sem, and retries, calling back into
> shmem_mcopy_atomic_pte().
> 3. Meanwhile, let's say another process filled up the tmpfs being used.
> 4. So shmem_mcopy_atomic_pte() fails to account blocks this time, and
> immediately returns - without releasing the page. This triggers a
> BUG_ON in our caller, which asserts that the page should always be
> consumed, unless -EFAULT is returned.
>
> (Later on in the commit history, -EFAULT became -ENOENT, mmap_sem became
> mmap_lock, and shmem_inode_acct_block() was added.)
I suggest you do s/EFAULT/ENOENT/ directly in the above.
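
For anyone following along, here is a minimal userspace sketch of the four
steps above, with -ENOENT already substituted. It is only a toy model under
my reading of the description: toy_page, toy_fs, toy_mfill() and
toy_caller_hits_bug() are made-up names, not kernel symbols, and the real
function does much more than this.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-ins, not the kernel's struct page or tmpfs accounting. */
struct toy_page { int refcount; };
struct toy_fs { long free_blocks; };

static struct toy_page the_page;	/* the one page this model ever hands out */

/* Models the unfixed behaviour described in the commit message. */
static int toy_mfill(struct toy_fs *fs, struct toy_page **pagep,
		     bool user_copy_ok)
{
	if (fs->free_blocks < 1)
		return -ENOMEM;		/* step 4: *pagep is left dangling */
	fs->free_blocks--;

	if (!*pagep) {
		the_page.refcount = 1;	/* stands in for shmem_alloc_page() */
		if (!user_copy_ok) {
			*pagep = &the_page;	/* step 1: page kept for the retry */
			fs->free_blocks++;	/* accounting undone before returning */
			return -ENOENT;
		}
	}
	*pagep = NULL;			/* page consumed */
	return 0;
}

/* Returns true if the caller's "page must be consumed" check would fire. */
static bool toy_caller_hits_bug(struct toy_fs *fs)
{
	struct toy_page *page = NULL;
	int err = toy_mfill(fs, &page, false);

	if (err == -ENOENT) {
		assert(page != NULL);	/* a retry always comes with a page */
		/* step 2: the copy is redone outside the lock, then we retry */
		fs->free_blocks = 0;	/* step 3: someone else fills the tmpfs */
		err = toy_mfill(fs, &page, true);
	}
	return err != -ENOENT && page != NULL;
}
```

Starting with one free block, toy_caller_hits_bug() reports the dangling
page, which is the condition the BUG_ON in the caller checks for.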
>
> A malicious user (even an unprivileged one) could trigger this
> intentionally without too much trouble.
>
> To fix this, detect if we have a "dangling" page when accounting fails,
> and if so, release it before returning.
>
> Fixes: cb658a453b93 ("userfaultfd: shmem: avoid leaking blocks and used blocks in UFFDIO_COPY")
> Reported-by: Hugh Dickins <[email protected]>
> Signed-off-by: Axel Rasmussen <[email protected]>
> ---
> mm/shmem.c | 13 ++++++++++++-
> 1 file changed, 12 insertions(+), 1 deletion(-)
>
> diff --git a/mm/shmem.c b/mm/shmem.c
> index 26c76b13ad23..46766c9d7151 100644
> --- a/mm/shmem.c
> +++ b/mm/shmem.c
> @@ -2375,8 +2375,19 @@ static int shmem_mfill_atomic_pte(struct mm_struct *dst_mm,
> pgoff_t offset, max_off;
>
> ret = -ENOMEM;
> - if (!shmem_inode_acct_block(inode, 1))
> + if (!shmem_inode_acct_block(inode, 1)) {
> + /*
> + * We may have got a page, returned -ENOENT triggering a retry,
> + * and now we find ourselves with -ENOMEM. Release the page, to
> + * avoid a BUG_ON in our caller.
> + */
> + if (unlikely(*pagep)) {
> + unlock_page(*pagep);
Not necessary?
> + put_page(*pagep);
> + *pagep = NULL;
> + }
> goto out;
All the "goto out" statements in this function look odd, since the label just
returns directly... so if you're touching this anyway, I suggest we do
"return -ENOMEM" directly and drop the "ret = -ENOMEM".
Thanks,
> + }
>
> if (!*pagep) {
> page = shmem_alloc_page(gfp, info, pgoff);
> --
> 2.31.1.498.g6c1eba8ee3d-goog
>
--
Peter Xu
On Wed, 28 Apr 2021, Peter Xu wrote:
> On Wed, Apr 28, 2021 at 11:01:09AM -0700, Axel Rasmussen wrote:
> > Consider the following sequence of events (described from the point of
> > view of the commit that introduced the bug - see "Fixes:" below):
> >
> > 1. Userspace issues a UFFD ioctl, which ends up calling into
> > shmem_mcopy_atomic_pte(). We successfully account the blocks, we
> > shmem_alloc_page(), but then the copy_from_user() fails. We return
> > -EFAULT. We don't release the page we allocated.
> > 2. Our caller detects this error code, tries the copy_from_user() after
> > dropping the mmap_sem, and retries, calling back into
> > shmem_mcopy_atomic_pte().
> > 3. Meanwhile, let's say another process filled up the tmpfs being used.
> > 4. So shmem_mcopy_atomic_pte() fails to account blocks this time, and
> > immediately returns - without releasing the page. This triggers a
> > BUG_ON in our caller, which asserts that the page should always be
> > consumed, unless -EFAULT is returned.
> >
> > (Later on in the commit history, -EFAULT became -ENOENT, mmap_sem became
> > mmap_lock, and shmem_inode_acct_block() was added.)
>
> I suggest you do s/EFAULT/ENOENT/ directly in the above.
I haven't looked into the history, but it would be best for this to
describe the situation in v5.12, never mind the details which were
different at the time of the commit tagged with Fixes. But we stay
alert that when it's backported to stable, we may need to adjust
something to suit those releases (which will depend on how much
else has been backported to them meanwhile).
>
> >
> > A malicious user (even an unprivileged one) could trigger this
> > intentionally without too much trouble.
I regret having suggested that. Maybe. Opinions differ as to whether
it's helpful to call attention like that. I'd say delete that paragraph.
> >
> > To fix this, detect if we have a "dangling" page when accounting fails,
> > and if so, release it before returning.
> >
> > Fixes: cb658a453b93 ("userfaultfd: shmem: avoid leaking blocks and used blocks in UFFDIO_COPY")
> > Reported-by: Hugh Dickins <[email protected]>
> > Signed-off-by: Axel Rasmussen <[email protected]>
Thanks for getting on to this so quickly, Axel.
But Peter is right, that unlock_page() needs removing.
> > ---
> > mm/shmem.c | 13 ++++++++++++-
> > 1 file changed, 12 insertions(+), 1 deletion(-)
> >
> > diff --git a/mm/shmem.c b/mm/shmem.c
> > index 26c76b13ad23..46766c9d7151 100644
> > --- a/mm/shmem.c
> > +++ b/mm/shmem.c
> > @@ -2375,8 +2375,19 @@ static int shmem_mfill_atomic_pte(struct mm_struct *dst_mm,
> > pgoff_t offset, max_off;
> >
> > ret = -ENOMEM;
> > - if (!shmem_inode_acct_block(inode, 1))
> > + if (!shmem_inode_acct_block(inode, 1)) {
> > + /*
> > + * We may have got a page, returned -ENOENT triggering a retry,
> > + * and now we find ourselves with -ENOMEM. Release the page, to
> > + * avoid a BUG_ON in our caller.
> > + */
> > + if (unlikely(*pagep)) {
> > + unlock_page(*pagep);
>
> Not necessary?
Worse than not necessary: would trigger a VM_BUG_ON_PAGE(). Delete!
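
To spell out what I'd expect the error path to look like, here is a small
userspace sketch. It assumes, as I read the code, that the page handed back
on the -ENOENT path has not been locked yet, so there is nothing to unlock
on the retry. The toy_* names are hypothetical stand-ins, not the kernel
functions, and only the start of the function is modelled.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-ins for the kernel objects; not the real definitions. */
struct toy_page {
	bool locked;
	int refcount;
};

struct toy_inode {
	long free_blocks;
};

/* Stands in for shmem_inode_acct_block(): false when the tmpfs is full. */
static bool toy_acct_block(struct toy_inode *inode, long pages)
{
	if (inode->free_blocks < pages)
		return false;
	inode->free_blocks -= pages;
	return true;
}

/* Stands in for put_page(): drop one reference. */
static void toy_put_page(struct toy_page *page)
{
	assert(page->refcount > 0);
	page->refcount--;
}

/* Models only the accounting check at the top of the function. */
static int toy_mfill_atomic_pte(struct toy_inode *inode,
				struct toy_page **pagep)
{
	if (!toy_acct_block(inode, 1)) {
		if (*pagep) {
			/*
			 * The page left over from the failed first attempt
			 * was never locked, so there is no unlock here:
			 * just drop the reference and clear the pointer.
			 */
			assert(!(*pagep)->locked);
			toy_put_page(*pagep);
			*pagep = NULL;
		}
		return -ENOMEM;
	}
	return 0;
}
```

With a full toy_inode and a leftover page, this returns -ENOMEM, leaves
*pagep NULL and drops the reference, which is all the caller needs.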
>
> > + put_page(*pagep);
> > + *pagep = NULL;
> > + }
> > goto out;
>
> All the "goto out" statements in this function look odd, since the label just
> returns directly... so if you're touching this anyway, I suggest we do
> "return -ENOMEM" directly and drop the "ret = -ENOMEM".
No strong feeling either way from me on that: whichever looks best
to you. But I suspect the "ret = -ENOMEM" cannot be dropped,
because it's relied on further down too?
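
To illustrate the concern with a toy example (this is not the real
mm/shmem.c code; toy_fill() and its flags are made up for the purpose, and
whether the real function has such a path needs checking against the tree):
if any later error path jumps to the label without assigning "ret", the
preset value is what gets returned, so removing the preset silently changes
that path's result.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/*
 * Toy function: 'ret' is preset once and reused by a later error path.
 * keep_preset selects whether the "ret = -ENOMEM" assignment is kept.
 */
static int toy_fill(bool acct_ok, bool alloc_ok, bool keep_preset)
{
	int ret = 0;

	if (keep_preset)
		ret = -ENOMEM;

	if (!acct_ok)
		return -ENOMEM;		/* direct return, as suggested */

	if (!alloc_ok)
		goto out;		/* relies on the preset 'ret' */

	ret = 0;			/* success */
out:
	return ret;
}
```

With the preset kept, a failed allocation returns -ENOMEM; with it dropped,
the same failure returns 0 in this toy, which is the kind of regression I
would want ruled out before removing the line.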
>
> Thanks,
>
> > + }
> >
> > if (!*pagep) {
> > page = shmem_alloc_page(gfp, info, pgoff);
> > --
> > 2.31.1.498.g6c1eba8ee3d-goog
> >
>
> --
> Peter Xu
On Wed, Apr 28, 2021 at 02:03:05PM -0700, Hugh Dickins wrote:
[...]
> > > + put_page(*pagep);
> > > + *pagep = NULL;
> > > + }
> > > goto out;
> >
> > All the "goto out" statements in this function look odd, since the label just
> > returns directly... so if you're touching this anyway, I suggest we do
> > "return -ENOMEM" directly and drop the "ret = -ENOMEM".
>
> No strong feeling either way from me on that: whichever looks best
> to you. But I suspect the "ret = -ENOMEM" cannot be dropped,
> because it's relied on further down too?
Ah sorry, I just noticed Axel didn't really touch that line.. :) So yeah, please
also feel free to keep it as is.
If we do drop the "goto out", I think "ret = -ENOMEM" can go as well, since all
later error paths should set "ret" themselves.
Thanks,
--
Peter Xu
On Wed, Apr 28, 2021 at 2:24 PM Peter Xu <[email protected]> wrote:
>
> On Wed, Apr 28, 2021 at 02:03:05PM -0700, Hugh Dickins wrote:
>
> [...]
>
> > > > + put_page(*pagep);
> > > > + *pagep = NULL;
> > > > + }
> > > > goto out;
> > >
> > > All the "goto out" statements in this function look odd, since the label just
> > > returns directly... so if you're touching this anyway, I suggest we do
> > > "return -ENOMEM" directly and drop the "ret = -ENOMEM".
> >
> > No strong feeling either way from me on that: whichever looks best
> > to you. But I suspect the "ret = -ENOMEM" cannot be dropped,
> > because it's relied on further down too?
>
> Ah sorry, I just noticed Axel didn't really touch that line.. :) So yeah, please
> also feel free to keep it as is.
>
> If we do drop the "goto out", I think "ret = -ENOMEM" can go as well, since all
> later error paths should set "ret" themselves.
Although I can see a refactor which simplifies the error handling a
bit, my inclination is to leave it alone in this patch, since it's
meant to be a simple fix, especially considering we may need to
backport it as far back as 4.14.
But I'll keep this feedback in mind and try to apply it as much as
possible in my other series, which is already significantly refactoring
this function, so the end state ought to have error handling that is as
simple and consistent as possible.
>
> Thanks,
>
> --
> Peter Xu
>