find_vma() will continue to search upwards until the end of the virtual
memory space. This means the si_code would almost never be set to
SEGV_MAPERR even when the address falls outside of any VMA. The result
is that the si_code is not reliable as it may or may not be set to the
correct result, depending on where the address falls in the address
space.
Using find_vma_intersection() allows for what is intended by only
returning a VMA if it falls within the range provided, in this case a
window of 1.
Signed-off-by: Liam R. Howlett <[email protected]>
---
arch/arm64/kernel/traps.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/arch/arm64/kernel/traps.c b/arch/arm64/kernel/traps.c
index a05d34f0e82a..a44007904a64 100644
--- a/arch/arm64/kernel/traps.c
+++ b/arch/arm64/kernel/traps.c
@@ -383,9 +383,10 @@ void force_signal_inject(int signal, int code, unsigned long address, unsigned i
void arm64_notify_segfault(unsigned long addr)
{
int code;
+ unsigned long ut_addr = untagged_addr(addr);
mmap_read_lock(current->mm);
- if (find_vma(current->mm, untagged_addr(addr)) == NULL)
+ if (find_vma_intersection(current->mm, ut_addr, ut_addr + 1) == NULL)
code = SEGV_MAPERR;
else
code = SEGV_ACCERR;
--
2.30.0
On Wed, Apr 07, 2021 at 03:11:06PM +0000, Liam Howlett wrote:
> find_vma() will continue to search upwards until the end of the virtual
> memory space. This means the si_code would almost never be set to
> SEGV_MAPERR even when the address falls outside of any VMA. The result
> is that the si_code is not reliable as it may or may not be set to the
> correct result, depending on where the address falls in the address
> space.
>
> Using find_vma_intersection() allows for what is intended by only
> returning a VMA if it falls within the range provided, in this case a
> window of 1.
>
> Signed-off-by: Liam R. Howlett <[email protected]>
> ---
> arch/arm64/kernel/traps.c | 3 ++-
> 1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/arch/arm64/kernel/traps.c b/arch/arm64/kernel/traps.c
> index a05d34f0e82a..a44007904a64 100644
> --- a/arch/arm64/kernel/traps.c
> +++ b/arch/arm64/kernel/traps.c
> @@ -383,9 +383,10 @@ void force_signal_inject(int signal, int code, unsigned long address, unsigned i
> void arm64_notify_segfault(unsigned long addr)
> {
> int code;
> + unsigned long ut_addr = untagged_addr(addr);
>
> mmap_read_lock(current->mm);
> - if (find_vma(current->mm, untagged_addr(addr)) == NULL)
> + if (find_vma_intersection(current->mm, ut_addr, ut_addr + 1) == NULL)
> code = SEGV_MAPERR;
> else
> code = SEGV_ACCERR;
I don't think your change is entirely correct either. We can have a
fault below the vma of a stack (with VM_GROWSDOWN) and
find_vma_intersection() would return NULL but it should be a SEGV_ACCERR
instead.
Maybe this should employ similar checks as __do_page_fault() (with
expand_stack() and VM_GROWSDOWN).
--
Catalin
* Catalin Marinas <[email protected]> [210412 13:44]:
> On Wed, Apr 07, 2021 at 03:11:06PM +0000, Liam Howlett wrote:
> > find_vma() will continue to search upwards until the end of the virtual
> > memory space. This means the si_code would almost never be set to
> > SEGV_MAPERR even when the address falls outside of any VMA. The result
> > is that the si_code is not reliable as it may or may not be set to the
> > correct result, depending on where the address falls in the address
> > space.
> >
> > Using find_vma_intersection() allows for what is intended by only
> > returning a VMA if it falls within the range provided, in this case a
> > window of 1.
> >
> > Signed-off-by: Liam R. Howlett <[email protected]>
> > ---
> > arch/arm64/kernel/traps.c | 3 ++-
> > 1 file changed, 2 insertions(+), 1 deletion(-)
> >
> > diff --git a/arch/arm64/kernel/traps.c b/arch/arm64/kernel/traps.c
> > index a05d34f0e82a..a44007904a64 100644
> > --- a/arch/arm64/kernel/traps.c
> > +++ b/arch/arm64/kernel/traps.c
> > @@ -383,9 +383,10 @@ void force_signal_inject(int signal, int code, unsigned long address, unsigned i
> > void arm64_notify_segfault(unsigned long addr)
> > {
> > int code;
> > + unsigned long ut_addr = untagged_addr(addr);
> >
> > mmap_read_lock(current->mm);
> > - if (find_vma(current->mm, untagged_addr(addr)) == NULL)
> > + if (find_vma_intersection(current->mm, ut_addr, ut_addr + 1) == NULL)
> > code = SEGV_MAPERR;
> > else
> > code = SEGV_ACCERR;
Thank you for taking the time to thoroughly review this patch.
>
> I don't think your change is entirely correct either. We can have a
> fault below the vma of a stack (with VM_GROWSDOWN) and
> find_vma_intersection() would return NULL but it should be a SEGV_ACCERR
> instead.
I'm pretty sure I am missing something. From what you said above, I
think this means that there can be a user cache fault below the stack
which should notify the user application that they are not allowed to
expand the stack by sending a SIGV_ACCERR in the si_code? Is this
expected behaviour or am I missing a code path to this function?
>
> Maybe this should employ similar checks as __do_page_fault() (with
> expand_stack() and VM_GROWSDOWN).
You mean the code needs to detect endianness and to check if this is an
attempt to expand the stack for both cases?
Thanks,
Liam
On Tue, Apr 13, 2021 at 04:52:34PM +0000, Liam Howlett wrote:
> * Catalin Marinas <[email protected]> [210412 13:44]:
> > On Wed, Apr 07, 2021 at 03:11:06PM +0000, Liam Howlett wrote:
> > > find_vma() will continue to search upwards until the end of the virtual
> > > memory space. This means the si_code would almost never be set to
> > > SEGV_MAPERR even when the address falls outside of any VMA. The result
> > > is that the si_code is not reliable as it may or may not be set to the
> > > correct result, depending on where the address falls in the address
> > > space.
> > >
> > > Using find_vma_intersection() allows for what is intended by only
> > > returning a VMA if it falls within the range provided, in this case a
> > > window of 1.
> > >
> > > Signed-off-by: Liam R. Howlett <[email protected]>
> > > ---
> > > arch/arm64/kernel/traps.c | 3 ++-
> > > 1 file changed, 2 insertions(+), 1 deletion(-)
> > >
> > > diff --git a/arch/arm64/kernel/traps.c b/arch/arm64/kernel/traps.c
> > > index a05d34f0e82a..a44007904a64 100644
> > > --- a/arch/arm64/kernel/traps.c
> > > +++ b/arch/arm64/kernel/traps.c
> > > @@ -383,9 +383,10 @@ void force_signal_inject(int signal, int code, unsigned long address, unsigned i
> > > void arm64_notify_segfault(unsigned long addr)
> > > {
> > > int code;
> > > + unsigned long ut_addr = untagged_addr(addr);
> > >
> > > mmap_read_lock(current->mm);
> > > - if (find_vma(current->mm, untagged_addr(addr)) == NULL)
> > > + if (find_vma_intersection(current->mm, ut_addr, ut_addr + 1) == NULL)
> > > code = SEGV_MAPERR;
> > > else
> > > code = SEGV_ACCERR;
[...]
> > I don't think your change is entirely correct either. We can have a
> > fault below the vma of a stack (with VM_GROWSDOWN) and
> > find_vma_intersection() would return NULL but it should be a SEGV_ACCERR
> > instead.
>
> I'm pretty sure I am missing something. From what you said above, I
> think this means that there can be a user cache fault below the stack
> which should notify the user application that they are not allowed to
> expand the stack by sending a SIGV_ACCERR in the si_code? Is this
> expected behaviour or am I missing a code path to this function?
My point was that find_vma() may return a valid vma where addr < vm_end
but also addr < vm_addr. It's the responsibility of the caller to check
that that vma can be expanded (VM_GROWSDOWN) and we do something like
this in __do_page_fault(). find_vma_intersection(), OTOH, requires addr
>= vm_start.
If we hit this case (addr < vm_start), normally we'd first need to check
whether it's expandable and, if not, return MAPERR. If it's expandable,
it should be ACCERR since something else caused the fault.
Now, I think at least for user_cache_maint_handler(), we can assume that
__do_page_fault() handled any expansion already, so we don't need to
check it here. In this case, your find_vma_intersection() check should
work.
Are there other cases where we invoke arm64_notify_segfault() without a
prior fault? I think in swp_handler() we can bail out early before we
even attempted the access so we may report MAPERR but ACCERR is a better
indication. Also in sys_rt_sigreturn() we always call it as
arm64_notify_segfault(regs->sp). I'm not sure that's correct in all
cases, see restore_altstack().
I guess this code needs some tidying up.
> > Maybe this should employ similar checks as __do_page_fault() (with
> > expand_stack() and VM_GROWSDOWN).
>
> You mean the code needs to detect endianness and to check if this is an
> attempt to expand the stack for both cases?
Nothing to do with endianness, just the relation between the address and
the vma->vm_start and whether the vma can be expanded down.
--
Catalin
* Catalin Marinas <[email protected]> [210413 14:00]:
> On Tue, Apr 13, 2021 at 04:52:34PM +0000, Liam Howlett wrote:
> > * Catalin Marinas <[email protected]> [210412 13:44]:
> > > On Wed, Apr 07, 2021 at 03:11:06PM +0000, Liam Howlett wrote:
> > > > find_vma() will continue to search upwards until the end of the virtual
> > > > memory space. This means the si_code would almost never be set to
> > > > SEGV_MAPERR even when the address falls outside of any VMA. The result
> > > > is that the si_code is not reliable as it may or may not be set to the
> > > > correct result, depending on where the address falls in the address
> > > > space.
> > > >
> > > > Using find_vma_intersection() allows for what is intended by only
> > > > returning a VMA if it falls within the range provided, in this case a
> > > > window of 1.
> > > >
> > > > Signed-off-by: Liam R. Howlett <[email protected]>
> > > > ---
> > > > arch/arm64/kernel/traps.c | 3 ++-
> > > > 1 file changed, 2 insertions(+), 1 deletion(-)
> > > >
> > > > diff --git a/arch/arm64/kernel/traps.c b/arch/arm64/kernel/traps.c
> > > > index a05d34f0e82a..a44007904a64 100644
> > > > --- a/arch/arm64/kernel/traps.c
> > > > +++ b/arch/arm64/kernel/traps.c
> > > > @@ -383,9 +383,10 @@ void force_signal_inject(int signal, int code, unsigned long address, unsigned i
> > > > void arm64_notify_segfault(unsigned long addr)
> > > > {
> > > > int code;
> > > > + unsigned long ut_addr = untagged_addr(addr);
> > > >
> > > > mmap_read_lock(current->mm);
> > > > - if (find_vma(current->mm, untagged_addr(addr)) == NULL)
> > > > + if (find_vma_intersection(current->mm, ut_addr, ut_addr + 1) == NULL)
> > > > code = SEGV_MAPERR;
> > > > else
> > > > code = SEGV_ACCERR;
> [...]
> > > I don't think your change is entirely correct either. We can have a
> > > fault below the vma of a stack (with VM_GROWSDOWN) and
> > > find_vma_intersection() would return NULL but it should be a SEGV_ACCERR
> > > instead.
> >
> > I'm pretty sure I am missing something. From what you said above, I
> > think this means that there can be a user cache fault below the stack
> > which should notify the user application that they are not allowed to
> > expand the stack by sending a SIGV_ACCERR in the si_code? Is this
> > expected behaviour or am I missing a code path to this function?
>
> My point was that find_vma() may return a valid vma where addr < vm_end
> but also addr < vm_addr. It's the responsibility of the caller to check
> that that vma can be expanded (VM_GROWSDOWN) and we do something like
> this in __do_page_fault(). find_vma_intersection(), OTOH, requires addr
> >= vm_start.
Right. The find_vma() interface is not clear by the function name;
returning a VMA that doesn't include the address of interest is unclear.
I think this is why we ended up with the bug in the first place.
>
> If we hit this case (addr < vm_start), normally we'd first need to check
> whether it's expandable and, if not, return MAPERR. If it's expandable,
> it should be ACCERR since something else caused the fault.
>
> Now, I think at least for user_cache_maint_handler(), we can assume that
> __do_page_fault() handled any expansion already, so we don't need to
> check it here. In this case, your find_vma_intersection() check should
> work.
>
> Are there other cases where we invoke arm64_notify_segfault() without a
> prior fault? I think in swp_handler() we can bail out early before we
> even attempted the access so we may report MAPERR but ACCERR is a better
> indication.
swp_handler() is also buggy. It is currently getting the ACCERR as long
as the address being checked is > mm->highest_vm_end. If access_ok()
fails, it should return ACCERR and not search VMAs for the address at
all.
...
>Also in sys_rt_sigreturn() we always call it as
> arm64_notify_segfault(regs->sp). I'm not sure that's correct in all
> cases, see restore_altstack().
Ditto for sys_rt_sigreturn() and sys_sigreturn(), they both suffer the
same bug as swp_handler() outlined above.
In the case of restore_sigframe() or restore_altstack() failing, it
seems that the signal shouldn't be dependent on where the address falls
within the VMA at all. Should the signal still be SIGSEGV or something
else? By the comments, I would have thought SIGBUS, si_code of
BUS_ADRALN?
>
> I guess this code needs some tidying up.
Indeed, there seems to be a few code paths that need to skip this
function all together and just set the code to ACCERR - especially since
the access_ok() just validates the number itself.
I don't think the right answer is to rewrite the function to check
VM_GROWSDOWN, as I cannot see a way to reach this function when trying
to expand the stack which should report back ACCERR. Do you agree?
I also see that my fix would expose other bugs which need to be
addressed at the same time.
>
> > > Maybe this should employ similar checks as __do_page_fault() (with
> > > expand_stack() and VM_GROWSDOWN).
> >
> > You mean the code needs to detect endianness and to check if this is an
> > attempt to expand the stack for both cases?
>
> Nothing to do with endianness, just the relation between the address and
> the vma->vm_start and whether the vma can be expanded down.
Okay, thanks for clarifying.
Cheers,
Liam