In NUMMU kernel the value of linux_binprm::p is the offset inside the
temporary program arguments array maintained in separate pages in the
linux_binprm::page. linux_binprm::exec being a copy of linux_binprm::p
thus must be adjusted when that array is copied to the user stack.
Without that adjustment the value passed by the NOMMU kernel to the ELF
program in the AT_EXECFN entry of the aux array doesn't make any sense
and it may break programs that try to access memory pointed to by that
entry.
Adjust linux_binprm::exec before the successful return from the
transfer_args_to_stack().
Cc: [email protected]
Signed-off-by: Max Filippov <[email protected]>
---
fs/exec.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/fs/exec.c b/fs/exec.c
index af4fbb61cd53..5ee2545c3e18 100644
--- a/fs/exec.c
+++ b/fs/exec.c
@@ -895,6 +895,7 @@ int transfer_args_to_stack(struct linux_binprm *bprm,
goto out;
}
+ bprm->exec += *sp_location - MAX_ARG_PAGES * PAGE_SIZE;
*sp_location = sp;
out:
--
2.39.2
On Wed, Mar 20, 2024 at 12:29 PM Matthew Wilcox <[email protected]> wrote:
>
> On Wed, Mar 20, 2024 at 11:26:07AM -0700, Max Filippov wrote:
> > In NUMMU kernel the value of linux_binprm::p is the offset inside the
> > temporary program arguments array maintained in separate pages in the
> > linux_binprm::page. linux_binprm::exec being a copy of linux_binprm::p
> > thus must be adjusted when that array is copied to the user stack.
> > Without that adjustment the value passed by the NOMMU kernel to the ELF
> > program in the AT_EXECFN entry of the aux array doesn't make any sense
> > and it may break programs that try to access memory pointed to by that
> > entry.
> >
> > Adjust linux_binprm::exec before the successful return from the
> > transfer_args_to_stack().
>
> Do you know which commit broke this, ie how far back should this be
> backported? Or has it always been broken?
From reading the code I see that linux_binprm::p started being an offset
in the commit b6a2fea39318 ("mm: variable length argument support")
which is v2.6.22-3328-gb6a2fea39318 and filling in the AT_EXECFN
aux entry was added in the commit
5edc2a5123a7 ("binfmt_elf_fdpic: wire up AT_EXECFD, AT_EXECFN, AT_SECURE")
which is v2.6.27-4641-g5edc2a5123a7. I don't see any translation
of the linux_binprm::exec at that time so to me it looks like it's always
been broken.
--
Thanks.
-- Max
On Wed, Mar 20, 2024 at 11:26:07AM -0700, Max Filippov wrote:
> In NUMMU kernel the value of linux_binprm::p is the offset inside the
> temporary program arguments array maintained in separate pages in the
> linux_binprm::page. linux_binprm::exec being a copy of linux_binprm::p
> thus must be adjusted when that array is copied to the user stack.
> Without that adjustment the value passed by the NOMMU kernel to the ELF
> program in the AT_EXECFN entry of the aux array doesn't make any sense
> and it may break programs that try to access memory pointed to by that
> entry.
>
> Adjust linux_binprm::exec before the successful return from the
> transfer_args_to_stack().
Do you know which commit broke this, ie how far back should this be
backported? Or has it always been broken?
> Cc: [email protected]
> Signed-off-by: Max Filippov <[email protected]>
> ---
> fs/exec.c | 1 +
> 1 file changed, 1 insertion(+)
>
> diff --git a/fs/exec.c b/fs/exec.c
> index af4fbb61cd53..5ee2545c3e18 100644
> --- a/fs/exec.c
> +++ b/fs/exec.c
> @@ -895,6 +895,7 @@ int transfer_args_to_stack(struct linux_binprm *bprm,
> goto out;
> }
>
> + bprm->exec += *sp_location - MAX_ARG_PAGES * PAGE_SIZE;
> *sp_location = sp;
>
> out:
> --
> 2.39.2
>
>
On Wed, Mar 20, 2024 at 11:26:07AM -0700, Max Filippov wrote:
> In NUMMU kernel the value of linux_binprm::p is the offset inside the
> temporary program arguments array maintained in separate pages in the
> linux_binprm::page. linux_binprm::exec being a copy of linux_binprm::p
> thus must be adjusted when that array is copied to the user stack.
> Without that adjustment the value passed by the NOMMU kernel to the ELF
> program in the AT_EXECFN entry of the aux array doesn't make any sense
> and it may break programs that try to access memory pointed to by that
> entry.
>
> Adjust linux_binprm::exec before the successful return from the
> transfer_args_to_stack().
What's the best way to test this? (Is there a qemu setup I can use to
see the before/after of AT_EXECFN?)
How did you encounter the problem?
>
> Cc: [email protected]
> Signed-off-by: Max Filippov <[email protected]>
> ---
> fs/exec.c | 1 +
> 1 file changed, 1 insertion(+)
>
> diff --git a/fs/exec.c b/fs/exec.c
> index af4fbb61cd53..5ee2545c3e18 100644
> --- a/fs/exec.c
> +++ b/fs/exec.c
> @@ -895,6 +895,7 @@ int transfer_args_to_stack(struct linux_binprm *bprm,
> goto out;
> }
>
> + bprm->exec += *sp_location - MAX_ARG_PAGES * PAGE_SIZE;
> *sp_location = sp;
>
> out:
> --
> 2.39.2
>
--
Kees Cook
On Thu, Mar 21, 2024 at 10:05 AM Kees Cook <[email protected]> wrote:
>
> On Wed, Mar 20, 2024 at 11:26:07AM -0700, Max Filippov wrote:
> > In NUMMU kernel the value of linux_binprm::p is the offset inside the
> > temporary program arguments array maintained in separate pages in the
> > linux_binprm::page. linux_binprm::exec being a copy of linux_binprm::p
> > thus must be adjusted when that array is copied to the user stack.
> > Without that adjustment the value passed by the NOMMU kernel to the ELF
> > program in the AT_EXECFN entry of the aux array doesn't make any sense
> > and it may break programs that try to access memory pointed to by that
> > entry.
> >
> > Adjust linux_binprm::exec before the successful return from the
> > transfer_args_to_stack().
>
> What's the best way to test this? (Is there a qemu setup I can use to
> see the before/after of AT_EXECFN?)
I put a readme with the steps to build such system here:
http://jcmvbkbc.org/~dumb/tmp/202403211236/README
it uses a prebuilt rootfs image and a 6.8 kernel branch with two
patches on top of it: one adds a dts and a defconfig and the other
is this fix. The rootfs boots successfully with this fix, but panics
if this fix is removed.
The easiest way to actually see the AT_EXECFN is, I guess, to
do something like that:
---8<---
diff --git a/fs/binfmt_elf_fdpic.c b/fs/binfmt_elf_fdpic.c
index fefc642541cb..22d34272a570 100644
--- a/fs/binfmt_elf_fdpic.c
+++ b/fs/binfmt_elf_fdpic.c
@@ -659,6 +659,7 @@ static int create_elf_fdpic_tables(struct
linux_binprm *bprm,
NEW_AUX_ENT(AT_EGID, (elf_addr_t)
from_kgid_munged(cred->user_ns, cred->egid));
NEW_AUX_ENT(AT_SECURE, bprm->secureexec);
NEW_AUX_ENT(AT_EXECFN, bprm->exec);
+ pr_info("%s: AT_EXECFN = %#lx\n", __func__, bprm->exec);
#ifdef ARCH_DLINFO
nr = 0;
---8<---
> How did you encounter the problem?
I'm doing xtensa FDPIC port of musl libc and this issue popped up when
I began testing it on qemu-system-xtensa with the real linux kernel.
Related post to the musl ML:
https://www.openwall.com/lists/musl/2024/03/20/2
--
Thanks.
-- Max
On Wed, 20 Mar 2024 11:26:07 -0700, Max Filippov wrote:
> In NUMMU kernel the value of linux_binprm::p is the offset inside the
> temporary program arguments array maintained in separate pages in the
> linux_binprm::page. linux_binprm::exec being a copy of linux_binprm::p
> thus must be adjusted when that array is copied to the user stack.
> Without that adjustment the value passed by the NOMMU kernel to the ELF
> program in the AT_EXECFN entry of the aux array doesn't make any sense
> and it may break programs that try to access memory pointed to by that
> entry.
>
> [...]
Applied to for-next/execve, thanks!
[1/1] exec: fix linux_binprm::exec in transfer_args_to_stack()
https://git.kernel.org/kees/c/2aea94ac14d1
Take care,
--
Kees Cook
On Thu, Mar 21, 2024 at 12:52:16PM -0700, Max Filippov wrote:
> On Thu, Mar 21, 2024 at 10:05 AM Kees Cook <[email protected]> wrote:
> >
> > On Wed, Mar 20, 2024 at 11:26:07AM -0700, Max Filippov wrote:
> > > In NUMMU kernel the value of linux_binprm::p is the offset inside the
> > > temporary program arguments array maintained in separate pages in the
> > > linux_binprm::page. linux_binprm::exec being a copy of linux_binprm::p
> > > thus must be adjusted when that array is copied to the user stack.
> > > Without that adjustment the value passed by the NOMMU kernel to the ELF
> > > program in the AT_EXECFN entry of the aux array doesn't make any sense
> > > and it may break programs that try to access memory pointed to by that
> > > entry.
> > >
> > > Adjust linux_binprm::exec before the successful return from the
> > > transfer_args_to_stack().
> >
> > What's the best way to test this? (Is there a qemu setup I can use to
> > see the before/after of AT_EXECFN?)
>
> I put a readme with the steps to build such system here:
> http://jcmvbkbc.org/~dumb/tmp/202403211236/README
> it uses a prebuilt rootfs image and a 6.8 kernel branch with two
> patches on top of it: one adds a dts and a defconfig and the other
> is this fix. The rootfs boots successfully with this fix, but panics
> if this fix is removed.
Ah, perfect! Thanks for this.
> The easiest way to actually see the AT_EXECFN is, I guess, to
> do something like that:
> ---8<---
> diff --git a/fs/binfmt_elf_fdpic.c b/fs/binfmt_elf_fdpic.c
> index fefc642541cb..22d34272a570 100644
> --- a/fs/binfmt_elf_fdpic.c
> +++ b/fs/binfmt_elf_fdpic.c
> @@ -659,6 +659,7 @@ static int create_elf_fdpic_tables(struct
> linux_binprm *bprm,
> NEW_AUX_ENT(AT_EGID, (elf_addr_t)
> from_kgid_munged(cred->user_ns, cred->egid));
> NEW_AUX_ENT(AT_SECURE, bprm->secureexec);
> NEW_AUX_ENT(AT_EXECFN, bprm->exec);
> + pr_info("%s: AT_EXECFN = %#lx\n", __func__, bprm->exec);
>
> #ifdef ARCH_DLINFO
> nr = 0;
> ---8<---
Does musl have something like the LD_SHOW_AUXV env variable. With glibc,
I usually explore auxv like so:
$ LD_SHOW_AUXV=1 uname -a | grep EXECFN
AT_EXECFN: /usr/bin/uname
> > How did you encounter the problem?
>
> I'm doing xtensa FDPIC port of musl libc and this issue popped up when
> I began testing it on qemu-system-xtensa with the real linux kernel.
> Related post to the musl ML:
> https://www.openwall.com/lists/musl/2024/03/20/2
Thanks!
--
Kees Cook
On Thu, Mar 21, 2024 at 8:48 PM Kees Cook <[email protected]> wrote:
> On Thu, Mar 21, 2024 at 12:52:16PM -0700, Max Filippov wrote:
> > On Thu, Mar 21, 2024 at 10:05 AM Kees Cook <[email protected]> wrote:
> > > What's the best way to test this? (Is there a qemu setup I can use to
> > > see the before/after of AT_EXECFN?)
> >
> > I put a readme with the steps to build such system here:
> > http://jcmvbkbc.org/~dumb/tmp/202403211236/README
> > it uses a prebuilt rootfs image and a 6.8 kernel branch with two
> > patches on top of it: one adds a dts and a defconfig and the other
> > is this fix. The rootfs boots successfully with this fix, but panics
> > if this fix is removed.
>
> Does musl have something like the LD_SHOW_AUXV env variable. With glibc,
> I usually explore auxv like so:
>
> $ LD_SHOW_AUXV=1 uname -a | grep EXECFN
> AT_EXECFN: /usr/bin/uname
I couldn't find anything like that in either musl or uClibc-ng.
So I updated the above rootfs and put the following program
into it as /bin/test-auxv:
#include <stdio.h>
#include <sys/auxv.h>
int main()
{
unsigned long p = getauxval(AT_EXECFN);
fprintf(stderr, "AT_EXECFN: 0x%lx\n", p);
if (p)
fprintf(stderr, "%s\n", (const char *)p);
return 0;
}
While looking at it I also noticed that /proc/<pid>/auxv is empty
on NOMMU, looks like there will be yet another fix for that.
--
Thanks.
-- Max