On Mon, 30 Jan 2012 18:09:09 +0400
Cyrill Gorcunov <[email protected]> wrote:
> After restore we would like the 'ps' command show the command
> line and evironment exactly the same it was at checkpoint time.
>
> So this additional PR_SET_MM_ allow us to do so. Note that
> these members of mm_struct is rather used for output in
> procfs, except auxv vector which is used by ld.so mostly.
This changelog is pretty darned hard to understand. Can we have a
version 2 please?
>
> ...
>
> @@ -1753,19 +1755,6 @@ static int prctl_set_mm(int opt, unsigne
> mm->end_data = addr;
> break;
>
> - case PR_SET_MM_START_STACK:
> -
> -#ifdef CONFIG_STACK_GROWSUP
> - vm_req_flags = VM_READ | VM_WRITE | VM_GROWSUP;
> -#else
> - vm_req_flags = VM_READ | VM_WRITE | VM_GROWSDOWN;
> -#endif
> - if ((vma->vm_flags & vm_req_flags) != vm_req_flags)
> - goto out;
> -
> - mm->start_stack = addr;
> - break;
> -
> case PR_SET_MM_START_BRK:
> if (addr <= mm->end_data)
> goto out;
> @@ -1790,16 +1779,53 @@ static int prctl_set_mm(int opt, unsigne
> mm->brk = addr;
> break;
Here would be a good place to add some nice comments explaining what
these do. Although I guess that isn't needed if one can get that info
by typing "man prctl".
> + case PR_SET_MM_START_STACK:
> + case PR_SET_MM_ARG_START:
> + case PR_SET_MM_ARG_END:
> + case PR_SET_MM_ENV_START:
> + case PR_SET_MM_ENV_END:
> +#ifdef CONFIG_STACK_GROWSUP
> + if (vma_flags_mismatch(vma, VM_READ | VM_WRITE | VM_GROWSUP, 0))
> +#else
> + if (vma_flags_mismatch(vma, VM_READ | VM_WRITE | VM_GROWSDOWN, 0))
> +#endif
> + goto out;
> + if (opt == PR_SET_MM_START_STACK)
> + mm->start_stack = addr;
> + else if (opt == PR_SET_MM_ARG_START)
> + mm->arg_start = addr;
> + else if (opt == PR_SET_MM_ARG_END)
> + mm->arg_end = addr;
> + else if (opt == PR_SET_MM_ENV_START)
> + mm->env_start = addr;
> + else if (opt == PR_SET_MM_ENV_END)
> + mm->env_end = addr;
> + break;
> +
> + case PR_SET_MM_AUXV: {
> + unsigned long user_auxv[AT_VECTOR_SIZE];
> +
> + if (arg4 > sizeof(mm->saved_auxv))
> + goto out;
> + up_read(&mm->mmap_sem);
> +
> + if (copy_from_user(user_auxv, (const void __user *)addr, arg4))
> + return EFAULT;
> +
> + task_lock(current);
> + memcpy(mm->saved_auxv, user_auxv, arg4);
> + task_unlock(current);
> +
> + return 0;
> + }
I worry a bit about this. We're giving userspace the ability to modify
various mm_struct fields. Userspace can already do this via
exec(elf-file), but perhaps this opens up a way in which userspace can
newly trigger kernel bugs.
On Thu, Feb 02, 2012 at 03:27:05PM -0800, Andrew Morton wrote:
> On Mon, 30 Jan 2012 18:09:09 +0400
> Cyrill Gorcunov <[email protected]> wrote:
>
> > After restore we would like the 'ps' command show the command
> > line and evironment exactly the same it was at checkpoint time.
> >
> > So this additional PR_SET_MM_ allow us to do so. Note that
> > these members of mm_struct is rather used for output in
> > procfs, except auxv vector which is used by ld.so mostly.
>
> This changelog is pretty darned hard to understand. Can we have a
> version 2 please?
>
yeah, will update.
...
> > @@ -1790,16 +1779,53 @@ static int prctl_set_mm(int opt, unsigne
> > mm->brk = addr;
> > break;
>
> Here would be a good place to add some nice comments explaining what
> these do. Although I guess that isn't needed if one can get that info
> by typing "man prctl".
>
I started cooking prctl man pages but found hardness to explain some
regular user who has no ideas about kernel internals why do we modify
mm_struct data, still I'm trying.
And I'll add comment here (since having it here in-place allows reader
to not read man page ;)
...
>
> I worry a bit about this. We're giving userspace the ability to modify
> various mm_struct fields. Userspace can already do this via
> exec(elf-file), but perhaps this opens up a way in which userspace can
> newly trigger kernel bugs.
>
At moment there is no more way to modify these fields other than elf
handler, but in future... hard to predict what else there will be
done and where also these fields appear in kernel code. but as i said
at moment this modification is pretty safe and even if one write some
buggy values -- he simply get weird output in /proc/ statistics and
such.
Cyrill