Unless I'm missing some trick, it's currently rather painful to mount
a namespace /proc. You have to actually be in the pid namespace to
mount the correct /proc instance, and you can't unmount the old /proc
until you've mounted the new /proc. This means that you have to fork
into the new pid namespace before you can finish setting it up.
Would it make sense to add a mount option to procfs to request a mount
for pid_ns_for_children instead of task_active_pid_ns?
--Andy
--
Andy Lutomirski
AMA Capital Management, LLC
Andy Lutomirski <[email protected]> writes:
> Unless I'm missing some trick, it's currently rather painful to mount
> a namespace /proc. You have to actually be in the pid namespace to
> mount the correct /proc instance, and you can't unmount the old /proc
> until you've mounted the new /proc. This means that you have to fork
> into the new pid namespace before you can finish setting it up.
Yes. You have to be inside just about all namespaces before you can
finish setting them up.
I don't know the context in which needing to be inside the pid namespace
is a burden.
> Would it make sense to add a mount option to procfs to request a mount
> for pid_ns_for_children instead of task_active_pid_ns?
This is about using setns and unshare?
Adding a proc mount option that takes a pid namespace file descriptor
would be the general solution, and might be worth implementing.
Getting a pid namespace file descriptor when there are no pids might be
a challenge.
Eric
On Fri, Apr 25, 2014 at 12:37 PM, Eric W. Biederman
<[email protected]> wrote:
> Andy Lutomirski <[email protected]> writes:
>
>> Unless I'm missing some trick, it's currently rather painful to mount
>> a namespace /proc. You have to actually be in the pid namespace to
>> mount the correct /proc instance, and you can't unmount the old /proc
>> until you've mounted the new /proc. This means that you have to fork
>> into the new pid namespace before you can finish setting it up.
>
> Yes. You have to be inside just about all namespaces before you can
> finish setting them up.
>
> I don't know the context in which needed to be inside the pid namespace
> is a burden.
I'm trying to sandbox myself. I unshare everything, set up new
mounts, pivot_root, umount the old stuff, fork, and wait around for
the child to finish.
This doesn't work: the parent can't mount the new /proc, and the child
can't either because it's too late.
The only solution I can think of without kernel changes is to fork the
child (pid 1) before pivot_root, which makes everything more
complicated. I suppose I can unshare, fork immediately, have the
child set up all the mounts, and then wake the parent, but this is an
annoying bit of extra complexity for no obvious gain.
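For readers following along, the workaround described above (unshare, fork immediately, let the child set up the mounts, then wake the parent) might look roughly like the sketch below. It is only an illustration in Python via ctypes, not code from this thread. The helper names (check, fork_and_wait_ready, mount_proc, sandbox) are made up here, and sandbox() assumes the caller has CAP_SYS_ADMIN.

```python
import ctypes
import os

CLONE_NEWNS = 0x00020000    # new mount namespace
CLONE_NEWPID = 0x20000000   # future children land in a new pid namespace
MS_REC = 0x4000
MS_PRIVATE = 1 << 18

libc = ctypes.CDLL(None, use_errno=True)
libc.mount.argtypes = [ctypes.c_char_p, ctypes.c_char_p, ctypes.c_char_p,
                       ctypes.c_ulong, ctypes.c_void_p]


def check(ret, what):
    """Raise OSError if a libc call returned nonzero."""
    if ret != 0:
        err = ctypes.get_errno()
        raise OSError(err, f"{what}: {os.strerror(err)}")


def fork_and_wait_ready(setup, child_main):
    """Fork, run setup() in the child, and block the parent until it is done.

    Returns (ready, exit_code): ready is False if setup() failed in the child.
    """
    ready_r, ready_w = os.pipe()
    pid = os.fork()
    if pid == 0:
        status = 1
        try:
            os.close(ready_r)
            setup()                     # e.g. mount the new /proc
            os.write(ready_w, b"1")     # wake the parent
            os.close(ready_w)
            status = int(child_main() or 0)
        finally:
            os._exit(status)
    os.close(ready_w)
    ready = os.read(ready_r, 1) == b"1"  # EOF here means setup() failed
    os.close(ready_r)
    _, wstatus = os.waitpid(pid, 0)
    code = os.WEXITSTATUS(wstatus) if os.WIFEXITED(wstatus) else -1
    return ready, code


def mount_proc():
    """Runs as pid 1 of the new pid namespace, so this /proc describes it."""
    check(libc.mount(b"proc", b"/proc", b"proc", 0, None), "mount /proc")


def sandbox(child_main):
    """unshare, fork at once, and let the child finish the mount setup."""
    check(libc.unshare(CLONE_NEWNS | CLONE_NEWPID), "unshare")
    # Keep the new mounts from propagating back to the original namespace.
    check(libc.mount(b"none", b"/", None, MS_REC | MS_PRIVATE, None),
          "make / private")
    return fork_and_wait_ready(mount_proc, child_main)
```

The pipe is the "wake the parent" step: the parent cannot touch anything that depends on the new /proc until the read returns, which is the extra synchronization being complained about.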
>
>> Would it make sense to add a mount option to procfs to request a mount
>> for pid_ns_for_children instead of task_active_pid_ns?
>
> This is about the using setns and unshare?
>
> Adding a proc amount option that takes a pid namespace file descriptor
> would be the general solution, and might be worth implementing.
>
> Getting a pid namespace file descriptors when there are no pids might be
> a challenge.
Indeed, hence my request for a specific mode to mount /proc for
pid_ns_for_children.
FWIW, I also tried forking, having the child mount /proc and exit,
then forking again later on. That also doesn't work -- it looks like
you can't recreate pid 1 after it exits.
--Andy
Andy Lutomirski <[email protected]> writes:
> On Fri, Apr 25, 2014 at 12:37 PM, Eric W. Biederman
> <[email protected]> wrote:
>> Andy Lutomirski <[email protected]> writes:
>>
>>> Unless I'm missing some trick, it's currently rather painful to mount
>>> a namespace /proc. You have to actually be in the pid namespace to
>>> mount the correct /proc instance, and you can't unmount the old /proc
>>> until you've mounted the new /proc. This means that you have to fork
>>> into the new pid namespace before you can finish setting it up.
>>
>> Yes. You have to be inside just about all namespaces before you can
>> finish setting them up.
>>
>> I don't know the context in which needed to be inside the pid namespace
>> is a burden.
>
> I'm trying to sandbox myself. I unshare everything, setup up new
> mounts, pivot_root, umount the old stuff, fork, and wait around for
> the child to finish.
>
> This doesn't work: the parent can't mount the new /proc, and the child
> can't either because it's too late.
>
> The only solution I can think of without kernel changes is to fork the
> child (pid 1) before pivot_root, which makes everything more
> complicated. I suppose I can unshare, fork immediately, have the
> child set up all the mounts, and then wake the parent, but this is an
> annoying bit of extra complexity for no obvious gain.
Or perhaps just use clone and clone flags.
What are you doing with the parent process? What value does it serve?
>>> Would it make sense to add a mount option to procfs to request a mount
>>> for pid_ns_for_children instead of task_active_pid_ns?
>>
>> This is about the using setns and unshare?
>>
>> Adding a proc amount option that takes a pid namespace file descriptor
>> would be the general solution, and might be worth implementing.
>>
>> Getting a pid namespace file descriptors when there are no pids might be
>> a challenge.
>
> Indeed, hence my request for a specific mode to mount /proc for
> pid_ns_for_children.
>
> FWIW, I also tried forking, having the child mount /proc and exit,
> then forking again later on. That also doesn't work -- it looks like
> you can't recreate pid 1 after it does.
Nope. Once pid 1 (init) is dead the pid namespace is dead.
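A small probe can make this visible. The sketch below is an illustration only, assuming CAP_SYS_ADMIN. The names probe_second_fork, _helper and UNSHARE_FAILED are invented here. It does the experiment in a throwaway helper process so that the caller's own pid_ns_for_children is left alone. pid_namespaces(7) documents that fork fails with ENOMEM once the namespace's init has terminated, so that is the value the probe should report on a kernel that matches the man page.

```python
import ctypes
import errno
import os

CLONE_NEWPID = 0x20000000   # children of the caller land in a new pid namespace
UNSHARE_FAILED = 200        # made-up exit code: unshare() was refused


def _helper():
    """Runs in a throwaway process; reports its result via the exit code."""
    libc = ctypes.CDLL(None, use_errno=True)
    if libc.unshare(CLONE_NEWPID) != 0:
        os._exit(UNSHARE_FAILED)
    first = os.fork()
    if first == 0:
        os._exit(0)                  # pid 1 of the new namespace exits at once
    os.waitpid(first, 0)
    try:
        second = os.fork()           # try to create a new process in that namespace
    except OSError as exc:
        os._exit(exc.errno or 1)
    if second == 0:
        os._exit(0)
    os.waitpid(second, 0)
    os._exit(0)


def probe_second_fork():
    """Return the errno of the second fork, 0 if it worked, None if untestable."""
    pid = os.fork()
    if pid == 0:
        try:
            _helper()
        finally:
            os._exit(201)            # only reached if _helper() raised
    _, wstatus = os.waitpid(pid, 0)
    code = os.WEXITSTATUS(wstatus) if os.WIFEXITED(wstatus) else 201
    return None if code >= UNSHARE_FAILED else code
```

Without the needed privilege the probe returns None rather than guessing. Comparing the result against errno.ENOMEM is the interesting check.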
Eric
On Fri, Apr 25, 2014 at 1:25 PM, Eric W. Biederman
<[email protected]> wrote:
> Andy Lutomirski <[email protected]> writes:
>
>> On Fri, Apr 25, 2014 at 12:37 PM, Eric W. Biederman
>> <[email protected]> wrote:
>>> Andy Lutomirski <[email protected]> writes:
>>>
>>>> Unless I'm missing some trick, it's currently rather painful to mount
>>>> a namespace /proc. You have to actually be in the pid namespace to
>>>> mount the correct /proc instance, and you can't unmount the old /proc
>>>> until you've mounted the new /proc. This means that you have to fork
>>>> into the new pid namespace before you can finish setting it up.
>>>
>>> Yes. You have to be inside just about all namespaces before you can
>>> finish setting them up.
>>>
>>> I don't know the context in which needed to be inside the pid namespace
>>> is a burden.
>>
>> I'm trying to sandbox myself. I unshare everything, setup up new
>> mounts, pivot_root, umount the old stuff, fork, and wait around for
>> the child to finish.
>>
>> This doesn't work: the parent can't mount the new /proc, and the child
>> can't either because it's too late.
>>
>> The only solution I can think of without kernel changes is to fork the
>> child (pid 1) before pivot_root, which makes everything more
>> complicated. I suppose I can unshare, fork immediately, have the
>> child set up all the mounts, and then wake the parent, but this is an
>> annoying bit of extra complexity for no obvious gain.
>
> Or perhaps just use clone and clone flags.
>
> What are you doing with the parent process? What value does it serve?
I'm not entirely sure. I'm hacking on this thing:
https://github.com/amluto/sandstorm/tree/userns
which isn't really my code. But there's an inner sandbox and an outer
sandbox, and only the inner sandbox is in a pid namespace.
I suppose what I'm doing is a bit strange.
--Andy
On Fri, Apr 25, 2014 at 1:46 PM, Andy Lutomirski <[email protected]> wrote:
> On Fri, Apr 25, 2014 at 1:25 PM, Eric W. Biederman
> <[email protected]> wrote:
>> Andy Lutomirski <[email protected]> writes:
>>
>>> On Fri, Apr 25, 2014 at 12:37 PM, Eric W. Biederman
>>> <[email protected]> wrote:
>>>> Andy Lutomirski <[email protected]> writes:
>>>>
>>>>> Unless I'm missing some trick, it's currently rather painful to mount
>>>>> a namespace /proc. You have to actually be in the pid namespace to
>>>>> mount the correct /proc instance, and you can't unmount the old /proc
>>>>> until you've mounted the new /proc. This means that you have to fork
>>>>> into the new pid namespace before you can finish setting it up.
>>>>
>>>> Yes. You have to be inside just about all namespaces before you can
>>>> finish setting them up.
>>>>
>>>> I don't know the context in which needed to be inside the pid namespace
>>>> is a burden.
>>>
>>> I'm trying to sandbox myself. I unshare everything, setup up new
>>> mounts, pivot_root, umount the old stuff, fork, and wait around for
>>> the child to finish.
>>>
>>> This doesn't work: the parent can't mount the new /proc, and the child
>>> can't either because it's too late.
>>>
>>> The only solution I can think of without kernel changes is to fork the
>>> child (pid 1) before pivot_root, which makes everything more
>>> complicated. I suppose I can unshare, fork immediately, have the
>>> child set up all the mounts, and then wake the parent, but this is an
>>> annoying bit of extra complexity for no obvious gain.
>>
>> Or perhaps just use clone and clone flags.
>>
>> What are you doing with the parent process? What value does it serve?
>
> I'm not entirely sure. I'm hacking on this thing:
>
> https://github.com/amluto/sandstorm/tree/userns
>
> which isn't really my code. But there's an inner sandbox and an outer
> sandbox, and only the inner sandbox is in a pid namespace.
That was a semi-useless link. This is better:
https://github.com/amluto/sandstorm/blob/userns/src/sandstorm/supervisor-main.c%2B%2B
--Andy
Andy Lutomirski <[email protected]> writes:
> On Fri, Apr 25, 2014 at 1:25 PM, Eric W. Biederman
> <[email protected]> wrote:
>> Andy Lutomirski <[email protected]> writes:
>>
>>> On Fri, Apr 25, 2014 at 12:37 PM, Eric W. Biederman
>>> <[email protected]> wrote:
>>>> Andy Lutomirski <[email protected]> writes:
>>>>
>>>>> Unless I'm missing some trick, it's currently rather painful to mount
>>>>> a namespace /proc. You have to actually be in the pid namespace to
>>>>> mount the correct /proc instance, and you can't unmount the old /proc
>>>>> until you've mounted the new /proc. This means that you have to fork
>>>>> into the new pid namespace before you can finish setting it up.
>>>>
>>>> Yes. You have to be inside just about all namespaces before you can
>>>> finish setting them up.
>>>>
>>>> I don't know the context in which needed to be inside the pid namespace
>>>> is a burden.
>>>
>>> I'm trying to sandbox myself. I unshare everything, setup up new
>>> mounts, pivot_root, umount the old stuff, fork, and wait around for
>>> the child to finish.
>>>
>>> This doesn't work: the parent can't mount the new /proc, and the child
>>> can't either because it's too late.
>>>
>>> The only solution I can think of without kernel changes is to fork the
>>> child (pid 1) before pivot_root, which makes everything more
>>> complicated. I suppose I can unshare, fork immediately, have the
>>> child set up all the mounts, and then wake the parent, but this is an
>>> annoying bit of extra complexity for no obvious gain.
>>
>> Or perhaps just use clone and clone flags.
>>
>> What are you doing with the parent process? What value does it serve?
>
> I'm not entirely sure. I'm hacking on this thing:
>
> https://github.com/amluto/sandstorm/tree/userns
>
> which isn't really my code. But there's an inner sandbox and an outer
> sandbox, and only the inner sandbox is in a pid namespace.
>
> I suppose what what I'm doing is a bit strange.
A bit. But doing strange things is good.
Right now most of my energy is focused on closing the last of the design
issues. So I don't have much energy for new namespace-related features
right now.
Eric
On Fri, Apr 25, 2014 at 2:17 PM, Eric W. Biederman
<[email protected]> wrote:
> Andy Lutomirski <[email protected]> writes:
>
>> On Fri, Apr 25, 2014 at 1:25 PM, Eric W. Biederman
>> <[email protected]> wrote:
>>> Andy Lutomirski <[email protected]> writes:
>>>
>>>> On Fri, Apr 25, 2014 at 12:37 PM, Eric W. Biederman
>>>> <[email protected]> wrote:
>>>>> Andy Lutomirski <[email protected]> writes:
>>>>>
>>>>>> Unless I'm missing some trick, it's currently rather painful to mount
>>>>>> a namespace /proc. You have to actually be in the pid namespace to
>>>>>> mount the correct /proc instance, and you can't unmount the old /proc
>>>>>> until you've mounted the new /proc. This means that you have to fork
>>>>>> into the new pid namespace before you can finish setting it up.
>>>>>
>>>>> Yes. You have to be inside just about all namespaces before you can
>>>>> finish setting them up.
>>>>>
>>>>> I don't know the context in which needed to be inside the pid namespace
>>>>> is a burden.
>>>>
>>>> I'm trying to sandbox myself. I unshare everything, setup up new
>>>> mounts, pivot_root, umount the old stuff, fork, and wait around for
>>>> the child to finish.
>>>>
>>>> This doesn't work: the parent can't mount the new /proc, and the child
>>>> can't either because it's too late.
>>>>
>>>> The only solution I can think of without kernel changes is to fork the
>>>> child (pid 1) before pivot_root, which makes everything more
>>>> complicated. I suppose I can unshare, fork immediately, have the
>>>> child set up all the mounts, and then wake the parent, but this is an
>>>> annoying bit of extra complexity for no obvious gain.
>>>
>>> Or perhaps just use clone and clone flags.
>>>
>>> What are you doing with the parent process? What value does it serve?
>>
>> I'm not entirely sure. I'm hacking on this thing:
>>
>> https://github.com/amluto/sandstorm/tree/userns
>>
>> which isn't really my code. But there's an inner sandbox and an outer
>> sandbox, and only the inner sandbox is in a pid namespace.
>>
>> I suppose what what I'm doing is a bit strange.
>
> A bit. But doing strange things is good.
>
> Right now most of my energy is focused on closely the last of the design
> issues. So I don't have much energy for new namespace related features
> right now.
No problem. I may write up the patch, although I'll have to support
current kernels anyway.
--Andy
Quoting Andy Lutomirski ([email protected]):
> On Fri, Apr 25, 2014 at 12:37 PM, Eric W. Biederman
> <[email protected]> wrote:
> > Andy Lutomirski <[email protected]> writes:
> >
> >> Unless I'm missing some trick, it's currently rather painful to mount
> >> a namespace /proc. You have to actually be in the pid namespace to
> >> mount the correct /proc instance, and you can't unmount the old /proc
> >> until you've mounted the new /proc. This means that you have to fork
> >> into the new pid namespace before you can finish setting it up.
> >
> > Yes. You have to be inside just about all namespaces before you can
> > finish setting them up.
> >
> > I don't know the context in which needed to be inside the pid namespace
> > is a burden.
>
> I'm trying to sandbox myself. I unshare everything, setup up new
> mounts, pivot_root, umount the old stuff, fork, and wait around for
> the child to finish.
>
> This doesn't work: the parent can't mount the new /proc, and the child
> can't either because it's too late.
I'm probably not thinking it through enough... But can't the parent, before
forking, do

    mkdir -p /childproc/proc
    mount --bind /childproc /childproc
    mount --make-rshared /childproc

then have the child mount its proc under /childproc/proc, so that it shows
up in the parent's tree?
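The same steps can be expressed as mount(2) calls. The sketch below is an illustration under assumptions, not anyone's actual code: the function names (parent_plan, child_plan, apply_plan, prepare_parent) are made up, the /childproc path is simply the one from the commands above, and applying a plan needs CAP_SYS_ADMIN. The shared-propagation step should only matter if the child ends up in a different mount namespace from the parent.

```python
import ctypes
import os

MS_BIND = 0x1000
MS_REC = 0x4000
MS_SHARED = 1 << 20


def parent_plan(base="/childproc"):
    """mount(2) arguments the parent issues before forking."""
    return [
        (base, base, None, MS_BIND),               # mount --bind base base
        ("none", base, None, MS_REC | MS_SHARED),  # mount --make-rshared base
    ]


def child_plan(base="/childproc"):
    """mount(2) arguments the child issues from inside the pid namespace."""
    return [("proc", os.path.join(base, "proc"), "proc", 0)]


def apply_plan(plan):
    """Issue each (source, target, fstype, flags) entry via libc mount()."""
    libc = ctypes.CDLL(None, use_errno=True)
    libc.mount.argtypes = [ctypes.c_char_p, ctypes.c_char_p, ctypes.c_char_p,
                           ctypes.c_ulong, ctypes.c_void_p]
    for source, target, fstype, flags in plan:
        ret = libc.mount(source.encode(), target.encode(),
                         fstype.encode() if fstype else None, flags, None)
        if ret != 0:
            err = ctypes.get_errno()
            raise OSError(err, f"mount {target}: {os.strerror(err)}")


def prepare_parent(base="/childproc"):
    """mkdir -p base/proc, then bind-mount base onto itself and share it."""
    os.makedirs(os.path.join(base, "proc"), exist_ok=True)
    apply_plan(parent_plan(base))
```

The parent would call prepare_parent() before forking, and the child would call apply_plan(child_plan()) once it is running inside the new pid namespace.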
On Mon, Apr 28, 2014 at 6:39 AM, Serge Hallyn <[email protected]> wrote:
> Quoting Andy Lutomirski ([email protected]):
>> On Fri, Apr 25, 2014 at 12:37 PM, Eric W. Biederman
>> <[email protected]> wrote:
>> > Andy Lutomirski <[email protected]> writes:
>> >
>> >> Unless I'm missing some trick, it's currently rather painful to mount
>> >> a namespace /proc. You have to actually be in the pid namespace to
>> >> mount the correct /proc instance, and you can't unmount the old /proc
>> >> until you've mounted the new /proc. This means that you have to fork
>> >> into the new pid namespace before you can finish setting it up.
>> >
>> > Yes. You have to be inside just about all namespaces before you can
>> > finish setting them up.
>> >
>> > I don't know the context in which needed to be inside the pid namespace
>> > is a burden.
>>
>> I'm trying to sandbox myself. I unshare everything, setup up new
>> mounts, pivot_root, umount the old stuff, fork, and wait around for
>> the child to finish.
>>
>> This doesn't work: the parent can't mount the new /proc, and the child
>> can't either because it's too late.
>
> I'm probably not thinking it through enough... But can't the parent, before
> forking, do
>
> mkdir -p /childproc/proc
> mount --bind /childproc /childproc
> mount --make-rshared /childproc
>
> then the child mounts its proc under /childproc/proc and have that show
> up in the parent's tree?
Yes, and the --make-rshared /childproc isn't necessary. This is still
a bit annoying, since the parent now needs to wait for the child to
set up mounts if it wants to do anything that requires all the mounts
to be fully set up.
This issue certainly isn't a show-stopper, but it might be nice to
address if anyone ever adds options to proc to do other sensible
namespacy things (e.g. turning off sysctls).
--Andy