2015-12-09 02:34:37

by Dongsheng Yang

Subject: Re: piping core dump to a program escapes container

On 10/25/2015 05:54 AM, Shayan Pooya wrote:
> I noticed the following core_pattern behavior on my Linux box while
> running docker containers. I am not sure if it is a bug, but it is
> inconsistent and not documented.
>
> If the core_pattern is set on the host, the containers will observe
> and use the pattern for dumping cores (there is no per cgroup
> core_pattern). According to core(5) for setting core_pattern one can:
>
> 1. echo "/tmp/cores/core.%e.%p" > /proc/sys/kernel/core_pattern
> 2. echo "|/bin/custom_core /tmp/cores/ %e %p " > /proc/sys/kernel/core_pattern
>
> The former pattern evaluates the /tmp/cores path in the container's
> filesystem namespace, which means the host does not see a core file
> in /tmp/cores.
>
> However, the latter evaluates the /bin/custom_core path in the global
> filesystem namespace. Moreover, if /bin/custom_core decides to write the
> core to a path (/tmp/cores in this case, as shown by the arg to
> custom_core), the path will be evaluated in the global filesystem
> namespace as well.
>
> The latter behaviour is counter-intuitive and error-prone as the
> container can fill up the core-file directory which it does not have
> direct access to (which means the core is also not accessible for
> debugging if someone only has access to the container).

Hi Shayan,
We ran into the same problem you describe here.
Is this behaviour documented anywhere? I want to know whether it is
intentional or, as you said, a 'bug'. Maybe it is intentional, to give
the admin a way to collect core dumps from all containers, as Richard
said. I am interested in this too.

Can anyone help here?

Yang
>
> Currently, I work around this issue by detecting that the process is
> crashing from a container (by comparing the namespace pid to the
> global pid) and refuse to dump the core if it is from a container.
>
> Tested on Ubuntu (kernel 3.16) and Fedora (kernel 4.1).
> --
> To unsubscribe from this list: send the line "unsubscribe cgroups" in
> the body of a message to [email protected]
> More majordomo info at http://vger.kernel.org/majordomo-info.html
>
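
The workaround Shayan describes in the quoted text above (compare the
namespace PID with the global PID, and refuse to dump if they differ) could
look roughly like the sketch below. It is an illustration only, not his
actual helper. It assumes the pipe pattern passes both %p (PID inside the
process's own PID namespace) and %P (PID in the initial PID namespace,
available since Linux 3.12 according to core(5)); the pattern shown in the
docstring and the names crashed_in_pid_namespace and main are made up for
this example.

```python
"""Hypothetical core_pattern pipe helper that skips cores from containers.

Assumed pattern (an assumption, not taken from the thread):
    |/bin/custom_core /tmp/cores/ %e %p %P
"""
import os
import shutil
import sys


def crashed_in_pid_namespace(ns_pid, global_pid):
    """True when the PID seen inside the namespace differs from the global PID."""
    return ns_pid != global_pid


def main(argv, core_stream=None):
    # argv mirrors the assumed pattern: directory, executable name, %p, %P.
    core_dir, exe = argv[0], argv[1]
    ns_pid, global_pid = int(argv[2]), int(argv[3])
    # The kernel feeds the core image to the helper on standard input.
    stream = core_stream if core_stream is not None else sys.stdin.buffer
    if crashed_in_pid_namespace(ns_pid, global_pid):
        # The process lives in a non-initial PID namespace: write nothing.
        return None
    path = os.path.join(core_dir, "core.%s.%d" % (exe, global_pid))
    with open(path, "wb") as out:
        shutil.copyfileobj(stream, out)
    return path
```

A real helper would call main(sys.argv[1:]) when executed. Note that this
check cannot detect a container that shares the host's PID namespace, since
both PIDs are then identical.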



2015-12-09 02:43:52

by Dongsheng Yang

Subject: Re: piping core dump to a program escapes container

On 12/09/2015 10:26 AM, Dongsheng Yang wrote:
> [...]

In addition, would it be a good idea to make core_pattern separate for
each namespace?

Yang
> --
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to [email protected]
> More majordomo info at http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at http://www.tux.org/lkml/
> .
>


2015-12-09 03:37:42

by Eric W. Biederman

Subject: Re: piping core dump to a program escapes container

Dongsheng Yang <[email protected]> writes:

> On 12/09/2015 10:26 AM, Dongsheng Yang wrote:
>> On 10/25/2015 05:54 AM, Shayan Pooya wrote:
>>> [...]
>>>
>>> The latter behaviour is counter-intuitive and error-prone as the
>>> container can fill up the core-file directory which it does not have
>>> direct access to (which means the core is also not accessible for
>>> debugging if someone only has access to the container).

From a container perspective it is perhaps counter-intuitive, but from
the perspective of the operator of the machine nothing about core_pattern
works specially: it works as designed, with no unusual dangers.

>> [...]
>
> In addition, would it be a good idea to make core_pattern separate for
> each namespace?

The behavior was the best we could do the last time this issue was
examined. There is enough information available to write a core dumping
program that can reliably place your core dumps in your container.

There has not yet been an obvious namespace in which to stick
core_pattern, and even worse, exactly how to appropriately launch a
process in a container has not been figured out.

If those tricky problems can be solved we can have a core_pattern in a
container. What we have now is the best we have been able to figure out
so far.

Eric
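
One way to read the remark above, that a core dumping program can place the
core dumps in the container, is for the host-side helper to write through
/proc/<pid>/root of the crashing process. The sketch below is a guess at
such a helper, not something described in the thread. It assumes the
pattern supplies %P (global PID), %e and %p, and that core_pipe_limit is
non-zero so that, per core(5), the crashing process's /proc entry is still
there while the helper runs. The function names and the default /tmp/cores
directory are illustrative.

```python
import os
import shutil


def container_core_path(global_pid, exe, ns_pid, core_dir="/tmp/cores", proc="/proc"):
    """Build a path that resolves inside the crashing process's root filesystem."""
    # /proc/<pid>/root is a link to the root directory as that process sees it.
    root = os.path.join(proc, str(global_pid), "root")
    return os.path.join(root, core_dir.lstrip("/"), "core.%s.%d" % (exe, ns_pid))


def save_core(stream, global_pid, exe, ns_pid, core_dir="/tmp/cores", proc="/proc"):
    """Copy the core image from stream into the container's core directory."""
    path = container_core_path(global_pid, exe, ns_pid, core_dir, proc)
    os.makedirs(os.path.dirname(path), exist_ok=True)
    with open(path, "wb") as out:
        shutil.copyfileobj(stream, out)
    return path
```

Because the helper runs as root on the host, a real implementation would
also have to guard against symlinks planted inside the container and
against the container filling its own disk quota.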



2015-12-09 06:01:09

by Dongsheng Yang

Subject: Re: piping core dump to a program escapes container

On 12/09/2015 11:29 AM, Eric W. Biederman wrote:
> Dongsheng Yang <[email protected]> writes:
>
[...]

> There has not yet been an obvious namespace in which to stick
> core_pattern, and even worse, exactly how to appropriately launch a
> process in a container has not been figured out.
>
> If those tricky problems can be solved we can have a core_pattern in a
> container. What we have now is the best we have been able to figure out
> so far.

Thanks Eric, but if I want to make docker rely on this behaviour,
is that reliable?

I mean, I want a docker container to dump its core file to a specified
path on the host through a pipe. But I am afraid this behaviour might
change later. Any suggestions?

Yang


2015-12-09 06:41:48

by Eric W. Biederman

Subject: Re: piping core dump to a program escapes container

Dongsheng Yang <[email protected]> writes:

> On 12/09/2015 11:29 AM, Eric W. Biederman wrote:
>> Dongsheng Yang <[email protected]> writes:
>>
> [...]
>
> Thanks Eric, but if I want to make docker rely on this behaviour,
> is that reliable?
>
> I mean, I want a docker container to dump its core file to a specified
> path on the host through a pipe. But I am afraid this behaviour might
> change later. Any suggestions?

The kernel rules say that if there is a behavior someone depends on, and
that behavior changes and breaks userspace, that is a regression and it
is not allowed.

As developers we try not to create regressions. But some days it takes
someone testing or using the functionality enough to catch an issue.

That said, the real issue you are likely to run into when developing this
as part of docker is that docker doesn't get to own the core pattern.
It doesn't make sense for any one application to, as it is a kernel-wide
setting. Having different app- or container-specific policies for core
dumping likely requires either solving the problems I mentioned with
containers, or a userspace solution such as an /etc/core_pattern.d/
with different configurations and different scripts that somehow know
how to select which core files they want and dump them sanely.

Eric
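
To make the /etc/core_pattern.d/ idea above a little more concrete, here is
one possible shape for the userspace side. Nothing like this exists as a
standard; the directory layout, the "match=" and "exec=" keys, and the
function names are all invented for this sketch. A single dispatcher would
be registered in core_pattern, read the rule files in sorted order, and
hand the core to the first handler whose glob matches the executable name
(%e).

```python
import fnmatch
import os


def load_rules(conf_dir):
    """Read *.conf files in sorted order; each holds 'match=' and 'exec=' lines."""
    rules = []
    for name in sorted(os.listdir(conf_dir)):
        if not name.endswith(".conf"):
            continue
        rule = {}
        with open(os.path.join(conf_dir, name)) as fh:
            for line in fh:
                key, sep, value = line.strip().partition("=")
                if sep:
                    rule[key] = value
        # Ignore files that do not define both keys.
        if "match" in rule and "exec" in rule:
            rules.append((rule["match"], rule["exec"]))
    return rules


def select_handler(rules, exe_name):
    """Return the handler of the first rule whose glob matches the executable name."""
    for pattern, handler in rules:
        if fnmatch.fnmatchcase(exe_name, pattern):
            return handler
    return None
```

Numeric prefixes on the file names (10-db.conf, 99-default.conf) would give
the administrator control over priority, in the same way other *.d
directories are commonly used. Selecting by container rather than by
executable name would need additional information, such as the PIDs
discussed earlier in the thread.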

2015-12-09 08:14:49

by Dongsheng Yang

Subject: Re: piping core dump to a program escapes container

On 12/09/2015 02:32 PM, Eric W. Biederman wrote:
> [...]
>
> The kernel rules say that if there is a behavior someone depends on, and
> that behavior changes and breaks userspace, that is a regression and it
> is not allowed.
>
> As developers we try not to create regressions. But some days it takes
> someone testing or using the functionality enough to catch an issue.
>
> That said, the real issue you are likely to run into when developing this
> as part of docker is that docker doesn't get to own the core pattern.
> It doesn't make sense for any one application to, as it is a kernel-wide
> setting.

Agreed.
> Having different app- or container-specific policies for core
> dumping likely requires either solving the problems I mentioned with
> containers, or a userspace solution such as an /etc/core_pattern.d/
> with different configurations and different scripts that somehow know
> how to select which core files they want and dump them sanely.

We will try to solve the problems you mentioned, but it does not sound easy.
Anyway, I think I need to read some of the old discussions first.

Thanks
Yang


2015-12-09 08:44:30

by Bruno Prémont

Subject: Re: piping core dump to a program escapes container

On Tue, 08 Dec 2015 21:29:13 -0600 Eric W. Biederman wrote:
> Dongsheng Yang <[email protected]> writes:
>
> > >> On 10/25/2015 05:54 AM, Shayan Pooya wrote:
> > >>> [...]
> > >>> 1. echo "/tmp/cores/core.%e.%p" > /proc/sys/kernel/core_pattern
> > >>> 2. echo "|/bin/custom_core /tmp/cores/ %e %p " > /proc/sys/kernel/core_pattern
> > >>> [...]
>
> [...]
>
> There has not yet been an obvious namespace in which to stick
> core_pattern, and even worse, exactly how to appropriately launch a
> process in a container has not been figured out.
>
> If those tricky problems can be solved we can have a core_pattern in a
> container. What we have now is the best we have been able to figure out
> so far.

Isn't the second option dangerous if it is run in the global namespace and
settable from some other namespace/container?

If a process inside a container can set /proc/sys/kernel/core_pattern
then it could e.g. run
echo "|/bin/rm -rf / /tmp/cores/ %e %p " > /proc/sys/kernel/core_pattern
and kill the host (eventually including itself).
Other command lines could do different bad things.


Something that would sound reasonable is to have the core dumping
helper process run under the namespaces of the process that wrote to
/proc/sys/kernel/core_pattern.
When some of those namespaces are gone, falling back to the namespaces
of the process whose core is to be dumped might seem reasonable
(or just not dumping core at all, as is done when core_pipe_limit is
exceeded).

The value of core_pattern (and the other core_* sysctls) should probably
belong to the mount namespace of the proc filesystem through which it was
set, or to the matching namespace of the calling process when set via
syscall.

Bruno
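
To see why the content of a pipe pattern matters so much, it helps to look
at how the string becomes a command line. The sketch below mimics this in a
simplified way and is not the kernel's implementation: everything after the
leading "|" is split on whitespace into an argument vector, and % specifiers
are replaced by values describing the crashing process. The function name
expand_pipe_pattern and the values dictionary are assumptions made for the
example; per core(5) the resulting program is run as user and group root.

```python
import re


def expand_pipe_pattern(pattern, values):
    """Split a '|'-prefixed core_pattern into argv and expand % specifiers."""
    if not pattern.startswith("|"):
        raise ValueError("not a pipe pattern")

    def substitute(match):
        spec = match.group(1)
        if spec == "%":
            # "%%" stands for a literal percent sign.
            return "%"
        # Specifiers missing from 'values' are dropped in this sketch.
        return str(values.get(spec, ""))

    # Split first, then expand, so an expanded value cannot add arguments.
    return [re.sub(r"%(.)", substitute, arg) for arg in pattern[1:].split()]
```

With the pattern from the original report and values e="myprog", p=1234,
this yields the helper path followed by "/tmp/cores/", "myprog" and "1234".
Whoever is able to write the pattern therefore chooses both the program and
its fixed arguments.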

2015-12-10 00:35:24

by Dongsheng Yang

Subject: Re: piping core dump to a program escapes container

On 12/09/2015 04:34 PM, Bruno Prémont wrote:
> [...]
>
> Isn't the second option dangerous if it is run in the global namespace and
> settable from some other namespace/container?
>
> If a process inside a container can set /proc/sys/kernel/core_pattern
> then it could e.g. run
> echo "|/bin/rm -rf / /tmp/cores/ %e %p " > /proc/sys/kernel/core_pattern
> and kill the host (eventually including itself).
> Other command lines could do different bad things.

Yes. If you don't run a container as privileged, core_pattern is read-only
to it. If you do give a container privilege, then what you said is true.
But that is no different from giving privilege to any other process
running on the machine, so I don't think it is a problem.

Yang
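
A quick way to check the claim above from inside a given container is to
probe whether the sysctl file can be opened for writing. This is only a
rough sketch: os.access reports what the permission check would say for the
calling process, and an actual write could still be refused for other
reasons. The function names are made up for the example.

```python
import os

CORE_PATTERN = "/proc/sys/kernel/core_pattern"


def can_set_core_pattern(path=CORE_PATTERN):
    """True if the calling process would be allowed to open the file for writing."""
    # Returns False for a missing file or a read-only mount.
    return os.access(path, os.W_OK)


def read_core_pattern(path=CORE_PATTERN):
    """Return the current pattern without its trailing newline."""
    with open(path) as fh:
        return fh.read().rstrip("\n")
```

Running can_set_core_pattern() in an unprivileged container and again in a
privileged one should show the difference Yang describes, assuming the
container runtime mounts /proc/sys read-only in the unprivileged case.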


2015-12-10 03:06:16

by Dongsheng Yang

Subject: Re: piping core dump to a program escapes container

On 12/09/2015 11:29 AM, Eric W. Biederman wrote:
> [...]
>
> There has not yet been an obvious namespace in which to stick
> core_pattern, and even worse, exactly how to appropriately launch a
> process in a container has not been figured out.

Hi Eric,
Could you provide a reference to those discussions? In
addition, is there already any infrastructure for this kind of thing?

Thanx
Yang