2017-09-07 11:17:50

by Roman Gushchin

Subject: [RFC] proc, coredump: add CoreDumping flag to /proc/pid/status

Right now there is no convenient way to check if a process is being
coredumped at the moment.

It might be necessary to recognize such state to prevent killing
the process and getting a broken coredump.
Writing a large core might take significant time, and the process
is unresponsive during it, so it might be killed by timeout,
if another process is monitoring and killing/restarting
hanging tasks.

To provide an ability to detect if a process is in the state of
being coredumped, we can expose a boolean CoreDumping flag
in /proc/pid/status.

Example:
$ cat core.sh
#!/bin/sh

echo "|/usr/bin/sleep 10" > /proc/sys/kernel/core_pattern
sleep 1000 &
PID=$!

cat /proc/$PID/status | grep CoreDumping
kill -ABRT $PID
sleep 1
cat /proc/$PID/status | grep CoreDumping

$ ./core.sh
CoreDumping: 0
CoreDumping: 1

Signed-off-by: Roman Gushchin <[email protected]>
Cc: Alexander Viro <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: [email protected]
Cc: [email protected]
---
fs/proc/array.c | 6 ++++++
1 file changed, 6 insertions(+)

diff --git a/fs/proc/array.c b/fs/proc/array.c
index 88c355574aa0..fc4a0aa7f487 100644
--- a/fs/proc/array.c
+++ b/fs/proc/array.c
@@ -369,6 +369,11 @@ static void task_cpus_allowed(struct seq_file *m, struct task_struct *task)
 		   cpumask_pr_args(&task->cpus_allowed));
 }

+static inline void task_core_dumping(struct seq_file *m, struct mm_struct *mm)
+{
+	seq_printf(m, "CoreDumping:\t%d\n", !!mm->core_state);
+}
+
 int proc_pid_status(struct seq_file *m, struct pid_namespace *ns,
 			struct pid *pid, struct task_struct *task)
 {
@@ -379,6 +384,7 @@ int proc_pid_status(struct seq_file *m, struct pid_namespace *ns,

 	if (mm) {
 		task_mem(m, mm);
+		task_core_dumping(m, mm);
 		mmput(mm);
 	}
 	task_sig(m, task);
--
2.13.5


2017-09-13 22:05:52

by Roman Gushchin

Subject: Re: [RFC] proc, coredump: add CoreDumping flag to /proc/pid/status

On Thu, Sep 07, 2017 at 12:17:15PM +0100, Roman Gushchin wrote:
> Right now there is no convenient way to check if a process is being
> coredumped at the moment.
>
> It might be necessary to recognize such state to prevent killing
> the process and getting a broken coredump.
> Writing a large core might take significant time, and the process
> is unresponsive during it, so it might be killed by timeout,
> if another process is monitoring and killing/restarting
> hanging tasks.
>
> To provide an ability to detect if a process is in the state of
> being coredumped, we can expose a boolean CoreDumping flag
> in /proc/pid/status.
>

Ping?

2017-09-13 22:15:32

by Alexey Dobriyan

Subject: Re: [RFC] proc, coredump: add CoreDumping flag to /proc/pid/status

> To provide an ability to detect if a process is in the state of
> being coredumped, we can expose a boolean CoreDumping flag
> in /proc/pid/status.

Or add "State: C" ?

2017-09-13 22:22:12

by Roman Gushchin

Subject: Re: [RFC] proc, coredump: add CoreDumping flag to /proc/pid/status

On Thu, Sep 14, 2017 at 01:15:26AM +0300, Alexey Dobriyan wrote:
> > To provide an ability to detect if a process is in the state of
> > being coredumped, we can expose a boolean CoreDumping flag
> > in /proc/pid/status.
>
> Or add "State: C" ?

A program in such state can also sleep and run, so it's not
a state in terms of process states.
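
To illustrate the point: with the patch applied, the scheduler state and
the dump flag are reported on separate lines of /proc/<pid>/status, so a
reader can observe them independently. The following is only a sketch;
state_and_dump is a hypothetical helper name, and the "?" placeholder for
kernels that do not report the field is an assumption of this example.

```sh
#!/bin/sh
# Sketch only: state_and_dump is a hypothetical helper, not part of the patch.

# Print "<state letter> <CoreDumping value>" for the status file given as $1.
# The second value is "?" when the file has no CoreDumping line
# (for example, a kernel without this patch).
state_and_dump() {
    awk -F'\t' '
        $1 == "State:"       { state = substr($2, 1, 1) }
        $1 == "CoreDumping:" { dump = $2 }
        END                  { print state, (dump == "" ? "?" : dump) }
    ' "$1"
}

# Usage: state_and_dump "/proc/$PID/status"
```

A task can therefore show any ordinary state letter together with
CoreDumping set to 1.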

2017-09-13 22:46:48

by Alexey Dobriyan

Subject: Re: [RFC] proc, coredump: add CoreDumping flag to /proc/pid/status

On Wed, Sep 13, 2017 at 03:21:59PM -0700, Roman Gushchin wrote:
> On Thu, Sep 14, 2017 at 01:15:26AM +0300, Alexey Dobriyan wrote:
> > > To provide an ability to detect if a process is in the state of
> > > being coredumped, we can expose a boolean CoreDumping flag
> > > in /proc/pid/status.
> >
> > Or add "State: C" ?
>
> A program in such state can also sleep and run, so it's not
> a state in terms of process states.

Well, maybe something will break from seeing unknown process state.

Regardless, symlink /proc/$PID/coredump pointing to either "0" or "1"
is faster than open+read+parse+close.

2017-09-13 23:08:04

by Roman Gushchin

Subject: Re: [RFC] proc, coredump: add CoreDumping flag to /proc/pid/status

On Thu, Sep 14, 2017 at 01:46:43AM +0300, Alexey Dobriyan wrote:
> On Wed, Sep 13, 2017 at 03:21:59PM -0700, Roman Gushchin wrote:
> > On Thu, Sep 14, 2017 at 01:15:26AM +0300, Alexey Dobriyan wrote:
> > > > To provide an ability to detect if a process is in the state of
> > > > being coredumped, we can expose a boolean CoreDumping flag
> > > > in /proc/pid/status.
> > >
> > > Or add "State: C" ?
> >
> > A program in such state can also sleep and run, so it's not
> > a state in terms of process states.
>
> Well, maybe something will break from seeing unknown process state.
>
> Regardless, symlink /proc/$PID/coredump pointing to either "0" or "1"
> is faster than open+read+parse+close.

Performance doesn't really matter in this case: nobody should check
this flag often. The expected use case is described above: check the flag
once before killing the process on timeout.
So it doesn't seem to deserve a separate entry in procfs.
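
The single check described above could be sketched roughly as follows.
This is an illustration under stated assumptions, not part of the patch:
core_dumping_flag, safe_to_kill and the PROC_ROOT override are
hypothetical names introduced here (PROC_ROOT exists only so the
functions can be exercised against a copy of a status file).

```sh
#!/bin/sh
# Sketch only: function names and PROC_ROOT are illustrative assumptions.

# Print the CoreDumping value (0 or 1) from the status file given as $1.
# Prints nothing if the field is absent (e.g. a kernel without the patch).
core_dumping_flag() {
    sed -n 's/^CoreDumping:[[:space:]]*//p' "$1"
}

# Return 0 if the process may be killed, 1 if it is currently dumping core.
safe_to_kill() {
    pid=$1
    status="${PROC_ROOT:-/proc}/$pid/status"
    [ -r "$status" ] || return 0    # process is already gone
    [ "$(core_dumping_flag "$status")" != "1" ]
}

# Usage: kill a hung task only if it is not in the middle of a core dump.
# if safe_to_kill "$PID"; then kill -KILL "$PID"; fi
```

The check and the kill are still two separate steps, so a dump that
starts between them is not detected; the sketch only narrows that window.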

2017-09-14 22:44:47

by Roman Gushchin

Subject: Re: [RFC] proc, coredump: add CoreDumping flag to /proc/pid/status

Adding Andrew Morton and Oleg Nesterov to cc.

On Thu, Sep 07, 2017 at 12:17:15PM +0100, Roman Gushchin wrote:
> Right now there is no convenient way to check if a process is being
> coredumped at the moment.
>
> It might be necessary to recognize such state to prevent killing
> the process and getting a broken coredump.
> Writing a large core might take significant time, and the process
> is unresponsive during it, so it might be killed by timeout,
> if another process is monitoring and killing/restarting
> hanging tasks.
>
> To provide an ability to detect if a process is in the state of
> being coredumped, we can expose a boolean CoreDumping flag
> in /proc/pid/status.
>
> Example:
> $ cat core.sh
> #!/bin/sh
>
> echo "|/usr/bin/sleep 10" > /proc/sys/kernel/core_pattern
> sleep 1000 &
> PID=$!
>
> cat /proc/$PID/status | grep CoreDumping
> kill -ABRT $PID
> sleep 1
> cat /proc/$PID/status | grep CoreDumping
>
> $ ./core.sh
> CoreDumping: 0
> CoreDumping: 1
>
> Signed-off-by: Roman Gushchin <[email protected]>
> Cc: Alexander Viro <[email protected]>
> Cc: Ingo Molnar <[email protected]>
> Cc: [email protected]
> Cc: [email protected]
> ---
> fs/proc/array.c | 6 ++++++
> 1 file changed, 6 insertions(+)
>
> diff --git a/fs/proc/array.c b/fs/proc/array.c
> index 88c355574aa0..fc4a0aa7f487 100644
> --- a/fs/proc/array.c
> +++ b/fs/proc/array.c
> @@ -369,6 +369,11 @@ static void task_cpus_allowed(struct seq_file *m, struct task_struct *task)
> cpumask_pr_args(&task->cpus_allowed));
> }
>
> +static inline void task_core_dumping(struct seq_file *m, struct mm_struct *mm)
> +{
> + seq_printf(m, "CoreDumping:\t%d\n", !!mm->core_state);
> +}
> +
> int proc_pid_status(struct seq_file *m, struct pid_namespace *ns,
> struct pid *pid, struct task_struct *task)
> {
> @@ -379,6 +384,7 @@ int proc_pid_status(struct seq_file *m, struct pid_namespace *ns,
>
> if (mm) {
> task_mem(m, mm);
> + task_core_dumping(m, mm);
> mmput(mm);
> }
> task_sig(m, task);
> --
> 2.13.5
>

2017-09-20 23:07:17

by Roman Gushchin

Subject: [RESEND] proc, coredump: add CoreDumping flag to /proc/pid/status

Right now there is no convenient way to check if a process is being
coredumped at the moment.

It might be necessary to recognize such state to prevent killing
the process and getting a broken coredump.
Writing a large core might take significant time, and the process
is unresponsive during it, so it might be killed by timeout,
if another process is monitoring and killing/restarting
hanging tasks.

To provide an ability to detect if a process is in the state of
being coredumped, we can expose a boolean CoreDumping flag
in /proc/pid/status.

Example:
$ cat core.sh
#!/bin/sh

echo "|/usr/bin/sleep 10" > /proc/sys/kernel/core_pattern
sleep 1000 &
PID=$!

cat /proc/$PID/status | grep CoreDumping
kill -ABRT $PID
sleep 1
cat /proc/$PID/status | grep CoreDumping

$ ./core.sh
CoreDumping: 0
CoreDumping: 1

Signed-off-by: Roman Gushchin <[email protected]>
Cc: Andrew Morton <[email protected]>
Cc: Alexander Viro <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: [email protected]
Cc: [email protected]
---
fs/proc/array.c | 6 ++++++
1 file changed, 6 insertions(+)

diff --git a/fs/proc/array.c b/fs/proc/array.c
index 88c355574aa0..fc4a0aa7f487 100644
--- a/fs/proc/array.c
+++ b/fs/proc/array.c
@@ -369,6 +369,11 @@ static void task_cpus_allowed(struct seq_file *m, struct task_struct *task)
 		   cpumask_pr_args(&task->cpus_allowed));
 }

+static inline void task_core_dumping(struct seq_file *m, struct mm_struct *mm)
+{
+	seq_printf(m, "CoreDumping:\t%d\n", !!mm->core_state);
+}
+
 int proc_pid_status(struct seq_file *m, struct pid_namespace *ns,
 			struct pid *pid, struct task_struct *task)
 {
@@ -379,6 +384,7 @@ int proc_pid_status(struct seq_file *m, struct pid_namespace *ns,

 	if (mm) {
 		task_mem(m, mm);
+		task_core_dumping(m, mm);
 		mmput(mm);
 	}
 	task_sig(m, task);
--
2.13.5

2017-09-22 15:44:15

by Konstantin Khlebnikov

Subject: Re: [RESEND] proc, coredump: add CoreDumping flag to /proc/pid/status

On Thu, Sep 21, 2017 at 2:06 AM, Roman Gushchin <[email protected]> wrote:
> Right now there is no convenient way to check if a process is being
> coredumped at the moment.
>
> It might be necessary to recognize such state to prevent killing
> the process and getting a broken coredump.
> Writing a large core might take significant time, and the process
> is unresponsive during it, so it might be killed by timeout,
> if another process is monitoring and killing/restarting
> hanging tasks.
>
> To provide an ability to detect if a process is in the state of
> being coredumped, we can expose a boolean CoreDumping flag
> in /proc/pid/status.

Makes sense.

Maybe print this line only when task actually makes dump?
And probably expose pid of coredump helper.

Add Oleg into CC.

>
> Example:
> $ cat core.sh
> #!/bin/sh
>
> echo "|/usr/bin/sleep 10" > /proc/sys/kernel/core_pattern
> sleep 1000 &
> PID=$!
>
> cat /proc/$PID/status | grep CoreDumping
> kill -ABRT $PID
> sleep 1
> cat /proc/$PID/status | grep CoreDumping
>
> $ ./core.sh
> CoreDumping: 0
> CoreDumping: 1
>
> Signed-off-by: Roman Gushchin <[email protected]>
> Cc: Andrew Morton <[email protected]>
> Cc: Alexander Viro <[email protected]>
> Cc: Ingo Molnar <[email protected]>
> Cc: [email protected]
> Cc: [email protected]
> ---
> fs/proc/array.c | 6 ++++++
> 1 file changed, 6 insertions(+)
>
> diff --git a/fs/proc/array.c b/fs/proc/array.c
> index 88c355574aa0..fc4a0aa7f487 100644
> --- a/fs/proc/array.c
> +++ b/fs/proc/array.c
> @@ -369,6 +369,11 @@ static void task_cpus_allowed(struct seq_file *m, struct task_struct *task)
> cpumask_pr_args(&task->cpus_allowed));
> }
>
> +static inline void task_core_dumping(struct seq_file *m, struct mm_struct *mm)
> +{
> + seq_printf(m, "CoreDumping:\t%d\n", !!mm->core_state);
> +}
> +
> int proc_pid_status(struct seq_file *m, struct pid_namespace *ns,
> struct pid *pid, struct task_struct *task)
> {
> @@ -379,6 +384,7 @@ int proc_pid_status(struct seq_file *m, struct pid_namespace *ns,
>
> if (mm) {
> task_mem(m, mm);
> + task_core_dumping(m, mm);
> mmput(mm);
> }
> task_sig(m, task);
> --
> 2.13.5
>

2017-09-22 17:18:48

by Roman Gushchin

Subject: Re: [RESEND] proc, coredump: add CoreDumping flag to /proc/pid/status

On Fri, Sep 22, 2017 at 06:44:12PM +0300, Konstantin Khlebnikov wrote:
> On Thu, Sep 21, 2017 at 2:06 AM, Roman Gushchin <[email protected]> wrote:
> > Right now there is no convenient way to check if a process is being
> > coredumped at the moment.
> >
> > It might be necessary to recognize such state to prevent killing
> > the process and getting a broken coredump.
> > Writing a large core might take significant time, and the process
> > is unresponsive during it, so it might be killed by timeout,
> > if another process is monitoring and killing/restarting
> > hanging tasks.
> >
> > To provide an ability to detect if a process is in the state of
> > being coredumped, we can expose a boolean CoreDumping flag
> > in /proc/pid/status.
>
> Makes sense.
>
> Maybe print this line only when task actually makes dump?

I don't think we do this trick with any other fields...

> And probably expose pid of coredump helper.

It will be racy in most cases, so I'm not sure it's worth it.
What's the usecase?
In any case, it sounds like a separate feature.

>
> Add Oleg into CC.

Thank you!

2017-09-26 12:39:38

by Roman Gushchin

Subject: Re: [RESEND] proc, coredump: add CoreDumping flag to /proc/pid/status

Hi, Andrew!

As there are no objections, could you please pick up this patch?

Thank you!

On Wed, Sep 20, 2017 at 04:06:34PM -0700, Roman Gushchin wrote:
> Right now there is no convenient way to check if a process is being
> coredumped at the moment.
>
> It might be necessary to recognize such state to prevent killing
> the process and getting a broken coredump.
> Writing a large core might take significant time, and the process
> is unresponsive during it, so it might be killed by timeout,
> if another process is monitoring and killing/restarting
> hanging tasks.
>
> To provide an ability to detect if a process is in the state of
> being coredumped, we can expose a boolean CoreDumping flag
> in /proc/pid/status.
>
> Example:
> $ cat core.sh
> #!/bin/sh
>
> echo "|/usr/bin/sleep 10" > /proc/sys/kernel/core_pattern
> sleep 1000 &
> PID=$!
>
> cat /proc/$PID/status | grep CoreDumping
> kill -ABRT $PID
> sleep 1
> cat /proc/$PID/status | grep CoreDumping
>
> $ ./core.sh
> CoreDumping: 0
> CoreDumping: 1
>
> Signed-off-by: Roman Gushchin <[email protected]>
> Cc: Andrew Morton <[email protected]>
> Cc: Alexander Viro <[email protected]>
> Cc: Ingo Molnar <[email protected]>
> Cc: [email protected]
> Cc: [email protected]
> ---
> fs/proc/array.c | 6 ++++++
> 1 file changed, 6 insertions(+)
>
> diff --git a/fs/proc/array.c b/fs/proc/array.c
> index 88c355574aa0..fc4a0aa7f487 100644
> --- a/fs/proc/array.c
> +++ b/fs/proc/array.c
> @@ -369,6 +369,11 @@ static void task_cpus_allowed(struct seq_file *m, struct task_struct *task)
> cpumask_pr_args(&task->cpus_allowed));
> }
>
> +static inline void task_core_dumping(struct seq_file *m, struct mm_struct *mm)
> +{
> + seq_printf(m, "CoreDumping:\t%d\n", !!mm->core_state);
> +}
> +
> int proc_pid_status(struct seq_file *m, struct pid_namespace *ns,
> struct pid *pid, struct task_struct *task)
> {
> @@ -379,6 +384,7 @@ int proc_pid_status(struct seq_file *m, struct pid_namespace *ns,
>
> if (mm) {
> task_mem(m, mm);
> + task_core_dumping(m, mm);
> mmput(mm);
> }
> task_sig(m, task);
> --
> 2.13.5
>

2017-09-27 23:31:09

by Andrew Morton

Subject: Re: [RESEND] proc, coredump: add CoreDumping flag to /proc/pid/status

On Wed, 20 Sep 2017 16:06:34 -0700 Roman Gushchin <[email protected]> wrote:

> Right now there is no convenient way to check if a process is being
> coredumped at the moment.
>
> It might be necessary to recognize such state to prevent killing
> the process and getting a broken coredump.
> Writing a large core might take significant time, and the process
> is unresponsive during it, so it might be killed by timeout,
> if another process is monitoring and killing/restarting
> hanging tasks.
>
> To provide an ability to detect if a process is in the state of
> being coredumped, we can expose a boolean CoreDumping flag
> in /proc/pid/status.
>
> Example:
> $ cat core.sh
> #!/bin/sh
>
> echo "|/usr/bin/sleep 10" > /proc/sys/kernel/core_pattern
> sleep 1000 &
> PID=$!
>
> cat /proc/$PID/status | grep CoreDumping
> kill -ABRT $PID
> sleep 1
> cat /proc/$PID/status | grep CoreDumping
>
> $ ./core.sh
> CoreDumping: 0
> CoreDumping: 1

I assume you have some real-world use case which benefits from this.

> fs/proc/array.c | 6 ++++++
> 1 file changed, 6 insertions(+)

A Documentation/ update would be appropriate? Include a brief mention of
*why* someone might want to use this...


2017-09-28 13:54:32

by Roman Gushchin

Subject: Re: [RESEND] proc, coredump: add CoreDumping flag to /proc/pid/status

On Wed, Sep 27, 2017 at 04:31:06PM -0700, Andrew Morton wrote:
> On Wed, 20 Sep 2017 16:06:34 -0700 Roman Gushchin <[email protected]> wrote:
>
> > Right now there is no convenient way to check if a process is being
> > coredumped at the moment.
> >
> > It might be necessary to recognize such state to prevent killing
> > the process and getting a broken coredump.
> > Writing a large core might take significant time, and the process
> > is unresponsive during it, so it might be killed by timeout,
> > if another process is monitoring and killing/restarting
> > hanging tasks.
> >
> > To provide an ability to detect if a process is in the state of
> > being coredumped, we can expose a boolean CoreDumping flag
> > in /proc/pid/status.
> >
> > Example:
> > $ cat core.sh
> > #!/bin/sh
> >
> > echo "|/usr/bin/sleep 10" > /proc/sys/kernel/core_pattern
> > sleep 1000 &
> > PID=$!
> >
> > cat /proc/$PID/status | grep CoreDumping
> > kill -ABRT $PID
> > sleep 1
> > cat /proc/$PID/status | grep CoreDumping
> >
> > $ ./core.sh
> > CoreDumping: 0
> > CoreDumping: 1
>
> I assume you have some real-world use case which benefits from this.

Sure, we're getting a noticeable number of corrupted coredump files
on machines in our fleet, just because processes are being killed
on timeout in the middle of writing the core.

We do have a process health check, and an agent is responsible
for restarting processes which are not responding to health check requests.
Writing a large coredump to disk can easily exceed a reasonable timeout
(especially on an overloaded machine).

This flag will allow the agent to distinguish processes which are being
coredumped, extend the timeout for them, and let them produce a full
coredump file.
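
The timeout extension could look roughly like this on the agent side.
It is a sketch under stated assumptions only: effective_timeout is a
hypothetical helper, and the BASE_TIMEOUT and DUMP_TIMEOUT values are
made-up numbers chosen for illustration, not values from our setup.

```sh
#!/bin/sh
# Sketch only: names and timeout values below are illustrative assumptions.

BASE_TIMEOUT=30     # seconds allowed for a normal health check
DUMP_TIMEOUT=600    # seconds allowed while a core is being written

# Print the timeout to apply to the process whose status file is $1.
# A missing file or a missing CoreDumping line falls back to BASE_TIMEOUT.
effective_timeout() {
    if grep -q '^CoreDumping:[[:space:]]*1$' "$1" 2>/dev/null; then
        echo "$DUMP_TIMEOUT"
    else
        echo "$BASE_TIMEOUT"
    fi
}

# Usage: compare the time since the last successful health check against
# the value printed by: effective_timeout "/proc/$PID/status"
```

Falling back to the normal timeout when the field is missing keeps the
agent's behaviour unchanged on kernels that do not have the flag.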

>
> > fs/proc/array.c | 6 ++++++
> > 1 file changed, 6 insertions(+)
>
> A Documentation/ update would be appropriate? Include a brief mention of
> *why* someone might want to use this...
>
>

Here it is. Thank you!

--

From 71f86fc2bdd6104dc7d63c0c2eeb6b414494a582 Mon Sep 17 00:00:00 2001
From: Roman Gushchin <[email protected]>
Date: Thu, 28 Sep 2017 13:47:19 +0100
Subject: [PATCH] proc: document CoreDumping flag in /proc/<pid>/status

Add description for the CoreDumping flag in /proc/<pid>/status.

The flag is intended to be used to avoid killing processes
during the generation of the coredump files and avoid getting
corrupted coredump files.

Signed-off-by: Roman Gushchin <[email protected]>
Cc: Andrew Morton <[email protected]>
Cc: Alexander Viro <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Oleg Nesterov <[email protected]>
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
---
Documentation/filesystems/proc.txt | 3 +++
1 file changed, 3 insertions(+)

diff --git a/Documentation/filesystems/proc.txt b/Documentation/filesystems/proc.txt
index adba21b5ada7..bc832f8b7a70 100644
--- a/Documentation/filesystems/proc.txt
+++ b/Documentation/filesystems/proc.txt
@@ -181,6 +181,7 @@ read the file /proc/PID/status:
VmPTE: 20 kb
VmSwap: 0 kB
HugetlbPages: 0 kB
+ CoreDumping: 0
Threads: 1
SigQ: 0/28578
SigPnd: 0000000000000000
@@ -254,6 +255,8 @@ Table 1-2: Contents of the status files (as of 4.8)
VmSwap amount of swap used by anonymous private data
(shmem swap usage is not included)
HugetlbPages size of hugetlb memory portions
+ CoreDumping process's memory is currently being dumped
+ (killing the process may lead to a corrupted core)
Threads number of threads
SigQ number of signals queued/max. number for queue
SigPnd bitmap of pending signals for the thread
--
2.13.5