2002-02-14 16:11:33

by Michael Sinz

Subject: [PATCH] Core dump file control

I have, for a long time, wished that Linux had a way to specify where
core dumps are stored and what the name of the core dump is. Now that
I have been building large linux clusters with many diskless nodes,
this need has become even more important.

What I did with this patch is provide a new sysctl that lets you
control the name of the core file. The name is actually a format
string, so that certain values from the process can be included.

The sysctl is kernel.core_name_format and is a string of up to 63 characters
(plus 1 for the terminating null).

The following format options are available in that string:

%P The Process ID (current->pid)
%U The UID of the process (current->uid)
%N The command name of the process (current->comm)
%H The nodename of the system (system_utsname.nodename)
%% A "%"

For example, in my clusters, I have an NFS R/W mount at /coredumps
that all nodes have access to. The format string I use is:

sysctl -w "kernel.core_name_format=/coredumps/%H-%N-%P.core"

This then causes core dumps to be of the format:

/coredumps/whale.sinz.org-badprogram-13917.core

The old behavior of appending the PID to the "core" name is still
supported, with the added logic of only doing so if the PID is
not already part of the name format. The default name format
is still just "core" to match old behavior.
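
The expansion rules described above can be sketched in userspace C as follows. This is only an illustration under assumptions, not the kernel code: the function name expand_core_name, its parameters, and the helper append_sanitized are made up for this sketch, and stopping at a trailing '%' is a choice of the sketch rather than something the patch does.

```c
#include <stdio.h>
#include <string.h>

#define MAX_CORE_NAME 160

/* Copy src into out at index *n, skipping '/' so that an expanded value
   cannot add path components, and stopping once *n reaches MAX_CORE_NAME. */
static void append_sanitized(char *out, int *n, const char *src)
{
    for (; *src && *n < MAX_CORE_NAME; src++)
        if (*src != '/')
            out[(*n)++] = *src;
}

/* Hypothetical userspace version of the name expansion.
   out must hold at least MAX_CORE_NAME + 1 + 11 bytes.
   add_pid mirrors the old "append .PID" behavior. Returns the length. */
int expand_core_name(const char *fmt, int pid, int uid,
                     const char *comm, const char *node,
                     int add_pid, char *out)
{
    int n = 0;

    for (; *fmt && n < MAX_CORE_NAME; fmt++) {
        if (*fmt != '%') {              /* ordinary character: pass along */
            out[n++] = *fmt;
            continue;
        }
        fmt++;
        if (*fmt == '\0')               /* sketch's choice: stop at a trailing '%' */
            break;
        switch (*fmt) {
        case '%': out[n++] = '%'; break;
        case 'N': append_sanitized(out, &n, comm); break;
        case 'H': append_sanitized(out, &n, node); break;
        case 'P': add_pid = 0;          /* PID is in the format: do not append it */
                  n += sprintf(out + n, "%d", pid); break;
        case 'U': n += sprintf(out + n, "%d", uid); break;
        default:  break;                /* unknown specifier: dropped */
        }
    }
    if (add_pid && n < MAX_CORE_NAME)   /* old behavior: core -> core.PID */
        n += sprintf(out + n, ".%d", pid);
    out[n] = '\0';
    return n;
}
```

Called with the format "/coredumps/%H-%N-%P.core", PID 13917, command name "badprogram" and node name "whale.sinz.org", this fills the buffer with the example name shown earlier. With the default format "core" and add_pid set, the result is "core.13917".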

NOTE - I was tempted to change the default format to be something
like "%N.core" which would at least identify the program that
caused the core file. However, I can do that as part of my init
process so it is not an issue here.

The attached patch is for Linux 2.4.17 but should patch relatively
easily to other versions. I tried to comment the code a bit
to explain the how and why.

-----------------
diff -urP linux-2.4.17/include/linux/sysctl.h linux.patch/include/linux/sysctl.h
--- linux-2.4.17/include/linux/sysctl.h Mon Nov 26 08:29:17 2001
+++ linux.patch/include/linux/sysctl.h Thu Feb 14 08:23:59 2002
@@ -124,6 +124,7 @@
KERN_CORE_USES_PID=52, /* int: use core or core.%pid */
KERN_TAINTED=53, /* int: various kernel tainted flags */
KERN_CADPID=54, /* int: PID of the process to notify on CAD */
+ KERN_CORE_NAME_FORMAT=55, /* string: core file name format string */
};


diff -urP linux-2.4.17/kernel/sysctl.c linux.patch/kernel/sysctl.c
--- linux-2.4.17/kernel/sysctl.c Fri Dec 21 12:42:04 2001
+++ linux.patch/kernel/sysctl.c Thu Feb 14 08:26:15 2002
@@ -49,6 +49,7 @@
extern int max_queued_signals;
extern int sysrq_enabled;
extern int core_uses_pid;
+extern char core_name_format[];
extern int cad_pid;

/* this is needed for the proc_dointvec_minmax for [fs_]overflow UID and GID */
@@ -171,6 +172,8 @@
0644, NULL, &proc_dointvec},
{KERN_CORE_USES_PID, "core_uses_pid", &core_uses_pid, sizeof(int),
0644, NULL, &proc_dointvec},
+ {KERN_CORE_NAME_FORMAT, "core_name_format", core_name_format, 64,
+ 0644, NULL, &proc_doutsstring, &sysctl_string},
{KERN_TAINTED, "tainted", &tainted, sizeof(int),
0644, NULL, &proc_dointvec},
{KERN_CAP_BSET, "cap-bound", &cap_bset, sizeof(kernel_cap_t),
diff -urP linux-2.4.17/fs/exec.c linux.patch/fs/exec.c
--- linux-2.4.17/fs/exec.c Fri Dec 21 12:41:55 2001
+++ linux.patch/fs/exec.c Thu Feb 14 10:00:16 2002
@@ -35,6 +35,7 @@
#include <linux/highmem.h>
#include <linux/spinlock.h>
#include <linux/personality.h>
+#include <linux/utsname.h>
#define __NO_VERSION__
#include <linux/module.h>

@@ -48,6 +49,12 @@

int core_uses_pid;

+/* The format string for the core file name...
+ We default to "core" such that past behavior
+ remains unchanged. The 64 character limit is
+ arbitrary but must match the sysctl table. */
+char core_name_format[64] = {"core"};
+
static struct linux_binfmt *formats;
static rwlock_t binfmt_lock = RW_LOCK_UNLOCKED;

@@ -933,14 +940,37 @@
__MOD_DEC_USE_COUNT(old->module);
}

+/* This is the maximum expanded core file name. We use
+ a reasonable number here since we use the stack to
+ do the expansion. However, the number should be big
+ enough to handle a reasonable command name plus PID
+ and/or UID in addition to the file name part that
+ is in the core_name_format string. */
+#define MAX_CORE_NAME (160)
+
int do_coredump(long signr, struct pt_regs * regs)
{
struct linux_binfmt * binfmt;
- char corename[6+sizeof(current->comm)+10];
struct file * file;
struct inode * inode;
int retval = 0;

+ int fmt_i;
+ int name_n;
+ int addPID;
+ char *cname;
+
+ /* The +11 is here to simplify the code path. What
+ we do is always check that we are less than MAX
+ but there are times when we also need to append
+ a number (such as the PID or UID). Rather than
+ using another temporary buffer, we provide for
+ enough extra space such that those numbers can
+ be added in one gulp even if we are just under
+ the MAX_CORE_NAME. Reduction in complexity of
+ the code path means a more reliable implementation. */
+ char corename[MAX_CORE_NAME + 1 + 11];
+
lock_kernel();
binfmt = current->binfmt;
if (!binfmt || !binfmt->core_dump)
@@ -951,10 +981,92 @@
if (current->rlim[RLIMIT_CORE].rlim_cur < binfmt->min_coredump)
goto fail;

- memcpy(corename,"core.", 5);
- corename[4] = '\0';
- if (core_uses_pid || atomic_read(&current->mm->mm_users) != 1)
- sprintf(&corename[4], ".%d", current->pid);
+ /* Set this to true if we are going to add the PID. If the PID
+ already is added in the format we will end up clearing this.
+ The purpose is to provide for the old behavior of adding the
+ PID to the core file name but to not add it if it already
+ was included via the file name format pattern. */
+ addPID = (core_uses_pid || atomic_read(&current->mm->mm_users) != 1);
+
+ /* Build the core file name as needed from the format string */
+ for (fmt_i=0, name_n=0;
+ name_n < MAX_CORE_NAME && core_name_format[fmt_i];
+ fmt_i++)
+ {
+ switch (core_name_format[fmt_i])
+ {
+ case '%': /* A format character */
+ fmt_i++;
+ switch (core_name_format[fmt_i])
+ {
+ case '%': /* The way we get this character */
+ corename[name_n++] = '%';
+ break;
+
+ case 'N': /* Process name */
+ cname=current->comm;
+
+ /* Only copy as much as will fit within the
+ MAX_CORE_NAME */
+ while (*cname && (name_n < MAX_CORE_NAME))
+ {
+ if (*cname != '/')
+ corename[name_n++] = *cname;
+ cname++;
+ }
+ break;
+
+ case 'H': /* Node name */
+ cname=system_utsname.nodename;
+
+ /* Only copy as much as will fit within the
+ MAX_CORE_NAME */
+ while (*cname && (name_n < MAX_CORE_NAME))
+ {
+ if (*cname != '/')
+ corename[name_n++] = *cname;
+ cname++;
+ }
+ break;
+
+ case 'P': /* Process PID */
+ /* Since we are adding it here, don't append */
+ addPID=0;
+
+ /* We don't need to pre-check that the number
+ fits since we added a padding of 11
+ characters to the end of the string buffer
+ just so that we don't need to do an extra
+ check */
+ name_n += sprintf(&corename[name_n],"%d",current->pid);
+ break;
+
+ case 'U': /* UID of the process */
+ /* We don't need to pre-check that the number
+ fits since we added a padding of 11
+ characters to the end of the string buffer
+ just so that we don't need to do an extra
+ check */
+ name_n += sprintf(&corename[name_n],"%d",current->uid);
+ break;
+ }
+ break;
+
+ default: /* Anything else just pass along */
+ corename[name_n++] = core_name_format[fmt_i];
+ }
+ }
+
+ /* If we still want to append the PID and there is room, do so */
+ /* This is to preserve current behavior */
+ if (addPID && (name_n < MAX_CORE_NAME))
+ {
+ name_n += sprintf(&corename[name_n],".%d",current->pid);
+ }
+
+ /* And make sure to null terminate the string */
+ corename[name_n]='\0';
+
file = filp_open(corename, O_CREAT | 2 | O_NOFOLLOW, 0600);
if (IS_ERR(file))
goto fail;
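
The "+11" padding mentioned in the patch comments can be checked with a small userspace sketch. It assumes a platform with a 32-bit int, and the function names are invented for illustration. The widest "%d" output is 11 characters, and the appended ".%d" form is one character wider.

```c
#include <limits.h>
#include <stdio.h>

/* C99 snprintf with a NULL buffer and size 0 writes nothing and returns
   the number of characters that would have been written. */

/* Widest possible "%d" output (INT_MIN, including the minus sign). */
int width_plain(void)
{
    return snprintf(NULL, 0, "%d", INT_MIN);
}

/* Widest possible ".%d" output, the form used when the PID is appended. */
int width_dotted(void)
{
    return snprintf(NULL, 0, ".%d", INT_MIN);
}

/* Bytes needed if the widest append starts at the last index the loop
   still allows (max_core_name - 1), plus one byte for the NUL. */
int worst_case_bytes(int max_core_name)
{
    return (max_core_name - 1) + width_dotted() + 1;
}
```

With MAX_CORE_NAME at 160, worst_case_bytes(160) comes to 172, which is exactly MAX_CORE_NAME + 1 + 11. Real PIDs are positive and much shorter, so in practice there is slack.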


--
Michael Sinz ---- Worldgate Communications ---- [email protected]
A master's secrets are only as good as
the master's ability to explain them to others.


2002-02-14 16:47:47

by Andi Kleen

Subject: Re: [PATCH] Core dump file control

Michael Sinz <[email protected]> writes:
>
> This then causes core dumps to be of the format:
>
> /coredumps/whale.sinz.org-badprogram-13917.core

I had something like this for a long time on my todo list. The idea
was to set core_name_format to the name of a named pipe and have a
daemon on the other end that logs backtraces to syslogd (something a
bit like dr.watson).
The only problem is that it won't handle parallel coredumps very well
without some additional (deadlock prone) global locking or, alternatively,
support for AF_UNIX stream sockets too, which have the concept of multiple
streams over a single name.

-Andi

2002-02-14 17:09:36

by Michael Sinz

Subject: Re: [PATCH] Core dump file control

Andi Kleen wrote:
>
> Michael Sinz <[email protected]> writes:
> >
> > This then causes core dumps to be of the format:
> >
> > /coredumps/whale.sinz.org-badprogram-13917.core
>
> I had something like this for a long time on my todo list. The idea
> was to set core_name_format to the name of a named pipe and have a
> daemon on the other end that logs backtraces to syslogd (something a
> bit like dr.watson).
> The only problem is that it won't handle parallel coredumps very well
> without some additional (deadlock prone) global locking or, alternatively,
> support for AF_UNIX stream sockets too, which have the concept of multiple
> streams over a single name.

Ahh, interesting idea. I had not thought about using pipes, since
my main concern was putting core dumps somewhere that cannot fill up
useful disk and that, in the case of the diskless nodes, is writable
(everything else is not). We just grab the core dumps
later in order to figure out what went wrong. Outside of development
we should get close to zero core dumps. (Ahh, if that were only
true at this time... :-)

--
Michael Sinz ---- Worldgate Communications ---- [email protected]
A master's secrets are only as good as
the master's ability to explain them to others.

2002-02-14 17:53:52

by Michael Sinz

Subject: Re: [PATCH] Core dump file control

diff -urP linux-2.5.4/fs/exec.c linux-2.5.4-patch/fs/exec.c
--- linux-2.5.4/fs/exec.c Sun Feb 10 20:50:15 2002
+++ linux-2.5.4-patch/fs/exec.c Thu Feb 14 12:45:00 2002
@@ -35,6 +35,7 @@
#include <linux/highmem.h>
#include <linux/spinlock.h>
#include <linux/personality.h>
+#include <linux/utsname.h>
#include <linux/binfmts.h>
#define __NO_VERSION__
#include <linux/module.h>
@@ -49,6 +50,12 @@

int core_uses_pid;

+/* The format string for the core file name...
+ We default to "core" such that past behavior
+ remains unchanged. The 64 character limit is
+ arbitrary but must match the sysctl table. */
+char core_name_format[64] = {"core"};
+
static struct linux_binfmt *formats;
static rwlock_t binfmt_lock = RW_LOCK_UNLOCKED;

@@ -934,14 +941,37 @@
__MOD_DEC_USE_COUNT(old->module);
}

+/* This is the maximum expanded core file name. We use
+ a reasonable number here since we use the stack to
+ do the expansion. However, the number should be big
+ enough to handle a reasonable command name plus PID
+ and/or UID in addition to the file name part that
+ is in the core_name_format string. */
+#define MAX_CORE_NAME (160)
+
int do_coredump(long signr, struct pt_regs * regs)
{
struct linux_binfmt * binfmt;
- char corename[6+sizeof(current->comm)+10];
struct file * file;
struct inode * inode;
int retval = 0;

+ int fmt_i;
+ int name_n;
+ int addPID;
+ char *cname;
+
+ /* The +11 is here to simplify the code path. What
+ we do is always check that we are less than MAX
+ but there are times when we also need to append
+ a number (such as the PID or UID). Rather than
+ using another temporary buffer, we provide for
+ enough extra space such that those numbers can
+ be added in one gulp even if we are just under
+ the MAX_CORE_NAME. Reduction in complexity of
+ the code path means a more reliable implementation. */
+ char corename[MAX_CORE_NAME + 1 + 11];
+
lock_kernel();
binfmt = current->binfmt;
if (!binfmt || !binfmt->core_dump)
@@ -952,10 +982,92 @@
if (current->rlim[RLIMIT_CORE].rlim_cur < binfmt->min_coredump)
goto fail;

- memcpy(corename,"core.", 5);
- corename[4] = '\0';
- if (core_uses_pid || atomic_read(&current->mm->mm_users) != 1)
- sprintf(&corename[4], ".%d", current->pid);
+ /* Set this to true if we are going to add the PID. If the PID
+ already is added in the format we will end up clearing this.
+ The purpose is to provide for the old behavior of adding the
+ PID to the core file name but to not add it if it already
+ was included via the file name format pattern. */
+ addPID = (core_uses_pid || atomic_read(&current->mm->mm_users) != 1);
+
+ /* Build the core file name as needed from the format string */
+ for (fmt_i=0, name_n=0;
+ name_n < MAX_CORE_NAME && core_name_format[fmt_i];
+ fmt_i++)
+ {
+ switch (core_name_format[fmt_i])
+ {
+ case '%': /* A format character */
+ fmt_i++;
+ switch (core_name_format[fmt_i])
+ {
+ case '%': /* The way we get this character */
+ corename[name_n++] = '%';
+ break;
+
+ case 'N': /* Process name */
+ cname=current->comm;
+
+ /* Only copy as much as will fit within the
+ MAX_CORE_NAME */
+ while (*cname && (name_n < MAX_CORE_NAME))
+ {
+ if (*cname != '/')
+ corename[name_n++] = *cname;
+ cname++;
+ }
+ break;
+
+ case 'H': /* Node name */
+ cname=system_utsname.nodename;
+
+ /* Only copy as much as will fit within the
+ MAX_CORE_NAME */
+ while (*cname && (name_n < MAX_CORE_NAME))
+ {
+ if (*cname != '/')
+ corename[name_n++] = *cname;
+ cname++;
+ }
+ break;
+
+ case 'P': /* Process PID */
+ /* Since we are adding it here, don't append */
+ addPID=0;
+
+ /* We don't need to pre-check that the number
+ fits since we added a padding of 11
+ characters to the end of the string buffer
+ just so that we don't need to do an extra
+ check */
+ name_n += sprintf(&corename[name_n],"%d",current->pid);
+ break;
+
+ case 'U': /* UID of the process */
+ /* We don't need to pre-check that the number
+ fits since we added a padding of 11
+ characters to the end of the string buffer
+ just so that we don't need to do an extra
+ check */
+ name_n += sprintf(&corename[name_n],"%d",current->uid);
+ break;
+ }
+ break;
+
+ default: /* Anything else just pass along */
+ corename[name_n++] = core_name_format[fmt_i];
+ }
+ }
+
+ /* If we still want to append the PID and there is room, do so */
+ /* This is to preserve current behavior */
+ if (addPID && (name_n < MAX_CORE_NAME))
+ {
+ name_n += sprintf(&corename[name_n],".%d",current->pid);
+ }
+
+ /* And make sure to null terminate the string */
+ corename[name_n]='\0';
+
file = filp_open(corename, O_CREAT | 2 | O_NOFOLLOW, 0600);
if (IS_ERR(file))
goto fail;
diff -urP linux-2.5.4/include/linux/sysctl.h linux-2.5.4-patch/include/linux/sysctl.h
--- linux-2.5.4/include/linux/sysctl.h Sun Feb 10 20:50:07 2002
+++ linux-2.5.4-patch/include/linux/sysctl.h Thu Feb 14 12:43:52 2002
@@ -124,6 +124,7 @@
KERN_CORE_USES_PID=52, /* int: use core or core.%pid */
KERN_TAINTED=53, /* int: various kernel tainted flags */
KERN_CADPID=54, /* int: PID of the process to notify on CAD */
+ KERN_CORE_NAME_FORMAT=55, /* string: core file name format string */
};


diff -urP linux-2.5.4/kernel/sysctl.c linux-2.5.4-patch/kernel/sysctl.c
--- linux-2.5.4/kernel/sysctl.c Sun Feb 10 20:50:09 2002
+++ linux-2.5.4-patch/kernel/sysctl.c Thu Feb 14 12:43:52 2002
@@ -50,6 +50,7 @@
extern int max_queued_signals;
extern int sysrq_enabled;
extern int core_uses_pid;
+extern char core_name_format[];
extern int cad_pid;

/* this is needed for the proc_dointvec_minmax for [fs_]overflow UID and GID */
@@ -172,6 +173,8 @@
0644, NULL, &proc_dointvec},
{KERN_CORE_USES_PID, "core_uses_pid", &core_uses_pid, sizeof(int),
0644, NULL, &proc_dointvec},
+ {KERN_CORE_NAME_FORMAT, "core_name_format", core_name_format, 64,
+ 0644, NULL, &proc_doutsstring, &sysctl_string},
{KERN_TAINTED, "tainted", &tainted, sizeof(int),
0644, NULL, &proc_dointvec},
{KERN_CAP_BSET, "cap-bound", &cap_bset, sizeof(kernel_cap_t),


Attachments:
myPatch-2.5 (6.12 kB)

2002-02-15 11:41:05

by Jakob Oestergaard

Subject: Re: [PATCH] Core dump file control

On Thu, Feb 14, 2002 at 11:10:55AM -0500, Michael Sinz wrote:
> I have, for a long time, wished that Linux had a way to specify where
> core dumps are stored and what the name of the core dump is. Now that
> I have been building large linux clusters with many diskless nodes,
> this need has become even more important.
...

I just wanted to throw in my 0.02 Euro on this one:

I have not tested your patch yet - but this functionality is *very*
important to my company as well.

Anyone developing applications with multiple processes will benefit
significantly from having core files named differently than just "core".

A patch was included in the kernel some time ago, to allow the appending of the
PID - however, this is not really good enough. It's better than nothing, but
it's not good.

What I want is "core.[process name]" eventually with a ".[pid]" appended. A
flexible scheme like your patch implements is very nice. Actually having
the core files in CWD is fine for me - I mainly care about the file name.

Furthermore, the patch that went in earlier is *horrible* code. Let me give a
few examples:

...
char corename[6+sizeof(current->comm)+10];
...
memcpy(corename,"core.", 5);
corename[4] = '\0';
...
if (core_uses_pid || atomic_read(&current->mm->mm_users) != 1)
sprintf(&corename[4], ".%d", current->pid);


Enough said.
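
For readers who have not traced the quoted lines, here is a userspace sketch of what they do. The function name old_corename and its parameters are made up for illustration, and append_pid stands in for the core_uses_pid / mm_users condition.

```c
#include <stdio.h>
#include <string.h>

/* Reproduces the quoted naming logic outside the kernel.
   buf needs room for "core." plus the digits of pid and a NUL. */
void old_corename(char *buf, int append_pid, int pid)
{
    memcpy(buf, "core.", 5);            /* copies 'c','o','r','e','.' without a NUL */
    buf[4] = '\0';                      /* overwrites the '.' again, leaving "core" */
    if (append_pid)
        sprintf(&buf[4], ".%d", pid);   /* writes the '.' back, then the PID */
}
```

The '.' is copied, erased, and then written again by sprintf. In the lines quoted, the buffer is also sized using sizeof(current->comm) even though the command name is never copied into it.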

--
................................................................
: [email protected] : And I see the elder races, :
:.........................: putrid forms of man :
: Jakob Østergaard : See him rise and claim the earth, :
: OZ9ABN : his downfall is at hand. :
:.........................:............{Konkhra}...............:

2002-02-15 11:45:53

by Martin Dalecki

Subject: Re: [PATCH] Core dump file control

Jakob Østergaard wrote:

>On Thu, Feb 14, 2002 at 11:10:55AM -0500, Michael Sinz wrote:
>
>>I have, for a long time, wished that Linux had a way to specify where
>>core dumps are stored and what the name of the core dump is. Now that
>>I have been building large linux clusters with many diskless nodes,
>>this need has become even more important.
>>
>...
>
>I just wanted to throw in my 0.02 Euro on this one:
>
>I have not tested your patch yet - but this functionality is *very*
>important to my company as well.
>
>Anyone developing applications with multiple processes will benefit
>significantly from having core files named differently than just "core".
>
>A patch was included in the kernel some time ago, to allow the appending of the
>PID - however, this is not really good enough. It's better than nothing, but
>it's not good.
>
>What I want is "core.[process name]" eventually with a ".[pid]" appended. A
>flexible scheme like your patch implements is very nice. Actually having
>the core files in CWD is fine for me - I mainly care about the file name.
>

Please execute the size command on the core file:

size core

to see why this isn't needed.

>

2002-02-15 12:07:19

by Michael Sinz

Subject: Re: [PATCH] Core dump file control

Jakob Østergaard wrote:
>
> On Thu, Feb 14, 2002 at 11:10:55AM -0500, Michael Sinz wrote:
> > I have, for a long time, wished that Linux had a way to specify where
> > core dumps are stored and what the name of the core dump is. Now that
> > I have been building large linux clusters with many diskless nodes,
> > this need has become even more important.
> ...
>
> I just wanted to throw in my 0.02 Euro on this one:
>
> I have not tested your patch yet - but this functionality is *very*
> important to my company as well.
>
> Anyone developing applications with multiple processes will benefit
> significantly from having core files named differently than just "core".

That was my first need (%N.core is what I used on a different platform).
However, being able to specify a few more items really provides much more
flexibility.

> A patch was included in the kernel some time ago, to allow the appending of the
> PID - however, this is not really good enough. It's better than nothing, but
> it's not good.
>
> What I want is "core.[process name]" eventually with a ".[pid]" appended. A
> flexible scheme like your patch implements is very nice. Actually having
> the core files in CWD is fine for me - I mainly care about the file name.
>
> Furthermore, the patch that went in earlier is *horrible* code. Let me give a
> few examples:

I had noticed that - it was rather poor - and rather strange. I hope
people like my patch a bit better.

--
Michael Sinz ---- Worldgate Communications ---- [email protected]
A master's secrets are only as good as
the master's ability to explain them to others.

2002-02-15 12:11:29

by Michael Sinz

Subject: Re: [PATCH] Core dump file control

Martin Dalecki wrote:
>
> Jakob Østergaard wrote:
>
> >On Thu, Feb 14, 2002 at 11:10:55AM -0500, Michael Sinz wrote:
> >
> >>I have, for a long time, wished that Linux had a way to specify where
> >>core dumps are stored and what the name of the core dump is. Now that
> >>I have been building large linux clusters with many diskless nodes,
> >>this need has become even more important.
> >>
> >...
> >
> >I just wanted to throw in my 0.02 Euro on this one:
> >
> >I have not tested your patch yet - but this functionality is *very*
> >important to my company as well.
> >
> >Anyone developing applications with multiple processes will benefit
> >significantly from having core files named differently than just "core".
> >
> >A patch was included in the kernel some time ago, to allow the appending of the
> >PID - however, this is not really good enough. It's better than nothing, but
> >it's not good.
> >
> >What I want is "core.[process name]" eventually with a ".[pid]" appended. A
> >flexible scheme like your patch implements is very nice. Actually having
> >the core files in CWD is fine for me - I mainly care about the file name.
>
> Please execute the size command on the core file:
>
> size core
>
> to see why this isn't needed.

Huh? explain... If I have multiple programs running and one cores,
would I not want to have the core file name be based on the program that
cored? Linux had the PID option (kernel.core_uses_pid) specifically so
that you could have multiple process core files.

Plus, I need to be able to redirect core files to a sane location in the
cluster (which is a whole other kettle of fish).

--
Michael Sinz ---- Worldgate Communications ---- [email protected]
A master's secrets are only as good as
the master's ability to explain them to others.

2002-02-15 12:13:39

by Jakob Oestergaard

Subject: Re: [PATCH] Core dump file control

On Fri, Feb 15, 2002 at 12:44:42PM +0100, Martin Dalecki wrote:
> Jakob Østergaard wrote:
...
> >
> >What I want is "core.[process name]" eventually with a ".[pid]" appended. A
> >flexible scheme like your patch implements is very nice. Actually having
> >the core files in CWD is fine for me - I mainly care about the file name.
> >
>
> Please execute the size command on the core file:
>
> size core
>
> to see why this isn't needed.
>

Huh ?

I suppose you mean, that I can get the name of the executable that caused the
core dump, when running size - right ?

Well, you can do that easier with the file command.

But that doesn't prevent my 7 other processes from overwriting the core file
of the 8th process, which was the first one to crash. Multi-process systems
can, on occasion, produce such "domino dumps". Separate names are a *must have*.

And having process names is nicer than having PIDs - I don't mind if my core
files are over-written on subsequent runs, actually it's nice (keeps the disks
from filling up).

--
................................................................
: [email protected] : And I see the elder races, :
:.........................: putrid forms of man :
: Jakob Østergaard : See him rise and claim the earth, :
: OZ9ABN : his downfall is at hand. :
:.........................:............{Konkhra}...............:

2002-02-15 12:23:19

by Martin Dalecki

Subject: Re: [PATCH] Core dump file control

Jakob Østergaard wrote:

>On Fri, Feb 15, 2002 at 12:44:42PM +0100, Martin Dalecki wrote:
>
>>Jakob Østergaard wrote:
>>
>...
>
>>>What I want is "core.[process name]" eventually with a ".[pid]" appended. A
>>>flexible scheme like your patch implements is very nice. Actually having
>>>the core files in CWD is fine for me - I mainly care about the file name.
>>>
>>Please execute the size command on the core file:
>>
>>size core
>>
>>to see why this isn't needed.
>>
>
>Huh ?
>
>I suppose you mean, that I can get the name of the executable that caused the
>core dump, when running size - right ?
>
>Well, you can do that easier with the file command.
>
>But that doesn't prevent my 7 other processes from overwriting the core file
>of the 8th process, which was the first one to crash. Multi-process systems
>can, on occasion, produce such "domino dumps". Separate names are a *must have*.
>
This point I fully agree with. And in fact 2.4.17 already does it the
core.{pid} way.

>And having process names is nicer than having PIDs - I don't mind if my core
>files are over-written on subsequent runs, actually it's nice (keeps the disks
>from filling up).
>
They can get long and annoying... They are not suitable for short name
filesystems... They provide a good hint for deliberate overwrites... and
so on. Basically I think this would be too much of a good thing.

>



2002-02-15 12:33:19

by Jakob Oestergaard

Subject: Re: [PATCH] Core dump file control

On Fri, Feb 15, 2002 at 01:22:18PM +0100, Martin Dalecki wrote:
> Jakob Østergaard wrote:
..
>
> >And having process names is nicer than having PIDs - I don't mind if my core
> >files are over-written on subsequent runs, actually it's nice (keeps the disks
> >from filling up).
> >
> They can get long and annoying... They are not suitable for short name
> filesystems... They provide a good hint for deliberate overwrites... and
> so on. Basically I think this would be too much of a good thing.

That is your opinion, and I disagree.

And that is *exactly* why the suggested patch is so great - we just keep
the "core" name the default, and allow the user to set the name as he
pleases.


--
................................................................
: [email protected] : And I see the elder races, :
:.........................: putrid forms of man :
: Jakob Østergaard : See him rise and claim the earth, :
: OZ9ABN : his downfall is at hand. :
:.........................:............{Konkhra}...............:

2002-02-15 12:55:52

by Michael Sinz

Subject: Re: [PATCH] Core dump file control

Martin Dalecki wrote:
>
> Jakob Østergaard wrote:
>
> >On Fri, Feb 15, 2002 at 12:44:42PM +0100, Martin Dalecki wrote:
> >
> >>Jakob Østergaard wrote:
> >>
> >...
> >
> >>>What I want is "core.[process name]" eventually with a ".[pid]" appended. A
> >>>flexible scheme like your patch implements is very nice. Actually having
> >>>the core files in CWD is fine for me - I mainly care about the file name.
> >>>
> >>Please execute the size command on the core file:
> >>
> >>size core
> >>
> >>to see why this isn't needed.
> >>
> >
> >Huh ?
> >
> >I suppose you mean, that I can get the name of the executable that caused the
> >core dump, when running size - right ?
> >
> >Well, you can do that easier with the file command.
> >
> >But that doesn't prevent my 7 other processes from overwriting the core file
> >of the 8th process, which was the first one to crash. Multi-process systems
> >can, on occasion, produce such "domino dumps". Separate names are a *must have*.
> >
> This point I fully agree with. And in fact 2.4.17 already does it the
> core.{pid} way.

This is still not a very good way to control the names.

What I have is a cluster of nearly 100 machines - all but one of them have
no disk. When something goes down on one of the machines, I would like to
know (a) what it was that went down and (b) which machine it was on.
I would also like to have the core files someplace that is writable
(everything but the /coredumps directory is read-only, apart from the
local tmpfs mounts for /var and /etc).
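
With a format such as /coredumps/%H-%N-%P.core, both answers can be read back out of the file name. The following is a hedged sketch of such a parser, not part of the patch: the names core_info and parse_core_name and the buffer sizes are assumptions made for illustration. It splits from the right, so hyphens in the host name are fine, but a hyphen in the command name would be misread as part of the host.

```c
#include <string.h>

struct core_info {
    char host[65];   /* assumed upper bound for a node name */
    char name[17];   /* assumed upper bound for a command name */
    int  pid;
};

/* Return a pointer to the last '-' in [begin, end), or NULL if none. */
static const char *last_dash(const char *begin, const char *end)
{
    while (end > begin)
        if (*--end == '-')
            return end;
    return NULL;
}

/* Parse "<dir>/<host>-<name>-<pid>.core". Returns 0 on success, -1 otherwise. */
int parse_core_name(const char *path, struct core_info *ci)
{
    const char *base = strrchr(path, '/');
    const char *dot, *pid_dash, *name_dash, *p;
    size_t hlen, nlen, plen;

    base = base ? base + 1 : path;           /* drop the directory part */
    dot = strrchr(base, '.');
    if (!dot || strcmp(dot, ".core") != 0)   /* must end in ".core" */
        return -1;
    pid_dash = last_dash(base, dot);
    if (!pid_dash)
        return -1;
    name_dash = last_dash(base, pid_dash);
    if (!name_dash)
        return -1;

    hlen = (size_t)(name_dash - base);
    nlen = (size_t)(pid_dash - name_dash - 1);
    plen = (size_t)(dot - pid_dash - 1);
    if (hlen == 0 || hlen >= sizeof ci->host ||
        nlen == 0 || nlen >= sizeof ci->name ||
        plen == 0 || plen > 9)               /* at most 9 digits, so int cannot overflow */
        return -1;

    memcpy(ci->host, base, hlen);
    ci->host[hlen] = '\0';
    memcpy(ci->name, name_dash + 1, nlen);
    ci->name[nlen] = '\0';
    ci->pid = 0;
    for (p = pid_dash + 1; p < dot; p++) {
        if (*p < '0' || *p > '9')
            return -1;
        ci->pid = ci->pid * 10 + (*p - '0');
    }
    return 0;
}
```

A format that uses a separator which cannot appear in either field would avoid the ambiguity entirely. The hyphen is used here only because it matches the example format.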

> >And having process names is nicer than having PIDs - I don't mind if my core
> >files are over-written on subsequent runs, actually it's nice (keeps the disks
> >from filling up).
>
> They can get long and annoying... They are not suitable for short name
> filesystems... They provide a good hint for deliberate overwrites... and
> so on. Basically I think this would be too much of a good thing.

I was very careful to keep that behavior consistent with 2.4.17. That
is, if you do nothing different with kernel.core_name_format then it
will work just as before. And only root can change that sysctl.

As to "overwrites" and the like, I have far fewer overwrites with almost
any pattern than with just plain "core". And I can support features
that many people have wanted (%N.core being a very popular construct).

--
Michael Sinz ---- Worldgate Communications ---- [email protected]
A master's secrets are only as good as
the master's ability to explain them to others.