2011-03-10 15:57:24

by Phil Carmody

[permalink] [raw]
Subject: [PATCH 1/1] sysfs: add more info to the oops dump

If we're going to remember which file we accessed, we might as well
also remember whether it was a read or a write, and if the latter,
some indication of what was written.

e.g.
$ echo 1 > /sys/kernel/slab/:at-0000064/sanity_checks
$ echo c > /proc/sysrq-trigger
...
[ 112.457580] last sysfs file (w): /sys/kernel/slab/:at-0000064/sanity_checks
[ 112.464569] written: 1

Signed-off-by: Phil Carmody <[email protected]>
---
fs/sysfs/file.c | 21 ++++++++++++++++++++-
1 files changed, 20 insertions(+), 1 deletions(-)

diff --git a/fs/sysfs/file.c b/fs/sysfs/file.c
index da3fefe..281e4dd 100644
--- a/fs/sysfs/file.c
+++ b/fs/sysfs/file.c
@@ -26,9 +26,18 @@

/* used in crash dumps to help with debugging */
static char last_sysfs_file[PATH_MAX];
+static char last_sysfs_write[16];
+static int last_sysfs_write_len; /* magic value < 0 => read */
void sysfs_printk_last_file(void)
{
- printk(KERN_EMERG "last sysfs file: %s\n", last_sysfs_file);
+ printk(KERN_EMERG "last sysfs file (%c): %s\n",
+ (last_sysfs_write_len < 0) ? 'r' : 'w',
+ last_sysfs_file);
+ if (last_sysfs_write_len >= sizeof(last_sysfs_write))
+ printk(KERN_EMERG " written: %s...(%d chars)\n",
+ last_sysfs_write, last_sysfs_write_len);
+ else if (last_sysfs_write_len > 0)
+ printk(KERN_EMERG " written: %s\n", last_sysfs_write);
}

/*
@@ -200,12 +209,19 @@ flush_write_buffer(struct dentry * dentry, struct sysfs_buffer * buffer, size_t
struct sysfs_dirent *attr_sd = dentry->d_fsdata;
struct kobject *kobj = attr_sd->s_parent->s_dir.kobj;
const struct sysfs_ops * ops = buffer->ops;
+ int copylen = min(count, sizeof(last_sysfs_write) - 1);
int rc;

/* need attr_sd for attr and ops, its parent for kobj */
if (!sysfs_get_active(attr_sd))
return -ENODEV;

+ while (copylen > 0 && buffer->page[copylen-1] == '\n')
+ --copylen; /* never print trailing \n's */
+ memcpy(last_sysfs_write, buffer->page, copylen);
+ last_sysfs_write[copylen] = '\0';
+ last_sysfs_write_len = count;
+
rc = ops->store(kobj, attr_sd->s_attr.attr, buffer->page, count);

sysfs_put_active(attr_sd);
@@ -363,6 +379,9 @@ static int sysfs_open_file(struct inode *inode, struct file *file)
if (file->f_mode & FMODE_WRITE) {
if (!(inode->i_mode & S_IWUGO) || !ops->store)
goto err_out;
+ last_sysfs_write_len = 0;
+ } else {
+ last_sysfs_write_len = -1; /* magic value */
}

/* File needs read support.
--
1.7.2.rc1.37.gf8c40


2011-03-10 17:05:25

by Greg KH

[permalink] [raw]
Subject: Re: [PATCH 1/1] sysfs: add more info to the oops dump

On Thu, Mar 10, 2011 at 05:53:08PM +0200, Phil Carmody wrote:
> If we're going to remember which file we accessed, we might as well
> also remember whether it was a read or a write, and if the latter,
> some indication of what was written.
>
> e.g.
> $ echo 1 > /sys/kernel/slab/:at-0000064/sanity_checks
> $ echo c > /proc/sysrq-trigger
> ...
> [ 112.457580] last sysfs file (w): /sys/kernel/slab/:at-0000064/sanity_checks
> [ 112.464569] written: 1

Has this actually helped you out? I've been thinking of removing this
line in the dmesg entirely as I haven't seen it help in a very long time
to track anything down.

>
> Signed-off-by: Phil Carmody <[email protected]>
> ---
> fs/sysfs/file.c | 21 ++++++++++++++++++++-
> 1 files changed, 20 insertions(+), 1 deletions(-)
>
> diff --git a/fs/sysfs/file.c b/fs/sysfs/file.c
> index da3fefe..281e4dd 100644
> --- a/fs/sysfs/file.c
> +++ b/fs/sysfs/file.c
> @@ -26,9 +26,18 @@
>
> /* used in crash dumps to help with debugging */
> static char last_sysfs_file[PATH_MAX];
> +static char last_sysfs_write[16];
> +static int last_sysfs_write_len; /* magic value < 0 => read */
> void sysfs_printk_last_file(void)
> {
> - printk(KERN_EMERG "last sysfs file: %s\n", last_sysfs_file);
> + printk(KERN_EMERG "last sysfs file (%c): %s\n",
> + (last_sysfs_write_len < 0) ? 'r' : 'w',
> + last_sysfs_file);
> + if (last_sysfs_write_len >= sizeof(last_sysfs_write))
> + printk(KERN_EMERG " written: %s...(%d chars)\n",
> + last_sysfs_write, last_sysfs_write_len);
> + else if (last_sysfs_write_len > 0)
> + printk(KERN_EMERG " written: %s\n", last_sysfs_write);
> }
>
> /*
> @@ -200,12 +209,19 @@ flush_write_buffer(struct dentry * dentry, struct sysfs_buffer * buffer, size_t
> struct sysfs_dirent *attr_sd = dentry->d_fsdata;
> struct kobject *kobj = attr_sd->s_parent->s_dir.kobj;
> const struct sysfs_ops * ops = buffer->ops;
> + int copylen = min(count, sizeof(last_sysfs_write) - 1);
> int rc;
>
> /* need attr_sd for attr and ops, its parent for kobj */
> if (!sysfs_get_active(attr_sd))
> return -ENODEV;
>
> + while (copylen > 0 && buffer->page[copylen-1] == '\n')
> + --copylen; /* never print trailing \n's */
> + memcpy(last_sysfs_write, buffer->page, copylen);
> + last_sysfs_write[copylen] = '\0';
> + last_sysfs_write_len = count;
> +
> rc = ops->store(kobj, attr_sd->s_attr.attr, buffer->page, count);
>
> sysfs_put_active(attr_sd);
> @@ -363,6 +379,9 @@ static int sysfs_open_file(struct inode *inode, struct file *file)
> if (file->f_mode & FMODE_WRITE) {
> if (!(inode->i_mode & S_IWUGO) || !ops->store)
> goto err_out;
> + last_sysfs_write_len = 0;
> + } else {
> + last_sysfs_write_len = -1; /* magic value */

"magic" values are bad, please use a #define or something else that
makes more sense to those looking at the code in 5 years.

thanks,

greg k-h

2011-03-10 18:37:04

by Phil Carmody

[permalink] [raw]
Subject: Re: [PATCH 1/1] sysfs: add more info to the oops dump

On 10/03/11 08:25 -0800, ext Greg KH wrote:
> On Thu, Mar 10, 2011 at 05:53:08PM +0200, Phil Carmody wrote:
> > If we're going to remember which file we accessed, we might as well
> > also remember whether it was a read or a write, and if the latter,
> > some indication of what was written.
> >
> > e.g.
> > $ echo 1 > /sys/kernel/slab/:at-0000064/sanity_checks
> > $ echo c > /proc/sysrq-trigger
> > ...
> > [ 112.457580] last sysfs file (w): /sys/kernel/slab/:at-0000064/sanity_checks
> > [ 112.464569] written: 1
>
> Has this actually helped you out? I've been thinking of removing this
> line in the dmesg entirely as I haven't seen it help in a very long time
> to track anything down.

I'm glad I started my commit message with that 'if', that tells you
something. I have considered it, in its original form, not particularly
helpful too. But I blamed that on it not telling me enough. Today I saw
an oops related to a sysfs write, and this just patch basically wrote
itself instantly. But it's not stood the test of time, that's for sure.

Phil

> >
> > Signed-off-by: Phil Carmody <[email protected]>
> > ---
> > fs/sysfs/file.c | 21 ++++++++++++++++++++-
> > 1 files changed, 20 insertions(+), 1 deletions(-)
> >
> > diff --git a/fs/sysfs/file.c b/fs/sysfs/file.c
> > index da3fefe..281e4dd 100644
> > --- a/fs/sysfs/file.c
> > +++ b/fs/sysfs/file.c
> > @@ -26,9 +26,18 @@
> >
> > /* used in crash dumps to help with debugging */
> > static char last_sysfs_file[PATH_MAX];
> > +static char last_sysfs_write[16];
> > +static int last_sysfs_write_len; /* magic value < 0 => read */
> > void sysfs_printk_last_file(void)
> > {
> > - printk(KERN_EMERG "last sysfs file: %s\n", last_sysfs_file);
> > + printk(KERN_EMERG "last sysfs file (%c): %s\n",
> > + (last_sysfs_write_len < 0) ? 'r' : 'w',
> > + last_sysfs_file);
> > + if (last_sysfs_write_len >= sizeof(last_sysfs_write))
> > + printk(KERN_EMERG " written: %s...(%d chars)\n",
> > + last_sysfs_write, last_sysfs_write_len);
> > + else if (last_sysfs_write_len > 0)
> > + printk(KERN_EMERG " written: %s\n", last_sysfs_write);
> > }
> >
> > /*
> > @@ -200,12 +209,19 @@ flush_write_buffer(struct dentry * dentry, struct sysfs_buffer * buffer, size_t
> > struct sysfs_dirent *attr_sd = dentry->d_fsdata;
> > struct kobject *kobj = attr_sd->s_parent->s_dir.kobj;
> > const struct sysfs_ops * ops = buffer->ops;
> > + int copylen = min(count, sizeof(last_sysfs_write) - 1);
> > int rc;
> >
> > /* need attr_sd for attr and ops, its parent for kobj */
> > if (!sysfs_get_active(attr_sd))
> > return -ENODEV;
> >
> > + while (copylen > 0 && buffer->page[copylen-1] == '\n')
> > + --copylen; /* never print trailing \n's */
> > + memcpy(last_sysfs_write, buffer->page, copylen);
> > + last_sysfs_write[copylen] = '\0';
> > + last_sysfs_write_len = count;
> > +
> > rc = ops->store(kobj, attr_sd->s_attr.attr, buffer->page, count);
> >
> > sysfs_put_active(attr_sd);
> > @@ -363,6 +379,9 @@ static int sysfs_open_file(struct inode *inode, struct file *file)
> > if (file->f_mode & FMODE_WRITE) {
> > if (!(inode->i_mode & S_IWUGO) || !ops->store)
> > goto err_out;
> > + last_sysfs_write_len = 0;
> > + } else {
> > + last_sysfs_write_len = -1; /* magic value */
>
> "magic" values are bad, please use a #define or something else that
> makes more sense to those looking at the code in 5 years.
>
> thanks,
>
> greg k-h

2011-03-10 19:04:27

by Greg KH

[permalink] [raw]
Subject: Re: [PATCH 1/1] sysfs: add more info to the oops dump

On Thu, Mar 10, 2011 at 08:32:40PM +0200, Phil Carmody wrote:
> On 10/03/11 08:25 -0800, ext Greg KH wrote:
> > On Thu, Mar 10, 2011 at 05:53:08PM +0200, Phil Carmody wrote:
> > > If we're going to remember which file we accessed, we might as well
> > > also remember whether it was a read or a write, and if the latter,
> > > some indication of what was written.
> > >
> > > e.g.
> > > $ echo 1 > /sys/kernel/slab/:at-0000064/sanity_checks
> > > $ echo c > /proc/sysrq-trigger
> > > ...
> > > [ 112.457580] last sysfs file (w): /sys/kernel/slab/:at-0000064/sanity_checks
> > > [ 112.464569] written: 1
> >
> > Has this actually helped you out? I've been thinking of removing this
> > line in the dmesg entirely as I haven't seen it help in a very long time
> > to track anything down.
>
> I'm glad I started my commit message with that 'if', that tells you
> something. I have considered it, in its original form, not particularly
> helpful too. But I blamed that on it not telling me enough. Today I saw
> an oops related to a sysfs write, and this just patch basically wrote
> itself instantly. But it's not stood the test of time, that's for sure.

But would the trace-back also have shown you the problem? If not, then
that's a good reason to keep this feature, and to add your patch.

But, I would suggest keeping it on one line, adding more lines to oops
messages is not good if at all possible.

thanks,

greg k-h

2011-03-10 21:14:58

by Phil Carmody

[permalink] [raw]
Subject: [PATCH 1/1] sysfs: add more info to the oops dump

If we're going to remember which file we accessed, we might as well
also remember whether it was a read or a write, and if the latter,
some indication of what was written.

Changes in V2: restrict to just one line.

e.g.
$ echo sesquipedalianism > /sys/module/vt/parameters/default_utf8
$ echo c > /proc/sysrq-trigger
...
[ 57.522583] last sysfs file (w): /sys/module/vt/parameters/default_utf8 written: sesquipedaliani...(18 chars)

Signed-off-by: Phil Carmody <[email protected]>
---
fs/sysfs/file.c | 22 +++++++++++++++++++++-
1 files changed, 21 insertions(+), 1 deletions(-)

diff --git a/fs/sysfs/file.c b/fs/sysfs/file.c
index da3fefe..fa5b21a 100644
--- a/fs/sysfs/file.c
+++ b/fs/sysfs/file.c
@@ -26,9 +26,19 @@

/* used in crash dumps to help with debugging */
static char last_sysfs_file[PATH_MAX];
+static char last_sysfs_write[16];
+static int last_sysfs_write_len; /* magic value < 0 => read */
void sysfs_printk_last_file(void)
{
- printk(KERN_EMERG "last sysfs file: %s\n", last_sysfs_file);
+ printk(KERN_EMERG "last sysfs file (%c): %s%c",
+ (last_sysfs_write_len < 0) ? 'r' : 'w',
+ last_sysfs_file,
+ last_sysfs_write_len > 0 ? ' ' : '\n');
+ if (last_sysfs_write_len >= sizeof(last_sysfs_write))
+ printk(KERN_CONT " written: %s...(%d chars)\n",
+ last_sysfs_write, last_sysfs_write_len);
+ else if (last_sysfs_write_len > 0)
+ printk(KERN_CONT " written: %s\n", last_sysfs_write);
}

/*
@@ -200,12 +210,19 @@ flush_write_buffer(struct dentry * dentry, struct sysfs_buffer * buffer, size_t
struct sysfs_dirent *attr_sd = dentry->d_fsdata;
struct kobject *kobj = attr_sd->s_parent->s_dir.kobj;
const struct sysfs_ops * ops = buffer->ops;
+ int copylen = min(count, sizeof(last_sysfs_write) - 1);
int rc;

/* need attr_sd for attr and ops, its parent for kobj */
if (!sysfs_get_active(attr_sd))
return -ENODEV;

+ while (copylen > 0 && buffer->page[copylen-1] == '\n')
+ --copylen; /* never print trailing \n's */
+ memcpy(last_sysfs_write, buffer->page, copylen);
+ last_sysfs_write[copylen] = '\0';
+ last_sysfs_write_len = count;
+
rc = ops->store(kobj, attr_sd->s_attr.attr, buffer->page, count);

sysfs_put_active(attr_sd);
@@ -363,6 +380,9 @@ static int sysfs_open_file(struct inode *inode, struct file *file)
if (file->f_mode & FMODE_WRITE) {
if (!(inode->i_mode & S_IWUGO) || !ops->store)
goto err_out;
+ last_sysfs_write_len = 0;
+ } else {
+ last_sysfs_write_len = -1; /* magic value */
}

/* File needs read support.
--
1.7.2.rc1.37.gf8c40

2011-03-10 21:28:19

by Greg KH

[permalink] [raw]
Subject: Re: [PATCH 1/1] sysfs: add more info to the oops dump

On Thu, Mar 10, 2011 at 11:10:40PM +0200, Phil Carmody wrote:
> If we're going to remember which file we accessed, we might as well
> also remember whether it was a read or a write, and if the latter,
> some indication of what was written.
>
> Changes in V2: restrict to just one line.

Looks nicer, but you forgot to address the "magic value" trick you used
that I mentioned last time. Please fix that up.

thanks,

greg k-h

2011-03-10 22:31:46

by Phil Carmody

[permalink] [raw]
Subject: [PATCHv3 1/1] sysfs: add more info to the oops dump

If we're going to remember which file we accessed, we might as well
also remember whether it was a read or a write, and if the latter,
some indication of what was written.

e.g.
$ echo sesquipedalianism > /sys/module/vt/parameters/default_utf8
$ echo c > /proc/sysrq-trigger
...
[ 57.522583] last sysfs file (w): /sys/module/vt/parameters/default_utf8 written: sesquipedaliani...(18 chars)

Signed-off-by: Phil Carmody <[email protected]>
---
fs/sysfs/file.c | 24 +++++++++++++++++++++++-
1 files changed, 23 insertions(+), 1 deletions(-)

diff --git a/fs/sysfs/file.c b/fs/sysfs/file.c
index da3fefe..88ac53d 100644
--- a/fs/sysfs/file.c
+++ b/fs/sysfs/file.c
@@ -26,9 +26,21 @@

/* used in crash dumps to help with debugging */
static char last_sysfs_file[PATH_MAX];
+static char last_sysfs_write[16];
+/* To indicate the last operation was a read, use an impossible length */
+#define SYSFS_NOT_A_WRITE -1
+static int last_sysfs_write_len;
void sysfs_printk_last_file(void)
{
- printk(KERN_EMERG "last sysfs file: %s\n", last_sysfs_file);
+ printk(KERN_EMERG "last sysfs file (%c): %s%c",
+ (last_sysfs_write_len == SYSFS_NOT_A_WRITE) ? 'r' : 'w',
+ last_sysfs_file,
+ last_sysfs_write_len > 0 ? ' ' : '\n');
+ if (last_sysfs_write_len >= (int)sizeof(last_sysfs_write))
+ printk(KERN_CONT " written: %s...(%d chars)\n",
+ last_sysfs_write, last_sysfs_write_len);
+ else if (last_sysfs_write_len > 0)
+ printk(KERN_CONT " written: %s\n", last_sysfs_write);
}

/*
@@ -200,12 +212,19 @@ flush_write_buffer(struct dentry * dentry, struct sysfs_buffer * buffer, size_t
struct sysfs_dirent *attr_sd = dentry->d_fsdata;
struct kobject *kobj = attr_sd->s_parent->s_dir.kobj;
const struct sysfs_ops * ops = buffer->ops;
+ int copylen = min(count, sizeof(last_sysfs_write) - 1);
int rc;

/* need attr_sd for attr and ops, its parent for kobj */
if (!sysfs_get_active(attr_sd))
return -ENODEV;

+ while (copylen > 0 && buffer->page[copylen-1] == '\n')
+ --copylen; /* never print trailing \n's */
+ memcpy(last_sysfs_write, buffer->page, copylen);
+ last_sysfs_write[copylen] = '\0';
+ last_sysfs_write_len = count;
+
rc = ops->store(kobj, attr_sd->s_attr.attr, buffer->page, count);

sysfs_put_active(attr_sd);
@@ -363,6 +382,9 @@ static int sysfs_open_file(struct inode *inode, struct file *file)
if (file->f_mode & FMODE_WRITE) {
if (!(inode->i_mode & S_IWUGO) || !ops->store)
goto err_out;
+ last_sysfs_write_len = 0;
+ } else {
+ last_sysfs_write_len = SYSFS_NOT_A_WRITE;
}

/* File needs read support.
--
1.7.2.rc1.37.gf8c40

2011-03-10 22:31:56

by Phil Carmody

[permalink] [raw]
Subject: [PATCHv3 0/1] sysfs: add more info to the oops dump


> But would the trace-back also have shown you the problem? If not, then
> that's a good reason to keep this feature, and to add your patch.

In this sample size of one, the write handler was so overloaded
the data written was very valuable compared to the backtrace.

> Looks nicer, but you forgot to address the "magic value" trick you used
> that I mentioned last time. Please fix that up.

My bad, sorry. I also realised that I'd never actually tested the read
case, which highlighted a mixed-sign comparison issue. In a linux-2.6
context this hasn't been more than compile tested, my testing has been
done in our maemo tree.

cat /sys/module/vt/parameters/default_utf8; echo c > /proc/sysrq-trigger
[ 13.489440] last sysfs file (r): /sys/module/vt/parameters/default_utf8

echo sesquipedalian > /sys/module/vt/parameters/default_utf8; echo c > /proc/sysrq-trigger
[ 49.404632] last sysfs file (w): /sys/module/vt/parameters/default_utf8 written: sesquipedalian

echo sesquipedalianism > /sys/module/vt/parameters/default_utf8; echo c > /proc/sysrq-trigger
[ 33.172332] last sysfs file (w): /sys/module/vt/parameters/default_utf8 written: sesquipedaliani...(18 chars)

Sorry for the ping-pong.

Cheers,
Phil

2011-03-10 22:38:15

by Andrew Morton

[permalink] [raw]
Subject: Re: [PATCHv3 1/1] sysfs: add more info to the oops dump

On Fri, 11 Mar 2011 00:27:07 +0200
Phil Carmody <[email protected]> wrote:

> +static int last_sysfs_write_len;
> void sysfs_printk_last_file(void)
> {
> - printk(KERN_EMERG "last sysfs file: %s\n", last_sysfs_file);
> + printk(KERN_EMERG "last sysfs file (%c): %s%c",
> + (last_sysfs_write_len == SYSFS_NOT_A_WRITE) ? 'r' : 'w',
> + last_sysfs_file,
> + last_sysfs_write_len > 0 ? ' ' : '\n');
> + if (last_sysfs_write_len >= (int)sizeof(last_sysfs_write))
> + printk(KERN_CONT " written: %s...(%d chars)\n",
> + last_sysfs_write, last_sysfs_write_len);
> + else if (last_sysfs_write_len > 0)
> + printk(KERN_CONT " written: %s\n", last_sysfs_write);
> }

That's going to fail to emit the \n if last_sysfs_write_len<=0.
However there are tricks inside printk() which will fix that up if the
next printk starts with KERN_foo.



I can recall just one instance in the past 5-6 years in which the
last-sysfs-file output enabled me to locate a bug which would otherwise
have been unlocatable (within a drivers/md handler, iirc).

I don't think it's a terribly useful feature. Except for when it _is_
useful, when it's very useful ;) The world wouldn't end if we decided
to remove it.

2011-03-10 22:59:40

by Thomas Gleixner

[permalink] [raw]
Subject: Re: [PATCHv3 1/1] sysfs: add more info to the oops dump

On Thu, 10 Mar 2011, Andrew Morton wrote:

> On Fri, 11 Mar 2011 00:27:07 +0200
> Phil Carmody <[email protected]> wrote:
>
> > +static int last_sysfs_write_len;
> > void sysfs_printk_last_file(void)
> > {
> > - printk(KERN_EMERG "last sysfs file: %s\n", last_sysfs_file);
> > + printk(KERN_EMERG "last sysfs file (%c): %s%c",
> > + (last_sysfs_write_len == SYSFS_NOT_A_WRITE) ? 'r' : 'w',
> > + last_sysfs_file,
> > + last_sysfs_write_len > 0 ? ' ' : '\n');
> > + if (last_sysfs_write_len >= (int)sizeof(last_sysfs_write))
> > + printk(KERN_CONT " written: %s...(%d chars)\n",
> > + last_sysfs_write, last_sysfs_write_len);
> > + else if (last_sysfs_write_len > 0)
> > + printk(KERN_CONT " written: %s\n", last_sysfs_write);
> > }
>
> That's going to fail to emit the \n if last_sysfs_write_len<=0.
> However there are tricks inside printk() which will fix that up if the
> next printk starts with KERN_foo.
>
>
>
> I can recall just one instance in the past 5-6 years in which the
> last-sysfs-file output enabled me to locate a bug which would otherwise
> have been unlocatable (within a drivers/md handler, iirc).
>
> I don't think it's a terribly useful feature. Except for when it _is_
> useful, when it's very useful ;) The world wouldn't end if we decided
> to remove it.

Amen to that.

It's more of an distraction than anything which is relevant to 99.999%
of the problems we have to deal with.

Thanks,

tglx

2011-03-10 23:04:19

by Greg KH

[permalink] [raw]
Subject: Re: [PATCHv3 1/1] sysfs: add more info to the oops dump

On Thu, Mar 10, 2011 at 11:59:26PM +0100, Thomas Gleixner wrote:
> On Thu, 10 Mar 2011, Andrew Morton wrote:
>
> > On Fri, 11 Mar 2011 00:27:07 +0200
> > Phil Carmody <[email protected]> wrote:
> >
> > > +static int last_sysfs_write_len;
> > > void sysfs_printk_last_file(void)
> > > {
> > > - printk(KERN_EMERG "last sysfs file: %s\n", last_sysfs_file);
> > > + printk(KERN_EMERG "last sysfs file (%c): %s%c",
> > > + (last_sysfs_write_len == SYSFS_NOT_A_WRITE) ? 'r' : 'w',
> > > + last_sysfs_file,
> > > + last_sysfs_write_len > 0 ? ' ' : '\n');
> > > + if (last_sysfs_write_len >= (int)sizeof(last_sysfs_write))
> > > + printk(KERN_CONT " written: %s...(%d chars)\n",
> > > + last_sysfs_write, last_sysfs_write_len);
> > > + else if (last_sysfs_write_len > 0)
> > > + printk(KERN_CONT " written: %s\n", last_sysfs_write);
> > > }
> >
> > That's going to fail to emit the \n if last_sysfs_write_len<=0.
> > However there are tricks inside printk() which will fix that up if the
> > next printk starts with KERN_foo.
> >
> >
> >
> > I can recall just one instance in the past 5-6 years in which the
> > last-sysfs-file output enabled me to locate a bug which would otherwise
> > have been unlocatable (within a drivers/md handler, iirc).
> >
> > I don't think it's a terribly useful feature. Except for when it _is_
> > useful, when it's very useful ;) The world wouldn't end if we decided
> > to remove it.
>
> Amen to that.
>
> It's more of an distraction than anything which is relevant to 99.999%
> of the problems we have to deal with.

Ok, I think I'll delete it as I'm tired of having to filter it out of my
"any problem with sysfs" lkml filter that I use :)

thanks,

greg k-h

2011-03-10 23:10:21

by Phil Carmody

[permalink] [raw]
Subject: Re: [PATCHv3 1/1] sysfs: add more info to the oops dump

On 10/03/11 23:59 +0100, ext Thomas Gleixner wrote:
> On Thu, 10 Mar 2011, Andrew Morton wrote:
>
> > On Fri, 11 Mar 2011 00:27:07 +0200
> > Phil Carmody <[email protected]> wrote:
> >
> > > +static int last_sysfs_write_len;
> > > void sysfs_printk_last_file(void)
> > > {
> > > - printk(KERN_EMERG "last sysfs file: %s\n", last_sysfs_file);
> > > + printk(KERN_EMERG "last sysfs file (%c): %s%c",
> > > + (last_sysfs_write_len == SYSFS_NOT_A_WRITE) ? 'r' : 'w',
> > > + last_sysfs_file,
> > > + last_sysfs_write_len > 0 ? ' ' : '\n');
> > > + if (last_sysfs_write_len >= (int)sizeof(last_sysfs_write))
> > > + printk(KERN_CONT " written: %s...(%d chars)\n",
> > > + last_sysfs_write, last_sysfs_write_len);
> > > + else if (last_sysfs_write_len > 0)
> > > + printk(KERN_CONT " written: %s\n", last_sysfs_write);
> > > }
> >
> > That's going to fail to emit the \n if last_sysfs_write_len<=0.

That's this case:
+ last_sysfs_write_len > 0 ? ' ' : '\n');

> > However there are tricks inside printk() which will fix that up if the
> > next printk starts with KERN_foo.
> >
> > I can recall just one instance in the past 5-6 years in which the
> > last-sysfs-file output enabled me to locate a bug which would otherwise
> > have been unlocatable (within a drivers/md handler, iirc).
> >
> > I don't think it's a terribly useful feature. Except for when it _is_
> > useful, when it's very useful ;) The world wouldn't end if we decided
> > to remove it.
>
> Amen to that.
>
> It's more of an distraction than anything which is relevant to 99.999%
> of the problems we have to deal with.

As I indicated before, I've previously thought that too, but thought I
could 'fix' it by adding to it when I hit the once-in-three-years case.

Phil

2011-03-10 23:14:10

by Thomas Gleixner

[permalink] [raw]
Subject: Re: [PATCHv3 1/1] sysfs: add more info to the oops dump

On Fri, 11 Mar 2011, Phil Carmody wrote:
> On 10/03/11 23:59 +0100, ext Thomas Gleixner wrote:
> > On Thu, 10 Mar 2011, Andrew Morton wrote:
> >
> > > On Fri, 11 Mar 2011 00:27:07 +0200
> > > Phil Carmody <[email protected]> wrote:
> > >
> > > > +static int last_sysfs_write_len;
> > > > void sysfs_printk_last_file(void)
> > > > {
> > > > - printk(KERN_EMERG "last sysfs file: %s\n", last_sysfs_file);
> > > > + printk(KERN_EMERG "last sysfs file (%c): %s%c",
> > > > + (last_sysfs_write_len == SYSFS_NOT_A_WRITE) ? 'r' : 'w',
> > > > + last_sysfs_file,
> > > > + last_sysfs_write_len > 0 ? ' ' : '\n');
> > > > + if (last_sysfs_write_len >= (int)sizeof(last_sysfs_write))
> > > > + printk(KERN_CONT " written: %s...(%d chars)\n",
> > > > + last_sysfs_write, last_sysfs_write_len);
> > > > + else if (last_sysfs_write_len > 0)
> > > > + printk(KERN_CONT " written: %s\n", last_sysfs_write);
> > > > }
> > >
> > > That's going to fail to emit the \n if last_sysfs_write_len<=0.
>
> That's this case:
> + last_sysfs_write_len > 0 ? ' ' : '\n');
>
> > > However there are tricks inside printk() which will fix that up if the
> > > next printk starts with KERN_foo.
> > >
> > > I can recall just one instance in the past 5-6 years in which the
> > > last-sysfs-file output enabled me to locate a bug which would otherwise
> > > have been unlocatable (within a drivers/md handler, iirc).
> > >
> > > I don't think it's a terribly useful feature. Except for when it _is_
> > > useful, when it's very useful ;) The world wouldn't end if we decided
> > > to remove it.
> >
> > Amen to that.
> >
> > It's more of an distraction than anything which is relevant to 99.999%
> > of the problems we have to deal with.
>
> As I indicated before, I've previously thought that too, but thought I
> could 'fix' it by adding to it when I hit the once-in-three-years case.

The interesting question is:

How did that info help and was it really the ultimate reason why you
found the underlying bug ?

Thanks,

tglx

2011-03-10 23:15:17

by Andrew Morton

[permalink] [raw]
Subject: Re: [PATCHv3 1/1] sysfs: add more info to the oops dump

On Thu, 10 Mar 2011 15:03:04 -0800
Greg KH <[email protected]> wrote:

> > Amen to that.
> >
> > It's more of an distraction than anything which is relevant to 99.999%
> > of the problems we have to deal with.
>
> Ok, I think I'll delete it as I'm tired of having to filter it out of my
> "any problem with sysfs" lkml filter that I use :)

Ok, I'll assign future incomprehensible sysfs oops traces to you guys ;)

2011-03-10 23:22:14

by Andrew Morton

[permalink] [raw]
Subject: Re: [PATCHv3 1/1] sysfs: add more info to the oops dump

On Fri, 11 Mar 2011 00:13:58 +0100 (CET)
Thomas Gleixner <[email protected]> wrote:

> > > It's more of an distraction than anything which is relevant to 99.999%
> > > of the problems we have to deal with.
> >
> > As I indicated before, I've previously thought that too, but thought I
> > could 'fix' it by adding to it when I hit the once-in-three-years case.
>
> The interesting question is:
>
> How did that info help and was it really the ultimate reason why you
> found the underlying bug ?

What happens with sysfs is that if a subsystem's handler is buggy, that
tends to cause a crash within sysfs core code. You get a stack trace
which contains only VFS and sysfs functions - there is no symbol in the
trace which permits you to identify the offending subsystem.

2011-03-10 23:26:26

by Phil Carmody

[permalink] [raw]
Subject: Re: [PATCHv3 1/1] sysfs: add more info to the oops dump

On 11/03/11 00:13 +0100, ext Thomas Gleixner wrote:
> On Fri, 11 Mar 2011, Phil Carmody wrote:
> > On 10/03/11 23:59 +0100, ext Thomas Gleixner wrote:
> > > On Thu, 10 Mar 2011, Andrew Morton wrote:
...
> > > > I don't think it's a terribly useful feature. Except for when it _is_
> > > > useful, when it's very useful ;) The world wouldn't end if we decided
> > > > to remove it.
> > >
> > > Amen to that.
> > >
> > > It's more of an distraction than anything which is relevant to 99.999%
> > > of the problems we have to deal with.
> >
> > As I indicated before, I've previously thought that too, but thought I
> > could 'fix' it by adding to it when I hit the once-in-three-years case.
>
> The interesting question is:
>
> How did that info help and was it really the ultimate reason why you
> found the underlying bug ?

It's academic now, as I fully support the removal. Perhaps the information
was most useful simply because it was given to me quickly, there would have
been other ways of divining it certainly (I can read hex stack dumps, for
example). Even I don't need this patch any more!

At least my intrusion has been useful for instigating a little cleanup!

Cheers,
Phil

2011-03-10 23:28:11

by Thomas Gleixner

[permalink] [raw]
Subject: Re: [PATCHv3 1/1] sysfs: add more info to the oops dump

On Thu, 10 Mar 2011, Andrew Morton wrote:
> On Fri, 11 Mar 2011 00:13:58 +0100 (CET)
> Thomas Gleixner <[email protected]> wrote:
>
> > > > It's more of an distraction than anything which is relevant to 99.999%
> > > > of the problems we have to deal with.
> > >
> > > As I indicated before, I've previously thought that too, but thought I
> > > could 'fix' it by adding to it when I hit the once-in-three-years case.
> >
> > The interesting question is:
> >
> > How did that info help and was it really the ultimate reason why you
> > found the underlying bug ?
>
> What happens with sysfs is that if a subsystem's handler is buggy, that
> tends to cause a crash within sysfs core code. You get a stack trace
> which contains only VFS and sysfs functions - there is no symbol in the
> trace which permits you to identify the offending subsystem.

Reminds me of timer bugs, which popped up way after the fact that some
stupid driver reinitialized and active timer or freed memory
containing an active driver.

For some obvious reasons I haven't seen any of those bugs wasting my
time other than asking the bug reporter to enable debugobjects. :)

Thanks,

tglx

2011-03-10 23:46:52

by Thomas Gleixner

[permalink] [raw]
Subject: Re: [PATCHv3 1/1] sysfs: add more info to the oops dump

On Fri, 11 Mar 2011, Thomas Gleixner wrote:

> On Thu, 10 Mar 2011, Andrew Morton wrote:
> > On Fri, 11 Mar 2011 00:13:58 +0100 (CET)
> > Thomas Gleixner <[email protected]> wrote:
> >
> > > > > It's more of an distraction than anything which is relevant to 99.999%
> > > > > of the problems we have to deal with.
> > > >
> > > > As I indicated before, I've previously thought that too, but thought I
> > > > could 'fix' it by adding to it when I hit the once-in-three-years case.
> > >
> > > The interesting question is:
> > >
> > > How did that info help and was it really the ultimate reason why you
> > > found the underlying bug ?
> >
> > What happens with sysfs is that if a subsystem's handler is buggy, that
> > tends to cause a crash within sysfs core code. You get a stack trace
> > which contains only VFS and sysfs functions - there is no symbol in the
> > trace which permits you to identify the offending subsystem.
>
> Reminds me of timer bugs, which popped up way after the fact that some
> stupid driver reinitialized and active timer or freed memory
> containing an active driver.

Gah: s/active driver/active timer/

> For some obvious reasons I haven't seen any of those bugs wasting my
> time other than asking the bug reporter to enable debugobjects. :)

That said, we really want better debug facilities which are not
cluttering the basic debug output with totally irrelevant information.

Following your reasoning we should record the last accessed file in
general, plus the last ioctl and whatever we think might be relevant
to decode random bugs easier. That's not going to fly.

The main problemns are object life time rules or missing function
pointers in the first place. Both can be tackled by other means than
adding random information to the back trace.

Thanks,

tglx