2011-03-21 18:24:27

by Tony Luck

[permalink] [raw]
Subject: [RFC] pstore: Don't use persistent store for normal shutdown

In commit 04c6862c055fb687c90d9652f32c11a063df15cf
kmsg_dump: add kmsg_dump() calls to the reboot, halt, poweroff and emergency_restart paths

Seiji Aguchi added kmsg_dump options for all the "normal" ways
that a system can be shut down (KEXEC, RESTART, HALT and POWEROFF).
It doesn't seem useful to save the kernel log to persistent store
in these cases.

Signed-off-by: Tony Luck <[email protected]>

---

My /dev/pstore changes have been merged - and I immediately noticed that they
now save a record on every shutdown. Some simple detective work with git found
that kmsg_dump now has some new "reasons" for calling its subscribers.

This patch excludes the four "normal shutdown" cases from being logged
to persistent store, on the assumption that we don't want to clutter up
a limited amount of storage space with routine data. My presumption is
that there is some other subscriber to kmsg_dump that is doing something
with these reason codes ... is that right? Or do you think that we should
save the tail of kernel log into persistent store for every shutdown?

Perhaps we could save less data for these new reasons? Other ideas?

diff --git a/fs/pstore/platform.c b/fs/pstore/platform.c
index ce9ad84..4ca44c6 100644
--- a/fs/pstore/platform.c
+++ b/fs/pstore/platform.c
@@ -61,15 +61,34 @@ static void pstore_dump(struct kmsg_dumper *dumper,
unsigned long s1_start, s2_start;
unsigned long l1_cpy, l2_cpy;
unsigned long size, total = 0;
- char *dst;
+ char *dst, *why;
u64 id;
int hsize, part = 1;

+ switch (reason) {
+ case KMSG_DUMP_KEXEC:
+ case KMSG_DUMP_RESTART:
+ case KMSG_DUMP_HALT:
+ case KMSG_DUMP_POWEROFF:
+ return;
+ case KMSG_DUMP_OOPS:
+ why = "oops";
+ break;
+ case KMSG_DUMP_PANIC:
+ why = "panic";
+ break;
+ case KMSG_DUMP_EMERG:
+ why = "emergency";
+ break;
+ default:
+ why = "unknown";
+ break;
+ }
mutex_lock(&psinfo->buf_mutex);
oopscount++;
while (total < kmsg_bytes) {
dst = psinfo->buf;
- hsize = sprintf(dst, "Oops#%d Part%d\n", oopscount, part++);
+ hsize = sprintf(dst, "%s#%d Part%d\n", why, oopscount, part++);
size = psinfo->bufsize - hsize;
dst += hsize;


2011-03-21 19:50:07

by Seiji Aguchi

[permalink] [raw]
Subject: RE: [RFC] pstore: Don't use persistent store for normal shutdown

Hi,

>Seiji Aguchi added kmsg_dump options for all the "normal" ways
>that a system can be shut down (KEXEC, RESTART, HALT and POWEROFF).
>It doesn't seem useful to save the kernel log to persistent store
>in these cases.

I think we should save the tail of kernel log into persistent store
for both every shutdown and kexec path because these data are useful
in enterprise area.

I already described the useful case in lkml (see below).

normal shutdown:
https://lkml.org/lkml/2010/11/17/249

kexec:
https://lkml.org/lkml/2011/2/23/325

Seiji

2011-03-21 20:27:57

by Tony Luck

[permalink] [raw]
Subject: Re: [RFC] pstore: Don't use persistent store for normal shutdown

On Mon, Mar 21, 2011 at 12:49 PM, Seiji Aguchi <[email protected]> wrote:
> I think we should save the tail of kernel log into persistent store
> for both every shutdown and kexec path because these data are useful
> in enterprise area.
>
> I already described the useful case in lkml (see below).
>
> normal shutdown:
> https://lkml.org/lkml/2010/11/17/249

I hadn't thought about the forensic opportunities to show that a reboot/halt
was the cause of a server shutdown - but these do look useful. Should
I still try to save as much data in these cases? Saving 1 Kbyte would be
enough to have the last dozen+ lines, including the critical
<0>[ 536.970556] Restarting system.
that you'd like to see.

> kexec:
> https://lkml.org/lkml/2011/2/23/325

In the case of kexec used as a "fast reboot" to another kernel, it might
also be valid to save some small amount of kernel log - but in the case
of kexec to take a kernel dump after a serious problem, it might be good
to have as much kernel log as possible to make sure that we have no
gaps in saved data. If there isn't an easy way to tell these two kexec
cases apart, we should just save more data in both cases.

I'll make a new patch incorporating these changes.

-Tony

2011-03-21 21:48:42

by Seiji Aguchi

[permalink] [raw]
Subject: RE: [RFC] pstore: Don't use persistent store for normal shutdown

Hi,

>Should I still try to save as much data in these cases? Saving 1 Kbyte would be
>enough to have the last dozen+ lines, including the critical
> <0>[ 536.970556] Restarting system.
>that you'd like to see.

On my assumption, critical information is the last dozen lines.
However, it depends on the situations how much data is needed or what kind of data is useful.

I think we should introduce a tunable parameter specifying how much data stores into persistent
ram when normal shutdown happens.

Seiji

2011-03-21 22:01:17

by Tony Luck

[permalink] [raw]
Subject: Re: [RFC] pstore: Don't use persistent store for normal shutdown

On Mon, Mar 21, 2011 at 2:47 PM, Seiji Aguchi <[email protected]> wrote:
> I think we should introduce a tunable parameter specifying how much data stores into persistent
> ram when normal shutdown happens.

Currently pstore has a /sys/fs/pstore hook to change the amount of data saved.
Christoph Hellwig convinced me that a mount option would be better, so there
is a patch pending to change this:
http://marc.info/?l=linux-fsdevel&m=130048763005234&w=2

I can add more options easily (apart from thinking of good names for
the options :-)

But this is beginning to feel over engineered ... I think I'll wait to see
if there are some other (better) ideas before I rush to do this.

-Tony

2011-03-22 08:45:32

by Artem Bityutskiy

[permalink] [raw]
Subject: Re: [RFC] pstore: Don't use persistent store for normal shutdown

On Mon, 2011-03-21 at 11:24 -0700, ext Luck, Tony wrote:
> In commit 04c6862c055fb687c90d9652f32c11a063df15cf
> kmsg_dump: add kmsg_dump() calls to the reboot, halt, poweroff and emergency_restart paths
>
> Seiji Aguchi added kmsg_dump options for all the "normal" ways
> that a system can be shut down (KEXEC, RESTART, HALT and POWEROFF).
> It doesn't seem useful to save the kernel log to persistent store
> in these cases.
>
> Signed-off-by: Tony Luck <[email protected]>
>
> ---
>
> My /dev/pstore changes have been merged - and I immediately noticed that they
> now save a record on every shutdown. Some simple detective work with git found
> that kmsg_dump now has some new "reasons" for calling its subscribers.
>
> This patch excludes the four "normal shutdown" cases from being logged
> to persistent store, on the assumption that we don't want to clutter up
> a limited amount of storage space with routine data. My presumption is
> that there is some other subscriber to kmsg_dump that is doing something
> with these reason codes ... is that right? Or do you think that we should
> save the tail of kernel log into persistent store for every shutdown?
>
> Perhaps we could save less data for these new reasons? Other ideas?

If you ask me, this smells like policy in the kernel. I'd look into the
direction of having only the mechanisms in the kernel and letting the
user-space making policy decisions by choosing what he wants to filter
out and what he wants to store via some pstore interfaces.

--
Best Regards,
Artem Bityutskiy (Артём Битюцкий)

2011-03-22 08:54:50

by Cong Wang

[permalink] [raw]
Subject: Re: [RFC] pstore: Don't use persistent store for normal shutdown

On Tue, Mar 22, 2011 at 4:42 PM, Artem Bityutskiy
<[email protected]> wrote:
> On Mon, 2011-03-21 at 11:24 -0700, ext Luck, Tony wrote:
>> In commit 04c6862c055fb687c90d9652f32c11a063df15cf
>> kmsg_dump: add kmsg_dump() calls to the reboot, halt, poweroff and emergency_restart paths
>>
>> Seiji Aguchi added kmsg_dump options for all the "normal" ways
>> that a system can be shut down (KEXEC, RESTART, HALT and POWEROFF).
>> It doesn't seem useful to save the kernel log to persistent store
>> in these cases.
>>
>> Signed-off-by: Tony Luck <[email protected]>
>>
>> ---
>>
>> My /dev/pstore changes have been merged - and I immediately noticed that they
>> now save a record on every shutdown.  Some simple detective work with git found
>> that kmsg_dump now has some new "reasons" for calling its subscribers.
>>
>> This patch excludes the four "normal shutdown" cases from being logged
>> to persistent store, on the assumption that we don't want to clutter up
>> a limited amount of storage space with routine data.  My presumption is
>> that there is some other subscriber to kmsg_dump that is doing something
>> with these reason codes ... is that right?  Or do you think that we should
>> save the tail of kernel log into persistent store for every shutdown?
>>
>> Perhaps we could save less data for these new reasons? Other ideas?
>
> If you ask me, this smells like policy in the kernel. I'd look into the
> direction of having only the mechanisms in the kernel and letting the
> user-space making policy decisions by choosing what he wants to filter
> out and what he wants to store via some pstore interfaces.
>

I agree, it would be nice if we can have another mount option for this.

Thanks.

2011-03-22 19:55:56

by Tony Luck

[permalink] [raw]
Subject: Re: [RFC] pstore: Don't use persistent store for normal shutdown

From: Tony Luck <[email protected]>

pstore_dump() can be called with many different "reason" codes. Save
the name of the code in the persistent store record.

Also - only worthwhile calling pstore_mkfile for KMSG_DUMP_OOPS - that
is the only one where the kernel will continue running.

Signed-off-by: Tony Luck <[email protected]>

---

Artem Bityutskiy said:
> If you ask me, this smells like policy in the kernel. I'd look into the
> direction of having only the mechanisms in the kernel and letting the
> user-space making policy decisions by choosing what he wants to filter
> out and what he wants to store via some pstore interfaces.

Perhaps I'll just leave all the policy out then and keep logging
the same for all dmesg reason codes. Seiji Aguchi sounded quite
lukewarm to the idea of logging less for shutdown/restart etc.
Providing an option that lets the user ask pstore to filter away
the restart/shutdown logs would defeat the purpose for which these
reason codes were added - providing a neat trail that simple tools
can use to show that a system rebooted because some root process
issued a reboot(2) syscall.

Here's a simpler clean up of the fs/pstore/platform.c code that
makes sure that we log a meaningful string for each reason code
(rather than "Oops"!) and that only tries to add an entry to the
mounted filesystem for the KMSG_TYPE_OOPS case (since that is the
only case where the system stays up for some process to see it).

diff --git a/fs/pstore/platform.c b/fs/pstore/platform.c
index ce9ad84..f835a25 100644
--- a/fs/pstore/platform.c
+++ b/fs/pstore/platform.c
@@ -48,6 +48,10 @@ void pstore_set_kmsg_bytes(int bytes)
/* Tag each group of saved records with a sequence number */
static int oopscount;

+static char *reason_str[] = {
+ "Oops", "Panic", "Kexec", "Restart", "Halt", "Poweroff", "Emergency"
+};
+
/*
* callback from kmsg_dump. (s2,l2) has the most recently
* written bytes, older bytes are in (s1,l1). Save as much
@@ -61,15 +65,20 @@ static void pstore_dump(struct kmsg_dumper *dumper,
unsigned long s1_start, s2_start;
unsigned long l1_cpy, l2_cpy;
unsigned long size, total = 0;
- char *dst;
+ char *dst, *why;
u64 id;
int hsize, part = 1;

+ if (reason < ARRAY_SIZE(reason_str))
+ why = reason_str[reason];
+ else
+ why = "Unknown";
+
mutex_lock(&psinfo->buf_mutex);
oopscount++;
while (total < kmsg_bytes) {
dst = psinfo->buf;
- hsize = sprintf(dst, "Oops#%d Part%d\n", oopscount, part++);
+ hsize = sprintf(dst, "%s#%d Part%d\n", why, oopscount, part++);
size = psinfo->bufsize - hsize;
dst += hsize;

@@ -86,7 +95,7 @@ static void pstore_dump(struct kmsg_dumper *dumper,
memcpy(dst + l1_cpy, s2 + s2_start, l2_cpy);

id = psinfo->write(PSTORE_TYPE_DMESG, hsize + l1_cpy + l2_cpy);
- if (pstore_is_mounted())
+ if (reason == KMSG_DUMP_OOPS && pstore_is_mounted())
pstore_mkfile(PSTORE_TYPE_DMESG, psinfo->name, id,
psinfo->buf, hsize + l1_cpy + l2_cpy,
CURRENT_TIME, psinfo->erase);

2011-03-22 22:18:12

by Seiji Aguchi

[permalink] [raw]
Subject: RE: [RFC] pstore: Don't use persistent store for normal shutdown

>-----Original Message-----
>From: Luck, Tony [mailto:[email protected]]
>Sent: Tuesday, March 22, 2011 3:56 PM
>To: Artem Bityutskiy
>Cc: [email protected]; Seiji Aguchi; David Woodhouse; Marco Stornelli; KOSAKI Motohiro; Andrew Morton; Linus
>Torvalds
>Subject: Re: [RFC] pstore: Don't use persistent store for normal shutdown
>
>From: Tony Luck <[email protected]>
>
>pstore_dump() can be called with many different "reason" codes. Save
>the name of the code in the persistent store record.
>
>Also - only worthwhile calling pstore_mkfile for KMSG_DUMP_OOPS - that
>is the only one where the kernel will continue running.
>
>Signed-off-by: Tony Luck <[email protected]>
>

Looks good to me.

Reviewed-by: Seiji Aguchi <[email protected]>