2007-12-14 18:49:26

by Arjan van de Ven

Subject: Top kernel oopses/warnings this week

The http://www.kerneloops.org website collects kernel oops and warning reports from various mailing lists and bugzillas; below is a top 10
list of the oopses collected in the last 7 days. (Reports prior to 2.6.23 have been omitted in collecting the top 10)

This is the first such report that I'm posting; please let me know if this is useful or not.

hid_output_report warning
Warning at drivers/hid/hid-core.c:784 implement()
16 times last week
<no specific version information available>
More Info: http://www.kerneloops.org/search.php?search=implement

softlockup in tick_broadcast_oneshot_control
3 times last week
Only seen in 2.6.24-rc4 so far
More Info: http://www.kerneloops.org/oops.php?number=2409

hiddev_ioctl crash
3 times last week
Only seen in 2.6.24-rc3 so far
More Info: http://www.kerneloops.org/oops.php?number=2428

shrink_dcache_for_umount_subtree crash
BUG at fs/dcache.c:595
2 times last week
Has been seen as far back as 2.6.18
More Info: http://www.kerneloops.org/oops.php?number=2365
More Info: http://www.kerneloops.org/search.php?search=shrink_dcache_for_umount_subtree

cpufreq_remove_dev crash
BUG at drivers/cpufreq/cpufreq.c:1060
2 times last week
Has been reported only for 2.6.24-rc4
More Info: http://www.kerneloops.org/search.php?search=cpufreq_remove_dev
More Info: http://www.kerneloops.org/oops.php?number=2458

journal_dirty_data crash (tainted)
BUG at fs/jbd/transaction.c:983
2 times last week
Has been reported only in 2.6.23.9
More Info: http://www.kerneloops.org/search.php?search=journal_dirty_data

tcp_fastretrans_alert
WARNING at net/ipv4/tcp_input.c:2533 tcp_fastretrans_alert()
2 times last week
Has been reported in 2.6.24-rc4 and -rc5
More Info: http://www.kerneloops.org/search.php?search=tcp_fastretrans_alert

tcp_sacktag_one
WARNING at net/ipv4/tcp_input.c:1280 tcp_sacktag_one()
Reported once
Has only been seen in -rc5 so far
More Info: http://www.kerneloops.org/search.php?search=tcp_sacktag_one

simple_map_write (MTD)
kernel crash
Reported once this week on 2.6.24-rc5
Has been seen as far back as 2.6.17
More Info: http://www.kerneloops.org/search.php?search=simple_map_write

tcp_sacktag_walk
WARNING at net/ipv4/tcp_input.c:1280
Reported once on 2.6.24-rc5
Has been seen only on 2.6.24-rc5
More Info: http://www.kerneloops.org/search.php?search=tcp_sacktag_walk


2007-12-14 18:58:23

by Dave Jones

Subject: Re: Top kernel oopses/warnings this week

On Fri, Dec 14, 2007 at 10:46:36AM -0800, Arjan van de Ven wrote:
> The http://www.kerneloops.org website collects kernel oops and warning reports from various mailing lists and bugzillas; below is a top 10
> list of the oopses collected in the last 7 days. (Reports prior to 2.6.23 have been omitted in collecting the top 10)
>
> This is the first such report that I'm posting; Please let me know if this is useful or not.

I like! Good work.

> cpufreq_remove_dev crash
> BUG at drivers/cpufreq/cpufreq.c:1060
> 2 times last week
> Has been reported only for 2.6.24-rc4
> More Info: http://www.kerneloops.org/search.php?search=cpufreq_remove_dev
> More Info: http://www.kerneloops.org/oops.php?number=2458

Patch pending. Already in -mm. Also sitting in Linus' inbox.

Dave

--
http://www.codemonkey.org.uk

2007-12-14 21:58:19

by Andrew Morton

Subject: Re: Top kernel oopses/warnings this week

On Fri, 14 Dec 2007 10:46:36 -0800 Arjan van de Ven <[email protected]> wrote:

> The http://www.kerneloops.org website collects kernel oops and warning
> reports from various mailing lists and bugzillas

Well that would have been fun to write. Does it watch
https://lists.linux-foundation.org/mailman/listinfo/bugme-new ?

2007-12-14 22:13:37

by Jon Masters

Subject: Re: Top kernel oopses/warnings this week


On Fri, 2007-12-14 at 10:46 -0800, Arjan van de Ven wrote:

> The http://www.kerneloops.org website collects kernel oops and warning reports from various mailing lists and bugzillas; below is a top 10
> list of the oopses collected in the last 7 days. (Reports prior to 2.6.23 have been omitted in collecting the top 10)
>
> This is the first such report that I'm posting; Please let me know if this is useful or not.

FWIW I think this is incredibly useful, Arjan. Hoping we'll get the
kerneloops tools into Fedora soon too.

Jon.

2007-12-14 22:25:41

by Natalie Protasevich

Subject: Re: Top kernel oopses/warnings this week

On Dec 14, 2007 1:57 PM, Andrew Morton <[email protected]> wrote:
> On Fri, 14 Dec 2007 10:46:36 -0800 Arjan van de Ven <[email protected]> wrote:
>
> > The http://www.kerneloops.org website collects kernel oops and warning
> > reports from various mailing lists and bugzillas
>
> Well that would have been fun to write. Does it watch
> https://lists.linux-foundation.org/mailman/listinfo/bugme-new ?
>

This looks great! I'd like to install and try this package on
bugzilla... It looks like it can do all kinds of searches.

2007-12-15 00:41:19

by Arjan van de Ven

Subject: Re: Top kernel oopses/warnings this week

Andrew Morton wrote:
> On Fri, 14 Dec 2007 10:46:36 -0800 Arjan van de Ven <[email protected]> wrote:
>
>> The http://www.kerneloops.org website collects kernel oops and warning
>> reports from various mailing lists and bugzillas
>
> Well that would have been fun to write. Does it watch
> https://lists.linux-foundation.org/mailman/listinfo/bugme-new ?

yes it does; Martin pointed me at that recently....
What doesn't work yet (I now realize) is the link from the oops to the bugzilla URL; I'll be working on that shortly.

2007-12-15 15:49:59

by Stefan Richter

Subject: Re: Top kernel oopses/warnings this week

Arjan van de Ven wrote:
> The http://www.kerneloops.org website collects kernel oops and warning
> reports from various mailing lists and bugzillas;

A few comments:

Report counts may be too high due to duplicate recognition of the very
same report.[1]

Reports against 2.6.X-rcY-mmZ are listed in the same category as reports
against 2.6.X-rcY. To distinguish -mm reports from vanilla reports, one
has to look into the details of each bug entry.[1]

A general weakness is that it is ultimately impossible to know whether a
report was against an unpatched kernel, unless one drills down to the
individual mailing list threads.

Reports about tainted kernels have arguably less value. It would be
good to hide such reports until a report of the same oops in an
untainted kernel is found.

[1] example: http://www.kerneloops.org/oops.php?number=2335
--
Stefan Richter
-=====-=-=== ==-- -====
http://arcgraph.de/sr/

2007-12-15 18:23:14

by Arjan van de Ven

Subject: Re: Top kernel oopses/warnings this week

Stefan Richter wrote:
> Arjan van de Ven wrote:
>> The http://www.kerneloops.org website collects kernel oops and warning
>> reports from various mailing lists and bugzillas;
>
> A few comments:
>
> Report counts may be too high due to duplicate recognition of the very
> same report.

this is true however it's .. a hard issue. It's really hard to distinguish a duplicate report from
two reports of the same bug.

>
> Reports against 2.6.X-rcY-mmZ are listed in the same category as reports
> against 2.6.X-rcY. To distinguish -mm reports from vanilla reports, one
> has to look into the details of each bug entry.

finding what exact kernel version an oops is from is... surprisingly hard.
And to be honest, bugs against -mm are still very interesting, since they'll be
the next mainline after all

>
> A general weakness is that it is ultimately impossible to know whether a
> report was against an unpatched kernel, unless one drills down to the
> individual mailinglist threads.

for the same reason patched kernels are relevant. And if someone has a super weirdo kernel,
well, as long as we get enough bug data it'll be way down in the noise.


> Reports about tainted kernels have arguably less value. It would be
> good to hide such reports until a report of the same oops in an
> untainted kernel was found.

That's half of what is done right now; they're not hidden though, just very clearly marked.

2007-12-15 19:45:14

by Stefan Richter

Subject: Re: Top kernel oopses/warnings this week

Arjan van de Ven wrote:
> Stefan Richter wrote:
>> Report counts may be too high due to duplicate recognition of the very
>> same report.
>
> this is true however it's .. a hard issue. It's really hard to
> distinguish a duplicate report from two reports of the same bug.

Would be nice though to try to find duplicates like the example I gave.
(The actual report and a reply was listed. The reply just had a full
quote of the oops, with "> " prepended and perhaps lines wrapped.)
Because if an oops is independently reported twice or more, this too
says something about the issue. E.g. flaky RAM and such is pretty much
eliminated as a possible cause.

Anyway, someone who is actually interested in a particular oops and
looks at the posts in your links quickly notices any duplicates.
But it would be helpful to people who only have a quick glance at the
bar graphs if you add a note of caution that the figures are not
accurate and not representative, e.g. because of occasional duplicates.

For the same reason, please don't write headings like "Oops statistics
for kernel 2.6.23-release". Unless you mean "statistics" in a narrower
sense like they do statistics in medicine and economics. ;-)
Simply write "Oops reports for kernel...".

>> Reports against 2.6.X-rcY-mmZ are listed in the same category as reports
>> against 2.6.X-rcY. To distinguish -mm reports from vanilla reports, one
>> has to look into the details of each bug entry.
>
> finding what exact kernel version an oops is from is... surprisingly hard.
> And to be honest, bugs against -mm are still very interesting, since
> they'll be the next mainline after all

Yes, they definitely are interesting. And it's the same as with the
above issue: People who are genuinely interested in an oops find the
necessary information at the details page. Separating them from
mainline oopses would be a service though for people who want to
- have a quick look at what's urgent and what's not so urgent,
- draw conclusions about the state of the release candidates.
So this is not that important.
--
Stefan Richter
-=====-=-=== ==-- -====
http://arcgraph.de/sr/
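
Stefan's example of a reply that quotes the whole oops with "> " prepended suggests one cheap normalization step before reports are counted. The following is a minimal sketch in C, not code from kerneloops.org; the names strip_quote_prefix() and same_oops_line() are made up for illustration, and the sketch does not try to undo re-wrapped lines.

```c
#include <string.h>

/*
 * Skip any leading run of '>' quote markers and whitespace, so that
 * "> > BUG at ..." and "BUG at ..." start at the same character.
 */
const char *strip_quote_prefix(const char *line)
{
	while (*line == '>' || *line == ' ' || *line == '\t')
		line++;
	return line;
}

/* Compare two oops lines while ignoring mail quoting on either side. */
int same_oops_line(const char *a, const char *b)
{
	return strcmp(strip_quote_prefix(a), strip_quote_prefix(b)) == 0;
}
```

Under these assumptions a quoted copy of a line compares equal to the original, while a genuinely different line does not. Two independent reports of the same bug would still compare equal, which is the harder problem discussed in this thread.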

2007-12-17 02:52:02

by Dave Jones

Subject: Re: Top kernel oopses/warnings this week

On Sat, Dec 15, 2007 at 04:49:05PM +0100, Stefan Richter wrote:

> Reports about tainted kernels have arguably less value. It would be
> good to hide such reports until a report of the same oops in an
> untainted kernel was found.

I disagree with this. It's useful to have a "we've seen this before,
and every time, it was tainted with xyz module" datapoint, especially
if no untainted copies of that oops turn up.

Dave

--
http://www.codemonkey.org.uk

2007-12-17 12:34:44

by Jon Masters

Subject: Re: Top kernel oopses/warnings this week


On Sun, 2007-12-16 at 21:51 -0500, Dave Jones wrote:
> On Sat, Dec 15, 2007 at 04:49:05PM +0100, Stefan Richter wrote:
>
> > Reports about tainted kernels have arguably less value. It would be
> > good to hide such reports until a report of the same oops in an
> > untainted kernel was found.
>
> I disagree with this. It's useful to have a "we've seen this before,
> and every time, it was tainted with xyz module" datapoint, especially
> if no untainted copies of that oops turn up.

+1

In fact, that's even more useful in many cases, if it helps demonstrate
that the oops is associated with a particular buggy binary driver. I can
see a lot of potentially interesting statistics coming from that too.

Jon.

2007-12-17 13:14:36

by Stefan Richter

Subject: Re: Top kernel oopses/warnings this week

Jon Masters wrote:
> On Sun, 2007-12-16 at 21:51 -0500, Dave Jones wrote:
>> On Sat, Dec 15, 2007 at 04:49:05PM +0100, Stefan Richter wrote:
>>
>> > Reports about tainted kernels have arguably less value. It would be
>> > good to hide such reports until a report of the same oops in an
>> > untainted kernel was found.
>>
>> I disagree with this. It's useful to have a "we've seen this before,
>> and every time, it was tainted with xyz module" datapoint, especially
>> if no untainted copies of that oops turn up.
>
> +1
>
> In fact, that's even more useful in many cases, if it helps demonstrate
> that the oops is associated with a particular buggy binary driver. I can
> see a lot of potentially interesting statistics coming from that too.

-1 :-)

I don't care at all what this xyz module does or does not do by and in
itself.

(Of course since at least two people care and since this makes life
easier for Arjan, just keep listing reports about tainted kernels like
you do now. It just so happens that different people are interested in
different things.)
--
Stefan Richter
-=====-=-=== ==-- =---=
http://arcgraph.de/sr/

2007-12-17 16:42:40

by Arjan van de Ven

Subject: Re: Top kernel oopses/warnings this week

Stefan Richter wrote:
> Jon Masters wrote:
>> On Sun, 2007-12-16 at 21:51 -0500, Dave Jones wrote:
>>> On Sat, Dec 15, 2007 at 04:49:05PM +0100, Stefan Richter wrote:
>>>
>>> > Reports about tainted kernels have arguably less value. It would be
>>> > good to hide such reports until a report of the same oops in an
>>> > untainted kernel was found.
>>>
>>> I disagree with this. It's useful to have a "we've seen this before,
>>> and every time, it was tainted with xyz module" datapoint, especially
>>> if no untainted copies of that oops turn up.
>> +1
>>
>> In fact, that's even more useful in many cases, if it helps demonstrate
>> that the oops is associated with a particular buggy binary driver. I can
>> see a lot of potentially interesting statistics coming from that too.
>
> -1 :-)
>
> I don't care at all what this xyz module does or does not do by and in
> itself.
>

the thing is this: The goal of kerneloops.org is to allow developers to focus their effort on the really
important cases. Part of that is knowing which cases to dismiss/not spend time on because of their
relation with one or more binary drivers.... so imo keeping track of this and showing the "don't bother"
flag with it is very much worthwhile; it allows us developers to know what to ignore.

2007-12-17 17:23:59

by Ingo Molnar

Subject: Re: Top kernel oopses/warnings this week


* Arjan van de Ven <[email protected]> wrote:

> The http://www.kerneloops.org website collects kernel oops and warning
> reports from various mailing lists and bugzillas; below is a top 10
> list of the oopses collected in the last 7 days. (Reports prior to
> 2.6.23 have been omitted in collecting the top 10)

cool stuff! I cannot over-emphasise how useful this will be.

Let us know if you need any additional WARN_ON()s or other dmesg
annotations to make parsing easier / more intelligent. At least as far
as arch/x86 and the scheduler is related it's going to be applied to the
fast-track queue ;-)

Ingo

2007-12-17 18:25:45

by Zach Brown

Subject: Re: Top kernel oopses/warnings this week


>> Report counts may be too high due to duplicate recognition of the very
>> same report.
>
> this is true however it's .. a hard issue. It's really hard to
> distinguish a duplicate report from
> two reports of the same bug.

Can we hack some data into oops output to help? Say a giant per-boot
anonymous random number (yeah, I know, harder than it sounds) and then
an incrementing oops counter. That'd also let you discover that the
latter oopses in a chain of oopses might be fall-out from the head of
the chain.

- z

2007-12-17 18:43:16

by Arjan van de Ven

Subject: Re: Top kernel oopses/warnings this week

Zach Brown wrote:
>>> Report counts may be too high due to duplicate recognition of the very
>>> same report.
>> this is true however it's .. a hard issue. It's really hard to
>> distinguish a duplicate report from
>> two reports of the same bug.
>
> Can we hack some data in to oops output to help? Say a giant per-boot
> anonymous random number (yeah, I know, harder than it sounds) and then
> an incrementing oops counter.

there already is a per-boot UUID afaik, just a matter of printing that..
I'll look into that, but it does add extra info to the oops print

> That'd also let you discover that the
> latter oopses in a chain of oopses might be fall-out from the head of
> the chain.

this is there already and taken care of ;)

2007-12-17 21:37:53

by Arjan van de Ven

Subject: Re: Top kernel oopses/warnings this week

On Mon, 17 Dec 2007 18:23:31 +0100
Ingo Molnar <[email protected]> wrote:

>
> * Arjan van de Ven <[email protected]> wrote:
>
> > The http://www.kerneloops.org website collects kernel oops and
> > warning reports from various mailing lists and bugzillas; below is
> > a top 10 list of the oopses collected in the last 7 days. (Reports
> > prior to 2.6.23 have been omitted in collecting the top 10)
>
> cool stuff! I cannot over-emphasise how useful this will be.
>
> Let us know if you need any additional WARN_ON()s or other dmesg
> annotations to make parsing easier / more intelligent. At least as
> far as arch/x86 and the scheduler is related it's going to be applied
> to the fast-track queue ;-)
>

the following patch would help a lot; it adds a very nice parsable end-marker
to oopses, as well as printing the boot UUID as part of the oops, which
makes it easier to de-dupe oopses. The UUID is just a random number and not
privacy-traceable to any system.

--

Subject: [patch] terminate the oops printing with a defined string/uuid
From: Arjan van de Ven <[email protected]>

Right now, it's hard for automated tools to determine when an oops has
ended; there's no clear marker for this. In addition, there's no good
way to find out if an oops is unique. Sometimes it's the same oops
just reported multiple times, while other times it's a different
instance of the crash with the same signature. Printing the boot UUID
as part of the end string resolves this ambiguity.

Signed-off-by: Arjan van de Ven <[email protected]>
CC: Ted Ts'o <[email protected]>

---
drivers/char/random.c | 35 ++++++++++++++++++++++++++++++++++-
include/linux/random.h | 1 +
kernel/panic.c | 2 ++
3 files changed, 37 insertions(+), 1 deletion(-)

Index: linux-2.6.24-rc5/drivers/char/random.c
===================================================================
--- linux-2.6.24-rc5.orig/drivers/char/random.c
+++ linux-2.6.24-rc5/drivers/char/random.c
@@ -1176,8 +1176,41 @@ static int max_read_thresh = INPUT_POOL_
static int max_write_thresh = INPUT_POOL_WORDS * 32;
static char sysctl_bootid[16];

+/**
+ * get_boot_uuid - return a string pointer to a system wide boot UUID
+ *
+ * Returns a pointer to the boot UUID. This UUID is unique per system
+ * boot but persistent for one boot session.
+ *
+ * The memory returned via the return pointer is static allocated and
+ * owned by the random.c driver; this should not be kfree()'d.
+ *
+ * Locking: none
+ */
+ */
+char *get_boot_uuid(void)
+{
+	static char target[80];
+	unsigned char *uuid;
+
+	if (sysctl_bootid[8] == 0)
+		generate_random_uuid(sysctl_bootid);
+	/* sysctl_bootid is signed, to print we need unsigned .. */
+	uuid = sysctl_bootid;
+
+	if (target[0] == 0) {
+		sprintf(target, "%02x%02x%02x%02x-%02x%02x-%02x%02x-%02x%02x-"
+			"%02x%02x%02x%02x%02x%02x",
+			uuid[0], uuid[1], uuid[2], uuid[3], uuid[4],
+			uuid[5], uuid[6], uuid[7], uuid[8], uuid[9],
+			uuid[10], uuid[11], uuid[12], uuid[13], uuid[14],
+			uuid[15]);
+	}
+	return target;
+}
+
/*
- * These functions is used to return both the bootid UUID, and random
+ * These functions are used to return both the bootid UUID, and random
* UUID. The difference is in whether table->data is NULL; if it is,
* then a new UUID is generated and returned to the user.
*
Index: linux-2.6.24-rc5/include/linux/random.h
===================================================================
--- linux-2.6.24-rc5.orig/include/linux/random.h
+++ linux-2.6.24-rc5/include/linux/random.h
@@ -71,6 +71,7 @@ unsigned long randomize_range(unsigned l

u32 random32(void);
void srandom32(u32 seed);
+char *get_boot_uuid(void);

#endif /* __KERNEL___ */

Index: linux-2.6.24-rc5/kernel/panic.c
===================================================================
--- linux-2.6.24-rc5.orig/kernel/panic.c
+++ linux-2.6.24-rc5/kernel/panic.c
@@ -19,6 +19,7 @@
#include <linux/nmi.h>
#include <linux/kexec.h>
#include <linux/debug_locks.h>
+#include <linux/random.h>

int panic_on_oops;
int tainted;
@@ -272,6 +273,7 @@ void oops_enter(void)
void oops_exit(void)
{
do_oops_enter_exit();
+ printk("---[ end of trace %s ]---\n", get_boot_uuid());
}

#ifdef CONFIG_CC_STACKPROTECTOR
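
On the consumer side, a tool reading kernel logs could use the proposed "---[ end of trace <uuid> ]---" line both as the end-of-oops marker and as the source of the boot UUID. A minimal sketch in C, assuming the marker format from the printk() in the patch above and a 36-character UUID string; parse_end_marker() is a hypothetical helper, not part of any existing tool.

```c
#include <stddef.h>
#include <string.h>

#define BOOT_UUID_LEN 36	/* 32 hex digits plus 4 dashes */

/*
 * If 'line' contains the end-of-oops marker, copy the UUID into 'id'
 * (which must hold BOOT_UUID_LEN + 1 bytes) and return 1. Return 0 for
 * any other line. strstr() is used so that a leading timestamp or
 * log-level prefix does not prevent a match.
 */
int parse_end_marker(const char *line, char *id)
{
	static const char prefix[] = "---[ end of trace ";
	static const char suffix[] = " ]---";
	const char *start = strstr(line, prefix);
	const char *end;

	if (!start)
		return 0;
	start += sizeof(prefix) - 1;
	end = strstr(start, suffix);
	if (!end || (size_t)(end - start) != BOOT_UUID_LEN)
		return 0;
	memcpy(id, start, BOOT_UUID_LEN);
	id[BOOT_UUID_LEN] = '\0';
	return 1;
}
```

A parser built this way would treat everything up to the first matching line as one oops and could attach the extracted UUID to it for later de-duplication.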

2007-12-17 21:58:44

by Theodore Ts'o

Subject: Re: Top kernel oopses/warnings this week

On Mon, Dec 17, 2007 at 01:36:31PM -0800, Arjan van de Ven wrote:
> Subject: [patch] terminate the oops printing with a defined string/uuid
> From: Arjan van de Ven <[email protected]>
>
> Right now, it's hard for automated tools to determine when an oops has
> ended; there's no clear marker for this. In addition, there's no good
> way to find out if an oops is unique. Sometimes it's the same oops
> just reported multiple times, while other times it's a different
> instance of the crash with the same signature. Printing the boot UUID
> as part of the end string resolves this ambiguity.
>
> Signed-off-by: Arjan van de Ven <[email protected]>
> CC: Ted Ts'o <[email protected]>

Looks good to me!

Signed-off-by: "Theodore Ts'o" <[email protected]>

- Ted

2007-12-17 22:59:12

by Luck, Tony

Subject: Re: Top kernel oopses/warnings this week

> + static char target[80];
...
> + sprintf(target, "%02x%02x%02x%02x-%02x%02x-%02x%02x-%02x%02x-"
> + "%02x%02x%02x%02x%02x%02x",

[80] is overkill ... [37] bytes should be enough (unless I went
cross-eyed counting the "%02x" :-)

-Tony

2007-12-17 23:22:45

by Arjan van de Ven

Subject: Re: Top kernel oopses/warnings this week

Tony Luck wrote:
>> + static char target[80];
> ...
>> + sprintf(target, "%02x%02x%02x%02x-%02x%02x-%02x%02x-%02x%02x-"
>> + "%02x%02x%02x%02x%02x%02x",
>
> [80] is overkill ... [37] bytes should be enough (unless I went
> cross-eyed counting the "%02x" :-)
>

%02x doesn't guarantee that it's at most 2, but at LEAST 2...

2007-12-17 23:27:05

by Luck, Tony

Subject: Re: Top kernel oopses/warnings this week

On Dec 17, 2007 3:17 PM, Arjan van de Ven <[email protected]> wrote:
>
> Tony Luck wrote:
> >> + static char target[80];
> > ...
> >> + sprintf(target, "%02x%02x%02x%02x-%02x%02x-%02x%02x-%02x%02x-"
> >> + "%02x%02x%02x%02x%02x%02x",
> >
> > [80] is overkill ... [37] bytes should be enough (unless I went
> > cross-eyed counting the "%02x" :-)
> >
>
> %02x doesn't guarantee that it's at most 2, but at LEAST 2...

How will you fit a number that requires >2 hex digits into an
"unsigned char"?

Alternatively ... if %02x may spew more than 2 characters, can
you be sure that [80] is enough?

-Tony
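
The width in "%02x" is a minimum, not a maximum, so both sides of this exchange can be checked with a few lines of userspace C. The sketch below assumes a 32-bit int, as on common Linux targets; the helper names are illustrative only and do not appear in the patch.

```c
#include <stdio.h>

/*
 * Format one byte with "%02x" and return the number of characters produced.
 * A negative signed char is sign-extended when converted to unsigned int,
 * so with a 32-bit int it prints as 8 hex digits rather than 2.
 */
int fmt_signed_byte(char *out, size_t len, signed char b)
{
	return snprintf(out, len, "%02x", (unsigned int)b);
}

/* An unsigned char is at most 0xff, so "%02x" always yields exactly 2 digits. */
int fmt_unsigned_byte(char *out, size_t len, unsigned char b)
{
	return snprintf(out, len, "%02x", (unsigned int)b);
}
```

With unsigned bytes the UUID string is 32 hex digits plus 4 dashes, 36 characters, so 37 bytes including the terminating NUL. With signed bytes and a 32-bit int, each negative byte can expand to 8 digits, giving a worst case of 16 * 8 + 4 = 132 characters, which is more than an [80] buffer holds.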

2007-12-17 23:47:59

by Arjan van de Ven

Subject: Re: Top kernel oopses/warnings this week

On Mon, 17 Dec 2007 15:26:46 -0800
"Tony Luck" <[email protected]> wrote:

> On Dec 17, 2007 3:17 PM, Arjan van de Ven <[email protected]>
> wrote:
> >
> > Tony Luck wrote:
> > >> + static char target[80];
> > > ...
> > >> + sprintf(target,
> > >> "%02x%02x%02x%02x-%02x%02x-%02x%02x-%02x%02x-"
> > >> + "%02x%02x%02x%02x%02x%02x",
> > >
> > > [80] is overkill ... [37] bytes should be enough (unless I went
> > > cross-eyed counting the "%02x" :-)
> > >
> >
> > %02x doesn't guarantee that it's at most 2, but at LEAST 2...
>
> How will you fit a number that requires >2 hex digits into an
> "unsigned char"?

eh eh because at first it was a signed char but I fixed that bug later

updated patch attached; using 38 to have a hard 0 at the end in case sprintf does
something weird and 2 cpus race over oopsing (I don't want to add locking to the oops codepath
if I can avoid it; the worst case with 38 is a truncated UUID string)


Subject: [patch] terminate the oops printing with a defined string/uuid
From: Arjan van de Ven <[email protected]>

Right now, it's hard for automated tools to determine when an oops has
ended; there's no clear marker for this. In addition, there's no good
way to find out if an oops is unique. Sometimes it's the same oops
just reported multiple times, while other times it's a different
instance of the crash with the same signature. Printing the boot UUID
as part of the end string resolves this ambiguity.

Signed-off-by: Arjan van de Ven <[email protected]>
Signed-off-by: "Theodore Ts'o" <[email protected]>

---
drivers/char/random.c | 35 ++++++++++++++++++++++++++++++++++-
include/linux/random.h | 1 +
kernel/panic.c | 2 ++
3 files changed, 37 insertions(+), 1 deletion(-)

Index: linux-2.6.24-rc5/drivers/char/random.c
===================================================================
--- linux-2.6.24-rc5.orig/drivers/char/random.c
+++ linux-2.6.24-rc5/drivers/char/random.c
@@ -1176,8 +1176,41 @@ static int max_read_thresh = INPUT_POOL_
static int max_write_thresh = INPUT_POOL_WORDS * 32;
static char sysctl_bootid[16];

+/**
+ * get_boot_uuid - return a string pointer to a system wide boot UUID
+ *
+ * Returns a pointer to the boot UUID. This UUID is unique per system
+ * boot but persistent for one boot session.
+ *
+ * The memory returned via the return pointer is static allocated and
+ * owned by the random.c driver; this should not be kfree()'d.
+ *
+ * Locking: none
+ */
+ */
+char *get_boot_uuid(void)
+{
+	static char target[38];
+	unsigned char *uuid;
+
+	if (sysctl_bootid[8] == 0)
+		generate_random_uuid(sysctl_bootid);
+	/* sysctl_bootid is signed, to print we need unsigned .. */
+	uuid = sysctl_bootid;
+
+	if (target[0] == 0) {
+		sprintf(target, "%02x%02x%02x%02x-%02x%02x-%02x%02x-%02x%02x-"
+			"%02x%02x%02x%02x%02x%02x",
+			uuid[0], uuid[1], uuid[2], uuid[3], uuid[4],
+			uuid[5], uuid[6], uuid[7], uuid[8], uuid[9],
+			uuid[10], uuid[11], uuid[12], uuid[13], uuid[14],
+			uuid[15]);
+	}
+	return target;
+}
+
/*
- * These functions is used to return both the bootid UUID, and random
+ * These functions are used to return both the bootid UUID, and random
* UUID. The difference is in whether table->data is NULL; if it is,
* then a new UUID is generated and returned to the user.
*
Index: linux-2.6.24-rc5/include/linux/random.h
===================================================================
--- linux-2.6.24-rc5.orig/include/linux/random.h
+++ linux-2.6.24-rc5/include/linux/random.h
@@ -71,6 +71,7 @@ unsigned long randomize_range(unsigned l

u32 random32(void);
void srandom32(u32 seed);
+char *get_boot_uuid(void);

#endif /* __KERNEL___ */

Index: linux-2.6.24-rc5/kernel/panic.c
===================================================================
--- linux-2.6.24-rc5.orig/kernel/panic.c
+++ linux-2.6.24-rc5/kernel/panic.c
@@ -19,6 +19,7 @@
#include <linux/nmi.h>
#include <linux/kexec.h>
#include <linux/debug_locks.h>
+#include <linux/random.h>

int panic_on_oops;
int tainted;
@@ -272,6 +273,7 @@ void oops_enter(void)
void oops_exit(void)
{
do_oops_enter_exit();
+ printk("---[ end of trace %s ]---\n", get_boot_uuid());
}

#ifdef CONFIG_CC_STACKPROTECTOR

2007-12-18 00:32:27

by Linus Torvalds

Subject: Re: Top kernel oopses/warnings this week



On Mon, 17 Dec 2007, Arjan van de Ven wrote:
>
> +char *get_boot_uuid(void)
> +{
> + static char target[38];
> + unsigned char *uuid;
> +
> + if (sysctl_bootid[8] == 0)
> + generate_random_uuid(sysctl_bootid);
> + /* sysctl_bootid is signed, to print we need unsigned .. */
> + uuid = sysctl_bootid;
> +
> + if (target[0] == 0) {
> + sprintf(target, "%02x%02x%02x%02x-%02x%02x-%02x%02x-%02x%02x-"
> + "%02x%02x%02x%02x%02x%02x",

Why isn't *everything* inside that "if (target[0] == 0" check?

IOW, that function should look something like

const char *get_boot_uuid(void)
{
	static char target[38];

	if (!target[0])
		fill_boot_uid(target);
	return target;
}

which also allows you to clean it up a bit.

I'd _also_ suggest that you'd actually try to avoid that horrid sequence
of "%02x..", and instead just make sure that sysctl_bootid[] is 4-byte
aligned, and then you can do

	sprintf(target, "%08x-%04x-%04x-%04x-%04x%08x",
		ntohl(0[(u32 *)uuid]),
		ntohs(2[(u16 *)uuid]),
		ntohs(3[(u16 *)uuid]),
		ntohs(4[(u16 *)uuid]),
		ntohs(5[(u16 *)uuid]),
		ntohl(3[(u32 *)uuid]));

which also gets bonus points for being totally unreadable, and thus 100%
in the spirit of uuid's.

Linus
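
For comparison, here is a userspace sketch of the two formatting styles discussed in this message: the byte-by-byte "%02x" version from the patch and the word-wise version suggested here. It uses memcpy() instead of pointer casts so it does not depend on alignment, and ntohl()/ntohs() from <arpa/inet.h> in place of the kernel's; the function names and UUID_STR_LEN are made up for the example.

```c
#include <arpa/inet.h>	/* ntohl(), ntohs() */
#include <stdint.h>
#include <stdio.h>
#include <string.h>

#define UUID_STR_LEN 37	/* 36 characters plus terminating NUL */

/* Byte-by-byte formatting, as in the patch. */
void uuid_bytewise(char *out, const unsigned char u[16])
{
	snprintf(out, UUID_STR_LEN,
		 "%02x%02x%02x%02x-%02x%02x-%02x%02x-%02x%02x-"
		 "%02x%02x%02x%02x%02x%02x",
		 u[0], u[1], u[2], u[3], u[4], u[5], u[6], u[7],
		 u[8], u[9], u[10], u[11], u[12], u[13], u[14], u[15]);
}

/* Word-wise formatting: bytes 0-3, four 16-bit groups, then bytes 12-15. */
void uuid_wordwise(char *out, const unsigned char u[16])
{
	uint32_t first, last;
	uint16_t mid[4];

	memcpy(&first, u, sizeof(first));
	memcpy(mid, u + 4, sizeof(mid));
	memcpy(&last, u + 12, sizeof(last));

	/* ntohl()/ntohs() make the printed digits follow byte order in memory. */
	snprintf(out, UUID_STR_LEN, "%08x-%04x-%04x-%04x-%04x%08x",
		 (unsigned int)ntohl(first),
		 (unsigned int)ntohs(mid[0]), (unsigned int)ntohs(mid[1]),
		 (unsigned int)ntohs(mid[2]), (unsigned int)ntohs(mid[3]),
		 (unsigned int)ntohl(last));
}
```

Both functions should produce the same 36-character string for the same 16 input bytes, regardless of host endianness, which is the property that matters if the oops marker is meant to match /proc/sys/kernel/random/boot_id.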

2007-12-18 00:46:39

by Arjan van de Ven

Subject: Re: Top kernel oopses/warnings this week

Linus Torvalds wrote:
>
> On Mon, 17 Dec 2007, Arjan van de Ven wrote:
>> +char *get_boot_uuid(void)
>> +{
>> + static char target[38];
>> + unsigned char *uuid;
>> +
>> + if (sysctl_bootid[8] == 0)
>> + generate_random_uuid(sysctl_bootid);
>> + /* sysctl_bootid is signed, to print we need unsigned .. */
>> + uuid = sysctl_bootid;
>> +
>> + if (target[0] == 0) {
>> + sprintf(target, "%02x%02x%02x%02x-%02x%02x-%02x%02x-%02x%02x-"
>> + "%02x%02x%02x%02x%02x%02x",
>
> Why isn't *everything* inside that "if (target[0] == 0" check?

the sysctl_bootid is shared with the /proc exposed bootid, so I need to generate it the same way

> I'd _also_ suggest that you'd actually try to avoid that horrid sequence
> of "%02x..", and instead just make sure that sysctl_bootid[] is 4-byte
> aligned, and then you can do
>
> sprintf("%08x-%04x-%04x-%04x-%04x%08x",
> ntohl(0[(u32 *)uuid]),
> ntohs(2[(u16 *)uuid]),
> ntohs(3[(u16 *)uuid]),
> ntohs(4[(u16 *)uuid]),
> ntohs(5[(u16 *)uuid]),
> ntohl(3[(u32 *)uuid]));
>
> which also gets bonus points for being totally unreadable, and thus 100%
> in the spirit of uuid's.

again.. this is for compatibility with /proc/sys/kernel/random/boot_id .. the code 10 lines below my patch is identical and does
the %02x stuff... I didn't make that up, I just copied that to get the same output.
I can deviate for cleanup... but I can see some value of being the same format and same data.

2007-12-18 02:31:59

by Theodore Ts'o

Subject: Re: Top kernel oopses/warnings this week

On Mon, Dec 17, 2007 at 04:21:12PM -0800, Linus Torvalds wrote:
> which also gets bonus points for being totally unreadable, and thus 100%
> in the spirit of uuid's.

Heh. UUID's don't have to be readable; just universally unique. Code
on the other hand should be readable. :-)

If you want something more readable, you could print the MAC address
and boot time. Of course some crazy people seem to think leaking the
MAC address will somehow be a privacy violation. And printing a
random UUID is a lot simpler....

- Ted

2007-12-18 07:01:01

by Arjan van de Ven

Subject: Re: Top kernel oopses/warnings this week

Theodore Tso wrote:
> On Mon, Dec 17, 2007 at 04:21:12PM -0800, Linus Torvalds wrote:
>> which also gets bonus points for being totally unreadable, and thus 100%
>> in the spirit of uuid's.
>
> Heh. UUID's don't have to be readable; just universally unique. Code
> on the other hand should be readable. :-)

Linus' suggested... improvement should either be done in all 3 places or none ;)
Since you're the maintainer... what's your suggestion?

>
> If you want something more readable, you could print the MAC address
> and boot time. Of course some crazy people seem to think leaking the
> MAC address will somehow be a privacy violation. And printing a
> random UUID is a lot simpler....

boot UUID is nice in that it's different each boot, so that an oops that happens twice will have a
different UUID even if it's the same machine, while repeat-reports of the same oops will have
the same UUID. So I very much like to use some form of UUID; since the boot UUID has the
same properties I was happy to share this; if it gets too ugly or evil code wise I can always
pick something else ;-)
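
The de-duplication rule described above can be sketched in a few lines of C. This is only an illustration of the idea, not code from kerneloops.org; the `oops_report` structure, its field names and `classify_reports()` are all made up for the example, and it assumes a collector has already extracted a signature and the boot UUID from each report.

```c
#include <string.h>

/* Hypothetical record for one collected report; field names are illustrative. */
struct oops_report {
    const char *signature;  /* e.g. the crashing function name */
    const char *boot_uuid;  /* UUID string taken from the oops text */
};

enum report_relation {
    UNRELATED,         /* different signature */
    DUPLICATE_REPORT,  /* same signature, same boot: one oops reported twice */
    NEW_OCCURRENCE     /* same signature, different boot: the bug hit again */
};

/* Compare two reports using the signature first and the boot UUID second. */
enum report_relation classify_reports(const struct oops_report *a,
                                      const struct oops_report *b)
{
    if (strcmp(a->signature, b->signature) != 0)
        return UNRELATED;
    if (strcmp(a->boot_uuid, b->boot_uuid) == 0)
        return DUPLICATE_REPORT;
    return NEW_OCCURRENCE;
}
```

Two separate oopses with the same signature inside a single boot would look like a duplicate under this rule; the sketch does not try to handle that case.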

2007-12-18 10:12:19

by Jon Masters

Subject: Re: Top kernel oopses/warnings this week


On Mon, 2007-12-17 at 21:31 -0500, Theodore Tso wrote:
> On Mon, Dec 17, 2007 at 04:21:12PM -0800, Linus Torvalds wrote:
> > which also gets bonus points for being totally unreadable, and thus 100%
> > in the spirit of uuid's.
>
> Heh. UUID's don't have to be readable; just universally unique. Code
> on the other hand should be readable. :-)
>
> If you want something more readable, you could print the MAC address
> and boot time. Of course some crazy people seem to think leaking the
> MAC address will somehow be a privacy violation. And printing a
> random UUID is a lot simpler....

Printing a random UUID is necessary, for now anyway, because you cannot
assume every machine is going to have a MAC address, even if it is
deemed appropriate to print this on oops.

The Network is the Computer!

Jon.

2007-12-18 17:50:13

by Matt Mackall

Subject: Re: Top kernel oopses/warnings this week

On Mon, Dec 17, 2007 at 01:36:31PM -0800, Arjan van de Ven wrote:
> On Mon, 17 Dec 2007 18:23:31 +0100
> Ingo Molnar <[email protected]> wrote:
>
> >
> > * Arjan van de Ven <[email protected]> wrote:
> >
> > > The http://www.kerneloops.org website collects kernel oops and
> > > warning reports from various mailing lists and bugzillas; below is
> > > a top 10 list of the oopses collected in the last 7 days. (Reports
> > > prior to 2.6.23 have been omitted in collecting the top 10)
> >
> > cool stuff! I cannot over-emphasise how useful this will be.
> >
> > Let us know if you need any additional WARN_ON()s or other dmesg
> > annotations to make parsing easier / more intelligent. At least as
> > far as arch/x86 and the scheduler is related it's going to be applied
> > to the fast-track queue ;-)
> >
>
> the following patch would help a lot; it adds a very nice parsable end-marker
> to oopses, as well as printing the boot UUID as part of the oops, which
> makes it easier to de-dupe oopses. The UUID is just a random number and not
> privacy-traceable to any system.
>
> --
>
> Subject: [patch] terminate the oops printing with a defined string/uuid
> From: Arjan van de Ven <[email protected]>
>
> Right now, it's hard for automated tools to determine when an oops has
> ended; there's no clear marker for this. In addition, there's no good
> way to find out if an oops is unique. Sometimes it's the same oops
> just reported multiple times, while other times it's a different
> instance of the crash with the same signature. Printing the boot UUID
> as part of the end string resolves this ambiguity.
>
> Signed-off-by: Arjan van de Ven <[email protected]>
> CC: Ted Ts'o <[email protected]>
>
> ---
> drivers/char/random.c | 35 ++++++++++++++++++++++++++++++++++-
> include/linux/random.h | 1 +
> kernel/panic.c | 2 ++
> 3 files changed, 37 insertions(+), 1 deletion(-)
>
> Index: linux-2.6.24-rc5/drivers/char/random.c
> ===================================================================
> --- linux-2.6.24-rc5.orig/drivers/char/random.c
> +++ linux-2.6.24-rc5/drivers/char/random.c
> @@ -1176,8 +1176,41 @@ static int max_read_thresh = INPUT_POOL_
> static int max_write_thresh = INPUT_POOL_WORDS * 32;
> static char sysctl_bootid[16];
>
> +/**
> + * get_boot_uuid - return a string pointer to a system wide boot UUID
> + *
> + * Returns a pointer to the boot UUID. This UUID is unique per system
> + * boot but persistent for one boot session.
> + *
> + * The memory returned via the return pointer is static allocated and
> + * owned by the random.c driver; this should not be kfree()'d.
> + *
> + * Locking: none
> + */
> +char *get_boot_uuid(void)
> +{
> + static char target[80];
> + unsigned char *uuid;
> +
> + if (sysctl_bootid[8] == 0)
> + generate_random_uuid(sysctl_bootid);
> + /* sysctl_bootid is signed, to print we need unsigned .. */
> + uuid = sysctl_bootid;
> +
> + if (target[0] == 0) {
> + sprintf(target, "%02x%02x%02x%02x-%02x%02x-%02x%02x-%02x%02x-"
> + "%02x%02x%02x%02x%02x%02x",
> + uuid[0], uuid[1], uuid[2], uuid[3], uuid[4],
> + uuid[5], uuid[6], uuid[7], uuid[8], uuid[9],
> + uuid[10], uuid[11], uuid[12], uuid[13], uuid[14],
> + uuid[15]);

Blech. Invoking the random pool machinery at oops time is moderately
safe, but not very shiny. Going through all the sprintf ugliness to
format it to an irrelevant UUID standard is not very shiny either. At
least refactor it so it's not duplicating code.

And I'd much rather the static variable lived with its user, as
random.c is already too miscellaneous:

> --- linux-2.6.24-rc5.orig/kernel/panic.c
> +++ linux-2.6.24-rc5/kernel/panic.c
...
> + printk("---[ end of trace %s ]---\n", get_boot_uuid());

Also, please cc: me on any future patches to random.c.

--
Mathematics is the supreme nostalgia of our time.

2007-12-18 17:54:22

by Matt Mackall

Subject: Re: Top kernel oopses/warnings this week

On Mon, Dec 17, 2007 at 10:58:54PM -0800, Arjan van de Ven wrote:
> Theodore Tso wrote:
> >On Mon, Dec 17, 2007 at 04:21:12PM -0800, Linus Torvalds wrote:
> >>which also gets bonus points for being totally unreadable, and thus 100%
> >>in the spirit of uuid's.
> >
> >Heh. UUID's don't have to be readable; just universally unique. Code
> >on the other hand should be readable. :-)
>
> Linus' suggested... improvement should either be done in all 3 places or
> none ;)
> Since you're the maintainer... what's your suggestion?

For the record:

RANDOM NUMBER DRIVER
P: Matt Mackall
M: [email protected]
S: Maintained

--
Mathematics is the supreme nostalgia of our time.

2007-12-18 18:08:31

by Arjan van de Ven

Subject: Re: Top kernel oopses/warnings this week

Linus Torvalds wrote:
>
> On Mon, 17 Dec 2007, Arjan van de Ven wrote:
>> +char *get_boot_uuid(void)
>> +{
>> + static char target[38];
>> + unsigned char *uuid;
>> +
>> + if (sysctl_bootid[8] == 0)
>> + generate_random_uuid(sysctl_bootid);
>> + /* sysctl_bootid is signed, to print we need unsigned .. */
>> + uuid = sysctl_bootid;
>> +
>> + if (target[0] == 0) {
>> + sprintf(target, "%02x%02x%02x%02x-%02x%02x-%02x%02x-%02x%02x-"
>> + "%02x%02x%02x%02x%02x%02x",
>
> Why isn't *everything* inside that "if (target[0] == 0" check?
>
> IOW, that function should look something like


ok so this got a lot more involved than I was hoping for;
something like below will help me (and kerneloops.org ;) for the short term,
while I'll see what I can do for random.c in a few dead moments soon, for a 2.6.25
enhancement...


Subject: [patch] terminate the oops printing with a defined string/uuid
From: Arjan van de Ven <[email protected]>

Right now, it's hard for automated tools to determine when an oops has
ended; there's no clear marker for this. For later kernels I would also
like a UUID to be printed here, but for the short term I've put all zeros there
since printing a UUID seems to involve cleaning up/rewriting quite a chunk
of random.c and that's more involved -> later patch.

Signed-off-by: Arjan van de Ven <[email protected]>

---
kernel/panic.c | 1 +
1 files changed, 1 insertion(+), 0 deletions(-)

Index: linux-2.6.24-rc5/kernel/panic.c
===================================================================
--- linux-2.6.24-rc5.orig/kernel/panic.c
+++ linux-2.6.24-rc5/kernel/panic.c
@@ -272,6 +273,7 @@ void oops_enter(void)
void oops_exit(void)
{
do_oops_enter_exit();
+ printk("---[ end of trace 00000000-0000-0000-0000-000000000000 ]---\n");
}

#ifdef CONFIG_CC_STACKPROTECTOR


Attachments:
oopsend.patch (981.00 B)

2007-12-18 18:14:55

by Matt Mackall

Subject: Re: Top kernel oopses/warnings this week

On Tue, Dec 18, 2007 at 10:06:14AM -0800, Arjan van de Ven wrote:
> Linus Torvalds wrote:
> >
> >On Mon, 17 Dec 2007, Arjan van de Ven wrote:
> >>+char *get_boot_uuid(void)
> >>+{
> >>+ static char target[38];
> >>+ unsigned char *uuid;
> >>+
> >>+ if (sysctl_bootid[8] == 0)
> >>+ generate_random_uuid(sysctl_bootid);
> >>+ /* sysctl_bootid is signed, to print we need unsigned .. */
> >>+ uuid = sysctl_bootid;
> >>+
> >>+ if (target[0] == 0) {
> >>+ sprintf(target, "%02x%02x%02x%02x-%02x%02x-%02x%02x-%02x%02x-"
> >>+ "%02x%02x%02x%02x%02x%02x",
> >
> >Why isn't *everything* inside that "if (target[0] == 0" check?
> >
> >IOW, that function should look something like
>
>
> ok so this got a lot more involved than I was hoping for;
> something like below will help me (and kerneloops.org ;) for the short term,
> while I'll see what I can do for random.c in a few dead moments soon, for a
> 2.6.25 enhancement...

Might as well leave out the null UUID, no sense in claiming to have
one when you don't. It's easy for a parser to cut on "^---["

--
Mathematics is the supreme nostalgia of our time.

2007-12-18 18:28:28

by Theodore Ts'o

Subject: Re: Top kernel oopses/warnings this week

On Mon, Dec 17, 2007 at 10:58:54PM -0800, Arjan van de Ven wrote:
> Theodore Tso wrote:
>> On Mon, Dec 17, 2007 at 04:21:12PM -0800, Linus Torvalds wrote:
>>> which also gets bonus points for being totally unreadable, and thus 100%
>>> in the spirit of uuid's.
>> Heh. UUID's don't have to be readable; just universally unique. Code
>> on the other hand should be readable. :-)
>
> Linus' suggested... improvement should either be done in all 3 places or
> none ;)
> Since you're the maintainer... what's your suggestion?

Well, Matt took over maintenance of the /dev/random driver, but my
take on it is that code readability is more important than saving a
few bytes of generated code or speed; the code paths are only executed
once, so it's hardly a fast path.

- Ted

2007-12-18 18:36:29

by Arjan van de Ven

Subject: Re: Top kernel oopses/warnings this week

Matt Mackall wrote:
> Might as well leave out the null UUID, no sense in claiming to have
> one when you don't. It's easy for a parser to cut on "^---["

one can't cut on that since that's also the start marker.
Yes it's possible to leave it out entirely, and thus have 2 different terminators over time.
No I don't think it's a good idea.
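
To illustrate why a parser wants the full end string rather than just a leading "---[", here is a rough sketch of how a tool might recognise the terminator. It is not taken from any real parser: `parse_end_marker()` is a hypothetical helper, it accepts both the "end of trace" and "end trace" wordings that appear in the patches in this thread, and it assumes the marker starts at the beginning of the line with a 36-character UUID in it.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define UUID_STR_LEN 36
#define END_SUFFIX   " ]---"

/* Marker wordings seen in this thread; a real tool may need others. */
static const char *const end_prefixes[] = {
    "---[ end of trace ",
    "---[ end trace ",
};

/*
 * Returns true if @line is an end-of-oops marker and copies the UUID text
 * into @uuid_out, which must have room for UUID_STR_LEN + 1 bytes.
 * Any other line, including other "---[" style markers, returns false.
 */
bool parse_end_marker(const char *line, char *uuid_out)
{
    size_t i;

    for (i = 0; i < sizeof(end_prefixes) / sizeof(end_prefixes[0]); i++) {
        size_t prefix_len = strlen(end_prefixes[i]);
        const char *rest;

        if (strncmp(line, end_prefixes[i], prefix_len) != 0)
            continue;
        rest = line + prefix_len;
        /* Need the UUID plus the closing " ]---" after the prefix. */
        if (strlen(rest) < UUID_STR_LEN + strlen(END_SUFFIX))
            return false;
        if (strncmp(rest + UUID_STR_LEN, END_SUFFIX, strlen(END_SUFFIX)) != 0)
            return false;
        memcpy(uuid_out, rest, UUID_STR_LEN);
        uuid_out[UUID_STR_LEN] = '\0';
        return true;
    }
    return false;
}
```

With a fixed-shape terminator like this, the all-zero placeholder and a later real UUID go through the same code path, which is the point of keeping one terminator format over time.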

2007-12-18 18:47:19

by Linus Torvalds

Subject: Re: Top kernel oopses/warnings this week



On Tue, 18 Dec 2007, Theodore Tso wrote:
>
> Well, Matt took over maintenance of the /dev/random driver, but my
> take on it is that code readability is more important than saving a
> few bytes of generated code or speed; the code paths are only executed
> once, so it's hardly a fast path.

Quite frankly, I'd argue that while my suggested code wasn't exactly
readable, it was more so than the horror it tried to replace.

BAD CODE is never readable. At least my suggestion was good code.

Linus

2007-12-18 23:42:31

by Arjan van de Ven

Subject: Re: Top kernel oopses/warnings this week

Matt Mackall wrote:
>
> Blech. Invoking the random pool machinery at oops time is moderately
> safe, but not very shiny. Going through all the sprintf ugliness to
> format it to an irrelevant UUID standard is not very shiny either. At
> least refactor it so it's not duplicating code.
>
> And I'd much rather the static variable lived with its user, as
> random.c is already too miscellaneous:

ok so something like this?


From: Arjan van de Ven <[email protected]>
Subject: [patch] Print end-of-oops marker with UUID

Right now, it's nearly impossible for parsers to detect the end-of-oops
condition; for example this is a problem for http://www.kerneloops.org.
In addition, it's not currently possible to detect whether or not
2 oopses that look alike are actually the same oops reported twice,
or truly 2 unique oopses.

This patch factors out the "sprintf a UUID into a string" code from
random.c into a separate function (using snprintf as suggested by
Randy). So far I left the %02x in place instead of using Linus'
"improvement"; if someone really hates the %02x's he/she can do that
later.

It also reduces the stack footprint of proc_do_uuid(); it
was using 64 bytes for the string where 37 is sufficient.
With these random.c changes, the oops_exit() function can print an
end-of-oops marker that includes the UUID.

Normally, the UUID used for oopses is calculated as late_initcall
(in the hope that at that time there is enough entropy to get a
unique enough UUID); however for early oopses the oops_exit() function
needs to generate the UUID on the fly.

Signed-off-by: Arjan van de Ven <[email protected]>
CC: Matt
CC: Ted
CC: Randy

--- linux-2.6.24-rc5/drivers/char/random.c.org 2007-12-18 11:37:22.000000000 -0800
+++ linux-2.6.24-rc5/drivers/char/random.c 2007-12-18 12:20:48.000000000 -0800
@@ -1176,8 +1175,34 @@ static int max_read_thresh = INPUT_POOL_
static int max_write_thresh = INPUT_POOL_WORDS * 32;
static char sysctl_bootid[16];

+
+/**
+ * snprintf_uuid - Convert a 16 byte UUID into string format
+ * @string: buffer to store the UUID into
+ * @len: size of @string
+ * @uuid: the UUID to convert
+ *
+ * This function converts a 16 byte binary UUID into canonical
+ * ASCII form. This ASCII form needs 37 bytes of storage space,
+ * allocated and provided by the caller.
+ *
+ * Returns: pointer to @string
+ *
+ * Locking: none
+ */
+const char *snprintf_uuid(char *string, int len, unsigned char *uuid)
+{
+ snprintf(string, len, "%02x%02x%02x%02x-%02x%02x-%02x%02x-"
+ "%02x%02x-%02x%02x%02x%02x%02x%02x",
+ uuid[0], uuid[1], uuid[2], uuid[3],
+ uuid[4], uuid[5], uuid[6], uuid[7],
+ uuid[8], uuid[9], uuid[10], uuid[11],
+ uuid[12], uuid[13], uuid[14], uuid[15]);
+ return string;
+}
+
/*
- * These functions is used to return both the bootid UUID, and random
+ * These functions are used to return both the bootid UUID, and random
* UUID. The difference is in whether table->data is NULL; if it is,
* then a new UUID is generated and returned to the user.
*
@@ -1189,7 +1214,7 @@ static int proc_do_uuid(ctl_table *table
void __user *buffer, size_t *lenp, loff_t *ppos)
{
ctl_table fake_table;
- unsigned char buf[64], tmp_uuid[16], *uuid;
+ unsigned char buf[37], tmp_uuid[16], *uuid;

uuid = table->data;
if (!uuid) {
@@ -1199,12 +1224,7 @@ static int proc_do_uuid(ctl_table *table
if (uuid[8] == 0)
generate_random_uuid(uuid);

- sprintf(buf, "%02x%02x%02x%02x-%02x%02x-%02x%02x-%02x%02x-"
- "%02x%02x%02x%02x%02x%02x",
- uuid[0], uuid[1], uuid[2], uuid[3],
- uuid[4], uuid[5], uuid[6], uuid[7],
- uuid[8], uuid[9], uuid[10], uuid[11],
- uuid[12], uuid[13], uuid[14], uuid[15]);
+ snprintf_uuid(buf, sizeof(buf), uuid);
fake_table.data = buf;
fake_table.maxlen = sizeof(buf);

--- linux-2.6.24-rc5/include/linux/random.h.org 2007-12-18 12:22:49.000000000 -0800
+++ linux-2.6.24-rc5/include/linux/random.h 2007-12-18 12:22:57.000000000 -0800
@@ -71,6 +71,7 @@ unsigned long randomize_range(unsigned l

u32 random32(void);
void srandom32(u32 seed);
+const char *snprintf_uuid(char *string, int len, unsigned char *uuid);

#endif /* __KERNEL___ */

--- linux-2.6.24-rc5/kernel/panic.c.org 2007-12-18 12:23:19.000000000 -0800
+++ linux-2.6.24-rc5/kernel/panic.c 2007-12-18 12:35:46.000000000 -0800
@@ -19,6 +19,7 @@
#include <linux/nmi.h>
#include <linux/kexec.h>
#include <linux/debug_locks.h>
+#include <linux/random.h>

int panic_on_oops;
int tainted;
@@ -32,6 +33,8 @@ ATOMIC_NOTIFIER_HEAD(panic_notifier_list

EXPORT_SYMBOL(panic_notifier_list);

+static unsigned char oops_uuid[16];
+
static int __init panic_setup(char *str)
{
panic_timeout = simple_strtoul(str, NULL, 0);
@@ -265,15 +268,32 @@ void oops_enter(void)
do_oops_enter_exit();
}

+static int prime_oops_uuid(void)
+{
+ if (oops_uuid[8] == 0)
+ generate_random_uuid(oops_uuid);
+ return 0;
+}
+
/*
* Called when the architecture exits its oops handler, after printing
* everything.
*/
void oops_exit(void)
{
+ char uuid_string[37];
do_oops_enter_exit();
+
+ /*
+ * normally the oops_uuid is already calculated, but if we oops during
+ * really early boot, it may not be. In that case, calculate it here.
+ */
+ prime_oops_uuid();
+ printk("---[ end trace %s ]---\n",
+ snprintf_uuid(uuid_string, sizeof(uuid_string), oops_uuid));
}

+late_initcall(prime_oops_uuid);
#ifdef CONFIG_CC_STACKPROTECTOR
/*
* Called when gcc's -fstack-protector feature is used, and


Attachments:
oopsend2.patch (4.94 kB)
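
For readers who want to try the two ideas in the patch outside the kernel, here is a small userspace sketch. It is an approximation under stated assumptions, not the kernel code: `format_uuid()` mirrors the snprintf format string of snprintf_uuid() above, `fake_generate_random_uuid()` is a made-up stand-in for generate_random_uuid() that uses rand() purely to stay self-contained, and it assumes that the generator sets the UUID variant bits in byte 8, which is what would make the "byte 8 is zero" test usable as a "not generated yet" check.

```c
#include <stdio.h>
#include <stdlib.h>

#define UUID_BYTES      16
#define UUID_STRING_BUF 37  /* 36 characters plus the terminating NUL */

/* Stand-in generator (assumption); rand() is not a real entropy source. */
void fake_generate_random_uuid(unsigned char *uuid)
{
    int i;

    for (i = 0; i < UUID_BYTES; i++)
        uuid[i] = (unsigned char)(rand() & 0xff);
    uuid[6] = (uuid[6] & 0x0f) | 0x40;  /* version 4 (random) */
    uuid[8] = (uuid[8] & 0x3f) | 0x80;  /* variant bits; never 0 afterwards */
}

/* Format 16 binary bytes as 8-4-4-4-12 hex digits; returns @string. */
const char *format_uuid(char *string, size_t len, const unsigned char *uuid)
{
    snprintf(string, len, "%02x%02x%02x%02x-%02x%02x-%02x%02x-"
             "%02x%02x-%02x%02x%02x%02x%02x%02x",
             uuid[0], uuid[1], uuid[2], uuid[3],
             uuid[4], uuid[5], uuid[6], uuid[7],
             uuid[8], uuid[9], uuid[10], uuid[11],
             uuid[12], uuid[13], uuid[14], uuid[15]);
    return string;
}

/* All zero until first use, like a static kernel variable. */
unsigned char oops_uuid[UUID_BYTES];

/* Generate the UUID once; later calls leave it untouched. */
void prime_oops_uuid(void)
{
    if (oops_uuid[8] == 0)
        fake_generate_random_uuid(oops_uuid);
}
```

Calling prime_oops_uuid() before every print, as oops_exit() does in the patch, is cheap after the first call because the check on byte 8 short-circuits the generation.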