2008-03-28 18:55:35

by Arjan van de Ven

[permalink] [raw]
Subject: Oops/Warning report for the week of March 28th 2008

The http://www.kerneloops.org website collects kernel oops and
warning reports from various mailing lists and bugzillas as well as
with a client users can install to auto-submit oopses.
Below is a top 10 list of the oopses collected in the last 7 days.
(Reports prior to 2.6.23 have been omitted in collecting the top 10)

This week, a total of 538 oopses and warnings have been reported,
compared to 169 reports in the previous week.


Rank 1: input_release_device
This appears to be a regression in 2.6.25; the first reports show up in 2.6.25-rc2
Often a warning at kernel/mutex.c:134 (mutex_lock_nested), but some oopses too
Reported 161 times (261 total reports)... this is 30% of the total reports this week!
The reports are all in the fput() path via ev_release
More info: http://www.kerneloops.org/searchweek.php?search=input_release_device

Rank 2: set_dentry_child_flags
The standard inotify warning... only seen upto 2.6.24
Assumed fixed in 2.6.25
Reported 49 times (493 total reports)
More info: http://www.kerneloops.org/searchweek.php?search=set_dentry_child_flags

Rank 3: __ieee80211_rx
Warning at net/mac80211/rx.c:2184 __ieee80211_rx (newer-than-mainline mac80211)
Reported 29 times (73 total reports)
More info: http://www.kerneloops.org/searchweek.php?search=__ieee80211_rx

Rank 4: radeon_vblank_crtc_set
This is probably a Fedora-special experimental Radeon driver
Reported 23 times (28 total reports)
More info: http://www.kerneloops.org/searchweek.php?search=radeon_vblank_crtc_set

Rank 5: do_sys_open
Oops because a driver returns "1" and not "-1" as error code
Seen with the nouveau driver, not in mainline yet.
Reported 21 times (39 total reports)
More info: http://www.kerneloops.org/searchweek.php?search=do_sys_open

Rank 6: uvc_get_video_ctrl
The uvc driver is not in mainline
Reported 19 times (84 total reports)
More info: http://www.kerneloops.org/searchweek.php?search=uvc_get_video_ctrl

Rank 7: __register_sysctl_paths
Seen with proprietary drivers only
Reported 16 times (186 total reports)
More info: http://www.kerneloops.org/searchweek.php?search=__register_sysctl_paths

Rank 8: tcp_ack
Warning at net/ipv4/tcp_input.c:2532 tcp_ack
Reported 15 times (87 total reports)
More info: http://www.kerneloops.org/searchweek.php?search=tcp_ack

Rank 9: sysctl_check_lookup
This seems to be a bug in parport_device_proc_register()
Reported 14 times (62 total reports)
More info: http://www.kerneloops.org/searchweek.php?search=sysctl_check_lookup

Rank 10: mark_buffer_dirty
This is an old friend, started in 2.6.24 and is still with us today.
Often but not exclusively during unmounting of a filesystem
Reported 12 times (39 total reports)
More info: http://www.kerneloops.org/searchweek.php?search=mark_buffer_dirty


2008-03-28 20:21:52

by Linus Torvalds

[permalink] [raw]
Subject: Re: Oops/Warning report for the week of March 28th 2008



On Fri, 28 Mar 2008, Arjan van de Ven wrote:
>
> Rank 1: input_release_device
> This appears to be a regression in 2.6.25; the first reports show up
> in 2.6.25-rc2 Often a warning at kernel/mutex.c:134
> (mutex_lock_nested), but some oopses too

The oopses (at least some of them) seem to be a use-after-free where we
seem to do a list_add() on an already-released list head (or we didn't
remove the previous/next entry from a list before we free'd it, and then
the next list_add() will follow a bogus pointer).

> http://www.kerneloops.org/searchweek.php?search=input_release_device

The problem with kerneloops is that it seems to be really hard to figure
out the *source* of the oops. I can find the oopses (and it's really good
with the whole search-and-clump-together-by-version thing), but then when
some oops like this is found, it's hard to see where your kerneloops
scripts found the oops from, so the context of the oops is all gone.

Is there something obvious that I'm missing? I'd really like to see the
whole posting that the oops came from. Do you save the originals or even
just message ID's from the ones you pick from emails?

Linus

2008-03-28 20:52:09

by Linus Torvalds

[permalink] [raw]
Subject: Re: Oops/Warning report for the week of March 28th 2008



On Fri, 28 Mar 2008, Linus Torvalds wrote:
>
> Is there something obvious that I'm missing? I'd really like to see the
> whole posting that the oops came from. Do you save the originals or even
> just message ID's from the ones you pick from emails?

Hmm. Definitely not from the kernel mailing list. I'm intrigued, where did
that oops #5814 come from (picked a recent one at random)?

The thing is recent, and oopses on "mutex_lock(dev->mutex)" in
input_release_device. In particular, the path *seems* to be this one:

evdev_release ->
evdev_ungrab ->
input_release_device ->
mutex_lock ->
mutex_lock_nested ->
__mutex_lock_common ->
list_add_tail(&waiter.list, &lock->wait_list)

where "lock->wait_list.prev" seems to be 0x6b6b6b6b6b6b6b6b, which is the
use-after-free poison pattern.

(In fact, I think the access that actually oopses is when the
debug version of __list_add() does

if (unlikely(prev->next != next)) {

because that "prev" pointer is crap).

So it seems that when input_release_device() does:

struct input_dev *dev = handle->dev;

mutex_lock(&dev->mutex);

the "dev" it uses has already been released. And this only shows up as a
problem when you have slab debugging turned on (like the Fedora kernels
do, thank you all Fedora guys).

The odd thing is that I don't think any of this code has really changed
recently.

Linus

2008-03-28 20:58:01

by Björn Steinbrink

[permalink] [raw]
Subject: Re: Oops/Warning report for the week of March 28th 2008

On 2008.03.28 13:21:38 -0700, Linus Torvalds wrote:
> On Fri, 28 Mar 2008, Arjan van de Ven wrote:
> >
> > Rank 1: input_release_device
> > This appears to be a regression in 2.6.25; the first reports show up
> > in 2.6.25-rc2 Often a warning at kernel/mutex.c:134
> > (mutex_lock_nested), but some oopses too
>
> The oopses (at least some of them) seem to be a use-after-free where we
> seem to do a list_add() on an already-released list head (or we didn't
> remove the previous/next entry from a list before we free'd it, and then
> the next list_add() will follow a bogus pointer).
>
> > http://www.kerneloops.org/searchweek.php?search=input_release_device
>
> The problem with kerneloops is that it seems to be really hard to figure
> out the *source* of the oops. I can find the oopses (and it's really good
> with the whole search-and-clump-together-by-version thing), but then when
> some oops like this is found, it's hard to see where your kerneloops
> scripts found the oops from, so the context of the oops is all gone.
>
> Is there something obvious that I'm missing? I'd really like to see the
> whole posting that the oops came from. Do you save the originals or even
> just message ID's from the ones you pick from emails?

The oops for that one seem to be all coming from fc9 systems, which
(IIRC) include the automatic kerneloops reporting tool, so there
probably is no mail for them. Those that come from a mailing list
usually have a link at the top, for example this one:
http://www.kerneloops.org/raw.php?rawid=5735&msgid=http://mid.gmane.org/[email protected]

Bj?rn

2008-03-28 21:01:37

by Johannes Berg

[permalink] [raw]
Subject: Re: Oops/Warning report for the week of March 28th 2008


> Hmm. Definitely not from the kernel mailing list. I'm intrigued, where did
> that oops #5814 come from (picked a recent one at random)?

Can't tell you this, but

> The thing is recent, and oopses on "mutex_lock(dev->mutex)" in
> input_release_device. In particular, the path *seems* to be this one:
>
> evdev_release ->
> evdev_ungrab ->
> input_release_device ->
> mutex_lock ->
> mutex_lock_nested ->
> __mutex_lock_common ->
> list_add_tail(&waiter.list, &lock->wait_list)

That I've seen before, analysed a bit and posted:
http://permalink.gmane.org/gmane.linux.kernel.input/4281

johannes


Attachments:
signature.asc (828.00 B)
This is a digitally signed message part

2008-03-28 21:17:03

by Dmitry Torokhov

[permalink] [raw]
Subject: Re: Oops/Warning report for the week of March 28th 2008

On Fri, Mar 28, 2008 at 01:51:38PM -0700, Linus Torvalds wrote:
>
>
> On Fri, 28 Mar 2008, Linus Torvalds wrote:
> >
> > Is there something obvious that I'm missing? I'd really like to see the
> > whole posting that the oops came from. Do you save the originals or even
> > just message ID's from the ones you pick from emails?
>
> Hmm. Definitely not from the kernel mailing list. I'm intrigued, where did
> that oops #5814 come from (picked a recent one at random)?
>
> The thing is recent, and oopses on "mutex_lock(dev->mutex)" in
> input_release_device. In particular, the path *seems* to be this one:
>
> evdev_release ->
> evdev_ungrab ->
> input_release_device ->
> mutex_lock ->
> mutex_lock_nested ->
> __mutex_lock_common ->
> list_add_tail(&waiter.list, &lock->wait_list)
>
> where "lock->wait_list.prev" seems to be 0x6b6b6b6b6b6b6b6b, which is the
> use-after-free poison pattern.
>
> (In fact, I think the access that actually oopses is when the
> debug version of __list_add() does
>
> if (unlikely(prev->next != next)) {
>
> because that "prev" pointer is crap).
>
> So it seems that when input_release_device() does:
>
> struct input_dev *dev = handle->dev;
>
> mutex_lock(&dev->mutex);
>
> the "dev" it uses has already been released. And this only shows up as a
> problem when you have slab debugging turned on (like the Fedora kernels
> do, thank you all Fedora guys).
>
> The odd thing is that I don't think any of this code has really changed
> recently.
>

There is a patch from Pete that works around the problem by not
calling input_release_device() on devices that are gone. But what
I don't understand is why the parent input device is gone since
sysfs/driver core should be keeping a reference to it since it is
a parent of evdev. input_dev shoudl only be released after
evdev_free() is called.

--
Dmitry

2008-03-28 21:25:24

by Linus Torvalds

[permalink] [raw]
Subject: Re: Oops/Warning report for the week of March 28th 2008



On Fri, 28 Mar 2008, Johannes Berg wrote:
>
> That I've seen before, analysed a bit and posted:
> http://permalink.gmane.org/gmane.linux.kernel.input/4281

Ahh.. And this is presumably triggered either by some new Xorg behavior in
fc9 or just the scheduler timing changes that caused other things too?

Your suggested solution sounds ok, but I'm also wondering why those things
aren't properly refcounted? It does sound like a bug to free the thing
before all users are gone - regardless of anything else. Hmm?

But the ungrab sounds like the best short-term fix. Do we have a patch for
testing? Please? It's pretty late in the 2.6.25 cycle for these kinds of
things..

Linus

2008-03-28 21:43:43

by Johannes Berg

[permalink] [raw]
Subject: Re: Oops/Warning report for the week of March 28th 2008


> > That I've seen before, analysed a bit and posted:
> > http://permalink.gmane.org/gmane.linux.kernel.input/4281
>
> Ahh.. And this is presumably triggered either by some new Xorg behavior in
> fc9 or just the scheduler timing changes that caused other things too?

I tend to think it's actually a regression, I've done exactly what I
described there dozens of times before on older kernels and it doesn't
seem to be a race condition, X always locks the input device and I never
had trouble with that before.

> Your suggested solution sounds ok, but I'm also wondering why those things
> aren't properly refcounted? It does sound like a bug to free the thing
> before all users are gone - regardless of anything else. Hmm?

I have to admit to not understanding the depths of the input code, I
just took a quick look when I posted that. Mind you, that was 5 weeks
ago!

> But the ungrab sounds like the best short-term fix. Do we have a patch for
> testing? Please? It's pretty late in the 2.6.25 cycle for these kinds of
> things..

I don't. I think Dmitry just said in the other mail that he did, but I
haven't seen that yet.

johannes


Attachments:
signature.asc (828.00 B)
This is a digitally signed message part

2008-03-28 22:02:25

by Linus Torvalds

[permalink] [raw]
Subject: Re: Oops/Warning report for the week of March 28th 2008



On Fri, 28 Mar 2008, Johannes Berg wrote:
>
> I tend to think it's actually a regression, I've done exactly what I
> described there dozens of times before on older kernels and it doesn't
> seem to be a race condition, X always locks the input device and I never
> had trouble with that before.

Ahh. So it's easily repeatable for you.. Do you think you could bisect it?
At least a few runs? Even a partial bisection will give a nice (== much
smaller) range of commits to be blamed, and might tell us why it started
happening..

Linus

2008-03-28 22:15:26

by Johannes Berg

[permalink] [raw]
Subject: Re: Oops/Warning report for the week of March 28th 2008


> Ahh. So it's easily repeatable for you.. Do you think you could bisect it?
> At least a few runs? Even a partial bisection will give a nice (== much
> smaller) range of commits to be blamed, and might tell us why it started
> happening..

Unfortunately, it takes forever on this machine to compile a kernel
after any bigger changes. I can do it early next week when I have access
to my bigger box again.

On the other hand, it should be easily reproducible by anyone else with
the same trick, here's what I do:

* configure X to use /dev/input/event* devices
* in an xterm, do something like
rmmod usbhid ; modprobe usbhid
* switch to a VT
* watch kernel crash as X releases the grab on the event device

Mind you, I actually did this with the appletouch driver, but I don't
think it makes a difference since the problematic thing is the code that
X grabs the device. A trivial program that executes the grab ioctl would
probably suffice if you leave it running over a rmmod/modprobe cycle.

johannes


Attachments:
signature.asc (828.00 B)
This is a digitally signed message part

2008-03-28 22:15:45

by Jiri Kosina

[permalink] [raw]
Subject: Re: Oops/Warning report for the week of March 28th 2008

On Fri, 28 Mar 2008, Linus Torvalds wrote:

> > I tend to think it's actually a regression, I've done exactly what I
> > described there dozens of times before on older kernels and it doesn't
> > seem to be a race condition, X always locks the input device and I
> > never had trouble with that before.
> Ahh. So it's easily repeatable for you.. Do you think you could bisect it?
> At least a few runs? Even a partial bisection will give a nice (== much
> smaller) range of commits to be blamed, and might tell us why it started
> happening..

... and if bisecting the whole tree isn't the way for you to go right now,
maybe restricting to drivers/input/ and drivers/base/ (as I think this
might be some broken driver core refcounting) and possibly drivers/hid/
(if you are using any HID input device) might also be sufficient.

--
Jiri Kosina
SUSE Labs

2008-03-28 22:33:57

by Arjan van de Ven

[permalink] [raw]
Subject: Re: Oops/Warning report for the week of March 28th 2008

Linus Torvalds wrote:
>
>> http://www.kerneloops.org/searchweek.php?search=input_release_device
>
> The problem with kerneloops is that it seems to be really hard to figure
> out the *source* of the oops. I can find the oopses (and it's really good
> with the whole search-and-clump-together-by-version thing), but then when
> some oops like this is found, it's hard to see where your kerneloops
> scripts found the oops from, so the context of the oops is all gone.

the website has 2 (well 3) sources for oopses

1) Postings to LKML and related mailing list
2) kernel.org bugzilla bugs (but this technically goes via a mailing list so could count as #1)
3) a daemon/desktop applet application that will, when an oops is found in dmesg or /var/log/messages,
pops up a dialog and asks if it's OK to submit the crash data. (always/yes/no/never)

Fedora 9 alpha / beta versions have this daemon/applet included (as does Gentoo and several other distros).

> Is there something obvious that I'm missing? I'd really like to see the
> whole posting that the oops came from. Do you save the originals or even
> just message ID's from the ones you pick from emails?

For the postings to LKML there is already a link to the message (based on msg id)
For bugzilla entries the website actually shows a bugzilla logo and links to the bug directly
For the daemon-submitted oopses there is no other data unfortunately (but this is for privacy
reasons; since no other info than the oops is submitted users are more likely to allow this)

The question (and it is a question still for me) is if those anonymous submissions are useful.
I think they are, since they allow for getting a more "neutral" measure of "which oopses are frequent"...
At least more neutral than "people who manage to notice an oops in dmesg and then post to LKML"...
And there's sometimes power in numbers.. for example in the do_sys_open() oops, I had like 20 no-info oopses,
but they ALL had nouveau in the modules list...

I can try doing some sorting tricks in the grouping, so that those oopses who have more data than this, show
up on top... (and maybe a direct link on the search page)

2008-03-28 23:10:31

by Arjan van de Ven

[permalink] [raw]
Subject: Re: Oops/Warning report for the week of March 28th 2008

Arjan van de Ven wrote:
> Linus Torvalds wrote:
>>
>>> http://www.kerneloops.org/searchweek.php?search=input_release_device
>>
>> The problem with kerneloops is that it seems to be really hard to
>> figure out the *source* of the oops. I can find the oopses (and it's
>> really good with the whole search-and-clump-together-by-version
>> thing), but then when some oops like this is found, it's hard to see
>> where your kerneloops scripts found the oops from, so the context of
>> the oops is all gone.

> I can try doing some sorting tricks in the grouping, so that those
> oopses who have more data than this, show
> up on top... (and maybe a direct link on the search page)

I just implemented this; below are two examples of this:

http://www.kerneloops.org/searchweek.php?search=tcp_mark_head_lost
http://www.kerneloops.org/search.php?search=input_release_device

The ratio of automatic versus mailinglist/bugzilla mails is quite high in favor
of the automatic ones so there's not all THAT many of these at this point,
especially in the "searchweek" version, which only shows the last 7 days of
reports.

2008-03-29 12:20:29

by Björn Steinbrink

[permalink] [raw]
Subject: Re: Oops/Warning report for the week of March 28th 2008

On 2008.03.28 17:16:42 -0400, Dmitry Torokhov wrote:
> On Fri, Mar 28, 2008 at 01:51:38PM -0700, Linus Torvalds wrote:
> >
> >
> > On Fri, 28 Mar 2008, Linus Torvalds wrote:
> > >
> > > Is there something obvious that I'm missing? I'd really like to see the
> > > whole posting that the oops came from. Do you save the originals or even
> > > just message ID's from the ones you pick from emails?
> >
> > Hmm. Definitely not from the kernel mailing list. I'm intrigued, where did
> > that oops #5814 come from (picked a recent one at random)?
> >
> > The thing is recent, and oopses on "mutex_lock(dev->mutex)" in
> > input_release_device. In particular, the path *seems* to be this one:
> >
> > evdev_release ->
> > evdev_ungrab ->
> > input_release_device ->
> > mutex_lock ->
> > mutex_lock_nested ->
> > __mutex_lock_common ->
> > list_add_tail(&waiter.list, &lock->wait_list)
> >
> > where "lock->wait_list.prev" seems to be 0x6b6b6b6b6b6b6b6b, which is the
> > use-after-free poison pattern.
> >
> > (In fact, I think the access that actually oopses is when the
> > debug version of __list_add() does
> >
> > if (unlikely(prev->next != next)) {
> >
> > because that "prev" pointer is crap).
> >
> > So it seems that when input_release_device() does:
> >
> > struct input_dev *dev = handle->dev;
> >
> > mutex_lock(&dev->mutex);
> >
> > the "dev" it uses has already been released. And this only shows up as a
> > problem when you have slab debugging turned on (like the Fedora kernels
> > do, thank you all Fedora guys).
> >
> > The odd thing is that I don't think any of this code has really changed
> > recently.
> >
>
> There is a patch from Pete that works around the problem by not
> calling input_release_device() on devices that are gone. But what
> I don't understand is why the parent input device is gone since
> sysfs/driver core should be keeping a reference to it since it is
> a parent of evdev. input_dev shoudl only be released after
> evdev_free() is called.

Hm? evdev_free only does the final kfree call. The calls to device_del
and put_device are already happening in device_disconnect, so the parent
can go away any time after that. Do you say that that should be moved
into evdev_free instead? I'm not familiar with the code, but at first
sight, I'd say that we should have a "if (evdev->grab)
evdev_ungrab(evdev, evdev->grab)" in evdev_cleanup, looks like the
logical place to do that. Anything I'm missing?

Bj?rn