2006-09-05 17:37:53

by Miles Lane

Subject: 2.6.18-rc5-mm1 + all hotfixes -- INFO: possible recursive locking detected

ieee1394: Node changed: 0-01:1023 -> 0-00:1023
ieee1394: Node changed: 0-02:1023 -> 0-01:1023
ieee1394: Node suspended: ID:BUS[0-00:1023] GUID[0080880002103eae]

=============================================
[ INFO: possible recursive locking detected ]
2.6.18-rc5-mm1 #2
---------------------------------------------
knodemgrd_0/2321 is trying to acquire lock:
(&s->rwsem){----}, at: [<f8958897>] nodemgr_probe_ne+0x311/0x38d [ieee1394]

but task is already holding lock:
(&s->rwsem){----}, at: [<f8959078>] nodemgr_host_thread+0x717/0x883 [ieee1394]

other info that might help us debug this:
2 locks held by knodemgrd_0/2321:
#0: (nodemgr_serialize){--..}, at: [<c11e76cd>] mutex_lock_interruptible+0x1c/0x21
#1: (&s->rwsem){----}, at: [<f8959078>] nodemgr_host_thread+0x717/0x883 [ieee1394]

stack backtrace:
[<c1003c97>] dump_trace+0x69/0x1b7
[<c1003dfa>] show_trace_log_lvl+0x15/0x28
[<c10040f5>] show_trace+0x16/0x19
[<c1004110>] dump_stack+0x18/0x1d
[<c102f1e1>] __lock_acquire+0x7a2/0x9f8
[<c102f70a>] lock_acquire+0x56/0x74
[<c102b805>] down_write+0x27/0x41
[<f8958897>] nodemgr_probe_ne+0x311/0x38d [ieee1394]
[<f8959098>] nodemgr_host_thread+0x737/0x883 [ieee1394]
[<c1028c19>] kthread+0xaf/0xde
[<c100397b>] kernel_thread_helper+0x7/0x10
DWARF2 unwinder stuck at kernel_thread_helper+0x7/0x10

Leftover inexact backtrace:

[<c1003dfa>] show_trace_log_lvl+0x15/0x28
[<c10040f5>] show_trace+0x16/0x19
[<c1004110>] dump_stack+0x18/0x1d
[<c102f1e1>] __lock_acquire+0x7a2/0x9f8
[<c102f70a>] lock_acquire+0x56/0x74
[<c102b805>] down_write+0x27/0x41
[<f8958897>] nodemgr_probe_ne+0x311/0x38d [ieee1394]
[<f8959098>] nodemgr_host_thread+0x737/0x883 [ieee1394]
[<c1028c19>] kthread+0xaf/0xde
[<c100397b>] kernel_thread_helper+0x7/0x10
=======================
ieee1394: Node resumed: ID:BUS[0-00:1023] GUID[0080880002103eae]
ieee1394: Node changed: 0-00:1023 -> 0-01:1023
ieee1394: Node changed: 0-01:1023 -> 0-02:1023


2006-09-05 18:13:44

by Andrew Morton

Subject: Re: 2.6.18-rc5-mm1 + all hotfixes -- INFO: possible recursive locking detected

On Tue, 5 Sep 2006 10:37:51 -0700
"Miles Lane" <[email protected]> wrote:

> ieee1394: Node changed: 0-01:1023 -> 0-00:1023
> ieee1394: Node changed: 0-02:1023 -> 0-01:1023
> ieee1394: Node suspended: ID:BUS[0-00:1023] GUID[0080880002103eae]
>
> =============================================
> [ INFO: possible recursive locking detected ]
> 2.6.18-rc5-mm1 #2
> ---------------------------------------------
> knodemgrd_0/2321 is trying to acquire lock:
> (&s->rwsem){----}, at: [<f8958897>] nodemgr_probe_ne+0x311/0x38d [ieee1394]
>
> but task is already holding lock:
> (&s->rwsem){----}, at: [<f8959078>] nodemgr_host_thread+0x717/0x883 [ieee1394]
>
> other info that might help us debug this:
> 2 locks held by knodemgrd_0/2321:
> #0: (nodemgr_serialize){--..}, at: [<c11e76cd>]
> mutex_lock_interruptible+0x1c/0x21
> #1: (&s->rwsem){----}, at: [<f8959078>]
> nodemgr_host_thread+0x717/0x883 [ieee1394]
>
> stack backtrace:
> [<c1003c97>] dump_trace+0x69/0x1b7
> [<c1003dfa>] show_trace_log_lvl+0x15/0x28
> [<c10040f5>] show_trace+0x16/0x19
> [<c1004110>] dump_stack+0x18/0x1d
> [<c102f1e1>] __lock_acquire+0x7a2/0x9f8
> [<c102f70a>] lock_acquire+0x56/0x74
> [<c102b805>] down_write+0x27/0x41
> [<f8958897>] nodemgr_probe_ne+0x311/0x38d [ieee1394]
> [<f8959098>] nodemgr_host_thread+0x737/0x883 [ieee1394]
> [<c1028c19>] kthread+0xaf/0xde
> [<c100397b>] kernel_thread_helper+0x7/0x10
> DWARF2 unwinder stuck at kernel_thread_helper+0x7/0x10
>
> Leftover inexact backtrace:
>
> [<c1003dfa>] show_trace_log_lvl+0x15/0x28
> [<c10040f5>] show_trace+0x16/0x19
> [<c1004110>] dump_stack+0x18/0x1d
> [<c102f1e1>] __lock_acquire+0x7a2/0x9f8
> [<c102f70a>] lock_acquire+0x56/0x74
> [<c102b805>] down_write+0x27/0x41
> [<f8958897>] nodemgr_probe_ne+0x311/0x38d [ieee1394]
> [<f8959098>] nodemgr_host_thread+0x737/0x883 [ieee1394]
> [<c1028c19>] kthread+0xaf/0xde
> [<c100397b>] kernel_thread_helper+0x7/0x10
> =======================
> ieee1394: Node resumed: ID:BUS[0-00:1023] GUID[0080880002103eae]
> ieee1394: Node changed: 0-00:1023 -> 0-01:1023
> ieee1394: Node changed: 0-01:1023 -> 0-02:1023

That's a 1394 glitch, possibly introduced by git-ieee1394.patch.

2006-09-05 18:17:10

by Miles Lane

Subject: Re: 2.6.18-rc5-mm1 + all hotfixes -- INFO: possible recursive locking detected

On 9/5/06, Andrew Morton <[email protected]> wrote:
> On Tue, 5 Sep 2006 10:37:51 -0700
> "Miles Lane" <[email protected]> wrote:
>
> > ieee1394: Node changed: 0-01:1023 -> 0-00:1023
> > ieee1394: Node changed: 0-02:1023 -> 0-01:1023
> > ieee1394: Node suspended: ID:BUS[0-00:1023] GUID[0080880002103eae]
> >
> > =============================================
> > [ INFO: possible recursive locking detected ]
> > 2.6.18-rc5-mm1 #2
> > ---------------------------------------------
> > knodemgrd_0/2321 is trying to acquire lock:
> > (&s->rwsem){----}, at: [<f8958897>] nodemgr_probe_ne+0x311/0x38d [ieee1394]
> >
> > but task is already holding lock:
> > (&s->rwsem){----}, at: [<f8959078>] nodemgr_host_thread+0x717/0x883 [ieee1394]
> >
> > other info that might help us debug this:
> > 2 locks held by knodemgrd_0/2321:
> > #0: (nodemgr_serialize){--..}, at: [<c11e76cd>]
> > mutex_lock_interruptible+0x1c/0x21
> > #1: (&s->rwsem){----}, at: [<f8959078>]
> > nodemgr_host_thread+0x717/0x883 [ieee1394]
> >
> > stack backtrace:
> > [<c1003c97>] dump_trace+0x69/0x1b7
> > [<c1003dfa>] show_trace_log_lvl+0x15/0x28
> > [<c10040f5>] show_trace+0x16/0x19
> > [<c1004110>] dump_stack+0x18/0x1d
> > [<c102f1e1>] __lock_acquire+0x7a2/0x9f8
> > [<c102f70a>] lock_acquire+0x56/0x74
> > [<c102b805>] down_write+0x27/0x41
> > [<f8958897>] nodemgr_probe_ne+0x311/0x38d [ieee1394]
> > [<f8959098>] nodemgr_host_thread+0x737/0x883 [ieee1394]
> > [<c1028c19>] kthread+0xaf/0xde
> > [<c100397b>] kernel_thread_helper+0x7/0x10
> > DWARF2 unwinder stuck at kernel_thread_helper+0x7/0x10
> >
> > Leftover inexact backtrace:
> >
> > [<c1003dfa>] show_trace_log_lvl+0x15/0x28
> > [<c10040f5>] show_trace+0x16/0x19
> > [<c1004110>] dump_stack+0x18/0x1d
> > [<c102f1e1>] __lock_acquire+0x7a2/0x9f8
> > [<c102f70a>] lock_acquire+0x56/0x74
> > [<c102b805>] down_write+0x27/0x41
> > [<f8958897>] nodemgr_probe_ne+0x311/0x38d [ieee1394]
> > [<f8959098>] nodemgr_host_thread+0x737/0x883 [ieee1394]
> > [<c1028c19>] kthread+0xaf/0xde
> > [<c100397b>] kernel_thread_helper+0x7/0x10
> > =======================
> > ieee1394: Node resumed: ID:BUS[0-00:1023] GUID[0080880002103eae]
> > ieee1394: Node changed: 0-00:1023 -> 0-01:1023
> > ieee1394: Node changed: 0-01:1023 -> 0-02:1023
>
> That's a 1394 glitch, possibly introduced by git-ieee1394.patch.

Would you like me to verify that removing the patch fixes it, or
should I wait for the 2.6.18-rc6-mm1 tree?

Thanks,
Miles

2006-09-05 19:04:38

by Stefan Richter

Subject: Re: 2.6.18-rc5-mm1 + all hotfixes -- INFO: possible recursive locking detected

Miles Lane wrote:
> On 9/5/06, Andrew Morton <[email protected]> wrote:
>> On Tue, 5 Sep 2006 10:37:51 -0700
>> "Miles Lane" <[email protected]> wrote:
>>
>>> ieee1394: Node changed: 0-01:1023 -> 0-00:1023
>>> ieee1394: Node changed: 0-02:1023 -> 0-01:1023
>>> ieee1394: Node suspended: ID:BUS[0-00:1023] GUID[0080880002103eae]
>>>
>>> =============================================
>>> [ INFO: possible recursive locking detected ]
>>> 2.6.18-rc5-mm1 #2
>>> ---------------------------------------------
>>> knodemgrd_0/2321 is trying to acquire lock:
>>> (&s->rwsem){----}, at: [<f8958897>] nodemgr_probe_ne+0x311/0x38d [ieee1394]
>>>
>>> but task is already holding lock:
>>> (&s->rwsem){----}, at: [<f8959078>] nodemgr_host_thread+0x717/0x883 [ieee1394]

How often does this happen?

[...]
>> That's a 1394 glitch, possibly introduced by git-ieee1394.patch.
>
> Would you like me to verify that removing the patch fixes it, or
> should I wait for the 2.6.18-rc6-mm1 tree?

My patches
"ieee1394: nodemgr: switch to kthread api, replace reset semaphore" and
"ieee1394: nodemgr: convert nodemgr_serialize semaphore to mutex"
may be relevant. They are included in git-ieee1394.patch.

Could you revert them individually and test? It should be possible to
just "patch -p1 -R < ...." the following patchfiles:
http://me.in-berlin.de/~s5r6/linux1394/updates/2.6.18-rc5/patches/119-ieee1394-nodemgr-convert-nodemgr_serialize-semaphore-to-mutex.patch
If the problem persists, also revert
http://me.in-berlin.de/~s5r6/linux1394/updates/2.6.18-rc5/patches/118-ieee1394-nodemgr-switch-to-kthread-api--replace-reset-semaphore.patch

If that does not help, install them again and unapply all ieee1394
patches from -mm. If you have the time.

Thanks a lot,
--
Stefan Richter
-=====-=-==- =--= --=-=
http://arcgraph.de/sr/

2006-09-05 19:20:00

by Miles Lane

Subject: Re: 2.6.18-rc5-mm1 + all hotfixes -- INFO: possible recursive locking detected

On 9/5/06, Stefan Richter <[email protected]> wrote:
> Miles Lane wrote:
> > On 9/5/06, Andrew Morton <[email protected]> wrote:
> >> On Tue, 5 Sep 2006 10:37:51 -0700
> >> "Miles Lane" <[email protected]> wrote:
> >>
> >>> ieee1394: Node changed: 0-01:1023 -> 0-00:1023
> >>> ieee1394: Node changed: 0-02:1023 -> 0-01:1023
> >>> ieee1394: Node suspended: ID:BUS[0-00:1023] GUID[0080880002103eae]
> >>>
> >>> =============================================
> >>> [ INFO: possible recursive locking detected ]
> >>> 2.6.18-rc5-mm1 #2
> >>> ---------------------------------------------
> >>> knodemgrd_0/2321 is trying to acquire lock:
> >>> (&s->rwsem){----}, at: [<f8958897>] nodemgr_probe_ne+0x311/0x38d [ieee1394]
> >>>
> >>> but task is already holding lock:
> >>> (&s->rwsem){----}, at: [<f8959078>] nodemgr_host_thread+0x717/0x883 [ieee1394]
>
> How often does this happen?

It seems to happen each time I plug in my JVC MiniDV camera (model GR-DVL9800U).

> >> That's a 1394 glitch, possibly introduced by git-ieee1394.patch.
> >
> > Would you like me to verify that removing the patch fixes it, or
> > should I wait for the 2.6.18-rc6-mm1 tree?
>
> My patches
> "ieee1394: nodemgr: switch to kthread api, replace reset semaphore" and
> "ieee1394: nodemgr: convert nodemgr_serialize semaphore to mutex"
> may be relevant. They are included in git-ieee1394.patch.
>
> Could you revert them individually and test? It should be possible to
> just "patch -p1 -R < ...." the following patchfiles:
> http://me.in-berlin.de/~s5r6/linux1394/updates/2.6.18-rc5/patches/119-ieee1394-nodemgr-convert-nodemgr_serialize-semaphore-to-mutex.patch
> If the problem persists, also revert
> http://me.in-berlin.de/~s5r6/linux1394/updates/2.6.18-rc5/patches/118-ieee1394-nodemgr-switch-to-kthread-api--replace-reset-semaphore.patch
>
> If that does not help, install them again and unapply all ieee1394
> patches from -mm. If you have the time.

I am setting up to test with the first patch removed. The patch
doesn't apply cleanly, but I suspect this is no big deal.

patch -p1 -R < /home/miles/119-ieee1394-nodemgr-convert-nodemgr_serialize-semaphore-to-mutex.patch
patching file drivers/ieee1394/nodemgr.c
Hunk #2 succeeded at 1630 (offset 9 lines).
Hunk #3 succeeded at 1659 (offset 9 lines).
Hunk #4 succeeded at 1677 (offset 9 lines).

2006-09-05 19:24:19

by Stefan Richter

Subject: Re: 2.6.18-rc5-mm1 + all hotfixes -- INFO: possible recursive locking detected

Andrew Morton wrote:
> On Tue, 5 Sep 2006 10:37:51 -0700
> "Miles Lane" <[email protected]> wrote:
>
>> ieee1394: Node changed: 0-01:1023 -> 0-00:1023
>> ieee1394: Node changed: 0-02:1023 -> 0-01:1023
>> ieee1394: Node suspended: ID:BUS[0-00:1023] GUID[0080880002103eae]
>>
>> =============================================
>> [ INFO: possible recursive locking detected ]
>> 2.6.18-rc5-mm1 #2
>> ---------------------------------------------
>> knodemgrd_0/2321 is trying to acquire lock:
>> (&s->rwsem){----}, at: [<f8958897>] nodemgr_probe_ne+0x311/0x38d [ieee1394]
>>
>> but task is already holding lock:
>> (&s->rwsem){----}, at: [<f8959078>] nodemgr_host_thread+0x717/0x883 [ieee1394]
[...]

This information confuses me. These places are not supposed to be the
ones where the locks were actually acquired, are they?

> That's a 1394 glitch, possibly introduced by git-ieee1394.patch.

Or maybe it's older. Nodemgr takes class->subsys.rwsem and
device.bus->subsys.rwsem. It always did. Could there be a change in
driver core which makes this recursive? Or has it always been recursive?
For example,

static void nodemgr_update_pdrv(struct node_entry *ne)
{
	struct unit_directory *ud;
	struct hpsb_protocol_driver *pdrv;
	struct class *class = &nodemgr_ud_class;
	struct class_device *cdev;

	down_read(&class->subsys.rwsem);
	list_for_each_entry(cdev, &class->children, node) {
		ud = container_of(cdev, struct unit_directory, class_dev);
		if (ud->ne != ne || !ud->device.driver)
			continue;

		pdrv = container_of(ud->device.driver,
				    struct hpsb_protocol_driver, driver);

		if (pdrv->update && pdrv->update(ud)) {
			down_write(&ud->device.bus->subsys.rwsem);
			device_release_driver(&ud->device);
			up_write(&ud->device.bus->subsys.rwsem);
		}
	}
	up_read(&class->subsys.rwsem);
}


Miles,

perhaps you should rather unapply all 1394 patches at once.
git-ieee1394.patch is alas the lowermost patch of a stack of dependent
patches. I somehow expect that the "possible recursive locking" persists
even if all the 1394 patches were removed.

Thanks in advance,
--
Stefan Richter
-=====-=-==- =--= --=-=
http://arcgraph.de/sr/

2006-09-05 19:49:38

by Miles Lane

Subject: Re: 2.6.18-rc5-mm1 + all hotfixes -- INFO: possible recursive locking detected

On 9/5/06, Andrew Morton <[email protected]> wrote:
> On Tue, 5 Sep 2006 10:37:51 -0700
> "Miles Lane" <[email protected]> wrote:
>
> > ieee1394: Node changed: 0-01:1023 -> 0-00:1023
> > ieee1394: Node changed: 0-02:1023 -> 0-01:1023
> > ieee1394: Node suspended: ID:BUS[0-00:1023] GUID[0080880002103eae]
> >
> > =============================================
> > [ INFO: possible recursive locking detected ]
> > 2.6.18-rc5-mm1 #2
> > ---------------------------------------------
> > knodemgrd_0/2321 is trying to acquire lock:
> > (&s->rwsem){----}, at: [<f8958897>] nodemgr_probe_ne+0x311/0x38d [ieee1394]
> >
> > but task is already holding lock:
> > (&s->rwsem){----}, at: [<f8959078>] nodemgr_host_thread+0x717/0x883 [ieee1394]
> >
> > other info that might help us debug this:
> > 2 locks held by knodemgrd_0/2321:
> > #0: (nodemgr_serialize){--..}, at: [<c11e76cd>]
> > mutex_lock_interruptible+0x1c/0x21
> > #1: (&s->rwsem){----}, at: [<f8959078>]
> > nodemgr_host_thread+0x717/0x883 [ieee1394]
> >
> > stack backtrace:
> > [<c1003c97>] dump_trace+0x69/0x1b7
> > [<c1003dfa>] show_trace_log_lvl+0x15/0x28
> > [<c10040f5>] show_trace+0x16/0x19
> > [<c1004110>] dump_stack+0x18/0x1d
> > [<c102f1e1>] __lock_acquire+0x7a2/0x9f8
> > [<c102f70a>] lock_acquire+0x56/0x74
> > [<c102b805>] down_write+0x27/0x41
> > [<f8958897>] nodemgr_probe_ne+0x311/0x38d [ieee1394]
> > [<f8959098>] nodemgr_host_thread+0x737/0x883 [ieee1394]
> > [<c1028c19>] kthread+0xaf/0xde
> > [<c100397b>] kernel_thread_helper+0x7/0x10
> > DWARF2 unwinder stuck at kernel_thread_helper+0x7/0x10
> >
> > Leftover inexact backtrace:
> >
> > [<c1003dfa>] show_trace_log_lvl+0x15/0x28
> > [<c10040f5>] show_trace+0x16/0x19
> > [<c1004110>] dump_stack+0x18/0x1d
> > [<c102f1e1>] __lock_acquire+0x7a2/0x9f8
> > [<c102f70a>] lock_acquire+0x56/0x74
> > [<c102b805>] down_write+0x27/0x41
> > [<f8958897>] nodemgr_probe_ne+0x311/0x38d [ieee1394]
> > [<f8959098>] nodemgr_host_thread+0x737/0x883 [ieee1394]
> > [<c1028c19>] kthread+0xaf/0xde
> > [<c100397b>] kernel_thread_helper+0x7/0x10
> > =======================
> > ieee1394: Node resumed: ID:BUS[0-00:1023] GUID[0080880002103eae]
> > ieee1394: Node changed: 0-00:1023 -> 0-01:1023
> > ieee1394: Node changed: 0-01:1023 -> 0-02:1023
>
> That's a 1394 glitch, possibly introduced by git-ieee1394.patch.
>
>

Hi Andrew,

I am having trouble with backing out the git-ieee1394 patches. Suggestions?
I am not knowledgeable enough about the kernel code to fix broken patches.

# patch -p1 -R --dry-run < /home/miles/git-ieee1394.patch
patching file drivers/ieee1394/csr.c
patching file drivers/ieee1394/csr.h
patching file drivers/ieee1394/dma.c
patching file drivers/ieee1394/dma.h
patching file drivers/ieee1394/dv1394-private.h
patching file drivers/ieee1394/dv1394.c
patching file drivers/ieee1394/eth1394.c
Hunk #1 succeeded at 66 with fuzz 1 (offset -1 lines).
patching file drivers/ieee1394/highlevel.h
patching file drivers/ieee1394/hosts.c
Hunk #1 succeeded at 98 with fuzz 2 (offset 8 lines).
Hunk #2 FAILED at 113.
Hunk #3 FAILED at 123.
2 out of 3 hunks FAILED -- saving rejects to file drivers/ieee1394/hosts.c.rej
patching file drivers/ieee1394/hosts.h
Hunk #2 succeeded at 109 (offset -3 lines).
Hunk #3 succeeded at 157 (offset -3 lines).
Hunk #4 succeeded at 167 (offset -3 lines).
Hunk #5 succeeded at 193 (offset -3 lines).
patching file drivers/ieee1394/ieee1394-ioctl.h
patching file drivers/ieee1394/ieee1394.h
patching file drivers/ieee1394/ieee1394_core.c
Hunk #1 succeeded at 354 (offset -1 lines).
patching file drivers/ieee1394/ieee1394_core.h
Hunk #1 FAILED at 1.
Hunk #2 succeeded at 57 (offset -1 lines).
Hunk #3 succeeded at 79 (offset -1 lines).
Hunk #4 succeeded at 91 (offset -1 lines).
Hunk #5 succeeded at 203 (offset -1 lines).
Hunk #6 succeeded at 222 (offset -1 lines).
1 out of 6 hunks FAILED -- saving rejects to file drivers/ieee1394/ieee1394_core.h.rej
patching file drivers/ieee1394/ieee1394_hotplug.h
patching file drivers/ieee1394/ieee1394_transactions.c
Hunk #1 succeeded at 13 with fuzz 2 (offset -1 lines).
Hunk #2 succeeded at 232 (offset 18 lines).
Hunk #3 succeeded at 279 (offset 18 lines).
patching file drivers/ieee1394/ieee1394_transactions.h
patching file drivers/ieee1394/ieee1394_types.h
Hunk #1 FAILED at 1.
Hunk #2 succeeded at 9 with fuzz 2 (offset -22 lines).
Hunk #3 FAILED at 32.
2 out of 3 hunks FAILED -- saving rejects to file drivers/ieee1394/ieee1394_types.h.rej
patching file drivers/ieee1394/iso.c
patching file drivers/ieee1394/iso.h
patching file drivers/ieee1394/nodemgr.c
Hunk #4 succeeded at 418 (offset 10 lines).
Hunk #5 succeeded at 1260 (offset 9 lines).
Hunk #6 succeeded at 1268 (offset 9 lines).
Hunk #7 succeeded at 1309 (offset 9 lines).
Hunk #8 succeeded at 1501 (offset 9 lines).
Hunk #9 succeeded at 1631 (offset 9 lines).
Hunk #10 succeeded at 1676 (offset 9 lines).
Hunk #11 succeeded at 1707 (offset 9 lines).
Hunk #12 succeeded at 1773 (offset 9 lines).
Hunk #13 succeeded at 1815 (offset 9 lines).
patching file drivers/ieee1394/nodemgr.h
Hunk #5 succeeded at 105 with fuzz 1 (offset -1 lines).
Hunk #6 succeeded at 152 (offset -1 lines).
Hunk #7 succeeded at 169 (offset -1 lines).
patching file drivers/ieee1394/ohci1394.c
patching file drivers/ieee1394/raw1394-private.h
patching file drivers/ieee1394/raw1394.c
patching file drivers/ieee1394/sbp2.c
Hunk #1 succeeded at 367 (offset 11 lines).
Hunk #2 succeeded at 380 (offset 11 lines).
patching file drivers/ieee1394/video1394.c

# patch -p1 -R --dry-run < /home/miles/git-ieee1394-fixup.patch
patching file drivers/ieee1394/hosts.c
Hunk #1 succeeded at 100 with fuzz 2 (offset 10 lines).
Hunk #2 FAILED at 117.
Hunk #3 FAILED at 128.
2 out of 3 hunks FAILED -- saving rejects to file drivers/ieee1394/hosts.c.rej

2006-09-05 19:52:37

by Stefan Richter

Subject: Re: 2.6.18-rc5-mm1 + all hotfixes -- INFO: possible recursive locking detected

Miles Lane wrote:
> The patch doesn't apply cleanly, but I suspect this is no big deal.
>
> patch -p1 -R < /home/miles/119-ieee1394-nodemgr-convert-nodemgr_serialize-semaphore-to-mutex.patch
> patching file drivers/ieee1394/nodemgr.c
> Hunk #2 succeeded at 1630 (offset 9 lines).
> Hunk #3 succeeded at 1659 (offset 9 lines).
> Hunk #4 succeeded at 1677 (offset 9 lines).

Yes, these offsets are harmless.

Thanks for the help to debug this.
--
Stefan Richter
-=====-=-==- =--= --=-=
http://arcgraph.de/sr/

2006-09-05 20:20:29

by Stefan Richter

Subject: Re: 2.6.18-rc5-mm1 + all hotfixes -- INFO: possible recursive locking detected

Miles Lane wrote:
> I am having trouble with backing out the git-ieee1394 patches.

Take a look at
http://kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.18-rc5/2.6.18-rc5-mm1/broken-out/series

There are a number of 1394 subsystem patches; the last one is
ieee1394-sbp2-more-help-in-kconfig.patch. (That assumes that no
further external patches touch ieee1394.) The order of patches in
the series file is the order in which they were applied.

Not all of these patches depend on each other, but some do. So the
safest way to unapply them is to follow the exact reverse order.

One tool to make this a little bit easier is quilt. This should be
available as a package for most distributions. I haven't tried it myself
yet, but akpm's "broken-out" patch distribution can be manipulated by
quilt. I guess it works like the following method --- which has the
drawback that you cannot use it with your existing linux-2.6.18-rc5-mm1
build. (Except with a trick, see below.)

Install linux-2.6.18-rc5.
Unpack
http://kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.18-rc5/2.6.18-rc5-mm1/2.6.18-rc5-mm1-broken-out.tar.bz2

Rename the broken-out directory to "linux-2.6.18-rc5/patches".
Copy your linux-2.6.18-rc5-mm1/.config to linux-2.6.18-rc5.

Apply all the patches, in the order given by patches/series:
$ cd linux-2.6.18-rc5
$ quilt push -a

Fetch all of
http://kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.18-rc5/2.6.18-rc5-mm1/hot-fixes/
and add it on top of all regular mm1 patches:
$ quilt import ~/hot-fixes/*.patch
$ quilt push -a

Now open patches/series in an editor. Find the ieee1394 patches. Move
all of them to the bottom of the series file. Save it. You can now
revert each 1394 patch by
$ quilt pop

Build the kernel as usual.

Now to the trick I mentioned before. To avoid starting from
linux-2.6.18-rc5 even though you already built and booted
2.6.18-rc5-mm1, perform the steps above on top of 2.6.18-rc5 until and
including the step where you imported and pushed the hot-fixes. After
that, just copy the patches/ and .pc/ directories over to your existing
2.6.18-rc5-mm1. Check the effect with
$ cd ../2.6.18-rc5-mm1
$ quilt top
This should give a message that the last hot fix is topmost. It should
now be possible to run "quilt pop" etc.

Anyway; manually removing the ieee1394 patches by looking at the order
in the series file may be faster than setting up quilt and the second
kernel source tree.
--
Stefan Richter
-=====-=-==- =--= --=-=
http://arcgraph.de/sr/

2006-09-05 20:27:15

by Stefan Richter

Subject: Re: 2.6.18-rc5-mm1 + all hotfixes -- INFO: possible recursive locking detected

I wrote:
[...]
> Now open patches/series in an editor. Find the ieee1394 patches. Move
> all of them to the bottom of the series file. Save it. You can now
> revert each 1394 patch by
> $ quilt pop

(Repeat until git-ieee1394.patch has been removed.)

> Build the kernel as usual.

(Of course you just need to build, install, and reload the kernel
modules if you have ieee1394 configured as module.)
--
Stefan Richter
-=====-=-==- =--= --=-=
http://arcgraph.de/sr/

2006-09-05 20:33:27

by Andrew Morton

Subject: Re: 2.6.18-rc5-mm1 + all hotfixes -- INFO: possible recursive locking detected

On Tue, 05 Sep 2006 22:19:51 +0200
Stefan Richter <[email protected]> wrote:

> One tool to make this a little bit easier is quilt. This should be
> available as a package for most distributions. I haven't tried it myself
> yet, but akpm's "broken-out" patch distribution can be manipulated by
> quilt.

Each -mm announcement contains the following text:

:- If you hit a bug in -mm and it is not obvious which patch caused it, it is
: most valuable if you can perform a bisection search to identify which patch
: introduced the bug. Instructions for this process are at
:
: http://www.zip.com.au/~akpm/linux/patches/stuff/bisecting-mm-trees.txt
:
: But beware that this process takes some time (around ten rebuilds and
: reboots), so consider reporting the bug first and if we cannot immediately
: identify the faulty patch, then perform the bisection search.
:

;)

2006-09-05 21:08:45

by Arjan van de Ven

Subject: Re: 2.6.18-rc5-mm1 + all hotfixes -- INFO: possible recursive locking detected

On Tue, 2006-09-05 at 21:23 +0200, Stefan Richter wrote:
> Andrew Morton wrote:
> > On Tue, 5 Sep 2006 10:37:51 -0700
> > "Miles Lane" <[email protected]> wrote:
> >
> >> ieee1394: Node changed: 0-01:1023 -> 0-00:1023
> >> ieee1394: Node changed: 0-02:1023 -> 0-01:1023
> >> ieee1394: Node suspended: ID:BUS[0-00:1023] GUID[0080880002103eae]
> >>
> >> =============================================
> >> [ INFO: possible recursive locking detected ]
> >> 2.6.18-rc5-mm1 #2
> >> ---------------------------------------------
> >> knodemgrd_0/2321 is trying to acquire lock:
> >> (&s->rwsem){----}, at: [<f8958897>] nodemgr_probe_ne+0x311/0x38d [ieee1394]
> >>
> >> but task is already holding lock:
> >> (&s->rwsem){----}, at: [<f8959078>] nodemgr_host_thread+0x717/0x883 [ieee1394]
> [...]
>
> This information confuses me. These places are not supposed to be the
> ones where the locks were actually acquired, are they?

they should be yes
(but inlined functions get the name of the function they are inlined
into)



2006-09-05 22:27:55

by Stefan Richter

Subject: Re: 2.6.18-rc5-mm1 + all hotfixes -- INFO: possible recursive locking detected

Arjan van de Ven wrote:
> On Tue, 2006-09-05 at 21:23 +0200, Stefan Richter wrote:
>> This information confuses me. These places are not supposed to be the
>> ones where the locks were actually acquired, are they?
>
> they should be yes
> (but inlined functions get the name of the function they are inlined
> into)

Was there function inlining performed? E.g. on those functions that are
called from only one place?
--
Stefan Richter
-=====-=-==- =--= --==-
http://arcgraph.de/sr/

2006-09-06 00:10:21

by Adrian Bunk

Subject: Re: 2.6.18-rc5-mm1 + all hotfixes -- INFO: possible recursive locking detected

On Wed, Sep 06, 2006 at 12:27:08AM +0200, Stefan Richter wrote:
> Arjan van de Ven wrote:
> > On Tue, 2006-09-05 at 21:23 +0200, Stefan Richter wrote:
> >> This information confuses me. These places are not supposed to be the
> >> ones where the locks were actually acquired, are they?
> >
> > they should be yes
> > (but inlined functions get the name of the function they are inlined
> > into)
>
> Was there function inlining performed? E.g. on those functions that are
> called from only one place?

If a static function has only one caller it gets inlined.

> Stefan Richter

cu
Adrian

--

"Is there not promise of rain?" Ling Tan asked suddenly out
of the darkness. There had been need of rain for many days.
"Only a promise," Lao Er said.
Pearl S. Buck - Dragon Seed

2006-09-06 06:39:49

by Arjan van de Ven

Subject: Re: 2.6.18-rc5-mm1 + all hotfixes -- INFO: possible recursive locking detected

On Tue, 2006-09-05 at 10:37 -0700, Miles Lane wrote:
> ieee1394: Node changed: 0-01:1023 -> 0-00:1023
> ieee1394: Node changed: 0-02:1023 -> 0-01:1023
> ieee1394: Node suspended: ID:BUS[0-00:1023] GUID[0080880002103eae]
>
> =============================================
> [ INFO: possible recursive locking detected ]
> 2.6.18-rc5-mm1 #2
> ---------------------------------------------
> knodemgrd_0/2321 is trying to acquire lock:
> (&s->rwsem){----}, at: [<f8958897>] nodemgr_probe_ne+0x311/0x38d [ieee1394]
>
> but task is already holding lock:
> (&s->rwsem){----}, at: [<f8959078>] nodemgr_host_thread+0x717/0x883 [ieee1394]


looks like a real bug to me:

nodemgr_node_probe() takes down_read(&class->subsys.rwsem) and then
calls nodemgr_probe_ne() which calls nodemgr_update_pdrv() which does
down_read(&class->subsys.rwsem).

Such recursive taking of rwsems is not allowed (rwsems are fair, if a
write comes in in between then there is a deadlock).
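
The deadlock described above can be illustrated with a toy model. This is a minimal single-threaded C sketch, not kernel code: `toy_rwsem` and every `toy_*` helper are hypothetical names, nothing here sleeps or schedules, and the fairness rule (a queued writer blocks all later readers) is an assumption taken from the description above.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model of a fair reader/writer semaphore: counters only, no blocking. */
struct toy_rwsem {
	int active_readers;	/* readers currently holding the lock */
	bool writer_active;	/* true while a writer holds the lock */
	int queued_writers;	/* writers waiting for the readers to drain */
};

/* Try to take the lock for reading; false means the caller would sleep. */
static bool toy_down_read(struct toy_rwsem *s)
{
	assert(s->active_readers >= 0);
	/* Fairness assumption: a queued writer blocks every later reader. */
	if (s->writer_active || s->queued_writers > 0)
		return false;
	s->active_readers++;
	return true;
}

/* Try to take the lock for writing; if it is busy, the writer is queued. */
static bool toy_down_write(struct toy_rwsem *s)
{
	if (!s->writer_active && s->active_readers == 0 &&
	    s->queued_writers == 0) {
		s->writer_active = true;
		return true;
	}
	s->queued_writers++;
	return false;
}

/*
 * Replay the sequence: outer read by a task, optionally a writer from
 * another task, then a nested read by the first task.
 * Returns true if the nested read would block, which is a deadlock because
 * the writer waits for the outer read that the blocked task still holds.
 */
static bool toy_recursive_read_deadlocks(bool writer_arrives_between)
{
	struct toy_rwsem s = { 0, false, 0 };

	toy_down_read(&s);		/* outer down_read */
	if (writer_arrives_between)
		toy_down_write(&s);	/* queued behind the outer reader */
	return !toy_down_read(&s);	/* nested down_read by the same task */
}
```

Calling `toy_recursive_read_deadlocks(false)` models the common case in which the nested read goes through unnoticed, while `toy_recursive_read_deadlocks(true)` models the window in which a writer queues up between the two reads.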


2006-09-06 07:14:30

by Stefan Richter

Subject: Re: 2.6.18-rc5-mm1 + all hotfixes -- INFO: possible recursive locking detected

I wrote:
> Andrew Morton wrote:
>> On Tue, 5 Sep 2006 10:37:51 -0700
>> "Miles Lane" <[email protected]> wrote:
>>
>>> ieee1394: Node changed: 0-01:1023 -> 0-00:1023
>>> ieee1394: Node changed: 0-02:1023 -> 0-01:1023
>>> ieee1394: Node suspended: ID:BUS[0-00:1023] GUID[0080880002103eae]
>>>
>>> =============================================
>>> [ INFO: possible recursive locking detected ]
>>> 2.6.18-rc5-mm1 #2
>>> ---------------------------------------------
>>> knodemgrd_0/2321 is trying to acquire lock:
>>> (&s->rwsem){----}, at: [<f8958897>] nodemgr_probe_ne+0x311/0x38d [ieee1394]
>>>
>>> but task is already holding lock:
>>> (&s->rwsem){----}, at: [<f8959078>] nodemgr_host_thread+0x717/0x883 [ieee1394]
> [...]
>
> This information confuses me. These places are not supposed to be the
> ones where the locks were actually acquired, are they?
>
>> That's a 1394 glitch, possibly introduced by git-ieee1394.patch.
>
> Or maybe it's older. Nodemgr takes class->subsys.rwsem and
> device.bus->subsys.rwsem. It always did. Could there be a change in
> driver core which makes this recursive? Or has it always been recursive?
> For example,
>
> static void nodemgr_update_pdrv(struct node_entry *ne)
> {
> struct unit_directory *ud;
> struct hpsb_protocol_driver *pdrv;
> struct class *class = &nodemgr_ud_class;
> struct class_device *cdev;
>
> down_read(&class->subsys.rwsem);
> list_for_each_entry(cdev, &class->children, node) {
> ud = container_of(cdev, struct unit_directory, class_dev);
> if (ud->ne != ne || !ud->device.driver)
> continue;
>
> pdrv = container_of(ud->device.driver, struct hpsb_protocol_driver, driver);
>
> if (pdrv->update && pdrv->update(ud)) {
> down_write(&ud->device.bus->subsys.rwsem);
> device_release_driver(&ud->device);
> up_write(&ud->device.bus->subsys.rwsem);
> }
> }
> up_read(&class->subsys.rwsem);
> }

Hi Greg,

perhaps you could advise on this. It appears from grepping through the
sources that drivers/ieee1394/nodemgr.c is the only one with mixed
access to device.bus->subsys.rwsem and class->subsys.rwsem.

Other usages of subsys.rwsem that I found are:
1a.) dev->bus->subsys.rwsem
drivers/ide/ide-proc.c and drivers/net/phy/phy_device.c take
dev->bus->subsys.rwsem. drivers/pnp/card.c takes dev.bus->subsys.rwsem.

1b.) driver.bus->subsys.rwsem
drivers/s390/net/qeth_proc.c takes driver.bus->subsys.rwsem.

2.) class->subsys.rwsem
drivers/scsi/hosts.c takes class->subsys.rwsem.

3.) bustype.subsys.rwsem
drivers/input/serio/serio.c takes serio_bus.subsys.rwsem.
drivers/input/gameport/gameport.c takes gameport_bus.subsys.rwsem.
drivers/base/power/shutdown.c takes devices_subsys.rwsem.
drivers/usb/core/devices.c and devio.c take usb_bus_type.subsys.rwsem.

Do class->subsys.rwsem, bus->subsys.rwsem, and bus_type.subsys.rwsem
point to identical or different lock instances?

Either way, could it hurt to convert nodemgr to uniformly use
ieee1394_bus_type.subsys.rwsem all over the place?

Thanks,
--
Stefan Richter
-=====-=-==- =--= --==-
http://arcgraph.de/sr/
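
One possible reading of the question above: lockdep, as far as I understand it, validates lock classes rather than individual lock instances, and locks initialised at the same site normally share a class. The following is a minimal C sketch of that idea, not lockdep itself; `toy_lock`, `toy_task` and the other `toy_*` names are hypothetical, and whether class->subsys.rwsem and bus->subsys.rwsem really share a class is an assumption that this thread does not confirm.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_MAX_HELD 8

/* A lock instance points at its class key; the key stands for the init site. */
struct toy_lock {
	const void *class_key;
};

/* Per-task stack of currently held locks. */
struct toy_task {
	const struct toy_lock *held[TOY_MAX_HELD];
	size_t depth;
};

/* True if the task already holds some lock of the same class as next. */
static bool toy_same_class_held(const struct toy_task *t,
				const struct toy_lock *next)
{
	for (size_t i = 0; i < t->depth; i++)
		if (t->held[i]->class_key == next->class_key)
			return true;
	return false;
}

/* Record the acquisition; returns true if a recursion warning would fire. */
static bool toy_acquire(struct toy_task *t, const struct toy_lock *l)
{
	bool warn = toy_same_class_held(t, l);

	assert(t->depth < TOY_MAX_HELD);
	t->held[t->depth++] = l;
	return warn;
}

/*
 * Two distinct lock instances, taken one after the other by one task.
 * shared_init_site selects whether both use the same class key
 * (assumption: this is what a common init site would produce).
 */
static bool toy_nodemgr_scenario(bool shared_init_site)
{
	static const char subsys_key = 0, other_key = 0;
	struct toy_lock class_rwsem = { &subsys_key };
	struct toy_lock bus_rwsem = {
		shared_init_site ? &subsys_key : &other_key
	};
	struct toy_task task = { { 0 }, 0 };

	toy_acquire(&task, &class_rwsem);	/* models the outer down_read */
	return toy_acquire(&task, &bus_rwsem);	/* models the inner down_write */
}
```

Under this model, `toy_nodemgr_scenario(true)` flags the second acquisition even though the two locks are different instances, and `toy_nodemgr_scenario(false)` does not. If the assumption holds, identical or different instances would both be compatible with the report quoted in this thread.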

2006-09-06 16:51:42

by Stefan Richter

Subject: Re: 2.6.18-rc5-mm1 + all hotfixes -- INFO: possible recursive locking detected

I wrote:
> I wrote:
>> Or maybe it's older. Nodemgr takes class->subsys.rwsem and
>> device.bus->subsys.rwsem. It always did. Could there be a change in
>> driver core which makes this recursive? Or has it always been recursive?
>> For example,
>>
>> static void nodemgr_update_pdrv(struct node_entry *ne)
>> {
>> struct unit_directory *ud;
>> struct hpsb_protocol_driver *pdrv;
>> struct class *class = &nodemgr_ud_class;
>> struct class_device *cdev;
>>
>> down_read(&class->subsys.rwsem);
>> list_for_each_entry(cdev, &class->children, node) {

This may be wrong anyway. According to include/linux/device.h,
class->sem should be used to protect access to class->children. There
are more places in nodemgr of this sort.

>> ud = container_of(cdev, struct unit_directory, class_dev);
>> if (ud->ne != ne || !ud->device.driver)
>> continue;
>>
>> pdrv = container_of(ud->device.driver, struct hpsb_protocol_driver, driver);
>>
>> if (pdrv->update && pdrv->update(ud)) {
>> down_write(&ud->device.bus->subsys.rwsem);
>> device_release_driver(&ud->device);
>> up_write(&ud->device.bus->subsys.rwsem);
>> }
>> }
>> up_read(&class->subsys.rwsem);
>> }
>
> Hi Greg,
>
> perhaps you could advise on this. It appears from grepping through the
> sources that drivers/ieee1394/nodemgr.c is the only one with mixed
> access to device.bus->subsys.rwsem and class->subsys.rwsem.
>
> Other usages of subsys.rwsem that I found are:
> 1a.) dev->bus->subsys.rwsem
> drivers/ide/ide-proc.c and drivers/net/phy/phy_device.c take
> dev->bus->subsys.rwsem. drivers/pnp/card.c takes dev.bus->subsys.rwsem.
>
> 1b.) driver.bus->subsys.rwsem
> drivers/s390/net/qeth_proc.c takes driver.bus->subsys.rwsem.
>
> 2.) class->subsys.rwsem
> drivers/scsi/hosts.c takes class->subsys.rwsem.
>
> 3.) bustype.subsys.rwsem
> drivers/input/serio/serio.c takes serio_bus.subsys.rwsem.
> drivers/input/gameport/gameport.c takes gameport_bus.subsys.rwsem.
> drivers/base/power/shutdown.c takes devices_subsys.rwsem.
> drivers/usb/core/devices.c and devio.c take usb_bus_type.subsys.rwsem.
>
> Do class->subsys.rwsem, bus->subsys.rwsem, and bus_type.subsys.rwsem
> point to identical or different lock instances?
>
> Either way, could it hurt to convert nodemgr to uniformly use
> ieee1394_bus_type.subsys.rwsem all over the place?
>
> Thanks,


--
Stefan Richter
-=====-=-==- =--= --==-
http://arcgraph.de/sr/

2006-09-06 17:05:28

by Stefan Richter

Subject: [RFT PATCH 1/2] ieee1394: nodemgr: fix rwsem recursion

nodemgr_update_pdrv grabbed an rw semaphore (as reader) which was
already taken by its caller's caller, nodemgr_probe_ne (as reader too).
Reported by Miles Lane, call path pointed out by Arjan van de Ven.

FIXME:
Shouldn't we rather use class->sem there, not class->subsys.rwsem?

Signed-off-by: Stefan Richter <[email protected]>
---
Index: linux/drivers/ieee1394/nodemgr.c
===================================================================
--- linux.orig/drivers/ieee1394/nodemgr.c 2006-08-30 20:47:57.000000000 +0200
+++ linux/drivers/ieee1394/nodemgr.c 2006-09-06 19:03:24.000000000 +0200
@@ -1316,6 +1316,7 @@ static void nodemgr_node_scan(struct hos
}


+/* Caller needs to hold nodemgr_ud_class.subsys.rwsem as reader. */
static void nodemgr_suspend_ne(struct node_entry *ne)
{
struct class_device *cdev;
@@ -1368,15 +1369,14 @@ static void nodemgr_resume_ne(struct nod
}


+/* Caller needs to hold nodemgr_ud_class.subsys.rwsem as reader. */
static void nodemgr_update_pdrv(struct node_entry *ne)
{
struct unit_directory *ud;
struct hpsb_protocol_driver *pdrv;
- struct class *class = &nodemgr_ud_class;
struct class_device *cdev;

- down_read(&class->subsys.rwsem);
- list_for_each_entry(cdev, &class->children, node) {
+ list_for_each_entry(cdev, &nodemgr_ud_class.children, node) {
ud = container_of(cdev, struct unit_directory, class_dev);
if (ud->ne != ne || !ud->device.driver)
continue;
@@ -1389,7 +1389,6 @@ static void nodemgr_update_pdrv(struct n
up_write(&ud->device.bus->subsys.rwsem);
}
}
- up_read(&class->subsys.rwsem);
}


@@ -1420,6 +1419,8 @@ static void nodemgr_irm_write_bc(struct
}


+/* Caller needs to hold nodemgr_ud_class.subsys.rwsem as reader because the
+ * calls to nodemgr_update_pdrv() and nodemgr_suspend_ne() here require it. */
static void nodemgr_probe_ne(struct host_info *hi, struct node_entry *ne, int generation)
{
struct device *dev;


2006-09-06 17:07:19

by Stefan Richter

Subject: [RFT PATCH 2/2] ieee1394: nodemgr: grab class.subsys.rwsem in nodemgr_resume_ne

nodemgr_resume_ne was iterating over nodemgr_ud_class.children without
protection by nodemgr_ud_class.subsys.rwsem.

FIXME:
Shouldn't we rather use class->sem there, not class->subsys.rwsem?

Signed-off-by: Stefan Richter <[email protected]>
---
Index: linux/drivers/ieee1394/nodemgr.c
===================================================================
--- linux.orig/drivers/ieee1394/nodemgr.c 2006-09-06 18:34:35.000000000 +0200
+++ linux/drivers/ieee1394/nodemgr.c 2006-09-06 18:38:20.000000000 +0200
@@ -1352,6 +1352,7 @@ static void nodemgr_resume_ne(struct nod
ne->in_limbo = 0;
device_remove_file(&ne->device, &dev_attr_ne_in_limbo);

+ down_read(&nodemgr_ud_class.subsys.rwsem);
down_read(&ne->device.bus->subsys.rwsem);
list_for_each_entry(cdev, &nodemgr_ud_class.children, node) {
ud = container_of(cdev, struct unit_directory, class_dev);
@@ -1363,6 +1364,7 @@ static void nodemgr_resume_ne(struct nod
ud->device.driver->resume(&ud->device);
}
up_read(&ne->device.bus->subsys.rwsem);
+ up_read(&nodemgr_ud_class.subsys.rwsem);

HPSB_DEBUG("Node resumed: ID:BUS[" NODE_BUS_FMT "] GUID[%016Lx]",
NODE_BUS_ARGS(ne->host, ne->nodeid), (unsigned long long)ne->guid);


2006-09-06 22:35:32

by Greg KH

Subject: Re: 2.6.18-rc5-mm1 + all hotfixes -- INFO: possible recursive locking detected

On Wed, Sep 06, 2006 at 09:13:34AM +0200, Stefan Richter wrote:
> Do class->subsys.rwsem, bus->subsys.rwsem, and bus_type.subsys.rwsem
> point to identical or different lock instances?

class->subsys.rwsem is different from the others. bus->subsys.rwsem and
bus_type.subsys.rwsem are probably the same thing (depending on what
that bus-> pointer is to.)

> Either way, could it hurt to convert nodemgr to uniformly use
> ieee1394_bus_type.subsys.rwsem all over the place?

Probably a good idea.

thanks,

greg k-h
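
Greg's answer can be pictured with a small userspace sketch. The struct definitions below are mock types written for this illustration, not the kernel's definitions, and same_lock() is a hypothetical helper. The sketch only shows that two lock expressions refer to the same lock exactly when they evaluate to the same address: dev->bus->subsys.rwsem is the bus type's rwsem whenever dev->bus points at that bus type, while a class embeds a subsystem of its own.

```c
#include <stdbool.h>

/* Mock types for illustration only; NOT the kernel's definitions. */
struct rw_semaphore { int placeholder; };
struct subsystem { struct rw_semaphore rwsem; };
struct bus_type { struct subsystem subsys; };
struct class { struct subsystem subsys; };
struct device { struct bus_type *bus; };

struct bus_type ieee1394_bus_type;
struct class nodemgr_ud_class;

/* A device on the ieee1394 bus: its bus pointer refers to the bus type. */
struct device ud_device = { .bus = &ieee1394_bus_type };

/* Two expressions name the same lock iff they yield the same address. */
bool same_lock(const struct rw_semaphore *a, const struct rw_semaphore *b)
{
	return a == b;
}
```

Under these assumptions, same_lock(&ud_device.bus->subsys.rwsem, &ieee1394_bus_type.subsys.rwsem) is true, and comparing either of them with &nodemgr_ud_class.subsys.rwsem is false.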

2006-09-07 22:45:37

by Miles Lane

Subject: Re: 2.6.18-rc5-mm1 + all hotfixes -- INFO: possible recursive locking detected

On 9/6/06, Stefan Richter <[email protected]> wrote:
> I wrote:
> > I wrote:
> >> Or maybe it's older. Nodemgr takes class->subsys.rwsem and
> >> device.bus->subsys.rwsem. It always did. Could there be a change in
> >> driver core which makes this recursive? Or has it always been recursive?
> >> For example,
> >>
> >> static void nodemgr_update_pdrv(struct node_entry *ne)
> >> {
> >> struct unit_directory *ud;
> >> struct hpsb_protocol_driver *pdrv;
> >> struct class *class = &nodemgr_ud_class;
> >> struct class_device *cdev;
> >>
> >> down_read(&class->subsys.rwsem);
> >> list_for_each_entry(cdev, &class->children, node) {
>
> This may be wrong anyway. According to include/linux/device.h,
> class->sem should be used to protect access to class->children. There
> are more places in nodemgr of this sort.
>
> >> ud = container_of(cdev, struct unit_directory, class_dev);
> >> if (ud->ne != ne || !ud->device.driver)
> >> continue;
> >>
> >> pdrv = container_of(ud->device.driver, struct hpsb_protocol_driver, driver);
> >>
> >> if (pdrv->update && pdrv->update(ud)) {
> >> down_write(&ud->device.bus->subsys.rwsem);
> >> device_release_driver(&ud->device);
> >> up_write(&ud->device.bus->subsys.rwsem);
> >> }
> >> }
> >> up_read(&class->subsys.rwsem);
> >> }
> >
> > Hi Greg,
> >
> > perhaps you could advise on this. It appears from grepping through the
> > sources that drivers/ieee1394/nodemgr.c is the only one with mixed
> > access to device.bus->subsys.rwsem and class->subsys.rwsem.
> >
> > Other usages of subsys.rwsem that I found are:
> > 1a.) dev->bus->subsys.rwsem
> > drivers/ide/ide-proc.c and drivers/net/phy/phy_device.c take
> > dev->bus->subsys.rwsem. drivers/pnp/card.c takes dev.bus->subsys.rwsem.
> >
> > 1b.) driver.bus->subsys.rwsem
> > drivers/s390/net/qeth_proc.c takes driver.bus->subsys.rwsem.
> >
> > 2.) class->subsys.rwsem
> > drivers/scsi/hosts.c takes class->subsys.rwsem.
> >
> > 3.) bustype.subsys.rwsem
> > drivers/input/serio/serio.c takes serio_bus.subsys.rwsem.
> > drivers/input/gameport/gameport.c takes gameport_bus.subsys.rwsem.
> > drivers/base/power/shutdown.c takes devices_subsys.rwsem.
> > drivers/usb/core/devices.c and devio.c take usb_bus_type.subsys.rwsem.
> >
> > Do class->subsys.rwsem, bus->subsys.rwsem, and bus_type.subsys.rwsem
> > point to identical or different lock instances?
> >
> > Either way, could it hurt to convert nodemgr to uniformly use
> > ieee1394_bus_type.subsys.rwsem all over the place?

I don't have time to do the bisection testing. If there is a patch
you'd like me to test against 2.6.18-rc5-mm1+all hotfixes, please let
me know. I apologize for not being able to narrow this down further
for you.

Miles

2006-09-07 23:25:27

by Stefan Richter

Subject: Re: 2.6.18-rc5-mm1 + all hotfixes -- INFO: possible recursive locking detected

Miles Lane wrote:
> I don't have time to do the bisection testing. If there is a patch
> you'd like me to test against 2.6.18-rc5-mm1+all hotfixes, please let
> me know. I apologize for not being able to narrow this down further
> for you.

Bisection is probably not necessary anymore. The issue seems to be much
older than -mm's changes to nodemgr. Please apply the patches
    ieee1394: nodemgr: fix rwsem recursion
    ieee1394: nodemgr: grab class.subsys.rwsem in nodemgr_resume_ne
on top of all of -mm. I posted them yesterday but will mail them again.
(linux1394-devel was kept in the dark by SpamCop.) Thanks a lot for your
help.
--
Stefan Richter
-=====-=-==- =--= -=---
http://arcgraph.de/sr/