2006-02-10 20:07:51

by Miles Lane

[permalink] [raw]
Subject: 2.6.16-rc2-mm1 -- BUG: warning at drivers/ieee1394/ohci1394.c:235/get_phy_reg()

BUG: warning at drivers/ieee1394/ohci1394.c:235/get_phy_reg()
<c0103a85> show_trace+0xd/0xf <c0103a9c> dump_stack+0x15/0x17
<f931b977> get_phy_reg+0x76/0xed [ohci1394] <f931ba34>
ohci_devctl+0x46/0x524 [ohci1394]
<f931cc2a> ohci_irq_handler+0x2e4/0x69e [ohci1394] <c013b91a>
handle_IRQ_event+0x26/0x51
<c013b9d2> __do_IRQ+0x8d/0xe3 <c0104c7f> do_IRQ+0x65/0x84
=======================
<c0103166> common_interrupt+0x1a/0x20 <c0104ce7> do_softirq+0x49/0x52
=======================
<c01209ed> irq_exit+0x36/0x38 <c010d679> smp_apic_timer_interrupt+0x5c/0x5f
<c01031f4> apic_timer_interrupt+0x1c/0x24 <f931b8f9>
set_phy_reg+0xe7/0xef [ohci1394]
<f931ba47> ohci_devctl+0x59/0x524 [ohci1394] <f9419079>
hpsb_reset_bus+0x1f/0x26 [ieee1394]
<f941af08> delayed_reset_bus+0xb7/0xbb [ieee1394] <c012a0ac>
run_workqueue+0x7a/0xbb
<c012a247> worker_thread+0xce/0x100 <c012ce31> kthread+0xa6/0xd3
<c0100bcd> kernel_thread_helper+0x5/0xb
BUG: warning at drivers/ieee1394/ohci1394.c:264/set_phy_reg()
<c0103a85> show_trace+0xd/0xf <c0103a9c> dump_stack+0x15/0x17
<f931b89f> set_phy_reg+0x8d/0xef [ohci1394] <f931ba47>
ohci_devctl+0x59/0x524 [ohci1394]
<f931cc2a> ohci_irq_handler+0x2e4/0x69e [ohci1394] <c013b91a>
handle_IRQ_event+0x26/0x51
<c013b9d2> __do_IRQ+0x8d/0xe3 <c0104c7f> do_IRQ+0x65/0x84
=======================
<c0103166> common_interrupt+0x1a/0x20 <c0104ce7> do_softirq+0x49/0x52
=======================
<c01209ed> irq_exit+0x36/0x38 <c010d679> smp_apic_timer_interrupt+0x5c/0x5f
<c01031f4> apic_timer_interrupt+0x1c/0x24 <f931b8f9>
set_phy_reg+0xe7/0xef [ohci1394]
<f931ba47> ohci_devctl+0x59/0x524 [ohci1394] <f9419079>
hpsb_reset_bus+0x1f/0x26 [ieee1394]
<f941af08> delayed_reset_bus+0xb7/0xbb [ieee1394] <c012a0ac>
run_workqueue+0x7a/0xbb
<c012a247> worker_thread+0xce/0x100 <c012ce31> kthread+0xa6/0xd3
<c0100bcd> kernel_thread_helper+0x5/0xb


2006-02-10 20:22:12

by Andrew Morton

[permalink] [raw]
Subject: Re: 2.6.16-rc2-mm1 -- BUG: warning at drivers/ieee1394/ohci1394.c:235/get_phy_reg()

Miles Lane <[email protected]> wrote:
>
> BUG: warning at drivers/ieee1394/ohci1394.c:235/get_phy_reg()

That's a -mm-only warning telling you that get_phy_reg() is doing a
one-millisecond-or-more busywait while local interrupts are disabled.

That's the sort of thing which makes audio developers pursue 1394 developers
with sharp sticks.

2006-02-11 00:31:56

by Stefan Richter

[permalink] [raw]
Subject: Re: 2.6.16-rc2-mm1 -- BUG: warning at drivers/ieee1394/ohci1394.c:235/get_phy_reg()

Andrew Morton wrote:
> Miles Lane <[email protected]> wrote:
>>BUG: warning at drivers/ieee1394/ohci1394.c:235/get_phy_reg()
>
> That's a -mm-only warning telling you that get_phy_reg() is doing a
> one-millisecond-or-more busywait while local interrupts are disabled.

Same with set_phy_reg, ohci_soft_reset, ohci_hw_csr_reg. At least the
callers of ohci_hw_csr_reg (in particular, csr_highlevel.host_reset,
furthermore csr_highlevel.add_host) could fairly easily be converted to
a workqueue job or perhaps moved to the nodemgr thread. I have not
checked the other offending functions yet.
--
Stefan Richter
-=====-=-==- --=- -=-==
http://arcgraph.de/sr/

2006-02-11 02:30:43

by Lee Revell

[permalink] [raw]
Subject: Re: 2.6.16-rc2-mm1 -- BUG: warning at drivers/ieee1394/ohci1394.c:235/get_phy_reg()

On Sat, 2006-02-11 at 01:31 +0100, Stefan Richter wrote:
> Andrew Morton wrote:
> > Miles Lane <[email protected]> wrote:
> >>BUG: warning at drivers/ieee1394/ohci1394.c:235/get_phy_reg()
> >
> > That's a -mm-only warning telling you that get_phy_reg() is doing a
> > one-millisecond-or-more busywait while local interrupts are disabled.
>
> Same with set_phy_reg, ohci_soft_reset, ohci_hw_csr_reg. At least the
> callers of ohci_hw_csr_reg (in particular, csr_highlevel.host_reset,
> furthermore csr_highlevel.add_host) could fairly easily be converted to
> a workqueue job or perhaps moved to the nodemgr thread. I have not
> checked the other offending functions yet.

In fact I'm pretty sure we have seen reports on the linux-audio-user
list of apps reporting underruns when a 1394 drive is accessed.

Lee

2006-02-11 02:28:58

by Lee Revell

[permalink] [raw]
Subject: Re: 2.6.16-rc2-mm1 -- BUG: warning at drivers/ieee1394/ohci1394.c:235/get_phy_reg()

On Fri, 2006-02-10 at 12:21 -0800, Andrew Morton wrote:
> Miles Lane <[email protected]> wrote:
> >
> > BUG: warning at drivers/ieee1394/ohci1394.c:235/get_phy_reg()
>
> That's a -mm-only warning telling you that get_phy_reg() is doing a
> one-millisecond-or-more busywait while local interrupts are disabled.
>
> That's the sort of thing which makes audio developers pursue 1394 developers
> with sharp sticks.

Hmm, interesting, did -mm get a "lite" version of Ingo's latency tracer?

Lee

2006-02-11 04:37:59

by Andrew Morton

[permalink] [raw]
Subject: Re: 2.6.16-rc2-mm1 -- BUG: warning at drivers/ieee1394/ohci1394.c:235/get_phy_reg()

Lee Revell <[email protected]> wrote:
>
> On Fri, 2006-02-10 at 12:21 -0800, Andrew Morton wrote:
> > Miles Lane <[email protected]> wrote:
> > >
> > > BUG: warning at drivers/ieee1394/ohci1394.c:235/get_phy_reg()
> >
> > That's a -mm-only warning telling you that get_phy_reg() is doing a
> > one-millisecond-or-more busywait while local interrupts are disabled.
> >
> > That's the sort of thing which makes audio developers pursue 1394 developers
> > with sharp sticks.
>
> Hmm, interesting, did -mm get a "lite" version of Ingo's latency tracer?
>

ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.16-rc2/2.6.16-rc2-mm1/broken-out/debug-warn-if-we-sleep-in-an-irq-for-a-long-time.patch

2006-02-11 08:39:45

by Stefan Richter

[permalink] [raw]
Subject: Re: 2.6.16-rc2-mm1 -- BUG: warning at drivers/ieee1394/ohci1394.c:235/get_phy_reg()

Lee Revell wrote:
> On Sat, 2006-02-11 at 01:31 +0100, Stefan Richter wrote:
>>Andrew Morton wrote:
>>>That's a -mm-only warning telling you that get_phy_reg() is doing a
>>>one-millisecond-or-more busywait while local interrupts are disabled.
>>
>>Same with set_phy_reg, ohci_soft_reset, ohci_hw_csr_reg.
[...]
> In fact I'm pretty sure we have seen reports on the linux-audio-user
> list of apps reporting underruns when a 1394 drive is accessed.

These might have different causes. The 1394 storage driver, sbp2,
performs quite a lot of protocol handling in IRQ or soft IRQ context. I
have been planning to move this off into process context -- not really
because of potential latency issues but for sake of more reliable
protocol handling. I hope I get to it RSN.

AFAICS the offending ohci1394 functions mentioned above are never called
during normal access to an SBP-2 device. They only happen when a host
adapter is initialized or shut down or whenever a 1394 device is plugged
in or out (actually whenever self ID reception is completed).
--
Stefan Richter
-=====-=-==- --=- -=-==
http://arcgraph.de/sr/

2006-02-11 08:50:57

by Stefan Richter

[permalink] [raw]
Subject: Re: 2.6.16-rc2-mm1 -- BUG: warning at drivers/ieee1394/ohci1394.c:235/get_phy_reg()

Andrew Morton wrote:
>>>That's a -mm-only warning telling you that get_phy_reg() is doing a
>>>one-millisecond-or-more busywait while local interrupts are disabled.
...
> ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.16-rc2/2.6.16-rc2-mm1/broken-out/debug-warn-if-we-sleep-in-an-irq-for-a-long-time.patch

So that's why when I heard of this issue the first time, it was about
Redhat/ Fedora Core kernels. But even though this issue is very old, no
bug report got through to linux1394-devel until December 2005.

BTW, why not swap the order of expressions:
WARN_ON(!(warned++) && in_irq()); \
--
Stefan Richter
-=====-=-==- --=- -=-==
http://arcgraph.de/sr/

2006-02-11 09:04:08

by Andrew Morton

[permalink] [raw]
Subject: Re: 2.6.16-rc2-mm1 -- BUG: warning at drivers/ieee1394/ohci1394.c:235/get_phy_reg()

Stefan Richter <[email protected]> wrote:
>
> BTW, why not swap the order of expressions:
> WARN_ON(!(warned++) && in_irq()); \

Because then it'd generate a warning every 4 billionth time through.