2010-11-25 15:51:45

by Alexander E. Patrakov

[permalink] [raw]
Subject: Nobody cared about IRQs at shutdown

Hello.

After switching my Gentoo desktop from sysvinit + openrc to systemd, I
started getting "nobody cared" messages about IRQs 16 and 19 (common
thing: they are assigned to the USB controllers, that's why CC:
linux-usb) at shutdown. Also, instead of shutting down immediately, the
kernel started waiting for something to time out (and after the timeout,
it does power off the computer). Adding the "irqpoll" parameter does
work around the issue. A screen photo illustrating the problem and the
kernel config are attached.

I didn't try to bisect because there is no known good kernel. I.e., the
issue doesn't exist without systemd, and I didn't run systemd with
earlier kernels.

Please help diagnosing and, if possible, fixing this problem.

The broken image at the top of the screen is from extlinux (I use it to
boot the kernel because I don't like grub).

Here is my lspci && lspci -nv:

00:00.0 Host bridge: Intel Corporation 82P965/G965 Memory Controller Hub
(rev 02)
00:02.0 VGA compatible controller: Intel Corporation 82G965 Integrated
Graphics Controller (rev 02)
00:02.1 Display controller: Intel Corporation 82G965 Integrated Graphics
Controller (rev 02)
00:03.0 Communication controller: Intel Corporation 82P965/G965 HECI
Controller (rev 02)
00:19.0 Ethernet controller: Intel Corporation 82566DC Gigabit Network
Connection (rev 02)
00:1a.0 USB Controller: Intel Corporation 82801H (ICH8 Family) USB UHCI
Controller #4 (rev 02)
00:1a.1 USB Controller: Intel Corporation 82801H (ICH8 Family) USB UHCI
Controller #5 (rev 02)
00:1a.7 USB Controller: Intel Corporation 82801H (ICH8 Family) USB2 EHCI
Controller #2 (rev 02)
00:1b.0 Audio device: Intel Corporation 82801H (ICH8 Family) HD Audio
Controller (rev 02)
00:1c.0 PCI bridge: Intel Corporation 82801H (ICH8 Family) PCI Express
Port 1 (rev 02)
00:1c.1 PCI bridge: Intel Corporation 82801H (ICH8 Family) PCI Express
Port 2 (rev 02)
00:1c.2 PCI bridge: Intel Corporation 82801H (ICH8 Family) PCI Express
Port 3 (rev 02)
00:1c.3 PCI bridge: Intel Corporation 82801H (ICH8 Family) PCI Express
Port 4 (rev 02)
00:1c.4 PCI bridge: Intel Corporation 82801H (ICH8 Family) PCI Express
Port 5 (rev 02)
00:1d.0 USB Controller: Intel Corporation 82801H (ICH8 Family) USB UHCI
Controller #1 (rev 02)
00:1d.1 USB Controller: Intel Corporation 82801H (ICH8 Family) USB UHCI
Controller #2 (rev 02)
00:1d.2 USB Controller: Intel Corporation 82801H (ICH8 Family) USB UHCI
Controller #3 (rev 02)
00:1d.7 USB Controller: Intel Corporation 82801H (ICH8 Family) USB2 EHCI
Controller #1 (rev 02)
00:1e.0 PCI bridge: Intel Corporation 82801 PCI Bridge (rev f2)
00:1f.0 ISA bridge: Intel Corporation 82801HB/HR (ICH8/R) LPC Interface
Controller (rev 02)
00:1f.2 SATA controller: Intel Corporation 82801HB (ICH8) 4 port SATA
AHCI Controller (rev 02)
00:1f.3 SMBus: Intel Corporation 82801H (ICH8 Family) SMBus Controller
(rev 02)
02:00.0 IDE interface: Marvell Technology Group Ltd. 88SE6101
single-port PATA133 interface (rev b1)
06:01.0 Multimedia controller: Philips Semiconductors
SAA7131/SAA7133/SAA7135 Video Broadcast Decoder (rev d1)
06:03.0 FireWire (IEEE 1394): Texas Instruments TSB43AB22/A
IEEE-1394a-2000 Controller (PHY/Link)
00:00.0 0600: 8086:29a0 (rev 02)
Subsystem: 8086:514d
Flags: bus master, fast devsel, latency 0
Capabilities: <access denied>
Kernel driver in use: agpgart-intel

00:02.0 0300: 8086:29a2 (rev 02) (prog-if 00 [VGA controller])
Subsystem: 8086:514d
Flags: bus master, fast devsel, latency 0, IRQ 40
Memory at 90300000 (32-bit, non-prefetchable) [size=1M]
Memory at 80000000 (64-bit, prefetchable) [size=256M]
I/O ports at 2110 [size=8]
Expansion ROM at <unassigned> [disabled]
Capabilities: <access denied>
Kernel driver in use: i915

00:02.1 0380: 8086:29a3 (rev 02)
Subsystem: 8086:514d
Flags: bus master, fast devsel, latency 0
Memory at 90200000 (32-bit, non-prefetchable) [size=1M]
Capabilities: <access denied>

00:03.0 0780: 8086:29a4 (rev 02)
Subsystem: 8086:514d
Flags: bus master, fast devsel, latency 0, IRQ 11
Memory at 90426100 (64-bit, non-prefetchable) [size=16]
Capabilities: <access denied>

00:19.0 0200: 8086:104b (rev 02)
Subsystem: 8086:0001
Flags: bus master, fast devsel, latency 0, IRQ 42
Memory at 90400000 (32-bit, non-prefetchable) [size=128K]
Memory at 90424000 (32-bit, non-prefetchable) [size=4K]
I/O ports at 20e0 [size=32]
Capabilities: <access denied>
Kernel driver in use: e1000e
Kernel modules: e1000e

00:1a.0 0c03: 8086:2834 (rev 02) (prog-if 00 [UHCI])
Subsystem: 8086:514d
Flags: bus master, medium devsel, latency 0, IRQ 16
I/O ports at 20c0 [size=32]
Kernel driver in use: uhci_hcd

00:1a.1 0c03: 8086:2835 (rev 02) (prog-if 00 [UHCI])
Subsystem: 8086:514d
Flags: bus master, medium devsel, latency 0, IRQ 21
I/O ports at 20a0 [size=32]
Kernel driver in use: uhci_hcd

00:1a.7 0c03: 8086:283a (rev 02) (prog-if 20 [EHCI])
Subsystem: 8086:514d
Flags: bus master, medium devsel, latency 0, IRQ 18
Memory at 90425c00 (32-bit, non-prefetchable) [size=1K]
Capabilities: <access denied>
Kernel driver in use: ehci_hcd

00:1b.0 0403: 8086:284b (rev 02)
Subsystem: 8086:2114
Flags: bus master, fast devsel, latency 0, IRQ 43
Memory at 90420000 (64-bit, non-prefetchable) [size=16K]
Capabilities: <access denied>
Kernel driver in use: HDA Intel
Kernel modules: snd-hda-intel

00:1c.0 0604: 8086:283f (rev 02) (prog-if 00 [Normal decode])
Flags: bus master, fast devsel, latency 0
Bus: primary=00, secondary=01, subordinate=01, sec-latency=0
I/O behind bridge: 00003000-00003fff
Memory behind bridge: 90500000-905fffff
Prefetchable memory behind bridge: 0000000090900000-0000000090afffff
Capabilities: <access denied>

00:1c.1 0604: 8086:2841 (rev 02) (prog-if 00 [Normal decode])
Flags: bus master, fast devsel, latency 0
Bus: primary=00, secondary=02, subordinate=02, sec-latency=0
I/O behind bridge: 00001000-00001fff
Memory behind bridge: 90100000-901fffff
Prefetchable memory behind bridge: 0000000090b00000-0000000090cfffff
Capabilities: <access denied>

00:1c.2 0604: 8086:2843 (rev 02) (prog-if 00 [Normal decode])
Flags: bus master, fast devsel, latency 0
Bus: primary=00, secondary=03, subordinate=03, sec-latency=0
I/O behind bridge: 00004000-00004fff
Memory behind bridge: 90600000-906fffff
Prefetchable memory behind bridge: 0000000090d00000-0000000090efffff
Capabilities: <access denied>

00:1c.3 0604: 8086:2845 (rev 02) (prog-if 00 [Normal decode])
Flags: bus master, fast devsel, latency 0
Bus: primary=00, secondary=04, subordinate=04, sec-latency=0
I/O behind bridge: 00005000-00005fff
Memory behind bridge: 90700000-907fffff
Prefetchable memory behind bridge: 0000000090f00000-00000000910fffff
Capabilities: <access denied>

00:1c.4 0604: 8086:2847 (rev 02) (prog-if 00 [Normal decode])
Flags: bus master, fast devsel, latency 0
Bus: primary=00, secondary=05, subordinate=05, sec-latency=0
I/O behind bridge: 00006000-00006fff
Memory behind bridge: 90800000-908fffff
Prefetchable memory behind bridge: 0000000091100000-00000000912fffff
Capabilities: <access denied>

00:1d.0 0c03: 8086:2830 (rev 02) (prog-if 00 [UHCI])
Subsystem: 8086:514d
Flags: bus master, medium devsel, latency 0, IRQ 23
I/O ports at 2080 [size=32]
Kernel driver in use: uhci_hcd

00:1d.1 0c03: 8086:2831 (rev 02) (prog-if 00 [UHCI])
Subsystem: 8086:514d
Flags: bus master, medium devsel, latency 0, IRQ 19
I/O ports at 2060 [size=32]
Kernel driver in use: uhci_hcd

00:1d.2 0c03: 8086:2832 (rev 02) (prog-if 00 [UHCI])
Subsystem: 8086:514d
Flags: bus master, medium devsel, latency 0, IRQ 18
I/O ports at 2040 [size=32]
Kernel driver in use: uhci_hcd

00:1d.7 0c03: 8086:2836 (rev 02) (prog-if 20 [EHCI])
Subsystem: 8086:514d
Flags: bus master, medium devsel, latency 0, IRQ 23
Memory at 90425800 (32-bit, non-prefetchable) [size=1K]
Capabilities: <access denied>
Kernel driver in use: ehci_hcd

00:1e.0 0604: 8086:244e (rev f2) (prog-if 01 [Subtractive decode])
Flags: bus master, fast devsel, latency 0
Bus: primary=00, secondary=06, subordinate=06, sec-latency=32
Memory behind bridge: 90000000-900fffff
Capabilities: <access denied>

00:1f.0 0601: 8086:2810 (rev 02)
Subsystem: 8086:514d
Flags: bus master, medium devsel, latency 0
Capabilities: <access denied>

00:1f.2 0106: 8086:2824 (rev 02) (prog-if 01 [AHCI 1.0])
Subsystem: 8086:514d
Flags: bus master, 66MHz, medium devsel, latency 0, IRQ 41
I/O ports at 2108 [size=8]
I/O ports at 211c [size=4]
I/O ports at 2100 [size=8]
I/O ports at 2118 [size=4]
I/O ports at 2020 [size=32]
Memory at 90425000 (32-bit, non-prefetchable) [size=2K]
Capabilities: <access denied>
Kernel driver in use: ahci

00:1f.3 0c05: 8086:283e (rev 02)
Subsystem: 8086:514d
Flags: medium devsel, IRQ 21
Memory at 90426000 (32-bit, non-prefetchable) [size=256]
I/O ports at 2000 [size=32]
Kernel driver in use: i801_smbus

02:00.0 0101: 11ab:6101 (rev b1) (prog-if 8f [Master SecP SecO PriP PriO])
Subsystem: 11ab:6101
Flags: bus master, fast devsel, latency 0, IRQ 10
I/O ports at 1018 [size=8]
I/O ports at 1024 [size=4]
I/O ports at 1010 [size=8]
I/O ports at 1020 [size=4]
I/O ports at 1000 [size=16]
Memory at 90100000 (32-bit, non-prefetchable) [size=512]
Capabilities: <access denied>

06:01.0 0480: 1131:7133 (rev d1)
Subsystem: 11bd:002f
Flags: bus master, medium devsel, latency 32, IRQ 22
Memory at 90004800 (32-bit, non-prefetchable) [size=2K]
Capabilities: <access denied>
Kernel driver in use: saa7134
Kernel modules: saa7134

06:03.0 0c00: 104c:8023 (prog-if 10 [OHCI])
Subsystem: 8086:514d
Flags: bus master, medium devsel, latency 32, IRQ 19
Memory at 90004000 (32-bit, non-prefetchable) [size=2K]
Memory at 90000000 (32-bit, non-prefetchable) [size=16K]
Capabilities: <access denied>
Kernel driver in use: firewire_ohci
Kernel modules: firewire-ohci

Here is /proc/interrupts:

CPU0 CPU1
0: 129 0 IO-APIC-edge timer
1: 2032 0 IO-APIC-edge i8042
7: 0 0 IO-APIC-edge parport0
8: 0 0 IO-APIC-edge rtc0
9: 0 0 IO-APIC-fasteoi acpi
12: 30450 0 IO-APIC-edge i8042
16: 0 0 IO-APIC-fasteoi uhci_hcd:usb3
18: 30167 0 IO-APIC-fasteoi ehci_hcd:usb1,
uhci_hcd:usb7
19: 29 0 IO-APIC-fasteoi uhci_hcd:usb6,
firewire_ohci
21: 0 0 IO-APIC-fasteoi uhci_hcd:usb4
22: 0 0 IO-APIC-fasteoi saa7133[0], saa7133[0]
23: 2 0 IO-APIC-fasteoi ehci_hcd:usb2,
uhci_hcd:usb5
40: 23715 0 PCI-MSI-edge i915
41: 14244 0 PCI-MSI-edge ahci
42: 1574 0 PCI-MSI-edge eth0
43: 691 0 PCI-MSI-edge hda_intel
NMI: 619 619 Non-maskable interrupts
LOC: 74638 62356 Local timer interrupts
SPU: 0 0 Spurious interrupts
PMI: 619 619 Performance monitoring interrupts
PND: 0 0 Performance pending work
RES: 3050 2061 Rescheduling interrupts
CAL: 656 208 Function call interrupts
TLB: 215 340 TLB shootdowns
TRM: 0 0 Thermal event interrupts
THR: 0 0 Threshold APIC interrupts
MCE: 0 0 Machine check exceptions
MCP: 3 3 Machine check polls
ERR: 1
MIS: 0

--
Alexander E. Patrakov


Attachments:
nobody_cared.JPG (46.11 kB)
config-2.6.36-gentoo-r3 (76.56 kB)
Download all attachments

2010-11-25 16:06:06

by Alan Stern

[permalink] [raw]
Subject: Re: Nobody cared about IRQs at shutdown

On Thu, 25 Nov 2010, Alexander E. Patrakov wrote:

> Hello.
>
> After switching my Gentoo desktop from sysvinit + openrc to systemd, I
> started getting "nobody cared" messages about IRQs 16 and 19 (common
> thing: they are assigned to the USB controllers, that's why CC:

According to your listing, they are used by uhci-hcd. Do the messages
go away if you unload uhci-hcd before shutting down?

> linux-usb) at shutdown. Also, instead of shutting down immediately, the
> kernel started waiting for something to time out (and after the timeout,
> it does power off the computer). Adding the "irqpoll" parameter does
> work around the issue. A screen photo illustrating the problem and the
> kernel config are attached.
>
> I didn't try to bisect because there is no known good kernel. I.e., the
> issue doesn't exist without systemd, and I didn't run systemd with
> earlier kernels.
>
> Please help diagnosing and, if possible, fixing this problem.
>
> The broken image at the top of the screen is from extlinux (I use it to
> boot the kernel because I don't like grub).

You may need to debug the uhci-hcd driver. Look into
drivers/usb/host/uhci-hcd.c; the uhci_shutdown() routine ought to be
called and it ought to call uhci_hc_died(), which in turn calls
uhci_reset_hc() in pci-quirks.c, which is supposed to prevent the
controller from generating any IRQs.

Do you have any USB devices attached to bus 3 or bus 6?

Alan Stern

2010-11-25 16:14:38

by Alexander E. Patrakov

[permalink] [raw]
Subject: Re: Nobody cared about IRQs at shutdown

25.11.2010 21:06, Alan Stern wrote:
> On Thu, 25 Nov 2010, Alexander E. Patrakov wrote:
>
>> Hello.
>>
>> After switching my Gentoo desktop from sysvinit + openrc to systemd, I
>> started getting "nobody cared" messages about IRQs 16 and 19 (common
>> thing: they are assigned to the USB controllers, that's why CC:
> According to your listing, they are used by uhci-hcd. Do the messages
> go away if you unload uhci-hcd before shutting down?

It is not a module here, so I have to recompile the kernel in order to
try this. Will do that tomorrow.

> You may need to debug the uhci-hcd driver. Look into
> drivers/usb/host/uhci-hcd.c; the uhci_shutdown() routine ought to be
> called and it ought to call uhci_hc_died(), which in turn calls
> uhci_reset_hc() in pci-quirks.c, which is supposed to prevent the
> controller from generating any IRQs.

OK, tomorrow I will add some printks there.

> Do you have any USB devices attached to bus 3 or bus 6?

Yes:

Bus 006 Device 002: ID 05d8:4002 Ultima Electronics Corp. Artec Ultima
2000 (GT6801 based)/Lifetec LT9385/ScanMagic 1200 UB Plus Scanner

Should I unplug it and retest the bug now?

--
Alexander E. Patrakov

2010-11-25 20:25:12

by Alan Stern

[permalink] [raw]
Subject: Re: Nobody cared about IRQs at shutdown

On Thu, 25 Nov 2010, Alexander E. Patrakov wrote:

> 25.11.2010 21:06, Alan Stern wrote:
> > On Thu, 25 Nov 2010, Alexander E. Patrakov wrote:
> >
> >> Hello.
> >>
> >> After switching my Gentoo desktop from sysvinit + openrc to systemd, I
> >> started getting "nobody cared" messages about IRQs 16 and 19 (common
> >> thing: they are assigned to the USB controllers, that's why CC:
> > According to your listing, they are used by uhci-hcd. Do the messages
> > go away if you unload uhci-hcd before shutting down?
>
> It is not a module here, so I have to recompile the kernel in order to
> try this. Will do that tomorrow.
>
> > You may need to debug the uhci-hcd driver. Look into
> > drivers/usb/host/uhci-hcd.c; the uhci_shutdown() routine ought to be
> > called and it ought to call uhci_hc_died(), which in turn calls
> > uhci_reset_hc() in pci-quirks.c, which is supposed to prevent the
> > controller from generating any IRQs.
>
> OK, tomorrow I will add some printks there.
>
> > Do you have any USB devices attached to bus 3 or bus 6?
>
> Yes:
>
> Bus 006 Device 002: ID 05d8:4002 Ultima Electronics Corp. Artec Ultima
> 2000 (GT6801 based)/Lifetec LT9385/ScanMagic 1200 UB Plus Scanner
>
> Should I unplug it and retest the bug now?

Sure.

Even without rebuilding the kernel, you can unbind the uhci-hcd driver
from the hardware by going to the /sys/bus/pci/drivers/uhci_hcd
directory and doing:

echo -n device-name >unbind

where "device-name" is the name of one of the symlinks in that
directory.

Alan Stern

2010-11-27 07:12:32

by Alexander E. Patrakov

[permalink] [raw]
Subject: Re: Nobody cared about IRQs at shutdown

26.11.2010 01:25, Alan Stern wrote:
> On Thu, 25 Nov 2010, Alexander E. Patrakov wrote:
>
>> 25.11.2010 21:06, Alan Stern wrote:
>>> On Thu, 25 Nov 2010, Alexander E. Patrakov wrote:
>>>
>>>> Hello.
>>>>
>>>> After switching my Gentoo desktop from sysvinit + openrc to systemd, I
>>>> started getting "nobody cared" messages about IRQs 16 and 19 (common
>>>> thing: they are assigned to the USB controllers, that's why CC:
>>> According to your listing, they are used by uhci-hcd. Do the messages
>>> go away if you unload uhci-hcd before shutting down?
>> It is not a module here, so I have to recompile the kernel in order to
>> try this. Will do that tomorrow.
>>
>>> You may need to debug the uhci-hcd driver. Look into
>>> drivers/usb/host/uhci-hcd.c; the uhci_shutdown() routine ought to be
>>> called and it ought to call uhci_hc_died(), which in turn calls
>>> uhci_reset_hc() in pci-quirks.c, which is supposed to prevent the
>>> controller from generating any IRQs.
>> OK, tomorrow I will add some printks there.

Sorry, I didn't add them due to being busy with a different (non-kernel)
bug. What do you want to know - just the fact that these functions are
called before or after reporting the bad IRQ?

>>> Do you have any USB devices attached to bus 3 or bus 6?
>> Yes:
>>
>> Bus 006 Device 002: ID 05d8:4002 Ultima Electronics Corp. Artec Ultima
>> 2000 (GT6801 based)/Lifetec LT9385/ScanMagic 1200 UB Plus Scanner
>>
>> Should I unplug it and retest the bug now?
> Sure.

Tried, it didn't change anything.

> Even without rebuilding the kernel, you can unbind the uhci-hcd driver
> from the hardware by going to the /sys/bus/pci/drivers/uhci_hcd
> directory and doing:
>
> echo -n device-name>unbind
>
> where "device-name" is the name of one of the symlinks in that
> directory.

Thanks for the tip. Below are the updated test results, including the
ones posted in my first mail, for completeness.

1. No irqpoll, no unbind: the system reports that nobody cared for IRQs,
then waits, displays SATA errors, waits again, shuts down.

2. irqpoll, no unbind: the system shuts down without any delays.

3. No irqpoll, unbind uhci-hcd from everything it controls: the system
reports bad IRQ 19 (consumed by firewire-ohci), waits, displays SATA
errors, waits again, shuts down.

4. No irqpoll, unbind both uhci-hcd and firewire-ohci from everything:
the system does not report any bad IRQs, waits, displays SATA errors,
waits again, shuts down.

5. irqpoll, unbind both uhci-hcd and firewire-ohci from everything: same
as (4).

I have no firewire devices.

--
Alexander E. Patrakov

2010-11-27 15:16:50

by Alan Stern

[permalink] [raw]
Subject: Re: Nobody cared about IRQs at shutdown

On Sat, 27 Nov 2010, Alexander E. Patrakov wrote:

> 26.11.2010 01:25, Alan Stern wrote:
> > On Thu, 25 Nov 2010, Alexander E. Patrakov wrote:
> >
> >> 25.11.2010 21:06, Alan Stern wrote:
> >>> On Thu, 25 Nov 2010, Alexander E. Patrakov wrote:
> >>>
> >>>> Hello.
> >>>>
> >>>> After switching my Gentoo desktop from sysvinit + openrc to systemd, I
> >>>> started getting "nobody cared" messages about IRQs 16 and 19 (common
> >>>> thing: they are assigned to the USB controllers, that's why CC:
> >>> According to your listing, they are used by uhci-hcd. Do the messages
> >>> go away if you unload uhci-hcd before shutting down?
> >> It is not a module here, so I have to recompile the kernel in order to
> >> try this. Will do that tomorrow.
> >>
> >>> You may need to debug the uhci-hcd driver. Look into
> >>> drivers/usb/host/uhci-hcd.c; the uhci_shutdown() routine ought to be
> >>> called and it ought to call uhci_hc_died(), which in turn calls
> >>> uhci_reset_hc() in pci-quirks.c, which is supposed to prevent the
> >>> controller from generating any IRQs.
> >> OK, tomorrow I will add some printks there.
>
> Sorry, I didn't add them due to being busy with a different (non-kernel)
> bug. What do you want to know - just the fact that these functions are
> called before or after reporting the bad IRQ?

They should be called before the bad IRQ is reported. That's what I
want to verify.

> > Even without rebuilding the kernel, you can unbind the uhci-hcd driver
> > from the hardware by going to the /sys/bus/pci/drivers/uhci_hcd
> > directory and doing:
> >
> > echo -n device-name>unbind
> >
> > where "device-name" is the name of one of the symlinks in that
> > directory.
>
> Thanks for the tip. Below are the updated test results, including the
> ones posted in my first mail, for completeness.
>
> 1. No irqpoll, no unbind: the system reports that nobody cared for IRQs,
> then waits, displays SATA errors, waits again, shuts down.
>
> 2. irqpoll, no unbind: the system shuts down without any delays.
>
> 3. No irqpoll, unbind uhci-hcd from everything it controls: the system
> reports bad IRQ 19 (consumed by firewire-ohci), waits, displays SATA
> errors, waits again, shuts down.

This suggests that the problem comes from the firewire controller, not
the UHCI controller.

Are you sure that you ever got "nobody cared" reports for IRQ 16? Your
listing showed that uhci-hcd used both 16 and 19 whereas firewire-ohci
used only 19.

> 4. No irqpoll, unbind both uhci-hcd and firewire-ohci from everything:
> the system does not report any bad IRQs, waits, displays SATA errors,
> waits again, shuts down.

And this tends to confirm it. The SATA errors are probably a separate
issue.

What happens if you unbind firewire-ohci but not uhci-hcd?

> 5. irqpoll, unbind both uhci-hcd and firewire-ohci from everything: same
> as (4).
>
> I have no firewire devices.

Apparently that doesn't stop the controller from misbehaving. But you
need to do more tests to be sure of this.

Alan Stern

2010-11-30 18:08:31

by Alexander E. Patrakov

[permalink] [raw]
Subject: Re: Nobody cared about IRQs at shutdown

27.11.2010 20:16, Alan Stern wrote:

(sorry for the delay with the reply)

> On Sat, 27 Nov 2010, Alexander E. Patrakov wrote:
>
>> 26.11.2010 01:25, Alan Stern wrote:
>>> On Thu, 25 Nov 2010, Alexander E. Patrakov wrote:
>>>
>>>> 25.11.2010 21:06, Alan Stern wrote:
>>>>> On Thu, 25 Nov 2010, Alexander E. Patrakov wrote:
>>>>>
>>>>>> Hello.
>>>>>>
>>>>>> After switching my Gentoo desktop from sysvinit + openrc to systemd, I
>>>>>> started getting "nobody cared" messages about IRQs 16 and 19 (common
>>>>>> thing: they are assigned to the USB controllers, that's why CC:
>>>>> According to your listing, they are used by uhci-hcd. Do the messages
>>>>> go away if you unload uhci-hcd before shutting down?
>>>> It is not a module here, so I have to recompile the kernel in order to
>>>> try this. Will do that tomorrow.
>>>>
>>>>> You may need to debug the uhci-hcd driver. Look into
>>>>> drivers/usb/host/uhci-hcd.c; the uhci_shutdown() routine ought to be
>>>>> called and it ought to call uhci_hc_died(), which in turn calls
>>>>> uhci_reset_hc() in pci-quirks.c, which is supposed to prevent the
>>>>> controller from generating any IRQs.
>>>> OK, tomorrow I will add some printks there.
>> Sorry, I didn't add them due to being busy with a different (non-kernel)
>> bug. What do you want to know - just the fact that these functions are
>> called before or after reporting the bad IRQ?
> They should be called before the bad IRQ is reported. That's what I
> want to verify.

Yes, this is called 5 times before the bad IRQ report. I think this is
consistent with the fact that I have 10 USB ports.

>>> Even without rebuilding the kernel, you can unbind the uhci-hcd driver
>>> from the hardware by going to the /sys/bus/pci/drivers/uhci_hcd
>>> directory and doing:
>>>
>>> echo -n device-name>unbind
>>>
>>> where "device-name" is the name of one of the symlinks in that
>>> directory.
>> Thanks for the tip. Below are the updated test results, including the
>> ones posted in my first mail, for completeness.
>>
>> 1. No irqpoll, no unbind: the system reports that nobody cared for IRQs,
>> then waits, displays SATA errors, waits again, shuts down.
>>
>> 2. irqpoll, no unbind: the system shuts down without any delays.
>>
>> 3. No irqpoll, unbind uhci-hcd from everything it controls: the system
>> reports bad IRQ 19 (consumed by firewire-ohci), waits, displays SATA
>> errors, waits again, shuts down.
> This suggests that the problem comes from the firewire controller, not
> the UHCI controller.
>
> Are you sure that you ever got "nobody cared" reports for IRQ 16? Your
> listing showed that uhci-hcd used both 16 and 19 whereas firewire-ohci
> used only 19.

Yes, see the screen photo attached to the first e-mail in this thread.

>> 4. No irqpoll, unbind both uhci-hcd and firewire-ohci from everything:
>> the system does not report any bad IRQs, waits, displays SATA errors,
>> waits again, shuts down.
> And this tends to confirm it. The SATA errors are probably a separate
> issue.
>
> What happens if you unbind firewire-ohci but not uhci-hcd?

The kernel reports the bad IRQ, waits, displays SATA errors, waits,
shuts down. See the photo attached to this mail.

>> 5. irqpoll, unbind both uhci-hcd and firewire-ohci from everything: same
>> as (4).
>>
>> I have no firewire devices.
> Apparently that doesn't stop the controller from misbehaving. But you
> need to do more tests to be sure of this.

In fact, I think that there is something bad, not specific to USB,
FireWire or SATA. Without systemd, all those subsystems function
properly at shutdown. With systemd, it looks like there are many
mishandled interrupts (all of USB, FireWire and SATA) at shutdown. What
could be this common thing? ACPI?

--
Alexander E. Patrakov


Attachments:
nobody_cared_1.jpg (65.39 kB)

2010-11-30 18:57:43

by Alan Stern

[permalink] [raw]
Subject: Re: Nobody cared about IRQs at shutdown

On Tue, 30 Nov 2010, Alexander E. Patrakov wrote:

> In fact, I think that there is something bad, not specific to USB,
> FireWire or SATA. Without systemd, all those subsystems function
> properly at shutdown. With systemd, it looks like there are many
> mishandled interrupts (all of USB, FireWire and SATA) at shutdown. What
> could be this common thing? ACPI?

I don't know -- what is systemd?

Alan Stern

2010-11-30 19:15:51

by Alexander E. Patrakov

[permalink] [raw]
Subject: Re: Nobody cared about IRQs at shutdown

Alan Stern wrote:
> On Tue, 30 Nov 2010, Alexander E. Patrakov wrote:
>
> > In fact, I think that there is something bad, not specific to USB,
> > FireWire or SATA. Without systemd, all those subsystems function
> > properly at shutdown. With systemd, it looks like there are many
> > mishandled interrupts (all of USB, FireWire and SATA) at shutdown.
> > What   could be this common thing? ACPI?
>
> I don't know -- what is systemd?

Systemd is a new init developed by Lennart Poettering. You can learn more at http://freedesktop.org/wiki/Software/systemd

It employs high concurrency in starting and stopping services, starts many things on demand and thus boots faster than the traditional SysV init. And also exposes this bug :(

--
Alexander E. Patrakov (from mobile phone)

2010-12-01 14:59:52

by Alan Stern

[permalink] [raw]
Subject: Re: Nobody cared about IRQs at shutdown

On Wed, 1 Dec 2010, Alexander E. Patrakov wrote:

> Alan Stern wrote:
> > On Tue, 30 Nov 2010, Alexander E. Patrakov wrote:
> >
> > > In fact, I think that there is something bad, not specific to USB,
> > > FireWire or SATA. Without systemd, all those subsystems function
> > > properly at shutdown. With systemd, it looks like there are many
> > > mishandled interrupts (all of USB, FireWire and SATA) at shutdown.
> > > What   could be this common thing? ACPI?
> >
> > I don't know -- what is systemd?
>
> Systemd is a new init developed by Lennart Poettering. You can learn more at http://freedesktop.org/wiki/Software/systemd
>
> It employs high concurrency in starting and stopping services, starts many things on demand and thus boots faster than the traditional SysV init. And also exposes this bug :(

All right.

One last test. What happens if you unbind the firewire driver and all
the UHCI controllers except the one attached to IRQ 16?

Possible explanations: IRQs are being misrouted, so the system thinks
it gets IRQ 16 when in fact a different interrupt line was activated
(this is related to ACPI, but I don't see any connection to systemd).
Or the interrupt layer is malfunctioning and it thinks IRQs are
arriving when they aren't.

Alan Stern

2010-12-01 17:32:52

by Alexander E. Patrakov

[permalink] [raw]
Subject: Re: Nobody cared about IRQs at shutdown

01.12.2010 19:59, Alan Stern пишет:
> On Wed, 1 Dec 2010, Alexander E. Patrakov wrote:
>
>> Alan Stern wrote:
>>> On Tue, 30 Nov 2010, Alexander E. Patrakov wrote:
>>>
>>>> In fact, I think that there is something bad, not specific to USB,
>>>> FireWire or SATA. Without systemd, all those subsystems function
>>>> properly at shutdown. With systemd, it looks like there are many
>>>> mishandled interrupts (all of USB, FireWire and SATA) at shutdown.
>>>> What  could be this common thing? ACPI?
>>> I don't know -- what is systemd?
>> Systemd is a new init developed by Lennart Poettering. You can learn more at http://freedesktop.org/wiki/Software/systemd
>>
>> It employs high concurrency in starting and stopping services, starts many things on demand and thus boots faster than the traditional SysV init. And also exposes this bug :(
> All right.
>
> One last test. What happens if you unbind the firewire driver and all
> the UHCI controllers except the one attached to IRQ 16?

As I was not sure if you mean 16 or 19, I did two tests. In both cases,
the firewire driver and all UHCI controllers except one were unbound. In
both cases, the system printed the line I added to uhci_hc_died(),
reported a bad IRQ (16 and 19, respectively), waited, displayed SATA
errors, waited again, and powered itself off. I.e., the screenshot is
nearly identical to what I sent earlier.


> Possible explanations: IRQs are being misrouted, so the system thinks
> it gets IRQ 16 when in fact a different interrupt line was activated
> (this is related to ACPI, but I don't see any connection to systemd).
> Or the interrupt layer is malfunctioning and it thinks IRQs are
> arriving when they aren't.

I forgot to mention that only shutdown is problematic, reboots are OK.

--
Alexander E. Patrakov

2010-12-01 18:26:08

by Alan Stern

[permalink] [raw]
Subject: Re: Nobody cared about IRQs at shutdown

On Wed, 1 Dec 2010, Alexander E. Patrakov wrote:

> > One last test. What happens if you unbind the firewire driver and all
> > the UHCI controllers except the one attached to IRQ 16?
>
> As I was not sure if you mean 16 or 19, I did two tests. In both cases,
> the firewire driver and all UHCI controllers except one were unbound. In
> both cases, the system printed the line I added to uhci_hc_died(),
> reported a bad IRQ (16 and 19, respectively), waited, displayed SATA
> errors, waited again, and powered itself off. I.e., the screenshot is
> nearly identical to what I sent earlier.
>
>
> > Possible explanations: IRQs are being misrouted, so the system thinks
> > it gets IRQ 16 when in fact a different interrupt line was activated
> > (this is related to ACPI, but I don't see any connection to systemd).
> > Or the interrupt layer is malfunctioning and it thinks IRQs are
> > arriving when they aren't.
>
> I forgot to mention that only shutdown is problematic, reboots are OK.

Well, I'm baffled. I don't see how the problem could be coming from
the drivers, which implies it must be in a more central part of the
kernel. The fact that the bad IRQs arrived on the expected lines
indicates that they are not being misrouted.

Alan Stern