2014-04-13 19:23:31

by Stefani Seibold

Subject: Missing USB XHCI and EHCI reset for kexec

When executing a kexec kernel on a PowerPC board, the newly started kernel
will not find already enumerated USB devices due to a missing reset on the
USB bus.

As a workaround,

echo 1 >/sys/bus/pci/drivers/[ex]hci-pci/BUS-ADDRESS-OF-THE-HCD/reset

solves this, but it is far from elegant.
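
The workaround above can be wrapped in a small script. This is only a minimal sketch: the function name reset_hcd and the SYSFS_ROOT override (handy for a dry run against a scratch directory) are made up for illustration, and the driver name and PCI address have to be taken from the board in question.

```shell
#!/bin/sh
# reset_hcd: hypothetical helper, not part of any kernel tool.
# Writes 1 to the sysfs "reset" attribute of one PCI host controller.
#   $1: driver directory name (for example ehci-pci)
#   $2: PCI address of the HCD (for example 0000:00:17.2)
# SYSFS_ROOT is an assumed override for dry runs; it defaults to /sys.
reset_hcd() {
    driver=$1
    addr=$2
    attr="${SYSFS_ROOT:-/sys}/bus/pci/drivers/$driver/$addr/reset"

    # Fail early if the controller or its reset attribute is missing.
    if [ ! -w "$attr" ]; then
        echo "reset_hcd: $attr is not writable" >&2
        return 1
    fi
    echo 1 > "$attr"
}
```

It would be called as "reset_hcd ehci-pci 0000:00:17.2" from an early boot script, before anything depends on the USB devices.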

The last kernels without this issue were 2.6.39 for EHCI and 3.4 for
XHCI, but I have no idea when exactly this behavior was
introduced.

On x86 everything works fine.


2014-04-14 02:14:54

by Alan Stern

Subject: Re: Missing USB XHCI and EHCI reset for kexec

On Sun, 13 Apr 2014, Stefani Seibold wrote:

> When executing a kexec kernel on a PowerPC board the new started kernel
> will not find already enumerated USB devices due a missing reset on the
> USB bus.

How do you know the problem is caused by a missing reset?

Can you post the dmesg log from the kexec-ed kernel, with
CONFIG_USB_DEBUG enabled?

Alan Stern

2014-04-14 10:06:22

by Benjamin Herrenschmidt

Subject: Re: Missing USB XHCI and EHCI reset for kexec

On Sun, 2014-04-13 at 21:22 +0200, Stefani Seibold wrote:
> When executing a kexec kernel on a PowerPC board the new started kernel
> will not find already enumerated USB devices due a missing reset on the
> USB bus.
>
> As a work around a
>
> echo 1 >/sys/bus/pci/drivers/[ex]hci-pci/BUS-ADDRESS-OF-THE-HCD/reset
>
> will solve this. But this is far from beauty.
>
> My latest kernel without this issue was for EHCI kernel 2.6.39 and for
> XHCI kernel 3.4, but i have no idea when exactly this behavior was
> introduced.
>
> For X86 all is fine.

I don't know about EHCI specifically, but this is a known issue with
XHCI. I observe similar issues on other powerpc platforms (servers),
and this isn't architecture specific (it looks more like it is specific
to the actual xhc implementation).

Thadeu Cascardo (on CC) has been the one investigating that on our side,
he might have more to add including patches.

Cheers,
Ben.

2014-04-14 13:54:40

by Stefani Seibold

Subject: Re: Missing USB XHCI and EHCI reset for kexec


From Alan Stern <[email protected]>:

> On Sun, 13 Apr 2014, Stefani Seibold wrote:
>
>> When executing a kexec kernel on a PowerPC board the new started kernel
>> will not find already enumerated USB devices due a missing reset on the
>> USB bus.
>
> How do you know the problem is caused by a missing reset?
>

An "echo 1 >/sys/bus/pci/drivers/ehci-pci/0000\:00\:17.2/reset"
solves this for kernel 3.10, so I thought this was a reset problem.

But I now have kernel 3.14 running on my PowerPC device, and it
behaves differently: after a couple of minutes the USB device
appears again.

> Can you post the dmesg log from the kexec-ed kernel, with
> CONFIG_USB_DEBUG enabled?
>

Here is the log for a 3.14 kernel with CONFIG_USB_DEBUG enabled:

<6>[ 1.753647] ehci_hcd: USB 2.0 'Enhanced' Host Controller (EHCI) Driver
<6>[ 1.760316] ehci-pci: EHCI PCI platform driver
<6>[ 1.765012] ehci-pci 0000:00:17.2: EHCI Host Controller
<6>[ 1.770408] ehci-pci 0000:00:17.2: new USB bus registered, assigned bus number 1
<6>[ 1.778348] ehci-pci 0000:00:17.2: irq 22, io mem 0xc0006800
<6>[ 1.795144] ehci-pci 0000:00:17.2: USB 2.0 started, EHCI 1.00
<6>[ 1.801139] usb usb1: New USB device found, idVendor=1d6b, idProduct=0002
<6>[ 1.807993] usb usb1: New USB device strings: Mfr=3, Product=2, SerialNumber=1
<6>[ 1.815247] usb usb1: Product: EHCI Host Controller
<6>[ 1.820157] usb usb1: Manufacturer: Linux 3.14.0 ehci_hcd
<6>[ 1.825586] usb usb1: SerialNumber: 0000:00:17.2
<6>[ 1.831022] hub 1-0:1.0: USB hub found
<6>[ 1.834914] hub 1-0:1.0: 5 ports detected
<6>[ 1.839972] ohci_hcd: USB 1.1 'Open' Host Controller (OHCI) Driver
<6>[ 1.846277] ohci-pci: OHCI PCI platform driver
<6>[ 1.850911] ohci-pci 0000:00:17.0: OHCI PCI host controller
<6>[ 1.856617] ohci-pci 0000:00:17.0: new USB bus registered, assigned bus number 2
<6>[ 1.864420] ohci-pci 0000:00:17.0: irq 20, io mem 0xc0004000
<6>[ 1.953518] usb usb2: New USB device found, idVendor=1d6b, idProduct=0001
<6>[ 1.960376] usb usb2: New USB device strings: Mfr=3, Product=2, SerialNumber=1
<6>[ 1.967631] usb usb2: Product: OHCI PCI host controller
<6>[ 1.972887] usb usb2: Manufacturer: Linux 3.14.0 ohci_hcd
<6>[ 1.978315] usb usb2: SerialNumber: 0000:00:17.0
<6>[ 1.983802] hub 2-0:1.0: USB hub found
<6>[ 1.987695] hub 2-0:1.0: 3 ports detected
<6>[ 1.992395] ohci-pci 0000:00:17.1: OHCI PCI host controller
<6>[ 1.998116] ohci-pci 0000:00:17.1: new USB bus registered, assigned bus number 3
<6>[ 2.005935] ohci-pci 0000:00:17.1: irq 21, io mem 0xc0005000
<6>[ 2.097535] usb usb3: New USB device found, idVendor=1d6b, idProduct=0001
<6>[ 2.104391] usb usb3: New USB device strings: Mfr=3, Product=2, SerialNumber=1
<6>[ 2.111641] usb usb3: Product: OHCI PCI host controller
<6>[ 2.116895] usb usb3: Manufacturer: Linux 3.14.0 ohci_hcd
<6>[ 2.122321] usb usb3: SerialNumber: 0000:00:17.1
<6>[ 2.127802] hub 3-0:1.0: USB hub found
<6>[ 2.131691] hub 3-0:1.0: 2 ports detected
<6>[ 2.136956] Freescale High-Speed USB SOC Device Controller driver (Apr 20, 2007)
<6>[ 2.145437] mousedev: PS/2 mouse device common for all mice
<6>[ 2.151211] usb 1-2: new high-speed USB device number 2 using ehci-pci
<6>[ 2.157893] i2c /dev entries driver
<6>[ 2.162013] mpc-i2c fef03000.i2c: timeout 1000000 us
<6>[ 2.181749] rtc-rs5c372 0-0032: rs5c372a found, 24hr, driver version 0.6
<6>[ 2.212713] rtc-rs5c372 0-0032: rtc core: registered rtc-rs5c372 as rtc0
<6>[ 2.219904] mpc-i2c fef03100.i2c: timeout 1000000 us
<6>[ 2.226165] usbcore: registered new interface driver usbhid
<6>[ 2.231835] usbhid: USB HID core driver
<6>[ 2.235868] rsfrontp: using key table SMBV (117)
<6>[ 2.240747] usbcore: registered new interface driver rsfrontp
<6>[ 2.246601] rsfrontp: R&S USB HID Frontpanel driver (v1.2)
<6>[ 2.252254] usbcore: registered new interface driver rsknop
<6>[ 2.257912] rsknop: R&S USB HID Knop support (v1.4)
<6>[ 2.263093] TCP: cubic registered
<6>[ 2.266475] NET: Registered protocol family 17
<6>[ 2.278216] rtc-rs5c372 0-0032: setting system clock to 2014-04-13 15:46:52 UTC (1397404012)
<6>[ 2.294464] Freeing unused kernel memory: 968K (c03c3000 - c04b5000)
<6>[ 2.308009] usb 1-2: New USB device found, idVendor=0424, idProduct=2514
<6>[ 2.318509] usb 1-2: New USB device strings: Mfr=0, Product=0, SerialNumber=0
<6>[ 2.327340] hub 1-2:1.0: USB hub found
<6>[ 2.331579] hub 1-2:1.0: 4 ports detected
<6>[ 2.707176] usb 1-4: new high-speed USB device number 4 using ehci-pci
<3>[ 17.823176] usb 1-4: device descriptor read/64, error -110
<3>[ 33.043207] usb 1-4: device descriptor read/64, error -110
<6>[ 33.263185] usb 1-4: new high-speed USB device number 5 using ehci-pci
<3>[ 48.379210] usb 1-4: device descriptor read/64, error -110
<3>[ 63.607205] usb 1-4: device descriptor read/64, error -110
<6>[ 63.827191] usb 1-4: new high-speed USB device number 6 using ehci-pci
<3>[ 68.851401] usb 1-4: device descriptor read/8, error -110
<3>[ 73.975374] usb 1-4: device descriptor read/8, error -110
<6>[ 74.203206] usb 1-4: new high-speed USB device number 7 using ehci-pci
<3>[ 79.231446] usb 1-4: device descriptor read/8, error -110
<3>[ 84.359422] usb 1-4: device descriptor read/8, error -110
<3>[ 84.475224] hub 1-0:1.0: unable to enumerate USB device on port 4
<6>[ 84.795211] usb 2-2: new full-speed USB device number 2 using ohci-pci
<6>[ 85.016582] usb 2-2: New USB device found, idVendor=0451, idProduct=2036
<6>[ 85.023394] usb 2-2: New USB device strings: Mfr=0, Product=1, SerialNumber=0
<6>[ 85.030572] usb 2-2: Product: General Purpose USB Hub
<6>[ 85.043722] hub 2-2:1.0: USB hub found
<6>[ 85.049569] hub 2-2:1.0: 2 ports detected
<6>[ 85.367220] usb 3-2: new full-speed USB device number 2 using ohci-pci
<3>[ 100.555212] usb 3-2: device descriptor read/64, error -110
<3>[ 115.847196] usb 3-2: device descriptor read/64, error -110
<6>[ 116.131178] usb 3-2: new full-speed USB device number 3 using ohci-pci
<3>[ 131.311178] usb 3-2: device descriptor read/64, error -110
<3>[ 146.595209] usb 3-2: device descriptor read/64, error -110
<6>[ 146.887192] usb 3-2: new full-speed USB device number 4 using ohci-pci
<3>[ 151.915584] usb 3-2: device descriptor read/8, error -110
<3>[ 157.039436] usb 3-2: device descriptor read/8, error -110
<6>[ 157.323180] usb 3-2: new full-speed USB device number 5 using ohci-pci
<3>[ 162.352239] usb 3-2: device descriptor read/8, error -110
<3>[ 167.484079] usb 3-2: device descriptor read/8, error -110
<3>[ 167.591236] hub 3-0:1.0: unable to enumerate USB device on port 2
<6>[ 167.936921] usb 2-2.1: new full-speed USB device number 3 using ohci-pci
<6>[ 168.067890] usb 2-2.1: New USB device found, idVendor=076b, idProduct=a021
<6>[ 168.074871] usb 2-2.1: New USB device strings: Mfr=1, Product=2, SerialNumber=0
<6>[ 168.082226] usb 2-2.1: Product: Smart Card Reader
<6>[ 168.086963] usb 2-2.1: Manufacturer: USB
<6>[ 168.172893] usb 2-2.2: new low-speed USB device number 4 using ohci-pci
<6>[ 168.300839] usb 2-2.2: New USB device found, idVendor=0aad, idProduct=0024
<6>[ 168.307823] usb 2-2.2: New USB device strings: Mfr=1, Product=2, SerialNumber=0
<6>[ 168.315180] usb 2-2.2: Product: FrontPanel USB Keyboard
<6>[ 168.320436] usb 2-2.2: Manufacturer: Rohde&Schwarz
<6>[ 168.337895] input: Rohde&Schwarz FrontPanel USB Keyboard as /devices/pci0000:00/0000:00:17.0/usb2/2-2/2-2.2/2-2.2:1.0/input/input0
<6>[ 168.360988] input: Rohde&Schwarz FrontPanel USB Keyboard as /devices/pci0000:00/0000:00:17.0/usb2/2-2/2-2.2/2-2.2:1.1/input/input1

This is the output of lsusb:

Bus 001 Device 002: ID 0424:2514 Standard Microsystems Corp. USB 2.0 Hub
Bus 001 Device 004: ID 0928:0007 Oxford Semiconductor, Ltd
Bus 002 Device 002: ID 0451:2036 Texas Instruments, Inc. TUSB2036 Hub
Bus 001 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub
Bus 002 Device 001: ID 1d6b:0001 Linux Foundation 1.1 root hub
Bus 003 Device 001: ID 1d6b:0001 Linux Foundation 1.1 root hub
Bus 002 Device 003: ID 076b:a021 OmniKey AG CCID Smart Card Reader
Bus 002 Device 004: ID 0aad:0024 Rohde & Schwarz GmbH & Co. KG

My old kernel 3.4 does not show this problem. Since kernel 3.10 I need
to reset the ehci-pci device when doing a kexec. But this workaround no
longer works on kernel 3.14.

-Stefani

2014-04-14 14:49:17

by Alan Stern

Subject: Re: Missing USB XHCI and EHCI reset for kexec

On Mon, 14 Apr 2014 [email protected] wrote:

>
> From Alan Stern <[email protected]>:
>
> > On Sun, 13 Apr 2014, Stefani Seibold wrote:
> >
> >> When executing a kexec kernel on a PowerPC board the new started kernel
> >> will not find already enumerated USB devices due a missing reset on the
> >> USB bus.
> >
> > How do you know the problem is caused by a missing reset?
> >
>
> A "echo 1 >/sys/bus/pci/drivers/ehci-pci/0000\:00\:17.2/reset" will
> solve this for kernel 3.10. So i thought this is a reset problem.

All that means is the problem can be fixed by adding a reset. It
doesn't mean that the problem was caused by a missing reset.

> But i have now a kernel 3.14 running on my PowerPC device and this
> have a different behavour. After a couple of minutes the USB device
> will appear again.
>
> > Can you post the dmesg log from the kexec-ed kernel, with
> > CONFIG_USB_DEBUG enabled?
> >
>
> Here is the log for a 3.14 which CONFIG_USB_DEBUG enabled:

> <6>[ 2.308009] usb 1-2: New USB device found, idVendor=0424, idProduct=2514
> <6>[ 2.318509] usb 1-2: New USB device strings: Mfr=0, Product=0,
> SerialNumber=0
> <6>[ 2.327340] hub 1-2:1.0: USB hub found
> <6>[ 2.331579] hub 1-2:1.0: 4 ports detected

This device worked okay.

> <6>[ 2.707176] usb 1-4: new high-speed USB device number 4 using ehci-pci
> <3>[ 17.823176] usb 1-4: device descriptor read/64, error -110
> <3>[ 33.043207] usb 1-4: device descriptor read/64, error -110
> <6>[ 33.263185] usb 1-4: new high-speed USB device number 5 using ehci-pci
> <3>[ 48.379210] usb 1-4: device descriptor read/64, error -110
> <3>[ 63.607205] usb 1-4: device descriptor read/64, error -110
> <6>[ 63.827191] usb 1-4: new high-speed USB device number 6 using ehci-pci
> <3>[ 68.851401] usb 1-4: device descriptor read/8, error -110
> <3>[ 73.975374] usb 1-4: device descriptor read/8, error -110
> <6>[ 74.203206] usb 1-4: new high-speed USB device number 7 using ehci-pci
> <3>[ 79.231446] usb 1-4: device descriptor read/8, error -110
> <3>[ 84.359422] usb 1-4: device descriptor read/8, error -110
> <3>[ 84.475224] hub 1-0:1.0: unable to enumerate USB device on port 4

Is this the problem you are referring to?

> <6>[ 84.795211] usb 2-2: new full-speed USB device number 2 using ohci-pci
> <6>[ 85.016582] usb 2-2: New USB device found, idVendor=0451, idProduct=2036
> <6>[ 85.023394] usb 2-2: New USB device strings: Mfr=0, Product=1,
> SerialNumber=0
> <6>[ 85.030572] usb 2-2: Product: General Purpose USB Hub
> <6>[ 85.043722] hub 2-2:1.0: USB hub found
> <6>[ 85.049569] hub 2-2:1.0: 2 ports detected
> <6>[ 85.367220] usb 3-2: new full-speed USB device number 2 using ohci-pci
> <3>[ 100.555212] usb 3-2: device descriptor read/64, error -110
> <3>[ 115.847196] usb 3-2: device descriptor read/64, error -110
> <6>[ 116.131178] usb 3-2: new full-speed USB device number 3 using ohci-pci
> <3>[ 131.311178] usb 3-2: device descriptor read/64, error -110
> <3>[ 146.595209] usb 3-2: device descriptor read/64, error -110
> <6>[ 146.887192] usb 3-2: new full-speed USB device number 4 using ohci-pci
> <3>[ 151.915584] usb 3-2: device descriptor read/8, error -110
> <3>[ 157.039436] usb 3-2: device descriptor read/8, error -110
> <6>[ 157.323180] usb 3-2: new full-speed USB device number 5 using ohci-pci
> <3>[ 162.352239] usb 3-2: device descriptor read/8, error -110
> <3>[ 167.484079] usb 3-2: device descriptor read/8, error -110
> <3>[ 167.591236] hub 3-0:1.0: unable to enumerate USB device on port 2

And this?

> <6>[ 167.936921] usb 2-2.1: new full-speed USB device number 3 using ohci-pci
> <6>[ 168.067890] usb 2-2.1: New USB device found, idVendor=076b,
> idProduct=a021
> <6>[ 168.074871] usb 2-2.1: New USB device strings: Mfr=1, Product=2,
> SerialNumber=0
> <6>[ 168.082226] usb 2-2.1: Product: Smart Card Reader
> <6>[ 168.086963] usb 2-2.1: Manufacturer: USB
> <6>[ 168.172893] usb 2-2.2: new low-speed USB device number 4 using ohci-pci
> <6>[ 168.300839] usb 2-2.2: New USB device found, idVendor=0aad,
> idProduct=0024
> <6>[ 168.307823] usb 2-2.2: New USB device strings: Mfr=1, Product=2,
> SerialNumber=0
> <6>[ 168.315180] usb 2-2.2: Product: FrontPanel USB Keyboard
> <6>[ 168.320436] usb 2-2.2: Manufacturer: Rohde&Schwarz
> <6>[ 168.337895] input: Rohde&Schwarz FrontPanel USB Keyboard as
> /devices/pci0000:00/0000:00:17.0/usb2/2-2/2-2.2/2-2.2:1.0/input/input0
> <6>[ 168.360988] input: Rohde&Schwarz FrontPanel USB Keyboard as
> /devices/pci0000:00/0000:00:17.0/usb2/2-2/2-2.2/2-2.2:1.1/input/input1

Since some devices work and some don't, maybe part of the problem lies
in the particular devices.

> This is the output of lsusb:
>
> Bus 001 Device 002: ID 0424:2514 Standard Microsystems Corp. USB 2.0 Hub
> Bus 001 Device 004: ID 0928:0007 Oxford Semiconductor, Ltd
> Bus 002 Device 002: ID 0451:2036 Texas Instruments, Inc. TUSB2036 Hub
> Bus 001 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub
> Bus 002 Device 001: ID 1d6b:0001 Linux Foundation 1.1 root hub
> Bus 003 Device 001: ID 1d6b:0001 Linux Foundation 1.1 root hub
> Bus 002 Device 003: ID 076b:a021 OmniKey AG CCID Smart Card Reader
> Bus 002 Device 004: ID 0aad:0024 Rohde & Schwarz GmbH & Co. KG
>
> My old kernel 3.4 does not show this problem. Since kernel 3.10 i need
> to reset to ehci-pci device when kexec. But this workaround does not
> work any longer on kernel 3.14.

Have you tried bisecting between 3.4 and 3.10 to find which commit
caused the behavior to change?

What about if you just do:

rmmod ehci-pci
modprobe ehci-pci

Alan Stern

2014-04-14 15:25:34

by Stefani Seibold

Subject: Re: Missing USB XHCI and EHCI reset for kexec


Quoting Alan Stern <[email protected]>:


>> <6>[ 167.936921] usb 2-2.1: new full-speed USB device number 3
>> using ohci-pci
>> <6>[ 168.067890] usb 2-2.1: New USB device found, idVendor=076b,
>> idProduct=a021
>> <6>[ 168.074871] usb 2-2.1: New USB device strings: Mfr=1, Product=2,
>> SerialNumber=0
>> <6>[ 168.082226] usb 2-2.1: Product: Smart Card Reader
>> <6>[ 168.086963] usb 2-2.1: Manufacturer: USB
>> <6>[ 168.172893] usb 2-2.2: new low-speed USB device number 4
>> using ohci-pci
>> <6>[ 168.300839] usb 2-2.2: New USB device found, idVendor=0aad,
>> idProduct=0024
>> <6>[ 168.307823] usb 2-2.2: New USB device strings: Mfr=1, Product=2,
>> SerialNumber=0
>> <6>[ 168.315180] usb 2-2.2: Product: FrontPanel USB Keyboard
>> <6>[ 168.320436] usb 2-2.2: Manufacturer: Rohde&Schwarz
>> <6>[ 168.337895] input: Rohde&Schwarz FrontPanel USB Keyboard as
>> /devices/pci0000:00/0000:00:17.0/usb2/2-2/2-2.2/2-2.2:1.0/input/input0
>> <6>[ 168.360988] input: Rohde&Schwarz FrontPanel USB Keyboard as
>> /devices/pci0000:00/0000:00:17.0/usb2/2-2/2-2.2/2-2.2:1.1/input/input1
>
> Since some devices work and some don't, maybe part of the problem lies
> in the particular devices.
>

The problem lies with the "Bus 001 Device 002: ID 0424:2514 Standard
Microsystems Corp. USB 2.0 Hub", which hangs for around 162 seconds
after a kexec.
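
The length of the stall can be read off the timestamps in the log above. The following is a rough sketch, assuming dmesg lines in the format posted in this thread and handling only ports on a root hub ("hub N-0:1.0"); the function name enum_stall_seconds is made up for illustration.

```shell
#!/bin/sh
# enum_stall_seconds: hypothetical helper, not part of any kernel tool.
# Reads dmesg lines on stdin and prints "<device> <seconds>" for every
# root-hub port whose enumeration was finally abandoned.
enum_stall_seconds() {
    awk '
    # Pull the kernel timestamp, e.g. 84.475224 out of "[ 84.475224]".
    match($0, /[0-9]+\.[0-9]+\]/) { ts = substr($0, RSTART, RLENGTH - 1) }

    # First enumeration attempt on a port: remember when it started.
    / new (high|full|low)-speed USB device number / {
        dev = $0
        sub(/^.*\] usb /, "", dev)
        sub(/:.*$/, "", dev)
        if (!(dev in start)) start[dev] = ts
    }

    # The root hub gave up: "hub B-0:1.0: unable to enumerate ... port P".
    / hub [0-9]+-0:.* unable to enumerate USB device on port / {
        bus = $0
        sub(/^.*\] hub /, "", bus)
        sub(/-.*$/, "", bus)
        dev = bus "-" $NF
        if (dev in start) printf "%s %.1f\n", dev, ts - start[dev]
    }
    '
}
```

Fed with the 3.14 log above (for example "dmesg | enum_stall_seconds"), it reports 81.8 seconds for 1-4 and 82.2 seconds for 3-2, which together account for most of the delay.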

The "Bus 002 Device 003: ID 076b:a021 OmniKey AG CCID Smart Card
Reader" and "Bus 002 Device 004: ID 0aad:0024 Rohde & Schwarz GmbH &
Co. KG" are attached to this Hub.

Another PowerPC device with nearly exactly the same HW, but without
this USB hub, works perfectly.

>> This is the output of lsusb:
>>
>> Bus 001 Device 002: ID 0424:2514 Standard Microsystems Corp. USB 2.0 Hub
>> Bus 001 Device 004: ID 0928:0007 Oxford Semiconductor, Ltd
>> Bus 002 Device 002: ID 0451:2036 Texas Instruments, Inc. TUSB2036 Hub
>> Bus 001 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub
>> Bus 002 Device 001: ID 1d6b:0001 Linux Foundation 1.1 root hub
>> Bus 003 Device 001: ID 1d6b:0001 Linux Foundation 1.1 root hub
>> Bus 002 Device 003: ID 076b:a021 OmniKey AG CCID Smart Card Reader
>> Bus 002 Device 004: ID 0aad:0024 Rohde & Schwarz GmbH & Co. KG
>>
>> My old kernel 3.4 does not show this problem. Since kernel 3.10 i need
>> to reset to ehci-pci device when kexec. But this workaround does not
>> work any longer on kernel 3.14.
>
> Have you tried bisecting between 3.4 and 3.10 to find which commit
> caused the behavior to change?
>

I cannot do a bisecting run for this embedded PowerPC device, since
it needs other patches, such as a BSP and an older squashfs, which are
not available. Nearly all of the generated kernels would not boot and
work. For x86 this is an easy job, but not for an "out of tree" PowerPC
device.

> What about if you just do:
>
> rmmod ehci-pci
> modprobe ehci-pci
>

The kernel is monolithic because the USB HW is needed at an early boot
stage. The problem also occurs with ehci-fsl, used by another
PowerPC device, where the controller is part of the SoC and is not
attached to the PCI bus.

One more thing: the
"echo 1 >/sys/bus/pci/drivers/ehci-pci/0000\:00\:17.2/reset" workaround
no longer works on kernel 3.14.

- Stefani

2014-04-14 15:45:18

by Stefani Seibold

Subject: Re: Missing USB XHCI and EHCI reset for kexec


Quoting Benjamin Herrenschmidt <[email protected]>:

> On Sun, 2014-04-13 at 21:22 +0200, Stefani Seibold wrote:
>> When executing a kexec kernel on a PowerPC board the new started kernel
>> will not find already enumerated USB devices due a missing reset on the
>> USB bus.
>>
>> As a work around a
>>
>> echo 1 >/sys/bus/pci/drivers/[ex]hci-pci/BUS-ADDRESS-OF-THE-HCD/reset
>>
>> will solve this. But this is far from beauty.
>>
>> My latest kernel without this issue was for EHCI kernel 2.6.39 and for
>> XHCI kernel 3.4, but i have no idea when exactly this behavior was
>> introduced.
>>
>> For X86 all is fine.
>
> I don't know about EHCI specifically but this is a known issue with
> XHCI, I observe similar issues on other powerpc platforms (servers)
> and this isn't architecture specific (looks more like actualy xhc
> implementation specific).
>
> Thadeu Cascardo (on CC) has been the one investigating that on our side,
> he might have more to add including patches.
>

I now have a kernel 3.14 dmesg log of the problem. After a kexec, the
kexec-ed 3.14 kernel shows:

[ 1.170029] xhci_hcd 0001:03:00.0: xHCI Host Controller
[ 1.175306] xhci_hcd 0001:03:00.0: new USB bus registered, assigned bus number 1
[ 1.212561] xhci_hcd 0001:03:00.0: Host not halted after 16000 microseconds.
[ 1.219621] xhci_hcd 0001:03:00.0: can't setup: -110
[ 1.224597] xhci_hcd 0001:03:00.0: USB bus 1 deregistered
[ 1.230021] xhci_hcd 0001:03:00.0: init 0001:03:00.0 fail, -110
[ 1.235955] xhci_hcd: probe of 0001:03:00.0 failed with error -110

Any idea?

- Stefani

2014-04-14 16:27:24

by Alan Stern

Subject: Re: Missing USB XHCI and EHCI reset for kexec

On Mon, 14 Apr 2014 [email protected] wrote:

> Quoting Alan Stern <[email protected]>:
>
>
> >> <6>[ 167.936921] usb 2-2.1: new full-speed USB device number 3
> >> using ohci-pci
> >> <6>[ 168.067890] usb 2-2.1: New USB device found, idVendor=076b,
> >> idProduct=a021
> >> <6>[ 168.074871] usb 2-2.1: New USB device strings: Mfr=1, Product=2,
> >> SerialNumber=0
> >> <6>[ 168.082226] usb 2-2.1: Product: Smart Card Reader
> >> <6>[ 168.086963] usb 2-2.1: Manufacturer: USB
> >> <6>[ 168.172893] usb 2-2.2: new low-speed USB device number 4
> >> using ohci-pci
> >> <6>[ 168.300839] usb 2-2.2: New USB device found, idVendor=0aad,
> >> idProduct=0024
> >> <6>[ 168.307823] usb 2-2.2: New USB device strings: Mfr=1, Product=2,
> >> SerialNumber=0
> >> <6>[ 168.315180] usb 2-2.2: Product: FrontPanel USB Keyboard
> >> <6>[ 168.320436] usb 2-2.2: Manufacturer: Rohde&Schwarz
> >> <6>[ 168.337895] input: Rohde&Schwarz FrontPanel USB Keyboard as
> >> /devices/pci0000:00/0000:00:17.0/usb2/2-2/2-2.2/2-2.2:1.0/input/input0
> >> <6>[ 168.360988] input: Rohde&Schwarz FrontPanel USB Keyboard as
> >> /devices/pci0000:00/0000:00:17.0/usb2/2-2/2-2.2/2-2.2:1.1/input/input1
> >
> > Since some devices work and some don't, maybe part of the problem lies
> > in the particular devices.
> >
>
> The problem lies on the "Bus 001 Device 002: ID 0424:2514 Standard
> Microsystems Corp. USB 2.0 Hub", which hangs for arround 162 seconds
> after a kexec.
>
> The "Bus 002 Device 003: ID 076b:a021 OmniKey AG CCID Smart Card
> Reader" and "Bus 002 Device 004: ID 0aad:0024 Rohde & Schwarz GmbH &
> Co. KG" are attached to this Hub.

Actually, it looks like they are plugged into the Texas Instruments
hub, not the Standard Microsystems hub (because they are on bus 2, not
bus 1). Did you rearrange the USB cables?

> An other PowerPC device which is nearly eactly the same HW but without
> this USB HUB works perfectly.

Maybe you should replace that hub with a different brand.

> >> This is the output of lsusb:
> >>
> >> Bus 001 Device 002: ID 0424:2514 Standard Microsystems Corp. USB 2.0 Hub
> >> Bus 001 Device 004: ID 0928:0007 Oxford Semiconductor, Ltd
> >> Bus 002 Device 002: ID 0451:2036 Texas Instruments, Inc. TUSB2036 Hub
> >> Bus 001 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub
> >> Bus 002 Device 001: ID 1d6b:0001 Linux Foundation 1.1 root hub
> >> Bus 003 Device 001: ID 1d6b:0001 Linux Foundation 1.1 root hub
> >> Bus 002 Device 003: ID 076b:a021 OmniKey AG CCID Smart Card Reader
> >> Bus 002 Device 004: ID 0aad:0024 Rohde & Schwarz GmbH & Co. KG

Here, the only device that might be plugged into the Standard
Microsystems hub is the Oxford Semiconductor thing (whatever it is).

> > What about if you just do:
> >
> > rmmod ehci-pci
> > modprobe ehci-pci
> >
>
> The kernel is monolitic because the USB HW is needed in a early boot
> stage. The problem also occurs with ehci-fsl used in by an other
> PowerPC device, which is a part of the SoC and is not attached to the
> PCI bus.
>
> One thing is that the "echo 1
> >/sys/bus/pci/drivers/ehci-pci/0000\:00\:17.2/reset" workaround will
> no longer work for kernel 3.14.

Instead, you could try

echo 0000:00:17.2 >/sys/bus/pci/drivers/ehci-pci/unbind
echo 0000:00:17.2 >/sys/bus/pci/drivers/ehci-pci/bind
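
For a monolithic kernel, the unbind/bind pair above could be scripted roughly as follows. This is a sketch only: the function name rebind_hcd, the SYSFS_ROOT override (for dry runs against a scratch directory) and the REBIND_DELAY pause are assumptions made for illustration, not something the kernel requires.

```shell
#!/bin/sh
# rebind_hcd: hypothetical helper, not part of any kernel tool.
# Detaches a PCI host controller from its driver and attaches it again.
#   $1: driver directory name (for example ehci-pci)
#   $2: PCI address of the HCD (for example 0000:00:17.2)
# SYSFS_ROOT (default /sys) and REBIND_DELAY (default 1 second) are
# assumed knobs for testing and tuning.
rebind_hcd() {
    driver=$1
    addr=$2
    drvdir="${SYSFS_ROOT:-/sys}/bus/pci/drivers/$driver"

    if [ ! -d "$drvdir" ]; then
        echo "rebind_hcd: no such driver directory: $drvdir" >&2
        return 1
    fi

    # Unbind first; stop here if the write is rejected.
    echo "$addr" > "$drvdir/unbind" || return 1

    # Give the driver a moment to tear the controller down.
    sleep "${REBIND_DELAY:-1}"

    # Bind again, which makes the driver probe the controller from scratch.
    echo "$addr" > "$drvdir/bind"
}
```

It would be called as "rebind_hcd ehci-pci 0000:00:17.2".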

Alan Stern

2014-04-14 16:36:33

by Stefani Seibold

Subject: Re: Missing USB XHCI and EHCI reset for kexec

On Monday, 14.04.2014, at 12:27 -0400, Alan Stern wrote:
> On Mon, 14 Apr 2014 [email protected] wrote:
>
> > Quoting Alan Stern <[email protected]>:
> >
> >
> > >> <6>[ 167.936921] usb 2-2.1: new full-speed USB device number 3
> > >> using ohci-pci
> > >> <6>[ 168.067890] usb 2-2.1: New USB device found, idVendor=076b,
> > >> idProduct=a021
> > >> <6>[ 168.074871] usb 2-2.1: New USB device strings: Mfr=1, Product=2,
> > >> SerialNumber=0
> > >> <6>[ 168.082226] usb 2-2.1: Product: Smart Card Reader
> > >> <6>[ 168.086963] usb 2-2.1: Manufacturer: USB
> > >> <6>[ 168.172893] usb 2-2.2: new low-speed USB device number 4
> > >> using ohci-pci
> > >> <6>[ 168.300839] usb 2-2.2: New USB device found, idVendor=0aad,
> > >> idProduct=0024
> > >> <6>[ 168.307823] usb 2-2.2: New USB device strings: Mfr=1, Product=2,
> > >> SerialNumber=0
> > >> <6>[ 168.315180] usb 2-2.2: Product: FrontPanel USB Keyboard
> > >> <6>[ 168.320436] usb 2-2.2: Manufacturer: Rohde&Schwarz
> > >> <6>[ 168.337895] input: Rohde&Schwarz FrontPanel USB Keyboard as
> > >> /devices/pci0000:00/0000:00:17.0/usb2/2-2/2-2.2/2-2.2:1.0/input/input0
> > >> <6>[ 168.360988] input: Rohde&Schwarz FrontPanel USB Keyboard as
> > >> /devices/pci0000:00/0000:00:17.0/usb2/2-2/2-2.2/2-2.2:1.1/input/input1
> > >
> > > Since some devices work and some don't, maybe part of the problem lies
> > > in the particular devices.
> > >
> >
> > The problem lies on the "Bus 001 Device 002: ID 0424:2514 Standard
> > Microsystems Corp. USB 2.0 Hub", which hangs for arround 162 seconds
> > after a kexec.
> >
> > The "Bus 002 Device 003: ID 076b:a021 OmniKey AG CCID Smart Card
> > Reader" and "Bus 002 Device 004: ID 0aad:0024 Rohde & Schwarz GmbH &
> > Co. KG" are attached to this Hub.
>
> Actually, it looks like they are plugged into the Texas Instruments
> hub, not the Standard Microsystems hub (because they are on bus 2, not
> bus 1). Did you rearrange the USB cables?
>

You are right, sorry for the confusion. I can't rearrange the cables
because the HUB is on board.

> > An other PowerPC device which is nearly eactly the same HW but without
> > this USB HUB works perfectly.
>
> Maybe you should replace that hub with a different brand.
>

That's not possible, because the hub is soldered onto the board. And it
is also not a HW issue, since the hub works perfectly with all previous
kernels, including 3.4.

> > >> This is the output of lsusb:
> > >>
> > >> Bus 001 Device 002: ID 0424:2514 Standard Microsystems Corp. USB 2.0 Hub
> > >> Bus 001 Device 004: ID 0928:0007 Oxford Semiconductor, Ltd
> > >> Bus 002 Device 002: ID 0451:2036 Texas Instruments, Inc. TUSB2036 Hub
> > >> Bus 001 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub
> > >> Bus 002 Device 001: ID 1d6b:0001 Linux Foundation 1.1 root hub
> > >> Bus 003 Device 001: ID 1d6b:0001 Linux Foundation 1.1 root hub
> > >> Bus 002 Device 003: ID 076b:a021 OmniKey AG CCID Smart Card Reader
> > >> Bus 002 Device 004: ID 0aad:0024 Rohde & Schwarz GmbH & Co. KG
>
> Here, the only device that might be plugged into the Standard
> Microsystems hub is the Oxford Semiconductor thing (whatever it is).
>
> > > What about if you just do:
> > >
> > > rmmod ehci-pci
> > > modprobe ehci-pci
> > >
> >
> > The kernel is monolitic because the USB HW is needed in a early boot
> > stage. The problem also occurs with ehci-fsl used in by an other
> > PowerPC device, which is a part of the SoC and is not attached to the
> > PCI bus.
> >
> > One thing is that the "echo 1
> > >/sys/bus/pci/drivers/ehci-pci/0000\:00\:17.2/reset" workaround will
> > no longer work for kernel 3.14.
>
> Instead, you could try
>
> echo 0000:00:17.2 >/sys/bus/pci/drivers/ehci-pci/unbind
> echo 0000:00:17.2 >/sys/bus/pci/drivers/ehci-pci/bind
>

I am now at home. I will do this tomorrow. Thanks so much for your
support.

- Stefani


2014-04-14 17:58:24

by Alan Stern

Subject: Re: Missing USB XHCI and EHCI reset for kexec

On Mon, 14 Apr 2014, Stefani Seibold wrote:

> > > An other PowerPC device which is nearly eactly the same HW but without
> > > this USB HUB works perfectly.
> >
> > Maybe you should replace that hub with a different brand.
> >
>
> Thats not possible, because the Hub is soldered on the board. And it is
> also not a HW issue, since the Hub works perfectly which all previous
> kernels including 3.4.

One other thing you can try is to increase the reset timeout in
drivers/usb/host/ehci-hub.c. This is under the USB_PORT_FEAT_RESET
case in ehci_hub_control(), around line 1225:

	/*
	 * caller must wait, then call GetPortStatus
	 * usb 2.0 spec says 50 ms resets on root
	 */
	ehci->reset_done [wIndex] = jiffies
			+ msecs_to_jiffies (50);

Increasing the 50 to 100 or more might help.

Alan Stern

Subject: Re: Missing USB XHCI and EHCI reset for kexec

On Mon, Apr 14, 2014 at 05:44:58PM +0200, [email protected] wrote:
>
> Quoting Benjamin Herrenschmidt <[email protected]>:
>
> >On Sun, 2014-04-13 at 21:22 +0200, Stefani Seibold wrote:
> >>When executing a kexec kernel on a PowerPC board the new started kernel
> >>will not find already enumerated USB devices due a missing reset on the
> >>USB bus.
> >>
> >>As a work around a
> >>
> >>echo 1 >/sys/bus/pci/drivers/[ex]hci-pci/BUS-ADDRESS-OF-THE-HCD/reset
> >>
> >>will solve this. But this is far from beauty.
> >>
> >>My latest kernel without this issue was for EHCI kernel 2.6.39 and for
> >>XHCI kernel 3.4, but i have no idea when exactly this behavior was
> >>introduced.
> >>
> >>For X86 all is fine.
> >
> >I don't know about EHCI specifically but this is a known issue with
> >XHCI, I observe similar issues on other powerpc platforms (servers)
> >and this isn't architecture specific (looks more like actualy xhc
> >implementation specific).
> >
> >Thadeu Cascardo (on CC) has been the one investigating that on our side,
> >he might have more to add including patches.
> >
>
> I have now a kernel 3.14 dmesg log of the problem. After a kexec the
> kexeced 3.14 kernel shows:
>
> [ 1.170029] xhci_hcd 0001:03:00.0: xHCI Host Controller
> [ 1.175306] xhci_hcd 0001:03:00.0: new USB bus registered,
> assigned bus number 1
> [ 1.212561] xhci_hcd 0001:03:00.0: Host not halted after 16000
> microseconds.
> [ 1.219621] xhci_hcd 0001:03:00.0: can't setup: -110
> [ 1.224597] xhci_hcd 0001:03:00.0: USB bus 1 deregistered
> [ 1.230021] xhci_hcd 0001:03:00.0: init 0001:03:00.0 fail, -110
> [ 1.235955] xhci_hcd: probe of 0001:03:00.0 failed with error -110
>
> Any idea?
>
> - Stefani
>
>

What is your controller vendor and device IDs? Is that a TI chip?

Can you check if the patch I sent a month ago fixes it? [1] There's the
whole story there. In fact, you will also need something like the patch
below. Can you apply only the first one, verify, and, then, the other
one as well, and report what worked for you?

[1] http://marc.info/?l=linux-usb&m=139483181809062&w=2

Thanks.
Cascardo.
---
diff --git a/drivers/usb/host/xhci-pci.c b/drivers/usb/host/xhci-pci.c
index 47390e3..232574f 100644
--- a/drivers/usb/host/xhci-pci.c
+++ b/drivers/usb/host/xhci-pci.c
@@ -392,7 +392,7 @@ static struct pci_driver xhci_pci_driver = {
 	.remove =	xhci_pci_remove,
 	/* suspend and resume implemented later */
 
-	.shutdown = 	usb_hcd_pci_shutdown,
+	.shutdown = 	xhci_pci_remove,
 #ifdef CONFIG_PM
 	.driver = {
 		.pm = &usb_hcd_pci_pm_ops

2014-04-15 10:04:34

by Stefani Seibold

Subject: Re: Missing USB XHCI and EHCI reset for kexec


Quoting Thadeu Lima de Souza Cascardo <[email protected]>:

> On Mon, Apr 14, 2014 at 05:44:58PM +0200, [email protected] wrote:
>>
>> Quoting Benjamin Herrenschmidt <[email protected]>:
>>
>> >I don't know about EHCI specifically but this is a known issue with
>> >XHCI, I observe similar issues on other powerpc platforms (servers)
>> >and this isn't architecture specific (looks more like actualy xhc
>> >implementation specific).
>> >
>> >Thadeu Cascardo (on CC) has been the one investigating that on our side,
>> >he might have more to add including patches.
>> >
>>
>> I have now a kernel 3.14 dmesg log of the problem. After a kexec the
>> kexeced 3.14 kernel shows:
>>
>> [ 1.170029] xhci_hcd 0001:03:00.0: xHCI Host Controller
>> [ 1.175306] xhci_hcd 0001:03:00.0: new USB bus registered,
>> assigned bus number 1
>> [ 1.212561] xhci_hcd 0001:03:00.0: Host not halted after 16000
>> microseconds.
>> [ 1.219621] xhci_hcd 0001:03:00.0: can't setup: -110
>> [ 1.224597] xhci_hcd 0001:03:00.0: USB bus 1 deregistered
>> [ 1.230021] xhci_hcd 0001:03:00.0: init 0001:03:00.0 fail, -110
>> [ 1.235955] xhci_hcd: probe of 0001:03:00.0 failed with error -110
>>
>
> What is your controller vendor and device IDs? Is that a TI chip?
>

Yes it is a TI chip, vendor ID 104c and product ID 8241.

> Can you check if the patch I sent a month ago fixes it? [1] There's the
> whole story there. In fact, you will also need something like the patch
> below. Can you apply only the first one, verify, and, then, the other
> one as well, and report what worked for you?
>
> [1] http://marc.info/?l=linux-usb&m=139483181809062&w=2
>

I tried the attached patch and it did not help. This is what I expected,
because it is a fix in the shutdown path, which is never called
when doing a forced kexec.

I am running a 3.10.23 kernel, which kexecs a 3.14 kernel. Since the
3.10.23 kernel does not perform a clean shutdown, the state of the
XHCI controller is undefined. So when the 3.14 kernel probes XHCI,
it finds an XHCI controller that has not been reset.

So I think it is necessary to reset the XHCI controller and all
devices on this bus. This is what I do with an "echo 1
>/sys/bus/pci/drivers/xhci_hcd/0001:03:00.0/reset" before the kexec.

- Stefani
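
The reset-before-kexec workaround described above can be sketched as a small shell function. This is only an illustration under assumptions: the function name `pci_function_reset` and the optional second argument (a sysfs root, so the function can be tried against a scratch directory) are not from the thread. It writes to the `reset` attribute under `/sys/bus/pci/devices/`, which is the same device directory that the driver path in the message links to.

```shell
# Hypothetical helper: trigger a PCI function reset through sysfs.
#   $1  PCI address of the host controller, e.g. 0001:03:00.0
#   $2  sysfs mount point (assumption: defaults to /sys; override for dry runs)
pci_function_reset() {
    addr=$1
    sysfs=${2:-/sys}
    reset_file="$sysfs/bus/pci/devices/$addr/reset"

    # The reset attribute is missing when the device is unknown or the
    # kernel has no reset method for it.
    if [ ! -e "$reset_file" ]; then
        echo "no reset attribute for $addr" >&2
        return 1
    fi
    echo 1 > "$reset_file"
}
```

Run as root, `pci_function_reset 0001:03:00.0` would be called before invoking kexec. Whether the reset is sufficient for a given controller is exactly what the thread is debating.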

Subject: Re: Missing USB XHCI and EHCI reset for kexec

On Tue, Apr 15, 2014 at 12:04:17PM +0200, [email protected] wrote:
>
> Quoting Thadeu Lima de Souza Cascardo <[email protected]>:
>
> >On Mon, Apr 14, 2014 at 05:44:58PM +0200, [email protected] wrote:
> >>
> >>Quoting Benjamin Herrenschmidt <[email protected]>:
> >>
> >>>I don't know about EHCI specifically but this is a known issue with
> >>>XHCI, I observe similar issues on other powerpc platforms (servers)
> >>>and this isn't architecture specific (looks more like actualy xhc
> >>>implementation specific).
> >>>
> >>>Thadeu Cascardo (on CC) has been the one investigating that on our side,
> >>>he might have more to add including patches.
> >>>
> >>
> >>I have now a kernel 3.14 dmesg log of the problem. After a kexec the
> >>kexeced 3.14 kernel shows:
> >>
> >>[ 1.170029] xhci_hcd 0001:03:00.0: xHCI Host Controller
> >>[ 1.175306] xhci_hcd 0001:03:00.0: new USB bus registered,
> >>assigned bus number 1
> >>[ 1.212561] xhci_hcd 0001:03:00.0: Host not halted after 16000
> >>microseconds.
> >>[ 1.219621] xhci_hcd 0001:03:00.0: can't setup: -110
> >>[ 1.224597] xhci_hcd 0001:03:00.0: USB bus 1 deregistered
> >>[ 1.230021] xhci_hcd 0001:03:00.0: init 0001:03:00.0 fail, -110
> >>[ 1.235955] xhci_hcd: probe of 0001:03:00.0 failed with error -110
> >>
> >
> >What is your controller vendor and device IDs? Is that a TI chip?
> >
>
> Yes it is a TI chip, vendor ID 104c and product ID 8241.
>
> >Can you check if the patch I sent a month ago fixes it? [1] There's the
> >whole story there. In fact, you will also need something like the patch
> >below. Can you apply only the first one, verify, and, then, the other
> >one as well, and report what worked for you?
> >
> >[1] http://marc.info/?l=linux-usb&m=139483181809062&w=2
> >
>
> I tried the attach patch and it did not help. This is what i
> expected because this is a fix in the shutdown path, which will
> never called when doing a forced kexec.

Hi, Stefani.

Did you try with both patches applied? How do you invoke the forced
kexec? Is that a kexec on panic? Does it really need to be forced? With
no clean shutdown, platform and drivers would need to issue resets, like
you mentioned below, to get the system into a clean state.

>
> I have a running a 3.10.23 kernel. This kernel do a kexec for a
> kernel 3.14. Since the kernel 3.10.23 did not performe a clean
> shutdown, the state of the XHCI Controller is undefined. So when

And the clean shutdown requires both of my patches, for TI chips, as far
as I know. It looks like the problem is issuing a halt when there are
pending URBs.

> kernel 3.14 will probe XHCI it will find a XHCI controller which was
> not performed a reset.
>

The problem is not that a reset hasn't been issued. A PCI function reset
should fix most of the problems with a bad device state, when the reset
works. However, the problem is that it was not cleanly shut down. URBs
should have been canceled and removed from the controller queue, and it
should have halted after that.

> So i think it is necessary to reset the XHCI controller and all
> devices on this bus. This is what i do with a "echo 1
> >/sys/bus/pci/drivers/xhci_hcd/0001:03:00.0/reset" before the kexec.
>

One way to approach that is to make the PCI code issue resets to all buses
before doing any other access. That will make booting slower, and
there are a lot of other corner cases where this might not be enough.
It's probably saner to try to get the 3.10.23 kernel to do a clean
shutdown, if possible.

Regards.
Cascardo.

> - Stefani
>
>

2014-04-15 15:00:43

by Stefani Seibold

[permalink] [raw]
Subject: Re: Missing USB XHCI and EHCI reset for kexec


Quoting Thadeu Lima de Souza Cascardo <[email protected]>:

> On Tue, Apr 15, 2014 at 12:04:17PM +0200, [email protected] wrote:
>>
>> Quoting Thadeu Lima de Souza Cascardo <[email protected]>:
>>
>> >On Mon, Apr 14, 2014 at 05:44:58PM +0200, [email protected] wrote:
>> >>
>> >>Quoting Benjamin Herrenschmidt <[email protected]>:
>> >>
>> >>>I don't know about EHCI specifically but this is a known issue with
>> >>>XHCI, I observe similar issues on other powerpc platforms (servers)
>> >>>and this isn't architecture specific (looks more like actualy xhc
>> >>>implementation specific).
>> >>>
>> >>>Thadeu Cascardo (on CC) has been the one investigating that on our side,
>> >>>he might have more to add including patches.
>> >>>
>> >>
>> >>I have now a kernel 3.14 dmesg log of the problem. After a kexec the
>> >>kexeced 3.14 kernel shows:
>> >>
>> >>[ 1.170029] xhci_hcd 0001:03:00.0: xHCI Host Controller
>> >>[ 1.175306] xhci_hcd 0001:03:00.0: new USB bus registered,
>> >>assigned bus number 1
>> >>[ 1.212561] xhci_hcd 0001:03:00.0: Host not halted after 16000
>> >>microseconds.
>> >>[ 1.219621] xhci_hcd 0001:03:00.0: can't setup: -110
>> >>[ 1.224597] xhci_hcd 0001:03:00.0: USB bus 1 deregistered
>> >>[ 1.230021] xhci_hcd 0001:03:00.0: init 0001:03:00.0 fail, -110
>> >>[ 1.235955] xhci_hcd: probe of 0001:03:00.0 failed with error -110
>> >>
>> >
>> >What is your controller vendor and device IDs? Is that a TI chip?
>> >
>>
>> Yes it is a TI chip, vendor ID 104c and product ID 8241.
>>
>> >Can you check if the patch I sent a month ago fixes it? [1] There's the
>> >whole story there. In fact, you will also need something like the patch
>> >below. Can you apply only the first one, verify, and, then, the other
>> >one as well, and report what worked for you?
>> >
>> >[1] http://marc.info/?l=linux-usb&m=139483181809062&w=2
>> >
>>
>> I tried the attach patch and it did not help. This is what i
>> expected because this is a fix in the shutdown path, which will
>> never called when doing a forced kexec.
>
> Hi, Stefani.
>
> Did you try with both patches applied? How do you evoke the forced
> kexec? Is that a kexec on panic? Does it really need to be forced? With
> no clean shutdown, platform and drivers would need to issue resets, like
> you mentioned below, to get the system into a clean state.
>

Yes, I applied both patches, but without success.

IMHO it is necessary to bring the device into a clean state when
the driver starts using the HW.

>>
>> I have a running a 3.10.23 kernel. This kernel do a kexec for a
>> kernel 3.14. Since the kernel 3.10.23 did not performe a clean
>> shutdown, the state of the XHCI Controller is undefined. So when
>
> And the clean shutdown requires both of my patches, for TI chips, as far
> as I know. It looks like the problem is issuing a halt when there are
> pending URBs.
>
>> kernel 3.14 will probe XHCI it will find a XHCI controller which was
>> not performed a reset.
>>
>
> The problem is not that a reset hasn't been issued. A PCI function reset
> should fix most of the problems with a bad device state, when the reset
> works. However, the problem is that it was not cleanly shut down. URBs
> should have been canceled and removed from the controller queue, and it
> should have halted after that.

Again, I think it is the job of the driver to bring the chip into a clean
state before using it. A driver should never assume that a chip is in its
reset state.

>
>> So i think it is necessary to reset the XHCI controller and all
>> devices on this bus. This is what i do with a "echo 1
>> >/sys/bus/pci/drivers/xhci_hcd/0001:03:00.0/reset" before the kexec.
>>
>
> One way to look at that is making the PCI code issue resets to all buses
> before doing any other access. That will make booting more slow, and
> there are a lot of other corner cases where this might not be enough.
> It's probably more sane to try to get the 3.10.23 kernel to do a clean
> shutdown, if possible.
>

With this driver design, the kexec functionality is useless on PowerPC.
X86 looks a little bit better.

- Stefani

2014-04-15 18:31:13

by Stefani Seibold

[permalink] [raw]
Subject: Re: Missing USB XHCI and EHCI reset for kexec


On Monday, 14 Apr 2014, at 12:27 -0400, Alan Stern wrote:
> On Mon, 14 Apr 2014 [email protected] wrote:
>
> > Quoting Alan Stern <[email protected]>:
> >
> >
> > >> <6>[ 167.936921] usb 2-2.1: new full-speed USB device number 3
> > >> using ohci-pci
> > >> <6>[ 168.067890] usb 2-2.1: New USB device found, idVendor=076b,
> > >> idProduct=a021
> > >> <6>[ 168.074871] usb 2-2.1: New USB device strings: Mfr=1, Product=2,
> > >> SerialNumber=0
> > >> <6>[ 168.082226] usb 2-2.1: Product: Smart Card Reader
> > >> <6>[ 168.086963] usb 2-2.1: Manufacturer: USB
> > >> <6>[ 168.172893] usb 2-2.2: new low-speed USB device number 4
> > >> using ohci-pci
> > >> <6>[ 168.300839] usb 2-2.2: New USB device found, idVendor=0aad,
> > >> idProduct=0024
> > >> <6>[ 168.307823] usb 2-2.2: New USB device strings: Mfr=1, Product=2,
> > >> SerialNumber=0
> > >> <6>[ 168.315180] usb 2-2.2: Product: FrontPanel USB Keyboard
> > >> <6>[ 168.320436] usb 2-2.2: Manufacturer: Rohde&Schwarz
> > >> <6>[ 168.337895] input: Rohde&Schwarz FrontPanel USB Keyboard as
> > >> /devices/pci0000:00/0000:00:17.0/usb2/2-2/2-2.2/2-2.2:1.0/input/input0
> > >> <6>[ 168.360988] input: Rohde&Schwarz FrontPanel USB Keyboard as
> > >> /devices/pci0000:00/0000:00:17.0/usb2/2-2/2-2.2/2-2.2:1.1/input/input1
> > >
> > > Since some devices work and some don't, maybe part of the problem lies
> > > in the particular devices.
> > >
> >
> > The problem lies in the "Bus 001 Device 002: ID 0424:2514 Standard
> > Microsystems Corp. USB 2.0 Hub", which hangs for around 162 seconds
> > after a kexec.
> >
> > The "Bus 002 Device 003: ID 076b:a021 OmniKey AG CCID Smart Card
> > Reader" and "Bus 002 Device 004: ID 0aad:0024 Rohde & Schwarz GmbH &
> > Co. KG" are attached to this Hub.

> > > What about if you just do:
> > >
> > > rmmod ehci-pci
> > > modprobe ehci-pci
> > >
> >
> > The kernel is monolithic because the USB HW is needed at an early boot
> > stage. The problem also occurs with ehci-fsl, used by another
> > PowerPC device, where the controller is part of the SoC and is not
> > attached to the PCI bus.
> >
> > One thing is that the "echo 1
> > >/sys/bus/pci/drivers/ehci-pci/0000\:00\:17.2/reset" workaround will
> > no longer work for kernel 3.14.
>
> Instead, you could try
>
> echo 0000:00:17.2 >/sys/bus/pci/drivers/ehci-pci/unbind
> echo 0000:00:17.2 >/sys/bus/pci/drivers/ehci-pci/bind
>

I did an unbind and bind of ehci-pci and ohci-pci. After this I got
the following dmesg log:

ehci_hcd: USB 2.0 'Enhanced' Host Controller (EHCI) Driver
ehci-pci: EHCI PCI platform driver
ehci-pci 0000:00:17.2: EHCI Host Controller
ehci-pci 0000:00:17.2: new USB bus registered, assigned bus number 1
ehci-pci 0000:00:17.2: irq 22, io mem 0xc0006800
ehci-pci 0000:00:17.2: USB 2.0 started, EHCI 1.00
hub 1-0:1.0: USB hub found
hub 1-0:1.0: 5 ports detected
ohci_hcd: USB 1.1 'Open' Host Controller (OHCI) Driver
ohci_hcd 0000:00:17.0: OHCI Host Controller
ohci_hcd 0000:00:17.0: new USB bus registered, assigned bus number 2
ohci_hcd 0000:00:17.0: irq 20, io mem 0xc0004000
hub 2-0:1.0: USB hub found
hub 2-0:1.0: 3 ports detected
ohci_hcd 0000:00:17.1: OHCI Host Controller
ohci_hcd 0000:00:17.1: new USB bus registered, assigned bus number 3
ohci_hcd 0000:00:17.1: irq 21, io mem 0xc0005000
hub 3-0:1.0: USB hub found
hub 3-0:1.0: 2 ports detected
Freescale High-Speed USB SOC Device Controller driver (Apr 20, 2007)
mousedev: PS/2 mouse device common for all mice
i2c /dev entries driver
mpc-i2c fef03000.i2c: timeout 1000000 us
rtc-rs5c372 0-0032: rs5c372a found, 24hr, driver version 0.6
rtc-rs5c372 0-0032: rtc core: registered rtc-rs5c372 as rtc0
mpc-i2c fef03100.i2c: timeout 1000000 us
usbcore: registered new interface driver usbhid
usbhid: USB HID core driver
rsfrontp: using key table SMBV (117)
usbcore: registered new interface driver rsfrontp
rsfrontp: R&S USB HID Frontpanel driver (v1.2)
usbcore: registered new interface driver rsknop
rsknop: R&S USB HID Knop support (v1.4)
usb 1-2: new high-speed USB device number 2 using ehci-pci
zram: Created 1 device(s) ...
TCP: cubic registered
NET: Registered protocol family 17
rtc-rs5c372 0-0032: setting system clock to 2014-04-14 14:51:50 UTC (1397487110)
Freeing unused kernel memory: 996K (c032e000 - c0427000)
hub 1-2:1.0: USB hub found
hub 1-2:1.0: 4 ports detected
yaffs: dev is 32505859 name is "mtdblock3" rw
yaffs: passed flags ""
yaffs: yaffs: Attempting MTD mount of 31.3,"mtdblock3"
yaffs: auto selecting yaffs2
yaffs: yaffs_read_super: is_checkpointed 1
usb 2-2: new full-speed USB device number 2 using ohci_hcd
hub 2-2:1.0: USB hub found
hub 2-2:1.0: 2 ports detected
usbcore: registered new interface driver usb-storage
usb 2-2.1: new full-speed USB device number 3 using ohci_hcd
usb 2-2.2: new low-speed USB device number 4 using ohci_hcd
input: Rohde&Schwarz FrontPanel USB Keyboard as /devices/pci0000:00/0000:00:17.0/usb2/2-2/2-2.2/2-2.2:1.0/input/input0
fsl-gianfar fef24000.ethernet eth0: mac: 00:90:b8:1b:36:37
input: Rohde&Schwarz FrontPanel USB Keyboard as /devices/pci0000:00/0000:00:17.0/usb2/2-2/2-2.2/2-2.2:1.1/input/input1
fsl-gianfar fef24000.ethernet eth0: Running with NAPI enabled
fsl-gianfar fef24000.ethernet eth0: RX BD ring size for Q[0]: 256
fsl-gianfar fef24000.ethernet eth0: TX BD ring size for Q[0]: 256
udevd[114]: starting version 171
libphy: mdio@fef24520:00 - Link is Up - 100/Full
Adding 131068k swap on /dev/zram0. Priority:-1 extents:1 across:131068k SS
Initializing SATA USB Mass Storage driver...
usbcore: registered new interface driver USB-SATA-storage
USB SATA Mass Storage support registered.
usb 1-4: new high-speed USB device number 4 using ehci-pci
xxxx: ports detected
ohci_hcd 0000:00:17.0: remove, state 1
usb usb2: USB disconnect, device number 1
usb 2-2: USB disconnect, device number 2
usb 2-2.1: USB disconnect, device number 3
usb 2-2.2: USB disconnect, device number 4
ohci_hcd 0000:00:17.0: USB bus 2 deregistered
ohci_hcd 0000:00:17.1: remove, state 1
usb usb3: USB disconnect, device number 1
ohci_hcd 0000:00:17.1: USB bus 3 deregistered
ehci-pci 0000:00:17.2: remove, state 1
usb usb1: USB disconnect, device number 1
usb 1-2: USB disconnect, device number 2
usb 1-4: USB disconnect, device number 4
ehci-pci 0000:00:17.2: USB bus 1 deregistered
ohci_hcd 0000:00:17.0: OHCI Host Controller
ohci_hcd 0000:00:17.0: new USB bus registered, assigned bus number 1
ohci_hcd 0000:00:17.0: irq 20, io mem 0xc0004000
hub 1-0:1.0: USB hub found
hub 1-0:1.0: 3 ports detected
ohci_hcd 0000:00:17.1: OHCI Host Controller
ohci_hcd 0000:00:17.1: new USB bus registered, assigned bus number 2
ohci_hcd 0000:00:17.1: irq 21, io mem 0xc0005000
hub 2-0:1.0: USB hub found
hub 2-0:1.0: 2 ports detected
ehci-pci 0000:00:17.2: EHCI Host Controller
ehci-pci 0000:00:17.2: new USB bus registered, assigned bus number 3
ehci-pci 0000:00:17.2: irq 22, io mem 0xc0006800
ehci-pci 0000:00:17.2: USB 2.0 started, EHCI 1.00
hub 3-0:1.0: USB hub found
hub 3-0:1.0: 5 ports detected
hub 1-0:1.0: USB hub found
hub 1-0:1.0: 3 ports detected
hub 2-0:1.0: USB hub found
hub 2-0:1.0: 2 ports detected
usb 3-2: new high-speed USB device number 2 using ehci-pci
hub 3-2:1.0: USB hub found
hub 3-2:1.0: 4 ports detected
usb 3-4: new high-speed USB device number 4 using ehci-pci
usb 3-4: device descriptor read/64, error -110
usb 3-4: device descriptor read/64, error -110
usb 3-4: new high-speed USB device number 5 using ehci-pci
usb 3-4: device descriptor read/64, error -110
usb 3-4: device descriptor read/64, error -110
usb 3-4: new high-speed USB device number 6 using ehci-pci
usb 3-4: device descriptor read/8, error -110
usb 3-4: device descriptor read/8, error -110
usb 3-4: new high-speed USB device number 7 using ehci-pci
usb 3-4: device descriptor read/8, error -110
usb 3-4: device descriptor read/8, error -110
hub 3-0:1.0: unable to enumerate USB device on port 4
usb 1-2: new full-speed USB device number 2 using ohci_hcd
hub 1-2:1.0: USB hub found
hub 1-2:1.0: 2 ports detected
usb 2-2: new full-speed USB device number 2 using ohci_hcd
usb 2-2: device descriptor read/64, error -110
usb 2-2: device descriptor read/64, error -110
usb 2-2: new full-speed USB device number 3 using ohci_hcd
usb 2-2: device descriptor read/64, error -110
usb 2-2: device descriptor read/64, error -110
usb 2-2: new full-speed USB device number 4 using ohci_hcd
usb 2-2: device descriptor read/8, error -110
usb 2-2: device descriptor read/8, error -110
usb 2-2: new full-speed USB device number 5 using ohci_hcd
usb 2-2: device descriptor read/8, error -110
usb 2-2: device descriptor read/8, error -110
hub 2-0:1.0: unable to enumerate USB device on port 2
usb 1-2.1: new full-speed USB device number 3 using ohci_hcd
usb 1-2.2: new low-speed USB device number 4 using ohci_hcd
input: Rohde&Schwarz FrontPanel USB Keyboard as /devices/pci0000:00/0000:00:17.0/usb1/1-2/1-2.2/1-2.2:1.0/input/input2
input: Rohde&Schwarz FrontPanel USB Keyboard as /devices/pci0000:00/0000:00:17.0/usb1/1-2/1-2.2/1-2.2:1.1/input/input3

As you can see, there is a difference between the USB ports detected
during the boot phase and after the unbind/bind:

hub 1-0:1.0: 5 ports detected
hub 2-0:1.0: 3 ports detected
hub 3-0:1.0: 2 ports detected
hub 1-2:1.0: 4 ports detected
hub 2-2:1.0: 2 ports detected

echo 0000:00:17.0 >/sys/bus/pci/drivers/ohci-pci/unbind
echo 0000:00:17.1 >/sys/bus/pci/drivers/ohci-pci/unbind
echo 0000:00:17.2 >/sys/bus/pci/drivers/ehci-pci/unbind

echo 0000:00:17.0 >/sys/bus/pci/drivers/ohci-pci/bind
echo 0000:00:17.1 >/sys/bus/pci/drivers/ohci-pci/bind
echo 0000:00:17.2 >/sys/bus/pci/drivers/ehci-pci/bind

hub 1-0:1.0: 3 ports detected
hub 2-0:1.0: 2 ports detected
hub 3-0:1.0: 5 ports detected
hub 1-0:1.0: 3 ports detected
hub 2-0:1.0: 2 ports detected
hub 3-2:1.0: 4 ports detected
hub 1-2:1.0: 2 ports detected

This was kernel 3.10, but I get similar results for 3.14.

- Stefani
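
The unbind/bind cycle used above can be wrapped in a small function. This is a sketch under assumptions: the name `pci_rebind` and the optional third argument (a sysfs root for dry runs) are made up for illustration. Unlike the sequence in the message, which unbinds all three controllers before binding any of them again, it handles one device at a time.

```shell
# Hypothetical helper: unbind a PCI device from its driver and bind it again.
#   $1  driver directory name under /sys/bus/pci/drivers, e.g. ehci-pci
#   $2  PCI address, e.g. 0000:00:17.2
#   $3  sysfs mount point (assumption: defaults to /sys; override for dry runs)
pci_rebind() {
    drv_dir="${3:-/sys}/bus/pci/drivers/$1"

    # Stop here if the unbind fails, so we never bind a device twice.
    echo "$2" > "$drv_dir/unbind" || return 1
    echo "$2" > "$drv_dir/bind"
}
```

For example, `pci_rebind ehci-pci 0000:00:17.2` corresponds to the two `echo` lines for the EHCI controller.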






2014-04-15 18:33:21

by Stefani Seibold

[permalink] [raw]
Subject: Re: Missing USB XHCI and EHCI reset for kexec

On Monday, 14 Apr 2014, at 13:58 -0400, Alan Stern wrote:
> On Mon, 14 Apr 2014, Stefani Seibold wrote:
>
> > > > Another PowerPC device, which has nearly exactly the same HW but
> > > > without this USB hub, works perfectly.
> > >
> > > Maybe you should replace that hub with a different brand.
> > >
> >
> > That's not possible, because the hub is soldered onto the board. And it is
> > also not a HW issue, since the hub works perfectly with all previous
> > kernels including 3.4.
>
> One other thing you can try is to increase the reset timeout in
> drivers/usb/host/ehci-hub.c. This is under the USB_PORT_FEAT_RESET
> case in ehci_hub_control(), around line 1225:
>
> /*
> * caller must wait, then call GetPortStatus
> * usb 2.0 spec says 50 ms resets on root
> */
> ehci->reset_done [wIndex] = jiffies
> + msecs_to_jiffies (50);
>
> Increasing the 50 to 100 or more might help.
>
> Alan Stern
>

I tried this. When I increase the value to 1000, the reset and
enumeration process after a kexec is faster: 28 seconds vs. 162
seconds.

- Stefani
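
To put the 50 ms and 1000 ms values in terms of timer ticks: the kernel's `msecs_to_jiffies()` rounds up to whole ticks, so the result depends on the configured tick rate. The sketch below mirrors that rounding in shell arithmetic. The function name `ms_to_ticks` is made up, and the tick rate passed in is an assumption, since the thread does not state the board's `CONFIG_HZ`.

```shell
# Sketch of a rounding-up conversion from milliseconds to timer ticks.
#   $1  delay in milliseconds
#   $2  tick rate in Hz (assumption: the kernel's CONFIG_HZ, e.g. 100, 250 or 1000)
ms_to_ticks() {
    # Adding 999 before the integer division rounds partial ticks up.
    echo $(( ($1 * $2 + 999) / 1000 ))
}
```

With an assumed tick rate of 250 Hz, `ms_to_ticks 50 250` gives 13 and `ms_to_ticks 1000 250` gives 250.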

Subject: Re: Missing USB XHCI and EHCI reset for kexec

On Tue, Apr 15, 2014 at 05:00:28PM +0200, [email protected] wrote:
>
> Quoting Thadeu Lima de Souza Cascardo <[email protected]>:
>
> >On Tue, Apr 15, 2014 at 12:04:17PM +0200, [email protected] wrote:
> >>
> >>Quoting Thadeu Lima de Souza Cascardo <[email protected]>:
> >>
> >>>On Mon, Apr 14, 2014 at 05:44:58PM +0200, [email protected] wrote:
> >>>>
> >>>>Quoting Benjamin Herrenschmidt <[email protected]>:
> >>>>
> >>>>>I don't know about EHCI specifically but this is a known issue with
> >>>>>XHCI, I observe similar issues on other powerpc platforms (servers)
> >>>>>and this isn't architecture specific (looks more like actualy xhc
> >>>>>implementation specific).
> >>>>>
> >>>>>Thadeu Cascardo (on CC) has been the one investigating that on our side,
> >>>>>he might have more to add including patches.
> >>>>>
> >>>>
> >>>>I have now a kernel 3.14 dmesg log of the problem. After a kexec the
> >>>>kexeced 3.14 kernel shows:
> >>>>
> >>>>[ 1.170029] xhci_hcd 0001:03:00.0: xHCI Host Controller
> >>>>[ 1.175306] xhci_hcd 0001:03:00.0: new USB bus registered,
> >>>>assigned bus number 1
> >>>>[ 1.212561] xhci_hcd 0001:03:00.0: Host not halted after 16000
> >>>>microseconds.
> >>>>[ 1.219621] xhci_hcd 0001:03:00.0: can't setup: -110
> >>>>[ 1.224597] xhci_hcd 0001:03:00.0: USB bus 1 deregistered
> >>>>[ 1.230021] xhci_hcd 0001:03:00.0: init 0001:03:00.0 fail, -110
> >>>>[ 1.235955] xhci_hcd: probe of 0001:03:00.0 failed with error -110
> >>>>
> >>>
> >>>What is your controller vendor and device IDs? Is that a TI chip?
> >>>
> >>
> >>Yes it is a TI chip, vendor ID 104c and product ID 8241.
> >>
> >>>Can you check if the patch I sent a month ago fixes it? [1] There's the
> >>>whole story there. In fact, you will also need something like the patch
> >>>below. Can you apply only the first one, verify, and, then, the other
> >>>one as well, and report what worked for you?
> >>>
> >>>[1] http://marc.info/?l=linux-usb&m=139483181809062&w=2
> >>>
> >>
> >>I tried the attach patch and it did not help. This is what i
> >>expected because this is a fix in the shutdown path, which will
> >>never called when doing a forced kexec.
> >
> >Hi, Stefani.
> >
> >Did you try with both patches applied? How do you evoke the forced
> >kexec? Is that a kexec on panic? Does it really need to be forced? With
> >no clean shutdown, platform and drivers would need to issue resets, like
> >you mentioned below, to get the system into a clean state.
> >
>
> Yes, i applied both patches. But without success.
>
> IMHO i think it is necessary to bring the device i a clean state
> when the driver use the HW.
>
> >>
> >>I have a running a 3.10.23 kernel. This kernel do a kexec for a
> >>kernel 3.14. Since the kernel 3.10.23 did not performe a clean
> >>shutdown, the state of the XHCI Controller is undefined. So when
> >
> >And the clean shutdown requires both of my patches, for TI chips, as far
> >as I know. It looks like the problem is issuing a halt when there are
> >pending URBs.
> >
> >>kernel 3.14 will probe XHCI it will find a XHCI controller which was
> >>not performed a reset.
> >>
> >
> >The problem is not that a reset hasn't been issued. A PCI function reset
> >should fix most of the problems with a bad device state, when the reset
> >works. However, the problem is that it was not cleanly shut down. URBs
> >should have been canceled and removed from the controller queue, and it
> >should have halted after that.
>
> Again, i think it is the job of the driver to bring the chip in a clean state
> before using them. A driver should never expect a reset state of a chip.
>
> >
> >>So i think it is necessary to reset the XHCI controller and all
> >>devices on this bus. This is what i do with a "echo 1
> >>>/sys/bus/pci/drivers/xhci_hcd/0001:03:00.0/reset" before the kexec.
> >>
> >
> >One way to look at that is making the PCI code issue resets to all buses
> >before doing any other access. That will make booting more slow, and
> >there are a lot of other corner cases where this might not be enough.
> >It's probably more sane to try to get the 3.10.23 kernel to do a clean
> >shutdown, if possible.
> >
>
> With this driver design the kexec functionality is usesless on PowerPC.
> X86 looks a little bit better.
>
> - Stefani
>
>

What are the vendor and device IDs of the controller on your X86 system?
This is not a matter of which architecture you are using. It's that this
XHCI controller does not behave as well as the one you are using on
X86, which is likely an Intel one.

Cascardo.

2014-04-15 18:43:56

by Stefani Seibold

[permalink] [raw]
Subject: Re: Missing USB XHCI and EHCI reset for kexec

On Tuesday, 15 Apr 2014, at 15:33 -0300, Thadeu Lima de Souza Cascardo
wrote:
> On Tue, Apr 15, 2014 at 05:00:28PM +0200, [email protected] wrote:
> >
> > Quoting Thadeu Lima de Souza Cascardo <[email protected]>:
> >
> > >On Tue, Apr 15, 2014 at 12:04:17PM +0200, [email protected] wrote:
> > >>
> > >>Quoting Thadeu Lima de Souza Cascardo <[email protected]>:
> > >>
> > >>>On Mon, Apr 14, 2014 at 05:44:58PM +0200, [email protected] wrote:
> > >>>>
> > >>>>Quoting Benjamin Herrenschmidt <[email protected]>:
> > >>>>
> > >>>>>I don't know about EHCI specifically but this is a known issue with
> > >>>>>XHCI, I observe similar issues on other powerpc platforms (servers)
> > >>>>>and this isn't architecture specific (looks more like actualy xhc
> > >>>>>implementation specific).
> > >>>>>
> > >>>>>Thadeu Cascardo (on CC) has been the one investigating that on our side,
> > >>>>>he might have more to add including patches.
> > >>>>>
> > >>>>
> > >>>>I have now a kernel 3.14 dmesg log of the problem. After a kexec the
> > >>>>kexeced 3.14 kernel shows:
> > >>>>
> > >>>>[ 1.170029] xhci_hcd 0001:03:00.0: xHCI Host Controller
> > >>>>[ 1.175306] xhci_hcd 0001:03:00.0: new USB bus registered,
> > >>>>assigned bus number 1
> > >>>>[ 1.212561] xhci_hcd 0001:03:00.0: Host not halted after 16000
> > >>>>microseconds.
> > >>>>[ 1.219621] xhci_hcd 0001:03:00.0: can't setup: -110
> > >>>>[ 1.224597] xhci_hcd 0001:03:00.0: USB bus 1 deregistered
> > >>>>[ 1.230021] xhci_hcd 0001:03:00.0: init 0001:03:00.0 fail, -110
> > >>>>[ 1.235955] xhci_hcd: probe of 0001:03:00.0 failed with error -110
> > >>>>
> > >>>
> > >>>What is your controller vendor and device IDs? Is that a TI chip?
> > >>>
> > >>
> > >>Yes it is a TI chip, vendor ID 104c and product ID 8241.
> > >>
> > >>>Can you check if the patch I sent a month ago fixes it? [1] There's the
> > >>>whole story there. In fact, you will also need something like the patch
> > >>>below. Can you apply only the first one, verify, and, then, the other
> > >>>one as well, and report what worked for you?
> > >>>
> > >>>[1] http://marc.info/?l=linux-usb&m=139483181809062&w=2
> > >>>
> > >>
> > >>I tried the attach patch and it did not help. This is what i
> > >>expected because this is a fix in the shutdown path, which will
> > >>never called when doing a forced kexec.
> > >
> > >Hi, Stefani.
> > >
> > >Did you try with both patches applied? How do you evoke the forced
> > >kexec? Is that a kexec on panic? Does it really need to be forced? With
> > >no clean shutdown, platform and drivers would need to issue resets, like
> > >you mentioned below, to get the system into a clean state.
> > >
> >
> > Yes, i applied both patches. But without success.
> >
> > IMHO i think it is necessary to bring the device i a clean state
> > when the driver use the HW.
> >
> > >>
> > >>I have a running a 3.10.23 kernel. This kernel do a kexec for a
> > >>kernel 3.14. Since the kernel 3.10.23 did not performe a clean
> > >>shutdown, the state of the XHCI Controller is undefined. So when
> > >
> > >And the clean shutdown requires both of my patches, for TI chips, as far
> > >as I know. It looks like the problem is issuing a halt when there are
> > >pending URBs.
> > >
> > >>kernel 3.14 will probe XHCI it will find a XHCI controller which was
> > >>not performed a reset.
> > >>
> > >
> > >The problem is not that a reset hasn't been issued. A PCI function reset
> > >should fix most of the problems with a bad device state, when the reset
> > >works. However, the problem is that it was not cleanly shut down. URBs
> > >should have been canceled and removed from the controller queue, and it
> > >should have halted after that.
> >
> > Again, i think it is the job of the driver to bring the chip in a clean state
> > before using them. A driver should never expect a reset state of a chip.
> >
> > >
> > >>So i think it is necessary to reset the XHCI controller and all
> > >>devices on this bus. This is what i do with a "echo 1
> > >>>/sys/bus/pci/drivers/xhci_hcd/0001:03:00.0/reset" before the kexec.
> > >>
> > >
> > >One way to look at that is making the PCI code issue resets to all buses
> > >before doing any other access. That will make booting more slow, and
> > >there are a lot of other corner cases where this might not be enough.
> > >It's probably more sane to try to get the 3.10.23 kernel to do a clean
> > >shutdown, if possible.
> > >
> >
> > With this driver design the kexec functionality is usesless on PowerPC.
> > X86 looks a little bit better.
> >
> > - Stefani
> >
> >
>
> What is the vendor and device ID you are using on your X86 system? This
> is not a matter of what architecture you are using, it's the XHCI
> controller which does not behave as well as the one you are using on
> X86, which is likely an Intel one.
>

It is an Intel 8086:8c31. But this was only a side note. We need a
generic solution, not a vendor-specific one. Otherwise kexec is useless
on other architectures.

- Stefani

Subject: Re: Missing USB XHCI and EHCI reset for kexec

On Tue, Apr 15, 2014 at 08:42:58PM +0200, Stefani Seibold wrote:
> On Tuesday, 15 Apr 2014, at 15:33 -0300, Thadeu Lima de Souza
> Cascardo wrote:
> > On Tue, Apr 15, 2014 at 05:00:28PM +0200, [email protected] wrote:
> > >
> > > Quoting Thadeu Lima de Souza Cascardo <[email protected]>:
> > >
> > > >On Tue, Apr 15, 2014 at 12:04:17PM +0200, [email protected] wrote:
> > > >>
> > > >>Quoting Thadeu Lima de Souza Cascardo <[email protected]>:
> > > >>
> > > >>>On Mon, Apr 14, 2014 at 05:44:58PM +0200, [email protected] wrote:
> > > >>>>
> > > >>>>Quoting Benjamin Herrenschmidt <[email protected]>:
> > > >>>>
> > > >>>>>I don't know about EHCI specifically but this is a known issue with
> > > >>>>>XHCI, I observe similar issues on other powerpc platforms (servers)
> > > >>>>>and this isn't architecture specific (looks more like actualy xhc
> > > >>>>>implementation specific).
> > > >>>>>
> > > >>>>>Thadeu Cascardo (on CC) has been the one investigating that on our side,
> > > >>>>>he might have more to add including patches.
> > > >>>>>
> > > >>>>
> > > >>>>I have now a kernel 3.14 dmesg log of the problem. After a kexec the
> > > >>>>kexeced 3.14 kernel shows:
> > > >>>>
> > > >>>>[ 1.170029] xhci_hcd 0001:03:00.0: xHCI Host Controller
> > > >>>>[ 1.175306] xhci_hcd 0001:03:00.0: new USB bus registered,
> > > >>>>assigned bus number 1
> > > >>>>[ 1.212561] xhci_hcd 0001:03:00.0: Host not halted after 16000
> > > >>>>microseconds.
> > > >>>>[ 1.219621] xhci_hcd 0001:03:00.0: can't setup: -110
> > > >>>>[ 1.224597] xhci_hcd 0001:03:00.0: USB bus 1 deregistered
> > > >>>>[ 1.230021] xhci_hcd 0001:03:00.0: init 0001:03:00.0 fail, -110
> > > >>>>[ 1.235955] xhci_hcd: probe of 0001:03:00.0 failed with error -110
> > > >>>>
> > > >>>
> > > >>>What is your controller vendor and device IDs? Is that a TI chip?
> > > >>>
> > > >>
> > > >>Yes it is a TI chip, vendor ID 104c and product ID 8241.
> > > >>
> > > >>>Can you check if the patch I sent a month ago fixes it? [1] There's the
> > > >>>whole story there. In fact, you will also need something like the patch
> > > >>>below. Can you apply only the first one, verify, and, then, the other
> > > >>>one as well, and report what worked for you?
> > > >>>
> > > >>>[1] http://marc.info/?l=linux-usb&m=139483181809062&w=2
> > > >>>
> > > >>
> > > >>I tried the attach patch and it did not help. This is what i
> > > >>expected because this is a fix in the shutdown path, which will
> > > >>never called when doing a forced kexec.
> > > >
> > > >Hi, Stefani.
> > > >
> > > >Did you try with both patches applied? How do you evoke the forced
> > > >kexec? Is that a kexec on panic? Does it really need to be forced? With
> > > >no clean shutdown, platform and drivers would need to issue resets, like
> > > >you mentioned below, to get the system into a clean state.
> > > >
> > >
> > > Yes, i applied both patches. But without success.
> > >
> > > IMHO i think it is necessary to bring the device i a clean state
> > > when the driver use the HW.
> > >
> > > >>
> > > >>I have a running a 3.10.23 kernel. This kernel do a kexec for a
> > > >>kernel 3.14. Since the kernel 3.10.23 did not performe a clean
> > > >>shutdown, the state of the XHCI Controller is undefined. So when
> > > >
> > > >And the clean shutdown requires both of my patches, for TI chips, as far
> > > >as I know. It looks like the problem is issuing a halt when there are
> > > >pending URBs.
> > > >
> > > >>kernel 3.14 will probe XHCI it will find a XHCI controller which was
> > > >>not performed a reset.
> > > >>
> > > >
> > > >The problem is not that a reset hasn't been issued. A PCI function reset
> > > >should fix most of the problems with a bad device state, when the reset
> > > >works. However, the problem is that it was not cleanly shut down. URBs
> > > >should have been canceled and removed from the controller queue, and it
> > > >should have halted after that.
> > >
> > > Again, i think it is the job of the driver to bring the chip in a clean state
> > > before using them. A driver should never expect a reset state of a chip.
> > >
> > > >
> > > >>So i think it is necessary to reset the XHCI controller and all
> > > >>devices on this bus. This is what i do with a "echo 1
> > > >>>/sys/bus/pci/drivers/xhci_hcd/0001:03:00.0/reset" before the kexec.
> > > >>
> > > >
> > > >One way to look at that is making the PCI code issue resets to all buses
> > > >before doing any other access. That will make booting more slow, and
> > > >there are a lot of other corner cases where this might not be enough.
> > > >It's probably more sane to try to get the 3.10.23 kernel to do a clean
> > > >shutdown, if possible.
> > > >
> > >
> > > With this driver design the kexec functionality is usesless on PowerPC.
> > > X86 looks a little bit better.
> > >
> > > - Stefani
> > >
> > >
> >
> > What is the vendor and device ID you are using on your X86 system? This
> > is not a matter of what architecture you are using, it's the XHCI
> > controller which does not behave as well as the one you are using on
> > X86, which is likely an Intel one.
> >
>
> It is an Intel 8086:8c31. But this was only a side note. We need a
> generic solution, not a vendor-specific one. Otherwise kexec is useless
> on other architectures.
>
> - Stefani
>
>

It's probably "useless" on X86 with a TI XHCI board. I just don't have
such an environment to test. Can you arrange to test that? If that shows
me wrong, we certainly need to investigate this even further.

Thanks.
Cascardo.

2014-04-15 18:55:19

by Stefani Seibold

Subject: Re: Missing USB XHCI and EHCI reset for kexec

On Tuesday, 15.04.2014, at 15:49 -0300, Thadeu Lima de Souza
Cascardo wrote:
> On Tue, Apr 15, 2014 at 08:42:58PM +0200, Stefani Seibold wrote:
> > On Tuesday, 15.04.2014, at 15:33 -0300, Thadeu Lima de Souza
> > Cascardo wrote:
> > > On Tue, Apr 15, 2014 at 05:00:28PM +0200, [email protected] wrote:
> > > >
> > > > Quoting Thadeu Lima de Souza Cascardo <[email protected]>:
> > > >
> > > > >On Tue, Apr 15, 2014 at 12:04:17PM +0200, [email protected] wrote:
> > > > >>
> > > > >>Quoting Thadeu Lima de Souza Cascardo <[email protected]>:
> > > > >>
> > > > >>>On Mon, Apr 14, 2014 at 05:44:58PM +0200, [email protected] wrote:
> > > > >>>>
> > > > >>>>Quoting Benjamin Herrenschmidt <[email protected]>:
> > > > >>>>
> > > > >>>>>I don't know about EHCI specifically but this is a known issue with
> > > > >>>>>XHCI, I observe similar issues on other powerpc platforms (servers)
> > > > >>>>>and this isn't architecture specific (looks more like actualy xhc
> > > > >>>>>implementation specific).
> > > > >>>>>
> > > > >>>>>Thadeu Cascardo (on CC) has been the one investigating that on our side,
> > > > >>>>>he might have more to add including patches.
> > > > >>>>>
> > > > >>>>
> > > > >>>>I have now a kernel 3.14 dmesg log of the problem. After a kexec the
> > > > >>>>kexeced 3.14 kernel shows:
> > > > >>>>
> > > > >>>>[ 1.170029] xhci_hcd 0001:03:00.0: xHCI Host Controller
> > > > >>>>[ 1.175306] xhci_hcd 0001:03:00.0: new USB bus registered,
> > > > >>>>assigned bus number 1
> > > > >>>>[ 1.212561] xhci_hcd 0001:03:00.0: Host not halted after 16000
> > > > >>>>microseconds.
> > > > >>>>[ 1.219621] xhci_hcd 0001:03:00.0: can't setup: -110
> > > > >>>>[ 1.224597] xhci_hcd 0001:03:00.0: USB bus 1 deregistered
> > > > >>>>[ 1.230021] xhci_hcd 0001:03:00.0: init 0001:03:00.0 fail, -110
> > > > >>>>[ 1.235955] xhci_hcd: probe of 0001:03:00.0 failed with error -110
> > > > >>>>
> > > > >>>
> > > > >>>What is your controller vendor and device IDs? Is that a TI chip?
> > > > >>>
> > > > >>
> > > > >>Yes it is a TI chip, vendor ID 104c and product ID 8241.
> > > > >>
> > > > >>>Can you check if the patch I sent a month ago fixes it? [1] There's the
> > > > >>>whole story there. In fact, you will also need something like the patch
> > > > >>>below. Can you apply only the first one, verify, and, then, the other
> > > > >>>one as well, and report what worked for you?
> > > > >>>
> > > > >>>[1] http://marc.info/?l=linux-usb&m=139483181809062&w=2
> > > > >>>
> > > > >>
> > > > >>I tried the attach patch and it did not help. This is what i
> > > > >>expected because this is a fix in the shutdown path, which will
> > > > >>never called when doing a forced kexec.
> > > > >
> > > > >Hi, Stefani.
> > > > >
> > > > >Did you try with both patches applied? How do you evoke the forced
> > > > >kexec? Is that a kexec on panic? Does it really need to be forced? With
> > > > >no clean shutdown, platform and drivers would need to issue resets, like
> > > > >you mentioned below, to get the system into a clean state.
> > > > >
> > > >
> > > > Yes, i applied both patches. But without success.
> > > >
> > > > IMHO i think it is necessary to bring the device i a clean state
> > > > when the driver use the HW.
> > > >
> > > > >>
> > > > >>I have a running a 3.10.23 kernel. This kernel do a kexec for a
> > > > >>kernel 3.14. Since the kernel 3.10.23 did not performe a clean
> > > > >>shutdown, the state of the XHCI Controller is undefined. So when
> > > > >
> > > > >And the clean shutdown requires both of my patches, for TI chips, as far
> > > > >as I know. It looks like the problem is issuing a halt when there are
> > > > >pending URBs.
> > > > >
> > > > >>kernel 3.14 will probe XHCI it will find a XHCI controller which was
> > > > >>not performed a reset.
> > > > >>
> > > > >
> > > > >The problem is not that a reset hasn't been issued. A PCI function reset
> > > > >should fix most of the problems with a bad device state, when the reset
> > > > >works. However, the problem is that it was not cleanly shut down. URBs
> > > > >should have been canceled and removed from the controller queue, and it
> > > > >should have halted after that.
> > > >
> > > > Again, i think it is the job of the driver to bring the chip in a clean state
> > > > before using them. A driver should never expect a reset state of a chip.
> > > >
> > > > >
> > > > >>So i think it is necessary to reset the XHCI controller and all
> > > > >>devices on this bus. This is what i do with a "echo 1
> > > > >>>/sys/bus/pci/drivers/xhci_hcd/0001:03:00.0/reset" before the kexec.
> > > > >>
> > > > >
> > > > >One way to look at that is making the PCI code issue resets to all buses
> > > > >before doing any other access. That will make booting more slow, and
> > > > >there are a lot of other corner cases where this might not be enough.
> > > > >It's probably more sane to try to get the 3.10.23 kernel to do a clean
> > > > >shutdown, if possible.
> > > > >
> > > >
> > > > With this driver design the kexec functionality is usesless on PowerPC.
> > > > X86 looks a little bit better.
> > > >
> > > > - Stefani
> > > >
> > > >
> > >
> > > What is the vendor and device ID you are using on your X86 system? This
> > > is not a matter of what architecture you are using, it's the XHCI
> > > controller which does not behave as well as the one you are using on
> > > X86, which is likely an Intel one.
> > >
> >
> > It is an Intel 8086:8c31. But this was only a side note. We need a
> > generic solution, not a vendor-specific one. Otherwise kexec is useless
> > on other architectures.
> >
> > - Stefani
> >
> >
>
> It's probably "useless" on X86 with a TI XHCI board. I just don't have
> such an environment to test. Can you arrange to test that? If that shows
> me wrong, we certainly need to investigate this even further.
>

No, I only have a Freescale PowerPC with a TI XHCI. I need a solution for
this environment.

2014-04-15 19:02:14

by Alan Stern

Subject: Re: Missing USB XHCI and EHCI reset for kexec

On Tue, 15 Apr 2014, Stefani Seibold wrote:

> I did a unbind and bind of the ehci-pci and ohci-pci, after this i got
> the following dmesg log:
>
> ehci_hcd: USB 2.0 'Enhanced' Host Controller (EHCI) Driver
> ehci-pci: EHCI PCI platform driver
> ehci-pci 0000:00:17.2: EHCI Host Controller
> ehci-pci 0000:00:17.2: new USB bus registered, assigned bus number 1
> ehci-pci 0000:00:17.2: irq 22, io mem 0xc0006800
> ehci-pci 0000:00:17.2: USB 2.0 started, EHCI 1.00
> hub 1-0:1.0: USB hub found
> hub 1-0:1.0: 5 ports detected
> ohci_hcd: USB 1.1 'Open' Host Controller (OHCI) Driver
> ohci_hcd 0000:00:17.0: OHCI Host Controller
> ohci_hcd 0000:00:17.0: new USB bus registered, assigned bus number 2
> ohci_hcd 0000:00:17.0: irq 20, io mem 0xc0004000
> hub 2-0:1.0: USB hub found
> hub 2-0:1.0: 3 ports detected
> ohci_hcd 0000:00:17.1: OHCI Host Controller
> ohci_hcd 0000:00:17.1: new USB bus registered, assigned bus number 3
> ohci_hcd 0000:00:17.1: irq 21, io mem 0xc0005000
> hub 3-0:1.0: USB hub found
> hub 3-0:1.0: 2 ports detected
...
> usbcore: registered new interface driver USB-SATA-storage
> USB SATA Mass Storage support registered.
> usb 1-4: new high-speed USB device number 4 using ehci-pci
> xxxx: ports detected

What driver is this? I've never heard of USB-SATA-storage.

> ohci_hcd 0000:00:17.0: remove, state 1
> usb usb2: USB disconnect, device number 1
> usb 2-2: USB disconnect, device number 2
> usb 2-2.1: USB disconnect, device number 3
> usb 2-2.2: USB disconnect, device number 4
> ohci_hcd 0000:00:17.0: USB bus 2 deregistered
> ohci_hcd 0000:00:17.1: remove, state 1
> usb usb3: USB disconnect, device number 1
> ohci_hcd 0000:00:17.1: USB bus 3 deregistered
> ehci-pci 0000:00:17.2: remove, state 1
> usb usb1: USB disconnect, device number 1
> usb 1-2: USB disconnect, device number 2
> usb 1-4: USB disconnect, device number 4
> ehci-pci 0000:00:17.2: USB bus 1 deregistered
> ohci_hcd 0000:00:17.0: OHCI Host Controller
> ohci_hcd 0000:00:17.0: new USB bus registered, assigned bus number 1
> ohci_hcd 0000:00:17.0: irq 20, io mem 0xc0004000
> hub 1-0:1.0: USB hub found
> hub 1-0:1.0: 3 ports detected
> ohci_hcd 0000:00:17.1: OHCI Host Controller
> ohci_hcd 0000:00:17.1: new USB bus registered, assigned bus number 2
> ohci_hcd 0000:00:17.1: irq 21, io mem 0xc0005000
> hub 2-0:1.0: USB hub found
> hub 2-0:1.0: 2 ports detected
> ehci-pci 0000:00:17.2: EHCI Host Controller
> ehci-pci 0000:00:17.2: new USB bus registered, assigned bus number 3
> ehci-pci 0000:00:17.2: irq 22, io mem 0xc0006800
> ehci-pci 0000:00:17.2: USB 2.0 started, EHCI 1.00
> hub 3-0:1.0: USB hub found
> hub 3-0:1.0: 5 ports detected
> hub 1-0:1.0: USB hub found
> hub 1-0:1.0: 3 ports detected
> hub 2-0:1.0: USB hub found
> hub 2-0:1.0: 2 ports detected
> usb 3-2: new high-speed USB device number 2 using ehci-pci
> hub 3-2:1.0: USB hub found
> hub 3-2:1.0: 4 ports detected
> usb 3-4: new high-speed USB device number 4 using ehci-pci
> usb 3-4: device descriptor read/64, error -110
> usb 3-4: device descriptor read/64, error -110
> usb 3-4: new high-speed USB device number 5 using ehci-pci
> usb 3-4: device descriptor read/64, error -110
> usb 3-4: device descriptor read/64, error -110
> usb 3-4: new high-speed USB device number 6 using ehci-pci
> usb 3-4: device descriptor read/8, error -110
> usb 3-4: device descriptor read/8, error -110
> usb 3-4: new high-speed USB device number 7 using ehci-pci
> usb 3-4: device descriptor read/8, error -110
> usb 3-4: device descriptor read/8, error -110
> hub 3-0:1.0: unable to enumerate USB device on port 4
> usb 1-2: new full-speed USB device number 2 using ohci_hcd
> hub 1-2:1.0: USB hub found
> hub 1-2:1.0: 2 ports detected
> usb 2-2: new full-speed USB device number 2 using ohci_hcd
> usb 2-2: device descriptor read/64, error -110
> usb 2-2: device descriptor read/64, error -110
> usb 2-2: new full-speed USB device number 3 using ohci_hcd
> usb 2-2: device descriptor read/64, error -110
> usb 2-2: device descriptor read/64, error -110
> usb 2-2: new full-speed USB device number 4 using ohci_hcd
> usb 2-2: device descriptor read/8, error -110
> usb 2-2: device descriptor read/8, error -110
> usb 2-2: new full-speed USB device number 5 using ohci_hcd
> usb 2-2: device descriptor read/8, error -110
> usb 2-2: device descriptor read/8, error -110
> hub 2-0:1.0: unable to enumerate USB device on port 2
> usb 1-2.1: new full-speed USB device number 3 using ohci_hcd
> usb 1-2.2: new low-speed USB device number 4 using ohci_hcd
> input: Rohde&Schwarz FrontPanel USB Keyboard as /devices/pci0000:00/0000:00:17.0/usb1/1-2/1-2.2/1-2.2:1.0/input/input2
> input: Rohde&Schwarz FrontPanel USB Keyboard as /devices/pci0000:00/0000:00:17.0/usb1/1-2/1-2.2/1-2.2:1.1/input/input3

So two devices failed to initialize after the drivers were rebound: 3-4
and 2-2. We know that 3-4 was the mass-storage device, but I can't
tell from this log what 2-2 was.

> As you can see there is a difference between the USB port detected
> during the boot phase and after the unbind/bind:
>
> hub 1-0:1.0: 5 ports detected
> hub 2-0:1.0: 3 ports detected
> hub 3-0:1.0: 2 ports detected
> hub 1-2:1.0: 4 ports detected
> hub 2-2:1.0: 2 ports detected
>
> echo 0000:00:17.0 >/sys/bus/pci/drivers/ohci-pci/unbind
> echo 0000:00:17.1 >/sys/bus/pci/drivers/ohci-pci/unbind
> echo 0000:00:17.2 >/sys/bus/pci/drivers/ehci-pci/unbind
>
> echo 0000:00:17.0 >/sys/bus/pci/drivers/ohci-pci/bind
> echo 0000:00:17.1 >/sys/bus/pci/drivers/ohci-pci/bind
> echo 0000:00:17.2 >/sys/bus/pci/drivers/ehci-pci/bind
>
> hub 1-0:1.0: 3 ports detected
> hub 2-0:1.0: 2 ports detected
> hub 3-0:1.0: 5 ports detected
> hub 1-0:1.0: 3 ports detected
> hub 2-0:1.0: 2 ports detected
> hub 3-2:1.0: 4 ports detected
> hub 1-2:1.0: 2 ports detected
>
> This was kernel 3.10, but i get similar results for 3.14

Of course. This is because you rebound the drivers in a different
order from the binding during boot. During boot, the EHCI controller
(17.2) was assigned to bus 1, and the two OHCI controllers (17.0 and
17.1) were assigned to buses 2 and 3.

But when you rebound the drivers, you did 17.0 and 17.1 first, so they
were assigned to buses 1 and 2. Then 17.2 was assigned to bus 3. If
you had bound 17.2 and then 17.0 and 17.1, you would have gotten the
same results as during boot.
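
The bind-order point above can be sketched as a small shell function.
This is only an illustration, assuming the PCI addresses and driver
directory names shown in the log in this thread. SYSFS_PCI_DRIVERS and
the two function names are hypothetical; the override exists only so the
sequence can be tried against a scratch directory before touching the
real /sys.

```shell
#!/bin/sh
# Sketch only: unbind the three USB functions, then rebind them in boot
# order (EHCI 17.2 first, then OHCI 17.0 and 17.1) so that bus numbers are
# handed out in the same sequence as during boot.
# SYSFS_PCI_DRIVERS is a hypothetical override, not a kernel interface.

# write_attr DRIVER ATTR ADDRESS: log the step, then write ADDRESS to the
# driver's bind/unbind attribute.
write_attr() {
    printf '%s %s %s\n' "$2" "$1" "$3"
    echo "$3" > "$SYSFS_PCI_DRIVERS/$1/$2"
}

rebind_in_boot_order() {
    # Default to the real sysfs location unless the caller overrides it.
    : "${SYSFS_PCI_DRIVERS:=/sys/bus/pci/drivers}"

    write_attr ohci-pci unbind 0000:00:17.0
    write_attr ohci-pci unbind 0000:00:17.1
    write_attr ehci-pci unbind 0000:00:17.2

    # EHCI first, so it is registered before either OHCI function.
    write_attr ehci-pci bind 0000:00:17.2
    write_attr ohci-pci bind 0000:00:17.0
    write_attr ohci-pci bind 0000:00:17.1
}
```

Run as root on the board, rebind_in_boot_order should, if the
explanation above holds, register the EHCI controller first again. It
only changes the numbering; it is not expected to fix the two devices
that fail to enumerate.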

I can't tell why the two devices failed to initialize. Can you plug
either of them into an x86-based computer and see if it shows the same
behavior there?

Alan Stern

2014-04-15 19:05:23

by Alan Stern

Subject: Re: Missing USB XHCI and EHCI reset for kexec

On Tue, 15 Apr 2014, Stefani Seibold wrote:

> > One other thing you can try is to increase the reset timeout in
> > drivers/usb/host/ehci-hub.c. This is under the USB_PORT_FEAT_RESET
> > case in ehci_hub_control(), around line 1225:
> >
> > /*
> > * caller must wait, then call GetPortStatus
> > * usb 2.0 spec says 50 ms resets on root
> > */
> > ehci->reset_done [wIndex] = jiffies
> > + msecs_to_jiffies (50);
> >
> > Increasing the 50 to 100 or more might help.
> >
> > Alan Stern
> >
>
> I tried this, when i increase the value to 1000, the reset and
> enumeration process will be faster after a kexec: 28 Seconds vs. 162
> Seconds.

Even 28 seconds is much longer than it should be. And a 1000-ms long
reset signal is a lot longer than any device should need.

Anyway, since you saw the same problem after unbind and rebind, you
don't have to perform a kexec for testing.

Alan Stern

2014-04-15 19:07:18

by Stefani Seibold

Subject: Re: Missing USB XHCI and EHCI reset for kexec

On Tuesday, 15.04.2014, at 15:02 -0400, Alan Stern wrote:
> On Tue, 15 Apr 2014, Stefani Seibold wrote:
>
> > I did a unbind and bind of the ehci-pci and ohci-pci, after this i got
> > the following dmesg log:
> >
> > ehci_hcd: USB 2.0 'Enhanced' Host Controller (EHCI) Driver
> > ehci-pci: EHCI PCI platform driver
> > ehci-pci 0000:00:17.2: EHCI Host Controller
> > ehci-pci 0000:00:17.2: new USB bus registered, assigned bus number 1
> > ehci-pci 0000:00:17.2: irq 22, io mem 0xc0006800
> > ehci-pci 0000:00:17.2: USB 2.0 started, EHCI 1.00
> > hub 1-0:1.0: USB hub found
> > hub 1-0:1.0: 5 ports detected
> > ohci_hcd: USB 1.1 'Open' Host Controller (OHCI) Driver
> > ohci_hcd 0000:00:17.0: OHCI Host Controller
> > ohci_hcd 0000:00:17.0: new USB bus registered, assigned bus number 2
> > ohci_hcd 0000:00:17.0: irq 20, io mem 0xc0004000
> > hub 2-0:1.0: USB hub found
> > hub 2-0:1.0: 3 ports detected
> > ohci_hcd 0000:00:17.1: OHCI Host Controller
> > ohci_hcd 0000:00:17.1: new USB bus registered, assigned bus number 3
> > ohci_hcd 0000:00:17.1: irq 21, io mem 0xc0005000
> > hub 3-0:1.0: USB hub found
> > hub 3-0:1.0: 2 ports detected
> ...
> > usbcore: registered new interface driver USB-SATA-storage
> > USB SATA Mass Storage support registered.
> > usb 1-4: new high-speed USB device number 4 using ehci-pci
> > xxxx: ports detected
>
> What driver is this? I've never heard of USB-SATA-storage.
>

This is a special embedded USB SATA driver written by me. It is mostly a
fork of the usb-storage driver, but it handles only one vendor and product
ID and switches on a port bit. In turn, this vendor and product ID is
blacklisted in the regular usb-storage driver.

> > ohci_hcd 0000:00:17.0: remove, state 1
> > usb usb2: USB disconnect, device number 1
> > usb 2-2: USB disconnect, device number 2
> > usb 2-2.1: USB disconnect, device number 3
> > usb 2-2.2: USB disconnect, device number 4
> > ohci_hcd 0000:00:17.0: USB bus 2 deregistered
> > ohci_hcd 0000:00:17.1: remove, state 1
> > usb usb3: USB disconnect, device number 1
> > ohci_hcd 0000:00:17.1: USB bus 3 deregistered
> > ehci-pci 0000:00:17.2: remove, state 1
> > usb usb1: USB disconnect, device number 1
> > usb 1-2: USB disconnect, device number 2
> > usb 1-4: USB disconnect, device number 4
> > ehci-pci 0000:00:17.2: USB bus 1 deregistered
> > ohci_hcd 0000:00:17.0: OHCI Host Controller
> > ohci_hcd 0000:00:17.0: new USB bus registered, assigned bus number 1
> > ohci_hcd 0000:00:17.0: irq 20, io mem 0xc0004000
> > hub 1-0:1.0: USB hub found
> > hub 1-0:1.0: 3 ports detected
> > ohci_hcd 0000:00:17.1: OHCI Host Controller
> > ohci_hcd 0000:00:17.1: new USB bus registered, assigned bus number 2
> > ohci_hcd 0000:00:17.1: irq 21, io mem 0xc0005000
> > hub 2-0:1.0: USB hub found
> > hub 2-0:1.0: 2 ports detected
> > ehci-pci 0000:00:17.2: EHCI Host Controller
> > ehci-pci 0000:00:17.2: new USB bus registered, assigned bus number 3
> > ehci-pci 0000:00:17.2: irq 22, io mem 0xc0006800
> > ehci-pci 0000:00:17.2: USB 2.0 started, EHCI 1.00
> > hub 3-0:1.0: USB hub found
> > hub 3-0:1.0: 5 ports detected
> > hub 1-0:1.0: USB hub found
> > hub 1-0:1.0: 3 ports detected
> > hub 2-0:1.0: USB hub found
> > hub 2-0:1.0: 2 ports detected
> > usb 3-2: new high-speed USB device number 2 using ehci-pci
> > hub 3-2:1.0: USB hub found
> > hub 3-2:1.0: 4 ports detected
> > usb 3-4: new high-speed USB device number 4 using ehci-pci
> > usb 3-4: device descriptor read/64, error -110
> > usb 3-4: device descriptor read/64, error -110
> > usb 3-4: new high-speed USB device number 5 using ehci-pci
> > usb 3-4: device descriptor read/64, error -110
> > usb 3-4: device descriptor read/64, error -110
> > usb 3-4: new high-speed USB device number 6 using ehci-pci
> > usb 3-4: device descriptor read/8, error -110
> > usb 3-4: device descriptor read/8, error -110
> > usb 3-4: new high-speed USB device number 7 using ehci-pci
> > usb 3-4: device descriptor read/8, error -110
> > usb 3-4: device descriptor read/8, error -110
> > hub 3-0:1.0: unable to enumerate USB device on port 4
> > usb 1-2: new full-speed USB device number 2 using ohci_hcd
> > hub 1-2:1.0: USB hub found
> > hub 1-2:1.0: 2 ports detected
> > usb 2-2: new full-speed USB device number 2 using ohci_hcd
> > usb 2-2: device descriptor read/64, error -110
> > usb 2-2: device descriptor read/64, error -110
> > usb 2-2: new full-speed USB device number 3 using ohci_hcd
> > usb 2-2: device descriptor read/64, error -110
> > usb 2-2: device descriptor read/64, error -110
> > usb 2-2: new full-speed USB device number 4 using ohci_hcd
> > usb 2-2: device descriptor read/8, error -110
> > usb 2-2: device descriptor read/8, error -110
> > usb 2-2: new full-speed USB device number 5 using ohci_hcd
> > usb 2-2: device descriptor read/8, error -110
> > usb 2-2: device descriptor read/8, error -110
> > hub 2-0:1.0: unable to enumerate USB device on port 2
> > usb 1-2.1: new full-speed USB device number 3 using ohci_hcd
> > usb 1-2.2: new low-speed USB device number 4 using ohci_hcd
> > input: Rohde&Schwarz FrontPanel USB Keyboard as /devices/pci0000:00/0000:00:17.0/usb1/1-2/1-2.2/1-2.2:1.0/input/input2
> > input: Rohde&Schwarz FrontPanel USB Keyboard as /devices/pci0000:00/0000:00:17.0/usb1/1-2/1-2.2/1-2.2:1.1/input/input3
>
> So two devices failed to initialize after the drivers were rebound: 3-4
> and 2-2. We know that 3-4 was the mass-storage device, but I can't
> tell from this log what 2-2 was.
>
> > As you can see there is a difference between the USB port detected
> > during the boot phase and after the unbind/bind:
> >
> > hub 1-0:1.0: 5 ports detected
> > hub 2-0:1.0: 3 ports detected
> > hub 3-0:1.0: 2 ports detected
> > hub 1-2:1.0: 4 ports detected
> > hub 2-2:1.0: 2 ports detected
> >
> > echo 0000:00:17.0 >/sys/bus/pci/drivers/ohci-pci/unbind
> > echo 0000:00:17.1 >/sys/bus/pci/drivers/ohci-pci/unbind
> > echo 0000:00:17.2 >/sys/bus/pci/drivers/ehci-pci/unbind
> >
> > echo 0000:00:17.0 >/sys/bus/pci/drivers/ohci-pci/bind
> > echo 0000:00:17.1 >/sys/bus/pci/drivers/ohci-pci/bind
> > echo 0000:00:17.2 >/sys/bus/pci/drivers/ehci-pci/bind
> >
> > hub 1-0:1.0: 3 ports detected
> > hub 2-0:1.0: 2 ports detected
> > hub 3-0:1.0: 5 ports detected
> > hub 1-0:1.0: 3 ports detected
> > hub 2-0:1.0: 2 ports detected
> > hub 3-2:1.0: 4 ports detected
> > hub 1-2:1.0: 2 ports detected
> >
> > This was kernel 3.10, but i get similar results for 3.14
>
> Of course. This is because you rebound the drivers in a different
> order from the binding during boot. During boot, the EHCI controller
> (17.2) was assigned to bus 1, and the two OHCI controllers (17.0 and
> 17.1) were assigned to buses 2 and 3.
>
> But when you rebound the drivers, you did 17.0 and 17.1 first, so they
> were assigned to buses 1 and 2. Then 17.2 was assigned to bus 3. If
> you had bound 17.2 and then 17.0 and 17.1, you would have gotten the
> same results as during boot.
>
> I can't tell why the two devices failed to initialize. Can you plug
> either of them into an x86-based computer and see if it shows the same
> behavior there?
>
> Alan Stern
>

2014-04-15 19:09:05

by Stefani Seibold

Subject: Re: Missing USB XHCI and EHCI reset for kexec

On Tuesday, 15.04.2014, at 15:05 -0400, Alan Stern wrote:
> On Tue, 15 Apr 2014, Stefani Seibold wrote:
>
> > > One other thing you can try is to increase the reset timeout in
> > > drivers/usb/host/ehci-hub.c. This is under the USB_PORT_FEAT_RESET
> > > case in ehci_hub_control(), around line 1225:
> > >
> > > /*
> > > * caller must wait, then call GetPortStatus
> > > * usb 2.0 spec says 50 ms resets on root
> > > */
> > > ehci->reset_done [wIndex] = jiffies
> > > + msecs_to_jiffies (50);
> > >
> > > Increasing the 50 to 100 or more might help.
> > >
> > > Alan Stern
> > >
> >
> > I tried this, when i increase the value to 1000, the reset and
> > enumeration process will be faster after a kexec: 28 Seconds vs. 162
> > Seconds.
>
> Even 28 seconds is much longer than it should be. And a 1000-ms long
> reset signal is a lot longer than any device should need.
>
> Anyway, since you saw the same problem after unbind and rebind, you
> don't have to perform a kexec for testing.
>

Right, but I would prefer a solution for this. Since it works perfectly
in kernel 3.4, I don't think it is a hardware issue.

- Stefani

2014-04-15 19:14:49

by Alan Stern

Subject: Re: Missing USB XHCI and EHCI reset for kexec

On Tue, 15 Apr 2014, Stefani Seibold wrote:

> > > usbcore: registered new interface driver USB-SATA-storage
> > > USB SATA Mass Storage support registered.
> > > usb 1-4: new high-speed USB device number 4 using ehci-pci
> > > xxxx: ports detected
> >
> > What driver is this? I've never heard of USB-SATA-storage.
> >
>
> This is a special embedded USB SATA driver written by me. It is mostly a
> fork of the usb-storage driver, but it handles only one vendor and product
> ID and switches on a port bit. In turn, this vendor and product ID is
> blacklisted in the regular usb-storage driver.

Is it possible that your driver leaves the device in a strange state
where it won't initialize properly?

You can test this by blacklisting both mass-storage drivers. If the
device then initializes properly after unbind and rebind, you'll know
the answer.
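
One way to set up that test, assuming both drivers are built as loadable
modules (a blacklist entry has no effect on built-in drivers), is a
modprobe.d fragment. The name usb_sata_storage is a made-up module name
for the custom driver; the thread only shows its registered driver name,
USB-SATA-storage. The function name and the file path in the example
line are likewise only placeholders.

```shell
#!/bin/sh
# Sketch only: write a modprobe.d fragment that blacklists both
# mass-storage drivers, so neither is auto-loaded when the device appears.
# CUSTOM_MODULE is a hypothetical module name for the custom driver.
CUSTOM_MODULE="${CUSTOM_MODULE:-usb_sata_storage}"

# write_blacklist FILE: create FILE with one "blacklist" line per module.
write_blacklist() {
    # printf reuses the format string for each remaining argument.
    printf 'blacklist %s\n' usb_storage "$CUSTOM_MODULE" > "$1"
}

# Example (as root): write_blacklist /etc/modprobe.d/usb-storage-test.conf
```

A blacklist entry only stops automatic loading, so a module that is
already loaded would still have to be removed (or the board rebooted)
before repeating the unbind and rebind.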

Alan Stern

2014-04-15 19:15:19

by Stefani Seibold

Subject: Re: Missing USB XHCI and EHCI reset for kexec

On Tuesday, 15.04.2014, at 15:02 -0400, Alan Stern wrote:
> On Tue, 15 Apr 2014, Stefani Seibold wrote:
>
> > I did a unbind and bind of the ehci-pci and ohci-pci, after this i got
> > the following dmesg log:
> >
> > ehci_hcd: USB 2.0 'Enhanced' Host Controller (EHCI) Driver
> > ehci-pci: EHCI PCI platform driver
> > ehci-pci 0000:00:17.2: EHCI Host Controller
> > ehci-pci 0000:00:17.2: new USB bus registered, assigned bus number 1
> > ehci-pci 0000:00:17.2: irq 22, io mem 0xc0006800
> > ehci-pci 0000:00:17.2: USB 2.0 started, EHCI 1.00
> > hub 1-0:1.0: USB hub found
> > hub 1-0:1.0: 5 ports detected
> > ohci_hcd: USB 1.1 'Open' Host Controller (OHCI) Driver
> > ohci_hcd 0000:00:17.0: OHCI Host Controller
> > ohci_hcd 0000:00:17.0: new USB bus registered, assigned bus number 2
> > ohci_hcd 0000:00:17.0: irq 20, io mem 0xc0004000
> > hub 2-0:1.0: USB hub found
> > hub 2-0:1.0: 3 ports detected
> > ohci_hcd 0000:00:17.1: OHCI Host Controller
> > ohci_hcd 0000:00:17.1: new USB bus registered, assigned bus number 3
> > ohci_hcd 0000:00:17.1: irq 21, io mem 0xc0005000
> > hub 3-0:1.0: USB hub found
> > hub 3-0:1.0: 2 ports detected
> ...
> > usbcore: registered new interface driver USB-SATA-storage
> > USB SATA Mass Storage support registered.
> > usb 1-4: new high-speed USB device number 4 using ehci-pci
> > xxxx: ports detected
>
> What driver is this? I've never heard of USB-SATA-storage.
>
> > ohci_hcd 0000:00:17.0: remove, state 1
> > usb usb2: USB disconnect, device number 1
> > usb 2-2: USB disconnect, device number 2
> > usb 2-2.1: USB disconnect, device number 3
> > usb 2-2.2: USB disconnect, device number 4
> > ohci_hcd 0000:00:17.0: USB bus 2 deregistered
> > ohci_hcd 0000:00:17.1: remove, state 1
> > usb usb3: USB disconnect, device number 1
> > ohci_hcd 0000:00:17.1: USB bus 3 deregistered
> > ehci-pci 0000:00:17.2: remove, state 1
> > usb usb1: USB disconnect, device number 1
> > usb 1-2: USB disconnect, device number 2
> > usb 1-4: USB disconnect, device number 4
> > ehci-pci 0000:00:17.2: USB bus 1 deregistered
> > ohci_hcd 0000:00:17.0: OHCI Host Controller
> > ohci_hcd 0000:00:17.0: new USB bus registered, assigned bus number 1
> > ohci_hcd 0000:00:17.0: irq 20, io mem 0xc0004000
> > hub 1-0:1.0: USB hub found
> > hub 1-0:1.0: 3 ports detected
> > ohci_hcd 0000:00:17.1: OHCI Host Controller
> > ohci_hcd 0000:00:17.1: new USB bus registered, assigned bus number 2
> > ohci_hcd 0000:00:17.1: irq 21, io mem 0xc0005000
> > hub 2-0:1.0: USB hub found
> > hub 2-0:1.0: 2 ports detected
> > ehci-pci 0000:00:17.2: EHCI Host Controller
> > ehci-pci 0000:00:17.2: new USB bus registered, assigned bus number 3
> > ehci-pci 0000:00:17.2: irq 22, io mem 0xc0006800
> > ehci-pci 0000:00:17.2: USB 2.0 started, EHCI 1.00
> > hub 3-0:1.0: USB hub found
> > hub 3-0:1.0: 5 ports detected
> > hub 1-0:1.0: USB hub found
> > hub 1-0:1.0: 3 ports detected
> > hub 2-0:1.0: USB hub found
> > hub 2-0:1.0: 2 ports detected
> > usb 3-2: new high-speed USB device number 2 using ehci-pci
> > hub 3-2:1.0: USB hub found
> > hub 3-2:1.0: 4 ports detected
> > usb 3-4: new high-speed USB device number 4 using ehci-pci
> > usb 3-4: device descriptor read/64, error -110
> > usb 3-4: device descriptor read/64, error -110
> > usb 3-4: new high-speed USB device number 5 using ehci-pci
> > usb 3-4: device descriptor read/64, error -110
> > usb 3-4: device descriptor read/64, error -110
> > usb 3-4: new high-speed USB device number 6 using ehci-pci
> > usb 3-4: device descriptor read/8, error -110
> > usb 3-4: device descriptor read/8, error -110
> > usb 3-4: new high-speed USB device number 7 using ehci-pci
> > usb 3-4: device descriptor read/8, error -110
> > usb 3-4: device descriptor read/8, error -110
> > hub 3-0:1.0: unable to enumerate USB device on port 4
> > usb 1-2: new full-speed USB device number 2 using ohci_hcd
> > hub 1-2:1.0: USB hub found
> > hub 1-2:1.0: 2 ports detected
> > usb 2-2: new full-speed USB device number 2 using ohci_hcd
> > usb 2-2: device descriptor read/64, error -110
> > usb 2-2: device descriptor read/64, error -110
> > usb 2-2: new full-speed USB device number 3 using ohci_hcd
> > usb 2-2: device descriptor read/64, error -110
> > usb 2-2: device descriptor read/64, error -110
> > usb 2-2: new full-speed USB device number 4 using ohci_hcd
> > usb 2-2: device descriptor read/8, error -110
> > usb 2-2: device descriptor read/8, error -110
> > usb 2-2: new full-speed USB device number 5 using ohci_hcd
> > usb 2-2: device descriptor read/8, error -110
> > usb 2-2: device descriptor read/8, error -110
> > hub 2-0:1.0: unable to enumerate USB device on port 2
> > usb 1-2.1: new full-speed USB device number 3 using ohci_hcd
> > usb 1-2.2: new low-speed USB device number 4 using ohci_hcd
> > input: Rohde&Schwarz FrontPanel USB Keyboard as /devices/pci0000:00/0000:00:17.0/usb1/1-2/1-2.2/1-2.2:1.0/input/input2
> > input: Rohde&Schwarz FrontPanel USB Keyboard as /devices/pci0000:00/0000:00:17.0/usb1/1-2/1-2.2/1-2.2:1.1/input/input3
>
> So two devices failed to initialize after the drivers were rebound: 3-4
> and 2-2. We know that 3-4 was the mass-storage device, but I can't
> tell from this log what 2-2 was.
>

2-2 is a custom HID device which handles the front panel and the rotary
knob.

> > As you can see there is a difference between the USB port detected
> > during the boot phase and after the unbind/bind:
> >
> > hub 1-0:1.0: 5 ports detected
> > hub 2-0:1.0: 3 ports detected
> > hub 3-0:1.0: 2 ports detected
> > hub 1-2:1.0: 4 ports detected
> > hub 2-2:1.0: 2 ports detected
> >
> > echo 0000:00:17.0 >/sys/bus/pci/drivers/ohci-pci/unbind
> > echo 0000:00:17.1 >/sys/bus/pci/drivers/ohci-pci/unbind
> > echo 0000:00:17.2 >/sys/bus/pci/drivers/ehci-pci/unbind
> >
> > echo 0000:00:17.0 >/sys/bus/pci/drivers/ohci-pci/bind
> > echo 0000:00:17.1 >/sys/bus/pci/drivers/ohci-pci/bind
> > echo 0000:00:17.2 >/sys/bus/pci/drivers/ehci-pci/bind
> >
> > hub 1-0:1.0: 3 ports detected
> > hub 2-0:1.0: 2 ports detected
> > hub 3-0:1.0: 5 ports detected
> > hub 1-0:1.0: 3 ports detected
> > hub 2-0:1.0: 2 ports detected
> > hub 3-2:1.0: 4 ports detected
> > hub 1-2:1.0: 2 ports detected
> >
> > This was kernel 3.10, but I get similar results for 3.14
>
> Of course. This is because you rebound the drivers in a different
> order from the binding during boot. During boot, the EHCI controller
> (17.2) was assigned to bus 1, and the two OHCI controllers (17.0 and
> 17.1) were assigned to buses 2 and 3.
>
> But when you rebound the drivers, you did 17.0 and 17.1 first, so they
> were assigned to buses 1 and 2. Then 17.2 was assigned to bus 3. If
> you had bound 17.2 and then 17.0 and 17.1, you would have gotten the
> same results as during boot.
>

I will try this tomorrow in the office.
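
A minimal sketch of that retest, assuming the PCI addresses and driver names from the commands quoted above. The DRY_RUN switch and the sysfs_write helper are hypothetical names introduced only for this illustration; by default the script just prints what it would write.

```shell
#!/bin/sh
# Rebind the three controllers in the order used at boot:
# EHCI (17.2) first, then the two OHCI controllers (17.0, 17.1).
# DRY_RUN is a hypothetical switch: 1 (default) only prints the commands.
DRY_RUN=${DRY_RUN:-1}
DRIVERS=/sys/bus/pci/drivers

sysfs_write() {
    # $1 = value to write, $2 = sysfs attribute
    if [ "$DRY_RUN" = 1 ]; then
        echo "echo $1 > $2"
    else
        echo "$1" > "$2"
    fi
}

rebind_in_boot_order() {
    # Release all three controllers so bus numbers 1-3 are free again.
    sysfs_write 0000:00:17.0 "$DRIVERS/ohci-pci/unbind"
    sysfs_write 0000:00:17.1 "$DRIVERS/ohci-pci/unbind"
    sysfs_write 0000:00:17.2 "$DRIVERS/ehci-pci/unbind"
    # Bind EHCI first so it is assigned bus 1, as during boot.
    sysfs_write 0000:00:17.2 "$DRIVERS/ehci-pci/bind"
    sysfs_write 0000:00:17.0 "$DRIVERS/ohci-pci/bind"
    sysfs_write 0000:00:17.1 "$DRIVERS/ohci-pci/bind"
}

rebind_in_boot_order
```

Running it with DRY_RUN=0 as root performs the writes; the "ports detected" lines in dmesg can then be compared with the boot log.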

Both of these devices are connected through a hub. Another PPC board with
nearly the same hardware does not have this hub, and it can handle the USB
bus without any problem.

So I think the hub is the reason.


> I can't tell why the two devices failed to initialize. Can you plug
> either of them into an x86-based computer and see if it shows the same
> behavior there?
>

Sorry, the HW is soldered on the board.

- Stefani


2014-04-15 19:18:00

by Stefani Seibold

Subject: Re: Missing USB XHCI and EHCI reset for kexec

On Tuesday, 15.04.2014, at 15:14 -0400, Alan Stern wrote:
> On Tue, 15 Apr 2014, Stefani Seibold wrote:
>
> > > > usbcore: registered new interface driver USB-SATA-storage
> > > > USB SATA Mass Storage support registered.
> > > > usb 1-4: new high-speed USB device number 4 using ehci-pci
> > > > xxxx: ports detected
> > >
> > > What driver is this? I've never heard of USB-SATA-storage.
> > >
> >
> > This is a special embedded USB SATA driver written by me. It is mostly a
> > fork of the usb-storage driver, but it handles only one vendor and product
> > ID and switches on a port bit. In turn, this vendor and product ID is
> > blacklisted in the regular usb-storage driver.
>
> Is it possible that your driver leaves the device in a strange state
> where it won't initialize properly?
>

I don't think so, because the driver was not used in this test due to a
missing hard disk. But I will do a test tomorrow in the office.

- Stefani


2014-04-15 19:34:47

by Alan Stern

Subject: Re: Missing USB XHCI and EHCI reset for kexec

On Tue, 15 Apr 2014, Stefani Seibold wrote:

> > So two devices failed to initialize after the drivers were rebound: 3-4
> > and 2-2. We know that 3-4 was the mass-storage device, but I can't
> > tell from this log what 2-2 was.
> >
>
> 2-2 is a custom HID device which handles the front panel and the rotary
> knob.

A custom device? Maybe it's not behaving the way it should.

> Both of these devices are connected through a hub.

No. You can tell from the device paths. 3-4 is the device plugged
directly into port 4 of the host controller for bus 3. Similarly, 2-2
is the device plugged directly into port 2 of the host controller for
bus 2.

A good example of a device connected through a hub is the Rohde&Schwarz
FrontPanel USB Keyboard, 1-2.2. This means it is plugged into port 2
of the hub that is plugged into port 2 of the host controller for bus
1.

Also, the fact that these two devices are on different buses means that
they can't be connected to the same hub. If they were, they would both
be on the same bus as that hub.
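
The naming rule described above can be sketched as a small shell function. This is only an illustration of the BUS-PORT[.PORT...] convention seen in the log; the function name describe_usb_path and the wording of its output are made up for this example.

```shell
#!/bin/sh
# Decode a USB device name of the form BUS-PORT[.PORT...]
# as it appears in the kernel log (for example 3-4 or 1-2.2).
describe_usb_path() {
    name=$1
    case $name in
        *-*) ;;
        *) echo "not a USB device name: $name" >&2; return 1 ;;
    esac
    bus=${name%%-*}          # text before the first '-': bus number
    ports=${name#*-}         # text after the first '-': chain of ports
    root_port=${ports%%.*}   # first port: on the host controller's root hub
    if [ "$ports" = "$root_port" ]; then
        echo "bus $bus, root port $root_port, directly attached"
    else
        # Everything after the first '.' names ports on external hubs.
        echo "bus $bus, root port $root_port, via hub port(s) ${ports#*.}"
    fi
}

describe_usb_path 3-4     # bus 3, root port 4, directly attached
describe_usb_path 2-2     # bus 2, root port 2, directly attached
describe_usb_path 1-2.2   # bus 1, root port 2, via hub port(s) 2
```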

> Another PPC board with
> nearly the same hardware does not have this hub, and it can handle the USB
> bus without any problem.
>
> So I think the hub is the reason.

Under the circumstances, I don't see how it could be.

> Sorry, the HW is soldered on the board.

If you think there is a bad hub, can you remove it and re-solder the
other components?

Unfortunately, your environment isn't very suitable for testing. You
can't attach the hardware to a different computer, you can't use any
test equipment, and you can't even try out different kernel versions.

All together, the odds of solving this problem aren't good.

> Right, but I would prefer a solution for this. Since it works perfectly
> in kernel 3.4, I don't think it is a hardware issue.

Maybe yes, maybe no. Without the ability to perform detailed tests, we
can't tell.

What happens if you run the unbind/rebind test under 3.4?

Alan Stern

2014-06-07 08:16:17

by Benjamin Herrenschmidt

Subject: Re: Missing USB XHCI and EHCI reset for kexec

On Tue, 2014-04-15 at 20:54 +0200, Stefani Seibold wrote:

> > > > > >>>>[ 1.170029] xhci_hcd 0001:03:00.0: xHCI Host Controller
> > > > > >>>>[ 1.175306] xhci_hcd 0001:03:00.0: new USB bus registered, assigned bus number 1
> > > > > >>>>[ 1.212561] xhci_hcd 0001:03:00.0: Host not halted after 16000 microseconds.
> > > > > >>>>[ 1.219621] xhci_hcd 0001:03:00.0: can't setup: -110
> > > > > >>>>[ 1.224597] xhci_hcd 0001:03:00.0: USB bus 1 deregistered
> > > > > >>>>[ 1.230021] xhci_hcd 0001:03:00.0: init 0001:03:00.0 fail, -110
> > > > > >>>>[ 1.235955] xhci_hcd: probe of 0001:03:00.0 failed with error -110
> > > > > >>>>

Cascardo, Alan,

Bringing back that old thread..

I'm trying to kexec from our OPAL FW bootloader (3.10.23 based) to a
Fedora 3.14.3 kernel on a new machine and am still seeing the above.

A reset brings the chip back.
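
The reset is presumably the one from the original report, which writes 1 to the controller's sysfs reset attribute. A minimal sketch under that assumption: the helper names hcd_reset_path and reset_hcd are hypothetical, and the address in the example comment is simply the one from the quoted log. It goes through /sys/bus/pci/devices/ADDRESS/reset, which is the same attribute reached without having to name the bound driver.

```shell
#!/bin/sh
# Sketch of the sysfs reset workaround for a host controller.
# $1 = PCI address of the controller, e.g. 0001:03:00.0

hcd_reset_path() {
    # The per-device directory works regardless of which driver is bound.
    echo "/sys/bus/pci/devices/$1/reset"
}

reset_hcd() {
    path=$(hcd_reset_path "$1")
    if [ -w "$path" ]; then
        # Writing 1 asks the PCI core to reset this function.
        echo 1 > "$path"
    else
        echo "no writable reset attribute at $path" >&2
        return 1
    fi
}

# Example (needs root and the real hardware):
# reset_hcd 0001:03:00.0
```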

Do we have any resolution here? It looks like the xHCI remains active
on the way out and is clagged up on the way back in, at least in a
state the driver really doesn't like.

And this is a "clean" kexec, not a forced one... This is killing USB on
the upcoming OpenPower dev machines.

Cheers,
Ben.

2014-06-07 15:40:19

by Alan Stern

Subject: Re: Missing USB XHCI and EHCI reset for kexec

[Added Mathias to CC: list]

On Sat, 7 Jun 2014, Benjamin Herrenschmidt wrote:

> I'm trying to kexec from our OPAL FW bootloader (3.10.23 based) to a
> Fedora 3.14.3 kernel on a new machine and am still seeing the above.
>
> A reset brings the chip back.
>
> Do we have any resolution here? It looks like the xHCI remains active
> on the way out and is clagged up on the way back in, at least in a
> state the driver really doesn't like.
>
> And this is a "clean" kexec, not a forced one... This is killing USB on
> the upcoming OpenPower dev machines.

The current xhci-hcd driver includes a quirk flag (XHCI_SPURIOUS_WAKEUP)
that causes the shutdown routine to reset the controller. It wasn't
meant for fixing kexec problems, but I bet you could use it for that
purpose.

In addition, it's possible that a reset is needed in the probe pathway.

Alan Stern

2014-06-07 21:11:55

by Benjamin Herrenschmidt

Subject: Re: Missing USB XHCI and EHCI reset for kexec

On Sat, 2014-06-07 at 11:40 -0400, Alan Stern wrote:
> The current xhci-hcd driver includes a quirk flag (XHCI_SPURIOUS_WAKEUP)
> that causes the shutdown routine to reset the controller. It wasn't
> meant for fixing kexec problems, but I bet you could use it for that
> purpose.
>
> In addition, it's possible that a reset is needed in the probe pathway.

Ok, thanks. I'll have a look. A reset in the probe means fixing distros,
I'd rather find a way to fix it from the kexec path entirely, even if
that involves adding a quirk on the way out to reset it in shutdown()

I'll see what I can come up with that works and will come back to you.

Cheers,
Ben.

2014-06-07 22:36:05

by Benjamin Herrenschmidt

Subject: Re: Missing USB XHCI and EHCI reset for kexec

On Sat, 2014-06-07 at 11:40 -0400, Alan Stern wrote:

> The current xhci-hcd driver includes a quirk flag (XHCI_SPURIOUS_WAKEUP)
> that causes the shutdown routine to reset the controller. It wasn't
> meant for fixing kexec problems, but I bet you could use it for that
> purpose.
>
> In addition, it's possible that a reset is needed in the probe pathway.

Looking at the code a bit more ... that xhci_shutdown() worries me.

It basically just whacks xhci_halt() and optionally reset() but nothing
is done that I can see to ensure that we aren't concurrently
doing things like queuing URBs, polling the root hub etc...

That's definitely not clean and while it might work (most of the time
at least) on actual shutdown it's definitely not right for kexec I
reckon.

Now there's a separate discussion that we had a while ago and might
want to resume which is to say that kexec shouldn't be calling
shutdown() anyway, but instead remove() on all drivers which is
a much better code path for the purpose of leaving the device in
a state where a driver can reconnect to it.

However, in the case of XHCI that leads to another issue described
here:

http://marc.info/?l=linux-usb&m=139483181809062&w=2

For which there was little / no discussion at all... I suppose we could
do a quirk but I don't think the problem is fundamentally
specific to the TI chip, we should probably stop both root hubs
before we halt both HCDs.

Cheers,
Ben.

2014-06-08 01:10:02

by Alan Stern

Subject: Re: Missing USB XHCI and EHCI reset for kexec

On Sun, 8 Jun 2014, Benjamin Herrenschmidt wrote:

> Looking at the code a bit more ... that xhci_shutdown() worries me.
>
> It basically just whacks xhci_halt() and optionally reset() but nothing
> is done that I can see to ensure that we aren't concurrently
> doing things like queuing URBs, polling the root hub etc...
>
> That's definitely not clean and while it might work (most of the time
> at least) on actual shutdown it's definitely not right for kexec I
> reckon.

Yes, it really was meant for actual system shutdown.

> Now there's a separate discussion that we had a while ago and might
> want to resume which is to say that kexec shouldn't be calling
> shutdown() anyway, but instead remove() on all drivers which is
> a much better code path for the purpose of leaving the device in
> a state where a driver can reconnect to it.
>
> However, in the case of XHCI that leads to another issue described
> here:
>
> http://marc.info/?l=linux-usb&m=139483181809062&w=2
>
> For which there was little / no discussion at all... I suppose we could
> do a quirk but I don't think the problem is fundamentally
> specific to the TI chip, we should probably stop both root hubs
> before we halt both HCDs.

The issue described in that email seems valid to me. Maybe the patch
should be resubmitted. Now that xhci-hcd has changed maintainership,
the discussion might move forward.

In any case, you certainly can try testing with that patch installed.
After all, xhci-hcd should work properly after a rmmod/modprobe
sequence, and this is pretty much the same thing.

Alan Stern