(following Richard Goochs template bug report)
(please CC me because I am not subscribed)
(thanks in advance for looking into this!)
1. OOPS after connection Droids MuIn USB display
2. I recently bought this display:
http://www.droids.it/cmsvb4/content.php?233-990.014-MuIn-LCD_overview It
presents itself as a "communication" class USB device. After plugin, the
kernel gives an oops. The kernel keeps running, other USB devices
continue to work, but at least any lsusb started keeps hanging in "D"
state. There is no additional kernel logging about the lsusb hanging.
3. OOPS insert usb acm device
4. Linux version 2.6.38.2 (erik@skylla) (gcc version 4.5.1 20100924 (Red
Hat 4.5.1-4) (GCC) ) #3 SMP Fri Apr 22 17:31:06 CEST 2011
This is a vanilla kernel on x86-64. The fedora kernel gives the oops as
well, though.
5. --------------- >-8 ---------------------
usb 2-1.3: new full speed USB device using ehci_hcd and address 111
usb 2-1.3: New USB device found, idVendor=04d8, idProduct=000b
usb 2-1.3: New USB device strings: Mfr=1, Product=2, SerialNumber=0
usb 2-1.3: Product: VCOM
usb 2-1.3: Manufacturer: DROIDS
cdc_acm 2-1.3:1.0: This device cannot do calls on its own. It is not a
modem.
BUG: unable to handle kernel NULL pointer dereference at 0000000000000008
IP: [<ffffffff815f64e7>] acm_probe+0x157/0xcb0
PGD 0
Oops: 0000 [#1] SMP
last sysfs file: /sys/devices/pci0000:00/0000:00:1d.7/usb2/idVendor
CPU 0
Modules linked in: [last unloaded: scsi_wait_scan]
Pid: 19, comm: khubd Tainted: G W 2.6.38.2 #3 Dell Inc. XPS
M1330
RIP: 0010:[<ffffffff815f64e7>] [<ffffffff815f64e7>] acm_probe+0x157/0xcb0
RSP: 0018:ffff88011fdf59c0 EFLAGS: 00010286
RAX: 0000000000000000 RBX: 0000000000000000 RCX: ffff8800cd803000
RDX: 0000000000000001 RSI: 0000000000000001 RDI: ffff88011bb8d800
RBP: ffff8800cd803400 R08: 0000000000000001 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000001 R12: ffff8800cd803400
R13: 0000000000000000 R14: ffff88011bb8d888 R15: ffff8800cd803430
FS: 0000000000000000(0000) GS:ffff8800df400000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 0000000000000008 CR3: 0000000001cb1000 CR4: 00000000000006f0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process khubd (pid: 19, threadinfo ffff88011fdf4000, task ffff88011fca9640)
Stack:
ffff8800cd86b200 ffffffff81120a09 ffff88011fdf5a30 ffff88011bb8d888
0000000000000000 0000000000000001 0000000000000000 ffff88011bb8d800
0000000000000206 0000000000000000 ffffffff00000010 0000000000000246
Call Trace:
[<ffffffff81120a09>] ? sysfs_addrm_finish+0x29/0xe0
[<ffffffff815cff9e>] ? usb_probe_interface+0x10e/0x210
[<ffffffff81431ef6>] ? driver_probe_device+0x96/0x1b0
[<ffffffff814320b0>] ? __device_attach+0x0/0x60
[<ffffffff81430c1c>] ? bus_for_each_drv+0x5c/0x90
[<ffffffff81431d9f>] ? device_attach+0x8f/0xb0
[<ffffffff814315cd>] ? bus_probe_device+0x2d/0x50
[<ffffffff8142f4ca>] ? device_add+0x5ba/0x690
[<ffffffff815ce574>] ? usb_set_configuration+0x5b4/0x690
[<ffffffff81121722>] ? sysfs_do_create_link+0xe2/0x220
[<ffffffff815d733b>] ? generic_probe+0x3b/0xa0
[<ffffffff81431ef6>] ? driver_probe_device+0x96/0x1b0
[<ffffffff814320b0>] ? __device_attach+0x0/0x60
[<ffffffff81430c1c>] ? bus_for_each_drv+0x5c/0x90
[<ffffffff81431d9f>] ? device_attach+0x8f/0xb0
[<ffffffff814315cd>] ? bus_probe_device+0x2d/0x50
[<ffffffff8142f4ca>] ? device_add+0x5ba/0x690
[<ffffffff815c72f9>] ? usb_new_device+0xe9/0x130
[<ffffffff815c85a0>] ? hub_thread+0xba0/0x10d0
[<ffffffff810337a0>] ? enqueue_task_fair+0x160/0x1a0
[<ffffffff81058580>] ? autoremove_wake_function+0x0/0x30
[<ffffffff815c7a00>] ? hub_thread+0x0/0x10d0
[<ffffffff815c7a00>] ? hub_thread+0x0/0x10d0
[<ffffffff810580f6>] ? kthread+0x96/0xa0
[<ffffffff81003b14>] ? kernel_thread_helper+0x4/0x10
[<ffffffff81058060>] ? kthread+0x0/0xa0
[<ffffffff81003b10>] ? kernel_thread_helper+0x0/0x10
Code: 44 8b 5c 24 28 45 85 db 0f 8e 65 01 00 00 8b 74 24 28 48 8b 7c 24
38 4c 89 e5 e8 25 d3
RIP [<ffffffff815f64e7>] acm_probe+0x157/0xcb0
RSP <ffff88011fdf59c0>
CR2: 0000000000000008
---[ end trace facc809d566fe573 ]---
------------------------- >-8 --------------------
6. N/A
7.
- (software) N/A
- (cpu)
processor : 0
vendor_id : GenuineIntel
cpu family : 6
model : 23
model name : Intel(R) Core(TM)2 Duo CPU T9300 @ 2.50GHz
stepping : 6
cpu MHz : 800.000
cache size : 6144 KB
physical id : 0
siblings : 2
core id : 0
cpu cores : 2
apicid : 0
initial apicid : 0
fpu : yes
fpu_exception : yes
cpuid level : 10
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov
pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx lm
constant_tsc arch_perfmon pebs bts rep_good nopl aperfmperf pni dtes64
monitor ds_cpl vmx est tm2 ssse3 cx16 xtpr pdcm sse4_1 lahf_lm ida dts
tpr_shadow vnmi flexpriority
bogomips : 4987.54
clflush size : 64
cache_alignment : 64
address sizes : 36 bits physical, 48 bits virtual
power management:
processor : 1
vendor_id : GenuineIntel
cpu family : 6
model : 23
model name : Intel(R) Core(TM)2 Duo CPU T9300 @ 2.50GHz
stepping : 6
cpu MHz : 800.000
cache size : 6144 KB
physical id : 0
siblings : 2
core id : 1
cpu cores : 2
apicid : 1
initial apicid : 1
fpu : yes
fpu_exception : yes
cpuid level : 10
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov
pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx lm
constant_tsc arch_perfmon pebs bts rep_good nopl aperfmperf pni dtes64
monitor ds_cpl vmx est tm2 ssse3 cx16 xtpr pdcm sse4_1 lahf_lm ida dts
tpr_shadow vnmi flexpriority
bogomips : 4987.43
clflush size : 64
cache_alignment : 64
address sizes : 36 bits physical, 48 bits virtual
power management:
- (modules) none
- (driver/hardware details) (skip for the moment)
- (pci)
00:00.0 Host bridge: Intel Corporation Mobile PM965/GM965/GL960 Memory
Controller Hub (rev 0c)
00:02.0 VGA compatible controller: Intel Corporation Mobile GM965/GL960
Integrated Graphics Controller (rev 0c)
00:02.1 Display controller: Intel Corporation Mobile GM965/GL960
Integrated Graphics Controller (rev 0c)
00:1a.0 USB Controller: Intel Corporation 82801H (ICH8 Family) USB UHCI
Controller #4 (rev 02)
00:1a.1 USB Controller: Intel Corporation 82801H (ICH8 Family) USB UHCI
Controller #5 (rev 02)
00:1a.7 USB Controller: Intel Corporation 82801H (ICH8 Family) USB2 EHCI
Controller #2 (rev 02)
00:1b.0 Audio device: Intel Corporation 82801H (ICH8 Family) HD Audio
Controller (rev 02)
00:1c.0 PCI bridge: Intel Corporation 82801H (ICH8 Family) PCI Express
Port 1 (rev 02)
00:1c.1 PCI bridge: Intel Corporation 82801H (ICH8 Family) PCI Express
Port 2 (rev 02)
00:1c.3 PCI bridge: Intel Corporation 82801H (ICH8 Family) PCI Express
Port 4 (rev 02)
00:1c.5 PCI bridge: Intel Corporation 82801H (ICH8 Family) PCI Express
Port 6 (rev 02)
00:1d.0 USB Controller: Intel Corporation 82801H (ICH8 Family) USB UHCI
Controller #1 (rev 02)
00:1d.1 USB Controller: Intel Corporation 82801H (ICH8 Family) USB UHCI
Controller #2 (rev 02)
00:1d.2 USB Controller: Intel Corporation 82801H (ICH8 Family) USB UHCI
Controller #3 (rev 02)
00:1d.7 USB Controller: Intel Corporation 82801H (ICH8 Family) USB2 EHCI
Controller #1 (rev 02)
00:1e.0 PCI bridge: Intel Corporation 82801 Mobile PCI Bridge (rev f2)
00:1f.0 ISA bridge: Intel Corporation 82801HEM (ICH8M) LPC Interface
Controller (rev 02)
00:1f.1 IDE interface: Intel Corporation 82801HBM/HEM (ICH8M/ICH8M-E)
IDE Controller (rev 02)
00:1f.2 SATA controller: Intel Corporation 82801HBM/HEM (ICH8M/ICH8M-E)
SATA AHCI Controller (rev 02)
00:1f.3 SMBus: Intel Corporation 82801H (ICH8 Family) SMBus Controller
(rev 02)
03:01.0 FireWire (IEEE 1394): Ricoh Co Ltd R5C832 IEEE 1394 Controller
(rev 05)
03:01.1 SD Host controller: Ricoh Co Ltd R5C822 SD/SDIO/MMC/MS/MSPro
Host Adapter (rev 22)
03:01.2 System peripheral: Ricoh Co Ltd R5C592 Memory Stick Bus Host
Adapter (rev 12)
03:01.3 System peripheral: Ricoh Co Ltd xD-Picture Card Controller (rev 12)
09:00.0 Ethernet controller: Broadcom Corporation NetLink BCM5906M Fast
Ethernet PCI Express (rev 02)
0c:00.0 Network controller: Intel Corporation PRO/Wireless 4965 AG or
AGN [Kedron] Network Connection (rev 61)
0d:00.0 Ethernet controller: Marvell Technology Group Ltd. 88E8053 PCI-E
Gigabit Ethernet Controller (rev 20)
- (scsi) N/A
- (other) none, System.map doesn't seem to give any extra information
Other notes: this is exactly reproduceable, so I don't think it's a
hardware issue. The device itself (the display) seems to be working OK.
Following up on this bug I have found the location of the OOPS in source
code (using bisection). It's clearly not the source of the problem, but
I hope it will help.
In the file drivers/usb/class/cdc-acm.c, around line 1098, there is a
comment "/*workaround for switched interfaces */" and then a check:
if (data_interface->cur_altsetting->desc.bInterfaceClass !=
CDC_DATA_INTERFACE_TYPE)
At this point, the variable data_interface is NULL and this causes the OOPS.
Hi Eric,
> In the file drivers/usb/class/cdc-acm.c, around line 1098, there is a
> comment "/*workaround for switched interfaces */" and then a check:
>
> if (data_interface->cur_altsetting->desc.bInterfaceClass !=
> CDC_DATA_INTERFACE_TYPE)
>
> At this point, the variable data_interface is NULL and this causes the OOPS.
>
I have included some checks for "data_interface" since
"usb_ifnum_to_if" could return NULL.
Could you please check by applying this patch ?
Signed-off-by: Maxin B. John <[email protected]>
---
diff --git a/drivers/usb/class/cdc-acm.c b/drivers/usb/class/cdc-acm.c
index e057e53..073f418 100644
--- a/drivers/usb/class/cdc-acm.c
+++ b/drivers/usb/class/cdc-acm.c
@@ -960,6 +960,10 @@ static int acm_probe(struct usb_interface *intf,
if (quirks == NO_UNION_NORMAL) {
data_interface = usb_ifnum_to_if(usb_dev, 1);
control_interface = usb_ifnum_to_if(usb_dev, 0);
+ if (!data_interface || !control_interface) {
+ dev_dbg(&intf->dev, "no interfaces\n");
+ return -ENODEV;
+ }
goto skip_normal_probe;
}
@@ -1031,6 +1035,10 @@ next_desc:
if (call_interface_num > 0) {
dev_dbg(&intf->dev, "No union descriptor,
using call management descriptor\n");
data_interface = usb_ifnum_to_if(usb_dev,
(data_interface_num = call_interface_num));
+ if (!data_interface) {
+ dev_dbg(&intf->dev, "no data interface\n");
+ return -ENODEV;
+ }
control_interface = intf;
} else {
if (intf->cur_altsetting->desc.bNumEndpoints != 3) {
> I have included some checks for "data_interface" since
> "usb_ifnum_to_if" could return NULL.
> Could you please check by applying this patch ?
In the meantime I've done a lot of bisection and inserting printk's
(needs a reboot every time, that's why I didn't react yet).
I think I can tell even a bit more than this patch will. I already found
out that the device appears to have no data connection but does have a
control connection. The lack of data connection seems to yield the NULL.
I will now test with data_connection forced to be equal to
control_connection, because apparently some devices actually work with
only connection, maybe this is one of them.
Will have more details later.
Thank you for your interest.
Yeah, I've found it!
As I mentioned earlier, the device appears to have a control interface
and no data interface. The current usb cdc-acm code doesn't account for
this and so yields a oops because data_interface is null.
I've "fixed" it very dirtily by changing these lines:
if (call_interface_num > 0) {
dev_dbg(&intf->dev, "No union descriptor, using call management
descriptor\n");
data_interface = usb_ifnum_to_if(usb_dev, (data_interface_num =
call_interface_num));
into:
if (call_interface_num > 0) {
dev_dbg(&intf->dev, "No union descriptor, using call management
descriptor\n");
data_interface = usb_ifnum_to_if(usb_dev, 0);
I can't give an exact patch because the source is full of added debug
printk's now ;-)
As far as I understand it, this change results in data_interface to be
set equal to control_interface and later on this gets dealed with properly.
And now it not only no longer oopses, it actually works!
If you're planning to make a quirk of it, the USB ID is 04d8:000b
"Microchip Technology, Inc. PIC18F2550 (32K Flashable 10 Channel, 10 Bit
A/D USB Microcontroller"
lsusb says:
Bus 002 Device 018: ID 04d8:000b Microchip Technology, Inc. PIC18F2550
(32K Flashable 10 Channel, 10 Bit A/D USB Microcontroller)
Device Descriptor:
bLength 18
bDescriptorType 1
bcdUSB 2.00
bDeviceClass 2 Communications
bDeviceSubClass 2 Abstract (modem)
bDeviceProtocol 1 AT-commands (v.25ter)
bMaxPacketSize0 64
idVendor 0x04d8 Microchip Technology, Inc.
idProduct 0x000b PIC18F2550 (32K Flashable 10 Channel, 10
Bit A/D USB Microcontroller)
bcdDevice 0.00
iManufacturer 1 DROIDS
iProduct 2 VCOM
iSerial 0
bNumConfigurations 1
Configuration Descriptor:
bLength 9
bDescriptorType 2
wTotalLength 53
bNumInterfaces 1
bConfigurationValue 1
iConfiguration 0
bmAttributes 0xa0
(Bus Powered)
Remote Wakeup
MaxPower 500mA
Interface Descriptor:
bLength 9
bDescriptorType 4
bInterfaceNumber 0
bAlternateSetting 0
bNumEndpoints 3
bInterfaceClass 2 Communications
bInterfaceSubClass 2 Abstract (modem)
bInterfaceProtocol 1 AT-commands (v.25ter)
iInterface 0
CDC Header:
bcdCDC 1.10
CDC Call Management:
bmCapabilities 0x01
call management
bDataInterface 1
CDC ACM:
bmCapabilities 0x06
sends break
line coding and serial state
Endpoint Descriptor:
bLength 7
bDescriptorType 5
bEndpointAddress 0x83 EP 3 IN
bmAttributes 3
Transfer Type Interrupt
Synch Type None
Usage Type Data
wMaxPacketSize 0x0008 1x 8 bytes
bInterval 10
Endpoint Descriptor:
bLength 7
bDescriptorType 5
bEndpointAddress 0x84 EP 4 IN
bmAttributes 2
Transfer Type Bulk
Synch Type None
Usage Type Data
wMaxPacketSize 0x0040 1x 64 bytes bInterval
10
Endpoint Descriptor:
bLength 7
bDescriptorType 5
bEndpointAddress 0x04 EP 4 OUT
bmAttributes 2
Transfer Type Bulk
Synch Type None
Usage Type Data
wMaxPacketSize 0x0040 1x 64 bytes
bInterval 10
Device Status: 0x0001
Self Powered
Thanks.
Hi Erik,
My apologies for misspelling your name in the previous mail.
On Fri, May 6, 2011 at 7:57 PM, Erik Slagter <[email protected]> wrote:
> Yeah, I've found it!
>
> As I mentioned earlier, the device appears to have a control interface and
> no data interface. The current usb cdc-acm code doesn't account for this and
> so yields a oops because data_interface is null.
That's so good to hear.
> I can't give an exact patch because the source is full of added debug
> printk's now ;-)
>
> As far as I understand it, this change results in data_interface to be set
> equal to control_interface and later on this gets dealed with properly.
>
> And now it not only no longer oopses, it actually works!
Congratulations and thanks for explaining the details.
Best Regards,
Maxin
> Congratulations and thanks for explaining the details.
And now my big question is, can this be fixed upstream? ;-)
Cheers,
Erik.
On Sat, May 07, 2011 at 09:02:49AM +0200, Erik Slagter wrote:
> >Congratulations and thanks for explaining the details.
>
> And now my big question is, can this be fixed upstream? ;-)
Yes, if someone sends us a patch :)
thanks,
greg k-h
On 07-05-11 22:02, Greg KH wrote:
> On Sat, May 07, 2011 at 09:02:49AM +0200, Erik Slagter wrote:
>> And now my big question is, can this be fixed upstream? ;-)
>
> Yes, if someone sends us a patch :)
Somehow I knew where this was going to :-/ I can happily make a clean
patch that makes this device work, but as I am almost completely in the
dark as whether what I am doing, it will probably break any other device
and also fix it in the wrong way as well. I've had patches rejected for
the same reasons in the past...
On Sun, May 08, 2011 at 09:45:07AM +0200, Erik Slagter wrote:
> On 07-05-11 22:02, Greg KH wrote:
> >On Sat, May 07, 2011 at 09:02:49AM +0200, Erik Slagter wrote:
> >>And now my big question is, can this be fixed upstream? ;-)
> >
> >Yes, if someone sends us a patch :)
>
> Somehow I knew where this was going to :-/ I can happily make a
> clean patch that makes this device work, but as I am almost
> completely in the dark as whether what I am doing, it will probably
> break any other device and also fix it in the wrong way as well.
> I've had patches rejected for the same reasons in the past...
Well, try it and we will work at it from there, otherwise nothing will
change.
thanks,
greg k-h
On 07-05-11 22:02, Greg KH wrote:
> On Sat, May 07, 2011 at 09:02:49AM +0200, Erik Slagter wrote:
>>> Congratulations and thanks for explaining the details.
>> And now my big question is, can this be fixed upstream? ;-)
> Yes, if someone sends us a patch :)
Hi Greg,
Maxin John has been so kind as to make a patch which I tested and
tweaked a tiny bit. The attached patch set should do the trick. Does
this suffice this way?
Thanks,
Erik Slagter.
On Tue, May 10, 2011 at 08:25:50PM +0200, Erik Slagter wrote:
> On 07-05-11 22:02, Greg KH wrote:
> >On Sat, May 07, 2011 at 09:02:49AM +0200, Erik Slagter wrote:
> >>>Congratulations and thanks for explaining the details.
> >>And now my big question is, can this be fixed upstream? ;-)
> >Yes, if someone sends us a patch :)
>
> Hi Greg,
>
> Maxin John has been so kind as to make a patch which I tested and
> tweaked a tiny bit. The attached patch set should do the trick. Does
> this suffice this way?
It's a great start, yes. But could you take a look at the file,
Documentation/Submitting patches and resend it in the format described
there so that I could apply it?
Hint, you need a good description of what the patch does, a
signed-off-by: line, perhaps a tested-by line, and the patch at the
correct level so it can be applied from the root of the kernel tree.
Can you try doing this?
thanks,
greg k-h