2013-08-09 06:06:52

by Thomas Richter

Subject: [HANG] Trouble with NEC-based USB adapter in PCMCIA slot on E7110

Hi Alan, hi Dominik,

maybe you want me to help out a bit - I'm having trouble getting a
Delock PCMCIA to USB-2 adapter to work under linux, with strange
behavior in some situations. The trouble is that while I can *read* via
fast (usb 2.0) transfers from the port, an ehci-triggered write just
locks up (it does not Oops, though) and I'm pretty much stuck narrowing
down the problem.

Here are some more details on the machine:

The USB 2 adapter chip is from NEC: 03:00.1 0c03: 1033:00e0 (rev 05)
and so is its ohci companion: 03:00.0 0c03: 1033:0035.

Both of them sit on a Delock PCMCIA card, which hangs off an O2 Micro
OZ6933/711E1 cardbus bridge:

02:0a.0 0607: 1217:6933 (rev 02)
02:0a.1 0607: 1217:6933 (rev 02)

The trouble is as follows: A read from a USB stick or an external
hard disk through the SCSI subsystem issues SCSI command 0x28 (READ(10)),
and this works fine. A write (SCSI command 0x2A, WRITE(10)) locks up. The
machine remains usable, but the command never completes. Some time later
the subsystem first tries to reset the bus, does not succeed, and then
gives up. Even worse, the hard disk on the USB port seems to "believe" it
has received some data, and then helpfully erases the superblock of the
ext2 partition on it as soon as I try to mount it r/w. Read-only mounts
work (no write commands involved). Happy happy joy joy!
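
For readers who do not have the opcode table in their head, here is a minimal sketch of the distinction drawn above. It only knows the two opcodes named in this mail (0x28 and 0x2A); the enum and function names are made up for illustration and are not taken from any kernel source.

```c
#include <stdint.h>

/* Direction of the data phase implied by a SCSI command opcode. */
enum scsi_dir { SCSI_DIR_OTHER, SCSI_DIR_READ, SCSI_DIR_WRITE };

/*
 * Classify the two opcodes discussed above. 0x28 is READ(10) and 0x2A is
 * WRITE(10); every other opcode is lumped together because this sketch
 * only cares about the read/write split.
 */
static enum scsi_dir scsi_classify(uint8_t opcode)
{
	switch (opcode) {
	case 0x28:
		return SCSI_DIR_READ;	/* data flows device -> host */
	case 0x2A:
		return SCSI_DIR_WRITE;	/* data flows host -> device */
	default:
		return SCSI_DIR_OTHER;
	}
}
```

In the failure described here, only commands for which `scsi_classify()` would return `SCSI_DIR_WRITE` get stuck.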

It *also* works to disable the ehci portion of the NEC chip and just go
through the companion. Transfer through ohci is fine (but, of course,
slow, and besides the point since that's exactly what I need the adapter
for).

The problem appears on various kernels. I've tested 3.10.5 (built
myself), 3.2.0-4-686-pae (vanilla Debian wheezy kernel), and 2.6.32-5-686
(vanilla Debian squeeze kernel). Interestingly, the chip *does* work
under an older Knoppix (Knoppix 5.0.1) with a 2.6.19.1 kernel. As far as
I can read the notes, the ehci and cardbus interfaces have not been
altered and are as they came with 2.6.19.1. I also checked that I/O goes
through high speed and not USB 1 (unloaded ohci, loaded ehci, inserted
the stick - writing works!)

What is also strange (but an unrelated problem) is that I cannot make
grub2 boot the Knoppix kernel. It just resets immediately after having
loaded kernel and initrd. Nor can I build 2.6.19.1 manually
unaltered, as it depends on older userspace tools. After a bit of
patching, I could get it to compile with gcc-4.7, but the kernel
otherwise behaves like the Knoppix kernel: it just reboots after grub2
has loaded it.

I also tried a newer knoppix with a 3.5 kernel, but this again locks up
on writing, so something must have changed between 2.6.19.1 and
2.6.32.5. Or something changed in knoppix or debian userland which I
cannot test for because the 2.6.19.1 does not run here from the disk.

The system is an old Fujitsu E7110 with a Debian wheezy userland,
1GB of memory, an 80GB hard disk, and a Pentium 4-M dinosaur
processor at 1.7 GHz featuring SpeedStep. Whether SpeedStep is enabled or
disabled does not matter as far as the "stuck write" is concerned; the
p4-clockmodulation module *is not* loaded and hence not the culprit.

Greetings,
Thomas

PS: Dear LKML readers, please keep me in CC on any responses that would
otherwise go through the mailing list only.


2013-08-09 17:46:43

by Alan Stern

Subject: Re: [HANG] Trouble with NEC-based USB adapter in PCMCIA slot on E7110

On Fri, 9 Aug 2013, Thomas Richter wrote:

> Hi Alan, hi Dominik,
>
> maybe you want me to help out a bit - I'm having trouble getting a
> Delock PCMCIA to USB-2 adapter to work under linux, with strange
> behavior in some situations. The trouble is that while I can *read* via
> fast (usb 2.0) transfers from the port, an ehci-triggered write just
> locks up (it does not Oops, though) and I'm pretty much stuck narrowing
> down the problem.

The best way to start will be to enable CONFIG_USB_DEBUG in your kernel
and post the dmesg log. Also post a usbmon trace showing what happens
when you try to write something.

Alan Stern

2013-08-09 19:29:16

by Thomas Richter

Subject: Re: [HANG] Trouble with NEC-based USB adapter in PCMCIA slot on E7110

On 09.08.2013 19:46, Alan Stern wrote:
> On Fri, 9 Aug 2013, Thomas Richter wrote:
>
>> Hi Alan, hi Dominik,
>>
>> maybe you want me to help out a bit - I'm having trouble getting a
>> Delock PCMCIA to USB-2 adapter to work under linux, with strange
>> behavior in some situations. The trouble is that while I can *read* via
>> fast (usb 2.0) transfers from the port, an ehci-triggered write just
>> locks up (it does not Oops, though) and I'm pretty much stuck narrowing
>> down the problem.
>
> The best way to start will be to enable CONFIG_USB_DEBUG in your kernel
> and post the dmesg log. Also post a usbmon trace showing what happens
> when you try to write something.

Will try and report back, thanks. I've bisected it down in the meantime
to a change from 2.6.31.6 to 2.6.32.6. Interestingly, this is very much
the same time when the udev userland changed. It works with 2.6.31.6 old
udev, but not with 2.6.36.2, new udev.

Probably the handover from full speed to USB 2.0 got broken?

Note that this laptop has now both uhci (legacy) and ohci (as by the
pcmcia card, as companion of the ehci) chips, which probably adds to the
problem.

Greetings,
Thomas

2013-08-09 19:35:07

by Alan Stern

Subject: Re: [HANG] Trouble with NEC-based USB adapter in PCMCIA slot on E7110

On Fri, 9 Aug 2013, Thomas Richter wrote:

> On 09.08.2013 19:46, Alan Stern wrote:
> > On Fri, 9 Aug 2013, Thomas Richter wrote:
> >
> >> Hi Alan, hi Dominik,
> >>
> >> maybe you want me to help out a bit - I'm having trouble getting a
> >> Delock PCMCIA to USB-2 adapter to work under linux, with strange
> >> behavior in some situations. The trouble is that while I can *read* via
> >> fast (usb 2.0) transfers from the port, an ehci-triggered write just
> >> locks up (it does not Oops, though) and I'm pretty much stuck narrowing
> >> down the problem.
> >
> > The best way to start will be to enable CONFIG_USB_DEBUG in your kernel
> > and post the dmesg log. Also post a usbmon trace showing what happens
> > when you try to write something.
>
> Will try and report back, thanks. I've bisected it down in the meantime
> to a change from 2.6.31.6 to 2.6.32.6. Interestingly, this is very much
> the same time when the udev userland changed. It works with 2.6.31.6 old
> udev, but not with 2.6.36.2, new udev.

Can't you use the new udev with the old kernel?

> Probably the handover from full speed to USB 2.0 got broken?

The handover goes the other way, from EHCI to OHCI.

> Note that this laptop has now both uhci (legacy) and ohci (as by the
> pcmcia card, as companion of the ehci) chips, which probably adds to the
> problem.

That should not make any difference.

Alan Stern

2013-08-09 23:16:21

by Thomas Richter

Subject: Re: [HANG] Trouble with NEC-based USB adapter in PCMCIA slot on E7110

Hi Alan,

>> Will try and report back, thanks. I've bisected it down in the meantime
>> to a change from 2.6.31.6 to 2.6.32.6. Interestingly, this is very much
>> the same time when the udev userland changed. It works with 2.6.31.6 old
>> udev, but not with 2.6.36.2, new udev.
>
> Can't you use the new udev with the old kernel?

Nope. udevd refuses to start and says that it requires at least 2.6.32,
which means that apparently something changed in the kernel-userland
interface in 2.6.32. Thus, my conclusion that there's probably something
nasty going on in userland with udev.

Here's what I got: "messages" - output from /var/log/messages. Note that
the mount hangs exactly at the SCSI 2A command you see there, then tries
to reset the device after probably 20 seconds or so.

"udevmonitor" - what "udevadm monitor" generated and hence what udev got
from the kernel.

"usbdevices" what I got from "lsusb". The interesting USB busses are 3 -
the ehci chip, and 4, its ohci companion. The rest is the legacy uhci of
the board.

"usb-device-3" that's what I got from /sys/kernel/debug/usb/usbmon/3u,
i.e. the ehci bus.
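
To pick the interesting events out of such a trace, a small parser can help. The following is a minimal sketch, assuming the text ("u") line layout described in the kernel's usbmon documentation (URB tag, timestamp in microseconds, event type, then an address word such as `Bo:3:002:2`). The struct and function names are made up for illustration.

```c
#include <stdio.h>

/* Leading fields of one event from usbmon's text interface. */
struct usbmon_event {
	char urb_tag[32];		/* URB identifier as printed */
	unsigned long long ts_us;	/* timestamp in microseconds */
	char event;			/* 'S' submit, 'C' callback, 'E' submit error */
	char xfer;			/* 'C' control, 'Z' isoc, 'I' interrupt, 'B' bulk */
	char dir;			/* 'i' = in, 'o' = out */
	int bus, dev, ep;		/* bus number, device address, endpoint */
};

/* Parse the leading fields of one usbmon text line; returns 0 on success. */
static int usbmon_parse(const char *line, struct usbmon_event *ev)
{
	int n = sscanf(line, "%31s %llu %c %c%c:%d:%d:%d",
		       ev->urb_tag, &ev->ts_us, &ev->event,
		       &ev->xfer, &ev->dir, &ev->bus, &ev->dev, &ev->ep);

	return n == 8 ? 0 : -1;
}

/* True for the submission of a bulk OUT URB, i.e. data leaving the host. */
static int usbmon_is_bulk_out_submit(const struct usbmon_event *ev)
{
	return ev->event == 'S' && ev->xfer == 'B' && ev->dir == 'o';
}
```

A write that hangs would be expected to show up as a bulk OUT submission ('S') on bus 3 whose URB tag never reappears in a callback ('C') line; that reading of the trace is an assumption, not something verified against the attached file.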

The corresponding ohci bus (#4) remains empty, no debug output from there.

Hope you'll find some time to look into this riddle. If you need
anything more to hunt it down, please let me know.

Thanks,
Thomas


Attachments:
messages (30.71 kB)
udevmonitor (8.00 kB)
usbdevices (2.20 kB)
usb-device-3 (156.65 kB)

2013-08-10 01:07:50

by Alan Stern

Subject: Re: [HANG] Trouble with NEC-based USB adapter in PCMCIA slot on E7110

On Sat, 10 Aug 2013, Thomas Richter wrote:

> Hi Alan,
>
> >> Will try and report back, thanks. I've bisected it down in the meantime
> >> to a change from 2.6.31.6 to 2.6.32.6. Interestingly, this is very much
> >> the same time when the udev userland changed. It works with 2.6.31.6 old
> >> udev, but not with 2.6.36.2, new udev.
> >
> > Can't you use the new udev with the old kernel?
>
> Nope. udevd refuses to start and says that it requires at least 2.6.32,
> which means that apparently something changed in the kernel-userland
> interface in 2.6.32. Thus, my conclusion that there's probably something
> nasty going on in userland with udev.

What about the old udev with the newer kernel?

> Here's what I got: "messages" - output from /var/log/messages. Note that
> the mount hangs exactly at the SCSI 2A command you see there, then tries
> to reset the device after probably 20 seconds or so.

It looks like the controller is having trouble sending the large
packets for the write. I have no idea why.

At this point, bisection continues to look like the best approach.
Maybe you can confine the search to just the commits that touch files
under drivers/usb/host; then perhaps the intermediate kernels will
continue to work with the old udev.

Alan Stern

2013-08-10 08:36:19

by Thomas Richter

Subject: Re: [HANG] Trouble with NEC-based USB adapter in PCMCIA slot on E7110

On 10.08.2013 03:07, Alan Stern wrote:

>> Nope. udevd refuses to start and says that it requires at least 2.6.32,
>> which means that apparently something changed in the kernel-userland
>> interface in 2.6.32. Thus, my conclusion that there's probably something
>> nasty going on in userland with udev.
>
> What about the old udev with the newer kernel?

Well, I extracted now the old udev from the knoppix where I took the
kernel from. It puked on a couple of the rules from the new system, but
there was enough in the system to allow me to mount the usb ports. And
that worked. So, old kernel plus old udev plus new rules -> works. It's
at least not the udev rules in my new system that are at fault.

I'm now trying the latest 2.6.31.xx kernel with identical configuration
and check how that is going.

Greetings,
Thomas

2013-08-10 11:26:17

by Thomas Richter

Subject: Re: [HANG] Trouble with NEC-based USB adapter in PCMCIA slot on E7110

Hi Alan,

later... ok, bisected this even more. 2.6.31.14 kernel, old udev works
fine. Then removed /etc/rcS.d/S02udev to be sure the 2.6.32 kernel would
not pull in the new udev. Rebuilt 2.6.32.1 with *identical* kernel
configuration, booted up, manually started the *old* udev as in
2.6.31.14, and tried whether it worked.

Did not.

Thus: it is neither the udev configuration, nor the udev userspace
program, nor the kernel configuration.

Something must be screwed in the kernel space. Which is interesting
since 2.6.32.1 is actually older than 2.6.31.14 (but the older 2.6.31.6
works, too, and younger 2.6.32.6 doesn't work either).

As said, configuration of 2.6.32.1 and 2.6.31.14 is the same.

Which modules are relevant for handling the communication with the
device, and where would you suggest to start looking? I assume
yenta_socket is one thing for the cardbus adapter, then we have ehci_hcd
and ehci_pci. Do I assume correctly that we can rule out the ohci_hcd
module since that is not involved? What about other usb modules? I would
probably need a short recap on through which modules the user data flows
before the write is triggered, and which modules are involved in the
configuration of the NEC ehci host adapter.

Thanks,
Thomas

2013-08-10 12:04:58

by Alan Stern

Subject: Re: [HANG] Trouble with NEC-based USB adapter in PCMCIA slot on E7110

On Sat, 10 Aug 2013, Thomas Richter wrote:

> Which modules are relevant for handling the communication with the
> device, and where would you suggest to start looking? I assume
> yenta_socket is one thing for the cardbus adapter, then we have ehci_hcd
> and ehci_pci. Do I assume correctly that we can rule out the ohci_hcd
> module since that is not involved?

That's right.

> What about other usb modules? I would
> probably need a short recap on through which modules the user data flows
> before the write is triggered, and which modules are involved in the
> configuration of the NEC ehci host adapter.

Start with just the ones you have identified. If that doesn't fix the
problem then we can look further.

Alan Stern

2013-08-10 19:04:45

by Josep Lladonosa

Subject: Re: [HANG] Trouble with NEC-based USB adapter in PCMCIA slot on E7110

On 10 August 2013 14:04, Alan Stern <[email protected]> wrote:
> On Sat, 10 Aug 2013, Thomas Richter wrote:
>
>> Which modules are relevant for handling the communication with the
>> device, and where would you suggest to start looking? I assume
>> yenta_socket is one thing for the cardbus adapter, then we have ehci_hcd
>> and ehci_pci. Do I assume correctly that we can rule out the ohci_hcd
>> module since that is not involved?
>
> That's right.
>
>> What about other usb modules? I would
>> probably need a short recap on through which modules the user data flows
>> before the write is triggered, and which modules are involved in the
>> configuration of the NEC ehci host adapter.
>
> Start with just the ones you have identified. If that doesn't fix the
> problem then we can look further.
>

Hello,

Have you tried waiting for 5 minutes? It could also be related to
what happens with my kernel 3.11.0-rc4 (and rc3) and USB.

I plug in the drive. Only after 5 minutes of high-speed USB bus resets does it mount.


Aug 10 20:56:09 minijep kernel: [ 234.401608] usb 1-3: USB disconnect, device number 3
Aug 10 20:56:43 minijep kernel: [ 268.790914] usb 3-1.2: new high-speed USB device number 6 using ehci-pci
Aug 10 20:56:43 minijep kernel: [ 268.886687] usb 3-1.2: New USB device found, idVendor=090c, idProduct=1000
Aug 10 20:56:43 minijep kernel: [ 268.886695] usb 3-1.2: New USB device strings: Mfr=1, Product=2, SerialNumber=3
Aug 10 20:56:43 minijep kernel: [ 268.886699] usb 3-1.2: Product: DISK Pro
Aug 10 20:56:43 minijep kernel: [ 268.886702] usb 3-1.2: Manufacturer: USB
Aug 10 20:56:43 minijep kernel: [ 268.886705] usb 3-1.2: SerialNumber: AA04012700008773
Aug 10 20:56:43 minijep mtp-probe: checking bus 3, device 6: "/sys/devices/pci0000:00/0000:00:1a.0/usb3/3-1/3-1.2"
Aug 10 20:56:43 minijep kernel: [ 268.887447] usb-storage 3-1.2:1.0: USB Mass Storage device detected
Aug 10 20:56:43 minijep kernel: [ 268.887714] scsi9 : usb-storage 3-1.2:1.0
Aug 10 20:56:43 minijep mtp-probe: bus: 3, device: 6 was not an MTP device
Aug 10 20:56:44 minijep kernel: [ 269.889527] scsi 9:0:0:0: Direct-Access USB DISK Pro 1100 PQ: 0 ANSI: 0 CCS
Aug 10 20:56:44 minijep kernel: [ 269.890025] sd 9:0:0:0: Attached scsi generic sg3 type 0
Aug 10 20:56:44 minijep kernel: [ 269.892356] sd 9:0:0:0: [sdc] 3963904 512-byte logical blocks: (2.02 GB/1.88 GiB)
Aug 10 20:56:44 minijep kernel: [ 269.893461] sd 9:0:0:0: [sdc] Write Protect is off
Aug 10 20:57:14 minijep kernel: [ 300.109604] usb 3-1.2: reset high-speed USB device number 6 using ehci-pci
Aug 10 20:57:45 minijep kernel: [ 331.207924] usb 3-1.2: reset high-speed USB device number 6 using ehci-pci
Aug 10 20:58:16 minijep kernel: [ 362.210362] usb 3-1.2: reset high-speed USB device number 6 using ehci-pci
Aug 10 20:58:47 minijep kernel: [ 393.212774] usb 3-1.2: reset high-speed USB device number 6 using ehci-pci
Aug 10 20:59:18 minijep kernel: [ 424.279257] usb 3-1.2: reset high-speed USB device number 6 using ehci-pci
Aug 10 20:59:49 minijep kernel: [ 455.345574] usb 3-1.2: reset high-speed USB device number 6 using ehci-pci
Aug 10 21:00:20 minijep kernel: [ 486.347994] usb 3-1.2: reset high-speed USB device number 6 using ehci-pci
Aug 10 21:00:51 minijep kernel: [ 517.350252] usb 3-1.2: reset high-speed USB device number 6 using ehci-pci
Aug 10 21:00:52 minijep kernel: [ 517.494528] sdc: sdc1
Aug 10 21:01:22 minijep kernel: [ 548.352623] usb 3-1.2: reset high-speed USB device number 6 using ehci-pci
Aug 10 21:01:53 minijep kernel: [ 579.355037] usb 3-1.2: reset high-speed USB device number 6 using ehci-pci
Aug 10 21:02:24 minijep kernel: [ 610.485609] usb 3-1.2: reset high-speed USB device number 6 using ehci-pci

--
--
Salutacions...Josep
--

2013-08-10 19:48:28

by Thomas Richter

Subject: Re: [HANG] Trouble with NEC-based USB adapter in PCMCIA slot on E7110

On 10.08.2013 14:04, Alan Stern wrote:
> On Sat, 10 Aug 2013, Thomas Richter wrote:

>
>> What about other usb modules? I would
>> probably need a short recap on through which modules the user data flows
>> before the write is triggered, and which modules are involved in the
>> configuration of the NEC ehci host adapter.
>
> Start with just the ones you have identified. If that doesn't fix the
> problem then we can look further.

Sorry, need to look further. I replaced ehci-pci from 2.6.31.14, and
yenta_socket as well. No changes were necessary. Also replaced ehci-hcd,
but that required two minor changes because in 2.6.32, apparently, one
hardware related structure was pulled out of the ehci structure.

Still did not make any difference, still locks up.

What's the next module up in the call chain I should be looking at?

Greetings,
Thomas

2013-08-10 19:52:24

by Thomas Richter

Subject: Re: [HANG] Trouble with NEC-based USB adapter in PCMCIA slot on E7110

Hi Josep,

> Have you tried waiting for 5 minutes? It could also be related to
> what happens with my kernel 3.11.0-rc4 (and rc3) and USB.
>
> I plug in the drive. Only after 5 minutes of high-speed USB bus resets does it mount.

No, currently I don't even depend on gnome mounting it. I don't run a
desktop. I just mount it manually in the single user mode with "mount"
(eliminates variables that is). If I mount it, it first looks ok. Then I
write a file on it. Still looks ok. Then I call "sync". That finally
locks it up with the 2.6.32 kernel, not on 2.6.31. For the .32, it just
sits there, after probably twenty seconds, it times out, complains that
it can't write, then dies away. Works just nicely with the .31 kernel.

Thus, at this time, I guess I would need to know more about how a write
reaches the ehci module, and how the configuration from there works.

Sorry, but this isn't quite as simple as I thought.

Thanks,
Thomas

2013-08-10 19:58:30

by Josep Lladonosa

Subject: Re: [HANG] Trouble with NEC-based USB adapter in PCMCIA slot on E7110

On 10 August 2013 21:52, Thomas Richter <[email protected]> wrote:
> Hi Josep,
>
>
>> Have you tried waiting for 5 minutes? It could also be related to
>> what happens with my kernel 3.11.0-rc4 (and rc3) and USB.
>>
>> I plug in the drive. Only after 5 minutes of high-speed USB bus resets
>> does it mount.
>
>
> No, currently I don't even depend on gnome mounting it. I don't run a
> desktop. I just mount it manually in the single user mode with "mount"
> (eliminates variables that is). If I mount it, it first looks ok. Then I
> write a file on it. Still looks ok. Then I call "sync". That finally locks
> it up with the 2.6.32 kernel, not on 2.6.31. For the .32, it just sits
> there, after probably twenty seconds, it times out, complains that it can't
> write, then dies away. Works just nicely with the .31 kernel.
>

Perhaps both cases are related, and if you waited after those 5
minutes of resets, your pen would work fine...
Have you checked your /var/log/messages after insertion of pen?

My automount is under XFCE, not Gnome... ;)

Josep


> Thus, at this time, I guess I would need to know more about how a write
> reaches the ehci module, and how the configuration from there works.
>
> Sorry, but this isn't quite as simple as I thought.
>
> Thanks,
> Thomas



--
--
Salutacions...Josep
--

2013-08-10 20:03:50

by Josep Lladonosa

Subject: Re: [HANG] Trouble with NEC-based USB adapter in PCMCIA slot on E7110

On 10 August 2013 21:58, Josep Lladonosa <[email protected]> wrote:
> On 10 August 2013 21:52, Thomas Richter <[email protected]> wrote:
>> Hi Josep,
>>
>>
>>> Have you tried waiting for 5 minutes? It could also be related to
>>> what happens with my kernel 3.11.0-rc4 (and rc3) and USB.
>>>
>>> I plug in the drive. Only after 5 minutes of high-speed USB bus resets
>>> does it mount.

I should add that some USB pendrives mount immediately and some others don't...


>>
>>
>> No, currently I don't even depend on gnome mounting it. I don't run a
>> desktop. I just mount it manually in the single user mode with "mount"
>> (eliminates variables that is). If I mount it, it first looks ok. Then I
>> write a file on it. Still looks ok. Then I call "sync". That finally locks
>> it up with the 2.6.32 kernel, not on 2.6.31. For the .32, it just sits
>> there, after probably twenty seconds, it times out, complains that it can't
>> write, then dies away. Works just nicely with the .31 kernel.
>>
>
> Perhaps both cases are related, and if you waited after those 5
> minutes of resets, your pen would work fine...
> Have you checked your /var/log/messages after insertion of pen?
>
> My automount is under XFCE, not Gnome... ;)
>
> Josep
>
>
>> Thus, at this time, I guess I would need to know more about how a write
>> reaches the ehci module, and how the configuration from there works.
>>
>> Sorry, but this isn't quite as simple as I thought.
>>
>> Thanks,
>> Thomas
>
>
>
> --
> --
> Salutacions...Josep
> --



--
--
Salutacions...Josep
--

2013-08-10 20:05:07

by Thomas Richter

Subject: Re: [HANG] Trouble with NEC-based USB adapter in PCMCIA slot on E7110

Hi Josep,

>
> Perhaps both cases are related, and if you waited after those 5
> minutes of resets, your pen would work fine...
> Have you checked your /var/log/messages after insertion of pen?
>
> My automount is under XFCE, not Gnome... ;)

Unclear. Messages I get seem to make sense. First, it detects the
insertion, then after five seconds or so, it creates the device file. If
I mount this, I get the problem.

So long,
Thomas

2013-08-10 20:05:57

by Thomas Richter

Subject: Re: [HANG] Trouble with NEC-based USB adapter in PCMCIA slot on E7110

On 10.08.2013 22:03, Josep Lladonosa wrote:

>
> I should add that some USB pendrives mount immediately and some others don't...

Nope, all the same here... None works. Pen drive, hard disk, does not
matter.

Greetings,
Thomas

2013-08-11 01:40:40

by Alan Stern

Subject: Re: [HANG] Trouble with NEC-based USB adapter in PCMCIA slot on E7110

On Sat, 10 Aug 2013, Josep Lladonosa wrote:

> Hello,
>
> Have you tried waiting for 5 minutes? It could also be related to
> what happens with my kernel 3.11.0-rc4 (and rc3) and USB.
>
> I plug in the drive. Only after 5 minutes of high-speed USB bus resets does it mount.

Your experience is not related to Thomas's problem. It can be
fixed by applying this patch:

http://marc.info/?l=linux-usb&m=137523956310060&w=2

Alan Stern

2013-08-11 01:46:38

by Alan Stern

Subject: Re: [HANG] Trouble with NEC-based USB adapter in PCMCIA slot on E7110

On Sat, 10 Aug 2013, Thomas Richter wrote:

> Sorry, need to look further. I replaced ehci-pci from 2.6.31.14, and
> yenta_socket as well. No changes were necessary. Also replaced ehci-hcd,

Do you mean that you took the source code for ehci-hcd from the 2.6.31
kernel, copied it into the source directory for the 2.6.32 kernel, and
built it there?

> but that required two minor changes because in 2.6.32, apparently, one
> hardware related structure was pulled out of the ehci structure.
>
> Still did not make any difference, still locks up.
>
> What's the next module up in the call chain I should be looking at?

Have you tried comparing a usbmon trace from 2.6.31 with a comparable
trace from 2.6.32? I doubt there will be any important differences,
but you never know until you try.

If the two traces are the same, that rules out everything in the USB
and SCSI stacks except for ehci-hcd, and you have already ruled that
out.

Alan Stern

2013-08-11 08:51:15

by Thomas Richter

Subject: Re: [HANG] Trouble with NEC-based USB adapter in PCMCIA slot on E7110

On 11.08.2013 03:46, Alan Stern wrote:
> On Sat, 10 Aug 2013, Thomas Richter wrote:
>
>> Sorry, need to look further. I replaced ehci-pci from 2.6.31.14, and
>> yenta_socket as well. No changes were necessary. Also replaced ehci-hcd,
>
> Do you mean that you took the source code for ehci-hcd from the 2.6.31
> kernel, copied it into the source directory for the 2.6.32 kernel, and
> built it there?

Exactly that, yes. Two minor patches were necessary in ehci-hcd, but I
doubt that this is the source of the problem (just one administration
structure was pulled out in 2.6.32).

>> but that required two minor changes because in 2.6.32, apparently, one
>> hardware related structure was pulled out of the ehci structure.
>>
>> Still did not make any difference, still locks up.
>>
>> What's the next module up in the call chain I should be looking at?
>
> Have you tried comparing a usbmon trace from 2.6.31 with a comparable
> trace from 2.6.32? I doubt there will be any important differences,
> but you never know until you try.

Not yet, good idea, thanks.

Thanks,
Thomas

2013-08-11 13:27:20

by Thomas Richter

Subject: Re: [HANG] Trouble with NEC-based USB adapter in PCMCIA slot on E7110

On 11.08.2013 03:46, Alan Stern wrote:
> On Sat, 10 Aug 2013, Thomas Richter wrote:
>
>> Sorry, need to look further. I replaced ehci-pci from 2.6.31.14, and
>> yenta_socket as well. No changes were necessary. Also replaced ehci-hcd,
>
> Do you mean that you took the source code for ehci-hcd from the 2.6.31
> kernel, copied it into the source directory for the 2.6.32 kernel, and
> built it there?

In the meantime, I replaced the entire usb subsystem of 2.6.32.1
with that of 2.6.31.14, plus minor patches, but to no avail. Still hangs. So
it's unlikely to be someplace in usb. I'll try the PCMCIA
subsystem next... )-:

Greetings,
Thomas

2013-08-11 17:24:45

by Thomas Richter

Subject: [PCMCIA] Solved: No USB 2.0 (ehci) in PCMCIA slot on E7110

Dear Alan, Josep, David, dear PCMCIA maintainers,

this is a follow-up to a hang I reported earlier when using a Delock
(NEC-based) PCMCIA card in an O2Micro slot of an oldish E7110 Fujitsu
laptop. The phenomenon is that read accesses through the ehci interface
of the USB host adapter are unreliable, and write accesses are outright
impossible and create hangs and bad data being transmitted. The
miscommunication on the USB port is so bad that it erased the super
block of an external hard drive I have, thus causing loss of files on
this drive. BUMMER!

While I searched for a long time in the usb subsystem, I found that this
is actually innocent of the hang, and the trouble is rather in the
pcmcia system.

The error was introduced in kernel 2.6.32, 2.6.31 worked fine. After a
lot of bisection work, I finally found the source of the error.
The trouble is in drivers/pcmcia/o2micro.h, line 137 as of the linux
3.10.5 kernel. It's likely in the close vicinity for the current
development kernel:

    case PCI_DEVICE_ID_O2_6729:
    case PCI_DEVICE_ID_O2_6730:
    case PCI_DEVICE_ID_O2_6812:
    case PCI_DEVICE_ID_O2_6832:
    case PCI_DEVICE_ID_O2_6836:
    case PCI_DEVICE_ID_O2_6933: /* <--- this one! */
        use_speedup = false;
        break;

This line disables read prefetch and write bursting of the O2 PCMCIA
bridge chip found in this laptop. PCMCIA communication is then
entirely CPU-driven, and simply too slow to generate signals that
conform to the ehci protocol. ohci (USB 1.1) works. Note that this is
only a 1.7 GHz Pentium-4-M dinosaur.
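
What "disabling" means at the register level can be sketched as follows. This is a user-space model, not the driver code: the two bit values are assumptions recalled from drivers/pcmcia/o2micro.h and should be verified against the actual header, and the function name is made up for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed bit values; check drivers/pcmcia/o2micro.h before relying on them. */
#define O2_RES_READ_PREFETCH	0x02
#define O2_RES_WRITE_BURST	0x08

/*
 * Compute the new value of the bridge's "reserved" config register:
 * set both speedup bits when use_speedup is true, clear both otherwise,
 * and leave all other bits untouched.
 */
static uint8_t o2_apply_speedup(uint8_t reg, bool use_speedup)
{
	const uint8_t mask = O2_RES_READ_PREFETCH | O2_RES_WRITE_BURST;

	return use_speedup ? (uint8_t)(reg | mask) : (uint8_t)(reg & ~mask);
}
```

With `use_speedup = false`, as forced by the case label above, both bits end up cleared no matter what the BIOS had programmed.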

I understand the care necessary to disable this for some broken
machines, but on the other hand on my machine write bursting does work,
apparently, and *not* enabling it broke the communications so badly that
it harmed data integrity on an external disk.

Thus, unconditionally disabling write bursts on the mentioned O2 bridge
is not the right action. Probably the bridge driver should allow its
users to query which speeds are available, and either disable the
higher-speed protocols like ehci, or include some kind of test, yet to be
determined, that would check whether the chip *is* actually
broken. As said, it isn't here.

As a short-term solution, I would recommend exposing this as a
configuration item in the kernel configuration dialog so I could build
my kernel in the correct way, without running the risk of ruining
data on external devices. (-;

For your information, this is the lspci output of the (un-)affected O2 card:
02:0a.0 CardBus bridge: O2 Micro, Inc. OZ6933/711E1 CardBus/SmartCardBus Controller (rev 02)
02:0a.1 CardBus bridge: O2 Micro, Inc. OZ6933/711E1 CardBus/SmartCardBus Controller (rev 02)

or numeric:

02:0a.0 0607: 1217:6933 (rev 02)
02:0a.1 0607: 1217:6933 (rev 02)

What is also probably helpful is the output of dmidecode to identify the
product:

Handle 0x0001, DMI type 1, 25 bytes
System Information
    Manufacturer: FUJITSU SIEMENS
    Product Name: LIFEBOOK E Series
    Version:
    Serial Number: YBSN003702
    UUID: D1552D78-AB36-11D6-8B14-00E000AE4E45
    Wake-up Type: Power Switch

Handle 0x0002, DMI type 2, 8 bytes
Base Board Information
    Manufacturer: FUJITSU
    Product Name: FJNB15C
    Version:
    Serial Number:

Handle 0x0021, DMI type 9, 13 bytes
System Slot Information
    Designation: CardBus-1
    Type: 32-bit PC Card (PCMCIA)
    Current Usage: Unknown
    Length: Other
    ID: Adapter 0, Socket 0
    Characteristics:
        5.0 V is provided
        3.3 V is provided
        PC Card-16 is supported
        Cardbus is supported
        Modem ring resume is supported
        PME signal is supported

Handle 0x0022, DMI type 9, 13 bytes
System Slot Information
    Designation: CardBus-2
    Type: 32-bit PC Card (PCMCIA)
    Current Usage: Unknown
    Length: Other
    ID: Adapter 1, Socket 0
    Characteristics:
        5.0 V is provided
        3.3 V is provided
        PC Card-16 is supported
        Cardbus is supported
        Modem ring resume is supported
        PME signal is supported

(yes, there are two pcmcia slots here)

Maybe that helps as an identification, too, before turning
write-bursting off.

Thanks, and please let me know if I can be of any further help to
resolve the problem. It's currently successfully patched away for my
kernel, but that's not a good solution.

Greetings,
Thomas

2013-08-12 21:25:10

by Andreas Mohr

Subject: Re: [PCMCIA] Solved: No USB 2.0 (ehci) in PCMCIA slot on E7110

Hi,

that means we're talking about this one (have you had a look at Bugzilla #15014?
Might be useful...):

commit 35169529093be3bbef70afd3c4125e35cece7e03
Author: Wolfram Sang <[email protected]>
Date: Sun Jan 10 09:41:24 2010 +0100

pcmcia/yenta: add module parameter for O2 speedups

O2-bridges can do read prefetch and write burst. However, for some combinations
of older bridges and cards, this causes problems, so it is disabled for those
bridges. Now, as some users know their setup works with the speedups enabled, a
new parameter is introduced to the driver. Now, a user can specifically enable
or disable these features, while the default is what we have today: detect the
bridge and decide accordingly. Fixes Bugzilla entry 15014.

Simplify and unify the printouts, fix a whitespace issue while we are here.

Signed-off-by: Wolfram Sang <[email protected]>
Tested-by: [email protected]
[[email protected]: whitespace fixes]
Signed-off-by: Dominik Brodowski <[email protected]>


And my

git log -p drivers/pcmcia/o2micro.h

also mentions

commit 1ff84890b62b20823b3697a6041bbec1b5280cee
Author: Tomas Kovacik <[email protected]>
Date: Sun Jul 26 22:04:58 2009 +0200

pcmcia: disable prefetch/burst for OZ6933

Problems have been reported [1], so disable prefetch/burst, to be on the safe
side.

[1] http://www.mail-archive.com/[email protected]/msg02048.html

Signed-off-by: Tomáš Kováčik <[email protected]>
Signed-off-by: Wolfram Sang <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>



A useful idea would be to kindly ask Tomáš Kováčik (CC'd) about details of his system
(4 years later - but hopefully...), specifically lspci -vv -xxx or some such.
The revision of that controller in particular might be *different*,
so perhaps only *some* 6933 revisions are affected, whereas newer parts
carrying the same chipset ID may already have been corrected.
Or quite likely there's a sufficiently detailed lspci log of that hardware
out on the internet somewhere...


Note that there's the comment
"for some bridges it is at 0x94, for others at 0xD4. it's
* ok to write to both registers on all O2 bridges."
Yet our 6933 support lines were added *later*,
so there's a faint possibility that the compatibility statement
actually does not apply to this chipset.



And of course there remains the question *why* such slow communication
would then cause such severe USB HC communication trouble.
There might be some safeguard missing there as well...


And kudos to the patch submitters for having supplied
such nicely detailed commit logs!
(although mentioning the title of URLs probably would have been even better)



> Note that this is only a 1.7Ghz Pentium-4-M dinosaur.

That means you really don't want to know which kinds of machines I am using ;)
(yes, I'm sitting at a CardBus box here, too)
((TI CardBus controller))


Greetings,

Andreas Mohr (KA/S)

2013-08-13 06:03:10

by Tomas Kovacik

Subject: Re: [PCMCIA] Solved: No USB 2.0 (ehci) in PCMCIA slot on E7110

hi everyone,

I don't have that old laptop anymore, but as you can see in the lspci logs,
it had an ALi chipset, so it was a really old beast :). Anyway, if the O2
bridge is used in a newer computer, maybe it's fixed and a wifi card
(Atheros) with WPA2 will work. Try it :)

t.
