On Sun, 14 Nov 2021 16:14:35 -0500
[email protected] wrote:
> On Sun, Nov 14, 2021 at 02:48:42PM -0500, David Niklas wrote:
> > Hello,
> >
> > Almost 1 year ago to the date I posted a question regarding
> > connecting my UPS to my PC (look for the same subject -ping). The
> > input subsystem maintainers, Jiri Kosina or Ben Tissoires were asked
> > to get back to me. No one ever did though.
> >
> > Now my UPS is *not* working correctly, and I'd really really like to
> > be able to speak to it with my PC so I can (hopefully) figure out
> > what's wrong with it.
> >
> > I'm running Linux Kernel 5.15.X on a Devuan (Debian) system. I
> > connected my UPS (OPTI-UPS Thunder Shield TS2250B) via USB cable and
> > got (almost the same as last time...):
> > [ 4236.165138] usb 3-2: new low-speed USB device number 2 using
> > xhci_hcd [ 4236.325178] usb 3-2: New USB device found, idVendor=0d9f,
> > idProduct=0004, bcdDevice= 0.02 [ 4236.325183] usb 3-2: New USB
> > device strings: Mfr=3, Product=1, SerialNumber=2 [ 4236.325185] usb
> > 3-2: Product: HID UPS Battery [ 4236.325187] usb 3-2: Manufacturer:
> > POWERCOM Co.,LTD [ 4236.325188] usb 3-2: SerialNumber: 004-0D9F-000
> > [ 4236.423210] hid-generic 0003:0D9F:0004.000B: hiddev3,hidraw8: USB
> > HID v1.00 Device [POWERCOM Co.,LTD HID UPS Battery] on
> > usb-0000:11:00.3-2/input0 [ 4239.842223] usb 3-2: USB disconnect,
> > device number 2 [ 4242.485126] usb 3-2: new low-speed USB device
> > number 3 using xhci_hcd [ 4242.645075] usb 3-2: New USB device found,
> > idVendor=0d9f, idProduct=0004, bcdDevice= 0.02 [ 4242.645080] usb
> > 3-2: New USB device strings: Mfr=3, Product=1, SerialNumber=2
> > [ 4242.645082] usb 3-2: Product: HID UPS Battery [ 4242.645084] usb
> > 3-2: Manufacturer: POWERCOM Co.,LTD [ 4242.645085] usb 3-2:
> > SerialNumber: 004-0D9F-000 [ 4242.727148] hid-generic
> > 0003:0D9F:0004.000C: hiddev3,hidraw8: USB HID v1.00 Device [POWERCOM
> > Co.,LTD HID UPS Battery] on usb-0000:11:00.3-2/input0 [ 4246.135926]
> > usb 3-2: USB disconnect, device number 3 [ 4248.781839] usb 3-2: new
> > low-speed USB device number 4 using xhci_hcd [ 4248.942099] usb 3-2:
> > New USB device found, idVendor=0d9f, idProduct=0004, bcdDevice= 0.02
> > [ 4248.942104] usb 3-2: New USB device strings: Mfr=3, Product=1,
> > SerialNumber=2 [ 4248.942106] usb 3-2: Product: HID UPS Battery
> > [ 4248.942108] usb 3-2: Manufacturer: POWERCOM Co.,LTD [ 4248.942109]
> > usb 3-2: SerialNumber: 004-0D9F-000 [ 4249.031166] hid-generic
> > 0003:0D9F:0004.000D: hiddev3,hidraw8: USB HID v1.00 Device [POWERCOM
> > Co.,LTD HID UPS Battery] on usb-0000:11:00.3-2/input0 [ 4252.511996]
> > usb 3-2: USB disconnect, device number 4 ...
> >
> > I'd appreciate any advice trying to get my UPS to stay connected and
> > not spam the kernel log. I'd like to have nut or apcupsd talk to it.
>
> It would help to see the details of the communication between the
> computer and the UPS. You can collect a usbmon trace by running (as
> superuser):
>
> cat /sys/kernel/debug/usb/usbmon/3u >mon.txt
>
> before plugging in the UPS cable. (The 3 comes from the bus number of
> the USB port in the log above.) Kill the process after the cable has
> been plugged in long enough for a couple of these disconnect messages
> to appear, and post the output file to the mailing list.
>
> Alan Stern
>
Attached is the info.
In case someone doesn't realize, I'm subscribed to the linux-kernel ML,
but not input or usb. Please CC me if you don't want to spam the LKML.
Thanks for the timely response,
David
On Sun, Nov 14, 2021 at 10:02:22PM -0500, David Niklas wrote:
> On Sun, 14 Nov 2021 16:14:35 -0500
> [email protected] wrote:
> > On Sun, Nov 14, 2021 at 02:48:42PM -0500, David Niklas wrote:
> > > Hello,
> > >
> > > Almost 1 year ago to the date I posted a question regarding
> > > connecting my UPS to my PC (look for the same subject -ping). The
> > > input subsystem maintainers, Jiri Kosina or Ben Tissoires were asked
> > > to get back to me. No one ever did though.
> > >
> > > Now my UPS is *not* working correctly, and I'd really really like to
> > > be able to speak to it with my PC so I can (hopefully) figure out
> > > what's wrong with it.
> > >
> > > I'm running Linux Kernel 5.15.X on a Devuan (Debian) system. I
> > > connected my UPS (OPTI-UPS Thunder Shield TS2250B) via USB cable and
> > > got (almost the same as last time...):
> > > [ 4236.165138] usb 3-2: new low-speed USB device number 2 using
> > > xhci_hcd [ 4236.325178] usb 3-2: New USB device found, idVendor=0d9f,
> > > idProduct=0004, bcdDevice= 0.02 [ 4236.325183] usb 3-2: New USB
> > > device strings: Mfr=3, Product=1, SerialNumber=2 [ 4236.325185] usb
> > > 3-2: Product: HID UPS Battery [ 4236.325187] usb 3-2: Manufacturer:
> > > POWERCOM Co.,LTD [ 4236.325188] usb 3-2: SerialNumber: 004-0D9F-000
> > > [ 4236.423210] hid-generic 0003:0D9F:0004.000B: hiddev3,hidraw8: USB
> > > HID v1.00 Device [POWERCOM Co.,LTD HID UPS Battery] on
> > > usb-0000:11:00.3-2/input0 [ 4239.842223] usb 3-2: USB disconnect,
> > > device number 2 [ 4242.485126] usb 3-2: new low-speed USB device
> > > number 3 using xhci_hcd [ 4242.645075] usb 3-2: New USB device found,
> > > idVendor=0d9f, idProduct=0004, bcdDevice= 0.02 [ 4242.645080] usb
> > > 3-2: New USB device strings: Mfr=3, Product=1, SerialNumber=2
> > > [ 4242.645082] usb 3-2: Product: HID UPS Battery [ 4242.645084] usb
> > > 3-2: Manufacturer: POWERCOM Co.,LTD [ 4242.645085] usb 3-2:
> > > SerialNumber: 004-0D9F-000 [ 4242.727148] hid-generic
> > > 0003:0D9F:0004.000C: hiddev3,hidraw8: USB HID v1.00 Device [POWERCOM
> > > Co.,LTD HID UPS Battery] on usb-0000:11:00.3-2/input0 [ 4246.135926]
> > > usb 3-2: USB disconnect, device number 3 [ 4248.781839] usb 3-2: new
> > > low-speed USB device number 4 using xhci_hcd [ 4248.942099] usb 3-2:
> > > New USB device found, idVendor=0d9f, idProduct=0004, bcdDevice= 0.02
> > > [ 4248.942104] usb 3-2: New USB device strings: Mfr=3, Product=1,
> > > SerialNumber=2 [ 4248.942106] usb 3-2: Product: HID UPS Battery
> > > [ 4248.942108] usb 3-2: Manufacturer: POWERCOM Co.,LTD [ 4248.942109]
> > > usb 3-2: SerialNumber: 004-0D9F-000 [ 4249.031166] hid-generic
> > > 0003:0D9F:0004.000D: hiddev3,hidraw8: USB HID v1.00 Device [POWERCOM
> > > Co.,LTD HID UPS Battery] on usb-0000:11:00.3-2/input0 [ 4252.511996]
> > > usb 3-2: USB disconnect, device number 4 ...
> > >
> > > I'd appreciate any advice trying to get my UPS to stay connected and
> > > not spam the kernel log. I'd like to have nut or apcupsd talk to it.
> >
> > It would help to see the details of the communication between the
> > computer and the UPS. You can collect a usbmon trace by running (as
> > superuser):
> >
> > cat /sys/kernel/debug/usb/usbmon/3u >mon.txt
> >
> > before plugging in the UPS cable. (The 3 comes from the bus number of
> > the USB port in the log above.) Kill the process after the cable has
> > been plugged in long enough for a couple of these disconnect messages
> > to appear, and post the output file to the mailing list.
> >
> > Alan Stern
> >
>
> Attached is the info.
>
> In case someone doesn't realize, I'm subscribed to the linux-kernel ML,
> but not input or usb. Please CC me if you don't want to spam the LKML.
Has this device ever worked with any version of Linux?
The usbmon trace is not terribly revealing. Here are the important
parts from one connect/disconnect cycle. (FYI, the second column of
each line is a timestamp in microseconds.)
ffff93eaa3edad80 2135760066 S Ci:3:001:0 s a3 00 0000 0003 0004 4 <
ffff93eaa3edad80 2135760074 C Ci:3:001:0 0 4 = 01030000
The kernel learns that there is a connection on port 3 of bus 3.
ffff93eaa3edad80 2135760113 S Co:3:001:0 s 23 03 0004 0003 0000 0
ffff93eaa3edad80 2135760117 C Co:3:001:0 0 0
The kernel tells the USB host controller to reset the port.
ffff93eaa3edad80 2135826729 S Ci:3:001:0 s a3 00 0000 0003 0004 4 <
ffff93eaa3edad80 2135826739 C Ci:3:001:0 0 4 = 03031000
The kernel sees that the reset is complete and the port is enabled.
ffff93eaa3edad80 2135826742 S Co:3:001:0 s 23 01 0014 0003 0000 0
ffff93eaa3edad80 2135826744 C Co:3:001:0 0 0
The kernel clears the "reset complete" status flag.
ffff93eaa3edad80 2135883456 S Ci:3:000:0 s 80 06 0100 0000 0040 64 <
ffff93eaa3edad80 2135885736 C Ci:3:000:0 0 18 = 12011001 00000008 9f0d0400 02000301 0201
The computer reads the device descriptor for the first time.
ffff93eaa3edad80 2135885741 S Co:3:001:0 s 23 03 0004 0003 0000 0
ffff93eaa3edad80 2135885746 C Co:3:001:0 0 0
ffff93eaa3edad80 2135950054 S Ci:3:001:0 s a3 00 0000 0003 0004 4 <
ffff93eaa3edad80 2135950060 C Ci:3:001:0 0 4 = 03031000
ffff93eaa3edad80 2135950063 S Co:3:001:0 s 23 01 0014 0003 0000 0
ffff93eaa3edad80 2135950066 C Co:3:001:0 0 0
Another reset, reset complete, clear "reset complete" flag series.
This is the normal sequence used every time a device gets connected.
ffff93eaa3edad80 2136023391 S Ci:3:009:0 s 80 06 0100 0000 0012 18 <
ffff93eaa3edad80 2136025736 C Ci:3:009:0 0 18 = 12011001 00000008 9f0d0400 02000301 0201
The kernel reads the device descriptor a second time.
ffff93eaa3edad80 2136025745 S Ci:3:009:0 s 80 06 0200 0000 0009 9 <
ffff93eaa3edad80 2136028735 C Ci:3:009:0 0 9 = 09022200 010100a0 32
The kernel reads the device's configuration descriptor.
ffff93eaa3edad80 2136028738 S Ci:3:009:0 s 80 06 0200 0000 0022 34 <
ffff93eaa3edad80 2136031810 C Ci:3:009:0 0 34 = 09022200 010100a0 32090400 00010300 00000921 00010001 22e40307 05810308
The kernel reads the device's configuration, interface, and endpoint
descriptors.
ffff93eaa3edad80 2136031818 S Ci:3:009:0 s 80 06 0300 0000 00ff 255 <
ffff93eaa3edad80 2136034734 C Ci:3:009:0 0 6 = 06030904 0408
ffff93eaa3edad80 2136034738 S Ci:3:009:0 s 80 06 0301 0409 00ff 255 <
ffff93eaa3edad80 2136037799 C Ci:3:009:0 0 32 = 20034800 49004400 20005500 50005300 20004200 61007400 74006500 72007900
ffff93eaa3edad80 2136037803 S Ci:3:009:0 s 80 06 0303 0409 00ff 255 <
ffff93eaa3edad80 2136040810 C Ci:3:009:0 0 34 = 22035000 4f005700 45005200 43004f00 4d002000 43006f00 2e002c00 4c005400
ffff93eaa3edad80 2136040816 S Ci:3:009:0 s 80 06 0302 0409 00ff 255 <
ffff93eaa3edad80 2136043734 C Ci:3:009:0 0 26 = 1a033000 30003400 2d003000 44003900 46002d00 30003000 3000
The kernel reads several string descriptors. If you're good at
translating UTF-16LE codes from hex to normal characters, you can find
the truncated strings "HID UPS Battery", "POWERCOM Co.,LT", and
"004-0D9F-000" that make up the vendor, product (slightly truncated),
and serial number string descriptors.
Reading these descriptors is part of the normal enumeration process
that happens whenever a USB device is plugged in.
ffff93eaa3edad80 2136080671 S Co:3:009:0 s 00 09 0001 0000 0000 0
ffff93eaa3edad80 2136081734 C Co:3:009:0 0 0
The kernel tells the device to install its first (and in this case,
only) configuration. Also very normal, and necessary for proper
operation.
ffff93eaa3edad80 2136081778 S Ci:3:009:0 s 80 06 0302 0409 00ff 255 <
ffff93eaa3edad80 2136084734 C Ci:3:009:0 0 26 = 1a033000 30003400 2d003000 44003900 46002d00 30003000 3000
The kernel reads the serial number string descriptor descriptor again.
ffff93eaa3edad80 2136084747 S Co:3:009:0 s 21 0a 0000 0000 0000 0
ffff93eaa3edad80 2136086733 C Co:3:009:0 0 0
The kernel sends a Set-Idle request to the device, telling it not to
send any data reports when nothing has changed. This is done
automatically by the usbhid driver for every USB HID device, including
keyboards and mice as well as your UPS.
ffff93eaa3edad80 2136086737 S Ci:3:009:0 s 81 06 2200 0000 03e4 996 <
ffff93eaa3edad80 2136122734 C Ci:3:009:0 0 996 = 05840904 a1010924 a1028501 09fe7901 75089501 150026ff 00b12285 0209ff79
The kernel reads the device's HID descriptor. (The usbmon trace shows
only the first 32 bytes of the 996-byte descriptor.) Again, this is
normal and necessary for using any HID device.
ffff93e482efb440 2139520170 C Ii:3:001:1 0:2048 1 = 08
ffff93e482efb440 2139520180 S Ii:3:001:1 -115:2048 4 <
At this point the USB controller tells the kernel that there has been a
status change on port 3 of bus 3.
ffff93eaa2ff8240 2139520188 S Ci:3:001:0 s a3 00 0000 0003 0004 4 <
ffff93eaa2ff8240 2139520197 C Ci:3:001:0 0 4 = 00010100
The kernel reads the port's status and sees that there is a "connection
status change" bit set and the port is no longer connected. In other
words, the UPS device has disconnected itself electronically from the
USB bus.
ffff93eaa2ff8240 2139520200 S Co:3:001:0 s 23 01 0010 0003 0000 0
ffff93eaa2ff8240 2139520203 C Co:3:001:0 0 0
The kernel clears the "connection status change" flag. Following this
the cycle repeats.
Out of all this information, the only conclusion I can draw is that the
UPS is not behaving like a normal device. One possibility is that it
doesn't like the Set-Idle request (although if that is the case, why
did it remain connected long enough to send the HID descriptor?).
You can test the theory by patching the kernel, if you want. The code
to change is in the source file drivers/hid/usbhid/hid-core.c, and the
function in question is hid_set_idle() located around line 659 in the
file. Just change the statement:
return usb_control_msg(dev, usb_sndctrlpipe(dev, 0),
HID_REQ_SET_IDLE, USB_TYPE_CLASS | USB_RECIP_INTERFACE, (idle << 8) | report,
ifnum, NULL, 0, USB_CTRL_SET_TIMEOUT);
to:
return 0;
to prevent the Set-Idle request from being sent. If the device still
insists on disconnecting then we'll know that this wasn't the reason.
Also, if you have another system (say, one running Windows) which the
UPS does work properly with, you could try collecting the equivalent of
a usbmon trace from that system for purposes of comparison. (On
Windows, I believe you can use Wireshark to trace USB communications.)
Alan Stern
On Mon, 15 Nov 2021 11:09:18 -0500
[email protected] wrote:
> On Sun, Nov 14, 2021 at 10:02:22PM -0500, David Niklas wrote:
> >
<snip>
> Has this device ever worked with any version of Linux?
I am unaware of this device working on any version of Linux. I assumed
when I bought it that it would a least be able to connect to the USB port
correctly. I might then have to make a contribution to add support for it
to nut or apcupsd.
<snip>
> The kernel sends a Set-Idle request to the device, telling it not to
> send any data reports when nothing has changed. This is done
> automatically by the usbhid driver for every USB HID device, including
> keyboards and mice as well as your UPS.
>
> ffff93eaa3edad80 2136086737 S Ci:3:009:0 s 81 06 2200 0000 03e4 996 <
> ffff93eaa3edad80 2136122734 C Ci:3:009:0 0 996 = 05840904 a1010924
> a1028501 09fe7901 75089501 150026ff 00b12285 0209ff79
>
> The kernel reads the device's HID descriptor. (The usbmon trace shows
> only the first 32 bytes of the 996-byte descriptor.) Again, this is
> normal and necessary for using any HID device.
>
> ffff93e482efb440 2139520170 C Ii:3:001:1 0:2048 1 = 08
> ffff93e482efb440 2139520180 S Ii:3:001:1 -115:2048 4 <
>
> At this point the USB controller tells the kernel that there has been a
> status change on port 3 of bus 3.
>
> ffff93eaa2ff8240 2139520188 S Ci:3:001:0 s a3 00 0000 0003 0004 4 <
> ffff93eaa2ff8240 2139520197 C Ci:3:001:0 0 4 = 00010100
>
> The kernel reads the port's status and sees that there is a "connection
> status change" bit set and the port is no longer connected. In other
> words, the UPS device has disconnected itself electronically from the
> USB bus.
>
> ffff93eaa2ff8240 2139520200 S Co:3:001:0 s 23 01 0010 0003 0000 0
> ffff93eaa2ff8240 2139520203 C Co:3:001:0 0 0
>
> The kernel clears the "connection status change" flag. Following this
> the cycle repeats.
>
>
> Out of all this information, the only conclusion I can draw is that the
> UPS is not behaving like a normal device. One possibility is that it
> doesn't like the Set-Idle request (although if that is the case, why
> did it remain connected long enough to send the HID descriptor?).
Thanks for the detailed breakdown!
> You can test the theory by patching the kernel, if you want. The code
> to change is in the source file drivers/hid/usbhid/hid-core.c, and the
> function in question is hid_set_idle() located around line 659 in the
> file. Just change the statement:
>
> return usb_control_msg(dev, usb_sndctrlpipe(dev, 0),
> HID_REQ_SET_IDLE, USB_TYPE_CLASS | USB_RECIP_INTERFACE,
> (idle << 8) | report, ifnum, NULL, 0, USB_CTRL_SET_TIMEOUT);
>
> to:
>
> return 0;
>
> to prevent the Set-Idle request from being sent. If the device still
> insists on disconnecting then we'll know that this wasn't the reason.
Will do tomorrow. (I'm busy ATM)
> Also, if you have another system (say, one running Windows) which the
> UPS does work properly with, you could try collecting the equivalent of
> a usbmon trace from that system for purposes of comparison. (On
> Windows, I believe you can use Wireshark to trace USB communications.)
>
> Alan Stern
>
I'll have to look into that.
Thanks again!
David
On Mon, 15 Nov 2021 11:09:18 -0500
[email protected] wrote:
<snip>
> You can test the theory by patching the kernel, if you want. The code
> to change is in the source file drivers/hid/usbhid/hid-core.c, and the
> function in question is hid_set_idle() located around line 659 in the
> file. Just change the statement:
>
> return usb_control_msg(dev, usb_sndctrlpipe(dev, 0),
> HID_REQ_SET_IDLE, USB_TYPE_CLASS | USB_RECIP_INTERFACE,
> (idle << 8) | report, ifnum, NULL, 0, USB_CTRL_SET_TIMEOUT);
>
> to:
>
> return 0;
>
> to prevent the Set-Idle request from being sent. If the device still
> insists on disconnecting then we'll know that this wasn't the reason.
>
Ok, so I changed out the line above with "__panic(2);" and now my PC just
reboots.... Teasing :D
That didn't seem to change anything. I'll attach another dump just in
case it reveals more.
> Also, if you have another system (say, one running Windows) which the
> UPS does work properly with, you could try collecting the equivalent of
> a usbmon trace from that system for purposes of comparison. (On
> Windows, I believe you can use Wireshark to trace USB communications.)
>
Limitations of SW:
Wireshark works if you have windows in a virtual environment, but I don't
actually own... I mean license, any windowz products. I'm a straight
Luser.
So borrowed a windowz machine and plugged in the UPS. I then used USBPcap
to capture the data after installing the drivers. It has 4 things it can't
detect:
Bus states (Suspended, Power ON, Power OFF, Reset, High Speed Detection
Handshake)
Packet ID (PID)
Split transactions (CSPLIT, SSPLIT)
Duration of bus state and time used to transfer packet over the wire
Transfer speed (Low Speed, Full Speed, High Speed)
I'm 100% certain the last 2 we don't care about. IDK about the others.
Notes:
Here's the product page of my UPS.
https://www.newegg.com/opti-ups-ts2250b/p/N82E16842107014
The main webpage for USBPcap is here:
https://desowin.org/usbpcap/index.html
I can also try and use SnoopyPro and busdog if the output is undesirable.
USBPcap spits out a pcap file which can be analyzed by wireshark
using dissectors -- somehow (I really should practice using wireshark.)
Test and capture procedure:
When I installed the drivers it asked me where to look for the UPS. I
didn't tell it the USB port until after I started USBPcap and then
plugged in the UPS. Then the GUI opened up and I could see a lot of cool
controls like the battery power, loading, etc. The loading was 132W and
the battery was at 100%. Then I ran a self test (There's a button in the
GUI) and it worked fine. Then I unplugged the UPS and it crashed. Then I
plugged it back in. All --100%-- of this is in the pcap file.
Results of:
After unplugging the UPS it's battery dropped to 22% and then it turned
off. My UPS is 2y and 5m old. It has a 3Y parts warranty. I guess I'll
see if they'll honor it.
I'm still interested in talking to it via my Linux PC, of course.
Thanks,
David
On Wed, Nov 17, 2021 at 12:23:59AM -0500, David Niklas wrote:
> On Mon, 15 Nov 2021 11:09:18 -0500
> [email protected] wrote:
> <snip>
> > You can test the theory by patching the kernel, if you want. The code
> > to change is in the source file drivers/hid/usbhid/hid-core.c, and the
> > function in question is hid_set_idle() located around line 659 in the
> > file. Just change the statement:
> >
> > return usb_control_msg(dev, usb_sndctrlpipe(dev, 0),
> > HID_REQ_SET_IDLE, USB_TYPE_CLASS | USB_RECIP_INTERFACE,
> > (idle << 8) | report, ifnum, NULL, 0, USB_CTRL_SET_TIMEOUT);
> >
> > to:
> >
> > return 0;
> >
> > to prevent the Set-Idle request from being sent. If the device still
> > insists on disconnecting then we'll know that this wasn't the reason.
> >
>
> Ok, so I changed out the line above with "__panic(2);" and now my PC just
> reboots.... Teasing :D
> That didn't seem to change anything. I'll attach another dump just in
> case it reveals more.
It doesn't. :-( The Set-Idle request does not appear to be related to
the problem.
> > Also, if you have another system (say, one running Windows) which the
> > UPS does work properly with, you could try collecting the equivalent of
> > a usbmon trace from that system for purposes of comparison. (On
> > Windows, I believe you can use Wireshark to trace USB communications.)
> >
>
> Limitations of SW:
> Wireshark works if you have windows in a virtual environment, but I don't
> actually own... I mean license, any windowz products. I'm a straight
> Luser.
> So borrowed a windowz machine and plugged in the UPS. I then used USBPcap
> to capture the data after installing the drivers. It has 4 things it can't
> detect:
>
> Bus states (Suspended, Power ON, Power OFF, Reset, High Speed Detection
> Handshake)
> Packet ID (PID)
> Split transactions (CSPLIT, SSPLIT)
> Duration of bus state and time used to transfer packet over the wire
> Transfer speed (Low Speed, Full Speed, High Speed)
>
> I'm 100% certain the last 2 we don't care about. IDK about the others.
I don't think they matter. In principle the time delays might be
important, but I rather doubt it.
> Notes:
> Here's the product page of my UPS.
> https://www.newegg.com/opti-ups-ts2250b/p/N82E16842107014
> The main webpage for USBPcap is here:
> https://desowin.org/usbpcap/index.html
> I can also try and use SnoopyPro and busdog if the output is undesirable.
> USBPcap spits out a pcap file which can be analyzed by wireshark
> using dissectors -- somehow (I really should practice using wireshark.)
Wireshark on my system has no trouble reading your pcap file.
> Test and capture procedure:
> When I installed the drivers it asked me where to look for the UPS. I
> didn't tell it the USB port until after I started USBPcap and then
> plugged in the UPS. Then the GUI opened up and I could see a lot of cool
> controls like the battery power, loading, etc. The loading was 132W and
> the battery was at 100%. Then I ran a self test (There's a button in the
> GUI) and it worked fine. Then I unplugged the UPS and it crashed. Then I
> plugged it back in. All --100%-- of this is in the pcap file.
I'm just concentrating on the first part, up to the point where the
unwanted disconnects occurred with Linux. So far as I can see, there
are only two significant differences between the usbmon and wireshark
traces:
The Windows system doesn't transfer any of the string
descriptors during initial enumeration, whereas the Linux
system does. While this might be relevant, I don't think it is.
When the Windows system requests the HID report descriptor from
the device, it asks for 1060 bytes of data. The Linux system
asks for only 996 bytes. (Note: The descriptor is exactly
996 bytes long, and that's how much data the device sends in
either case.)
It's entirely possible that this second discrepancy is somehow causing
the problem. You can test this guess by applying the following patch:
--- usb-devel.orig/drivers/hid/usbhid/hid-core.c
+++ usb-devel/drivers/hid/usbhid/hid-core.c
@@ -667,13 +667,16 @@ static int hid_get_class_descriptor(stru
unsigned char type, void *buf, int size)
{
int result, retries = 4;
+ int size2 = size;
+ if (size == 996)
+ size2 = 1060;
memset(buf, 0, size);
do {
result = usb_control_msg(dev, usb_rcvctrlpipe(dev, 0),
USB_REQ_GET_DESCRIPTOR, USB_RECIP_INTERFACE | USB_DIR_IN,
- (type << 8), ifnum, buf, size, USB_CTRL_GET_TIMEOUT);
+ (type << 8), ifnum, buf, size2, USB_CTRL_GET_TIMEOUT);
retries--;
} while (result < size && retries);
return result;
This will cause the kernel to ask for 1060 bytes rather than 996. (It's
also potentially dangerous, because it asks for 1060 bytes to be stored
into a 996-byte buffer; if the device sends more data than expected then
the excess will be written beyond the end of the buffer.)
Please send a usbmon trace showing what happens with this patch applied.
And you might as well put the Set-Idle request back in, because now we
know Windows does send that request.
> Results of:
> After unplugging the UPS it's battery dropped to 22% and then it turned
> off. My UPS is 2y and 5m old. It has a 3Y parts warranty. I guess I'll
> see if they'll honor it.
>
>
>
>
> I'm still interested in talking to it via my Linux PC, of course.
Let's see if the patch will avert the disconnect.
Alan Stern
On Wed, 17 Nov 2021 12:08:17 -0500
Alan Stern <[email protected]> wrote:
> On Wed, Nov 17, 2021 at 12:23:59AM -0500, David Niklas wrote:
> > On Mon, 15 Nov 2021 11:09:18 -0500
> > [email protected] wrote:
> > <snip>
> > > You can test the theory by patching the kernel, if you want. The
> > > code to change is in the source file drivers/hid/usbhid/hid-core.c,
> > > and the function in question is hid_set_idle() located around line
> > > 659 in the file. Just change the statement:
> > >
> > > return usb_control_msg(dev, usb_sndctrlpipe(dev, 0),
> > > HID_REQ_SET_IDLE, USB_TYPE_CLASS |
> > > USB_RECIP_INTERFACE, (idle << 8) | report, ifnum, NULL, 0,
> > > USB_CTRL_SET_TIMEOUT);
> > >
> > > to:
> > >
> > > return 0;
> > >
> > > to prevent the Set-Idle request from being sent. If the device
> > > still insists on disconnecting then we'll know that this wasn't the
> > > reason.
> >
> > Ok, so I changed out the line above with "__panic(2);" and now my PC
> > just reboots.... Teasing :D
> > That didn't seem to change anything. I'll attach another dump just in
> > case it reveals more.
>
> It doesn't. :-( The Set-Idle request does not appear to be related to
> the problem.
>
> > > Also, if you have another system (say, one running Windows) which
> > > the UPS does work properly with, you could try collecting the
> > > equivalent of a usbmon trace from that system for purposes of
> > > comparison. (On Windows, I believe you can use Wireshark to trace
> > > USB communications.)
> >
> > Limitations of SW:
> > Wireshark works if you have windows in a virtual environment, but I
> > don't actually own... I mean license, any windowz products. I'm a
> > straight Luser.
> > So borrowed a windowz machine and plugged in the UPS. I then used
> > USBPcap to capture the data after installing the drivers. It has 4
> > things it can't detect:
> >
> > Bus states (Suspended, Power ON, Power OFF, Reset, High Speed
> > Detection Handshake)
> > Packet ID (PID)
> > Split transactions (CSPLIT, SSPLIT)
> > Duration of bus state and time used to transfer packet over the wire
> > Transfer speed (Low Speed, Full Speed, High Speed)
> >
> > I'm 100% certain the last 2 we don't care about. IDK about the
> > others.
>
> I don't think they matter. In principle the time delays might be
> important, but I rather doubt it.
>
> > Notes:
> > Here's the product page of my UPS.
> > https://www.newegg.com/opti-ups-ts2250b/p/N82E16842107014
> > The main webpage for USBPcap is here:
> > https://desowin.org/usbpcap/index.html
> > I can also try and use SnoopyPro and busdog if the output is
> > undesirable. USBPcap spits out a pcap file which can be analyzed by
> > wireshark using dissectors -- somehow (I really should practice using
> > wireshark.)
>
> Wireshark on my system has no trouble reading your pcap file.
Misunderstanding then. I was thinking in terms of the USBPcap docs. I was
saying a dissector would need to be written. I'm glad it worked for you.
https://desowin.org/usbpcap/dissectors.html
"Writing USB class dissector"
> > Test and capture procedure:
> > When I installed the drivers it asked me where to look for the UPS. I
> > didn't tell it the USB port until after I started USBPcap and then
> > plugged in the UPS. Then the GUI opened up and I could see a lot of
> > cool controls like the battery power, loading, etc. The loading was
> > 132W and the battery was at 100%. Then I ran a self test (There's a
> > button in the GUI) and it worked fine. Then I unplugged the UPS and
> > it crashed. Then I plugged it back in. All --100%-- of this is in the
> > pcap file.
>
> I'm just concentrating on the first part, up to the point where the
> unwanted disconnects occurred with Linux. So far as I can see, there
> are only two significant differences between the usbmon and wireshark
> traces:
>
> The Windows system doesn't transfer any of the string
> descriptors during initial enumeration, whereas the Linux
> system does. While this might be relevant, I don't think it is.
>
> When the Windows system requests the HID report descriptor from
> the device, it asks for 1060 bytes of data. The Linux system
> asks for only 996 bytes. (Note: The descriptor is exactly
> 996 bytes long, and that's how much data the device sends in
> either case.)
>
> It's entirely possible that this second discrepancy is somehow causing
> the problem. You can test this guess by applying the following patch:
>
> --- usb-devel.orig/drivers/hid/usbhid/hid-core.c
> +++ usb-devel/drivers/hid/usbhid/hid-core.c
> @@ -667,13 +667,16 @@ static int hid_get_class_descriptor(stru
> unsigned char type, void *buf, int size)
> {
> int result, retries = 4;
> + int size2 = size;
>
> + if (size == 996)
> + size2 = 1060;
> memset(buf, 0, size);
>
> do {
> result = usb_control_msg(dev, usb_rcvctrlpipe(dev, 0),
> USB_REQ_GET_DESCRIPTOR,
> USB_RECIP_INTERFACE | USB_DIR_IN,
> - (type << 8), ifnum, buf, size,
> USB_CTRL_GET_TIMEOUT);
> + (type << 8), ifnum, buf, size2,
> USB_CTRL_GET_TIMEOUT); retries--;
> } while (result < size && retries);
> return result;
>
> This will cause the kernel to ask for 1060 bytes rather than 996.
> (It's also potentially dangerous, because it asks for 1060 bytes to be
> stored into a 996-byte buffer; if the device sends more data than
> expected then the excess will be written beyond the end of the buffer.)
>
> Please send a usbmon trace showing what happens with this patch
> applied. And you might as well put the Set-Idle request back in,
> because now we know Windows does send that request.
>
<snip>
It still disconnects. I've attached the usbmon output.
Thanks,
David
On Fri, Nov 19, 2021 at 05:19:15PM -0500, David Niklas wrote:
> On Wed, 17 Nov 2021 12:08:17 -0500
> Alan Stern <[email protected]> wrote:
> > On Wed, Nov 17, 2021 at 12:23:59AM -0500, David Niklas wrote:
> > > I can also try and use SnoopyPro and busdog if the output is
> > > undesirable. USBPcap spits out a pcap file which can be analyzed by
> > > wireshark using dissectors -- somehow (I really should practice using
> > > wireshark.)
> >
> > Wireshark on my system has no trouble reading your pcap file.
>
> Misunderstanding then. I was thinking in terms of the USBPcap docs. I was
> saying a dissector would need to be written. I'm glad it worked for you.
> https://desowin.org/usbpcap/dissectors.html
> "Writing USB class dissector"
Evidently wireshark already has built-in dissectors for standard USB
communications.
> > This will cause the kernel to ask for 1060 bytes rather than 996.
> > (It's also potentially dangerous, because it asks for 1060 bytes to be
> > stored into a 996-byte buffer; if the device sends more data than
> > expected then the excess will be written beyond the end of the buffer.)
> >
> > Please send a usbmon trace showing what happens with this patch
> > applied. And you might as well put the Set-Idle request back in,
> > because now we know Windows does send that request.
> >
> <snip>
>
> It still disconnects. I've attached the usbmon output.
The trace clearly shows the request for a 1060-byte HID report
descriptor and the device sending back a 996-byte reply, just like in
Windows. And yet the disconnect still occurs.
The only remaining difference is the transfer of string descriptors.
You can prevent Linux from asking for them by editing usb_string() in
drivers/usb/core/message.c. Just make the function return -ENOMEM
unconditionally. That will stop the requests from going out.
Alan Stern
On Sat, 20 Nov 2021 21:54:20 -0500
Alan Stern <[email protected]> wrote:
> On Fri, Nov 19, 2021 at 05:19:15PM -0500, David Niklas wrote:
> > On Wed, 17 Nov 2021 12:08:17 -0500
> > Alan Stern <[email protected]> wrote:
<snip>
> > > This will cause the kernel to ask for 1060 bytes rather than 996.
> > > (It's also potentially dangerous, because it asks for 1060 bytes to
> > > be stored into a 996-byte buffer; if the device sends more data than
> > > expected then the excess will be written beyond the end of the
> > > buffer.)
> > >
> > > Please send a usbmon trace showing what happens with this patch
> > > applied. And you might as well put the Set-Idle request back in,
> > > because now we know Windows does send that request.
> > >
> > <snip>
> >
> > It still disconnects. I've attached the usbmon output.
>
> The trace clearly shows the request for a 1060-byte HID report
> descriptor and the device sending back a 996-byte reply, just like in
> Windows. And yet the disconnect still occurs.
>
> The only remaining difference is the transfer of string descriptors.
> You can prevent Linux from asking for them by editing usb_string() in
> drivers/usb/core/message.c. Just make the function return -ENOMEM
> unconditionally. That will stop the requests from going out.
>
> Alan Stern
>
Ok, I first edited the kernel to return -ENOMEM like you suggested but
the UPS still disconnected. I then edited it again to re-add the 1060
byte request and the UPS still disconnected.
I'm attaching the usbmon traces.
If you need any additional info I'll do my best to provide it.
Thanks,
David
On Mon, Nov 22, 2021 at 11:25:26AM -0500, David Niklas wrote:
> Ok, I first edited the kernel to return -ENOMEM like you suggested but
> the UPS still disconnected. I then edited it again to re-add the 1060
> byte request and the UPS still disconnected.
>
> I'm attaching the usbmon traces.
> If you need any additional info I'll do my best to provide it.
Holy cow! I just realized what's going on. And these little changes
we've been messing around with have nothing to do with it.
For the first time, I looked at the timestamps in the usbmon traces. It
turns out that the disconnects occur several seconds after the kernel
retrieves the HID report descriptor from the device. Under normal
conditions we would expect to see report packets coming in from the
device, starting just a fraction of a second after the descriptor is
received. But that isn't happening in the Linux traces, whereas it does
happen in the Windows pcap log.
I would guess that the UPS is programmed to disconnect itself
electronically from the USB bus if it doesn't get any requests for
reports within a couple of seconds. That certainly would explain what
you've been seeing. I can't imagine why it would be programmed to
behave this way, but companies have been known to do stranger things.
As for why the kernel doesn't try to get the reports... That's a little
harder to answer. Maybe Jiri or Benjamin will know something about it.
The UPS's vendor ID is 0d9f (POWERCOM) and the product ID is 0004. Now,
the drivers/hid/hid-quirks.c file contains a quirk entry for 0d9f:0002
(product POWERCOM_UPS), which is probably an earlier model of the same
device, or a very similar device. This quirk entry is in the
hid_ignore_list; it tells the HID core not to handle the device at all.
I don't know why that quirk entry is present, and furthermore, it can't
directly affect what is happening with your device because the product
IDs are different. Still, it is an indication that something strange is
going on behind the scenes.
Perhaps there is no kernel driver for these UPS devices? Perhaps the
intention is that some user program will handle all the communication
when one of them is detected? A quick search on Google turns up
usbhid-ups, part of Network USB Tools (NUT) -- maybe you need to
install that package in order to use the device.
I don't know what the full story is. With luck, Jiri or Benjamin can
help more.
Alan Stern
On Mon, 2021-11-22 at 15:13 -0500, Alan Stern wrote:
> On Mon, Nov 22, 2021 at 11:25:26AM -0500, David Niklas wrote:
> > Ok, I first edited the kernel to return -ENOMEM like you suggested but
> > the UPS still disconnected. I then edited it again to re-add the 1060
> > byte request and the UPS still disconnected.
> >
> > I'm attaching the usbmon traces.
> > If you need any additional info I'll do my best to provide it.
>
> Holy cow! I just realized what's going on. And these little changes
> we've been messing around with have nothing to do with it.
>
> For the first time, I looked at the timestamps in the usbmon traces. It
> turns out that the disconnects occur several seconds after the kernel
> retrieves the HID report descriptor from the device. Under normal
> conditions we would expect to see report packets coming in from the
> device, starting just a fraction of a second after the descriptor is
> received. But that isn't happening in the Linux traces, whereas it does
> happen in the Windows pcap log.
>
> I would guess that the UPS is programmed to disconnect itself
> electronically from the USB bus if it doesn't get any requests for
> reports within a couple of seconds. That certainly would explain what
> you've been seeing. I can't imagine why it would be programmed to
> behave this way, but companies have been known to do stranger things.
>
> As for why the kernel doesn't try to get the reports... That's a little
> harder to answer. Maybe Jiri or Benjamin will know something about it.
>
> The UPS's vendor ID is 0d9f (POWERCOM) and the product ID is 0004. Now,
> the drivers/hid/hid-quirks.c file contains a quirk entry for 0d9f:0002
> (product POWERCOM_UPS), which is probably an earlier model of the same
> device, or a very similar device. This quirk entry is in the
> hid_ignore_list; it tells the HID core not to handle the device at all.
>
> I don't know why that quirk entry is present, and furthermore, it can't
> directly affect what is happening with your device because the product
> IDs are different. Still, it is an indication that something strange is
> going on behind the scenes.
>
> Perhaps there is no kernel driver for these UPS devices? Perhaps the
> intention is that some user program will handle all the communication
> when one of them is detected? A quick search on Google turns up
> usbhid-ups, part of Network USB Tools (NUT) -- maybe you need to
> install that package in order to use the device.
>
> I don't know what the full story is. With luck, Jiri or Benjamin can
> help more.
>
> Alan Stern
hid-generic should be able to handle these devices, but UPSes are much less
tested than normal input peripherals, so it's not that surprising that there may
be some unexpected weirdness. FWIW, I have two UPSes, one works OOTB and the
other doesn't, I have been meaning to investigate the issue. However, David's
case seems to me like a hardware quirk, but that's just my guess.
Cheers,
Filipe Laíns
On Mon, Nov 22, 2021 at 10:35 PM Filipe Laíns <[email protected]> wrote:
>
> On Mon, 2021-11-22 at 15:13 -0500, Alan Stern wrote:
> > On Mon, Nov 22, 2021 at 11:25:26AM -0500, David Niklas wrote:
> > > Ok, I first edited the kernel to return -ENOMEM like you suggested but
> > > the UPS still disconnected. I then edited it again to re-add the 1060
> > > byte request and the UPS still disconnected.
> > >
> > > I'm attaching the usbmon traces.
> > > If you need any additional info I'll do my best to provide it.
> >
> > Holy cow! I just realized what's going on. And these little changes
> > we've been messing around with have nothing to do with it.
> >
> > For the first time, I looked at the timestamps in the usbmon traces. It
> > turns out that the disconnects occur several seconds after the kernel
> > retrieves the HID report descriptor from the device. Under normal
> > conditions we would expect to see report packets coming in from the
> > device, starting just a fraction of a second after the descriptor is
> > received. But that isn't happening in the Linux traces, whereas it does
> > happen in the Windows pcap log.
> >
> > I would guess that the UPS is programmed to disconnect itself
> > electronically from the USB bus if it doesn't get any requests for
> > reports within a couple of seconds. That certainly would explain what
> > you've been seeing. I can't imagine why it would be programmed to
> > behave this way, but companies have been known to do stranger things.
> >
> > As for why the kernel doesn't try to get the reports... That's a little
> > harder to answer. Maybe Jiri or Benjamin will know something about it.
I am not sure exactly what is going on there.
There are a couple of things that come to my mind:
- for quite some time now, we don't fetch all reports whenever we
connect a new device. This was known to be problematic on some devices
(see all the devices with HID_QUIRK_NOGET or
HID_QUIRK_NO_INIT_REPORT), and the default to not poll input values on
plug for devices is actually safer. If you want to revert, we will
have to have a special driver for this one I guess
- HID_QUIRK_ALWAYS_POLL *might* be a way to force the device to stay
with a USB connection up.
> >
> > The UPS's vendor ID is 0d9f (POWERCOM) and the product ID is 0004. Now,
> > the drivers/hid/hid-quirks.c file contains a quirk entry for 0d9f:0002
> > (product POWERCOM_UPS), which is probably an earlier model of the same
> > device, or a very similar device. This quirk entry is in the
> > hid_ignore_list; it tells the HID core not to handle the device at all.
> >
> > I don't know why that quirk entry is present, and furthermore, it can't
> > directly affect what is happening with your device because the product
> > IDs are different. Still, it is an indication that something strange is
> > going on behind the scenes.
> >
> > Perhaps there is no kernel driver for these UPS devices? Perhaps the
> > intention is that some user program will handle all the communication
> > when one of them is detected? A quick search on Google turns up
> > usbhid-ups, part of Network USB Tools (NUT) -- maybe you need to
> > install that package in order to use the device.
I don't have enough experience with UPS here to be helpful,
unfortunately. But What Alan said made a lot of sense. Maybe the NUT
people will have a better insight.
> >
> > I don't know what the full story is. With luck, Jiri or Benjamin can
> > help more.
> >
> > Alan Stern
>
> hid-generic should be able to handle these devices, but UPSes are much less
> tested than normal input peripherals, so it's not that surprising that there may
> be some unexpected weirdness. FWIW, I have two UPSes, one works OOTB and the
> other doesn't, I have been meaning to investigate the issue. However, David's
> case seems to me like a hardware quirk, but that's just my guess.
>
Yep, that seems to validate the fact that this UPS needs some care...
Cheers,
Benjamin
On Tue, 23 Nov 2021 17:33:08 +0100
Benjamin Tissoires <[email protected]> wrote:
> On Mon, Nov 22, 2021 at 10:35 PM Filipe Laíns <[email protected]>
> wrote:
> >
> > On Mon, 2021-11-22 at 15:13 -0500, Alan Stern wrote:
> > > On Mon, Nov 22, 2021 at 11:25:26AM -0500, David Niklas wrote:
> > > > Ok, I first edited the kernel to return -ENOMEM like you
> > > > suggested but the UPS still disconnected. I then edited it again
> > > > to re-add the 1060 byte request and the UPS still disconnected.
> > > >
> > > > I'm attaching the usbmon traces.
> > > > If you need any additional info I'll do my best to provide it.
> > >
> > > Holy cow! I just realized what's going on. And these little
> > > changes we've been messing around with have nothing to do with it.
> > >
> > > For the first time, I looked at the timestamps in the usbmon
> > > traces. It turns out that the disconnects occur several seconds
> > > after the kernel retrieves the HID report descriptor from the
> > > device. Under normal conditions we would expect to see report
> > > packets coming in from the device, starting just a fraction of a
> > > second after the descriptor is received. But that isn't happening
> > > in the Linux traces, whereas it does happen in the Windows pcap log.
> > >
> > > I would guess that the UPS is programmed to disconnect itself
> > > electronically from the USB bus if it doesn't get any requests for
> > > reports within a couple of seconds. That certainly would explain
> > > what you've been seeing. I can't imagine why it would be
> > > programmed to behave this way, but companies have been known to do
> > > stranger things.
> > >
> > > As for why the kernel doesn't try to get the reports... That's a
> > > little harder to answer. Maybe Jiri or Benjamin will know
> > > something about it.
>
> I am not sure exactly what is going on there.
> There are a couple of things that come to my mind:
> - for quite some time now, we don't fetch all reports whenever we
> connect a new device. This was known to be problematic on some devices
> (see all the devices with HID_QUIRK_NOGET or
> HID_QUIRK_NO_INIT_REPORT), and the default to not poll input values on
> plug for devices is actually safer. If you want to revert, we will
> have to have a special driver for this one I guess
> - HID_QUIRK_ALWAYS_POLL *might* be a way to force the device to stay
> with a USB connection up.
>
> > >
> > > The UPS's vendor ID is 0d9f (POWERCOM) and the product ID is 0004.
> > > Now, the drivers/hid/hid-quirks.c file contains a quirk entry for
> > > 0d9f:0002 (product POWERCOM_UPS), which is probably an earlier
> > > model of the same device, or a very similar device. This quirk
> > > entry is in the hid_ignore_list; it tells the HID core not to
> > > handle the device at all.
> > >
> > > I don't know why that quirk entry is present, and furthermore, it
> > > can't directly affect what is happening with your device because
> > > the product IDs are different. Still, it is an indication that
> > > something strange is going on behind the scenes.
> > >
> > > Perhaps there is no kernel driver for these UPS devices? Perhaps
> > > the intention is that some user program will handle all the
> > > communication when one of them is detected? A quick search on
> > > Google turns up usbhid-ups, part of Network USB Tools (NUT) --
> > > maybe you need to install that package in order to use the device.
>
> I don't have enough experience with UPS here to be helpful,
> unfortunately. But What Alan said made a lot of sense. Maybe the NUT
> people will have a better insight.
<snip>
I'll send a message their way.
Thanks,
David