Date: Tue, 31 Jan 2023 23:50:46 +0100 (CET)
From: Christian Kujau
To: Juergen Gross, Michael Kelley, Borislav Petkov
cc: linux-kernel@vger.kernel.org, Greg KH, Linux regressions mailing list
Subject: Re: External USB disks not recognized with v6.1.8 when using Xen
Message-ID: <8f132803-f496-f33a-d2ab-b47fd5af0b88@nerdbynature.de>
References: <4fe9541e-4d4c-2b2a-f8c8-2d34a7284930@nerdbynature.de>
X-Mailing-List: linux-kernel@vger.kernel.org

[Leaving the full quote below for reference and adding more appropriate
people.]

After a far too long round of git-bisect I narrowed it down to:

  c1c59538337ab6d45700cb4a1c9725e67f59bc6e is the first bad commit
  x86/pat: Fix pat_x_mtrr_type() for MTRR disabled case
  commit 90b926e68f500844dff16b5bcea178dc55cf580a upstream.

And indeed, reverting this single commit from v6.1.8 (stable) makes the
disks appear again.

TL;DR: with v6.1.8 in Xen Dom0 mode (i.e. the Xen host itself) the external
disk enclosure attached via USB is not being recognized. When booted
*without* Xen, the disks show up just fine.

Details with dmesg and lsusb outputs: https://nerdbynature.de/bits/usb_v6.1.8/

Thanks Thorsten for the localmodconfig hint. I've tried that before, but
the thing just did not want to boot, so I cut down on options manually.
It's still ~12 minutes per compile; ccache helped a bit in the end.

Thanks for reading,
Christian.

On Mon, 30 Jan 2023, Linux kernel regression tracking (#adding) wrote:
> [TLDR: I'm adding this report to the list of tracked Linux kernel
> regressions; the text you find below is based on a few template
> paragraphs you might have encountered already in similar form.
> See link in footer if these mails annoy you.]
>
> On 30.01.23 04:46, Christian Kujau wrote:
> > [CC stable as I only tested the stable tree for now]
> >
> > I'm running a current Alpine Linux with linux-edge-6.1.8-r0 installed on a
> > Lenovo Thinkpad L540 where an external disk enclosure with two disks is
> > attached via USB. The Alpine Linux kernel appears to track Linux stable
> > and is more or less vanilla. Also, the machine boots into Xen 4.17.0 and
> > then starts a few headless VMs, nothing too exotic here.
> >
> > But when updating from Linux 6.1.1 to 6.1.8, the disks from the external
> > enclosure did not show up. Unplug, replug, no dice, and this is 100%
> > reproducible.
> > dmesg has these new lines now:
> >
> > +ioremap error for 0xf2520000-0xf2530000, requested 0x2, got 0x0
> > +ioremap error for 0xf2520000-0xf2530000, requested 0x2, got 0x0
> > +xhci_hcd 0000:00:14.0: init 0000:00:14.0 fail, -14
> > +ioremap error for 0xfed1f000-0xfed20000, requested 0x2, got 0x0
> > +iTCO_wdt iTCO_wdt.1.auto: ioremap failed for resource [mem 0xfed1f410-0xfed1f414]
> >
> > I'm not sure if the ioremap error is related here (booted with
> > early_ioremap_debug but then dmesg was filled with WARNINGS for both
> > versions, so I disabled it again), but that xhci_hcd error looks
> > suspicious.
> >
> > Curiously 6.1.8 works just fine when NOT booted via Xen. I booted into
> > Xen + vanilla 6.1.8 now and was able to reproduce this issue. Xen +
> > vanilla 6.1.1 works fine.
>
> Thanks for the report. To be sure the issue doesn't fall through the
> cracks unnoticed, I'm adding it to regzbot, the Linux kernel regression
> tracking bot:
>
> #regzbot ^introduced v6.1.1..v6.1.8
> #regzbot title xen/usb(?): External USB disks not recognized anymore
> under Xen
> #regzbot ignore-activity
>
> This isn't a regression? This issue or a fix for it are already
> discussed somewhere else? It was fixed already? You want to clarify when
> the regression started to happen? Or point out I got the title or
> something else totally wrong? Then just reply and tell me -- ideally
> while also telling regzbot about it, as explained by the page listed in
> the footer of this mail.
>
> Developers: When fixing the issue, remember to add 'Link:' tags pointing
> to the report (the parent of this mail). See page linked in footer for
> details.
>
> > From v6.1.1 to v6.1.8 there's only one commit in drivers/xen, but 54
> > commits in drivers/usb.
> > Compiling takes time because the distribution
> > kernel has almost everything enabled and I still need to cut down enabled
> > options to be able to attempt a git bisect in a reasonable time,
>
> FWIW, I'm working on a text for the kernel docs that will use
> "localmodconfig" to trim down the configs automatically. Maybe it's
> helpful for you, here is a draft:
>
> https://www.leemhuis.info/files/misc/How%20to%20quickly%20build%20a%20Linux%20kernel%20%E2%80%94%20The%20Linux%20Kernel%20documentation.html
>
> > but I
> > still wanted to report this, maybe someone has an idea about this.
> >
> > Full dmesg and lshw outputs: https://nerdbynature.de/bits/usb_v6.1.8/
> >
> > Thanks,
> > Christian.
> >
> > PS: I found this workaround on the interwebs[0] to force the USB ports
> > of that machine to USB 2.0 and then the missing disks magically appear:
> >
> > $ lspci -nn | grep -i usb
> > 00:14.0 USB controller [0c03]: Intel Corporation 8 Series/C220 Series Chipset Family USB xHCI [8086:8c31] (rev 05) <=== !!!
> > 00:1a.0 USB controller [0c03]: Intel Corporation 8 Series/C220 Series Chipset Family USB EHCI #2 [8086:8c2d] (rev 05)
> > 00:1d.0 USB controller [0c03]: Intel Corporation 8 Series/C220 Series Chipset Family USB EHCI #1 [8086:8c26] (rev 05)
> >
> > $ setpci -H1 -d 8086:8c31 d8.l=0
> > $ setpci -H1 -d 8086:8c31 d0.l=0
> >
> > $ dmesg
> > usb 1-1.3: new full-speed USB device number 3 using ehci-pci
> > usb 2-1.3: new high-speed USB device number 3 using ehci-pci
> > usb 1-1.3: New USB device found, idVendor=138a, idProduct=0011, bcdDevice=0.78
> > usb 1-1.3: New USB device strings: Mfr=0, Product=0, SerialNumber=1
> > usb 1-1.3: SerialNumber: aa32bf84ed47
> > usb 1-1.5: new full-speed USB device number 4 using ehci-pci
> > usb 2-1.3: New USB device found, idVendor=1e91, idProduct=a3a8, bcdDevice=2.07
> > usb 2-1.3: New USB device strings: Mfr=1, Product=2, SerialNumber=5
> > usb 2-1.3: Product: Elite Pro Dual
> > usb 2-1.3: Manufacturer: OWC
> > usb 2-1.3: SerialNumber: RANDOM__1E359879645F
> > usb 2-1.3: UAS is ignored for this device, using usb-storage instead
> > usb-storage 2-1.3:1.0: USB Mass Storage device detected
> > usb-storage 2-1.3:1.0: Quirks match for vid 1e91 pid a3a8: 800000
> > scsi host5: usb-storage 2-1.3:1.0
> > usb 1-1.5: New USB device found, idVendor=8087, idProduct=07dc, bcdDevice=0.01
> > usb 1-1.5: New USB device strings: Mfr=0, Product=0, SerialNumber=0
> > Bluetooth: hci0: Legacy ROM 2.5 revision 8.0 build 1 week 45 2013
> > Bluetooth: hci0: Intel Bluetooth firmware file: intel/ibt-hw-37.7.10-fw-1.80.1.2d.d.bseq
> > usb 1-1.6: new high-speed USB device number 5 using ehci-pci
> > usb 1-1.6: New USB device found, idVendor=04f2, idProduct=b398, bcdDevice=39.98
> > usb 1-1.6: New USB device strings: Mfr=1, Product=2, SerialNumber=0
> > usb 1-1.6: Product: Integrated Camera
> > usb 1-1.6: Manufacturer: Vimicro corp.
> > Bluetooth: hci0: Intel BT fw patch 0x2a completed & activated
> > scsi 5:0:0:0: Direct-Access ElitePro Dual U3FW-1 0207 PQ: 0 ANSI: 6
> > scsi 5:0:0:1: Direct-Access ElitePro Dual U3FW-2 0207 PQ: 0 ANSI: 6
> > sd 5:0:0:0: [sdc] Very big device. Trying to use READ CAPACITY(16).
> > sd 5:0:0:0: [sdc] 7814037168 512-byte logical blocks: (4.00 TB/3.64 TiB)
> > sd 5:0:0:1: [sdd] Very big device. Trying to use READ CAPACITY(16).
> > sd 5:0:0:1: [sdd] 7814037168 512-byte logical blocks: (4.00 TB/3.64 TiB)
> > sd 5:0:0:1: [sdd] Write Protect is off
> > sd 5:0:0:1: [sdd] Mode Sense: 47 00 10 08
> > sd 5:0:0:0: [sdc] Write Protect is off
> > sd 5:0:0:0: [sdc] Mode Sense: 47 00 10 08
> > sd 5:0:0:0: [sdc] No Caching mode page found
> > sd 5:0:0:0: [sdc] Assuming drive cache: write through
> > sd 5:0:0:1: [sdd] No Caching mode page found
> > sd 5:0:0:1: [sdd] Assuming drive cache: write through
> > sd 5:0:0:0: [sdc] Attached SCSI disk
> > sd 5:0:0:1: [sdd] Attached SCSI disk
> >
> > $ lsblk /dev/sd[cd]
> > NAME MAJ:MIN RM  SIZE RO TYPE MOUNTPOINTS
> > sdc    8:32   0  3.6T  0 disk
> > sdd    8:48   0  3.6T  0 disk
> >
> >
> > [0] https://superuser.com/a/875863/218574
>
> Ciao, Thorsten (wearing his 'the Linux kernel's regression tracker' hat)
> --
> Everything you wanna know about Linux kernel regression tracking:
> https://linux-regtracking.leemhuis.info/about/#tldr
> That page also explains what to do if mails like this annoy you.
>

--
BOFH excuse #188:

..disk or the processor is on fire.