2023-09-12 21:55:03

by John David Anglin

[permalink] [raw]
Subject: Re: [PATCH] linux/export: fix reference to exported functions for parisc64

On 2023-09-10 5:30 p.m., John David Anglin wrote:
> Hi Masahiro,
>
> The attached change fixed boot at ddb5cdbafaaa ????
>
> However, v6.5.x boot is still broken:
>
> Run /init as init process
> process '/usr/bin/sh' started with executable stack
> Loading, please wait...
> Starting systemd-udevd version 254.1-3
> e1000 alternatives: applied 0 out of 569 patches
> e1000: Intel(R) PRO/1000 Network Driver
> e1000: Copyright (c) 1999-2006 Intel Corporation.
> scsi_mod alternatives: applied 0 out of 7 patches
> SCSI subsystem initialized
> usbcore alternatives: applied 0 out of 18 patches
> usbcore: registered new interface driver usbfs
> libata alternatives: applied 0 out of 3 patches
> usbcore: registered new interface driver hub
> usbcore: registered new device driver usb
> mptbase alternatives: applied 0 out of 73 patches
> ehci_hcd alternatives: applied 0 out of 114 patches
> sata_sil24 alternatives: applied 0 out of 56 patches
> Fusion MPT base driver 3.04.20
> Copyright (c) 1999-2008 LSI Corporation
> sata_sil24 0000:00:01.0: Applying completion IRQ loss on PCI-X errata fix
> scsi host0: sata_sil24
> scsi host1: sata_sil24
> pata_sil680 0000:60:02.0: sil680: 133MHz clock.
> scsi host2: sata_sil24
> ehci_pci alternatives: applied 0 out of 2 patches
> ohci_hcd alternatives: applied 0 out of 144 patches
> ehci-pci 0000:60:01.2: EHCI Host Controller
> scsi host3: pata_sil680
> ehci-pci 0000:60:01.2: new USB bus registered, assigned bus number 1
> scsi host4: sata_sil24
> ata1: SATA max UDMA/100 host m128@0xffffffff80088000 port 0xffffffff80080000 ir6
> ata2: SATA max UDMA/100 host m128@0xffffffff80088000 port 0xffffffff80082000 ir6
> ata3: SATA max UDMA/100 host m128@0xffffffff80088000 port 0xffffffff80084000 ir6
> ata4: SATA max UDMA/100 host m128@0xffffffff80088000 port 0xffffffff80086000 ir6
> e1000 0000:60:03.0 eth0: (PCI:33MHz:32-bit) 00:11:0a:31:8a:77
> ehci-pci 0000:60:01.2: irq 71, io mem 0xffffffffb00a1000
> scsi host5: pata_sil680
> ata5: PATA max UDMA/133 cmd 0x26058 ctl 0x26064 bmdma 0x26040 irq 72
> ata6: PATA max UDMA/133 cmd 0x26050 ctl 0x26060 bmdma 0x26048 irq 72
> e1000 0000:60:03.0 eth0: Intel(R) PRO/1000 Network Connection
> ehci-pci 0000:60:01.2: USB 2.0 started, EHCI 0.95
> usb usb1: New USB device found, idVendor=1d6b, idProduct=0002, bcdDevice= 6.05
> usb usb1: New USB device strings: Mfr=3, Product=2, SerialNumber=1
> usb usb1: Product: EHCI Host Controller
> usb usb1: Manufacturer: Linux 6.5.2-dirty ehci_hcd
> usb usb1: SerialNumber: 0000:60:01.2
> hub 1-0:1.0: USB hub found
> hub 1-0:1.0: 5 ports detected
> ata1: SATA link down (SStatus 0 SControl 0)
> ata2: SATA link down (SStatus 0 SControl 0)
> ata3: SATA link down (SStatus 0 SControl 0)
> ata4: SATA link up 3.0 Gbps (SStatus 123 SControl 0)
> ata4.00: ATA-10: ST4000VN008-2DR166, SC60, max UDMA/133
> ata4.00: 7814037168 sectors, multi 0: LBA48 NCQ (depth 31/32)
> ata4.00: configured for UDMA/100
> scsi 4:0:0:0: Direct-Access     ATA      ST4000VN008-2DR1 SC60 PQ: 0 ANSI: 5
> ata6.00: ATAPI: HL-DT-STDVD+-RW GSA-H21L, 1.04, max UDMA/44
> scsi 5:0:0:0: CD-ROM            HL-DT-ST DVD+-RW GSA-H21L 1.04 PQ: 0 ANSI: 5
> random: crng init done
> Timed out for waiting the udev queue being empty.
> Begin: Loading essential drivers ... done.
> Begin: Running /scripts/init-premount ... done.
> Begin: Mounting root file system ... Begin: Running /scripts/local-top ... done.
> Begin: Running /scripts/local-premount ... done.
> Timed out for waiting the udev queue being empty.
> Begin: Waiting for root file system ... Begin: Running /scripts/local-block ....
> Begin: Running /scripts/local-block ... done.
> Begin: Running /scripts/local-block ... done.
> Begin: Running /scripts/local-block ... done.
> Begin: Running /scripts/local-block ... done.
> Begin: Running /scripts/local-block ... done.
> Begin: Running /scripts/local-block ... done.
> Begin: Running /scripts/local-block ... done.
> Begin: Running /scripts/local-block ... done.
> Begin: Running /scripts/local-block ... done.
> Begin: Running /scripts/local-block ... done.
> Begin: Running /scripts/local-block ... done.
> Begin: Running /scripts/local-block ... done.
> Begin: Running /scripts/local-block ... done.
> Begin: Running /scripts/local-block ... done.
> Begin: Running /scripts/local-block ... done.
> Begin: Running /scripts/local-block ... done.
> Begin: Running /scripts/local-block ... done.
> Begin: Running /scripts/local-block ... done.
> done.
> Gave up waiting for root file system device.  Common problems:
>  - Boot args (cat /proc/cmdline)
>    - Check rootdelay= (did the system wait long enough?)
>  - Missing modules (cat /proc/modules; ls /dev)
> ALERT!  LABEL=ROOT does not exist.  Dropping to a shell!
> Rebooting automatically due to panic= boot argument
>
> I'll see if I can find the commit that breaks 6.5.
I've traced this to the following merge commit:

dave@atlas:~/linux/linux$ git bisect good
ca7ce08d6a063e0ccb91dc57f9bc213120d0d1a7 is the first bad commit
commit ca7ce08d6a063e0ccb91dc57f9bc213120d0d1a7
Merge: 1546cd4bfda4 af92c02fb209
Author: Linus Torvalds <[email protected]>
Date:   Fri Jun 30 11:57:07 2023 -0700

    Merge tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi

    Pull SCSI updates from James Bottomley:
     "Updates to the usual drivers (ufs, pm80xx, libata-scsi, smartpqi,
      lpfc, qla2xxx).

      We have a couple of major core changes impacting other systems:

       - Command Duration Limits, which spills into block and ATA

       - block level Persistent Reservation Operations, which touches block,
         nvme, target and dm

      Both of these are added with merge commits containing a cover letter
      explaining what's going on"

    * tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: (187 commits)
      scsi: core: Improve warning message in scsi_device_block()
      scsi: core: Replace scsi_target_block() with scsi_block_targets()
      scsi: core: Don't wait for quiesce in scsi_device_block()
      scsi: core: Don't wait for quiesce in scsi_stop_queue()
      scsi: core: Merge scsi_internal_device_block() and device_block()
      scsi: sg: Increase number of devices
      scsi: bsg: Increase number of devices
      scsi: qla2xxx: Remove unused nvme_ls_waitq wait queue
      scsi: ufs: ufs-pci: Add support for Intel Arrow Lake
      scsi: sd: sd_zbc: Use PAGE_SECTORS_SHIFT
      scsi: ufs: wb: Add explicit flush_threshold sysfs attribute
      scsi: ufs: ufs-qcom: Switch to the new ICE API
      scsi: ufs: dt-bindings: qcom: Add ICE phandle
      scsi: ufs: ufs-mediatek: Set UFSHCD_QUIRK_MCQ_BROKEN_RTC quirk
      scsi: ufs: ufs-mediatek: Set UFSHCD_QUIRK_MCQ_BROKEN_INTR quirk
      scsi: ufs: core: Add host quirk UFSHCD_QUIRK_MCQ_BROKEN_RTC
      scsi: ufs: core: Add host quirk UFSHCD_QUIRK_MCQ_BROKEN_INTR
      scsi: ufs: core: Remove dedicated hwq for dev command
      scsi: ufs: core: mcq: Fix the incorrect OCS value for the device command
      scsi: ufs: dt-bindings: samsung,exynos: Drop unneeded quotes
      ...

dave@atlas:~/linux/linux$ lspci
00:01.0 RAID bus controller: Silicon Image, Inc. SiI 3124 PCI-X Serial ATA Controller (rev 02)
40:01.0 SCSI storage controller: Broadcom / LSI 53c1030 PCI-X Fusion-MPT Dual Ultra320 SCSI (rev 07)
40:01.1 SCSI storage controller: Broadcom / LSI 53c1030 PCI-X Fusion-MPT Dual Ultra320 SCSI (rev 07)
60:01.0 USB controller: NEC Corporation OHCI USB Controller (rev 41)
60:01.1 USB controller: NEC Corporation OHCI USB Controller (rev 41)
60:01.2 USB controller: NEC Corporation uPD72010x USB 2.0 Controller (rev 02)
60:02.0 IDE interface: Silicon Image, Inc. PCI0680 Ultra ATA-133 Host Controller (rev 02)
60:03.0 Ethernet controller: Intel Corporation 82540EM Gigabit Ethernet Controller (rev 02)

Dave

--
John David Anglin [email protected]


2023-09-13 17:59:39

by John David Anglin

[permalink] [raw]
Subject: Re: [PATCH] linux/export: fix reference to exported functions for parisc64

On 2023-09-12 5:53 p.m., John David Anglin wrote:
> On 2023-09-10 5:30 p.m., John David Anglin wrote:
>> Hi Masahiro,
>>
>> The attached change fixed boot at ddb5cdbafaaa ????
>>
>> However, v6.5.x boot is still broken:
>>
>> Run /init as init process
>> process '/usr/bin/sh' started with executable stack
>> Loading, please wait...
>> Starting systemd-udevd version 254.1-3
>> e1000 alternatives: applied 0 out of 569 patches
>> e1000: Intel(R) PRO/1000 Network Driver
>> e1000: Copyright (c) 1999-2006 Intel Corporation.
>> scsi_mod alternatives: applied 0 out of 7 patches
>> SCSI subsystem initialized
>> usbcore alternatives: applied 0 out of 18 patches
>> usbcore: registered new interface driver usbfs
>> libata alternatives: applied 0 out of 3 patches
>> usbcore: registered new interface driver hub
>> usbcore: registered new device driver usb
>> mptbase alternatives: applied 0 out of 73 patches
>> ehci_hcd alternatives: applied 0 out of 114 patches
>> sata_sil24 alternatives: applied 0 out of 56 patches
>> Fusion MPT base driver 3.04.20
>> Copyright (c) 1999-2008 LSI Corporation
>> sata_sil24 0000:00:01.0: Applying completion IRQ loss on PCI-X errata fix
>> scsi host0: sata_sil24
>> scsi host1: sata_sil24
>> pata_sil680 0000:60:02.0: sil680: 133MHz clock.
>> scsi host2: sata_sil24
>> ehci_pci alternatives: applied 0 out of 2 patches
>> ohci_hcd alternatives: applied 0 out of 144 patches
>> ehci-pci 0000:60:01.2: EHCI Host Controller
>> scsi host3: pata_sil680
>> ehci-pci 0000:60:01.2: new USB bus registered, assigned bus number 1
>> scsi host4: sata_sil24
>> ata1: SATA max UDMA/100 host m128@0xffffffff80088000 port 0xffffffff80080000 ir6
>> ata2: SATA max UDMA/100 host m128@0xffffffff80088000 port 0xffffffff80082000 ir6
>> ata3: SATA max UDMA/100 host m128@0xffffffff80088000 port 0xffffffff80084000 ir6
>> ata4: SATA max UDMA/100 host m128@0xffffffff80088000 port 0xffffffff80086000 ir6
>> e1000 0000:60:03.0 eth0: (PCI:33MHz:32-bit) 00:11:0a:31:8a:77
>> ehci-pci 0000:60:01.2: irq 71, io mem 0xffffffffb00a1000
>> scsi host5: pata_sil680
>> ata5: PATA max UDMA/133 cmd 0x26058 ctl 0x26064 bmdma 0x26040 irq 72
>> ata6: PATA max UDMA/133 cmd 0x26050 ctl 0x26060 bmdma 0x26048 irq 72
>> e1000 0000:60:03.0 eth0: Intel(R) PRO/1000 Network Connection
>> ehci-pci 0000:60:01.2: USB 2.0 started, EHCI 0.95
>> usb usb1: New USB device found, idVendor=1d6b, idProduct=0002, bcdDevice= 6.05
>> usb usb1: New USB device strings: Mfr=3, Product=2, SerialNumber=1
>> usb usb1: Product: EHCI Host Controller
>> usb usb1: Manufacturer: Linux 6.5.2-dirty ehci_hcd
>> usb usb1: SerialNumber: 0000:60:01.2
>> hub 1-0:1.0: USB hub found
>> hub 1-0:1.0: 5 ports detected
>> ata1: SATA link down (SStatus 0 SControl 0)
>> ata2: SATA link down (SStatus 0 SControl 0)
>> ata3: SATA link down (SStatus 0 SControl 0)
>> ata4: SATA link up 3.0 Gbps (SStatus 123 SControl 0)
>> ata4.00: ATA-10: ST4000VN008-2DR166, SC60, max UDMA/133
>> ata4.00: 7814037168 sectors, multi 0: LBA48 NCQ (depth 31/32)
>> ata4.00: configured for UDMA/100
>> scsi 4:0:0:0: Direct-Access     ATA      ST4000VN008-2DR1 SC60 PQ: 0 ANSI: 5
>> ata6.00: ATAPI: HL-DT-STDVD+-RW GSA-H21L, 1.04, max UDMA/44
>> scsi 5:0:0:0: CD-ROM            HL-DT-ST DVD+-RW GSA-H21L 1.04 PQ: 0 ANSI: 5
>> random: crng init done
>> Timed out for waiting the udev queue being empty.
>> Begin: Loading essential drivers ... done.
>> Begin: Running /scripts/init-premount ... done.
>> Begin: Mounting root file system ... Begin: Running /scripts/local-top ... done.
>> Begin: Running /scripts/local-premount ... done.
>> Timed out for waiting the udev queue being empty.
>> Begin: Waiting for root file system ... Begin: Running /scripts/local-block ....
>> Begin: Running /scripts/local-block ... done.
>> Begin: Running /scripts/local-block ... done.
>> Begin: Running /scripts/local-block ... done.
>> Begin: Running /scripts/local-block ... done.
>> Begin: Running /scripts/local-block ... done.
>> Begin: Running /scripts/local-block ... done.
>> Begin: Running /scripts/local-block ... done.
>> Begin: Running /scripts/local-block ... done.
>> Begin: Running /scripts/local-block ... done.
>> Begin: Running /scripts/local-block ... done.
>> Begin: Running /scripts/local-block ... done.
>> Begin: Running /scripts/local-block ... done.
>> Begin: Running /scripts/local-block ... done.
>> Begin: Running /scripts/local-block ... done.
>> Begin: Running /scripts/local-block ... done.
>> Begin: Running /scripts/local-block ... done.
>> Begin: Running /scripts/local-block ... done.
>> Begin: Running /scripts/local-block ... done.
>> done.
>> Gave up waiting for root file system device.  Common problems:
>>  - Boot args (cat /proc/cmdline)
>>    - Check rootdelay= (did the system wait long enough?)
>>  - Missing modules (cat /proc/modules; ls /dev)
>> ALERT!  LABEL=ROOT does not exist.  Dropping to a shell!
>> Rebooting automatically due to panic= boot argument
>>
>> I'll see if I can find the commit that breaks 6.5.
> I've traced this to the following merge commit:
>
> dave@atlas:~/linux/linux$ git bisect good
> ca7ce08d6a063e0ccb91dc57f9bc213120d0d1a7 is the first bad commit
> commit ca7ce08d6a063e0ccb91dc57f9bc213120d0d1a7
> Merge: 1546cd4bfda4 af92c02fb209
> Author: Linus Torvalds <[email protected]>
> Date:   Fri Jun 30 11:57:07 2023 -0700
>
>     Merge tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi
>
>     Pull SCSI updates from James Bottomley:
>      "Updates to the usual drivers (ufs, pm80xx, libata-scsi, smartpqi,
>       lpfc, qla2xxx).
>
>       We have a couple of major core changes impacting other systems:
>
>        - Command Duration Limits, which spills into block and ATA
>
>        - block level Persistent Reservation Operations, which touches block,
>          nvme, target and dm
>
>       Both of these are added with merge commits containing a cover letter
>       explaining what's going on"
>
>     * tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: (187 commits)
>       scsi: core: Improve warning message in scsi_device_block()
>       scsi: core: Replace scsi_target_block() with scsi_block_targets()
>       scsi: core: Don't wait for quiesce in scsi_device_block()
>       scsi: core: Don't wait for quiesce in scsi_stop_queue()
>       scsi: core: Merge scsi_internal_device_block() and device_block()
>       scsi: sg: Increase number of devices
>       scsi: bsg: Increase number of devices
>       scsi: qla2xxx: Remove unused nvme_ls_waitq wait queue
>       scsi: ufs: ufs-pci: Add support for Intel Arrow Lake
>       scsi: sd: sd_zbc: Use PAGE_SECTORS_SHIFT
>       scsi: ufs: wb: Add explicit flush_threshold sysfs attribute
>       scsi: ufs: ufs-qcom: Switch to the new ICE API
>       scsi: ufs: dt-bindings: qcom: Add ICE phandle
>       scsi: ufs: ufs-mediatek: Set UFSHCD_QUIRK_MCQ_BROKEN_RTC quirk
>       scsi: ufs: ufs-mediatek: Set UFSHCD_QUIRK_MCQ_BROKEN_INTR quirk
>       scsi: ufs: core: Add host quirk UFSHCD_QUIRK_MCQ_BROKEN_RTC
>       scsi: ufs: core: Add host quirk UFSHCD_QUIRK_MCQ_BROKEN_INTR
>       scsi: ufs: core: Remove dedicated hwq for dev command
>       scsi: ufs: core: mcq: Fix the incorrect OCS value for the device command
>       scsi: ufs: dt-bindings: samsung,exynos: Drop unneeded quotes
>       ...
>
> dave@atlas:~/linux/linux$ lspci
> 00:01.0 RAID bus controller: Silicon Image, Inc. SiI 3124 PCI-X Serial ATA Controller (rev 02)
> 40:01.0 SCSI storage controller: Broadcom / LSI 53c1030 PCI-X Fusion-MPT Dual Ultra320 SCSI (rev 07)
> 40:01.1 SCSI storage controller: Broadcom / LSI 53c1030 PCI-X Fusion-MPT Dual Ultra320 SCSI (rev 07)
> 60:01.0 USB controller: NEC Corporation OHCI USB Controller (rev 41)
> 60:01.1 USB controller: NEC Corporation OHCI USB Controller (rev 41)
> 60:01.2 USB controller: NEC Corporation uPD72010x USB 2.0 Controller (rev 02)
> 60:02.0 IDE interface: Silicon Image, Inc. PCI0680 Ultra ATA-133 Host Controller (rev 02)
> 60:03.0 Ethernet controller: Intel Corporation 82540EM Gigabit Ethernet Controller (rev 02)
This was introduced by the following commit:

dave@atlas:~/linux/linux$ git bisect good
624885209f31eb9985bf51abe204ecbffe2fdeea is the first bad commit
commit 624885209f31eb9985bf51abe204ecbffe2fdeea
Author: Damien Le Moal <[email protected]>
Date:   Thu May 11 03:13:41 2023 +0200

    scsi: core: Detect support for command duration limits

    Introduce the function scsi_cdl_check() to detect if a device supports
    command duration limits (CDL). Support for the READ 16, WRITE 16, READ 32
    and WRITE 32 commands are checked using the function scsi_report_opcode()
    to probe the rwcdlp and cdlp bits as they indicate the mode page defining
    the command duration limits descriptors that apply to the command being
    tested.

    If any of these commands support CDL, the field cdl_supported of struct
    scsi_device is set to 1 to indicate that the device supports CDL.

    Support for CDL for a device is advertizes through sysfs using the new
    cdl_supported device attribute. This attribute value is 1 for a device
    supporting CDL and 0 otherwise.

    Signed-off-by: Damien Le Moal <[email protected]>
    Reviewed-by: Hannes Reinecke <[email protected]>
    Co-developed-by: Niklas Cassel <[email protected]>
    Signed-off-by: Niklas Cassel <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Martin K. Petersen <[email protected]>

 Documentation/ABI/testing/sysfs-block-device |  9 ++++
 drivers/scsi/scsi.c                          | 81 ++++++++++++++++++++++++++++
 drivers/scsi/scsi_scan.c                     |  3 ++
 drivers/scsi/scsi_sysfs.c                    |  2 +
 include/scsi/scsi_device.h                   |  3 ++
 5 files changed, 98 insertions(+)

Sometimes I see when booting a bad commit:
[...]
Begin: Running /scripts/local-block ... done.
Begin: Running /scripts/local-block ... done.
Begin: Running /scripts/local-block ... done.
done.
Gave up waiting for root file system device.  Common problems:
 - Boot args (cat /proc/cmdline)
   - Check rootdelay= (did the system wait long enough?)
 - Missing modules (cat /proc/modules; ls /dev)
ALERT!  LABEL=ROOT does not exist.  Dropping to a shell!
Rebooting automatically due to panic= boot argument
ata4: SATA link down (SStatus 0 SControl 0)
ata5: SATA link down (SStatus 0 SControl 0)
ata6: SATA link up 3.0 Gbps (SStatus 123 SControl 0)
ata6.00: ATA-10: ST4000VN008-2DR166, SC60, max UDMA/133
ata6.00: 7814037168 sectors, multi 0: LBA48 NCQ (depth 31/32)
ata6.00: configured for UDMA/100
scsi 5:0:0:0: Direct-Access     ATA      ST4000VN008-2DR1 SC60 PQ: 0 ANSI: 5

Dave

--
John David Anglin [email protected]

2023-09-13 21:27:48

by John David Anglin

[permalink] [raw]
Subject: Re: [PATCH] linux/export: fix reference to exported functions for parisc64

On 2023-09-13 1:58 p.m., John David Anglin wrote:
> On 2023-09-12 5:53 p.m., John David Anglin wrote:
>> On 2023-09-10 5:30 p.m., John David Anglin wrote:
>>> Hi Masahiro,
>>>
>>> The attached change fixed boot at ddb5cdbafaaa ????
>>>
>>> However, v6.5.x boot is still broken:
>>>
>>> Run /init as init process
>>> process '/usr/bin/sh' started with executable stack
>>> Loading, please wait...
>>> Starting systemd-udevd version 254.1-3
>>> e1000 alternatives: applied 0 out of 569 patches
>>> e1000: Intel(R) PRO/1000 Network Driver
>>> e1000: Copyright (c) 1999-2006 Intel Corporation.
>>> scsi_mod alternatives: applied 0 out of 7 patches
>>> SCSI subsystem initialized
>>> usbcore alternatives: applied 0 out of 18 patches
>>> usbcore: registered new interface driver usbfs
>>> libata alternatives: applied 0 out of 3 patches
>>> usbcore: registered new interface driver hub
>>> usbcore: registered new device driver usb
>>> mptbase alternatives: applied 0 out of 73 patches
>>> ehci_hcd alternatives: applied 0 out of 114 patches
>>> sata_sil24 alternatives: applied 0 out of 56 patches
>>> Fusion MPT base driver 3.04.20
>>> Copyright (c) 1999-2008 LSI Corporation
>>> sata_sil24 0000:00:01.0: Applying completion IRQ loss on PCI-X errata fix
>>> scsi host0: sata_sil24
>>> scsi host1: sata_sil24
>>> pata_sil680 0000:60:02.0: sil680: 133MHz clock.
>>> scsi host2: sata_sil24
>>> ehci_pci alternatives: applied 0 out of 2 patches
>>> ohci_hcd alternatives: applied 0 out of 144 patches
>>> ehci-pci 0000:60:01.2: EHCI Host Controller
>>> scsi host3: pata_sil680
>>> ehci-pci 0000:60:01.2: new USB bus registered, assigned bus number 1
>>> scsi host4: sata_sil24
>>> ata1: SATA max UDMA/100 host m128@0xffffffff80088000 port 0xffffffff80080000 ir6
>>> ata2: SATA max UDMA/100 host m128@0xffffffff80088000 port 0xffffffff80082000 ir6
>>> ata3: SATA max UDMA/100 host m128@0xffffffff80088000 port 0xffffffff80084000 ir6
>>> ata4: SATA max UDMA/100 host m128@0xffffffff80088000 port 0xffffffff80086000 ir6
>>> e1000 0000:60:03.0 eth0: (PCI:33MHz:32-bit) 00:11:0a:31:8a:77
>>> ehci-pci 0000:60:01.2: irq 71, io mem 0xffffffffb00a1000
>>> scsi host5: pata_sil680
>>> ata5: PATA max UDMA/133 cmd 0x26058 ctl 0x26064 bmdma 0x26040 irq 72
>>> ata6: PATA max UDMA/133 cmd 0x26050 ctl 0x26060 bmdma 0x26048 irq 72
>>> e1000 0000:60:03.0 eth0: Intel(R) PRO/1000 Network Connection
>>> ehci-pci 0000:60:01.2: USB 2.0 started, EHCI 0.95
>>> usb usb1: New USB device found, idVendor=1d6b, idProduct=0002, bcdDevice= 6.05
>>> usb usb1: New USB device strings: Mfr=3, Product=2, SerialNumber=1
>>> usb usb1: Product: EHCI Host Controller
>>> usb usb1: Manufacturer: Linux 6.5.2-dirty ehci_hcd
>>> usb usb1: SerialNumber: 0000:60:01.2
>>> hub 1-0:1.0: USB hub found
>>> hub 1-0:1.0: 5 ports detected
>>> ata1: SATA link down (SStatus 0 SControl 0)
>>> ata2: SATA link down (SStatus 0 SControl 0)
>>> ata3: SATA link down (SStatus 0 SControl 0)
>>> ata4: SATA link up 3.0 Gbps (SStatus 123 SControl 0)
>>> ata4.00: ATA-10: ST4000VN008-2DR166, SC60, max UDMA/133
>>> ata4.00: 7814037168 sectors, multi 0: LBA48 NCQ (depth 31/32)
>>> ata4.00: configured for UDMA/100
>>> scsi 4:0:0:0: Direct-Access     ATA      ST4000VN008-2DR1 SC60 PQ: 0 ANSI: 5
>>> ata6.00: ATAPI: HL-DT-STDVD+-RW GSA-H21L, 1.04, max UDMA/44
>>> scsi 5:0:0:0: CD-ROM            HL-DT-ST DVD+-RW GSA-H21L 1.04 PQ: 0 ANSI: 5
>>> random: crng init done
>>> Timed out for waiting the udev queue being empty.
>>> Begin: Loading essential drivers ... done.
>>> Begin: Running /scripts/init-premount ... done.
>>> Begin: Mounting root file system ... Begin: Running /scripts/local-top ... done.
>>> Begin: Running /scripts/local-premount ... done.
>>> Timed out for waiting the udev queue being empty.
>>> Begin: Waiting for root file system ... Begin: Running /scripts/local-block ....
>>> Begin: Running /scripts/local-block ... done.
>>> Begin: Running /scripts/local-block ... done.
>>> Begin: Running /scripts/local-block ... done.
>>> Begin: Running /scripts/local-block ... done.
>>> Begin: Running /scripts/local-block ... done.
>>> Begin: Running /scripts/local-block ... done.
>>> Begin: Running /scripts/local-block ... done.
>>> Begin: Running /scripts/local-block ... done.
>>> Begin: Running /scripts/local-block ... done.
>>> Begin: Running /scripts/local-block ... done.
>>> Begin: Running /scripts/local-block ... done.
>>> Begin: Running /scripts/local-block ... done.
>>> Begin: Running /scripts/local-block ... done.
>>> Begin: Running /scripts/local-block ... done.
>>> Begin: Running /scripts/local-block ... done.
>>> Begin: Running /scripts/local-block ... done.
>>> Begin: Running /scripts/local-block ... done.
>>> Begin: Running /scripts/local-block ... done.
>>> done.
>>> Gave up waiting for root file system device.  Common problems:
>>>  - Boot args (cat /proc/cmdline)
>>>    - Check rootdelay= (did the system wait long enough?)
>>>  - Missing modules (cat /proc/modules; ls /dev)
>>> ALERT!  LABEL=ROOT does not exist.  Dropping to a shell!
>>> Rebooting automatically due to panic= boot argument
>>>
>>> I'll see if I can find the commit that breaks 6.5.
>> I've traced this to the following merge commit:
>>
>> dave@atlas:~/linux/linux$ git bisect good
>> ca7ce08d6a063e0ccb91dc57f9bc213120d0d1a7 is the first bad commit
>> commit ca7ce08d6a063e0ccb91dc57f9bc213120d0d1a7
>> Merge: 1546cd4bfda4 af92c02fb209
>> Author: Linus Torvalds <[email protected]>
>> Date:   Fri Jun 30 11:57:07 2023 -0700
>>
>>     Merge tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi
>>
>>     Pull SCSI updates from James Bottomley:
>>      "Updates to the usual drivers (ufs, pm80xx, libata-scsi, smartpqi,
>>       lpfc, qla2xxx).
>>
>>       We have a couple of major core changes impacting other systems:
>>
>>        - Command Duration Limits, which spills into block and ATA
>>
>>        - block level Persistent Reservation Operations, which touches block,
>>          nvme, target and dm
>>
>>       Both of these are added with merge commits containing a cover letter
>>       explaining what's going on"
>>
>>     * tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: (187 commits)
>>       scsi: core: Improve warning message in scsi_device_block()
>>       scsi: core: Replace scsi_target_block() with scsi_block_targets()
>>       scsi: core: Don't wait for quiesce in scsi_device_block()
>>       scsi: core: Don't wait for quiesce in scsi_stop_queue()
>>       scsi: core: Merge scsi_internal_device_block() and device_block()
>>       scsi: sg: Increase number of devices
>>       scsi: bsg: Increase number of devices
>>       scsi: qla2xxx: Remove unused nvme_ls_waitq wait queue
>>       scsi: ufs: ufs-pci: Add support for Intel Arrow Lake
>>       scsi: sd: sd_zbc: Use PAGE_SECTORS_SHIFT
>>       scsi: ufs: wb: Add explicit flush_threshold sysfs attribute
>>       scsi: ufs: ufs-qcom: Switch to the new ICE API
>>       scsi: ufs: dt-bindings: qcom: Add ICE phandle
>>       scsi: ufs: ufs-mediatek: Set UFSHCD_QUIRK_MCQ_BROKEN_RTC quirk
>>       scsi: ufs: ufs-mediatek: Set UFSHCD_QUIRK_MCQ_BROKEN_INTR quirk
>>       scsi: ufs: core: Add host quirk UFSHCD_QUIRK_MCQ_BROKEN_RTC
>>       scsi: ufs: core: Add host quirk UFSHCD_QUIRK_MCQ_BROKEN_INTR
>>       scsi: ufs: core: Remove dedicated hwq for dev command
>>       scsi: ufs: core: mcq: Fix the incorrect OCS value for the device command
>>       scsi: ufs: dt-bindings: samsung,exynos: Drop unneeded quotes
>>       ...
>>
>> dave@atlas:~/linux/linux$ lspci
>> 00:01.0 RAID bus controller: Silicon Image, Inc. SiI 3124 PCI-X Serial ATA Controller (rev 02)
>> 40:01.0 SCSI storage controller: Broadcom / LSI 53c1030 PCI-X Fusion-MPT Dual Ultra320 SCSI (rev 07)
>> 40:01.1 SCSI storage controller: Broadcom / LSI 53c1030 PCI-X Fusion-MPT Dual Ultra320 SCSI (rev 07)
>> 60:01.0 USB controller: NEC Corporation OHCI USB Controller (rev 41)
>> 60:01.1 USB controller: NEC Corporation OHCI USB Controller (rev 41)
>> 60:01.2 USB controller: NEC Corporation uPD72010x USB 2.0 Controller (rev 02)
>> 60:02.0 IDE interface: Silicon Image, Inc. PCI0680 Ultra ATA-133 Host Controller (rev 02)
>> 60:03.0 Ethernet controller: Intel Corporation 82540EM Gigabit Ethernet Controller (rev 02)
> This was introduced by the following commit:
>
> dave@atlas:~/linux/linux$ git bisect good
> 624885209f31eb9985bf51abe204ecbffe2fdeea is the first bad commit
> commit 624885209f31eb9985bf51abe204ecbffe2fdeea
> Author: Damien Le Moal <[email protected]>
> Date:   Thu May 11 03:13:41 2023 +0200
>
>     scsi: core: Detect support for command duration limits
>
>     Introduce the function scsi_cdl_check() to detect if a device supports
>     command duration limits (CDL). Support for the READ 16, WRITE 16, READ 32
>     and WRITE 32 commands are checked using the function scsi_report_opcode()
>     to probe the rwcdlp and cdlp bits as they indicate the mode page defining
>     the command duration limits descriptors that apply to the command being
>     tested.
>
>     If any of these commands support CDL, the field cdl_supported of struct
>     scsi_device is set to 1 to indicate that the device supports CDL.
>
>     Support for CDL for a device is advertizes through sysfs using the new
>     cdl_supported device attribute. This attribute value is 1 for a device
>     supporting CDL and 0 otherwise.
>
>     Signed-off-by: Damien Le Moal <[email protected]>
>     Reviewed-by: Hannes Reinecke <[email protected]>
>     Co-developed-by: Niklas Cassel <[email protected]>
>     Signed-off-by: Niklas Cassel <[email protected]>
>     Link: https://lore.kernel.org/r/[email protected]
>     Signed-off-by: Martin K. Petersen <[email protected]>
>
>  Documentation/ABI/testing/sysfs-block-device |  9 ++++
>  drivers/scsi/scsi.c                          | 81 ++++++++++++++++++++++++++++
>  drivers/scsi/scsi_scan.c                     |  3 ++
>  drivers/scsi/scsi_sysfs.c                    |  2 +
>  include/scsi/scsi_device.h                   |  3 ++
>  5 files changed, 98 insertions(+)
>
> Sometimes I see when booting a bad commit:
> [...]
> Begin: Running /scripts/local-block ... done.
> Begin: Running /scripts/local-block ... done.
> Begin: Running /scripts/local-block ... done.
> done.
> Gave up waiting for root file system device.  Common problems:
>  - Boot args (cat /proc/cmdline)
>    - Check rootdelay= (did the system wait long enough?)
>  - Missing modules (cat /proc/modules; ls /dev)
> ALERT!  LABEL=ROOT does not exist.  Dropping to a shell!
> Rebooting automatically due to panic= boot argument
> ata4: SATA link down (SStatus 0 SControl 0)
> ata5: SATA link down (SStatus 0 SControl 0)
> ata6: SATA link up 3.0 Gbps (SStatus 123 SControl 0)
> ata6.00: ATA-10: ST4000VN008-2DR166, SC60, max UDMA/133
> ata6.00: 7814037168 sectors, multi 0: LBA48 NCQ (depth 31/32)
> ata6.00: configured for UDMA/100
> scsi 5:0:0:0: Direct-Access     ATA      ST4000VN008-2DR1 SC60 PQ: 0 ANSI: 5

System boots master at e56b2b605799 if I disable CDL:

dave@atlas:~/linux/linux$ git diff drivers/scsi/scsi.c
diff --git a/drivers/scsi/scsi.c b/drivers/scsi/scsi.c
index d0911bc28663..dc3a283ebd75 100644
--- a/drivers/scsi/scsi.c
+++ b/drivers/scsi/scsi.c
@@ -578,6 +578,8 @@ static bool scsi_cdl_check_cmd(struct scsi_device *sdev, u8 opcode, u16 sa,
        int ret;
        u8 cdlp;

+       return false;
+
        /* Check operation code */
        ret = scsi_report_opcode(sdev, buf, SCSI_CDL_CHECK_BUF_LEN, opcode, sa);
        if (ret <= 0)

Dave

--
John David Anglin [email protected]

2023-09-14 10:41:44

by Damien Le Moal

[permalink] [raw]
Subject: Re: [PATCH] linux/export: fix reference to exported functions for parisc64

On 9/14/23 06:22, John David Anglin wrote:
> On 2023-09-13 1:58 p.m., John David Anglin wrote:
>> On 2023-09-12 5:53 p.m., John David Anglin wrote:
>>> On 2023-09-10 5:30 p.m., John David Anglin wrote:
>>>> Hi Masahiro,
>>>>
>>>> The attached change fixed boot at ddb5cdbafaaa ????
>>>>
>>>> However, v6.5.x boot is still broken:
>>>>
>>>> Run /init as init process
>>>> process '/usr/bin/sh' started with executable stack
>>>> Loading, please wait...
>>>> Starting systemd-udevd version 254.1-3
>>>> e1000 alternatives: applied 0 out of 569 patches
>>>> e1000: Intel(R) PRO/1000 Network Driver
>>>> e1000: Copyright (c) 1999-2006 Intel Corporation.
>>>> scsi_mod alternatives: applied 0 out of 7 patches
>>>> SCSI subsystem initialized
>>>> usbcore alternatives: applied 0 out of 18 patches
>>>> usbcore: registered new interface driver usbfs
>>>> libata alternatives: applied 0 out of 3 patches
>>>> usbcore: registered new interface driver hub
>>>> usbcore: registered new device driver usb
>>>> mptbase alternatives: applied 0 out of 73 patches
>>>> ehci_hcd alternatives: applied 0 out of 114 patches
>>>> sata_sil24 alternatives: applied 0 out of 56 patches
>>>> Fusion MPT base driver 3.04.20
>>>> Copyright (c) 1999-2008 LSI Corporation
>>>> sata_sil24 0000:00:01.0: Applying completion IRQ loss on PCI-X errata fix
>>>> scsi host0: sata_sil24
>>>> scsi host1: sata_sil24
>>>> pata_sil680 0000:60:02.0: sil680: 133MHz clock.
>>>> scsi host2: sata_sil24
>>>> ehci_pci alternatives: applied 0 out of 2 patches
>>>> ohci_hcd alternatives: applied 0 out of 144 patches
>>>> ehci-pci 0000:60:01.2: EHCI Host Controller
>>>> scsi host3: pata_sil680
>>>> ehci-pci 0000:60:01.2: new USB bus registered, assigned bus number 1
>>>> scsi host4: sata_sil24
>>>> ata1: SATA max UDMA/100 host m128@0xffffffff80088000 port 0xffffffff80080000 ir6
>>>> ata2: SATA max UDMA/100 host m128@0xffffffff80088000 port 0xffffffff80082000 ir6
>>>> ata3: SATA max UDMA/100 host m128@0xffffffff80088000 port 0xffffffff80084000 ir6
>>>> ata4: SATA max UDMA/100 host m128@0xffffffff80088000 port 0xffffffff80086000 ir6
>>>> e1000 0000:60:03.0 eth0: (PCI:33MHz:32-bit) 00:11:0a:31:8a:77
>>>> ehci-pci 0000:60:01.2: irq 71, io mem 0xffffffffb00a1000
>>>> scsi host5: pata_sil680
>>>> ata5: PATA max UDMA/133 cmd 0x26058 ctl 0x26064 bmdma 0x26040 irq 72
>>>> ata6: PATA max UDMA/133 cmd 0x26050 ctl 0x26060 bmdma 0x26048 irq 72
>>>> e1000 0000:60:03.0 eth0: Intel(R) PRO/1000 Network Connection
>>>> ehci-pci 0000:60:01.2: USB 2.0 started, EHCI 0.95
>>>> usb usb1: New USB device found, idVendor=1d6b, idProduct=0002, bcdDevice= 6.05
>>>> usb usb1: New USB device strings: Mfr=3, Product=2, SerialNumber=1
>>>> usb usb1: Product: EHCI Host Controller
>>>> usb usb1: Manufacturer: Linux 6.5.2-dirty ehci_hcd
>>>> usb usb1: SerialNumber: 0000:60:01.2
>>>> hub 1-0:1.0: USB hub found
>>>> hub 1-0:1.0: 5 ports detected
>>>> ata1: SATA link down (SStatus 0 SControl 0)
>>>> ata2: SATA link down (SStatus 0 SControl 0)
>>>> ata3: SATA link down (SStatus 0 SControl 0)
>>>> ata4: SATA link up 3.0 Gbps (SStatus 123 SControl 0)
>>>> ata4.00: ATA-10: ST4000VN008-2DR166, SC60, max UDMA/133
>>>> ata4.00: 7814037168 sectors, multi 0: LBA48 NCQ (depth 31/32)
>>>> ata4.00: configured for UDMA/100
>>>> scsi 4:0:0:0: Direct-Access     ATA      ST4000VN008-2DR1 SC60 PQ: 0 ANSI: 5
>>>> ata6.00: ATAPI: HL-DT-STDVD+-RW GSA-H21L, 1.04, max UDMA/44
>>>> scsi 5:0:0:0: CD-ROM            HL-DT-ST DVD+-RW GSA-H21L 1.04 PQ: 0 ANSI: 5
>>>> random: crng init done
>>>> Timed out for waiting the udev queue being empty.
>>>> Begin: Loading essential drivers ... done.
>>>> Begin: Running /scripts/init-premount ... done.
>>>> Begin: Mounting root file system ... Begin: Running /scripts/local-top ... done.
>>>> Begin: Running /scripts/local-premount ... done.
>>>> Timed out for waiting the udev queue being empty.
>>>> Begin: Waiting for root file system ... Begin: Running /scripts/local-block ....
>>>> Begin: Running /scripts/local-block ... done.
>>>> Begin: Running /scripts/local-block ... done.
>>>> Begin: Running /scripts/local-block ... done.
>>>> Begin: Running /scripts/local-block ... done.
>>>> Begin: Running /scripts/local-block ... done.
>>>> Begin: Running /scripts/local-block ... done.
>>>> Begin: Running /scripts/local-block ... done.
>>>> Begin: Running /scripts/local-block ... done.
>>>> Begin: Running /scripts/local-block ... done.
>>>> Begin: Running /scripts/local-block ... done.
>>>> Begin: Running /scripts/local-block ... done.
>>>> Begin: Running /scripts/local-block ... done.
>>>> Begin: Running /scripts/local-block ... done.
>>>> Begin: Running /scripts/local-block ... done.
>>>> Begin: Running /scripts/local-block ... done.
>>>> Begin: Running /scripts/local-block ... done.
>>>> Begin: Running /scripts/local-block ... done.
>>>> Begin: Running /scripts/local-block ... done.
>>>> done.
>>>> Gave up waiting for root file system device.  Common problems:
>>>>  - Boot args (cat /proc/cmdline)
>>>>    - Check rootdelay= (did the system wait long enough?)
>>>>  - Missing modules (cat /proc/modules; ls /dev)
>>>> ALERT!  LABEL=ROOT does not exist.  Dropping to a shell!
>>>> Rebooting automatically due to panic= boot argument
>>>>
>>>> I'll see if I can find the commit that breaks 6.5.
>>> I've traced this to the following merge commit:
>>>
>>> dave@atlas:~/linux/linux$ git bisect good
>>> ca7ce08d6a063e0ccb91dc57f9bc213120d0d1a7 is the first bad commit
>>> commit ca7ce08d6a063e0ccb91dc57f9bc213120d0d1a7
>>> Merge: 1546cd4bfda4 af92c02fb209
>>> Author: Linus Torvalds <[email protected]>
>>> Date:   Fri Jun 30 11:57:07 2023 -0700
>>>
>>>     Merge tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi
>>>
>>>     Pull SCSI updates from James Bottomley:
>>>      "Updates to the usual drivers (ufs, pm80xx, libata-scsi, smartpqi,
>>>       lpfc, qla2xxx).
>>>
>>>       We have a couple of major core changes impacting other systems:
>>>
>>>        - Command Duration Limits, which spills into block and ATA
>>>
>>>        - block level Persistent Reservation Operations, which touches block,
>>>          nvme, target and dm
>>>
>>>       Both of these are added with merge commits containing a cover letter
>>>       explaining what's going on"
>>>
>>>     * tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: (187 commits)
>>>       scsi: core: Improve warning message in scsi_device_block()
>>>       scsi: core: Replace scsi_target_block() with scsi_block_targets()
>>>       scsi: core: Don't wait for quiesce in scsi_device_block()
>>>       scsi: core: Don't wait for quiesce in scsi_stop_queue()
>>>       scsi: core: Merge scsi_internal_device_block() and device_block()
>>>       scsi: sg: Increase number of devices
>>>       scsi: bsg: Increase number of devices
>>>       scsi: qla2xxx: Remove unused nvme_ls_waitq wait queue
>>>       scsi: ufs: ufs-pci: Add support for Intel Arrow Lake
>>>       scsi: sd: sd_zbc: Use PAGE_SECTORS_SHIFT
>>>       scsi: ufs: wb: Add explicit flush_threshold sysfs attribute
>>>       scsi: ufs: ufs-qcom: Switch to the new ICE API
>>>       scsi: ufs: dt-bindings: qcom: Add ICE phandle
>>>       scsi: ufs: ufs-mediatek: Set UFSHCD_QUIRK_MCQ_BROKEN_RTC quirk
>>>       scsi: ufs: ufs-mediatek: Set UFSHCD_QUIRK_MCQ_BROKEN_INTR quirk
>>>       scsi: ufs: core: Add host quirk UFSHCD_QUIRK_MCQ_BROKEN_RTC
>>>       scsi: ufs: core: Add host quirk UFSHCD_QUIRK_MCQ_BROKEN_INTR
>>>       scsi: ufs: core: Remove dedicated hwq for dev command
>>>       scsi: ufs: core: mcq: Fix the incorrect OCS value for the device command
>>>       scsi: ufs: dt-bindings: samsung,exynos: Drop unneeded quotes
>>>       ...
>>>
>>> dave@atlas:~/linux/linux$ lspci
>>> 00:01.0 RAID bus controller: Silicon Image, Inc. SiI 3124 PCI-X Serial ATA Controller (rev 02)
>>> 40:01.0 SCSI storage controller: Broadcom / LSI 53c1030 PCI-X Fusion-MPT Dual Ultra320 SCSI (rev 07)
>>> 40:01.1 SCSI storage controller: Broadcom / LSI 53c1030 PCI-X Fusion-MPT Dual Ultra320 SCSI (rev 07)
>>> 60:01.0 USB controller: NEC Corporation OHCI USB Controller (rev 41)
>>> 60:01.1 USB controller: NEC Corporation OHCI USB Controller (rev 41)
>>> 60:01.2 USB controller: NEC Corporation uPD72010x USB 2.0 Controller (rev 02)
>>> 60:02.0 IDE interface: Silicon Image, Inc. PCI0680 Ultra ATA-133 Host Controller (rev 02)
>>> 60:03.0 Ethernet controller: Intel Corporation 82540EM Gigabit Ethernet Controller (rev 02)
>> This was introduced by the following commit:
>>
>> dave@atlas:~/linux/linux$ git bisect good
>> 624885209f31eb9985bf51abe204ecbffe2fdeea is the first bad commit
>> commit 624885209f31eb9985bf51abe204ecbffe2fdeea
>> Author: Damien Le Moal <[email protected]>
>> Date:   Thu May 11 03:13:41 2023 +0200
>>
>>     scsi: core: Detect support for command duration limits
>>
>>     Introduce the function scsi_cdl_check() to detect if a device supports
>>     command duration limits (CDL). Support for the READ 16, WRITE 16, READ 32
>>     and WRITE 32 commands are checked using the function scsi_report_opcode()
>>     to probe the rwcdlp and cdlp bits as they indicate the mode page defining
>>     the command duration limits descriptors that apply to the command being
>>     tested.
>>
>>     If any of these commands support CDL, the field cdl_supported of struct
>>     scsi_device is set to 1 to indicate that the device supports CDL.
>>
>>     Support for CDL for a device is advertizes through sysfs using the new
>>     cdl_supported device attribute. This attribute value is 1 for a device
>>     supporting CDL and 0 otherwise.
>>
>>     Signed-off-by: Damien Le Moal <[email protected]>
>>     Reviewed-by: Hannes Reinecke <[email protected]>
>>     Co-developed-by: Niklas Cassel <[email protected]>
>>     Signed-off-by: Niklas Cassel <[email protected]>
>>     Link: https://lore.kernel.org/r/[email protected]
>>     Signed-off-by: Martin K. Petersen <[email protected]>
>>
>>  Documentation/ABI/testing/sysfs-block-device |  9 ++++
>>  drivers/scsi/scsi.c                          | 81 ++++++++++++++++++++++++++++
>>  drivers/scsi/scsi_scan.c                     |  3 ++
>>  drivers/scsi/scsi_sysfs.c                    |  2 +
>>  include/scsi/scsi_device.h                   |  3 ++
>>  5 files changed, 98 insertions(+)
>>
>> Sometimes I see when booting a bad commit:
>> [...]
>> Begin: Running /scripts/local-block ... done.
>> Begin: Running /scripts/local-block ... done.
>> Begin: Running /scripts/local-block ... done.
>> done.
>> Gave up waiting for root file system device.  Common problems:
>>  - Boot args (cat /proc/cmdline)
>>    - Check rootdelay= (did the system wait long enough?)
>>  - Missing modules (cat /proc/modules; ls /dev)
>> ALERT!  LABEL=ROOT does not exist.  Dropping to a shell!
>> Rebooting automatically due to panic= boot argument
>> ata4: SATA link down (SStatus 0 SControl 0)
>> ata5: SATA link down (SStatus 0 SControl 0)
>> ata6: SATA link up 3.0 Gbps (SStatus 123 SControl 0)
>> ata6.00: ATA-10: ST4000VN008-2DR166, SC60, max UDMA/133
>> ata6.00: 7814037168 sectors, multi 0: LBA48 NCQ (depth 31/32)
>> ata6.00: configured for UDMA/100
>> scsi 5:0:0:0: Direct-Access     ATA      ST4000VN008-2DR1 SC60 PQ: 0 ANSI: 5
>
> System boots master at e56b2b605799 if I disable CDL:
>
> dave@atlas:~/linux/linux$ git diff drivers/scsi/scsi.c
> diff --git a/drivers/scsi/scsi.c b/drivers/scsi/scsi.c
> index d0911bc28663..dc3a283ebd75 100644
> --- a/drivers/scsi/scsi.c
> +++ b/drivers/scsi/scsi.c
> @@ -578,6 +578,8 @@ static bool scsi_cdl_check_cmd(struct scsi_device *sdev, u8 opcode, u16 sa,
>         int ret;
>         u8 cdlp;
>
> +       return false;
> +
>         /* Check operation code */
>         ret = scsi_report_opcode(sdev, buf, SCSI_CDL_CHECK_BUF_LEN, opcode, sa);
>         if (ret <= 0)

It is weird that this solves anything... the MAINTENANCE_IN command issued by
scsi_report_opcode() ends up being emulated in libata with
ata_scsiop_maint_in(). There are no actual commands issued to the drive, so
nothing that could actually fail/cause issues. By the time this is issued, the
ATA drive is also fully probed...

Or is the drive connected to the Broadcom HBA you have ? In that case, libata is
not used and the HBA FW SAT (scsi-ata-translation) is likely to blame.

Could you send a full dmesg output for a clean boot and for a failed one so that
I can compare ?

--
Damien Le Moal
Western Digital Research