2022-10-12 08:05:43

by Naresh Kamboju

Subject: TI: X15 the connected SSD is not detected on Linux next 20221006 tag

On the TI BeagleBoard-X15, the connected SSD is not detected on the
linux-next 20221006 tag.

+ export STORAGE_DEV=/dev/disk/by-id/ata-SanDisk_SSD_PLUS_120GB_190702A00D84
+ STORAGE_DEV=/dev/disk/by-id/ata-SanDisk_SSD_PLUS_120GB_190702A00D84
+ test -n /dev/disk/by-id/ata-SanDisk_SSD_PLUS_120GB_190702A00D84
+ echo y
+ mkfs.ext4 /dev/disk/by-id/ata-SanDisk_SSD_PLUS_120GB_190702A00D84
mke2fs 1.46.5 (30-Dec-2021)
The file /dev/disk/by-id/ata-SanDisk_SSD_PLUS_120GB_190702A00D84 does
not exist and no size was specified.
+ lava-test-raise 'mkfs.ext4
/dev/disk/by-id/ata-SanDisk_SSD_PLUS_120GB_190702A00D84 failed; job
exit'
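
Note that `test -n` in the log above only checks that STORAGE_DEV is a non-empty string, so the missing device is first noticed by mkfs.ext4. A minimal sketch of an earlier check follows; the function name `storage_dev_problem` is hypothetical and is not part of the LKFT test scripts.

```python
import os
import stat


def storage_dev_problem(path):
    """Return a reason string if path cannot be used as a block device, else None."""
    if not path:
        return "STORAGE_DEV is empty"
    # /dev/disk/by-id entries are symlinks; os.path.exists() follows them.
    if not os.path.exists(path):
        return f"{path} does not exist (device not detected?)"
    if not stat.S_ISBLK(os.stat(path).st_mode):
        return f"{path} is not a block device"
    return None
```

Calling it with the by-id path before mkfs.ext4 would report the undetected SSD directly instead of through the mke2fs error.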

Test log:
- https://lkft.validation.linaro.org/scheduler/job/5634743#L2580

Reported-by: Linux Kernel Functional Testing <[email protected]>

metadata:
git_ref: master
git_repo: https://gitlab.com/Linaro/lkft/mirrors/next/linux-next
git_sha: 7da9fed0474b4cd46055dd92d55c42faf32c19ac
git_describe: next-20221006
kernel_version: 6.0.0
kernel-config: https://builds.tuxbuild.com/2FkkkZ51ZYhBL1G8D69YX8Pkt5F/config
build-url: https://gitlab.com/Linaro/lkft/mirrors/next/linux-next/-/pipelines/659754170
artifact-location: https://builds.tuxbuild.com/2FkkkZ51ZYhBL1G8D69YX8Pkt5F
toolchain: gcc-10

--
Linaro LKFT
https://lkft.linaro.org


2022-10-13 07:35:27

by Damien Le Moal

Subject: Re: TI: X15 the connected SSD is not detected on Linux next 20221006 tag

On 2022/10/12 16:24, Naresh Kamboju wrote:
> On TI beagle board x15 the connected SSD is not detected on linux next
> 20221006 tag.
>
> + export STORAGE_DEV=/dev/disk/by-id/ata-SanDisk_SSD_PLUS_120GB_190702A00D84
> + STORAGE_DEV=/dev/disk/by-id/ata-SanDisk_SSD_PLUS_120GB_190702A00D84
> + test -n /dev/disk/by-id/ata-SanDisk_SSD_PLUS_120GB_190702A00D84
> + echo y
> + mkfs.ext4 /dev/disk/by-id/ata-SanDisk_SSD_PLUS_120GB_190702A00D84
> mke2fs 1.46.5 (30-Dec-2021)
> The file /dev/disk/by-id/ata-SanDisk_SSD_PLUS_120GB_190702A00D84 does
> not exist and no size was specified.
> + lava-test-raise 'mkfs.ext4
> /dev/disk/by-id/ata-SanDisk_SSD_PLUS_120GB_190702A00D84 failed; job
> exit'
>
> Test log:
> - https://lkft.validation.linaro.org/scheduler/job/5634743#L2580
>
> Reported-by: Linux Kernel Functional Testing <[email protected]>
>
> metadata:
> git_ref: master
> git_repo: https://gitlab.com/Linaro/lkft/mirrors/next/linux-next
> git_sha: 7da9fed0474b4cd46055dd92d55c42faf32c19ac
> git_describe: next-20221006
> kernel_version: 6.0.0
> kernel-config: https://builds.tuxbuild.com/2FkkkZ51ZYhBL1G8D69YX8Pkt5F/config
> build-url: https://gitlab.com/Linaro/lkft/mirrors/next/linux-next/-/pipelines/659754170
> artifact-location: https://builds.tuxbuild.com/2FkkkZ51ZYhBL1G8D69YX8Pkt5F
> toolchain: gcc-10

The kernel messages that are shown in the links above do not show any "libata
version 3.00 loaded." message nor any ata/ahci message that I can see. So I
think the eSATA adapter is not even being detected and libata/ahci driver not used.
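
The check described above (looking for libata/ahci lines in the boot log) can be sketched roughly as follows. The regular expression is an illustrative assumption about what such lines look like, not an official list of libata messages, and `ata_lines` is a hypothetical helper name.

```python
import re

# Assumed patterns: the libata banner, any ahci driver line, or an "ataN:" port line.
ATA_LINE = re.compile(r"libata version|\bahci|\bata\d+:")


def ata_lines(boot_log):
    """Return the boot-log lines that mention libata, ahci, or an ataN port."""
    return [line for line in boot_log.splitlines() if ATA_LINE.search(line)]
```

An empty result for a full boot log would be consistent with the adapter never being probed by any ata driver.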

Was this working before ? If yes, can you try with the following patches reverted ?

d3243965f24a ("ata: make PATA_PLATFORM selectable only for suitable architectures")
3ebe59a54111 ("ata: clean up how architectures enable PATA_PLATFORM and
PATA_OF_PLATFORM")

If reverting these patches restores the eSATA port on this board, then you need
to fix the defconfig for that board.

--
Damien Le Moal
Western Digital Research

2022-10-13 13:13:55

by Naresh Kamboju

Subject: Re: TI: X15 the connected SSD is not detected on Linux next 20221006 tag

On Thu, 13 Oct 2022 at 12:41, Damien Le Moal
<[email protected]> wrote:
>
> On 2022/10/12 16:24, Naresh Kamboju wrote:
> > On TI beagle board x15 the connected SSD is not detected on linux next
> > 20221006 tag.
> >
> > + export STORAGE_DEV=/dev/disk/by-id/ata-SanDisk_SSD_PLUS_120GB_190702A00D84
> > + STORAGE_DEV=/dev/disk/by-id/ata-SanDisk_SSD_PLUS_120GB_190702A00D84
> > + test -n /dev/disk/by-id/ata-SanDisk_SSD_PLUS_120GB_190702A00D84
> > + echo y
> > + mkfs.ext4 /dev/disk/by-id/ata-SanDisk_SSD_PLUS_120GB_190702A00D84
> > mke2fs 1.46.5 (30-Dec-2021)
> > The file /dev/disk/by-id/ata-SanDisk_SSD_PLUS_120GB_190702A00D84 does
> > not exist and no size was specified.
> > + lava-test-raise 'mkfs.ext4
> > /dev/disk/by-id/ata-SanDisk_SSD_PLUS_120GB_190702A00D84 failed; job
> > exit'

The reported issue is now noticed on the Linux mainline master branch.

1)
I see that the following config is missing from the latest problematic builds:
- CONFIG_HAVE_PATA_PLATFORM=y

2)
The following AHCI/SATA kernel messages are missing on problematic boots:
[ 1.408660] ahci 4a140000.sata: forcing port_map 0x0 -> 0x1
[ 1.408691] ahci 4a140000.sata: AHCI 0001.0300 32 slots 1 ports 3 Gbps 0x1 impl platform mode
[ 1.408721] ahci 4a140000.sata: flags: 64bit ncq sntf pm led clo only pmp pio slum part ccc apst
[ 1.409820] scsi host0: ahci
[ 1.410064] ata1: SATA max UDMA/133 mmio [mem 0x4a140000-0x4a1410ff] port 0x100 irq 98

3)
GOOD: 9d84bb40bcb30a7fa16f33baa967aeb9953dda78
BAD: e08466a7c00733a501d3c5328d29ec974478d717

4)
Here I am adding links to the working and non-working test jobs and kernel configs.
Problematic test job:
- https://lkft.validation.linaro.org/scheduler/job/5641407#L2602
Good test job:
- https://lkft.validation.linaro.org/scheduler/job/5640672#L2198

5)
metadata:
git_ref: master
git_repo: https://gitlab.com/Linaro/lkft/mirrors/torvalds/linux-mainline
git_sha: e08466a7c00733a501d3c5328d29ec974478d717
git_describe: v6.0-7220-ge08466a7c007
kernel_version: 6.0.0
kernel-config: https://builds.tuxbuild.com/2Fourpiqf1OrlPFFtKwhHV0wAiq/config
build-url: https://gitlab.com/Linaro/lkft/mirrors/torvalds/linux-mainline/-/pipelines/661424896
artifact-location: https://builds.tuxbuild.com/2Fourpiqf1OrlPFFtKwhHV0wAiq
toolchain: gcc-10


6)
For your information, these are the drivers/ata commits between the good
and bad commits:
$ git log --oneline 9d84bb40bcb3..e08466a7c007 -- drivers/ata
4078aa685097 Merge tag 'ata-6.1-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/dlemoal/libata
71d7b6e51ad3 ata: libata-eh: avoid needless hard reset when revalidating link
e3b1fff6c051 ata: libata: drop superfluous ata_eh_analyze_tf() parameter
b46c760e11c8 ata: libata: drop superfluous ata_eh_request_sense() parameter
cb6e73aaadff ata: libata-eh: Remove the unneeded result variable
ecf8322f464d ata: ahci_st: Enable compile test
2d29dd108c78 ata: ahci_st: Fix compilation warning
9628711aa649 ata: ahci-dwc: Add Baikal-T1 AHCI SATA interface support
bc7af9100fa8 ata: ahci-dwc: Add platform-specific quirks support
33629d35090f ata: ahci: Add DWC AHCI SATA controller support
6ce73f3a6fc0 ata: libahci_platform: Add function returning a clock-handle by id
18ee7c49f75b ata: ahci: Introduce firmware-specific caps initialization
7cbbfbe01a72 ata: ahci: Convert __ahci_port_base to accepting hpriv as arguments
fad64dc06579 ata: libahci: Don't read AHCI version twice in the save-config method
88589772e80c ata: libahci: Discard redundant force_port_map parameter
eb7cae0b6afd ata: libahci: Extend port-cmd flags set with port capabilities
f67f12ff57bc ata: libahci_platform: Introduce reset assertion/deassertion methods
3f74cd046fbe ata: libahci_platform: Parse ports-implemented property in resources getter
3c132ea6508b ata: libahci_platform: Sanity check the DT child nodes number
e28b3abf8020 ata: libahci_platform: Convert to using devm bulk clocks API
82d437e6dcb1 ata: libahci_platform: Convert to using platform devm-ioremap methods
d3243965f24a ata: make PATA_PLATFORM selectable only for suitable architectures
3ebe59a54111 ata: clean up how architectures enable PATA_PLATFORM and PATA_OF_PLATFORM
55d5ba550535 ata: libata-core: Check errors in sata_print_link_status()
03070458d700 ata: libata-sff: Fix double word in comments
0b2436d3d25f ata: pata_macio: Remove unneeded word in comments
024811a2da45 ata: libata-core: Simplify ata_dev_set_xfermode()
066de3b9d93b ata: libata-core: Simplify ata_build_rw_tf()
e00923c59e68 ata: libata: Rename ATA_DFLAG_NCQ_PRIO_ENABLE
614065aba704 ata: libata-core: remove redundant err_mask variable
fee6073051c3 ata: ahci: Do not check ACPI_FADT_LOW_POWER_S0
99ad3f9f829f ata: libata-core: improve parameter names for ata_dev_set_feature()
16169fb78182 ata: libata-core: Print timeout value when internal command times




> >
> > Test log:
> > - https://lkft.validation.linaro.org/scheduler/job/5634743#L2580
> >
> > Reported-by: Linux Kernel Functional Testing <[email protected]>
> >
> > metadata:
> > git_ref: master
> > git_repo: https://gitlab.com/Linaro/lkft/mirrors/next/linux-next
> > git_sha: 7da9fed0474b4cd46055dd92d55c42faf32c19ac
> > git_describe: next-20221006
> > kernel_version: 6.0.0
> > kernel-config: https://builds.tuxbuild.com/2FkkkZ51ZYhBL1G8D69YX8Pkt5F/config
> > build-url: https://gitlab.com/Linaro/lkft/mirrors/next/linux-next/-/pipelines/659754170
> > artifact-location: https://builds.tuxbuild.com/2FkkkZ51ZYhBL1G8D69YX8Pkt5F
> > toolchain: gcc-10

7)
> The kernel messages that are shown in the links above do not show any "libata
> version 3.00 loaded." message nor any ata/ahci message that I can see. So I
> think the eSATA adapter is not even being detected and libata/ahci driver not used.
>
> Was this working before ? If yes, can you try with the following patches reverted ?
>
> d3243965f24a ("ata: make PATA_PLATFORM selectable only for suitable architectures")
> 3ebe59a54111 ("ata: clean up how architectures enable PATA_PLATFORM and
> PATA_OF_PLATFORM")

I have reverted the above two patches, but the problem has not been solved.

8)
> If reverting these patches restores the eSATA port on this board, then you need
> to fix the defconfig for that board.

OTOH, Anders enabled the new config CONFIG_AHCI_DWC=y and tried it, but the
device failed to boot.

- Naresh

2022-10-13 22:20:39

by Anders Roxell

Subject: Re: TI: X15 the connected SSD is not detected on Linux next 20221006 tag

On Thu, 13 Oct 2022 at 14:39, Naresh Kamboju <[email protected]> wrote:
>
> On Thu, 13 Oct 2022 at 12:41, Damien Le Moal
> <[email protected]> wrote:
> >
> > On 2022/10/12 16:24, Naresh Kamboju wrote:
> > > On TI beagle board x15 the connected SSD is not detected on linux next
> > > 20221006 tag.
> > >
> > > + export STORAGE_DEV=/dev/disk/by-id/ata-SanDisk_SSD_PLUS_120GB_190702A00D84
> > > + STORAGE_DEV=/dev/disk/by-id/ata-SanDisk_SSD_PLUS_120GB_190702A00D84
> > > + test -n /dev/disk/by-id/ata-SanDisk_SSD_PLUS_120GB_190702A00D84
> > > + echo y
> > > + mkfs.ext4 /dev/disk/by-id/ata-SanDisk_SSD_PLUS_120GB_190702A00D84
> > > mke2fs 1.46.5 (30-Dec-2021)
> > > The file /dev/disk/by-id/ata-SanDisk_SSD_PLUS_120GB_190702A00D84 does
> > > not exist and no size was specified.
> > > + lava-test-raise 'mkfs.ext4
> > > /dev/disk/by-id/ata-SanDisk_SSD_PLUS_120GB_190702A00D84 failed; job
> > > exit'
>
> The reported issue is now noticed on the Linux mainline master branch.
>
> 1)
> I see following config is missing on latest problematic builds
> - CONFIG_HAVE_PATA_PLATFORM=y
>
> 2)
> Following ahci sata kernel message are missing on problematic boots,
> [ 1.408660] ahci 4a140000.sata: forcing port_map 0x0 -> 0x1
> [ 1.408691] ahci 4a140000.sata: AHCI 0001.0300 32 slots 1 ports 3
> Gbps 0x1 impl platform mode
> [ 1.408721] ahci 4a140000.sata: flags: 64bit ncq sntf pm led clo
> only pmp pio slum part ccc apst
> [ 1.409820] scsi host0: ahci
> [ 1.410064] ata1: SATA max UDMA/133 mmio [mem
> 0x4a140000-0x4a1410ff] port 0x100 irq 98
>
> 3)
> GOOD: 9d84bb40bcb30a7fa16f33baa967aeb9953dda78
> BAD: e08466a7c00733a501d3c5328d29ec974478d717
>
> 4)
> Here i am adding links working and not working test jobs and kernel configs,
> problematic test job:
> - https://lkft.validation.linaro.org/scheduler/job/5641407#L2602
> Good test job:
> - https://lkft.validation.linaro.org/scheduler/job/5640672#L2198
>
> 5)
> metadata:
> git_ref: master
> git_repo: https://gitlab.com/Linaro/lkft/mirrors/torvalds/linux-mainline
> git_sha: e08466a7c00733a501d3c5328d29ec974478d717
> git_describe: v6.0-7220-ge08466a7c007
> kernel_version: 6.0.0
> kernel-config: https://builds.tuxbuild.com/2Fourpiqf1OrlPFFtKwhHV0wAiq/config
> build-url: https://gitlab.com/Linaro/lkft/mirrors/torvalds/linux-mainline/-/pipelines/661424896
> artifact-location: https://builds.tuxbuild.com/2Fourpiqf1OrlPFFtKwhHV0wAiq
> toolchain: gcc-10
>
>
> 6)
> For your information,
> --
> I see diff on good to bad commits,
> $ git log --oneline 9d84bb40bcb3..e08466a7c007 -- drivers/ata
> 4078aa685097 Merge tag 'ata-6.1-rc1' of
> git://git.kernel.org/pub/scm/linux/kernel/git/dlemoal/libata
> 71d7b6e51ad3 ata: libata-eh: avoid needless hard reset when revalidating link
> e3b1fff6c051 ata: libata: drop superfluous ata_eh_analyze_tf() parameter
> b46c760e11c8 ata: libata: drop superfluous ata_eh_request_sense() parameter
> cb6e73aaadff ata: libata-eh: Remove the unneeded result variable
> ecf8322f464d ata: ahci_st: Enable compile test
> 2d29dd108c78 ata: ahci_st: Fix compilation warning
> 9628711aa649 ata: ahci-dwc: Add Baikal-T1 AHCI SATA interface support
> bc7af9100fa8 ata: ahci-dwc: Add platform-specific quirks support
> 33629d35090f ata: ahci: Add DWC AHCI SATA controller support
> 6ce73f3a6fc0 ata: libahci_platform: Add function returning a clock-handle by id
> 18ee7c49f75b ata: ahci: Introduce firmware-specific caps initialization
> 7cbbfbe01a72 ata: ahci: Convert __ahci_port_base to accepting hpriv as arguments
> fad64dc06579 ata: libahci: Don't read AHCI version twice in the
> save-config method
> 88589772e80c ata: libahci: Discard redundant force_port_map parameter
> eb7cae0b6afd ata: libahci: Extend port-cmd flags set with port capabilities
> f67f12ff57bc ata: libahci_platform: Introduce reset
> assertion/deassertion methods
> 3f74cd046fbe ata: libahci_platform: Parse ports-implemented property
> in resources getter
> 3c132ea6508b ata: libahci_platform: Sanity check the DT child nodes number
> e28b3abf8020 ata: libahci_platform: Convert to using devm bulk clocks API
> 82d437e6dcb1 ata: libahci_platform: Convert to using platform
> devm-ioremap methods
> d3243965f24a ata: make PATA_PLATFORM selectable only for suitable architectures
> 3ebe59a54111 ata: clean up how architectures enable PATA_PLATFORM and
> PATA_OF_PLATFORM
> 55d5ba550535 ata: libata-core: Check errors in sata_print_link_status()
> 03070458d700 ata: libata-sff: Fix double word in comments
> 0b2436d3d25f ata: pata_macio: Remove unneeded word in comments
> 024811a2da45 ata: libata-core: Simplify ata_dev_set_xfermode()
> 066de3b9d93b ata: libata-core: Simplify ata_build_rw_tf()
> e00923c59e68 ata: libata: Rename ATA_DFLAG_NCQ_PRIO_ENABLE
> 614065aba704 ata: libata-core: remove redundant err_mask variable
> fee6073051c3 ata: ahci: Do not check ACPI_FADT_LOW_POWER_S0
> 99ad3f9f829f ata: libata-core: improve parameter names for ata_dev_set_feature()
> 16169fb78182 ata: libata-core: Print timeout value when internal command times
>
>
>
>
> > >
> > > Test log:
> > > - https://lkft.validation.linaro.org/scheduler/job/5634743#L2580
> > >
> > > Reported-by: Linux Kernel Functional Testing <[email protected]>
> > >
> > > metadata:
> > > git_ref: master
> > > git_repo: https://gitlab.com/Linaro/lkft/mirrors/next/linux-next
> > > git_sha: 7da9fed0474b4cd46055dd92d55c42faf32c19ac
> > > git_describe: next-20221006
> > > kernel_version: 6.0.0
> > > kernel-config: https://builds.tuxbuild.com/2FkkkZ51ZYhBL1G8D69YX8Pkt5F/config
> > > build-url: https://gitlab.com/Linaro/lkft/mirrors/next/linux-next/-/pipelines/659754170
> > > artifact-location: https://builds.tuxbuild.com/2FkkkZ51ZYhBL1G8D69YX8Pkt5F
> > > toolchain: gcc-10
>
> 7)
> > The kernel messages that are shown in the links above do not show any "libata
> > version 3.00 loaded." message nor any ata/ahci message that I can see. So I
> > think the eSATA adapter is not even being detected and libata/ahci driver not used.
> >
> > Was this working before ? If yes, can you try with the following patches reverted ?
> >
> > d3243965f24a ("ata: make PATA_PLATFORM selectable only for suitable architectures")
> > 3ebe59a54111 ("ata: clean up how architectures enable PATA_PLATFORM and
> > PATA_OF_PLATFORM")
>
> I have reverted above two patches and but the problem has not been solved.
>
> 8)
> > If reverting these patches restores the eSATA port on this board, then you need
> > to fix the defconfig for that board.
>
> OTOH,
> Anders, enabled the new config CONFIG_AHCI_DWC=y and tried but the
> device failed to boot.

I thought enabling CONFIG_AHCI_DWC=y would make it work, but it didn't...
However, reverting patch 33629d35090f ("ata: ahci: Add DWC AHCI SATA
controller support") from next-20221013 was a success: the kernel booted
and the 'mkfs.ext4' command succeeded.

Build artifacts [1].
Any idea what is happening?

Cheers,
Anders
[1] https://builds.tuxbuild.com/2G53i1F7vUWWTuZJtka3Fr7iH1B/

2022-10-14 00:37:15

by Damien Le Moal

Subject: Re: TI: X15 the connected SSD is not detected on Linux next 20221006 tag

On 10/14/22 07:07, Anders Roxell wrote:
[...]
>> 8)
>>> If reverting these patches restores the eSATA port on this board, then you need
>>> to fix the defconfig for that board.
>>
>> OTOH,
>> Anders, enabled the new config CONFIG_AHCI_DWC=y and tried but the
>> device failed to boot.
>
> I thought it would work with enabling CONFIG_AHCI_DWC=y, but it didn't...

As mentioned in my previous reply to Naresh, this is a new driver added in
6.1. Your board was working before so this should not be the driver needed
for it.

> However, reverting patch 33629d35090f ("ata: ahci: Add DWC AHCI SATA
> controller support")
> from next-20221013 was a success, kernel booted and the 'mkfs.ext4' cmd was
> successful.

Which is very strange... There is only one hunk in that commit that could
be considered suspicious:

diff --git a/drivers/ata/ahci_platform.c b/drivers/ata/ahci_platform.c
index 9b56490ecbc3..8f5572a9f8f1 100644
--- a/drivers/ata/ahci_platform.c
+++ b/drivers/ata/ahci_platform.c
@@ -80,9 +80,7 @@ static SIMPLE_DEV_PM_OPS(ahci_pm_ops, ahci_platform_suspend,
 static const struct of_device_id ahci_of_match[] = {
 	{ .compatible = "generic-ahci", },
 	/* Keep the following compatibles for device tree compatibility */
-	{ .compatible = "snps,spear-ahci", },
 	{ .compatible = "ibm,476gtr-ahci", },
-	{ .compatible = "snps,dwc-ahci", },
 	{ .compatible = "hisilicon,hisi-ahci", },
 	{ .compatible = "cavium,octeon-7130-ahci", },
 	{ /* sentinel */ }

Is your board using one of these compatible strings ?

Serge ?
Any idea ?

--
Damien Le Moal
Western Digital Research

2022-10-14 01:21:08

by Damien Le Moal

Subject: Re: TI: X15 the connected SSD is not detected on Linux next 20221006 tag

On 10/13/22 21:39, Naresh Kamboju wrote:
> On Thu, 13 Oct 2022 at 12:41, Damien Le Moal
> <[email protected]> wrote:
>>
>> On 2022/10/12 16:24, Naresh Kamboju wrote:
>>> On TI beagle board x15 the connected SSD is not detected on linux next
>>> 20221006 tag.
>>>
>>> + export STORAGE_DEV=/dev/disk/by-id/ata-SanDisk_SSD_PLUS_120GB_190702A00D84
>>> + STORAGE_DEV=/dev/disk/by-id/ata-SanDisk_SSD_PLUS_120GB_190702A00D84
>>> + test -n /dev/disk/by-id/ata-SanDisk_SSD_PLUS_120GB_190702A00D84
>>> + echo y
>>> + mkfs.ext4 /dev/disk/by-id/ata-SanDisk_SSD_PLUS_120GB_190702A00D84
>>> mke2fs 1.46.5 (30-Dec-2021)
>>> The file /dev/disk/by-id/ata-SanDisk_SSD_PLUS_120GB_190702A00D84 does
>>> not exist and no size was specified.
>>> + lava-test-raise 'mkfs.ext4
>>> /dev/disk/by-id/ata-SanDisk_SSD_PLUS_120GB_190702A00D84 failed; job
>>> exit'
>
> The reported issue is now noticed on the Linux mainline master branch.
>
> 1)
> I see following config is missing on latest problematic builds
> - CONFIG_HAVE_PATA_PLATFORM=y
>
> 2)
> Following ahci sata kernel message are missing on problematic boots,
> [ 1.408660] ahci 4a140000.sata: forcing port_map 0x0 -> 0x1
> [ 1.408691] ahci 4a140000.sata: AHCI 0001.0300 32 slots 1 ports 3
> Gbps 0x1 impl platform mode
> [ 1.408721] ahci 4a140000.sata: flags: 64bit ncq sntf pm led clo
> only pmp pio slum part ccc apst
> [ 1.409820] scsi host0: ahci
> [ 1.410064] ata1: SATA max UDMA/133 mmio [mem
> 0x4a140000-0x4a1410ff] port 0x100 irq 98

I think the proper driver for this board is not being loaded, or is not
built in. What is the compatible string in the device tree for this ahci
adapter ? What driver does it need ? I quickly tried to google that info
but did not find any details.

> 3)
> GOOD: 9d84bb40bcb30a7fa16f33baa967aeb9953dda78
> BAD: e08466a7c00733a501d3c5328d29ec974478d717

What are these ? "git show" says they are merges of the drm and rdma pull
requests by Linus...

> 4)
> Here i am adding links working and not working test jobs and kernel configs,
> problematic test job:
> - https://lkft.validation.linaro.org/scheduler/job/5641407#L2602
> Good test job:
> - https://lkft.validation.linaro.org/scheduler/job/5640672#L2198

Hard to read... Can you send a diff of the kernel configs ?
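
For reference, the kernel tree ships scripts/diffconfig for this purpose. The idea can also be sketched in a few lines; the helper names `parse_config` and `diff_configs` below are hypothetical, and the parsing assumes the usual .config layout ("CONFIG_X=value" and "# CONFIG_X is not set").

```python
def parse_config(text):
    """Map each CONFIG_* symbol to its value; '# CONFIG_X is not set' maps to 'n'."""
    suffix = " is not set"
    opts = {}
    for line in text.splitlines():
        line = line.strip()
        if line.startswith("CONFIG_") and "=" in line:
            name, value = line.split("=", 1)
            opts[name] = value
        elif line.startswith("# CONFIG_") and line.endswith(suffix):
            # Strip the leading "# " and the trailing " is not set".
            opts[line[2:-len(suffix)]] = "n"
    return opts


def diff_configs(good_text, bad_text):
    """Return sorted (symbol, good_value, bad_value) tuples for symbols that differ."""
    good, bad = parse_config(good_text), parse_config(bad_text)
    return [
        (name, good.get(name), bad.get(name))
        for name in sorted(good.keys() | bad.keys())
        if good.get(name) != bad.get(name)
    ]
```

A symbol that exists in only one of the two configs shows up with None on the other side, which is how a dropped option such as CONFIG_HAVE_PATA_PLATFORM would appear.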

> 5)
> metadata:
> git_ref: master
> git_repo: https://gitlab.com/Linaro/lkft/mirrors/torvalds/linux-mainline
> git_sha: e08466a7c00733a501d3c5328d29ec974478d717
> git_describe: v6.0-7220-ge08466a7c007
> kernel_version: 6.0.0
> kernel-config: https://builds.tuxbuild.com/2Fourpiqf1OrlPFFtKwhHV0wAiq/config
> build-url: https://gitlab.com/Linaro/lkft/mirrors/torvalds/linux-mainline/-/pipelines/661424896
> artifact-location: https://builds.tuxbuild.com/2Fourpiqf1OrlPFFtKwhHV0wAiq
> toolchain: gcc-10
>
>
> 6)
> For your information,
> --
> I see diff on good to bad commits,
> $ git log --oneline 9d84bb40bcb3..e08466a7c007 -- drivers/ata
> 4078aa685097 Merge tag 'ata-6.1-rc1' of
> git://git.kernel.org/pub/scm/linux/kernel/git/dlemoal/libata
> 71d7b6e51ad3 ata: libata-eh: avoid needless hard reset when revalidating link
> e3b1fff6c051 ata: libata: drop superfluous ata_eh_analyze_tf() parameter
> b46c760e11c8 ata: libata: drop superfluous ata_eh_request_sense() parameter
> cb6e73aaadff ata: libata-eh: Remove the unneeded result variable
> ecf8322f464d ata: ahci_st: Enable compile test
> 2d29dd108c78 ata: ahci_st: Fix compilation warning
> 9628711aa649 ata: ahci-dwc: Add Baikal-T1 AHCI SATA interface support
> bc7af9100fa8 ata: ahci-dwc: Add platform-specific quirks support
> 33629d35090f ata: ahci: Add DWC AHCI SATA controller support
> 6ce73f3a6fc0 ata: libahci_platform: Add function returning a clock-handle by id
> 18ee7c49f75b ata: ahci: Introduce firmware-specific caps initialization
> 7cbbfbe01a72 ata: ahci: Convert __ahci_port_base to accepting hpriv as arguments
> fad64dc06579 ata: libahci: Don't read AHCI version twice in the
> save-config method
> 88589772e80c ata: libahci: Discard redundant force_port_map parameter
> eb7cae0b6afd ata: libahci: Extend port-cmd flags set with port capabilities
> f67f12ff57bc ata: libahci_platform: Introduce reset
> assertion/deassertion methods
> 3f74cd046fbe ata: libahci_platform: Parse ports-implemented property
> in resources getter
> 3c132ea6508b ata: libahci_platform: Sanity check the DT child nodes number
> e28b3abf8020 ata: libahci_platform: Convert to using devm bulk clocks API
> 82d437e6dcb1 ata: libahci_platform: Convert to using platform
> devm-ioremap methods
> d3243965f24a ata: make PATA_PLATFORM selectable only for suitable architectures
> 3ebe59a54111 ata: clean up how architectures enable PATA_PLATFORM and
> PATA_OF_PLATFORM
> 55d5ba550535 ata: libata-core: Check errors in sata_print_link_status()
> 03070458d700 ata: libata-sff: Fix double word in comments
> 0b2436d3d25f ata: pata_macio: Remove unneeded word in comments
> 024811a2da45 ata: libata-core: Simplify ata_dev_set_xfermode()
> 066de3b9d93b ata: libata-core: Simplify ata_build_rw_tf()
> e00923c59e68 ata: libata: Rename ATA_DFLAG_NCQ_PRIO_ENABLE
> 614065aba704 ata: libata-core: remove redundant err_mask variable
> fee6073051c3 ata: ahci: Do not check ACPI_FADT_LOW_POWER_S0
> 99ad3f9f829f ata: libata-core: improve parameter names for ata_dev_set_feature()
> 16169fb78182 ata: libata-core: Print timeout value when internal command times

I do not understand what you are trying to say here. These are the latest
ata patches for 6.1. They touch different drivers and ata core.
I still do not know which driver needs to be used on that board...

>>> Test log:
>>> - https://lkft.validation.linaro.org/scheduler/job/5634743#L2580
>>>
>>> Reported-by: Linux Kernel Functional Testing <[email protected]>
>>>
>>> metadata:
>>> git_ref: master
>>> git_repo: https://gitlab.com/Linaro/lkft/mirrors/next/linux-next
>>> git_sha: 7da9fed0474b4cd46055dd92d55c42faf32c19ac
>>> git_describe: next-20221006
>>> kernel_version: 6.0.0
>>> kernel-config: https://builds.tuxbuild.com/2FkkkZ51ZYhBL1G8D69YX8Pkt5F/config
>>> build-url: https://gitlab.com/Linaro/lkft/mirrors/next/linux-next/-/pipelines/659754170
>>> artifact-location: https://builds.tuxbuild.com/2FkkkZ51ZYhBL1G8D69YX8Pkt5F
>>> toolchain: gcc-10
>
> 7)
>> The kernel messages that are shown in the links above do not show any "libata
>> version 3.00 loaded." message nor any ata/ahci message that I can see. So I
>> think the eSATA adapter is not even being detected and libata/ahci driver not used.
>>
>> Was this working before ? If yes, can you try with the following patches reverted ?
>>
>> d3243965f24a ("ata: make PATA_PLATFORM selectable only for suitable architectures")
>> 3ebe59a54111 ("ata: clean up how architectures enable PATA_PLATFORM and
>> PATA_OF_PLATFORM")
>
> I have reverted above two patches and but the problem has not been solved.

OK.

>
> 8)
>> If reverting these patches restores the eSATA port on this board, then you need
>> to fix the defconfig for that board.
>
> OTOH,
> Anders, enabled the new config CONFIG_AHCI_DWC=y and tried but the
> device failed to boot.

Why would you need to enable this new driver ? Your board was working
before without this new driver, so it is not the one to use for this
board, right ? Please send the ata related bits of the device tree to
understand what this board needs.

--
Damien Le Moal
Western Digital Research

2022-10-14 07:59:16

by Arnd Bergmann

Subject: Re: TI: X15 the connected SSD is not detected on Linux next 20221006 tag

On Fri, Oct 14, 2022, at 2:22 AM, Damien Le Moal wrote:
> On 10/14/22 07:07, Anders Roxell wrote:
> [...]
>>> 8)
>>>> If reverting these patches restores the eSATA port on this board, then you need
>>>> to fix the defconfig for that board.
>>>
>>> OTOH,
>>> Anders, enabled the new config CONFIG_AHCI_DWC=y and tried but the
>>> device failed to boot.
>>
>> I thought it would work with enabling CONFIG_AHCI_DWC=y, but it didn't...
>
> As mentioned in my previous reply to Naresh, this is a new driver added in
> 6.1. Your board was working before so this should not be the driver needed
> for it.
>
>> However, reverting patch 33629d35090f ("ata: ahci: Add DWC AHCI SATA
>> controller support")
>> from next-20221013 was a success, kernel booted and the 'mkfs.ext4' cmd was
>> successful.
>
> Which is very strange... There is only one hunk in that commit that could
> be considered suspicious:
>
> diff --git a/drivers/ata/ahci_platform.c b/drivers/ata/ahci_platform.c
> index 9b56490ecbc3..8f5572a9f8f1 100644
> --- a/drivers/ata/ahci_platform.c
> +++ b/drivers/ata/ahci_platform.c
> @@ -80,9 +80,7 @@ static SIMPLE_DEV_PM_OPS(ahci_pm_ops, ahci_platform_suspend,
> static const struct of_device_id ahci_of_match[] = {
> { .compatible = "generic-ahci", },
> /* Keep the following compatibles for device tree compatibility */
> - { .compatible = "snps,spear-ahci", },
> { .compatible = "ibm,476gtr-ahci", },
> - { .compatible = "snps,dwc-ahci", },
> { .compatible = "hisilicon,hisi-ahci", },
> { .compatible = "cavium,octeon-7130-ahci", },
> { /* sentinel */ }
>
> Is your board using one of these compatible string ?

The x15 uses "snps,dwc-ahci". I would expect it to detect the device
with the new driver if that is loaded, but it's possible that the
driver does not work on all versions of the dwc-ahci hardware.

Anders, can you provide the boot log from a boot with the new driver
built in? There should be some messages from dwc-ahci about finding
the device, but then not ultimately working.

Depending on which way it goes wrong, the safest fallback for 6.1 is
probably to move the "snps,spear-ahci" and "snps,dwc-ahci" compatible
strings back into the old driver, and leave the new one only for
the "baikal,bt1-ahci" implementation of it, until it has been
successfully verified on TI am5/dra7, spear13xx and exynos.
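
The effect of moving the compatible strings can be illustrated with a toy model. This is only a sketch: real binding is done by the kernel's device-tree and platform bus matching, and the dictionary keys "ahci_platform" and "ahci_dwc" are labels taken from the file names in this thread, not an API. The tables are the ones quoted above.

```python
# Match table of ahci_platform.c before commit 33629d35090f (from the quoted hunk).
AHCI_PLATFORM_OLD = [
    "generic-ahci", "snps,spear-ahci", "ibm,476gtr-ahci",
    "snps,dwc-ahci", "hisilicon,hisi-ahci", "cavium,octeon-7130-ahci",
]
# After the commit, the two snps entries are gone from ahci_platform.c ...
AHCI_PLATFORM_NEW = [
    c for c in AHCI_PLATFORM_OLD if c not in ("snps,spear-ahci", "snps,dwc-ahci")
]
# ... and are handled by the new driver instead.
AHCI_DWC = ["snps,dwc-ahci", "snps,spear-ahci", "baikal,bt1-ahci"]


def binding_driver(compatible, built_in):
    """Return the first built-in driver whose table lists compatible, else None."""
    for name, table in built_in.items():
        if compatible in table:
            return name
    return None


X15 = "snps,dwc-ahci"
before = binding_driver(X15, {"ahci_platform": AHCI_PLATFORM_OLD})
after_without_dwc = binding_driver(X15, {"ahci_platform": AHCI_PLATFORM_NEW})
after_with_dwc = binding_driver(
    X15, {"ahci_platform": AHCI_PLATFORM_NEW, "ahci_dwc": AHCI_DWC}
)
```

In this model, a kernel built without the new driver leaves "snps,dwc-ahci" with no match at all, which would fit the missing ahci messages. It says nothing about why the board failed to boot with CONFIG_AHCI_DWC=y.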

Arnd

2022-10-14 08:02:17

by Damien Le Moal

Subject: Re: TI: X15 the connected SSD is not detected on Linux next 20221006 tag

On 10/14/22 16:31, Arnd Bergmann wrote:
> On Fri, Oct 14, 2022, at 2:22 AM, Damien Le Moal wrote:
>> On 10/14/22 07:07, Anders Roxell wrote:
>> [...]
>>>> 8)
>>>>> If reverting these patches restores the eSATA port on this board, then you need
>>>>> to fix the defconfig for that board.
>>>>
>>>> OTOH,
>>>> Anders, enabled the new config CONFIG_AHCI_DWC=y and tried but the
>>>> device failed to boot.
>>>
>>> I thought it would work with enabling CONFIG_AHCI_DWC=y, but it didn't...
>>
>> As mentioned in my previous reply to Naresh, this is a new driver added in
>> 6.1. Your board was working before so this should not be the driver needed
>> for it.
>>
>>> However, reverting patch 33629d35090f ("ata: ahci: Add DWC AHCI SATA
>>> controller support")
>>> from next-20221013 was a success, kernel booted and the 'mkfs.ext4' cmd was
>>> successful.
>>
>> Which is very strange... There is only one hunk in that commit that could
>> be considered suspicious:
>>
>> diff --git a/drivers/ata/ahci_platform.c b/drivers/ata/ahci_platform.c
>> index 9b56490ecbc3..8f5572a9f8f1 100644
>> --- a/drivers/ata/ahci_platform.c
>> +++ b/drivers/ata/ahci_platform.c
>> @@ -80,9 +80,7 @@ static SIMPLE_DEV_PM_OPS(ahci_pm_ops, ahci_platform_suspend,
>> static const struct of_device_id ahci_of_match[] = {
>> { .compatible = "generic-ahci", },
>> /* Keep the following compatibles for device tree compatibility */
>> - { .compatible = "snps,spear-ahci", },
>> { .compatible = "ibm,476gtr-ahci", },
>> - { .compatible = "snps,dwc-ahci", },
>> { .compatible = "hisilicon,hisi-ahci", },
>> { .compatible = "cavium,octeon-7130-ahci", },
>> { /* sentinel */ }
>>
>> Is your board using one of these compatible string ?
>
> The x15 uses "snps,dwc-ahci". I would expect it to detect the device
> with the new driver if that is loaded, but it's possible that the
> driver does not work on all versions of the dwc-ahci hardware.
>
> Anders, can you provide the boot log from a boot with the new driver
> built in? There should be some messages from dwc-ahci about finding
> the device, but then not ultimately working.
>
> Depending on which way it goes wrong, the safest fallback for 6.1 is
> probably to move the "snps,spear-ahci" and "snps,dwc-ahci" compatible
> strings back into the old driver, and leave the new one only for
> the "baikal,bt1-ahci" implementation of it, until it has been
> successfully verified on TI am5/dra7, spear13xx and exynos.

OK. So a fix patch until further tests/debug is completed would be this:

diff --git a/drivers/ata/ahci_dwc.c b/drivers/ata/ahci_dwc.c
index 8fb66860db31..7a0cbab00843 100644
--- a/drivers/ata/ahci_dwc.c
+++ b/drivers/ata/ahci_dwc.c
@@ -469,8 +469,6 @@ static struct ahci_dwc_plat_data ahci_bt1_plat = {
 };

 static const struct of_device_id ahci_dwc_of_match[] = {
-	{ .compatible = "snps,dwc-ahci", &ahci_dwc_plat },
-	{ .compatible = "snps,spear-ahci", &ahci_dwc_plat },
 	{ .compatible = "baikal,bt1-ahci", &ahci_bt1_plat },
 	{},
 };
diff --git a/drivers/ata/ahci_platform.c b/drivers/ata/ahci_platform.c
index 8f5572a9f8f1..9b56490ecbc3 100644
--- a/drivers/ata/ahci_platform.c
+++ b/drivers/ata/ahci_platform.c
@@ -80,7 +80,9 @@ static SIMPLE_DEV_PM_OPS(ahci_pm_ops, ahci_platform_suspend,
 static const struct of_device_id ahci_of_match[] = {
 	{ .compatible = "generic-ahci", },
 	/* Keep the following compatibles for device tree compatibility */
+	{ .compatible = "snps,spear-ahci", },
 	{ .compatible = "ibm,476gtr-ahci", },
+	{ .compatible = "snps,dwc-ahci", },
 	{ .compatible = "hisilicon,hisi-ahci", },
 	{ .compatible = "cavium,octeon-7130-ahci", },
 	{ /* sentinel */ }

Anders, Naresh,

Can you try this ?

--
Damien Le Moal
Western Digital Research

2022-10-14 09:45:30

by Arnd Bergmann

Subject: Re: TI: X15 the connected SSD is not detected on Linux next 20221006 tag

On Fri, Oct 14, 2022, at 11:22 AM, Anders Roxell wrote:
> On Fri, 14 Oct 2022 at 09:53, Damien Le Moal

> Tested this patch on todays linux-next tag: next-20221014 without enabling
> CONFIG_AHCI_DWC and it worked as expected when booting [1].
> On the other hand I also tried a build/boot with CONFIG_AHCI_DWC enabled
> and it worked as expected to boot [2].

Ok, great. Can you send a patch to [email protected] to enable the driver in the
relevant defconfigs?

> However, during building a warning [3] popped up:
>
> make --silent --keep-going --jobs=8
> O=/home/tuxbuild/.cache/tuxmake/builds/2/build ARCH=arm
> CROSS_COMPILE=arm-linux-gnueabihf- 'CC=sccache
> arm-linux-gnueabihf-gcc' 'HOSTCC=sccache gcc'
> /builds/linux/drivers/ata/ahci_dwc.c:462:34: warning: 'ahci_dwc_plat'
> defined but not used [-Wunused-variable]
> 462 | static struct ahci_dwc_plat_data ahci_dwc_plat = {

Strange, I can't reproduce this, and the ahci_dwc_plat symbol
looks like it is clearly used in ahci_dwc_of_match[], at least
in next-20221014. Do you also see this on mainline?

Arnd

2022-10-14 09:47:00

by Anders Roxell

Subject: Re: TI: X15 the connected SSD is not detected on Linux next 20221006 tag

On Fri, 14 Oct 2022 at 09:53, Damien Le Moal
<[email protected]> wrote:
>
> On 10/14/22 16:31, Arnd Bergmann wrote:
> > On Fri, Oct 14, 2022, at 2:22 AM, Damien Le Moal wrote:
> >> On 10/14/22 07:07, Anders Roxell wrote:
> >> [...]
> >>>> 8)
> >>>>> If reverting these patches restores the eSATA port on this board, then you need
> >>>>> to fix the defconfig for that board.
> >>>>
> >>>> OTOH,
> >>>> Anders, enabled the new config CONFIG_AHCI_DWC=y and tried but the
> >>>> device failed to boot.
> >>>
> >>> I thought it would work with enabling CONFIG_AHCI_DWC=y, but it didn't...
> >>
> >> As mentioned in my previous reply to Naresh, this is a new driver added in
> >> 6.1. Your board was working before so this should not be the driver needed
> >> for it.
> >>
> >>> However, reverting patch 33629d35090f ("ata: ahci: Add DWC AHCI SATA
> >>> controller support")
> >>> from next-20221013 was a success, kernel booted and the 'mkfs.ext4' cmd was
> >>> successful.
> >>
> >> Which is very strange... There is only one hunk in that commit that could
> >> be considered suspicious:
> >>
> >> diff --git a/drivers/ata/ahci_platform.c b/drivers/ata/ahci_platform.c
> >> index 9b56490ecbc3..8f5572a9f8f1 100644
> >> --- a/drivers/ata/ahci_platform.c
> >> +++ b/drivers/ata/ahci_platform.c
> >> @@ -80,9 +80,7 @@ static SIMPLE_DEV_PM_OPS(ahci_pm_ops, ahci_platform_suspend,
> >> static const struct of_device_id ahci_of_match[] = {
> >> { .compatible = "generic-ahci", },
> >> /* Keep the following compatibles for device tree compatibility */
> >> - { .compatible = "snps,spear-ahci", },
> >> { .compatible = "ibm,476gtr-ahci", },
> >> - { .compatible = "snps,dwc-ahci", },
> >> { .compatible = "hisilicon,hisi-ahci", },
> >> { .compatible = "cavium,octeon-7130-ahci", },
> >> { /* sentinel */ }
> >>
> >> Is your board using one of these compatible string ?
> >
> > The x15 uses "snps,dwc-ahci". I would expect it to detect the device
> > with the new driver if that is loaded, but it's possible that the
> > driver does not work on all versions of the dwc-ahci hardware.
> >
> > Anders, can you provide the boot log from a boot with the new driver
> > built in? There should be some messages from dwc-ahci about finding
> > the device, but then not ultimately working.
> >
> > Depending on which way it goes wrong, the safest fallback for 6.1 is
> > probably to move the "snps,spear-ahci" and "snps,dwc-ahci" compatible
> > strings back into the old driver, and leave the new one only for
> > the "baikal,bt1-ahci" implementation of it, until it has been
> > successfully verified on TI am5/dra7, spear13xx and exynos.
>
> OK. So a fix patch until further tests/debug is completed would be this:
>
> diff --git a/drivers/ata/ahci_dwc.c b/drivers/ata/ahci_dwc.c
> index 8fb66860db31..7a0cbab00843 100644
> --- a/drivers/ata/ahci_dwc.c
> +++ b/drivers/ata/ahci_dwc.c
> @@ -469,8 +469,6 @@ static struct ahci_dwc_plat_data ahci_bt1_plat = {
> };
>
> static const struct of_device_id ahci_dwc_of_match[] = {
> - { .compatible = "snps,dwc-ahci", &ahci_dwc_plat },
> - { .compatible = "snps,spear-ahci", &ahci_dwc_plat },
> { .compatible = "baikal,bt1-ahci", &ahci_bt1_plat },
> {},
> };
> diff --git a/drivers/ata/ahci_platform.c b/drivers/ata/ahci_platform.c
> index 8f5572a9f8f1..9b56490ecbc3 100644
> --- a/drivers/ata/ahci_platform.c
> +++ b/drivers/ata/ahci_platform.c
> @@ -80,7 +80,9 @@ static SIMPLE_DEV_PM_OPS(ahci_pm_ops, ahci_platform_suspend,
> static const struct of_device_id ahci_of_match[] = {
> { .compatible = "generic-ahci", },
> /* Keep the following compatibles for device tree compatibility */
> + { .compatible = "snps,spear-ahci", },
> { .compatible = "ibm,476gtr-ahci", },
> + { .compatible = "snps,dwc-ahci", },
> { .compatible = "hisilicon,hisi-ahci", },
> { .compatible = "cavium,octeon-7130-ahci", },
> { /* sentinel */ }
>
> Anders, Naresh,
>
> Can you try this ?

Tested this patch on today's linux-next tag, next-20221014, without enabling
CONFIG_AHCI_DWC, and it booted as expected [1].
I also tried a build/boot with CONFIG_AHCI_DWC enabled, and it booted as
expected as well [2]. However, a warning [3] popped up during the
build:

make --silent --keep-going --jobs=8
O=/home/tuxbuild/.cache/tuxmake/builds/2/build ARCH=arm
CROSS_COMPILE=arm-linux-gnueabihf- 'CC=sccache
arm-linux-gnueabihf-gcc' 'HOSTCC=sccache gcc'
/builds/linux/drivers/ata/ahci_dwc.c:462:34: warning: 'ahci_dwc_plat'
defined but not used [-Wunused-variable]
462 | static struct ahci_dwc_plat_data ahci_dwc_plat = {
| ^~~~~~~~~~~~~


Cheers,
Anders
[1] https://lkft.validation.linaro.org/scheduler/job/5678031
[2] https://lkft.validation.linaro.org/scheduler/job/5678152
[3] https://builds.tuxbuild.com/2G7PDSV5uzjnQqCCBybK4WpoTxz/build.log

2022-10-14 10:08:29

by Damien Le Moal

Subject: Re: TI: X15 the connected SSD is not detected on Linux next 20221006 tag

On 10/14/22 18:37, Arnd Bergmann wrote:
> On Fri, Oct 14, 2022, at 11:22 AM, Anders Roxell wrote:
>> On Fri, 14 Oct 2022 at 09:53, Damien Le Moal
>
>> Tested this patch on todays linux-next tag: next-20221014 without enabling
>> CONFIG_AHCI_DWC and it worked as expected when booting [1].
>> On the other hand I also tried a build/boot with CONFIG_AHCI_DWC enabled
>> and it worked as expected to boot [2].

That is great news ! So the new driver is OK, good !

>
> Ok, great. Can you a patch to [email protected] to enable the driver in the
> relevant defconfigs?
>
>> However, during building a warning [3] popped up:
>>
>> make --silent --keep-going --jobs=8
>> O=/home/tuxbuild/.cache/tuxmake/builds/2/build ARCH=arm
>> CROSS_COMPILE=arm-linux-gnueabihf- 'CC=sccache
>> arm-linux-gnueabihf-gcc' 'HOSTCC=sccache gcc'
>> /builds/linux/drivers/ata/ahci_dwc.c:462:34: warning: 'ahci_dwc_plat'
>> defined but not used [-Wunused-variable]
>> 462 | static struct ahci_dwc_plat_data ahci_dwc_plat = {
>
> Strange, I can't reproduce this, and the ahci_dwc_plat symbol
> looks like it is clearly used in ahci_dwc_of_match[], at least
> in next-20221014. Do you also see this on mainline?

This is with the trial fix diff I sent. My bad, it was not even compile
tested :). Does not happen otherwise.

> Arnd

Thanks for helping with this !

--
Damien Le Moal
Western Digital Research

2022-10-14 13:43:30

by Serge Semin

Subject: Re: TI: X15 the connected SSD is not detected on Linux next 20221006 tag

On Fri, Oct 14, 2022 at 09:31:55AM +0200, Arnd Bergmann wrote:
> On Fri, Oct 14, 2022, at 2:22 AM, Damien Le Moal wrote:
> > On 10/14/22 07:07, Anders Roxell wrote:
> > [...]
> >>> 8)
> >>>> If reverting these patches restores the eSATA port on this board, then you need
> >>>> to fix the defconfig for that board.
> >>>
> >>> OTOH,
> >>> Anders, enabled the new config CONFIG_AHCI_DWC=y and tried but the
> >>> device failed to boot.
> >>
> >> I thought it would work with enabling CONFIG_AHCI_DWC=y, but it didn't...
> >
> > As mentioned in my previous reply to Naresh, this is a new driver added in
> > 6.1. Your board was working before so this should not be the driver needed
> > for it.
> >
> >> However, reverting patch 33629d35090f ("ata: ahci: Add DWC AHCI SATA
> >> controller support")
> >> from next-20221013 was a success, kernel booted and the 'mkfs.ext4' cmd was
> >> successful.
> >
> > Which is very strange... There is only one hunk in that commit that could
> > be considered suspicious:
> >
> > diff --git a/drivers/ata/ahci_platform.c b/drivers/ata/ahci_platform.c
> > index 9b56490ecbc3..8f5572a9f8f1 100644
> > --- a/drivers/ata/ahci_platform.c
> > +++ b/drivers/ata/ahci_platform.c
> > @@ -80,9 +80,7 @@ static SIMPLE_DEV_PM_OPS(ahci_pm_ops, ahci_platform_suspend,
> > static const struct of_device_id ahci_of_match[] = {
> > { .compatible = "generic-ahci", },
> > /* Keep the following compatibles for device tree compatibility */
> > - { .compatible = "snps,spear-ahci", },
> > { .compatible = "ibm,476gtr-ahci", },
> > - { .compatible = "snps,dwc-ahci", },
> > { .compatible = "hisilicon,hisi-ahci", },
> > { .compatible = "cavium,octeon-7130-ahci", },
> > { /* sentinel */ }
> >
> > Is your board using one of these compatible string ?
>
> The x15 uses "snps,dwc-ahci". I would expect it to detect the device
> with the new driver if that is loaded, but it's possible that the
> driver does not work on all versions of the dwc-ahci hardware.
>

> Anders, can you provide the boot log from a boot with the new driver
> built in? There should be some messages from dwc-ahci about finding
> the device, but then not ultimately working.
>

Yes. The boot-log would be very useful.

> Depending on which way it goes wrong, the safest fallback for 6.1 is
> probably to move the "snps,spear-ahci" and "snps,dwc-ahci" compatible
> strings back into the old driver, and leave the new one only for
> the "baikal,bt1-ahci" implementation of it, until it has been
> successfully verified on TI am5/dra7, spear13xx and exynos.

Right, that would be a possible solution. But I would suggest at
least trying to debug the problem first.

-Sergey

>
> Arnd
>

2022-10-14 13:54:41

by Serge Semin

Subject: Re: TI: X15 the connected SSD is not detected on Linux next 20221006 tag

Hello Damien, Anders

On Fri, Oct 14, 2022 at 09:22:34AM +0900, Damien Le Moal wrote:
> On 10/14/22 07:07, Anders Roxell wrote:
> [...]
> >> 8)
> >>> If reverting these patches restores the eSATA port on this board, then you need
> >>> to fix the defconfig for that board.
> >>
> >> OTOH,
> >> Anders, enabled the new config CONFIG_AHCI_DWC=y and tried but the
> >> device failed to boot.
> >
> > I thought it would work with enabling CONFIG_AHCI_DWC=y, but it didn't...
>
> As mentioned in my previous reply to Naresh, this is a new driver added in
> 6.1. Your board was working before so this should not be the driver needed
> for it.
>
> > However, reverting patch 33629d35090f ("ata: ahci: Add DWC AHCI SATA
> > controller support")
> > from next-20221013 was a success, kernel booted and the 'mkfs.ext4' cmd was
> > successful.
>
> Which is very strange... There is only one hunk in that commit that could
> be considered suspicious:
>
> diff --git a/drivers/ata/ahci_platform.c b/drivers/ata/ahci_platform.c
> index 9b56490ecbc3..8f5572a9f8f1 100644
> --- a/drivers/ata/ahci_platform.c
> +++ b/drivers/ata/ahci_platform.c
> @@ -80,9 +80,7 @@ static SIMPLE_DEV_PM_OPS(ahci_pm_ops, ahci_platform_suspend,
> static const struct of_device_id ahci_of_match[] = {
> { .compatible = "generic-ahci", },
> /* Keep the following compatibles for device tree compatibility */
> - { .compatible = "snps,spear-ahci", },
> { .compatible = "ibm,476gtr-ahci", },
> - { .compatible = "snps,dwc-ahci", },
> { .compatible = "hisilicon,hisi-ahci", },
> { .compatible = "cavium,octeon-7130-ahci", },
> { /* sentinel */ }
>

> Is your board using one of these compatible string ?

No, my board isn't using them. As a quick fix they could be moved back
to the generic driver. But please see below.

>
> Serge ?
> Any idea ?

The only difference between ahci_platform.c and ahci_dwc.c relevant to
these compatibles is that ahci_dwc.c calls the following functions:
ahci_dwc_check_cap(hpriv);
ahci_dwc_init_timer(hpriv);
ahci_dwc_init_dmacr(hpriv);
As a first step in debugging the problem I would comment them out and
try to boot the system with the snps,dwc-ahci device being probed by
the ahci_dwc.c driver.

Let's test that first. Then we can narrow the problem down by
commenting out just one of these functions, and then only parts of
it. What do you think?

-Sergey

>
> --
> Damien Le Moal
> Western Digital Research
>
>

2022-10-14 14:19:46

by Serge Semin

Subject: Re: TI: X15 the connected SSD is not detected on Linux next 20221006 tag

On Fri, Oct 14, 2022 at 11:22:38AM +0200, Anders Roxell wrote:
> On Fri, 14 Oct 2022 at 09:53, Damien Le Moal
> <[email protected]> wrote:
> >
> > On 10/14/22 16:31, Arnd Bergmann wrote:
> > > On Fri, Oct 14, 2022, at 2:22 AM, Damien Le Moal wrote:
> > >> On 10/14/22 07:07, Anders Roxell wrote:
> > >> [...]
> > >>>> 8)
> > >>>>> If reverting these patches restores the eSATA port on this board, then you need
> > >>>>> to fix the defconfig for that board.
> > >>>>
> > >>>> OTOH,
> > >>>> Anders, enabled the new config CONFIG_AHCI_DWC=y and tried but the
> > >>>> device failed to boot.
> > >>>
> > >>> I thought it would work with enabling CONFIG_AHCI_DWC=y, but it didn't...
> > >>
> > >> As mentioned in my previous reply to Naresh, this is a new driver added in
> > >> 6.1. Your board was working before so this should not be the driver needed
> > >> for it.
> > >>
> > >>> However, reverting patch 33629d35090f ("ata: ahci: Add DWC AHCI SATA
> > >>> controller support")
> > >>> from next-20221013 was a success, kernel booted and the 'mkfs.ext4' cmd was
> > >>> successful.
> > >>
> > >> Which is very strange... There is only one hunk in that commit that could
> > >> be considered suspicious:
> > >>
> > >> diff --git a/drivers/ata/ahci_platform.c b/drivers/ata/ahci_platform.c
> > >> index 9b56490ecbc3..8f5572a9f8f1 100644
> > >> --- a/drivers/ata/ahci_platform.c
> > >> +++ b/drivers/ata/ahci_platform.c
> > >> @@ -80,9 +80,7 @@ static SIMPLE_DEV_PM_OPS(ahci_pm_ops, ahci_platform_suspend,
> > >> static const struct of_device_id ahci_of_match[] = {
> > >> { .compatible = "generic-ahci", },
> > >> /* Keep the following compatibles for device tree compatibility */
> > >> - { .compatible = "snps,spear-ahci", },
> > >> { .compatible = "ibm,476gtr-ahci", },
> > >> - { .compatible = "snps,dwc-ahci", },
> > >> { .compatible = "hisilicon,hisi-ahci", },
> > >> { .compatible = "cavium,octeon-7130-ahci", },
> > >> { /* sentinel */ }
> > >>
> > >> Is your board using one of these compatible string ?
> > >
> > > The x15 uses "snps,dwc-ahci". I would expect it to detect the device
> > > with the new driver if that is loaded, but it's possible that the
> > > driver does not work on all versions of the dwc-ahci hardware.
> > >
> > > Anders, can you provide the boot log from a boot with the new driver
> > > built in? There should be some messages from dwc-ahci about finding
> > > the device, but then not ultimately working.
> > >
> > > Depending on which way it goes wrong, the safest fallback for 6.1 is
> > > probably to move the "snps,spear-ahci" and "snps,dwc-ahci" compatible
> > > strings back into the old driver, and leave the new one only for
> > > the "baikal,bt1-ahci" implementation of it, until it has been
> > > successfully verified on TI am5/dra7, spear13xx and exynos.
> >
> > OK. So a fix patch until further tests/debug is completed would be this:
> >
> > diff --git a/drivers/ata/ahci_dwc.c b/drivers/ata/ahci_dwc.c
> > index 8fb66860db31..7a0cbab00843 100644
> > --- a/drivers/ata/ahci_dwc.c
> > +++ b/drivers/ata/ahci_dwc.c
> > @@ -469,8 +469,6 @@ static struct ahci_dwc_plat_data ahci_bt1_plat = {
> > };
> >
> > static const struct of_device_id ahci_dwc_of_match[] = {
> > - { .compatible = "snps,dwc-ahci", &ahci_dwc_plat },
> > - { .compatible = "snps,spear-ahci", &ahci_dwc_plat },
> > { .compatible = "baikal,bt1-ahci", &ahci_bt1_plat },
> > {},
> > };
> > diff --git a/drivers/ata/ahci_platform.c b/drivers/ata/ahci_platform.c
> > index 8f5572a9f8f1..9b56490ecbc3 100644
> > --- a/drivers/ata/ahci_platform.c
> > +++ b/drivers/ata/ahci_platform.c
> > @@ -80,7 +80,9 @@ static SIMPLE_DEV_PM_OPS(ahci_pm_ops, ahci_platform_suspend,
> > static const struct of_device_id ahci_of_match[] = {
> > { .compatible = "generic-ahci", },
> > /* Keep the following compatibles for device tree compatibility */
> > + { .compatible = "snps,spear-ahci", },
> > { .compatible = "ibm,476gtr-ahci", },
> > + { .compatible = "snps,dwc-ahci", },
> > { .compatible = "hisilicon,hisi-ahci", },
> > { .compatible = "cavium,octeon-7130-ahci", },
> > { /* sentinel */ }
> >
> > Anders, Naresh,
> >
> > Can you try this ?
>

> Tested this patch on todays linux-next tag: next-20221014 without enabling
> CONFIG_AHCI_DWC and it worked as expected when booting [1].
> On the other hand I also tried a build/boot with CONFIG_AHCI_DWC enabled
> and it worked as expected to boot [2].

Expected result. The DWC driver will probe the device only on our
platform, while your platform falls back to using the generic driver.
Anders, in order to understand the root cause of the problem, could you please
1. upload the bogus boot log.
2. try what I suggested here
Link: https://lore.kernel.org/linux-ide/20221014133623.l6w4o7onoyhv2q34@mobilestation/
and, if the system fails to boot at some point, upload the boot log.

-Sergey

> However, during building a warning [3] popped up:
>
> make --silent --keep-going --jobs=8
> O=/home/tuxbuild/.cache/tuxmake/builds/2/build ARCH=arm
> CROSS_COMPILE=arm-linux-gnueabihf- 'CC=sccache
> arm-linux-gnueabihf-gcc' 'HOSTCC=sccache gcc'
> /builds/linux/drivers/ata/ahci_dwc.c:462:34: warning: 'ahci_dwc_plat'
> defined but not used [-Wunused-variable]
> 462 | static struct ahci_dwc_plat_data ahci_dwc_plat = {
> | ^~~~~~~~~~~~~
>
>
> Cheers,
> Anders
> [1] https://lkft.validation.linaro.org/scheduler/job/5678031
> [2] https://lkft.validation.linaro.org/scheduler/job/5678152
> [3] https://builds.tuxbuild.com/2G7PDSV5uzjnQqCCBybK4WpoTxz/build.log
>

2022-10-17 08:12:23

by Anders Roxell

Subject: Re: TI: X15 the connected SSD is not detected on Linux next 20221006 tag

On Fri, 14 Oct 2022 at 16:06, Serge Semin
<[email protected]> wrote:
>
> On Fri, Oct 14, 2022 at 11:22:38AM +0200, Anders Roxell wrote:
> > On Fri, 14 Oct 2022 at 09:53, Damien Le Moal
> > <[email protected]> wrote:
> > >
> > > On 10/14/22 16:31, Arnd Bergmann wrote:
> > > > On Fri, Oct 14, 2022, at 2:22 AM, Damien Le Moal wrote:
> > > >> On 10/14/22 07:07, Anders Roxell wrote:
> > > >> [...]
> > > >>>> 8)
> > > >>>>> If reverting these patches restores the eSATA port on this board, then you need
> > > >>>>> to fix the defconfig for that board.
> > > >>>>
> > > >>>> OTOH,
> > > >>>> Anders, enabled the new config CONFIG_AHCI_DWC=y and tried but the
> > > >>>> device failed to boot.
> > > >>>
> > > >>> I thought it would work with enabling CONFIG_AHCI_DWC=y, but it didn't...
> > > >>
> > > >> As mentioned in my previous reply to Naresh, this is a new driver added in
> > > >> 6.1. Your board was working before so this should not be the driver needed
> > > >> for it.
> > > >>
> > > >>> However, reverting patch 33629d35090f ("ata: ahci: Add DWC AHCI SATA
> > > >>> controller support")
> > > >>> from next-20221013 was a success, kernel booted and the 'mkfs.ext4' cmd was
> > > >>> successful.
> > > >>
> > > >> Which is very strange... There is only one hunk in that commit that could
> > > >> be considered suspicious:
> > > >>
> > > >> diff --git a/drivers/ata/ahci_platform.c b/drivers/ata/ahci_platform.c
> > > >> index 9b56490ecbc3..8f5572a9f8f1 100644
> > > >> --- a/drivers/ata/ahci_platform.c
> > > >> +++ b/drivers/ata/ahci_platform.c
> > > >> @@ -80,9 +80,7 @@ static SIMPLE_DEV_PM_OPS(ahci_pm_ops, ahci_platform_suspend,
> > > >> static const struct of_device_id ahci_of_match[] = {
> > > >> { .compatible = "generic-ahci", },
> > > >> /* Keep the following compatibles for device tree compatibility */
> > > >> - { .compatible = "snps,spear-ahci", },
> > > >> { .compatible = "ibm,476gtr-ahci", },
> > > >> - { .compatible = "snps,dwc-ahci", },
> > > >> { .compatible = "hisilicon,hisi-ahci", },
> > > >> { .compatible = "cavium,octeon-7130-ahci", },
> > > >> { /* sentinel */ }
> > > >>
> > > >> Is your board using one of these compatible string ?
> > > >
> > > > The x15 uses "snps,dwc-ahci". I would expect it to detect the device
> > > > with the new driver if that is loaded, but it's possible that the
> > > > driver does not work on all versions of the dwc-ahci hardware.
> > > >
> > > > Anders, can you provide the boot log from a boot with the new driver
> > > > built in? There should be some messages from dwc-ahci about finding
> > > > the device, but then not ultimately working.
> > > >
> > > > Depending on which way it goes wrong, the safest fallback for 6.1 is
> > > > probably to move the "snps,spear-ahci" and "snps,dwc-ahci" compatible
> > > > strings back into the old driver, and leave the new one only for
> > > > the "baikal,bt1-ahci" implementation of it, until it has been
> > > > successfully verified on TI am5/dra7, spear13xx and exynos.
> > >
> > > OK. So a fix patch until further tests/debug is completed would be this:
> > >
> > > diff --git a/drivers/ata/ahci_dwc.c b/drivers/ata/ahci_dwc.c
> > > index 8fb66860db31..7a0cbab00843 100644
> > > --- a/drivers/ata/ahci_dwc.c
> > > +++ b/drivers/ata/ahci_dwc.c
> > > @@ -469,8 +469,6 @@ static struct ahci_dwc_plat_data ahci_bt1_plat = {
> > > };
> > >
> > > static const struct of_device_id ahci_dwc_of_match[] = {
> > > - { .compatible = "snps,dwc-ahci", &ahci_dwc_plat },
> > > - { .compatible = "snps,spear-ahci", &ahci_dwc_plat },
> > > { .compatible = "baikal,bt1-ahci", &ahci_bt1_plat },
> > > {},
> > > };
> > > diff --git a/drivers/ata/ahci_platform.c b/drivers/ata/ahci_platform.c
> > > index 8f5572a9f8f1..9b56490ecbc3 100644
> > > --- a/drivers/ata/ahci_platform.c
> > > +++ b/drivers/ata/ahci_platform.c
> > > @@ -80,7 +80,9 @@ static SIMPLE_DEV_PM_OPS(ahci_pm_ops, ahci_platform_suspend,
> > > static const struct of_device_id ahci_of_match[] = {
> > > { .compatible = "generic-ahci", },
> > > /* Keep the following compatibles for device tree compatibility */
> > > + { .compatible = "snps,spear-ahci", },
> > > { .compatible = "ibm,476gtr-ahci", },
> > > + { .compatible = "snps,dwc-ahci", },
> > > { .compatible = "hisilicon,hisi-ahci", },
> > > { .compatible = "cavium,octeon-7130-ahci", },
> > > { /* sentinel */ }
> > >
> > > Anders, Naresh,
> > >
> > > Can you try this ?
> >
>
> > Tested this patch on todays linux-next tag: next-20221014 without enabling
> > CONFIG_AHCI_DWC and it worked as expected when booting [1].
> > On the other hand I also tried a build/boot with CONFIG_AHCI_DWC enabled
> > and it worked as expected to boot [2].
>
> Expected result. The DWC driver will probe the device on our platform
> only while your platform falls back to using the generic driver.
> Anders, in order understand the root cause of the problem could you please
> 1. upload the bogus boot log.

This [1] is the bogus boot log.

> 2. try what I suggested here
> Link: https://lore.kernel.org/linux-ide/20221014133623.l6w4o7onoyhv2q34@mobilestation/
> and if the system fails to boot at some point upload the boot log.

Only doing this:

--- a/drivers/ata/ahci_dwc.c
+++ b/drivers/ata/ahci_dwc.c
@@ -316,12 +316,13 @@ static int ahci_dwc_init_host(struct ahci_host_priv *hpriv)
if (rc)
goto err_disable_resources;
}
-
+/*
ahci_dwc_check_cap(hpriv);

ahci_dwc_init_timer(hpriv);

rc = ahci_dwc_init_dmacr(hpriv);
+*/
if (rc)
goto err_clear_platform;

and enabling CONFIG_AHCI_DWC made mkfs detect the SATA drive [2].

Cheers,
Anders
[1] https://lkft.validation.linaro.org/scheduler/job/5634743#L2580
[2] https://lkft.validation.linaro.org/scheduler/job/5679278#L2617

2022-10-17 16:05:44

by Serge Semin

Subject: Re: TI: X15 the connected SSD is not detected on Linux next 20221006 tag

On Mon, Oct 17, 2022 at 09:43:24AM +0200, Anders Roxell wrote:
> On Fri, 14 Oct 2022 at 16:06, Serge Semin
> <[email protected]> wrote:
> >
> > On Fri, Oct 14, 2022 at 11:22:38AM +0200, Anders Roxell wrote:
> > > On Fri, 14 Oct 2022 at 09:53, Damien Le Moal
> > > <[email protected]> wrote:
> > > >
> > > > On 10/14/22 16:31, Arnd Bergmann wrote:
> > > > > On Fri, Oct 14, 2022, at 2:22 AM, Damien Le Moal wrote:
> > > > >> On 10/14/22 07:07, Anders Roxell wrote:
> > > > >> [...]
> > > > >>>> 8)
> > > > >>>>> If reverting these patches restores the eSATA port on this board, then you need
> > > > >>>>> to fix the defconfig for that board.
> > > > >>>>
> > > > >>>> OTOH,
> > > > >>>> Anders, enabled the new config CONFIG_AHCI_DWC=y and tried but the
> > > > >>>> device failed to boot.
> > > > >>>
> > > > >>> I thought it would work with enabling CONFIG_AHCI_DWC=y, but it didn't...
> > > > >>
> > > > >> As mentioned in my previous reply to Naresh, this is a new driver added in
> > > > >> 6.1. Your board was working before so this should not be the driver needed
> > > > >> for it.
> > > > >>
> > > > >>> However, reverting patch 33629d35090f ("ata: ahci: Add DWC AHCI SATA
> > > > >>> controller support")
> > > > >>> from next-20221013 was a success, kernel booted and the 'mkfs.ext4' cmd was
> > > > >>> successful.
> > > > >>
> > > > >> Which is very strange... There is only one hunk in that commit that could
> > > > >> be considered suspicious:
> > > > >>
> > > > >> diff --git a/drivers/ata/ahci_platform.c b/drivers/ata/ahci_platform.c
> > > > >> index 9b56490ecbc3..8f5572a9f8f1 100644
> > > > >> --- a/drivers/ata/ahci_platform.c
> > > > >> +++ b/drivers/ata/ahci_platform.c
> > > > >> @@ -80,9 +80,7 @@ static SIMPLE_DEV_PM_OPS(ahci_pm_ops, ahci_platform_suspend,
> > > > >> static const struct of_device_id ahci_of_match[] = {
> > > > >> { .compatible = "generic-ahci", },
> > > > >> /* Keep the following compatibles for device tree compatibility */
> > > > >> - { .compatible = "snps,spear-ahci", },
> > > > >> { .compatible = "ibm,476gtr-ahci", },
> > > > >> - { .compatible = "snps,dwc-ahci", },
> > > > >> { .compatible = "hisilicon,hisi-ahci", },
> > > > >> { .compatible = "cavium,octeon-7130-ahci", },
> > > > >> { /* sentinel */ }
> > > > >>
> > > > >> Is your board using one of these compatible string ?
> > > > >
> > > > > The x15 uses "snps,dwc-ahci". I would expect it to detect the device
> > > > > with the new driver if that is loaded, but it's possible that the
> > > > > driver does not work on all versions of the dwc-ahci hardware.
> > > > >
> > > > > Anders, can you provide the boot log from a boot with the new driver
> > > > > built in? There should be some messages from dwc-ahci about finding
> > > > > the device, but then not ultimately working.
> > > > >
> > > > > Depending on which way it goes wrong, the safest fallback for 6.1 is
> > > > > probably to move the "snps,spear-ahci" and "snps,dwc-ahci" compatible
> > > > > strings back into the old driver, and leave the new one only for
> > > > > the "baikal,bt1-ahci" implementation of it, until it has been
> > > > > successfully verified on TI am5/dra7, spear13xx and exynos.
> > > >
> > > > OK. So a fix patch until further tests/debug is completed would be this:
> > > >
> > > > diff --git a/drivers/ata/ahci_dwc.c b/drivers/ata/ahci_dwc.c
> > > > index 8fb66860db31..7a0cbab00843 100644
> > > > --- a/drivers/ata/ahci_dwc.c
> > > > +++ b/drivers/ata/ahci_dwc.c
> > > > @@ -469,8 +469,6 @@ static struct ahci_dwc_plat_data ahci_bt1_plat = {
> > > > };
> > > >
> > > > static const struct of_device_id ahci_dwc_of_match[] = {
> > > > - { .compatible = "snps,dwc-ahci", &ahci_dwc_plat },
> > > > - { .compatible = "snps,spear-ahci", &ahci_dwc_plat },
> > > > { .compatible = "baikal,bt1-ahci", &ahci_bt1_plat },
> > > > {},
> > > > };
> > > > diff --git a/drivers/ata/ahci_platform.c b/drivers/ata/ahci_platform.c
> > > > index 8f5572a9f8f1..9b56490ecbc3 100644
> > > > --- a/drivers/ata/ahci_platform.c
> > > > +++ b/drivers/ata/ahci_platform.c
> > > > @@ -80,7 +80,9 @@ static SIMPLE_DEV_PM_OPS(ahci_pm_ops, ahci_platform_suspend,
> > > > static const struct of_device_id ahci_of_match[] = {
> > > > { .compatible = "generic-ahci", },
> > > > /* Keep the following compatibles for device tree compatibility */
> > > > + { .compatible = "snps,spear-ahci", },
> > > > { .compatible = "ibm,476gtr-ahci", },
> > > > + { .compatible = "snps,dwc-ahci", },
> > > > { .compatible = "hisilicon,hisi-ahci", },
> > > > { .compatible = "cavium,octeon-7130-ahci", },
> > > > { /* sentinel */ }
> > > >
> > > > Anders, Naresh,
> > > >
> > > > Can you try this ?
> > >
> >
> > > Tested this patch on todays linux-next tag: next-20221014 without enabling
> > > CONFIG_AHCI_DWC and it worked as expected when booting [1].
> > > On the other hand I also tried a build/boot with CONFIG_AHCI_DWC enabled
> > > and it worked as expected to boot [2].
> >
> > Expected result. The DWC driver will probe the device on our platform
> > only while your platform falls back to using the generic driver.
> > Anders, in order understand the root cause of the problem could you please
> > 1. upload the bogus boot log.
>

> This [1] is the bogus boot log.
>
> > 2. try what I suggested here
> > Link: https://lore.kernel.org/linux-ide/20221014133623.l6w4o7onoyhv2q34@mobilestation/
> > and if the system fails to boot at some point upload the boot log.
>
> Only doing this:
>
> --- a/drivers/ata/ahci_dwc.c
> +++ b/drivers/ata/ahci_dwc.c
> @@ -316,12 +316,13 @@ static int ahci_dwc_init_host(struct
> ahci_host_priv *hpriv)
> if (rc)
> goto err_disable_resources;
> }
> -
> +/*
> ahci_dwc_check_cap(hpriv);
>
> ahci_dwc_init_timer(hpriv);
>
> rc = ahci_dwc_init_dmacr(hpriv);
> +*/
> if (rc)
> goto err_clear_platform;
>
> and enable CONFIG_AHCI_DWC made the mkfs to detect the SATA drive [2].

Judging by what is in [1] and [2], I very much doubt that [1] was
executed with CONFIG_AHCI_DWC enabled, because that boot log has
nothing about an ahci-dwc driver probe failure and none of the log
messages seen in [2] (see every line with the word ahci-dwc in it).

1. If the device probe procedure had failed at some point, you
would have got a line like this:
< ahci-dwc: probe of 4a140000.sata failed with error -errno
But there is no such line in [1]. There is literally nothing about
AHCI/SATA/SCSI/DWC AHCI/ahci-dwc/etc in it.

2. If the DW AHCI device probe had at least been performed, then the
following call chain would have been executed:
ahci_dwc_probe()
+-> ahci_dwc_get_resources()
+-> ahci_platform_get_resources()
+-> ...
+-> devm_regulator_get(...)
+-> ...
which would have produced the following log messages:
< [] ahci-dwc 4a140000.sata: supply ahci not found, using dummy regulator
< [] ahci-dwc 4a140000.sata: supply phy not found, using dummy regulator
< [] ahci-dwc 4a140000.sata: supply target not found, using dummy regulator
You do have these lines in [2], but they are missing in [1]. Had any
error been detected in ahci_dwc_probe() before that, you would
have had an error printed as I noted in 1.

3. Had the problem been in the commented-out code lines, you would
have at least got the messages above printed to the log [1], because
the commented-out code is executed after the resources request
procedure (note that ahci_dwc_init_host() is called after
ahci_dwc_get_resources()).

4. Finally, the commented-out code doesn't perform any actions which
could have caused the device probe to halt silently.

All of that makes me think that the DW AHCI SATA wasn't even probed
in [1], which most likely means that either the driver config was
omitted there or the device was disabled. So could you please re-start
the system like in [2], but with the lines above uncommented?

* Please make sure Damien's fix
https://www.spinics.net/lists/arm-kernel/msg1017920.html
isn't applied to the kernel in [2].

[1] https://lkft.validation.linaro.org/scheduler/job/5634743#L2580
[2] https://lkft.validation.linaro.org/scheduler/job/5679278#L2617

-Sergey

>
> Cheers,
> Anders
> [1] https://lkft.validation.linaro.org/scheduler/job/5634743#L2580
> [2] https://lkft.validation.linaro.org/scheduler/job/5679278#L2617

2022-11-30 10:22:47

by Naresh Kamboju

Subject: Re: TI: X15 the connected SSD is not detected on Linux next 20221006 tag

On Mon, 17 Oct 2022 at 21:22, Serge Semin <[email protected]> wrote:
>
> On Mon, Oct 17, 2022 at 09:43:24AM +0200, Anders Roxell wrote:
> > On Fri, 14 Oct 2022 at 16:06, Serge Semin
> > <[email protected]> wrote:
> > >
> > > On Fri, Oct 14, 2022 at 11:22:38AM +0200, Anders Roxell wrote:
> > > > On Fri, 14 Oct 2022 at 09:53, Damien Le Moal
> > > > <[email protected]> wrote:
> > > > >
> > > > > On 10/14/22 16:31, Arnd Bergmann wrote:
> > > > > > On Fri, Oct 14, 2022, at 2:22 AM, Damien Le Moal wrote:
> > > > > >> On 10/14/22 07:07, Anders Roxell wrote:
> > > > > >> [...]
> > > > > >>>> 8)
> > > > > >>>>> If reverting these patches restores the eSATA port on this board, then you need
> > > > > >>>>> to fix the defconfig for that board.
> > > > > >>>>
> > > > > >>>> OTOH,
> > > > > >>>> Anders, enabled the new config CONFIG_AHCI_DWC=y and tried but the
> > > > > >>>> device failed to boot.
> > > > > >>>
> > > > > >>> I thought it would work with enabling CONFIG_AHCI_DWC=y, but it didn't...
> > > > > >>
> > > > > >> As mentioned in my previous reply to Naresh, this is a new driver added in
> > > > > >> 6.1. Your board was working before so this should not be the driver needed
> > > > > >> for it.
> > > > > >>
> > > > > >>> However, reverting patch 33629d35090f ("ata: ahci: Add DWC AHCI SATA
> > > > > >>> controller support")
> > > > > >>> from next-20221013 was a success, kernel booted and the 'mkfs.ext4' cmd was
> > > > > >>> successful.
> > > > > >>
> > > > > >> Which is very strange... There is only one hunk in that commit that could
> > > > > >> be considered suspicious:
> > > > > >>
> > > > > >> diff --git a/drivers/ata/ahci_platform.c b/drivers/ata/ahci_platform.c
> > > > > >> index 9b56490ecbc3..8f5572a9f8f1 100644
> > > > > >> --- a/drivers/ata/ahci_platform.c
> > > > > >> +++ b/drivers/ata/ahci_platform.c
> > > > > >> @@ -80,9 +80,7 @@ static SIMPLE_DEV_PM_OPS(ahci_pm_ops, ahci_platform_suspend,
> > > > > >> static const struct of_device_id ahci_of_match[] = {
> > > > > >> { .compatible = "generic-ahci", },
> > > > > >> /* Keep the following compatibles for device tree compatibility */
> > > > > >> - { .compatible = "snps,spear-ahci", },
> > > > > >> { .compatible = "ibm,476gtr-ahci", },
> > > > > >> - { .compatible = "snps,dwc-ahci", },
> > > > > >> { .compatible = "hisilicon,hisi-ahci", },
> > > > > >> { .compatible = "cavium,octeon-7130-ahci", },
> > > > > >> { /* sentinel */ }
> > > > > >>
> > > > > >> Is your board using one of these compatible string ?
> > > > > >
> > > > > > The x15 uses "snps,dwc-ahci". I would expect it to detect the device
> > > > > > with the new driver if that is loaded, but it's possible that the
> > > > > > driver does not work on all versions of the dwc-ahci hardware.
> > > > > >
> > > > > > Anders, can you provide the boot log from a boot with the new driver
> > > > > > built in? There should be some messages from dwc-ahci about finding
> > > > > > the device, but then not ultimately working.
> > > > > >
> > > > > > Depending on which way it goes wrong, the safest fallback for 6.1 is
> > > > > > probably to move the "snps,spear-ahci" and "snps,dwc-ahci" compatible
> > > > > > strings back into the old driver, and leave the new one only for
> > > > > > the "baikal,bt1-ahci" implementation of it, until it has been
> > > > > > successfully verified on TI am5/dra7, spear13xx and exynos.
> > > > >
> > > > > OK. So a fix patch until further tests/debug is completed would be this:
> > > > >
> > > > > diff --git a/drivers/ata/ahci_dwc.c b/drivers/ata/ahci_dwc.c
> > > > > index 8fb66860db31..7a0cbab00843 100644
> > > > > --- a/drivers/ata/ahci_dwc.c
> > > > > +++ b/drivers/ata/ahci_dwc.c
> > > > > @@ -469,8 +469,6 @@ static struct ahci_dwc_plat_data ahci_bt1_plat = {
> > > > > };
> > > > >
> > > > > static const struct of_device_id ahci_dwc_of_match[] = {
> > > > > - { .compatible = "snps,dwc-ahci", &ahci_dwc_plat },
> > > > > - { .compatible = "snps,spear-ahci", &ahci_dwc_plat },
> > > > > { .compatible = "baikal,bt1-ahci", &ahci_bt1_plat },
> > > > > {},
> > > > > };
> > > > > diff --git a/drivers/ata/ahci_platform.c b/drivers/ata/ahci_platform.c
> > > > > index 8f5572a9f8f1..9b56490ecbc3 100644
> > > > > --- a/drivers/ata/ahci_platform.c
> > > > > +++ b/drivers/ata/ahci_platform.c
> > > > > @@ -80,7 +80,9 @@ static SIMPLE_DEV_PM_OPS(ahci_pm_ops, ahci_platform_suspend,
> > > > > static const struct of_device_id ahci_of_match[] = {
> > > > > { .compatible = "generic-ahci", },
> > > > > /* Keep the following compatibles for device tree compatibility */
> > > > > + { .compatible = "snps,spear-ahci", },
> > > > > { .compatible = "ibm,476gtr-ahci", },
> > > > > + { .compatible = "snps,dwc-ahci", },
> > > > > { .compatible = "hisilicon,hisi-ahci", },
> > > > > { .compatible = "cavium,octeon-7130-ahci", },
> > > > > { /* sentinel */ }


FYI,

We have been noticing this problem [a] & [b] on Linux mainline master 6.1.0-rc7

Test error: mkfs.ext4
/dev/disk/by-id/ata-SanDisk_SSD_PLUS_120GB_190702A00D84 failed; job
exit

Please suggest a way forward on this reported issue on the arm32 TI BeagleBoard X15
device. Build and kernel config details are provided in the metadata section.

metadata:
git_ref: master
git_repo: https://gitlab.com/Linaro/lkft/mirrors/torvalds/linux-mainline
git_sha: b7b275e60bcd5f89771e865a8239325f86d9927d
git_describe: v6.1-rc7
kernel_version: 6.1.0-rc7
kernel-config: https://builds.tuxbuild.com/2I9I42JhhQqS9GOpFppfRiuqtRW/config
build-url: https://gitlab.com/Linaro/lkft/mirrors/torvalds/linux-mainline/-/pipelines/706371149
artifact-location: https://builds.tuxbuild.com/2I9I42JhhQqS9GOpFppfRiuqtRW
toolchain: gcc-10

[a] https://lkft.validation.linaro.org/scheduler/job/5892099
[b] https://lore.kernel.org/all/20221017155246.zxal2cfehjgaajcu@mobilestation/

- Naresh

2022-12-01 12:42:54

by Anders Roxell

Subject: Re: TI: X15 the connected SSD is not detected on Linux next 20221006 tag

On Wed, 30 Nov 2022 at 11:03, Niklas Cassel <[email protected]> wrote:
>
> On Wed, Nov 30, 2022 at 03:10:37PM +0530, Naresh Kamboju wrote:
> > On Mon, 17 Oct 2022 at 21:22, Serge Semin <[email protected]> wrote:
> >
> > FYI,
> >
> > We have been noticing this problem [a] & [b] on Linux mainline master 6.1.0-rc7
> >
> > Test error: mkfs.ext4
> > /dev/disk/by-id/ata-SanDisk_SSD_PLUS_120GB_190702A00D84 failed; job
> > exit
> >
> > Please suggest a way forward on this reported issue on arm32 TI BeagleBoard X15
> > device. Build and Kernel configs details provided in the metadata section.
> >
> > metadata:
> > git_ref: master
> > git_repo: https://gitlab.com/Linaro/lkft/mirrors/torvalds/linux-mainline
> > git_sha: b7b275e60bcd5f89771e865a8239325f86d9927d
> > git_describe: v6.1-rc7
> > kernel_version: 6.1.0-rc7
> > kernel-config: https://builds.tuxbuild.com/2I9I42JhhQqS9GOpFppfRiuqtRW/config
> > build-url: https://gitlab.com/Linaro/lkft/mirrors/torvalds/linux-mainline/-/pipelines/706371149
> > artifact-location: https://builds.tuxbuild.com/2I9I42JhhQqS9GOpFppfRiuqtRW
> > toolchain: gcc-10
> >
> > [a] https://lkft.validation.linaro.org/scheduler/job/5892099
> > [b] https://lore.kernel.org/all/20221017155246.zxal2cfehjgaajcu@mobilestation/
> >
> > - Naresh
>
> Hello Naresh,
>
>
> Looking at the error from the log:
>
> + mkfs.ext4 /dev/disk/by-id/ata-SanDisk_SSD_PLUS_120GB_190702A00D84
> mke2fs 1.46.5 (30-Dec-2021)
> The file /dev/disk/by-id/ata-SanDisk_SSD_PLUS_120GB_190702A00D84 does not exist and no size was specified.
>
> It seems like the device that you are trying to format does not exist.
>
>
>
> On October 17th Serge suggested that you guys should try to enable:
> CONFIG_AHCI_DWC
> and see if that does solve your problem.
>
> There was never any reply to his suggestion.

I re-tested this on today's Linux tree, v6.1-rc7-103-gef4d3ea40565.

With CONFIG_AHCI_DWC=y the kernel fell on its back and no output was produced.
Changing ahci_dwc_init to be a late_initcall [1] let me see
what was going on [2].

The kernel booted fine with CONFIG_AHCI_DWC=y + this patch [3]

--- a/drivers/ata/libahci_platform.c
+++ b/drivers/ata/libahci_platform.c
@@ -109,7 +109,8 @@ struct clk *ahci_platform_find_clk(struct ahci_host_priv *hpriv, const char *con
         int i;

         for (i = 0; i < hpriv->n_clks; i++) {
-                if (!strcmp(hpriv->clks[i].id, con_id))
+                if (hpriv->clks && hpriv->clks[i].id &&
+                    !strcmp(hpriv->clks[i].id, con_id))
                         return hpriv->clks[i].clk;
         }

Bootlog [4].
Thank you Arnd for helping out with the investigation and for
proposing the patch for me to test.

>
>
>
> Looking at the config in:
> > kernel-config: https://builds.tuxbuild.com/2I9I42JhhQqS9GOpFppfRiuqtRW/config
>
> # CONFIG_AHCI_DWC is not set
>
> This Kconfig is indeed not enabled.
>
> Could you guys please try the suggestion from Serge?

The patch was also tested [5] without enabling CONFIG_AHCI_DWC, this
also worked fine.

Cheers,
Anders
[1] http://ix.io/4hmt
[2] https://lkft.validation.linaro.org/scheduler/job/5902935
[3] http://ix.io/4hmv
[4] https://lkft.validation.linaro.org/scheduler/job/5903220
[5] http://ix.io/4hmw

Subject: Re: TI: X15 the connected SSD is not detected on Linux next 20221006 tag #forregzbot

[Note: this mail contains only information for Linux kernel regression
tracking. Mails like these contain '#forregzbot' in the subject to make
them easy to spot and filter out. The author also tried to remove most
or all individuals from the list of recipients to spare them the hassle.]

On 30.11.22 10:40, Naresh Kamboju wrote:

>
> We have been noticing this problem [a] & [b] on Linux mainline master 6.1.0-rc7
>
> Test error: mkfs.ext4
> /dev/disk/by-id/ata-SanDisk_SSD_PLUS_120GB_190702A00D84 failed; job
> exit
>
> Please suggest a way forward on this reported issue on arm32 TI BeagleBoard X15
> device. Build and Kernel configs details provided in the metadata section.

Thanks for the report. To be sure the issue below doesn't fall through the
cracks unnoticed, I'm adding it to regzbot, my Linux kernel regression
tracking bot:

#regzbot ^introduced v6.0..v6.1-rc7
#regzbot title ata/arm32: TI BeagleBoard X15 doesn't boot
#regzbot ignore-activity

This isn't a regression? This issue or a fix for it are already
discussed somewhere else? It was fixed already? You want to clarify when
the regression started to happen? Or point out I got the title or
something else totally wrong? Then just reply -- ideally with also
telling regzbot about it, as explained here:
https://linux-regtracking.leemhuis.info/tracked-regression/

Reminder for developers: When fixing the issue, add 'Link:' tags
pointing to the report (the mail this one replies to), as explained in
the Linux kernel's documentation; the above webpage explains why this is
important for tracked regressions.

Ciao, Thorsten (wearing his 'the Linux kernel's regression tracker' hat)

P.S.: As the Linux kernel's regression tracker I deal with a lot of
reports and sometimes miss something important when writing mails like
this. If that's the case here, don't hesitate to tell me in a public
reply, it's in everyone's interest to set the public record straight.

2022-12-05 02:21:05

by Serge Semin

Subject: Re: TI: X15 the connected SSD is not detected on Linux next 20221006 tag

On Thu, Dec 01, 2022 at 12:48:32PM +0100, Anders Roxell wrote:
> On Wed, 30 Nov 2022 at 11:03, Niklas Cassel <[email protected]> wrote:
> >
> > On Wed, Nov 30, 2022 at 03:10:37PM +0530, Naresh Kamboju wrote:
> > > On Mon, 17 Oct 2022 at 21:22, Serge Semin <[email protected]> wrote:
> > >
> > > FYI,
> > >
> > > We have been noticing this problem [a] & [b] on Linux mainline master 6.1.0-rc7
> > >
> > > Test error: mkfs.ext4
> > > /dev/disk/by-id/ata-SanDisk_SSD_PLUS_120GB_190702A00D84 failed; job
> > > exit
> > >
> > > Please suggest a way forward on this reported issue on arm32 TI BeagleBoard X15
> > > device. Build and Kernel configs details provided in the metadata section.
> > >
> > > metadata:
> > > git_ref: master
> > > git_repo: https://gitlab.com/Linaro/lkft/mirrors/torvalds/linux-mainline
> > > git_sha: b7b275e60bcd5f89771e865a8239325f86d9927d
> > > git_describe: v6.1-rc7
> > > kernel_version: 6.1.0-rc7
> > > kernel-config: https://builds.tuxbuild.com/2I9I42JhhQqS9GOpFppfRiuqtRW/config
> > > build-url: https://gitlab.com/Linaro/lkft/mirrors/torvalds/linux-mainline/-/pipelines/706371149
> > > artifact-location: https://builds.tuxbuild.com/2I9I42JhhQqS9GOpFppfRiuqtRW
> > > toolchain: gcc-10
> > >
> > > [a] https://lkft.validation.linaro.org/scheduler/job/5892099
> > > [b] https://lore.kernel.org/all/20221017155246.zxal2cfehjgaajcu@mobilestation/
> > >
> > > - Naresh
> >
> > Hello Naresh,
> >
> >
> > Looking at the error from the log:
> >
> > + mkfs.ext4 /dev/disk/by-id/ata-SanDisk_SSD_PLUS_120GB_190702A00D84
> > mke2fs 1.46.5 (30-Dec-2021)
> > The file /dev/disk/by-id/ata-SanDisk_SSD_PLUS_120GB_190702A00D84 does not exist and no size was specified.
> >
> > It seems like the device that you are trying to format does not exist.
> >
> >
> >
> > On October 17th Serge suggested that you guys should try to enable:
> > CONFIG_AHCI_DWC
> > and see if that does solve your problem.
> >
> > There was never any reply to his suggestion.
>
> I re-tested this on todays linux tree v6.1-rc7-103-gef4d3ea40565.
>
> With CONFIG_AHCI_DWC=y the kernel fell on its back and no output was produced
> So changing the ahci_dwc_init to be a late_initcall [1] made me see
> what was going on [2].
>
> The kernel booted fine with CONFIG_AHCI_DWC=y + this patch [3]
>
> --- a/drivers/ata/libahci_platform.c
> +++ b/drivers/ata/libahci_platform.c
> @@ -109,7 +109,8 @@ struct clk *ahci_platform_find_clk(struct
> ahci_host_priv *hpriv, const char *con
> int i;
>
> for (i = 0; i < hpriv->n_clks; i++) {
> - if (!strcmp(hpriv->clks[i].id, con_id))
> + if (hpriv->clks && hpriv->clks[i].id &&
> + !strcmp(hpriv->clks[i].id, con_id))
> return hpriv->clks[i].clk;
> }

Indeed, I should have taken into account that devm_clk_bulk_get_all()
can return unnamed clocks too. But checking the hpriv->clks pointer for
being non-NULL is redundant, since the ahci_platform_get_resources()
procedure makes sure that the array is always allocated. At the very
least you shouldn't check the pointer in the loop, but can make sure
that the clks array is available before it.

-Serge(y)

>
> Bootlog [4].
> Thank you Arnd for helping out with the investigation and for
> proposing the patch for me to test.
>
> >
> >
> >
> > Looking at the config in:
> > > kernel-config: https://builds.tuxbuild.com/2I9I42JhhQqS9GOpFppfRiuqtRW/config
> >
> > # CONFIG_AHCI_DWC is not set
> >
> > This Kconfig is indeed not enabled.
> >
> > Could you guys please try the suggestion from Serge?
>
> The patch was also tested [5] without enabling CONFIG_AHCI_DWC, this
> also worked fine.
>
> Cheers,
> Anders
> [1] http://ix.io/4hmt
> [2] https://lkft.validation.linaro.org/scheduler/job/5902935
> [3] http://ix.io/4hmv
> [4] https://lkft.validation.linaro.org/scheduler/job/5903220
> [5] http://ix.io/4hmw

2022-12-05 11:10:22

by Arnd Bergmann

Subject: Re: TI: X15 the connected SSD is not detected on Linux next 20221006 tag

On Mon, Dec 5, 2022, at 02:11, Serge Semin wrote:
> On Thu, Dec 01, 2022 at 12:48:32PM +0100, Anders Roxell wrote:

>>
>> for (i = 0; i < hpriv->n_clks; i++) {
>> - if (!strcmp(hpriv->clks[i].id, con_id))
>> + if (hpriv->clks && hpriv->clks[i].id &&
>> + !strcmp(hpriv->clks[i].id, con_id))
>> return hpriv->clks[i].clk;
>> }
>
> Indeed I should have taken into account that devm_clk_bulk_get_all()
> can get unnamed clocks too. But checking the hpriv->clks pointer for
> being not null is redundant, since the ahci_platform_get_resources()
> procedure makes sure that the array is always allocated. At the very
> least you shouldn't check the pointer in the loop, but can make sure
> that the clks array is available before it.

Do you think this is otherwise the correct fix then? Any chance we
can still get a version of it into 6.1?

Arnd

2022-12-05 13:43:35

by Damien Le Moal

Subject: Re: TI: X15 the connected SSD is not detected on Linux next 20221006 tag

On 12/5/22 19:08, Arnd Bergmann wrote:
> On Mon, Dec 5, 2022, at 02:11, Serge Semin wrote:
>> On Thu, Dec 01, 2022 at 12:48:32PM +0100, Anders Roxell wrote:
>
>>>
>>> for (i = 0; i < hpriv->n_clks; i++) {
>>> - if (!strcmp(hpriv->clks[i].id, con_id))
>>> + if (hpriv->clks && hpriv->clks[i].id &&
>>> + !strcmp(hpriv->clks[i].id, con_id))
>>> return hpriv->clks[i].clk;
>>> }
>>
>> Indeed I should have taken into account that devm_clk_bulk_get_all()
>> can get unnamed clocks too. But checking the hpriv->clks pointer for
>> being not null is redundant, since the ahci_platform_get_resources()
>> procedure makes sure that the array is always allocated. At the very
>> least you shouldn't check the pointer in the loop, but can make sure
>> that the clks array is available before it.
>
> Do you think this is otherwise the correct fix then? Any chance we
> can still get a version of it into 6.1?

If someone sends me a proper patch to apply, I can send a last PR for 6.1
to Linus before the weekend.


>
> Arnd

--
Damien Le Moal
Western Digital Research

2022-12-06 09:43:52

by Serge Semin

Subject: Re: TI: X15 the connected SSD is not detected on Linux next 20221006 tag

On Mon, Dec 05, 2022 at 10:24:22PM +0900, Damien Le Moal wrote:
> On 12/5/22 19:08, Arnd Bergmann wrote:
> > On Mon, Dec 5, 2022, at 02:11, Serge Semin wrote:
> >> On Thu, Dec 01, 2022 at 12:48:32PM +0100, Anders Roxell wrote:
> >
> >>>
> >>> for (i = 0; i < hpriv->n_clks; i++) {
> >>> - if (!strcmp(hpriv->clks[i].id, con_id))
> >>> + if (hpriv->clks && hpriv->clks[i].id &&
> >>> + !strcmp(hpriv->clks[i].id, con_id))
> >>> return hpriv->clks[i].clk;
> >>> }
> >>
> >> Indeed I should have taken into account that devm_clk_bulk_get_all()
> >> can get unnamed clocks too. But checking the hpriv->clks pointer for
> >> being not null is redundant, since the ahci_platform_get_resources()
> >> procedure makes sure that the array is always allocated. At the very
> >> least you shouldn't check the pointer in the loop, but can make sure
> >> that the clks array is available before it.
> >

> > Do you think this is otherwise the correct fix then? Any chance we
> > can still get a version of it into 6.1?

I'll think of a better solution. But at this stage it seems like the
best choice, seeing as the bindings permit unnamed clocks to be
specified.

>
> If someone sends me a proper patch to apply, I can send a last PR for 6.1
> to Linus before week end.

I'll submit the patch today. Thanks.

-Serge(y)

>
>
> >
> > Arnd
>
> --
> Damien Le Moal
> Western Digital Research
>

2022-12-06 10:15:10

by Damien Le Moal

Subject: Re: TI: X15 the connected SSD is not detected on Linux next 20221006 tag

On 12/6/22 17:46, Serge Semin wrote:
> On Mon, Dec 05, 2022 at 10:24:22PM +0900, Damien Le Moal wrote:
>> On 12/5/22 19:08, Arnd Bergmann wrote:
>>> On Mon, Dec 5, 2022, at 02:11, Serge Semin wrote:
>>>> On Thu, Dec 01, 2022 at 12:48:32PM +0100, Anders Roxell wrote:
>>>
>>>>>
>>>>> for (i = 0; i < hpriv->n_clks; i++) {
>>>>> - if (!strcmp(hpriv->clks[i].id, con_id))
>>>>> + if (hpriv->clks && hpriv->clks[i].id &&
>>>>> + !strcmp(hpriv->clks[i].id, con_id))
>>>>> return hpriv->clks[i].clk;
>>>>> }
>>>>
>>>> Indeed I should have taken into account that devm_clk_bulk_get_all()
>>>> can get unnamed clocks too. But checking the hpriv->clks pointer for
>>>> being not null is redundant, since the ahci_platform_get_resources()
>>>> procedure makes sure that the array is always allocated. At the very
>>>> least you shouldn't check the pointer in the loop, but can make sure
>>>> that the clks array is available before it.
>>>
>
>>> Do you think this is otherwise the correct fix then? Any chance we
>>> can still get a version of it into 6.1?
>
> I'll think of a better solution. But at this stage it seems like the
> best choice seeing the bindings permit having unnamed clocks
> specified.
>
>>
>> If someone sends me a proper patch to apply, I can send a last PR for 6.1
>> to Linus before week end.
>
> I'll submit the patch today. Thanks.

Anders just posted one. Can you review it, please?

>
> -Serge(y)
>
>>
>>
>>>
>>> Arnd
>>
>> --
>> Damien Le Moal
>> Western Digital Research
>>

--
Damien Le Moal
Western Digital Research

2022-12-06 21:22:16

by Serge Semin

Subject: Re: TI: X15 the connected SSD is not detected on Linux next 20221006 tag

On Tue, Dec 06, 2022 at 06:12:48PM +0900, Damien Le Moal wrote:
> On 12/6/22 17:46, Serge Semin wrote:
> > On Mon, Dec 05, 2022 at 10:24:22PM +0900, Damien Le Moal wrote:
> >> On 12/5/22 19:08, Arnd Bergmann wrote:
> >>> On Mon, Dec 5, 2022, at 02:11, Serge Semin wrote:
> >>>> On Thu, Dec 01, 2022 at 12:48:32PM +0100, Anders Roxell wrote:
> >>>
> >>>>>
> >>>>> for (i = 0; i < hpriv->n_clks; i++) {
> >>>>> - if (!strcmp(hpriv->clks[i].id, con_id))
> >>>>> + if (hpriv->clks && hpriv->clks[i].id &&
> >>>>> + !strcmp(hpriv->clks[i].id, con_id))
> >>>>> return hpriv->clks[i].clk;
> >>>>> }
> >>>>
> >>>> Indeed I should have taken into account that devm_clk_bulk_get_all()
> >>>> can get unnamed clocks too. But checking the hpriv->clks pointer for
> >>>> being not null is redundant, since the ahci_platform_get_resources()
> >>>> procedure makes sure that the array is always allocated. At the very
> >>>> least you shouldn't check the pointer in the loop, but can make sure
> >>>> that the clks array is available before it.
> >>>
> >
> >>> Do you think this is otherwise the correct fix then? Any chance we
> >>> can still get a version of it into 6.1?
> >
> > I'll think of a better solution. But at this stage it seems like the
> > best choice seeing the bindings permit having unnamed clocks
> > specified.
> >
> >>
> >> If someone sends me a proper patch to apply, I can send a last PR for 6.1
> >> to Linus before week end.
> >
> > I'll submit the patch today. Thanks.
>

> Anders just posted one. Can you review it please ?

Done. Thanks.

-Serge(y)

>
> >
> > -Serge(y)
> >
> >>
> >>
> >>>
> >>> Arnd
> >>
> >> --
> >> Damien Le Moal
> >> Western Digital Research
> >>
>
> --
> Damien Le Moal
> Western Digital Research
>

Subject: Re: TI: X15 the connected SSD is not detected on Linux next 20221006 tag #forregzbot

On 01.12.22 14:02, Thorsten Leemhuis wrote:
> [Note: this mail contains only information for Linux kernel regression
> tracking. Mails like these contain '#forregzbot' in the subject to make
> then easy to spot and filter out. The author also tried to remove most
> or all individuals from the list of recipients to spare them the hassle.]
>
> On 30.11.22 10:40, Naresh Kamboju wrote:
>
>>
>> We have been noticing this problem [a] & [b] on Linux mainline master 6.1.0-rc7
>>
>> Test error: mkfs.ext4
>> /dev/disk/by-id/ata-SanDisk_SSD_PLUS_120GB_190702A00D84 failed; job
>> exit
>>
>> Please suggest a way forward on this reported issue on arm32 TI BeagleBoard X15
>> device. Build and Kernel configs details provided in the metadata section.
>
> Thanks for the report. To be sure below issue doesn't fall through the
> cracks unnoticed, I'm adding it to regzbot, my Linux kernel regression
> tracking bot:
>
> #regzbot ^introduced v6.0..v6.1-rc7
> #regzbot title ata/arm32: TI BeagleBoard X15 doesn't boot
> #regzbot ignore-activity

#regzbot fix: ata: libahci_platform: ahci_platform_find_clk: oops, NULL pointer