2023-06-21 14:29:23

by Arnd Bergmann

Subject: Re: next: ltp: fs: read_all: block sda: the capability attribute has been deprecated. - supervisor instruction fetch in kernel mode

On Wed, Jun 21, 2023, at 16:01, Naresh Kamboju wrote:
> While running LTP fs testing on an x86_64 device, the following kernel BUG
> was noticed with Linux next-20230621.
>
> Reported-by: Linux Kernel Functional Testing <[email protected]>
>
> Steps to reproduce:
>
> # cd /opt/ltp
> # ./runltp -f fs
>
> Test log:
> ======
> read_all.c:687: TPASS: Finished reading files
> Summary:
> passed 1
> failed 0
> broken 0
> skipped 0
> warnings 0
> tst_test.c:1558: TINFO: Timeout per run is 0h 06m 40s
> read_all.c:568: TINFO: Worker timeout set to 10% of max_runtime: 1000ms
> [ 1344.664349] block sda: the capability attribute has been deprecated.

I think the oops is unrelated to the line above

> [ 1344.679885] BUG: kernel NULL pointer dereference, address: 0000000000000000
> [ 1344.686839] #PF: supervisor instruction fetch in kernel mode
> [ 1344.692490] #PF: error_code(0x0010) - not-present page
> [ 1344.697620] PGD 8000000105569067 P4D 8000000105569067 PUD 1056ed067 PMD 0
> [ 1344.704494] Oops: 0010 [#1] PREEMPT SMP PTI
> [ 1344.708680] CPU: 0 PID: 5649 Comm: read_all Not tainted
> 6.4.0-rc7-next-20230621 #1
> [ 1344.716245] Hardware name: Supermicro SYS-5019S-ML/X11SSH-F, BIOS
> 2.5 11/26/2020
> [ 1344.723629] RIP: 0010:0x0
> [ 1344.726257] Code: Unable to access opcode bytes at 0xffffffffffffffd6.
> [ 1344.732780] RSP: 0018:ffff98d38123bd38 EFLAGS: 00010286
> [ 1344.737998] RAX: 0000000000000000 RBX: ffffffffbea38720 RCX: 0000000000000000
> [ 1344.745123] RDX: ffff979e42e31000 RSI: ffffffffbea38720 RDI: ffff979e40371900
> [ 1344.752246] RBP: ffff98d38123bd48 R08: ffff979e4080a0f0 R09: 0000000000000001
> [ 1344.759371] R10: ffff979e42e31000 R11: 0000000000000000 R12: ffff979e42e31000
> [ 1344.766495] R13: 0000000000000001 R14: ffff979e432dd2f8 R15: ffff979e432dd2d0
> [ 1344.773621] FS: 00007ff745d4b740(0000) GS:ffff97a1a7a00000(0000)
> knlGS:0000000000000000
> [ 1344.781704] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [ 1344.787442] CR2: ffffffffffffffd6 CR3: 000000010563c004 CR4: 00000000003706f0
> [ 1344.794587] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> [ 1344.801733] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> [ 1344.808857] Call Trace:
> [ 1344.811301] <TASK>
> [ 1344.813399] ? show_regs+0x6e/0x80
> [ 1344.816804] ? __die+0x29/0x70
> [ 1344.819857] ? page_fault_oops+0x154/0x470
> [ 1344.823957] ? do_user_addr_fault+0x355/0x6c0
> [ 1344.828314] ? exc_page_fault+0x6e/0x170
> [ 1344.832239] ? asm_exc_page_fault+0x2b/0x30
> [ 1344.836420] max_phase_adjustment_show+0x23/0x50

The function is newly added by commit c3b60ab7a4dff ("ptp: Add
getmaxphase callback to ptp_clock_info"), adding everyone from
that patch to Cc.

Arnd


2023-06-27 17:26:50

by Rahul Rameshbabu

Subject: Re: next: ltp: fs: read_all: block sda: the capability attribute has been deprecated. - supervisor instruction fetch in kernel mode

On Wed, 21 Jun, 2023 16:08:50 +0200 "Arnd Bergmann" <[email protected]> wrote:
> On Wed, Jun 21, 2023, at 16:01, Naresh Kamboju wrote:
>> While running LTP fs testing on an x86_64 device, the following kernel BUG
>> was noticed with Linux next-20230621.
>>
>> Reported-by: Linux Kernel Functional Testing <[email protected]>
>>
>> Steps to reproduce:
>>
>> # cd /opt/ltp
>> # ./runltp -f fs
>>
>> Test log:
>> ======
>> read_all.c:687: TPASS: Finished reading files
>> Summary:
>> passed 1
>> failed 0
>> broken 0
>> skipped 0
>> warnings 0
>> tst_test.c:1558: TINFO: Timeout per run is 0h 06m 40s
>> read_all.c:568: TINFO: Worker timeout set to 10% of max_runtime: 1000ms
>> [ 1344.664349] block sda: the capability attribute has been deprecated.
>
> I think the oops is unrelated to the line above
>
>> [ 1344.679885] BUG: kernel NULL pointer dereference, address: 0000000000000000
>> [ 1344.686839] #PF: supervisor instruction fetch in kernel mode
>> [ 1344.692490] #PF: error_code(0x0010) - not-present page
>> [ 1344.697620] PGD 8000000105569067 P4D 8000000105569067 PUD 1056ed067 PMD 0
>> [ 1344.704494] Oops: 0010 [#1] PREEMPT SMP PTI
>> [ 1344.708680] CPU: 0 PID: 5649 Comm: read_all Not tainted
>> 6.4.0-rc7-next-20230621 #1
>> [ 1344.716245] Hardware name: Supermicro SYS-5019S-ML/X11SSH-F, BIOS
>> 2.5 11/26/2020
>> [ 1344.723629] RIP: 0010:0x0
>> [ 1344.726257] Code: Unable to access opcode bytes at 0xffffffffffffffd6.
>> [ 1344.732780] RSP: 0018:ffff98d38123bd38 EFLAGS: 00010286
>> [ 1344.737998] RAX: 0000000000000000 RBX: ffffffffbea38720 RCX: 0000000000000000
>> [ 1344.745123] RDX: ffff979e42e31000 RSI: ffffffffbea38720 RDI: ffff979e40371900
>> [ 1344.752246] RBP: ffff98d38123bd48 R08: ffff979e4080a0f0 R09: 0000000000000001
>> [ 1344.759371] R10: ffff979e42e31000 R11: 0000000000000000 R12: ffff979e42e31000
>> [ 1344.766495] R13: 0000000000000001 R14: ffff979e432dd2f8 R15: ffff979e432dd2d0
>> [ 1344.773621] FS: 00007ff745d4b740(0000) GS:ffff97a1a7a00000(0000)
>> knlGS:0000000000000000
>> [ 1344.781704] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>> [ 1344.787442] CR2: ffffffffffffffd6 CR3: 000000010563c004 CR4: 00000000003706f0
>> [ 1344.794587] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
>> [ 1344.801733] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
>> [ 1344.808857] Call Trace:
>> [ 1344.811301] <TASK>
>> [ 1344.813399] ? show_regs+0x6e/0x80
>> [ 1344.816804] ? __die+0x29/0x70
>> [ 1344.819857] ? page_fault_oops+0x154/0x470
>> [ 1344.823957] ? do_user_addr_fault+0x355/0x6c0
>> [ 1344.828314] ? exc_page_fault+0x6e/0x170
>> [ 1344.832239] ? asm_exc_page_fault+0x2b/0x30
>> [ 1344.836420] max_phase_adjustment_show+0x23/0x50
>
> The function is newly added by commit c3b60ab7a4dff ("ptp: Add
> getmaxphase callback to ptp_clock_info"), adding everyone from
> that patch to Cc.
>
> Arnd

The issue is that we introduce a new sysfs node that depends on a
hardware capability that not all PTP devices support. On PTP devices that
do not support this capability, reading the node leads to a NULL pointer
dereference because the driver callback for the functionality is not
implemented. I will submit a fix to the net mailing list, Cc'ing the
appropriate recipients.
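
For illustration only, here is a minimal userspace sketch of the failure
mode described above and of two possible guards. It is not the actual
patch. The names struct clock_ops, demo_getmaxphase,
max_phase_attr_visible and max_phase_show are hypothetical stand-ins, not
the kernel's ptp_clock_info or sysfs code. The assumption is that the
callback is optional, so a driver may leave the pointer NULL. Calling
through it unchecked would then jump to address 0, which matches the
"RIP: 0010:0x0" line in the trace.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-in for a driver callback table (not the real
 * struct ptp_clock_info). The callback is optional and may be NULL. */
struct clock_ops {
	int (*getmaxphase)(const struct clock_ops *ops);
};

/* Example callback for a device that does support the capability. */
static int demo_getmaxphase(const struct clock_ops *ops)
{
	(void)ops;
	return 100000;
}

/* Guard 1: decide whether the attribute should be exposed at all.
 * Returns 1 when the driver implements the callback, 0 otherwise. */
static int max_phase_attr_visible(const struct clock_ops *ops)
{
	return ops->getmaxphase != NULL;
}

/* Guard 2: check the pointer in the show handler before calling it.
 * Returns the number of characters written, or -1 when unsupported. */
static int max_phase_show(const struct clock_ops *ops, char *buf, size_t len)
{
	if (!ops->getmaxphase)
		return -1;
	return snprintf(buf, len, "%d\n", ops->getmaxphase(ops));
}
```

With an ops table whose getmaxphase is NULL, max_phase_attr_visible()
returns 0 and max_phase_show() returns -1 instead of faulting. Hiding
the node up front has the advantage that userspace never sees an
attribute the hardware cannot back.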

-- Rahul Rameshbabu