2021-09-07 16:03:45

by kernel test robot

Subject: [mm/gup] 9857a17f20: kernel_BUG_at_include/linux/pagemap.h



Greeting,

FYI, we noticed the following commit (built with clang-14):

commit: 9857a17f206ff374aea78bccfb687f145368be2e ("mm/gup: remove try_get_page(), call try_get_compound_head() directly")
https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git master


in testcase: trinity
version: trinity-i386-4d2343bd-1_20200320
with following parameters:

number: 99999
group: group-01

test-description: Trinity is a linux system call fuzz tester.
test-url: http://codemonkey.org.uk/projects/trinity/


on test machine: qemu-system-i386 -enable-kvm -cpu SandyBridge -smp 2 -m 4G

caused below changes (please refer to attached dmesg/kmsg for entire log/backtrace):


+------------------------------------------+------------+------------+
|                                          | 54d516b1d6 | 9857a17f20 |
+------------------------------------------+------------+------------+
| boot_successes                           | 10         | 0          |
| boot_failures                            | 0          | 12         |
| kernel_BUG_at_include/linux/pagemap.h    | 0          | 12         |
| invalid_opcode:#[##]                     | 0          | 12         |
| EIP:try_get_compound_head                | 0          | 12         |
| Kernel_panic-not_syncing:Fatal_exception | 0          | 12         |
+------------------------------------------+------------+------------+


If you fix the issue, kindly add the following tag:
Reported-by: kernel test robot <[email protected]>



[ 143.907782][ T3260] ------------[ cut here ]------------
[ 143.908513][ T3260] kernel BUG at include/linux/pagemap.h:223!
[ 143.909454][ T3260] invalid opcode: 0000 [#1]
[ 143.909946][ T3260] CPU: 0 PID: 3260 Comm: trinity-c0 Not tainted 5.14.0-00040-g9857a17f206f #1
[ 143.911026][ T3260] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.12.0-1 04/01/2014
[ 143.912039][ T3260] EIP: try_get_compound_head+0xac/0xb0
[ 143.912663][ T3260] Code: ba 00 8c 6c ff ff 0f 0b ff ff 0f 0b c3 74 ce 89 c3 74 ce 89 00 00 31 c0 00 00 31 c0 c7 61 82 e8 c7 61 82 e8 0f 0b 0f 0b 0f 0b <0f> 0b e5 53 57 56 e5 53 57 56 8b 59 1c 39 8b 59 1c 39 05 13 15 e5
[ 143.914798][ T3260] EAX: ef228640 EBX: 00000000 ECX: ef228640 EDX: 00000001
[ 143.915590][ T3260] ESI: 80000000 EDI: f5eb8600 EBP: f5f43f14 ESP: f5f43f08
[ 143.916323][ T3260] DS: 007b ES: 007b FS: 0000 GS: 0033 SS: 0068 EFLAGS: 00010206
[ 143.917183][ T3260] CR0: 80050033 CR2: 00ffd7c0 CR3: 75e19000 CR4: 00040690
[ 143.917975][ T3260] DR0: 769ae000 DR1: 769af000 DR2: 00000000 DR3: 00000000
[ 143.918765][ T3260] DR6: ffff0ff0 DR7: 00070602
[ 143.919258][ T3260] Call Trace:
[ 143.919656][ T3260] generic_pipe_buf_get+0xf/0x20
[ 143.920180][ T3260] do_tee+0x1e7/0x2f0
[ 143.920650][ T3260] __ia32_sys_tee+0x50/0xa0
[ 143.921133][ T3260] do_int80_syscall_32+0x3a/0x90
[ 143.921718][ T3260] ? irqentry_exit_to_user_mode+0x2a/0x30
[ 143.922430][ T3260] ? irqentry_exit+0x30/0x70
[ 143.922960][ T3260] ? common_interrupt+0x34/0x40
[ 143.923534][ T3260] entry_INT80_32+0x104/0x104
[ 143.924028][ T3260] EIP: 0x77f1ba02
[ 143.924486][ T3260] Code: 95 01 00 05 25 36 02 00 83 ec 14 8d 80 e8 99 ff ff 50 6a 02 e8 1f ff 00 00 c7 04 24 7f 00 00 00 e8 7e 87 01 00 66 90 90 cd 80 <c3> 8d b6 00 00 00 00 8d bc 27 00 00 00 00 8b 1c 24 c3 8d b6 00 00
[ 143.926860][ T3260] EAX: ffffffda EBX: 00000124 ECX: 00000127 EDX: 0000004b
[ 143.927655][ T3260] ESI: 0000000e EDI: 49494949 EBP: fffffff9 ESP: 7f96af88
[ 143.928455][ T3260] DS: 007b ES: 007b FS: 0000 GS: 0033 SS: 007b EFLAGS: 00000212
[ 143.929261][ T3260] Modules linked in:
[ 143.929768][ T3260] ---[ end trace 9b076b1117b0ac35 ]---
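
The call trace above reaches try_get_compound_head() from the tee(2) system call, via do_tee() and generic_pipe_buf_get(). As a rough illustration of that entry point, the sketch below duplicates a few bytes from one pipe into another with a raw tee system call. The helper name tee_between_pipes() is made up for this example, and nothing in the report confirms that this exact sequence is what trinity issued or that it is enough to hit the BUG on the affected kernel.

```c
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper (not from the report): write `len` bytes of `msg`
 * into one pipe, then tee them into a second pipe.
 * Returns the number of bytes duplicated, or -1 on failure.
 */
long tee_between_pipes(const char *msg, size_t len)
{
    int src[2], dst[2];
    long copied = -1;

    if (pipe(src) < 0)
        return -1;
    if (pipe(dst) < 0) {
        close(src[0]);
        close(src[1]);
        return -1;
    }

    /* Put data into the source pipe so tee has a buffer to duplicate. */
    if (write(src[1], msg, len) == (ssize_t)len) {
        /*
         * tee(fd_in, fd_out, len, flags): duplicates pipe buffers without
         * consuming them, which is where the kernel takes an extra
         * reference on each buffer's page.
         */
        copied = syscall(SYS_tee, src[0], dst[1], len, 0u);
    }

    close(src[0]);
    close(src[1]);
    close(dst[0]);
    close(dst[1]);
    return copied;
}
```

Calling tee_between_pipes("hello", 5) from a small main() on an unaffected kernel should simply report 5 bytes duplicated.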



To reproduce:

# build kernel
cd linux
cp config-5.14.0-00040-g9857a17f206f .config
make HOSTCC=clang-14 CC=clang-14 ARCH=i386 olddefconfig prepare modules_prepare bzImage

git clone https://github.com/intel/lkp-tests.git
cd lkp-tests
bin/lkp qemu -k <bzImage> job-script # job-script is attached in this email



---
0DAY/LKP+ Test Infrastructure                   Open Source Technology Center
https://lists.01.org/hyperkitty/list/[email protected]       Intel Corporation

Thanks,
Oliver Sang


Attachments:
(No filename) (4.19 kB)
config-5.14.0-00040-g9857a17f206f (123.77 kB)
job-script (4.45 kB)
dmesg.xz (22.31 kB)

2021-09-07 19:24:28

by Linus Torvalds

Subject: Re: [mm/gup] 9857a17f20: kernel_BUG_at_include/linux/pagemap.h

On Tue, Sep 7, 2021 at 8:20 AM kernel test robot <[email protected]> wrote:
>
> FYI, we noticed the following commit (built with clang-14):
>
> commit: 9857a17f206f ("mm/gup: remove try_get_page(), call try_get_compound_head() directly")
> caused below changes (please refer to attached dmesg/kmsg for entire log/backtrace):
>
> [ 143.908513][ T3260] kernel BUG at include/linux/pagemap.h:223!

Ahh, well, yes.

That commit is clearly buggy, in that the try_get_compound_head() code
really doesn't work at all for us.

__page_cache_add_speculative() is not at all the same as
try_get_page(), and I should have caught on to this as I applied it. I
just read the explanation, and it sounded believable, but it was
entirely wrong.

try_get_page() is literally about that "page ref overflow" case, but
try_get_compound_head() uses page_cache_add_speculative(), which has
different logic and those extra "this only works in RCU context"
checks.

So that commit was completely bogus, and the "lack of maintenance" was
not lack of maintenance at all, it was all about entirely different
semantics.

Reverted.

Linus
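
The difference in semantics described in the reply above can be sketched with a small userspace model. This is not the kernel source: struct page_model, the model_* names, and the model_in_atomic flag are all invented for illustration, and the "must be in atomic/RCU context" check is modeled as a counter instead of a BUG so the sketch can run anywhere. It only assumes what the thread states: one helper guards against a bad reference count, while the other additionally insists on being called from a restricted context.

```c
#include <stdbool.h>

/* Hypothetical stand-in for a page with a reference count. */
struct page_model {
    int refcount;
};

/* Stand-in for "the caller is in an atomic / RCU read-side context". */
bool model_in_atomic = false;

/* Counts the times the model would have hit a BUG in the kernel. */
int model_bug_hits = 0;

/*
 * try_get_page()-like helper: its only concern is a zero or overflowed
 * (negative) reference count. It can be called from any context.
 */
bool model_try_get_page(struct page_model *p)
{
    if (p->refcount <= 0)
        return false;
    p->refcount++;
    return true;
}

/*
 * Speculative-get-like helper: it carries an extra context requirement.
 * Calling it from plain process context is treated as a bug.
 */
bool model_get_speculative(struct page_model *p)
{
    if (!model_in_atomic) {
        model_bug_hits++;   /* the kernel would BUG here */
        return false;
    }
    if (p->refcount == 0)
        return false;       /* page is already being freed */
    p->refcount++;
    return true;
}
```

In this model, a caller in ordinary process context (model_in_atomic == false) succeeds with model_try_get_page() but trips the context check in model_get_speculative(), which mirrors why swapping one helper for the other was not a harmless cleanup.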