2021-08-16 17:36:46

by Jiri Olsa

Subject: [BUG] general protection fault when reading /proc/kcore

Hi,
I'm getting the fault below when running:

# cat /proc/kallsyms | grep ksys_read
ffffffff8136d580 T ksys_read
# objdump -d --start-address=0xffffffff8136d580 --stop-address=0xffffffff8136d590 /proc/kcore

/proc/kcore: file format elf64-x86-64

Segmentation fault

Any idea? The config is attached.

thanks,
jirka


---
krava33 login: [ 68.330612] general protection fault, probably for non-canonical address 0xf887ffcbff000: 0000 [#1] SMP PTI
[ 68.333118] CPU: 12 PID: 1079 Comm: objdump Not tainted 5.14.0-rc5qemu+ #508
[ 68.334922] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.14.0-4.fc34 04/01/2014
[ 68.336945] RIP: 0010:kern_addr_valid+0x150/0x300
[ 68.338082] Code: 1f 40 00 48 8b 0d e8 12 61 01 48 85 f6 0f 85 ca 00 00 00 48 81 e1 00 f0 ff ff 48 21 c1 48 b8 00 00 00 00 80 88 ff ff 48 01 ca <48> 8b 3c 02 48 f7 c7 9f ff ff ff 0f 84 d8 fe ff ff 48 89 f8 0f 1f
[ 68.342220] RSP: 0018:ffffc90000bcbc38 EFLAGS: 00010206
[ 68.343428] RAX: ffff888000000000 RBX: 0000000000001000 RCX: 000ffffffcbff000
[ 68.345029] RDX: 000ffffffcbff000 RSI: 0000000000000000 RDI: 800ffffffcbff062
[ 68.346599] RBP: ffffc90000bcbea8 R08: 0000000000001000 R09: 0000000000000000
[ 68.349000] R10: 0000000000000000 R11: 0000000000001000 R12: 00007fcc0fd80010
[ 68.350804] R13: ffffffff83400000 R14: 0000000000400000 R15: ffffffff843d23e0
[ 68.352609] FS: 00007fcc111fcc80(0000) GS:ffff888275e00000(0000) knlGS:0000000000000000
[ 68.354638] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 68.356104] CR2: 00007fcc0fd80000 CR3: 000000011226e004 CR4: 0000000000770ee0
[ 68.357896] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 68.359694] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[ 68.361597] PKRU: 55555554
[ 68.362460] Call Trace:
[ 68.363252] read_kcore+0x57f/0x920
[ 68.364289] ? rcu_read_lock_sched_held+0x12/0x80
[ 68.365630] ? rcu_read_lock_sched_held+0x12/0x80
[ 68.366955] ? rcu_read_lock_sched_held+0x12/0x80
[ 68.368277] ? trace_hardirqs_on+0x1b/0xd0
[ 68.369462] ? rcu_read_lock_sched_held+0x12/0x80
[ 68.370793] ? lock_acquire+0x195/0x2f0
[ 68.371920] ? lock_acquire+0x195/0x2f0
[ 68.373035] ? rcu_read_lock_sched_held+0x12/0x80
[ 68.374364] ? lock_acquire+0x195/0x2f0
[ 68.375498] ? rcu_read_lock_sched_held+0x12/0x80
[ 68.376831] ? rcu_read_lock_sched_held+0x12/0x80
[ 68.379883] ? rcu_read_lock_sched_held+0x12/0x80
[ 68.381268] ? lock_release+0x22b/0x3e0
[ 68.382458] ? _raw_spin_unlock+0x1f/0x30
[ 68.383685] ? __handle_mm_fault+0xcfc/0x15f0
[ 68.384994] ? rcu_read_lock_sched_held+0x12/0x80
[ 68.386389] ? lock_acquire+0x195/0x2f0
[ 68.387573] ? rcu_read_lock_sched_held+0x12/0x80
[ 68.388969] ? lock_release+0x22b/0x3e0
[ 68.390145] proc_reg_read+0x55/0xa0
[ 68.391257] ? vfs_read+0x78/0x1b0
[ 68.392336] vfs_read+0xa7/0x1b0
[ 68.393328] ksys_read+0x68/0xe0
[ 68.394308] do_syscall_64+0x3b/0x90
[ 68.395391] entry_SYSCALL_64_after_hwframe+0x44/0xae
[ 68.396804] RIP: 0033:0x7fcc11cf92e2
[ 68.397824] Code: c0 e9 b2 fe ff ff 50 48 8d 3d ea 2e 0a 00 e8 95 e9 01 00 0f 1f 44 00 00 f3 0f 1e fa 64 8b 04 25 18 00 00 00 85 c0 75 10 0f 05 <48> 3d 00 f0 ff ff 77 56 c3 0f 1f 44 00 00 48 83 ec 28 48 89 54 24
[ 68.402420] RSP: 002b:00007ffd6e0f8da8 EFLAGS: 00000246 ORIG_RAX: 0000000000000000
[ 68.404357] RAX: ffffffffffffffda RBX: 0000565439305b20 RCX: 00007fcc11cf92e2
[ 68.406061] RDX: 0000000000800000 RSI: 00007fcc0f980010 RDI: 0000000000000003
[ 68.407747] RBP: 00007fcc11dcd300 R08: 0000000000000003 R09: 00007fcc0d980010
[ 68.410937] R10: 0000000003826000 R11: 0000000000000246 R12: 00007fcc0f980010
[ 68.412624] R13: 0000000000000d68 R14: 00007fcc11dcc700 R15: 0000000000800000
[ 68.414322] Modules linked in: intel_rapl_msr intel_rapl_common nfit kvm_intel kvm irqbypass rapl iTCO_wdt iTCO_vendor_support i2c_i801 i2c_smbus lpc_ich drm drm_panel_orientation_quirks zram xfs crct10dif_pclmul crc32_pclmul crc32c_intel ghash_clmulni_intel
[ 68.419591] ---[ end trace e2c30f827226966b ]---
[ 68.420969] RIP: 0010:kern_addr_valid+0x150/0x300
[ 68.422308] Code: 1f 40 00 48 8b 0d e8 12 61 01 48 85 f6 0f 85 ca 00 00 00 48 81 e1 00 f0 ff ff 48 21 c1 48 b8 00 00 00 00 80 88 ff ff 48 01 ca <48> 8b 3c 02 48 f7 c7 9f ff ff ff 0f 84 d8 fe ff ff 48 89 f8 0f 1f
[ 68.426826] RSP: 0018:ffffc90000bcbc38 EFLAGS: 00010206
[ 68.428150] RAX: ffff888000000000 RBX: 0000000000001000 RCX: 000ffffffcbff000
[ 68.429813] RDX: 000ffffffcbff000 RSI: 0000000000000000 RDI: 800ffffffcbff062
[ 68.431465] RBP: ffffc90000bcbea8 R08: 0000000000001000 R09: 0000000000000000
[ 68.433115] R10: 0000000000000000 R11: 0000000000001000 R12: 00007fcc0fd80010
[ 68.434768] R13: ffffffff83400000 R14: 0000000000400000 R15: ffffffff843d23e0
[ 68.436423] FS: 00007fcc111fcc80(0000) GS:ffff888275e00000(0000) knlGS:0000000000000000
[ 68.438354] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 68.442077] CR2: 00007fcc0fd80000 CR3: 000000011226e004 CR4: 0000000000770ee0
[ 68.443727] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 68.445370] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[ 68.447010] PKRU: 55555554


Attachments:
(No filename) (5.21 kB)
config (151.93 kB)

2021-08-16 17:52:41

by David Hildenbrand

Subject: Re: [BUG] general protection fault when reading /proc/kcore

On 16.08.21 19:34, Jiri Olsa wrote:
> hi,
> I'm getting fault below when running:
>
> # cat /proc/kallsyms | grep ksys_read
> ffffffff8136d580 T ksys_read
> # objdump -d --start-address=0xffffffff8136d580 --stop-address=0xffffffff8136d590 /proc/kcore
>
> /proc/kcore: file format elf64-x86-64
>
> Segmentation fault
>
> any idea? config is attached

Just tried with a different config on 5.14.0-rc6+

[root@localhost ~]# cat /proc/kallsyms | grep ksys_read
ffffffff8927a800 T ksys_readahead
ffffffff89333660 T ksys_read

[root@localhost ~]# objdump -d --start-address=0xffffffff89333660 --stop-address=0xffffffff89333670

a.out: file format elf64-x86-64



kern_addr_valid(start) seems to fault in your case, which is weird,
because it merely walks the page tables. But it seems to complain about
a non-canonical address, 0xf887ffcbff000.
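
For reference, the reported address can be reproduced from the register
dump. The faulting bytes "<48> 8b 3c 02" decode to
mov (%rdx,%rax,1),%rdi, so the load address is RAX + RDX. A small Python
sketch of that arithmetic follows. It assumes 4-level paging (48-bit
virtual addresses), and the helper name is_canonical is made up for
illustration:

```python
MASK64 = (1 << 64) - 1


def is_canonical(addr: int, va_bits: int = 48) -> bool:
    """Return True if bits 63..(va_bits - 1) of addr are all equal.

    va_bits=48 is an assumption (4-level paging); 5-level paging uses 57.
    """
    top = addr >> (va_bits - 1)
    all_ones = (1 << (64 - va_bits + 1)) - 1
    return top == 0 or top == all_ones


# Register values copied from the oops in the original report.
rax = 0xffff888000000000
rdx = 0x000ffffffcbff000

# The faulting load reads from RAX + RDX, truncated to 64 bits.
fault = (rax + rdx) & MASK64

print(hex(fault))            # address the CPU tried to read
print(is_canonical(rax))     # RAX on its own
print(is_canonical(fault))   # the sum
```

RAX looks like the start of the direct map and RDX like a physical
address masked out of a page-table entry, so the sum only leaves the
canonical range because RDX is implausibly large. That reading is a
guess from the dump, not a confirmed diagnosis.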

Can you post your QEMU cmdline? Did you test this on other kernel versions?

Thanks!

--
Thanks,

David / dhildenb

2021-08-16 17:57:04

by David Hildenbrand

Subject: Re: [BUG] general protection fault when reading /proc/kcore

On 16.08.21 19:49, David Hildenbrand wrote:
> On 16.08.21 19:34, Jiri Olsa wrote:
>> hi,
>> I'm getting fault below when running:
>>
>> # cat /proc/kallsyms | grep ksys_read
>> ffffffff8136d580 T ksys_read
>> # objdump -d --start-address=0xffffffff8136d580 --stop-address=0xffffffff8136d590 /proc/kcore
>>
>> /proc/kcore: file format elf64-x86-64
>>
>> Segmentation fault
>>
>> any idea? config is attached
>
> Just tried with a different config on 5.14.0-rc6+
>
> [root@localhost ~]# cat /proc/kallsyms | grep ksys_read
> ffffffff8927a800 T ksys_readahead
> ffffffff89333660 T ksys_read
>
> [root@localhost ~]# objdump -d --start-address=0xffffffff89333660
> --stop-address=0xffffffff89333670
>
> a.out: file format elf64-x86-64


Sorry, missed the /proc/kcore part:

[root@localhost ~]# cat /proc/kallsyms | grep ksys_read
ffffffffba27a800 T ksys_readahead
ffffffffba333660 T ksys_read
[root@localhost ~]# objdump -d --start-address=0xffffffffba333660 --stop-address=0xffffffffba333670 /proc/kcore

/proc/kcore: file format elf64-x86-64


Disassembly of section load1:

ffffffffba333660 <load1+0x333660>:
ffffffffba333660: 0f 1f 44 00 00 nopl 0x0(%rax,%rax,1)
ffffffffba333665: 41 55 push %r13
ffffffffba333667: 49 89 d5 mov %rdx,%r13
ffffffffba33366a: 41 54 push %r12
ffffffffba33366c: 49 89 f4 mov %rsi,%r12
ffffffffba33366f: 55 push %rbp
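
The same steps can be scripted. A minimal Python sketch, assuming
kallsyms lines have the usual "address type name" layout. find_symbol
and objdump_args are hypothetical helpers, not part of any tool. Matching
the name exactly avoids also picking up ksys_readahead, as the grep
above does:

```python
def find_symbol(kallsyms_text, name):
    """Return the address of an exactly matching symbol, or None."""
    for line in kallsyms_text.splitlines():
        parts = line.split()
        # Layout: <hex address> <type> <name> [module]
        if len(parts) >= 3 and parts[2] == name:
            return int(parts[0], 16)
    return None


def objdump_args(addr, length=0x10, path="/proc/kcore"):
    """Build the objdump argument list used in this thread."""
    return [
        "objdump", "-d",
        f"--start-address={addr:#x}",
        f"--stop-address={addr + length:#x}",
        path,
    ]


# Sample lines copied from the output above; on a live system the text
# would come from reading /proc/kallsyms as root.
sample = (
    "ffffffffba27a800 T ksys_readahead\n"
    "ffffffffba333660 T ksys_read\n"
)

addr = find_symbol(sample, "ksys_read")
args = objdump_args(addr)
print(" ".join(args))
```

The list can be handed to subprocess.run() as-is. On the affected setup
that read is what triggers the fault, so it is best tried in a
throwaway guest.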


--
Thanks,

David / dhildenb

2021-08-16 18:14:54

by Jiri Olsa

Subject: Re: [BUG] general protection fault when reading /proc/kcore

On Mon, Aug 16, 2021 at 07:49:15PM +0200, David Hildenbrand wrote:
> On 16.08.21 19:34, Jiri Olsa wrote:
> > hi,
> > I'm getting fault below when running:
> >
> > # cat /proc/kallsyms | grep ksys_read
> > ffffffff8136d580 T ksys_read
> > # objdump -d --start-address=0xffffffff8136d580 --stop-address=0xffffffff8136d590 /proc/kcore
> >
> > /proc/kcore: file format elf64-x86-64
> >
> > Segmentation fault
> >
> > any idea? config is attached
>
> Just tried with a different config on 5.14.0-rc6+
>
> [root@localhost ~]# cat /proc/kallsyms | grep ksys_read
> ffffffff8927a800 T ksys_readahead
> ffffffff89333660 T ksys_read
>
> [root@localhost ~]# objdump -d --start-address=0xffffffff89333660
> --stop-address=0xffffffff89333670
>
> a.out: file format elf64-x86-64
>
>
>
> The kern_addr_valid(start) seems to fault in your case, which is weird,
> because it merely walks the page tables. But it seems to complain about a
> non-canonical address 0xf887ffcbff000
>
> Can you post your QEMU cmdline? Did you test this on other kernel versions?

I'm using virt-manager so:

/usr/bin/qemu-system-x86_64 -name guest=fedora33,debug-threads=on -S -object secret,id=masterKey0,format=raw,file=/var/lib/libvirt/qemu/domain-13-fedora33/master-key.aes -machine pc-q35-5.1,accel=kvm,usb=off,vmport=off,dump-guest-core=off,memory-backend=pc.ram -cpu Skylake-Server-IBRS,ss=on,vmx=on,pdcm=on,hypervisor=on,tsc-adjust=on,clflushopt=on,umip=on,pku=on,stibp=on,arch-capabilities=on,ssbd=on,xsaves=on,ibpb=on,amd-stibp=on,amd-ssbd=on,skip-l1dfl-vmentry=on,pschange-mc-no=on -m 8192 -object memory-backend-ram,id=pc.ram,size=8589934592 -overcommit mem-lock=off -smp 20,sockets=20,cores=1,threads=1 -uuid 2185d5a9-dbad-4d61-aa4e-97af9fd7ebca -no-user-config -nodefaults -chardev socket,id=charmonitor,fd=36,server,nowait -mon chardev=charmonitor,id=monitor,mode=control -rtc base=utc,driftfix=slew -global kvm-pit.lost_tick_policy=delay -no-hpet -no-shutdown -global ICH9-LPC.disable_s3=1 -global ICH9-LPC.disable_s4=1 -boot strict=on -kernel /home/jolsa/qemu/run/vmlinux -initrd /home/jolsa/qemu/run/initrd -append root=/dev/mapper/fedora_fedora-root ro rd.lvm.lv=fedora_fedora/root console=tty0 console=ttyS0,115200 -device pcie-root-port,port=0x10,chassis=1,id=pci.1,bus=pcie.0,multifunction=on,addr=0x2 -device pcie-root-port,port=0x11,chassis=2,id=pci.2,bus=pcie.0,addr=0x2.0x1 -device pcie-root-port,port=0x12,chassis=3,id=pci.3,bus=pcie.0,addr=0x2.0x2 -device pcie-root-port,port=0x13,chassis=4,id=pci.4,bus=pcie.0,addr=0x2.0x3 -device pcie-root-port,port=0x14,chassis=5,id=pci.5,bus=pcie.0,addr=0x2.0x4 -device pcie-root-port,port=0x15,chassis=6,id=pci.6,bus=pcie.0,addr=0x2.0x5 -device pcie-root-port,port=0x16,chassis=7,id=pci.7,bus=pcie.0,addr=0x2.0x6 -device qemu-xhci,p2=15,p3=15,id=usb,bus=pci.2,addr=0x0 -device virtio-serial-pci,id=virtio-serial0,bus=pci.3,addr=0x0 -blockdev {"driver":"file","filename":"/var/lib/libvirt/images/fedora33.qcow2","node-name":"libvirt-2-storage","auto-read-only":true,"discard":"unmap"} -blockdev 
{"node-name":"libvirt-2-format","read-only":false,"driver":"qcow2","file":"libvirt-2-storage","backing":null} -device virtio-blk-pci,bus=pci.4,addr=0x0,drive=libvirt-2-format,id=virtio-disk0,bootindex=1 -device ide-cd,bus=ide.0,id=sata0-0-0 -netdev tap,fd=38,id=hostnet0,vhost=on,vhostfd=39 -device virtio-net-pci,netdev=hostnet0,id=net0,mac=52:54:00:f3:c6:e7,bus=pci.1,addr=0x0 -chardev pty,id=charserial0 -device isa-serial,chardev=charserial0,id=serial0 -chardev socket,id=charchannel0,fd=40,server,nowait -device virtserialport,bus=virtio-serial0.0,nr=1,chardev=charchannel0,id=channel0,name=org.qemu.guest_agent.0 -chardev spicevmc,id=charchannel1,name=vdagent -device virtserialport,bus=virtio-serial0.0,nr=2,chardev=charchannel1,id=channel1,name=com.redhat.spice.0 -device usb-tablet,id=input0,bus=usb.0,port=1 -spice port=5900,addr=127.0.0.1,disable-ticketing,image-compression=off,seamless-migration=on -device qxl-vga,id=video0,ram_size=67108864,vram_size=67108864,vram64_size_mb=0,vgamem_mb=16,max_outputs=1,bus=pcie.0,addr=0x1 -device ich9-intel-hda,id=sound0,bus=pcie.0,addr=0x1b -device hda-duplex,id=sound0-codec0,bus=sound0.0,cad=0 -chardev spicevmc,id=charredir0,name=usbredir -device usb-redir,chardev=charredir0,id=redir0,bus=usb.0,port=2 -chardev spicevmc,id=charredir1,name=usbredir -device usb-redir,chardev=charredir1,id=redir1,bus=usb.0,port=3 -device virtio-balloon-pci,id=balloon0,bus=pci.5,addr=0x0 -object rng-random,id=objrng0,filename=/dev/urandom -device virtio-rng-pci,rng=objrng0,id=rng0,bus=pci.6,addr=0x0 -sandbox on,obsolete=deny,elevateprivileges=deny,spawn=deny,resourcecontrol=deny -msg timestamp=on

so far I tested just bpf-next/master:
git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next.git

and just removed my changes to make sure it wasn't me ;-)

I'll try to find a version that worked for me before


thanks,
jirka

2021-08-16 18:41:12

by David Hildenbrand

Subject: Re: [BUG] general protection fault when reading /proc/kcore

On 16.08.21 20:12, Jiri Olsa wrote:
> On Mon, Aug 16, 2021 at 07:49:15PM +0200, David Hildenbrand wrote:
>> On 16.08.21 19:34, Jiri Olsa wrote:
>>> hi,
>>> I'm getting fault below when running:
>>>
>>> # cat /proc/kallsyms | grep ksys_read
>>> ffffffff8136d580 T ksys_read
>>> # objdump -d --start-address=0xffffffff8136d580 --stop-address=0xffffffff8136d590 /proc/kcore
>>>
>>> /proc/kcore: file format elf64-x86-64
>>>
>>> Segmentation fault
>>>
>>> any idea? config is attached
>>
>> Just tried with a different config on 5.14.0-rc6+
>>
>> [root@localhost ~]# cat /proc/kallsyms | grep ksys_read
>> ffffffff8927a800 T ksys_readahead
>> ffffffff89333660 T ksys_read
>>
>> [root@localhost ~]# objdump -d --start-address=0xffffffff89333660
>> --stop-address=0xffffffff89333670
>>
>> a.out: file format elf64-x86-64
>>
>>
>>
>> The kern_addr_valid(start) seems to fault in your case, which is weird,
>> because it merely walks the page tables. But it seems to complain about a
>> non-canonical address 0xf887ffcbff000
>>
>> Can you post your QEMU cmdline? Did you test this on other kernel versions?
>
> I'm using virt-manager so:
>
> /usr/bin/qemu-system-x86_64 -name guest=fedora33,debug-threads=on -S -object secret,id=masterKey0,format=raw,file=/var/lib/libvirt/qemu/domain-13-fedora33/master-key.aes -machine pc-q35-5.1,accel=kvm,usb=off,vmport=off,dump-guest-core=off,memory-backend=pc.ram -cpu Skylake-Server-IBRS,ss=on,vmx=on,pdcm=on,hypervisor=on,tsc-adjust=on,clflushopt=on,umip=on,pku=on,stibp=on,arch-capabilities=on,ssbd=on,xsaves=on,ibpb=on,amd-stibp=on,amd-ssbd=on,skip-l1dfl-vmentry=on,pschange-mc-no=on -m 8192 -object memory-backend-ram,id=pc.ram,size=8589934592 -overcommit mem-lock=off -smp 20,sockets=20,cores=1,threads=1 -uuid 2185d5a9-dbad-4d61-aa4e-97af9fd7ebca -no-user-config -nodefaults -chardev socket,id=charmonitor,fd=36,server,nowait -mon chardev=charmonitor,id=monitor,mode=control -rtc base=utc,driftfix=slew -global kvm-pit.lost_tick_policy=delay -no-hpet -no-shutdown -global ICH9-LPC.disable_s3=1 -global ICH9-LPC.disable_s4=1 -boot strict=on -kernel /home/jolsa/qemu/run/vmlinux -initrd /home/jolsa/qemu/run/initrd -append root=/dev/mapper/fedora_fedora-root ro rd.lvm.lv=fedora_fedora/root console=tty0 console=ttyS0,115200 -device pcie-root-port,port=0x10,chassis=1,id=pci.1,bus=pcie.0,multifunction=on,addr=0x2 -device pcie-root-port,port=0x11,chassis=2,id=pci.2,bus=pcie.0,addr=0x2.0x1 -device pcie-root-port,port=0x12,chassis=3,id=pci.3,bus=pcie.0,addr=0x2.0x2 -device pcie-root-port,port=0x13,chassis=4,id=pci.4,bus=pcie.0,addr=0x2.0x3 -device pcie-root-port,port=0x14,chassis=5,id=pci.5,bus=pcie.0,addr=0x2.0x4 -device pcie-root-port,port=0x15,chassis=6,id=pci.6,bus=pcie.0,addr=0x2.0x5 -device pcie-root-port,port=0x16,chassis=7,id=pci.7,bus=pcie.0,addr=0x2.0x6 -device qemu-xhci,p2=15,p3=15,id=usb,bus=pci.2,addr=0x0 -device virtio-serial-pci,id=virtio-serial0,bus=pci.3,addr=0x0 -blockdev {"driver":"file","filename":"/var/lib/libvirt/images/fedora33.qcow2","node-name":"libvirt-2-storage","auto-read-only":true,"discard":"unmap"} -blockdev 
{"node-name":"libvirt-2-format","read-only":false,"driver":"qcow2","file":"libvirt-2-storage","backing":null} -device virtio-blk-pci,bus=pci.4,addr=0x0,drive=libvirt-2-format,id=virtio-disk0,bootindex=1 -device ide-cd,bus=ide.0,id=sata0-0-0 -netdev tap,fd=38,id=hostnet0,vhost=on,vhostfd=39 -device virtio-net-pci,netdev=hostnet0,id=net0,mac=52:54:00:f3:c6:e7,bus=pci.1,addr=0x0 -chardev pty,id=charserial0 -device isa-serial,chardev=charserial0,id=serial0 -chardev socket,id=charchannel0,fd=40,server,nowait -device virtserialport,bus=virtio-serial0.0,nr=1,chardev=charchannel0,id=channel0,name=org.qemu.guest_agent.0 -chardev spicevmc,id=charchannel1,name=vdagent -device virtserialport,bus=virtio-serial0.0,nr=2,chardev=charchannel1,id=channel1,name=com.redhat.spice.0 -device usb-tablet,id=input0,bus=usb.0,port=1 -spice port=5900,addr=127.0.0.1,disable-ticketing,image-compression=off,seamless-migration=on -device qxl-vga,id=video0,ram_size=67108864,vram_size=67108864,vram64_size_mb=0,vgamem_mb=16,max_outputs=1,bus=pcie.0,addr=0x1 -device ich9-intel-hda,id=sound0,bus=pcie.0,addr=0x1b -device hda-duplex,id=sound0-codec0,bus=sound0.0,cad=0 -chardev spicevmc,id=charredir0,name=usbredir -device usb-redir,chardev=charredir0,id=redir0,bus=usb.0,port=2 -chardev spicevmc,id=charredir1,name=usbredir -device usb-redir,chardev=charredir1,id=redir1,bus=usb.0,port=3 -device virtio-balloon-pci,id=balloon0,bus=pci.5,addr=0x0 -object rng-random,id=objrng0,filename=/dev/urandom -device virtio-rng-pci,rng=objrng0,id=rng0,bus=pci.6,addr=0x0 -sandbox on,obsolete=deny,elevateprivileges=deny,spawn=deny,resourcecontrol=deny -msg timestamp=on
>
> so far I tested just bpf-next/master:
> git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next.git
>

Just tried with upstream Linux (5.14.0-rc6) and your config without
triggering it. I'm using "-cpu host", though, on an AMD Ryzen 9 3900X.

> and just removed my changes to make sure it wasn't me ;-)

:)

>
> I'll try to find a version that worked for me before

Can you try with upstream Linux as well?


--
Thanks,

David / dhildenb

2021-08-16 19:15:17

by Mike Rapoport

Subject: Re: [BUG] general protection fault when reading /proc/kcore

On Mon, Aug 16, 2021 at 08:38:43PM +0200, David Hildenbrand wrote:
> On 16.08.21 20:12, Jiri Olsa wrote:
> > On Mon, Aug 16, 2021 at 07:49:15PM +0200, David Hildenbrand wrote:
> > > On 16.08.21 19:34, Jiri Olsa wrote:
> > > > hi,
> > > > I'm getting fault below when running:
> > > >
> > > > # cat /proc/kallsyms | grep ksys_read
> > > > ffffffff8136d580 T ksys_read
> > > > # objdump -d --start-address=0xffffffff8136d580 --stop-address=0xffffffff8136d590 /proc/kcore
> > > >
> > > > /proc/kcore: file format elf64-x86-64
> > > >
> > > > Segmentation fault
> > > >
> > > > any idea? config is attached
> > >
> > > Just tried with a different config on 5.14.0-rc6+
> > >
> > > [root@localhost ~]# cat /proc/kallsyms | grep ksys_read
> > > ffffffff8927a800 T ksys_readahead
> > > ffffffff89333660 T ksys_read
> > >
> > > [root@localhost ~]# objdump -d --start-address=0xffffffff89333660
> > > --stop-address=0xffffffff89333670
> > >
> > > a.out: file format elf64-x86-64
> > >
> > >
> > >
> > > The kern_addr_valid(start) seems to fault in your case, which is weird,
> > > because it merely walks the page tables. But it seems to complain about a
> > > non-canonical address 0xf887ffcbff000
> > >
> > > Can you post your QEMU cmdline? Did you test this on other kernel versions?
> >
> > I'm using virt-manager so:
> >
> > /usr/bin/qemu-system-x86_64 -name guest=fedora33,debug-threads=on -S -object secret,id=masterKey0,format=raw,file=/var/lib/libvirt/qemu/domain-13-fedora33/master-key.aes -machine pc-q35-5.1,accel=kvm,usb=off,vmport=off,dump-guest-core=off,memory-backend=pc.ram -cpu Skylake-Server-IBRS,ss=on,vmx=on,pdcm=on,hypervisor=on,tsc-adjust=on,clflushopt=on,umip=on,pku=on,stibp=on,arch-capabilities=on,ssbd=on,xsaves=on,ibpb=on,amd-stibp=on,amd-ssbd=on,skip-l1dfl-vmentry=on,pschange-mc-no=on -m 8192 -object memory-backend-ram,id=pc.ram,size=8589934592 -overcommit mem-lock=off -smp 20,sockets=20,cores=1,threads=1 -uuid 2185d5a9-dbad-4d61-aa4e-97af9fd7ebca -no-user-config -nodefaults -chardev socket,id=charmonitor,fd=36,server,nowait -mon chardev=charmonitor,id=monitor,mode=control -rtc base=utc,driftfix=slew -global kvm-pit.lost_tick_policy=delay -no-hpet -no-shutdown -global ICH9-LPC.disable_s3=1 -global ICH9-LPC.disable_s4=1 -boot strict=on -kernel /home/jolsa/qemu/run/vmlinux -initrd /home/jolsa/qemu/run/initrd -append root=/dev/mapper/fedora_fedora-root ro rd.lvm.lv=fedora_fedora/root console=tty0 console=ttyS0,115200 -device pcie-root-port,port=0x10,chassis=1,id=pci.1,bus=pcie.0,multifunction=on,addr=0x2 -device pcie-root-port,port=0x11,chassis=2,id=pci.2,bus=pcie.0,addr=0x2.0x1 -device pcie-root-port,port=0x12,chassis=3,id=pci.3,bus=pcie.0,addr=0x2.0x2 -device pcie-root-port,port=0x13,chassis=4,id=pci.4,bus=pcie.0,addr=0x2.0x3 -device pcie-root-port,port=0x14,chassis=5,id=pci.5,bus=pcie.0,addr=0x2.0x4 -device pcie-root-port,port=0x15,chassis=6,id=pci.6,bus=pcie.0,addr=0x2.0x5 -device pcie-root-port,port=0x16,chassis=7,id=pci.7,bus=pcie.0,addr=0x2.0x6 -device qemu-xhci,p2=15,p3=15,id=usb,bus=pci.2,addr=0x0 -device virtio-serial-pci,id=virtio-serial0,bus=pci.3,addr=0x0 -blockdev {"driver":"file","filename":"/var/lib/libvirt/images/fedora33.qcow2","node-name":"libvirt-2-storage","auto-read-only":true,"discard":"unmap"} -blockdev 
{"node-name":"libvirt-2-format","read-only":false,"driver":"qcow2","file":"libvirt-2-storage","backing":null} -device virtio-blk-pci,bus=pci.4,addr=0x0,drive=libvirt-2-format,id=virtio-disk0,bootindex=1 -device ide-cd,bus=ide.0,id=sata0-0-0 -netdev tap,fd=38,id=hostnet0,vhost=on,vhostfd=39 -device virtio-net-pci,netdev=hostnet0,id=net0,mac=52:54:00:f3:c6:e7,bus=pci.1,addr=0x0 -chardev pty,id=charserial0 -device isa-serial,chardev=charserial0,id=serial0 -chardev socket,id=charchannel0,fd=40,server,nowait -device virtserialport,bus=virtio-serial0.0,nr=1,chardev=charchannel0,id=channel0,name=org.qemu.guest_agent.0 -chardev spicevmc,id=charchannel1,name=vdagent -device virtserialport,bus=virtio-serial0.0,nr=2,chardev=charchannel1,id=channel1,name=com.redhat.spice.0 -device usb-tablet,id=input0,bus=usb.0,port=1 -spice port=5900,addr=127.0.0.1,disable-ticketing,image-compression=off,seamless-migration=on -device qxl-vga,id=video0,ram_size=67108864,vram_size=67108864,vram64_size_mb=0,vgamem_mb=16,max_outputs=1,bus=pcie.0,addr=0x1 -device ich9-intel-hda,id=sound0,bus=pcie.0,addr=0x1b -device hda-duplex,id=sound0-codec0,bus=sound0.0,cad=0 -chardev spicevmc,id=charredir0,name=usbredir -device usb-redir,chardev=charredir0,id=redir0,bus=usb.0,port=2 -chardev spicevmc,id=charredir1,name=usbredir -device usb-redir,chardev=charredir1,id=redir1,bus=usb.0,port=3 -device virtio-balloon-pci,id=balloon0,bus=pci.5,addr=0x0 -object rng-random,id=objrng0,filename=/dev/urandom -device virtio-rng-pci,rng=objrng0,id=rng0,bus=pci.6,addr=0x0 -sandbox on,obsolete=deny,elevateprivileges=deny,spawn=deny,resourcecontrol=deny -msg timestamp=on



> > so far I tested just bpf-next/master:
> > git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next.git
> >
>
> Just tried with upstream Linux (5.14.0-rc6) and your config without
> triggering it. I'm using "-cpu host", though, on an AMD Ryzen 9 3900X

With Jiri's config and '-cpu <very long string>' it triggers for me on
v5.14-rc6.

I'll also try to take a look tomorrow.

> > and just removed my changes to make sure it wasn't me ;-)
>
> :)
>
> >
> > I'll try to find a version that worked for me before
>
> Can you try with upstream Linux as well?
>
>
> --
> Thanks,
>
> David / dhildenb

--
Sincerely yours,
Mike.

2021-08-16 19:21:44

by David Hildenbrand

Subject: Re: [BUG] general protection fault when reading /proc/kcore

On 16.08.21 21:12, Mike Rapoport wrote:
> On Mon, Aug 16, 2021 at 08:38:43PM +0200, David Hildenbrand wrote:
>> On 16.08.21 20:12, Jiri Olsa wrote:
>>> On Mon, Aug 16, 2021 at 07:49:15PM +0200, David Hildenbrand wrote:
>>>> On 16.08.21 19:34, Jiri Olsa wrote:
>>>>> hi,
>>>>> I'm getting fault below when running:
>>>>>
>>>>> # cat /proc/kallsyms | grep ksys_read
>>>>> ffffffff8136d580 T ksys_read
>>>>> # objdump -d --start-address=0xffffffff8136d580 --stop-address=0xffffffff8136d590 /proc/kcore
>>>>>
>>>>> /proc/kcore: file format elf64-x86-64
>>>>>
>>>>> Segmentation fault
>>>>>
>>>>> any idea? config is attached
>>>>
>>>> Just tried with a different config on 5.14.0-rc6+
>>>>
>>>> [root@localhost ~]# cat /proc/kallsyms | grep ksys_read
>>>> ffffffff8927a800 T ksys_readahead
>>>> ffffffff89333660 T ksys_read
>>>>
>>>> [root@localhost ~]# objdump -d --start-address=0xffffffff89333660
>>>> --stop-address=0xffffffff89333670
>>>>
>>>> a.out: file format elf64-x86-64
>>>>
>>>>
>>>>
>>>> The kern_addr_valid(start) seems to fault in your case, which is weird,
>>>> because it merely walks the page tables. But it seems to complain about a
>>>> non-canonical address 0xf887ffcbff000
>>>>
>>>> Can you post your QEMU cmdline? Did you test this on other kernel versions?
>>>
>>> I'm using virt-manager so:
>>>
>>> /usr/bin/qemu-system-x86_64 -name guest=fedora33,debug-threads=on -S -object secret,id=masterKey0,format=raw,file=/var/lib/libvirt/qemu/domain-13-fedora33/master-key.aes -machine pc-q35-5.1,accel=kvm,usb=off,vmport=off,dump-guest-core=off,memory-backend=pc.ram -cpu Skylake-Server-IBRS,ss=on,vmx=on,pdcm=on,hypervisor=on,tsc-adjust=on,clflushopt=on,umip=on,pku=on,stibp=on,arch-capabilities=on,ssbd=on,xsaves=on,ibpb=on,amd-stibp=on,amd-ssbd=on,skip-l1dfl-vmentry=on,pschange-mc-no=on -m 8192 -object memory-backend-ram,id=pc.ram,size=8589934592 -overcommit mem-lock=off -smp 20,sockets=20,cores=1,threads=1 -uuid 2185d5a9-dbad-4d61-aa4e-97af9fd7ebca -no-user-config -nodefaults -chardev socket,id=charmonitor,fd=36,server,nowait -mon chardev=charmonitor,id=monitor,mode=control -rtc base=utc,driftfix=slew -global kvm-pit.lost_tick_policy=delay -no-hpet -no-shutdown -global ICH9-LPC.disable_s3=1 -global ICH9-LPC.disable_s4=1 -boot strict=on -kernel /home/jolsa/qemu/run/vmlinux -initrd /home/jolsa/qemu/run/initrd -append root=/dev/mapper/fedora_fedora-root ro rd.lvm.lv=fedora_fedora/root console=tty0 console=ttyS0,115200 -device pcie-root-port,port=0x10,chassis=1,id=pci.1,bus=pcie.0,multifunction=on,addr=0x2 -device pcie-root-port,port=0x11,chassis=2,id=pci.2,bus=pcie.0,addr=0x2.0x1 -device pcie-root-port,port=0x12,chassis=3,id=pci.3,bus=pcie.0,addr=0x2.0x2 -device pcie-root-port,port=0x13,chassis=4,id=pci.4,bus=pcie.0,addr=0x2.0x3 -device pcie-root-port,port=0x14,chassis=5,id=pci.5,bus=pcie.0,addr=0x2.0x4 -device pcie-root-port,port=0x15,chassis=6,id=pci.6,bus=pcie.0,addr=0x2.0x5 -device pcie-root-port,port=0x16,chassis=7,id=pci.7,bus=pcie.0,addr=0x2.0x6 -device qemu-xhci,p2=15,p3=15,id=usb,bus=pci.2,addr=0x0 -device virtio-serial-pci,id=virtio-serial0,bus=pci.3,addr=0x0 -blockdev {"driver":"file","filename":"/var/lib/libvirt/images/fedora33.qcow2","node-name":"libvirt-2-storage","auto-read-only":true,"discard":"unmap"} -blockdev 
{"node-name":"libvirt-2-format","read-only":false,"driver":"qcow2","file":"libvirt-2-storage","backing":null} -device virtio-blk-pci,bus=pci.4,addr=0x0,drive=libvirt-2-format,id=virtio-disk0,bootindex=1 -device ide-cd,bus=ide.0,id=sata0-0-0 -netdev tap,fd=38,id=hostnet0,vhost=on,vhostfd=39 -device virtio-net-pci,netdev=hostnet0,id=net0,mac=52:54:00:f3:c6:e7,bus=pci.1,addr=0x0 -chardev pty,id=charserial0 -device isa-serial,chardev=charserial0,id=serial0 -chardev socket,id=charchannel0,fd=40,server,nowait -device virtserialport,bus=virtio-serial0.0,nr=1,chardev=charchannel0,id=channel0,name=org.qemu.guest_agent.0 -chardev spicevmc,id=charchannel1,name=vdagent -device virtserialport,bus=virtio-serial0.0,nr=2,chardev=charchannel1,id=channel1,name=com.redhat.spice.0 -device usb-tablet,id=input0,bus=usb.0,port=1 -spice port=5900,addr=127.0.0.1,disable-ticketing,image-compression=off,seamless-migration=on -device qxl-vga,id=video0,ram_size=67108864,vram_size=67108864,vram64_size_mb=0,vgamem_mb=16,max_outputs=1,bus=pcie.0,addr=0x1 -device ich9-intel-hda,id=sound0,bus=pcie.0,addr=0x1b -device hda-duplex,id=sound0-codec0,bus=sound0.0,cad=0 -chardev spicevmc,id=charredir0,name=usbredir -device usb-redir,chardev=charredir0,id=redir0,bus=usb.0,port=2 -chardev spicevmc,id=charredir1,name=usbredir -device usb-redir,chardev=charredir1,id=redir1,bus=usb.0,port=3 -device virtio-balloon-pci,id=balloon0,bus=pci.5,addr=0x0 -object rng-random,id=objrng0,filename=/dev/urandom -device virtio-rng-pci,rng=objrng0,id=rng0,bus=pci.6,addr=0x0 -sandbox on,obsolete=deny,elevateprivileges=deny,spawn=deny,resourcecontrol=deny -msg timestamp=on
>
>
>
>>> so far I tested just bpf-next/master:
>>> git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next.git
>>>
>>
>> Just tried with upstream Linux (5.14.0-rc6) and your config without
>> triggering it. I'm using "-cpu host", though, on an AMD Ryzen 9 3900X
>
> With Jiri's config and '-cpu <very long string>' it triggers for me on
> v5.14-rc6.
>
> I'll also try to take a look tomorrow.

No luck here on my AMD system, even with that '-cpu <very long string>'.
Maybe some relevant CPU features get silently ignored because they are
not actually available on my system.
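
One way to check that would be to split the -cpu string into its
feature flags and compare them with the flags line from /proc/cpuinfo
inside the guest. A rough Python sketch follows. parse_cpu_option and
missing_in_guest are hypothetical helpers, the guest flag set is
invented for the example, and mapping QEMU names to cpuinfo names by
replacing "-" with "_" is an assumption that will not hold for every
flag:

```python
def parse_cpu_option(option):
    """Split 'model,feat=on,feat=off,...' into (model, {feat: bool})."""
    model, *props = option.split(",")
    features = {}
    for prop in props:
        name, _, value = prop.partition("=")
        # Anything other than an explicit "on" is treated as disabled.
        features[name] = (value == "on")
    return model, features


def missing_in_guest(features, guest_flags):
    """List requested features that do not show up in the guest flags."""
    return sorted(
        name
        for name, enabled in features.items()
        if enabled and name.replace("-", "_") not in guest_flags
    )


# Shortened form of the -cpu string quoted above.
model, features = parse_cpu_option(
    "Skylake-Server-IBRS,ss=on,vmx=on,umip=on,pku=on,tsc-adjust=on"
)

# Hypothetical guest flags; in practice, the "flags" line of
# /proc/cpuinfo split on whitespace.
guest_flags = {"ss", "vmx", "tsc_adjust"}

missing = missing_in_guest(features, guest_flags)
print(model, missing)
```

Anything listed as missing is only a hint that the host could not
provide the feature and still needs to be checked by hand.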

--
Thanks,

David / dhildenb

2021-08-17 07:45:43

by Jiri Olsa

Subject: Re: [BUG] general protection fault when reading /proc/kcore

On Mon, Aug 16, 2021 at 08:38:43PM +0200, David Hildenbrand wrote:
> On 16.08.21 20:12, Jiri Olsa wrote:
> > On Mon, Aug 16, 2021 at 07:49:15PM +0200, David Hildenbrand wrote:
> > > On 16.08.21 19:34, Jiri Olsa wrote:
> > > > hi,
> > > > I'm getting fault below when running:
> > > >
> > > > # cat /proc/kallsyms | grep ksys_read
> > > > ffffffff8136d580 T ksys_read
> > > > # objdump -d --start-address=0xffffffff8136d580 --stop-address=0xffffffff8136d590 /proc/kcore
> > > >
> > > > /proc/kcore: file format elf64-x86-64
> > > >
> > > > Segmentation fault
> > > >
> > > > any idea? config is attached
> > >
> > > Just tried with a different config on 5.14.0-rc6+
> > >
> > > [root@localhost ~]# cat /proc/kallsyms | grep ksys_read
> > > ffffffff8927a800 T ksys_readahead
> > > ffffffff89333660 T ksys_read
> > >
> > > [root@localhost ~]# objdump -d --start-address=0xffffffff89333660
> > > --stop-address=0xffffffff89333670
> > >
> > > a.out: file format elf64-x86-64
> > >
> > >
> > >
> > > The kern_addr_valid(start) seems to fault in your case, which is weird,
> > > because it merely walks the page tables. But it seems to complain about a
> > > non-canonical address 0xf887ffcbff000
> > >
> > > Can you post your QEMU cmdline? Did you test this on other kernel versions?
> >
> > I'm using virt-manager so:
> >
> > /usr/bin/qemu-system-x86_64 -name guest=fedora33,debug-threads=on -S -object secret,id=masterKey0,format=raw,file=/var/lib/libvirt/qemu/domain-13-fedora33/master-key.aes -machine pc-q35-5.1,accel=kvm,usb=off,vmport=off,dump-guest-core=off,memory-backend=pc.ram -cpu Skylake-Server-IBRS,ss=on,vmx=on,pdcm=on,hypervisor=on,tsc-adjust=on,clflushopt=on,umip=on,pku=on,stibp=on,arch-capabilities=on,ssbd=on,xsaves=on,ibpb=on,amd-stibp=on,amd-ssbd=on,skip-l1dfl-vmentry=on,pschange-mc-no=on -m 8192 -object memory-backend-ram,id=pc.ram,size=8589934592 -overcommit mem-lock=off -smp 20,sockets=20,cores=1,threads=1 -uuid 2185d5a9-dbad-4d61-aa4e-97af9fd7ebca -no-user-config -nodefaults -chardev socket,id=charmonitor,fd=36,server,nowait -mon chardev=charmonitor,id=monitor,mode=control -rtc base=utc,driftfix=slew -global kvm-pit.lost_tick_policy=delay -no-hpet -no-shutdown -global ICH9-LPC.disable_s3=1 -global ICH9-LPC.disable_s4=1 -boot strict=on -kernel /home/jolsa/qemu/run/vmlinux -initrd /home/jolsa/qemu/run/initrd -append root=/dev/mapper/fedora_fedora-root ro rd.lvm.lv=fedora_fedora/root console=tty0 console=ttyS0,115200 -device pcie-root-port,port=0x10,chassis=1,id=pci.1,bus=pcie.0,multifunction=on,addr=0x2 -device pcie-root-port,port=0x11,chassis=2,id=pci.2,bus=pcie.0,addr=0x2.0x1 -device pcie-root-port,port=0x12,chassis=3,id=pci.3,bus=pcie.0,addr=0x2.0x2 -device pcie-root-port,port=0x13,chassis=4,id=pci.4,bus=pcie.0,addr=0x2.0x3 -device pcie-root-port,port=0x14,chassis=5,id=pci.5,bus=pcie.0,addr=0x2.0x4 -device pcie-root-port,port=0x15,chassis=6,id=pci.6,bus=pcie.0,addr=0x2.0x5 -device pcie-root-port,port=0x16,chassis=7,id=pci.7,bus=pcie.0,addr=0x2.0x6 -device qemu-xhci,p2=15,p3=15,id=usb,bus=pci.2,addr=0x0 -device virtio-serial-pci,id=virtio-serial0,bus=pci.3,addr=0x0 -blockdev {"driver":"file","filename":"/var/lib/libvirt/images/fedora33.qcow2","node-name":"libvirt-2-storage","auto-read-only":true,"discard":"unmap"} -blockdev 
{"node-name":"libvirt-2-format","read-only":false,"driver":"qcow2","file":"libvirt-2-storage","backing":null} -device virtio-blk-pci,bus=pci.4,addr=0x0,drive=libvirt-2-format,id=virtio-disk0,bootindex=1 -device ide-cd,bus=ide.0,id=sata0-0-0 -netdev tap,fd=38,id=hostnet0,vhost=on,vhostfd=39 -device virtio-net-pci,netdev=hostnet0,id=net0,mac=52:54:00:f3:c6:e7,bus=pci.1,addr=0x0 -chardev pty,id=charserial0 -device isa-serial,chardev=charserial0,id=serial0 -chardev socket,id=charchannel0,fd=40,server,nowait -device virtserialport,bus=virtio-serial0.0,nr=1,chardev=charchannel0,id=channel0,name=org.qemu.guest_agent.0 -chardev spicevmc,id=charchannel1,name=vdagent -device virtserialport,bus=virtio-serial0.0,nr=2,chardev=charchannel1,id=channel1,name=com.redhat.spice.0 -device usb-tablet,id=input0,bus=usb.0,port=1 -spice port=5900,addr=127.0.0.1,disable-ticketing,image-compression=off,seamless-migration=on -device qxl-vga,id=video0,ram_size=67108864,vram_size=67108864,vram64_size_mb=0,vgamem_mb=16,max_outputs=1,bus=pcie.0,addr=0x1 -device ich9-intel-hda,id=sound0,bus=pcie.0,addr=0x1b -device hda-duplex,id=sound0-codec0,bus=sound0.0,cad=0 -chardev spicevmc,id=charredir0,name=usbredir -device usb-redir,chardev=charredir0,id=redir0,bus=usb.0,port=2 -chardev spicevmc,id=charredir1,name=usbredir -device usb-redir,chardev=charredir1,id=redir1,bus=usb.0,port=3 -device virtio-balloon-pci,id=balloon0,bus=pci.5,addr=0x0 -object rng-random,id=objrng0,filename=/dev/urandom -device virtio-rng-pci,rng=objrng0,id=rng0,bus=pci.6,addr=0x0 -sandbox on,obsolete=deny,elevateprivileges=deny,spawn=deny,resourcecontrol=deny -msg timestamp=on
> >
> > so far I tested just bpf-next/master:
> > git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next.git
> >
>
> Just tried with upstream Linux (5.14.0-rc6) and your config without
> triggering it. I'm using "-cpu host", though, on an AMD Ryzen 9 3900X
>
> > and just removed my changes to make sure it wasn't me ;-)
>
> :)
>
> >
> > I'll try to find a version that worked for me before
>
> Can you try with upstream Linux as well?

I tried with the latest Linus tree and v5.12, with the same results

I'm now playing with the cpu config, but I'm getting some
virt-manager errors.. so I'll need to dig in a bit more

thanks,
jirka

2021-08-17 07:58:06

by Mike Rapoport

Subject: Re: [BUG] general protection fault when reading /proc/kcore

On Mon, Aug 16, 2021 at 10:13:18PM +0300, Mike Rapoport wrote:
> On Mon, Aug 16, 2021 at 08:38:43PM +0200, David Hildenbrand wrote:
> > On 16.08.21 20:12, Jiri Olsa wrote:
> > > On Mon, Aug 16, 2021 at 07:49:15PM +0200, David Hildenbrand wrote:
> > > > On 16.08.21 19:34, Jiri Olsa wrote:
> > > > > hi,
> > > > > I'm getting fault below when running:
> > > > >
> > > > > # cat /proc/kallsyms | grep ksys_read
> > > > > ffffffff8136d580 T ksys_read
> > > > > # objdump -d --start-address=0xffffffff8136d580 --stop-address=0xffffffff8136d590 /proc/kcore
> > > > >
> > > > > /proc/kcore: file format elf64-x86-64
> > > > >
> > > > > Segmentation fault
> > > > >
> > > > > any idea? config is attached
> > > >
> > > > Just tried with a different config on 5.14.0-rc6+
> > > >
> > > > [root@localhost ~]# cat /proc/kallsyms | grep ksys_read
> > > > ffffffff8927a800 T ksys_readahead
> > > > ffffffff89333660 T ksys_read
> > > >
> > > > [root@localhost ~]# objdump -d --start-address=0xffffffff89333660
> > > > --stop-address=0xffffffff89333670
> > > >
> > > > a.out: file format elf64-x86-64
> > > >
> > > >
> > > >
> > > > The kern_addr_valid(start) seems to fault in your case, which is weird,
> > > > because it merely walks the page tables. But it seems to complain about a
> > > > non-canonical address 0xf887ffcbff000
> > > >
> > > > Can you post your QEMU cmdline? Did you test this on other kernel versions?
> > >
> > > I'm using virt-manager so:
> > >
> > > /usr/bin/qemu-system-x86_64 -name guest=fedora33,debug-threads=on -S -object secret,id=masterKey0,format=raw,file=/var/lib/libvirt/qemu/domain-13-fedora33/master-key.aes -machine pc-q35-5.1,accel=kvm,usb=off,vmport=off,dump-guest-core=off,memory-backend=pc.ram -cpu Skylake-Server-IBRS,ss=on,vmx=on,pdcm=on,hypervisor=on,tsc-adjust=on,clflushopt=on,umip=on,pku=on,stibp=on,arch-capabilities=on,ssbd=on,xsaves=on,ibpb=on,amd-stibp=on,amd-ssbd=on,skip-l1dfl-vmentry=on,pschange-mc-no=on -m 8192 -object memory-backend-ram,id=pc.ram,size=8589934592 -overcommit mem-lock=off -smp 20,sockets=20,cores=1,threads=1 -uuid 2185d5a9-dbad-4d61-aa4e-97af9fd7ebca -no-user-config -nodefaults -chardev socket,id=charmonitor,fd=36,server,nowait -mon chardev=charmonitor,id=monitor,mode=control -rtc base=utc,driftfix=slew -global kvm-pit.lost_tick_policy=delay -no-hpet -no-shutdown -global ICH9-LPC.disable_s3=1 -global ICH9-LPC.disable_s4=1 -boot strict=on -kernel /home/jolsa/qemu/run/vmlinux -initrd /home/jolsa/qemu/run/initrd -append root=/dev/mapper/fedora_fedora-root ro rd.lvm.lv=fedora_fedora/root console=tty0 console=ttyS0,115200 -device pcie-root-port,port=0x10,chassis=1,id=pci.1,bus=pcie.0,multifunction=on,addr=0x2 -device pcie-root-port,port=0x11,chassis=2,id=pci.2,bus=pcie.0,addr=0x2.0x1 -device pcie-root-port,port=0x12,chassis=3,id=pci.3,bus=pcie.0,addr=0x2.0x2 -device pcie-root-port,port=0x13,chassis=4,id=pci.4,bus=pcie.0,addr=0x2.0x3 -device pcie-root-port,port=0x14,chassis=5,id=pci.5,bus=pcie.0,addr=0x2.0x4 -device pcie-root-port,port=0x15,chassis=6,id=pci.6,bus=pcie.0,addr=0x2.0x5 -device pcie-root-port,port=0x16,chassis=7,id=pci.7,bus=pcie.0,addr=0x2.0x6 -device qemu-xhci,p2=15,p3=15,id=usb,bus=pci.2,addr=0x0 -device virtio-serial-pci,id=virtio-serial0,bus=pci.3,addr=0x0 -blockdev {"driver":"file","filename":"/var/lib/libvirt/images/fedora33.qcow2","node-name":"libvirt-2-storage","auto-read-only":true,"discard":"unmap"} -blockdev 
{"node-name":"libvirt-2-format","read-only":false,"driver":"qcow2","file":"libvirt-2-storage","backing":null} -device virtio-blk-pci,bus=pci.4,addr=0x0,drive=libvirt-2-format,id=virtio-disk0,bootindex=1 -device ide-cd,bus=ide.0,id=sata0-0-0 -netdev tap,fd=38,id=hostnet0,vhost=on,vhostfd=39 -device virtio-net-pci,netdev=hostnet0,id=net0,mac=52:54:00:f3:c6:e7,bus=pci.1,addr=0x0 -chardev pty,id=charserial0 -device isa-serial,chardev=charserial0,id=serial0 -chardev socket,id=charchannel0,fd=40,server,nowait -device virtserialport,bus=virtio-serial0.0,nr=1,chardev=charchannel0,id=channel0,name=org.qemu.guest_agent.0 -chardev spicevmc,id=charchannel1,name=vdagent -device virtserialport,bus=virtio-serial0.0,nr=2,chardev=charchannel1,id=channel1,name=com.redhat.spice.0 -device usb-tablet,id=input0,bus=usb.0,port=1 -spice port=5900,addr=127.0.0.1,disable-ticketing,image-compression=off,seamless-migration=on -device qxl-vga,id=video0,ram_size=67108864,vram_size=67108864,vram64_size_mb=0,vgamem_mb=16,max_outputs=1,bus=pcie.0,addr=0x1 -device ich9-intel-hda,id=sound0,bus=pcie.0,addr=0x1b -device hda-duplex,id=sound0-codec0,bus=sound0.0,cad=0 -chardev spicevmc,id=charredir0,name=usbredir -device usb-redir,chardev=charredir0,id=redir0,bus=usb.0,port=2 -chardev spicevmc,id=charredir1,name=usbredir -device usb-redir,chardev=charredir1,id=redir1,bus=usb.0,port=3 -device virtio-balloon-pci,id=balloon0,bus=pci.5,addr=0x0 -object rng-random,id=objrng0,filename=/dev/urandom -device virtio-rng-pci,rng=objrng0,id=rng0,bus=pci.6,addr=0x0 -sandbox on,obsolete=deny,elevateprivileges=deny,spawn=deny,resourcecontrol=deny -msg timestamp=on
>
> > > so far I tested just bpf-next/master:
> > > git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next.git
> > >
> >
> > Just tried with upstream Linux (5.14.0-rc6) and your config without
> > triggering it. I'm using "-cpu host", though, on an AMD Ryzen 9 3900X
>
> With Jiri's config and '-cpu <very long string>' it triggers for me on
> v5.14-rc6.
>
> I'll also try to take a look tomorrow.

There are some non-zero PMDs in the high kernel mappings that are not
present. The patch below fixes the issue for me: kern_addr_valid() no
longer tries to walk through a not-present PMD. Jiri, can you check if
it works for you?

diff --git a/arch/x86/mm/init_64.c b/arch/x86/mm/init_64.c
index ddeaba947eb3..07b56e90db5d 100644
--- a/arch/x86/mm/init_64.c
+++ b/arch/x86/mm/init_64.c
@@ -1433,18 +1433,18 @@ int kern_addr_valid(unsigned long addr)
 		return 0;

 	p4d = p4d_offset(pgd, addr);
-	if (p4d_none(*p4d))
+	if (p4d_none(*p4d) || !p4d_present(*p4d))
 		return 0;

 	pud = pud_offset(p4d, addr);
-	if (pud_none(*pud))
+	if (pud_none(*pud) || !pud_present(*pud))
 		return 0;

 	if (pud_large(*pud))
 		return pfn_valid(pud_pfn(*pud));

 	pmd = pmd_offset(pud, addr);
-	if (pmd_none(*pmd))
+	if (pmd_none(*pmd) || !pmd_present(*pmd))
 		return 0;

 	if (pmd_large(*pmd))
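
To illustrate why the extra check matters, here is a minimal user-space
sketch, not the kernel code. The sketch_* names are hypothetical stand-ins:
"none" is modelled as an all-zero entry and "present" as bit 0 being set,
while the real pmd_none()/pmd_present() helpers look at more bits than that.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumption: bit 0 of an x86 page-table entry is the Present bit. */
#define SKETCH_PAGE_PRESENT 0x1ULL

/* Simplified "none": the entry is completely empty. */
static inline bool sketch_entry_none(uint64_t entry)
{
	return entry == 0;
}

/* Simplified "present": the hardware Present bit is set. */
static inline bool sketch_entry_present(uint64_t entry)
{
	return (entry & SKETCH_PAGE_PRESENT) != 0;
}

/* Old logic: bail out only for empty entries, otherwise keep walking. */
static inline bool sketch_old_check_descends(uint64_t entry)
{
	return !sketch_entry_none(entry);
}

/* Patched logic: also bail out for non-zero entries that are not present. */
static inline bool sketch_new_check_descends(uint64_t entry)
{
	return !(sketch_entry_none(entry) || !sketch_entry_present(entry));
}
```

For a non-zero entry with bit 0 clear, for example 0x800ffffffcbff062 (the
RDI value in Jiri's oops, which I assume is the offending entry),
sketch_old_check_descends() returns true and sketch_new_check_descends()
returns false. Both agree on an empty entry and on a present one.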

--
Sincerely yours,
Mike.

2021-08-17 08:04:53

by David Hildenbrand

Subject: Re: [BUG] general protection fault when reading /proc/kcore

On 17.08.21 09:56, Mike Rapoport wrote:
> On Mon, Aug 16, 2021 at 10:13:18PM +0300, Mike Rapoport wrote:
>> On Mon, Aug 16, 2021 at 08:38:43PM +0200, David Hildenbrand wrote:
>>> On 16.08.21 20:12, Jiri Olsa wrote:
>>>> On Mon, Aug 16, 2021 at 07:49:15PM +0200, David Hildenbrand wrote:
>>>>> On 16.08.21 19:34, Jiri Olsa wrote:
>>>>>> hi,
>>>>>> I'm getting fault below when running:
>>>>>>
>>>>>> # cat /proc/kallsyms | grep ksys_read
>>>>>> ffffffff8136d580 T ksys_read
>>>>>> # objdump -d --start-address=0xffffffff8136d580 --stop-address=0xffffffff8136d590 /proc/kcore
>>>>>>
>>>>>> /proc/kcore: file format elf64-x86-64
>>>>>>
>>>>>> Segmentation fault
>>>>>>
>>>>>> any idea? config is attached
>>>>>
>>>>> Just tried with a different config on 5.14.0-rc6+
>>>>>
>>>>> [root@localhost ~]# cat /proc/kallsyms | grep ksys_read
>>>>> ffffffff8927a800 T ksys_readahead
>>>>> ffffffff89333660 T ksys_read
>>>>>
>>>>> [root@localhost ~]# objdump -d --start-address=0xffffffff89333660
>>>>> --stop-address=0xffffffff89333670
>>>>>
>>>>> a.out: file format elf64-x86-64
>>>>>
>>>>>
>>>>>
>>>>> The kern_addr_valid(start) seems to fault in your case, which is weird,
>>>>> because it merely walks the page tables. But it seems to complain about a
>>>>> non-canonical address 0xf887ffcbff000
>>>>>
>>>>> Can you post your QEMU cmdline? Did you test this on other kernel versions?
>>>>
>>>> I'm using virt-manager so:
>>>>
>>>> /usr/bin/qemu-system-x86_64 -name guest=fedora33,debug-threads=on -S -object secret,id=masterKey0,format=raw,file=/var/lib/libvirt/qemu/domain-13-fedora33/master-key.aes -machine pc-q35-5.1,accel=kvm,usb=off,vmport=off,dump-guest-core=off,memory-backend=pc.ram -cpu Skylake-Server-IBRS,ss=on,vmx=on,pdcm=on,hypervisor=on,tsc-adjust=on,clflushopt=on,umip=on,pku=on,stibp=on,arch-capabilities=on,ssbd=on,xsaves=on,ibpb=on,amd-stibp=on,amd-ssbd=on,skip-l1dfl-vmentry=on,pschange-mc-no=on -m 8192 -object memory-backend-ram,id=pc.ram,size=8589934592 -overcommit mem-lock=off -smp 20,sockets=20,cores=1,threads=1 -uuid 2185d5a9-dbad-4d61-aa4e-97af9fd7ebca -no-user-config -nodefaults -chardev socket,id=charmonitor,fd=36,server,nowait -mon chardev=charmonitor,id=monitor,mode=control -rtc base=utc,driftfix=slew -global kvm-pit.lost_tick_policy=delay -no-hpet -no-shutdown -global ICH9-LPC.disable_s3=1 -global ICH9-LPC.disable_s4=1 -boot strict=on -kernel /home/jolsa/qemu/run/vmlinux -initrd /home/jolsa/qemu/run/initrd -append root=/dev/mapper/fedora_fedora-root ro rd.lvm.lv=fedora_fedora/root console=tty0 console=ttyS0,115200 -device pcie-root-port,port=0x10,chassis=1,id=pci.1,bus=pcie.0,multifunction=on,addr=0x2 -device pcie-root-port,port=0x11,chassis=2,id=pci.2,bus=pcie.0,addr=0x2.0x1 -device pcie-root-port,port=0x12,chassis=3,id=pci.3,bus=pcie.0,addr=0x2.0x2 -device pcie-root-port,port=0x13,chassis=4,id=pci.4,bus=pcie.0,addr=0x2.0x3 -device pcie-root-port,port=0x14,chassis=5,id=pci.5,bus=pcie.0,addr=0x2.0x4 -device pcie-root-port,port=0x15,chassis=6,id=pci.6,bus=pcie.0,addr=0x2.0x5 -device pcie-root-port,port=0x16,chassis=7,id=pci.7,bus=pcie.0,addr=0x2.0x6 -device qemu-xhci,p2=15,p3=15,id=usb,bus=pci.2,addr=0x0 -device virtio-serial-pci,id=virtio-serial0,bus=pci.3,addr=0x0 -blockdev {"driver":"file","filename":"/var/lib/libvirt/images/fedora33.qcow2","node-name":"libvirt-2-storage","auto-read-only":true,"discard":"unmap"} -blockdev 
{"node-name":"libvirt-2-format","read-only":false,"driver":"qcow2","file":"libvirt-2-storage","backing":null} -device virtio-blk-pci,bus=pci.4,addr=0x0,drive=libvirt-2-format,id=virtio-disk0,bootindex=1 -device ide-cd,bus=ide.0,id=sata0-0-0 -netdev tap,fd=38,id=hostnet0,vhost=on,vhostfd=39 -device virtio-net-pci,netdev=hostnet0,id=net0,mac=52:54:00:f3:c6:e7,bus=pci.1,addr=0x0 -chardev pty,id=charserial0 -device isa-serial,chardev=charserial0,id=serial0 -chardev socket,id=charchannel0,fd=40,server,nowait -device virtserialport,bus=virtio-serial0.0,nr=1,chardev=charchannel0,id=channel0,name=org.qemu.guest_agent.0 -chardev spicevmc,id=charchannel1,name=vdagent -device virtserialport,bus=virtio-serial0.0,nr=2,chardev=charchannel1,id=channel1,name=com.redhat.spice.0 -device usb-tablet,id=input0,bus=usb.0,port=1 -spice port=5900,addr=127.0.0.1,disable-ticketing,image-compression=off,seamless-migration=on -device qxl-vga,id=video0,ram_size=67108864,vram_size=67108864,vram64_size_mb=0,vgamem_mb=16,max_outputs=1,bus=pcie.0,addr=0x1 -device ich9-intel-hda,id=sound0,bus=pcie.0,addr=0x1b -device hda-duplex,id=sound0-codec0,bus=sound0.0,cad=0 -chardev spicevmc,id=charredir0,name=usbredir -device usb-redir,chardev=charredir0,id=redir0,bus=usb.0,port=2 -chardev spicevmc,id=charredir1,name=usbredir -device usb-redir,chardev=charredir1,id=redir1,bus=usb.0,port=3 -device virtio-balloon-pci,id=balloon0,bus=pci.5,addr=0x0 -object rng-random,id=objrng0,filename=/dev/urandom -device virtio-rng-pci,rng=objrng0,id=rng0,bus=pci.6,addr=0x0 -sandbox on,obsolete=deny,elevateprivileges=deny,spawn=deny,resourcecontrol=deny -msg timestamp=on
>>
>>>> so far I tested just bpf-next/master:
>>>> git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next.git
>>>>
>>>
>>> Just tried with upstream Linux (5.14.0-rc6) and your config without
>>> triggering it. I'm using "-cpu host", though, on an AMD Ryzen 9 3900X
>>
>> With Jiri's config and '-cpu <very long string>' it triggers for me on
>> v5.14-rc6.
>>
>> I'll also try to take a look tomorrow.
>
> There are some non-zero PMDs that are not present in the high kernel
> mappings. The patch below fixes for me the issue in kern_addr_valid()
> trying to access a not-present PMD. Jiri, can you check if it works for
> you?
>
> diff --git a/arch/x86/mm/init_64.c b/arch/x86/mm/init_64.c
> index ddeaba947eb3..07b56e90db5d 100644
> --- a/arch/x86/mm/init_64.c
> +++ b/arch/x86/mm/init_64.c
> @@ -1433,18 +1433,18 @@ int kern_addr_valid(unsigned long addr)
> return 0;
>
> p4d = p4d_offset(pgd, addr);
> - if (p4d_none(*p4d))
> + if (p4d_none(*p4d) || !p4d_present(*p4d))
> return 0;
>
> pud = pud_offset(p4d, addr);
> - if (pud_none(*pud))
> + if (pud_none(*pud) || !pud_present(*pud))
> return 0;
>
> if (pud_large(*pud))
> return pfn_valid(pud_pfn(*pud));
>
> pmd = pmd_offset(pud, addr);
> - if (pmd_none(*pmd))
> + if (pmd_none(*pmd) || !pmd_present(*pmd))
> return 0;
>
> if (pmd_large(*pmd))
>

However, wouldn't that mean that that TEXT segment isn't actually
accessible at all? Or is this some weird kind of TEXT protection (not
even being able to read it, weird, no?)

We don't support swapping and all that stuff for kernel memory. So what
does !present even indicate here? (smells like a different BUG, but I
might be wrong, of course)

--
Thanks,

David / dhildenb

2021-08-17 08:11:00

by Jiri Olsa

Subject: Re: [BUG] general protection fault when reading /proc/kcore

On Tue, Aug 17, 2021 at 10:56:02AM +0300, Mike Rapoport wrote:
> On Mon, Aug 16, 2021 at 10:13:18PM +0300, Mike Rapoport wrote:
> > On Mon, Aug 16, 2021 at 08:38:43PM +0200, David Hildenbrand wrote:
> > > On 16.08.21 20:12, Jiri Olsa wrote:
> > > > On Mon, Aug 16, 2021 at 07:49:15PM +0200, David Hildenbrand wrote:
> > > > > On 16.08.21 19:34, Jiri Olsa wrote:
> > > > > > hi,
> > > > > > I'm getting fault below when running:
> > > > > >
> > > > > > # cat /proc/kallsyms | grep ksys_read
> > > > > > ffffffff8136d580 T ksys_read
> > > > > > # objdump -d --start-address=0xffffffff8136d580 --stop-address=0xffffffff8136d590 /proc/kcore
> > > > > >
> > > > > > /proc/kcore: file format elf64-x86-64
> > > > > >
> > > > > > Segmentation fault
> > > > > >
> > > > > > any idea? config is attached
> > > > >
> > > > > Just tried with a different config on 5.14.0-rc6+
> > > > >
> > > > > [root@localhost ~]# cat /proc/kallsyms | grep ksys_read
> > > > > ffffffff8927a800 T ksys_readahead
> > > > > ffffffff89333660 T ksys_read
> > > > >
> > > > > [root@localhost ~]# objdump -d --start-address=0xffffffff89333660
> > > > > --stop-address=0xffffffff89333670
> > > > >
> > > > > a.out: file format elf64-x86-64
> > > > >
> > > > >
> > > > >
> > > > > The kern_addr_valid(start) seems to fault in your case, which is weird,
> > > > > because it merely walks the page tables. But it seems to complain about a
> > > > > non-canonical address 0xf887ffcbff000
> > > > >
> > > > > Can you post your QEMU cmdline? Did you test this on other kernel versions?
> > > >
> > > > I'm using virt-manager so:
> > > >
> > > > /usr/bin/qemu-system-x86_64 -name guest=fedora33,debug-threads=on -S -object secret,id=masterKey0,format=raw,file=/var/lib/libvirt/qemu/domain-13-fedora33/master-key.aes -machine pc-q35-5.1,accel=kvm,usb=off,vmport=off,dump-guest-core=off,memory-backend=pc.ram -cpu Skylake-Server-IBRS,ss=on,vmx=on,pdcm=on,hypervisor=on,tsc-adjust=on,clflushopt=on,umip=on,pku=on,stibp=on,arch-capabilities=on,ssbd=on,xsaves=on,ibpb=on,amd-stibp=on,amd-ssbd=on,skip-l1dfl-vmentry=on,pschange-mc-no=on -m 8192 -object memory-backend-ram,id=pc.ram,size=8589934592 -overcommit mem-lock=off -smp 20,sockets=20,cores=1,threads=1 -uuid 2185d5a9-dbad-4d61-aa4e-97af9fd7ebca -no-user-config -nodefaults -chardev socket,id=charmonitor,fd=36,server,nowait -mon chardev=charmonitor,id=monitor,mode=control -rtc base=utc,driftfix=slew -global kvm-pit.lost_tick_policy=delay -no-hpet -no-shutdown -global ICH9-LPC.disable_s3=1 -global ICH9-LPC.disable_s4=1 -boot strict=on -kernel /home/jolsa/qemu/run/vmlinux -initrd /home/jolsa/qemu/run/initrd -append root=/dev/mapper/fedora_fedora-root ro rd.lvm.lv=fedora_fedora/root console=tty0 console=ttyS0,115200 -device pcie-root-port,port=0x10,chassis=1,id=pci.1,bus=pcie.0,multifunction=on,addr=0x2 -device pcie-root-port,port=0x11,chassis=2,id=pci.2,bus=pcie.0,addr=0x2.0x1 -device pcie-root-port,port=0x12,chassis=3,id=pci.3,bus=pcie.0,addr=0x2.0x2 -device pcie-root-port,port=0x13,chassis=4,id=pci.4,bus=pcie.0,addr=0x2.0x3 -device pcie-root-port,port=0x14,chassis=5,id=pci.5,bus=pcie.0,addr=0x2.0x4 -device pcie-root-port,port=0x15,chassis=6,id=pci.6,bus=pcie.0,addr=0x2.0x5 -device pcie-root-port,port=0x16,chassis=7,id=pci.7,bus=pcie.0,addr=0x2.0x6 -device qemu-xhci,p2=15,p3=15,id=usb,bus=pci.2,addr=0x0 -device virtio-serial-pci,id=virtio-serial0,bus=pci.3,addr=0x0 -blockdev {"driver":"file","filename":"/var/lib/libvirt/images/fedora33.qcow2","node-name":"libvirt-2-storage","auto-read-only":true,"discard":"unmap"} -blockdev 
{"node-name":"libvirt-2-format","read-only":false,"driver":"qcow2","file":"libvirt-2-storage","backing":null} -device virtio-blk-pci,bus=pci.4,addr=0x0,drive=libvirt-2-format,id=virtio-disk0,bootindex=1 -device ide-cd,bus=ide.0,id=sata0-0-0 -netdev tap,fd=38,id=hostnet0,vhost=on,vhostfd=39 -device virtio-net-pci,netdev=hostnet0,id=net0,mac=52:54:00:f3:c6:e7,bus=pci.1,addr=0x0 -chardev pty,id=charserial0 -device isa-serial,chardev=charserial0,id=serial0 -chardev socket,id=charchannel0,fd=40,server,nowait -device virtserialport,bus=virtio-serial0.0,nr=1,chardev=charchannel0,id=channel0,name=org.qemu.guest_agent.0 -chardev spicevmc,id=charchannel1,name=vdagent -device virtserialport,bus=virtio-serial0.0,nr=2,chardev=charchannel1,id=channel1,name=com.redhat.spice.0 -device usb-tablet,id=input0,bus=usb.0,port=1 -spice port=5900,addr=127.0.0.1,disable-ticketing,image-compression=off,seamless-migration=on -device qxl-vga,id=video0,ram_size=67108864,vram_size=67108864,vram64_size_mb=0,vgamem_mb=16,max_outputs=1,bus=pcie.0,addr=0x1 -device ich9-intel-hda,id=sound0,bus=pcie.0,addr=0x1b -device hda-duplex,id=sound0-codec0,bus=sound0.0,cad=0 -chardev spicevmc,id=charredir0,name=usbredir -device usb-redir,chardev=charredir0,id=redir0,bus=usb.0,port=2 -chardev spicevmc,id=charredir1,name=usbredir -device usb-redir,chardev=charredir1,id=redir1,bus=usb.0,port=3 -device virtio-balloon-pci,id=balloon0,bus=pci.5,addr=0x0 -object rng-random,id=objrng0,filename=/dev/urandom -device virtio-rng-pci,rng=objrng0,id=rng0,bus=pci.6,addr=0x0 -sandbox on,obsolete=deny,elevateprivileges=deny,spawn=deny,resourcecontrol=deny -msg timestamp=on
> >
> > > > so far I tested just bpf-next/master:
> > > > git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next.git
> > > >
> > >
> > > Just tried with upstream Linux (5.14.0-rc6) and your config without
> > > triggering it. I'm using "-cpu host", though, on an AMD Ryzen 9 3900X
> >
> > With Jiri's config and '-cpu <very long string>' it triggers for me on
> > v5.14-rc6.
> >
> > I'll also try to take a look tomorrow.
>
> There are some non-zero PMDs that are not present in the high kernel
> mappings. The patch below fixes for me the issue in kern_addr_valid()
> trying to access a not-present PMD. Jiri, can you check if it works for
> you?

yep, seems to work for me.. console is quiet and I'm getting
the expected output

thanks,
jirka

>
> diff --git a/arch/x86/mm/init_64.c b/arch/x86/mm/init_64.c
> index ddeaba947eb3..07b56e90db5d 100644
> --- a/arch/x86/mm/init_64.c
> +++ b/arch/x86/mm/init_64.c
> @@ -1433,18 +1433,18 @@ int kern_addr_valid(unsigned long addr)
> return 0;
>
> p4d = p4d_offset(pgd, addr);
> - if (p4d_none(*p4d))
> + if (p4d_none(*p4d) || !p4d_present(*p4d))
> return 0;
>
> pud = pud_offset(p4d, addr);
> - if (pud_none(*pud))
> + if (pud_none(*pud) || !pud_present(*pud))
> return 0;
>
> if (pud_large(*pud))
> return pfn_valid(pud_pfn(*pud));
>
> pmd = pmd_offset(pud, addr);
> - if (pmd_none(*pmd))
> + if (pmd_none(*pmd) || !pmd_present(*pmd))
> return 0;
>
> if (pmd_large(*pmd))
>
> --
> Sincerely yours,
> Mike.
>

2021-08-17 08:22:30

by Mike Rapoport

Subject: Re: [BUG] general protection fault when reading /proc/kcore

On Tue, Aug 17, 2021 at 10:02:10AM +0200, David Hildenbrand wrote:
> On 17.08.21 09:56, Mike Rapoport wrote:
> > On Mon, Aug 16, 2021 at 10:13:18PM +0300, Mike Rapoport wrote:
> > > On Mon, Aug 16, 2021 at 08:38:43PM +0200, David Hildenbrand wrote:
> > > > On 16.08.21 20:12, Jiri Olsa wrote:
> > > > > On Mon, Aug 16, 2021 at 07:49:15PM +0200, David Hildenbrand wrote:
> > > > > > On 16.08.21 19:34, Jiri Olsa wrote:
> > > > > > > hi,
> > > > > > > I'm getting fault below when running:
> > > > > > >
> > > > > > > # cat /proc/kallsyms | grep ksys_read
> > > > > > > ffffffff8136d580 T ksys_read
> > > > > > > # objdump -d --start-address=0xffffffff8136d580 --stop-address=0xffffffff8136d590 /proc/kcore
> > > > > > >
> > > > > > > /proc/kcore: file format elf64-x86-64
> > > > > > >
> > > > > > > Segmentation fault
> > > > > > >
> > > > > > > any idea? config is attached
> > > > > >
> > > > > > Just tried with a different config on 5.14.0-rc6+
> > > > > >
> > > > > > [root@localhost ~]# cat /proc/kallsyms | grep ksys_read
> > > > > > ffffffff8927a800 T ksys_readahead
> > > > > > ffffffff89333660 T ksys_read
> > > > > >
> > > > > > [root@localhost ~]# objdump -d --start-address=0xffffffff89333660
> > > > > > --stop-address=0xffffffff89333670
> > > > > >
> > > > > > a.out: file format elf64-x86-64
> > > > > >
> > > > > >
> > > > > >
> > > > > > The kern_addr_valid(start) seems to fault in your case, which is weird,
> > > > > > because it merely walks the page tables. But it seems to complain about a
> > > > > > non-canonical address 0xf887ffcbff000
> > > > > >
> > > > > > Can you post your QEMU cmdline? Did you test this on other kernel versions?
> > > > >
> > > > > I'm using virt-manager so:
> > > > >
> > > > > /usr/bin/qemu-system-x86_64 -name guest=fedora33,debug-threads=on -S -object secret,id=masterKey0,format=raw,file=/var/lib/libvirt/qemu/domain-13-fedora33/master-key.aes -machine pc-q35-5.1,accel=kvm,usb=off,vmport=off,dump-guest-core=off,memory-backend=pc.ram -cpu Skylake-Server-IBRS,ss=on,vmx=on,pdcm=on,hypervisor=on,tsc-adjust=on,clflushopt=on,umip=on,pku=on,stibp=on,arch-capabilities=on,ssbd=on,xsaves=on,ibpb=on,amd-stibp=on,amd-ssbd=on,skip-l1dfl-vmentry=on,pschange-mc-no=on -m 8192 -object memory-backend-ram,id=pc.ram,size=8589934592 -overcommit mem-lock=off -smp 20,sockets=20,cores=1,threads=1 -uuid 2185d5a9-dbad-4d61-aa4e-97af9fd7ebca -no-user-config -nodefaults -chardev socket,id=charmonitor,fd=36,server,nowait -mon chardev=charmonitor,id=monitor,mode=control -rtc base=utc,driftfix=slew -global kvm-pit.lost_tick_policy=delay -no-hpet -no-shutdown -global ICH9-LPC.disable_s3=1 -global ICH9-LPC.disable_s4=1 -boot strict=on -kernel /home/jolsa/qemu/run/vmlinux -initrd /home/jolsa/qemu/run/initrd -append root=/dev/mapper/fedora_fedora-root ro rd.lvm.lv=fedora_fedora/root console=tty0 console=ttyS0,115200 -device pcie-root-port,port=0x10,chassis=1,id=pci.1,bus=pcie.0,multifunction=on,addr=0x2 -device pcie-root-port,port=0x11,chassis=2,id=pci.2,bus=pcie.0,addr=0x2.0x1 -device pcie-root-port,port=0x12,chassis=3,id=pci.3,bus=pcie.0,addr=0x2.0x2 -device pcie-root-port,port=0x13,chassis=4,id=pci.4,bus=pcie.0,addr=0x2.0x3 -device pcie-root-port,port=0x14,chassis=5,id=pci.5,bus=pcie.0,addr=0x2.0x4 -device pcie-root-port,port=0x15,chassis=6,id=pci.6,bus=pcie.0,addr=0x2.0x5 -device pcie-root-port,port=0x16,chassis=7,id=pci.7,bus=pcie.0,addr=0x2.0x6 -device qemu-xhci,p2=15,p3=15,id=usb,bus=pci.2,addr=0x0 -device virtio-serial-pci,id=virtio-serial0,bus=pci.3,addr=0x0 -blockdev {"driver":"file","filename":"/var/lib/libvirt/images/fedora33.qcow2","node-name":"libvirt-2-storage","auto-read-only":true,"discard":"unmap"} -blockdev 
{"node-name":"libvirt-2-format","read-only":false,"driver":"qcow2","file":"libvirt-2-storage","backing":null} -device virtio-blk-pci,bus=pci.4,addr=0x0,drive=libvirt-2-format,id=virtio-disk0,bootindex=1 -device ide-cd,bus=ide.0,id=sata0-0-0 -netdev tap,fd=38,id=hostnet0,vhost=on,vhostfd=39 -device virtio-net-pci,netdev=hostnet0,id=net0,mac=52:54:00:f3:c6:e7,bus=pci.1,addr=0x0 -chardev pty,id=charserial0 -device isa-serial,chardev=charserial0,id=serial0 -chardev socket,id=charchannel0,fd=40,server,nowait -device virtserialport,bus=virtio-serial0.0,nr=1,chardev=charchannel0,id=channel0,name=org.qemu.guest_agent.0 -chardev spicevmc,id=charchannel1,name=vdagent -device virtserialport,bus=virtio-serial0.0,nr=2,chardev=charchannel1,id=channel1,name=com.redhat.spice.0 -device usb-tablet,id=input0,bus=usb.0,port=1 -spice port=5900,addr=127.0.0.1,disable-ticketing,image-compression=off,seamless-migration=on -device qxl-vga,id=video0,ram_size=67108864,vram_size=67108864,vram64_size_mb=0,vgamem_mb=16,max_outputs=1,bus=pcie.0,addr=0x1 -device ich9-intel-hda,id=sound0,bus=pcie.0,addr=0x1b -device hda-duplex,id=sound0-codec0,bus=sound0.0,cad=0 -chardev spicevmc,id=charredir0,name=usbredir -device usb-redir,chardev=charredir0,id=redir0,bus=usb.0,port=2 -chardev spicevmc,id=charredir1,name=usbredir -device usb-redir,chardev=charredir1,id=redir1,bus=usb.0,port=3 -device virtio-balloon-pci,id=balloon0,bus=pci.5,addr=0x0 -object rng-random,id=objrng0,filename=/dev/urandom -device virtio-rng-pci,rng=objrng0,id=rng0,bus=pci.6,addr=0x0 -sandbox on,obsolete=deny,elevateprivileges=deny,spawn=deny,resourcecontrol=deny -msg timestamp=on
> > > > > so far I tested just bpf-next/master:
> > > > > git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next.git
> > > > >
> > > >
> > > > Just tried with upstream Linux (5.14.0-rc6) and your config without
> > > > triggering it. I'm using "-cpu host", though, on an AMD Ryzen 9 3900X
> > >
> > > With Jiri's config and '-cpu <very long string>' it triggers for me on
> > > v5.14-rc6.
> > >
> > > I'll also try to take a look tomorrow.
> >
> > There are some non-zero PMDs that are not present in the high kernel
> > mappings. The patch below fixes for me the issue in kern_addr_valid()
> > trying to access a not-present PMD. Jiri, can you check if it works for
> > you?
> >
> > diff --git a/arch/x86/mm/init_64.c b/arch/x86/mm/init_64.c
> > index ddeaba947eb3..07b56e90db5d 100644
> > --- a/arch/x86/mm/init_64.c
> > +++ b/arch/x86/mm/init_64.c
> > @@ -1433,18 +1433,18 @@ int kern_addr_valid(unsigned long addr)
> > return 0;
> > p4d = p4d_offset(pgd, addr);
> > - if (p4d_none(*p4d))
> > + if (p4d_none(*p4d) || !p4d_present(*p4d))
> > return 0;
> > pud = pud_offset(p4d, addr);
> > - if (pud_none(*pud))
> > + if (pud_none(*pud) || !pud_present(*pud))
> > return 0;
> > if (pud_large(*pud))
> > return pfn_valid(pud_pfn(*pud));
> > pmd = pmd_offset(pud, addr);
> > - if (pmd_none(*pmd))
> > + if (pmd_none(*pmd) || !pmd_present(*pmd))
> > return 0;
> > if (pmd_large(*pmd))
> >
>
> However, wouldn't that mean that that TEXT segment isn't actually accessible
> at all? Or is this some weird kind of TEXT protection (not even being able
> to read it, weird, no?)

It doesn't look like TEXT is inaccessible. There are unused parts in that
virtual range, but for some reason the PMDs there are not zero.

> We don't support swapping and all that stuff for kernel memory. So what does
> !present even indicate here? (smells like a different BUG, but I might be
> wrong, of course)

Don't know yet. For now I've only found the cause for kern_addr_valid() to
crash.
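
The faulting address can be reproduced on paper from the register dump in
the original report. This is only a sketch under assumptions: I take RDI to
be the raw PMD value and RAX to be the direct-map base, and I assume a
physical-address mask covering bits 12..51 and 48-bit canonical addresses.
The sketch_* names and the mask constant are made up for the illustration.

```c
#include <stdbool.h>
#include <stdint.h>

#define OOPS_RDI 0x800ffffffcbff062ULL /* assumed: raw PMD value from the oops */
#define OOPS_RAX 0xffff888000000000ULL /* assumed: direct-map base from the oops */

/* Assumption: physical address bits 12..51 of a page-table entry. */
#define SKETCH_PFN_MASK 0x000ffffffffff000ULL

/*
 * Virtual address at which the walker would look for the next-level table:
 * masked physical address plus the direct-map base, wrapping modulo 2^64.
 */
static inline uint64_t sketch_next_table_va(uint64_t entry, uint64_t base)
{
	return (entry & SKETCH_PFN_MASK) + base;
}

/* 48-bit canonical check: bits 63..47 must be all zeros or all ones. */
static inline bool sketch_is_canonical(uint64_t addr)
{
	uint64_t top = addr >> 47;

	return top == 0 || top == 0x1ffff;
}
```

sketch_next_table_va(OOPS_RDI, OOPS_RAX) gives 0x000f887ffcbff000, which is
the address in the "non-canonical address 0xf887ffcbff000" message, and the
masked value 0x000ffffffcbff000 matches RCX/RDX in the dump.
sketch_is_canonical() rejects that address, which fits the general
protection fault.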

--
Sincerely yours,
Mike.