LinuxLists.cc - Panic/lockup in z3fold_zpool

2022-09-22 08:05:17

Subject: Panic/lockup in z3fold_zpool_free

Hello.

Since the 5.19 series, zswap has been unstable for me under memory pressure, and
occasionally I get the following:

```
watchdog: BUG: soft lockup - CPU#0 stuck for 10195s! [mariadbd:478]
Modules linked in: netconsole joydev mousedev intel_agp psmouse pcspkr
intel_gtt cfg80211 cirrus i2c_piix4 tun rfkill mac_hid nft_ct tcp_bbr2
nft_chain_nat nf_tables nfnetlink nf_nat nf_conntrack nf_defrag_ipv6
nf_defrag_ipv4 fuse qemu_fw_cfg ip_tables x_tables xfs libcrc32c
crc32c_generic dm_crypt cbc encrypted_keys trusted asn1_encoder tee tpm
rng_core dm_mod crct10dif_pclmul crc32_pclmul crc32c_intel
ghash_clmulni_intel virtio_net aesni_intel serio_raw net_failover
ata_generic virtio_balloon failover pata_acpi crypto_simd virtio_blk
atkbd libps2 vivaldi_fmap virtio_pci cryptd virtio_pci_legacy_dev
ata_piix virtio_pci_modern_dev i8042 floppy serio usbhid
Unloaded tainted modules: intel_cstate():1 intel_uncore():1
pcc_cpufreq():1 acpi_cpufreq():1
CPU: 0 PID: 478 Comm: mariadbd Tainted: G L 5.19.0-pf5 #1
12baccda8e49539e158b9dd97cbda6c7317d73af
Hardware name: Red Hat KVM, BIOS 1.11.0-2.el7 04/01/2014
RIP: 0010:z3fold_zpool_free+0x4c/0x5e0
Code: 7c 24 08 48 89 04 24 0f 85 e0 00 00 00 48 89 f5 41 bd 00 00 00 80
48 83 e5 c0 48 83 c5 28 eb 0a 48 89 df e8 b6 8d 9f 00 f3 90 <48> 89 ef
e8 bc 8b 9f 00 4d 8b 34 24 49 81 e6 00 f0 ff ff 49 8d 5e
RSP: 0000:ffffbeadc0e87b68 EFLAGS: 00000202
RAX: 0000000000000030 RBX: ffff99ac73d2c010 RCX: ffff99ac4e4ba380
RDX: 0000665340000000 RSI: ffffe3b540000000 RDI: ffff99ac73d2c010
RBP: ffff99ac55ef3a68 R08: ffff99ac422f0bf0 R09: 000000000000c60b
R10: ffffffffffffffc0 R11: 0000000000000000 R12: ffff99ac55ef3a50
R13: 0000000080000000 R14: ffff99ac73d2c000 R15: ffff99acf3d2c000
FS: 00007f587fcd66c0(0000) GS:ffff99ac7ec00000(0000)
knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007f587ce8bec8 CR3: 0000000005b48006 CR4: 00000000000206f0
Call Trace:
<TASK>
zswap_free_entry+0xb5/0x110
zswap_frontswap_invalidate_page+0x72/0xa0
__frontswap_invalidate_page+0x3a/0x60
swap_range_free+0xb5/0xd0
swapcache_free_entries+0x16e/0x2e0
free_swap_slot+0xb4/0xc0
put_swap_page+0x259/0x420
delete_from_swap_cache+0x63/0xb0
try_to_free_swap+0x1b5/0x2a0
do_swap_page+0x24c/0xb80
__handle_mm_fault+0xa59/0xf70
handle_mm_fault+0x100/0x2f0
do_user_addr_fault+0x1c7/0x6a0
exc_page_fault+0x74/0x170
asm_exc_page_fault+0x26/0x30
RIP: 0033:0x556e96280428
Code: a0 03 00 00 67 e8 28 64 ff ff 48 8b 83 b0 00 00 00 48 8b 0d da 18
72 00 48 8b 10 66 48 0f 6e c1 48 85 d2 74 27 0f 1f 44 00 00 <48> c7 82
98 00 00 00 00 00 00 00 48 8b 10 48 83 c0 08 f2 0f 11 82
RSP: 002b:00007f587fcd3980 EFLAGS: 00010206
RAX: 00007f587d028468 RBX: 00007f587cb1a818 RCX: 3ff0000000000000
RDX: 00007f587ce8be30 RSI: 0000000000000000 RDI: 00007f587cedd030
RBP: 00007f587fcd39c0 R08: 0000000000000016 R09: 0000000000000000
R10: 0000000000000008 R11: 0000556e970961a0 R12: 00007f587d1f17b8
R13: 00007f5883595598 R14: 00007f587d1f17a8 R15: 00007f587cb1a928
</TASK>
```

This happens on the latest v5.19.10 kernel as well.

Sometimes it's not a soft lockup but a general protection fault (GPF),
although the stack trace is the same. So, to me it looks like memory
corruption, a use-after-free, a double free, or something like that.
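
For readers less familiar with the trace format: each frame is printed as `symbol+0xoffset/0xsize`, where the offset is how far into the function the instruction lies and the size is the function's length in bytes. A minimal Python sketch follows that pulls these fields out of the `RIP:` line of the report. The helper name `parse_frame` is hypothetical and not part of any kernel tool.

```python
import re

# Matches kernel stack-frame entries such as "z3fold_zpool_free+0x4c/0x5e0".
FRAME_RE = re.compile(
    r"(?P<symbol>[A-Za-z_][\w.]*)\+0x(?P<offset>[0-9a-f]+)/0x(?P<size>[0-9a-f]+)"
)


def parse_frame(text):
    """Return (symbol, offset, size) for the first frame in text, or None."""
    match = FRAME_RE.search(text)
    if match is None:
        return None
    return (
        match.group("symbol"),
        int(match.group("offset"), 16),
        int(match.group("size"), 16),
    )


# The RIP line from the report: the CPU was 0x4c bytes into a 0x5e0-byte function.
print(parse_frame("RIP: 0010:z3fold_zpool_free+0x4c/0x5e0"))
# -> ('z3fold_zpool_free', 76, 1504)
```

An offset this close to the start of the function suggests the CPU was stuck early in `z3fold_zpool_free`, but mapping it to a source line would need the matching kernel build and debug symbols.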

Have you got any idea regarding what's going on?

Thanks.

--
Oleksandr Natalenko (post-factum)

2022-09-22 12:35:46

by Brian Foster

Subject: Re: Panic/lockup in z3fold_zpool_free

On Thu, Sep 22, 2022 at 08:53:09AM +0200, Oleksandr Natalenko wrote:
> Hello.
>
> Since the 5.19 series, zswap has been unstable for me under memory pressure, and
> occasionally I get the following:
>
> [...]
>
> This happens on the latest v5.19.10 kernel as well.
>
> Sometimes it's not a soft lockup but a general protection fault (GPF),
> although the stack trace is the same. So, to me it looks like memory
> corruption, a use-after-free, a double free, or something like that.
>
> Have you got any idea regarding what's going on?
>

It might be unrelated, but this looks somewhat similar to a problem I
hit recently that is caused by swap entry data stored in page->private
being clobbered when splitting a huge page. That problem was introduced
in v5.19, so that potentially lines up as well.
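
The failure mode described above can be illustrated with a toy model in Python. This is a sketch under stated assumptions, not kernel code: the `Page` class, the `keep_swapcache_private` switch, and the assumption that the subpages of a swapped huge page carry consecutive swap entries in `private` are all illustrative, and the "preserve" branch is only one plausible way to avoid the clobbering in this model.

```python
from dataclasses import dataclass


@dataclass
class Page:
    private: int = 0         # stands in for page->private
    swapcache: bool = False  # stands in for the swap-cache page flag


def make_huge_page(first_entry, nr_pages):
    """Build a toy huge page whose subpages hold consecutive swap entries."""
    return [Page(private=first_entry + i, swapcache=True) for i in range(nr_pages)]


def split(pages, keep_swapcache_private):
    """Split the toy huge page, optionally preserving private for swap-cache pages."""
    head = pages[0]
    for tail in pages[1:]:
        if keep_swapcache_private and head.swapcache:
            continue      # leave the swap entry in place
        tail.private = 0  # unconditional clearing loses the swap entry
    return pages


def entries_to_free(pages):
    """Swap entries a later swap-cache teardown would read back from each page."""
    return [page.private for page in pages]


clobbered = entries_to_free(split(make_huge_page(100, 4), keep_swapcache_private=False))
preserved = entries_to_free(split(make_huge_page(100, 4), keep_swapcache_private=True))
print(clobbered)  # [100, 0, 0, 0]
print(preserved)  # [100, 101, 102, 103]
```

In the clobbered case, anything that later reads the swap entry back from a tail page gets 0 instead of the real entry, which is the kind of stale lookup that could plausibly end in a lockup or fault further down the free path.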

More details in the links below. [1] includes a VM_BUG_ON() splat with
DEBUG_VM enabled, but the problem originally manifested as a soft lockup
without the debug checks enabled. [2] includes a properly formatted
patch. Any chance you could give that a try?

Brian

[1] https://lore.kernel.org/linux-mm/YxDyZLfBdFHK1Y1P@bfoster/
[2] https://lore.kernel.org/linux-mm/[email protected]/

> Thanks.
>
> --
> Oleksandr Natalenko (post-factum)
>

2022-09-23 08:43:12

by Oleksandr Natalenko

Subject: Re: Panic/lockup in z3fold_zpool_free

Hello.

On Thursday, 22 September 2022 13:37:36 CEST Brian Foster wrote:
> It might be unrelated, but this looks somewhat similar to a problem I
> hit recently that is caused by swap entry data stored in page->private
> being clobbered when splitting a huge page. That problem was introduced
> in v5.19, so that potentially lines up as well.
>
> More details in the links below. [1] includes a VM_BUG_ON() splat with
> DEBUG_VM enabled, but the problem originally manifested as a soft lockup
> without the debug checks enabled. [2] includes a properly formatted
> patch. Any chance you could give that a try?

Thanks for your reply.

I'll give it a try. The only problem is that, for me, the issue is not reproducible at will: it can take 1 day or 2 weeks before the panic is hit.

> [1] https://lore.kernel.org/linux-mm/YxDyZLfBdFHK1Y1P@bfoster/
> [2] https://lore.kernel.org/linux-mm/[email protected]/

--
Oleksandr Natalenko (post-factum)

2022-10-06 16:07:17

by Oleksandr Natalenko

Subject: Re: Panic/lockup in z3fold_zpool_free

Hello.

On Friday, 23 September 2022 10:33:14 CEST Oleksandr Natalenko wrote:
> Thanks for your reply.
>
> I'll give it a try. The only problem is that, for me, the issue is not reproducible at will: it can take 1 day or 2 weeks before the panic is hit.
>
> > [1] https://lore.kernel.org/linux-mm/YxDyZLfBdFHK1Y1P@bfoster/
> > [2] https://lore.kernel.org/linux-mm/[email protected]/

So far, I haven't reproduced this issue with your patch. I haven't run the machine for long enough yet, just under a week, so this is mainly to let you know that I haven't abandoned testing.

Thanks.

--
Oleksandr Natalenko (post-factum)

2022-10-17 16:41:07

by Brian Foster

Subject: Re: Panic/lockup in z3fold_zpool_free

On Thu, Oct 06, 2022 at 05:52:52PM +0200, Oleksandr Natalenko wrote:
> So far, I haven't reproduced this issue with your patch. I haven't run the machine for long enough yet, just under a week, so this is mainly to let you know that I haven't abandoned testing.
>

Thanks for the update. Is this still going well, or has it reached the point
where you would typically see the problem? I can still reproduce the original
problem, so I may have to ping the patch again.

Brian

> Thanks.
>
>
> --
> Oleksandr Natalenko (post-factum)
>
>

2022-10-17 17:15:11

by Oleksandr Natalenko

Subject: Re: Panic/lockup in z3fold_zpool_free

Hello.

On Monday, 17 October 2022 18:13:00 CEST Brian Foster wrote:
> Thanks for the update. Is this still going well, or has it reached the point
> where you would typically see the problem? I can still reproduce the original
> problem, so I may have to ping the patch again.

So far, no issue observed with your patch.

Thanks.

--
Oleksandr Natalenko (post-factum)

2022-10-17 22:28:51

by Andrew Morton

Subject: Re: Panic/lockup in z3fold_zpool_free

On Mon, 17 Oct 2022 18:34:50 +0200 Oleksandr Natalenko <[email protected]> wrote:

> > > > I'll give it a try. The only problem is that, for me, the issue is not reproducible at will: it can take 1 day or 2 weeks before the panic is hit.
> > > >
> > > > > [1] https://lore.kernel.org/linux-mm/YxDyZLfBdFHK1Y1P@bfoster/
> > > > > [2] https://lore.kernel.org/linux-mm/[email protected]/
> > >
> > > So far, I haven't reproduced this issue with your patch. I haven't run the machine for long enough yet, just under a week, so this is mainly to let you know that I haven't abandoned testing.
> > >
> >
> > Thanks for the update. Is this still going well, or has it reached the point
> > where you would typically see the problem? I can still reproduce the original
> > problem, so I may have to ping the patch again.
>
> So far, no issue observed with your patch.

Thanks.

It's actually unclear (to me) why Matthew's b653db77350c73 ("mm: Clear
page->private when splitting or migrating a page") was considered
necessary. What problem did it solve?

https://lore.kernel.org/linux-mm/[email protected]/
is a partial undoing of that change, but should we simply revert
b653db77350c73?