Hi folks,
I noticed the below splat after using my box for a while (i.e., it was
still pretty responsive). Kernel is linus/master from the middle of this
week + tip/master.
Lemme know if you need any other info.
Thx.
[38810.312439] ------------[ cut here ]------------
[38810.317027] kernel BUG at fs/ext4/inode.c:1721!
[38810.321532] invalid opcode: 0000 [#1] PREEMPT SMP
[38810.326207] CPU: 10 PID: 24284 Comm: ThreadPoolForeg Not tainted 5.15.0-rc4+ #1
[38810.326210] Hardware name: Micro-Star International Co., Ltd. MS-7B79/X470 GAMING PRO (MS-7B79), BIOS 1.70 01/23/2019
[38810.326211] RIP: 0010:ext4_da_get_block_prep+0x3fd/0x440
[38810.326218] Code: ff ff f0 80 0b 20 0f 1f 80 00 00 00 00 e9 ed fe ff ff 66 66 2e 0f 1f 84 00 00 00 00 00 e8 5b 95 fe ff 41 89 c4 e9 04 fe ff ff <0f> 0b 0f 0b 48 8b 7d 28 89 04 24 45 89 e1 4c 8b 45 38 48 c7 c1 10
[38810.367814] RSP: 0018:ffffc90002fdfc18 EFLAGS: 00010206
[38810.367818] RAX: 0000000000000001 RBX: ffff8881773226e8 RCX: 0000000000000004
[38810.367820] RDX: 27ffffffffffffff RSI: 0000000000000001 RDI: ffff8881dced03b8
[38810.367821] RBP: ffff8881dced0118 R08: 0000000000000001 R09: ffff8881773226e8
[38810.367823] R10: 0000000000001000 R11: ffff8887feeae3e0 R12: 0000000000000003
[38810.367824] R13: ffffffffffff0000 R14: 0000000000000000 R15: 0000000000000003
[38810.367825] FS: 00007f135f7fe700(0000) GS:ffff8887fee80000(0000) knlGS:0000000000000000
[38810.416367] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[38810.416370] CR2: 00007f9450fd8658 CR3: 0000000103ec5000 CR4: 00000000003506e0
[38810.416371] Call Trace:
[38810.416373] <TASK>
[38810.416374] ? alloc_buffer_head+0x1b/0x80
[38810.416380] __block_write_begin_int+0x171/0x650
[38810.416384] ? ext4_da_release_space+0x120/0x120
[38810.446889] ext4_da_write_begin+0x107/0x280
[38810.446893] ? generic_write_end+0xf6/0x150
[38810.446896] generic_perform_write+0xaf/0x1c0
[38810.446899] ext4_buffered_write_iter+0xbb/0x180
[38810.464182] new_sync_write+0x10b/0x190
[38810.464187] vfs_write+0x216/0x2d0
[38810.464190] ? __fget_files+0x6b/0xa0
[38810.464193] __x64_sys_pwrite64+0x87/0xb0
[38810.478984] do_syscall_64+0x3b/0x80
[38810.478987] entry_SYSCALL_64_after_hwframe+0x44/0xae
[38810.478991] RIP: 0033:0x7f13a836c9c7
[38810.478994] Code: 08 89 3c 24 48 89 4c 24 18 e8 55 f3 ff ff 4c 8b 54 24 18 48 8b 54 24 10 41 89 c0 48 8b 74 24 08 8b 3c 24 b8 12 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 31 44 89 c7 48 89 04 24 e8 85 f3 ff ff 48 8b
[38810.478997] RSP: 002b:00007f135f7fc840 EFLAGS: 00000293 ORIG_RAX: 0000000000000012
[38810.509680] RAX: ffffffffffffffda RBX: 00007f135f7fc890 RCX: 00007f13a836c9c7
[38810.509681] RDX: 000000000000051a RSI: 00007f13908549d0 RDI: 00000000000000eb
[38810.509683] RBP: 00007f135f7fc930 R08: 0000000000000000 R09: 00007ffc27dbc090
[38810.509684] R10: 0000000000002bff R11: 0000000000000293 R12: 0000000000002bff
[38810.509685] R13: 00007f1344003a60 R14: 00007f13908549d0 R15: 000000000000051a
[38810.509688] </TASK>
[38810.509688] Modules linked in: fuse essiv authenc nft_counter nf_tables libcrc32c nfnetlink vfat fat loop dm_crypt dm_mod amd64_edac edac_mce_amd kvm_amd snd_hda_codec_realtek snd_hda_codec_generic kvm led_class ledtrig_audio snd_hda_codec_hdmi snd_hda_intel snd_intel_dspcfg snd_hda_codec snd_hda_core snd_pcm irqbypass crct10dif_pclmul snd_timer crc32_pclmul crc32c_intel snd ghash_clmulni_intel pcspkr k10temp soundcore gpio_amdpt gpio_generic acpi_cpufreq radeon aesni_intel crypto_simd cryptd pinctrl_amd
[38810.598925] ---[ end trace c014409ea194e650 ]---
--
Regards/Gruss,
Boris.
https://people.kernel.org/tglx/notes-about-netiquette
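For anyone triaging a splat like the one above, the key facts can be pulled out mechanically. The following is a minimal sketch, not a kernel tool: the helper name `summarize_splat` is hypothetical, and the input is three lines copied from the report. It extracts the BUG location, the faulting symbol and offset, and the bytes at the trapping instruction pointer (the token in `<..>` on the `Code:` line). On x86, `0f 0b` is the UD2 instruction that BUG() compiles to, which fits the "invalid opcode" line.

```python
import re

# Three lines copied from the splat above.
SPLAT = """\
[38810.317027] kernel BUG at fs/ext4/inode.c:1721!
[38810.326211] RIP: 0010:ext4_da_get_block_prep+0x3fd/0x440
[38810.326218] Code: ff ff f0 80 0b 20 0f 1f 80 00 00 00 00 e9 ed fe ff ff 66 66 2e 0f 1f 84 00 00 00 00 00 e8 5b 95 fe ff 41 89 c4 e9 04 fe ff ff <0f> 0b 0f 0b 48 8b 7d 28 89 04 24 45 89 e1 4c 8b 45 38 48 c7 c1 10
"""


def summarize_splat(text):
    """Return BUG location, faulting symbol, and the two bytes at the trap.

    Hypothetical helper for illustration; it assumes the x86-64 oops layout
    seen in this report.
    """
    bug = re.search(r"kernel BUG at (\S+):(\d+)!", text)
    rip = re.search(r"RIP: 0010:(\w+)\+(0x[0-9a-f]+)/(0x[0-9a-f]+)", text)
    code = re.search(r"Code: (.*)", text)
    tokens = code.group(1).split()
    # The byte wrapped in <..> is where the instruction pointer was.
    idx = next(i for i, t in enumerate(tokens) if t.startswith("<"))
    trap = [tokens[idx].strip("<>"), tokens[idx + 1]]
    return {
        "file": bug.group(1),
        "line": int(bug.group(2)),
        "symbol": rip.group(1),
        "offset": int(rip.group(2), 16),
        "size": int(rip.group(3), 16),
        "trap_bytes": trap,
        # 0f 0b encodes UD2 on x86.
        "is_ud2": trap == ["0f", "0b"],
    }


summary = summarize_splat(SPLAT)
print(summary)
```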
Hi, Boris - thanks very much for your report.
Was your kernel configured with the CONFIG_FS_ENCRYPTION option?
Could you please provide the output of the mount command for the affected
file system?
Do you recall what sort of code might have been running on this system at
the time of failure (for example, kernel build, desktop apps, etc.)?
Thanks,
Eric
* Borislav Petkov <[email protected]>:
> Hi Eric,
>
> On Fri, Oct 08, 2021 at 01:33:05PM -0400, Eric Whitney wrote:
> > Hi, Boris - thanks very much for your report.
>
> sure, np.
>
> > Was your kernel configured with the CONFIG_FS_ENCRYPTION option?
>
> $ grep CONFIG_FS_ENCRYPTION /boot/config-5.15.0-rc4+
> # CONFIG_FS_ENCRYPTION is not set
>
> > Could you please provide the output of the mount command for the affected
> > file system?
>
> Well, I can't figure out from dmesg - it's all I have from that run -
> which fs it was. So lemme give you all ext4 ones:
>
> $ mount | grep ext4
> /dev/nvme0n1p2 on / type ext4 (rw,relatime,errors=remount-ro)
> /dev/sdc1 on /home type ext4 (rw,noatime)
> /dev/sda1 on /mnt/oldhome type ext4 (rw,noatime)
> /dev/sdb1 on /mnt/smr type ext4 (rw,noatime)
> /dev/nvme1n1p1 on /mnt/kernel type ext4 (rw,nosuid,nodev,noatime,user)
>
> > Do you recall what sort of code might have been running on this system at
> > the time of failure (for example, kernel build, desktop apps, etc.)?
>
> Good question. I'm not sure. Kernel build is likely as I do those on
> that workstation constantly.
>
> Unfortunately, I don't have an exact reproducer. And I can't debug stuff
> on that box since it is my workstation and I've reverted it to 5.14.
>
> What I can do is, I can slap 5.15-rc4 or whichever version you'd want me
> to, on a test box and try running kernel builds or some other load to
> see whether it would fire. I have a similar box to my workstation.
>
> Or if you have a better idea...
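The two checks quoted above (the config grep and the ext4 mount listing) are easy to script when collecting the same information from several machines. This is a rough sketch only: `config_state` and `ext4_mounts` are hypothetical helper names, the inputs are the strings from the quoted reply, and mount points containing spaces are not handled.

```python
import re


def config_state(config_text, option):
    """Return the option's value ('y', 'm', ...) or None if it is not set."""
    for line in config_text.splitlines():
        line = line.strip()
        if line == f"# {option} is not set":
            return None
        if line.startswith(option + "="):
            return line.split("=", 1)[1]
    return None  # option not mentioned at all


def ext4_mounts(mount_output):
    """Parse `mount` output into (device, mountpoint, options) for ext4."""
    pattern = re.compile(r"^(\S+) on (\S+) type ext4 \(([^)]*)\)$")
    result = []
    for line in mount_output.splitlines():
        m = pattern.match(line.strip())
        if m:
            result.append((m.group(1), m.group(2), m.group(3).split(",")))
    return result


# Inputs copied from the quoted reply.
CONFIG = "# CONFIG_FS_ENCRYPTION is not set\n"
MOUNTS = """\
/dev/nvme0n1p2 on / type ext4 (rw,relatime,errors=remount-ro)
/dev/sdc1 on /home type ext4 (rw,noatime)
/dev/sda1 on /mnt/oldhome type ext4 (rw,noatime)
/dev/sdb1 on /mnt/smr type ext4 (rw,noatime)
/dev/nvme1n1p1 on /mnt/kernel type ext4 (rw,nosuid,nodev,noatime,user)
"""

encryption = config_state(CONFIG, "CONFIG_FS_ENCRYPTION")
mounts = ext4_mounts(MOUNTS)
print(encryption, [m[1] for m in mounts])
```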
Hi, Boris:
I've tried numerous kernel builds with -rc4 and rerun the full set of xfstests
we use when regressing ext4 each rc using a kernel that doesn't enable
FS_ENCRYPTION (I normally run with that) without luck. The code that caused
the splat you saw is new and would run when an assertion is violated,
suggesting that there may be an unsuspected bug elsewhere in ext4.
Do you recall having seen any evidence of ENOMEM or ENOSPC conditions prior
to the failure?
If you're willing to share, please send along your kernel config file and I'll
try working with that as well.
In the meantime, should this bug get in your way, just revert the following
patch and you should be able to run without further trouble:
948ca5f30e1d "ext4: enforce buffer head state assertion in ext4_da_map_blocks"
I'll likely be posting a patch to revert this shortly, since it's going to
take some time to sort out what's going on without a reproducer.
Thanks again for your help,
Eric
On Mon, Oct 11, 2021 at 07:11:24PM -0400, Eric Whitney wrote:
> I've tried numerous kernel builds with -rc4 and rerun the full set of xfstests
> we use when regressing ext4 each rc using a kernel that doesn't enable
> FS_ENCRYPTION (I normally run with that) without luck. The code that caused
> the splat you saw is new and would run when an assertion is violated,
> suggesting that there may be an unsuspected bug elsewhere in ext4.
Hmm.
> Do you recall having seen any evidence of ENOMEM or ENOSPC conditions prior
> to the failure?
I don't see anything of the sort in the dmesg I've saved.
> If you're willing to share, please send along your kernel config file and I'll
> try working with that as well.
Sure, I'll send you the config I used and the dmesg I caught privately -
you might see something I've missed. Stuff like this, for example:
[ 10.254952] Adding 15721468k swap on /dev/nvme0n1p1. Priority:-2 extents:1 across:15721468k SS
[ 10.275365] EXT4-fs (nvme0n1p2): re-mounted. Opts: errors=remount-ro. Quota mode: disabled.
[ 10.417820] device-mapper: ioctl: 4.45.0-ioctl (2021-03-22) initialised: [email protected]
[ 10.595392] loop: module loaded
[ 10.661742] EXT4-fs (sdc1): mounted filesystem with ordered data mode. Opts: (null). Quota mode: disabled.
[ 10.758774] EXT4-fs (sda1): mounted filesystem with ordered data mode. Opts: (null). Quota mode: disabled.
[ 10.930331] r8169 0000:18:00.0 eth0: Link is Up - 100Mbps/Full - flow control rx/tx
[ 10.939298] IPv6: ADDRCONF(NETDEV_CHANGE): eth0: link becomes ready
[ 13.306747] EXT4-fs (sdb1): recovery complete
[ 13.325960] EXT4-fs (sdb1): mounted filesystem with ordered data mode. Opts: (null). Quota mode: disabled.
[ 13.353624] EXT4-fs (nvme1n1p1): mounted filesystem with ordered data mode. Opts: (null). Quota mode: disabled.
[ 191.896690] loop0: detected capacity change from 0 to 2048
[ 191.941350] EXT4-fs (dm-0): mounting ext2 file system using the ext4 subsystem
[ 191.948773] EXT4-fs (dm-0): mounted filesystem without journal. Opts: (null). Quota mode: disabled.
[ 282.932355] fuse: init (API version 7.34)
[ 3159.620840] loop1: detected capacity change from 0 to 4194304
[ 3160.125963] EXT4-fs (dm-1): mounting ext3 file system using the ext4 subsystem
[ 3160.203143] EXT4-fs (dm-1): mounted filesystem with ordered data mode. Opts: (null). Quota mode: disabled.
Dunno if using ext4 to mount ext2 and ext3 filesystems would be
relevant.
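To see at a glance which devices went through the ext2/ext3 compatibility path, the EXT4-fs lines can be grouped per device. This is only a sketch over a few lines copied from the dmesg excerpt above; `ext4_devices` is a hypothetical helper, and treating "ordered data mode" as "has a journal" is an assumption made for the illustration.

```python
import re

# A subset of the EXT4-fs lines from the dmesg excerpt above.
DMESG = """\
[ 10.275365] EXT4-fs (nvme0n1p2): re-mounted. Opts: errors=remount-ro. Quota mode: disabled.
[ 10.661742] EXT4-fs (sdc1): mounted filesystem with ordered data mode. Opts: (null). Quota mode: disabled.
[ 191.941350] EXT4-fs (dm-0): mounting ext2 file system using the ext4 subsystem
[ 191.948773] EXT4-fs (dm-0): mounted filesystem without journal. Opts: (null). Quota mode: disabled.
[ 3160.125963] EXT4-fs (dm-1): mounting ext3 file system using the ext4 subsystem
[ 3160.203143] EXT4-fs (dm-1): mounted filesystem with ordered data mode. Opts: (null). Quota mode: disabled.
"""


def ext4_devices(dmesg_text):
    """Map device name -> {'legacy': 'ext2'/'ext3'/None, 'journal': bool/None}."""
    info = {}
    for line in dmesg_text.splitlines():
        m = re.search(r"EXT4-fs \(([^)]+)\): (.*)", line)
        if not m:
            continue
        dev, msg = m.groups()
        entry = info.setdefault(dev, {"legacy": None, "journal": None})
        legacy = re.match(
            r"mounting (ext[23]) file system using the ext4 subsystem", msg
        )
        if legacy:
            entry["legacy"] = legacy.group(1)
        elif msg.startswith("mounted filesystem without journal"):
            entry["journal"] = False
        elif msg.startswith("mounted filesystem with ordered data mode"):
            # Assumption: a data-mode message implies a journal is in use.
            entry["journal"] = True
    return info


devices = ext4_devices(DMESG)
for dev, entry in devices.items():
    print(dev, entry)
```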
> In the meantime, should this bug get in your way, just revert the following
> patch and you should be able to run without further trouble:
>
> 948ca5f30e1d "ext4: enforce buffer head state assertion in ext4_da_map_blocks"
>
> I'll likely be posting a patch to revert this shortly, since it's going to
> take some time to sort out what's going on without a reproducer.
Gotcha.
> Thanks again for your help,
Thanks too for taking a look.
--
Regards/Gruss,
Boris.
https://people.kernel.org/tglx/notes-about-netiquette
On Mon 11-10-21 19:11:24, Eric Whitney wrote:
> In the meantime, should this bug get in your way, just revert the following
> patch and you should be able to run without further trouble:
>
> 948ca5f30e1d "ext4: enforce buffer head state assertion in ext4_da_map_blocks"
>
> I'll likely be posting a patch to revert this shortly, since it's going to
> take some time to sort out what's going on without a reproducer.
Looking at this, I can see that the assertion is
	BUG_ON(bh->b_blocknr != invalid_block);
and I suspect it is some kind of race between ext4_da_map_blocks() and
the writeback code? Writeback code holds only i_data_sem and page locks, but
ext4_da_map_blocks() holds only the page lock at that point. So the page lock
on that particular page is the only thing that protects us from getting
outright out-of-date info from the extent status tree. And I'm not sure all
extent status tree manipulations are careful enough to also be protected by
the page locks of all pages that are inside the given extent...
Honza
--
Jan Kara <[email protected]>
SUSE Labs, CR
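As a small aid to reading the assertion discussed above, here is a toy model of the check in Python. It rests on an assumption not stated in this thread: that `invalid_block` is initialised as `~((sector_t) 0xffff)` in ext4_da_map_blocks(), as recalled from the kernel source, with a 64-bit `sector_t`. The function names are hypothetical. Under that assumption the sentinel is 0xffffffffffff0000, which is also the value shown in R13 in the register dump at the top of the thread; whether R13 actually held `invalid_block` at the trap is a guess.

```python
MASK64 = (1 << 64) - 1  # assumption: sector_t is 64 bits wide


def invalid_block_sentinel():
    """Model of `~((sector_t) 0xffff)`; an assumption about the kernel source."""
    return ~0xFFFF & MASK64


def assertion_holds(b_blocknr):
    """Model of the condition behind BUG_ON(bh->b_blocknr != invalid_block)."""
    return b_blocknr == invalid_block_sentinel()


sentinel = invalid_block_sentinel()
print(hex(sentinel))

# A buffer still carrying the sentinel passes the check; a buffer that has
# been given a real block number (for example by a racing path) would not.
print(assertion_holds(sentinel), assertion_holds(12345))
```

The model says nothing about which path changed `b_blocknr`; it only shows what state the assertion expects a delayed-allocation buffer to be in.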