2020-09-09 00:29:45

by Gong, Sishuai

Subject: PROBLEM: another potential concurrency bug in swap_inode_boot_loader()

Hi,

We found a potential concurrency bug in Linux kernel 5.3.11. We were able to reproduce it on x86 under specific thread interleavings. The bug causes a “bad header/extent” EXT4-fs error.

In addition, we think this bug may be related to another bug we reported earlier. Similar to a concern mentioned in your reply, this time the inode had a correct checksum but wrong header data.

https://lore.kernel.org/linux-ext4/[email protected]/T/#t


------------------------------------------
Kernel console output

EXT4-fs error (device sda1): ext4_ext_check_inode:498: inode #5: comm ski-executor: pblk 0 bad header/extent: invalid magic - magic 0, entries 0, max 0(0), depth 0(0)

------------------------------------------
Test input

This bug occurs when a kernel test program is executed in two different threads that run concurrently. Our analysis shows that it happens when the ioctl syscall with the EXT4_IOC_SWAP_BOOT request is called twice and interleaves with itself.
The test program is generated by Syzkaller as follows:
r0 = creat(&(0x7f0000000080)='./file0\x00', 0x0)
ioctl$FS_IOC_SETFLAGS(r0, 0x40046602, &(0x7f0000000040))
r1 = creat(&(0x7f0000000000)='./file0\x00', 0x0)
pwrite64(r1, &(0x7f00000000c0)='\x00', 0x1, 0x1010000)
r2 = creat(&(0x7f0000000000)='./file0\x00', 0x0)
ioctl$EXT4_IOC_SWAP_BOOT(r2, 0x6611)
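
For readers unfamiliar with the raw request numbers in the program above, the following is a minimal sketch of how they decode, assuming the generic Linux ioctl bit layout used on x86 (nr in bits 0-7, type in bits 8-15, size in bits 16-29, direction in bits 30-31). The helper names (ioc_encode, ioc_dir, and so on) are hypothetical and only for illustration; they are not kernel APIs. Under that layout, 0x6611 is type 'f', number 17, with no argument, which is how EXT4_IOC_SWAP_BOOT is defined, and 0x40046602 is a write-direction request of type 'f', number 2, with a 4-byte argument, which appears to correspond to the 32-bit form of the SETFLAGS request.

```c
#include <stdint.h>

/* Direction values as used by the generic ioctl encoding on x86.
 * The names here are local to this sketch. */
#define IOC_NONE  0u
#define IOC_WRITE 1u
#define IOC_READ  2u

/* Build a request number: dir in bits 30-31, size in bits 16-29,
 * type in bits 8-15, nr in bits 0-7. */
static uint32_t ioc_encode(uint32_t dir, uint32_t type, uint32_t nr, uint32_t size)
{
    return (dir << 30) | (size << 16) | (type << 8) | nr;
}

/* Extract the individual fields from a request number. */
static uint32_t ioc_dir(uint32_t req)  { return req >> 30; }
static uint32_t ioc_size(uint32_t req) { return (req >> 16) & 0x3fffu; }
static uint32_t ioc_type(uint32_t req) { return (req >> 8) & 0xffu; }
static uint32_t ioc_nr(uint32_t req)   { return req & 0xffu; }
```

For example, ioc_encode(IOC_NONE, 'f', 17, 0) yields 0x6611, and ioc_size(0x40046602) yields 4.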

------------------------------------------
Thread interleaving

Our analysis revealed that the following interleaving triggers this bug.

CPU0                                    CPU1
swap_inode_boot_loader()
…
-- ext4_mark_inode_dirty() [fs/ext4/ioctl.c:207]
[context switch]
                                        swap_inode_boot_loader()
                                        -- ext4_iget()
                                        ---- ext4_isize()
                                        [context switch]
…
-- ext4_mark_inode_dirty() [fs/ext4/ioctl.c:223]
---- ext4_mark_iloc_dirty()
------ ext4_do_update_inode()
       for (block = 0; block < EXT4_N_BLOCKS; block++) [fs/ext4/inode.c:5337]
               raw_inode->i_block[block] = ei->i_data[block];
…
[syscall finishes]
[context switch]
                                        …
                                        for (block = 0; block < EXT4_N_BLOCKS; block++) [fs/ext4/inode.c:5002]
                                                ei->i_data[block] = raw_inode->i_block[block];
                                        …
                                        ---- ext4_ext_check_inode(inode)
                                        [EXT4-fs error]


Thanks,
Sishuai


2020-09-09 02:46:20

by Darrick J. Wong

Subject: Re: PROBLEM: another potential concurrency bug in swap_inode_boot_loader()

On Wed, Sep 09, 2020 at 12:28:36AM +0000, Gong, Sishuai wrote:
> Hi,
>
> We found a potential concurrency bug in Linux kernel 5.3.11. We were able to reproduce it on x86 under specific thread interleavings. The bug causes a “bad header/extent” EXT4-fs error.
>
> In addition, we think this bug may be related to another bug we reported earlier. Similar to a concern mentioned in your reply, this time the inode had a correct checksum but wrong header data.
>
> https://lore.kernel.org/linux-ext4/[email protected]/T/#t
>
>
> ------------------------------------------
> Kernel console output
>
> EXT4-fs error (device sda1): ext4_ext_check_inode:498: inode #5: comm ski-executor: pblk 0 bad header/extent: invalid magic - magic 0, entries 0, max 0(0), depth 0(0)
>
> ------------------------------------------
> Test input
>
> This bug occurs when a kernel test program is executed in two different threads that run concurrently. Our analysis shows that it happens when the ioctl syscall with the EXT4_IOC_SWAP_BOOT request is called twice and interleaves with itself.
> The test program is generated by Syzkaller as follows:
> r0 = creat(&(0x7f0000000080)='./file0\x00', 0x0)
> ioctl$FS_IOC_SETFLAGS(r0, 0x40046602, &(0x7f0000000040))
> r1 = creat(&(0x7f0000000000)='./file0\x00', 0x0)
> pwrite64(r1, &(0x7f00000000c0)='\x00', 0x1, 0x1010000)
> r2 = creat(&(0x7f0000000000)='./file0\x00', 0x0)
> ioctl$EXT4_IOC_SWAP_BOOT(r2, 0x6611)
>
> ------------------------------------------
> Thread interleaving
>
> Our analysis revealed that the following interleaving triggers this bug.
>
> CPU0                                    CPU1
> swap_inode_boot_loader()
> …
> -- ext4_mark_inode_dirty() [fs/ext4/ioctl.c:207]
> [context switch]
>                                         swap_inode_boot_loader()
>                                         -- ext4_iget()
>                                         ---- ext4_isize()
>                                         [context switch]


How do you end up in this state? CPU0 has already ext4_iget()'d a
reference to the bootloader inode, right? Which means that I_NEW is no
longer set on the incore inode, right, because we clear that flag when
we unlock the inode.i_lock at the end of the iget function. So
shouldn't CPU1's call to ext4_iget to get the same bootloader inode end
up with the same incore inode? And won't I_NEW be clear by then?

--D
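
The reasoning above can be illustrated with a small single-threaded toy model. This is only a sketch under stated assumptions, not kernel code: struct toy_inode, TOY_I_NEW, toy_iget_locked() and toy_iget() are hypothetical stand-ins for the incore inode, the I_NEW bit, iget_locked() and ext4_iget(). It models one cached inode and assumes that a lookup which finds the inode already in the cache with I_NEW clear returns it without copying i_block from the raw inode again. The model does not cover concurrency; as far as I understand, in the kernel a second caller that finds an inode still marked I_NEW waits until it is unlocked.

```c
#include <stdbool.h>
#include <string.h>

#define N_BLOCKS  15     /* same value as EXT4_N_BLOCKS */
#define TOY_I_NEW 0x1u   /* hypothetical stand-in for the kernel's I_NEW bit */

/* Hypothetical incore inode, reduced to what the argument needs. */
struct toy_inode {
    bool cached;                 /* is the inode in the (one-entry) cache? */
    unsigned state;              /* holds TOY_I_NEW while being set up */
    unsigned i_data[N_BLOCKS];   /* incore copy of the raw i_block array */
    int raw_reads;               /* how often the raw inode was copied in */
};

static struct toy_inode boot_inode;   /* single-entry inode cache */

/* Models the lookup step: a freshly allocated inode comes back with
 * TOY_I_NEW set, an already cached one comes back unchanged. */
static struct toy_inode *toy_iget_locked(void)
{
    if (!boot_inode.cached) {
        boot_inode.cached = true;
        boot_inode.state |= TOY_I_NEW;
    }
    return &boot_inode;
}

/* Models the iget path: only a new inode is filled from the raw inode. */
static struct toy_inode *toy_iget(const unsigned raw_i_block[N_BLOCKS])
{
    struct toy_inode *inode = toy_iget_locked();

    if (!(inode->state & TOY_I_NEW))
        return inode;                 /* cached: raw inode is not re-read */

    memcpy(inode->i_data, raw_i_block, sizeof(inode->i_data));
    inode->raw_reads++;
    inode->state &= ~TOY_I_NEW;       /* setup done, clear the flag */
    return inode;
}
```

In this model, a second toy_iget() call returns the same pointer as the first, leaves raw_reads at 1, and keeps the incore i_data unchanged even if the raw array was modified in between.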

> …
> -- ext4_mark_inode_dirty() [fs/ext4/ioctl.c:223]
> ---- ext4_mark_iloc_dirty()
> ------ ext4_do_update_inode()
>        for (block = 0; block < EXT4_N_BLOCKS; block++) [fs/ext4/inode.c:5337]
>                raw_inode->i_block[block] = ei->i_data[block];
> …
> [syscall finishes]
> [context switch]
>                                         …
>                                         for (block = 0; block < EXT4_N_BLOCKS; block++) [fs/ext4/inode.c:5002]
>                                                 ei->i_data[block] = raw_inode->i_block[block];
>                                         …
>                                         ---- ext4_ext_check_inode(inode)
>                                         [EXT4-fs error]
>
>
> Thanks,
> Sishuai
>