2022-05-23 08:55:19

by Jackie Liu

Subject: [BUG report] security_inode_alloc return -ENOMEM let xfs shutdown

Hello maintainers and developers,

Syzkaller reported a filesystem shutdown to me. It is very easy to
trigger and also exists on the latest kernel version, 5.18-rc7.

dmesg shows:

[ 285.725893] FAULT_INJECTION: forcing a failure.
name failslab, interval 1, probability 0, space 0, times 0
[ 285.729625] CPU: 7 PID: 18034 Comm: syz-executor Not tainted
4.19.90-43+ #7
[ 285.731420] Source Version: b62cabdd86181d386998660ebf34ca653addd6c9
[ 285.733051] Hardware name: QEMU KVM Virtual Machine, BIOS 0.0.0
02/06/2015
[ 285.734796] Call trace:
[ 285.735614] dump_backtrace+0x0/0x3e0
[ 285.736609] show_stack+0x2c/0x38
[ 285.737525] dump_stack+0x164/0x1fc
[ 285.738489] should_fail+0x5c0/0x688
[ 285.739555] __should_failslab+0x118/0x180
[ 285.740725] should_failslab+0x2c/0x78
[ 285.741808] kmem_cache_alloc_trace+0x270/0x410
[ 285.743120] security_inode_alloc+0x100/0x1a8
[ 285.744356] inode_init_always+0x48c/0xa28
[ 285.745524] xfs_iget_cache_hit+0x9c0/0x2f28
[ 285.746739] xfs_iget+0x33c/0x9e0
[ 285.747708] xfs_ialloc+0x218/0x11c0
[ 285.748752] xfs_dir_ialloc+0xe8/0x480
[ 285.749832] xfs_create+0x5bc/0x1220
[ 285.750871] xfs_generic_create+0x42c/0x568
[ 285.752053] xfs_vn_mknod+0x48/0x58
[ 285.753067] xfs_vn_create+0x40/0x50
[ 285.754106] lookup_open+0x960/0x1580
[ 285.755176] do_last+0xd44/0x2180
[ 285.756149] path_openat+0x1a0/0x6d0
[ 285.757187] do_filp_open+0x14c/0x208
[ 285.758245] do_sys_open+0x340/0x470
[ 285.759289] __arm64_sys_openat+0x98/0xd8
[ 285.760438] el0_svc_common+0x230/0x3f0
[ 285.761541] el0_svc_handler+0x144/0x1a8
[ 285.762674] el0_svc+0x8/0x1b0
[ 285.763737] security_inode_alloc:796
[ 285.764733] inode_init_always:202
[ 285.765669] xfs_create:1213
[ 285.766485] XFS (dm-0): Internal error xfs_trans_cancel at line 1046
of file fs/xfs/xfs_trans.c. Caller xfs_create+0x700/0x1220
[ 285.769503] CPU: 7 PID: 18034 Comm: syz-executor Not tainted
4.19.90-43+ #7
[ 285.771275] Source Version: b62cabdd86181d386998660ebf34ca653addd6c9
[ 285.772892] Hardware name: QEMU KVM Virtual Machine, BIOS 0.0.0
02/06/2015
[ 285.774625] Call trace:
[ 285.775335] dump_backtrace+0x0/0x3e0
[ 285.776324] show_stack+0x2c/0x38
[ 285.777236] dump_stack+0x164/0x1fc
[ 285.778188] xfs_error_report+0xdc/0xe0
[ 285.779292] xfs_trans_cancel+0x490/0x878
[ 285.780439] xfs_create+0x700/0x1220
[ 285.781477] xfs_generic_create+0x42c/0x568
[ 285.782673] xfs_vn_mknod+0x48/0x58
[ 285.783687] xfs_vn_create+0x40/0x50
[ 285.784724] lookup_open+0x960/0x1580
[ 285.785782] do_last+0xd44/0x2180
[ 285.786760] path_openat+0x1a0/0x6d0
[ 285.787791] do_filp_open+0x14c/0x208
[ 285.788844] do_sys_open+0x340/0x470
[ 285.789880] __arm64_sys_openat+0x98/0xd8
[ 285.791039] el0_svc_common+0x230/0x3f0
[ 285.792139] el0_svc_handler+0x144/0x1a8
[ 285.793260] el0_svc+0x8/0x1b0
[ 285.794283] XFS (dm-0): xfs_do_force_shutdown(0x8) called from line
1047 of file fs/xfs/xfs_trans.c. Return address = 00000000a4a366b9
[ 285.816187] XFS (dm-0): Corruption of in-memory data detected.
Shutting down filesystem
[ 285.818476] XFS (dm-0): Please umount the filesystem and rectify the
problem(s)

I found that the xfs_inode allocation in xfs_inode_alloc() is not
allowed to fail (it uses KM_SLEEP), so why is inode_init_always()
allowed to report -ENOMEM?

Here, inode_init_always() fails inside security_inode_alloc().
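
For reference, inode_init_always() is shaped roughly like this (a
trimmed sketch, not the verbatim fs/inode.c source); security_inode_alloc()
looks like the only call in it that can fail, and the function returns
-ENOMEM when it does:

int inode_init_always(struct super_block *sb, struct inode *inode)
{
	/* ... plain field initialisation that cannot fail ... */

	if (security_inode_alloc(inode))
		goto out;

	/* ... more initialisation that cannot fail ... */
	return 0;
out:
	return -ENOMEM;
}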

I have tested the following patch:

diff --git a/fs/xfs/xfs_icache.c b/fs/xfs/xfs_icache.c
index ceee27b70384..609ad96e29e9 100644
--- a/fs/xfs/xfs_icache.c
+++ b/fs/xfs/xfs_icache.c
@@ -43,12 +43,14 @@ xfs_inode_alloc(
 	 * code up to do this anyway.
 	 */
 	ip = kmem_zone_alloc(xfs_inode_zone, KM_SLEEP);
-	if (!ip)
-		return NULL;
-	if (inode_init_always(mp->m_super, VFS_I(ip))) {
-		kmem_zone_free(xfs_inode_zone, ip);
+	if (!ip) {
+		pr_err("%s:%d\n", __func__, __LINE__);
 		return NULL;
 	}
+	while (inode_init_always(mp->m_super, VFS_I(ip)) != 0) {
+		pr_err("%s:%d\n", __func__, __LINE__);
+		pr_err("111\n");
+	}

 	/* VFS doesn't initialise i_mode! */
 	VFS_I(ip)->i_mode = 0;
@@ -280,7 +282,7 @@ xfs_reinit_inode(
 	struct xfs_mount	*mp,
 	struct inode		*inode)
 {
-	int			error;
+	int			error = 0;
 	uint32_t		nlink = inode->i_nlink;
 	uint32_t		generation = inode->i_generation;
 	uint64_t		version = inode_peek_iversion(inode);
@@ -289,7 +291,7 @@ xfs_reinit_inode(
 	kuid_t			uid = inode->i_uid;
 	kgid_t			gid = inode->i_gid;

-	error = inode_init_always(mp->m_super, inode);
+	while (inode_init_always(mp->m_super, inode) != 0);

 	set_nlink(inode, nlink);
 	inode->i_generation = generation;

With this patch applied, syzkaller works fine.

Can anyone help me? Any suggestions are welcome.

--
BR, Jackie Liu


2022-05-24 00:17:32

by Dave Chinner

Subject: Re: [BUG report] security_inode_alloc return -ENOMEM let xfs shutdown

On Mon, May 23, 2022 at 04:51:50PM +0800, Jackie Liu wrote:
> Hello maintainers and developers,
>
> Syzkaller reported a filesystem shutdown to me. It is very easy to
> trigger and also exists on the latest kernel version, 5.18-rc7.

Shutdown is a perfectly reasonable way to handle a failure that we
can't recover cleanly from.

> dmesg shows:
>
> [ 285.725893] FAULT_INJECTION: forcing a failure.
> name failslab, interval 1, probability 0, space 0, times 0
> [ 285.729625] CPU: 7 PID: 18034 Comm: syz-executor Not tainted 4.19.90-43+
> #7
> [ 285.731420] Source Version: b62cabdd86181d386998660ebf34ca653addd6c9
> [ 285.733051] Hardware name: QEMU KVM Virtual Machine, BIOS 0.0.0
> 02/06/2015
> [ 285.734796] Call trace:
> [ 285.735614] dump_backtrace+0x0/0x3e0
> [ 285.736609] show_stack+0x2c/0x38
> [ 285.737525] dump_stack+0x164/0x1fc
> [ 285.738489] should_fail+0x5c0/0x688
> [ 285.739555] __should_failslab+0x118/0x180
> [ 285.740725] should_failslab+0x2c/0x78
> [ 285.741808] kmem_cache_alloc_trace+0x270/0x410
> [ 285.743120] security_inode_alloc+0x100/0x1a8
> [ 285.744356] inode_init_always+0x48c/0xa28
> [ 285.745524] xfs_iget_cache_hit+0x9c0/0x2f28
> [ 285.746739] xfs_iget+0x33c/0x9e0
> [ 285.747708] xfs_ialloc+0x218/0x11c0
> [ 285.748752] xfs_dir_ialloc+0xe8/0x480
> [ 285.749832] xfs_create+0x5bc/0x1220
> [ 285.750871] xfs_generic_create+0x42c/0x568
> [ 285.752053] xfs_vn_mknod+0x48/0x58
> [ 285.753067] xfs_vn_create+0x40/0x50
> [ 285.754106] lookup_open+0x960/0x1580
> [ 285.755176] do_last+0xd44/0x2180
> [ 285.756149] path_openat+0x1a0/0x6d0
> [ 285.757187] do_filp_open+0x14c/0x208
> [ 285.758245] do_sys_open+0x340/0x470
> [ 285.759289] __arm64_sys_openat+0x98/0xd8
> [ 285.760438] el0_svc_common+0x230/0x3f0
> [ 285.761541] el0_svc_handler+0x144/0x1a8
> [ 285.762674] el0_svc+0x8/0x1b0
> [ 285.763737] security_inode_alloc:796
> [ 285.764733] inode_init_always:202
> [ 285.765669] xfs_create:1213
> [ 285.766485] XFS (dm-0): Internal error xfs_trans_cancel at line 1046 of
> file fs/xfs/xfs_trans.c. Caller xfs_create+0x700/0x1220
> [ 285.769503] CPU: 7 PID: 18034 Comm: syz-executor Not tainted 4.19.90-43+
> #7
> [ 285.771275] Source Version: b62cabdd86181d386998660ebf34ca653addd6c9
> [ 285.772892] Hardware name: QEMU KVM Virtual Machine, BIOS 0.0.0
> 02/06/2015
> [ 285.774625] Call trace:
> [ 285.775335] dump_backtrace+0x0/0x3e0
> [ 285.776324] show_stack+0x2c/0x38
> [ 285.777236] dump_stack+0x164/0x1fc
> [ 285.778188] xfs_error_report+0xdc/0xe0
> [ 285.779292] xfs_trans_cancel+0x490/0x878
> [ 285.780439] xfs_create+0x700/0x1220
> [ 285.781477] xfs_generic_create+0x42c/0x568
> [ 285.782673] xfs_vn_mknod+0x48/0x58
> [ 285.783687] xfs_vn_create+0x40/0x50
> [ 285.784724] lookup_open+0x960/0x1580
> [ 285.785782] do_last+0xd44/0x2180
> [ 285.786760] path_openat+0x1a0/0x6d0
> [ 285.787791] do_filp_open+0x14c/0x208
> [ 285.788844] do_sys_open+0x340/0x470
> [ 285.789880] __arm64_sys_openat+0x98/0xd8
> [ 285.791039] el0_svc_common+0x230/0x3f0
> [ 285.792139] el0_svc_handler+0x144/0x1a8
> [ 285.793260] el0_svc+0x8/0x1b0
> [ 285.794283] XFS (dm-0): xfs_do_force_shutdown(0x8) called from line 1047
> of file fs/xfs/xfs_trans.c. Return address = 00000000a4a366b9
> [ 285.816187] XFS (dm-0): Corruption of in-memory data detected. Shutting
> down filesystem
> [ 285.818476] XFS (dm-0): Please umount the filesystem and rectify the
> problem(s)

Yup, that's a shutdown with a dirty transaction because memory
allocation failed in the middle of a transaction. XFS cannot
tolerate memory allocation failure within the scope of a dirty
transaction and, in practice, this almost never happens. Indeed,
I've never seen this allocation from security_inode_alloc():

int lsm_inode_alloc(struct inode *inode)
{
	if (!lsm_inode_cache) {
		inode->i_security = NULL;
		return 0;
	}

>>>>>	inode->i_security = kmem_cache_zalloc(lsm_inode_cache, GFP_NOFS);
	if (inode->i_security == NULL)
		return -ENOMEM;
	return 0;
}

fail in all my OOM testing. Hence, to me, this is a theoretical
failure as I've never, ever seen this allocation fail in production
or test systems, even when driving them hard into OOM with excessive
inode allocation and triggering the OOM killer repeatedly until the
system kills init....

Hence I don't think there's anything we need to change here right
now. If users start hitting this, then we're going to have to add new
memalloc_nofail_save/restore() functionality to XFS transaction
contexts. But until then, I don't think we need to worry about
syzkaller intentionally hitting this shutdown.
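
To be clear, memalloc_nofail_save/restore() doesn't exist yet - the
sketch below is purely hypothetical, modeled on the
memalloc_nofs_save/restore() pattern in include/linux/sched/mm.h, and
the PF_MEMALLOC_NOFAIL flag is made up as well:

/* Hypothetical sketch: scoped "allocation must not fail" state. */
static inline unsigned int memalloc_nofail_save(void)
{
	unsigned int flags = current->flags & PF_MEMALLOC_NOFAIL;

	current->flags |= PF_MEMALLOC_NOFAIL;
	return flags;
}

static inline void memalloc_nofail_restore(unsigned int flags)
{
	current->flags = (current->flags & ~PF_MEMALLOC_NOFAIL) | flags;
}

XFS would set that state once a transaction is dirtied and clear it at
commit/cancel time, and the page allocator would then treat it like an
implicit __GFP_NOFAIL.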

Cheers,

Dave.
--
Dave Chinner
[email protected]

2022-05-24 03:00:54

by Dave Chinner

Subject: Re: [BUG report] security_inode_alloc return -ENOMEM let xfs shutdown

On Tue, May 24, 2022 at 08:52:30AM +0800, Jackie Liu wrote:
> On 2022/5/24 7:20 AM, Dave Chinner wrote:
> > On Mon, May 23, 2022 at 04:51:50PM +0800, Jackie Liu wrote:
> > Yup, that's a shutdown with a dirty transaction because memory
> > allocation failed in the middle of a transaction. XFS cannot
> > tolerate memory allocation failure within the scope of a dirty
> > transaction and, in practice, this almost never happens. Indeed,
> > I've never seen this allocation from security_inode_alloc():
> >
> > int lsm_inode_alloc(struct inode *inode)
> > {
> > 	if (!lsm_inode_cache) {
> > 		inode->i_security = NULL;
> > 		return 0;
> > 	}
> >
> > >>>>>	inode->i_security = kmem_cache_zalloc(lsm_inode_cache, GFP_NOFS);
> > 	if (inode->i_security == NULL)
> > 		return -ENOMEM;
> > 	return 0;
> > }
> >
> > fail in all my OOM testing. Hence, to me, this is a theoretical
> > failure as I've never, ever seen this allocation fail in production
> > or test systems, even when driving them hard into OOM with excessive
> > inode allocation and triggering the OOM killer repeatedly until the
> > system kills init....
> >
> > Hence I don't think there's anything we need to change here right
> > now. If users start hitting this, then we're going to have to add new
> > memalloc_nofail_save/restore() functionality to XFS transaction
> > contexts. But until then, I don't think we need to worry about
> > syzkaller intentionally hitting this shutdown.
>
> Thanks Dave.
>
> In actual testing, x86 and arm64 devices trigger this error quite
> easily when FAILSLAB is turned on. After our internal discussion, we
> would like to try a patch like the one below. Anyway, thank you for
> your reply.

What kernel is the patch against? It doesn't match a current TOT
kernel...

>
> diff --git a/fs/xfs/xfs_icache.c b/fs/xfs/xfs_icache.c
> index ceee27b70384..360304409c0c 100644
> --- a/fs/xfs/xfs_icache.c
> +++ b/fs/xfs/xfs_icache.c
> @@ -435,6 +435,7 @@ xfs_iget_cache_hit(
>  				wake_up_bit(&ip->i_flags, __XFS_INEW_BIT);
>  			ASSERT(ip->i_flags & XFS_IRECLAIMABLE);
>  			trace_xfs_iget_reclaim_fail(ip);
> +			error = -EAGAIN;
>  			goto out_error;
>  		}

Ok, I can see what you are suggesting here - it might work if we get
it right. :)

We don't actually want (or need) an unconditional retry. This will
turn persistent memory allocation failure into a CPU burning
livelock rather than -ENOMEM being returned. It might work for a
one-off memory failure, but it's not viable for long term failure as
tends to happen when the system goes deep into OOM territory.

It also ignores the fact that we can return ENOMEM without
consequences from this path if we are not in a transaction - any
pathwalk lookup can have ENOMEM safely returned to it, and that will
propagate the error to userspace. Same with bulkstat lookups, etc.
So we still want them to fail with ENOMEM, not retry indefinitely.

Likely what we want to do is add conditions to the xfs_iget() lookup
tail to detect ENOMEM when tp != NULL. In that case, we can then run
memalloc_retry_wait(GFP_NOFS) before retrying the lookup. That's in
line with what we do in other places that cannot tolerate allocation
failure (e.g. kmem_alloc(), xfs_buf_alloc_pages()) so it may make
sense to do the same thing here....
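
Roughly, the xfs_iget() error tail could grow a condition like this
(an untested sketch against a current kernel - today that tail only
retries on -EAGAIN):

out_error_or_again:
	if (!(flags & XFS_IGET_INCORE) &&
	    (error == -EAGAIN || (tp != NULL && error == -ENOMEM))) {
		if (error == -ENOMEM)
			memalloc_retry_wait(GFP_NOFS);
		else
			delay(1);
		goto again;
	}
	xfs_perag_put(pag);
	return error;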

Cheers,

Dave.
--
Dave Chinner
[email protected]

2022-05-24 03:49:47

by Jackie Liu

Subject: Re: [BUG report] security_inode_alloc return -ENOMEM let xfs shutdown



On 2022/5/24 9:28 AM, Dave Chinner wrote:
> On Tue, May 24, 2022 at 08:52:30AM +0800, Jackie Liu wrote:
>> On 2022/5/24 7:20 AM, Dave Chinner wrote:
>>> On Mon, May 23, 2022 at 04:51:50PM +0800, Jackie Liu wrote:
>>> Yup, that's a shutdown with a dirty transaction because memory
>>> allocation failed in the middle of a transaction. XFS cannot
>>> tolerate memory allocation failure within the scope of a dirty
>>> transaction and, in practice, this almost never happens. Indeed,
>>> I've never seen this allocation from security_inode_alloc():
>>>
>>> int lsm_inode_alloc(struct inode *inode)
>>> {
>>> 	if (!lsm_inode_cache) {
>>> 		inode->i_security = NULL;
>>> 		return 0;
>>> 	}
>>>
>>> >>>>>	inode->i_security = kmem_cache_zalloc(lsm_inode_cache, GFP_NOFS);
>>> 	if (inode->i_security == NULL)
>>> 		return -ENOMEM;
>>> 	return 0;
>>> }
>>>
>>> fail in all my OOM testing. Hence, to me, this is a theoretical
>>> failure as I've never, ever seen this allocation fail in production
>>> or test systems, even when driving them hard into OOM with excessive
>>> inode allocation and triggering the OOM killer repeatedly until the
>>> system kills init....
>>>
>>> Hence I don't think there's anything we need to change here right
>>> now. If users start hitting this, then we're going to have to add new
>>> memalloc_nofail_save/restore() functionality to XFS transaction
>>> contexts. But until then, I don't think we need to worry about
>>> syzkaller intentionally hitting this shutdown.
>>
>> Thanks Dave.
>>
>> In actual testing, x86 and arm64 devices trigger this error quite
>> easily when FAILSLAB is turned on. After our internal discussion, we
>> would like to try a patch like the one below. Anyway, thank you for
>> your reply.
>
> What kernel is the patch against? It doesn't match a current TOT
> kernel...

It's linux-4.19.y with the LSM security patches backported, but any
kernel that has the LSM framework can reproduce this problem.

>
>>
>> diff --git a/fs/xfs/xfs_icache.c b/fs/xfs/xfs_icache.c
>> index ceee27b70384..360304409c0c 100644
>> --- a/fs/xfs/xfs_icache.c
>> +++ b/fs/xfs/xfs_icache.c
>> @@ -435,6 +435,7 @@ xfs_iget_cache_hit(
>>  				wake_up_bit(&ip->i_flags, __XFS_INEW_BIT);
>>  			ASSERT(ip->i_flags & XFS_IRECLAIMABLE);
>>  			trace_xfs_iget_reclaim_fail(ip);
>> +			error = -EAGAIN;
>>  			goto out_error;
>>  		}
>
> Ok, I can see what you are suggesting here - it might work if we get
> it right. :)
>
> We don't actually want (or need) an unconditional retry. This will
> turn persistent memory allocation failure into a CPU burning
> livelock rather than -ENOMEM being returned. It might work for a
> one-off memory failure, but it's not viable for long term failure as
> tends to happen when the system goes deep into OOM territory.

In my opinion, if the alternative is shutting down the filesystem, it's
better to let it try again and again.

>
> It also ignores the fact that we can return ENOMEM without
> consequences from this path if we are not in a transaction - any
> pathwalk lookup can have ENOMEM safely returned to it, and that will
> propagate the error to userspace. Same with bulkstat lookups, etc.
> So we still want them to fail with ENOMEM, not retry indefinitely.
>
> Likely what we want to do is add conditions to the xfs_iget() lookup
> tail to detect ENOMEM when tp != NULL. In that case, we can then run
> memalloc_retry_wait(GFP_NOFS) before retrying the lookup. That's in
> line with what we do in other places that cannot tolerate allocation
> failure (e.g. kmem_alloc(), xfs_buf_alloc_pages()) so it may make
> sense to do the same thing here....

Do you have any patch suggestions? I have a test environment here to verify them.

--
BR, Jackie Liu

>
> Cheers,
>
> Dave.
>

2022-05-24 05:59:01

by Jackie Liu

Subject: Re: [BUG report] security_inode_alloc return -ENOMEM let xfs shutdown



On 2022/5/24 7:20 AM, Dave Chinner wrote:
> On Mon, May 23, 2022 at 04:51:50PM +0800, Jackie Liu wrote:
>> Hello maintainers and developers,
>>
>> Syzkaller reported a filesystem shutdown to me. It is very easy to
>> trigger and also exists on the latest kernel version, 5.18-rc7.
>
> Shutdown is a perfectly reasonable way to handle a failure that we
> can't recover cleanly from.
>
>> dmesg shows:
>>
>> [ 285.725893] FAULT_INJECTION: forcing a failure.
>> name failslab, interval 1, probability 0, space 0, times 0
>> [ 285.729625] CPU: 7 PID: 18034 Comm: syz-executor Not tainted 4.19.90-43+
>> #7
>> [ 285.731420] Source Version: b62cabdd86181d386998660ebf34ca653addd6c9
>> [ 285.733051] Hardware name: QEMU KVM Virtual Machine, BIOS 0.0.0
>> 02/06/2015
>> [ 285.734796] Call trace:
>> [ 285.735614] dump_backtrace+0x0/0x3e0
>> [ 285.736609] show_stack+0x2c/0x38
>> [ 285.737525] dump_stack+0x164/0x1fc
>> [ 285.738489] should_fail+0x5c0/0x688
>> [ 285.739555] __should_failslab+0x118/0x180
>> [ 285.740725] should_failslab+0x2c/0x78
>> [ 285.741808] kmem_cache_alloc_trace+0x270/0x410
>> [ 285.743120] security_inode_alloc+0x100/0x1a8
>> [ 285.744356] inode_init_always+0x48c/0xa28
>> [ 285.745524] xfs_iget_cache_hit+0x9c0/0x2f28
>> [ 285.746739] xfs_iget+0x33c/0x9e0
>> [ 285.747708] xfs_ialloc+0x218/0x11c0
>> [ 285.748752] xfs_dir_ialloc+0xe8/0x480
>> [ 285.749832] xfs_create+0x5bc/0x1220
>> [ 285.750871] xfs_generic_create+0x42c/0x568
>> [ 285.752053] xfs_vn_mknod+0x48/0x58
>> [ 285.753067] xfs_vn_create+0x40/0x50
>> [ 285.754106] lookup_open+0x960/0x1580
>> [ 285.755176] do_last+0xd44/0x2180
>> [ 285.756149] path_openat+0x1a0/0x6d0
>> [ 285.757187] do_filp_open+0x14c/0x208
>> [ 285.758245] do_sys_open+0x340/0x470
>> [ 285.759289] __arm64_sys_openat+0x98/0xd8
>> [ 285.760438] el0_svc_common+0x230/0x3f0
>> [ 285.761541] el0_svc_handler+0x144/0x1a8
>> [ 285.762674] el0_svc+0x8/0x1b0
>> [ 285.763737] security_inode_alloc:796
>> [ 285.764733] inode_init_always:202
>> [ 285.765669] xfs_create:1213
>> [ 285.766485] XFS (dm-0): Internal error xfs_trans_cancel at line 1046 of
>> file fs/xfs/xfs_trans.c. Caller xfs_create+0x700/0x1220
>> [ 285.769503] CPU: 7 PID: 18034 Comm: syz-executor Not tainted 4.19.90-43+
>> #7
>> [ 285.771275] Source Version: b62cabdd86181d386998660ebf34ca653addd6c9
>> [ 285.772892] Hardware name: QEMU KVM Virtual Machine, BIOS 0.0.0
>> 02/06/2015
>> [ 285.774625] Call trace:
>> [ 285.775335] dump_backtrace+0x0/0x3e0
>> [ 285.776324] show_stack+0x2c/0x38
>> [ 285.777236] dump_stack+0x164/0x1fc
>> [ 285.778188] xfs_error_report+0xdc/0xe0
>> [ 285.779292] xfs_trans_cancel+0x490/0x878
>> [ 285.780439] xfs_create+0x700/0x1220
>> [ 285.781477] xfs_generic_create+0x42c/0x568
>> [ 285.782673] xfs_vn_mknod+0x48/0x58
>> [ 285.783687] xfs_vn_create+0x40/0x50
>> [ 285.784724] lookup_open+0x960/0x1580
>> [ 285.785782] do_last+0xd44/0x2180
>> [ 285.786760] path_openat+0x1a0/0x6d0
>> [ 285.787791] do_filp_open+0x14c/0x208
>> [ 285.788844] do_sys_open+0x340/0x470
>> [ 285.789880] __arm64_sys_openat+0x98/0xd8
>> [ 285.791039] el0_svc_common+0x230/0x3f0
>> [ 285.792139] el0_svc_handler+0x144/0x1a8
>> [ 285.793260] el0_svc+0x8/0x1b0
>> [ 285.794283] XFS (dm-0): xfs_do_force_shutdown(0x8) called from line 1047
>> of file fs/xfs/xfs_trans.c. Return address = 00000000a4a366b9
>> [ 285.816187] XFS (dm-0): Corruption of in-memory data detected. Shutting
>> down filesystem
>> [ 285.818476] XFS (dm-0): Please umount the filesystem and rectify the
>> problem(s)
>
> Yup, that's a shutdown with a dirty transaction because memory
> allocation failed in the middle of a transaction. XFS cannot
> tolerate memory allocation failure within the scope of a dirty
> transaction and, in practice, this almost never happens. Indeed,
> I've never seen this allocation from security_inode_alloc():
>
> int lsm_inode_alloc(struct inode *inode)
> {
> 	if (!lsm_inode_cache) {
> 		inode->i_security = NULL;
> 		return 0;
> 	}
>
> >>>>>	inode->i_security = kmem_cache_zalloc(lsm_inode_cache, GFP_NOFS);
> 	if (inode->i_security == NULL)
> 		return -ENOMEM;
> 	return 0;
> }
>
> fail in all my OOM testing. Hence, to me, this is a theoretical
> failure as I've never, ever seen this allocation fail in production
> or test systems, even when driving them hard into OOM with excessive
> inode allocation and triggering the OOM killer repeatedly until the
> system kills init....
>
> Hence I don't think there's anything we need to change here right
> now. If users start hitting this, then we're going to have to add new
> memalloc_nofail_save/restore() functionality to XFS transaction
> contexts. But until then, I don't think we need to worry about
> syzkaller intentionally hitting this shutdown.

Thanks Dave.

In actual testing, x86 and arm64 devices trigger this error quite
easily when FAILSLAB is turned on. After our internal discussion, we
would like to try a patch like the one below. Anyway, thank you for
your reply.

diff --git a/fs/xfs/xfs_icache.c b/fs/xfs/xfs_icache.c
index ceee27b70384..360304409c0c 100644
--- a/fs/xfs/xfs_icache.c
+++ b/fs/xfs/xfs_icache.c
@@ -435,6 +435,7 @@ xfs_iget_cache_hit(
 				wake_up_bit(&ip->i_flags, __XFS_INEW_BIT);
 			ASSERT(ip->i_flags & XFS_IRECLAIMABLE);
 			trace_xfs_iget_reclaim_fail(ip);
+			error = -EAGAIN;
 			goto out_error;
 		}

@@ -503,7 +504,7 @@ xfs_iget_cache_miss(

 	ip = xfs_inode_alloc(mp, ino);
 	if (!ip)
-		return -ENOMEM;
+		return -EAGAIN;

 	error = xfs_iread(mp, tp, ip, flags);
 	if (error)


--
BR, Jackie Liu

>
> Cheers,
>
> Dave.
>