LinuxLists.cc - [PATCH AUTOSEL 6.4 1/5] iomap: Fix possible overflow condition in iomap_write_delalloc

2023-09-07 17:17:21

Subject: [PATCH AUTOSEL 6.4 1/5] iomap: Fix possible overflow condition in iomap_write_delalloc_scan

From: "Ritesh Harjani (IBM)" <[email protected]>

[ Upstream commit eee2d2e6ea5550118170dbd5bb1316ceb38455fb ]

folio_next_index() returns an unsigned long value which left shifted
by PAGE_SHIFT could possibly cause an overflow on 32-bit system. Instead
use folio_pos(folio) + folio_size(folio), which does this correctly.

Suggested-by: Matthew Wilcox <[email protected]>
Signed-off-by: Ritesh Harjani (IBM) <[email protected]>
Reviewed-by: Darrick J. Wong <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
---
fs/iomap/buffered-io.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/fs/iomap/buffered-io.c b/fs/iomap/buffered-io.c
index 063133ec77f49..5e5bffa384976 100644
--- a/fs/iomap/buffered-io.c
+++ b/fs/iomap/buffered-io.c
@@ -929,7 +929,7 @@ static int iomap_write_delalloc_scan(struct inode *inode,
* the end of this data range, not the end of the folio.
*/
*punch_start_byte = min_t(loff_t, end_byte,
- folio_next_index(folio) << PAGE_SHIFT);
+ folio_pos(folio) + folio_size(folio));
}

/* move offset to start of next folio in range */
--
2.40.1

2023-09-07 18:17:06

by Sasha Levin

[permalink] [raw]

Subject: [PATCH AUTOSEL 6.4 5/5] locks: fix KASAN: use-after-free in trace_event_raw_event_filelock_lock

From: Will Shiu <[email protected]>

[ Upstream commit 74f6f5912693ce454384eaeec48705646a21c74f ]

As following backtrace, the struct file_lock request , in posix_lock_inode
is free before ftrace function using.
Replace the ftrace function ahead free flow could fix the use-after-free
issue.

[name:report&]===============================================
BUG:KASAN: use-after-free in trace_event_raw_event_filelock_lock+0x80/0x12c
[name:report&]Read at addr f6ffff8025622620 by task NativeThread/16753
[name:report_hw_tags&]Pointer tag: [f6], memory tag: [fe]
[name:report&]
BT:
Hardware name: MT6897 (DT)
Call trace:
dump_backtrace+0xf8/0x148
show_stack+0x18/0x24
dump_stack_lvl+0x60/0x7c
print_report+0x2c8/0xa08
kasan_report+0xb0/0x120
__do_kernel_fault+0xc8/0x248
do_bad_area+0x30/0xdc
do_tag_check_fault+0x1c/0x30
do_mem_abort+0x58/0xbc
el1_abort+0x3c/0x5c
el1h_64_sync_handler+0x54/0x90
el1h_64_sync+0x68/0x6c
trace_event_raw_event_filelock_lock+0x80/0x12c
posix_lock_inode+0xd0c/0xd60
do_lock_file_wait+0xb8/0x190
fcntl_setlk+0x2d8/0x440
...
[name:report&]
[name:report&]Allocated by task 16752:
...
slab_post_alloc_hook+0x74/0x340
kmem_cache_alloc+0x1b0/0x2f0
posix_lock_inode+0xb0/0xd60
...
[name:report&]
[name:report&]Freed by task 16752:
...
kmem_cache_free+0x274/0x5b0
locks_dispose_list+0x3c/0x148
posix_lock_inode+0xc40/0xd60
do_lock_file_wait+0xb8/0x190
fcntl_setlk+0x2d8/0x440
do_fcntl+0x150/0xc18
...

Signed-off-by: Will Shiu <[email protected]>
Signed-off-by: Jeff Layton <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
---
fs/locks.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/fs/locks.c b/fs/locks.c
index df8b26a425248..a552bdb6badc0 100644
--- a/fs/locks.c
+++ b/fs/locks.c
@@ -1301,6 +1301,7 @@ static int posix_lock_inode(struct inode *inode, struct file_lock *request,
out:
spin_unlock(&ctx->flc_lock);
percpu_up_read(&file_rwsem);
+ trace_posix_lock_inode(inode, request, error);
/*
* Free any unused locks.
*/
@@ -1309,7 +1310,6 @@ static int posix_lock_inode(struct inode *inode, struct file_lock *request,
if (new_fl2)
locks_free_lock(new_fl2);
locks_dispose_list(&dispose);
- trace_posix_lock_inode(inode, request, error);

return error;
}
--
2.40.1

2023-09-07 18:18:28

by Sasha Levin

[permalink] [raw]

Subject: [PATCH AUTOSEL 6.4 2/5] autofs: fix memory leak of waitqueues in autofs_catatonic_mode

From: Fedor Pchelkin <[email protected]>

[ Upstream commit ccbe77f7e45dfb4420f7f531b650c00c6e9c7507 ]

Syzkaller reports a memory leak:

BUG: memory leak
unreferenced object 0xffff88810b279e00 (size 96):
comm "syz-executor399", pid 3631, jiffies 4294964921 (age 23.870s)
hex dump (first 32 bytes):
00 00 00 00 00 00 00 00 08 9e 27 0b 81 88 ff ff ..........'.....
08 9e 27 0b 81 88 ff ff 00 00 00 00 00 00 00 00 ..'.............
backtrace:
[<ffffffff814cfc90>] kmalloc_trace+0x20/0x90 mm/slab_common.c:1046
[<ffffffff81bb75ca>] kmalloc include/linux/slab.h:576 [inline]
[<ffffffff81bb75ca>] autofs_wait+0x3fa/0x9a0 fs/autofs/waitq.c:378
[<ffffffff81bb88a7>] autofs_do_expire_multi+0xa7/0x3e0 fs/autofs/expire.c:593
[<ffffffff81bb8c33>] autofs_expire_multi+0x53/0x80 fs/autofs/expire.c:619
[<ffffffff81bb6972>] autofs_root_ioctl_unlocked+0x322/0x3b0 fs/autofs/root.c:897
[<ffffffff81bb6a95>] autofs_root_ioctl+0x25/0x30 fs/autofs/root.c:910
[<ffffffff81602a9c>] vfs_ioctl fs/ioctl.c:51 [inline]
[<ffffffff81602a9c>] __do_sys_ioctl fs/ioctl.c:870 [inline]
[<ffffffff81602a9c>] __se_sys_ioctl fs/ioctl.c:856 [inline]
[<ffffffff81602a9c>] __x64_sys_ioctl+0xfc/0x140 fs/ioctl.c:856
[<ffffffff84608225>] do_syscall_x64 arch/x86/entry/common.c:50 [inline]
[<ffffffff84608225>] do_syscall_64+0x35/0xb0 arch/x86/entry/common.c:80
[<ffffffff84800087>] entry_SYSCALL_64_after_hwframe+0x63/0xcd

autofs_wait_queue structs should be freed if their wait_ctr becomes zero.
Otherwise they will be lost.

In this case an AUTOFS_IOC_EXPIRE_MULTI ioctl is done, then a new
waitqueue struct is allocated in autofs_wait(), its initial wait_ctr
equals 2. After that wait_event_killable() is interrupted (it returns
-ERESTARTSYS), so that 'wq->name.name == NULL' condition may be not
satisfied. Actually, this condition can be satisfied when
autofs_wait_release() or autofs_catatonic_mode() is called and, what is
also important, wait_ctr is decremented in those places. Upon the exit of
autofs_wait(), wait_ctr is decremented to 1. Then the unmounting process
begins: kill_sb calls autofs_catatonic_mode(), which should have freed the
waitqueues, but it only decrements its usage counter to zero which is not
a correct behaviour.

edit:imk
This description is of course not correct. The umount performed as a result
of an expire is a umount of a mount that has been automounted, it's not the
autofs mount itself. They happen independently, usually after everything
mounted within the autofs file system has been expired away. If everything
hasn't been expired away the automount daemon can still exit leaving mounts
in place. But expires done in both cases will result in a notification that
calls autofs_wait_release() with a result status. The problem case is the
summary execution of of the automount daemon. In this case any waiting
processes won't be woken up until either they are terminated or the mount
is umounted.
end edit: imk

So in catatonic mode we should free waitqueues which counter becomes zero.

edit: imk
Initially I was concerned that the calling of autofs_wait_release() and
autofs_catatonic_mode() was not mutually exclusive but that can't be the
case (obviously) because the queue entry (or entries) is removed from the
list when either of these two functions are called. Consequently the wait
entry will be freed by only one of these functions or by the woken process
in autofs_wait() depending on the order of the calls.
end edit: imk

Reported-by: [email protected]
Suggested-by: Takeshi Misawa <[email protected]>
Signed-off-by: Fedor Pchelkin <[email protected]>
Signed-off-by: Alexey Khoroshilov <[email protected]>
Signed-off-by: Ian Kent <[email protected]>
Cc: Matthew Wilcox <[email protected]>
Cc: Andrei Vagin <[email protected]>
Cc: [email protected]
Cc: [email protected]
Message-Id: <[email protected]>
Signed-off-by: Christian Brauner <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
---
fs/autofs/waitq.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/fs/autofs/waitq.c b/fs/autofs/waitq.c
index 54c1f8b8b0757..efdc76732faed 100644
--- a/fs/autofs/waitq.c
+++ b/fs/autofs/waitq.c
@@ -32,8 +32,9 @@ void autofs_catatonic_mode(struct autofs_sb_info *sbi)
wq->status = -ENOENT; /* Magic is gone - report failure */
kfree(wq->name.name - wq->offset);
wq->name.name = NULL;
- wq->wait_ctr--;
wake_up_interruptible(&wq->queue);
+ if (!--wq->wait_ctr)
+ kfree(wq);
wq = nwq;
}
fput(sbi->pipe); /* Close the pipe */
--
2.40.1

2023-09-07 18:27:15

by Sasha Levin

[permalink] [raw]

Subject: [PATCH AUTOSEL 6.4 3/5] btrfs: return real error when orphan cleanup fails due to a transaction abort

From: Filipe Manana <[email protected]>

[ Upstream commit a7f8de500e28bb227e02a7bd35988cf37b816c86 ]

During mount we will call btrfs_orphan_cleanup() to remove any inodes that
were previously deleted (have a link count of 0) but for which we were not
able before to remove their items from the subvolume tree. The removal of
the items will happen by triggering eviction, when we do the final iput()
on them at btrfs_orphan_cleanup(), which will end in the loop at
btrfs_evict_inode() that truncates inode items.

In a dire situation we may have a transaction abort due to -ENOSPC when
attempting to truncate the inode items, and in that case the orphan item
(key type BTRFS_ORPHAN_ITEM_KEY) will remain in the subvolume tree and
when we hit the next iteration of the while loop at btrfs_orphan_cleanup()
we will find the same orphan item as before, and then we will return
-EINVAL from btrfs_orphan_cleanup() through the following if statement:

if (found_key.offset == last_objectid) {
btrfs_err(fs_info,
"Error removing orphan entry, stopping orphan cleanup");
ret = -EINVAL;
goto out;
}

This makes the mount operation fail with -EINVAL, when it should have been
-ENOSPC. This is confusing because -EINVAL might lead a user into thinking
it provided invalid mount options for example.

An example where this happens:

$ mount test.img /mnt
mount: /mnt: wrong fs type, bad option, bad superblock on /dev/loop0, missing codepage or helper program, or other error.

$ dmesg
[ 2542.356934] BTRFS: device fsid 977fff75-1181-4d2b-a739-384fa710d16e devid 1 transid 47409973 /dev/loop0 scanned by mount (4459)
[ 2542.357451] BTRFS info (device loop0): using crc32c (crc32c-intel) checksum algorithm
[ 2542.357461] BTRFS info (device loop0): disk space caching is enabled
[ 2542.742287] BTRFS info (device loop0): auto enabling async discard
[ 2542.764554] BTRFS info (device loop0): checking UUID tree
[ 2551.743065] ------------[ cut here ]------------
[ 2551.743068] BTRFS: Transaction aborted (error -28)
[ 2551.743149] WARNING: CPU: 7 PID: 215 at fs/btrfs/block-group.c:3494 btrfs_write_dirty_block_groups+0x397/0x3d0 [btrfs]
[ 2551.743311] Modules linked in: btrfs blake2b_generic (...)
[ 2551.743353] CPU: 7 PID: 215 Comm: kworker/u24:5 Not tainted 6.4.0-rc6-btrfs-next-134+ #1
[ 2551.743356] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.2-0-gea1b7a073390-prebuilt.qemu.org 04/01/2014
[ 2551.743357] Workqueue: events_unbound btrfs_async_reclaim_metadata_space [btrfs]
[ 2551.743405] RIP: 0010:btrfs_write_dirty_block_groups+0x397/0x3d0 [btrfs]
[ 2551.743449] Code: 8b 43 0c (...)
[ 2551.743451] RSP: 0018:ffff982c005a7c40 EFLAGS: 00010286
[ 2551.743452] RAX: 0000000000000000 RBX: ffff88fc6e44b400 RCX: 0000000000000000
[ 2551.743453] RDX: 0000000000000002 RSI: ffffffff8dff0878 RDI: 00000000ffffffff
[ 2551.743454] RBP: ffff88fc51817208 R08: 0000000000000000 R09: ffff982c005a7ae0
[ 2551.743455] R10: 0000000000000001 R11: 0000000000000001 R12: ffff88fc43d2e570
[ 2551.743456] R13: ffff88fc43d2e400 R14: ffff88fc8fb08ee0 R15: ffff88fc6e44b530
[ 2551.743457] FS: 0000000000000000(0000) GS:ffff89035fbc0000(0000) knlGS:0000000000000000
[ 2551.743458] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 2551.743459] CR2: 00007fa8cdf2f6f4 CR3: 0000000124850003 CR4: 0000000000370ee0
[ 2551.743462] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 2551.743463] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[ 2551.743464] Call Trace:
[ 2551.743472] <TASK>
[ 2551.743474] ? __warn+0x80/0x130
[ 2551.743478] ? btrfs_write_dirty_block_groups+0x397/0x3d0 [btrfs]
[ 2551.743520] ? report_bug+0x1f4/0x200
[ 2551.743523] ? handle_bug+0x42/0x70
[ 2551.743526] ? exc_invalid_op+0x14/0x70
[ 2551.743528] ? asm_exc_invalid_op+0x16/0x20
[ 2551.743532] ? btrfs_write_dirty_block_groups+0x397/0x3d0 [btrfs]
[ 2551.743574] ? _raw_spin_unlock+0x15/0x30
[ 2551.743576] ? btrfs_run_delayed_refs+0x1bd/0x200 [btrfs]
[ 2551.743609] commit_cowonly_roots+0x1e9/0x260 [btrfs]
[ 2551.743652] btrfs_commit_transaction+0x42e/0xfa0 [btrfs]
[ 2551.743693] ? __pfx_autoremove_wake_function+0x10/0x10
[ 2551.743697] flush_space+0xf1/0x5d0 [btrfs]
[ 2551.743743] ? _raw_spin_unlock+0x15/0x30
[ 2551.743745] ? finish_task_switch+0x91/0x2a0
[ 2551.743748] ? _raw_spin_unlock+0x15/0x30
[ 2551.743750] ? btrfs_get_alloc_profile+0xc9/0x1f0 [btrfs]
[ 2551.743793] btrfs_async_reclaim_metadata_space+0xe1/0x230 [btrfs]
[ 2551.743837] process_one_work+0x1d9/0x3e0
[ 2551.743844] worker_thread+0x4a/0x3b0
[ 2551.743847] ? __pfx_worker_thread+0x10/0x10
[ 2551.743849] kthread+0xee/0x120
[ 2551.743852] ? __pfx_kthread+0x10/0x10
[ 2551.743854] ret_from_fork+0x29/0x50
[ 2551.743860] </TASK>
[ 2551.743861] ---[ end trace 0000000000000000 ]---
[ 2551.743863] BTRFS info (device loop0: state A): dumping space info:
[ 2551.743866] BTRFS info (device loop0: state A): space_info DATA has 126976 free, is full
[ 2551.743868] BTRFS info (device loop0: state A): space_info total=13458472960, used=13458137088, pinned=143360, reserved=0, may_use=0, readonly=65536 zone_unusable=0
[ 2551.743870] BTRFS info (device loop0: state A): space_info METADATA has -51625984 free, is full
[ 2551.743872] BTRFS info (device loop0: state A): space_info total=771751936, used=770146304, pinned=1605632, reserved=0, may_use=51625984, readonly=0 zone_unusable=0
[ 2551.743874] BTRFS info (device loop0: state A): space_info SYSTEM has 14663680 free, is not full
[ 2551.743875] BTRFS info (device loop0: state A): space_info total=14680064, used=16384, pinned=0, reserved=0, may_use=0, readonly=0 zone_unusable=0
[ 2551.743877] BTRFS info (device loop0: state A): global_block_rsv: size 53231616 reserved 51544064
[ 2551.743878] BTRFS info (device loop0: state A): trans_block_rsv: size 0 reserved 0
[ 2551.743879] BTRFS info (device loop0: state A): chunk_block_rsv: size 0 reserved 0
[ 2551.743880] BTRFS info (device loop0: state A): delayed_block_rsv: size 0 reserved 0
[ 2551.743881] BTRFS info (device loop0: state A): delayed_refs_rsv: size 786432 reserved 0
[ 2551.743886] BTRFS: error (device loop0: state A) in btrfs_write_dirty_block_groups:3494: errno=-28 No space left
[ 2551.743911] BTRFS info (device loop0: state EA): forced readonly
[ 2551.743951] BTRFS warning (device loop0: state EA): could not allocate space for delete; will truncate on mount
[ 2551.743962] BTRFS error (device loop0: state EA): Error removing orphan entry, stopping orphan cleanup
[ 2551.743973] BTRFS warning (device loop0: state EA): Skipping commit of aborted transaction.
[ 2551.743989] BTRFS error (device loop0: state EA): could not do orphan cleanup -22

So make the btrfs_orphan_cleanup() return the value of BTRFS_FS_ERROR(),
if it's set, and -EINVAL otherwise.

For that same example, after this change, the mount operation fails with
-ENOSPC:

$ mount test.img /mnt
mount: /mnt: mount(2) system call failed: No space left on device.

Signed-off-by: Filipe Manana <[email protected]>
Signed-off-by: David Sterba <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
---
fs/btrfs/inode.c | 9 ++++++++-
1 file changed, 8 insertions(+), 1 deletion(-)

diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
index a446965d701db..f22449aea5b7f 100644
--- a/fs/btrfs/inode.c
+++ b/fs/btrfs/inode.c
@@ -3534,9 +3534,16 @@ int btrfs_orphan_cleanup(struct btrfs_root *root)
*/

if (found_key.offset == last_objectid) {
+ /*
+ * We found the same inode as before. This means we were
+ * not able to remove its items via eviction triggered
+ * by an iput(). A transaction abort may have happened,
+ * due to -ENOSPC for example, so try to grab the error
+ * that lead to a transaction abort, if any.
+ */
btrfs_err(fs_info,
"Error removing orphan entry, stopping orphan cleanup");
- ret = -EINVAL;
+ ret = BTRFS_FS_ERROR(fs_info) ?: -EINVAL;
goto out;
}

--
2.40.1

2023-09-07 19:05:03

by Filipe Manana

[permalink] [raw]

Subject: Re: [PATCH AUTOSEL 6.4 3/5] btrfs: return real error when orphan cleanup fails due to a transaction abort

On Thu, Sep 07, 2023 at 11:43:47AM -0400, Sasha Levin wrote:
> From: Filipe Manana <[email protected]>
>
> [ Upstream commit a7f8de500e28bb227e02a7bd35988cf37b816c86 ]

Please don't add this patch to any stable release.
Besides not being that important for stable, backporting it alone would not
be correct as it depends on:

commit ae3364e5215bed9ce89db6b0c2d21eae4b66f4ae
Author: Filipe Manana <[email protected]>
Date: Wed Jul 26 16:57:04 2023 +0100

btrfs: store the error that turned the fs into error state

Thanks.

>
> During mount we will call btrfs_orphan_cleanup() to remove any inodes that
> were previously deleted (have a link count of 0) but for which we were not
> able before to remove their items from the subvolume tree. The removal of
> the items will happen by triggering eviction, when we do the final iput()
> on them at btrfs_orphan_cleanup(), which will end in the loop at
> btrfs_evict_inode() that truncates inode items.
>
> In a dire situation we may have a transaction abort due to -ENOSPC when
> attempting to truncate the inode items, and in that case the orphan item
> (key type BTRFS_ORPHAN_ITEM_KEY) will remain in the subvolume tree and
> when we hit the next iteration of the while loop at btrfs_orphan_cleanup()
> we will find the same orphan item as before, and then we will return
> -EINVAL from btrfs_orphan_cleanup() through the following if statement:
>
> if (found_key.offset == last_objectid) {
> btrfs_err(fs_info,
> "Error removing orphan entry, stopping orphan cleanup");
> ret = -EINVAL;
> goto out;
> }
>
> This makes the mount operation fail with -EINVAL, when it should have been
> -ENOSPC. This is confusing because -EINVAL might lead a user into thinking
> it provided invalid mount options for example.
>
> An example where this happens:
>
> $ mount test.img /mnt
> mount: /mnt: wrong fs type, bad option, bad superblock on /dev/loop0, missing codepage or helper program, or other error.
>
> $ dmesg
> [ 2542.356934] BTRFS: device fsid 977fff75-1181-4d2b-a739-384fa710d16e devid 1 transid 47409973 /dev/loop0 scanned by mount (4459)
> [ 2542.357451] BTRFS info (device loop0): using crc32c (crc32c-intel) checksum algorithm
> [ 2542.357461] BTRFS info (device loop0): disk space caching is enabled
> [ 2542.742287] BTRFS info (device loop0): auto enabling async discard
> [ 2542.764554] BTRFS info (device loop0): checking UUID tree
> [ 2551.743065] ------------[ cut here ]------------
> [ 2551.743068] BTRFS: Transaction aborted (error -28)
> [ 2551.743149] WARNING: CPU: 7 PID: 215 at fs/btrfs/block-group.c:3494 btrfs_write_dirty_block_groups+0x397/0x3d0 [btrfs]
> [ 2551.743311] Modules linked in: btrfs blake2b_generic (...)
> [ 2551.743353] CPU: 7 PID: 215 Comm: kworker/u24:5 Not tainted 6.4.0-rc6-btrfs-next-134+ #1
> [ 2551.743356] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.2-0-gea1b7a073390-prebuilt.qemu.org 04/01/2014
> [ 2551.743357] Workqueue: events_unbound btrfs_async_reclaim_metadata_space [btrfs]
> [ 2551.743405] RIP: 0010:btrfs_write_dirty_block_groups+0x397/0x3d0 [btrfs]
> [ 2551.743449] Code: 8b 43 0c (...)
> [ 2551.743451] RSP: 0018:ffff982c005a7c40 EFLAGS: 00010286
> [ 2551.743452] RAX: 0000000000000000 RBX: ffff88fc6e44b400 RCX: 0000000000000000
> [ 2551.743453] RDX: 0000000000000002 RSI: ffffffff8dff0878 RDI: 00000000ffffffff
> [ 2551.743454] RBP: ffff88fc51817208 R08: 0000000000000000 R09: ffff982c005a7ae0
> [ 2551.743455] R10: 0000000000000001 R11: 0000000000000001 R12: ffff88fc43d2e570
> [ 2551.743456] R13: ffff88fc43d2e400 R14: ffff88fc8fb08ee0 R15: ffff88fc6e44b530
> [ 2551.743457] FS: 0000000000000000(0000) GS:ffff89035fbc0000(0000) knlGS:0000000000000000
> [ 2551.743458] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [ 2551.743459] CR2: 00007fa8cdf2f6f4 CR3: 0000000124850003 CR4: 0000000000370ee0
> [ 2551.743462] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> [ 2551.743463] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> [ 2551.743464] Call Trace:
> [ 2551.743472] <TASK>
> [ 2551.743474] ? __warn+0x80/0x130
> [ 2551.743478] ? btrfs_write_dirty_block_groups+0x397/0x3d0 [btrfs]
> [ 2551.743520] ? report_bug+0x1f4/0x200
> [ 2551.743523] ? handle_bug+0x42/0x70
> [ 2551.743526] ? exc_invalid_op+0x14/0x70
> [ 2551.743528] ? asm_exc_invalid_op+0x16/0x20
> [ 2551.743532] ? btrfs_write_dirty_block_groups+0x397/0x3d0 [btrfs]
> [ 2551.743574] ? _raw_spin_unlock+0x15/0x30
> [ 2551.743576] ? btrfs_run_delayed_refs+0x1bd/0x200 [btrfs]
> [ 2551.743609] commit_cowonly_roots+0x1e9/0x260 [btrfs]
> [ 2551.743652] btrfs_commit_transaction+0x42e/0xfa0 [btrfs]
> [ 2551.743693] ? __pfx_autoremove_wake_function+0x10/0x10
> [ 2551.743697] flush_space+0xf1/0x5d0 [btrfs]
> [ 2551.743743] ? _raw_spin_unlock+0x15/0x30
> [ 2551.743745] ? finish_task_switch+0x91/0x2a0
> [ 2551.743748] ? _raw_spin_unlock+0x15/0x30
> [ 2551.743750] ? btrfs_get_alloc_profile+0xc9/0x1f0 [btrfs]
> [ 2551.743793] btrfs_async_reclaim_metadata_space+0xe1/0x230 [btrfs]
> [ 2551.743837] process_one_work+0x1d9/0x3e0
> [ 2551.743844] worker_thread+0x4a/0x3b0
> [ 2551.743847] ? __pfx_worker_thread+0x10/0x10
> [ 2551.743849] kthread+0xee/0x120
> [ 2551.743852] ? __pfx_kthread+0x10/0x10
> [ 2551.743854] ret_from_fork+0x29/0x50
> [ 2551.743860] </TASK>
> [ 2551.743861] ---[ end trace 0000000000000000 ]---
> [ 2551.743863] BTRFS info (device loop0: state A): dumping space info:
> [ 2551.743866] BTRFS info (device loop0: state A): space_info DATA has 126976 free, is full
> [ 2551.743868] BTRFS info (device loop0: state A): space_info total=13458472960, used=13458137088, pinned=143360, reserved=0, may_use=0, readonly=65536 zone_unusable=0
> [ 2551.743870] BTRFS info (device loop0: state A): space_info METADATA has -51625984 free, is full
> [ 2551.743872] BTRFS info (device loop0: state A): space_info total=771751936, used=770146304, pinned=1605632, reserved=0, may_use=51625984, readonly=0 zone_unusable=0
> [ 2551.743874] BTRFS info (device loop0: state A): space_info SYSTEM has 14663680 free, is not full
> [ 2551.743875] BTRFS info (device loop0: state A): space_info total=14680064, used=16384, pinned=0, reserved=0, may_use=0, readonly=0 zone_unusable=0
> [ 2551.743877] BTRFS info (device loop0: state A): global_block_rsv: size 53231616 reserved 51544064
> [ 2551.743878] BTRFS info (device loop0: state A): trans_block_rsv: size 0 reserved 0
> [ 2551.743879] BTRFS info (device loop0: state A): chunk_block_rsv: size 0 reserved 0
> [ 2551.743880] BTRFS info (device loop0: state A): delayed_block_rsv: size 0 reserved 0
> [ 2551.743881] BTRFS info (device loop0: state A): delayed_refs_rsv: size 786432 reserved 0
> [ 2551.743886] BTRFS: error (device loop0: state A) in btrfs_write_dirty_block_groups:3494: errno=-28 No space left
> [ 2551.743911] BTRFS info (device loop0: state EA): forced readonly
> [ 2551.743951] BTRFS warning (device loop0: state EA): could not allocate space for delete; will truncate on mount
> [ 2551.743962] BTRFS error (device loop0: state EA): Error removing orphan entry, stopping orphan cleanup
> [ 2551.743973] BTRFS warning (device loop0: state EA): Skipping commit of aborted transaction.
> [ 2551.743989] BTRFS error (device loop0: state EA): could not do orphan cleanup -22
>
> So make the btrfs_orphan_cleanup() return the value of BTRFS_FS_ERROR(),
> if it's set, and -EINVAL otherwise.
>
> For that same example, after this change, the mount operation fails with
> -ENOSPC:
>
> $ mount test.img /mnt
> mount: /mnt: mount(2) system call failed: No space left on device.
>
> Signed-off-by: Filipe Manana <[email protected]>
> Signed-off-by: David Sterba <[email protected]>
> Signed-off-by: Sasha Levin <[email protected]>
> ---
> fs/btrfs/inode.c | 9 ++++++++-
> 1 file changed, 8 insertions(+), 1 deletion(-)
>
> diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
> index a446965d701db..f22449aea5b7f 100644
> --- a/fs/btrfs/inode.c
> +++ b/fs/btrfs/inode.c
> @@ -3534,9 +3534,16 @@ int btrfs_orphan_cleanup(struct btrfs_root *root)
> */
>
> if (found_key.offset == last_objectid) {
> + /*
> + * We found the same inode as before. This means we were
> + * not able to remove its items via eviction triggered
> + * by an iput(). A transaction abort may have happened,
> + * due to -ENOSPC for example, so try to grab the error
> + * that lead to a transaction abort, if any.
> + */
> btrfs_err(fs_info,
> "Error removing orphan entry, stopping orphan cleanup");
> - ret = -EINVAL;
> + ret = BTRFS_FS_ERROR(fs_info) ?: -EINVAL;
> goto out;
> }
>
> --
> 2.40.1
>