2021-07-24 07:43:30

by Wang Jianchao

[permalink] [raw]
Subject: [PATCH V3 0/5] ext4: get discard out of jbd2 commit context

Hi all

This is the version 3 patch set that attempts to get discard out of the jbd2
commit kthread. When the user delete a lot data and cause discard flooding,
the jbd2 commit kthread can be blocked for very long time and then all of
the metadata operations are blocked due to no journal space.

The xfstest with following parameters,
MODULAR=0
TEST_DIR=/mnt/test
TEST_DEV=/dev/nbd37p1
SCRATCH_MNT=/mnt/scratch
SCRATCH_DEV=/dev/nbd37p2
MOUNT_OPTIONS="-o discard"
has passed. The result is consistent w/ or w/o this patch set.

There are 5 patches,

Patch 1 ~ 3, there are no functional changes in them, but just some preparation
for following patches

Patch 4 introduces a async kworker to do discard in fstrim fation which implements
the core idea of this patch set.

Patch 5 let the fallocate retry when err is ENOSPC. This fix the generic/371

Any comments are welcome ;)

V2 -> V3
- Get rid of the per block group rb tree which carries freed entry. It is not neccesary
because we have done aggregation when wait for journal commit. Just use a list
to carry the free entries.

V1 -> V2
- free the blocks back to mb buddy after commit and then do ftrim fashion discard

fs/ext4/ext4.h | 2 +
fs/ext4/extents.c | 6 ++-
fs/ext4/mballoc.c | 223 ++++++++++++++++++++++++++++++++++++++++++++++++++++----------------------------
3 files changed, 151 insertions(+), 80 deletions(-)


2021-07-24 07:43:30

by Wang Jianchao

[permalink] [raw]
Subject: [PATCH V3 1/5] ext4: remove the 'group' parameter of ext4_trim_extent

From: Wang Jianchao <[email protected]>

Get rid of the 'group' parameter of ext4_trim_extent as we can get
it from the 'e4b'.

Reviewed-by: Andreas Dilger <[email protected]>
Signed-off-by: Wang Jianchao <[email protected]>
---
fs/ext4/mballoc.c | 9 ++++-----
1 file changed, 4 insertions(+), 5 deletions(-)

diff --git a/fs/ext4/mballoc.c b/fs/ext4/mballoc.c
index 089c958aa2c3..018d5d3c6eeb 100644
--- a/fs/ext4/mballoc.c
+++ b/fs/ext4/mballoc.c
@@ -6183,19 +6183,19 @@ int ext4_group_add_blocks(handle_t *handle, struct super_block *sb,
* @sb: super block for the file system
* @start: starting block of the free extent in the alloc. group
* @count: number of blocks to TRIM
- * @group: alloc. group we are working with
* @e4b: ext4 buddy for the group
*
* Trim "count" blocks starting at "start" in the "group". To assure that no
* one will allocate those blocks, mark it as used in buddy bitmap. This must
* be called with under the group lock.
*/
-static int ext4_trim_extent(struct super_block *sb, int start, int count,
- ext4_group_t group, struct ext4_buddy *e4b)
+static int ext4_trim_extent(struct super_block *sb,
+ int start, int count, struct ext4_buddy *e4b)
__releases(bitlock)
__acquires(bitlock)
{
struct ext4_free_extent ex;
+ ext4_group_t group = e4b->bd_group;
int ret = 0;

trace_ext4_trim_extent(sb, group, start, count);
@@ -6271,8 +6271,7 @@ ext4_trim_all_free(struct super_block *sb, ext4_group_t group,
next = mb_find_next_bit(bitmap, max + 1, start);

if ((next - start) >= minblocks) {
- ret = ext4_trim_extent(sb, start,
- next - start, group, &e4b);
+ ret = ext4_trim_extent(sb, start, next - start, &e4b);
if (ret && ret != -EOPNOTSUPP)
break;
ret = 0;
--
2.17.1

2021-07-24 07:43:40

by Wang Jianchao

[permalink] [raw]
Subject: [PATCH V3 2/5] ext4: add new helper interface ext4_try_to_trim_range()

From: Wang Jianchao <[email protected]>

There is no functional change in this patch but just split the
codes, which serachs free block and does trim, into a new function
ext4_try_to_trim_range. This is preparing for the following async
backgroup discard.

Reviewed-by: Andreas Dilger <[email protected]>
Signed-off-by: Wang Jianchao <[email protected]>
---
fs/ext4/mballoc.c | 102 ++++++++++++++++++++++++++--------------------
1 file changed, 57 insertions(+), 45 deletions(-)

diff --git a/fs/ext4/mballoc.c b/fs/ext4/mballoc.c
index 018d5d3c6eeb..e3844152a643 100644
--- a/fs/ext4/mballoc.c
+++ b/fs/ext4/mballoc.c
@@ -6218,6 +6218,54 @@ __acquires(bitlock)
return ret;
}

+static int ext4_try_to_trim_range(struct super_block *sb,
+ struct ext4_buddy *e4b, ext4_grpblk_t start,
+ ext4_grpblk_t max, ext4_grpblk_t minblocks)
+{
+ ext4_grpblk_t next, count, free_count;
+ void *bitmap;
+ int ret = 0;
+
+ bitmap = e4b->bd_bitmap;
+ start = (e4b->bd_info->bb_first_free > start) ?
+ e4b->bd_info->bb_first_free : start;
+ count = 0;
+ free_count = 0;
+
+ while (start <= max) {
+ start = mb_find_next_zero_bit(bitmap, max + 1, start);
+ if (start > max)
+ break;
+ next = mb_find_next_bit(bitmap, max + 1, start);
+
+ if ((next - start) >= minblocks) {
+ ret = ext4_trim_extent(sb, start, next - start, e4b);
+ if (ret && ret != -EOPNOTSUPP)
+ break;
+ ret = 0;
+ count += next - start;
+ }
+ free_count += next - start;
+ start = next + 1;
+
+ if (fatal_signal_pending(current)) {
+ count = -ERESTARTSYS;
+ break;
+ }
+
+ if (need_resched()) {
+ ext4_unlock_group(sb, e4b->bd_group);
+ cond_resched();
+ ext4_lock_group(sb, e4b->bd_group);
+ }
+
+ if ((e4b->bd_info->bb_free - free_count) < minblocks)
+ break;
+ }
+
+ return count;
+}
+
/**
* ext4_trim_all_free -- function to trim all free space in alloc. group
* @sb: super block for file system
@@ -6241,10 +6289,8 @@ ext4_trim_all_free(struct super_block *sb, ext4_group_t group,
ext4_grpblk_t start, ext4_grpblk_t max,
ext4_grpblk_t minblocks)
{
- void *bitmap;
- ext4_grpblk_t next, count = 0, free_count = 0;
struct ext4_buddy e4b;
- int ret = 0;
+ int ret;

trace_ext4_trim_all_free(sb, group, start, max);

@@ -6254,57 +6300,23 @@ ext4_trim_all_free(struct super_block *sb, ext4_group_t group,
ret, group);
return ret;
}
- bitmap = e4b.bd_bitmap;

ext4_lock_group(sb, group);
- if (EXT4_MB_GRP_WAS_TRIMMED(e4b.bd_info) &&
- minblocks >= atomic_read(&EXT4_SB(sb)->s_last_trim_minblks))
- goto out;
-
- start = (e4b.bd_info->bb_first_free > start) ?
- e4b.bd_info->bb_first_free : start;

- while (start <= max) {
- start = mb_find_next_zero_bit(bitmap, max + 1, start);
- if (start > max)
- break;
- next = mb_find_next_bit(bitmap, max + 1, start);
-
- if ((next - start) >= minblocks) {
- ret = ext4_trim_extent(sb, start, next - start, &e4b);
- if (ret && ret != -EOPNOTSUPP)
- break;
- ret = 0;
- count += next - start;
- }
- free_count += next - start;
- start = next + 1;
-
- if (fatal_signal_pending(current)) {
- count = -ERESTARTSYS;
- break;
- }
-
- if (need_resched()) {
- ext4_unlock_group(sb, group);
- cond_resched();
- ext4_lock_group(sb, group);
- }
-
- if ((e4b.bd_info->bb_free - free_count) < minblocks)
- break;
+ if (!EXT4_MB_GRP_WAS_TRIMMED(e4b.bd_info) ||
+ minblocks < atomic_read(&EXT4_SB(sb)->s_last_trim_minblks)) {
+ ret = ext4_try_to_trim_range(sb, &e4b, start, max, minblocks);
+ if (ret >= 0)
+ EXT4_MB_GRP_SET_TRIMMED(e4b.bd_info);
+ } else {
+ ret = 0;
}

- if (!ret) {
- ret = count;
- EXT4_MB_GRP_SET_TRIMMED(e4b.bd_info);
- }
-out:
ext4_unlock_group(sb, group);
ext4_mb_unload_buddy(&e4b);

ext4_debug("trimmed %d blocks in the group %d\n",
- count, group);
+ ret, group);

return ret;
}
--
2.17.1

2021-07-24 07:44:16

by Wang Jianchao

[permalink] [raw]
Subject: [PATCH V3 3/5] ext4: remove the repeated comment of ext4_trim_all_free

From: Wang Jianchao <[email protected]>

Reviewed-by: Andreas Dilger <[email protected]>
Signed-off-by: Wang Jianchao <[email protected]>
---
fs/ext4/mballoc.c | 7 +------
1 file changed, 1 insertion(+), 6 deletions(-)

diff --git a/fs/ext4/mballoc.c b/fs/ext4/mballoc.c
index e3844152a643..34be2f07449d 100644
--- a/fs/ext4/mballoc.c
+++ b/fs/ext4/mballoc.c
@@ -6274,15 +6274,10 @@ static int ext4_try_to_trim_range(struct super_block *sb,
* @max: last group block to examine
* @minblocks: minimum extent block count
*
- * ext4_trim_all_free walks through group's buddy bitmap searching for free
- * extents. When the free block is found, ext4_trim_extent is called to TRIM
- * the extent.
- *
- *
* ext4_trim_all_free walks through group's block bitmap searching for free
* extents. When the free extent is found, mark it as used in group buddy
* bitmap. Then issue a TRIM command on this extent and free the extent in
- * the group buddy bitmap. This is done until whole group is scanned.
+ * the group buddy bitmap.
*/
static ext4_grpblk_t
ext4_trim_all_free(struct super_block *sb, ext4_group_t group,
--
2.17.1

2021-07-24 07:44:32

by Wang Jianchao

[permalink] [raw]
Subject: [PATCH V3 4/5] ext4: get discard out of jbd2 commit kthread contex

From: Wang Jianchao <[email protected]>

Right now, discard is issued and waited to be completed in jbd2
commit kthread context after the logs are committed. When large
amount of files are deleted and discard is flooding, jbd2 commit
kthread can be blocked for long time. Then all of the metadata
operations can be blocked to wait the log space.

One case is the page fault path with read mm->mmap_sem held, which
wants to update the file time but has to wait for the log space.
When other threads in the task wants to do mmap, then write mmap_sem
is blocked. Finally all of the following read mmap_sem requirements
are blocked, even the ps command which need to read the /proc/pid/
-cmdline. Our monitor service which needs to read /proc/pid/cmdline
used to be blocked for 5 mins.

This patch frees the blocks back to buddy after commit and then do
discard in a async kworker context in fstrim fashion, namely,
- mark blocks to be discarded as used if they have not been allocated
- do discard
- mark them free
After this, jbd2 commit kthread won't be blocked any more by discard
and we won't get NOSPC even if the discard is slow or throttled.

Link: https://marc.info/?l=linux-kernel&m=162143690731901&w=2
Suggested-by: Theodore Ts'o <[email protected]>
Signed-off-by: Wang Jianchao <[email protected]>
---
fs/ext4/ext4.h | 2 +
fs/ext4/mballoc.c | 109 +++++++++++++++++++++++++++++++++++-----------
2 files changed, 86 insertions(+), 25 deletions(-)

diff --git a/fs/ext4/ext4.h b/fs/ext4/ext4.h
index 3c51e243450d..6b678b968d84 100644
--- a/fs/ext4/ext4.h
+++ b/fs/ext4/ext4.h
@@ -1536,6 +1536,8 @@ struct ext4_sb_info {
unsigned int s_mb_free_pending;
struct list_head s_freed_data_list; /* List of blocks to be freed
after commit completed */
+ struct list_head s_discard_list;
+ struct work_struct s_discard_work;
struct rb_root s_mb_avg_fragment_size_root;
rwlock_t s_mb_rb_lock;
struct list_head *s_mb_largest_free_orders;
diff --git a/fs/ext4/mballoc.c b/fs/ext4/mballoc.c
index 34be2f07449d..a496509e61b7 100644
--- a/fs/ext4/mballoc.c
+++ b/fs/ext4/mballoc.c
@@ -386,6 +386,7 @@
static struct kmem_cache *ext4_pspace_cachep;
static struct kmem_cache *ext4_ac_cachep;
static struct kmem_cache *ext4_free_data_cachep;
+static struct workqueue_struct *ext4_discard_wq;

/* We create slab caches for groupinfo data structures based on the
* superblock block size. There will be one per mounted filesystem for
@@ -408,6 +409,10 @@ static void ext4_mb_new_preallocation(struct ext4_allocation_context *ac);
static bool ext4_mb_good_group(struct ext4_allocation_context *ac,
ext4_group_t group, int cr);

+static int ext4_try_to_trim_range(struct super_block *sb,
+ struct ext4_buddy *e4b, ext4_grpblk_t start,
+ ext4_grpblk_t max, ext4_grpblk_t minblocks);
+
/*
* The algorithm using this percpu seq counter goes below:
* 1. We sample the percpu discard_pa_seq counter before trying for block
@@ -3308,6 +3313,55 @@ static int ext4_groupinfo_create_slab(size_t size)
return 0;
}

+static void ext4_discard_work(struct work_struct *work)
+{
+ struct ext4_sb_info *sbi = container_of(work,
+ struct ext4_sb_info, s_discard_work);
+ struct super_block *sb = sbi->s_sb;
+ struct ext4_free_data *fd, *nfd;
+ struct ext4_buddy e4b;
+ struct list_head discard_list;
+ ext4_group_t grp, load_grp;
+ int err = 0;
+
+ INIT_LIST_HEAD(&discard_list);
+ spin_lock(&sbi->s_md_lock);
+ list_splice_init(&sbi->s_discard_list, &discard_list);
+ spin_unlock(&sbi->s_md_lock);
+
+ load_grp = UINT_MAX;
+ list_for_each_entry_safe(fd, nfd, &discard_list, efd_list) {
+ /*
+ * If filesystem is umounting or no memory, give up the discard
+ */
+ if ((sb->s_flags & SB_ACTIVE) && !err) {
+ grp = fd->efd_group;
+ if (grp != load_grp) {
+ if (load_grp != UINT_MAX)
+ ext4_mb_unload_buddy(&e4b);
+
+ err = ext4_mb_load_buddy(sb, grp, &e4b);
+ if (err) {
+ kmem_cache_free(ext4_free_data_cachep, fd);
+ load_grp = UINT_MAX;
+ continue;
+ } else {
+ load_grp = grp;
+ }
+ }
+
+ ext4_lock_group(sb, grp);
+ ext4_try_to_trim_range(sb, &e4b, fd->efd_start_cluster,
+ fd->efd_start_cluster + fd->efd_count - 1, 1);
+ ext4_unlock_group(sb, grp);
+ }
+ kmem_cache_free(ext4_free_data_cachep, fd);
+ }
+
+ if (load_grp != UINT_MAX)
+ ext4_mb_unload_buddy(&e4b);
+}
+
int ext4_mb_init(struct super_block *sb)
{
struct ext4_sb_info *sbi = EXT4_SB(sb);
@@ -3376,6 +3430,8 @@ int ext4_mb_init(struct super_block *sb)
spin_lock_init(&sbi->s_md_lock);
sbi->s_mb_free_pending = 0;
INIT_LIST_HEAD(&sbi->s_freed_data_list);
+ INIT_LIST_HEAD(&sbi->s_discard_list);
+ INIT_WORK(&sbi->s_discard_work, ext4_discard_work);

sbi->s_mb_max_to_scan = MB_DEFAULT_MAX_TO_SCAN;
sbi->s_mb_min_to_scan = MB_DEFAULT_MIN_TO_SCAN;
@@ -3474,6 +3530,14 @@ int ext4_mb_release(struct super_block *sb)
struct kmem_cache *cachep = get_groupinfo_cache(sb->s_blocksize_bits);
int count;

+ if (test_opt(sb, DISCARD)) {
+ /*
+ * wait the discard work to drain all of ext4_free_data
+ */
+ queue_work(ext4_discard_wq, &sbi->s_discard_work);
+ flush_work(&sbi->s_discard_work);
+ }
+
if (sbi->s_group_info) {
for (i = 0; i < ngroups; i++) {
cond_resched();
@@ -3596,7 +3660,6 @@ static void ext4_free_data_in_buddy(struct super_block *sb,
put_page(e4b.bd_bitmap_page);
}
ext4_unlock_group(sb, entry->efd_group);
- kmem_cache_free(ext4_free_data_cachep, entry);
ext4_mb_unload_buddy(&e4b);

mb_debug(sb, "freed %d blocks in %d structures\n", count,
@@ -3611,10 +3674,9 @@ void ext4_process_freed_data(struct super_block *sb, tid_t commit_tid)
{
struct ext4_sb_info *sbi = EXT4_SB(sb);
struct ext4_free_data *entry, *tmp;
- struct bio *discard_bio = NULL;
struct list_head freed_data_list;
struct list_head *cut_pos = NULL;
- int err;
+ bool wake;

INIT_LIST_HEAD(&freed_data_list);

@@ -3629,30 +3691,20 @@ void ext4_process_freed_data(struct super_block *sb, tid_t commit_tid)
cut_pos);
spin_unlock(&sbi->s_md_lock);

- if (test_opt(sb, DISCARD)) {
- list_for_each_entry(entry, &freed_data_list, efd_list) {
- err = ext4_issue_discard(sb, entry->efd_group,
- entry->efd_start_cluster,
- entry->efd_count,
- &discard_bio);
- if (err && err != -EOPNOTSUPP) {
- ext4_msg(sb, KERN_WARNING, "discard request in"
- " group:%d block:%d count:%d failed"
- " with %d", entry->efd_group,
- entry->efd_start_cluster,
- entry->efd_count, err);
- } else if (err == -EOPNOTSUPP)
- break;
- }
+ list_for_each_entry(entry, &freed_data_list, efd_list)
+ ext4_free_data_in_buddy(sb, entry);

- if (discard_bio) {
- submit_bio_wait(discard_bio);
- bio_put(discard_bio);
- }
+ if (test_opt(sb, DISCARD)) {
+ spin_lock(&sbi->s_md_lock);
+ wake = list_empty(&sbi->s_discard_list);
+ list_splice_tail(&freed_data_list, &sbi->s_discard_list);
+ spin_unlock(&sbi->s_md_lock);
+ if (wake)
+ queue_work(ext4_discard_wq, &sbi->s_discard_work);
+ } else {
+ list_for_each_entry_safe(entry, tmp, &freed_data_list, efd_list)
+ kmem_cache_free(ext4_free_data_cachep, entry);
}
-
- list_for_each_entry_safe(entry, tmp, &freed_data_list, efd_list)
- ext4_free_data_in_buddy(sb, entry);
}

int __init ext4_init_mballoc(void)
@@ -3672,8 +3724,14 @@ int __init ext4_init_mballoc(void)
if (ext4_free_data_cachep == NULL)
goto out_ac_free;

+ ext4_discard_wq = alloc_workqueue("ext4discard", WQ_UNBOUND, 0);
+ if (!ext4_discard_wq)
+ goto out_free_data;
+
return 0;

+out_free_data:
+ kmem_cache_destroy(ext4_free_data_cachep);
out_ac_free:
kmem_cache_destroy(ext4_ac_cachep);
out_pa_free:
@@ -3693,6 +3751,7 @@ void ext4_exit_mballoc(void)
kmem_cache_destroy(ext4_ac_cachep);
kmem_cache_destroy(ext4_free_data_cachep);
ext4_groupinfo_destroy_slabs();
+ destroy_workqueue(ext4_discard_wq);
}


--
2.17.1

2021-07-24 07:44:49

by Wang Jianchao

[permalink] [raw]
Subject: [PATCH V3 5/5] ext4: make fallocate retry when err is ENOSPC

From: Wang Jianchao <[email protected]>

The blocks may be waiting for journal commit to be freed back to
mb buddy. Let fallocate wait and retry in that case.

Signed-off-by: Wang Jianchao <[email protected]>
---
fs/ext4/extents.c | 6 +++++-
1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/fs/ext4/extents.c b/fs/ext4/extents.c
index 92ad64b89d9b..ad0b874d3448 100644
--- a/fs/ext4/extents.c
+++ b/fs/ext4/extents.c
@@ -4635,7 +4635,7 @@ long ext4_fallocate(struct file *file, int mode, loff_t offset, loff_t len)
struct inode *inode = file_inode(file);
loff_t new_size = 0;
unsigned int max_blocks;
- int ret = 0;
+ int ret = 0, retries = 0;
int flags;
ext4_lblk_t lblk;
unsigned int blkbits = inode->i_blkbits;
@@ -4656,6 +4656,7 @@ long ext4_fallocate(struct file *file, int mode, loff_t offset, loff_t len)
FALLOC_FL_INSERT_RANGE))
return -EOPNOTSUPP;

+retry:
ext4_fc_start_update(inode);

if (mode & FALLOC_FL_PUNCH_HOLE) {
@@ -4722,6 +4723,9 @@ long ext4_fallocate(struct file *file, int mode, loff_t offset, loff_t len)
trace_ext4_fallocate_exit(inode, offset, max_blocks, ret);
exit:
ext4_fc_stop_update(inode);
+ if (ret == -ENOSPC && ext4_should_retry_alloc(inode->i_sb, &retries))
+ goto retry;
+
return ret;
}

--
2.17.1

2021-07-26 03:41:24

by Guoqing Jiang

[permalink] [raw]
Subject: Re: [PATCH V3 5/5] ext4: make fallocate retry when err is ENOSPC

Hi,

On 7/24/21 3:41 PM, Wang Jianchao wrote:
> From: Wang Jianchao <[email protected]>
>
> The blocks may be waiting for journal commit to be freed back to
> mb buddy. Let fallocate wait and retry in that case.
>
> Signed-off-by: Wang Jianchao <[email protected]>
> ---
> fs/ext4/extents.c | 6 +++++-
> 1 file changed, 5 insertions(+), 1 deletion(-)
>
> diff --git a/fs/ext4/extents.c b/fs/ext4/extents.c
> index 92ad64b89d9b..ad0b874d3448 100644
> --- a/fs/ext4/extents.c
> +++ b/fs/ext4/extents.c
> @@ -4635,7 +4635,7 @@ long ext4_fallocate(struct file *file, int mode, loff_t offset, loff_t len)
> struct inode *inode = file_inode(file);
> loff_t new_size = 0;
> unsigned int max_blocks;
> - int ret = 0;
> + int ret = 0, retries = 0;
> int flags;
> ext4_lblk_t lblk;
> unsigned int blkbits = inode->i_blkbits;
> @@ -4656,6 +4656,7 @@ long ext4_fallocate(struct file *file, int mode, loff_t offset, loff_t len)
> FALLOC_FL_INSERT_RANGE))
> return -EOPNOTSUPP;
>
> +retry:
> ext4_fc_start_update(inode);
>
> if (mode & FALLOC_FL_PUNCH_HOLE) {
> @@ -4722,6 +4723,9 @@ long ext4_fallocate(struct file *file, int mode, loff_t offset, loff_t len)
> trace_ext4_fallocate_exit(inode, offset, max_blocks, ret);
> exit:
> ext4_fc_stop_update(inode);
> + if (ret == -ENOSPC && ext4_should_retry_alloc(inode->i_sb, &retries))
> + goto retry;
> +

Not sure if it is necessary since ext4_alloc_file_blocks already retries
allocate.

Thanks,
Guoqing

2021-07-26 03:42:47

by kernel test robot

[permalink] [raw]
Subject: Re: [PATCH V3 2/5] ext4: add new helper interface ext4_try_to_trim_range()

Hi Wang,

Thank you for the patch! Perhaps something to improve:

[auto build test WARNING on ext4/dev]
[also build test WARNING on linux/master linus/master v5.14-rc2
next-20210723]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch]

url:
https://github.com/0day-ci/linux/commits/Wang-Jianchao/ext4-get-discard-out-of-jbd2-commit-context/20210724-154426
base: https://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4.git dev
:::::: branch date: 25 hours ago
:::::: commit date: 25 hours ago
config: x86_64-randconfig-c001-20210725 (attached as .config)
compiler: clang version 13.0.0 (https://github.com/llvm/llvm-project
c63dbd850182797bc4b76124d08e1c320ab2365d)
reproduce (this is a W=1 build):
wget
https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross
-O ~/bin/make.cross
chmod +x ~/bin/make.cross
# install x86_64 cross compiling tool for clang build
# apt-get install binutils-x86-64-linux-gnu
#
https://github.com/0day-ci/linux/commit/55a3430685e83709742c1fe2e0d4b347781bcc80
git remote add linux-review https://github.com/0day-ci/linux
git fetch --no-tags linux-review
Wang-Jianchao/ext4-get-discard-out-of-jbd2-commit-context/20210724-154426
git checkout 55a3430685e83709742c1fe2e0d4b347781bcc80
# save the attached .config to linux build tree
COMPILER_INSTALL_PATH=$HOME/0day COMPILER=clang make.cross
clang-analyzer ARCH=x86_64
If you fix the issue, kindly add following tag as appropriate
Reported-by: kernel test robot <[email protected]>


clang-analyzer warnings: (new ones prefixed by >>)
fs/ext4/mballoc.c:1909:2: note: Taking true branch
if (ex->fe_start + ex->fe_len >
EXT4_CLUSTERS_PER_GROUP(e4b->bd_sb)) {
^
fs/ext4/mballoc.c:1911:3: note: Taking true branch
WARN_ON(1);
^
include/asm-generic/bug.h:120:2: note: expanded from macro 'WARN_ON'
if (unlikely(__ret_warn_on))
\
^
fs/ext4/mballoc.c:1911:3: note: Loop condition is false. Exiting loop
WARN_ON(1);
^
include/asm-generic/bug.h:121:3: note: expanded from macro 'WARN_ON'
__WARN();
\
^
include/asm-generic/bug.h:94:19: note: expanded from macro '__WARN'
#define __WARN() __WARN_FLAGS(BUGFLAG_TAINT(TAINT_WARN))
^
arch/x86/include/asm/bug.h:78:2: note: expanded from macro
'__WARN_FLAGS'
instrumentation_begin(); \
^
include/linux/instrumentation.h:53:34: note: expanded from macro
'instrumentation_begin'
# define instrumentation_begin() do { } while(0)
^
fs/ext4/mballoc.c:1911:3: note: Loop condition is false. Exiting loop
WARN_ON(1);
^
include/asm-generic/bug.h:121:3: note: expanded from macro 'WARN_ON'
__WARN();
\
^
include/asm-generic/bug.h:94:19: note: expanded from macro '__WARN'
#define __WARN() __WARN_FLAGS(BUGFLAG_TAINT(TAINT_WARN))
^
arch/x86/include/asm/bug.h:79:2: note: expanded from macro
'__WARN_FLAGS'
_BUG_FLAGS(ASM_UD2, BUGFLAG_WARNING|(flags)); \
^
arch/x86/include/asm/bug.h:25:37: note: expanded from macro '_BUG_FLAGS'
#define _BUG_FLAGS(ins, flags)
\

^
fs/ext4/mballoc.c:1911:3: note: Loop condition is false. Exiting loop
WARN_ON(1);
^
include/asm-generic/bug.h:121:3: note: expanded from macro 'WARN_ON'
__WARN();
\
^
include/asm-generic/bug.h:94:19: note: expanded from macro '__WARN'
#define __WARN() __WARN_FLAGS(BUGFLAG_TAINT(TAINT_WARN))
^
arch/x86/include/asm/bug.h:81:2: note: expanded from macro
'__WARN_FLAGS'
instrumentation_end(); \
^
include/linux/instrumentation.h:54:33: note: expanded from macro
'instrumentation_end'
# define instrumentation_end() do { } while(0)
^
fs/ext4/mballoc.c:1911:3: note: Loop condition is false. Exiting loop
WARN_ON(1);
^
include/asm-generic/bug.h:121:3: note: expanded from macro 'WARN_ON'
__WARN();
\
^
include/asm-generic/bug.h:94:19: note: expanded from macro '__WARN'
#define __WARN() __WARN_FLAGS(BUGFLAG_TAINT(TAINT_WARN))
^
arch/x86/include/asm/bug.h:76:33: note: expanded from macro
'__WARN_FLAGS'
#define __WARN_FLAGS(flags) \
^
fs/ext4/mballoc.c:1912:3: note: 14th function call argument is an
uninitialized value
ext4_grp_locked_error(e4b->bd_sb, e4b->bd_group, 0, 0,
^
fs/ext4/ext4.h:3126:2: note: expanded from macro 'ext4_grp_locked_error'
__ext4_grp_locked_error(__func__, __LINE__, sb, grp, ino,
block, \
^
fs/ext4/mballoc.c:2597:27: warning: Value stored to 'grp' during its
initialization is never read [clang-analyzer-deadcode.DeadStores]
struct ext4_group_info *grp =
ext4_get_group_info(sb, group);
^~~
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
fs/ext4/mballoc.c:2597:27: note: Value stored to 'grp' during its
initialization is never read
struct ext4_group_info *grp =
ext4_get_group_info(sb, group);
^~~
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
fs/ext4/mballoc.c:3839:3: warning: Value stored to 'err' is never
read [clang-analyzer-deadcode.DeadStores]
err = PTR_ERR(bitmap_bh);
^ ~~~~~~~~~~~~~~~~~~
fs/ext4/mballoc.c:3839:3: note: Value stored to 'err' is never read
err = PTR_ERR(bitmap_bh);
^ ~~~~~~~~~~~~~~~~~~
fs/ext4/mballoc.c:3844:2: warning: Value stored to 'err' is never
read [clang-analyzer-deadcode.DeadStores]
err = -EIO;
^ ~~~~
fs/ext4/mballoc.c:3844:2: note: Value stored to 'err' is never read
err = -EIO;
^ ~~~~
fs/ext4/mballoc.c:3889:2: warning: Value stored to 'err' is never
read [clang-analyzer-deadcode.DeadStores]
err = ext4_handle_dirty_metadata(NULL, NULL, gdp_bh);
^
fs/ext4/mballoc.c:3889:2: note: Value stored to 'err' is never read
fs/ext4/mballoc.c:5746:3: warning: Value stored to 'err' is never
read [clang-analyzer-deadcode.DeadStores]
err = PTR_ERR(bitmap_bh);
^ ~~~~~~~~~~~~~~~~~~
fs/ext4/mballoc.c:5746:3: note: Value stored to 'err' is never read
err = PTR_ERR(bitmap_bh);
^ ~~~~~~~~~~~~~~~~~~
>> fs/ext4/mballoc.c:6245:4: warning: Value stored to 'ret' is never read [clang-analyzer-deadcode.DeadStores]
ret = 0;
^ ~
fs/ext4/mballoc.c:6245:4: note: Value stored to 'ret' is never read
ret = 0;
^ ~
Suppressed 6 warnings (6 in non-user code).
Use -header-filter=.* to display errors from all non-system headers.
Use -system-headers to display errors from system headers as well.
4 warnings generated.
Suppressed 4 warnings (4 in non-user code).
Use -header-filter=.* to display errors from all non-system headers.
Use -system-headers to display errors from system headers as well.
12 warnings generated.
arch/x86/include/asm/paravirt.h:585:2: warning: Assigned value is
garbage or undefined [clang-analyzer-core.uninitialized.Assign]
PVOP_VCALL2(lock.queued_spin_lock_slowpath, lock, val);
^
arch/x86/include/asm/paravirt_types.h:547:2: note: expanded from
macro 'PVOP_VCALL2'
__PVOP_VCALL(op, PVOP_CALL_ARG1(arg1), PVOP_CALL_ARG2(arg2))
^
arch/x86/include/asm/paravirt_types.h:491:8: note: expanded from
macro '__PVOP_VCALL'
(void)____PVOP_CALL(, op, CLBR_ANY, PVOP_VCALL_CLOBBERS,
\
^
arch/x86/include/asm/paravirt_types.h:446:3: note: expanded from
macro '____PVOP_CALL'
PVOP_CALL_ARGS;
\
^
arch/x86/include/asm/paravirt_types.h:404:16: note: expanded from
macro 'PVOP_CALL_ARGS'
unsigned long __edi = __edi, __esi = __esi, \
^
kernel/trace/trace.c:5779:2: note: Calling 'queued_spin_lock'
arch_spin_lock(&trace_cmdline_lock);
^
include/asm-generic/qspinlock.h:117:28: note: expanded from macro
'arch_spin_lock'
#define arch_spin_lock(l) queued_spin_lock(l)
^~~~~~~~~~~~~~~~~~~
include/asm-generic/qspinlock.h:82:6: note: Assuming the condition
is false
if (likely(atomic_try_cmpxchg_acquire(&lock->val, &val,
_Q_LOCKED_VAL)))
^
include/linux/compiler.h:77:20: note: expanded from macro 'likely'
# define likely(x) __builtin_expect(!!(x), 1)
^~~~~~~~~~~~~~~~~~~~~~~~~~
include/asm-generic/qspinlock.h:82:2: note: Taking false branch
if (likely(atomic_try_cmpxchg_acquire(&lock->val, &val,
_Q_LOCKED_VAL)))
^
include/asm-generic/qspinlock.h:85:2: note: Calling
'queued_spin_lock_slowpath'
queued_spin_lock_slowpath(lock, val);
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
arch/x86/include/asm/qspinlock.h:51:2: note: Calling
'pv_queued_spin_lock_slowpath'
pv_queued_spin_lock_slowpath(lock, val);
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
arch/x86/include/asm/paravirt.h:585:2: note: Assigned value is
garbage or undefined
PVOP_VCALL2(lock.queued_spin_lock_slowpath, lock, val);
^
arch/x86/include/asm/paravirt_types.h:547:2: note: expanded from
macro 'PVOP_VCALL2'
__PVOP_VCALL(op, PVOP_CALL_ARG1(arg1), PVOP_CALL_ARG2(arg2))
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
arch/x86/include/asm/paravirt_types.h:491:8: note: expanded from
macro '__PVOP_VCALL'
(void)____PVOP_CALL(, op, CLBR_ANY, PVOP_VCALL_CLOBBERS,
\

^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
arch/x86/include/asm/paravirt_types.h:446:3: note: expanded from
macro '____PVOP_CALL'
PVOP_CALL_ARGS;
\
^~~~~~~~~~~~~~
arch/x86/include/asm/paravirt_types.h:404:16: note: expanded from
macro 'PVOP_CALL_ARGS'
unsigned long __edi = __edi, __esi = __esi, \
^ ~~~~~
arch/x86/include/asm/paravirt.h:590:2: warning: Assigned value is
garbage or undefined [clang-analyzer-core.uninitialized.Assign]
PVOP_ALT_VCALLEE1(lock.queued_spin_unlock, lock,
^
arch/x86/include/asm/paravirt_types.h:541:2: note: expanded from
macro 'PVOP_ALT_VCALLEE1'
__PVOP_ALT_VCALLEESAVE(op, alt, cond, PVOP_CALL_ARG1(arg1))
^
arch/x86/include/asm/paravirt_types.h:504:8: note: expanded from
macro '__PVOP_ALT_VCALLEESAVE'
(void)____PVOP_ALT_CALL(, op.func, alt, cond, CLBR_RET_REG,
\
^
arch/x86/include/asm/paravirt_types.h:460:3: note: expanded from
macro '____PVOP_ALT_CALL'
PVOP_CALL_ARGS;
\
^
arch/x86/include/asm/paravirt_types.h:404:16: note: expanded from
macro 'PVOP_CALL_ARGS'
unsigned long __edi = __edi, __esi = __esi, \
^
kernel/trace/trace.c:9948:6: note: Assuming 'tracepoint_printk' is 0
if (tracepoint_printk) {
^~~~~~~~~~~~~~~~~
kernel/trace/trace.c:9948:2: note: Taking false branch
if (tracepoint_printk) {
^
kernel/trace/trace.c:9957:2: note: Calling 'tracer_alloc_buffers'
tracer_alloc_buffers();
^~~~~~~~~~~~~~~~~~~~~~
kernel/trace/trace.c:9821:6: note: Assuming the condition is false
if (security_locked_down(LOCKDOWN_TRACEFS)) {
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
kernel/trace/trace.c:9821:2: note: Taking false branch
if (security_locked_down(LOCKDOWN_TRACEFS)) {
^
kernel/trace/trace.c:9830:15: note: TRACE_ITER_LAST_BIT is <= 32
BUILD_BUG_ON(TRACE_ITER_LAST_BIT > TRACE_FLAGS_MAX_SIZE);
^
include/linux/build_bug.h:50:19: note: expanded from macro
'BUILD_BUG_ON'
BUILD_BUG_ON_MSG(condition, "BUILD_BUG_ON failed: " #condition)
^~~~~~~~~
include/linux/build_bug.h:39:58: note: expanded from macro
'BUILD_BUG_ON_MSG'
#define BUILD_BUG_ON_MSG(cond, msg) compiletime_assert(!(cond), msg)

vim +/ret +6245 fs/ext4/mballoc.c

7360d1731e5dc7 Lukas Czerner 2010-10-27 6220 55a3430685e837 Wang
Jianchao 2021-07-24 6221 static int ext4_try_to_trim_range(struct
super_block *sb,
55a3430685e837 Wang Jianchao 2021-07-24 6222 struct ext4_buddy *e4b,
ext4_grpblk_t start,
55a3430685e837 Wang Jianchao 2021-07-24 6223 ext4_grpblk_t max,
ext4_grpblk_t minblocks)
55a3430685e837 Wang Jianchao 2021-07-24 6224 {
55a3430685e837 Wang Jianchao 2021-07-24 6225 ext4_grpblk_t next,
count, free_count;
55a3430685e837 Wang Jianchao 2021-07-24 6226 void *bitmap;
55a3430685e837 Wang Jianchao 2021-07-24 6227 int ret = 0;
55a3430685e837 Wang Jianchao 2021-07-24 6228 55a3430685e837 Wang
Jianchao 2021-07-24 6229 bitmap = e4b->bd_bitmap;
55a3430685e837 Wang Jianchao 2021-07-24 6230 start =
(e4b->bd_info->bb_first_free > start) ?
55a3430685e837 Wang Jianchao 2021-07-24 6231
e4b->bd_info->bb_first_free : start;
55a3430685e837 Wang Jianchao 2021-07-24 6232 count = 0;
55a3430685e837 Wang Jianchao 2021-07-24 6233 free_count = 0;
55a3430685e837 Wang Jianchao 2021-07-24 6234 55a3430685e837 Wang
Jianchao 2021-07-24 6235 while (start <= max) {
55a3430685e837 Wang Jianchao 2021-07-24 6236 start =
mb_find_next_zero_bit(bitmap, max + 1, start);
55a3430685e837 Wang Jianchao 2021-07-24 6237 if (start > max)
55a3430685e837 Wang Jianchao 2021-07-24 6238 break;
55a3430685e837 Wang Jianchao 2021-07-24 6239 next =
mb_find_next_bit(bitmap, max + 1, start);
55a3430685e837 Wang Jianchao 2021-07-24 6240 55a3430685e837 Wang
Jianchao 2021-07-24 6241 if ((next - start) >= minblocks) {
55a3430685e837 Wang Jianchao 2021-07-24 6242 ret =
ext4_trim_extent(sb, start, next - start, e4b);
55a3430685e837 Wang Jianchao 2021-07-24 6243 if (ret && ret !=
-EOPNOTSUPP)
55a3430685e837 Wang Jianchao 2021-07-24 6244 break;
55a3430685e837 Wang Jianchao 2021-07-24 @6245 ret = 0;
55a3430685e837 Wang Jianchao 2021-07-24 6246 count += next - start;
55a3430685e837 Wang Jianchao 2021-07-24 6247 }
55a3430685e837 Wang Jianchao 2021-07-24 6248 free_count += next - start;
55a3430685e837 Wang Jianchao 2021-07-24 6249 start = next + 1;
55a3430685e837 Wang Jianchao 2021-07-24 6250 55a3430685e837 Wang
Jianchao 2021-07-24 6251 if (fatal_signal_pending(current)) {
55a3430685e837 Wang Jianchao 2021-07-24 6252 count = -ERESTARTSYS;
55a3430685e837 Wang Jianchao 2021-07-24 6253 break;
55a3430685e837 Wang Jianchao 2021-07-24 6254 }
55a3430685e837 Wang Jianchao 2021-07-24 6255 55a3430685e837 Wang
Jianchao 2021-07-24 6256 if (need_resched()) {
55a3430685e837 Wang Jianchao 2021-07-24 6257 ext4_unlock_group(sb,
e4b->bd_group);
55a3430685e837 Wang Jianchao 2021-07-24 6258 cond_resched();
55a3430685e837 Wang Jianchao 2021-07-24 6259 ext4_lock_group(sb,
e4b->bd_group);
55a3430685e837 Wang Jianchao 2021-07-24 6260 }
55a3430685e837 Wang Jianchao 2021-07-24 6261 55a3430685e837 Wang
Jianchao 2021-07-24 6262 if ((e4b->bd_info->bb_free - free_count) <
minblocks)
55a3430685e837 Wang Jianchao 2021-07-24 6263 break;
55a3430685e837 Wang Jianchao 2021-07-24 6264 }
55a3430685e837 Wang Jianchao 2021-07-24 6265 55a3430685e837 Wang
Jianchao 2021-07-24 6266 return count;
55a3430685e837 Wang Jianchao 2021-07-24 6267 }
55a3430685e837 Wang Jianchao 2021-07-24 6268
---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/[email protected].org


Attachments:
.config.gz (28.72 kB)
Attached Message Part (154.00 B)
Download all attachments

2021-07-26 03:44:21

by Guoqing Jiang

[permalink] [raw]
Subject: Re: [PATCH V3 1/5] ext4: remove the 'group' parameter of ext4_trim_extent


On 7/24/21 3:41 PM, Wang Jianchao wrote:
> -static int ext4_trim_extent(struct super_block *sb, int start, int count,
> - ext4_group_t group, struct ext4_buddy *e4b)
> +static int ext4_trim_extent(struct super_block *sb,
> + int start, int count, struct ext4_buddy *e4b)

Nit, seems only need to change the second line.

Thanks,
Guoqing

2021-07-26 07:06:58

by Wang Jianchao

[permalink] [raw]
Subject: Re: [PATCH V3 5/5] ext4: make fallocate retry when err is ENOSPC



On 2021/7/26 11:40 AM, Guoqing Jiang wrote:
> Hi,
>
> On 7/24/21 3:41 PM, Wang Jianchao wrote:
>> From: Wang Jianchao <[email protected]>
>>
>> The blocks may be waiting for journal commit to be freed back to
>> mb buddy. Let fallocate wait and retry in that case.
>>
>> Signed-off-by: Wang Jianchao <[email protected]>
>> ---
>>   fs/ext4/extents.c | 6 +++++-
>>   1 file changed, 5 insertions(+), 1 deletion(-)
>>
>> diff --git a/fs/ext4/extents.c b/fs/ext4/extents.c
>> index 92ad64b89d9b..ad0b874d3448 100644
>> --- a/fs/ext4/extents.c
>> +++ b/fs/ext4/extents.c
>> @@ -4635,7 +4635,7 @@ long ext4_fallocate(struct file *file, int mode, loff_t offset, loff_t len)
>>       struct inode *inode = file_inode(file);
>>       loff_t new_size = 0;
>>       unsigned int max_blocks;
>> -    int ret = 0;
>> +    int ret = 0, retries = 0;
>>       int flags;
>>       ext4_lblk_t lblk;
>>       unsigned int blkbits = inode->i_blkbits;
>> @@ -4656,6 +4656,7 @@ long ext4_fallocate(struct file *file, int mode, loff_t offset, loff_t len)
>>                FALLOC_FL_INSERT_RANGE))
>>           return -EOPNOTSUPP;
>>   +retry:
>>       ext4_fc_start_update(inode);
>>         if (mode & FALLOC_FL_PUNCH_HOLE) {
>> @@ -4722,6 +4723,9 @@ long ext4_fallocate(struct file *file, int mode, loff_t offset, loff_t len)
>>       trace_ext4_fallocate_exit(inode, offset, max_blocks, ret);
>>   exit:
>>       ext4_fc_stop_update(inode);
>> +    if (ret == -ENOSPC && ext4_should_retry_alloc(inode->i_sb, &retries))
>> +        goto retry;
>> +
>
> Not sure if it is necessary since ext4_alloc_file_blocks already retries allocate.

Yes, this patch should be get rid of.
But it is indeed helpful to fix the xfstest generic/371 which does concurrently write/rm
and fallocate/rm. I'll figure out some other way to improve that

Thanks
Jianchao

>
> Thanks,
> Guoqing