2020-08-25 13:10:00

by Ritesh Harjani

[permalink] [raw]
Subject: Re: [PATCH] iomap: iomap_bmap should accept unwritten maps



On 5/6/20 1:00 AM, Darrick J. Wong wrote:
> On Tue, May 05, 2020 at 07:36:08PM +0100, Yuxuan Shui wrote:
>> commit ac58e4fb03f9d111d733a4ad379d06eef3a24705 moved ext4_bmap from
>> generic_block_bmap to iomap_bmap, this introduced a regression which
>> prevents some user from using previously working swapfiles. The kernel
>> will complain about holes while there is none.
>>
>> What is happening here is that the swapfile has unwritten mappings,
>> which is rejected by iomap_bmap, but was accepted by ext4_get_block.
>
> ...which is why ext4 ought to use iomap_swapfile_activate.

I tested this patch (diff below), which seems to be working fine for me
for straight forward use case of swapon/swapoff on ext4.
Could you give it a try?

<log showing ext4_iomap_swap_activate path kicking in>
swapon 1283 [000] 438.651028: 250000 cpu-clock:pppH:
ffffffff817f7f56 percpu_counter_add_batch+0x26 (/boot/vmlinux)
ffffffff813a61d0 ext4_es_lookup_extent+0x1d0 (/boot/vmlinux)
ffffffff813b8095 ext4_map_blocks+0x65 (/boot/vmlinux)
ffffffff813b8d4b ext4_iomap_begin_report+0x10b (/boot/vmlinux)
ffffffff81367f58 iomap_apply+0xa8 (/boot/vmlinux)
ffffffff8136d1c3 iomap_swapfile_activate+0xb3 (/boot/vmlinux)
ffffffff813b51a5 ext4_iomap_swap_activate+0x15 (/boot/vmlinux)
ffffffff812a3a27 __do_sys_swapon+0xb37 (/boot/vmlinux)
ffffffff812a40f6 __x64_sys_swapon+0x16 (/boot/vmlinux)
ffffffff820b760a do_syscall_64+0x5a (/boot/vmlinux)
ffffffff8220007c entry_SYSCALL_64+0x7c (/boot/vmlinux)
7ffff7de68bb swapon+0xb (/usr/lib/x86_64-linux-gnu/libc-2.30.so)
66706177732f756d [unknown] ([unknown])

<shows that swapfile(which I setup using fallocate) has some used bytes>
$ swapon -s
Filename Type Size Used
Priority
/home/qemu/swapfile-test file 2097148 42312 -2


@Jan/Ted/Darrick,

I am not that familiar with how swap subsystem works.
So, is there anything else you feel is required apart from below changes
for supporting swap_activate via iomap? I did test both swapon/swapoff
and see that swap is getting used up on ext4 with delalloc mount opt.

As I see from code, iomap_swapfile_activate is mainly looking for
extent mapping information of that file to pass to swap subsystem.
And IIUC, "ext4_iomap_report_ops" is meant exactly for that.
Same as how we use it in ext4_fiemap().


diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
index 6eae17758ece..1e390157541d 100644
--- a/fs/ext4/inode.c
+++ b/fs/ext4/inode.c
@@ -3614,6 +3614,13 @@ static int ext4_set_page_dirty(struct page *page)
return __set_page_dirty_buffers(page);
}

+static int ext4_iomap_swap_activate(struct swap_info_struct *sis,
+ struct file *file, sector_t *span)
+{
+ return iomap_swapfile_activate(sis, file, span,
+ &ext4_iomap_report_ops);
+}
+
static const struct address_space_operations ext4_aops = {
.readpage = ext4_readpage,
.readahead = ext4_readahead,
@@ -3629,6 +3636,7 @@ static const struct address_space_operations
ext4_aops = {
.migratepage = buffer_migrate_page,
.is_partially_uptodate = block_is_partially_uptodate,
.error_remove_page = generic_error_remove_page,
+ .swap_activate = ext4_iomap_swap_activate,
};

static const struct address_space_operations ext4_journalled_aops = {
@@ -3645,6 +3653,7 @@ static const struct address_space_operations
ext4_journalled_aops = {
.direct_IO = noop_direct_IO,
.is_partially_uptodate = block_is_partially_uptodate,
.error_remove_page = generic_error_remove_page,
+ .swap_activate = ext4_iomap_swap_activate,
};

static const struct address_space_operations ext4_da_aops = {
@@ -3662,6 +3671,7 @@ static const struct address_space_operations
ext4_da_aops = {
.migratepage = buffer_migrate_page,
.is_partially_uptodate = block_is_partially_uptodate,
.error_remove_page = generic_error_remove_page,
+ .swap_activate = ext4_iomap_swap_activate,
};

static const struct address_space_operations ext4_dax_aops = {
@@ -3670,6 +3680,7 @@ static const struct address_space_operations
ext4_dax_aops = {
.set_page_dirty = noop_set_page_dirty,
.bmap = ext4_bmap,
.invalidatepage = noop_invalidatepage,
+ .swap_activate = ext4_iomap_swap_activate,
};

void ext4_set_aops(struct inode *inode)


2020-08-25 15:50:34

by Darrick J. Wong

[permalink] [raw]
Subject: Re: [PATCH] iomap: iomap_bmap should accept unwritten maps

On Tue, Aug 25, 2020 at 06:06:49PM +0530, Ritesh Harjani wrote:
>
>
> On 5/6/20 1:00 AM, Darrick J. Wong wrote:
> > On Tue, May 05, 2020 at 07:36:08PM +0100, Yuxuan Shui wrote:
> > > commit ac58e4fb03f9d111d733a4ad379d06eef3a24705 moved ext4_bmap from
> > > generic_block_bmap to iomap_bmap, this introduced a regression which
> > > prevents some user from using previously working swapfiles. The kernel
> > > will complain about holes while there is none.
> > >
> > > What is happening here is that the swapfile has unwritten mappings,
> > > which is rejected by iomap_bmap, but was accepted by ext4_get_block.
> >
> > ...which is why ext4 ought to use iomap_swapfile_activate.
>
> I tested this patch (diff below), which seems to be working fine for me
> for straight forward use case of swapon/swapoff on ext4.
> Could you give it a try?
>
> <log showing ext4_iomap_swap_activate path kicking in>
> swapon 1283 [000] 438.651028: 250000 cpu-clock:pppH:
> ffffffff817f7f56 percpu_counter_add_batch+0x26 (/boot/vmlinux)
> ffffffff813a61d0 ext4_es_lookup_extent+0x1d0 (/boot/vmlinux)
> ffffffff813b8095 ext4_map_blocks+0x65 (/boot/vmlinux)
> ffffffff813b8d4b ext4_iomap_begin_report+0x10b (/boot/vmlinux)
> ffffffff81367f58 iomap_apply+0xa8 (/boot/vmlinux)
> ffffffff8136d1c3 iomap_swapfile_activate+0xb3 (/boot/vmlinux)
> ffffffff813b51a5 ext4_iomap_swap_activate+0x15 (/boot/vmlinux)
> ffffffff812a3a27 __do_sys_swapon+0xb37 (/boot/vmlinux)
> ffffffff812a40f6 __x64_sys_swapon+0x16 (/boot/vmlinux)
> ffffffff820b760a do_syscall_64+0x5a (/boot/vmlinux)
> ffffffff8220007c entry_SYSCALL_64+0x7c (/boot/vmlinux)
> 7ffff7de68bb swapon+0xb (/usr/lib/x86_64-linux-gnu/libc-2.30.so)
> 66706177732f756d [unknown] ([unknown])
>
> <shows that swapfile(which I setup using fallocate) has some used bytes>
> $ swapon -s
> Filename Type Size Used
> Priority
> /home/qemu/swapfile-test file 2097148 42312 -2
>
>
> @Jan/Ted/Darrick,
>
> I am not that familiar with how swap subsystem works.
> So, is there anything else you feel is required apart from below changes
> for supporting swap_activate via iomap? I did test both swapon/swapoff
> and see that swap is getting used up on ext4 with delalloc mount opt.
>
> As I see from code, iomap_swapfile_activate is mainly looking for
> extent mapping information of that file to pass to swap subsystem.
> And IIUC, "ext4_iomap_report_ops" is meant exactly for that.
> Same as how we use it in ext4_fiemap().

<nod> The swap code doesn't even care about the file offsets, it just
wants the physical mappings, and it only wants to find real and
unwritten mappings (i.e. no holes, delalloc, or shared extents).

So ... I think it's ok to use the same iomap ops as fiemap.

FWIW the xfs version uses xfs_read_iomap_ops for reads, readahead,
fiemap, and swapfiles, so this is ... probably fine, especially if it
passes the swap group fstests. :)

--D

>
> diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
> index 6eae17758ece..1e390157541d 100644
> --- a/fs/ext4/inode.c
> +++ b/fs/ext4/inode.c
> @@ -3614,6 +3614,13 @@ static int ext4_set_page_dirty(struct page *page)
> return __set_page_dirty_buffers(page);
> }
>
> +static int ext4_iomap_swap_activate(struct swap_info_struct *sis,
> + struct file *file, sector_t *span)
> +{
> + return iomap_swapfile_activate(sis, file, span,
> + &ext4_iomap_report_ops);
> +}
> +
> static const struct address_space_operations ext4_aops = {
> .readpage = ext4_readpage,
> .readahead = ext4_readahead,
> @@ -3629,6 +3636,7 @@ static const struct address_space_operations ext4_aops
> = {
> .migratepage = buffer_migrate_page,
> .is_partially_uptodate = block_is_partially_uptodate,
> .error_remove_page = generic_error_remove_page,
> + .swap_activate = ext4_iomap_swap_activate,
> };
>
> static const struct address_space_operations ext4_journalled_aops = {
> @@ -3645,6 +3653,7 @@ static const struct address_space_operations
> ext4_journalled_aops = {
> .direct_IO = noop_direct_IO,
> .is_partially_uptodate = block_is_partially_uptodate,
> .error_remove_page = generic_error_remove_page,
> + .swap_activate = ext4_iomap_swap_activate,
> };
>
> static const struct address_space_operations ext4_da_aops = {
> @@ -3662,6 +3671,7 @@ static const struct address_space_operations
> ext4_da_aops = {
> .migratepage = buffer_migrate_page,
> .is_partially_uptodate = block_is_partially_uptodate,
> .error_remove_page = generic_error_remove_page,
> + .swap_activate = ext4_iomap_swap_activate,
> };
>
> static const struct address_space_operations ext4_dax_aops = {
> @@ -3670,6 +3680,7 @@ static const struct address_space_operations
> ext4_dax_aops = {
> .set_page_dirty = noop_set_page_dirty,
> .bmap = ext4_bmap,
> .invalidatepage = noop_invalidatepage,
> + .swap_activate = ext4_iomap_swap_activate,
> };
>
> void ext4_set_aops(struct inode *inode)

2020-08-25 15:57:29

by Yuxuan Shui

[permalink] [raw]
Subject: Re: [PATCH] iomap: iomap_bmap should accept unwritten maps

This patch appears to be working on my machine as well.

On Tue, Aug 25, 2020 at 1:37 PM Ritesh Harjani <[email protected]> wrote:
>
>
>
> On 5/6/20 1:00 AM, Darrick J. Wong wrote:
> > On Tue, May 05, 2020 at 07:36:08PM +0100, Yuxuan Shui wrote:
> >> commit ac58e4fb03f9d111d733a4ad379d06eef3a24705 moved ext4_bmap from
> >> generic_block_bmap to iomap_bmap, this introduced a regression which
> >> prevents some user from using previously working swapfiles. The kernel
> >> will complain about holes while there is none.
> >>
> >> What is happening here is that the swapfile has unwritten mappings,
> >> which is rejected by iomap_bmap, but was accepted by ext4_get_block.
> >
> > ...which is why ext4 ought to use iomap_swapfile_activate.
>
> I tested this patch (diff below), which seems to be working fine for me
> for straight forward use case of swapon/swapoff on ext4.
> Could you give it a try?
>
> <log showing ext4_iomap_swap_activate path kicking in>
> swapon 1283 [000] 438.651028: 250000 cpu-clock:pppH:
> ffffffff817f7f56 percpu_counter_add_batch+0x26 (/boot/vmlinux)
> ffffffff813a61d0 ext4_es_lookup_extent+0x1d0 (/boot/vmlinux)
> ffffffff813b8095 ext4_map_blocks+0x65 (/boot/vmlinux)
> ffffffff813b8d4b ext4_iomap_begin_report+0x10b (/boot/vmlinux)
> ffffffff81367f58 iomap_apply+0xa8 (/boot/vmlinux)
> ffffffff8136d1c3 iomap_swapfile_activate+0xb3 (/boot/vmlinux)
> ffffffff813b51a5 ext4_iomap_swap_activate+0x15 (/boot/vmlinux)
> ffffffff812a3a27 __do_sys_swapon+0xb37 (/boot/vmlinux)
> ffffffff812a40f6 __x64_sys_swapon+0x16 (/boot/vmlinux)
> ffffffff820b760a do_syscall_64+0x5a (/boot/vmlinux)
> ffffffff8220007c entry_SYSCALL_64+0x7c (/boot/vmlinux)
> 7ffff7de68bb swapon+0xb (/usr/lib/x86_64-linux-gnu/libc-2.30.so)
> 66706177732f756d [unknown] ([unknown])
>
> <shows that swapfile(which I setup using fallocate) has some used bytes>
> $ swapon -s
> Filename Type Size Used
> Priority
> /home/qemu/swapfile-test file 2097148 42312 -2
>
>
> @Jan/Ted/Darrick,
>
> I am not that familiar with how swap subsystem works.
> So, is there anything else you feel is required apart from below changes
> for supporting swap_activate via iomap? I did test both swapon/swapoff
> and see that swap is getting used up on ext4 with delalloc mount opt.
>
> As I see from code, iomap_swapfile_activate is mainly looking for
> extent mapping information of that file to pass to swap subsystem.
> And IIUC, "ext4_iomap_report_ops" is meant exactly for that.
> Same as how we use it in ext4_fiemap().
>
>
> diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
> index 6eae17758ece..1e390157541d 100644
> --- a/fs/ext4/inode.c
> +++ b/fs/ext4/inode.c
> @@ -3614,6 +3614,13 @@ static int ext4_set_page_dirty(struct page *page)
> return __set_page_dirty_buffers(page);
> }
>
> +static int ext4_iomap_swap_activate(struct swap_info_struct *sis,
> + struct file *file, sector_t *span)
> +{
> + return iomap_swapfile_activate(sis, file, span,
> + &ext4_iomap_report_ops);
> +}
> +
> static const struct address_space_operations ext4_aops = {
> .readpage = ext4_readpage,
> .readahead = ext4_readahead,
> @@ -3629,6 +3636,7 @@ static const struct address_space_operations
> ext4_aops = {
> .migratepage = buffer_migrate_page,
> .is_partially_uptodate = block_is_partially_uptodate,
> .error_remove_page = generic_error_remove_page,
> + .swap_activate = ext4_iomap_swap_activate,
> };
>
> static const struct address_space_operations ext4_journalled_aops = {
> @@ -3645,6 +3653,7 @@ static const struct address_space_operations
> ext4_journalled_aops = {
> .direct_IO = noop_direct_IO,
> .is_partially_uptodate = block_is_partially_uptodate,
> .error_remove_page = generic_error_remove_page,
> + .swap_activate = ext4_iomap_swap_activate,
> };
>
> static const struct address_space_operations ext4_da_aops = {
> @@ -3662,6 +3671,7 @@ static const struct address_space_operations
> ext4_da_aops = {
> .migratepage = buffer_migrate_page,
> .is_partially_uptodate = block_is_partially_uptodate,
> .error_remove_page = generic_error_remove_page,
> + .swap_activate = ext4_iomap_swap_activate,
> };
>
> static const struct address_space_operations ext4_dax_aops = {
> @@ -3670,6 +3680,7 @@ static const struct address_space_operations
> ext4_dax_aops = {
> .set_page_dirty = noop_set_page_dirty,
> .bmap = ext4_bmap,
> .invalidatepage = noop_invalidatepage,
> + .swap_activate = ext4_iomap_swap_activate,
> };
>
> void ext4_set_aops(struct inode *inode)



--

Regards
Yuxuan Shui

2020-08-25 18:04:36

by Ritesh Harjani

[permalink] [raw]
Subject: Re: [PATCH] iomap: iomap_bmap should accept unwritten maps



On 8/25/20 9:19 PM, Darrick J. Wong wrote:
> On Tue, Aug 25, 2020 at 06:06:49PM +0530, Ritesh Harjani wrote:
>>
>>
>> On 5/6/20 1:00 AM, Darrick J. Wong wrote:
>>> On Tue, May 05, 2020 at 07:36:08PM +0100, Yuxuan Shui wrote:
>>>> commit ac58e4fb03f9d111d733a4ad379d06eef3a24705 moved ext4_bmap from
>>>> generic_block_bmap to iomap_bmap, this introduced a regression which
>>>> prevents some user from using previously working swapfiles. The kernel
>>>> will complain about holes while there is none.
>>>>
>>>> What is happening here is that the swapfile has unwritten mappings,
>>>> which is rejected by iomap_bmap, but was accepted by ext4_get_block.
>>>
>>> ...which is why ext4 ought to use iomap_swapfile_activate.
>>
>> I tested this patch (diff below), which seems to be working fine for me
>> for straight forward use case of swapon/swapoff on ext4.
>> Could you give it a try?
>>
>> <log showing ext4_iomap_swap_activate path kicking in>
>> swapon 1283 [000] 438.651028: 250000 cpu-clock:pppH:
>> ffffffff817f7f56 percpu_counter_add_batch+0x26 (/boot/vmlinux)
>> ffffffff813a61d0 ext4_es_lookup_extent+0x1d0 (/boot/vmlinux)
>> ffffffff813b8095 ext4_map_blocks+0x65 (/boot/vmlinux)
>> ffffffff813b8d4b ext4_iomap_begin_report+0x10b (/boot/vmlinux)
>> ffffffff81367f58 iomap_apply+0xa8 (/boot/vmlinux)
>> ffffffff8136d1c3 iomap_swapfile_activate+0xb3 (/boot/vmlinux)
>> ffffffff813b51a5 ext4_iomap_swap_activate+0x15 (/boot/vmlinux)
>> ffffffff812a3a27 __do_sys_swapon+0xb37 (/boot/vmlinux)
>> ffffffff812a40f6 __x64_sys_swapon+0x16 (/boot/vmlinux)
>> ffffffff820b760a do_syscall_64+0x5a (/boot/vmlinux)
>> ffffffff8220007c entry_SYSCALL_64+0x7c (/boot/vmlinux)
>> 7ffff7de68bb swapon+0xb (/usr/lib/x86_64-linux-gnu/libc-2.30.so)
>> 66706177732f756d [unknown] ([unknown])
>>
>> <shows that swapfile(which I setup using fallocate) has some used bytes>
>> $ swapon -s
>> Filename Type Size Used
>> Priority
>> /home/qemu/swapfile-test file 2097148 42312 -2
>>
>>
>> @Jan/Ted/Darrick,
>>
>> I am not that familiar with how swap subsystem works.
>> So, is there anything else you feel is required apart from below changes
>> for supporting swap_activate via iomap? I did test both swapon/swapoff
>> and see that swap is getting used up on ext4 with delalloc mount opt.
>>
>> As I see from code, iomap_swapfile_activate is mainly looking for
>> extent mapping information of that file to pass to swap subsystem.
>> And IIUC, "ext4_iomap_report_ops" is meant exactly for that.
>> Same as how we use it in ext4_fiemap().
>
> <nod> The swap code doesn't even care about the file offsets, it just
> wants the physical mappings, and it only wants to find real and
> unwritten mappings (i.e. no holes, delalloc, or shared extents).
>
> So ... I think it's ok to use the same iomap ops as fiemap.
>
> FWIW the xfs version uses xfs_read_iomap_ops for reads, readahead,
> fiemap, and swapfiles, so this is ... probably fine, especially if it
> passes the swap group fstests. :)
>

Ohh yes, thanks. :)
I tested "-g swap" fstests and those were fine.
For completion sake, I will go through generic_swapfile_activate()
just to confirm that nothing is missed.
Will try and spin a formal patch early next week -(in LPC this week)

-ritesh