2022-11-16 13:52:05

by Christoph Hellwig

Subject: generic_writepages & jbd2 and ext4

Hi all,

I've recently started looking into killing off the ->writepage method,
and as an initial subproject, killing off external uses of generic_writepages.
One of the two remaining callers is in jbd2, and I'm a bit confused about
it.

jbd2_journal_submit_inode_data_buffers has two comments that explicitly
ask for ->writepage as that doesn't allocate data:

/*
* write the filemap data using writepage() address_space_operations.
* We don't do block allocation here even for delalloc. We don't
* use writepages() because with delayed allocation we may be doing
* block allocation in writepages().
*/

/*
* submit the inode data buffers. We use writepage
* instead of writepages. Because writepages can do
* block allocation with delalloc. We need to write
* only allocated blocks here.
*/

and these look really strange to me. ->writepage and ->writepages per
their documented VM/VFS semantics don't differ in what they allocate,
so this seems to reverse engineer ext4 internal behavior in some
way. Either way, looping over ->writepage just for that is rather
inefficient. If jbd2 really wants a way to skip delalloc conversion,
can we come up with a flag in struct writeback_control for that?

Is there anyone familiar enough with this code who would be willing
to give it a try?


2022-11-16 15:38:04

by Jan Kara

Subject: Re: generic_writepages & jbd2 and ext4

On Wed 16-11-22 14:50:16, Christoph Hellwig wrote:
> Hi all,
>
> I've recently started looking into killing off the ->writepage method,
> and as an initial subproject, killing off external uses of generic_writepages.
> One of the two remaining callers is in jbd2, and I'm a bit confused about
> it.
>
> jbd2_journal_submit_inode_data_buffers has two comments that explicitly
> ask for ->writepage as that doesn't allocate data:
>
> /*
> * write the filemap data using writepage() address_space_operations.
> * We don't do block allocation here even for delalloc. We don't
> * use writepages() because with delayed allocation we may be doing
> * block allocation in writepages().
> */
>
> /*
> * submit the inode data buffers. We use writepage
> * instead of writepages. Because writepages can do
> * block allocation with delalloc. We need to write
> * only allocated blocks here.
> */
>
> and these look really strange to me. ->writepage and ->writepages per
> their documented VM/VFS semantics don't differ in what they allocate,
> so this seems to reverse engineer ext4 internal behavior in some
> way. Either way, looping over ->writepage just for that is rather
> inefficient. If jbd2 really wants a way to skip delalloc conversion,
> can we come up with a flag in struct writeback_control for that?
>
> Is there anyone familiar enough with this code who would be willing
> to give it a try?

Yes, I wrote that code quite a few years ago :) And I agree JBD2 is
abusing internal knowledge about ext4 here. So yes, a writeback_control
flag that propagates the information to the ->writepages method should do
the trick. I'll have a look into that.
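
For illustration only, the idea discussed above can be sketched as a small
user-space mock. This is not kernel code and not the eventual patch: the
names mock_page, mock_wbc, no_alloc and mock_writepages are all hypothetical
stand-ins, and the real struct writeback_control has no field by that name
as far as this thread states. The sketch only shows the shape of the
proposal: a single ->writepages-style pass that skips delalloc pages when
the caller asks for no block allocation, instead of looping over
->writepage.

```c
/* User-space mock, not kernel code: every name below is hypothetical. */
#include <stdbool.h>
#include <stddef.h>

/* Stand-in for a page cache page; 'mapped' means blocks are allocated. */
struct mock_page {
	bool dirty;
	bool mapped;
};

/* Stand-in for struct writeback_control with one assumed new flag. */
struct mock_wbc {
	bool no_alloc;	/* hypothetical: caller forbids block allocation */
};

/*
 * Mock ->writepages: handles all dirty pages in one pass. When
 * wbc->no_alloc is set, delalloc (unmapped) pages are left dirty
 * rather than having blocks allocated for them.
 * Returns the number of pages written.
 */
static size_t mock_writepages(struct mock_page *pages, size_t n,
			      const struct mock_wbc *wbc)
{
	size_t written = 0;

	for (size_t i = 0; i < n; i++) {
		struct mock_page *p = &pages[i];

		if (!p->dirty)
			continue;
		if (!p->mapped) {
			if (wbc->no_alloc)
				continue;	/* skip delalloc conversion */
			p->mapped = true;	/* pretend to allocate blocks */
		}
		p->dirty = false;		/* pretend to submit the page */
		written++;
	}
	return written;
}
```

With no_alloc set, a journal-commit-style caller would write only pages
that already have blocks; ordinary writeback would leave the flag clear and
keep today's behavior.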

Honza

--
Jan Kara <[email protected]>
SUSE Labs, CR