2024-02-13 16:30:56

by Darrick J. Wong

[permalink] [raw]
Subject: Re: [RFC v2 10/14] iomap: fix iomap_dio_zero() for fs bs > system page size

On Tue, Feb 13, 2024 at 10:37:09AM +0100, Pankaj Raghav (Samsung) wrote:
> From: Pankaj Raghav <[email protected]>
>
> iomap_dio_zero() will pad a fs block with zeroes if the direct IO size
> < fs block size. iomap_dio_zero() has an implicit assumption that fs block
> size < page_size. This is true for most filesystems at the moment.
>
> If the block size > page size, this will send the contents of the page
> next to zero page(as len > PAGE_SIZE) to the underlying block device,
> causing FS corruption.
>
> iomap is a generic infrastructure and it should not make any assumptions
> about the fs block size and the page size of the system.
>
> Signed-off-by: Pankaj Raghav <[email protected]>
> ---
> fs/iomap/direct-io.c | 13 +++++++++++--
> 1 file changed, 11 insertions(+), 2 deletions(-)
>
> diff --git a/fs/iomap/direct-io.c b/fs/iomap/direct-io.c
> index bcd3f8cf5ea4..04f6c5548136 100644
> --- a/fs/iomap/direct-io.c
> +++ b/fs/iomap/direct-io.c
> @@ -239,14 +239,23 @@ static void iomap_dio_zero(const struct iomap_iter *iter, struct iomap_dio *dio,
> struct page *page = ZERO_PAGE(0);
> struct bio *bio;
>
> - bio = iomap_dio_alloc_bio(iter, dio, 1, REQ_OP_WRITE | REQ_SYNC | REQ_IDLE);
> + WARN_ON_ONCE(len > (BIO_MAX_VECS * PAGE_SIZE));
> +
> + bio = iomap_dio_alloc_bio(iter, dio, BIO_MAX_VECS,
> + REQ_OP_WRITE | REQ_SYNC | REQ_IDLE);
> fscrypt_set_bio_crypt_ctx(bio, inode, pos >> inode->i_blkbits,
> GFP_KERNEL);
> +
> bio->bi_iter.bi_sector = iomap_sector(&iter->iomap, pos);
> bio->bi_private = dio;
> bio->bi_end_io = iomap_dio_bio_end_io;
>
> - __bio_add_page(bio, page, len, 0);
> + while (len) {
> + unsigned int io_len = min_t(unsigned int, len, PAGE_SIZE);

What was the result of all that discussion about using the PMD-sized
zero-folio the last time this patch was submitted? Did that prove to be
unwieldly, or did it require enough extra surgery to become its own
series?

(The code here looks good to me.)

--D

> +
> + __bio_add_page(bio, page, io_len, 0);
> + len -= io_len;
> + }
> iomap_dio_submit_bio(iter, dio, bio, pos);
> }
>
> --
> 2.43.0
>
>


2024-02-14 21:01:05

by Pankaj Raghav (Samsung)

[permalink] [raw]
Subject: Re: [RFC v2 10/14] iomap: fix iomap_dio_zero() for fs bs > system page size

On Tue, Feb 13, 2024 at 08:30:37AM -0800, Darrick J. Wong wrote:
> On Tue, Feb 13, 2024 at 10:37:09AM +0100, Pankaj Raghav (Samsung) wrote:
> > From: Pankaj Raghav <[email protected]>
> >
> > iomap_dio_zero() will pad a fs block with zeroes if the direct IO size
> > < fs block size. iomap_dio_zero() has an implicit assumption that fs block
> > size < page_size. This is true for most filesystems at the moment.
> >
> > If the block size > page size, this will send the contents of the page
> > next to zero page(as len > PAGE_SIZE) to the underlying block device,
> > causing FS corruption.
> >
> > iomap is a generic infrastructure and it should not make any assumptions
> > about the fs block size and the page size of the system.
> >
> > Signed-off-by: Pankaj Raghav <[email protected]>
> > ---
> > fs/iomap/direct-io.c | 13 +++++++++++--
> > 1 file changed, 11 insertions(+), 2 deletions(-)
> >
> > diff --git a/fs/iomap/direct-io.c b/fs/iomap/direct-io.c
> > index bcd3f8cf5ea4..04f6c5548136 100644
> > --- a/fs/iomap/direct-io.c
> > +++ b/fs/iomap/direct-io.c
> > @@ -239,14 +239,23 @@ static void iomap_dio_zero(const struct iomap_iter *iter, struct iomap_dio *dio,
> > struct page *page = ZERO_PAGE(0);
> > struct bio *bio;
> >
> > - bio = iomap_dio_alloc_bio(iter, dio, 1, REQ_OP_WRITE | REQ_SYNC | REQ_IDLE);
> > + WARN_ON_ONCE(len > (BIO_MAX_VECS * PAGE_SIZE));
> > +
> > + bio = iomap_dio_alloc_bio(iter, dio, BIO_MAX_VECS,
> > + REQ_OP_WRITE | REQ_SYNC | REQ_IDLE);
> > fscrypt_set_bio_crypt_ctx(bio, inode, pos >> inode->i_blkbits,
> > GFP_KERNEL);
> > +
> > bio->bi_iter.bi_sector = iomap_sector(&iter->iomap, pos);
> > bio->bi_private = dio;
> > bio->bi_end_io = iomap_dio_bio_end_io;
> >
> > - __bio_add_page(bio, page, len, 0);
> > + while (len) {
> > + unsigned int io_len = min_t(unsigned int, len, PAGE_SIZE);
>
> What was the result of all that discussion about using the PMD-sized
> zero-folio the last time this patch was submitted? Did that prove to be
> unwieldly, or did it require enough extra surgery to become its own
> series?
>

It proved a bit unwieldly to me at least as I did not know any straight
forward way to do it at the time. So I thought I will keep this approach
as it is, and add support for the PMD-sized zero folio for later
improvement.

> (The code here looks good to me.)

Thanks!
>
> --D
>
> > +
> > + __bio_add_page(bio, page, io_len, 0);
> > + len -= io_len;
> > + }
> > iomap_dio_submit_bio(iter, dio, bio, pos);
> > }
> >
> > --
> > 2.43.0
> >
> >