2007-07-03 03:28:38

by Andrew Morton

[permalink] [raw]
Subject: Re: [PATCH] Teach do_mpage_readpage() about unwritten buffers

On Tue, 3 Jul 2007 11:10:19 +1000 David Chinner <[email protected]> wrote:

> Teach do_mpage_readpage() about unwritten extents so we can
> always map them in get_blocks rather than they are are holes on
> read. Allows setup_swap_extents() to use preallocated files on XFS
> filesystems for swap files without ever needing to convert them.
>
> Signed-Off-By: Dave Chinner <[email protected]>
>
> ---
> fs/mpage.c | 5 +++--
> fs/xfs/linux-2.6/xfs_aops.c | 13 +++----------
> 2 files changed, 6 insertions(+), 12 deletions(-)
>
> Index: 2.6.x-xfs-new/fs/mpage.c
> ===================================================================
> --- 2.6.x-xfs-new.orig/fs/mpage.c 2007-05-29 16:17:59.000000000 +1000
> +++ 2.6.x-xfs-new/fs/mpage.c 2007-06-27 22:39:35.568852270 +1000
> @@ -207,7 +207,8 @@ do_mpage_readpage(struct bio *bio, struc
> * Map blocks using the result from the previous get_blocks call first.
> */
> nblocks = map_bh->b_size >> blkbits;
> - if (buffer_mapped(map_bh) && block_in_file > *first_logical_block &&
> + if (buffer_mapped(map_bh) && !buffer_unwritten(map_bh) &&
> + block_in_file > *first_logical_block &&
> block_in_file < (*first_logical_block + nblocks)) {
> unsigned map_offset = block_in_file - *first_logical_block;
> unsigned last = nblocks - map_offset;
> @@ -242,7 +243,7 @@ do_mpage_readpage(struct bio *bio, struc
> *first_logical_block = block_in_file;
> }
>
> - if (!buffer_mapped(map_bh)) {
> + if (!buffer_mapped(map_bh) || buffer_unwritten(map_bh)) {
> fully_mapped = 0;
> if (first_hole == blocks_per_page)
> first_hole = page_block;
> Index: 2.6.x-xfs-new/fs/xfs/linux-2.6/xfs_aops.c
> ===================================================================
> --- 2.6.x-xfs-new.orig/fs/xfs/linux-2.6/xfs_aops.c 2007-06-05 22:14:39.000000000 +1000
> +++ 2.6.x-xfs-new/fs/xfs/linux-2.6/xfs_aops.c 2007-06-27 22:39:29.545636749 +1000
> @@ -1340,16 +1340,9 @@ __xfs_get_blocks(
> return 0;
>
> if (iomap.iomap_bn != IOMAP_DADDR_NULL) {
> - /*
> - * For unwritten extents do not report a disk address on
> - * the read case (treat as if we're reading into a hole).
> - */
> - if (create || !(iomap.iomap_flags & IOMAP_UNWRITTEN)) {
> - xfs_map_buffer(bh_result, &iomap, offset,
> - inode->i_blkbits);
> - }
> - if (create && (iomap.iomap_flags & IOMAP_UNWRITTEN)) {
> - if (direct)
> + xfs_map_buffer(bh_result, &iomap, offset, inode->i_blkbits);
> + if (iomap.iomap_flags & IOMAP_UNWRITTEN) {
> + if (create && direct)
> bh_result->b_private = inode;
> set_buffer_unwritten(bh_result);
> }

Seems sane, although one does wonder whether it's a worthy tradeoff. We
add additional overhead to readpage[s]() just to avoid some IO during
mkswap?

Also, I wonder what ext4 does in this situation?


2007-07-03 03:52:36

by David Chinner

[permalink] [raw]
Subject: Re: [PATCH] Teach do_mpage_readpage() about unwritten buffers

On Mon, Jul 02, 2007 at 08:28:27PM -0700, Andrew Morton wrote:
> On Tue, 3 Jul 2007 11:10:19 +1000 David Chinner <[email protected]> wrote:
>
> Seems sane, although one does wonder whether it's a worthy tradeoff. We
> add additional overhead to readpage[s]() just to avoid some IO during
> mkswap?

It removes the hidden magic from XFS that hides unwritten extents
behind holes so that they get zero filled by readpages. With other
filesystems starting to use unwritten extents, these sorts of tricksy
hacks should be avoided and we should be explicitly handling unwritten
buffers just like we explicitly handle holes.

The side effect of this change is that other things (like sys_swapon)
behave sanely when faced with unwritten extents (i.e. they are not
detected incorrectly as holes).

FWIW, using unwritten extents for swap files means you can allocate
gigabytes of swap space on demand, even under severe memory pressure
because you don't need to write those gigabytes of data to disk.
Also, the only way to read the contents of the swap file is to go
directly to the underlying device so even if you screw up the
permissions on the swap file it will still always read as zeros.
So there are some benefits here....

Cheers,

Dave.
--
Dave Chinner
Principal Engineer
SGI Australian Software Group