2009-04-14 11:10:20

by Nikanth Karthikesan

[permalink] [raw]
Subject: [PATCH 0/6] Handle bio_alloc failure

Hi Jens

Some of the callers of bio_alloc() assume that it will never fail and always
return bios. But it can fail. This patch set changes those callers to take
action when bio_alloc() fails.

Thanks
Nikanth


2009-04-14 11:18:49

by Jens Axboe

[permalink] [raw]
Subject: Re: [PATCH 0/6] Handle bio_alloc failure

On Tue, Apr 14 2009, Nikanth Karthikesan wrote:
> Hi Jens
>
> Some of the callers of bio_alloc() assume that it will never fail and always
> return bios. But it can fail. This patch set changes those callers to take
> action when bio_alloc() fails.

It will not fail as long as __GFP_WAIT is set, which it is for all 6 of
your patches.

--
Jens Axboe

2009-04-14 11:43:53

by Nikanth Karthikesan

[permalink] [raw]
Subject: Re: [PATCH 0/6] Handle bio_alloc failure

On Tuesday 14 April 2009 16:48:38 Jens Axboe wrote:
> On Tue, Apr 14 2009, Nikanth Karthikesan wrote:
> > Hi Jens
> >
> > Some of the callers of bio_alloc() assume that it will never fail and
> > always return bios. But it can fail. This patch set changes those callers
> > to take action when bio_alloc() fails.
>
> It will not fail as long as __GFP_WAIT is set, which it is for all 6 of
> your patches.

I thought so, but was confused by various places where it was being handled!
Can this be merged then?

Thanks
Nikanth

bio_alloc() will not fail as long as __GFP_WAIT is set. __GFP_WAIT is also set
as part of GFP_KERNEL, GFP_NOIO and GFP_NOFS. Remove unnecessary code to
handle bio_alloc failure in those cases.

Signed-off-by: Nikanth Karthikesan <[email protected]>

---

diff --git a/block/blk-barrier.c b/block/blk-barrier.c
index f7dae57..20b4111 100644
--- a/block/blk-barrier.c
+++ b/block/blk-barrier.c
@@ -319,9 +319,6 @@ int blkdev_issue_flush(struct block_device *bdev, sector_t *error_sector)
return -ENXIO;

bio = bio_alloc(GFP_KERNEL, 0);
- if (!bio)
- return -ENOMEM;
-
bio->bi_end_io = bio_end_empty_barrier;
bio->bi_private = &wait;
bio->bi_bdev = bdev;
diff --git a/block/ioctl.c b/block/ioctl.c
index 0f22e62..ad474d4 100644
--- a/block/ioctl.c
+++ b/block/ioctl.c
@@ -146,8 +146,6 @@ static int blk_ioctl_discard(struct block_device *bdev, uint64_t start,
struct bio *bio;

bio = bio_alloc(GFP_KERNEL, 0);
- if (!bio)
- return -ENOMEM;

bio->bi_end_io = blk_ioc_discard_endio;
bio->bi_bdev = bdev;
diff --git a/drivers/block/loop.c b/drivers/block/loop.c
index ddae808..a3659c1 100644
--- a/drivers/block/loop.c
+++ b/drivers/block/loop.c
@@ -636,8 +636,6 @@ static int loop_switch(struct loop_device *lo, struct file *file)
{
struct switch_request w;
struct bio *bio = bio_alloc(GFP_KERNEL, 0);
- if (!bio)
- return -ENOMEM;
init_completion(&w.wait);
w.file = file;
bio->bi_private = &w;
diff --git a/fs/direct-io.c b/fs/direct-io.c
index da258e7..05763bb 100644
--- a/fs/direct-io.c
+++ b/fs/direct-io.c
@@ -307,8 +307,6 @@ dio_bio_alloc(struct dio *dio, struct block_device *bdev,
struct bio *bio;

bio = bio_alloc(GFP_KERNEL, nr_vecs);
- if (bio == NULL)
- return -ENOMEM;

bio->bi_bdev = bdev;
bio->bi_sector = first_sector;
diff --git a/fs/exofs/inode.c b/fs/exofs/inode.c
index ba8d9fa..ee1f438 100644
--- a/fs/exofs/inode.c
+++ b/fs/exofs/inode.c
@@ -97,15 +97,10 @@ static int pcol_try_alloc(struct page_collect *pcol)
{
int pages = min_t(unsigned, pcol->expected_pages, BIO_MAX_PAGES);

- for (; pages; pages >>= 1) {
+ if (pages)
pcol->bio = bio_alloc(GFP_KERNEL, pages);
- if (likely(pcol->bio))
- return 0;
- }

- EXOFS_ERR("Failed to kcalloc expected_pages=%u\n",
- pcol->expected_pages);
- return -ENOMEM;
+ return 0;
}

static void pcol_free(struct page_collect *pcol)
diff --git a/fs/ext4/extents.c b/fs/ext4/extents.c
index 6132353..2a1cb09 100644
--- a/fs/ext4/extents.c
+++ b/fs/ext4/extents.c
@@ -2416,8 +2416,6 @@ static int ext4_ext_zeroout(struct inode *inode, struct ext4_extent *ex)
len = ee_len;

bio = bio_alloc(GFP_NOIO, len);
- if (!bio)
- return -ENOMEM;
bio->bi_sector = ee_pblock;
bio->bi_bdev = inode->i_sb->s_bdev;

diff --git a/fs/gfs2/ops_fstype.c b/fs/gfs2/ops_fstype.c
index 51883b3..650a730 100644
--- a/fs/gfs2/ops_fstype.c
+++ b/fs/gfs2/ops_fstype.c
@@ -272,11 +272,6 @@ static int gfs2_read_super(struct gfs2_sbd *sdp, sector_t sector)
lock_page(page);

bio = bio_alloc(GFP_NOFS, 1);
- if (unlikely(!bio)) {
- __free_page(page);
- return -ENOBUFS;
- }
-
bio->bi_sector = sector * (sb->s_blocksize >> 9);
bio->bi_bdev = sb->s_bdev;
bio_add_page(bio, page, PAGE_SIZE, 0);
diff --git a/fs/xfs/linux-2.6/xfs_aops.c b/fs/xfs/linux-2.6/xfs_aops.c
index 7ec89fc..fb4f516 100644
--- a/fs/xfs/linux-2.6/xfs_aops.c
+++ b/fs/xfs/linux-2.6/xfs_aops.c
@@ -421,10 +421,7 @@ xfs_alloc_ioend_bio(
struct bio *bio;
int nvecs = bio_get_nr_vecs(bh->b_bdev);

- do {
- bio = bio_alloc(GFP_NOIO, nvecs);
- nvecs >>= 1;
- } while (!bio);
+ bio = bio_alloc(GFP_NOIO, nvecs);

ASSERT(bio->bi_private == NULL);
bio->bi_sector = bh->b_blocknr * (bh->b_size >> 9);
diff --git a/kernel/power/swap.c b/kernel/power/swap.c
index 505f319..8ba052c 100644
--- a/kernel/power/swap.c
+++ b/kernel/power/swap.c
@@ -64,8 +64,6 @@ static int submit(int rw, pgoff_t page_off, struct page *page,
struct bio *bio;

bio = bio_alloc(__GFP_WAIT | __GFP_HIGH, 1);
- if (!bio)
- return -ENOMEM;
bio->bi_sector = page_off * (PAGE_SIZE >> 9);
bio->bi_bdev = resume_bdev;
bio->bi_end_io = end_swap_bio_read;

2009-04-14 18:16:56

by Theodore Ts'o

[permalink] [raw]
Subject: Re: [PATCH 0/6] Handle bio_alloc failure

On Tue, Apr 14, 2009 at 05:11:19PM +0530, Nikanth Karthikesan wrote:
> On Tuesday 14 April 2009 16:48:38 Jens Axboe wrote:
> >
> > It will not fail as long as __GFP_WAIT is set, which it is for all 6 of
> > your patches.

Um, before we take out the checks, can we please make sure this is a
guaranteed, documented behaviour? In include/linux/page_alloc.h,
__GFP_NOFAIL is documented as "will never fail", but it says
absolutely nothing about __GFP_WAIT.

Some day, someone will create a static checker that will flag warnings
when people fail to check for allocation failures, and it would be
good if the formal semantics for __GFP_WAIT, and hence for GFP_NOFS,
GFP_KERNEL, and GFP_USER, et. al. are defined.

We have code in fs/jbd2/transaction.c that calls kzalloc with
GFP_NOFS|__GFP_NOFAIL, since I and many other people had the
assumption that without __GFP_NOFAIL, an GFP_NOFS allocation could
very well fail.

Or is this special-case behaviour which bio_alloc() guarantees, but
not necessarily any other allocation function?

- Ted

2009-04-14 18:21:01

by Jens Axboe

[permalink] [raw]
Subject: Re: [PATCH 0/6] Handle bio_alloc failure

On Tue, Apr 14 2009, Theodore Tso wrote:
> On Tue, Apr 14, 2009 at 05:11:19PM +0530, Nikanth Karthikesan wrote:
> > On Tuesday 14 April 2009 16:48:38 Jens Axboe wrote:
> > >
> > > It will not fail as long as __GFP_WAIT is set, which it is for all 6 of
> > > your patches.
>
> Um, before we take out the checks, can we please make sure this is a
> guaranteed, documented behaviour? In include/linux/page_alloc.h,
> __GFP_NOFAIL is documented as "will never fail", but it says
> absolutely nothing about __GFP_WAIT.
>
> Some day, someone will create a static checker that will flag warnings
> when people fail to check for allocation failures, and it would be
> good if the formal semantics for __GFP_WAIT, and hence for GFP_NOFS,
> GFP_KERNEL, and GFP_USER, et. al. are defined.
>
> We have code in fs/jbd2/transaction.c that calls kzalloc with
> GFP_NOFS|__GFP_NOFAIL, since I and many other people had the
> assumption that without __GFP_NOFAIL, an GFP_NOFS allocation could
> very well fail.
>
> Or is this special-case behaviour which bio_alloc() guarantees, but
> not necessarily any other allocation function?

It's a bio_alloc() guarantee, it uses a mempool backing. And if you use
a mempool backing, any allocation that can wait will always be
satisfied.

--
Jens Axboe

2009-04-14 18:35:57

by Theodore Ts'o

[permalink] [raw]
Subject: Re: [PATCH 0/6] Handle bio_alloc failure

On Tue, Apr 14, 2009 at 08:20:49PM +0200, Jens Axboe wrote:
>
> It's a bio_alloc() guarantee, it uses a mempool backing. And if you use
> a mempool backing, any allocation that can wait will always be
> satisfied.
>

Am I missing something? I don't see anything in
include/linux/mempool.h or mm/mempool.c, or in block/blk-core.c or
include/linux/bio.h which documents that GFP_WAIT implies that
bio_alloc() will always succeed.

My concern is that at some point in the future, someone either in the
block device layer or in mm/mempool.c will consider this an
implementation detail, and all of sudden calls to bio_alloc() with
GFP_WAIT will start failing and the resulting hilarty which ensues
won't be easily predicted by the developer making this change.

Regards,

- Ted

2009-04-14 18:41:20

by Jens Axboe

[permalink] [raw]
Subject: Re: [PATCH 0/6] Handle bio_alloc failure

On Tue, Apr 14 2009, Theodore Tso wrote:
> On Tue, Apr 14, 2009 at 08:20:49PM +0200, Jens Axboe wrote:
> >
> > It's a bio_alloc() guarantee, it uses a mempool backing. And if you use
> > a mempool backing, any allocation that can wait will always be
> > satisfied.
> >
>
> Am I missing something? I don't see anything in
> include/linux/mempool.h or mm/mempool.c, or in block/blk-core.c or
> include/linux/bio.h which documents that GFP_WAIT implies that
> bio_alloc() will always succeed.

Read mempool.c:mempool_alloc(). If __GFP_WAIT is set, it'll never turn
without having done the allocation. It's a bit weird that it isn't
documented in bio_alloc() itself, but there are several other places in
bio.c that references the fact that it cannot fail.

> My concern is that at some point in the future, someone either in the
> block device layer or in mm/mempool.c will consider this an
> implementation detail, and all of sudden calls to bio_alloc() with
> GFP_WAIT will start failing and the resulting hilarty which ensues
> won't be easily predicted by the developer making this change.

It's the entire premise of a mempool, so trust me, it'll never go away.
It is the reason they were added in the first place, for eg swap you
need the mempool guarantee or you risk deadlocking.

--
Jens Axboe

2009-04-14 18:51:42

by Andrew Morton

[permalink] [raw]
Subject: Re: [PATCH 0/6] Handle bio_alloc failure

On Tue, 14 Apr 2009 14:16:32 -0400
Theodore Tso <[email protected]> wrote:

> In include/linux/page_alloc.h,
> __GFP_NOFAIL is documented as "will never fail", but it says
> absolutely nothing about __GFP_WAIT.

In the present implementation, a __GFP_WAIT allocation for order <=3
will only fail if the caller was oom-killed.

Which raises the question "what happens when a mempool_alloc() caller
gets oom-killed?".

Seems that it will loop around in mempool_alloc() doing weak attempts
to allocate memory, not doing direct reclaim while waiting for someone
else to free something up. hm. I guess it'll recover eventually.

2009-04-15 08:46:26

by Nick Piggin

[permalink] [raw]
Subject: Re: [PATCH 0/6] Handle bio_alloc failure

On Wednesday 15 April 2009 04:46:04 Andrew Morton wrote:
> On Tue, 14 Apr 2009 14:16:32 -0400
> Theodore Tso <[email protected]> wrote:
>
> > In include/linux/page_alloc.h,
> > __GFP_NOFAIL is documented as "will never fail", but it says
> > absolutely nothing about __GFP_WAIT.
>
> In the present implementation, a __GFP_WAIT allocation for order <=3
> will only fail if the caller was oom-killed.
>
> Which raises the question "what happens when a mempool_alloc() caller
> gets oom-killed?".
>
> Seems that it will loop around in mempool_alloc() doing weak attempts
> to allocate memory, not doing direct reclaim while waiting for someone
> else to free something up. hm. I guess it'll recover eventually.

Yes, it doesn't have to reclaim anything (quite likely if we've
been OOM killed, reclaim is very difficult or impossible at this
point anyway). It will recover when an object gets returned to
the mempool by someone else. No point in using page allocator
reserve when we have guaranteed forward progress anyway.