2004-10-10 03:55:29

by Adam J. Richter

Subject: [PATCH?] make __bio_add_page check q->max_hw_sectors

On a dm-crypt partition on an IDE disk, 2.6.9-rc3 and
2.6.9-rc3-bk9 repeatedly generate the following error, which
does not occur in 2.6.9-rc1:

bio too big for device dm-0 (256 > 255)

The stack trace looked something like this:

submit_bio
mpage_bio_submit
mpage_readpages
readpages
do_page_cache_readahead
filemap_nopage
do_no_page
handle_mm_fault

Around 2.6.9-rc3, a new field q->max_hw_sectors was
added to struct request_queue. I was able to make this
problem disappear by the following patch, which adds a
check of this new field to __bio_add_page. (I've edited
this patch to hide other differences in my fs/bio.c, so
it may be necessary to apply it by hand if patch fails.)

I do not understand the intended difference between
the new max_hw_sectors field and max_sectors, so it is unclear
to me if it is a bug that my dm-crypt request_queue has
q->max_hw_sectors < q->max_sectors. If q->max_hw_sectors
is supposed to be guaranteed to be greater than or equal
to q->max_sectors, then the real bug is elsewhere and my
patch is unnecessary.
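
For illustration, here is a minimal userspace sketch of the size check discussed above. It is not kernel code: struct queue_model, struct bio_model, model_add_page() and model_fill() are hypothetical names, the 4096-byte page size is an assumption, and the limits used in the usage note (256 and 255) are chosen only to match the numbers in the error message.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical userspace stand-ins, not the kernel structures. */
struct queue_model {
	unsigned int max_sectors;	/* limit checked by the existing code */
	unsigned int max_hw_sectors;	/* limit added around 2.6.9-rc3 */
};

struct bio_model {
	unsigned int size;		/* bytes accumulated so far */
};

/*
 * Models only the size checks from the patch.
 * Returns the number of bytes added, or 0 if the page is refused.
 */
static unsigned int model_add_page(const struct queue_model *q,
				   struct bio_model *bio,
				   unsigned int len, bool check_hw)
{
	unsigned int sectors;

	assert(len > 0);
	sectors = (bio->size + len) >> 9;	/* 512-byte sectors */

	if (sectors > q->max_sectors)
		return 0;
	if (check_hw && sectors > q->max_hw_sectors)
		return 0;

	bio->size += len;
	return len;
}

/* Add 4096-byte pages until one is refused; return the sector count. */
static unsigned int model_fill(const struct queue_model *q, bool check_hw)
{
	struct bio_model bio = { 0 };

	while (model_add_page(q, &bio, 4096, check_hw))
		;
	return bio.size >> 9;
}
```

With max_sectors = 256 and max_hw_sectors = 255, model_fill() without the extra check builds a 256-sector request (32 pages), one sector over the 255 limit. With the extra check it stops at 31 pages, which is 248 sectors.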

I am cc'ing the dm-crypt mailing list, since I
suspect that dm-crypt users who are running on a disk
partition (as opposed to a file via a loop device) and who
upgrade to 2.6.9-rc3 or later are affected, and I think the
bug could result in file system corruption. However, I've
only observed this problem in a harmless readahead situation.

--
__ ______________
Adam J. Richter \ /
[email protected] | g g d r a s i l


Index: linux/fs/bio.c
===================================================================
RCS file: /usr/src.repository/repository/linux/fs/bio.c,v
retrieving revision 1.19
diff -u -1 -8 -r1.19 bio.c
--- linux/fs/bio.c 2004/10/09 18:16:58 1.19
+++ linux/fs/bio.c 2004/10/10 18:18:36
@@ -289,36 +289,39 @@
 static int __bio_add_page(request_queue_t *q, struct bio *bio, struct page
 			  *page, unsigned int len, unsigned int offset)
 {
 	int retried_segments = 0;
 	struct bio_vec *bvec;

 	/*
 	 * cloned bio must not modify vec list
 	 */
 	if (unlikely(bio_flagged(bio, BIO_CLONED)))
 		return 0;

 	if (bio->bi_vcnt >= bio->bi_max_vecs)
 		return 0;

 	if (((bio->bi_size + len) >> 9) > q->max_sectors)
 		return 0;

+	if (((bio->bi_size + len) >> 9) > q->max_hw_sectors)
+		return 0;
+
 	/*
 	 * we might lose a segment or two here, but rather that than
 	 * make this too complex.
 	 */

 	while (bio->bi_phys_segments >= q->max_phys_segments
 	       || bio->bi_hw_segments >= q->max_hw_segments
 	       || BIOVEC_VIRT_OVERSIZE(bio->bi_size)) {

 		if (retried_segments)
 			return 0;

 		retried_segments = 1;
 		blk_recount_segments(q, bio);
 	}

 	/*
 	 * setup the new entry, we might clear it again later if we


2004-10-10 08:14:55

by Jens Axboe

Subject: Re: [PATCH?] make __bio_add_page check q->max_hw_sectors

On Sun, Oct 10 2004, Adam J. Richter wrote:
> On a dm-crypt partition on an IDE disk, 2.6.9-rc3 and
> 2.6.9-rc3-bk9 repeatedly generate the following error, which
> does not occur in 2.6.9-rc1:
>
> bio too big for device dm-0 (256 > 255)
>
> The stack trace looked something like this:
>
> submit_bio
> mpage_bio_submit
> mpage_readpages
> readpages
> do_page_cache_readahead
> filemap_nopage
> do_no_page
> handle_mm_fault
>
> Around 2.6.9-rc3, a new field q->max_hw_sectors was
> added to struct request_queue. I was able to make this
> problem disappear by the following patch, which adds a
> check of this new field to __bio_add_page. (I've edited
> this patch to hide other differences in my fs/bio.c, so
> it may be necessary to apply it by hand if patch fails.)
>
> I do not understand the intended difference between
> the new max_hw_sectors field and max_sectors, so it is unclear
> to me if it is a bug that my dm-crypt request_queue has
> q->max_hw_sectors < q->max_sectors. If q->max_hw_sectors
> is supposed to be guaranteed to be greater than or equal
> to q->max_sectors, then the real bug is elsewhere and my
> patch is unnecessary.

That's exactly correct, ->max_sectors must never be bigger than
max_hw_sectors, that is the real bug.

--
Jens Axboe
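
The invariant stated above (max_sectors never larger than max_hw_sectors) can be sketched in userspace as follows. This is only an illustration under stated assumptions: struct limits_model, limits_consistent() and limits_set() are hypothetical names, and the clamp is one possible way to uphold the invariant, not necessarily how the block layer sets its limits.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the two limits discussed in this thread. */
struct limits_model {
	unsigned int max_sectors;
	unsigned int max_hw_sectors;
};

/* The invariant: max_sectors must never exceed max_hw_sectors. */
static bool limits_consistent(const struct limits_model *l)
{
	return l->max_sectors <= l->max_hw_sectors;
}

/*
 * Set both limits, clamping the requested max_sectors value so the
 * invariant holds by construction (an assumed policy for this sketch).
 */
static void limits_set(struct limits_model *l,
		       unsigned int max_sectors, unsigned int max_hw_sectors)
{
	l->max_hw_sectors = max_hw_sectors;
	l->max_sectors = max_sectors < max_hw_sectors ?
			 max_sectors : max_hw_sectors;
	assert(limits_consistent(l));
}
```

With limits set this way, a size check against max_sectors alone already implies the check against max_hw_sectors, so a second comparison in __bio_add_page would be redundant.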

2004-10-11 03:39:44

by Adam J. Richter

Subject: Re: [PATCH?] make __bio_add_page check q->max_hw_sectors

On Sun, 10 Oct 2004 10:14:16 +0200, Jens Axboe wrote:
>On Sun, Oct 10 2004, Adam J. Richter wrote:
[...]
>> I do not understand the intended difference between
>> the new max_hw_sectors field and max_sectors, so it is unclear
>> to me if it is a bug that my dm-crypt request_queue has
>> q->max_hw_sectors < q->max_sectors. If q->max_hw_sectors
>> is supposed to be guaranteed to be greater than or equal
>> to q->max_sectors, then the real bug is elsewhere and my
>> patch is unnecessary.

>That's exactly correct, ->max_sectors must never be bigger than
>max_hw_sectors, that is the real bug.

OK. Please disregard my previous patch. Thanks for
your clarification.

The problem I saw was due to my mistake in merging the
2.6.9-rc3 change that added request_queue->max_hw_sectors with one
of my local changes (replacing some fields in struct request_queue
with struct io_restrictions, a patch which I should repost one of
these days).

__ ______________
Adam J. Richter \ /
[email protected] | g g d r a s i l