On a dm-crypt partition on an IDE disk, 2.6.9-rc3 and
2.6.9-rc3-bk9 repeatedly generate the following error, which
does not occur in 2.6.9-rc1:
bio too big for device dm-0 (256 > 255)
The stack trace looked something like this:
submit_bio
mpage_bio_submit
mpage_readpages
readpages
do_page_cache_readahead
filemap_nopage
do_no_page
handle_mm_fault
Around 2.6.9-rc3, a new field q->max_hw_sectors was
added to struct request_queue. I was able to make this
problem disappear by the following patch, which adds a
check of this new field to __bio_add_page. (I've edited
this patch to hide other differences in my fs/bio.c, so
it may be necessary to apply it by hand if patch fails.)
I do not understand the intended difference between
the new max_hw_sectors field and max_sectors, so it is unclear
to me if it is a bug that my dm-crypt request_queue has
q->max_hw_sectors < q->max_sectors. If q->max_hw_sectors
is supposed to be guaranteed to be greater than or equal
to q->max_sectors, then the real bug is elsewhere and my
patch is unnecessary.
I am cc'ing the dm-crypt mailing list, since I
suspect that dm-crypt users who are running on a disk
partition (as opposed to a file via a loop device) and who
upgrade to 2.6.9-rc3 or later are affected, and I think the
bug could result in file system corruption. However, I have
only observed this problem in a harmless readahead situation.
--
__ ______________
Adam J. Richter \ /
[email protected] | g g d r a s i l
Index: linux/fs/bio.c
===================================================================
RCS file: /usr/src.repository/repository/linux/fs/bio.c,v
retrieving revision 1.19
diff -u -1 -8 -r1.19 bio.c
--- linux/fs/bio.c 2004/10/09 18:16:58 1.19
+++ linux/fs/bio.c 2004/10/10 18:18:36
@@ -289,36 +289,39 @@
 static int __bio_add_page(request_queue_t *q, struct bio *bio, struct page
 			  *page, unsigned int len, unsigned int offset)
 {
 	int retried_segments = 0;
 	struct bio_vec *bvec;
 	/*
 	 * cloned bio must not modify vec list
 	 */
 	if (unlikely(bio_flagged(bio, BIO_CLONED)))
 		return 0;
 	if (bio->bi_vcnt >= bio->bi_max_vecs)
 		return 0;
 	if (((bio->bi_size + len) >> 9) > q->max_sectors)
 		return 0;
+	if (((bio->bi_size + len) >> 9) > q->max_hw_sectors)
+		return 0;
+
 	/*
 	 * we might lose a segment or two here, but rather that than
 	 * make this too complex.
 	 */
 	while (bio->bi_phys_segments >= q->max_phys_segments
 	       || bio->bi_hw_segments >= q->max_hw_segments
 	       || BIOVEC_VIRT_OVERSIZE(bio->bi_size)) {
 		if (retried_segments)
 			return 0;
 		retried_segments = 1;
 		blk_recount_segments(q, bio);
 	}
 	/*
 	 * setup the new entry, we might clear it again later if we
On Sun, Oct 10 2004, Adam J. Richter wrote:
> On a dm-crypt partition on an IDE disk, 2.6.9-rc3 and
> 2.6.9-rc3-bk9 repeatedly generate the following error, which
> does not occur in 2.6.9-rc1:
>
> bio too big for device dm-0 (256 > 255)
>
> The stack trace looked something like this:
>
> submit_bio
> mpage_bio_submit
> mpage_readpages
> readpages
> do_page_cache_readahead
> filemap_nopage
> do_no_page
> handle_mm_fault
>
> Around 2.6.9-rc3, a new field q->max_hw_sectors was
> added to struct request_queue. I was able to make this
> problem disappear by the following patch, which adds a
> check of this new field to __bio_add_page. (I've edited
> this patch to hide other differences in my fs/bio.c, so
> it may be necessary to apply it by hand if patch fails.)
>
> I do not understand the intended difference between
> the new max_hw_sectors field and max_sectors, so it is unclear
> to me if it is a bug that my dm-crypt request_queue has
> q->max_hw_sectors < q->max_sectors. If q->max_hw_sectors
> is supposed to be guaranteed to be greater than or equal
> to q->max_sectors, then the real bug is elsewhere and my
> patch is unnecessary.
That's exactly correct: ->max_sectors must never be bigger than
->max_hw_sectors. That is the real bug.
--
Jens Axboe
On Sun, 10 Oct 2004 10:14:16 +0200, Jens Axboe wrote:
>On Sun, Oct 10 2004, Adam J. Richter wrote:
[...]
>> I do not understand the intended difference between
>> the new max_hw_sectors field and max_sectors, so it is unclear
>> to me if it is a bug that my dm-crypt request_queue has
>> q->max_hw_sectors < q->max_sectors. If q->max_hw_sectors
>> is supposed to be guaranteed to be greater than or equal
>> to q->max_sectors, then the real bug is elsewhere and my
>> patch is unnecessary.
>That's exactly correct: ->max_sectors must never be bigger than
>->max_hw_sectors. That is the real bug.
OK. Please disregard my previous patch. Thanks for
your clarification.
The problem I saw was due to my own mistake in merging the
2.6.9-rc3 change that added request_queue->max_hw_sectors with one
of my local changes (replacing some fields in struct request_queue
with struct io_restrictions, a patch which I should repost one of
these days).
__ ______________
Adam J. Richter \ /
[email protected] | g g d r a s i l